[galaxy-user] Unable to upload data via ftp site

2014-01-13 Thread Varsha Pardeshi
Hi

I am a registered user of Galaxy. I am trying to load my plant species
genome (Cicer arietinum) via FTP using my registered account
varsha.parde...@rmit.edu.au and password. However, the connection has
failed several times.

Kindly help me to solve this problem.

I would also like to request that the Cicer arietinum genome sequence be
loaded into Galaxy.

The sequence is available at http://www.ncbi.nlm.nih.gov/assembly/525138/

Thank you

Kind regards

Varsha


-- 
Dr. Varsha Pardeshi
Research Fellow
Health Innovations Research Institute
School of Applied Sciences
RMIT University
Building 223, Level 1
Plenty Road, Bundoora.
Victoria. 3083.
Australia.
Lab.: +61 3 99257140,
Office: +61 3 99257113
Fax.: +61 3 9925 7110
Mobile: +61 3 0416183650
___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using reply all in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:

  http://galaxyproject.org/search/mailinglists/

[galaxy-user] bam to bigwig

2014-01-13 Thread Susanne Warrenfeltz
Hello,

I am trying to convert a BAM file to Bigwig using the Convert Format option 
under Attributes (I click on the pencil next to the file name in my history)

The conversion fails with the error message:
11L3_v3 is not found in chromosome sizes file.

11L3_v3 is a genomic sequence ID for the genome that the BAM file represents.
The genome I need is not in the list of Database/build options in Galaxy. How
do I get my conversion to work?

I have uploaded the fasta file for my genome into my history but I do not see a 
way to point the conversion tool to that file. Am I on the right track?

Cheers from a Galaxy Newbie!
Susanne


[galaxy-user] Help

2014-01-13 Thread Pasquale Notarangelo

Hi all,
we are two new Galaxy users.

We have developed two new tools and we would like to connect them in a new
workflow.

We are able to import both tools and to link them in a workflow, but we
aren't able to pass the output of the first tool as the input of the
second tool.

The first tool calls a bash script that produces a simple string (this
is the path of the file generated by the script).

This is the xml file of the first tool:

<tool id="infnTools_ConcatenateArgumentsTool" name="Concatenate Arguments and generate file Tool">
  <description>Concatenate arguments strings and generate file</description>
  <command interpreter="bash">concatenateArgumentsAndPutFile.sh $inputArguments</command>
  <inputs>
    <param name="inputArguments" type="text" label="ARGUMENTS" optional="false"/>
  </inputs>
  <outputs>
    <data format="string" name="output" />
  </outputs>
</tool>


This is the bash script of the first tool 
(concatenateArgumentsAndPutFile.sh):


#!/bin/bash
echo ARGUMENTS: $@
export PathFile=/tmp/$RANDOM$RANDOM
echo $@ > $PathFile
echo PathFile: $PathFile


The xml of the second tool is the following:

<tool id="infnTools_InsertBiomasTools" name="InsertJobs and check the status of Biomas">
  <description>InsertJobs Biomas Tool and check the status</description>
  <command interpreter="bash">insertAndCheckBiomasJobs.sh $input</command>
  <inputs>
    <param format="string" name="input" type="data" label="Insert path file"/>
  </inputs>
  <outputs>
    <data format="tabular" name="output"/>
  </outputs>
</tool>


When we run the workflow, the output of the first tool isn't seen as
the input of the second tool.

In the Galaxy history we see this value for the input of the second
tool: /home/pasquale/galaxy-dist/database/files/000/dataset_83.dat

This file is also empty.

How can we resolve the problem?

Thanks and best regards
Pasquale & Alfonso




Dott. Pasquale Notarangelo
INFN Istituto Nazionale di Fisica Nucleare - Sezione Bari

Via Orabona, 4 - 70126 Bari, Italy

Tel. ufficio: +39 080-5443194
Interno ufficio: 3194
Mail: pasquale.notarang...@ba.infn.it
Skype: pasquale.notarangelo_1985
Msn: pasqualenotarang...@hotmail.it
Gmail: notarangelo@gmail.com




Re: [galaxy-user] all FPKMs are 0 in the tmap files produced by cuffcompare

2014-01-13 Thread Jennifer Jackson

Hello,

It looks like the data is mapping as novel - not linked with the 
reference annotation. There can be a few factors that can cause this to 
occur for part of a dataset (often desirable) but when it occurs for an 
entire dataset, there is often a data mismatch or parameter issue.


The first item I always check is that the reference genomes are a match
between inputs. Do this by confirming that the identifiers in the
reference GFF file are the same as those in the Tophat BAM output
(convert to SAM, with headers, to see the chromosome names). For the GFF
file, the tool "Join, Subtract and Group -> Group", run on the first
column (chromosome name) with the action "Count Distinct", will isolate these.
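
For anyone who prefers to run the same check outside of Galaxy, here is a minimal Python sketch of the comparison described above. It assumes a tab-separated GFF file and a SAM file that includes its header; the function names and the two sample records are made up for illustration and are not part of any Galaxy tool.

```python
def gff_seqids(gff_lines):
    """Collect the distinct values of GFF column 1, skipping comments and blanks."""
    names = set()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def sam_header_names(sam_lines):
    """Collect the SN: values from the @SQ header lines of a SAM file."""
    names = set()
    for line in sam_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def report_mismatch(gff_lines, sam_lines):
    """Return the sequence names that appear in only one of the two inputs."""
    gff = gff_seqids(gff_lines)
    sam = sam_header_names(sam_lines)
    return {"only_in_gff": sorted(gff - sam), "only_in_sam": sorted(sam - gff)}


# Illustrative records only (hypothetical, not taken from the poster's data).
gff_example = [
    "##gff-version 3\n",
    "Chr1\tTAIR10\tgene\t3631\t5899\t.\t+\t.\tID=AT1G01010\n",
]
sam_example = [
    "@HD\tVN:1.0\tSO:coordinate\n",
    "@SQ\tSN:chr1\tLN:30427671\n",
]
print(report_mismatch(gff_example, sam_example))
```

If both lists come back empty, the identifiers agree and the problem is more likely in the parameters.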


But the real problem could be in the parameters, see below:

On 1/11/14 10:43 PM, Yang Bi wrote:

> Dear all:
>
> I am new to Galaxy and I followed online tutorials/tips to analyze my RNA seq data for
> alternative splicing. I used tophat for illumina to align my sequencing data
> after QC/filtering. Other than setting min intron to 20, I used the default settings.
> Then I feed the accepted hit files to cufflink. I set Min isoform fraction to 0, use
> annotation (tair10 gff3) as guide and choose yes for perform bias correction (locally
> cached tair10).

My guess is that this Cufflinks run had the same issue - have you 
checked it? The 'Min isoform fraction' set to 0 may be problematic (I 
have never run Cufflinks this way). It may seem that this is a setting 
that is permissive - to capture even very small expression levels - but 
it may have had the reverse effect of not assigning any reads.


(The Tophat run with min intron at 20 is pretty low/sensitive - but with 
a smaller genome this probably will not cause memory issues with the 
mapping. Was this set based on the genome having transcripts with known, 
characterized introns this short? I didn't check, but you can in the 
reference GFF file.).
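
One way to look at the intron lengths in the reference GFF file is sketched below in Python. It assumes GFF3-style "exon" features that carry a Parent attribute, and it treats the gap between consecutive exons of the same parent as an intron. The function name and the sample lines are hypothetical, and real annotation files may need extra handling.

```python
from collections import defaultdict


def intron_lengths(gff_lines):
    """Return the lengths of the gaps between consecutive exons of each parent."""
    exons = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        start, end = int(cols[3]), int(cols[4])
        # Parse "key=value;key=value" attributes (GFF3 column 9).
        attrs = dict(item.split("=", 1) for item in cols[8].split(";") if "=" in item)
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append((start, end))

    lengths = []
    for spans in exons.values():
        spans.sort()
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            gap = next_start - prev_end - 1
            if gap > 0:
                lengths.append(gap)
    return lengths


# Made-up exons for a single hypothetical transcript "tx1".
example = [
    "chr1\tsrc\texon\t100\t200\t.\t+\t.\tParent=tx1\n",
    "chr1\tsrc\texon\t221\t300\t.\t+\t.\tParent=tx1\n",
    "chr1\tsrc\texon\t400\t500\t.\t+\t.\tParent=tx1\n",
]
print(min(intron_lengths(example)))
```

Comparing the smallest value found this way with the min intron setting shows whether the annotation actually contains introns that short.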


Maybe double-check the above Cufflinks run and confirm the results were as
expected, then try the Cufflinks default (0.1) as a first-pass test to see
how that works out. If you want to make this more sensitive in a
subsequent run, you could try 0.01, although how significant those
results are, given this genome and your specific input data, would need
to be evaluated.


After that, if you are still having trouble, please feel free to share a 
history link and we can try to help (copy and email a share link from 
the public server, direct to me, to keep your data private). Here is how:

https://wiki.galaxyproject.org/Support#Shared_and_Published_data

Hopefully the parameter change works, or a reference genome issue is 
found and corrected, but if not, I'll watch for your email,


Jen
Galaxy team


> I merged the assembled transcripts with cuffmerge and use cuffcompare to compare the resultant
> merged assembled transcript to the reference annotation file tair10 gff3. I choose yes for
> use sequence data and locally cached tair10 as the reference list. I get
> this for the transcript accuracy analysis:
>
> # Cuffcompare v2.1.1 | Command line was:
> #cuffcompare -o cc_output -r
> /galaxy-repl/main/files/007/386/dataset_7386886.dat -s
> /galaxy/data/Arabidopsis_thaliana_TAIR10/sam_index/Arabidopsis_thaliana_TAIR10.fa
> ./input1
> #
>
> #= Summary for dataset: ./input1 :
> #     Query mRNAs :   72778 in   51779 loci  (57559 multi-exon transcripts)
> #            (12679 multi-transcript loci, ~1.4 transcripts per locus)
> # Reference mRNAs :   42163 in   33350 loci  (30127 multi-exon)
> # Corresponding super-loci:  33140
> #                   |   Sn   |   Sp   |  fSn   |  fSp
>         Base level:   100.0     62.7      -        -
>         Exon level:   104.6     59.5    100.0     60.5
>       Intron level:   100.0     55.5    100.0     56.5
> Intron chain level:    98.3     51.5    100.0     60.3
>   Transcript level:    98.7     57.2     94.8     54.9
>        Locus level:    99.4     64.0    100.0     64.1
>
>   Matching intron chains:   29618
>            Matching loci:   33147
>
>     Missed exons:       1/169820 (  0.0%)
>      Novel exons:  128021/298149 ( 42.9%)
>   Missed introns:       0/127896 (  0.0%)
>    Novel introns:  102614/230568 ( 44.5%)
>      Missed loci:       1/33350  (  0.0%)
>       Novel loci:    2962/51779  (  5.7%)
>
>   Total union super-loci across all input datasets: 51779
>
> For the tmap file, all my FPKMs are 0:
>
> ref_gene_id  ref_id       class_code  cuff_gene_id  cuff_id     FMI  FPKM  FPKM_conf_lo  FPKM_conf_hi  cov   len   major_iso_id  ref_match_len
> AT1G01010    AT1G01010.1  =           AT1G01010     TCONS_0001  0    0.00  0.00          0.00          0.00  1688  TCONS_0001    1688
> AT1G01040    AT1G01040.1  =           AT1G01040     TCONS_0002  0    0.00  0.00          0.00          0.00  6251  TCONS_0002    6251
> AT1G01040    AT1G01040.2  =           AT1G01040     TCONS_0003  0    0.00

Re: [galaxy-user] bam to bigwig

2014-01-13 Thread Jennifer Jackson

Hello Susanne,

First, add the genome to the list of Custom Builds for your account. The
form to do this is under User -> Custom Builds. The .fasta version of
the genome is one entry option, so go ahead and use that. Pick a name
and a unique key that will not conflict with other genomes already in
Galaxy (a full list can be viewed by clicking on the link around the
middle of this form, "Show loaded, system-installed builds").


Once the load is started, it will take some time to process; how long
depends roughly on the size of the genome. Once the build has been
added, you will be able to assign it to datasets just like any
other builds that are system-installed. Assign this to your dataset,
then try the tool again.
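
As background on the error message: a chromosome sizes file is essentially a list of sequence names with their lengths, and the conversion fails when a name in the BAM file is missing from that list. The Python sketch below shows how such name/length pairs can be derived from a FASTA file. It is only an illustration with made-up sample records and a hypothetical function name, not what Galaxy runs internally.

```python
def fasta_lengths(lines):
    """Map each FASTA record name (first word after '>') to its sequence length."""
    sizes = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            sizes[name] = 0
        elif name is not None:
            sizes[name] += len(line)
    return sizes


# Made-up records; only the name 11L3_v3 is borrowed from the error message.
example = [
    ">11L3_v3 example record\n",
    "ACGTACGT\n",
    "ACG\n",
    ">contig2\n",
    "TTTT\n",
]
for seq_name, length in fasta_lengths(example).items():
    print(f"{seq_name}\t{length}")
```

The names printed here are the ones that need to match the sequence names used in the BAM file.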


I am also running a test as a double check that there are no problems 
with the method (have not attempted this since our move to the new 
hardware a few months ago), but do not anticipate problems. Should an 
issue occur, I will write you to follow up.


Meanwhile, you should go ahead and proceed as well. Having your custom 
genome set up this way is useful for other reasons (visualization, 
general data tracking, etc.).


Best,

Jen
Galaxy team

--
Jennifer Hillman-Jackson
http://galaxyproject.org


Re: [galaxy-user] Unable to upload data via ftp site

2014-01-13 Thread Jennifer Jackson

Hi Varsha,

With regard to the genome addition, I have added this to the genome
request ticket in Trello to be reviewed. Currently, a large batch of
genomes is undergoing a significant processing cycle for near-term
release, and another is already scheduled to start at the beginning of
February (most are not in this particular ticket, but will be announced
when released). This means that the timeline is on the order of a few
months for those marked as "consider" (e.g. review) in the ticket.

https://trello.com/c/kzVklAIE

But, the best news is that you do not have to wait for us. The genome 
can be used with nearly all tools as a custom genome right away. 
Instructions for prep and usage are included here in our wiki:

https://wiki.galaxyproject.org/Support#Custom_reference_genome

Take care,

Jen
Galaxy team



--
Jennifer Hillman-Jackson
http://galaxyproject.org


Re: [galaxy-user] Help

2014-01-13 Thread Jennifer Jackson

Hi Pasquale,

From a quick look (and I am not the tool-building expert of our team!),
I suspect that the problem is with the format assigned to the output
of the first tool and the input of the second tool. Specifically,
format="string" is problematic, unless you have also defined this
datatype in your local install. Even then, having the dataset contain a
path to a file deviates from the regular usage (if I have understood
your snippet of code correctly).
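
To illustrate the regular usage: Galaxy replaces $output in the command line with the path of the dataset file it has created for the job, and the tool is expected to write its result there, rather than to a file under /tmp. Below is a minimal Python sketch of the first tool written that way. The script name, the argument order (for example a command such as "concatenate_arguments.py $output $inputArguments") and the choice of an existing datatype such as txt are assumptions for illustration, not a tested tool.

```python
import os
import tempfile


def write_arguments(arguments, output_path):
    """Write the space-joined arguments to the path that Galaxy supplies."""
    with open(output_path, "w") as handle:
        handle.write(" ".join(arguments) + "\n")


def main(argv):
    """argv[0] is the output dataset path; the remaining items are the arguments."""
    if len(argv) < 2:
        raise SystemExit("usage: concatenate_arguments.py OUTPUT_PATH ARGUMENT [ARGUMENT ...]")
    write_arguments(argv[1:], argv[0])


# Stand-in for the dataset path that Galaxy would substitute for $output.
with tempfile.TemporaryDirectory() as tmp_dir:
    dataset_path = os.path.join(tmp_dir, "dataset.dat")
    main([dataset_path, "alpha", "beta"])
    with open(dataset_path) as handle:
        print(handle.read(), end="")
```

With this layout the second tool receives the dataset itself as $input, so there is no need to pass a path around as a string.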


Our wiki for tool configuration is located here. The wiki has examples, 
but you can also look at tools in the source code or Tool shed repos to 
see how format is used.

https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax

I don't want to send you away from this list, since I know that you 
already emailed earlier, but the galaxy-...@bx.psu.edu mailing list is 
where most tool development questions are discussed. That said, when 
troubleshooting, individual scripts are not often corrected by the 
community if the answer is already in the wiki, existing code base, or 
in a prior discussion. So, making use of these resources is the first 
place to start. There is a search tool for development topics that can 
be of great use to locate the bits for you that can be helpful:

http://galaxyproject.org/search/

Try a search with Admin  Development - I found in the first few hits 
this link, which includes the tool config link above plus many other 
related resources listed at the bottom:

https://wiki.galaxyproject.org/Admin/Tools/Add%20Tool%20Tutorial

Hopefully this helps a little, and others reading the post are welcome 
to add in more of course!


Jen
Galaxy team



--
Jennifer Hillman-Jackson
http://galaxyproject.org


[galaxy-user] add T. brucei 927 genome to Galaxy main?

2014-01-13 Thread Susanne Warrenfeltz
It would be very nice to see the Trypanosoma brucei TREU927 genome added to 
Galaxy:

http://tritrypdb.org/common/downloads/Current_Release/TbruceiTREU927/fasta/data/TriTrypDB-6.0_TbruceiTREU927_Genome.fasta


Thanks
Susanne



Re: [galaxy-user] bam to bigwig

2014-01-13 Thread Susanne Warrenfeltz
Clicked submit
Thanks!

Re: [galaxy-user] add T. brucei 927 genome to Galaxy main?

2014-01-13 Thread Jennifer Jackson

Hi Susanne,

I wasn't able to locate this exact assembly at NCBI - is it there? Is 
there a known reason why it isn't? We have a general guideline to 
include genomes that are published there, although there is some leeway 
for open source, finished (or at least not early stage draft) genomes 
from other sources. I'll need to review (credits, usage, status, etc.).


Please have a look and write back with more details of what you know 
about this one (can be direct to me), so I don't miss anything.


Thanks!

Jen
Galaxy team



--
Jennifer Hillman-Jackson
http://galaxyproject.org


Re: [galaxy-user] all FPKMs are 0 in the tmap files produced by cuffcompare

2014-01-13 Thread Yang Bi
Hi Jen:

Thank you for the prompt reply. The FPKMs produced by Cufflinks look normal (from an
assembled transcripts file):

Seqname  Source     Feature     Start  End    Score  Strand  Frame  Attributes
chr1     Cufflinks  transcript  11960  13178  1000   .       .      gene_id CUFF.180; transcript_id CUFF.180.1; FPKM 6.5441928094; frac 1.00; conf_lo 3.594986; conf_hi 8.987465; cov 2.413218; full_read_support yes;
chr1     Cufflinks  exon        11960  13178  1000   .       .      gene_id CUFF.180; transcript_id CUFF.180.1; exon_number 1; FPKM 6.5441928094; frac 1.00; conf_lo 3.594986; conf_hi 8.987465; cov 2.413218;
chr1     Cufflinks  transcript  4536   5314   1000   +       .      gene_id CUFF.178; transcript_id CUFF.178.1; FPKM 11.0556332840; frac 1.00; conf_lo 3.645830; conf_hi 13.216134; cov 4.076844; full_read_support no;
chr1     Cufflinks  exon        4536   4605   1000   +       .      gene_id CUFF.178; transcript_id CUFF.178.1; exon_number 1; FPKM 11.0556332840; frac 1.00; conf_lo 3.645830; conf_hi 13.216134; cov 4.076844;
chr1     Cufflinks  exon        4706   5095   1000   +       .      gene_id CUFF.178; transcript_id CUFF.178.1; exon_number 2; FPKM 11.0556332840; frac 1.00; conf_lo 3.645830; conf_hi 13.216134; cov 4.076844;
chr1     Cufflinks  exon        5174   5314   1000   +       .      gene_id CUFF.178; transcript_id CUFF.178.1; exon_number 3; FPKM 11.0556332840; frac 1.00; conf_lo 3.645830; conf_hi 13.216134; cov 4.076844;

I checked the chromosome names and realized that the BAM outputs use lowercase
for RNAME (e.g. chr1), while my gff3 file uses an initial capital letter
for seqid (e.g. Chr1). Could this be the problem? What is the fastest way to
convert the capital C in my gff3 file to lowercase?

Thank you very much
Yang


Re: [galaxy-user] all FPKMs are 0 in the tmap files produced by cuffcompare

2014-01-13 Thread Jennifer Jackson

Hello Yang,

Glad the problem was isolated. The mismatched chromosome names are
definitely something to fix.


The tools in "Text Manipulation" can help. The tool "Change Case of
selected columns" can change the case for you. Click on the pencil icon
after running the tool to reassign the datatype correctly as needed.
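
For a file that is handled outside of Galaxy, the same change can be sketched in a few lines of Python. This version lowercases only the first letter of column 1, so "Chr1" becomes "chr1", and it leaves comment and directive lines untouched (any "##sequence-region" lines would need the same change if they are present). The function name and the sample lines are hypothetical.

```python
def lowercase_seqid_initial(gff_lines):
    """Yield GFF lines with the first character of column 1 lowercased."""
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        seqid, sep, rest = line.partition("\t")
        yield seqid[:1].lower() + seqid[1:] + sep + rest


# Illustrative lines only (not taken from the poster's file).
example = [
    "##gff-version 3\n",
    "Chr1\tTAIR10\tgene\t3631\t5899\t.\t+\t.\tID=AT1G01010\n",
    "ChrC\tTAIR10\tgene\t383\t1444\t.\t-\t.\tID=ATCG00020\n",
]
for converted in lowercase_seqid_initial(example):
    print(converted, end="")
```

After the conversion it is worth repeating the identifier comparison, to confirm that every name in the annotation now matches a name in the BAM header.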


Take care,

Jen
Galaxy team



--
Jennifer Hillman-Jackson
http://galaxyproject.org
