Re: [galaxy-user] Running Cufflinks on Bacterial RNAseq data

2012-08-01 Thread Jennifer Jackson
Peter is correct (I oversimplified)! And Cufflinks does allow for an ID 
attribute to span lines as long as it represents the same feature.


To be clear, this error was a true format issue.

The best way to understand the finer points is to see the specification 
(also linked from wiki below):


http://www.sequenceontology.org/gff3.shtml
(quote)

Column 9: "attributes"
<...>
ID 	Indicates the ID of the feature. IDs for each feature must be unique 
within the scope of the GFF file. In the case of discontinuous features 
(i.e. a single feature that exists over multiple genomic locations) the 
same ID may appear on multiple lines. All lines that share an ID 
collectively represent a single feature.


Thanks Peter for the clarification!

Jen
Galaxy team
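
For anyone who wants to look for shared IDs in a GFF3 file before submitting it, here is a minimal sketch of the rule quoted above. It is not how Galaxy or Cufflinks validates input. The function names are made up for illustration, and the test "same ID on lines with different feature types" is only a heuristic assumption for telling a genuine duplicate apart from a discontinuous feature.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Split a GFF3 attributes column (key=value;key=value) into a dict."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def find_shared_ids(lines):
    """Map each ID seen on more than one line to [(line_number, seqid, type), ...]."""
    seen = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # malformed line; a fuller checker would report it
        feature_id = parse_attributes(fields[8]).get("ID")
        if feature_id is not None:
            seen[feature_id].append((number, fields[0], fields[2]))
    return {fid: rows for fid, rows in seen.items() if len(rows) > 1}


def suspicious_ids(shared):
    """Heuristic (assumption): an ID reused across different feature types
    is unlikely to describe one discontinuous feature."""
    return sorted(fid for fid, rows in shared.items()
                  if len({row[2] for row in rows}) > 1)


example = [
    "##gff-version 3\n",
    "chr\t.\tgene\t1\t900\t.\t+\t.\tID=gene1\n",
    "chr\t.\tCDS\t1\t300\t.\t+\t0\tID=cds1;Parent=gene1\n",
    "chr\t.\tCDS\t600\t900\t.\t+\t0\tID=cds1;Parent=gene1\n",
    "chr\t.\tgene\t2000\t2500\t.\t-\t.\tID=gene2\n",
    "chr\t.\tCDS\t2000\t2500\t.\t-\t0\tID=gene2\n",
]
shared = find_shared_ids(example)
print(sorted(shared))          # ['cds1', 'gene2']
print(suspicious_ids(shared))  # ['gene2']
```

In the example, cds1 is a two-part CDS and is allowed by the specification, while gene2 is reused for a gene and a CDS and is the kind of case worth checking by hand.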

On 7/31/12 11:35 AM, Jennifer Jackson wrote:

Hello Rachel,

When datasets are in a grey "waiting to run" state this indicates that
they are in the queue and in line to run. For the majority of cases,
including yours, leaving the job alone and allowing it to run is the
correct option. The missing metadata only means that the result has not
yet posted to your history (expected when still grey).

It looks as if your jobs have now run, but resulted in errors. I can let
you know that the problem is with the input GFF3 dataset. It contains at
least one duplicated "ID" attribute, which is required to be unique
within GFF3 files. Clicking on the green bug icon in any of the red
error datasets will point to the example duplicated ID. To my knowledge,
the content being based on a bacterial genome is unrelated to this
format problem.

For reference, this is the specification help for GFF3:
http://wiki.g2.bx.psu.edu/Learn/Datatypes#GFF3

This can be a difficult problem to resolve on your own since the scope
of the true file issues is unknown. Locating an alternate source or
contacting the original source of this GFF3 dataset to request a
correction would be potential solutions. The tophat.cuffli...@gmail.com
mailing list or seqanswers.com are suggested places to query for
reference annotation file recommendations.


Best,

Jen
Galaxy team


On 7/27/12 10:30 AM, Rachel Krasich wrote:

I am attempting to run Cufflinks on Galaxy main to analyze my E. coli
RNAseq data.  I have mapped my reads using an outside program (Geneious)
and uploaded the resulting BAM file.  I also have uploaded the E. coli
annotations as a gtf file.  However when I attempt to run Cufflinks
using my annotations it just stays on "Job is waiting to run" for
hours.  If I click on "Edit attributes", I see an error message
"Required metadata values are missing".  Does this mean that my files
are somehow incomplete and cufflinks will never run, or do I just need
to wait longer?  When searching around the mailing lists I saw others
have had issues with bacteria due to its circular chromosome, and was
wondering if this might somehow be related.  Thanks.

Rachel


___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

   http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

   http://lists.bx.psu.edu/





--
Jennifer Jackson
http://galaxyproject.org


Re: [galaxy-user] Running Cufflinks on Bacterial RNAseq data

2012-08-01 Thread Peter Cock
On Tue, Jul 31, 2012 at 7:35 PM, Jennifer Jackson wrote:
> Hello Rachel,
>
> When datasets are in a grey "waiting to run" state this indicates that they
> are in the queue and in line to run. For the majority of cases, including
> yours, leaving the job alone and allowing it to run is the correct option.
> The missing metadata only means that the result has not yet posted to your
> history (expected when still grey).
>
> It looks as if your jobs have now run, but resulted in errors. I can let you
> know that the problem is with the input GFF3 dataset. It contains at least
> one duplicated "ID" attribute, which is required to be unique within GFF3
> files.

Actually that isn't quite right (although it may be a limitation imposed by
some tools using GFF3 as an input). Features split over multiple locations
are described in GFF3 using multiple lines sharing the same ID attribute.
This is most commonly used for genes made up of multiple exons, but
can even apply across references in some extreme trans-splicing cases.

Peter


Re: [galaxy-user] Running Cufflinks on Bacterial RNAseq data

2012-07-31 Thread Jennifer Jackson

Hello Rachel,

When datasets are in a grey "waiting to run" state this indicates that 
they are in the queue and in line to run. For the majority of cases, 
including yours, leaving the job alone and allowing it to run is the 
correct option. The missing metadata only means that the result has not 
yet posted to your history (expected when still grey).


It looks as if your jobs have now run, but resulted in errors. I can let 
you know that the problem is with the input GFF3 dataset. It contains at 
least one duplicated "ID" attribute, which is required to be unique 
within GFF3 files. Clicking on the green bug icon in any of the red 
error datasets will point to the example duplicated ID. To my knowledge, 
the content being based on a bacterial genome is unrelated to this 
format problem.


For reference, this is the specification help for GFF3:
http://wiki.g2.bx.psu.edu/Learn/Datatypes#GFF3

This can be a difficult problem to resolve on your own since the scope
of the true file issues is unknown. Locating an alternate source or
contacting the original source of this GFF3 dataset to request a 
correction would be potential solutions. The tophat.cuffli...@gmail.com 
mailing list or seqanswers.com are suggested places to query for 
reference annotation file recommendations.



Best,

Jen
Galaxy team


On 7/27/12 10:30 AM, Rachel Krasich wrote:

I am attempting to run Cufflinks on Galaxy main to analyze my E. coli
RNAseq data.  I have mapped my reads using an outside program (Geneious)
and uploaded the resulting BAM file.  I also have uploaded the E. coli
annotations as a gtf file.  However when I attempt to run Cufflinks
using my annotations it just stays on "Job is waiting to run" for
hours.  If I click on "Edit attributes", I see an error message
"Required metadata values are missing".  Does this mean that my files
are somehow incomplete and cufflinks will never run, or do I just need
to wait longer?  When searching around the mailing lists I saw others
have had issues with bacteria due to its circular chromosome, and was
wondering if this might somehow be related.  Thanks.

Rachel





--
Jennifer Jackson
http://galaxyproject.org


[galaxy-user] Running Cufflinks on Bacterial RNAseq data

2012-07-27 Thread Rachel Krasich
I am attempting to run Cufflinks on Galaxy main to analyze my E. coli
RNAseq data.  I have mapped my reads using an outside program (Geneious) and
uploaded the resulting BAM file.  I also have uploaded the E. coli
annotations as a gtf file.  However when I attempt to run Cufflinks using
my annotations it just stays on "Job is waiting to run" for hours.  If I
click on "Edit attributes", I see an error message "Required metadata
values are missing".  Does this mean that my files are somehow incomplete
and cufflinks will never run, or do I just need to wait longer?  When
searching around the mailing lists I saw others have had issues with
bacteria due to its circular chromosome, and was wondering if this might
somehow be related.  Thanks.

Rachel

Re: [galaxy-user] Running cufflinks on a genome without a bowtie index

2011-12-22 Thread Jeremy Goecks
Noa,

Using your FASTA in Tophat and Cufflinks is the correct approach. You don't 
need to provide an annotation file in Cufflinks, and you can also avoid using 
your FASTA in Cufflinks by not using bias correction.

If you're still having problems, the issue is likely your parameter choices in 
Tophat and/or Cufflinks. You'll want to read the documentation carefully to 
choose parameters appropriately for your data.

Good luck,
J.
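
One check that is not raised in this thread, but is cheap to do when a custom FASTA and a GTF are combined, is whether the sequence names in the two files agree. Annotation lines whose first column names a sequence that is absent from the genome cannot overlap any aligned reads. The sketch below is only an illustration under that assumption: the function names are hypothetical, and matching names do not rule out the parameter issues mentioned above.

```python
def fasta_names(lines):
    """Sequence names from FASTA headers (text after '>' up to the first whitespace)."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def annotation_names(lines):
    """Column 1 values from GTF/GFF lines, skipping comments and blank lines."""
    names = set()
    for line in lines:
        if line.strip() and not line.startswith("#"):
            names.add(line.split("\t", 1)[0])
    return names


def compare_names(fasta_lines, gtf_lines):
    """Report which sequence names are shared and which appear in one file only."""
    genome = fasta_names(fasta_lines)
    annot = annotation_names(gtf_lines)
    return {
        "shared": sorted(genome & annot),
        "only_in_annotation": sorted(annot - genome),
        "only_in_fasta": sorted(genome - annot),
    }


# Made-up example data: the genome uses "scaffold_N", the GTF uses bare numbers.
fasta = [">scaffold_1 length=1000\n", "ACGT\n", ">scaffold_2\n", "GGCC\n"]
gtf = ['1\tsrc\texon\t10\t50\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n']
report = compare_names(fasta, gtf)
print(report)
```

An empty "shared" list, as in this made-up example, would mean that no annotated transcript sits on a sequence present in the genome file.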

On Dec 22, 2011, at 5:09 AM, Noa Sher wrote:

> Hi
> I am trying to run Cufflinks on a genome without a bowtie index.
> How do I make my own index? I have a FASTA file of the genome, but if I run 
> tophat using just that and then cufflinks using a gtf file of the 
> transcriptome, I get zero in all FPKM values
> Thanks
> Noa


[galaxy-user] Running cufflinks on a genome without a bowtie index

2011-12-22 Thread Noa Sher

  
  
Hi
I am trying to run Cufflinks on a genome without a bowtie index.
How do I make my own index? I have a FASTA file of the genome, but
if I run tophat using just that and then cufflinks using a gtf
file of the transcriptome, I get zero in all FPKM values.
Thanks
Noa

  


Re: [galaxy-user] running cufflinks

2011-09-07 Thread Peter Cock
On Tue, Sep 6, 2011 at 9:29 PM, Peng, Tao wrote:
> Hi, I had 3 GTF files from Ensembl, UCSC and NCBI for annotation; ONLY
> the Ensembl file was recognized by GALAXY cufflinks as a GTF file although
> they all have the .GTF extension. I am NOT sure why the UCSC and NCBI GTF
> files were seen as GFF files?
>
> Thx,

I'm pretty sure Galaxy ignores the original filename extension
(it will be stored on disk as *.dat once uploaded to Galaxy).

If you could post the start of each file (or links to the complete
files) that would be very helpful for working out why Galaxy
has misidentified the GTF files as GFF.

Peter


[galaxy-user] running cufflinks

2011-09-06 Thread Jennifer Jackson

===>  Please use "Reply All" when responding to this email<===


Hello,

File type can be set by using a dataset's pencil icon to reach the "Edit 
Attributes" form. GTF and GFF are very similar. It is possible that 
the 9th field ("group" for GFF, "attributes" for GTF) was not detected 
automatically upon upload, or that there is a formatting problem worth 
double-checking (extra tabs or whitespace).


Hopefully this helps,

Best,

Jen
Galaxy team
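
To double-check the points above outside Galaxy, a rough sketch like the following can be run over the first lines of each file. It does not reproduce Galaxy's own datatype detection. The function names and the two regular expressions are assumptions chosen for illustration: GTF-style pairs look like key "value", GFF3-style pairs look like key=value.

```python
import re

# Assumed patterns: GTF pairs are 'key "value"' (or 'key value'), GFF3 pairs are 'key=value'.
GTF_PAIR = re.compile(r'^\s*\S+\s+("[^"]*"|\S+)\s*$')
GFF3_PAIR = re.compile(r'^\s*[^=\s]+=\S.*$')


def classify_column9(text):
    """Guess the style of a 9th column: 'gtf-like', 'gff3-like', 'other' or 'empty'."""
    pairs = [p for p in text.strip().split(";") if p.strip()]
    if not pairs:
        return "empty"
    if (all(GTF_PAIR.match(p) for p in pairs)
            and "gene_id" in text and "transcript_id" in text):
        return "gtf-like"
    if all(GFF3_PAIR.match(p) for p in pairs):
        return "gff3-like"
    return "other"


def inspect_line(line):
    """Return (problems, style) for one annotation line; style is None if column 9 cannot be located."""
    problems = []
    body = line.rstrip("\n")
    if body != body.rstrip():
        problems.append("trailing whitespace")
    fields = body.split("\t")
    if len(fields) != 9:
        problems.append(f"{len(fields)} tab-separated fields instead of 9")
        return problems, None
    return problems, classify_column9(fields[8])


good = 'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n'
extra_tab = 'chr1\tsrc\texon\t100\t200\t.\t+\t.\t\tgene_id "g1"; transcript_id "t1";\n'
gff3 = 'chr1\tsrc\texon\t100\t200\t.\t+\t.\tID=exon1;Parent=t1\n'
for sample in (good, extra_tab, gff3):
    print(inspect_line(sample))
```

A stray tab before the 9th field, as in extra_tab, changes the field count, which is one way a file with a .gtf name could fail to be recognized as GTF.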



On 9/6/11 1:29 PM, Peng, Tao wrote:
> Hi, I had 3 GTF files from Ensembl, UCSC and NCBI for annotation; ONLY
> the Ensembl file was recognized by GALAXY cufflinks as a GTF file although
> they all have the .GTF extension. I am NOT sure why the UCSC and NCBI GTF
> files were seen as GFF files?
>
> Thx,
>
> tao
--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support


Re: [galaxy-user] running cufflinks

2011-09-06 Thread Peng, Tao
Hi, I had 3 GTF files from Ensembl, UCSC and NCBI for annotation; ONLY
the Ensembl file was recognized by GALAXY cufflinks as a GTF file although
they all have the .GTF extension. I am NOT sure why the UCSC and NCBI GTF
files were seen as GFF files?

Thx,

tao

-Original Message-
From: Jennifer Jackson [mailto:j...@bx.psu.edu] 
Sent: Thursday, September 01, 2011 5:08 PM
To: galaxy-user
Cc: Peng, Tao
Subject: running cufflinks


  ===> Please use "Reply All" when responding to this email <===

Hello,

This is the same reply as for the bug report, but for others who may run
into the same problem, a job that fails with this error:

terminate called after throwing an instance of 'std::bad_alloc'
   what():  std::bad_alloc


the reason is explained in #3 in the RNA-seq FAQ:
http://usegalaxy.org/u/jeremy/p/transcriptome-analysis-faq#faq3

A local or cloud instance may be the solution. These options are 
explained here:
http://galaxyproject.org/wiki/Big%20Picture/Choices

Our apologies for any inconvenience,

Best,

Jen
Galaxy team

On 9/1/11 4:55 PM, Peng, Tao wrote:
> Hi I am NOT sure why running cufflinks failed here. Thanks for your
> suggestion,
>
> tao
>
> ---
> Tool: Cufflinks
> Name: Cufflinks on data 6 and data 26: assembled transcripts
> Created: Sep 01, 2011
> Filesize: 81.3 Mb
> Dbkey: hg19
> Format: gtf
> Tool Version:
>
> Input Parameter Value
> SAM or BAM file of aligned RNA-Seq reads 6: Tophat for
> R4_CG_wh_accepted_hits
> Max Intron Length 30
> Min Isoform Fraction 0.05
> Pre MRNA Fraction 0.05
> Perform quartile normalization Yes
> Conditional (reference_annotation) 1
> Reference Aonnotation 26: Homo_sapiens.GRCh37.63.gtf
> Conditional (bias_correction) 0
> Conditional (seq_source) 0
> Conditional (singlePaired) 0
>
> --
>
> Message from History panel in GALAXY:
>
> An error occurred running this job: cufflinks v1.0.3
> cufflinks -q --no-update-check -I 30 -F 0.05 -j 0.05 -p 8 -N
> -b /galaxy/data/hg19/sam_index/hg19.fa
> Error running cufflinks. [18:40:45] Inspecting reads and determining
> fragment length distribution.
> Processed 915556 loci.
>

-- 
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support



[galaxy-user] running cufflinks

2011-09-01 Thread Peng, Tao
Hi I am NOT sure why running cufflinks failed here. Thanks for your suggestion,

tao

---
Tool:           Cufflinks
Name:           Cufflinks on data 6 and data 26: assembled transcripts
Created:        Sep 01, 2011
Filesize:       81.3 Mb
Dbkey:          hg19
Format:         gtf
Tool Version:

Input Parameter                             Value
SAM or BAM file of aligned RNA-Seq reads    6: Tophat for R4_CG_wh_accepted_hits
Max Intron Length                           30
Min Isoform Fraction                        0.05
Pre MRNA Fraction                           0.05
Perform quartile normalization              Yes
Conditional (reference_annotation)          1
Reference Aonnotation                       26: Homo_sapiens.GRCh37.63.gtf
Conditional (bias_correction)               0
Conditional (seq_source)                    0
Conditional (singlePaired)                  0

--

Message from History panel in GALAXY:

An error occurred running this job: cufflinks v1.0.3
cufflinks -q --no-update-check -I 30 -F 0.05 -j 0.05 -p 8 -N -b 
/galaxy/data/hg19/sam_index/hg19.fa
Error running cufflinks. [18:40:45] Inspecting reads and determining fragment 
length distribution.
Processed 915556 loci.
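
For readers matching the form values to the logged command line, the sketch below rebuilds the part of the command that appears in the log above. The function and parameter names are hypothetical. The flag meanings in the comments are taken from the Cufflinks command-line help as best I know it for the 1.x series. The log shows no input file or annotation argument, so none is added here.

```python
import shlex


def cufflinks_args(max_intron_length, min_isoform_fraction, pre_mrna_fraction,
                   threads, quartile_norm, bias_fasta):
    """Build only the arguments visible in the logged command."""
    args = [
        "cufflinks",
        "-q",                               # quiet: suppress progress messages
        "--no-update-check",                # skip the online version check
        "-I", str(max_intron_length),       # form: Max Intron Length
        "-F", str(min_isoform_fraction),    # form: Min Isoform Fraction
        "-j", str(pre_mrna_fraction),       # form: Pre MRNA Fraction
        "-p", str(threads),                 # number of threads
    ]
    if quartile_norm:
        args.append("-N")                   # form: Perform quartile normalization
    if bias_fasta:
        args += ["-b", bias_fasta]          # bias correction against this FASTA
    return args


command = shlex.join(cufflinks_args(30, 0.05, 0.05, 8, True,
                                    "/galaxy/data/hg19/sam_index/hg19.fa"))
print(command)
```

Building the command as a list and joining it with shlex.join (Python 3.8 or later) keeps each value as a separate argument and avoids quoting mistakes.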

[galaxy-user] running cufflinks

2011-09-01 Thread Jennifer Jackson

 ===> Please use "Reply All" when responding to this email <===

Hello,

This is the same reply as for the bug report, but for others who may run 
into the same problem, a job that fails with this error:


terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc


the reason is explained in #3 in the RNA-seq FAQ:
http://usegalaxy.org/u/jeremy/p/transcriptome-analysis-faq#faq3

A local or cloud instance may be the solution. These options are 
explained here:

http://galaxyproject.org/wiki/Big%20Picture/Choices

Our apologies for any inconvenience,

Best,

Jen
Galaxy team

On 9/1/11 4:55 PM, Peng, Tao wrote:

Hi I am NOT sure why running cufflinks failed here. Thanks for your
suggestion,

tao

---
Tool: Cufflinks
Name: Cufflinks on data 6 and data 26: assembled transcripts
Created: Sep 01, 2011
Filesize: 81.3 Mb
Dbkey: hg19
Format: gtf
Tool Version:

Input Parameter Value
SAM or BAM file of aligned RNA-Seq reads 6: Tophat for
R4_CG_wh_accepted_hits
Max Intron Length 30
Min Isoform Fraction 0.05
Pre MRNA Fraction 0.05
Perform quartile normalization Yes
Conditional (reference_annotation) 1
Reference Aonnotation 26: Homo_sapiens.GRCh37.63.gtf
Conditional (bias_correction) 0
Conditional (seq_source) 0
Conditional (singlePaired) 0

--

Message from History panel in GALAXY:

An error occurred running this job: cufflinks v1.0.3
cufflinks -q --no-update-check -I 30 -F 0.05 -j 0.05 -p 8 -N
-b /galaxy/data/hg19/sam_index/hg19.fa
Error running cufflinks. [18:40:45] Inspecting reads and determining
fragment length distribution.
Processed 915556 loci.



--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support