Re: [galaxy-user] How can I extract sequence information from cuffdiff files?

2012-09-13 Thread Jennifer Jackson

Hi Humberto,

Yes, my apologies, this should have been included in the original reply.
The 'locus' field in the Cuffdiff files refers to the gene bounds, not
individual transcripts. To get to the transcripts, you need to go back to
the inputs to Cuffdiff. If you used Cuffmerge, the "merged
transcripts" GTF file is the correct input for "Extract Genomic DNA".
If you used just Cuffcompare, use the "combined transcripts" GTF.


To find out which transcripts are associated with which gene bounds,
compare the attributes in the Cuffmerge merged transcripts GTF (9th
column: gene_id, tss_id, etc.) with the "gene_id" and "tss_id" values in
the Cuffdiff files. Depending on the file, these values also appear in
the test_id column. The comparison against a Cuffcompare GTF works the
same way.
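
For anyone who prefers to check this outside Galaxy, here is a minimal Python sketch of the same comparison. It reads the 9th GTF column and groups transcript_id values under each gene_id. The sample lines and the XLOC_/TCONS_ identifiers are made up for illustration, and the function names are hypothetical, not part of any Galaxy or Cufflinks tool.

```python
import re
from collections import defaultdict

# Matches key "value" pairs in the GTF attribute column.
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def parse_gtf_attributes(field):
    """Turn the 9th GTF column into a dict of attribute name -> value."""
    return dict(ATTR_RE.findall(field))


def transcripts_by_gene(gtf_lines):
    """Group transcript_id values under the gene_id they belong to."""
    genes = defaultdict(set)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        attrs = parse_gtf_attributes(cols[8])
        if "gene_id" in attrs and "transcript_id" in attrs:
            genes[attrs["gene_id"]].add(attrs["transcript_id"])
    return genes


# Illustrative records only; real merged-transcript GTFs carry more attributes.
example_gtf = [
    "\t".join(["chr1", "Cufflinks", "exon", "100", "200", ".", "+", ".",
               'gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; tss_id "TSS1";']),
    "\t".join(["chr1", "Cufflinks", "exon", "150", "300", ".", "+", ".",
               'gene_id "XLOC_000001"; transcript_id "TCONS_00000002"; tss_id "TSS1";']),
    "\t".join(["chr2", "Cufflinks", "exon", "500", "900", ".", "-", ".",
               'gene_id "XLOC_000002"; transcript_id "TCONS_00000003"; tss_id "TSS2";']),
]

mapping = transcripts_by_gene(example_gtf)
for gene_id in sorted(mapping):
    print(gene_id, sorted(mapping[gene_id]))
```

The gene_id keys of the resulting mapping are what you would look up against the "gene_id" column of a Cuffdiff file.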


You can filter on the GTF attributes with the tool "Filter and Sort
-> Filter GTF data by attribute values list". Cut out the column of
interest from the Cuffdiff file ("Text Manipulation -> Cut"), edit it as
desired, and use the result as the filter list. Or explore the other GFF
filter options in the same tool group.
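
As a rough offline equivalent of the Cut plus filter steps, the sketch below pulls one column out of a Cuffdiff table and keeps only the GTF lines whose attribute value is in that list. It assumes the Cuffdiff file is tab-separated with a header row that includes a "gene_id" column; the sample rows, identifiers, and function names are hypothetical.

```python
import csv
import io
import re


def cut_column(diff_text, column):
    """Collect the distinct values of one named column from a tab-separated table."""
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    return {row[column] for row in reader}


def filter_gtf(gtf_lines, wanted_ids, attribute="gene_id"):
    """Keep GTF lines whose attribute value appears in wanted_ids."""
    pattern = re.compile(r'\b%s "([^"]*)"' % re.escape(attribute))
    kept = []
    for line in gtf_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # skip comments and incomplete records
        match = pattern.search(cols[8])
        if match and match.group(1) in wanted_ids:
            kept.append(line)
    return kept


# Illustrative data only: a cut-down Cuffdiff table and a three-line GTF.
diff_text = "\n".join([
    "\t".join(["test_id", "gene_id", "locus", "significant"]),
    "\t".join(["XLOC_000001", "XLOC_000001", "chr1:100-300", "yes"]),
    "\t".join(["XLOC_000003", "XLOC_000003", "chr3:10-90", "no"]),
])

gtf_lines = [
    "\t".join(["chr1", "Cufflinks", "exon", "100", "200", ".", "+", ".",
               'gene_id "XLOC_000001"; transcript_id "TCONS_00000001";']),
    "\t".join(["chr2", "Cufflinks", "exon", "500", "900", ".", "-", ".",
               'gene_id "XLOC_000002"; transcript_id "TCONS_00000003";']),
    "\t".join(["chr3", "Cufflinks", "exon", "10", "90", ".", "+", ".",
               'gene_id "XLOC_000003"; transcript_id "TCONS_00000004";']),
]

wanted = cut_column(diff_text, "gene_id")
selected = filter_gtf(gtf_lines, wanted)
print(len(selected), "of", len(gtf_lines), "GTF lines kept")
```

The "edit as desired" step corresponds to trimming the set returned by cut_column before filtering, for example keeping only the rows you consider significant.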


Take care,

Jen
Galaxy team

On 9/13/12 11:14 AM, Humberto Boncristiani wrote:

Hi

"Fetch Sequences -> Extract Genomic DNA" does not accept Cuffdiff files.
Should I convert these files to a specific format?

Thanks,

Humberto.

*Dr. Humberto Boncristiani*
National Research Council (NRC) Fellow
Adjunct Research Associate
Department of Biology
Univ. North Carolina at Greensboro
312 Eberhart Bldg
Greensboro, NC 27403, USA.
Tel.:(1) 336-256-2591
Fax: (1) 336-334-5839
email: hum...@gmail.com 




___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/



--
Jennifer Jackson
http://galaxyproject.org






Re: [galaxy-user] How can I extract sequence information from cuffdiff files?

2012-09-13 Thread Jennifer Jackson

Hello,

By "no annotation", do you mean that species-specific annotation (GTF)
was not used, and that you want to compare against a protein database
such as GenBank NR or RefSeq? If so, these are the instructions. Please
let us know if you had something else in mind.


The sequence extraction can be done on Galaxy Main (if that is where you 
are working), but the BLAST will need to be run on a local or cloud 
install. To get set up (instance and data), start here:

http://getgalaxy.org
http://usegalaxy.org/cloud

The BLAST+ wrapper recently moved from the distribution to the Tool 
Shed, but there are installation tools integrated to help get this into 
your instance. See the latest News Brief for details (Sept 7, 2012) - 
these are also good to follow as you maintain your instance:

http://wiki.g2.bx.psu.edu/News
http://wiki.g2.bx.psu.edu/DevNewsBriefs/2012_09_07

Questions about local/cloud installs are best directed to the 
galaxy-...@bx.psu.edu mailing list:

http://wiki.g2.bx.psu.edu/Mailing%20Lists

To extract the transcript sequences, use the tool 'Fetch Sequences -> 
Extract Genomic DNA'. This will accept a custom reference genome from 
the history, if you have been using one, by changing the option "Source 
for Genomic Data:" to "History".
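
To illustrate what the extraction step does with a set of coordinates, here is a small Python sketch that slices a region out of a FASTA genome. It is not how the Galaxy tool is implemented. It assumes coordinates written as chrom:start-end are 1-based and inclusive, which you should verify for your own files, and the tiny genome and function names are made up for the example.

```python
def read_fasta(lines):
    """Load FASTA records into a dict of sequence name -> sequence string."""
    seqs = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            name = line[1:].split()[0]  # keep only the ID before any whitespace
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def parse_locus(locus):
    """Split 'chrom:start-end' into (chrom, start, end)."""
    chrom, span = locus.rsplit(":", 1)
    start, end = span.split("-")
    return chrom, int(start), int(end)


def extract(genome, chrom, start, end, strand="+"):
    """Return the sequence for a region; ASSUMES 1-based, inclusive coordinates."""
    seq = genome[chrom][start - 1:end]
    if strand == "-":
        seq = seq.translate(COMPLEMENT)[::-1]  # reverse complement
    return seq


# Illustrative genome only.
genome = read_fasta([">chr1 example", "ACGTACGTAC", "GGGTTTAAAC"])
chrom, start, end = parse_locus("chr1:3-8")
print(extract(genome, chrom, start, end))
print(extract(genome, chrom, start, end, strand="-"))
```

The extracted sequences, written back out as FASTA, are what you would then use as the BLAST query.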


Hopefully this helps,

Jen
Galaxy team


[galaxy-user] How can I extract sequence information from cuffdiff files?

2012-09-13 Thread Humberto Boncristiani

Hi.

I have Cuffdiff files with differential gene expression results. I don't have
the annotation, so I need to extract the sequence information from the
genome coordinates and then BLAST the sequences to identify the genes.
What is the easiest way to do this?

Thanks.

Humberto


