Re: [galaxy-user] GenBank Submission - How to Generate Fasta (not fastq) files

2011-06-22 Thread John David Osborne
Thanks for your reply Jen.

I managed to use Pileup-to-Interval (on my new strain) and then Extract Genomic 
DNA, but I'm not too sure what I got in terms of a FASTA file. What I am looking 
for is a consensus file for my sequence that can be submitted to GenBank, not an 
interval on a reference strain. Is that what this is returning? I haven't done 
any alignment yet. Also, because it uses samtools, it doesn't incorporate 
indels found in my unknown strain...?

You mention using Genome Diversity, but it's not clear to me how extracting the 
region flanking SNPs will get me the desired consensus sequence.

Right now I am using John Nash's pileup2fasta (which works great, thanks John!), 
but I was hoping for something incorporated into Galaxy.
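
For readers after the same thing, the pileup-to-consensus conversion discussed
here can be sketched in a few lines of Python. This is not John Nash's
pileup2fasta and not a Galaxy tool. It is a minimal illustration that assumes
the six-column pileup layout (chromosome, 1-based position, reference base,
depth, read bases, base qualities), positions sorted within each chromosome,
and a plain majority vote. The names call_base, pileup_to_consensus and
min_depth are made up for this example. Consensus-style pileup files carry
extra columns, so the column index would need adjusting for those.

```python
import re
from collections import Counter


def call_base(ref, bases, min_depth=3):
    """Majority-vote one consensus base from the read-bases field of a pileup line."""
    counts = Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Read-start marker: the next character is a mapping quality, skip both.
            i += 2
            continue
        if c in "+-":
            # Indel: skip the length digits and the inserted/deleted bases.
            m = re.match(r"\d+", bases[i + 1:])
            if m is None:
                i += 1
                continue
            i += 1 + len(m.group()) + int(m.group())
            continue
        if c in ".,":
            counts[ref.upper()] += 1      # match to the reference base
        elif c.upper() in "ACGT":
            counts[c.upper()] += 1        # mismatch on either strand
        # '$', '*', 'N' and any other symbol are ignored
        i += 1
    if sum(counts.values()) < min_depth:
        return "N"                        # below the coverage threshold
    return counts.most_common(1)[0][0]


def pileup_to_consensus(lines, min_depth=3):
    """Return {chromosome: consensus sequence}; uncovered positions become N."""
    seqs = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        chrom, pos, ref, bases = fields[0], int(fields[1]), fields[2], fields[4]
        seq = seqs.setdefault(chrom, [])
        seq.extend("N" * (pos - 1 - len(seq)))   # pad positions absent from the pileup
        seq.append(call_base(ref, bases, min_depth))
    return {chrom: "".join(seq) for chrom, seq in seqs.items()}
```

Note that indels are skipped, not applied, so the output keeps the
reference coordinates. That is exactly the limitation raised above: a
consensus meant for GenBank would still need the indels handled separately.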

 -John



From: Jennifer Jackson [j...@bx.psu.edu]
Sent: Tuesday, June 21, 2011 5:38 PM
To: John David Osborne
Cc: galaxy-u...@bx.psu.edu
Subject: Re: [galaxy-user] GenBank Submission - How to Generate Fasta (not 
fastq) files

Hello John,

One solution, if you want fasta sequence based on the reference genome
(could be a native Galaxy genome, a custom genome in your history, or
really any fasta file in your history as long as the mapped
"chromosomes" names are identical), is to use the tool "NGS: SAM Tools
-> Pileup-to-Interval". Then, to extract fasta sequence based on these
coordinates use the tool "Fetch Sequences -> Extract Genomic DNA".

This route uses SAMTools, but it is available on the public Galaxy server,
which perhaps makes it an acceptable option.

If you are interested in examining the variation in your data vs the
reference, please see the tools under "NGS: Indel Analysis". Combined
with the tool "Genome Diversity -> Extract DNA flanking chosen SNPs"
this can incorporate your SNPs into the background reference to produce
novel fasta sequences.

If still needed, moving from FASTQ to FASTA in Galaxy is very simple
using the tool "NGS: QC and manipulation -> FASTQ to FASTA converter".

If command line is your preference, all of Galaxy's tools can be run
there, too, using the source. http://getgalaxy.org

I will post these options on BioStar under the question you quoted, for
that user and others who may have a similar analysis project.

Apologies for the delay in reply. Please let us know if we can help again,

Best,

Jen
Galaxy team

On 6/13/11 11:34 AM, John David Osborne wrote:
> I still haven't found an easy solution to this problem and I am afraid
> I'm going to have to write one on my own - which makes little sense as I
> bet this has been solved thousands of times!
> Can anybody point me to a script/software to convert a samtools pileup
> file into a fasta consensus file? It would be nice to set coverage
> thresholds, etc... but I'll take anything I can work with.
> The best google could do for me was this:
> http://biostar.stackexchange.com/questions/1389/how-to-generate-a-consensus-fasta-sequence-from-sam-tools-pileup
> Not that helpful,
> -John
> P.S. If there is a better way of doing this (something other than
> samtools) I'm all ears.
>
>
>
> ___
> The Galaxy User list should be used for the discussion of
> Galaxy analysis and other features on the public server
> at usegalaxy.org.  Please keep all replies on the list by
> using "reply all" in your mail client.  For discussion of
> local Galaxy instances and the Galaxy source code, please
> use the Galaxy Development list:
>
>http://lists.bx.psu.edu/listinfo/galaxy-dev
>
> To manage your subscriptions to this and other Galaxy lists,
> please use the interface at:
>
>http://lists.bx.psu.edu/

--
Jennifer Jackson
http://usegalaxy.org/
http://galaxyproject.org/


Re: [galaxy-user] GenBank Submission - How to Generate Fasta (not fastq) files

2011-06-21 Thread Jennifer Jackson

Hello John,

One solution, if you want fasta sequence based on the reference genome 
(could be a native Galaxy genome, a custom genome in your history, or 
really any fasta file in your history as long as the mapped 
"chromosomes" names are identical), is to use the tool "NGS: SAM Tools 
-> Pileup-to-Interval". Then, to extract fasta sequence based on these 
coordinates use the tool "Fetch Sequences -> Extract Genomic DNA".


This route uses SAMTools, but it is available on the public Galaxy server, 
which perhaps makes it an acceptable option.
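
To make clear what the two-step route above returns, here is a minimal Python
sketch of the "extract sequence by coordinates" step. It is not Galaxy's
implementation. It assumes the genome is already loaded as a dict of
chromosome name to sequence, and that intervals are 0-based, half-open
(chrom, start, end) tuples; extract_intervals is a made-up name.

```python
def extract_intervals(genome, intervals):
    """Return (label, sequence) pairs cut from the reference for each interval.

    genome:    dict mapping chromosome name -> sequence string
    intervals: iterable of (chrom, start, end), 0-based and half-open (assumed)
    """
    records = []
    for chrom, start, end in intervals:
        if chrom not in genome:
            # Mirrors the requirement that mapped chromosome names match exactly.
            raise KeyError(f"chromosome {chrom!r} not found in reference")
        records.append((f"{chrom}:{start}-{end}", genome[chrom][start:end]))
    return records
```

Under these assumptions the sequence comes from the reference at the covered
coordinates; it is not a consensus of the reads themselves.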


If you are interested in examining the variation in your data vs the 
reference, please see the tools under "NGS: Indel Analysis". Combined 
with the tool "Genome Diversity -> Extract DNA flanking chosen SNPs" 
this can incorporate your SNPs into the background reference to produce 
novel fasta sequences.
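
The idea of writing SNPs into the background reference can be sketched as
follows. This is an illustration only, not how the Genome Diversity tools
work internally. It assumes SNPs are given as (1-based position, reference
base, alternate base) tuples; apply_snps is a made-up name.

```python
def apply_snps(reference, snps):
    """Return a copy of the reference sequence with each SNP substituted in.

    snps: iterable of (pos, ref_base, alt_base) with pos 1-based (assumed).
    """
    seq = list(reference)
    for pos, ref_base, alt_base in snps:
        found = seq[pos - 1]
        if found.upper() != ref_base.upper():
            # Guard against coordinates that do not match this reference.
            raise ValueError(
                f"position {pos}: expected {ref_base!r} but reference has {found!r}"
            )
        seq[pos - 1] = alt_base
    return "".join(seq)
```

Because only substitutions are applied, the result has the same length as the
reference; indels would need separate handling.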


If still needed, moving from FASTQ to FASTA in Galaxy is very simple 
using the tool "NGS: QC and manipulation -> FASTQ to FASTA converter".
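
Outside Galaxy, the same conversion is a few lines of Python. This is a
minimal sketch, not the Galaxy converter: it assumes strict four-line FASTQ
records (header, sequence, "+" line, qualities) with no wrapped sequence
lines, and fastq_to_fasta is a made-up name.

```python
def fastq_to_fasta(fastq_text):
    """Convert four-line FASTQ records to FASTA text, dropping the qualities."""
    lines = [ln for ln in fastq_text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    out = []
    for i in range(0, len(lines), 4):
        header, seq, plus = lines[i], lines[i + 1], lines[i + 2]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record starting at line {i + 1}")
        out.append(">" + header[1:])   # '@name' becomes '>name'
        out.append(seq)
    return "\n".join(out) + "\n"
```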


If command line is your preference, all of Galaxy's tools can be run 
there, too, using the source. http://getgalaxy.org


I will post these options at BioStar at the question you quoted, for 
that user and others who may have a similar analysis project.


Apologies for the delay in reply. Please let us know if we can help again,

Best,

Jen
Galaxy team

On 6/13/11 11:34 AM, John David Osborne wrote:

> I still haven't found an easy solution to this problem and I am afraid
> I'm going to have to write one on my own - which makes little sense as I
> bet this has been solved thousands of times!
> Can anybody point me to a script/software to convert a samtools pileup
> file into a fasta consensus file? It would be nice to set coverage
> thresholds, etc... but I'll take anything I can work with.
> The best google could do for me was this:
> http://biostar.stackexchange.com/questions/1389/how-to-generate-a-consensus-fasta-sequence-from-sam-tools-pileup
> Not that helpful,
> -John
> P.S. If there is a better way of doing this (something other than
> samtools) I'm all ears.




--
Jennifer Jackson
http://usegalaxy.org/
http://galaxyproject.org/