Noa has the right idea, but if you're asking how to split a dataset into 
two non-overlapping halves, you'll want to use "Select First" and "Select Last" 
instead of random lines.  Once the file is in tabular form (one record per 
line, as in Noa's first step), get an accurate line count using the 
"Line/Word/Character count" tool, then split it right in the middle using 
Select First/Last, and convert each half back to FASTA.
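
If you ever need to do the same split outside Galaxy, here is a minimal Python sketch of the idea. It is only an illustration, not a Galaxy tool: the function names (read_fasta, split_in_half, to_fasta) are made up for this example, and it holds all records in memory, which may be too much for 23 million reads on a small machine.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif line and header is not None:
            # Sequence lines of a wrapped record are joined into one string.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_in_half(records):
    """Return two non-overlapping lists; the first gets the extra record if the count is odd."""
    records = list(records)
    middle = (len(records) + 1) // 2
    return records[:middle], records[middle:]


def to_fasta(records):
    """Format (header, sequence) pairs as FASTA text, one sequence line per record."""
    return "".join(f"{header}\n{seq}\n" for header, seq in records)


if __name__ == "__main__":
    example = ">r1\nACGT\nAC\n>r2\nGG\n>r3\nTT\n"
    first, second = split_in_half(read_fasta(example.splitlines()))
    print(to_fasta(first), end="")
    print("----")
    print(to_fasta(second), end="")
```

Splitting on records rather than raw lines matters because a FASTA record can span several lines; that is the same reason for converting to tabular first in Galaxy.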

-Dannon

On Feb 16, 2012, at 2:35 PM, Noa Sher wrote:

> Hi Scott
> I have never used megablast, so what I am writing applies to any FASTA file 
> (if there is anything quirky about megablast that I don't know about, 
> apologies!):
>       • Take your FASTA file and convert it to tabular (under "fasta 
> manipulation"); this puts each record on one line.
>       • Then randomly choose however many reads you want using "select 
> random lines from a file" under the text manipulation tab.
>       • Then convert the tabular file back to FASTA (under the fasta 
> manipulation tab).
> noa
> On 16/02/2012 19:31, Scott Tighe wrote:
>> Hi all
>> 
>> When using Galaxy megablast, is there a simple way to reduce my FASTA files 
>> from 23 million reads to 1/2 that size and submit to megablast separately?
>> 
>> Thanks
>> -- 
>> Scott Tighe
>> Advanced Genome Technology Lab
>> Vermont Cancer Center at the University of Vermont
>> 149 Beaumont Avenue
>> Health Science Research Bd RM 305
>> Burlington Vermont USA 05405
>> lab  802-656-AGTC (2482)
>> cell 802-999-6666
>> 
>> 
>> 
>> ___________________________________________________________
>> The Galaxy User list should be used for the discussion of
>> Galaxy analysis and other features on the public server
>> at usegalaxy.org.  Please keep all replies on the list by
>> using "reply all" in your mail client.  For discussion of
>> local Galaxy instances and the Galaxy source code, please
>> use the Galaxy Development list:
>> 
>>   
>> http://lists.bx.psu.edu/listinfo/galaxy-dev
>> 
>> 
>> To manage your subscriptions to this and other Galaxy lists,
>> please use the interface at:
>> 
>>   
>> http://lists.bx.psu.edu/

