Re: [Denovoassembler-users] Execution Time Estimation

Goor Sasson Thu, 21 Nov 2013 10:37:46 -0800

Hi Sébastien,

I currently don't have any information about the distribution of the
taxa in the pooled metagenome.
If the abundances don't follow some kind of power law distribution,
would that affect the duration or correctness of the assembly?


Thanks,
Goor
2013/11/8 Sébastien Boisvert <[email protected]>

> On 08/11/13 03:40 AM, Goor Sasson wrote:
>
>> Hi Sebastien & Adrian,
>>
>>   The Blacklight we are considering is the one at the University of
>> Illinois at Urbana-Champaign
>> (http://help.igb.illinois.edu/Biocluster#Cluster_Specifications).
>> A rough estimate of the total genomic content is at least 5000 genomes.
>> I hope these details shed some light.
>>
>
> Do you expect some sort of power law distribution ?
>
>
> To summarize, you have one sample with 2.5 Gigabases of Illumina data
> (probably one lane or something like that).
> And that contains roughly 5000 bacterial genomes.
>
> Is that correct ?
>
>
> Also, the next version (2.3.1) will include a patch that will lower the
> memory usage.
> see https://github.com/sebhtml/ray/issues/210
>
>
>> Thanks,
>> Goor
>>
>>
>> 2013/11/6 Sébastien Boisvert <[email protected]>
>>
>>
>>     On 06/11/13 09:49 AM, Goor Sasson wrote:
>>
>>         Dear Mailing-List,
>>
>>
>>     Hi,
>>
>>
>>
>>             I've chosen to use Ray Meta to assemble reads from a
>> pooled metagenome that contains 2.5 G Illumina (HiSeq 2500) paired-end
>> reads
>>         (read length 100 bp, totalling ~250 Gbp).
>>         I need a rough estimate of the expected execution time. The
>> computational platform is a Blacklight platform with ~400 cores (Xeon
>> X7542) and 2 TB of RAM.
>>
>>
>>     Is it this Blacklight:
>> http://www.psc.edu/index.php/computing-resources/blacklight ?
>>
>>
>>     What is the interconnect (the network) ?
>>
>>
>>         Thanks
>>
>>
>>     As Adrian Pelin pointed out, it depends a lot on the genomic content
>> of your metagenome.
>>
>>     In our paper http://genomebiology.com/2012/13/12/R122 , we used 128
>> machines, each with 8 cores (Xeon X5560). Each machine had 24 GiB of RAM,
>> so that was around 3 TiB of distributed memory.
>>     It took ~15 hours to assemble a metagenome containing 1000 bacterial
>> genomes (power law abundances)
>>     + human contamination. Note that these 15 hours also included the
>> taxonomic profiling and so on.
>>     Doing only an assembly will take less time.
>>
>>     Obviously, a test that you can run before going for the whole thing
>> is:
>>
>>
>>     mpiexec -n 400 Ray -test-network-only -o NetworkTest-1
>>
>>
>>     Then, check the measured latency in NetworkTest-1/NetworkTest.txt
>>
>>
>>
>>     ---Sébastien
>>
>>
>>
>
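
For reference, the figures quoted in this thread can be put side by side with a few lines of Python. This is only a back-of-the-envelope sketch: the variable names are made up for illustration, "2.5 G reads" is assumed to mean 2.5 billion reads, and nothing here predicts the actual run time, which depends on the genomic content and the interconnect as discussed above.

```python
# Back-of-the-envelope comparison using only numbers quoted in this thread.
# All variable names are illustrative; no run time is predicted here.

# Reference run described for the Ray Meta paper.
paper_nodes = 128
paper_cores_per_node = 8
paper_ram_per_node_gib = 24

paper_cores = paper_nodes * paper_cores_per_node
paper_ram_tib = paper_nodes * paper_ram_per_node_gib / 1024  # GiB -> TiB

# Planned run (assumption: "2.5 G reads" means 2.5e9 reads of 100 bp).
reads = 2.5e9
read_length_bp = 100
total_bases = reads * read_length_bp

planned_cores = 400
core_ratio = planned_cores / paper_cores

print(f"Paper run:   {paper_cores} cores, {paper_ram_tib:.1f} TiB distributed RAM")
print(f"Planned run: {planned_cores} cores ({core_ratio:.0%} of the paper's core count)")
print(f"Input size:  {total_bases / 1e9:.0f} Gbp")
```

The core ratio is only a rough indication of scale; the network test quoted above remains the more useful check before launching the full assembly.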

_______________________________________________
Denovoassembler-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
