On 21/11/13 07:30 AM, Goor Sasson wrote:
> Hi Sébastien,
>
> I currently don't have the information regarding the distribution of the taxa
> in the pooled metagenome.
> In case that the abundance doesn't follow some kind of a power law
> distribution, would it affect the duration or correctness of the assembly ?

No, your abundances don't need to follow a power law.

> Thanks,
> Goor
>
> 2013/11/8 Sébastien Boisvert <sebastien.boisver...@ulaval.ca>
>
> > On 08/11/13 03:40 AM, Goor Sasson wrote:
> >
> > > Hi Sebastien & Adrian,
> > >
> > > The Blacklight we consider to use is the one at Urbana Illinois university
> > > (http://help.igb.illinois.edu/Biocluster#Cluster_Specifications).
> > > A rough estimation for the total genome size is at least 5000 genomes.
> > > Hopes that these details can shed some light.
> >
> > Do you expect some sort of power law distribution ?
> >
> > To summarize, you have one sample with 2.5 Gigabases of Illumina data
> > (probably one lane or something like that).
> > And that contains roughly 5000 bacterial genomes.
> >
> > Is that correct ?
> >
> > Also, the next version (2.3.1) will include a patch that will lower the
> > memory usage.
> > see https://github.com/sebhtml/ray/issues/210
> >
> > > Thanks,
> > > Goor
> > >
> > > 2013/11/6 Sébastien Boisvert <sebastien.boisver...@ulaval.ca>
> > >
> > > > On 06/11/13 09:49 AM, Goor Sasson wrote:
> > > >
> > > > > Dear Mailing-List,
> > > >
> > > > Hi,
> > > >
> > > > > I've chosen to use Ray Meta in order to assemble reads
> > > > > from a pooled metagenome that contains 2.5g of illumina (HiSeq2500)
> > > > > paired-end reads (read length 100bp, totalling ~250g bp).
> > > > > I need a gross estimation for the predicted execution time.
> > > > > The computational platform is a Blacklight platform with ~400 cores
> > > > > (Xeon X7542) with 2TB RAM.
> > > >
> > > > Is it this Blacklight:
> > > > http://www.psc.edu/index.php/computing-resources/blacklight ?
> > > >
> > > > What is the interconnect (the network) ?
> > > >
> > > > > Thanks
> > > >
> > > > As Adrian Pelin pointed out, it depends a lot on the genomic
> > > > content of your metagenome.
> > > >
> > > > In our paper http://genomebiology.com/2012/13/12/R122 , we used
> > > > 128 machines each with 8 cores (Xeon X5560). Each machine had
> > > > 24 GiB of RAM memory, so that was around 3 TiB of distributed memory.
> > > > It took ~15 hours to assemble a metagenome containing 1000
> > > > bacterial genomes (power law abundances) + human contaminations.
> > > > Note that this 15 hours included also the taxonomic profiling and so on.
> > > > Doing only an assembly will be shorter in time.
> > > >
> > > > Obviously, a test that you can run before going for the whole
> > > > thing is:
> > > >
> > > >     mpiexec -n 400 Ray -test-network-only -o NetworkTest-1
> > > >
> > > > Then, check the measured latency in NetworkTest-1/NetworkTest.txt
> > > >
> > > > ---Sébastien

_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
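
For reference, the figures quoted in this thread can be checked with a few lines of Python. This is only a rough sketch: the variable names are illustrative, "2 TB" is assumed to mean 2 * 10**12 bytes, and the script compares resources without predicting a run time.

```python
# Figures quoted in the thread (variable names are illustrative).
reads = 2_500_000_000        # "2.5g" paired-end reads
read_length_bp = 100         # read length in bases
total_bp = reads * read_length_bp

# Cluster used in the paper: 128 machines, each with 8 cores and 24 GiB.
paper_cores = 128 * 8
paper_ram_gib = 128 * 24

# Target machine: ~400 cores and 2 TB of RAM.
# Assumption: "2 TB" means 2 * 10**12 bytes, converted here to GiB.
target_cores = 400
target_ram_gib = 2 * 10**12 / 2**30

core_ratio = target_cores / paper_cores
ram_ratio = target_ram_gib / paper_ram_gib

print(f"total bases:  {total_bp / 1e9:.0f} Gbp")
print(f"paper setup:  {paper_cores} cores, {paper_ram_gib / 1024:.0f} TiB RAM")
print(f"target setup: {target_cores} cores, {target_ram_gib / 1024:.2f} TiB RAM")
print(f"core ratio:   {core_ratio:.2f}")
print(f"RAM ratio:    {ram_ratio:.2f}")
```

The target machine has fewer than half the cores and less memory than the cluster used in the paper, and the data set is different, so the ~15 hours figure is at best a loose reference point.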