Thanks for the answer. I have added comments below.
On Wed, Apr 17, 2013 at 1:55 PM, Sébastien Boisvert <
sebastien.boisver...@ulaval.ca> wrote:
> On 17/04/13 06:44 AM, Francesco Strozzi wrote:
> > Hello,
> > we are using Ray to assemble a metagenomics sample coming from a single
> > Illumina HiSeq lane (around 32 Gigabases of sequence with 100bp paired end
> > reads). We have prepared the taxonomy and bacterial genomes sequences
> > using the script available on the Ray repository. We are running the
> > analysis on a single server with 48 cores and 512 Gb of RAM
>
> You probably mean 512 GiB, not 512 Gb.
>
> > and the process has been running for 24h. Here is the command line used:
> >
> > mpiexec -n 40 Ray
> > -k 31
> > -p /metagenomics/sample1/cleanReads/sample1_R1_clean.fastq
> > /metagenomics/sample1/cleanReads/sample1_R2_clean.fastq
> > -serach /ray_taxo_file/NCBI-taxonomy/NCBI-Finished-Bacterial-Genomes
>
> The correct option is -search, not -serach.
>
That was a typo.
>
> > -with-taxonomy /ray_taxo_file/NCBI-taxonomy/Genome-to-Taxon.tsv
> > /ray_taxo_file/NCBI-taxonomy/TreeOfLife-Edges.tsv
> > /ray_taxo_file/NCBI-taxonomy/Taxon-Names.tsv
> >
> > The problem is that we are seeing a high memory usage, basically Ray is
> > taking all the 512 Gb of RAM, which is something unexpected since in the
> > Ray Meta paper it is said that for a test sample of this size (1 HiSeq
> > lane) it used ~ 2 Gb of memory on 128 cores.
>
> The paper says 1.5 GiB per core, with 1024 cores.
>
But for the 100-genome metagenome test, the paper says:
"The number of reads for this 100-genome metagenome roughly corresponds to
the number of reads generated by one lane of an Illumina HiSeq 2000 flow
cell (Illumina, Inc.) [...] This dataset was assembled by Ray Meta using
128 processor cores in 13 hours, 26 minutes, with an average memory usage
of 2 GB per core."
So we based our assumptions on this information.
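As a rough sketch of that expectation, the arithmetic looks like this. It assumes that the total memory needed for a dataset stays roughly constant when it is spread over fewer ranks; that is an assumption for illustration, not something the paper states. GB and GiB are treated loosely here.

```python
# Figures quoted from the Ray Meta paper for the 100-genome test.
paper_cores = 128
paper_gb_per_core = 2.0

# Our run: 40 ranks (mpiexec -n 40) on a server with 512 GiB of RAM.
our_ranks = 40
server_ram = 512

# Assumption (not from the paper): total memory does not depend much
# on how many ranks the dataset is split across.
paper_total_gb = paper_cores * paper_gb_per_core
expected_gb_per_rank = paper_total_gb / our_ranks

print(f"Total memory in the paper's run: {paper_total_gb:.0f} GB")
print(f"Expected per rank with {our_ranks} ranks: {expected_gb_per_rank:.1f} GB")
print(f"Would fit in server RAM: {paper_total_gb < server_ram}")
```

Under that assumption we expected roughly 256 GB in total, or about 6.4 GB per rank.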
> What are the last 10 lines in the standard output?
>
> Can you provide the last 10 lines of SequencePartition.txt?
>
Here are the last 10 lines of the standard output:
Rank 11: assembler memory usage: 15324264 KiB
Rank 8 has 273900000 vertices
Rank 8: assembler memory usage: 15324264 KiB
Rank 12 has 273800000 vertices
Rank 12: assembler memory usage: 15324264 KiB
Rank 24 is counting k-mers in sequence reads [6560001/8133265]
Speed RAY_SLAVE_MODE_ADD_VERTICES 2 units/second
Estimated remaining time for this step: 9 days, 2 hours, 30 minutes, 32 seconds
Rank 1 has 273800000 vertices
Rank 1: assembler memory usage: 15320168 KiB
[root@node1 RayOutput]# tail /var/spool/pbs/spool/4748.pbs.bioinformatics.tecnoparco.org.ER
[root@node1 RayOutput]# tail /var/spool/pbs/spool/4748.pbs.bioinformatics.tecnoparco.org.OU
Rank 12 has 273800000 vertices
Rank 12: assembler memory usage: 15324264 KiB
Rank 24 is counting k-mers in sequence reads [6560001/8133265]
Speed RAY_SLAVE_MODE_ADD_VERTICES 2 units/second
Estimated remaining time for this step: 9 days, 2 hours, 30 minutes, 32 seconds
Rank 1 has 273800000 vertices
Rank 1: assembler memory usage: 15320168 KiB
Rank 19 is counting k-mers in sequence reads [6520001/8133265]
Speed RAY_SLAVE_MODE_ADD_VERTICES 2 units/second
Estimated remaining time for this step: 9 days, 8 hours, 3 minutes, 52 seconds
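To put the per-rank figures in context, here is a small sketch that pulls the "assembler memory usage" lines out of the output above and projects them to all 40 ranks. It assumes the four ranks shown are representative of the others, and it takes the reported KiB value at face value; exactly what Ray counts in that figure is not confirmed here.

```python
import re

# Memory lines copied from the output above (one per rank shown).
LOG = """\
Rank 11: assembler memory usage: 15324264 KiB
Rank 8: assembler memory usage: 15324264 KiB
Rank 12: assembler memory usage: 15324264 KiB
Rank 1: assembler memory usage: 15320168 KiB
"""

PATTERN = re.compile(r"Rank (\d+): assembler memory usage: (\d+) KiB")

def latest_usage_kib(log_text):
    """Return {rank: KiB}, keeping the last value seen for each rank."""
    usage = {}
    for rank, kib in PATTERN.findall(log_text):
        usage[int(rank)] = int(kib)
    return usage

usage = latest_usage_kib(LOG)
mean_gib = sum(usage.values()) / len(usage) / 1024**2

# Assumption: the other ranks use about the same amount as these four.
ranks = 40
projected_gib = mean_gib * ranks

print(f"Mean per rank: {mean_gib:.1f} GiB")
print(f"Projected for {ranks} ranks: {projected_gib:.0f} GiB")
```

This gives about 14.6 GiB per rank, so roughly 585 GiB for 40 ranks, which is more than the 512 GiB installed.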
And here are the last lines of the SequencePartition.txt file:
30 243997950 252131214 8133265
31 252131215 260264479 8133265
32 260264480 268397744 8133265
33 268397745 276531009 8133265
34 276531010 284664274 8133265
35 284664275 292797539 8133265
36 292797540 300930804 8133265
37 300930805 309064069 8133265
38 309064070 317197334 8133265
39 317197335 325330621 8133287
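As a sanity check on the partition, the sketch below verifies that the rows are internally consistent and estimates the dataset size. The column meanings (rank, first sequence, last sequence, count) are inferred from the numbers, not taken from Ray's documentation, and the estimate assumes every read is exactly 100 bp, which may not hold for cleaned reads.

```python
# Rows copied from SequencePartition.txt above.
ROWS = """\
30 243997950 252131214 8133265
31 252131215 260264479 8133265
32 260264480 268397744 8133265
33 268397745 276531009 8133265
34 276531010 284664274 8133265
35 284664275 292797539 8133265
36 292797540 300930804 8133265
37 300930805 309064069 8133265
38 309064070 317197334 8133265
39 317197335 325330621 8133287
"""

def parse(text):
    """Parse whitespace-separated integer rows."""
    return [tuple(int(x) for x in line.split())
            for line in text.splitlines() if line.strip()]

rows = parse(ROWS)

# Assumed columns: rank, first index, last index, number of sequences.
counts_ok = all(last - first + 1 == count for _, first, last, count in rows)
contiguous = all(b[1] == a[2] + 1 for a, b in zip(rows, rows[1:]))

# Assumption: indices are 0-based, so last index + 1 is the total.
total_reads = rows[-1][2] + 1
total_gbases = total_reads * 100 / 1e9  # assumes 100 bp per read

print(f"Counts consistent: {counts_ok}, contiguous: {contiguous}")
print(f"Total reads: {total_reads}, about {total_gbases:.1f} Gbases")
```

The total of 325330622 reads is consistent with the roughly 32 Gigabases mentioned earlier, and the 8133265 reads per rank match the denominator in the "counting k-mers" progress lines.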
Thanks
--
Francesco
_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users