Thanks for the answer. I have added my comments below.

On Wed, Apr 17, 2013 at 1:55 PM, Sébastien Boisvert <
sebastien.boisver...@ulaval.ca> wrote:

> On 17/04/13 06:44 AM, Francesco Strozzi wrote:
> > Hello,
> > we are using Ray to assemble a metagenomics sample coming from a single
> > Illumina HiSeq lane (around 32 Gigabases of sequence with 100bp paired end
> > reads). We have prepared the taxonomy and bacterial genome sequences using
> > the script available on the Ray repository. We are running the analysis on
> > a single server with 48 cores and 512 Gb of RAM
>
> You probably mean 512 GiB, not 512 Gb.
>
> > and the process has been running for 24 hours. Here is the command line
> > used:
> >
> > mpiexec -n 40 Ray
> > -k 31
> > -p /metagenomics/sample1/cleanReads/sample1_R1_clean.fastq
> > /metagenomics/sample1/cleanReads/sample1_R2_clean.fastq
> > -serach /ray_taxo_file/NCBI-taxonomy/NCBI-Finished-Bacterial-Genomes
>
> The correct option is -search, not -serach.
>

That was a typo.


>
> > -with-taxonomy /ray_taxo_file/NCBI-taxonomy/Genome-to-Taxon.tsv
> > /ray_taxo_file/NCBI-taxonomy/TreeOfLife-Edges.tsv
> > /ray_taxo_file/NCBI-taxonomy/Taxon-Names.tsv
> >
> > The problem is that we are seeing high memory usage: Ray is taking all
> > of the 512 Gb of RAM. This is unexpected, since the Ray Meta paper says
> > that for a test sample of this size (1 HiSeq lane) it used ~ 2 Gb of
> > memory on 128 cores.
>
> The paper says 1.5 GiB per core, with 1024 cores.
>

But for the 100-genome metagenome test, the paper says:

 "The number of reads for this 100-genome metagenome roughly corresponds to
the number
 of reads generated by one lane of an Illumina HiSeq 2000
 flow cell (Illumina, Inc.)[...]This dataset was assembled by Ray Meta
 using 128 processor cores in 13 hours, 26 minutes, with an
 average memory usage of 2 GB per core."

So we based our assumptions on this information.


> What are the last 10 lines in the standard output?
>
> Can you provide the last 10 lines of SequencePartition.txt ?
>

Here are the last 10 lines of the standard output:

Rank 11: assembler memory usage: 15324264 KiB
Rank 8 has 273900000 vertices
Rank 8: assembler memory usage: 15324264 KiB
Rank 12 has 273800000 vertices
Rank 12: assembler memory usage: 15324264 KiB
Rank 24 is counting k-mers in sequence reads [6560001/8133265]
Speed RAY_SLAVE_MODE_ADD_VERTICES 2 units/second
Estimated remaining time for this step: 9 days, 2 hours, 30 minutes, 32 seconds
Rank 1 has 273800000 vertices
Rank 1: assembler memory usage: 15320168 KiB
[root@node1 RayOutput]# tail /var/spool/pbs/spool/4748.pbs.bioinformatics.tecnoparco.org.ER
[root@node1 RayOutput]# tail /var/spool/pbs/spool/4748.pbs.bioinformatics.tecnoparco.org.OU
Rank 12 has 273800000 vertices
Rank 12: assembler memory usage: 15324264 KiB
Rank 24 is counting k-mers in sequence reads [6560001/8133265]
Speed RAY_SLAVE_MODE_ADD_VERTICES 2 units/second
Estimated remaining time for this step: 9 days, 2 hours, 30 minutes, 32 seconds
Rank 1 has 273800000 vertices
Rank 1: assembler memory usage: 15320168 KiB
Rank 19 is counting k-mers in sequence reads [6520001/8133265]
Speed RAY_SLAVE_MODE_ADD_VERTICES 2 units/second
Estimated remaining time for this step: 9 days, 8 hours, 3 minutes, 52 seconds
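
As a rough cross-check of these numbers, here is a small Python sketch that reads the log lines quoted above. It rests on two assumptions that the log itself does not confirm: that the ranks not shown use about as much memory as the four that are shown, and that the remaining time is simply the unprocessed sequences divided by the reported speed. The function names are made up for this illustration and are not part of Ray.

```python
import re

# Lines copied from the standard output quoted above.
LOG = """\
Rank 11: assembler memory usage: 15324264 KiB
Rank 8: assembler memory usage: 15324264 KiB
Rank 12: assembler memory usage: 15324264 KiB
Rank 1: assembler memory usage: 15320168 KiB
Rank 24 is counting k-mers in sequence reads [6560001/8133265]
Speed RAY_SLAVE_MODE_ADD_VERTICES 2 units/second
"""

KIB_PER_GIB = 1024 ** 2
RANKS = 40             # from "mpiexec -n 40"
SERVER_RAM_GIB = 512   # server memory as discussed in the thread


def memory_per_rank_kib(log):
    """Map rank number -> reported assembler memory usage in KiB."""
    pattern = re.compile(r"^Rank (\d+): assembler memory usage: (\d+) KiB$", re.M)
    return {int(rank): int(kib) for rank, kib in pattern.findall(log)}


def extrapolated_total_gib(usage, ranks):
    """Assumption: every rank uses about the mean of the reported ranks."""
    mean_kib = sum(usage.values()) / len(usage)
    return mean_kib * ranks / KIB_PER_GIB


def remaining_seconds(log):
    """Assumption: remaining time = unprocessed sequences / reported speed."""
    done, total = map(int, re.search(r"\[(\d+)/(\d+)\]", log).groups())
    speed = int(re.search(r"(\d+) units/second", log).group(1))
    return (total - done) // speed


usage = memory_per_rank_kib(LOG)
total_gib = extrapolated_total_gib(usage, RANKS)
print(f"extrapolated total: {total_gib:.1f} GiB, server RAM: {SERVER_RAM_GIB} GiB")

days, rest = divmod(remaining_seconds(LOG), 86400)
hours, rest = divmod(rest, 3600)
minutes, seconds = divmod(rest, 60)
print(f"remaining for rank 24: {days} d {hours} h {minutes} min {seconds} s")
```

Under those assumptions the 40 ranks together would need about 584 GiB, which is more than the 512 GiB on the server, so the machine may be swapping, and that could be one reason for the speed of 2 units/second. The remaining time computed this way for rank 24 comes out at 9 days, 2 hours, 30 minutes, 32 seconds, the same as the estimate in the log.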


And here are the last lines of the SequencePartition.txt file:

30    243997950    252131214    8133265
31    252131215    260264479    8133265
32    260264480    268397744    8133265
33    268397745    276531009    8133265
34    276531010    284664274    8133265
35    284664275    292797539    8133265
36    292797540    300930804    8133265
37    300930805    309064069    8133265
38    309064070    317197334    8133265
39    317197335    325330621    8133287
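
To check that the partition matches the input size, here is a short Python sketch over the ten rows above. The column meaning (rank, first sequence index, last sequence index, number of sequences) is my reading of the file and is not documented in this thread. The sketch also assumes that indices start at 0 and that every read is exactly 100 bp, which may not hold for cleaned reads.

```python
# Last ten rows of SequencePartition.txt, copied from the message above.
ROWS = """\
30    243997950    252131214    8133265
31    252131215    260264479    8133265
32    260264480    268397744    8133265
33    268397745    276531009    8133265
34    276531010    284664274    8133265
35    284664275    292797539    8133265
36    292797540    300930804    8133265
37    300930805    309064069    8133265
38    309064070    317197334    8133265
39    317197335    325330621    8133287
"""

READ_LENGTH_BP = 100  # "100bp paired end reads" in the original question


def parse_partition(text):
    """Return (rank, first, last, count) tuples; the column meaning is assumed."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            rank, first, last, count = map(int, line.split())
            rows.append((rank, first, last, count))
    return rows


rows = parse_partition(ROWS)

# Check the assumed column meaning: count == last - first + 1 on every row.
consistent = all(last - first + 1 == count for _, first, last, count in rows)

# Assumption: indices start at 0, so the total is the last index plus one.
total_sequences = rows[-1][2] + 1
total_bases = total_sequences * READ_LENGTH_BP

print(f"columns consistent: {consistent}")
print(f"total sequences: {total_sequences}")
print(f"approximate bases: {total_bases / 1e9:.1f} Gb")
```

With these assumptions the total is 325,330,622 sequences, or about 32.5 gigabases, which agrees with the "around 32 Gigabases" in the original question and suggests that all reads were distributed across the 40 ranks.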


Thanks

-- 

Francesco
_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
