On 17/04/13 06:44 AM, Francesco Strozzi wrote:
> Hello,
> we are using Ray to assemble a metagenomics sample coming from a single
> Illumina HiSeq lane (around 32 gigabases of sequence with 100 bp paired-end
> reads). We have prepared the taxonomy and bacterial genome sequences using
> the script available in the Ray repository. We are running the analysis on
> a single server with 48 cores and 512 Gb of RAM

You probably mean 512 GiB, not 512 Gb (Gb denotes gigabits).

> and the process has been running for 24 h. Here is the command line used:
>
> mpiexec -n 40 Ray
> -k 31
> -p /metagenomics/sample1/cleanReads/sample1_R1_clean.fastq 
> /metagenomics/sample1/cleanReads/sample1_R2_clean.fastq
> -serach /ray_taxo_file/NCBI-taxonomy/NCBI-Finished-Bacterial-Genomes

The correct option is -search, not -serach.

> -with-taxonomy /ray_taxo_file/NCBI-taxonomy/Genome-to-Taxon.tsv 
> /ray_taxo_file/NCBI-taxonomy/TreeOfLife-Edges.tsv 
> /ray_taxo_file/NCBI-taxonomy/Taxon-Names.tsv
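
With the option spelled -search, the full command would read roughly as
follows. This is only a sketch that reuses the paths from your message; the
function name print_ray_command is made up for illustration, and the leading
"echo" makes it a dry run that prints the command rather than starting Ray.

```shell
# Dry run: print the corrected command instead of launching it.
# Remove the leading "echo" to actually start the job.
print_ray_command() {
  echo mpiexec -n 40 Ray \
    -k 31 \
    -p /metagenomics/sample1/cleanReads/sample1_R1_clean.fastq \
       /metagenomics/sample1/cleanReads/sample1_R2_clean.fastq \
    -search /ray_taxo_file/NCBI-taxonomy/NCBI-Finished-Bacterial-Genomes \
    -with-taxonomy /ray_taxo_file/NCBI-taxonomy/Genome-to-Taxon.tsv \
       /ray_taxo_file/NCBI-taxonomy/TreeOfLife-Edges.tsv \
       /ray_taxo_file/NCBI-taxonomy/Taxon-Names.tsv
}

print_ray_command
```

Printing the command first makes a misspelled option easy to spot before a
long run is started.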
>
> The problem is that we are seeing a high memory usage, basically Ray is
> taking all the 512 Gb of RAM, which is something unexpected since in the Ray
> Meta paper it is said that for a test sample of this size (1 HiSeq lane) it
> used ~ 2 Gb of memory on 128 cores.

The paper says 1.5 GiB per core, with 1024 cores, i.e. about 1.5 TiB in
aggregate.
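
The arithmetic behind these figures can be checked with a short shell sketch.
It only multiplies and divides the numbers quoted in this thread (1.5 GiB per
core, 1024 cores, 40 MPI ranks, 512 GiB of RAM); the variable names are mine,
and it is not a prediction of what Ray will actually allocate.

```shell
# Aggregate memory implied by the paper's figures: 1.5 GiB per core on 1024 cores.
paper_cores=1024
paper_mib_per_core=1536            # 1.5 GiB expressed in MiB
paper_total_gib=$(( paper_cores * paper_mib_per_core / 1024 ))
echo "paper run: ${paper_total_gib} GiB in total"

# Ceiling per MPI rank for this run if the server's 512 GiB were split
# evenly across the 40 ranks started with "mpiexec -n 40".
ranks=40
server_mib=$(( 512 * 1024 ))
per_rank_mib=$(( server_mib / ranks ))   # integer division, rounds down
echo "this run: at most ${per_rank_mib} MiB per rank"
```

Comparing per-rank figures rather than totals is the fairer comparison here,
since the paper states its memory usage per core.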

> So we expected the memory usage to be around 256 Gb of RAM, assuming the
> memory requested is proportional to the size of the input and not to the
> number of cores used.
>
> Is it normal to have Ray using so much memory or have we done something wrong 
> when we launched the process ?

What are the last 10 lines of the standard output?

Can you provide the last 10 lines of SequencePartition.txt?
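
Something along these lines should collect both. It is a sketch under two
assumptions: that the standard output was redirected to a file (ray.log is a
hypothetical name), and that the run writes to a directory called RayOutput.
Adjust both paths to your setup; last_lines is just a helper defined here.

```shell
# Print the last 10 lines of a file, or a note on stderr if it is missing.
last_lines() {
  if [ -f "$1" ]; then
    tail -n 10 "$1"
  else
    echo "not found: $1" >&2
  fi
}

# Assumed locations (see above): a redirected standard output file and
# the partition report inside the output directory.
last_lines ray.log
last_lines RayOutput/SequencePartition.txt
```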

>
> It is the first time we try using Ray so any help or advice is more than
> welcome!
>
> Many thanks
>
> --
>
> Francesco


_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
