Sebastien,
I'm still working out reasonable criteria for reducing my read inputs, but
in the meantime I tried your first suggested fix, setting the Bloom filter
size manually with -bloom-filter-bits, and it seems to be running. I'll let
the list (and you) know what kind of result I end up with.
Good luck on your applications.
Cheers,
Nate
On Mon, Sep 23, 2013 at 12:17 PM, Sébastien Boisvert <
sebastien.boisver...@ulaval.ca> wrote:
> On 23/09/13 11:44 AM, Nathaniel Jue wrote:
> > Sebastien,
> >
> > Okay, if I'm reading the output correctly, this should be the memory
> > allocation in KiB (is that bits or bytes? Or kibibytes? I'm not sure) for
> > each rank (which should basically be each processor, right?). Reads
> > allocated to each rank were 32868106.
>
> Hi Nathaniel,
>
> The Bloom filter is a component that accumulates most of the sequencing
> errors, which reduces the memory consumed by the graph.
>
> The size of the Bloom filter is a linear function of the number of reads
> because
> the number of sequencing errors is also a linear function of the number of
> reads.
>
>
> In your log, it says "-109661072 bytes of type
> RAY_MALLOC_TYPE_BLOOM_FILTER".
>
> A negative byte count means there is a bug in the code that causes an
> integer overflow.
>
>
> The number of bits for the Bloom filter is (as of Ray v2.3.0-devel):
>
> Bits = NumberOfReads * 4 * 2 * 2 * KmerLength
>
> In your case:
>
> Bits = 32868106 * 4 * 2 * 2 * 31 = 16 302 580 576 bits (2 037 822 572
> bytes).
>
>
> So this is a bug in Ray.
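
The negative byte count in the log can be reproduced from the formula above. What follows is a minimal Python sketch, not Ray's actual code. It assumes that the bit count was held in a signed 32-bit integer and that the filter was allocated as 64-bit words using C-style truncating division. The names wrap_int32 and bloom_filter_request are hypothetical.

```python
def wrap_int32(value):
    """Reinterpret an integer as a signed 32-bit value (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def bloom_filter_request(reads_per_rank, kmer_length):
    """Return (true_bits, wrapped_bits, requested_bytes).

    Assumption: the bit count is stored in a signed 32-bit integer and the
    filter is allocated as 64-bit words, with division truncating toward zero.
    """
    true_bits = reads_per_rank * 4 * 2 * 2 * kmer_length
    wrapped_bits = wrap_int32(true_bits)
    words = int(wrapped_bits / 64)  # int() truncates toward zero, like C
    return true_bits, wrapped_bits, words * 8


true_bits, wrapped_bits, requested_bytes = bloom_filter_request(32868106, 31)
print(true_bits)        # 16302580576
print(wrapped_bits)     # -877288608
print(requested_bytes)  # -109661072
```

Under these assumptions the result matches the -109661072 bytes reported in the log, which is consistent with a 32-bit overflow of the bit count.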
>
>
> But, I think that 657362120 reads (20 ranks * 32868106 reads / rank) is a
> lot of reads.
> You may want to scale out a little bit on this (add more cores).
>
>
> I created a ticket here: https://github.com/sebhtml/ray/issues/196
>
>
> However, at the moment I am busy with postdoctoral scholarship
> applications.
>
>
> Here are 2 possible workarounds that I can offer:
>
> 1. Set the number of bits manually with -bloom-filter-bits. For example,
> to use 512 MiB of memory
> for the Bloom filter on each MPI rank, use -bloom-filter-bits
> 4294967296 (512 * 1024 * 1024 * 8 bits).
>
> 2. Use more MPI ranks (more processor cores).
>
> If you go from 20 to 200 MPI ranks, each rank holds one tenth of the reads,
> so its Bloom filter is ten times smaller, and everything will be faster.
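
Both workarounds come down to simple arithmetic. The Python sketch below assumes the bits formula quoted above and a signed 32-bit limit on the bit count. That limit is an assumption inferred from the negative value in the log, and the helper names are hypothetical.

```python
INT32_MAX = 2**31 - 1


def bits_for_mebibytes(mebibytes):
    """Number of bits in the given number of MiB (1 MiB = 1024 * 1024 bytes)."""
    return mebibytes * 1024 * 1024 * 8


def default_bits(total_reads, ranks, kmer_length):
    """Bits per rank according to Bits = NumberOfReads * 4 * 2 * 2 * KmerLength."""
    reads_per_rank = -(-total_reads // ranks)  # integer ceiling division
    return reads_per_rank * 4 * 2 * 2 * kmer_length


total_reads = 657362120

# Workaround 1: value to pass for a 512 MiB filter per rank.
print(bits_for_mebibytes(512))  # 4294967296

# Workaround 2: does the default bit count fit in a signed 32-bit integer?
for ranks in (20, 200):
    bits = default_bits(total_reads, ranks, 31)
    print(ranks, bits, bits <= INT32_MAX)
```

With 20 ranks the default bit count exceeds the assumed 32-bit limit. With 200 ranks it is about 1.6 billion bits per rank, which fits.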
>
>
>
> Séb
>
> >
> > Rank-KiB
> >
> > 0-3008256
> > 1-2074124
> > 2-1926640
> > 3-1926636
> > 4-1926640
> > 5-1926636
> > 6-1074460
> > 7-1926636
> > 8-1926636
> > 9-1926640
> > 10-1910248
> > 11-1074460
> > 12-1926640
> > 13-1926640
> > 14-1926640
> > 15-1926640
> > 16-1074460
> > 17-1926640
> > 18-1926640
> > 19-1910248
> >
> > Rough estimate adds up to around 40GB, right?
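
KiB means kibibytes (1 KiB = 1024 bytes). The total can be checked with a few lines of Python, using the per-rank values listed above:

```python
# Per-rank memory in KiB, ranks 0 to 19, copied from the table above.
rank_kib = [
    3008256, 2074124, 1926640, 1926636, 1926640,
    1926636, 1074460, 1926636, 1926636, 1926640,
    1910248, 1074460, 1926640, 1926640, 1926640,
    1926640, 1074460, 1926640, 1926640, 1910248,
]

total_kib = sum(rank_kib)
total_bytes = total_kib * 1024  # 1 KiB = 1024 bytes

print(total_kib)            # 37172560
print(total_bytes / 1e9)    # total in GB (decimal)
print(total_kib / 2**20)    # total in GiB (binary)
```

That works out to roughly 38 GB (about 35.5 GiB), so the rough estimate of around 40 GB is in the right range.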
> >
> > Cheers,
> > Nate
> >
> > On Mon, Sep 23, 2013 at 9:36 AM, Sébastien Boisvert
> > <sebastien.boisver...@ulaval.ca> wrote:
> >
> > On 22/09/13 10:13 PM, Nathaniel Jue wrote:
> >
> > Hi Sebastien,
> >
> > By job.log, I assume you mean the stdout. I tried grepping
> > "BloomFilter" in both that output and all the files created in the output
> > directory. The only instances of the term BloomFilter occurred in the test
> > error example I sent you earlier, which occurs right after loading all the
> > reads. Do you think this is a too-many-reads issue or something else? If
> > so, any suggestions on how to deal with that?
> >
> >
> > (Please use the list.)
> >
> > It would be informative to get the number of bits that the Bloom filter
> > is trying to allocate. This information can be found in the standard output.
> >
> >
> > Regards,
> > Nate
> >
> >
> > On Sep 20, 2013 5:19 PM, "Sébastien Boisvert"
> > <sebastien.boisver...@ulaval.ca> wrote:
> >
> > On 20/09/13 05:06 PM, Nathaniel Jue wrote:
> >
> > Hi,
> >
> > I've run into a bit of an issue with Ray (v2.3.0-devel)
> > and was wondering if you might be able to give me some advice/help. I keep
> > getting this error message when I try to run Ray with all of my data
> > with the following command (the ellipsis represents the first two lines of
> > the error being repeated 29 more times, once for each processor, or 30
> > processors in total). There is quite a bit of data in the analysis (2 runs
> > of MiSeq data and 2 lanes of HiSeq):
> >
> > >mpiexec -n 20 Ray -k 31 \
> >   -p /data2/reads/illumina/limulus/GRL1397_S1_L001.left.rept.corr.fasta \
> >      /data2/reads/illumina/limulus/GRL1397_S1_L001.right.rept.corr.fasta \
> >   -p /data2/reads/illumina/limulus/GRL1402_errorCorrect/GRL1402.left.2.rept.corr.fasta \
> >      /data2/reads/illumina/limulus/GRL1402_errorCorrect/GRL1402.right.rept.corr.fasta \
> >   -s /data2/reads/illumina/limulus/GRL1397_S1_L001.up.rept.corr.fasta \
> >   -s /data2/reads/illumina/limulus/GRL1402_errorCorrect/GRL1402.up.rept.corr.fasta \
> >   -p /data2/reads/illumina/limulus/8871_CGATGT_L003_errorCorrect/8871_CGATGT_L003.left.2.rept.corr.fasta \
> >      /data2/reads/illumina/limulus/8871_CGATGT_L003_errorCorrect/8871_CGATGT_L003.right.rept.corr.fasta \
> >   -p /data2/reads/illumina/limulus/8871_CGATGT_L004_errorCorrect/8871_CGATGT_L004.left.2.rept.corr.fasta \
> >      /data2/reads/illumina/limulus/8871_CGATGT_L004_errorCorrect/8871_CGATGT_L004.right.rept.corr.fasta \
> >   -s /data2/reads/illumina/limulus/8871_CGATGT_L003_errorCorrect/8871_CGATGT_L003.up.rept.corr.fasta \
> >   -s /data2/reads/illumina/limulus/8871_CGATGT_L004_errorCorrect/8871_CGATGT_L004.up.rept.corr.fasta \
> >   -o limulus_ray_IlluminaOnly
> >
> >
> > Subsequent error message:
> >
> > Critical exception: The system is out of memory, returned NULL.
> > Requested -109661072 bytes of type RAY_MALLOC_TYPE_BLOOM_FILTER
> >
> >
> > So you are getting this with the git version of Ray.
> Strange.
> >
> > ...
> >
> > --------------------------------------------------------------------------
> >
> > mpiexec has exited due to process rank 8 with PID 22018 on
> > node redqueen exiting improperly. There are two reasons this could occur:
> >
> > 1. this process did not call "init" before exiting, but others in
> > the job did. This can cause a job to hang indefinitely while it waits
> > for all processes to call "init". By rule, if one process calls "init",
> > then ALL processes must call "init" prior to termination.
> >
> > 2. this process called "init", but exited without calling "finalize".
> > By rule, all processes that call "init" MUST call "finalize" prior to
> > exiting or it will be considered an "abnormal termination"
> >
> > This may have caused other processes in the application to be
> > terminated by signals sent by mpiexec (as reported here).
> >
> > --------------------------------------------------------------------------
> >
> >
> > I did look in the Ray mailing list, installed the
> > development version of the program and Ray Platform, and found discussion of
> > a patch, which I tried to apply to the program. When
> > I did that, I got this:
> >
> > patching file code/VerticesExtractor/VerticesExtractor.cpp
> > patching file code/VerticesExtractor/VerticesExtractor.h
> >
> > Reversed (or previously applied) patch detected! Assume -R? [n]
> >
> >
> > You don't need to patch the code since the git repository
> includes these
> > patches already.
> >
> > These patches are to be applied on Ray 2.2.0.
> >
> > which I have to respond "y" to in order to patch the
> > program. Even after patching, though, the program still gives me this
> > error. I will also add that when I tried
> > an assembly with just the MiSeq data, the program was
> > able to finish the assembly with the same 20 processors.
> >
> > I am using our supercomputer to do this assembly, which
> > consists of 48 Intel(R) Xeon(R) X7542 CPUs @ 2.67GHz (I think each has 6
> > cores, if I remember right;
> > I'm not the hardware guy so not sure about that) with
> > something like 500GB of RAM (again, I think).
> >
> > Do you have any thoughts or insight into what might be
> > going on? Is it an MPI or a Ray issue?
> >
> >
> >
> > The number of bytes for the Bloom filter depends on the
> number of reads, mostly.
> > I think it is a bug in Ray that occurs when you have too
> many reads
> > and not enough ranks.
> >
> > Can you search for BloomFilter in your log ?
> >
> > You can do that with this command:
> >
> > grep BloomFilter job.log
> >
> >
> >
> > Regards,
> > Nate
> >
> >
> >
> >
>
>
>
> _______________________________________________
> Denovoassembler-users mailing list
> Denovoassembler-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
>