On 20/09/13 05:06 PM, Nathaniel Jue wrote:
> Hi,
>
> I've run into a bit of an issue with Ray (v2.3.0-devel) and was wondering if
> you might be able to give me some advice/help. I keep getting this error
> message when I try to run Ray on all of my data with the following command
> (the ellipsis stands for the first two lines of the error being repeated 29
> more times, once per processor, for 30 processors in total). There is quite
> a bit of data in the analysis (2 runs of MiSeq data and 2 lanes of HiSeq):
>
>  >mpiexec -n 20 Ray -k 31 \
>    -p /data2/reads/illumina/limulus/GRL1397_S1_L001.left.rept.corr.fasta \
>       /data2/reads/illumina/limulus/GRL1397_S1_L001.right.rept.corr.fasta \
>    -p /data2/reads/illumina/limulus/GRL1402_errorCorrect/GRL1402.left.2.rept.corr.fasta \
>       /data2/reads/illumina/limulus/GRL1402_errorCorrect/GRL1402.right.rept.corr.fasta \
>    -s /data2/reads/illumina/limulus/GRL1397_S1_L001.up.rept.corr.fasta \
>    -s /data2/reads/illumina/limulus/GRL1402_errorCorrect/GRL1402.up.rept.corr.fasta \
>    -p /data2/reads/illumina/limulus/8871_CGATGT_L003_errorCorrect/8871_CGATGT_L003.left.2.rept.corr.fasta \
>       /data2/reads/illumina/limulus/8871_CGATGT_L003_errorCorrect/8871_CGATGT_L003.right.rept.corr.fasta \
>    -p /data2/reads/illumina/limulus/8871_CGATGT_L004_errorCorrect/8871_CGATGT_L004.left.2.rept.corr.fasta \
>       /data2/reads/illumina/limulus/8871_CGATGT_L004_errorCorrect/8871_CGATGT_L004.right.rept.corr.fasta \
>    -s /data2/reads/illumina/limulus/8871_CGATGT_L003_errorCorrect/8871_CGATGT_L003.up.rept.corr.fasta \
>    -s /data2/reads/illumina/limulus/8871_CGATGT_L004_errorCorrect/8871_CGATGT_L004.up.rept.corr.fasta \
>    -o limulus_ray_IlluminaOnly
>
> Subsequent error message:
>
> Critical exception: The system is out of memory, returned NULL.
> Requested -109661072 bytes of type RAY_MALLOC_TYPE_BLOOM_FILTER

So you are getting this with the git version of Ray. Strange.

> ...
> --------------------------------------------------------------------------
> mpiexec has exited due to process rank 8 with PID 22018 on
> node redqueen exiting improperly. There are two reasons this could occur:
>
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpiexec (as reported here).
> --------------------------------------------------------------------------
>
> I did look in the Ray mailing list, installed the development version of the
> program and Ray Platform, and found discussion of a patch, which I tried to
> apply to the program. When I did that, I got this:
>
> patching file code/VerticesExtractor/VerticesExtractor.cpp
> patching file code/VerticesExtractor/VerticesExtractor.h
> Reversed (or previously applied) patch detected!  Assume -R? [n]
>

You don't need to patch the code: the git repository already includes these
patches. They are meant to be applied to Ray 2.2.0.

> which I have to respond "y" to in order to patch the program. Even after
> patching, though, the program still gives me this error. I will also add
> that when I tried an assembly with just the MiSeq data, the program was able
> to finish the assembly with the same 20 processors indicated.
>
> I am using our supercomputer to do this assembly, which consists of 48
> Intel(R) Xeon(R) X7542 CPUs @ 2.67GHz (I think each has 6 cores, if I
> remember right; I'm not the hardware guy so not sure about that) with
> something like 500GB of RAM (again, I think).
>
> Do you have any thoughts or insight into what might be going on? Is it an
> MPI or Ray issue?


The number of bytes for the Bloom filter depends mostly on the number of reads.
I think this is a bug in Ray that occurs when you have too many reads
and not enough ranks.
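
A negative byte count in the error message is what you would expect if the
requested size overflowed a signed 32-bit integer, rather than the machine
really running out of RAM. The Python sketch below only illustrates that
arithmetic. It assumes the size is held in a signed 32-bit integer somewhere,
which has not been checked against the Ray source, and the helper name
to_int32 is hypothetical.

```python
def to_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


# Byte count copied from the error message above.
reported = -109661072

# Smallest non-negative size that would wrap around to the reported value,
# ASSUMING a signed 32-bit integer is involved (not confirmed for Ray).
candidate = reported + (1 << 32)

print(candidate)                        # candidate size in bytes
print(candidate / 2**30)                # same size in GiB
print(to_int32(candidate) == reported)  # checks the round trip
```

Under that assumption the intended request would be a little under 4 GiB per
rank, which fits the idea that spreading the reads over more ranks keeps each
rank's Bloom filter below the overflow threshold.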

Can you search for BloomFilter in your log?

You can do that with this command:

grep BloomFilter job.log


>
> Regards,
> Nate


_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users