On 20/09/13 05:06 PM, Nathaniel Jue wrote:
> Hi,
>
> I've run into a bit of an issue with Ray (v2.3.0-devel) and was wondering if
> you might be able to give me some advice/help. I keep on getting this error
> message when I try to run Ray with all of my data with the following command
> (the ellipsis represents the first two lines of the error being repeated 29 more
> times for each processor or all 30 processors in total). There is quite a bit of
> data in the analysis (2 runs of MiSeq data and 2 lanes of HiSeq):
>
> mpiexec -n 20 Ray -k 31 \
>   -p /data2/reads/illumina/limulus/GRL1397_S1_L001.left.rept.corr.fasta \
>      /data2/reads/illumina/limulus/GRL1397_S1_L001.right.rept.corr.fasta \
>   -p /data2/reads/illumina/limulus/GRL1402_errorCorrect/GRL1402.left.2.rept.corr.fasta \
>      /data2/reads/illumina/limulus/GRL1402_errorCorrect/GRL1402.right.rept.corr.fasta \
>   -s /data2/reads/illumina/limulus/GRL1397_S1_L001.up.rept.corr.fasta \
>   -s /data2/reads/illumina/limulus/GRL1402_errorCorrect/GRL1402.up.rept.corr.fasta \
>   -p /data2/reads/illumina/limulus/8871_CGATGT_L003_errorCorrect/8871_CGATGT_L003.left.2.rept.corr.fasta \
>      /data2/reads/illumina/limulus/8871_CGATGT_L003_errorCorrect/8871_CGATGT_L003.right.rept.corr.fasta \
>   -p /data2/reads/illumina/limulus/8871_CGATGT_L004_errorCorrect/8871_CGATGT_L004.left.2.rept.corr.fasta \
>      /data2/reads/illumina/limulus/8871_CGATGT_L004_errorCorrect/8871_CGATGT_L004.right.rept.corr.fasta \
>   -s /data2/reads/illumina/limulus/8871_CGATGT_L003_errorCorrect/8871_CGATGT_L003.up.rept.corr.fasta \
>   -s /data2/reads/illumina/limulus/8871_CGATGT_L004_errorCorrect/8871_CGATGT_L004.up.rept.corr.fasta \
>   -o limulus_ray_IlluminaOnly
>
> Subsequent error message:
>
> Critical exception: The system is out of memory, returned NULL.
> Requested -109661072 bytes of type RAY_MALLOC_TYPE_BLOOM_FILTER
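
A note on the negative byte count in that error: a request of -109661072 bytes
cannot be a real allocation size. One possible reading (an assumption, not
something confirmed from the Ray source) is that the size passed through a
signed 32-bit integer and wrapped around. The sketch below is illustrative
Python, not Ray code; wrap_int32 is a hypothetical helper that reinterprets a
value as a signed 32-bit integer, and it shows the smallest non-negative size
that would wrap to the reported number.

```python
def wrap_int32(n: int) -> int:
    """Reinterpret n (modulo 2**32) as a signed 32-bit two's-complement value."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


# Value printed in the error message.
reported = -109661072

# Smallest non-negative size that wraps to the reported value,
# ASSUMING a single 32-bit wraparound (larger multiples of 2**32 also fit).
candidate = reported + 2**32

print(candidate)              # 4185306224 bytes, roughly 3.9 GiB
print(wrap_int32(candidate))  # -109661072
```

If that reading is right, the intended request was at least about 3.9 GiB for a
single Bloom filter, which would fit with the filter size growing with the
number of reads handled per rank.
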
So you are getting this with the git version of Ray. Strange.

> ...
> --------------------------------------------------------------------------
> mpiexec has exited due to process rank 8 with PID 22018 on
> node redqueen exiting improperly. There are two reasons this could occur:
>
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpiexec (as reported here).
> --------------------------------------------------------------------------
>
> I did look in the Ray mailing list, installed the development version of the
> program and Ray Platform and found discussion on a patch which I tried to
> apply to the program. When I did that, I got this:
>
> patching file code/VerticesExtractor/VerticesExtractor.cpp
> patching file code/VerticesExtractor/VerticesExtractor.h
> Reversed (or previously applied) patch detected! Assume -R? [n]
>

You don't need to patch the code since the git repository already includes
these patches. They are meant to be applied to Ray 2.2.0.

> which I have to respond "y" to in order to patch the program. Even after
> patching, though, the program still gives me this error. I will also add
> that when I tried an assembly with just the MiSeq data, the program was able
> to finish the assembly with the same 20 processors indicated.
>
> I am using our supercomputer to do this assembly, which consists of 48 Intel(R)
> Xeon(R) X7542 CPUs @ 2.67GHz (I think each has 6 cores, if I remember right;
> I'm not the hardware guy so not sure about that) with something like 500GB of
> RAM (again, I think).
>
> Do you have any thoughts or insight into what might be going on? MPI or Ray
> issue?

The number of bytes for the Bloom filter depends mostly on the number of reads.
I think it is a bug in Ray that occurs when you have too many reads and not
enough ranks.

Can you search for BloomFilter in your log? You can do that with this command:

    grep BloomFilter job.log

>
> Regards,
> Nate

_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users