On 02/23/2013 01:02 PM, James Vincent wrote:
> Unfortunately a bigger job has also failed with no apparent warnings.
> stdout has nothing beginning with 'error'. Is there something else to
> look at?
>
Did you look in the standard error ?

> The last few lines of stdout are:
>
> Rank 10 is purging edges [3450001/61093950]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14509 units/second
> Estimated remaining time for this step: 1 hours, 6 minutes, 12 seconds
> Rank 3 is purging edges [3200001/61096936]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 13563 units/second
> Estimated remaining time for this step: 1 hours, 11 minutes, 8 seconds
> Rank 5 is purging edges [3350001/61077916]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14354 units/second
> Estimated remaining time for this step: 1 hours, 7 minutes, 1 seconds
> Rank 7 is purging edges [3250001/61120658]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 13945 units/second
> Estimated remaining time for this step: 1 hours, 9 minutes, 9 seconds
> Rank 31 is purging edges [3400001/61067514]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14516 units/second
> Estimated remaining time for this step: 1 hours, 6 minutes, 12 seconds
> Rank 9 is purging edges [3500001/61066398]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14833 units/second
> Estimated remaining time for this step: 1 hours, 4 minutes, 40 seconds
> Rank 4 is purging edges [3700001/61082686]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 17092 units/second
> Estimated remaining time for this step: 55 minutes, 57 seconds
>
> And the last few entries in ElapsedTime are:
>
> ***
> Step: Coverage distribution analysis
> Date: Fri Feb 22 19:34:44 2013
> Elapsed time: 18 seconds
> Since beginning: 1 hours, 3 minutes, 36 seconds
> ***
>
> ***
> Step: Graph construction
> Date: Fri Feb 22 20:31:57 2013
> Elapsed time: 57 minutes, 13 seconds
> Since beginning: 2 hours, 49 seconds
> ***
>
> This job was run a little differently. I did not use any profiling,
> but I did set a minimum contig length:
>
> mpiexec --mca btl ^sm -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31 \
>     -minimum-contig-length $MINCONTIG >& out.$OUTDIR
>
> The input files have 78426887 paired reads. These were quality trimmed
> with sickle.
>
> Any ideas?
>
> Thanks,
> Jim
>
> On Fri, Feb 22, 2013 at 5:00 PM, Sébastien Boisvert
> <[email protected]> wrote:
>> On 02/22/2013 03:46 PM, James Vincent wrote:
>>> That did it - thanks very much. The job completed. It's still a small
>>> job, but I'll test at full size soon.
>>
>> Excellent !
>>
>>> This job completed quickly. The previous run must have been hanging on
>>> something, because it showed 100% CPU on all 40 cores for many, many
>>> hours with the same input. After making the change you suggested below,
>>> the job finishes in 1 hour.
>>
>> With the Message Passing Interface, Ray processes actively probe for new
>> messages; the design is not event-driven.
>>
>>> For 100K paired reads on 40 cores, does one hour sound roughly in the
>>> ballpark?
>>
>> Sure.
>>
>> Most of the time was probably spent in the graph coloring, though.
>>
>>> On Fri, Feb 22, 2013 at 9:34 AM, Sébastien Boisvert
>>> <[email protected]> wrote:
>>>> Can you try adding this to the Open-MPI options:
>>>>
>>>> --mca btl ^sm
>>>>
>>>> Full command:
>>>>
>>>> mpiexec --mca btl ^sm -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31 \
>>>>     -search $NCBIDIR/NCBI-Finished-Bacterial-Genomes \
>>>>     -with-taxonomy $NCBIDIR/Genome-to-Taxon.tsv \
>>>>         $NCBIDIR/TreeOfLife-Edges.tsv $NCBIDIR/Taxon-Names.tsv
>>>>
>>>> This Open-MPI option disables Open-MPI's "sm" byte transfer layer;
>>>> "sm" means shared memory.
>>>>
>>>> All messages will go through "tcp" (over the loopback, since it's all on
>>>> the same machine) or through "self".
>>>>
>>>> p.s.: Open-MPI 1.4.3 is really old (2010-10-05). The latest release is
>>>> Open-MPI 1.6.4.
>>>>
>>>> On 02/22/2013 08:06 AM, James Vincent wrote:
>>>>> It is Open-MPI 1.4.3.
>>>>> Here is the blurb at the start of a run:
>>>>>
>>>>> MAXKMERLENGTH: 32
>>>>> KMER_U64_ARRAY_SIZE: 1
>>>>> Maximum coverage depth stored by CoverageDepth: 4294967295
>>>>> MAXIMUM_MESSAGE_SIZE_IN_BYTES: 4000 bytes
>>>>> FORCE_PACKING = n
>>>>> ASSERT = n
>>>>> HAVE_LIBZ = n
>>>>> HAVE_LIBBZ2 = n
>>>>> CONFIG_PROFILER_COLLECT = n
>>>>> CONFIG_CLOCK_GETTIME = n
>>>>> __linux__ = y
>>>>> _MSC_VER = n
>>>>> __GNUC__ = y
>>>>> RAY_32_BITS = n
>>>>> RAY_64_BITS = y
>>>>> MPI standard version: MPI 2.1
>>>>> MPI library: Open-MPI 1.4.3
>>>>> Compiler: GNU gcc/g++ 4.4.6 20120305 (Red Hat 4.4.6-4)
>>>>>
>>>>> Rank 0: Operating System: Linux (__linux__) POSIX (OS_POSIX)
>>>>>
>>>>> On Fri, Feb 22, 2013 at 8:01 AM, Sébastien Boisvert
>>>>> <[email protected]> wrote:
>>>>>> Which MPI library are you using ?
>>>>>>
>>>>>> On 02/22/2013 07:34 AM, James Vincent wrote:
>>>>>>> Here are the last 50 lines :
>>>>>>>
>>>>>>> -bash-4.1$ tail -50 out.run
>>>>>>>
>>>>>>> OPERATION_incrementReferences operations: 32762
>>>>>>> OPERATION_decrementReferences operations: 32386
>>>>>>>
>>>>>>> OPERATION_purgeVirtualColor operations: 1567
>>>>>>> **********************************************************
>>>>>>>
>>>>>>> Rank 25: assembler memory usage: 138560 KiB
>>>>>>> Rank 31 biological abundances 3210 [1/1] [1752/2254] [4/7]
>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS processed files: 43/56
>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS processed sequences in file: 3/7
>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS total processed sequences: 81/107
>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS processed k-mers for current
>>>>>>> sequence: 2341425/0
>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS total processed k-mers: 149500000
>>>>>>> Speed RAY_SLAVE_MODE_ADD_COLORS 20803 units/second
>>>>>>>
>>>>>>> **********************************************************
>>>>>>> Coloring summary
>>>>>>> Number of virtual colors: 232
>>>>>>> Number of real colors: 3444
>>>>>>> Keys in index: 230
>>>>>>> Observed collisions when populating the index: 0
>>>>>>> COLOR_NAMESPACE_MULTIPLIER= 10000000000000000
>>>>>>>
>>>>>>> Operations
>>>>>>>
>>>>>>> OPERATION_getVirtualColorFrom operations: 35380
>>>>>>>
>>>>>>> OPERATION_IN_PLACE_ONE_REFERENCE: 31951
>>>>>>> OPERATION_NO_VIRTUAL_COLOR_HAS_HASH_CREATION operations: 1730
>>>>>>> OPERATION_VIRTUAL_COLOR_HAS_COLORS_FETCH operations: 1506
>>>>>>> OPERATION_NO_VIRTUAL_COLOR_HAS_COLORS_CREATION operations: 193
>>>>>>>
>>>>>>> OPERATION_createVirtualColorFrom operations: 1923
>>>>>>>
>>>>>>> OPERATION_allocateVirtualColorHandle operations: 1923
>>>>>>> OPERATION_NEW_FROM_EMPTY operations: 1692
>>>>>>> OPERATION_NEW_FROM_SCRATCH operations: 231
>>>>>>>
>>>>>>> OPERATION_applyHashOperation operations: 37303
>>>>>>> OPERATION_getHash operations: 0
>>>>>>>
>>>>>>> OPERATION_incrementReferences operations: 35380
>>>>>>> OPERATION_decrementReferences operations: 34985
>>>>>>>
>>>>>>> OPERATION_purgeVirtualColor operations: 1693
>>>>>>> **********************************************************
>>>>>>>
>>>>>>> Rank 31: assembler memory usage: 139452 KiB
>>>>>>>
>>>>>>> On Fri, Feb 22, 2013 at 7:13 AM, Sébastien Boisvert
>>>>>>> <[email protected]> wrote:
>>>>>>>> What's the last thing reported in stdout ?
>>>>>>>>
>>>>>>>> On 02/22/2013 06:23 AM, jjv5 wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I am running Ray Meta on a shared-memory machine with 40 cores and
>>>>>>>>> 1 TB of memory. One very small job with 25K reads finished and gave
>>>>>>>>> various taxonomic outputs. Jobs with slightly more input seem to
>>>>>>>>> never finish. The output log just stops, but there are no errors
>>>>>>>>> indicated. Where might one start to look to determine where a job
>>>>>>>>> has gone wrong?
>>>>>>>>>
>>>>>>>>> The command I use is below. There are 100,000 paired reads.
>>>>>>>>>
>>>>>>>>> mpiexec -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31 \
>>>>>>>>>     -search $NCBIDIR/NCBI-Finished-Bacterial-Genomes \
>>>>>>>>>     -with-taxonomy $NCBIDIR/Genome-to-Taxon.tsv \
>>>>>>>>>         $NCBIDIR/TreeOfLife-Edges.tsv $NCBIDIR/Taxon-Names.tsv
>>>>>>>>>
>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>> Everyone hates slow websites. So do we.
>>>>>>>>> Make your web apps faster with AppDynamics
>>>>>>>>> Download AppDynamics Lite for free today:
>>>>>>>>> http://p.sf.net/sfu/appdyn_d2d_feb
>>>>>>>>> _______________________________________________
>>>>>>>>> Denovoassembler-users mailing list
>>>>>>>>> [email protected]
>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
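[Editor's note] The `>& out.$OUTDIR` redirection used in the commands above merges stdout and stderr into a single file, so an abort message printed to stderr is easy to miss. Below is a minimal sketch of splitting the two streams; the `run_logged` helper, the file names, and the demo command are hypothetical illustrations, not part of Ray:

```shell
# Hypothetical helper: run any command with stdout and stderr captured
# in separate files, so abort/error messages are not buried in the log.
run_logged() {
    "$@" > run.out 2> run.err
}

# Stand-in command that writes to both streams (not Ray itself):
run_logged sh -c 'echo "Rank 0 is purging edges"; echo "error: demo" >&2'

# List which capture files mention a common failure keyword:
grep -Eil 'error|abort|signal' run.out run.err   # prints: run.err
```

With the streams separated, the error file stays empty on a clean run, which makes a failed run obvious at a glance.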
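[Editor's note] On the `--mca btl ^sm` option discussed above: in Open-MPI's MCA selection syntax, a leading `^` excludes the listed components rather than selecting them. The sketch below shows the exclusive form next to the roughly equivalent positive form for a single machine; `$RAY`, `$OUTDIR`, `$LEFT`, and `$RIGHT` are the thread's own placeholders, so these lines are illustrative, not run here:

```shell
# Exclude only the shared-memory BTL; the remaining BTLs ("tcp", "self")
# stay eligible. "^" means "everything except the listed components".
mpiexec --mca btl ^sm -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31

# Positive selection: name exactly the BTLs to use. On one machine,
# "tcp" rides the loopback interface; "self" handles rank-to-self sends
# and must be listed explicitly in this form.
mpiexec --mca btl tcp,self -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31
```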
