[Please CC the mailing list]
On 02/23/2013 05:26 PM, James Vincent wrote:
> I redirect stdout and stderr together into one file. It's huge.
>
> The last step that was logged is this:
>
> ***
> Step: Estimation of outer distances for paired reads
> Date: Sat Feb 23 16:49:02 2013
> Elapsed time: 42 seconds
> Since beginning: 3 hours, 38 minutes, 41 seconds
> ***
>
> That is about halfway through the log. It's followed by many lines like this:
>
> Current peak coverage -> 178
> Rank 0 reached 0 vertices from seed 0, flow 1
> Rank 0: assembler memory usage: 2318560 KiB
> Rank 15 reached 1000 vertices from seed 0, flow 1
> Speed RAY_SLAVE_MODE_EXTENSION 5503 units/second
> Rank 15: assembler memory usage: 2314452 KiB
> Rank 33 reached 1000 vertices from seed 0, flow 1
> Speed RAY_SLAVE_MODE_EXTENSION 5107 units/second
> Rank 33: assembler memory usage: 2322828 KiB
> Rank 16 reached 1000 vertices from seed 0, flow 1
> Speed RAY_SLAVE_MODE_EXTENSION 4963 units/second
> Rank 16: assembler memory usage: 2318688 KiB
> Rank 15 reached 1163 vertices from seed 0, flow 1
> Speed RAY_SLAVE_MODE_EXTENSION 7050 units/second
> Rank 15: assembler memory usage: 2314452 KiB

In your other email, you said that the last lines were:

Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 17092 units/second
Estimated remaining time for this step: 55 minutes, 57 seconds

Is your job still running ?

> fgrep -c RAY_SLAVE_MODE_EXTENSION out.Sample_BS.sickle
> 1394932
>
> On Sat, Feb 23, 2013 at 3:41 PM, Sébastien Boisvert
> <[email protected]> wrote:
>> On 02/23/2013 01:02 PM, James Vincent wrote:
>>>
>>> Unfortunately, a bigger job has also failed with no apparent warnings.
>>> stdout has nothing beginning with 'error'. Is there something else to
>>> look at?
>>>
>>
>> Did you look in the standard error ?
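A quick way to tell whether the extension phase is still making progress is to count the progress lines as James did above, and rerun the count a little later: a growing number means the job is still advancing. A minimal sketch on a throwaway sample log (the file `sample.log` below is made up, not one from this thread):

```shell
# Build a tiny stand-in for the real Ray log
printf '%s\n' \
  'Speed RAY_SLAVE_MODE_EXTENSION 5503 units/second' \
  'Rank 15: assembler memory usage: 2314452 KiB' \
  'Speed RAY_SLAVE_MODE_EXTENSION 5107 units/second' > sample.log

# Count the progress lines, as in the thread
fgrep -c RAY_SLAVE_MODE_EXTENSION sample.log   # prints 2
```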
>>
>>
>>> The last few lines of stdout are:
>>>
>>> Rank 10 is purging edges [3450001/61093950]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14509 units/second
>>> Estimated remaining time for this step: 1 hours, 6 minutes, 12 seconds
>>> Rank 3 is purging edges [3200001/61096936]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 13563 units/second
>>> Estimated remaining time for this step: 1 hours, 11 minutes, 8 seconds
>>> Rank 5 is purging edges [3350001/61077916]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14354 units/second
>>> Estimated remaining time for this step: 1 hours, 7 minutes, 1 seconds
>>> Rank 7 is purging edges [3250001/61120658]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 13945 units/second
>>> Estimated remaining time for this step: 1 hours, 9 minutes, 9 seconds
>>> Rank 31 is purging edges [3400001/61067514]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14516 units/second
>>> Estimated remaining time for this step: 1 hours, 6 minutes, 12 seconds
>>> Rank 9 is purging edges [3500001/61066398]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14833 units/second
>>> Estimated remaining time for this step: 1 hours, 4 minutes, 40 seconds
>>> Rank 4 is purging edges [3700001/61082686]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 17092 units/second
>>> Estimated remaining time for this step: 55 minutes, 57 seconds
>>>
>>> And the last few entries in ElapsedTime are:
>>>
>>> ***
>>> Step: Coverage distribution analysis
>>> Date: Fri Feb 22 19:34:44 2013
>>> Elapsed time: 18 seconds
>>> Since beginning: 1 hours, 3 minutes, 36 seconds
>>> ***
>>>
>>> ***
>>> Step: Graph construction
>>> Date: Fri Feb 22 20:31:57 2013
>>> Elapsed time: 57 minutes, 13 seconds
>>> Since beginning: 2 hours, 49 seconds
>>> ***
>>>
>>> This job was run a little differently.
>>> I did not use any profiling,
>>> but I did set a minimum contig length:
>>>
>>> mpiexec --mca btl ^sm -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31 \
>>>     -minimum-contig-length $MINCONTIG >& out.$OUTDIR
>>>
>>> The input files have 78426887 paired reads. These were quality trimmed
>>> with sickle.
>>>
>>> Any ideas?
>>>
>>> Thanks,
>>> Jim
>>>
>>> On Fri, Feb 22, 2013 at 5:00 PM, Sébastien Boisvert
>>> <[email protected]> wrote:
>>>> On 02/22/2013 03:46 PM, James Vincent wrote:
>>>>>
>>>>> That did it - thanks very much. The job completed. It's still a small
>>>>> job, but I'll test with full size soon.
>>>>>
>>>>
>>>> Excellent !
>>>>
>>>>> This job completed quickly. The previous run must have been hanging on
>>>>> something, because it showed 100% CPU on all 40 cores for many, many
>>>>> hours with the same input. After making the change you suggested below,
>>>>> the job finished in 1 hour.
>>>>>
>>>>
>>>> With the message passing interface, Ray processes actively probe for new
>>>> messages; Ray is not event-driven.
>>>>
>>>>> For 100K paired reads on 40 cores, does one hour sound roughly in the
>>>>> ballpark?
>>>>>
>>>>
>>>> Sure.
>>>>
>>>> Most of the time was probably spent in the graph coloring, though.
>>>>
>>>>> On Fri, Feb 22, 2013 at 9:34 AM, Sébastien Boisvert
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> Can you try adding this to the Open-MPI options:
>>>>>>
>>>>>> --mca btl ^sm
>>>>>>
>>>>>> Full command:
>>>>>>
>>>>>> mpiexec --mca btl ^sm -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31 \
>>>>>>     -search $NCBIDIR/NCBI-Finished-Bacterial-Genomes \
>>>>>>     -with-taxonomy $NCBIDIR/Genome-to-Taxon.tsv \
>>>>>>         $NCBIDIR/TreeOfLife-Edges.tsv $NCBIDIR/Taxon-Names.tsv
>>>>>>
>>>>>> This Open-MPI option disables Open-MPI's "sm" byte transfer layer.
>>>>>> "sm" means shared memory.
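One note on the redirection in the commands above: the csh-style `>&` merges stdout and stderr into a single file, which is why "stdout has nothing beginning with 'error'" does not rule out a message on stderr. Keeping the two streams in separate files makes silent failures easier to spot. A minimal sketch with a stand-in command (the filenames `out.log` and `err.log` are made up):

```shell
# Stand-in for the long-running job: writes one line to each stream
{ echo 'normal progress output'; echo 'something went wrong' >&2; } \
    > out.log 2> err.log

# Error messages now land only in err.log
cat err.log   # prints: something went wrong
```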
>>>>>>
>>>>>> All messages will go through "tcp" (using the loopback, since it's
>>>>>> all on the same machine), or through "self".
>>>>>>
>>>>>> p.s.: Open-MPI 1.4.3 is really old (2010-10-05). The latest release
>>>>>> is Open-MPI 1.6.4.
>>>>>>
>>>>>> On 02/22/2013 08:06 AM, James Vincent wrote:
>>>>>>>
>>>>>>> It is Open-MPI 1.4.3. Here is the blurb at the start of a run:
>>>>>>>
>>>>>>> MAXKMERLENGTH: 32
>>>>>>> KMER_U64_ARRAY_SIZE: 1
>>>>>>> Maximum coverage depth stored by CoverageDepth: 4294967295
>>>>>>> MAXIMUM_MESSAGE_SIZE_IN_BYTES: 4000 bytes
>>>>>>> FORCE_PACKING = n
>>>>>>> ASSERT = n
>>>>>>> HAVE_LIBZ = n
>>>>>>> HAVE_LIBBZ2 = n
>>>>>>> CONFIG_PROFILER_COLLECT = n
>>>>>>> CONFIG_CLOCK_GETTIME = n
>>>>>>> __linux__ = y
>>>>>>> _MSC_VER = n
>>>>>>> __GNUC__ = y
>>>>>>> RAY_32_BITS = n
>>>>>>> RAY_64_BITS = y
>>>>>>> MPI standard version: MPI 2.1
>>>>>>> MPI library: Open-MPI 1.4.3
>>>>>>> Compiler: GNU gcc/g++ 4.4.6 20120305 (Red Hat 4.4.6-4)
>>>>>>>
>>>>>>> Rank 0: Operating System: Linux (__linux__) POSIX (OS_POSIX)
>>>>>>>
>>>>>>> On Fri, Feb 22, 2013 at 8:01 AM, Sébastien Boisvert
>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Which MPI library are you using ?
>>>>>>>>
>>>>>>>> On 02/22/2013 07:34 AM, James Vincent wrote:
>>>>>>>>>
>>>>>>>>> Here are the last 50 lines:
>>>>>>>>>
>>>>>>>>> -bash-4.1$ tail -50 out.run
>>>>>>>>>
>>>>>>>>> OPERATION_incrementReferences operations: 32762
>>>>>>>>> OPERATION_decrementReferences operations: 32386
>>>>>>>>>
>>>>>>>>> OPERATION_purgeVirtualColor operations: 1567
>>>>>>>>> **********************************************************
>>>>>>>>>
>>>>>>>>> Rank 25: assembler memory usage: 138560 KiB
>>>>>>>>> Rank 31 biological abundances 3210 [1/1] [1752/2254] [4/7]
>>>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS processed files: 43/56
>>>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS processed sequences in file: 3/7
>>>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS total processed sequences: 81/107
>>>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS processed k-mers for current sequence: 2341425/0
>>>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS total processed k-mers: 149500000
>>>>>>>>> Speed RAY_SLAVE_MODE_ADD_COLORS 20803 units/second
>>>>>>>>>
>>>>>>>>> **********************************************************
>>>>>>>>> Coloring summary
>>>>>>>>> Number of virtual colors: 232
>>>>>>>>> Number of real colors: 3444
>>>>>>>>>
>>>>>>>>> Keys in index: 230
>>>>>>>>> Observed collisions when populating the index: 0
>>>>>>>>> COLOR_NAMESPACE_MULTIPLIER= 10000000000000000
>>>>>>>>>
>>>>>>>>> Operations
>>>>>>>>>
>>>>>>>>> OPERATION_getVirtualColorFrom operations: 35380
>>>>>>>>>
>>>>>>>>> OPERATION_IN_PLACE_ONE_REFERENCE: 31951
>>>>>>>>> OPERATION_NO_VIRTUAL_COLOR_HAS_HASH_CREATION operations: 1730
>>>>>>>>> OPERATION_VIRTUAL_COLOR_HAS_COLORS_FETCH operations: 1506
>>>>>>>>> OPERATION_NO_VIRTUAL_COLOR_HAS_COLORS_CREATION operations: 193
>>>>>>>>>
>>>>>>>>> OPERATION_createVirtualColorFrom operations: 1923
>>>>>>>>>
>>>>>>>>> OPERATION_allocateVirtualColorHandle operations: 1923
>>>>>>>>> OPERATION_NEW_FROM_EMPTY operations: 1692
>>>>>>>>> OPERATION_NEW_FROM_SCRATCH operations: 231
>>>>>>>>>
>>>>>>>>> OPERATION_applyHashOperation operations: 37303
>>>>>>>>> OPERATION_getHash operations: 0
>>>>>>>>>
>>>>>>>>> OPERATION_incrementReferences operations: 35380
>>>>>>>>> OPERATION_decrementReferences operations: 34985
>>>>>>>>>
>>>>>>>>> OPERATION_purgeVirtualColor operations: 1693
>>>>>>>>> **********************************************************
>>>>>>>>>
>>>>>>>>> Rank 31: assembler memory usage: 139452 KiB
>>>>>>>>>
>>>>>>>>> On Fri, Feb 22, 2013 at 7:13 AM, Sébastien Boisvert
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> What's the last thing reported in stdout ?
>>>>>>>>>>
>>>>>>>>>> On 02/22/2013 06:23 AM, jjv5 wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> I am running Ray Meta on a shared-memory machine with 40 cores and
>>>>>>>>>>> 1 TB of memory. One very small job with 25K reads finished and gave
>>>>>>>>>>> various taxonomic outputs. Jobs with slightly more input seem to
>>>>>>>>>>> never finish. The output log just stops, but there are no errors
>>>>>>>>>>> indicated. Where might one start to look to determine where a job
>>>>>>>>>>> has gone wrong?
>>>>>>>>>>>
>>>>>>>>>>> The command I use is below. There are 100,000 paired reads.
>>>>>>>>>>>
>>>>>>>>>>> mpiexec -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31 \
>>>>>>>>>>>     -search $NCBIDIR/NCBI-Finished-Bacterial-Genomes \
>>>>>>>>>>>     -with-taxonomy $NCBIDIR/Genome-to-Taxon.tsv \
>>>>>>>>>>>         $NCBIDIR/TreeOfLife-Edges.tsv $NCBIDIR/Taxon-Names.tsv
>>>>>>>>>>>
>>>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>>>> Everyone hates slow websites. So do we.
>>>>>>>>>>> Make your web apps faster with AppDynamics
>>>>>>>>>>> Download AppDynamics Lite for free today:
>>>>>>>>>>> http://p.sf.net/sfu/appdyn_d2d_feb
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Denovoassembler-users mailing list
>>>>>>>>>>> [email protected]
>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
