On 02/23/2013 01:02 PM, James Vincent wrote:
> Unfortunately a bigger job has also failed with no apparent warnings.
> stdout has nothing beginning with 'error'. Is there something else to
> look at?
>

Did you look at standard error?
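
A rough sketch of one way to check: the command quoted below ends with `>&out.$OUTDIR`, which in bash sends standard error to the same file as standard output, so any failure message would be mixed into that log and need not begin with 'error'. The Python helper below scans log lines for a few case-insensitive keywords. The keyword list, the function name `find_suspicious_lines`, and the third sample line are assumptions made up for illustration; they are not messages that Ray or Open-MPI are guaranteed to print.

```python
import re

# Hypothetical keyword list; adjust it to whatever your MPI library and
# operating system actually report when a process dies.
SUSPICIOUS = re.compile(
    r"error|signal|killed|abort|segmentation fault|out of memory",
    re.IGNORECASE,
)

def find_suspicious_lines(lines):
    """Return (line_number, text) pairs for lines matching any keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if SUSPICIOUS.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits

# Two lines copied from the quoted log plus one made-up failure line.
sample = [
    "Rank 4 is purging edges [3700001/61082686]\n",
    "Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 17092 units/second\n",
    "process exited on signal 9 (Killed)\n",  # invented for the demo
]
print(find_suspicious_lines(sample))
```

To scan a real log, pass an open file handle instead of `sample`, since iterating over a file yields its lines.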
  
> The last few lines of stdout are:
>
> Rank 10 is purging edges [3450001/61093950]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14509 units/second
> Estimated remaining time for this step: 1 hours, 6 minutes, 12 seconds
> Rank 3 is purging edges [3200001/61096936]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 13563 units/second
> Estimated remaining time for this step: 1 hours, 11 minutes, 8 seconds
> Rank 5 is purging edges [3350001/61077916]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14354 units/second
> Estimated remaining time for this step: 1 hours, 7 minutes, 1 seconds
> Rank 7 is purging edges [3250001/61120658]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 13945 units/second
> Estimated remaining time for this step: 1 hours, 9 minutes, 9 seconds
> Rank 31 is purging edges [3400001/61067514]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14516 units/second
> Estimated remaining time for this step: 1 hours, 6 minutes, 12 seconds
> Rank 9 is purging edges [3500001/61066398]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14833 units/second
> Estimated remaining time for this step: 1 hours, 4 minutes, 40 seconds
> Rank 4 is purging edges [3700001/61082686]
> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 17092 units/second
> Estimated remaining time for this step: 55 minutes, 57 seconds
>
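
The progress lines above carry enough information to recompute the estimates, which shows how far each rank still was from finishing when the output stopped. The sketch below parses one progress line and one speed line and divides the remaining edges by the reported speed. Truncating with integer division is an assumption that happens to reproduce the quoted estimates; it is not taken from Ray's source. The function names are illustrative.

```python
import re

PROGRESS = re.compile(r"\[(\d+)/(\d+)\]")     # e.g. [3700001/61082686]
SPEED = re.compile(r"(\d+) units/second")     # e.g. 17092 units/second

def remaining_seconds(progress_line, speed_line):
    """Seconds left = (total - done) / speed, truncated to an integer."""
    done, total = map(int, PROGRESS.search(progress_line).groups())
    speed = int(SPEED.search(speed_line).group(1))
    return (total - done) // speed

def split_duration(seconds):
    """Split a number of seconds into (hours, minutes, seconds)."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return hours, minutes, secs

eta = remaining_seconds(
    "Rank 4 is purging edges [3700001/61082686]",
    "Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 17092 units/second",
)
print(split_duration(eta))  # (0, 55, 57), matching "55 minutes, 57 seconds"
```

Every rank shown was roughly an hour away from completing the step, so the job stopped in the middle of edge purging and did not stall at a step boundary.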
> And the last few entries in ElapsedTime are:
>
>
> ***
> Step: Coverage distribution analysis
> Date: Fri Feb 22 19:34:44 2013
> Elapsed time: 18 seconds
> Since beginning: 1 hours, 3 minutes, 36 seconds
> ***
>
>
> ***
> Step: Graph construction
> Date: Fri Feb 22 20:31:57 2013
> Elapsed time: 57 minutes, 13 seconds
> Since beginning: 2 hours, 49 seconds
> ***
>
> This job was run a little differently. I did not use any profiling
> but I did set a minimum contig length:
>
> mpiexec --mca btl ^sm -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31 \
> -minimum-contig-length $MINCONTIG >&out.$OUTDIR
>
> The input files have 78426887 paired reads. These were quality trimmed
> with sickle.
>
> Any ideas?
>
> Thanks,
> Jim
>
>
> On Fri, Feb 22, 2013 at 5:00 PM, Sébastien Boisvert
> <[email protected]> wrote:
>> On 02/22/2013 03:46 PM, James Vincent wrote:
>>>
>>> That did it - thanks very much. The job completed. It's still a small
>>> job, but I'll test with full size soon.
>>>
>>
>> Excellent !
>>
>>
>>>
>>> This job completed quickly. The previous run must have been hanging on
>>> something because I showed 100% CPU on all 40 cores for many, many
>>> hours with the same input. After making the change you suggest below
>>> the job finishes in 1 hour.
>>>
>>
>> With the message passing interface, Ray processes actively probe for new
>> messages; they are not event-driven, so every core stays busy even while
>> waiting.
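
As a toy illustration of why active probing shows 100% CPU, the sketch below spins on a `probe` callable until it returns something. This is not Ray's code: `active_probe` and the fake message source are hypothetical, and a real MPI program would use a non-blocking call such as `MPI_Iprobe` (whether Ray uses that exact call is not stated in this thread).

```python
def active_probe(probe, max_probes=1_000_000):
    """Call probe() in a tight loop until it returns a message.

    Returns (message, empty_probes). Nothing sleeps or blocks between
    calls, so the loop keeps a core busy even when no message arrives.
    """
    empty_probes = 0
    while empty_probes < max_probes:
        message = probe()
        if message is not None:
            return message, empty_probes
        empty_probes += 1  # nothing arrived; spin again immediately
    return None, empty_probes

# Toy message source: nothing arrives for the first 1000 probes.
arrivals = iter([None] * 1000 + ["work unit"])
result = active_probe(lambda: next(arrivals))
print(result)  # ('work unit', 1000)
```

An event-driven design would block until a message arrives instead of spinning, which is why CPU usage alone cannot tell a hung polling job from a working one.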
>>
>>
>>> For 100K paired reads on 40 cores, does one hour sound roughly in the
>>> ballpark?
>>>
>>
>> Sure.
>>
>> Most of the time was probably in the graph coloring though.
>>
>>
>>> On Fri, Feb 22, 2013 at 9:34 AM, Sébastien Boisvert
>>> <[email protected]> wrote:
>>>>
>>>> Can you try adding this to the Open-MPI options:
>>>>
>>>>        --mca btl ^sm
>>>>
>>>>
>>>> Full command:
>>>>
>>>> mpiexec --mca btl ^sm -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31 \
>>>> -search $NCBIDIR/NCBI-Finished-Bacterial-Genomes  \
>>>> -with-taxonomy $NCBIDIR/Genome-to-Taxon.tsv
>>>> $NCBIDIR/TreeOfLife-Edges.tsv $NCBIDIR/Taxon-Names.tsv
>>>>
>>>>
>>>> This Open-MPI option will disable Open-MPI's "sm" byte transfer layer.
>>>> "sm" means shared memory.
>>>>
>>>> All messages will go through "tcp" (using the loopback since it's all on
>>>> the same machine),
>>>> or through "self".
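
For anyone launching runs from a script, here is a minimal sketch of assembling the same command as an argument list, so the `^sm` token is passed through literally. The leading `^` tells Open-MPI to exclude the listed component. The function name `build_ray_command` and the file paths in the example are hypothetical; only the options themselves come from the commands in this thread.

```python
import shlex

def build_ray_command(ray, outdir, left, right, ranks=40, kmer=31,
                      disable_sm=True):
    """Build the mpiexec argument list used in this thread."""
    command = ["mpiexec"]
    if disable_sm:
        # "^sm" = use any byte transfer layer except shared memory
        command += ["--mca", "btl", "^sm"]
    command += ["-n", str(ranks), ray,
                "-o", outdir,
                "-p", left, right,
                "-k", str(kmer)]
    return command

# Placeholder paths for illustration only.
command = build_ray_command("Ray", "out-dir", "left.fastq", "right.fastq")
print(shlex.join(command))
```

The list can be handed to `subprocess.run(command)` directly, which avoids any shell interpretation of the `^` character.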
>>>>
>>>>
>>>> p.s.: Open-MPI 1.4.3 is really old (2010-10-05). The latest release is
>>>> Open-MPI 1.6.4.
>>>>
>>>> On 02/22/2013 08:06 AM, James Vincent wrote:
>>>>>
>>>>> It is OpenMPI 1.4.3. Here is the blurb at the start of a run:
>>>>>
>>>>>
>>>>> MAXKMERLENGTH: 32
>>>>> KMER_U64_ARRAY_SIZE: 1
>>>>> Maximum coverage depth stored by CoverageDepth: 4294967295
>>>>> MAXIMUM_MESSAGE_SIZE_IN_BYTES: 4000 bytes
>>>>> FORCE_PACKING = n
>>>>> ASSERT = n
>>>>> HAVE_LIBZ = n
>>>>> HAVE_LIBBZ2 = n
>>>>> CONFIG_PROFILER_COLLECT = n
>>>>> CONFIG_CLOCK_GETTIME = n
>>>>> __linux__ = y
>>>>> _MSC_VER = n
>>>>> __GNUC__ = y
>>>>> RAY_32_BITS = n
>>>>> RAY_64_BITS = y
>>>>> MPI standard version: MPI 2.1
>>>>> MPI library: Open-MPI 1.4.3
>>>>> Compiler: GNU gcc/g++ 4.4.6 20120305 (Red Hat 4.4.6-4)
>>>>>
>>>>> Rank 0: Operating System: Linux (__linux__) POSIX (OS_POSIX)
>>>>>
>>>>>
>>>>> On Fri, Feb 22, 2013 at 8:01 AM, Sébastien Boisvert
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> Which MPI library are you using ?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 02/22/2013 07:34 AM, James Vincent wrote:
>>>>>>>
>>>>>>> Here are the last 50 lines :
>>>>>>>
>>>>>>> -bash-4.1$ tail -50 out.run
>>>>>>>
>>>>>>>       OPERATION_incrementReferences  operations: 32762
>>>>>>>       OPERATION_decrementReferences  operations: 32386
>>>>>>>
>>>>>>>       OPERATION_purgeVirtualColor  operations: 1567
>>>>>>> **********************************************************
>>>>>>>
>>>>>>> Rank 25: assembler memory usage: 138560 KiB
>>>>>>> Rank 31 biological abundances 3210 [1/1] [1752/2254] [4/7]
>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS processed files: 43/56
>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS processed sequences in file: 3/7
>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS total processed sequences: 81/107
>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS processed k-mers for current
>>>>>>> sequence: 2341425/0
>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS total processed k-mers: 149500000
>>>>>>> Speed RAY_SLAVE_MODE_ADD_COLORS 20803 units/second
>>>>>>>
>>>>>>> **********************************************************
>>>>>>> Coloring summary
>>>>>>>       Number of virtual colors: 232
>>>>>>>       Number of real colors: 3444
>>>>>>>
>>>>>>> Keys in index: 230
>>>>>>> Observed collisions when populating the index: 0
>>>>>>> COLOR_NAMESPACE_MULTIPLIER= 10000000000000000
>>>>>>>
>>>>>>> Operations
>>>>>>>
>>>>>>>       OPERATION_getVirtualColorFrom operations: 35380
>>>>>>>
>>>>>>>       OPERATION_IN_PLACE_ONE_REFERENCE: 31951
>>>>>>>       OPERATION_NO_VIRTUAL_COLOR_HAS_HASH_CREATION operations: 1730
>>>>>>>       OPERATION_VIRTUAL_COLOR_HAS_COLORS_FETCH operations: 1506
>>>>>>>       OPERATION_NO_VIRTUAL_COLOR_HAS_COLORS_CREATION operations: 193
>>>>>>>
>>>>>>>       OPERATION_createVirtualColorFrom  operations: 1923
>>>>>>>
>>>>>>>       OPERATION_allocateVirtualColorHandle operations: 1923
>>>>>>>       OPERATION_NEW_FROM_EMPTY operations: 1692
>>>>>>>       OPERATION_NEW_FROM_SCRATCH operations: 231
>>>>>>>
>>>>>>>       OPERATION_applyHashOperation operations: 37303
>>>>>>>       OPERATION_getHash operations: 0
>>>>>>>
>>>>>>>       OPERATION_incrementReferences  operations: 35380
>>>>>>>       OPERATION_decrementReferences  operations: 34985
>>>>>>>
>>>>>>>       OPERATION_purgeVirtualColor  operations: 1693
>>>>>>> **********************************************************
>>>>>>>
>>>>>>> Rank 31: assembler memory usage: 139452 KiB
>>>>>>>
>>>>>>> On Fri, Feb 22, 2013 at 7:13 AM, Sébastien Boisvert
>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>> What's the last thing reported in stdout ?
>>>>>>>>
>>>>>>>> On 02/22/2013 06:23 AM, jjv5 wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I am running Ray Meta on a shared memory machine with 40 cores and
>>>>>>>>> 1TB memory. One very small job with 25K reads finished and gave
>>>>>>>>> various taxonomic outputs. Jobs with slightly more input seem to
>>>>>>>>> never finish. The output log just stops but there
>>>>>>>>> are no errors indicated. Where might one start to look to determine
>>>>>>>>> where a job has gone wrong?
>>>>>>>>>
>>>>>>>>> The command I use is below. There are 100,000 paired reads.
>>>>>>>>>
>>>>>>>>> mpiexec -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31 \
>>>>>>>>> -search $NCBIDIR/NCBI-Finished-Bacterial-Genomes  \
>>>>>>>>> -with-taxonomy $NCBIDIR/Genome-to-Taxon.tsv
>>>>>>>>> $NCBIDIR/TreeOfLife-Edges.tsv $NCBIDIR/Taxon-Names.tsv
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>> Everyone hates slow websites. So do we.
>>>>>>>>> Make your web apps faster with AppDynamics
>>>>>>>>> Download AppDynamics Lite for free today:
>>>>>>>>> http://p.sf.net/sfu/appdyn_d2d_feb
>>>>>>>>> _______________________________________________
>>>>>>>>> Denovoassembler-users mailing list
>>>>>>>>> [email protected]
>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>>
>>
>>


