[Please CC the mailing list]

On 02/23/2013 05:26 PM, James Vincent wrote:
> I redirect stdout and stderr together into one file. It's huge.
>
> The last Step that was logged is this:
>
> ***
> Step: Estimation of outer distances for paired reads
> Date: Sat Feb 23 16:49:02 2013
> Elapsed time: 42 seconds
> Since beginning: 3 hours, 38 minutes, 41 seconds
> ***
>
> That is about halfway through the log. It's followed by many lines like this:
>
> Current peak coverage -> 178
> Rank 0 reached 0 vertices from seed 0, flow 1
> Rank 0: assembler memory usage: 2318560 KiB
> Rank 15 reached 1000 vertices from seed 0, flow 1
> Speed RAY_SLAVE_MODE_EXTENSION 5503 units/second
> Rank 15: assembler memory usage: 2314452 KiB
> Rank 33 reached 1000 vertices from seed 0, flow 1
> Speed RAY_SLAVE_MODE_EXTENSION 5107 units/second
> Rank 33: assembler memory usage: 2322828 KiB
> Rank 16 reached 1000 vertices from seed 0, flow 1
> Speed RAY_SLAVE_MODE_EXTENSION 4963 units/second
> Rank 16: assembler memory usage: 2318688 KiB
> Rank 15 reached 1163 vertices from seed 0, flow 1
> Speed RAY_SLAVE_MODE_EXTENSION 7050 units/second
> Rank 15: assembler memory usage: 2314452 KiB
>

In your other email, you said that the last lines were:

Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 17092 units/second
Estimated remaining time for this step: 55 minutes, 57 seconds

Is your job still running?

>
> fgrep -c RAY_SLAVE_MODE_EXTENSION out.Sample_BS.sickle
> 1394932
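[A quick way to check whether the extension step is still advancing is to keep the most recent "reached N vertices" value per rank. This is an illustrative sketch only; the sample lines are made up in the same format as the log quoted above.]

```python
# Keep the latest "reached N vertices" count per rank, so a stalled
# rank (a count that stops increasing) stands out. Sample lines below
# are fabricated, in the same format as the quoted log.
import re

sample = """\
Rank 15 reached 1000 vertices from seed 0, flow 1
Rank 33 reached 1000 vertices from seed 0, flow 1
Rank 15 reached 1163 vertices from seed 0, flow 1
"""

last = {}
for line in sample.splitlines():
    m = re.match(r"Rank (\d+) reached (\d+) vertices", line)
    if m:
        last[int(m.group(1))] = int(m.group(2))

for rank in sorted(last):
    print(f"rank {rank}: last count {last[rank]}")
```

[Run the same scan over the real log (e.g. the tail of out.Sample_BS.sickle) at two different times; if the per-rank counts are unchanged, the job is likely hung rather than slow.]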
>
>
> On Sat, Feb 23, 2013 at 3:41 PM, Sébastien Boisvert
> <[email protected]> wrote:
>> On 02/23/2013 01:02 PM, James Vincent wrote:
>>>
>>> Unfortunately, a bigger job has also failed with no apparent warnings.
>>> stdout has nothing beginning with 'error'. Is there something else to
>>> look at?
>>>
>>
>> Did you look in the standard error?
>>
>>
>>>
>>> The last few lines of stdout are:
>>>
>>> Rank 10 is purging edges [3450001/61093950]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14509 units/second
>>> Estimated remaining time for this step: 1 hours, 6 minutes, 12 seconds
>>> Rank 3 is purging edges [3200001/61096936]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 13563 units/second
>>> Estimated remaining time for this step: 1 hours, 11 minutes, 8 seconds
>>> Rank 5 is purging edges [3350001/61077916]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14354 units/second
>>> Estimated remaining time for this step: 1 hours, 7 minutes, 1 seconds
>>> Rank 7 is purging edges [3250001/61120658]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 13945 units/second
>>> Estimated remaining time for this step: 1 hours, 9 minutes, 9 seconds
>>> Rank 31 is purging edges [3400001/61067514]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14516 units/second
>>> Estimated remaining time for this step: 1 hours, 6 minutes, 12 seconds
>>> Rank 9 is purging edges [3500001/61066398]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 14833 units/second
>>> Estimated remaining time for this step: 1 hours, 4 minutes, 40 seconds
>>> Rank 4 is purging edges [3700001/61082686]
>>> Speed RAY_SLAVE_MODE_PURGE_NULL_EDGES 17092 units/second
>>> Estimated remaining time for this step: 55 minutes, 57 seconds
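[As an aside, the ETA printed on that last line is consistent with the reported speed; a quick sketch of the arithmetic, using the numbers from the rank 4 line:]

```python
# Cross-check of the ETA in the quoted log: remaining units divided by
# the reported speed. Numbers are taken from the rank 4 line above.
done, total, speed = 3_700_001, 61_082_686, 17_092
remaining_s = (total - done) // speed
print(remaining_s // 60, "minutes,", remaining_s % 60, "seconds")
# → 55 minutes, 57 seconds, matching the log's estimate
```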
>>>
>>> And the last few entries in ElapsedTime are:
>>>
>>>
>>> ***
>>> Step: Coverage distribution analysis
>>> Date: Fri Feb 22 19:34:44 2013
>>> Elapsed time: 18 seconds
>>> Since beginning: 1 hours, 3 minutes, 36 seconds
>>> ***
>>>
>>>
>>> ***
>>> Step: Graph construction
>>> Date: Fri Feb 22 20:31:57 2013
>>> Elapsed time: 57 minutes, 13 seconds
>>> Since beginning: 2 hours, 49 seconds
>>> ***
>>>
>>> This job was run a little differently. I did not use any profiling,
>>> but I did set a minimum contig length:
>>>
>>> mpiexec --mca btl ^sm -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31 \
>>> -minimum-contig-length $MINCONTIG >&out.$OUTDIR
>>>
>>> The input files have 78426887 paired reads. These were quality trimmed
>>> with sickle.
>>>
>>> Any ideas?
>>>
>>> Thanks,
>>> Jim
>>>
>>>
>>> On Fri, Feb 22, 2013 at 5:00 PM, Sébastien Boisvert
>>> <[email protected]> wrote:
>>>>
>>>> On 02/22/2013 03:46 PM, James Vincent wrote:
>>>>>
>>>>>
>>>>> That did it - thanks very much. The job completed. It's still a small
>>>>> job, but I'll test with full size soon.
>>>>>
>>>>
>>>> Excellent !
>>>>
>>>>
>>>>>
>>>>> This job completed quickly. The previous run must have been hanging on
>>>>> something because I showed 100% CPU on all 40 cores for many, many
>>>>> hours with the same input. After making the change you suggest below
>>>>> the job finishes in 1 hour.
>>>>>
>>>>
>>>> With the message passing interface, Ray processes actively probe for
>>>> new messages; the code is not event-driven.
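[That polling behavior is why every core shows 100% CPU even when a rank is waiting. A stand-alone toy model of a non-blocking probe loop, in the spirit of MPI_Iprobe; the names here are illustrative, not Ray's actual code:]

```python
# Toy model of a busy-polling progress loop: the rank repeatedly does a
# non-blocking probe for a message (like MPI_Iprobe) instead of blocking,
# so it consumes CPU continuously while "waiting".
import queue

def busy_poll(inbox, max_spins=1_000_000):
    """Spin until a message arrives or we give up; count the spins."""
    spins = 0
    while spins < max_spins:
        try:
            return inbox.get_nowait(), spins  # non-blocking probe
        except queue.Empty:
            spins += 1                        # no message yet: burn CPU, retry
    return None, spins

inbox = queue.Queue()
inbox.put("EXTENSION_DONE")
msg, spins = busy_poll(inbox)
print(msg, spins)  # message found on the first probe
```

[An event-driven design would block in something like MPI_Probe and wake only on arrival, showing near-zero CPU while idle; the trade-off of busy polling is lower message latency at the cost of pegged cores.]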
>>>>
>>>>
>>>>> For 100K paired reads on 40 cores, does one hour sound roughly in the
>>>>> ball
>>>>> park?
>>>>>
>>>>
>>>> Sure.
>>>>
>>>> Most of the time was probably in the graph coloring though.
>>>>
>>>>
>>>>> On Fri, Feb 22, 2013 at 9:34 AM, Sébastien Boisvert
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>
>>>>>> Can you try adding  this to Open-MPI options:
>>>>>>
>>>>>>         --mca btl ^sm
>>>>>>
>>>>>>
>>>>>> Full command:
>>>>>>
>>>>>> mpiexec --mca btl ^sm -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31 \
>>>>>> -search $NCBIDIR/NCBI-Finished-Bacterial-Genomes  \
>>>>>> -with-taxonomy $NCBIDIR/Genome-to-Taxon.tsv
>>>>>> $NCBIDIR/TreeOfLife-Edges.tsv $NCBIDIR/Taxon-Names.tsv
>>>>>>
>>>>>>
>>>>>> This Open-MPI option disables Open-MPI's "sm" (shared memory) byte
>>>>>> transfer layer.
>>>>>>
>>>>>> All messages will instead go through "tcp" (using the loopback
>>>>>> interface, since it's all on the same machine) or through "self".
>>>>>>
>>>>>>
>>>>>> p.s.: Open-MPI 1.4.3 is really old (2010-10-05). The latest release
>>>>>> is Open-MPI 1.6.4.
>>>>>>
>>>>>> On 02/22/2013 08:06 AM, James Vincent wrote:
>>>>>>>
>>>>>>>
>>>>>>> It is OpenMPI 1.4.3. Here is the blurb at the start of a run:
>>>>>>>
>>>>>>>
>>>>>>> MAXKMERLENGTH: 32
>>>>>>> KMER_U64_ARRAY_SIZE: 1
>>>>>>> Maximum coverage depth stored by CoverageDepth: 4294967295
>>>>>>> MAXIMUM_MESSAGE_SIZE_IN_BYTES: 4000 bytes
>>>>>>> FORCE_PACKING = n
>>>>>>> ASSERT = n
>>>>>>> HAVE_LIBZ = n
>>>>>>> HAVE_LIBBZ2 = n
>>>>>>> CONFIG_PROFILER_COLLECT = n
>>>>>>> CONFIG_CLOCK_GETTIME = n
>>>>>>> __linux__ = y
>>>>>>> _MSC_VER = n
>>>>>>> __GNUC__ = y
>>>>>>> RAY_32_BITS = n
>>>>>>> RAY_64_BITS = y
>>>>>>> MPI standard version: MPI 2.1
>>>>>>> MPI library: Open-MPI 1.4.3
>>>>>>> Compiler: GNU gcc/g++ 4.4.6 20120305 (Red Hat 4.4.6-4)
>>>>>>>
>>>>>>> Rank 0: Operating System: Linux (__linux__) POSIX (OS_POSIX)
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Feb 22, 2013 at 8:01 AM, Sébastien Boisvert
>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Which MPI library are you using?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 02/22/2013 07:34 AM, James Vincent wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Here are the last 50 lines :
>>>>>>>>>
>>>>>>>>> -bash-4.1$ tail -50 out.run
>>>>>>>>>
>>>>>>>>>        OPERATION_incrementReferences  operations: 32762
>>>>>>>>>        OPERATION_decrementReferences  operations: 32386
>>>>>>>>>
>>>>>>>>>        OPERATION_purgeVirtualColor  operations: 1567
>>>>>>>>> **********************************************************
>>>>>>>>>
>>>>>>>>> Rank 25: assembler memory usage: 138560 KiB
>>>>>>>>> Rank 31 biological abundances 3210 [1/1] [1752/2254] [4/7]
>>>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS processed files: 43/56
>>>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS processed sequences in file: 3/7
>>>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS total processed sequences: 81/107
>>>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS processed k-mers for current
>>>>>>>>> sequence: 2341425/0
>>>>>>>>> Rank 31 RAY_SLAVE_MODE_ADD_COLORS total processed k-mers: 149500000
>>>>>>>>> Speed RAY_SLAVE_MODE_ADD_COLORS 20803 units/second
>>>>>>>>>
>>>>>>>>> **********************************************************
>>>>>>>>> Coloring summary
>>>>>>>>>        Number of virtual colors: 232
>>>>>>>>>        Number of real colors: 3444
>>>>>>>>>
>>>>>>>>> Keys in index: 230
>>>>>>>>> Observed collisions when populating the index: 0
>>>>>>>>> COLOR_NAMESPACE_MULTIPLIER= 10000000000000000
>>>>>>>>>
>>>>>>>>> Operations
>>>>>>>>>
>>>>>>>>>        OPERATION_getVirtualColorFrom operations: 35380
>>>>>>>>>
>>>>>>>>>        OPERATION_IN_PLACE_ONE_REFERENCE: 31951
>>>>>>>>>        OPERATION_NO_VIRTUAL_COLOR_HAS_HASH_CREATION operations: 1730
>>>>>>>>>        OPERATION_VIRTUAL_COLOR_HAS_COLORS_FETCH operations: 1506
>>>>>>>>>        OPERATION_NO_VIRTUAL_COLOR_HAS_COLORS_CREATION operations: 193
>>>>>>>>>
>>>>>>>>>        OPERATION_createVirtualColorFrom  operations: 1923
>>>>>>>>>
>>>>>>>>>        OPERATION_allocateVirtualColorHandle operations: 1923
>>>>>>>>>        OPERATION_NEW_FROM_EMPTY operations: 1692
>>>>>>>>>        OPERATION_NEW_FROM_SCRATCH operations: 231
>>>>>>>>>
>>>>>>>>>        OPERATION_applyHashOperation operations: 37303
>>>>>>>>>        OPERATION_getHash operations: 0
>>>>>>>>>
>>>>>>>>>        OPERATION_incrementReferences  operations: 35380
>>>>>>>>>        OPERATION_decrementReferences  operations: 34985
>>>>>>>>>
>>>>>>>>>        OPERATION_purgeVirtualColor  operations: 1693
>>>>>>>>> **********************************************************
>>>>>>>>>
>>>>>>>>> Rank 31: assembler memory usage: 139452 KiB
>>>>>>>>>
>>>>>>>>> On Fri, Feb 22, 2013 at 7:13 AM, Sébastien Boisvert
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> What's the last thing reported in stdout?
>>>>>>>>>>
>>>>>>>>>> On 02/22/2013 06:23 AM, jjv5 wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> I am running ray meta on a shared memory machine with 40 cores and
>>>>>>>>>>> 1TB
>>>>>>>>>>> memory. One very small job with 25K reads finished and gave
>>>>>>>>>>> various
>>>>>>>>>>> taxonomic outputs. Jobs with slightly
>>>>>>>>>>> more input seem to never finish. The output log just stops but
>>>>>>>>>>> there
>>>>>>>>>>> are no errors indicated. Where might one start to look to
>>>>>>>>>>> determine
>>>>>>>>>>> where a job has gone wrong?
>>>>>>>>>>>
>>>>>>>>>>> The command I use is below. There are 100,000 paired reads.
>>>>>>>>>>>
>>>>>>>>>>> mpiexec -n 40 $RAY -o $OUTDIR -p $LEFT $RIGHT -k 31 \
>>>>>>>>>>> -search $NCBIDIR/NCBI-Finished-Bacterial-Genomes  \
>>>>>>>>>>> -with-taxonomy $NCBIDIR/Genome-to-Taxon.tsv
>>>>>>>>>>> $NCBIDIR/TreeOfLife-Edges.tsv $NCBIDIR/Taxon-Names.tsv
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>>>> Everyone hates slow websites. So do we.
>>>>>>>>>>> Make your web apps faster with AppDynamics
>>>>>>>>>>> Download AppDynamics Lite for free today:
>>>>>>>>>>> http://p.sf.net/sfu/appdyn_d2d_feb
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Denovoassembler-users mailing list
>>>>>>>>>>> [email protected]
>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>


