With the patch I went from 6*120 hrs (walltime was hit each time) to: resources_used.walltime=47:35:27
geez... I should have asked earlier :-) thanks

Louis

On 12-11-05 04:52 PM, Sébastien Boisvert wrote:
> On 11/05/2012 02:32 PM, Louis Letourneau wrote:
>> Darn, I forgot about that bug, and I saw it pass on the mailing list too.
>>
>> Sorry for the post then.
>>
>
> Not sure I sent an email about that patch ;-)
>
>> I saw your mini-rank posting. I think it's a wonderful idea, especially
>> since there are more and more cores per node now.
>>
>> Is the infinite loop fixed in the mini-rank codebase?
>
> It is not an infinite loop; it is just that a loop over a k-mer with
> a coverage of 99999 (a large coverage) takes a while with all the messages.
>
>> If not, should I just apply the patch?
>>
>
> Yeah, the patch fixes the long running time in the scaffolding.
>
>> Again, thanks for the great work.
>>
>
> Thanks for the testing!
>
>> Louis
>>
>> On 12-11-05 12:17 PM, Sébastien Boisvert wrote:
>>> On 11/05/2012 11:14 AM, Louis Letourneau wrote:
>>>> I have assembled 2 >2.5 Gb genomes (not the same, both mammals) in about
>>>> 48 hrs using 2025 cores. This works great.
>>>>
>>>
>>> Nice.
>>>
>>>> I'm trying to assemble a fish and I am having issues I don't quite know
>>>> how to debug.
>>>>
>>>> The fish is about 1.9 Gb in size and not diploid.
>>>>
>>>> If I run Ray using the paired + mates with k31, I was able to assemble
>>>> it in 168 hours (I needed to restart after 120 hours because of
>>>> walltime... thanks for the checkpoints :-) ).
>>>>
>>>
>>> That's quite long; what's the latency?
>>>
>>> We are working on a new programming model called "mini-ranks" to make
>>> better use of supercomputers with many nodes, but also with many cores
>>> per node.
>>>
>>> Ray uses RayPlatform, and RayPlatform uses MPI. In the new model,
>>> RayPlatform uses "mini-ranks".
>>>
>>> The current model in RayPlatform is the pure MPI programming model, which
>>> can be really bad on some supercomputers if there is just one network card
>>> on each node that must serve many MPI processes.
>>>
>>> If you are interested, we have an experimental branch for mini-ranks that
>>> uses only 1 MPI process per node, and as many IEEE POSIX threads as
>>> mini-ranks (one thread per mini-rank).
>>>
>>> With mini-ranks, the routing code in RayPlatform will become obsolete!
>>>
>>> Some latency results:
>>>
>>> Table 1: Comparison of MPI ranks with mini-ranks on the Colosse
>>> supercomputer at Laval University.
>>> +-------+---------------------------------------------------+
>>> | Cores | Average round-trip latency (us)                   |
>>> +-------+-----------------------+---------------------------+
>>> |       | MPI ranks             | mini-ranks                |
>>> |       | (pure MPI)            | (MPI + pthread)           |
>>> +-------+-----------------------+---------------------------+
>>> |     8 |  11.25   +/- 0        | 24.1429 +/- 0             |
>>> |    16 |  35.875  +/- 6.92369  | 43.0179 +/- 8.76275       |
>>> |    32 |  66.3125 +/- 6.76387  | 41.7143 +/- 1.23924       |
>>> |    64 |  90      +/- 16.5265  | 37.75   +/- 6.41984       |
>>> |   128 | 126.562  +/- 25.0116  | 43.0179 +/- 8.76275       |
>>> |   256 | 203.637  +/- 67.4579  | 44.6429 +/- 6.11862       |
>>> +-------+-----------------------+---------------------------+
>>>
>>> If you want to try that:
>>>
>>> git clone git@github.com:sebhtml/RayPlatform.git
>>> cd RayPlatform; git checkout minirank-model; cd ..
>>> git clone git@github.com:sebhtml/ray.git
>>> cd ray; git checkout minirank-model
>>> make
>>>
>>> then, to run on 100 nodes with 24 cores per node:
>>>
>>> mpiexec -n 100 -bynode Ray -mini-ranks-per-rank 23 \
>>>     ...
>>>
>>> Notes:
>>>
>>> 1. The -bynode is necessary in Open-MPI because the default is -byslot.
>>> -byslot will also work if the job scheduler presents the slots in a
>>> by-node round-robin strategy.
>>>
>>> 2. It is important to start 23 mini-ranks per MPI process and not 24,
>>> because each MPI process also has a communication thread, and you
>>> don't want to oversubscribe the cores on the CPU at all.
>>>
>>> 3. The mini-rank code contains 0 (zero) locks, 0 mutexes, 0 spinlocks,
>>> and 0 semaphores.
>>> The code is non-blocking and lock-free, which is why it works so well.
>>>
>>> 4. This work should be merged once I have made additional sanity checks.
>>>
>>> 5. If you want to look at the code, the class MessageQueue is particularly
>>> interesting.
>>>
>>>> It worked (although the assembly wasn't great, possibly due to a lot of
>>>> repeats), but took way longer than the bigger genomes.
>>>>
>>>> I'm trying the same without the mates. I also changed the k-mer from k31
>>>> to k61.
>>>>
>>>> I have hit walltime 5 times now, 120 hours each time, and it's not
>>>> finished.
>>>>
>>>> The variables that changed are the k-mer length and no mates.
>>>>
>>>> The first run went through many steps in the log.
>>>> Since the first walltime, the only output I seem to be getting is
>>>>
>>>> Rank X: gathering scaffold links [Y/2987] [Z/7166]
>>>>
>>>> (X, Y, Z vary of course)
>>>>
>>>
>>> Known bug where Ray stalls too long on repeats...
>>>
>>> https://github.com/sebhtml/ray/issues/91
>>>
>>> This is because of a bug (1 month old, actually). I have a patch in the
>>> queue, but I am not satisfied with its overall impact. The patch does fix
>>> the long running time, though. I will solve this bug in the scaffolder
>>> when I have time. Meanwhile, you can use the patch, which solves the
>>> problem, but it's a dirty hack.
>>>
>>> You can test this patch:
>>>
>>> wget http://downloads.sourceforge.net/project/denovoassembler/Ray-v2.1.0.tar.bz2
>>> tar -xjf Ray-v2.1.0.tar.bz2
>>> cd Ray-v2.1.0
>>> wget https://github.com/sebhtml/patches/raw/master/ray/human-seb-from-13efb22270e4f563c9cafc.patch
>>> patch -p1 < human-seb-from-13efb22270e4f563c9cafc.patch
>>> make ...
>>>
>>>> I was using a version compiled from sources for the polytope routing.
>>>>
>>>
>>> As I said, "mini-ranks" *will* supersede the virtual routing subsystem.
>>> The problem with virtual routing is that it increases the number of
>>> physical hops. With mini-ranks, that is not the case at all.
>>>
>>>> Any ideas?
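The core-count rule in note 2 above (cores per node minus one for the communication thread) can be sketched as a small launcher script. The counts match the 100-node/24-core example; the variable names are hypothetical, and the final command is only echoed here since an actual run needs an MPI allocation:

```shell
#!/bin/sh
# Reserve one core per node for the RayPlatform communication thread.
NODES=100                            # nodes in the job (assumption: matches your allocation)
CORES_PER_NODE=24                    # physical cores on each node
MINI_RANKS=$((CORES_PER_NODE - 1))   # 23: leave one core for the communication thread

# Dry run: print the command that would be submitted.
echo mpiexec -n "$NODES" -bynode Ray -mini-ranks-per-rank "$MINI_RANKS"
```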
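For context on notes 3 and 5 above, here is a generic sketch (not the actual RayPlatform code) of the kind of lock-free single-producer/single-consumer ring buffer that lets one thread hand messages to another without any mutex, spinlock, or semaphore; the class and member names are made up for illustration:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Lock-free single-producer/single-consumer ring buffer (illustrative only).
// One thread (e.g. a mini-rank) pushes; another (e.g. the communication
// thread) pops. Correct only with exactly one producer and one consumer.
template<typename T, uint32_t Capacity>
class SpscQueue {
    T m_buffer[Capacity];
    std::atomic<uint32_t> m_head{0}; // next slot to pop  (written by consumer)
    std::atomic<uint32_t> m_tail{0}; // next slot to push (written by producer)
public:
    // Producer side; returns false when the queue is full.
    bool push(const T& value) {
        uint32_t tail = m_tail.load(std::memory_order_relaxed);
        uint32_t next = (tail + 1) % Capacity;
        if (next == m_head.load(std::memory_order_acquire))
            return false; // full: one slot stays empty to distinguish full from empty
        m_buffer[tail] = value;
        m_tail.store(next, std::memory_order_release); // publish the slot
        return true;
    }
    // Consumer side; returns false when the queue is empty.
    bool pop(T& value) {
        uint32_t head = m_head.load(std::memory_order_relaxed);
        if (head == m_tail.load(std::memory_order_acquire))
            return false; // empty
        value = m_buffer[head];
        m_head.store((head + 1) % Capacity, std::memory_order_release);
        return true;
    }
};

int main() {
    SpscQueue<int, 4> queue; // capacity 4 => holds at most 3 messages
    assert(queue.push(10));
    assert(queue.push(20));
    assert(queue.push(30));
    assert(!queue.push(40)); // full

    int message = 0;
    assert(queue.pop(message) && message == 10);
    assert(queue.pop(message) && message == 20);
    assert(queue.pop(message) && message == 30);
    assert(!queue.pop(message)); // empty again
    return 0;
}
```

The acquire/release pairing on head and tail is what removes the need for a lock: each side only ever writes its own index and reads the other's.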
>>>>
>>>
>>> To wrap up:
>>>
>>> 1. Try mini-ranks;
>>> 2. Try the patch.
>>>
>>> p.s.: I should resume the patchwork and branch merging once I am done
>>> implementing the reviewers' concerns for my Debian package and Fedora
>>> package for Ray.
>>>
>>> p.s.2: For your information, our paper about Ray Meta should appear
>>> somewhere in the near future; it is in re-review (the reviewers are
>>> assessing our revised manuscript).
>>>
>>>> Louis
>>>>
>>>> _______________________________________________
>>>> Denovoassembler-users mailing list
>>>> Denovoassembler-users@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users