On 11/05/2012 11:14 AM, Louis Letourneau wrote:
> I have assembled 2 >2.5 Gb genomes (not the same, both mammals) in about
> 48 hours using 2025 cores. This works great.
>

Nice.
  
> I'm trying to assemble a fish and I am having issues I don't quite know
> how to debug.
>
> The fish is about 1.9Gb in size and not diploid.
>
> If I run Ray using the paired + mates using k31 I was able to assemble
> it in 168 hours (I needed to restart after 120 hours because of
> walltime...thanks for the checkpoints :-) ).
>

That's quite long. What's the latency?

We are working on a new programming model called "mini-ranks" to make
better use of supercomputers that have a lot of nodes, but also a lot of
cores per node.

Ray uses RayPlatform, and RayPlatform uses MPI. In the new model,
RayPlatform uses "mini-ranks".

The current model in RayPlatform is a pure MPI programming model, which can
be really bad on some supercomputers where a single network card on each
node must serve a lot of MPI processes.

If you are interested, we have an experimental branch called minirank-model
that uses only 1 MPI process per node, and as many IEEE POSIX threads as
there are mini-ranks (one thread per mini-rank).

With mini-ranks, the routing code in RayPlatform will become obsolete!
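Concretely, the layout looks something like this. This is only a minimal
sketch of the model, not the actual RayPlatform code (miniRankMain and
NUM_MINI_RANKS are made up for the example): one MPI process per node, one
communication thread that owns all the MPI calls, and one POSIX thread per
mini-rank.

#include <mpi.h>
#include <pthread.h>
#include <cstdio>

const long NUM_MINI_RANKS = 23; /* leave 1 core for the communication thread */

void* miniRankMain(void* argument) {
	long miniRankNumber = (long)argument;

	/* do the assembly work; outbound messages go in a queue that
	   only the communication thread drains */
	printf("mini-rank %ld started\n", miniRankNumber);

	return NULL;
}

int main(int argc, char** argv) {
	int provided;

	/* MPI_THREAD_FUNNELED: only the thread that called
	   MPI_Init_thread makes MPI calls */
	MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

	pthread_t miniRanks[NUM_MINI_RANKS];

	for (long i = 0; i < NUM_MINI_RANKS; i++)
		pthread_create(&miniRanks[i], NULL, miniRankMain, (void*)i);

	/* the main thread acts as the communication thread here: it would
	   loop, moving messages between the mini-rank queues and
	   MPI_Isend/MPI_Irecv */

	for (long i = 0; i < NUM_MINI_RANKS; i++)
		pthread_join(miniRanks[i], NULL);

	MPI_Finalize();

	return 0;
}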

Some latency results:


Table 1: Comparison of MPI ranks with mini-ranks on the Colosse
supercomputer at Laval University.
+-------+---------------------------------------------------+
| Cores | Average round-trip latency (us)                   |
+-------+-----------------------+---------------------------+
|       | MPI ranks             | mini-ranks                |
|       | (pure MPI)            | (MPI + pthread)           |
+-------+-----------------------+---------------------------+
| 8     | 11.25 +/- 0           | 24.1429 +/- 0             |
| 16    | 35.875 +/- 6.92369    | 43.0179 +/- 8.76275       |
| 32    | 66.3125 +/- 6.76387   | 41.7143 +/- 1.23924       |
| 64    | 90 +/- 16.5265        | 37.75 +/- 6.41984         |
| 128   | 126.562 +/- 25.0116   | 43.0179 +/- 8.76275       |
| 256   | 203.637 +/- 67.4579   | 44.6429 +/- 6.11862       |
+-------+-----------------------+---------------------------+

If you want to try that:

git clone git@github.com:sebhtml/RayPlatform.git
cd RayPlatform; git checkout minirank-model; cd ..
git clone git@github.com:sebhtml/ray.git
cd ray; git checkout minirank-model
make

Then, to run on 100 nodes with 24 cores per node:


mpiexec -n 100 -bynode Ray -mini-ranks-per-rank 23 \
...

Notes:

1. The -bynode option is necessary with Open-MPI because the default is
-byslot. -byslot will also work if the job scheduler presents the slots in
a by-node round-robin order.

2. It is important to request 23 mini-ranks per MPI process, and not 24,
because each MPI process also has a communication thread and you do not
want to oversubscribe the cores at all (23 mini-ranks + 1 communication
thread = 24 threads on a 24-core node).

3. The mini-rank code contains 0 (zero) locks, 0 mutexes, 0 spinlocks, and
0 semaphores. The code is non-blocking and lock-free, which is why it works
so well.

4. This work should be merged once I have made additional sanity checks.

5. If you want to look at the code, the class MessageQueue is particularly
interesting; a sketch of the general technique is just below.
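Here is what the general technique looks like: a single-producer /
single-consumer ring buffer. This is only a sketch of the idea (the class
name and members are made up), not the MessageQueue source. Because each
queue has exactly one producer (a mini-rank) and exactly one consumer (the
communication thread), two atomic indices are enough and no lock is ever
needed:

#include <atomic>
#include <stdint.h>

/* CAPACITY must be a power of two so the indices wrap with a mask. */
template<typename Message, uint32_t CAPACITY>
class LockFreeQueue {
	Message m_buffer[CAPACITY];
	std::atomic<uint32_t> m_head; /* written only by the consumer */
	std::atomic<uint32_t> m_tail; /* written only by the producer */
public:
	LockFreeQueue() : m_head(0), m_tail(0) {}

	/* called by exactly one producer thread */
	bool push(const Message& message) {
		uint32_t tail = m_tail.load(std::memory_order_relaxed);
		if (tail - m_head.load(std::memory_order_acquire) == CAPACITY)
			return false; /* full: the caller just retries later */
		m_buffer[tail & (CAPACITY - 1)] = message;
		m_tail.store(tail + 1, std::memory_order_release);
		return true;
	}

	/* called by exactly one consumer thread */
	bool pop(Message& message) {
		uint32_t head = m_head.load(std::memory_order_relaxed);
		if (head == m_tail.load(std::memory_order_acquire))
			return false; /* empty */
		message = m_buffer[head & (CAPACITY - 1)];
		m_head.store(head + 1, std::memory_order_release);
		return true;
	}
};

Neither push() nor pop() ever blocks or spins; when a queue is full or
empty, the caller simply does something else and retries, which is what
keeps everything non-blocking.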

> It worked (although the assembly wasn't great, possibly due to a lot of
> repeats), but took way longer than the bigger genomes.
>
> I'm trying the same without the mates. I also changed the kmer from k31
> to k61.
>
> I hit walltime 5 times now, 120 hours each, and it's not finished.
>  
> The variables that changed are kmer and no mates.
>
> The first run ran many steps in the log.
> Since the first wall time, the only output I seem to be having
>
> Rank X: gathering scaffold links [Y/2987] [Z/7166]
>
> (X,Y,Z varies of course)
>

There is a known bug where Ray stalls too long on repeats:

https://github.com/sebhtml/ray/issues/91

The bug is about 1 month old, actually. I have a patch in the queue, but I
am not satisfied with its overall impact. The patch does fix the
running-time problem, though. I will solve this bug in the scaffolder when
I have time. Meanwhile, you can use the patch, which solves the problem,
but it is a dirty hack.

You can test this patch:

wget http://downloads.sourceforge.net/project/denovoassembler/Ray-v2.1.0.tar.bz2
tar -xjf Ray-v2.1.0.tar.bz2
cd Ray-v2.1.0
wget https://github.com/sebhtml/patches/raw/master/ray/human-seb-from-13efb22270e4f563c9cafc.patch
patch -p1 < human-seb-from-13efb22270e4f563c9cafc.patch

make ...

> I was using a version compiled from sources for the polytope routing.
>

As I said, "mini-ranks" *will* supersede the virtual routing subsystem. The
problem with virtual routing is that it increases the number of physical
hops: a routed message is relayed by intermediate ranks instead of going
directly to its destination, and every relay is one more trip through the
network. With mini-ranks, that is not the case at all.
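To see why relays hurt, here is a toy illustration (an illustration only:
Ray's polytope graph differs in its details, and countHops is made up for
the example). With dimension-order routing on a hypercube, a regular graph
of the same family, a message visits one intermediate rank per differing
address bit:

#include <cstdio>

/* count the physical hops from source to destination when a rank may
   only talk to neighbors whose number differs by exactly one bit */
int countHops(int source, int destination) {
	int hops = 0;
	int current = source;

	for (int bit = 0; current != destination; bit++) {
		if ((current ^ destination) & (1 << bit)) {
			current ^= (1 << bit); /* relay through an intermediate rank */
			hops++;
		}
	}

	return hops;
}

int main() {
	/* in an 8-rank hypercube, rank 0 -> rank 7 is relayed as
	   0 -> 1 -> 3 -> 7: 3 physical hops instead of 1 direct message */
	printf("hops from 0 to 7: %d\n", countHops(0, 7));

	return 0;
}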

> Any ideas?
>

To wrap up:

1. Try mini-ranks;
2. Try the patch.


P.S.: I should resume the patch work and branch merging once I am done
addressing the reviewers' concerns for my Debian package and Fedora package
for Ray.

P.S. 2: For your information, our paper about Ray Meta should appear in the
near future; it is in re-review (the reviewers are assessing our revised
manuscript).

> Louis
>


-- 
Sent from my IBM Blue Gene/Q

_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
