Regarding the way Ray is run: soon Ray will be able to run on MPI only, MPI +
pthreads, or even pthreads only.
As you may know, Ray ships with a library called RayPlatform, which abstracts
all the parallel stuff from the programmer. In Ray v2.0.0 and v2.1.0, the
associated RayPlatform library (versions 1.0.3 and 1.1.0, respectively) only
utilizes MPI.
Pure MPI applications work well on some machines, and not so well on others,
usually because the Host Channel Adapter is being shared by too many MPI
processes on each node. That is where hybrids come in.
Hybrids are truly the future. I visited Argonne National Laboratory recently
and discussed hybrid programming models with Professor Rick Stevens.
Rick Stevens, Fangfang Xia, and I devised something called the
"mini-ranks" hybrid programming model.
The next release of Ray (likely something like 2.1.1) will run on RayPlatform
7.0.0, which will include support for our newly introduced "mini-ranks" hybrid
programming model.
So on your hybrid machine, you will be able to run Ray like this (assuming 8
nodes and 16 hardware threads per node):
mpiexec -n 8 -bynode \
Ray -mini-ranks-per-rank 15 \
-k 31 -o MiniRanksAreCool \
-p joe1.fastq.bz2 joe2.fastq.bz2 \
-p thor1.fastq.gz thor2.fastq.gz
This will launch 1 MPI process per node. Each MPI process will have exactly 15
mini-ranks. Each mini-rank will run in 1 POSIX thread, and an additional thread
(the original control thread of the process) will do the MPI calls, for a total
of 16 threads per node.
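To make the arithmetic explicit, here is a minimal sketch in shell. The
function name mini_ranks_per_rank and the variable names are only for
illustration; they are not part of Ray. It assumes 1 MPI process per node and
1 hardware thread per node left free for the thread doing the MPI calls:

```shell
#!/bin/sh
# Sketch only: derive a value for -mini-ranks-per-rank from the number of
# hardware threads per node.
# Assumption: one hardware thread is reserved for the thread doing MPI calls.
mini_ranks_per_rank() {
    hardware_threads=$1
    echo $((hardware_threads - 1))
}

nodes=8
threads_per_node=16
mini_ranks=$(mini_ranks_per_rank "$threads_per_node")
total=$((nodes * mini_ranks))

echo "mini-ranks per rank: $mini_ranks"
echo "total mini-ranks: $total"
# Only prints the start of the command line; the remaining Ray options
# (-k, -o, -p) are left out here.
echo "mpiexec -n $nodes -bynode Ray -mini-ranks-per-rank $mini_ranks"
```

With 8 nodes and 16 hardware threads per node, this gives 15 mini-ranks per
rank and 120 mini-ranks in total.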
If you feel this is interesting for your laboratory, there is a preliminary
implementation of this available for testing.
(from http://seqanswers.com/forums/showpost.php?p=88306&postcount=200 )
On 11/01/2012 01:20 PM, Adrian Platts wrote:
> Hi Christina
>
> Here at McGill we have assembled, to varying degrees depending on the project,
> the selfing crucifers Sisymbrium irio, Leavenworthia alabamica (recent
> hexaploid) and Aethionema arabicum.
> We've also worked on assembly of self-incompatible species, which tend to be
> highly heterozygous, including Capsella bursa-pastoris, Capsella grandiflora
> and a Brassica rapa ecotype ... and are kind of involved in the assembly of a
> couple of other non-crucifers including bean and Cleome.
>
> We only have a small assembly capacity here (three 80-core, 256 GB RAM boxes)
> but I'm keen to compare notes on where things are and are not working well in
> plant assemblies!
>
> We started by using a long k-mer (k > 61) approach in Velvet but found the
> chimerism rate around TEs was worryingly high. We then moved on to
> ALLPATHS-LG and Ray-SOAPdenovo (Ray for contigging, SOAPdenovo for
> scaffolding). We're also using the Meraculous assembler for heterozygotes.
> As I say, I'd be very interested in hearing your experiences.
>
> Adrian
> Adrian Platts
> VEGI Project
> McGill
>
> P.S. These are our latencies with various parameters. After some checking
> with Sébastien, we're keeping away from TCP-based messaging.
>
> mpiexec -n 10 --mca btl sm,self --bind-to-core --bycore --mca maffinity
> libnuma ./Ray -o foo3 -test-network-only
> # AverageForAllRanks: 7
>
> mpiexec -n 40 --mca btl sm,self --bind-to-core --bycore --mca maffinity
> libnuma ./Ray -o foo4 -test-network-only
> # AverageForAllRanks: 18.15
>
> mpiexec -n 32 --mca btl sm,self --bind-to-core --bycore --mca maffinity
> libnuma ./Ray -o foo4 -route-messages -test-network-only
> # AverageForAllRanks: 46.875
> (round-robin)
>
> mpiexec -n 40 --mca btl sm,self --bind-to-core --bycore ./Ray -o foo3
> -test-network-only
> # AverageForAllRanks: 17.9
>
> mpiexec -n 40 --mca btl sm,self ./Ray -o foo3 -test-network-only
> # AverageForAllRanks: 22.425
>
> mpiexec -n 41 --mca btl sm,self ./Ray -o foo3 -test-network-only
> # AverageForAllRanks: 45.6098
>
> mpiexec -n 70 --mca btl sm,self ./Ray -o foo3 -test-network-only
> # AverageForAllRanks: 82.105
>
>
>
> On Nov 1, 2012, at 1:07 PM, Sébastien Boisvert
> <[email protected]> wrote:
>
>> Hello,
>>
>> You should CC the mailing list, as I am sure that numerous people in
>> the genomics community would be interested in plant genome de novo assembly!
>>
>> People at McGill University did some work on plant genomes with Ray too.
>> They posted their results on the list I think.
>>
>> So you have something like 1 500 000 000 sequences, right ?
>>
>> On what kind of hardware are you running ?
>>
>> What's the latency reported in NetworkTest.txt ?
>>
>>
>> --
>> Sent from my IBM Blue Gene/Q
>>
>> Sébastien
>>
>>
>>
>> On 11/01/2012 12:57 PM, Christina Boucher wrote:
>>> Thanks. I ended up attaching the .openmpi-setup file in my top-level home
>>> directory and then adding the following line to my .bashrc: source
>>> ~/.openmpi-setup
>>>
>>> After recompiling it seems to be running on my Arabidopsis data. I am
>>> trying it with all 4 lanes and hoping that it works. I don't necessarily
>>> care if I get the *best* assembly but an assembly would be nice. Other
>>> assemblers have been bailing on memory with my 512G server but I am hopeful
>>> about your program.
>>>
>>> Thanks.
>>>
>>> Best,
>>> Christina
>>>
>>>
>>>
>>>
>>>
>>> On 2012-10-31, at 3:28 PM, Sébastien Boisvert <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> On 10/31/2012 03:48 PM, Christina Boucher wrote:
>>>>> MPI is already installed on my server… see:
>>>>> oak # rpm -qa | grep openmpi
>>>>> openmpi-devel-1.5.4-5.fc17.1.x86_64
>>>>> openmpi-1.5.4-5.fc17.1.x86_64
>>>>>
>>>>
>>>> This is something specific to Fedora 17 (which I happen to be using on my
>>>> laptop).
>>>> My answer below is not really related to Ray, but more related to Fedora
>>>> 17.
>>>>
>>>> $ repoquery --list openmpi-1.5.4-5.fc17.1.x86_64 | grep mpiexec$ |grep bin
>>>> /usr/lib64/openmpi/bin/mpiexec
>>>>
>>>> $ repoquery --list openmpi-devel-1.5.4-5.fc17.1.x86_64 | grep mpicxx$ |
>>>> grep bin
>>>> /usr/lib64/openmpi/bin/mpicxx
>>>>
>>>>
>>>> However, the default PATH for a user on Fedora 17 is:
>>>>
>>>> [test@panic ~]$ echo $PATH
>>>> /usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/home/test/.local/bin:/home/test/bin
>>>>
>>>>
>>>> You can fix this in Fedora 17 by adding the following 2 lines to your
>>>> $HOME/.bashrc:
>>>>
>>>> export PATH=/usr/lib64/openmpi/bin:$PATH
>>>> export LD_LIBRARY_PATH=/usr/lib64/openmpi/lib/:$LD_LIBRARY_PATH
>>>>
>>>>
>>>> Let me know if that works for you.
>>>>
>>>>
>>>> Sébastien
>>>>
>>>>> Still the installation problems persist….
>>>>>
>>>>> Christina
>>>>>
>>>>>
>>>>>
>>>>> On 2012-10-31, at 11:26 AM, Sébastien Boisvert <[email protected]> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>>> make[1]: mpicxx: Command not found
>>>>>>
>>>>>> To install Ray, you need an MPI library. You don't have one installed.
>>>>>>
>>>>>> For example, on Fedora, the packages are openmpi, openmpi-devel, gcc-c++.
>>>>>>
>>>>>>> In addition, is the max k-mer length 32? Most people are using
>>>>>>> upwards of 55…?
>>>>>>
>>>>>> The maximum k-mer length is set at compilation. The default is
>>>>>> MAXKMERLENGTH=32.
>>>>>> To change that:
>>>>>>
>>>>>> make MAXKMERLENGTH=64
>>>>>>
>>>>>>
>>>>>> Sébastien
>>>>>>
>>>>>> On 10/31/2012 12:50 PM, Christina Boucher wrote:
>>>>>>>>>
>>>>>>>>> I am trying to use your Ray assembler. I've been using Spades
>>>>>>>>> (mainly because I am formerly
>>>>>>>>> from Pavel Pevzner's lab) but running out of memory on a large
>>>>>>>>> dataset.
>>>>>>>>
>>>>>>>> Maybe processing your large dataset is more amenable with a
>>>>>>>> distributed assembler.
>>>>>>>
>>>>>>> Spades group released a new version yesterday that's supposed to use
>>>>>>> less memory. I am trying that and the Ray assembler.
>>>>>>>
>>>>>>>
>>>>>>>> To get it and install it:
>>>>>>>>
>>>>>>>> $ wget http://downloads.sourceforge.net/project/denovoassembler/Ray-v2.1.0.tar.bz2
>>>>>>>> $ sha1sum Ray-v2.1.0.tar.bz2
>>>>>>>> 4c09f2731445852857af53b65aa47e444792eeb0 Ray-v2.1.0.tar.bz2
>>>>>>>>
>>>>>>>> $ tar xjf Ray-v2.1.0.tar.bz2
>>>>>>>> $ cd Ray-v2.1.0/
>>>>>>>> $ make
>>>>>>>
>>>>>>>
>>>>>>> The problem is a compilation error. After those steps I get the
>>>>>>> following error:
>>>>>>>
>>>>>>> eggs:~/Ray-v2.1.0$ make
>>>>>>>
>>>>>>> Compilation options (you can change them of course)
>>>>>>>
>>>>>>> PREFIX = install-prefix
>>>>>>> MAXKMERLENGTH = 32
>>>>>>> FORCE_PACKING = n
>>>>>>> ASSERT = n
>>>>>>> HAVE_LIBZ = n
>>>>>>> HAVE_LIBBZ2 = n
>>>>>>> INTEL_COMPILER = n
>>>>>>> MPICXX = mpicxx
>>>>>>> GPROF = n
>>>>>>> OPTIMIZE = y
>>>>>>> DEBUG = n
>>>>>>>
>>>>>>> Compilation and linking flags (generated automatically)
>>>>>>>
>>>>>>> CXXFLAGS = -Wall -std=c++98 -O3 -D MAXKMERLENGTH=32 -D
>>>>>>> RAY_VERSION=\"2.1.0\"
>>>>>>> LDFLAGS =
>>>>>>>
>>>>>>> make[1]: Entering directory
>>>>>>> `/s/parsons/f/fac/cboucher/Ray-v2.1.0/RayPlatform'
>>>>>>> mpicxx -Wall -std=c++98 -O3 -D MAXKMERLENGTH=32 -D
>>>>>>> RAY_VERSION=\"2.1.0\" -D RAYPLATFORM_VERSION=\"1.1.0\" -I. -c -o
>>>>>>> memory/ReusableMemoryStore.o memory/ReusableMemoryStore.cpp
>>>>>>> make[1]: mpicxx: Command not found
>>>>>>> make[1]: *** [memory/ReusableMemoryStore.o] Error 127
>>>>>>> make[1]: Leaving directory
>>>>>>> `/s/parsons/f/fac/cboucher/Ray-v2.1.0/RayPlatform'
>>>>>>> make[1]: Entering directory `/s/parsons/f/fac/cboucher/Ray-v2.1.0/code'
>>>>>>> mpicxx -Wall -std=c++98 -O3 -D MAXKMERLENGTH=32 -D
>>>>>>> RAY_VERSION=\"2.1.0\" -I ../RayPlatform -I. -c -o
>>>>>>> application_core/ray_main.o application_core/ray_main.cpp
>>>>>>> make[1]: mpicxx: Command not found
>>>>>>> make[1]: *** [application_core/ray_main.o] Error 127
>>>>>>> make[1]: Leaving directory `/s/parsons/f/fac/cboucher/Ray-v2.1.0/code'
>>>>>>> mpicxx code/TheRayGenomeAssembler.a RayPlatform/libRayPlatform.a -o
>>>>>>> Ray
>>>>>>> make: mpicxx: Command not found
>>>>>>> make: *** [Ray] Error 127
>>>>>>>
>>>>>>>
>>>>>>> Any thoughts?
>>>>>>>
>>>>>>> In addition, is the max k-mer length 32? Most people are using
>>>>>>> upwards of 55…?
>>>>>>>
>>>>>>> Christina
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> $ mpiexec -n 1 ./Ray -version
>>>>>>>> $ mpiexec -n 999 ./Ray -k 31 -p 1.left.fastq 1.right.fastq -p
>>>>>>>> 2.left.fastq 2.right.fastq -o Test
>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Christina
>>>>>>>>>
>>>>>>>>> ------------------------------------------------
>>>>>>>>> Christina Boucher
>>>>>>>>> Department of Computer Science
>>>>>>>>> Colorado State University
>>>>>>>>> Fort Collins, CO 80523
>>>>>>>>> +1.970.491.8063
>>>>>>>>> [email protected]
>>>>>>>>> www.christinaboucher.com
>>>>>>>>> ------------------------------------------------
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> ***
>>>>>> Sébastien Boisvert
>>>>>> http://boisvert.info
>>>>>> Sent from a PC (Linux panic 3.6.2-4.fc17.x86_64).
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> ***
>>>> Sébastien Boisvert
>>>> http://boisvert.info
>>>> Sent from a PC (Linux panic 3.6.2-4.fc17.x86_64).
>>>
>>>
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Everyone hates slow websites. So do we.
>> Make your web apps faster with AppDynamics
>> Download AppDynamics Lite for free today:
>> http://p.sf.net/sfu/appdyn_sfd2d_oct
>> _______________________________________________
>> Denovoassembler-users mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
>
--
Sent from my IBM Blue Gene/Q