The query test set is on the mpiblast download archive: http://www.mpiblast.org/Downloads.Archive.html
Specifically, you're after the 300 KB of E. chrysanthemi predicted ORFs:
http://www.mpiblast.org/downloads/files/e.chrysanthemi.fas

As for the nt database, you'll have to download it from NCBI:
ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nt.gz
and siphon off the first 14GB (uncompressed) with dd or something similar.
It may not be identical to what I used in 2005, but it should be close
enough for a cursory runtime check.

For extra points, try using mpiformatdb's ability to read uncompressed
FASTA databases from stdin. That should allow you to build a series of
unix pipes that save plenty of disk I/O (a few rough sketches are appended
below the quoted thread).

-Aaron

ialam wrote:
> Hi Aaron,
>
> I wanted to have the benchmark dataset so that I could test mpiblast
> performance. Could you please point me to the dataset? In the meantime
> I am trying to get mpich running on the cluster.
>
> Many Thanks,
>
> Intikhab
>
> ----- Original Message -----
> From: "intikhab alam" <[EMAIL PROTECTED]>
> To: "Aaron Darling" <[EMAIL PROTECTED]>
> Sent: Friday, March 02, 2007 1:20 PM
> Subject: Re: [Mpiblast-users] blast in 1 day but could not get
> mpiblast done even in 10 days for the same dataset
>
>> Hi Aaron,
>>
>> I would like to try out the benchmark dataset; could you point me to
>> where I could download this?
>>
>> Intikhab
>>
>> ----- Original Message -----
>> From: "Aaron Darling" <[EMAIL PROTECTED]>
>> To: "intikhab alam" <[EMAIL PROTECTED]>
>> Sent: Friday, March 02, 2007 6:21 AM
>> Subject: Re: [Mpiblast-users] blast in 1 day but could not get
>> mpiblast done even in 10 days for the same dataset
>>
>> : It sounds like there must be something causing an mpiblast-specific
>> : communications bottleneck in your system. Anybody else have ideas
>> : here? If you're keen to verify that, you could run mpiblast on the
>> : benchmark dataset we were using on Green Destiny and compare
>> : runtimes. My latest benchmark data set (dated June 2005) has a
>> : runtime of about 16 minutes for 64 nodes to search the 300K erwinia
>> : query set against the first 14GB of nt using blastn. Each compute
>> : node in that machine was a 667MHz Transmeta chip, 640MB RAM,
>> : connected via 100Mbit ethernet. I was using mpich2-1.0.1, no SCore.
>> : Based on paper specs, your cluster should be quicker than that.
>> :
>> : On the other hand, if you've got wild amounts of load imbalance,
>> : --db-replicate-count=5 may not be enough, and 41 may prove ideal
>> : (where 41 = the number of nodes in your cluster). In that case,
>> : mpiblast will have effectively copied the entire database to each
>> : node, totally factoring out load imbalance from the compute time
>> : equation. Your database is much smaller than each node's core
>> : memory, and a single fragment is probably much larger than each
>> : node's CPU cache, so I can't think of a good reason not to fully
>> : distribute the database, apart from the time it takes to copy DB
>> : fragments around.
>> :
>> : In any case, keep me posted if you discover anything.
>> :
>> : -Aaron
>> :
>> : intikhab alam wrote:
>> : > Hi Aaron,
>> : >
>> : > As per your suggestion, I used the following option:
>> : >
>> : > --db-replicate-count=5
>> : >
>> : > assuming it may help reach the 24hrs mark to complete the job.
>> : > However, I see that only 6% of the (total estimated) output has
>> : > been generated until now (i.e. after 4 days, 4*24 hrs). If I
>> : > continue this way, my mpiblast would finish in 64 days. Any other
>> : > suggestion to improve the running time?
>> : >
>> : > Intikhab
>> : >
>> : > ----- Original Message -----
>> : > From: "Aaron Darling" <[EMAIL PROTECTED]>
>> : > To: "intikhab alam" <[EMAIL PROTECTED]>;
>> : > <[email protected]>
>> : > Sent: Wednesday, February 21, 2007 1:33 AM
>> : > Subject: Re: [Mpiblast-users] blast in 1 day but could not get
>> : > mpiblast done even in 10 days for the same dataset
>> : >
>> : > : Hi Intikhab...
>> : > :
>> : > : intikhab alam wrote:
>> : > : > : can take a long time to compute the effective search space
>> : > : > : required for exact e-value calculation. If that's the
>> : > : > : problem, then you would find just one mpiblast process
>> : > : > : consuming 100% cpu on the rank 0 node for hours or days,
>> : > : > : without any output.
>> : > : >
>> : > : > Is the effective search space calculation done on the master
>> : > : > node? If yes, this mpiblast job stayed at the master node for
>> : > : > some hours and then all the compute nodes got busy with >90%
>> : > : > usage all the time, with continued output being generated
>> : > : > until the 12th day when I killed the job.
>> : > : >
>> : > : yes, the search space calculation is done on the master node,
>> : > : and it sounds like using the --fast-evalue-approximation
>> : > : command-line switch would save you a few hours, which is pretty
>> : > : small compared to the weeks or months that the rest of the
>> : > : search is taking.
>> : > :
>> : > : > :
>> : > : > : The more likely limiting factor is load imbalance on the
>> : > : > : cluster.
>> : > : >
>> : > : > In this case, do you think the job should finish on some nodes
>> : > : > earlier than others? In my case the job was running on all the
>> : > : > nodes with >90% usage and the last output I got was on the
>> : > : > last day, when I killed the job.
>> : > : >
>> : > : It's possible the other nodes may continue running mpiblast
>> : > : workers which are waiting to send results back to the mpiblast
>> : > : writer process.
>> : > :
>> : > : > : If some database fragments happen to have a large number of
>> : > : > : hits and others have few, and the database is distributed as
>> : > : > : one fragment per node, then the computation may be heavily
>> : > : > : imbalanced and may run quite slowly. CPU consumption as
>> : > : > : given by a CPU monitoring tool may not be indicative of
>> : > : > : useful work being done on the nodes since workers can do a
>> : > : > : timed spin-wait for new work.
>> : > : > : I can suggest two avenues to achieve better load balance
>> : > : > : with mpiblast 1.4.0. First, partition the database into more
>> : > : > : fragments, possibly two or three times as many as you
>> : > : > : currently have. Second, use the
>> : > : >
>> : > : > You mean more fragments, which in turn means using more nodes?
>> : > : > Actually at our cluster not more than 44 nodes are allowed for
>> : > : > the parallel jobs.
>> : > : >
>> : > : no, it's not necessary to run on more nodes when creating more
>> : > : fragments. mpiblast 1.4.0 needs at least as many fragments as
>> : > : nodes when --db-replicate-count=1 (the default value). when
>> : > : there are more fragments than nodes, mpiblast will happily
>> : > : distribute the extra fragments among the nodes.
>> : > :
>> : > : > : --db-replicate-count option to mpiblast.
>> : > : > : The default value for the db-replicate-count is 1, which
>> : > : > : indicates that mpiblast will distribute a single copy of
>> : > : > : your database across worker nodes. For your setup, each node
>> : > : > : was probably getting a single fragment. By setting
>> : > : >
>> : > : > Is it not right if each single node gets a single fragment of
>> : > : > the target database (the number of nodes assigned for mpiblast
>> : > : > = number of fragments + 2), so that the whole query dataset
>> : > : > could be searched against the fragment on each single node
>> : > : > (with the effective search space calculation being done before
>> : > : > starting the search, for blast-comparable e-values)?
>> : > : >
>> : > : the search space calculation happens on the rank 0 process and
>> : > : is totally unrelated to the number of nodes and number of DB
>> : > : fragments. The most basic mpiblast setup has one fragment per
>> : > : node, but when load-balancing is desirable, as in your case,
>> : > : mpiblast can be configured to use multiple fragments per node.
>> : > : This will not affect the e-value calculation.
>> : > :
>> : > : > : --db-replicate-count to something like 5, each fragment
>> : > : > : would be copied to five different compute nodes, and thus
>> : > : > : five nodes would be available to search fragments that
>> : > : > : happen to have lots of hits. In the extreme
>> : > : >
>> : > : > You mean this way nodes would be busy searching the query
>> : > : > dataset against the same fragment on 5 compute nodes? Is this
>> : > : > just a way to keep the nodes busy until all the nodes complete
>> : > : > the searches?
>> : > : >
>> : > : Yes, this will balance the load and will probably speed up your
>> : > : search.
>> : > :
>> : > : > : case you could set --db-replicate-count equal to the number
>> : > : > : of fragments, which would be fine if per-node memory and
>> : > : > : disk space is substantially larger than the total size of
>> : > : > : the formatted database.
>> : > : > :
>> : > : >
>> : > : > Is it possible in mpiblast that, for cases where the size of
>> : > : > the query dataset is equal to the size of the target dataset,
>> : > : > the query dataset is fragmented, the target dataset is kept in
>> : > : > the global/shared area, and searches are done on single nodes
>> : > : > (the number of nodes equal to the number of query dataset
>> : > : > fragments)? This way there would be no need to calculate the
>> : > : > effective search space, as all the search jobs get the same
>> : > : > size of target dataset. By following this approach I managed
>> : > : > to complete this job using standard blast in < 24hrs.
>> : > : >
>> : > : The parallelization approach you describe is perfectly
>> : > : reasonable when the total database size is less than the core
>> : > : memory size on each node. With a properly configured
>> : > : --db-replicate-count, I would guess that mpiblast could approach
>> : > : the 24 hour mark, although it may take slightly longer since
>> : > : there are various overheads involved with copying of fragments
>> : > : and serial computation of the effective search space.
>> : > :
>> : > : > :
>> : > : > : In your particular situation, it may also help to randomize
>> : > : > : the order of sequences in the database to minimize "fragment
>> : > : > : hotspots" which could result from a database self-search.
>> : > : >
>> : > : > I did not get the "fragment hotspots" bit here. By randomizing
>> : > : > the order of sequences, do you mean each node would possibly
>> : > : > take a similar time to finish the searches? Otherwise it could
>> : > : > be possible that the number of hits is lower for some
>> : > : > fragments than others, and this ends up in different job
>> : > : > completion times on different nodes?
>> : > : >
>> : > : Right, the goal is to get the per-fragment search time more
>> : > : balanced through randomization. But after thinking about it a
>> : > : bit more, I'm not sure just how much this would save....
>> : > :
>> : > : > :
>> : > : > : At the moment mpiblast doesn't have code to accomplish such
>> : > : > : a feat, but I think others (Jason Gans?) have written code
>> : > : > : for this in the past.
>> : > : >
>> : > : > Aaron, do you think SCore-based MPI communication may be
>> : > : > delaying the overall time in running mpiblast searches?
>> : > : >
>> : > : It's possible.
>> : > : The interprocess communication in 1.4.0 was fine-tuned for
>> : > : default mpich2 1.0.2 and lam/mpi implementations. We use various
>> : > : combinations of the non-blocking MPI_Issend(), MPI_Irecv(), and
>> : > : the blocking send/recv api in mpiblast 1.4.0. I have no idea how
>> : > : it would interact with SCore.
>> : > :
>> : > : -Aaron
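
A minimal shell sketch of the download/truncate/format steps described at
the top of this message, for anyone reproducing the benchmark. The file
names, the fragment count of 88, and the intermediate uncompressed copy
are only illustrative assumptions; the -i/-p/-N options follow the usual
formatdb-style conventions used by mpiformatdb.

    # Fetch nt in FASTA form from NCBI (URL as given above).
    wget ftp://ftp.ncbi.nih.gov/blast/db/FASTA/nt.gz

    # Uncompress and keep roughly the first 14GB, as suggested above.
    # dd cuts at a byte boundary, so the final record will be truncated;
    # formatdb-style tools generally just warn about such a record.
    gunzip -c nt.gz > nt.fas
    dd if=nt.fas of=nt.14G.fas bs=1M count=14336

    # Format the trimmed database into fragments for mpiblast.
    # 88 fragments is just an example (about two per worker on a 44-node
    # allocation, for the load-balancing reasons discussed in the thread).
    mpiformatdb -i nt.14G.fas -p F -N 88

    # As suggested above, mpiformatdb can also read an uncompressed FASTA
    # stream from stdin, so the gunzip/dd/format steps could be chained
    # into one pipeline and skip the intermediate files; the exact stdin
    # invocation depends on the mpiformatdb version (see its --help).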
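
And a sketch of the kind of replicated-database run discussed in the
quoted thread, assuming a 43-process job (41 workers plus the two extra
mpiblast processes, per the "nodes = fragments + 2" accounting above).
The query and output file names and the mpirun syntax are illustrative
assumptions; --db-replicate-count and --fast-evalue-approximation are the
mpiblast 1.4.0 switches mentioned in the thread, and -p/-d/-i/-o are the
usual blastall-style arguments.

    # 43 MPI processes = 41 workers + the two non-worker mpiblast ranks.
    # --db-replicate-count=5 places each fragment on five workers to even
    # out "hot" fragments; --fast-evalue-approximation avoids the long
    # serial effective-search-space computation on the rank 0 node.
    mpirun -np 43 mpiblast -p blastn -d nt \
        -i e.chrysanthemi.fas -o results.blastn \
        --db-replicate-count=5 --fast-evalue-approximation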
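
Finally, on randomizing the order of database sequences to avoid
"fragment hotspots": mpiblast itself has no option for this (as noted in
the thread), but a generic shell shuffle along the following lines could
be run before mpiformatdb. This is only a sketch; it assumes GNU shuf is
available, that '>' appears only at the start of definition lines, and
that the byte \002 does not occur in the file.

    # Flatten each FASTA record onto one line (newlines -> \002), shuffle
    # the records, then restore the newlines and drop the blank lines
    # left behind by the sentinel substitution.
    awk 'BEGIN { RS = ">"; ORS = "" }
         NR > 1 { gsub(/\n/, "\002"); print ">" $0 "\n" }' nt.14G.fas \
      | shuf \
      | tr '\002' '\n' \
      | grep -v '^$' > nt.14G.shuffled.fas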
