[Mpiblast-users] Very inefficient performance of mpiBLAST 1.4.0 with large datasets

Kevin M. Carr Wed, 20 Sep 2006 16:07:08 -0700

Fellow mpiBLAST users,

I have found the performance of mpiBLAST v. 1.4.0 to be a bit disappointing
compared to previous versions.  It is entirely possible that I am doing
something wrong so I want to elicit any suggestions you might have.


Background

Cluster: 40 cpus, 96 GB ram aggregate.
Master = 4 x dual core opteron, 32 GB ram, Fedora Core 5
Nodes(8) = 2 x dual core opteron, 8 GB ram, Fedora Core 5
Gigabit Enet interconnect.

mpiBLAST:
I have tried two variations of mpiBLAST 1.4.0:

1.  Compiled myself from source using the May 2005 (2.2.11) ncbi toolkit and
MPICH2.

2.  RPMs from Joe Landman at Scalable Informatics (Oct 2004 ncbi, LAM-MPI
7.1.1).

The problem occurs with either of these installations so it doesn't appear
to be something peculiar about a particular build.

The Data:

A set of ESTs and consensus sequences built from contigged ESTs.
28,485 sequences, total query length ~26 million bases.

Databases:

Either NCBI nr or RefSeq Protein complete, divided into 38 segments by
mpiformatdb.

Command line:

mpirun -np 40 mpiblast -p blastx -d nr -i my.data.fasta -U -f 14 -I T -e 1
-v 25 -b 25 -o my.blast.output &

The Problem:

Everything starts out fine: the cluster hums along with all cpus nearly
pegged.  Then, about a half hour into the run, the cpu utilization drops
through the floor.  The jobs continue to run, but they spend more time
idling than actually searching.  Running top on the head node, I find that
one mpiBLAST process (presumably the writer process, since it has the lowest
pid) stays at ~100% cpu all the time, while the remaining processes idle
most of the time and show only brief bursts of activity.  The same behavior
is seen on the worker nodes (except, of course, that there is no
continuously active writer process on the nodes).

While initially the Average Load for the entire cluster is >95%, it then
drops to < 20% for the remainder of the run.  At some points the load is
only 1-2%. It is really quite depressing looking at my Ganglia page and
seeing all of those wasted cycles.

The problem is most pronounced with large datasets.  When the dataset is
<= 3000 sequences I don't see it.  Of course, such a job finishes in < 30
minutes, which is about when I start to see the drop in cpu utilization on
the larger jobs.
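
For anyone who wants to reproduce the small-versus-large comparison, here is
a minimal sketch of how a query file could be cut into batches of at most
3000 sequences.  The function names, the output file naming, and the batch
size default are my own choices for illustration and are not part of
mpiBLAST.  I am not claiming that batching fixes the problem, only that it
makes the comparison easy to set up.

```python
from typing import Iterable, Iterator, List


def split_fasta(lines: Iterable[str], max_seqs: int = 3000) -> Iterator[List[str]]:
    """Yield lists of FASTA lines, each holding at most max_seqs records."""
    if max_seqs < 1:
        raise ValueError("max_seqs must be at least 1")
    batch: List[str] = []
    count = 0  # number of records (header lines) in the current batch
    for line in lines:
        if line.startswith(">"):
            # A new record begins; emit the current batch first if it is full.
            if count == max_seqs:
                yield batch
                batch, count = [], 0
            count += 1
        batch.append(line)
    if batch:
        yield batch


def write_batches(fasta_path: str, prefix: str, max_seqs: int = 3000) -> List[str]:
    """Write each batch to <prefix>.<n>.fasta and return the file names."""
    names = []
    with open(fasta_path) as handle:
        for n, batch in enumerate(split_fasta(handle, max_seqs), start=1):
            name = f"{prefix}.{n:03d}.fasta"  # hypothetical naming scheme
            with open(name, "w") as out:
                out.writelines(batch)
            names.append(name)
    return names
```

Calling write_batches("my.data.fasta", "my.data.batch") on the 28,485
sequence set should give 10 files, each of which can be passed to -i in a
separate run.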

I have searched the mailing list archives and have not found anything that
resembles this problem.  Have other people seen this behavior with version
1.4.0?  Any insights from the developers?

Thanks in advance for any and all input.
 
Kevin M. Carr

**************************
Bioinformatics Specialist
Research Technology
  Support Facility
202-D Biochemistry Bldg.
Michigan State University
East Lansing, MI  48824

Ph: (517) 353-6794
Fax:(517) 353-8638
**************************



