Stephen Ficklin wrote:
Hi Joe,

Here's an example of one of the commands I've used:

/usr/local/mpi/bin/mpirun -np 32 -nolocal -machinefile machines
/usr/local/mpiblast/bin/mpiblast -p blastn -i PT_7G4_00005_fil.fas -d
tigr -m 7 -v 3 -b 3 -a 2

Ok, what is in your machines file?
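
A quick way to answer that is to summarise the file per host. This is only a sketch: it assumes an MPICH-style machines file with one `hostname` or `hostname:ncpus` entry per line, and `summarize_machines` is a made-up helper name, not part of MPICH or mpiblast.

```shell
# Hypothetical helper: print "host slots" for each host in a machines file.
# Assumes one "hostname" or "hostname:ncpus" entry per line.
summarize_machines() {
    awk '
        /^[ \t]*(#|$)/ { next }                 # skip comments and blank lines
        {
            n = split($1, parts, ":")           # "host:ncpus" -> host, ncpus
            slots = (n > 1 ? parts[2] : 1)      # a bare hostname counts as one slot
            count[parts[1]] += slots
        }
        END { for (h in count) print h, count[h] }
    ' "$1" | sort
}
```

Running `summarize_machines machines` should list every node you expect to take part. If only one host shows up, or the slot total is well short of the `-np` value, that would be worth sorting out before looking at mpiblast itself.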


Generally I'll submit this to SGE but I seem to get the same response
whether I run it through SGE or straight on the command line

Ok. For SGE, you want to use $TMPDIR/machines as the machines file, and submit it with

qsub -pe mpich 32 ... rest of your command line with the -machinefile $TMPDIR/machines ...
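
As a rough sketch, a job script along these lines could be handed to qsub. Assumptions to check against your site: the parallel environment really is named `mpich`, it writes `$TMPDIR/machines` for the job, and SGE sets `NSLOTS` and `JOB_ID` in the job environment. The `run_mpiblast` function name and the `MPIRUN`/`MPIBLAST` variables are made up for illustration; the paths and flags are copied from the command above, minus the -a switch.

```shell
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -pe mpich 32

# Paths taken from the command line above; override them if yours differ.
MPIRUN=${MPIRUN:-/usr/local/mpi/bin/mpirun}
MPIBLAST=${MPIBLAST:-/usr/local/mpiblast/bin/mpiblast}

# Hypothetical wrapper: launch mpiblast on the slots SGE granted to this job.
run_mpiblast() {
    # NSLOTS: number of slots granted by SGE (assumed set by the scheduler).
    # $TMPDIR/machines: host list assumed to be written by the mpich PE.
    "$MPIRUN" -np "$NSLOTS" -nolocal -machinefile "$TMPDIR/machines" \
        "$MPIBLAST" -p blastn -i PT_7G4_00005_fil.fas -d tigr -m 7 -v 3 -b 3
}

# Only launch when actually running inside an SGE job.
if [ -n "${JOB_ID:-}" ]; then
    run_mpiblast
fi
```

With the `#$ -pe mpich 32` line in the script, a plain `qsub script.sh` should be enough; the script name here is just a placeholder.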

Also, don't use -a 2. This sets the number of threads to 2, which could be problematic for mpiblast. I am not sure whether Aaron and Lucas have used the -a NCPU switch, or what will happen if you do. There may be some odd interactions with the MPI libraries. Many MPI implementations are not thread safe unless built with the threading options.

Start by taking off the -a 2 (just omit the -a switch entirely). Also let us know what is in your machines file.

Joe


The tigr database is 4GB divided up using mpiformatdb into 32 pieces.
Thanks,
Stephen

On Tue, 2006-01-10 at 10:17, Joe Landman wrote:

Stephen Ficklin wrote:

I may be wrong in my assessment, but it appears that when I run an mpiblast
job, the master node (chosen by the algorithm) does all the work. I'll
get a constant load average of 1 on that node while the program is running. On
the other nodes I barely register any activity.  For small database searches I
will get results, but for larger ones it either takes so long that my patience
gives out or it finishes with errors. The last large job I ran ended with this
message after giving a few results:

[NULL_Caption] FATAL ERROR: CoreLib [002.005]  000049_0498_0531: File write 
error 0      34645.5 Bailing out with signal -1
[0] MPI Abort by user Aborting program !
[0] Aborting program!

In any case it always seems to overload the master node while the workers seem to 
be doing nothing.  I've compiled MPIBlast for OSX, Linux and Solaris and I get 
the same response on all three platforms.  Before I try any debugging I just 
wanted to check whether anyone had experienced something similar.


Hi Stephen:

  Could you tell us how you are launching the job?

Joe

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615


-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems?  Stop!  Download the new AJAX search engine that makes
searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
_______________________________________________
Mpiblast-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mpiblast-users



