Hi Drew,

One option, if you can live with just the top-scoring hits instead of all hits, is to limit results by e-value and count: try blastall's -e, -b, and -v options. Reducing the number of results can seriously reduce the amount of memory needed on the master node.
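
For illustration only, here is a minimal Python sketch of how those three options might be assembled into a command line. The helper name `blast_limit_args`, the cutoff values, the query and output file names, and the choice of blastn are all placeholders, not recommendations from this thread.

```python
import shlex


def blast_limit_args(evalue, max_alignments, max_descriptions):
    """Return blastall-style options that cap how many hits are reported."""
    return [
        "-e", str(evalue),            # expectation-value cutoff
        "-b", str(max_alignments),    # number of alignments to show
        "-v", str(max_descriptions),  # number of one-line descriptions to show
    ]


if __name__ == "__main__":
    # ASSUMPTION: program, file names, and cutoffs below are placeholders.
    command = [
        "mpiblast", "-p", "blastn",
        "-d", "vertebrate_mammalian",
        "-i", "query.fas",
        "-o", "results.txt",
    ] + blast_limit_args(1e-10, 50, 50)
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(command))
```

Tighter cutoffs mean fewer results for the master node to hold, at the cost of dropping weaker hits, so the right values depend on the analysis.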

Also, you may want to have a look at mpiBLAST-pio. It can do parallel input of the database and parallel output to the results file, provided it has a parallel filesystem such as PVFS2.

mpiBLAST-pio was *just* released, and there was some discussion about compiling it and getting it to work on the mpiBLAST developers mailing list earlier today...

-Aaron


Drew Bullard wrote:

Hi Aaron,

Thanks for the reply. I was able to split the database into 38 fragments.

My next question is:

I'm new to this mailing list so maybe this has been answered, sorry in
advance.

Is there any work in progress to split the output processing across the MPI
grid? I have clearly hit the limit of user process memory for a 32-bit
architecture (>3GB) during output processing.

thanks,
-- Drew

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Aaron Darling
Sent: Wednesday, January 25, 2006 10:25 AM
To: [email protected]
Subject: Re: [Mpiblast-users] mpiformatdb - vertebrate_mammalian


Ah.  Sorry, I should have looked more closely at your e-mail.

It seems that you're having issues with NCBI formatdb's built-in notion
that it shouldn't ever create a fragment larger than 1GB.  Whenever
formatdb is about to exceed 1GB in a fragment, it automatically starts a
new fragment.  That is why you're ending up with files with numberings
like 000.00 and 000.01 instead of just 000, 001, ..., 004.
It may be possible to change this behavior by editing
ncbi/tools/readdb.c in the NCBI toolbox.
Specifically, try changing
#define SEQFILE_SIZE_MAX 1000000000UL
to something larger, then recompile the toolbox and relink mpiblast with
the new toolbox.

The alternative is to simply make enough fragments so that each fragment
is less than 1GB in size.  mpiblast can handle cases where there are
more fragments than compute nodes/MPI processes.
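
As a rough guide to how many fragments "enough" might be, the sketch below estimates a fragment count from the figures in the formatdb log further down this thread. It rests on several assumptions: the bytes-per-letter ratio is taken from a single volume in that log, the 34G FASTA file size is used as an upper bound on the letter count, and the 90% fill target is an arbitrary safety margin. The function name `min_fragments` is hypothetical.

```python
import math

# Figures copied from the formatdb log in this thread (volume 001).
LETTERS_IN_VOLUME = 3_873_748_360
NSQ_BYTES_IN_VOLUME = 1_006_885_366

# formatdb's per-fragment cap, per the SEQFILE_SIZE_MAX define quoted above.
SEQFILE_SIZE_MAX = 1_000_000_000


def min_fragments(total_letters, target_bytes=SEQFILE_SIZE_MAX * 9 // 10):
    """Estimate the fewest fragments that keep each .nsq file under target_bytes.

    ASSUMPTION: the .nsq size scales linearly with the number of letters,
    using the ratio observed for one volume in the log.
    """
    bytes_per_letter = NSQ_BYTES_IN_VOLUME / LETTERS_IN_VOLUME
    estimated_nsq_bytes = total_letters * bytes_per_letter
    return max(1, math.ceil(estimated_nsq_bytes / target_bytes))


if __name__ == "__main__":
    # ASSUMPTION: treat the 34G FASTA file size as an upper bound on letters.
    print(min_fragments(34_000_000_000))
```

The result is only a lower bound for planning. Using more fragments than this is fine, since mpiblast can handle more fragments than MPI processes.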

-Aaron



Drew Bullard wrote:

Hi,

Thanks for the reply.

The disk has 172GB free, as shown below. The odd thing about the 005.ntm
file is that I requested the data be split into 5 parts, 0-4. That's why
I don't understand the 6th part. It is as if it created the 6th part and
then tried to open the 3rd part (002.ntm).

thanks again,

-- Drew



-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Aaron Darling
Sent: Wednesday, January 25, 2006 6:06 AM
To: [email protected]
Subject: Re: [Mpiblast-users] mpiformatdb - vertebrate_mammalian


Hi Drew,

The .ntm files are temporary files that the NCBI library uses when
constructing the database indices.
This seems to be the relevant error message:

NOTE: CoreLib [002.003] FileOpen("/opt/blast/newdb/vertebrate_mammalian.002.ntm","r") failed
ERROR: [000.000] SORTFiles failed, change TMPDIR to a partition with more free space or use -s option


Is it true that the machine is running out of space in
/opt/blast/newdb?  It's possible to check drive space on many unix
machines with the `df` command.
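
The same check can be scripted. This is a minimal sketch using Python's standard `shutil.disk_usage`. The helper names `free_gib` and `has_room` are hypothetical, and "." is a stand-in for whichever directories matter (for example the database directory and TMPDIR).

```python
import shutil


def free_gib(path):
    """Return the free space on the filesystem holding path, in GiB."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / 2**30


def has_room(path, needed_bytes):
    """Return True if the filesystem holding path has needed_bytes free."""
    return shutil.disk_usage(path).free >= needed_bytes


if __name__ == "__main__":
    # ASSUMPTION: "." is a placeholder; substitute the directories to check.
    for directory in (".",):
        print(f"{directory}: {free_gib(directory):.1f} GiB free")
```

Checking the temporary directory as well as the output directory matters here, because the sort step writes its .ntm files while the database indices are being built.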

-Aaron



Drew Bullard wrote:



Hi all,

Has anyone successfully formatted the vertebrate_mammalian NCBI database
using mpiformatdb? I have tried several iterations to format the data into
5 chunks.

Here's my latest:

$ echo $TMPDIR
/opt/blast/tmpdir
$ df -h /opt/blast
Filesystem            Size  Used Avail Use% Mounted on
api-lcbkup-1:/blast   545G  374G  172G  69% /opt/blast

$ cat .ncbirc
[NCBI]
Data=/opt/blast/data

[BLAST]
BLASTDB=/opt/blast/newdb
BLASTMAT=/opt/blast/data

[mpiBLAST]
Shared=/opt/blast/newdb
Local=/opt/blast/tmpdir

Here's the database:
$ ls -lh /opt/blast/download/vertebrate_mammalian
-rw-r--r--    1 dbullard     ri            34G Dec 17 05:04 /opt/blast/download/vertebrate_mammalian

Here's the command:
mpiformatdb -N 5 -i /opt/blast/download/vertebrate_mammalian -p F --skip-reorder

Here's the log:

========================[ Jan 24, 2006  7:03 AM ]========================
Version 2.2.10 [Oct-19-2004]
Started database file "/opt/blast/download/vertebrate_mammalian"
Version 2.2.10 [Oct-19-2004]
Started database file "/opt/blast/download/vertebrate_mammalian"
Version 2.2.10 [Oct-19-2004]
Started database file "/opt/blast/download/vertebrate_mammalian"
Version 2.2.10 [Oct-19-2004]
Started database file "/opt/blast/download/vertebrate_mammalian"
Version 2.2.10 [Oct-19-2004]
Started database file "/opt/blast/download/vertebrate_mammalian"
Closing volume /opt/blast/newdb/vertebrate_mammalian.001 with 10402 sequences, 3,873,748,360 letters(.nsq file = 1006885366 bytes; .nhr file = 1401461 bytes)
Formatted 10402 sequences in volume 1
Version 2.2.10 [Oct-19-2004]
Started database file "/opt/blast/download/vertebrate_mammalian"
Closing volume /opt/blast/newdb/vertebrate_mammalian.002 with 10894 sequences, 3,880,693,254 letters(.nsq file = 1002450993 bytes; .nhr file = 1467235 bytes)
Formatted 10894 sequences in volume 2
Version 2.2.10 [Oct-19-2004]
Started database file "/opt/blast/download/vertebrate_mammalian"
Closing volume /opt/blast/newdb/vertebrate_mammalian.003 with 8985 sequences, 3,912,490,460 letters(.nsq file = 1011952328 bytes; .nhr file = 1212563 bytes)
Formatted 8985 sequences in volume 3
Version 2.2.10 [Oct-19-2004]
Started database file "/opt/blast/download/vertebrate_mammalian"
Closing volume /opt/blast/newdb/vertebrate_mammalian.000 with 9760 sequences, 3,914,162,677 letters(.nsq file = 1002394425 bytes; .nhr file = 1319243 bytes)
Formatted 9760 sequences in volume 0
Version 2.2.10 [Oct-19-2004]
Started database file "/opt/blast/download/vertebrate_mammalian"
Closing volume /opt/blast/newdb/vertebrate_mammalian.004 with 8487 sequences, 3,925,218,720 letters(.nsq file = 1000860920 bytes; .nhr file = 1145719 bytes)
Formatted 8487 sequences in volume 4
Version 2.2.10 [Oct-19-2004]
Started database file "/opt/blast/download/vertebrate_mammalian"
Formatted 31345 sequences in volume 1
NOTE: CoreLib [002.003] FileOpen("/opt/blast/newdb/vertebrate_mammalian.002.ntm","r") failed
ERROR: [000.000] SORTFiles failed, change TMPDIR to a partition with more free space or use -s option

Here's a listing of the final directory. Note the 005.ntm file: what is
that?

$ ls -l /opt/blast/newdb/
total 6611631
-rw-r--r--    1 dbullard ri        1319243 Jan 24 08:05 vertebrate_mammalian.000.00.nhr
-rw-r--r--    1 dbullard ri         117228 Jan 24 08:05 vertebrate_mammalian.000.00.nin
-rw-r--r--    1 dbullard ri          78080 Jan 24 08:05 vertebrate_mammalian.000.00.nnd
-rw-r--r--    1 dbullard ri            356 Jan 24 08:05 vertebrate_mammalian.000.00.nni
-rw-r--r--    1 dbullard ri        1955463 Jan 24 08:05 vertebrate_mammalian.000.00.nsd
-rw-r--r--    1 dbullard ri          45207 Jan 24 08:05 vertebrate_mammalian.000.00.nsi
-rw-r--r--    1 dbullard ri       997954313 Jan 24 08:05 vertebrate_mammalian.000.00.nsq
-rw-r--r--    1 dbullard ri        4131317 Jan 24 08:51 vertebrate_mammalian.000.01.nhr
-rw-r--r--    1 dbullard ri         376248 Jan 24 08:51 vertebrate_mammalian.000.01.nin
-rw-r--r--    1 dbullard ri         250760 Jan 24 08:51 vertebrate_mammalian.000.01.nnd
-rw-r--r--    1 dbullard ri           1028 Jan 24 08:51 vertebrate_mammalian.000.01.nni
-rw-r--r--    1 dbullard ri        6720507 Jan 24 08:51 vertebrate_mammalian.000.01.nsd
-rw-r--r--    1 dbullard ri         156680 Jan 24 08:51 vertebrate_mammalian.000.01.nsi
-rw-r--r--    1 dbullard ri       875184322 Jan 24 08:51 vertebrate_mammalian.000.01.nsq
-rw-r--r--    1 dbullard ri        1401461 Jan 24 08:04 vertebrate_mammalian.001.nhr
-rw-r--r--    1 dbullard ri         124932 Jan 24 08:04 vertebrate_mammalian.001.nin
-rw-r--r--    1 dbullard ri          83216 Jan 24 08:04 vertebrate_mammalian.001.nnd
-rw-r--r--    1 dbullard ri            372 Jan 24 08:04 vertebrate_mammalian.001.nni
-rw-r--r--    1 dbullard ri        2122526 Jan 24 08:04 vertebrate_mammalian.001.nsd
-rw-r--r--    1 dbullard ri          49005 Jan 24 08:04 vertebrate_mammalian.001.nsi
-rw-r--r--    1 dbullard ri       989185886 Jan 24 08:04 vertebrate_mammalian.001.nsq
-rw-r--r--    1 dbullard ri        3618358 Jan 24 08:51 vertebrate_mammalian.002.nhr
-rw-r--r--    1 dbullard ri         330624 Jan 24 08:52 vertebrate_mammalian.002.nin
-rw-r--r--    1 dbullard ri         220344 Jan 24 08:51 vertebrate_mammalian.002.nnd
-rw-r--r--    1 dbullard ri            908 Jan 24 08:51 vertebrate_mammalian.002.nni
-rw-r--r--    1 dbullard ri              0 Jan 24 08:51 vertebrate_mammalian.002.nsd
-rw-r--r--    1 dbullard ri            648 Jan 24 08:04 vertebrate_mammalian.002.nsi
-rw-r--r--    1 dbullard ri       994808643 Jan 24 08:51 vertebrate_mammalian.002.nsq
-rw-r--r--    1 dbullard ri        3408746 Jan 24 08:51 vertebrate_mammalian.003.nhr
-rw-r--r--    1 dbullard ri         107928 Jan 24 08:05 vertebrate_mammalian.003.nin
-rw-r--r--    1 dbullard ri          71880 Jan 24 08:05 vertebrate_mammalian.003.nnd
-rw-r--r--    1 dbullard ri            332 Jan 24 08:05 vertebrate_mammalian.003.nni
-rw-r--r--    1 dbullard ri        1815511 Jan 24 08:05 vertebrate_mammalian.003.nsd
-rw-r--r--    1 dbullard ri            161 Jan 24 08:05 vertebrate_mammalian.003.nsi
-rw-r--r--    1 dbullard ri       998850878 Jan 24 08:52 vertebrate_mammalian.003.nsq
-rw-r--r--    1 dbullard ri        3653467 Jan 24 08:51 vertebrate_mammalian.004.nhr
-rw-r--r--    1 dbullard ri         101952 Jan 24 08:06 vertebrate_mammalian.004.nin
-rw-r--r--    1 dbullard ri          67896 Jan 24 08:06 vertebrate_mammalian.004.nnd
-rw-r--r--    1 dbullard ri            316 Jan 24 08:06 vertebrate_mammalian.004.nni
-rw-r--r--    1 dbullard ri        1705972 Jan 24 08:06 vertebrate_mammalian.004.nsd
-rw-r--r--    1 dbullard ri            388 Jan 24 08:06 vertebrate_mammalian.004.nsi
-rw-r--r--    1 dbullard ri       999163733 Jan 24 08:51 vertebrate_mammalian.004.nsq
-rw-r--r--    1 dbullard ri        3611609 Jan 24 08:51 vertebrate_mammalian.005.nhr
-rw-r--r--    1 dbullard ri              0 Jan 24 08:06 vertebrate_mammalian.005.nin
-rw-r--r--    1 dbullard ri       871563163 Jan 24 08:52 vertebrate_mammalian.005.nsq
-rw-r--r--    1 dbullard ri        5937997 Jan 24 08:52 vertebrate_mammalian.005.ntm
-rw-r--r--    1 dbullard ri            339 Jan 24 08:51 vertebrate_mammalian.nal

Here's the nal file:
$ cat /opt/blast/newdb/*.nal
#
# Alias file created Tue Jan 24 08:51:22 2006
#
#
TITLE /opt/blast/newdb/vertebrate_mammalian
#
DBLIST /opt/blast/newdb/vertebrate_mammalian.000
/opt/blast/newdb/vertebrate_mammalian.001
/opt/blast/newdb/vertebrate_mammalian.002
/opt/blast/newdb/vertebrate_mammalian.003
/opt/blast/newdb/vertebrate_mammalian.004
#
#GILIST
#
#OIDLIST
#

thanks,
-- Drew




-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems?  Stop!  Download the new AJAX search engine that makes
searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642


_______________________________________________
Mpiblast-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mpiblast-users




