Hi,
Yesterday I found that I always get the message
"[blastpgp] WARNING:  [000.000]  Failed to initialize search. ISAM Error code
is -5" when I run blastpgp against a database stored on the parallel
filesystem PVFS2, but the warning does not occur when I run it against a
database shared over NFS.
I set up PVFS2 on a small cluster whose nodes are identical (same CPU, same
memory). I have tried BLAST 2.2.15 and the latest 2.2.17; both versions give
the warning.
Here is an example:
1) /data/blastdb is a directory holding some databases such as nr; this
directory is shared from the master node of the cluster to several compute
nodes over NFS;
2) /pool/blastdb is the mount point of the PVFS2 (version 2.6.3) filesystem
on all nodes; the content of this directory is identical to that of
/data/blastdb (I used rsync to make them identical);
3) I used a small test set of about 100 sequences to test blastpgp against
the nr database in both of the two directories. All runs on /pool/blastdb
complained "[blastpgp] WARNING:  [000.000]  Failed to initialize search. ISAM
Error code is -5", but those on /data/blastdb did not;
4) It seems that BLAST fails to fetch some sequences from the database on
the PVFS2 filesystem and therefore complains. I used fastacmd to fetch a
sequence:
a) fetching from the database on NFS works:
$ fastacmd -s "gi|34495614" -d /data/blastdb/nr
>gi|34495614|ref|NP_899829.1| sulfite dehydrogenase - subunitB 
[Chromobacterium violaceum ATCC 12472] >gi|34101469|gb|AAQ57838.1| sulfite 
dehydrogenase - subunitB [Chromobacterium violaceum ATCC 12472]
MRAALLALALLAAPAGAASIALPNETAMLPDSGHPGYQAALRRCLVCHSADYIALQPDFDEARWRAVVDKMRLAFKAPIP
AEEAAPIAAYLADAQRRRLLRPHPPQP
b) fetching from the database on PVFS2 fails:
$ fastacmd -s "gi|34495614" -d /pool/blastdb/nr
[fastacmd] ERROR: Accesion search failed for "gi|34495614" with error code -5
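
One way to rule out a damaged copy is to check the claim in 2), that the two
directories are identical, independently of rsync. The following is a minimal
sketch in Python, assuming both directories can be read from the same node.
The function names file_digest and compare_dirs are made up for this example
and are not part of BLAST or PVFS2.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def list_files(directory):
    """Return the names of regular files directly inside directory."""
    return {
        name
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    }


def compare_dirs(dir_a, dir_b):
    """Compare the regular files in dir_a and dir_b (not recursive).

    Returns a sorted list of (filename, reason) tuples, one per mismatch.
    An empty list means the two directories hold identical files.
    """
    names_a = list_files(dir_a)
    names_b = list_files(dir_b)
    problems = []
    for name in sorted(names_a | names_b):
        if name not in names_a:
            problems.append((name, "missing in " + dir_a))
            continue
        if name not in names_b:
            problems.append((name, "missing in " + dir_b))
            continue
        path_a = os.path.join(dir_a, name)
        path_b = os.path.join(dir_b, name)
        # Cheap size check first; hash only when the sizes agree.
        if os.path.getsize(path_a) != os.path.getsize(path_b):
            problems.append((name, "size differs"))
        elif file_digest(path_a) != file_digest(path_b):
            problems.append((name, "content differs"))
    return problems
```

Calling compare_dirs("/data/blastdb", "/pool/blastdb") on one node should
return an empty list if the copies really match. Hashing the nr volumes reads
every byte, so this can take a while; a mismatch would point at the copy
rather than at BLAST.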

I then used formatdb to format the nr database on local disk and on PVFS2.
It succeeded on local disk but failed on PVFS2.

This is the formatdb log from the local disk run, which succeeded:
========================[ Dec 6, 2007  9:07 PM ]========================
Version 2.2.17 [Aug-26-2007]
Started database file "nr"
Closing volume nr with 2976302 sequences, 999,999,232 letters(.psq file = 
1002976321 bytes; .phr file = 846550430 bytes)
Formatted 2976302 sequences in volume 0
Version 2.2.17 [Aug-26-2007]
Started database file "nr"
Formatted 2702180 sequences in volume 1
SUCCESS: formatted database nr


This is the formatdb log from the PVFS2 run, which reports errors:
========================[ Dec 6, 2007  9:28 PM ]========================
Version 2.2.17 [Aug-26-2007]
Started database file "nr"
Closing volume nr with 2976302 sequences, 999,999,232 letters(.psq file = 
1002976321 bytes; .phr file = 846550430 bytes)
ERROR: [000.000] Failed to create index: ISAMErrorCode -5.

Removed single-volume database nr
FATAL ERROR: [001.000] Fatal error when adding sequence to BLAST database.
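
Since formatdb fails at the point where it creates the index, it may be worth
checking whether basic file operations behave the same on both filesystems,
independently of BLAST. The sketch below is only a guess at what might
matter: it assumes, without confirmation from the BLAST sources, that
seeking, rereading, or memory-mapping a file could be involved. The function
name probe_dir is hypothetical.

```python
import mmap
import os
import tempfile


def probe_dir(directory, size=1 << 20):
    """Exercise write, seek+read, and read-only mmap on a scratch file.

    The scratch file is created inside `directory` and removed afterwards.
    Returns a dict mapping each step name to "ok" or to an error text.
    """
    results = {}
    payload = os.urandom(size)
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsprobe_")
    try:
        with os.fdopen(fd, "w+b") as f:
            # Step 1: write the payload and force it out to the filesystem.
            try:
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())
                results["write"] = "ok"
            except OSError as exc:
                results["write"] = str(exc)

            # Step 2: seek to the middle and read the second half back.
            try:
                f.seek(size // 2)
                tail = f.read()
                same = tail == payload[size // 2:]
                results["seek_read"] = "ok" if same else "data mismatch"
            except OSError as exc:
                results["seek_read"] = str(exc)

            # Step 3: map the whole file read-only and compare its contents.
            try:
                with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                    same = m[:] == payload
                    results["mmap_read"] = "ok" if same else "data mismatch"
            except (OSError, ValueError) as exc:
                results["mmap_read"] = str(exc)
    finally:
        os.remove(path)
    return results
```

Running probe_dir("/pool/blastdb") and probe_dir("/data/blastdb") on the same
compute node and comparing the two results is the intended use. A difference
would suggest the filesystem rather than BLAST; identical results would not
rule the filesystem out, because the probe covers only these three
operations.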


Why does this happen?

A 2002 paper (J.D. Grant et al., Bioinformatics 2002, 18(5): 765-766)
reported a distributed BLAST and PSI-BLAST on a cluster with the database
stored on PVFS.

Is PVFS2 suitable for storing BLAST databases?


-- 
Yun He   Ph.D.
National Laboratory of Biomacromolecules
Institute of Biophysics, Chinese Academy of Sciences
15 Datun Road, Chaoyang District
Beijing 100101
China
Tel: +86 010 6488 8487
E-mail: [EMAIL PROTECTED]
  or    [EMAIL PROTECTED]

_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
