Heshan,

Thanks for the detailed reply. 

Our queries are short (35-40 bp) DNA sequences from nt. The database is the
entire nt database. We have around 500-1000 hits for each query, since we
run BLAST with very low thresholds for e-value. 

What we observed is that execution time increases when we go beyond 32
processors. The increase was somewhat smaller for mpiBLAST-pio than for
mpiBLAST, but run time still went up rather than down. With 32 or fewer
processors, the run times for mpiBLAST and mpiBLAST-pio are very similar.
When running on n processors, we divide the database into n-2 chunks.
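
To make the chunk arithmetic concrete, here is a minimal sketch in Python. It
assumes that two of the n processes are set aside for coordination rather than
searching (which is why we use n-2 chunks) and that every query is searched
against every chunk. The function names are hypothetical and are not part of
mpiBLAST.

```python
def fragment_count(n_procs: int) -> int:
    """Number of database chunks when n_procs - 2 processes search (assumption)."""
    if n_procs < 3:
        raise ValueError("need at least 3 processes to leave one searcher")
    return n_procs - 2


def search_tasks(n_queries: int, n_procs: int) -> int:
    """Count (query, chunk) pairs, assuming each query runs against each chunk."""
    return n_queries * fragment_count(n_procs)


if __name__ == "__main__":
    # Illustrative process counts only; these are not measurements.
    for n in (16, 32, 64):
        print(n, fragment_count(n), search_tasks(1000, n))
```

Under these assumptions the number of (query, chunk) pairs grows linearly with
the processor count for a fixed query set.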

Thanks,
Ravi

-----Original Message-----
From: Heshan Lin [mailto:[EMAIL PROTECTED] 
Sent: Monday, March 12, 2007 1:47 AM
To: [EMAIL PROTECTED]; [email protected]
Subject: RE: [Mpiblast-users] File systems that work well with mpiBLAST-pio

Hi Ravi,

The paper compared pioBLAST with mpiBLAST 1.2; mpiBLAST 1.4 has improved
performance a lot since then =). Also, mpiBLAST-pio does not work exactly
the same way as pioBLAST. For the differences, please see this message I
posted to the mailing list earlier:
http://sourceforge.net/mailarchive/forum.php?thread_id=9626643&forum_id=43689

Currently mpiBLAST-pio provides two output options. 

1) parallel-write. This is the more efficient output strategy, but it requires
special support from the parallel file system to ensure correct results.
It has been tested on PVFS2 and SGI XFS. I don't have access to other
parallel file systems, but the parallel-write strategy should work on file
systems that support the level-2 write access (independent, non-contiguous
write) described in the following MPI-IO paper:
Rajeev Thakur, William Gropp, and Ewing Lusk. "A case for using MPI's
derived datatypes to improve I/O performance". In Proceedings of SC98: High
Performance Networking and Computing, November 1998.

2) master-write. This output strategy is less scalable but it does not
require special support from file systems, and it is recommended on systems
with NFS. 
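
To illustrate the difference between the two options, here is a rough sketch
in Python. It is not mpiBLAST-pio code and does not use MPI-IO: it only mimics
the access pattern, with each "worker" writing its own records at precomputed,
non-contiguous offsets via os.pwrite (Unix only), versus a single writer
appending everything in order. The function names, the record contents, and the
worker assignment are all made up for the example.

```python
import os
import tempfile


def plan_offsets(record_sizes):
    """Return the starting byte offset of each record in the final output file."""
    offsets, pos = [], 0
    for size in record_sizes:
        offsets.append(pos)
        pos += size
    return offsets


def parallel_style_write(path, records, owners, n_workers):
    """Each worker writes only its own records, at offsets known in advance."""
    offsets = plan_offsets([len(r) for r in records])
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # The outer loop stands in for processes that would run concurrently.
        for worker in range(n_workers):
            for rec, off, owner in zip(records, offsets, owners):
                if owner == worker:
                    os.pwrite(fd, rec, off)  # positioned write, no shared pointer
    finally:
        os.close(fd)


def master_style_write(path, records):
    """A single writer receives every record and writes them in order."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(rec)


if __name__ == "__main__":
    records = [b"hit-A\n", b"hit-BB\n", b"hit-CCC\n", b"hit-D\n"]
    owners = [0, 1, 0, 1]  # hypothetical assignment of records to workers
    with tempfile.TemporaryDirectory() as tmp:
        p1 = os.path.join(tmp, "parallel.out")
        p2 = os.path.join(tmp, "master.out")
        parallel_style_write(p1, records, owners, 2)
        master_style_write(p2, records)
        with open(p1, "rb") as a, open(p2, "rb") as b:
            print(a.read() == b.read())
```

Both functions are meant to produce the same file. The point of the sketch is
that the first relies on the file system handling independent writes to
interleaved regions correctly, while the second funnels all output through one
writer.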

The performance difference between mpiBLAST 1.4 and mpiBLAST-pio depends
heavily on the characteristics of the query set and the database. In our
experience, even with the master-write output option, mpiBLAST-pio
shows a significant performance improvement when searching queries against a
large database with a large output volume (e.g. searching sequences randomly
sampled from the NT database against the NT database itself). However, when
the queries produce only a small amount of output, mpiBLAST 1.4 and
mpiBLAST-pio deliver similar search throughput.

Which database and query set were you using for the performance comparison? 

Thanks,
Heshan

________________________________________
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ravi
Vijaya Satya [Contractor, Foreign National]
Sent: Friday, March 09, 2007 12:41 PM
To: [email protected]
Subject: [Mpiblast-users] File systems that work well with mpiBLAST-pio

I was wondering whether mpiBLAST-pio requires any particular file system
features to give better performance. The Lin et al. paper on pioBLAST showed
that pioBLAST performs better than mpiBLAST. However, I could not see any
significant improvement in performance over mpiBLAST (1.4.0) using
mpiBLAST-pio.

Can anyone list some file systems that have the parallel I/O support necessary
for mpiBLAST-pio? Is Lustre one such file system?

Thanks,
Ravi


