Heshan, Thanks for the detailed reply.
Our queries are short (35-40 bp) DNA sequences from nt, and the database is the entire nt database. We get around 500-1000 hits for each query, since we run BLAST with very low thresholds for e-value.

What we observed is that execution times increase when we go beyond 32 processors. The increase was somewhat smaller for mpiBLAST-pio than for mpiBLAST, but run times still went up rather than down. With 32 or fewer processors, the run times for mpiBLAST and mpiBLAST-pio are very similar.

When running on n processors, we divide the database into n-2 chunks.

Thanks,
Ravi

-----Original Message-----
From: Heshan Lin [mailto:[EMAIL PROTECTED]
Sent: Monday, March 12, 2007 1:47 AM
To: [EMAIL PROTECTED]; [email protected]
Subject: RE: [Mpiblast-users] File systems that work well with mpiBLAST-pio

Hi Ravi,

In the paper, pioBLAST was compared with mpiBLAST 1.2; mpiBLAST 1.4 has improved performance a lot since then =). Besides, mpiBLAST-pio does not work exactly the same way as pioBLAST; please refer to the following message I posted to the mailing list earlier for their differences.

http://sourceforge.net/mailarchive/forum.php?thread_id=9626643&forum_id=43689

Currently mpiBLAST-pio provides two output options.

1) parallel-write. This is the more efficient output strategy, but it requires special support from parallel file systems to ensure that the results are correct. It has been tested on PVFS2 and SGI XFS. I don't have access to other parallel file systems, but the parallel-write strategy should work on file systems that support the level-2 write access (independent, non-contiguous write) described in the following MPI-IO paper:

Rajeev Thakur, William Gropp, and Ewing Lusk. "A case for using MPI's derived datatypes to improve I/O performance". In Proceedings of SC98: High Performance Networking and Computing, November 1998.

2) master-write. This output strategy is less scalable, but it does not require special support from file systems, and it is recommended on systems with NFS.

The performance difference between mpiBLAST 1.4 and mpiBLAST-pio depends heavily on the characteristics of the query set and the database. In our experience, even with the master-write output option, mpiBLAST-pio shows a significant performance improvement when searching queries against large databases with bulky output volume (e.g. searching sequences randomly sampled from the NT database against the NT database itself). However, when searching queries that produce a small amount of output, mpiBLAST 1.4 and mpiBLAST-pio deliver similar search throughput.

Which database and query set were you using for the performance comparison?

Thanks,
Heshan

________________________________________
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ravi Vijaya Satya [Contractor, Foreign National]
Sent: Friday, March 09, 2007 12:41 PM
To: [email protected]
Subject: [Mpiblast-users] File systems that work well with mpiBLAST-pio

I was wondering whether mpiBLAST-pio requires any particular file system features to give better performance. In the Lin et al. paper on pioBLAST, it was shown that pioBLAST performs better than mpiBLAST. However, I could not see any significant improvement in performance over mpiBLAST (1.4.0) using mpiBLAST-pio.

Can anyone list some file systems that have the parallel I/O support necessary for mpiBLAST-pio? Is Lustre one such file system?

Thanks,
Ravi
_______________________________________________
Mpiblast-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mpiblast-users
