Hello,

I'm resurrecting this thread in light of the htslib + samtools 1.0 release. I see in the code of hfile.c that the buffer capacity is capped at 32k. This is not good for parallel file systems, where the typical block size is 4MB.
Also, is there a way to control the block size at runtime, or do we need to compile a "special" version for the cluster? If we do need a special version, it would be nice to have a single central constant to modify to control the block size.

Thanks,
Louis

On 13-08-28 05:40 AM, John Marshall wrote:
> On 26 Aug 2013, at 18:11, Louis Letourneau wrote:
>> I was having performance issues with mpileup on our GPFS cluster. I
>> traced it back to the way BAMs are processed. Headers (18 bytes) are
>> read first, then the rest of the block. This is not efficient on
>> distributed FS.
>>
>> I saw that there was a TODO I/O buffering in the implementation of
>> KNET. In the meantime I forced KNET off by removing the define and
>> forced buffering on fread with setvbuf. In my case, to 4MB.
>
> As it happens, one of the things I have been doing in the last couple of
> weeks is implementing I/O buffering for htslib's low-level file access. This
> is a new layer above knetfile, collecting all the I/O system calls in one
> place so we have the opportunity to detect I/O errors robustly, using
> fstat(2) on local files to determine appropriate buffer sizes as mentioned in
> that thread from last October, and allowing for format autodetection even on
> pipes by peeking at the buffer.
>
> Have a look at the io branch at https://github.com/samtools/htslib if you're
> interested. There's an upcoming commit to plug the BAM/SAM etc I/O into it,
> and then it will be interesting to see what if any performance changes there
> are on Lustre and other distributed file systems.
>
> Cheers,
>
> John
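
For anyone following along, the setvbuf workaround mentioned in the quoted message can be sketched roughly as below. This is only a minimal illustration, assuming a POSIX system. The names choose_buffer_size, buffered_file, buffered_open and buffered_close are hypothetical helpers made up for this example; they are not part of htslib or samtools, and this is not how hfile.c sizes its buffer. The sketch takes the larger of a caller-supplied minimum (4MB in my case) and the st_blksize hint reported by stat(2), then hands stdio a buffer of that size. The buffer is allocated explicitly because some C libraries ignore the size argument when setvbuf is given a NULL buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical helper: return the larger of the file system's preferred
 * I/O block size (st_blksize) and a caller-supplied minimum. If stat()
 * fails, fall back to the minimum. */
static size_t choose_buffer_size(const char *path, size_t min_size)
{
    struct stat st;
    size_t size = min_size;

    if (stat(path, &st) == 0 && st.st_blksize > 0 &&
        (size_t) st.st_blksize > size)
        size = (size_t) st.st_blksize;
    return size;
}

/* Hypothetical wrapper: a stdio stream plus the buffer backing it. */
typedef struct {
    FILE *fp;
    char *buf;
} buffered_file;

/* Open path for reading with a fully buffered stream of buffer_size bytes.
 * Returns 0 on success, -1 on failure. */
static int buffered_open(buffered_file *bf, const char *path, size_t buffer_size)
{
    bf->fp = NULL;
    bf->buf = malloc(buffer_size);
    if (bf->buf == NULL)
        return -1;

    bf->fp = fopen(path, "rb");
    if (bf->fp == NULL) {
        free(bf->buf);
        bf->buf = NULL;
        return -1;
    }

    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(bf->fp, bf->buf, _IOFBF, buffer_size) != 0) {
        fclose(bf->fp);
        free(bf->buf);
        bf->fp = NULL;
        bf->buf = NULL;
        return -1;
    }
    return 0;
}

/* Close the stream first, then release the buffer it was using. */
static int buffered_close(buffered_file *bf)
{
    int ret = fclose(bf->fp);
    free(bf->buf);
    bf->fp = NULL;
    bf->buf = NULL;
    return ret;
}
```

With something like this in place, a small read such as the 18-byte BAM block header followed by a read of the rest of the block is served from the same large buffer, so the file system sees one big request instead of many small ones. A caller would do roughly: buffered_open(&bf, path, choose_buffer_size(path, 4u << 20)), read through bf.fp with fread as usual, then buffered_close(&bf).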