Does anyone out there use data in htslib that *isn't* just bgzf wrapped? I'm working on speeding up the bgzf code with multi-threading improvements, but it would be useful to have a better set of tests than simple BAM and BCF.
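For anyone wanting to check their own files, one way to tell BGZF from plain gzip from something that is not gzip at all is to look at the first few bytes. The sketch below is only an illustration based on the gzip header layout (RFC 1952) and the BGZF "BC" extra subfield described in the SAM/BAM specification. The function name `classify_compression` and its return strings are made up for this example, and it is not the check htslib itself performs in bgzf.c, which may be stricter.

```python
import struct


def classify_compression(head: bytes) -> str:
    """Classify the leading bytes of a file as 'bgzf', 'gzip' or 'not-gzip'.

    Hypothetical helper for illustration only; not part of htslib.
    """
    # gzip member header: ID1=0x1f, ID2=0x8b, CM=8 (deflate), then FLG.
    if len(head) < 10 or head[0] != 0x1F or head[1] != 0x8B or head[2] != 8:
        return "not-gzip"

    # BGZF requires the FEXTRA flag (bit 2 of FLG) so it can carry a subfield.
    if not head[3] & 0x04 or len(head) < 12:
        return "gzip"

    # XLEN is a little-endian 16-bit length of the extra field at offset 10.
    (xlen,) = struct.unpack_from("<H", head, 10)
    extra = head[12:12 + xlen]

    # Walk the extra subfields looking for SI1='B', SI2='C', SLEN=2.
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
        if si1 == ord("B") and si2 == ord("C") and slen == 2:
            return "bgzf"
        pos += 4 + slen

    return "gzip"
```

Reading the first 4096 bytes of a file in binary mode and passing them to this function is more than enough, since only the first gzip member header is inspected. Note that "not-gzip" only means the gzip magic is absent; it does not prove the data is uncompressed BAM or BCF.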
The bgzf.c code deals with totally uncompressed data (so not samtools view -u, but no zlib wrap at all) and also vanilla gzip data. I don't believe such formats are in use in the wild for BAM, and I don't see any point in using gzip over bgzf for BCF either, but I don't know much about how this library is used beyond that narrow world.

Tabix is one possibility, but I've no idea what sort of data people use tabix with. The tabix docs state it must be used on bgzip (bgzf) data, so I assume the gzip and uncompressed formats are irrelevant to it.

I can envisage other Z_FULL_FLUSH oriented gzip interfaces that still permit random access (by something other than bai or csi), but do such tools exist, and are they using htslib?

It's possible the gzip code dates back to razip, so if anyone has any background on why this code exists, please enlighten me.

Regards,

James

PS. For what it's worth, there are bugs in the pure gzip handling code within htslib anyway, so I *hope* it's unused!

-- 
James Bonfield (j...@sanger.ac.uk) | Hora aderat briligi. Nunc et Slythia Tova
                                   | Plurima gyrabant gymbolitare vabo;
  A Staden Package developer:      | Et Borogovorum mimzebant undique formae,
https://sf.net/projects/staden/    | Momiferique omnes exgrabure Rathi.
_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help