On Thursday, 4 January 2018 at 02:44:09 UTC, Steven Schveighoffer
wrote:
On 1/3/18 12:03 PM, Andrew wrote:
Thanks for looking into this.
So it looks like the file you have is a concatenated gzip file.
If I gunzip the file and recompress it, it works properly.
Looking at the docs of zlib inflate [1]:
"Unlike the gunzip utility and gzread() ..., inflate() will
not automatically decode concatenated gzip streams. inflate()
will return Z_STREAM_END at the end of the gzip stream. The
state would need to be reset to continue decoding a subsequent
gzip stream."
So what is happening is the inflate function is returning
Z_STREAM_END, and I'm considering the stream done from that
return code.
I'm not sure yet how to fix this. I suppose I can check if any
more data exists, and then re-init and continue. I have to look
up what a concatenated gzip file is. gzread isn't good for
generic purposes, because it requires an actual file input (I
want to support any input type, including memory data).
-Steve
[1]
https://github.com/dlang/phobos/blob/master/etc/c/zlib.d#L874
Ah, thank you, that makes sense. These files are compressed with
the bgzip utility so that they can be indexed, meaning specific
rows can be extracted quickly. There are more details here:
http://www.htslib.org/doc/tabix.html
and the code can be found here:
https://github.com/samtools/htslib/blob/develop/bgzf.c
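
For what it's worth, the "check if any more data exists, then re-init and continue" idea quoted above can be sketched in a few lines. This is only an illustration in Python using its standard zlib module, not iopipe code, and the function name inflate_all_members is made up for the example. With the C API, the analogous step would presumably be a call to inflateReset() once inflate() returns Z_STREAM_END while input remains.

```python
import gzip
import zlib


def inflate_all_members(data: bytes) -> bytes:
    """Decompress every gzip member in `data`, not just the first one.

    Hypothetical helper for illustration only.
    """
    out = []
    while data:
        # wbits=31 makes zlib expect a gzip header and trailer.
        # A fresh object per member plays the role of "resetting the state".
        d = zlib.decompressobj(31)
        out.append(d.decompress(data))
        out.append(d.flush())
        if not d.eof:
            # The end-of-stream marker was never reached for this member.
            raise ValueError("truncated gzip member")
        # Anything past the end of this member is the start of the next one.
        data = d.unused_data
    return b"".join(out)


# Two independently compressed chunks, concatenated like bgzip blocks.
blob = gzip.compress(b"hello ") + gzip.compress(b"world")
result = inflate_all_members(blob)
```

A single decompressor stops after the first member here, which matches the behaviour described in the zlib docs; the loop simply starts over on whatever input is left. It works on in-memory data, so it does not depend on a file handle the way gzread does.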