[Issue 3191] std.zlib.UnCompress errors if buffer is reused
https://issues.dlang.org/show_bug.cgi?id=3191

--- Comment #8 from github-bugzi...@puremagic.com ---
Commits pushed to stable at https://github.com/dlang/phobos

https://github.com/dlang/phobos/commit/5cf20bd8773e0f746c74b19137a03d699cdfe28b
Fixed issues 3191 and 9505

https://github.com/dlang/phobos/commit/8e47bfc54c106b222835db3c3a37e81b52ab2f04
Merge pull request #5720 from kas-luthor/fix-zlib

--
[Issue 3191] std.zlib.UnCompress errors if buffer is reused
https://issues.dlang.org/show_bug.cgi?id=3191

--- Comment #7 from github-bugzi...@puremagic.com ---
Commits pushed to master at https://github.com/dlang/phobos

https://github.com/dlang/phobos/commit/5cf20bd8773e0f746c74b19137a03d699cdfe28b
Fixed issues 3191 and 9505

std.zlib.UnCompress.uncompress() now consumes as much of the input buffer as
possible and extends / reallocates the output buffer accordingly.

It also sets inputEnded = 1 when Z_STREAM_END is returned from inflate() so
that additional data after the compressed stream is not consumed.

https://github.com/dlang/phobos/commit/8e47bfc54c106b222835db3c3a37e81b52ab2f04
Merge pull request #5720 from kas-luthor/fix-zlib

Fix zlib issues 3191, 9505 and 8779
merged-on-behalf-of: MetaLang

--
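As an illustration of the behaviour described in the commit message, here is a minimal sketch that feeds a compressed stream through a single, reused input buffer. It assumes a Phobos release that contains the fix; the helper name `inflateInChunks` and the test data are hypothetical and not part of Phobos or of the original report.

```d
import std.algorithm.comparison : min;
import std.zlib : compress, UnCompress;

/// Hypothetical helper: inflates `packed` by copying it, chunk by chunk,
/// into one input buffer that is overwritten on every round.
ubyte[] inflateInChunks(const(ubyte)[] packed, size_t chunkSize)
{
    assert(chunkSize > 0, "chunkSize must be positive");

    auto uc = new UnCompress();
    auto buffer = new ubyte[](chunkSize); // the only input buffer
    ubyte[] result;

    while (packed.length > 0)
    {
        immutable n = min(chunkSize, packed.length);
        buffer[0 .. n] = packed[0 .. n]; // overwrite the previous chunk
        // Appending copies the returned data into `result`.
        result ~= cast(const(ubyte)[]) uc.uncompress(buffer[0 .. n]);
        packed = packed[n .. $];
    }
    result ~= cast(const(ubyte)[]) uc.flush();
    return result;
}

void main()
{
    // Made-up, compressible test data.
    ubyte[] original;
    foreach (i; 0 .. 100_000)
        original ~= cast(ubyte)(i % 251);

    auto packed = compress(original);
    assert(inflateInChunks(packed, 1024) == original);
}
```

On a Phobos version without the fix, the same loop is expected to hit the error reported in this issue, because uncompress() could return before it had consumed the whole input buffer.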
[Issue 3191] std.zlib.UnCompress errors if buffer is reused
https://issues.dlang.org/show_bug.cgi?id=3191

github-bugzi...@puremagic.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--
[Issue 3191] std.zlib.UnCompress errors if buffer is reused
https://issues.dlang.org/show_bug.cgi?id=3191

Andre changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|alver...@gmail.com          |nob...@puremagic.com

--
[Issue 3191] std.zlib.UnCompress errors if buffer is reused
https://issues.dlang.org/show_bug.cgi?id=3191

Andre changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|nob...@puremagic.com        |alver...@gmail.com

--
[Issue 3191] std.zlib.UnCompress errors if buffer is reused
https://issues.dlang.org/show_bug.cgi?id=3191

Andre changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alver...@gmail.com

--- Comment #6 from Andre ---
(In reply to Stewart Gordon from comment #5)
> (In reply to Justin Whear from comment #3)
> > The DEFLATE decompression algorithm relies on the results of previous
> > blocks, as it tries to reuse the encoding tree. From the RFC: "Note that a
> > duplicated string reference may refer to a string in a previous block; i.e.,
> > the backward distance may cross one or more block boundaries."
>
> If I'm not mistaken, deflate blocks are independent of chunks that the
> datastream may be split into for decompression. (OK, maybe "block" was the
> wrong word in my original bug report.)
>
> That said,
> (In reply to Justin Whear from comment #4)
> > I should note that some amount of block reuse is possible if the blocks are
> > sufficiently distant; the maximum distance for a back reference is 32,768
> > bytes.
>
> This says to me that it is a window of 32768 bytes (or maybe 32769 or 32770
> bytes) that needs to be kept in memory at a given time, regardless of block
> or chunk boundaries. So why did I find that alternating between buffers
> works even if the buffers are much smaller than this? (Indeed, I've a vague
> recollection of finding that it works if each of the buffers is a single
> byte.)

To your knowledge, is this fixed in std/zlib or etc/c/zlib? Or did you use a
workaround?

--
[Issue 3191] std.zlib.UnCompress errors if buffer is reused
https://issues.dlang.org/show_bug.cgi?id=3191

Justin Whear jus...@economicmodeling.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jus...@economicmodeling.com

--- Comment #3 from Justin Whear jus...@economicmodeling.com ---
The DEFLATE decompression algorithm relies on the results of previous blocks,
as it tries to reuse the encoding tree. From the RFC:

"Note that a duplicated string reference may refer to a string in a previous
block; i.e., the backward distance may cross one or more block boundaries.
However a distance cannot refer past the beginning of the output stream."
(http://www.w3.org/Graphics/PNG/RFC-1951#huffman)

So I think the bug should be clarified to this function allowing block reuse
in the first place.

--
[Issue 3191] std.zlib.UnCompress errors if buffer is reused
https://issues.dlang.org/show_bug.cgi?id=3191

--- Comment #4 from Justin Whear jus...@economicmodeling.com ---
I should note that some amount of block reuse is possible if the blocks are
sufficiently distant; the maximum distance for a back reference is 32,768
bytes.

--
[Issue 3191] std.zlib.UnCompress errors if buffer is reused
https://issues.dlang.org/show_bug.cgi?id=3191

--- Comment #5 from Stewart Gordon s...@iname.com ---
(In reply to Justin Whear from comment #3)
> The DEFLATE decompression algorithm relies on the results of previous
> blocks, as it tries to reuse the encoding tree. From the RFC: "Note that a
> duplicated string reference may refer to a string in a previous block; i.e.,
> the backward distance may cross one or more block boundaries."

If I'm not mistaken, deflate blocks are independent of chunks that the
datastream may be split into for decompression. (OK, maybe "block" was the
wrong word in my original bug report.)

That said,
(In reply to Justin Whear from comment #4)
> I should note that some amount of block reuse is possible if the blocks are
> sufficiently distant; the maximum distance for a back reference is 32,768
> bytes.

This says to me that it is a window of 32768 bytes (or maybe 32769 or 32770
bytes) that needs to be kept in memory at a given time, regardless of block
or chunk boundaries. So why did I find that alternating between buffers
works even if the buffers are much smaller than this? (Indeed, I've a vague
recollection of finding that it works if each of the buffers is a single
byte.)

--
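One way to look at the question above: zlib's inflate() keeps its own copy of the recent output (the sliding window) inside the stream state, so the input buffers handed to it do not have to cover the back-reference distance. The sketch below is only an illustration under that reading. It assumes a Phobos release with the fix for this issue, and the helper names `inflateAlternating` and `repeatedNoise` and the test data are hypothetical. The data is 20,000 pseudo-random bytes followed by the same 20,000 bytes again, which gives the compressor the opportunity to emit long-distance back-references.

```d
import std.zlib : compress, UnCompress;

/// Hypothetical helper: inflates `packed` one byte at a time,
/// alternating between two single-byte input buffers.
ubyte[] inflateAlternating(const(ubyte)[] packed)
{
    auto uc = new UnCompress();
    auto bufA = new ubyte[](1);
    auto bufB = new ubyte[](1);
    ubyte[] result;

    foreach (i, b; packed)
    {
        auto buf = (i % 2 == 0) ? bufA : bufB; // the buffer not used last time
        buf[0] = b;
        result ~= cast(const(ubyte)[]) uc.uncompress(buf);
    }
    result ~= cast(const(ubyte)[]) uc.flush();
    return result;
}

/// Hypothetical test data: 20,000 pseudo-random bytes, then the same bytes again.
ubyte[] repeatedNoise()
{
    ubyte[] half;
    uint state = 12_345; // simple linear congruential generator
    foreach (_; 0 .. 20_000)
    {
        state = state * 1_664_525u + 1_013_904_223u;
        half ~= cast(ubyte)(state >> 24);
    }
    return half ~ half;
}

void main()
{
    auto original = repeatedNoise();
    assert(inflateAlternating(compress(original)) == original);
}
```

The input buffers here hold one byte each, far less than the 32,768-byte maximum distance, yet the second half of the data can still be rebuilt from the first, because the history it refers to lives in zlib's internal window rather than in the caller's buffers.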
[Issue 3191] std.zlib.UnCompress errors if buffer is reused
https://issues.dlang.org/show_bug.cgi?id=3191

Walter Bright bugzi...@digitalmars.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|1.046                       |D1

--
[Issue 3191] std.zlib.UnCompress errors if buffer is reused
http://d.puremagic.com/issues/show_bug.cgi?id=3191

Andrej Mitrovic andrej.mitrov...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrej.mitrov...@gmail.com

--- Comment #2 from Andrej Mitrovic andrej.mitrov...@gmail.com 2011-05-24 22:05:17 PDT ---
Greetings, I come from the future. Here's a modern implementation of your
sample:

import std.zlib;
import std.stdio;

const size_t BLOCK_SIZE = 1024;

void main(string[] a)
{
    auto file = File(a[1], "r");
    auto uc = new UnCompress();

    void[] ucData;
    ubyte[] block = new ubyte[BLOCK_SIZE];

    foreach (ubyte[] buffer; file.byChunk(BLOCK_SIZE))
    {
        writeln(buffer.length);
        ucData ~= uc.uncompress(buffer);
    }

    ucData ~= uc.flush();
    writefln("Finished: %s", ucData.length);
}

Still errors out. But I have a hunch it has something to do with buffer being
reused by file.byChunk, and zlib might internally be storing a pointer to the
buffer while the GC might deallocate it in the meantime.

Something like that, because if you .dup your buffer, you won't get errors
anymore:

import std.zlib;
import std.stdio;

const size_t BLOCK_SIZE = 1024;

void main(string[] a)
{
    auto file = File(a[1], "r");
    auto uc = new UnCompress();

    void[] ucData;
    ubyte[] block = new ubyte[BLOCK_SIZE];

    foreach (ubyte[] buffer; file.byChunk(BLOCK_SIZE))
    {
        writeln(buffer.length);
        ucData ~= uc.uncompress(buffer.dup);
    }

    ucData ~= uc.flush();
    writefln("Finished: %s", ucData.length);
}

It might just be that zlib expects all data passed in to be valid while you
use the UnCompress() class. I have no other explanation.

--
[Issue 3191] std.zlib.UnCompress errors if buffer is reused
http://d.puremagic.com/issues/show_bug.cgi?id=3191

--- Comment #1 from Stewart Gordon s...@iname.com 2009-07-19 15:37:37 PDT ---
Created an attachment (id=427)
 -- (http://d.puremagic.com/issues/attachment.cgi?id=427)
Sample compressed data file

This is the data file I used with the testcase program. For the curious ones
among you, it's the content extracted from a PNG file's IDAT chunk.

--