On Fri, Sep 4, 2009 at 18:02, Xueming Shen <xueming.s...@sun.com> wrote:
> Thanks for the comments.
>
> Brandon Long wrote:
>>
>> In the new Deflater.deflate, you are checking valid flush values
>> against NO_FLUSH twice (and missing FULL_FLUSH).
>>
>
> Good catch, a bad copy/paste. More unit tests will be written to catch
> this kind of problem after we finalize the APIs.
>
>> In the new DeflaterOutputStream.flush, you loop as long as some output
>> is generated, which can call deflate an extra time. Normally, code
>> that does this loops until there is no def.needsInput and the output
>> is less than the size of the output buffer. I would have thought it
>> would have generated extra output, but then the loop would never
>> finish... so I guess this must work.
>>
>
> The current code works, but your suggestion of using len < b.length
> (in effect, zlib's avail_out > 0) seems a better and correct approach,
> which can save an extra round. A quick scan of zlib's
> deflate/flush_pending shows it should work; I will update to that after
> giving the code a more careful read and running some tests. Given that
> each write() loops until needsInput() returns true, I don't think we
> need to consider this in flush().
>
>> Overall, I'm least happy with #4, since I feel it leaves a bug.
>> flush() on a stream should flush everything I've written to the
>> stream. The bug is that it currently doesn't, and this doesn't fix it.
>> It makes it possible for people to fix it (which isn't possible
>> currently without using a completely separate implementation), but it
>> doesn't fix the bug.
>>
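For illustration, a minimal sketch of the two fixes discussed above, in
terms of the NO_FLUSH/SYNC_FLUSH/FULL_FLUSH constants of the proposed
API. First, the argument check that the copy/paste bug broke, testing
each flush constant exactly once:

    // Reject anything that is not one of the three flush modes
    // (the bad copy/paste tested NO_FLUSH twice and FULL_FLUSH never).
    if (flush != Deflater.NO_FLUSH && flush != Deflater.SYNC_FLUSH
            && flush != Deflater.FULL_FLUSH)
        throw new IllegalArgumentException();

Second, a sketch of what DeflaterOutputStream.flush() might look like
with the len < b.length termination described above; buf, def, and out
are the stream's existing fields, and syncFlush is assumed to be the
flag the proposal adds:

    public void flush() throws IOException {
        if (syncFlush && !def.finished()) {
            int len;
            while ((len = def.deflate(buf, 0, buf.length,
                                      Deflater.SYNC_FLUSH)) > 0) {
                out.write(buf, 0, len);
                // A partially filled buffer means zlib's avail_out > 0,
                // i.e. nothing is pending, so we save an extra round.
                if (len < buf.length)
                    break;
            }
        }
        out.flush();
    }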
I think "bug" is going too far. There are OutputStreams that can't dispose of previously read data without reading some more first. It depends on the transformation. DeflaterOutputStream is in a gray area - it "could" write all the data to its underlying stream, but at the cost of sacrificing some compression (which is its purpose). > Understood and "agreed":-) I'm wiling to change position under more > pressure:-) And we > can add that anytime. It's better than put it in now and have to take it out > later or add in some > ugly workaround. Maybe we should understand the risk. Doing a SYNC_FLUSH on every DOS.flush() won't cause the compression/decompression to "break", you will just have very slow and very bad compression. How bad could it be? Suppose we test with random data, and doing a SYNC_FLUSH on every byte? Presumably the "compressed" output will be larger than the input by some factor. If that factor is close to 1, then it's probably OK...It's "only" a performance problem. Anyways, I am leaning towards changing DOS to do the Right Thing. Martin