Hi,

while looking into generalizing COMPRESS-207 I realized our CompressorOutputStreams didn't provide bytesWritten, unlike the ArchiveOutputStreams, while the InputStreams all provide bytesRead.
And I also realized I didn't really know what bytesRead actually meant: bytes read from the compressed stream, or uncompressed bytes read from *this* stream.

A quick look into the implementation shows I'm not the only one who is confused. For the CompressorInputStream implementations it is the number of uncompressed bytes. For the ArchiveInputStreams it is the number of bytes read from the underlying stream. For ArchiveOutputStream the picture is not as clear: zip seems to count the number of bytes written to the underlying stream, while ar (which doesn't compress anything) does not count the extra archive header it writes, for example.

I'm not really sure who uses the counts, but any attempt to make the counts consistent is bound to break their expectations. Should we just document the current state, and look into making the ArchiveOutputStreams consistent? What would be the "correct" choice for CompressorOutputStreams? I'd probably prefer uncompressed bytes, to mirror CompressorInputStream.

This also raises the question of what to do with them in compress2 (unless that's a dead end). The current compress2 branch doesn't contain any counts at all. I'd probably prefer to go with a notifier approach like COMPRESS-207 suggests and drop the getBytesRead/Written methods altogether. Whoever wants that information would subscribe to the notifications, which would then contain both byte counts.

Stefan
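
PS: to make the two possible meanings of the counts concrete, here is a rough sketch of what a notifier carrying both byte counts might look like. All names in it (ByteCountListener, CountingOutputStream, NotifyingGzipOutputStream, ByteCountDemo) are made up for illustration and are not part of Commons Compress or of the COMPRESS-207 proposal; java.util.zip's GZIPOutputStream only serves as a stand-in compressor.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical listener that receives both running totals. */
interface ByteCountListener {
    void bytesProcessed(long uncompressedTotal, long compressedTotal);
}

/** Counts the bytes that actually reach the underlying stream. */
class CountingOutputStream extends FilterOutputStream {
    private long count;

    CountingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        count += len;
    }

    long getCount() {
        return count;
    }
}

/** Hypothetical compressor stream that notifies instead of exposing getBytesWritten. */
class NotifyingGzipOutputStream extends OutputStream {
    private final CountingOutputStream compressedSide;
    private final GZIPOutputStream gzip;
    private final ByteCountListener listener;
    private long uncompressed;

    NotifyingGzipOutputStream(OutputStream target, ByteCountListener listener) throws IOException {
        this.compressedSide = new CountingOutputStream(target);
        this.gzip = new GZIPOutputStream(compressedSide);
        this.listener = listener;
    }

    @Override
    public void write(int b) throws IOException {
        gzip.write(b);
        uncompressed++;
        notifyListener();
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        gzip.write(b, off, len);
        uncompressed += len;
        notifyListener();
    }

    @Override
    public void flush() throws IOException {
        gzip.flush();
    }

    @Override
    public void close() throws IOException {
        gzip.close();      // writes the remaining compressed data and the trailer
        notifyListener();  // report the final totals
    }

    private void notifyListener() {
        listener.bytesProcessed(uncompressed, compressedSide.getCount());
    }
}

class ByteCountDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new NotifyingGzipOutputStream(sink,
                (u, c) -> System.out.println("uncompressed=" + u + " compressed=" + c))) {
            out.write(new byte[10000]);
        }
    }
}
```

With something along these lines a subscriber sees both numbers side by side, so the question of which one a single getBytesWritten should return would not come up.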