Re: Incremental compression
On Fri, 09 Feb 2018 17:52:33 -0800, Dan Stromberg wrote:

> Perhaps:
>
> import lzma
> lzc = lzma.LZMACompressor()

Ah, thanks for the suggestion!

--
Steve
--
https://mail.python.org/mailman/listinfo/python-list
Re: Incremental compression
Perhaps:

import lzma
lzc = lzma.LZMACompressor()
out1 = lzc.compress(b"Some data\n")
out2 = lzc.compress(b"Another piece of data\n")
out3 = lzc.compress(b"Even more data\n")
out4 = lzc.flush()
# Concatenate all the partial results:
result = b"".join([out1, out2, out3, out4])

?

lzma compresses harder than bzip2, but it's probably slower too.

On Fri, Feb 9, 2018 at 5:36 PM, Steven D'Aprano wrote:
> I want to compress a sequence of bytes one byte at a time. (I am already
> processing the bytes one byte at a time, for other reasons.) I don't
> particularly care *which* compression method is used, and in fact I'm not
> even interested in the compressed data itself, only its length. So I'm
> looking for something similar to this:
>
> count = 0
> for b in stream:
>     process(b)
>     count += incremental_compressor.compressor(b)
>
> or some variation. Apart from bzip2, do I have any other options in the
> std lib?
>
> https://docs.python.org/3/library/bz2.html#incremental-de-compression
>
> --
> Steve
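For what it's worth, here is a minimal sketch of the counting loop from the original question, built on the incremental compressors in the std lib (lzma, bz2 and zlib). The helper name compressed_length and the sample data are made up for illustration. Most calls to compress() return an empty bytes object because the compressor buffers internally, so the final flush() has to be counted as well.

```python
import bz2
import lzma
import zlib


def compressed_length(stream, compressor):
    """Feed single bytes to an incremental compressor; return the output size.

    compressed_length is a hypothetical helper, not a std lib function.
    """
    count = 0
    for b in stream:  # iterating over bytes yields ints in Python 3
        count += len(compressor.compress(bytes([b])))
    # flush() emits whatever the compressor is still holding back.
    count += len(compressor.flush())
    return count


# Made-up sample data, repeated so that there is something to compress.
data = b"Some data\nAnother piece of data\nEven more data\n" * 20

for name, factory in [
    ("lzma", lzma.LZMACompressor),
    ("bz2", bz2.BZ2Compressor),
    ("zlib", zlib.compressobj),
]:
    print(name, compressed_length(data, factory()))
```

All three objects share the same compress()/flush() shape, so the loop does not care which one it is given.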
Re: Incremental Compression
Adam DePrince wrote:

> On Sat, 2006-03-25 at 03:08 +0200, Eyal Lotem wrote:
>> Hey.
>>
>> I have a problem in some network code. I want to send my packets
>> compressed, but I don't want to compress each packet separately (via
>> .encode('zlib') or such) but rather I'd like to compress it with regard
>> to the history of the compression stream. If I use zlib.compressobj and
>> flush it to get the packet data, I cannot continue to compress with
>> that stream.
>
> Yes, you can.
>
> Help on built-in function flush:
>
> flush(...)
>     flush( [mode] ) -- Return a string containing any remaining
>     compressed data.
>     mode can be one of the constants Z_SYNC_FLUSH, Z_FULL_FLUSH,
>     Z_FINISH; the default value used when mode is not specified is
>     Z_FINISH. If mode == Z_FINISH, the compressor object can no longer
>     be used after calling the flush() method. Otherwise, more data can
>     still be compressed.
>
> You want to call
>
> mycompressor.flush( zlib.Z_SYNC_FLUSH )
>
> The difference between the flushes is this:
>
> 1. Z_SYNC_FLUSH. This basically sends enough data so that the receiver
> will get everything you put in. This does decrease your compression
> ratio (however, in a weird case when I last played with it, it helped.)
>
> 2. Z_FULL_FLUSH. This sends enough data so that the receiver will get
> everything you put in. This also wipes the compressor's statistics, so
> that when you pick up where you left off, the compressor will compress
> about as well as if you had just started; you are wiping its memory of
> what it saw in the past.
>
> 3. Z_FINISH. This is the default action; this is what is killing you.
>
> Good luck - Adam DePrince

Thanks! That really helps.

>> I cannot wait until the end of the stream and then flush, because I need
>> to flush after every packet.
>>
>> Another capability I require is to be able to copy the compression
>> stream. i.e: To be able to create multiple continuations of the same
>> compression stream.
>> Something like:
>>
>> a = compressobj()
>> pre_x = a.copy()
>> x = a.compress('my_packet1')
>> # send x
>> # x was not acked yet, so we must send another packet via the pre_x
>> # compressor
>> y = pre_x.compress('my_packet2')
>>
>> Is there a compression object that can do all this?
>
> Ahh, you are trying to "pretune" the compressor before sending a little
> bit ... I think C-zlib does this, but I don't know for sure.

Yeah, but I don't need a powerful tuning, just a means to copy the
compressor's state. I guess I'll need to write some C for this.

Thanks again!

> - Adam DePrince
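A note for anyone reading this thread later: zlib compression objects in current Python do have a copy() method, so the "multiple continuations" idea can be sketched without writing any C. The following is only an illustration under assumptions: the packet contents and the "shared history" priming data are made up, and the two decompressobj() instances stand in for a receiver that did or did not see the first packet.

```python
import zlib

a = zlib.compressobj()
# Prime the stream with some shared history and sync-flush it, so the
# receiver can decode everything sent so far.
head = a.compress(b"shared history " * 10) + a.flush(zlib.Z_SYNC_FLUSH)

# Snapshot the compressor state before the first packet is compressed.
pre_x = a.copy()

x = a.compress(b"my_packet1") + a.flush(zlib.Z_SYNC_FLUSH)
# x was not acked yet, so the next packet continues from the snapshot.
y = pre_x.compress(b"my_packet2") + pre_x.flush(zlib.Z_SYNC_FLUSH)

# A receiver that saw head followed by x.
d1 = zlib.decompressobj()
d1.decompress(head)
got_x = d1.decompress(x)

# A receiver that saw head followed by y (x never arrived).
d2 = zlib.decompressobj()
d2.decompress(head)
got_y = d2.decompress(y)

print(got_x, got_y)
```

Both x and y are valid continuations of head, because each was produced from a compressor whose state matched what the receiver had decoded up to that point.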
Re: Incremental Compression
On Sat, 2006-03-25 at 03:08 +0200, Eyal Lotem wrote:
> Hey.
>
> I have a problem in some network code. I want to send my packets compressed,
> but I don't want to compress each packet separately (via .encode('zlib') or
> such) but rather I'd like to compress it with regard to the history of the
> compression stream. If I use zlib.compressobj and flush it to get the
> packet data, I cannot continue to compress with that stream.

Yes, you can.

Help on built-in function flush:

flush(...)
    flush( [mode] ) -- Return a string containing any remaining
    compressed data.
    mode can be one of the constants Z_SYNC_FLUSH, Z_FULL_FLUSH,
    Z_FINISH; the default value used when mode is not specified is
    Z_FINISH. If mode == Z_FINISH, the compressor object can no longer
    be used after calling the flush() method. Otherwise, more data can
    still be compressed.

You want to call

mycompressor.flush( zlib.Z_SYNC_FLUSH )

The difference between the flushes is this:

1. Z_SYNC_FLUSH. This basically sends enough data so that the receiver
will get everything you put in. This does decrease your compression
ratio (however, in a weird case when I last played with it, it helped.)

2. Z_FULL_FLUSH. This sends enough data so that the receiver will get
everything you put in. This also wipes the compressor's statistics, so
that when you pick up where you left off, the compressor will compress
about as well as if you had just started; you are wiping its memory of
what it saw in the past.

3. Z_FINISH. This is the default action; this is what is killing you.

Good luck - Adam DePrince

> I cannot wait until the end of the stream and then flush, because I need to
> flush after every packet.
>
> Another capability I require is to be able to copy the compression stream.
> i.e: To be able to create multiple continuations of the same compression
> stream.
> Something like:
>
> a = compressobj()
> pre_x = a.copy()
> x = a.compress('my_packet1')
> # send x
> # x was not acked yet, so we must send another packet via the pre_x
> # compressor
> y = pre_x.compress('my_packet2')
>
> Is there a compression object that can do all this?

Ahh, you are trying to "pretune" the compressor before sending a little
bit ... I think C-zlib does this, but I don't know for sure.

- Adam DePrince
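To make the Z_SYNC_FLUSH advice concrete, here is a minimal sketch of per-packet flushing on one long-lived compression stream, written for Python 3 (bytes rather than str). The packet contents are made up, and the decompressobj() simply stands in for the receiving end of the connection; no real network code is involved.

```python
import zlib

# Made-up packets; a real application would read these from its own queue.
packets = [b"my_packet1", b"my_packet2", b"my_packet3"]

compressor = zlib.compressobj()
decompressor = zlib.decompressobj()  # stands in for the receiving side

received = []
for packet in packets:
    # Z_SYNC_FLUSH pushes out everything fed so far but keeps the history,
    # so the same compressor can be reused for the next packet.
    wire = compressor.compress(packet) + compressor.flush(zlib.Z_SYNC_FLUSH)
    received.append(decompressor.decompress(wire))

# Z_FINISH (the default mode) ends the stream; after this call the
# compressor object can no longer be used.
tail = compressor.flush()
received.append(decompressor.decompress(tail))

print(received)
```

Each wire chunk is decodable as soon as it arrives, which is the property needed when a flush is required after every packet.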