Re: Incremental compression

2018-02-09 Thread Steven D'Aprano
On Fri, 09 Feb 2018 17:52:33 -0800, Dan Stromberg wrote:

> Perhaps:
> 
> import lzma
> lzc = lzma.LZMACompressor()

Ah, thanks for the suggestion!



-- 
Steve

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Incremental compression

2018-02-09 Thread Dan Stromberg
Perhaps:

import lzma
lzc = lzma.LZMACompressor()
out1 = lzc.compress(b"Some data\n")
out2 = lzc.compress(b"Another piece of data\n")
out3 = lzc.compress(b"Even more data\n")
out4 = lzc.flush()
# Concatenate all the partial results:
result = b"".join([out1, out2, out3, out4])

?

lzma usually compresses better than bzip2, but it's probably slower too.
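
For the length-only case in the original question, the same idea might be sketched as follows. This is only a sketch: the helper name compressed_length is made up for illustration, and stream is assumed to be an iterable of ints, which is what iterating over a bytes object gives you in Python 3. Most calls to compress() return an empty bytes object, because the compressor buffers input internally, so the count only grows occasionally until the final flush().

```python
import lzma

def compressed_length(stream):
    """Return the total compressed size of `stream`, fed one byte at a time."""
    lzc = lzma.LZMACompressor()
    count = 0
    for b in stream:
        # compress() wants a bytes-like object, so wrap the single int.
        count += len(lzc.compress(bytes([b])))
    # flush() returns whatever the compressor was still holding back.
    count += len(lzc.flush())
    return count

data = b"Some data\n" * 1000
print(len(data), compressed_length(data))
```

The same loop should work with bz2.BZ2Compressor() in place of lzma.LZMACompressor(), since both offer compress() and flush().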

On Fri, Feb 9, 2018 at 5:36 PM, Steven D'Aprano wrote:
> I want to compress a sequence of bytes one byte at a time. (I am already
> processing the bytes one byte at a time, for other reasons.) I don't
> particularly care *which* compression method is used, and in fact I'm not
> even interested in the compressed data itself, only its length. So I'm
> looking for something similar to this:
>
> count = 0
> for b in stream:
>     process(b)
>     count += incremental_compressor.compressor(b)
>
>
>
> or some variation. Apart from bzip2, do I have any other options in the
> std lib?
>
> https://docs.python.org/3/library/bz2.html#incremental-de-compression
>
>
>
> --
> Steve
>
> --
> https://mail.python.org/mailman/listinfo/python-list


Re: Incremental Compression

2006-03-25 Thread Eyal Lotem
Adam DePrince wrote:

> On Sat, 2006-03-25 at 03:08 +0200, Eyal Lotem wrote:
>> Hey.
>> 
>> I have a problem in some network code. I want to send my packets
>> compressed, but I don't want to compress each packet separately (via
>> .encode('zlib') or such) but rather I'd like to compress it with regard
>> to the history of the
>> compression stream.  If I use zlib.compressobj and flush it to get the
>> packet data, I cannot continue to compress with that stream.
> 
> Yes, you can.
> 
> Help on built-in function flush:
> 
> flush(...)
>     flush( [mode] ) -- Return a string containing any remaining
>     compressed data.
>     mode can be one of the constants Z_SYNC_FLUSH, Z_FULL_FLUSH, Z_FINISH;
>     the default value used when mode is not specified is Z_FINISH.
>     If mode == Z_FINISH, the compressor object can no longer be used after
>     calling the flush() method.  Otherwise, more data can still be
>     compressed.
> 
> You want to call:
> 
> mycompressor.flush( zlib.Z_SYNC_FLUSH )
> 
> The difference between the flushes is this:
> 
> 1. Z_SYNC_FLUSH.  This basically sends enough data so that the receiver
> will get everything you put in.  This does decrease your compression
> ratio (however, in one weird case when I last played with it, it helped).
> 
> 2. Z_FULL_FLUSH.  This sends enough data so that the receiver will get
> everything you put in.  It also wipes the compressor's statistics, so
> when you pick up where you left off, the compressor will compress
> about as well as if you had just started; you are wiping its memory of
> what it saw in the past.
> 
> 3. Z_FINISH.  This is the default action, and it is what is killing you:
> after it, the compressor object can no longer be used.
> 
> Good luck - Adam DePrince

Thanks! That really helps.

> 
>> 
>> I cannot wait until the end of the stream and then flush, because I need
>> to flush after every packet.
>> 
>> Another capability I require is to be able to copy the compression
>> stream, i.e., to be able to create multiple continuations of the same
>> compression stream. Something like:
>> 
>> a = compressobj()
>> pre_x = a.copy()
>> x = a.compress('my_packet1')
>> # send x
>> # x was not acked yet, so we must send another packet via the pre_x
>> # compressor
>> y = pre_x.compress('my_packet2')
>> 
>> Is there a compression object that can do all this?
> 
> 
> Ahh, you are trying to "pretune" the compressor before sending a little
> bit ... I think C-zlib does this, but I don't know for sure.

Yeah, but I don't need powerful tuning, just a means to copy the
compressor's state.  I guess I'll need to write some C for this.

Thanks again!

> 
> - Adam DePrince

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Incremental Compression

2006-03-24 Thread Adam DePrince
On Sat, 2006-03-25 at 03:08 +0200, Eyal Lotem wrote:
> Hey.
> 
> I have a problem in some network code. I want to send my packets compressed,
> but I don't want to compress each packet separately (via .encode('zlib') or
> such) but rather I'd like to compress it with regard to the history of the
> compression stream.  If I use zlib.compressobj and flush it to get the
> packet data, I cannot continue to compress with that stream.

Yes, you can.  

Help on built-in function flush:

flush(...)
    flush( [mode] ) -- Return a string containing any remaining
    compressed data.
    mode can be one of the constants Z_SYNC_FLUSH, Z_FULL_FLUSH, Z_FINISH;
    the default value used when mode is not specified is Z_FINISH.
    If mode == Z_FINISH, the compressor object can no longer be used after
    calling the flush() method.  Otherwise, more data can still be
    compressed.

You want to call:

mycompressor.flush( zlib.Z_SYNC_FLUSH ) 

The difference between the flushes is this:

1. Z_SYNC_FLUSH.  This basically sends enough data so that the receiver
will get everything you put in.  This does decrease your compression
ratio (however, in one weird case when I last played with it, it helped).

2. Z_FULL_FLUSH.  This sends enough data so that the receiver will get
everything you put in.  It also wipes the compressor's statistics, so
when you pick up where you left off, the compressor will compress
about as well as if you had just started; you are wiping its memory of
what it saw in the past.

3. Z_FINISH.  This is the default action, and it is what is killing you:
after it, the compressor object can no longer be used.
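
As a rough illustration of the Z_SYNC_FLUSH approach, here is a minimal sketch written for current Python 3 with bytes objects. The packet contents and variable names are made up for the example, and the "network" is just a loop: each packet is compressed and sync-flushed on its own, but both sides keep a single compression stream alive, so later packets can still benefit from the history of earlier ones.

```python
import zlib

compressor = zlib.compressobj()
decompressor = zlib.decompressobj()

packets = [b"my_packet1", b"my_packet2", b"my_packet3"]
received = []

for packet in packets:
    # Z_SYNC_FLUSH pushes out everything compressed so far, but leaves
    # the compressor usable for the next packet.
    wire = compressor.compress(packet) + compressor.flush(zlib.Z_SYNC_FLUSH)
    # The receiver can decode each chunk as soon as it arrives.
    received.append(decompressor.decompress(wire))

print(received)
```

Only the very last call would use the default flush() (Z_FINISH), once no more packets are coming.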

Good luck - Adam DePrince

> 
> I cannot wait until the end of the stream and then flush, because I need to
> flush after every packet.
> 
> Another capability I require is to be able to copy the compression stream,
> i.e., to be able to create multiple continuations of the same compression
> stream. Something like:
> 
> a = compressobj()
> pre_x = a.copy()
> x = a.compress('my_packet1')
> # send x
> # x was not acked yet, so we must send another packet via the pre_x
> # compressor
> y = pre_x.compress('my_packet2')
> 
> Is there a compression object that can do all this?


Ahh, you are trying to "pretune" the compressor before sending a little
bit ... I think C-zlib does this, but I don't know for sure.
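
For what it's worth, in current Python versions zlib compression and decompression objects both have a copy() method, which looks like it covers the "multiple continuations" case. It may not be available in the Python you are running. The following is only a sketch under that assumption: the b"shared header " prefix and the receiver side are made up for illustration, and Z_SYNC_FLUSH is used so each piece can be decoded on its own.

```python
import zlib

a = zlib.compressobj()
# Some history that both continuations share.
first = a.compress(b"shared header ") + a.flush(zlib.Z_SYNC_FLUSH)

# Snapshot the compressor state before compressing the next packet.
pre_x = a.copy()
x = a.compress(b"my_packet1") + a.flush(zlib.Z_SYNC_FLUSH)
# x was not acked yet, so continue from the snapshot instead.
y = pre_x.compress(b"my_packet2") + pre_x.flush(zlib.Z_SYNC_FLUSH)

# Receiver side: it has seen `first`, and may then get either x or y.
d = zlib.decompressobj()
header = d.decompress(first)
d_alt = d.copy()
got_x = d.decompress(x)
got_y = d_alt.decompress(y)
print(header, got_x, got_y)
```

The receiver has to keep a matching snapshot of its own state, since x and y are only valid continuations of the stream as it stood right after `first`.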

- Adam DePrince


