RE: OpenSSL and compression using ZLIB

2002-11-27 Thread Le Saux, Eric
Yes, very interesting.

This is another way of adding compression to the data pipe.
I have not looked at the code, but I assume that the compression state is
maintained for the whole life of the communication channel, which is what
gives the best results.

Have you tried to use SSL_COMP_add_compression_method() also?

Cheers,

Eric Le Saux
Electronic Arts



-Original Message-
From: Pablo J Royo [mailto:[EMAIL PROTECTED]] 
Sent: Wednesday, November 27, 2002 12:27 AM
To: [EMAIL PROTECTED]
Subject: Re: OpenSSL and compression using ZLIB


I have used ZLIB in several projects, but my knowledge of it is not as deep
as yours.  Still... aren't you talking about a simple BIO for compressing
data?  (Or did I miss something in this discussion thread?)
I think the BIO would maintain the context (as ZLIB's z_stream struct does)
across several calls to BIO_write/BIO_read, so if you want to compress
communication data you have to chain this zBIO with a socket BIO.
Some discussion and a solution for this can be found here

http://marc.theaimsgroup.com/?l=openssl-dev&m=99927148415628&w=2

I have used that to compress/cipher/base64 big files with chained BIOs (and
a similar implementation of the zBIO shown there) and it works, so it may
well work one step further with socket BIOs.



RE: OpenSSL and compression using ZLIB

2002-11-26 Thread Le Saux, Eric
Again I want to clarify this point: the issue is in the way ZLIB is used by
OpenSSL, not in ZLIB itself.  The compressor's state is built and destroyed
on every record because OpenSSL uses ZLIB's compress() call, which in turn
calls the lower-level deflateInit(), deflate() and deflateEnd() functions.

This ensures that the records are compression-independent from one another,
and the initial question that started this thread was about the existence of
any requirement in the definition of SSL that required such independence.

Most people discussing this point here do not believe there is such a
requirement, but I am not sure if we have a definitive opinion on this.
Some standards body will have to address that.

One thing is sure though: for specific applications where client and server
are under the control of the same developers, it does make sense to use ZLIB
differently when it is definitely known that the underlying protocol is
indeed reliable.  That is why I am currently testing a very small addition
to OpenSSL's compression methods that I called streamzlib (I am considering
another name suggested yesterday on this mailing list).  Some preliminary
tests with ZLIB showed that I can go from 2:1 compression factor to 6:1.  

For completeness I must also say that for specific applications, compression
can be done just before and outside of the OpenSSL library.  My personal
decision to push it down there is to avoid adding another encapsulation
layer in that part of our code that is written in C.

Now when compression within SSL matures, it will be necessary to have more
control over the compressor's operation than just turning it on.  In ZLIB
you have the choice of ten compression levels (0 through 9) which trade off
compression quality against speed of execution.  There are other options
that you could set, such as the size of the dictionary (window) that you
use.  Future compression methods supported by SSL will probably have their
own different set of options.

All this will be an excellent subject of discussion in some SSL standard
committee.

Cheers,

Eric Le Saux
Electronic Arts

-Original Message-
From: Howard Chu [mailto:[EMAIL PROTECTED]] 
Sent: Monday, November 25, 2002 9:01 PM
To: [EMAIL PROTECTED]
Subject: RE: OpenSSL and compression using ZLIB

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]]On Behalf Of Le Saux, Eric

 In the current implementation of OpenSSL,
 compression/decompression state is
 initialized and destroyed per record.  It cannot possibly
 interoperate with
 a compressor that maintains compression state across records.  The
 decompressor does care, unfortunately.

This is surprising. I haven't looked at the code recently, but my experience
has been that a special bit sequence is emitted to signal a dictionary
flush.
I haven't tested it either, so if you say it didn't work I believe you. But
plain old LZW definitely does not have this problem, the compressor can do
whatever it wants, and the decompressor will stay sync'd up because it
detects these reset codes.

  -- Howard Chu
  Chief Architect, Symas Corp.   Director, Highland Sun
  http://www.symas.com   http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support

__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]
__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]



RE: OpenSSL and compression using ZLIB

2002-11-25 Thread Le Saux, Eric
In the current implementation of OpenSSL, compression/decompression state is
initialized and destroyed per record.  It cannot possibly interoperate with
a compressor that maintains compression state across records.  The
decompressor does care, unfortunately.  The other way around could work,
though: a compressor that works per record, sending to a decompressor that
maintains state.

Personally I am adding a separate compression scheme that I called
COMP_streamzlib to the already existing COMP_zlib and COMP_rle methods
defined in OpenSSL.  The only (but significant) difference is that it will
maintain the compression state across records.  For the time being, I will
just use one of the private IDs mentioned in the previous emails (193 to
255), as it is not compatible with the current zlib/openssl compression.

Eric Le Saux
Electronic Arts

-Original Message-
From: pobox [mailto:[EMAIL PROTECTED]] 
Sent: Sunday, November 24, 2002 2:43 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: OpenSSL and compression using ZLIB

- Original Message -
From: Jeffrey Altman [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Sunday, November 24, 2002 8:26 AM
Subject: Re: OpenSSL and compression using ZLIB


 http://www.ietf.org/internet-drafts/draft-ietf-tls-compression-03.txt

 defines the compression numbers to be:

enum { null(0), ZLIB(1), LZS(2), (255) } CompressionMethod;

 Therefore proposed numbers have been issued.  I suggest that OpenSSL
 define the CompressionMethod numbers to be:

enum { null(0), ZLIB(1), LZS(2), eayZLIB(224), eayRLE(225), (255) }
 CompressionMethod

 as values in the range 193 to 255 are reserved for private use.

 Where does the above draft state that the dictionary must be reset?
 It states that the engine must be flushed but does not indicate that
 the dictionary is to be reset.  Resetting the dictionary would turn
 ZLIB into a stateless compression algorithm and according to the draft
 ZLIB is most certainly a stateful algorithm:

  the compressor maintains its state through all compressed records

 I do not believe that compatibility will be an issue.  It will simply
 result in the possibility that the compressed data is distributed
 differently among the TLS frames that make up the stream.


The draft clearly implies that the dictionary need not be reset and probably
should not be reset, but it is not clear to me that it prohibits this.
However, the draft says: "If TLS is not being used with a protocol that
provides reliable, sequenced packet delivery, the sender MUST flush the
compressor completely ..."
I find this confusing because I've always understood that TLS assumes it is
running over just such a protocol. If I read it correctly, even EAP-TLS (RFC
2716) will handle sequencing, duped, and dropped packets before TLS
processing is invoked. So what's this clause alluding to?

In any event, I think I agree that the compressor can compatibly behave in
different ways as long as the decompressor doesn't care. I'm just not sure I
understand RFC 1950 and 1951 well enough to know what is possible.  Is
"flush the compressor completely" (as in the TLS compression draft language)
equivalent to compressing all the current data and emitting an end-of-block
code (value 256 in the language of RFC 1951)?  I'm guessing it is.  Is
"resetting the dictionary" equivalent to compressing all the current data
and sending the block with the BFINAL bit set?  If so, then it seems like
the decompressor can always react correctly, and therefore compatibly, in
any of the three cases.  If the dictionary is reset for every record
(current OpenSSL behavior), then the decompressor knows this because the
BFINAL bit is set for every record.  If the dictionary is not reset but is
flushed for every record, then the decompressor knows this because every
record ends with an end-of-block code.  If the most optimal case is in
play, which implies a single uncompressed plaintext byte might be split
across multiple records, the decompressor can recognize and react properly
to this case.  If all this is correct, then the next question is: what will
the current implementation of the decompressor in OpenSSL do in each of
these cases?


--greg
[EMAIL PROTECTED]

__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]
__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]



RE: OpenSSL and compression using ZLIB

2002-11-12 Thread Le Saux, Eric
I will try to explain what goes on again.

OpenSSL uses ZLIB compression in the following manner:
On each block of data transmitted, compress() is called.
It's equivalent to deflateInit() + deflate() + deflateEnd().

On a reliable continuous stream of data you can use it in the following way:
You call deflateInit() when the connection is established.
You call deflate() for each block to transmit, using Z_SYNC_FLUSH.
When the connection closes, you call deflateEnd().

In the latter case, you do not initialize and destroy the dictionary for
each block you transmit.

Now there are three flush options to deflate(): Z_NO_FLUSH, Z_SYNC_FLUSH
and Z_FULL_FLUSH.  For interactive applications, you need to flush, otherwise
your block of data may get stuck in the pipeline until more data pushes on
it.  Using Z_SYNC_FLUSH, you force the compressor to output the compressed
data immediately.  With Z_FULL_FLUSH, you additionally reset the
compressor's state.

I ran tests using these options, and on our typical datastream sample, it
meant for us a compression factor of 6:1 with Z_SYNC_FLUSH and 2:1 with
Z_FULL_FLUSH.  With Z_SYNC_FLUSH, the dictionary is not trashed.

The way OpenSSL uses ZLIB, resetting the compressor's state after each block
of data, you would achieve similar results as with Z_FULL_FLUSH.

I hope this clarifies things.

So I am still wondering if there is a reason why each block of data is
compressed independently from the previous one in the OpenSSL use of
compression.  Is it an architectural constraint?


Eric Le Saux
Electronic Arts

-Original Message-
From: Bear Giles [mailto:bgiles;coyotesong.com] 
Sent: Monday, November 11, 2002 8:14 PM
To: [EMAIL PROTECTED]
Subject: Re: OpenSSL and compression using ZLIB

Le Saux, Eric wrote:
 
 I am trying to understand why ZLIB is being used that way.  Here is what 
 gives better results on a continuous reliable stream of data:
  
 1)   You create a z_stream for sending, and another z_stream for 
 receiving.
 
 2)   You call deflateInit() and inflateInit() on them, respectively, 
 when the communication is established.
 
 3)   For each data block you send, you call deflate() on it.  For 
 each data block you receive, you call inflate() on it.

You then die from the latency in the inflation/deflation routines.  You 
have to flush the deflater for each block, and depending on how you do 
it your performance is the same as deflating each block separately.

 4)   When the connection is terminated, you call deflateEnd() and 
 inflateEnd() respectively.

...

 But by far, the main advantage is that you can achieve good compression 
 even for very small blocks of data.  The dictionary window stays open 
 for the whole communication stream, making it possible to compress a 
 message by reference to a number of previously sent messages.

If you do a Z_SYNC_FLUSH (iirc), it blows the dictionary.  This is 
intentional, since you can restart the inflater at every SYNC mark.

I thought there was also a mode to flush the buffer (including any 
necessary padding for partial bytes) but not blowing the dictionary, but 
I'm not sure how portable it is.

__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]
__
OpenSSL Project http://www.openssl.org
Development Mailing List   [EMAIL PROTECTED]
Automated List Manager   [EMAIL PROTECTED]



RE: OpenSSL and compression using ZLIB

2002-11-12 Thread Le Saux, Eric








RFC2246 mentions compression state in their list of connection states.

It also says the following:

6.2.2. Record compression and decompression

   [snip snip] The compression algorithm translates a
   TLSPlaintext structure into a TLSCompressed structure.  Compression
   functions are initialized with default state information whenever a
   connection state is made active.

I will go ahead and see if we can take better advantage of ZLIB's stream
compression.

If it doesn't fit, I will simply compress my data one layer above SSL.

Eric

-Original Message-
From: David Schwartz [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, November 12, 2002 4:24 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; Le Saux, Eric
Subject: RE: OpenSSL and compression using ZLIB

On Tue, 12 Nov 2002 18:09:13 -0600, Le Saux, Eric wrote:

I believe Gregory Stark meant RFC2246.

 Okay, but I don't see where RFC2246 says that the compression/decompression
protocol can't have state or must compress each block independently or that
any particular compression protocol must be implemented in any particular
way.

 DS
OpenSSL and compression using ZLIB

2002-11-11 Thread Le Saux, Eric








OpenSSL (0.9.6g) has support for compression, both using RLE and ZLIB.

The way ZLIB is used, calls to the compress() function are made on each
block of data transmitted.  Compress() is a higher-level function that
calls deflateInit(), deflate() and deflateEnd().

I am trying to understand why ZLIB is being used that way.  Here is what
gives better results on a continuous reliable stream of data:

1) You create a z_stream for sending, and another z_stream for receiving.

2) You call deflateInit() and inflateInit() on them, respectively, when the
communication is established.

3) For each data block you send, you call deflate() on it.  For each data
block you receive, you call inflate() on it.

4) When the connection is terminated, you call deflateEnd() and
inflateEnd() respectively.

There are many advantages to that.  For one, the initialization functions
are not called as often.

But by far, the main advantage is that you can achieve good compression
even for very small blocks of data.  The "dictionary" window stays open for
the whole communication stream, making it possible to compress a message by
reference to a number of previously sent messages.

Thank you for sharing your ideas on this,

Eric Le Saux
Electronic Arts