RE: OpenSSL and compression using ZLIB
Yes, very interesting. This is another way of adding compression to the data pipe. I have not looked at the code, but I assume that the compression state is maintained for the whole life of the communication channel, which is what gives the best results. Have you also tried to use SSL_COMP_add_compression_method()?

Cheers,
Eric Le Saux
Electronic Arts

-----Original Message-----
From: Pablo J Royo [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, November 27, 2002 12:27 AM
To: [EMAIL PROTECTED]
Subject: Re: OpenSSL and compression using ZLIB

I have used ZLIB in several projects, but my knowledge of it is not as deep as yours. Aren't you talking about a simple BIO for compressing data? (Or have I missed something in this discussion thread?) I think the BIO would maintain the context (as ZLIB's z_stream struct does) across several calls to BIO_write/BIO_read, so if you want to compress communication data you have to chain this zBIO with a socket BIO. Some discussion and a solution for this can be found here:

http://marc.theaimsgroup.com/?l=openssl-dev&m=99927148415628&w=2

I have used that approach to compress/cipher/base64 big files with chained BIOs (and a similar implementation of the zBIO shown there), and it works, so it may well work one step further with socket BIOs.

----- Original Message -----
From: Le Saux, Eric [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, November 26, 2002 7:24 PM
Subject: RE: OpenSSL and compression using ZLIB

Again I want to clarify this point: the issue is in the way ZLIB is used by OpenSSL, not in ZLIB itself. The compressor's state is built and destroyed on every record because OpenSSL uses ZLIB's compress() call, which in turn calls the lower-level deflateInit(), deflate() and deflateEnd() functions. This ensures that the records are compression-independent from one another, and the initial question that started this thread was about the existence of any requirement in the definition of SSL that required such independence.
Most people discussing this point here do not believe there is such a requirement, but I am not sure we have a definitive opinion on this. Some standards body will have to address it. One thing is sure, though: for specific applications where client and server are under the control of the same developers, it does make sense to use ZLIB differently when it is definitely known that the underlying protocol is reliable. That is why I am currently testing a very small addition to OpenSSL's compression methods that I called streamzlib (I am considering another name suggested yesterday on this mailing list). Some preliminary tests with ZLIB showed that I can go from a 2:1 compression factor to 6:1.

For completeness I must also say that for specific applications, compression can be done just before, and outside of, the OpenSSL library. My personal decision to push it down there is to avoid adding another encapsulation layer in the part of our code that is written in C.

Now, when compression within SSL matures, it will be necessary to have more control over the compressor's operation than just turning it on. In ZLIB you have a choice of ten compression levels that trade off compression ratio against speed of execution. There are other options you could set, such as the size of the dictionary. Future compression methods supported by SSL will probably have their own different sets of options. All this will be an excellent subject of discussion for some SSL standards committee.

Cheers,
Eric Le Saux
Electronic Arts

-----Original Message-----
From: Howard Chu [mailto:[EMAIL PROTECTED]]
Sent: Monday, November 25, 2002 9:01 PM
To: [EMAIL PROTECTED]
Subject: RE: OpenSSL and compression using ZLIB

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Le Saux, Eric

In the current implementation of OpenSSL, compression/decompression state is initialized and destroyed per record.
It cannot possibly interoperate with a compressor that maintains compression state across records. The decompressor does care, unfortunately.

This is surprising. I haven't looked at the code recently, but my experience has been that a special bit sequence is emitted to signal a dictionary flush. I haven't tested it either, so if you say it didn't work I believe you. But plain old LZW definitely does not have this problem: the compressor can do whatever it wants, and the decompressor will stay synced up because it detects these reset codes.

--
Howard Chu
Chief Architect, Symas Corp.       Director, Highland Sun
http://www.symas.com               http://highlandsun.com/hyc
Symas: Premier OpenSource Development and Support

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [EMAIL PROTECTED]
Automated List Manager                           [EMAIL PROTECTED]
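The per-record reset discussed above is easy to illustrate with Python's stdlib zlib module, which wraps the same library the thread is about (this is an illustrative sketch with made-up data, not OpenSSL's actual code): each record compressed with a one-shot compress() call is a complete, self-contained zlib stream.

```python
import zlib

# Two records, compressed per record, as OpenSSL's zlib method effectively
# does by calling compress() (deflateInit + deflate + deflateEnd) each time.
records = [b"hello hello hello world", b"hello hello hello again"]
blobs = [zlib.compress(r) for r in records]

# Each blob is a complete zlib stream: any record can be decompressed on
# its own, with no state carried over from earlier records.
for blob, original in zip(blobs, records):
    assert zlib.decompress(blob) == original
```

This independence is exactly what makes the records interoperable with any decompressor, at the cost of rebuilding the dictionary from scratch for every record.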
RE: OpenSSL and compression using ZLIB
In the current implementation of OpenSSL, compression/decompression state is initialized and destroyed per record. It cannot possibly interoperate with a compressor that maintains compression state across records. The decompressor does care, unfortunately. The other way around could work, though: a compressor that works per record, sending to a decompressor that maintains state.

Personally, I am adding a separate compression scheme that I called COMP_streamzlib to the already existing COMP_zlib and COMP_rle methods defined in OpenSSL. The only (but significant) difference is that it maintains the compression state across records. For the time being, I will just use one of the private IDs mentioned in the previous emails (193 to 255), as it is not compatible with the current zlib/openssl compression.

Eric Le Saux
Electronic Arts

-----Original Message-----
From: pobox [mailto:[EMAIL PROTECTED]]
Sent: Sunday, November 24, 2002 2:43 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: OpenSSL and compression using ZLIB

----- Original Message -----
From: Jeffrey Altman [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Sunday, November 24, 2002 8:26 AM
Subject: Re: OpenSSL and compression using ZLIB

http://www.ietf.org/internet-drafts/draft-ietf-tls-compression-03.txt defines the compression numbers to be:

    enum { null(0), ZLIB(1), LZS(2), (255) } CompressionMethod;

Therefore proposed numbers have been issued. I suggest that OpenSSL define the CompressionMethod numbers to be:

    enum { null(0), ZLIB(1), LZS(2), eayZLIB(224), eayRLE(225), (255) } CompressionMethod;

as values in the range 193 to 255 are reserved for private use.

Where does the above draft state that the dictionary must be reset? It states that the engine must be flushed but does not indicate that the dictionary is to be reset.
Resetting the dictionary would turn ZLIB into a stateless compression algorithm, and according to the draft ZLIB is most certainly a stateful algorithm: the compressor maintains its state through all compressed records.

I do not believe that compatibility will be an issue. It will simply result in the possibility that the compressed data is distributed differently among the TLS frames that make up the stream. The draft clearly implies that the dictionary need not be reset and probably should not be reset, but it is not clear to me that it prohibits this.

However, the draft says: "If TLS is not being used with a protocol that provides reliable, sequenced packet delivery, the sender MUST flush the compressor completely ..." I find this confusing, because I've always understood that TLS assumes it is running over just such a protocol. If I read it correctly, even EAP-TLS (RFC 2716) will handle sequencing, duplicated, and dropped packets before TLS processing is invoked. So what is this clause alluding to?

In any event, I think I agree that the compressor can compatibly behave in different ways as long as the decompressor doesn't care. I'm just not sure I understand RFC 1950 and RFC 1951 well enough to know what is possible. Is flushing the compressor completely (in the language of the TLS compression draft) equivalent to compressing all the current data and emitting an end-of-block code (value 256 in the language of RFC 1951)? I'm guessing it is. Is resetting the dictionary equivalent to compressing all the current data and sending the block with the BFINAL bit set? If so, then it seems like the decompressor can always react correctly, and therefore compatibly, in any of the three cases. If the dictionary is reset for every record (current OpenSSL behavior), then the decompressor knows this because the BFINAL bit is set for every record.
If the dictionary is not reset but is flushed for every record, then the decompressor knows this because every record ends with an end-of-block code. If the most optimal case is in play, which implies a single uncompressed plaintext byte might be split across multiple records, the decompressor can recognize and react properly to this case as well. If all this is correct, then the next question is: what will the current implementation of the decompressor in OpenSSL do in each of these cases?

--greg
[EMAIL PROTECTED]
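Two of the cases discussed above can be probed with Python's stdlib zlib module (a hedged sketch with made-up data using the same underlying library; it demonstrates nothing about what OpenSSL's own decompressor does): a decompressor that keeps its state across sync-flushed records decodes all of them, while a fresh decompressor per record cannot decode anything after the first.

```python
import zlib

comp = zlib.compressobj()
records = []
for msg in [b"alpha alpha alpha", b"alpha alpha beta"]:
    # One sync-flushed record per message; the dictionary carries over.
    records.append(comp.compress(msg) + comp.flush(zlib.Z_SYNC_FLUSH))

# Case 1: the decompressor keeps its state across records -> decodes all.
stateful = zlib.decompressobj()
decoded = [stateful.decompress(r) for r in records]
assert decoded == [b"alpha alpha alpha", b"alpha alpha beta"]

# Case 2: a fresh decompressor per record cannot recover the second
# record: it has no zlib header of its own and may back-reference data
# from the first record's window.
fresh = zlib.decompressobj()
try:
    second = fresh.decompress(records[1])
except zlib.error:
    second = None
assert second != b"alpha alpha beta"
```

The BFINAL/end-of-block signaling Greg asks about lives inside the deflate bitstream itself, so the record boundaries carry no extra framing beyond what the flush mode emits.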
RE: OpenSSL and compression using ZLIB
I will try to explain again what goes on. OpenSSL uses ZLIB compression in the following manner: on each block of data transmitted, compress() is called. It is equivalent to deflateInit() + deflate() + deflateEnd().

On a reliable continuous stream of data, you can instead use ZLIB in the following way: you call deflateInit() when the connection is established; you call deflate() for each block to transmit, using Z_SYNC_FLUSH; when the connection closes, you call deflateEnd(). In the latter case, you do not initialize and destroy the dictionary for each block you transmit.

Now there are three flush options to deflate(): Z_NO_FLUSH, Z_SYNC_FLUSH and Z_FULL_FLUSH. For interactive applications, you need to flush, otherwise your block of data may get stuck in the pipeline until more data pushes on it. Using Z_SYNC_FLUSH, you force the compressor to output the compressed data immediately. With Z_FULL_FLUSH, you additionally reset the compressor's state. I ran tests using these options, and on our typical datastream sample it meant a compression factor of 6:1 with Z_SYNC_FLUSH and 2:1 with Z_FULL_FLUSH. With Z_SYNC_FLUSH, the dictionary is not trashed. The way OpenSSL uses ZLIB, resetting the compressor's state after each block of data, you achieve results similar to Z_FULL_FLUSH.

I hope this clarifies things. So I am still wondering if there is a reason why each block of data is compressed independently from the previous one in OpenSSL's use of compression. Is it an architectural constraint?

Eric Le Saux
Electronic Arts

-----Original Message-----
From: Bear Giles [mailto:bgiles@coyotesong.com]
Sent: Monday, November 11, 2002 8:14 PM
To: [EMAIL PROTECTED]
Subject: Re: OpenSSL and compression using ZLIB

Le Saux, Eric wrote:

I am trying to understand why ZLIB is being used that way. Here is what gives better results on a continuous reliable stream of data:

1) You create a z_stream for sending, and another z_stream for receiving.
2) You call deflateInit() and inflateInit() on them, respectively, when the communication is established.
3) For each data block you send, you call deflate() on it. For each data block you receive, you call inflate() on it.

You then die from the latency in the inflation/deflation routines. You have to flush the deflater for each block, and depending on how you do it, your performance is the same as deflating each block separately.

4) When the connection is terminated, you call deflateEnd() and inflateEnd() respectively. ... But by far, the main advantage is that you can achieve good compression even for very small blocks of data. The dictionary window stays open for the whole communication stream, making it possible to compress a message by reference to a number of previously sent messages.

If you do a Z_SYNC_FLUSH (iirc), it blows the dictionary. This is intentional, since you can restart the inflater at every SYNC mark. I thought there was also a mode to flush the buffer (including any necessary padding for partial bytes) without blowing the dictionary, but I'm not sure how portable it is.
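The 6:1 versus 2:1 difference Eric reports can be reproduced in miniature with Python's stdlib zlib module (an illustrative sketch with made-up, repetitive data, not Eric's actual datastream): on small repeated messages, keeping the dictionary across records (Z_SYNC_FLUSH) beats resetting it at every record (Z_FULL_FLUSH) or compressing each record independently.

```python
import zlib

# 50 copies of a small, repetitive "game update" message (made-up data).
messages = [b"player position update: x=100 y=200 z=300\n"] * 50

def stream_total(flush_mode):
    """Total compressed size with one long-lived stream, flushed per record."""
    c = zlib.compressobj()
    return sum(len(c.compress(m) + c.flush(flush_mode)) for m in messages)

sync = stream_total(zlib.Z_SYNC_FLUSH)   # dictionary kept across records
full = stream_total(zlib.Z_FULL_FLUSH)   # dictionary reset at each record
per_record = sum(len(zlib.compress(m)) for m in messages)  # one-shot per record

# Keeping the dictionary wins by a wide margin on repetitive traffic.
assert sync < full
assert sync < per_record
```

With Z_SYNC_FLUSH, every message after the first compresses largely to back-references into earlier messages, which is exactly the effect the long-lived dictionary buys.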
RE: OpenSSL and compression using ZLIB
RFC 2246 mentions compression state in its list of connection states. It also says the following:

6.2.2. Record compression and decompression [snip snip] The compression algorithm translates a TLSPlaintext structure into a TLSCompressed structure. Compression functions are initialized with default state information whenever a connection state is made active.

I will go ahead and see if we can take better advantage of ZLIB's stream compression. If it doesn't fit, I will simply compress my data one layer above SSL.

- Eric

-----Original Message-----
From: David Schwartz [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, November 12, 2002 4:24 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; Le Saux, Eric
Subject: RE: OpenSSL and compression using ZLIB

On Tue, 12 Nov 2002 18:09:13 -0600, Le Saux, Eric wrote: I believe Gregory Stark meant RFC2246.

Okay, but I don't see where RFC 2246 says that the compression/decompression protocol can't have state, or must compress each block independently, or that any particular compression protocol must be implemented in any particular way.

DS
OpenSSL and compression using ZLIB
OpenSSL (0.9.6g) has support for compression, using both RLE and ZLIB. The way ZLIB is used, calls to the compress() function are made on each block of data transmitted. compress() is a higher-level function that calls deflateInit(), deflate() and deflateEnd().

I am trying to understand why ZLIB is being used that way. Here is what gives better results on a continuous reliable stream of data:

1) You create a z_stream for sending, and another z_stream for receiving.
2) You call deflateInit() and inflateInit() on them, respectively, when the communication is established.
3) For each data block you send, you call deflate() on it. For each data block you receive, you call inflate() on it.
4) When the connection is terminated, you call deflateEnd() and inflateEnd() respectively.

There are many advantages to that. For one, the initialization functions are not called as often. But by far, the main advantage is that you can achieve good compression even for very small blocks of data. The "dictionary" window stays open for the whole communication stream, making it possible to compress a message by reference to a number of previously sent messages.

Thank you for sharing your ideas on this,

Eric Le Saux
Electronic Arts
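Steps 1) through 4) map almost one-to-one onto Python's stdlib zlib objects (a loopback sketch under the assumption of a reliable, ordered transport; in C these would be the z_stream/deflateInit/inflateInit calls named above):

```python
import zlib

# 1)-2) One context per direction, created when the connection is established.
send_ctx = zlib.compressobj()    # the sending z_stream (deflateInit)
recv_ctx = zlib.decompressobj()  # the peer's receiving z_stream (inflateInit)

# 3) Per block: deflate with a sync flush so the peer can decode immediately,
#    while the dictionary survives from one block to the next.
def send_block(data):
    return send_ctx.compress(data) + send_ctx.flush(zlib.Z_SYNC_FLUSH)

def recv_block(record):
    return recv_ctx.decompress(record)

for block in [b"login: alice", b"login: bob", b"login: carol"]:
    assert recv_block(send_block(block)) == block

# 4) On teardown, finish the stream (the deflateEnd/inflateEnd step).
trailer = send_ctx.flush()       # defaults to Z_FINISH
recv_block(trailer)
```

In a real connection the two contexts live on different hosts; the loopback here just shows that per-block sync flushes keep the receiver in step without resetting the shared dictionary.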