Re: [Standards] Deprecating XEP-0138: Stream Compression

Thijs Alkemade Tue, 14 Oct 2014 05:51:12 -0700

On 9 Oct. 2014, at 17:06, Peter Saint-Andre - &yet <[email protected]> wrote:


> On 10/9/14, 7:59 AM, Thijs Alkemade wrote:
>> Hello all,
>> 
>> Stream compression is insecure, that was shown with CRIME and BREACH and the
>> situation for XMPP isn't much different [1]. I think we should look at the
>> easiest way to deprecate XEP-0138 and move to something better.
>> 
>> Using a "full flush" (in zlib terms) after every stanza would solve the
>> problem, as I can't find any realistic examples where an attacker could
>> insert their own payload into the same stanza as something secret they want
>> to know. However, clients and servers have no way to negotiate a mode like
>> that, so it's not possible to reject connections that won't do a per-stanza
>> full flush. Reading draft-ietf-hybi-permessage-compression-18, I was happy
>> to see that this could be negotiated in WebSocket extension [2].
>> 
>> From my own (very small scale) tests with raw XMPP XML, it appears that full
>> flushing after every stanza yields about the same compression ratio as
>> compressing each stanza separately. Doing that would have a number of
>> advantages:
>> 
>> 1. Not relying on nothing leaking through the "full flush", which may be a
>> concept that other compression algorithms than zlib don't have or don't do
>> securely enough.
>> 
>> 2. Practically no memory overhead in the server or client between messages.
>> There's no context to keep around, each new message can be decompressed with
>> a fresh new context. Memory overhead for compression is a real concern for
>> servers: one of the reasons Prosody was pushing for XEP-0138 to replace TLS
>> compression was that it's impossible to configure the memory use of TLS
>> compression to sane levels in OpenSSL.
>> 
>> However, it also has downsides. It requires either:
>> 
>> 1. That the concatenation of two compressed stanzas can be separated
>> unambiguously.
> 
> Could you explain that a bit more? For example, are you talking about 
> compressing two stanzas and sending them in the same TCP packet?

Instead of sending:

zlib(“<message/><iq/><message/><iq>...”)

(Where you’d occasionally send the compressed data you have so far.)

You'd send:

zlib(“<message/>”) + zlib(“<iq/>”) + zlib(“<message/>”) + zlib(“<iq>”)

(Where + is concatenation.)
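To make the two wire forms concrete, here is a rough Python sketch. The sample stanzas and the function names are made up for illustration; the first function keeps one long-lived zlib stream and does a full flush after every stanza, the second starts a fresh zlib stream for every stanza.

```python
import zlib

# Sample stanzas, invented for this illustration.
STANZAS = [b"<message/>", b"<iq/>", b"<message/>", b"<iq/>"]


def single_stream(stanzas):
    """One zlib stream for the whole session, full flush after each stanza."""
    comp = zlib.compressobj()
    chunks = []
    for stanza in stanzas:
        # Z_FULL_FLUSH emits all pending output and resets the compression
        # state, so later stanzas cannot refer back to earlier data.
        chunks.append(comp.compress(stanza) + comp.flush(zlib.Z_FULL_FLUSH))
    return chunks


def separate_streams(stanzas):
    """A complete, independent zlib stream per stanza."""
    return [zlib.compress(stanza) for stanza in stanzas]


if __name__ == "__main__":
    for name, fn in (("single", single_stream), ("separate", separate_streams)):
        sizes = [len(chunk) for chunk in fn(STANZAS)]
        print(name, sizes, sum(sizes))
```

Running it prints the per-stanza and total compressed sizes for both forms, which is a quick way to compare them on real traffic.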

This is easy in zlib because it’s possible to tell when a zlib stream
ends [1][2].
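As a minimal sketch of how a receiver could separate the concatenated streams, assuming Python's zlib module and its unused_data attribute from [2] (the function name split_streams is made up, and a real receiver would buffer partial input instead of raising):

```python
import zlib


def split_streams(data):
    """Split a concatenation of complete zlib streams into their payloads."""
    payloads = []
    while data:
        decomp = zlib.decompressobj()
        payload = decomp.decompress(data)
        if not decomp.eof:
            # The last stream is incomplete; this sketch simply gives up.
            raise ValueError("truncated zlib stream")
        payloads.append(payload)
        # Bytes after the end of this stream belong to the next one.
        data = decomp.unused_data
    return payloads


if __name__ == "__main__":
    wire = zlib.compress(b"<message/>") + zlib.compress(b"<iq/>")
    print(split_streams(wire))
```

Each stanza gets a fresh decompression object, so nothing has to be kept around between messages.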

> 
>> 2. Or that we apply framing outside of compression (which I expect to be
>> another can of worms).
> 
> Yes, I'd expect so. I recall debates about framing (or the lack thereof) for 
> XMPP on this very list from over 10 years ago. ;-)
> 
>> zlib has a header bit that indicates whether a block is the last block in a
>> stream, but again, that might be zlib-specific.
> 
> Would it be worthwhile to investigate what the various compression algorithms 
> support here?

I've been trying to look into LZW, as it is described by XEP-0229, but while I
can find enough descriptions of the algorithm itself, I can't find much about
the output encoding. Most of the LZW APIs I've seen also have no flush method
or anything similar.

Regards,
Thijs

[1] = http://zlib.net/manual.html:

"If the flush parameter is Z_FINISH, the remaining data is written and the
gzip stream is completed in the output. If gzwrite() is called again, a new
gzip stream will be started in the output. gzread() is able to read such
concatenated gzip streams."

[2] = https://docs.python.org/2/library/zlib.html#zlib.Decompress.unused_data
