On Tue Aug  4 10:39:53 2009, Christoph Terwelp wrote:
Hi,

Am 04.08.2009 um 11:00 schrieb Dave Cridland:

So, I've been playing with compression algorithms, by way of fun and games.

Did you have a look at EXI ( http://www.w3.org/XML/EXI/ ), It's the W3Cs approach to compress XML. It may be of interest to you.

I have looked at EXI on multiple occasions, yes. One of the things I didn't like was that it's schema-informed, which doesn't work for XMPP, where there are multiple schemas, often unknown at the session commencement. Absent a schema, it essentially degenerates into DEFLATE, and so it performs - from the figures I've seen - only marginally better than DEFLATE does by itself.

DEFLATE is pretty memory-heavy - it's using around 250-300k per session, and that's really a limiting factor for XMPP servers as far as I can see.

So what I'm doing here is trying to find potential algorithms that are sufficiently lightweight in terms of memory, but give at least ballpark performance with the existing XEP-0138 and TLS based DEFLATE.

As it happens, my current plans aren't working out well - on a real XMPP session, I'm seeing figures of:

Raw: 435379 100%
Huffman String: 238509 54.8%
Huffman XML: 214689 49.3%
Huffman XML SymDef: 178203 40.9%
Zlib String: 88366 20.3%

This is based on reading in my session's telemetry log, which includes sufficient information that I can replicate flushes, etc.

I'll try increasing the order, and play some fun and games with preset trees. And possibly switch to range compression - arithmetic without the patents. (Increasing the order will, in principle, increase the memory, but if I only create trees on demand, I may avoid most of them).

Dave.
--
Dave Cridland - mailto:[email protected] - xmpp:[email protected]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade

Reply via email to