Re: [Standards] Binary data over XMPP

Dave Cridland Tue, 06 Nov 2007 06:35:31 -0800

On Tue Nov  6 13:00:44 2007, Tomasz Sterna wrote:

On Mon, 05-11-2007 at 16:23 +0100, Tomasz Sterna wrote:
> Alternatively we could invent binary-2-utf mapping which has less
> overhead than BASE64.


Simplest that comes to mind:

Let's take first 256 allowable UTF-8 characters and assign them to 256
values of a single byte.
That would be less than 33% BASE64 overhead.

Can't do that, because many of those characters are going to be illegal even in CDATA sections.

You could take all those ones, though, and add 256 to the codepoint value before encoding - that would - I think - be sufficient.

But bear in mind that even then, to encode a single octet will yield between 1 and 3 characters. Encoding essentially random data - which includes the output of any decent encryption algorithm - will encode half the octets using 2-byte characters, yielding - on average - a 50% inflation. That's higher than base64, of course.
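
A rough way to check that estimate is the sketch below. It is only an illustration under stated assumptions, not a proposed encoding: "illegal" is taken to mean codepoints outside the XML 1.0 Char production (the C0 controls other than tab, LF and CR), and it ignores CR normalization and the escaping of '<' and '&'. The names encode_octets, decode_octets and inflation are made up for this example.

```python
import base64
import os

# Assumption: "illegal" means outside the XML 1.0 Char production, i.e. the
# C0 control characters other than tab (0x09), LF (0x0A) and CR (0x0D).
ILLEGAL = {b for b in range(0x20) if b not in (0x09, 0x0A, 0x0D)}


def encode_octets(data: bytes) -> str:
    """Map each octet to one character, shifting illegal ones up by 256."""
    return "".join(chr(b + 256 if b in ILLEGAL else b) for b in data)


def decode_octets(text: str) -> bytes:
    """Invert encode_octets: any codepoint >= 256 is a shifted octet."""
    return bytes(ord(c) - 256 if ord(c) >= 256 else ord(c) for c in text)


def inflation(encoded_len: int, raw_len: int) -> float:
    """Fractional size increase of the encoded form over the raw octets."""
    return encoded_len / raw_len - 1.0


if __name__ == "__main__":
    raw = os.urandom(1 << 16)  # stand-in for encrypted (random-looking) data
    mapped = encode_octets(raw).encode("utf-8")
    b64 = base64.b64encode(raw)
    print(f"mapped: {inflation(len(mapped), len(raw)):.1%}")
    print(f"base64: {inflation(len(b64), len(raw)):.1%}")
```

Under these assumptions every octet from 0x80 to 0xFF becomes a 2-byte UTF-8 sequence, and so do the 29 shifted control characters, so uniformly random input comes out a little above the 50% figure, against roughly 33% for base64.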

It's possible that a modified UTF-7 might be better. (And UTF-7, modified or not, is acceptable UTF-8).


Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
