Hello Dave

XEP-0322 (EXI) defines two methods of using EXI-compressed XML in pure binary 
form, i.e. not base-64 encoded. The first is a handshake mechanism in which a 
normal XMPP session is converted to pure binary using EXI (including quick 
setup). The second is EXI from the start, i.e. no text XML is ever sent or 
received.

The mention of base64 (or not) in this instance was referring to the way a 
schema-aware EXI encoder will strip off the base64 from values which are 
declared as such in the XML schema, as I understand things, such as a 
hypothetical attribute value here:

<foo attr='BLAHBLAHBL=='/>

the attribute here in base64 would not be in base64 in EXI serialization.

Feel free to correct me if I'm off the mark.

You're correct. EXI can work in two modes: "schema informed" and "schema 
uninformed". In the uninformed mode, strings are stored as they occur (although 
string tables can be used to avoid duplicate strings in a stream). With a 
schema-informed grammar, the encoder and decoder know how many possible 
characters are available for a specific attribute or value (i.e. type), and 
use only the number of bits necessary to encode each character. So, if a 
base-64 data type is used (and not xs:string) in the schema, only 6 
bits/character are used to store a character, effectively removing the 
overhead of the base-64 encoding in the first place. Interestingly, encoding 
binary as upper-case hexadecimal uses only 4 bits/character, giving the same 
total number of bits in the compressed stream as if the data had been 
encoded using base-64.
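
As a rough check of that arithmetic, here is a minimal Python sketch. It is 
not an EXI encoder: it only counts bits per character for a restricted 
alphabet, and ignores length prefixes, '=' padding and everything else a real 
EXI stream contains. The helper name restricted_charset_bits and the 30-byte 
payload are made up for the illustration.

```python
import base64


def restricted_charset_bits(text: str, alphabet_size: int) -> int:
    """Bits used if every character costs ceil(log2(alphabet_size)) bits.

    Hypothetical helper for illustration only; not part of any EXI library.
    """
    bits_per_char = (alphabet_size - 1).bit_length()  # 64 -> 6, 16 -> 4
    return len(text) * bits_per_char


# 30 bytes: a multiple of 3, so the base-64 text carries no '=' padding.
payload = bytes(range(30))

b64_text = base64.b64encode(payload).decode("ascii")  # 4 chars per 3 bytes
hex_text = payload.hex().upper()                      # 2 chars per byte

b64_bits = restricted_charset_bits(b64_text, 64)
hex_bits = restricted_charset_bits(hex_text, 16)
raw_bits = len(payload) * 8

print(len(b64_text), "base-64 chars ->", b64_bits, "bits")
print(len(hex_text), "hex chars     ->", hex_bits, "bits")
print(len(payload), "raw bytes     ->", raw_bits, "bits")
```

Under these assumptions both text forms come out at the same number of bits 
as the raw binary, since 4/3 characters per byte at 6 bits and 2 characters 
per byte at 4 bits both give 8 bits per byte.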

Best regards,
Peter
