Re: [Standards] mobile optimizations (was: Re: G oogle Andro ï d SDK not XMPP compliant ?)
Actually, the W3C binary XML standard is significantly better than traditional compression standards like Zip. The binary conversion process also compresses the file. You might want to read:

http://www.w3.org/XML/EXI/
http://www.w3.org/TR/2007/WD-exi-measurements-20070725/
http://www.w3.org/TR/xbc-characterization/#N107D4

BTW, Fast Infoset was not selected by the W3C.

On 2/14/08 5:04 PM, Fabio Forno [EMAIL PROTECTED] wrote:
> On Thu, Feb 14, 2008 at 9:39 PM, Dave Cridland [EMAIL PROTECTED] wrote:
>> I've never been all that convinced about binary XML forms. They work to a degree with the highly fixed XML in, for example, SyncML, and they're pretty good at compressing individual stanza-like objects over SMS for things like OMA EMN (Email Message Notification, or something - I've long since forgotten what these acronyms stand for), but for long-running streams I'm under the impression that studies show it'll be outperformed. So if you're a big fan of Binary XML formats, please bring along your figures. :-)
>
> I'm missing the reference, but you should get the best results with binary + compression. However, it's not worth the candle, since EXTENSIBLE binary XML is not easy (there is Fast Infoset, but the specification is incredibly complex) and the gain is not so high.
>
> --
> Fabio Forno, Ph.D.
> Bluendo srl http://www.bluendo.com
> jabber id: [EMAIL PROTECTED]
Re: [Standards] mobile optimizations (was: Re: G oogle Andro ï d SDK not XMPP compliant ?)
Dave,

take a look at:

http://www.agiledelta.com/w3c_binary_xml_proposal.html
http://www.idealliance.org/papers/xml02/dx_xml02/papers/06-02-04/06-02-04.pdf

The W3C spec is based on Agile Delta's EfficientXML. The data I have seen on EfficientXML indicate that it is many times more efficient than Zip:

Message                        WinZip           Efficient XML
1362 bytes, strongly typed     3.13x smaller    75.67x smaller
980 bytes, loosely typed       1.6x smaller     8.45x smaller
21437 bytes                    6x smaller       33x smaller

(All factors are relative to the original message size.)

I have other data for large message sizes if interested. Unfortunately I can't provide the raw data or the messages used. But the group that did the study tested the messages with WinZip, MPEG-7+BIM, Xmill, Efficient XML, ASN.1 PER, and WBXML-like. Efficient XML beat them all by a large margin.

Binary XML will help out in two significant areas where XMPP is used:

1. It can bring a significant reduction in the bandwidth used, which can have a big impact on the performance of a server.
2. Faster processing in the chat server. Reading XML is expensive. Most of the binary XML formats were designed to be not only much smaller in size but also much less CPU-intensive to process. This should in theory dramatically improve the scalability of a given XMPP server.

boyd

On 2/14/08 3:39 PM, Dave Cridland [EMAIL PROTECTED] wrote:
> On Thu Feb 14 20:08:53 2008, Peter Saint-Andre wrote:
>> Here's a list of things we might talk about:
>>
>> 1. Recommendations regarding when to use the TCP binding and when to use the HTTP binding (BOSH).
>>
>> 2. Compression via TLS or XEP-0138 (use it!). Also binary XML as a compression mechanism.
>
> I've never been all that convinced about binary XML forms. They work to a degree with the highly fixed XML in, for example, SyncML, and they're pretty good at compressing individual stanza-like objects over SMS for things like OMA EMN (Email Message Notification, or something - I've long since forgotten what these acronyms stand for), but for long-running streams I'm under the impression that studies show it'll be outperformed. So if you're a big fan of Binary XML formats, please bring along your figures. :-)
>
>> 3. Fast reconnect to avoid TLS+SASL+resource-binding packets.
>
> Lots of work from mobile email (i.e., Lemonade) is transferable here. It'd be really nice if Tony Finch was coming, since he could talk us through QTLS and QUICKSTART - they're SMTP fast startup work he did a while back. Very interesting, but didn't make it into the Lemonade Profile itself.
>
>> 4. ETags for roster-get (see XEP-0150, let's resurrect that).
>
> (Om. Looks quite ugly, IMHO. I'll do a counter-proposal)
>
>> 5. Advisability of presence-only connections (no roster-get, just send presence and whatever you receive is nice).
>
> If you can optimize the roster fetch sufficiently, this really isn't required.
>
>> Anything else?
>
> Beer, obviously.
>
> Dave.
> --
> Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
>  - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
>  - http://dave.cridland.net/
> Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
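For reference, the "times smaller" factors quoted in the message above can be turned into implied compressed sizes with a little arithmetic. The following is a minimal Python sketch: the figures are the ones given in the message, the names `FIGURES`, `implied_size` and `relative_gain` are illustrative only, and the typing of the third message is not stated in the thread.

```python
# Reduction factors as quoted in the thread: original size in bytes,
# description, and "times smaller than original" per tool.
FIGURES = [
    (1362, "strongly typed", {"WinZip": 3.13, "Efficient XML": 75.67}),
    (980, "loosely typed", {"WinZip": 1.6, "Efficient XML": 8.45}),
    (21437, "typing not stated", {"WinZip": 6.0, "Efficient XML": 33.0}),
]


def implied_size(original_bytes: int, factor: float) -> float:
    """Compressed size implied by an 'N times smaller' factor."""
    return original_bytes / factor


def relative_gain(factors: dict) -> float:
    """How many times smaller the Efficient XML output is than the WinZip output."""
    return factors["Efficient XML"] / factors["WinZip"]


if __name__ == "__main__":
    for original, label, factors in FIGURES:
        print(f"{original} bytes ({label}):")
        for tool, factor in factors.items():
            print(f"  {tool}: about {implied_size(original, factor):.0f} bytes")
        print(f"  Efficient XML vs WinZip: {relative_gain(factors):.1f}x smaller")
```

On these numbers the 1362-byte message would shrink to roughly 18 bytes with Efficient XML against roughly 435 bytes with WinZip, while for the other two messages the gap between the two tools is about 5 to 5.5 times. This only restates the quoted figures; it says nothing about how they were measured.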
Re: [Standards] mobile optimizations (was: Re: G oogle Andro ï d SDK not XMPP compliant ?)
(Hey, where did that space come from in the subject line?)

On Thu Feb 14 22:06:19 2008, Boyd Fletcher wrote:
> Message                        WinZip           Efficient XML
> 1362 bytes, strongly typed     3.13x smaller    75.67x smaller
> 980 bytes, loosely typed       1.6x smaller     8.45x smaller
> 21437 bytes                    6x smaller       33x smaller

Interesting, certainly.

My impression has been that binary XML formats work best in cases where the schema is fixed, the data is relatively tightly marked up, and the overall document length is low. Our data is heavy on the text, our overall schema varies wildly, and our documents are quite big.

The Efficient XML Interchange Measurements Note seems to back up this impression I have:

    The best improvements compared to gzipped XML in the Both case come for small documents, which also have sufficient schema information, i.e., the FixML and CBMS groups. Here FXDI and Efficient XML (and ASN.1 PER in some cases) manage to achieve a clear improvement, sometimes even under half the size of gzipped XML. For the larger documents there appears to be no gain over the Document case. For example, there is no size difference between gzipped XML and any of the candidates for the Seismic document, in contrast to the Schema case.

To my mind, the figures and graphs there suggest that improvements over DEFLATE will be marginal at best for our kind of data.

But I'll do my reading, certainly, as well as getting some figures for some XMPP session compression using existing mechanisms - assuming I can. (I vaguely recall that the jabber.org server does XEP-0138, and I know ours does TLS compression - I could stick XEP-0138 in it quite quickly I think as a test).

Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
Re: [Standards] mobile optimizations (was: Re: G oogle Andro ï d SDK not XMPP compliant ?)
On Fri, Feb 15, 2008 at 12:03 AM, Dave Cridland [EMAIL PROTECTED] wrote:
> To my mind, the figures and graphs there suggest that improvements over DEFLATE will be marginal at best for our kind of data.

That's my point, as you can read in my other mail: benchmarks are too sensitive to the nature of the data. Before FOSDEM I can produce some figures with real XMPP data using zlib and, I hope, also with some binary XML. Anyway, I can anticipate that with zlib the size of a whole compressed message stanza is often shorter or minimally longer than the uncompressed body alone: do we really need better performance?

--
Fabio Forno, Ph.D.
Bluendo srl http://www.bluendo.com
jabber id: [EMAIL PROTECTED]
Re: [Standards] mobile optimizations (was: Re: G oogle Andro ï d SDK not XMPP compliant ?)
On 2/14/08 5:57 PM, Fabio Forno [EMAIL PROTECTED] wrote:
> On Thu, Feb 14, 2008 at 11:06 PM, Boyd Fletcher [EMAIL PROTECTED] wrote:
>> Message                        WinZip           Efficient XML
>> 1362 bytes, strongly typed     3.13x smaller    75.67x smaller
>> 980 bytes, loosely typed       1.6x smaller     8.45x smaller
>> 21437 bytes                    6x smaller       33x smaller
>
> Uhm, I've seen them; they are of little significance for XMPP traffic. Try the same benchmarks on real XMPP streams and you'll see that the difference is not so high. The reason? Much of the redundancy comes from attribute values such as to, from, type and so on. Since it's almost impossible to make assumptions about the values of attributes, except for a few like type where the schema sometimes imposes restrictions, binary XML formats usually don't use dictionaries for them and therefore don't lead to any gains in these cases. Moreover, in streams there is an incredibly high correlation between stanzas, which makes zlib perform considerably better than in the single-message scenario. Yes, in the end there is a gain, but it's much smaller than what you get from optimizing the roster and presence stanza exchange and making the connection manager cache some information and answer on behalf of the client.

I agree that protocol improvements are in order. But XMPP data was looked at by some of the folks on the W3C committee as example data and the compression was significant. There has also been some internal testing in DOD using EfficientXML with captured XMPP data streams, and we have seen a decrease in size of 4-5 times compared to the zip lib approach.

>> 1. It can bring a significant reduction in the bandwidth used, which can have a big impact on the performance of a server.
>> 2. Faster processing in the chat server. Reading XML is expensive. Most of the binary XML formats were designed to be not only much smaller in size but also much less CPU-intensive to process. This should in theory dramatically improve the scalability of a given XMPP server.
>
> On this topic I do agree, though I think you get the greatest advantage when connecting very limited nodes, such as in sensor networks. To make it clear:
>
> - I don't think that on the wired internet the relatively small advantages you can get by abandoning text-based XML pay off; for text XML you have a high number of reliable libraries in any language, while binary XML is still far from being mature

I strongly disagree. We have been using binary XML for years and the libraries are quite stable and reliable. Unfortunately there just aren't very many open source libraries. Hopefully that will change over the next 2 years as W3C's EXI specification is ratified.

In very large production environments, with hundreds of thousands of users/connections, the difference between binary XML and regular XML can be significant, not just in reduced bandwidth utilization but also in reduced CPU overhead in processing the XML data. A couple of years ago, one of the large stock exchanges tried to switch to XML as the data transport. It tanked because the servers could not process the XML fast enough to keep up with the transaction rate. They switched back to their legacy binary protocol within 2 days.

> - In edge cases such as mobiles and sensor networks, XML bindings may make sense, especially because of computational constraints, but in these cases (even more so for sensors) it's also very likely that a downsized version of XMPP would be used, connecting to a proxy acting as a gateway
>
> --
> Fabio Forno, Ph.D.
> Bluendo srl http://www.bluendo.com
> jabber id: [EMAIL PROTECTED]
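The point made above about correlation between stanzas can be checked on a small scale. The following is a minimal Python sketch, not a benchmark: it compresses a series of synthetic, XMPP-like message stanzas once as independent documents and once through a single long-lived zlib stream with a sync flush after each stanza (the usual way to keep a compressed stream interactive). The stanza template, the JIDs and the helper names `make_stanza` and `compare` are invented for illustration, and the synthetic bodies are far more repetitive than real traffic.

```python
import zlib


def make_stanza(i: int) -> tuple[bytes, bytes]:
    """Build a synthetic message stanza and return (stanza, body) as bytes."""
    body = f"message number {i}, see you at FOSDEM"
    stanza = (
        '<message from="romeo@example.net/orchard" '
        'to="juliet@example.com/balcony" '
        f'type="chat" id="m{i}"><body>{body}</body></message>'
    )
    return stanza.encode("utf-8"), body.encode("utf-8")


def compare(count: int = 50) -> tuple[int, int, int]:
    """Return total bytes for (standalone zlib, streamed zlib, raw bodies)."""
    stream = zlib.compressobj()
    standalone_total = 0
    stream_total = 0
    body_total = 0
    for i in range(count):
        stanza, body = make_stanza(i)
        # Each stanza compressed on its own: no shared history between stanzas.
        standalone_total += len(zlib.compress(stanza))
        # One long-lived stream: Z_SYNC_FLUSH pushes out everything so far so
        # the peer can decode this stanza now, while the history is retained.
        chunk = stream.compress(stanza) + stream.flush(zlib.Z_SYNC_FLUSH)
        stream_total += len(chunk)
        body_total += len(body)
    return standalone_total, stream_total, body_total


if __name__ == "__main__":
    standalone, streamed, bodies = compare()
    print(f"standalone zlib per stanza: {standalone} bytes in total")
    print(f"single zlib stream:         {streamed} bytes in total")
    print(f"uncompressed bodies only:   {bodies} bytes in total")
```

Running it prints the three totals side by side; replacing `make_stanza` with stanzas captured from a real session would give figures closer to the ones being asked for in this thread.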
Re: [Standards] mobile optimizations (was: Re: G oogle Andro ï d SDK not XMPP compliant ?)
On Fri, Feb 15, 2008 at 12:10 AM, Boyd Fletcher [EMAIL PROTECTED] wrote:
> I agree that protocol improvements are in order. But XMPP data was looked at by some of the folks on the W3C committee as example data and the compression was significant. There has also been some internal testing in DOD using EfficientXML with captured XMPP data streams, and we have seen a decrease in size of 4-5 times compared to the zip lib approach.

Just to set up the correct benchmark: do you mean EfficientXML + compression or EfficientXML alone? (I promise I'll try to get some figures out over the weekend, but without compression it's difficult to believe you can get those improvements.)

> I strongly disagree. We have been using binary XML for years and the libraries are quite stable and reliable. Unfortunately there just aren't very many open source libraries.

Indeed, that was what I meant, sorry for not being clear.

> Hopefully that will change over the next 2 years as W3C's EXI specification is ratified.

That was my other point about maturity; I should have said consensus: even with some libs available, it is very difficult to base an XMPP extension on a standard that is not yet ratified and to choose between the many XML binding options. If the situation changes (or has changed) I'd be happy to jump back to the binary supporters' side, where I was before trying to implement it for J2ME ;)

> In very large production environments, with hundreds of thousands of users/connections, the difference between binary XML and regular XML can be significant, not just in reduced bandwidth utilization but also in reduced CPU overhead in processing the XML data. A couple of years ago, one of the large stock exchanges tried to switch to XML as the data transport. It tanked because the servers could not process the XML fast enough to keep up with the transaction rate. They switched back to their legacy binary protocol within 2 days.

I don't have trouble believing this, but the scenario, I guess, is slightly different, since I don't think that their format had many extensibility features (when the grammar is not fixed you lose most of the possible optimizations).

--
Fabio Forno, Ph.D.
Bluendo srl http://www.bluendo.com
jabber id: [EMAIL PROTECTED]