On Sat May 20 05:56:19 2006, Justin Karneges wrote:
On Friday 19 May 2006 20:39, Peter Saint-Andre wrote:
> But it turns out that streaming XML has some inherent benefits, one of
> which is that you don't have to create a new parser instance every time
> you want to send, receive, or route a message.

More importantly, XMPP-specific parsing code doesn't need to be written. Any other wire protocol would require writing a parser, but with XMPP you can just throw SAX at it.
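
A rough sketch of the "throw SAX at it" approach, using Python's standard xml.sax incremental parser. This is illustrative only: the handler class, the parse_chunks helper and the sample stream are made up for the example, not taken from any real client. One parser instance lives for the whole stream, and a stanza counts as complete whenever the element depth drops back to 1 (just inside the stream root).

```python
import xml.sax


class StanzaHandler(xml.sax.ContentHandler):
    """Hypothetical handler: records the name of each completed stanza."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # 0 = outside the stream root, 1 = inside it
        self.stanzas = []   # names of top-level children seen so far

    def startElement(self, name, attrs):
        self.depth += 1

    def endElement(self, name):
        self.depth -= 1
        if self.depth == 1:
            # We are back directly under the stream root, so a whole
            # stanza has just ended.
            self.stanzas.append(name)


def parse_chunks(chunks):
    """Feed arbitrary chunks of one XML stream to a single parser."""
    handler = StanzaHandler()
    parser = xml.sax.make_parser()   # expat-backed incremental parser
    parser.setContentHandler(handler)
    for chunk in chunks:
        # Chunks may split a tag anywhere; the parser buffers the remainder.
        parser.feed(chunk)
    # close() is deliberately not called: the stream root is still open.
    return handler.stanzas
```

For instance, parse_chunks can be handed the stream open tag followed by a message split in the middle of its tag name, and it still reports the message once the closing tag has arrived.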


Ah, you see I approached XMPP looking for the framing for the messages, because every other protocol I deal with has explicit framing for the messages.

So, I do string matches to pull out the stanzas, and turn them into complete XML documents by wrapping them in the real <stream> and faked </stream>, and use DOM on the resultant docs. In other words, I treat them as framed messages to pull out and parse, where the framing depends on the opening bytes (up to the first space or >). Maybe I'm weird, but it seems to work well. :-)

There's a potential problem where you end up finding a closing tag that's actually not closing the stanza, because of namespace redefinitions or whatever, but that's relatively easy to deal with: you just find the next candidate end-of-stanza tag. You get similar problems if you want to isolate messages in IMAP, too, where the framing changes depending on the type of message.
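
The string-matching approach described above might look roughly like this in Python. It is a sketch under assumptions, not my actual code: extract_stanza, candidate_ends and the STREAM_OPEN value are invented for illustration, and a real client would reuse the <stream> tag it actually received. A nested element with the same name as the stanza stands in for the "closing tag that isn't really the end" case; a failed DOM parse just means "try the next candidate".

```python
import re
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Assumed stream header for the example; a real client would keep the one
# it actually received. The closing tag is faked, as it never arrives.
STREAM_OPEN = ("<stream:stream xmlns='jabber:client' "
               "xmlns:stream='http://etherx.jabber.org/streams'>")
STREAM_CLOSE = "</stream:stream>"


def candidate_ends(buffer, name):
    """Yield offsets just past each possible end of the first stanza."""
    first_gt = buffer.find(">")
    if first_gt > 0 and buffer[first_gt - 1] == "/":
        yield first_gt + 1               # self-closing, e.g. <presence/>
    for match in re.finditer(r"</" + re.escape(name) + r"\s*>", buffer):
        yield match.end()                # each closing tag with that name


def extract_stanza(buffer):
    """Return (stanza_element, rest), or (None, buffer) if incomplete."""
    buffer = buffer.lstrip()
    # The frame type is the opening bytes up to the first space, "/" or ">".
    opening = re.match(r"<([^\s/>]+)", buffer)
    if opening is None:
        return None, buffer
    for end in candidate_ends(buffer, opening.group(1)):
        wrapped = STREAM_OPEN + buffer[:end] + STREAM_CLOSE
        try:
            doc = minidom.parseString(wrapped)
        except ExpatError:
            continue                     # not the real end; try the next one
        return doc.documentElement.firstChild, buffer[end:]
    return None, buffer                  # need more bytes from the socket
```

This only retries on parse failure, so it is a sketch of the idea rather than a robust framer; odd input such as a "/>" sequence inside an attribute value of a self-closing stanza would need more care.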

My favourite benefit to XML streams over XML messages, though, is that namespace declarations can be moved out of the messages and into the root element. That's very cool for octet-obsessives like me.
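
To put a rough number on that, here is a small Python sketch comparing the two representations. The helper names and the sample bodies are made up for the example. It checks that a parser sees the same namespaced element names either way, then counts the octets saved by declaring the default namespace once on the root.

```python
import xml.etree.ElementTree as ET

DEFAULT_NS = "jabber:client"
STREAM_NS = "http://etherx.jabber.org/streams"


def as_standalone(body):
    """One XML document per message: the declaration rides along each time."""
    return f"<message xmlns='{DEFAULT_NS}'><body>{body}</body></message>"


def as_stream_child(body):
    """Stanza inside a stream: the namespace is inherited from the root."""
    return f"<message><body>{body}</body></message>"


def compare(bodies):
    """Return (same_names, octets_saved) for the given message bodies."""
    standalone = [as_standalone(b) for b in bodies]
    children = [as_stream_child(b) for b in bodies]
    stream = (f"<stream:stream xmlns='{DEFAULT_NS}' xmlns:stream='{STREAM_NS}'>"
              + "".join(children) + "</stream:stream>")

    # Namespace-qualified names as a parser reports them, both ways.
    names_standalone = [ET.fromstring(doc).tag for doc in standalone]
    names_stream = [child.tag for child in ET.fromstring(stream)]

    # Octets saved on the stanzas themselves (the root is paid for once).
    saved = (sum(len(d.encode("utf-8")) for d in standalone)
             - sum(len(c.encode("utf-8")) for c in children))
    return names_standalone == names_stream, saved
```

The saving is one declaration's worth of octets per stanza, which adds up over a long-lived stream.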

(For compression people: Although moving the namespace declarations further toward the root of the document tree to remove repetitions is simply a representational change, the repetitions are spread over the whole life of the stream, so they tend to fall outside the reference length limit for Ziv/Lempel type compressions, and the namespace strings themselves are sufficiently long that statistical modelling compression algorithms won't have a good enough effect on them either. Also, because the namespace declaration strings tend to be similar to one another, putting them all together makes them compress better, too.)


Granted, I'm also one of those guys that "wouldn't have designed it that way", but I still think XML streams are cool in that geeky sort of way. Look mom, no parser.


I think I probably would have gone for explicit framing, but I put that down to reflex rather than any particularly sound principles. I treat the data as if it does have explicit framing anyway, so it doesn't really matter, and different parsing techniques mean that there's an advantage in letting the XML do the framing for you in the protocol.


I agree with Peter though, talking about the rationale in 2006 is kind of pointless.

Well, it's pointless from the point of view of XMPP, certainly, but it's interesting in a more philosophical, protocol-design kind of way. Which could be pointless, but may not be.

Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
