(Switching list, CCing Alexander)

On Wed, Dec 4, 2013 at 1:56 PM, Alexander Holler <[email protected]> wrote:
> On 04.12.2013 14:05, Ralph Meijer wrote:
>
>> Alternatively, it makes total sense to use a different protocol on PANs
>> and/or LANs and then bridge it to XMPP for WAN transport. For example,
>> Peter Waher is working on bridging MQTT and XMPP, and MQTT also has a
>> special profile for sensor networks based on non-TCP/IP settings, like
>> Zigbee.
>
> I would prefer to make a clean cut and to develop something like XMPP 2.0
> or similar which got rid of XML in favor of some header based protocol
> (e.g. protocol buffers or even something as simple like
> <type><length><optional_hash>content (in binary form, a bit more would be
> needed to enable nested types, but it's just to express how it should have
> been done).
>
> I think it's relatively easy to exchange the XML-based parts of current
> XMPP-implementation to something like protocol buffers. All the concepts
> and other stuff would still work, but the really ugly thing of parsing
> stream based XML would be gone.

XML parsers are really fast, and those designed for XMPP, or at least those designed with XMPP in mind, are particularly fast for XML stream processing.

There *is* an argument that XML makes transporting pure binary hard, but quite honestly, if we wanted to have arbitrary binary sections, we'd be pretty much forced into using a very different conceptual structure.

One option, though, is EXI, which "knows" - with some encouragement - to ship values as binary even though in traditional XML serialization they'd be base64 encoded. My only worry is that the level of benefit this gives is rapidly eroded by how good XML parsers have got, especially when you consider the complexity that requiring a known schema adds to the protocol.

The problem with trying to switch wholesale to an entirely non-XML protocol is that any attempt to maintain transparent compatibility with XML-based XMPP means having a common model expressible in either XML or some other format.
There are attempts to do this (EXI is arguably one; XER (XML Encoding Rules, X.693 if I recall) is another path), but in general they're hopelessly inefficient and ugly unless you *also* have schema awareness at both ends. XMPP is *not* a hard-schema protocol for the most part - we can and do cheerfully sling extra elements and attributes in all over the shop. Only the core is hard - that is, the stanzas and stream - the rest should be considered simply "complete as far as they go".

A final problem with protocol buffers and similar concepts is that they're binary, and therefore knock out a whole range of applications, such as JavaScript environments. This may well be a short-term view, though, as JavaScript has probably gained some binary handling already.

> Especially the problem that you need to parse the whole stream until you
> even know how long a packet (stanza) is, is a very ugly concept. Together
> with the surrounding <stream:stream> this is imho something never should
> have been done. XML was designed for documents (of fixed sizes, e.g. you
> get the size from the file system), but not for streams.

I'm assuming you mean the self-framing thing here.

"Parse" is a very loaded word. You can pretty much lex most of it, especially if at that layer in the server you don't care too much about well-formedness. There's a good argument that you should be strict about well-formedness only on output, anyway. A good (for XMPP) XML parser will cheerfully do this for you, and do it all so fast that any additional overhead is just not worth worrying about.

If I'd been there in 1999, I would have argued strenuously against self-framing XML. I agree wholeheartedly that it was a design error, and I would have gone for a split between "header" and "body", and had octet counting all the way. But it's now a solved problem, and I don't even blink anymore.

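To make the "lex, don't parse" point concrete, here is a minimal sketch in Python of finding stanza boundaries purely by tracking element depth. The `StanzaFramer` name and its `feed` method are made up for illustration and are not taken from any real XMPP library. It assumes the input sticks to what XMPP allows on the wire (no comments, no DTDs), it does not handle CDATA sections, and it deliberately does no well-formedness or namespace checking.

```python
# Hypothetical illustration only: frames stanzas by depth, without a full XML parse.
LT, GT, SLASH, QMARK = b"<"[0], b">"[0], b"/"[0], b"?"[0]
QUOTES = (b'"'[0], b"'"[0])


class StanzaFramer:
    """Splits an XMPP-style byte stream into the child elements of the stream root."""

    def __init__(self):
        self.buf = bytearray()   # bytes not yet handed out as a stanza
        self.pos = 0             # next byte in buf to examine
        self.depth = 0           # 0 = outside the stream, 1 = directly inside the root
        self.tag_start = None    # index of the '<' of the tag being scanned, if any
        self.quote = None        # quote character of the attribute value we are in
        self.stanza_start = 0    # index where the current stanza began

    def feed(self, data):
        """Accept any chunk of bytes; return the stanzas completed by it."""
        self.buf += data
        stanzas = []
        while self.pos < len(self.buf):
            c = self.buf[self.pos]
            self.pos += 1
            if self.tag_start is None:
                if c == LT:
                    self.tag_start = self.pos - 1
            elif self.quote is not None:
                if c == self.quote:          # end of attribute value
                    self.quote = None
            elif c in QUOTES:                # '>' may appear inside attribute values
                self.quote = c
            elif c == GT and self._close_tag():
                stanzas.append(bytes(self.buf[self.stanza_start:self.pos]))
                del self.buf[:self.pos]      # drop everything up to the stanza end
                self.pos = 0
        return stanzas

    def _close_tag(self):
        """Update depth for the tag that just ended; True if it completed a stanza."""
        start, end = self.tag_start, self.pos    # the tag is buf[start:end]
        self.tag_start = None
        first, last = self.buf[start + 1], self.buf[end - 2]
        if first == QMARK:                       # XML declaration, ignore
            return False
        if first == SLASH:                       # closing tag
            self.depth -= 1
            return self.depth == 1
        if last == SLASH:                        # self-closing tag
            if self.depth == 1:
                self.stanza_start = start
                return True
            return False
        if self.depth == 1:                      # opening tag of a new stanza
            self.stanza_start = start
        self.depth += 1
        return False


if __name__ == "__main__":
    framer = StanzaFramer()
    wire = b"<stream:stream><message><body>hi</body></message><presence/>"
    # Feed in awkward 7-byte chunks to show that framing survives arbitrary splits.
    for i in range(0, len(wire), 7):
        for stanza in framer.feed(wire[i:i + 7]):
            print(stanza.decode())
```

The only state carried between chunks is a depth counter, a quote flag and a couple of offsets, which is roughly why the framing cost is small next to everything else a server does with a stanza.
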
Actually, there are arguments in favour of the way we do things, such as being able to serialize to XML directly on output, instead of having to serialize entire stanzas and count the bytes before transmission - I wouldn't claim these to be overwhelming arguments in the case of XMPP, but I've seen them made for plenty of other cases. HTTP's chunked transfer encoding is a result of this kind of argument.

> Using another port, that would even be downwards compatible.

Putting aside my dispute with that "downwards compatible" claim above, I think the notion of running an "XMPP 2.0" clean redesign just isn't a practical concept. It'd be very interesting as a thought experiment in protocol design, but utterly useless in terms of realistic deployment.

> What would be left, is to modify the presence stuff to get rid of the need
> for ever lasting (tcp) connections.

Actually, presence hasn't required everlasting TCP connections for years. Between BOSH and XEP-0198, that's again a solved problem.

For anywhere I've said "solved problem", you're free to substitute the words "case where the state of the art has mitigated the problem to the point that the incremental gain from a fuller, but more drastic, solution is no longer worthwhile".

Dave.
