Dear Yusuke. Thank you for your constructive comments. I'll try to address them one at a time:
> I think this idea of schema exchange is interesting. On the other hand, it
> may make confusion on management (explosion of the number of derived schemas).

§3.3 in the XEP handles this. The server is free to add cache rules to avoid an explosion in the number of derived schemas. The server is also free to reject upload or download requests, for any reason (§2.4).

> If there is a way to name a XML schema defined in XEP, servers and clients
> can share them by the names.

The problem with this approach is that names seldom change, especially during development, and a slight change (a new attribute, a new element, etc.) will completely change the compression. Furthermore, errors produced in this way will be extremely difficult to debug. An efficient and fool-proof way to communicate using different schema versions (having the same namespace and schema IDs) is necessary.

> Of course (1) this does not eliminate needs to upload XML schema because the
> end device may have non-XEP vendor specific extensions (2) we need secure
> channel to download XEP-defined schemas to avoid attacks.

§2.4 also proposes the possibility to install such schema files manually on the server. The XEP allows for different scenarios.

> Another pitfall: if we want to use bit-packed, we need to make stanzas
> encoded as self contained elements. Otherwise you cannot do 'fflush()' at
> the end of element of a stanza. However, self contained elements do not allow
> an encoder to re-use compression context (string tables) between outside of
> the element and the element itself. This means the encoder need to re-encode
> JID strings as is (otherwise you can just encode a string with few bytes of
> reference).

It is supposed that the EXI compression engine works in XML fragment mode, where each stanza is compressed separately. I don't see why self-contained elements would be required in this case.

Tables should not be reused between stanzas: a per-stanza table can be very small, whereas, as you point out in your example, the number of possible strings may be large (for instance many different JIDs). The number of possible strings within a single message is much smaller, which also makes references to the tables shorter within the bit-packed message. Finally, it's up to the implementor how to set up the EXI compression engine. Some may feel bit-packed is better, others that byte-packed is better. I'll add an implementation note in the XEP regarding this.

> Note: the results do not use schema-informed grammars to encode XEP-based
> elements, so compression ratio of Peter's proposal should be much better --
> in my best scenario with schema-informed EXI, it will be 809 bytes (22% of original XML).

This is a vital aspect of this proposal. For sensor networks and IoT, especially wireless sensor networks, buffer size is the first priority. Therefore, EXI compression should be done with as much information about schemas as possible.

> BTW, my initial idea is somewhat different. What I want to make is
> constrained XMPP clients (and if technically possible, servers) with static
> set of pre-compiled EXI grammars and without ability to talk with regular
> XML-based XMPP. This enables nodes with batteries to speak sensor data with
> narrow wireless link such as 15.4 or with 3G link charged by quantity. Maybe
> this idea is oriented towards SRV-based negotiation.

We also see this as an important aspect of this proposal: most sensors will have pre-compiled code, often (semi-)automatically generated from schema files, for compression and decompression of EXI content. Therefore, the proposal includes the possibility for the client to reject the connection if parameters are not as expected.
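To illustrate the point about table size and reference length: a minimal sketch, assuming a reference into a string table with m entries is encoded as an n-bit unsigned integer with n = ceil(log2(m)), and ignoring event codes and any byte-alignment padding. The function name and the table sizes (16 strings in one stanza, 5000 JIDs accumulated over a stream) are hypothetical and chosen only for illustration.

```python
def reference_bits(table_size: int) -> int:
    """Bits needed to index a table with `table_size` entries: ceil(log2(m))."""
    if table_size < 1:
        raise ValueError("table must contain at least one entry")
    # (m - 1).bit_length() equals ceil(log2(m)) for every integer m >= 1
    return (table_size - 1).bit_length()


# Hypothetical sizes: a small per-stanza table versus a table that keeps
# growing over a long-lived stream with many different JIDs.
per_stanza_bits = reference_bits(16)
whole_stream_bits = reference_bits(5000)
print(per_stanza_bits, whole_stream_bits)
```

Under these assumptions the small table is addressed with 4 bits and the large one with 13, which is the effect described above for bit-packed messages.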
Sincerely,
Peter Waher

-----Original Message-----
From: Yusuke DOI [mailto:[email protected]]
Sent: den 13 mars 2013 12:00
To: Peter Waher
Cc: Peter Saint-Andre; XMPP Standards; Joachim Lindborg ([email protected])
Subject: Re: EXI extension proposal

Dear Peter,

(2013/03/13 23:10), Peter Waher wrote:
> Anybody interested in EXI & XMPP, please review. Any feedback is most welcome.

I think this idea of schema exchange is interesting. On the other hand, it may make confusion on management (explosion of the number of derived schemas).

If there is a way to name a XML schema defined in XEP, servers and clients can share them by the names.

Of course (1) this does not eliminate needs to upload XML schema because the end device may have non-XEP vendor specific extensions (2) we need secure channel to download XEP-defined schemas to avoid attacks.

Another pitfall: if we want to use bit-packed, we need to make stanzas encoded as self contained elements. Otherwise you cannot do 'fflush()' at the end of element of a stanza. However, self contained elements do not allow an encoder to re-use compression context (string tables) between outside of the element and the element itself. This means the encoder need to re-encode JID strings as is (otherwise you can just encode a string with few bytes of reference).

This may make the XMPP/EXI stream more inefficient compared to byte-aligned streams. My preliminary experiment shows following results.

[The number of bytes]
                             | bytes
-----------------------------+-------
XML                          | 3681
selfContained, bit-packed    | 1589
byte-aligned                 | 1358
-----------------------------+-------

Note: the results do not use schema-informed grammars to encode XEP-based elements, so compression ratio of Peter's proposal should be much better -- in my best scenario with schema-informed EXI, it will be 809 bytes (22% of original XML).

BTW, my initial idea is somewhat different. What I want to make is constrained XMPP clients (and if technically possible, servers) with static set of pre-compiled EXI grammars and without ability to talk with regular XML-based XMPP. This enables nodes with batteries to speak sensor data with narrow wireless link such as 15.4 or with 3G link charged by quantity. Maybe this idea is oriented towards SRV-based negotiation.

For a long-term approach, I think I can propose some updates to the EXI spec itself (I recently joined the W3C EXI working group). The EXI working group is now open to collecting requirements for EXI 2.0 (I already raised the fflush() issue). I believe this kind of collaboration will be very important to let more constrained IoT devices join the network.

Regards,
Yusuke
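
For reference, the byte counts reported in the experiment can be expressed as a percentage of the original XML size. This is a small sketch using only the four figures quoted in the message; the dictionary labels and the helper name are illustrative.

```python
# Byte counts as reported in the preliminary experiment.
sizes = {
    "XML": 3681,
    "selfContained, bit-packed": 1589,
    "byte-aligned": 1358,
    "schema-informed (best case)": 809,
}


def percent_of_original(encoded: int, original: int) -> float:
    """Encoded size as a percentage of the original size, to one decimal."""
    return round(100.0 * encoded / original, 1)


ratios = {name: percent_of_original(n, sizes["XML"]) for name, n in sizes.items()}

for name, pct in ratios.items():
    print(f"{name:30s} {sizes[name]:5d} bytes  {pct:5.1f}%")
```

The schema-informed figure comes out at about 22% of the XML size, matching the number given in the note, while the two schema-less encodings land at roughly 43% and 37%.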
