Dear Yusuke,

Thank you for your constructive comments. I'll try to address them one at a 
time:

> I think this idea of schema exchange is interesting. On the other hand, it 
> may make confusion on management (explosion of the number of derived schemas).

§3.3 in the XEP handles this. The server is free to add cache rules to avoid an 
explosion in the number of derived schemas. 
The server is also free to reject upload or download requests for any 
reason (§2.4).

> If there is a way to name a XML schema defined in XEP, servers and clients 
> can share them by the names.

The problem with this approach is that names seldom change, especially during 
development. And a slight change (a new attribute, a new element, etc.) will 
completely change the compression. Furthermore, errors produced in this way 
will be extremely difficult to debug. An efficient and fool-proof way to 
communicate using different schema versions (having the same namespace and 
schema IDs) is necessary.

> Of course (1) this does not eliminate needs to upload XML schema because the 
> end device may have non-XEP vendor specific extensions (2) we need secure 
> channel to download XEP-defined schemas to avoid attacks.

§2.4 also proposes the possibility to install such schema files manually on the 
server. The XEP allows for different scenarios.

> Another pitfall: if we want to use bit-packed, we need to make stanzas 
> encoded as self contained elements. Otherwise you cannot do 'fflush()'  at 
> the end of element of a stanza. However, self contained elements do not allow 
> an encoder to re-use compression context (string tables) between outside of 
> the element and the element itself. This means the encoder need to re-encode 
> JID strings as is (otherwise you can just encode a string with few bytes of 
> reference).

The EXI compression engine is assumed to work in XML fragment mode, where each 
stanza is compressed separately. I don't see why self-contained elements would 
be required in this case. Tables should not be reused between stanzas: 
per-stanza tables can be very small, whereas, as you point out in your 
example, the number of possible strings across stanzas may be large (for 
instance, many different JIDs). The number of possible strings within a single 
message is much smaller, which also makes references to the tables shorter 
within the bit-packed message.
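
As a rough illustration of the table-size argument above: in EXI, a reference 
to an existing string table entry is written as an n-bit unsigned integer, 
where n = ceil(log2(m)) and m is the number of entries in the partition. The 
sketch below only computes that width. The table sizes (8 entries for a 
per-stanza table, 10000 entries for a table shared across a long stream) are 
made-up numbers for illustration, not measurements.

```python
def reference_bits(table_size: int) -> int:
    """Width in bits of a compact identifier for a table with table_size entries.

    Computes ceil(log2(table_size)) with integer arithmetic to avoid
    floating-point rounding.
    """
    if table_size < 1:
        raise ValueError("table must contain at least one entry")
    return (table_size - 1).bit_length()


# Hypothetical sizes: a small per-stanza table versus a large shared table.
per_stanza_bits = reference_bits(8)
shared_stream_bits = reference_bits(10000)
print(per_stanza_bits, shared_stream_bits)
```

With these assumed sizes, a reference costs 3 bits in the per-stanza table and 
14 bits in the shared one. Whether this outweighs re-encoding a JID once per 
stanza depends on the actual traffic.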

Finally, it is up to the implementor how to set up the EXI compression engine. 
Some may feel bit-packed is better, others that byte-packed is better.

I'll add an implementation note in the XEP regarding this.

> Note: the results do not use schema-informed grammars to encode XEP-based 
> elements, so compression ratio of Peter's proposal should be much better -- 
> in my best scenario with schema-informed EXI, it will be
> 809 bytes (22% of original XML).

This is a vital aspect of this proposal. For sensor networks and IoT, 
especially wireless sensor networks, buffer size is the first priority. 
Therefore, EXI compression should be done with as much information about 
schemas as possible.

> BTW, my initial idea is somewhat different. What I want to make is 
> constrained XMPP clients (and if technically possible, servers) with static 
> set of pre-compiled EXI grammars and without ability to talk with regular 
> XML-based XMPP. This enables nodes with batteries to speak sensor data with 
> narrow wireless link such as 15.4 or with 3G link charged by quantity. Maybe 
> this idea is oriented towards SRV-based negotiation.

We also see this as an important aspect of this proposal: most sensors will 
have pre-compiled code, often (semi-)automatically generated from schema files, 
for compression and decompression of EXI content. Therefore, the proposal 
includes the possibility for the client to reject the connection if parameters 
are not as expected.
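
A minimal sketch of such a check on a device with a fixed, pre-compiled 
configuration could look as follows. The parameter names ("alignment", 
"strict", "schema_hash") and their values are hypothetical placeholders chosen 
for illustration; they are not taken from the XEP.

```python
def mismatched_parameters(expected: dict, offered: dict) -> list:
    """Return the sorted names of parameters whose values differ."""
    names = sorted(set(expected) | set(offered))
    return [name for name in names if expected.get(name) != offered.get(name)]


def accept_setup(expected: dict, offered: dict) -> bool:
    """Accept only if the offered setup matches the compiled-in one exactly."""
    return not mismatched_parameters(expected, offered)


# Hypothetical compiled-in configuration and two offers from a server.
COMPILED_IN = {"alignment": "bit-packed", "strict": True, "schema_hash": "abc123"}
good_offer = dict(COMPILED_IN)
bad_offer = dict(COMPILED_IN, alignment="byte-packed")

print(accept_setup(COMPILED_IN, good_offer))
print(mismatched_parameters(COMPILED_IN, bad_offer))
```

A client that gets a non-empty mismatch list would close the connection rather 
than try to adapt, since it cannot regenerate its grammars at run time.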

Sincerely,
Peter Waher

-----Original Message-----
From: Yusuke DOI [mailto:[email protected]] 
Sent: den 13 mars 2013 12:00
To: Peter Waher
Cc: Peter Saint-Andre; XMPP Standards; Joachim Lindborg 
([email protected])
Subject: Re: EXI extension proposal

Dear Peter,

(2013/03/13 23:10), Peter Waher wrote:
> Anybody interested in EXI & XMPP, please review. Any feedback is most welcome.

I think this idea of schema exchange is interesting. On the other hand, it may 
make confusion on management (explosion of the number of derived schemas). If 
there is a way to name a XML schema defined in XEP, servers and clients can 
share them by the names. Of course (1) this does not eliminate needs to upload 
XML schema because the end device may have non-XEP vendor specific extensions 
(2) we need secure channel to download XEP-defined schemas to avoid attacks.

Another pitfall: if we want to use bit-packed, we need to make stanzas encoded 
as self contained elements. Otherwise you cannot do 'fflush()' 
at the end of element of a stanza. However, self contained elements do not 
allow an encoder to re-use compression context (string tables) between outside 
of the element and the element itself. This means the encoder need to re-encode 
JID strings as is (otherwise you can just encode a string with few bytes of 
reference). This may make the XMPP/EXI stream more inefficient compared to 
byte-aligned streams. My preliminary experiment shows following results.

[The number of bytes]
                                 bytes
-----------------------------+-------
XML                          |  3681
selfContained, bit-packed    |  1589
byte-aligned                 |  1358
-----------------------------+-------

Note: the results do not use schema-informed grammars to encode XEP-based 
elements, so compression ratio of Peter's proposal should be much better -- in 
my best scenario with schema-informed EXI, it will be
809 bytes (22% of original XML).
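
The ratios implied by the byte counts above can be checked with a short 
script. It uses only the four figures quoted in this message; the labels are 
shorthand and the script is just a convenience for the arithmetic.

```python
XML_BYTES = 3681

# Encoded sizes in bytes, as reported in this message.
ENCODED_BYTES = {
    "selfContained, bit-packed": 1589,
    "byte-aligned": 1358,
    "schema-informed (best scenario)": 809,
}


def percent_of_original(encoded: int, original: int = XML_BYTES) -> float:
    """Encoded size as a percentage of the original XML size, to one decimal."""
    return round(100.0 * encoded / original, 1)


for label, size in ENCODED_BYTES.items():
    print(f"{label}: {size} bytes, {percent_of_original(size)}% of XML")
```

This gives about 43.2% for selfContained bit-packed, 36.9% for byte-aligned, 
and 22.0% for the schema-informed case.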



BTW, my initial idea is somewhat different. What I want to make is constrained 
XMPP clients (and if technically possible, servers) with static set of 
pre-compiled EXI grammars and without ability to talk with regular XML-based 
XMPP. This enables nodes with batteries to speak sensor data with narrow 
wireless link such as 15.4 or with 3G link charged by quantity. Maybe this idea 
is oriented towards SRV-based negotiation.

For a longer-term approach, I think I can propose some updates to the EXI spec 
itself (I recently joined the W3C EXI working group). The EXI working group is 
now open to collecting requirements for EXI 2.0 (I already raised the fflush() 
issue). I believe this kind of collaboration will be very important to let more 
constrained IoT devices join the network.

Regards,

Yusuke

