On Monday, 12 February 2018 09:10:47 CET Jonas Wielicki wrote:
> On Monday, 12 February 2018 00:41:54 CET Christian Schudt wrote:
> > - Generally I am unsure if using the "xml:lang" and "name" from the
> > identities is a good idea at all, because these two attributes should not
> > change the capabilities of an entity. Name and language is just for
> > humans.
> > I.e. if a server sends German identities for one user and English
> > identities for the next user (depending on their client settings/stream
> > header), the server still has the same identities, which should result in
> > the same verification string, shouldn’t it?
>
> First of all, I think previously, an entity answering a disco#info request
> always sent all translated identities, so that would not have been an issue.
>
> You’re touching on a more general thing though which I’d like to discuss. We
> could separate the hash into three hashes, one for identities, one for
> features and one for forms (or maybe two: identities and forms+features).
>
> This has the upside that human readable identifiers don’t interfere with
> protocol data (features/forms) in many cases (I think the identities are
> more rarely used in protocols, but I might be wrong). The obvious downside
> is that we need to transfer more data in the presence (twice or thrice the
> amount for ecaps2).
>
> I’d like to know what you people think of it. Since this is still
> Experimental, I’d be fine with bumping the namespace and getting this done.
> But I’m afraid that the bandwidth costs will outweigh the advantages. We
> have ~100 bytes for a 256 bit hashsum (including wrapper XML). We would end
> up with more than half a kilobyte (~0.6 kB) for ecaps2 if we split the
> hashes and assume that each entity uses two hash functions with 256 bits
> each (which I think is a reasonable assumption). If we have caps
> optimization, the impact would probably be negligible, but I’m not sure if
> we can assume that.
>
> I’d like to get input from you folks on that.

I had some off-list input on this. First, Evgeny pointed out that the work
which is in progress on MUC bare-presence [1] has uncovered that caps don’t
really work well for the MUC case. A MUC's disco#info contains, for example,
the number of occupants currently in the room, which may fluctuate a lot (thus
causing lots of <presence/> traffic if caps cover the complete disco#info) [2].
Second, Florian Schmaus questioned my approach of splitting the hashes and
asked for use-cases where this makes sense. I think I can come up with two use
cases off the top of my head, both with varying relevance depending on which
metric you want to optimize.
- The MUC use case from above. Granted, this isn’t in any spec yet, but it
would be great to have. Daniel noted that having the disco#info form of
MUCs is useful to detect (a part of) the configuration which is relevant
to (IMO reasonable) UX choices in Conversations.
However, obviously if the occupant count is in there, the use of a caps
hash is rather defeated in this case.
- Clients sometimes include XEP-0232 (Software Information) and other forms
in their disco#info. This might be high-cardinality information which
may thrash (overload or fill up) entity caches.
I used the (a bit dated) capsdb [3] and ran the numbers:

    Total items in capsdb:        1602
    Distinct hashes:              1558 (i.e. XEP-0115/XEP-0390 as-is)
    Distinct identity+features:   1140
    Distinct forms:                450

This is less of a saving than I expected; however, the capsdb is rather
dated. I wonder whether the saving is larger nowadays if there are more
clients which implement XEP-0232 or other similar things.
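
For reference, the kind of counting behind these numbers can be sketched as
follows. This is a minimal sketch and not the script actually used: the record
layout (`identities`, `features`, `forms`), the `digest` helper and the
JSON-based serialisation are assumptions for illustration, and they are not
the hash input defined by XEP-0115 or XEP-0390.

```python
import hashlib
import json


def digest(obj):
    """Hash a JSON-serialisable object via a canonical (sorted-key) dump.

    Assumption: this stands in for the real caps hash input, which is
    defined differently in XEP-0115 and XEP-0390.
    """
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def count_distinct(records):
    """Return how many distinct hashes each splitting strategy yields."""
    full, ident_feat, forms = set(), set(), set()
    for rec in records:
        # Sort the lists so that element order does not affect the hash.
        identities = sorted(rec["identities"])
        features = sorted(rec["features"])
        rec_forms = sorted(rec["forms"], key=lambda f: json.dumps(f, sort_keys=True))
        full.add(digest([identities, features, rec_forms]))
        ident_feat.add(digest([identities, features]))
        forms.add(digest(rec_forms))
    return {
        "total": len(records),
        "distinct_full": len(full),
        "distinct_identity_features": len(ident_feat),
        "distinct_forms": len(forms),
    }


# Hypothetical sample data: two clients that differ only in their
# XEP-0232-style software version form.
SAMPLE = [
    {
        "identities": ["client/pc//ExampleClient"],
        "features": ["http://jabber.org/protocol/disco#info"],
        "forms": [{"FORM_TYPE": "urn:xmpp:dataforms:softwareinfo", "software_version": "1.0"}],
    },
    {
        "identities": ["client/pc//ExampleClient"],
        "features": ["http://jabber.org/protocol/disco#info"],
        "forms": [{"FORM_TYPE": "urn:xmpp:dataforms:softwareinfo", "software_version": "1.1"}],
    },
]

print(count_distinct(SAMPLE))
```

In the sample, the two entries collapse into one identity+features hash while
the full hash and the forms hash stay distinct, which is the effect the split
is meant to exploit.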
Splitting the hashes could also allow entities to explicitly opt out of one of
the two hashes; an entity with a disco#info form which changes in real time
could opt out of sending the form hash altogether (instead of sending a hash
equivalent to "no form"), thus signalling to peers that if disco#info form
data is desired, it needs to be queried freshly.
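
A receiving entity could treat the two hashes roughly as sketched below. This
only illustrates the opt-out idea; it is neither wire format nor specified
behaviour. The `SplitCaps` type, its field names and the
`needs_fresh_form_query` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SplitCaps:
    """Hypothetical parsed caps announcement with two separate hashes."""
    identity_features_hash: str
    # None means the entity opted out of announcing a form hash at all;
    # this is distinct from announcing the hash of "no forms".
    forms_hash: Optional[str] = None


def needs_fresh_form_query(caps, forms_cache):
    """Decide whether disco#info form data must be queried freshly.

    forms_cache maps a forms hash to previously retrieved form data.
    """
    if caps.forms_hash is None:
        # Opt-out: form data is volatile, so a cache can never answer it.
        return True
    return caps.forms_hash not in forms_cache


cache = {"hash-of-static-forms": ["<cached form data>"]}
muc = SplitCaps("hash-of-muc-features")  # volatile occupant count, no forms hash
client = SplitCaps("hash-of-client-features", "hash-of-static-forms")

print(needs_fresh_form_query(muc, cache))     # True
print(needs_fresh_form_query(client, cache))  # False
```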
All in all, I’m not sure if those two use-cases warrant the increase of
bandwidth use by a factor of approximately two for caps2.
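
The estimate can be reproduced with some back-of-the-envelope arithmetic. The
~100 bytes per 256-bit hash (including wrapper XML) and the two hash functions
per entity are the figures assumed earlier in this thread; the function name
is made up for illustration.

```python
BYTES_PER_HASH = 100   # ~100 bytes per 256-bit hash incl. wrapper XML (estimate from this thread)
HASH_FUNCTIONS = 2     # assumed number of hash functions each entity announces


def presence_overhead(parts):
    """Approximate caps bytes per presence when the hash is split into `parts`."""
    return parts * HASH_FUNCTIONS * BYTES_PER_HASH


for parts in (1, 2, 3):
    size = presence_overhead(parts)
    factor = size / presence_overhead(1)
    print(f"{parts} hash(es) per function: ~{size} bytes (factor {factor:.0f})")
```

Under these assumptions a two-way split costs roughly 400 bytes per presence
instead of 200, and a three-way split gives the ~0.6 kB mentioned earlier.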
I’m still hoping for more feedback on this, thanks!
kind regards,
Jonas
[1]: The idea is to let MUCs emit a presence from the bare JID after a
client has joined, to send that client caps, avatar info etc.
[2]: They work around that currently by not including the form in the caps
and omitting the form data from disco#info queries against caps
disco#info nodes.
[3]: https://github.com/xnyhps/capsdb/
