On Monday, 12 February 2018 09:10:47 CET Jonas Wielicki wrote:
> On Monday, 12 February 2018 00:41:54 CET Christian Schudt wrote:
> > - Generally I am unsure if using the "xml:lang" and "name" from the
> >   identities is a good idea at all, because these two attributes should
> >   not change the capabilities of an entity. Name and language are just
> >   for humans.
> >   I.e. if a server sends German identities for one user and English
> >   identities for the next user (depending on their client
> >   settings/stream header), the server still has the same identities,
> >   which should result in the same verification string, shouldn’t it?
>
> First of all, I think previously, an entity answering a disco#info request
> always sent all translated identities, so that would not have been an
> issue.
>
> You’re touching on a more general thing though which I’d like to discuss.
> We could separate the hash into three hashes, one for identities, one for
> features and one for forms (or maybe two: identities and forms+features).
>
> This has the upside that human-readable identifiers don’t interfere with
> protocol data (features/forms) in many cases (I think the identities are
> more rarely used in protocols, but I might be wrong). The obvious downside
> is that we need to transfer more data in the presence (twice or thrice the
> amount for ecaps2).
>
> I’d like to know what you people think of it. Since this is still
> Experimental, I’d be fine with bumping the namespace and getting this done.
> But I’m afraid that the bandwidth costs will outweigh the advantages. We
> have ~100 bytes for a 256-bit hashsum (including wrapper XML). We would end
> up with more than half a kilobyte (~0.6 kB) for ecaps2 if we split the
> hashes and assume that each entity uses two hash functions with 256 bits
> each (which I think is a reasonable assumption). If we have caps
> optimization, the impact would probably be negligible, but I’m not sure if
> we can assume that.
>
> I’d like to get input from you folks on that.
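
To make the bandwidth estimate quoted above easier to follow, here is a rough back-of-the-envelope sketch. The two constants are the figures from the quoted mail (~100 bytes per 256-bit hash including wrapper XML, two hash functions per entity); the helper name presence_overhead is made up for illustration, and the result ignores any wrapper XML that several hashes might share.

```python
# Rough per-presence overhead, using the estimates from the quoted mail.
BYTES_PER_HASH = 100   # ~100 bytes per 256-bit hash, wrapper XML included
HASH_FUNCTIONS = 2     # assumption: each entity announces two hash functions


def presence_overhead(hash_groups: int,
                      hash_functions: int = HASH_FUNCTIONS,
                      bytes_per_hash: int = BYTES_PER_HASH) -> int:
    """Approximate bytes of caps data per <presence/> (hypothetical helper)."""
    return hash_groups * hash_functions * bytes_per_hash


# 1 group = one hash over everything (as-is), 2 = identities+features / forms,
# 3 = identities / features / forms.
for groups in (1, 2, 3):
    print(f"{groups} hash group(s): ~{presence_overhead(groups)} bytes")
```

With three hash groups this gives the ~0.6 kB mentioned above; with two groups it is roughly double the current amount.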
I had some off-list input on this.

First, Evgeny pointed out that the work in progress on MUC bare-presence [1]
has uncovered that caps don’t really work well for the MUC case. A MUC’s
disco#info contains, for example, the number of occupants currently in the
room, which may fluctuate a lot (thus causing lots of <presence/> traffic if
caps are used completely) [2].

Second, Florian Schmaus questioned my approach of splitting the hashes and
asked for use cases where this makes sense.

I can come up with two use cases off the top of my head, both with varying
relevance depending on which metric you want to optimize.

- The MUC use case from above. Granted, this isn’t in any spec yet, but it
  would be great to have. Daniel noted that having the disco#info form of
  MUCs is useful to detect (a part of) the configuration which is relevant
  to (IMO reasonable) UX choices in Conversations. However, if the occupant
  count is in there, the use of a caps hash is rather defeated in this case.

- Clients sometimes include XEP-0232 (Software Information) and other forms
  in their disco#info. This might be high-cardinality information which may
  thrash (overload/fill) entity caches. I used the (a bit dated) capsdb [3]
  and ran the numbers:

    Total items in capsdb:       1602
    Distinct hashes:             1558  (i.e. XEP-0115/XEP-0390 as-is)
    Distinct identity+features:  1140
    Distinct forms:               450

  This is less of a saving than I expected; however, the capsdb is rather
  dated. I wonder whether the saving is larger nowadays if there are more
  clients which implement XEP-0232 or similar things.

Splitting the hashes could also allow entities to explicitly opt out of one
of the two hashes: an entity with a disco#info form which changes in real
time could opt out of sending the form hash altogether (instead of sending a
hash equivalent to "no form"), thus signalling to peers that if disco#info
form data is desired, it needs to be queried freshly.
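
To illustrate the split and the opt-out described above, here is a minimal sketch. It is not the XEP-0115 or XEP-0390 hash input algorithm: the serialization (sorted parts joined with a separator byte), the function names and the three sample entities are all made up for illustration. It only shows how identity+features and forms could be hashed separately, how a missing form hash could signal "query freshly", and how the distinct counts are obtained.

```python
import hashlib
from typing import List, Optional, Tuple

SEP = b"\x1f"  # arbitrary separator, chosen for this sketch only


def _digest(parts: List[str]) -> str:
    """SHA-256 over sorted, separator-joined parts (simplified, not XEP-0390)."""
    data = SEP.join(sorted(p.encode("utf-8") for p in parts))
    return hashlib.sha256(data).hexdigest()


def combined_hash(identities: List[str], features: List[str],
                  forms: List[str]) -> str:
    """One hash over everything, analogous to the as-is behaviour."""
    return _digest(["i:" + x for x in identities]
                   + ["f:" + x for x in features]
                   + ["x:" + x for x in forms])


def split_hashes(identities: List[str], features: List[str], forms: List[str],
                 include_forms: bool = True) -> Tuple[str, Optional[str]]:
    """Return (identity+features hash, form hash or None if opted out)."""
    base = _digest(["i:" + x for x in identities]
                   + ["f:" + x for x in features])
    form_hash = _digest(["x:" + x for x in forms]) if include_forms else None
    return base, form_hash


# Made-up sample data: A and B differ only in their software version form.
DISCO = "http://jabber.org/protocol/disco#info"
entities = [
    (["client/pc"], [DISCO, "urn:xmpp:ping"], ["software_version=1.0"]),
    (["client/pc"], [DISCO, "urn:xmpp:ping"], ["software_version=1.1"]),
    (["client/phone"], [DISCO], ["software_version=1.0"]),
]

distinct_combined = {combined_hash(*e) for e in entities}
distinct_base = {split_hashes(*e)[0] for e in entities}
distinct_forms = {split_hashes(*e)[1] for e in entities}

print(len(distinct_combined), len(distinct_base), len(distinct_forms))

# An entity whose form changes in real time omits the form hash entirely.
base_only, no_form = split_hashes(*entities[0], include_forms=False)
```

In this toy data set the two entities that differ only in their form share one identity+features hash, which is the effect the capsdb numbers above measure on real data.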

All in all, I’m not sure if those two use cases warrant the increase of
bandwidth use by a factor of approximately two for caps2. I’m still hoping
for more feedback on this, thanks!

kind regards,
Jonas

   [1]: The idea is to let MUCs emit a presence from the bare JID after the
        client joined to send them caps and avatar info etc.
   [2]: They work around that currently by not including the form in the
        caps and omitting the form data from disco#info queries against
        caps disco#info nodes.
   [3]: https://github.com/xnyhps/capsdb/
_______________________________________________
Standards mailing list
Info: https://mail.jabber.org/mailman/listinfo/standards
Unsubscribe: standards-unsubscr...@xmpp.org
_______________________________________________