On Wed, 30 Jul 2008 07:04:16 -0600 Peter Saint-Andre <[EMAIL PROTECTED]> wrote:
> Pavel Simerda wrote:
> > On Tue, 29 Jul 2008 19:49:01 -0600
> > Peter Saint-Andre <[EMAIL PROTECTED]> wrote:
> >
> >> Ahoj Pavle!
> >>
> >> Pavel Simerda wrote:
> >>> Hello,
> >>>
> >>> I have some suggestions for XEP-0231 (Data Element).
> >> Thanks for looking at this spec so thoroughly.
> >>
> > I actually have some questions. First, lolek from the jabbim.cz
> > project is going to propose a XEP for text emoticons.
>
> Similar to XEP-0038? We can bring that back if someone wants to
> maintain it.

Similar but more powerful, and not file-based but most probably based
on Data Elements. There may be a lot of other extensive changes. If
these changes can be made, I believe Martin would maintain it if he
gets the chance.

> > I like his ideas but I
> > suggested that he use Data Element instead of a custom solution.
>
> +1
>
> > He still has doubts but I promised him to try to sort it out and to
> > help him with language corrections of his document too.
>
> Great, thanks.
>
> > I didn't find in the specs what should be used for domain ID in the
> > CID. The examples apparently use the domain part of the JID, which
> > is not unique per client. I looked at the RFC and still don't know a
> > proper mapping to XMPP.
> >
> > His original idea was to use a cryptographic hash function and not a
> > CID.
>
> I think your idea of a UUID followed by the domain part of the JID
> would work well.
>
> > He also pointed out he misses a feature that would allow a client to
> > advertise which mimetypes it supports.
>
> Yes we can add a disco feature for that.
>
> > This is another question... if it's just emoticons, should we just
> > support png and mng types or add some accept-advertisement facility?
>
> I don't think it hurts to define a way to advertise what MIME types
> you support.
> We'll use the data element for things other than
> emoticons, but IMHO the simplest approach would be to advertise in
> general which MIME types you support, not "I support these mime types
> for emoticons" and "I support these other mime types for file
> transfer thumbnails" etc. Does anyone think that level of complexity
> is needed?

I'm not sure. Let's wait for other comments.

> > Is there a written policy for image formats in XMPP extensions?
>
> Not yet.

PNG for static raster images, MNG for animated raster images, SVG for
vector images? That's something I would expect from every client.

> >>> Right now, as the example shows:
> >>>
> >>> <message from='[EMAIL PROTECTED]/castle'
> >>>          to='[EMAIL PROTECTED]'
> >>>          type='groupchat'>
> >>>   <body>Yet here's a spot.</body>
> >>>   <html xmlns='http://jabber.org/protocol/xhtml-im'>
> >>>     <body xmlns='http://www.w3.org/1999/xhtml'>
> >>>       <p>
> >>>         Yet here's a spot.
> >>>         <img alt='A spot'
> >>>              src='cid:[email protected]'/>
> >>>       </p>
> >>>     </body>
> >>>   </html>
> >>>   <data xmlns='urn:xmpp:tmp:data-element'
> >>>         alt='A spot'
> >>>         cid='[EMAIL PROTECTED]'
> >>>         type='image/png'>
> >>>     iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAABGdBTUEAALGP
> >>>     C/xhBQAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9YGARc5KB0XV+IA
> >>>     AAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAF1J
> >>>     REFUGNO9zL0NglAAxPEfdLTs4BZM4DIO4C7OwQg2JoQ9LE1exdlYvBBeZ7jq
> >>>     ch9//q1uH4TLzw4d6+ErXMMcXuHWxId3KOETnnXXV6MJpcq2MLaI97CER3N0
> >>>     vr4MkhoXe0rZigAAAABJRU5ErkJggg==
> >>>   </data>
> >>> </message>
> >>>
> >>> Note: in this particular example the data is very short; this may
> >>> not be the case in the real world, where people tend to ignore the
> >>> size of the data they send.
> >> Yes, that's just about the smallest image I could find.
> >> The spec says that the image should not be more than 8k (which is
> >> twice the suggested size of an IBB chunk) but we don't know if
> >> people will typically send images that are smaller or larger than
> >> 8k -- I think smaller but I don't know that yet.
> >>
> >
> > Might it be advertised by the client/server? And rejected if the
> > other party tries to send a bigger one (just to force them to fix
> > it)?
>
> I think that's handled at a different layer (e.g., rate limiting).
> But we do need to define better handling for stanzas that are too
> large (there is a proto-XEP about it but the Council didn't accept it
> and I never incorporated their feedback).
>

Hmm. I know that people at jabbim.cz use a roster-renaming utility
(for the ICQ transport). They wait a long time between stanzas, and
the renaming can often take more than just several minutes.

> >>> We send data once for every session (and omit it for subsequent
> >>> messages).
> >> In this case it's important to define "session" (see rfc321bis). Is
> >> it a chat session, a presence session, or something else?
> >>
> >
> > Exactly.
> >
> >>> This has two important implications:
> >>>
> >>> 1) The other entity may or may not cache it for the session and
> >>>    reuse it. That is good.
> >>>
> >>> 2) If an entity keeps the data for a longer time (e.g. for weeks
> >>>    or even permanently), this cache will never be used, as the
> >>>    sending entity always resends the data for a new session.
> >>>
> >>> What I propose is:
> >>>
> >>> * By default the sending entity would not send the data. It would
> >>>   merely reference it by its cid URL.
> >>> * Let the receiving client follow "3.4 Retrieving Uncached Media
> >>>   Data" if the data is not cached (no real change, this is already
> >>>   being done).
> >> I think I like that approach. It introduces a round trip for the
> >> IQ, which might introduce some latency.
> >> But it puts the burden for "storing" and "serving" the image on
> >> the sender, which might discourage abuse of in-band images.
> >>
> >>> * Reserve the possibility of sending the data immediately with
> >>>   the message for the *specific* case that the sending client
> >>>   actually knows the receiving party cannot have the data cached
> >>>   (e.g. the data was never sent before). This behavior should be
> >>>   considered optional.
> >> In that case the sender needs to keep a list of every JID to which
> >> it has ever sent the image. That seems suboptimal.
> >
> > I didn't write it exactly as I meant it. There may be cases where we
> > are knowingly sending something really new. But we might just as
> > well drop this feature if you think it's better.
>
> If it's optional, it does no great harm. In fact it's not even a
> feature, just an implementation note.

Ok.

> > I'm afraid some people will object.
>
> Don't be afraid -- some people will always object. :)
>

:D

> >> And I suppose the recipient might have received the image from
> >> another sender at some point, or might have received the image
> >> through other means (e.g., an emoticon "bundle").
> >
> > The problem is... that we really want the users to get what we send
> > them. If they got it from someone else, we need to secure it by a
> > hash function, not a mere ID. The client would have to actually
> > check the hash when caching.
>
> Isn't that a bit paranoid for something as lightweight as emoticon
> bundles?
>

The problem is that the Data Element could very soon be used for other
purposes. For me this is a grave security hole that might cause a real
headache in the future. But I'm more than just a bit paranoid :).
Working privacy and security is what originally brought me from ICQ to
Jabber... only then did I realize how cool it actually is in other
areas.

> > Another issue would be the particular hash functions.
> > Some client
> > authors or users may want to prevent using data from third parties
> > protected by weak hash functions.
> >
> > That's why I only considered caching per sender JID.
>
> I suppose caching per sender JID makes sense, yes.
>

I suggest this if we don't take the cryptographic way. Or we could
take both ways (let the implementors choose).

> > If we want to use hashes... and third party data, we should use some
> > specific "hostnames", possibly sha256.cid.xmpp.org for sha256 or
> > something like that.
>
> Sure. If desired.
>

It would be - for globally shared data, so that the IDs actually
match. The global-sharing feature should be optional anyway, so it can
be added at any time. No reason to defer implementations.

> >>> I further propose we add some informational section about
> >>> generation of CIDs. Although it's specified elsewhere, I believe
> >>> this XEP will be very useful and will be referenced from many
> >>> future XEPs (and maybe improved as well - possibly some server
> >>> caching etc). I think the informational section could suggest
> >>> UUIDs generated by hashing the actual content.
> >> Yes I think that would be helpful.
> >>
> >>> Another thing that could be considered... is to add some sort of
> >>> caching hint attribute that would suggest how long it's reasonable
> >>> to cache a particular resource.
> >> Do you think that would really be helpful? I'm still thinking about
> >> it...
> >>
> >
> > This feature would be optional, so it's easy to add it when we think
> > it's useful. Right now I have no idea :).
> >
> >>> Maybe we could borrow from HTTP Cookies
> >>> but allow (suggest) the clients to have some mechanisms for
> >>> limiting the time, size and number of cached objects.
> >>>
> >>> There are many possibilities, I will just describe one of them.
> >> Do you have examples of these?
> >>
> >
> > The attribute values could be stated more abstractly... like...
> > "session", "short", "medium", "long" with recommended defaults, for
> > example. But usually the sender knows better.
>
> Mimicking HTTP values is OK with me.
>

No problem for me either, we can just define the syntax.

> >>> cache="no"
> >>>  - no reason for caching; the file will not be used again
> >> Perhaps a thumbnail related to file transfer or some other
> >> ephemeral image?
> >>
> >>> cache="session"
> >>>  - we suggest the receiving party only caches for this
> >>>    particular session
> >> Perhaps also a thumbnail, or an image related to a whiteboarding
> >> session?
> >>
> >>> cache="12"
> >>>  - we suggest caching for twelve days from the last use of this
> >>>    cid (!)
> >>>  - for every use (received reference) the receiving client should
> >>>    reset the date we count from
> >> Perhaps images included in an XHTML notification from a blogging
> >> service or somesuch?
> >>
> >>> cache="unlimited"
> >>>  - we suggest the client picks the longest time it allows (it
> >>>    could possibly cache some small pieces of data permanently)
> >> Perhaps a commonly-used emoticon?
> >>
> >
> > Good use cases, thanks.
> >
> >>> Of course, the client MAY ignore the caching hint. In this case it
> >>> SHOULD NOT cache at all.
> >> Why not? My client could ignore caching hints because it has its
> >> own local policy (e.g. cache images only from people in my
> >> "Friends" group, but cache those forever because I want to keep
> >> them in message history). Or my client could ignore caching hints
> >> because it simply can't cache images (no room on the device, web
> >> client, etc.).
> >>
> >
> > I don't know, really :).
>
> Well it seems a bit strong to say you SHOULD NOT cache in those
> instances. Just leave it up to the implementation.

If we mimic HTTP even in this respect, a missing cache attribute would
mean session-only (possibly the other user's online session).

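To make the hint values above a bit more concrete, here is a rough
sketch of how a receiving client might turn the proposed cache
attribute into an expiry time. None of this is specified anywhere: the
function name, the MAX_RETENTION cap and the fallback for unknown
values are my assumptions, and what "session" means is left to the
caller because we have not defined it yet.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical client-side cap applied to long hints; not taken from any spec.
MAX_RETENTION = timedelta(days=365)


def expiry_for_hint(hint: Optional[str], last_use: datetime,
                    session_end: datetime) -> Optional[datetime]:
    """Return the time until which data may be kept, or None for "do not cache".

    hint        -- value of the proposed cache attribute, None if absent
    last_use    -- when this cid was last received or referenced
    session_end -- end of the current session, however the client defines it
    """
    if hint is None or hint == "session":
        # A missing attribute is treated as session-only (the HTTP-like default).
        return session_end
    if hint == "no":
        return None
    if hint == "unlimited":
        # "Longest time the client allows" is modelled as the local cap.
        return last_use + MAX_RETENTION
    if hint.isdecimal():
        # Days are counted from the last use, so each new reference
        # pushes the expiry forward.
        wanted = last_use + timedelta(days=int(hint))
        return min(wanted, last_use + MAX_RETENTION)
    # Unknown value: assumed fallback is the conservative session-only choice.
    return session_end
```

A client with its own local policy can of course ignore the result;
the sketch only shows what following the hint would look like.
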
> >>> If the cache attribute is not specified, we should decide on a
> >>> reasonable default value ('session' or '1' day both seem good to
> >>> me).
> >> I think that's up to the client.
> >>
> >
> > A reasonable default does no harm, does it? :)
>
> I suppose '1' day is OK, or 'session' if we define what we mean by
> that.
>

If we take the way of HTTP, this is a nonissue.

> Peter

--
Web: http://www.pavlix.net/
Jabber & Mail: pavlix(at)pavlix.net
OpenID: pavlix.net
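
P.S. A minimal sketch of the "UUID generated by hashing the actual
content, followed by the domain part of the JID" idea discussed above.
The function names are hypothetical, SHA-256 is only one possible
choice, and the UUID version number is set to 5 arbitrarily (RFC 4122
defines no SHA-256-based version), so treat this as an illustration
rather than a proposal for the exact format.

```python
import hashlib
import uuid


def content_cid(data: bytes, domain: str) -> str:
    """Build a cid of the form <uuid>@<domain> from the data itself."""
    digest = hashlib.sha256(data).digest()
    # Use the first 16 bytes of the digest as the UUID; version=5 only
    # fixes the version/variant bits so the result is a well-formed UUID.
    content_uuid = uuid.UUID(bytes=digest[:16], version=5)
    return f"{content_uuid}@{domain}"


def matches_cid(data: bytes, cid: str) -> bool:
    """Check received data against its cid before putting it in the cache."""
    _, _, domain = cid.partition("@")
    return content_cid(data, domain) == cid
```

Because the identifier is derived from the content, the same image
always gets the same cid for a given domain, and the receiver can
verify the data before caching it.
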
