On Fri, 15 Aug 2008 12:57:12 -0600 Peter Saint-Andre <[EMAIL PROTECTED]> wrote:
> Pavel Simerda wrote:
> > On Thu, 14 Aug 2008 10:38:42 -0600
> > Peter Saint-Andre <[EMAIL PROTECTED]> wrote:
> >
> >> Pavel Simerda wrote:
> > <snip/>
> >
> >>>> Since developers of Jabbim would like to use BoB for emoticons
> >>>> exchange (possibly in a way similar to the one used by Pidgin
> >>>> developers, might be neat to spec it out sometime), I'd like to
> >>>> ask whether it would be possible to reconsider using hashes
> >>>> instead of the UUID for identification.
> >> We use hashes in XEP-0084 (User Avatar):
> >>
> >> http://www.xmpp.org/extensions/xep-0084.html
> >>
> >> So it might make sense to use them here as well.
> >
> > Yep, they would be good to incorporate in ContentID. Btw, the hash
> > would be enough itself (without CID) but we want to use CID URIs.
>
> I agree that hashes would be enough (as in XEP-0084), but here we
> want to use CIDs for cross-referencing.

Btw, something I didn't know before: I have looked into the CID/MID RFC
and there's nothing in it that requires the at-sign. It's only mentioned
in the common-practice sections, but there they do use it. And they do
use local hostnames, not shared strings.

But then "xmpp.sha1.da39aee5e6b4b0d3255bfef95601890afd807099" (or a
similar syntax) is just as conforming as any other syntax.

The interesting point of the RFC is that CIDs must be globally unique,
but it apparently leaves it to the implementors to be clever enough not
to come up with the same idea.

It depends on whether you want to break common practice.

> I see a bit of information about that
> >> in RFC4122 but not a lot of details.

It's all there; the notion of "names" or content is apparently left for
the particular protocols/formats to define.

> > They are only hash-based and they are (hopefully) unique to a
> > particular sequence of bytes. They don't serve the same purpose as
> > e.g. sha1 mostly because the full sha1 hash doesn't fit in UUID.
>
> I'll have to see where those are specified.
>
> <snip/>

http://tools.ietf.org/html/rfc4122

"4.3. Algorithm for Creating a Name-Based UUID"

"The concept of name and name space should be broadly construed, and
not limited to textual names."

That means the name can even be a whole file or a combination of fields
(e.g. the content type and the content). But hashes are the better way
if we want to actually check them.

> >>> 2) Add one additional form of CID:
> >>> "hash-value"@"hash-function".xep0231.xmpp.org.
> >>> (the concrete syntax serves as an example, not final syntax)
> >>> The hash functions would be "sha1", "sha256" and possibly other
> >>> ones too and the computed hash value would be based on both
> >>> *content type* and *content data* (needs more precise spec.).
> >>> This would also be an exception from forced "per JID" caching.
> >> Or if people define emoticon bundles then the images could be
> >> identified by the domain of the entity that hosts the bundle,
> >> perhaps an open-source project or whatever.
> >>
> >> /psa
> >
> > This would break the hash function use case. We want same data
> > (including type) to have same ContentID uri for global sharing
> > (most probably with a constant domain part).
>
> I don't see any big difference between:
>
> cid:[email protected]
>
> and:
>
> cid:[email protected]
>
> or:
>
> cid:[email protected]
>
> or whatever.
>
> Why do we need centralization of the address space?
>
> /psa
>

The hostname is simply useless for XMPP purposes. But if we keep it for
the sake of common practice, I'd suggest a constant one (as it's useless
anyway).

If we need metadata to specify the origin, we can add an optional
metadata element inside the <data/>.

Pavel

--
Web: http://www.pavlix.net/
Jabber & Mail: pavlix(at)pavlix.net
OpenID: pavlix.net
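
For illustration only, the two identification schemes discussed in this
thread can be compared with a rough Python sketch. It builds a CID in the
example form "hash-value"@"hash-function".xep0231.xmpp.org and an RFC 4122
name-based (SHA-1, version 5) UUID over the same bytes. Several things
here are assumptions and not part of any spec: the helper names hash_cid
and name_based_uuid, the NUL byte used to join content type and content
data (the thread only says this "needs more precise spec."), and the use
of uuid.NAMESPACE_URL as a stand-in name space.

```python
import hashlib
import uuid


def hash_cid(content_type: str, data: bytes, algo: str = "sha1") -> str:
    """Return a CID in the example form hash-value@hash-function.xep0231.xmpp.org."""
    digest = hashlib.new(algo)
    # Assumption: content type and content data are joined with a NUL byte.
    # The thread only says the hash should cover both of them.
    digest.update(content_type.encode("utf-8") + b"\x00" + data)
    return f"cid:{digest.hexdigest()}@{algo}.xep0231.xmpp.org"


def name_based_uuid(namespace: uuid.UUID, name: bytes) -> uuid.UUID:
    """RFC 4122 section 4.3 with SHA-1 (version 5) over an arbitrary byte 'name'."""
    full_hash = hashlib.sha1(namespace.bytes + name).digest()  # 20 bytes
    # Only the first 16 bytes fit into a UUID, and 6 of those bits are then
    # overwritten by the version and variant fields.
    return uuid.UUID(bytes=full_hash[:16], version=5)


if __name__ == "__main__":
    image = b"\x89PNG example bytes"  # placeholder content, not a real image
    print(hash_cid("image/png", image))
    # NAMESPACE_URL is only a stand-in; a real protocol would define its own.
    print(name_based_uuid(uuid.NAMESPACE_URL, b"image/png\x00" + image))
```

The UUID keeps only part of the SHA-1 output, so a receiver cannot compare
it against a plain sha1 of the data without repeating the whole
name-based construction, whereas the hash-based CID carries the complete
digest and can be checked directly.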
