Good argument Bob... ok... stewing over this a bit more. I generally dislike having to do additional parsing of attribute/element values but there are very good reasons for keeping this as a single "hash" attribute and you make a compelling case.
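For what it's worth, the extra parsing is small. Here is a rough sketch, assuming the comma-separated "algo:value" form Bob proposes in the quoted message below. The function names, the lowercasing of algorithm names, and the sample digests are my own placeholders, not anything defined by a draft:

```python
def parse_hash_attribute(value):
    """Split 'algo:digest, algo:digest' into a dict keyed by algorithm name.

    Assumption: algorithm names are case-insensitive and digests contain
    neither ',' nor whitespace. Nothing in the thread fixes these rules.
    """
    hashes = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma or doubled separators
        algo, sep, digest = item.partition(":")
        algo, digest = algo.strip().lower(), digest.strip()
        if not sep or not algo or not digest:
            raise ValueError(f"malformed hash entry: {item!r}")
        hashes[algo] = digest
    return hashes


def storage_key(algo, digest):
    """Build a single-string key in the 'gost.123...456' style Bob mentions."""
    return f"{algo.lower()}.{digest}"


if __name__ == "__main__":
    # Placeholder digests, for illustration only.
    parsed = parse_hash_attribute("gost:123456, md5:0928aa, SHA256:0980bb")
    for algo, digest in parsed.items():
        print(storage_key(algo, digest))
```

A consumer that only understands one algorithm can look up that key and ignore the rest, which is the "variety and multiplicity" case without any per-algorithm attributes.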
On Fri, May 14, 2010 at 1:26 PM, Bob Wyman <[email protected]> wrote:
> James Snell <[email protected]> wrote:
>> <link href="foo" md5="abc...xyz">
>> <media:hash algo="GOST">123...456</media:hash>
>> </link>
>
> The alternative approach, which would support both a variety
> and multiplicity of hashes would look like this:
> <link href="foo" hash="gost:123123..., md5:0928402948...,
> sha256:098078097..."/>
> This strikes me as "simpler" than the hybrid approach. Just a few of my
> concerns with the proposed "hybrid" approach follow:
>
> I like binding the algorithm and value together into a single value since I
> know of no compelling case for processing one element in isolation of the
> other. The hash value only makes sense if you know the algorithm and the
> algorithm is only useful when bound to a specific hash value. Thus, it
> strikes me as simply introducing syntactic sugar to specify the algorithm
> using a different XML component than the value.
>
> These values are likely to be stored in databases and otherwise manipulated.
> In all cases, for the data to be meaningful, people will need to keep the
> binding between algorithm and hash value. It is likely that storing a single
> string value is going to be easier for folk than dealing with a multi-part
> value. Also, consider the effect of parsers... It is likely that in order to
> transfer a value from an entry into a database field, what you'll need to do
> is extract both algorithm and hash value from the parse tree and then
> construct some string that combines them. This would be particularly useful
> if you want to use the hash value as a database key (a very reasonable thing
> to do...) You could build and store the string "algo='GOST'>123...455<" or
> your database might support concatenated fields, or you could build
> "gost.123...456". I think I would go with the latter.
>
> Defining distinct attributes for each hash algorithm pushes unnecessary
> syntactical complexity to the global level and thus increases the complexity
> not only of the specification but also of all applications no matter which
> algorithms they understand or if they understand any at all. It also makes
> extending the list of supported algorithms "expensive" since such extensions
> require modification to the standard rather than just a registry entry. What
> benefit do we get from having these algorithm types defined at the global
> syntax level?
>
> The hybrid approach looks very complicated to me. It means that I'll have
> two very different places in which hash values might be found and two very
> different syntaxes for expressing them. The result is going to be more
> complex code than would otherwise be the case. What value comes from using
> the hybrid approach?
>
> One argument for hybrid is that these elements exist already in other specs.
> I wonder if it isn't possible that those other specs might have approached
> the problem in a non-optimal fashion. Does it really make sense to import
> syntax if there isn't a really good case that demonstrates that doing so is
> the best approach?
>
> I am unaware of any hash algorithms that need anything other than the
> specification of the algorithm and the value in order to be useful. If there
> were broadly used algorithms that had more complex meta-data requirements,
> it would be easier to understand the appeal of the hybrid approach.
>
> I can't think of any reason why it is *useful* to separate the algorithm
> from the hash value. Can someone enlighten me here? What computation,
> storage or communication task becomes easier if you have these two
> separated?
>
> bob wyman
>
> On Fri, May 14, 2010 at 3:06 PM, James Snell <[email protected]> wrote:
>>
>> Ok, I've been giving this some more thought and I think a hybrid
>> approach works very well.
>> As has been pointed out a number of times in
>> this thread, there are existing elements in other namespaces that
>> provide an algorithm/hash pairing. I think that the Link Extensions
>> Draft can provide attributes for the most basic hash algorithms and
>> applications that require hash algorithms that are not covered can
>> fall back to the extension elements.
>>
>> e.g.
>>
>> <link href="foo" md5="abc...xyz">
>> <media:hash algo="GOST">123...456</media:hash>
>> </link>
>>
>> This would allow for the most common cases to be easily covered while
>> allowing for the full range of possible cases to be handled as well.
>>
>> - James
>>
>> On Wed, May 12, 2010 at 8:50 PM, Richard Salz <[email protected]> wrote:
>> >> So the key question is: what are the main algorithms we need to
>> >> provide attributes for?
>> >
>> > This is a hard question to answer -- especially for hash/digest
>> > algorithms which tend to fall more rapidly than vetted crypto
>> > algorithms.
>> >
>> > It's more verbose, but I strongly recommend using a pair of attributes
>> > to represent algorithm/value. Use the URIs defined in the latest XML
>> > DSIG document, perhaps with the "trick" that relative URIs are a
>> > shorthand for the xmldsig namespace.
>> >
>> > /r$
>> >
>> > --
>> > STSM, WebSphere Appliance Architect
>> > https://www.ibm.com/developerworks/mydeveloperworks/blogs/soma/
>>
>> --
>> - James Snell
>> http://www.snellspace.com
>> [email protected]

--
- James Snell
http://www.snellspace.com
[email protected]
