Re: Link Extensions. Need "md5" or some kind of hash.

James Snell Sun, 16 May 2010 20:41:58 -0700

Ok, although I seriously dislike having to do additional parsing on
attribute values, the arguments made so far are valid and parsing hex
encoded hash digests is -- fortunately -- quite simple to do. So let's
go with the following syntax...


  hash = attribute hash { hash-list }
  hash-list = # ( token ":" 1*HEX )

The token and HEX productions are defined by RFC2616...

The spec would defer to the existing IANA registry for hash functions
to define the "tokens".

This would result in a syntax of...

  hash="md5:abc...xyz, sha-1:123...567, sha-512:xyz...abc"
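A rough sketch of how a consumer might parse such a value, assuming the
"#" list rule and the token and HEX productions from RFC 2616. The
function name parse_hash_attribute and the choice to lowercase both
parts are illustrative assumptions, not something the draft specifies.

```python
import re

# token per RFC 2616: one or more US-ASCII characters, excluding CTLs
# and separators. HEX is 0-9, A-F, a-f.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
ENTRY = re.compile(r"(" + TOKEN + r"):([0-9A-Fa-f]+)")


def parse_hash_attribute(value):
    """Map each algorithm token to its hex digest.

    Hypothetical helper. Lowercasing the token and the digest is an
    assumption made here so that lookups are case-insensitive.
    """
    digests = {}
    for element in value.split(","):
        element = element.strip(" \t\r\n")
        if not element:
            continue  # the RFC 2616 "#" rule tolerates empty list elements
        match = ENTRY.fullmatch(element)
        if match is None:
            raise ValueError("malformed hash entry: %r" % element)
        name, digest = match.groups()
        digests[name.lower()] = digest.lower()
    return digests
```

Given the digests from Sam's example, joined with commas,
parse_hash_attribute("md5:6705f99eccedeac20e969bef954c5fb0,
sha-1:bc608e6d3d339d1a7afc406a7ea6a8f07358038b") yields a dict with the
keys "md5" and "sha-1". A consumer can then pick the strongest algorithm
it understands and ignore the rest.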

Does this syntax seem acceptable to everyone?

- James

On Sat, May 15, 2010 at 11:46 PM, Sam Johnston <[email protected]> wrote:
> James,
> In consideration of former (CRC) and future (AHS) hashing functions I think
> it's critical to support extensibility and multiple hashes. I like that XML
> digsigs use anyURIs to identify hashes (e.g. <DigestMethod
> Algorithm="http://www.w3.org/2000/09/xmldsig#sha1">), but one could argue
> this unnecessarily complicates what should be a simple syntax.
> I was about to propose an IANA registry for hash functions but one already
> exists (Hash Function Textual Names as specified by RFC4572) so it would
> make sense to use it rather than inventing our own mechanism - even if we
> have to update the registry rules to allow for algorithms specified by URI
> rather than RFC.
> While Atom is an XML format and should arguably follow XML conventions,
> there is precedent for prefixing hashes with the name of the hashing
> function using e.g. colons or curly braces. I think it's more important to
> keep the XML syntax simple and in any case the hash and hash function should
> be tightly bound as they are useless independently.
> All that considered, I think the best approach is to allow for a
> multi-valued "hash" attribute a la:
> <link rel="alternate" href="http://example.com/"
> hash="md5:6705f99eccedeac20e969bef954c5fb0
> sha-1:bc608e6d3d339d1a7afc406a7ea6a8f07358038b" />
> and/or
> <link rel="alternate" href="http://example.com/thing.pdf"
> hash="md5:6705f99eccedeac20e969bef954c5fb0"
> hash="sha-1:bc608e6d3d339d1a7afc406a7ea6a8f07358038b" />
> Sam
> Google
> On Sat, May 15, 2010 at 1:15 AM, James Snell <[email protected]> wrote:
>>
>> Good argument Bob... ok... stewing over this a bit more. I generally
>> dislike having to do additional parsing of attribute/element values
>> but there are very good reasons for keeping this as a single "hash"
>> attribute and you make a compelling case.
>>
>> On Fri, May 14, 2010 at 1:26 PM, Bob Wyman <[email protected]> wrote:
>> > James Snell <[email protected]> wrote:
>> >> <link href="foo" md5="abc...xyz">
>> >>  <media:hash algo="GOST">123...456</media:hash>
>> >> </link>
>> >
>> > The alternative approach, which would support both a variety
>> > and multiplicity of hashes would look like this:
>> > <link href="foo" hash="gost:123123..., md5:0928402948...,
>> > sha256:098078097..."/>
>> > This strikes me as "simpler" than the hybrid approach. Just a few of my
>> > concerns with the proposed "hybrid" approach follow:
>> >
>> > I like binding the algorithm and value together into a single value
>> > since I know of no compelling case for processing one element in
>> > isolation of the other. The hash value only makes sense if you know the
>> > algorithm and the algorithm is only useful when bound to a specific
>> > hash value. Thus, it strikes me as simply introducing syntactic sugar
>> > to specify the algorithm using a different XML component than the
>> > value.
>> > These values are likely to be stored in databases and otherwise
>> > manipulated. In all cases, for the data to be meaningful, people will
>> > need to keep the binding between algorithm and hash value. It is likely
>> > that storing a single string value is going to be easier for folk than
>> > dealing with a multi-part value. Also, consider the effect of
>> > parsers... It is likely that in order to transfer a value from an entry
>> > into a database field, what you'll need to do is extract both algorithm
>> > and hash value from the parse tree and then construct some string that
>> > combines them. This would be particularly useful if you want to use the
>> > hash value as a database key (a very reasonable thing to do...) You
>> > could build and store the string "algo='GOST'>123...455<" or your
>> > database might support concatenated fields, or you could build
>> > "gost.123...456". I think I would go with the latter.
>> > Defining distinct attributes for each hash algorithm pushes unnecessary
>> > syntactical complexity to the global level and thus increases the
>> > complexity not only of the specification but also of all applications
>> > no matter which algorithms they understand or if they understand any at
>> > all. It also makes extending the list of supported algorithms
>> > "expensive" since such extensions require modification to the standard
>> > rather than just a registry entry. What benefit do we get from having
>> > these algorithm types defined at the global syntax level?
>> > The hybrid approach looks very complicated to me. It means that I'll
>> > have two very different places in which hash values might be found and
>> > two very different syntaxes for expressing them. The result is going to
>> > be more complex code than would otherwise be the case. What value comes
>> > from using the hybrid approach?
>> > One argument for hybrid is that these elements exist already in other
>> > specs. I wonder if it isn't possible that those other specs might have
>> > approached the problem in a non-optimal fashion. Does it really make
>> > sense to import syntax if there isn't a really good case that
>> > demonstrates that doing so is the best approach?
>> > I am unaware of any hash algorithms that need anything other than the
>> > specification of the algorithm and the value in order to be useful. If
>> > there were broadly used algorithms that had more complex meta-data
>> > requirements, it would be easier to understand the appeal of the hybrid
>> > approach.
>> > I can't think of any reason why it is *useful* to separate the algorithm
>> > from the hash value. Can someone enlighten me here? What computation,
>> > storage or communication task becomes easier if you have these two
>> > separated?
>> >
>> > bob wyman
>> > On Fri, May 14, 2010 at 3:06 PM, James Snell <[email protected]> wrote:
>> >>
>> >> Ok, I've been giving this some more thought and I think a hybrid
>> >> approach works very well. As has been pointed out a number of times in
>> >> this thread, there are existing elements in other namespaces that
>> >> provide an algorithm/hash pairing. I think that the Link Extensions
>> >> Draft can provide attributes for the most basic hash algorithms and
>> >> applications that require hash algorithms that are not covered can
>> >> fall back to the extension elements.
>> >>
>> >> e.g.
>> >>
>> >> <link href="foo" md5="abc...xyz">
>> >>  <media:hash algo="GOST">123...456</media:hash>
>> >> </link>
>> >>
>> >> This would allow for the most common cases to be easily covered while
>> >> allowing for the full range of possible cases to be handled as well.
>> >>
>> >> - James
>> >>
>> >> On Wed, May 12, 2010 at 8:50 PM, Richard Salz <[email protected]> wrote:
>> >> >> So the key question is: what are the main algorithms we need to
>> >> >> provide attributes for?
>> >> >
>> >> > This is a hard question to answer -- especially for hash/digest
>> >> > algorithms which tend to fall more rapidly than vetted crypto
>> >> > algorithms.
>> >> >
>> >> > It's more verbose, but I strongly recommend using a pair of
>> >> > attributes to represent algorithm/value. Use the URIs defined in the
>> >> > latest XML DSIG document, perhaps with the "trick" that relative URIs
>> >> > are a shorthand for the xmldsig namespace.
>> >> >
>> >> >        /r$
>> >> >
>> >> > --
>> >> > STSM, WebSphere Appliance Architect
>> >> > https://www.ibm.com/developerworks/mydeveloperworks/blogs/soma/
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> - James Snell
>> >>  http://www.snellspace.com
>> >>  [email protected]
>> >>
>> >
>> >
>>
>>
>>
>> --
>> - James Snell
>>  http://www.snellspace.com
>>  [email protected]
>>
>
>
