Re: [Standards] XEP-0115 redux

Dave Cridland Thu, 10 Jan 2008 02:31:41 -0800

FWIW, some input, and my opinion.

On Wed Jan  9 23:14:42 2008, Peter Saint-Andre wrote:

ISSUE #1: Do we need a new namespace?
Description: We have changed things around so radically since version 1.3 [2] of the spec that maybe we need a new namespace (as we did for the Entity Time protocol).
Discussion: Yes we could do this, but then we'd have two separate entity capabilities notations in every presence notification that every user sent over the network, thus violating one of the requirements of XEP-0115 ("minimize network impact"). Therefore we have bent over backwards to not define a new namespace. The result is not the prettiest protocol in the world, but it doesn't break anything.
My conclusion: I am opposed to defining a new namespace.

Equally, with the current design - and I agree it's ugly, and may offend purists - we have the neat trick that it degrades gracefully to disco in three key cases:

1) If the sender doesn't understand hashes, and therefore doesn't use them.

2) If the receiver doesn't understand hashes, and therefore ignores them.

3) If the sender uses a hash that the receiver doesn't understand, even though the receiver *does* understand hashes in general.

That latter is key to our "hash agility" story, incidentally, as it allows graceful fallback in the case where we're forced into using hash agility.
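
To make the three fallback cases concrete, here is a minimal receiver-side sketch. The function names, the returned strings, and the idea that this receiver only implements SHA-1 are assumptions for illustration, not anything taken from the spec text:

```python
# Hypothetical sketch: names and return values are illustrative only.
SUPPORTED_HASHES = {"sha-1"}  # assumption: this receiver only implements SHA-1


def caps_strategy(caps_attrs, supported=SUPPORTED_HASHES):
    """Decide how a hash-aware receiver treats the attributes of a caps element."""
    hash_name = caps_attrs.get("hash")
    if hash_name is None:
        # Case 1: the sender doesn't use hashes, so fall back to disco.
        return "disco"
    if hash_name.lower() not in supported:
        # Case 3: the receiver understands hashes, but not this one.
        return "disco"
    # Hash is understood: the cached or verified capabilities can be used.
    return "verify-hash"


def legacy_caps_strategy(caps_attrs):
    """Case 2: a receiver that predates hashes ignores the 'hash' attribute."""
    return "disco"
```

In every path where the hash cannot be used, the result is the same plain disco query, which is what lets a new hash be introduced later without breaking older receivers.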

ISSUE #2: Should the 'v' attribute be REQUIRED?
Description: The 'ver' attribute was REQUIRED in version 1.3 [2] of the spec. In a late change made to version 1.4 [3] of the spec during the Council meeting at which version 1.4 was approved, we suggested that the value of the 'node' should be "ProductURL#ProductVersion" (e.g., "http://psi-im.org/#0.11") but we agreed that this would *not* be REQUIRED or even officially RECOMMENDED. In the proposed version 1.5 [4] of the spec, we added a new attribute 'v' to encapsulate the software version, but it is only RECOMMENDED, *not* REQUIRED.
Discussion: Some people on the list objected strenuously to the late change made to version 1.4 [3] which suggested that the 'node' attribute should encapsulate the ProductVersion. Therefore the list consensus was that the 'node' attribute should be the ProductURL not including the ProductVersion, and that we would define a new attribute 'v' to communicate the ProductVersion; however, the list consensus was that this attribute would *not* be REQUIRED but instead only RECOMMENDED (some people argued for making it OPTIONAL or removing it altogether, but we settled on RECOMMENDED).
My conclusion: Leave version 1.5 [4] as it is now, with 'v' RECOMMENDED but *not* REQUIRED. (In fact I would not object to making it OPTIONAL, but RECOMMENDED seems closest to the prior list consensus.)

Conflicting arguments here. As a not-really-client developer (I do have a client, but even I don't use it), I hold no strong opinion.

1) The old spec did have a version, held in ver, so the new version is to this extent a regression.

2) Exposing your client software version is a potential security issue.

If I had to state an opinion, I'd say that if you wanted to hide your software version in "Classic" XEP-0115, it was pretty easy to obfuscate the ver attribute, whereas making v optional (whether OPTIONAL or RECOMMENDED) does at least make this choice explicit.

ISSUE #3: Which hashing algorithms?
Description: The Council discussion seemed to assume that version 1.5 [4] says SHA-1 is mandatory-to-implement ("MTI"). In fact, version 1.5 does not mandate implementation of any specific algorithm. Be that as it may, some Council members suggested that we recommend MD5 instead of SHA-1 (the only concrete reason I heard in the meeting is that MD5 output is smaller).

(Kind of. One issue is that MD5 might actually be more secure.)

Discussion: As far as I can see, we had consensus not to mandate any particular hashing algorithm, but instead to allow any algorithm that is registered with the IANA [5]. Currently the registered algorithms are md2, md5, sha-1, sha-224, sha-256, sha-384, and sha-512. However, we seemed to have list consensus that most people would use SHA-1 at the beginning (SHA-1 is the default value of the 'hash' algorithm in the currently-approved version 1.4 [3] of the spec), and perhaps switch to SHA-256 in the future if it is shown that pre-image attacks (see RFC 4270) are likely against SHA-1. That said, people *could* implement MD5 if they want to because it is registered with the IANA.
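
For a sense of what "MD5 output is smaller" means on the wire, the following sketch compares digest sizes for three of the registered algorithms. It assumes the hash value is carried base64-encoded in presence; the helper name and the sample input are made up for illustration:

```python
import base64
import hashlib


def encoded_length(algorithm, data=b"example"):
    """Return (raw digest bytes, base64-encoded characters) for an algorithm."""
    digest = hashlib.new(algorithm, data).digest()
    return len(digest), len(base64.b64encode(digest))


# hashlib spells the IANA names "sha-1" and "sha-256" without the hyphen.
for name in ("md5", "sha1", "sha256"):
    raw, encoded = encoded_length(name)
    print(name, raw, encoded)
```

The sizes do not depend on the input: MD5 gives 16 bytes (24 base64 characters), SHA-1 gives 20 (28), and SHA-256 gives 32 (44), so the saving from MD5 over SHA-1 is four characters per presence stanza under this encoding assumption.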

Note that RFC4270 was a fairly extensive survey by an experienced IETF security chap - Paul Hoffman runs the VPN Consortium - and Bruce Schneier's name ought to be familiar to people interested in crypto and security.

Note also that whilst it describes some progress made in preimage weaknesses in SHA-1, none are mentioned for either SHA-2 (that's SHA-256, SHA-512, etc.), or MD5. MD5 has had a lot of cryptanalysis - you'll note that more researchers are producing papers on it than any other hash algorithm, and this isn't entirely down to relative strength compared to SHA-* - it's more down to the fact that MD5 has considerably larger deployment, and so is a more attractive hash to analyse.

The fact that after this length of time, nobody appears to have found a preimage attack on it is pretty gratifying. MD5 *is* demonstrably weak in two areas:

1) Challenge-Response password hashing, for example in CRAM-MD5. Not because of a mathematical weakness, but because you can brute force things too fast, across the entire, fairly limited, space of a password. This doesn't affect us for the twin reasons that:


a) Our space is much bigger.
b) The space we have is quite rigid in format.

2) Collisions, and from there signature algorithms. This is where you come up with two inputs that produce an identical output. This is useful if:


a) You get to choose both inputs. (Our poisoner cannot).
b) There is scope for adding random junk somewhere. (Likewise).

In theory, you can do a collision without random junk, but it would take considerably longer. Also important to note is that this has no impact on whether we're more likely to find inadvertent collisions with MD5. In theory, the shorter hash length will have an impact, simply by the birthday "paradox", but it's still pretty rare.

But it's not weak in preimage attacks - those where the attacker knows the hash, and/or the input, and wishes to construct an alternate input of their choosing which matches.

In order to perform caps poisoning with MD5, therefore, the attacker must:


i) Subvert the development process of the client.

ii) Optionally, to cover his tracks, subvert the XSF, thus allowing the attacker to have some control over what counts as legitimate input, thus reducing, to a degree, the random junk problem.

You'll note that Kevin Smith is in a position to do both, but no other person or entity is, throughout the entire world.

And anyone in either position is capable of inflicting significantly higher damage by choosing to do some easier attack - if the developer of your client turns out to be an Evil Genius, you're henceforth Doomed. Similarly, if Council members wish to subtly undermine your security, we are in a position to do that.

3a. Do we specify an MTI algorithm or let the market decide?

I think we need an MTI; I have to admit I'd read the current text as essentially stating that SHA-1 was the MTI.

3b. If we specify an MTI algorithm, do we specify MD5 or SHA-1 or something else?

What concerns me is not that SHA-1 is a particularly poor choice, but that we may have reached that choice by applying faulty logic. SHA-1 does appear to have *some* weakness in preimage. I don't know if, given the similarity between SHA-1 and SHA-2, this also applies there, but I cannot find any mention of preimage weakness in MD5.

I'll drop my objection if people want - in fact, I'll drop it if the other two issues are resolved, but I would like people to take the opportunity to satisfy themselves that they've made the right choice in the face of the evidence.


So, some reading:

1) RFC4270 is an excellent backgrounder on the different attacks on hashes, and how these affect real-world protocols.


2) Wikipædia is helpful, too:

http://en.wikipedia.org/wiki/Birthday_attack demonstrates that we need around 2.2 x 10^19 possible inputs for MD5 before an inadvertent collision is more likely than 50%, assuming that these are randomly spread. (They aren't, so this is in effect a worst case).

http://en.wikipedia.org/wiki/Preimage_attack and http://en.wikipedia.org/wiki/Cryptographic_hash_function both give detailed background.
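
The 2.2 x 10^19 figure for MD5 can be checked with the usual birthday approximation, n ~ sqrt(2 * N * ln(1 / (1 - p))), where N is the size of the hash space. This is only a sketch of that arithmetic; the function name is made up, and it assumes outputs are uniformly spread:

```python
import math


def birthday_threshold(bits, probability=0.5):
    """Approximate count of random inputs before a collision is this likely."""
    space = 2.0 ** bits  # number of possible hash outputs
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - probability)))


md5_inputs = birthday_threshold(128)   # MD5 has a 128-bit output
sha1_inputs = birthday_threshold(160)  # SHA-1 has a 160-bit output

print(f"MD5:   {md5_inputs:.2e}")
print(f"SHA-1: {sha1_inputs:.2e}")
```

For 128 bits this comes out at roughly 2.2 x 10^19, matching the number above; the 160-bit SHA-1 output pushes the threshold up by a factor of 2^16, to about 1.4 x 10^24.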


Finally, of course, feel free to bug me by XMPP or email. :-)

Dave.
--
Dave Cridland - mailto:[EMAIL PROTECTED] - xmpp:[EMAIL PROTECTED]
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
