On 8/30/11 10:58 AM, Waqas Hussain wrote: > On Tue, Aug 30, 2011 at 5:49 AM, Peter Saint-Andre <[email protected]> wrote: >> On 8/26/11 4:32 PM, Waqas Hussain wrote: >>> >>> The proposed changes don't solve the problem. See what I wrote in the >>> "Attempting to fix the algorithm" section of the email you linked. >>> Particularly "As far as I can see, there is no way of fixing attacks 2 >>> and 3 other than fundamentally changing the algorithm." >> >> As we've seen, Attacks #3 and #4 violate both XEP-0030 and the XML spec. >> > > The problem here seems to be people focusing on the exact bytes of the > example I happened to type out for these attacks, and not the general > logic of the attacks themselves.
Perhaps you should have described the general logic of the attacks themselves, rather than just showing particular examples. > Those exact examples can be easily > fixed. But there's a general issue here which can't be fixed that > easily. The general attack acts on all namespaces of the form a/b/d/*. Please describe the general attack, then. > Note, the XML spec isn't violated. It doesn't *require* validation of > the xml:lang attribute: > > A simple declaration for xml:lang might take the form > > xml:lang CDATA #IMPLIED > - http://www.w3.org/TR/2008/REC-xml-20081126/#sec-lang-tag That's because XML schema post-dates the XML spec. Checking that a language matches the xs:language datatype seems pretty straightforward, given that the definitions in the XML spec and the schema datatype spec both reference BCP47 (RFC 5646). > XEP-0030 was only violated because that example (and admittedly most > other examples of namespaces currently in use) had '//' in them. The > general pattern of a/b/d/* stays under attack no matter what you do or > forbid for all namespaces which have at least three slashes and > something between those slashes. Do elaborate. So far, two of the attacks (#3 and #4) that you have described (via examples only) depend on violations of the XML spec and XEP-0030. Another of the attacks (#1) depends on converting the literal string "<" to "<", which we've said for years now is incorrect. Attack #2 can be mitigated by forbidding forms without fields in XEP-0068 and XEP-0128. > Now, are we just going to say 'no-one should or will ever use > namespaces with three or more slashes with something between them'? Not until we understand the general attack that you haven't described. >>> What you are proposing had already been proposed in those threads by >>> others, but wasn't sufficient. >>> >>> It's rather awkward to forbid '<', or any other valid character in >>> identity names or disco extensions (which can be user input). >> >> What's to be forbidden is the four characters '&', 'l', 't', ';' -- or, >> more specifically, coverting those literal characters into '<' (since >> the latter character has a special meaning in XEP-0115). That string of >> characters would not need to be forbidden in XEP-0030, only treated >> carefully when taking disco#info strings as input to the XEP-0115 hash >> calculation. >> >> If implementations perform that one check, then Attack #1 is null. > > So, forbid < in what could be a user-specified string (name)? Ugly... We're not here for aesthetics. >> As to Attack #2, that can be solved by saying that a data form of type >> "result" MUST include at least one <field/> element. That is, changing >> XEP-0004 from: >> >> A data form of type "form", "submit", or "result" SHOULD contain at >> least one <field/> element >> >> to: >> >> A data form of type "form", "submit", or "result" MUST contain at >> least one <field/> element >> >> Which seems sensible anyway. >> > > Err.. you are fixing the exact example again, not the general attack.. > > <identity category='client' type='pc' name='SomeClient'/> > <feature var='feature1'/> > <feature var='feature2'/> > <feature var='feature3'/> > > remains equivalent to: > > <identity category='client' type='pc' name='SomeClient'/> > <x xmlns='jabber:x:data' type='result'> > <field var='FORM_TYPE' type='hidden'> > <value>feature1</value> > </field> > <field var='feature2' type='hidden'> > <value>feature3</value> > </field> > </x> Is it fair to say that your general attack depends on inclusion of XEP-0128 forms? Or do you have an even more general attack to describe? > Now, the important part of my email: > > While fixing the exact examples I typed out is nice, the real problem > which needs solving is this: structural information is lost in the > current algorithm. The disco stanza itself is logically separated into > sections: identities, features, disco extensions. Disco extensions are > further separated into forms and forms are made up of fields and > fields are made up of values. The algorithm discards all this > structure. It's lossy. This is the actual issue. You can patch over > the simpler individual examples easily, but the algorithm is broken > fundamentally and you will not be able to fix that. For all disco data, or only disco data containing XEP-0128 forms? As far as I can see, it's the latter. >>> I'm >>> still in favor of a clean new algorithm and XEP, >> >> I'd prefer to salvage XEP-0115 if we possibly can, rather than defining >> a completely new algorithm and spec. >> > > The second important part of the email: > > I completely understand why everyone wants to salvage XEP-0115. But we > have a *security* issue here. Implementations which are insecure at > the moment must stop working in the future. Trying to let them > continue to work would be fine if this weren't a security issue, but > is irresponsible when it is a security issue. I agree that we need to fix things. I'm trying to find the least intrusive way to do so. > We'll just tighten restrictions in the document and assume all is > fine. I never said all was fine. I did say that we have some trivial fixes and clarifications that address a large part of the problem. We still need to figure out what the remaining gap is. It appears to be your Attack #2. > All would not be fine as most implementors will remain > blissfully unaware of it. New implementators will put in the minimum > amount of effort necessary to make PEP work, and that will not include > detailed validation. I'm willing to bet most client authors are > unaware of there being a security issue at all. Other than checking > the source code, no-one would really be able to trust any > implementations to be secure in this respect, because the algorithm > would be broken by default, with patchwork laid on top to make it not > leak. You might be right that we could design a better algorithm. However, fixing and clarifying some aspects of the existing design is not therefore a bad thing, unless you argue that it's going to give people a false sense of security. > SSL 3.0, TLS 1.0, 1.1, 1.2, and so on. If you looked at the specs, > you'll see that these are essentially the same protocols, with some > differences. They could have been patched up quite easily by adding > some stuff to the spec and staying compatible. They weren't. Newer > versions were made incompatible with older versions. > >>> which can also fix >>> many of the non-algorithmic issues with the XEP. >> >> What are the non-algorithmic issues? >> > > Things like hash agility were a goal of the spec. They don't work. > There was discussion of this in some of the old threads. Yes, hash agility is good. We already have the ability to specify the hashing algorithm used, such as hash='sha-1' or hash='sha-256'. What more do you think is needed? Peter -- Peter Saint-Andre https://stpeter.im/
