On 8/30/11 2:08 PM, Waqas Hussain wrote:
> On Tue, Aug 30, 2011 at 11:37 PM, Peter Saint-Andre <[email protected]> 
> wrote:
>> On 8/30/11 10:58 AM, Waqas Hussain wrote:
>>>
>>> The problem here seems to be people focusing on the exact bytes of the
>>> example I happened to type out for these attacks, and not the general
>>> logic of the attacks themselves.
>>
>> Perhaps you should have described the general logic of the attacks
>> themselves, rather than just showing particular examples.
>>
>>> Those exact examples can be easily
>>> fixed. But there's a general issue here which can't be fixed that
>>> easily. The general attack acts on all namespaces of the form a/b/d/*.
>>
>> Please describe the general attack, then.
>>
>>> Note, the XML spec isn't violated. It doesn't *require* validation of
>>> the xml:lang attribute:
>>>   > A simple declaration for xml:lang might take the form
>>>   > xml:lang CDATA #IMPLIED
>>>  - http://www.w3.org/TR/2008/REC-xml-20081126/#sec-lang-tag
>>
>> That's because XML schema post-dates the XML spec. Checking that a
>> language matches the xs:language datatype seems pretty straightforward,
>> given that the definitions in the XML spec and the schema datatype spec
>> both reference BCP47 (RFC 5646).
>>
>>> XEP-0030 was only violated because that example (and admittedly most
>>> other examples of namespaces currently in use) had '//' in them. The
>>> general pattern of a/b/d/* stays under attack no matter what you do or
>>> forbid for all namespaces which have at least three slashes and
>>> something between those slashes.
>>
>> Do elaborate.
>>
>> So far, two of the attacks (#3 and #4) that you have described (via
>> examples only) depend on violations of the XML spec and XEP-0030.
>>
>> Another of the attacks (#1) depends on converting the literal string
>> "&lt;" to "<", which we've said for years now is incorrect.
>>
>> Attack #2 can be mitigated by forbidding forms without fields in
>> XEP-0068 and XEP-0128.
>>
>>> Now, are we just going to say 'no-one should or will ever use
>>> namespaces with three or more slashes with something between them'?
>>
>> Not until we understand the general attack that you haven't described.
>>
>>>>> What you are proposing had already been proposed in those threads by
>>>>> others, but wasn't sufficient.
>>>>>
>>>>> It's rather awkward to forbid '<', or any other valid character in
>>>>> identity names or disco extensions (which can be user input).
>>>>
>>>> What's to be forbidden is the four characters '&', 'l', 't', ';' -- or,
>>>> more specifically, coverting those literal characters into '<' (since
>>>> the latter character has a special meaning in XEP-0115). That string of
>>>> characters would not need to be forbidden in XEP-0030, only treated
>>>> carefully when taking disco#info strings as input to the XEP-0115 hash
>>>> calculation.
>>>>
>>>> If implementations perform that one check, then Attack #1 is null.
>>>
>>> So, forbid < in what could be a user-specified string (name)? Ugly...
>>
>> We're not here for aesthetics.
>>
>>>> As to Attack #2, that can be solved by saying that a data form of type
>>>> "result" MUST include at least one <field/> element. That is, changing
>>>> XEP-0004 from:
>>>>
>>>>   A data form of type "form", "submit", or "result" SHOULD contain at
>>>>   least one <field/> element
>>>>
>>>> to:
>>>>
>>>>   A data form of type "form", "submit", or "result" MUST contain at
>>>>   least one <field/> element
>>>>
>>>> Which seems sensible anyway.
>>>>
>>>
>>> Err.. you are fixing the exact example again, not the general attack..
>>>
>>>   <identity category='client' type='pc' name='SomeClient'/>
>>>   <feature var='feature1'/>
>>>   <feature var='feature2'/>
>>>   <feature var='feature3'/>
>>>
>>> remains equivalent to:
>>>
>>>   <identity category='client' type='pc' name='SomeClient'/>
>>>   <x xmlns='jabber:x:data' type='result'>
>>>     <field var='FORM_TYPE' type='hidden'>
>>>       <value>feature1</value>
>>>     </field>
>>>     <field var='feature2' type='hidden'>
>>>       <value>feature3</value>
>>>     </field>
>>>   </x>
>>
>> Is it fair to say that your general attack depends on inclusion of
>> XEP-0128 forms? Or do you have an even more general attack to describe?
>>
>>> Now, the important part of my email:
>>>
>>> While fixing the exact examples I typed out is nice, the real problem
>>> which needs solving is this: structural information is lost in the
>>> current algorithm. The disco stanza itself is logically separated into
>>> sections: identities, features, disco extensions. Disco extensions are
>>> further separated into forms and forms are made up of fields and
>>> fields are made up of values. The algorithm discards all this
>>> structure. It's lossy. This is the actual issue. You can patch over
>>> the simpler individual examples easily, but the algorithm is broken
>>> fundamentally and you will not be able to fix that.
>>
>> For all disco data, or only disco data containing XEP-0128 forms? As far
>> as I can see, it's the latter.
>>
>>>>> I'm
>>>>> still in favor of a clean new algorithm and XEP,
>>>>
>>>> I'd prefer to salvage XEP-0115 if we possibly can, rather than defining
>>>> a completely new algorithm and spec.
>>>>
>>>
>>> The second important part of the email:
>>>
>>> I completely understand why everyone wants to salvage XEP-0115. But we
>>> have a *security* issue here. Implementations which are insecure at
>>> the moment must stop working in the future. Trying to let them
>>> continue to work would be fine if this weren't a security issue, but
>>> is irresponsible when it is a security issue.
>>
>> I agree that we need to fix things. I'm trying to find the least
>> intrusive way to do so.
>>
>>> We'll just tighten restrictions in the document and assume all is
>>> fine.
>>
>> I never said all was fine. I did say that we have some trivial fixes and
>> clarifications that address a large part of the problem. We still need
>> to figure out what the remaining gap is. It appears to be your Attack #2.
>>
>>> All would not be fine as most implementors will remain
>>> blissfully unaware of it. New implementators will put in the minimum
>>> amount of effort necessary to make PEP work, and that will not include
>>> detailed validation. I'm willing to bet most client authors are
>>> unaware of there being a security issue at all. Other than checking
>>> the source code, no-one would really be able to trust any
>>> implementations to be secure in this respect, because the algorithm
>>> would be broken by default, with patchwork laid on top to make it not
>>> leak.
>>
>> You might be right that we could design a better algorithm. However,
>> fixing and clarifying some aspects of the existing design is not
>> therefore a bad thing, unless you argue that it's going to give people a
>> false sense of security.
>>
>>> SSL 3.0, TLS 1.0, 1.1, 1.2, and so on. If you looked at the specs,
>>> you'll see that these are essentially the same protocols, with some
>>> differences. They could have been patched up quite easily by adding
>>> some stuff to the spec and staying compatible. They weren't. Newer
>>> versions were made incompatible with older versions.
>>>
>>>>> which can also fix
>>>>> many of the non-algorithmic issues with the XEP.
>>>>
>>>> What are the non-algorithmic issues?
>>>>
>>>
>>> Things like hash agility were a goal of the spec. They don't work.
>>> There was discussion of this in some of the old threads.
>>
>> Yes, hash agility is good. We already have the ability to specify the
>> hashing algorithm used, such as hash='sha-1' or hash='sha-256'. What
>> more do you think is needed?
>>
>> Peter
>>
> 
> Let's see..
> 
> My attacks are all about lack of separate boundaries in 'ver' for
> identities, features, etc. It just uses '<' for everything (and '/' in
> identities).
> 
> All of my attacks attack this one problem. Now, the disco stanza has
> various separate boundaries when it's XML. Specifically, it has these:
> 
> Identities are separate from features, which are separate from disco 
> extensions.
> Identities then have sub-boundaries for its attributes.
> Disco extensions have sub-boundaries between fields, which have
> sub-sub-boundaries between the 'var' and multiple values.
> 
> These are apparent in the XML structure of the disco stanza, and are
> also apparent in the algorithm's logic.
> 
> My attacks rely solely on these boundaries disappearing in the caps
> algorithm. Given any data 'a', 'b', and 'c', and boundary represented
> as '|', removing the boundaries makes a|b|c, a|bc, ab|c, abc all have
> the same ver string. That's the attack. The attack works in all places
> where there used to be a boundary, but isn't now. Look at the attacks
> I described. Each one of those is simply exploiting one of these
> disappeared boundaries. I described four attacks. There aren't just
> four. Each of those disappearing boundaries is attackable separately.
> 
> I hope that's an easy to understand description of the general problem.

Yes. If we decide to change the algorithm, then we'd need a new
specification that would obsolete XEP-0115. I am still hoping that we
can save XEP-0115 because we have an awful lot of deployed software out
there, but if that isn't palatable then we could define a new spec with
a new algorithm.

Peter

-- 
Peter Saint-Andre
https://stpeter.im/


Reply via email to