On Tue, Aug 30, 2011 at 11:37 PM, Peter Saint-Andre <[email protected]> wrote:
> On 8/30/11 10:58 AM, Waqas Hussain wrote:
>>
>> The problem here seems to be people focusing on the exact bytes of the
>> example I happened to type out for these attacks, and not the general
>> logic of the attacks themselves.
>
> Perhaps you should have described the general logic of the attacks
> themselves, rather than just showing particular examples.
>
>> Those exact examples can be easily
>> fixed. But there's a general issue here which can't be fixed that
>> easily. The general attack acts on all namespaces of the form a/b/d/*.
>
> Please describe the general attack, then.
>
>> Note, the XML spec isn't violated. It doesn't *require* validation of
>> the xml:lang attribute:
>>   > A simple declaration for xml:lang might take the form
>>   > xml:lang CDATA #IMPLIED
>>  - http://www.w3.org/TR/2008/REC-xml-20081126/#sec-lang-tag
>
> That's because XML schema post-dates the XML spec. Checking that a
> language matches the xs:language datatype seems pretty straightforward,
> given that the definitions in the XML spec and the schema datatype spec
> both reference BCP47 (RFC 5646).
>
>> XEP-0030 was only violated because that example (and admittedly most
>> other examples of namespaces currently in use) had '//' in them. The
>> general pattern of a/b/d/* stays under attack no matter what you do or
>> forbid for all namespaces which have at least three slashes and
>> something between those slashes.
>
> Do elaborate.
>
> So far, two of the attacks (#3 and #4) that you have described (via
> examples only) depend on violations of the XML spec and XEP-0030.
>
> Another of the attacks (#1) depends on converting the literal string
> "&lt;" to "<", which we've said for years now is incorrect.
>
> Attack #2 can be mitigated by forbidding forms without fields in
> XEP-0068 and XEP-0128.
>
>> Now, are we just going to say 'no-one should or will ever use
>> namespaces with three or more slashes with something between them'?
>
> Not until we understand the general attack that you haven't described.
>
>>>> What you are proposing had already been proposed in those threads by
>>>> others, but wasn't sufficient.
>>>>
>>>> It's rather awkward to forbid '<', or any other valid character in
>>>> identity names or disco extensions (which can be user input).
>>>
>>> What's to be forbidden is the four characters '&', 'l', 't', ';' -- or,
>>> more specifically, coverting those literal characters into '<' (since
>>> the latter character has a special meaning in XEP-0115). That string of
>>> characters would not need to be forbidden in XEP-0030, only treated
>>> carefully when taking disco#info strings as input to the XEP-0115 hash
>>> calculation.
>>>
>>> If implementations perform that one check, then Attack #1 is null.
>>
>> So, forbid < in what could be a user-specified string (name)? Ugly...
>
> We're not here for aesthetics.
>
>>> As to Attack #2, that can be solved by saying that a data form of type
>>> "result" MUST include at least one <field/> element. That is, changing
>>> XEP-0004 from:
>>>
>>>   A data form of type "form", "submit", or "result" SHOULD contain at
>>>   least one <field/> element
>>>
>>> to:
>>>
>>>   A data form of type "form", "submit", or "result" MUST contain at
>>>   least one <field/> element
>>>
>>> Which seems sensible anyway.
>>>
>>
>> Err.. you are fixing the exact example again, not the general attack..
>>
>>   <identity category='client' type='pc' name='SomeClient'/>
>>   <feature var='feature1'/>
>>   <feature var='feature2'/>
>>   <feature var='feature3'/>
>>
>> remains equivalent to:
>>
>>   <identity category='client' type='pc' name='SomeClient'/>
>>   <x xmlns='jabber:x:data' type='result'>
>>     <field var='FORM_TYPE' type='hidden'>
>>       <value>feature1</value>
>>     </field>
>>     <field var='feature2' type='hidden'>
>>       <value>feature3</value>
>>     </field>
>>   </x>
>
> Is it fair to say that your general attack depends on inclusion of
> XEP-0128 forms? Or do you have an even more general attack to describe?
>
>> Now, the important part of my email:
>>
>> While fixing the exact examples I typed out is nice, the real problem
>> which needs solving is this: structural information is lost in the
>> current algorithm. The disco stanza itself is logically separated into
>> sections: identities, features, disco extensions. Disco extensions are
>> further separated into forms and forms are made up of fields and
>> fields are made up of values. The algorithm discards all this
>> structure. It's lossy. This is the actual issue. You can patch over
>> the simpler individual examples easily, but the algorithm is broken
>> fundamentally and you will not be able to fix that.
>
> For all disco data, or only disco data containing XEP-0128 forms? As far
> as I can see, it's the latter.
>
>>>> I'm
>>>> still in favor of a clean new algorithm and XEP,
>>>
>>> I'd prefer to salvage XEP-0115 if we possibly can, rather than defining
>>> a completely new algorithm and spec.
>>>
>>
>> The second important part of the email:
>>
>> I completely understand why everyone wants to salvage XEP-0115. But we
>> have a *security* issue here. Implementations which are insecure at
>> the moment must stop working in the future. Trying to let them
>> continue to work would be fine if this weren't a security issue, but
>> is irresponsible when it is a security issue.
>
> I agree that we need to fix things. I'm trying to find the least
> intrusive way to do so.
>
>> We'll just tighten restrictions in the document and assume all is
>> fine.
>
> I never said all was fine. I did say that we have some trivial fixes and
> clarifications that address a large part of the problem. We still need
> to figure out what the remaining gap is. It appears to be your Attack #2.
>
>> All would not be fine as most implementors will remain
>> blissfully unaware of it. New implementators will put in the minimum
>> amount of effort necessary to make PEP work, and that will not include
>> detailed validation. I'm willing to bet most client authors are
>> unaware of there being a security issue at all. Other than checking
>> the source code, no-one would really be able to trust any
>> implementations to be secure in this respect, because the algorithm
>> would be broken by default, with patchwork laid on top to make it not
>> leak.
>
> You might be right that we could design a better algorithm. However,
> fixing and clarifying some aspects of the existing design is not
> therefore a bad thing, unless you argue that it's going to give people a
> false sense of security.
>
>> SSL 3.0, TLS 1.0, 1.1, 1.2, and so on. If you looked at the specs,
>> you'll see that these are essentially the same protocols, with some
>> differences. They could have been patched up quite easily by adding
>> some stuff to the spec and staying compatible. They weren't. Newer
>> versions were made incompatible with older versions.
>>
>>>> which can also fix
>>>> many of the non-algorithmic issues with the XEP.
>>>
>>> What are the non-algorithmic issues?
>>>
>>
>> Things like hash agility were a goal of the spec. They don't work.
>> There was discussion of this in some of the old threads.
>
> Yes, hash agility is good. We already have the ability to specify the
> hashing algorithm used, such as hash='sha-1' or hash='sha-256'. What
> more do you think is needed?
>
> Peter
>

Let's see..

My attacks are all about lack of separate boundaries in 'ver' for
identities, features, etc. It just uses '<' for everything (and '/' in
identities).

All of my attacks attack this one problem. Now, the disco stanza has
various separate boundaries when it's XML. Specifically, it has these:

Identities are separate from features, which are separate from disco extensions.
Identities then have sub-boundaries for its attributes.
Disco extensions have sub-boundaries between fields, which have
sub-sub-boundaries between the 'var' and multiple values.

These are apparent in the XML structure of the disco stanza, and are
also apparent in the algorithm's logic.

My attacks rely solely on these boundaries disappearing in the caps
algorithm. Given any data 'a', 'b', and 'c', and boundary represented
as '|', removing the boundaries makes a|b|c, a|bc, ab|c, abc all have
the same ver string. That's the attack. The attack works in all places
where there used to be a boundary, but isn't now. Look at the attacks
I described. Each one of those is simply exploiting one of these
disappeared boundaries. I described four attacks. There aren't just
four. Each of those disappearing boundaries is attackable separately.

I hope that's an easy to understand description of the general problem.

e.g.,

<identity category='category' type='type' name='name'/>
is equivalent to
<feature var='category/type//name'/>
because a separate identities-features boundary doesn't exist. I don't
think any of the proposed patches can solve this one example.

--
And yes, I'm arguing that fixing the algorithm is going to give people
a false sense of security, and people like me a sense of insecurity,
because deployments out there are going to be exploitable, and there
would be no way to know for sure that the features my client UI is
showing for an entity are really the features the entity has. Many
developers will not update their software, because it will continue to
work and their users will not report any issues.

--
Hash agility. See this mail by Dave Cridland; the last half:
http://mail.jabber.org/pipermail/security/2009-September/000828.html

Using the ability to specify a hash algorithm is worse than just using
a new XEP. A new XEP can exist in parallel with the old, and we can
have a smooth transition. Just changing the hash algorithm in the old
XEP splits the caps world into two, one side using the old, one using
the new, and both unable to interoperate.

--
Waqas Hussain

Reply via email to