On 8/30/11 10:58 AM, Waqas Hussain wrote:
> On Tue, Aug 30, 2011 at 5:49 AM, Peter Saint-Andre <[email protected]> wrote:
>> On 8/26/11 4:32 PM, Waqas Hussain wrote:
>>>
>>> The proposed changes don't solve the problem. See what I wrote in the
>>> "Attempting to fix the algorithm" section of the email you linked.
>>> Particularly "As far as I can see, there is no way of fixing attacks 2
>>> and 3 other than fundamentally changing the algorithm."
>>
>> As we've seen, Attacks #3 and #4 violate both XEP-0030 and the XML spec.
>>
> 
> The problem here seems to be people focusing on the exact bytes of the
> example I happened to type out for these attacks, and not the general
> logic of the attacks themselves. 

Perhaps you should have described the general logic of the attacks
themselves, rather than just showing particular examples.

> Those exact examples can be easily
> fixed. But there's a general issue here which can't be fixed that
> easily. The general attack acts on all namespaces of the form a/b/d/*.

Please describe the general attack, then.

> Note, the XML spec isn't violated. It doesn't *require* validation of
> the xml:lang attribute:
>   > A simple declaration for xml:lang might take the form
>   > xml:lang CDATA #IMPLIED
>  - http://www.w3.org/TR/2008/REC-xml-20081126/#sec-lang-tag

That's because XML schema post-dates the XML spec. Checking that a
language matches the xs:language datatype seems pretty straightforward,
given that the definitions in the XML spec and the schema datatype spec
both reference BCP47 (RFC 5646).

> XEP-0030 was only violated because that example (and admittedly most
> other examples of namespaces currently in use) had '//' in them. The
> general pattern of a/b/d/* stays under attack no matter what you do or
> forbid for all namespaces which have at least three slashes and
> something between those slashes.

Do elaborate.

So far, two of the attacks (#3 and #4) that you have described (via
examples only) depend on violations of the XML spec and XEP-0030.

Another of the attacks (#1) depends on converting the literal string
"&lt;" to "<", which we've said for years now is incorrect.

Attack #2 can be mitigated by forbidding forms without fields in
XEP-0068 and XEP-0128.

> Now, are we just going to say 'no-one should or will ever use
> namespaces with three or more slashes with something between them'?

Not until we understand the general attack that you haven't described.

>>> What you are proposing had already been proposed in those threads by
>>> others, but wasn't sufficient.
>>>
>>> It's rather awkward to forbid '<', or any other valid character in
>>> identity names or disco extensions (which can be user input).
>>
>> What's to be forbidden is the four characters '&', 'l', 't', ';' -- or,
>> more specifically, coverting those literal characters into '<' (since
>> the latter character has a special meaning in XEP-0115). That string of
>> characters would not need to be forbidden in XEP-0030, only treated
>> carefully when taking disco#info strings as input to the XEP-0115 hash
>> calculation.
>>
>> If implementations perform that one check, then Attack #1 is null.
> 
> So, forbid < in what could be a user-specified string (name)? Ugly...

We're not here for aesthetics.

>> As to Attack #2, that can be solved by saying that a data form of type
>> "result" MUST include at least one <field/> element. That is, changing
>> XEP-0004 from:
>>
>>   A data form of type "form", "submit", or "result" SHOULD contain at
>>   least one <field/> element
>>
>> to:
>>
>>   A data form of type "form", "submit", or "result" MUST contain at
>>   least one <field/> element
>>
>> Which seems sensible anyway.
>>
> 
> Err.. you are fixing the exact example again, not the general attack..
> 
>   <identity category='client' type='pc' name='SomeClient'/>
>   <feature var='feature1'/>
>   <feature var='feature2'/>
>   <feature var='feature3'/>
> 
> remains equivalent to:
> 
>   <identity category='client' type='pc' name='SomeClient'/>
>   <x xmlns='jabber:x:data' type='result'>
>     <field var='FORM_TYPE' type='hidden'>
>       <value>feature1</value>
>     </field>
>     <field var='feature2' type='hidden'>
>       <value>feature3</value>
>     </field>
>   </x>

Is it fair to say that your general attack depends on inclusion of
XEP-0128 forms? Or do you have an even more general attack to describe?

> Now, the important part of my email:
> 
> While fixing the exact examples I typed out is nice, the real problem
> which needs solving is this: structural information is lost in the
> current algorithm. The disco stanza itself is logically separated into
> sections: identities, features, disco extensions. Disco extensions are
> further separated into forms and forms are made up of fields and
> fields are made up of values. The algorithm discards all this
> structure. It's lossy. This is the actual issue. You can patch over
> the simpler individual examples easily, but the algorithm is broken
> fundamentally and you will not be able to fix that.

For all disco data, or only disco data containing XEP-0128 forms? As far
as I can see, it's the latter.

>>> I'm
>>> still in favor of a clean new algorithm and XEP,
>>
>> I'd prefer to salvage XEP-0115 if we possibly can, rather than defining
>> a completely new algorithm and spec.
>>
> 
> The second important part of the email:
> 
> I completely understand why everyone wants to salvage XEP-0115. But we
> have a *security* issue here. Implementations which are insecure at
> the moment must stop working in the future. Trying to let them
> continue to work would be fine if this weren't a security issue, but
> is irresponsible when it is a security issue.

I agree that we need to fix things. I'm trying to find the least
intrusive way to do so.

> We'll just tighten restrictions in the document and assume all is
> fine. 

I never said all was fine. I did say that we have some trivial fixes and
clarifications that address a large part of the problem. We still need
to figure out what the remaining gap is. It appears to be your Attack #2.

> All would not be fine as most implementors will remain
> blissfully unaware of it. New implementators will put in the minimum
> amount of effort necessary to make PEP work, and that will not include
> detailed validation. I'm willing to bet most client authors are
> unaware of there being a security issue at all. Other than checking
> the source code, no-one would really be able to trust any
> implementations to be secure in this respect, because the algorithm
> would be broken by default, with patchwork laid on top to make it not
> leak.

You might be right that we could design a better algorithm. However,
fixing and clarifying some aspects of the existing design is not
therefore a bad thing, unless you argue that it's going to give people a
false sense of security.

> SSL 3.0, TLS 1.0, 1.1, 1.2, and so on. If you looked at the specs,
> you'll see that these are essentially the same protocols, with some
> differences. They could have been patched up quite easily by adding
> some stuff to the spec and staying compatible. They weren't. Newer
> versions were made incompatible with older versions.
> 
>>> which can also fix
>>> many of the non-algorithmic issues with the XEP.
>>
>> What are the non-algorithmic issues?
>>
> 
> Things like hash agility were a goal of the spec. They don't work.
> There was discussion of this in some of the old threads.

Yes, hash agility is good. We already have the ability to specify the
hashing algorithm used, such as hash='sha-1' or hash='sha-256'. What
more do you think is needed?

Peter

-- 
Peter Saint-Andre
https://stpeter.im/


Reply via email to