Re: XML Security Java 1.5.0-RC1 available

Cantor, Scott Tue, 20 Dec 2011 11:30:46 -0800

On 12/20/11 2:11 PM, "Sean Mullan" <[email protected]> wrote:
>
>I did a grep of the source tree, and our code never calls
>Element.setIdAttribute.


Yes, but apps do. My code does. That's currently the way for schema
validation *and* manual ID setting to result in the same outcome, assuming
you call getElementById on your end.
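
As a rough illustration of the manual path (a minimal sketch using only the standard org.w3c.dom API; the class name, element name, and "ID" attribute name are made up for the example), registering the attribute with the DOM is what makes getElementById see it when no schema validation has run:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class ManualIdRegistration {
    // Builds <Envelope><Body ID="..."/></Envelope> and looks the Body up by ID.
    // The names here are hypothetical; only the DOM calls are real.
    static Element buildAndLookup(String idValue, boolean register) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("Envelope");
        doc.appendChild(root);
        Element body = doc.createElement("Body");
        body.setAttribute("ID", idValue);
        root.appendChild(body);

        if (register) {
            // Marks the existing "ID" attribute as being of type ID, the same
            // end state a validating parse would produce for a schema-typed ID.
            body.setIdAttribute("ID", true);
        }
        // Expected to return null unless the attribute was registered above.
        return doc.getElementById(idValue);
    }
}
```

Calling buildAndLookup("abc", false) should yield null with the JDK's bundled DOM, while buildAndLookup("abc", true) should return the Body element.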

>Usually an app won't have to do that - it is aware of the schema, and
>where the element IDs should be in the document - Colm can comment some
>more on how the WSS library does this.

You have to do a partial tree walk any time you have open extension points
that could be carrying objects that could have IDs. It's not complete
traversal, but for many types of documents, the difference is minimal.

> Our library won't find anything
>that isn't registered, so if you stick something way down in the guts of
>the document, it simply won't find it. (It used to, but not as of 1.5).
>But I can see the duplicate ID issues you mention, if the app uses
>Element.setIdAttribute to register the ID attributes.

Yes. You will find anything that isn't registered *by you*, as long as
it's registered with the DOM.

>As I understand the wrapping attacks, it happens after the signature is
>validated, when the application actually acts on the element content
>that is mapped to that ID. Then, it needs to find that element, and if
>there are duplicate IDs and it gets the wrong one, then oops. As Colm
>mentioned, we do have a mechanism to return the Elements that were
>actually validated.

Right. I agree that it's obviously better to do that, although I wonder
about the performance when dealing with transformed node sets.

I don't have it yet in C++, and I've been hesitant because it's a lot of
work, and I don't have a lot of time to fix things that aren't broken in
my code. I'm particularly unclear how to do it for the general case, not
just a simple ID reference to an element subtree. All I can see to do is
clone the nodes to save them off and return them. Or save the octets I
guess.

Point being, the API can't be just "here's the Element", but rather the
node set or stream.

>But I guess I see an issue in that it is hard for the app to do all
>these extra checks to prevent wrapping attacks. It sounds like what we
>need is an additional optional "sanity" check on the entire document
>looking for duplicate IDs.

My feeling has been that it's a difficult/impossible problem in the XPath
case (you really have to just return the exact nodes) but in the ID case,
if you can guarantee some sort of predictable behavior plus have the app
do transform checking, you have a shot of offloading some of the
significant steps.
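
For the ID case, the optional duplicate-ID "sanity" check discussed above might look something like the sketch below. It is an assumption-laden illustration, not anything in the library: the class and method names are invented, and the set of attribute names treated as IDs ("Id", "ID", "id") is a guess that a real application would replace with what its schemas define.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DuplicateIdCheck {
    // Assumption: attributes with these local names carry IDs.
    static final Set<String> ID_ATTRIBUTE_NAMES =
            new HashSet<>(Arrays.asList("Id", "ID", "id"));

    // Walks the entire document and returns every ID value seen more than once.
    static Set<String> findDuplicateIds(Document doc) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new TreeSet<>();
        collect(doc.getDocumentElement(), seen, duplicates);
        return duplicates;
    }

    private static void collect(Element el, Set<String> seen, Set<String> duplicates) {
        NamedNodeMap attrs = el.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr a = (Attr) attrs.item(i);
            String name = a.getLocalName() != null ? a.getLocalName() : a.getName();
            // Set.add returns false when the value was already present.
            if (ID_ATTRIBUTE_NAMES.contains(name) && !seen.add(a.getValue())) {
                duplicates.add(a.getValue());
            }
        }
        for (Node c = el.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                collect((Element) c, seen, duplicates);
            }
        }
    }

    // Helper for trying the check on a string; namespace-aware, DOCTYPE refused.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An application could refuse to process a document when findDuplicateIds returns a non-empty set. This is a full traversal, so it does nothing for the XPath case and only gives the predictable behavior for plain ID references.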

>>Thus my point. The Xerces team is wrong. Somebody needs to explain that
>>to them, somebody they'll listen to.
>
>If they won't listen to you, I'm sure they won't listen to me ;)

Well, we tried (in fairness Xerces-C is more or less dead, so the open bug
is all I really expected there). Now I think it's down to us defining a
suitable algorithm. It may be that abandoning the DOM API is the right
thing, but I don't think we should do that without some deprecation time.

>Hmm, I suppose we could stop calling Document.getElementById if the
>document was not validated against a schema. Let me think about that
>some more.

I'm not sure if you can tell, actually. Maybe in Java.

I think the most logical thing to do if you're going to deprecate that
call is to make it an application option. Basically I'd have a set of
IdResolvers with different, defined behavior, choose a default for the
time being, possibly deprecate some of them, etc. I think that's cleaner
than trying to create a bunch of options.
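
To make the idea concrete, here is a hypothetical sketch of what a set of pluggable resolvers could look like. None of these types exist in the library; the interface and class names are invented for illustration, and only the org.w3c.dom calls are real.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class IdResolverSketch {
    // Hypothetical strategy interface: each implementation has one defined behavior.
    public interface IdResolverStrategy {
        Element resolve(Document doc, String id);
    }

    // Defers entirely to the DOM: finds whatever the parser or the app registered.
    public static final class DomIdResolver implements IdResolverStrategy {
        @Override
        public Element resolve(Document doc, String id) {
            return doc.getElementById(id);
        }
    }

    // Finds only elements explicitly registered with this resolver.
    public static final class RegisteredOnlyResolver implements IdResolverStrategy {
        private final Map<String, Element> registered = new HashMap<>();

        public void register(String id, Element el) {
            // Refuse duplicates up front rather than silently picking one.
            if (registered.putIfAbsent(id, el) != null) {
                throw new IllegalStateException("duplicate ID: " + id);
            }
        }

        @Override
        public Element resolve(Document doc, String id) {
            Element el = registered.get(id);
            return (el != null && el.getOwnerDocument() == doc) ? el : null;
        }
    }

    // Small helper for experiments: <Envelope><Body ID="..."/></Envelope>,
    // with the ID attribute left unregistered in the DOM.
    public static Document newDocWithBody(String id) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("Envelope");
            doc.appendChild(root);
            Element body = doc.createElement("Body");
            body.setAttribute("ID", id);
            root.appendChild(body);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The application would pick one strategy (or accept the default), and deprecating getElementById then amounts to changing which implementation is the default rather than adding another flag.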

-- Scott
