[ 
https://issues.apache.org/jira/browse/XERCESC-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Cantor resolved XERCESC-2043.
-----------------------------------
    Resolution: Duplicate
      Assignee: Scott Cantor

Closing as duplicate, added fix version to the original bug report.

> String pooling in DOMDocumentImpl is unsafe, particularly on 64-bit platforms
> -----------------------------------------------------------------------------
>
>                 Key: XERCESC-2043
>                 URL: https://issues.apache.org/jira/browse/XERCESC-2043
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: DOM
>    Affects Versions: 3.1.1
>         Environment: Linux, RHEL6, likely any 64-bit OS
>            Reporter: Scott Cantor
>            Assignee: Scott Cantor
>            Priority: Critical
>             Fix For: 3.1.2
>
>
> There seem to be conditions under which an invalid DOM is generated such that 
> elements contain a prefix set to the entire qualified name of the element and 
> not just the prefix.
> e.g., I had a case where an element called del:Delegate (with del defined up 
> above) ended up such that getPrefix() returned "del:Delegate" and not just 
> "del".
> After much investigation, I traced this to the use of 
> DOMDocumentImpl::getPooledNString when storing the prefix inside 
> DOMElementNSImpl::setName()
> This pooling logic is very odd, because it performs a hash of N characters of 
> the input string, and compares the pool entries it searches based on N 
> characters, but then returns the whole pooled string on a match. The problem 
> is that there's one string pool shared between both the getPooledString() and 
> getPooledNString() methods, even though they operate based on different 
> assumptions. If you get collisions between hash values in the two cases, you 
> can end up returning a string inserted by getPooledString() when you call 
> getPooledNString().
> In my case, del:Delegate was hashed and stored by getPooledString(), and then 
> getPooledNString("del:Delegate", 3) ends up returning "del:Delegate" instead 
> of just inserting and returning "del". This is obviously catastrophic.
> Couple things:
> This algorithm is just plain unsafe no matter what because a hash collision 
> under the right conditions creates an invalid DOM.
> The source of the collisions may be in part because the hash methods in 
> XMLString were written based on XMLSize_t being 32-bits, and now they can be 
> 64-bits. The collision probability may be higher now.
> My suggested fix is to split things so that DOMDocumentImpl carries two 
> pools, one for the getPooledString method and one for getPooledNString, to 
> eliminate any chance of collison resulting in invalid returned strings. The 
> fix for that is quite trivial.
> This is a major bug that essentially breaks the parser under the right 
> conditions in a very unexpected way. I really need a 3.1.2 to get this fixed, 
> and am wiling to contribute to the release process.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to