[
https://issues.apache.org/jira/browse/XERCESC-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Scott Cantor closed XERCESC-2043.
---------------------------------
> String pooling in DOMDocumentImpl is unsafe, particularly on 64-bit platforms
> -----------------------------------------------------------------------------
>
> Key: XERCESC-2043
> URL: https://issues.apache.org/jira/browse/XERCESC-2043
> Project: Xerces-C++
> Issue Type: Bug
> Components: DOM
> Affects Versions: 3.1.1
> Environment: Linux, RHEL6, likely any 64-bit OS
> Reporter: Scott Cantor
> Priority: Critical
> Fix For: 3.1.2
>
>
> There seem to be conditions under which an invalid DOM is generated such that
> elements contain a prefix set to the entire qualified name of the element and
> not just the prefix.
> e.g., I had a case where an element called del:Delegate (with del defined up
> above) ended up such that getPrefix() returned "del:Delegate" and not just
> "del".
> After much investigation, I traced this to the use of
> DOMDocumentImpl::getPooledNString when storing the prefix inside
> DOMElementNSImpl::setName()
> This pooling logic is very odd, because it performs a hash of N characters of
> the input string, and compares the pool entries it searches based on N
> characters, but then returns the whole pooled string on a match. The problem
> is that there's one string pool shared between both the getPooledString() and
> getPooledNString() methods, even though they operate based on different
> assumptions. If you get collisions between hash values in the two cases, you
> can end up returning a string inserted by getPooledString() when you call
> getPooledNString().
> In my case, del:Delegate was hashed and stored by getPooledString(), and then
> getPooledNString("del:Delegate", 3) ends up returning "del:Delegate" instead
> of just inserting and returning "del". This is obviously catastrophic.
> Couple things:
> This algorithm is just plain unsafe no matter what because a hash collision
> under the right conditions creates an invalid DOM.
> The source of the collisions may be in part because the hash methods in
> XMLString were written based on XMLSize_t being 32-bits, and now they can be
> 64-bits. The collision probability may be higher now.
> My suggested fix is to split things so that DOMDocumentImpl carries two
> pools, one for the getPooledString method and one for getPooledNString, to
> eliminate any chance of collison resulting in invalid returned strings. The
> fix for that is quite trivial.
> This is a major bug that essentially breaks the parser under the right
> conditions in a very unexpected way. I really need a 3.1.2 to get this fixed,
> and am wiling to contribute to the release process.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]