[ https://issues.apache.org/jira/browse/XERCESC-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Scott Cantor reopened XERCESC-2043: ----------------------------------- Reopening, because I need to backport a fix for this to the 3.1 branch. The trunk fix changes the ABI, so we need something different. I'm inclined to just check the string lengths directly without storing them in the StringPool entries, but I need to see if there are any non-null-termed strings. > String pooling in DOMDocumentImpl is unsafe, particularly on 64-bit platforms > ----------------------------------------------------------------------------- > > Key: XERCESC-2043 > URL: https://issues.apache.org/jira/browse/XERCESC-2043 > Project: Xerces-C++ > Issue Type: Bug > Components: DOM > Affects Versions: 3.1.1 > Environment: Linux, RHEL6, likely any 64-bit OS > Reporter: Scott Cantor > Priority: Critical > Fix For: 3.1.2 > > > There seem to be conditions under which an invalid DOM is generated such that > elements contain a prefix set to the entire qualified name of the element and > not just the prefix. > e.g., I had a case where an element called del:Delegate (with del defined up > above) ended up such that getPrefix() returned "del:Delegate" and not just > "del". > After much investigation, I traced this to the use of > DOMDocumentImpl::getPooledNString when storing the prefix inside > DOMElementNSImpl::setName() > This pooling logic is very odd, because it performs a hash of N characters of > the input string, and compares the pool entries it searches based on N > characters, but then returns the whole pooled string on a match. The problem > is that there's one string pool shared between both the getPooledString() and > getPooledNString() methods, even though they operate based on different > assumptions. If you get collisions between hash values in the two cases, you > can end up returning a string inserted by getPooledString() when you call > getPooledNString(). > In my case, del:Delegate was hashed and stored by getPooledString(), and then > getPooledNString("del:Delegate", 3) ends up returning "del:Delegate" instead > of just inserting and returning "del". This is obviously catastrophic. > Couple things: > This algorithm is just plain unsafe no matter what because a hash collision > under the right conditions creates an invalid DOM. > The source of the collisions may be in part because the hash methods in > XMLString were written based on XMLSize_t being 32-bits, and now they can be > 64-bits. The collision probability may be higher now. > My suggested fix is to split things so that DOMDocumentImpl carries two > pools, one for the getPooledString method and one for getPooledNString, to > eliminate any chance of collison resulting in invalid returned strings. The > fix for that is quite trivial. > This is a major bug that essentially breaks the parser under the right > conditions in a very unexpected way. I really need a 3.1.2 to get this fixed, > and am wiling to contribute to the release process. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: c-dev-unsubscr...@xerces.apache.org For additional commands, e-mail: c-dev-h...@xerces.apache.org