[ https://issues.apache.org/jira/browse/XERCESC-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874378#comment-17874378 ]
Scott Cantor commented on XERCESC-2250: --------------------------------------- Ok, I think that all fits my understanding. Because it was reported as an overflow, I assumed having not analyzed it that you meant a real C-style issue was extant, but I think it's "safe", just broken, and it's just that the simple change to avoid the exception would indeed cause an overflow. So that's all good, and much less critical. This code is all but dead, so it's not at all surprising there hasn't been a report about it. The default socket NetAccessor doesn't even work so far as I've been able to tell, it's in even worse shape. Given we seem to have consensus about that fix, I will discuss it with whoever else is left to talk it over and when I have the cycles to work on it I'll get my patch applied and probably just slap in a compile time limit as a hard cap to avoid unbounded growth. The ABI issue is complex because while it's not likely anyone else would be linked to this class, we don't prevent it, so though it's ugly to need a 3.3 for something so internal but I don't have a better idea. > Curl NetAccessor overflow buffer resulting in NetAcc_InternalError > ------------------------------------------------------------------ > > Key: XERCESC-2250 > URL: https://issues.apache.org/jira/browse/XERCESC-2250 > Project: Xerces-C++ > Issue Type: Bug > Components: Utilities > Affects Versions: 3.2.5 > Reporter: William S Fulton > Priority: Major > > The Curl NetAccessor has a buffer overflow bug > It can be easily replicated if the curl NetAccessor is turned on during > configure: > {{./configure --with-icu --with-curl}} > and then invoking the NetAccessorTest executable using one of the large > files, I get: > {{~/xerces-c $ ./tests/NetAccessorTest file://$(pwd)/doc/program-dom.xml}} > {{Exception during test:}} > {{ internal error in NetAccessor}} > The problem is in CurlURLInputStream::writeCallback which returns a value > less than the expected value the function should consume as cnt != > totalConsume. According to > [https://curl.se/libcurl/c/CURLOPT_WRITEFUNCTION.html] CURL_WRITEFUNC_ERROR > will then be returned. CURL_WRITEFUNC_ERROR is not handled in the switch > statement in CurlURLInputStream::readMore, hence > XMLExcepts::NetAcc_InternalError is thrown. > > I can see that the logic error in CurlURLInputStream::writeCallback is down > to the assumption that libcurl will call this callback just once before > Xerces is able to clear the buffer in CurlURLInputStream::readBytes. Perhaps > this is because the Curl docs linked to above is not clear that up to > CURL_MAX_WRITE_SIZE bytes could be provided on each or multiple invocations > of the callback. Xerces makes the assumption that libcurl would only provide > CURL_MAX_WRITE_SIZE bytes for each call to curl_multi_info_read. This is not > correct on close inspection of the Xerces code at > [https://github.com/curl/curl/blob/160f0233590d0a02422594104ae805e1aa08d3db/lib/cw-out.c#L218] > where there is a loop that will call the callback multiple times. Each > invocation of the callback could expect up to CURL_MAX_WRITE_SIZE bytes to be > consumed. However, Xerces can only handle CURL_MAX_WRITE_SIZE in total for > multiple invocations of the callback due to the buffer definition: > XMLByte CurlURLInputStream::fBuffer[CURL_MAX_WRITE_SIZE]; > > Regarding solutions, one solution would be... > If the number of bytes to consume in CurlURLInputStream::writeCallback would > exceed the size of CurlURLInputStream::fBuffer then we could return > CURL_WRITEFUNC_PAUSE from the callback to defer consuming the bytes, see [ > https://curl.se/libcurl/c/CURLOPT_WRITEFUNCTION.html|https://curl.se/libcurl/c/CURLOPT_WRITEFUNCTION.html]. > However, unnecessarily pausing the network transfer seems inefficient and an > unnecessary performance hit, so probably best not to consider this. > > I think the better solution would be to replace the fixed size buffer > CurlURLInputStream::fBuffer with a dynamically sized buffer. Looking at the > Xerces code base, I don't see any kind of queue container and the STL does > not seem to be used. I was thinking of using the ValueVectorOf container with > template type B say, where B is a struct buffer with a member > XMLByte[CURL_MAX_SIZE_WRITE_SIZE]. A fixed container size of one would equate > to the current implementation. The fix would be to get it to grow if needed. > Probably just a max size of 2 or 3 would be used in reality. Thoughts? -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: c-dev-unsubscr...@xerces.apache.org For additional commands, e-mail: c-dev-h...@xerces.apache.org