[ https://issues.apache.org/jira/browse/XERCESC-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874357#comment-17874357 ]
William S Fulton commented on XERCESC-2250: ------------------------------------------- Agreed it is not a security concern as it is not undefined behaviour. The code as it is is just non-functional in many scenarios, but I am surprised it has not been reported before as the Curl NetAccessor takes priority (if configured) as the default NetAccessor over the other NetAccessors. The commit referenced does look like it will fix the problem, but I havn't tried it out yet. Your suggestion of using it along with a minor version bump makes sense to me. Writing a FIFO queue to handle this might be a more strategic fix, but would indeed come with risks and require more testing, especially when compared to the referenced commit possibly being well tested in your separate code base since 2010. I think that performance tuning an implementation would require a reasonably large amount of work and the simple realloc might be faster in some scenarios. Just getting the code fixed would be higher priority. > Curl NetAccessor overflow buffer resulting in NetAcc_InternalError > ------------------------------------------------------------------ > > Key: XERCESC-2250 > URL: https://issues.apache.org/jira/browse/XERCESC-2250 > Project: Xerces-C++ > Issue Type: Bug > Components: Utilities > Affects Versions: 3.2.5 > Reporter: William S Fulton > Priority: Major > > The Curl NetAccessor has a buffer overflow bug > It can be easily replicated if the curl NetAccessor is turned on during > configure: > {{./configure --with-icu --with-curl}} > and then invoking the NetAccessorTest executable using one of the large > files, I get: > {{~/xerces-c $ ./tests/NetAccessorTest file://$(pwd)/doc/program-dom.xml}} > {{Exception during test:}} > {{ internal error in NetAccessor}} > The problem is in CurlURLInputStream::writeCallback which returns a value > less than the expected value the function should consume as cnt != > totalConsume. According to > [https://curl.se/libcurl/c/CURLOPT_WRITEFUNCTION.html] CURL_WRITEFUNC_ERROR > will then be returned. CURL_WRITEFUNC_ERROR is not handled in the switch > statement in CurlURLInputStream::readMore, hence > XMLExcepts::NetAcc_InternalError is thrown. > > I can see that the logic error in CurlURLInputStream::writeCallback is down > to the assumption that libcurl will call this callback just once before > Xerces is able to clear the buffer in CurlURLInputStream::readBytes. Perhaps > this is because the Curl docs linked to above is not clear that up to > CURL_MAX_WRITE_SIZE bytes could be provided on each or multiple invocations > of the callback. Xerces makes the assumption that libcurl would only provide > CURL_MAX_WRITE_SIZE bytes for each call to curl_multi_info_read. This is not > correct on close inspection of the Xerces code at > [https://github.com/curl/curl/blob/160f0233590d0a02422594104ae805e1aa08d3db/lib/cw-out.c#L218] > where there is a loop that will call the callback multiple times. Each > invocation of the callback could expect up to CURL_MAX_WRITE_SIZE bytes to be > consumed. However, Xerces can only handle CURL_MAX_WRITE_SIZE in total for > multiple invocations of the callback due to the buffer definition: > XMLByte CurlURLInputStream::fBuffer[CURL_MAX_WRITE_SIZE]; > > Regarding solutions, one solution would be... > If the number of bytes to consume in CurlURLInputStream::writeCallback would > exceed the size of CurlURLInputStream::fBuffer then we could return > CURL_WRITEFUNC_PAUSE from the callback to defer consuming the bytes, see [ > https://curl.se/libcurl/c/CURLOPT_WRITEFUNCTION.html|https://curl.se/libcurl/c/CURLOPT_WRITEFUNCTION.html]. > However, unnecessarily pausing the network transfer seems inefficient and an > unnecessary performance hit, so probably best not to consider this. > > I think the better solution would be to replace the fixed size buffer > CurlURLInputStream::fBuffer with a dynamically sized buffer. Looking at the > Xerces code base, I don't see any kind of queue container and the STL does > not seem to be used. I was thinking of using the ValueVectorOf container with > template type B say, where B is a struct buffer with a member > XMLByte[CURL_MAX_SIZE_WRITE_SIZE]. A fixed container size of one would equate > to the current implementation. The fix would be to get it to grow if needed. > Probably just a max size of 2 or 3 would be used in reality. Thoughts? -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: c-dev-unsubscr...@xerces.apache.org For additional commands, e-mail: c-dev-h...@xerces.apache.org