I fixed the bug where the XML parser failed to
return an error if the buffer limit was reached.

The CPDN problem exercised this bug,
but fixing this bug doesn't fix the CPDN problem.
The issue there is that the field for an MD5 checksum
as defined as char[33] (32 chars for MD5, 1 for NULL).

When the XML parser parses a string, it stores it in the buffer
(and enforces the buffer-size limit)
before it strips whitespace from the start and end of the string.
Because CPDN put a newline after the MD5, this overflowed the 33-char buffer.

I fixed this by changing the MD5 field to char[64].
Now the client will parse CPDN's replies.
(At some point I may change the XML parser so that it
strips whitespace before enforcing the buffer limit).

I've advised CPDN to remove the newline, so that things will work
with existing recent clients.

-- David

On 29-Mar-2013 1:30 PM, Nicolás Alvarez wrote:
2013/3/29, Richard Haselgrove <[email protected]>:
The sched_reply_climateprediction.net.xml files appear to be XML compliant,
but not to conform to BOINC's restricted sub-set of XML. Two tags (that I've
identified so far) are broken over two lines - <nbytes> and <md5_cksum>:
both these tags have additional single LineFeed (0x0A) characters after the
data and before the closing tag.

The version of <md5_cksum> written into client_state.xml by - in my case -
BOINC v7.0.59, 32-bit version, under Windows 7, consists of 40 characters:
the (correct) original 32-character MD5 copied from sched_reply, plus 8
additional characters of - it appears - classic buffer overflow characters.
Needless to say, the 40 character (MD5+buffer) fails to match the 32
character MD5 calculated by the BOINC client for the file: the downloaded
file is rejected, and the MD5 mis-match is recorded in the event log.

Bug: The BOINC XML parser returns successfully and doesn't
null-terminate buffers if the element contains more data than the
buffer would fit.

Here is what happens in a normal situation, where <md5_cksum> contains
exactly 32 characters:

- The FILE_INFO parser calls parse_str("md5_cksum"), passing it a
33-byte buffer.
- parse_str verifies that md5_cksum is indeed the tag name of the last
parsed tag, and calls get_aux(buf,33) to get the text after the
element.
- get_aux reads the first character, since it's not a '<', it calls
copy_until_tag to do the rest. (It also puts the first character in
the buffer and adjusts the copy_until_tag args accordingly, but for
simplicity we can pretend that get_aux ungetc()s the character
instead; the effect would be the same)
- copy_until_tag reads characters one by one, placing them in the
buffer as long as they aren't '<', and checking that there's still
space in the buffer.
- After reading the last (32nd) character of the MD5 hash, len is 1.
- It reads one more character and it's a '<', which indicates the text
content finished. It unreads the '<', puts a null-terminator in the
buffer, and returns.
- parse_str then verifies that the text is followed by a </md5_cksum>
closing tag, and terminates happily. The buffer now has the 32
characters that were inside the XML element, followed by a null
terminator.

Now, here's what happens if we have 33 characters in <md5_cksum>,
starting from copy_until_tag:

- copy_until_tag reads characters one by one, placing them in the
buffer as long as they aren't '<', and checking that there's still
space in the buffer.
- After reading the last (32nd) character of the MD5 hash, len is 1.
- It reads one more character and it's not a '<' (in this case it's a
newline). It subtracts 1 from 'len', and it becomes 0, which indicates
there's more data than the buffer can hold.
- It returns XML_PARSE_OVERFLOW, without modifying the last position
of the buffer at all. It probably contains some garbage.
- get_aux propagates the error code.
- parse_str only checks if get_aux reached EOF, or found the end tag
(which would mean there's no text, as in <foo></foo>). In the case of
PARSE_OVERFLOW it will just continue.
- parse_str checks that the </md5_cksum> tag follows, etc. and returns
successfully.

The FILE_INFO parser now thinks nothing went wrong, since parse_str
returned success. But the md5_cksum array is not null-terminated,
unless its last position already happened to contain a null character
before parsing. Any code that reads file_info.md5_cksum will go past
the array bounds.

I didn't follow the code flow in the case the element contains even
more data (ie. if it would overflow by more than 1 byte), but my guess
is that it would also return successfully, truncate the data silently,
and leave a non-terminated buffer.

_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.

Reply via email to