2013/3/29, Richard Haselgrove <[email protected]>:
> The sched_reply_climateprediction.net.xml files appear to be XML compliant,
> but not to conform to BOINC's restricted sub-set of XML. Two tags (that I've
> identified so far) are broken over two lines - <nbytes> and <md5_cksum>:
> both these tags have additional single LineFeed (0x0A) characters after the
> data and before the closing tag.
>
> The version of <md5_cksum> written into client_state.xml by - in my case -
> BOINC v7.0.59, 32-bit version, under Windows 7, consists of 40 characters:
> the (correct) original 32-character MD5 copied from sched_reply, plus 8
> additional characters of - it appears - classic buffer overflow characters.
> Needless to say, the 40 character (MD5+buffer) fails to match the 32
> character MD5 calculated by the BOINC client for the file: the downloaded
> file is rejected, and the MD5 mis-match is recorded in the event log.
Bug: The BOINC XML parser returns successfully and doesn't
null-terminate buffers if the element contains more data than the
buffer would fit.
Here is what happens in a normal situation, where <md5_cksum> contains
exactly 32 characters:
- The FILE_INFO parser calls parse_str("md5_cksum"), passing it a
33-byte buffer.
- parse_str verifies that md5_cksum is indeed the tag name of the last
parsed tag, and calls get_aux(buf,33) to get the text after the
element.
- get_aux reads the first character, since it's not a '<', it calls
copy_until_tag to do the rest. (It also puts the first character in
the buffer and adjusts the copy_until_tag args accordingly, but for
simplicity we can pretend that get_aux ungetc()s the character
instead; the effect would be the same)
- copy_until_tag reads characters one by one, placing them in the
buffer as long as they aren't '<', and checking that there's still
space in the buffer.
- After reading the last (32nd) character of the MD5 hash, len is 1.
- It reads one more character and it's a '<', which indicates the text
content finished. It unreads the '<', puts a null-terminator in the
buffer, and returns.
- parse_str then verifies that the text is followed by a </md5_cksum>
closing tag, and terminates happily. The buffer now has the 32
characters that were inside the XML element, followed by a null
terminator.
Now, here's what happens if we have 33 characters in <md5_cksum>,
starting from copy_until_tag:
- copy_until_tag reads characters one by one, placing them in the
buffer as long as they aren't '<', and checking that there's still
space in the buffer.
- After reading the last (32nd) character of the MD5 hash, len is 1.
- It reads one more character and it's not a '<' (in this case it's a
newline). It subtracts 1 from 'len', and it becomes 0, which indicates
there's more data than the buffer can hold.
- It returns XML_PARSE_OVERFLOW, without modifying the last position
of the buffer at all. It probably contains some garbage.
- get_aux propagates the error code.
- parse_str only checks if get_aux reached EOF, or found the end tag
(which would mean there's no text, as in <foo></foo>). In the case of
PARSE_OVERFLOW it will just continue.
- parse_str checks that the </md5_cksum> tag follows, etc. and returns
successfully.
The FILE_INFO parser now thinks nothing went wrong, since parse_str
returned success. But the md5_cksum array is not null-terminated,
unless its last position already happened to contain a null character
before parsing. Any code that reads file_info.md5_cksum will go past
the array bounds.
I didn't follow the code flow in the case the element contains even
more data (ie. if it would overflow by more than 1 byte), but my guess
is that it would also return successfully, truncate the data silently,
and leave a non-terminated buffer.
--
Nicolás
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.