On Fri, Apr 24, 2009 at 4:10 PM, C. Michael Pilato <cmpil...@collab.net>wrote:
> [Please Cc: me in responses -- I think I still have APR commit privs, but > I'm not active here and not subscribed to the mailing lists.] > > In the past couple of weeks, I've seen two different reports of what > appears > to be corruption in the stream of data transmitted by Subversion's > mod_dav_svn through Apache and back to the Subversion client. What is seen > client-side is an opening XML tag, a truncated bit of CDATA "inside" the > tag, and then a missing XML closing tag. The problem seems to occur with > magically sized chunks of data, so it can be hard to reproduce[1]. > > Here are the relevant pieces of the call stack: > > mod_dav_svn/reports/replay.c's change_file_or_dir_prop() function contains > the following (which is base64-encoding Subversion file and directory > properties, and tossing them into an XML REPORT request response): > > const svn_string_t *enc_value = svn_base64_encode_string2(value, TRUE, > pool); > SVN_ERR(dav_svn__send_xml > (eb->bb, eb->output, > "<S:change-%s-prop name=\"%s\">%s</S:change-%s-prop>" DEBUG_CR, > file_or_dir, qname, enc_value->data, file_or_dir)); > > dav_svn__send_xml() is a wrapper around apr_brigade_vprintf(). > > As you know, apr_brigade_vprintf() (in buckets/apr_brigade.c) looks like > so: > > APU_DECLARE(apr_status_t) apr_brigade_vprintf(apr_bucket_brigade *b, > apr_brigade_flush flush, > void *ctx, > const char *fmt, va_list va) > { > /* the cast, in order of appearance */ > struct brigade_vprintf_data_t vd; > char buf[APR_BUCKET_BUFF_SIZE]; > int written; > > vd.vbuff.curpos = buf; > vd.vbuff.endpos = buf + APR_BUCKET_BUFF_SIZE; > vd.b = b; > vd.flusher = &flush; > vd.ctx = ctx; > vd.cbuff = buf; > > written = apr_vformatter(brigade_flush, &vd.vbuff, fmt, va); > > if (written == -1) { > return -1; > } > > /* tack on null terminator to remaining string */ > *(vd.vbuff.curpos) = '\0'; > > /* write out what remains in the buffer */ > return apr_brigade_write(b, flush, ctx, buf, vd.vbuff.curpos - buf); > } > > The function apr_vformatter() uses the buffer "buf" to format the string. > This function in turn uses the macro INS_CHAR to add characters to the > buffer. INS_CHAR is defined like this: > > #define INS_CHAR(c, sp, bep, cc) \ > { \ > if (sp) { \ > if (sp >= bep) { \ > vbuff->curpos = sp; \ > if (flush_func(vbuff)) \ > return -1; \ > sp = vbuff->curpos; \ > bep = vbuff->endpos; \ > } \ > *sp++ = (c); \ > } \ > cc++; \ > } > > So, when the macro is executed to add a new character to the buffer and the > buffer is full, the flush function is called to make room for the new > character, and then the character is added. Of course, if the buffer has > room for exactly one more character, it is not flushed, the character is > added, and the current position of the buffer is at its end (which is > actually one byte beyond the space allocated for the buffer). > > After the call to apr_vformatter(), there will be stuff in the buffer. In > the special case above, the buffer may be perfectly full (perhaps after > having been flushed one or more times, but still full now). Then, without > checking for that condition, this line is executed: > > /* tack on null terminator to remaining string */ > *(vd.vbuff.curpos) = '\0'; > > Uh-oh. Buffer overflow! > > Our CollabNet engineer is proposing a simple fix: defining 'buf' inside > apr_brigade_vprintf() like so: > > char buf[APR_BUCKET_BUFF_SIZE + 1] > > (Note the "+ 1" to make room for that pesky NULL byte.) > > But I'm wondering if an equally correct fix would be to simply not tack the > '\0' onto the buffer at all. Doesn't apr_brigade_write() accept both the > buffer and the number of bytes to write? Does it really need a > null-terminated string, especially considering that its input could be > arbitrary binary data? Other calls to it pass things like "str" and > "strlen(str)", which would ignore the NULL byte in "str". I agree; it won't have the terminating '\0' once it is written to the brigade, and apr_brigade_write() doesn't need it.