> On 03.06.2011 10:52, Julian Foad wrote: > > Hi Stefan2 and others. > > > > This patch simplifies code in RA-svn/marshal.c by using a single code > > path to read both short and long strings efficiently. Using a single > > code path is beneficial for test coverage. > > > > It is an alternative to r1028352 which merged r985606 and r1028092 from > > the performance branch. > > > > I have run the test suite over RA-svn (with BDB) but haven't done any > > more testing than that. It's just something I noticed while looking at > > that code, it's not solving a problem I've noticed or anything like > > that. By inspection I believe it is more efficient than > > read_long_string() was, and unchanged for short strings, but if you're > > able to do any performance analysis on it, that would be great.
Daniel, thanks for your detailed review comments. I'll address them if we decide the patch is useful. But Stefan has another idea. Stefan Fuhrmann wrote: > Thanks for the patch. Currently, I 'm in the process > of moving and won't find time for a detailed review > and test until sometime next week. > > But maybe the right solution would be replacing > read_long_string() with a simple error return. That > is the only way to limit memory consumption on the > server side. If the protocol has been we implemented > correctly, no string should be longer than about 100k > (a svndiff window). So, rejecting everything above, > say, 16M should be fine. Sounds good. If we can set some maximum length above which the code should throw an error, then I offer the patch attached below instead. I need to rely on others to confirm that approach as I don't know that to be true. (The patch as attached here is made ignoring white space, so applying it will give wrong indentation. And I'd want to change the name of the max length constant as well.) - Julian
Simplify RA-svn's string marshalling. This is a follow-up to r1028352 which merged r985606 and r1028092 from the performance branch. That change used separate functions to handle short and long strings. This change instead detects and fails on strings that are too long to be valid. * subversion/libsvn_ra_svn/marshal.c (SUSPICIOUSLY_HUGE_STRING_SIZE_THRESHOLD): Re-write doc string. (read_long_string): Remove. (read_string): Fail if the received string claims to be impossibly long. --This line, and those below, will be ignored-- Index: subversion/libsvn_ra_svn/marshal.c =================================================================== --- subversion/libsvn_ra_svn/marshal.c (revision 1132599) +++ subversion/libsvn_ra_svn/marshal.c (working copy) @@ -46,10 +46,10 @@ #define svn_iswhitespace(c) ((c) == ' ' || (c) == '\n') -/* If we receive data that *claims* to be followed by a very long string, - * we should not trust that claim right away. But everything up to 1 MB - * should be too small to be instrumental for a DOS attack. */ - +/* The maximum legitimate size of a received string. This just needs to be + * large enough to contain an svndiff window and small enough not to be + * instrumental in a DOS attack. + * ### Can we base this more concretely on SVN_DELTA_WINDOW_SIZE? */ #define SUSPICIOUSLY_HUGE_STRING_SIZE_THRESHOLD (0x100000) /* --- CONNECTION INITIALIZATION --- */ @@ -563,36 +563,6 @@ /* --- READING DATA ITEMS --- */ -/* Read LEN bytes from CONN into a supposedly empty STRINGBUF. - * POOL will be used for temporary allocations. */ -static svn_error_t * -read_long_string(svn_ra_svn_conn_t *conn, apr_pool_t *pool, - svn_stringbuf_t *stringbuf, apr_uint64_t len) -{ - char readbuf[4096]; - apr_size_t readbuf_len; - - /* We can't store strings longer than the maximum size of apr_size_t, - * so check for wrapping */ - if (((apr_size_t) len) < len) - return svn_error_create(SVN_ERR_RA_SVN_MALFORMED_DATA, NULL, - _("String length larger than maximum")); - - while (len) - { - readbuf_len = len > sizeof(readbuf) ? sizeof(readbuf) : (apr_size_t)len; - - SVN_ERR(readbuf_read(conn, pool, readbuf, readbuf_len)); - /* Read into a stringbuf_t to so we don't allow the sender to allocate - * an arbitrary amount of memory without actually sending us that much - * data */ - svn_stringbuf_appendbytes(stringbuf, readbuf, readbuf_len); - len -= readbuf_len; - } - - return SVN_NO_ERROR; -} - /* Read LEN bytes from CONN into already-allocated structure ITEM. * Afterwards, *ITEM is of type 'SVN_RA_SVN_STRING', and its string * data is allocated in POOL. */ @@ -601,27 +571,15 @@ { svn_stringbuf_t *stringbuf; - /* We should not use large strings in our protocol. However, we may - * receive a claim that a very long string is going to follow. In that - * case, we start small and wait for all that data to actually show up. - * This does not fully prevent DOS attacs but makes them harder (you - * have to actually send gigabytes of data). - */ + /* We should not use large strings in our protocol. Fail if we + * receive a claim that an impossibly long string is going to follow, + * in order to prevent a simple kind of DOS attack. */ if (len > SUSPICIOUSLY_HUGE_STRING_SIZE_THRESHOLD) - { - /* This string might take a large amount of memory. Don't allocate - * the whole buffer at once, so to prevent OOM issues by corrupted - * network data. - */ - stringbuf = svn_stringbuf_create("", pool); - SVN_ERR(read_long_string(conn, pool, stringbuf, len)); - } - else - { + return svn_error_create(SVN_ERR_RA_SVN_MALFORMED_DATA, NULL, + _("Received string is impossibly long")); + /* This is a reasonably sized string. So, provide a buffer large - * enough to prevent re-allocation as long as the data transmission - * is not flawed. - */ + * enough not to need re-allocation. */ stringbuf = svn_stringbuf_create_ensure((apr_size_t)len, pool); /* Read the string data directly into the string structure. @@ -637,7 +595,6 @@ stringbuf->data[stringbuf->len] = '\0'; len -= readbuf_len; } - } /* Return the string properly wrapped into an RA_SVN item. */ item->kind = SVN_RA_SVN_STRING;