On Thu, Feb 21, 2008 at 1:09 PM, Niklas Edmundsson <[EMAIL PROTECTED]> wrote:
> On Wed, 20 Feb 2008, Niklas Edmundsson wrote:
>
> > In any case, I should probably try to figure out how to reproduce this
> > thing.
> > All coredumps I've looked at have been when serving DVD images, which of
> > course works flawlessly when I try it...
>
> OK, I've been able to reproduce this, and it looks really bad because:
>
> - I'm able to reproduce without having mod_cache loaded, ie. vanilla
>   httpd.
> - It's as easy as continuing an aborted download, so it's a trivial
>   DOS.
>
> So, to reproduce I did:
> 1) Download 2222288895 bytes of the total 4444577792 bytes of a DVD
>    image (debian-31r7-i386-binary-2.iso if you're curious).
> 2) Continue the download by doing wget -cS http://whatever/file.iso
>
> This coredumps the server, immediately closing the connection to the
> client.
>
> Backtrace of coredump is:
> #0  0xffffe410 in __kernel_vsyscall ()
> #1  0xb7cefca6 in kill () from /lib/tls/i686/cmov/libc.so.6
> #2  0x08089a03 in sig_coredump (sig=11) at mpm_common.c:1235
> #3  <signal handler called>
> #4  0x00000000 in ?? ()
> #5  0x08093010 in ap_byterange_filter (f=0x81606a0, bb=0x8161360)
>     at byterange_filter.c:271
> #6  0x0808aec5 in ap_pass_brigade (next=0x81606a0, bb=0x8161360)
>     at util_filter.c:526
> #7  0x08077576 in default_handler (r=0x815f968) at core.c:3740
> #8  0x0807df8d in ap_run_handler (r=0x815f968) at config.c:157
> #9  0x0807e6d7 in ap_invoke_handler (r=0x815f968) at config.c:372
> #10 0x0808ea7c in ap_process_request (r=0x815f968) at http_request.c:258
> #11 0x0808b543 in ap_process_http_connection (c=0x815bb08) at http_core.c:190
> #12 0x08086df3 in ap_run_process_connection (c=0x815bb08) at connection.c:43
> #13 0x08087274 in ap_process_connection (c=0x815bb08, csd=0x815b958)
>     at connection.c:178
> #14 0x08094b00 in process_socket (p=0x815b920, sock=0x815b958,
>     my_child_num=0, my_thread_num=0, bucket_alloc=0x815d928) at worker.c:544
> #15 0x080953c8 in worker_thread (thd=0x812d378, dummy=0x815b460)
>     at worker.c:894
> #16 0xb7e87eac in dummy_worker (opaque=0x812d378)
>     at threadproc/unix/thread.c:142
> #17 0xb7e1846b in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
> #18 0xb7d9873e in clone () from /lib/tls/i686/cmov/libc.so.6
>
> (gdb) dump_bucket ec
>  bucket=¨0¸(0x08161364) length=135664344 data=0x080641b0
>      contents=[**unknown**]          rc=n/a
>
> (gdb) print *ec
> $1 = {link = {next = 0x815db00, prev = 0x8169a50}, type = 0x815d928,
>   length = 135664344, start = -5193905754803399840, data = 0x80641b0,
>   free = 0x8161390, list = 0x1}
>
> (gdb) print *ec->type
> $2 = {name = 0x815b920 "¨À\v\b0ù\025\b\030Ñ\022\b¸9\026\b\030À\025\b",
>   num_func = 135641240, is_metadata = APR_BUCKET_DATA, destroy = 0x816bd00,
>   read = 0x58, setaside = 0x815d928, split = 0x815d910, copy = 0}
>
> (gdb) dump_brigade bb
> dump of brigade 0x8161360
>     | type (address)    | length   | data addr
> ---------------------------------------------------
>   0 | FILE (0x0815db00) | 16777216 | 0x0815daa8
>   1 | FILE (0x0815db58) | 16777216 | 0x0815daa8
> <snip>
> 265 | FILE (0x081699f8) | 16777216 | 0x0815daa8
> 266 | FILE (0x0815d948) | 15392768 | 0x0815daa8
> 267 | EOS  (0x08169a50) | 0        | 0x00000000
> end of brigade
>
> So it looks to me that the bb brigade is intact, but the ec bucket has
> been smashed into bits and pieces...
>
> This is on ubuntu710-i386, configured with:
> ./configure --prefix=/tmp/2.2.8.worker.debug --with-mpm=worker
> --sysconfdir=/var/conf/apache2 --with-included-apr
> --enable-nonportable-atomics=yes --enable-layout=GNU --with-gdbm
> --without-berkeley-db --enable-mods-shared=all --enable-cache=shared
> --enable-disk-cache=shared --enable-ssl=shared --enable-cgi=shared
> --enable-suexec --with-suexec-caller=yada --with-suexec-uidmin=1000
> --with-suexec-gidmin=1000 CFLAGS="-march=i686 -g"
>
> So, is anyone else able to reproduce this?
>
> Any clue on what's the reason? I see some notes in CHANGES about
> reusing brigades and so on, which might be related. However I'm way
> too unclued to figure out even the general area of where things go
> wrong in bucket-land...
>
> I did some other tests, for example fetching 45809664 bytes of the
> file and then continuing, I get this reply:
>   Content-Length: 103800832
>   Content-Range: bytes 45809664-4444577791/4444577792
>
> Which is of course dead wrong, and using wget which trusts
> Content-Length I end up with a truncated file. Talking to a
> httpd-2.2.6 server I get the correct reply.

Hmm, that looks like a 32-bit cutoff.

4444577791 - 45809663 = 4398768128
4398768128 - 103800832 = 4294967296
4294967296 == 2^32

-B

> Something is really messed up in 2.2.8 (and I'm partly to blame, since
> I didn't have time to test it prior to release ;)
>
> An unrelated note: Why on earth chop the poor file into 267 buckets?
> MAX_BUCKET_SIZE in srclib/apr-util/buckets/apr_brigade.c is 1GB (which
> works, that's what I use with my DISKCACHE buckets), where does 16MB
> come from?
>
>
> /Nikke - keeping a brown paper bag handy, any takers?
> --
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     [EMAIL PROTECTED]
> ---------------------------------------------------------------------------
>  Many people are unenthusiastic about your work.
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
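
P.S. For anyone who wants to check the 32-bit cutoff arithmetic without a calculator, here is a minimal Python sketch. It only reproduces the numbers from the headers quoted in this thread. The function names are made up for illustration, and it says nothing about where in httpd the truncation actually happens; it just assumes the remaining length ends up in something that keeps only the low 32 bits.

```python
# Sketch of the suspected truncation. This is not httpd code; the helper
# names are hypothetical and exist only for this illustration.
UINT32_MODULUS = 2 ** 32


def remaining_length(first_byte: int, total_size: int) -> int:
    """Bytes in the open-ended range 'first_byte-' of a total_size-byte file."""
    return total_size - first_byte


def truncate_to_32_bits(value: int) -> int:
    """Keep only the low 32 bits, as an unsigned 32-bit variable would."""
    return value % UINT32_MODULUS


total = 4444577792  # size of the DVD image, from the Content-Range header
first = 45809664    # offset at which the download was resumed

expected = remaining_length(first, total)  # what Content-Length should be
observed = truncate_to_32_bits(expected)   # what a 32-bit cutoff would give

print(expected)  # 4398768128
print(observed)  # 103800832
```

The second value matches the Content-Length that 2.2.8 sent, which is why the cutoff theory looks plausible.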