On Thu, Feb 21, 2008 at 1:09 PM, Niklas Edmundsson <[EMAIL PROTECTED]> wrote:
> On Wed, 20 Feb 2008, Niklas Edmundsson wrote:
>
> > In any case, I should probably try to figure out how to reproduce this
> > thing.
> > All coredumps I've looked at have been when serving DVD images, which of
> > course works flawlessly when I try it...
>
> OK, I've been able to reproduce this, and it looks really bad because:
>
> - I'm able to reproduce without having mod_cache loaded, i.e. vanilla
> httpd.
> - It's as easy as continuing an aborted download, so it's a trivial
> DoS.
>
> So, to reproduce I did:
> 1) Download 2222288895 bytes of the total 4444577792 bytes of a DVD
> image (debian-31r7-i386-binary-2.iso if you're curious).
> 2) Continue the download by doing wget -cS http://whatever/file.iso
>
> This coredumps the server, immediately closing the connection to the
> client.
>
> Backtrace of coredump is:
> #0 0xffffe410 in __kernel_vsyscall ()
> #1 0xb7cefca6 in kill () from /lib/tls/i686/cmov/libc.so.6
> #2 0x08089a03 in sig_coredump (sig=11) at mpm_common.c:1235
> #3 <signal handler called>
> #4 0x00000000 in ?? ()
> #5 0x08093010 in ap_byterange_filter (f=0x81606a0, bb=0x8161360)
> at byterange_filter.c:271
> #6 0x0808aec5 in ap_pass_brigade (next=0x81606a0, bb=0x8161360)
> at util_filter.c:526
> #7 0x08077576 in default_handler (r=0x815f968) at core.c:3740
> #8 0x0807df8d in ap_run_handler (r=0x815f968) at config.c:157
> #9 0x0807e6d7 in ap_invoke_handler (r=0x815f968) at config.c:372
> #10 0x0808ea7c in ap_process_request (r=0x815f968) at http_request.c:258
> #11 0x0808b543 in ap_process_http_connection (c=0x815bb08) at http_core.c:190
> #12 0x08086df3 in ap_run_process_connection (c=0x815bb08) at connection.c:43
> #13 0x08087274 in ap_process_connection (c=0x815bb08, csd=0x815b958)
> at connection.c:178
> #14 0x08094b00 in process_socket (p=0x815b920, sock=0x815b958,
> my_child_num=0,
> my_thread_num=0, bucket_alloc=0x815d928) at worker.c:544
> #15 0x080953c8 in worker_thread (thd=0x812d378, dummy=0x815b460)
> at worker.c:894
> #16 0xb7e87eac in dummy_worker (opaque=0x812d378)
> at threadproc/unix/thread.c:142
> #17 0xb7e1846b in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
> #18 0xb7d9873e in clone () from /lib/tls/i686/cmov/libc.so.6
>
> (gdb) dump_bucket ec
> bucket=¨0¸(0x08161364) length=135664344 data=0x080641b0
> contents=[**unknown**] rc=n/a
>
> (gdb) print *ec
> $1 = {link = {next = 0x815db00, prev = 0x8169a50}, type = 0x815d928,
> length = 135664344, start = -5193905754803399840, data = 0x80641b0,
> free = 0x8161390, list = 0x1}
>
> (gdb) print *ec->type
> $2 = {name = 0x815b920 "¨À\v\b0ù\025\b\030Ñ\022\b¸9\026\b\030À\025\b",
> num_func = 135641240, is_metadata = APR_BUCKET_DATA, destroy = 0x816bd00,
> read = 0x58, setaside = 0x815d928, split = 0x815d910, copy = 0}
>
> (gdb) dump_brigade bb
> dump of brigade 0x8161360
> | type (address) | length | data addr
> ---------------------------------------------------
> 0 | FILE (0x0815db00) | 16777216 | 0x0815daa8
> 1 | FILE (0x0815db58) | 16777216 | 0x0815daa8
> <snip>
> 265 | FILE (0x081699f8) | 16777216 | 0x0815daa8
> 266 | FILE (0x0815d948) | 15392768 | 0x0815daa8
> 267 | EOS (0x08169a50) | 0 | 0x00000000
> end of brigade
>
> So it looks to me that the bb brigade is intact, but the ec bucket has
> been smashed into bits and pieces...
>
> This is on ubuntu710-i386, configured with:
> ./configure --prefix=/tmp/2.2.8.worker.debug --with-mpm=worker
> --sysconfdir=/var/conf/apache2 --with-included-apr
> --enable-nonportable-atomics=yes --enable-layout=GNU --with-gdbm
> --without-berkeley-db --enable-mods-shared=all --enable-cache=shared
> --enable-disk-cache=shared --enable-ssl=shared --enable-cgi=shared
> --enable-suexec --with-suexec-caller=yada --with-suexec-uidmin=1000
> --with-suexec-gidmin=1000 CFLAGS="-march=i686 -g"
>
> So, is anyone else able to reproduce this?
>
> Any clue on what's the reason? I see some notes in CHANGES about
> reusing brigades and so on, which might be related. However I'm way
> too unclued to figure out even the general area of where things go
> wrong in bucket-land...
>
> I did some other tests, for example fetching 45809664 bytes of the
> file and then continuing, I get this reply:
> Content-Length: 103800832
> Content-Range: bytes 45809664-4444577791/4444577792
>
> Which is of course dead wrong, and since wget trusts Content-Length
> I end up with a truncated file. Talking to an httpd-2.2.6 server I
> get the correct reply.
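Hmm, that looks like a 32-bit cutoff. The Content-Length should have been:
4444577791 - 45809663 = 4398768128
and the value the server actually sent is short by exactly 2^32:
4398768128 - 103800832 = 4294967296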
4294967296 == 2^32
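For anyone who wants to re-check the numbers, here is a quick Python sketch. It only assumes that the length is being squeezed through an unsigned 32-bit value somewhere along the way; I have not traced where in httpd that would happen, and the helper name truncate_u32 is made up for illustration.

```python
# Sanity check of the 32-bit cutoff theory, using the figures from the
# Content-Length / Content-Range reply quoted in this thread.
# Assumption: the length is truncated to an unsigned 32-bit value
# somewhere; this does not identify the actual httpd code path.

FILE_SIZE = 4444577792        # total size of the ISO, from Content-Range
RANGE_START = 45809664        # first byte requested when continuing
REPORTED_LENGTH = 103800832   # Content-Length that the 2.2.8 server sent


def truncate_u32(value):
    """Keep only the low 32 bits (hypothetical helper, for illustration)."""
    return value & 0xFFFFFFFF


expected_length = FILE_SIZE - RANGE_START
truncated_length = truncate_u32(expected_length)

print("expected Content-Length:", expected_length)
print("after 32-bit truncation:", truncated_length)
print("matches reported value: ", truncated_length == REPORTED_LENGTH)
```

If the last line prints True, the bogus Content-Length is exactly what is left after dropping everything above bit 31 of the real length.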
-B
>
> Something is really messed up in 2.2.8 (and I'm partly to blame, since
> I didn't have time to test it prior to release ;)
>
> An unrelated note: Why on earth chop the poor file into 267 buckets?
> MAX_BUCKET_SIZE in srclib/apr-util/buckets/apr_brigade.c is 1GB (which
> works, that's what I use with my DISKCACHE buckets), where does 16MB
> come from?
>
>
> /Nikke - keeping a brown paper bag handy, any takers?
> --
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | [EMAIL PROTECTED]
> ---------------------------------------------------------------------------
> Many people are unenthusiastic about your work.
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>