On 22/03/2019 18:54, Alberto Garcia wrote:
> On Thu 21 Mar 2019 03:51:12 PM CET, Alberto Garcia <be...@igalia.com> wrote:
>
>> I was checking the tests that run commit and stream in parallel in
>> 030, but they do commit on the upper images and stream on the lower
>> ones, so that's safe. I'll try to run them the other way around
>> because we might have a problem there.
>
> I considered these scenarios with the following backing chain:
>
>    E <- D <- C <- B <- A
>
> 1) stream from C to A, then commit from C to E
>
>    This fails because qmp_block_commit() checks for op blockers in C's
>    overlay (B), which is blocked by the stream block job.
>    ("Node 'B' is busy: block device is in use by block job: stream")
>
> 2) commit from C to E, then stream from C to A
>
>    This fails because the commit job inserts a filter between C and B
>    and the bdrv_freeze_backing_chain(bs, base) call in stream_start()
>    fails.
>
> However! I found this crash in a couple of occasions, I believe that
> it happens if the commit job finishes before block_stream, but I need
> to debug it further to see why the previous error didn't happen.
>
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x0000559aca6e745d in stream_prepare (job=0x559acdafad70) at
> block/stream.c:80
> 80          base_fmt = base->drv->format_name;
> (gdb) print base
> $1 = (BlockDriverState *) 0x559acd070240
> (gdb) print base->drv
> $2 = (BlockDriver *) 0xb5b5b5b5b5b5b5b5
> (gdb) bt
> #0  0x0000559aca6e745d in stream_prepare (job=0x559acdafad70) at
> block/stream.c:80
> #1  0x0000559aca973a40 in job_prepare (job=0x559acdafad70) at job.c:771
> #2  0x0000559aca9722fd in job_txn_apply (txn=0x559acd01e6d0,
> fn=0x559aca973a03 <job_prepare>) at job.c:146
> #3  0x0000559aca973ad2 in job_do_finalize (job=0x559acdafad70) at job.c:788
> #4  0x0000559aca973ca0 in job_completed_txn_success (job=0x559acdafad70) at
> job.c:842
> #5  0x0000559aca973d3d in job_completed (job=0x559acdafad70) at job.c:855
> #6  0x0000559aca973d8c in job_exit (opaque=0x559acdafad70) at job.c:874
> #7  0x0000559acaa99c55 in aio_bh_call (bh=0x559acd3247f0) at util/async.c:90
> #8  0x0000559acaa99ced in aio_bh_poll (ctx=0x559accfb9a30) at util/async.c:118
> #9  0x0000559acaa9ebc0 in aio_dispatch (ctx=0x559accfb9a30) at
> util/aio-posix.c:460
> #10 0x0000559acaa9a088 in aio_ctx_dispatch (source=0x559accfb9a30,
> callback=0x0, user_data=0x0) at util/async.c:261
> #11 0x00007f7d8e7787f7 in g_main_context_dispatch () from
> /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #12 0x0000559acaa9d4bf in glib_pollfds_poll () at util/main-loop.c:222
> #13 0x0000559acaa9d539 in os_host_main_loop_wait (timeout=0) at
> util/main-loop.c:245
> #14 0x0000559acaa9d63e in main_loop_wait (nonblocking=0) at
> util/main-loop.c:521
> #15 0x0000559aca6c0ace in main_loop () at vl.c:1969
> #16 0x0000559aca6c7db3 in main (argc=18, argv=0x7ffe11ee6d58,
> envp=0x7ffe11ee6df0) at vl.c:4589
>
> So we need to look into this :( but I'd say that it seems that stream
> should not need 'base' at all, just the node on top of it.
>
> Berto
>

Meanwhile, I will tackle a new series that uses a 'bottom node' instead
of the 'base'...

Andrey