Re: [Qemu-devel] [Qemu-block] [PATCH v8 00/11] Support streaming to an intermediate layer
On Wed 24 Jun 2015 12:02:35 PM CEST, Alberto Garcia be...@igalia.com wrote: What happens is that in stream_complete() all images that have been removed from the chain ([A], [B] and [C]) are closed, and then [D] is reopened in read-only mode. The reopen queue contains [C], that is still listed as a child of [D] although it's no longer a valid pointer. Here's a simple way to access freed memory with the current master: $ qemu-img create -f qcow2 base.qcow2 100M $ qemu-img create -f qcow2 -o backing_file=base.qcow2 hd.qcow2 $ valgrind qemu-system-x86_64 -enable-kvm -monitor stdio -drive if=virtio,file=hd.qcow2 (qemu) block_stream virtio0 (qemu) drive_del virtio0 ==15925== Thread 1: ==15925== Invalid read of size 8 ==15925==at 0x4C0C55: bdrv_close (block.c:1853) ==15925==by 0x2E12C7: hmp_drive_del (blockdev.c:2173) ==15925==by 0x1E72D5: handle_hmp_command (monitor.c:4058) ==15925==by 0x1E9A44: monitor_command_cb (monitor.c:5081) ==15925==by 0x578443: readline_handle_byte (readline.c:391) ==15925==by 0x1E999E: monitor_read (monitor.c:5064) ==15925==by 0x2E7064: qemu_chr_be_write (qemu-char.c:306) ==15925==by 0x2E88AA: fd_chr_read (qemu-char.c:1012) ==15925==by 0x9B6DB4C: g_main_context_dispatch (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4400.1) ==15925==by 0x4C6D7C: glib_pollfds_poll (main-loop.c:199) ==15925==by 0x4C6E59: os_host_main_loop_wait (main-loop.c:244) ==15925==by 0x4C6F18: main_loop_wait (main-loop.c:493) ==15925== Address 0x1a86f2a0 is 16 bytes after a block of size 256 free'd ==15925==at 0x4C29E90: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==15925==by 0x6215767: pixman_image_unref (in /usr/lib/x86_64-linux-gnu/libpixman-1.so.0.32.6) ==15925==by 0x490131: qemu_pixman_glyph_render (qemu-pixman.c:250) ==15925==by 0x48AC70: vga_putcharxy (console.c:435) ==15925==by 0x48B38C: console_refresh (console.c:567) ==15925==by 0x48CC57: text_console_invalidate (console.c:1155) ==15925==by 0x48A6A2: graphic_hw_invalidate (console.c:269) ==15925==by 0x48E71D: text_console_update_cursor (console.c:1831) ==15925==by 0x4C8245: timerlist_run_timers (qemu-timer.c:502) ==15925==by 0x4C828D: qemu_clock_run_timers (qemu-timer.c:513) ==15925==by 0x4C8596: qemu_clock_run_all_timers (qemu-timer.c:621) ==15925==by 0x4C6F4E: main_loop_wait (main-loop.c:499) ==15925== As explained in my previous e-mail, the problem is that after the block_stream operation, the bs-children list in hd.qcow2 still keeps a pointer to base.qcow2 even when its memory has been freed. Berto
Re: [Qemu-devel] [Qemu-block] [PATCH v8 00/11] Support streaming to an intermediate layer
On Tue 23 Jun 2015 06:48:39 PM CEST, Stefan Hajnoczi wrote: It seems that self.vm.qmp('block-stream', ...) is returning None in your case. Is that the only test that is failing? Yes, only this test fails. I have pushed my tree here: https://github.com/stefanha/qemu/commits/berto-intermediate-streaming I wonder if you see this failure on your machine with my tree. I was debugging the problem. So we have this scenario: [A] - [B] - [C] - [D] - [E] - [F] - [G] The test that is failing is test_stream_parallel(), that creates several concurrent streaming operations, in this order: a) [B] = [C] b) [D] = [E] c) [F] = [G] The way it works is that stream_start() reopens the destination image in read-write mode (if needed), creates the stream_run() coroutine to copy the data, and finally stream_complete() puts it back in read-only mode and closes the images that are removed from the chain. This was not causing problems earlier because each operation works on a different set of images, but it seems that things have changed since the reopen overhaul patches: https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01587.html https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01592.html Here's what happens: - We launch block job a). - While a) is ongoing, b) enters and tries to reopen [E] in read-write mode. - With the reopen overhaul patches, the BlockReopenQueue no longer contains just [E] but also all its backing chain up to [A]. - bdrv_reopen_multiple() calls bdrv_reopen_prepare() and flushes all images. - At this point job a) takes over and finishes, reopening [C] again in read-only mode and closing [B], removing it from memory. - When b) continues (still in bdrv_reopen_multiple) the queue still contains [B], which is no longer a valid pointer. My first reaction was to keep a reference to each BDS in the BlockReopenQueue during the operation, but I don't know why that queue needs to contain all the backing chain in the first place. Berto
Re: [Qemu-devel] [Qemu-block] [PATCH v8 00/11] Support streaming to an intermediate layer
On Wed 24 Jun 2015 10:02:03 AM CEST, Alberto Garcia wrote: [A] - [B] - [C] - [D] - [E] - [F] - [G] [...] The way it works is that stream_start() reopens the destination image in read-write mode (if needed), creates the stream_run() coroutine to copy the data, and finally stream_complete() puts it back in read-only mode and closes the images that are removed from the chain. There's more. If I perform a single streaming operation [A] = [D] this also crashes, although in a different way. block.c:1764: bdrv_reopen_prepare: Assertion `reopen_state-bs-drv != ((void *)0)' failed. What happens is that in stream_complete() all images that have been removed from the chain ([A], [B] and [C]) are closed, and then [D] is reopened in read-only mode. The reopen queue contains [C], that is still listed as a child of [D] although it's no longer a valid pointer. [C] is opened using bdrv_open_backing_file() and attached to [D] with bdrv_attach_child(), but never detached afterwards. Berto
Re: [Qemu-devel] [Qemu-block] [PATCH v8 00/11] Support streaming to an intermediate layer
On Tue, Jun 23, 2015 at 5:09 PM, Alberto Garcia be...@igalia.com wrote: On Tue 23 Jun 2015 05:36:32 PM CEST, Stefan Hajnoczi wrote: Did you try qemu-iotests? I'm still getting: 030 11s ... [failed, exit status 1] - output mismatch (see 030.out.bad) Yes, it works fine here. +Traceback (most recent call last): + File 030, line 202, in test_stream_parallel +self.assert_qmp(result, 'return', {}) This creates several streaming operations in parallel. That particular line just checks the return value of each block-stream command. + File /home/stefanha/qemu/tests/qemu-iotests/iotests.py, line 286, in assert_qmp +result = self.dictpath(d, path) + File /home/stefanha/qemu/tests/qemu-iotests/iotests.py, line 265, in dictpath +self.fail('failed path traversal for %s in %s' % (path, str(d))) +AssertionError: failed path traversal for return in None + -- Ran 17 tests It seems that self.vm.qmp('block-stream', ...) is returning None in your case. Is that the only test that is failing? Yes, only this test fails. I have pushed my tree here: https://github.com/stefanha/qemu/commits/berto-intermediate-streaming I wonder if you see this failure on your machine with my tree. Stefan