Re: [Qemu-block] [PATCH 07/19] block: Really pause block jobs on drain

2018-04-12 Thread Paolo Bonzini
On 12/04/2018 16:25, Kevin Wolf wrote: > This is already the order we have there. What is probably different from > what you envision is that after the parents have concluded, we still > check that they are still quiescent in every iteration. Yes, and that's the quadratic part. > What we could do

Re: [Qemu-block] v2v: -o rhv-upload: Long time spent zeroing the disk

2018-04-12 Thread Richard W.M. Jones
On Thu, Apr 12, 2018 at 03:44:26PM +, Nir Soffer wrote: > On Thu, Apr 12, 2018 at 5:42 PM Eric Blake wrote: > > > On 04/12/2018 05:24 AM, Richard W.M. Jones wrote: > > > > > I don't think we have nbd-server in RHEL, and in any case wouldn't it > > > be better to use qemu-nbd? > > > > > > You

[Qemu-block] [RFC] Intermediate block mirroring

2018-04-12 Thread Alberto Garcia
Hello, I mentioned this some time ago, but I'd like to retake it now: I'm checking how to copy arbitrary nodes on a backing chain, so if I have e.g. [A] <- [B] <- [C] <- [D] I'd like to end up with [A] <- [E] <- [C] <- [D] where [E] is a copy of [B]. The most obvious use case is to move
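The graph change described in this preview can be sketched as follows. This is a toy model only: `ChainNode` and `replace_intermediate` are hypothetical names invented for illustration and are not QEMU's `BlockDriverState` or any QEMU API. It shows nothing more than the end state, where the copy takes over the old node's backing link and the overlay is repointed at the copy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical node in a backing chain; NOT QEMU's BlockDriverState. */
typedef struct ChainNode {
    char name[8];
    char data[16];              /* stand-in for the node's own contents */
    struct ChainNode *backing;  /* next node toward the base image */
} ChainNode;

/*
 * Turn  ... <- [old] <- [overlay]  into  ... <- [copy] <- [overlay].
 * The payload of 'old' is duplicated into 'copy', 'copy' inherits the
 * backing link of 'old', and 'overlay' is repointed at 'copy'.
 * Returns 0 on success, -1 if 'overlay' does not sit directly on 'old'.
 */
int replace_intermediate(ChainNode *overlay, ChainNode *old, ChainNode *copy)
{
    if (!overlay || !old || !copy || overlay->backing != old) {
        return -1;
    }
    memcpy(copy->data, old->data, sizeof(copy->data));
    copy->backing = old->backing;
    overlay->backing = copy;
    return 0;
}
```

With nodes A, B, C, D chained as in the message and a fresh node E, calling `replace_intermediate(&C, &B, &E)` leaves C backed by E and E backed by A. A real implementation would also have to handle writes arriving while the copy is in progress, which this sketch ignores.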

Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-12 Thread David Lee
On Thu, Apr 12, 2018 at 10:23 PM, Fam Zheng wrote: > On Thu, 04/12 21:45, David Lee wrote: >> On Thu, Apr 12, 2018 at 10:16 AM, David Lee wrote: >> >> > My team caught this issue too after switching to CentOS 7.4 with >> >> > qemu-img >> >> > 2.9.0 >> >> > gdb shows exactly the same backtrace wh

Re: [Qemu-block] [Qemu-devel] Withdrawn: Tweak the NBD_OPT_SET_META_CONTEXT layout

2018-04-12 Thread Eric Blake
On 03/31/2018 07:24 AM, Eric Blake wrote: > As mentioned in earlier email threads, I experimented with an > alternative layout to the structs sent across the wire during > NBD_OPT_{LIST,SET}_META_CONTEXT, on the grounds that having the > namespace and leaf-name combined into one string that require

Re: [Qemu-block] v2v: -o rhv-upload: Long time spent zeroing the disk

2018-04-12 Thread Nir Soffer
On Thu, Apr 12, 2018 at 5:42 PM Eric Blake wrote: > On 04/12/2018 05:24 AM, Richard W.M. Jones wrote: > > > I don't think we have nbd-server in RHEL, and in any case wouldn't it > > be better to use qemu-nbd? > > > > You just start a new qemu-nbd process instead of faffing around with > > configu

Re: [Qemu-block] v2v: -o rhv-upload: Long time spent zeroing the disk

2018-04-12 Thread Eric Blake
On 04/12/2018 05:24 AM, Richard W.M. Jones wrote: > I don't think we have nbd-server in RHEL, and in any case wouldn't it > be better to use qemu-nbd? > > You just start a new qemu-nbd process instead of faffing around with > configuration files, kill the qemu-nbd process when you're done, and >

Re: [Qemu-block] [PATCH 07/19] block: Really pause block jobs on drain

2018-04-12 Thread Kevin Wolf
On 12.04.2018 at 15:42, Paolo Bonzini wrote: > On 12/04/2018 15:27, Kevin Wolf wrote: > > Not sure I follow. Let's look at an example. Say, we have a block job > > BlockBackend as the root (because that uses proper layering, unlike > > devices which use aio_disable_external()), connected t

Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-12 Thread Fam Zheng
On Thu, 04/12 21:45, David Lee wrote: > On Thu, Apr 12, 2018 at 10:16 AM, David Lee wrote: > >> > My team caught this issue too after switching to CentOS 7.4 with qemu-img > >> > 2.9.0 > >> > gdb shows exactly the same backtrace when the convert stuck, and we are > >> > on > >> > NFS. > >> > > >>

Re: [Qemu-block] [Qemu-discuss] qemu-img convert stuck

2018-04-12 Thread David Lee
On Thu, Apr 12, 2018 at 10:16 AM, David Lee wrote: >> > My team caught this issue too after switching to CentOS 7.4 with qemu-img >> > 2.9.0 >> > gdb shows exactly the same backtrace when the convert stuck, and we are on >> > NFS. >> > >> > Later we found the following: >> > 1. The stuck can happe

Re: [Qemu-block] [PATCH 07/19] block: Really pause block jobs on drain

2018-04-12 Thread Paolo Bonzini
On 12/04/2018 15:27, Kevin Wolf wrote: > Not sure I follow. Let's look at an example. Say, we have a block job > BlockBackend as the root (because that uses proper layering, unlike > devices which use aio_disable_external()), connected to a qcow2 node > over file. > > 1. The block job issues a req

Re: [Qemu-block] [PATCH 07/19] block: Really pause block jobs on drain

2018-04-12 Thread Kevin Wolf
On 12.04.2018 at 14:02, Paolo Bonzini wrote: > On 12/04/2018 13:53, Kevin Wolf wrote: > >> The problem I have is that there is a direction through which I/O flows > >> (parent-to-child), so why can't draining follow that natural direction. > >> Having to check for the parents' I/O, while d

Re: [Qemu-block] [PATCH 07/19] block: Really pause block jobs on drain

2018-04-12 Thread Paolo Bonzini
On 12/04/2018 13:53, Kevin Wolf wrote: >> The problem I have is that there is a direction through which I/O flows >> (parent-to-child), so why can't draining follow that natural direction. >> Having to check for the parents' I/O, while draining the child, seems >> wrong. Perhaps we can't help it,

Re: [Qemu-block] [PATCH 07/19] block: Really pause block jobs on drain

2018-04-12 Thread Kevin Wolf
On 12.04.2018 at 13:30, Paolo Bonzini wrote: > On 12/04/2018 13:11, Kevin Wolf wrote: > >> Well, there is one gotcha: bdrv_ref protects against disappearance, but > >> bdrv_ref/bdrv_unref are not thread-safe. Am I missing something else? > > > > Apart from the above, if we do an extra bdr

Re: [Qemu-block] [PATCH 07/19] block: Really pause block jobs on drain

2018-04-12 Thread Paolo Bonzini
On 12/04/2018 13:11, Kevin Wolf wrote: >> Well, there is one gotcha: bdrv_ref protects against disappearance, but >> bdrv_ref/bdrv_unref are not thread-safe. Am I missing something else? > > Apart from the above, if we do an extra bdrv_ref/unref we'd also have > to keep track of all the nodes that

Re: [Qemu-block] [PATCH 07/19] block: Really pause block jobs on drain

2018-04-12 Thread Kevin Wolf
On 12.04.2018 at 12:12, Paolo Bonzini wrote: > On 12/04/2018 11:51, Kevin Wolf wrote: > > On 12.04.2018 at 10:37, Paolo Bonzini wrote: > >> On 11/04/2018 18:39, Kevin Wolf wrote: > >>> +bool bdrv_drain_poll(BlockDriverState *bs, bool top_level) > >>> { > >>> /* Execute pendi

Re: [Qemu-block] v2v: -o rhv-upload: Long time spent zeroing the disk

2018-04-12 Thread Richard W.M. Jones
On Thu, Apr 12, 2018 at 09:22:16AM +, Nir Soffer wrote: > I think we can expose NBD using nbd-server and dynamic exports. > It can work like this: > > 0. Install nbd and enable nbd-server on a host, running >as vdsm:kvm, not exporting anything. > > 1. User starts transfer session via oVir

Re: [Qemu-block] [PATCH 07/19] block: Really pause block jobs on drain

2018-04-12 Thread Paolo Bonzini
On 12/04/2018 11:51, Kevin Wolf wrote: > On 12.04.2018 at 10:37, Paolo Bonzini wrote: >> On 11/04/2018 18:39, Kevin Wolf wrote: >>> +bool bdrv_drain_poll(BlockDriverState *bs, bool top_level) >>> { >>> /* Execute pending BHs first and check everything else only after the >>> BHs >>>

Re: [Qemu-block] [PATCH 07/19] block: Really pause block jobs on drain

2018-04-12 Thread Kevin Wolf
On 12.04.2018 at 10:37, Paolo Bonzini wrote: > On 11/04/2018 18:39, Kevin Wolf wrote: > > +bool bdrv_drain_poll(BlockDriverState *bs, bool top_level) > > { > > /* Execute pending BHs first and check everything else only after the > > BHs > > * have executed. */ > > -while

Re: [Qemu-block] v2v: -o rhv-upload: Long time spent zeroing the disk

2018-04-12 Thread Nir Soffer
On Thu, Apr 12, 2018 at 2:07 AM Nir Soffer wrote: > On Tue, Apr 10, 2018 at 6:53 PM Richard W.M. Jones > wrote: > ... > Dan Berrange pointed out earlier on that it might be easier if imageio >> > just exposed NBD, or if we found a way to tunnel NBD requests over web >> sockets (in the format ca

Re: [Qemu-block] [Qemu-devel] [PATCH] iotests: fix 169

2018-04-12 Thread Vladimir Sementsov-Ogievskiy
12.04.2018 11:34, Vladimir Sementsov-Ogievskiy wrote: 11.04.2018 19:11, Max Reitz wrote: On 2018-04-11 15:05, Vladimir Sementsov-Ogievskiy wrote: [...] Hmm, first type? I'm now not sure about, did I really see sha256 mismatch, or something like this (should be error, but found bitmap): --- /

Re: [Qemu-block] [PATCH 18/19] block: Allow graph changes in bdrv_drain_all_begin/end sections

2018-04-12 Thread Paolo Bonzini
On 11/04/2018 18:39, Kevin Wolf wrote: > The much easier and more obviously correct way is to fundamentally > change the way the functions work: Iterate over all BlockDriverStates, > no matter who owns them, and drain them individually. Compensation is > only necessary when a new BDS is created ins

Re: [Qemu-block] [PATCH 16/19] block: Allow AIO_WAIT_WHILE with NULL ctx

2018-04-12 Thread Paolo Bonzini
On 11/04/2018 18:39, Kevin Wolf wrote: > bdrv_drain_all() wants to have a single polling loop for draining the > in-flight requests of all nodes. This means that the AIO_WAIT_WHILE() > condition relies on activity in multiple AioContexts, which is polled > from the mainloop context. We must therefo

Re: [Qemu-block] [PATCH 10/19] block: Drain recursively with a single BDRV_POLL_WHILE()

2018-04-12 Thread Paolo Bonzini
On 11/04/2018 18:39, Kevin Wolf wrote: > +if (atomic_read(&bs->in_flight)) { > +return true; > +} > + > +if (recursive) { > +QLIST_FOREACH_SAFE(child, &bs->children, next, next) { QLIST_FOREACH_SAFE is only safe if child disappears, but not if e.g. next disappears. So
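The hazard raised in this reply can be reproduced with a minimal sketch. It assumes only that a "safe" foreach caches the successor pointer before running the loop body; `Node`, `LIST_FOREACH_SAFE`, `list_remove` and `stale_visits` are hypothetical stand-ins written for this example, not QEMU's queue macros. Removed nodes are flagged rather than freed so the demonstration stays well-defined; with a real `free()` the stale visit would be a use-after-free.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list element; NOT QEMU's BdrvChild. */
typedef struct Node {
    struct Node *next;
    bool removed;   /* set instead of free() to keep the demo well-defined */
    int id;
} Node;

/* Same shape as a "safe" foreach: the successor is cached before the body. */
#define LIST_FOREACH_SAFE(var, head, tmp) \
    for ((var) = (head); (var) && ((tmp) = (var)->next, 1); (var) = (tmp))

/* Unlink n from the list rooted at *head and mark it as removed. */
static void list_remove(Node **head, Node *n)
{
    for (Node **p = head; *p; p = &(*p)->next) {
        if (*p == n) {
            *p = n->next;
            n->removed = true;
            return;
        }
    }
}

/*
 * Walk a three-element list and, on the first iteration only, remove either
 * the current node (victim_offset == 0) or its successor (victim_offset == 1).
 * Returns how many times the loop body was entered with an already removed
 * node, i.e. through a stale cached pointer.
 */
int stale_visits(int victim_offset)
{
    Node c = { NULL, false, 3 };
    Node b = { &c, false, 2 };
    Node a = { &b, false, 1 };
    Node *head = &a, *cur, *tmp;
    int stale = 0;
    bool done = false;

    LIST_FOREACH_SAFE(cur, head, tmp) {
        if (cur->removed) {
            stale++;    /* reached via the cached, now stale, pointer */
            continue;
        }
        if (!done) {
            Node *victim = (victim_offset == 0) ? cur : cur->next;
            if (victim) {
                list_remove(&head, victim);
            }
            done = true;
        }
    }
    return stale;
}
```

Removing the current element is harmless because its successor was cached beforehand. Removing the successor is not, because the cached pointer still refers to the element that was just unlinked.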

Re: [Qemu-block] [PATCH 08/19] block: Remove bdrv_drain_recurse()

2018-04-12 Thread Paolo Bonzini
On 11/04/2018 18:39, Kevin Wolf wrote: > For bdrv_drain(), recursively waiting for child node requests is > pointless because we didn't quiesce their parents, so new requests could > come in anyway. Letting the function work only on a single node makes it > more consistent. > > For subtree drains

Re: [Qemu-block] [PATCH 07/19] block: Really pause block jobs on drain

2018-04-12 Thread Paolo Bonzini
On 11/04/2018 18:39, Kevin Wolf wrote: > +bool bdrv_drain_poll(BlockDriverState *bs, bool top_level) > { > /* Execute pending BHs first and check everything else only after the BHs > * have executed. */ > -while (aio_poll(bs->aio_context, false)); > +if (top_level) { > +

Re: [Qemu-block] [Qemu-devel] [PATCH] iotests: fix 169

2018-04-12 Thread Vladimir Sementsov-Ogievskiy
11.04.2018 19:11, Max Reitz wrote: On 2018-04-11 15:05, Vladimir Sementsov-Ogievskiy wrote: [...] Hmm, first type? I'm now not sure about, did I really see sha256 mismatch, or something like this (should be error, but found bitmap): --- /work/src/qemu/up-169/tests/qemu-iotests/169.out    2018