Re: [PATCH] aio_wait_kick: add missing memory barrier

2022-06-04 Thread Roman Kagan
On Tue, May 24, 2022 at 01:30:54PM -0400, Emanuele Giuseppe Esposito wrote: > It seems that aio_wait_kick always required a memory barrier > or atomic operation in the caller, but nobody actually > took care of doing it. > > Let's put the barrier in the function instead, and pair it > with

Re: [PATCH 00/10] vhost: stick to -errno error return convention

2021-11-29 Thread Roman Kagan
On Sun, Nov 28, 2021 at 04:47:20PM -0500, Michael S. Tsirkin wrote: > On Thu, Nov 11, 2021 at 06:33:44PM +0300, Roman Kagan wrote: > > Error propagation between the generic vhost code and the specific backends > > is > > not quite consistent: some places follow &qu

Re: [PATCH 01/10] vhost-user-blk: reconnect on any error during realize

2021-11-12 Thread Roman Kagan
On Fri, Nov 12, 2021 at 12:37:59PM +0100, Kevin Wolf wrote: > Am 12.11.2021 um 08:39 hat Roman Kagan geschrieben: > > On Thu, Nov 11, 2021 at 06:52:30PM +0100, Kevin Wolf wrote: > > > Am 11.11.2021 um 16:33 hat Roman Kagan geschrieben: > > > > vhost-user-blk real

Re: [PATCH 04/10] chardev/char-fe: don't allow EAGAIN from blocking read

2021-11-12 Thread Roman Kagan
On Fri, Nov 12, 2021 at 12:24:06PM +0400, Marc-André Lureau wrote: > Hi > > On Thu, Nov 11, 2021 at 7:44 PM Roman Kagan wrote: > > > As its name suggests, ChardevClass.chr_sync_read is supposed to do a > > blocking read. The only implementation of it, tcp_chr

Re: [PATCH 05/10] vhost-backend: avoid overflow on memslots_limit

2021-11-12 Thread Roman Kagan
On Fri, Nov 12, 2021 at 09:56:17AM +, Daniel P. Berrangé wrote: > On Fri, Nov 12, 2021 at 10:46:46AM +0300, Roman Kagan wrote: > > On Thu, Nov 11, 2021 at 06:59:43PM +0100, Philippe Mathieu-Daudé wrote: > > > On 11/11/21 16:33, Roman Kagan wrote: > > > > Fi

Re: [PATCH 00/10] vhost: stick to -errno error return convention

2021-11-12 Thread Roman Kagan
On Thu, Nov 11, 2021 at 03:14:56PM -0500, Michael S. Tsirkin wrote: > On Thu, Nov 11, 2021 at 06:33:44PM +0300, Roman Kagan wrote: > > Error propagation between the generic vhost code and the specific backends > > is > > not quite consistent: some places follow &qu

Re: [PATCH 05/10] vhost-backend: avoid overflow on memslots_limit

2021-11-11 Thread Roman Kagan
On Thu, Nov 11, 2021 at 06:59:43PM +0100, Philippe Mathieu-Daudé wrote: > On 11/11/21 16:33, Roman Kagan wrote: > > Fix the (hypothetical) potential problem when the value parsed out of > > the vhost module parameter in sysfs overflows the return value from > > vhost_

Re: [PATCH 01/10] vhost-user-blk: reconnect on any error during realize

2021-11-11 Thread Roman Kagan
On Thu, Nov 11, 2021 at 06:52:30PM +0100, Kevin Wolf wrote: > Am 11.11.2021 um 16:33 hat Roman Kagan geschrieben: > > vhost-user-blk realize only attempts to reconnect if the previous > > connection attempt failed on "a problem with the connection and not an > > error rel

[PATCH 09/10] vhost: stick to -errno error return convention

2021-11-11 Thread Roman Kagan
is ok, instead of taking recovery actions (break and reestablish the vhost-user connection, cancel migration, etc) before it's too late. To fix this, consolidate on the convention to return negated errno on failures throughout generic vhost, and use it for error propagation. Signed-off-by: Roman

[PATCH 08/10] vhost-user: stick to -errno error return convention

2021-11-11 Thread Roman Kagan
return convention with the other two vhost backends, kernel and vdpa, and will therefore allow for consistent error propagation in the generic vhost code (in a followup patch). Signed-off-by: Roman Kagan --- hw/virtio/vhost-user.c | 401 +++-- 1 file changed, 223

[PATCH 07/10] vhost-vdpa: stick to -errno error return convention

2021-11-11 Thread Roman Kagan
the error codes wherever it's appropriate. Signed-off-by: Roman Kagan --- hw/virtio/vhost-vdpa.c | 37 +++-- 1 file changed, 23 insertions(+), 14 deletions(-) diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c index 0d8051426c..a3b885902a 100644 --- a/hw

[PATCH 04/10] chardev/char-fe: don't allow EAGAIN from blocking read

2021-11-11 Thread Roman Kagan
it in qemu_chr_fe_read_all; instead place an assertion that it doesn't fail with EAGAIN. Signed-off-by: Roman Kagan --- chardev/char-fe.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/chardev/char-fe.c b/chardev/char-fe.c index 7789f7be9c..f94efe928e 100644 --- a/chardev/char

[PATCH 06/10] vhost-backend: stick to -errno error return convention

2021-11-11 Thread Roman Kagan
Almost all VhostOps methods in kernel_ops follow the convention of returning negated errno on error. Adjust the only one that doesn't. Signed-off-by: Roman Kagan --- hw/virtio/vhost-backend.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/virtio/vhost-backend.c b/hw

[PATCH 10/10] vhost-user-blk: propagate error return from generic vhost

2021-11-11 Thread Roman Kagan
Fix the only callsite that doesn't propagate the error code from the generic vhost code. Signed-off-by: Roman Kagan --- hw/block/vhost-user-blk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index f9b17f6813

[PATCH 00/10] vhost: stick to -errno error return convention

2021-11-11 Thread Roman Kagan
trigger an immediate connection drop and reconnection, leaving it in a broken state. Rework error propagation to always return negated errno on errors and correctly pass it up the stack. Roman Kagan (10): vhost-user-blk: reconnect on any error during realize chardev/char-socket: tcp_chr_recv:

[PATCH 02/10] chardev/char-socket: tcp_chr_recv: don't clobber errno

2021-11-11 Thread Roman Kagan
tcp_chr_recv communicates the specific error condition to the caller via errno. However, after setting it, it may call into some system calls or library functions which can clobber the errno. Avoid this by moving the errno assignment to the end of the function. Signed-off-by: Roman Kagan

[PATCH 03/10] chardev/char-socket: tcp_chr_sync_read: don't clobber errno

2021-11-11 Thread Roman Kagan
After the return from tcp_chr_recv, tcp_chr_sync_read calls into a function which eventually makes a system call and may clobber errno. Make a copy of errno right after tcp_chr_recv and restore the errno on return from tcp_chr_sync_read. Signed-off-by: Roman Kagan --- chardev/char-socket.c | 3

[PATCH 01/10] vhost-user-blk: reconnect on any error during realize

2021-11-11 Thread Roman Kagan
This patch amends a527e312b5 "vhost-user-blk: Implement reconnection during realize". Signed-off-by: Roman Kagan --- hw/block/vhost-user-blk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c index ba13cb87e

[PATCH 05/10] vhost-backend: avoid overflow on memslots_limit

2021-11-11 Thread Roman Kagan
Fix the (hypothetical) potential problem when the value parsed out of the vhost module parameter in sysfs overflows the return value from vhost_kernel_memslots_limit. Signed-off-by: Roman Kagan --- hw/virtio/vhost-backend.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw

Re: [PATCH v0 0/2] virtio-blk and vhost-user-blk cross-device migration

2021-10-04 Thread Roman Kagan
On Mon, Oct 04, 2021 at 11:11:00AM -0400, Michael S. Tsirkin wrote: > On Mon, Oct 04, 2021 at 06:07:29PM +0300, Denis Plotnikov wrote: > > It might be useful for the cases when a slow block layer should be replaced > > with a more performant one on running VM without stopping, i.e. with very > >

Re: [PATCH v3 06/33] util/async: aio_co_schedule(): support reschedule in same ctx

2021-05-14 Thread Roman Kagan
On Thu, May 13, 2021 at 11:04:37PM +0200, Paolo Bonzini wrote: > On 12/05/21 09:15, Vladimir Sementsov-Ogievskiy wrote: > > > > > > > > > > I don't understand.  Why doesn't aio_co_enter go through the ctx != > > > qemu_get_current_aio_context() branch and just do aio_co_schedule? > > > That was

Re: [PATCH v3 21/33] qemu-socket: pass monitor link to socket_get_fd directly

2021-05-12 Thread Roman Kagan
On Mon, Apr 19, 2021 at 10:34:49AM +0100, Daniel P. Berrangé wrote: > On Fri, Apr 16, 2021 at 11:08:59AM +0300, Vladimir Sementsov-Ogievskiy wrote: > > Detecting monitor by current coroutine works bad when we are not in > > coroutine context. And that's exactly so in nbd reconnect code, where > >

Re: [PATCH v3 19/33] block/nbd: split nbd_handle_updated_info out of nbd_client_handshake()

2021-05-12 Thread Roman Kagan
s(+), 42 deletions(-) Reviewed-by: Roman Kagan

Re: [PATCH v3 18/33] nbd/client-connection: shutdown connection on release

2021-05-11 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:56AM +0300, Vladimir Sementsov-Ogievskiy wrote: > Now, when thread can do negotiation and retry, it may run relatively > long. We need a mechanism to stop it, when user is not interested in > result anymore. So, on nbd_client_connection_release() let's shutdown > the

Re: [PATCH v3 17/33] nbd/client-connection: implement connection retry

2021-05-11 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:55AM +0300, Vladimir Sementsov-Ogievskiy wrote: > Add an option for thread to retry connection until success. We'll use > nbd/client-connection both for reconnect and for initial connection in > nbd_open(), so we need a possibility to use same NBDClientConnection >

Re: [PATCH v3 16/33] nbd/client-connection: add possibility of negotiation

2021-05-11 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:54AM +0300, Vladimir Sementsov-Ogievskiy wrote: > Add arguments and logic to support nbd negotiation in the same thread > after successful connection. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > include/block/nbd.h | 9 +++- > block/nbd.c

Re: [PATCH v3 15/33] nbd/client-connection: use QEMU_LOCK_GUARD

2021-04-28 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:53AM +0300, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > nbd/client-connection.c | 94 ++--- > 1 file changed, 42 insertions(+), 52 deletions(-) > > diff --git

Re: [PATCH v3 14/33] nbd: move connection code from block/nbd to nbd/client-connection

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:52AM +0300, Vladimir Sementsov-Ogievskiy wrote: > We now have bs-independent connection API, which consists of four > functions: > > nbd_client_connection_new() > nbd_client_connection_unref() > nbd_co_establish_connection() >

Re: [PATCH v3 13/33] block/nbd: introduce nbd_client_connection_release()

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:51AM +0300, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > block/nbd.c | 43 ++- > 1 file changed, 26 insertions(+), 17 deletions(-) > > diff --git a/block/nbd.c b/block/nbd.c >

Re: [PATCH v3 11/33] block/nbd: rename NBDConnectThread to NBDClientConnection

2021-04-27 Thread Roman Kagan
nged, 68 insertions(+), 69 deletions(-) Reviewed-by: Roman Kagan

Re: [PATCH v3 08/33] block/nbd: drop thr->state

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:46AM +0300, Vladimir Sementsov-Ogievskiy wrote: > We don't need all these states. The code refactored to use two boolean > variables looks simpler. Indeed. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > block/nbd.c | 125

Re: [PATCH v3 07/33] block/nbd: simplify waking of nbd_co_establish_connection()

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:45AM +0300, Vladimir Sementsov-Ogievskiy wrote: > Instead of connect_bh, bh_ctx and wait_connect fields we can live with > only one link to waiting coroutine, protected by mutex. > > So new logic is: > > nbd_co_establish_connection() sets wait_co under mutex,

Re: [PATCH v3 06/33] util/async: aio_co_schedule(): support reschedule in same ctx

2021-04-23 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:44AM +0300, Vladimir Sementsov-Ogievskiy wrote: > With the following patch we want to call wake coroutine from thread. > And it doesn't work with aio_co_wake: > Assume we have no iothreads. > Assume we have a coroutine A, which waits in the yield point for > external

Re: [PATCH v3 04/33] block/nbd: nbd_client_handshake(): fix leak of s->ioc

2021-04-22 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:42AM +0300, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > block/nbd.c | 2 ++ > 1 file changed, 2 insertions(+) Reviewed-by: Roman Kagan

Re: [PATCH v3 02/33] block/nbd: fix how state is cleared on nbd_open() failure paths

2021-04-22 Thread Roman Kagan
On Thu, Apr 22, 2021 at 01:27:22AM +0300, Vladimir Sementsov-Ogievskiy wrote: > 21.04.2021 17:00, Roman Kagan wrote: > > On Fri, Apr 16, 2021 at 11:08:40AM +0300, Vladimir Sementsov-Ogievskiy > > wrote: > > > @@ -2305,20 +2301,23 @@ static int nbd_open(BlockDriverState *b

Re: [PATCH v3 02/33] block/nbd: fix how state is cleared on nbd_open() failure paths

2021-04-21 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:40AM +0300, Vladimir Sementsov-Ogievskiy wrote: > We have two "return error" paths in nbd_open() after > nbd_process_options(). Actually we should call nbd_clear_bdrvstate() > on these paths. Interesting that nbd_process_options() calls > nbd_clear_bdrvstate() by

Re: [PATCH for-6.0] block/nbd: fix possible use after free of s->connect_thread

2021-04-12 Thread Roman Kagan
re-patch, on first hunk we'll just crash if thr is NULL, > on second hunk it's safe to return -1, and using thr when > s->connect_thread is already zeroed is obviously wrong. > > block/nbd.c | 11 +++ > 1 file changed, 11 insertions(+) Can we please get it merged in 6.0 as it's a genuine crasher fix? Reviewed-by: Roman Kagan

Re: [PATCH for-6.0 2/2] block/nbd: ensure ->connection_thread is always valid

2021-04-10 Thread Roman Kagan
On Sat, Apr 10, 2021 at 12:56:34PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 10.04.2021 11:38, Vladimir Sementsov-Ogievskiy wrote: > > 10.04.2021 11:06, Vladimir Sementsov-Ogievskiy wrote: > > > 09.04.2021 19:04, Roman Kagan wrote: > > > > Simplify lifet

[PATCH for-6.0 1/2] block/nbd: fix channel object leak

2021-04-09 Thread Roman Kagan
nbd_free_connect_thread leaks the channel object if it hasn't been stolen. Unref it and fix the leak. Signed-off-by: Roman Kagan --- block/nbd.c | 1 + 1 file changed, 1 insertion(+) diff --git a/block/nbd.c b/block/nbd.c index c26dc5a54f..d86df3afcb 100644 --- a/block/nbd.c +++ b/block/nbd.c

[PATCH for-6.0 0/2] block/nbd: assorted bugfixes

2021-04-09 Thread Roman Kagan
A couple of bugfixes to block/nbd that look appropriate for 6.0. Roman Kagan (2): block/nbd: fix channel object leak block/nbd: ensure ->connection_thread is always valid block/nbd.c | 59 +++-- 1 file changed, 30 insertions(+), 29 deleti

[PATCH for-6.0 2/2] block/nbd: ensure ->connection_thread is always valid

2021-04-09 Thread Roman Kagan
Simplify lifetime management of BDRVNBDState->connection_thread by delaying the possible cleanup of it until the BDRVNBDState itself goes away. This also fixes possible use-after-free in nbd_co_establish_connection when it races with nbd_co_establish_connection_cancel. Signed-off-by: Roman Ka

Re: [PATCH v2 03/10] util/async: aio_co_enter(): do aio_co_schedule in general case

2021-04-09 Thread Roman Kagan
On Thu, Apr 08, 2021 at 06:54:30PM +0300, Roman Kagan wrote: > On Thu, Apr 08, 2021 at 05:08:20PM +0300, Vladimir Sementsov-Ogievskiy wrote: > > With the following patch we want to call aio_co_wake() from thread. > > And it works bad. > > Assume we have no iothreads. > >

Re: [PATCH v2 00/10] block/nbd: move connection code to separate file

2021-04-08 Thread Roman Kagan
On Thu, Apr 08, 2021 at 05:08:17PM +0300, Vladimir Sementsov-Ogievskiy wrote: > Hi all! > > This substitutes "[PATCH 00/14] nbd: move reconnect-thread to separate file" > Supersedes: <20210407104637.36033-1-vsement...@virtuozzo.com> > > I want to simplify block/nbd.c which is overcomplicated

Re: [PATCH v2 10/10] nbd: move connection code from block/nbd to nbd/client-connection

2021-04-08 Thread Roman Kagan
> nbd/meson.build | 1 + > 4 files changed, 204 insertions(+), 167 deletions(-) > create mode 100644 nbd/client-connection.c Reviewed-by: Roman Kagan

Re: [PATCH v2 09/10] block/nbd: introduce nbd_client_connection_new()

2021-04-08 Thread Roman Kagan
ov-Ogievskiy > --- > block/nbd.c | 15 +-- > 1 file changed, 9 insertions(+), 6 deletions(-) Reviewed-by: Roman Kagan

Re: [PATCH v2 08/10] block/nbd: rename NBDConnectThread to NBDClientConnection

2021-04-08 Thread Roman Kagan
ementsov-Ogievskiy > --- > block/nbd.c | 127 ++-- > 1 file changed, 63 insertions(+), 64 deletions(-) [To other reviewers: in addition to renaming there's one blank line removed, hence the difference between (+) and (-)] Reviewed-by: Roman Kagan

Re: [PATCH v2 07/10] block/nbd: make nbd_co_establish_connection_cancel() bs-independent

2021-04-08 Thread Roman Kagan
> 1 file changed, 7 insertions(+), 9 deletions(-) Reviewed-by: Roman Kagan

Re: [PATCH v2 06/10] block/nbd: bs-independent interface for nbd_co_establish_connection()

2021-04-08 Thread Roman Kagan
> --- > block/nbd.c | 49 +++-- > 1 file changed, 31 insertions(+), 18 deletions(-) Reviewed-by: Roman Kagan

Re: [PATCH v2 05/10] block/nbd: drop thr->state

2021-04-08 Thread Roman Kagan
--- > 1 file changed, 27 insertions(+), 76 deletions(-) Reviewed-by: Roman Kagan

Re: [PATCH v2 04/10] block/nbd: simplify waking of nbd_co_establish_connection()

2021-04-08 Thread Roman Kagan
connection API out of > nbd.c (which is overcomplicated now). > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > block/nbd.c | 49 +-------- > 1 file changed, 9 insertions(+), 40 deletions(-) Reviewed-by: Roman Kagan

Re: [PATCH v2 03/10] util/async: aio_co_enter(): do aio_co_schedule in general case

2021-04-08 Thread Roman Kagan
context was qemu_aio_context, an iothread would just schedule the coroutine there, while a "dumb" thread would try lock the context potentially resulting in a deadlock. This patch makes "dumb" threads and iothreads behave identically when entering a coroutine on a foreign context. You may want to rephrase the log message to that end. Anyway Reviewed-by: Roman Kagan

Re: [PATCH v2 02/10] block/nbd: BDRVNBDState: drop unused connect_err and connect_status

2021-04-08 Thread Roman Kagan
On Thu, Apr 08, 2021 at 05:08:19PM +0300, Vladimir Sementsov-Ogievskiy wrote: > These fields are write-only. Drop them. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > block/nbd.c | 12 ++-- > 1 file changed, 2 insertions(+), 10 deletions(-) Reviewed-by: Roman Kagan

Re: [PATCH v2 01/10] block/nbd: introduce NBDConnectThread reference counter

2021-04-08 Thread Roman Kagan
> Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > block/nbd.c | 62 +-------- > 1 file changed, 20 insertions(+), 42 deletions(-) Reviewed-by: Roman Kagan

Re: [PATCH 06/14] block/nbd: further segregation of connect-thread

2021-04-08 Thread Roman Kagan
On Wed, Apr 07, 2021 at 01:46:29PM +0300, Vladimir Sementsov-Ogievskiy wrote: > Add personal state NBDConnectThread for connect-thread and > nbd_connect_thread_start() function. Next step would be moving > connect-thread to a separate file. > > Note that we stop publishing thr->sioc during >

Re: [PATCH 03/14] block/nbd: drop unused NBDConnectThread::err field

2021-04-07 Thread Roman Kagan
On Wed, Apr 07, 2021 at 01:46:26PM +0300, Vladimir Sementsov-Ogievskiy wrote: > The field is used only to free it. Let's drop it for now for > simplicity. Well, it's *now* (after your patch 2) only used to free it. This makes the reconnect process even further concealed from the user: the client

Re: [PATCH 02/14] block/nbd: nbd_co_establish_connection(): drop unused errp

2021-04-07 Thread Roman Kagan
On Wed, Apr 07, 2021 at 01:46:25PM +0300, Vladimir Sementsov-Ogievskiy wrote: > We are going to refactor connection logic to make it more > understandable. Every bit that we can simplify in advance will help. > Drop errp for now, it's unused anyway. We'll probably reimplement it in > future.

Re: [PATCH 01/14] block/nbd: BDRVNBDState: drop unused connect_err

2021-04-07 Thread Roman Kagan
On Wed, Apr 07, 2021 at 01:46:24PM +0300, Vladimir Sementsov-Ogievskiy wrote: > The field is actually unused. Let's make things a bit simpler dropping > it and corresponding logic. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > block/nbd.c | 9 ++--- > 1 file changed, 2

Re: [PATCH 0/7] block/nbd: decouple reconnect from drain

2021-04-07 Thread Roman Kagan
On Wed, Mar 17, 2021 at 11:35:31AM +0300, Vladimir Sementsov-Ogievskiy wrote: > 15.03.2021 09:06, Roman Kagan wrote: > > The reconnection logic doesn't need to stop while in a drained section. > > Moreover it has to be active during the drained section, as the requests >

Re: [PATCH 0/7] block/nbd: decouple reconnect from drain

2021-03-26 Thread Roman Kagan
On Wed, Mar 17, 2021 at 11:35:31AM +0300, Vladimir Sementsov-Ogievskiy wrote: > 15.03.2021 09:06, Roman Kagan wrote: > > The reconnection logic doesn't need to stop while in a drained section. > > Moreover it has to be active during the drained section, as the requests >

Re: [PATCH 7/7] block/nbd: stop manipulating in_flight counter

2021-03-26 Thread Roman Kagan
On Tue, Mar 16, 2021 at 09:37:13PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 16.03.2021 19:08, Roman Kagan wrote: > > On Mon, Mar 15, 2021 at 11:15:44PM +0300, Vladimir Sementsov-Ogievskiy > > wrote: > > > 15.03.2021 09:06, Roman Kagan wrote: > > > > As the

Re: [PATCH 6/7] block/nbd: decouple reconnect from drain

2021-03-26 Thread Roman Kagan
On Tue, Mar 16, 2021 at 09:09:12PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 16.03.2021 19:03, Roman Kagan wrote: > > On Mon, Mar 15, 2021 at 11:10:14PM +0300, Vladimir Sementsov-Ogievskiy > > wrote: > > > 15.03.2021 09:06, Roman Kagan wrote: > > > > The rec

Re: [PATCH 0/7] block/nbd: decouple reconnect from drain

2021-03-16 Thread Roman Kagan
On Tue, Mar 16, 2021 at 09:41:36AM -0500, Eric Blake wrote: > On 3/15/21 1:06 AM, Roman Kagan wrote: > > The reconnection logic doesn't need to stop while in a drained section. > > Moreover it has to be active during the drained section, as the requests > > that

Re: [PATCH 7/7] block/nbd: stop manipulating in_flight counter

2021-03-16 Thread Roman Kagan
On Mon, Mar 15, 2021 at 11:15:44PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 15.03.2021 09:06, Roman Kagan wrote: > > As the reconnect logic no longer interferes with drained sections, it > > appears unnecessary to explicitly manipulate the in_flight counter. > > > &

Re: [PATCH 6/7] block/nbd: decouple reconnect from drain

2021-03-16 Thread Roman Kagan
On Mon, Mar 15, 2021 at 11:10:14PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 15.03.2021 09:06, Roman Kagan wrote: > > The reconnection logic doesn't need to stop while in a drained section. > > Moreover it has to be active during the drained section, as the requests >

Re: [PATCH 0/7] block/nbd: decouple reconnect from drain

2021-03-16 Thread Roman Kagan
On Mon, Mar 15, 2021 at 10:45:39PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 15.03.2021 09:06, Roman Kagan wrote: > > The reconnection logic doesn't need to stop while in a drained section. > > Moreover it has to be active during the drained section, as the requests >

Re: [PATCH 1/7] block/nbd: avoid touching freed connect_thread

2021-03-16 Thread Roman Kagan
On Mon, Mar 15, 2021 at 06:40:12PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 15.03.2021 09:06, Roman Kagan wrote: > > When the NBD connection is being torn down, the connection thread gets > > canceled and "detached", meaning it is about to get freed. >

Re: [PATCH 3/7] block/nbd: assert attach/detach runs in the proper context

2021-03-15 Thread Roman Kagan
On Mon, Mar 15, 2021 at 07:41:32PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 15.03.2021 09:06, Roman Kagan wrote: > > Document (via a comment and an assert) that > > nbd_client_detach_aio_context and nbd_client_attach_aio_context_bh run > > in the desired aio_contex

[PATCH 2/7] block/nbd: use uniformly nbd_client_connecting_wait

2021-03-15 Thread Roman Kagan
Use nbd_client_connecting_wait uniformly all over the block/nbd.c. While at this, drop the redundant check for nbd_client_connecting_wait in reconnect_delay_timer_init, as all its callsites do this check too. Signed-off-by: Roman Kagan --- block/nbd.c | 34 +++--- 1

[PATCH 3/7] block/nbd: assert attach/detach runs in the proper context

2021-03-15 Thread Roman Kagan
Document (via a comment and an assert) that nbd_client_detach_aio_context and nbd_client_attach_aio_context_bh run in the desired aio_context. Signed-off-by: Roman Kagan --- block/nbd.c | 12 1 file changed, 12 insertions(+) diff --git a/block/nbd.c b/block/nbd.c index 1d8edb5b21

[PATCH 6/7] block/nbd: decouple reconnect from drain

2021-03-15 Thread Roman Kagan
with the drained section in the reconnection code. Fixes: 5ad81b4946 ("nbd: Restrict connection_co reentrance") Fixes: 8c517de24a ("block/nbd: fix drain dead-lock because of nbd reconnect-delay") Signed-off-by: Roman Kagan --- block/nbd.c | 79 +++-

[PATCH 7/7] block/nbd: stop manipulating in_flight counter

2021-03-15 Thread Roman Kagan
As the reconnect logic no longer interferes with drained sections, it appears unnecessary to explicitly manipulate the in_flight counter. Fixes: 5ad81b4946 ("nbd: Restrict connection_co reentrance") Signed-off-by: Roman Kagan --- block/nbd.c | 6 -- nbd/client.c | 2 -- 2 files

[PATCH 0/7] block/nbd: decouple reconnect from drain

2021-03-15 Thread Roman Kagan
t;nbd: Restrict connection_co reentrance"); as I've missed the point of that commit I'd appreciate more scrutiny in this area. Roman Kagan (7): block/nbd: avoid touching freed connect_thread block/nbd: use uniformly nbd_client_connecting_wait block/nbd: assert attach/detach runs in

[PATCH 5/7] block/nbd: better document a case in nbd_co_establish_connection

2021-03-15 Thread Roman Kagan
Cosmetic: adjust the comment and the return value in nbd_co_establish_connection where it's entered while the connection thread is still running. Signed-off-by: Roman Kagan --- block/nbd.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/block/nbd.c b/block/nbd.c index

[PATCH 1/7] block/nbd: avoid touching freed connect_thread

2021-03-15 Thread Roman Kagan
ion thread data. To prevent this, revalidate the ->connect_thread pointer in nbd_co_establish_connection_cancel before using after the the yield. Signed-off-by: Roman Kagan --- block/nbd.c | 9 + 1 file changed, 9 insertions(+) diff --git a/block/nbd.c b/block/nbd.c index c26dc5a54

[PATCH 4/7] block/nbd: transfer reconnection stuff across aio_context switch

2021-03-15 Thread Roman Kagan
the reconnection logic on entry and starts it over on exit. However, this patch paves the way to keeping the reconnection process active across the drained section (in a followup patch). Signed-off-by: Roman Kagan --- block/nbd.c | 44 ++-- 1 file changed, 42 insertions

Re: [RFC] nbd: decouple reconnect from drain

2021-03-14 Thread Roman Kagan
On Fri, Mar 12, 2021 at 03:35:25PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 10.03.2021 12:32, Roman Kagan wrote: > > NBD connect coroutine takes an extra in_flight reference as if it's a > > request handler. This prevents drain from completion until the > > connection

[RFC] nbd: decouple reconnect from drain

2021-03-10 Thread Roman Kagan
in .bdrv_{attach,detach}_aio_context callbacks. Fixes: 5ad81b4946 ("nbd: Restrict connection_co reentrance") Signed-off-by: Roman Kagan --- This patch passes the regular make check but fails some extra iotests, in particular 277. It obviously lacks more robust interaction with the connect

Re: [PATCH 1/4] block/nbd: fix drain dead-lock because of nbd reconnect-delay

2021-02-03 Thread Roman Kagan
On Wed, Feb 03, 2021 at 05:44:34PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 03.02.2021 17:21, Roman Kagan wrote: > > On Wed, Feb 03, 2021 at 04:10:41PM +0300, Vladimir Sementsov-Ogievskiy > > wrote: > > > 03.02.2021 13:53, Roman Kagan wrote: > > > > On T

Re: [PATCH 1/4] block/nbd: fix drain dead-lock because of nbd reconnect-delay

2021-02-03 Thread Roman Kagan
On Wed, Feb 03, 2021 at 04:10:41PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 03.02.2021 13:53, Roman Kagan wrote: > > On Thu, Sep 03, 2020 at 10:02:58PM +0300, Vladimir Sementsov-Ogievskiy > > wrote: > > > We pause reconnect process during drained section. So, if we

Re: [PATCH 1/4] block/nbd: fix drain dead-lock because of nbd reconnect-delay

2021-02-03 Thread Roman Kagan
On Thu, Sep 03, 2020 at 10:02:58PM +0300, Vladimir Sementsov-Ogievskiy wrote: > We pause reconnect process during drained section. So, if we have some > requests, waiting for reconnect we should cancel them, otherwise they > deadlock the drained section. > > How to reproduce: > > 1. Create an

[PATCH v2 2/3] block/nbd: only enter connection coroutine if it's present

2021-01-28 Thread Roman Kagan
0 in ?? () Fix it by checking that the connection coroutine is non-null before trying to enter it. If it is null, no entering is needed, as the connection is probably going down anyway. Signed-off-by: Roman Kagan Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/nbd.c | 16 +--- 1 fil

[PATCH v2 3/3] nbd: make nbd_read* return -EIO on error

2021-01-28 Thread Roman Kagan
the connection. Fix it by turning every negative return from qio_channel_read_all into -EIO returned from nbd_read. Apparently that was the original behavior, but got broken later. Also adjust nbd_readXX to follow. Fixes: e6798f06a6 ("nbd: generalize usage of nbd_read") Signed-off-by: R

[PATCH v2 0/3] block/nbd: fix crashers in reconnect while migrating

2021-01-28 Thread Roman Kagan
ed backtraces in log messages - add r-b by Vladimir Roman Kagan (3): block/nbd: only detach existing iochannel from aio_context block/nbd: only enter connection coroutine if it's present nbd: make nbd_read* return -EIO on error include/block/nbd.h | 7 --- block/nbd.c

[PATCH v2 1/3] block/nbd: only detach existing iochannel from aio_context

2021-01-28 Thread Roman Kagan
it by checking that the iochannel is non-null before trying to detach it from the aio_context. If it is null, no detaching is needed, and it will get reattached in the proper aio_context once the connection is reestablished. Signed-off-by: Roman Kagan Reviewed-by: Vladimir Sementsov-Ogievskiy --- block

Re: [PATCH 0/3] block/nbd: fix crashers in reconnect while migrating

2021-01-28 Thread Roman Kagan
On Fri, Jan 29, 2021 at 08:51:39AM +0300, Vladimir Sementsov-Ogievskiy wrote: > 28.01.2021 23:14, Roman Kagan wrote: > > During the final phase of migration the NBD reconnection logic may > > encounter situations it doesn't expect during regular operation. > > > >

Re: [PATCH 1/3] block/nbd: only detach existing iochannel from aio_context

2021-01-28 Thread Roman Kagan
On Fri, Jan 29, 2021 at 08:37:13AM +0300, Vladimir Sementsov-Ogievskiy wrote: > 28.01.2021 23:14, Roman Kagan wrote: > > When the reconnect in NBD client is in progress, the iochannel used for > > NBD connection doesn't exist. Therefore an attempt to detach it from >

[PATCH 1/3] block/nbd: only detach existing iochannel from aio_context

2021-01-28 Thread Roman Kagan
to detach it from the aio_context. If it is null, no detaching is needed, and it will get reattached in the proper aio_context once the connection is reestablished. Signed-off-by: Roman Kagan --- block/nbd.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/block/nbd.c b/bl

[PATCH 2/3] block/nbd: only enter connection coroutine if it's present

2021-01-28 Thread Roman Kagan
lib/x86_64-linux-gnu/libpthread.so.0 Fix it by checking that the connection coroutine is non-null before trying to enter it. If it is null, no entering is needed, as the connection is probably going down anyway. Signed-off-by: Roman Kagan --- block/nbd.c | 16 +--- 1 file changed

[PATCH 3/3] nbd: make nbd_read* return -EIO on error

2021-01-28 Thread Roman Kagan
the connection. Fix it by turning every negative return from qio_channel_read_all into -EIO returned from nbd_read. Apparently that was the original behavior, but got broken later. Also adjust nbd_readXX to follow. Fixes: e6798f06a6 ("nbd: generalize usage of nbd_read") Signed-off-by: R

[PATCH 0/3] block/nbd: fix crashers in reconnect while migrating

2021-01-28 Thread Roman Kagan
delay" runs a stress load (fio with big queue depth) in the guest on that drive and is migrated (e.g. to a file), while the nbd server is SIGKILL-ed and restarted every second. See the individual patches for specific crash conditions and more detailed explanations. Roman Kagan (3): block/nbd: o

Re: [PATCH v8 0/8] block: enhance handling of size-related BlockConf properties

2020-06-15 Thread Roman Kagan
On Fri, May 29, 2020 at 01:55:08AM +0300, Roman Kagan wrote: > BlockConf includes several properties counted in bytes. > > Enhance their handling in some aspects, specifically > > - accept common size suffixes (k, m) > - perform consistency checks on the values > -

Re: [PATCH v8 2/8] block: consolidate blocksize properties consistency checks

2020-05-29 Thread Roman Kagan
On Fri, May 29, 2020 at 11:53:26AM +0200, Markus Armbruster wrote: > Roman Kagan writes: > > > Several block device properties related to blocksize configuration must > > be in certain relationship WRT each other: physical block must be no > > smaller than logical block;

[PATCH v8 8/8] block: lift blocksize property limit to 2 MiB

2020-05-28 Thread Roman Kagan
Logical and physical block sizes in QEMU are limited to 32 KiB. This appears unnecessarily tight, and we've seen bigger block sizes handy at times. Lift the limitation up to 2 MiB which appears to be good enough for everybody, and matches the qcow2 cluster size limit. Signed-off-by: Roman Kagan

[PATCH v8 6/8] block: make BlockConf size props 32bit and accept size suffixes

2020-05-28 Thread Roman Kagan
and virtio-blk devices as an uint16_t in units of logical blocks, introduce an additional check in blkconf_blocksizes to prevent its silent truncation. Signed-off-by: Roman Kagan --- v7 -> v8: - replace stringify with %u in the error message [Eric] - fix wording in the log [Eric] include

[PATCH v8 2/8] block: consolidate blocksize properties consistency checks

2020-05-28 Thread Roman Kagan
, add corresponding consistency checks to blkconf_blocksizes, adjusting its signature to communicate possible error to the caller. Also remove the now redundant consistency checks from the specific devices. Signed-off-by: Roman Kagan Reviewed-by: Eric Blake Reviewed-by: Paul Durrant --- include

[PATCH v8 7/8] qdev-properties: add getter for size32 and blocksize

2020-05-28 Thread Roman Kagan
Add getter for size32, and use it for blocksize, too. In its human-readable branch, it reports approximate size in human-readable units next to the exact byte value, like the getter for 64bit size does. Adjust the expected test output accordingly. Signed-off-by: Roman Kagan Reviewed-by: Eric

[PATCH v8 5/8] qdev-properties: make blocksize accept size suffixes

2020-05-28 Thread Roman Kagan
It appears convenient to be able to specify physical_block_size and logical_block_size using common size suffixes. Teach the blocksize property setter to interpret them. Also express the upper and lower limits in the respective units. Signed-off-by: Roman Kagan Reviewed-by: Eric Blake --- hw

[PATCH v8 4/8] qdev-properties: add size32 property type

2020-05-28 Thread Roman Kagan
property type in a followup commit). The getter for size32 is left out for a separate patch as its benefit is less obvious, and it affects test output; for now the regular uint32 getter is used. Signed-off-by: Roman Kagan --- v7 -> v8: - replace stringify with %u in the error message [Eric] -

[PATCH v8 0/8] block: enhance handling of size-related BlockConf properties

2020-05-28 Thread Roman Kagan
e limit in the log and comment [Eric] v1 -> v2: - cap the property at 2 MiB [Eric] - accept size suffixes Roman Kagan (8): virtio-blk: store opt_io_size with correct size block: consolidate blocksize properties consistency checks qdev-properties: blocksize: use same limits in code a

[PATCH v8 1/8] virtio-blk: store opt_io_size with correct size

2020-05-28 Thread Roman Kagan
The width of opt_io_size in virtio_blk_config is 32bit. However, it's written with virtio_stw_p; this may result in value truncation, and on big-endian systems with legacy virtio in completely bogus readings in the guest. Use the appropriate accessor to store it. Signed-off-by: Roman Kagan

  1   2   >