On Wed, Nov 12, 2025 at 04:31:05PM -0600, Eric Blake wrote:
> Without a mutex, NetListener can run into this data race between a
> thread changing the async callback callback function to use when a
> client connects, and the thread servicing polling of the listening
> sockets:
> 
>  Thread 1:
>        qio_net_listener_set_client_func(lstnr, f1, ...);
>            => foreach sock: socket
>                => object_ref(lstnr)
>                => sock_src = qio_channel_socket_add_watch_source(sock, ...., 
> lstnr, object_unref);
> 
>   Thread 2:
>        poll()
>           => event POLLIN on socket
>                => ref(GSourceCallback)
>                => if (lstnr->io_func) // while lstnr->io_func is f1
>                     ...interrupt..
> 
>   Thread 1:
>        qio_net_listener_set_client_func(lstnr, f2, ...);
>           => foreach sock: socket
>                => g_source_unref(sock_src)
>           => foreach sock: socket
>                => object_ref(lstnr)
>                => sock_src = qio_channel_socket_add_watch_source(sock, ...., 
> lstnr, object_unref);
> 
>   Thread 2:
>                => call lstnr->io_func(lstnr->io_data) // now sees f2
>                => return dispatch(sock)
>                => unref(GSourceCallback)
>                   => destroy-notify
>                      => object_unref
> 
> Found by inspection.  This is a SEGFAULT waiting to happen if f2 can
> become NULL because thread 1 deregisters the user's callback while
> thread 2 is trying to service the callback.  Other messes are also
> theoretically possible, such as running callback f1 with an opaque
> pointer that should only be passed to f2 (if the client code were to
> use more than just a binary choice between a single async function or
> NULL).
> 
> Mitigating factor: if the code that modifies the QIONetListener can
> only be reached by the same thread that is executing the polling and
> async callbacks, then we are not in a two-thread race documented above
> (even though poll can see two clients trying to connect in the same
> window of time, any changes made to the listener by the first async
> callback will be completed before the thread moves on to the second
> client).  However, QEMU is complex enough that I was unable to state
> with certainty whether a QMP command (such as nbd-server-stop, which
> does modify the net listener) can ever be serviced in a thread
> distinct from the one trying to do the async callbacks.  Similarly, I
> did not spend the time trying to add sleeps or execute under gdb to
> try and actually trigger the race in practice.
> 
> At any rate, it is worth having the API be robust.  To ensure that
> modifying a NetListener can be safely done from any thread, add a
> mutex that guarantees atomicity to all members of a listener object
> related to callbacks.  This problem has been present since
> QIONetListener was introduced.
> 
> Note that this does NOT prevent the case of a second round of the
> user's old async callback being invoked with the old opaque data, even
> when the user has already tried to change the async callback during
> the first async callback; it is only about ensuring that there is no
> sharding (the eventual io_func(io_data) call that does get made will
> correspond to a particular combination that the user had requested at
> some point in time, and not be sharded to a combination that never
> existed in practice).  In other words, this patch maintains the status
> quo that a user's async callback function already needs to be robust
> to parallel clients landing in the same window of poll servicing, even
> when only one client is desired.
> 
> CC: [email protected]
> Fixes: 53047392 ("io: introduce a network socket listener API", v2.12.0)
> Signed-off-by: Eric Blake <[email protected]>
> 
> ---
> v3: new patch
> ---
>  include/io/net-listener.h |  1 +
>  io/net-listener.c         | 58 +++++++++++++++++++++++++++++----------
>  2 files changed, 44 insertions(+), 15 deletions(-)

Reviewed-by: Daniel P. BerrangĂ© <[email protected]>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Reply via email to