On Mon, Apr 14, 2025 at 09:51:32AM +0800, Yong Huang wrote: > On Fri, Apr 11, 2025 at 5:47 PM Daniel P. Berrangé <berra...@redhat.com> > wrote: > > > On Tue, Apr 08, 2025 at 10:27:51AM +0800, yong.hu...@smartx.com wrote: > > > From: Hyman Huang <yong.hu...@smartx.com> > > > > > > As advised by the GNU TLS, the caller should attempt again > > > if the gnutls_record_{recv,send} return EAGAIN or EINTR; > > > check the following link to view the details: > > > > > https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html > > > > > > Add the retry parameter for virNetTLSSession{Read,Write} > > > functions in accordance with this guideline. > > > > > > This prevents the upper application from encountering the > > > following error message when it calls the virConnectOpenAuth API: > > > Unable to read TLS confirmation: Resource temporarily unavailable > > > > > > Signed-off-by: Hyman Huang <yong.hu...@smartx.com> > > > --- > > > src/rpc/virnetclient.c | 2 +- > > > src/rpc/virnetsocket.c | 4 ++-- > > > src/rpc/virnettlscontext.c | 28 ++++++++++++++++++++++++++-- > > > src/rpc/virnettlscontext.h | 6 ++++-- > > > 4 files changed, 33 insertions(+), 7 deletions(-) > > > > > > diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c > > > index 92933220e2..5340db4211 100644 > > > --- a/src/rpc/virnetclient.c > > > +++ b/src/rpc/virnetclient.c > > > @@ -1003,7 +1003,7 @@ int virNetClientSetTLSSession(virNetClient *client, > > > ignore_value(pthread_sigmask(SIG_SETMASK, &oldmask, NULL)); > > > #endif /* !WIN32 */ > > > > > > - len = virNetTLSSessionRead(client->tls, buf, 1); > > > + len = virNetTLSSessionRead(client->tls, buf, 1, true); > > > if (len < 0 && errno != ENOMSG) { > > > virReportSystemError(errno, "%s", > > > _("Unable to read TLS confirmation")); > > > diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c > > > index e8fc2d5f7d..6774dd4a4b 100644 > > > --- a/src/rpc/virnetsocket.c > > > +++ b/src/rpc/virnetsocket.c > > > @@ -1739,7 +1739,7 @@ static ssize_t virNetSocketReadWire(virNetSocket > > *sock, char *buf, size_t len) > > > if (sock->tlsSession && > > > virNetTLSSessionGetHandshakeStatus(sock->tlsSession) == > > > VIR_NET_TLS_HANDSHAKE_COMPLETE) { > > > - ret = virNetTLSSessionRead(sock->tlsSession, buf, len); > > > + ret = virNetTLSSessionRead(sock->tlsSession, buf, len, false); > > > } else { > > > ret = read(sock->fd, buf, len); > > > } > > > @@ -1807,7 +1807,7 @@ static ssize_t virNetSocketWriteWire(virNetSocket > > *sock, const char *buf, size_t > > > if (sock->tlsSession && > > > virNetTLSSessionGetHandshakeStatus(sock->tlsSession) == > > > VIR_NET_TLS_HANDSHAKE_COMPLETE) { > > > - ret = virNetTLSSessionWrite(sock->tlsSession, buf, len); > > > + ret = virNetTLSSessionWrite(sock->tlsSession, buf, len, false); > > > } else { > > > ret = write(sock->fd, buf, len); /* sc_avoid_write */ > > > } > > > diff --git a/src/rpc/virnettlscontext.c b/src/rpc/virnettlscontext.c > > > index e8023133b4..92d909c0b7 100644 > > > --- a/src/rpc/virnettlscontext.c > > > +++ b/src/rpc/virnettlscontext.c > > > @@ -679,16 +679,28 @@ void > > virNetTLSSessionSetIOCallbacks(virNetTLSSession *sess, > > > > > > > > > ssize_t virNetTLSSessionWrite(virNetTLSSession *sess, > > > - const char *buf, size_t len) > > > + const char *buf, size_t len, > > > + bool retry) > > > { > > > ssize_t ret; > > > > > > + rewrite: > > > virObjectLock(sess); > > > ret = gnutls_record_send(sess->session, buf, len); > > > > > > if (ret >= 0) > > > goto cleanup; > > > > > > + if (retry && (ret == GNUTLS_E_AGAIN || ret == > > GNUTLS_E_INTERRUPTED)) { > > > + /* > > > + * GNU TLS advises calling the function again to obtain the > > data if EAGAIN is returned. > > > + * See reference: > > https://www.gnutls.org/manual/html_node/Data-transfer-and-termination.html > > > + * */ > > > > To repeat my feedback on v1. > > > > This would surely be wrong for E_AGAIN - if the socket is blocking we must > > return to the main loop and wait for it to indicate that the socket has > > new data. If we don't wait in the main loop, this code will busy-loop on > > a non-blocking socket > > > Agree, for the EAGAIN error, trying to repoll could be a solution like the > following pieces of code? > > repoll: > source = virEventGLibAddSocketWatch(virNetSocketGetFD(client->sock), > G_IO_IN, > client->eventCtx, > virNetClientIOEventTLSConfirm, > client, NULL); > > g_main_loop_run(client->eventLoop); > > retry: > len = virNetTLSSessionRead(client->tls, buf, 1, true); > if (len < 0 && errno == EINTR) { > goto retry; > } > if (len < 0 && errno == EAGAIN) { > goto repoll; > }
Yes, and virNetSocketReadWire is already handlnig both EINTR and EAGAIN after it calls virNetTLSSessionRead. The other caller is virNetClientSetTLSSession and I see that is failed to handle EINTR/EGAIN, though EGAIN seems like it ought to be unlikely given that caller waited for G_IO_IN... > > A retry on E_INTERRUPTED is ok, but E_AGAIN must wait for the > > socket to be readable. This is something higher level libvirt > > code should already be doing correctly. Do you have any real > > world example of a problem being hit ? > > snip > Yes, when our higher management app try to connect Libvirtd using > "/usr/local/venv/job-center/lib/python3.10/site-packages/libvirt.py", line > 148, in openAuth raise libvirtError('virConnectOpenAuth() failed') > libvirt.libvirtError: Unable to read TLS confirmation: Resource temporarily > unavailable ....this error message comes from virNetClientSetTLSSession and suggests the wait for G_IO_IN wasn't sufficient, so we should retry I guess With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|