Your message dated Sat, 4 Sep 2021 22:15:48 +0200
with message-id <ytpt9pgd5sjpx...@aurel32.net>
and subject line Re: Bug#757474: libc6: amd copying a SVCXPRT structure leads
to libc's RPC code sending packets of incorrect length
has caused the Debian Bug report #757474,
regarding libc6: amd copying a SVCXPRT structure leads to libc's RPC code
sending packets of incorrect length
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)
--
757474: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757474
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
--- Begin Message ---
Package: libc6
Version: 2.13-38+deb7u3
Severity: normal
Tags: upstream patch
This is really a problem with amd (am-utils), not eglibc. It is hard to
solve on amd's side (see topic "NFS v2 RPC reply on LOOKUP" on the am-utils
list), but it can easily be hacked around on eglibc's side.
The symptom is that an amd NFS mount (typically on user login) stalls for 5 or
10 seconds.
The root problem is that amd occasionally copies (the contents of) a SVCXPRT
structure to store it away and be able to respond in the background. This is
probably illegal, but "used to work" with the traditional SUN RPC
implementation.
Now eglibc stores both an iovec and a msghdr structure in a private part of the
SVCXPRT, with the embedded msghdr's msg_iov field set to point at the
corresponding embedded iovec. When the structure is copied, the embedded
msghdr's msg_iov still points to the original SVCXPRT's embedded iovec, not the
one embedded in the copy. If the copy is then used to transmit a reply, the
copy's embedded iovec has its length set to the desired value, but sendmsg()
actually uses the original SVCXPRT's iovec, because the msg_iov field of the
msghdr embedded in the copy points at the iovec embedded in the original
(whose fields are not set to the desired values).
Then, sendmsg() transmits a reply of incorrect length and doesn't return
the expected value, which causes a second (error) reply to be sent, confusing
the client. The client then discards the reply and resends the request after a
(five second) timeout. At that point, amd has probably finished the mount
operation, doesn't background the request, replies correctly, and everything
works as expected.
The problem can obviously be hacked around by forcing the embedded msghdr's
msg_iov field to point to the embedded iovec before passing the msghdr to
sendmsg(), which the attached (one-line) patch does.
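The stale pointer can be reproduced without any RPC code. The following is only
a minimal sketch: struct fake_xprt, fake_xprt_init() and fake_xprt_prepare()
are hypothetical names made up for illustration and do not reflect the actual
SVCXPRT layout (eglibc keeps these fields in xp_pad). It only shows that a
byte-wise copy keeps msg_iov pointing into the original, and that re-pointing
msg_iov before the send avoids that.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical stand-in for SVCXPRT: the iovec and the msghdr live inside
   the structure itself, and msg_iov points back into the same structure. */
struct fake_xprt {
    char buf[64];
    struct iovec iov;
    struct msghdr msg;
};

/* Set up the self-referential pointer once, at creation time. */
static void fake_xprt_init(struct fake_xprt *x)
{
    memset(x, 0, sizeof *x);
    x->iov.iov_base = x->buf;
    x->iov.iov_len = 0;
    x->msg.msg_iov = &x->iov;
    x->msg.msg_iovlen = 1;
}

/* Prepare a reply of len bytes and return the length that sendmsg() would
   actually see through msg_iov. With rebind != 0, msg_iov is first pointed
   back at this structure's own iovec (the idea behind the one-line patch). */
static size_t fake_xprt_prepare(struct fake_xprt *x, size_t len, int rebind)
{
    assert(len <= sizeof x->buf);
    x->iov.iov_base = x->buf;
    x->iov.iov_len = len;
    if (rebind)
        x->msg.msg_iov = &x->iov;
    return x->msg.msg_iov[0].iov_len;
}
```

If an original is initialised, prepared with length 8 and then memcpy()'d into
a second structure, preparing the copy with length 40 and rebind == 0 still
reports 8 (the original's length), while rebind == 1 reports 40.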
-- System Information:
Debian Release: 7.6
APT prefers stable
APT policy: (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 3.10.42.wap (SMP w/2 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages libc6:amd64 depends on:
ii libc-bin 2.13-38+deb7u3
ii libgcc1 1:4.7.2-5
libc6:amd64 recommends no packages.
Versions of packages libc6:amd64 suggests:
ii debconf [debconf-2.0] 1.5.49
pn glibc-doc <none>
ii locales 2.13-38+deb7u3
-- debconf information excluded
Index: sunrpc/svc_udp.c
===================================================================
--- sunrpc/svc_udp.c (revision 3768)
+++ sunrpc/svc_udp.c (revision 3769)
@@ -329,6 +329,7 @@
iovp = (struct iovec *) &xprt->xp_pad [0];
iovp->iov_base = rpc_buffer (xprt);
iovp->iov_len = slen;
+ mesgp->msg_iov = iovp; /* hack around clients like amd that memcpy() a SVCXPRT structure */
sent = __sendmsg (xprt->xp_sock, mesgp, 0);
}
else
--- End Message ---
--- Begin Message ---
Version: 2.32-0experimental0
On 2014-08-08 17:24, Edgar Fuß wrote:
> Package: libc6
> Version: 2.13-38+deb7u3
> Severity: normal
> Tags: upstream patch
>
> This is really a problem with amd (am-utils), not eglibc. It is hard
> to solve on amd's side (see topic "NFS v2 RPC reply on LOOKUP" on the
> am-utils list), but it can easily be hacked around on eglibc's side.
>
> The symptom is that an amd NFS mount (typically on user login) stalls for 5
> or 10 seconds.
>
> The root problem is that amd occasionally copies (the contents of) a SVCXPRT
> structure to store it away and be able to respond in the background. This is
> probably illegal, but "used to work" with the traditional SUN RPC
> implementation.
>
> Now eglibc stores both an iovec and a msghdr structure in a private part of
> the SVCXPRT, with the embedded msghdr's msg_iov field set to point at the
> corresponding embedded iovec. When the structure is copied, the embedded
> msghdr's msg_iov still points to the original SVCXPRT's embedded iovec, not
> the one embedded in the copy. If the copy is then used to transmit a reply,
> the embedded iovec's length is set to the desired value, but sendmsg()
> actually uses the original SVCXPRT's value due to the msg_iov field of the
> msghdr embedded in the copy pointing at the iovec embedded in the original
> (whose fields are not set to the desired values).
> Then, sendmsg() transmits a reply of incorrect length and doesn't return
> the expected value, which causes a second (error) reply to be sent, confusing
> the client. The client then discards the reply and resends the request after a
> (five second) timeout. At that point, amd has probably finished the mount
> operation, doesn't background the request, replies correctly and everything
> works as expected.
>
> The problem can obviously be hacked around by forcing the embedded msghdr's
> msg_iov field to point to the embedded iovec before passing the msghdr to
> sendmsg(), which the attached (one-line) patch does.
>
SunRPC support has been removed from glibc 2.32. Closing the bug
accordingly.
Regards,
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net
--- End Message ---