Bug#1040343: linux-image-5.10.0-9-amd64: Kernel silenty de-supported nfsv3 UDP mounts

2023-07-18 Thread Hauke Fath

On 7/17/23 20:29, Ben Hutchings wrote:

> > But the router's packet filter will have lost state after a reboot,
> > and reject packets from tcp connections that the clients assume to
> > exist. This is not a problem with udp, because connection-less.
>
> Ah, I see.  You didn't mention that there was dynamic NAT involved
> before.


Because it isn't. What is involved is a stateful packet filter (FreeBSD 
pf). I said


| We run nfs through a router (several client subnets accessing servers
| in an internal server subnet), and found nfs over udp a lot more
| robust in the face of router reboots.


> If an NFS server is rebooted abruptly (so it doesn't properly close TCP
> connections), once it's back up it will respond to any requests from
> clients with a TCP RST, and they should reconnect.


Understood, and not relevant here.


> If a NAT router between client and server is rebooted, I think that
> something similar should happen, but the router would need to send the
> TCP RST instead.


After a router reboot, the stateful packet filter will have lost 
information on active tcp connections, and (rightfully) reject packets 
for what the nfs clients (rightfully) see as an existing connection.



> Is your router configured to send a TCP RST when receiving a packet for
> an unknown connection, or does it just drop those packets?  (In
> iptables this is the difference between REJECT and DROP policies.)


The router defaults to returning RST.

Anyway: I am not asking for a udp default here, but simply for Debian
to keep providing the _option_, and leave the decision to me, the admin.


Cheerio,
Hauke


--
     The ASCII Ribbon Campaign                    Hauke Fath
()     No HTML/RTF in email            Institut für Nachrichtentechnik
/\     No Word docs in email                     TU Darmstadt
     Respect for open standards              Ruf +49-6151-16-21344



Bug#1040343: linux-image-5.10.0-9-amd64: Kernel silenty de-supported nfsv3 UDP mounts

2023-07-17 Thread Ben Hutchings
On Mon, 2023-07-17 at 09:05 +0200, Hauke Fath wrote:
> On Sun, 16 Jul 2023 20:14:20 +0200, Ben Hutchings wrote:
> > > 
> > > nfsv3 over tcp works, but is suboptimal, as described - when the router 
> > > goes down, the tcp mounts will hang, and the machine will have to be 
> > > rebooted.
> > [...]
> > 
> > Does this mean you are using the "soft" mount option?  Without that, I
> > would expect access to the mount to hang until the network connection
> > is restored, regardless of whether the TCP or UDP transport is used.
> 
> No, we use hard mounts.
> 
> But the router's packet filter will have lost state after a reboot, 
> and reject packets from tcp connections that the clients assume to 
> exist. This is not a problem with udp, because connection-less.

Ah, I see.  You didn't mention that there was dynamic NAT involved
before.

If an NFS server is rebooted abruptly (so it doesn't properly close TCP
connections), once it's back up it will respond to any requests from
clients with a TCP RST, and they should reconnect.

If a NAT router between client and server is rebooted, I think that
something similar should happen, but the router would need to send the
TCP RST instead.

Is your router configured to send a TCP RST when receiving a packet for
an unknown connection, or does it just drop those packets?  (In
iptables this is the difference between REJECT and DROP policies.)
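
The difference between the two behaviours can be sketched with a small dry-run script. This is only an illustration under assumptions: the `run` and `stale_tcp_rule` helpers are hypothetical names, the FORWARD chain and the `NEW` state match are one plausible way to catch mid-stream segments after a state loss (depending on the `nf_conntrack_tcp_loose` sysctl such segments may be classified differently), and by default nothing is executed, the commands are only printed.

```shell
#!/bin/sh
# Dry-run wrapper: print the command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Emit a rule for forwarded TCP segments that start a new conntrack
# entry without being a SYN, i.e. segments of a connection the filter
# no longer knows about. $1 selects the behaviour: "reject" or "drop".
stale_tcp_rule() {
    case $1 in
        reject) set -- -j REJECT --reject-with tcp-reset ;;  # answer with RST
        drop)   set -- -j DROP ;;                            # discard silently
        *)      echo "usage: stale_tcp_rule reject|drop" >&2; return 1 ;;
    esac
    run iptables -A FORWARD -p tcp ! --syn \
        -m conntrack --ctstate NEW "$@"
}

stale_tcp_rule reject
stale_tcp_rule drop
```

With the reject variant the client receives an RST and can reconnect at once; with the drop variant it only notices the broken connection after its own retransmission timers expire.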

Ben.

-- 
Ben Hutchings
Never attribute to conspiracy what can adequately be explained
by stupidity.





Bug#1040343: linux-image-5.10.0-9-amd64: Kernel silenty de-supported nfsv3 UDP mounts

2023-07-17 Thread Hauke Fath
On Sun, 16 Jul 2023 20:14:20 +0200, Ben Hutchings wrote:
>> 
>> nfsv3 over tcp works, but is suboptimal, as described - when the router 
>> goes down, the tcp mounts will hang, and the machine will have to be 
>> rebooted.
> [...]
> 
> Does this mean you are using the "soft" mount option?  Without that, I
> would expect access to the mount to hang until the network connection
> is restored, regardless of whether the TCP or UDP transport is used.

No, we use hard mounts.

But the router's packet filter will have lost state after a reboot, 
and reject packets from tcp connections that the clients assume to 
exist. This is not a problem with udp, because connection-less.

Cheerio,
Hauke

-- 
     The ASCII Ribbon Campaign                    Hauke Fath
()     No HTML/RTF in email            Institut für Nachrichtentechnik
/\     No Word docs in email                     TU Darmstadt
     Respect for open standards              Ruf +49-6151-16-21344



Bug#1040343: linux-image-5.10.0-9-amd64: Kernel silenty de-supported nfsv3 UDP mounts

2023-07-16 Thread Ben Hutchings
On Wed, 2023-07-05 at 15:18 +0200, Hauke Fath wrote:
> On 7/5/23 10:20, Bastian Blank wrote:
> > On Tue, Jul 04, 2023 at 06:35:44PM +0200, Hauke Fath wrote:
> > > /misc /etc/auto.misc  
> > > -nfsvers=3,proto=udp,resvport,retrans=5,rsize=16384,wsize=16384,rw,hard
> > And if you set it to TCP (the default) or better directly switch to
> > NFSv4?
> 
> nfsv3 over tcp works, but is suboptimal, as described - when the router 
> goes down, the tcp mounts will hang, and the machine will have to be 
> rebooted.
[...]

Does this mean you are using the "soft" mount option?  Without that, I
would expect access to the mount to hang until the network connection
is restored, regardless of whether the TCP or UDP transport is used.

The default retry and timeout behaviour *is* different between
transports, though.  See the "timeo" and "retrans" options in nfs(5). 
You may wish to override the defaults in your environment.
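
As a rough sketch of what such an override could look like for the TCP transport: the helper name `nfs_tcp_opts`, the server name, the export path and the numeric values below are all placeholders for illustration, not recommendations. `timeo` is given in tenths of a second, as described in nfs(5). The script only prints the resulting mount command.

```shell
#!/bin/sh
# Build an NFSv3-over-TCP option string with explicit retry settings.
# $1 = timeo in tenths of a second, $2 = retrans (both placeholders).
nfs_tcp_opts() {
    timeo=${1:-150}
    retrans=${2:-3}
    echo "nfsvers=3,proto=tcp,timeo=${timeo},retrans=${retrans},rsize=16384,wsize=16384,rw,hard"
}

# Print the mount command instead of running it.
# server.example:/export is a hypothetical server and export.
echo "mount -t nfs -o $(nfs_tcp_opts 150 3) server.example:/export /mnt"
```

The same option string could be used in an automounter map entry in place of the `proto=udp` options quoted above.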

Ben.

-- 
Ben Hutchings
Theory and practice are closer in theory than in practice - John Levine





Bug#1040343: linux-image-5.10.0-9-amd64: Kernel silenty de-supported nfsv3 UDP mounts

2023-07-05 Thread Hauke Fath

On 7/5/23 10:20, Bastian Blank wrote:

> On Tue, Jul 04, 2023 at 06:35:44PM +0200, Hauke Fath wrote:
> > /misc   /etc/auto.misc  
> > -nfsvers=3,proto=udp,resvport,retrans=5,rsize=16384,wsize=16384,rw,hard
>
> And if you set it to TCP (the default) or better directly switch to
> NFSv4?


nfsv3 over tcp works, but is suboptimal, as described - when the router 
goes down, the tcp mounts will hang, and the machine will have to be 
rebooted.


We do not use nfsv4 here.


> > This upstream decision is more than
> > debatable -- we have been running nfsv3 over UDP for ~20 years here
> > without ever seeing the data corruption that was claimed as
> > motivation.
>
> That you have to take up with upstream, not us.


Note we are not talking about code changes here. This is about a kernel 
configuration option, which is very much at the discretion of a 
distribution (as the Arch example has shown).
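
To make concrete how small the change in question is, here is a minimal sketch that flips the option in a kernel config file. The helper name `enable_nfs_udp` is hypothetical, and the script works on a throwaway file rather than a real configuration. Inside a kernel source tree, `scripts/config --disable NFS_DISABLE_UDP_SUPPORT` should achieve the same edit; either way the kernel still has to be rebuilt afterwards.

```shell
#!/bin/sh
# Print a copy of a kernel config with the "disable NFS over UDP"
# switch turned off. The input file itself is left untouched.
enable_nfs_udp() {
    sed 's/^CONFIG_NFS_DISABLE_UDP_SUPPORT=y$/# CONFIG_NFS_DISABLE_UDP_SUPPORT is not set/' "$1"
}

# Demonstration on a temporary two-line config.
demo=$(mktemp)
printf '%s\n' 'CONFIG_NFS_FS=m' 'CONFIG_NFS_DISABLE_UDP_SUPPORT=y' > "$demo"
enable_nfs_udp "$demo"
rm -f "$demo"
```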



> Also this talks about a
> problem with fragment reassembly, not data corruption itself.  But
> usually I would trust upstream to know more about it than yourself.


At this point, we have 20 years of experience with running nfsv3 over 
udp for ~40 clients that mount user homes over nfs.


On 7/5/23 14:34, Ben Hutchings wrote:

> This was an upstream change in Linux 5.6 that we won't override.
> NFS-over-TCP has been well supported on Linux, and better performing,
> for a long time.


This request is not about defaulting, or even preferring, udp over tcp.

It is simply about having the option, for interoperability as well as 
for situations (and they do exist, despite the blanket statement), where 
nfsv3 over udp provides more robust service -- without having to deploy 
a self-compiled kernel.


Please re-consider, and re-open the ticket.

Cheerio,
Hauke

--
     The ASCII Ribbon Campaign                    Hauke Fath
()     No HTML/RTF in email            Institut für Nachrichtentechnik
/\     No Word docs in email                     TU Darmstadt
     Respect for open standards              Ruf +49-6151-16-21344



Bug#1040343: linux-image-5.10.0-9-amd64: Kernel silenty de-supported nfsv3 UDP mounts

2023-07-05 Thread Bastian Blank
Control: severity -1 normal

Hi

On Tue, Jul 04, 2023 at 06:35:44PM +0200, Hauke Fath wrote:
> /misc /etc/auto.misc  
> -nfsvers=3,proto=udp,resvport,retrans=5,rsize=16384,wsize=16384,rw,hard

And if you set it to TCP (the default) or better directly switch to
NFSv4?

> This upstream decision is more than
> debatable -- we have been running nfsv3 over UDP for ~20 years here
> without ever seeing the data corruption that was claimed as
> motivation.

That you have to take up with upstream, not us.  Also this talks about a
problem with fragment reassembly, not data corruption itself.  But
usually I would trust upstream to know more about it than yourself.

Bastian

-- 
Another dream that failed.  There's nothing sadder.
-- Kirk, "This side of Paradise", stardate 3417.3



Bug#1040343: linux-image-5.10.0-9-amd64: Kernel silenty de-supported nfsv3 UDP mounts

2023-07-04 Thread Hauke Fath
Package: src:linux
Version: 5.10.70-1
Severity: important

Dear Maintainer,

*** Reporter, please consider answering these questions, where appropriate ***

   * What led up to the situation?

Configuring this machine to mount nfs shares

   * What exactly did you do (or not do) that was effective (or
 ineffective)?

I set up the automounter with the standard mount options of our Arch
clients:

(auto.master)
/misc   /etc/auto.misc  
-nfsvers=3,proto=udp,resvport,retrans=5,rsize=16384,wsize=16384,rw,hard

When the mount failed, I repeated it manually.

   * What was the outcome of this action?

The mount failed:

# mount -t nfs -vvv -o 
nfsvers=3,proto=udp,retrans=5,rsize=16384,wsize=16384,rw,hard 
:/u/pkgsrc /mnt
mount.nfs: timeout set for Tue Jul  4 18:23:55 2023
mount.nfs: trying text-based options 
'nfsvers=3,proto=udp,retrans=5,rsize=16384,wsize=16384,hard,addr='
mount.nfs: prog 13, trying vers=3, prot=17
mount.nfs: trying 130.83.197.22 prog 13 vers 3 prot UDP port 2049
mount.nfs: prog 15, trying vers=3, prot=17
mount.nfs: trying 130.83.197.22 prog 15 vers 3 prot UDP port 701
mount.nfs: mount(2): Invalid argument
mount.nfs: an incorrect mount option was specified
#

   * What outcome did you expect instead?

A successful nfs share mount.

Instead,  pointed 
me to 

# grep "NFS.*UDP" /boot/config-5.10.0-9-amd64
CONFIG_NFS_DISABLE_UDP_SUPPORT=y
#

which disables UDP support for nfsv3 in the kernel.
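
The same check can be wrapped in a small function, for instance to run it across several clients. This is a sketch under assumptions: `nfs_udp_status` is a hypothetical helper name, and the demonstration uses a temporary file instead of the real `/boot/config-*`. A config without the option at all (as on kernels that predate it) is reported as "enabled".

```shell
#!/bin/sh
# Report whether a kernel config file disables NFS over UDP.
# Prints "disabled", "enabled", or "unknown" if the file is unreadable.
nfs_udp_status() {
    config_file=$1
    if [ ! -r "$config_file" ]; then
        echo "unknown"
        return 2
    fi
    if grep -q '^CONFIG_NFS_DISABLE_UDP_SUPPORT=y$' "$config_file"; then
        echo "disabled"
    else
        echo "enabled"
    fi
}

# Demonstration against a throwaway file rather than the real /boot config.
demo_config=$(mktemp)
printf '%s\n' 'CONFIG_NFS_V3=m' 'CONFIG_NFS_DISABLE_UDP_SUPPORT=y' > "$demo_config"
nfs_udp_status "$demo_config"
rm -f "$demo_config"
```

On a real system the call would be something like `nfs_udp_status "/boot/config-$(uname -r)"`.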

This upstream decision is more than
debatable -- we have been running nfsv3 over UDP for ~20 years here
without ever seeing the data corruption that was claimed as
motivation.

We run nfs through a router (several client subnets accessing servers
in an internal server subnet), and found nfs over udp a lot more
robust in the face of router reboots.


*** End of the template - remove these template lines ***


-- Package-specific info:
** Version:
Linux version 5.10.0-9-amd64 (debian-ker...@lists.debian.org) (gcc-10 (Debian 
10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP 
Debian 5.10.70-1 (2021-09-30)

** Command line:
root=UUID=67153f14-9542-47ac-9643-c3be9ed3e33c ro initrd=/install/initrd.gz 
quiet

** Not tainted

** Kernel log:
[9.405659] systemd[1]: Queued start job for default target Graphical 
Interface.
[9.406611] random: systemd: uninitialized urandom read (16 bytes read)
[9.409768] systemd[1]: Created slice system-getty.slice.
[9.409969] random: systemd: uninitialized urandom read (16 bytes read)
[9.410579] systemd[1]: Created slice system-modprobe.slice.
[9.411370] systemd[1]: Created slice system-serial\x2dgetty.slice.
[9.412046] systemd[1]: Created slice system-systemd\x2dfsck.slice.
[9.412681] systemd[1]: Created slice User and Session Slice.
[9.412965] systemd[1]: Started Forward Password Requests to Wall Directory 
Watch.
[9.413483] systemd[1]: Set up automount Arbitrary Executable File Formats 
File System Automount Point.
[9.413723] systemd[1]: Reached target User and Group Name Lookups.
[9.413828] systemd[1]: Reached target Slices.
[9.413924] systemd[1]: Reached target Swap.
[9.414209] systemd[1]: Listening on Device-mapper event daemon FIFOs.
[9.414533] systemd[1]: Listening on LVM2 poll daemon socket.
[9.432622] systemd[1]: Listening on RPCbind Server Activation Socket.
[9.434023] systemd[1]: Listening on Syslog Socket.
[9.434436] systemd[1]: Listening on fsck to fsckd communication Socket.
[9.434676] systemd[1]: Listening on initctl Compatibility Named Pipe.
[9.435329] systemd[1]: Listening on Journal Audit Socket.
[9.435744] systemd[1]: Listening on Journal Socket (/dev/log).
[9.436278] systemd[1]: Listening on Journal Socket.
[9.437130] systemd[1]: Listening on udev Control Socket.
[9.437505] systemd[1]: Listening on udev Kernel Socket.
[9.438066] systemd[1]: Condition check resulted in Huge Pages File System 
being skipped.
[9.441114] systemd[1]: Mounting POSIX Message Queue File System...
[9.444148] systemd[1]: Mounting RPC Pipe File System...
[9.447699] systemd[1]: Mounting Kernel Debug File System...
[9.451315] systemd[1]: Mounting Kernel Trace File System...
[9.451885] systemd[1]: Condition check resulted in Kernel Module supporting 
RPCSEC_GSS being skipped.
[9.452326] systemd[1]: Finished Availability of block devices.
[9.455957] systemd[1]: Starting Set the console keyboard layout...
[9.459351] systemd[1]: Starting Create list of static device nodes for the 
current kernel...
[9.462868] systemd[1]: Starting Monitoring of LVM2 mirrors, snapshots etc. 
using dmeventd or progress polling...
[9.466545] systemd[1]: Starting Load Kernel Module configfs...
[9.470181] systemd[1]: Starting Load Kernel Module drm...
[9.473529] systemd[1]: Starting Load Kernel Module