Anybody else seeing periodic "stalls" in networking with 5.4 kernels
and the r8169 driver? My current feeling is that a lot of network
and other changes got into 5.4 because it is the next LTS kernel,
and on r8169 (which apparently has a bad reputation with some people)
some variants might be very iffy (although fine in 5.2 and
earlier 5.3 stable releases).
I can't say that I initially noticed any problems on some machines,
but I make heavy use of nfs (v3) for my sources, my git tree with my
build scripts, my notes, and for writing the first stage of my
backups (rsync to an nfs mount).
On a machine running 5.4.1 I eventually noticed that the backups
were timing out because the mountpoint was already mounted (in that
case my script loops for up to an hour, since there is no point in
thrashing disks, and then gives up). That can legitimately happen
if a very large new system needs to be backed up. So I looked,
found rsync was not active, and unmounted it. It seemed ok for a
while, but the problem recurred.
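
For anyone wanting to reproduce the "wait up to an hour, then give up"
behaviour, here is a minimal sketch in Python. It is not my actual
backup script: the function name, the one-minute poll interval and the
injectable helper parameters are all assumptions made for illustration.

```python
import os
import time


def wait_for_free_mountpoint(path, timeout=3600.0, poll=60.0,
                             is_mounted=os.path.ismount,
                             sleep=time.sleep, clock=time.monotonic):
    """Return True once `path` is no longer a mountpoint, False on timeout.

    Hypothetical helper: the defaults (one hour, polling every minute)
    only mirror the behaviour described above, not a real script.
    """
    deadline = clock() + timeout
    while is_mounted(path):
        if clock() >= deadline:
            # Still mounted after the whole timeout: give up.
            return False
        # Sleep between checks instead of busy-waiting.
        sleep(poll)
    return True
```

A backup wrapper could call this before mounting and skip the run if it
returns False. Note that it cannot tell a long-running backup from a
stale mount left behind by a stalled one, which is exactly the case I
hit.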
Meanwhile I started a first desktop build on my skylake (I've
repurposed it: it used to be my home server but will now sometimes
be used for desktop tests and other times for server tests). I
figured that I should use a current kernel (the host was LFS-9.0, so
I updated to 5.4.1).
But during the build I got a lot of network "stalls", and when that
happened my sessions were unusable. After getting part way through
BLFS I decided to stop and build a 5.3.11 kernel. That appeared to
be ok, built a whole chunk more but then crapped out in falkon (that
turns out to be a qt-5.14 issue, a header is no longer pulled in).
At that point I tried s2ram and left it until I had more time.
Today I woke it from suspend, and got more network stalls, now with
the 5.3.11 kernel - so, it is possible that waking from suspend is
unreliable (I never had reason to suspend or hibernate when it was a
server!).
In the meantime I had updated the machine running 5.4.1 to 5.4.5
yesterday evening, run it for a while (all seemed well), suspended it,
woken it again, and all seemed good. So I upgraded the skylake to
5.4.5 and resumed my BLFS build - that has now all completed. But
in the meantime I've seen that backups on the other machine have
timed out again. And while writing this email (over nfs) I've had
three stalls.
I'm fairly sure that some of the network changes in 5.4 (and
backported to later 5.3) are responsible, but finding a reliable
test for good or bad seems to be hard, so I'm reluctant to bisect.
I suppose I'll try 5.4.6 now.
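
One possible shape for a good/bad test, as a rough sketch only: time a
small synchronous write to the nfs mount over and over, and count how
many attempts take longer than some threshold. The function names, the
4 KiB payload and the 5 second threshold are assumptions, not something
I have tried, and a stall that only shows up after hours or after a
resume would need a much longer run than the defaults here.

```python
import os
import time


def timed_write(path, payload=b"x" * 4096):
    """Write and fsync a small file; return the elapsed time in seconds."""
    start = time.monotonic()
    with open(path, "wb") as fh:
        fh.write(payload)
        fh.flush()
        # fsync forces the data out to the server, so a network stall
        # shows up as a long elapsed time rather than hiding in cache.
        os.fsync(fh.fileno())
    return time.monotonic() - start


def count_stalls(probe, rounds=60, threshold=5.0, pause=1.0,
                 sleep=time.sleep):
    """Call `probe` `rounds` times; count calls slower than `threshold`."""
    stalls = 0
    for _ in range(rounds):
        if probe() > threshold:
            stalls += 1
        sleep(pause)
    return stalls
```

Usage would be something like
count_stalls(lambda: timed_write("/path/on/nfs/.probe")), where the
path is a placeholder. A kernel that reports zero stalls over a long
enough run could be called "good", but picking "long enough" is the
hard part.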
ĸen
--
We've all got both light and dark inside of us.
What matters is the part we choose to act on.
-- Sirius Black