ACK 3.13.0-35-generic #62~lp1349768v5v201408250842 appears to resolve the issue completely on my actual test setup as well. No TFTP stalls, dnsmasq EPERM errors or dmesg errors, and ftrace doesn't show any calls to ipv6_find_hdr.

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1349768

Title:
  kernel 3.13.0-32 ipvs "IPv6 header not found" related to UDP socket
  sendto() EPERM errors

Status in “linux” package in Ubuntu:
  In Progress
Status in “linux” source package in Trusty:
  In Progress
Status in “linux” source package in Utopic:
  In Progress

Bug description:
  I have an Ubuntu 14.04 host that I am using as both a keepalived/ipvs
  loadbalancer and dnsmasq server for pxebooting servers. After updating
  linux-image 3.13.0-29.53 -> 3.13.0-32.57 I noticed that dnsmasq-tftp
  stopped working. pxeboot clients would hang on the "Loading
  ..../linux" TFTP transfer, with the transfer stalling roughly ~1000
  blocks into the transfer:

  10:30:51.011728 IP 10.1.1.2.43540 > 10.1.12.1.49165: UDP, length 1412
  10:30:51.011924 IP 10.1.12.1.49165 > 10.1.1.2.43540: UDP, length 4
  10:30:51.012012 IP 10.1.1.2.43540 > 10.1.12.1.49165: UDP, length 1412
  10:30:51.012183 IP 10.1.12.1.49165 > 10.1.1.2.43540: UDP, length 4

  stracing dnsmasq I noticed something very odd: sendto() on the
  socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) would suddenly start
  persistently returning EPERM in mid-transfer, even when dnsmasq
  continued to periodically retry:

  select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 1 (in [17], left {0, 249834})
  recvfrom(17, "\0\4\3\352", 4096, 0, NULL, NULL) = 4
  lseek(16, 1410816, SEEK_SET) = 1410816
  read(16, "\25\306\345f\2{\r\4)W\276\32\336q\252_\230q\213\341U\354\25\374k7\243\32\221X+\v"..., 1408) = 1408
  sendto(17, "\0\3\3\353\25\306\345f\2{\r\4)W\276\32\336q\252_\230q\213\341U\354\25\374k7\243\32"..., 1412, 0, {sa_family=AF_INET, sin_port=htons(49165), sin_addr=inet_addr("10.1.11.3")}, 16) = 1412
  select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 1 (in [17], left {0, 249839})
  recvfrom(17, "\0\4\3\353", 4096, 0, NULL, NULL) = 4
  lseek(16, 1412224, SEEK_SET) = 1412224
  read(16, "*\360 <C\363l\320:\256~\307\236\26P\323\274%\260\362\341&\232\r\243\370\224\277\221\\\307\372"..., 1408) = 1408
  sendto(17, "\0\3\3\354*\360 <C\363l\320:\256~\307\236\26P\323\274%\260\362\341&\232\r\243\370\224\277"..., 1412, 0, {sa_family=AF_INET, sin_port=htons(49165), sin_addr=inet_addr("10.1.11.3")}, 16) = -1 EPERM (Operation not permitted)
  select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
  select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
  select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
  select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
  select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
  select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
  select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
  select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)
  lseek(16, 1412224, SEEK_SET) = 1412224
  read(16, "*\360 <C\363l\320:\256~\307\236\26P\323\274%\260\362\341&\232\r\243\370\224\277\221\\\307\372"..., 1408) = 1408
  sendto(17, "\0\3\3\354*\360 <C\363l\320:\256~\307\236\26P\323\274%\260\362\341&\232\r\243\370\224\277"..., 1412, 0, {sa_family=AF_INET, sin_port=htons(49165), sin_addr=inet_addr("10.1.11.3")}, 16) = -1 EPERM (Operation not permitted)

  This was with all iptables rules unloaded (so no OUTPUT -j DENY) and
  apparmor profiles torn down.
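
  For anyone trying to confirm the same symptom in their own capture,
  the strace pattern above (sendto() succeeding, then returning EPERM on
  every retry) can be tallied with a short script. This is only a
  minimal sketch and not part of the original report: the function name
  summarize_sendto and the SAMPLE lines (abbreviated from the trace
  above) are illustrative assumptions, and the regex assumes strace's
  default one-syscall-per-line output.

```python
import re
from collections import Counter

# Matches an strace line for sendto(): captures the fd, the numeric
# return value and, if present, the errno name (e.g. EPERM).
SENDTO_RE = re.compile(r'^sendto\((\d+),.*\)\s+=\s+(-?\d+)(?:\s+(E[A-Z]+))?')


def summarize_sendto(lines):
    """Count sendto() calls, keyed by "ok" or by errno name on failure."""
    counts = Counter()
    for line in lines:
        match = SENDTO_RE.match(line.strip())
        if match is None:
            continue  # not a sendto() line (select, recvfrom, read, ...)
        result, errno_name = int(match.group(2)), match.group(3)
        if result < 0:
            counts[errno_name or "unknown"] += 1
        else:
            counts["ok"] += 1
    return counts


# Hypothetical sample, abbreviated from the strace output in the report.
SAMPLE = [
    r'recvfrom(17, "\0\4\3\352", 4096, 0, NULL, NULL) = 4',
    r'sendto(17, "\0\3\3\353"..., 1412, 0, {sa_family=AF_INET, '
    r'sin_port=htons(49165), sin_addr=inet_addr("10.1.11.3")}, 16) = 1412',
    r'sendto(17, "\0\3\3\354"..., 1412, 0, {sa_family=AF_INET, '
    r'sin_port=htons(49165), sin_addr=inet_addr("10.1.11.3")}, 16) '
    r'= -1 EPERM (Operation not permitted)',
    r'select(18, [4 5 6 7 8 9 10 11 12 15 17], [], [], {0, 250000}) = 0 (Timeout)',
    r'sendto(17, "\0\3\3\354"..., 1412, 0, {sa_family=AF_INET, '
    r'sin_port=htons(49165), sin_addr=inet_addr("10.1.11.3")}, 16) '
    r'= -1 EPERM (Operation not permitted)',
]

if __name__ == "__main__":
    print(summarize_sendto(SAMPLE))
```

  A non-zero EPERM count with no firewall rules loaded would match the
  behaviour described here. The script only summarizes a trace; it does
  not say which kernel component rejected the packet.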

  I also noticed the following dmesgs appearing at roughly similar times
  to the tftp transfers getting stuck (although not coinciding exactly
  with the stall):

  [70325.516724] IPv6 header not found

  The error pointed to ipvs (which I am using on the same host as an
  IPv4 NAT loadbalancer):

  http://archive.linuxvirtualserver.org/html/lvs-devel/2012-08/msg00018.html
  http://comments.gmane.org/gmane.comp.linux.lvs.devel/3614

  I then tore down the ipvs rules (service keepalived stop) and unloaded
  the modules (rmmod ip_vs_rr ip_vs), and the issue resolved itself -
  the stalled dnsmasq-tftp transfer resumed!

  This seems to be reproducible, i.e. modprobing ip_vs and starting
  keepalived will cause dnsmasq-tftp to stall again, and
  stopping/unloading will resume.

  This seems to happen reproducibly on boot with -32 and -30. This does
  NOT seem to happen with 3.13.0-29 which I was using up until now.
  ---
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Jul 29 13:43 seq
   crw-rw---- 1 root audio 116, 33 Jul 29 13:43 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.14.1-0ubuntu3.2
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory
  DistroRelease: Ubuntu 14.04
  HibernationDevice: RESUME=/dev/mapper/catcp2-swap
  InstallationDate: Installed on 2014-06-03 (56 days ago)
  InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Release amd64 (20140416.2)
  MachineType: Dell Inc. PowerEdge R410
  Package: linux-image-3.13.0-32-generic 3.13.0-32.57
  PackageArchitecture: amd64
  PciMultimedia:

  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   LANG=C.UTF-8
   SHELL=/bin/bash
  ProcFB:

  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-32-generic root=/dev/mapper/hostname-root ro console=ttyS1,115200n8 console=tty0 nomdmonddf nomdmonisw
  ProcVersionSignature: Ubuntu 3.13.0-32.57-generic 3.13.11.4
  RelatedPackageVersions:
   linux-restricted-modules-3.13.0-32-generic N/A
   linux-backports-modules-3.13.0-32-generic  N/A
   linux-firmware                             1.127.5
  RfKill: Error: [Errno 2] No such file or directory
  Tags:  trusty
  Uname: Linux 3.13.0-32-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:

  _MarkForUpload: True
  dmi.bios.date: 07/30/2013
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 1.12.0
  dmi.board.name: 01V648
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A03
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvr1.12.0:bd07/30/2013:svnDellInc.:pnPowerEdgeR410:pvr:rvnDellInc.:rn01V648:rvrA03:cvnDellInc.:ct23:cvr:
  dmi.product.name: PowerEdge R410
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349768/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp