Hi there,

I'm not sure how I would test it with a real network device, but I'm happy to try building an older UML kernel and testing from there. As I said in my original email, the last known fully working build was on kernel 3.2, and a lot has changed since then. It could very well be a kernel issue that no one has come across because this is such an edge use case.

Is there a kernel version you'd like me to try out? Debian has a standard user-mode-linux package which contains prebuilt UML images with kernel versions 4.9, 4.19 or 5.5, if they'd be handy: https://tracker.debian.org/pkg/user-mode-linux

Thanks for the support,
Josh

On Thu, 23 Apr 2020 at 20:30, Simon Kelley <si...@thekelleys.org.uk> wrote:

> Ok, so Josh ran the strace and sent me the results as requested.
>
> The interesting bit is here.
>
> recvmsg(4, {msg_name={sa_family=AF_INET, sin_port=htons(68),
> sin_addr=inet_addr("0.0.0.0")}, msg_namelen=16,
> msg_iov=[{iov_base="\1\1\6\0\310\261\311+\0\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\366\226}H"...,
> iov_len=548}], msg_iovlen=1, msg_control=[{cmsg_len=24,
> cmsg_level=SOL_IP, cmsg_type=IP_PKTINFO,
> cmsg_data={ipi_ifindex=if_nametoindex("eth0"),
> ipi_spec_dst=inet_addr("192.168.1.1"),
> ipi_addr=inet_addr("255.255.255.255")}}], msg_controllen=24,
> msg_flags=0}, MSG_PEEK|MSG_TRUNC) = 300
> recvmsg(4, {msg_namelen=16}, 0) = -1 EAGAIN (Resource temporarily unavailable)
>
> The first call to recvmsg has the MSG_PEEK and MSG_TRUNC flags set.
> MSG_TRUNC causes the result to be the actual length of the received
> packet, even if it's longer than the supplied buffer (548), and MSG_PEEK
> is defined as:
>
>     MSG_PEEK
>         This flag causes the receive operation to return data from the
>         beginning of the receive queue without removing that data from
>         the queue. Thus, a subsequent receive call will return the same
>         data.
>
> So this allows the buffer to be expanded if necessary, and then recvmsg
> gets called again when the buffer is big enough, to actually get the
> data and remove it from the queue. In this case the packet is 300 bytes
> long and the buffer is already 548 bytes, so no expansion is needed; we
> just do the call again, without the MSG_PEEK|MSG_TRUNC flags. That's the
> second call to recvmsg, which returns EAGAIN: the socket is
> non-blocking, and this return says there's no packet queued. It looks
> like the kernel is ignoring the MSG_PEEK flag and dequeueing the data
> on the first call.
>
> I think this is a kernel bug.
>
> Josh, does this work with an older kernel or with a real network device,
> rather than the UML virtual device?
> It would be good to work out where the regression happened.
>
> Simon.
>
> On 16/04/2020 15:40, Josh H wrote:
> >
> >     First, answer a simple question the answer to which I may have missed.
> >     Is dnsmasq logging the receipt of DHCPDISCOVER messages? Can we see the
> >     whole log showing that?
> >
> > Based on the config I provided in the initial message, I have the log
> > file writing to /var/log/dnsmasq.log. This is the whole content of that
> > file:
> >
> > root@dns:~# cat /var/log/dnsmasq.log
> > Apr 16 15:36:50 dnsmasq[1695]: started, version 2.80 DNS disabled
> > Apr 16 15:36:50 dnsmasq[1695]: compile time options: IPv6 GNU-getopt DBus i18n IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth DNSSEC loop-detect inotify dumpfile
> > Apr 16 15:36:50 dnsmasq-dhcp[1695]: DHCP, IP range 192.168.1.3 -- 192.168.1.8, lease time 12h
> >
> > No mention of the DHCPDISCOVER being acknowledged.
> >
> >     The next stage is to run dnsmasq under strace (check back here if you
> >     need instructions on that) and see what system calls it's making.
> >
> > What command would I need to run for this? And what service is best to
> > upload the strace result, pastebin?
> >
> > Thanks,
> > Josh
> >
> > On Thu, 16 Apr 2020 at 12:49, Simon Kelley <si...@thekelleys.org.uk> wrote:
> >
> >     On 15/04/2020 19:27, Josh H wrote:
> > > It's difficult for me to share the config outright as I'm using a
> > > modified version of netkit that I've updated to a much newer kernel
> > > - http://netkit-ng.github.io/. The netkit version that is available on
> > > that link is the one that worked with dnsmasq just fine, and that
> > > version was 2.62 and kernel 3.2. However I've updated it and am running
> > > 2.80 and kernel 5.6.
> > >
> > > Anything else I can provide you with that might help? It's a very unique
> > > setup so I appreciate it's probably not the easiest thing to try and
> > > debug.
> > >
> >
> >     First, answer a simple question the answer to which I may have missed.
> >     Is dnsmasq logging the receipt of DHCPDISCOVER messages? Can we see the
> >     whole log showing that?
> >
> >     The next stage is to run dnsmasq under strace (check back here if you
> >     need instructions on that) and see what system calls it's making.
> >
> >     Simon.
> >
> >     _______________________________________________
> >     Dnsmasq-discuss mailing list
> >     Dnsmasq-discuss@lists.thekelleys.org.uk
> >     http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss
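
The peek-then-read sequence Simon describes above could probably be checked outside dnsmasq with a small standalone program, which might make it easier to compare kernel versions. The following is only a minimal sketch under several assumptions: it uses a plain UDP socket on the loopback interface rather than a broadcast DHCP packet on the UML eth0 device, it does not set IP_PKTINFO, and the helper name `peek_then_read` is made up for illustration and is not part of dnsmasq. It may therefore not exercise the same kernel path as the failing case.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>

#define BUF_LEN 548 /* same buffer size as in the strace output */

/* Hypothetical helper (not dnsmasq code): send one datagram to ourselves,
 * peek at it with MSG_PEEK|MSG_TRUNC, then read it again with no flags.
 * Returns 0 if both recvmsg() calls were made, -1 on a setup failure.
 * *peeked and *consumed receive the two recvmsg() return values; if the
 * second call fails, *consumed is -1 and errno holds the reason. */
int peek_then_read(size_t payload_len, ssize_t *peeked, ssize_t *consumed)
{
    char payload[BUF_LEN], buf[BUF_LEN];
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof addr;
    struct iovec iov;
    struct msghdr msg;
    struct pollfd pfd;
    int fd, fl, saved_errno;

    assert(payload_len <= sizeof payload);

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* Bind to 127.0.0.1 on a kernel-chosen port and learn that port. */
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &addr_len) < 0)
        goto fail;

    /* Non-blocking, so an empty queue shows up as EAGAIN. */
    fl = fcntl(fd, F_GETFL, 0);
    if (fl < 0 || fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0)
        goto fail;

    memset(payload, 0xAB, payload_len);
    if (sendto(fd, payload, payload_len, 0,
               (struct sockaddr *)&addr, sizeof addr) != (ssize_t)payload_len)
        goto fail;

    /* Wait (up to 1s) until the datagram is actually queued. */
    pfd.fd = fd;
    pfd.events = POLLIN;
    if (poll(&pfd, 1, 1000) != 1)
        goto fail;

    iov.iov_base = buf;
    iov.iov_len = sizeof buf;
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    /* First call: report the real length, leave the packet queued. */
    *peeked = recvmsg(fd, &msg, MSG_PEEK | MSG_TRUNC);
    /* Second call: should return the same packet and dequeue it. */
    *consumed = recvmsg(fd, &msg, 0);

    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return 0;

fail:
    close(fd);
    return -1;
}
```

Calling `peek_then_read(300, &peeked, &consumed)` from a small `main` and printing both values should show the same length twice on a kernel that honours MSG_PEEK. If the second value were -1 with errno set to EAGAIN, that would match the behaviour in the strace output, although a clean result on loopback would not by itself rule out a problem specific to the UML network device.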