On 4/12/21 10:38 AM, Eric Dumazet wrote:
[ ... ]

> Yes, I think this is the real issue here. This smells like some memory
> corruption.
> 
> In my traces, packet is correctly received in AF_PACKET queue.
> 
> I have checked the skb is well formed.
> 
> But the user space seems to never call poll() and recvmsg() on this
> af_packet socket.
> 

After sprinkling the kernel with debug messages:

424   00:01:33.674181 sendto(6, "E\0\1H\0\0\0\0@\21y\246\0\0\0\0\377\377\377\377\0D\0C\00148\346\1\1\6\0\246\336\333\v\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0RT\0\
424   00:01:33.693873 close(6)          = 0
424   00:01:33.694652 fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
424   00:01:33.695213 clock_gettime64(CLOCK_MONOTONIC, 0x7be18a18) = -1 EFAULT (Bad address)
424   00:01:33.695889 write(2, "udhcpc: clock_gettime(MONOTONIC) failed\n", 40) = -1 EFAULT (Bad address)
424   00:01:33.697311 exit_group(1)     = ?
424   00:01:33.698346 +++ exited with 1 +++

I only see that after adding debug messages in the kernel, so I guess there
must be a heisenbug somewhere.

Anyway, indeed, I see (another kernel debug message):

__do_sys_clock_gettime: Returning -EFAULT on address 0x7bacc9a8

So udhcpc doesn't even try to read the reply because it crashes after sendto()
when trying to read the current time. Unless I am missing something, that means
that the problem happens somewhere on the send side.

To make things even more interesting, it looks like the failing system call
isn't always clock_gettime().
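
One way to check which system calls fail from run to run is to tally the
EFAULT returns in the strace logs. The following is only a minimal sketch in
Python, assuming strace output in the format shown above with each record on
a single line; the helper name efault_syscalls and the regular expression are
illustrative, and the sample lines are copied from the trace above.

```python
import re
from collections import Counter

# Matches records such as:
#   424   00:01:33.695213 clock_gettime64(CLOCK_MONOTONIC, 0x7be18a18) = -1 EFAULT (Bad address)
# Assumption: every record is on one line and starts with "<pid> <timestamp>".
EFAULT_RE = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<time>[\d:.]+)\s+(?P<syscall>\w+)\(.*\)\s+=\s+-1\s+EFAULT\b"
)


def efault_syscalls(lines):
    """Count the syscalls that returned -1 EFAULT, keyed by syscall name."""
    counts = Counter()
    for line in lines:
        match = EFAULT_RE.match(line)
        if match:
            counts[match.group("syscall")] += 1
    return counts


# Sample records taken from the trace above.
sample = [
    '424   00:01:33.693873 close(6)          = 0',
    '424   00:01:33.695213 clock_gettime64(CLOCK_MONOTONIC, 0x7be18a18) = -1 EFAULT (Bad address)',
    '424   00:01:33.695889 write(2, "udhcpc: clock_gettime(MONOTONIC) failed\\n", 40) = -1 EFAULT (Bad address)',
]

for name, count in efault_syscalls(sample).items():
    print(name, count)
```

Feeding it the logs from several boots would show whether the first failing
call moves around or is always the first call after sendto().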

Guenter
