On 11/13/25 1:05 PM, Tyler W. Ross wrote:
> On Thursday, November 13th, 2025 at 10:47 AM, Chuck Lever 
> <[email protected]> wrote:
> 
>>> ls-969 [003] ..... 270.327063: rpc_xdr_recvfrom: task:00000008@00000005 
>>> head=[0xffff8895c29fef64,140] page=4008(88) tail=[0xffff8895c29feff0,36] 
>>> len=988
>>> ls-969 [003] ..... 270.327067: rpc_xdr_overflow: task:00000008@00000005 
>>> nfsv4 READDIR requested=8 p=0xffff8895c29fefec end=0xffff8895c29feff0 
>>> xdr=[0xffff8895c29fef64,140]/4008/[0xffff8895c29feff0,36]/988
>>
>>
>> Here's the problem. This is a sign of an XDR decoding issue. If you
>> capture the traffic with Wireshark, does Wireshark indicate where the
>> XDR is malformed?
> 
> Wireshark appears to decode the READDIR reply without issue. Nothing is 
> obviously marked as malformed, and values all appear sane when spot-checking 
> fields in the decoded packet.

Then I would start looking for differences between the Debian 13 and
Fedora 43 kernel code bases under net/sunrpc/.

Alternatively, "git bisect first, ask questions later" ... :-)

I didn't find an indication of whether this mount uses sec=krb5,
sec=krb5i, or sec=krb5p. Knowing that might narrow down where the code
changed.
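
One way to confirm the flavor actually in effect is to read the sec=
option from the client's /proc/mounts. A minimal sketch, assuming the
usual /proc/mounts layout (device, mount point, fstype, options); the
helper name nfs_sec_flavors and the sample line are hypothetical:

```python
def nfs_sec_flavors(mounts_text):
    """Map each NFS mount point to its sec= option (None if not shown)."""
    flavors = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fstype, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        # "a=b" -> ("a", "b"); a bare flag such as "rw" -> ("rw", "")
        opts = dict(o.partition("=")[::2] for o in fields[3].split(","))
        flavors[fields[1]] = opts.get("sec")
    return flavors


# Hypothetical sample line, for illustration only (not from this report).
SAMPLE = "server:/export /mnt nfs4 rw,relatime,vers=4.2,sec=krb5p 0 0\n"
flavors = nfs_sec_flavors(SAMPLE)
```

On the affected client, pass it the contents of /proc/mounts instead of
SAMPLE.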

Also, the xdr_buf might have a page boundary positioned in the middle of
an XDR data item. Knowing which data item is being decoded when the
"overflow" occurs might be helpful (I think adding pr_info() call sites
or trace_printk() will be adequate to gain some better observability).
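
The quoted rpc_xdr_overflow event already carries enough to do that
arithmetic by hand. A rough sketch, assuming p is the current decode
position, end is the end of the current buffer segment, and xdr= is
printed as [head base,head len]/page len/[tail base,tail len]/total len;
the helper name analyze_overflow is hypothetical:

```python
import re

# Field layout is assumed from the rpc_xdr_overflow line quoted above.
OVERFLOW_RE = re.compile(
    r"requested=(?P<requested>\d+)\s+"
    r"p=(?P<p>0x[0-9a-f]+)\s+"
    r"end=(?P<end>0x[0-9a-f]+)\s+"
    r"xdr=\[(?P<head_base>0x[0-9a-f]+),(?P<head_len>\d+)\]"
    r"/(?P<page_len>\d+)"
    r"/\[(?P<tail_base>0x[0-9a-f]+),(?P<tail_len>\d+)\]"
    r"/(?P<len>\d+)"
)


def analyze_overflow(line):
    """Report how far short of the requested size the current segment is."""
    m = OVERFLOW_RE.search(line)
    if m is None:
        raise ValueError("not an rpc_xdr_overflow line")
    requested = int(m["requested"])
    p = int(m["p"], 16)
    end = int(m["end"], 16)
    head_base = int(m["head_base"], 16)
    head_end = head_base + int(m["head_len"])
    return {
        "requested": requested,
        "available": end - p,             # bytes left before 'end'
        "shortfall": requested - (end - p),
        "offset_in_head": p - head_base,  # decode position within the head
        "at_head_end": end == head_end,   # does 'end' coincide with head end?
    }


# The event from the trace above, with the mail line wrapping undone.
TRACE_LINE = (
    "rpc_xdr_overflow: task:00000008@00000005 nfsv4 READDIR requested=8 "
    "p=0xffff8895c29fefec end=0xffff8895c29feff0 "
    "xdr=[0xffff8895c29fef64,140]/4008/[0xffff8895c29feff0,36]/988"
)
result = analyze_overflow(TRACE_LINE)
```

For this event the decoder asked for 8 bytes at offset 136 of the
140-byte head, so only 4 bytes remained before the end of the head.
That is consistent with a data item straddling the head/page boundary.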


-- 
Chuck Lever
