Alex Still wrote:

We're using tcpdump to try to diagnose an NFS performance problem.
We're seeing a lot of these :
[..]
16:01:19.890794 IP nfs_server.nfs > client_46.1205729087: reply ERR 1448
16:01:19.890831 IP nfs_server.nfs > client_46.3664003007: reply ERR 1448
[..]

That seems to be an RPC error,

It seems to be one, but it isn't necessarily one.

The code that produced that output printed "reply OK {length}" if the field that would be the reply code, *IF* the packet contains the beginning of an ONC RPC reply, is 0, and "reply ERR {length}" if that field is not zero.

However, not all packets from an ONC RPC server (such as an NFS server) contain the beginning of an ONC RPC reply, as a reply (or request) can require more than one link-layer packet - for example, an NFS READ reply that returns more bytes than fit in one Ethernet packet will, when sent over Ethernet, take more than one Ethernet packet.

If the request or reply is coming over UDP, that's handled by IP fragmentation; the first packet of the reply will be the first fragment, i.e. it will have a fragment offset of 0, and can be identified by looking at the IP header. tcpdump only attempts to dissect the first fragment as anything other than IP (for one thing, subsequent fragments don't have UDP or TCP port numbers, so, unless it keeps track of the fragments, it has no idea what port number any fragment other than the first one was sent from or to), so it won't try to dissect the subsequent fragments as if they contained the beginning of an ONC RPC request or reply.
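As a rough illustration of the "first fragment" test described above, here is a minimal Python sketch that reads the flags/fragment-offset field of a raw IPv4 header. It is not tcpdump's code; the function names and the `make_header` helper are made up for this example, and it assumes a plain IPv4 header with no link-layer header in front of it.

```python
import struct

def fragment_info(ip_header: bytes):
    """Return (more_fragments, offset_in_bytes) from a raw IPv4 header."""
    # Bytes 6-7 hold 3 flag bits followed by a 13-bit fragment offset,
    # which is counted in units of 8 bytes.
    (field,) = struct.unpack_from("!H", ip_header, 6)
    more_fragments = bool(field & 0x2000)
    offset = (field & 0x1FFF) * 8
    return more_fragments, offset

def is_first_fragment(ip_header: bytes) -> bool:
    # Only the fragment at offset 0 carries the UDP header, and so the
    # port numbers and the start of the RPC message.
    return fragment_info(ip_header)[1] == 0

def make_header(more_fragments: bool, offset_bytes: int) -> bytes:
    """Hypothetical helper: a 20-byte IPv4 header with dummy addresses."""
    field = (0x2000 if more_fragments else 0) | (offset_bytes // 8)
    return struct.pack("!BBHHHBBH4s4s",
                       0x45, 0, 20, 0, field, 64, 17, 0, bytes(4), bytes(4))

print(is_first_fragment(make_header(True, 0)))
print(is_first_fragment(make_header(False, 1480)))
```

A fragment at a non-zero offset would simply be reported as an IP fragment rather than handed to the RPC code.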

If, however, it's coming over TCP, that's handled by the RPC implementation, as TCP merely provides a byte stream, and has no idea what the message boundaries are for the protocols that run atop it.

Therefore, the current tcpdump code to dissect RPC requests and replies doesn't identify TCP segments that are in the middle of an ONC RPC request or reply, and tries to dissect them as if they were at the beginning of the request or reply.

This can cause packets to be mis-dissected; the apparent RPC errors you're seeing are probably examples of that.
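To make the mis-dissection concrete, here is a small Python sketch, not the tcpdump source, of a dissector that blindly treats the first three 32-bit words of whatever it is given as the XID, message type, and reply status of an ONC RPC reply. The function name and the sample data are invented for illustration, and the sketch ignores the details of where tcpdump actually starts parsing within a TCP segment.

```python
import struct

MSG_REPLY = 1      # msg_type value for a reply (RFC 5531)
MSG_ACCEPTED = 0   # reply_stat value for an accepted reply

def naive_reply_summary(rpc_bytes: bytes) -> str:
    """Label bytes as an RPC reply without checking that they are one."""
    # A reply starts with xid, msg_type, reply_stat (big-endian, 32 bits each).
    xid, msg_type, reply_stat = struct.unpack_from("!III", rpc_bytes, 0)
    status = "reply ok" if reply_stat == MSG_ACCEPTED else "reply ERR"
    return f"{xid}: {status} {len(rpc_bytes)}"

# A segment that really does start an accepted reply.
real_reply = struct.pack("!III", 0x1234, MSG_REPLY, MSG_ACCEPTED) + bytes(100)

# A segment from the middle of a READ reply: just file data.
file_data = b"some file contents in the middle of a READ reply" * 3

print(naive_reply_summary(real_reply))
print(naive_reply_summary(file_data))
```

The second call reports an "error" and a nonsensical XID, even though nothing went wrong on the wire; the bytes it examined were never an RPC header.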

That part of the code has changed in tcpdump-3.9.7, which we installed, and now gives:

16:01:19.890794 IP nfs_server.nfs > client_46.1205729087: reply Unknown rpc response code=3128327487 1448
16:01:19.890831 IP nfs_server.nfs > client_46.3664003007: reply Unknown rpc response code=910276799 1448

It now prints the value of the field that, if the packet really *is* an RPC error, would be the error code (the code is derived from NetBSD's version of tcpdump). If the packet *isn't* an RPC error, the value at the location that's interpreted as a reply code has a very good chance of being bogus, as was the case here.

I don't understand what's happening here. Is it something wrong with our NFS
setup,

No.

or I'm misunderstanding the tcpdump output ?

Yes.

Wireshark/TShark should do a better job of this, for two reasons:

1) it does reassembly of ONC RPC requests and replies over TCP, as well as handling TCP segments with multiple requests or replies;

2) it does some heuristics to check whether a packet *is* an ONC RPC request or reply.
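For 1), the message boundaries can be recovered because ONC RPC over TCP uses record marking (RFC 5531): each fragment of a message is preceded by a 4-byte mark whose top bit flags the last fragment and whose low 31 bits give the fragment length. The following Python sketch shows the idea on an already-reassembled byte stream; it is not Wireshark's code, and `split_rpc_records` and `frame` are names made up for this example.

```python
import struct

LAST_FRAGMENT = 0x80000000  # top bit of the 4-byte record mark

def split_rpc_records(stream: bytes):
    """Split a TCP byte stream into complete RPC messages.

    Returns (messages, leftover); leftover is the tail that cannot be
    parsed until later segments supply more bytes.
    """
    messages = []
    current = b""   # fragments of the message being assembled
    pos = 0
    consumed = 0    # offset just past the last complete message
    while pos + 4 <= len(stream):
        (mark,) = struct.unpack_from("!I", stream, pos)
        length = mark & 0x7FFFFFFF
        if pos + 4 + length > len(stream):
            break   # fragment body not fully received yet
        current += stream[pos + 4 : pos + 4 + length]
        pos += 4 + length
        if mark & LAST_FRAGMENT:
            messages.append(current)
            current = b""
            consumed = pos
    return messages, stream[consumed:]

def frame(message: bytes) -> bytes:
    """Hypothetical helper: wrap a message as a single-fragment record."""
    return struct.pack("!I", LAST_FRAGMENT | len(message)) + message

# Two complete messages followed by the start of a third.
stream = frame(b"A" * 10) + frame(b"B" * 5) + frame(b"C" * 20)[:8]
messages, leftover = split_rpc_records(stream)
print(len(messages), len(leftover))
```

The hard part in a packet dissector is not this loop but keeping per-connection state so that the stream can be followed across segments, which is why it would be a larger change.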

1) would probably be a major undertaking to do in tcpdump, but 2) could be done:

for requests, check whether the RPC version field (not the NFS version) has the value 2 (neither Sun nor anybody else has released any version of ONC RPC other than 2), and check whether the program number in the request is the program number for NFS (note that Sun also runs another RPC protocol on port 2049 - a protocol to allow access to POSIX ACLs over NFS - which will be mis-dissected if interpreted as NFS);

for replies, check whether the transaction ID of the reply matches the transaction ID of a request we've already seen (tcpdump already does that matching for replies without errors, as it needs to know what type of request the reply is for).
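The two checks might look roughly like this in Python. This is only a sketch of the heuristic, not a patch for tcpdump: the function names are invented, the additional test of the message-type field is my own assumption, and it assumes the bytes passed in start where an RPC message would start (after any record mark).

```python
import struct

RPC_CALL, RPC_REPLY = 0, 1   # msg_type values (RFC 5531)
RPC_VERSION = 2              # the only ONC RPC version in use
NFS_PROGRAM = 100003         # ONC RPC program number for NFS

def looks_like_nfs_call(msg: bytes) -> bool:
    """Heuristic: do these bytes plausibly begin an NFS RPC request?"""
    if len(msg) < 16:
        return False
    # A call starts with xid, msg_type, rpcvers, prog (big-endian, 32 bits each).
    _xid, msg_type, rpcvers, prog = struct.unpack_from("!IIII", msg, 0)
    return (msg_type == RPC_CALL and rpcvers == RPC_VERSION
            and prog == NFS_PROGRAM)

def looks_like_reply(msg: bytes, pending_xids) -> bool:
    """Heuristic: is this a reply to a request we have already seen?"""
    if len(msg) < 8:
        return False
    xid, msg_type = struct.unpack_from("!II", msg, 0)
    return msg_type == RPC_REPLY and xid in pending_xids

pending = {7}  # XIDs of requests seen so far (hypothetical value)
call = struct.pack("!IIII", 7, RPC_CALL, RPC_VERSION, NFS_PROGRAM)
reply = struct.pack("!II", 7, RPC_REPLY)
print(looks_like_nfs_call(call), looks_like_reply(reply, pending))
print(looks_like_reply(b"some file contents", pending))
```

A segment that fails both checks would be printed as plain TCP data rather than as a bogus RPC error.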

I'll see whether that could be done.
-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.
