Ben Rockwood wrote:
> I wanted to add one more piece of information to this problem that may
> or not may be helpful.
>
> On an NFS client if we just do "ls" commands over and over and over we
> can snoop the wire and see TCP retransmits whenever the CPU is burned
> up. nfsstat doesn't record these retransmits, they are happening lower
> down. Here's an example packet exchange:
>
> Frame  Time   Packet            Time Delta
> 85     7.91   GETATTR Call      0.00
> 89     8.31   Retransmit of 85  0.407s
> 93     11.81  GETATTR Reply     3.497s
>
>
> When the CPU on the Thumper is "normal" transactions look like this:
>
> 22:38:16.52564 private.atlantis -> 10.71.165.6 NFS C GETATTR3 FH=0E05
> 22:38:16.52574 10.71.165.6 -> private.atlantis NFS R GETATTR3 OK
> 22:38:16.57938 private.atlantis -> 10.71.165.6 TCP D=2049 S=992
> Ack=1937283399 Seq=4211323728 Len=0 Win=49640
>
>
> When the CPU is tapped out, it looks like this:
>
> 22:37:50.55974 private.atlantis -> 10.71.165.6 NFS C GETATTR3 FH=0E05
> 22:37:50.96940 private.atlantis -> 10.71.165.6 NFS C GETATTR3 FH=0E05
> (retransmit)
> 22:37:50.96949 10.71.165.6 -> private.atlantis TCP D=992 S=2049
> Ack=4211321824 Seq=1937281311 Len=0 Win=49640
> 22:37:53.84858 10.71.165.6 -> private.atlantis NFS R GETATTR3 OK
> 22:37:53.89939 private.atlantis -> 10.71.165.6 TCP D=2049 S=992
> Ack=1937281427 Seq=4211321824 Len=0 Win=49640
>
>
>
> This finding caused us to re-evaluate the entire network scheme.
> However, I can't ignore the fact that this only happens when the CPU's
> are tapped out. Perhaps there is a correlation perhaps not. Just wanted
> to throw it on there.
>
>
> benr.

Thanks for providing more info...

Yeah, I would think it's the TCP/NFS server dropping packets, or not getting around to processing them in a timely manner, because it doesn't have enough resources (CPU). There may be network inefficiencies in your setup, but I think this is simply due to lack of CPU on the server.

eric
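
If it helps to check the correlation across a longer capture, here is a rough Python sketch that reads summary lines in the format quoted above and reports, for each exchange, the time from the first GETATTR call to the reply and how many times the call was resent. The function and variable names are made up for illustration. It assumes only one request is outstanding at a time (the one-line summaries above don't show an XID to match on) and that the capture does not cross midnight.

```python
import re
from datetime import datetime

# Matches NFS summary lines such as:
#   22:37:50.55974 private.atlantis -> 10.71.165.6 NFS C GETATTR3 FH=0E05
# "C" marks a call and "R" a reply. Bare TCP ACK lines and wrapped
# continuation lines do not match and are skipped.
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d+)\s+\S+\s+->\s+\S+\s+NFS\s+([CR])\s+\S+"
)


def summarize(lines):
    """Return a list of (seconds_from_first_call_to_reply, retransmits).

    Assumption: a single outstanding request, so every call seen before
    the next reply is treated as a resend of the same request.
    """
    results = []
    first_call = None
    retransmits = 0
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        stamp, kind = m.groups()
        t = datetime.strptime(stamp, "%H:%M:%S.%f")
        if kind == "C":
            if first_call is None:
                first_call = t
            else:
                retransmits += 1
        elif first_call is not None:
            results.append(((t - first_call).total_seconds(), retransmits))
            first_call, retransmits = None, 0
    return results


# The two traces from the quoted message.
NORMAL = [
    "22:38:16.52564 private.atlantis -> 10.71.165.6 NFS C GETATTR3 FH=0E05",
    "22:38:16.52574 10.71.165.6 -> private.atlantis NFS R GETATTR3 OK",
    "22:38:16.57938 private.atlantis -> 10.71.165.6 TCP D=2049 S=992",
    "Ack=1937283399 Seq=4211323728 Len=0 Win=49640",
]
BUSY = [
    "22:37:50.55974 private.atlantis -> 10.71.165.6 NFS C GETATTR3 FH=0E05",
    "22:37:50.96940 private.atlantis -> 10.71.165.6 NFS C GETATTR3 FH=0E05",
    "(retransmit)",
    "22:37:50.96949 10.71.165.6 -> private.atlantis TCP D=992 S=2049",
    "Ack=4211321824 Seq=1937281311 Len=0 Win=49640",
    "22:37:53.84858 10.71.165.6 -> private.atlantis NFS R GETATTR3 OK",
    "22:37:53.89939 private.atlantis -> 10.71.165.6 TCP D=2049 S=992",
    "Ack=1937281427 Seq=4211321824 Len=0 Win=49640",
]

if __name__ == "__main__":
    for name, trace in (("normal", NORMAL), ("busy", BUSY)):
        for latency, resent in summarize(trace):
            print(f"{name}: {latency:.5f}s to reply, {resent} retransmit(s)")
```

On the two traces above, the normal exchange comes out at about 0.0001s with no retransmits, and the busy one at about 3.29s with one retransmit. Logging those numbers next to server CPU utilization over time would show whether the retransmits really do track CPU load.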