Ok. I got a matching snoop (from the server) and tcpdump (from the Linux client), and the replies seen leaving the server are definitely not seen by the client:

From the server:

releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C ACCESS3 FH=6C39 (read,lookup,modify,extend,delete)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=1184956508 Seq=1590845549 Len=0 Win=53688 Options=<nop,nop,tstamp 631479 2639688273>
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R ACCESS3 OK (read,lookup)
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 S=800 Ack=1590845673 Seq=1184956508 Len=0 Win=512 Options=<nop,nop,tstamp 2639688274 631479>
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=1184956668 Seq=1590847685 Len=0 Win=53688 Options=<nop,nop,tstamp 631500 2639688274>
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=1184956828 Seq=1590847685 Len=0 Win=53688 Options=<nop,nop,tstamp 637478 2639748274>
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 S=800 Fin Ack=1590845673 Seq=1184956828 Len=0 Win=512 Options=<nop,nop,tstamp 2640063353 631479>
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=1184956829 Seq=1590849697 Len=0 Win=53688 Options=<nop,nop,tstamp 668980 2640063353>
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Fin Ack=1184956829 Seq=1590849697 Len=0 Win=53688 Options=<nop,nop,tstamp 668980 2640063353>
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 S=800 Rst Seq=1184956829 Len=0 Win=0

From the client:

15:09:24.165259 IP releng1.862107800 > Galileo.RelEng.Egenera.COM.nfs: 140 access [|nfs]
15:09:24.165611 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: . ack 277 win 53688 <nop,nop,timestamp 631479 2639688273>
15:09:24.165855 IP Galileo.RelEng.Egenera.COM.nfs > releng1.862107800: reply ok 124 access [|nfs]
15:09:24.165867 IP releng1.800 > Galileo.RelEng.Egenera.COM.nfs: . ack 293 win 512 <nop,nop,timestamp 2639688274 631479>
15:09:24.166069 IP releng1.878885016 > Galileo.RelEng.Egenera.COM.nfs: 160 readdirplus [|nfs]
15:09:24.379488 IP releng1.878885016 > Galileo.RelEng.Egenera.COM.nfs: 160 readdirplus [|nfs]
15:09:24.379772 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: . ack 437 win 53688 <nop,nop,timestamp 631500 2639688274>
15:10:24.156981 IP releng1.878885016 > Galileo.RelEng.Egenera.COM.nfs: 160 readdirplus [|nfs]
15:10:24.157285 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: . ack 597 win 53688 <nop,nop,timestamp 637478 2639748274>
15:15:39.184608 IP releng1.800 > Galileo.RelEng.Egenera.COM.nfs: F 597:597(0) ack 293 win 512 <nop,nop,timestamp 2640063353 631479>
15:15:39.185420 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: . ack 598 win 53688 <nop,nop,timestamp 668980 2640063353>
15:15:39.185548 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: F 4317:4317(0) ack 598 win 53688 <nop,nop,timestamp 668980 2640063353>
15:15:39.185572 IP releng1.800 > Galileo.RelEng.Egenera.COM.nfs: R 1184956829:1184956829(0) win 0

I'll have to try to set up a port mirror on the switch I'm using. I don't know if that switch can mirror the LACP group though - I'll have to disable it.

Any other suggestions?

 -Kyle

Kyle McDonald wrote:
> I'm having a similar problem from a Linux client to a sNV_b103 server.
>
> For me though the mount works fine, it's the NFS accesses that hang.
>
> Here's a snoop that shows what the server is seeing:
>
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C GETATTR3 FH=6C39
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659557530 Seq=471989561 Len=0 Win=53688 Options=<nop,nop,tstamp 181189 2635184551>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R GETATTR3 OK
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 S=800 Ack=471989677 Seq=659557530 Len=0 Win=512 Options=<nop,nop,tstamp 2635184551 181189>
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C ACCESS3 FH=6C39 (read,lookup,modify,extend,delete)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R ACCESS3 OK (read,lookup)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659557830 Seq=471991813 Len=0 Win=53688 Options=<nop,nop,tstamp 181209 2635184552>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659557990 Seq=471991813 Len=0 Win=53688 Options=<nop,nop,tstamp 187188 2635244552>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C FSSTAT3 FH=6C39
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R FSSTAT3 OK
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 S=800 Ack=471989801 Seq=659558126 Len=0 Win=512 Options=<nop,nop,tstamp 2635279692 181189,nop,nop,sack 471993825-471993997>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659558286 Seq=471996009 Len=0 Win=53688 Options=<nop,nop,tstamp 199208 2635364552>
>
> This (to my uneducated eye) shows the server replying multiple times,
> and the client retransmitting the READDIRPLUS3 multiple times.
>
> I'm not familiar enough with Linux (yet) to run the equivalent of
> snoop (what is it? ethereal?) or I'd include traces from the client also.
>
> The client and server are both IBM x346 eServers. Like the link
> below, both have BMC (like an LOM) modules to manage the machine. Also
> like the link below, these modules share the Ethernet port with one of
> the Broadcom (not Intel) Ethernet interfaces built into the
> motherboard. However, in this case:
>
> 1) Neither the server nor the client is using the shared Broadcom
> interface on the motherboard.
> 2) The client is using the other Broadcom interface on the MB.
> 3) The server is using an LACP aggr group (set up with dladm [with
> mtu=9000]) built up from 4 Intel e1000g interfaces on a PCI card.
>
> So if packets are being lost on the return trip from the server to the
> client, I don't think it's for the same reason, though it may be
> similar.
> Note this is on a ZFS filesystem, but from the traces above I'm not
> inclined to think that has anything to do with the problem.
>
> The server does have other Ethernet interfaces on other subnets.
> However, the testing above was careful to do the NFS mount with only
> the IP of the one interface, that one interface is also the one used
> for the default route, and snoop was running on the others and showed
> zero incoming or outgoing traffic (to or from the client's IP) during
> this same period.
>
> Anyone got any ideas?
>
> -Kyle
>
> Jorgen Lundman wrote:
>>
>> I stumbled across this entry:
>>
>> http://blogs.sun.com/shepler/entry/port_623_or_the_mount
>>
>> We do not see this issue with port 623, but rather with port 664.
>> But sure enough, it was sending SYN/ACK, then timeout until RST.
>>
>> I waited for the port to be released, told inetd to listen on port
>> 664 and voila, mount works fine again.
>>
>> We use Supermicros with Intel 82573V and 82573L.
>>
>> I would send Shepler my thanks but comments are disabled.
>>
>> Useless logs:
>>
>> # mount -o proto=tcp,vers=3 172.20.12.226:/export/src /mnt
>>
>> 172.20.12.6 -> 172.20.12.226 PORTMAP C GETPORT prog=100005 (MOUNT) vers=3 proto=UDP
>> 172.20.12.226 -> 172.20.12.6 PORTMAP R GETPORT port=39049
>> 172.20.12.6 -> 172.20.12.226 MOUNT3 C Null
>> 172.20.12.226 -> 172.20.12.6 MOUNT3 R Null
>> 172.20.12.6 -> 172.20.12.226 MOUNT3 C Mount /export/src
>> 172.20.12.226 -> 172.20.12.6 MOUNT3 R Mount OK FH=076E Auth=unix
>> 172.20.12.6 -> 172.20.12.226 PORTMAP C GETPORT prog=100003 (NFS) vers=3 proto=TCP
>> 172.20.12.226 -> 172.20.12.6 PORTMAP R GETPORT port=2049
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Syn Seq=592414549 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>> 172.20.12.226 -> 172.20.12.6 TCP D=38337 S=2049 Syn Ack=592414550 Seq=2210245643 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Ack=2210245644 Seq=592414550 Len=0 Win=49640
>> 172.20.12.6 -> 172.20.12.226 NFS C NULL3
>> 172.20.12.226 -> 172.20.12.6 TCP D=38337 S=2049 Ack=592414670 Seq=2210245644 Len=0 Win=49520
>> 172.20.12.226 -> 172.20.12.6 NFS R NULL3
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Ack=2210245672 Seq=592414670 Len=0 Win=49640
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Fin Ack=2210245672 Seq=592414670 Len=0 Win=49640
>> 172.20.12.226 -> 172.20.12.6 TCP D=38337 S=2049 Ack=592414671 Seq=2210245672 Len=0 Win=49640
>> 172.20.12.226 -> 172.20.12.6 TCP D=38337 S=2049 Fin Ack=592414671 Seq=2210245672 Len=0 Win=49640
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Ack=2210245673 Seq=592414671 Len=0 Win=49640
>> 172.20.12.6 -> 172.20.12.226 PORTMAP C GETPORT prog=100003 (NFS) vers=3 proto=TCP
>> 172.20.12.226 -> 172.20.12.6 PORTMAP R GETPORT port=2049
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Syn Seq=3614232918 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>> 172.20.12.226 -> 172.20.12.6 TCP D=38338 S=2049 Syn Ack=3614232919 Seq=2210460804 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Ack=2210460805 Seq=3614232919 Len=0 Win=49640
>> 172.20.12.6 -> 172.20.12.226 NFS C NULL3
>> 172.20.12.226 -> 172.20.12.6 TCP D=38338 S=2049 Ack=3614233039 Seq=2210460805 Len=0 Win=49520
>> 172.20.12.226 -> 172.20.12.6 NFS R NULL3
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Ack=2210460833 Seq=3614233039 Len=0 Win=49640
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Fin Ack=2210460833 Seq=3614233039 Len=0 Win=49640
>> 172.20.12.226 -> 172.20.12.6 TCP D=38338 S=2049 Ack=3614233040 Seq=2210460833 Len=0 Win=49640
>> 172.20.12.226 -> 172.20.12.6 TCP D=38338 S=2049 Fin Ack=3614233040 Seq=2210460833 Len=0 Win=49640
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Ack=2210460834 Seq=3614233040 Len=0 Win=49640
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=664 Rst Ack=0 Seq=3456416233 Len=0 Win=49640
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=664 Syn Seq=3507413975 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=664 Syn Seq=3507413975 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>
>> # netstat
>> 172.20.12.6.664 172.20.12.226.2049 0 0 49640 0 SYN_SENT
>>
>> After inetd hack:
>>
>> # netstat
>> *.664 *.* 0 0 49152 0 BOUND
>>
>> # mount -o proto=tcp,vers=3 172.20.12.226:/export/src /mnt
>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=661 Syn Seq=1448210229 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>
>> # df
>> 172.20.12.226:/export/src
>>                        24T   11G   24T   1%   /mnt
>>
>> Jorgen Lundman wrote:
>>>
>>> Ok, it still happens even when not using aliases, it just took
>>> longer to turn up.
>>>
>>> Attempting to mount (snoop running on NFS client)
>>
>
> _______________________________________________
> nfs-discuss mailing list
> nfs-discuss at opensolaris.org
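
As a cross-check on the server snoop and client tcpdump at the top of this message, the sequence numbers in the two captures can be compared directly. The sketch below is a small Python calculation and is not part of the original thread. The variable and function names are made up for illustration, and every number is copied from the two traces. It works out how much server data the client never acknowledged.

```python
# All constants are copied from the snoop (server) and tcpdump (client)
# output earlier in this message; the names are illustrative only.

# Client's ACK after the ACCESS3 reply; it never acks anything later.
LAST_ACK_FROM_CLIENT = 1590845673
# Server's Seq on its pure ACK of the first READDIRPLUS3 retransmit.
SERVER_SEQ_AT_FIRST_RETRANSMIT_ACK = 1590847685
# Server's Seq when it finally sends its FIN.
SERVER_SEQ_AT_FIN = 1590849697

# Relative numbers printed by tcpdump on the client.
TCPDUMP_SERVER_FIN_SEQ = 4317
TCPDUMP_CLIENT_ACK = 293


def seq_delta(later: int, earlier: int) -> int:
    """Bytes between two TCP sequence numbers, allowing 32-bit wraparound."""
    return (later - earlier) % 2**32


# Data the server sent that the client never acknowledged (snoop view).
unacked_snoop = seq_delta(SERVER_SEQ_AT_FIN, LAST_ACK_FROM_CLIENT)
# The same gap from the client's relative sequence numbers (tcpdump view).
unacked_tcpdump = TCPDUMP_SERVER_FIN_SEQ - TCPDUMP_CLIENT_ACK
# How far the server had advanced by the first retransmit.
sent_by_first_retransmit = seq_delta(
    SERVER_SEQ_AT_FIRST_RETRANSMIT_ACK, LAST_ACK_FROM_CLIENT
)

print("unacked bytes (snoop):  ", unacked_snoop)
print("unacked bytes (tcpdump):", unacked_tcpdump)
print("sent by first retransmit:", sent_by_first_retransmit)
```

Both captures give the same 4024-byte gap, and the server had already advanced 2012 bytes past the client's last ACK by the time it acknowledged the first retransmit. Whether that size is significant (for example, relative to the mtu=9000 setting on the aggregation) is not something these numbers establish by themselves.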