Sounds like it is possibly different, but in the end, once I knew what to
look for, we could test for it quite easily:

# nmap -p 512-1023 172.20.12.5
623/tcp filtered unknown
664/tcp filtered unknown

(The IP in this case is the NFS client, but I ran nmap against both to
look for other stolen ports.)

We then did the inetd hack to occupy ports 623/664, and no hung NFS
since then (knock on wood).

Kyle McDonald wrote:
>
> Ok. I got a matching snoop (from the server) and tcpdump (from the linux
> client) and the replies seen leaving the server are definitely not seen
> by the client:
>
> From the server:
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C ACCESS3 FH=6C39 (read,lookup,modify,extend,delete)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=1184956508 Seq=1590845549 Len=0 Win=53688 Options=<nop,nop,tstamp 631479 2639688273>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R ACCESS3 OK (read,lookup)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 S=800 Ack=1590845673 Seq=1184956508 Len=0 Win=512 Options=<nop,nop,tstamp 2639688274 631479>
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=1184956668 Seq=1590847685 Len=0 Win=53688 Options=<nop,nop,tstamp 631500 2639688274>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=1184956828 Seq=1590847685 Len=0 Win=53688 Options=<nop,nop,tstamp 637478 2639748274>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 S=800 Fin Ack=1590845673 Seq=1184956828 Len=0 Win=512 Options=<nop,nop,tstamp 2640063353 631479>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=1184956829 Seq=1590849697 Len=0 Win=53688 Options=<nop,nop,tstamp 668980 2640063353>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Fin Ack=1184956829 Seq=1590849697 Len=0 Win=53688 Options=<nop,nop,tstamp 668980 2640063353>
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 S=800 Rst Seq=1184956829 Len=0 Win=0
>
> From the client:
> 15:09:24.165259 IP releng1.862107800 > Galileo.RelEng.Egenera.COM.nfs: 140 access [|nfs]
> 15:09:24.165611 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: . ack 277 win 53688 <nop,nop,timestamp 631479 2639688273>
> 15:09:24.165855 IP Galileo.RelEng.Egenera.COM.nfs > releng1.862107800: reply ok 124 access [|nfs]
> 15:09:24.165867 IP releng1.800 > Galileo.RelEng.Egenera.COM.nfs: . ack 293 win 512 <nop,nop,timestamp 2639688274 631479>
> 15:09:24.166069 IP releng1.878885016 > Galileo.RelEng.Egenera.COM.nfs: 160 readdirplus [|nfs]
> 15:09:24.379488 IP releng1.878885016 > Galileo.RelEng.Egenera.COM.nfs: 160 readdirplus [|nfs]
> 15:09:24.379772 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: . ack 437 win 53688 <nop,nop,timestamp 631500 2639688274>
> 15:10:24.156981 IP releng1.878885016 > Galileo.RelEng.Egenera.COM.nfs: 160 readdirplus [|nfs]
> 15:10:24.157285 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: . ack 597 win 53688 <nop,nop,timestamp 637478 2639748274>
> 15:15:39.184608 IP releng1.800 > Galileo.RelEng.Egenera.COM.nfs: F 597:597(0) ack 293 win 512 <nop,nop,timestamp 2640063353 631479>
> 15:15:39.185420 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: . ack 598 win 53688 <nop,nop,timestamp 668980 2640063353>
> 15:15:39.185548 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: F 4317:4317(0) ack 598 win 53688 <nop,nop,timestamp 668980 2640063353>
> 15:15:39.185572 IP releng1.800 > Galileo.RelEng.Egenera.COM.nfs: R 1184956829:1184956829(0) win 0
>
> I'll have to try to set up a port mirror on the switch I'm using. I don't
> know if that switch can mirror the LACP group though - I'll have to
> disable it.
>
> Any other suggestions?
>
> -Kyle
>
>
> Kyle McDonald wrote:
>> I'm having a similar problem from a linux client to a sNV_b103 server.
>>
>> For me though the mount works fine, it's the NFS accesses that hang.
>>
>> Here's a snoop that shows what the server is seeing:
>>
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C GETATTR3 FH=6C39
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659557530 Seq=471989561 Len=0 Win=53688 Options=<nop,nop,tstamp 181189 2635184551>
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R GETATTR3 OK
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 S=800 Ack=471989677 Seq=659557530 Len=0 Win=512 Options=<nop,nop,tstamp 2635184551 181189>
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C ACCESS3 FH=6C39 (read,lookup,modify,extend,delete)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R ACCESS3 OK (read,lookup)
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659557830 Seq=471991813 Len=0 Win=53688 Options=<nop,nop,tstamp 181209 2635184552>
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659557990 Seq=471991813 Len=0 Win=53688 Options=<nop,nop,tstamp 187188 2635244552>
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C FSSTAT3 FH=6C39
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R FSSTAT3 OK
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 S=800 Ack=471989801 Seq=659558126 Len=0 Win=512 Options=<nop,nop,tstamp 2635279692 181189,nop,nop,sack 471993825-471993997>
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659558286 Seq=471996009 Len=0 Win=53688 Options=<nop,nop,tstamp 199208 2635364552>
>>
>> This (to my uneducated eye) shows the server replying multiple times,
>> and the client retransmitting the READDIRPLUS3 multiple times.
>>
>> I'm not familiar enough with Linux (yet) to run the equivalent of
>> snoop (what is it? ethereal?) or I'd include traces from the client also.
>>
>> The client (and server) are both IBM x346 eServers. Like the link
>> below, both have BMC (like an LOM) modules to manage the machine. Also
>> like the link below, these modules share the ethernet port with one of
>> the Broadcom (not Intel) ethernet interfaces built into the
>> motherboard. However in this case:
>>
>> 1) Neither the server nor the client is using the shared Broadcom
>> interface on the motherboard.
>> 2) The client is using the other Broadcom interface on the MB.
>> 3) The server is using an LACP aggr group (set up with dladm [with
>> mtu=9000]) built up from 4 Intel e1000g interfaces on a PCI card.
>>
>> So if packets are being lost on the return trip from the server to the
>> client, I don't think it's for the same reason, though it may be
>> similar.
>> Note this is on a ZFS filesystem, but from the traces above I'm not
>> inclined to think that has anything to do with the problem.
>>
>> The server does have other ethernet interfaces on other subnets.
>> However, the testing above was careful to do the NFS mount with only
>> the IP of the one interface, that one interface is also the one used
>> for the default route, and snoop was running on the others and showed
>> zero incoming or outgoing traffic (to or from the client's IP) during
>> this same period.
>>
>> Anyone got any ideas?
>>
>> -Kyle
>>
>>
>> Jorgen Lundman wrote:
>>>
>>> I stumbled across this entry:
>>>
>>> http://blogs.sun.com/shepler/entry/port_623_or_the_mount
>>>
>>> and although we see this issue not with port 623 but with 664, sure
>>> enough, it was sending SYN/ACK, then timing out until RST.
>>>
>>> I waited for the port to be released, told inetd to listen on port
>>> 664 and voila, mount works fine again.
>>>
>>> We use Supermicros with Intel 82573V and 82573L.
>>>
>>> I would send Shepler my thanks but comments are disabled.
>>>
>>>
>>> Useless logs:
>>>
>>>
>>> # mount -o proto=tcp,vers=3 172.20.12.226:/export/src /mnt
>>>
>>> 172.20.12.6 -> 172.20.12.226 PORTMAP C GETPORT prog=100005 (MOUNT) vers=3 proto=UDP
>>> 172.20.12.226 -> 172.20.12.6 PORTMAP R GETPORT port=39049
>>> 172.20.12.6 -> 172.20.12.226 MOUNT3 C Null
>>> 172.20.12.226 -> 172.20.12.6 MOUNT3 R Null
>>> 172.20.12.6 -> 172.20.12.226 MOUNT3 C Mount /export/src
>>> 172.20.12.226 -> 172.20.12.6 MOUNT3 R Mount OK FH=076E Auth=unix
>>> 172.20.12.6 -> 172.20.12.226 PORTMAP C GETPORT prog=100003 (NFS) vers=3 proto=TCP
>>> 172.20.12.226 -> 172.20.12.6 PORTMAP R GETPORT port=2049
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Syn Seq=592414549 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>> 172.20.12.226 -> 172.20.12.6 TCP D=38337 S=2049 Syn Ack=592414550 Seq=2210245643 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Ack=2210245644 Seq=592414550 Len=0 Win=49640
>>> 172.20.12.6 -> 172.20.12.226 NFS C NULL3
>>> 172.20.12.226 -> 172.20.12.6 TCP D=38337 S=2049 Ack=592414670 Seq=2210245644 Len=0 Win=49520
>>> 172.20.12.226 -> 172.20.12.6 NFS R NULL3
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Ack=2210245672 Seq=592414670 Len=0 Win=49640
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Fin Ack=2210245672 Seq=592414670 Len=0 Win=49640
>>> 172.20.12.226 -> 172.20.12.6 TCP D=38337 S=2049 Ack=592414671 Seq=2210245672 Len=0 Win=49640
>>> 172.20.12.226 -> 172.20.12.6 TCP D=38337 S=2049 Fin Ack=592414671 Seq=2210245672 Len=0 Win=49640
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Ack=2210245673 Seq=592414671 Len=0 Win=49640
>>> 172.20.12.6 -> 172.20.12.226 PORTMAP C GETPORT prog=100003 (NFS) vers=3 proto=TCP
>>> 172.20.12.226 -> 172.20.12.6 PORTMAP R GETPORT port=2049
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Syn Seq=3614232918 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>> 172.20.12.226 -> 172.20.12.6 TCP D=38338 S=2049 Syn Ack=3614232919 Seq=2210460804 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Ack=2210460805 Seq=3614232919 Len=0 Win=49640
>>> 172.20.12.6 -> 172.20.12.226 NFS C NULL3
>>> 172.20.12.226 -> 172.20.12.6 TCP D=38338 S=2049 Ack=3614233039 Seq=2210460805 Len=0 Win=49520
>>> 172.20.12.226 -> 172.20.12.6 NFS R NULL3
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Ack=2210460833 Seq=3614233039 Len=0 Win=49640
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Fin Ack=2210460833 Seq=3614233039 Len=0 Win=49640
>>> 172.20.12.226 -> 172.20.12.6 TCP D=38338 S=2049 Ack=3614233040 Seq=2210460833 Len=0 Win=49640
>>> 172.20.12.226 -> 172.20.12.6 TCP D=38338 S=2049 Fin Ack=3614233040 Seq=2210460833 Len=0 Win=49640
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Ack=2210460834 Seq=3614233040 Len=0 Win=49640
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=664 Rst Ack=0 Seq=3456416233 Len=0 Win=49640
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=664 Syn Seq=3507413975 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>>
>>>
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=664 Syn Seq=3507413975 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>>
>>>
>>> # netstat
>>> 172.20.12.6.664 172.20.12.226.2049 0 0 49640 0 SYN_SENT
>>>
>>>
>>> After inetd hack:
>>>
>>> # netstat
>>> *.664 *.* 0 0 49152 0 BOUND
>>>
>>> # mount -o proto=tcp,vers=3 172.20.12.226:/export/src /mnt
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=661 Syn Seq=1448210229 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>>
>>> # df
>>> 172.20.12.226:/export/src
>>>                        24T 11G 24T 1% /mnt
>>>
>>>
>>>
>>> Jorgen Lundman wrote:
>>>>
>>>> Ok, it still happens even when not using aliases, it just took
>>>> longer to turn up.
>>>>
>>>> Attempting to mount (snoop running on NFS client)
>>>
>>
>> _______________________________________________
>> nfs-discuss mailing list
>> nfs-discuss at opensolaris.org
>

-- 
Jorgen Lundman       | <lundman at lundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500 (cell)
Japan                | +81 (0)3 -3375-1767 (home)
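
A minimal Python sketch of the check and workaround described at the top of this message. It is not what was actually run in this thread (that was nmap by hand plus an inetd entry). The names filtered_ports and hold_ports are made up for illustration. It assumes nmap's normal output, where each port line looks like "623/tcp filtered unknown", and it assumes that merely binding a socket to a port (without listening) is enough to keep the NFS client from picking it as a reserved source port, which matches the BOUND state in the netstat output above but has not been verified beyond that. Binding ports below 1024 normally requires root, and only TCP is handled.

```python
import re
import socket

# Matches nmap normal-output port lines such as "623/tcp filtered unknown".
PORT_LINE = re.compile(r"^(\d+)/(tcp|udp)\s+(\S+)")


def filtered_ports(nmap_output):
    """Return the sorted port numbers that nmap reported as 'filtered'."""
    ports = set()
    for line in nmap_output.splitlines():
        match = PORT_LINE.match(line.strip())
        if match and match.group(3) == "filtered":
            ports.add(int(match.group(1)))
    return sorted(ports)


def hold_ports(ports, host=""):
    """Bind one TCP socket to each port and return the open sockets.

    The ports stay occupied for as long as the returned sockets are kept
    open, so the caller must hold on to them (hypothetical stand-in for
    the inetd entry; assumption: a bound, non-listening socket suffices).
    """
    held = []
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            # Clean up everything opened so far before reporting the failure.
            sock.close()
            for other in held:
                other.close()
            raise
        held.append(sock)
    return held
```

One way to use it would be to feed the saved output of "nmap -p 512-1023 <host>" to filtered_ports, pass the result to hold_ports as root, and keep that process running; the ports are released as soon as the process exits.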