I'm having a similar problem from a Linux client to a sNV_b103 server. For me, though, the mount works fine; it's the NFS accesses that hang.

Here's a snoop that shows what the server is seeing:

releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C GETATTR3 FH=6C39
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659557530 Seq=471989561 Len=0 Win=53688 Options=<nop,nop,tstamp 181189 2635184551>
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R GETATTR3 OK
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 S=800 Ack=471989677 Seq=659557530 Len=0 Win=512 Options=<nop,nop,tstamp 2635184551 181189>
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C ACCESS3 FH=6C39 (read,lookup,modify,extend,delete)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R ACCESS3 OK (read,lookup)
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659557830 Seq=471991813 Len=0 Win=53688 Options=<nop,nop,tstamp 181209 2635184552>
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659557990 Seq=471991813 Len=0 Win=53688 Options=<nop,nop,tstamp 187188 2635244552>
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C FSSTAT3 FH=6C39
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R FSSTAT3 OK
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 S=800 Ack=471989801 Seq=659558126 Len=0 Win=512 Options=<nop,nop,tstamp 2635279692 181189,nop,nop,sack 471993825-471993997>
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)
releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659558286 Seq=471996009 Len=0 Win=53688 Options=<nop,nop,tstamp 199208 2635364552>

This (to my uneducated eye) shows the server replying multiple times, and the client retransmitting the READDIRPLUS3 multiple times. I'm not familiar enough with Linux (yet) to run the equivalent of snoop (what is it? ethereal?) or I'd include traces from the client also.

The client and server are both IBM x346 eServers. Like the machines in the link below, both have BMC (like an LOM) modules to manage the machine. Also like the link below, these modules share the Ethernet port with one of the Broadcom (not Intel) Ethernet interfaces built into the motherboard. However, in this case:

1) Neither the server nor the client is using the shared Broadcom interface on the motherboard.
2) The client is using the other Broadcom interface on the motherboard.
3) The server is using an LACP aggr group (set up with dladm [with mtu=9000]) built from 4 Intel e1000g interfaces on a PCI card.

So if packets are being lost on the return trip from the server to the client, I don't think it's for the same reason, though it may be similar.
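
As a rough way to put numbers on the server-side trace above, here is a minimal Python sketch (not something I have run against these machines). It tallies NFS calls, retransmitted calls, and replies per procedure from snoop summary lines, and flags any TCP port that matches the two ports named in the quoted message and the linked blog post (623 and 664). The names `summarize`, `BMC_PORTS`, and `SAMPLE`, the regular expressions, and the assumption of one packet per line are all illustrative, not part of snoop itself.

```python
import re
from collections import Counter

# Ports reported as being swallowed by a management controller:
# 623 in the linked blog post, 664 in the quoted message (assumption:
# these are the only two ports worth flagging).
BMC_PORTS = {623, 664}

# Matches snoop summary lines such as
#   "hostA -> hostB NFS C READDIRPLUS3 FH=6C39 ... (retransmit)"
NFS_LINE = re.compile(r"^(\S+) -> (\S+)\s+NFS ([CR]) (\S+)(.*)$")
# Matches the "D=<dst port> S=<src port>" pair on TCP summary lines.
TCP_PORTS = re.compile(r"\bD=(\d+) S=(\d+)\b")


def summarize(lines):
    """Return (calls, retransmits, replies, suspect_ports) for a trace."""
    calls, retransmits, replies = Counter(), Counter(), Counter()
    suspect_ports = set()
    for line in lines:
        line = line.strip()
        m = NFS_LINE.match(line)
        if m:
            direction, proc, rest = m.group(3), m.group(4), m.group(5)
            if direction == "C":
                calls[proc] += 1
                if "(retransmit)" in rest:
                    retransmits[proc] += 1
            else:
                replies[proc] += 1
            continue
        p = TCP_PORTS.search(line)
        if p:
            ports = {int(p.group(1)), int(p.group(2))}
            suspect_ports |= ports & BMC_PORTS
    return calls, retransmits, replies, suspect_ports


# A few lines copied from the trace above, one packet per line.
SAMPLE = [
    "releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096",
    "Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R READDIRPLUS3 OK 12 entries (No more)",
    "releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)",
    "Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 S=2049 Ack=659557830 Seq=471991813 Len=0 Win=53688",
]

calls, retransmits, replies, suspect_ports = summarize(SAMPLE)
print(dict(calls), dict(retransmits), dict(replies), suspect_ports)
```

A reply count well above the call count for one procedure, together with an empty set of flagged ports, would fit what I describe above: the server answers repeatedly, and the connection is not on one of the ports from the blog post.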

Note this is on a ZFS filesystem, but from the traces above I'm not inclined to think that has anything to do with the problem.

The server does have other Ethernet interfaces on other subnets. However, the testing above was careful to do the NFS mount with only the IP of the one interface. That interface is also the one used for the default route, and snoop was running on the others and showed zero incoming or outgoing traffic (to or from the client's IP) during this same period.

Anyone got any ideas?

-Kyle

Jorgen Lundman wrote:
>
> I stumbled across this entry:
>
> http://blogs.sun.com/shepler/entry/port_623_or_the_mount
>
> and even though we do not see this issue with port 623, but rather
> 664. But sure enough, it was sending SYN/ACK, then timeout until RST.
>
> I waited for the port to be released, told inetd to listen on port
> 664 and voila, mount works fine again.
>
> We use Supermicros with Intel 82573V and 82573L.
>
> I would send Shepler my thanks but comments are disabled.
>
> Useless logs:
>
> # mount -o proto=tcp,vers=3 172.20.12.226:/export/src /mnt
>
> 172.20.12.6 -> 172.20.12.226 PORTMAP C GETPORT prog=100005 (MOUNT) vers=3 proto=UDP
> 172.20.12.226 -> 172.20.12.6 PORTMAP R GETPORT port=39049
> 172.20.12.6 -> 172.20.12.226 MOUNT3 C Null
> 172.20.12.226 -> 172.20.12.6 MOUNT3 R Null
> 172.20.12.6 -> 172.20.12.226 MOUNT3 C Mount /export/src
> 172.20.12.226 -> 172.20.12.6 MOUNT3 R Mount OK FH=076E Auth=unix
> 172.20.12.6 -> 172.20.12.226 PORTMAP C GETPORT prog=100003 (NFS) vers=3 proto=TCP
> 172.20.12.226 -> 172.20.12.6 PORTMAP R GETPORT port=2049
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Syn Seq=592414549 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
> 172.20.12.226 -> 172.20.12.6 TCP D=38337 S=2049 Syn Ack=592414550 Seq=2210245643 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Ack=2210245644 Seq=592414550 Len=0 Win=49640
> 172.20.12.6 -> 172.20.12.226 NFS C NULL3
> 172.20.12.226 -> 172.20.12.6 TCP D=38337 S=2049 Ack=592414670 Seq=2210245644 Len=0 Win=49520
> 172.20.12.226 -> 172.20.12.6 NFS R NULL3
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Ack=2210245672 Seq=592414670 Len=0 Win=49640
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Fin Ack=2210245672 Seq=592414670 Len=0 Win=49640
> 172.20.12.226 -> 172.20.12.6 TCP D=38337 S=2049 Ack=592414671 Seq=2210245672 Len=0 Win=49640
> 172.20.12.226 -> 172.20.12.6 TCP D=38337 S=2049 Fin Ack=592414671 Seq=2210245672 Len=0 Win=49640
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Ack=2210245673 Seq=592414671 Len=0 Win=49640
> 172.20.12.6 -> 172.20.12.226 PORTMAP C GETPORT prog=100003 (NFS) vers=3 proto=TCP
> 172.20.12.226 -> 172.20.12.6 PORTMAP R GETPORT port=2049
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Syn Seq=3614232918 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
> 172.20.12.226 -> 172.20.12.6 TCP D=38338 S=2049 Syn Ack=3614232919 Seq=2210460804 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Ack=2210460805 Seq=3614232919 Len=0 Win=49640
> 172.20.12.6 -> 172.20.12.226 NFS C NULL3
> 172.20.12.226 -> 172.20.12.6 TCP D=38338 S=2049 Ack=3614233039 Seq=2210460805 Len=0 Win=49520
> 172.20.12.226 -> 172.20.12.6 NFS R NULL3
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Ack=2210460833 Seq=3614233039 Len=0 Win=49640
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Fin Ack=2210460833 Seq=3614233039 Len=0 Win=49640
> 172.20.12.226 -> 172.20.12.6 TCP D=38338 S=2049 Ack=3614233040 Seq=2210460833 Len=0 Win=49640
> 172.20.12.226 -> 172.20.12.6 TCP D=38338 S=2049 Fin Ack=3614233040 Seq=2210460833 Len=0 Win=49640
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Ack=2210460834 Seq=3614233040 Len=0 Win=49640
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=664 Rst Ack=0 Seq=3456416233 Len=0 Win=49640
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=664 Syn Seq=3507413975 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=664 Syn Seq=3507413975 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>
> # netstat
> 172.20.12.6.664 172.20.12.226.2049 0 0 49640 0 SYN_SENT
>
> After inetd hack:
>
> # netstat
> *.664 *.* 0 0 49152 0 BOUND
>
> # mount -o proto=tcp,vers=3 172.20.12.226:/export/src /mnt
> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=661 Syn Seq=1448210229 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>
> # df
> 172.20.12.226:/export/src
> 24T 11G 24T 1% /mnt
>
> Jorgen Lundman wrote:
>>
>> Ok, it still happens even when not using aliases, it just took longer
>> to turn up.
>>
>> Attempting to mount (snoop running on NFS client)
>
