Sounds like it is possibly a different problem, but in the end, once I 
knew what to look for, we could test for it quite easily:

# nmap -p 512-1023 172.20.12.5
623/tcp filtered unknown
664/tcp filtered unknown

(The IP in this case is the NFS client, but I ran nmap against both to 
look for other stolen ports.)
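
For anyone without nmap handy, roughly the same check can be sketched in 
Python. This is only an approximation of what nmap reports: it does a 
plain TCP connect and treats "no answer at all" as filtered. The function 
names and the 1 second timeout are my own choices, nothing official.

```python
import socket


def classify_port(host, port, timeout=1.0):
    """Return 'open', 'closed' or 'filtered' for one TCP port.

    A timeout (no SYN/ACK and no RST) is reported as 'filtered', which is
    how a port stolen by a BMC would look from another machine.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        return "closed"      # the host answered with RST
    except socket.timeout:
        return "filtered"    # nothing came back before the timeout
    except OSError:
        # Assumption: lump unreachable-host style errors in with filtered.
        return "filtered"
    finally:
        sock.close()


def filtered_ports(host, first=512, last=1023, timeout=1.0):
    """List the ports in [first, last] that never answered."""
    return [port for port in range(first, last + 1)
            if classify_port(host, port, timeout) == "filtered"]
```

Run from a second machine, something like filtered_ports("172.20.12.5") 
is what I would try; on hosts like ours I would expect it to list the same 
ports nmap flagged, but I have only verified this with nmap itself.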

We then did the inetd hack to occupy ports 623/664, and we have had no 
hung NFS since then (knock on wood).
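
If you would rather not touch inetd, the same idea (park something on the 
stolen ports so the NFS client cannot pick them as a source port) can be 
sketched as a small Python helper. This is an illustration only, not what 
we actually run: hold_ports is a made-up name, it needs root for ports 
below 1024, and whether your NFS client really skips an occupied port is 
an assumption you should test.

```python
import socket


def hold_ports(ports, address=""):
    """Bind and listen on each TCP port so nothing else can claim it.

    Returns {port: socket}. The ports stay occupied only while these
    sockets (and this process) stay alive.
    """
    held = {}
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # Deliberately no SO_REUSEADDR: later binds are meant to fail.
            sock.bind((address, port))
            sock.listen(1)
        except OSError:
            # Clean up everything taken so far, then report the failure.
            sock.close()
            for other in held.values():
                other.close()
            raise
        held[port] = sock
    return held
```

Usage would be along the lines of held = hold_ports([623, 664]) as root, 
followed by something that keeps the process alive (signal.pause() on 
Unix, for example).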



Kyle McDonald wrote:
> 
> Ok. I got a matching snoop (from the server) and tcpdump (from the linux 
> client) and the replies seen leaving the server are definitely not seen 
> by the client:
> 
>  From the server:
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C ACCESS3 
> FH=6C39 (read,lookup,modify,extend,delete)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 
> S=2049 Ack=1184956508 Seq=1590845549 Len=0 Win=53688 
> Options=<nop,nop,tstamp 631479 2639688273>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R ACCESS3 
> OK (read,lookup)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 
> S=800 Ack=1590845673 Seq=1184956508 Len=0 Win=512 
> Options=<nop,nop,tstamp 2639688274 631479>
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C 
> READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C 
> READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 
> S=2049 Ack=1184956668 Seq=1590847685 Len=0 Win=53688 
> Options=<nop,nop,tstamp 631500 2639688274>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C 
> READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 
> S=2049 Ack=1184956828 Seq=1590847685 Len=0 Win=53688 
> Options=<nop,nop,tstamp 637478 2639748274>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
> READDIRPLUS3 OK 12 entries (No more)
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 
> S=800 Fin Ack=1590845673 Seq=1184956828 Len=0 Win=512 
> Options=<nop,nop,tstamp 2640063353 631479>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 
> S=2049 Ack=1184956829 Seq=1590849697 Len=0 Win=53688 
> Options=<nop,nop,tstamp 668980 2640063353>
> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 
> S=2049 Fin Ack=1184956829 Seq=1590849697 Len=0 Win=53688 
> Options=<nop,nop,tstamp 668980 2640063353>
> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 
> S=800 Rst Seq=1184956829 Len=0 Win=0
> 
>  From the client:
> 15:09:24.165259 IP releng1.862107800 > Galileo.RelEng.Egenera.COM.nfs: 
> 140 access [|nfs]
> 15:09:24.165611 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: . ack 
> 277 win 53688 <nop,nop,timestamp 631479 2639688273>
> 15:09:24.165855 IP Galileo.RelEng.Egenera.COM.nfs > releng1.862107800: 
> reply ok 124 access [|nfs]
> 15:09:24.165867 IP releng1.800 > Galileo.RelEng.Egenera.COM.nfs: . ack 
> 293 win 512 <nop,nop,timestamp 2639688274 631479>
> 15:09:24.166069 IP releng1.878885016 > Galileo.RelEng.Egenera.COM.nfs: 
> 160 readdirplus [|nfs]
> 15:09:24.379488 IP releng1.878885016 > Galileo.RelEng.Egenera.COM.nfs: 
> 160 readdirplus [|nfs]
> 15:09:24.379772 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: . ack 
> 437 win 53688 <nop,nop,timestamp 631500 2639688274>
> 15:10:24.156981 IP releng1.878885016 > Galileo.RelEng.Egenera.COM.nfs: 
> 160 readdirplus [|nfs]
> 15:10:24.157285 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: . ack 
> 597 win 53688 <nop,nop,timestamp 637478 2639748274>
> 15:15:39.184608 IP releng1.800 > Galileo.RelEng.Egenera.COM.nfs: F 
> 597:597(0) ack 293 win 512 <nop,nop,timestamp 2640063353 631479>
> 15:15:39.185420 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: . ack 
> 598 win 53688 <nop,nop,timestamp 668980 2640063353>
> 15:15:39.185548 IP Galileo.RelEng.Egenera.COM.nfs > releng1.800: F 
> 4317:4317(0) ack 598 win 53688 <nop,nop,timestamp 668980 2640063353>
> 15:15:39.185572 IP releng1.800 > Galileo.RelEng.Egenera.COM.nfs: R 
> 1184956829:1184956829(0) win 0
> 
> I'll have to try to set up a port mirror on the switch I'm using. I don't 
> know if that switch can mirror the LACP group though - I'll have to 
> disable it.
> 
> Any other suggestions?
> 
>  -Kyle
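
Inline note: the pattern in the client trace above (the same XID sent 
several times with no "reply ok" coming back) can be picked out 
mechanically. A rough Python sketch follows. It assumes tcpdump text 
output in the format shown above with one packet per line (so unwrap the 
mail-wrapped lines first); the regexes are guessed from these few lines 
only, and other tcpdump versions may print NFS traffic differently.

```python
import re

# NFS call:  "... IP client.<xid> > server.nfs: <len> <procedure> ..."
CALL = re.compile(r"IP \S+\.(\d+) > \S+\.nfs: \d+ (\w+)")
# NFS reply: "... IP server.nfs > client.<xid>: reply ok <len> ..."
REPLY = re.compile(r"IP \S+\.nfs > \S+\.(\d+): reply ")


def unanswered_calls(lines):
    """Map xid -> (procedure, times sent) for calls that never got a reply.

    Plain TCP ACK/FIN/RST lines match neither pattern and are ignored.
    """
    sent = {}          # xid -> [procedure, times sent]
    answered = set()   # xids for which a reply was captured
    for line in lines:
        match = REPLY.search(line)
        if match:
            answered.add(match.group(1))
            continue
        match = CALL.search(line)
        if match:
            xid, procedure = match.groups()
            entry = sent.setdefault(xid, [procedure, 0])
            entry[1] += 1
    return {xid: (procedure, count)
            for xid, (procedure, count) in sent.items()
            if xid not in answered}
```

Fed the client lines above, this reports only XID 878885016 (readdirplus, 
sent 3 times); the access call 862107800 drops out because its reply was 
captured.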
> 
> Kyle McDonald wrote:
>> I'm having a similar problem from a Linux client to a sNV_b103 server.
>>
>> For me though the mount works fine, it's the NFS accesses that hang.
>>
>> Here's a snoop that shows what the server is seeing:
>>
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C 
>> GETATTR3 FH=6C39
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 
>> S=2049 Ack=659557530 Seq=471989561 Len=0 Win=53688 
>> Options=<nop,nop,tstamp 181189 2635184551>
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
>> GETATTR3 OK
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 
>> S=800 Ack=471989677 Seq=659557530 Len=0 Win=512 
>> Options=<nop,nop,tstamp 2635184551 181189>
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C ACCESS3 
>> FH=6C39 (read,lookup,modify,extend,delete)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R ACCESS3 
>> OK (read,lookup)
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C 
>> READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
>> READDIRPLUS3 OK 12 entries (No more)
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C 
>> READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 
>> S=2049 Ack=659557830 Seq=471991813 Len=0 Win=53688 
>> Options=<nop,nop,tstamp 181209 2635184552>
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
>> READDIRPLUS3 OK 12 entries (No more)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
>> READDIRPLUS3 OK 12 entries (No more)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
>> READDIRPLUS3 OK 12 entries (No more)
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C 
>> READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 
>> S=2049 Ack=659557990 Seq=471991813 Len=0 Win=53688 
>> Options=<nop,nop,tstamp 187188 2635244552>
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
>> READDIRPLUS3 OK 12 entries (No more)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
>> READDIRPLUS3 OK 12 entries (No more)
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C FSSTAT3 
>> FH=6C39
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R FSSTAT3 OK
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM TCP D=2049 
>> S=800 Ack=471989801 Seq=659558126 Len=0 Win=512 
>> Options=<nop,nop,tstamp 2635279692 181189,nop,nop,sack 
>> 471993825-471993997>
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
>> READDIRPLUS3 OK 12 entries (No more)
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C 
>> READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM NFS R 
>> READDIRPLUS3 OK 12 entries (No more)
>> releng1.RelEng.Egenera.COM -> Galileo.RelEng.Egenera.COM NFS C 
>> READDIRPLUS3 FH=6C39 Cookie=0 for 512/4096 (retransmit)
>> Galileo.RelEng.Egenera.COM -> releng1.RelEng.Egenera.COM TCP D=800 
>> S=2049 Ack=659558286 Seq=471996009 Len=0 Win=53688 
>> Options=<nop,nop,tstamp 199208 2635364552>
>>
>> This (to my uneducated eye) shows the server replying multiple times, 
>> and the client retransmitting the READDIRPLUS3 multiple times.
>>
>> I'm not familiar enough with Linux (yet) to run the equivalent of 
>> snoop (what is it? ethereal?) or I'd include traces from the client also.
>>
>> The client (and server) are both IBM x346 eServers. Like the link 
>> below, both have BMC (like an LOM) modules to manage the machine. Also 
>> like the link below, these modules share the Ethernet port with one of 
>> the Broadcom (not Intel) Ethernet interfaces built into the 
>> motherboard. However, in this case:
>>
>> 1) Neither the server nor the client is using the shared Broadcom 
>> interface on the motherboard.
>> 2) The client is using the other Broadcom interface on the MB.
>> 3) The server is using an LACP aggr group (set up with dladm [with 
>> mtu=9000]) built up from 4 Intel e1000g interfaces on a PCI card.
>>
>> So if packets are being lost on the return trip from the server to the 
>> client, I don't think it's for the same reason, though it may be 
>> similar.
>> Note this is on a ZFS filesystem, but from the traces above I'm not 
>> inclined to think that has anything to do with the problem.
>>
>> The server does have other Ethernet interfaces on other subnets. 
>> However, the testing above was careful to do the NFS mount with only 
>> the IP of the one interface; that one interface is also the one used 
>> for the default route, and snoop was running on the others and showed 
>> zero incoming or outgoing traffic (to or from the client's IP) during 
>> this same period.
>>
>> Anyone got any ideas?
>>
>>  -Kyle
>>
>>
>>
>>
>>
>> Jorgen Lundman wrote:
>>>
>>> I stumbled across this entry:
>>>
>>> http://blogs.sun.com/shepler/entry/port_623_or_the_mount
>>>
>>> Even though we do not see this issue with port 623, but rather with 
>>> 664, sure enough, it was sending SYN/ACK, then timing out until RST.
>>>
>>> I waited for the port to be released, told inetd to listen on port 
>>> 664 and voila, mount works fine again.
>>>
>>> We use Supermicros with Intel 82573V and 82573L.
>>>
>>> I would send Shepler my thanks but comments are disabled.
>>>
>>>
>>>
>>> Useless logs:
>>>
>>>
>>> # mount -o proto=tcp,vers=3 172.20.12.226:/export/src /mnt
>>>
>>>  172.20.12.6 -> 172.20.12.226 PORTMAP C GETPORT prog=100005 (MOUNT) 
>>> vers=3 proto=UDP
>>> 172.20.12.226 -> 172.20.12.6  PORTMAP R GETPORT port=39049
>>>  172.20.12.6 -> 172.20.12.226 MOUNT3 C Null
>>> 172.20.12.226 -> 172.20.12.6  MOUNT3 R Null
>>>  172.20.12.6 -> 172.20.12.226 MOUNT3 C Mount /export/src
>>> 172.20.12.226 -> 172.20.12.6  MOUNT3 R Mount OK FH=076E Auth=unix
>>>  172.20.12.6 -> 172.20.12.226 PORTMAP C GETPORT prog=100003 (NFS) 
>>> vers=3 proto=TCP
>>> 172.20.12.226 -> 172.20.12.6  PORTMAP R GETPORT port=2049
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Syn Seq=592414549 
>>> Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>> 172.20.12.226 -> 172.20.12.6  TCP D=38337 S=2049 Syn Ack=592414550 
>>> Seq=2210245643 Len=0 Win=49640 Options=<mss 1460,nop,wscale 
>>> 0,nop,nop,sackOK>
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Ack=2210245644 
>>> Seq=592414550 Len=0 Win=49640
>>>  172.20.12.6 -> 172.20.12.226 NFS C NULL3
>>> 172.20.12.226 -> 172.20.12.6  TCP D=38337 S=2049 Ack=592414670 
>>> Seq=2210245644 Len=0 Win=49520
>>> 172.20.12.226 -> 172.20.12.6  NFS R NULL3
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Ack=2210245672 
>>> Seq=592414670 Len=0 Win=49640
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Fin Ack=2210245672 
>>> Seq=592414670 Len=0 Win=49640
>>> 172.20.12.226 -> 172.20.12.6  TCP D=38337 S=2049 Ack=592414671 
>>> Seq=2210245672 Len=0 Win=49640
>>> 172.20.12.226 -> 172.20.12.6  TCP D=38337 S=2049 Fin Ack=592414671 
>>> Seq=2210245672 Len=0 Win=49640
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38337 Ack=2210245673 
>>> Seq=592414671 Len=0 Win=49640
>>>  172.20.12.6 -> 172.20.12.226 PORTMAP C GETPORT prog=100003 (NFS) 
>>> vers=3 proto=TCP
>>> 172.20.12.226 -> 172.20.12.6  PORTMAP R GETPORT port=2049
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Syn Seq=3614232918 
>>> Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>> 172.20.12.226 -> 172.20.12.6  TCP D=38338 S=2049 Syn Ack=3614232919 
>>> Seq=2210460804 Len=0 Win=49640 Options=<mss 1460,nop,wscale 
>>> 0,nop,nop,sackOK>
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Ack=2210460805 
>>> Seq=3614232919 Len=0 Win=49640
>>>  172.20.12.6 -> 172.20.12.226 NFS C NULL3
>>> 172.20.12.226 -> 172.20.12.6  TCP D=38338 S=2049 Ack=3614233039 
>>> Seq=2210460805 Len=0 Win=49520
>>> 172.20.12.226 -> 172.20.12.6  NFS R NULL3
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Ack=2210460833 
>>> Seq=3614233039 Len=0 Win=49640
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Fin Ack=2210460833 
>>> Seq=3614233039 Len=0 Win=49640
>>> 172.20.12.226 -> 172.20.12.6  TCP D=38338 S=2049 Ack=3614233040 
>>> Seq=2210460833 Len=0 Win=49640
>>> 172.20.12.226 -> 172.20.12.6  TCP D=38338 S=2049 Fin Ack=3614233040 
>>> Seq=2210460833 Len=0 Win=49640
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=38338 Ack=2210460834 
>>> Seq=3614233040 Len=0 Win=49640
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=664 Rst Ack=0 
>>> Seq=3456416233 Len=0 Win=49640
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=664 Syn Seq=3507413975 
>>> Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>>
>>>
>>>  172.20.12.6 -> 172.20.12.226 TCP D=2049 S=664 Syn Seq=3507413975 
>>> Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>>
>>>
>>> # netstat
>>> 172.20.12.6.664      172.20.12.226.2049       0      0 49640      0 
>>> SYN_SENT
>>>
>>>
>>> After inetd hack:
>>>
>>> # netstat
>>>       *.664                *.*                0      0 49152      0 
>>> BOUND
>>>
>>> # mount -o proto=tcp,vers=3 172.20.12.226:/export/src /mnt
>>> 172.20.12.6 -> 172.20.12.226 TCP D=2049 S=661 Syn Seq=1448210229 
>>> Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>
>>>
>>> # df
>>> 172.20.12.226:/export/src
>>>                         24T    11G    24T     1%    /mnt
>>>
>>>
>>>
>>> Jorgen Lundman wrote:
>>>>
>>>> Ok, it still happens even when not using aliases, it just took 
>>>> longer to turn up.
>>>>
>>>> Attempting to mount (snoop running on NFS client)
>>>
>>>
>>
>>
>> _______________________________________________
>> nfs-discuss mailing list
>> nfs-discuss at opensolaris.org
> 
> 
> 

-- 
Jorgen Lundman       | <lundman at lundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)
