This morning I captured tcpdump traces of NFS traffic between an OpenBSD
client and my Tru64 5.1A TruCluster system, before and after activating
the ipfilter module. Some observations: the primary (first one up) cluster
member is the one that handles all the NFS traffic; this is based on
tcpdump output that showed no NFS traffic from member2. The client must
mount the cluster alias for NFS; an attempt to mount an individual member
gives a permission denied error, which I think is by design. When I
activate the ipfilter module with sysconfig -c ipfilter, any attempt to
access a mounted filesystem on the OpenBSD client hangs, and the OpenBSD
kernel prints "nfs server not responding" messages. Tcpdump output shows
the following. Before activating the ipfilter module:
Dump output from Tru64 member:
djklaptop.42eabfa7 > keckcenter65.nfs-v3: 128 call access OSF/1 fh
3149,424748/10.2 want: lookup
keckcenter65.42eabfa7 > djklaptop.nfs-v3: 120 reply access {dir size 8192
mtime 1076709964.121488000 ctime 1076709964.121488000} permitted: lookup
djklaptop.42eac089 > keckcenter65.nfs-v3: 128 call access OSF/1
fh 3149,424748/10.2 want: read
keckcenter65.42eac089 > djklaptop.nfs-v3: 120 reply access {dir size 8192
mtime 1076709964.121488000 ctime 1076709964.121488000} permitted: read
djklaptop.42eac3bd > keckcenter65.nfs-v3: 124 call getattr OSF/1
fh 3149,424748/10.2
keckcenter65.42eac3bd > djklaptop.nfs-v3: 112 reply getattr {dir size 8192
mtime 1076709964.121488000 ctime 1076709964.121488000}
djklaptop.42eac3f3 > keckcenter65.nfs-v3: 96 call fsstat OSF/1
fh 3149,424748/10.2
keckcenter65.42eac3f3 > djklaptop.nfs-v3: 168 reply fsstat {dir size 8192
mtime 1076709964.121488000 ctime 1076709964.121488000} tbytes 50052467712
fbytes 18651762688 abytes 18651762688 tfiles 789762910 ffiles 788683578
afiles 788683578 invarsec 0
djklaptop.42eac408 > keckcenter65.nfs-v3: 144 call readdir OSF/1
fh 3149,424748/10.2 cookie 0 cookieverf 0 count 8192
After executing sysconfig -c ipfilter:
djklaptop.42eadf25 > keckcenter65.nfs-v3: 128 call access OSF/1 fh
3149,424748/10.2 want: lookup
lehrer65.2049 > djklaptop.829: udp 120 (DF)
djklaptop > lehrer65: icmp: djklaptop udp port 829 unreachable
djklaptop.42eadf25 > keckcenter65.nfs-v3: 128 call access OSF/1 fh
3149,424748/10.2 want: lookup
lehrer65.2049 > djklaptop.829: udp 120 (DF)
djklaptop > lehrer65: icmp: djklaptop udp port 829 unreachable
djklaptop.42eadf25 > keckcenter65.nfs-v3: 128 call access OSF/1
fh 3149,424748/10.2 want: lookup
lehrer65.2049 > djklaptop.829: udp 120 (DF)
djklaptop > lehrer65: icmp: djklaptop udp port 829 unreachable
On the OpenBSD client, the tcpdump output from before the module was
activated on the Tru64 server looks OK:
Dec 29 09:29:21.903357 0:a:95:b3:c9:80 0:10:64:30:38:d2 ip
170: djklaptop.ucsf.edu.829 > keckcenter.ucsf.edu.nfsd: xid 0x42eabfa7 128
lookup fh 0,32/201326592 [|nfs]
Dec 29 09:29:21.950037 0:a:95:b3:c9:80 0:10:64:30:38:d2 ip
170: djklaptop.ucsf.edu.829 > keckcenter.ucsf.edu.nfsd: xid 0x42eabfa7 128
lookup fh 0,32/201326592 [|nfs]
Dec 29 09:29:52.681607 0:10:64:30:38:d2 0:a:95:b3:c9:80 ip
166: keckcenter.ucsf.edu.nfsd > djklaptop.ucsf.edu.829: xid 0x42eabfa7
reply ok 120 lookup fh 0,1/18
Dec 29 09:29:52.681736 0:a:95:b3:c9:80 0:10:64:30:38:d2 ip
170: djklaptop.ucsf.edu.829 > keckcenter.ucsf.edu.nfsd: xid 0x42eac044 128
lookup fh 0,32/201326592 [|nfs]
When the module is activated, the OpenBSD client shows:
Dec 29 09:33:02.440050 0:a:95:b3:c9:80 0:10:64:30:38:d2 ip
170: djklaptop.ucsf.edu.829 > keckcenter.ucsf.edu.nfsd: xid 0x42eadf25 128
lookup fh 0,32/201326592 [|nfs]
Dec 29 09:33:02.440506 0:10:64:30:38:d2 0:a:95:b3:c9:80 ip
166: lehrer.ucsf.edu.nfsd > djklaptop.ucsf.edu.829: xid 0x42eadf25
reply ok 120 (DF)
Dec 29 09:33:02.440546 0:a:95:b3:c9:80 0:10:64:30:38:d2 ip
70: djklaptop.ucsf.edu > lehrer.ucsf.edu: icmp: djklaptop.ucsf.edu
udp port 829 unreachable
What strikes me as interesting is that the tcpdump output from both
machines shows all successful exchanges as being to/from the cluster
alias (keckcenter), whereas in the unsuccessful exchanges (when the module
is activated) the reply comes from the member (lehrer) and the client
answers it with an icmp port unreachable.
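
The pattern above can be checked mechanically against a saved trace. What follows is a minimal Python sketch, not a tool used for the captures: the function names are made up for illustration, and it assumes the short-hostname "src > dst:" line format of the Tru64-side dump (it will not handle the fully qualified names in the OpenBSD-side dump). It lists hosts that sent packets to the client although the client never addressed a request to them.

```python
import re

# Matches the "src > dst:" prefix of a tcpdump packet line.
PACKET = re.compile(r"^(\S+) > (\S+):")

def host(endpoint):
    # Drop the trailing ".port" or ".xid" field, if there is one.
    return endpoint.rsplit(".", 1)[0]

def unexpected_reply_sources(lines, client):
    """Return hosts that sent to `client` but were never sent a request by it."""
    contacted = set()  # hosts the client sent requests to
    replied = set()    # hosts that sent anything to the client
    for line in lines:
        m = PACKET.match(line.strip())
        if not m:
            continue  # continuation line of a wrapped packet
        src, dst = host(m.group(1)), host(m.group(2))
        if src == client and "icmp:" not in line:
            contacted.add(dst)  # ICMP errors from the client are not requests
        elif dst == client:
            replied.add(src)
    return sorted(replied - contacted)
```

Fed the three packet lines captured after sysconfig -c ipfilter, with djklaptop as the client, this reports lehrer65; fed the earlier capture it reports nothing, since every reply there comes from keckcenter65.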
Does anyone have any idea why loading the module would cause this problem?
Dirk
On Tue, 28 Dec 2004, Darren Reed wrote:
> In the cluster model, is it possible that IP traffic is coming in
> one host, being sent via the memory channel interconnect to the
> other and replies then exiting it ?
>
> e.g.
>
> sender--<SYN>->[hostA]--(SYN via interconnect)-->[hostB]--<SYN+ACK>-->sender
>
> well, that's not a good diagram...but...
>
> IPFilter as yet isn't cluster aware, so at this point, you'd need to
> rewrite your ruleset without "keep state" rules.
>
> Darren
>