> We are preparing to transition users from a Cray XT3 system to
> an XT4 system. In order to transfer the data between the two
> Lustre filesystems, we are looking at the possibility of having
> one system mount the other system's file system...
Tell LNET there are 2 portals networks so one cluster has LNET
NIDs that are <portals_nid>@ptl0 and the other has
<portals_nid>@ptl1.
Let me assume for the purposes of this discussion...
1. The XT3 service nodes have IP addresses in the range
192.168.1.* on some interface.
2. The XT3 system has 8 nodes that connect to the site IP network
via eth2 which have IP addresses in the range 100.1.1.[1-8]
and NIDs in the range 16-23
3. The XT4 service nodes have IP addresses in the range
192.168.2.* on some interface.
4. The XT4 system has 8 nodes that connect to the site IP network
via eth3 which have IP addresses in the range 100.1.2.[1-8]
and NIDs in the range 32-39
...so you can set the following lnet module parameters...
options lnet ip2nets="ptl0 192.168.1.* # XT3 service nodes;\
ptl1 192.168.2.* # XT4 service nodes;\
tcp0(eth2) 100.1.1.[1-8] # XT3 lnet routers;\
tcp0(eth3) 100.1.2.[1-8] # XT4 lnet routers;"\
routes="ptl0 1 [EMAIL PROTECTED] # XT3 <- IP network;\
ptl0 2 [EMAIL PROTECTED] # XT3 <- XT4;\
ptl1 1 [EMAIL PROTECTED] # XT4 <- IP network;\
ptl1 2 [EMAIL PROTECTED] # XT4 <- XT3;\
tcp0 1 [EMAIL PROTECTED] # IP network <- XT3;\
tcp0 1 [EMAIL PROTECTED] # IP network <- XT4;"
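To see which networks a given node ends up on with the ip2nets setting above, here is a rough Python sketch. It is not LNET's parser, just a simplified matcher I am assuming for illustration (it only understands plain octets, "*" and "[a-b]" ranges), and the names octet_matches and networks_for_node are made up. Note how a router node, which has both a 192.168.x.* address and a site address, lands on two networks.

```python
import re

# Rules copied from the ip2nets setting above: (network spec, IP pattern).
IP2NETS_RULES = [
    ("ptl0", "192.168.1.*"),          # XT3 service nodes
    ("ptl1", "192.168.2.*"),          # XT4 service nodes
    ("tcp0(eth2)", "100.1.1.[1-8]"),  # XT3 lnet routers
    ("tcp0(eth3)", "100.1.2.[1-8]"),  # XT4 lnet routers
]


def octet_matches(pattern, octet):
    """Match one octet against '*', '[lo-hi]' or a literal number."""
    if pattern == "*":
        return True
    m = re.fullmatch(r"\[(\d+)-(\d+)\]", pattern)
    if m:
        return int(m.group(1)) <= octet <= int(m.group(2))
    return pattern.isdigit() and int(pattern) == octet


def ip_matches(pattern, ip):
    """Match a dotted-quad IP against a four-part pattern."""
    parts = pattern.split(".")
    octets = [int(o) for o in ip.split(".")]
    if len(parts) != 4 or len(octets) != 4:
        return False
    return all(octet_matches(p, o) for p, o in zip(parts, octets))


def networks_for_node(local_ips, rules=IP2NETS_RULES):
    """Return the network specs (in rule order) matched by any local IP."""
    return [net for net, pattern in rules
            if any(ip_matches(pattern, ip) for ip in local_ips)]


if __name__ == "__main__":
    # An XT3 router: one service-node address, one site address.
    print(networks_for_node(["192.168.1.5", "100.1.1.3"]))
    # A plain XT4 service node.
    print(networks_for_node(["192.168.2.7"]))
```

A node whose addresses match none of the rules gets an empty list here; the addresses in the example calls are picked from the ranges assumed above.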
You may also want to set the following additional LNET parameters...
options lnet check_routers_before_use=1\
dead_router_check_interval=50\
live_router_check_interval=50
...which enable automatic router health checks (these are disabled by
default) so that dead routers are avoided.
If you want catamount applications to run while this network is
in place, you'll need the following environment variables set...
1. XT4 catamount apps need LNET_NETWORKS="ptl1" so they know they
are in the ptl1 network. XT3 nodes don't need this because
ptl0 is the default.
2. XT3 catamount apps wishing to access XT4 servers need
LNET_ROUTES="ptl1 [EMAIL PROTECTED]".
3. XT4 catamount apps wishing to access XT3 servers need
LNET_ROUTES="ptl0 [EMAIL PROTECTED]"
Please note that catamount apps are not tolerant of router failure, so any
downed routers must be omitted from the LNET_ROUTES environment variable.
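Since downed routers have to be left out by hand, something like the following Python sketch could generate the LNET_ROUTES value from the routers you know to be up. Everything here is an assumption on my part: the helper name build_lnet_routes is made up, the router NIDs 16-18 are just taken from the XT3 range assumed earlier, and the "<net> <nid>@<gateway net>" entry format with ";" separators mirrors the routes module parameter rather than anything catamount-specific, so check it against your installation before relying on it.

```python
def build_lnet_routes(target_net, gateway_net, router_nids, down=()):
    """Build a routes string that skips routers known to be down.

    target_net  -- network to reach, e.g. "ptl1"
    gateway_net -- network the routers are reached on, e.g. "ptl0"
    router_nids -- portals NIDs of all candidate routers
    down        -- NIDs of routers to omit
    """
    down = set(down)
    live = [nid for nid in router_nids if nid not in down]
    if not live:
        raise ValueError("no live routers left for " + target_net)
    # One route entry per live router (assumed format, see note above).
    return "; ".join(
        "{} {}@{}".format(target_net, nid, gateway_net) for nid in live
    )


if __name__ == "__main__":
    # XT3 app reaching XT4 servers, with router 17 currently down.
    print(build_lnet_routes("ptl1", "ptl0", [16, 17, 18], down={17}))
```

The result would then be exported as LNET_ROUTES in the job environment before the application is launched.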
Cheers,
Eric