Eric and Paul,

Thanks for your help. We tested cross-mounting the file system from a dev XT system on another XT. We still haven't tested having both file systems mounted concurrently, but I'm optimistic given the success of this first test.

Thanks again,

--Shane

-----Original Message-----
From: Eric Barton [mailto:[EMAIL PROTECTED]
Sent: Sunday, February 04, 2007 9:28 AM
To: Canon, Richard Shane; [email protected]
Subject: RE: [Lustre-discuss] Routing between two XT systems

> We are preparing to transition users from a Cray XT3 system to
> an XT4 system. In order to transfer the data between the two
> Lustre filesystems, we are looking at the possibility of having
> one system mount the other system's file system...

Tell LNET there are 2 portals networks so one cluster has LNET NIDs that are <portals_nid>@ptl0 and the other has <portals_nid>@ptl1.

Let me assume for the purposes of this discussion...

1. The XT3 service nodes have IP addresses in the range 192.168.1.* on some interface.

2. The XT3 system has 8 nodes that connect to the site IP network via eth2 which have IP addresses in the range 100.1.1.[1-8] and NIDs in the range 16-23

3. The XT4 service nodes have IP addresses in the range 192.168.2.* on some interface.

4. The XT4 system has 8 nodes that connect to the site IP network via eth3 which have IP addresses in the range 100.1.2.[1-8] and NIDs in the range 32-39

...so you can set the following lnet module parameters...

  options lnet ip2nets="ptl0 192.168.1.* # XT3 service nodes;\
                        ptl1 192.168.2.* # XT4 service nodes;\
                        tcp0(eth2) 100.1.1.[1-8] # XT3 lnet routers;\
                        tcp0(eth3) 100.1.2.[1-8] # XT4 lnet routers;"\
               routes="ptl0 1 [EMAIL PROTECTED] # XT3 <- IP network;\
                       ptl0 2 [EMAIL PROTECTED] # XT3 <- XT4;\
                       ptl1 1 [EMAIL PROTECTED] # XT4 <- IP network;\
                       ptl1 2 [EMAIL PROTECTED] # XT4 <- XT3;\
                       tcp0 1 [EMAIL PROTECTED] # IP network <- XT3;\
                       tcp0 1 [EMAIL PROTECTED] # IP network <- XT4;"

You may also want to set the following additional LNET parameters...

  options lnet check_routers_before_use=1\
               dead_router_check_interval=50\
               live_router_check_interval=50

...which enable automatic router health checks (these are disabled by default) so that dead routers are avoided.
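
As a rough illustration of how the ip2nets patterns above pick networks for a node, here is a small Python sketch. It is not LNET's parser: it assumes only the two pattern forms used in this example (`*` and `[a-b]` in a single octet), and the names `IP2NETS`, `octet_matches`, `ip_matches` and `networks_for` are made up for this sketch.

```python
import re

# Rules copied from the ip2nets example above: (network, interface or None, pattern).
IP2NETS = [
    ("ptl0", None, "192.168.1.*"),      # XT3 service nodes
    ("ptl1", None, "192.168.2.*"),      # XT4 service nodes
    ("tcp0", "eth2", "100.1.1.[1-8]"),  # XT3 lnet routers
    ("tcp0", "eth3", "100.1.2.[1-8]"),  # XT4 lnet routers
]


def octet_matches(pattern, octet):
    """Match one octet against '*', '[lo-hi]' or a literal number."""
    if pattern == "*":
        return True
    m = re.fullmatch(r"\[(\d+)-(\d+)\]", pattern)
    if m:
        return int(m.group(1)) <= octet <= int(m.group(2))
    return pattern.isdigit() and int(pattern) == octet


def ip_matches(pattern, ip):
    """Match a dotted-quad IP against a dotted pattern, octet by octet."""
    pats = pattern.split(".")
    octets = ip.split(".")
    if len(pats) != 4 or len(octets) != 4:
        return False
    return all(octet_matches(p, int(o)) for p, o in zip(pats, octets))


def networks_for(local_ips):
    """Return the networks whose pattern matches any of the node's local IPs."""
    nets = []
    for net, iface, pattern in IP2NETS:
        if any(ip_matches(pattern, ip) for ip in local_ips):
            entry = f"{net}({iface})" if iface else net
            if entry not in nets:
                nets.append(entry)
    return nets


if __name__ == "__main__":
    # A hypothetical XT3 router node with one address on each side.
    print(networks_for(["192.168.1.5", "100.1.1.3"]))
```

With the hypothetical addresses shown, the node matches both the ptl0 rule and the tcp0(eth2) rule, which is what makes it a router between the two networks; a node with only 192.168.2.* addresses would match ptl1 alone.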

If you want catamount applications to run while this network is in place, you'll need the following environment variables set...

1. XT4 catamount apps need LNET_NETWORKS="ptl1" so they know they are in the ptl1 network. XT3 nodes don't need this because ptl0 is the default.

2. XT3 catamount apps wishing to access XT4 servers need LNET_ROUTES="ptl1 [EMAIL PROTECTED]".

3. XT4 catamount apps wishing to access XT3 servers need LNET_ROUTES="ptl0 [EMAIL PROTECTED]".

Please note that catamount apps are not tolerant of router failure, so any downed routers must be omitted from the LNET_ROUTES environment variable.

Cheers,
Eric

_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
