[Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib

2011-06-14 Thread Thomas Roth
Hi all, I'd like to mount two Lustre filesystems on one client. Issues with more than one MGS aside, the point here is that one of them is an InfiniBand cluster and the other is Ethernet-based, and my client is on the Ethernet side. I have managed to mount the o2ib-fs by setting up an LNET router,

Re: [Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib

2011-06-14 Thread Michael Shuey
Is your ethernet FS in tcp1, or tcp0? Your config bits indicate the client is in tcp1 - do the servers agree? -- Mike Shuey On Tue, Jun 14, 2011 at 12:23 PM, Thomas Roth t.r...@gsi.de wrote: Hi all, I'd like to mount two Lustre filesystems on one client. Issues with more than one MGS set

Re: [Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib

2011-06-14 Thread Thomas Roth
Hm, the ethernet FS is in tcp0 - the MGS says its NIDs are MGS-IP@tcp, so it is not surprising that it refuses that connection. On the other hand, options lnet networks=tcp1(eth0),tcp(eth0:0) routes="o2ib LNET-Router-IP@tcp1; tcp Default-Gateway-IP@tcp" results in: Can't create route to tcp via

Re: [Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib

2011-06-14 Thread Michael Shuey
That may be because your gateway doesn't have an interface on tcp (aka tcp0). I suspect you want to keep your ethernet clients in tcp0, your IB clients in o2ib0, and your router in both. Personally, I find it easiest to just give different module options on each system (rather than try ip2nets
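A minimal sketch of the layout suggested here (ethernet clients in tcp0, IB nodes in o2ib0, the router in both), written as separate lnet module options per system. The interface names and both router addresses are hypothetical placeholders, not values from this thread, and the file location (for example under /etc/modprobe.d/) is an assumption that depends on the distribution:

```
# Ethernet client: local network is tcp0; o2ib0 is reached through the router.
# 192.168.1.10 is a hypothetical address of the router's Ethernet interface.
options lnet networks="tcp0(eth0)" routes="o2ib0 192.168.1.10@tcp0"

# LNET router: one interface on each network, with forwarding switched on.
options lnet networks="tcp0(eth0),o2ib0(ib0)" forwarding="enabled"

# InfiniBand node: local network is o2ib0; tcp0 is reached through the router.
# 10.10.1.10 is a hypothetical address of the router's IB interface.
options lnet networks="o2ib0(ib0)" routes="tcp0 10.10.1.10@o2ib0"
```

Each system gets only the one options line that matches its role. The gateway NID in a routes entry has to be on a network that the local node itself is attached to, which is why the router needs an interface on both networks.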

Re: [Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib

2011-06-14 Thread Thomas Roth
Thanks, Michael. I'll certainly put in the check_interval, that will be needed. However, what I tried was to have an ethernet client that mounts one FS via the LNET router (Infiniband-FS behind it) and simultaneously mounts the other FS, which is on tcp0 - via its default route. So actually

Re: [Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib - solved?

2011-06-14 Thread Thomas Roth
Hi all, this seems to work with the correct IPs and correct network names ;-[] I now have the following modprobe options on my ethernet client: options lnet networks=tcp1(eth0),tcp0(eth0:0) routes="o2ib LNET-Router@tcp1; tcp Default-Route@tcp1" With these options, loading the modules gives me Jun

[Lustre-discuss] Enabling mds failover after filesystem creation

2011-06-14 Thread Jeff Johnson
Greetings, I am attempting to add MDS failover operation to an existing v1.8.4 filesystem. I have heartbeat/stonith configured on the MDS nodes. What is unclear is what to change in the Lustre parameters. I have read over the 1.8.x and 2.0 manuals and they are unclear as to exactly how to enable

Re: [Lustre-discuss] Enabling mds failover after filesystem creation

2011-06-14 Thread Cliff White
It depends - are you using a combined MGS/MDS? If so, you will have to update the mgsnid on all servers to reflect the failover node, plus change the client mount string to show the failover node. Otherwise, it's the same procedure as with an OST. cliffw On Tue, Jun 14, 2011 at 12:06 PM, Jeff

Re: [Lustre-discuss] Enabling mds failover after filesystem creation

2011-06-14 Thread Jeff Johnson
Apologies, I should have been more descriptive. I am running a dedicated MGS node and MGT device. The MDT is a standalone RAID-10 shared via SAS between two nodes, one being the current MDS and the second being the planned secondary MDS. Heartbeat and STONITH with IPMI control is currently

Re: [Lustre-discuss] Enabling mds failover after filesystem creation

2011-06-14 Thread Cliff White
Then it should be the same as the OST case. The only difference between the two is that we never allow two active MDSs on the same filesystem, so MDT is always active/passive. cliffw On Tue, Jun 14, 2011 at 12:18 PM, Jeff Johnson jeff.john...@aeoncomputing.com wrote: Apologies, I should have

Re: [Lustre-discuss] Where should SHARED_DIRECTORY of acc-sm cfg variable set to?

2011-06-14 Thread Andreas Dilger
On 2011-06-14, at 9:57 AM, Surya, Prakash B. wrote: Perhaps it would be beneficial to add Andreas's comment to the source to avoid any future confusion? {code} # This is used by a small number of tests to share state between the client # running the tests, or in some cases between

[Lustre-discuss] lustre from source EXTRAVERSION

2011-06-14 Thread Michael Di Domenico
I'm trying to rebuild the RHEL kernel with the Lustre patches. Almost everything has gone okay on the first pass so far, but I ran into an issue when trying to get OFED to compile against the new code. According to the Whamcloud wiki, I am to perform this step: Add a unique build id so we can be