[Lustre-discuss] lctl ping of Pacemaker IP

2012-11-04 Thread Ms. Megan Larko
Greetings,

My current solution for corosync/pacemaker control of my Lustre
filesystem availability was to write a Linux Standard Base (LSB) Sys V
init script for my ib0 service, so that I could use a corosync
primitive to control the IB network (and therefore the MGS).  Since
I did not know how to make the corosync alias IP accessible to
LNET for the successful lctl ping that Lustre OSS nodes need in order
to communicate properly with the MGS/MDS, I chose to point to the real
InfiniBand ib0 IP and have corosync align that network address with the
system serving the fibre channel multipath mgs/mdt disk.  In this
way the ost disks have one and only one mgsnode (no failover mgsnode
is needed because the ib0 address itself fails over).
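
As a rough illustration of the single-mgsnode arrangement described
above, the Python sketch below formats a --mgsnode option from one
address that moves with the service.  The helper name mgsnode_option,
the sample address, and the o2ib network name are assumptions chosen
for illustration; none of them are values taken from this thread.

```python
def mgsnode_option(address, network):
    """Format a --mgsnode option from a bare address and an LNET network name."""
    if "@" in address:
        raise ValueError("pass the bare address, not a full NID")
    return "--mgsnode={}@{}".format(address, network)


# Hypothetical values: a single ib0 address that corosync moves between
# the two servers, so the OSTs are given exactly one mgsnode.
print(mgsnode_option("192.168.10.5", "o2ib"))
```

Because the address itself follows the MGS, only one such option would
be needed per OST under this arrangement.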

This has been successful in my TCP test (an LSB-compliant service for
eth1).  I plan on implementing it over InfiniBand this week when the
IB hardware comes in.

Thanks for your help.  I appreciate it.

Cheers,
megan
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] lctl ping of Pacemaker IP

2012-11-02 Thread Isaac Huang
On Fri, Nov 02, 2012 at 12:04:02AM -0400, Ms. Megan Larko wrote:
> ...
> What steps should I take to generate a successful lctl ping a.b.c.d?

There must be an LNet instance running over SOCKLND on a.b.c.d.

- Isaac


[Lustre-discuss] lctl ping of Pacemaker IP

2012-11-01 Thread Ms. Megan Larko
Greetings!

I am working with Lustre-2.1.2 on RHEL 6.2.  First I configured it
using the standard defaults over TCP/IP.  Everything worked very
nicely using a real, static --mgsnode=a.b.c.x value, which was the
actual IP of the MGS/MDS system1 node.

I am now trying to integrate it with Pacemaker-1.1.7.  I believe I
have most of the set-up completed, with one exception.  The
lctl ping command cannot ping the Pacemaker IP alias (say a.b.c.d).
The generic ping command in RHEL 6.2 can successfully reach the
interface.  The Pacemaker alias IP (for failover of the combined
MGS/MDS node with Fibre Channel multipath storage shared between both
MGS/MDS-configured machines) works in and of itself; I tested it with
an apache service.  Pacemaker correctly fails over the MGS/MDS from
system1 to system2.  Once the service is on system2, however, my
Lustre file system stops because it cannot reach the alias IP address.

I did configure the lustre OSTs to use --mgsnode=a.b.c.d (a.b.c.d
representing my Pacemaker IP alias).  A tunefs.lustre confirms the
alias IP number.  The alias IP number does not appear in LNET (lctl
list_nids), and lctl ping a.b.c.d fails.
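
A minimal sketch of the check described above, written in Python and
assuming that lctl list_nids prints one NID per line in the usual
address@network form.  The function names and the sample addresses are
hypothetical, chosen only for illustration; the sketch parses text that
has already been captured rather than calling lctl itself.

```python
def parse_nids(list_nids_output):
    """Split captured `lctl list_nids` text into (address, network) pairs.

    Assumes one NID per line in the form address@network, for example
    10.0.0.11@tcp.  Lines without an '@' are skipped.
    """
    nids = []
    for line in list_nids_output.splitlines():
        line = line.strip()
        if "@" not in line:
            continue
        address, network = line.split("@", 1)
        nids.append((address, network))
    return nids


def address_in_nids(list_nids_output, address):
    """Return True if `address` is the address part of any listed NID."""
    return any(addr == address for addr, _ in parse_nids(list_nids_output))


# Hypothetical captured output: only the real interface address is listed.
sample = "10.0.0.11@tcp\n"
print(address_in_nids(sample, "10.0.0.11"))   # real IP of the node
print(address_in_nids(sample, "10.0.0.100"))  # Pacemaker alias IP
```

With the sample above, the real address is found and the alias is not,
which mirrors the situation reported here: the alias answers an
ordinary ping but is not among the NIDs that LNET knows about.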

Should this IP alias go into the LNET configuration?  If yes, how?  What
steps should I take to generate a successful lctl ping a.b.c.d?

Thanks for reading!
Cheers,
megan


Re: [Lustre-discuss] lctl ping of Pacemaker IP

2012-11-01 Thread Jeff Johnson
Megan,

LNet pings aren't the same as ordinary IP (ICMP) pings. An LNet ping ('lctl
ping') needs to reach an active LNet instance on the target address. I don't
think you can bind LNet to a Pacemaker virtual IP, but I'll let someone smarter
than me on this list confirm or correct me.

In any event, an LNet ping and an ICMP ping are completely separate animals.

--Jeff
