Dejan,

Thanks so much for the explanation :-)

c.

On Apr 8, 2009, at 5:46 PM, Dejan Muhamedagic wrote:

Ciao,

On Wed, Apr 08, 2009 at 04:17:45PM +0200, Cristina Bulfon wrote:
Ciao Dejan,

Thanks for the answer.
Do you mean that I have to use Heartbeat v2 plus CRM, and that there is
then a way to check the HBA without using hbaping?

Unlike Heartbeat v1, CRM/v2 can monitor resources. I suppose that
in your case, a failing HBA would cause drbd or Filesystem
monitor action to fail, which would result in either a failover
or restart, depending on the configuration.
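
As a rough illustration of the monitoring idea described above (this is not the code of the real drbd or Filesystem agents; the function name and the directory check are assumptions made for the sketch), a monitor action is a periodic probe that reports the resource state through a return code. The numeric values follow the OCF resource agent convention:

```python
import os

# Return codes from the OCF resource agent convention.
OCF_SUCCESS = 0        # resource is running and healthy
OCF_ERR_GENERIC = 1    # resource is present but the check failed
OCF_NOT_RUNNING = 7    # resource is cleanly stopped / not present


def monitor_directory(path):
    """Hypothetical monitor probe: can the mounted directory be listed?"""
    if not os.path.isdir(path):
        return OCF_NOT_RUNNING
    try:
        os.listdir(path)
    except OSError:
        # An I/O error here is what a broken storage path might produce.
        return OCF_ERR_GENERIC
    return OCF_SUCCESS


if __name__ == "__main__":
    # /vicepa is one of the mount points from the haresources file.
    print(monitor_directory("/vicepa"))
```

The cluster manager, not the probe, decides what to do with a failure code (restart or failover). Note that with a dead storage path a real check may hang or be served from cache rather than fail cleanly, so this only shows the shape of the mechanism.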

Thanks,

Dejan

Just to be sure I have understood correctly: I am a newbie with Heartbeat v2.

thanks

cristina





On Mar 31, 2009, at 2:00 PM, Dejan Muhamedagic wrote:

Ciao,

On Tue, Mar 31, 2009 at 01:48:47PM +0200, Cristina Bulfon wrote:
Ciao,

In our Heartbeat cluster we simulated an HBA failure by unplugging the
fiber from the HBA on the primary node. The resources didn't switch to
the secondary node, and the log file on the primary node reported the
following messages:

Feb 19 14:33:33 afsitfs3 kernel: qla2xxx 0000:0a:01.0: LOOP DOWN detected (2 e678 16ed).
Feb 19 14:33:38 afsitfs3 kernel: qla2xxx 0000:0a:01.1: LOOP DOWN detected (2 8633 16fc).
Feb 19 14:33:46 afsitfs3 kernel: qla2x00: FAILOVER device 2 from 200500a0b832d169 -> 200400a0b832d16a - LUN 10, reason=0x2
Feb 19 14:33:46 afsitfs3 kernel: qla2x00: FROM HBA 0 to HBA 1
Feb 19 14:33:52 afsitfs3 kernel: qla2x00: FAILOVER device 2 from 200400a0b832d16a -> 200500a0b832d16a - LUN 10, reason=0x2
Feb 19 14:33:52 afsitfs3 kernel: qla2x00: FROM HBA 1 to HBA 1
Feb 19 14:33:55 afsitfs3 kernel: qla2x00: FAILOVER device 2 from 200500a0b832d16a -> 200400a0b832d169 - LUN 10, reason=0x2
Feb 19 14:33:55 afsitfs3 kernel: qla2x00: FROM HBA 1 to HBA 0
Feb 19 14:33:58 afsitfs3 kernel: qla2x00: FAILOVER device 2 from 200400a0b832d169 -> 200500a0b832d169 - LUN 10, reason=0x2
Feb 19 14:33:58 afsitfs3 kernel: qla2x00: FROM HBA 0 to HBA 0
Feb 19 14:34:01 afsitfs3 kernel: qla2x00: FAILOVER device 2 from 200500a0b832d169 -> 200400a0b832d16a - LUN 10, reason=0x2

I more or less expected this kind of message, but I do not understand why
the secondary node doesn't take control of the resources.
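
For anyone who wants to watch for this condition outside the cluster manager, here is a minimal sketch that scans syslog lines like the ones above for the two driver messages. The regular expressions are derived only from the lines quoted in this mail, so other driver versions may word them differently, and the function name is made up for the example:

```python
import re

# Patterns taken from the qla2xxx/qla2x00 messages quoted in this thread.
LOOP_DOWN = re.compile(r"qla2xxx (\S+): LOOP DOWN detected")
HBA_MOVE = re.compile(r"qla2x00: FROM HBA (\d+) to HBA (\d+)")

SAMPLE = [
    "Feb 19 14:33:33 afsitfs3 kernel: qla2xxx 0000:0a:01.0: LOOP DOWN detected",
    "Feb 19 14:33:38 afsitfs3 kernel: qla2xxx 0000:0a:01.1: LOOP DOWN detected",
    "Feb 19 14:33:46 afsitfs3 kernel: qla2x00: FROM HBA 0 to HBA 1",
    "Feb 19 14:33:52 afsitfs3 kernel: qla2x00: FROM HBA 1 to HBA 1",
]


def summarize(lines):
    """Return (ports reporting LOOP DOWN, list of (from_hba, to_hba) moves)."""
    ports = []
    moves = []
    for line in lines:
        match = LOOP_DOWN.search(line)
        if match:
            ports.append(match.group(1))
            continue
        match = HBA_MOVE.search(line)
        if match:
            moves.append((int(match.group(1)), int(match.group(2))))
    return ports, moves


if __name__ == "__main__":
    down_ports, hba_moves = summarize(SAMPLE)
    print("ports with loop down:", down_ports)
    print("driver failover moves:", hba_moves)
```

When both ports of the adapter report LOOP DOWN, as in the log here, the driver-level failover has nowhere healthy to go, which fits the back-and-forth moves between HBA 0 and HBA 1.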

There is nothing related to the HBA in ha.cf, and the haresources file
is:

afsitfs3.roma1.infn.it  IPaddr2::Y.Y.Y.Y/24/eth0:0
afsitfs3.roma1.infn.it  drbddisk::r0 Filesystem::/dev/drbd1::/vicepa::xfs
afsitfs3.roma1.infn.it  drbddisk::r1 Filesystem::/dev/drbd2::/usr/afs::ext3
afsitfs3.roma1.infn.it  Y.Y.Y.Y afs

There's no resource monitoring with v1. For that you have to go
with v2/Pacemaker (aka CRM).

I also tried to use hbaping by compiling hbaapi_src_2.2, but without
success: I ran into problems during compilation, and I didn't understand
whether I have to use the libHBAAPI.so from hbaapi or the one from the
HBA vendor.

That could work with ipfail, perhaps.

Thanks,

Dejan

Our FC controller is:

QLogic PCI to Fibre Channel Host Adapter for QLA2342:
Firmware version 3.03.25 IPX, Driver version 8.02.14.01-fo

Thanks in advance

cristina



_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems