Hi,

On Tue, Apr 14, 2009 at 10:56:23AM +0200, Cristina Bulfon wrote:
> Ciao,
>
> thanks for the answer ... Dejan has already pointed out the problem with the
> IP.
> That IP is the alias IP for the AFS server. I was also using it with
> IPaddr2 because at the beginning,
> while I was configuring AFS, I had a problem with network communication and I
> thought I would redirect the traffic
> to that IP. I solved that problem but forgot to delete the entry from the
> haresources file
> because that configuration works fine with V1...
>
> Anyway, I corrected the haresources file as follows:
>
> afsitfs3.roma1.infn.it \
>         drbddisk::afs_fs Filesystem::/dev/drbd1::/vicepa/::xfs \
>         drbddisk::afs_sw Filesystem::/dev/drbd2::/usr/afs::ext3 \
>         141.108.26.31 afs
>
> and created the cib.xml. I no longer get the error, but AFS
> starts and stops continuously.

Probably an afs issue. What do you see in the logs?

Dejan

> cristina
>
> On Apr 14, 2009, at 10:38 AM, Andrew Beekhof wrote:
>
>> On Fri, Apr 10, 2009 at 12:25, Cristina Bulfon
>> <[email protected]> wrote:
>>> Dejan,
>>>
>>> I've followed your advice and moved to V2; first the software was
>>> updated to version 2.1.4.
>>> I modified only the following files:
>>>
>>> - ha.cf, added the line
>>>         crm yes
>>>
>>> - cib.xml has been produced using the python script and my haresources
>>>
>>>        afsitfs3.roma1.infn.it IPaddr2::141.108.26.31/24/eth0:0
>>>        afsitfs3.roma1.infn.it drbddisk::afs_fs Filesystem::/dev/drbd1::/vicepa::xfs
>>>        afsitfs3.roma1.infn.it drbddisk::afs_sw Filesystem::/dev/drbd2::/usr/afs::ext3
>>>        afsitfs3.roma1.infn.it 141.108.26.31 afs
>>>
>>>
>>> With this configuration I get a lot of errors and the AFS
>>> resource doesn't work.
>>
>> Looks to me like the ip address is the one that doesn't work.  Did you
>> actually read the output you pasted below?
>>
>> You might want to double check the nic and netmask attributes, they're
>> probably swapped around.
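One way to double-check the attribute order is to split the haresources entry by hand and compare the result with the `netmask` and `nic` values in cib.xml. The following is a minimal sketch, assuming the `ip/netmask/nic` field order used in the haresources line quoted in this thread; the helper names `parse_ipaddr_spec` and `looks_swapped` are hypothetical and are not part of Heartbeat.

```python
from typing import NamedTuple


class IPaddrSpec(NamedTuple):
    ip: str
    netmask: str
    nic: str


def parse_ipaddr_spec(entry: str) -> IPaddrSpec:
    """Split 'IPaddr2::<ip>/<netmask>/<nic>' into its three fields.

    Assumption: the fields appear in ip/netmask/nic order, as in the
    haresources line from this thread.
    """
    _, _, spec = entry.partition("::")
    parts = spec.split("/")
    if len(parts) < 3:
        raise ValueError(f"expected ip/netmask/nic, got {spec!r}")
    return IPaddrSpec(ip=parts[0], netmask=parts[1], nic=parts[2])


def looks_swapped(netmask: str, nic: str) -> bool:
    """Return True when 'nic' holds a prefix length and 'netmask' does not."""
    def is_prefix_len(value: str) -> bool:
        return value.isdigit() and 0 <= int(value) <= 32

    return not is_prefix_len(netmask) and is_prefix_len(nic)


spec = parse_ipaddr_spec("IPaddr2::141.108.26.31/24/eth0:0")
print(spec)
# Values in the order given in haresources
print(looks_swapped(netmask=spec.netmask, nic=spec.nic))
# The same values with the two attributes exchanged
print(looks_swapped(netmask=spec.nic, nic=spec.netmask))
```

The check is deliberately crude: it only flags the case where the interface name and the prefix length have traded places, which is the situation suggested above.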
>>
>>>
>>> - crm_verify -L  -x /var/lib/heartbeat/crm/cib.xml
>>>
>>> crm_verify[30489]: 2009/04/10_12:20:01 ERROR: unpack_rsc_op: Hard error:
>>> IPaddr2_1_monitor_0 failed with rc=2.
>>> crm_verify[30489]: 2009/04/10_12:20:01 ERROR: unpack_rsc_op:   Preventing
>>> IPaddr2_1 from re-starting on afsitfs4.roma1.infn.it
>>> crm_verify[30489]: 2009/04/10_12:20:01 ERROR: unpack_rsc_op: Hard error:
>>> IPaddr2_1_monitor_0 failed with rc=2.
>>> crm_verify[30489]: 2009/04/10_12:20:01 ERROR: unpack_rsc_op:   Preventing
>>> IPaddr2_1 from re-starting on afsitfs3.roma1.infn.it
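The `rc=2` in the messages above can be read against the standard OCF resource agent return codes, where 2 means the agent rejected its arguments. A `monitor` operation with interval 0 is the initial probe. Below is a small sketch that decodes such a line; the regular expression and the `explain_failure` helper are illustrative assumptions, not part of `crm_verify`.

```python
import re

# Return codes defined for OCF resource agents.
OCF_RETURN_CODES = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
}

# Matches operation keys of the form <resource>_<operation>_<interval>,
# e.g. "IPaddr2_1_monitor_0 failed with rc=2" (pattern is an assumption
# based on the log lines quoted in this thread).
FAILED_OP = re.compile(
    r"(?P<rsc>\S+)_(?P<op>[a-z]+)_(?P<interval>\d+) failed with rc=(?P<rc>\d+)"
)


def explain_failure(log_text):
    """Return (resource, operation, symbolic return code), or None if no match."""
    match = FAILED_OP.search(log_text)
    if match is None:
        return None
    rc = int(match.group("rc"))
    return match.group("rsc"), match.group("op"), OCF_RETURN_CODES.get(rc, "unknown")


line = "ERROR: unpack_rsc_op: Hard error: IPaddr2_1_monitor_0 failed with rc=2."
print(explain_failure(line))
```

An argument error from the probe is consistent with the suggestion to re-check the `nic` and `netmask` attributes of the IPaddr2 resource.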
>>>
>>> I've attached both cib.xml, ha-log and ha.cf
>>>
>>> Thanks for helping me
>>>
>>> cristina
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Apr 8, 2009, at 5:50 PM, Cristina Bulfon wrote:
>>>
>>>> Dejan,
>>>>
>>>> thanks so much for the explanation :-)
>>>>
>>>> c.
>>>>
>>>> On Apr 8, 2009, at 5:46 PM, Dejan Muhamedagic wrote:
>>>>
>>>>> Ciao,
>>>>>
>>>>> On Wed, Apr 08, 2009 at 04:17:45PM +0200, Cristina Bulfon wrote:
>>>>>>
>>>>>> Ciao Dejan,
>>>>>>
>>>>>> thanks for the answer.
>>>>>> Do you mean that I have to use heartbeat V2 plus CRM, and that there is a
>>>>>> way to check the HBA without using hbaping?
>>>>>
>>>>> Unlike Heartbeat v1, CRM/v2 can monitor resources. I suppose that
>>>>> in your case, a failing HBA would cause drbd or Filesystem
>>>>> monitor action to fail, which would result in either a failover
>>>>> or restart, depending on the configuration.
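To make the idea of a monitored resource concrete, here is a rough sketch that builds a Filesystem primitive with a recurring `monitor` operation, using Python's standard `xml.etree.ElementTree`. The element layout (`operations/op` and `instance_attributes/attributes/nvpair`) is an assumption about the Heartbeat 2.1.x CIB format and should be checked against the cib.xml that the conversion script actually produced. The resource id, interval, and timeout are hypothetical values.

```python
import xml.etree.ElementTree as ET


def filesystem_primitive(rsc_id, device, directory, fstype,
                         interval="120s", timeout="60s"):
    """Build a <primitive> element with a recurring monitor operation.

    Assumption: layout follows the Heartbeat 2.1.x CIB style; ids,
    interval and timeout are illustrative only.
    """
    primitive = ET.Element("primitive", {
        "id": rsc_id, "class": "ocf",
        "provider": "heartbeat", "type": "Filesystem",
    })
    operations = ET.SubElement(primitive, "operations")
    # The recurring monitor is what lets the CRM notice a failed mount.
    ET.SubElement(operations, "op", {
        "id": f"{rsc_id}_mon", "name": "monitor",
        "interval": interval, "timeout": timeout,
    })
    inst = ET.SubElement(primitive, "instance_attributes",
                         {"id": f"{rsc_id}_inst_attr"})
    attrs = ET.SubElement(inst, "attributes")
    for name, value in (("device", device),
                        ("directory", directory),
                        ("fstype", fstype)):
        ET.SubElement(attrs, "nvpair", {
            "id": f"{rsc_id}_attr_{name}", "name": name, "value": value,
        })
    return primitive


primitive = filesystem_primitive("Filesystem_vicepa", "/dev/drbd1", "/vicepa", "xfs")
print(ET.tostring(primitive, encoding="unicode"))
```

Whether a failed monitor leads to a restart in place or a move to the other node depends on the rest of the cluster configuration, which this sketch does not cover.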
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Dejan
>>>>>
>>>>>> I just want to be sure I have understood correctly. I am a newbie with
>>>>>> heartbeat V2.
>>>>>>
>>>>>> thanks
>>>>>>
>>>>>> cristina
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mar 31, 2009, at 2:00 PM, Dejan Muhamedagic wrote:
>>>>>>
>>>>>>> Ciao,
>>>>>>>
>>>>>>> On Tue, Mar 31, 2009 at 01:48:47PM +0200, Cristina Bulfon wrote:
>>>>>>>>
>>>>>>>> Ciao,
>>>>>>>>
>>>>>>>> in our heartbeat cluster we simulated an HBA failure by
>>>>>>>> unplugging the fiber from the HBA on the primary node. The resources
>>>>>>>> didn't switch to the secondary node, and the log file on the primary node
>>>>>>>> reported the following messages:
>>>>>>>>
>>>>>>>> Feb 19 14:33:33 afsitfs3 kernel: qla2xxx 0000:0a:01.0: LOOP DOWN detected (2 e678 16ed).
>>>>>>>> Feb 19 14:33:38 afsitfs3 kernel: qla2xxx 0000:0a:01.1: LOOP DOWN detected (2 8633 16fc).
>>>>>>>> Feb 19 14:33:46 afsitfs3 kernel: qla2x00: FAILOVER device 2 from
>>>>>>>> 200500a0b832d169 -> 200400a0b832d16a - LUN 10, reason=0x2
>>>>>>>> Feb 19 14:33:46 afsitfs3 kernel: qla2x00: FROM HBA 0 to HBA 1
>>>>>>>> Feb 19 14:33:52 afsitfs3 kernel: qla2x00: FAILOVER device 2 from
>>>>>>>> 200400a0b832d16a -> 200500a0b832d16a - LUN 10, reason=0x2
>>>>>>>> Feb 19 14:33:52 afsitfs3 kernel: qla2x00: FROM HBA 1 to HBA 1
>>>>>>>> Feb 19 14:33:55 afsitfs3 kernel: qla2x00: FAILOVER device 2 from
>>>>>>>> 200500a0b832d16a -> 200400a0b832d169 - LUN 10, reason=0x2
>>>>>>>> Feb 19 14:33:55 afsitfs3 kernel: qla2x00: FROM HBA 1 to HBA 0
>>>>>>>> Feb 19 14:33:58 afsitfs3 kernel: qla2x00: FAILOVER device 2 from
>>>>>>>> 200400a0b832d169 -> 200500a0b832d169 - LUN 10, reason=0x2
>>>>>>>> Feb 19 14:33:58 afsitfs3 kernel: qla2x00: FROM HBA 0 to HBA 0
>>>>>>>> Feb 19 14:34:01 afsitfs3 kernel: qla2x00: FAILOVER device 2 from
>>>>>>>> 200500a0b832d169 -> 200400a0b832d16a - LUN 10, reason=0x2
>>>>>>>>
>>>>>>>> I somewhat expected these messages, but I do not understand
>>>>>>>> why the secondary node doesn't take control of the resources.
>>>>>>>>
>>>>>>>> There is nothing in ha.cf related to the HBA, and the haresources
>>>>>>>> file is
>>>>>>>>
>>>>>>>> afsitfs3.roma1.infn.it  IPaddr2::Y.Y.Y.Y/24/eth0:0
>>>>>>>> afsitfs3.roma1.infn.it  drbddisk::r0 Filesystem::/dev/drbd1::/vicepa::xfs
>>>>>>>> afsitfs3.roma1.infn.it  drbddisk::r1 Filesystem::/dev/drbd2::/usr/afs::ext3
>>>>>>>> afsitfs3.roma1.infn.it         Y.Y.Y.Y   afs
>>>>>>>
>>>>>>> There's no resource monitoring with v1. For that you have to go
>>>>>>> with v2/Pacemaker (aka CRM).
>>>>>>>
>>>>>>>> I also tried to use hbaping, compiling hbaapi_src_2.2, but without
>>>>>>>> success: I got problems during compilation and I didn't understand
>>>>>>>> whether I have to use libHBAAPI.so from hbaapi or from the HBA vendor.
>>>>>>>
>>>>>>> That could work with ipfail, perhaps.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Dejan
>>>>>>>
>>>>>>>> Our FC controller is
>>>>>>>>        QLogic PCI to Fibre Channel Host Adapter for QLA2342:
>>>>>>>>        Firmware version 3.03.25 IPX, Driver version 8.02.14.01-fo
>>>>>>>>
>>>>>>>> Thanks in advance
>>>>>>>>
>>>>>>>> cristina
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Linux-HA mailing list
>>>>>>>> [email protected]
>>>>>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>>>>>>> See also: http://linux-ha.org/ReportingProblems
