Hi,

On Tue, Apr 14, 2009 at 10:56:23AM +0200, Cristina Bulfon wrote:
> Ciao,
>
> thanks for the answer ... Dejan has already pointed out the problem
> with the IP.
> That IP is the alias IP for the AFS server. I was also using it with
> IPaddr2 because at the beginning, while I was configuring AFS, I had a
> problem with network communication and I thought to redirect the
> traffic to that IP. I've solved that problem, but I forgot to delete
> the entry in the haresources file because that configuration worked
> fine with V1...
>
> Anyway, I corrected the haresources file as follows
>
> afsitfs3.roma1.infn.it \
> drbddisk::afs_fs Filesystem::/dev/drbd1::/vicepa/::xfs \
> drbddisk::afs_sw Filesystem::/dev/drbd2::/usr/afs::ext3 \
> 141.108.26.31 afs
>
> and created the cib.xml. I no longer get the error, but AFS starts and
> stops continuously

Probably an afs issue. What do you see in the logs?

Dejan

> cristina
>
> On Apr 14, 2009, at 10:38 AM, Andrew Beekhof wrote:
>
>> On Fri, Apr 10, 2009 at 12:25, Cristina Bulfon
>> <[email protected]> wrote:
>>> Dejan,
>>>
>>> I've followed your advice and moved to V2; first the software was
>>> updated to version 2.1.4.
>>> I just modified the following files
>>>
>>> - ha.cf, added the line
>>> crm yes
>>>
>>> - cib.xml has been produced using the python script and my haresources
>>>
>>> afsitfs3.roma1.infn.it IPaddr2::141.108.26.31/24/eth0:0
>>> afsitfs3.roma1.infn.it drbddisk::afs_fs Filesystem::/dev/drbd1::/vicepa::xfs
>>> afsitfs3.roma1.infn.it drbddisk::afs_sw Filesystem::/dev/drbd2::/usr/afs::ext3
>>> afsitfs3.roma1.infn.it 141.108.26.31 afs
>>>
>>> With this kind of configuration I get a lot of errors and the AFS
>>> resource doesn't work
>>
>> Looks to me like the ip address is the one that doesn't work. Did you
>> actually read the output you pasted below?
>>
>> You might want to double check the nic and netmask attributes, they're
>> probably swapped around.
>>
>>>
>>> - crm_verify -L -x /var/lib/heartbeat/crm/cib.xml
>>>
>>> crm_verify[30489]: 2009/04/10_12:20:01 ERROR: unpack_rsc_op: Hard error: IPaddr2_1_monitor_0 failed with rc=2.
>>> crm_verify[30489]: 2009/04/10_12:20:01 ERROR: unpack_rsc_op: Preventing IPaddr2_1 from re-starting on afsitfs4.roma1.infn.it
>>> crm_verify[30489]: 2009/04/10_12:20:01 ERROR: unpack_rsc_op: Hard error: IPaddr2_1_monitor_0 failed with rc=2.
>>> crm_verify[30489]: 2009/04/10_12:20:01 ERROR: unpack_rsc_op: Preventing IPaddr2_1 from re-starting on afsitfs3.roma1.infn.it
>>>
>>> I've attached cib.xml, ha-log and ha.cf
>>>
>>> Thanks for helping me
>>>
>>> cristina
>>>
>>> On Apr 8, 2009, at 5:50 PM, Cristina Bulfon wrote:
>>>
>>>> Dejan,
>>>>
>>>> thanks so much for the explanation :-)
>>>>
>>>> c.
>>>>
>>>> On Apr 8, 2009, at 5:46 PM, Dejan Muhamedagic wrote:
>>>>
>>>>> Ciao,
>>>>>
>>>>> On Wed, Apr 08, 2009 at 04:17:45PM +0200, Cristina Bulfon wrote:
>>>>>>
>>>>>> Ciao Dejan,
>>>>>>
>>>>>> thanks for the answer.
>>>>>> Do you mean that I have to use heartbeat V2 plus CRM, and that
>>>>>> there is a way to check the HBA without using hbaping?
>>>>>
>>>>> Unlike Heartbeat v1, CRM/v2 can monitor resources. I suppose that
>>>>> in your case, a failing HBA would cause drbd or Filesystem
>>>>> monitor action to fail, which would result in either a failover
>>>>> or restart, depending on the configuration.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Dejan
>>>>>
>>>>>> Just to be sure I have understood correctly. I am a newbie on
>>>>>> heartbeat V2
>>>>>>
>>>>>> thanks
>>>>>>
>>>>>> cristina
>>>>>>
>>>>>> On Mar 31, 2009, at 2:00 PM, Dejan Muhamedagic wrote:
>>>>>>
>>>>>>> Ciao,
>>>>>>>
>>>>>>> On Tue, Mar 31, 2009 at 01:48:47PM +0200, Cristina Bulfon wrote:
>>>>>>>>
>>>>>>>> Ciao,
>>>>>>>>
>>>>>>>> in our heartbeat cluster we have simulated the breaking of the
>>>>>>>> HBA by unplugging the fiber from the HBA on the primary node.
>>>>>>>> The resources didn't switch to the secondary node, and the log
>>>>>>>> file on the primary node reported the following messages:
>>>>>>>>
>>>>>>>> Feb 19 14:33:33 afsitfs3 kernel: qla2xxx 0000:0a:01.0: LOOP DOWN detected (2 e678 16ed).
>>>>>>>> Feb 19 14:33:38 afsitfs3 kernel: qla2xxx 0000:0a:01.1: LOOP DOWN detected (2 8633 16fc).
>>>>>>>> Feb 19 14:33:46 afsitfs3 kernel: qla2x00: FAILOVER device 2 from 200500a0b832d169 -> 200400a0b832d16a - LUN 10, reason=0x2
>>>>>>>> Feb 19 14:33:46 afsitfs3 kernel: qla2x00: FROM HBA 0 to HBA 1
>>>>>>>> Feb 19 14:33:52 afsitfs3 kernel: qla2x00: FAILOVER device 2 from 200400a0b832d16a -> 200500a0b832d16a - LUN 10, reason=0x2
>>>>>>>> Feb 19 14:33:52 afsitfs3 kernel: qla2x00: FROM HBA 1 to HBA 1
>>>>>>>> Feb 19 14:33:55 afsitfs3 kernel: qla2x00: FAILOVER device 2 from 200500a0b832d16a -> 200400a0b832d169 - LUN 10, reason=0x2
>>>>>>>> Feb 19 14:33:55 afsitfs3 kernel: qla2x00: FROM HBA 1 to HBA 0
>>>>>>>> Feb 19 14:33:58 afsitfs3 kernel: qla2x00: FAILOVER device 2 from 200400a0b832d169 -> 200500a0b832d169 - LUN 10, reason=0x2
>>>>>>>> Feb 19 14:33:58 afsitfs3 kernel: qla2x00: FROM HBA 0 to HBA 0
>>>>>>>> Feb 19 14:34:01 afsitfs3 kernel: qla2x00: FAILOVER device 2 from 200500a0b832d169 -> 200400a0b832d16a - LUN 10, reason=0x2
>>>>>>>>
>>>>>>>> In some way I expected this kind of message, but I do not
>>>>>>>> understand why the secondary node doesn't take control of the
>>>>>>>> resources.
>>>>>>>>
>>>>>>>> In ha.cf there is nothing related to the HBA, and the
>>>>>>>> haresources file is
>>>>>>>>
>>>>>>>> afsitfs3.roma1.infn.it IPaddr2::Y.Y.Y.Y/24/eth0:0
>>>>>>>> afsitfs3.roma1.infn.it drbddisk::r0 Filesystem::/dev/drbd1::/vicepa::xfs
>>>>>>>> afsitfs3.roma1.infn.it drbddisk::r1 Filesystem::/dev/drbd2::/usr/afs::ext3
>>>>>>>> afsitfs3.roma1.infn.it Y.Y.Y.Y afs
>>>>>>>
>>>>>>> There's no resource monitoring with v1. For that you have to go
>>>>>>> with v2/Pacemaker (aka CRM).
>>>>>>>
>>>>>>>> I also tried to use hbaping, compiling hbaapi_src_2.2, but
>>>>>>>> without success: I got problems during compilation, and I didn't
>>>>>>>> understand whether I have to use libHBAAPI.so from hbaapi or
>>>>>>>> from the HBA vendor.
>>>>>>>
>>>>>>> That could work with ipfail, perhaps.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Dejan
>>>>>>>
>>>>>>>> Our FC controller is
>>>>>>>> Logic PCI to Fibre Channel Host Adapter for QLA2342:
>>>>>>>> Firmware version 3.03.25 IPX, Driver version 8.02.14.01-fo
>>>>>>>>
>>>>>>>> Thanks in advance
>>>>>>>>
>>>>>>>> cristina
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Linux-HA mailing list
>>>>>>>> [email protected]
>>>>>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>>>>>>> See also: http://linux-ha.org/ReportingProblems
