Hi Dejan. Thanks for your answer. I am still having trouble, so I am
posting my results and experiences below. Any help would be
appreciated.
Dejan Muhamedagic wrote:
>> <primitive id="db-sql1-shooter" class="stonith" type="external/ipmi"
>> provider="heartbeat">
>
> There are no providers for class stonith. Just drop that
> attribute.
OK, done. Good to know, but it changes nothing. :) Thanks.
>> <operations>
>> <op id="op-sql1-shooter-stop" name="stop" timeout="60s"/>
>> <op id="op-sql1-shooter-start" name="start" timeout="30s"/>
>> <op id="op-sql1-shooter-monitor" name="monitor" timeout="5s"
>> interval="10s"/>
>
> This monitor timeout (interval too) is way too short. How likely
> is it that your stonith device fails within 10 seconds when it's
> actually required to reset a node? The start timeout should equal
> the monitor timeout.
Agreed. I changed the timeouts and the interval to the values below, but
that didn't remove my errors either.
<operations>
<op id="op-sql3-shooter-stop" name="stop" timeout="120"/>
<op id="op-sql3-shooter-start" name="start" timeout="120s"/>
<op id="op-sql3-shooter-monitor" name="monitor" timeout="120s"
interval="180s"/>
</operations>
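Dejan's timing rule (the start timeout should equal the monitor timeout) can be checked mechanically. The following is only a minimal Python sketch, not part of Heartbeat: `to_seconds` and `check_operations` are hypothetical helper names, and the sketch assumes that a bare number such as `120` means seconds, which should be verified against the CRM version in use. It also flags values written without a unit, purely as a consistency check.

```python
import xml.etree.ElementTree as ET

# The <operations> fragment as posted above.
OPERATIONS_XML = """
<operations>
  <op id="op-sql3-shooter-stop" name="stop" timeout="120"/>
  <op id="op-sql3-shooter-start" name="start" timeout="120s"/>
  <op id="op-sql3-shooter-monitor" name="monitor" timeout="120s" interval="180s"/>
</operations>
"""


def to_seconds(value):
    """Convert '120s', '500ms' or '120' to seconds.

    ASSUMPTION: a bare number means seconds; check your CRM version.
    """
    text = value.strip()
    if text.endswith("ms"):
        return int(text[:-2]) / 1000.0
    if text.endswith("s"):
        return float(int(text[:-1]))
    return float(int(text))


def check_operations(xml_text):
    """Return a list of human-readable findings for an <operations> block."""
    ops = {op.get("name"): op for op in ET.fromstring(xml_text).iter("op")}
    problems = []

    # Rule from the thread: start timeout should equal monitor timeout.
    start = to_seconds(ops["start"].get("timeout"))
    monitor = to_seconds(ops["monitor"].get("timeout"))
    if start != monitor:
        problems.append(
            f"start timeout ({start:g}s) differs from monitor timeout ({monitor:g}s)"
        )

    # Consistency check (my addition): report values written without a unit.
    for name, op in ops.items():
        for attr in ("timeout", "interval"):
            value = op.get(attr)
            if value is not None and value.strip().isdigit():
                problems.append(f"{name} {attr}={value!r} has no unit")
    return problems


if __name__ == "__main__":
    for problem in check_operations(OPERATIONS_XML):
        print(problem)
```

For the block above, the only finding is that the stop timeout is written as `120` while the other values carry an explicit `s`.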
>> (...) crm_verify -VL points me to several warnings and an error that I am
>> unable to interpret correctly:
>>
>> crm_verify[29480]: 2008/11/20_11:55:03 ERROR: unpack_rsc_op: Remapping
>> db-sql1-shooter_start_0 (rc=1) on db-sql1.ripe.net to an ERROR
>> crm_verify[29480]: 2008/11/20_11:55:03 WARN: unpack_rsc_op: Processing
>> failed op db-sql1-shooter_start_0 on db-sql1.ripe.net: Error
>> crm_verify[29480]: 2008/11/20_11:55:03 WARN: unpack_rsc_op: Compatability
>> handling for failed op db-sql1-shooter_start_0 on db-sql1.ripe.net
>
> The stonith resource failed to start. Your configuration looks
> OK, apart from the way too tight timing constraints. Did you try
> the stonith program with your device and this configuration:
I double-checked the configuration, going through the same process that
helped me build the initial configuration, and incorporated Dejan's
suggestions.
> stonith -d -t external/ipmi ...
The output of "stonith -d" is as follows:
stonith -d -t external/ipmi hostname="db-sql3" ipaddr="db-sql3-ipmi"
userid="root" passwd="XXXXXX" -T reset db-sql3
** (process:2663): DEBUG: NewPILPluginUniv(0x19430010)
** (process:2663): DEBUG: PILS: Plugin path =
/usr/lib64/stonith/plugins:/usr/lib64/pils/plugins
** (process:2663): DEBUG: NewPILInterfaceUniv(0x19431330)
** (process:2663): DEBUG: NewPILPlugintype(0x19430040)
** (process:2663): DEBUG: NewPILPlugin(0x19431e30)
** (process:2663): DEBUG: NewPILInterface(0x19431f40)
** (process:2663): DEBUG:
NewPILInterface(0x19431f40:InterfaceMgr/InterfaceMgr)*** user_data: 0x0
*******
** (process:2663): DEBUG:
InterfaceManager_plugin_init(0x19431f40/InterfaceMgr)
** (process:2663): DEBUG: Registering Implementation manager for
Interface type 'InterfaceMgr'
** (process:2663): DEBUG: PILS: Looking for InterfaceMgr/generic =>
[/usr/lib64/stonith/plugins/InterfaceMgr/generic.so]
** (process:2663): DEBUG: Plugin file
/usr/lib64/stonith/plugins/InterfaceMgr/generic.so does not exist
** (process:2663): DEBUG: PILS: Looking for InterfaceMgr/generic =>
[/usr/lib64/pils/plugins/InterfaceMgr/generic.so]
** (process:2663): DEBUG: Plugin path for InterfaceMgr/generic =>
[/usr/lib64/pils/plugins/InterfaceMgr/generic.so]
** (process:2663): DEBUG: PluginType InterfaceMgr already present
** (process:2663): DEBUG: Plugin InterfaceMgr/generic init function:
InterfaceMgr_LTX_generic_pil_plugin_init
** (process:2663): DEBUG: NewPILPlugin(0x19432800)
** (process:2663): DEBUG: Plugin InterfaceMgr/generic loaded and
constructed.
** (process:2663): DEBUG: Calling init function in plugin
InterfaceMgr/generic.
** (process:2663): DEBUG: NewPILInterface(0x194331f0)
** (process:2663): DEBUG:
NewPILInterface(0x194331f0:InterfaceMgr/stonith2)*** user_data:
0x19432100 *******
** (process:2663): DEBUG: Registering Implementation manager for
Interface type 'stonith2'
** (process:2663): DEBUG: IfIncrRefCount(1 + 1 )
** (process:2663): DEBUG: PluginIncrRefCount(0 + 1 )
** (process:2663): DEBUG: IfIncrRefCount(1 + 100 )
** (process:2663): DEBUG: PILS: Looking for stonith2/external =>
[/usr/lib64/stonith/plugins/stonith2/external.so]
** (process:2663): DEBUG: Plugin path for stonith2/external =>
[/usr/lib64/stonith/plugins/stonith2/external.so]
** (process:2663): DEBUG: Creating PluginType for stonith2
** (process:2663): DEBUG: NewPILPlugintype(0x19433260)
** (process:2663): DEBUG: Plugin stonith2/external init function:
stonith2_LTX_external_pil_plugin_init
** (process:2663): DEBUG: NewPILPlugin(0x194333e0)
** (process:2663): DEBUG: Plugin stonith2/external loaded and constructed.
** (process:2663): DEBUG: Calling init function in plugin stonith2/external.
** (process:2663): DEBUG: NewPILInterface(0x19433b60)
** (process:2663): DEBUG:
NewPILInterface(0x19433b60:stonith2/external)*** user_data:
0x2aaaaaec4738 *******
** (process:2663): DEBUG: IfIncrRefCount(101 + 1 )
** (process:2663): DEBUG: PluginIncrRefCount(0 + 1 )
** (process:2663): DEBUG: external_set_config: called.
** (process:2663): DEBUG: external_get_confignames: called.
** (process:2663): DEBUG: external_run_cmd: Calling
'/usr/lib64/stonith/plugins/external/ipmi getconfignames'
** INFO: external_run_cmd: '/usr/lib64/stonith/plugins/external/ipmi
getconfignames' output: hostname
ipaddr
userid
passwd
** (process:2663): DEBUG: external_get_confignames: 'ipmi
getconfignames' returned 0
** (process:2663): DEBUG: external_get_confignames: ipmi configname hostname
** (process:2663): DEBUG: external_get_confignames: ipmi configname ipaddr
** (process:2663): DEBUG: external_get_confignames: ipmi configname userid
** (process:2663): DEBUG: external_get_confignames: ipmi configname passwd
** (process:2663): DEBUG: external_status: called.
** (process:2663): DEBUG: external_run_cmd: Calling
'/usr/lib64/stonith/plugins/external/ipmi status'
** INFO: external_run_cmd: '/usr/lib64/stonith/plugins/external/ipmi
status' output: Chassis Power is on
** (process:2663): DEBUG: external_status: running 'ipmi status' returned 0
** (process:2663): DEBUG: external_getinfo: called.
** (process:2663): DEBUG: external_run_cmd: Calling
'/usr/lib64/stonith/plugins/external/ipmi getinfo-devid'
** INFO: external_run_cmd: '/usr/lib64/stonith/plugins/external/ipmi
getinfo-devid' output: IPMI STONITH device
** (process:2663): DEBUG: external_getinfo: 'ipmi getinfo-devid' returned 0
** (process:2663): DEBUG: external_reset_req: called.
** (process:2663): DEBUG: Host external-reset initiating on db-sql3
** (process:2663): DEBUG: external_run_cmd: Calling
'/usr/lib64/stonith/plugins/external/ipmi reset db-sql3'
** INFO: external_run_cmd: '/usr/lib64/stonith/plugins/external/ipmi
reset db-sql3' output: Chassis Power Control: Reset
** (process:2663): DEBUG: external_reset_req: running 'ipmi reset'
returned 0
** (process:2663): DEBUG: external_destroy: called.
** (process:2663): DEBUG: IfIncrRefCount(1 + -1 )
** (process:2663): DEBUG: RemoveAPILInterface(0x19433b60/external)
** (process:2663): DEBUG: RmAPILInterface(0x19433b60/external)
** (process:2663): DEBUG: PILunregister_interface(stonith2/external)
** (process:2663): DEBUG: Calling InterfaceClose on stonith2/external
** (process:2663): DEBUG: IfIncrRefCount(102 + -1 )
** (process:2663): DEBUG: PluginIncrRefCount(1 + -1 )
** (process:2663): DEBUG: RemoveAPILPlugin(stonith2/external)
** (process:2663): DEBUG: RmAPILPlugin(stonith2/external)
** (process:2663): DEBUG: Closing dlhandle for (stonith2/external)
** (process:2663): DEBUG: RmAPILPluginType(stonith2)
** (process:2663): DEBUG: DelPILPluginType(stonith2)
** (process:2663): DEBUG: DelPILInterface(0x19433b60/external)
It actually rebooted the host designated as "db-sql3" without problems.
Unfortunately, the errors are still there:
crm_verify[3238]: 2008/11/24_12:10:13 ERROR: unpack_rsc_op: Remapping
db-sql3-shooter_start_0 (rc=1) on db-sql3.ripe.net to an ERROR
crm_verify[3238]: 2008/11/24_12:10:13 WARN: unpack_rsc_op: Processing
failed op db-sql3-shooter_start_0 on db-sql3.ripe.net: Error
crm_verify[3238]: 2008/11/24_12:10:13 WARN: unpack_rsc_op: Compatability
handling for failed op db-sql3-shooter_start_0 on db-sql3.ripe.net
crm_verify[3238]: 2008/11/24_12:10:13 ERROR: unpack_rsc_op: Remapping
db-sql1-shooter_start_0 (rc=1) on db-sql1.ripe.net to an ERROR
crm_verify[3238]: 2008/11/24_12:10:13 WARN: unpack_rsc_op: Processing
failed op db-sql1-shooter_start_0 on db-sql1.ripe.net: Error
crm_verify[3238]: 2008/11/24_12:10:13 WARN: unpack_rsc_op: Compatability
handling for failed op db-sql1-shooter_start_0 on db-sql1.ripe.net
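To make the output above easier to read, here is a minimal Python sketch that rejoins the mail-wrapped lines and lists which operation failed on which node, with its return code. It is only an illustration: `unwrap` and `failed_ops` are hypothetical helper names, and splitting the key `db-sql3-shooter_start_0` into resource, action and interval is my reading of the log format, not something confirmed by the source code.

```python
import re

# A few lines of the crm_verify output above, wrapped as in this mail.
LOG = """\
crm_verify[3238]: 2008/11/24_12:10:13 ERROR: unpack_rsc_op: Remapping
db-sql3-shooter_start_0 (rc=1) on db-sql3.ripe.net to an ERROR
crm_verify[3238]: 2008/11/24_12:10:13 WARN: unpack_rsc_op: Processing
failed op db-sql3-shooter_start_0 on db-sql3.ripe.net: Error
crm_verify[3238]: 2008/11/24_12:10:13 ERROR: unpack_rsc_op: Remapping
db-sql1-shooter_start_0 (rc=1) on db-sql1.ripe.net to an ERROR
"""

# Matches only the "Remapping ... (rc=N) on NODE to an ERROR" records.
REMAP = re.compile(
    r"ERROR: unpack_rsc_op: Remapping (\S+) \(rc=(\d+)\) on (\S+) to an ERROR"
)


def unwrap(text):
    """Join continuation lines onto the preceding 'crm_verify[' record."""
    records = []
    for line in text.splitlines():
        if line.startswith("crm_verify[") or not records:
            records.append(line.strip())
        else:
            records[-1] += " " + line.strip()
    return records


def failed_ops(text):
    """Return one dict per remapped (failed) operation found in the log."""
    failures = []
    for record in unwrap(text):
        match = REMAP.search(record)
        if not match:
            continue
        key, rc, node = match.groups()
        # ASSUMPTION: the key has the form <resource>_<action>_<interval>.
        resource, action, interval = key.rsplit("_", 2)
        failures.append(
            {"resource": resource, "action": action,
             "interval": interval, "rc": int(rc), "node": node}
        )
    return failures


if __name__ == "__main__":
    for failure in failed_ops(LOG):
        print(failure)
```

Run on the excerpt, this reports a failed `start` with `rc=1` for `db-sql3-shooter` on db-sql3.ripe.net and for `db-sql1-shooter` on db-sql1.ripe.net.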
Despite Dejan's help, and although my manual tests work fine, I still
get the warnings and errors, and I can't follow the C code well enough to
tell what it does. I will keep trying to read the code, but I am stuck
once more. Please help?
Many thanks in advance.
Regards.
--
Luis Motta Campos is a software engineer,
Perl Programmer, foodie and photographer.