Hi,

On Thu, Dec 06, 2007 at 01:11:37PM +0100, Papp Tamas wrote:
> hi All,
>
> There is a problem again. The cluster has two nodes, the heartbeat
> version is 2.1.2, on CentOS 5.
>
> This is the resource (it is the same for teszt2; neither of them
> works):
>
> <primitive class="stonith" type="apcsmart" provider="heartbeat"
>            id="stonith_teszt1">
>   <instance_attributes id="stonith_teszt1_instance_attrs">
>     <attributes>
>       <nvpair name="target_role" id="stonith_teszt1_target_role" value="stopped"/>
>       <nvpair id="device_stonith1" name="ttydev" value="/dev/ttyS0"/>
>       <nvpair id="hostname_stonith1" name="hostlist" value="teszt1"/>
>     </attributes>
>   </instance_attributes>
> </primitive>
>
> Constraints:
>
> <rsc_location id="stonith_teszt1_on_teszt2" rsc="stonith_teszt1">
>   <rule id="stonith_teszt1_preferred_teszt2" score="INFINITY">
>     <expression attribute="#uname" operation="eq" value="teszt2"/>
>   </rule>
>   <rule id="stonith_teszt1_nowhere_else_teszt2" score="-INFINITY">
>     <expression attribute="#uname" operation="ne" value="teszt2"/>
>   </rule>
> </rsc_location>
>
> I start the resource and see this in the messages:
>
> $ crm_resource -p target_role -v started -r stonith_teszt1
> crm_resource[3002]: 2007/12/06_13:03:39 info: Invoked: crm_resource -p target_role -v started -r stonith_teszt1
> $
>
> /var/log/messages:
>
> Dec 6 13:03:39 teszt2 pengine: [2958]: info: determine_online_status: Node teszt2 is online
> Dec 6 13:03:39 teszt2 pengine: [2958]: info: determine_online_status: Node teszt1 is online
> Dec 6 13:03:39 teszt2 pengine: [2958]: info: native_print: stonith_teszt1 (stonith:apcsmart): Stopped
> Dec 6 13:03:39 teszt2 pengine: [2958]: info: native_print: stonith_teszt2 (stonith:apcsmart): Stopped
> Dec 6 13:03:39 teszt2 pengine: [2958]: notice: StartRsc: teszt2 Start stonith_teszt1
> Dec 6 13:03:39 teszt2 pengine: [2958]: WARN: native_color: Resource stonith_teszt2 cannot run anywhere
> Dec 6 13:03:39 teszt2 crmd: [2866]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
> Dec 6 13:03:39 teszt2 tengine: [2957]: info: unpack_graph: Unpacked transition 3: 1 actions in 1 synapses
> Dec 6 13:03:39 teszt2 pengine: [2958]: WARN: process_pe_message: Transition 3: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/heartbeat/pengine/pe-warn-464.bz2
> Dec 6 13:03:39 teszt2 tengine: [2957]: info: send_rsc_command: Initiating action 4: stonith_teszt1_start_0 on teszt2
> Dec 6 13:03:39 teszt2 pengine: [2958]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
> Dec 6 13:03:40 teszt2 crmd: [2866]: info: do_lrm_rsc_op: Performing op=stonith_teszt1_start_0 key=4:3:e21ebe4d-a057-4264-8905-05ab83dad327)
> Dec 6 13:03:40 teszt2 lrmd: [3004]: info: Try to start STONITH resource <rsc_id=stonith_teszt1> : Device=apcsmart
> Dec 6 13:03:43 teszt2 crmd: [2866]: ERROR: process_lrm_event: LRM operation stonith_teszt1_start_0 (call=4, rc=1) Error unknown error
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: build_operation_update: Digest for 4:1;4:3:e21ebe4d-a057-4264-8905-05ab83dad327 (stonith_teszt1_start_0) was 007459368f704b2551a6f4d6156433b0
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: log_data_element: build_operation_update: digest:source <parameters target_role="started" ttydev="/dev/ttyS0" hostlist="teszt1"/>
> Dec 6 13:03:43 teszt2 lrmd: [2863]: WARN: stonithRA plugin: cannot get shortdesc segment of apcsmart's metadata.
> Dec 6 13:03:43 teszt2 tengine: [2957]: WARN: status_from_rc: Action start on teszt2 failed (target: <null> vs. rc: 1): Error
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: update_abort_priority: Abort priority upgraded to 1
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: update_abort_priority: Abort action 0 superceeded by 2
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: match_graph_event: Action stonith_teszt1_start_0 (4) confirmed on teszt2
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: run_graph: Transition 3: (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0)
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_IPC_MESSAGE origin=route_message ]
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'stop' for cluster option 'no-quorum-policy'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'true' for cluster option 'symmetric-cluster'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'false' for cluster option 'stonith-enabled'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'reboot' for cluster option 'stonith-action'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '0' for cluster option 'default-resource-stickiness'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '0' for cluster option 'default-resource-failure-stickiness'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'true' for cluster option 'is-managed-default'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '60s' for cluster option 'cluster-delay'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '20s' for cluster option 'default-action-timeout'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'true' for cluster option 'stop-orphan-resources'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'true' for cluster option 'stop-orphan-actions'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'false' for cluster option 'remove-after-stop'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '-1' for cluster option 'pe-error-series-max'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '-1' for cluster option 'pe-warn-series-max'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '-1' for cluster option 'pe-input-series-max'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'true' for cluster option 'startup-fencing'
> Dec 6 13:03:43 teszt2 pengine: [2958]: info: determine_online_status: Node teszt2 is online
> Dec 6 13:03:43 teszt2 pengine: [2958]: WARN: unpack_rsc_op: Processing failed op (stonith_teszt1_start_0) on teszt2
> Dec 6 13:03:43 teszt2 pengine: [2958]: WARN: unpack_rsc_op: Handling failed start for stonith_teszt1 on teszt2
> Dec 6 13:03:43 teszt2 pengine: [2958]: info: determine_online_status: Node teszt1 is online
> Dec 6 13:03:43 teszt2 pengine: [2958]: info: native_print: stonith_teszt1 (stonith:apcsmart): Started teszt2 FAILED
> Dec 6 13:03:43 teszt2 pengine: [2958]: info: native_print: stonith_teszt2 (stonith:apcsmart): Stopped
> Dec 6 13:03:43 teszt2 pengine: [2958]: WARN: native_color: Resource stonith_teszt1 cannot run anywhere
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: StopRsc: teszt2 Stop stonith_teszt1
> Dec 6 13:03:43 teszt2 pengine: [2958]: WARN: native_color: Resource stonith_teszt2 cannot run anywhere
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
> Dec 6 13:03:43 teszt2 pengine: [2958]: WARN: process_pe_message: Transition 4: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/heartbeat/pengine/pe-warn-465.bz2
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: unpack_graph: Unpacked transition 4: 1 actions in 1 synapses
> Dec 6 13:03:43 teszt2 pengine: [2958]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: send_rsc_command: Initiating action 1: stonith_teszt1_stop_0 on teszt2
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: do_lrm_rsc_op: Performing op=stonith_teszt1_stop_0 key=1:4:e21ebe4d-a057-4264-8905-05ab83dad327)
> Dec 6 13:03:43 teszt2 lrmd: [3006]: info: Try to stop STONITH resource <rsc_id=stonith_teszt1> : Device=apcsmart
> Dec 6 13:03:43 teszt2 stonithd: [2864]: notice: try to stop a resource stonith_teszt1 who is not in started resource queue.
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: process_lrm_event: LRM operation stonith_teszt1_stop_0 (call=5, rc=0) complete
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: match_graph_event: Action stonith_teszt1_stop_0 (1) confirmed on teszt2
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: run_graph: Transition 4: (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0)
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: notify_crmd: Transition 4 status: te_complete - <null>
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
>
> So I think this should shut down the UPS (which supplies the power to teszt1).
No. Starting a stonith resource only makes the device available to the
cluster. The policy engine decides to actually stonith (reset) a node
when it finds that necessary.

> What's wrong, what do I miss?

The error you encountered is probably from the monitor operation (there
is a monitor implied in the start). Did you try it with stonith (the
program)?

# stonith -t apcsmart ttydev=/dev/ttyS0 hostlist=teszt1 -S
# stonith -t apcsmart ttydev=/dev/ttyS0 hostlist=teszt1 -l

(Perhaps you'll have to tweak the parameters/options a bit; see the
small wrapper sketch below the quoted text.)

Thanks,

Dejan

> Thank you,
>
> tamas
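A minimal wrapper sketch for those two checks, assuming the ttydev and
hostlist values from your configuration (adjust DEV and HOST to your
setup; this only automates the two stonith calls above):

  #!/bin/sh
  # Probe the apcsmart stonith plugin with the same parameters
  # the cluster passes to the stonith_teszt1 resource.
  DEV=/dev/ttyS0   # serial port the APC smart UPS is attached to
  HOST=teszt1      # node this device is supposed to power-cycle

  # -S reports the status of the stonith device itself
  if ! stonith -t apcsmart ttydev="$DEV" hostlist="$HOST" -S; then
      echo "device status check failed; check cabling and permissions on $DEV" >&2
      exit 1
  fi

  # -l lists the hosts this device can control; teszt1 must appear
  # here, otherwise the cluster can never fence it with this device
  stonith -t apcsmart ttydev="$DEV" hostlist="$HOST" -l

If both checks pass here but the resource still fails to start, the
problem is more likely in how the cluster invokes the plugin than in
the device itself.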
