Hi,

On Thu, Dec 06, 2007 at 01:11:37PM +0100, Papp Tamas wrote:
> hi All,
>
> There is a problem again. The cluster has two nodes, the heartbeat
> version is 2.1.2, on CentOS 5.
>
> This is the resource (it is the same for teszt2; neither of them
> works):
>
> <primitive class="stonith" type="apcsmart" provider="heartbeat"
>            id="stonith_teszt1">
>   <instance_attributes id="stonith_teszt1_instance_attrs">
>     <attributes>
>       <nvpair name="target_role" id="stonith_teszt1_target_role" value="stopped"/>
>       <nvpair id="device_stonith1" name="ttydev" value="/dev/ttyS0"/>
>       <nvpair id="hostname_stonith1" name="hostlist" value="teszt1"/>
>     </attributes>
>   </instance_attributes>
> </primitive>
>
> Constraints:
>
> <rsc_location id="stonith_teszt1_on_teszt2" rsc="stonith_teszt1">
>   <rule id="stonith_teszt1_preferred_teszt2" score="INFINITY">
>     <expression attribute="#uname" operation="eq" value="teszt2"/>
>   </rule>
>   <rule id="stonith_teszt1_nowhere_else_teszt2" score="-INFINITY">
>     <expression attribute="#uname" operation="ne" value="teszt2"/>
>   </rule>
> </rsc_location>
>
> I start the resource and see this in the messages:
>
> $ crm_resource -p target_role -v started -r stonith_teszt1
> crm_resource[3002]: 2007/12/06_13:03:39 info: Invoked: crm_resource -p target_role -v started -r stonith_teszt1
> $
>
> /var/log/messages:
>
> Dec 6 13:03:39 teszt2 pengine: [2958]: info: determine_online_status: Node teszt2 is online
> Dec 6 13:03:39 teszt2 pengine: [2958]: info: determine_online_status: Node teszt1 is online
> Dec 6 13:03:39 teszt2 pengine: [2958]: info: native_print: stonith_teszt1 (stonith:apcsmart): Stopped
> Dec 6 13:03:39 teszt2 pengine: [2958]: info: native_print: stonith_teszt2 (stonith:apcsmart): Stopped
> Dec 6 13:03:39 teszt2 pengine: [2958]: notice: StartRsc: teszt2 Start stonith_teszt1
> Dec 6 13:03:39 teszt2 pengine: [2958]: WARN: native_color: Resource stonith_teszt2 cannot run anywhere
> Dec 6 13:03:39 teszt2 crmd: [2866]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
> Dec 6 13:03:39 teszt2 tengine: [2957]: info: unpack_graph: Unpacked transition 3: 1 actions in 1 synapses
> Dec 6 13:03:39 teszt2 pengine: [2958]: WARN: process_pe_message: Transition 3: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/heartbeat/pengine/pe-warn-464.bz2
> Dec 6 13:03:39 teszt2 tengine: [2957]: info: send_rsc_command: Initiating action 4: stonith_teszt1_start_0 on teszt2
> Dec 6 13:03:39 teszt2 pengine: [2958]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
> Dec 6 13:03:40 teszt2 crmd: [2866]: info: do_lrm_rsc_op: Performing op=stonith_teszt1_start_0 key=4:3:e21ebe4d-a057-4264-8905-05ab83dad327)
> Dec 6 13:03:40 teszt2 lrmd: [3004]: info: Try to start STONITH resource <rsc_id=stonith_teszt1> : Device=apcsmart
> Dec 6 13:03:43 teszt2 crmd: [2866]: ERROR: process_lrm_event: LRM operation stonith_teszt1_start_0 (call=4, rc=1) Error unknown error
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: build_operation_update: Digest for 4:1;4:3:e21ebe4d-a057-4264-8905-05ab83dad327 (stonith_teszt1_start_0) was 007459368f704b2551a6f4d6156433b0
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: log_data_element: build_operation_update: digest:source <parameters target_role="started" ttydev="/dev/ttyS0" hostlist="teszt1"/>
> Dec 6 13:03:43 teszt2 lrmd: [2863]: WARN: stonithRA plugin: cannot get shortdesc segment of apcsmart's metadata.
> Dec 6 13:03:43 teszt2 tengine: [2957]: WARN: status_from_rc: Action start on teszt2 failed (target: <null> vs. rc: 1): Error
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: update_abort_priority: Abort priority upgraded to 1
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: update_abort_priority: Abort action 0 superceeded by 2
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: match_graph_event: Action stonith_teszt1_start_0 (4) confirmed on teszt2
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: run_graph: Transition 3: (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0)
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_IPC_MESSAGE origin=route_message ]
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'stop' for cluster option 'no-quorum-policy'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'true' for cluster option 'symmetric-cluster'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'false' for cluster option 'stonith-enabled'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'reboot' for cluster option 'stonith-action'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '0' for cluster option 'default-resource-stickiness'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '0' for cluster option 'default-resource-failure-stickiness'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'true' for cluster option 'is-managed-default'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '60s' for cluster option 'cluster-delay'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '20s' for cluster option 'default-action-timeout'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'true' for cluster option 'stop-orphan-resources'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'true' for cluster option 'stop-orphan-actions'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'false' for cluster option 'remove-after-stop'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '-1' for cluster option 'pe-error-series-max'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '-1' for cluster option 'pe-warn-series-max'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value '-1' for cluster option 'pe-input-series-max'
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: cluster_option: Using default value 'true' for cluster option 'startup-fencing'
> Dec 6 13:03:43 teszt2 pengine: [2958]: info: determine_online_status: Node teszt2 is online
> Dec 6 13:03:43 teszt2 pengine: [2958]: WARN: unpack_rsc_op: Processing failed op (stonith_teszt1_start_0) on teszt2
> Dec 6 13:03:43 teszt2 pengine: [2958]: WARN: unpack_rsc_op: Handling failed start for stonith_teszt1 on teszt2
> Dec 6 13:03:43 teszt2 pengine: [2958]: info: determine_online_status: Node teszt1 is online
> Dec 6 13:03:43 teszt2 pengine: [2958]: info: native_print: stonith_teszt1 (stonith:apcsmart): Started teszt2 FAILED
> Dec 6 13:03:43 teszt2 pengine: [2958]: info: native_print: stonith_teszt2 (stonith:apcsmart): Stopped
> Dec 6 13:03:43 teszt2 pengine: [2958]: WARN: native_color: Resource stonith_teszt1 cannot run anywhere
> Dec 6 13:03:43 teszt2 pengine: [2958]: notice: StopRsc: teszt2 Stop stonith_teszt1
> Dec 6 13:03:43 teszt2 pengine: [2958]: WARN: native_color: Resource stonith_teszt2 cannot run anywhere
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
> Dec 6 13:03:43 teszt2 pengine: [2958]: WARN: process_pe_message: Transition 4: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/heartbeat/pengine/pe-warn-465.bz2
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: unpack_graph: Unpacked transition 4: 1 actions in 1 synapses
> Dec 6 13:03:43 teszt2 pengine: [2958]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: send_rsc_command: Initiating action 1: stonith_teszt1_stop_0 on teszt2
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: do_lrm_rsc_op: Performing op=stonith_teszt1_stop_0 key=1:4:e21ebe4d-a057-4264-8905-05ab83dad327)
> Dec 6 13:03:43 teszt2 lrmd: [3006]: info: Try to stop STONITH resource <rsc_id=stonith_teszt1> : Device=apcsmart
> Dec 6 13:03:43 teszt2 stonithd: [2864]: notice: try to stop a resource stonith_teszt1 who is not in started resource queue.
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: process_lrm_event: LRM operation stonith_teszt1_stop_0 (call=5, rc=0) complete
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: match_graph_event: Action stonith_teszt1_stop_0 (1) confirmed on teszt2
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: run_graph: Transition 4: (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0)
> Dec 6 13:03:43 teszt2 tengine: [2957]: info: notify_crmd: Transition 4 status: te_complete - <null>
> Dec 6 13:03:43 teszt2 crmd: [2866]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
>
> So I think this should shut down the UPS (which supplies the power to teszt1).
No. Starting a stonith resource only makes the device available to the
cluster. The policy engine decides to actually stonith (reset) a node
when it finds that necessary.

> What's wrong, what do I miss?

The error you encountered is probably from the monitor operation (there
is a monitor implied in the start). Did you try it with stonith (the
program)?

# stonith -t apcsmart ttydev=/dev/ttyS0 hostlist=teszt1 -S
# stonith -t apcsmart ttydev=/dev/ttyS0 hostlist=teszt1 -l

(Perhaps you'll have to tweak the parameters/options a bit; see the
small wrapper sketch below the quoted text.)

Thanks,

Dejan

> Thank you,
>
> tamas
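A minimal wrapper sketch for those two checks, assuming the ttydev and
hostlist values from your configuration (adjust DEV and HOST to your
setup; this only automates the two stonith calls above):

  #!/bin/sh
  # Probe the apcsmart stonith plugin with the same parameters
  # the cluster passes to the stonith_teszt1 resource.
  DEV=/dev/ttyS0   # serial port the APC smart UPS is attached to
  HOST=teszt1      # node this device is supposed to power-cycle

  # -S reports the status of the stonith device itself
  if ! stonith -t apcsmart ttydev="$DEV" hostlist="$HOST" -S; then
      echo "device status check failed; check cabling and permissions on $DEV" >&2
      exit 1
  fi

  # -l lists the hosts this device can control; teszt1 must appear
  # here, otherwise the cluster can never fence it with this device
  stonith -t apcsmart ttydev="$DEV" hostlist="$HOST" -l

If both checks pass here but the resource still fails to start, the
problem is more likely in how the cluster invokes the plugin than in
the device itself.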
