Dejan Muhamedagic wrote:
> Hi,
> 
> On Mon, Nov 23, 2009 at 12:09:02PM +0100, Matteo Chesi wrote:
>> Dejan Muhamedagic wrote:
>>> Hi,
>>>
>>> On Mon, Nov 23, 2009 at 11:28:24AM +0100, Matteo Chesi wrote:
>>>> Andrew Beekhof wrote:
>>>>> On Mon, Nov 23, 2009 at 10:31 AM, Matteo Chesi <[email protected]> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I've got one problem on Heartbeat in one of my production clusters.
>>>>>>
>>>>>> The problem is that one resource (scs-mysql) in a group of resources does
>>>>>> not respond to start/stop commands through hb_gui (or crm_resource
>>>>>> commands).
>>>>>>
>>>>>> I checked "crm_verify -L" and I found this problem:
>>>>>>
>>>>>> scs02:~# crm_verify -L
>>>>>> crm_verify[28132]: 2009/11/23_10:26:37 ERROR: unpack_rsc_op: Remapping
>>>>>> resource_scs_postgresql_monitor_0 (rc=1) on scs01 to an ERROR
>>>>>>
>>>>>>
>>>>>> Could you please help me find out what the problem is?
>>>>> Your postgres resource returned the wrong thing. Check your logs.
>>>> I've got 3 logs and none of them tell me more than the message I posted.
>>> Perhaps the logs were rotated after the error happened? Try to
>>> grep your logs for lrmd.*postgres.
>>>
>> The error appears every time I run "crm_verify -L", and it carries the
>> timestamp of that moment.
> 
> crm_verify only reports here whatever is recorded in the status
> section of the CIB. That error happened in the past though and
> has nothing to do with crm_verify.

OK,
so how do I clean up the crm_verify output? By editing these rows myself?

         <lrm_resource id="resource_scs_postgresql"
type="scs-postgresql" class="lsb" provider="heartbeat">
           <lrm_rsc_op id="resource_scs_postgresql_monitor_0"
operation="monitor" crm-debug-origin="build_active_RAs"
transition_key="5:149:204295f9-bfd2-414a-bb3f-c9b03528f94c"
transition_magic="0:1;5:149:204295f9-bfd2-414a-bb3f-c9b03528f94c"
call_id="269" crm_feature_set="2.0" rc_code="1" op_status="0"
interval="0" op_digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
           <lrm_rsc_op id="resource_scs_postgresql_stop_0"
operation="stop" crm-debug-origin="build_active_RAs"
transition_key="1:150:204295f9-bfd2-414a-bb3f-c9b03528f94c"
transition_magic="0:0;1:150:204295f9-bfd2-414a-bb3f-c9b03528f94c"
call_id="270" crm_feature_set="2.0" rc_code="0" op_status="0"
interval="0" op_digest="f2317cad3d54cec5d7d7aa7d0bf35cf8"/>
         </lrm_resource>
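
To make clear which of these rows I mean, here is a minimal sketch (plain
Python, not part of the cluster tooling) that lists the recorded operations
whose rc_code is not 0. The fragment is the one above, abbreviated to the
attributes the script uses; the function name failed_ops is my own. It only
reads the XML and changes nothing in the CIB. A non-zero rc_code is just a
rough filter and is not a failure in every case.

```python
import xml.etree.ElementTree as ET

# Status-section fragment from above, abbreviated to the attributes used here.
FRAGMENT = """
<lrm_resource id="resource_scs_postgresql" type="scs-postgresql" class="lsb" provider="heartbeat">
  <lrm_rsc_op id="resource_scs_postgresql_monitor_0" operation="monitor" call_id="269" rc_code="1" op_status="0" interval="0"/>
  <lrm_rsc_op id="resource_scs_postgresql_stop_0" operation="stop" call_id="270" rc_code="0" op_status="0" interval="0"/>
</lrm_resource>
"""


def failed_ops(xml_text):
    """Return (op id, rc_code) for every lrm_rsc_op whose rc_code is not 0."""
    root = ET.fromstring(xml_text)
    return [
        (op.get("id"), int(op.get("rc_code")))
        for op in root.iter("lrm_rsc_op")
        if int(op.get("rc_code", "0")) != 0
    ]


if __name__ == "__main__":
    for op_id, rc in failed_ops(FRAGMENT):
        print(f"{op_id}: rc_code={rc}")
```

On the fragment above it reports only resource_scs_postgresql_monitor_0,
which is the operation named in the crm_verify error.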


>> However, searching /var/log for that string, I found an older log that
>> shows something related to my error; a particular event seems to have
>> happened before this error started to repeat ...
>>
>> pengine[9937]: 2009/10/28_14:50:20 WARN: unpack_rsc_op: Processing
>> failed op resource_scs_postgresql_monitor_0 on scs01: Error
>> pengine[9937]: 2009/10/28_14:50:20 ERROR: native_add_running: Resource
>> lsb::scs-postgresql:resource_scs_postgresql appears to be active on 2 nodes.
> 
> The lsb script is probably not LSB compliant (or indeed there is
> postgres running on both nodes). You should be better off with
> the OCF resource agent.

I think that in the past (when the error occurred) it probably wasn't LSB
compliant, but it is now.
However, I can't (and wouldn't want to) reproduce that error.

TIA,
Matteo
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
