Re: [Linux-HA] Error on "unpack_rsc_op"

Matteo Chesi Mon, 23 Nov 2009 03:09:14 -0800

Dejan Muhamedagic wrote:
> Hi,
> 
> On Mon, Nov 23, 2009 at 11:28:24AM +0100, Matteo Chesi wrote:
>> Andrew Beekhof wrote:
>>> On Mon, Nov 23, 2009 at 10:31 AM, Matteo Chesi <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> I've got one problem on Heartbeat in one of my production clusters.
>>>>
>>>> The problem is that one resource (scs-mysql) in a group of resources
>>>> does not respond to start/stop commands issued through hb_gui (or
>>>> crm_resource commands).
>>>>
>>>> I checked "crm_verify -L" and I found this problem:
>>>>
>>>> scs02:~# crm_verify -L
>>>> crm_verify[28132]: 2009/11/23_10:26:37 ERROR: unpack_rsc_op: Remapping
>>>> resource_scs_postgresql_monitor_0 (rc=1) on scs01 to an ERROR
>>>>
>>>>
>>>> Could you please help me find out what the problem is?
>>> Your postgres resource returned the wrong thing. Check your logs.
>> I've got 3 logs and none of them tell me more than the message I posted.
> 
> Perhaps the logs were rotated after the error happened? Try to
> grep your logs for lrmd.*postgres.
>


The error appears every time I run "crm_verify -L", and it carries the
timestamp of that moment.

However, searching /var/log for that string, I found one older log that
shows something related to my error. A particular event seems to have
happened just before this error started to repeat:

pengine[9937]: 2009/10/28_14:50:20 WARN: unpack_rsc_op: Processing
failed op resource_scs_postgresql_monitor_0 on scs01: Error
pengine[9937]: 2009/10/28_14:50:20 ERROR: native_add_running: Resource
lsb::scs-postgresql:resource_scs_postgresql appears to be active on 2 nodes.
pengine[9937]: 2009/10/28_14:50:20 ERROR: See
http://linux-ha.org/v2/faq/resource_too_active for more information.
pengine[9937]: 2009/10/28_14:50:20 ERROR: native_create_actions:
Attempting recovery of resource resource_scs_postgresql
pengine[9937]: 2009/10/28_14:50:21 ERROR: process_pe_message: Transition
94: ERRORs found during PE processing. PEngine Input stored in:
/var/lib/heartbeat/pengine/pe-error-28.bz2
mgmtd[5279]: 2009/10/28_14:50:21 ERROR: unpack_rsc_op: Remapping
resource_scs_postgresql_monitor_0 (rc=1) on scs01 to an ERROR
mgmtd[5279]: 2009/10/28_14:50:21 ERROR: native_add_running: Resource
lsb::scs-postgresql:resource_scs_postgresql appears to be active on 2 nodes.
mgmtd[5279]: 2009/10/28_14:50:21 ERROR: See
http://linux-ha.org/v2/faq/resource_too_active for more information.
mgmtd[5279]: 2009/10/28_14:50:22 ERROR: unpack_rsc_op: Remapping
resource_scs_postgresql_monitor_0 (rc=1) on scs01 to an ERROR
mgmtd[5279]: 2009/10/28_14:50:22 ERROR: unpack_rsc_op: Remapping
resource_scs_postgresql_monitor_0 (rc=1) on scs01 to an ERROR
mgmtd[5279]: 2009/10/28_14:50:23 ERROR: unpack_rsc_op: Remapping
resource_scs_postgresql_monitor_0 (rc=1) on scs01 to an ERROR
mgmtd[5279]: 2009/10/28_14:50:24 ERROR: unpack_rsc_op: Remapping
resource_scs_postgresql_monitor_0 (rc=1) on scs01 to an ERROR
mgmtd[5279]: 2009/10/28_14:50:26 ERROR: unpack_rsc_op: Remapping
resource_scs_postgresql_monitor_0 (rc=1) on scs01 to an ERROR

Looking up that error in the FAQ (the Pacemaker one), I found:

  Resource is Too Active

Pacemaker will try and determine what resources are active on a machine
when it starts. To do this, it sends what we call a probe which uses the
monitor operation of your ResourceAgent.

There are two common reasons for seeing this message:

    * Your resource really is active on more than one node
          o Check you are _not_ starting it on boot
          o Did Pacemaker suffer an internal failure? If so, please
            check the Help:Contents page and report it
    * Your resource doesn't implement the monitor operation correctly
          o Make sure your Resource Agent conforms to the OCF-spec by
            using the ocf-tester script




My init script is an LSB one, not OCF.
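
For an LSB script, the probe calls the "status" action, so the rc=1 in the
log is what that action returned. Below is a minimal sketch of how the
status action could be checked by hand. The exit-code meanings follow the
LSB init script specification; the path /etc/init.d/scs-postgresql and the
function name lsb_status_meaning are assumptions made for illustration
(the path is guessed from the resource name lsb::scs-postgresql).

```shell
#!/bin/sh
# Translate an LSB "status" exit code into its documented meaning.
# lsb_status_meaning is a hypothetical helper, not part of Heartbeat.
lsb_status_meaning() {
    case "$1" in
        0) echo "running" ;;
        1) echo "dead, but pid file exists" ;;
        2) echo "dead, but lock file exists" ;;
        3) echo "not running" ;;
        4) echo "status unknown" ;;
        *) echo "reserved or non-standard code" ;;
    esac
}

# Assumed location of the init script; override with SCRIPT=... if needed.
SCRIPT="${SCRIPT:-/etc/init.d/scs-postgresql}"

# Only run the check when the script is actually present on this node.
if [ -x "$SCRIPT" ]; then
    "$SCRIPT" status >/dev/null 2>&1
    rc=$?
    echo "status rc=$rc: $(lsb_status_meaning "$rc")"
fi
```

A stopped service is expected to report 3 here. A value of 1 would be
consistent with the rc=1 seen for resource_scs_postgresql_monitor_0,
although that link is only a guess until the script is run on scs01.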

Any idea how to solve it? If this is an error that happened in the past,
could I do some cleanup to clear it?

TIA,
Matteo
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems