Re: [Linux-HA] Unrelated resource getting restarted when other group resource status changes

Andrew Beekhof Mon, 18 Jun 2007 07:40:55 -0700

On 6/18/07, Doug Knight <[EMAIL PROTECTED]> wrote:

All,
I have an HA cluster consisting of two nodes running HA 2.0.8. I have
configured two groups and a single individual resource, as follows:


grp_pgsql_mirror - drbd, file system, postgresql, alias IP address
skybase_ingestor_HA - a background process that feeds our database with
raw data
grp_decoders - background processes that are triggered by storage of raw
data through postgresql triggers

Both groups have colocated and ordered set to true. I control where each
runs by using location constraints. The problem I'm having is that when
I move the decoder group from one node to another, the ingestor resource
gets a restart when it should not be touched. I've attached my cibadmin
-Q output, a portion of the log where I've executed the switch showing
the restart on the ingestor, and the piece of xml I use to update the
location constraint. The command I use to apply the revised location
constraint xml is:

cibadmin -o constraints -R -x rule_locate_decoder_dk.xml

I simply do not see why making changes to the decoders would have any
impact on the ingestor from a heartbeat standpoint. Any ideas? Am I
missing something in the XML?


It got restarted because of these lines:
pengine[14991]: 2007/06/18_08:51:00 WARN: check_action_definition:
Parameters to skybase_ingestor_HA_start_0 on arc-dknightlx changed:
recorded f7d867defb23b1498919d3b8aa223431 vs. calculated
05cc0923186775b674d1d4876ac94e56

So the restart was "caused" by the update you made only in that it
triggered a re-run of the PE.
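
To illustrate the kind of check the log message describes, here is a minimal Python sketch of a digest comparison. It is not the actual pengine code: the function names, the parameter names and values, and the serialization format are all assumptions made for illustration. MD5 is assumed only because the digests in the log are 32 hex characters long. The last case shows one hypothetical way a false positive can arise (the calculation sees a field that was not part of the recorded digest); it is not a description of what the linked patch fixes.

```python
import hashlib


def params_digest(params):
    """Return an MD5 hex digest of the parameters in a canonical (sorted) order."""
    canonical = "\n".join(f"{name}={value}" for name, value in sorted(params.items()))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def needs_restart(recorded_digest, current_params):
    """True when the freshly calculated digest differs from the recorded one."""
    return params_digest(current_params) != recorded_digest


# Hypothetical parameters; the real ones live in the CIB attached to the post.
start_params = {"config": "/etc/ingestor.conf", "user": "skybase"}
recorded = params_digest(start_params)

# Same parameters: digests match, so no restart is scheduled.
unchanged = needs_restart(recorded, dict(start_params))

# A parameter really changed: digests differ, so a restart is scheduled.
changed = needs_restart(recorded, {**start_params, "user": "other"})

# Hypothetical false positive: the user's parameters are identical, but the
# calculation includes an extra field that the recorded digest did not cover.
false_positive = needs_restart(recorded, {**start_params, "internal_field": "1"})

print(unchanged, changed, false_positive)
```

The point of the sketch is that any re-run of the comparison, whatever triggered it, reports a change if the two digests were not computed over exactly the same input.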

The real mystery is why we think the parameters changed.

Oh... I bet you're suffering from this problem:
  http://hg.beekhof.net/lha/crm-dev/rev/f7775a4af780

That would cause false positives.

Can I suggest the packages at:
  http://software.opensuse.org/download/server:/ha-clustering
They will have the indicated patch included.


Doug
p.s. I don't think the size of this email will exceed the list's limits,
but if it does I respectfully ask that it be passed along.


Compressing the logs usually helps, but it looks like it was small enough anyway.

I have a
milestone to meet this week and would like to get this last issue
resolved as soon as I can. Thanks again.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
