On 04/10/16 07:50 PM, Israel Brewster wrote:
On Oct 4, 2016, at 3:38 PM, Digimer <li...@alteeve.ca> wrote:
On 04/10/16 07:09 PM, Israel Brewster wrote:
On Oct 4, 2016, at 3:03 PM, Digimer <li...@alteeve.ca> wrote:
On 04/10/16 06:50 PM, Israel Brewster wrote:
On Oct 4, 2016, at 2:26 PM, Ken Gaillot <kgail...@redhat.com> wrote:
On 10/04/2016 11:31 AM, Israel Brewster wrote:
I sent this a week ago, but never got a response, so I'm sending it again in the hopes that it just slipped through the cracks. It seems to me that this should just be a simple mis-configuration on my part causing the issue, but I suppose it could be a bug as well.
I have two two-node clusters set up using corosync/pacemaker on CentOS 6.8. One cluster is simply sharing an IP, while the other one has numerous services and IPs set up between the two machines in the cluster. Both appear to be working fine. However, I was poking around today, and I noticed that on the single-IP cluster, corosync, stonithd, and fenced were using "significant" amounts of processing power - 25% for corosync on the current primary node, with fenced and stonithd often showing 1-2% (not horrible, but more than any other process). In looking at my logs, I see that they are dumping messages like the following to the messages log every second or two:
Sep 27 08:51:50 fai-dbs1 stonith-ng[4851]: warning: get_xpath_object: No match for //@st_delegate in /st-reply
Sep 27 08:51:50 fai-dbs1 stonith-ng[4851]: notice: remote_op_done: Operation reboot of fai-dbs1 by fai-dbs2 for stonith_admin.cman.15835@fai-dbs2.c5161517: No such device
Sep 27 08:51:50 fai-dbs1 crmd[4855]: notice: tengine_stonith_notify: Peer fai-dbs1 was not terminated (reboot) by fai-dbs2 for fai-dbs2: No such device (ref=c5161517-c0cc-42e5-ac11-1d55f7749b05) by client stonith_admin.cman.15835
Sep 27 08:51:50 fai-dbs1 fence_pcmk[15393]: Requesting Pacemaker fence fai-dbs2 (reset)
The above shows that CMAN is asking pacemaker to fence a node. Even though fencing is disabled in pacemaker itself, CMAN is configured to use pacemaker for fencing (fence_pcmk).
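For reference, the cman-to-pacemaker passthrough lives in /etc/cluster/cluster.conf. A quick way to confirm it is in place (the device name shown in the comment is what pcs typically generates, but treat it as illustrative):

```shell
# Check how cman hands fencing off to pacemaker (CentOS 6, cman-based stack):
grep -B1 -A1 fence_pcmk /etc/cluster/cluster.conf
# A pcs-generated config typically contains an entry like:
#   <fencedevice agent="fence_pcmk" name="pcmk-redirect"/>
# so every cman fence request is redirected to pacemaker's stonith.
```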
I never did any specific configuring of CMAN. Perhaps that's the problem? I missed some configuration steps on setup? I just followed the directions here: http://jensd.be/156/linux/building-a-high-available-failover-cluster-with-pacemaker-corosync-pcs, which disabled stonith in pacemaker via the "pcs property set stonith-enabled=false" command. Are there separate CMAN configs I need to do to get everything copacetic? If so, can you point me to some sort of guide/tutorial for that?
Disabling stonith is not possible in cman, and very ill advised in pacemaker. This is a mistake a lot of "tutorials" make when the author doesn't understand the role of fencing.
In your case, pcs set up cman to use the fence_pcmk "passthrough" fence agent, as it should. So when something went wrong, corosync detected it and informed cman, which then requested pacemaker to fence the peer. With pacemaker not having stonith configured and enabled, it could do nothing. So pacemaker returned that the fence failed, and cman went into an infinite loop trying again and again to fence (as it should have).
You must configure stonith (exactly how depends on your hardware), then enable stonith in pacemaker.
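As a concrete sketch of that last step, assuming IPMI-capable nodes (the BMC addresses and credentials below are placeholders, not from this thread):

```shell
# Create one stonith device per node, then re-enable stonith.
# fence_ipmilan parameters shown are the RHEL 6 fence-agent names;
# substitute your own BMC IPs and credentials.
pcs stonith create fence-dbs1 fence_ipmilan \
    pcmk_host_list="fai-dbs1" ipaddr="10.0.0.1" \
    login="admin" passwd="secret" lanplus=1
pcs stonith create fence-dbs2 fence_ipmilan \
    pcmk_host_list="fai-dbs2" ipaddr="10.0.0.2" \
    login="admin" passwd="secret" lanplus=1
pcs property set stonith-enabled=true
```

With working stonith devices in place, the fence requests cman passes through fence_pcmk can actually succeed, and the retry loop in the logs stops.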
Gotcha. There is nothing special about the hardware; it's just two physical boxes connected to the network. So I guess I've got a choice of either a) living with the logging/load situation (since the system does work perfectly as-is, other than the excessive logging), or b) spending some time researching stonith to figure out what it does and how to configure it. Thanks for the pointers.
The system is not working perfectly. Consider it like this: you're flying, and your landing gear is busted. You think everything is fine because you're not trying to land yet.
Ok, good analogy :-)
Fencing is needed to force a node that has entered into an unknown state into a known state (usually 'off'). It does this by reaching out over some independent mechanism, like IPMI or a switched PDU, and forcing the target to shut down.
Yeah, I don't want that. If one of the nodes enters an unknown state, I want the system to notify me so I can decide the proper course of action - I don't want it to simply shut down the other machine or something.
You do, actually. If a node isn't readily disposable, you need to rethink your HA strategy. The service you're protecting is what matters, not the machine hosting it at any particular time.
True. My hesitation, however, stems not from losing the machine without warning (the ability to do so without consequence being one of the major selling points of HA), but rather from losing the diagnostic opportunities presented *while* the machine is misbehaving. I'm borderline obsessive about knowing what went wrong and why; if the machine is shut down before I have a chance to see what state it is in, my chances of being able to figure out what happened greatly diminish.
As you say, though, this is something I'll simply need to get over if I want real HA (see below).

Further, the whole role of pacemaker is to know what to do when things go wrong (which you validate with plenty of creative failure testing pre-production). A good HA system is one you won't touch for a long time, possibly over a year. You don't want to be relying on rusty memory for what to do while people are breathing down your neck because the service is down.
True, although that argument would hold more weight if I worked for a company where everyone wasn't quite so nice :-) We've had outages before (one of the reasons I started looking at HA), and everyone was like "Well, we can't do our jobs without it, so please let us know when it's back up. Have a good day!"

Trust the HA stack to do the right job, and validate that via testing.
Yeah, my testing is somewhat lacking. Probably contributes to my lack of trust.

This is also why I said that your hardware matters. Do your nodes have IPMI? (Or iRMC, iLO, DRAC, RSA, etc.?)
I *might* have IPMI. I know my newer servers do. I'll have to check on that.
You can tell from the CLI. I've got a section on how to locate and configure IPMI from the command line here: https://alteeve.ca/w/AN!Cluster_Tutorial_2#What_is_IPMI
It should port to most any distro/version.
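For what it's worth, a quick hardware check along those lines might look like this (a sketch; run as root, and note that a missing BMC just means these commands find nothing):

```shell
# Look for an IPMI/BMC device in the SMBIOS tables (type 38 = IPMI Device):
dmidecode --type 38

# Or probe via ipmitool (CentOS 6):
yum -y install ipmitool OpenIPMI
modprobe ipmi_si ipmi_devintf
ipmitool lan print 1   # prints the BMC's LAN config if a BMC exists
```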
Looks like I'm out of luck on the IPMI front. Neither my application servers nor my database servers have IPMI ports. I'll have to talk to my boss about getting controllable power strips or the like (unless there are better options than just cutting the power).

If you don't need to coordinate actions between the nodes, you don't need HA software; just run things everywhere all the time. If, however, you do need to coordinate actions, then you need fencing.
The coordination is, of course, the whole point - an IP/service/whatever runs on one machine, and should that machine become unavailable (for whatever reason), it automatically moves to the other machine. My services could, of course, run on both just fine, but that doesn't help with accessing said services - that still has to go to one or the other.
Exactly. So you need HA, and you need to ensure coordinated actions between the nodes. If you lose access to a node, for any reason, you can't make assumptions about its state. If you do, you will eventually get it wrong and voila, split-brain and uncoordinated actions.

So where fencing comes in would be for the situations where one machine *thinks* the other is unavailable, perhaps due to a network issue, but in fact the other machine is still up and running, I guess? That would make sense, but the thought of software simply taking over and shutting down one of my machines, without even consulting me first, doesn't sit well with me at all. Even a restart would be annoying - I typically like to see if I can figure out what is going on before restarting, since restarting often eliminates the symptoms that help diagnose problems.
That is a classic example, but not the only one. Perhaps the target is hung, but might recover later? You just don't know, and not knowing is all you know, *until* you fence it.
....or until I log onto the machine and take a look at what is going on :-)

I can understand that it "doesn't sit well with you", but you need to let that go. HA software is not like most other applications.
Understood. It might help if I knew there would be good documentation of the current state of the machine before the shutdown, but I don't know if that is even possible. So I guess I'll just have to get over it, be happy that I didn't lose any services, and move on :-)

If a node gets shot in pacemaker/corosync, there is always going to be a reason for it. Your job is to sort out why, after the fact. The important part is that your services continued to be available.
Gotcha. Makes sense.

Note that you can bias which node wins in a case where both are alive but something blocked comms. You do this by setting a delay on the fence method for the node you prefer. So it works like this:
Say node 1 is your primary node where your services normally live, and node 2 is the backup. Something breaks comms, both declare the other dead, and both initiate a fence. Node 2 looks up how to fence node 1, sees a delay, and pauses for $delay seconds. Node 1 looks up how to fence node 2, sees no delay, and pulls the trigger right away. Node 2 will die before it exits its delay.
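That bias could be sketched with pcs like this (device names are hypothetical, and it assumes stonith devices already exist; the delay goes on the device that *targets* the preferred node, so the backup hesitates before shooting it):

```shell
# fence-dbs1 is the device that fences node 1 (the preferred node);
# giving it a delay means node 2 pauses 15s before firing, while
# node 1 fences node 2 immediately and wins the race.
pcs stonith update fence-dbs1 delay=15
pcs stonith update fence-dbs2 delay=0
```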
I was wondering about that.
So in any case, I guess the next step here is to figure out how to do fencing properly, using controllable power strips or the like. Back to the drawing board!