Re: [ClusterLabs] pcmk_remote evaluation (continued)

2017-12-11 Thread Ken Gaillot
On Mon, 2017-12-11 at 23:43 +0300, Vladislav Bogdanov wrote:
> 11.12.2017 23:06, Ken Gaillot wrote:
> [...]
> > > =
> > > 
> > > * The first issue I found (and I expect it to be the reason for
> > > some other issues) is that pacemaker_remote does not drop an old
> > > crmd's connection after a new crmd connects.
> > > As IPC proxy connections are kept in a hash table, there is a 50%
> > > chance that remoted tries to reach the old crmd, e.g. to proxy
> > > checks of node attributes when resources are reprobed.
> > > That leads to timeouts of those resources' probes, with the cluster
> > > reacting accordingly.
> > > A solution here could be to drop the old IPC proxy connection as
> > > soon as a new one is established.
> > 
> > We can't drop connections from the pacemaker_remoted side because it
> > doesn't know anything about the cluster state (e.g. whether the
> > cluster connection resource is live-migrating).
> 
> Well, ok. But what happens when the fenced cluster node comes back up
> and receives a TCP packet from the old connection? It sends an RST,
> which would terminate the connection on the peer side, and then
> pcmk_remoted should shut it down on a socket event.
> 
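(For reference, a minimal sketch of how such a reset would typically surface on the pacemaker_remoted side, assuming an ordinary non-blocking socket handler; the function and messages here are illustrative, not the actual Pacemaker internals.)

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative handler only, not Pacemaker code: once the rebooted peer
 * answers a segment from the stale connection with RST, the local socket
 * becomes readable and the next recv() reports the failure, so the server
 * can drop that client. */
static void
handle_client_event(int client_fd)
{
    char buf[4096];
    ssize_t rc = recv(client_fd, buf, sizeof(buf), 0);

    if (rc > 0) {
        return;             /* normal data: hand off to the protocol layer */
    }
    if (rc == 0) {
        fprintf(stderr, "client closed connection\n");       /* orderly FIN */
    } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
        return;             /* spurious wakeup, nothing to do yet */
    } else {
        /* ECONNRESET here is the RST from the returning fenced node */
        fprintf(stderr, "connection error: %s\n", strerror(errno));
    }
    close(client_fd);       /* tear down the stale client state */
}
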
> > 
> > However, we can simply always use the most recently connected
> > provider, which I think solves the issue. See commit e9a7e3bb, one of
> > a few recent bugfixes in the master branch for pacemaker_remoted. It
> > will most likely not make it into 2.0 (which I'm trying to keep
> > focused on deprecated syntax removals), but the next release after
> > that.
> 
> Will definitely try it; all stakeholders have already been notified
> that we need another round on all available hardware :) We will test
> as soon as it becomes free.
> 
> I will return to this as soon as I have some results.
> 
> Thank you,
> Vladislav

Great, thanks. See my latest post about 1.1.18 -- you can compile the
latest 1.1 branch to have this and other known bug fixes.
-- 
Ken Gaillot 

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] pcmk_remote evaluation (continued)

2017-12-11 Thread Ken Gaillot
On Wed, 2017-09-20 at 12:46 +0300, Vladislav Bogdanov wrote:
> Hi,
> 
> as 1.1.17 received a lot of care in pcmk_remote, I decided to try it
> again in a rather big setup (smaller than the previous one, so I'm not
> hit by IPC disconnects here).
> 
> From the first runs there are still some severe issues when cluster
> nodes are fenced.
> 
> The following results were obtained by killing the DC node
> (md12k-3-srv), which was hosting remote resources for the nodes
> es7700-3-srv, es12kxe-3-srv and es12kxe-7-srv.
> After the fence, the new DC (md12k-1-srv) moved those resources as
> follows:
> =
> Sep 20 08:53:28 md12k-1-srv pengine[2525]:   notice: Move    es12kxe-3-srv  (Started md12k-3-srv -> md12k-4-srv)
> Sep 20 08:53:28 md12k-1-srv pengine[2525]:   notice: Move    es12kxe-7-srv  (Started md12k-3-srv -> md12k-1-srv)
> Sep 20 08:53:28 md12k-1-srv pengine[2525]:   notice: Move    es7700-3-srv   (Started md12k-3-srv -> md12k-2-srv)
> =
> 
> * The first issue I found (and I expect it to be the reason for some
> other issues) is that pacemaker_remote does not drop an old crmd's
> connection after a new crmd connects.
> As IPC proxy connections are kept in a hash table, there is a 50% chance
> that remoted tries to reach the old crmd, e.g. to proxy checks of node
> attributes when resources are reprobed.
> That leads to timeouts of those resources' probes, with the cluster
> reacting accordingly.
> A solution here could be to drop the old IPC proxy connection as soon as
> a new one is established.

We can't drop connections from the pacemaker_remoted side because it
doesn't know anything about the cluster state (e.g. whether the cluster
connection resource is live-migrating).

However, we can simply always use the most recently connected provider,
which I think solves the issue. See commit e9a7e3bb, one of a few
recent bugfixes in the master branch for pacemaker_remoted. It will
most likely not make it into 2.0 (which I'm trying to keep focused on
deprecated syntax removals), but the next release after that.
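
To make the idea concrete, here is a rough sketch of that approach in C,
with made-up names and GLib containers standing in for the real
pacemaker_remoted data structures (the authoritative change is commit
e9a7e3bb; this is only an illustration):

#include <glib.h>

/* Hypothetical provider registry, sketched after the behaviour described
 * in this thread; the real pacemaker_remoted structures and names differ. */
typedef struct {
    char *id;               /* connection id of the proxying cluster node */
    /* ... transport handle, buffers, etc. ... */
} proxy_provider_t;

static GHashTable *providers = NULL;              /* id -> proxy_provider_t */
static proxy_provider_t *newest_provider = NULL;  /* most recent connection */

/* Called when a cluster node opens a new IPC proxy connection. */
static void
provider_connected(proxy_provider_t *p)
{
    if (providers == NULL) {
        providers = g_hash_table_new(g_str_hash, g_str_equal);
    }
    g_hash_table_insert(providers, p->id, p);

    /* The point discussed above: always prefer the newest provider rather
     * than whichever entry a hash-table lookup happens to hand back. */
    newest_provider = p;
}

/* Called when a proxy connection goes away (node fenced, socket reset, ...). */
static void
provider_disconnected(proxy_provider_t *p)
{
    g_hash_table_remove(providers, p->id);

    if (newest_provider == p) {
        GHashTableIter iter;
        gpointer value = NULL;

        /* Fall back to any remaining provider, if one exists. */
        newest_provider = NULL;
        g_hash_table_iter_init(&iter, providers);
        if (g_hash_table_iter_next(&iter, NULL, &value)) {
            newest_provider = value;
        }
    }
}

/* Used whenever remoted needs to proxy an IPC request (e.g. attribute
 * updates): a stale entry left behind by a fenced node is never selected. */
static proxy_provider_t *
provider_for_proxying(void)
{
    return newest_provider;
}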

> * I _suspect_ that the issue above could lead to the following lines in
> the cluster node logs.
> I didn't check, but I suspect that when remoted decides to disconnect
> an old connection after the fenced node comes back up and the TCP
> connections are reset, it disconnects the new one too.

pacemaker_remoted doesn't really initiate disconnections except when it
exits. All connections and disconnections should be initiated on the
cluster side.

> At least this issue happens at the same time the fenced node rejoins
> the cluster.
> These logs are for the case where no resources operate on node
> attributes (I removed those resources from the CIB and set stickiness
> for all the others).
> =
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:    error: Timed out (1 ms) while waiting for remote data
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:    error: Unable to receive expected reply, disconnecting.
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:    error: Remote lrmd server disconnected while waiting for reply with id 9823.
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:    error: Unexpected disconnect on remote-node es12kxe-7-srv

The messages above may be confusing, but the disconnect is originating
on the cluster side, after a timeout waiting for a reply from
pacemaker_remoted.
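
(Roughly, the cluster-side pattern behind those messages looks like the
sketch below; wait_for_reply and its log text are illustrative stand-ins,
not the actual lrmd client code.)

#include <poll.h>
#include <stdio.h>

/* Hypothetical helper: wait up to timeout_ms for the remote daemon's reply
 * to become readable.  If nothing arrives, the caller disconnects, which is
 * what produces "Timed out ... while waiting for remote data" above. */
static int
wait_for_reply(int session_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = session_fd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc > 0 && (pfd.revents & POLLIN)) {
        return 0;      /* reply data is available; go read and decode it */
    }
    if (rc == 0) {
        fprintf(stderr, "Timed out (%d ms) while waiting for remote data\n",
                timeout_ms);
    }
    return -1;         /* caller tears the session down and reports an error */
}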

I don't know of any reason that a cluster node rejoining would cause
pacemaker_remoted to stop working, aside from when the cluster node is
the same host running pacemaker_remoted (which I've done accidentally
during testing).

> Sep 20 08:55:41 md12k-1-srv crmd[11375]:    error: Result of monitor operation for es12kxe-7-srv on md12k-1-srv: Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:    error: Couldn't perform lrmd_rsc_info operation (timeout=0): -107: Success (0)
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:    error: LRMD disconnected
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:    error: Could not add resource ost0033-es03a to LRM es12kxe-7-srv
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:    error: Invalid resource definition for ost0033-es03a
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:  warning: bad input    version="3.0.13" subt="request" reference="lrm_invoke-tengine-1505897731-3872" crm_task="lrm_invoke" crm_sys_to="lrmd" crm_sys_from="tengine" crm_host_to="md12k-1-srv" acl_target="hacluster" crm_user="hacluster">
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:  warning: bad input
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:  warning: bad input    operation_key="ost0033-es03a_monitor_0" on_node="es12kxe-7-srv" on_node_uuid="es12kxe-7-srv" router_node="md12k-1-srv" transition-key="221:23:7:7df0b5ef-5bfa-4d21-82e3-4dc964d6cffd">
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:  warning: bad input  provider="ddn" type="lustre-server"/>
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:  warning: bad input  CRM_meta_on_node_uuid="es12kxe-7-srv" CRM_meta_op_target_rc="7" CRM_meta_record_pending="false"