On 21.4.2017 14:14, Vladislav Bogdanov wrote:
20.04.2017 23:16, Jan Wrona wrote:
On 20.4.2017 19:33, Ken Gaillot wrote:
On 04/20/2017 10:52 AM, Jan Wrona wrote:
Hello,
my problem is closely related to the thread [1], but I didn't find a
solution there. I have a resource that is set up as a clone C restricted
to two copies (using the clone-max=2 meta attribute), because the
resource takes a long time to get ready (it starts immediately though),
A resource agent must not return from "start" until a "monitor"
operation would return success.
Beyond that, the cluster doesn't care what "ready" means, so it's OK if
it's not fully operational by some measure. However, that raises the
question of what you're accomplishing with your monitor.
I know all that and my RA respects that. I didn't want to go into
details about the service I'm running, but maybe it will help you
understand. It's a data collector which receives and processes data from
a UDP stream. To understand these data, it needs templates which
periodically occur in the stream (every five minutes or so). After
"start" the service is up and running, "monitor" operations are
successful, but until the templates arrive the service is not "ready". I
basically need to somehow simulate this "ready" state.
If you are able to detect in your RA's monitor that your application is
ready (it has already received its templates), you may want to use
transient node attributes to indicate that to the cluster, and tie
your VIP to such an attribute (using a location constraint with rules).
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_using_rules_to_determine_resource_location.html#_location_rules_based_on_other_node_properties
Look at the pacemaker/ping RA for an example of attribute management.
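
A minimal sketch of what that could look like inside an RA's monitor
action, assuming readiness can be detected somehow. The attribute name
"collector-ready", the ATTRD_UPDATER variable and the publish_ready
function are made up for illustration; only attrd_updater and its
-n/-U/-D options come from Pacemaker itself.

```shell
#!/bin/sh
# Sketch only: ATTR_NAME, ATTRD_UPDATER and publish_ready are hypothetical names.
ATTR_NAME="${ATTR_NAME:-collector-ready}"
# Kept in a variable so the command can be swapped out for a dry run.
ATTRD_UPDATER="${ATTRD_UPDATER:-attrd_updater}"

# publish_ready STATUS
#   STATUS 0  -> set the transient attribute to 1 on the local node
#   otherwise -> delete the attribute, so rules see it as not defined
publish_ready() {
    if [ "$1" -eq 0 ]; then
        "$ATTRD_UPDATER" -n "$ATTR_NAME" -U 1
    else
        "$ATTRD_UPDATER" -n "$ATTR_NAME" -D
    fi
}
```

The monitor action would call publish_ready with the result of its own
readiness check, and the VIP's location rule would then test whether
the attribute is defined.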
[...]
It looks like transient attributes are what I've been looking for, thank
you! I'm not able to detect the "ready" state, but at least I'm able to
store the application's elapsed time since the process was started (with
an upper limit of 1000 seconds) in a transient attribute. I've tied
this to the score of the IP's location constraint, and now the cluster
places the IP resource on the node with the longest-running application
process.
When both clone instances hit the upper limit of 1000 seconds, then both
should be "ready" and the cluster may safely apply other location
preferences. I've also made the IP resource slightly sticky, so it
doesn't oscillate between nodes when several clone instances are started
at the same time.
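
For anyone finding this thread later, here is a rough sketch of the
capped elapsed-time idea as it might appear in the RA's monitor. It is
not the actual RA code: the attribute name "collector-uptime" and the
function names are invented for illustration, and reading the elapsed
time with "ps -o etimes=" assumes a procps-ng ps.

```shell
#!/bin/sh
# Sketch only: all names here are hypothetical, not taken from the real RA.
ATTR_NAME="${ATTR_NAME:-collector-uptime}"
ATTRD_UPDATER="${ATTRD_UPDATER:-attrd_updater}"
UPTIME_CAP=1000   # upper limit in seconds, as described above

# capped_uptime SECONDS: print SECONDS, limited to UPTIME_CAP.
capped_uptime() {
    if [ "$1" -gt "$UPTIME_CAP" ]; then
        echo "$UPTIME_CAP"
    else
        echo "$1"
    fi
}

# publish_uptime PID: store the capped elapsed time of PID as a
# transient node attribute on the local node.
publish_uptime() {
    # "etimes" (elapsed seconds) is assumed to be supported by ps.
    elapsed=$(ps -o etimes= -p "$1" | tr -d ' ')
    [ -n "$elapsed" ] || return 1
    "$ATTRD_UPDATER" -n "$ATTR_NAME" -U "$(capped_uptime "$elapsed")"
}
```

A location rule on the IP resource with score-attribute set to the same
attribute name then uses the stored value as the node's score, so once
every instance reaches the cap the scores are equal and other
preferences decide.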
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org