On 21.4.2017 14:14, Vladislav Bogdanov wrote:
20.04.2017 23:16, Jan Wrona wrote:
On 20.4.2017 19:33, Ken Gaillot wrote:
On 04/20/2017 10:52 AM, Jan Wrona wrote:
Hello,

my problem is closely related to the thread [1], but I didn't find a
solution there. I have a resource that is set up as a clone C restricted
to two copies (using the clone-max=2 meta attribute), because the
resource takes a long time to get ready (it starts immediately, though) [...]
A resource agent must not return from "start" until a "monitor"
operation would return success.

Beyond that, the cluster doesn't care what "ready" means, so it's OK if
it's not fully operational by some measure. However, that raises the
question of what you're accomplishing with your monitor.
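
For illustration, a minimal sketch of that convention in a shell OCF
agent (the daemon and helper names are made up; the agent's monitor
action is assumed to be defined elsewhere in the RA):

    # assumes the usual OCF includes, e.g.:
    # : ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
    # . ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

    app_start() {
        # "start" on an already-running resource is a no-op
        app_monitor && return $OCF_SUCCESS

        start_my_collector_daemon || return $OCF_ERR_GENERIC

        # do not return from "start" until "monitor" would succeed
        while ! app_monitor; do
            sleep 2
        done
        return $OCF_SUCCESS
    }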
I know all that and my RA respects that. I didn't want to go into
details about the service I'm running, but maybe it will help you
understand. It's a data collector that receives and processes data from
a UDP stream. To make sense of those data, it needs templates that appear
in the stream periodically (every five minutes or so). After "start" the
service is up and running and "monitor" operations succeed, but until the
templates arrive the service is not "ready". I basically need a way to
signal this "ready" state to the cluster.

If you are able to detect in your RA's monitor that your application is ready (i.e., it has already received its templates), you may want to use transient node attributes to indicate that to the cluster, and tie your VIP to such an attribute with a rule-based location constraint.

http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_using_rules_to_determine_resource_location.html#_location_rules_based_on_other_node_properties

Look at the pacemaker/ping RA for an example of attribute management.
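
Roughly like this, for example (just a sketch: the attribute name
"app-ready", the vip resource id, the process name, and the template
check are placeholders, not anything from an existing RA):

    app_monitor() {
        # is the collector process alive at all?
        pgrep -f my_collector >/dev/null || return $OCF_NOT_RUNNING

        # publish a transient node attribute: 1 once the collector has
        # seen its templates, 0 before that (app_has_templates is a
        # hypothetical helper)
        if app_has_templates; then
            attrd_updater -n app-ready -U 1
        else
            attrd_updater -n app-ready -U 0
        fi
        return $OCF_SUCCESS
    }

The VIP then gets a rule-based location constraint on that attribute,
e.g. with pcs (crm shell or raw CIB XML works just as well):

    pcs constraint location vip rule score=INFINITY app-ready eq 1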

[...]

It looks like transient attributes are what I've been looking for, thank you! I'm not able to detect the "ready" state directly, but I can at least write the application's elapsed time since the process was started (capped at 1000 seconds) into a transient attribute. I've tied that to the IP's location constraint score, so the cluster now places the IP resource on the node with the longest-running application process. Once both clone instances hit the 1000-second cap, both should be "ready" and the cluster may safely apply other location preferences. I've also made the IP resource slightly sticky, so it doesn't oscillate between nodes when several clone instances are started at the same time.
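
For reference, it boils down to something like the following (the
attribute name "app-uptime", the vip resource id, and the stickiness
value here are illustrative, not copied from my actual configuration):

    # in the RA's monitor: publish elapsed seconds, capped at 1000
    # ($pid is the collector daemon's PID, e.g. read from its pidfile)
    elapsed=$(ps -o etimes= -p "$pid" | tr -d ' ')
    [ "$elapsed" -gt 1000 ] && elapsed=1000
    attrd_updater -n app-uptime -U "$elapsed"

    # location rule: use the attribute's value itself as the score,
    # so the node with the longest-running instance wins
    pcs constraint location vip rule score-attribute=app-uptime defined app-uptime

    # mild stickiness so the IP does not bounce while instances ramp up
    pcs resource meta vip resource-stickiness=100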



_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

