Re: [ClusterLabs] HA static route
So even if the default gateway is set in /etc/sysconfig/network-scripts/ifcfg-eth* that could cause it?

Original Message
Subject: Re: [ClusterLabs] HA static route
Local Time: March 14, 2016 9:52 PM
UTC Time: March 15, 2016 2:52 AM
From: denni...@conversis.de
To: s...@protonmail.com, users@clusterlabs.org

On 15.03.2016 02:25, S0ke wrote:
> Trying to do HA for a static route. The resource is fine on HA1. But when I
> try to fail over to HA2 it does not seem to add the route.
>
> Operation start for p_src_eth0DEF (ocf:heartbeat:Route) returned 1
>> stderr: RTNETLINK answers: File exists
>> stderr: ERROR: p_src_eth0DEF Failed to add network route: to default dev
>> eth0 src 10.10.5.1
>> stderr: DEBUG: p_src_eth0DEF start returned 1
>
> Is there a way to overwrite the route? It seems to be failing because the
> route already exists.

The question is why the route exists already. If you want it to be managed by
Pacemaker then you need to remove it from any other startup scripts that
create that route.

Regards,
Dennis

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
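[Editor's note] A minimal sketch of acting on this advice, assuming the duplicate route comes from the RHEL/CentOS network scripts; file and interface names may differ on your system:

```shell
# See which default route already exists on the standby node:
ip route show default

# If it is configured statically, remove the GATEWAY= line (or the matching
# entry in route-eth0) so the network scripts stop creating it at boot:
grep -n '^GATEWAY=' /etc/sysconfig/network-scripts/ifcfg-eth0
grep -n 'default' /etc/sysconfig/network-scripts/route-eth0 2>/dev/null

# Delete the stale route once by hand, then let Pacemaker's Route resource
# be the only thing that manages it:
ip route del default dev eth0
```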
[ClusterLabs] HA static route
Trying to do HA for a static route. The resource is fine on HA1. But when I try to fail over to HA2 it does not seem to add the route.

Operation start for p_src_eth0DEF (ocf:heartbeat:Route) returned 1
> stderr: RTNETLINK answers: File exists
> stderr: ERROR: p_src_eth0DEF Failed to add network route: to default dev eth0
> src 10.10.5.1
> stderr: DEBUG: p_src_eth0DEF start returned 1

Is there a way to overwrite the route? It seems to be failing because the route already exists.
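[Editor's note] For context, a Route resource along these lines would produce the error above when eth0 already has a default route; the parameter values here are reconstructed from the error message, not from the poster's actual configuration:

```shell
# ocf:heartbeat:Route runs roughly "ip route add to default dev eth0 src 10.10.5.1"
# on start, which fails with "RTNETLINK answers: File exists" if the route
# is already present.
pcs resource create p_src_eth0DEF ocf:heartbeat:Route \
    destination=default device=eth0 source=10.10.5.1
```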
Re: [ClusterLabs] ClusterIP location constraint reappears after reboot
On 02/22/2016 05:23 PM, Jeremy Matthews wrote:
> Thanks for the quick response again, and pardon for the delay in responding.
> A colleague of mine and I have been trying some different things today.
>
> But from the reboot on Friday, further below are the logs from corosync.log
> from the time of the reboot command to the constraint being added.
>
> I am not able to perform a "pcs cluster cib-upgrade". The version of pcs that
> I have does not have that option (just cib [filename] and cib-push). My
> versions at the time of these logs were:

I'm curious whether you were able to solve your issue.

Regarding cib-upgrade, you can use the "cibadmin --upgrade" command instead,
which is what pcs does behind the scenes. For a better-safe-than-sorry how-to,
see:
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_upgrading_the_configuration

> [root@g5se-f3efce Packages]# pcs --version
> 0.9.90
> [root@g5se-f3efce Packages]# pacemakerd --version
> Pacemaker 1.1.11
> Written by Andrew Beekhof
>
> I think you're right in that we had a script banning the ClusterIP. It is
> called from a message daemon that we created that acts as middleware between
> the cluster software and our application. This daemon has an exit handler
> that calls a script which runs:
>
> pcs resource ban ClusterIP $host    # where $host is the result of host=`hostname`
>
> ...because we normally try to push the cluster IP to the other side (though
> in this case, we just have one node), but right after that the script calls:
>
> pcs resource clear ClusterIP
>
> ...but for some reason it doesn't seem to result in the constraint being
> removed (see even further below, where I show a /var/log/messages snippet
> with both the constraint addition and removal; this was with an earlier
> version of pacemaker, 1.1.10-1.el6_4.4). I guess with the earlier pcs or
> pacemaker version these logs went to messages rather than to corosync.log.
> I am in a bit of a conundrum: if I upgrade pcs to 0.9.149 (retrieved and
> "make install"'ed from github.com, because 0.9.139 had a pcs issue with
> one-node clusters), which has the cib-upgrade option, and then manually
> remove the ClusterIP constraint, our message daemon thinks neither side of
> the cluster is active; something to look at on our end. So it seems the
> removal of the constraint affects our daemon under the new pcs. For the time
> being, I've rolled back pcs to the 0.9.90 version above.
>
> One other thing to mention: the timing of pacemaker's start may have been
> delayed by what I found out was a change to its init-script header (by either
> our daemon or our application's installation script) from "90 1" to "70 20".
> So in /etc/rc3.d there is S70pacemaker rather than S90pacemaker. I am not a
> Linux expert by any means; I guess that may affect startup, but I'm not sure
> about shutdown.
>
> Corosync logs from the time reboot was issued to the constraint being added:
>
> Feb 19 15:22:22 [1997] g5se-f3efce attrd: notice: attrd_trigger_update: Sending flush op to all hosts for: standby (true)
> Feb 19 15:22:22 [1997] g5se-f3efce attrd: notice: attrd_perform_update: Sent update 24: standby=true
> Feb 19 15:22:22 [1994] g5se-f3efce cib: info: cib_process_request: Forwarding cib_modify operation for section status to master (origin=local/attrd/24)
> Feb 19 15:22:22 [1994] g5se-f3efce cib: info: cib_perform_op: Diff: --- 0.291.2 2
> Feb 19 15:22:22 [1994] g5se-f3efce cib: info: cib_perform_op: Diff: +++ 0.291.3 (null)
> Feb 19 15:22:22 [1994] g5se-f3efce cib: info: cib_perform_op: + /cib: @num_updates=3
> Feb 19 15:22:22 [1994] g5se-f3efce cib: info: cib_perform_op: ++ /cib/status/node_state[@id='g5se-f3efce']/transient_attributes[@id='g5se-f3efce']/instance_attributes[@id='status-g5se-f3efce']:
> Feb 19 15:22:22 [1999] g5se-f3efce crmd: info: abort_transition_graph: Transition aborted by status-g5se-f3efce-standby, standby=true: Transient attribute change (create cib=0.291.3, source=te_update_diff:391, path=/cib/status/node_state[@id='g5se-f3efce']/transient_attributes[@id='g5se-f3efce']/instance_attributes[@id='status-g5se-f3efce'], 1)
> Feb 19 15:22:22 [1999] g5se-f3efce crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Feb 19 15:22:22 [1994] g5se-f3efce cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=g5se-f3efce/attrd/24, version=0.291.3)
> Feb 19 15:22:22 [1998] g5se-f3efce pengine: notice: update_validation: pacemaker-1.2-style configuration is also
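[Editor's note] A sketch of the commands discussed in this thread; the constraint id below is hypothetical, so check the actual id on your cluster first:

```shell
# Equivalent of "pcs cluster cib-upgrade" on pcs versions that lack it:
cibadmin --upgrade

# List location constraints to see whether a leftover ban (pcs names them
# cli-ban-<resource>-on-<node>) is still present:
pcs constraint location

# Remove a leftover ban constraint by id, or clear it via the resource:
pcs constraint remove cli-ban-ClusterIP-on-g5se-f3efce
pcs resource clear ClusterIP
```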
Re: [ClusterLabs] documentation on STONITH with remote nodes?
Ken Gaillot wrote:
> On 03/12/2016 05:07 AM, Adam Spiers wrote:
> > Is there any documentation on how STONITH works on remote nodes? I
> > couldn't find any on clusterlabs.org, and it's conspicuously missing from:
> >
> > http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/
> >
> > I'm guessing the answer is more or less "it works exactly the same as for
> > corosync nodes", however I expect there are nuances which might be worth
> > documenting. In particular I'm looking for confirmation that STONITH
> > resources for remote nodes will only run on the corosync nodes, and can't
> > run on (other) remote nodes. My empirical tests seem to confirm this, but
> > reassurance from an expert would be appreciated :-)
>
> The above link does have some information -- search it for "fencing".

Ah! Thanks - I had only searched for "STONITH". Here it is:
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/#_fencing_remote_nodes

> You are correct, only full cluster nodes can run fence devices or initiate
> fencing actions.
>
> Fencing of remote nodes (configured via an ocf:pacemaker:remote resource) is
> indeed identical to fencing of full cluster nodes. You can configure fence
> devices for them the same way, and the cluster fences them the same way.

Thanks for the confirmation.

> Fencing of guest nodes (configured via the remote-node property of a
> resource such as VirtualDomain) is different. For those, fence devices are
> ignored, and the cluster fences them by stopping and starting the resource.

Makes sense and is intuitive. And I also found that it is documented here:
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/#_testing_recovery_and_fencing

Thanks again!
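[Editor's note] For contrast with remote nodes, a guest node is defined by the remote-node meta attribute on a VM resource; all names and paths below are hypothetical. "Fencing" such a node means Pacemaker stops and restarts this resource rather than using a fence device:

```shell
# A libvirt-managed VM that Pacemaker also treats as a guest node "guest1".
# Fence devices are ignored for it; recovery = restart of this resource.
pcs resource create vm-guest1 ocf:heartbeat:VirtualDomain \
    hypervisor="qemu:///system" \
    config="/etc/libvirt/qemu/guest1.xml" \
    meta remote-node=guest1
```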
Re: [ClusterLabs] documentation on STONITH with remote nodes?
On 03/12/2016 05:07 AM, Adam Spiers wrote:
> Is there any documentation on how STONITH works on remote nodes? I couldn't
> find any on clusterlabs.org, and it's conspicuously missing from:
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/
>
> I'm guessing the answer is more or less "it works exactly the same as for
> corosync nodes", however I expect there are nuances which might be worth
> documenting. In particular I'm looking for confirmation that STONITH
> resources for remote nodes will only run on the corosync nodes, and can't
> run on (other) remote nodes. My empirical tests seem to confirm this, but
> reassurance from an expert would be appreciated :-)

The above link does have some information -- search it for "fencing".

You are correct, only full cluster nodes can run fence devices or initiate
fencing actions.

Fencing of remote nodes (configured via an ocf:pacemaker:remote resource) is
indeed identical to fencing of full cluster nodes. You can configure fence
devices for them the same way, and the cluster fences them the same way.

Fencing of guest nodes (configured via the remote-node property of a resource
such as VirtualDomain) is different. For those, fence devices are ignored, and
the cluster fences them by stopping and starting the resource.
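[Editor's note] A sketch of the remote-node case described above; the node name, IP addresses, fence agent, and credentials are hypothetical, so adjust them for your environment:

```shell
# A remote node, connected via pacemaker_remote on the remote host:
pcs resource create remote1 ocf:pacemaker:remote server=192.168.122.10

# A fence device targeting the remote node. The device itself can only run
# on full cluster nodes, but it fences remote1 exactly as it would fence a
# full cluster node:
pcs stonith create fence-remote1 fence_ipmilan \
    pcmk_host_list=remote1 ipaddr=192.168.122.11 \
    login=admin passwd=secret
```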
Re: [ClusterLabs] Security with Corosync
Nikhil Utane wrote:
> Follow-up question. I noticed that secauth was turned off in my
> corosync.conf file. I enabled it on all 3 nodes and restarted the cluster.
> Everything was working fine. However I just noticed that I had forgotten to
> copy the authkey to one of the nodes. It is present on 2 nodes but not the
> third. And I did a failover and the third node took over without any issue.
> How is the 3rd node participating in the cluster if it doesn't have the
> authkey?

It's just not possible. If you had enabled secauth correctly and you didn't
have /etc/corosync/authkey, a message like "Could not open
/etc/corosync/authkey: No such file or directory" would show up.

There are a few exceptions:
- you have changed totem.keyfile to a file that exists on all nodes
- you are using totem.key, in which case everything works as expected (it has
  priority over the default authkey file but not over totem.keyfile)
- you are using the COROSYNC_TOTEM_AUTHKEY_FILE env variable with a file that
  exists on all nodes

Regards,
Honza

> On Fri, Mar 11, 2016 at 4:15 PM, Nikhil Utane wrote:
>> Perfect. Thanks for the quick response Honza.
>> Cheers
>> Nikhil
>>
>> On Fri, Mar 11, 2016 at 4:10 PM, Jan Friesse wrote:
>>> Nikhil,
>>>
>>> Nikhil Utane wrote:
>>>> Hi,
>>>> I changed some configuration and captured packets. I can see that the
>>>> data is already garbled and not in the clear. So does corosync already
>>>> have this built in? Can somebody provide more details as to what
>>>> security features are incorporated?
>>>
>>> See the man page corosync.conf(5), options crypto_hash and crypto_cipher
>>> (for corosync 2.x) and potentially secauth (for corosync 1.x and 2.x).
>>> Basically corosync by default uses aes256 for encryption and sha1 for
>>> HMAC authentication. Pacemaker uses the corosync CPG API, so as long as
>>> encryption is enabled in corosync.conf, messages exchanged between nodes
>>> are encrypted.
>>>
>>> Regards,
>>> Honza
>>>
>>>> -Thanks
>>>> Nikhil
>>>>
>>>> On Fri, Mar 11, 2016 at 11:38 AM, Nikhil Utane
>>>> <nikhil.subscri...@gmail.com> wrote:
>>>>> Hi,
>>>>> Does corosync provide a mechanism to secure the communication path
>>>>> between nodes of a cluster? I would like all the data that gets
>>>>> exchanged between all nodes to be encrypted. A quick google threw up
>>>>> this link:
>>>>> https://github.com/corosync/corosync/blob/master/SECURITY
>>>>> Can I make use of it with pacemaker?
>>>>> -Thanks
>>>>> Nikhil
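[Editor's note] A minimal sketch of enabling encryption as described in this thread, assuming corosync 2.x, the default key path, and hypothetical node names node2/node3:

```shell
# Generate the key on one node (writes /etc/corosync/authkey, mode 0400),
# then copy it to every node in the cluster:
corosync-keygen
scp -p /etc/corosync/authkey node2:/etc/corosync/authkey
scp -p /etc/corosync/authkey node3:/etc/corosync/authkey

# In the totem section of /etc/corosync/corosync.conf on all nodes
# (corosync 2.x; on 1.x, "secauth: on" enables both settings):
#
#   totem {
#       crypto_cipher: aes256
#       crypto_hash: sha1
#       ...
#   }

# Restart corosync on all nodes for the change to take effect.
```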