Re: [ClusterLabs] HA static route

2016-03-14 Thread S0ke
So even if the default gateway is set in
/etc/sysconfig/network-scripts/ifcfg-eth*, that could cause it?


 Original Message 
Subject: Re: [ClusterLabs] HA static route
Local Time: March 14, 2016 9:52 PM
UTC Time: March 15, 2016 2:52 AM
From: denni...@conversis.de
To: s...@protonmail.com,users@clusterlabs.org

On 15.03.2016 02:25, S0ke wrote:
> Trying to do HA for a static route. The resource is fine on HA1. But when I
> try to fail over to HA2 it does not seem to add the route.
>
> Operation start for p_src_eth0DEF (ocf:heartbeat:Route) returned 1
>> stderr: RTNETLINK answers: File exists
>> stderr: ERROR: p_src_eth0DEF Failed to add network route: to default dev 
>> eth0 src 10.10.5.1
>> stderr: DEBUG: p_src_eth0DEF start returned 1
>
> Is there a way to overwrite the route? It seems to be failing because the
> route already exists.

The question is why the route already exists. If you want it to be
managed by Pacemaker then you need to remove it from any other startup
scripts that create that route.
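
For example, on RHEL/CentOS the usual suspects are a GATEWAY= line in
/etc/sysconfig/network or in the ifcfg files; something along these lines
(illustrative only, not specific to your setup) can help track down and
clean up the competing route:

grep GATEWAY /etc/sysconfig/network /etc/sysconfig/network-scripts/ifcfg-eth*
ip route show default    # see which default route is currently installed
ip route del default     # one-off cleanup before starting the resource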

Regards,
Dennis
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] HA static route

2016-03-14 Thread S0ke
Trying to do HA for a static route. The resource is fine on HA1. But when I try
to fail over to HA2 it does not seem to add the route.

Operation start for p_src_eth0DEF (ocf:heartbeat:Route) returned 1
> stderr: RTNETLINK answers: File exists
> stderr: ERROR: p_src_eth0DEF Failed to add network route: to default dev eth0 
> src 10.10.5.1
> stderr: DEBUG: p_src_eth0DEF start returned 1

Is there a way to overwrite the route? It seems to be failing because the
route already exists.
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] ClusterIP location constraint reappears after reboot

2016-03-14 Thread Ken Gaillot
On 02/22/2016 05:23 PM, Jeremy Matthews wrote:
> Thanks for the quick response again, and pardon for the delay in responding. 
> A colleague of mine and I have been trying some different things today.
> 
> But from the reboot on Friday, further below are the logs from corosync.log 
> from the time of the reboot command to the constraint being added.
> 
> I am not able to perform a "pcs cluster cib-upgrade". The version of pcs that
> I have does not have that option (just cib [filename] and cib-push
> [filename]). My versions at the time of these logs were:

I'm curious whether you were able to solve your issue.

Regarding cib-upgrade, you can use the "cibadmin --upgrade" command
instead, which is what pcs does behind the scenes. For a
better-safe-than-sorry how-to, see:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_upgrading_the_configuration
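
Roughly, and assuming you keep a backup of the CIB first, that would be
something like:

cibadmin --query > cib-backup.xml        # save a copy of the current CIB
cibadmin --upgrade --force               # upgrade the configuration schema
cibadmin --query | grep validate-with    # confirm the new schema version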

> [root@g5se-f3efce Packages]# pcs --version
> 0.9.90
> [root@g5se-f3efce Packages]# pacemakerd --version
> Pacemaker 1.1.11
> Written by Andrew Beekhof
> 
> I think you're right in that we had a script banning the ClusterIP. It is
> called from a message daemon that we created that acts as middleware between
> the cluster software and our application. This daemon has an exit handler
> that calls a script which runs:
> 
> pcs resource ban ClusterIP $host   # where $host is the result of host=`hostname`
> 
> ...because we normally try to push the cluster IP to the other side (though in
> this case, we just have one node), but then right after that the script calls:
> 
> pcs resource clear ClusterIP
> 
> ...but for some reason, it doesn't seem to result in the constraint being
> removed (see even FURTHER below where I show a /var/log/messages log snippet
> with both the constraint addition and removal; this was using an earlier
> version of pacemaker, Pacemaker 1.1.10-1.el6_4.4). I guess with the earlier
> pcs or pacemaker version, these logs went to /var/log/messages, whereas today
> they go to corosync.log.
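
For reference, "pcs resource ban" creates a location constraint with an ID of
the form cli-ban-<resource>-on-<node>, and "pcs resource clear" is what should
remove it again. A quick way to check by hand for a leftover ban (the
constraint ID below is illustrative, not taken from your cluster):

pcs constraint --full                                     # list all constraints with their IDs
pcs constraint remove cli-ban-ClusterIP-on-g5se-f3efce    # remove a stale ban by ID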
> 
> I am in a bit of a conundrum in that if I upgrade pcs to 0.9.149 (retrieved
> and "make install"-ed from github.com because 0.9.139 had a pcs issue with
> one-node clusters), which has the cib-upgrade option, then if I manually
> remove the ClusterIP constraint it causes a problem for our message daemon
> in that it thinks neither side in the cluster is active; something to look
> at on our end. So it seems the removal of the constraint affects our daemon
> with the new pcs. For the time being, I've rolled back pcs to the above
> 0.9.90 version.
> 
> One other thing to mention is that the timing of pacemaker's start may have
> been delayed by what I found out was a change to its init script header (by
> either our daemon or application installation script) from "90 1" to "70 20".
> So in /etc/rc3.d, there is S70pacemaker rather than S90pacemaker. I am not a
> Linux expert by any means. I guess that may affect startup, but I'm not sure
> about shutdown.
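
(If it helps, this is only a sketch of the standard chkconfig workflow on
RHEL 6, not anything specific to your setup: once the desired priorities are
restored in the "# chkconfig:" header of /etc/init.d/pacemaker, the runlevel
links can be checked and recreated from that header with:

chkconfig --list pacemaker   # show the current S/K links per runlevel
chkconfig pacemaker reset    # recreate the links from the init script header
)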
> 
> Corosync logs from the time reboot was issued to the constraint being added:
> 
> Feb 19 15:22:22 [1997] g5se-f3efce  attrd:   notice: 
> attrd_trigger_update:  Sending flush op to all hosts for: standby (true)
> Feb 19 15:22:22 [1997] g5se-f3efce  attrd:   notice: 
> attrd_perform_update:  Sent update 24: standby=true
> Feb 19 15:22:22 [1994] g5se-f3efcecib: info: cib_process_request: 
>   Forwarding cib_modify operation for section status to master 
> (origin=local/attrd/24)
> Feb 19 15:22:22 [1994] g5se-f3efcecib: info: cib_perform_op:  
>   Diff: --- 0.291.2 2
> Feb 19 15:22:22 [1994] g5se-f3efcecib: info: cib_perform_op:  
>   Diff: +++ 0.291.3 (null)
> Feb 19 15:22:22 [1994] g5se-f3efcecib: info: cib_perform_op:  
>   +  /cib:  @num_updates=3
> Feb 19 15:22:22 [1994] g5se-f3efcecib: info: cib_perform_op:  
>   ++ 
> /cib/status/node_state[@id='g5se-f3efce']/transient_attributes[@id='g5se-f3efce']/instance_attributes[@id='status-g5se-f3efce']:
>   
> Feb 19 15:22:22 [1999] g5se-f3efce   crmd: info: 
> abort_transition_graph:Transition aborted by 
> status-g5se-f3efce-standby, standby=true: Transient attribute change (create 
> cib=0.291.3, source=te_update_diff:391, 
> path=/cib/status/node_state[@id='g5se-f3efce']/transient_attributes[@id='g5se-f3efce']/instance_attributes[@id='status-g5se-f3efce'],
>  1)
> Feb 19 15:22:22 [1999] g5se-f3efce   crmd:   notice: do_state_transition: 
>   State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC 
> cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Feb 19 15:22:22 [1994] g5se-f3efcecib: info: cib_process_request: 
>   Completed cib_modify operation for section status: OK (rc=0, 
> origin=g5se-f3efce/attrd/24, version=0.291.3)
> Feb 19 15:22:22 [1998] g5se-f3efcepengine:   notice: update_validation:   
>   pacemaker-1.2-style configuration is also 

Re: [ClusterLabs] documentation on STONITH with remote nodes?

2016-03-14 Thread Adam Spiers
Ken Gaillot  wrote:
> On 03/12/2016 05:07 AM, Adam Spiers wrote:
> > Is there any documentation on how STONITH works on remote nodes?  I
> > couldn't find any on clusterlabs.org, and it's conspicuously missing
> > from:
> > 
> > http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/
> > 
> > I'm guessing the answer is more or less "it works exactly the same as
> > for corosync nodes", however I expect there are nuances which might be
> > worth documenting.  In particular I'm looking for confirmation that
> > STONITH resources for remote nodes will only run on the corosync
> > nodes, and can't run on (other) remote nodes.  My empirical tests seem
> > to confirm this, but reassurance from an expert would be appreciated :-)
> 
> The above link does have some information -- search it for "fencing".

Ah!  Thanks - I had only searched for "STONITH".  Here it is:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/#_fencing_remote_nodes

> You are correct, only full cluster nodes can run fence devices or
> initiate fencing actions.
> 
> Fencing of remote nodes (configured via ocf:pacemaker:remote resource)
> is indeed identical to fencing of full cluster nodes. You can configure
> fence devices for them the same way, and the cluster fences them the
> same way.

Thanks for the confirmation.

> Fencing of guest nodes (configured via remote-node property of a
> resource such as VirtualDomain) is different. For those, fence devices
> are ignored, and the cluster fences them by stopping and starting the
> resource.

Makes sense and is intuitive.  And I also found that it is documented
here:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/#_testing_recovery_and_fencing

Thanks again!

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] documentation on STONITH with remote nodes?

2016-03-14 Thread Ken Gaillot
On 03/12/2016 05:07 AM, Adam Spiers wrote:
> Is there any documentation on how STONITH works on remote nodes?  I
> couldn't find any on clusterlabs.org, and it's conspicuously missing
> from:
> 
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/
> 
> I'm guessing the answer is more or less "it works exactly the same as
> for corosync nodes", however I expect there are nuances which might be
> worth documenting.  In particular I'm looking for confirmation that
> STONITH resources for remote nodes will only run on the corosync
> nodes, and can't run on (other) remote nodes.  My empirical tests seem
> to confirm this, but reassurance from an expert would be appreciated :-)

The above link does have some information -- search it for "fencing".

You are correct, only full cluster nodes can run fence devices or
initiate fencing actions.

Fencing of remote nodes (configured via ocf:pacemaker:remote resource)
is indeed identical to fencing of full cluster nodes. You can configure
fence devices for them the same way, and the cluster fences them the
same way.

Fencing of guest nodes (configured via remote-node property of a
resource such as VirtualDomain) is different. For those, fence devices
are ignored, and the cluster fences them by stopping and starting the
resource.
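
As a rough illustration only (the agent, address and credentials below are
made up, not from any particular setup), a fence device for a remote node is
configured the same way as for a full cluster node:

pcs stonith create fence-remote1 fence_ipmilan ipaddr=192.0.2.10 \
    login=admin passwd=secret pcmk_host_list=remote1

The stonith resource itself will still only ever be scheduled on full cluster
nodes, even though the node it fences is a remote node.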

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Security with Corosync

2016-03-14 Thread Jan Friesse

Nikhil Utane wrote:

Follow-up question.
I noticed that secauth was turned off in my corosync.conf file. I enabled
it on all 3 nodes and restarted the cluster. Everything was working fine.
However I just noticed that I had forgotten to copy the authkey to one of
the nodes. It is present on 2 nodes but not the third. And I did a failover
and the third node took over without any issue.
How is the 3rd node participating in the cluster if it doesn't have the
authkey?


It's just not possible. If you had enabled secauth correctly and did not
have /etc/corosync/authkey, a message like "Could not open
/etc/corosync/authkey: No such file or directory" would show up. There
are a few exceptions:

- you have changed totem.keyfile to a file that exists on all nodes
- you are using totem.key, in which case everything works as expected (it
takes priority over the default authkey file but not over totem.keyfile)
- you are using the COROSYNC_TOTEM_AUTHKEY_FILE environment variable with a
file that exists on all nodes
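
(For completeness: the usual way to get the default authkey in place is to
generate it once and copy it to every node by hand, e.g.

corosync-keygen
scp /etc/corosync/authkey node2:/etc/corosync/
scp /etc/corosync/authkey node3:/etc/corosync/

where node2 and node3 are placeholders for the other cluster nodes.)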


Regards,
  Honza



On Fri, Mar 11, 2016 at 4:15 PM, Nikhil Utane wrote:


Perfect. Thanks for the quick response Honza.

Cheers
Nikhil

On Fri, Mar 11, 2016 at 4:10 PM, Jan Friesse  wrote:


Nikhil,

Nikhil Utane wrote:


Hi,

I changed some configuration and captured packets. I can see that the
data
is already garbled and not in the clear.
So does corosync already have this built-in?
Can somebody provide more details as to what all security features are
incorporated?



See the corosync.conf(5) man page, options crypto_hash and crypto_cipher (for
corosync 2.x), and potentially secauth (for corosync 1.x and 2.x).

Basically corosync by default uses aes256 for encryption and sha1 for
hmac authentication.

Pacemaker uses the corosync CPG API, so as long as encryption is enabled in
corosync.conf, messages exchanged between nodes are encrypted.
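
As a minimal sketch (corosync 2.x syntax; the rest of the totem section is
omitted here), the relevant part of corosync.conf would look something like:

totem {
    version: 2
    crypto_cipher: aes256
    crypto_hash: sha1
    # ... transport/interface settings unchanged ...
}

The same corosync.conf and /etc/corosync/authkey then need to be present on
all nodes before corosync is restarted.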

Regards,
   Honza



-Thanks
Nikhil

On Fri, Mar 11, 2016 at 11:38 AM, Nikhil Utane <nikhil.subscri...@gmail.com> wrote:

Hi,


Does corosync provide a mechanism to secure the communication path between
nodes of a cluster?
I would like all the data that gets exchanged between all nodes to be
encrypted.

A quick Google search turned up this link:
https://github.com/corosync/corosync/blob/master/SECURITY

Can I make use of it with pacemaker?

-Thanks
Nikhil






___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org