Re: [ClusterLabs] pacemaker doesn't failover when httpd killed

2016-09-05 Thread Nurit Vilosny
Perfect! I did miss it. Thanks for the help!!

-----Original Message-----
From: Kristoffer Grönlund [mailto:kgronl...@suse.com] 
Sent: Monday, September 05, 2016 3:27 PM
To: Nurit Vilosny <nur...@mellanox.com>; users@clusterlabs.org
Subject: RE: [ClusterLabs] pacemaker doesn't failover when httpd killed

Nurit Vilosny <nur...@mellanox.com> writes:

> Here is the configuration for the httpd:
>
> # pcs resource show cluster_virtualIP
> Resource: cluster_virtualIP (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=10.215.53.99
>   Operations: monitor interval=20s (cluster_virtualIP-monitor-interval-20s)
>               start interval=0s timeout=20s (cluster_virtualIP-start-interval-0s)
>               stop interval=0s timeout=20s on-fail=restart (cluster_virtualIP-stop-interval-0s)
>
> (yes - I have monitoring configured and yes I used the ocf)
>

Hi Nurit,

That's just the cluster resource for managing a virtual IP, not the resource 
for managing the httpd daemon itself.

If you've only got this resource, then there is nothing that monitors the web 
server. You need a cluster resource for the web server as well 
(ocf:heartbeat:apache, usually).

You are missing both that resource and the constraints that ensure that the 
virtual IP is active on the same node as the web server. The Clusters from 
Scratch document on the clusterlabs.org website shows you how to configure this.

Cheers,
Kristoffer

--
// Kristoffer Grönlund
// kgronl...@suse.com
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] pacemaker doesn't failover when httpd killed

2016-09-05 Thread Kristoffer Grönlund
Nurit Vilosny writes:

> Here is the configuration for the httpd:
>
> # pcs resource show cluster_virtualIP
> Resource: cluster_virtualIP (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=10.215.53.99
>   Operations: monitor interval=20s (cluster_virtualIP-monitor-interval-20s)
>               start interval=0s timeout=20s (cluster_virtualIP-start-interval-0s)
>               stop interval=0s timeout=20s on-fail=restart (cluster_virtualIP-stop-interval-0s)
>
> (yes - I have monitoring configured and yes I used the ocf)
>

Hi Nurit,

That's just the cluster resource for managing a virtual IP, not the
resource for managing the httpd daemon itself.

If you've only got this resource, then there is nothing that monitors
the web server. You need a cluster resource for the web server as well
(ocf:heartbeat:apache, usually).

You are missing both that resource and the constraints that ensure that
the virtual IP is active on the same node as the web server. The
Clusters from Scratch document on the clusterlabs.org website shows you
how to configure this.
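
For illustration only, here is a minimal sketch of the two missing pieces
using pcs. It reuses the cluster_virtualIP resource shown above; the resource
name WebSite, the httpd.conf path and the status URL are assumptions, so
adjust them to your setup (the status URL needs mod_status enabled). The
script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: WebSite, the config path and the status URL are assumptions.
# DRY_RUN=1 (the default) prints each pcs command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

configure_web_failover() {
    vip="$1"   # existing IPaddr2 resource, here cluster_virtualIP
    web="$2"   # hypothetical name for the new apache resource

    # Apache resource with a recurring monitor, so a killed httpd is noticed.
    run pcs resource create "$web" ocf:heartbeat:apache \
        configfile=/etc/httpd/conf/httpd.conf \
        statusurl=http://localhost/server-status \
        op monitor interval=30s

    # Keep the web server on the same node as the virtual IP.
    run pcs constraint colocation add "$web" with "$vip" INFINITY

    # Bring the IP up before the web server.
    run pcs constraint order "$vip" then "$web"
}

configure_web_failover cluster_virtualIP WebSite
```

If the apache resource already lives in a resource group, adding the virtual
IP to that group should give the same colocation and ordering without the
explicit constraints.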

Cheers,
Kristoffer

-- 
// Kristoffer Grönlund
// kgronl...@suse.com



Re: [ClusterLabs] pacemaker doesn't failover when httpd killed

2016-09-05 Thread Digimer
Depends on your OS, but generally /var/log/messages. Also, please share
your full pacemaker config. Please only obfuscate passwords.
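
A rough sketch of how that information could be gathered, in case it helps.
The daemon names in the pattern are the usual syslog tags for Pacemaker 1.1
and Corosync, which is an assumption about your version, and the log path is
only the common default.

```shell
#!/bin/sh
# Sketch: keep only syslog lines written by the cluster daemons.
# The daemon names are assumed; adjust the pattern if your distribution
# logs under different tags.
filter_cluster_log() {
    grep -E '(pacemakerd|crmd|lrmd|pengine|stonith-ng|attrd|cib|corosync)(\[[0-9]+\])?:'
}

# Assumed log location; the path varies by OS.
LOG_FILE="${LOG_FILE:-/var/log/messages}"

if [ -r "$LOG_FILE" ]; then
    # The last 200 cluster-related lines are usually enough for a first look.
    filter_cluster_log < "$LOG_FILE" | tail -n 200
fi

# Configuration and status to attach alongside the log excerpt
# (skipped quietly on machines without pcs).
if command -v pcs >/dev/null 2>&1; then
    pcs config
    pcs status --full
fi
```

Capture the output on the node where httpd was killed, covering the time of
the kill.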

digimer

On 05/09/16 07:53 PM, Nurit Vilosny wrote:
> Hi Kristoffer,
> Thanks for the prompt answer.
> Result of kill -9 is a dead process. Restart is not being performed.
> Can you tell me what logs to attach, so I can add them?
> 
> -----Original Message-----
> From: Kristoffer Grönlund [mailto:kgronl...@suse.com] 
> Sent: Monday, September 05, 2016 9:35 AM
> To: Nurit Vilosny <nur...@mellanox.com>; users@clusterlabs.org
> Subject: Re: [ClusterLabs] pacemaker doesn't failover when httpd killed
> 
> Nurit Vilosny <nur...@mellanox.com> writes:
> 
>> Hi everyone,
>> I tried IRC for this, but I got disconnected and could not see the reply,
>> so I am trying again here.
>> I have a cluster with 3 nodes and 2 services, apache and an application
>> service, grouped together.
>> While debugging the cluster I used kill -9 to kill the httpd process,
>> assuming the services would migrate to another node, but they didn't.
>> The log didn't show anything, and I remember reading that Pacemaker checks
>> httpd status somewhere other than service httpd status, but I couldn't
>> find where.
>> Any idea what I can do?
> 
> Without any attached logs or before/after status information, it's difficult 
> to know what exactly happened in your case. But by default, Pacemaker tries 
> to restart the service on the same node before migrating to another node. So 
> running kill -9 on httpd should result in a restart on the same node, not a 
> migration.
> 
> Cheers,
> Kristoffer
> 
>>
>> Thanks.
>> Nurit
> 
> --
> // Kristoffer Grönlund
> // kgronl...@suse.com


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



Re: [ClusterLabs] pacemaker doesn't failover when httpd killed

2016-09-05 Thread Nurit Vilosny
Here is the configuration for the httpd:

# pcs resource show cluster_virtualIP
Resource: cluster_virtualIP (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.215.53.99
  Operations: monitor interval=20s (cluster_virtualIP-monitor-interval-20s)
              start interval=0s timeout=20s (cluster_virtualIP-start-interval-0s)
              stop interval=0s timeout=20s on-fail=restart (cluster_virtualIP-stop-interval-0s)

(yes - I have monitoring configured and yes I used the ocf)

Regards,
Nurit

-----Original Message-----
From: Kristoffer Grönlund [mailto:kgronl...@suse.com] 
Sent: Monday, September 05, 2016 2:01 PM
To: Nurit Vilosny <nur...@mellanox.com>; users@clusterlabs.org
Subject: RE: [ClusterLabs] pacemaker doesn't failover when httpd killed

Nurit Vilosny <nur...@mellanox.com> writes:

> Hi Kristoffer,
> Thanks for the prompt answer.
> Result of kill -9 is a dead process. Restart is not being performed.
> Can you tell me what logs to attach, so I can add them?

Hi Nurit,

Start by attaching your configuration. Do you have a monitoring operation 
configured for your apache resource? Did you use the OCF resource agent?

Cheers,
Kristoffer

--
// Kristoffer Grönlund
// kgronl...@suse.com


Re: [ClusterLabs] pacemaker doesn't failover when httpd killed

2016-09-05 Thread Kristoffer Grönlund
Nurit Vilosny writes:

> Hi everyone,
> I tried IRC for this, but I got disconnected and could not see the reply,
> so I am trying again here.
> I have a cluster with 3 nodes and 2 services, apache and an application
> service, grouped together.
> While debugging the cluster I used kill -9 to kill the httpd process,
> assuming the services would migrate to another node, but they didn't.
> The log didn't show anything, and I remember reading that Pacemaker checks
> httpd status somewhere other than service httpd status, but I couldn't
> find where.
> Any idea what I can do?

Without any attached logs or before/after status information, it's
difficult to know what exactly happened in your case. But by default,
Pacemaker tries to restart the service on the same node before migrating
to another node. So running kill -9 on httpd should result in a restart
on the same node, not a migration.
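
If you do want a failure to move the service, the migration-threshold meta
attribute controls how many failures are tolerated on a node before the
resource is moved away. A minimal sketch with pcs follows; WebSite is a
hypothetical resource name, and the script only prints the commands unless
DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: WebSite is a hypothetical resource name.
# DRY_RUN=1 (the default) prints each pcs command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Show how many failures the cluster has recorded for a resource.
show_failcount() {
    run pcs resource failcount show "$1"
}

# Move the resource to another node once it has failed $2 times on a node.
set_migration_threshold() {
    run pcs resource meta "$1" migration-threshold="$2"
}

show_failcount WebSite
set_migration_threshold WebSite 1
```

With a threshold of 1, the node stays ineligible for that resource until the
failure count is cleared, for example with pcs resource cleanup WebSite.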

Cheers,
Kristoffer

>
> Thanks.
> Nurit

-- 
// Kristoffer Grönlund
// kgronl...@suse.com
