Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-11-12 Thread Sihan Goi
Hi,

getenforce returns Enforcing.
ls -dZ /var/www/html returns
drwxr-xr-x. root root system_u:object_r:httpd_sys_content_t:s0 /var/www/html
on both nodes.

Running restorecon doesn't change the ls -dZ output.
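
For reference, the type is the third colon-separated field of the context. The context reported above already has the type httpd_sys_content_t, which would explain why restorecon changes nothing. The check is most meaningful on the node where the file system is currently mounted, since the root of the mounted file system can carry a different label than the empty mount point underneath. A minimal sketch of that comparison follows; the helper names are made up for illustration.

```shell
#!/bin/sh
# Print the type field (third colon-separated part) of an SELinux context string.
selinux_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Hypothetical helper: succeed only if the context carries the type that
# Apache content normally has.
has_httpd_content_type() {
    [ "$(selinux_type "$1")" = "httpd_sys_content_t" ]
}

# Example (not run here; assumes GNU stat, whose %C format prints the context):
#   has_httpd_content_type "$(stat -c %C /var/www/html)" || restorecon -R /var/www/html
```

The functions only parse a string, so they can be tried against pasted ls -dZ output from either node without touching the system.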

On Wed, Nov 12, 2014 at 2:24 PM, Vladislav Bogdanov bub...@hoster-ok.com
wrote:

 11.11.2014 07:27, Sihan Goi wrote:
  Hi,
 
  DocumentRoot is still set to /var/www/html
  ls -al /var/www/html shows different things on the 2 nodes
  node01:
 
  total 28
  drwxr-xr-x. 3 root root  4096 Nov 11 12:25 .
  drwxr-xr-x. 6 root root  4096 Jul 23 22:18 ..
  -rw-r--r--. 1 root root    50 Oct 28 18:00 index.html
  drwx------. 2 root root 16384 Oct 28 17:59 lost+found
 
  node02 only has index.html, no lost+found, and it's a different version
  of the file.
 

 It looks like Apache is unable to stat its document root.
 Could you please show output of two commands:

 getenforce
 ls -dZ /var/www/html

 on both nodes when fs is mounted on one of them?
 If you see 'Enforcing', and the last part of the selinux context of a
 mounted fs root is not httpd_sys_content_t, then run
 'restorecon -R /var/www/html' on that node.

  Status URL is enabled in both nodes.
 
 
  On Oct 30, 2014 11:14 AM, Andrew Beekhof and...@beekhof.net wrote:
 
 
   On 29 Oct 2014, at 1:01 pm, Sihan Goi gois...@gmail.com wrote:
  
   Hi,
  
   I've never used crm_report before. I just read the man file and
  generated a tarball from 1-2 hours before I reconfigured all the
  DRBD related resources. I've put the tarball here -
 
 https://www.dropbox.com/s/suj9pttjp403msv/unexplained-apache-failure.tar.bz2?dl=0
  
   Hope you can help figure out what I'm doing wrong. Thanks for the
  help!
 
  Oct 28 18:13:38 node02 Filesystem(WebFS)[29940]: INFO: Running start
  for /dev/drbd/by-res/wwwdata on /var/www/html
  Oct 28 18:13:39 node02 kernel: EXT4-fs (drbd1): mounted filesystem
  with ordered data mode. Opts:
  Oct 28 18:13:39 node02 crmd[9870]:   notice: process_lrm_event: LRM
  operation WebFS_start_0 (call=164, rc=0, cib-update=298,
  confirmed=true) ok
  Oct 28 18:13:39 node02 crmd[9870]:   notice: te_rsc_command:
  Initiating action 7: start WebSite_start_0 on node02 (local)
  Oct 28 18:13:39 node02 apache(WebSite)[30007]: ERROR: Syntax error
  on line 292 of /etc/httpd/conf/httpd.conf: DocumentRoot must be a
  directory
 
  Is DocumentRoot still set to /var/www/html?
  If so, what happens if you run 'ls -al /var/www/html' in a shell?
 
  Oct 28 18:13:39 node02 apache(WebSite)[30007]: INFO: apache not
 running
  Oct 28 18:13:39 node02 apache(WebSite)[30007]: INFO: waiting for
  apache /etc/httpd/conf/httpd.conf to come up
 
  Did you enable the status url?
 
 http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/_enable_the_apache_status_url.html
 
 
 
  ___
  Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
  mailto:Pacemaker@oss.clusterlabs.org
  http://oss.clusterlabs.org/mailman/listinfo/pacemaker
 
  Project Home: http://www.clusterlabs.org
  Getting started:
 http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
  Bugs: http://bugs.clusterlabs.org
 
 
 
 






-- 
- Goi Sihan
gois...@gmail.com


Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-11-11 Thread Sihan Goi
Hi,

I'm fluent in English so I doubt it's a language barrier. I have reasonable
user experience in Linux, though not extensive experience in the various
system commands, and I have zero experience in HA. I'm in fact trying to
make things as simple as possible by simply following the Clusters from
Scratch guide step by step, and only modifying/omitting steps when they
don't work.

I know a block device (like /dev/sda) is simply a device (such as a hard
disk) that appears like a file in Linux, allowing users buffered access to
the device.
I know a file system is like FAT/NTFS/ext2/etc.
I know a mount point is a directory onto which you can mount a file system
(for example, one stored in an image file). Once mounted, it is as if the
file system has the mount point as its root directory.

I set up DRBD almost exactly like the instructions from Chapter 7 of
Clusters from Scratch. The only differences are in our setups. The guide
assumes Fedora 13, DRBD 8.3 while I'm using CentOS 6.5 and DRBD 8.4.

Since I was following the guide from start to finish, /var/www/html already
has an index.html in it. node01 and node02 each have their own index.html,
with different content. The guide did not instruct me to delete these files,
and it seems to configure the mount point to be /var/www/html (Chapter 7.4)
with an ext4 file system, hence mounting the file system onto a directory
that already has files in it. Is this a problem?


On Tue, Nov 11, 2014 at 6:07 PM, Lars Ellenberg lars.ellenb...@linbit.com
wrote:

 On Tue, Nov 11, 2014 at 12:27:23PM +0800, Sihan Goi wrote:
  Hi,
 
  DocumentRoot is still set to /var/www/html
  ls -al /var/www/html shows different things on the 2 nodes
  node01:
 
  total 28
  drwxr-xr-x. 3 root root  4096 Nov 11 12:25 .
  drwxr-xr-x. 6 root root  4096 Jul 23 22:18 ..
  -rw-r--r--. 1 root root    50 Oct 28 18:00 index.html
  drwx------. 2 root root 16384 Oct 28 17:59 lost+found
 
  node02 only has index.html, no lost+found, and it's a different version
  of the file.

 I'm unsure if there is just a language barrier,
 or if you just do not have enough experience with Linux in general,
 or if you are trying to make things more complicated than they are.

 Do you know
  * what a block device is?
  * what a file system is?
  * what a mount point is?
  * that a mount point may not be empty, even though it typically is?
  * what it means to mount a file system to a mount point?

 Assuming you set up DRBD in a sane way,
 and it is mounted on *one* node (the node where it is Primary),
 then on the *other* node, where it is NOT mounted,
 you will only see the mount point,
 and whatever happens to be in there.

 You probably should clear out the contents of that mount point,
 so that you'd have an empty mount point.

 Or, if you like, replace it with some dummy content
 that clearly shows that this is the mount point,
 and not the file system that is intended to be mounted there.
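
 To tell which of the two you are looking at on a given node, it can help to ask the kernel what, if anything, is mounted at that directory. The following is only a sketch: the function name is made up, and it assumes a Linux-style /proc/mounts table (device in the first column, mount point in the second).

```shell
#!/bin/sh
# Print the device mounted at directory $1, reading a /proc/mounts-style table
# from file $2 (default: /proc/mounts). Prints nothing if nothing is mounted
# there. If several entries match, the last one (the topmost mount) wins.
mounted_device() {
    awk -v dir="$1" '$2 == dir { dev = $1 } END { if (dev != "") print dev }' "${2:-/proc/mounts}"
}

# Example (not run here): on the node where DRBD is Primary and the file system
# is mounted, this would be expected to print a DRBD device; on the other node
# it should print nothing.
#   mounted_device /var/www/html
```

 An empty result on a node means that ls there is showing the bare mount point and its local contents, not the replicated file system.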

  Status URL is enabled in both nodes.

 As for the "DocumentRoot must be a directory" error,
 please double check for typos...


  On Oct 30, 2014 11:14 AM, Andrew Beekhof and...@beekhof.net wrote:
 
  
On 29 Oct 2014, at 1:01 pm, Sihan Goi gois...@gmail.com wrote:
   
Hi,
   
I've never used crm_report before. I just read the man file and
   generated a tarball from 1-2 hours before I reconfigured all the DRBD
   related resources. I've put the tarball here -
  
 https://www.dropbox.com/s/suj9pttjp403msv/unexplained-apache-failure.tar.bz2?dl=0
   
Hope you can help figure out what I'm doing wrong. Thanks for the
 help!
  
   Oct 28 18:13:38 node02 Filesystem(WebFS)[29940]: INFO: Running start
 for
   /dev/drbd/by-res/wwwdata on /var/www/html
   Oct 28 18:13:39 node02 kernel: EXT4-fs (drbd1): mounted filesystem with
   ordered data mode. Opts:
   Oct 28 18:13:39 node02 crmd[9870]:   notice: process_lrm_event: LRM
   operation WebFS_start_0 (call=164, rc=0, cib-update=298,
 confirmed=true) ok
   Oct 28 18:13:39 node02 crmd[9870]:   notice: te_rsc_command: Initiating
   action 7: start WebSite_start_0 on node02 (local)
   Oct 28 18:13:39 node02 apache(WebSite)[30007]: ERROR: Syntax error on
 line
   292 of /etc/httpd/conf/httpd.conf: DocumentRoot must be a directory
  
   Is DocumentRoot still set to /var/www/html?
   If so, what happens if you run 'ls -al /var/www/html' in a shell?
  
   Oct 28 18:13:39 node02 apache(WebSite)[30007]: INFO: apache not running
   Oct 28 18:13:39 node02 apache(WebSite)[30007]: INFO: waiting for apache
   /etc/httpd/conf/httpd.conf to come up
  
   Did you enable the status url?
  
  
 http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/_enable_the_apache_status_url.html


 --
 : Lars Ellenberg
 : http://www.LINBIT.com | Your Way to High Availability
 : DRBD, Linux-HA  and  Pacemaker support and consulting

 DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.


Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-11-10 Thread Sihan Goi
Hi,

DocumentRoot is still set to /var/www/html
ls -al /var/www/html shows different things on the 2 nodes
node01:

total 28
drwxr-xr-x. 3 root root  4096 Nov 11 12:25 .
drwxr-xr-x. 6 root root  4096 Jul 23 22:18 ..
-rw-r--r--. 1 root root    50 Oct 28 18:00 index.html
drwx------. 2 root root 16384 Oct 28 17:59 lost+found

node02 only has index.html, no lost+found, and it's a different version of
the file.

Status URL is enabled in both nodes.


On Oct 30, 2014 11:14 AM, Andrew Beekhof and...@beekhof.net wrote:


  On 29 Oct 2014, at 1:01 pm, Sihan Goi gois...@gmail.com wrote:
 
  Hi,
 
  I've never used crm_report before. I just read the man file and
 generated a tarball from 1-2 hours before I reconfigured all the DRBD
 related resources. I've put the tarball here -
 https://www.dropbox.com/s/suj9pttjp403msv/unexplained-apache-failure.tar.bz2?dl=0
 
  Hope you can help figure out what I'm doing wrong. Thanks for the help!

 Oct 28 18:13:38 node02 Filesystem(WebFS)[29940]: INFO: Running start for
 /dev/drbd/by-res/wwwdata on /var/www/html
 Oct 28 18:13:39 node02 kernel: EXT4-fs (drbd1): mounted filesystem with
 ordered data mode. Opts:
 Oct 28 18:13:39 node02 crmd[9870]:   notice: process_lrm_event: LRM
 operation WebFS_start_0 (call=164, rc=0, cib-update=298, confirmed=true) ok
 Oct 28 18:13:39 node02 crmd[9870]:   notice: te_rsc_command: Initiating
 action 7: start WebSite_start_0 on node02 (local)
 Oct 28 18:13:39 node02 apache(WebSite)[30007]: ERROR: Syntax error on line
 292 of /etc/httpd/conf/httpd.conf: DocumentRoot must be a directory

 Is DocumentRoot still set to /var/www/html?
 If so, what happens if you run 'ls -al /var/www/html' in a shell?

 Oct 28 18:13:39 node02 apache(WebSite)[30007]: INFO: apache not running
 Oct 28 18:13:39 node02 apache(WebSite)[30007]: INFO: waiting for apache
 /etc/httpd/conf/httpd.conf to come up

 Did you enable the status url?

 http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/_enable_the_apache_status_url.html





Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-10-28 Thread Sihan Goi
Hi,

No, I did not do this. I followed the Pacemaker 1.1 - Clusters from scratch
edition 5 for Fedora 13, and in section 7.3.4 it instructed me to run the
following commands, which I did:
mkfs.ext4 /dev/drbd1
mount /dev/drbd1 /mnt
create index.html file in /mnt
umount /dev/drbd1

Subsequently, after unmounting, there were no further instructions to mount
any other directories.

So, how should I mount /dev/mapper/vg_node02-drbd--demo to /var/www/html?
Should I be mounting /dev/mapper/vg_node02-drbd--demo or /dev/drbd1? Since
I've already created index.html on /dev/drbd1, should I be mounting that?
I'm a little confused here.
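
For context: the file system created with mkfs.ext4 /dev/drbd1 lives on the DRBD device, so that is the device that gets mounted. The backing logical volume is normally left alone while DRBD is using it. The cluster logs quoted elsewhere in this thread show the Filesystem resource using /dev/drbd/by-res/wwwdata and the kernel reporting drbd1, which suggests the former is a symlink to the latter. A small sketch for confirming that two paths name the same device; the function name is hypothetical, and it assumes GNU readlink.

```shell
#!/bin/sh
# Succeed if both paths exist and resolve to the same file after following
# symlinks.
same_device() {
    [ -e "$1" ] && [ -e "$2" ] &&
        [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]
}

# Example (not run here; both paths are assumptions about this particular setup):
#   same_device /dev/drbd/by-res/wwwdata /dev/drbd1 && echo "same device"
```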

On Tue, Oct 28, 2014 at 11:41 AM, Andrew Beekhof and...@beekhof.net wrote:


  On 27 Oct 2014, at 6:05 pm, Sihan Goi gois...@gmail.com wrote:
 
  Hi,
 
  That offending line is as follows:
  DocumentRoot /var/www/html
 
  I'm guessing it needs to be updated to the DRBD block device, but I'm
 not sure how to do that, or even what the block device is.
 
  fdisk -l shows the following, which I'm guessing is the block device?
  /dev/mapper/vg_node02-drbd--demo
 
  lvs shows the following:
  drbd-demo vg_node02 -wi-ao  1.00g
 
  btw I'm running the commands on node02 (secondary) rather than node01
 (primary). It's just a matter of convenience due to the physical location
 of the machine. Does it matter?

 Um, you need to mount /dev/mapper/vg_node02-drbd--demo to /var/www/html
 with a FileSystem resource.
 Have you not done this?

 
  Thanks.
 
  On Mon, Oct 27, 2014 at 11:35 AM, Andrew Beekhof and...@beekhof.net
 wrote:
  Oct 27 10:28:44 node02 apache(WebSite)[10515]: ERROR: Syntax error on
 line 292 of /etc/httpd/conf/httpd.conf: DocumentRoot must be a directory
 
 
 
   On 27 Oct 2014, at 1:36 pm, Sihan Goi gois...@gmail.com wrote:
  
   Hi Andrew,
  
   Logs in /var/log/httpd/ are empty, but here's a snippet of
 /var/log/messages right after I start pacemaker and do a crm status
  
   http://pastebin.com/ivQdyV4u
  
   Seems like the Apache service doesn't come up. This only happens after
 I run the commands in the guide to configure DRBD.
  
   On Fri, Oct 24, 2014 at 8:29 AM, Andrew Beekhof and...@beekhof.net
 wrote:
   logs?
  
On 23 Oct 2014, at 1:08 pm, Sihan Goi gois...@gmail.com wrote:
   
Hi, can anyone help? Really stuck here...
   
On Mon, Oct 20, 2014 at 9:46 AM, Sihan Goi gois...@gmail.com
 wrote:
Hi,
   
I'm following the Clusters from Scratch guide for Fedora 13, and
 I've managed to get a 2 node cluster working with Apache. However, once I
 tried to add DRBD 8.4 to the mix, it stopped working.
   
I've followed the DRBD steps in the guide all the way till cib
 commit fs in Section 7.4, right before Testing Migration. However, when
 I do a crm_mon, I get the following failed actions.
   
Last updated: Thu Oct 16 17:28:34 2014
Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
Stack: cman
Current DC: node02 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
5 Resources configured
   
   
Online: [ node01 node02 ]
   
ClusterIP(ocf::heartbeat:IPaddr2):Started node02
 Master/Slave Set: WebDataClone [WebData]
 Masters: [ node02 ]
 Slaves: [ node01 ]
WebFS   (ocf::heartbeat:Filesystem):Started node02
   
Failed actions:
WebSite_start_0 on node02 'unknown error' (1): call=278,
 status=Timed Out, last-rc-change='Thu Oct 16 17:26:28 2014',
 queued=2ms, exec=0ms
WebSite_start_0 on node01 'unknown error' (1): call=203,
 status=Timed
Out, last-rc-change='Thu Oct 16 17:26:09 2014', queued=2ms,
 exec=0ms
   
Seems like the apache Website resource isn't starting up. Apache was
working just fine before I configured DRBD. What did I do wrong?
   
--
- Goi Sihan
gois...@gmail.com
   
   
   
--
- Goi Sihan
gois...@gmail.com
  
  
  
   --
   - Goi Sihan
   gois...@gmail.com

Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-10-28 Thread Sihan Goi
Hi,

I followed those steps previously. I just tried it again, but I'm still
getting the same error. My crm configure show shows the following:

node node01 \
        attributes standby=off
node node02
primitive ClusterIP IPaddr2 \
        params ip=192.168.1.110 cidr_netmask=24 \
        op monitor interval=30s
primitive WebData ocf:linbit:drbd \
        params drbd_resource=wwwdata \
        op monitor interval=60s
primitive WebFS Filesystem \
        params device=/dev/drbd/by-res/wwwdata directory=/var/www/html fstype=ext4
primitive WebSite apache \
        params configfile=/etc/httpd/conf/httpd.conf \
        op monitor interval=1min
ms WebDataClone WebData \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
location prefer-node01 WebSite 50: node01
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip Mandatory: ClusterIP WebSite
property cib-bootstrap-options: \
        dc-version=1.1.10-14.el6_5.3-368c726 \
        cluster-infrastructure=cman \
        stonith-enabled=false \
        no-quorum-policy=ignore
rsc_defaults rsc_defaults-options: \
        migration-threshold=1

What am I doing wrong?

On Tue, Oct 28, 2014 at 5:11 PM, Andrew Beekhof and...@beekhof.net wrote:


  On 28 Oct 2014, at 6:26 pm, Sihan Goi gois...@gmail.com wrote:
 
  Hi,
 
  No, I did not do this. I followed the Pacemaker 1.1 - Clusters from
 scratch edition 5 for Fedora 13, and in section 7.3.4 it instructed me to
 run the following commands, which I did:
  mkfs.ext4 /dev/drbd1
  mount /dev/drbd1 /mnt
  create index.html file in /mnt
  umount /dev/drbd1
 
  Subsequently, after unmounting, there were no further instructions to
 mount any other directories.
 
  So, how should I mount /dev/mapper/vg_node02-drbd--demo to
 /var/www/html? Should I be mounting /dev/mapper/vg_node02-drbd--demo, or
 /dev/drbd1. Since I've already created index.html in /dev/drbd1, should I
 be mounting that? I'm a little confused here.


 http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/_configure_the_cluster_for_drbd.html

 Look for Now that DRBD is functioning we can configure a Filesystem
 resource to use it

 
  On Tue, Oct 28, 2014 at 11:41 AM, Andrew Beekhof and...@beekhof.net
 wrote:
 
   On 27 Oct 2014, at 6:05 pm, Sihan Goi gois...@gmail.com wrote:
  
   Hi,
  
   That offending line is as follows:
   DocumentRoot /var/www/html
  
   I'm guessing it needs to be updated to the DRBD block device, but I'm
 not sure how to do that, or even what the block device is.
  
   fdisk -l shows the following, which I'm guessing is the block device?
   /dev/mapper/vg_node02-drbd--demo
  
   lvs shows the following:
   drbd-demo vg_node02 -wi-ao  1.00g
  
   btw I'm running the commands on node02 (secondary) rather than node01
 (primary). It's just a matter of convenience due to the physical location
 of the machine. Does it matter?
 
  Um, you need to mount /dev/mapper/vg_node02-drbd--demo to /var/www/html
 with a FileSystem resource.
  Have you not done this?
 
  
   Thanks.
  
   On Mon, Oct 27, 2014 at 11:35 AM, Andrew Beekhof and...@beekhof.net
 wrote:
   Oct 27 10:28:44 node02 apache(WebSite)[10515]: ERROR: Syntax error on
 line 292 of /etc/httpd/conf/httpd.conf: DocumentRoot must be a directory
  
  
  
On 27 Oct 2014, at 1:36 pm, Sihan Goi gois...@gmail.com wrote:
   
Hi Andrew,
   
Logs in /var/log/httpd/ are empty, but here's a snippet of
 /var/log/messages right after I start pacemaker and do a crm status
   
http://pastebin.com/ivQdyV4u
   
Seems like the Apache service doesn't come up. This only happens
 after I run the commands in the guide to configure DRBD.
   
On Fri, Oct 24, 2014 at 8:29 AM, Andrew Beekhof and...@beekhof.net
 wrote:
logs?
   
 On 23 Oct 2014, at 1:08 pm, Sihan Goi gois...@gmail.com wrote:

 Hi, can anyone help? Really stuck here...

 On Mon, Oct 20, 2014 at 9:46 AM, Sihan Goi gois...@gmail.com
 wrote:
 Hi,

 I'm following the Clusters from Scratch guide for Fedora 13, and
 I've managed to get a 2 node cluster working with Apache. However, once I
 tried to add DRBD 8.4 to the mix, it stopped working.

 I've followed the DRBD steps in the guide all the way till cib
 commit fs in Section 7.4, right before Testing Migration. However, when
 I do a crm_mon, I get the following failed actions.

 Last updated: Thu Oct 16 17:28:34 2014
 Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
 Stack: cman
 Current DC: node02 - partition with quorum
 Version: 1.1.10-14.el6_5.3-368c726
 2 Nodes configured
 5 Resources configured


 Online: [ node01 node02 ]

 ClusterIP(ocf::heartbeat:IPaddr2):Started

Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-10-28 Thread Sihan Goi
Hi,

I've never used crm_report before. I just read the man file and generated a
tarball from 1-2 hours before I reconfigured all the DRBD related
resources. I've put the tarball here -
https://www.dropbox.com/s/suj9pttjp403msv/unexplained-apache-failure.tar.bz2?dl=0

Hope you can help figure out what I'm doing wrong. Thanks for the help!

On Wed, Oct 29, 2014 at 9:24 AM, Andrew Beekhof and...@beekhof.net wrote:

 Can you run crm_report so we can see the logs and PE files?

  On 28 Oct 2014, at 9:16 pm, Sihan Goi gois...@gmail.com wrote:
 
  Hi,
 
  I followed those steps previously. I just tried it again, but I'm still
 getting the same error. My crm configure show shows the following:
 
  node node01 \
  attributes standby=off
  node node02
  primitive ClusterIP IPaddr2 \
  params ip=192.168.1.110 cidr_netmask=24 \
  op monitor interval=30s
  primitive WebData ocf:linbit:drbd \
  params drbd_resource=wwwdata \
  op monitor interval=60s
  primitive WebFS Filesystem \
  params device=/dev/drbd/by-res/wwwdata
 directory=/var/www/html fstype=ext4
  primitive WebSite apache \
  params configfile=/etc/httpd/conf/httpd.conf \
  op monitor interval=1min
  ms WebDataClone WebData \
  meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1
 notify=true
  location prefer-node01 WebSite 50: node01
  colocation WebSite-with-WebFS inf: WebSite WebFS
  colocation fs_on_drbd inf: WebFS WebDataClone:Master
  colocation website-with-ip inf: WebSite ClusterIP
  order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
  order WebSite-after-WebFS inf: WebFS WebSite
  order apache-after-ip Mandatory: ClusterIP WebSite
  property cib-bootstrap-options: \
  dc-version=1.1.10-14.el6_5.3-368c726 \
  cluster-infrastructure=cman \
  stonith-enabled=false \
  no-quorum-policy=ignore
  rsc_defaults rsc_defaults-options: \
  migration-threshold=1
 
  What am I doing wrong?
 
  On Tue, Oct 28, 2014 at 5:11 PM, Andrew Beekhof and...@beekhof.net
 wrote:
 
   On 28 Oct 2014, at 6:26 pm, Sihan Goi gois...@gmail.com wrote:
  
   Hi,
  
   No, I did not do this. I followed the Pacemaker 1.1 - Clusters from
 scratch edition 5 for Fedora 13, and in section 7.3.4 it instructed me to
 run the following commands, which I did:
   mkfs.ext4 /dev/drbd1
   mount /dev/drbd1 /mnt
   create index.html file in /mnt
   umount /dev/drbd1
  
   Subsequently, after unmounting, there were no further instructions to
 mount any other directories.
  
   So, how should I mount /dev/mapper/vg_node02-drbd--demo to
 /var/www/html? Should I be mounting /dev/mapper/vg_node02-drbd--demo, or
 /dev/drbd1. Since I've already created index.html in /dev/drbd1, should I
 be mounting that? I'm a little confused here.
 
 
 http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Clusters_from_Scratch/_configure_the_cluster_for_drbd.html
 
  Look for Now that DRBD is functioning we can configure a Filesystem
 resource to use it
 
  
   On Tue, Oct 28, 2014 at 11:41 AM, Andrew Beekhof and...@beekhof.net
 wrote:
  
On 27 Oct 2014, at 6:05 pm, Sihan Goi gois...@gmail.com wrote:
   
Hi,
   
That offending line is as follows:
DocumentRoot /var/www/html
   
I'm guessing it needs to be updated to the DRBD block device, but
 I'm not sure how to do that, or even what the block device is.
   
fdisk -l shows the following, which I'm guessing is the block device?
/dev/mapper/vg_node02-drbd--demo
   
lvs shows the following:
drbd-demo vg_node02 -wi-ao  1.00g
   
btw I'm running the commands on node02 (secondary) rather than
 node01 (primary). It's just a matter of convenience due to the physical
 location of the machine. Does it matter?
  
   Um, you need to mount /dev/mapper/vg_node02-drbd--demo to
 /var/www/html with a FileSystem resource.
   Have you not done this?
  
   
Thanks.
   
On Mon, Oct 27, 2014 at 11:35 AM, Andrew Beekhof and...@beekhof.net
 wrote:
Oct 27 10:28:44 node02 apache(WebSite)[10515]: ERROR: Syntax error
 on line 292 of /etc/httpd/conf/httpd.conf: DocumentRoot must be a directory
   
   
   
 On 27 Oct 2014, at 1:36 pm, Sihan Goi gois...@gmail.com wrote:

 Hi Andrew,

 Logs in /var/log/httpd/ are empty, but here's a snippet of
 /var/log/messages right after I start pacemaker and do a crm status

 http://pastebin.com/ivQdyV4u

 Seems like the Apache service doesn't come up. This only happens
 after I run the commands in the guide to configure DRBD.

 On Fri, Oct 24, 2014 at 8:29 AM, Andrew Beekhof 
 and...@beekhof.net wrote:
 logs?

  On 23 Oct 2014, at 1:08 pm, Sihan Goi gois...@gmail.com wrote:
 
  Hi, can anyone help? Really stuck here...
 
  On Mon, Oct 20, 2014 at 9:46 AM, Sihan Goi gois...@gmail.com
 wrote:
  Hi,
 
  I'm following the Clusters from Scratch guide for Fedora 13,
 and I've managed

Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-10-27 Thread Sihan Goi
Hi,

That offending line is as follows:
DocumentRoot /var/www/html

I'm guessing it needs to be updated to the DRBD block device, but I'm not
sure how to do that, or even what the block device is.
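
For what it's worth, DocumentRoot names a directory path rather than a block device, so the error is about whether that path exists as a directory at the moment Apache starts. A minimal sketch of that check is below. The function names are made up, and it only looks at the first uncommented DocumentRoot line in a single file (no Include handling).

```shell
#!/bin/sh
# Print the first active DocumentRoot value from the Apache config file $1
# (double quotes removed; commented-out lines are skipped because their first
# field is not exactly "DocumentRoot").
document_root() {
    awk '$1 == "DocumentRoot" { gsub(/"/, "", $2); print $2; exit }' "$1"
}

# Succeed only if that value is non-empty and names an existing directory.
document_root_is_dir() {
    root=$(document_root "$1")
    [ -n "$root" ] && [ -d "$root" ]
}

# Example (not run here):
#   document_root_is_dir /etc/httpd/conf/httpd.conf || echo "DocumentRoot is not a directory"
```

A check like this, run from a root shell, does not necessarily reflect what the httpd process itself is allowed to see, so a passing result does not rule out a permissions or SELinux problem.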

fdisk -l shows the following, which I'm guessing is the block device?
/dev/mapper/vg_node02-drbd--demo

lvs shows the following:
drbd-demo vg_node02 -wi-ao  1.00g

btw I'm running the commands on node02 (secondary) rather than node01
(primary). It's just a matter of convenience due to the physical location
of the machine. Does it matter?

Thanks.

On Mon, Oct 27, 2014 at 11:35 AM, Andrew Beekhof and...@beekhof.net wrote:

 Oct 27 10:28:44 node02 apache(WebSite)[10515]: ERROR: Syntax error on line
 292 of /etc/httpd/conf/httpd.conf: DocumentRoot must be a directory



  On 27 Oct 2014, at 1:36 pm, Sihan Goi gois...@gmail.com wrote:
 
  Hi Andrew,
 
  Logs in /var/log/httpd/ are empty, but here's a snippet of
 /var/log/messages right after I start pacemaker and do a crm status
 
  http://pastebin.com/ivQdyV4u
 
  Seems like the Apache service doesn't come up. This only happens after I
 run the commands in the guide to configure DRBD.
 
  On Fri, Oct 24, 2014 at 8:29 AM, Andrew Beekhof and...@beekhof.net
 wrote:
  logs?
 
   On 23 Oct 2014, at 1:08 pm, Sihan Goi gois...@gmail.com wrote:
  
   Hi, can anyone help? Really stuck here...
  
   On Mon, Oct 20, 2014 at 9:46 AM, Sihan Goi gois...@gmail.com wrote:
   Hi,
  
   I'm following the Clusters from Scratch guide for Fedora 13, and
 I've managed to get a 2 node cluster working with Apache. However, once I
 tried to add DRBD 8.4 to the mix, it stopped working.
  
   I've followed the DRBD steps in the guide all the way till cib commit
 fs in Section 7.4, right before Testing Migration. However, when I do a
 crm_mon, I get the following failed actions.
  
   Last updated: Thu Oct 16 17:28:34 2014
   Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
   Stack: cman
   Current DC: node02 - partition with quorum
   Version: 1.1.10-14.el6_5.3-368c726
   2 Nodes configured
   5 Resources configured
  
  
   Online: [ node01 node02 ]
  
   ClusterIP(ocf::heartbeat:IPaddr2):Started node02
Master/Slave Set: WebDataClone [WebData]
Masters: [ node02 ]
Slaves: [ node01 ]
   WebFS   (ocf::heartbeat:Filesystem):Started node02
  
   Failed actions:
   WebSite_start_0 on node02 'unknown error' (1): call=278,
 status=Timed Out, last-rc-change='Thu Oct 16 17:26:28 2014',
 queued=2ms, exec=0ms
   WebSite_start_0 on node01 'unknown error' (1): call=203,
 status=Timed
   Out, last-rc-change='Thu Oct 16 17:26:09 2014', queued=2ms,
 exec=0ms
  
   Seems like the apache Website resource isn't starting up. Apache was
   working just fine before I configured DRBD. What did I do wrong?
  
   --
   - Goi Sihan
   gois...@gmail.com
  
  
  
   --
   - Goi Sihan
   gois...@gmail.com
 
 
 
  --
  - Goi Sihan
  gois...@gmail.com






-- 
- Goi Sihan
gois...@gmail.com


Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-10-26 Thread Sihan Goi
Hi Andrew,

Logs in /var/log/httpd/ are empty, but here's a snippet of
/var/log/messages right after I start pacemaker and do a crm status

http://pastebin.com/ivQdyV4u

Seems like the Apache service doesn't come up. This only happens after I
run the commands in the guide to configure DRBD.

On Fri, Oct 24, 2014 at 8:29 AM, Andrew Beekhof and...@beekhof.net wrote:

 logs?

  On 23 Oct 2014, at 1:08 pm, Sihan Goi gois...@gmail.com wrote:
 
  Hi, can anyone help? Really stuck here...
 
  On Mon, Oct 20, 2014 at 9:46 AM, Sihan Goi gois...@gmail.com wrote:
  Hi,
 
  I'm following the Clusters from Scratch guide for Fedora 13, and I've
 managed to get a 2 node cluster working with Apache. However, once I tried
 to add DRBD 8.4 to the mix, it stopped working.
 
  I've followed the DRBD steps in the guide all the way till cib commit
 fs in Section 7.4, right before Testing Migration. However, when I do a
 crm_mon, I get the following failed actions.
 
  Last updated: Thu Oct 16 17:28:34 2014
  Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
  Stack: cman
  Current DC: node02 - partition with quorum
  Version: 1.1.10-14.el6_5.3-368c726
  2 Nodes configured
  5 Resources configured
 
 
  Online: [ node01 node02 ]
 
  ClusterIP (ocf::heartbeat:IPaddr2): Started node02
  Master/Slave Set: WebDataClone [WebData]
      Masters: [ node02 ]
      Slaves: [ node01 ]
  WebFS (ocf::heartbeat:Filesystem): Started node02
 
  Failed actions:
      WebSite_start_0 on node02 'unknown error' (1): call=278,
      status=Timed Out, last-rc-change='Thu Oct 16 17:26:28 2014',
      queued=2ms, exec=0ms
      WebSite_start_0 on node01 'unknown error' (1): call=203,
      status=Timed Out, last-rc-change='Thu Oct 16 17:26:09 2014',
      queued=2ms, exec=0ms
 
  Seems like the apache Website resource isn't starting up. Apache was
  working just fine before I configured DRBD. What did I do wrong?
 
  --
  - Goi Sihan
  gois...@gmail.com
 
 
 
  --
  - Goi Sihan
  gois...@gmail.com




-- 
- Goi Sihan
gois...@gmail.com


Re: [Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-10-22 Thread Sihan Goi
Hi, can anyone help? Really stuck here...

On Mon, Oct 20, 2014 at 9:46 AM, Sihan Goi gois...@gmail.com wrote:

 Hi,

 I'm following the Clusters from Scratch guide for Fedora 13, and I've
 managed to get a 2 node cluster working with Apache. However, once I tried
 to add DRBD 8.4 to the mix, it stopped working.

 I've followed the DRBD steps in the guide all the way till cib commit fs
 in Section 7.4, right before Testing Migration. However, when I do a
 crm_mon, I get the following failed actions.

 Last updated: Thu Oct 16 17:28:34 2014
 Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
 Stack: cman
 Current DC: node02 - partition with quorum
 Version: 1.1.10-14.el6_5.3-368c726
 2 Nodes configured
 5 Resources configured


 Online: [ node01 node02 ]

 ClusterIP (ocf::heartbeat:IPaddr2): Started node02
 Master/Slave Set: WebDataClone [WebData]
     Masters: [ node02 ]
     Slaves: [ node01 ]
 WebFS (ocf::heartbeat:Filesystem): Started node02

 Failed actions:
     WebSite_start_0 on node02 'unknown error' (1): call=278,
     status=Timed Out, last-rc-change='Thu Oct 16 17:26:28 2014',
     queued=2ms, exec=0ms
     WebSite_start_0 on node01 'unknown error' (1): call=203,
     status=Timed Out, last-rc-change='Thu Oct 16 17:26:09 2014',
     queued=2ms, exec=0ms

 Seems like the apache Website resource isn't starting up. Apache was
 working just fine before I configured DRBD. What did I do wrong?

 --
 - Goi Sihan
 gois...@gmail.com




-- 
- Goi Sihan
gois...@gmail.com


[Pacemaker] DRBD with Pacemaker on CentOs 6.5

2014-10-19 Thread Sihan Goi
Hi,

I'm following the Clusters from Scratch guide for Fedora 13, and I've
managed to get a 2 node cluster working with Apache. However, once I tried
to add DRBD 8.4 to the mix, it stopped working.

I've followed the DRBD steps in the guide all the way till cib commit fs
in Section 7.4, right before Testing Migration. However, when I do a
crm_mon, I get the following failed actions.

Last updated: Thu Oct 16 17:28:34 2014
Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
Stack: cman
Current DC: node02 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
5 Resources configured


Online: [ node01 node02 ]

ClusterIP (ocf::heartbeat:IPaddr2): Started node02
Master/Slave Set: WebDataClone [WebData]
    Masters: [ node02 ]
    Slaves: [ node01 ]
WebFS (ocf::heartbeat:Filesystem): Started node02

Failed actions:
    WebSite_start_0 on node02 'unknown error' (1): call=278,
    status=Timed Out, last-rc-change='Thu Oct 16 17:26:28 2014',
    queued=2ms, exec=0ms
    WebSite_start_0 on node01 'unknown error' (1): call=203,
    status=Timed Out, last-rc-change='Thu Oct 16 17:26:09 2014',
    queued=2ms, exec=0ms

Seems like the apache Website resource isn't starting up. Apache was
working just fine before I configured DRBD. What did I do wrong?
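
One check suggested later in this thread is whether SELinux lets Apache read the newly mounted document root. The sketch below is only an illustration: has_httpd_context is a hypothetical helper, and /var/www/html is assumed to be both the DocumentRoot and the WebFS mount point. It was not confirmed to be the cause of this failure.

```shell
#!/bin/sh
# Sketch only: has_httpd_context is a hypothetical helper, not part of any
# package. It returns 0 when an SELinux context string carries the type
# httpd_sys_content_t, which Apache needs on its document root.
has_httpd_context() {
    # A context looks like user:role:type:level; the type is the third field.
    [ "$(printf '%s\n' "$1" | cut -d: -f3)" = "httpd_sys_content_t" ]
}

# Usage on the node where the DRBD filesystem is mounted (run as root):
#   getenforce
#   has_httpd_context "$(stat -c %C /var/www/html)" || restorecon -R /var/www/html
```

If getenforce prints Enforcing and the context type differs, relabelling with restorecon is the step proposed on the list; if the context is already correct, the start timeout has another cause.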

-- 
- Goi Sihan
gois...@gmail.com


Re: [Pacemaker] Linux HA setup for CentOS 6.5

2014-10-16 Thread Sihan Goi
Thanks!

OK, so I've followed the DRBD steps in the guide all the way till cib
commit fs in Section 7.4, right before Testing Migration. However, when
I do a crm_mon, I get the following failed actions.

Last updated: Thu Oct 16 17:28:34 2014
Last change: Thu Oct 16 17:26:04 2014 via crm_shadow on node01
Stack: cman
Current DC: node02 - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
5 Resources configured


Online: [ node01 node02 ]

ClusterIP (ocf::heartbeat:IPaddr2): Started node02
Master/Slave Set: WebDataClone [WebData]
    Masters: [ node02 ]
    Slaves: [ node01 ]
WebFS (ocf::heartbeat:Filesystem): Started node02

Failed actions:
    WebSite_start_0 on node02 'unknown error' (1): call=278,
    status=Timed Out, last-rc-change='Thu Oct 16 17:26:28 2014',
    queued=2ms, exec=0ms
    WebSite_start_0 on node01 'unknown error' (1): call=203,
    status=Timed Out, last-rc-change='Thu Oct 16 17:26:09 2014',
    queued=2ms, exec=0ms

Seems like the apache Website resource isn't starting up. Apache was
working just fine before I configured DRBD. What did I do wrong?

On Thu, Oct 16, 2014 at 1:49 PM, Digimer li...@alteeve.ca wrote:

 On 16/10/14 12:14 AM, Sihan Goi wrote:

 After following the guide, I've successfully managed to get Apache
 server up and running in the cluster as an active/passive setup, but
 with some differences. My cluster stack is stated as being cman while
 the guide's is openais. Not sure if that's a problem. Also, some
 commands in the guide don't seem to work.


 If you can provide examples of what issues you're having, I will be happy
 to try and help.

  I'm moving on to DRBD installation now, but when I do a yum install
 drbd-pacemaker drbd-udev, these packages are not available. After some
 googling, it seems that drbd83-utils/kmod-drbd83 or
 drbd84-utils/kmod-drbd84 is available via another repo. Does this work
 with the guide?


 You need to get them from a 3rd party repo (or install from source). I
 personally still use 8.3.16 (consistency during Anvil! generations), but
 I know that 8.4 is fine on EL6 (and EL7, to address an earlier comment). I
 have my own repos with these packages, but you would likely be better
 served using the ELRepo ones.

 https://alteeve.ca/w/AN!Cluster_Tutorial_2#Installing_DRBD

 The only real difference is to s/83/84/:

 + yum install drbd84-utils kmod-drbd84
 - yum install drbd83-utils kmod-drbd83

 If you run into any troubles, please share details and I am sure we'll get
 you sorted out in no time.

 Cheers


 --
 Digimer
 Papers and Projects: https://alteeve.ca/w/
 What if the cure for cancer is trapped in the mind of a person without
 access to education?





-- 
- Goi Sihan
gois...@gmail.com


Re: [Pacemaker] Linux HA setup for CentOS 6.5

2014-10-15 Thread Sihan Goi
Hi,

So I've decided to make things simpler and go with a wired network instead
of wireless. I connected both boxes to a router, manually edited the
ifcfg-eth0 files to set static IP addresses for both boxes (not before
downloading and building a driver for the nic of 1 of the boxes), did a
chkconfig NetworkManager off, service NetworkManager stop, and service
network restart.

The two boxes can ping each other by IP address and by hostname. I already
have corosync, pacemaker, crmsh and cman installed.

I then did the following as per the guide at
http://geekpeek.net/linux-cluster-corosync-pacemaker

service corosync start - success.
service pacemaker start - I get a Starting cman...corosync cluster engine
is already running [FAILED]

What's up? :(
On Oct 15, 2014 12:23 PM, Sihan Goi gois...@gmail.com wrote:

 No typo.

 [root@node02 network-scripts]# ls -lah
 /etc/sysconfig/network-scripts/ifcfg-*
 -rw-r--r--. 1 root root 254 Oct 10  2013
 /etc/sysconfig/network-scripts/ifcfg-lo

 I installed CentOS 6.5 with the LiveDVD. I found it weird as well that
 these files were missing.

 On Wed, Oct 15, 2014 at 11:54 AM, Digimer li...@alteeve.ca wrote:

 Sure there isn't a typo there?

 an-c05n01:~# ls -lah /etc/sysconfig/network-scripts/ifcfg-*
 -rw-r--r--. 1 root root 225 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-bond0
 -rw-r--r--. 1 root root 220 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-bond1
 -rw-r--r--. 1 root root 198 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-bond2
 -rw-r--r--. 1 root root 149 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth0
 -rw-r--r--. 1 root root 144 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth1
 -rw-r--r--. 1 root root 152 Mar 14  2013 /etc/sysconfig/network-scripts/ifcfg-eth2
 -rw-r--r--. 1 root root 149 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth3
 -rw-r--r--. 1 root root 144 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth4
 -rw-r--r--. 1 root root 152 Mar 14  2013 /etc/sysconfig/network-scripts/ifcfg-eth5
 -rw-r--r--. 1 root root 254 Jul 22 09:56 /etc/sysconfig/network-scripts/ifcfg-lo
 -rw-r--r--. 1 root root 213 Mar 13  2013 /etc/sysconfig/network-scripts/ifcfg-vbr2

 I've never seen an EL6 install without the files there, 'network' or
 NetworkManager aside.

 digimer

 On 14/10/14 11:32 PM, Sihan Goi wrote:

 There aren't any config files in /etc/sysconfig/network-scripts. When I
 was using CentOS 7, the config files were there (ifcfg-something) but in
 this CentOS 6.5 installation, they are missing.

 Is it possible to not use cman, and just use corosync and pacemaker? If
 so, how?

 On Wed, Oct 15, 2014 at 11:22 AM, Digimer li...@alteeve.ca wrote:

 You can manually configure the wireless LAN without NetworkManager.
 If you take a look, there should be existing config files in
 /etc/sysconfig/network-scripts/ for the wireless connection. I've
 not done it myself since many Fedoras ago, but I believe you can
 set NM_CONTROLLED=no and then start it up with
 /etc/init.d/network start. I could be a bit wrong, but I am sure
 you can make wireless work without NM.

 Question; Servers with WLAN? I assume these won't be used for
 corosync?

 digimer


 On 14/10/14 11:17 PM, Sihan Goi wrote:

 Hi,

 Is there a tutorial showing how to get a basic Linux HA setup with
 replicated storage (via DRBD) working on CentOS 6.5? I want to have
 mySQL as the HA resource with the database replicated across the nodes.
 I've scoured the web for one but it seems that I get stuck in each one
 somewhere.

 To elaborate, I have 2 CentOS 6.5 nodes configured with distinct
 hostnames and static IPs. They are connected to a wireless AP, and can
 ping each other.

 I tried following this guide -
 http://clusterlabs.org/quickstart-redhat.html
 However, cman will not start when NetworkManager is running, and my
 nodes cannot connect to the wireless AP without NetworkManager running.
 Am I missing something or is that the stupidest dependency ever? How is
 a cluster supposed to work when the nodes aren't connected to one another?

 I also tried following the clusters from scratch guide but that seems
 to rely on systemctl calls which aren't available on CentOS 6.5.

 Any help?

 --
 - Goi Sihan
 gois...@gmail.com



Re: [Pacemaker] Linux HA setup for CentOS 6.5

2014-10-15 Thread Sihan Goi
Hi,

Thanks for the guide! I thought I had the same exact version...mine is also
named Pacemaker 1.1 Clusters from Scratch Creating Active/Passive and
Active/Active Clusters on Fedora Edition 5, but my version of the document
is meant for Fedora 17, and uses pcs and systemctl calls which don't exist
on CentOS 6.5. I was trying to get it to work on CentOS 7 but realized
support for DRBD on CentOS 7 is really lacking.

I'll refer to the version you posted from here on.

On Wed, Oct 15, 2014 at 11:43 PM, Digimer li...@alteeve.ca wrote:

 Let pacemaker start cman/corosync on EL6.

 This is the guide that covers it, written by Pacemaker's author:

 http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_from_Scratch/index.html

 It notes that it's based on Fedora 13, but that maps to EL6 almost
 perfectly.

 A very slightly altered approach is here, in my *very* unfinished tutorial:

 https://alteeve.ca/w/Anvil!_Tutorial_3_on_EL6#Configuring_the_Anvil.21

 The main difference is that Andrew's approach (see section 8.2.2) is to
 disable quorum via editing /etc/sysconfig/cman, where my approach handles
 it in the main /etc/cluster/cluster.conf (cman's main config file).

 In any case, from then on, start pacemaker and let it handle everything
 else.
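
 The start order described above can be sketched as follows, assuming an EL6 node where corosync was started by hand, as in the failed attempt quoted below. RUN is a hypothetical dry-run switch added for illustration; by default the commands are only printed.

```shell
#!/bin/sh
# Sketch only: on EL6 with cman, pacemaker's init script starts cman (and
# with it corosync), so corosync must not already be running on its own.
# RUN is a hypothetical dry-run switch: it defaults to "echo" so the
# commands are printed rather than executed. Set RUN to an empty value and
# run as root to apply them for real.
RUN=${RUN-echo}

start_cluster_stack() {
    $RUN service corosync stop      # stop the manually started corosync
    $RUN chkconfig corosync off     # keep it from starting on its own at boot
    $RUN service pacemaker start    # let pacemaker bring up cman/corosync
}

start_cluster_stack
```

 Running it unchanged prints the three commands in order, which makes it easy to review them before applying anything on a cluster node.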

 Cheers

 digimer

 On 15/10/14 04:27 AM, Sihan Goi wrote:

 Hi,

 So I've decided to make things simpler and go with a wired network
 instead of wireless. I connected both boxes to a router, manually edited
 the ifcfg-eth0 files to set static IP addresses for both boxes (not
 before downloading and building a driver for the nic of 1 of the boxes),
 did a chkconfig NetworkManager off, service NetworkManager stop, and
 service network restart.

 I'm able to ping each other via IP address and hostname. I also already
 have corosync, pacemaker, crmsh and cman installed.

 I then did the following as per the guide at
 http://geekpeek.net/linux-cluster-corosync-pacemaker

 service corosync start - success.
 service pacemaker start - I get a Starting cman...corosync cluster
 engine is already running [FAILED]

 What's up? :(

 On Oct 15, 2014 12:23 PM, Sihan Goi gois...@gmail.com wrote:

 No typo.

 [root@node02 network-scripts]# ls -lah
 /etc/sysconfig/network-scripts/ifcfg-*
 -rw-r--r--. 1 root root 254 Oct 10  2013
 /etc/sysconfig/network-scripts/ifcfg-lo

 I installed CentOS 6.5 with the LiveDVD. I found it weird as well
 that these files were missing.

 On Wed, Oct 15, 2014 at 11:54 AM, Digimer li...@alteeve.ca wrote:

 Sure there isn't a typo there?

 an-c05n01:~# ls -lah /etc/sysconfig/network-scripts/ifcfg-*
 -rw-r--r--. 1 root root 225 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-bond0
 -rw-r--r--. 1 root root 220 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-bond1
 -rw-r--r--. 1 root root 198 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-bond2
 -rw-r--r--. 1 root root 149 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth0
 -rw-r--r--. 1 root root 144 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth1
 -rw-r--r--. 1 root root 152 Mar 14  2013 /etc/sysconfig/network-scripts/ifcfg-eth2
 -rw-r--r--. 1 root root 149 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth3
 -rw-r--r--. 1 root root 144 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth4
 -rw-r--r--. 1 root root 152 Mar 14  2013 /etc/sysconfig/network-scripts/ifcfg-eth5
 -rw-r--r--. 1 root root 254 Jul 22 09:56 /etc/sysconfig/network-scripts/ifcfg-lo
 -rw-r--r--. 1 root root 213 Mar 13  2013 /etc/sysconfig/network-scripts/ifcfg-vbr2

 I've never seen an EL6 install without the files there,
 'network' or NetworkManager aside.

 digimer

 On 14/10/14 11:32 PM, Sihan Goi wrote:

 There aren't any config files in /etc/sysconfig/network-scripts. When I
 was using CentOS 7, the config files were there (ifcfg-something) but in
 this CentOS 6.5 installation, they are missing.

 Is it possible to not use cman, and just use corosync and pacemaker? If
 so, how?

 On Wed, Oct 15, 2014 at 11:22 AM, Digimer li...@alteeve.ca wrote:

  You can manually configure the wireless LAN without NetworkManager.
  If you take a look, there should be existing config files in
  /etc/sysconfig/network-scripts/ for the wireless connection. I've
  not done it myself since many Fedoras ago, but I believe you can

Re: [Pacemaker] Linux HA setup for CentOS 6.5

2014-10-15 Thread Sihan Goi
After following the guide, I've successfully managed to get Apache server
up and running in the cluster as an active/passive setup, but with some
differences. My cluster stack is stated as being cman while the guide's is
openais. Not sure if that's a problem. Also, some commands in the guide
don't seem to work.

I'm moving on to DRBD installation now, but when I do a yum install
drbd-pacemaker drbd-udev, these packages are not available. After some
googling, it seems that drbd83-utils/kmod-drbd83 or
drbd84-utils/kmod-drbd84 is available via another repo. Does this work with
the guide?

On Thu, Oct 16, 2014 at 9:35 AM, Sihan Goi gois...@gmail.com wrote:

 Hi,

 Thanks for the guide! I thought I had the same exact version...mine is
 also named Pacemaker 1.1 Clusters from Scratch Creating Active/Passive and
 Active/Active Clusters on Fedora Edition 5, but my version of the document
 is meant for Fedora 17, and uses pcs and systemctl calls which don't exist
 on CentOS 6.5. I was trying to get it to work on CentOS 7 but realized
 support for DRBD on CentOS 7 is really lacking.

 I'll refer to the version you posted from here on.

 On Wed, Oct 15, 2014 at 11:43 PM, Digimer li...@alteeve.ca wrote:

 Let pacemaker start cman/corosync on EL6.

 This is the guide that covers it, written by Pacemaker's author:

 http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_from_Scratch/index.html

 It notes that it's based on Fedora 13, but that maps to EL6 almost
 perfectly.

 A very slightly altered approach is here, in my *very* unfinished
 tutorial:

 https://alteeve.ca/w/Anvil!_Tutorial_3_on_EL6#Configuring_the_Anvil.21

 The main difference is that Andrew's approach (see section 8.2.2) is to
 disable quorum via editing /etc/sysconfig/cman, where my approach handles
 it in the main /etc/cluster/cluster.conf (cman's main config file).

 In any case, from then on, start pacemaker and let it handle everything
 else.

 Cheers

 digimer

 On 15/10/14 04:27 AM, Sihan Goi wrote:

 Hi,

 So I've decided to make things simpler and go with a wired network
 instead of wireless. I connected both boxes to a router, manually edited
 the ifcfg-eth0 files to set static IP addresses for both boxes (not
 before downloading and building a driver for the nic of 1 of the boxes),
 did a chkconfig NetworkManager off, service NetworkManager stop, and
 service network restart.

 I'm able to ping each other via IP address and hostname. I also already
 have corosync, pacemaker, crmsh and cman installed.

 I then did the following as per the guide at
 http://geekpeek.net/linux-cluster-corosync-pacemaker

 service corosync start - success.
 service pacemaker start - I get a Starting cman...corosync cluster
 engine is already running [FAILED]

 What's up? :(

 On Oct 15, 2014 12:23 PM, Sihan Goi gois...@gmail.com wrote:

 No typo.

 [root@node02 network-scripts]# ls -lah
 /etc/sysconfig/network-scripts/ifcfg-*
 -rw-r--r--. 1 root root 254 Oct 10  2013
 /etc/sysconfig/network-scripts/ifcfg-lo

 I installed CentOS 6.5 with the LiveDVD. I found it weird as well
 that these files were missing.

 On Wed, Oct 15, 2014 at 11:54 AM, Digimer li...@alteeve.ca wrote:

 Sure there isn't a typo there?

 an-c05n01:~# ls -lah /etc/sysconfig/network-scripts/ifcfg-*
 -rw-r--r--. 1 root root 225 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-bond0
 -rw-r--r--. 1 root root 220 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-bond1
 -rw-r--r--. 1 root root 198 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-bond2
 -rw-r--r--. 1 root root 149 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth0
 -rw-r--r--. 1 root root 144 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth1
 -rw-r--r--. 1 root root 152 Mar 14  2013 /etc/sysconfig/network-scripts/ifcfg-eth2
 -rw-r--r--. 1 root root 149 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth3
 -rw-r--r--. 1 root root 144 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth4
 -rw-r--r--. 1 root root 152 Mar 14  2013 /etc/sysconfig/network-scripts/ifcfg-eth5
 -rw-r--r--. 1 root root 254 Jul 22 09:56 /etc/sysconfig/network-scripts/ifcfg-lo
 -rw-r--r--. 1 root root 213 Mar 13  2013 /etc/sysconfig/network-scripts/ifcfg-vbr2

 I've never seen an EL6 install without the files there,
 'network' or NetworkManager aside.

 digimer

 On 14/10/14 11:32 PM, Sihan Goi wrote:

 There aren't any config files in /etc/sysconfig/network-scripts. When I
 was using CentOS 7, the config files were there (ifcfg-something) but in
 this CentOS 6.5 installation, they are missing.

 If is possible

[Pacemaker] Linux HA setup for CentOS 6.5

2014-10-14 Thread Sihan Goi
Hi,

Is there a tutorial showing how to get a basic Linux HA setup with
replicated storage (via DRBD) working on CentOS 6.5? I want to have MySQL
as the HA resource with the database replicated across the nodes. I've
scoured the web for one, but I get stuck somewhere in each one.

To elaborate, I have 2 CentOS 6.5 nodes configured with distinct hostnames
and static IPs. They are connected to a wireless AP, and can ping each
other.

I tried following this guide - http://clusterlabs.org/quickstart-redhat.html
However, cman will not start when NetworkManager is running, and my nodes
cannot connect to the wireless AP without NetworkManager running. Am I
missing something or is that the stupidest dependency ever? How is a
cluster supposed to work when the nodes aren't connected to one another?

I also tried following the Clusters from Scratch guide, but that seems to
rely on systemctl calls which aren't available on CentOS 6.5.

Any help?

-- 
- Goi Sihan
gois...@gmail.com


Re: [Pacemaker] Linux HA setup for CentOS 6.5

2014-10-14 Thread Sihan Goi
No typo.

[root@node02 network-scripts]# ls -lah
/etc/sysconfig/network-scripts/ifcfg-*
-rw-r--r--. 1 root root 254 Oct 10  2013
/etc/sysconfig/network-scripts/ifcfg-lo

I installed CentOS 6.5 with the LiveDVD. I found it weird as well that
these files were missing.

On Wed, Oct 15, 2014 at 11:54 AM, Digimer li...@alteeve.ca wrote:

 Sure there isn't a typo there?

 an-c05n01:~# ls -lah /etc/sysconfig/network-scripts/ifcfg-*
 -rw-r--r--. 1 root root 225 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-bond0
 -rw-r--r--. 1 root root 220 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-bond1
 -rw-r--r--. 1 root root 198 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-bond2
 -rw-r--r--. 1 root root 149 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth0
 -rw-r--r--. 1 root root 144 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth1
 -rw-r--r--. 1 root root 152 Mar 14  2013 /etc/sysconfig/network-scripts/ifcfg-eth2
 -rw-r--r--. 1 root root 149 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth3
 -rw-r--r--. 1 root root 144 Jan 16  2013 /etc/sysconfig/network-scripts/ifcfg-eth4
 -rw-r--r--. 1 root root 152 Mar 14  2013 /etc/sysconfig/network-scripts/ifcfg-eth5
 -rw-r--r--. 1 root root 254 Jul 22 09:56 /etc/sysconfig/network-scripts/ifcfg-lo
 -rw-r--r--. 1 root root 213 Mar 13  2013 /etc/sysconfig/network-scripts/ifcfg-vbr2

 I've never seen an EL6 install without the files there, 'network' or
 NetworkManager aside.

 digimer

 On 14/10/14 11:32 PM, Sihan Goi wrote:

 There aren't any config files in /etc/sysconfig/network-scripts. When I
 was using CentOS 7, the config files were there (ifcfg-something) but in
 this CentOS 6.5 installation, they are missing.

 Is it possible to not use cman, and just use corosync and pacemaker? If
 so, how?

 On Wed, Oct 15, 2014 at 11:22 AM, Digimer li...@alteeve.ca wrote:

 You can manually configure the wireless LAN without NetworkManager.
 If you take a look, there should be existing config files in
 /etc/sysconfig/network-scripts/ for the wireless connection. I've
 not done it myself since many Fedoras ago, but I believe you can
 set NM_CONTROLLED=no and then start it up with
 /etc/init.d/network start. I could be a bit wrong, but I am sure
 you can make wireless work without NM.

 Question; Servers with WLAN? I assume these won't be used for
 corosync?

 digimer


 On 14/10/14 11:17 PM, Sihan Goi wrote:

 Hi,

 Is there a tutorial showing how to get a basic Linux HA setup with
 replicated storage (via DRBD) working on CentOS 6.5? I want to have
 mySQL as the HA resource with the database replicated across the nodes.
 I've scoured the web for one but it seems that I get stuck in each one
 somewhere.

 To elaborate, I have 2 CentOS 6.5 nodes configured with distinct
 hostnames and static IPs. They are connected to a wireless AP, and can
 ping each other.

 I tried following this guide -
 http://clusterlabs.org/quickstart-redhat.html
 However, cman will not start when NetworkManager is running, and my
 nodes cannot connect to the wireless AP without NetworkManager running.
 Am I missing something or is that the stupidest dependency ever? How is
 a cluster supposed to work when the nodes aren't connected to one another?

 I also tried following the clusters from scratch guide but that seems
 to rely on systemctl calls which aren't available on CentOS 6.5.

 Any help?

 --
 - Goi Sihan
 gois...@gmail.com





 --
 Digimer
 Papers and Projects: https://alteeve.ca/w/
 What if the cure for cancer is trapped in the mind of a person
 without access to education?


Re: [Pacemaker] ERROR: Unable to find nic or netmask.

2014-09-16 Thread Sihan Goi
Figured out the problem - the firewall rules are somehow not persistent.
After running the following commands:

iptables -I INPUT -m state --state NEW -p udp -m multiport --dports 5404,5405 -j ACCEPT
iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 2224 -j ACCEPT
iptables -I INPUT -p igmp -j ACCEPT
iptables -I INPUT -m addrtype --dst-type MULTICAST -j ACCEPT
service iptables save

Both nodes are able to communicate with each other.

Seems like several things aren't persistent upon reboots, and need to be
restarted/reconfigured. Is this the intended behavior?
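
For readers hitting the same reboot problem: the iptables rules above only survive a restart if something reloads the saved rule set at boot. The sketch below shows one alternative, assuming firewalld (the CentOS 7 default) is the active firewall rather than the iptables service; that assumption is not confirmed in this thread. RUN is a hypothetical dry-run switch added for illustration, so by default the commands are only printed.

```shell
#!/bin/sh
# Sketch only, assuming firewalld is the active firewall on CentOS 7.
# RUN is a hypothetical dry-run switch: it defaults to "echo" so the
# commands are printed rather than executed. Set RUN to an empty value and
# run as root to apply them for real.
RUN=${RUN-echo}

open_cluster_ports() {
    # "high-availability" is firewalld's predefined service for the cluster
    # stack; it includes corosync's UDP 5404/5405 and pcsd's TCP 2224.
    $RUN firewall-cmd --permanent --add-service=high-availability
    # --permanent only changes the saved configuration; reload to apply it.
    $RUN firewall-cmd --reload
}

open_cluster_ports
```

Rules added with --permanent are written to firewalld's configuration, so they are restored on every boot without a separate save step.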

On Tue, Sep 2, 2014 at 2:05 PM, Nikita Michalko michalko.sys...@a-i-p.com
wrote:

  Hi,

 maybe the following is helpful:
 http://httpd.apache.org/docs/trunk/bind.html
 http://ubuntuforums.org/showthread.php?t=1636667


 HTH

 Nikita



 On 02.09.2014 07:47, Sihan Goi wrote:

 Hi,

 After some investigation, it seems that my Apache is having trouble
 starting in both nodes. I get the following error message when I try to
 restart the service:

 Job for httpd.service failed. See 'systemctl status httpd.service' and
 'journalctl -xn' for details.

 systemctl status httpd.service shows the following output:

 httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled)
Active: failed (Result: exit-code) since Tue 2014-09-02 13:45:52 SGT; 8s
 ago
   Process: 26095 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited,
 status=0/SUCCESS)
   Process: 26093 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND
 (code=exited, status=1/FAILURE)
  Main PID: 26093 (code=exited, status=1/FAILURE)

 Sep 02 13:45:52 node02 httpd[26093]: AH00558: httpd: Could not reliably
 det...ge
 Sep 02 13:45:52 node02 httpd[26093]: (98)Address already in use: AH00072:
 m...80
 Sep 02 13:45:52 node02 httpd[26093]: no listening sockets available,
 shutti...wn
 Sep 02 13:45:52 node02 httpd[26093]: AH00015: Unable to open logs
 Sep 02 13:45:52 node02 systemd[1]: httpd.service: main process exited,
 code...RE
 Sep 02 13:45:52 node02 systemd[1]: Failed to start The Apache HTTP Server.
 Sep 02 13:45:52 node02 systemd[1]: Unit httpd.service entered failed state.
 Hint: Some lines were ellipsized, use -l to show in full.

 /var/log/messages also shows similar messages

 Sep  2 13:41:12 node02 systemd: Starting The Apache HTTP Server...
 Sep  2 13:41:12 node02 httpd: AH00558: httpd: Could not reliably determine
 the server's fully qualified domain name, using 192.168.0.112. Set the
 'ServerName' directive globally to suppress this message
 Sep  2 13:41:12 node02 httpd: (98)Address already in use: AH00072:
 make_sock: could not bind to address 127.0.0.1:80
 Sep  2 13:41:12 node02 httpd: no listening sockets available, shutting down
 Sep  2 13:41:12 node02 httpd: AH00015: Unable to open logs
 Sep  2 13:41:12 node02 systemd: httpd.service: main process exited,
 code=exited, status=1/FAILURE
 Sep  2 13:41:12 node02 systemd: Failed to start The Apache HTTP Server.
 Sep  2 13:41:12 node02 systemd: Unit httpd.service entered failed state.

 Is this related to the problem?
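
 The "Address already in use" line means something else is already listening on port 80. A small sketch for confirming that follows; port_in_use is a hypothetical helper written for this note, and it only parses `ss -ltn` output. One possible cause, not confirmed in this thread, is an httpd instance that the cluster has already started.

```shell
#!/bin/sh
# Sketch only: port_in_use is a hypothetical helper, not part of any package.
# It reads `ss -ltn` output on stdin and returns 0 if some listening socket
# has a local address ending in the given port.
port_in_use() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            # Column 4 is "Local Address:Port"; the port follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage (as root, so that -p can show the owning process):
#   ss -ltn | port_in_use 80 && ss -ltnp | grep ':80 '
```

 If the owner turns out to be an httpd started by Pacemaker, starting a second copy through systemctl will always fail this way, and the resource should be managed through the cluster instead.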



 On Tue, Sep 2, 2014 at 12:42 PM, Teerapatr Kittiratanachai 
 maillist...@gmail.com wrote:


  Try to set cidr_netmask=32 for resource only, and let the physical
 interface's netmask be 24.

 On Tue, Sep 2, 2014 at 11:27 AM, Sihan Goi gois...@gmail.com 
 gois...@gmail.com wrote:

  Got it. Changed the netmask for both PCs to 255.255.255.0 and changed
 cidr_netmask to 24 and it works...sort of.

 It was working for a while, and then I rebooted both PCs, and now each
 thinks it's online and the other is offline.

 pcs status on my node01 gives the following output:
 Cluster name: cluster_web
 Last updated: Tue Sep  2 12:21:25 2014
 Last change: Tue Sep  2 12:13:27 2014 via cibadmin on node02
 Stack: corosync
 Current DC: node01 (1) - partition WITHOUT quorum
 Version: 1.1.10-32.el7_0-368c726
 2 Nodes configured
 2 Resources configured


 Online: [ node01 ]
 OFFLINE: [ node02 ]

 Full list of resources:

  virtual_ip(ocf::heartbeat:IPaddr2):Started node01
  webserver(ocf::heartbeat:apache):Started node01

 PCSD Status:
   node01: Offline
   node02: Online

 Daemon Status:
   corosync: active/disabled
   pacemaker: active/disabled
   pcsd: active/disabled

 However, pcs status on node02 shows the following output:
 Cluster name: cluster_web
 Last updated: Tue Sep  2 12:20:41 2014
 Last change: Tue Sep  2 11:59:03 2014 via cibadmin on node02
 Stack: corosync
 Current DC: node02 (2) - partition WITHOUT quorum
 Version: 1.1.10-32.el7_0-368c726
 2 Nodes

Re: [Pacemaker] ERROR: Unable to find nic or netmask.

2014-09-16 Thread Sihan Goi
I mean things like firewall settings, as well as services like pcsd,
pacemaker and corosync not starting up automatically sometimes.

On Tue, Sep 16, 2014 at 5:10 PM, Nikita Michalko michalko.sys...@a-i-p.com
wrote:

  On 16.09.2014 10:31, Sihan Goi wrote:

 Figured out the problem - the firewall rules are somehow not persistent.
 After running the following commands:

 iptables -I INPUT -m state --state NEW -p udp -m multiport --dports 5404,5405 -j ACCEPT
 iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 2224 -j ACCEPT
 iptables -I INPUT -p igmp -j ACCEPT
 iptables -I INPUT -m addrtype --dst-type MULTICAST -j ACCEPT
 service iptables save

 Both nodes are able to communicate with each other.

 Seems like several things aren't persistent upon reboots, and need to be
 restarted/reconfigured. Is this the intended behavior?

  What do you mean by several things? Firewall/iptables on CentOS 7? Or
 Pacemaker/Corosync/pcs?


 Nikita


 On Tue, Sep 2, 2014 at 2:05 PM, Nikita Michalko michalko.sys...@a-i-p.com
 wrote:


   Hi,

 Maybe the following is helpful:
 http://httpd.apache.org/docs/trunk/bind.html
 http://ubuntuforums.org/showthread.php?t=1636667


 HTH

 Nikita




[Pacemaker] Notification when a node is down

2014-09-12 Thread Sihan Goi
Hi,

Is there any way for a Pacemaker/Corosync/PCS setup to send a notification
when it detects that a node in a cluster is down? I read that Pacemaker and
Corosync log events to syslog, but where is the syslog file in CentOS? Do
they log events such as a failover occurrence?

Thanks.

-- 
- Goi Sihan
gois...@gmail.com
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] pcs cluster auth shows Error: Unable to communicate with node message

2014-09-09 Thread Sihan Goi
Hi,

I had a basic HA setup working with 2 nodes previously running a simple
Apache web server on a private local network. However, I'm having trouble
getting it to work right now, and I haven't changed anything other than
rebooting a few times.

Firstly, I've noticed that I need to start the pcsd service manually after
every reboot with systemctl start pcsd. Corosync seems to start
automatically.

After starting pcsd and restarting the cluster, the HA cluster used to
work. However, now it doesn't seem to. pcs status on node01 would
show node01 as online and node02 as offline, and vice versa. When I try pcs
cluster auth node02 from node01, I'd get "Error: Unable to communicate
with node02", even though I'm able to ping both the IP address and hostname
of node02 from node01.

node01 and node02 would both serve their own web page when I enter the
virtual IP address in the browser URL bar. However, a 3rd device connected
to the same network is unable to load the webpage from the virtual IP
address.

What's wrong? Thanks!


Re: [Pacemaker] pcs cluster auth shows Error: Unable to communicate with node message

2014-09-09 Thread Sihan Goi
Tried that, same problem.
On Sep 9, 2014 3:44 PM, emmanuel segura emi2f...@gmail.com wrote:

 systemctl enable pcsd.service ?




 --
 this is my life and I live it for as long as God wills



[Pacemaker] ERROR: Unable to find nic or netmask.

2014-09-01 Thread Sihan Goi
Hi,

I'm trying to create an HA cluster with 2 CentOS 7 PCs connected to a
wireless AP. The PCs have the static IP addresses 192.168.0.111 and
192.168.0.112 respectively and hostnames node01 and node02 respectively.

I've tried to create a virtual IP address of 192.168.0.110 using the
following command:

pcs resource create virtual_ip ocf:heartbeat:IPaddr2 ip=192.168.0.110 cidr_netmask=32 op monitor interval=30s

However, when I do a pcs status resources I get the following output:

 virtual_ip(ocf::heartbeat:IPaddr2):Stopped

The virtual IP is stopped rather than started. I looked into
/var/log/messages and /var/log/pacemaker.log and I find the following
error messages:

node02 IPaddr2(virtual_ip)[25451]: ERROR: Unable to find nic or netmask.
node02 IPaddr2(virtual_ip)[25451]: ERROR: [findif] failed

It seems that it's unable to find my nic. How can I fix this?

Thanks.


Re: [Pacemaker] ERROR: Unable to find nic or netmask.

2014-09-01 Thread Sihan Goi
Got it. Changed the netmask for both PCs to 255.255.255.0 and changed
cidr_netmask to 24 and it works...sort of.

It was working for a while, and then I rebooted both PCs, and now each
thinks it's online and the other is offline.

pcs status on my node01 gives the following output:
Cluster name: cluster_web
Last updated: Tue Sep  2 12:21:25 2014
Last change: Tue Sep  2 12:13:27 2014 via cibadmin on node02
Stack: corosync
Current DC: node01 (1) - partition WITHOUT quorum
Version: 1.1.10-32.el7_0-368c726
2 Nodes configured
2 Resources configured


Online: [ node01 ]
OFFLINE: [ node02 ]

Full list of resources:

 virtual_ip(ocf::heartbeat:IPaddr2):Started node01
 webserver(ocf::heartbeat:apache):Started node01

PCSD Status:
  node01: Offline
  node02: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

However, pcs status on node02 shows the following output:
Cluster name: cluster_web
Last updated: Tue Sep  2 12:20:41 2014
Last change: Tue Sep  2 11:59:03 2014 via cibadmin on node02
Stack: corosync
Current DC: node02 (2) - partition WITHOUT quorum
Version: 1.1.10-32.el7_0-368c726
2 Nodes configured
2 Resources configured


Online: [ node02 ]
OFFLINE: [ node01 ]

Full list of resources:

 virtual_ip(ocf::heartbeat:IPaddr2):Started node02
 webserver(ocf::heartbeat:apache):Started node02

PCSD Status:
  node01: Offline
  node02: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

Seems like each node thinks it's online and the other is not. I'm running
HA on an Apache web server, and if I access the webpage on node01, I get
node01's index.html. If I access it on node02, I get node02's index.html.
If I access it via another PC connected to the same AP, the webpage is
unavailable.

What could be wrong?
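
Both status outputs above report "partition WITHOUT quorum". A minimal sketch of why each half of a split two-node cluster ends up in that state, assuming plain majority voting with one vote per node. The thread does not show the actual quorum settings, so this is only an illustration, and has_quorum is a hypothetical helper, not a Pacemaker or Corosync API:

```python
def has_quorum(votes_seen: int, total_votes: int) -> bool:
    """Return True when a partition holds a strict majority of all votes.

    Assumption: every node carries exactly one vote and no special
    two-node handling is configured.
    """
    return votes_seen > total_votes // 2


# Each node of a split two-node cluster only sees its own vote.
print(has_quorum(1, 2))  # a lone node out of two
print(has_quorum(2, 2))  # both nodes can see each other
print(has_quorum(2, 3))  # two nodes out of three
```

Under that assumption a lone node out of two never holds a majority, which matches the "WITHOUT quorum" line both nodes print once they stop seeing each other.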


On Mon, Sep 1, 2014 at 9:09 PM, John Lauro john.la...@covenanteyes.com
wrote:

 ip=192.168.0.110 cidr_netmask=32
 /32 leaves no room for any other IP addresses on that interface and so you
 have to specify the nic.  Are you certain 192.168.0.111 and 192.168.0.112
 do not have a different netmask from 255.255.255.255, like 255.255.255.0
 for /24 or 255.255.0.0 for /16?  If they do have 255.255.255.255 too, then
 they are probably not set up correctly...

 PS: cidr_netmask is optional.  Assuming a proper netmask (not
 255.255.255.255) is on 192.168.0.111 and 192.168.0.112 it should work
 without specifying cidr_netmask.
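
The netmask point can be illustrated with a small sketch. It is a simplified model of matching an address against an interface's subnet and is not the code the IPaddr2 agent actually runs. The interface name wlan0 and the helper interface_for are hypothetical; the addresses are the ones from this thread:

```python
import ipaddress


def interface_for(ip, interfaces):
    """Return the name of the first interface whose subnet contains ip.

    interfaces maps an interface name to its address in CIDR form,
    for example {"wlan0": "192.168.0.111/24"}. Returns None if no
    interface's subnet covers the address.
    """
    addr = ipaddress.ip_address(ip)
    for name, cidr in interfaces.items():
        # strict=False accepts a host address such as 192.168.0.111/24
        # and derives the enclosing network from it.
        if addr in ipaddress.ip_network(cidr, strict=False):
            return name
    return None


vip = "192.168.0.110"
# With a 255.255.255.255 (/32) netmask the interface covers only itself.
print(interface_for(vip, {"wlan0": "192.168.0.111/32"}))
# With 255.255.255.0 (/24) the virtual IP falls inside the same subnet.
print(interface_for(vip, {"wlan0": "192.168.0.111/24"}))
```

In this model the /32 case finds no interface for 192.168.0.110, while the /24 case does, which is consistent with the advice to check the netmask on the physical addresses.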


 --

 *From: *Sihan Goi gois...@gmail.com
 *To: *pacemaker@oss.clusterlabs.org
 *Sent: *Monday, September 1, 2014 4:17:20 AM
 *Subject: *[Pacemaker] ERROR: Unable to find nic or netmask.


 Hi,

 I'm trying to create an HA cluster with 2 CentOS 7 PCs connected to a
 wireless AP. The PCs have the static IP addresses 192.168.0.111 and
 192.168.0.112 respectively and hostnames node01 and node02 respectively.

 I've tried to create a virtual IP address of 192.168.0.110 using the
 following command:

 pcs resource create virtual_ip ocf:heartbeat:IPaddr2 ip=192.168.0.110 cidr_netmask=32 op monitor interval=30s

 However, when I do a pcs status resources I get the following output:

  virtual_ip(ocf::heartbeat:IPaddr2):Stopped

 The virtual IP is stopped rather than started. I looked into
 /var/log/messages and /var/log/pacemaker.log and I find the following
 error messages:

 node02 IPaddr2(virtual_ip)[25451]: ERROR: Unable to find nic or netmask.
 node02 IPaddr2(virtual_ip)[25451]: ERROR: [findif] failed

 It seems that it's unable to find my nic. How can I fix this?

 Thanks.






-- 
- Goi Sihan
gois...@gmail.com


Re: [Pacemaker] ERROR: Unable to find nic or netmask.

2014-09-01 Thread Sihan Goi
Hi,

After some investigation, it seems that Apache is having trouble
starting on both nodes. I get the following error message when I try to
restart the service:

Job for httpd.service failed. See 'systemctl status httpd.service' and
'journalctl -xn' for details.

systemctl status httpd.service shows the following output:

httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled)
   Active: failed (Result: exit-code) since Tue 2014-09-02 13:45:52 SGT; 8s ago
  Process: 26095 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=0/SUCCESS)
  Process: 26093 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited, status=1/FAILURE)
 Main PID: 26093 (code=exited, status=1/FAILURE)

Sep 02 13:45:52 node02 httpd[26093]: AH00558: httpd: Could not reliably det...ge
Sep 02 13:45:52 node02 httpd[26093]: (98)Address already in use: AH00072: m...80
Sep 02 13:45:52 node02 httpd[26093]: no listening sockets available, shutti...wn
Sep 02 13:45:52 node02 httpd[26093]: AH00015: Unable to open logs
Sep 02 13:45:52 node02 systemd[1]: httpd.service: main process exited, code...RE
Sep 02 13:45:52 node02 systemd[1]: Failed to start The Apache HTTP Server.
Sep 02 13:45:52 node02 systemd[1]: Unit httpd.service entered failed state.
Hint: Some lines were ellipsized, use -l to show in full.

/var/log/messages also shows similar messages

Sep  2 13:41:12 node02 systemd: Starting The Apache HTTP Server...
Sep  2 13:41:12 node02 httpd: AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.0.112. Set the 'ServerName' directive globally to suppress this message
Sep  2 13:41:12 node02 httpd: (98)Address already in use: AH00072: make_sock: could not bind to address 127.0.0.1:80
Sep  2 13:41:12 node02 httpd: no listening sockets available, shutting down
Sep  2 13:41:12 node02 httpd: AH00015: Unable to open logs
Sep  2 13:41:12 node02 systemd: httpd.service: main process exited, code=exited, status=1/FAILURE
Sep  2 13:41:12 node02 systemd: Failed to start The Apache HTTP Server.
Sep  2 13:41:12 node02 systemd: Unit httpd.service entered failed state.

Is this related to the problem?
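
The "(98)Address already in use" line means something else was already bound to 127.0.0.1:80 when httpd tried to start. A rough sketch of how that condition can be detected from a script, using only the standard socket module. port_in_use is a hypothetical helper, and the demonstration uses an ephemeral port because binding to port 80 normally requires root:

```python
import errno
import socket


def port_in_use(host, port):
    """Try to bind to host:port and report whether the OS says it is taken."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            # Other errors (for example permission denied) are not
            # evidence of a conflict, so let them surface.
            raise
    return False


# Demonstration: occupy an ephemeral port, then probe it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
busy_port = listener.getsockname()[1]
print(port_in_use("127.0.0.1", busy_port))
listener.close()
```

Run as root with port 80, the same probe would only show that the port is taken, not which process holds it.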



On Tue, Sep 2, 2014 at 12:42 PM, Teerapatr Kittiratanachai 
maillist...@gmail.com wrote:

 Try to set cidr_netmask=32 for resource only, and let the physical
 interface's netmask be 24.

 On Tue, Sep 2, 2014 at 11:27 AM, Sihan Goi gois...@gmail.com wrote:
  Got it. Changed the netmask for both PCs to 255.255.255.0 and changed
  cidr_netmask to 24 and it works...sort of.
 
  It was working for a while, and then I rebooted both PCs, and now each
  thinks it's online and the other is offline.
 
  pcs status on my node01 gives the following output:
  Cluster name: cluster_web
  Last updated: Tue Sep  2 12:21:25 2014
  Last change: Tue Sep  2 12:13:27 2014 via cibadmin on node02
  Stack: corosync
  Current DC: node01 (1) - partition WITHOUT quorum
  Version: 1.1.10-32.el7_0-368c726
  2 Nodes configured
  2 Resources configured
 
 
  Online: [ node01 ]
  OFFLINE: [ node02 ]
 
  Full list of resources:
 
   virtual_ip(ocf::heartbeat:IPaddr2):Started node01
   webserver(ocf::heartbeat:apache):Started node01
 
  PCSD Status:
node01: Offline
node02: Online
 
  Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/disabled
 
  However, pcs status on node02 shows the following output:
  Cluster name: cluster_web
  Last updated: Tue Sep  2 12:20:41 2014
  Last change: Tue Sep  2 11:59:03 2014 via cibadmin on node02
  Stack: corosync
  Current DC: node02 (2) - partition WITHOUT quorum
  Version: 1.1.10-32.el7_0-368c726
  2 Nodes configured
  2 Resources configured
 
 
  Online: [ node02 ]
  OFFLINE: [ node01 ]
 
  Full list of resources:
 
   virtual_ip(ocf::heartbeat:IPaddr2):Started node02
   webserver(ocf::heartbeat:apache):Started node02
 
  PCSD Status:
node01: Offline
node02: Online
 
  Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/disabled
 
  Seems like each node thinks it's online and the other is not. I'm running HA
  on apache webserver, and if I access the webpage on node01, I get node01's
  index.html. If I access it on node02, I get node02's index.html. If I access
  it via another PC connected to the same AP, the webpage is unavailable.
 
  What could be wrong?
 
 
  On Mon, Sep 1, 2014 at 9:09 PM, John Lauro john.la...@covenanteyes.com
  wrote:
 
  ip=192.168.0.110 cidr_netmask=32
  /32 leaves no room for any other IP addresses on that interface and so you
  have to specify the nic.  Are you certain 192.168.0.111 and 192.168.0.112 do
  not have a different netmask from 255.255.255.255, like 255.255.255.0 for
  /24 or 255.255.0.0 for /16?  If they do have 255.255.255.255 too, then they
  are probably not set up correctly...
 
  PS: cidr_netmask is optional.  Assuming a proper netmask (not
  255.255.255.2555