Re: [Linux-HA] heartbeat doesnt create the socket /var/run/heartbeat/register

2012-01-22 Thread Efrat Lefeber
The heartbeat I installed is from the Debian packages:
dpkg -l | grep heartbeat
ii  heartbeat      1:3.0.3-2~bpo50+1  Subsystem for High-Availability Linux
ii  libheartbeat2  1:3.0.3-2~bpo50+1  Subsystem for High-Availability Linux (libraries)

version 3.0.2

I install the same packages and builds on all devices, using an automated 
installation. Some devices install fine and some suffer from the problem 
that the socket isn't created.
Is there a way I can create the socket from outside heartbeat (from Perl or 
bash)? I have a watchdog, and I would like it to create the socket 
automatically if the socket doesn't exist.
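
A minimal sketch of the check such a watchdog could make, written in Python rather than Perl or bash. One caveat: a socket file bound by some other process would be served by that process and not by heartbeat, so the sketch only detects the problem and reports it. The function name `socket_state` is hypothetical, the path is the one from the strace output in this thread, and restarting heartbeat as the recovery step is an assumption, not something confirmed in the thread.

```python
import os
import stat

# Path taken from the strace output in this thread.
REGISTER_SOCKET = "/var/run/heartbeat/register"


def socket_state(path):
    """Return 'missing', 'not-a-socket', or 'socket' for the given path."""
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return "missing"
    return "socket" if stat.S_ISSOCK(mode) else "not-a-socket"


if __name__ == "__main__":
    state = socket_state(REGISTER_SOCKET)
    print(f"{REGISTER_SOCKET}: {state}")
    if state != "socket":
        # Assumption: the recovery action is a heartbeat restart, which lets
        # heartbeat bind the socket itself. It is only reported here, not run.
        print("heartbeat would need a restart to recreate its socket")
```

The directory listing shows /var/run/heartbeat/ as readable only by hacluster and the haclient group, so the check would need to run as root or as a member of that group.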

-----Original Message-----
From: linux-ha-boun...@lists.linux-ha.org 
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Lars Ellenberg
Sent: Friday, January 20, 2012 8:48 PM
To: linux-ha@lists.linux-ha.org
Subject: Re: [Linux-HA] heartbeat doesnt create the socket 
/var/run/heartbeat/register

On Thu, Jan 19, 2012 at 02:18:53PM +, Efrat Lefeber wrote:
> Hi,
>
> I am using Linux-HA heartbeat on a simple two-node cluster.
> For some reason that I can't figure out, the socket
> /var/run/heartbeat/register is not created, although the directory
> /var/run/heartbeat/ exists:
>
> ll /var/run/heartbeat/
> total 24
> drwxr-x---  6 hacluster haclient 4096 2012-01-19 14:30 .
> drwxr-xr-x 16 root  root 4096 2012-01-19 14:30 ..
> drwxr-x---  2 hacluster haclient 4096 2012-01-19 14:30 ccm
> drwxr-x---  2 hacluster haclient 4096 2012-01-19 14:30 crm
> drwxr-x---  2 hacluster haclient 4096 2012-01-19 14:30 dopd
> drwxr-xr-t  2 root  root 4096 2012-01-19 14:30 rsctmp
>
>
> /etc/init.d/heartbeat status
> heartbeat OK [pid 14685 et al] is running on vs-158 [vs-158]...
>
> cl_status hbstatus
> Heartbeat is stopped on this machine.
>
> I ran cl_status with strace and I saw this error:
> connect(3, {sa_family=AF_FILE, path=/var/run/heartbeat/register...}, 110) = -1 ENOENT (No such file or directory)
>
>
> Who creates this socket?

That's one of the first things the heartbeat binary does when it starts. If it 
cannot create that socket, heartbeat will not even start up.

Of course, in theory someone may remove that socket after it was created. If 
so, make sure that does not happen again ;)

> How can I find out why the socket isn't created?

Where did you get your packages/binaries?
Double check your build?
lsof -n -p <PID of your heartbeat master control process>?
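
As a complement to lsof, here is a rough sketch that reads /proc/net/unix, where the Linux kernel lists Unix domain sockets together with the path each one was bound to. The path is recorded at bind time, so a socket that was created and later deleted from the filesystem should still show up there, while one that was never created should not. The function name `bound_unix_sockets` is hypothetical, and the sketch assumes a Linux system with /proc mounted.

```python
# Path taken from the strace output in this thread.
REGISTER_SOCKET = "/var/run/heartbeat/register"


def bound_unix_sockets(proc_net_unix_text):
    """Return the set of filesystem paths listed in /proc/net/unix content."""
    paths = set()
    for line in proc_net_unix_text.splitlines()[1:]:  # skip the header row
        # Columns: Num RefCount Protocol Flags Type St Inode [Path]
        fields = line.strip().split(None, 7)
        # Unnamed sockets have no Path column; abstract ones start with "@".
        if len(fields) == 8 and fields[7].startswith("/"):
            paths.add(fields[7])
    return paths


if __name__ == "__main__":
    try:
        with open("/proc/net/unix") as f:
            bound = bound_unix_sockets(f.read())
    except OSError:
        bound = None  # not Linux, or /proc is not mounted
    if bound is not None:
        print("bound by some process:", REGISTER_SOCKET in bound)
```

If the path is listed here but missing from /var/run/heartbeat/, that points towards the socket having been removed after heartbeat created it.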

> Is there a workaround I can do to create the socket?

Fix your installation.

> This problem doesn't happen all the time. I have another node with the
> same configuration and the socket was created there.

Same packages and build?

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com 
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
#
Scanned by MailMarshal - M86 Security's comprehensive email content security 
solution. 
Download a free evaluation of MailMarshal at www.m86security.com
#


[Linux-HA] heartbeat doesnt create the socket /var/run/heartbeat/register

2012-01-19 Thread Efrat Lefeber
Hi,

I am using Linux-HA heartbeat on a simple two-node cluster.
For some reason that I can't figure out, the socket 
/var/run/heartbeat/register is not created, although the directory 
/var/run/heartbeat/ exists:

ll /var/run/heartbeat/
total 24
drwxr-x---  6 hacluster haclient 4096 2012-01-19 14:30 .
drwxr-xr-x 16 root  root 4096 2012-01-19 14:30 ..
drwxr-x---  2 hacluster haclient 4096 2012-01-19 14:30 ccm
drwxr-x---  2 hacluster haclient 4096 2012-01-19 14:30 crm
drwxr-x---  2 hacluster haclient 4096 2012-01-19 14:30 dopd
drwxr-xr-t  2 root  root 4096 2012-01-19 14:30 rsctmp


/etc/init.d/heartbeat status
heartbeat OK [pid 14685 et al] is running on vs-158 [vs-158]...

cl_status hbstatus
Heartbeat is stopped on this machine.

I ran cl_status with strace and I saw this error:
connect(3, {sa_family=AF_FILE, path=/var/run/heartbeat/register...}, 110) = -1 ENOENT (No such file or directory)
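
The failing connect() seen in strace can be reproduced outside cl_status with a short probe. This is only a sketch: the function name `probe` is hypothetical, and it assumes the register socket is a stream socket, which the strace line does not show. ENOENT means the path does not exist at all, whereas ECONNREFUSED would mean the socket file exists but no process is listening on it.

```python
import errno
import socket

# Path taken from the strace output above.
REGISTER_SOCKET = "/var/run/heartbeat/register"


def probe(path):
    """Try to connect to a Unix stream socket; return 'connected' or the errno name."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(path)
        return "connected"
    except OSError as exc:
        # Map the numeric errno to its symbolic name, e.g. 2 -> 'ENOENT'.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        s.close()


if __name__ == "__main__":
    print(probe(REGISTER_SOCKET))
```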


Who creates this socket? How can I find out why the socket isn't created?
Is there a workaround I can do to create the socket?

I am attaching log files and ha configuration.
This problem doesn't happen all the time. I have another node with the same 
configuration and the socket was created there.

Thanks,
Efrat

Efrat Lefeber
R&D SW Engineer

M86 Security
1 Hamachshev St., New Industrial Area
Netanya 42504, Israel
T: +972 (0) 98648 200 ext. 377
efrat.lefe...@m86security.com
Skype: lefrat
www.m86security.com




Attachments: ha.cf, haresources, hb.strace