Lars,

Thank you for the response. Unfortunately, upgrading Heartbeat is not an option for me: this cluster is a currently unsupported black-box Lustre environment from HP, and all nodes are locked into a specific HP-branded heartbeat RPM package at that revision.

The directory does indeed exist and has the correct ownership and permissions. The curious thing is that the strace of heartbeat never mentions these sockets. Heartbeat not creating them seems to be why CRM and its associated processes fail to start: the socket file /var/run/heartbeat/register does not exist.
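
For anyone wanting to repeat the directory and socket check, here is a minimal sketch in Python 3. The two paths are the ones discussed in this thread; the helper name describe_path is hypothetical and is not part of heartbeat. It only reports what exists on disk, it does not create anything.

```python
import os
import stat


def describe_path(path):
    """Return a short description of what exists at path (hypothetical helper)."""
    try:
        # lstat so that a symlink is reported as "other" rather than followed
        st = os.lstat(path)
    except FileNotFoundError:
        return "missing"
    mode = st.st_mode
    if stat.S_ISSOCK(mode):
        kind = "socket"
    elif stat.S_ISDIR(mode):
        kind = "directory"
    else:
        kind = "other"
    # Report owner, group and permission bits alongside the file type
    return "%s uid=%d gid=%d mode=%o" % (
        kind, st.st_uid, st.st_gid, stat.S_IMODE(mode))


if __name__ == "__main__":
    # Paths taken from this thread; adjust if your layout differs
    for p in ("/var/run/heartbeat", "/var/run/heartbeat/register"):
        print(p, "->", describe_path(p))
```

On a node showing the problem described here, the first path would be expected to report a directory and the second to report missing.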

Thanks,
Chris Wilson

On May 17, 2013, at 11:02 AM, "Lars Ellenberg" <[email protected]> wrote:

> On Thu, May 16, 2013 at 08:05:39PM +0000, Wilson, Christopher (IT) wrote:
>> I have a heartbeat 2.1.3-1 cluster and it was running fine until a recent
>> network outage. Since then one node has been getting errors such as
>
> You do realize that there is heartbeat 3 and pacemaker?
>
>> heartbeat: [3824]: ERROR: Message hist queue is filling up (500 messages in
>> queue)
>
> I don't think this ^^^ message has anything to do with
> those "missing sockets" below.
>
>> I have looked through other mailing lists on the internet and have found
>> that it most likely stems from missing sockets in /var/run/heartbeat
>> (notably /var/run/heartbeat/register)
>> I have uninstalled the rpm and re-installed it, rebooted the machine and run
>> an strace on the heartbeat process to no avail.
>> It appears that heartbeat does not try to create the socket files if they
>> are missing.
>>
>> Could someone help me understand which component of heartbeat is responsible
>> for creating socket files?
>
> Heartbeat (the core process itself) is creating those sockets.
> It does not (in that version, anyways) create the *directory*
> /var/run/heartbeat.
> So you need to put a mkdir in your init script, if you have /var/run on tmpfs
> or similar.
>
> heartbeat 3 has that covered, btw.
>
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
