Version Info:
OS: CentOS 6.4
Kernel (current): 2.6.32-358.11.1.el6.x86_64
Pacemaker: 1.1.8-7.el6
Corosync: 1.4.1-15.el6_4.1
CRMSH: 1.2.5-55.6
Resource Agents: 3.9.2-21.el6
I'm having a problem with the ProFTPD OCF script
(/usr/lib/ocf/resource.d/heartbeat/proftpd) and the monitor function.
Here is the relevant section of my config (anonymizing addresses):
primitive TEST-EXT-IP ocf:heartbeat:IPaddr2 \
params ip="xxx.yyy.zzz.11" \
op monitor interval="30s"
primitive TEST-PROFTPD ocf:heartbeat:proftpd \
params conffile="/usr/local/etc/test_proftpd.conf" \
binary="/usr/local/sbin/proftpd" test_user="gpmreal" \
test_pass="pacemaker@hacbackup" \
curl_url="ftp://hacbackup.blah.blah.bla" \
pidfile="/var/run/test_proftpd.pid" \
op monitor interval="2m"
primitive TEST-PWFILE ocf:PPS:hacPWFile \
params role="nrt"
group TEST TEST-PWFILE TEST-EXT-IP TEST-PROFTPD
location test-dislike-a TEST \
rule $id="test-dislike-a-rule" -inf: class eq A
colocation test_and_hacmaster -inf: TEST HACMASTER
colocation test_and_nrtdistro -inf: TEST NRTDISTRO
colocation test_and_sdpsmaster -inf: TEST SDPSMASTER
property $id="cib-bootstrap-options" \
dc-version="1.1.8-7.el6-394e906" \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes="12" \
stonith-enabled="false"
rsc_defaults $id="rsc-options" \
resource-stickiness="500"
Now, from what I can tell ProFTPD actually STARTS and RUNS just fine. I
can see it running:
# ps -ef | grep ftp
nobody 482 1 0 11:31 ? 00:00:00 proftpd: (accepting connections)
I can log in to it:
# ftp hacbackup
Connected to hacbackup.blah.blah.bla.
220 ProFTPD 1.3.4c Server (HAC TEST) [xxx.yyy.zzz.11]
500 AUTH not understood
Name (hacbackup:astocker): gpmreal
331 Anonymous login ok, send your complete email address as your password
Password:
230 Anonymous access granted, restrictions apply
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> dir
227 Entering Passive Mode (198,118,195,11,250,89).
150 Opening ASCII mode data connection for file list
lrwxrwxrwx 1 root root 4 Jul 2 11:41 NRTPUB -> data
drwxr-xr-x 2 root root 4096 Apr 15 14:33 data
226 Transfer complete
However, the 'op monitor' function appears to think that it is down:
(from crm_mon -1r)
Failed actions:
TEST-PROFTPD_monitor_120000 (node=gpmhac10, call=4414, rc=7, status=complete): not running
(from grep monitor /var/log/messages)
...snip...
Jul 2 11:42:06 gpmhac10 crmd[2853]: notice: process_lrm_event: LRM operation TEST-PROFTPD_monitor_120000 (call=5096, rc=7, cib-update=1493, confirmed=false) not running
Jul 2 11:42:07 gpmhac10 crmd[2853]: notice: process_lrm_event: LRM operation TEST-PROFTPD_monitor_120000 (call=5107, rc=7, cib-update=1496, confirmed=false) not running
...snip...
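For reference, the rc=7 in these entries is the standard OCF return code
for "not running" (OCF_NOT_RUNNING). A small lookup sketch follows; the
function name ocf_rc_name is hypothetical and exists only to make the log
output easier to read. It is not part of Pacemaker or the resource agents:

```shell
# Hypothetical helper: translate an OCF return code into its standard name.
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;
        2) echo "OCF_ERR_ARGS" ;;
        3) echo "OCF_ERR_UNIMPLEMENTED" ;;
        4) echo "OCF_ERR_PERM" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        6) echo "OCF_ERR_CONFIGURED" ;;
        7) echo "OCF_NOT_RUNNING" ;;
        *) echo "unknown ($1)" ;;
    esac
}

# The code reported for TEST-PROFTPD_monitor_120000 above
ocf_rc_name 7
```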
Now, when I look at the monitor function in the OCF script, I see that it
looks for a PID file (actually, it first calls proftpd_status, which also
looks for the PID file). According to the script, the PID file should by
default be ${OCF_RESKEY_pidfile="/var/run/proftpd.pid"}, and my definition
is basically the same: pidfile="/var/run/test_proftpd.pid". But there is
no *ftpd.pid file in that directory:
# ls -l /var/run | grep ftp
#
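To illustrate why a missing PID file would be reported as "not running",
here is a minimal sketch of a generic PID-file status check. It is NOT
copied from the proftpd agent; the function name pidfile_status and its
exact logic are assumptions about how such checks are commonly written:

```shell
# Hypothetical PID-file check (not the agent's actual code).
# Returns 0 if the PID file names a live process, 7 (OCF_NOT_RUNNING) otherwise.
pidfile_status() {
    pidfile="$1"

    # No PID file at all: nothing to check, so report "not running"
    [ -f "$pidfile" ] || return 7

    pid=$(cat "$pidfile" 2>/dev/null)

    # Empty or non-numeric content is treated as "not running" too
    case "$pid" in
        ''|*[!0-9]*) return 7 ;;
    esac

    # kill -0 sends no signal; it only tests whether the process exists.
    # Run as root, otherwise a process owned by another user also fails here.
    if kill -0 "$pid" 2>/dev/null; then
        return 0
    fi
    return 7
}

pidfile_status /var/run/test_proftpd.pid || echo "not running (rc=$?)"
```

With a check of this shape, a daemon that is up but has written its PID
somewhere else would still be reported as rc=7.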
As a matter of fact, I cannot figure out WHERE the PID file is at all. If
I call the init.d proftpd script, it reports success:
# /etc/init.d/proftpd status
proftpd (pid 482) is running...
So it seems to know where to look for the PID, or else it's doing
something different entirely to determine the status.
If my reading of the OCF script is correct, the monitor function reports
failure because it cannot find the PID file to check. More important, of
course, is that the script does not seem to create the PID file in the
first place. While it's nice that the server is actually running, that
doesn't really help if it cannot be monitored and restarted or moved as
necessary after a failure.
Looking at the proftpd_start() function, it doesn't appear to use the
OCF_RESKEY_pidfile parameter at all, so I assume that my renaming of the
file is irrelevant because the value never gets set anywhere. More
important still, the script launches the ProFTPD service without creating
even the 'normal' default PID file. Why is that, and how can it be fixed?
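One thing that may be worth checking: as far as I understand it, ProFTPD
itself writes its PID file, at the path named by the PidFile directive in
its configuration file (or a compiled-in default if the directive is
absent). The sketch below pulls that directive out of a config file so it
can be compared with the pidfile parameter given to the resource agent.
The helper name conf_pidfile is hypothetical, and the case-insensitive
match on the directive name is an assumption:

```shell
# Hypothetical helper: print the first PidFile path found in a ProFTPD config.
# Commented-out lines are skipped because their first field is not "pidfile".
conf_pidfile() {
    awk 'tolower($1) == "pidfile" { print $2; exit }' "$1"
}

conf=/usr/local/etc/test_proftpd.conf
if [ -r "$conf" ]; then
    # Compare this with pidfile="/var/run/test_proftpd.pid" in the primitive
    conf_pidfile "$conf"
fi
```

If this prints nothing, or a path other than the one configured in the
cluster, that mismatch would be consistent with the monitor failures above.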
I know that the CURL portion of the monitor works when run manually from
the command line on the system running the TEST-PROFTPD:
# curl -sS -u "gpmreal:pacemaker@hacbackup" ftp://hacbackup.blah.blah.bla/
lrwxrwxrwx 1 root root 4 Jul 2 11:41 NRTPUB -> data
drwxr-xr-x 2 root root 4096 Apr 15 14:33 data
So I'm assuming that if this PID issue is fixed the monitor functionality
would work.
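The whole monitor action, not just the curl portion, could in principle be
exercised by hand in a similar way, since OCF agents take their parameters
from OCF_RESKEY_* environment variables. The wrapper below is only a
sketch: run_monitor is a hypothetical name, and it assumes that OCF_ROOT
plus the resource parameters are all the agent needs when run outside the
cluster:

```shell
# Hypothetical wrapper: run an OCF agent's monitor action by hand.
# Usage: run_monitor /path/to/agent name=value [name=value ...]
run_monitor() {
    agent_path="$1"
    shift
    monitor_rc=0
    (
        # Agents locate their shell function library through OCF_ROOT
        OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}"
        export OCF_ROOT
        # Each name=value argument becomes OCF_RESKEY_name=value
        for kv in "$@"; do
            export "OCF_RESKEY_$kv"
        done
        "$agent_path" monitor
    ) || monitor_rc=$?
    echo "monitor rc=$monitor_rc"
    return "$monitor_rc"
}

agent=/usr/lib/ocf/resource.d/heartbeat/proftpd
if [ -x "$agent" ]; then
    run_monitor "$agent" \
        conffile=/usr/local/etc/test_proftpd.conf \
        binary=/usr/local/sbin/proftpd \
        test_user=gpmreal \
        test_pass=pacemaker@hacbackup \
        curl_url=ftp://hacbackup.blah.blah.bla \
        pidfile=/var/run/test_proftpd.pid \
        || echo "monitor reported a failure"
fi
```

A result of rc=0 would mean the agent considers the service healthy; rc=7
would reproduce what the cluster is logging.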
What I'm hoping to set up is something that monitors the FTP
functionality, notices when it has failed, attempts to restart it and, if
that fails, moves it (along with everything else in the resource group)
to a different node. But that's not going to work if the cluster always
thinks the resource is in a failed state.
Help is appreciated.
Tony