Version Info:
OS:                     CentOS 6.4
Kernel (current):       2.6.32-358.11.1.el6.x86_64
Pacemaker:              1.1.8-7.el6
Corosync:               1.4.1-15.el6_4.1
CRMSH:                  1.2.5-55.6
Resource Agents:        3.9.2-21.el6

I'm having a problem with the ProFTPD OCF script (/usr/lib/ocf/resource.d/heartbeat/proftpd) and the monitor function.

Here is the relevant section of my config (anonymizing addresses):

primitive TEST-EXT-IP ocf:heartbeat:IPaddr2 \
        params ip="xxx.yyy.zzz.11" \
        op monitor interval="30s"
primitive TEST-PROFTPD ocf:heartbeat:proftpd \
        params conffile="/usr/local/etc/test_proftpd.conf" binary="/usr/local/sbin/proftpd" test_user="gpmreal" test_pass="pacemaker@hacbackup" curl_url="ftp://hacbackup.blah.blah.bla"; pidfile="/var/run/test_proftpd.pid" \
        op monitor interval="2m"
primitive TEST-PWFILE ocf:PPS:hacPWFile \
        params role="nrt"
group TEST TEST-PWFILE TEST-EXT-IP TEST-PROFTPD
location test-dislike-a TEST \
        rule $id="test-dislike-a-rule" -inf: class eq A
colocation test_and_hacmaster -inf: TEST HACMASTER
colocation test_and_nrtdistro -inf: TEST NRTDISTRO
colocation test_and_sdpsmaster -inf: TEST SDPSMASTER
property $id="cib-bootstrap-options" \
        dc-version="1.1.8-7.el6-394e906" \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes="12" \
        stonith-enabled="false"
rsc_defaults $id="rsc-options" \
        resource-stickiness="500"


Now, from what I can tell ProFTPD actually STARTS and RUNS just fine. I can see it running:

# ps -ef | grep ftp
nobody 482 1 0 11:31 ? 00:00:00 proftpd: (accepting connections)


I can log in to it:

# ftp hacbackup
Connected to hacbackup.blah.blah.bla.
220 ProFTPD 1.3.4c Server (HAC TEST) [xxx.yyy.zzz.11]
500 AUTH not understood
Name (hacbackup:astocker): gpmreal
331 Anonymous login ok, send your complete email address as your password
Password:
230 Anonymous access granted, restrictions apply
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> dir
227 Entering Passive Mode (198,118,195,11,250,89).
150 Opening ASCII mode data connection for file list
lrwxrwxrwx   1 root     root            4 Jul  2 11:41 NRTPUB -> data
drwxr-xr-x   2 root     root         4096 Apr 15 14:33 data
226 Transfer complete


However, the 'op monitor' function appears to think that it is down:

(from crm_mon -1r)
Failed actions:
TEST-PROFTPD_monitor_120000 (node=gpmhac10, call=4414, rc=7, status=complete): not running

(from grep monitor /var/log/messages)
...snip...
Jul 2 11:42:06 gpmhac10 crmd[2853]: notice: process_lrm_event: LRM operation TEST-PROFTPD_monitor_120000 (call=5096, rc=7, cib-update=1493, confirmed=false) not running
Jul 2 11:42:07 gpmhac10 crmd[2853]: notice: process_lrm_event: LRM operation TEST-PROFTPD_monitor_120000 (call=5107, rc=7, cib-update=1496, confirmed=false) not running
...snip...


Now, when I look at the monitor function in the OCF script, I see that it looks for a PID file (actually, it first calls proftpd_status, which also looks for the PID file). According to the script, the PID file defaults to ${OCF_RESKEY_pidfile="/var/run/proftpd.pid"}, and my definition is basically the same: pidfile="/var/run/test_proftpd.pid". But there is no *ftpd.pid file in that directory:

# ls -l /var/run | grep ftp
#

As a matter of fact, I cannot figure out WHERE the PID file is at all. Yet if I call the init.d proftpd script, it reports success:

# /etc/init.d/proftpd status
proftpd (pid 482) is running...

So it seems to know where to look for the PID, or else it's doing something different entirely to determine the status.
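
One way to narrow down where the daemon writes its PID is to look at the PidFile directive in the ProFTPD configuration itself. The following is only a sketch: the helper name proftpd_conf_pidfile is hypothetical, and it assumes the directive appears as "PidFile /some/path" on a line of its own. If the directive is absent, the function prints nothing, and the daemon would then use whatever default was compiled in, which for a build installed under /usr/local may not be in /var/run.

```shell
# Hypothetical helper (not part of the OCF agent): print the PidFile path
# configured in a ProFTPD config file, or nothing if no directive is found.
# Assumes the directive is written as "PidFile /path" on its own line.
proftpd_conf_pidfile() {
    conf="$1"
    # Compare the first field case-insensitively; commented-out lines
    # ("# PidFile ...") do not match because their first field is "#".
    awk 'tolower($1) == "pidfile" { print $2; exit }' "$conf"
}

# Example call, using the conffile path from the primitive above:
#   proftpd_conf_pidfile /usr/local/etc/test_proftpd.conf
```

If the path printed here differs from the pidfile parameter given to the resource, a PID-file-based status check would report "not running" even though the daemon is up.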

If my reading of the OCF script is correct, the monitor function is reporting failure because it cannot find the PID file to check. More important, though, is that the script does not seem to be creating the PID file in the first place. While it's nice that the server is actually running, that doesn't really help if it cannot be monitored and restarted or moved as necessary after a failure.
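
To make that reading concrete, here is a minimal sketch of a PID-file-based status check. It is not the agent's actual code: the function name pidfile_status is hypothetical, and the only thing taken from OCF is the return-code convention (0 for running, 7 for not running, matching the rc=7 in the log above).

```shell
# Sketch of a PID-file-based status check; NOT the agent's actual code.
# Return codes follow the OCF convention: 0 = running, 7 = not running.
pidfile_status() {
    pidfile="$1"

    # No readable PID file: report "not running", even if the daemon is up.
    [ -r "$pidfile" ] || return 7

    # Take the first line and strip any whitespace around the number.
    pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')

    # Reject empty or non-numeric content.
    case "$pid" in
        ''|*[!0-9]*) return 7 ;;
    esac

    # kill -0 sends no signal; it succeeds only if the process exists
    # and the caller is allowed to signal it.
    kill -0 "$pid" 2>/dev/null || return 7
    return 0
}

# Example call with the pidfile value from the primitive above:
#   pidfile_status /var/run/test_proftpd.pid; echo "rc=$?"
```

With a check shaped like this, a missing PID file is indistinguishable from a stopped daemon, which would explain a healthy server being reported as "not running".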

Looking at the proftpd_start() function, it doesn't appear to make use of the OCF_RESKEY_pidfile parameter at all, so I assume my renaming of the file is irrelevant because the name never gets set anywhere. More important again, the script launches the ProFTPD service but doesn't even create the 'normal' default PID file. Why is that, and how can it be fixed?


I know that the CURL portion of the monitor works when run manually from the command line on the system running the TEST-PROFTPD:

# curl -sS -u "gpmreal:pacemaker@hacbackup" ftp://hacbackup.blah.blah.bla/
lrwxrwxrwx   1 root     root            4 Jul  2 11:41 NRTPUB -> data
drwxr-xr-x   2 root     root         4096 Apr 15 14:33 data


So I'm assuming that if this PID issue is fixed the monitor functionality would work.
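
For reference, the manual curl test above can be wrapped in a small function so that its exit status is easy to inspect. This is only a sketch under assumptions: the name ftp_login_check is hypothetical, it is not taken from the agent, and it simply passes curl's own exit status through.

```shell
# Hypothetical wrapper around the manual curl test; NOT the agent's code.
# Succeeds (returns 0) only if curl can log in and fetch the listing.
ftp_login_check() {
    user="$1"
    pass="$2"
    url="$3"
    # -s hides the progress meter, -S still prints errors on failure.
    # The listing itself is discarded; only the exit status matters.
    curl -sS -u "${user}:${pass}" "$url" >/dev/null
}

# Example call with the values used in the manual test above:
#   ftp_login_check gpmreal 'pacemaker@hacbackup' ftp://hacbackup.blah.blah.bla/
#   echo "rc=$?"
```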


What I'm hoping to set up is something that monitors the FTP functionality, notices when it has failed, and attempts to restart it; if that fails, it should move it (along with everything else in the resource group) to a different node. But that's not going to work if the monitor always thinks the service is in a failed state.


Help is appreciated.


Tony

_______________________________________________
Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
