Hi Florian,
I will have a look into this today, and fix these things. Thx for all
your comments.
Some comments inline...
On 25.11.2010 09:31, Florian Haas wrote:
On 2010-11-24 19:46, Frank Lazzarini wrote:
Hi,
sure no problem we can name it firebird.
OK, firebird it is.
Packaging it with the firebird
packages I don't really find a good idea, because not every user that
needs firebird 2.5 will use it in a cluster.
That's why there would typically be a subpackage RPM named
firebird-resource-agent or whatever, and no-one would be obliged to
install that.
Yes this would be an idea but I would really appreciate beeing in the
cluster-agents package.
Therefore I would
appreciate it if it would be packages with the pacemaker package, if
that package includes all the ocf RA's.
It's not pacemaker, it's the resource-agents (on Debian: cluster-agents)
package, but sure, we can ship it.
Thanks again for the support.
A few more issues I noticed:
1. you use sudo, any particular reason why you're not doing "su -c" like
all the other RAs do? Besides, you should add a check_binary test (for
either sudo or su) to your validate function.
I had some problems using "su -c" but I think it was because the pidfile
couldn't be written,
the fbguard process isn't that clever on giving out some error messages
it just exits. Gonna
go and change that to su, or like you say check wether su or sudo exists.
2. Even if starting the daemon fails, you still pretend it succeeded:
# start fbguard with the specified user
sudo -u $OCF_RESKEY_user $OCF_RESKEY_binary -pidfile $OCF_RESKEY_pid
-daemon -forever
return $OCF_SUCCESS
You can either exit on error after sudo (with "|| exit
$OCF_ERR_GENERIC"), or spin on monitor after start, or both.
3. This too looks fishy:
if [ -f $OCF_RESKEY_pid ]
then
kill `cat $OCF_RESKEY_pid`
if [ $? = 0 ]; then
return $OCF_SUCCESS
fi
All that $? tells you is whether you were able to _send_ the signal. It
says nothing about whether the process did the right thing with the
signal, and it _certainly_ does not imply that the process shut down
completely. Returning successfully here is premature. Ditch that if
block. Because just below:
fi
while firebird25_monitor; do
ocf_log debug "Resource has not stopped yet, waiting"
sleep 1
done
... you're doing the right thing and ask your monitor function to see if
the daemon is in fact dead (and if it's not, you wait till it is).
4. In monitor, you do this:
pid=`cat $OCF_RESKEY_pid`
kill -s 0 $pid>/dev/null 2>&1
rc=$?
... and then test based on $rc. Now, kill -s 0 is generally fine to test
whether a process with that pid is currently running, but the problem
with that approach is that it _will_ succeed on a process that is a
zombie, which really is a failed resource. Thus, it's often a good idea
to combine this with an additional test, preferably one where you can
ask the daemon, or a client, whether the application is really
operational. Does Firebird offer such a facility?
I am not really sure about this, the only thing I can think of is
something similar than the mysql
resource agent does, which is doing a sql on some Database system
tables, but Firebird
doesn't really offer that possibility as those tables only exist in the
databases it self, so if there
is no database currently attached to the firebird server I wouldn't have
these system tables. But
I will have a look into this, and maybe figure out a way to check this.
5. There's an easier way to do this:
if [ ! -x $OCF_RESKEY_binary ]; then
ocf_log err "fbguard binary $OCF_RESKEY_binary does not exist or is
not executable"
exit $OCF_ERR_INSTALLED
fi
You just replace those 4 lines with "check_binary $OCF_RESKEY_binary".
Alrighty didn't know of that function. Hmm this just gives me an idea,
remember I use a function to check wether the pid
directory is writable or not, wouldn't that be an idea to add to
.ocf-binaries ? Like I say just an idea.
6. Also,
getent passwd $OCF_RESKEY_user>/dev/null 2>&1
if [ ! $? -eq 0 ]; then
can just be replaced with
if ! getent passwd $OCF_RESKEY_user>/dev/null 2>&1; then
...for additional brevity.
As far as I'm concerned, once you submit a version with these issues
fixed, this RA is in good shape to be merged. Lars, Dejan, and the other
usual suspects, if you have any objections to pushing this, please let
me know.
I will get to it, this afternoon to correct these things :), once again
thx for all your comments.
Cheers,
Florian
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
Frank
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/