Hi Dejan,
First of all, thanks for your comments!
> > db2_asym.patch:
> > The first one is about the return code of the RA in case the DB2 instance
> > is not available. I've changed the rc to $OCF_ERR_INSTALLED.
>
> I guess that you mean "not installed" which is the case for
> example when the software is on a shared storage. Funny, I
> thought that I did fix this before.
Hmm. Do you mean an installation on a shared file system which is not
mounted on the current host?
> The patch you posted is probably not enough. We have to handle
> differently "not installed" and "not configured". Also, on
> monitor operation, the RA should exit with OCF_NOT_RUNNING.
> Please see if the attached patch works.
Well, OCF_NOT_RUNNING means "_cleanly_" stopped, doesn't it? If vital files
are not available, it's more like "not installed"?!
By the way, I've applied the attached patch and could successfully stop, start,
monitor, etc. the resource.
Additionally, db2info is more of a startup test to cover cases when DB2 is not
installed or not configured (and to set up some required variables...). If
that's the case, the monitor and status operations will also report not
installed or not configured. Isn't that normal behaviour? I couldn't
identify the right behaviour in the OCF RA spec.
Actually, with the current RA, there's no way to figure out whether DB2 is
installed or configured correctly. In order to start an instance, you just
issue db2start (~/sqllib/adm contains this file and is a link pointing to
the installation itself) as the instance user, but the home directory may
not be available if it's on a shared file system that isn't mounted. In this
scenario, you're unable to tell whether DB2 was not configured correctly,
whether it's just not installed, or whether the file system is just not
available.
Thinking about it further, the only way to handle the different setups is to
tell the RA more about the environment (actually, this applies to other RAs as
well...). A possible way to do so:
OCF_RESKEY_clustertype = [ full | instance | db ]
OCF_RESKEY_db2dir = <db2 installation dir:-/opt/IBM/db2>
clustertype full:
That'd be the default and means that binary files as well as instance
files are shared and may not be available (during monitoring and status
operations). It covers RA backward compatibility and does not support
asymmetrical clusters. It basically behaves like the current RA.
clustertype instance:
Uses reskey db2dir to check for binary files (installed) and db2home to check
for the instance user (configured). Supports asymmetrical clusters, but is not
able to check the configuration in detail...
clustertype db:
Both the DB2 binaries and the instance home directory are local.
It's just the database files which are shared; therefore, no adaptation of
db2nodes.cfg is required, and the binary files (db2dir) as well as the
configuration can be checked...
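To make the proposal a bit more concrete, here is a rough sketch of how the three modes could map to return codes. It is only an illustration: db2_check_env is a hypothetical helper, clustertype and db2dir are the parameters proposed above (they don't exist in the RA yet), and the exact tests per mode are my assumption. The numeric values are the standard OCF return codes.

```shell
#!/bin/sh
# Sketch only, not part of the current RA.
OCF_SUCCESS=0
OCF_ERR_INSTALLED=5
OCF_ERR_CONFIGURED=6

# db2_check_env CLUSTERTYPE DB2DIR DB2HOME   (hypothetical helper)
db2_check_env() {
    ct=${1:-full}
    dir=${2:-/opt/IBM/db2}
    home=$3

    case "$ct" in
    full)
        # Binaries and instance files may sit on shared storage that is
        # not mounted here, so nothing can be concluded: defer to the
        # existing RA logic.
        return $OCF_SUCCESS
        ;;
    instance)
        # Binaries are local; the instance home may be shared, so only
        # check that a home directory is known for the instance user.
        [ -d "$dir" ] || return $OCF_ERR_INSTALLED
        [ -n "$home" ] || return $OCF_ERR_CONFIGURED
        ;;
    db)
        # Binaries and instance home are local, so both can be checked.
        [ -d "$dir" ] || return $OCF_ERR_INSTALLED
        [ -f "$home/sqllib/db2nodes.cfg" ] || return $OCF_ERR_CONFIGURED
        ;;
    *)
        # Unknown clustertype value.
        return $OCF_ERR_CONFIGURED
        ;;
    esac
    return $OCF_SUCCESS
}
```

A caller would pass something like "$OCF_RESKEY_clustertype" "$OCF_RESKEY_db2dir" "$db2home" and exit with the returned code for anything other than 0.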
I haven't implemented anything yet, except your suggestions (see comments
below). Before I make further changes, I want a clear picture of how the RA
should behave, so further suggestions are welcome :)
> Don't know what does db2nodes.cfg look like. Perhaps it's safer
> to use 'grep -w ...'. '&>' is not necessary: use just '>'.
Right. I redirected everything to /dev/null to cover "file not found" errors.
But I've changed the line completely to better handle the db2nodes.cfg syntax:
+ runasdb2 [ x`hostname` = x`awk '{print $2}' $db2nodes` ]
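One caveat with the line above: if db2nodes.cfg has more than one line, the unquoted awk output expands to several words and the test fails with a syntax error. A possible variant that does the comparison inside awk is sketched below. db2nodes_matches_host is a hypothetical helper name, and I'm assuming the hostname is always the second field of each line.

```shell
#!/bin/sh
# db2nodes_matches_host FILE HOST   (hypothetical helper)
# Succeeds only if FILE is readable, has at least one non-blank line,
# and field 2 of every non-blank line equals HOST.
db2nodes_matches_host() {
    [ -r "$1" ] || return 1
    awk -v host="$2" '
        NF == 0    { next }        # skip blank lines
                   { seen = 1 }    # saw at least one entry
        $2 != host { bad = 1 }     # entry for some other host
        END        { exit !(seen && !bad) }
    ' "$1"
}
```

In the RA it would be called roughly as db2nodes_matches_host "$db2nodes" "`hostname`" && return, run as whichever user can read the file.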
> mktemp's not available on all platforms. There's a function
> provided which takes care of that: maketempfile.
ok, changed.
> \s is a GNU extension. And the regular expression looks a bit
> odd. Perhaps better to use awk:
>
> awk "{\$2=\"$localhost\"; print}" $db2sql/db2nodes.cfg > $tmpfile
Yes, looks much nicer and is faster too. Thanks for this hint! Though I had
to define $localhost as local to get this working?!
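For what it's worth, passing the hostname with awk's -v option would sidestep the shell quoting and the question of where $localhost is defined. A minimal sketch follows; rewrite_db2nodes is a hypothetical helper, and it assumes the hostname contains no backslashes, since -v interprets escape sequences.

```shell
#!/bin/sh
# rewrite_db2nodes SRC DST HOST   (hypothetical helper)
# Writes SRC to DST with field 2 of every line replaced by HOST.
# Fails if awk fails or if DST ends up empty.
rewrite_db2nodes() {
    awk -v host="$3" '{ $2 = host; print }' "$1" > "$2" && [ -s "$2" ]
}
```

Note that assigning to $2 makes awk rebuild each line with single spaces between fields, which is the same behaviour as the suggested one-liner.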
> --reference is a GNU extension. Plain cp should preserve the
> permissions of the target. Just drop these two...
>
> + mv $tmpfile $db2sql/db2nodes.cfg
>
> and
>
> cp $tmpfile $db2sql/db2nodes.cfg
> rm $tmpfile
Changed. Additionally, I've added -f to cp and rm. Is that ok?
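My understanding of why plain cp is enough here: copying onto an existing file rewrites its content in place, so the target keeps its mode and owner. One thing I'm unsure about with -f: as far as I know, cp -f may remove and recreate a target it cannot open for writing, in which case the old permissions would not carry over. A small sketch of the replace step, with replace_keep_mode as a hypothetical helper name:

```shell
#!/bin/sh
# replace_keep_mode NEWFILE TARGET   (hypothetical helper)
# Overwrites the content of an existing TARGET with NEWFILE, then
# removes NEWFILE. TARGET keeps its own mode and owner because it is
# written in place rather than replaced.
replace_keep_mode() {
    cp "$1" "$2" && rm -f "$1"
}
```

The temporary file is only removed if the copy succeeded, so a failed cp leaves it around for inspection.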
Best regards
Andreas MATHER
ESLT - Enterprise Services for Linux Technologies
IBM Austria, Obere Donaustrasse 95, 1020 Vienna
Phone : +43-1-21145/4799
Fax: +43-1-21145/8888
e-mail: [EMAIL PROTECTED]
IBM Österreich Internationale Büromaschinen Gesellschaft m.b.H.
Registered office: Vienna
Commercial register court: Handelsgericht Wien, FN 80000y
[EMAIL PROTECTED] wrote on 01/23/2008 11:41:31 AM:
> Hi,
>
> On Tue, Jan 22, 2008 at 12:47:20PM +0100, Andreas Mather1 wrote:
> >
> >
> > Hi all,
> >
> > I ran into some trouble with the DB2 RA on an asymmetrical 4 node
> > cluster.
> >
> > I've changed the RA to cover my needs and want to share the patches
> > (which I separated as they address different issues):
>
> That's very well :)
>
> > db2_asym.patch:
> > The first one is about the return code of the RA in case the DB2 instance
> > is not available. I've changed the rc to $OCF_ERR_INSTALLED.
>
> I guess that you mean "not installed" which is the case for
> example when the software is on a shared storage. Funny, I
> thought that I did fix this before.
>
> The patch you posted is probably not enough. We have to handle
> differently "not installed" and "not configured". Also, on
> monitor operation, the RA should exit with OCF_NOT_RUNNING.
> Please see if the attached patch works.
>
> > db2_env.patch:
> > In case the instance user's directory is shared, the
> > ~/sqllib/db2nodes.cfg does not represent the current hostname after a
> > takeover. This patch changes the RA to check the db2nodes.cfg for correct
> > content and, if necessary, changes the hostname.
>
> --- db2.orig 2008-01-22 10:52:15.460234564 +0100
> +++ db2 2008-01-22 12:15:10.709047111 +0100
> @@ -173,9 +173,41 @@
>
>
> #
> +# db2_init_start: Prepare DB2 environment
> +#
> +db2_init_start() {
> +
> + localhost=`hostname`
> + runasdb2 grep $localhost $db2sql/db2nodes.cfg &>/dev/null
>
> Don't know what does db2nodes.cfg look like. Perhaps it's safer
> to use 'grep -w ...'. '&>' is not necessary: use just '>'.
>
> +
> + # if db2nodes.cfg is ok, return
> + [ $? -eq 0 ] && return
> +
> + # ok, we need to change db2nodes.cfg to list our hostname
> + # if mktemp fails, we cannot recover so exit with error
> + tmpfile=`mktemp` || exit $OCF_ERR_GENERIC
>
> mktemp's not available on all platforms. There's a function
> provided which takes care of that: maketempfile.
>
> +
> + cat $db2sql/db2nodes.cfg | sed "s#\(.*\)\s.*\s\(.*\)#\1 $localhost \2#" > $tmpfile
>
> \s is a GNU extension. And the regular expression looks a bit
> odd. Perhaps better to use awk:
>
> awk "{\$2=\"$localhost\"; print}" $db2sql/db2nodes.cfg > $tmpfile
>
> +
> + # a sed failure is something we cannot recover from, so exit with error
> + [ -s $tmpfile ] || exit $OCF_ERR_GENERIC
> +
> + chown --reference=$db2sql/db2nodes.cfg $tmpfile
> + chmod --reference=$db2sql/db2nodes.cfg $tmpfile
>
> --reference is a GNU extension. Plain cp should preserve the
> permissions of the target. Just drop these two...
>
> + mv $tmpfile $db2sql/db2nodes.cfg
>
> and
>
> cp $tmpfile $db2sql/db2nodes.cfg
> rm $tmpfile
>
> Can you please review the comments and suggestions and repost?
> Thanks.
>
> Dejan
>
> +
> +}
> +
> +
> +
> +#
> # db2_start: Start the given db2 instance
> #
> db2_start() {
> +
> + # prepare db2 environment
> + db2_init_start
> +
> if
> output=`runasdb2 $db2adm/db2start`
> then
> [attachment "db2_2.patch.gz" deleted by Andreas Mather1/Austria/IBM]
> _______________________________________________________
> Linux-HA-Dev: [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/