Hi,

On Thu, Jan 24, 2008 at 01:52:32PM +0100, Andreas Mather1 wrote:
> Hi Dejan,
> 
> First of all, thanks for your comments!
> 
> > > db2_asym.patch:
> > > The first one is about the return code of the RA in case the DB2
> > > instance is not available. I've changed the rc to $OCF_ERR_INSTALLED.
> >
> > I guess that you mean "not installed" which is the case for
> > example when the software is on a shared storage. Funny, I
> > thought that I did fix this before.
> Hmm. Do you mean an installation on a shared file system which is not
> mounted on the current host?

Yes.

> > The patch you posted is probably not enough. We have to handle
> > differently "not installed" and "not configured". Also, on
> > monitor operation, the RA should exit with OCF_NOT_RUNNING.
> > Please see if the attached patch works.
> Well, OCF_NOT_RUNNING means "_cleanly_" stopped, doesn't it? If vital files
> are not available, it's more like "not installed"?!

In case the software resides on a shared storage and it's not
mounted then it obviously can't run either. Hence, stopped is
accurate. The software is installed, it's just that it is not
currently available.
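
To make the monitor case concrete, here is a minimal sketch (not the
actual RA code). The function name db2_monitor_sketch, its argument and
the instance_is_up stub are made up for illustration; only the two
return codes are the standard OCF values.

```shell
#!/bin/sh
# Standard OCF return codes used below.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# Hypothetical stand-in for the real health check. The real RA would
# query DB2 here; this stub only looks for a marker file.
instance_is_up() {
    [ -f "$1/.up" ]
}

# db2_monitor_sketch INSTANCE_HOME
# INSTANCE_HOME is an assumed argument: the instance owner's home directory.
db2_monitor_sketch() {
    instance_home=$1
    # If the shared storage is not mounted on this node, db2start is not
    # reachable, so the instance cannot be running here: report
    # "not running" instead of an error.
    if [ ! -x "$instance_home/sqllib/adm/db2start" ]; then
        return $OCF_NOT_RUNNING
    fi
    if instance_is_up "$instance_home"; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}
```

The point is only that monitor never has to decide between "not
installed" and "not mounted": both end up as not running.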

> Btw. I've applied the attached patch and could successfully stop, start,
> monitor, etc. the resource.
> Additionally, db2info is more a startup-test to cover cases when DB2 is not
> installed or not configured (and to setup some required variables...). If
> that's the case, the monitor and status operation will also report not
> installed or not configured. Isn't that a normal behaviour? I couldn't
> identify the right behaviour in the OCF RA spec.

Not sure if this case is covered in the OCF specification.

> Actually, with the current RA, there's no way to figure out if DB2 is
> installed or configured correctly. In order to start an instance, you just
> issue db2start (~/sqllib/adm contains this file and is a link pointing to
> the installation itself) as the instance user, but the home directory may
> not be available if it's a shared, not mounted file system. In this
> scenario, you're unable to figure out, if DB2 was not configured correctly,
> if it's just not installed or if the filesystem is just not available.
> 
> As I think further, the only way to handle the different setups, is to tell
> the RA more about the environment (actually, this applies to other RAs as
> well...). A possible way to do so:
> 
> OCF_RESKEY_clustertype = [ full | instance | db ]
> OCF_RESKEY_db2dir = <db2 installation dir:-/opt/IBM/db2>

This is overkill. I think that the CRM might be annoyed if
the monitor operation returns something different from success or
not running. In case there's a real problem (configuration or
otherwise), i.e. software is definitely not available, then the
start action is going to fail.
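
Roughly, the idea could be sketched as a per-action mapping. This is an
illustration under assumptions, not what the RA does today: the helper
name rc_when_unavailable is invented, and the choice of
OCF_ERR_INSTALLED for start and of success for stop is my assumption
(stopping something that is not running is normally not an error).

```shell
#!/bin/sh
# Standard OCF return codes used below.
OCF_SUCCESS=0
OCF_ERR_INSTALLED=5
OCF_NOT_RUNNING=7

# rc_when_unavailable ACTION
# Hypothetical helper: prints the exit code an action should use when
# the DB2 software or instance home is not reachable on this node.
rc_when_unavailable() {
    case "$1" in
        monitor|status)
            # Probes only ever answer "running" or "not running".
            echo $OCF_NOT_RUNNING ;;
        stop)
            # Assumption: nothing to stop, so stop succeeds.
            echo $OCF_SUCCESS ;;
        *)
            # start and anything else: this is where a real problem
            # (assumed here to map to "not installed") shows up.
            echo $OCF_ERR_INSTALLED ;;
    esac
}
```

It would be used along the lines of
exit `rc_when_unavailable $1` once the RA notices that the files are
missing.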

> clustertype full:
> That'd be the default and means that binary files as well as instances
> files are shared and may not be available (during monitoring and status
> operations). It covers RA backward compatibility and does not support
> asymmetrical clusters. It basically behaves like the current RA.
> 
> clustertype instance:
> uses reskey db2dir to check for binary files (installed), db2home to check
> for instance user (configured). Supports asymmetrical clusters, but is not
> able to check the configuration in detail...
> 
> clustertype db:
> Both the DB2 binaries and the instance home directory are local.
> Only the database files are shared; therefore, no adaptation of
> db2nodes.cfg is required, and the binary files (db2dir) as well as the
> configuration can be checked...
> 
> I haven't implemented anything yet, except your suggestions (see comments
> below). Before I'll do further changes, I want to have a good picture,
> about how the RA shall behave, so further suggestions are welcome :)

> > Don't know what db2nodes.cfg looks like. Perhaps it's safer
> > to use 'grep -w ...'. '&>' is not necessary: use just '>'.
> Right. I redirected everything to /dev/null to cover "file not found"
> errors.
> But I've changed the line completely to better handle the db2nodes.cfg
> syntax:
> +  runasdb2 [ x`hostname` = x`awk '{print $2}' $db2nodes` ]
> 
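
A test of the form [ x... = x... ] only behaves well if db2nodes.cfg
has exactly one line. Here is one possible sketch of a check that copes
with several lines. The function name db2nodes_matches_host is invented,
and I am assuming the hostname is always the second whitespace-separated
field and that the file has no blank or comment lines.

```shell
#!/bin/sh
# db2nodes_matches_host FILE HOSTNAME
# Hypothetical helper: succeeds only if FILE is non-empty and the second
# field of every line equals HOSTNAME.
db2nodes_matches_host() {
    awk -v h="$2" '
        $2 != h { bad = 1 }                    # remember any mismatch
        END { exit (NR == 0 || bad) ? 1 : 0 }  # empty file counts as mismatch
    ' "$1"
}
```

In the RA it would be called as something like
db2nodes_matches_host $db2sql/db2nodes.cfg `hostname`, run as the
instance user if the file is only readable by that user.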
> > mktemp's not available on all platforms. There's a function
> > provided which takes care of that: maketempfile.
> ok, changed.
> 
> > \s is a GNU extension. And the regular expression looks a bit
> > odd. Perhaps better to use awk:
> >
> >   awk "{\$2=\"$localhost\"; print}" $db2sql/db2nodes.cfg > $tmpfile
> Yes, looks much nicer and is faster too. Thanks for this hint! Though I had
> to define $localhost as local to get this working?!
> 
> > --reference is a GNU extension. Plain cp should preserve the
> > permissions of the target. Just drop these two...
> >
> > +  mv $tmpfile $db2sql/db2nodes.cfg
> >
> > and
> >
> >   cp $tmpfile $db2sql/db2nodes.cfg
> >   rm $tmpfile
> Changed. Additionally, I've added -f to cp and rm. Is that ok?

Sure.
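
For reference, the whole rewrite step put together might look like the
sketch below. It is only an illustration: update_db2nodes is an invented
name, and the temp file is created with a plain $$ suffix as a stand-in
because the sketch does not rely on the RA's own maketempfile helper.

```shell
#!/bin/sh
# update_db2nodes FILE HOSTNAME
# Hypothetical helper: sets the second field of every line in FILE to
# HOSTNAME while keeping FILE's owner and mode.
update_db2nodes() {
    cfg=$1
    newhost=$2
    # Stand-in for a proper temp file helper (assumption, see above).
    tmpfile="${TMPDIR:-/tmp}/db2nodes.$$"

    # Replace the hostname field; awk -v is POSIX, no GNU extensions needed.
    awk -v h="$newhost" '{ $2 = h; print }' "$cfg" > "$tmpfile" ||
        { rm -f "$tmpfile"; return 1; }

    # An empty result is not something we can recover from.
    [ -s "$tmpfile" ] || { rm -f "$tmpfile"; return 1; }

    # cp writes into the existing target, so its permissions are kept;
    # mv would replace the file with the temp file's owner and mode.
    cp -f "$tmpfile" "$cfg" || { rm -f "$tmpfile"; return 1; }
    rm -f "$tmpfile"
}
```

Note that awk rebuilds each line with single spaces between fields, which
should be fine as long as db2nodes.cfg is whitespace-separated.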

Cheers,

Dejan

> 
> Mit freundlichen Grüßen / Best regards
> 
> Andreas MATHER
> ESLT - Enterprise Services for Linux Technologies
> 
> IBM Austria, Obere Donaustrasse 95, 1020 Vienna
> Phone : +43-1-21145/4799
> Fax: +43-1-21145/8888
> e-mail: [EMAIL PROTECTED]
> 
> IBM Österreich Internationale Büromaschinen Gesellschaft m.b.H.
> Sitz: Wien
> Firmenbuchgericht: Handelsgericht Wien, FN 80000y
> 
> [EMAIL PROTECTED] wrote on 01/23/2008 11:41:31 AM:
> 
> > Hi,
> >
> > On Tue, Jan 22, 2008 at 12:47:20PM +0100, Andreas Mather1 wrote:
> > >
> > >
> > > Hi all,
> > >
> > > I ran into some trouble with the DB2 RA on an asymmetrical 4 node
> > > cluster.
> > >
> > > I've changed the RA to cover my needs and want to share the patches
> > > (which I separated as they address different issues):
> >
> > That's very well :)
> >
> > > db2_asym.patch:
> > > The first one is about the return code of the RA in case the DB2
> > > instance is not available. I've changed the rc to $OCF_ERR_INSTALLED.
> >
> > I guess that you mean "not installed" which is the case for
> > example when the software is on a shared storage. Funny, I
> > thought that I did fix this before.
> >
> > The patch you posted is probably not enough. We have to handle
> > differently "not installed" and "not configured". Also, on
> > monitor operation, the RA should exit with OCF_NOT_RUNNING.
> > Please see if the attached patch works.
> >
> > > db2_env.patch:
> > > In case the instance user's directory is shared, the
> > > ~/sqllib/db2nodes.cfg
> > > does not represent the current hostname after a takeover. This patch
> > > changes the RA to check the db2nodes.cfg for correct content and, if
> > > necessary, changes the hostname.
> >
> > --- db2.orig   2008-01-22 10:52:15.460234564 +0100
> > +++ db2   2008-01-22 12:15:10.709047111 +0100
> > @@ -173,9 +173,41 @@
> >
> >
> >  #
> > +# db2_init_start: Prepare DB2 environment
> > +#
> > +db2_init_start() {
> > +
> > +  localhost=`hostname`
> > +  runasdb2 grep $localhost $db2sql/db2nodes.cfg &>/dev/null
> >
> > Don't know what db2nodes.cfg looks like. Perhaps it's safer
> > to use 'grep -w ...'. '&>' is not necessary: use just '>'.
> >
> > +
> > +  # if db2nodes.cfg is ok, return
> > +  [ $? -eq 0 ] && return
> > +
> > +  # ok, we need to change db2nodes.cfg to list our hostname
> > +  # if mktemp fails, we cannot recover so exit with error
> > +  tmpfile=`mktemp` || exit $OCF_ERR_GENERIC
> >
> > mktemp's not available on all platforms. There's a function
> > provided which takes care of that: maketempfile.
> >
> > +
> > +  cat $db2sql/db2nodes.cfg | sed "s#\(.*\)\s.*\s\(.*\)#\1 $localhost \2#" > $tmpfile
> >
> > \s is a GNU extension. And the regular expression looks a bit
> > odd. Perhaps better to use awk:
> >
> >   awk "{\$2=\"$localhost\"; print}" $db2sql/db2nodes.cfg > $tmpfile
> >
> > +
> > +  # a sed failure is something we cannot recover from, so exit with error
> > +  [ -s $tmpfile ] || exit $OCF_ERR_GENERIC
> > +
> > +  chown --reference=$db2sql/db2nodes.cfg $tmpfile
> > +  chmod --reference=$db2sql/db2nodes.cfg $tmpfile
> >
> > --reference is a GNU extension. Plain cp should preserve the
> > permissions of the target. Just drop these two...
> >
> > +  mv $tmpfile $db2sql/db2nodes.cfg
> >
> > and
> >
> >   cp $tmpfile $db2sql/db2nodes.cfg
> >   rm $tmpfile
> >
> > Can you please review the comments and suggestions and repost?
> > Thanks.
> >
> > Dejan
> >
> > +
> > +}
> > +
> > +
> > +
> > +#
> >  # db2_start: Start the given db2 instance
> >  #
> >  db2_start() {
> > +
> > +  # prepare db2 environment
> > +  db2_init_start
> > +
> >    if
> >      output=`runasdb2 $db2adm/db2start`
> >    then
> > [attachment "db2_2.patch.gz" deleted by Andreas Mather1/Austria/IBM]
> > _______________________________________________________
> > Linux-HA-Dev: [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> > Home Page: http://linux-ha.org/
> 