On 2/26/07, Serge Dubrouski <[EMAIL PROTECTED]> wrote:
You broke it:

./pgsql start
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
chown: missing operand after `:'
Try `chown --help' for more information.
2007/02/26_12:50:26 ERROR: Can't start PostgreSQL.

The reason for these errors is the changed way of initializing
variables.

sorry - i've pushed up a fix

Also, I still don't like that indefinite loop on start, because it
makes it harder to manually troubleshoot the problem if PostgreSQL
doesn't start.

then add a call to ocf_log which indicates the RA is retrying or some-such
the RA is definitely not the best place to set limits on how long a
resource can take to start.
at the very least it leads to confusion when the timeout is less than
an RA's internal limit. on the other hand, if the internal limit is
lower than the timeout, then you're returning before you needed to.
it is also not reliable if any part of the RA can block.
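
A minimal sketch of the approach described above: the start action polls the monitor, logs every retry through ocf_log, and sets no limit of its own, so the cluster manager's start timeout is the only thing that cuts a slow start short. The names wait_for_pgsql and WAIT_INTERVAL, and the stub bodies of ocf_log and pgsql_monitor, are assumptions made for illustration; they are not taken from the pgsql RA.

```shell
#!/bin/sh
# Stand-ins so the sketch runs outside heartbeat. A real RA gets ocf_log and
# the OCF_* return codes from the OCF shell function library.
: "${OCF_SUCCESS:=0}"
ocf_log() { echo "$1: $2" >&2; }

# Stub monitor (assumption): reports "running" from the third call onwards.
MONITOR_CALLS=0
pgsql_monitor() {
    MONITOR_CALLS=$((MONITOR_CALLS + 1))
    [ "$MONITOR_CALLS" -ge 3 ]
}

# Hypothetical helper: wait until the monitor reports success.
# There is deliberately no retry limit here; the start timeout configured
# in the cluster is what ends a start that never completes.
wait_for_pgsql() {
    tries=0
    until pgsql_monitor
    do
        tries=$((tries + 1))
        ocf_log info "PostgreSQL is not up yet, retrying (attempt $tries)"
        sleep "${WAIT_INTERVAL:-1}"
    done
    return $OCF_SUCCESS
}
```

In an RA this would presumably be called right after pg_ctl has been asked to start the server. The retry messages also give someone running the script by hand a clear sign that it is waiting rather than hung.
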

I don't know what the right way to fix those problems is now: fix your
version of the script or fix the previous one.

On 2/26/07, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
> i made some further improvements in:
> http://hg.beekhof.net/lha/crm-dev/rev/2e9b22cfb7e1
>
> On 2/26/07, Keisuke MORI <[EMAIL PROTECTED]> wrote:
> > "Serge Dubrouski" <[EMAIL PROTECTED]> writes:
> > >> "Serge Dubrouski" <[EMAIL PROTECTED]> writes:
> > >>
> > >> > And I don't like the idea of removing the PID file in the "start"
> > >> > function. The standard approach is to remove it after stopping the
> > >> > application. Otherwise it could lead to an attempt to start a second
> > >> > copy of the application.
> > >>
> > >> This is necessary for recovery from a power failure of the
> > >> primary node, for example. There is no chance to clean up via stop
> > >> in such cases.
> > >>
> > >> Duplicate starting is avoided by checking if the postmaster
> > >> process exists beforehand, as the original script does.
> > >
> > > Yes, but in this case you remove the legitimate pid file from the
> > > running instance. You remove it before checking for the
> > > postmaster.
> >
> > Well, I think that the script checks for the postmaster first
> > and removes the file second (it removes it only when no postmaster
> > process exists).
> >
> > Here's the code snip with my patch.
> > pgsql_status checks for it and I think it should be good enough.
> >
----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----
> > pgsql_start() {
> >     if pgsql_status
> >     then
> >         ocf_log info "PostgreSQL is already running. PID=`cat $PIDFILE`"
> >         return $OCF_SUCCESS
> >     fi
> >
> >     if [ -x $PGCTL ]
> >     then
> >         # Remove postmaster.pid if it exists
> >         rm -f $PIDFILE
> >
----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----
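
A rough, self-contained sketch of the same ordering (check for a live process first, remove the PID file only if none is found), with the WARN log suggested further down in this message. The helper names pidfile_process_alive and remove_stale_pidfile and the stub ocf_log are assumptions for illustration; the liveness test is a simple stand-in and is not what pgsql_status actually does.

```shell
#!/bin/sh
# Stub logger so the sketch runs outside heartbeat; a real RA gets ocf_log
# from the OCF shell function library.
ocf_log() { echo "$1: $2" >&2; }

# Hypothetical helper: succeeds only if the first line of the PID file names
# a process that currently exists. kill -0 sends no signal; it only tests
# whether the process can be signalled, so it assumes sufficient privileges.
pidfile_process_alive() {
    [ -f "$1" ] || return 1
    pid=$(head -n 1 "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Hypothetical helper: remove the PID file only when no process owns it.
remove_stale_pidfile() {
    if pidfile_process_alive "$1"
    then
        # Legitimate PID file of a running instance: leave it alone.
        return 0
    fi
    if [ -f "$1" ]
    then
        ocf_log warn "Removing stale PID file $1"
        rm -f "$1"
    fi
}
```

With this ordering a PID file left behind by a power failure is cleaned up on start, while the file of a running instance is never touched. A PID that has been reused by an unrelated process would still look alive to this simple test.
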
> >
> >
> > > Let me think about it, I don't know what is worse in
> > > such a case. Probably you are right and we have the right to assume that
> > > PostgreSQL shouldn't be started outside of cluster control.
> >
> > If postmaster was already started outside of heartbeat control,
> > then it should return OCF_SUCCESS and the postmaster should
> > continue to run.
> >
> > Power failure is one of the most typical situations that we want
> > to handle with HA software, so this 'cleanup in start' is
> > important, I think.
> >
> > Maybe it would be nice if we put a WARN log before removing it.
> >
> > Thanks,
> >
> > >
> > >>
> > >>
> > >> >
> > >> > On 2/23/07, Serge Dubrouski <[EMAIL PROTECTED]> wrote:
> > >> >> I like the idea of the patch, but honestly I don't like how it's
> > >> >> implemented. It should call (as Andrew suggested) the "monitor"
> > >> >> function to check whether pgsql is up or down instead of spreading
> > >> >> the same code all around the script. I'd like to review the idea and
> > >> >> prepare another patch if everybody agrees.
> > >>
> > >> Yes, using the same monitor function would be better.
> > >> I didn't do that just because it will dump many log entries every
> > >> second when it takes time to start.
> > >> It is OK if you don't mind it.
> > >
> > > Don't think that this is a problem. Those files are big even without
> > > those records.
> > >
> > > Thanks for all these proposals.
> > >
> > >>
> > >> Thanks,
> > >> --
> > >> Keisuke MORI
> > >> NTT DATA Intellilink Corporation
> > >> _______________________________________________________
> > >> Linux-HA-Dev: [email protected]
> > >> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> > >> Home Page: http://linux-ha.org/
> > >>
> >
> > --
> > Keisuke MORI
> > Open Source Business Division
> > NTT DATA Intellilink Corporation
> > Tel: +81-3-3534-4811 / Fax: +81-3-3534-4814
> >