This might help. With the resource in failed mode but target_role =
started, I manually ran crm_master after exporting the proper resource
ID, with the following results:

[EMAIL PROTECTED] wsi]# vi stateful_pgsql 
[EMAIL PROTECTED] wsi]# OCF_RESOURCE_INSTANCE=pgsql_wal_5556:0
[EMAIL PROTECTED] wsi]# export OCF_RESOURCE_INSTANCE
[EMAIL PROTECTED] wsi]# crm_master -V
crm_master[7588]: 2007/03/23_12:50:45 ERROR: crm_abort: main: Triggered
non-fatal assert at crm_attribute.c:353 : attr_value != NULL
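
For anyone reproducing this by hand, here is a minimal sketch of how the
call could be wrapped so that it refuses to run when the resource ID is
missing from the environment. The function name set_master_preference and
the CRM_MASTER override variable are hypothetical, made up for this
example. The scores 100 and 10 are simply the values used earlier in this
thread, not recommended settings.

```shell
#!/bin/sh
# Hypothetical helper, not part of heartbeat. CRM_MASTER can be pointed
# at another command (for example "echo") for a dry run.
CRM_MASTER="${CRM_MASTER:-crm_master}"

set_master_preference() {
    role="$1"

    # Map the desired role to the score used in this thread.
    case "$role" in
        master) score=100 ;;
        slave)  score=10 ;;
        *)
            echo "unknown role: $role" >&2
            return 1
            ;;
    esac

    # The resource ID has to be present in the environment,
    # e.g. OCF_RESOURCE_INSTANCE=pgsql_wal_5556:0
    if [ -z "${OCF_RESOURCE_INSTANCE:-}" ]; then
        echo "OCF_RESOURCE_INSTANCE is not set" >&2
        return 1
    fi
    export OCF_RESOURCE_INSTANCE

    "$CRM_MASTER" -v "$score"
}
```

Setting CRM_MASTER=echo before calling set_master_preference master only
prints the arguments that would be passed, which is a cheap way to check
the wrapper without touching the cluster.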


Doug

On Fri, 2007-03-23 at 12:21 -0400, Doug Knight wrote:
> Got it. The attached file contains the strace from the second attempt
> by heartbeat to start the resource up as master, right up until it was
> killed. The resource already showed as failed in the GUI. I zipped it up
> using gzip.
> 
> Doug
> 
> On Fri, 2007-03-23 at 10:11 -0600, Alan Robertson wrote: 
> > Doug Knight wrote:
> > > On Fri, 2007-03-23 at 09:25 -0600, Alan Robertson wrote:
> > >> Doug Knight wrote:
> > >> > Current 2.0.8 tarball from 1/18/07. Process in top looks like:
> > >> > 
> > >> >   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM  TIME+   COMMAND
> > > >> > 24591 root  18   0 1663m 1.5g 1028 R   83 77.8  1:19.42 /usr/sbin/crm_master -v 100
> > >> > 
> > >> > It dies and restarts about every 120 seconds, which happens to be the
> > >> > timeout I have specified for the stop and start methods.
> > >> > 
> > >> > Doug
> > >> > 
> > >> > On Fri, 2007-03-23 at 08:20 -0600, Alan Robertson wrote:
> > >> >> Doug Knight wrote:
> > >> >> > Hi Alan,
> > > >> >> > I've started testing my OCF script, and I'm seeing something unusual
> > > >> >> > during initial startup. I've placed a crm_master call in my
> > > >> >> > stateful_start function, after the function has determined that it is
> > > >> >> > running on what should be the master, and postgresql has successfully
> > > >> >> > started:
> > >> >> > 
> > >> >> > crm_master -v 100
> > >> >> > 
> > > >> >> > When this command gets executed, it starts using nearly 100% CPU, and
> > > >> >> > memory usage increases continuously up to about 68%. Then it dies
> > > >> >> > (killed via timeout?), followed by a second attempt to go master (with
> > > >> >> > the same characteristics, after the function timeout is exceeded). Then
> > > >> >> > a demote is sent (again, after timeout) and it switches to try to become
> > > >> >> > the slave (crm_master -v 10 is what I use, though I'm not sure this is
> > > >> >> > correct usage to say "I want to change to a slave"). Eventually, I wind
> > > >> >> > up with the resource in failed mode.
> > >> >> > 
> > > >> >> > First question: any idea why a straight-line call to crm_master -v 100
> > > >> >> > (not within any loops in my script) would spin up to 100% CPU?
> > >> >>
> > > >> >> Bugs maybe?  What version of heartbeat are you running?  Which processes
> > > >> >> are running up to 100%?  For how long?
> > >> >>
> > > >> >> > Second question: is using crm_master -v with different values the way
> > > >> >> > to say on which node I prefer the master to run (higher number =
> > > >> >> > preferred node)?
> > >> >>
> > >> >> Yes.  I believe that these are added into the values that come from
> > >> >> other constraints in your configuration file to come up with a best
> > >> >> configuration.
> > >>
> > >> Good info.
> > >>
> > >> Could you provide a few hundred lines of strace output to show us what
> > >> it's doing?
> > >>
> > > 
> > > Do you mean the last few hundred lines from ha.log? Just the primary
> > > where I'm trying to start?
> > 
> > No, I mean output from the strace command.  From your reply, I'd guess
> > you've never used it:
> > 
> >   strace -tt -p process-id-of-hung-process > /some/file
> > 
> > Do that for a few seconds, and attach the file to an email to the list.
> > 
> > Does that help?
> > 
> > 
> > 
> _______________________________________________________
> Linux-HA-Dev: [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/