This might help. With the resource in failed mode but target_role = started, I exported the proper resource ID and ran crm_master manually, with the following results:

[EMAIL PROTECTED] wsi]# vi stateful_pgsql
[EMAIL PROTECTED] wsi]# OCF_RESOURCE_INSTANCE=pgsql_wal_5556:0
[EMAIL PROTECTED] wsi]# export OCF_RESOURCE_INSTANCE
[EMAIL PROTECTED] wsi]# crm_master -V
crm_master[7588]: 2007/03/23_12:50:45 ERROR: crm_abort: main: Triggered non-fatal assert at crm_attribute.c:353 : attr_value != NULL

Doug

On Fri, 2007-03-23 at 12:21 -0400, Doug Knight wrote:
> Got it. The attached file contains the strace from the second attempt
> by heartbeat to start the resource up as master, right up until it was
> killed. The resource already showed failed on the gui. I zipped it up
> using gzip.
>
> Doug
>
> On Fri, 2007-03-23 at 10:11 -0600, Alan Robertson wrote:
> > Doug Knight wrote:
> > > On Fri, 2007-03-23 at 09:25 -0600, Alan Robertson wrote:
> > >> Doug Knight wrote:
> > >> > Current 2.0.8 tarball from 1/18/07. Process in top looks like:
> > >> >
> > >> >   PID USER PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
> > >> > 24591 root 18  0 1663m 1.5g 1028 R   83 77.8 1:19.42 /usr/sbin/crm_master -v 100
> > >> >
> > >> > It dies and restarts about every 120 seconds, which happens to be the
> > >> > timeout I have specified for the stop and start methods.
> > >> >
> > >> > Doug
> > >> >
> > >> > On Fri, 2007-03-23 at 08:20 -0600, Alan Robertson wrote:
> > >> >> Doug Knight wrote:
> > >> >> > Hi Alan,
> > >> >> > I've started testing my OCF script, and I'm seeing something unusual
> > >> >> > during initial startup.
> > >> >> > I've placed a crm_master call in my stateful_start function,
> > >> >> > after the function has determined that it is running on what
> > >> >> > should be the master, and postgresql has successfully started:
> > >> >> >
> > >> >> > crm_master -v 100
> > >> >> >
> > >> >> > When this command gets executed, it starts using nearly 100% CPU,
> > >> >> > memory usage continuously increases up to about 68%, then it dies
> > >> >> > (killed via timeout?), followed by a second attempt to go master
> > >> >> > (with the same characteristics, after the function timeout is
> > >> >> > exceeded), then a demote is sent (again, after timeout) and it
> > >> >> > switches to try to become the slave (crm_master -v 10 is what I
> > >> >> > use, though I'm not sure this is correct usage to say "I want to
> > >> >> > change to a slave"). Eventually, I wind up with the resource in
> > >> >> > failed mode.
> > >> >> >
> > >> >> > First question, any idea why the straight line running of a
> > >> >> > crm_master -v 100 (not within any loops in my script) would spin
> > >> >> > up to 100%?
> > >> >>
> > >> >> Bugs maybe? What version of heartbeat are you running? Which
> > >> >> processes are running up to 100%? For how long?
> > >> >>
> > >> >> > Second question, is using the crm_master -v with different values
> > >> >> > the way to say on which node I prefer the master to run (higher
> > >> >> > number = preferred node)?
> > >> >>
> > >> >> Yes. I believe that these are added into the values that come from
> > >> >> other constraints in your configuration file to come up with a best
> > >> >> configuration.
> > >>
> > >> Good info.
> > >>
> > >> Could you provide a few hundred lines of strace output to show us what
> > >> it's doing?
> > >>
> > >
> > > Do you mean the last few hundred lines from ha.log? Just the primary
> > > where I'm trying to start?
> > >
> > No, I mean output from the strace command. From your reply, I'd guess
> > you've never used it:
> >
> > strace -tt -p process-id-of-hung-process > /some/file
> >
> > Do that for a few seconds, and attach the file to an email to the list.
> >
> > Does that help?
> >
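
For reference, the score-setting discussed in this thread could be wrapped in a small helper inside the resource agent. This is only a sketch under stated assumptions: the function name set_master_preference and the CRM_MASTER_CMD override are hypothetical, and the scores 100 and 10 are simply the values mentioned above (whether 10 is the right way to signal "slave" is still an open question here). It checks that OCF_RESOURCE_INSTANCE is set before calling crm_master, since that variable had to be exported by hand for the manual run. It does not address the CPU spin itself.

```shell
#!/bin/sh
# Hypothetical helper, not part of heartbeat: set the master preference
# score for this node based on the role the script has decided on.
# CRM_MASTER_CMD is an assumed override so the call can be dry-run
# (for example CRM_MASTER_CMD="echo crm_master").
set_master_preference() {
    smp_role="$1"

    # The manual run in this thread needed OCF_RESOURCE_INSTANCE exported,
    # so refuse to call crm_master without it.
    if [ -z "${OCF_RESOURCE_INSTANCE:-}" ]; then
        echo "OCF_RESOURCE_INSTANCE is not set" >&2
        return 1
    fi

    case "$smp_role" in
        master) smp_score=100 ;;  # value used in the thread for the preferred master
        slave)  smp_score=10 ;;   # value used in the thread; correctness not confirmed
        *)
            echo "unknown role: $smp_role" >&2
            return 1
            ;;
    esac

    # Left unquoted on purpose so an override such as "echo crm_master"
    # splits into a command plus its first argument.
    ${CRM_MASTER_CMD:-crm_master} -v "$smp_score"
}
```

With CRM_MASTER_CMD="echo crm_master" and OCF_RESOURCE_INSTANCE exported, calling set_master_preference master only prints the command it would have run, which makes it possible to check the script's logic without touching the cluster.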
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
