Thirty seconds _should_ be enough time, but I'm curious as to why my five-minute timeout isn't in effect here. --BO
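To make the mismatch concrete, here is a minimal sketch that puts the timeouts from the quoted message below side by side: the 30000 ms reported in the tengine log line against the two values found in the CIB. The unit table, the `to_ms` and `configured_timeouts` helpers, and the `<cib>` wrapper element are assumptions made for illustration; this is not how heartbeat itself parses these values.

```python
import re
import xml.etree.ElementTree as ET

# The two CIB entries quoted in the thread, wrapped in a hypothetical
# <cib> root element so that they parse as a single XML document.
CIB_FRAGMENT = """
<cib>
  <nvpair id="cib-bootstrap-options-transition_idle_timeout"
          name="transition_idle_timeout" value="5min"/>
  <op id="test-1_DRAC_reset" name="reset" timeout="3min" prereq="nothing"/>
</cib>
"""

# Timeout reported by tengine in the log line: (timeout=30000)
LOGGED_FENCE_TIMEOUT_MS = 30000

# Assumed unit suffixes for this sketch only; heartbeat may accept others.
UNIT_MS = {"ms": 1, "s": 1000, "sec": 1000, "min": 60000}


def to_ms(text):
    """Convert a duration such as '5min' or '30s' to milliseconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]+)\s*", text.lower())
    if match is None or match.group(2) not in UNIT_MS:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(match.group(1)) * UNIT_MS[match.group(2)]


def configured_timeouts(xml_text):
    """Collect every timeout found in nvpair and op elements, in ms."""
    root = ET.fromstring(xml_text)
    found = {}
    for nvpair in root.iter("nvpair"):
        if "timeout" in nvpair.get("name", ""):
            found[nvpair.get("name")] = to_ms(nvpair.get("value"))
    for op in root.iter("op"):
        if op.get("timeout") is not None:
            found["op:" + op.get("id")] = to_ms(op.get("timeout"))
    return found


timeouts = configured_timeouts(CIB_FRAGMENT)
for name, ms in timeouts.items():
    verdict = "matches" if ms == LOGGED_FENCE_TIMEOUT_MS else "differs from"
    print(f"{name}: {ms} ms ({verdict} the logged {LOGGED_FENCE_TIMEOUT_MS} ms)")
```

Neither configured value equals the 30000 ms in the log, which is consistent with the fencing operation taking its timeout from some other setting.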
On 4/19/07, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
On Tue, Apr 17, 2007 at 03:53:41PM -0400, Bjorn Oglefjorn wrote:
> Here they are again.

It looks like this

    Apr 4 11:28:20 test-2 stonithd: [13658]: info: Failed to STONITH the node test-1.domain: optype=1, op_result=2

means that the stonith operation timed out. I'll fix the code to
raise this to an error condition and include the descriptions.

Before, we see:

    Apr 4 11:27:50 test-2 tengine: [13668]: info: te_fence_node: actions.c Executing reboot fencing operation (16) on test-1.domain (timeout=30000)

Note the timeout: 30secs. After some digging I found that it's
transition_timeout. Is 30 seconds enough time for your stonith
agent to perform the reset?

Anyway, in the CIB (crm_verify doesn't complain) I find only
these two timeouts:

    <nvpair id="cib-bootstrap-options-transition_idle_timeout" name="transition_idle_timeout" value="5min"/>
    ...
    <op id="test-1_DRAC_reset" name="reset" timeout="3min" prereq="nothing"/>

1. transition_timeout is not in the annotated CIB.

2. Should the user specify this timeout in the crm_config section
and calculate the maximum value of all rsc operations' timeouts?

3. What's the difference between the transition_timeout and the
transition_idle_timeout?

Andrew, can you please take a look.

Thanks.

>
> On 4/17/07, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
> >
> >On 4/17/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
> >> I know that my plugin is getting called because of the logging that the
> >> plugin does.
> >
> >do we get to see that logging at all? preferably in the context of
> >the other log messages
> >
> >> That said, I also know my plugin is not receiving any 'reset'
> >> operation request from heartbeat. If you see below, request actions are
> >> logged. The only actions logged when node failure is simulated are:
> >> getconfignames, status, and gethosts, in that order. We should also see
> >> getinfo-devid and reset operations logged, but they are never present.
> >> --BO
> >>
> >> On 4/17/07, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
> >> >
> >> > On 4/17/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
> >> > > Yes, I most certainly have. The stonith command-line tool has no
> >> > > problem at all with the plugin. The following was run from
> >> > > test-1.domain. The indented log entries are from the debug log
> >> > > of the stonith plugin:
> >> >
> >> > I'm no stonith expert, but the outputs certainly look plausible enough.
> >> > You kept the same CIB?
> >> > Are you sure your plugin is getting called?
> >> >
> >> > > root:~ # stonith -t external/drac4 DRAC_ADDR=test-2.drac.domain DRAC_LOGIN=root DRAC_PASSWD=******** -lS
> >> > > stonith: external/drac4 device OK.
> >> > > test-2.drac.domain
> >> > >
> >> > >     [Tue Apr 17 09:57:20 2007] Requested Action for : getconfignames
> >> > >     [Tue Apr 17 09:57:22 2007] Requested Action for test-2.drac.domain : status
> >> > >     [Tue Apr 17 09:57:22 2007] Success: test-2.drac.domain is reachable
> >> > >     [Tue Apr 17 09:57:23 2007] Requested Action for : getinfo-devid
> >> > >     [Tue Apr 17 09:57:24 2007] Requested Action for test-2.drac.domain : gethosts
> >> > >
> >> > > root:~ # stonith -t external/drac4 DRAC_ADDR=test-2.drac.domain DRAC_LOGIN=root DRAC_PASSWD=******** -T on test-2.domain
> >> > >
> >> > >     [Tue Apr 17 09:57:28 2007] Requested Action for : getconfignames
> >> > >     [Tue Apr 17 09:57:30 2007] Requested Action for test-2.drac.domain : status
> >> > >     [Tue Apr 17 09:57:30 2007] Success: test-2.drac.domain is reachable
> >> > >     [Tue Apr 17 09:57:31 2007] Requested Action for : getinfo-devid
> >> > >     [Tue Apr 17 09:57:33 2007] Requested Action for test-2.drac.domain: on
> >> > >     [Tue Apr 17 09:57:33 2007] test-2.drac.domain Initial Power Status = ON
> >> > >     [Tue Apr 17 09:57:33 2007] Success: test-2.drac.domain Power Status = ON
> >> > >
> >> > > root:~ # stonith -t external/drac4 DRAC_ADDR=test-2.drac.domain DRAC_LOGIN=root DRAC_PASSWD=******** -T reset test-2.domain
> >> > >
> >> > >     [Tue Apr 17 09:57:46 2007] Requested Action for : getconfignames
> >> > >     [Tue Apr 17 09:57:48 2007] Requested Action for test-2.drac.domain : status
> >> > >     [Tue Apr 17 09:57:48 2007] Success: test-2.drac.domain is reachable
> >> > >     [Tue Apr 17 09:57:49 2007] Requested Action for : getinfo-devid
> >> > >     [Tue Apr 17 09:57:50 2007] Requested Action for test-2.drac.domain : reset
> >> > >     [Tue Apr 17 09:57:50 2007] test-2.drac.domain Initial Power Status = ON
> >> > >     [Tue Apr 17 09:57:58 2007] Success: test-2.drac.domain Power Status = RESET
> >> > >
> >> > > --BO
> >> > >
> >> > > On 4/17/07, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
> >> > > >
> >> > > > On 4/16/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
> >> > > > > No ideas?
> >> > > >
> >> > > > none at all - have you tried calling it manually using the stonith
> >> > > > command-line tool to make sure it works?
> >> > > >
> >> > > > > On 4/9/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
> >> > > > > >
> >> > > > > > I quickly put together a STONITH plugin for testing this. It
> >> > > > > > conforms to the heartbeat spec and always lies to heartbeat
> >> > > > > > returning success no matter what. With this plugin in place
> >> > > > > > I'm still getting this error:
> >> > > > > >
> >> > > > > > Apr 9 15:40:47 test-2 stonithd: [8791]: info: Failed to STONITH the node test-1.domain: optype=1, op_result=2
> >> > > > > > Apr 9 15:40:47 test-2 tengine: [8803]: info: tengine_stonith_callback: callbacks.c call=-4, optype=1, node_name= test-1.domain, result=2, node_list=, action=13;5:6eaeba12-87c3-465e-98f1-78585e71e495
> >> > > > > > Apr 9 15:40:47 test-2 tengine: [8803]: ERROR: tengine_stonith_callback: callbacks.c Stonith of test-1.domain failed (2)... aborting transition.
> >> > > > > >
> >> > > > > > --BO
> >> > > > _______________________________________________
> >> > > > Linux-HA mailing list
> >> > > > [email protected]
> >> > > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >> > > > See also: http://linux-ha.org/ReportingProblems

--
Dejan
