Hi,

On Tue, Nov 27, 2012 at 08:28:04AM +0100, Dejan Muhamedagic wrote:
> On Wed, Nov 21, 2012 at 07:06:35PM +0100, Lars Marowsky-Bree wrote:
> > On 2012-11-21T18:02:49, Dejan Muhamedagic <de...@suse.de> wrote:
> >
> > > > What would you think of OCF_RESKEY_RA_TRACE ?
> > > A meta attribute perhaps? That wouldn't cause a resource
> > > restart.
> >
> > Point, but - meta attributes so far were mostly for the PE/pacemaker,
> > this would be for the RA.
>
> Not exactly for the RA itself. The RA execution would just be
> observed. The attribute is consumed by others. Whether it is PE
> or lrmd or something else makes less of a difference. It is up to
> these subsystems to sort the meta attributes out.

It turns out that pacemaker won't export meta attributes which
were not recognized. At any rate, we can go with
OCF_RESKEY_trace_ra. The good thing is that it can be specified
per operation (op start trace_ra=1). The interface is simple and
it's described in ocf-shellfuncs. It would get support in the UI.

> > Would a changed definition for a resource we're trying to trace be an
> > actual problem? I mean, tracing clearly means you want to trace an
> > resource action, so one would put the attribute on the resource before
> > triggering that.
> >
> > (It can also be put on in maintenance mode, avoiding the restart.)
> >
> > > > Our include script could
> > > > enable that; it's unlikely that the problem occurs prior to that.
> > > >
> > > > - never (default): Does nothing
> > > > - always: Always trace, write to $(which path?)/raname.rscid.$timestamp
> > >
> > > bash has a way to send trace to a separate FD, but that feature
> > > is available with version >=4.x. Otherwise, it could be messy to
> > > separate the trace from the other stderr output. Of course, one
> > > could just redirect stderr in this case. I suppose that that
> > > would work too.
> >
> > I assume that'd be easiest.
> >
> > (And people not using bash can write their own implementation for this.
> > ;-)
> >
> > > > - on-error: always trace, but delete on successful exit
> > > Good idea.

This is not implemented right now.

The patch is attached. It's planned for the release 3.9.5.

Thanks,

Dejan

> > > > hb_report/history explorer could gather this too.
> > > Right.
> > >
> > > > (And yes I know this introduces a fake parameter that doesn't really
> > > > exist. But it'd be so helpful.)
> > > >
> > > > Sorry. Maybe I'm getting carried away ;-)
> > >
> > > Good points. I didn't really think much (yet) about how to
> > > further facilitate the feature, just had a vague idea that
> > > somehow lrmd should set the environment variable.
> >
> > Sure. LRM is an other obvious entry point for increased
> > tracing/logging. That could also work.
> >
> > > Perhaps we could do something like this:
> > >
> > > # crm resource trace <rsc_id> [<action>] [<when-to-trace>]
> > >
> > > This would set the appropriate meta attribute for the resource which
> > > would trickle down to the RA. ocf-shellfuncs would then do whatever's
> > > necessary to setup the trace. The file management could get tricky
> > > though, as we don't have a single point of exit (and trap is already
> > > used elsewhere).
> >
> > The file/log management would be easier to do in the LRM - and also
> > handle the timeout situation; that could also make use of the "redirect
> > trace elsewhere" if the shell is new enough.
>
> Indeed. Until then, ocf-shellfuncs can fallback to some well
> known location.
>
> Thanks,
>
> Dejan
>
> >
> > Regards,
> > Lars
> >
> > --
> > Architect Storage/HA
> > SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer,
> > HRB 21284 (AG Nürnberg)
> > "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
> >
> > _______________________________________________________
> > Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> > Home Page: http://linux-ha.org/
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/

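For reference, the two redirection strategies discussed above (a dedicated FD where bash supports it, plain stderr redirection otherwise) can be sketched roughly as follows. This is only an illustration: have_xtracefd, trace_to_file and demo are hypothetical names, not the functions from ocf-shellfuncs, and the check assumes that BASH_XTRACEFD is available from bash 4.1 on.

```shell
# Sketch only: have_xtracefd, trace_to_file and demo are hypothetical
# helpers for illustration, not part of ocf-shellfuncs.

# True if the running shell is bash with BASH_XTRACEFD support
# (assumed here to mean bash 4.1 or later).
have_xtracefd() {
	[ -n "$BASH_VERSION" ] || return 1
	[ "${BASH_VERSINFO[0]}" -gt 4 ] && return 0
	[ "${BASH_VERSINFO[0]}" -eq 4 ] && [ "${BASH_VERSINFO[1]}" -ge 1 ]
}

# Send the "set -x" trace to the file given as the first argument.
trace_to_file() {
	local dest=$1
	if have_xtracefd; then
		exec 9>"$dest"     # dedicated FD, the trace stays separate
		BASH_XTRACEFD=9
	else
		exec 2>"$dest"     # fallback: trace and stderr share the file
	fi
	set -x
}

demo() {
	echo "normal stderr output" >&2
	: "traced command"
}

tracefile=$(mktemp)
errfile=$(mktemp)
(
	# Run in a subshell so that the redirections and "set -x"
	# do not leak into the calling shell.
	exec 2>"$errfile"
	trace_to_file "$tracefile"
	demo
	set +x
)

echo "--- trace ---";  cat "$tracefile"
echo "--- stderr ---"; cat "$errfile"
```

With a new enough bash the message from demo ends up in the stderr file and only the trace goes to the trace file; with any other shell both land in the trace file, which is the "messy" case mentioned in the thread.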
From edad4f4f7ef39da0243c1b3444bb8630443a8c38 Mon Sep 17 00:00:00 2001
From: Dejan Muhamedagic <de...@suse.de>
Date: Wed, 23 Jan 2013 17:36:08 +0100
Subject: [PATCH] Medium: ocf-shellfuncs: RA tracing

---
 doc/dev-guides/ra-dev-guide.txt |    6 +++
 heartbeat/ocf-shellfuncs.in     |   82 +++++++++++++++++++++++++++++++++++++++++
 tools/ocf-tester.8              |    5 ++-
 tools/ocf-tester.in             |    4 +-
 4 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/doc/dev-guides/ra-dev-guide.txt b/doc/dev-guides/ra-dev-guide.txt
index af5e3b1..11e9a5d 100644
--- a/doc/dev-guides/ra-dev-guide.txt
+++ b/doc/dev-guides/ra-dev-guide.txt
@@ -1623,6 +1623,12 @@ Beginning tests for /home/johndoe/ra-dev/foobar...
 /home/johndoe/ra-dev/foobar passed all tests
 --------------------------------------------------------------------------
 
+If the resource agent exhibits some difficult to grasp behaviour,
+which is typically the case with just developed software, there
+are +-v+ and +-d+ options to dump more output. If that does not
+help, instruct +ocf-tester+ to trace the resource agent with
++-X+ (make sure to redirect output to a file, unless you are a
+really fast reader).
 
 === Testing with +ocft+
 
diff --git a/heartbeat/ocf-shellfuncs.in b/heartbeat/ocf-shellfuncs.in
index f3822b7..8c3d269 100644
--- a/heartbeat/ocf-shellfuncs.in
+++ b/heartbeat/ocf-shellfuncs.in
@@ -675,4 +675,86 @@ ocf_stop_processes() {
 	return 1
 }
 
+#
+# RA tracing may be turned on by setting OCF_TRACE_RA
+# the trace output will be saved to OCF_TRACE_FILE, if set, or
+# by default to
+#   $HA_VARRUN/trace_ra/<type>/<id>.<action>.<timestamp>
+#   e.g. $HA_VARRUN/trace_ra/oracle/db.start.2012-11-27.08:37:08
+#
+# OCF_TRACE_FILE:
+# - FD (small integer [3-9]) in that case it is up to the callers
+#   to capture output; the FD _must_ be open for writing
+# - absolute path
+#
+# NB: FD 9 may be used for tracing with bash >= v4
+#
+ocf_is_bash4() {
+	echo "$SHELL" | grep bash > /dev/null &&
+		[ ${BASH_VERSINFO[0]} -ge 4 ]
+}
+ocf_trace_redirect_to_file() {
+	local dest=$1
+	if ocf_is_bash4; then
+		exec 9>$dest
+		BASH_XTRACEFD=9
+	else
+		exec 2>$dest
+	fi
+}
+ocf_trace_redirect_to_fd() {
+	local fd=$1
+	if ocf_is_bash4; then
+		BASH_XTRACEFD=$fd
+	else
+		exec 2>&$fd
+	fi
+}
+__ocf_test_trc_dest() {
+	local dest=$1
+	if ! touch $dest; then
+		ocf_log warn "$dest not writable, trace not going to happen"
+		__OCF_TRC_DEST=""
+		__OCF_TRC_MANAGE=""
+		return 1
+	fi
+	return 0
+}
+ocf_default_trace_dest() {
+	tty >/dev/null && return
+	if [ -n "$OCF_RESOURCE_TYPE" -a \
+			-n "$OCF_RESOURCE_INSTANCE" -a -n "$__OCF_ACTION" ]; then
+		local ts=`date +%F.%T`
+		__OCF_TRC_DEST=$HA_VARRUN/trace_ra/${OCF_RESOURCE_TYPE}/${OCF_RESOURCE_INSTANCE}.${__OCF_ACTION}.$ts
+		__OCF_TRC_MANAGE="1"
+	fi
+}
+
+ocf_start_trace() {
+	export __OCF_TRC_DEST="" __OCF_TRC_MANAGE=""
+	case "$OCF_TRACE_FILE" in
+	[3-9]) ocf_trace_redirect_to_fd "$OCF_TRACE_FILE" ;;
+	/*/*) __OCF_TRC_DEST=$OCF_TRACE_FILE ;;
+	"") ocf_default_trace_dest ;;
+	*)
+		ocf_log warn "OCF_TRACE_FILE must be set to either FD (open for writing) or absolute file path"
+		ocf_default_trace_dest
+		;;
+	esac
+	if [ "$__OCF_TRC_DEST" ]; then
+		mkdir -p `dirname $__OCF_TRC_DEST`
+		__ocf_test_trc_dest $__OCF_TRC_DEST ||
+			return
+		ocf_trace_redirect_to_file "$__OCF_TRC_DEST"
+	fi
+	PS4='+ `date +"%T"`: ${FUNCNAME[0]:+${FUNCNAME[0]}:}${LINENO}: '
+	set -x
+}
+ocf_stop_trace() {
+	set +x
+}
+
 __ocf_set_defaults "$@"
+
+: ${OCF_TRACE_RA:=$OCF_RESKEY_trace_ra}
+ocf_is_true "$OCF_TRACE_RA" && ocf_start_trace
diff --git a/tools/ocf-tester.8 b/tools/ocf-tester.8
index ba07058..850ec0b 100644
--- a/tools/ocf-tester.8
+++ b/tools/ocf-tester.8
@@ -3,7 +3,7 @@
 ocf-tester \- Part of the Linux-HA project
 .SH SYNOPSIS
 .B ocf-tester
-[\fI-Lh\fR] \fI-n resource_name \fR[\fI-o name=value\fR]\fI* /full/path/to/resource/agent\fR
+[\fI-LhvqdX\fR] \fI-n resource_name \fR[\fI-o name=value\fR]\fI* /full/path/to/resource/agent\fR
 .SH DESCRIPTION
 Tool for testing if a cluster resource is OCF compliant
 .SH OPTIONS
@@ -43,6 +43,9 @@ Be quiet while testing
 \fB\-d\fR
 Turn on RA debugging
 .TP
+\fB\-X\fR
+Turn on RA tracing (expect large output)
+.TP
 \fB\-n\fR name
 Name of the resource
 .TP
diff --git a/tools/ocf-tester.in b/tools/ocf-tester.in
index 214e25c..d14e511 100755
--- a/tools/ocf-tester.in
+++ b/tools/ocf-tester.in
@@ -51,13 +51,14 @@
 usage() {
 	echo "Tool for testing if a cluster resource is OCF compliant"
 	echo ""
-	echo "Usage: ocf-tester [-Lh] -n resource_name [-o name=value]* /full/path/to/resource/agent"
+	echo "Usage: ocf-tester [-LhvqdX] -n resource_name [-o name=value]* /full/path/to/resource/agent"
 	echo ""
 	echo "Options:"
 	echo " -h This text"
 	echo " -v Be verbose while testing"
 	echo " -q Be quiet while testing"
 	echo " -d Turn on RA debugging"
+	echo " -X Turn on RA tracing (expect large output)"
 	echo " -n name Name of the resource"
 	echo " -o name=value Name and value of any parameters required by the agent"
 	echo " -L Use lrmadmin/lrmd for tests"
@@ -106,6 +107,7 @@ while test "$done" = "0"; do
 		-L) use_lrmd=1; shift;;
 		-v) verbose=1; shift;;
 		-d) export HA_debug=1; shift;;
+		-X) export OCF_TRACE_RA=1; verbose=1; shift;;
 		-q) quiet=1; shift;;
 		-?|--help) usage 0;;
 		--version) echo "@PACKAGE_VERSION@"; exit 0;;
-- 
1.8.0
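
The on-error mode suggested in the thread (always trace, but delete the trace on successful exit) is not part of the patch. A minimal sketch of how a caller might get that behaviour without touching the RA could look like the following. It assumes the action can be wrapped in a subshell; run_traced_on_error, LAST_TRACE, good_action and bad_action are made-up names for illustration only.

```shell
# Sketch only: run_traced_on_error and LAST_TRACE are hypothetical names,
# not part of ocf-shellfuncs or of the patch.
LAST_TRACE=""

# Run the given command with tracing on and keep the trace file
# only if the command fails.
run_traced_on_error() {
	local rc=0
	LAST_TRACE=$(mktemp) || return 1
	# The subshell keeps "set -x" and the stderr redirection away
	# from the caller, so no trap or cleanup is needed there.
	(
		exec 2>"$LAST_TRACE"
		set -x
		"$@"
	) || rc=$?
	if [ "$rc" -eq 0 ]; then
		rm -f "$LAST_TRACE"    # success: the trace is of no interest
	fi
	return "$rc"
}

# Two stand-ins for RA actions.
good_action() { true; }
bad_action() { false; }

good_rc=0
run_traced_on_error good_action || good_rc=$?
good_trace=$LAST_TRACE

bad_rc=0
run_traced_on_error bad_action || bad_rc=$?
bad_trace=$LAST_TRACE

echo "good_action: rc=$good_rc"
echo "bad_action: rc=$bad_rc, trace in $bad_trace"
```

As in the fallback case above, the trace shares the file with whatever the action writes to stderr. Doing this in the caller rather than in ocf-shellfuncs sidesteps the missing single point of exit mentioned earlier, which is presumably why lrmd looked like the easier place for the file management.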