Hi,

On Tue, Nov 27, 2012 at 08:28:04AM +0100, Dejan Muhamedagic wrote:
> On Wed, Nov 21, 2012 at 07:06:35PM +0100, Lars Marowsky-Bree wrote:
> > On 2012-11-21T18:02:49, Dejan Muhamedagic <de...@suse.de> wrote:
> > 
> > > > What would you think of OCF_RESKEY_RA_TRACE ?
> > > A meta attribute perhaps? That wouldn't cause a resource
> > > restart.
> > 
> > Point, but - meta attributes so far were mostly for the PE/pacemaker,
> > this would be for the RA.
> 
> Not exactly for the RA itself. The RA execution would just be
> observed. The attribute is consumed by others. Whether it is PE
> or lrmd or something else makes less of a difference. It is up to
> these subsystems to sort the meta attributes out.

It turns out that pacemaker won't export meta attributes that
it does not recognize. At any rate, we can go with
OCF_RESKEY_trace_ra. The good thing is that it can be specified
per operation (op start trace_ra=1).
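
As a rough illustration (not part of the attached patch), this is how
an explicitly exported OCF_TRACE_RA and the trace_ra parameter could
combine. trace_requested is a hypothetical helper, and its list of
accepted "true" values is an assumption; the patch itself relies on
ocf_is_true from ocf-shellfuncs.

```shell
#!/bin/sh
# Sketch only: trace_requested is a hypothetical stand-in for the
# two lines at the end of ocf-shellfuncs in the attached patch.
trace_requested() {
	# A non-empty OCF_TRACE_RA (e.g. exported by a test tool) wins;
	# otherwise fall back to the resource/operation parameter.
	: "${OCF_TRACE_RA:=$OCF_RESKEY_trace_ra}"
	# Assumption: a simplified set of "true" values.
	case "$OCF_TRACE_RA" in
	1|yes|true|on|YES|TRUE|ON) return 0 ;;
	*) return 1 ;;
	esac
}
```

With "op start trace_ra=1" only the start operation would see
OCF_RESKEY_trace_ra=1, so only that operation would be traced.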

The interface is simple and it's described in ocf-shellfuncs. It
would get support in the UI.
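
To make the OCF_TRACE_FILE part of the interface concrete, here is a
small sketch of how its value is classified, mirroring the case
statement in the patch. classify_trace_dest is a hypothetical name
used only for this illustration.

```shell
#!/bin/sh
# Sketch only: classify_trace_dest is hypothetical; the patterns are
# the ones used by ocf_start_trace in the attached patch.
classify_trace_dest() {
	case "$1" in
	[3-9]) echo fd ;;       # already open file descriptor
	/*/*)  echo file ;;     # absolute path with at least two slashes
	"")    echo default ;;  # unset: use the default location
	*)     echo invalid ;;  # warn, then use the default location
	esac
}
```

Note that, with these patterns, a file directly under / (such as
/trace) does not match the path pattern and is treated as invalid.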

> > Would a changed definition for a resource we're trying to trace be an
> > actual problem? I mean, tracing clearly means you want to trace a
> > resource action, so one would put the attribute on the resource before
> > triggering that.
> > 
> > (It can also be put on in maintenance mode, avoiding the restart.)
> > 
> > > >  Our include script could
> > > > enable that; it's unlikely that the problem occurs prior to that.
> > > > 
> > > > - never (default): Does nothing
> > > > - always: Always trace, write to $(which path?)/raname.rscid.$timestamp
> > > 
> > > bash has a way to send trace to a separate FD, but that feature
> > > is available with version >=4.x. Otherwise, it could be messy to
> > > separate the trace from the other stderr output. Of course, one
> > > could just redirect stderr in this case. I suppose that that
> > > would work too.
> > 
> > I assume that'd be easiest.
> > 
> > (And people not using bash can write their own implementation for this.
> > ;-)
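
The two approaches discussed above (a dedicated trace FD with a recent
bash, or plain stderr redirection otherwise) can be sketched as
follows. trace_to_file and has_xtracefd are hypothetical names, and
the version check assumes that BASH_XTRACEFD first appeared in
bash 4.1.

```shell
#!/bin/sh
# Sketch only; trace_to_file and has_xtracefd are hypothetical names.

# True if the running shell is a bash that supports BASH_XTRACEFD.
# Assumption: the variable appeared in bash 4.1, so the minor version
# is checked as well as the major one.
has_xtracefd() {
	[ -n "$BASH_VERSION" ] || return 1
	_major=${BASH_VERSION%%.*}
	_rest=${BASH_VERSION#*.}
	_minor=${_rest%%.*}
	[ "$_major" -gt 4 ] || { [ "$_major" -eq 4 ] && [ "$_minor" -ge 1 ]; }
}

# Run a command with "set -x" output going to the file given as $1.
trace_to_file() {
	_trace_dest=$1
	shift
	(
		if has_xtracefd; then
			# Only the trace goes to the file; stderr is untouched.
			exec 9>"$_trace_dest"
			BASH_XTRACEFD=9
		else
			# Fallback: trace and regular stderr both go to the file.
			exec 2>"$_trace_dest"
		fi
		set -x
		"$@"
	)
}
```

For example, "trace_to_file /tmp/trace.out some_command" leaves the
command's stdout alone and collects the trace in /tmp/trace.out.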
> > 
> > > > - on-error: always trace, but delete on successful exit
> > > Good idea.

This is not implemented right now.
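
For reference, one possible shape of on-error, as a sketch under the
assumption that the traced action can be wrapped by a caller (which
sidesteps the trap problem but is not how an RA sourcing
ocf-shellfuncs runs today). run_traced_on_error is a hypothetical
name.

```shell
#!/bin/sh
# Sketch only: run_traced_on_error is hypothetical and not part of
# the attached patch.
run_traced_on_error() {
	_dest=$1
	shift
	# Trace in a subshell so that "set -x" and the redirection do not
	# leak into the caller.
	( exec 2>"$_dest"; set -x; "$@" )
	_rc=$?
	# Keep the trace only if the command failed.
	[ "$_rc" -eq 0 ] && rm -f "$_dest"
	return "$_rc"
}
```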

The patch is attached. It's planned for the release 3.9.5.

Thanks,

Dejan

> > > > hb_report/history explorer could gather this too.
> > > Right.
> > > 
> > > > (And yes I know this introduces a fake parameter that doesn't really
> > > > exist. But it'd be so helpful.)
> > > > 
> > > > Sorry. Maybe I'm getting carried away ;-)
> > > 
> > > Good points. I didn't really think much (yet) about how to
> > > further facilitate the feature, just had a vague idea that
> > > somehow lrmd should set the environment variable.
> > 
> > Sure. LRM is another obvious entry point for increased
> > tracing/logging. That could also work.
> > 
> > > Perhaps we could do something like this:
> > > 
> > > # crm resource trace <rsc_id> [<action>] [<when-to-trace>]
> > > 
> > > This would set the appropriate meta attribute for the resource which
> > > would trickle down to the RA. ocf-shellfuncs would then do whatever's
> > > necessary to set up the trace. The file management could get tricky
> > > though, as we don't have a single point of exit (and trap is already
> > > used elsewhere).
> > 
> > The file/log management would be easier to do in the LRM - and also
> > handle the timeout situation; that could also make use of the "redirect
> > trace elsewhere" if the shell is new enough.
> 
> Indeed. Until then, ocf-shellfuncs can fall back to some well
> known location.
> 
> Thanks,
> 
> Dejan
> 
> > 
> > Regards,
> >     Lars
> > 
> > -- 
> > Architect Storage/HA
> > SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, 
> > HRB 21284 (AG Nürnberg)
> > "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
> > 
> > _______________________________________________________
> > Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> > Home Page: http://linux-ha.org/

From edad4f4f7ef39da0243c1b3444bb8630443a8c38 Mon Sep 17 00:00:00 2001
From: Dejan Muhamedagic <de...@suse.de>
Date: Wed, 23 Jan 2013 17:36:08 +0100
Subject: [PATCH] Medium: ocf-shellfuncs: RA tracing

---
 doc/dev-guides/ra-dev-guide.txt |  6 +++
 heartbeat/ocf-shellfuncs.in     | 82 +++++++++++++++++++++++++++++++++++++++++
 tools/ocf-tester.8              |  5 ++-
 tools/ocf-tester.in             |  4 +-
 4 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/doc/dev-guides/ra-dev-guide.txt b/doc/dev-guides/ra-dev-guide.txt
index af5e3b1..11e9a5d 100644
--- a/doc/dev-guides/ra-dev-guide.txt
+++ b/doc/dev-guides/ra-dev-guide.txt
@@ -1623,6 +1623,12 @@ Beginning tests for /home/johndoe/ra-dev/foobar...
 /home/johndoe/ra-dev/foobar passed all tests
 --------------------------------------------------------------------------
 
+If the resource agent exhibits behaviour that is difficult to
+understand, which is typical of newly developed software, use the
++-v+ and +-d+ options to produce more output. If that does not
+help, instruct +ocf-tester+ to trace the resource agent with
++-X+ (make sure to redirect output to a file, unless you are a
+really fast reader).
 
 === Testing with +ocft+
 
diff --git a/heartbeat/ocf-shellfuncs.in b/heartbeat/ocf-shellfuncs.in
index f3822b7..8c3d269 100644
--- a/heartbeat/ocf-shellfuncs.in
+++ b/heartbeat/ocf-shellfuncs.in
@@ -675,4 +675,86 @@ ocf_stop_processes() {
 	return 1
 }
 
+#
+# RA tracing may be turned on by setting OCF_TRACE_RA;
+# the trace output will be saved to OCF_TRACE_FILE, if set, or
+# by default to
+#   $HA_VARRUN/trace_ra/<type>/<id>.<action>.<timestamp>
+#   e.g. $HA_VARRUN/trace_ra/oracle/db.start.2012-11-27.08:37:08
+#
+# OCF_TRACE_FILE:
+# - FD (small integer [3-9]); in that case it is up to the caller
+#   to capture output; the FD _must_ be open for writing
+# - absolute path
+#
+# NB: FD 9 may be used for tracing with bash >= v4
+#
+ocf_is_bash4() {
+	echo "$SHELL" | grep bash > /dev/null &&
+			[ "${BASH_VERSINFO[0]}" -ge 4 ]
+}
+ocf_trace_redirect_to_file() {
+	local dest=$1
+	if ocf_is_bash4; then
+		exec 9>$dest
+		BASH_XTRACEFD=9
+	else
+		exec 2>$dest
+	fi
+}
+ocf_trace_redirect_to_fd() {
+	local fd=$1
+	if ocf_is_bash4; then
+		BASH_XTRACEFD=$fd
+	else
+		exec 2>&$fd
+	fi
+}
+__ocf_test_trc_dest() {
+	local dest=$1
+	if ! touch $dest; then
+		ocf_log warn "$dest not writable, trace not going to happen"
+		__OCF_TRC_DEST=""
+		__OCF_TRC_MANAGE=""
+		return 1
+	fi
+	return 0
+}
+ocf_default_trace_dest() {
+	tty >/dev/null && return
+	if [ -n "$OCF_RESOURCE_TYPE" -a \
+			-n "$OCF_RESOURCE_INSTANCE" -a -n "$__OCF_ACTION" ]; then
+		local ts=`date +%F.%T`
+		__OCF_TRC_DEST=$HA_VARRUN/trace_ra/${OCF_RESOURCE_TYPE}/${OCF_RESOURCE_INSTANCE}.${__OCF_ACTION}.$ts
+		__OCF_TRC_MANAGE="1"
+	fi
+}
+
+ocf_start_trace() {
+	export __OCF_TRC_DEST="" __OCF_TRC_MANAGE=""
+	case "$OCF_TRACE_FILE" in
+	[3-9]) ocf_trace_redirect_to_fd "$OCF_TRACE_FILE" ;;
+	/*/*) __OCF_TRC_DEST=$OCF_TRACE_FILE ;;
+	"") ocf_default_trace_dest ;;
+	*)
+		ocf_log warn "OCF_TRACE_FILE must be set to either FD (open for writing) or absolute file path"
+		ocf_default_trace_dest
+		;;
+	esac
+	if [ "$__OCF_TRC_DEST" ]; then
+		mkdir -p `dirname $__OCF_TRC_DEST`
+		__ocf_test_trc_dest $__OCF_TRC_DEST ||
+			return
+		ocf_trace_redirect_to_file "$__OCF_TRC_DEST"
+	fi
+	PS4='+ `date +"%T"`: ${FUNCNAME[0]:+${FUNCNAME[0]}:}${LINENO}: '
+	set -x
+}
+ocf_stop_trace() {
+	set +x
+}
+
 __ocf_set_defaults "$@"
+
+: ${OCF_TRACE_RA:=$OCF_RESKEY_trace_ra}
+ocf_is_true "$OCF_TRACE_RA" && ocf_start_trace
diff --git a/tools/ocf-tester.8 b/tools/ocf-tester.8
index ba07058..850ec0b 100644
--- a/tools/ocf-tester.8
+++ b/tools/ocf-tester.8
@@ -3,7 +3,7 @@
 ocf-tester \- Part of the Linux-HA project
 .SH SYNOPSIS
 .B ocf-tester
-[\fI-Lh\fR] \fI-n resource_name \fR[\fI-o name=value\fR]\fI* /full/path/to/resource/agent\fR
+[\fI-LhvqdX\fR] \fI-n resource_name \fR[\fI-o name=value\fR]\fI* /full/path/to/resource/agent\fR
 .SH DESCRIPTION
 Tool for testing if a cluster resource is OCF compliant
 .SH OPTIONS
@@ -43,6 +43,9 @@ Be quiet while testing
 \fB\-d\fR
 Turn on RA debugging
 .TP
+\fB\-X\fR
+Turn on RA tracing (expect large output)
+.TP
 \fB\-n\fR name
 Name of the resource
 .TP
diff --git a/tools/ocf-tester.in b/tools/ocf-tester.in
index 214e25c..d14e511 100755
--- a/tools/ocf-tester.in
+++ b/tools/ocf-tester.in
@@ -51,13 +51,14 @@ usage() {
 
     echo "Tool for testing if a cluster resource is OCF compliant"
     echo ""
-    echo "Usage: ocf-tester [-Lh] -n resource_name [-o name=value]* /full/path/to/resource/agent"
+    echo "Usage: ocf-tester [-LhvqdX] -n resource_name [-o name=value]* /full/path/to/resource/agent"
     echo ""
     echo "Options:"
     echo "  -h       		This text"
     echo "  -v       		Be verbose while testing"
     echo "  -q       		Be quiet while testing"
     echo "  -d       		Turn on RA debugging"
+    echo "  -X       		Turn on RA tracing (expect large output)"
     echo "  -n name		Name of the resource"	
     echo "  -o name=value		Name and value of any parameters required by the agent"
     echo "  -L			Use lrmadmin/lrmd for tests"
@@ -106,6 +107,7 @@ while test "$done" = "0"; do
 	-L) use_lrmd=1; shift;;
 	-v) verbose=1; shift;;
 	-d) export HA_debug=1; shift;;
+	-X) export OCF_TRACE_RA=1; verbose=1; shift;;
 	-q) quiet=1; shift;;
 	-?|--help) usage 0;;
 	--version) echo "@PACKAGE_VERSION@"; exit 0;;
-- 
1.8.0
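
To see what the default trace file names look like without a running
cluster, the naming scheme from ocf_default_trace_dest can be tried
in isolation. This is only a sketch: default_trace_path is a
hypothetical name, the base directory is passed in instead of using
HA_VARRUN, and the tty check from the patch is left out.

```shell
#!/bin/sh
# Sketch only: default_trace_path is hypothetical; it reproduces the
# path layout used by ocf_default_trace_dest in the patch above.
default_trace_path() {
	# $1: base dir, $2: resource type, $3: resource id, $4: action
	[ -n "$1" ] && [ -n "$2" ] && [ -n "$3" ] && [ -n "$4" ] || return 1
	echo "$1/trace_ra/$2/$3.$4.$(date +%F.%T)"
}

default_trace_path /var/run/heartbeat oracle db start
```

As in the patch, no path is produced if the type, id or action is
missing.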
