Re: [dtrace-discuss] CPU dispatcher and buffer questions

Scott Shurr Mon, 11 Jul 2011 08:22:09 -0700

My customer still thinks this is a bug:

**********
We think that DTrace is not working as designed, so we want to report a bug for DTrace. Thank you for the answers from the mailing list, but these do not solve our problem.

The DTrace script that we describe in the document results in "dynamic variable drops", if the system is heavily loaded for a longer period of time (around 24 hours). We assume that this is not the intended behavior of DTrace.

We use a DTrace script with thread local variables (self->) and still get "dynamic variable drops". DTrace has to be used for a longer period of time in our systems, where the utilization of the system can be high. In case of "dynamic variable drops" incorrect tracing is performed. As described in the previous document we think our script is working correctly and therefore we think that DTrace is working incorrectly. We would like a solution for this.

The previous document contains some assumptions that we had to make about DTrace, but which we could not verify using the DTrace documentation. In our script we use thread local variables (self->). We assume that these variables result in at most one variable per hardware thread (which is a part of a CPU). Because the thread local variable at a CPU can be reused in consecutive calls of the probes and we use no other variables, we assume that our script is not filling the dynamic variable space. But DTrace appears to be doing something else that causes the dynamic variable space to fill up (for a loaded system after 24 hours), because we still get "dynamic variable drops". This makes the DTrace solution unreliable. We would like to see a solution for this problem.

I attached the version of the DTrace script that we use. The script depends on a separate process for its configuration. After running this script for approximately 24 hours on a heavily loaded machine, we get dynamic variable drops.
**********

It is my belief that this is not a bug, but I need something more to give the customer to convince him of this. I've attached his script, CSET.d.
Thanks

Scott Shurr| Solaris and Network Domain, Global Systems Support
Email: scott.sh...@oracle.com
Phone: 781-442-1352
Oracle Global Customer Services

Log, update, and monitor your Service Request online using My Oracle Support

On 07/01/11 10:58, Jim Mauro wrote:

I'm not sure I understand what is being asked here, but I'll take a shot...

Note that it is virtually impossible to write a piece of software that is
guaranteed to have sufficient space to buffer a given amount of data when the
rate and size of the data flow are unknown. This is one of the robustness
features of dtrace: it's smart enough to know that, and smart enough to let
the user know when data cannot be buffered.

Yes, buffers are allocated per-CPU. There are several buffer types, depending
on the dtrace invocation. Minimally, principal buffers are allocated per CPU
when a dtrace consumer (dtrace(1M)) is executed. Read:

http://wikis.sun.com/display/DTrace/Buffers+and+Buffering

The "self->read" describes a thread local variable, one of several variable
types available in DTrace. It defines the variable scope: each kernel thread
that's on the CPU when the probe(s) fires will have its own copy of a
"self->" variable.
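
As a minimal sketch of that scoping (the probes and the variable name
self->ts are illustrative assumptions, not taken from the customer's script),
the following D program keeps one self->ts per thread that is inside read(2)
and releases it by assigning zero:

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Each thread that enters read(2) gets its own copy of self->ts. */
syscall::read:entry
{
	self->ts = timestamp;
}

/* Only threads that recorded an entry time match this clause. */
syscall::read:return
/self->ts/
{
	/* Aggregate the elapsed time per process name. */
	@elapsed[execname] = quantize(timestamp - self->ts);

	/* Assigning zero releases this thread's dynamic variable storage. */
	self->ts = 0;
}
```

Because the storage is keyed by thread rather than by CPU, the number of live
"self->" variables follows the number of threads that have set one and not
yet zeroed it.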

There is only one kernel dispatcher, not one per CPU. There are per-CPU run
queues managed by the dispatcher.

As for running a DTrace script for hours/days/weeks, I have never been down
that road. It is theoretically possible of course, and seems to be a good use
of speculative buffers or a ring buffer policy.

We cannot guarantee it will execute without errors ("dynamic variable drops",
etc). We can guarantee you'll know when errors occur.

How can such guarantees be made with a dynamic tool like dtrace?

Does your customer know up-front how much data will be traced/processed/
consumed, and at what rate?

Read this:

http://blogs.oracle.com/bmc/resource/dtrace_tips.pdf
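
For a long-running script that still reports "dynamic variable drops" after
every "self->" variable is zeroed once it is no longer needed, two documented
DTrace options are worth trying. This is a sketch only: the values shown are
assumptions for illustration and would need to be sized against the actual
load.

```d
/* Enlarge the dynamic variable space (illustrative value, an assumption). */
#pragma D option dynvarsize=16m

/* Reclaim zeroed dynamic variables more often (illustrative value). */
#pragma D option cleanrate=1000hz
```

The same options can be given on the command line with dtrace -x, for example
-x dynvarsize=16m -x cleanrate=1000hz.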

Thanks

/jim

On Jul 1, 2011, at 9:30 AM, Scott Shurr wrote:

Hello,
I have a customer who has some dtrace questions. I am guessing that someone knows the answer to these, so I am asking here. Here are the questions:

**********
In this document, we will describe how we assume that DTrace uses its memory. Most assumptions result from [1]. We want these assumptions to be validated by a DTrace expert from Oracle. This validation is necessary to provide us confidence that DTrace can execute for a long period of time (in the order of weeks) along with our software, without introducing errors due to e.g. “dynamic variable drops”. In addition, we described a problem we experience with our DTrace script, for which we want to have support from you.

[1] Sun Microsystems inc, “Solaris Dynamic Tracing Guide”, September 2008.
Quotes from Solaris Dynamic Tracing Guide [1], with interpretation:
•    “Each time the variable self->read is referenced in your D program, the data object referenced is the one associated with the operating system thread that was executing when the corresponding DTrace probe fired.”
o    Interpretation: Per CPU there is a dispatcher that has its own thread when it executes the sched:::on-cpu and sched:::off-cpu probes.
•    “At that time, the ring buffer is consumed and processed. dtrace processes each ring buffer in CPU order. Within a CPU's buffer, trace records will be displayed in order from oldest to youngest.”
o    Interpretation: There is a principal buffer per CPU.

1) Impact on Business
We have a number of assumptions that we would like to verify about DTrace.

2) What is the OS version and the kernel patch level of the system?
SunOS nlvdhe321 5.10 Generic_141444-09 sun4v sparc SUNW,T5240

3) What is the Firmware level of the system?
SP firmware 3.0.10.2.b
SP firmware build number: 56134
SP firmware date: Tue May 25 13:02:56 PDT 2010
SP filesystem version: 0.1.22
**********

Thanks


_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org

#!/usr/sbin/dtrace -s 

/*----------------------------------------------------------------------------|
|                                                                             |
|                           DTrace script                                     |
|                                                                             |
|-----------------------------------------------------------------------------|
|
| Ident        : CSET.d
| Description  : This script traces thread executions
|
| History
| yyyy-mm-dd   : <swchg> <author> [<platform> <release>]
| 2011-03-16    : Initial creation tbijlsma
| 2011-03-16    : <SWCHG00371518> <tbijlsma>
|
|-----------------------------------------------------------------------------|
|                                                                             |
|        Copyright (c) 2011, ASML Holding N.V. (including affiliates).        |
|                           All rights reserved                               |
|                                                                             |
|----------------------------------------------------------------------------*/


/* Do not provide additional output*/
#pragma D option quiet

/* Store the printf values in a circular buffer*/ 
#pragma D option bufpolicy=ring

/* Necessary to signal CSET.c */
#pragma D option destructive

uint64_t Stimercorrection; /* Offset set by the pid$1::configDtrace:entry probe */
int *setOne;              /* Pointer to an int with value one */

/* Initially sets Stimercorrection to zero
 */
BEGIN
{
   Stimercorrection = 0;
}

/* This probe takes the moment that CSET calls the function "configDtrace" as
 * time 0 and signals the caller by writing 1 to the int that arg0 points to
 */
pid$1::configDtrace:entry
{
   setOne = alloca(sizeof(int));
   *setOne = 1;
   Stimercorrection = timestamp;
   copyout(setOne,arg0,sizeof(int));
}

/* When a SW thread starts executing, we store the start time */
sched:::on-cpu
/execname!="sched" && Stimercorrection/
{
   self->startTime = timestamp - Stimercorrection;
}

/* When a SW thread stops executing at a processor, print information for the 
 * event in the ring buffer
 */
sched:::off-cpu
/self->startTime && execname!="sched" && Stimercorrection/
{
   printf("%d|%d|%i|%i|%i|%s\n",
      self->startTime, /* start time*/
      timestamp - Stimercorrection, 
      cpu,       /*number of CPU*/
      tid,      /*current thread*/
      pid,      /*process id*/
      execname  /*process name*/
   );
   self->startTime = 0;
}

/* If the main task calls the function retrieveTrace, we stop the DTrace script
 * and the values in the ring buffer will be printed 
 */
pid$1::retrieveTrace:entry
{
   exit(0);
}

/* Stop the DTrace script if SIGKILL (9) or SIGTERM (15) is sent to CSET
 * arg0 contains PID of killed process and arg1 the signal
 */
syscall::kill:entry
/arg0==$1 && ( arg1==9 || arg1==15 )/
{
   exit(0);
}


/* Finally we print "EOT" to mark the end of a trace 
 */
END
{
   printf("EOT\n");
}
