While trying to use the USDT probes made available in the PostgreSQL database, I am having problems enabling them.

On my database server I ran 1024+ PG processes (1024 clients). The server is an 8-core T2000 @ 1000 MHz with 32 GB of RAM and 7 storage arrays, running Solaris 10 Update 4. Home-compiled PG 8.3 beta 1 (binary optimized for the T1 chip) with DTrace probes enabled.

When running without the probes enabled I have approximately 25% idle time. I expect there is some wait queue forming (and time being spent) on some locks. Fortunately there are USDT probes that allow closer inspection of this.

Overenthusiastically I used the following script:

postgresql*:::lwlock-acquire
{ three-line action }

and the same for the corresponding lwlock-release, lwlock-startwait and
lwlock-endwait probes.
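Spelled out, the startwait/endwait pair would have looked roughly like this (a reconstructed sketch; the aggregation names here are illustrative, the real actions are shown further down):

```d
postgresql*:::lwlock-startwait
{
        /* remember when this thread started waiting on lock arg0 */
        self->ts[arg0] = timestamp;
}

postgresql*:::lwlock-endwait
/self->ts[arg0]/
{
        /* count waits and sum wait time per lock id */
        @cnt[arg0] = count();
        @tim[arg0] = sum(timestamp - self->ts[arg0]);
        self->ts[arg0] = 0;
}
```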
Result: I had to reboot my T2000 from the system console, powering it off and
on again, after waiting more than 12 hours to see if it would come back.
(I used -x dynvarsize=8m.)

Sure, too many probes were enabled. However, this result is, let's say,
"surprising", and indeed very dangerous, since the database can be used in
production environments. In my view this is not good.

OK, so now I try to enable just these 4 probes, each probe in three clauses
(12 clauses per process), for 25 processes.

After using -x cleanrate=201 -x dynvarsize=64m (determined through trial and
error) and having set dtrace_dof_maxsize=1m with mdb, I see the following:

After 15 minutes my test finishes and dtrace is still trying to find the
probes of interest. There is lots of idle time throughout this period,
although it is far from constant.
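(For reference, the dtrace_dof_maxsize bump was an mdb write along these lines; 0t1048576 is mdb's decimal notation for 1 MB, and on a 64-bit kernel the variable may need /Z rather than /W:)

```shell
# echo 'dtrace_dof_maxsize/W 0t1048576' | mdb -kw
```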

DTrace now stops with the following message:

dtrace -x dynvarsize=64m -x cleanrate=203 -qs lw.d    >LW/lw.out
dtrace: failed to compile script lw.d: line 374: failed to grab process 14510

BTW, I specified my clauses like this:

postgresql15250:postgres:LWLockAcquire:lwlock-endwait,
postgresql14637:postgres:LWLockAcquire:lwlock-endwait,
postgresql15470:postgres:LWLockAcquire:lwlock-endwait,
postgresql15395:postgres:LWLockAcquire:lwlock-endwait,
postgresql15329:postgres:LWLockAcquire:lwlock-endwait,
postgresql14795:postgres:LWLockAcquire:lwlock-endwait,
postgresql14527:postgres:LWLockAcquire:lwlock-endwait,
postgresql14869:postgres:LWLockAcquire:lwlock-endwait,
postgresql14807:postgres:LWLockAcquire:lwlock-endwait,
postgresql14910:postgres:LWLockAcquire:lwlock-endwait,
postgresql15195:postgres:LWLockAcquire:lwlock-endwait,
postgresql14574:postgres:LWLockAcquire:lwlock-endwait,
postgresql14917:postgres:LWLockAcquire:lwlock-endwait,
postgresql14662:postgres:LWLockAcquire:lwlock-endwait,
postgresql14831:postgres:LWLockAcquire:lwlock-endwait,
postgresql15027:postgres:LWLockAcquire:lwlock-endwait,
postgresql14999:postgres:LWLockAcquire:lwlock-endwait,
postgresql15009:postgres:LWLockAcquire:lwlock-endwait,
postgresql14699:postgres:LWLockAcquire:lwlock-endwait,
postgresql15232:postgres:LWLockAcquire:lwlock-endwait,
postgresql15153:postgres:LWLockAcquire:lwlock-endwait,
postgresql14776:postgres:LWLockAcquire:lwlock-endwait,
postgresql15332:postgres:LWLockAcquire:lwlock-endwait,
postgresql15050:postgres:LWLockAcquire:lwlock-endwait,
postgresql14510:postgres:LWLockAcquire:lwlock-endwait
/arg1 == 1 && self->tsw1[arg0]/
{
        @cntw1[arg0] = count();
        @timw1[arg0] = sum(timestamp - self->tsw1[arg0]);
        self->tsw1[arg0] = 0;
}

This is repeated for all twelve clauses, where the actions and tests differ.
However, the actions are all very similar to this one, basically using
differently named aggregations; the same holds for the predicates.
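As an aside, if enabling across all backends were safe, the 25 per-pid tuples could presumably be collapsed into a single glob per clause (an untested sketch, same names as above):

```d
postgresql*:postgres:LWLockAcquire:lwlock-endwait
/arg1 == 1 && self->tsw1[arg0]/
{
        @cntw1[arg0] = count();
        @timw1[arg0] = sum(timestamp - self->tsw1[arg0]);
        self->tsw1[arg0] = 0;
}
```

Though with 1024+ backends running, that glob widens enablement back to every process, which is presumably what got me into trouble in the first place.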


Question: how do I make this workable, meaning speedier in startup, without
reducing my attention to just one or two processes?
Question: why is my system so unresponsive (I waited more than 12 hours to
see if it came back), forcing me to power it off/on to get it alive and
kicking again?


And please note that we are planning to add many, many more probes to the PG
source. However, if the above is the result, I fear the worst ...

Thanks
--Paul


--
This message posted from opensolaris.org
_______________________________________________
dtrace-discuss mailing list
[email protected]
