Re: [dtrace-discuss] More DTrace Questions - Release builds & Performance

James McIlree Fri, 07 May 2010 13:09:56 -0700

On May 7, 2010, at 11:41 AM, Eric Gorr wrote:

> Again, I am using Mac OS X 10.6.3.
> 
> 
> 
> First, I am curious, how is DTrace getting the function name information for 
> a release build when all of it's symbols have been stripped from it and there 
> is not dSym file either in the application package or adjacent to it? For 
> example, I create the default Cocoa Application, and wrote:
> 
> - (void) myRandomMethod
> {
>       int x;
>               
>       for ( x = 0; x < 9000000; x++ )
>       {
>       }
> }
> 
> - (void) applicationDidFinishLaunching:(NSNotification *)aNotification 
> {
>       [self myRandomMethod];
> }
> 
> 
> sudo dtrace -l -n 'pid1221:MyApp::entry' produced the following results for 
> the release build:
> 
>    ID   PROVIDER            MODULE                          FUNCTION NAME
> 26601    pid1221             MyApp                             start entry
> 26602    pid1221             MyApp                              main entry
> 26603    pid1221             MyApp -[MyAppAppDelegate myRandomMethod] entry
> 26604    pid1221             MyApp        -[MyAppAppDelegate window] entry
> 26605    pid1221             MyApp    -[MyAppAppDelegate setWindow:] entry
> 26606    pid1221             MyApp -[MyAppAppDelegate 
> applicationDidFinishLaunching:] entry
> 26607    pid1221             MyApp                      stub helpers entry
> 
> 
> Now, there is a dSym file for this application, but for DTrace to find it, it 
> would have needed to search the drive for it's location...


        The OS X dtrace uses spotlight to find dSYM(s), if you want to disable 
this, move the dSYM to a directory not indexed by spotlight, or turn off 
spotlight.


> 
> ---
> ---
> ---
> 
> I am looking into the performance hit of doing probe enabled checks.
> 
> To test this, I created the default Cocoa application and then wrote:
> 
> - (unsigned long) myMethod:(unsigned long)inX
> {
>       return inX + 1;
> }
> 
> - (void)applicationDidFinishLaunching:(NSNotification *)aNotification 
> {
>       if ( DTRACETEST_TEST_START_ENABLED() )
>               DTRACETEST_TEST_START();
>               
>       unsigned long x;
>       unsigned long y = 0;
>       
>       for ( x = 0; x < 1000000000; x++ )
>       {
>               y = [self myMethod:x];
> 
>               if ( DTRACETEST_TEST_MINE_ENABLED() )
>               {
>                       DTRACETEST_TEST_MINE();
>               }
>               
>       }
>       
>       if ( DTRACETEST_TEST_END_ENABLED() )
>               DTRACETEST_TEST_END();
>       
>       printf( "%lu", y );
>       
>       // Insert code here to initialize your application 
> }
> 
> In the Debug build,
> 
> If I remove DTRACETEST_TEST_MINE_ENABLED block, the time between the start 
> and end probes is ~8.8 seconds.
> Adding DTRACETEST_TEST_MINE_ENABLED back in, but keeping the probe disabled 
> (i.e. DTRACETEST_TEST_MINE_ENABLED returns false), took ~8.9 seconds.
> 
> In the Release build,
> 
> If I remove DTRACETEST_TEST_MINE_ENABLED block, the time between the start 
> and end probes is ~5.7 seconds.
> Adding DTRACETEST_TEST_MINE_ENABLED back in, but keeping the probe disabled, 
> took ~6.4 seconds.
> 
> The overhead has increased significantly in the release build.
> 
> This doesn't seem like it is as close to zero cost as I initially thought it 
> should be, leading me to think that i may not be measuring what I think I am 
> measuring. 
> 
> Based on these results, it would seem that one would want to keep probes out 
> of the tightest loops in an application.
> 
> But, considering how pervasive such things are even in Mac OS X, my 
> conclusion does not seem to be correct.

        The debug build has a less optimized loop. The cpu is able to "hide" 
more of the dtrace probe cost by overlapping its execution time with other 
instructions in the loop.

        The release build is tighter code, you're seeing more of the actual 
probe cost.

        It looks to me like you are doing 1 billion loops, at a cost of .7 
seconds of execution time.

        That puts the cost of an ENABLED() guarded probe at .0000000007 
seconds, or .7 nanoseconds.

        Call that ~1.5 clock cycles on most laptop cpu's, maybe ~2.x clock 
cycles if you are on a desktop.

        That's pretty low cost. If you're writing seriously hot code, where a 
clock or two really matters, I'd leave the dtrace probe out.

        James M



_______________________________________________
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org

Re: [dtrace-discuss] More DTrace Questions - Release builds & Performance

Reply via email to