On Fri, Jan 22, 2010 at 11:07 AM, Collin Winter <collinwin...@google.com> wrote:
> Hey Jake,
>
> On Thu, Jan 21, 2010 at 10:48 AM, Jake McGuire <mcgu...@google.com> wrote:
>> On Thu, Jan 21, 2010 at 10:19 AM, Reid Kleckner <r...@mit.edu> wrote:
>>> On Thu, Jan 21, 2010 at 12:27 PM, Jake McGuire <mcgu...@google.com> wrote:
>>>> On Wed, Jan 20, 2010 at 2:27 PM, Collin Winter <collinwin...@google.com> wrote:
>>>>> Profiling
>>>>> ---------
>>>>>
>>>>> Unladen Swallow integrates with oProfile 0.9.4 and newer [#oprofile]_ to support
>>>>> assembly-level profiling on Linux systems. This means that oProfile will
>>>>> correctly symbolize JIT-compiled functions in its reports.
>>>>
>>>> Do the current python profiling tools (profile/cProfile/pstats) still
>>>> work with Unladen Swallow?
>>>
>>> Sort of.  They disable the use of JITed code, so they don't quite work
>>> the way you would want them to.  Checking tstate->c_tracefunc every
>>> line generated too much code.  They still give you a rough idea of
>>> where your application hotspots are, though, which I think is
>>> acceptable.
>>
>> Hmm.  So cProfile doesn't break, but it causes code to run under a
>> completely different execution model so the numbers it produces are
>> not connected to reality?
>>
>> We've found the call graph and associated execution time information
>> from cProfile to be extremely useful for understanding performance
>> issues and tracking down regressions.  Giving that up would be a huge
>> blow.
>
> FWIW, cProfile's call graph information is still perfectly accurate,
> but you're right: turning on cProfile does trigger execution under a
> different codepath. That's regrettable, but instrumentation-based
> profiling is always going to introduce skew into your numbers. That's
> why we opted to improve oProfile, since we believe sampling-based
> profiling to be a better model.

Sampling-based may be theoretically better, but we've gotten a lot of
mileage out of profile, hotshot and especially cProfile.  I know that
other people at Google have also used cProfile (backported to 2.4)
with great success.  The couple of times I tried to use oProfile it
was less illuminating than I'd hoped, but that could just be
inexperience.

> Profiling was problematic to support in machine code because in
> Python, you can turn profiling on from user code at arbitrary points.
> To correctly support that, we would need to add lots of hooks to the
> generated code to check whether profiling is enabled, and if so, call
> out to the profiler. Those "is profiling enabled now?" checks are
> (almost) always going to be false, which means we spend cycles for no
> real benefit.
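
To make the quoted point concrete, here is a minimal sketch of user code switching a profile hook on and off mid-run with the standard `sys.setprofile`. The names `hook`, `work` and `run` are made up for illustration. It shows only the interpreter-level behaviour that JIT-compiled code would have to honour, not anything about how Unladen Swallow implements it.

```python
import sys


def work():
    return 1


def run():
    """Return the names of Python functions seen while the hook was installed."""
    seen = []

    def hook(frame, event, arg):
        # Only record Python-level calls; ignore c_call, return, etc.
        if event == "call":
            seen.append(frame.f_code.co_name)

    work()                    # hook not installed yet: not recorded
    sys.setprofile(hook)      # profiling switched on at an arbitrary point
    try:
        work()                # recorded
    finally:
        sys.setprofile(None)  # and switched off again
    work()                    # not recorded
    return seen
```

Only the middle call to `work()` reaches the hook, which is why compiled code would need some way to notice the hook appearing between two calls.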

Well, we put the ability to profile on demand to good use - in
particular by restricting profiling to one particular servlet (or a
subset of servlets) and by skipping the first few executions of that
servlet in a process to avoid startup noise.  All of this gets kicked
off by talking to the management process of our app server via http.
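
As a rough illustration of that pattern (not our actual app server code), something like the following wraps a single handler, ignores its first few executions, and only profiles once a flag has been flipped. The class name `OnDemandProfiler`, the `enabled` flag and the `warmup_calls` parameter are all hypothetical; the HTTP management side is left out entirely.

```python
import cProfile
import functools
import io
import pstats


class OnDemandProfiler:
    """Profile one handler, skipping its first `warmup_calls` executions."""

    def __init__(self, warmup_calls=3):
        self.warmup_calls = warmup_calls
        self.enabled = False      # would be flipped by a management request
        self.calls_seen = 0       # executions of the handler in this process
        self.profiled_calls = 0
        self.profiler = cProfile.Profile()

    def wrap(self, handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            self.calls_seen += 1
            if not self.enabled or self.calls_seen <= self.warmup_calls:
                return handler(*args, **kwargs)
            self.profiled_calls += 1
            # runcall() enables the profiler for this one call only.
            return self.profiler.runcall(handler, *args, **kwargs)
        return wrapper

    def report(self, limit=10):
        """Return a cumulative-time report, or '' if nothing was profiled."""
        if self.profiled_calls == 0:
            return ""
        out = io.StringIO()
        stats = pstats.Stats(self.profiler, stream=out)
        stats.sort_stats("cumulative").print_stats(limit)
        return out.getvalue()
```

A handler would be wrapped once with `profiler.wrap(handler)`, and setting `profiler.enabled = True` later starts collection without restarting the process.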

> Can YouTube use oProfile for profiling, or is instrumented profiling
> critical?

[snip]

I don't know that instrumented profiling is critical, but the level of
insight we have now is very important for keeping our site happy.
It seems like it'd be a fair bit of work to get oProfile to give us
the same level of insight, and it's not clear who would be motivated
to do that work.

> - Add the necessary profiling hooks to JITted code to better support
> cProfile, but add a command-line flag (something explicit like -O3)
> that removes the hooks and activates the current behaviour (or
> something even more restrictive, possibly).

This would be workable albeit suboptimal; as I said we start and stop
profiling on the fly, and while we currently fork a new process to do
this, that's only because we don't have a good arbitrary RPC mechanism
from parent to child.  Having to start up a new python process from
scratch would be a big step back.

> - Initially compile Python code without the hooks, but have a
> trip-wire set to detect the installation of profiling hooks. When
> profiling hooks are installed, purge all machine code from the system
> and recompile all hot functions to include the profiling hooks.

This would be the closest to the way we are doing things now.
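
For what it's worth, the trip-wire idea can be modelled in a few lines of plain Python. This is a toy, not Unladen Swallow's design: `ToyJit` and all of its methods are invented for illustration, and "compiling" here just means picking between the bare function and a wrapper that calls the hook.

```python
class ToyJit:
    """Toy model of 'purge and recompile when a profile hook is installed'."""

    def __init__(self):
        self.hot_functions = {}   # name -> original Python callable
        self.code_cache = {}      # name -> (variant, callable)
        self.profile_hook = None

    def _compile(self, name):
        func = self.hot_functions[name]
        hook = self.profile_hook
        if hook is None:
            # Fast variant: no profiling checks at all.
            self.code_cache[name] = ("no-hooks", func)
            return

        def hooked(*args, **kwargs):
            hook("call", name)
            try:
                return func(*args, **kwargs)
            finally:
                hook("return", name)

        self.code_cache[name] = ("hooks", hooked)

    def register_hot(self, name, func):
        self.hot_functions[name] = func
        self._compile(name)

    def set_profile_hook(self, hook):
        # The trip-wire: changing the hook purges every compiled variant
        # and recompiles all hot functions under the new setting.
        self.profile_hook = hook
        self.code_cache.clear()
        for name in self.hot_functions:
            self._compile(name)

    def variant(self, name):
        return self.code_cache[name][0]

    def call(self, name, *args, **kwargs):
        return self.code_cache[name][1](*args, **kwargs)
```

The cost is paid once, when the hook is installed or removed, instead of on every call, which is what makes this option attractive for start-and-stop profiling.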

If Unladen Swallow is sufficiently faster, we would probably make
oProfile work.  But if it's a marginal improvement, we'd be more
inclined to try for more incremental improvements (e.g. your excellent
cPickle work).

-jake
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
