On Mon, May 19, 2008 at 9:03 PM, Keith Packard <[EMAIL PROTECTED]> wrote:
> On Mon, 2008-05-19 at 20:11 +0100, Keith Whitwell wrote:
>
>> I'm still confused by your test setup...  Stepping back from cache
>> metaphysics, why doesn't classic pin the hardware, if it's still got
>> 60% cpu to burn?
>
> glxgears under classic is definitely not pinning the hardware -- the
> 'intel_idle' tool shows that it's only using about 70% of the GPU. GEM
> is pinning the hardware. Usually this means there's some synchronization
> between the CPU and GPU causing each to wait part of the time while the
> other executes.

Yes, understood -- the question though is why...  Classic has always
been more than able to pin the hardware in gears, and there's enough
buffering in the system to avoid letting the GPU go idle.  It's not
exactly rocket science to dump enough frames of gears into the queue
to keep the GPU busy, as long as the CPU isn't itself pinned.

There aren't a huge number of synchronization points in the driver
that gears could hit -- the two obvious ones are allocation of space
for batch buffers, and the throttle which prevents the app from
getting more than a couple of frames ahead of the hardware...  The
latter is where you would expect gears to spend most of its wall time
-- snoozing somewhere inside swapbuffers, comfortably a frame or so
ahead of the hardware.
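
For reference, a minimal sketch of the kind of throttle I mean -- the
names here (emit_fence, wait_on_fence, throttle_swap) are invented for
illustration and stand in for whatever fence mechanism the memory
manager actually provides:

    #include <stdint.h>

    /* Illustrative only: let the app queue up to MAX_FRAMES_AHEAD
     * frames, then block on the oldest outstanding fence. */
    #define MAX_FRAMES_AHEAD 2

    extern uint32_t emit_fence(void);
    extern void wait_on_fence(uint32_t fence);

    struct frame_ring {
        uint32_t fence[MAX_FRAMES_AHEAD];
        int head, count;
    };

    static void throttle_swap(struct frame_ring *ring)
    {
        if (ring->count == MAX_FRAMES_AHEAD) {
            /* A healthy gears spends most of its wall time in
             * this wait, a frame or so ahead of the hardware. */
            wait_on_fence(ring->fence[ring->head]);
            ring->head = (ring->head + 1) % MAX_FRAMES_AHEAD;
            ring->count--;
        }
        /* Fence the frame that was just submitted. */
        ring->fence[(ring->head + ring->count) % MAX_FRAMES_AHEAD] =
            emit_fence();
        ring->count++;
    }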

If it's for some reason starved of batchbuffer space, you might see it
spending time elsewhere -- stalled inside command submission or vertex
emit...  or there may be some other, unexpected case...

So possibilities are:
  - batchbuffer starvation -- has the amount of batchbuffer space
available changed?  (see the sketch after this list)
  - over-throttling in swapbuffers -- I think we used to let it get
two frames ahead -- has this changed?
  - something else...
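
To make the starvation case concrete, here is a hedged sketch of a
blocking batchbuffer allocation -- again the pool layout and names are
invented for illustration, not the driver's actual code:

    #include <stdint.h>

    extern void wait_on_fence(uint32_t fence);  /* as above */

    /* A fixed ring of batch buffers, each marked with the fence of
     * its last GPU use (0 = free).  If the next buffer is still
     * busy, the allocation itself stalls -- and gears blocks inside
     * command submission or vertex emit instead of in swapbuffers. */
    #define POOL_SIZE 4

    struct batch_pool {
        uint32_t busy_fence[POOL_SIZE];
        int next;
    };

    static int alloc_batch(struct batch_pool *pool)
    {
        int slot = pool->next;
        if (pool->busy_fence[slot] != 0) {
            /* Starvation: the CPU waits for the GPU to retire a
             * buffer before it can even build the next batch. */
            wait_on_fence(pool->busy_fence[slot]);
            pool->busy_fence[slot] = 0;
        }
        pool->next = (slot + 1) % POOL_SIZE;
        return slot;
    }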

An easy way to investigate is just to run gears under gdb and hit
ctrl-c periodically to see where it ends up -- i.e., look for a
pattern in the stack traces...
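
In shell terms, something like this crude sampler does the job
(assuming the binary is glxgears and a gdb with --batch/-ex support):

    while true; do gdb --batch -ex bt -p $(pidof glxgears); sleep 1; done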

Most profiling tools out there try to figure out where the CPU cycles
go, but at this point we're trying to figure out where the wall time
goes.



> I haven't really looked at the non-gem case though; the
> numbers seem similar enough to what I've seen in the past.

I think it's important, if there are going to be performance
comparisons between the various versions of the memory manager, that
all the versions are actually working at their best.  The baseline in
this case is classic, and for some reason it seems to be performing
less well on your box than elsewhere.

A consequence is that it risks making all the new memory managers
look better than they should, because the baseline is artificially
poor...

As much as I like to promote the new tech, I don't think crippling the
old to make it look good is a great strategy, so let's figure out why
gears has regressed on classic and then re-assess how that changes the
landscape for ttm & gem.

It's worth noting that even 'classic' has changed fairly significantly
over the last couple of years with the backport of the bufmgr_fake
functionality from i965, so there were plenty of opportunities for
regressions.  It might be worth trying a Mesa-7.0-ish version of the
driver as well.

Keith
