On Mon, 2009-05-25 at 18:49 -0400, Owen Taylor wrote:
> On Mon, 2009-05-25 at 22:08 +0200, Nicolai Hähnle wrote:
> 
> > > [ Even with the change, there still is a noticeable performance win from
> > > dropping the size of the DMA buffer in mesa - binding the 1M buffers
> > > into AGP space takes a measurable amount of time. ]
> > 
> > Forgive me if you've mentioned it before, but what is your testcase?
> > 
> > If I recall the discussions correctly (and admittedly I've been somewhat
> > away for a while), the idea was to have big buffers so that games can just
> > keep scheduling drawing commands virtually indefinitely (maybe even up to
> > a full frame worth of draw calls). Obviously, if your testcase is in fact
> > not one of those eyecandy-crazy shooters, then that's a bad idea for you.
> 
> I've been testing with mutter/gnome-shell (GL based compositor, with UI
> elements in the GL scene graph as well.) It certainly doesn't look that
> much like a game; the vertex/state-change ratio is probably lower than
> for most games since there are no complex 3D models - the most vertex
> intensive thing it's doing is text (quad per glyph).
> 
> But on the other hand, it's using maybe 1/1000th of the vertex buffer
> before it fills its command buffer and releases the DMA buffer, so
> that's a pretty big gap to make up if games are that different.

Sounds like you've got your problems mostly figured out, but some notes
on Mesa VBO management:

What the 965 driver does is make our driver-internal VBOs moderately
sized (64k or so, unless you need bigger for the particular arrays that
are enabled), and make new ones when we fill one up while accumulating a
batchbuffer.  This is pretty cheap.  Increasing the size of the VBO
allocation didn't help, but then we cache buffers, so allocation is
cheap anyway.
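
To make that concrete, here is a rough sketch of the scheme in plain C.
It is not the 965 driver's code: every name in it (sketch_bo,
sketch_upload, upload_alloc, upload_batch_done) is made up for
illustration, malloc stands in for real buffer objects, and the cache is
a trivial fixed-size array.

```c
#include <stddef.h>
#include <stdlib.h>

#define VBO_DEFAULT_SIZE (64 * 1024) /* "64k or so" */
#define MAX_BOS 16                   /* arbitrary limit for the sketch */

/* Hypothetical stand-in for a driver buffer object. */
struct sketch_bo { size_t size; unsigned char *data; };

struct sketch_upload {
    struct sketch_bo *cached[MAX_BOS];   /* idle buffers, ready for reuse */
    int n_cached;
    struct sketch_bo *in_batch[MAX_BOS]; /* buffers used by the open batch */
    int n_in_batch;
    size_t offset;                       /* fill level of the newest buffer */
    int fresh_allocs;                    /* times we really had to allocate */
};

/* Take a big-enough buffer from the cache, or allocate a new one. */
static struct sketch_bo *bo_get(struct sketch_upload *u, size_t size)
{
    for (int i = 0; i < u->n_cached; i++) {
        if (u->cached[i]->size >= size) {
            struct sketch_bo *hit = u->cached[i];
            u->cached[i] = u->cached[--u->n_cached];
            return hit;
        }
    }
    struct sketch_bo *bo = malloc(sizeof(*bo));
    if (!bo)
        return NULL;
    bo->data = malloc(size);
    if (!bo->data) {
        free(bo);
        return NULL;
    }
    bo->size = size;
    u->fresh_allocs++;
    return bo;
}

/* Reserve `bytes` of vertex space, starting a new buffer when the
 * current one is full. Returns NULL on failure. */
static void *upload_alloc(struct sketch_upload *u, size_t bytes)
{
    struct sketch_bo *cur =
        u->n_in_batch ? u->in_batch[u->n_in_batch - 1] : NULL;

    if (!cur || u->offset + bytes > cur->size) {
        if (u->n_in_batch == MAX_BOS)
            return NULL; /* a real driver would flush the batch here */
        /* Moderate default size, bigger only if the arrays need it. */
        size_t size = bytes > VBO_DEFAULT_SIZE ? bytes : VBO_DEFAULT_SIZE;
        cur = bo_get(u, size);
        if (!cur)
            return NULL;
        u->in_batch[u->n_in_batch++] = cur;
        u->offset = 0;
    }
    void *p = cur->data + u->offset;
    u->offset += bytes;
    return p;
}

/* The batch is finished: hand its buffers back to the cache. A real
 * driver would only reuse them once the GPU is done with them. */
static void upload_batch_done(struct sketch_upload *u)
{
    while (u->n_in_batch > 0) {
        struct sketch_bo *bo = u->in_batch[--u->n_in_batch];
        if (u->n_cached < MAX_BOS) {
            u->cached[u->n_cached++] = bo;
        } else {
            free(bo->data);
            free(bo);
        }
    }
    u->offset = 0;
}
```

With a zero-initialized struct sketch_upload, two 40k requests end up in
two separate 64k buffers, and after upload_batch_done() the next request
is served from the cache without a fresh allocation.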

Immediate-mode GL rendering is still overly slow and cache-hungry,
because Mesa assumes that after every draw_prims, it can't remap the VBO
again or it'll wait on the GPU.  Because of this, we don't actually use
VBOs in the VBO module (there's a function to enable real VBOs that we
don't use), and instead use arrays that get copied to the
driver-internal VBOs.  But the VBO module doesn't need to worry about
waiting if the buffer hasn't been dispatched to the GPU yet, so we could
just have a
callback from the driver to the VBO module for "I've flushed my batch,
and now is the time to allocate a new VBO to avoid waiting", and then
enable real VBO usage for immediate mode.
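
A toy model of that callback idea, in plain C.  None of these names
exist in Mesa; sketch_vbo_module, sketch_driver and the batch_flushed
hook are invented here, and "waiting on the GPU" is just a counter.  The
point is only the ordering: remapping is free while the batch is
unflushed, and the flush hook swaps in a new VBO before the next map.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types; not Mesa's actual interfaces. */
struct sketch_vbo { int id; bool gpu_busy; };

struct sketch_vbo_module {
    struct sketch_vbo current; /* VBO that immediate-mode data goes into */
    int next_id;
    int waits;                 /* how often a map would have stalled */
};

struct sketch_driver {
    struct sketch_vbo *in_batch;       /* VBO used by the unflushed batch */
    void (*batch_flushed)(void *user); /* optional hook into the VBO module */
    void *user;
};

/* Map the current VBO for writing; stalls only if the GPU owns it. */
static void vbo_map(struct sketch_vbo_module *m)
{
    if (m->current.gpu_busy) {
        m->waits++;                  /* a real driver would block here */
        m->current.gpu_busy = false;
    }
}

/* One immediate-mode draw: write vertices, then queue the draw. */
static void vbo_draw(struct sketch_vbo_module *m, struct sketch_driver *d)
{
    vbo_map(m);
    d->in_batch = &m->current; /* batch now references the VBO */
}

/* "I've flushed my batch": switch to a new VBO so that the next map
 * does not have to wait for the one the GPU is reading. */
static void vbo_batch_flushed(void *user)
{
    struct sketch_vbo_module *m = user;
    m->current.id = ++m->next_id;
    m->current.gpu_busy = false;
}

/* Dispatch the batch to the GPU and notify whoever registered a hook. */
static void driver_flush(struct sketch_driver *d)
{
    if (d->in_batch) {
        d->in_batch->gpu_busy = true;
        d->in_batch = NULL;
    }
    if (d->batch_flushed)
        d->batch_flushed(d->user);
}
```

Without the hook registered, the first vbo_draw() after driver_flush()
counts a wait; with batch_flushed pointing at vbo_batch_flushed, it
doesn't.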

The alternative for immediate mode would be implementing the ranged
buffer mapping extension, which lets the VBO module do the right thing.
I'm concerned that we have no testcases for that extension other than
the VBO module, which is why I haven't just done that yet.
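
For reference, a sketch of how a streaming client might pick its map
ranges, assuming the extension meant here is GL_ARB_map_buffer_range.
The access bit values are the ones that extension defines, redeclared
under SKETCH_ names so this compiles without GL headers; the structs and
stream_next_map() are hypothetical.  The result would be passed to
glMapBufferRange(target, offset, length, access).

```c
#include <stddef.h>

/* Access bits with the values GL_ARB_map_buffer_range defines,
 * renamed so the sketch does not depend on GL headers. */
enum {
    SKETCH_MAP_WRITE_BIT             = 0x0002,
    SKETCH_MAP_INVALIDATE_RANGE_BIT  = 0x0004,
    SKETCH_MAP_INVALIDATE_BUFFER_BIT = 0x0008,
    SKETCH_MAP_UNSYNCHRONIZED_BIT    = 0x0020
};

/* Hypothetical append-only stream over one VBO. */
struct sketch_stream { size_t size; size_t head; };

struct sketch_map_request { size_t offset; size_t length; unsigned access; };

/* Decide where and how to map the next `bytes` of vertex data.
 * Assumes bytes <= s->size. */
static struct sketch_map_request
stream_next_map(struct sketch_stream *s, size_t bytes)
{
    struct sketch_map_request r;

    r.length = bytes;
    if (s->head + bytes <= s->size) {
        /* Untouched tail of the buffer: nothing queued reads it, so
         * the map can skip synchronization with the GPU. */
        r.offset = s->head;
        r.access = SKETCH_MAP_WRITE_BIT |
                   SKETCH_MAP_INVALIDATE_RANGE_BIT |
                   SKETCH_MAP_UNSYNCHRONIZED_BIT;
    } else {
        /* Buffer is full: start over and let the implementation
         * replace the storage instead of waiting for old draws. */
        r.offset = 0;
        r.access = SKETCH_MAP_WRITE_BIT |
                   SKETCH_MAP_INVALIDATE_BUFFER_BIT;
    }
    s->head = r.offset + bytes;
    return r;
}
```

Appending with the unsynchronized bit only makes sense for ranges that
no already-queued draw reads, which is why the wrap-around case drops it
and invalidates the whole buffer instead.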

-- 
Eric Anholt
[email protected]                         [email protected]


_______________________________________________
Mesa3d-dev mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
