On Mon, 2009-05-25 at 18:49 -0400, Owen Taylor wrote:
> On Mon, 2009-05-25 at 22:08 +0200, Nicolai Hähnle wrote:
> > > [ Even with the change, there still is a noticeable performance win from
> > > dropping the size of the DMA buffer in mesa - binding the 1M buffers
> > > into AGP space takes a measurable amount of time. ]
> >
> > Forgive me if you've mentioned it before, but what is your testcase?
> >
> > If I recall the discussions correctly (and admittedly I've been somewhat
> > away for a while), the idea was to have big buffers so that games can just
> > keep scheduling drawing commands virtually indefinitely (maybe even up to
> > a full frame worth of draw calls). Obviously, if your testcase is in fact
> > not one of those eyecandy-crazy shooters, then that's a bad idea for you.
>
> I've been testing with mutter/gnome-shell (GL based compositor, with UI
> elements in the GL scene graph as well.) It certainly doesn't look that
> much like a game; the vertex/state-change ratio is probably lower than
> for most games since there are no complex 3D models - the most vertex
> intensive thing it's doing is text (quad per glyph).
>
> But on the other hand, it's using maybe 1/1000th of the vertex buffer
> before it fills its command buffer and releases the DMA buffer, so
> that's a pretty big gap to make up if games are that different.

Sounds like you've got your problems mostly figured out, but some notes
on Mesa VBO management:

What the 965 driver does is make our driver-internal VBOs moderately
sized (64k or so, unless you need bigger for the particular arrays that
are enabled), and make new ones when we fill one up while accumulating a
batchbuffer. This is pretty cheap, and increasing size of the VBO
allocation didn't help, but then we're caching buffers so that
allocation's cheap anyway.

Immediate-mode GL rendering is still overly slow and cache-hungry,
because Mesa assumes that after every draw_prims, it can't remap the VBO
again or it'll wait on the GPU. Because of this, we don't actually use
VBOs in the VBO module (there's a function to enable real VBOs that we
don't use), and instead use arrays that get copied to the
driver-internal VBOs. But it doesn't need to worry about the waiting if
the buffer hasn't been dispatched to the GPU, so we could just have a
callback from the driver to the VBO module for "I've flushed my batch,
and now is the time to allocate a new VBO to avoid waiting", and then
enable real VBO usage for immediate mode.

The alternative for immediate mode would be implementing the ranged
buffer mapping extension, which lets the VBO module do the right thing.
I'm concerned that we have no testcases for that extension other than
the VBO module, which is why I haven't just done that yet.

-- 
Eric Anholt
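
For illustration only, here is a rough sketch in C of the buffer handling described above: moderately sized buffers, a fresh one when the current one fills up, and a "batch flushed" callback that makes the next upload start a new buffer instead of remapping one the GPU may still be reading. None of this is actual Mesa or 965 driver code. Every name (sketch_vbo, sketch_ctx, sketch_alloc_vertices, sketch_batch_flushed) is hypothetical, plain malloc stands in for the driver's cached buffer allocator, and the 64k constant is just the "64k or so" figure from the message.

```c
#include <stddef.h>
#include <stdlib.h>

/* "64k or so" per the message; the exact value is an assumption. */
#define SKETCH_VBO_SIZE ((size_t)64 * 1024)

/* Hypothetical stand-in for a driver-internal vertex buffer. */
struct sketch_vbo {
    unsigned char *data;
    size_t size;
    size_t used;
};

struct sketch_ctx {
    struct sketch_vbo *current;  /* buffer being filled for the open batch */
    unsigned allocations;        /* counts fresh buffers, for illustration */
};

/* Drop the current buffer. A real driver would only drop its reference and
 * let the GPU finish with the buffer; this sketch frees it immediately. */
static void sketch_drop_current(struct sketch_ctx *ctx)
{
    if (ctx->current) {
        free(ctx->current->data);
        free(ctx->current);
        ctx->current = NULL;
    }
}

/* Reserve `bytes` of vertex space, starting a new buffer when the current
 * one is missing or too full. Returns NULL on allocation failure. */
static void *sketch_alloc_vertices(struct sketch_ctx *ctx, size_t bytes)
{
    struct sketch_vbo *vbo = ctx->current;

    if (!vbo || bytes > vbo->size - vbo->used) {
        sketch_drop_current(ctx);
        vbo = calloc(1, sizeof(*vbo));
        if (!vbo)
            return NULL;
        /* Grow past the default size only when the request needs it. */
        vbo->size = bytes > SKETCH_VBO_SIZE ? bytes : SKETCH_VBO_SIZE;
        vbo->data = malloc(vbo->size);
        if (!vbo->data) {
            free(vbo);
            return NULL;
        }
        ctx->current = vbo;
        ctx->allocations++;
    }

    void *p = vbo->data + vbo->used;
    vbo->used += bytes;
    return p;
}

/* Hypothetical driver -> VBO module callback: the batch has been flushed,
 * so stop writing into the old buffer and let the next upload allocate a
 * new one rather than wait on the GPU. */
static void sketch_batch_flushed(struct sketch_ctx *ctx)
{
    sketch_drop_current(ctx);
}
```

Between flushes, consecutive sketch_alloc_vertices() calls pack into the same buffer; after sketch_batch_flushed(), the next call starts a new one. With a buffer cache behind the allocator, that new allocation should stay cheap, which is the point made above about larger VBOs not helping.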
_______________________________________________ Mesa3d-dev mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
