2009/8/12 Jerome Glisse <gli...@freedesktop.org>:
> On Tue, 2009-08-11 at 15:41 +0200, Maciej Cencora wrote:
>> 2009/8/11 Keith Whitwell <kei...@vmware.com>:
>> > On Mon, 2009-08-10 at 12:40 -0700, Maciej Cencora wrote:
>> >> On Monday, 29 June 2009 at 12:24:54, Keith Whitwell wrote:
>> >> > On Sat, 2009-06-27 at 14:57 -0700, Maciej Cencora wrote:
>> >> > > Hi,
>> >> > >
>> >> > > while playing with r300 driver I've stumbled upon a problem with
>> >> > > splitting vertexes.
>> >> > >
>> >> > > Let's say we get rendering operation where number of indexes in index
>> >> > > buffer is 80000 and max_index is 20000. We are calling vbo_split_prims
>> >> > > because number of indexes exceeds hw limit.
>> >> > > In flush_vertex (vbo_split_inplace.c) function the split->ib is not
>> >> > > null, so the max_index (20000) won't be changed. In the end the
>> >> > > draw_prims functions will be called with inappropriate max_index
>> >> > > number.
>> >> > >
>> >> > > I'm seeing this behaviour with UT2004 demo on current r300 driver.
>> >> > >
>> >> > > I think the solution would be to always calculate min/max_index
>> >> > > numbers just like in the !split->ib path but I want to be sure before
>> >> > > I commit the patch.
>> >> >
>> >> > This seems reasonable to me - I haven't looked at this in a while
>> >> > though, and suspect this might be just one of several niggles in this
>> >> > code.
>> >> >
>> >> > Keith
>> >>
>> >> There is one more (even bigger) problem with the vbo_split_inplace code.
>> >> The behavior is completely bogus when there's an index buffer, because in
>> >> flush function we don't use index buffer at all. For cases when
>> >> max_index >= ib->count we end with incorrect rendering in the worst case,
>> >> for max_index < ib->count we end with GPU trying to read outside of VBOs
>> >> resulting in GPU lockup.
>> >> I've tried to come up with a proper solution but failed. I'd really
>> >> appreciate if someone could take a look at this.
>> >
>> > We handle this correctly in gallium, though it's a major performance hit
>> > if you end up having to do it & we try pretty hard to avoid it (though
>> > there is always more that could be done).
>> >
>> > A first question is to ask why you're splitting the vertex data up in
>> > the first place? If you're just passing it off to hardware, why not
>> > improve the hardware upload path and avoid the need to split?
>> >
>> > If you're relying on swtnl, then I guess you're out of luck short of
>> > rewriting the TNL module. In that case, you may want to pull some code
>> > from Gallium (or consider moving to Gallium and getting all the other
>> > benefits).
>> >
>> > The code in question (draw_pt_vcache.c & friends) basically walks the
>> > element list, building a set of new vertex buffers and element lists.
>> > Basically:
>> >
>> > for each triangle in primitive
>> >    get three source index numbers
>> >    for each source index {
>> >       check index-mapping cache for this index
>> >       if (!present) {
>> >          copy that vertex from source vertex buffer to tail of dest
>> >          update index cache to map source index to dest_vert_count++
>> >       }
>> >       append index-map[source-index] to end of dest_index_list
>> >       flush & restart triangle if out of space
>> >    }
>> >
>> > So it's a lot of work & is definitely a noticeable performance hit.
>> >
>> > Are you really sure you need to be splitting vertex buffers?
>> >
>> > Keith
>> >
>> >
>>
>> I don't need vertex buffer neither index buffer splitting (there's no
>> limit for vertex buffer size), it's the rendering operation that needs
>> to be split (at least for UT2004 game).
>> R300-R400 has hardware limit of 65535 vertices that can be rendered at
>> once (be it from index buffer or directly from vertex buffer), and
>> UT2004 is sending us index buffer with over 90k elements (but
>> max_index is < 20k - so elements are repeated).
>>
>> Maciej Cencora
>
> If vertex/index are not emitted in the command stream but are in
> their own buffer than this is quite easy to split you just need to emit
> several draw packet and offset the index/vertex buffer their.
>
> packet3(drawvbuf)
> vboptr+0
> nelements
> packet3(drawvbuf)
> vboptr+lastnelements*elsize
> nelements
>
> Cheers,
> Jerome
>
Hi Jerome,

you're right, I had completely forgotten that we can put many draw packets in one command stream. There's one thing we need to remember when splitting the draw commands: we need to split them on a primitive boundary.

Keith, if no other drivers depend on vbo_split_prims for splitting the index buffer, then we can forget about this bug, at least for now.

Maciej Cencora

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
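
To make the "split on a primitive boundary" point concrete, here is a rough sketch in C of how one large draw could be cut into several draw ranges that each stay under the 65535-vertex limit mentioned above. This is not Mesa or r300 code: the names MAX_VERTS_PER_DRAW, struct draw_range and split_draw are made up for illustration. It also assumes independent primitives only (points, lines, triangles); strips, fans and loops would need vertices repeated across the split and are not handled here.

```c
#include <assert.h>

/* Limit quoted in the thread for R300-R400; the macro name is hypothetical. */
#define MAX_VERTS_PER_DRAW 65535u

/* One draw packet's worth of work: element offset and element count. */
struct draw_range {
    unsigned start;
    unsigned count;
};

/* Largest chunk that fits the limit and is a whole number of primitives.
 * verts_per_prim is 3 for triangles, 2 for lines, 1 for points. */
static unsigned chunk_size(unsigned limit, unsigned verts_per_prim)
{
    return limit - (limit % verts_per_prim);
}

/* Split `count` elements into ranges that each end on a primitive boundary.
 * Writes at most max_out ranges and returns how many were written; the
 * caller is expected to provide enough room for the whole draw. */
static unsigned split_draw(unsigned count, unsigned verts_per_prim,
                           struct draw_range *out, unsigned max_out)
{
    assert(verts_per_prim > 0);

    unsigned chunk = chunk_size(MAX_VERTS_PER_DRAW, verts_per_prim);
    unsigned start = 0;
    unsigned n = 0;

    while (start < count && n < max_out) {
        unsigned left = count - start;

        out[n].start = start;
        out[n].count = left < chunk ? left : chunk;
        start += out[n].count;
        n++;
    }
    return n;
}
```

Each returned range would then map to one draw packet, with the buffer offset advanced by start * elsize as in Jerome's example. For a triangle list of 90000 indices this gives two ranges, 65535 and 24465 elements, both multiples of 3.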