On Fri, 2009-09-25 at 02:39 -0700, Christoph Bumiller wrote:
> Keith Whitwell wrote:
> > On Fri, 2009-09-25 at 02:02 -0700, Christoph Bumiller wrote:
> >> Module: Mesa
> >> Branch: master
> >> Commit: 513cadf5afad18516f7299ade246f59d520753d0
> >> URL:
> >> http://cgit.freedesktop.org/mesa/mesa/commit/?id=513cadf5afad18516f7299ade246f59d520753d0
> >>
> >> Author: Christoph Bumiller <e0425...@student.tuwien.ac.at>
> >> Date: Thu Sep 24 17:37:08 2009 +0200
> >>
> >> nv50: actually enable view volume clipping
> >>
> >> Until now, only primitives wholly outside the view volume
> >> were not drawn.
> >> This was only visible when using a viewport smaller than
> >> the window size, naturally.
> >>
> >> ---
> >>
> >>  src/gallium/drivers/nv50/nv50_state_validate.c |   11 ++++++++++-
> >>  1 files changed, 10 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/src/gallium/drivers/nv50/nv50_state_validate.c
> >> b/src/gallium/drivers/nv50/nv50_state_validate.c
> >> index 5a3559e..4ed7697 100644
> >> --- a/src/gallium/drivers/nv50/nv50_state_validate.c
> >> +++ b/src/gallium/drivers/nv50/nv50_state_validate.c
> >> @@ -312,7 +312,7 @@ scissor_uptodate:
> >>                  goto viewport_uptodate;
> >>          nv50->state.viewport_bypass = bypass;
> >>
> >> -        so = so_new(12, 0);
> >> +        so = so_new(14, 0);
> >>          if (!bypass) {
> >>                  so_method(so, tesla, NV50TCL_VIEWPORT_TRANSLATE(0), 3);
> >>                  so_data  (so, fui(nv50->viewport.translate[0]));
> >> @@ -325,12 +325,21 @@ scissor_uptodate:
> >>
> >>                  so_method(so, tesla, NV50TCL_VIEWPORT_TRANSFORM_EN, 1);
> >>                  so_data  (so, 1);
> >> +                /* 0x0000 = remove whole primitive only (xyz)
> >> +                 * 0x1018 = remove whole primitive only (xy), clamp z
> >> +                 * 0x1080 = clip primitive (xyz)
> >> +                 * 0x1098 = clip primitive (xy), clamp z
> >> +                 */
> >> +                so_method(so, tesla, NV50TCL_VIEW_VOLUME_CLIP_CTRL, 1);
> >> +                so_data  (so, 0x1080);
> >>                  /* no idea what 0f90 does */
> >>                  so_method(so, tesla, 0x0f90, 1);
> >>                  so_data  (so, 0);
> >>          } else {
> >>                  so_method(so, tesla, NV50TCL_VIEWPORT_TRANSFORM_EN, 1);
> >>                  so_data  (so, 0);
> >> +                so_method(so, tesla, NV50TCL_VIEW_VOLUME_CLIP_CTRL, 1);
> >> +                so_data  (so, 0x0000);
> >>                  so_method(so, tesla, 0x0f90, 1);
> >>                  so_data  (so, 1);
> >>          }
> >>
> >
> > Chris,
> >
> > Do you notice any performance difference after doing this?  I suspect
> > that the fastpath for a lot of hardware is to actually avoid
> > geometry-based clipping and do it with the rasterizer.  If this forces
> > geometry clipping on when not required, it may bottleneck the hardware
> > in vertex processing.
> >
> > I'm only guessing, but something to keep in mind or investigate one day.
> >
> > Keith
> >
> I did measure a slight performance decrease, yes, but I just wanted
> correct behaviour for now, and there's no pipe state for the viewport
> boundaries (yet) (although I could infer them from scale & xlate).
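
For reference, the inference from scale and translate mentioned in the quoted
text can be sketched roughly as below. This is only an illustration, not nv50
driver code: struct viewport_xform, struct viewport_rect and viewport_bounds()
are hypothetical names, and the sketch assumes the usual viewport transform
window = scale * ndc + translate with ndc in [-1, 1].

```c
#include <math.h>

/* Hypothetical stand-in for the viewport state: only the scale and
 * translate terms of the viewport transform are modelled here. */
struct viewport_xform {
    float scale[3];
    float translate[3];
};

/* Hypothetical result type: viewport edges in window coordinates. */
struct viewport_rect {
    float x0, y0, x1, y1;
};

/* Assumes window = scale * ndc + translate with ndc in [-1, 1], so the
 * edges sit at translate -/+ |scale|.  fabsf() keeps x0 <= x1 and
 * y0 <= y1 even when a scale is negative (e.g. a y-flipped viewport). */
static struct viewport_rect
viewport_bounds(const struct viewport_xform *vp)
{
    struct viewport_rect r;
    float half_w = fabsf(vp->scale[0]);
    float half_h = fabsf(vp->scale[1]);

    r.x0 = vp->translate[0] - half_w;
    r.x1 = vp->translate[0] + half_w;
    r.y0 = vp->translate[1] - half_h;
    r.y1 = vp->translate[1] + half_h;
    return r;
}
```

With scale = (160, -120) and translate = (260, 170) this gives a 320x240
rectangle with its lower corner at (100, 50).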

Yeah, there are a couple of cases where the pipe state is too heavily
processed (or we made wrong guesses about what drivers would need).  The
viewport state is one instance.  I'm also not thrilled with how we
handle front/backface information for culling, stencil, etc.

> The binary driver also doesn't do this, but I kind of wanted to show
> that it could be done.
>
> I guess I'll deactivate vertex clipping again when I swap the scissor
> and viewport regs (as we probably have them reversed, current viewport
> affects clear and scissor doesn't), which will happen when I have a
> reason to, which is when I'm able to tell the state tracker it doesn't
> have to fall back in any of the buffer clear cases :-)

OK.  I'd be interested to know if there is any real difference between
partial clears executed through a hardware path vs. having the
state-tracker draw a quad.

My opinion on this is that there is a real difference between a
full-screen write-only clear and part-screen or masked clears, which
are read-modify-write operations.

The former are very special operations if you analyse the rendering
stream as effectively a set of transactions on render-targets: they
obliterate the previous state of the render-target.  The latter (partial
or masked clears) are semantically no different to any other primitive;
they just happen to be flat-shaded and axis-aligned.

It would be interesting to know if (and indeed why) partial clears are
any faster than regular rendering on hardware we care about.

Keith

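
As an illustration of the distinction drawn above between write-only clears
and read-modify-write clears, a classification test might look something like
the following.  This is a sketch under stated assumptions, not Mesa code:
struct clear_request and clear_is_write_only() are hypothetical names, and
the mask fields are a simplified model of per-channel write masks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a clear request; not a Mesa structure. */
struct clear_request {
    unsigned x, y, width, height;          /* region being cleared */
    unsigned target_width, target_height;  /* render-target size */
    uint32_t write_mask;                   /* bits the clear may write */
    uint32_t full_mask;                    /* all bits the format has */
};

/* A clear only obliterates the previous contents of the render-target
 * when it covers the whole target and writes every bit.  Anything else
 * preserves part of the old state and so behaves like an ordinary
 * flat-shaded, axis-aligned primitive. */
static bool
clear_is_write_only(const struct clear_request *c)
{
    bool covers_target = c->x == 0 && c->y == 0 &&
                         c->width >= c->target_width &&
                         c->height >= c->target_height;
    bool unmasked = (c->write_mask & c->full_mask) == c->full_mask;

    return covers_target && unmasked;
}
```

A caller could take a fast clear path when this returns true and fall back
to drawing a quad otherwise; whether that split pays off on real hardware is
exactly the open question above.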
_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev