On Fri, Sep 25, 2009 at 12:07, Keith Whitwell <kei...@vmware.com> wrote:
> On Fri, 2009-09-25 at 02:39 -0700, Christoph Bumiller wrote:
>> Keith Whitwell schrieb:
>> > On Fri, 2009-09-25 at 02:02 -0700, Christoph Bumiller wrote:
>> >> Module: Mesa
>> >> Branch: master
>> >> Commit: 513cadf5afad18516f7299ade246f59d520753d0
>> >> URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=513cadf5afad18516f7299ade246f59d520753d0
>> >>
>> >> Author: Christoph Bumiller <e0425...@student.tuwien.ac.at>
>> >> Date:   Thu Sep 24 17:37:08 2009 +0200
>> >>
>> >> nv50: actually enable view volume clipping
>> >>
>> >> Until now, only primitives wholly outside the view volume
>> >> were not drawn.
>> >> This was only visible when using a viewport smaller than
>> >> the window size, naturally.
>> >>
>> >> ---
>> >>
>> >>  src/gallium/drivers/nv50/nv50_state_validate.c |   11 ++++++++++-
>> >>  1 files changed, 10 insertions(+), 1 deletions(-)
>> >>
>> >> diff --git a/src/gallium/drivers/nv50/nv50_state_validate.c b/src/gallium/drivers/nv50/nv50_state_validate.c
>> >> index 5a3559e..4ed7697 100644
>> >> --- a/src/gallium/drivers/nv50/nv50_state_validate.c
>> >> +++ b/src/gallium/drivers/nv50/nv50_state_validate.c
>> >> @@ -312,7 +312,7 @@ scissor_uptodate:
>> >>                    goto viewport_uptodate;
>> >>            nv50->state.viewport_bypass = bypass;
>> >>
>> >> -          so = so_new(12, 0);
>> >> +          so = so_new(14, 0);
>> >>            if (!bypass) {
>> >>                    so_method(so, tesla, NV50TCL_VIEWPORT_TRANSLATE(0), 3);
>> >>                    so_data  (so, fui(nv50->viewport.translate[0]));
>> >> @@ -325,12 +325,21 @@ scissor_uptodate:
>> >>
>> >>                    so_method(so, tesla, NV50TCL_VIEWPORT_TRANSFORM_EN, 1);
>> >>                    so_data  (so, 1);
>> >> +                  /* 0x0000 = remove whole primitive only (xyz)
>> >> +                   * 0x1018 = remove whole primitive only (xy), clamp z
>> >> +                   * 0x1080 = clip primitive (xyz)
>> >> +                   * 0x1098 = clip primitive (xy), clamp z
>> >> +                   */
>> >> +                  so_method(so, tesla, NV50TCL_VIEW_VOLUME_CLIP_CTRL, 1);
>> >> +                  so_data  (so, 0x1080);
>> >>                    /* no idea what 0f90 does */
>> >>                    so_method(so, tesla, 0x0f90, 1);
>> >>                    so_data  (so, 0);
>> >>            } else {
>> >>                    so_method(so, tesla, NV50TCL_VIEWPORT_TRANSFORM_EN, 1);
>> >>                    so_data  (so, 0);
>> >> +                  so_method(so, tesla, NV50TCL_VIEW_VOLUME_CLIP_CTRL, 1);
>> >> +                  so_data  (so, 0x0000);
>> >>                    so_method(so, tesla, 0x0f90, 1);
>> >>                    so_data  (so, 1);
>> >>            }
>> >>
>> >
>> > Chris,
>> >
>> > Do you notice any performance difference after doing this?  I suspect
>> > that the fastpath for a lot of hardware is to actually avoid
>> > geometry-based clipping and do it with the rasterizer.  If this forces
>> > geometry clipping on when not required, it may bottleneck the hardware
>> > in vertex processing.
>> >
>> > I'm only guessing, but something to keep in mind or investigate one day.
>> >
>> > Keith
>> >
>> I did measure a slight performance decrease, yes, but I just wanted
>> correct behaviour for now, and there's no pipe state for the viewport
>> boundaries (yet) (although I could infer them from scale & xlate).
>
> Yeah, there are a couple of cases where the pipe state is too heavily
> processed (or we made wrong guesses about what drivers would need).  The
> viewport state is one instance.  I'm also not thrilled with how we
> handle front/backface information for culling, stencil, etc.
>
>> The binary driver also doesn't do this, but I kind of wanted to show
>> that it could be done.
>>
>> I guess I'll deactivate vertex clipping again when I swap the scissor
>> and viewport regs (as we probably have them reversed, current viewport
>> affects clear and scissor doesn't), which will happen when I have a
>> reason to, which is when I'm able to tell the state tracker it doesn't
>> have to fallback in any of the buffer clear cases :-)
>
> OK.  I'd be interested to know if there is any real difference between
> partial clears executed through a hardware path vs. having the
> state-tracker draw a quad.
>

Well, there is one as soon as your hardware features a form of
hierarchical zbuffering. So on radeon/geforce there will always
be a difference, both at clear time (where you only need to clear the
highest z level) and at subsequent rendering (where you have to go to
the lowest z levels to realise that you're fully visible anyway). The
reason is that GPUs are not smart enough to figure out that a given
primitive covers the whole ztile and only clear the hierarchical
value; the hierarchical value is only propagated once the tile is
completed and flushed.

So yeah, while the GL semantics are the same, the performance is
definitely better with hw-specific clears.

Stephane

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev