Sounds good to me.
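
As a rough illustration of the proposal quoted below (not the actual patch), the sketch that follows emulates the two register regions on the CPU. It assumes a SIMD8 register that holds two 2x2 subspans, each stored as top-left, top-right, bottom-left, bottom-right. The names DerivativeHint, gather_region, use_accurate_ddx and emulate_ddx are hypothetical stand-ins; only the hint test and the accurate_ddx override mirror what is discussed in the thread.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Eight channels of one SIMD8 register: two 2x2 subspans, each laid out
// as top-left, top-right, bottom-left, bottom-right (assumed layout).
using Reg = std::array<float, 8>;

// Hypothetical stand-in for the GL_FRAGMENT_SHADER_DERIVATIVE_HINT values.
enum class DerivativeHint { DontCare, Fastest, Nicest };

// Read a <vstride; width, hstride> region that starts at element `subnr`.
// Channel ch reads element subnr + (ch / width) * vstride + (ch % width) * hstride.
Reg gather_region(const Reg &src, unsigned subnr, unsigned vstride,
                  unsigned width, unsigned hstride)
{
   assert(width > 0);
   Reg out{};
   for (unsigned ch = 0; ch < out.size(); ch++) {
      unsigned idx = subnr + (ch / width) * vstride + (ch % width) * hstride;
      assert(idx < src.size());
      out[ch] = src[idx];
   }
   return out;
}

// The condition proposed in the thread: keep the current code when the app
// asks for GL_NICEST or the user sets the (proposed) accurate_ddx option.
bool use_accurate_ddx(DerivativeHint hint, bool accurate_ddx)
{
   return hint == DerivativeHint::Nicest || accurate_ddx;
}

// Emulate the ADD of the two regions: dst = src<subnr 1> - src<subnr 0>.
Reg emulate_ddx(const Reg &src, DerivativeHint hint, bool accurate_ddx)
{
   const bool accurate = use_accurate_ddx(hint, accurate_ddx);
   // <2;2,0> gives one gradient per row; <4;4,0> replicates the top row's.
   const unsigned vstride = accurate ? 2 : 4;
   const unsigned width = accurate ? 2 : 4;

   const Reg right = gather_region(src, 1, vstride, width, 0);
   const Reg left = gather_region(src, 0, vstride, width, 0);

   Reg dst{};
   for (std::size_t ch = 0; ch < dst.size(); ch++)
      dst[ch] = right[ch] - left[ch];
   return dst;
}
```

With src = {0, 1, 10, 13, 20, 22, 30, 35}, the accurate path gives {1, 1, 3, 3, 2, 2, 5, 5} (one gradient per row), while the approximate path gives {1, 1, 1, 1, 2, 2, 2, 2} (the top row's gradient replicated across each subspan).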

On Fri, Sep 13, 2013 at 3:11 PM, Chia-I Wu <olva...@gmail.com> wrote:
> On Thu, Sep 12, 2013 at 10:48 PM, Ian Romanick <i...@freedesktop.org> wrote:
>> On 09/12/2013 01:06 AM, Chris Forbes wrote:
>>> Can we make this approximation conditional on an image-quality control
>>> in driconf [or somewhere else]?
>>
>> There's already a control that applications can use:
>> GL_FRAGMENT_SHADER_DERIVATIVE_HINT.  I don't know whether or not /any/
>> app has ever used it.  The default setting is GL_DONT_CARE, so,
>> technically speaking, we could do this optimization whenever the hint
>> isn't GL_NICEST.  Though, we may want a driconf override anyway.  Hmm...
> How about, in generate_ddx():
>
>   if (brw->ctx.Hint.FragmentShaderDerivative == GL_NICEST ||
>       brw->accurate_ddx) {
>     // current code
>   }
>   else {
>     // new code
>   }
>
> That is, when the app doesn't care, we treat it as GL_FASTEST.  If the
> user cares, he can set the new drirc option, accurate_ddx, to true to
> override.  accurate_ddx is false by default.
>
>>> On Thu, Sep 12, 2013 at 5:00 PM, Chia-I Wu <olva...@gmail.com> wrote:
>>>> From: Chia-I Wu <o...@lunarg.com>
>>>>
>>>> Replicate the gradient of the top-left pixel to the other three pixels in
>>>> the subspan, as is done for DDY.  Before, different gradients were used for
>>>> pixels in the top row and pixels in the bottom row.
>>>>
>>>> This change results in a less accurate approximation.  However, it improves
>>>> the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202%
>>>> (at 95.0% confidence) on Haswell.  No noticeable image quality difference
>>>> observed.
>>>>
>>>> No piglit gpu.tests regressions.
>>>>
>>>> I failed to come up with an explanation for the performance difference.
>>>> The change does not make a difference on Ivy Bridge either.  If anyone has
>>>> the insight, please kindly enlighten me.  Performance differences may also
>>>> be observed on other games that call textureGrad and dFdx.
>>>>
>>>> Signed-off-by: Chia-I Wu <o...@lunarg.com>
>>>> ---
>>>>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 17 +++++++++++++----
>>>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> index bfb3d33..c0d24a0 100644
>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> @@ -564,16 +564,25 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src
>>>>  void
>>>>  fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct brw_reg src)
>>>>  {
>>>> +   /* approximate with ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4) on Haswell,
>>>> +    * which gives much better performance when the result is used with
>>>> +    * sample_d
>>>> +    */
>>>> +   unsigned vstride = (brw->is_haswell) ? BRW_VERTICAL_STRIDE_4 :
>>>> +                                          BRW_VERTICAL_STRIDE_2;
>>>> +   unsigned width = (brw->is_haswell) ? BRW_WIDTH_4 :
>>>> +                                        BRW_WIDTH_2;
>>>> +
>>>>     struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
>>>>                                   BRW_REGISTER_TYPE_F,
>>>> -                                 BRW_VERTICAL_STRIDE_2,
>>>> -                                 BRW_WIDTH_2,
>>>> +                                 vstride,
>>>> +                                 width,
>>>>                                   BRW_HORIZONTAL_STRIDE_0,
>>>>                                   BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>>>     struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
>>>>                                   BRW_REGISTER_TYPE_F,
>>>> -                                 BRW_VERTICAL_STRIDE_2,
>>>> -                                 BRW_WIDTH_2,
>>>> +                                 vstride,
>>>> +                                 width,
>>>>                                   BRW_HORIZONTAL_STRIDE_0,
>>>>                                   BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>>>     brw_ADD(p, dst, src0, negate(src1));
>>>> --
>>>> 1.8.3.1
>>>>
>>>> _______________________________________________
>>>> mesa-dev mailing list
>>>> mesa-dev@lists.freedesktop.org
>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
> --
> o...@lunarg.com
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev