Re: [Mesa-dev] [Mesa-stable] [PATCH] ac/nir: Add workaround for GFX9 buffer views.

2018-04-11 Thread Bas Nieuwenhuizen
On Wed, Apr 11, 2018 at 4:25 PM, Juan A. Suarez Romero
 wrote:
> Hi, Bas.
>
> Unfortunately, I can't apply this patch neither in next 17.3 release stable, 
> nor
> in 18.0, as this patch does not apply cleanly, and it requires other different
> commits that didn't land in the stable branches.

This backport (https://patchwork.freedesktop.org/patch/214949/)
applies cleanly on 17.3 for me.

>
>
> For 17.3 I think it is probably not worth to try to provide a specific version
> for the stable, as I'm already cooking the pre-release, and this is the latest
> release for 17.3 series.

Can we please get this in 17.3? This has been submitted upstream
laready for 2 weeks and this backport has been available for a week
and fixes one of the major games from Feral on Vega:

commit 4503ff760c794c3bb15b978a47c530037d56498e (dow3)
Author: Bas Nieuwenhuizen 
Date:   Wed Mar 28 23:54:40 2018 +0200

Why did I not get a message that that commit could not be applied?

>
>
> Maybe it is worth to provide a version for 18.0 branch, though. Probably it
> wouldn't enter in this release, but surely in the next one which should happen
> in 1 or 2 weeks.
>
>
> For more information, for 18.0 this patch seems to require 1251f08ef ("ac: add
> ac_build_buffer_load_common() helper"), bac9fa9f17f ("ac: add glc parameter to
> ac_build_buffer_load_format"), and probably others.
>
>
> J.A.
>
>
> On Wed, 2018-04-11 at 11:26 +0200, Bas Nieuwenhuizen wrote:
>> On Wed, Apr 4, 2018 at 10:19 PM, Bas Nieuwenhuizen
>>  wrote:
>> > On GFX9 whether the buffer size is interpreted as elements or bytes
>> > depends on whether IDXEN is enabled in the instruction. If the index
>> > is a constant zero, LLVM optimizes IDXEN to 0.
>> >
>> > Now the size in elements is interpreted in bytes which of course
>> > results in out of bounds accesses.
>> >
>> > The correct fix is most likely to disable the LLVM optimization,
>> > but we need something to work with LLVM <= 6.0.
>> >
>> > radeonsi does the max between stride and element count on the CPU
>> > but that results in the size intrinsics returning the wrong size
>> > for the buffer. This would cause CTS errors for radv.
>> >
>> > v2: Also include the store changes.
>> >
>> > Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
>> > (backported from 4503ff760c794c3bb15b978a47c530037d56498e for 17.3)
>> > ---
>> >  src/amd/common/ac_llvm_build.c  | 20 
>> >  src/amd/common/ac_llvm_build.h  |  8 
>> >  src/amd/common/ac_nir_to_llvm.c | 36 ++--
>> >  src/amd/common/ac_shader_abi.h  |  4 
>> >  4 files changed, 62 insertions(+), 6 deletions(-)
>> >
>> > diff --git a/src/amd/common/ac_llvm_build.c 
>> > b/src/amd/common/ac_llvm_build.c
>> > index e5cd23e025..f193f71c3e 100644
>> > --- a/src/amd/common/ac_llvm_build.c
>> > +++ b/src/amd/common/ac_llvm_build.c
>> > @@ -960,6 +960,26 @@ LLVMValueRef ac_build_buffer_load_format(struct 
>> > ac_llvm_context *ctx,
>> >   AC_FUNC_ATTR_READONLY);
>> >  }
>> >
>> > +LLVMValueRef ac_build_buffer_load_format_gfx9_safe(struct ac_llvm_context 
>> > *ctx,
>> > +  LLVMValueRef rsrc,
>> > +  LLVMValueRef vindex,
>> > +  LLVMValueRef voffset,
>> > +  bool can_speculate)
>> > +{
>> > +   LLVMValueRef elem_count = LLVMBuildExtractElement(ctx->builder, 
>> > rsrc, LLVMConstInt(ctx->i32, 2, 0), "");
>> > +   LLVMValueRef stride = LLVMBuildExtractElement(ctx->builder, rsrc, 
>> > LLVMConstInt(ctx->i32, 1, 0), "");
>> > +   stride = LLVMBuildLShr(ctx->builder, stride, 
>> > LLVMConstInt(ctx->i32, 16, 0), "");
>> > +
>> > +   LLVMValueRef new_elem_count = LLVMBuildSelect(ctx->builder,
>> > + 
>> > LLVMBuildICmp(ctx->builder, LLVMIntUGT, elem_count, stride, ""),
>> > + elem_count, stride, 
>> > "");
>> > +
>> > +   LLVMValueRef new_rsrc = LLVMBuildInsertElement(ctx->builder, rsrc, 
>> > new_elem_count,
>> > +  
>> > LLVMConstInt(ctx->i32, 2, 0), "");
>> > +
>> > +   return ac_build_buffer_load_format(ctx, new_rsrc, vindex, voffset, 
>> > can_speculate);
>> > +}
>> > +
>> >  /**
>> >   * Set range metadata on an instruction.  This can only be used on load 
>> > and
>> >   * call instructions.  If you know an instruction can only produce the 
>> > values
>> > diff --git a/src/amd/common/ac_llvm_build.h 
>> > b/src/amd/common/ac_llvm_build.h
>> > index aa2a2899ab..d4264f2879 100644
>> > --- a/src/amd/common/ac_llvm_build.h
>> > +++ b/src/amd/common/ac_llvm_build.h
>> > @@ -188,6 +188,14 @@ LLVMValueRef 

Re: [Mesa-dev] [Mesa-stable] [PATCH] ac/nir: Add workaround for GFX9 buffer views.

2018-04-11 Thread Juan A. Suarez Romero
Hi, Bas.

Unfortunately, I can't apply this patch neither in next 17.3 release stable, nor
in 18.0, as this patch does not apply cleanly, and it requires other different
commits that didn't land in the stable branches.


For 17.3 I think it is probably not worth to try to provide a specific version
for the stable, as I'm already cooking the pre-release, and this is the latest
release for 17.3 series.


Maybe it is worth to provide a version for 18.0 branch, though. Probably it
wouldn't enter in this release, but surely in the next one which should happen
in 1 or 2 weeks.


For more information, for 18.0 this patch seems to require 1251f08ef ("ac: add
ac_build_buffer_load_common() helper"), bac9fa9f17f ("ac: add glc parameter to
ac_build_buffer_load_format"), and probably others.


J.A.


On Wed, 2018-04-11 at 11:26 +0200, Bas Nieuwenhuizen wrote:
> On Wed, Apr 4, 2018 at 10:19 PM, Bas Nieuwenhuizen
>  wrote:
> > On GFX9 whether the buffer size is interpreted as elements or bytes
> > depends on whether IDXEN is enabled in the instruction. If the index
> > is a constant zero, LLVM optimizes IDXEN to 0.
> > 
> > Now the size in elements is interpreted in bytes which of course
> > results in out of bounds accesses.
> > 
> > The correct fix is most likely to disable the LLVM optimization,
> > but we need something to work with LLVM <= 6.0.
> > 
> > radeonsi does the max between stride and element count on the CPU
> > but that results in the size intrinsics returning the wrong size
> > for the buffer. This would cause CTS errors for radv.
> > 
> > v2: Also include the store changes.
> > 
> > Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
> > (backported from 4503ff760c794c3bb15b978a47c530037d56498e for 17.3)
> > ---
> >  src/amd/common/ac_llvm_build.c  | 20 
> >  src/amd/common/ac_llvm_build.h  |  8 
> >  src/amd/common/ac_nir_to_llvm.c | 36 ++--
> >  src/amd/common/ac_shader_abi.h  |  4 
> >  4 files changed, 62 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
> > index e5cd23e025..f193f71c3e 100644
> > --- a/src/amd/common/ac_llvm_build.c
> > +++ b/src/amd/common/ac_llvm_build.c
> > @@ -960,6 +960,26 @@ LLVMValueRef ac_build_buffer_load_format(struct 
> > ac_llvm_context *ctx,
> >   AC_FUNC_ATTR_READONLY);
> >  }
> > 
> > +LLVMValueRef ac_build_buffer_load_format_gfx9_safe(struct ac_llvm_context 
> > *ctx,
> > +  LLVMValueRef rsrc,
> > +  LLVMValueRef vindex,
> > +  LLVMValueRef voffset,
> > +  bool can_speculate)
> > +{
> > +   LLVMValueRef elem_count = LLVMBuildExtractElement(ctx->builder, 
> > rsrc, LLVMConstInt(ctx->i32, 2, 0), "");
> > +   LLVMValueRef stride = LLVMBuildExtractElement(ctx->builder, rsrc, 
> > LLVMConstInt(ctx->i32, 1, 0), "");
> > +   stride = LLVMBuildLShr(ctx->builder, stride, LLVMConstInt(ctx->i32, 
> > 16, 0), "");
> > +
> > +   LLVMValueRef new_elem_count = LLVMBuildSelect(ctx->builder,
> > + 
> > LLVMBuildICmp(ctx->builder, LLVMIntUGT, elem_count, stride, ""),
> > + elem_count, stride, 
> > "");
> > +
> > +   LLVMValueRef new_rsrc = LLVMBuildInsertElement(ctx->builder, rsrc, 
> > new_elem_count,
> > +  
> > LLVMConstInt(ctx->i32, 2, 0), "");
> > +
> > +   return ac_build_buffer_load_format(ctx, new_rsrc, vindex, voffset, 
> > can_speculate);
> > +}
> > +
> >  /**
> >   * Set range metadata on an instruction.  This can only be used on load and
> >   * call instructions.  If you know an instruction can only produce the 
> > values
> > diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
> > index aa2a2899ab..d4264f2879 100644
> > --- a/src/amd/common/ac_llvm_build.h
> > +++ b/src/amd/common/ac_llvm_build.h
> > @@ -188,6 +188,14 @@ LLVMValueRef ac_build_buffer_load_format(struct 
> > ac_llvm_context *ctx,
> >  LLVMValueRef voffset,
> >  bool can_speculate);
> > 
> > +/* load_format that handles the stride & element count better if idxen is
> > + * disabled by LLVM. */
> > +LLVMValueRef ac_build_buffer_load_format_gfx9_safe(struct ac_llvm_context 
> > *ctx,
> > +  LLVMValueRef rsrc,
> > +  LLVMValueRef vindex,
> > +  LLVMValueRef voffset,
> > +  bool can_speculate);
> > +
> >  LLVMValueRef
> >