Re: [Mesa-dev] [PATCH] gallivm: fix improper clamping of vertex index when fetching gs inputs

2018-11-08 Thread Roland Scheidegger
Am 08.11.18 um 14:59 schrieb Jose Fonseca:
> Good find.
> 
> On 08/11/2018 01:54, srol...@vmware.com wrote:
>> From: Roland Scheidegger 
>>
>> Because we only have one file_max for the (2d) gs input file, the value
>> actually represents the max of attrib and vertex index (although I'm
>> not entirely sure if we really want the max, since the max valid value
>> of the vertex dimension can be easily deduced from the input primitive).
> 
> Yes, perhaps we should have a 2nd file_max array for the 2nd axis.   But
> it would be a more invasive change.
If just for gs inputs, I can't see why we'd need it, since it can easily
be deduced from the input prim (we also have file_count, although that
seems just plain broken to me for gs inputs, as well as file_mask, which
also mixes vertex / attrib indexes, and looks useless in that form to me
as well).

> 
>> Thus in cases where the number of inputs is higher than the number of
>> vertices per prim, we did not properly clamp the vertex index, which
>> would result in out-of-bound fetches, potentially causing segfaults
>> (the segfaults seemed actually difficult to trigger, but valgrind
>> certainly wasn't happy). This might have happened even if the shader
>> did not actually try to fetch bogus vertices, if the fetching happened
>> in non-active conditional clauses.
>>
>> To fix simply use the correct max vertex index value (derived from
>> the input prim type) instead when clamping for this case.
>> ---
>>   .../auxiliary/gallivm/lp_bld_tgsi_soa.c   | 38 ++-
>>   1 file changed, 28 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> index 83d7dbea9a..0db81b31ad 100644
>> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> @@ -41,6 +41,7 @@
>>   #include "util/u_debug.h"
>>   #include "util/u_math.h"
>>   #include "util/u_memory.h"
>> +#include "util/u_prim.h"
>>   #include "tgsi/tgsi_dump.h"
>>   #include "tgsi/tgsi_exec.h"
>>   #include "tgsi/tgsi_info.h"
>> @@ -1059,7 +1060,8 @@ emit_mask_scatter(struct
>> lp_build_tgsi_soa_context *bld,
>>   static LLVMValueRef
>>   get_indirect_index(struct lp_build_tgsi_soa_context *bld,
>>  unsigned reg_file, unsigned reg_index,
>> -   const struct tgsi_ind_register *indirect_reg)
>> +   const struct tgsi_ind_register *indirect_reg,
>> +   unsigned index_limit)
> 
> file_max array is signed, so let's make index_limit also signed, and
> assert here in this function that it is positive.  This would trap if
> some caller gets the limit for the wrong file.
> 
> Which reminds me -- SM4 allows to declare constant buffers with
> undeclared size.  What exactly would we get passed here in those
> circunstances?  In practice constant buffers can't be larger than 64KB,
> which means we could presume filemax is 4K registers.
For constant reg file, the clamping is skipped in this function here.
emit_fetch_constant does its own clamping, based on the actual size of
the bound buffers. (Although note the clamping is only done for indirect
constant buffer fetches, not for direct ones, which may not be safe.
Maybe should fix that as well? I think it was omitted for performance
reasons, and because unlike for indirect fetches the shader would be
somewhat bogus, but I guess it would still result in out-of-bound fetches.)


> 
>>   {
>>  LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
>>  struct lp_build_context *uint_bld = >bld_base.uint_bld;
>> @@ -1107,8 +1109,7 @@ get_indirect_index(struct
>> lp_build_tgsi_soa_context *bld,
>>   */
>>  if (reg_file != TGSI_FILE_CONSTANT) {
>>     max_index = lp_build_const_int_vec(bld->bld_base.base.gallivm,
>> - uint_bld->type,
>> -
>> bld->bld_base.info->file_max[reg_file]);
>> + uint_bld->type, index_limit);
>>       assert(!uint_bld->type.sign);
>>     index = lp_build_min(uint_bld, index, max_index);
>> @@ -1224,7 +1225,8 @@ emit_fetch_constant(
>>     indirect_index = get_indirect_index(bld,
>>     reg->Register.File,
>>     reg->Register.Index,
>> -  >Indirect);
>> +  >Indirect,
>> + 
>> bld->bld_base.info->file_max[reg->Register.File]);
>>       /* All fetches are from the same constant buffer, so
>>  * we need to propagate the size to a vector to do a
>> @@ -1341,7 +1343,8 @@ emit_fetch_immediate(
>>    indirect_index = get_indirect_index(bld,
>>    reg->Register.File,
>>    

Re: [Mesa-dev] [PATCH] gallivm: fix improper clamping of vertex index when fetching gs inputs

2018-11-08 Thread Jose Fonseca

Good find.

On 08/11/2018 01:54, srol...@vmware.com wrote:

From: Roland Scheidegger 

Because we only have one file_max for the (2d) gs input file, the value
actually represents the max of attrib and vertex index (although I'm
not entirely sure if we really want the max, since the max valid value
of the vertex dimension can be easily deduced from the input primitive).


Yes, perhaps we should have a 2nd file_max array for the 2nd axis.   But 
it would be a more invasive change.



Thus in cases where the number of inputs is higher than the number of
vertices per prim, we did not properly clamp the vertex index, which
would result in out-of-bound fetches, potentially causing segfaults
(the segfaults seemed actually difficult to trigger, but valgrind
certainly wasn't happy). This might have happened even if the shader
did not actually try to fetch bogus vertices, if the fetching happened
in non-active conditional clauses.

To fix simply use the correct max vertex index value (derived from
the input prim type) instead when clamping for this case.
---
  .../auxiliary/gallivm/lp_bld_tgsi_soa.c   | 38 ++-
  1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 83d7dbea9a..0db81b31ad 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -41,6 +41,7 @@
  #include "util/u_debug.h"
  #include "util/u_math.h"
  #include "util/u_memory.h"
+#include "util/u_prim.h"
  #include "tgsi/tgsi_dump.h"
  #include "tgsi/tgsi_exec.h"
  #include "tgsi/tgsi_info.h"
@@ -1059,7 +1060,8 @@ emit_mask_scatter(struct lp_build_tgsi_soa_context *bld,
  static LLVMValueRef
  get_indirect_index(struct lp_build_tgsi_soa_context *bld,
 unsigned reg_file, unsigned reg_index,
-   const struct tgsi_ind_register *indirect_reg)
+   const struct tgsi_ind_register *indirect_reg,
+   unsigned index_limit)


file_max array is signed, so let's make index_limit also signed, and 
assert here in this function that it is positive.  This would trap if 
some caller gets the limit for the wrong file.


Which reminds me -- SM4 allows to declare constant buffers with 
undeclared size.  What exactly would we get passed here in those 
circunstances?  In practice constant buffers can't be larger than 64KB, 
which means we could presume filemax is 4K registers.



  {
 LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
 struct lp_build_context *uint_bld = >bld_base.uint_bld;
@@ -1107,8 +1109,7 @@ get_indirect_index(struct lp_build_tgsi_soa_context *bld,
  */
 if (reg_file != TGSI_FILE_CONSTANT) {
max_index = lp_build_const_int_vec(bld->bld_base.base.gallivm,
- uint_bld->type,
- 
bld->bld_base.info->file_max[reg_file]);
+ uint_bld->type, index_limit);
  
assert(!uint_bld->type.sign);

index = lp_build_min(uint_bld, index, max_index);
@@ -1224,7 +1225,8 @@ emit_fetch_constant(
indirect_index = get_indirect_index(bld,
reg->Register.File,
reg->Register.Index,
-  >Indirect);
+  >Indirect,
+  
bld->bld_base.info->file_max[reg->Register.File]);
  
/* All fetches are from the same constant buffer, so

 * we need to propagate the size to a vector to do a
@@ -1341,7 +1343,8 @@ emit_fetch_immediate(
   indirect_index = get_indirect_index(bld,
   reg->Register.File,
   reg->Register.Index,
- >Indirect);
+ >Indirect,
+ 
bld->bld_base.info->file_max[reg->Register.File]);
   /*
* Unlike for other reg classes, adding pixel offsets is unnecessary 
-
* immediates are stored as full vectors (FIXME??? - might be better
@@ -1414,7 +1417,8 @@ emit_fetch_input(
indirect_index = get_indirect_index(bld,
reg->Register.File,
reg->Register.Index,
-  >Indirect);
+  >Indirect,
+  
bld->bld_base.info->file_max[reg->Register.File]);
  
index_vec = get_soa_array_offsets(_base->uint_bld,

  indirect_index,
@@ -1502,7 +1506,15 @@ emit_fetch_gs_input(
attrib_index = get_indirect_index(bld,
  

[Mesa-dev] [PATCH] gallivm: fix improper clamping of vertex index when fetching gs inputs

2018-11-07 Thread sroland
From: Roland Scheidegger 

Because we only have one file_max for the (2d) gs input file, the value
actually represents the max of attrib and vertex index (although I'm
not entirely sure if we really want the max, since the max valid value
of the vertex dimension can be easily deduced from the input primitive).

Thus in cases where the number of inputs is higher than the number of
vertices per prim, we did not properly clamp the vertex index, which
would result in out-of-bound fetches, potentially causing segfaults
(the segfaults seemed actually difficult to trigger, but valgrind
certainly wasn't happy). This might have happened even if the shader
did not actually try to fetch bogus vertices, if the fetching happened
in non-active conditional clauses.

To fix simply use the correct max vertex index value (derived from
the input prim type) instead when clamping for this case.
---
 .../auxiliary/gallivm/lp_bld_tgsi_soa.c   | 38 ++-
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 83d7dbea9a..0db81b31ad 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -41,6 +41,7 @@
 #include "util/u_debug.h"
 #include "util/u_math.h"
 #include "util/u_memory.h"
+#include "util/u_prim.h"
 #include "tgsi/tgsi_dump.h"
 #include "tgsi/tgsi_exec.h"
 #include "tgsi/tgsi_info.h"
@@ -1059,7 +1060,8 @@ emit_mask_scatter(struct lp_build_tgsi_soa_context *bld,
 static LLVMValueRef
 get_indirect_index(struct lp_build_tgsi_soa_context *bld,
unsigned reg_file, unsigned reg_index,
-   const struct tgsi_ind_register *indirect_reg)
+   const struct tgsi_ind_register *indirect_reg,
+   unsigned index_limit)
 {
LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
struct lp_build_context *uint_bld = >bld_base.uint_bld;
@@ -1107,8 +1109,7 @@ get_indirect_index(struct lp_build_tgsi_soa_context *bld,
 */
if (reg_file != TGSI_FILE_CONSTANT) {
   max_index = lp_build_const_int_vec(bld->bld_base.base.gallivm,
- uint_bld->type,
- 
bld->bld_base.info->file_max[reg_file]);
+ uint_bld->type, index_limit);
 
   assert(!uint_bld->type.sign);
   index = lp_build_min(uint_bld, index, max_index);
@@ -1224,7 +1225,8 @@ emit_fetch_constant(
   indirect_index = get_indirect_index(bld,
   reg->Register.File,
   reg->Register.Index,
-  >Indirect);
+  >Indirect,
+  
bld->bld_base.info->file_max[reg->Register.File]);
 
   /* All fetches are from the same constant buffer, so
* we need to propagate the size to a vector to do a
@@ -1341,7 +1343,8 @@ emit_fetch_immediate(
  indirect_index = get_indirect_index(bld,
  reg->Register.File,
  reg->Register.Index,
- >Indirect);
+ >Indirect,
+ 
bld->bld_base.info->file_max[reg->Register.File]);
  /*
   * Unlike for other reg classes, adding pixel offsets is unnecessary -
   * immediates are stored as full vectors (FIXME??? - might be better
@@ -1414,7 +1417,8 @@ emit_fetch_input(
   indirect_index = get_indirect_index(bld,
   reg->Register.File,
   reg->Register.Index,
-  >Indirect);
+  >Indirect,
+  
bld->bld_base.info->file_max[reg->Register.File]);
 
   index_vec = get_soa_array_offsets(_base->uint_bld,
 indirect_index,
@@ -1502,7 +1506,15 @@ emit_fetch_gs_input(
   attrib_index = get_indirect_index(bld,
 reg->Register.File,
 reg->Register.Index,
->Indirect);
+>Indirect,
+   /*
+* XXX: this is possibly not quite the right value, since file_max may be
+* larger than the max attrib index, due to it being the max of declared
+* inputs AND the max vertices per prim (which is 6 for tri adj).
+* It should however be safe to use (since we always allocate
+* PIPE_MAX_SHADER_INPUTS (80) for it, which is overallocated quite a bit).
+*/
+info->file_max[reg->Register.File]);
}