date:20180228

Mesa (master): nir/serialize: handle var->name being NULL

2018-02-28 Thread Alejandro Pinheiro

Module: Mesa
Branch: master
Commit: e72fb4e61128684efc28647931a793910e190656
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=e72fb4e61128684efc28647931a793910e190656

Author: Alejandro Piñeiro 
Date:   Wed Feb 28 13:01:56 2018 +0100

nir/serialize: handle var->name being NULL

var->name can be NULL, for example under ARB_gl_spirv. The code already
handles a NULL variable name when reading a variable, so handle it when
writing a variable too, for consistency.
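
The presence-flag pattern described above can be sketched as follows. This
is a minimal illustration, not the Mesa code: struct sketch_blob and all
sketch_* helpers are hypothetical stand-ins for the real blob API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size buffer standing in for Mesa's blob. */
struct sketch_blob {
   uint8_t data[256];
   size_t size;      /* number of bytes written so far */
   size_t read_pos;  /* read cursor */
};

static void
sketch_write_u32(struct sketch_blob *b, uint32_t v)
{
   assert(b->size + sizeof(v) <= sizeof(b->data));
   memcpy(b->data + b->size, &v, sizeof(v));
   b->size += sizeof(v);
}

static uint32_t
sketch_read_u32(struct sketch_blob *b)
{
   uint32_t v;
   assert(b->read_pos + sizeof(v) <= b->size);
   memcpy(&v, b->data + b->read_pos, sizeof(v));
   b->read_pos += sizeof(v);
   return v;
}

/* Write a presence flag, then the NUL-terminated name only if it exists. */
static void
sketch_write_name(struct sketch_blob *b, const char *name)
{
   sketch_write_u32(b, name != NULL);
   if (name) {
      size_t len = strlen(name) + 1;
      assert(b->size + len <= sizeof(b->data));
      memcpy(b->data + b->size, name, len);
      b->size += len;
   }
}

/* Mirror of the writer: returns NULL when the flag says no name follows. */
static const char *
sketch_read_name(struct sketch_blob *b)
{
   if (!sketch_read_u32(b))
      return NULL;
   const char *name = (const char *)(b->data + b->read_pos);
   b->read_pos += strlen(name) + 1;
   return name;
}
```

Writing NULL followed by "color" stores two flags and one string; reading
them back in the same order yields NULL and then "color".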

Reviewed-by: Timothy Arceri 

---

 src/compiler/nir/nir_serialize.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_serialize.c b/src/compiler/nir/nir_serialize.c
index 9fe46a675f..00df49c2ef 100644
--- a/src/compiler/nir/nir_serialize.c
+++ b/src/compiler/nir/nir_serialize.c
@@ -137,7 +137,8 @@ write_variable(write_ctx *ctx, const nir_variable *var)
write_add_object(ctx, var);
encode_type_to_blob(ctx->blob, var->type);
blob_write_uint32(ctx->blob, !!(var->name));
-   blob_write_string(ctx->blob, var->name);
+   if (var->name)
+  blob_write_string(ctx->blob, var->name);
blob_write_bytes(ctx->blob, (uint8_t *) &var->data, sizeof(var->data));
blob_write_uint32(ctx->blob, var->num_state_slots);
blob_write_bytes(ctx->blob, (uint8_t *) var->state_slots,

___
mesa-commit mailing list
mesa-commit@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-commit

Mesa (master): i965/fs: Support 16-bit store_ssbo with VK_KHR_relaxed_block_layout

2018-02-28 Thread Jason Ekstrand

Module: Mesa
Branch: master
Commit: 69be3a82ca6f3247c75d76ae97429462c8909a3c
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=69be3a82ca6f3247c75d76ae97429462c8909a3c

Author: Jose Maria Casanova Crespo 
Date:   Thu Feb  1 00:26:04 2018 +0100

i965/fs: Support 16-bit store_ssbo with VK_KHR_relaxed_block_layout

Restrict the use of untyped_surface_write with 16-bit pairs in an
SSBO to the cases where we can guarantee that the offset is a multiple
of 4.

Because VK_KHR_relaxed_block_layout is available in ANV, we can only
guarantee that when we have a constant offset that is a multiple of 4.
For non-constant offsets we always use byte_scattered_write.

v2: (Jason Ekstrand)
- Assert offset_reg to be multiple of 4 if it is immediate.
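
The message selection rule described above can be sketched as a small
predicate. This is only an illustration under stated assumptions: the
function name and the has_const_offset flag are hypothetical and model
whether the offset source is a compile-time constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a pair of 16-bit components starting at
 * first_component may be written with a 32-bit untyped write, false when
 * byte-scattered writes are needed instead. */
static bool
sketch_can_use_untyped_write(bool has_const_offset, uint32_t const_offset,
                             unsigned type_size, unsigned first_component)
{
   assert(type_size == 2);

   /* Non-constant offsets carry no alignment guarantee. */
   if (!has_const_offset)
      return false;

   /* Constant offsets qualify only if the write starts on a dword. */
   return (const_offset + type_size * first_component) % 4 == 0;
}
```

With an array element stride of 6, an element at constant offset 6 is not
dword aligned at component 0 but is at component 1 (byte 8).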

Reviewed-by: Jason Ekstrand 

---

 src/intel/compiler/brw_fs_nir.cpp | 22 +++---
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 3f077b3c91..73f424cf10 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -4130,6 +4130,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
  unsigned num_components = ffs(~(writemask >> first_component)) - 1;
  fs_reg write_src = offset(val_reg, bld, first_component);
 
+ nir_const_value *const_offset = nir_src_as_const_value(instr->src[2]);
+
  if (type_size > 4) {
 /* We can't write more than 2 64-bit components at once. Limit
  * the num_components of the write to what we can do and let the next
@@ -4145,14 +4147,19 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
  * 32-bit-aligned we need to use byte-scattered writes because
  * untyped writes works with 32-bit components with 32-bit
  * alignment. byte_scattered_write messages only support one
- * 16-bit component at a time.
+ * 16-bit component at a time. As VK_KHR_relaxed_block_layout
+ * could be enabled we can not guarantee that not constant offsets
+ * to be 32-bit aligned for 16-bit types. For example an array, of
+ * 16-bit vec3 with array element stride of 6.
  *
- * For example, if there is a 3-components vector we submit one
- * untyped-write message of 32-bit (first two components), and one
- * byte-scattered write message (the last component).
+ * In the case of 32-bit aligned constant offsets if there is
+ * a 3-components vector we submit one untyped-write message
+ * of 32-bit (first two components), and one byte-scattered
+ * write message (the last component).
  */
 
-if (first_component % 2) {
+if ( !const_offset || ((const_offset->u32[0] +
+   type_size * first_component) % 4)) {
/* If we use a .yz writemask we also need to emit 2
 * byte-scattered write messages because of y-component not
 * being aligned to 32-bit.
@@ -4178,7 +4185,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
  }
 
  fs_reg offset_reg;
- nir_const_value *const_offset = nir_src_as_const_value(instr->src[2]);
+
  if (const_offset) {
 offset_reg = brw_imm_ud(const_offset->u32[0] +
 type_size * first_component);
@@ -4217,7 +4224,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
  } else {
 assert(num_components * type_size <= 16);
 assert((num_components * type_size) % 4 == 0);
-assert((first_component * type_size) % 4 == 0);
+assert(offset_reg.file != BRW_IMMEDIATE_VALUE ||
+   offset_reg.ud % 4 == 0);
 unsigned num_slots = (num_components * type_size) / 4;
 
 emit_untyped_write(bld, surf_index, offset_reg,

Mesa (master): anv: Enable VK_KHR_16bit_storage for SSBO and UBO

2018-02-28 Thread Jason Ekstrand

Module: Mesa
Branch: master
Commit: 994d21042996232998a51bfabaab6f4970f0ea5a
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=994d21042996232998a51bfabaab6f4970f0ea5a

Author: Jose Maria Casanova Crespo 
Date:   Mon Nov 20 23:28:45 2017 +0100

anv: Enable VK_KHR_16bit_storage for SSBO and UBO

Enables the storageBuffer16BitAccess and uniformAndStorageBuffer16BitAccess
features of VK_KHR_16bit_storage for Gen8+.

Reviewed-by: Jason Ekstrand 

---

 src/intel/vulkan/anv_device.c  | 5 +++--
 src/intel/vulkan/anv_extensions.py | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 41f9e00c4e..e42b05d4fa 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -793,9 +793,10 @@ void anv_GetPhysicalDeviceFeatures2KHR(
   case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_16BIT_STORAGE_FEATURES_KHR: {
  VkPhysicalDevice16BitStorageFeaturesKHR *features =
 (VkPhysicalDevice16BitStorageFeaturesKHR *)ext;
+ ANV_FROM_HANDLE(anv_physical_device, pdevice, physicalDevice);
 
- features->storageBuffer16BitAccess = false;
- features->uniformAndStorageBuffer16BitAccess = false;
+ features->storageBuffer16BitAccess = pdevice->info.gen >= 8;
+ features->uniformAndStorageBuffer16BitAccess = pdevice->info.gen >= 8;
  features->storagePushConstant16 = false;
  features->storageInputOutput16 = false;
  break;
diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 6194eb0ad6..8d39038c43 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -49,7 +49,7 @@ class Extension:
 # and dEQP-VK.api.info.device fail due to the duplicated strings.
 EXTENSIONS = [
 Extension('VK_ANDROID_native_buffer', 5, 'ANDROID'),
-Extension('VK_KHR_16bit_storage', 1, False),
+Extension('VK_KHR_16bit_storage', 1, 'device->info.gen >= 8'),
 Extension('VK_KHR_bind_memory2',  1, True),
 Extension('VK_KHR_dedicated_allocation',  1, True),
 Extension('VK_KHR_descriptor_update_template',1, True),

Mesa (master): spirv: Calculate properly 16-bit vector sizes

2018-02-28 Thread Jason Ekstrand

Module: Mesa
Branch: master
Commit: 23ffb7c2d17f0268b209782a46e6cb838bd63585
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=23ffb7c2d17f0268b209782a46e6cb838bd63585

Author: Jose Maria Casanova Crespo 
Date:   Thu Feb 22 17:36:37 2018 +0100

spirv: Calculate properly 16-bit vector sizes

The range of 16-bit push constant loads was being calculated
incorrectly, using 4 bytes per element instead of 2 bytes.

v2: Use glsl_get_bit_size instead of if statement
(Jason Ekstrand)
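
The size rule behind this fix can be sketched in isolation. The function
below is a hypothetical stand-in that takes the bit size and element count
directly, rather than querying them from a GLSL type as the real code does.

```c
#include <assert.h>

/* Size in bytes of a vector with num_elements components of bit_size
 * bits each. Deriving the byte size from the bit size covers 16-, 32-
 * and 64-bit types with one expression. */
static unsigned
sketch_vector_block_size(unsigned bit_size, unsigned num_elements)
{
   assert(bit_size >= 8 && bit_size % 8 == 0);
   return num_elements * (bit_size / 8);
}
```

A 16-bit vec3 therefore occupies 6 bytes, where the previous fixed factor
of 4 would have reported 12.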

Reviewed-by: Jason Ekstrand 

---

 src/compiler/spirv/vtn_variables.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
index 9eb85c24e9..105b33a567 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -683,12 +683,9 @@ vtn_type_block_size(struct vtn_builder *b, struct vtn_type *type)
   if (cols > 1) {
  vtn_assert(type->stride > 0);
  return type->stride * cols;
-  } else if (base_type == GLSL_TYPE_DOUBLE ||
-base_type == GLSL_TYPE_UINT64 ||
-base_type == GLSL_TYPE_INT64) {
- return glsl_get_vector_elements(type->type) * 8;
   } else {
- return glsl_get_vector_elements(type->type) * 4;
+ unsigned type_size = glsl_get_bit_size(type->type) / 8;
+ return glsl_get_vector_elements(type->type) * type_size;
   }
}
 

Mesa (master): i965/fs: shuffle_32bit_load_result_to_16bit_data now skips components

2018-02-28 Thread Jason Ekstrand

Module: Mesa
Branch: master
Commit: 2dd94f462b0069fc3a20c9a93a9cfe97dd079837
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=2dd94f462b0069fc3a20c9a93a9cfe97dd079837

Author: Jose Maria Casanova Crespo 
Date:   Mon Feb 26 20:28:34 2018 +0100

i965/fs: shuffle_32bit_load_result_to_16bit_data now skips components

This helper, used to load 16-bit components from 32-bit reads, now allows
skipping components through the new first_component parameter. It skips
components until it reaches first_component, and then reads the number of
components passed to the function.

All previous uses of the helper are updated to pass 0 as first_component.
This allows reading 16-bit components when the first one is not 32-bit
aligned, enabling more uses of untyped_reads with 16-bit types.
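
The indexing described above can be sketched on plain arrays. This is an
illustration only: sketch_unpack_16bit is a hypothetical name, and it
assumes that the low half of each dword holds the even-numbered component,
which is how the (first_component + i) % 2 subscript is read here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `components` 16-bit values out of packed 32-bit words, skipping
 * the first `first_component` 16-bit slots. */
static void
sketch_unpack_16bit(const uint32_t *src, unsigned first_component,
                    unsigned components, uint16_t *dst)
{
   assert(src != NULL && dst != NULL);

   for (unsigned i = 0; i < components; i++) {
      unsigned idx = first_component + i;
      /* idx / 2 selects the dword, idx % 2 selects its low or high half. */
      dst[i] = (uint16_t)(src[idx / 2] >> (16 * (idx % 2)));
   }
}
```

Passing first_component = 0 reproduces the previous behaviour of the
helper; passing 1 starts at the high half of the first dword.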

v2: (Jason Ekstrand)
Change parameters order to first_component, num_components

Reviewed-by: Jason Ekstrand 

---

 src/intel/compiler/brw_fs.cpp | 2 +-
 src/intel/compiler/brw_fs.h   | 1 +
 src/intel/compiler/brw_fs_nir.cpp | 6 --
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 113f62c46c..244c6cda03 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -194,7 +194,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder &bld,
fs_reg dw = offset(vec4_result, bld, (const_offset & 0xf) / 4);
switch (type_sz(dst.type)) {
case 2:
-  shuffle_32bit_load_result_to_16bit_data(bld, dst, dw, 1);
+  shuffle_32bit_load_result_to_16bit_data(bld, dst, dw, 0, 1);
   bld.MOV(dst, subscript(dw, dst.type, (const_offset / 2) & 1));
   break;
case 4:
diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
index 76ad76e08b..38e9991df7 100644
--- a/src/intel/compiler/brw_fs.h
+++ b/src/intel/compiler/brw_fs.h
@@ -505,6 +505,7 @@ fs_reg shuffle_64bit_data_for_32bit_write(const brw::fs_builder &bld,
 void shuffle_32bit_load_result_to_16bit_data(const brw::fs_builder &bld,
  const fs_reg &dst,
  const fs_reg &src,
+ uint32_t first_component,
  uint32_t components);
 
 void shuffle_16bit_data_for_32bit_write(const brw::fs_builder &bld,
diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index d8300589a5..0d1ab5b01c 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -2316,7 +2316,7 @@ do_untyped_vector_read(const fs_builder &bld,
  shuffle_32bit_load_result_to_16bit_data(bld,
retype(dest, BRW_REGISTER_TYPE_W),
retype(read_result, BRW_REGISTER_TYPE_D),
-   num_components);
+   0, num_components);
   } else {
  assert(num_components == 1);
  /* scalar 16-bit are read using one byte_scattered_read message */
@@ -4912,6 +4912,7 @@ void
 shuffle_32bit_load_result_to_16bit_data(const fs_builder &bld,
 const fs_reg &dst,
 const fs_reg &src,
+uint32_t first_component,
 uint32_t components)
 {
assert(type_sz(src.type) == 4);
@@ -4926,7 +4927,8 @@ shuffle_32bit_load_result_to_16bit_data(const fs_builder &bld,
 
for (unsigned i = 0; i < components; i++) {
   const fs_reg component_i =
- subscript(offset(src, bld, i / 2), dst.type, i % 2);
+ subscript(offset(src, bld, (first_component + i) / 2), dst.type,
+   (first_component + i) % 2);
 
   bld.MOV(offset(tmp, bld, i % 2), component_i);
 

Mesa (master): isl/i965/fs: SSBO/UBO buffers need size padding if not multiple of 32-bit

2018-02-28 Thread Jason Ekstrand

Module: Mesa
Branch: master
Commit: 67d7dd594ecd6f15ba3d126a4bb92d4222c4168d
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=67d7dd594ecd6f15ba3d126a4bb92d4222c4168d

Author: Jose Maria Casanova Crespo 
Date:   Tue Jan 30 09:59:34 2018 +0100

isl/i965/fs: SSBO/UBO buffers need size padding if not multiple of 32-bit

The surfaces that back the GPU buffers have a bounds check that treats
any access to a partial dword as out-of-bounds. For example, buffers
holding 1 or 3 16-bit elements have size 2 or 6, and the last two bytes
would always be read as 0 or writes to them ignored.

The introduction of 16-bit types implies that we need to align the size
to 4-byte multiples so that partial dwords can be read/written.
Unconditionally adding 2 bytes to the size of buffers that are not a
multiple of 4 solves this issue for the general cases of UBO or SSBO.

But, when unsized arrays of 16-bit elements are used it is not possible
to know if the size was padded or not. To solve this issue the
implementation calculates the needed size of the buffer surfaces,
as suggested by Jason:

surface_size = isl_align(buffer_size, 4) +
   (isl_align(buffer_size, 4) - buffer_size)

So when the backend calculates buffer_size back from the surface size, it
updates the resinfo return value with:

buffer_size = (surface_size & ~3) - (surface_size & 3)
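
The two formulas above can be checked with a small round-trip sketch. The
sketch_* function names are hypothetical; only the arithmetic is taken
from the formulas in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of 4. */
static uint64_t
sketch_align4(uint64_t v)
{
   return (v + 3) & ~UINT64_C(3);
}

/* Encode: pad to a dword boundary and add the padding amount once more,
 * so that the two low bits of the result hold the padding (0..3). */
static uint64_t
sketch_surface_size(uint64_t buffer_size)
{
   uint64_t aligned = sketch_align4(buffer_size);
   return aligned + (aligned - buffer_size);
}

/* Decode: the aligned part minus the padding stored in the two low bits. */
static uint64_t
sketch_buffer_size(uint64_t surface_size)
{
   return (surface_size & ~UINT64_C(3)) - (surface_size & 3);
}
```

For a 6-byte buffer the surface size becomes 8 + 2 = 10, and decoding 10
gives 8 - 2 = 6 again. Sizes that are already multiples of 4 are unchanged.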

This buffer requirement is also exposed when robust buffer access is
enabled, so these buffer sizes are recommended to be multiples of 4.

v2: (Jason Ekstrand)
Move padding logic from anv to isl_surface_state.
Move calculation of the original size from spirv to the driver backend.
v3: (Jason Ekstrand)
Rename some variables and use a similar expression when calculating
the padding as when obtaining the original buffer size.
Avoid use of unnecessary component call at brw_fs_nir.
v4: (Jason Ekstrand)
Complete comment with buffer size calculation explanation in brw_fs_nir.

Reviewed-by: Jason Ekstrand 

---

 src/intel/compiler/brw_fs_nir.cpp | 31 ++-
 src/intel/isl/isl_surface_state.c | 22 +-
 src/intel/vulkan/anv_device.c | 11 +++
 3 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 8efec34cc9..d8300589a5 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -4290,7 +4290,36 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
   inst->mlen = 1;
   inst->size_written = 4 * REG_SIZE;
 
-  bld.MOV(retype(dest, ret_payload.type), component(ret_payload, 0));
+  /* SKL PRM, vol07, 3D Media GPGPU Engine, Bounds Checking and Faulting:
+   *
+   * "Out-of-bounds checking is always performed at a DWord granularity. If
+   * any part of the DWord is out-of-bounds then the whole DWord is
+   * considered out-of-bounds."
+   *
+   * This implies that types with size smaller than 4-bytes need to be
+   * padded if they don't complete the last dword of the buffer. But as we
+   * need to maintain the original size we need to reverse the padding
+   * calculation to return the correct size to know the number of elements
+   * of an unsized array. As we stored in the last two bits of the surface
+   * size the needed padding for the buffer, we calculate here the
+   * original buffer_size reversing the surface_size calculation:
+   *
+   * surface_size = isl_align(buffer_size, 4) +
+   *(isl_align(buffer_size) - buffer_size)
+   *
+   * buffer_size = surface_size & ~3 - surface_size & 3
+   */
+
+  fs_reg size_aligned4 = ubld.vgrf(BRW_REGISTER_TYPE_UD);
+  fs_reg size_padding = ubld.vgrf(BRW_REGISTER_TYPE_UD);
+  fs_reg buffer_size = ubld.vgrf(BRW_REGISTER_TYPE_UD);
+
+  ubld.AND(size_padding, ret_payload, brw_imm_ud(3));
+  ubld.AND(size_aligned4, ret_payload, brw_imm_ud(~3));
+  ubld.ADD(buffer_size, size_aligned4, negate(size_padding));
+
+  bld.MOV(retype(dest, ret_payload.type), component(buffer_size, 0));
+
   brw_mark_surface_used(prog_data, index);
   break;
}
diff --git a/src/intel/isl/isl_surface_state.c b/src/intel/isl/isl_surface_state.c
index bfb27fa4a4..c205b3d2c0 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -673,7 +673,27 @@ void
 isl_genX(buffer_fill_state_s)(void *state,
   const struct isl_buffer_fill_state_info *restrict info)
 {
-   uint32_t num_elements = info->size / info->stride;
+   uint64_t buffer_size = info->size;
+
+   /* Uniform and Storage buffers need to have surface size not less that the
+* aligned 32-bit size of the buffer. To calculate the array lenght on
+* unsized arrays in StorageBuffer the last 2 bits store the padding size
+* added to the surface, so we can calculate latter

Mesa (master): i965/fs: Support 16-bit do_read_vector with VK_KHR_relaxed_block_layout

2018-02-28 Thread Jason Ekstrand

Module: Mesa
Branch: master
Commit: 8dd8be0323bbc207631a39f43cff7b898af4a55a
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=8dd8be0323bbc207631a39f43cff7b898af4a55a

Author: Jose Maria Casanova Crespo 
Date:   Thu Feb  1 00:05:11 2018 +0100

i965/fs: Support 16-bit do_read_vector with VK_KHR_relaxed_block_layout

16-bit load_ubo/ssbo operations that call do_untyped_vector_read don't
guarantee that offsets are multiples of 4 bytes, as required by the
untyped_read message. This happens, for example, with f16mat3x3 when
VK_KHR_relaxed_block_layout is enabled.

Vector reads with non-constant offsets are implemented with multiple
byte_scattered_read messages, which do not require 32-bit aligned offsets.

Now for all constant offsets we can use the untyped_read_surface message.
For constant offsets not aligned to 32 bits, we calculate a 32-bit aligned
start offset and use the shuffle_32bit_load_result_to_16bit_data
function with the first_component parameter to skip copying the unneeded
component.
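
The constant-offset arithmetic described above can be sketched as follows.
The struct and function names are hypothetical; the computation follows
the start/end description in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one aligned read. */
struct sketch_read_plan {
   uint32_t start;            /* 32-bit aligned start offset in bytes */
   unsigned num_dwords;       /* number of 32-bit components to read */
   unsigned first_component;  /* 16-bit slots to skip after the read */
};

static struct sketch_read_plan
sketch_plan_const_read(uint32_t offset, unsigned num_components,
                       unsigned type_size)
{
   struct sketch_read_plan plan;
   uint32_t end = offset + num_components * type_size;

   assert(type_size == 1 || type_size == 2);

   plan.start = offset & ~UINT32_C(3);   /* round start down to a dword */
   end = (end + 3) & ~UINT32_C(3);       /* round end up to a dword */
   assert(end - plan.start <= 16);       /* at most 4 dwords per read */

   plan.num_dwords = (end - plan.start) / 4;
   plan.first_component = (offset & 3) / type_size;
   return plan;
}
```

A 16-bit vec3 at constant offset 6 is read as two dwords starting at
offset 4, and the first 16-bit slot of the result is skipped.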

v2: (Jason Ekstrand)
Use untyped_read_surface messages whenever we have constant offsets.

v3: (Jason Ekstrand)
Simplify loop for reads with non constant offsets.
Use end - start to calculate the number of 32-bit components to read with
constant offsets.

Reviewed-by: Jason Ekstrand 

---

 src/intel/compiler/brw_fs_nir.cpp | 51 ---
 1 file changed, 37 insertions(+), 14 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 0d1ab5b01c..3f077b3c91 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -2304,28 +2304,51 @@ do_untyped_vector_read(const fs_builder &bld,
 {
if (type_sz(dest.type) <= 2) {
   assert(dest.stride == 1);
+  boolean is_const_offset = offset_reg.file == BRW_IMMEDIATE_VALUE;
 
-  if (num_components > 1) {
- /* Pairs of 16-bit components can be read with untyped read, for 16-bit
-  * vec3 4th component is ignored.
+  if (is_const_offset) {
+ uint32_t start = offset_reg.ud & ~3;
+ uint32_t end = offset_reg.ud + num_components * type_sz(dest.type);
+ end = ALIGN(end, 4);
+ assert (end - start <= 16);
+
+ /* At this point we have 16-bit component/s that have constant
+  * offset aligned to 4-bytes that can be read with untyped_reads.
+  * untyped_read message requires 32-bit aligned offsets.
   */
+ unsigned first_component = (offset_reg.ud & 3) / type_sz(dest.type);
+ unsigned num_components_32bit = (end - start) / 4;
+
  fs_reg read_result =
-emit_untyped_read(bld, surf_index, offset_reg,
-  1 /* dims */, DIV_ROUND_UP(num_components, 2),
+emit_untyped_read(bld, surf_index, brw_imm_ud(start),
+  1 /* dims */,
+  num_components_32bit,
   BRW_PREDICATE_NONE);
  shuffle_32bit_load_result_to_16bit_data(bld,
retype(dest, BRW_REGISTER_TYPE_W),
retype(read_result, BRW_REGISTER_TYPE_D),
-   0, num_components);
+   first_component, num_components);
   } else {
- assert(num_components == 1);
- /* scalar 16-bit are read using one byte_scattered_read message */
- fs_reg read_result =
-emit_byte_scattered_read(bld, surf_index, offset_reg,
- 1 /* dims */, 1,
- type_sz(dest.type) * 8 /* bit_size */,
- BRW_PREDICATE_NONE);
- bld.MOV(dest, subscript(read_result, dest.type, 0));
+ fs_reg read_offset = bld.vgrf(BRW_REGISTER_TYPE_UD);
+ for (unsigned i = 0; i < num_components; i++) {
+if (i == 0) {
+   bld.MOV(read_offset, offset_reg);
+} else {
+   bld.ADD(read_offset, offset_reg,
+   brw_imm_ud(i * type_sz(dest.type)));
+}
+/* Non constant offsets are not guaranteed to be aligned 32-bits
+ * so they are read using one byte_scattered_read message
+ * for each component.
+ */
+fs_reg read_result =
+   emit_byte_scattered_read(bld, surf_index, read_offset,
+1 /* dims */, 1,
+type_sz(dest.type) * 8 /* bit_size */,
+BRW_PREDICATE_NONE);
+bld.MOV(offset(dest, bld, i),
+subscript (read_result, dest.type, 0));
+ }
   }
} else if (type_sz(dest.type) == 4) {
   fs_reg read_result = emit_untyped_read(bld, surf_index, offset_reg,

Mesa (master): spirv/i965/anv: Relax push constant offset assertions being 32-bit aligned

2018-02-28 Thread Jason Ekstrand

Module: Mesa
Branch: master
Commit: 02266f9ba1990eac655ae98b5febf298cc2d33d8
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=02266f9ba1990eac655ae98b5febf298cc2d33d8

Author: Jose Maria Casanova Crespo 
Date:   Tue Feb 20 10:28:41 2018 +0100

spirv/i965/anv: Relax push constant offset assertions being 32-bit aligned

The introduction of 16-bit types with VK_KHR_16bit_storage implies that
push constant offsets can be multiples of 2 bytes. Some assertions are
updated so that offsets only need to be a multiple of the size of the base
type, but in some cases we cannot assume even that, as doubles aren't
always aligned to 8 bytes.

For 16-bit types, the push constant offset takes into account the
internal offset within the 32-bit uniform bucket, adding 2 bytes when we
access elements that are not 32-bit aligned. In all 32-bit aligned cases
it is just 0.
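
The offset split described above can be sketched as follows. The struct
and function are hypothetical; `base` models the intrinsic's base byte
offset and `const_offset` the constant offset source.

```c
#include <assert.h>

/* Hypothetical reference to a push constant: a 32-bit uniform slot plus
 * a byte offset relative to that slot. */
struct sketch_uniform_ref {
   unsigned slot;         /* index of the 32-bit uniform bucket */
   unsigned byte_offset;  /* byte offset added on top of the slot */
};

static struct sketch_uniform_ref
sketch_make_uniform_ref(unsigned base, unsigned const_offset,
                        unsigned type_size)
{
   struct sketch_uniform_ref ref;

   /* Offsets only need to be aligned to the type size, not to 4 bytes. */
   assert(base % 4 == 0 || base % type_size == 0);
   assert(const_offset % type_size == 0);

   ref.slot = base / 4;
   /* The remainder of the base inside its bucket is 0 or, for 16-bit
    * types that are not 32-bit aligned, 2. */
   ref.byte_offset = const_offset + base % 4;
   return ref;
}
```

A 16-bit value at byte 6 lands in slot 1 with an internal offset of 2,
while 32-bit aligned accesses keep an internal offset of 0.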

v2: Assert offsets to be aligned to the dest type size. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand 

---

 src/compiler/spirv/vtn_variables.c  |  2 --
 src/intel/compiler/brw_fs_nir.cpp   | 15 ++-
 src/intel/vulkan/anv_nir_lower_push_constants.c |  2 --
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
index 105b33a567..7e8a090add 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -753,8 +753,6 @@ _vtn_load_store_tail(struct vtn_builder *b, nir_intrinsic_op op, bool load,
}
 
if (op == nir_intrinsic_load_push_constant) {
-  vtn_assert(access_offset % 4 == 0);
-
   nir_intrinsic_set_base(instr, access_offset);
   nir_intrinsic_set_range(instr, access_size);
}
diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 73f424cf10..47247875e8 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -3882,16 +3882,21 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
   break;
 
case nir_intrinsic_load_uniform: {
-  /* Offsets are in bytes but they should always be multiples of 4 */
-  assert(instr->const_index[0] % 4 == 0);
+  /* Offsets are in bytes but they should always aligned to
+   * the type size
+   */
+  assert(instr->const_index[0] % 4 == 0 ||
+ instr->const_index[0] % type_sz(dest.type) == 0);
 
   fs_reg src(UNIFORM, instr->const_index[0] / 4, dest.type);
 
   nir_const_value *const_offset = nir_src_as_const_value(instr->src[0]);
   if (const_offset) {
- /* Offsets are in bytes but they should always be multiples of 4 */
- assert(const_offset->u32[0] % 4 == 0);
- src.offset = const_offset->u32[0];
+ assert(const_offset->u32[0] % type_sz(dest.type) == 0);
+ /* For 16-bit types we add the module of the const_index[0]
+  * offset to access to not 32-bit aligned element
+  */
+ src.offset = const_offset->u32[0] + instr->const_index[0] % 4;
 
  for (unsigned j = 0; j < instr->num_components; j++) {
 bld.MOV(offset(dest, bld, j), offset(src, bld, j));
diff --git a/src/intel/vulkan/anv_nir_lower_push_constants.c b/src/intel/vulkan/anv_nir_lower_push_constants.c
index b66552825b..ad60d0c824 100644
--- a/src/intel/vulkan/anv_nir_lower_push_constants.c
+++ b/src/intel/vulkan/anv_nir_lower_push_constants.c
@@ -41,8 +41,6 @@ anv_nir_lower_push_constants(nir_shader *shader)
 if (intrin->intrinsic != nir_intrinsic_load_push_constant)
continue;
 
-assert(intrin->const_index[0] % 4 == 0);
-
 /* We just turn them into uniform loads */
 intrin->intrinsic = nir_intrinsic_load_uniform;
  }

Mesa (master): anv: Enable VK_KHR_16bit_storage for PushConstant

2018-02-28 Thread Jason Ekstrand

Module: Mesa
Branch: master
Commit: ba642ee3ee36d7aefd21e8b8d4da0c5c24ec0ec8
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=ba642ee3ee36d7aefd21e8b8d4da0c5c24ec0ec8

Author: Jose Maria Casanova Crespo 
Date:   Fri Feb 23 01:15:13 2018 +0100

anv: Enable VK_KHR_16bit_storage for PushConstant

Enables storagePushConstant16 features of VK_KHR_16bit_storage for Gen8+.

Reviewed-by: Jason Ekstrand 

---

 src/intel/vulkan/anv_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index e42b05d4fa..78cd0da179 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -797,7 +797,7 @@ void anv_GetPhysicalDeviceFeatures2KHR(
 
  features->storageBuffer16BitAccess = pdevice->info.gen >= 8;
  features->uniformAndStorageBuffer16BitAccess = pdevice->info.gen >= 8;
- features->storagePushConstant16 = false;
+ features->storagePushConstant16 = pdevice->info.gen >= 8;
  features->storageInputOutput16 = false;
  break;
   }

Mesa (master): vbo: Remove vbo_save_vertex_list::attrtype.

2018-02-28 Thread Mathias Fröhlich

Module: Mesa
Branch: master
Commit: 95b4be4f29fab106cee715dd96657be044e54654
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=95b4be4f29fab106cee715dd96657be044e54654

Author: Mathias Fröhlich 
Date:   Sun Feb 25 18:01:07 2018 +0100

vbo: Remove vbo_save_vertex_list::attrtype.

It is no longer used on replay. Move the last use in display list
compilation to the original array in the display list compiler.

Reviewed-by: Brian Paul 
Signed-off-by: Mathias Fröhlich 

---

 src/mesa/vbo/vbo_save.h | 1 -
 src/mesa/vbo/vbo_save_api.c | 4 +---
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index 4dd886eb12..3ccbfac7e2 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -62,7 +62,6 @@ struct vbo_save_copied_vtx {
  */
 struct vbo_save_vertex_list {
GLubyte attrsz[VBO_ATTRIB_MAX];
-   GLenum16 attrtype[VBO_ATTRIB_MAX];
GLuint vertex_size;  /**< size in GLfloats */
struct gl_vertex_array_object *VAO[VP_MODE_MAX];
 
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index 95593fc0a7..2263276a18 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -545,8 +545,6 @@ compile_vertex_list(struct gl_context *ctx)
 */
STATIC_ASSERT(sizeof(node->attrsz) == sizeof(save->attrsz));
memcpy(node->attrsz, save->attrsz, sizeof(node->attrsz));
-   STATIC_ASSERT(sizeof(node->attrtype) == sizeof(save->attrtype));
-   memcpy(node->attrtype, save->attrtype, sizeof(node->attrtype));
node->vertex_size = save->vertex_size;
node->buffer_offset =
   (save->buffer_map - save->vertex_store->buffer_map) * sizeof(GLfloat);
@@ -582,7 +580,7 @@ compile_vertex_list(struct gl_context *ctx)
   update_vao(ctx, vpm, &save->VAO[vpm],
  save->vertex_store->bufferobj, buffer_offset,
  node->vertex_size*sizeof(GLfloat), save->enabled,
- node->attrsz, node->attrtype, offsets);
+ node->attrsz, save->attrtype, offsets);
   /* Reference the vao in the dlist */
   node->VAO[vpm] = NULL;
   _mesa_reference_vao(ctx, &node->VAO[vpm], save->VAO[vpm]);

Mesa (master): vbo: Remove vbo_save_vertex_list::vertex_size.

2018-02-28 Thread Mathias Fröhlich

Module: Mesa
Branch: master
Commit: 4c232dc721645c147c864f102268a52c9536096d
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=4c232dc721645c147c864f102268a52c9536096d

Author: Mathias Fröhlich 
Date:   Sun Feb 25 18:01:07 2018 +0100

vbo: Remove vbo_save_vertex_list::vertex_size.

As before, use local variables from compile_vertex_list instead.
Remove vertex_size from struct vbo_save_vertex_list.

Reviewed-by: Brian Paul 
Signed-off-by: Mathias Fröhlich 

---

 src/mesa/vbo/vbo_save.h |  1 -
 src/mesa/vbo/vbo_save_api.c | 14 ++
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index a9834d6e6d..b158c07795 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -61,7 +61,6 @@ struct vbo_save_copied_vtx {
  * compiled using the fallback opcode mechanism provided by dlist.c.
  */
 struct vbo_save_vertex_list {
-   GLuint vertex_size;  /**< size in GLfloats */
struct gl_vertex_array_object *VAO[VP_MODE_MAX];
 
/* Copy of the final vertex from node->vertex_store->bufferobj.
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index e6cd04281e..47ee355e72 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -543,7 +543,6 @@ compile_vertex_list(struct gl_context *ctx)
/* Duplicate our template, increment refcounts to the storage structs:
 */
const GLsizei stride = save->vertex_size*sizeof(GLfloat);
-   node->vertex_size = save->vertex_size;
GLintptr buffer_offset =
(save->buffer_map - save->vertex_store->buffer_map) * sizeof(GLfloat);
GLuint start_offset = 0;
@@ -579,9 +578,8 @@ compile_vertex_list(struct gl_context *ctx)
for (gl_vertex_processing_mode vpm = VP_MODE_FF; vpm < VP_MODE_MAX; ++vpm) {
   /* create or reuse the vao */
   update_vao(ctx, vpm, &save->VAO[vpm],
- save->vertex_store->bufferobj, buffer_offset,
- node->vertex_size*sizeof(GLfloat), save->enabled,
- save->attrsz, save->attrtype, offsets);
+ save->vertex_store->bufferobj, buffer_offset, stride,
+ save->enabled, save->attrsz, save->attrtype, offsets);
   /* Reference the vao in the dlist */
   node->VAO[vpm] = NULL;
   _mesa_reference_vao(ctx, &node->VAO[vpm], save->VAO[vpm]);
@@ -593,7 +591,7 @@ compile_vertex_list(struct gl_context *ctx)
   node->current_data = NULL;
}
else {
-  GLuint current_size = node->vertex_size - save->attrsz[0];
+  GLuint current_size = save->vertex_size - save->attrsz[0];
   node->current_data = NULL;
 
   if (current_size) {
@@ -604,8 +602,7 @@ compile_vertex_list(struct gl_context *ctx)
 unsigned vertex_offset = 0;
 
 if (node->vertex_count)
-   vertex_offset =
-  (node->vertex_count - 1) * node->vertex_size * sizeof(GLfloat);
+   vertex_offset = (node->vertex_count - 1) * stride;
 
 memcpy(node->current_data, buffer + vertex_offset + attr_offset,
current_size * sizeof(GLfloat));
@@ -1817,11 +1814,12 @@ vbo_print_vertex_list(struct gl_context *ctx, void *data, FILE *f)
struct vbo_save_vertex_list *node = (struct vbo_save_vertex_list *) data;
GLuint i;
struct gl_buffer_object *buffer = node->VAO[0]->BufferBinding[0].BufferObj;
+   const GLuint vertex_size = _vbo_save_get_stride(node)/sizeof(GLfloat);
(void) ctx;
 
fprintf(f, "VBO-VERTEX-LIST, %u vertices, %d primitives, %d vertsize, "
"buffer %p\n",
-   node->vertex_count, node->prim_count, node->vertex_size,
+   node->vertex_count, node->prim_count, vertex_size,
buffer);
 
for (i = 0; i < node->prim_count; i++) {


Mesa (master): vbo: Implement current values update in terms of the VAO.

2018-02-28 Thread Mathias Fröhlich

Module: Mesa
Branch: master
Commit: 6e410270ee73f21c4363c8d9cc8f4eef4bf949b1
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=6e410270ee73f21c4363c8d9cc8f4eef4bf949b1

Author: Mathias Fröhlich 
Date:   Sun Feb 25 18:01:07 2018 +0100

vbo: Implement current values update in terms of the VAO.

Use the information already present in the VAO to update the current values
after display list replay. Set GL_OUT_OF_MEMORY on allocation failure
for the current value update storage.
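
As a rough illustration of the allocation path described above, the sketch
below copies the non-position attributes of the last vertex into freshly
allocated storage and reports allocation failure to the caller. It is not the
Mesa code: copy_current_values and its parameters are hypothetical names, the
buffer is assumed to be tightly packed floats holding at least one vertex, and
the out_of_memory flag stands in for the GL_OUT_OF_MEMORY error.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not Mesa code: copy every attribute except the
 * position from the last vertex of a tightly packed float buffer.
 * vertex_size and pos_size are counted in floats, and the buffer is
 * assumed to hold at least one vertex. Returns a malloc'ed array the
 * caller must free, or NULL when there is nothing to copy or the
 * allocation failed; *out_of_memory tells the two cases apart.
 */
static float *
copy_current_values(const float *vertices, unsigned vertex_count,
                    unsigned vertex_size, unsigned pos_size,
                    int *out_of_memory)
{
   assert(out_of_memory != NULL);
   *out_of_memory = 0;

   if (vertex_size <= pos_size)
      return NULL; /* position only: no current values to update */

   const unsigned current_size = vertex_size - pos_size;
   float *current = malloc(current_size * sizeof(float));
   if (!current) {
      *out_of_memory = 1; /* stand-in for raising GL_OUT_OF_MEMORY */
      return NULL;
   }

   /* Step to the last vertex, then skip its position components. */
   const float *last = vertices;
   if (vertex_count)
      last += (size_t)(vertex_count - 1) * vertex_size;
   memcpy(current, last + pos_size, current_size * sizeof(float));
   return current;
}
```

With two vertices of five floats each and a three-component position, the
helper returns the last two floats of the second vertex.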

Reviewed-by: Brian Paul 
Signed-off-by: Mathias Fröhlich 

---

 src/mesa/vbo/vbo_save.h  |  1 -
 src/mesa/vbo/vbo_save_api.c  | 14 +++
 src/mesa/vbo/vbo_save_draw.c | 94 +++-
 3 files changed, 47 insertions(+), 62 deletions(-)

diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index 44dc8c201f..00f18363b7 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -72,7 +72,6 @@ struct vbo_save_vertex_list {
 * map/unmap of the VBO when updating GL current data.
 */
fi_type *current_data;
-   GLuint current_size;
 
GLuint buffer_offset;/**< in bytes */
GLuint start_vertex; /**< first vertex used by any primitive */
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index dc248934f7..db9a3fbdfa 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -595,18 +595,14 @@ compile_vertex_list(struct gl_context *ctx)
node->prim_store->refcount++;
 
if (node->prims[0].no_current_update) {
-  node->current_size = 0;
   node->current_data = NULL;
}
else {
-  node->current_size = node->vertex_size - node->attrsz[0];
+  GLuint current_size = node->vertex_size - node->attrsz[0];
   node->current_data = NULL;
 
-  if (node->current_size) {
- /* If the malloc fails, we just pull the data out of the VBO
-  * later instead.
-  */
- node->current_data = malloc(node->current_size * sizeof(GLfloat));
+  if (current_size) {
+ node->current_data = malloc(current_size * sizeof(GLfloat));
  if (node->current_data) {
 const char *buffer = (const char *) save->vertex_store->buffer_map;
 unsigned attr_offset = node->attrsz[0] * sizeof(GLfloat);
@@ -618,7 +614,9 @@ compile_vertex_list(struct gl_context *ctx)
 
 memcpy(node->current_data,
buffer + node->buffer_offset + vertex_offset + attr_offset,
-   node->current_size * sizeof(GLfloat));
+   current_size * sizeof(GLfloat));
+ } else {
+_mesa_error(ctx, GL_OUT_OF_MEMORY, "Current value allocation");
  }
   }
}
diff --git a/src/mesa/vbo/vbo_save_draw.c b/src/mesa/vbo/vbo_save_draw.c
index 0358ecd2f9..b8b6b872c0 100644
--- a/src/mesa/vbo/vbo_save_draw.c
+++ b/src/mesa/vbo/vbo_save_draw.c
@@ -42,72 +42,60 @@
 #include "vbo_private.h"
 
 
-/**
- * After playback, copy everything but the position from the
- * last vertex to the saved state
- */
 static void
-playback_copy_to_current(struct gl_context *ctx,
- const struct vbo_save_vertex_list *node)
+copy_vao(struct gl_context *ctx, const struct gl_vertex_array_object *vao,
+ GLbitfield mask, GLbitfield state, int shift, fi_type **data)
 {
struct vbo_context *vbo = vbo_context(ctx);
-   fi_type vertex[VBO_ATTRIB_MAX * 4];
-   fi_type *data;
-   GLbitfield64 mask;
-
-   if (node->current_size == 0)
-  return;
-
-   if (node->current_data) {
-  data = node->current_data;
-   }
-   else {
-  /* Position of last vertex */
-  const GLuint pos = node->vertex_count > 0 ? node->vertex_count - 1 : 0;
-  /* Offset to last vertex in the vertex buffer */
-  const GLuint offset = node->buffer_offset
- + pos * node->vertex_size * sizeof(GLfloat);
-
-  data = vertex;
-
-  ctx->Driver.GetBufferSubData(ctx, offset,
-   node->vertex_size * sizeof(GLfloat),
-   data, node->vertex_store->bufferobj);
-
-  data += node->attrsz[0]; /* skip vertex position */
-   }
 
-   mask = node->enabled & (~BITFIELD64_BIT(VBO_ATTRIB_POS));
+   mask &= vao->_Enabled;
while (mask) {
-  const int i = u_bit_scan64(&mask);
-  fi_type *current = (fi_type *)vbo->currval[i].Ptr;
+  const int i = u_bit_scan(&mask);
+  const struct gl_array_attributes *attrib = &vao->VertexAttrib[i];
+  struct gl_vertex_array *currval = &vbo->currval[shift + i];
+  const GLubyte size = attrib->Size;
+  const GLenum16 type = attrib->Type;
   fi_type tmp[4];
-  assert(node->attrsz[i]);
 
-  COPY_CLEAN_4V_TYPE_AS_UNION(tmp,
-  node->attrsz[i],
-  data,
-  node->attrtype[i]);
+  COPY_CLEAN_4V_TYPE_AS_UNION(tmp, size, *data, type);
 
-

Mesa (master): vbo: Remove vbo_save_vertex_list::start_vertex.

2018-02-28 Thread Mathias Fröhlich

Module: Mesa
Branch: master
Commit: bfa8d8e5bf5a2f2f5b727acde0b4dad34d0c4210
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=bfa8d8e5bf5a2f2f5b727acde0b4dad34d0c4210

Author: Mathias Fröhlich 
Date:   Sun Feb 25 18:01:07 2018 +0100

vbo: Remove vbo_save_vertex_list::start_vertex.

Replace last use on replay with _vbo_save_get_{min,max}_index. Apart from
that it is not used anymore.

Reviewed-by: Brian Paul 
Signed-off-by: Mathias Fröhlich 

---

 src/mesa/vbo/vbo_save.h  | 1 -
 src/mesa/vbo/vbo_save_api.c  | 3 ---
 src/mesa/vbo/vbo_save_draw.c | 4 ++--
 3 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index cbf73892ee..51ca3c2614 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -71,7 +71,6 @@ struct vbo_save_vertex_list {
fi_type *current_data;
 
GLuint buffer_offset;/**< in bytes */
-   GLuint start_vertex; /**< first vertex used by any primitive */
GLuint vertex_count; /**< number of vertices in this list */
GLuint wrap_count;  /* number of copied vertices at start */
 
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index e21315120d..e8d027f15c 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -642,9 +642,6 @@ compile_vertex_list(struct gl_context *ctx)
   for (unsigned i = 0; i < node->prim_count; i++) {
  node->prims[i].start += start_offset;
   }
-  node->start_vertex = start_offset;
-   } else {
-  node->start_vertex = 0;
}
 
/* Deal with GL_COMPILE_AND_EXECUTE:
diff --git a/src/mesa/vbo/vbo_save_draw.c b/src/mesa/vbo/vbo_save_draw.c
index b8b6b872c0..137fb6e3fd 100644
--- a/src/mesa/vbo/vbo_save_draw.c
+++ b/src/mesa/vbo/vbo_save_draw.c
@@ -213,8 +213,8 @@ vbo_save_playback_vertex_list(struct gl_context *ctx, void *data)
   assert(ctx->NewState == 0);
 
   if (node->vertex_count > 0) {
- GLuint min_index = node->start_vertex;
- GLuint max_index = min_index + node->vertex_count - 1;
+ GLuint min_index = _vbo_save_get_min_index(node);
+ GLuint max_index = _vbo_save_get_max_index(node);
  vbo->draw_prims(ctx,
  node->prims,
  node->prim_count,


Mesa (master): vbo: Remove vbo_save_vertex_list::buffer_offset.

2018-02-28 Thread Mathias Fröhlich

Module: Mesa
Branch: master
Commit: 478a9bc7bb6870993e2a8df97b2dab1d4e45a723
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=478a9bc7bb6870993e2a8df97b2dab1d4e45a723

Author: Mathias Fröhlich 
Date:   Sun Feb 25 18:01:07 2018 +0100

vbo: Remove vbo_save_vertex_list::buffer_offset.

The buffer_offset is used in aligned_vertex_buffer_offset.
But now that most of these decisions are done in compile_vertex_list
we can work on local variables instead of struct members in the
display list code. Clean that up and remove buffer_offset.
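
The alignment decision now made on local variables can be sketched in
isolation as follows. This is only an illustration under stated assumptions:
struct offset_split and split_buffer_offset are hypothetical names that do not
exist in Mesa, and the condition mirrors the one introduced in
compile_vertex_list by this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type, not part of Mesa. */
struct offset_split {
   ptrdiff_t buffer_offset; /* byte offset kept on the buffer binding */
   unsigned start_offset;   /* vertices to add to each primitive start */
};

/*
 * If the byte offset is a positive exact multiple of the stride, express
 * it as a count of whole vertices and bind the buffer at offset zero.
 * Otherwise keep the byte offset and leave the primitive starts alone.
 */
static struct offset_split
split_buffer_offset(ptrdiff_t buffer_offset, int stride)
{
   assert(buffer_offset >= 0);
   struct offset_split s = { buffer_offset, 0 };

   if (0 < buffer_offset && 0 < stride && buffer_offset % stride == 0) {
      s.start_offset = (unsigned)(buffer_offset / stride);
      s.buffer_offset = 0;
   }
   return s;
}
```

For a stride of 32 bytes, a byte offset of 96 becomes a start offset of 3
vertices with a zero buffer offset, while a byte offset of 100 is kept as is.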

Reviewed-by: Brian Paul 
Signed-off-by: Mathias Fröhlich 

---

 src/mesa/vbo/vbo_save.h | 14 --
 src/mesa/vbo/vbo_save_api.c | 28 +---
 2 files changed, 13 insertions(+), 29 deletions(-)

diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index 51ca3c2614..a9834d6e6d 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -70,7 +70,6 @@ struct vbo_save_vertex_list {
 */
fi_type *current_data;
 
-   GLuint buffer_offset;/**< in bytes */
GLuint vertex_count; /**< number of vertices in this list */
GLuint wrap_count;  /* number of copied vertices at start */
 
@@ -82,19 +81,6 @@ struct vbo_save_vertex_list {
 
 
 /**
- * Is the vertex list's buffer offset an exact multiple of the
- * vertex size (in bytes)?  This is used to check for a vertex array /
- * drawing optimization.
- */
-static inline bool
-aligned_vertex_buffer_offset(const struct vbo_save_vertex_list *node)
-{
-   unsigned vertex_size = node->vertex_size * sizeof(GLfloat); /* in bytes */
-   return vertex_size != 0 && node->buffer_offset % vertex_size == 0;
-}
-
-
-/**
  * Return the stride in bytes of the display list node.
  */
 static inline GLsizei
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index e8d027f15c..e6cd04281e 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -527,7 +527,6 @@ compile_vertex_list(struct gl_context *ctx)
 {
struct vbo_save_context *save = &vbo_context(ctx)->save;
struct vbo_save_vertex_list *node;
-   GLintptr buffer_offset = 0;
 
/* Allocate space for this structure in the display list currently
 * being compiled.
@@ -543,10 +542,12 @@ compile_vertex_list(struct gl_context *ctx)
 
/* Duplicate our template, increment refcounts to the storage structs:
 */
+   const GLsizei stride = save->vertex_size*sizeof(GLfloat);
node->vertex_size = save->vertex_size;
-   node->buffer_offset =
-  (save->buffer_map - save->vertex_store->buffer_map) * sizeof(GLfloat);
-   if (aligned_vertex_buffer_offset(node)) {
+   GLintptr buffer_offset =
+   (save->buffer_map - save->vertex_store->buffer_map) * sizeof(GLfloat);
+   GLuint start_offset = 0;
+   if (0 < buffer_offset && 0 < stride && buffer_offset % stride == 0) {
   /* The vertex size is an exact multiple of the buffer offset.
* This means that we can use zero-based vertex attribute pointers
* and specify the start of the primitive with the _mesa_prim::start
@@ -555,9 +556,11 @@ compile_vertex_list(struct gl_context *ctx)
* changes in drivers.  In particular, the Gallium CSO module will
* filter out redundant vertex buffer changes.
*/
+  /* We cannot immediately update the primitives as some methods below
+   * still need the uncorrected start vertices
+   */
+  start_offset = buffer_offset/stride;
   buffer_offset = 0;
-   } else {
-  buffer_offset = node->buffer_offset;
}
GLuint offsets[VBO_ATTRIB_MAX];
for (unsigned i = 0, offset = 0; i < VBO_ATTRIB_MAX; ++i) {
@@ -596,7 +599,7 @@ compile_vertex_list(struct gl_context *ctx)
   if (current_size) {
  node->current_data = malloc(current_size * sizeof(GLfloat));
  if (node->current_data) {
-const char *buffer = (const char *) save->vertex_store->buffer_map;
+const char *buffer = (const char *)save->buffer_map;
 unsigned attr_offset = save->attrsz[0] * sizeof(GLfloat);
 unsigned vertex_offset = 0;
 
@@ -604,8 +607,7 @@ compile_vertex_list(struct gl_context *ctx)
vertex_offset =
   (node->vertex_count - 1) * node->vertex_size * sizeof(GLfloat);
 
-memcpy(node->current_data,
-   buffer + node->buffer_offset + vertex_offset + attr_offset,
+memcpy(node->current_data, buffer + vertex_offset + attr_offset,
current_size * sizeof(GLfloat));
  } else {
 _mesa_error(ctx, GL_OUT_OF_MEMORY, "Current value allocation");
@@ -636,12 +638,8 @@ compile_vertex_list(struct gl_context *ctx)
 * On the other hand the _vbo_loopback_vertex_list call below needs the
 * primitves to be corrected already.
 */
-   if (aligned_vertex_buffer_offset(node)) {
-  const unsigned start_offset =
-

Mesa (master): vbo: Remove vbo_save_vertex_list::attrsz.

2018-02-28 Thread Mathias Fröhlich

Module: Mesa
Branch: master
Commit: 6dd3e98c213f8c82a934c49eb369e88f5a648f19
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=6dd3e98c213f8c82a934c49eb369e88f5a648f19

Author: Mathias Fröhlich 
Date:   Sun Feb 25 18:01:07 2018 +0100

vbo: Remove vbo_save_vertex_list::attrsz.

Is not used anymore on replay, move the last use in display list
compilation to the original array in the display list compiler.

Reviewed-by: Brian Paul 
Signed-off-by: Mathias Fröhlich 

---

 src/mesa/vbo/vbo_save.h |  1 -
 src/mesa/vbo/vbo_save_api.c | 10 --
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index 3ccbfac7e2..cbf73892ee 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -61,7 +61,6 @@ struct vbo_save_copied_vtx {
  * compiled using the fallback opcode mechanism provided by dlist.c.
  */
 struct vbo_save_vertex_list {
-   GLubyte attrsz[VBO_ATTRIB_MAX];
GLuint vertex_size;  /**< size in GLfloats */
struct gl_vertex_array_object *VAO[VP_MODE_MAX];
 
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index 2263276a18..e21315120d 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -543,8 +543,6 @@ compile_vertex_list(struct gl_context *ctx)
 
/* Duplicate our template, increment refcounts to the storage structs:
 */
-   STATIC_ASSERT(sizeof(node->attrsz) == sizeof(save->attrsz));
-   memcpy(node->attrsz, save->attrsz, sizeof(node->attrsz));
node->vertex_size = save->vertex_size;
node->buffer_offset =
   (save->buffer_map - save->vertex_store->buffer_map) * sizeof(GLfloat);
@@ -580,7 +578,7 @@ compile_vertex_list(struct gl_context *ctx)
   update_vao(ctx, vpm, &save->VAO[vpm],
  save->vertex_store->bufferobj, buffer_offset,
  node->vertex_size*sizeof(GLfloat), save->enabled,
- node->attrsz, save->attrtype, offsets);
+ save->attrsz, save->attrtype, offsets);
   /* Reference the vao in the dlist */
   node->VAO[vpm] = NULL;
   _mesa_reference_vao(ctx, &node->VAO[vpm], save->VAO[vpm]);
@@ -592,14 +590,14 @@ compile_vertex_list(struct gl_context *ctx)
   node->current_data = NULL;
}
else {
-  GLuint current_size = node->vertex_size - node->attrsz[0];
+  GLuint current_size = node->vertex_size - save->attrsz[0];
   node->current_data = NULL;
 
   if (current_size) {
  node->current_data = malloc(current_size * sizeof(GLfloat));
  if (node->current_data) {
 const char *buffer = (const char *) save->vertex_store->buffer_map;
-unsigned attr_offset = node->attrsz[0] * sizeof(GLfloat);
+unsigned attr_offset = save->attrsz[0] * sizeof(GLfloat);
 unsigned vertex_offset = 0;
 
 if (node->vertex_count)
@@ -615,7 +613,7 @@ compile_vertex_list(struct gl_context *ctx)
   }
}
 
-   assert(node->attrsz[VBO_ATTRIB_POS] != 0 || node->vertex_count == 0);
+   assert(save->attrsz[VBO_ATTRIB_POS] != 0 || node->vertex_count == 0);
 
if (save->dangling_attr_ref)
   ctx->ListState.CurrentList->Flags |= DLIST_DANGLING_REFS;


Mesa (master): vbo: Remove vbo_save_vertex_list::enabled.

2018-02-28 Thread Mathias Fröhlich

Module: Mesa
Branch: master
Commit: 77df52cc4febb45008db4cfca3c144482a3a8578
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=77df52cc4febb45008db4cfca3c144482a3a8578

Author: Mathias Fröhlich 
Date:   Sun Feb 25 18:01:07 2018 +0100

vbo: Remove vbo_save_vertex_list::enabled.

Is not used anymore on replay.

Reviewed-by: Brian Paul 
Signed-off-by: Mathias Fröhlich 

---

 src/mesa/vbo/vbo_save.h | 1 -
 src/mesa/vbo/vbo_save_api.c | 3 +--
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index f4565023fd..4dd886eb12 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -61,7 +61,6 @@ struct vbo_save_copied_vtx {
  * compiled using the fallback opcode mechanism provided by dlist.c.
  */
 struct vbo_save_vertex_list {
-   GLbitfield64 enabled; /**< mask of enabled vbo arrays. */
GLubyte attrsz[VBO_ATTRIB_MAX];
GLenum16 attrtype[VBO_ATTRIB_MAX];
GLuint vertex_size;  /**< size in GLfloats */
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index 8dac6251c4..95593fc0a7 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -543,7 +543,6 @@ compile_vertex_list(struct gl_context *ctx)
 
/* Duplicate our template, increment refcounts to the storage structs:
 */
-   node->enabled = save->enabled;
STATIC_ASSERT(sizeof(node->attrsz) == sizeof(save->attrsz));
memcpy(node->attrsz, save->attrsz, sizeof(node->attrsz));
STATIC_ASSERT(sizeof(node->attrtype) == sizeof(save->attrtype));
@@ -582,7 +581,7 @@ compile_vertex_list(struct gl_context *ctx)
   /* create or reuse the vao */
   update_vao(ctx, vpm, &save->VAO[vpm],
  save->vertex_store->bufferobj, buffer_offset,
- node->vertex_size*sizeof(GLfloat), node->enabled,
+ node->vertex_size*sizeof(GLfloat), save->enabled,
  node->attrsz, node->attrtype, offsets);
   /* Reference the vao in the dlist */
   node->VAO[vpm] = NULL;


Mesa (master): vbo: Remove unused vbo_save_vertex_list::dangling_attr_ref.

2018-02-28 Thread Mathias Fröhlich

Module: Mesa
Branch: master
Commit: 07915020f0de9f5f1e8865bb61e2cf0d673ff278
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=07915020f0de9f5f1e8865bb61e2cf0d673ff278

Author: Mathias Fröhlich 
Date:   Sun Feb 25 18:01:07 2018 +0100

vbo: Remove unused vbo_save_vertex_list::dangling_attr_ref.

Reviewed-by: Brian Paul 
Signed-off-by: Mathias Fröhlich 

---

 src/mesa/vbo/vbo_save.h | 2 --
 src/mesa/vbo/vbo_save_api.c | 1 -
 2 files changed, 3 deletions(-)

diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index edbce3673d..ee3de0fedb 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -79,8 +79,6 @@ struct vbo_save_vertex_list {
GLuint start_vertex; /**< first vertex used by any primitive */
GLuint vertex_count; /**< number of vertices in this list */
GLuint wrap_count;  /* number of copied vertices at start */
-   GLboolean dangling_attr_ref;/* current attr implicitly referenced
-   outside the list */
 
struct _mesa_prim *prims;
GLuint prim_count;
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index 1edf7b9dfa..a87bbe0856 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -573,7 +573,6 @@ compile_vertex_list(struct gl_context *ctx)
}
node->vertex_count = save->vert_count;
node->wrap_count = save->copied.nr;
-   node->dangling_attr_ref = save->dangling_attr_ref;
node->prims = save->prims;
node->prim_count = save->prim_count;
node->vertex_store = save->vertex_store;


Mesa (master): vbo: Remove unused vbo_save_context::wrap_count.

2018-02-28 Thread Mathias Fröhlich

Module: Mesa
Branch: master
Commit: 1cc3516a1105c90214b2e3465421681f64699a7f
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=1cc3516a1105c90214b2e3465421681f64699a7f

Author: Mathias Fröhlich 
Date:   Sun Feb 25 18:01:07 2018 +0100

vbo: Remove unused vbo_save_context::wrap_count.

Reviewed-by: Brian Paul 
Signed-off-by: Mathias Fröhlich 

---

 src/mesa/vbo/vbo_save.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index ee3de0fedb..414a477f31 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -149,7 +149,6 @@ struct vbo_save_context {
 
GLboolean out_of_memory;  /**< True if last VBO allocation failed */
 
-   GLuint wrap_count;
GLbitfield replay_flags;
 
struct _mesa_prim *prims;


Mesa (master): vbo: Implement vbo_loopback_vertex_list in terms of the VAO.

2018-02-28 Thread Mathias Fröhlich

Module: Mesa
Branch: master
Commit: 08aa0d9bf49ea74f84b19cd11a0f0ace7ce7211a
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=08aa0d9bf49ea74f84b19cd11a0f0ace7ce7211a

Author: Mathias Fröhlich 
Date:   Sun Feb 25 18:01:07 2018 +0100

vbo: Implement vbo_loopback_vertex_list in terms of the VAO.

Use the information already present in the VAO to replay a display list
node using immediate mode draw commands. Use a handful of helper methods
that will also be useful for the next patches.

v2: Insert asserts, constify local variables.

Reviewed-by: Brian Paul 
Signed-off-by: Mathias Fröhlich 

---

 src/mesa/vbo/vbo_save.h  |  55 +++---
 src/mesa/vbo/vbo_save_api.c  |  42 ++
 src/mesa/vbo/vbo_save_draw.c |  28 +++--
 src/mesa/vbo/vbo_save_loopback.c | 119 +--
 4 files changed, 151 insertions(+), 93 deletions(-)

diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index 14ac831ffd..44dc8c201f 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -100,6 +100,52 @@ aligned_vertex_buffer_offset(const struct vbo_save_vertex_list *node)
 }
 
 
+/**
+ * Return the stride in bytes of the display list node.
+ */
+static inline GLsizei
+_vbo_save_get_stride(const struct vbo_save_vertex_list *node)
+{
+   return node->VAO[0]->BufferBinding[0].Stride;
+}
+
+
+/**
+ * Return the first referenced vertex index in the display list node.
+ */
+static inline GLuint
+_vbo_save_get_min_index(const struct vbo_save_vertex_list *node)
+{
+   assert(node->prim_count > 0);
+   return node->prims[0].start;
+}
+
+
+/**
+ * Return the last referenced vertex index in the display list node.
+ */
+static inline GLuint
+_vbo_save_get_max_index(const struct vbo_save_vertex_list *node)
+{
+   assert(node->prim_count > 0);
+   const struct _mesa_prim *last_prim = &node->prims[node->prim_count - 1];
+   return last_prim->start + last_prim->count - 1;
+}
+
+
+/**
+ * Return the vertex count in the display list node.
+ */
+static inline GLuint
+_vbo_save_get_vertex_count(const struct vbo_save_vertex_list *node)
+{
+   assert(node->prim_count > 0);
+   const struct _mesa_prim *first_prim = &node->prims[0];
+   const struct _mesa_prim *last_prim = &node->prims[node->prim_count - 1];
+   return last_prim->start - first_prim->start + last_prim->count;
+}
+
+
 /* These buffers should be a reasonable size to support upload to
  * hardware.  Current vbo implementation will re-upload on any
  * changes, so don't make too big or apps which dynamically create
@@ -178,13 +224,8 @@ void vbo_save_fallback(struct gl_context *ctx, GLboolean fallback);
 
 /* save_loopback.c:
  */
-void vbo_loopback_vertex_list(struct gl_context *ctx,
-  const GLfloat *buffer,
-  const GLubyte *attrsz,
-  const struct _mesa_prim *prim,
-  GLuint prim_count,
-  GLuint wrap_count,
-  GLuint vertex_size);
+void _vbo_loopback_vertex_list(struct gl_context *ctx,
+   const struct vbo_save_vertex_list* node);
 
 /* Callbacks:
  */
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index b6fc7daa35..dc248934f7 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -641,6 +641,22 @@ compile_vertex_list(struct gl_context *ctx)
 
merge_prims(node->prims, &node->prim_count);
 
+   /* Correct the primitive starts, we can only do this here as copy_vertices
+* and convert_line_loop_to_strip above consume the uncorrected starts.
+* On the other hand the _vbo_loopback_vertex_list call below needs the
+* primitves to be corrected already.
+*/
+   if (aligned_vertex_buffer_offset(node)) {
+  const unsigned start_offset =
+ node->buffer_offset / (node->vertex_size * sizeof(GLfloat));
+  for (unsigned i = 0; i < node->prim_count; i++) {
+ node->prims[i].start += start_offset;
+  }
+  node->start_vertex = start_offset;
+   } else {
+  node->start_vertex = 0;
+   }
+
/* Deal with GL_COMPILE_AND_EXECUTE:
 */
if (ctx->ExecuteFlag) {
@@ -648,13 +664,8 @@ compile_vertex_list(struct gl_context *ctx)
 
   _glapi_set_dispatch(ctx->Exec);
 
-  const GLfloat *buffer = (const GLfloat *)
- ((const char *) save->vertex_store->buffer_map +
-  node->buffer_offset);
-
-  vbo_loopback_vertex_list(ctx, buffer,
-   node->attrsz, node->prims, node->prim_count,
-   node->wrap_count, node->vertex_size);
+  /* Note that the range of referenced vertices must be mapped already */
+  _vbo_loopback_vertex_list(ctx, node);
 
   _glapi_set_dispatch(dispatch);
}
@@ -693,23 +704,6 @@ compile_vertex_list(struct gl_context *ctx)
   save->prim_store =

Mesa (master): vbo: Use a local variable for the dlist offsets.

2018-02-28 Thread Mathias Fröhlich

Module: Mesa
Branch: master
Commit: f7178d677ca6a072455ff45b328b1078175a93b6
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=f7178d677ca6a072455ff45b328b1078175a93b6

Author: Mathias Fröhlich 
Date:   Sun Feb 25 18:01:07 2018 +0100

vbo: Use a local variable for the dlist offsets.

The master value is now stored inside the VAO already present in
struct vbo_save_vertex_list. Remove the unneeded copy from dlist storage.
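
The per-attribute offsets that move into a local array are a running sum of
the attribute sizes. A minimal sketch of that computation, assuming a
hypothetical ATTRIB_MAX of 4 in place of VBO_ATTRIB_MAX and plain float in
place of GLfloat (compute_offsets is not a Mesa function):

```c
#include <assert.h>
#include <stddef.h>

#define ATTRIB_MAX 4 /* hypothetical; the real code uses VBO_ATTRIB_MAX */

/*
 * Fill offsets[i] with the byte offset of attribute i inside one vertex.
 * attrsz[i] is the attribute size in floats; disabled attributes have
 * size 0 and therefore share the offset of the next enabled attribute.
 */
static void
compute_offsets(const unsigned char attrsz[ATTRIB_MAX],
                unsigned offsets[ATTRIB_MAX])
{
   assert(attrsz != NULL && offsets != NULL);

   for (unsigned i = 0, offset = 0; i < ATTRIB_MAX; ++i) {
      offsets[i] = offset;
      offset += attrsz[i] * sizeof(float);
   }
}
```

With sizes {3, 0, 4, 2}, attributes 1 and 2 both start three floats into the
vertex and attribute 3 starts seven floats in.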

Reviewed-by: Brian Paul 
Signed-off-by: Mathias Fröhlich 

---

 src/mesa/vbo/vbo_save.h |  1 -
 src/mesa/vbo/vbo_save_api.c | 15 +++
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index 414a477f31..14ac831ffd 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -64,7 +64,6 @@ struct vbo_save_vertex_list {
GLbitfield64 enabled; /**< mask of enabled vbo arrays. */
GLubyte attrsz[VBO_ATTRIB_MAX];
GLenum16 attrtype[VBO_ATTRIB_MAX];
-   GLuint offsets[VBO_ATTRIB_MAX];
GLuint vertex_size;  /**< size in GLfloats */
struct gl_vertex_array_object *VAO[VP_MODE_MAX];
 
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index a87bbe0856..b6fc7daa35 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -529,8 +529,6 @@ compile_vertex_list(struct gl_context *ctx)
struct vbo_save_context *save = &vbo_context(ctx)->save;
struct vbo_save_vertex_list *node;
GLintptr buffer_offset = 0;
-   GLuint offset;
-   unsigned i;
 
/* Allocate space for this structure in the display list currently
 * being compiled.
@@ -563,13 +561,14 @@ compile_vertex_list(struct gl_context *ctx)
* changes in drivers.  In particular, the Gallium CSO module will
* filter out redundant vertex buffer changes.
*/
-  offset = 0;
+  buffer_offset = 0;
} else {
-  offset = node->buffer_offset;
+  buffer_offset = node->buffer_offset;
}
-   for (i = 0; i < VBO_ATTRIB_MAX; ++i) {
-  node->offsets[i] = offset;
-  offset += node->attrsz[i] * sizeof(GLfloat);
+   GLuint offsets[VBO_ATTRIB_MAX];
+   for (unsigned i = 0, offset = 0; i < VBO_ATTRIB_MAX; ++i) {
+  offsets[i] = offset;
+  offset += save->attrsz[i] * sizeof(GLfloat);
}
node->vertex_count = save->vert_count;
node->wrap_count = save->copied.nr;
@@ -586,7 +585,7 @@ compile_vertex_list(struct gl_context *ctx)
   update_vao(ctx, vpm, &save->VAO[vpm],
  node->vertex_store->bufferobj, buffer_offset,
  node->vertex_size*sizeof(GLfloat), node->enabled,
- node->attrsz, node->attrtype, node->offsets);
+ node->attrsz, node->attrtype, offsets);
   /* Reference the vao in the dlist */
   node->VAO[vpm] = NULL;
   _mesa_reference_vao(ctx, &node->VAO[vpm], save->VAO[vpm]);


Mesa (master): vbo: Remove reference to the vertex_store from the dlist node.

2018-02-28 Thread Mathias Fröhlich

Module: Mesa
Branch: master
Commit: 19a0f27a491ae7cb3abceda8e60b9944cd273558
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=19a0f27a491ae7cb3abceda8e60b9944cd273558

Author: Mathias Fröhlich 
Date:   Sun Feb 25 18:01:07 2018 +0100

vbo: Remove reference to the vertex_store from the dlist node.

Since we now store a set of VAOs in the display list, use these objects
to get the reference to the VBO in several places.

Reviewed-by: Brian Paul 
Signed-off-by: Mathias Fröhlich 

---

 src/mesa/vbo/vbo_save.c | 11 +--
 src/mesa/vbo/vbo_save.h |  6 ++
 src/mesa/vbo/vbo_save_api.c | 14 +++---
 3 files changed, 10 insertions(+), 21 deletions(-)

diff --git a/src/mesa/vbo/vbo_save.c b/src/mesa/vbo/vbo_save.c
index f106cf279a..361964195c 100644
--- a/src/mesa/vbo/vbo_save.c
+++ b/src/mesa/vbo/vbo_save.c
@@ -65,12 +65,11 @@ void vbo_save_destroy( struct gl_context *ctx )
  free(save->prim_store);
  save->prim_store = NULL;
   }
-  if ( --save->vertex_store->refcount == 0 ) {
- _mesa_reference_buffer_object(ctx,
-   &save->vertex_store->bufferobj, NULL);
- free(save->vertex_store);
- save->vertex_store = NULL;
-  }
+   }
+   if (save->vertex_store) {
+  _mesa_reference_buffer_object(ctx, &save->vertex_store->bufferobj, NULL);
+  free(save->vertex_store);
+  save->vertex_store = NULL;
}
 }
 
diff --git a/src/mesa/vbo/vbo_save.h b/src/mesa/vbo/vbo_save.h
index 00f18363b7..f4565023fd 100644
--- a/src/mesa/vbo/vbo_save.h
+++ b/src/mesa/vbo/vbo_save.h
@@ -81,7 +81,6 @@ struct vbo_save_vertex_list {
struct _mesa_prim *prims;
GLuint prim_count;
 
-   struct vbo_save_vertex_store *vertex_store;
struct vbo_save_primitive_store *prim_store;
 };
 
@@ -163,15 +162,14 @@ _vbo_save_get_vertex_count(const struct vbo_save_vertex_list *node)
 
 #define VBO_SAVE_FALLBACK0x1000
 
-/* Storage to be shared among several vertex_lists.
- */
 struct vbo_save_vertex_store {
struct gl_buffer_object *bufferobj;
fi_type *buffer_map;
GLuint used;   /**< Number of 4-byte words used in buffer */
-   GLuint refcount;
 };
 
+/* Storage to be shared among several vertex_lists.
+ */
 struct vbo_save_primitive_store {
struct _mesa_prim prims[VBO_SAVE_PRIM_SIZE];
GLuint used;
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index db9a3fbdfa..8dac6251c4 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -222,7 +222,6 @@ alloc_vertex_store(struct gl_context *ctx)
 
vertex_store->buffer_map = NULL;
vertex_store->used = 0;
-   vertex_store->refcount = 1;
 
return vertex_store;
 }
@@ -574,7 +573,6 @@ compile_vertex_list(struct gl_context *ctx)
node->wrap_count = save->copied.nr;
node->prims = save->prims;
node->prim_count = save->prim_count;
-   node->vertex_store = save->vertex_store;
node->prim_store = save->prim_store;
 
/* Create a pair of VAOs for the possible VERTEX_PROCESSING_MODEs
@@ -583,7 +581,7 @@ compile_vertex_list(struct gl_context *ctx)
for (gl_vertex_processing_mode vpm = VP_MODE_FF; vpm < VP_MODE_MAX; ++vpm) {
   /* create or reuse the vao */
   update_vao(ctx, vpm, &save->VAO[vpm],
- node->vertex_store->bufferobj, buffer_offset,
+ save->vertex_store->bufferobj, buffer_offset,
  node->vertex_size*sizeof(GLfloat), node->enabled,
  node->attrsz, node->attrtype, offsets);
   /* Reference the vao in the dlist */
@@ -591,7 +589,6 @@ compile_vertex_list(struct gl_context *ctx)
   _mesa_reference_vao(ctx, &node->VAO[vpm], save->VAO[vpm]);
}
 
-   node->vertex_store->refcount++;
node->prim_store->refcount++;
 
if (node->prims[0].no_current_update) {
@@ -680,8 +677,7 @@ compile_vertex_list(struct gl_context *ctx)
 
   /* Release old reference:
*/
-  save->vertex_store->refcount--;
-  assert(save->vertex_store->refcount != 0);
+  free_vertex_store(ctx, save->vertex_store);
   save->vertex_store = NULL;
 
   /* Allocate and map new store:
@@ -1817,9 +1813,6 @@ vbo_destroy_vertex_list(struct gl_context *ctx, void *data)
for (gl_vertex_processing_mode vpm = VP_MODE_FF; vpm < VP_MODE_MAX; ++vpm)
   _mesa_reference_vao(ctx, &node->VAO[vpm], NULL);
 
-   if (--node->vertex_store->refcount == 0)
-  free_vertex_store(ctx, node->vertex_store);
-
if (--node->prim_store->refcount == 0)
   free(node->prim_store);
 
@@ -1833,8 +1826,7 @@ vbo_print_vertex_list(struct gl_context *ctx, void *data, FILE *f)
 {
struct vbo_save_vertex_list *node = (struct vbo_save_vertex_list *) data;
GLuint i;
-   struct gl_buffer_object *buffer = node->vertex_store ?
-  node->vertex_store->bufferobj : NULL;
+   struct gl_buffer_object *buffer = node->VAO[0]->BufferBinding[0].BufferObj;
(void) ctx;
 
fprintf(f,

Mesa (master): anv: Always set has_context_priority

2018-02-28 Thread Jason Ekstrand

Module: Mesa
Branch: master
Commit: 6d3edbea16335b1f85f9e4e38cfe6dbd1133472d
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=6d3edbea16335b1f85f9e4e38cfe6dbd1133472d

Author: Jason Ekstrand 
Date:   Wed Feb 28 15:25:48 2018 -0800

anv: Always set has_context_priority

We don't zalloc the physical device so we need to unconditionally set
everything.  Crucible helpfully initializes all allocations to 139 so it
was getting true regardless of whether or not the kernel actually
supports context priorities.

Fixes: 6d8ab53303331 "anv: implement VK_EXT_global_priority extension"
Reviewed-by: Kenneth Graunke 

---

 src/intel/vulkan/anv_device.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 56c0c5fa9f..3d44bfd43f 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -374,9 +374,7 @@ anv_physical_device_init(struct anv_physical_device *device,
device->has_syncobj = anv_gem_get_param(fd, I915_PARAM_HAS_EXEC_FENCE_ARRAY);
device->has_syncobj_wait = device->has_syncobj &&
   anv_gem_supports_syncobj_wait(fd);
-
-   if (anv_gem_has_context_priority(fd))
-  device->has_context_priority = true;
+   device->has_context_priority = anv_gem_has_context_priority(fd);
 
bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
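
The failure mode described in the commit message can be reproduced in isolation. The following is a minimal sketch, not the driver code: `struct fake_device`, `alloc_poisoned`, `init_conditional`, `init_unconditional` and `raw_flag` are hypothetical names, and the fill byte 139 is taken from the commit message's description of Crucible. It shows that a flag written only on the "true" path keeps whatever the allocator left behind, while an unconditional assignment always yields a defined value.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the physical device struct. */
struct fake_device {
   bool has_context_priority;
};

/* Hypothetical allocator that fills new memory with a non-zero byte,
 * mimicking an allocation that is not zero-initialized. */
static void *alloc_poisoned(size_t size)
{
   void *p = malloc(size);
   if (p)
      memset(p, 139, size);
   return p;
}

/* Old pattern: the field is written only when the feature is present,
 * so the "false" case leaves the allocator's bytes in place. */
static void init_conditional(struct fake_device *dev, bool kernel_support)
{
   if (kernel_support)
      dev->has_context_priority = true;
}

/* New pattern: the field is always written. */
static void init_unconditional(struct fake_device *dev, bool kernel_support)
{
   dev->has_context_priority = kernel_support;
}

/* Read the first byte of the flag without interpreting it as a bool. */
static unsigned char raw_flag(const struct fake_device *dev)
{
   unsigned char b;
   memcpy(&b, &dev->has_context_priority, 1);
   return b;
}
```

With the conditional variant and no kernel support, `raw_flag()` still returns the fill byte, which any truthiness check treats as "supported".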
 

Mesa (master): Revert "i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+"

2018-02-28 Thread Mark Janes

Module: Mesa
Branch: master
Commit: 0fc009b8c7bd6fb4a2cc77e9c4d0440acdc58ee1
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=0fc009b8c7bd6fb4a2cc77e9c4d0440acdc58ee1

Author: Mark Janes 
Date:   Wed Feb 28 17:26:08 2018 -0800

Revert "i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+"

This reverts commit a2c1e48f15995a826dc759e064c2603882a37e0c.

On BDWGT3e and KBLGT3e systems, this commit regressed the following
tests:

  piglit.spec.ext_framebuffer_multisample.accuracy 2 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy 4 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy 6 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy 8 stencil_resolve small depthstencil
  piglit.spec.ext_framebuffer_multisample.accuracy all_samples stencil_resolve small depthstencil

---

 src/mesa/drivers/dri/i965/brw_misc_state.c| 9 -
 src/mesa/drivers/dri/i965/genX_blorp_exec.c   | 2 --
 src/mesa/drivers/dri/i965/genX_state_upload.c | 4 ++--
 3 files changed, 2 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 2d2517d2bd..c4ef6812bf 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -573,15 +573,6 @@ brw_upload_invariant_state(struct brw_context *brw)
BEGIN_BATCH(1);
OUT_BATCH(_3DSTATE_VF_STATISTICS << 16 | 1);
ADVANCE_BATCH();
-
-   if (devinfo->gen >= 8) {
-  BEGIN_BATCH(4);
-  OUT_BATCH(_3DSTATE_DRAWING_RECTANGLE << 16 | 1);
-  OUT_BATCH(0);
-  OUT_BATCH(~0);
-  OUT_BATCH(0);
-  ADVANCE_BATCH();
-   }
 }
 
 /**
diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
index aa97981dd1..062171af60 100644
--- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
+++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
@@ -276,12 +276,10 @@ retry:
gen8_write_pma_stall_bits(brw, 0);
 #endif
 
-#if GEN_GEN < 8
blorp_emit(batch, GENX(3DSTATE_DRAWING_RECTANGLE), rect) {
   rect.ClippedDrawingRectangleXMax = MAX2(params->x1, params->x0) - 1;
   rect.ClippedDrawingRectangleYMax = MAX2(params->y1, params->y0) - 1;
}
-#endif
 
blorp_exec(batch, params);
 
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index eda812868b..b38b61a874 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -280,7 +280,6 @@ static const struct brw_tracked_state genX(line_stipple) = {
.emit = genX(upload_line_stipple),
 };
 
-#if GEN_GEN < 8
 /* Constant single cliprect for framebuffer object or DRI2 drawing */
 static void
 genX(upload_drawing_rect)(struct brw_context *brw)
@@ -304,7 +303,6 @@ static const struct brw_tracked_state genX(drawing_rect) = {
},
.emit = genX(upload_drawing_rect),
 };
-#endif
 
 static uint32_t *
 genX(emit_vertex_buffer_state)(struct brw_context *brw,
@@ -5658,6 +5656,8 @@ genX(init_atoms)(struct brw_context *brw)
 
   &genX(line_stipple),
 
+  &genX(drawing_rect),
+
   &genX(vf_topology),
 
   &brw_indices,

Mesa (master): radeonsi/nir: increase values to 8 for gs fetch.

2018-02-28 Thread Dave Airlie

Module: Mesa
Branch: master
Commit: 6c1b5a40fde6f4ca77f8b866e99673b34df42116
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=6c1b5a40fde6f4ca77f8b866e99673b34df42116

Author: Dave Airlie 
Date:   Thu Mar  1 10:01:33 2018 +1000

radeonsi/nir: increase values to 8 for gs fetch.

This stops a crash when running (still fails):
tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test

Reviewed-by: Timothy Arceri 
Signed-off-by: Dave Airlie 

---

 src/gallium/drivers/radeonsi/si_shader_nir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c b/src/gallium/drivers/radeonsi/si_shader_nir.c
index d410a6c2d6..147bd9511d 100644
--- a/src/gallium/drivers/radeonsi/si_shader_nir.c
+++ b/src/gallium/drivers/radeonsi/si_shader_nir.c
@@ -740,7 +740,7 @@ LLVMValueRef si_nir_load_input_gs(struct ac_shader_abi *abi,
 {
struct si_shader_context *ctx = si_shader_context_from_abi(abi);
 
-   LLVMValueRef value[4];
+   LLVMValueRef value[8];
for (unsigned i = component; i < num_components + component; i++) {
value[i] = si_llvm_load_input_gs(&ctx->abi, driver_location / 4,
 vertex_index, type, i);
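
The loop above indexes `value[]` from `component` up to `component + num_components`, so the array has to hold the highest index that sum can reach. The sketch below illustrates that sizing rule under an assumption that is not stated in the commit: that a 64-bit input (the failing test is an fp64 one) can span up to 8 32-bit slots. `MAX_GS_INPUT_DWORDS`, `load_dword` and `load_input` are hypothetical names, and the explicit bounds check is an addition for illustration only.

```c
#include <stdint.h>

/* Assumption: up to 4 components x 2 dwords each for 64-bit types. */
#define MAX_GS_INPUT_DWORDS 8

/* Hypothetical loader standing in for the per-channel fetch. */
static uint32_t load_dword(unsigned location, unsigned chan)
{
   return location * 16 + chan;
}

/* Returns the number of dwords written to out[], or 0 if the request
 * would index past the end of the temporary array. */
static unsigned load_input(unsigned location, unsigned component,
                           unsigned num_components,
                           uint32_t out[MAX_GS_INPUT_DWORDS])
{
   uint32_t value[MAX_GS_INPUT_DWORDS];

   if (component + num_components > MAX_GS_INPUT_DWORDS)
      return 0;

   /* Same indexing scheme as the loop in the patch: i starts at
    * 'component', not at 0. */
   for (unsigned i = component; i < num_components + component; i++)
      value[i] = load_dword(location, i);

   for (unsigned i = 0; i < num_components; i++)
      out[i] = value[component + i];

   return num_components;
}
```

With a 4-entry array, any request where `component + num_components` exceeds 4 would write out of bounds, which is consistent with the crash the commit message mentions.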

Mesa (master): radv: Implement waiting on non-submitted fences.

2018-02-28 Thread Bas Nieuwenhuizen

Module: Mesa
Branch: master
Commit: 6968d782d3063c639e80dbcf6df944902d72692f
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=6968d782d3063c639e80dbcf6df944902d72692f

Author: Bas Nieuwenhuizen 
Date:   Mon Feb 26 22:54:06 2018 +0100

radv: Implement waiting on non-submitted fences.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie 

---

 src/amd/vulkan/radv_device.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 24ea3b689e..8eadd8f203 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2946,8 +2946,17 @@ VkResult radv_WaitForFences(
if (fence->signalled)
continue;
 
-   if (!fence->submitted)
-   return VK_TIMEOUT;
+   if (!fence->submitted) {
+   while(radv_get_current_time() <= timeout && !fence->submitted)
+   /* Do nothing */;
+
+   if (!fence->submitted)
+   return VK_TIMEOUT;
+
+   /* Recheck as it may have been set by submitting operations. */
+   if (fence->signalled)
+   continue;
+   }
 
expired = device->ws->fence_wait(device->ws, fence->fence, true, timeout);
if (!expired)

Mesa (master): radv: Implement WaitForFences with !waitAll.

2018-02-28 Thread Bas Nieuwenhuizen

Module: Mesa
Branch: master
Commit: 2a404c6f923880cfd0bc04f9db1890cadce8bd92
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=2a404c6f923880cfd0bc04f9db1890cadce8bd92

Author: Bas Nieuwenhuizen 
Date:   Mon Feb 26 22:50:41 2018 +0100

radv: Implement WaitForFences with !waitAll.

Nothing to do except using a busy wait loop. At least for old kernels.

A better implementation for newer kernels to come later.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105255
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie 

---

 src/amd/vulkan/radv_device.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 92865122ad..24ea3b689e 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2890,13 +2890,17 @@ void radv_DestroyFence(
vk_free2(&device->alloc, pAllocator, fence);
 }
 
-static uint64_t radv_get_absolute_timeout(uint64_t timeout)
+
+static uint64_t radv_get_current_time()
 {
-   uint64_t current_time;
struct timespec tv;
-
clock_gettime(CLOCK_MONOTONIC, &tv);
-   current_time = tv.tv_nsec + tv.tv_sec*1000000000ull;
+   return tv.tv_nsec + tv.tv_sec*1000000000ull;
+}
+
+static uint64_t radv_get_absolute_timeout(uint64_t timeout)
+{
+   uint64_t current_time = radv_get_current_time();
 
timeout = MIN2(UINT64_MAX - current_time, timeout);
 
@@ -2914,7 +2918,13 @@ VkResult radv_WaitForFences(
timeout = radv_get_absolute_timeout(timeout);
 
if (!waitAll && fenceCount > 1) {
-   fprintf(stderr, "radv: WaitForFences without waitAll not implemented yet\n");
+   while(radv_get_current_time() <= timeout) {
+   for (uint32_t i = 0; i < fenceCount; ++i) {
+   if (radv_GetFenceStatus(_device, pFences[i]) == VK_SUCCESS)
+   return VK_SUCCESS;
+   }
+   }
+   return VK_TIMEOUT;
}
 
for (uint32_t i = 0; i < fenceCount; ++i) {
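
The busy-wait approach in this patch can be sketched as a standalone program fragment. This is an illustration under stated assumptions, not the radv code: `struct fake_fence`, `get_current_time_ns`, `get_absolute_timeout_ns` and `wait_any` are hypothetical names, and a plain flag stands in for the real fence status query. It shows the two pieces the patch relies on: converting a relative timeout into an absolute deadline without overflowing, and polling all fences until one is signalled or the deadline passes.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static uint64_t get_current_time_ns(void)
{
   struct timespec tv;
   clock_gettime(CLOCK_MONOTONIC, &tv);
   return (uint64_t)tv.tv_nsec + (uint64_t)tv.tv_sec * 1000000000ull;
}

/* Turn a relative timeout into an absolute deadline, clamping so that
 * now + timeout cannot wrap around UINT64_MAX. */
static uint64_t get_absolute_timeout_ns(uint64_t timeout)
{
   uint64_t now = get_current_time_ns();
   if (timeout > UINT64_MAX - now)
      timeout = UINT64_MAX - now;
   return now + timeout;
}

/* Hypothetical fence: a flag that some other agent would set. */
struct fake_fence {
   volatile bool signalled;
};

/* Poll until any fence is signalled or the deadline passes.
 * Returns true on success and false on timeout. */
static bool wait_any(const struct fake_fence *fences, uint32_t count,
                     uint64_t relative_timeout_ns)
{
   uint64_t deadline = get_absolute_timeout_ns(relative_timeout_ns);

   while (get_current_time_ns() <= deadline) {
      for (uint32_t i = 0; i < count; ++i) {
         if (fences[i].signalled)
            return true;
      }
   }
   return false;
}
```

The loop burns CPU for the whole wait, which is the trade-off the commit message accepts for old kernels.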

Mesa (master): radv: Use the syncobj wait ioctl to wait on fences if possible.

2018-02-28 Thread Bas Nieuwenhuizen

Module: Mesa
Branch: master
Commit: f9898b211eb23c18d27508a2cbbdd629fc3dc734
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=f9898b211eb23c18d27508a2cbbdd629fc3dc734

Author: Bas Nieuwenhuizen 
Date:   Mon Feb 26 21:52:49 2018 +0100

radv: Use the syncobj wait ioctl to wait on fences if possible.

Handles the !waitAll and signal after the start of the wait cases correctly.

Reviewed-by: Dave Airlie 

---

 src/amd/vulkan/radv_device.c  | 24 
 src/amd/vulkan/radv_radeon_winsys.h   |  3 ++-
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c |  8 
 3 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 21ccfa679f..36d7a406bf 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2928,6 +2928,22 @@ VkResult radv_WaitForFences(
RADV_FROM_HANDLE(radv_device, device, _device);
timeout = radv_get_absolute_timeout(timeout);
 
+   if (device->always_use_syncobj) {
+   uint32_t *handles = malloc(sizeof(uint32_t) * fenceCount);
+   if (!handles)
+   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+
+   for (uint32_t i = 0; i < fenceCount; ++i) {
+   RADV_FROM_HANDLE(radv_fence, fence, pFences[i]);
+   handles[i] = fence->temp_syncobj ? fence->temp_syncobj : fence->syncobj;
+   }
+
+   bool success = device->ws->wait_syncobj(device->ws, handles, fenceCount, waitAll, timeout);
+
+   free(handles);
+   return success ? VK_SUCCESS : VK_TIMEOUT;
+   }
+
if (!waitAll && fenceCount > 1) {
/* Not doing this by default for waitAll, due to needing to allocate twice. */
if (device->physical_device->rad_info.drm_minor >= 10 && radv_all_fences_plain_and_submitted(fenceCount, pFences)) {
@@ -2968,13 +2984,13 @@ VkResult radv_WaitForFences(
bool expired = false;
 
if (fence->temp_syncobj) {
-   if (!device->ws->wait_syncobj(device->ws, fence->temp_syncobj, timeout))
+   if (!device->ws->wait_syncobj(device->ws, &fence->temp_syncobj, 1, true, timeout))
return VK_TIMEOUT;
continue;
}
 
if (fence->syncobj) {
-   if (!device->ws->wait_syncobj(device->ws, fence->syncobj, timeout))
+   if (!device->ws->wait_syncobj(device->ws, &fence->syncobj, 1, true, timeout))
return VK_TIMEOUT;
continue;
}
@@ -3035,12 +3051,12 @@ VkResult radv_GetFenceStatus(VkDevice _device, VkFence _fence)
RADV_FROM_HANDLE(radv_fence, fence, _fence);
 
if (fence->temp_syncobj) {
-   bool success = device->ws->wait_syncobj(device->ws, fence->temp_syncobj, 0);
+   bool success = device->ws->wait_syncobj(device->ws, &fence->temp_syncobj, 1, true, 0);
return success ? VK_SUCCESS : VK_NOT_READY;
}
 
if (fence->syncobj) {
-   bool success = device->ws->wait_syncobj(device->ws, fence->syncobj, 0);
+   bool success = device->ws->wait_syncobj(device->ws, &fence->syncobj, 1, true, 0);
return success ? VK_SUCCESS : VK_NOT_READY;
}
 
diff --git a/src/amd/vulkan/radv_radeon_winsys.h b/src/amd/vulkan/radv_radeon_winsys.h
index 643d76a826..270b3bceab 100644
--- a/src/amd/vulkan/radv_radeon_winsys.h
+++ b/src/amd/vulkan/radv_radeon_winsys.h
@@ -286,7 +286,8 @@ struct radeon_winsys {
 
void (*reset_syncobj)(struct radeon_winsys *ws, uint32_t handle);
void (*signal_syncobj)(struct radeon_winsys *ws, uint32_t handle);
-   bool (*wait_syncobj)(struct radeon_winsys *ws, uint32_t handle, uint64_t timeout);
+   bool (*wait_syncobj)(struct radeon_winsys *ws, const uint32_t *handles, uint32_t handle_count,
+bool wait_all, uint64_t timeout);
 
int (*export_syncobj)(struct radeon_winsys *ws, uint32_t syncobj, int *fd);
int (*import_syncobj)(struct radeon_winsys *ws, int fd, uint32_t *syncobj);
diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
index d2b33546cc..cd7ab384e7 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
@@ -1332,8 +1332,8 @@ static void radv_amdgpu_signal_syncobj(struct radeon_winsys *_ws,
amdgpu_cs_syncobj_signal(ws->dev, &handle, 1);
 }
 
-static bool radv_amdgpu_wait_syncobj(struct radeon_winsys *_ws,
-   uint32_t handle, uint64_t timeout)
+static bool radv_amdgpu_wait_syncobj(struct radeon_winsys *_ws, const uint32_t

Mesa (master): radv: Implement more efficient !waitAll fence waiting.

2018-02-28 Thread Bas Nieuwenhuizen

Module: Mesa
Branch: master
Commit: 34bd5e2e2e8d9c213b051152f7a8b731151d9be5
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=34bd5e2e2e8d9c213b051152f7a8b731151d9be5

Author: Bas Nieuwenhuizen 
Date:   Mon Feb 26 23:48:27 2018 +0100

radv: Implement more efficient !waitAll fence waiting.

Reviewed-by: Dave Airlie 

---

 src/amd/vulkan/radv_device.c  | 36 +++
 src/amd/vulkan/radv_radeon_winsys.h   |  5 
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 34 +
 3 files changed, 75 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 8eadd8f203..21ccfa679f 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -2907,6 +2907,17 @@ static uint64_t radv_get_absolute_timeout(uint64_t timeout)
return current_time + timeout;
 }
 
+
+static bool radv_all_fences_plain_and_submitted(uint32_t fenceCount, const VkFence *pFences)
+{
+   for (uint32_t i = 0; i < fenceCount; ++i) {
+   RADV_FROM_HANDLE(radv_fence, fence, pFences[i]);
+   if (fence->syncobj || fence->temp_syncobj || (!fence->signalled && !fence->submitted))
+   return false;
+   }
+   return true;
+}
+
 VkResult radv_WaitForFences(
VkDevice _device,
uint32_t fenceCount,
@@ -2918,6 +2929,31 @@ VkResult radv_WaitForFences(
timeout = radv_get_absolute_timeout(timeout);
 
if (!waitAll && fenceCount > 1) {
+   /* Not doing this by default for waitAll, due to needing to allocate twice. */
+   if (device->physical_device->rad_info.drm_minor >= 10 && radv_all_fences_plain_and_submitted(fenceCount, pFences)) {
+   uint32_t wait_count = 0;
+   struct radeon_winsys_fence **fences = malloc(sizeof(struct radeon_winsys_fence *) * fenceCount);
+   if (!fences)
+   return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
+
+   for (uint32_t i = 0; i < fenceCount; ++i) {
+   RADV_FROM_HANDLE(radv_fence, fence, pFences[i]);
+
+   if (fence->signalled) {
+   free(fences);
+   return VK_SUCCESS;
+   }
+
+   fences[wait_count++] = fence->fence;
+   }
+
+   bool success = device->ws->fences_wait(device->ws, fences, wait_count,
+  waitAll, timeout - radv_get_current_time());
+
+   free(fences);
+   return success ? VK_SUCCESS : VK_TIMEOUT;
+   }
+
while(radv_get_current_time() <= timeout) {
for (uint32_t i = 0; i < fenceCount; ++i) {
if (radv_GetFenceStatus(_device, pFences[i]) == VK_SUCCESS)
diff --git a/src/amd/vulkan/radv_radeon_winsys.h b/src/amd/vulkan/radv_radeon_winsys.h
index 4c306692e5..643d76a826 100644
--- a/src/amd/vulkan/radv_radeon_winsys.h
+++ b/src/amd/vulkan/radv_radeon_winsys.h
@@ -270,6 +270,11 @@ struct radeon_winsys {
   struct radeon_winsys_fence *fence,
   bool absolute,
   uint64_t timeout);
+   bool (*fences_wait)(struct radeon_winsys *ws,
+   struct radeon_winsys_fence *const *fences,
+   uint32_t fence_count,
+   bool wait_all,
+   uint64_t timeout);
 
/* old semaphores - non shareable */
struct radeon_winsys_sem *(*create_sem)(struct radeon_winsys *ws);
diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
index 5632b1d4ee..d2b33546cc 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
@@ -154,6 +154,39 @@ static bool radv_amdgpu_fence_wait(struct radeon_winsys *_ws,
return false;
 }
 
+
+static bool radv_amdgpu_fences_wait(struct radeon_winsys *_ws,
+ struct radeon_winsys_fence *const *_fences,
+ uint32_t fence_count,
+ bool wait_all,
+ uint64_t timeout)
+{
+   struct amdgpu_cs_fence *fences = malloc(sizeof(struct amdgpu_cs_fence) * fence_count);
+   int r;
+   uint32_t expired = 0, first = 0;
+
+   if (!fences)
+   return false;
+
+   for (uint32_t i = 0; i < fence_count; ++i)
+   fences[i] = ((struct radv_amdgpu_fence *)_fences[i])->fence;
+
+   /* Now use the libdrm query. */
+   r =

Mesa (master): ac/nir: don't apply slice rounding on txf_ms

2018-02-28 Thread Dave Airlie

Module: Mesa
Branch: master
Commit: 69495b30a38fbb01a937cdea6f7674f89a2e60e7
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=69495b30a38fbb01a937cdea6f7674f89a2e60e7

Author: Dave Airlie 
Date:   Thu Mar  1 09:24:01 2018 +1000

ac/nir: don't apply slice rounding on txf_ms

This matches the tgsi code.

Fixes arb_texture_multisample texelFetch piglit tests.

Reviewed-by: Timothy Arceri 
Reviewed-by: Bas Nieuwenhuizen 
Fixes: f4e499ec7914 (radv: add initial non-conformant radv vulkan driver)
Signed-off-by: Dave Airlie 

---

 src/amd/common/ac_nir_to_llvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 88e0cf9b4b..3c5be7e203 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -5105,7 +5105,7 @@ static void visit_tex(struct ac_nir_context *ctx, nir_tex_instr *instr)
 instr->sampler_dim == GLSL_SAMPLER_DIM_SUBPASS ||
 instr->sampler_dim == GLSL_SAMPLER_DIM_SUBPASS_MS) &&
instr->is_array &&
-   instr->op != nir_texop_txf) {
+   instr->op != nir_texop_txf && instr->op != nir_texop_txf_ms) {
coords[2] = apply_round_slice(&ctx->ac, coords[2]);
}
address[count++] = coords[2];

Mesa (master): ac/nir: fix shared atomic operations.

2018-02-28 Thread Dave Airlie

Module: Mesa
Branch: master
Commit: 49879f3778707e50b2b2d5968996d60557bd99d4
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=49879f3778707e50b2b2d5968996d60557bd99d4

Author: Dave Airlie 
Date:   Thu Mar  1 09:38:19 2018 +1000

ac/nir: fix shared atomic operations.

The nir->llvm conversion was using the wrong srcs.

Fixes:
tests/spec/arb_compute_shader/execution/shared-atomics.shader_test

Reviewed-by: Bas Nieuwenhuizen 
Reviewed-by: Timothy Arceri 
Signed-off-by: Dave Airlie 

---

 src/amd/common/ac_nir_to_llvm.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 3c5be7e203..afe17a8f11 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -3991,10 +3991,10 @@ visit_store_shared(struct ac_nir_context *ctx,
 
 static LLVMValueRef visit_var_atomic(struct ac_nir_context *ctx,
 const nir_intrinsic_instr *instr,
-LLVMValueRef ptr)
+LLVMValueRef ptr, int src_idx)
 {
LLVMValueRef result;
-   LLVMValueRef src = get_src(ctx, instr->src[0]);
+   LLVMValueRef src = get_src(ctx, instr->src[src_idx]);
 
if (instr->intrinsic == nir_intrinsic_var_atomic_comp_swap ||
instr->intrinsic == nir_intrinsic_shared_atomic_comp_swap) {
@@ -4574,8 +4574,8 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
case nir_intrinsic_shared_atomic_xor:
case nir_intrinsic_shared_atomic_exchange:
case nir_intrinsic_shared_atomic_comp_swap: {
-   LLVMValueRef ptr = get_memory_ptr(ctx, instr->src[1]);
-   result = visit_var_atomic(ctx, instr, ptr);
+   LLVMValueRef ptr = get_memory_ptr(ctx, instr->src[0]);
+   result = visit_var_atomic(ctx, instr, ptr, 1);
break;
}
case nir_intrinsic_var_atomic_add:
@@ -4589,7 +4589,7 @@ static void visit_intrinsic(struct ac_nir_context *ctx,
case nir_intrinsic_var_atomic_exchange:
case nir_intrinsic_var_atomic_comp_swap: {
LLVMValueRef ptr = build_gep_for_deref(ctx, instr->variables[0]);
-   result = visit_var_atomic(ctx, instr, ptr);
+   result = visit_var_atomic(ctx, instr, ptr, 0);
break;
}
case nir_intrinsic_interp_var_at_centroid:
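
The operand layout this patch encodes can be summarised in a few lines. The sketch below uses hypothetical types (`enum atomic_kind`, `struct fake_intrinsic`) rather than the NIR structures, and assumes only what the diff shows: for shared atomics the address comes from `src[0]` and the data operand from `src[1]`, while for variable atomics the pointer comes from the dereference and the data operand is `src[0]`.

```c
/* Hypothetical classification of the two atomic families in the patch. */
enum atomic_kind {
   ATOMIC_VAR,    /* pointer comes from a variable dereference */
   ATOMIC_SHARED  /* pointer comes from src[0] */
};

/* Hypothetical, simplified instruction: two integer source operands. */
struct fake_intrinsic {
   enum atomic_kind kind;
   int src[2];
};

/* Index of the data operand, mirroring the src_idx parameter that the
 * patch threads through visit_var_atomic(). */
static int atomic_data_src_index(enum atomic_kind kind)
{
   return kind == ATOMIC_SHARED ? 1 : 0;
}

/* Fetch the data operand for either family. */
static int atomic_data(const struct fake_intrinsic *instr)
{
   return instr->src[atomic_data_src_index(instr->kind)];
}
```

Passing the index explicitly keeps a single helper usable for both families instead of hard-coding `src[0]`.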

Mesa (master): gallium: remove llvm from ir struct

2018-02-28 Thread Timothy Arceri

Module: Mesa
Branch: master
Commit: 7e46214f871983dc64730f2f9c5029ee6109c3b4
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=7e46214f871983dc64730f2f9c5029ee6109c3b4

Author: Timothy Arceri 
Date:   Fri Feb  2 08:50:09 2018 +1100

gallium: remove llvm from ir struct

This was added in 425dc4c4b366 but never used. Also since
100796c15c3a native has superseded llvm.

Acked-by: Dave Airlie 

---

 src/gallium/include/pipe/p_state.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h
index 2b56d60b5e..640e6ed26d 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -267,7 +267,6 @@ struct pipe_shader_state
/* TODO move tokens into union. */
const struct tgsi_token *tokens;
union {
-  void *llvm;
   void *native;
   void *nir;
} ir;

Mesa (master): radeonsi: set some context vars for nir path

2018-02-28 Thread Timothy Arceri

Module: Mesa
Branch: master
Commit: f383fec903220ecd18cb0d237b7d9a4de2ae8f2a
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=f383fec903220ecd18cb0d237b7d9a4de2ae8f2a

Author: Timothy Arceri 
Date:   Tue Feb 13 13:06:51 2018 +1100

radeonsi: set some context vars for nir path

Reviewed-by: Marek Olšák 

---

 src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c b/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c
index 8707be504e..4a027d8659 100644
--- a/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c
+++ b/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c
@@ -1253,7 +1253,16 @@ void si_llvm_context_set_tgsi(struct si_shader_context *ctx,
ctx->temps = NULL;
ctx->temps_count = 0;
 
-   if (!info || !tokens)
+   if (!info)
+   return;
+
+   ctx->num_const_buffers = util_last_bit(info->const_buffers_declared);
+   ctx->num_shader_buffers = util_last_bit(info->shader_buffers_declared);
+
+   ctx->num_samplers = util_last_bit(info->samplers_declared);
+   ctx->num_images = util_last_bit(info->images_declared);
+
+   if (!tokens)
return;
 
if (info->array_max[TGSI_FILE_TEMPORARY] > 0) {
@@ -1281,11 +1290,6 @@ void si_llvm_context_set_tgsi(struct si_shader_context *ctx,
ctx->bld_base.emit_fetch_funcs[TGSI_FILE_TEMPORARY] = si_llvm_emit_fetch;
ctx->bld_base.emit_fetch_funcs[TGSI_FILE_OUTPUT] = si_llvm_emit_fetch;
ctx->bld_base.emit_fetch_funcs[TGSI_FILE_SYSTEM_VALUE] = fetch_system_value;
-
-   ctx->num_const_buffers = util_last_bit(info->const_buffers_declared);
-   ctx->num_shader_buffers = util_last_bit(info->shader_buffers_declared);
-   ctx->num_samplers = util_last_bit(info->samplers_declared);
-   ctx->num_images = util_last_bit(info->images_declared);
 }
 
 void si_llvm_create_func(struct si_shader_context *ctx,

Mesa (master): i965: Don't emit MOVs with undefined registers for Gen4 point clipping.

2018-02-28 Thread Kenneth Graunke

Module: Mesa
Branch: master
Commit: e51b0664e03a028961e1a4250c49fbc3005b2fa4
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=e51b0664e03a028961e1a4250c49fbc3005b2fa4

Author: Kenneth Graunke 
Date:   Wed Feb 28 13:22:22 2018 -0800

i965: Don't emit MOVs with undefined registers for Gen4 point clipping.

Gen4 point clipping calls brw_clip_tri_alloc_regs with nr_verts == 0,
which means that c->reg.vertex[] isn't initialized.  It then emits MOVs
to stomp components of those uninitialized registers to 0.

This started causing assertions after Matt's recent series, when those
uninitialized registers started getting BRW_REGISTER_TYPE_NF, which
definitely doesn't exist on Gen4-5.

Reviewed-by: Matt Turner 

---

 src/intel/compiler/brw_clip_tri.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_clip_tri.c b/src/intel/compiler/brw_clip_tri.c
index 8ccf9e49b2..194e6ab1d2 100644
--- a/src/intel/compiler/brw_clip_tri.c
+++ b/src/intel/compiler/brw_clip_tri.c
@@ -68,7 +68,7 @@ void brw_clip_tri_alloc_regs( struct brw_clip_compile *c,
   i += c->nr_regs;
}
 
-   if (c->vue_map.num_slots % 2) {
+   if (c->vue_map.num_slots % 2 && nr_verts > 0) {
   /* The VUE has an odd number of slots so the last register is only half
* used.  Fill the second half with zero.
*/

Mesa (master): broadcom/vc5: Fix regression in the page-cache slice size alignment.

2018-02-28 Thread Eric Anholt

Module: Mesa
Branch: master
Commit: e4e79a02da2e813284aa8a82dfd4423f0ae9923a
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=e4e79a02da2e813284aa8a82dfd4423f0ae9923a

Author: Eric Anholt 
Date:   Fri Feb 23 15:35:25 2018 -0800

broadcom/vc5: Fix regression in the page-cache slice size alignment.

We need to align the size of the slice, not the offset of the next slice.
Fixes KHR-GLES3.texture_repeat_mode.rgba32ui_11x131_2_clamp_to_edge.

Fixes: b4b4ada7616d ("broadcom/vc5: Fix layout of 3D textures.")

---

 src/gallium/drivers/vc5/vc5_resource.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/vc5/vc5_resource.c b/src/gallium/drivers/vc5/vc5_resource.c
index e1645a4fde..86a0a0c139 100644
--- a/src/gallium/drivers/vc5/vc5_resource.c
+++ b/src/gallium/drivers/vc5/vc5_resource.c
@@ -488,8 +488,7 @@ vc5_setup_slices(struct vc5_resource *rsc)
 slice->padded_height = level_height;
 slice->size = level_height * slice->stride;
 
-offset += slice->size * level_depth;
-
+uint32_t slice_total_size = slice->size * level_depth;
 
 /* The HW aligns level 1's base to a page if any of level 1 or
  * below could be UIF XOR.  The lower levels then inherit the
@@ -499,8 +498,12 @@ vc5_setup_slices(struct vc5_resource *rsc)
 if (i == 1 &&
 level_width > 4 * uif_block_w &&
 level_height > PAGE_CACHE_MINUS_1_5_UB_ROWS * uif_block_h) {
-offset = align(offset, VC5_UIFCFG_PAGE_SIZE);
+slice_total_size = align(slice_total_size,
+ VC5_UIFCFG_PAGE_SIZE);
 }
+
+offset += slice_total_size;
+
 }
 rsc->size = offset;
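
The difference between the two alignment strategies is easiest to see with numbers. The following is a minimal sketch under assumptions: `PAGE_SIZE_BYTES` is a placeholder value standing in for `VC5_UIFCFG_PAGE_SIZE`, and `align_up`, `next_offset_align_offset` and `next_offset_align_size` are hypothetical helpers, not driver functions. It contrasts rounding the running offset (the old behaviour) with rounding the slice's own size before adding it (the behaviour after this patch).

```c
#include <stdint.h>

/* Assumption: placeholder page size, not taken from the driver. */
#define PAGE_SIZE_BYTES 4096u

/* Round value up to a power-of-two alignment. */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Old behaviour: add the slice, then round the resulting offset. */
static uint32_t next_offset_align_offset(uint32_t offset, uint32_t slice_size)
{
   return align_up(offset + slice_size, PAGE_SIZE_BYTES);
}

/* New behaviour: round the slice's size, then advance the offset. */
static uint32_t next_offset_align_size(uint32_t offset, uint32_t slice_size)
{
   return offset + align_up(slice_size, PAGE_SIZE_BYTES);
}
```

The two only agree when the incoming offset is already page-aligned. For example, with an incoming offset of 256 and a 5000-byte slice, the old form yields 8192 while the new form yields 8448, so the slice itself always occupies a whole number of pages.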
 

Mesa (master): i965: Be more clever about setting up our viewport clip

2018-02-28 Thread Jason Ekstrand

Module: Mesa
Branch: master
Commit: 67da59e320bd5f797f6bdc3ab111f33c64e16811
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=67da59e320bd5f797f6bdc3ab111f33c64e16811

Author: Jason Ekstrand 
Date:   Fri Nov  3 14:13:08 2017 -0700

i965: Be more clever about setting up our viewport clip

Before, we were trusting in the hardware to take the intersection
of the viewport clip with the drawing rectangle.  Unfortunately,
3DSTATE_DRAWING_RECTANGLE is fairly expensive because it implicitly
does a full pipeline stall.  If we're a bit more careful with our
viewport clipping, we can just re-emit it once at context creation
time.

Reviewed-by: Samuel Iglesias Gonsálvez 
Reviewed-by: Kenneth Graunke 

---

 src/mesa/drivers/dri/i965/genX_state_upload.c | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 8668abd591..b38b61a874 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2469,24 +2469,28 @@ genX(upload_sf_clip_viewport)(struct brw_context *brw)
 #elif GEN_GEN >= 8
   /* _NEW_VIEWPORT | _NEW_BUFFERS: Screen Space Viewport
* The hardware will take the intersection of the drawing rectangle,
-   * scissor rectangle, and the viewport extents. We don't need to be
-   * smart, and can therefore just program the viewport extents.
+   * scissor rectangle, and the viewport extents.  However, emitting
+   * 3DSTATE_DRAWING_RECTANGLE is expensive since it requires a full
+   * pipeline stall so we're better off just being a little more clever
+   * with our viewport so we can emit it once at context creation time.
*/
+  const float viewport_Xmin = MAX2(ctx->ViewportArray[i].X, 0);
+  const float viewport_Ymin = MAX2(ctx->ViewportArray[i].Y, 0);
   const float viewport_Xmax =
- ctx->ViewportArray[i].X + ctx->ViewportArray[i].Width;
+ MIN2(ctx->ViewportArray[i].X + ctx->ViewportArray[i].Width, fb_width);
   const float viewport_Ymax =
- ctx->ViewportArray[i].Y + ctx->ViewportArray[i].Height;
+ MIN2(ctx->ViewportArray[i].Y + ctx->ViewportArray[i].Height, fb_height);
 
   if (render_to_fbo) {
- sfv.XMinViewPort = ctx->ViewportArray[i].X;
+ sfv.XMinViewPort = viewport_Xmin;
  sfv.XMaxViewPort = viewport_Xmax - 1;
- sfv.YMinViewPort = ctx->ViewportArray[i].Y;
+ sfv.YMinViewPort = viewport_Ymin;
  sfv.YMaxViewPort = viewport_Ymax - 1;
   } else {
- sfv.XMinViewPort = ctx->ViewportArray[i].X;
+ sfv.XMinViewPort = viewport_Xmin;
  sfv.XMaxViewPort = viewport_Xmax - 1;
  sfv.YMinViewPort = fb_height - viewport_Ymax;
- sfv.YMaxViewPort = fb_height - ctx->ViewportArray[i].Y - 1;
+ sfv.YMaxViewPort = fb_height - viewport_Ymin - 1;
   }
 #endif
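
The clamping in this hunk can be read as a small pure function. The sketch below is an illustration, not the driver code: `struct viewport`, `struct clip_rect`, `minf`, `maxf` and `clamp_viewport` are hypothetical names. It intersects the viewport with the framebuffer bounds in software and then applies the same Y flip as the patch for the non-FBO (window system) case.

```c
/* Hypothetical viewport description: origin plus size. */
struct viewport {
   float x, y, width, height;
};

/* Hypothetical result: inclusive min/max extents. */
struct clip_rect {
   float xmin, xmax, ymin, ymax;
};

static float maxf(float a, float b) { return a > b ? a : b; }
static float minf(float a, float b) { return a < b ? a : b; }

/* Intersect the viewport with [0, fb_width) x [0, fb_height) and
 * convert to inclusive extents, flipping Y when not rendering to an FBO. */
static struct clip_rect
clamp_viewport(struct viewport vp, float fb_width, float fb_height,
               int render_to_fbo)
{
   const float xmin = maxf(vp.x, 0.0f);
   const float ymin = maxf(vp.y, 0.0f);
   const float xmax = minf(vp.x + vp.width, fb_width);
   const float ymax = minf(vp.y + vp.height, fb_height);
   struct clip_rect r;

   r.xmin = xmin;
   r.xmax = xmax - 1;
   if (render_to_fbo) {
      r.ymin = ymin;
      r.ymax = ymax - 1;
   } else {
      r.ymin = fb_height - ymax;
      r.ymax = fb_height - ymin - 1;
   }
   return r;
}
```

Because the result never extends past the framebuffer, the extents no longer depend on the hardware intersecting them with the drawing rectangle.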
 

Mesa (master): i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+

2018-02-28 Thread Jason Ekstrand

Module: Mesa
Branch: master
Commit: a2c1e48f15995a826dc759e064c2603882a37e0c
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=a2c1e48f15995a826dc759e064c2603882a37e0c

Author: Jason Ekstrand 
Date:   Fri Nov  3 10:36:32 2017 -0700

i965: Only emit 3DSTATE_DRAWING_RECTANGLE once on gen8+

Reviewed-by: Samuel Iglesias Gonsálvez 
Reviewed-by: Kenneth Graunke 

---

 src/mesa/drivers/dri/i965/brw_misc_state.c| 9 +
 src/mesa/drivers/dri/i965/genX_blorp_exec.c   | 2 ++
 src/mesa/drivers/dri/i965/genX_state_upload.c | 4 ++--
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
index c4ef6812bf..2d2517d2bd 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -573,6 +573,15 @@ brw_upload_invariant_state(struct brw_context *brw)
BEGIN_BATCH(1);
OUT_BATCH(_3DSTATE_VF_STATISTICS << 16 | 1);
ADVANCE_BATCH();
+
+   if (devinfo->gen >= 8) {
+  BEGIN_BATCH(4);
+  OUT_BATCH(_3DSTATE_DRAWING_RECTANGLE << 16 | 1);
+  OUT_BATCH(0);
+  OUT_BATCH(~0);
+  OUT_BATCH(0);
+  ADVANCE_BATCH();
+   }
 }
 
 /**
diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
index 062171af60..aa97981dd1 100644
--- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
+++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
@@ -276,10 +276,12 @@ retry:
gen8_write_pma_stall_bits(brw, 0);
 #endif
 
+#if GEN_GEN < 8
blorp_emit(batch, GENX(3DSTATE_DRAWING_RECTANGLE), rect) {
   rect.ClippedDrawingRectangleXMax = MAX2(params->x1, params->x0) - 1;
   rect.ClippedDrawingRectangleYMax = MAX2(params->y1, params->y0) - 1;
}
+#endif
 
blorp_exec(batch, params);
 
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index b38b61a874..eda812868b 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -280,6 +280,7 @@ static const struct brw_tracked_state genX(line_stipple) = {
.emit = genX(upload_line_stipple),
 };
 
+#if GEN_GEN < 8
 /* Constant single cliprect for framebuffer object or DRI2 drawing */
 static void
 genX(upload_drawing_rect)(struct brw_context *brw)
@@ -303,6 +304,7 @@ static const struct brw_tracked_state genX(drawing_rect) = {
},
.emit = genX(upload_drawing_rect),
 };
+#endif
 
 static uint32_t *
 genX(emit_vertex_buffer_state)(struct brw_context *brw,
@@ -5656,8 +5658,6 @@ genX(init_atoms)(struct brw_context *brw)
 
   (line_stipple),
 
-  (drawing_rect),
-
   (vf_topology),
 
   _indices,


Mesa (master): intel/compiler: Re-add .vs_inputs_dual_locations = true

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: debaa822ef12bc9006dcf95ab76ac8e3432bd9a7
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=debaa822ef12bc9006dcf95ab76ac8e3432bd9a7

Author: Matt Turner 
Date:   Wed Feb 28 13:25:21 2018 -0800

intel/compiler: Re-add .vs_inputs_dual_locations = true

Looks like a rebase mistake.

Fixes: 89fe5190a256 ("intel/compiler: Lower flrp32 on Gen11+")

---

 src/intel/compiler/brw_compiler.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/compiler/brw_compiler.c b/src/intel/compiler/brw_compiler.c
index 34be3b705f..9340317492 100644
--- a/src/intel/compiler/brw_compiler.c
+++ b/src/intel/compiler/brw_compiler.c
@@ -57,6 +57,7 @@
.lower_unpack_snorm_4x8 = true,\
.lower_unpack_unorm_2x16 = true,   \
.lower_unpack_unorm_4x8 = true,\
+   .vs_inputs_dual_locations = true,  \
.max_unroll_iterations = 32
 
 static const struct nir_shader_compiler_options scalar_nir_options = {


Mesa (master): r600: fix whitespace in recent 1d texture commit.

2018-02-28 Thread Dave Airlie

Module: Mesa
Branch: master
Commit: 8369fdee8ba311aab6a6cf5e75f5f12f56469779
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=8369fdee8ba311aab6a6cf5e75f5f12f56469779

Author: Dave Airlie 
Date:   Wed Feb 28 20:15:30 2018 +

r600: fix whitespace in recent 1d texture commit.

trivial fix.

---

 src/gallium/drivers/r600/r600_texture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/r600_texture.c b/src/gallium/drivers/r600/r600_texture.c
index 1fbb682d67..fbcc878a24 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -1056,7 +1056,7 @@ r600_choose_tiling(struct r600_common_screen *rscreen,
/* 1D textures should be linear - fixes image operations on 1d */
if (templ->target == PIPE_TEXTURE_1D ||
templ->target == PIPE_TEXTURE_1D_ARRAY)
-   return RADEON_SURF_MODE_LINEAR_ALIGNED;
+   return RADEON_SURF_MODE_LINEAR_ALIGNED;
 
/* Textures likely to be mapped often. */
if (templ->usage == PIPE_USAGE_STAGING ||


Mesa (master): r600/shader: when using images always load thread id gpr at start (v2)

2018-02-28 Thread Dave Airlie

Module: Mesa
Branch: master
Commit: 7cb9353de38461c6492712b7b43ee69c57921705
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=7cb9353de38461c6492712b7b43ee69c57921705

Author: Dave Airlie 
Date:   Wed Feb 28 06:42:53 2018 +

r600/shader: when using images always load thread id gpr at start (v2)

The delayed loading code would fail if we had control flow.

This fixes:
tests/spec/arb_shader_image_load_store/execution/image_checkerboard.shader_test

v2: don't use temp_reg before setting temp_reg up.

Tested-by: Gert Wollny 
Signed-off-by: Dave Airlie 

---

 src/gallium/drivers/r600/r600_shader.c | 22 +++---
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index f2fc3f4c6f..46eeb9021f 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -367,7 +367,6 @@ struct r600_shader_ctx {
unsigned tess_input_info; /* temp with tess input offsets */
unsigned tess_output_info; /* temp with tess input offsets */
unsigned thread_id_gpr; /* temp with thread id calculated for images */
-   bool thread_id_gpr_loaded;
 };
 
 struct r600_shader_tgsi_instruction {
@@ -3279,9 +3278,6 @@ static int load_thread_id_gpr(struct r600_shader_ctx *ctx)
struct r600_bytecode_alu alu;
int r;
 
-   if (ctx->thread_id_gpr_loaded)
-   return 0;
-
memset(&alu, 0, sizeof(struct r600_bytecode_alu));
alu.op = ALU_OP1_MBCNT_32LO_ACCUM_PREV_INT;
alu.dst.sel = ctx->temp_reg;
@@ -3326,7 +3322,6 @@ static int load_thread_id_gpr(struct r600_shader_ctx *ctx)
   ctx->temp_reg, 0);
if (r)
return r;
-   ctx->thread_id_gpr_loaded = true;
return 0;
 }
 
@@ -3434,12 +3429,12 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
ctx.gs_next_vertex = 0;
ctx.gs_stream_output_info = 
 
+   ctx.thread_id_gpr = -1;
ctx.face_gpr = -1;
ctx.fixed_pt_position_gpr = -1;
ctx.fragcoord_input = -1;
ctx.colors_used = 0;
ctx.clip_vertex_write = 0;
-   ctx.thread_id_gpr_loaded = false;
 
ctx.helper_invoc_reg = -1;
ctx.cs_block_size_reg = -1;
@@ -3573,7 +3568,6 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
 
if (shader->uses_images) {
ctx.thread_id_gpr = ++regno;
-   ctx.thread_id_gpr_loaded = false;
}
ctx.temp_reg = ++regno;
 
@@ -3616,6 +3610,12 @@ static int r600_shader_from_tgsi(struct r600_context *rctx,
if (shader->vs_as_gs_a)
vs_add_primid_output(&ctx, key.vs.prim_id_out);
 
+   if (ctx.thread_id_gpr != -1) {
+   r = load_thread_id_gpr(&ctx);
+   if (r)
+   return r;
+   }
+
if (ctx.type == PIPE_SHADER_TESS_EVAL)
r600_fetch_tess_io_info(&ctx);
 
@@ -8650,10 +8650,6 @@ static int tgsi_load_rat(struct r600_shader_ctx *ctx)
unsigned rat_index_mode;
unsigned immed_base;
 
-   r = load_thread_id_gpr(ctx);
-   if (r)
-   return r;
-
rat_index_mode = inst->Src[0].Indirect.Index == 2 ? 2 : 0; // CF_INDEX_1 : CF_INDEX_NONE
 
immed_base = R600_IMAGE_IMMED_RESOURCE_OFFSET;
@@ -8981,10 +8977,6 @@ static int tgsi_atomic_op_rat(struct r600_shader_ctx *ctx)
immed_base = R600_IMAGE_IMMED_RESOURCE_OFFSET;
rat_base = ctx->shader->rat_base;
 
-   r = load_thread_id_gpr(ctx);
-   if (r)
-   return r;
-
 if (inst->Src[0].Register.File == TGSI_FILE_BUFFER) {
immed_base += ctx->info.file_count[TGSI_FILE_IMAGE];
rat_base += ctx->info.file_count[TGSI_FILE_IMAGE];
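
The commit message above says the delayed loading failed once control flow was involved. The following is a minimal sketch of why, under the assumption that the load was emitted at the first image instruction the compiler saw: if that instruction sits inside a branch, the load is skipped at run time whenever the branch is not taken. The tiny instruction set and every name here (OP_LOAD_TID, compile, run) are hypothetical and only model the idea; they are not r600 code.

```c
#include <stdbool.h>

/* Hypothetical toy instruction set. */
enum op { OP_LOAD_TID, OP_USE_TID, OP_IF, OP_ENDIF };

struct program {
   enum op ops[16];
   int count;
};

static void emit(struct program *p, enum op o)
{
   p->ops[p->count++] = o;
}

/* Emit an image op; in lazy mode the thread id load is emitted on first use. */
static void emit_image_op(struct program *p, bool *loaded)
{
   if (!*loaded) {
      emit(p, OP_LOAD_TID);
      *loaded = true;
   }
   emit(p, OP_USE_TID);
}

/* Compile "if (c) { image_op } image_op" with eager or lazy loading. */
static struct program compile(bool eager)
{
   struct program p = { .count = 0 };
   bool loaded = false;

   if (eager) {
      emit(&p, OP_LOAD_TID);   /* load once at shader start */
      loaded = true;
   }
   emit(&p, OP_IF);
   emit_image_op(&p, &loaded); /* lazy mode puts the load inside the branch */
   emit(&p, OP_ENDIF);
   emit_image_op(&p, &loaded); /* compiler believes the register is loaded */
   return p;
}

/* Execute and count uses that read the register before it was written. */
static int run(const struct program *p, bool cond)
{
   bool reg_valid = false, skipping = false;
   int bad_reads = 0;

   for (int i = 0; i < p->count; i++) {
      switch (p->ops[i]) {
      case OP_IF:       skipping = !cond; break;
      case OP_ENDIF:    skipping = false; break;
      case OP_LOAD_TID: if (!skipping) reg_valid = true; break;
      case OP_USE_TID:  if (!skipping && !reg_valid) bad_reads++; break;
      }
   }
   return bad_reads;
}
```

In this model the lazy variant reads an unwritten register when the branch is not taken, while the eager variant, which corresponds to loading at the start as the commit does, never does.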


Mesa (master): intel/compiler: Add ICL to test_eu_validate.cpp

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: 6f00bf519d6f13eb58e7495a41b8f8b055782832
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=6f00bf519d6f13eb58e7495a41b8f8b055782832

Author: Matt Turner 
Date:   Mon Jan 29 15:52:39 2018 -0800

intel/compiler: Add ICL to test_eu_validate.cpp

With the Align16 tests now disabled, we can run the rest of the tests in
ICL mode (and see them pass!)

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/test_eu_validate.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp
index f6c2b35625..d987311ef8 100644
--- a/src/intel/compiler/test_eu_validate.cpp
+++ b/src/intel/compiler/test_eu_validate.cpp
@@ -56,6 +56,7 @@ static const struct gen_info {
{ "glk", 9, IS_GLK },
{ "cfl", 9, IS_CFL },
{ "cnl", 10 },
+   { "icl", 11 },
 };
 
 class validation_test: public ::testing::TestWithParam<struct gen_info> {


Mesa (master): i965: Warn about preliminary support for Gen11

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: 35bfe2099564b6655563d920a21d13392b78c43e
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=35bfe2099564b6655563d920a21d13392b78c43e

Author: Matt Turner 
Date:   Mon Feb 26 14:25:17 2018 -0800

i965: Warn about preliminary support for Gen11

Reviewed-by: Kenneth Graunke 

---

 src/mesa/drivers/dri/i965/brw_context.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index b9c3fa27bf..8ab9063d21 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1011,6 +1011,13 @@ brwCreateContext(gl_api api,
   return false;
}
 
+   if (devinfo->gen == 11) {
+  fprintf(stderr,
+  "WARNING: i965 does not fully support Gen11 yet.\n"
+  "Instability or lower performance might occur.\n");
+
+   }
+
brw_init_state(brw);
 
intelInitExtensions(ctx);


Mesa (master): intel/compiler/fs: Implement ddy without using align16 for Gen11+

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: 2134ea380033d5d1f3c5760b8bdb1da7aadd9842
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=2134ea380033d5d1f3c5760b8bdb1da7aadd9842

Author: Matt Turner 
Date:   Thu Jun 15 17:29:16 2017 -0700

intel/compiler/fs: Implement ddy without using align16 for Gen11+

Align16 is no more. We previously generated an align16 ADD instruction
to calculate DDY:

   add(16) g25<1>F  -g23<4>.xyxyF   g23<4>.zwzwF   { align16 1H };

Without align16, we now implement it as:

   add(4) g25<1>F   -g23<0,2,1>F    g23.2<0,2,1>F  { align1 1N };
   add(4) g25.4<1>F -g23.4<0,2,1>F  g23.6<0,2,1>F  { align1 1N };
   add(4) g26<1>F   -g24<0,2,1>F    g24.2<0,2,1>F  { align1 1N };
   add(4) g26.4<1>F -g24.4<0,2,1>F  g24.6<0,2,1>F  { align1 1N };

where only the first two instructions are needed in SIMD8 mode.

Note: an earlier version of the patch implemented this in two
instructions in SIMD16:

   add(8) g25<2>F   -g23<4,2,0>F    g23.2<4,2,0>F  { align1 1N };
   add(8) g25.1<2>F -g23.1<4,2,0>F  g23.3<4,2,0>F  { align1 1N };

but I realized that the channel enable bits will not be correct. If we
knew we were under uniform control flow, we could emit only those two
instructions however.

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/brw_fs_generator.cpp | 46 +++--
 1 file changed, 38 insertions(+), 8 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index e6fb7c92d4..0dc0a695e4 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1187,15 +1187,45 @@ fs_generator::generate_ddy(const fs_inst *inst,
 {
if (inst->opcode == FS_OPCODE_DDY_FINE) {
   /* produce accurate derivatives */
-  struct brw_reg src0 = stride(src, 4, 4, 1);
-  struct brw_reg src1 = stride(src, 4, 4, 1);
-  src0.swizzle = BRW_SWIZZLE_XYXY;
-  src1.swizzle = BRW_SWIZZLE_ZWZW;
+  if (devinfo->gen >= 11) {
+ src = stride(src, 0, 2, 1);
+ struct brw_reg src_0  = byte_offset(src,  0 * sizeof(float));
+ struct brw_reg src_2  = byte_offset(src,  2 * sizeof(float));
+ struct brw_reg src_4  = byte_offset(src,  4 * sizeof(float));
+ struct brw_reg src_6  = byte_offset(src,  6 * sizeof(float));
+ struct brw_reg src_8  = byte_offset(src,  8 * sizeof(float));
+ struct brw_reg src_10 = byte_offset(src, 10 * sizeof(float));
+ struct brw_reg src_12 = byte_offset(src, 12 * sizeof(float));
+ struct brw_reg src_14 = byte_offset(src, 14 * sizeof(float));
+
+ struct brw_reg dst_0  = byte_offset(dst,  0 * sizeof(float));
+ struct brw_reg dst_4  = byte_offset(dst,  4 * sizeof(float));
+ struct brw_reg dst_8  = byte_offset(dst,  8 * sizeof(float));
+ struct brw_reg dst_12 = byte_offset(dst, 12 * sizeof(float));
 
-  brw_push_insn_state(p);
-  brw_set_default_access_mode(p, BRW_ALIGN_16);
-  brw_ADD(p, dst, negate(src0), src1);
-  brw_pop_insn_state(p);
+ brw_push_insn_state(p);
+ brw_set_default_exec_size(p, BRW_EXECUTE_4);
+
+ brw_ADD(p, dst_0, negate(src_0), src_2);
+ brw_ADD(p, dst_4, negate(src_4), src_6);
+
+ if (inst->exec_size == 16) {
+brw_ADD(p, dst_8,  negate(src_8),  src_10);
+brw_ADD(p, dst_12, negate(src_12), src_14);
+ }
+
+ brw_pop_insn_state(p);
+  } else {
+ struct brw_reg src0 = stride(src, 4, 4, 1);
+ struct brw_reg src1 = stride(src, 4, 4, 1);
+ src0.swizzle = BRW_SWIZZLE_XYXY;
+ src1.swizzle = BRW_SWIZZLE_ZWZW;
+
+ brw_push_insn_state(p);
+ brw_set_default_access_mode(p, BRW_ALIGN_16);
+ brw_ADD(p, dst, negate(src0), src1);
+ brw_pop_insn_state(p);
+  }
} else {
   /* replicate the derivative at the top-left pixel to other pixels */
   struct brw_reg src0 = stride(src, 4, 4, 0);
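
To make the align1 sequence from the commit message concrete, here is a minimal scalar sketch of what the add(4) instructions compute. It assumes a register holds consecutive 4-channel subspans, with channels 0 and 1 on one row and channels 2 and 3 on the other, and it models a <vstride,width,hstride> source region with a small helper. The names region_read and ddy_fine are hypothetical.

```c
/* Read channel `ch` of a <vstride,width,hstride> region starting at `base`.
 * Elements are laid out in rows of `width`; rows are `vstride` apart and
 * elements within a row are `hstride` apart.
 */
static float region_read(const float *reg, int base,
                         int vstride, int width, int hstride, int ch)
{
   return reg[base + (ch / width) * vstride + (ch % width) * hstride];
}

/* Emulate one add(4) per subspan: dst = -src<0,2,1> + src.2<0,2,1>.
 * exec_size is 8 or 16, giving two or four add(4) instructions.
 */
static void ddy_fine(const float *src, float *dst, int exec_size)
{
   for (int base = 0; base < exec_size; base += 4) {
      for (int ch = 0; ch < 4; ch++) {
         float row0 = region_read(src, base,     0, 2, 1, ch);
         float row1 = region_read(src, base + 2, 0, 2, 1, ch);
         dst[base + ch] = -row0 + row1;
      }
   }
}
```

Because the vertical stride is 0, each pair of source channels is read twice, so all four channels of a subspan receive the per-column difference, which is the same result the old align16 xyxy/zwzw swizzles produced.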


Mesa (master): intel: Disable 64-bit extensions on platforms without 64-bit types

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: bb428454a9d70e5f5984269e6c4a7f5d6e2871d9
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=bb428454a9d70e5f5984269e6c4a7f5d6e2871d9

Author: Matt Turner 
Date:   Mon Dec 11 13:59:13 2017 -0800

intel: Disable 64-bit extensions on platforms without 64-bit types

Gen11 does not support DF, Q, UQ types in hardware. As a result, we have
to disable some GL extensions until they can be reimplemented.

Reviewed-by: Kenneth Graunke 
Reviewed-by: Iago Toral Quiroga 

---

 src/intel/common/gen_device_info.c   | 3 +++
 src/intel/common/gen_device_info.h   | 1 +
 src/mesa/drivers/dri/i965/intel_extensions.c | 9 +
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/intel/common/gen_device_info.c b/src/intel/common/gen_device_info.c
index 11a4480ebf..7bed806b36 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -197,6 +197,7 @@ static const struct gen_device_info gen_device_info_snb_gt2 = {
.must_use_separate_stencil = true,   \
.has_llc = true, \
.has_pln = true, \
+   .has_64bit_types = true, \
.has_surface_tile_offset = true, \
.timestamp_frequency = 1250
 
@@ -381,6 +382,7 @@ static const struct gen_device_info gen_device_info_hsw_gt3 = {
.has_llc = true, \
.has_sample_with_hiz = false,\
.has_pln = true, \
+   .has_64bit_types = true, \
.supports_simd16_3src = true,\
.has_surface_tile_offset = true, \
.max_vs_threads = 504,   \
@@ -815,6 +817,7 @@ static const struct gen_device_info gen_device_info_cnl_5x8 = {
 #define GEN11_FEATURES(_gt, _slices, _subslices, _l3) \
GEN8_FEATURES, \
GEN11_HW_INFO, \
+   .has_64bit_types = false,  \
.gt = _gt, .num_slices = _slices, .l3_banks = _l3, \
.num_subslices = _subslices
 
diff --git a/src/intel/common/gen_device_info.h b/src/intel/common/gen_device_info.h
index 3e9c087f58..9b635ff178 100644
--- a/src/intel/common/gen_device_info.h
+++ b/src/intel/common/gen_device_info.h
@@ -59,6 +59,7 @@ struct gen_device_info
bool has_llc;
 
bool has_pln;
+   bool has_64bit_types;
bool has_compr4;
bool has_surface_tile_offset;
bool supports_simd16_3src;
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 127371c5b8..73a6c73f53 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -218,7 +218,7 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Extensions.ARB_derivative_control = true;
   ctx->Extensions.ARB_framebuffer_no_attachments = true;
   ctx->Extensions.ARB_gpu_shader5 = true;
-  ctx->Extensions.ARB_gpu_shader_fp64 = true;
+  ctx->Extensions.ARB_gpu_shader_fp64 = devinfo->has_64bit_types;
   ctx->Extensions.ARB_shader_atomic_counters = true;
   ctx->Extensions.ARB_shader_atomic_counter_ops = true;
   ctx->Extensions.ARB_shader_clock = true;
@@ -230,7 +230,7 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Extensions.ARB_texture_compression_bptc = true;
   ctx->Extensions.ARB_texture_view = true;
   ctx->Extensions.ARB_shader_storage_buffer_object = true;
-  ctx->Extensions.ARB_vertex_attrib_64bit = true;
+  ctx->Extensions.ARB_vertex_attrib_64bit = devinfo->has_64bit_types;
   ctx->Extensions.EXT_shader_samples_identical = true;
   ctx->Extensions.OES_primitive_bounding_box = true;
   ctx->Extensions.OES_texture_buffer = true;
@@ -280,8 +280,9 @@ intelInitExtensions(struct gl_context *ctx)
}
 
if (devinfo->gen >= 8) {
-  ctx->Extensions.ARB_gpu_shader_int64 = true;
-  ctx->Extensions.ARB_shader_ballot = true; /* requires ARB_gpu_shader_int64 */
+  ctx->Extensions.ARB_gpu_shader_int64 = devinfo->has_64bit_types;
+  /* requires ARB_gpu_shader_int64 */
+  ctx->Extensions.ARB_shader_ballot = devinfo->has_64bit_types;
   ctx->Extensions.ARB_ES3_2_compatibility = true;
}
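
The pattern in this patch is a per-device capability bit that gates several extensions at once. A minimal sketch follows, assuming simplified stand-ins for gen_device_info and the extension table; the struct and function names are hypothetical, and the outer conditions that surround the fp64 and vertex_attrib_64bit lines in the real function are not visible in the hunks above, so they are omitted here.

```c
#include <stdbool.h>

/* Hypothetical, reduced versions of the driver structures. */
struct device_info {
   int gen;
   bool has_64bit_types;
};

struct extensions {
   bool ARB_gpu_shader_fp64;
   bool ARB_vertex_attrib_64bit;
   bool ARB_gpu_shader_int64;
   bool ARB_shader_ballot;
};

static struct extensions init_extensions(const struct device_info *devinfo)
{
   struct extensions ext = { false, false, false, false };

   /* Extensions that need DF hardware types. */
   ext.ARB_gpu_shader_fp64 = devinfo->has_64bit_types;
   ext.ARB_vertex_attrib_64bit = devinfo->has_64bit_types;

   if (devinfo->gen >= 8) {
      /* Extensions that need Q/UQ hardware types. */
      ext.ARB_gpu_shader_int64 = devinfo->has_64bit_types;
      /* requires ARB_gpu_shader_int64 */
      ext.ARB_shader_ballot = devinfo->has_64bit_types;
   }
   return ext;
}
```

Keeping the decision in one device flag means that re-enabling the extensions later, once they are reimplemented without hardware 64-bit types, would not require touching each generation check.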
 


Mesa (master): intel/compiler: Disable Align16 tests on Gen11+

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: ff4b41dd1dffe81f70572c9183062cd36b0074dc
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=ff4b41dd1dffe81f70572c9183062cd36b0074dc

Author: Matt Turner 
Date:   Thu Feb  8 10:23:11 2018 -0800

intel/compiler: Disable Align16 tests on Gen11+

Align16 is no more.

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/test_eu_validate.cpp | 16 
 1 file changed, 16 insertions(+)

diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp
index cb2fcd3d40..f6c2b35625 100644
--- a/src/intel/compiler/test_eu_validate.cpp
+++ b/src/intel/compiler/test_eu_validate.cpp
@@ -374,6 +374,10 @@ TEST_P(validation_test, dst_horizontal_stride_0)
 
clear_instructions(p);
 
+   /* Align16 does not exist on Gen11+ */
+   if (devinfo.gen >= 11)
+  return;
+
brw_set_default_access_mode(p, BRW_ALIGN_16);
 
brw_ADD(p, g0, g0, g0);
@@ -421,6 +425,10 @@ TEST_P(validation_test, must_not_cross_grf_boundary_in_a_width)
 /* Destination Horizontal must be 1 in Align16 */
 TEST_P(validation_test, dst_hstride_on_align16_must_be_1)
 {
+   /* Align16 does not exist on Gen11+ */
+   if (devinfo.gen >= 11)
+  return;
+
brw_set_default_access_mode(p, BRW_ALIGN_16);
 
brw_ADD(p, g0, g0, g0);
@@ -439,6 +447,10 @@ TEST_P(validation_test, dst_hstride_on_align16_must_be_1)
 /* VertStride must be 0 or 4 in Align16 */
 TEST_P(validation_test, vstride_on_align16_must_be_0_or_4)
 {
+   /* Align16 does not exist on Gen11+ */
+   if (devinfo.gen >= 11)
+  return;
+
const struct {
   enum brw_vertical_stride vstride;
   bool expected_result;
@@ -1419,6 +1431,10 @@ TEST_P(validation_test, align16_64_bit_integer)
if (devinfo.gen < 8)
   return;
 
+   /* Align16 does not exist on Gen11+ */
+   if (devinfo.gen >= 11)
+  return;
+
brw_set_default_access_mode(p, BRW_ALIGN_16);
 
for (unsigned i = 0; i < sizeof(inst) / sizeof(inst[0]); i++) {


Mesa (master): intel/compiler/fs: Simplify ddx/ddy code generation

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: 62cfd4c6563dfcd950b703c4159faff21f36a19e
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=62cfd4c6563dfcd950b703c4159faff21f36a19e

Author: Matt Turner 
Date:   Thu Jun 15 17:20:29 2017 -0700

intel/compiler/fs: Simplify ddx/ddy code generation

The brw_reg() constructor just obfuscates things here, in my opinion.

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/brw_fs_generator.cpp | 63 +++--
 1 file changed, 21 insertions(+), 42 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index c49af89f2f..e6fb7c92d4 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1163,20 +1163,17 @@ fs_generator::generate_ddx(const fs_inst *inst,
   width = BRW_WIDTH_4;
}
 
-   struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
- src.negate, src.abs,
-BRW_REGISTER_TYPE_F,
-vstride,
-width,
-BRW_HORIZONTAL_STRIDE_0,
-BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
-   struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
- src.negate, src.abs,
-BRW_REGISTER_TYPE_F,
-vstride,
-width,
-BRW_HORIZONTAL_STRIDE_0,
-BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
+   struct brw_reg src0 = src;
+   struct brw_reg src1 = src;
+
+   src0.subnr   = sizeof(float);
+   src0.vstride = vstride;
+   src0.width   = width;
+   src0.hstride = BRW_HORIZONTAL_STRIDE_0;
+   src1.vstride = vstride;
+   src1.width   = width;
+   src1.hstride = BRW_HORIZONTAL_STRIDE_0;
+
brw_ADD(p, dst, src0, negate(src1));
 }
 
@@ -1190,40 +1187,22 @@ fs_generator::generate_ddy(const fs_inst *inst,
 {
if (inst->opcode == FS_OPCODE_DDY_FINE) {
   /* produce accurate derivatives */
-  struct brw_reg src0 = brw_reg(src.file, src.nr, 0,
-src.negate, src.abs,
-BRW_REGISTER_TYPE_F,
-BRW_VERTICAL_STRIDE_4,
-BRW_WIDTH_4,
-BRW_HORIZONTAL_STRIDE_1,
-BRW_SWIZZLE_XYXY, WRITEMASK_XYZW);
-  struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
-src.negate, src.abs,
-BRW_REGISTER_TYPE_F,
-BRW_VERTICAL_STRIDE_4,
-BRW_WIDTH_4,
-BRW_HORIZONTAL_STRIDE_1,
-BRW_SWIZZLE_ZWZW, WRITEMASK_XYZW);
+  struct brw_reg src0 = stride(src, 4, 4, 1);
+  struct brw_reg src1 = stride(src, 4, 4, 1);
+  src0.swizzle = BRW_SWIZZLE_XYXY;
+  src1.swizzle = BRW_SWIZZLE_ZWZW;
+
   brw_push_insn_state(p);
   brw_set_default_access_mode(p, BRW_ALIGN_16);
   brw_ADD(p, dst, negate(src0), src1);
   brw_pop_insn_state(p);
} else {
   /* replicate the derivative at the top-left pixel to other pixels */
-  struct brw_reg src0 = brw_reg(src.file, src.nr, 0,
-src.negate, src.abs,
-BRW_REGISTER_TYPE_F,
-BRW_VERTICAL_STRIDE_4,
-BRW_WIDTH_4,
-BRW_HORIZONTAL_STRIDE_0,
-BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
-  struct brw_reg src1 = brw_reg(src.file, src.nr, 2,
-src.negate, src.abs,
-BRW_REGISTER_TYPE_F,
-BRW_VERTICAL_STRIDE_4,
-BRW_WIDTH_4,
-BRW_HORIZONTAL_STRIDE_0,
-BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
+  struct brw_reg src0 = stride(src, 4, 4, 0);
+  struct brw_reg src1 = stride(src, 4, 4, 0);
+  src0.subnr = 0 * sizeof(float);
+  src1.subnr = 2 * sizeof(float);
+
   brw_ADD(p, dst, negate(src0), src1);
}
 }


Mesa (master): intel/compiler: Add Gen11+ native float type

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: 2cff3242109078999c57d5e6772418c09e835826
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=2cff3242109078999c57d5e6772418c09e835826

Author: Matt Turner 
Date:   Wed Jun 14 11:03:19 2017 -0700

intel/compiler: Add Gen11+ native float type

This new type exposes the additional precision offered by the
accumulator register and will be used in the next patch to implement the
functionality of the PLN instruction using a pair of MAD instructions.

One weird thing to note: align1 ternary instructions may only have an
accumulator in the dst or src1 normally, but when src0's type is :NF
the accumulator is read.

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/brw_disasm.c  |  7 +++
 src/intel/compiler/brw_eu_emit.c | 10 --
 src/intel/compiler/brw_eu_validate.c |  1 +
 src/intel/compiler/brw_reg_type.c|  8 
 src/intel/compiler/brw_reg_type.h|  2 ++
 src/intel/compiler/brw_shader.cpp|  6 ++
 6 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index 429ed78140..a9a108f8ac 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -1035,6 +1035,12 @@ src0_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst *ins
  reg_nr = brw_inst_3src_src0_reg_nr(devinfo, inst);
  subreg_nr = brw_inst_3src_a1_src0_subreg_nr(devinfo, inst);
  type = brw_inst_3src_a1_src0_type(devinfo, inst);
+  } else if (brw_inst_3src_a1_src0_type(devinfo, inst) ==
+ BRW_REGISTER_TYPE_NF) {
+ _file = BRW_ARCHITECTURE_REGISTER_FILE;
+ reg_nr = brw_inst_3src_src0_reg_nr(devinfo, inst);
+ subreg_nr = brw_inst_3src_a1_src0_subreg_nr(devinfo, inst);
+ type = brw_inst_3src_a1_src0_type(devinfo, inst);
   } else {
  _file = BRW_IMMEDIATE_VALUE;
  uint16_t imm_val = brw_inst_3src_a1_src0_imm(devinfo, inst);
@@ -1288,6 +1294,7 @@ imm(FILE *file, const struct gen_device_info *devinfo, enum brw_reg_type type,
case BRW_REGISTER_TYPE_HF:
   string(file, "Half Float IMM");
   break;
+   case BRW_REGISTER_TYPE_NF:
case BRW_REGISTER_TYPE_UB:
case BRW_REGISTER_TYPE_B:
   format(file, "*** invalid immediate type %d ", type);
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index c25d8d6eda..ec871e5aa7 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -771,7 +771,11 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct brw_reg dest,
 to_3src_align1_hstride(src2.hstride));
 
   brw_inst_set_3src_a1_src0_subreg_nr(devinfo, inst, src0.subnr);
-  brw_inst_set_3src_src0_reg_nr(devinfo, inst, src0.nr);
+  if (src0.type == BRW_REGISTER_TYPE_NF) {
+ brw_inst_set_3src_src0_reg_nr(devinfo, inst, BRW_ARF_ACCUMULATOR);
+  } else {
+ brw_inst_set_3src_src0_reg_nr(devinfo, inst, src0.nr);
+  }
   brw_inst_set_3src_src0_abs(devinfo, inst, src0.abs);
   brw_inst_set_3src_src0_negate(devinfo, inst, src0.negate);
 
@@ -790,7 +794,9 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct brw_reg dest,
   brw_inst_set_3src_src2_negate(devinfo, inst, src2.negate);
 
   assert(src0.file == BRW_GENERAL_REGISTER_FILE ||
- src0.file == BRW_IMMEDIATE_VALUE);
+ src0.file == BRW_IMMEDIATE_VALUE ||
+ (src0.file == BRW_ARCHITECTURE_REGISTER_FILE &&
+  src0.type == BRW_REGISTER_TYPE_NF));
   assert(src1.file == BRW_GENERAL_REGISTER_FILE ||
  src1.file == BRW_ARCHITECTURE_REGISTER_FILE);
   assert(src2.file == BRW_GENERAL_REGISTER_FILE ||
diff --git a/src/intel/compiler/brw_eu_validate.c b/src/intel/compiler/brw_eu_validate.c
index 6ee6b4ffbe..d3189d1ef5 100644
--- a/src/intel/compiler/brw_eu_validate.c
+++ b/src/intel/compiler/brw_eu_validate.c
@@ -277,6 +277,7 @@ static enum brw_reg_type
 execution_type_for_type(enum brw_reg_type type)
 {
switch (type) {
+   case BRW_REGISTER_TYPE_NF:
case BRW_REGISTER_TYPE_DF:
case BRW_REGISTER_TYPE_F:
case BRW_REGISTER_TYPE_HF:
diff --git a/src/intel/compiler/brw_reg_type.c b/src/intel/compiler/brw_reg_type.c
index c4f8eedeb4..3c82eb0a76 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -52,6 +52,7 @@ enum hw_reg_type {
GEN11_HW_REG_TYPE_HF = 8,
GEN11_HW_REG_TYPE_F  = 9,
GEN11_HW_REG_TYPE_DF = 10,
+   GEN11_HW_REG_TYPE_NF = 11,
 };
 
 enum hw_imm_type {
@@ -87,6 +88,8 @@ static const struct hw_type {
enum hw_reg_type reg_type;
enum hw_imm_type imm_type;
 } gen4_hw_type[] = {
+   [0 ... BRW_REGISTER_TYPE_LAST] = { INVALID, INVALID },
+
[BRW_REGISTER_TYPE_DF] = { GEN7_HW_REG_TYPE_DF, GEN8_HW_IMM_TYPE_DF },
[BRW_REGISTER_TYPE_F]  = { BRW_HW_REG_TYPE_F,

Mesa (master): intel/compiler: Mark line, pln, and lrp as removed on Gen11+

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: d5bf093cf9da323ce3ebb69c07834870441e0e38
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=d5bf093cf9da323ce3ebb69c07834870441e0e38

Author: Matt Turner 
Date:   Wed Jun 14 16:14:11 2017 -0700

intel/compiler: Mark line, pln, and lrp as removed on Gen11+

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/brw_eu.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c
index bc297a21b3..3646076a8e 100644
--- a/src/intel/compiler/brw_eu.c
+++ b/src/intel/compiler/brw_eu.c
@@ -384,7 +384,8 @@ enum gen {
GEN75 = (1 << 5),
GEN8  = (1 << 6),
GEN9  = (1 << 7),
-   GEN10  = (1 << 8),
+   GEN10 = (1 << 8),
+   GEN11 = (1 << 9),
GEN_ALL = ~0
 };
 
@@ -628,16 +629,16 @@ static const struct opcode_desc opcode_descs[128] = {
},
/* Reserved 88 */
[BRW_OPCODE_LINE] = {
-  .name = "line",.nsrc = 2, .ndst = 1, .gens = GEN_ALL,
+  .name = "line",.nsrc = 2, .ndst = 1, .gens = GEN_LE(GEN10),
},
[BRW_OPCODE_PLN] = {
-  .name = "pln", .nsrc = 2, .ndst = 1, .gens = GEN_GE(GEN45),
+  .name = "pln", .nsrc = 2, .ndst = 1, .gens = GEN_GE(GEN45) & GEN_LE(GEN10),
},
[BRW_OPCODE_MAD] = {
   .name = "mad", .nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN6),
},
[BRW_OPCODE_LRP] = {
-  .name = "lrp", .nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN6),
+  .name = "lrp", .nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN6) & GEN_LE(GEN10),
},
[93] = {
   .name = "madm",.nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN8),
@@ -662,6 +663,7 @@ gen_from_devinfo(const struct gen_device_info *devinfo)
case 8: return GEN8;
case 9: return GEN9;
case 10: return GEN10;
+   case 11: return GEN11;
default:
   unreachable("not reached");
}
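
The opcode table encodes the supported generations as a bitmask, so "removed on Gen11+" becomes an upper bound ANDed with the existing lower bound. A minimal sketch of how such range masks can work follows. The enum values from GEN75 upward match the diff; the lower values and the GEN_LT, GEN_GE, and GEN_LE definitions are assumptions written for this illustration and may differ from the real macros in brw_eu.c. The helper opcode_supported is hypothetical.

```c
/* One bit per generation. Values below GEN75 are assumed for illustration. */
enum gen {
   GEN4  = (1 << 0),
   GEN45 = (1 << 1),
   GEN5  = (1 << 2),
   GEN6  = (1 << 3),
   GEN7  = (1 << 4),
   GEN75 = (1 << 5),
   GEN8  = (1 << 6),
   GEN9  = (1 << 7),
   GEN10 = (1 << 8),
   GEN11 = (1 << 9),
   GEN_ALL = ~0
};

/* Assumed definitions: all bits strictly below, at-or-above, at-or-below. */
#define GEN_LT(gen) ((gen) - 1)
#define GEN_GE(gen) (~GEN_LT(gen))
#define GEN_LE(gen) (GEN_LT(gen) | (gen))

/* Hypothetical helper: is an opcode with mask `gens` valid on generation `g`? */
static int opcode_supported(int gens, enum gen g)
{
   return (gens & g) != 0;
}
```

Under these definitions, GEN_GE(GEN45) & GEN_LE(GEN10) selects bits 1 through 8, so pln is accepted from Gen4.5 up to Gen10 and rejected on Gen4 and Gen11, while mad with GEN_GE(GEN6) stays valid on Gen11.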


Mesa (master): intel/compiler/fs: Implement FS_OPCODE_LINTERP with MADs on Gen11+

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: 432674ce93ceee2abd7e0cc4171bc36a499d4c1f
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=432674ce93ceee2abd7e0cc4171bc36a499d4c1f

Author: Matt Turner 
Date:   Wed Jun 14 14:47:19 2017 -0700

intel/compiler/fs: Implement FS_OPCODE_LINTERP with MADs on Gen11+

The PLN instruction is no more. Its functionality is now implemented
using two MAD instructions with the new native-float type. Instead of

   pln(16) r20.0<1>:F r10.4<0;1,0>:F r4.0<8;8,1>:F

we now have

   mad(8) acc0<1>:NF r10.7<0;1,0>:F r4.0<8;8,1>:F r10.4<0;1,0>:F
   mad(8) r20.0<1>:F acc0<8;8,1>:NF r5.0<8;8,1>:F r10.5<0;1,0>:F
   mad(8) acc0<1>:NF r10.7<0;1,0>:F r6.0<8;8,1>:F r10.4<0;1,0>:F
   mad(8) r21.0<1>:F acc0<8;8,1>:NF r7.0<8;8,1>:F r10.5<0;1,0>:F

... and in the case of SIMD8 only the first pair of MAD instructions is
used.

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/brw_eu_emit.c|  2 +-
 src/intel/compiler/brw_fs_generator.cpp | 48 ++---
 2 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index ec871e5aa7..a96fe43556 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -968,7 +968,7 @@ ALU2(DP4)
 ALU2(DPH)
 ALU2(DP3)
 ALU2(DP2)
-ALU3F(MAD)
+ALU3(MAD)
 ALU3F(LRP)
 ALU1(BFREV)
 ALU3(BFE)
diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index 3abd7cf538..736b3b5fba 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -673,10 +673,52 @@ fs_generator::generate_linterp(fs_inst *inst,
struct brw_reg delta_x = src[0];
struct brw_reg delta_y = offset(src[0], inst->exec_size / 8);
struct brw_reg interp = src[1];
-   brw_inst *i[2];
+   brw_inst *i[4];
 
-   if (devinfo->has_pln &&
-   (devinfo->gen >= 7 || (delta_x.nr & 1) == 0)) {
+   if (devinfo->gen >= 11) {
+  struct brw_reg acc = retype(brw_acc_reg(8), BRW_REGISTER_TYPE_NF);
+  struct brw_reg dwP = suboffset(interp, 0);
+  struct brw_reg dwQ = suboffset(interp, 1);
+  struct brw_reg dwR = suboffset(interp, 3);
+
+  brw_set_default_exec_size(p, BRW_EXECUTE_8);
+
+  if (inst->exec_size == 8) {
+ i[0] = brw_MAD(p,acc, dwR, offset(delta_x, 0), dwP);
+ i[1] = brw_MAD(p, offset(dst, 0), acc, offset(delta_y, 0), dwQ);
+
+ brw_inst_set_cond_modifier(p->devinfo, i[1], inst->conditional_mod);
+
+ /* brw_set_default_saturate() is called before emitting instructions,
+  * so the saturate bit is set in each instruction, so we need to unset
+  * it on the first instruction of each pair.
+  */
+ brw_inst_set_saturate(p->devinfo, i[0], false);
+  } else {
+ brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
+ i[0] = brw_MAD(p,acc, dwR, offset(delta_x, 0), dwP);
+ i[1] = brw_MAD(p, offset(dst, 0), acc, offset(delta_x, 1), dwQ);
+
+ brw_set_default_compression_control(p, BRW_COMPRESSION_2NDHALF);
+ i[2] = brw_MAD(p,acc, dwR, offset(delta_y, 0), dwP);
+ i[3] = brw_MAD(p, offset(dst, 1), acc, offset(delta_y, 1), dwQ);
+
+ brw_set_default_compression_control(p, BRW_COMPRESSION_COMPRESSED);
+
+ brw_inst_set_cond_modifier(p->devinfo, i[1], inst->conditional_mod);
+ brw_inst_set_cond_modifier(p->devinfo, i[3], inst->conditional_mod);
+
+ /* brw_set_default_saturate() is called before emitting instructions,
+  * so the saturate bit is set in each instruction, so we need to unset
+  * it on the first instruction of each pair.
+  */
+ brw_inst_set_saturate(p->devinfo, i[0], false);
+ brw_inst_set_saturate(p->devinfo, i[2], false);
+  }
+
+  return true;
+   } else if (devinfo->has_pln &&
+  (devinfo->gen >= 7 || (delta_x.nr & 1) == 0)) {
   brw_PLN(p, dst, interp, delta_x);
 
   return false;

Mesa (master): intel/compiler: Add Gen11 register types

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: 58611ff913df74e7f790b0c572b983a992e25a17
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=58611ff913df74e7f790b0c572b983a992e25a17

Author: Matt Turner 
Date:   Fri Aug 25 09:50:29 2017 -0700

intel/compiler: Add Gen11 register types

The hardware register types' encodings have changed on Gen11. Good thing
we have that superfluous-looking brw_reg_type abstraction lying around!
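
A minimal sketch of why the abstraction helps, assuming a cut-down set of three types: the abstract type indexes a per-generation table, so callers never see the raw encoding. The enum and function names here are hypothetical; the encoding values are copied from the hw_reg_type enum in this patch.

```c
#include <assert.h>

/* Illustrative subset of the abstract register types (hypothetical names). */
enum reg_type_model { TYPE_DF, TYPE_HF, TYPE_B, TYPE_COUNT };

/* Hardware encodings before Gen11. */
static const unsigned pre_gen11_hw_type_model[TYPE_COUNT] = {
   [TYPE_DF] = 6,    /* GEN7_HW_REG_TYPE_DF */
   [TYPE_HF] = 10,   /* GEN8_HW_REG_TYPE_HF */
   [TYPE_B]  = 5,    /* BRW_HW_REG_TYPE_B */
};

/* Hardware encodings on Gen11. */
static const unsigned gen11_hw_type_model[TYPE_COUNT] = {
   [TYPE_DF] = 10,   /* GEN11_HW_REG_TYPE_DF */
   [TYPE_HF] = 8,    /* GEN11_HW_REG_TYPE_HF */
   [TYPE_B]  = 5,    /* GEN11_HW_REG_TYPE_B */
};

/* Pick the table by generation, then index it with the abstract type. */
static unsigned
reg_type_to_hw_model(int gen, enum reg_type_model type)
{
   assert(type < TYPE_COUNT);

   const unsigned *table =
      gen >= 11 ? gen11_hw_type_model : pre_gen11_hw_type_model;
   return table[type];
}
```

Note that the value 10 means HF before Gen11 and DF on Gen11, which is why decoding also has to select the table by generation.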

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/brw_reg_type.c | 73 ++-
 1 file changed, 65 insertions(+), 8 deletions(-)

diff --git a/src/intel/compiler/brw_reg_type.c b/src/intel/compiler/brw_reg_type.c
index b7fff0867f..c4f8eedeb4 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -40,6 +40,18 @@ enum hw_reg_type {
BRW_HW_REG_TYPE_B   = 5,
GEN7_HW_REG_TYPE_DF = 6,
GEN8_HW_REG_TYPE_HF = 10,
+
+   GEN11_HW_REG_TYPE_UD = 0,
+   GEN11_HW_REG_TYPE_D  = 1,
+   GEN11_HW_REG_TYPE_UW = 2,
+   GEN11_HW_REG_TYPE_W  = 3,
+   GEN11_HW_REG_TYPE_UB = 4,
+   GEN11_HW_REG_TYPE_B  = 5,
+   GEN11_HW_REG_TYPE_UQ = 6,
+   GEN11_HW_REG_TYPE_Q  = 7,
+   GEN11_HW_REG_TYPE_HF = 8,
+   GEN11_HW_REG_TYPE_F  = 9,
+   GEN11_HW_REG_TYPE_DF = 10,
 };
 
 enum hw_imm_type {
@@ -56,9 +68,22 @@ enum hw_imm_type {
BRW_HW_IMM_TYPE_V   = 6,
GEN8_HW_IMM_TYPE_DF = 10,
GEN8_HW_IMM_TYPE_HF = 11,
+
+   GEN11_HW_IMM_TYPE_UD = 0,
+   GEN11_HW_IMM_TYPE_D  = 1,
+   GEN11_HW_IMM_TYPE_UW = 2,
+   GEN11_HW_IMM_TYPE_W  = 3,
+   GEN11_HW_IMM_TYPE_UV = 4,
+   GEN11_HW_IMM_TYPE_V  = 5,
+   GEN11_HW_IMM_TYPE_UQ = 6,
+   GEN11_HW_IMM_TYPE_Q  = 7,
+   GEN11_HW_IMM_TYPE_HF = 8,
+   GEN11_HW_IMM_TYPE_F  = 9,
+   GEN11_HW_IMM_TYPE_DF = 10,
+   GEN11_HW_IMM_TYPE_VF = 11,
 };
 
-static const struct {
+static const struct hw_type {
enum hw_reg_type reg_type;
enum hw_imm_type imm_type;
 } gen4_hw_type[] = {
@@ -77,6 +102,22 @@ static const struct {
[BRW_REGISTER_TYPE_UB] = { BRW_HW_REG_TYPE_UB,  INVALID },
[BRW_REGISTER_TYPE_V]  = { INVALID, BRW_HW_IMM_TYPE_V   },
[BRW_REGISTER_TYPE_UV] = { INVALID, BRW_HW_IMM_TYPE_UV  },
+}, gen11_hw_type[] = {
+   [BRW_REGISTER_TYPE_DF] = { GEN11_HW_REG_TYPE_DF, GEN11_HW_IMM_TYPE_DF },
+   [BRW_REGISTER_TYPE_F]  = { GEN11_HW_REG_TYPE_F,  GEN11_HW_IMM_TYPE_F  },
+   [BRW_REGISTER_TYPE_HF] = { GEN11_HW_REG_TYPE_HF, GEN11_HW_IMM_TYPE_HF },
+   [BRW_REGISTER_TYPE_VF] = { INVALID,  GEN11_HW_IMM_TYPE_VF },
+
+   [BRW_REGISTER_TYPE_Q]  = { GEN11_HW_REG_TYPE_Q,  GEN11_HW_IMM_TYPE_Q  },
+   [BRW_REGISTER_TYPE_UQ] = { GEN11_HW_REG_TYPE_UQ, GEN11_HW_IMM_TYPE_UQ },
+   [BRW_REGISTER_TYPE_D]  = { GEN11_HW_REG_TYPE_D,  GEN11_HW_IMM_TYPE_D  },
+   [BRW_REGISTER_TYPE_UD] = { GEN11_HW_REG_TYPE_UD, GEN11_HW_IMM_TYPE_UD },
+   [BRW_REGISTER_TYPE_W]  = { GEN11_HW_REG_TYPE_W,  GEN11_HW_IMM_TYPE_W  },
+   [BRW_REGISTER_TYPE_UW] = { GEN11_HW_REG_TYPE_UW, GEN11_HW_IMM_TYPE_UW },
+   [BRW_REGISTER_TYPE_B]  = { GEN11_HW_REG_TYPE_B,  INVALID  },
+   [BRW_REGISTER_TYPE_UB] = { GEN11_HW_REG_TYPE_UB, INVALID  },
+   [BRW_REGISTER_TYPE_V]  = { INVALID,  GEN11_HW_IMM_TYPE_V  },
+   [BRW_REGISTER_TYPE_UV] = { INVALID,  GEN11_HW_IMM_TYPE_UV },
 };
 
 /* SNB adds 3-src instructions (MAD and LRP) that only operate on floats, so
@@ -147,14 +188,22 @@ brw_reg_type_to_hw_type(const struct gen_device_info *devinfo,
 enum brw_reg_file file,
 enum brw_reg_type type)
 {
-   assert(type < ARRAY_SIZE(gen4_hw_type));
+   const struct hw_type *table;
+
+   if (devinfo->gen >= 11) {
+  assert(type < ARRAY_SIZE(gen11_hw_type));
+  table = gen11_hw_type;
+   } else {
+  assert(type < ARRAY_SIZE(gen4_hw_type));
+  table = gen4_hw_type;
+   }
 
if (file == BRW_IMMEDIATE_VALUE) {
-  assert(gen4_hw_type[type].imm_type != (enum hw_imm_type)INVALID);
-  return gen4_hw_type[type].imm_type;
+  assert(table[type].imm_type != (enum hw_imm_type)INVALID);
+  return table[type].imm_type;
} else {
-  assert(gen4_hw_type[type].reg_type != (enum hw_reg_type)INVALID);
-  return gen4_hw_type[type].reg_type;
+  assert(table[type].reg_type != (enum hw_reg_type)INVALID);
+  return table[type].reg_type;
}
 }
 
@@ -167,15 +216,23 @@ enum brw_reg_type
 brw_hw_type_to_reg_type(const struct gen_device_info *devinfo,
 enum brw_reg_file file, unsigned hw_type)
 {
+   const struct hw_type *table;
+
+   if (devinfo->gen >= 11) {
+  table = gen11_hw_type;
+   } else {
+  table = gen4_hw_type;
+   }
+
if (file == BRW_IMMEDIATE_VALUE) {
   for (enum brw_reg_type i = 0; i <= BRW_REGISTER_TYPE_LAST; i++) {
- if (gen4_hw_type[i].imm_type == (enum hw_imm_type)hw_type) {
+ if (table[i].imm_type == (enum hw_imm_type)hw_type) {
 return i;
  }
   }

Mesa (master): intel/compiler/fs: Return multiple_instructions_emitted from generate_linterp

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: b5d8781e19559a8f9850f1a900ef93ffa3617faa
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=b5d8781e19559a8f9850f1a900ef93ffa3617faa

Author: Matt Turner 
Date:   Wed Jun 14 11:06:45 2017 -0700

intel/compiler/fs: Return multiple_instructions_emitted from generate_linterp

If multiple instructions are emitted, special handling of things like
conditional mod and NoDDClr/NoDDChk needs to be performed.
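
A small model of the idea, under the assumption that the generic code applies the conditional mod to the last emitted instruction only when a helper reports a single instruction. All names below are hypothetical; the real generic code is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>

struct model_inst {
   int cond_modifier;
};

/* Hypothetical stand-in for a generate_*() helper. It emits one or two
 * instructions into out[] and returns true when it emitted more than one,
 * in which case it has already placed the conditional mod itself.
 */
static bool
generate_model(struct model_inst *out, int *count, bool single_inst_path,
               int cmod)
{
   out[0].cond_modifier = 0;
   if (single_inst_path) {
      *count = 1;
      return false;
   }

   out[1].cond_modifier = cmod;   /* only the last instruction gets it */
   *count = 2;
   return true;
}

/* Caller: generic post-processing runs only for the single-instruction case. */
static void
emit_model(struct model_inst *out, int *count, bool single_inst_path, int cmod)
{
   bool multiple_instructions_emitted =
      generate_model(out, count, single_inst_path, cmod);

   if (!multiple_instructions_emitted) {
      assert(*count == 1);
      out[0].cond_modifier = cmod;
   }
}
```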

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/brw_fs.h |  2 +-
 src/intel/compiler/brw_fs_generator.cpp | 10 +++---
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
index 63373580ee..37106ccb28 100644
--- a/src/intel/compiler/brw_fs.h
+++ b/src/intel/compiler/brw_fs.h
@@ -409,7 +409,7 @@ private:
void generate_urb_write(fs_inst *inst, struct brw_reg payload);
void generate_cs_terminate(fs_inst *inst, struct brw_reg payload);
void generate_barrier(fs_inst *inst, struct brw_reg src);
-   void generate_linterp(fs_inst *inst, struct brw_reg dst,
+   bool generate_linterp(fs_inst *inst, struct brw_reg dst,
 struct brw_reg *src);
void generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src,
  struct brw_reg surface_index,
diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index bba917d755..3abd7cf538 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -646,9 +646,9 @@ fs_generator::generate_barrier(fs_inst *inst, struct brw_reg src)
brw_WAIT(p);
 }
 
-void
+bool
 fs_generator::generate_linterp(fs_inst *inst,
-struct brw_reg dst, struct brw_reg *src)
+   struct brw_reg dst, struct brw_reg *src)
 {
/* PLN reads:
 *  /   in SIMD16   \
@@ -678,6 +678,8 @@ fs_generator::generate_linterp(fs_inst *inst,
if (devinfo->has_pln &&
(devinfo->gen >= 7 || (delta_x.nr & 1) == 0)) {
   brw_PLN(p, dst, interp, delta_x);
+
+  return false;
} else {
   i[0] = brw_LINE(p, brw_null_reg(), interp, delta_x);
   i[1] = brw_MAC(p, dst, suboffset(interp, 1), delta_y);
@@ -689,6 +691,8 @@ fs_generator::generate_linterp(fs_inst *inst,
* the first instruction.
*/
   brw_inst_set_saturate(p->devinfo, i[0], false);
+
+  return true;
}
 }
 
@@ -1963,7 +1967,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
 brw_MOV(p, dst, src[0]);
 break;
   case FS_OPCODE_LINTERP:
-generate_linterp(inst, dst, src);
+multiple_instructions_emitted = generate_linterp(inst, dst, src);
 break;
   case FS_OPCODE_PIXEL_X:
  assert(src[0].type == BRW_REGISTER_TYPE_UW);

Mesa (master): intel/compiler/fs: Don't generate integer DWord multiply on Gen11

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: 3a584a15c0b1dd6c31a6520a0f749306f48d5782
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=3a584a15c0b1dd6c31a6520a0f749306f48d5782

Author: Matt Turner 
Date:   Mon Oct 23 10:44:39 2017 -0700

intel/compiler/fs: Don't generate integer DWord multiply on Gen11

Like CHV et al., Gen11 does not support 32x32 -> 32/64-bit integer
multiplies.
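
Sketched in isolation, the change replaces a chain of generation and platform checks with a single per-device flag. The struct and function below are hypothetical, trimmed-down stand-ins; the flag values in the comments follow the diff in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down device description. */
struct device_model {
   int gen;
   bool has_integer_dword_mul;
};

/* Returns true when a 32x32 integer multiply has to be lowered.
 * In the patch the flag is set to true in GEN8_FEATURES and overridden to
 * false for CHV, Gen9 LP and Gen11; devices that never set it read as false.
 */
static bool
needs_mul_lowering(const struct device_model *dev)
{
   assert(dev);
   return !dev->has_integer_dword_mul;
}
```

A new platform then only needs the right flag value in its device description rather than another condition in the lowering pass.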

Reviewed-by: Kenneth Graunke 

---

 src/intel/common/gen_device_info.c | 4 
 src/intel/common/gen_device_info.h | 1 +
 src/intel/compiler/brw_fs.cpp  | 6 +-
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/intel/common/gen_device_info.c b/src/intel/common/gen_device_info.c
index 7bed806b36..1773009d33 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -382,6 +382,7 @@ static const struct gen_device_info gen_device_info_hsw_gt3 = {
.has_llc = true, \
.has_sample_with_hiz = false,\
.has_pln = true, \
+   .has_integer_dword_mul = true,   \
.has_64bit_types = true, \
.supports_simd16_3src = true,\
.has_surface_tile_offset = true, \
@@ -464,6 +465,7 @@ static const struct gen_device_info gen_device_info_bdw_gt3 = {
 static const struct gen_device_info gen_device_info_chv = {
GEN8_FEATURES, .is_cherryview = 1, .gt = 1,
.has_llc = false,
+   .has_integer_dword_mul = false,
.num_slices = 1,
.num_subslices = { 2, },
.num_thread_per_eu = 7,
@@ -514,6 +516,7 @@ static const struct gen_device_info gen_device_info_chv = {
 #define GEN9_LP_FEATURES   \
GEN8_FEATURES,  \
GEN9_HW_INFO,   \
+   .has_integer_dword_mul = false, \
.gt = 1,\
.has_llc = false,   \
.has_sample_with_hiz = true,\
@@ -818,6 +821,7 @@ static const struct gen_device_info gen_device_info_cnl_5x8 = {
GEN8_FEATURES, \
GEN11_HW_INFO, \
.has_64bit_types = false,  \
+   .has_integer_dword_mul = false,\
.gt = _gt, .num_slices = _slices, .l3_banks = _l3, \
.num_subslices = _subslices
 
diff --git a/src/intel/common/gen_device_info.h b/src/intel/common/gen_device_info.h
index 9b635ff178..b8044d0003 100644
--- a/src/intel/common/gen_device_info.h
+++ b/src/intel/common/gen_device_info.h
@@ -60,6 +60,7 @@ struct gen_device_info
 
bool has_pln;
bool has_64bit_types;
+   bool has_integer_dword_mul;
bool has_compr4;
bool has_surface_tile_offset;
bool supports_simd16_3src;
diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index bed632d21b..113f62c46c 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -3549,11 +3549,7 @@ fs_visitor::lower_integer_multiplication()
   inst->dst.type != BRW_REGISTER_TYPE_UD))
 continue;
 
- /* Gen8's MUL instruction can do a 32-bit x 32-bit -> 32-bit
-  * operation directly, but CHV/BXT cannot.
-  */
- if (devinfo->gen >= 8 &&
- !devinfo->is_cherryview && !gen_device_info_is_9lp(devinfo))
+ if (devinfo->has_integer_dword_mul)
 continue;
 
  if (inst->src[1].file == IMM &&

Mesa (master): intel: Add a preliminary device for Ice Lake

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: 5ac804bd9accac58a176ae102dd0de52aaec6eb2
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=5ac804bd9accac58a176ae102dd0de52aaec6eb2

Author: Anuj Phogat 
Date:   Tue Mar 14 14:43:34 2017 -0700

intel: Add a preliminary device for Ice Lake

Reviewed-by: Kenneth Graunke 
Signed-off-by: Anuj Phogat 

---

 src/intel/common/gen_device_info.c | 57 +-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/src/intel/common/gen_device_info.c b/src/intel/common/gen_device_info.c
index ef0ae4ce8c..b17d22e5f8 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -789,6 +789,50 @@ static const struct gen_device_info gen_device_info_cnl_5x8 = {
.is_cannonlake = true,
 };
 
+#define GEN11_HW_INFO   \
+   .gen = 11,   \
+   .has_pln = false,\
+   .max_vs_threads = 364,   \
+   .max_gs_threads = 224,   \
+   .max_tcs_threads = 224,  \
+   .max_tes_threads = 364,  \
+   .max_cs_threads = 56,\
+   .urb = { \
+  .size = 1024, \
+  .min_entries = {  \
+ [MESA_SHADER_VERTEX]= 64,  \
+ [MESA_SHADER_TESS_EVAL] = 34,  \
+  },\
+  .max_entries = {  \
+ [MESA_SHADER_VERTEX]= 2384,\
+ [MESA_SHADER_TESS_CTRL] = 1032,\
+ [MESA_SHADER_TESS_EVAL] = 2384,\
+ [MESA_SHADER_GEOMETRY]  = 1032,\
+  },\
+   }
+
+#define GEN11_FEATURES(_gt, _slices, _subslices, _l3) \
+   GEN8_FEATURES, \
+   GEN11_HW_INFO, \
+   .gt = _gt, .num_slices = _slices, .l3_banks = _l3, \
+   .num_subslices = _subslices
+
+static const struct gen_device_info gen_device_info_icl_8x8 = {
+   GEN11_FEATURES(2, 1, subslices(8), 8),
+};
+
+static const struct gen_device_info gen_device_info_icl_6x8 = {
+   GEN11_FEATURES(1, 1, subslices(6), 6),
+};
+
+static const struct gen_device_info gen_device_info_icl_4x8 = {
+   GEN11_FEATURES(1, 1, subslices(4), 6),
+};
+
+static const struct gen_device_info gen_device_info_icl_1x8 = {
+   GEN11_FEATURES(1, 1, subslices(1), 6),
+};
+
 bool
 gen_get_device_info(int devid, struct gen_device_info *devinfo)
 {
@@ -815,10 +859,21 @@ gen_get_device_info(int devid, struct gen_device_info *devinfo)
 * Extra padding can be necessary depending how the thread IDs are
 * calculated for a particular shader stage.
 */
-   if (devinfo->gen >= 9) {
+
+   switch(devinfo->gen) {
+   case 9:
+   case 10:
   devinfo->max_wm_threads = 64 /* threads-per-PSD */
   * devinfo->num_slices
   * 4; /* effective subslices per slice */
+  break;
+   case 11:
+  devinfo->max_wm_threads = 128 /* threads-per-PSD */
+  * devinfo->num_slices
+  * 8; /* subslices per slice */
+  break;
+   default:
+  break;
}
 
assert(devinfo->num_slices <= ARRAY_SIZE(devinfo->num_subslices));
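
The arithmetic in the switch above can be checked with a standalone helper. This is a sketch with a hypothetical function name; returning 0 for other generations is a convention of the sketch only, since the real code leaves max_wm_threads untouched in the default case.

```c
#include <assert.h>

/* Hypothetical helper mirroring the switch in the hunk above. */
static unsigned
max_wm_threads_model(int gen, unsigned num_slices)
{
   assert(num_slices > 0);

   switch (gen) {
   case 9:
   case 10:
      return 64 /* threads-per-PSD */
             * num_slices
             * 4; /* effective subslices per slice */
   case 11:
      return 128 /* threads-per-PSD */
             * num_slices
             * 8; /* subslices per slice */
   default:
      return 0; /* sketch-only marker for "value left unchanged" */
   }
}
```

For a single slice this gives 256 threads on Gen9/Gen10 and 1024 on Gen11.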

Mesa (master): intel/compiler: Lower flrp32 on Gen11+

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: 89fe5190a256ee0939061c4c264e9156256d16e8
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=89fe5190a256ee0939061c4c264e9156256d16e8

Author: Matt Turner 
Date:   Wed Jun 14 16:20:41 2017 -0700

intel/compiler: Lower flrp32 on Gen11+

The LRP instruction is no more.
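
For reference, a scalar sketch of what lowering a linear interpolation means: the single LRP is replaced by ordinary multiplies and adds. The expansion shown is one possible form, and the function names plus the assumption that x and y are the endpoints and a is the weight are illustrative. The operand order in lrp_model follows the comment in the builder diff of this patch (op1 * op0 + op2 * (1 - op0)).

```c
#include <assert.h>

/* One possible lowered form of flrp: x * (1 - a) + y * a. */
static float
flrp_lowered(float x, float y, float a)
{
   return x * (1.0f - a) + y * a;
}

/* Model of the hardware LRP operand order quoted in the patch. */
static float
lrp_model(float op0, float op1, float op2)
{
   return op1 * op0 + op2 * (1.0f - op0);
}

/* Usage check: with a = 0.25 the result lies a quarter of the way from x
 * to y, and LRP called as (a, y, x) agrees with the lowered form.
 */
static void
flrp_selfcheck(void)
{
   assert(flrp_lowered(0.0f, 10.0f, 0.25f) == 2.5f);
   assert(lrp_model(0.25f, 10.0f, 0.0f) == flrp_lowered(0.0f, 10.0f, 0.25f));
}
```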

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/brw_compiler.c   | 35 +
 src/intel/compiler/brw_fs_builder.h |  2 +-
 src/intel/compiler/brw_fs_generator.cpp |  2 +-
 src/intel/compiler/brw_vec4_builder.h   |  2 +-
 src/intel/compiler/brw_vec4_visitor.cpp |  2 +-
 5 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/src/intel/compiler/brw_compiler.c b/src/intel/compiler/brw_compiler.c
index bb9df5e701..34be3b705f 100644
--- a/src/intel/compiler/brw_compiler.c
+++ b/src/intel/compiler/brw_compiler.c
@@ -46,20 +46,28 @@
.use_interpolated_input_intrinsics = true, \
.vertex_id_zero_based = true
 
+#define COMMON_SCALAR_OPTIONS \
+   .lower_pack_half_2x16 = true,  \
+   .lower_pack_snorm_2x16 = true, \
+   .lower_pack_snorm_4x8 = true,  \
+   .lower_pack_unorm_2x16 = true, \
+   .lower_pack_unorm_4x8 = true,  \
+   .lower_unpack_half_2x16 = true,\
+   .lower_unpack_snorm_2x16 = true,   \
+   .lower_unpack_snorm_4x8 = true,\
+   .lower_unpack_unorm_2x16 = true,   \
+   .lower_unpack_unorm_4x8 = true,\
+   .max_unroll_iterations = 32
+
 static const struct nir_shader_compiler_options scalar_nir_options = {
COMMON_OPTIONS,
-   .lower_pack_half_2x16 = true,
-   .lower_pack_snorm_2x16 = true,
-   .lower_pack_snorm_4x8 = true,
-   .lower_pack_unorm_2x16 = true,
-   .lower_pack_unorm_4x8 = true,
-   .lower_unpack_half_2x16 = true,
-   .lower_unpack_snorm_2x16 = true,
-   .lower_unpack_snorm_4x8 = true,
-   .lower_unpack_unorm_2x16 = true,
-   .lower_unpack_unorm_4x8 = true,
-   .vs_inputs_dual_locations = true,
-   .max_unroll_iterations = 32,
+   COMMON_SCALAR_OPTIONS,
+};
+
+static const struct nir_shader_compiler_options scalar_nir_options_gen11 = {
+   COMMON_OPTIONS,
+   COMMON_SCALAR_OPTIONS,
+   .lower_flrp32 = true,
 };
 
 static const struct nir_shader_compiler_options vector_nir_options = {
@@ -149,7 +157,8 @@ brw_compiler_create(void *mem_ctx, const struct gen_device_info *devinfo)
   compiler->glsl_compiler_options[i].OptimizeForAOS = !is_scalar;
 
   if (is_scalar) {
- compiler->glsl_compiler_options[i].NirOptions = &scalar_nir_options;
+ compiler->glsl_compiler_options[i].NirOptions =
+devinfo->gen < 11 ? &scalar_nir_options : &scalar_nir_options_gen11;
   } else {
  compiler->glsl_compiler_options[i].NirOptions =
 devinfo->gen < 6 ? &vector_nir_options : &vector_nir_options_gen6;
diff --git a/src/intel/compiler/brw_fs_builder.h b/src/intel/compiler/brw_fs_builder.h
index 87394bc17b..874272b7af 100644
--- a/src/intel/compiler/brw_fs_builder.h
+++ b/src/intel/compiler/brw_fs_builder.h
@@ -540,7 +540,7 @@ namespace brw {
   LRP(const dst_reg , const src_reg , const src_reg ,
   const src_reg ) const
   {
- if (shader->devinfo->gen >= 6) {
+ if (shader->devinfo->gen >= 6 && shader->devinfo->gen <= 10) {
 /* The LRP instruction actually does op1 * op0 + op2 * (1 - op0), so
  * we need to reorder the operands.
  */
diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index 0dc0a695e4..b59c09f46e 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1826,7 +1826,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
 break;
 
   case BRW_OPCODE_LRP:
- assert(devinfo->gen >= 6);
+ assert(devinfo->gen >= 6 && devinfo->gen <= 10);
  if (devinfo->gen < 10)
 brw_set_default_access_mode(p, BRW_ALIGN_16);
  brw_LRP(p, dst, src[0], src[1], src[2]);
diff --git a/src/intel/compiler/brw_vec4_builder.h b/src/intel/compiler/brw_vec4_builder.h
index 4c3efe8457..5c880c19f5 100644
--- a/src/intel/compiler/brw_vec4_builder.h
+++ b/src/intel/compiler/brw_vec4_builder.h
@@ -501,7 +501,7 @@ namespace brw {
   LRP(const dst_reg , const src_reg , const src_reg ,
   const src_reg ) const
   {
- if (shader->devinfo->gen >= 6) {
+ if (shader->devinfo->gen >= 6 && shader->devinfo->gen <= 10) {
 /* The LRP

Mesa (master): intel: Add icl pci id for INTEL_DEVID_OVERRIDE

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: 5e42103f3be5cfaaa374442e009c101403c143bd
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=5e42103f3be5cfaaa374442e009c101403c143bd

Author: Anuj Phogat 
Date:   Wed May 10 15:26:51 2017 -0700

intel: Add icl pci id for INTEL_DEVID_OVERRIDE

Reviewed-by: Matt Turner 
Signed-off-by: Anuj Phogat 

---

 src/intel/common/gen_device_info.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/common/gen_device_info.c b/src/intel/common/gen_device_info.c
index b17d22e5f8..11a4480ebf 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -56,6 +56,7 @@ gen_device_name_to_pci_device_id(const char *name)
   { "kbl", 0x5912 },
   { "glk", 0x3185 },
   { "cnl", 0x5a52 },
+  { "icl", 0x8a52 },
};
 
for (unsigned i = 0; i < ARRAY_SIZE(name_map); i++) {
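
A standalone sketch of the name lookup this table feeds, using only the entries visible in the hunk above. The struct and function names are hypothetical, and returning -1 for an unknown name is an assumption of the sketch.

```c
#include <assert.h>
#include <string.h>

struct name_map_entry_model {
   const char *name;
   int pci_id;
};

/* Entries visible in the hunk above; the real table has more. */
static const struct name_map_entry_model name_map_model[] = {
   { "kbl", 0x5912 },
   { "glk", 0x3185 },
   { "cnl", 0x5a52 },
   { "icl", 0x8a52 },
};

/* Linear search by name; -1 signals "unknown" in this sketch. */
static int
name_to_pci_id_model(const char *name)
{
   assert(name);

   size_t n = sizeof(name_map_model) / sizeof(name_map_model[0]);
   for (size_t i = 0; i < n; i++) {
      if (strcmp(name_map_model[i].name, name) == 0)
         return name_map_model[i].pci_id;
   }
   return -1;
}
```

With the new entry, the short name "icl" resolves to device ID 0x8a52.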

Mesa (master): intel/compiler/fs: Pass fs_inst to generate_ddx/ddy instead of opcode

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: bed0267ff64d923feff01ab9144a8f8283700d63
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=bed0267ff64d923feff01ab9144a8f8283700d63

Author: Matt Turner 
Date:   Thu Jun 15 15:41:40 2017 -0700

intel/compiler/fs: Pass fs_inst to generate_ddx/ddy instead of opcode

In a future patch, generate_ddy will want to inspect inst->exec_size.
Change generate_ddx as well for consistency.

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/brw_fs.h |  6 --
 src/intel/compiler/brw_fs_generator.cpp | 12 ++--
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
index 37106ccb28..76ad76e08b 100644
--- a/src/intel/compiler/brw_fs.h
+++ b/src/intel/compiler/brw_fs.h
@@ -417,8 +417,10 @@ private:
void generate_get_buffer_size(fs_inst *inst, struct brw_reg dst,
  struct brw_reg src,
  struct brw_reg surf_index);
-   void generate_ddx(enum opcode op, struct brw_reg dst, struct brw_reg src);
-   void generate_ddy(enum opcode op, struct brw_reg dst, struct brw_reg src);
+   void generate_ddx(const fs_inst *inst,
+ struct brw_reg dst, struct brw_reg src);
+   void generate_ddy(const fs_inst *inst,
+ struct brw_reg dst, struct brw_reg src);
void generate_scratch_write(fs_inst *inst, struct brw_reg src);
void generate_scratch_read(fs_inst *inst, struct brw_reg dst);
void generate_scratch_read_gen7(fs_inst *inst, struct brw_reg dst);
diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index 736b3b5fba..c49af89f2f 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1148,12 +1148,12 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src
  * appropriate swizzling.
  */
 void
-fs_generator::generate_ddx(enum opcode opcode,
+fs_generator::generate_ddx(const fs_inst *inst,
struct brw_reg dst, struct brw_reg src)
 {
unsigned vstride, width;
 
-   if (opcode == FS_OPCODE_DDX_FINE) {
+   if (inst->opcode == FS_OPCODE_DDX_FINE) {
   /* produce accurate derivatives */
   vstride = BRW_VERTICAL_STRIDE_2;
   width = BRW_WIDTH_2;
@@ -1185,10 +1185,10 @@ fs_generator::generate_ddx(enum opcode opcode,
  * left.
  */
 void
-fs_generator::generate_ddy(enum opcode opcode,
+fs_generator::generate_ddy(const fs_inst *inst,
struct brw_reg dst, struct brw_reg src)
 {
-   if (opcode == FS_OPCODE_DDY_FINE) {
+   if (inst->opcode == FS_OPCODE_DDY_FINE) {
   /* produce accurate derivatives */
   struct brw_reg src0 = brw_reg(src.file, src.nr, 0,
 src.negate, src.abs,
@@ -2044,11 +2044,11 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
 break;
   case FS_OPCODE_DDX_COARSE:
   case FS_OPCODE_DDX_FINE:
- generate_ddx(inst->opcode, dst, src[0]);
+ generate_ddx(inst, dst, src[0]);
  break;
   case FS_OPCODE_DDY_COARSE:
   case FS_OPCODE_DDY_FINE:
- generate_ddy(inst->opcode, dst, src[0]);
+ generate_ddy(inst, dst, src[0]);
 break;
 
   case SHADER_OPCODE_GEN4_SCRATCH_WRITE:

Mesa (master): intel/compiler/fs: Fix application of cmod and saturate to LINE/MAC pair

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: b1afdf9fc121df7e2e757fb9cf0d2c1f37a408ba
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=b1afdf9fc121df7e2e757fb9cf0d2c1f37a408ba

Author: Matt Turner 
Date:   Wed Jun 14 14:47:19 2017 -0700

intel/compiler/fs: Fix application of cmod and saturate to LINE/MAC pair

This isn't technically broken, but the next patch will make this
function report whether it generated multiple instructions, and that
information will be used to disable the application of conditional mod
by the generic code.
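
Modeled in isolation, the handling for a two-instruction sequence looks roughly like this: both instructions inherit the default saturate bit when emitted, the conditional mod goes on the second one, and saturate is cleared on the first. The struct and helpers are hypothetical stand-ins for brw_inst and the emit functions.

```c
#include <assert.h>
#include <stdbool.h>

struct inst_bits_model {
   bool saturate;
   int cond_modifier;
};

/* Every emitted instruction picks up the current default saturate bit. */
static struct inst_bits_model
emit_with_defaults(bool default_saturate)
{
   struct inst_bits_model inst = { default_saturate, 0 };
   return inst;
}

/* pair[0] models the LINE, pair[1] the MAC that writes the destination. */
static void
emit_line_mac_pair_model(struct inst_bits_model pair[2],
                         bool default_saturate, int cmod)
{
   assert(pair);

   pair[0] = emit_with_defaults(default_saturate);
   pair[1] = emit_with_defaults(default_saturate);

   pair[1].cond_modifier = cmod;   /* applies to the final result only */
   pair[0].saturate = false;       /* intermediate value must not clamp */
}
```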

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/brw_fs_generator.cpp | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index cd5be054f6..bba917d755 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -673,13 +673,22 @@ fs_generator::generate_linterp(fs_inst *inst,
struct brw_reg delta_x = src[0];
struct brw_reg delta_y = offset(src[0], inst->exec_size / 8);
struct brw_reg interp = src[1];
+   brw_inst *i[2];
 
if (devinfo->has_pln &&
(devinfo->gen >= 7 || (delta_x.nr & 1) == 0)) {
   brw_PLN(p, dst, interp, delta_x);
} else {
-  brw_LINE(p, brw_null_reg(), interp, delta_x);
-  brw_MAC(p, dst, suboffset(interp, 1), delta_y);
+  i[0] = brw_LINE(p, brw_null_reg(), interp, delta_x);
+  i[1] = brw_MAC(p, dst, suboffset(interp, 1), delta_y);
+
+  brw_inst_set_cond_modifier(p->devinfo, i[1], inst->conditional_mod);
+
+  /* brw_set_default_saturate() is called before emitting instructions, so
+   * the saturate bit is set in each instruction, so we need to unset it on
+   * the first instruction.
+   */
+  brw_inst_set_saturate(p->devinfo, i[0], false);
}
 }
 

Mesa (master): intel/compiler: Add instruction compaction support on Gen11

2018-02-28 Thread Matt Turner

Module: Mesa
Branch: master
Commit: c31d77ac22c10f23704a98fe955ce22e0839cfe2
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=c31d77ac22c10f23704a98fe955ce22e0839cfe2

Author: Matt Turner 
Date:   Wed Jun 14 16:43:05 2017 -0700

intel/compiler: Add instruction compaction support on Gen11

Gen11 only differs from SKL+ in that it uses a new datatype index table.
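
A hedged sketch of how such a table is typically consumed, assuming compaction replaces a field's full bit pattern with its index in a 32-entry table when an exact match exists. Both helpers and their names are hypothetical; only the per-generation table selection reflects this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Search a 32-entry table for an exact match; returns the index (0..31)
 * or -1 when the pattern is absent and the field cannot be compacted.
 */
static int
find_compact_index_model(const uint32_t *table, uint32_t bits)
{
   for (int i = 0; i < 32; i++) {
      if (table[i] == bits)
         return i;
   }
   return -1;
}

/* Gen11 swaps only the datatype table; Gen8 to Gen10 share the Gen8 one. */
static const uint32_t *
select_datatype_table_model(int gen, const uint32_t *gen8_table,
                            const uint32_t *gen11_table)
{
   assert(gen >= 8);
   return gen == 11 ? gen11_table : gen8_table;
}
```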

Reviewed-by: Kenneth Graunke 

---

 src/intel/compiler/brw_eu_compact.c | 42 +
 1 file changed, 42 insertions(+)

diff --git a/src/intel/compiler/brw_eu_compact.c b/src/intel/compiler/brw_eu_compact.c
index 8d33e2adff..ae14ef10ec 100644
--- a/src/intel/compiler/brw_eu_compact.c
+++ b/src/intel/compiler/brw_eu_compact.c
@@ -637,6 +637,41 @@ static const uint16_t gen8_src_index_table[32] = {
0b010110001000
 };
 
+static const uint32_t gen11_datatype_table[32] = {
+   0b00101,
+   0b001000100,
+   0b001000101,
+   0b001001101,
+   0b0010101100101,
+   0b0010010100101,
+   0b0010010010101,
+   0b00100100101000101,
+   0b00100100101100101,
+   0b001010101,
+   0b001110100,
+   0b001110101,
+   0b001000101000101000101,
+   0b001000111000101000100,
+   0b001000111000101000101,
+   0b001100100100101100101,
+   0b001100101100100100101,
+   0b001100101100101100100,
+   0b001100101100101100101,
+   0b00110000101100100,
+   0b001001100,
+   0b0010001100101,
+   0b0010101000101,
+   0b001010100,
+   0b001000101000101000100,
+   0b00100011100010100,
+   0b00100100100101001,
+   0b00110100101100101,
+   0b00110000101100101,
+   0b00100001101001100,
+   0b001001001001001001000,
+   0b001001011001001001000,
+};
+
 /* This is actually the control index table for Cherryview (26 bits), but the
  * only difference from Broadwell (24 bits) is that it has two extra 0-bits at
  * the start.
@@ -1450,8 +1485,15 @@ brw_init_compaction_tables(const struct gen_device_info *devinfo)
assert(gen8_datatype_table[ARRAY_SIZE(gen8_datatype_table) - 1] != 0);
assert(gen8_subreg_table[ARRAY_SIZE(gen8_subreg_table) - 1] != 0);
assert(gen8_src_index_table[ARRAY_SIZE(gen8_src_index_table) - 1] != 0);
+   assert(gen11_datatype_table[ARRAY_SIZE(gen11_datatype_table) - 1] != 0);
 
switch (devinfo->gen) {
+   case 11:
+  control_index_table = gen8_control_index_table;
+  datatype_table = gen11_datatype_table;
+  subreg_table = gen8_subreg_table;
+  src_index_table = gen8_src_index_table;
+  break;
case 10:
case 9:
case 8:

Mesa (master): anv: remove anv_gem_set_context_priority helper

2018-02-28 Thread Tapani Pälli

Module: Mesa
Branch: master
Commit: 0c983b9094bb544456da5efd0960fae00ce73358
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=0c983b9094bb544456da5efd0960fae00ce73358

Author: Tapani Pälli 
Date:   Wed Feb 28 18:54:24 2018 +0200

anv: remove anv_gem_set_context_priority helper

anv_gem_set_context_param is to be used directly instead!

Fixes: 6d8ab53303 "anv: implement VK_EXT_global_priority extension"
Signed-off-by: Tapani Pälli 
Reviewed-by: Jason Ekstrand 

---

 src/intel/vulkan/anv_device.c  | 5 +++--
 src/intel/vulkan/anv_gem.c | 9 -
 src/intel/vulkan/anv_private.h | 1 -
 3 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index f314d7667d..56c0c5fa9f 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1433,8 +1433,9 @@ VkResult anv_CreateDevice(
 * is returned.
 */
if (physical_device->has_context_priority) {
-  int err =
- anv_gem_set_context_priority(device, vk_priority_to_gen(priority));
+  int err = anv_gem_set_context_param(device->fd, device->context_id,
+  I915_CONTEXT_PARAM_PRIORITY,
+  vk_priority_to_gen(priority));
   if (err != 0 && priority > VK_QUEUE_GLOBAL_PRIORITY_MEDIUM_EXT) {
  result = vk_error(VK_ERROR_NOT_PERMITTED_EXT);
  goto fail_fd;
diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index 93072c7d3b..2a8f8b14b7 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -303,15 +303,6 @@ close_and_return:
return swizzled;
 }
 
-int
-anv_gem_set_context_priority(struct anv_device *device,
- int priority)
-{
-   return anv_gem_set_context_param(device->fd, device->context_id,
-I915_CONTEXT_PARAM_PRIORITY,
-priority);
-}
-
 bool
 anv_gem_has_context_priority(int fd)
 {
diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 3a4a80d869..a6863f5532 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -925,7 +925,6 @@ int anv_gem_set_tiling(struct anv_device *device, uint32_t gem_handle,
uint32_t stride, uint32_t tiling);
 int anv_gem_create_context(struct anv_device *device);
 bool anv_gem_has_context_priority(int fd);
-int anv_gem_set_context_priority(struct anv_device *device, int priority);
 int anv_gem_destroy_context(struct anv_device *device, int context);
 int anv_gem_set_context_param(int fd, int context, uint32_t param,
   uint64_t value);

Mesa (master): swr/rast: Faster frustum prim culling

2018-02-28 Thread George Kyriazis

Module: Mesa
Branch: master
Commit: 7e813f62149b9141d2f8970d35ab7ffbd0e0637c
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=7e813f62149b9141d2f8970d35ab7ffbd0e0637c

Author: George Kyriazis 
Date:   Wed Feb  7 12:24:23 2018 -0600

swr/rast: Faster frustum prim culling

Fix clipper validMask setting. We don't need to run frustum rejected
primitives through the clipper.  Perform frustum culling with only
frustum clip codes. Guardband clip codes cannot be used because they
overlap frustum codes.
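
A scalar illustration of the masking step, under an assumed bit layout in which every guardband code carries one shared flag bit. That layout is a guess made to show the overlap problem; the real SWR_CLIPCODES values are not part of this patch, and all names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: six distinct frustum bits, guardband codes share a flag. */
enum {
   M_FRUSTUM_LEFT    = 0x0100,
   M_FRUSTUM_TOP     = 0x0200,
   M_FRUSTUM_RIGHT   = 0x0400,
   M_FRUSTUM_BOTTOM  = 0x0800,
   M_FRUSTUM_NEAR    = 0x1000,
   M_FRUSTUM_FAR     = 0x2000,
   M_GUARDBAND_FLAG  = 0x8000,
   M_GUARDBAND_LEFT  = M_GUARDBAND_FLAG | 0x1,
   M_GUARDBAND_TOP   = M_GUARDBAND_FLAG | 0x2,
   M_GUARDBAND_RIGHT = M_GUARDBAND_FLAG | 0x4,
};

#define M_FRUSTUM_CLIP_MASK \
   (M_FRUSTUM_NEAR | M_FRUSTUM_FAR | M_FRUSTUM_LEFT | \
    M_FRUSTUM_RIGHT | M_FRUSTUM_TOP | M_FRUSTUM_BOTTOM)

/* AND of the three vertices' clip codes. */
static uint32_t
clip_code_intersection_model(const uint32_t codes[3])
{
   assert(codes);
   return codes[0] & codes[1] & codes[2];
}

/* A triangle is culled only if all vertices share a frustum plane bit. */
static bool
is_frustum_culled_model(const uint32_t codes[3])
{
   return (clip_code_intersection_model(codes) & M_FRUSTUM_CLIP_MASK) != 0;
}
```

With this layout, three vertices outside three different guardband planes still share the flag bit in the raw intersection, so culling on the unmasked value would reject a triangle that may be visible.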

Reviewed-By: Bruce Cherniak 

---

 src/gallium/drivers/swr/rasterizer/core/clip.h | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/core/clip.h 
b/src/gallium/drivers/swr/rasterizer/core/clip.h
index 8d2590a498..0f8399c742 100644
--- a/src/gallium/drivers/swr/rasterizer/core/clip.h
+++ b/src/gallium/drivers/swr/rasterizer/core/clip.h
@@ -60,6 +60,7 @@ enum SWR_CLIPCODES
 };
 
 #define GUARDBAND_CLIP_MASK 
(FRUSTUM_NEAR|FRUSTUM_FAR|GUARDBAND_LEFT|GUARDBAND_TOP|GUARDBAND_RIGHT|GUARDBAND_BOTTOM|NEGW)
+#define FRUSTUM_CLIP_MASK 
(FRUSTUM_NEAR|FRUSTUM_FAR|FRUSTUM_LEFT|FRUSTUM_RIGHT|FRUSTUM_TOP|FRUSTUM_BOTTOM)
 
 template
 void ComputeClipCodes(const API_STATE , const Vec4 , 
Float , Integer const )
@@ -708,15 +709,18 @@ public:
 primMask &= ~ComputeUserClipCullMask(pa, prim);
 }
 
-// cull prims outside view frustum
 Float clipIntersection = ComputeClipCodeIntersection();
+// Mask out non-frustum codes
+clipIntersection = SIMD_T::and_ps(clipIntersection, 
SIMD_T::castsi_ps(SIMD_T::set1_epi32(FRUSTUM_CLIP_MASK)));
+
+// cull prims outside view frustum
 int validMask = primMask & 
SimdHelper::cmpeq_ps_mask(clipIntersection, SIMD_T::setzero_ps());
 
 // skip clipping for points
 uint32_t clipMask = 0;
 if (NumVertsPerPrim != 1)
 {
-clipMask = primMask & ComputeClipMask();
+clipMask = validMask & ComputeClipMask();
 }
 
 AR_EVENT(ClipInfoEvent(numInvoc, validMask, clipMask));
@@ -726,7 +730,7 @@ public:
 RDTSC_BEGIN(FEGuardbandClip, pa.pDC->drawId);
 // we have to clip tris, execute the clipper, which will also
 // call the binner
-ClipSimd(prim, SIMD_T::vmask_ps(primMask), 
SIMD_T::vmask_ps(clipMask), pa, primId, viewportIdx, rtIdx);
+ClipSimd(prim, SIMD_T::vmask_ps(validMask), 
SIMD_T::vmask_ps(clipMask), pa, primId, viewportIdx, rtIdx);
 RDTSC_END(FEGuardbandClip, 1);
 }
 else if (validMask)
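
The masking step described in the commit message above can be illustrated with a scalar sketch. The bit values and the helper name SurvivesFrustumCull are assumptions made for illustration; the real SWR_CLIPCODES values are not shown in this hunk, and the driver does this per SIMD lane rather than per primitive. A primitive is trivially rejected only when all of its vertices share at least one frustum clip code, so any non-frustum bits are masked out before the test.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit assignments, for illustration only. The real
// SWR_CLIPCODES enum in clip.h is not reproduced in this mail.
enum ClipCode : uint32_t {
    FRUSTUM_LEFT   = 1u << 0,
    FRUSTUM_TOP    = 1u << 1,
    FRUSTUM_RIGHT  = 1u << 2,
    FRUSTUM_BOTTOM = 1u << 3,
    FRUSTUM_NEAR   = 1u << 4,
    FRUSTUM_FAR    = 1u << 5,
    GUARDBAND_LEFT = 1u << 8,
};

constexpr uint32_t FRUSTUM_CLIP_MASK =
    FRUSTUM_NEAR | FRUSTUM_FAR | FRUSTUM_LEFT |
    FRUSTUM_RIGHT | FRUSTUM_TOP | FRUSTUM_BOTTOM;

// Returns true when the primitive is NOT trivially rejected by the frustum.
inline bool SurvivesFrustumCull(const uint32_t* codes, int numVerts)
{
    uint32_t intersection = ~0u;
    for (int v = 0; v < numVerts; ++v)
        intersection &= codes[v];          // codes shared by every vertex

    intersection &= FRUSTUM_CLIP_MASK;     // mask out non-frustum codes
    return intersection == 0;              // a shared frustum code means culled
}
```

In this sketch a triangle whose three vertices all carry FRUSTUM_LEFT is rejected, while one that straddles the frustum (left on one vertex, right on another) is kept and may still go to the clipper.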


Mesa (master): swr/rast: whitespace change

2018-02-28 Thread George Kyriazis

Module: Mesa
Branch: master
Commit: 90e3e23f63d29c658550146c43b29216d1edc1c5
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=90e3e23f63d29c658550146c43b29216d1edc1c5

Author: George Kyriazis 
Date:   Tue Feb 27 11:34:45 2018 -0600

swr/rast: whitespace change

Reviewed-By: Bruce Cherniak 

---

 src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
index 880aaf8d54..68bd4c1687 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
@@ -1881,7 +1881,7 @@ Value* FetchJit::GetSimdValid32bitIndices(Value* 
pIndices, Value* pLastIndex)
 // vIndexMask-1-1-1-1 0 0 0 0 : offsets < max pass
 // vLoadedIndices 0 1 2 3 0 0 0 0 : offsets >= max masked to 0
 Value* vMaxIndex = VBROADCAST(numIndicesLeft);
-Value* vIndexMask = VPCMPGTD(vMaxIndex,vIndexOffsets);
+Value* vIndexMask = VPCMPGTD(vMaxIndex, vIndexOffsets);
 
 // VMASKLOAD takes an *i8 src pointer
 pIndices = BITCAST(pIndices,PointerType::get(mInt8Ty,0));


Mesa (master): swr/rast: Code generation cleanup

2018-02-28 Thread George Kyriazis

Module: Mesa
Branch: master
Commit: e2a4fd076167fed786edc9e7acb45b68429c3399
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=e2a4fd076167fed786edc9e7acb45b68429c3399

Author: George Kyriazis 
Date:   Tue Feb 13 19:22:03 2018 -0600

swr/rast: Code generation cleanup

Generate more compact code from gen_llvm.hpp.

Reviewed-By: Bruce Cherniak 

---

 .../swr/rasterizer/codegen/templates/gen_llvm.hpp  | 36 +-
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_llvm.hpp 
b/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_llvm.hpp
index d61194dae1..190e660ad1 100644
--- a/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_llvm.hpp
+++ b/src/gallium/drivers/swr/rasterizer/codegen/templates/gen_llvm.hpp
@@ -1,5 +1,5 @@
 /
-* Copyright (C) 2014-2017 Intel Corporation.   All Rights Reserved.
+* Copyright (C) 2014-2018 Intel Corporation.   All Rights Reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
@@ -39,19 +39,19 @@ namespace SwrJit
 %for type in types:
 INLINE static StructType *Gen_${type['name']}(JitManager* pJitMgr)
 {
+%if needs_ctx(type):
 LLVMContext& ctx = pJitMgr->mContext;
-   (void) ctx;
 
+%endif
 StructType* pRetType = 
pJitMgr->mpCurrentModule->getTypeByName("${type['name']}");
 if (pRetType == nullptr)
 {
-std::vector members;
-<%
-(max_type_len, max_name_len) = calc_max_len(type['members'])
-%>
-%for member in type['members']:
-/* ${member['name']} ${pad(len(member['name']), max_name_len)}*/ 
members.push_back(${ member['type'] });
-%endfor
+std::vector members =<% (max_type_len, max_name_len) = 
calc_max_len(type['members']) %>
+{
+%for member in type['members']:
+/* ${member['name']} ${pad(len(member['name']), 
max_name_len)}*/ ${member['type']},
+%endfor
+};
 
 pRetType = StructType::create(members, "${type['name']}", false);
 
@@ -59,13 +59,13 @@ namespace SwrJit
 llvm::DIBuilder builder(*pJitMgr->mpCurrentModule);
 llvm::DIFile* pFile = builder.createFile("${input_file}", 
"${os.path.normpath(input_dir).replace('\\', '/')}");
 
-std::vector> dbgMembers;
-%for member in type['members']:
-dbgMembers.push_back(std::make_pair("${member['name']}", ${ 
member['lineNum'] }));
-%endfor
-
+std::vector> dbgMembers =
+{
+%for member in type['members']:
+std::make_pair("${member['name']}", ${pad(len(member['name']), 
max_name_len)}${member['lineNum']}),
+%endfor
+};
 pJitMgr->CreateDebugStructType(pRetType, "${type['name']}", pFile, 
${type['lineNum']}, dbgMembers);
-
 }
 
 return pRetType;
@@ -80,6 +80,12 @@ namespace SwrJit
 
 <%! # Global function definitions
 import os
+def needs_ctx(struct_type):
+for m in struct_type.get('members', []):
+if '(ctx)' in m.get('type', ''):
+return True
+return False
+
 def calc_max_len(fields):
 max_type_len = 0
 max_name_len = 0


Mesa (master): swr/rast: Remove draw type from event definitions

2018-02-28 Thread George Kyriazis

Module: Mesa
Branch: master
Commit: 190ead3d79f1f4037c08f7d6a87d9a1a955ff30d
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=190ead3d79f1f4037c08f7d6a87d9a1a955ff30d

Author: George Kyriazis 
Date:   Tue Feb 13 17:38:55 2018 -0600

swr/rast: Remove draw type from event definitions

- Have the draw type sent to DrawInfoEvent in handlers created in
  archrast.cpp.  The draw type no longer needs to be sent during the
  AR_API_EVENT() call in api.cpp.

- Remove draw type from event definitions in events_private.proto, no
  longer needed

Reviewed-By: Bruce Cherniak 

---

 src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp | 8 
 src/gallium/drivers/swr/rasterizer/archrast/events_private.proto | 4 
 src/gallium/drivers/swr/rasterizer/core/api.cpp  | 8 
 3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp 
b/src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp
index d7a3b292d6..8c09411029 100644
--- a/src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp
+++ b/src/gallium/drivers/swr/rasterizer/archrast/archrast.cpp
@@ -175,28 +175,28 @@ namespace ArchRast
 
 virtual void Handle(const DrawInstancedEvent& event)
 {
-DrawInfoEvent e(event.data.drawId, event.data.type, 
event.data.topology, event.data.numVertices, 0, 0, event.data.startVertex, 
event.data.numInstances, event.data.startInstance);
+DrawInfoEvent e(event.data.drawId, ArchRast::Instanced, 
event.data.topology, event.data.numVertices, 0, 0, event.data.startVertex, 
event.data.numInstances, event.data.startInstance);
 
 EventHandlerFile::Handle(e);
 }
 
 virtual void Handle(const DrawIndexedInstancedEvent& event)
 {
-DrawInfoEvent e(event.data.drawId, event.data.type, 
event.data.topology, 0, event.data.numIndices, event.data.indexOffset, 
event.data.baseVertex, event.data.numInstances, event.data.startInstance);
+DrawInfoEvent e(event.data.drawId, ArchRast::IndexedInstanced, 
event.data.topology, 0, event.data.numIndices, event.data.indexOffset, 
event.data.baseVertex, event.data.numInstances, event.data.startInstance);
 
 EventHandlerFile::Handle(e);
 }
 
 virtual void Handle(const DrawInstancedSplitEvent& event)
 {
-DrawInfoEvent e(event.data.drawId, event.data.type, 0, 0, 0, 0, 0, 
0, 0);
+DrawInfoEvent e(event.data.drawId, ArchRast::InstancedSplit, 0, 0, 
0, 0, 0, 0, 0);
 
 EventHandlerFile::Handle(e);
 }
 
 virtual void Handle(const DrawIndexedInstancedSplitEvent& event)
 {
-DrawInfoEvent e(event.data.drawId, event.data.type, 0, 0, 0, 0, 0, 
0, 0);
+DrawInfoEvent e(event.data.drawId, 
ArchRast::IndexedInstancedSplit, 0, 0, 0, 0, 0, 0, 0);
 
 EventHandlerFile::Handle(e);
 }
diff --git a/src/gallium/drivers/swr/rasterizer/archrast/events_private.proto 
b/src/gallium/drivers/swr/rasterizer/archrast/events_private.proto
index 71b723d5f6..8970141d60 100644
--- a/src/gallium/drivers/swr/rasterizer/archrast/events_private.proto
+++ b/src/gallium/drivers/swr/rasterizer/archrast/events_private.proto
@@ -117,7 +117,6 @@ event ClipInfoEvent
 event DrawInstancedEvent
 {
 uint32_t drawId;
-AR_DRAW_TYPE type;
 uint32_t topology;
 uint32_t numVertices;
 int32_t  startVertex;
@@ -128,7 +127,6 @@ event DrawInstancedEvent
 event DrawIndexedInstancedEvent
 {
 uint32_t drawId;
-AR_DRAW_TYPE type;
 uint32_t topology;
 uint32_t numIndices;
 int32_t  indexOffset;
@@ -141,12 +139,10 @@ event DrawIndexedInstancedEvent
 event DrawInstancedSplitEvent
 {
 uint32_t drawId;
-AR_DRAW_TYPE type;
 };
 
 ///@brief API Stat: Split draw event for DrawIndexedInstanced.
 event DrawIndexedInstancedSplitEvent
 {
 uint32_t drawId;
-AR_DRAW_TYPE type;
 };
diff --git a/src/gallium/drivers/swr/rasterizer/core/api.cpp 
b/src/gallium/drivers/swr/rasterizer/core/api.cpp
index cb98cbe7ee..99d3cd5bb0 100644
--- a/src/gallium/drivers/swr/rasterizer/core/api.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/api.cpp
@@ -1169,7 +1169,7 @@ void DrawInstanced(
 DRAW_CONTEXT* pDC = GetDrawContext(pContext);
 
 RDTSC_BEGIN(APIDraw, pDC->drawId);
-AR_API_EVENT(DrawInstancedEvent(pDC->drawId, ArchRast::Instanced, 
topology, numVertices, startVertex, numInstances, startInstance));
+AR_API_EVENT(DrawInstancedEvent(pDC->drawId, topology, numVertices, 
startVertex, numInstances, startInstance));
 
 uint32_t maxVertsPerDraw = MaxVertsPerDraw(pDC, numVertices, topology);
 uint32_t primsPerDraw = GetNumPrims(topology, maxVertsPerDraw);
@@ -1221,7 +1221,7 @@ void DrawInstanced(
 //enqueue DC
 QueueDraw(pContext);
 
-AR_API_EVENT(DrawInstancedSplitEvent(pDC->drawId,

Mesa (master): swr/rast: Consolidate TRANSLATE_ADDRESS

2018-02-28 Thread George Kyriazis

Module: Mesa
Branch: master
Commit: 1c73f42e6e55de0be21221979882f6e42b3c2747
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=1c73f42e6e55de0be21221979882f6e42b3c2747

Author: George Kyriazis 
Date:   Wed Feb 14 01:13:13 2018 -0600

swr/rast: Consolidate TRANSLATE_ADDRESS

Translate is now part of an overloaded LOAD call which required a change to
the code gen to skip the load functions in order to handle them manually
to make them virtual.

Reviewed-By: Bruce Cherniak 

---

 .../swr/rasterizer/codegen/gen_llvm_ir_macros.py |  3 ++-
 .../drivers/swr/rasterizer/jitter/builder_mem.cpp| 20 
 .../drivers/swr/rasterizer/jitter/builder_mem.h  |  7 ++-
 .../drivers/swr/rasterizer/jitter/fetch_jit.cpp  |  4 
 4 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py 
b/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
index 3b19cb4e80..aab499b54a 100644
--- a/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
+++ b/src/gallium/drivers/swr/rasterizer/codegen/gen_llvm_ir_macros.py
@@ -152,7 +152,8 @@ def parse_ir_builder(input_file):
 # The following functions need to be ignored.
 if (func_name == 'CreateInsertNUWNSWBinOp' or
 func_name == 'CreateMaskedIntrinsic' or
-func_name == 'CreateAlignmentAssumptionHelper'):
+func_name == 'CreateAlignmentAssumptionHelper' or
+func_name == 'CreateLoad'):
 ignore = True
 
 # Convert CamelCase to CAMEL_CASE
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.cpp
index 3bba6ff04f..67e415cdcc 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.cpp
@@ -69,6 +69,26 @@ namespace SwrJit
 return IN_BOUNDS_GEP(ptr, indices);
 }
 
+LoadInst* Builder::LOAD(Value *Ptr, const char *Name)
+{
+return IRB()->CreateLoad(Ptr, Name);
+}
+
+LoadInst* Builder::LOAD(Value *Ptr, const Twine &Name)
+{
+return IRB()->CreateLoad(Ptr, Name);
+}
+
+LoadInst* Builder::LOAD(Type *Ty, Value *Ptr, const Twine &Name)
+{
+return IRB()->CreateLoad(Ty, Ptr, Name);
+}
+
+LoadInst* Builder::LOAD(Value *Ptr, bool isVolatile, const Twine &Name)
+{
+return IRB()->CreateLoad(Ptr, isVolatile, Name);
+}
+
 LoadInst *Builder::LOAD(Value *basePtr, const 
std::initializer_list , const llvm::Twine& name)
 {
 std::vector valIndices;
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.h 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.h
index 4f496343e9..b3a0e2b09f 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_mem.h
@@ -34,7 +34,12 @@ Value *GEP(Value* ptr, const std::initializer_list 
);
 Value *IN_BOUNDS_GEP(Value* ptr, const std::initializer_list 
);
 Value *IN_BOUNDS_GEP(Value* ptr, const std::initializer_list 
);
 
-LoadInst *LOAD(Value *BasePtr, const std::initializer_list , 
const llvm::Twine& name = "");
+virtual LoadInst* LOAD(Value *Ptr, const char *Name);
+virtual LoadInst* LOAD(Value *Ptr, const Twine &Name = "");
+virtual LoadInst* LOAD(Type *Ty, Value *Ptr, const Twine &Name = "");
+virtual LoadInst* LOAD(Value *Ptr, bool isVolatile, const Twine &Name = "");
+virtual LoadInst* LOAD(Value *BasePtr, const std::initializer_list 
, const llvm::Twine& Name = "");
+
 LoadInst *LOADV(Value *BasePtr, const std::initializer_list , 
const llvm::Twine& name = "");
 StoreInst *STORE(Value *Val, Value *BasePtr, const 
std::initializer_list );
 StoreInst *STOREV(Value *Val, Value *BasePtr, const 
std::initializer_list );
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
index 68bd4c1687..f1dc00293a 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
@@ -1830,16 +1830,12 @@ Value* FetchJit::GetSimdValid16bitIndices(Value* 
pIndices, Value* pLastIndex)
 Value* pZeroIndex = ALLOCA(mInt16Ty);
 STORE(C((uint16_t)0), pZeroIndex);
 
-pLastIndex = TRANSLATE_ADDRESS(pLastIndex);
-
 // Load a SIMD of index pointers
 for(int64_t lane = 0; lane < mVWidth; lane++)
 {
 // Calculate the address of the requested index
 Value *pIndex = GEP(pIndices, C(lane));
 
-pIndex = TRANSLATE_ADDRESS(pIndex);
-
 // check if the address is less than the max index, 
 Value* mask = ICMP_ULT(pIndex, pLastIndex);
 


Mesa (master): swr/rast: Fix index buffer overfetch issue for non-indexed draws

2018-02-28 Thread George Kyriazis

Module: Mesa
Branch: master
Commit: 539de78633c45598e0b1e3b7763ea318f9200c32
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=539de78633c45598e0b1e3b7763ea318f9200c32

Author: George Kyriazis 
Date:   Fri Feb 16 11:14:50 2018 -0600

swr/rast: Fix index buffer overfetch issue for non-indexed draws

Populate pLastIndex, even for the non-indexed case.  A zero pLastIndex
can cause the index offsets inside the fetcher to have nonsensical values
that can be either very large positive or very large negative numbers.

Reviewed-By: Bruce Cherniak 

---

 src/gallium/drivers/swr/rasterizer/core/frontend.cpp | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp 
b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp
index 1c4b522e45..c2be5d7bd1 100644
--- a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp
@@ -1719,6 +1719,21 @@ void ProcessDraw(
 
 if (i < endVertex)
 {
+if (!IsIndexedT::value)
+{
+fetchInfo_lo.pLastIndex = fetchInfo_lo.pIndices;
+uint32_t offset;
+offset = std::min(endVertex-i, (uint32_t) 
KNOB_SIMD16_WIDTH);
+#if USE_SIMD16_SHADERS
+fetchInfo_lo.pLastIndex += offset;
+#else
+fetchInfo_lo.pLastIndex += std::min(offset, (uint32_t) 
KNOB_SIMD_WIDTH);
+uint32_t offset2 = std::min(offset, (uint32_t) 
KNOB_SIMD16_WIDTH)-KNOB_SIMD_WIDTH;
+assert(offset >= 0);
+fetchInfo_hi.pLastIndex = fetchInfo_hi.pIndices;
+fetchInfo_hi.pLastIndex += offset2;
+#endif
+}
 // 1. Execute FS/VS for a single SIMD.
 RDTSC_BEGIN(FEFetchShader, pDC->drawId);
 #if USE_SIMD16_SHADERS
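
The lo/hi split of the remaining vertex count can be sketched as plain arithmetic. This is not the code from the hunk: the widths 8 and 16 are assumed values for KNOB_SIMD_WIDTH and KNOB_SIMD16_WIDTH, SplitRemaining is a hypothetical helper, and the high half is clamped to zero here instead of being computed by an unsigned subtraction, which would wrap when fewer than one SIMD width of vertices remain.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr uint32_t kSimdWidth   = 8;   // assumed value of KNOB_SIMD_WIDTH
constexpr uint32_t kSimd16Width = 16;  // assumed value of KNOB_SIMD16_WIDTH

struct LastIndexOffsets {
    uint32_t lo;  // offset added to the low half's pLastIndex
    uint32_t hi;  // offset added to the high half's pLastIndex
};

// Precondition: i < endVertex (mirrors the enclosing `if (i < endVertex)`).
inline LastIndexOffsets SplitRemaining(uint32_t i, uint32_t endVertex)
{
    // At most one SIMD16 batch of vertices is fetched per iteration.
    uint32_t remaining = std::min(endVertex - i, kSimd16Width);

    LastIndexOffsets out;
    out.lo = std::min(remaining, kSimdWidth);
    out.hi = remaining - out.lo;  // 0 when the low half is not full
    return out;
}
```

For example, with 12 vertices left the low half covers 8 and the high half covers 4; with 5 left the low half covers 5 and the high half covers none.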


Mesa (master): swr/rast: revert clip distance precision

2018-02-28 Thread George Kyriazis

Module: Mesa
Branch: master
Commit: a01d5e371269eed50fb5f478b98ace5b64490001
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=a01d5e371269eed50fb5f478b98ace5b64490001

Author: George Kyriazis 
Date:   Tue Feb 20 00:07:57 2018 -0600

swr/rast: revert clip distance precision

Fixes piglit tests that broke with 8a64593bde

Reviewed-By: Bruce Cherniak 

---

 src/gallium/drivers/swr/rasterizer/core/backend_impl.h |  4 +---
 src/gallium/drivers/swr/rasterizer/core/binner.cpp | 17 -
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/core/backend_impl.h 
b/src/gallium/drivers/swr/rasterizer/core/backend_impl.h
index 454f473b47..2cfd52e829 100644
--- a/src/gallium/drivers/swr/rasterizer/core/backend_impl.h
+++ b/src/gallium/drivers/swr/rasterizer/core/backend_impl.h
@@ -62,10 +62,8 @@ static INLINE simdmask ComputeUserClipMask(uint8_t clipMask, 
float* pUserClipBuf
 simdscalar vB = _simd_broadcast_ss(pUserClipBuffer++);
 simdscalar vC = _simd_broadcast_ss(pUserClipBuffer++);
 
-simdscalar vK = _simd_sub_ps(_simd_sub_ps(_simd_set1_ps(1.0f), vI), 
vJ);
-
 // interpolate
-simdscalar vInterp = vplaneps(vA, vB, _simd_mul_ps(vK, vC), vI, vJ);
+simdscalar vInterp = vplaneps(vA, vB, vC, vI, vJ);
 
 // clip if interpolated clip distance is < 0 || NAN
 simdscalar vCull = _simd_cmp_ps(_simd_setzero_ps(), vInterp, 
_CMP_NLE_UQ);
diff --git a/src/gallium/drivers/swr/rasterizer/core/binner.cpp 
b/src/gallium/drivers/swr/rasterizer/core/binner.cpp
index 3b093cefc0..c9a37cb17a 100644
--- a/src/gallium/drivers/swr/rasterizer/core/binner.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/binner.cpp
@@ -256,12 +256,27 @@ void ProcessUserClipDist(const SWR_BACKEND_STATE& state, 
PA_STATE& pa, uint32_t
 simd4scalar primClipDist[3];
 pa.AssembleSingle(clipAttribSlot, primIndex, primClipDist);
 
+float vertClipDist[NumVerts];
 for (uint32_t e = 0; e < NumVerts; ++e)
 {
 OSALIGNSIMD(float) aVertClipDist[4];
 SIMD128::store_ps(aVertClipDist, primClipDist[e]);
-*(pUserClipBuffer++) = aVertClipDist[clipComp];
+vertClipDist[e] = aVertClipDist[clipComp];
 };
+
+// setup plane equations for barycentric interpolation in the backend
+float baryCoeff[NumVerts];
+float last = vertClipDist[NumVerts - 1] * pRecipW[NumVerts - 1];
+for (uint32_t e = 0; e < NumVerts - 1; ++e)
+{
+baryCoeff[e] = vertClipDist[e] * pRecipW[e] - last;
+}
+baryCoeff[NumVerts - 1] = last;
+
+for (uint32_t e = 0; e < NumVerts; ++e)
+{
+*(pUserClipBuffer++) = baryCoeff[e];
+}
 }
 }
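
The plane setup added to the binner can be checked with a small scalar sketch. It assumes that the backend's vplaneps(vA, vB, vC, vI, vJ) evaluates vA*vI + vB*vJ + vC, which is not shown in this mail; SetupClipDistPlane and EvalPlane are hypothetical helper names. Under that assumption, the coefficients reproduce dist*recipW of vertex 0 at (i, j) = (1, 0), of vertex 1 at (0, 1), and of vertex 2 at (0, 0).

```cpp
#include <cassert>

struct PlaneCoeffs {
    float a, b, c;
};

// Mirrors the baryCoeff setup for a triangle (NumVerts == 3).
inline PlaneCoeffs SetupClipDistPlane(const float dist[3], const float recipW[3])
{
    float last = dist[2] * recipW[2];
    PlaneCoeffs p;
    p.a = dist[0] * recipW[0] - last;
    p.b = dist[1] * recipW[1] - last;
    p.c = last;
    return p;
}

// Assumed form of the backend interpolation: a*i + b*j + c.
inline float EvalPlane(const PlaneCoeffs& p, float i, float j)
{
    return p.a * i + p.b * j + p.c;
}
```

Because the third vertex is folded into the constant term, the backend no longer needs the extra k = 1 - i - j factor that this commit removes from ComputeUserClipMask.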
 


Mesa (master): draw: don't needlessly iterate through all sampler view slots

2018-02-28 Thread Roland Scheidegger

Module: Mesa
Branch: master
Commit: 89ae5def8cea9311727ac80d7274f80650279373
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=89ae5def8cea9311727ac80d7274f80650279373

Author: Roland Scheidegger 
Date:   Sun Feb 25 04:26:37 2018 +0100

draw: don't needlessly iterate through all sampler view slots

We already stored the highest (potentially) used number.

Reviewed-by: Jose Fonseca 
Reviewed-by: Brian Paul 

---

 src/gallium/auxiliary/draw/draw_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/draw/draw_context.c 
b/src/gallium/auxiliary/draw/draw_context.c
index 9791ec5506..e887272e15 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -973,7 +973,7 @@ draw_set_sampler_views(struct draw_context *draw,
 
for (i = 0; i < num; ++i)
   draw->sampler_views[shader_stage][i] = views[i];
-   for (i = num; i < PIPE_MAX_SHADER_SAMPLER_VIEWS; ++i)
+   for (i = num; i < draw->num_sampler_views[shader_stage]; ++i)
   draw->sampler_views[shader_stage][i] = NULL;
 
draw->num_sampler_views[shader_stage] = num;
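
The saving in this change comes from clearing only the slots that were previously in use. A minimal sketch, assuming a simplified slot table (ViewSlots, SetViews and the limit of 128 are stand-ins, not the draw module's real types or the real PIPE_MAX_SHADER_SAMPLER_VIEWS value):

```cpp
#include <cassert>

constexpr unsigned kMaxViews = 128;  // stand-in for PIPE_MAX_SHADER_SAMPLER_VIEWS

struct ViewSlots {
    const void* views[kMaxViews] = {};
    unsigned numViews = 0;           // highest (potentially) used count
};

// Installs `num` views and returns how many slots were written,
// so the amount of work is visible to the caller.
inline unsigned SetViews(ViewSlots& s, const void* const* newViews, unsigned num)
{
    unsigned touched = 0;

    for (unsigned i = 0; i < num; ++i, ++touched)
        s.views[i] = newViews[i];

    // Only slots below the previous count can be non-null,
    // so there is no need to walk up to kMaxViews.
    for (unsigned i = num; i < s.numViews; ++i, ++touched)
        s.views[i] = nullptr;

    s.numViews = num;
    return touched;
}
```

Going from three bound views to one writes three slots in this sketch, instead of walking the whole table on every call.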


Mesa (master): cso: don't cycle through PIPE_MAX_SHADER_SAMPLER_VIEWS on context destroy

2018-02-28 Thread Roland Scheidegger

Module: Mesa
Branch: master
Commit: b923f21eaadb77ee70e1bf4c5e2f9aee2a5fa205
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=b923f21eaadb77ee70e1bf4c5e2f9aee2a5fa205

Author: Roland Scheidegger 
Date:   Wed Feb 28 03:01:23 2018 +0100

cso: don't cycle through PIPE_MAX_SHADER_SAMPLER_VIEWS on context destroy

There's no point, we know the highest non-null one.

Reviewed-by: Brian Paul 
Reviewed-by: Jose Fonseca 

---

 src/gallium/auxiliary/cso_cache/cso_context.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/cso_cache/cso_context.c 
b/src/gallium/auxiliary/cso_cache/cso_context.c
index 1b5d4b5598..3fa57f16ff 100644
--- a/src/gallium/auxiliary/cso_cache/cso_context.c
+++ b/src/gallium/auxiliary/cso_cache/cso_context.c
@@ -407,8 +407,10 @@ void cso_destroy_context( struct cso_context *ctx )
  ctx->pipe->set_stream_output_targets(ctx->pipe, 0, NULL, NULL);
}
 
-   for (i = 0; i < PIPE_MAX_SHADER_SAMPLER_VIEWS; i++) {
+   for (i = 0; i < ctx->nr_fragment_views; i++) {
   pipe_sampler_view_reference(&ctx->fragment_views[i], NULL);
+   }
+   for (i = 0; i < ctx->nr_fragment_views_saved; i++) {
   pipe_sampler_view_reference(&ctx->fragment_views_saved[i], NULL);
}
 


Mesa (master): softpipe: don't iterate through PIPE_MAX_SHADER_SAMPLER_VIEWS

2018-02-28 Thread Roland Scheidegger

Module: Mesa
Branch: master
Commit: 26103487b54a1c1121132cc040927619cce45262
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=26103487b54a1c1121132cc040927619cce45262

Author: Roland Scheidegger 
Date:   Wed Feb 28 04:28:29 2018 +0100

softpipe: don't iterate through PIPE_MAX_SHADER_SAMPLER_VIEWS

We were setting view to NULL if the iteration index i was not below num.
But in fact if the view is NULL the code did nothing anyway...

Reviewed-by: Brian Paul 
Reviewed-by: Jose Fonseca 

---

 src/gallium/drivers/softpipe/sp_state_sampler.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/softpipe/sp_state_sampler.c 
b/src/gallium/drivers/softpipe/sp_state_sampler.c
index c10fd918fd..751eb76e84 100644
--- a/src/gallium/drivers/softpipe/sp_state_sampler.c
+++ b/src/gallium/drivers/softpipe/sp_state_sampler.c
@@ -181,8 +181,8 @@ prepare_shader_sampling(
if (!num)
   return;
 
-   for (i = 0; i < PIPE_MAX_SHADER_SAMPLER_VIEWS; i++) {
-  struct pipe_sampler_view *view = i < num ? views[i] : NULL;
+   for (i = 0; i < num; i++) {
+  struct pipe_sampler_view *view = views[i];
 
   if (view) {
  struct pipe_resource *tex = view->texture;


Mesa (master): anv: implement VK_EXT_global_priority extension

2018-02-28 Thread Tapani Pälli

Module: Mesa
Branch: master
Commit: 6d8ab53303331a2438ab7c89c94be31c44e70bb1
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=6d8ab53303331a2438ab7c89c94be31c44e70bb1

Author: Tapani Pälli 
Date:   Tue Jan 23 14:01:00 2018 +0200

anv: implement VK_EXT_global_priority extension

v2: add ANV_CONTEXT_REALTIME_PRIORITY (Chris)
use unreachable with unknown priority (Samuel)

v3: add stubs in gem_stubs.c (Emil)
use priority defines from gen_defines.h

v4: cleanup, add anv_gem_set_context_param (Jason)

Signed-off-by: Tapani Pälli 
Reviewed-by: Samuel Iglesias Gonsálvez  (v2)
Reviewed-by: Chris Wilson  (v2)
Reviewed-by: Emil Velikov  (v3)
Reviewed-by: Jason Ekstrand 

---

 src/intel/vulkan/anv_device.c  | 44 ++
 src/intel/vulkan/anv_extensions.py |  2 ++
 src/intel/vulkan/anv_gem.c | 32 +++
 src/intel/vulkan/anv_gem_stubs.c   | 12 +++
 src/intel/vulkan/anv_private.h |  5 +
 5 files changed, 95 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 8be88acc52..f314d7667d 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -37,6 +37,7 @@
 #include "util/build_id.h"
 #include "util/mesa-sha1.h"
 #include "vk_util.h"
+#include "common/gen_defines.h"
 
 #include "genxml/gen7_pack.h"
 
@@ -374,6 +375,9 @@ anv_physical_device_init(struct anv_physical_device *device,
device->has_syncobj_wait = device->has_syncobj &&
   anv_gem_supports_syncobj_wait(fd);
 
+   if (anv_gem_has_context_priority(fd))
+  device->has_context_priority = true;
+
bool swizzled = anv_gem_get_bit6_swizzle(fd, I915_TILING_X);
 
/* Starting with Gen10, the timestamp frequency of the command streamer may
@@ -1324,6 +1328,23 @@ anv_device_init_dispatch(struct anv_device *device)
}
 }
 
+static int
+vk_priority_to_gen(int priority)
+{
+   switch (priority) {
+   case VK_QUEUE_GLOBAL_PRIORITY_LOW_EXT:
+  return GEN_CONTEXT_LOW_PRIORITY;
+   case VK_QUEUE_GLOBAL_PRIORITY_MEDIUM_EXT:
+  return GEN_CONTEXT_MEDIUM_PRIORITY;
+   case VK_QUEUE_GLOBAL_PRIORITY_HIGH_EXT:
+  return GEN_CONTEXT_HIGH_PRIORITY;
+   case VK_QUEUE_GLOBAL_PRIORITY_REALTIME_EXT:
+  return GEN_CONTEXT_REALTIME_PRIORITY;
+   default:
+  unreachable("Invalid priority");
+   }
+}
+
 VkResult anv_CreateDevice(
 VkPhysicalDevicephysicalDevice,
 const VkDeviceCreateInfo*   pCreateInfo,
@@ -1367,6 +1388,15 @@ VkResult anv_CreateDevice(
   }
}
 
+   /* Check if client specified queue priority. */
+   const VkDeviceQueueGlobalPriorityCreateInfoEXT *queue_priority =
+  vk_find_struct_const(pCreateInfo->pQueueCreateInfos[0].pNext,
+   DEVICE_QUEUE_GLOBAL_PRIORITY_CREATE_INFO_EXT);
+
+   VkQueueGlobalPriorityEXT priority =
+  queue_priority ? queue_priority->globalPriority :
+ VK_QUEUE_GLOBAL_PRIORITY_MEDIUM_EXT;
+
   device = vk_alloc2(&physical_device->instance->alloc, pAllocator,
sizeof(*device), 8,
VK_SYSTEM_ALLOCATION_SCOPE_DEVICE);
@@ -1397,6 +1427,20 @@ VkResult anv_CreateDevice(
   goto fail_fd;
}
 
+   /* As per spec, the driver implementation may deny requests to acquire
+* a priority above the default priority (MEDIUM) if the caller does not
+* have sufficient privileges. In this scenario VK_ERROR_NOT_PERMITTED_EXT
+* is returned.
+*/
+   if (physical_device->has_context_priority) {
+  int err =
+ anv_gem_set_context_priority(device, vk_priority_to_gen(priority));
+  if (err != 0 && priority > VK_QUEUE_GLOBAL_PRIORITY_MEDIUM_EXT) {
+ result = vk_error(VK_ERROR_NOT_PERMITTED_EXT);
+ goto fail_fd;
+  }
+   }
+
device->info = physical_device->info;
device->isl_dev = physical_device->isl_dev;
 
diff --git a/src/intel/vulkan/anv_extensions.py 
b/src/intel/vulkan/anv_extensions.py
index 581921e62a..6194eb0ad6 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -86,6 +86,8 @@ EXTENSIONS = [
 Extension('VK_KHX_multiview', 1, True),
 Extension('VK_EXT_debug_report',  8, True),
 Extension('VK_EXT_external_memory_dma_buf',   1, True),
+Extension('VK_EXT_global_priority',   1,
+  'device->has_context_priority'),
 ]
 
 class VkVersion:
diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
index 34c0989108..93072c7d3b 100644
--- a/src/intel/vulkan/anv_gem.c
+++ b/src/intel/vulkan/anv_gem.c
@@ -30,6 +30,7 @@
 #include 
 
 #include "anv_private.h"
+#include "common/gen_defines.h"
 
 static int
 anv_ioctl(int fd, unsigned long request, void *arg)
@@ -303,6 +304,22 @@
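
The error handling in the anv_CreateDevice hunk of this commit can be summarised in a small sketch. The enumerator values match VkQueueGlobalPriorityEXT (128, 256, 512, 1024); Outcome and ApplyPriority are hypothetical names, and setErr stands in for the return value of the kernel call that sets the context priority. A failure is reported only when the client asked for more than the default MEDIUM priority; a failed request for MEDIUM or LOW is ignored.

```cpp
#include <cassert>

// Values as defined for VkQueueGlobalPriorityEXT.
enum class GlobalPriority {
    Low      = 128,
    Medium   = 256,
    High     = 512,
    Realtime = 1024,
};

enum class Outcome { Ok, NotPermitted };

// setErr: result of the (stand-in) priority-setting call, 0 on success.
inline Outcome ApplyPriority(GlobalPriority requested, int setErr)
{
    // Only requests above the default may be denied with an error.
    if (setErr != 0 && requested > GlobalPriority::Medium)
        return Outcome::NotPermitted;
    return Outcome::Ok;
}
```

In the driver the NotPermitted case corresponds to returning VK_ERROR_NOT_PERMITTED_EXT and unwinding device creation.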

Mesa (master): intel: add new common header gen_defines.h

2018-02-28 Thread Tapani Pälli

Module: Mesa
Branch: master
Commit: 4449a1f80dd3a37ee0fc6084ac93bc9f19f32580
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=4449a1f80dd3a37ee0fc6084ac93bc9f19f32580

Author: Tapani Pälli 
Date:   Mon Jan 22 08:17:50 2018 +0200

intel: add new common header gen_defines.h

Signed-off-by: Tapani Pälli 
Reviewed-by: Chris Wilson 
Reviewed-by: Emil Velikov 

---

 src/intel/Makefile.sources |  1 +
 src/intel/common/gen_defines.h | 54 ++
 2 files changed, 55 insertions(+)

diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
index 692c860477..0a16e2398c 100644
--- a/src/intel/Makefile.sources
+++ b/src/intel/Makefile.sources
@@ -13,6 +13,7 @@ COMMON_FILES = \
common/gen_debug.h \
common/gen_decoder.c \
common/gen_decoder.h \
+   common/gen_defines.h \
common/gen_device_info.c \
common/gen_device_info.h \
common/gen_l3_config.c \
diff --git a/src/intel/common/gen_defines.h b/src/intel/common/gen_defines.h
new file mode 100644
index 00..d1d63a17f1
--- /dev/null
+++ b/src/intel/common/gen_defines.h
@@ -0,0 +1,54 @@
+/*
+ * Copyright © 2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining
+ * a copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sublicense, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial
+ * portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ * IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+ * LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ * OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ * WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef GEN_DEFINES_H
+#define GEN_DEFINES_H
+
+#include "i915_drm.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/**
+ * \file gen_defines.h
+ *
+ * Common defines we want to share between GL And Vulkan.
+ */
+
+#define GEN_CONTEXT_LOW_PRIORITY ((I915_CONTEXT_MIN_USER_PRIORITY-1)/2)
+#define GEN_CONTEXT_MEDIUM_PRIORITY (I915_CONTEXT_DEFAULT_PRIORITY)
+#define GEN_CONTEXT_HIGH_PRIORITY ((I915_CONTEXT_MAX_USER_PRIORITY+1)/2)
+/* We don't have a strict notion of RT (yet, and when we do it is likely
+ * to be more complicated than a mere priority value!), but we can give
+ * it the absolute most priority available to us. By convention, this
+ * is higher than any other client, except for blocked interactive
+ * clients.
+ */
+#define GEN_CONTEXT_REALTIME_PRIORITY I915_CONTEXT_MAX_USER_PRIORITY
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* GEN_DEFINES_H */
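
As a reading aid for the header above, the priority macros can be evaluated by hand. The sketch below is illustrative only: the three I915_CONTEXT_* limits are local stand-ins assumed to match the values in i915_drm.h, and print_gen_priorities() is a hypothetical helper, not part of the patch.

```c
#include <stdio.h>

/* Stand-ins for the limits declared in i915_drm.h. These values are an
 * assumption for illustration; check the header you actually build against. */
#define I915_CONTEXT_MAX_USER_PRIORITY  1023
#define I915_CONTEXT_DEFAULT_PRIORITY   0
#define I915_CONTEXT_MIN_USER_PRIORITY  (-1023)

/* Copied from the gen_defines.h hunk above. */
#define GEN_CONTEXT_LOW_PRIORITY ((I915_CONTEXT_MIN_USER_PRIORITY-1)/2)
#define GEN_CONTEXT_MEDIUM_PRIORITY (I915_CONTEXT_DEFAULT_PRIORITY)
#define GEN_CONTEXT_HIGH_PRIORITY ((I915_CONTEXT_MAX_USER_PRIORITY+1)/2)
#define GEN_CONTEXT_REALTIME_PRIORITY I915_CONTEXT_MAX_USER_PRIORITY

/* Hypothetical helper: print the value each level expands to. */
static void print_gen_priorities(void)
{
   printf("low      = %d\n", GEN_CONTEXT_LOW_PRIORITY);
   printf("medium   = %d\n", GEN_CONTEXT_MEDIUM_PRIORITY);
   printf("high     = %d\n", GEN_CONTEXT_HIGH_PRIORITY);
   printf("realtime = %d\n", GEN_CONTEXT_REALTIME_PRIORITY);
}
```

Under those assumed limits, low and high sit halfway between the default and the respective user limit, while realtime takes the maximum user priority itself.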

Mesa (master): i965: use context priority definitions from gen_defines.h

2018-02-28 Thread Tapani Pälli

Module: Mesa
Branch: master
Commit: 5960023cf4abcd610ae646b094d04cb29f8b4986
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=5960023cf4abcd610ae646b094d04cb29f8b4986

Author: Tapani Pälli 
Date:   Mon Jan 22 08:22:53 2018 +0200

i965: use context priority definitions from gen_defines.h

Signed-off-by: Tapani Pälli 
Reviewed-by: Chris Wilson 
Reviewed-by: Emil Velikov 

---

 src/mesa/drivers/dri/i965/brw_bufmgr.h   | 4 
 src/mesa/drivers/dri/i965/brw_context.c  | 8 +---
 src/mesa/drivers/dri/i965/intel_screen.c | 8 +---
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h b/src/mesa/drivers/dri/i965/brw_bufmgr.h
index 005ff19798..0f2badd006 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.h
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h
@@ -320,10 +320,6 @@ int brw_bo_wait(struct brw_bo *bo, int64_t timeout_ns);
 
 uint32_t brw_create_hw_context(struct brw_bufmgr *bufmgr);
 
-#define BRW_CONTEXT_LOW_PRIORITY ((I915_CONTEXT_MIN_USER_PRIORITY-1)/2)
-#define BRW_CONTEXT_MEDIUM_PRIORITY (I915_CONTEXT_DEFAULT_PRIORITY)
-#define BRW_CONTEXT_HIGH_PRIORITY ((I915_CONTEXT_MAX_USER_PRIORITY+1)/2)
-
 int brw_hw_context_set_priority(struct brw_bufmgr *bufmgr,
 uint32_t ctx_id,
 int priority);
diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index ea1c78d1fe..b9c3fa27bf 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -76,6 +76,8 @@
 #include "util/disk_cache.h"
 #include "isl/isl.h"
 
+#include "common/gen_defines.h"
+
 /***
  * Mesa's Driver Functions
  ***/
@@ -982,14 +984,14 @@ brwCreateContext(gl_api api,
  return false;
   }
 
-  int hw_priority = BRW_CONTEXT_MEDIUM_PRIORITY;
+  int hw_priority = GEN_CONTEXT_MEDIUM_PRIORITY;
   if (ctx_config->attribute_mask & __DRIVER_CONTEXT_ATTRIB_PRIORITY) {
  switch (ctx_config->priority) {
  case __DRI_CTX_PRIORITY_LOW:
-hw_priority = BRW_CONTEXT_LOW_PRIORITY;
+hw_priority = GEN_CONTEXT_LOW_PRIORITY;
 break;
  case __DRI_CTX_PRIORITY_HIGH:
-hw_priority = BRW_CONTEXT_HIGH_PRIORITY;
+hw_priority = GEN_CONTEXT_HIGH_PRIORITY;
 break;
  }
   }
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index 9e0c15bad2..dfb889221d 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -45,6 +45,8 @@
 #include "util/disk_cache.h"
 #include "util/xmlpool.h"
 
+#include "common/gen_defines.h"
+
 static const __DRIconfigOptionsExtension brw_config_options = {
.base = { __DRI_CONFIG_OPTIONS, 1 },
.xml =
@@ -1462,14 +1464,14 @@ brw_query_renderer_integer(__DRIscreen *dri_screen,
case __DRI2_RENDERER_HAS_CONTEXT_PRIORITY:
   value[0] = 0;
   if (brw_hw_context_set_priority(screen->bufmgr,
- 0, BRW_CONTEXT_HIGH_PRIORITY) == 0)
+ 0, GEN_CONTEXT_HIGH_PRIORITY) == 0)
  value[0] |= __DRI2_RENDERER_HAS_CONTEXT_PRIORITY_HIGH;
   if (brw_hw_context_set_priority(screen->bufmgr,
- 0, BRW_CONTEXT_LOW_PRIORITY) == 0)
+ 0, GEN_CONTEXT_LOW_PRIORITY) == 0)
  value[0] |= __DRI2_RENDERER_HAS_CONTEXT_PRIORITY_LOW;
   /* reset to default last, just in case */
   if (brw_hw_context_set_priority(screen->bufmgr,
- 0, BRW_CONTEXT_MEDIUM_PRIORITY) == 0)
+ 0, GEN_CONTEXT_MEDIUM_PRIORITY) == 0)
  value[0] |= __DRI2_RENDERER_HAS_CONTEXT_PRIORITY_MEDIUM;
   return 0;
case __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB:
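
The brw_query_renderer_integer() hunk above probes each priority level by trying to set it and recording which attempts succeed. A minimal sketch of that pattern follows, with the kernel call replaced by a function pointer. Every name here (CAP_PRIORITY_*, set_priority_fn, probe_priority_caps, stub_no_high) is hypothetical and only stands in for the driver's real symbols.

```c
#include <stdint.h>

/* Hypothetical capability bits standing in for the
 * __DRI2_RENDERER_HAS_CONTEXT_PRIORITY_* flags. */
enum {
   CAP_PRIORITY_LOW    = 1 << 0,
   CAP_PRIORITY_MEDIUM = 1 << 1,
   CAP_PRIORITY_HIGH   = 1 << 2,
};

/* Stand-in for brw_hw_context_set_priority(): returns 0 on success. */
typedef int (*set_priority_fn)(int priority);

static uint32_t probe_priority_caps(set_priority_fn set_priority,
                                    int low, int medium, int high)
{
   uint32_t caps = 0;

   if (set_priority(high) == 0)
      caps |= CAP_PRIORITY_HIGH;
   if (set_priority(low) == 0)
      caps |= CAP_PRIORITY_LOW;
   /* Probe the default level last so the context is left at it. */
   if (set_priority(medium) == 0)
      caps |= CAP_PRIORITY_MEDIUM;

   return caps;
}

/* Stub that rejects any priority above the default, as an unprivileged
 * client might experience. This behaviour is an assumption for the demo. */
static int stub_no_high(int priority)
{
   return priority > 0 ? -1 : 0;
}
```

Calling probe_priority_caps(stub_no_high, -512, 0, 512) reports only the low and medium bits, which mirrors how the driver would advertise just the levels the kernel accepted.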

Mesa (master): winsys/amdgpu: request high addresses

2018-02-28 Thread Christian König

Module: Mesa
Branch: master
Commit: 33633690aa51ff5c79909146d6453b50e37dbad0
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=33633690aa51ff5c79909146d6453b50e37dbad0

Author: Christian König 
Date:   Mon Feb 26 14:13:28 2018 +0100

winsys/amdgpu: request high addresses

We now have hopefully fixed all bugs regarding high addresses on Vega10 and
Raven. Start to use the high range to make room for SVM in the low
range.

Signed-off-by: Christian König 
Reviewed-by: Marek Olšák 

---

 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index 19f6fedf7c..12d497d292 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -38,6 +38,10 @@
 #define AMDGPU_GEM_CREATE_VM_ALWAYS_VALID (1 << 6)
 #endif
 
+#ifndef AMDGPU_VA_RANGE_HIGH
+#define AMDGPU_VA_RANGE_HIGH   0x2
+#endif
+
 /* Set to 1 for verbose output showing committed sparse buffer ranges. */
 #define DEBUG_SPARSE_COMMITS 0
 
@@ -438,7 +442,8 @@ static struct amdgpu_winsys_bo *amdgpu_create_bo(struct amdgpu_winsys *ws,
   alignment = MAX2(alignment, ws->info.pte_fragment_size);
r = amdgpu_va_range_alloc(ws->dev, amdgpu_gpu_va_range_general,
  size + va_gap_size, alignment, 0, &va, &va_handle,
- flags & RADEON_FLAG_32BIT ? AMDGPU_VA_RANGE_32_BIT : 0);
+ (flags & RADEON_FLAG_32BIT ? AMDGPU_VA_RANGE_32_BIT : 0) |
+AMDGPU_VA_RANGE_HIGH);
if (r)
   goto error_va_alloc;
 
@@ -896,7 +901,8 @@ amdgpu_bo_sparse_create(struct amdgpu_winsys *ws, uint64_t size,
va_gap_size = ws->check_vm ? 4 * RADEON_SPARSE_PAGE_SIZE : 0;
r = amdgpu_va_range_alloc(ws->dev, amdgpu_gpu_va_range_general,
  map_size + va_gap_size, RADEON_SPARSE_PAGE_SIZE,
- 0, &bo->va, &bo->u.sparse.va_handle, 0);
+ 0, &bo->va, &bo->u.sparse.va_handle,
+AMDGPU_VA_RANGE_HIGH);
if (r)
   goto error_va_alloc;
 
@@ -1290,7 +1296,8 @@ static struct pb_buffer *amdgpu_bo_from_handle(struct radeon_winsys *rws,
   goto error_query;
 
r = amdgpu_va_range_alloc(ws->dev, amdgpu_gpu_va_range_general,
- result.alloc_size, 1 << 20, 0, &va, &va_handle, 0);
+ result.alloc_size, 1 << 20, 0, &va, &va_handle,
+AMDGPU_VA_RANGE_HIGH);
if (r)
   goto error_query;
 
@@ -1401,7 +1408,8 @@ static struct pb_buffer *amdgpu_bo_from_ptr(struct radeon_winsys *rws,
 goto error;
 
 if (amdgpu_va_range_alloc(ws->dev, amdgpu_gpu_va_range_general,
-  aligned_size, 1 << 12, 0, &va, &va_handle, 0))
+  aligned_size, 1 << 12, 0, &va, &va_handle,
+ AMDGPU_VA_RANGE_HIGH))
 goto error_va_alloc;
 
 if (amdgpu_bo_va_op(buf_handle, 0, aligned_size, va, 0, AMDGPU_VA_OP_MAP))
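
One detail of the amdgpu_create_bo() hunk above is the added pair of parentheses around the conditional. In C, `?:` binds more loosely than `|`, so OR-ing in AMDGPU_VA_RANGE_HIGH without them would attach it only to the else branch. The sketch below shows the difference. The flag values other than AMDGPU_VA_RANGE_HIGH (taken from the patch's fallback define) are stand-ins chosen for illustration, and both function names are hypothetical.

```c
#include <stdint.h>

#define RADEON_FLAG_32BIT       (1u << 0)  /* hypothetical value for the demo */
#define AMDGPU_VA_RANGE_32_BIT  0x1u       /* stand-in, assumed */
#define AMDGPU_VA_RANGE_HIGH    0x2u       /* value from the patch's fallback define */

/* Form used by the patch: HIGH is OR-ed into both outcomes. */
static uint32_t va_flags_parenthesized(uint32_t flags)
{
   return (flags & RADEON_FLAG_32BIT ? AMDGPU_VA_RANGE_32_BIT : 0) |
          AMDGPU_VA_RANGE_HIGH;
}

/* Without the parentheses this parses as
 *    (flags & RADEON_FLAG_32BIT) ? AMDGPU_VA_RANGE_32_BIT
 *                                : (0 | AMDGPU_VA_RANGE_HIGH)
 * so a 32-bit request would lose the HIGH bit. */
static uint32_t va_flags_unparenthesized(uint32_t flags)
{
   return flags & RADEON_FLAG_32BIT ? AMDGPU_VA_RANGE_32_BIT : 0 | AMDGPU_VA_RANGE_HIGH;
}
```

With these stand-in values, the parenthesized form returns 0x3 for a 32-bit request and 0x2 otherwise, while the unparenthesized form returns 0x1 and 0x2.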

Mesa (master): ac/shader: move scanning some info about input PS declarations

2018-02-28 Thread Samuel Pitoiset

Module: Mesa
Branch: master
Commit: 639c4f2b54a6edbedfc3f75fd05d1588752b0693
URL:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=639c4f2b54a6edbedfc3f75fd05d1588752b0693

Author: Samuel Pitoiset 
Date:   Mon Feb 26 12:14:35 2018 +0100

ac/shader: move scanning some info about input PS declarations

Signed-off-by: Samuel Pitoiset 
Reviewed-by: Dave Airlie 

---

 src/amd/common/ac_nir_to_llvm.c |  6 --
 src/amd/common/ac_nir_to_llvm.h |  3 ---
 src/amd/common/ac_shader_info.c | 15 +++
 src/amd/common/ac_shader_info.h |  3 +++
 src/amd/vulkan/radv_pipeline.c  | 14 --
 5 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 8b662f884f..88e0cf9b4b 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -5638,12 +5638,6 @@ handle_fs_inputs(struct radv_shader_context *ctx,
}
}
ctx->shader_info->fs.num_interp = index;
-   if (ctx->input_mask & (1 << VARYING_SLOT_PNTC))
-   ctx->shader_info->fs.has_pcoord = true;
-   if (ctx->input_mask & (1 << VARYING_SLOT_PRIMITIVE_ID))
-   ctx->shader_info->fs.prim_id_input = true;
-   if (ctx->input_mask & (1 << VARYING_SLOT_LAYER))
-   ctx->shader_info->fs.layer_input = true;
ctx->shader_info->fs.input_mask = ctx->input_mask >> VARYING_SLOT_VAR0;
 
if (ctx->shader_info->info.needs_multiview_view_index)
diff --git a/src/amd/common/ac_nir_to_llvm.h b/src/amd/common/ac_nir_to_llvm.h
index 07cf9656f5..766acec6ed 100644
--- a/src/amd/common/ac_nir_to_llvm.h
+++ b/src/amd/common/ac_nir_to_llvm.h
@@ -177,11 +177,8 @@ struct ac_shader_variant_info {
unsigned num_interp;
uint32_t input_mask;
uint32_t flat_shaded_mask;
-   bool has_pcoord;
bool can_discard;
bool early_fragment_test;
-   bool prim_id_input;
-   bool layer_input;
} fs;
struct {
unsigned block_size[3];
diff --git a/src/amd/common/ac_shader_info.c b/src/amd/common/ac_shader_info.c
index d76fecd244..57d7edec76 100644
--- a/src/amd/common/ac_shader_info.c
+++ b/src/amd/common/ac_shader_info.c
@@ -194,6 +194,21 @@ gather_info_input_decl_ps(const nir_shader *nir, const nir_variable *var,
  struct ac_shader_info *info)
 {
const struct glsl_type *type = glsl_without_array(var->type);
+   int idx = var->data.location;
+
+   switch (idx) {
+   case VARYING_SLOT_PNTC:
+   info->ps.has_pcoord = true;
+   break;
+   case VARYING_SLOT_PRIMITIVE_ID:
+   info->ps.prim_id_input = true;
+   break;
+   case VARYING_SLOT_LAYER:
+   info->ps.layer_input = true;
+   break;
+   default:
+   break;
+   }
 
if (glsl_get_base_type(type) == GLSL_TYPE_FLOAT) {
if (var->data.sample)
diff --git a/src/amd/common/ac_shader_info.h b/src/amd/common/ac_shader_info.h
index 7f87582930..60ddfd2d71 100644
--- a/src/amd/common/ac_shader_info.h
+++ b/src/amd/common/ac_shader_info.h
@@ -49,6 +49,9 @@ struct ac_shader_info {
bool writes_z;
bool writes_stencil;
bool writes_sample_mask;
+   bool has_pcoord;
+   bool prim_id_input;
+   bool layer_input;
} ps;
struct {
bool uses_grid_size;
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 9990a3e863..6ad0b486f1 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1779,9 +1779,9 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
 
/* TODO: These are no longer used as keys we should refactor this */
keys[MESA_SHADER_VERTEX].vs.export_prim_id =
-   pipeline->shaders[MESA_SHADER_FRAGMENT]->info.fs.prim_id_input;
+   pipeline->shaders[MESA_SHADER_FRAGMENT]->info.info.ps.prim_id_input;
keys[MESA_SHADER_TESS_EVAL].tes.export_prim_id =
-   pipeline->shaders[MESA_SHADER_FRAGMENT]->info.fs.prim_id_input;
+   pipeline->shaders[MESA_SHADER_FRAGMENT]->info.info.ps.prim_id_input;
}
 
if (device->physical_device->rad_info.chip_class >= GFX9 && modules[MESA_SHADER_TESS_CTRL]) {
@@ -2750,7 +2750,7 @@ radv_pipeline_generate_ps_inputs(struct radeon_winsys_cs *cs,
 
unsigned ps_offset = 0;
 
-   if (ps->info.fs.prim_id_input) {
+   if (ps->info.info.ps.prim_id_input) {
unsigned vs_offset = outinfo->vs_output_param_offset[VARYING_SLOT_PRIMITIVE_ID];
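
The commit above replaces three tests on an accumulated input mask with a switch on each input variable's location while the declarations are scanned. A rough sketch of that per-declaration approach is shown below. The slot numbers are hypothetical (the real VARYING_SLOT_* values come from Mesa's shader enums), and struct ps_info, gather_ps_input and scan_ps_inputs are illustrative names rather than the driver's own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot numbers used only for this sketch. */
enum {
   SLOT_PRIMITIVE_ID = 1,
   SLOT_LAYER        = 2,
   SLOT_PNTC         = 3,
   SLOT_VAR0         = 4,
};

/* Mirrors the three fields the patch moves into ac_shader_info's ps struct. */
struct ps_info {
   bool has_pcoord;
   bool prim_id_input;
   bool layer_input;
};

/* Record what one input declaration implies, keyed on its location. */
static void gather_ps_input(int location, struct ps_info *info)
{
   switch (location) {
   case SLOT_PNTC:
      info->has_pcoord = true;
      break;
   case SLOT_PRIMITIVE_ID:
      info->prim_id_input = true;
      break;
   case SLOT_LAYER:
      info->layer_input = true;
      break;
   default:
      break;
   }
}

/* Scan every declared input once and return the gathered flags. */
static struct ps_info scan_ps_inputs(const int *locations, size_t count)
{
   struct ps_info info = { false, false, false };

   for (size_t i = 0; i < count; i++)
      gather_ps_input(locations[i], &info);

   return info;
}
```

Scanning a shader whose inputs are a generic varying and the primitive ID would set only prim_id_input, which is the flag radv_pipeline.c now reads through info.info.ps.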
