[Mesa-dev] [PATCH] util/build-id: define ElfW and NT_GNU_BUILD_ID if needed

2017-02-17 Thread Jonathan Gray
Define ElfW() and NT_GNU_BUILD_ID if needed as these defines are not
present on at least OpenBSD and FreeBSD.  Fixes the build on OpenBSD.

Signed-off-by: Jonathan Gray 
---
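For context, NT_GNU_BUILD_ID is the ELF note type (3) that ld's --build-id
option emits, and ElfW(type) names the platform's native-width ELF type.
Below is a minimal sketch of how a build-id note could be located once those
names exist.  It is not the code in build_id.c: struct note_hdr is a
hypothetical stand-in for ElfW(Nhdr) so that the example builds without
<link.h>, and find_build_id is an illustrative name only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifndef NT_GNU_BUILD_ID
#define NT_GNU_BUILD_ID 3   /* same fallback value as the patch */
#endif

#define ALIGN(val, align)  (((val) + (align) - 1) & ~((align) - 1))

/* Hypothetical stand-in for ElfW(Nhdr): three 32-bit words. */
struct note_hdr {
   uint32_t n_namesz;   /* length of the name, including the NUL */
   uint32_t n_descsz;   /* length of the descriptor (the build-id bytes) */
   uint32_t n_type;     /* note type */
};

/* Walk a buffer of ELF notes and return a pointer to the GNU build-id
 * bytes, or NULL if there is none.  *id_len receives the id length. */
const uint8_t *
find_build_id(const uint8_t *buf, size_t len, size_t *id_len)
{
   size_t off = 0;

   while (off <= len && len - off >= sizeof(struct note_hdr)) {
      struct note_hdr hdr;
      memcpy(&hdr, buf + off, sizeof(hdr));

      size_t name_off = off + sizeof(hdr);
      if (hdr.n_namesz > len - name_off)
         return NULL;   /* truncated name */

      /* Name and descriptor are each padded to a 4-byte boundary. */
      size_t desc_off = name_off + ALIGN((size_t)hdr.n_namesz, 4);
      if (desc_off > len || hdr.n_descsz > len - desc_off)
         return NULL;   /* truncated descriptor */

      if (hdr.n_type == NT_GNU_BUILD_ID && hdr.n_namesz == 4 &&
          memcmp(buf + name_off, "GNU", 4) == 0) {
         *id_len = hdr.n_descsz;
         return buf + desc_off;
      }

      off = desc_off + ALIGN((size_t)hdr.n_descsz, 4);
   }

   return NULL;
}
```

A caller would pass the contents of a PT_NOTE segment as buf; on a platform
whose headers already provide NT_GNU_BUILD_ID, the #ifndef guard leaves the
system definition in place.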
 src/util/build_id.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/util/build_id.c b/src/util/build_id.c
index 2993a80cfe..cc0f852730 100644
--- a/src/util/build_id.c
+++ b/src/util/build_id.c
@@ -28,6 +28,14 @@
 
 #include "build_id.h"
 
+#ifndef NT_GNU_BUILD_ID
+#define NT_GNU_BUILD_ID 3
+#endif
+
+#ifndef ElfW
+#define ElfW(type) Elf_##type
+#endif
+
 #define ALIGN(val, align)  (((val) + (align) - 1) & ~((align) - 1))
 
 struct build_id_note {
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] util: Add utility build-id code.

2017-02-17 Thread Jonathan Gray
On Fri, Feb 17, 2017 at 08:30:17AM -0800, Matt Turner wrote:
> On Fri, Feb 17, 2017 at 5:39 AM, Emil Velikov  
> wrote:
> > On 17 February 2017 at 01:10, Jonathan Gray  wrote:
> >> On Thu, Feb 16, 2017 at 04:25:02PM +, Emil Velikov wrote:
> >>> On 16 February 2017 at 14:23, Jonathan Gray  wrote:
> >>> > On Wed, Feb 15, 2017 at 11:11:50AM -0800, Matt Turner wrote:
> >>> >> Provides the ability to read the .note.gnu.build-id section of ELF
> >>> >> binaries, which is inserted by the --build-id=... flag to ld.
> >>> >>
> >>> >> Reviewed-by: Emil Velikov 
> >>> >
> >>> > I don't have time to dig into details right now but this broke the Mesa
> >>> > build on OpenBSD and likely other non-linux platforms:
> >>> >
> >>> > libtool: compile:  gcc -DPACKAGE_NAME=\"Mesa\" 
> >>> > -DPACKAGE_TARNAME=\"mesa\" -DPACKAGE_VERSION=\"17.1.0-devel\" 
> >>> > "-DPACKAGE_STRING=\"Mesa 17.1.0-devel\"" 
> >>> > "-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\";
> >>> >  -DPACKAGE_URL=\"\" -DPACKAGE=\"mesa\" -DVERSION=\"17.1.0-devel\" 
> >>> > -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 
> >>> > -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 
> >>> > -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 
> >>> > -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" 
> >>> > -DYYTEXT_POINTER=1 -DHAVE___BUILTIN_CLZ=1 -DHAVE___BUILTIN_CLZLL=1 
> >>> > -DHAVE___BUILTIN_CTZ=1 -DHAVE___BUILTIN_EXPECT=1 -DHAVE___BUILTIN_FFS=1 
> >>> > -DHAVE___BUILTIN_FFSLL=1 -DHAVE___BUILTIN_POPCOUNT=1 
> >>> > -DHAVE___BUILTIN_POPCOUNTLL=1 -DHAVE_FUNC_ATTRIBUTE_CONST=1 
> >>> > -DHAVE_FUNC_ATTRIBUTE_FLATTEN=1 -DHAVE_FUNC_ATTRIBUTE_FORMAT=1 
> >>> > -DHAVE_FUNC_ATTRIBUTE_MALLOC=1 -DHAVE_FUNC_ATTRIBUTE_PACKED=1 
> >>> > -DHAVE_FUNC_ATTRIBUTE_PURE=1 -DHAVE_FUNC_ATTRIBUTE_UNUSED=1 
> >>> > -DHAVE_FUNC_ATTRIBUTE_VISIBILITY=1 
> >>> > -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT=1 -DHAVE_FUNC_ATTRIBUTE_WEAK=1 
> >>> > -DHAVE_FUNC_ATTRIBUTE_ALIAS=1 -DHAVE_DLADDR=1 -DHAVE_CLOCK_GETTIME=1 
> >>> > -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -I. 
> >>> > -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS 
> >>> > -DDEBUG -DTEXTURE_FLOAT_ENABLED -DUSE_X86_64_ASM -DHAVE_SYS_SYSCTL_H 
> >>> > -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_DLOPEN -DHAVE_DL_ITERATE_PHDR 
> >>> > -DHAVE_POSIX_MEMALIGN -DHAVE_LIBDRM -DGLX_USE_DRM 
> >>> > -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DENABLE_SHADER_CACHE 
> >>> > -DHAVE_MINCORE -I../../include -I../../src -I../../src/mapi 
> >>> > -I../../src/mesa -I../../src/gallium/include 
> >>> > -I../../src/gallium/auxiliary -fvisibility=hidden -Werror=pointer-arith 
> >>> > -g -O2 -Wall -std=gnu99 -Werror=implicit-function-declaration 
> >>> > -Werror=missing-prototypes -fno-math-errno -fno-trapping-math -MT 
> >>> > libmesautil_la-build_id.lo -MD -MP -MF 
> >>> > .deps/libmesautil_la-build_id.Tpo -c build_id.c  -fPIC -DPIC -o 
> >>> > .libs/libmesautil_la-build_id.o
> >>> > In file included from /usr/include/elf_abi.h:31,
> >>> >  from /usr/include/link_elf.h:10,
> >>> >  from /usr/include/link.h:39,
> >>> >  from build_id.c:25:
> >>> > /usr/include/sys/exec_elf.h:585: error: expected 
> >>> > specifier-qualifier-list before 'uint32_t'
> >>> > In file included from /usr/include/link.h:39,
> >>> >  from build_id.c:25:
> >>> > /usr/include/link_elf.h:22: error: expected specifier-qualifier-list 
> >>> > before 'caddr_t'
> >>> > /usr/include/link_elf.h:37: error: expected '=', ',', ';', 'asm' or 
> >>> > '__attribute__' before 'int'
> >>> > In file included from build_id.c:25:
> >>> > /usr/include/link.h:49: error: expected '=', ',', ';', 'asm' or 
> >>> > '__attribute__' before 'struct'
> >>> > /usr/include/link.h:65: error: expected specifier-qualifier-list before 
> >>> > 'caddr_t'
> >>> These look like issue in your platform code/headers. Perhaps some bad
> >>> interaction with the bits that Mesa defines ?
> >>>
> >>> Quick workaround is to check the function only when needed, roughly
> >>> like this pseudo code:
> >>>
> >>> if test $building_any_vulkan_driver = yes ;then
> >>> require_dl...=yes
> >>>
> >>> fi
> >>> 
> >>>
> >>> if test $require_dl... = yes ; then
> >>>AC_CHECK_FUNC([dl_iterate_phdr], [DEFINES="$DEFINES
> >>> -DHAVE_DL_ITERATE_PHDR"], [AC_MSG_ERROR([required  not found])])
> >>> fi
> >>>
> >>>
> >>> Please give it a bash and send us a patch that works on your end.
> >>
> >> Leaning towards something along the lines of the following.
> >> With Nhdr struct definitions added to system exec_elf.h.
> >>
> > IMHO it makes little sense to build the file if no code uses it. That aside:
> 
> Agreed, but I think this will be used for shader cache as well.
> 
> >
> >> The need for sys/types.h here may go away shortly as well.
> >>
> >> diff --git a/src/util/build_id.c 

Re: [Mesa-dev] mesa3d.org is now synced to docs/

2017-02-17 Thread Matt Turner
On Fri, Feb 17, 2017 at 4:59 PM, Eric Engestrom  wrote:
> Hey all,
>
> I (finally) set up the git hook, which means that the website [1] is now
> automagically updated whenever a change in docs/* on master is pushed.
>
> There might be bugs, and I'll be doing some other git hook related
> changes later on; if you see anything weird when pushing or if the
> website's content seems wrong, please send me a mail.

Awesome. Thank you.


[Mesa-dev] mesa3d.org is now synced to docs/

2017-02-17 Thread Eric Engestrom
Hey all,

I (finally) set up the git hook, which means that the website [1] is now
automagically updated whenever a change in docs/* on master is pushed.

There might be bugs, and I'll be doing some other git hook related
changes later on; if you see anything weird when pushing or if the
website's content seems wrong, please send me a mail.

Cheers,
  Eric

[1] https://mesa3d.org


[Mesa-dev] [Bug 99319] godot engine poor performance

2017-02-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99319

Bronson  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|13.0                        |17.0

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 99319] godot engine poor performance

2017-02-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99319

--- Comment #5 from Bronson  ---
I've also put up a demo showing the situation with some content from my current
project.
There is a build here:
https://drive.google.com/open?id=0B_nQZvJoqbFmRUtYT3pnUm14Szg
It will load straight into a map; press "p" to bring up the performance
dialog.

On the various AMD cards in my house I get between 2 and 12 fps.
On my integrated Intel GPU I get about 50-60 fps.
So there is definitely some issue here with the AMD cards.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH 1/2] isl: Return surface creation success from aux helpers

2017-02-17 Thread Jason Ekstrand
The isl_surf_init call that each of these helpers make can, in theory,
fail.  We should propagate that up to the caller rather than just
silently ignoring it.
---
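The pattern this patch applies, sketched outside of isl.  Every name below
(struct surf, surf_init, get_aux_surf, MAX_SURF_BYTES) is a hypothetical
stand-in and not isl API; the point is only that the helper returns the
initializer's result instead of discarding it, so the caller can react.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_SURF_BYTES (UINT64_C(1) << 31)   /* made-up limit for the sketch */

struct surf {
   uint32_t width;
   uint32_t height;
   uint64_t size;
};

/* Stand-in for isl_surf_init(): it can fail, and it says so. */
bool
surf_init(struct surf *s, uint32_t width, uint32_t height)
{
   uint64_t size = (uint64_t)width * height;
   if (width == 0 || height == 0 || size > MAX_SURF_BYTES)
      return false;

   s->width = width;
   s->height = height;
   s->size = size;
   return true;
}

/* Before the change this kind of helper returned void and the result of
 * the init call was silently dropped.  Now it is handed to the caller. */
bool
get_aux_surf(const struct surf *main_surf, struct surf *aux)
{
   return surf_init(aux, main_surf->width, main_surf->height);
}
```

The caller is then expected to check the returned value before using the
auxiliary surface.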
 src/intel/isl/isl.c  | 72 +---
 src/intel/isl/isl.h  |  4 +--
 src/intel/vulkan/anv_image.c |  5 +--
 3 files changed, 40 insertions(+), 41 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 82ab68d..1a47da5 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1323,7 +1323,7 @@ isl_surf_get_tile_info(const struct isl_device *dev,
isl_tiling_get_info(dev, surf->tiling, fmtl->bpb, tile_info);
 }
 
-void
+bool
 isl_surf_get_hiz_surf(const struct isl_device *dev,
   const struct isl_surf *surf,
   struct isl_surf *hiz_surf)
@@ -1391,20 +1391,20 @@ isl_surf_get_hiz_surf(const struct isl_device *dev,
 */
const unsigned samples = ISL_DEV_GEN(dev) >= 9 ? 1 : surf->samples;
 
-   isl_surf_init(dev, hiz_surf,
- .dim = surf->dim,
- .format = ISL_FORMAT_HIZ,
- .width = surf->logical_level0_px.width,
- .height = surf->logical_level0_px.height,
- .depth = surf->logical_level0_px.depth,
- .levels = surf->levels,
- .array_len = surf->logical_level0_px.array_len,
- .samples = samples,
- .usage = ISL_SURF_USAGE_HIZ_BIT,
- .tiling_flags = ISL_TILING_HIZ_BIT);
+   return isl_surf_init(dev, hiz_surf,
+.dim = surf->dim,
+.format = ISL_FORMAT_HIZ,
+.width = surf->logical_level0_px.width,
+.height = surf->logical_level0_px.height,
+.depth = surf->logical_level0_px.depth,
+.levels = surf->levels,
+.array_len = surf->logical_level0_px.array_len,
+.samples = samples,
+.usage = ISL_SURF_USAGE_HIZ_BIT,
+.tiling_flags = ISL_TILING_HIZ_BIT);
 }
 
-void
+bool
 isl_surf_get_mcs_surf(const struct isl_device *dev,
   const struct isl_surf *surf,
   struct isl_surf *mcs_surf)
@@ -1427,17 +1427,17 @@ isl_surf_get_mcs_surf(const struct isl_device *dev,
   unreachable("Invalid sample count");
}
 
-   isl_surf_init(dev, mcs_surf,
- .dim = ISL_SURF_DIM_2D,
- .format = mcs_format,
- .width = surf->logical_level0_px.width,
- .height = surf->logical_level0_px.height,
- .depth = 1,
- .levels = 1,
- .array_len = surf->logical_level0_px.array_len,
- .samples = 1, /* MCS surfaces are really single-sampled */
- .usage = ISL_SURF_USAGE_MCS_BIT,
- .tiling_flags = ISL_TILING_Y0_BIT);
+   return isl_surf_init(dev, mcs_surf,
+.dim = ISL_SURF_DIM_2D,
+.format = mcs_format,
+.width = surf->logical_level0_px.width,
+.height = surf->logical_level0_px.height,
+.depth = 1,
+.levels = 1,
+.array_len = surf->logical_level0_px.array_len,
+.samples = 1, /* MCS surfaces are really single-sampled */
+.usage = ISL_SURF_USAGE_MCS_BIT,
+.tiling_flags = ISL_TILING_Y0_BIT);
 }
 
 bool
@@ -1491,19 +1491,17 @@ isl_surf_get_ccs_surf(const struct isl_device *dev,
   return false;
}
 
-   isl_surf_init(dev, ccs_surf,
- .dim = surf->dim,
- .format = ccs_format,
- .width = surf->logical_level0_px.width,
- .height = surf->logical_level0_px.height,
- .depth = surf->logical_level0_px.depth,
- .levels = surf->levels,
- .array_len = surf->logical_level0_px.array_len,
- .samples = 1,
- .usage = ISL_SURF_USAGE_CCS_BIT,
- .tiling_flags = ISL_TILING_CCS_BIT);
-
-   return true;
+   return isl_surf_init(dev, ccs_surf,
+.dim = surf->dim,
+.format = ccs_format,
+.width = surf->logical_level0_px.width,
+.height = surf->logical_level0_px.height,
+.depth = surf->logical_level0_px.depth,
+.levels = surf->levels,
+.array_len = surf->logical_level0_px.array_len,
+.samples = 1,
+.usage = ISL_SURF_USAGE_CCS_BIT,
+.tiling_flags = ISL_TILING_CCS_BIT);
 }
 
 void
diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index c340e6a..3a2991b 100644
--- a/src/intel/isl/isl.h

[Mesa-dev] [PATCH 2/2] anv: Enable MSAA compression

2017-02-17 Thread Jason Ekstrand
This just enables basic MSAA compression (no fast clears) for all
multisampled surfaces.  This improves the framerate of the Sascha
"multisampling" demo by 76% on my Sky Lake laptop.  Running Talos on
medium settings with 8x MSAA, this improves the framerate in the
benchmark by 80%.
---
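A rough summary of the control flow this patch adds, as a sketch.  The types
below (enum aux_usage, struct att_state) are hypothetical stand-ins for the
real isl/anv ones, and only the MCS case visible in the diff is modelled:
an MCS image uses MCS for both rendering and input, never fast-clears, and
is skipped by the CCS resolve.

```c
#include <stdbool.h>

enum aux_usage { AUX_NONE, AUX_MCS, AUX_CCS };   /* stand-in for isl_aux_usage */

struct att_state {
   enum aux_usage aux_usage;
   enum aux_usage input_aux_usage;
   bool fast_clear;
};

/* Mirrors the new "else if" branch for MCS images.  Returns true if the
 * attachment state was fully decided here. */
bool
compute_mcs_aux_usage(enum aux_usage image_aux, struct att_state *att)
{
   if (image_aux == AUX_MCS) {
      att->aux_usage = AUX_MCS;
      att->input_aux_usage = AUX_MCS;
      att->fast_clear = false;   /* "no fast clears" per the commit message */
      return true;
   }
   return false;   /* other cases are outside this sketch */
}

/* Mirrors the early return in the resolve path: NONE and MCS attachments
 * have nothing to resolve. */
bool
needs_ccs_resolve(const struct att_state *att)
{
   return att->aux_usage != AUX_NONE && att->aux_usage != AUX_MCS;
}
```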
 src/intel/vulkan/TODO  |  2 +-
 src/intel/vulkan/anv_blorp.c   |  3 ++-
 src/intel/vulkan/anv_image.c   |  8 
 src/intel/vulkan/anv_pipeline.c| 14 ++
 src/intel/vulkan/genX_cmd_buffer.c |  5 +
 5 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/TODO b/src/intel/vulkan/TODO
index f8b73a1..daab39f 100644
--- a/src/intel/vulkan/TODO
+++ b/src/intel/vulkan/TODO
@@ -9,7 +9,7 @@ Missing Features:
 
 Performance:
  - Multi-{sampled/gen8,LOD} HiZ
- - Compressed multisample support
+ - MSAA fast clears
  - Pushing pieces of UBOs?
  - Enable guardband clipping
  - Use soft-pin to avoid relocations
diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 4e7078b..902d9af 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1397,7 +1397,8 @@ ccs_resolve_attachment(struct anv_cmd_buffer *cmd_buffer,
struct anv_attachment_state *att_state =
   &cmd_buffer->state.attachments[att];
 
-   if (att_state->aux_usage == ISL_AUX_USAGE_NONE)
+   if (att_state->aux_usage == ISL_AUX_USAGE_NONE ||
+   att_state->aux_usage == ISL_AUX_USAGE_MCS)
   return; /* Nothing to resolve */
 
assert(att_state->aux_usage == ISL_AUX_USAGE_CCS_E ||
diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
index 7eb0f8f..cc47a50 100644
--- a/src/intel/vulkan/anv_image.c
+++ b/src/intel/vulkan/anv_image.c
@@ -238,6 +238,14 @@ make_surface(const struct anv_device *dev,
 }
  }
   }
+   } else if (aspect == VK_IMAGE_ASPECT_COLOR_BIT && vk_info->samples > 1) {
+  assert(image->aux_surface.isl.size == 0);
+  assert(!(vk_info->usage & VK_IMAGE_USAGE_STORAGE_BIT));
+  ok = isl_surf_get_mcs_surf(&dev->isl_dev, &anv_surf->isl,
+ &image->aux_surface.isl);
+  assert(ok);
+  add_surface(image, &image->aux_surface);
+  image->aux_usage = ISL_AUX_USAGE_MCS;
}
 
return VK_SUCCESS;
diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 4410103..3301679 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -228,6 +228,20 @@ static void
 populate_sampler_prog_key(const struct gen_device_info *devinfo,
   struct brw_sampler_prog_key_data *key)
 {
+   /* All multisampled textures are compressed. */
+   key->compressed_multisample_layout_mask = ~0;
+
+   /* SkyLake added support for 16x MSAA.  With this came a new message for
+* reading from a 16x MSAA surface with compression.  The new message was
+* needed because now the MCS data is 64 bits instead of 32 or lower as is
+* the case for 8x, 4x, and 2x.  The key->msaa_16 bit-field controls which
+* message we use.  Fortunately, the 16x message works for 8x, 4x, and 2x
+* so we can just use it unconditionally.  This may not be quite as
+* efficient but it saves us from recompiling.
+*/
+   if (devinfo->gen >= 9)
+  key->msaa_16 = ~0;
+
/* XXX: Handle texture swizzle on HSW- */
for (int i = 0; i < MAX_SAMPLERS; i++) {
   /* Assume color sampler, no swizzling. (Works for BDW+) */
diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c
index 40a72f4..5d8c3ea 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -222,6 +222,11 @@ color_attachment_compute_aux_usage(struct anv_device *device,
   att_state->input_aux_usage = ISL_AUX_USAGE_NONE;
   att_state->fast_clear = false;
   return;
+   } else if (iview->image->aux_usage == ISL_AUX_USAGE_MCS) {
+  att_state->aux_usage = ISL_AUX_USAGE_MCS;
+  att_state->input_aux_usage = ISL_AUX_USAGE_MCS;
+  att_state->fast_clear = false;
+  return;
}
 
assert(iview->image->aux_surface.isl.usage & ISL_SURF_USAGE_CCS_BIT);
-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] [Mesa-stable] Mesa 13.0.5 release candidate

2017-02-17 Thread Marek Olšák
On Fri, Feb 17, 2017 at 7:24 PM, Emil Velikov  wrote:
> On 17 February 2017 at 17:14, Andreas Boll  wrote:
>> 2017-02-17 16:15 GMT+01:00 Emil Velikov :
>>> Hello list,
>>>
>>> The candidate for the Mesa 13.0.5 is now available. Currently we have:
>>>  - 70 queued
>>>  - 5 nominated (outstanding)
>>>  - and 0 rejected patch(es)
>>>
>>>
>>> Testing reports/general approval
>>> 
>>> Any testing reports (or general approval of the state of the branch) will be
>>> greatly appreciated.
>>>
>>> The plan is to have 13.0.5 this Friday (19th of February), around or shortly
>>> after 15:00 GMT.
>>
>> This Friday (17th of February) or this Sunday (19th of February)? :-)
>>
> The latter.
>
>>>
>>> If you have any questions or suggestions - be that about the current patch
>>> queue or otherwise, please go ahead.
>>>
>>
>> Please cherry-pick the following commit for 13.0.5 and 17.0.1:
>>
>> commit 94262e5f5db1f5c7865ced251c440bc5f3f4a89d
>> Author: Bartosz Tomczyk 
>> Date:   Sun Jan 29 19:10:25 2017 +0100
>>
>> r600/sb: Fix memory leak
>>
>> Signed-off-by: Marek Olšák 
>>
>> It fixes e933246013ee ("r600/sb: Fix loop optimization related hangs on eg")
>>
> Barring any objections from the author or Marek I'll pick it up.

Sounds good to me.

Marek


Re: [Mesa-dev] [PATCH v2 2/2] vulkan: Combine wsi and util makefiles

2017-02-17 Thread Matt Turner
Nice, thanks.

Reviewed-by: Matt Turner 


[Mesa-dev] [PATCH v2 1/2] vulkan/util: Add generator for enum_to_str functions

2017-02-17 Thread Dylan Baker
This adds a python generator to produce enum_to_str functions for
Vulkan from the vk.xml API description. It supports extensions as well
as core API features, and the generator works with both python2 and
python3.

Signed-off-by: Dylan Baker 

v2: - Fix automake comments from Matt
---
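For illustration, the generated C could look roughly like the following.
This is a guess at the shape of the output, not text produced by
gen_enum_to_str.py.  example_result is a hypothetical stand-in for VkResult
(using the same numeric values for the entries shown) so that the block
builds without the Vulkan headers, and the string returned for unknown
values is an assumption.

```c
/* Hypothetical subset of VkResult, renamed to avoid needing vulkan.h. */
enum example_result {
   EXAMPLE_SUCCESS = 0,
   EXAMPLE_ERROR_OUT_OF_HOST_MEMORY = -1,
   EXAMPLE_ERROR_OUT_OF_DEVICE_MEMORY = -2,
   EXAMPLE_ERROR_DEVICE_LOST = -4,
};

/* One switch per enum, each case returning the enumerant's own name. */
const char *
example_result_to_str(enum example_result input)
{
   switch (input) {
   case EXAMPLE_SUCCESS:
      return "EXAMPLE_SUCCESS";
   case EXAMPLE_ERROR_OUT_OF_HOST_MEMORY:
      return "EXAMPLE_ERROR_OUT_OF_HOST_MEMORY";
   case EXAMPLE_ERROR_OUT_OF_DEVICE_MEMORY:
      return "EXAMPLE_ERROR_OUT_OF_DEVICE_MEMORY";
   case EXAMPLE_ERROR_DEVICE_LOST:
      return "EXAMPLE_ERROR_DEVICE_LOST";
   default:
      return "Unknown example_result value";
   }
}
```

Generating this from vk.xml replaces the hand-maintained ERROR_CASE list in
anv_util.c with a single call such as vk_Result_to_str(error).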
 configure.ac   |   1 +
 src/Makefile.am|   2 +-
 src/intel/vulkan/Makefile.am   |   2 +
 src/intel/vulkan/anv_util.c|  36 +---
 src/vulkan/util/.gitignore |   1 +
 src/vulkan/util/Makefile.am|  22 +
 src/vulkan/util/gen_enum_to_str.py | 172 +
 7 files changed, 201 insertions(+), 35 deletions(-)
 create mode 100644 src/vulkan/util/.gitignore
 create mode 100644 src/vulkan/util/Makefile.am
 create mode 100644 src/vulkan/util/gen_enum_to_str.py

diff --git a/configure.ac b/configure.ac
index 7e4544f5bf..c83a5234da 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2691,6 +2691,7 @@ AC_CONFIG_FILES([Makefile
src/mesa/main/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
+   src/vulkan/util/Makefile
src/vulkan/wsi/Makefile])
 
 AC_OUTPUT
diff --git a/src/Makefile.am b/src/Makefile.am
index 12e5dcdb12..cbdf378c54 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -117,7 +117,7 @@ SUBDIRS += intel/tools
 endif
 
 if HAVE_VULKAN_COMMON
-SUBDIRS += vulkan/wsi
+SUBDIRS += vulkan/util vulkan/wsi
 endif
 EXTRA_DIST += vulkan/registry/vk.xml
 
diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
index 4197b0e77c..54bf0f5de1 100644
--- a/src/intel/vulkan/Makefile.am
+++ b/src/intel/vulkan/Makefile.am
@@ -49,6 +49,7 @@ AM_CPPFLAGS = \
-I$(top_builddir)/src \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/vulkan/wsi \
+   -I$(top_builddir)/src/vulkan/util \
-I$(top_builddir)/src/compiler \
-I$(top_srcdir)/src/compiler \
-I$(top_builddir)/src/compiler/nir \
@@ -125,6 +126,7 @@ libvulkan_common_la_SOURCES = $(VULKAN_SOURCES)
 
 VULKAN_LIB_DEPS += \
libvulkan_common.la \
+   $(top_builddir)/src/vulkan/util/libvulkan_util.la \
$(top_builddir)/src/vulkan/wsi/libvulkan_wsi.la \
$(top_builddir)/src/mesa/drivers/dri/i965/libi965_compiler.la \
$(top_builddir)/src/compiler/nir/libnir.la \
diff --git a/src/intel/vulkan/anv_util.c b/src/intel/vulkan/anv_util.c
index 6d75187065..ec5c9486d8 100644
--- a/src/intel/vulkan/anv_util.c
+++ b/src/intel/vulkan/anv_util.c
@@ -29,6 +29,7 @@
 #include 
 
 #include "anv_private.h"
+#include "vk_enum_to_str.h"
 
 /** Log an error message.  */
 void anv_printflike(1, 2)
@@ -69,40 +70,7 @@ __vk_errorf(VkResult error, const char *file, int line, const char *format, ...)
va_list ap;
char buffer[256];
 
-#define ERROR_CASE(error) case error: error_str = #error; break;
-
-   const char *error_str;
-   switch ((int32_t)error) {
-
-   /* Core errors */
-   ERROR_CASE(VK_ERROR_OUT_OF_HOST_MEMORY)
-   ERROR_CASE(VK_ERROR_OUT_OF_DEVICE_MEMORY)
-   ERROR_CASE(VK_ERROR_INITIALIZATION_FAILED)
-   ERROR_CASE(VK_ERROR_DEVICE_LOST)
-   ERROR_CASE(VK_ERROR_MEMORY_MAP_FAILED)
-   ERROR_CASE(VK_ERROR_LAYER_NOT_PRESENT)
-   ERROR_CASE(VK_ERROR_EXTENSION_NOT_PRESENT)
-   ERROR_CASE(VK_ERROR_FEATURE_NOT_PRESENT)
-   ERROR_CASE(VK_ERROR_INCOMPATIBLE_DRIVER)
-   ERROR_CASE(VK_ERROR_TOO_MANY_OBJECTS)
-   ERROR_CASE(VK_ERROR_FORMAT_NOT_SUPPORTED)
-   ERROR_CASE(VK_ERROR_FRAGMENTED_POOL)
-
-   /* Extension errors */
-   ERROR_CASE(VK_ERROR_SURFACE_LOST_KHR)
-   ERROR_CASE(VK_ERROR_NATIVE_WINDOW_IN_USE_KHR)
-   ERROR_CASE(VK_ERROR_OUT_OF_DATE_KHR)
-   ERROR_CASE(VK_ERROR_INCOMPATIBLE_DISPLAY_KHR)
-   ERROR_CASE(VK_ERROR_VALIDATION_FAILED_EXT)
-   ERROR_CASE(VK_ERROR_INVALID_SHADER_NV)
-   ERROR_CASE(VK_ERROR_OUT_OF_POOL_MEMORY_KHR)
-
-   default:
-  assert(!"Unknown error");
-  error_str = "unknown error";
-   }
-
-#undef ERROR_CASE
+   const char *error_str = vk_Result_to_str(error);
 
if (format) {
   va_start(ap, format);
diff --git a/src/vulkan/util/.gitignore b/src/vulkan/util/.gitignore
new file mode 100644
index 00..5c79217982
--- /dev/null
+++ b/src/vulkan/util/.gitignore
@@ -0,0 +1 @@
+vk_enum_to_str.*
diff --git a/src/vulkan/util/Makefile.am b/src/vulkan/util/Makefile.am
new file mode 100644
index 00..87c96d5e5b
--- /dev/null
+++ b/src/vulkan/util/Makefile.am
@@ -0,0 +1,22 @@
+vulkan_api_xml = $(top_srcdir)/src/vulkan/registry/vk.xml
+
+AM_CPPFLAGS = \
+   -I$(top_srcdir)/include \
+   -I$(top_srcdir)/src
+
+EXTRA_DIST = \
+   gen_enum_to_str.py
+
+BUILT_SOURCES = \
+   vk_enum_to_str.c \
+   vk_enum_to_str.h
+
+vk_enum_to_str.c vk_enum_to_str.h: gen_enum_to_str.py $(vulkan_api_xml)
+   $(AM_V_GEN)$(PYTHON2) $(srcdir)/gen_enum_to_str.py
+
+noinst_LTLIBRARIES = libvulkan_util.la
+
+libvulkan_util_la_SOURCES = \
+   vk_enum_to_str.c \
+  

[Mesa-dev] [PATCH v2 2/2] vulkan: Combine wsi and util makefiles

2017-02-17 Thread Dylan Baker
cc: Matt Turner 
Signed-off-by: Dylan Baker 

v2: - add this patch
---
 configure.ac   |  3 +--
 src/Makefile.am|  2 +-
 src/intel/vulkan/Makefile.am   |  4 ++--
 src/vulkan/{wsi => }/Makefile.am   | 16 ++--
 src/vulkan/Makefile.sources| 16 
 src/vulkan/util/Makefile.am| 22 --
 src/vulkan/util/gen_enum_to_str.py |  4 ++--
 src/vulkan/wsi/Makefile.sources| 12 
 8 files changed, 36 insertions(+), 43 deletions(-)
 rename src/vulkan/{wsi => }/Makefile.am (70%)
 create mode 100644 src/vulkan/Makefile.sources
 delete mode 100644 src/vulkan/util/Makefile.am
 delete mode 100644 src/vulkan/wsi/Makefile.sources

diff --git a/configure.ac b/configure.ac
index c83a5234da..44c788377b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2691,8 +2691,7 @@ AC_CONFIG_FILES([Makefile
src/mesa/main/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
-   src/vulkan/util/Makefile
-   src/vulkan/wsi/Makefile])
+   src/vulkan/Makefile])
 
 AC_OUTPUT
 
diff --git a/src/Makefile.am b/src/Makefile.am
index cbdf378c54..860be53c01 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -117,7 +117,7 @@ SUBDIRS += intel/tools
 endif
 
 if HAVE_VULKAN_COMMON
-SUBDIRS += vulkan/util vulkan/wsi
+SUBDIRS += vulkan
 endif
 EXTRA_DIST += vulkan/registry/vk.xml
 
diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
index 54bf0f5de1..8089762610 100644
--- a/src/intel/vulkan/Makefile.am
+++ b/src/intel/vulkan/Makefile.am
@@ -126,8 +126,8 @@ libvulkan_common_la_SOURCES = $(VULKAN_SOURCES)
 
 VULKAN_LIB_DEPS += \
libvulkan_common.la \
-   $(top_builddir)/src/vulkan/util/libvulkan_util.la \
-   $(top_builddir)/src/vulkan/wsi/libvulkan_wsi.la \
+   $(top_builddir)/src/vulkan/libvulkan_util.la \
+   $(top_builddir)/src/vulkan/libvulkan_wsi.la \
$(top_builddir)/src/mesa/drivers/dri/i965/libi965_compiler.la \
$(top_builddir)/src/compiler/nir/libnir.la \
$(top_builddir)/src/util/libmesautil.la \
diff --git a/src/vulkan/wsi/Makefile.am b/src/vulkan/Makefile.am
similarity index 70%
rename from src/vulkan/wsi/Makefile.am
rename to src/vulkan/Makefile.am
index a71279947a..df84cfd71d 100644
--- a/src/vulkan/wsi/Makefile.am
+++ b/src/vulkan/Makefile.am
@@ -1,9 +1,21 @@
-
 include Makefile.sources
 
+noinst_LTLIBRARIES = libvulkan_wsi.la libvulkan_util.la 
+
 vulkan_includedir = $(includedir)/vulkan
+vulkan_api_xml = $(top_srcdir)/src/vulkan/registry/vk.xml
+
+EXTRA_DIST = \
+   util/gen_enum_to_str.py
+
+BUILT_SOURCES = \
+   util/vk_enum_to_str.c \
+   util/vk_enum_to_str.h
+
+util/vk_enum_to_str.c util/vk_enum_to_str.h: util/gen_enum_to_str.py $(vulkan_api_xml)
+   $(AM_V_GEN)$(PYTHON2) $(srcdir)/util/gen_enum_to_str.py
 
-noinst_LTLIBRARIES = libvulkan_wsi.la
+libvulkan_util_la_SOURCES = $(VULKAN_UTIL_FILES)
 
 AM_CPPFLAGS = \
$(DEFINES) \
diff --git a/src/vulkan/Makefile.sources b/src/vulkan/Makefile.sources
new file mode 100644
index 00..fbb8bfc51d
--- /dev/null
+++ b/src/vulkan/Makefile.sources
@@ -0,0 +1,16 @@
+
+VULKAN_WSI_FILES := \
+   wsi/wsi_common.h \
+   wsi/wsi_common_queue.h
+
+VULKAN_WSI_WAYLAND_FILES := \
+   wsi/wsi_common_wayland.c \
+   wsi/wsi_common_wayland.h
+
+VULKAN_WSI_X11_FILES := \
+   wsi/wsi_common_x11.c \
+   wsi/wsi_common_x11.h
+
+VULKAN_UTIL_FILES := \
+   util/vk_enum_to_str.c \
+   util/vk_enum_to_str.h
diff --git a/src/vulkan/util/Makefile.am b/src/vulkan/util/Makefile.am
deleted file mode 100644
index 87c96d5e5b..00
--- a/src/vulkan/util/Makefile.am
+++ /dev/null
@@ -1,22 +0,0 @@
-vulkan_api_xml = $(top_srcdir)/src/vulkan/registry/vk.xml
-
-AM_CPPFLAGS = \
-   -I$(top_srcdir)/include \
-   -I$(top_srcdir)/src
-
-EXTRA_DIST = \
-   gen_enum_to_str.py
-
-BUILT_SOURCES = \
-   vk_enum_to_str.c \
-   vk_enum_to_str.h
-
-vk_enum_to_str.c vk_enum_to_str.h: gen_enum_to_str.py $(vulkan_api_xml)
-   $(AM_V_GEN)$(PYTHON2) $(srcdir)/gen_enum_to_str.py
-
-noinst_LTLIBRARIES = libvulkan_util.la
-
-libvulkan_util_la_SOURCES = \
-   vk_enum_to_str.c \
-   vk_enum_to_str.h
-
diff --git a/src/vulkan/util/gen_enum_to_str.py b/src/vulkan/util/gen_enum_to_str.py
index 0564b8e028..4b6fdf3b3d 100644
--- a/src/vulkan/util/gen_enum_to_str.py
+++ b/src/vulkan/util/gen_enum_to_str.py
@@ -159,8 +159,8 @@ def xml_parser(filename):
 
 def main():
 enums = xml_parser(VK_XML)
-for template, file_ in [(C_TEMPLATE, 'vk_enum_to_str.c'),
-(H_TEMPLATE, 'vk_enum_to_str.h')]:
+for template, file_ in [(C_TEMPLATE, 'util/vk_enum_to_str.c'),
+(H_TEMPLATE, 'util/vk_enum_to_str.h')]:
 with open(file_, 'wb') as f:
 f.write(template.render(
  

[Mesa-dev] [PATCH 2/2] [swr] fix index buffers with non-zero indices

2017-02-17 Thread George Kyriazis
Fix issue with index buffers that do not contain a 0 index.  0 index
can be a non-valid index if the (copied) vertex buffers are a subset of the
user's (which happens because we only copy the range between min & max).
Core will use an index passed in from the driver to replace invalid indices.

Only do this for calls that contain non-zero indices, to minimize performance
cost.
---
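The index handling described above, written as plain C rather than JIT
builder calls.  This is a sketch under assumptions: fetch_offset is an
illustrative name, the start-vertex offset from the real code is left out,
and replacing out-of-bounds accesses with the min-vertex offset is inferred
from the updated comment in the diff rather than from the final code.

```c
#include <stdint.h>

/* Return the byte offset to fetch from for one index.
 *   min_vertex: lowest vertex present in the (partial) copied buffer
 *   stride:     bytes between consecutive vertices
 *   size:       size of the vertex buffer in bytes
 *   bpp:        bytes read per element */
uint64_t
fetch_offset(int32_t index, int32_t min_vertex,
             uint64_t stride, uint64_t size, uint64_t bpp)
{
   /* Clamp indices below the first copied vertex (signed compare, as
    * with ICMP_SLT in the patch). */
   if (index < min_vertex)
      index = min_vertex;

   uint64_t offset = (uint64_t)(uint32_t)index * stride;
   uint64_t min_offset = (uint64_t)(uint32_t)min_vertex * stride;

   /* In bounds only if the last byte read is still inside the buffer;
    * otherwise fall back to the known-valid min vertex instead of 0. */
   return (offset + bpp <= size) ? offset : min_offset;
}
```

With min_vertex = 5, stride = 16, size = 160 and bpp = 16, index 2 is
clamped up to vertex 5 (offset 80), index 7 is fetched as-is (offset 112),
and index 20 is out of bounds and also falls back to offset 80.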
 src/gallium/drivers/swr/rasterizer/core/state.h|  1 +
 .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 60 +++---
 .../drivers/swr/rasterizer/jitter/fetch_jit.h  |  2 +
 src/gallium/drivers/swr/swr_draw.cpp   |  1 +
 src/gallium/drivers/swr/swr_state.cpp  |  4 ++
 5 files changed, 62 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/core/state.h b/src/gallium/drivers/swr/rasterizer/core/state.h
index 2f3b913..05347dc 100644
--- a/src/gallium/drivers/swr/rasterizer/core/state.h
+++ b/src/gallium/drivers/swr/rasterizer/core/state.h
@@ -524,6 +524,7 @@ struct SWR_VERTEX_BUFFER_STATE
 const uint8_t *pData;
 uint32_t size;
 uint32_t numaNode;
+uint32_t minVertex; // min vertex (for bounds checking)
 uint32_t maxVertex; // size / pitch.  precalculated value used 
by fetch shader for OOB checks
 uint32_t partialInboundsSize;   // size % pitch.  precalculated value used 
by fetch shader for partially OOB vertices
 };
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
index 901bce6..ffa7605 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
@@ -309,11 +309,29 @@ void FetchJit::JitLoadVertices(const FETCH_COMPILE_STATE &fetchState, Value* str
 
 Value* startVertexOffset = MUL(Z_EXT(startOffset, mInt64Ty), stride);
 
+Value *minVertex = NULL;
+Value *minVertexOffset = NULL;
+if (fetchState.bPartialVertexBuffer) {
+// fetch min index for low bounds checking
+minVertex = GEP(streams, {C(ied.StreamIndex), 
C(SWR_VERTEX_BUFFER_STATE_minVertex)});
+minVertex = LOAD(minVertex);
+if (!fetchState.bDisableIndexOOBCheck) {
+minVertexOffset = MUL(Z_EXT(minVertex, mInt64Ty), stride);
+}
+}
+
 // Load from the stream.
 for(uint32_t lane = 0; lane < mVWidth; ++lane)
 {
 // Get index
 Value* index = VEXTRACT(vCurIndices, C(lane));
+
+if (fetchState.bPartialVertexBuffer) {
+// clamp below minvertex
+Value *isBelowMin = ICMP_SLT(index, minVertex);
+index = SELECT(isBelowMin, minVertex, index);
+}
+
 index = Z_EXT(index, mInt64Ty);
 
 Value*offset = MUL(index, stride);
@@ -321,10 +339,14 @@ void FetchJit::JitLoadVertices(const FETCH_COMPILE_STATE &fetchState, Value* str
 offset = ADD(offset, startVertexOffset);
 
 if (!fetchState.bDisableIndexOOBCheck) {
-// check for out of bound access, including partial OOB, and 
mask them to 0
+// check for out of bound access, including partial OOB, and 
replace them with minVertex
 Value *endOffset = ADD(offset, C((int64_t)info.Bpp));
 Value *oob = ICMP_ULE(endOffset, size);
-offset = SELECT(oob, offset, ConstantInt::get(mInt64Ty, 0));
+if (fetchState.bPartialVertexBuffer) {
+offset = SELECT(oob, offset, minVertexOffset);
+} else {
+offset = SELECT(oob, offset, ConstantInt::get(mInt64Ty, 
0));
+}
 }
 
 Value*pointer = GEP(stream, offset);
@@ -732,6 +754,13 @@ void FetchJit::JitGatherVertices(const FETCH_COMPILE_STATE &fetchState,
 Value *maxVertex = GEP(streams, {C(ied.StreamIndex), 
C(SWR_VERTEX_BUFFER_STATE_maxVertex)});
 maxVertex = LOAD(maxVertex);
 
+Value *minVertex = NULL;
+if (fetchState.bPartialVertexBuffer) {
+// min vertex index for low bounds OOB checking
+minVertex = GEP(streams, {C(ied.StreamIndex), 
C(SWR_VERTEX_BUFFER_STATE_minVertex)});
+minVertex = LOAD(minVertex);
+}
+
 Value *vCurIndices;
 Value *startOffset;
 if(ied.InstanceEnable)
@@ -769,9 +798,16 @@ void FetchJit::JitGatherVertices(const FETCH_COMPILE_STATE &fetchState,
 
 // if we have a start offset, subtract from max vertex. Used for OOB 
check
 maxVertex = SUB(Z_EXT(maxVertex, mInt64Ty), Z_EXT(startOffset, 
mInt64Ty));
-Value* neg = ICMP_SLT(maxVertex, C((int64_t)0));
+Value* maxNeg = ICMP_SLT(maxVertex, C((int64_t)0));
 // if we have a negative value, we're already OOB. clamp at 0.
-maxVertex = SELECT(neg, C(0), TRUNC(maxVertex, mInt32Ty));
+maxVertex = SELECT(maxNeg, 

[Mesa-dev] [PATCH 1/2] [swr] Add fetch shader cache

2017-02-17 Thread George Kyriazis
For now, the cache key is all of FETCH_COMPILE_STATE.

Use new/delete for swr_vertex_element_state, since we have to call the
constructors/destructors of the struct elements.
---
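A small sketch of the lookup-or-compile pattern this patch adds to
swr_draw_vbo, modelled in Python. The class and method names are hypothetical,
and a tuple stands in for the FETCH_COMPILE_STATE bytes; the patch itself
hashes the raw struct with util_hash_crc32 and compares keys with memcmp,
whereas this sketch relies on Python's built-in tuple hashing.

```python
class FetchShaderCache:
    """Cache compiled fetch shaders, keyed on the complete compile state."""

    def __init__(self, compile_fn):
        self._compile = compile_fn   # stand-in for JitCompileFetch
        self._shaders = {}           # stand-in for velems->map
        self.compile_count = 0

    def lookup(self, cut_index, enable_cut_index, layout):
        # The whole state is the key, so any field change is a new entry.
        key = (cut_index, enable_cut_index, tuple(layout))
        shader = self._shaders.get(key)
        if shader is None:
            shader = self._compile(key)
            self.compile_count += 1
            self._shaders[key] = shader
        return shader


cache = FetchShaderCache(lambda key: ("shader", key))
first = cache.lookup(0xFFFF, True, ["attr0", "attr1"])
again = cache.lookup(0xFFFF, True, ["attr0", "attr1"])
print(first is again, cache.compile_count)
```

Compared with the previous code, which recompiled whenever the cut index or
primitive-restart flag differed from the last draw, alternating between two
states compiles each variant only once.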
 src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h |  2 +-
 src/gallium/drivers/swr/swr_draw.cpp  | 19 +++
 src/gallium/drivers/swr/swr_shader.cpp| 14 ++
 src/gallium/drivers/swr/swr_shader.h  | 15 +++
 src/gallium/drivers/swr/swr_state.cpp |  6 --
 src/gallium/drivers/swr/swr_state.h   |  9 +
 6 files changed, 50 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h 
b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h
index 1547453..622608a 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h
+++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h
@@ -94,7 +94,7 @@ enum ComponentControl
 //
 struct FETCH_COMPILE_STATE
 {
-uint32_t numAttribs;
+uint32_t numAttribs {0};
 INPUT_ELEMENT_DESC layout[KNOB_NUM_ATTRIBUTES];
 SWR_FORMAT indexType;
 uint32_t cutIndex{ 0x };
diff --git a/src/gallium/drivers/swr/swr_draw.cpp 
b/src/gallium/drivers/swr/swr_draw.cpp
index c4d5e5c..4bdd3bb 100644
--- a/src/gallium/drivers/swr/swr_draw.cpp
+++ b/src/gallium/drivers/swr/swr_draw.cpp
@@ -141,19 +141,22 @@ swr_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
}
 
struct swr_vertex_element_state *velems = ctx->velems;
-   if (!velems->fsFunc
-   || (velems->fsState.cutIndex != info->restart_index)
-   || (velems->fsState.bEnableCutIndex != info->primitive_restart)) {
-
-  velems->fsState.cutIndex = info->restart_index;
-  velems->fsState.bEnableCutIndex = info->primitive_restart;
-
-  /* Create Fetch Shader */
+   velems->fsState.cutIndex = info->restart_index;
+   velems->fsState.bEnableCutIndex = info->primitive_restart;
+
+   swr_jit_fetch_key key;
+   swr_generate_fetch_key(key, velems);
+   auto search = velems->map.find(key);
+   if (search != velems->map.end()) {
+  velems->fsFunc = search->second;
+   } else {
   HANDLE hJitMgr = swr_screen(ctx->pipe.screen)->hJitMgr;
   velems->fsFunc = JitCompileFetch(hJitMgr, velems->fsState);
 
   debug_printf("fetch shader %p\n", velems->fsFunc);
   assert(velems->fsFunc && "Error: FetchShader = NULL");
+
+  velems->map.insert(std::make_pair(key, velems->fsFunc));
}
 
SwrSetFetchFunc(ctx->swrContext, velems->fsFunc);
diff --git a/src/gallium/drivers/swr/swr_shader.cpp 
b/src/gallium/drivers/swr/swr_shader.cpp
index 979a28b..676938c 100644
--- a/src/gallium/drivers/swr/swr_shader.cpp
+++ b/src/gallium/drivers/swr/swr_shader.cpp
@@ -61,6 +61,11 @@ bool operator==(const swr_jit_vs_key &lhs, const swr_jit_vs_key &rhs)
    return !memcmp(&lhs, &rhs, sizeof(lhs));
 }
 
+bool operator==(const swr_jit_fetch_key &lhs, const swr_jit_fetch_key &rhs)
+{
+   return !memcmp(&lhs, &rhs, sizeof(lhs));
+}
+
 static void
 swr_generate_sampler_key(const struct lp_tgsi_info ,
  struct swr_context *ctx,
@@ -157,6 +162,15 @@ swr_generate_vs_key(struct swr_jit_vs_key &key,
    swr_generate_sampler_key(swr_vs->info, ctx, PIPE_SHADER_VERTEX, key);
 }
 
+void
+swr_generate_fetch_key(struct swr_jit_fetch_key &key,
+                       struct swr_vertex_element_state *velems)
+{
+   memset(&key, 0, sizeof(key));
+
+   key.fsState = velems->fsState;
+}
+
 struct BuilderSWR : public Builder {
BuilderSWR(JitManager *pJitMgr, const char *pName)
   : Builder(pJitMgr)
diff --git a/src/gallium/drivers/swr/swr_shader.h 
b/src/gallium/drivers/swr/swr_shader.h
index 7e3399c..266573f 100644
--- a/src/gallium/drivers/swr/swr_shader.h
+++ b/src/gallium/drivers/swr/swr_shader.h
@@ -42,6 +42,9 @@ void swr_generate_vs_key(struct swr_jit_vs_key &key,
                          struct swr_context *ctx,
                          swr_vertex_shader *swr_vs);
 
+void swr_generate_fetch_key(struct swr_jit_fetch_key &key,
+                            struct swr_vertex_element_state *velems);
+
 struct swr_jit_sampler_key {
unsigned nr_samplers;
unsigned nr_sampler_views;
@@ -60,6 +63,10 @@ struct swr_jit_vs_key : swr_jit_sampler_key {
unsigned clip_plane_mask; // from rasterizer state & vs_info
 };
 
+struct swr_jit_fetch_key {
+   FETCH_COMPILE_STATE fsState;
+};
+
 namespace std
 {
 template <> struct hash {
@@ -75,7 +82,15 @@ template <> struct hash {
       return util_hash_crc32(&k, sizeof(k));
    }
 };
+
+template <> struct hash<swr_jit_fetch_key> {
+   std::size_t operator()(const swr_jit_fetch_key &k) const
+   {
+      return util_hash_crc32(&k, sizeof(k));
+   }
+};
 };
 
 bool operator==(const swr_jit_fs_key &lhs, const swr_jit_fs_key &rhs);
 bool operator==(const swr_jit_vs_key &lhs, const swr_jit_vs_key &rhs);
+bool operator==(const swr_jit_fetch_key &lhs, const swr_jit_fetch_key &rhs);
diff --git 

Re: [Mesa-dev] [PATCH] vulkan/util: Add generator for enum_to_str functions

2017-02-17 Thread Dylan Baker
I'll send out a v2 soon.

Quoting Matt Turner (2017-02-17 11:38:17)
> On Fri, Feb 17, 2017 at 10:49 AM, Dylan Baker  wrote:
> > This adds a python generator to produce enum_to_str functions for
> > Vulkan from the vk.xml API description. It supports extensions as well
> > as core API features, and the generator works with both python2 and
> > python3.
> >
> > CC: Jason Ekstrand 
> > Signed-off-by: Dylan Baker 
> > ---
> >  configure.ac   |   1 +
> >  src/Makefile.am|   1 +
> >  src/intel/vulkan/Makefile.am   |   2 +
> >  src/intel/vulkan/anv_util.c|  36 +---
> >  src/vulkan/util/.gitignore |   1 +
> >  src/vulkan/util/Makefile.am|  22 +
> >  src/vulkan/util/gen_enum_to_str.py | 172 
> > +
> >  7 files changed, 201 insertions(+), 34 deletions(-)
> >  create mode 100644 src/vulkan/util/.gitignore
> >  create mode 100644 src/vulkan/util/Makefile.am
> >  create mode 100644 src/vulkan/util/gen_enum_to_str.py
> >
> > diff --git a/configure.ac b/configure.ac
> > index 7e4544f5bf..c83a5234da 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -2691,6 +2691,7 @@ AC_CONFIG_FILES([Makefile
> > src/mesa/main/tests/Makefile
> > src/util/Makefile
> > src/util/tests/hash_table/Makefile
> > +   src/vulkan/util/Makefile
> > src/vulkan/wsi/Makefile])
> 
> These two really don't need to be separate Makefiles.
> 
> Can you try, as a follow-on, to combine them as a unified
> src/vulkan/Makefile.am?

Okay, I'll do that

> 
> >
> >  AC_OUTPUT
> > diff --git a/src/Makefile.am b/src/Makefile.am
> > index 12e5dcdb12..90f95b2265 100644
> > --- a/src/Makefile.am
> > +++ b/src/Makefile.am
> > @@ -117,6 +117,7 @@ SUBDIRS += intel/tools
> >  endif
> >
> >  if HAVE_VULKAN_COMMON
> > +SUBDIRS += vulkan/util
> >  SUBDIRS += vulkan/wsi
> 
> List both on the same line.

done

> 
> >  endif
> >  EXTRA_DIST += vulkan/registry/vk.xml
> > diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
> > index 4197b0e77c..54bf0f5de1 100644
> > --- a/src/intel/vulkan/Makefile.am
> > +++ b/src/intel/vulkan/Makefile.am
> > @@ -49,6 +49,7 @@ AM_CPPFLAGS = \
> > -I$(top_builddir)/src \
> > -I$(top_srcdir)/src \
> > -I$(top_srcdir)/src/vulkan/wsi \
> > +   -I$(top_builddir)/src/vulkan/util \
> > -I$(top_builddir)/src/compiler \
> > -I$(top_srcdir)/src/compiler \
> > -I$(top_builddir)/src/compiler/nir \
> > @@ -125,6 +126,7 @@ libvulkan_common_la_SOURCES = $(VULKAN_SOURCES)
> >
> >  VULKAN_LIB_DEPS += \
> > libvulkan_common.la \
> > +   $(top_builddir)/src/vulkan/util/libvulkan_util.la \
> > $(top_builddir)/src/vulkan/wsi/libvulkan_wsi.la \
> > $(top_builddir)/src/mesa/drivers/dri/i965/libi965_compiler.la \
> > $(top_builddir)/src/compiler/nir/libnir.la \
> > diff --git a/src/intel/vulkan/anv_util.c b/src/intel/vulkan/anv_util.c
> > index 6d75187065..ec5c9486d8 100644
> > --- a/src/intel/vulkan/anv_util.c
> > +++ b/src/intel/vulkan/anv_util.c
> > @@ -29,6 +29,7 @@
> >  #include 
> >
> >  #include "anv_private.h"
> > +#include "vk_enum_to_str.h"
> >
> >  /** Log an error message.  */
> >  void anv_printflike(1, 2)
> > @@ -69,40 +70,7 @@ __vk_errorf(VkResult error, const char *file, int line, 
> > const char *format, ...)
> > va_list ap;
> > char buffer[256];
> >
> > -#define ERROR_CASE(error) case error: error_str = #error; break;
> > -
> > -   const char *error_str;
> > -   switch ((int32_t)error) {
> > -
> > -   /* Core errors */
> > -   ERROR_CASE(VK_ERROR_OUT_OF_HOST_MEMORY)
> > -   ERROR_CASE(VK_ERROR_OUT_OF_DEVICE_MEMORY)
> > -   ERROR_CASE(VK_ERROR_INITIALIZATION_FAILED)
> > -   ERROR_CASE(VK_ERROR_DEVICE_LOST)
> > -   ERROR_CASE(VK_ERROR_MEMORY_MAP_FAILED)
> > -   ERROR_CASE(VK_ERROR_LAYER_NOT_PRESENT)
> > -   ERROR_CASE(VK_ERROR_EXTENSION_NOT_PRESENT)
> > -   ERROR_CASE(VK_ERROR_FEATURE_NOT_PRESENT)
> > -   ERROR_CASE(VK_ERROR_INCOMPATIBLE_DRIVER)
> > -   ERROR_CASE(VK_ERROR_TOO_MANY_OBJECTS)
> > -   ERROR_CASE(VK_ERROR_FORMAT_NOT_SUPPORTED)
> > -   ERROR_CASE(VK_ERROR_FRAGMENTED_POOL)
> > -
> > -   /* Extension errors */
> > -   ERROR_CASE(VK_ERROR_SURFACE_LOST_KHR)
> > -   ERROR_CASE(VK_ERROR_NATIVE_WINDOW_IN_USE_KHR)
> > -   ERROR_CASE(VK_ERROR_OUT_OF_DATE_KHR)
> > -   ERROR_CASE(VK_ERROR_INCOMPATIBLE_DISPLAY_KHR)
> > -   ERROR_CASE(VK_ERROR_VALIDATION_FAILED_EXT)
> > -   ERROR_CASE(VK_ERROR_INVALID_SHADER_NV)
> > -   ERROR_CASE(VK_ERROR_OUT_OF_POOL_MEMORY_KHR)
> > -
> > -   default:
> > -  assert(!"Unknown error");
> > -  error_str = "unknown error";
> > -   }
> > -
> > -#undef ERROR_CASE
> > +   const char *error_str = vk_Result_to_str(error);
> >
> > if (format) {
> >va_start(ap, format);
> > diff --git a/src/vulkan/util/.gitignore 

Re: [Mesa-dev] [PATCH] vulkan/util: Add generator for enum_to_str functions

2017-02-17 Thread Matt Turner
On Fri, Feb 17, 2017 at 10:49 AM, Dylan Baker  wrote:
> This adds a python generator to produce enum_to_str functions for
> Vulkan from the vk.xml API description. It supports extensions as well
> as core API features, and the generator works with both python2 and
> python3.
>
> CC: Jason Ekstrand 
> Signed-off-by: Dylan Baker 
> ---
>  configure.ac   |   1 +
>  src/Makefile.am|   1 +
>  src/intel/vulkan/Makefile.am   |   2 +
>  src/intel/vulkan/anv_util.c|  36 +---
>  src/vulkan/util/.gitignore |   1 +
>  src/vulkan/util/Makefile.am|  22 +
>  src/vulkan/util/gen_enum_to_str.py | 172 
> +
>  7 files changed, 201 insertions(+), 34 deletions(-)
>  create mode 100644 src/vulkan/util/.gitignore
>  create mode 100644 src/vulkan/util/Makefile.am
>  create mode 100644 src/vulkan/util/gen_enum_to_str.py
>
> diff --git a/configure.ac b/configure.ac
> index 7e4544f5bf..c83a5234da 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2691,6 +2691,7 @@ AC_CONFIG_FILES([Makefile
> src/mesa/main/tests/Makefile
> src/util/Makefile
> src/util/tests/hash_table/Makefile
> +   src/vulkan/util/Makefile
> src/vulkan/wsi/Makefile])

These two really don't need to be separate Makefiles.

Can you try, as a follow-on, to combine them as a unified
src/vulkan/Makefile.am?

>
>  AC_OUTPUT
> diff --git a/src/Makefile.am b/src/Makefile.am
> index 12e5dcdb12..90f95b2265 100644
> --- a/src/Makefile.am
> +++ b/src/Makefile.am
> @@ -117,6 +117,7 @@ SUBDIRS += intel/tools
>  endif
>
>  if HAVE_VULKAN_COMMON
> +SUBDIRS += vulkan/util
>  SUBDIRS += vulkan/wsi

List both on the same line.

>  endif
>  EXTRA_DIST += vulkan/registry/vk.xml
> diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
> index 4197b0e77c..54bf0f5de1 100644
> --- a/src/intel/vulkan/Makefile.am
> +++ b/src/intel/vulkan/Makefile.am
> @@ -49,6 +49,7 @@ AM_CPPFLAGS = \
> -I$(top_builddir)/src \
> -I$(top_srcdir)/src \
> -I$(top_srcdir)/src/vulkan/wsi \
> +   -I$(top_builddir)/src/vulkan/util \
> -I$(top_builddir)/src/compiler \
> -I$(top_srcdir)/src/compiler \
> -I$(top_builddir)/src/compiler/nir \
> @@ -125,6 +126,7 @@ libvulkan_common_la_SOURCES = $(VULKAN_SOURCES)
>
>  VULKAN_LIB_DEPS += \
> libvulkan_common.la \
> +   $(top_builddir)/src/vulkan/util/libvulkan_util.la \
> $(top_builddir)/src/vulkan/wsi/libvulkan_wsi.la \
> $(top_builddir)/src/mesa/drivers/dri/i965/libi965_compiler.la \
> $(top_builddir)/src/compiler/nir/libnir.la \
> diff --git a/src/intel/vulkan/anv_util.c b/src/intel/vulkan/anv_util.c
> index 6d75187065..ec5c9486d8 100644
> --- a/src/intel/vulkan/anv_util.c
> +++ b/src/intel/vulkan/anv_util.c
> @@ -29,6 +29,7 @@
>  #include 
>
>  #include "anv_private.h"
> +#include "vk_enum_to_str.h"
>
>  /** Log an error message.  */
>  void anv_printflike(1, 2)
> @@ -69,40 +70,7 @@ __vk_errorf(VkResult error, const char *file, int line, 
> const char *format, ...)
> va_list ap;
> char buffer[256];
>
> -#define ERROR_CASE(error) case error: error_str = #error; break;
> -
> -   const char *error_str;
> -   switch ((int32_t)error) {
> -
> -   /* Core errors */
> -   ERROR_CASE(VK_ERROR_OUT_OF_HOST_MEMORY)
> -   ERROR_CASE(VK_ERROR_OUT_OF_DEVICE_MEMORY)
> -   ERROR_CASE(VK_ERROR_INITIALIZATION_FAILED)
> -   ERROR_CASE(VK_ERROR_DEVICE_LOST)
> -   ERROR_CASE(VK_ERROR_MEMORY_MAP_FAILED)
> -   ERROR_CASE(VK_ERROR_LAYER_NOT_PRESENT)
> -   ERROR_CASE(VK_ERROR_EXTENSION_NOT_PRESENT)
> -   ERROR_CASE(VK_ERROR_FEATURE_NOT_PRESENT)
> -   ERROR_CASE(VK_ERROR_INCOMPATIBLE_DRIVER)
> -   ERROR_CASE(VK_ERROR_TOO_MANY_OBJECTS)
> -   ERROR_CASE(VK_ERROR_FORMAT_NOT_SUPPORTED)
> -   ERROR_CASE(VK_ERROR_FRAGMENTED_POOL)
> -
> -   /* Extension errors */
> -   ERROR_CASE(VK_ERROR_SURFACE_LOST_KHR)
> -   ERROR_CASE(VK_ERROR_NATIVE_WINDOW_IN_USE_KHR)
> -   ERROR_CASE(VK_ERROR_OUT_OF_DATE_KHR)
> -   ERROR_CASE(VK_ERROR_INCOMPATIBLE_DISPLAY_KHR)
> -   ERROR_CASE(VK_ERROR_VALIDATION_FAILED_EXT)
> -   ERROR_CASE(VK_ERROR_INVALID_SHADER_NV)
> -   ERROR_CASE(VK_ERROR_OUT_OF_POOL_MEMORY_KHR)
> -
> -   default:
> -  assert(!"Unknown error");
> -  error_str = "unknown error";
> -   }
> -
> -#undef ERROR_CASE
> +   const char *error_str = vk_Result_to_str(error);
>
> if (format) {
>va_start(ap, format);
> diff --git a/src/vulkan/util/.gitignore b/src/vulkan/util/.gitignore
> new file mode 100644
> index 00..5c79217982
> --- /dev/null
> +++ b/src/vulkan/util/.gitignore
> @@ -0,0 +1 @@
> +vk_enum_to_str.*
> diff --git a/src/vulkan/util/Makefile.am b/src/vulkan/util/Makefile.am
> new file mode 100644
> index 00..ced83e8873
> --- /dev/null
> +++ b/src/vulkan/util/Makefile.am
> @@ -0,0 +1,22 @@
> 
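For context on what is being reviewed, the generator's job can be sketched
roughly as follows. This is not the code from the patch: the inline XML is a
heavily simplified stand-in for vk.xml, the helper names `parse_enums` and
`render_c` are hypothetical, and plain string formatting replaces the patch's
templates. Only the `vk_Result_to_str` naming is taken from the anv_util.c
hunk above; the text returned for unknown values is an assumption.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the <enums> groups found in vk.xml.
SAMPLE_XML = """
<registry>
  <enums name="VkResult" type="enum">
    <enum value="0" name="VK_SUCCESS"/>
    <enum value="-1" name="VK_ERROR_OUT_OF_HOST_MEMORY"/>
    <enum value="-4" name="VK_ERROR_DEVICE_LOST"/>
  </enums>
</registry>
"""


def parse_enums(xml_text):
    """Map each enum type name to the list of its value names."""
    root = ET.fromstring(xml_text)
    enums = {}
    for group in root.findall("./enums[@type='enum']"):
        enums[group.get("name")] = [e.get("name") for e in group.findall("enum")]
    return enums


def render_c(enums):
    """Emit one C function per enum type that switches over its values."""
    lines = []
    for type_name, values in sorted(enums.items()):
        func = "vk_{}_to_str".format(type_name[2:])  # VkResult -> vk_Result_to_str
        lines.append("const char *")
        lines.append("{}({} input)".format(func, type_name))
        lines.append("{")
        lines.append("    switch (input) {")
        for value in values:
            lines.append('    case {0}: return "{0}";'.format(value))
        lines.append('    default: return "Unknown {} value";'.format(type_name))
        lines.append("    }")
        lines.append("}")
    return "\n".join(lines)


print(render_c(parse_enums(SAMPLE_XML)))
```

Generating the switch from the registry is what lets the hand-maintained
ERROR_CASE list in anv_util.c go away.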

[Mesa-dev] [PATCH 16/16] i965/blorp: Drop unnecessary flushes after clear/resolve

2017-02-17 Thread Topi Pohjolainen
Now that there is proper end-of-pipe synchronization, the additional
delay that was needed before has become redundant. On SKL this helps:

OglDrvRes: 1.65304% +/- 0.0816077%

by making blorp blits/copies more competitive.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 39 +++
 1 file changed, 17 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index bc9f964..3c8a7bd 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -992,18 +992,24 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
   x0, y0, x1, y1,
   clear_color, color_write_disable);
       blorp_batch_finish(&batch);
-   }
 
-   /*
-* Ivybrigde PRM Vol 2, Part 1, "11.7 MCS Buffer for Render Target(s)":
-*
-*  Any transition from any value in {Clear, Render, Resolve} to a
-*  different value in {Clear, Render, Resolve} requires end of pipe
-*  synchronization.
-*/
-   brw_emit_pipe_control_flush(brw,
-   PIPE_CONTROL_RENDER_TARGET_FLUSH |
-   PIPE_CONTROL_CS_STALL);
+  const bool is_x_tiled = irb->mt->tiling == I915_TILING_X;
+  const bool is_compressed_msaa = irb->mt->mcs_buf &&
+  irb->mt->num_samples > 1;
+
+      /* HACK: Work around an unknown bug. This used to be unconditional for
+       * all clears even though it was originally meant only for fast clears.
+       * It seems it has been hiding bug(s). It looks like it is really
+       * brw_try_draw_prims() that is missing this. Unfortunately there is
+       * no way of detecting non-fast clears with the current bookkeeping, and
+       * doing this flush unconditionally there hurts performance in many
+       * cases.
+       */
+  if (brw->gen >= 9 && (is_x_tiled || is_compressed_msaa))
+ brw_emit_pipe_control_flush(brw,
+ PIPE_CONTROL_RENDER_TARGET_FLUSH |
+ PIPE_CONTROL_CS_STALL);
+   }
 
return true;
 }
@@ -1082,17 +1088,6 @@ brw_blorp_resolve_color(struct brw_context *brw, struct 
intel_mipmap_tree *mt,
  brw_blorp_to_isl_format(brw, format, true),
  resolve_op);
    blorp_batch_finish(&batch);
-
-   /*
-* Ivybrigde PRM Vol 2, Part 1, "11.7 MCS Buffer for Render Target(s)":
-*
-*  Any transition from any value in {Clear, Render, Resolve} to a
-*  different value in {Clear, Render, Resolve} requires end of pipe
-*  synchronization.
-*/
-   brw_emit_pipe_control_flush(brw,
-   PIPE_CONTROL_RENDER_TARGET_FLUSH |
-   PIPE_CONTROL_CS_STALL);
 }
 
 static void
-- 
2.5.5


[Mesa-dev] [PATCH 15/16] i965: Check if fast color clear state transition needs sync

2017-02-17 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
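The decision made by intel_miptree_fast_clear_needs_sync can be read as a
small table. A Python model of it follows; the function signature and the
string state names are hypothetical, and only the value of the SYNC flag
(1 << 1) is visible in this patch, so the other two constants are assumptions.

```python
WRITE_CACHE_NO_FLUSH = 0        # assumed value
WRITE_CACHE_FLUSH = 1 << 0      # assumed value
WRITE_CACHE_SYNC = 1 << 1       # matches INTEL_WRITE_CACHE_SYNC in the hunk


def fast_clear_needs_sync(has_mcs, num_samples, in_render_cache, state):
    """Return the flush type needed before rendering to a fast-clear surface."""
    # Only single-sampled surfaces with an MCS buffer are considered.
    if not has_mcs or num_samples > 1:
        return WRITE_CACHE_NO_FLUSH

    # Nothing is pending unless the buffer is still in the render cache.
    if not in_render_cache:
        return WRITE_CACHE_NO_FLUSH

    if state == "CLEAR":
        return WRITE_CACHE_FLUSH    # flushing alone after a fast clear
    if state == "RESOLVED":
        return WRITE_CACHE_SYNC     # full end-of-pipe sync after a resolve
    return WRITE_CACHE_NO_FLUSH


# brw_prepare_framebuffer ORs the per-renderbuffer results together.
flush = WRITE_CACHE_NO_FLUSH
flush |= fast_clear_needs_sync(True, 1, True, "CLEAR")
flush |= fast_clear_needs_sync(True, 1, True, "RESOLVED")
print(flush)
```

Because the results are bit flags, a framebuffer with one freshly cleared and
one freshly resolved attachment ends up requesting both a flush and a sync.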
 src/mesa/drivers/dri/i965/brw_context.c | 49 +
 1 file changed, 49 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 52f8c17..28ed8d1 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -228,6 +228,52 @@ enum intel_write_cache_flush_type {
INTEL_WRITE_CACHE_SYNC = 1 << 1,
 };
 
+/*
+ * Check if transition to new fast clear color state requires synchronization:
+ *
+ * Ivybridge PRM Vol 2, Part 1, "11.7 MCS Buffer for Render Target(s)":
+ *
+ *  Any transition from any value in {Clear, Render, Resolve} to a
+ *  different value in {Clear, Render, Resolve} requires end of pipe
+ *  synchronization.
+ *
+ * On the other hand:
+ *
+ * Ivybridge PRM Vol 2, Part 1, "11.8 Render Target Fast Clear":
+ *
+ *  After Render target fast clear, pipe-control with color cache write-flush
+ *  must be issued before sending any DRAW commands on that render target
+ *
+ * Empirically, flushing alone appears to be sufficient after a clear, while a
+ * full sync significantly decreases performance in some benchmarks.
+ */
+static enum intel_write_cache_flush_type
+intel_miptree_fast_clear_needs_sync(const struct brw_context *brw,
+const struct intel_mipmap_tree *mt,
+unsigned level, unsigned layer)
+{
+   if (!mt->mcs_buf || mt->num_samples > 1)
+  return INTEL_WRITE_CACHE_NO_FLUSH;
+
+   /* Presence in the render cache combined with fast clear state of CLEAR
+* or RESOLVED means that there is fast clear op pending flush and sync.
+*/
+   if (_mesa_set_search(brw->render_cache, mt->bo) == NULL)
+  return INTEL_WRITE_CACHE_NO_FLUSH;
+
+   const enum intel_fast_clear_state curr_state =
+  intel_miptree_get_fast_clear_state(mt, level, layer);
+
+   switch (curr_state) {
+   case INTEL_FAST_CLEAR_STATE_CLEAR:
+  return INTEL_WRITE_CACHE_FLUSH;
+   case INTEL_FAST_CLEAR_STATE_RESOLVED:
+  return INTEL_WRITE_CACHE_SYNC;
+   default:
+  return INTEL_WRITE_CACHE_NO_FLUSH;
+   }
+}
+
 static enum intel_write_cache_flush_type 
 brw_prepare_textures(struct gl_context *ctx)
 {
@@ -381,6 +427,9 @@ brw_prepare_framebuffer(struct gl_context *ctx)
  irb->mt_layer, irb->layer_count, 0))
 flush |= INTEL_WRITE_CACHE_SYNC;
   }
+
+  flush |= intel_miptree_fast_clear_needs_sync(
+  brw, irb->mt, irb->mt_level, irb->mt_layer);
}
 
return flush;
-- 
2.5.5


[Mesa-dev] [PATCH 12/16] i965/blorp/blit: Refactor hiz/ccs prep for blits

2017-02-17 Thread Topi Pohjolainen
HiZ/CCS preparation is explicitly considered for blits from now on.
Currently it is part of the common surface preparation which is used by
all blorp ops. However, color/hiz/depth/stencil clears and resolves
use hiz/ccs without the tweaks.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 95 ---
 1 file changed, 67 insertions(+), 28 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 3247bd4..e3e4402 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -196,29 +196,9 @@ blorp_surf_for_miptree(struct brw_context *brw,
   if (surf->aux_usage == ISL_AUX_USAGE_HIZ) {
  /* If we're not going to use it as a depth buffer, resolve HiZ */
  if (!(safe_aux_usage & (1 << ISL_AUX_USAGE_HIZ))) {
-for (unsigned i = 0; i < num_layers; i++) {
-   intel_miptree_slice_resolve_depth(brw, mt, *level,
- start_layer + i);
-
-   /* If we're rendering to it then we'll need a HiZ resolve once
-* we're done before we can use it with HiZ again.
-*/
-   if (is_render_target)
-  intel_miptree_slice_set_needs_hiz_resolve(mt, *level,
-start_layer + i);
-}
 surf->aux_usage = ISL_AUX_USAGE_NONE;
  }
   } else if (!(safe_aux_usage & (1 << surf->aux_usage))) {
- uint32_t flags = 0;
- if (safe_aux_usage & (1 << ISL_AUX_USAGE_CCS_E))
-flags |= INTEL_MIPTREE_IGNORE_CCS_E;
-
- intel_miptree_resolve_color(brw, mt,
- *level, start_layer, num_layers, flags);
-
- assert(!intel_miptree_has_color_unresolved(mt, *level, 1,
-start_layer, num_layers));
  surf->aux_usage = ISL_AUX_USAGE_NONE;
   }
}
@@ -335,11 +315,64 @@ physical_to_logical_layer(struct intel_mipmap_tree *mt,
}
 }
 
+/* If we're not going to use it as a depth buffer, resolve HiZ */
+static bool
+prepare_hiz_for_blit(struct brw_context *brw,
+ struct intel_mipmap_tree *mt,
+ unsigned level, unsigned layer,
+ bool is_render_target)
+{
+   if (!mt->hiz_buf)
+  return false;
+
+   const bool needs_sync =
+  intel_miptree_slice_resolve_depth(brw, mt, level, layer);
+
+   /* If we're rendering to it then we'll need a HiZ resolve once
+* we're done before we can use it with HiZ again.
+*/
+   if (is_render_target)
+  intel_miptree_slice_set_needs_hiz_resolve(mt, level, layer);
+
+   return needs_sync;
+}
+
+static bool
+prepare_ccs_for_blit(struct brw_context *brw,
+ struct intel_mipmap_tree *mt,
+ unsigned level, unsigned layer,
+ uint32_t safe_aux_usage)
+{
+   if (!mt->mcs_buf || mt->num_samples > 1)
+  return false;
+
+   if (intel_miptree_is_lossless_compressed(brw, mt) &&
+   safe_aux_usage & (1 << ISL_AUX_USAGE_CCS_E)) {
+  assert(brw->gen >= 9);
+  return false;
+   }
+
+   const bool needs_sync =
+  intel_miptree_resolve_color(brw, mt, level, layer, 1, 0);
+
+   assert(!intel_miptree_has_color_unresolved(mt, level, 1, layer, 1));
+
+   return needs_sync;
+}
+
 static void
 prepare_blit(struct brw_context *brw,
- const struct intel_mipmap_tree *src_mt,
- const struct intel_mipmap_tree *dst_mt)
+ struct intel_mipmap_tree *src_mt,
+ unsigned src_level, unsigned src_layer, uint32_t src_usage_flags,
+ struct intel_mipmap_tree *dst_mt,
+ unsigned dst_level, unsigned dst_layer, uint32_t dst_usage_flags)
 {
+   prepare_hiz_for_blit(brw, src_mt, src_level, src_layer, false);
+   prepare_hiz_for_blit(brw, dst_mt, dst_level, dst_layer, true);
+
+   prepare_ccs_for_blit(brw, src_mt, src_level, src_layer, src_usage_flags);
+   prepare_ccs_for_blit(brw, dst_mt, dst_level, dst_layer, dst_usage_flags);
+
/* Flush the sampler and render caches.  We definitely need to flush the
 * sampler cache so that we get updated contents from the render cache for
 * the glBlitFramebuffer() source.  Also, we are sometimes warned in the
@@ -422,6 +455,10 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
  (1 << ISL_AUX_USAGE_CCS_D);
}
 
+   prepare_blit(brw,
+src_mt, src_level, src_layer, src_usage_flags,
+dst_mt, dst_level, dst_layer, dst_usage_flags);
+
struct isl_surf tmp_surfs[4];
struct blorp_surf src_surf, dst_surf;
    blorp_surf_for_miptree(brw, &src_surf, src_mt, false, src_usage_flags,
@@ -467,15 +504,17 @@ brw_blorp_copy_miptrees(struct brw_context *brw,
dst_mt->num_samples, _mesa_get_format_name(dst_mt->format), 

[Mesa-dev] [PATCH 11/16] i965/blorp: Do more fine grained flushing/syncing

2017-02-17 Thread Topi Pohjolainen
Color clears and resolves now consider end-of-pipe sync in the same way
as the normal render path.

Blits remain functionally the same as before. The same goes for
hiz/depth/stencil clears: they do not have src or dst enabled, so the
current logic was already a no-op for them.

Later patches will enable blorp blits for texture uploads, which
require excess flushing to be omitted in order to perform properly.
Now that clears and blits make the decision independently, that also
becomes easier.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c   | 48 +
 src/mesa/drivers/dri/i965/genX_blorp_exec.c | 11 ---
 2 files changed, 48 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 52f85ff..3247bd4 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -27,6 +27,7 @@
 #include "main/fbobject.h"
 #include "main/renderbuffer.h"
 #include "main/glformats.h"
+#include "util/set.h"
 
 #include "brw_blorp.h"
 #include "brw_context.h"
@@ -334,6 +335,25 @@ physical_to_logical_layer(struct intel_mipmap_tree *mt,
}
 }
 
+static void
+prepare_blit(struct brw_context *brw,
+ const struct intel_mipmap_tree *src_mt,
+ const struct intel_mipmap_tree *dst_mt)
+{
+   /* Flush the sampler and render caches.  We definitely need to flush the
+* sampler cache so that we get updated contents from the render cache for
+* the glBlitFramebuffer() source.  Also, we are sometimes warned in the
+* docs to flush the cache between reinterpretations of the same surface
+* data with different formats, which blorp does for stencil and depth
+* data.
+*/
+   brw_render_cache_set_check_flush(brw, src_mt->bo);
+   brw_render_cache_set_check_flush(brw, dst_mt->bo);
+
+   brw_emit_pipe_control_flush(brw,
+   PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE);
+}
+
 /**
  * Note: if the src (or dst) is a 2D multisample array texture on Gen7+ using
  * INTEL_MSAA_LAYOUT_UMS or INTEL_MSAA_LAYOUT_CMS, src_layer (dst_layer) is
@@ -859,6 +879,16 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
   }
}
 
+   const bool is_blorp_pending =
+  irb->mt->mcs_buf && irb->mt->num_samples <= 1 &&
+  _mesa_set_search(brw->render_cache, irb->mt->bo) != NULL;
+
+   if (is_blorp_pending &&
+   fast_clear_state == INTEL_FAST_CLEAR_STATE_RESOLVED) {
+  brw_end_of_pipe_sync(brw);
+  brw_render_cache_set_clear(brw);
+   }
+
const unsigned num_layers = fb->MaxNumLayers ? irb->layer_count : 1;
 
/* We can't setup the blorp_surf until we've allocated the MCS above */
@@ -897,6 +927,14 @@ do_single_blorp_clear(struct brw_context *brw, struct 
gl_framebuffer *fb,
   union isl_color_value clear_color;
   memcpy(clear_color.f32, ctx->Color.ClearColor.f, sizeof(float) * 4);
 
+  if (is_blorp_pending &&
+  fast_clear_state == INTEL_FAST_CLEAR_STATE_CLEAR) {
+ brw_emit_pipe_control_flush(brw,
+ PIPE_CONTROL_RENDER_TARGET_FLUSH |
+ PIPE_CONTROL_CS_STALL);
+ brw_render_cache_set_clear(brw);
+  }
+
   struct blorp_batch batch;
       blorp_batch_init(&brw->blorp, &batch, brw, 0);
       blorp_clear(&batch, &surf,
@@ -960,6 +998,16 @@ brw_blorp_resolve_color(struct brw_context *brw, struct 
intel_mipmap_tree *mt,
 
const mesa_format format = _mesa_get_srgb_format_linear(mt->format);
 
+   /* Resolve requires all changes in both auxiliary and color buffer written
+* out.
+*/
+   if (_mesa_set_search(brw->render_cache, mt->bo)) {
+  brw_emit_pipe_control_flush(brw,
+  PIPE_CONTROL_RENDER_TARGET_FLUSH |
+  PIPE_CONTROL_CS_STALL);
+  brw_render_cache_set_clear(brw);
+   }
+
struct isl_surf isl_tmp[2];
struct blorp_surf surf;
    blorp_surf_for_miptree(brw, &surf, mt, true,
diff --git a/src/mesa/drivers/dri/i965/genX_blorp_exec.c 
b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
index 37b29cd..de8bc4c 100644
--- a/src/mesa/drivers/dri/i965/genX_blorp_exec.c
+++ b/src/mesa/drivers/dri/i965/genX_blorp_exec.c
@@ -173,17 +173,6 @@ genX(blorp_exec)(struct blorp_batch *batch,
const uint32_t estimated_max_batch_usage = GEN_GEN >= 8 ? 1800 : 1500;
bool check_aperture_failed_once = false;
 
-   /* Flush the sampler and render caches.  We definitely need to flush the
-* sampler cache so that we get updated contents from the render cache for
-* the glBlitFramebuffer() source.  Also, we are sometimes warned in the
-* docs to flush the cache between reinterpretations of the same surface
-* data with different formats, which blorp does for stencil and depth
-* data.
-*/
-   if (params->src.enabled)
-  brw_render_cache_set_check_flush(brw, params->src.addr.buffer);
-   

[Mesa-dev] [PATCH 09/16] i965/miptree: Add color resolve end-of-pipe-sync before sharing

2017-02-17 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
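The pattern here is to collect "a resolve was emitted" from every slice and
then issue a single end-of-pipe sync. A minimal Python model, assuming (as the
diff suggests) that the resolve helper returns true when it actually emitted
work; the function and parameter names are hypothetical.

```python
def make_shareable(slices, resolve_color, end_of_pipe_sync):
    """Resolve every slice, then sync once if any resolve did real work."""
    need_sync = False
    for level, layer in slices:
        # "|=" always evaluates the call, so no slice is skipped once
        # need_sync has become True (unlike "need_sync or resolve(...)").
        need_sync |= resolve_color(level, layer)

    if need_sync:
        end_of_pipe_sync()   # stand-in for sync + render cache set clear
    return need_sync


syncs = []
pending = {(1, 0)}           # only this slice still needs a resolve
result = make_shareable([(0, 0), (1, 0), (2, 0)],
                        lambda level, layer: (level, layer) in pending,
                        lambda: syncs.append("sync"))
print(result, syncs)
```

Deferring the sync to the end keeps it to one per call regardless of how many
slices needed resolving.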
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 717a320..b0148d2 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2332,6 +2332,8 @@ void
 intel_miptree_make_shareable(struct brw_context *brw,
  struct intel_mipmap_tree *mt)
 {
+   bool need_sync = false;
+
/* MCS buffers are also used for multisample buffers, but we can't resolve
 * away a multisample MCS buffer because it's an integral part of how the
 * pixel data is stored.  Fortunately this code path should never be
@@ -2340,7 +2342,7 @@ intel_miptree_make_shareable(struct brw_context *brw,
assert(mt->msaa_layout == INTEL_MSAA_LAYOUT_NONE || mt->num_samples <= 1);
 
if (mt->mcs_buf) {
-  intel_miptree_all_slices_resolve_color(brw, mt, 0);
+  need_sync |= intel_miptree_all_slices_resolve_color(brw, mt, 0);
   mt->aux_disable |= (INTEL_AUX_DISABLE_CCS | INTEL_AUX_DISABLE_MCS);
   drm_intel_bo_unreference(mt->mcs_buf->bo);
   free(mt->mcs_buf);
@@ -2369,6 +2371,11 @@ intel_miptree_make_shareable(struct brw_context *brw,
*/
   exec_list_make_empty(&mt->hiz_map);
}
+
+   if (need_sync) {
+  brw_end_of_pipe_sync(brw);
+  brw_render_cache_set_clear(brw);
+   }
 }
 
 
-- 
2.5.5



[Mesa-dev] [PATCH 08/16] i965/dri2: Add end-of-pipe-sync after color resolves

2017-02-17 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_context.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index bb84102..75d4920 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1476,7 +1476,10 @@ intel_resolve_for_dri2_flush(struct brw_context *brw,
   if (rb->mt->num_samples <= 1) {
  assert(rb->mt_layer == 0 && rb->mt_level == 0 &&
 rb->layer_count == 1);
- intel_miptree_resolve_color(brw, rb->mt, 0, 0, 1, 0);
+ if (intel_miptree_resolve_color(brw, rb->mt, 0, 0, 1, 0)) {
+brw_end_of_pipe_sync(brw);
+brw_render_cache_set_clear(brw);
+ }
   } else {
  intel_renderbuffer_downsample(brw, rb);
   }
-- 
2.5.5



[Mesa-dev] [PATCH 14/16] i965: Consider surface resolves and sync after blorp ops

2017-02-17 Thread Topi Pohjolainen
Note that on the 3D render path a gen >= 6 check is needed even
though the BLORP driver state bit is considered (blorp itself is
effective only on gen6+). This is because all driver state flags are
set unconditionally at initialization. Compute in turn does not
need the check as it is not available on gen < 6.
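
As a rough illustration of the check described above, here is a minimal
standalone sketch. It is not driver code: struct mock_ctx, MOCK_NEW_BLORP,
enum prep_action and choose_prep() are hypothetical names made up for this
example. It only models why the generation test matters when every driver
state bit starts out set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the BRW_NEW_BLORP driver state bit. */
#define MOCK_NEW_BLORP (1u << 0)

/* Hypothetical, heavily reduced context: just the fields the check reads. */
struct mock_ctx {
   unsigned gen;              /* hardware generation */
   unsigned new_state;        /* core state dirty bits */
   unsigned new_driver_state; /* driver state dirty bits */
};

enum prep_action {
   PREP_NONE,
   PREP_FULL_UPDATE,      /* core state is dirty: do the full state update */
   PREP_RESOLVE_AND_SYNC, /* only a blorp op happened: resolve and sync */
};

/* Pick the preparation step. The compute path skips the generation test
 * because compute does not exist on gen < 6 in the first place.
 */
enum prep_action
choose_prep(const struct mock_ctx *ctx, bool is_compute)
{
   assert(ctx);

   if (ctx->new_state)
      return PREP_FULL_UPDATE;

   if ((is_compute || ctx->gen >= 6) &&
       (ctx->new_driver_state & MOCK_NEW_BLORP))
      return PREP_RESOLVE_AND_SYNC;

   return PREP_NONE;
}
```

With the blorp bit set at initialization, a gen5 context with clean core
state yields PREP_NONE here, whereas dropping the generation test would send
it down the resolve path.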

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_compute.c  | 2 ++
 src/mesa/drivers/dri/i965/brw_context.c  | 2 +-
 src/mesa/drivers/dri/i965/brw_context.h  | 2 ++
 src/mesa/drivers/dri/i965/brw_draw.c | 2 ++
 src/mesa/drivers/dri/i965/intel_pixel.c  | 4 
 src/mesa/drivers/dri/i965/intel_pixel_read.c | 2 ++
 6 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compute.c b/src/mesa/drivers/dri/i965/brw_compute.c
index 16b5df7..081d6e6 100644
--- a/src/mesa/drivers/dri/i965/brw_compute.c
+++ b/src/mesa/drivers/dri/i965/brw_compute.c
@@ -185,6 +185,8 @@ brw_dispatch_compute_common(struct gl_context *ctx)
 
if (ctx->NewState)
   _mesa_update_state(ctx);
+   else if (ctx->NewDriverState & BRW_NEW_BLORP)
+  intel_resolve_and_sync_surfaces(ctx);
 
brw_validate_textures(brw);
 
diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 75d4920..52f8c17 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -386,7 +386,7 @@ brw_prepare_framebuffer(struct gl_context *ctx)
return flush;
 }
 
-static bool
+bool
 intel_resolve_and_sync_surfaces(struct gl_context *ctx)
 {
struct brw_context *brw = brw_context(ctx);
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 34aea56..f4305c9 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1226,6 +1226,8 @@ void intel_prepare_render(struct brw_context *brw);
 void intel_resolve_for_dri2_flush(struct brw_context *brw,
   __DRIdrawable *drawable);
 
+bool intel_resolve_and_sync_surfaces(struct gl_context *ctx);
+
 GLboolean brwCreateContext(gl_api api,
  const struct gl_config *mesaVis,
  __DRIcontext *driContextPriv,
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 564739c..6ca6a7a 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -417,6 +417,8 @@ brw_try_draw_prims(struct gl_context *ctx,
 
if (ctx->NewState)
   _mesa_update_state(ctx);
+   else if (brw->gen >= 6 && ctx->NewDriverState & BRW_NEW_BLORP)
+  intel_resolve_and_sync_surfaces(ctx);
 
/* We have to validate the textures *before* checking for fallbacks;
 * otherwise, the software fallback won't be able to rely on the
diff --git a/src/mesa/drivers/dri/i965/intel_pixel.c b/src/mesa/drivers/dri/i965/intel_pixel.c
index d4f86fd..a5b366a 100644
--- a/src/mesa/drivers/dri/i965/intel_pixel.c
+++ b/src/mesa/drivers/dri/i965/intel_pixel.c
@@ -55,8 +55,12 @@ effective_func(GLenum func, bool src_alpha_is_one)
 bool
 intel_check_blit_fragment_ops(struct gl_context * ctx, bool src_alpha_is_one)
 {
+   struct brw_context *brw = brw_context(ctx);
+
if (ctx->NewState)
   _mesa_update_state(ctx);
+   else if (brw->gen >= 6 && ctx->NewDriverState & BRW_NEW_BLORP)
+  intel_resolve_and_sync_surfaces(ctx);
 
if (ctx->FragmentProgram._Enabled) {
   DBG("fallback due to fragment program\n");
diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c b/src/mesa/drivers/dri/i965/intel_pixel_read.c
index 8e5e8d9..9e99a1d 100644
--- a/src/mesa/drivers/dri/i965/intel_pixel_read.c
+++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c
@@ -263,6 +263,8 @@ intelReadPixels(struct gl_context * ctx,
 
if (ctx->NewState)
   _mesa_update_state(ctx);
+   else if (brw->gen >= 6 && ctx->NewDriverState & BRW_NEW_BLORP)
+  intel_resolve_and_sync_surfaces(ctx);
 
_mesa_readpixels(ctx, x, y, width, height, format, type, pack, pixels);
 
-- 
2.5.5



[Mesa-dev] [PATCH 13/16] i965/blorp: Use conditional end-of-pipe-sync

2017-02-17 Thread Topi Pohjolainen
instead of unconditional render cache flush.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index e3e4402..bc9f964 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -367,11 +367,13 @@ prepare_blit(struct brw_context *brw,
  struct intel_mipmap_tree *dst_mt,
  unsigned dst_level, unsigned dst_layer, uint32_t dst_usage_flags)
 {
-   prepare_hiz_for_blit(brw, src_mt, src_level, src_layer, false);
-   prepare_hiz_for_blit(brw, dst_mt, dst_level, dst_layer, true);
-
-   prepare_ccs_for_blit(brw, src_mt, src_level, src_layer, src_usage_flags);
-   prepare_ccs_for_blit(brw, dst_mt, dst_level, dst_layer, dst_usage_flags);
+   const bool needs_sync =
+  prepare_hiz_for_blit(brw, src_mt, src_level, src_layer, false) |
+  prepare_hiz_for_blit(brw, dst_mt, dst_level, dst_layer, true) |
+  prepare_ccs_for_blit(brw, src_mt, src_level, src_layer,
+   src_usage_flags) |
+  prepare_ccs_for_blit(brw, dst_mt, dst_level, dst_layer,
+   dst_usage_flags);
 
/* Flush the sampler and render caches.  We definitely need to flush the
 * sampler cache so that we get updated contents from the render cache for
@@ -380,8 +382,15 @@ prepare_blit(struct brw_context *brw,
 * data with different formats, which blorp does for stencil and depth
 * data.
 */
-   brw_render_cache_set_check_flush(brw, src_mt->bo);
-   brw_render_cache_set_check_flush(brw, dst_mt->bo);
+   if (needs_sync) {
+  brw_end_of_pipe_sync(brw);
+  brw_render_cache_set_clear(brw);
+   } else if (_mesa_set_search(brw->render_cache, src_mt->bo)) {
+  brw_emit_pipe_control_flush(brw,
+  PIPE_CONTROL_RENDER_TARGET_FLUSH |
+  PIPE_CONTROL_CS_STALL);
+  brw_render_cache_set_clear(brw);
+   }
 
brw_emit_pipe_control_flush(brw,
PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE);
-- 
2.5.5



[Mesa-dev] [PATCH 10/16] i965: Add end-of-pipe sync before non-gpu read of color resolves

2017-02-17 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c  |  8 
 src/mesa/drivers/dri/i965/intel_tex_image.c| 10 --
 src/mesa/drivers/dri/i965/intel_tex_subimage.c | 11 +--
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b0148d2..49f148c 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2527,10 +2527,10 @@ intel_update_r8stencil(struct brw_context *brw,
 static void *
 intel_miptree_map_raw(struct brw_context *brw, struct intel_mipmap_tree *mt)
 {
-   /* CPU accesses to color buffers don't understand fast color clears, so
-* resolve any pending fast color clears before we map.
-*/
-   intel_miptree_all_slices_resolve_color(brw, mt, 0);
+   if (intel_miptree_all_slices_resolve_color(brw, mt, 0)) {
+  brw_end_of_pipe_sync(brw);
+  brw_render_cache_set_clear(brw);
+   }
 
drm_intel_bo *bo = mt->bo;
 
diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 141996f..cff831b 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -521,9 +521,15 @@ intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx,
}
 
/* Since we are going to write raw data to the miptree, we need to resolve
-* any pending fast color clears before we start.
+* any pending fast color clears before we start. There is no need to
+* call brw_end_of_pipe_sync() here - drm_intel_bo_references() check below
+* will trigger if any resolve was needed and intel_batchbuffer_flush()
+* syncs in the end.
 */
-   intel_miptree_all_slices_resolve_color(brw, image->mt, 0);
+   if (intel_miptree_all_slices_resolve_color(brw, image->mt, 0)) {
+  brw_end_of_pipe_sync(brw);
+  brw_render_cache_set_clear(brw);
+   }
 
bo = image->mt->bo;
 
diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
index b7e52bc..a2db953 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
@@ -36,6 +36,7 @@
 
 #include "brw_context.h"
 #include "intel_batchbuffer.h"
+#include "intel_fbo.h"
 #include "intel_tex.h"
 #include "intel_mipmap_tree.h"
 #include "intel_blit.h"
@@ -137,9 +138,15 @@ intel_texsubimage_tiled_memcpy(struct gl_context * ctx,
}
 
/* Since we are going to write raw data to the miptree, we need to resolve
-* any pending fast color clears before we start.
+* any pending fast color clears before we start. There is no need to
+* call brw_end_of_pipe_sync() here - drm_intel_bo_references() check below
+* will trigger if any resolve was needed and intel_batchbuffer_flush()
+* syncs in the end.
 */
-   intel_miptree_all_slices_resolve_color(brw, image->mt, 0);
+   if (intel_miptree_all_slices_resolve_color(brw, image->mt, 0)) {
+  brw_end_of_pipe_sync(brw);
+  brw_render_cache_set_clear(brw);
+   }
 
bo = image->mt->bo;
 
-- 
2.5.5



[Mesa-dev] [PATCH 06/16] i965: Consider layered rt resolves along with other

2017-02-17 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_context.c | 15 +++
 src/mesa/drivers/dri/i965/brw_draw.c| 34 -
 2 files changed, 15 insertions(+), 34 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 746d754..bb84102 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -366,6 +366,21 @@ brw_prepare_framebuffer(struct gl_context *ctx)
  if (intel_miptree_all_slices_resolve_color(brw, irb->mt, 0))
 flush |= INTEL_WRITE_CACHE_SYNC;
   }
+
+  /* For layered rendering non-compressed fast cleared buffers need to be
+   * resolved. Surface state can carry only one fast color clear value
+   * while each layer may have its own fast clear color value. For
+   * compressed buffers color value is available in the color buffer.
+   */
+  if (irb->layer_count > 1 &&
+  !(irb->mt->aux_disable & INTEL_AUX_DISABLE_CCS) &&
+  !intel_miptree_is_lossless_compressed(brw, irb->mt)) {
+ assert(brw->gen >= 8);
+
+ if (intel_miptree_resolve_color(brw, irb->mt, irb->mt_level,
+ irb->mt_layer, irb->layer_count, 0))
+flush |= INTEL_WRITE_CACHE_SYNC;
+  }
}
 
return flush;
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
index 940ce70..564739c 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -395,39 +395,6 @@ brw_postdraw_set_buffers_need_resolve(struct brw_context *brw)
}
 }
 
-static void
-brw_predraw_set_aux_buffers(struct brw_context *brw)
-{
-   if (brw->gen < 9)
-  return;
-
-   struct gl_context *ctx = &brw->ctx;
-   struct gl_framebuffer *fb = ctx->DrawBuffer;
-
-   for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
-  struct intel_renderbuffer *irb =
- intel_renderbuffer(fb->_ColorDrawBuffers[i]);
-
-  if (!irb) {
- continue;
-  }
-
-  /* For layered rendering non-compressed fast cleared buffers need to be
-   * resolved. Surface state can carry only one fast color clear value
-   * while each layer may have its own fast clear color value. For
-   * compressed buffers color value is available in the color buffer.
-   */
-  if (irb->layer_count > 1 &&
-  !(irb->mt->aux_disable & INTEL_AUX_DISABLE_CCS) &&
-  !intel_miptree_is_lossless_compressed(brw, irb->mt)) {
- assert(brw->gen >= 8);
-
- intel_miptree_resolve_color(brw, irb->mt, irb->mt_level,
- irb->mt_layer, irb->layer_count, 0);
-  }
-   }
-}
-
 /* May fail if out of video memory for texture or vbo upload, or on
  * fallback conditions.
  */
@@ -476,7 +443,6 @@ brw_try_draw_prims(struct gl_context *ctx,
   util_last_bit(ctx->VertexProgram._Current->SamplersUsed);
 
intel_prepare_render(brw);
-   brw_predraw_set_aux_buffers(brw);
 
/* This workaround has to happen outside of brw_upload_render_state()
 * because it may flush the batchbuffer for a blit, affecting the state
-- 
2.5.5



[Mesa-dev] [PATCH 07/16] i965: Add color resolve end-of-pipe-sync before switch to blit ring

2017-02-17 Thread Topi Pohjolainen
This ensures that all rendering is finished and gpu caches are
flushed out. These are paths trying to switch to blit engine.
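
One detail in the blit and copy hunks below is the bitwise `|` between the
two resolve calls. The following is a standalone sketch, assuming the intent
is that both source and destination get resolved before the single sync. The
names struct mock_miptree, mock_resolve_color(), resolve_pair_bitwise() and
resolve_pair_logical() are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miptree: only remembers whether a resolve is pending. */
struct mock_miptree {
   bool pending;
};

/* Performs the pending resolve, if any, and reports whether it did work. */
bool
mock_resolve_color(struct mock_miptree *mt)
{
   assert(mt);

   const bool resolved = mt->pending;
   mt->pending = false;
   return resolved;
}

/* Bitwise OR evaluates both operands (in unspecified order), so both
 * trees are resolved before the caller decides whether to sync.
 */
bool
resolve_pair_bitwise(struct mock_miptree *src, struct mock_miptree *dst)
{
   return mock_resolve_color(src) | mock_resolve_color(dst);
}

/* Logical OR short-circuits: if the source needed a resolve, the
 * destination is never touched and keeps its pending resolve.
 */
bool
resolve_pair_logical(struct mock_miptree *src, struct mock_miptree *dst)
{
   return mock_resolve_color(src) || mock_resolve_color(dst);
}
```

Both variants return the same "something was resolved" answer, but only the
bitwise one guarantees that neither tree is left with a pending resolve when
the blitter takes over.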

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_blit.c | 16 
 src/mesa/drivers/dri/i965/intel_copy_image.c   | 10 --
 src/mesa/drivers/dri/i965/intel_pixel_bitmap.c |  5 -
 src/mesa/drivers/dri/i965/intel_pixel_read.c   |  5 -
 4 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c
index 4d4ab91..ee6800b 100644
--- a/src/mesa/drivers/dri/i965/intel_blit.c
+++ b/src/mesa/drivers/dri/i965/intel_blit.c
@@ -346,8 +346,12 @@ intel_miptree_blit(struct brw_context *brw,
 */
intel_miptree_slice_resolve_depth(brw, src_mt, src_level, src_slice);
intel_miptree_slice_resolve_depth(brw, dst_mt, dst_level, dst_slice);
-   intel_miptree_resolve_color(brw, src_mt, src_level, src_slice, 1, 0);
-   intel_miptree_resolve_color(brw, dst_mt, dst_level, dst_slice, 1, 0);
+
+   if (intel_miptree_resolve_color(brw, src_mt, src_level, src_slice, 1, 0) |
+   intel_miptree_resolve_color(brw, dst_mt, dst_level, dst_slice, 1, 0)) {
+  brw_end_of_pipe_sync(brw);
+  brw_render_cache_set_clear(brw);
+   }
 
if (src_flip)
   src_y = minify(src_mt->physical_height0, src_level - src_mt->first_level) - src_y - height;
@@ -404,8 +408,12 @@ intel_miptree_copy(struct brw_context *brw,
 */
intel_miptree_slice_resolve_depth(brw, src_mt, src_level, src_slice);
intel_miptree_slice_resolve_depth(brw, dst_mt, dst_level, dst_slice);
-   intel_miptree_resolve_color(brw, src_mt, src_level, src_slice, 1, 0);
-   intel_miptree_resolve_color(brw, dst_mt, dst_level, dst_slice, 1, 0);
+
+   if (intel_miptree_resolve_color(brw, src_mt, src_level, src_slice, 1, 0) |
+   intel_miptree_resolve_color(brw, dst_mt, dst_level, dst_slice, 1, 0)) {
+  brw_end_of_pipe_sync(brw);
+  brw_render_cache_set_clear(brw);
+   }
 
uint32_t src_image_x, src_image_y;
intel_miptree_get_image_offset(src_mt, src_level, src_slice,
diff --git a/src/mesa/drivers/dri/i965/intel_copy_image.c b/src/mesa/drivers/dri/i965/intel_copy_image.c
index 85585c7..72ed18e 100644
--- a/src/mesa/drivers/dri/i965/intel_copy_image.c
+++ b/src/mesa/drivers/dri/i965/intel_copy_image.c
@@ -129,17 +129,23 @@ copy_miptrees(struct brw_context *brw,
   return;
}
 
+   bool color_resolved = false;
/* We are now going to try and copy the texture using the blitter.  If
 * that fails, we will fall back mapping the texture and using memcpy.
 * In either case, we need to do a full resolve.
 */
intel_miptree_all_slices_resolve_hiz(brw, src_mt);
intel_miptree_all_slices_resolve_depth(brw, src_mt);
-   intel_miptree_all_slices_resolve_color(brw, src_mt, 0);
+   color_resolved |= intel_miptree_all_slices_resolve_color(brw, src_mt, 0);
 
intel_miptree_all_slices_resolve_hiz(brw, dst_mt);
intel_miptree_all_slices_resolve_depth(brw, dst_mt);
-   intel_miptree_all_slices_resolve_color(brw, dst_mt, 0);
+   color_resolved |= intel_miptree_all_slices_resolve_color(brw, dst_mt, 0);
+
+   if (color_resolved) {
+  brw_end_of_pipe_sync(brw);
+  brw_render_cache_set_clear(brw);
+   }
 
_mesa_get_format_block_size(src_mt->format, , );
 
diff --git a/src/mesa/drivers/dri/i965/intel_pixel_bitmap.c b/src/mesa/drivers/dri/i965/intel_pixel_bitmap.c
index 4522d28..1cdb4e7 100644
--- a/src/mesa/drivers/dri/i965/intel_pixel_bitmap.c
+++ b/src/mesa/drivers/dri/i965/intel_pixel_bitmap.c
@@ -256,7 +256,10 @@ do_blit_bitmap( struct gl_context *ctx,
/* The blitter has no idea about fast color clears, so we need to resolve
 * the miptree before we do anything.
 */
-   intel_miptree_all_slices_resolve_color(brw, irb->mt, 0);
+   if (intel_miptree_all_slices_resolve_color(brw, irb->mt, 0)) {
+  brw_end_of_pipe_sync(brw);
+  brw_render_cache_set_clear(brw);
+   }
 
/* Chop it all into chunks that can be digested by hardware: */
for (py = 0; py < height; py += DY) {
diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c b/src/mesa/drivers/dri/i965/intel_pixel_read.c
index 2563897..8e5e8d9 100644
--- a/src/mesa/drivers/dri/i965/intel_pixel_read.c
+++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c
@@ -138,7 +138,10 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
/* Since we are going to read raw data to the miptree, we need to resolve
 * any pending fast color clears before we start.
 */
-   intel_miptree_all_slices_resolve_color(brw, irb->mt, 0);
+   if (intel_miptree_all_slices_resolve_color(brw, irb->mt, 0)) {
+  brw_end_of_pipe_sync(brw);
+  brw_render_cache_set_clear(brw);
+   }
 
bo = irb->mt->bo;
 
-- 
2.5.5


[Mesa-dev] [PATCH 05/16] i965: Hook end-of-pipe-sync after framebuffer resolves

2017-02-17 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_context.c | 101 
 1 file changed, 51 insertions(+), 50 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index f4ebaf2..746d754 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -322,6 +322,55 @@ brw_prepare_image_surfaces(struct gl_context *ctx)
return flush;
 }
 
+static enum intel_write_cache_flush_type 
+brw_prepare_framebuffer(struct gl_context *ctx)
+{
+   struct brw_context *brw = brw_context(ctx);
+   enum intel_write_cache_flush_type flush = INTEL_WRITE_CACHE_NO_FLUSH;
+
+   /* Resolve color buffers for non-coherent framebuffer fetch. */
+   const bool non_coherent_fb_fetch = 
+  !ctx->Extensions.MESA_shader_framebuffer_fetch &&
+  ctx->FragmentProgram._Current &&
+  ctx->FragmentProgram._Current->info.outputs_read;
+
+   /* If FRAMEBUFFER_SRGB is used on Gen9+ then we need to resolve any of the
+* single-sampled color renderbuffers because the CCS buffer isn't
+* supported for SRGB formats. This only matters if FRAMEBUFFER_SRGB is
+* enabled because otherwise the surface state will be programmed with the
+* linear equivalent format anyway.
+*/
+   const bool srgb_fb = brw->gen >= 9 && ctx->Color.sRGBEnabled;
+
+   const struct gl_framebuffer *fb = ctx->DrawBuffer;
+   for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
+  const struct intel_renderbuffer *irb =
+ intel_renderbuffer(fb->_ColorDrawBuffers[i]);
+
+  if (!irb)
+ continue;
+
+  if (non_coherent_fb_fetch &&
+  intel_miptree_resolve_color(
+ brw, irb->mt, irb->mt_level, irb->mt_layer, irb->layer_count,
+ INTEL_MIPTREE_IGNORE_CCS_E)) {
+ flush |= INTEL_WRITE_CACHE_SYNC;
+  }
+
+  if (srgb_fb && irb->mt && irb->mt->num_samples <= 1 &&
+  _mesa_get_srgb_format_linear(irb->mt->format) != irb->mt->format) {
+ /* Lossless compression is not supported for SRGB formats, it
+  * should be impossible to get here with such surfaces.
+  */
+ assert(!intel_miptree_is_lossless_compressed(brw, irb->mt));
+ if (intel_miptree_all_slices_resolve_color(brw, irb->mt, 0))
+flush |= INTEL_WRITE_CACHE_SYNC;
+  }
+   }
+
+   return flush;
+}
+
 static bool
 intel_resolve_and_sync_surfaces(struct gl_context *ctx)
 {
@@ -333,7 +382,8 @@ intel_resolve_and_sync_surfaces(struct gl_context *ctx)
 
const enum intel_write_cache_flush_type flush =
   brw_prepare_textures(ctx) |
-  brw_prepare_image_surfaces(ctx);
+  brw_prepare_image_surfaces(ctx) |
+  brw_prepare_framebuffer(ctx);
 
if (flush == INTEL_WRITE_CACHE_NO_FLUSH)
   return false;
@@ -402,55 +452,6 @@ intel_update_state(struct gl_context * ctx, GLuint new_state)
   }
}
 
-   /* Resolve color buffers for non-coherent framebuffer fetch. */
-   if (!ctx->Extensions.MESA_shader_framebuffer_fetch &&
-   ctx->FragmentProgram._Current &&
-   ctx->FragmentProgram._Current->info.outputs_read) {
-  const struct gl_framebuffer *fb = ctx->DrawBuffer;
-
-  for (unsigned i = 0; i < fb->_NumColorDrawBuffers; i++) {
- const struct intel_renderbuffer *irb =
-intel_renderbuffer(fb->_ColorDrawBuffers[i]);
-
- if (irb &&
- intel_miptree_resolve_color(
-brw, irb->mt, irb->mt_level, irb->mt_layer, irb->layer_count,
-INTEL_MIPTREE_IGNORE_CCS_E))
-brw_render_cache_set_check_flush(brw, irb->mt->bo);
-  }
-   }
-
-   /* If FRAMEBUFFER_SRGB is used on Gen9+ then we need to resolve any of the
-* single-sampled color renderbuffers because the CCS buffer isn't
-* supported for SRGB formats. This only matters if FRAMEBUFFER_SRGB is
-* enabled because otherwise the surface state will be programmed with the
-* linear equivalent format anyway.
-*/
-   if (brw->gen >= 9 && ctx->Color.sRGBEnabled) {
-  struct gl_framebuffer *fb = ctx->DrawBuffer;
-  for (int i = 0; i < fb->_NumColorDrawBuffers; i++) {
- struct gl_renderbuffer *rb = fb->_ColorDrawBuffers[i];
-
- if (rb == NULL)
-continue;
-
- struct intel_renderbuffer *irb = intel_renderbuffer(rb);
- struct intel_mipmap_tree *mt = irb->mt;
-
- if (mt == NULL ||
- mt->num_samples > 1 ||
- _mesa_get_srgb_format_linear(mt->format) == mt->format)
-   continue;
-
- /* Lossless compression is not supported for SRGB formats, it
-  * should be impossible to get here with such surfaces.
-  */
- assert(!intel_miptree_is_lossless_compressed(brw, mt));
- intel_miptree_all_slices_resolve_color(brw, mt, 0);
- brw_render_cache_set_check_flush(brw, mt->bo);
-  }
-   }
-

[Mesa-dev] i965: On-demand render target flushing

2017-02-17 Thread Topi Pohjolainen
Currently:

1) Blorp color clears and resolves emit an unconditional render target
   flush + command stream stall after every clear/resolve (including
   regular non-fast clears).

2) Blorp color clears, resolves and blits emit texture and constant
   cache flushes even when only the destination is dirty. This is
   because brw_render_cache_set_check_flush() does both the render
   target flush and the top-of-pipe read cache flushes.

3) Similarly to item 2, 3D and compute paths also flush texture and
   constant caches even if none of the texture surfaces are dirty.

4) In case of multiple surfaces needing resolves, all render paths
   (blorp, 3D and compute) emit render target, texture and constant
   cache flushes after each resolve instead of just once after all
   resolves.
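
To make item 4 concrete, here is a small standalone sketch that counts
flushes under the two schemes. Nothing in it is taken from the driver:
struct mock_surface, mock_resolve(), flushes_per_resolve() and
flushes_batched() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical surface: only tracks whether a resolve is pending. */
struct mock_surface {
   bool needs_resolve;
};

/* Resolve one surface; returns true if work was actually queued. */
bool
mock_resolve(struct mock_surface *surf)
{
   if (!surf->needs_resolve)
      return false;

   surf->needs_resolve = false;
   return true;
}

/* Scheme described in item 4: one flush after every resolve. */
unsigned
flushes_per_resolve(struct mock_surface *surfs, size_t count)
{
   assert(surfs || count == 0);

   unsigned flushes = 0;
   for (size_t i = 0; i < count; i++) {
      if (mock_resolve(&surfs[i]))
         flushes++;
   }
   return flushes;
}

/* Batched scheme: remember that something was resolved and flush once
 * after the whole loop.
 */
unsigned
flushes_batched(struct mock_surface *surfs, size_t count)
{
   assert(surfs || count == 0);

   bool any_resolved = false;
   for (size_t i = 0; i < count; i++)
      any_resolved |= mock_resolve(&surfs[i]);

   return any_resolved ? 1 : 0;
}
```

For N surfaces with pending resolves the first scheme emits N flushes and
the second emits one. When nothing needs resolving, both emit none.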

This series addresses all four cases. The good news is that even though
the current setup isn't optimal, performance doesn't actually get any
better in most cases. There is a modest gain in OglDrvRes, which does
heavy blorp blitting. I'm also expecting this series to make blorp tex
uploads and blorp mipmap generation more competitive.

The bad news is in the final patch: it looks like the current
unconditional flushing/stalling has been hiding bugs elsewhere. There
are cases which rely on the flushes after non-fast clears. Hunting the
real cause is, however, difficult. I only saw them in the CI system
within full runs and was not able to reproduce them myself.

As a first step the series introduces end-of-pipe synchronization.
This is a flush combined with a stall and a post-sync operation that
writes a double word (32 bits). Until now this wasn't really needed
because in many cases there was double flushing, which in turn looks
to take long enough to hide the need for the sync. I also noticed
that one needs to be rather careful with it: performance decreases
noticeably when it is used unnecessarily.

I don't really know myself whether we want to go this way. The current
logic - while not ideal - is rather simple.

Topi Pohjolainen (16):
  i965/miptree: Tell if anything got resolved
  i965/gen6+: Implement end-of-pipe sync
  i965: Hook end-of-pipe-sync after texture resolves
  i965: Hook end-of-pipe-sync after image resolves
  i965: Hook end-of-pipe-sync after framebuffer resolves
  i965: Consider layered rt resolves along with other
  i965: Add color resolve end-of-pipe-sync before switch to blit ring
  i965/dri2: Add end-of-pipe-sync after color resolves
  i965/miptree: Add color resolve end-of-pipe-sync before sharing
  i965: Add end-of-pipe sync before non-gpu read of color resolves
  i965/blorp: Do more fine grained flushing/syncing
  i965/blorp/blit: Refactor hiz/ccs prep for blits
  i965/blorp: Use conditional end-of-pipe-sync
  i965: Consider surface resolves and sync after blorp ops
  i965: Check if fast color clear state transition needs sync
  i965/blorp: Drop unnecessary flushes after clear/resolve

 src/mesa/drivers/dri/i965/brw_blorp.c  | 187 ++
 src/mesa/drivers/dri/i965/brw_compute.c|   2 +
 src/mesa/drivers/dri/i965/brw_context.c| 333 +++--
 src/mesa/drivers/dri/i965/brw_context.h|   3 +
 src/mesa/drivers/dri/i965/brw_draw.c   |  36 +--
 src/mesa/drivers/dri/i965/brw_pipe_control.c   |  91 +++
 src/mesa/drivers/dri/i965/genX_blorp_exec.c|  11 -
 src/mesa/drivers/dri/i965/intel_blit.c |  16 +-
 src/mesa/drivers/dri/i965/intel_copy_image.c   |  10 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c  |  25 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h  |   2 +-
 src/mesa/drivers/dri/i965/intel_pixel.c|   4 +
 src/mesa/drivers/dri/i965/intel_pixel_bitmap.c |   5 +-
 src/mesa/drivers/dri/i965/intel_pixel_read.c   |   7 +-
 src/mesa/drivers/dri/i965/intel_tex_image.c|  10 +-
 src/mesa/drivers/dri/i965/intel_tex_subimage.c |  11 +-
 16 files changed, 557 insertions(+), 196 deletions(-)

-- 
2.5.5



[Mesa-dev] [PATCH 01/16] i965/miptree: Tell if anything got resolved

2017-02-17 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 8 ++--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 2 +-
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b339f99..717a320 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -2297,21 +2297,25 @@ intel_miptree_resolve_color(struct brw_context *brw,
return resolved;
 }
 
-void
+bool
 intel_miptree_all_slices_resolve_color(struct brw_context *brw,
struct intel_mipmap_tree *mt,
int flags)
 {
if (!intel_miptree_needs_color_resolve(brw, mt, flags))
-  return;
+  return false;
   
+   bool resolved = false;
foreach_list_typed_safe(struct intel_resolve_map, map, link,
                        &mt->color_resolve_map) {
   assert(map->fast_clear_state != INTEL_FAST_CLEAR_STATE_RESOLVED);
 
   brw_blorp_resolve_color(brw, mt, map->level, map->layer);
   intel_resolve_map_remove(map);
+  resolved = true;
}
+
+   return resolved;
 }
 
 /**
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 27bcdfb..0337bf0 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -947,7 +947,7 @@ intel_miptree_resolve_color(struct brw_context *brw,
 unsigned start_layer, unsigned num_layers,
 int flags);
 
-void
+bool
 intel_miptree_all_slices_resolve_color(struct brw_context *brw,
struct intel_mipmap_tree *mt,
int flags);
-- 
2.5.5



[Mesa-dev] [PATCH 03/16] i965: Hook end-of-pipe-sync after texture resolves

2017-02-17 Thread Topi Pohjolainen
There are three functional changes in this patch:

1) Currently the iteration over textures would flush after each
   resolve: brw_render_cache_set_check_flush() would fire
   every time as the resolved surface would be found in the render
   cache. Now the iteration records whether a flush is needed and
   does it only once after all resolves are pipelined.

2) Make a distinction between resolves and other renders. In the
   former case issue end-of-pipe-sync and in the latter keep on
   emitting just the flush.

3) Current logic calls brw_render_cache_set_check_flush() which
   also does the top-of-pipe flushing. In case of texture resolves
   this is now also done once in intel_update_state(). Ideally
   this would be called conditionally by the 3D draw and compute
   draw paths. In the former case this would need plumbing from
   core update to the driver as draw elements update the state
   without the driver necessarily knowing.
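
The accumulate-then-emit idea behind items 1 and 2 can be sketched in
isolation roughly as follows. The flag layout mirrors the enum added in the
diff below, but struct mock_batch and emit_for_flags() are hypothetical
stand-ins that only count what would be emitted. They are not the driver's
batch or pipe control code.

```c
#include <assert.h>
#include <stdbool.h>

/* Same layout as the enum in the diff below, renamed to mark it as a mock. */
enum mock_write_cache_flush_type {
   MOCK_WRITE_CACHE_NO_FLUSH = 0,
   MOCK_WRITE_CACHE_FLUSH    = 1 << 0,
   MOCK_WRITE_CACHE_SYNC     = 1 << 1,
};

/* Hypothetical batch: counts emitted operations instead of encoding them. */
struct mock_batch {
   unsigned flushes;
   unsigned syncs;
};

/* Emit at most one operation for the requirements OR'ed together over
 * all surfaces. A sync outranks a plain flush, so when both bits are set
 * only the sync is emitted. Returns true if anything was emitted.
 */
bool
emit_for_flags(struct mock_batch *batch, unsigned flags)
{
   assert(batch);

   if (flags == MOCK_WRITE_CACHE_NO_FLUSH)
      return false;

   if (flags & MOCK_WRITE_CACHE_SYNC)
      batch->syncs++;
   else
      batch->flushes++;

   return true;
}
```

A caller would OR the per-surface results into one value while iterating and
call emit_for_flags() once afterwards, which is what keeps the number of
emitted operations independent of the number of resolved surfaces.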

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_context.c | 113 
 1 file changed, 99 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 7240b1f..9ca1ac1 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -222,6 +222,86 @@ intel_texture_view_requires_resolve(struct brw_context *brw,
return true;
 }
 
+enum intel_write_cache_flush_type {
+   INTEL_WRITE_CACHE_NO_FLUSH = 0,
+   INTEL_WRITE_CACHE_FLUSH= 1 << 0,
+   INTEL_WRITE_CACHE_SYNC = 1 << 1,
+};
+
+static enum intel_write_cache_flush_type 
+brw_prepare_textures(struct gl_context *ctx)
+{
+   struct brw_context *brw = brw_context(ctx);
+   enum intel_write_cache_flush_type flush = INTEL_WRITE_CACHE_NO_FLUSH;
+   bool resolved = false;
+
+   memset(brw->draw_aux_buffer_disabled, 0,
+  sizeof(brw->draw_aux_buffer_disabled));
+
+   for (int i = 0; i <= ctx->Texture._MaxEnabledTexImageUnit; i++) {
+  if (!ctx->Texture.Unit[i]._Current)
+continue;
+
+  struct intel_texture_object * const tex_obj =
+ intel_texture_object(ctx->Texture.Unit[i]._Current);
+  if (!tex_obj || !tex_obj->mt)
+continue;
+
+  if (intel_miptree_sample_with_hiz(brw, tex_obj->mt))
+ resolved |= intel_miptree_all_slices_resolve_hiz(brw, tex_obj->mt);
+  else
+ resolved |= intel_miptree_all_slices_resolve_depth(brw, tex_obj->mt);
+
+  /* Sampling engine understands lossless compression and resolving
+   * those surfaces should be skipped for performance reasons.
+   */
+  const int flags = intel_texture_view_requires_resolve(brw, tex_obj) ?
+   0 : INTEL_MIPTREE_IGNORE_CCS_E;
+  resolved |= intel_miptree_all_slices_resolve_color(brw, tex_obj->mt,
+ flags);
+
+  /*
+   * Ivybridge PRM Vol 2, Part 1, "11.7 MCS Buffer for Render Target(s)":
+   *
+   *  Any transition from any value in {Clear, Render, Resolve} to a
+   *  different value in {Clear, Render, Resolve} requires end of pipe
+   *  synchronization.
+   */
+  if (resolved)
+ flush |= INTEL_WRITE_CACHE_SYNC;
+
+  if (_mesa_set_search(brw->render_cache, tex_obj->mt->bo) != NULL)
+ flush |= INTEL_WRITE_CACHE_FLUSH;
+   }
+
+   return flush;
+}
+
+static bool
+intel_resolve_and_sync_surfaces(struct gl_context *ctx)
+{
+   struct brw_context *brw = brw_context(ctx);
+   const int flags = (brw->gen >= 6) ? PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+   PIPE_CONTROL_RENDER_TARGET_FLUSH |
+   PIPE_CONTROL_CS_STALL :
+   PIPE_CONTROL_RENDER_TARGET_FLUSH;
+
+   const enum intel_write_cache_flush_type flush =
+  brw_prepare_textures(ctx);
+
+   if (flush == INTEL_WRITE_CACHE_NO_FLUSH)
+  return false;
+
+   if (flush & INTEL_WRITE_CACHE_SYNC)
+  brw_end_of_pipe_sync(brw);
+   else if (flush & INTEL_WRITE_CACHE_FLUSH)
+  brw_emit_pipe_control_flush(brw, flags);
+
+   brw_render_cache_set_clear(brw);
+
+   return true;
+}
+
 static void
 intel_update_state(struct gl_context * ctx, GLuint new_state)
 {
@@ -242,10 +322,26 @@ intel_update_state(struct gl_context * ctx, GLuint 
new_state)
if (depth_irb)
   intel_renderbuffer_resolve_hiz(brw, depth_irb);
 
-   memset(brw->draw_aux_buffer_disabled, 0,
-  sizeof(brw->draw_aux_buffer_disabled));
+   if (intel_resolve_and_sync_surfaces(ctx)) {
+  /* Perform top-of-pipe flush.
+   *
+   * TODO: Consider flushing only in brw_dispatch_compute_common() and
+   *   brw_try_draw_prims(). Other callers of _mesa_update_state() are
+   *   not going to be using gpu and hence flushing the gpu read-only
+   *   caches (texture and data port constant cache) are unnecessary.
+   *   

[Mesa-dev] [PATCH 02/16] i965/gen6+: Implement end-of-pipe sync

2017-02-17 Thread Topi Pohjolainen
Implementation for gen < 6 is taken as copy-paste from
brw_emit_mi_flush() in order to preserve the behavior in later
patches.

Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_context.h  |  1 +
 src/mesa/drivers/dri/i965/brw_pipe_control.c | 91 
 2 files changed, 92 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 01e651b..34aea56 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1704,6 +1704,7 @@ void brw_emit_pipe_control_write(struct brw_context *brw, 
uint32_t flags,
  uint32_t imm_lower, uint32_t imm_upper);
 void brw_emit_mi_flush(struct brw_context *brw);
 void brw_emit_post_sync_nonzero_flush(struct brw_context *brw);
+void brw_end_of_pipe_sync(struct brw_context *brw);
 void brw_emit_depth_stall_flushes(struct brw_context *brw);
 void gen7_emit_vs_workaround_flush(struct brw_context *brw);
 void gen7_emit_cs_stall_flush(struct brw_context *brw);
diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c 
b/src/mesa/drivers/dri/i965/brw_pipe_control.c
index b8f7406..4fbd401 100644
--- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
+++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
@@ -331,6 +331,97 @@ brw_emit_post_sync_nonzero_flush(struct brw_context *brw)
brw->workaround_bo, 0, 0, 0);
 }
 
+/*
+ * From Sandybridge PRM, volume 2, "1.7.2 End-of-Pipe Synchronization":
+ *
+ *  Write synchronization is a special case of end-of-pipe
+ *  synchronization that requires that the render cache and/or depth
+ *  related caches are flushed to memory, where the data will become
+ *  globally visible. This type of synchronization is required prior to
+ *  SW (CPU) actually reading the result data from memory, or initiating
+ *  an operation that will use as a read surface (such as a texture
+ *  surface) a previous render target and/or depth/stencil buffer
+ *
+ *
+ * From Haswell PRM, volume 2, part 1, "End-of-Pipe Synchronization":
+ *
+ *  Exercising the write cache flush bits (Render Target Cache Flush
+ *  Enable, Depth Cache Flush Enable, DC Flush) in PIPE_CONTROL only
+ *  ensures the write caches are flushed and doesn't guarantee the data
+ *  is globally visible.
+ *
+ *  SW can track the completion of the end-of-pipe-synchronization by
+ *  using "Notify Enable" and "PostSync Operation - Write Immediate
+ *  Data" in the PIPE_CONTROL command. 
+ */
+static void
+gen6_end_of_pipe_sync(struct brw_context *brw)
+{
+   /*
+* From Sandybridge PRM, volume 2, "1.7.3.1 Writing a Value to Memory":
+*
+*  The most common action to perform upon reaching a synchronization
+*  point is to write a value out to memory. An immediate value (included
+*  with the synchronization command) may be written. 
+*
+*
+* From Broadwell PRM, volume 7, "End-of-Pipe Synchronization":
+*
+*  In case the data flushed out by the render engine is to be read back in
+*  to the render engine in coherent manner, then the render engine has to
+*  wait for the fence completion before accessing the flushed data. This
+*  can be achieved by following means on various products:
+*  PIPE_CONTROL command with CS Stall and the required write caches
+*  flushed with Post-Sync-Operation as Write Immediate Data.
+*
+*  Example:
+* - Workload-1 (3D/GPGPU/MEDIA)
+* - PIPE_CONTROL (CS Stall, Post-Sync-Operation Write Immediate Data,
+*   Required Write Cache Flush bits set)
+* - Workload-2 (Can use the data produce or output by Workload-1)
+*/
+   brw_emit_pipe_control_write(brw,
+   PIPE_CONTROL_CS_STALL |
+   PIPE_CONTROL_DATA_CACHE_FLUSH |
+   PIPE_CONTROL_DEPTH_CACHE_FLUSH |
+   PIPE_CONTROL_RENDER_TARGET_FLUSH |
+   PIPE_CONTROL_WRITE_IMMEDIATE,
+   brw->workaround_bo, 0, 0, 0);
+
+   if (!brw->is_haswell)
+  return;
+
+   /*
+* Haswell needs additionally:
+*
+* From Haswell PRM, volume 2, part 1, "End-of-Pipe Synchronization":
+*
+*  Option 1:
+*  PIPE_CONTROL command with the CS Stall and the required write caches
+*  flushed with Post-SyncOperation as Write Immediate Data followed by
+*  eight dummy MI_STORE_DATA_IMM (write to scratch space) commands.
+*
+*  Example:
+* - Workload-1
+* - PIPE_CONTROL (CS Stall, Post-Sync-Operation Write Immediate Data,
+*   Required Write Cache Flush bits set)
+* - MI_STORE_DATA_IMM (8 times) (Dummy data, Scratch Address)
+* - Workload-2 (Can use the data produce or output by Workload-1)
+*/
+   for (unsigned i = 0; i < 8; ++i) {
+  brw_store_data_imm32(brw, brw->workaround_bo, 0, 0);
+   }

[Mesa-dev] [PATCH 04/16] i965: Hook end-of-pipe-sync after image resolves

2017-02-17 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_context.c | 80 +++--
 1 file changed, 47 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 9ca1ac1..f4ebaf2 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -277,6 +277,51 @@ brw_prepare_textures(struct gl_context *ctx)
return flush;
 }
 
+static enum intel_write_cache_flush_type 
+brw_prepare_image_surfaces(struct gl_context *ctx)
+{
+   struct brw_context *brw = brw_context(ctx);
+   enum intel_write_cache_flush_type flush = INTEL_WRITE_CACHE_NO_FLUSH;
+
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  const struct gl_program *prog = ctx->_Shader->CurrentProgram[i];
+
+  if (!prog || prog->info.num_images == 0)
+ continue;
+
+  for (unsigned j = 0; j < prog->info.num_images; j++) {
+ struct gl_image_unit *u =
+&ctx->ImageUnits[prog->sh.ImageUnits[j]];
+ struct intel_texture_object * const tex_obj =
+intel_texture_object(u->TexObj);
+
+ if (!tex_obj || !tex_obj->mt)
+continue;
+
+ /* Access to images is implemented using indirect messages
+  * against data port. Normal render target write understands
+  * lossless compression but unfortunately the typed/untyped
+  * read/write interface doesn't. Therefore even lossless
+  * compressed surfaces need to be resolved prior to accessing
+  * them. Hence skip setting INTEL_MIPTREE_IGNORE_CCS_E.
+  */
+ if (intel_miptree_all_slices_resolve_color(brw, tex_obj->mt, 0))
+flush |= INTEL_WRITE_CACHE_SYNC;
+
+ if (intel_miptree_is_lossless_compressed(brw, tex_obj->mt) &&
+ intel_disable_rb_aux_buffer(brw, tex_obj->mt->bo)) {
+perf_debug("Using renderbuffer as shader image - turning "
+   "off lossless compression");
+ }
+
+ if (_mesa_set_search(brw->render_cache, tex_obj->mt->bo) != NULL)
+flush |= INTEL_WRITE_CACHE_FLUSH;
+  }
+   }
+
+   return flush;
+}
+
 static bool
 intel_resolve_and_sync_surfaces(struct gl_context *ctx)
 {
@@ -287,7 +332,8 @@ intel_resolve_and_sync_surfaces(struct gl_context *ctx)
PIPE_CONTROL_RENDER_TARGET_FLUSH;
 
const enum intel_write_cache_flush_type flush =
-  brw_prepare_textures(ctx);
+  brw_prepare_textures(ctx) |
+  brw_prepare_image_surfaces(ctx);
 
if (flush == INTEL_WRITE_CACHE_NO_FLUSH)
   return false;
@@ -356,38 +402,6 @@ intel_update_state(struct gl_context * ctx, GLuint 
new_state)
   }
}
 
-   /* Resolve color for each active shader image. */
-   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
-  const struct gl_program *prog = ctx->_Shader->CurrentProgram[i];
-
-  if (unlikely(prog && prog->info.num_images)) {
- for (unsigned j = 0; j < prog->info.num_images; j++) {
-struct gl_image_unit *u =
-   &ctx->ImageUnits[prog->sh.ImageUnits[j]];
-tex_obj = intel_texture_object(u->TexObj);
-
-if (tex_obj && tex_obj->mt) {
-   /* Access to images is implemented using indirect messages
-* against data port. Normal render target write understands
-* lossless compression but unfortunately the typed/untyped
-* read/write interface doesn't. Therefore even lossless
-* compressed surfaces need to be resolved prior to accessing
-* them. Hence skip setting INTEL_MIPTREE_IGNORE_CCS_E.
-*/
-   intel_miptree_all_slices_resolve_color(brw, tex_obj->mt, 0);
-
-   if (intel_miptree_is_lossless_compressed(brw, tex_obj->mt) &&
-   intel_disable_rb_aux_buffer(brw, tex_obj->mt->bo)) {
-  perf_debug("Using renderbuffer as shader image - turning "
- "off lossless compression");
-   }
-
-   brw_render_cache_set_check_flush(brw, tex_obj->mt->bo);
-}
- }
-  }
-   }
-
/* Resolve color buffers for non-coherent framebuffer fetch. */
if (!ctx->Extensions.MESA_shader_framebuffer_fetch &&
ctx->FragmentProgram._Current &&
-- 
2.5.5



[Mesa-dev] [PATCH] vulkan/util: Add generator for enum_to_str functions

2017-02-17 Thread Dylan Baker
This adds a python generator to produce enum_to_str functions for
Vulkan from the vk.xml API description. It supports extensions as well
as core API features, and the generator works with both python2 and
python3.
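
For readers unfamiliar with the idea, the output of such a generator is essentially one switch statement per enum. The sketch below shows the general shape only; `example_result`, its enumerants and `example_result_to_str` are made-up names for illustration and are not taken from vk.xml or from the generated vk_enum_to_str.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in enum; the real generator reads its enums from vk.xml. */
enum example_result {
   EXAMPLE_SUCCESS = 0,
   EXAMPLE_NOT_READY = 1,
   EXAMPLE_ERROR_DEVICE_LOST = -4,
};

/* One case per enumerant, each returning the enumerant's own name. */
static const char *
example_result_to_str(enum example_result value)
{
   switch ((int)value) {
   case EXAMPLE_SUCCESS:
      return "EXAMPLE_SUCCESS";
   case EXAMPLE_NOT_READY:
      return "EXAMPLE_NOT_READY";
   case EXAMPLE_ERROR_DEVICE_LOST:
      return "EXAMPLE_ERROR_DEVICE_LOST";
   default:
      /* Values the table does not know about get a fixed fallback string. */
      return "Unknown example_result value";
   }
}

/* Small self-check showing the intended use. */
static void
example_self_check(void)
{
   assert(strcmp(example_result_to_str(EXAMPLE_NOT_READY),
                 "EXAMPLE_NOT_READY") == 0);
}
```

Generating this table from the XML means a newly added enumerant or extension value shows up in the strings without anyone editing a hand-written case list.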

CC: Jason Ekstrand 
Signed-off-by: Dylan Baker 
---
 configure.ac   |   1 +
 src/Makefile.am|   1 +
 src/intel/vulkan/Makefile.am   |   2 +
 src/intel/vulkan/anv_util.c|  36 +---
 src/vulkan/util/.gitignore |   1 +
 src/vulkan/util/Makefile.am|  22 +
 src/vulkan/util/gen_enum_to_str.py | 172 +
 7 files changed, 201 insertions(+), 34 deletions(-)
 create mode 100644 src/vulkan/util/.gitignore
 create mode 100644 src/vulkan/util/Makefile.am
 create mode 100644 src/vulkan/util/gen_enum_to_str.py

diff --git a/configure.ac b/configure.ac
index 7e4544f5bf..c83a5234da 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2691,6 +2691,7 @@ AC_CONFIG_FILES([Makefile
src/mesa/main/tests/Makefile
src/util/Makefile
src/util/tests/hash_table/Makefile
+   src/vulkan/util/Makefile
src/vulkan/wsi/Makefile])
 
 AC_OUTPUT
diff --git a/src/Makefile.am b/src/Makefile.am
index 12e5dcdb12..90f95b2265 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -117,6 +117,7 @@ SUBDIRS += intel/tools
 endif
 
 if HAVE_VULKAN_COMMON
+SUBDIRS += vulkan/util
 SUBDIRS += vulkan/wsi
 endif
 EXTRA_DIST += vulkan/registry/vk.xml
diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
index 4197b0e77c..54bf0f5de1 100644
--- a/src/intel/vulkan/Makefile.am
+++ b/src/intel/vulkan/Makefile.am
@@ -49,6 +49,7 @@ AM_CPPFLAGS = \
-I$(top_builddir)/src \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/vulkan/wsi \
+   -I$(top_builddir)/src/vulkan/util \
-I$(top_builddir)/src/compiler \
-I$(top_srcdir)/src/compiler \
-I$(top_builddir)/src/compiler/nir \
@@ -125,6 +126,7 @@ libvulkan_common_la_SOURCES = $(VULKAN_SOURCES)
 
 VULKAN_LIB_DEPS += \
libvulkan_common.la \
+   $(top_builddir)/src/vulkan/util/libvulkan_util.la \
$(top_builddir)/src/vulkan/wsi/libvulkan_wsi.la \
$(top_builddir)/src/mesa/drivers/dri/i965/libi965_compiler.la \
$(top_builddir)/src/compiler/nir/libnir.la \
diff --git a/src/intel/vulkan/anv_util.c b/src/intel/vulkan/anv_util.c
index 6d75187065..ec5c9486d8 100644
--- a/src/intel/vulkan/anv_util.c
+++ b/src/intel/vulkan/anv_util.c
@@ -29,6 +29,7 @@
 #include 
 
 #include "anv_private.h"
+#include "vk_enum_to_str.h"
 
 /** Log an error message.  */
 void anv_printflike(1, 2)
@@ -69,40 +70,7 @@ __vk_errorf(VkResult error, const char *file, int line, 
const char *format, ...)
va_list ap;
char buffer[256];
 
-#define ERROR_CASE(error) case error: error_str = #error; break;
-
-   const char *error_str;
-   switch ((int32_t)error) {
-
-   /* Core errors */
-   ERROR_CASE(VK_ERROR_OUT_OF_HOST_MEMORY)
-   ERROR_CASE(VK_ERROR_OUT_OF_DEVICE_MEMORY)
-   ERROR_CASE(VK_ERROR_INITIALIZATION_FAILED)
-   ERROR_CASE(VK_ERROR_DEVICE_LOST)
-   ERROR_CASE(VK_ERROR_MEMORY_MAP_FAILED)
-   ERROR_CASE(VK_ERROR_LAYER_NOT_PRESENT)
-   ERROR_CASE(VK_ERROR_EXTENSION_NOT_PRESENT)
-   ERROR_CASE(VK_ERROR_FEATURE_NOT_PRESENT)
-   ERROR_CASE(VK_ERROR_INCOMPATIBLE_DRIVER)
-   ERROR_CASE(VK_ERROR_TOO_MANY_OBJECTS)
-   ERROR_CASE(VK_ERROR_FORMAT_NOT_SUPPORTED)
-   ERROR_CASE(VK_ERROR_FRAGMENTED_POOL)
-
-   /* Extension errors */
-   ERROR_CASE(VK_ERROR_SURFACE_LOST_KHR)
-   ERROR_CASE(VK_ERROR_NATIVE_WINDOW_IN_USE_KHR)
-   ERROR_CASE(VK_ERROR_OUT_OF_DATE_KHR)
-   ERROR_CASE(VK_ERROR_INCOMPATIBLE_DISPLAY_KHR)
-   ERROR_CASE(VK_ERROR_VALIDATION_FAILED_EXT)
-   ERROR_CASE(VK_ERROR_INVALID_SHADER_NV)
-   ERROR_CASE(VK_ERROR_OUT_OF_POOL_MEMORY_KHR)
-
-   default:
-  assert(!"Unknown error");
-  error_str = "unknown error";
-   }
-
-#undef ERROR_CASE
+   const char *error_str = vk_Result_to_str(error);
 
if (format) {
   va_start(ap, format);
diff --git a/src/vulkan/util/.gitignore b/src/vulkan/util/.gitignore
new file mode 100644
index 00..5c79217982
--- /dev/null
+++ b/src/vulkan/util/.gitignore
@@ -0,0 +1 @@
+vk_enum_to_str.*
diff --git a/src/vulkan/util/Makefile.am b/src/vulkan/util/Makefile.am
new file mode 100644
index 00..ced83e8873
--- /dev/null
+++ b/src/vulkan/util/Makefile.am
@@ -0,0 +1,22 @@
+vulkan_api_xml = $(top_srcdir)/src/vulkan/registry/vk.xml
+
+AM_CPPFLAGS = \
+   -I$(top_srcdir)/include \
+   -I$(top_srcdir)/src
+
+EXTRA_DIST= \
+   gen_enum_to_str.py
+
+BUILT_SOURCES= \
+   vk_enum_to_str.c \
+   vk_enum_to_str.h
+
+vk_enum_to_str.c vk_enum_to_str.h: gen_enum_to_str.py $(vulkan_api_xml)
+   $(AM_V_GEN)$(PYTHON2) $(srcdir)/gen_enum_to_str.py
+
+noinst_LTLIBRARIES = libvulkan_util.la
+
+libvulkan_util_la_SOURCES = \
+   vk_enum_to_str.c \
+   

Re: [Mesa-dev] [Mesa-stable] Mesa 13.0.5 release candidate

2017-02-17 Thread Emil Velikov
On 17 February 2017 at 17:14, Andreas Boll  wrote:
> 2017-02-17 16:15 GMT+01:00 Emil Velikov :
>> Hello list,
>>
>> The candidate for the Mesa 13.0.5 is now available. Currently we have:
>>  - 70 queued
>>  - 5 nominated (outstanding)
>>  - and 0 rejected patch(es)
>>
>>
>> Testing reports/general approval
>> 
>> Any testing reports (or general approval of the state of the branch) will be
>> greatly appreciated.
>>
>> The plan is to have 13.0.5 this Friday (19th of February), around or shortly
>> after 15:00 GMT.
>
> This Friday (17th of February) or this Sunday (19th of February)? :-)
>
The latter.

>>
>> If you have any questions or suggestions - be that about the current patch
>> queue or otherwise, please go ahead.
>>
>
> Please cherry-pick the following commit for 13.0.5 and 17.0.1:
>
> commit 94262e5f5db1f5c7865ced251c440bc5f3f4a89d
> Author: Bartosz Tomczyk 
> Date:   Sun Jan 29 19:10:25 2017 +0100
>
> r600/sb: Fix memory leak
>
> Signed-off-by: Marek Olšák 
>
> It fixes e933246013ee ("r600/sb: Fix loop optimization related hangs on eg")
>
Barring any objections from the author or Marek I'll pick it up.

Thanks
Emil


Re: [Mesa-dev] Mesa 13.0.5 release candidate

2017-02-17 Thread Andreas Boll
2017-02-17 16:15 GMT+01:00 Emil Velikov :
> Hello list,
>
> The candidate for the Mesa 13.0.5 is now available. Currently we have:
>  - 70 queued
>  - 5 nominated (outstanding)
>  - and 0 rejected patch(es)
>
>
> Testing reports/general approval
> 
> Any testing reports (or general approval of the state of the branch) will be
> greatly appreciated.
>
> The plan is to have 13.0.5 this Friday (19th of February), around or shortly
> after 15:00 GMT.

This Friday (17th of February) or this Sunday (19th of February)? :-)

>
> If you have any questions or suggestions - be that about the current patch
> queue or otherwise, please go ahead.
>

Please cherry-pick the following commit for 13.0.5 and 17.0.1:

commit 94262e5f5db1f5c7865ced251c440bc5f3f4a89d
Author: Bartosz Tomczyk 
Date:   Sun Jan 29 19:10:25 2017 +0100

r600/sb: Fix memory leak

Signed-off-by: Marek Olšák 

It fixes e933246013ee ("r600/sb: Fix loop optimization related hangs on eg")

Thanks,
Andreas


Re: [Mesa-dev] [PATCH 7/7] i965: Enable ARB_transform_feedback2 on Sandybridge.

2017-02-17 Thread Matt Turner
On Fri, Feb 17, 2017 at 1:56 AM, Kenneth Graunke  wrote:
> The only feature over and above ES 3.0 is DrawTransformFeedback().
>
> We already have to do the whole SOL_NUM_PRIMS_WRITTEN counter dance in
> order to compute the SVBI value for ResumeTransformFeedback(), at which
> point our existing GetTransformFeedbackVertexCount() implementation will
> do the trick (though with a stall to CPU map the buffer).
>
> Someday, we could probably implement DrawTransformFeedback() more
> efficiently, using the "Load Internal Vertex Count" feature of
> 3DSTATE_SVB_INDEX and the 3DPRIMITIVE indirect vertex count bit.
>
> Rumor has it this allows people to use WebGL 2.0 on Sandybridge.
>
> Note that we don't need pipelined register writes like Gen7+ because
> we use the 3DSTATE_SVB_INDEX command rather than MI_LOAD_REGISTER_MEM.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99842
> Signed-off-by: Kenneth Graunke 

Should update the release notes and features.txt as well.


Re: [Mesa-dev] [PATCH shader-db 4/4] add special script to run on intel hardware

2017-02-17 Thread Matt Turner
On Thu, Feb 16, 2017 at 4:29 AM, Lionel Landwerlin
 wrote:
> Intel produces fairly beefy Xeon servers on which it would be nice to be
> able to run shader-db to get some results pretty fast. Unfortunately those
> don't ship with any intel graphics IP (only ancient Matrox cards).
>
> This new script stubs a bunch of ioctls so that we can run shader-db on
> hardware that doesn't have a /dev/dri/renderD128.
>
> Example :
>
>./intel_run -j70 -pskl -oi965 shaders

This is awesome. Thanks a ton for building this.

>
> Signed-off-by: Lionel Landwerlin 
> ---
>  Makefile |  10 ++-
>  intel_run|   5 ++
>  intel_stub.c | 237 
> +++
>  3 files changed, 251 insertions(+), 1 deletion(-)
>  create mode 100755 intel_run
>  create mode 100644 intel_stub.c
>
> diff --git a/Makefile b/Makefile
> index 9422b32..52a764f 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -23,7 +23,15 @@ CFLAGS ?= -g -O2 -march=native -pipe
>  CFLAGS += -std=gnu99 -fopenmp
>  LDLIBS = -lepoxy -lgbm
>
> +INTEL_STUB_CFLAGS = -g -fPIC -shared -Wall `pkg-config libdrm_intel --cflags`
> +INTEL_STUB_LIBS = -ldl
> +
> +all: intel_stub.so run
> +
> +intel_stub.so: intel_stub.c
> +   gcc $(INTEL_STUB_CFLAGS) $< -o $@ $(INTEL_STUB_LIBS)

$(CC) instead of gcc.

Feel free to commit the lot!


Re: [Mesa-dev] [PATCH 1/2] util: Add utility build-id code.

2017-02-17 Thread Matt Turner
On Fri, Feb 17, 2017 at 5:39 AM, Emil Velikov  wrote:
> On 17 February 2017 at 01:10, Jonathan Gray  wrote:
>> On Thu, Feb 16, 2017 at 04:25:02PM +, Emil Velikov wrote:
>>> On 16 February 2017 at 14:23, Jonathan Gray  wrote:
>>> > On Wed, Feb 15, 2017 at 11:11:50AM -0800, Matt Turner wrote:
>>> >> Provides the ability to read the .note.gnu.build-id section of ELF
>>> >> binaries, which is inserted by the --build-id=... flag to ld.
>>> >>
>>> >> Reviewed-by: Emil Velikov 
>>> >
>>> > I don't have time to dig into details right now but this broke the Mesa
>>> > build on OpenBSD and likely other non-linux platforms:
>>> >
>>> > libtool: compile:  gcc -DPACKAGE_NAME=\"Mesa\" -DPACKAGE_TARNAME=\"mesa\" 
>>> > -DPACKAGE_VERSION=\"17.1.0-devel\" "-DPACKAGE_STRING=\"Mesa 
>>> > 17.1.0-devel\"" 
>>> > "-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\";
>>> >  -DPACKAGE_URL=\"\" -DPACKAGE=\"mesa\" -DVERSION=\"17.1.0-devel\" 
>>> > -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 
>>> > -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
>>> > -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 
>>> > -DLT_OBJDIR=\".libs/\" -DYYTEXT_POINTER=1 -DHAVE___BUILTIN_CLZ=1 
>>> > -DHAVE___BUILTIN_CLZLL=1 -DHAVE___BUILTIN_CTZ=1 -DHAVE___BUILTIN_EXPECT=1 
>>> > -DHAVE___BUILTIN_FFS=1 -DHAVE___BUILTIN_FFSLL=1 
>>> > -DHAVE___BUILTIN_POPCOUNT=1 -DHAVE___BUILTIN_POPCOUNTLL=1 
>>> > -DHAVE_FUNC_ATTRIBUTE_CONST=1 -DHAVE_FUNC_ATTRIBUTE_FLATTEN=1 
>>> > -DHAVE_FUNC_ATTRIBUTE_FORMAT=1 -DHAVE_FUNC_ATTRIBUTE_MALLOC=1 
>>> > -DHAVE_FUNC_ATTRIBUTE_PACKED=1 -DHAVE_FUNC_ATTRIBUTE_PURE=1 
>>> > -DHAVE_FUNC_ATTRIBUTE_UNUSED=1 -DHAVE_FUNC_ATTRIBUTE_VISIBILITY=1 
>>> > -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT=1 -DHAVE_FUNC_ATTRIBUTE_WEAK=1 
>>> > -DHAVE_FUNC_ATTRIBUTE_ALIAS=1 -DHAVE_DLADDR=1 -DHAVE_CLOCK_GETTIME=1 
>>> > -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -I. 
>>> > -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS 
>>> > -DDEBUG -DTEXTURE_FLOAT_ENABLED -DUSE_X86_64_ASM -DHAVE_SYS_SYSCTL_H 
>>> > -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_DLOPEN -DHAVE_DL_ITERATE_PHDR 
>>> > -DHAVE_POSIX_MEMALIGN -DHAVE_LIBDRM -DGLX_USE_DRM 
>>> > -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DENABLE_SHADER_CACHE 
>>> > -DHAVE_MINCORE -I../../include -I../../src -I../../src/mapi 
>>> > -I../../src/mesa -I../../src/gallium/include 
>>> > -I../../src/gallium/auxiliary -fvisibility=hidden -Werror=pointer-arith 
>>> > -g -O2 -Wall -std=gnu99 -Werror=implicit-function-declaration 
>>> > -Werror=missing-prototypes -fno-math-errno -fno-trapping-math -MT 
>>> > libmesautil_la-build_id.lo -MD -MP -MF .deps/libmesautil_la-build_id.Tpo 
>>> > -c build_id.c  -fPIC -DPIC -o .libs/libmesautil_la-build_id.o
>>> > In file included from /usr/include/elf_abi.h:31,
>>> >  from /usr/include/link_elf.h:10,
>>> >  from /usr/include/link.h:39,
>>> >  from build_id.c:25:
>>> > /usr/include/sys/exec_elf.h:585: error: expected specifier-qualifier-list 
>>> > before 'uint32_t'
>>> > In file included from /usr/include/link.h:39,
>>> >  from build_id.c:25:
>>> > /usr/include/link_elf.h:22: error: expected specifier-qualifier-list 
>>> > before 'caddr_t'
>>> > /usr/include/link_elf.h:37: error: expected '=', ',', ';', 'asm' or 
>>> > '__attribute__' before 'int'
>>> > In file included from build_id.c:25:
>>> > /usr/include/link.h:49: error: expected '=', ',', ';', 'asm' or 
>>> > '__attribute__' before 'struct'
>>> > /usr/include/link.h:65: error: expected specifier-qualifier-list before 
>>> > 'caddr_t'
>>> These look like issue in your platform code/headers. Perhaps some bad
>>> interaction with the bits that Mesa defines ?
>>>
>>> Quick workaround is to check the function only when needed, roughly
>>> like this pseudo code:
>>>
>>> if test $building_any_vulkan_driver = yes ;then
>>> require_dl...=yes
>>>
>>> fi
>>> 
>>>
>>> if test $require_dl... = yes ; then
>>>AC_CHECK_FUNC([dl_iterate_phdr], [DEFINES="$DEFINES
>>> -DHAVE_DL_ITERATE_PHDR"], [AC_MSG_ERROR([required  not found])])
>>> fi
>>>
>>>
>>> Please give it a bash and send us a patch that works on your end.
>>
>> Leaning towards something along the lines of the following.
>> With Nhdr struct definitions added to system exec_elf.h.
>>
> IMHO it makes little sense to build the file if no code uses it. That aside:

Agreed, but I think this will be used for shader cache as well.

>
>> The need for sys/types.h here may go away shortly as well.
>>
>> diff --git a/src/util/build_id.c b/src/util/build_id.c
>> index 2993a80cfe..92250a1f5f 100644
>> --- a/src/util/build_id.c
>> +++ b/src/util/build_id.c
>> @@ -22,12 +22,22 @@
>>   */
>>
>>  #ifdef HAVE_DL_ITERATE_PHDR
>> +
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>>
>>  #include "build_id.h"

Re: [Mesa-dev] [PATCH 1/2] i965: Add an OUT_BATCH64() macro.

2017-02-17 Thread Matt Turner
On Tue, Feb 14, 2017 at 1:45 PM, Kenneth Graunke  wrote:
> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h 
> b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
> index bf7cadfc4d6..da8f7e561f4 100644
> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
> @@ -161,6 +161,7 @@ intel_batchbuffer_advance(struct brw_context *brw)
>
>  #define OUT_BATCH(d) *__map++ = (d)
>  #define OUT_BATCH_F(f) OUT_BATCH(float_as_int((f)))
> +#define OUT_BATCH64(d) *((uint64_t *) __map) = (d); __map += 2

Does this not generate strict aliasing warnings?
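
Whether the cast actually warns depends on the compiler and flags, but one way to sidestep the question is to do the 64-bit store through memcpy. The helper below is a hypothetical sketch, not part of the patch; `out_batch64` is an invented name and it takes the write pointer explicitly instead of using the `__map` variable the macros rely on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 64-bit value into a buffer of 32-bit dwords
 * and return the advanced write pointer.
 */
static uint32_t *
out_batch64(uint32_t *map, uint64_t value)
{
   assert(map != NULL);

   /* memcpy copies the bytes without accessing the uint32_t storage through
    * a uint64_t lvalue, so no aliasing rule is involved. Optimizing
    * compilers typically turn a fixed-size copy like this into a plain store.
    */
   memcpy(map, &value, sizeof(value));

   /* A 64-bit value occupies two dwords. */
   return map + 2;
}
```

A caller would write `map = out_batch64(map, address);`, which also avoids the two-statement macro body.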
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/3] i965: INTEL_performance_query for pipeline stats

2017-02-17 Thread Lionel Landwerlin

Looks good to me. This series is :

Reviewed-by: Lionel Landwerlin 

On 15/02/17 21:37, Robert Bragg wrote:

To hopefully make progress towards landing support for OA unit metrics exposed
via INTEL_performance_query the idea here is to first just tackle upstreaming
the backend rework with an initial implementation supporting pipeline
statistics.

In case anyone wants to look ahead, my branch with these patches as well as OA
unit support can be found at https://github.com/rib/mesa - wip/rib/oa-next

Regards,
- Robert

Robert Bragg (3):
   Separate INTEL_performance_query frontend
   Model INTEL perf query backend after query object BE
   i965: Implement INTEL_performance_query backend

  src/mapi/glapi/gen/gl_genexec.py  |   1 +
  src/mesa/Makefile.sources |   2 +
  src/mesa/drivers/dri/i965/Makefile.sources|   2 +
  src/mesa/drivers/dri/i965/brw_context.c   |   3 +
  src/mesa/drivers/dri/i965/brw_context.h   |  23 +
  src/mesa/drivers/dri/i965/brw_performance_query.c | 595 
  src/mesa/drivers/dri/i965/brw_performance_query.h |  49 ++
  src/mesa/drivers/dri/i965/intel_extensions.c  |   3 +
  src/mesa/main/context.c   |   3 +
  src/mesa/main/dd.h|  41 ++
  src/mesa/main/mtypes.h|  27 +
  src/mesa/main/performance_monitor.c   | 590 
  src/mesa/main/performance_monitor.h   |  39 --
  src/mesa/main/performance_query.c | 629 ++
  src/mesa/main/performance_query.h |  80 +++
  15 files changed, 1458 insertions(+), 629 deletions(-)
  create mode 100644 src/mesa/drivers/dri/i965/brw_performance_query.c
  create mode 100644 src/mesa/drivers/dri/i965/brw_performance_query.h
  create mode 100644 src/mesa/main/performance_query.c
  create mode 100644 src/mesa/main/performance_query.h





[Mesa-dev] [Bug 99849] Dashed lines (drawn via GLAMOR) are not rendered correctly

2017-02-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99849

Max Staudt  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugzilla.opensuse.org/show_bug.cgi?id=1021803
-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 99849] Dashed lines (drawn via GLAMOR) are not rendered correctly

2017-02-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99849

Max Staudt  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.freedesktop.org/show_bug.cgi?id=99708
-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 99849] Dashed lines (drawn via GLAMOR) are not rendered correctly

2017-02-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99849

Max Staudt  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mse00...@gmail.com,
                   |                            |msta...@suse.de,
                   |                            |sndir...@suse.de

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 99849] Dashed lines (drawn via GLAMOR) are not rendered correctly

2017-02-17 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99849

    Bug ID: 99849
   Summary: Dashed lines (drawn via GLAMOR) are not rendered correctly
   Product: Mesa
   Version: unspecified
  Hardware: x86-64 (AMD64)
        OS: Linux (All)
    Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: msta...@suse.de
QA Contact: mesa-dev@lists.freedesktop.org

This is a continuation of fdo#99708.

Basically, since X server commit d18f5801, dashed lines with zero width are
accelerated through GLAMOR, and thus through OpenGL.

When using Mesa's software renderer, this is drawn as one continuous line.

When using the Intel backend, only the first segment of the dashed line is
drawn.


Please see the original bug report against GLAMOR for screenshots and example
code to trigger this condition:

  https://bugs.freedesktop.org/show_bug.cgi?id=99708


This seems to be a bug in the OpenGL rendering, according to this thread:

  https://lists.x.org/archives/xorg-devel/2017-February/052689.html


The fact that different backends behave differently makes me think so, too.

Any help would be greatly appreciated!


For reference, the bug that originated this analysis was:

  https://bugzilla.opensuse.org/show_bug.cgi?id=1021803

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] Mesa 13.0.5 release candidate

2017-02-17 Thread Emil Velikov
Hello list,

The release candidate for Mesa 13.0.5 is now available. Currently we have:
 - 70 queued
 - 5 nominated (outstanding)
 - and 0 rejected patch(es)


With this series we have:

On the GLX/EGL front we have a GLVND fix for "The Binding of Isaac: Rebirth"
and other games; EGL Wayland buffer age rendering is back to normal.

Over a dozen GLSL patches, addressing multiple CTS/dEQP tests.

A couple of Vulkan WSI entrypoints (vkGetPhysicalDeviceSurfaceFormatsKHR and
vkGetPhysicalDeviceSurfacePresentModesKHR) now correctly handle VK_INCOMPLETE.

The following drivers have received bugfixes: anv, i965, radv, r600, radeonsi
and vc4.

To top it off, multiple corner-case build issues were resolved.


Take a look at section "Mesa stable queue" for more information.


Testing reports/general approval
--------------------------------

Any testing reports (or general approval of the state of the branch) will be
greatly appreciated.

The plan is to have 13.0.5 this Friday (19th of February), around or shortly
after 15:00 GMT.

If you have any questions or suggestions - be that about the current patch
queue or otherwise, please go ahead.


Trivial merge conflicts
-----------------------
commit efe15de5667c62894909dd4f98b9697e23d9af72
Author: Ian Romanick 

glsl: Fix wonkey indentation left from previous commit

(cherry picked from commit 300de78ab17433ea05f39013c9eede6a851bcb24)


commit 3abc96823621f245c34e24ff6e22f9f51f523214
Author: Ian Romanick 

glsl: Track the linearized array index for each UBO instance array element

(cherry picked from commit d56bd07bb3b6821eca961dde15c40f179be99e2d)


commit f4e2c60858eacffc0ec67b2f5716d4a66e66b2bc
Author: Lionel Landwerlin 

spirv: handle undefined components for OpVectorShuffle

(cherry picked from commit bbe8705c579c3e464615a0ca9b2eb4bd3c16aad3)


commit 0683feb18ca4f6010e0714d398d2fc90d3c25413
Author: Eric Anholt 

vc4: Avoid emitting small immediates for UBO indirect load address guards.

(cherry picked from commit b2309393039b2ec0cc00a8e6fd828c60c4ef1e11)


commit 2bdb22fdaaf3ce8146c08dcb65d64318ddb830f5
Author: Jason Ekstrand 

i965/sampler_state: Set the "Base Mip Level" field on Sandy Bridge

(cherry picked from commit c59d1ea51bd0809761094e54c66bf3a200d964ff)


commit c3365b06aca867a2e6c2ae7cd3ba61e3756d081a
Author: Dave Airlie 

radv: change base aligmment for allocated memory.

(cherry picked from commit 06ffd299252311f57feac4474551bd5b44d3d4d4)


Cheers,
Emil


Mesa stable queue
-----------------

Nominated (5)
=============

Bas Nieuwenhuizen (1):
  f448701 radv: Never try to create more than max_sets descriptor sets.

Jason Ekstrand (1):
  a4393bd i965/fs: Fix the inline nir_op_pack_double optimization

Lionel Landwerlin (3):
  4b44ca7 anv: add helper to get vue map for fragment shader
  860d91e anv: set input_slots_valid on brw_wm_prog_key
  a0ac118 i965/fs: fix uninitialized memory access


Queued (70)
===========

Bartosz Tomczyk (1):
  r600: Fix stack overflow

Bruce Cherniak (1):
  swr: [rasterizer core] Remove dead code Clipper::ClipScalar()

Chad Versace (1):
  i965/mt: Disable HiZ when sharing depth buffer externally (v2)

Dave Airlie (3):
  radv: change base aligmment for allocated memory.
  radv: fix cik macroModeIndex.
  radv: adopt some init config workarounds from radeonsi.

Derek Foreman (1):
  egl/dri2: add image_loader_extension back into loader extensions
for wayland

Emil Velikov (24):
  configure.ac: list radeon in --with-vulkan-drivers help string
  i965: automake: correctly set MKDIR_GEN
  freedreno: automake: correctly set MKDIR_GEN
  i965: automake: include builddir prior to srcdir
  i915: automake: include builddir prior to srcdir
  egl: automake: include builddir prior to srcdir
  clover: automake: include builddir prior to srcdir
  st/dri: automake: include builddir prior to srcdir
  d3dadapter9: automake: include builddir prior to srcdir
  glx: automake: include builddir prior to srcdir
  glx/apple: automake: include builddir prior to srcdir
  glx/windows: automake: include builddir prior to srcdir
  loader: automake: include builddir prior to srcdir
  mapi: automake: include builddir prior to srcdir
  radeon, r200: automake: include builddir prior to srcdir
  dri/swrast: automake: include builddir prior to srcdir
  dri/osmesa: automake: include builddir prior to srcdir
  mesa/tests: automake: include builddir prior to srcdir
  bin/get-extra-pick-list: use git merge-base to get the branchpoint
  bin/get-extra-pick-list: rework to use already_picked list
  bin/get-typod-pick-list.sh: limit `git grep ...' to only as needed
  bin/get-pick-list.sh: limit `git grep ...' only as needed
  bin/get-pick-list.sh: 

[Mesa-dev] [PATCH mesa] egl/dri3: implement query surface hook

2017-02-17 Thread Eric Engestrom
From: Brendan King 

This is a DRI3 version of a change made for DRI2
(4d6d4f939e0af4252e0b, "egl/dri2: implement query surface hook"),
that fixed failures in dEQP-EGL.functional.resize.surface_size.grow
and dEQP-EGL.functional.resize.surface_size.shrink.

Cc: Tapani Pälli 
Cc: Mark Janes 
Cc: Chad Versace 
Signed-off-by: Brendan King 
---
 src/egl/drivers/dri2/platform_x11_dri3.c | 20 
 src/loader/loader_dri3_helper.c  | 23 +++
 src/loader/loader_dri3_helper.h  |  2 ++
 3 files changed, 45 insertions(+)

diff --git a/src/egl/drivers/dri2/platform_x11_dri3.c 
b/src/egl/drivers/dri2/platform_x11_dri3.c
index c4747144d1..c4a54431cc 100644
--- a/src/egl/drivers/dri2/platform_x11_dri3.c
+++ b/src/egl/drivers/dri2/platform_x11_dri3.c
@@ -419,6 +419,25 @@ dri3_query_buffer_age(_EGLDriver *drv, _EGLDisplay *dpy, 
_EGLSurface *surf)
return loader_dri3_query_buffer_age(&dri3_surf->loader_drawable);
 }
 
+static EGLBoolean
+dri3_query_surface(_EGLDriver *drv, _EGLDisplay *dpy,
+   _EGLSurface *surf, EGLint attribute,
+   EGLint *value)
+{
+   struct dri3_egl_surface *dri3_surf = dri3_egl_surface(surf);
+
+   switch (attribute) {
+   case EGL_WIDTH:
+   case EGL_HEIGHT:
+  loader_dri3_update_drawable_geometry(&dri3_surf->loader_drawable);
+  break;
+   default:
+  break;
+   }
+
+   return _eglQuerySurface(drv, dpy, surf, attribute, value);
+}
+
 static __DRIdrawable *
 dri3_get_dri_drawable(_EGLSurface *surf)
 {
@@ -441,6 +460,7 @@ struct dri2_egl_display_vtbl dri3_x11_display_vtbl = {
.post_sub_buffer = dri2_fallback_post_sub_buffer,
.copy_buffers = dri3_copy_buffers,
.query_buffer_age = dri3_query_buffer_age,
+   .query_surface = dri3_query_surface,
.create_wayland_buffer_from_image = 
dri2_fallback_create_wayland_buffer_from_image,
.get_sync_values = dri3_get_sync_values,
.get_dri_drawable = dri3_get_dri_drawable,
diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
index 6e5d1b8843..493a7f5218 100644
--- a/src/loader/loader_dri3_helper.c
+++ b/src/loader/loader_dri3_helper.c
@@ -1408,3 +1408,26 @@ loader_dri3_get_buffers(__DRIdrawable *driDrawable,
 
return true;
 }
+
+/** loader_dri3_update_drawable_geometry
+ *
+ * Get the current drawable geometry.
+ */
+void
+loader_dri3_update_drawable_geometry(struct loader_dri3_drawable *draw)
+{
+   xcb_get_geometry_cookie_t geom_cookie;
+   xcb_get_geometry_reply_t *geom_reply;
+
+   geom_cookie = xcb_get_geometry(draw->conn, draw->drawable);
+
+   geom_reply = xcb_get_geometry_reply(draw->conn, geom_cookie, NULL);
+
+   if (geom_reply) {
+  draw->width = geom_reply->width;
+  draw->height = geom_reply->height;
+  draw->vtable->set_drawable_size(draw, draw->width, draw->height);
+
+  free(geom_reply);
+   }
+}
diff --git a/src/loader/loader_dri3_helper.h b/src/loader/loader_dri3_helper.h
index 1d1f15ebb9..a865e46355 100644
--- a/src/loader/loader_dri3_helper.h
+++ b/src/loader/loader_dri3_helper.h
@@ -239,4 +239,6 @@ loader_dri3_get_buffers(__DRIdrawable *driDrawable,
 uint32_t buffer_mask,
 struct __DRIimageList *buffers);
 
+void
+loader_dri3_update_drawable_geometry(struct loader_dri3_drawable *draw);
 #endif
-- 
Cheers,
  Eric



Re: [Mesa-dev] [PATCH 2/5] freedreno: add support for user index buffers

2017-02-17 Thread Marek Olšák
On Fri, Feb 17, 2017 at 2:14 PM, Rob Clark  wrote:
> Haven't had a chance to try this yet, but looks reasonable.  But how
> common would it be that a single indexbuf gets re-used for multiple
> draws?  I wonder if it would be better to just do the upload in
> set_index_buffer() instead?

No, we need draw_info->count for the upload.

>
> Also, I might have missed some discussion (have been buried in kernel
> stuff lately so haven't been following mesa-dev so much), but what was
> the motivation to remove IB upload support from mesa/st?

My motivation was to reduce the amount of work done in st_draw_vbo.

>
> (Oh, and in patch 1/5, I guess you meant to split the u_helpers and
> etnaviv into two separate patches?  not sure if you accidentally
> squashed them together?)

That was on purpose.

Marek
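
The point about draw_info->count can be illustrated with a small CPU-only
sketch. This is not Mesa code: upload_index_range(), struct draw_range and the
flat byte array standing in for the upload buffer are hypothetical names. It
assumes, as the u_helpers change in patch 1/5 appears to do, that only the
indices in [start, start + count) are copied, and that the returned offset is
biased by start * index_size so the draw can keep using its original start.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of the draw parameters that matter here. */
struct draw_range {
   unsigned start; /* first index used by the draw */
   unsigned count; /* number of indices used by the draw */
};

/*
 * Copy only the indices the draw reads into `dst`, at or after `free_offset`.
 * Returns the biased buffer offset (so that offset + start * index_size
 * points at the first copied index), or -1 if the range does not fit.
 */
static long
upload_index_range(uint8_t *dst, size_t dst_size, size_t free_offset,
                   const void *user_indices, unsigned index_size,
                   struct draw_range draw)
{
   size_t start_offset = (size_t)draw.start * index_size;
   size_t size = (size_t)draw.count * index_size;

   /* Place the data no earlier than start_offset so the biased offset
    * computed below cannot become negative, then align to 4 bytes. */
   size_t out = free_offset > start_offset ? free_offset : start_offset;
   out = (out + 3) & ~(size_t)3;

   if (out + size > dst_size)
      return -1;

   memcpy(dst + out, (const uint8_t *)user_indices + start_offset, size);
   return (long)(out - start_offset);
}
```

Without the count there is no way to know how many bytes to copy at
set_index_buffer() time, which is why the upload is sketched here as
happening per draw.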


Re: [Mesa-dev] [PATCH 1/2] util: Add utility build-id code.

2017-02-17 Thread Emil Velikov
On 17 February 2017 at 01:10, Jonathan Gray  wrote:
> On Thu, Feb 16, 2017 at 04:25:02PM +, Emil Velikov wrote:
>> On 16 February 2017 at 14:23, Jonathan Gray  wrote:
>> > On Wed, Feb 15, 2017 at 11:11:50AM -0800, Matt Turner wrote:
>> >> Provides the ability to read the .note.gnu.build-id section of ELF
>> >> binaries, which is inserted by the --build-id=... flag to ld.
>> >>
>> >> Reviewed-by: Emil Velikov 
>> >
>> > I don't have time to dig into details right now but this broke the Mesa
>> > build on OpenBSD and likely other non-linux platforms:
>> >
>> > libtool: compile:  gcc -DPACKAGE_NAME=\"Mesa\" -DPACKAGE_TARNAME=\"mesa\" 
>> > -DPACKAGE_VERSION=\"17.1.0-devel\" "-DPACKAGE_STRING=\"Mesa 
>> > 17.1.0-devel\"" 
>> > "-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\";
>> >  -DPACKAGE_URL=\"\" -DPACKAGE=\"mesa\" -DVERSION=\"17.1.0-devel\" 
>> > -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 
>> > -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
>> > -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 
>> > -DLT_OBJDIR=\".libs/\" -DYYTEXT_POINTER=1 -DHAVE___BUILTIN_CLZ=1 
>> > -DHAVE___BUILTIN_CLZLL=1 -DHAVE___BUILTIN_CTZ=1 -DHAVE___BUILTIN_EXPECT=1 
>> > -DHAVE___BUILTIN_FFS=1 -DHAVE___BUILTIN_FFSLL=1 
>> > -DHAVE___BUILTIN_POPCOUNT=1 -DHAVE___BUILTIN_POPCOUNTLL=1 
>> > -DHAVE_FUNC_ATTRIBUTE_CONST=1 -DHAVE_FUNC_ATTRIBUTE_FLATTEN=1 
>> > -DHAVE_FUNC_ATTRIBUTE_FORMAT=1 -DHAVE_FUNC_ATTRIBUTE_MALLOC=1 
>> > -DHAVE_FUNC_ATTRIBUTE_PACKED=1 -DHAVE_FUNC_ATTRIBUTE_PURE=1 
>> > -DHAVE_FUNC_ATTRIBUTE_UNUSED=1 -DHAVE_FUNC_ATTRIBUTE_VISIBILITY=1 
>> > -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT=1 -DHAVE_FUNC_ATTRIBUTE_WEAK=1 
>> > -DHAVE_FUNC_ATTRIBUTE_ALIAS=1 -DHAVE_DLADDR=1 -DHAVE_CLOCK_GETTIME=1 
>> > -DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -I. 
>> > -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS 
>> > -DDEBUG -DTEXTURE_FLOAT_ENABLED -DUSE_X86_64_ASM -DHAVE_SYS_SYSCTL_H 
>> > -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_DLOPEN -DHAVE_DL_ITERATE_PHDR 
>> > -DHAVE_POSIX_MEMALIGN -DHAVE_LIBDRM -DGLX_USE_DRM -DGLX_INDIRECT_RENDERING 
>> > -DGLX_DIRECT_RENDERING -DENABLE_SHADER_CACHE -DHAVE_MINCORE 
>> > -I../../include -I../../src -I../../src/mapi -I../../src/mesa 
>> > -I../../src/gallium/include -I../../src/gallium/auxiliary 
>> > -fvisibility=hidden -Werror=pointer-arith -g -O2 -Wall -std=gnu99 
>> > -Werror=implicit-function-declaration -Werror=missing-prototypes 
>> > -fno-math-errno -fno-trapping-math -MT libmesautil_la-build_id.lo -MD -MP 
>> > -MF .deps/libmesautil_la-build_id.Tpo -c build_id.c  -fPIC -DPIC -o 
>> > .libs/libmesautil_la-build_id.o
>> > In file included from /usr/include/elf_abi.h:31,
>> >  from /usr/include/link_elf.h:10,
>> >  from /usr/include/link.h:39,
>> >  from build_id.c:25:
>> > /usr/include/sys/exec_elf.h:585: error: expected specifier-qualifier-list 
>> > before 'uint32_t'
>> > In file included from /usr/include/link.h:39,
>> >  from build_id.c:25:
>> > /usr/include/link_elf.h:22: error: expected specifier-qualifier-list 
>> > before 'caddr_t'
>> > /usr/include/link_elf.h:37: error: expected '=', ',', ';', 'asm' or 
>> > '__attribute__' before 'int'
>> > In file included from build_id.c:25:
>> > /usr/include/link.h:49: error: expected '=', ',', ';', 'asm' or 
>> > '__attribute__' before 'struct'
>> > /usr/include/link.h:65: error: expected specifier-qualifier-list before 
>> > 'caddr_t'
>> These look like issue in your platform code/headers. Perhaps some bad
>> interaction with the bits that Mesa defines ?
>>
>> Quick workaround is to check the function only when needed, roughly
>> like this pseudo code:
>>
>> if test $building_any_vulkan_driver = yes ;then
>> require_dl...=yes
>>
>> fi
>> 
>>
>> if test $require_dl... = yes ; then
>>AC_CHECK_FUNC([dl_iterate_phdr], [DEFINES="$DEFINES
>> -DHAVE_DL_ITERATE_PHDR"], [AC_MSG_ERROR([required  not found])])
>> fi
>>
>>
>> Please give it a bash and send us a patch that works on your end.
>
> Leaning towards something along the lines of the following.
> With Nhdr struct definitions added to system exec_elf.h.
>
IMHO it makes little sense to build the file if no code uses it. That aside:

> The need for sys/types.h here may go away shortly as well.
>
> diff --git a/src/util/build_id.c b/src/util/build_id.c
> index 2993a80cfe..92250a1f5f 100644
> --- a/src/util/build_id.c
> +++ b/src/util/build_id.c
> @@ -22,12 +22,22 @@
>   */
>
>  #ifdef HAVE_DL_ITERATE_PHDR
> +
> +#include <sys/types.h>
>  #include <link.h>
>  #include 
>  #include 
>
>  #include "build_id.h"
>
> +#ifndef NT_GNU_BUILD_ID
> +#define NT_GNU_BUILD_ID 3
> +#endif
> +
> +#ifndef ElfW
> +#define ElfW(type) Elf_##type
> +#endif
> +
AFAICT the ElfW macro is a Linux/Solaris thing and is missing from
OpenBSD/FreeBSD. So we do want this, but I have 

[Mesa-dev] [PATCH] ac/llvm: fix various findMSB bugs

2017-02-17 Thread Marek Olšák
From: Marek Olšák 

sffbh needs to be suffixed with ".i32"
---
 src/amd/common/ac_llvm_build.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 5398f07..2f25b14 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -762,21 +762,22 @@ ac_emit_sendmsg(struct ac_llvm_context *ctx,
args[1] = wave_id;
ac_emit_llvm_intrinsic(ctx, intr_name, ctx->voidt,
   args, 2, 0);
 }
 
 LLVMValueRef
 ac_emit_imsb(struct ac_llvm_context *ctx,
 LLVMValueRef arg,
 LLVMTypeRef dst_type)
 {
-   const char *intr_name = (HAVE_LLVM < 0x0400) ? "llvm.AMDGPU.flbit.i32" 
: "llvm.amdgcn.sffbh";
+   const char *intr_name = (HAVE_LLVM < 0x0400) ? "llvm.AMDGPU.flbit.i32" :
+  "llvm.amdgcn.sffbh.i32";
LLVMValueRef msb = ac_emit_llvm_intrinsic(ctx, intr_name,
  dst_type, &arg, 1,
  AC_FUNC_ATTR_READNONE);
 
/* The HW returns the last bit index from MSB, but NIR/TGSI wants
 * the index from LSB. Invert it by doing "31 - msb". */
msb = LLVMBuildSub(ctx->builder, LLVMConstInt(ctx->i32, 31, false),
   msb, "");
 
LLVMValueRef all_ones = LLVMConstInt(ctx->i32, -1, true);
@@ -789,21 +790,21 @@ ac_emit_imsb(struct ac_llvm_context *ctx,
return LLVMBuildSelect(ctx->builder, cond, all_ones, msb, "");
 }
 
 LLVMValueRef
 ac_emit_umsb(struct ac_llvm_context *ctx,
 LLVMValueRef arg,
 LLVMTypeRef dst_type)
 {
LLVMValueRef args[2] = {
arg,
-   LLVMConstInt(ctx->i32, 1, 0),
+   LLVMConstInt(ctx->i1, 1, 0),
};
LLVMValueRef msb = ac_emit_llvm_intrinsic(ctx, "llvm.ctlz.i32",
  dst_type, args, 
ARRAY_SIZE(args),
  AC_FUNC_ATTR_READNONE);
 
/* The HW returns the last bit index from MSB, but TGSI/NIR wants
 * the index from LSB. Invert it by doing "31 - msb". */
msb = LLVMBuildSub(ctx->builder, LLVMConstInt(ctx->i32, 31, false),
   msb, "");
 
-- 
2.7.4
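
The "31 - msb" inversion described in the patch comments can be checked on the
CPU. The following is a minimal sketch, not the ac/llvm code: clz32() is a
hypothetical portable helper standing in for llvm.ctlz.i32, and the signed
variant assumes GLSL findMSB semantics (for negative inputs, the most
significant 0 bit; -1 for inputs 0 and -1).

```c
#include <stdint.h>

/* Count leading zeros, counted from the MSB; returns 32 for x == 0.
 * Hypothetical helper used in place of a hardware or LLVM intrinsic. */
static int
clz32(uint32_t x)
{
   int n = 0;

   if (x == 0)
      return 32;
   while (!(x & 0x80000000u)) {
      x <<= 1;
      n++;
   }
   return n;
}

/* Unsigned findMSB: index of the highest set bit counted from the LSB,
 * or -1 when no bit is set. */
static int
find_msb_unsigned(uint32_t x)
{
   return x ? 31 - clz32(x) : -1;
}

/* Signed findMSB: for negative values look for the highest 0 bit instead,
 * which is the highest 1 bit of the complement. */
static int
find_msb_signed(int32_t x)
{
   uint32_t u = (uint32_t)x;

   if (x < 0)
      u = ~u;
   return find_msb_unsigned(u);
}
```

The explicit zero check plays the role of the select against all_ones in the
patch: the raw count is only meaningful when some bit differs from the sign.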



Re: [Mesa-dev] [PATCH 2/5] freedreno: add support for user index buffers

2017-02-17 Thread Rob Clark
Haven't had a chance to try this yet, but looks reasonable.  But how
common would it be that a single indexbuf gets re-used for multiple
draws?  I wonder if it would be better to just do the upload in
set_index_buffer() instead?

Also, I might have missed some discussion (have been buried in kernel
stuff lately so haven't been following mesa-dev so much), but what was
the motivation to remove IB upload support from mesa/st?

(Oh, and in patch 1/5, I guess you meant to split the u_helpers and
etnaviv into two separate patches?  not sure if you accidentally
squashed them together?)


BR,
-R

On Fri, Feb 17, 2017 at 5:27 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> ---
>  src/gallium/drivers/freedreno/freedreno_draw.c   | 13 +
>  src/gallium/drivers/freedreno/freedreno_screen.c |  2 +-
>  2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/freedreno/freedreno_draw.c 
> b/src/gallium/drivers/freedreno/freedreno_draw.c
> index cfe13cd..cb4c063 100644
> --- a/src/gallium/drivers/freedreno/freedreno_draw.c
> +++ b/src/gallium/drivers/freedreno/freedreno_draw.c
> @@ -24,20 +24,21 @@
>   *
>   * Authors:
>   *Rob Clark 
>   */
>
>  #include "pipe/p_state.h"
>  #include "util/u_string.h"
>  #include "util/u_memory.h"
>  #include "util/u_prim.h"
>  #include "util/u_format.h"
> +#include "util/u_helpers.h"
>
>  #include "freedreno_draw.h"
>  #include "freedreno_context.h"
>  #include "freedreno_state.h"
>  #include "freedreno_resource.h"
>  #include "freedreno_query_hw.h"
>  #include "freedreno_util.h"
>
>  static void
>  resource_read(struct fd_batch *batch, struct pipe_resource *prsc)
> @@ -77,20 +78,29 @@ fd_draw_vbo(struct pipe_context *pctx, const struct 
> pipe_draw_info *info)
> /* emulate unsupported primitives: */
> if (!fd_supported_prim(ctx, info->mode)) {
> if (ctx->streamout.num_targets > 0)
> debug_error("stream-out with emulated prims");
> util_primconvert_save_index_buffer(ctx->primconvert, 
> &ctx->indexbuf);
> util_primconvert_save_rasterizer_state(ctx->primconvert, 
> ctx->rasterizer);
> util_primconvert_draw_vbo(ctx->primconvert, info);
> return;
> }
>
> +   /* Upload a user index buffer. */
> +   struct pipe_index_buffer ibuffer_saved = {};
> +   if (info->indexed && ctx->indexbuf.user_buffer &&
> +   !util_save_and_upload_index_buffer(pctx, info,
> +  &ctx->indexbuf.user_buffer,
> +  &ibuffer_saved)) {
> +   return;
> +   }
> +
> if (ctx->in_blit) {
> fd_batch_reset(batch);
> ctx->dirty = ~0;
> }
>
> batch->blit = ctx->in_blit;
> batch->back_blit = ctx->in_shadow;
>
> /* NOTE: needs to be before resource_written(batch->query_buf), 
> otherwise
>  * query_buf may not be created yet.
> @@ -194,20 +204,23 @@ fd_draw_vbo(struct pipe_context *pctx, const struct 
> pipe_draw_info *info)
> if (ctx->draw_vbo(ctx, info))
> batch->needs_flush = true;
>
> for (i = 0; i < ctx->streamout.num_targets; i++)
> ctx->streamout.offsets[i] += info->count;
>
> if (fd_mesa_debug & FD_DBG_DDRAW)
> ctx->dirty = 0x;
>
> fd_batch_check_size(batch);
> +
> +   if (info->indexed && ibuffer_saved.user_buffer)
> +   pctx->set_index_buffer(pctx, &ibuffer_saved);
>  }
>
>  /* Generic clear implementation (partially) using u_blitter: */
>  static void
>  fd_blitter_clear(struct pipe_context *pctx, unsigned buffers,
> const union pipe_color_union *color, double depth, unsigned 
> stencil)
>  {
> struct fd_context *ctx = fd_context(pctx);
> struct pipe_framebuffer_state *pfb = &ctx->batch->framebuffer;
> struct blitter_context *blitter = ctx->blitter;
> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
> b/src/gallium/drivers/freedreno/freedreno_screen.c
> index 1122e29..e1b95a6 100644
> --- a/src/gallium/drivers/freedreno/freedreno_screen.c
> +++ b/src/gallium/drivers/freedreno/freedreno_screen.c
> @@ -172,20 +172,21 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
> case PIPE_CAP_SEAMLESS_CUBE_MAP:
> case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
> case PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION:
> case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
> case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
> case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
> case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
> case PIPE_CAP_STRING_MARKER:
> case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
> +   case 

Re: [Mesa-dev] [PATCH 1/7] i965: Drop dead Gen8+ code from Gen7/sometimes-HSW driver hooks.

2017-02-17 Thread Iago Toral
And patches 6-7 are also:

Reviewed-by: Iago Toral Quiroga 

On Fri, 2017-02-17 at 13:21 +0100, Iago Toral wrote:
> Patches 1-5 are:
> 
> Reviewed-by: Iago Toral Quiroga 
> 
> On Fri, 2017-02-17 at 01:56 -0800, Kenneth Graunke wrote:
> > 
> > These driver hooks are not used when MI_MATH and
> > MI_LOAD_REGISTER_REG
> > are supported, which Gen8+ can always do.  So this code is dead.
> > 
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/mesa/drivers/dri/i965/gen7_sol_state.c | 50 ++
> > 
> >  1 file changed, 24 insertions(+), 26 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > index e6b79ed2342..50631610e51 100644
> > --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> > @@ -490,13 +490,11 @@ gen7_begin_transform_feedback(struct
> > gl_context
> > *ctx, GLenum mode,
> > struct brw_transform_feedback_object *brw_obj =
> >    (struct brw_transform_feedback_object *) obj;
> >  
> > +   assert(brw->gen == 7);
> > +
> > /* Reset the SO buffer offsets to 0. */
> > -   if (brw->gen >= 8) {
> > -  brw_obj->zero_offsets = true;
> > -   } else {
> > -  intel_batchbuffer_flush(brw);
> > -  brw->batch.needs_sol_reset = true;
> > -   }
> > +   intel_batchbuffer_flush(brw);
> > +   brw->batch.needs_sol_reset = true;
> >  
> > /* We're about to lose the information needed to compute the
> > number of
> >  * vertices written during the last Begin/EndTransformFeedback
> > section,
> > @@ -552,17 +550,17 @@ gen7_pause_transform_feedback(struct
> > gl_context
> > *ctx,
> > /* Flush any drawing so that the counters have the right
> > values.
> > */
> > brw_emit_mi_flush(brw);
> >  
> > +   assert(brw->gen == 7);
> > +
> > /* Save the SOL buffer offset register values. */
> > -   if (brw->gen < 8) {
> > -  for (int i = 0; i < 4; i++) {
> > - BEGIN_BATCH(3);
> > - OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2));
> > - OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
> > - OUT_RELOC(brw_obj->offset_bo,
> > -   I915_GEM_DOMAIN_INSTRUCTION,
> > I915_GEM_DOMAIN_INSTRUCTION,
> > -   i * sizeof(uint32_t));
> > - ADVANCE_BATCH();
> > -  }
> > +   for (int i = 0; i < 4; i++) {
> > +  BEGIN_BATCH(3);
> > +  OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2));
> > +  OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
> > +  OUT_RELOC(brw_obj->offset_bo,
> > +I915_GEM_DOMAIN_INSTRUCTION,
> > I915_GEM_DOMAIN_INSTRUCTION,
> > +i * sizeof(uint32_t));
> > +  ADVANCE_BATCH();
> > }
> >  
> > /* Store the temporary ending value of the SO_NUM_PRIMS_WRITTEN
> > counters.
> > @@ -581,17 +579,17 @@ gen7_resume_transform_feedback(struct
> > gl_context *ctx,
> > struct brw_transform_feedback_object *brw_obj =
> >    (struct brw_transform_feedback_object *) obj;
> >  
> > +   assert(brw->gen == 7);
> > +
> > /* Reload the SOL buffer offset registers. */
> > -   if (brw->gen < 8) {
> > -  for (int i = 0; i < 4; i++) {
> > - BEGIN_BATCH(3);
> > - OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
> > - OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
> > - OUT_RELOC(brw_obj->offset_bo,
> > -   I915_GEM_DOMAIN_INSTRUCTION,
> > I915_GEM_DOMAIN_INSTRUCTION,
> > -   i * sizeof(uint32_t));
> > - ADVANCE_BATCH();
> > -  }
> > +   for (int i = 0; i < 4; i++) {
> > +  BEGIN_BATCH(3);
> > +  OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
> > +  OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
> > +  OUT_RELOC(brw_obj->offset_bo,
> > +I915_GEM_DOMAIN_INSTRUCTION,
> > I915_GEM_DOMAIN_INSTRUCTION,
> > +i * sizeof(uint32_t));
> > +  ADVANCE_BATCH();
> > }
> >  
> > /* Store the new starting value of the SO_NUM_PRIMS_WRITTEN
> > counters. */


Re: [Mesa-dev] [PATCH 1/5] etnaviv: add support for user index buffers

2017-02-17 Thread Marek Olšák
On Fri, Feb 17, 2017 at 11:27 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> ---
>  src/gallium/auxiliary/util/u_helpers.c| 29 
> +++
>  src/gallium/auxiliary/util/u_helpers.h|  5 +
>  src/gallium/drivers/etnaviv/etnaviv_context.c | 12 +++
>  src/gallium/drivers/etnaviv/etnaviv_screen.c  |  2 +-
>  4 files changed, 47 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/util/u_helpers.c 
> b/src/gallium/auxiliary/util/u_helpers.c
> index 09020b0..85e7fb0 100644
> --- a/src/gallium/auxiliary/util/u_helpers.c
> +++ b/src/gallium/auxiliary/util/u_helpers.c
> @@ -20,20 +20,21 @@
>   * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
>   * IN NO EVENT SHALL THE AUTHORS AND/OR THEIR SUPPLIERS BE LIABLE FOR
>   * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
>   * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
>   * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>   *
>   **/
>
>  #include "util/u_helpers.h"
>  #include "util/u_inlines.h"
> +#include "util/u_upload_mgr.h"
>
>  /**
>   * This function is used to copy an array of pipe_vertex_buffer structures,
>   * while properly referencing the pipe_vertex_buffer::buffer member.
>   *
>   * enabled_buffers is updated such that the bits corresponding to the indices
>   * of disabled buffers are set to 0 and the enabled ones are set to 1.
>   *
>   * \sa util_copy_framebuffer_state
>   */
> @@ -102,10 +103,38 @@ util_set_index_buffer(struct pipe_index_buffer *dst,
>  {
> if (src) {
>pipe_resource_reference(&dst->buffer, src->buffer);
>memcpy(dst, src, sizeof(*dst));
> }
> else {
>pipe_resource_reference(&dst->buffer, NULL);
>memset(dst, 0, sizeof(*dst));
> }
>  }
> +
> +/**
> + * Given a user index buffer, save the structure to "saved", and upload it.
> + */
> +bool
> +util_save_and_upload_index_buffer(struct pipe_context *pipe,
> +  const struct pipe_draw_info *info,
> +  const struct pipe_index_buffer *ib,
> +  struct pipe_index_buffer *out_saved)
> +{
> +   struct pipe_index_buffer new_ib = {0};
> +   unsigned start_offset = info->start * ib->index_size;
> +
> +   u_upload_data(pipe->stream_uploader, start_offset,
> + info->count * ib->index_size, 4,
> + (char*)ib->user_buffer + start_offset,
> + &new_ib.offset, &new_ib.buffer);
> +   if (!ib->buffer)
> +  return false;
> +   u_upload_unmap(pipe->stream_uploader);
> +
> +   new_ib.offset -= start_offset;
> +   new_ib.index_size = ib->index_size;
> +
> +   util_set_index_buffer(out_saved, ib);
> +   pipe->set_index_buffer(pipe, &new_ib);
> +   return true;
> +}
> diff --git a/src/gallium/auxiliary/util/u_helpers.h 
> b/src/gallium/auxiliary/util/u_helpers.h
> index a9a53e4..7de960b 100644
> --- a/src/gallium/auxiliary/util/u_helpers.h
> +++ b/src/gallium/auxiliary/util/u_helpers.h
> @@ -40,15 +40,20 @@ void util_set_vertex_buffers_mask(struct 
> pipe_vertex_buffer *dst,
>unsigned start_slot, unsigned count);
>
>  void util_set_vertex_buffers_count(struct pipe_vertex_buffer *dst,
> unsigned *dst_count,
> const struct pipe_vertex_buffer *src,
> unsigned start_slot, unsigned count);
>
>  void util_set_index_buffer(struct pipe_index_buffer *dst,
> const struct pipe_index_buffer *src);
>
> +bool util_save_and_upload_index_buffer(struct pipe_context *pipe,
> +   const struct pipe_draw_info *info,
> +   const struct pipe_index_buffer *ib,
> +   struct pipe_index_buffer *out_saved);
> +
>  #ifdef __cplusplus
>  }
>  #endif
>
>  #endif
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.c 
> b/src/gallium/drivers/etnaviv/etnaviv_context.c
> index 62297a0..d22939a 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_context.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_context.c
> @@ -40,20 +40,21 @@
>  #include "etnaviv_state.h"
>  #include "etnaviv_surface.h"
>  #include "etnaviv_texture.h"
>  #include "etnaviv_transfer.h"
>  #include "etnaviv_translate.h"
>  #include "etnaviv_zsa.h"
>
>  #include "pipe/p_context.h"
>  #include "pipe/p_state.h"
>  #include "util/u_blitter.h"
> +#include "util/u_helpers.h"
>  #include "util/u_memory.h"
>  #include "util/u_prim.h"
>  #include "util/u_upload_mgr.h"
>
>  #include "hw/common.xml.h"
>
>  static void
>  etna_context_destroy(struct pipe_context *pctx)
>  {
> struct etna_context *ctx = etna_context(pctx);
> @@ -130,20 +131,29 @@ etna_draw_vbo(struct pipe_context *pctx, 

Re: [Mesa-dev] [PATCH 2/5] freedreno: add support for user index buffers

2017-02-17 Thread Marek Olšák
On Fri, Feb 17, 2017 at 11:27 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> ---
>  src/gallium/drivers/freedreno/freedreno_draw.c   | 13 +
>  src/gallium/drivers/freedreno/freedreno_screen.c |  2 +-
>  2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/freedreno/freedreno_draw.c 
> b/src/gallium/drivers/freedreno/freedreno_draw.c
> index cfe13cd..cb4c063 100644
> --- a/src/gallium/drivers/freedreno/freedreno_draw.c
> +++ b/src/gallium/drivers/freedreno/freedreno_draw.c
> @@ -24,20 +24,21 @@
>   *
>   * Authors:
>   *Rob Clark 
>   */
>
>  #include "pipe/p_state.h"
>  #include "util/u_string.h"
>  #include "util/u_memory.h"
>  #include "util/u_prim.h"
>  #include "util/u_format.h"
> +#include "util/u_helpers.h"
>
>  #include "freedreno_draw.h"
>  #include "freedreno_context.h"
>  #include "freedreno_state.h"
>  #include "freedreno_resource.h"
>  #include "freedreno_query_hw.h"
>  #include "freedreno_util.h"
>
>  static void
>  resource_read(struct fd_batch *batch, struct pipe_resource *prsc)
> @@ -77,20 +78,29 @@ fd_draw_vbo(struct pipe_context *pctx, const struct 
> pipe_draw_info *info)
> /* emulate unsupported primitives: */
> if (!fd_supported_prim(ctx, info->mode)) {
> if (ctx->streamout.num_targets > 0)
> debug_error("stream-out with emulated prims");
> util_primconvert_save_index_buffer(ctx->primconvert, 
> &ctx->indexbuf);
> util_primconvert_save_rasterizer_state(ctx->primconvert, 
> ctx->rasterizer);
> util_primconvert_draw_vbo(ctx->primconvert, info);
> return;
> }
>
> +   /* Upload a user index buffer. */
> +   struct pipe_index_buffer ibuffer_saved = {};
> +   if (info->indexed && ctx->indexbuf.user_buffer &&
> +   !util_save_and_upload_index_buffer(pctx, info,
> +  &ctx->indexbuf.user_buffer,

This should be &ctx->indexbuf. Fixed locally.

Marek


Re: [Mesa-dev] [PATCH 1/7] i965: Drop dead Gen8+ code from Gen7/sometimes-HSW driver hooks.

2017-02-17 Thread Iago Toral
Patches 1-5 are:

Reviewed-by: Iago Toral Quiroga 

On Fri, 2017-02-17 at 01:56 -0800, Kenneth Graunke wrote:
> These driver hooks are not used when MI_MATH and MI_LOAD_REGISTER_REG
> are supported, which Gen8+ can always do.  So this code is dead.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/gen7_sol_state.c | 50 ++
> 
>  1 file changed, 24 insertions(+), 26 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> index e6b79ed2342..50631610e51 100644
> --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
> @@ -490,13 +490,11 @@ gen7_begin_transform_feedback(struct gl_context
> *ctx, GLenum mode,
> struct brw_transform_feedback_object *brw_obj =
>    (struct brw_transform_feedback_object *) obj;
>  
> +   assert(brw->gen == 7);
> +
> /* Reset the SO buffer offsets to 0. */
> -   if (brw->gen >= 8) {
> -  brw_obj->zero_offsets = true;
> -   } else {
> -  intel_batchbuffer_flush(brw);
> -  brw->batch.needs_sol_reset = true;
> -   }
> +   intel_batchbuffer_flush(brw);
> +   brw->batch.needs_sol_reset = true;
>  
> /* We're about to lose the information needed to compute the
> number of
>  * vertices written during the last Begin/EndTransformFeedback
> section,
> @@ -552,17 +550,17 @@ gen7_pause_transform_feedback(struct gl_context
> *ctx,
> /* Flush any drawing so that the counters have the right values.
> */
> brw_emit_mi_flush(brw);
>  
> +   assert(brw->gen == 7);
> +
> /* Save the SOL buffer offset register values. */
> -   if (brw->gen < 8) {
> -  for (int i = 0; i < 4; i++) {
> - BEGIN_BATCH(3);
> - OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2));
> - OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
> - OUT_RELOC(brw_obj->offset_bo,
> -   I915_GEM_DOMAIN_INSTRUCTION,
> I915_GEM_DOMAIN_INSTRUCTION,
> -   i * sizeof(uint32_t));
> - ADVANCE_BATCH();
> -  }
> +   for (int i = 0; i < 4; i++) {
> +  BEGIN_BATCH(3);
> +  OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2));
> +  OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
> +  OUT_RELOC(brw_obj->offset_bo,
> +I915_GEM_DOMAIN_INSTRUCTION,
> I915_GEM_DOMAIN_INSTRUCTION,
> +i * sizeof(uint32_t));
> +  ADVANCE_BATCH();
> }
>  
> /* Store the temporary ending value of the SO_NUM_PRIMS_WRITTEN
> counters.
> @@ -581,17 +579,17 @@ gen7_resume_transform_feedback(struct
> gl_context *ctx,
> struct brw_transform_feedback_object *brw_obj =
>    (struct brw_transform_feedback_object *) obj;
>  
> +   assert(brw->gen == 7);
> +
> /* Reload the SOL buffer offset registers. */
> -   if (brw->gen < 8) {
> -  for (int i = 0; i < 4; i++) {
> - BEGIN_BATCH(3);
> - OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
> - OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
> - OUT_RELOC(brw_obj->offset_bo,
> -   I915_GEM_DOMAIN_INSTRUCTION,
> I915_GEM_DOMAIN_INSTRUCTION,
> -   i * sizeof(uint32_t));
> - ADVANCE_BATCH();
> -  }
> +   for (int i = 0; i < 4; i++) {
> +  BEGIN_BATCH(3);
> +  OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
> +  OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
> +  OUT_RELOC(brw_obj->offset_bo,
> +I915_GEM_DOMAIN_INSTRUCTION,
> I915_GEM_DOMAIN_INSTRUCTION,
> +i * sizeof(uint32_t));
> +  ADVANCE_BATCH();
> }
>  
> /* Store the new starting value of the SO_NUM_PRIMS_WRITTEN
> counters. */


[Mesa-dev] [PATCH 2/2] gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionally

2017-02-17 Thread Marek Olšák
From: Marek Olšák 

It's OK for r300g (because r300g can't write to buffers via the GPU), but
not later hardware. This issue was spotted randomly.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/auxiliary/util/u_index_modify.c  | 9 ++---
 src/gallium/auxiliary/util/u_index_modify.h  | 3 +++
 src/gallium/drivers/r300/r300_render_translate.c | 4 +++-
 src/gallium/drivers/r600/r600_state_common.c | 2 +-
 src/gallium/drivers/radeonsi/si_state_draw.c | 2 +-
 5 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_index_modify.c 
b/src/gallium/auxiliary/util/u_index_modify.c
index 5c4fc3c..7b072b2 100644
--- a/src/gallium/auxiliary/util/u_index_modify.c
+++ b/src/gallium/auxiliary/util/u_index_modify.c
@@ -21,102 +21,105 @@
  * USE OR OTHER DEALINGS IN THE SOFTWARE. */
 
 #include "pipe/p_context.h"
 #include "util/u_index_modify.h"
 #include "util/u_inlines.h"
 
 /* Ubyte indices. */
 
 void util_shorten_ubyte_elts_to_userptr(struct pipe_context *context,
struct pipe_index_buffer *ib,
+unsigned add_transfer_flags,
int index_bias,
unsigned start,
unsigned count,
void *out)
 {
 struct pipe_transfer *src_transfer = NULL;
 const unsigned char *in_map;
 unsigned short *out_map = out;
 unsigned i;
 
 if (ib->user_buffer) {
in_map = ib->user_buffer;
 } else {
in_map = pipe_buffer_map(context, ib->buffer,
 PIPE_TRANSFER_READ |
-PIPE_TRANSFER_UNSYNCHRONIZED,
+add_transfer_flags,
 &src_transfer);
 }
 in_map += start;
 
 for (i = 0; i < count; i++) {
 *out_map = (unsigned short)(*in_map + index_bias);
 in_map++;
 out_map++;
 }
 
 if (src_transfer)
pipe_buffer_unmap(context, src_transfer);
 }
 
 /* Ushort indices. */
 
 void util_rebuild_ushort_elts_to_userptr(struct pipe_context *context,
 struct pipe_index_buffer *ib,
+ unsigned add_transfer_flags,
 int index_bias,
 unsigned start, unsigned count,
 void *out)
 {
 struct pipe_transfer *in_transfer = NULL;
 const unsigned short *in_map;
 unsigned short *out_map = out;
 unsigned i;
 
 if (ib->user_buffer) {
in_map = ib->user_buffer;
 } else {
in_map = pipe_buffer_map(context, ib->buffer,
 PIPE_TRANSFER_READ |
-PIPE_TRANSFER_UNSYNCHRONIZED,
+add_transfer_flags,
 &in_transfer);
 }
 in_map += start;
 
 for (i = 0; i < count; i++) {
 *out_map = (unsigned short)(*in_map + index_bias);
 in_map++;
 out_map++;
 }
 
 if (in_transfer)
pipe_buffer_unmap(context, in_transfer);
 }
 
 /* Uint indices. */
 
 void util_rebuild_uint_elts_to_userptr(struct pipe_context *context,
   struct pipe_index_buffer *ib,
+   unsigned add_transfer_flags,
   int index_bias,
   unsigned start, unsigned count,
   void *out)
 {
 struct pipe_transfer *in_transfer = NULL;
 const unsigned int *in_map;
 unsigned int *out_map = out;
 unsigned i;
 
 if (ib->user_buffer) {
in_map = ib->user_buffer;
 } else {
in_map = pipe_buffer_map(context, ib->buffer,
 PIPE_TRANSFER_READ |
-PIPE_TRANSFER_UNSYNCHRONIZED,
+add_transfer_flags,
 &in_transfer);
 }
 in_map += start;
 
 for (i = 0; i < count; i++) {
 *out_map = (unsigned int)(*in_map + index_bias);
 in_map++;
 out_map++;
 }
 
diff --git a/src/gallium/auxiliary/util/u_index_modify.h 
b/src/gallium/auxiliary/util/u_index_modify.h
index 1d34b12..0cfc189 100644
--- a/src/gallium/auxiliary/util/u_index_modify.h
+++ b/src/gallium/auxiliary/util/u_index_modify.h
@@ -22,28 +22,31 @@
 
 #ifndef UTIL_INDEX_MODIFY_H
 #define UTIL_INDEX_MODIFY_H
 
 struct pipe_context;
 struct pipe_resource;
 struct pipe_index_buffer;
 
 void util_shorten_ubyte_elts_to_userptr(struct pipe_context *context,
struct pipe_index_buffer *ib,
+unsigned add_transfer_flags,
   

[Mesa-dev] [PATCH 1/2] radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start (v2)

2017-02-17 Thread Marek Olšák
From: Marek Olšák 

start can only be non-zero with MultiDrawElements, which is unlikely
to occur with UNSIGNED_BYTE indices.

v2: Also fix the util_shorten_ubyte_elts_to_userptr call.
Tested with the new piglit.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index d453309..b45ef87 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -1045,32 +1045,32 @@ void si_draw_vbo(struct pipe_context *ctx, const struct 
pipe_draw_info *info)
ib.offset = sctx->index_buffer.offset;
 
/* Translate or upload, if needed. */
/* 8-bit indices are supported on VI. */
if (sctx->b.chip_class <= CIK && ib.index_size == 1) {
struct pipe_resource *out_buffer = NULL;
unsigned out_offset, start, count, start_offset;
void *ptr;
 
si_get_draw_start_count(sctx, info, &start, &count);
-   start_offset = start * ib.index_size;
+   start_offset = start * 2;
 
u_upload_alloc(ctx->stream_uploader, start_offset,
count * 2, 256,
   &out_offset, &out_buffer, &ptr);
if (!out_buffer) {
pipe_resource_reference(&ib.buffer, NULL);
return;
}
 
util_shorten_ubyte_elts_to_userptr(&sctx->b.b, &ib, 0,
-  ib.offset + 
start_offset,
+  ib.offset + start,
   count, ptr);
 
pipe_resource_reference(&ib.buffer, NULL);
ib.user_buffer = NULL;
ib.buffer = out_buffer;
/* info->start will be added by the drawing code */
ib.offset = out_offset - start_offset;
ib.index_size = 2;
} else if (ib.user_buffer && !ib.buffer) {
unsigned start, count, start_offset;
-- 
2.7.4
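
The offset arithmetic in this fix can be illustrated with a small standalone sketch. This is not the radeonsi code: `widen_ubyte_indices` and `ushort_byte_offset` are hypothetical stand-ins, and plain arrays replace the uploader. The input is addressed in 1-byte elements, so it advances by `start`, while the output holds 2-byte elements, so element `start` sits at byte offset `start * 2`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real helper: widen `count` 8-bit indices,
 * beginning at input element `start`, into 16-bit indices at `out`. */
static void widen_ubyte_indices(const uint8_t *in, unsigned start,
                                unsigned count, uint16_t *out)
{
    assert(in != NULL && out != NULL);
    for (unsigned i = 0; i < count; i++)
        out[i] = in[start + i];   /* input elements are 1 byte each */
}

/* Byte offset of element `start` in a buffer of 16-bit indices. */
static unsigned ushort_byte_offset(unsigned start)
{
    return start * (unsigned)sizeof(uint16_t);
}
```

With `start = 3`, the reader skips 3 bytes of input but the converted data conceptually begins 6 bytes into the 16-bit buffer, which is why the two offsets in the patch differ.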



Re: [Mesa-dev] [PATCH] glsl: link error if unsized array not-last in ssbo

2017-02-17 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

You can also add the new piglit tests that this fixes to the commit
log.

Iago 

On Wed, 2017-02-15 at 15:12 +0100, Jose Maria Casanova Crespo wrote:
> If an unsized declared array is not the last in an SSBO
> and an implicit size can not be defined on linking time,
> the linker should raise an error instead of reaching
> an assertion on GL.
> 
> This reverts part of commit 3da08e166415a745139c1127040a24e8a45dc553
> getting back to the behavior of commit
> 5b2675093e863a52b610f112884ae12d42513770
> 
> The original patch was correct for GLES that should produce
> a compile-time error but the linker error is still necessary in
> desktop GL.
> 
> Signed-off-by: Jose Maria Casanova Crespo 
> ---
>  src/compiler/glsl/link_uniform_blocks.cpp | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/src/compiler/glsl/link_uniform_blocks.cpp
> b/src/compiler/glsl/link_uniform_blocks.cpp
> index ba01269..839fd07 100644
> --- a/src/compiler/glsl/link_uniform_blocks.cpp
> +++ b/src/compiler/glsl/link_uniform_blocks.cpp
> @@ -146,7 +146,13 @@ private:
> */
>    const glsl_type *type_for_size = type;
>    if (type->is_unsized_array()) {
> - assert(last_field);
> + if (!last_field) {
> +linker_error(prog, "unsized array `%s' definition: "
> + "only last member of a shader storage block
> "
> + "can be defined as unsized array",
> + name);
> + }
> +
>   type_for_size = type->without_array();
>    }
>  
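
The rule being enforced can be sketched outside the linker as follows. This is a simplified illustration under assumptions: `struct block_member` and `find_misplaced_unsized_array` are hypothetical names and do not correspond to Mesa's `glsl_type` API. The point is only that a misplaced unsized array is reported to the caller rather than hitting an assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified member description (not Mesa's glsl_type). */
struct block_member {
    const char *name;
    bool is_unsized_array;
};

/* Return the index of the first unsized array that is not the last member
 * of the block, or -1 if the block layout is acceptable. A linker would
 * raise an error for that member instead of asserting. */
static int find_misplaced_unsized_array(const struct block_member *members,
                                        size_t count)
{
    assert(members != NULL || count == 0);
    for (size_t i = 0; i + 1 < count; i++) {
        if (members[i].is_unsized_array)
            return (int)i;
    }
    return -1;
}
```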


Re: [Mesa-dev] [PATCH shader-db 3/4] run: set INTEL_NO_HW together with INTEL_DEVID_OVERRIDE

2017-02-17 Thread Lionel Landwerlin

On 16/02/17 20:02, Kenneth Graunke wrote:

On Thursday, February 16, 2017 4:29:50 AM PST Lionel Landwerlin wrote:

Since we're already asking the driver to generate code for a different
hardware than what we're running on, better not even bother with emitting
any batch.

Signed-off-by: Lionel Landwerlin 
---
  run.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/run.c b/run.c
index 62c19c8..7543b2a 100644
--- a/run.c
+++ b/run.c
@@ -370,6 +370,7 @@ main(int argc, char **argv)
  
  printf("### Compiling for %s ###\n", platform->name);

  setenv("INTEL_DEVID_OVERRIDE", platform->pci_id, 1);
+setenv("INTEL_NO_HW", "1", 1);
  break;
  }
  case 'j':


I don't think you need this patch - libdrm will already not execute
batches if INTEL_DEVID_OVERRIDE is used to force a PCI ID that doesn't
match the one on the system.

Unless the fake PCI ID happens to match the one you're compiling for...


Yeah actually it works without this one. I was probably trying to figure 
stuff out earlier...


Dropping!



[Mesa-dev] [PATCH v2] run: add -j option to select number of threads

2017-02-17 Thread Lionel Landwerlin
v2: Also drop the '-1' in favor of '-j1' (Matt)

Signed-off-by: Lionel Landwerlin 
---
 run.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/run.c b/run.c
index 2654bff..0e77926 100644
--- a/run.c
+++ b/run.c
@@ -307,9 +307,7 @@ const struct platform platforms[] = {
 void print_usage(const char *prog_name)
 {
 fprintf(stderr,
-"Usage: %s [-d ] [-p ] \n"
-"Other options: \n"
-" -1Disable multi-threading\n",
+"Usage: %s [-d ] [-j ] [-p ] 
\n",
 prog_name);
 }

@@ -335,7 +333,7 @@ main(int argc, char **argv)

 max_threads = omp_get_max_threads();

-while((opt = getopt(argc, argv, "1d:p:")) != -1) {
+while ((opt = getopt(argc, argv, "d:j:p:")) != -1) {
 switch(opt) {
 case 'd': {
 char *endptr;
@@ -368,8 +366,8 @@ main(int argc, char **argv)
 setenv("INTEL_DEVID_OVERRIDE", platform->pci_id, 1);
 break;
 }
-case '1':
-max_threads = 1;
+case 'j':
+max_threads = atoi(optarg);
 break;
 default:
 fprintf(stderr, "Unknown option: %x\n", opt);
--
2.7.4
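
As a side note, `atoi()` returns 0 for non-numeric input and cannot report an error, so `-j foo` would silently yield zero threads. A stricter parse might look like the sketch below. It is not part of the patch; `parse_thread_count` and its `fallback` parameter are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a thread count such as the argument of "-j".
 * Returns the parsed value, or `fallback` if the text is not a positive
 * decimal integer that fits in an int. */
static int parse_thread_count(const char *text, int fallback)
{
    char *end = NULL;
    long value;

    assert(text != NULL);
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return fallback;          /* overflow, empty, or trailing junk */
    if (value < 1 || value > INT_MAX)
        return fallback;          /* zero, negative, or too large */
    return (int)value;
}
```

In `main()` this could be called as `max_threads = parse_thread_count(optarg, max_threads);`, which keeps the OpenMP default when the argument is unusable.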


[Mesa-dev] [PATCH 4/5] st/mesa: assume all drivers support user index buffers

2017-02-17 Thread Marek Olšák
From: Marek Olšák 

---
 src/mesa/state_tracker/st_context.c |  2 --
 src/mesa/state_tracker/st_context.h |  1 -
 src/mesa/state_tracker/st_draw.c| 50 ++---
 3 files changed, 13 insertions(+), 40 deletions(-)

diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index f4ad6d8..4cc4dab 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -331,22 +331,20 @@ st_create_context_priv( struct gl_context *ctx, struct 
pipe_context *pipe,
st->pipe = pipe;
 
/* XXX: this is one-off, per-screen init: */
st_debug_init();

/* state tracker needs the VBO module */
_vbo_CreateContext(ctx);
 
st->dirty = ST_ALL_STATES_MASK;
 
-   st->has_user_indexbuf =
-  screen->get_param(screen, PIPE_CAP_USER_INDEX_BUFFERS);
st->has_user_constbuf =
   screen->get_param(screen, PIPE_CAP_USER_CONSTANT_BUFFERS);
 
/* Drivers still have to upload zero-stride vertex attribs manually
 * with the GL core profile, but they don't have to deal with any complex
 * user vertex buffer uploads.
 */
unsigned vbuf_flags =
   ctx->API == API_OPENGL_CORE ? U_VBUF_FLAG_NO_USER_VBOS : 0;
st->cso_context = cso_create_context(pipe, vbuf_flags);
diff --git a/src/mesa/state_tracker/st_context.h 
b/src/mesa/state_tracker/st_context.h
index 942fdd7..bb00384 100644
--- a/src/mesa/state_tracker/st_context.h
+++ b/src/mesa/state_tracker/st_context.h
@@ -78,21 +78,20 @@ struct st_context
boolean has_time_elapsed;
boolean has_shader_model3;
boolean has_etc1;
boolean has_etc2;
boolean prefer_blit_based_texture_transfer;
boolean force_persample_in_shader;
boolean has_shareable_shaders;
boolean has_half_float_packing;
boolean has_multi_draw_indirect;
boolean has_user_constbuf;
-   boolean has_user_indexbuf;
 
/**
 * If a shader can be created when we get its source.
 * This means it has only 1 variant, not counting glBitmap and
 * glDrawPixels.
 */
boolean shader_has_one_variant[MESA_SHADER_STAGES];
 
boolean needs_texcoord_semantic;
boolean apply_texture_swizzle_to_border_color;
diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
index 8d54732..f04b6c2 100644
--- a/src/mesa/state_tracker/st_draw.c
+++ b/src/mesa/state_tracker/st_draw.c
@@ -81,55 +81,45 @@ all_varyings_in_vbos(const struct gl_vertex_array *arrays[])
   !_mesa_is_bufferobj(arrays[i]->BufferObj))
 return GL_FALSE;
 
return GL_TRUE;
 }
 
 
 /**
  * Basically, translate Mesa's index buffer information into
  * a pipe_index_buffer object.
- * \return TRUE or FALSE for success/failure
  */
-static boolean
+static void
 setup_index_buffer(struct st_context *st,
-   const struct _mesa_index_buffer *ib,
-   struct pipe_index_buffer *ibuffer)
+   const struct _mesa_index_buffer *ib)
 {
+   struct pipe_index_buffer ibuffer;
struct gl_buffer_object *bufobj = ib->obj;
 
-   ibuffer->index_size = vbo_sizeof_ib_type(ib->type);
+   ibuffer.index_size = vbo_sizeof_ib_type(ib->type);
 
/* get/create the index buffer object */
if (_mesa_is_bufferobj(bufobj)) {
   /* indices are in a real VBO */
-  ibuffer->buffer = st_buffer_object(bufobj)->buffer;
-  ibuffer->offset = pointer_to_offset(ib->ptr);
-   }
-   else if (!st->has_user_indexbuf) {
-  /* upload indexes from user memory into a real buffer */
-  u_upload_data(st->pipe->stream_uploader, 0,
-ib->count * ibuffer->index_size, 4, ib->ptr,
-                &ibuffer->offset, &ibuffer->buffer);
-  if (!ibuffer->buffer) {
- /* out of memory */
- return FALSE;
-  }
-  u_upload_unmap(st->pipe->stream_uploader);
+  ibuffer.buffer = st_buffer_object(bufobj)->buffer;
+  ibuffer.offset = pointer_to_offset(ib->ptr);
+  ibuffer.user_buffer = NULL;
}
else {
   /* indices are in user space memory */
-  ibuffer->user_buffer = ib->ptr;
+  ibuffer.buffer = NULL;
+  ibuffer.offset = 0;
+  ibuffer.user_buffer = ib->ptr;
}
 
-   cso_set_index_buffer(st->cso_context, ibuffer);
-   return TRUE;
+   cso_set_index_buffer(st->cso_context, &ibuffer);
 }
 
 
 /**
  * Set the restart index.
  */
 static void
 setup_primitive_restart(struct gl_context *ctx,
 const struct _mesa_index_buffer *ib,
 struct pipe_draw_info *info)
@@ -178,21 +168,20 @@ st_draw_vbo(struct gl_context *ctx,
 GLuint nr_prims,
 const struct _mesa_index_buffer *ib,
GLboolean index_bounds_valid,
 GLuint min_index,
 GLuint max_index,
 struct gl_transform_feedback_object *tfb_vertcount,
 unsigned stream,
 struct gl_buffer_object *indirect)
 {
struct st_context *st = st_context(ctx);
-   struct pipe_index_buffer ibuffer = 

[Mesa-dev] [PATCH 3/5] svga: implement user index buffers

2017-02-17 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/svga/svga_pipe_draw.c | 13 -
 src/gallium/drivers/svga/svga_screen.c|  2 +-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/svga/svga_pipe_draw.c 
b/src/gallium/drivers/svga/svga_pipe_draw.c
index c51c0b2..bbd4430 100644
--- a/src/gallium/drivers/svga/svga_pipe_draw.c
+++ b/src/gallium/drivers/svga/svga_pipe_draw.c
@@ -18,20 +18,21 @@
  * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
  * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
  * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
  * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
  * SOFTWARE.
  *
  **/
 
 
 #include "util/u_format.h"
+#include "util/u_helpers.h"
 #include "util/u_inlines.h"
 #include "util/u_prim.h"
 #include "util/u_prim_restart.h"
 #include "util/u_time.h"
 #include "util/u_upload_mgr.h"
 #include "indices/u_indices.h"
 
 #include "svga_hw_reg.h"
 #include "svga_cmd.h"
 #include "svga_context.h"
@@ -187,20 +188,28 @@ svga_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
boolean needed_swtnl;
 
SVGA_STATS_TIME_PUSH(svga_sws(svga), SVGA_STATS_TIME_DRAWVBO);
 
svga->hud.num_draw_calls++;  /* for SVGA_QUERY_NUM_DRAW_CALLS */
 
if (u_reduced_prim(info->mode) == PIPE_PRIM_TRIANGLES &&
svga->curr.rast->templ.cull_face == PIPE_FACE_FRONT_AND_BACK)
   goto done;
 
+   /* Upload a user index buffer. */
+   struct pipe_index_buffer ibuffer_saved = {};
+   if (info->indexed && svga->curr.ib.user_buffer &&
+   !util_save_and_upload_index_buffer(pipe, info, &svga->curr.ib,
+  &ibuffer_saved)) {
+  return;
+   }
+
/*
 * Mark currently bound target surfaces as dirty
 * doesn't really matter if it is done before drawing.
 *
 * TODO If we ever normaly return something other then
 * true we should not mark it as dirty then.
 */
svga_mark_surfaces_dirty(svga_context(pipe));
 
if (svga->curr.reduced_prim != reduced_prim) {
@@ -270,19 +279,21 @@ svga_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
 
/* XXX: Silence warnings, do something sensible here? */
(void)ret;
 
if (SVGA_DEBUG & DEBUG_FLUSH) {
   svga_hwtnl_flush_retry( svga );
   svga_context_flush(svga, NULL);
}
 
 done:
+   if (info->indexed && ibuffer_saved.user_buffer)
+  pipe->set_index_buffer(pipe, &ibuffer_saved);
+
SVGA_STATS_TIME_POP(svga_sws(svga));
-;
 }
 
 
 void svga_init_draw_functions( struct svga_context *svga )
 {
svga->pipe.draw_vbo = svga_draw_vbo;
 }
diff --git a/src/gallium/drivers/svga/svga_screen.c 
b/src/gallium/drivers/svga/svga_screen.c
index f9dfcd2..8af66b7 100644
--- a/src/gallium/drivers/svga/svga_screen.c
+++ b/src/gallium/drivers/svga/svga_screen.c
@@ -175,20 +175,21 @@ static int
 svga_get_param(struct pipe_screen *screen, enum pipe_cap param)
 {
struct svga_screen *svgascreen = svga_screen(screen);
struct svga_winsys_screen *sws = svgascreen->sws;
SVGA3dDevCapResult result;
 
switch (param) {
case PIPE_CAP_NPOT_TEXTURES:
case PIPE_CAP_MIXED_FRAMEBUFFER_SIZES:
case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
+   case PIPE_CAP_USER_INDEX_BUFFERS:
   return 1;
case PIPE_CAP_TWO_SIDED_STENCIL:
   return 1;
case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:
   /*
* "In virtually every OpenGL implementation and hardware,
* GL_MAX_DUAL_SOURCE_DRAW_BUFFERS is 1"
* http://www.opengl.org/wiki/Blending
*/
   return sws->have_vgpu10 ? 1 : 0;
@@ -206,21 +207,20 @@ svga_get_param(struct pipe_screen *screen, enum pipe_cap 
param)
   return 0;
case PIPE_CAP_TEXTURE_BUFFER_OBJECTS:
   return sws->have_vgpu10;
case PIPE_CAP_TEXTURE_SHADOW_MAP:
   return 1;
case PIPE_CAP_TEXTURE_SWIZZLE:
   return 1;
case PIPE_CAP_TEXTURE_BORDER_COLOR_QUIRK:
   return 0;
case PIPE_CAP_USER_VERTEX_BUFFERS:
-   case PIPE_CAP_USER_INDEX_BUFFERS:
   return 0;
case PIPE_CAP_USER_CONSTANT_BUFFERS:
   return 1;
case PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT:
   return 256;
 
case PIPE_CAP_MAX_TEXTURE_2D_LEVELS:
   {
  unsigned levels = SVGA_MAX_TEXTURE_LEVELS;
  if (sws->get_cap(sws, SVGA3D_DEVCAP_MAX_TEXTURE_WIDTH, &result))
-- 
2.7.4
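
The save/upload/restore pattern these driver patches rely on can be sketched in isolation. Everything here is an assumption for illustration: `struct index_buffer`, `save_and_upload`, and `restore` are hypothetical, `malloc` stands in for the stream uploader, and the whole index range is copied, whereas the real helper uploads only the range the draw uses and adjusts the offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for pipe_index_buffer. */
struct index_buffer {
    const void *user_buffer;  /* CPU pointer supplied by the application */
    void *buffer;             /* "hardware" buffer (plain malloc here) */
    unsigned index_size;      /* bytes per index */
};

/* Save the bound user index buffer to *saved and replace *bound with an
 * uploaded copy of `total_count` indices. Returns false on failure. */
static bool save_and_upload(struct index_buffer *bound, unsigned total_count,
                            struct index_buffer *saved)
{
    assert(bound->user_buffer != NULL);
    size_t size = (size_t)total_count * bound->index_size;
    void *copy = malloc(size);
    if (!copy)
        return false;
    memcpy(copy, bound->user_buffer, size);

    *saved = *bound;              /* remember the user pointer */
    bound->user_buffer = NULL;
    bound->buffer = copy;         /* the draw now sees a real buffer */
    return true;
}

/* Undo save_and_upload once the draw has been emitted. */
static void restore(struct index_buffer *bound,
                    const struct index_buffer *saved)
{
    free(bound->buffer);
    *bound = *saved;
}
```

A draw function would call `save_and_upload` before emitting the draw and `restore` afterwards, so the application-visible binding is unchanged.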



[Mesa-dev] [PATCH 5/5] gallium: remove PIPE_CAP_USER_INDEX_BUFFERS

2017-02-17 Thread Marek Olšák
From: Marek Olšák 

all drivers support it
---
 src/gallium/docs/source/screen.rst   | 4 
 src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 -
 src/gallium/drivers/freedreno/freedreno_screen.c | 1 -
 src/gallium/drivers/i915/i915_screen.c   | 1 -
 src/gallium/drivers/llvmpipe/lp_screen.c | 1 -
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 -
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 -
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 -
 src/gallium/drivers/r300/r300_screen.c   | 1 -
 src/gallium/drivers/r600/r600_pipe.c | 1 -
 src/gallium/drivers/radeonsi/si_pipe.c   | 1 -
 src/gallium/drivers/softpipe/sp_screen.c | 1 -
 src/gallium/drivers/svga/svga_screen.c   | 1 -
 src/gallium/drivers/swr/swr_screen.cpp   | 1 -
 src/gallium/drivers/vc4/vc4_screen.c | 1 -
 src/gallium/drivers/virgl/virgl_screen.c | 1 -
 src/gallium/include/pipe/p_defines.h | 1 -
 17 files changed, 20 deletions(-)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 74c8cec..b08c3ce 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -108,24 +108,20 @@ The integer capabilities:
   limitation.  If true, pipe_vertex_buffer::buffer_offset must always be 
aligned
   to 4.  If false, there are no restrictions on the offset.
 * ``PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY``: This CAP describes a hw
   limitation.  If true, pipe_vertex_buffer::stride must always be aligned to 4.
   If false, there are no restrictions on the stride.
 * ``PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY``: This CAP describes
   a hw limitation.  If true, pipe_vertex_element::src_offset must always be
   aligned to 4.  If false, there are no restrictions on src_offset.
 * ``PIPE_CAP_COMPUTE``: Whether the implementation supports the
   compute entry points defined in pipe_context and pipe_screen.
-* ``PIPE_CAP_USER_INDEX_BUFFERS``: Whether user index buffers are supported.
-  If not, the state tracker must upload all indices which are not in hw
-  resources.  If user-space buffers are supported, the driver must also still
-  accept HW resource buffers.
 * ``PIPE_CAP_USER_CONSTANT_BUFFERS``: Whether user-space constant buffers
   are supported.  If not, the state tracker must put constants into HW
   resources/buffers.  If user-space constant buffers are supported, the
   driver must still accept HW constant buffers also.
 * ``PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT``: Describes the required
   alignment of pipe_constant_buffer::buffer_offset.
 * ``PIPE_CAP_START_INSTANCE``: Whether the driver supports
   pipe_draw_info::start_instance.
 * ``PIPE_CAP_QUERY_TIMESTAMP``: Whether PIPE_QUERY_TIMESTAMP and
   the pipe_screen::get_timestamp hook are implemented.
diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
b/src/gallium/drivers/etnaviv/etnaviv_screen.c
index 93eeb58..2d128b3 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
@@ -129,21 +129,20 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER:
case PIPE_CAP_SM3:
case PIPE_CAP_TEXTURE_BARRIER:
case PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION:
case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_USER_CONSTANT_BUFFERS:
case PIPE_CAP_TGSI_TEXCOORD:
case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
-   case PIPE_CAP_USER_INDEX_BUFFERS:
   return 1;
 
/* Memory */
case PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT:
   return 256;
case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
   return 4; /* XXX could easily be supported */
case PIPE_CAP_GLSL_FEATURE_LEVEL:
   return 120;
 
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index e1b95a6..e667187 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -172,21 +172,20 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
case PIPE_CAP_SEAMLESS_CUBE_MAP:
case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
case PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION:
case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
case PIPE_CAP_STRING_MARKER:
case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
-   case PIPE_CAP_USER_INDEX_BUFFERS:
return 1;
 
case 

[Mesa-dev] [PATCH 2/5] freedreno: add support for user index buffers

2017-02-17 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/freedreno/freedreno_draw.c   | 13 +
 src/gallium/drivers/freedreno/freedreno_screen.c |  2 +-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/freedreno_draw.c 
b/src/gallium/drivers/freedreno/freedreno_draw.c
index cfe13cd..cb4c063 100644
--- a/src/gallium/drivers/freedreno/freedreno_draw.c
+++ b/src/gallium/drivers/freedreno/freedreno_draw.c
@@ -24,20 +24,21 @@
  *
  * Authors:
  *Rob Clark 
  */
 
 #include "pipe/p_state.h"
 #include "util/u_string.h"
 #include "util/u_memory.h"
 #include "util/u_prim.h"
 #include "util/u_format.h"
+#include "util/u_helpers.h"
 
 #include "freedreno_draw.h"
 #include "freedreno_context.h"
 #include "freedreno_state.h"
 #include "freedreno_resource.h"
 #include "freedreno_query_hw.h"
 #include "freedreno_util.h"
 
 static void
 resource_read(struct fd_batch *batch, struct pipe_resource *prsc)
@@ -77,20 +78,29 @@ fd_draw_vbo(struct pipe_context *pctx, const struct 
pipe_draw_info *info)
/* emulate unsupported primitives: */
if (!fd_supported_prim(ctx, info->mode)) {
if (ctx->streamout.num_targets > 0)
debug_error("stream-out with emulated prims");
util_primconvert_save_index_buffer(ctx->primconvert, 
&ctx->indexbuf);
util_primconvert_save_rasterizer_state(ctx->primconvert, 
ctx->rasterizer);
util_primconvert_draw_vbo(ctx->primconvert, info);
return;
}
 
+   /* Upload a user index buffer. */
+   struct pipe_index_buffer ibuffer_saved = {};
+   if (info->indexed && ctx->indexbuf.user_buffer &&
+   !util_save_and_upload_index_buffer(pctx, info,
+  &ctx->indexbuf.user_buffer,
+  &ibuffer_saved)) {
+   return;
+   }
+
if (ctx->in_blit) {
fd_batch_reset(batch);
ctx->dirty = ~0;
}
 
batch->blit = ctx->in_blit;
batch->back_blit = ctx->in_shadow;
 
/* NOTE: needs to be before resource_written(batch->query_buf), 
otherwise
 * query_buf may not be created yet.
@@ -194,20 +204,23 @@ fd_draw_vbo(struct pipe_context *pctx, const struct 
pipe_draw_info *info)
if (ctx->draw_vbo(ctx, info))
batch->needs_flush = true;
 
for (i = 0; i < ctx->streamout.num_targets; i++)
ctx->streamout.offsets[i] += info->count;
 
if (fd_mesa_debug & FD_DBG_DDRAW)
ctx->dirty = 0x;
 
fd_batch_check_size(batch);
+
+   if (info->indexed && ibuffer_saved.user_buffer)
+   pctx->set_index_buffer(pctx, &ibuffer_saved);
 }
 
 /* Generic clear implementation (partially) using u_blitter: */
 static void
 fd_blitter_clear(struct pipe_context *pctx, unsigned buffers,
const union pipe_color_union *color, double depth, unsigned 
stencil)
 {
struct fd_context *ctx = fd_context(pctx);
struct pipe_framebuffer_state *pfb = &ctx->batch->framebuffer;
struct blitter_context *blitter = ctx->blitter;
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index 1122e29..e1b95a6 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -172,20 +172,21 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
case PIPE_CAP_SEAMLESS_CUBE_MAP:
case PIPE_CAP_VERTEX_COLOR_UNCLAMPED:
case PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION:
case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
case PIPE_CAP_STRING_MARKER:
case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
+   case PIPE_CAP_USER_INDEX_BUFFERS:
return 1;
 
case PIPE_CAP_VERTEXID_NOBASE:
return is_a3xx(screen) || is_a4xx(screen);
 
case PIPE_CAP_USER_CONSTANT_BUFFERS:
return is_a4xx(screen) ? 0 : 1;
 
case PIPE_CAP_SHADER_STENCIL_EXPORT:
case PIPE_CAP_TGSI_TEXCOORD:
@@ -246,21 +247,20 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_GLSL_FEATURE_LEVEL:
if (glsl120)
return 120;
return is_ir3(screen) ? 140 : 120;
 
/* Unsupported features. */
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER:
case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
case PIPE_CAP_USER_VERTEX_BUFFERS:
-   case PIPE_CAP_USER_INDEX_BUFFERS:
   

[Mesa-dev] [PATCH 1/5] etnaviv: add support for user index buffers

2017-02-17 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/util/u_helpers.c| 29 +++
 src/gallium/auxiliary/util/u_helpers.h|  5 +
 src/gallium/drivers/etnaviv/etnaviv_context.c | 12 +++
 src/gallium/drivers/etnaviv/etnaviv_screen.c  |  2 +-
 4 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_helpers.c 
b/src/gallium/auxiliary/util/u_helpers.c
index 09020b0..85e7fb0 100644
--- a/src/gallium/auxiliary/util/u_helpers.c
+++ b/src/gallium/auxiliary/util/u_helpers.c
@@ -20,20 +20,21 @@
  * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
  * IN NO EVENT SHALL THE AUTHORS AND/OR THEIR SUPPLIERS BE LIABLE FOR
  * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
  * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
  * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
  *
  **/
 
 #include "util/u_helpers.h"
 #include "util/u_inlines.h"
+#include "util/u_upload_mgr.h"
 
 /**
  * This function is used to copy an array of pipe_vertex_buffer structures,
  * while properly referencing the pipe_vertex_buffer::buffer member.
  *
  * enabled_buffers is updated such that the bits corresponding to the indices
  * of disabled buffers are set to 0 and the enabled ones are set to 1.
  *
  * \sa util_copy_framebuffer_state
  */
@@ -102,10 +103,38 @@ util_set_index_buffer(struct pipe_index_buffer *dst,
 {
if (src) {
   pipe_resource_reference(&dst->buffer, src->buffer);
   memcpy(dst, src, sizeof(*dst));
}
else {
   pipe_resource_reference(&dst->buffer, NULL);
   memset(dst, 0, sizeof(*dst));
}
 }
+
+/**
+ * Given a user index buffer, save the structure to "saved", and upload it.
+ */
+bool
+util_save_and_upload_index_buffer(struct pipe_context *pipe,
+  const struct pipe_draw_info *info,
+  const struct pipe_index_buffer *ib,
+  struct pipe_index_buffer *out_saved)
+{
+   struct pipe_index_buffer new_ib = {0};
+   unsigned start_offset = info->start * ib->index_size;
+
+   u_upload_data(pipe->stream_uploader, start_offset,
+ info->count * ib->index_size, 4,
+ (char*)ib->user_buffer + start_offset,
+ &new_ib.offset, &new_ib.buffer);
+   if (!ib->buffer)
+  return false;
+   u_upload_unmap(pipe->stream_uploader);
+
+   new_ib.offset -= start_offset;
+   new_ib.index_size = ib->index_size;
+
+   util_set_index_buffer(out_saved, ib);
+   pipe->set_index_buffer(pipe, &new_ib);
+   return true;
+}
diff --git a/src/gallium/auxiliary/util/u_helpers.h b/src/gallium/auxiliary/util/u_helpers.h
index a9a53e4..7de960b 100644
--- a/src/gallium/auxiliary/util/u_helpers.h
+++ b/src/gallium/auxiliary/util/u_helpers.h
@@ -40,15 +40,20 @@ void util_set_vertex_buffers_mask(struct pipe_vertex_buffer *dst,
   unsigned start_slot, unsigned count);
 
 void util_set_vertex_buffers_count(struct pipe_vertex_buffer *dst,
unsigned *dst_count,
const struct pipe_vertex_buffer *src,
unsigned start_slot, unsigned count);
 
 void util_set_index_buffer(struct pipe_index_buffer *dst,
const struct pipe_index_buffer *src);
 
+bool util_save_and_upload_index_buffer(struct pipe_context *pipe,
+   const struct pipe_draw_info *info,
+   const struct pipe_index_buffer *ib,
+   struct pipe_index_buffer *out_saved);
+
 #ifdef __cplusplus
 }
 #endif
 
 #endif
diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.c b/src/gallium/drivers/etnaviv/etnaviv_context.c
index 62297a0..d22939a 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_context.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_context.c
@@ -40,20 +40,21 @@
 #include "etnaviv_state.h"
 #include "etnaviv_surface.h"
 #include "etnaviv_texture.h"
 #include "etnaviv_transfer.h"
 #include "etnaviv_translate.h"
 #include "etnaviv_zsa.h"
 
 #include "pipe/p_context.h"
 #include "pipe/p_state.h"
 #include "util/u_blitter.h"
+#include "util/u_helpers.h"
 #include "util/u_memory.h"
 #include "util/u_prim.h"
 #include "util/u_upload_mgr.h"
 
 #include "hw/common.xml.h"
 
 static void
 etna_context_destroy(struct pipe_context *pctx)
 {
struct etna_context *ctx = etna_context(pctx);
@@ -130,20 +131,29 @@ etna_draw_vbo(struct pipe_context *pctx, const struct pipe_draw_info *info)
   DBG("Invalid draw primitive mode=%i or no primitives to be drawn", info->mode);
   return;
}
 
draw_mode = translate_draw_mode(info->mode);
if (draw_mode == ETNA_NO_MATCH) {
   BUG("Unsupported draw mode");
   return;
}
 
+   /* Upload a user 

[Mesa-dev] Remaining work for the i965 on disk shader cache

2017-02-17 Thread Timothy Arceri

Hi guys,

I've rebased and updated the i965 cache on master which now contains 
most of the GLSL IR cache pieces. There are 4 extra GLSL IR patches 
required by i965 as it needs to fall back to compiling GLSL IR if there 
is a cache miss at draw time (radeonsi will always have tgsi so doesn't 
require this extra step).


There are two more pieces required to hook up and enable the cache 
(which will still be disabled by default).


1. Call disk_cache_create(); in the patch "i965: make use of on disk 
shader cache" when brw_context is created. We need to pass the unique 
mesa id and the gpu gen to it as strings; as far as I can tell there is 
currently nothing we can reuse to create the gpu id string.


2. Add cache support for the compute stage, see "i965: add shader cache 
support for geometry shaders" as an example of how to do this.


The latest patches are in the shader-cache41 branch of 
https://github.com/tarceri/Mesa.git


So if anyone wants to pick this up, go for it :)

I was originally planning on finishing this up myself but I've got a lot 
of spinning up on radeonsi and llvm to do so I thought I'd pass it off 
to someone who will give it their full attention.


Thanks,
Tim
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: binding qualifier must match with opaque-uniforms only

2017-02-17 Thread Andres Gomez
Dropping this patch due to the discussion I opened at:
https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16238

The layout qualification among Uniform and Shader Storage Blocks across
a linked program must match.

Br.

On Sun, 2017-02-05 at 20:53 +0200, Andres Gomez wrote:
> The binding point is a valid layout qualifier for Uniform Blocks,
> Shader Storage Blocks and Opaque-Uniforms.
> 
> From page 60 (page 66 of the PDF) of the GLSL 4.20 spec, v11:
> 
>   " A link error will result if two compilation units in a program
> specify different integer-constant bindings for the same
> opaque-uniform name. However, it is not an error to specify a
> binding on some but not all declarations for the same name, as
> shown in the examples below."
> 
> As we see, this restriction applies to Opaque-Uniforms only, not to
> Uniform Blocks nor Shader Storage Blocks.
> 
> Fixes GL45-CTS.enhanced_layouts.ssb_layout_qualifier_conflict
> 
> Signed-off-by: Andres Gomez 
> ---
>  src/compiler/glsl/linker.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
> index b768a6e5285..ae13a45d22b 100644
> --- a/src/compiler/glsl/linker.cpp
> +++ b/src/compiler/glsl/linker.cpp
> @@ -962,7 +962,7 @@ cross_validate_globals(struct gl_shader_program *prog,
>*  opaque-uniform name.  However, it is not an error to specify a
>*  binding on some but not all declarations for the same name"
>*/
> - if (var->data.explicit_binding) {
> + if (var->type->contains_opaque() && var->data.explicit_binding) {
>  if (existing->data.explicit_binding &&
>  var->data.binding != existing->data.binding) {
> linker_error(prog, "explicit bindings for %s "
-- 
Br,

Andres


[Mesa-dev] [PATCH 1/7] i965: Drop dead Gen8+ code from Gen7/sometimes-HSW driver hooks.

2017-02-17 Thread Kenneth Graunke
These driver hooks are not used when MI_MATH and MI_LOAD_REGISTER_REG
are supported, which Gen8+ can always do.  So this code is dead.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen7_sol_state.c | 50 ++
 1 file changed, 24 insertions(+), 26 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index e6b79ed2342..50631610e51 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -490,13 +490,11 @@ gen7_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
struct brw_transform_feedback_object *brw_obj =
   (struct brw_transform_feedback_object *) obj;
 
+   assert(brw->gen == 7);
+
/* Reset the SO buffer offsets to 0. */
-   if (brw->gen >= 8) {
-  brw_obj->zero_offsets = true;
-   } else {
-  intel_batchbuffer_flush(brw);
-  brw->batch.needs_sol_reset = true;
-   }
+   intel_batchbuffer_flush(brw);
+   brw->batch.needs_sol_reset = true;
 
/* We're about to lose the information needed to compute the number of
 * vertices written during the last Begin/EndTransformFeedback section,
@@ -552,17 +550,17 @@ gen7_pause_transform_feedback(struct gl_context *ctx,
/* Flush any drawing so that the counters have the right values. */
brw_emit_mi_flush(brw);
 
+   assert(brw->gen == 7);
+
/* Save the SOL buffer offset register values. */
-   if (brw->gen < 8) {
-  for (int i = 0; i < 4; i++) {
- BEGIN_BATCH(3);
- OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2));
- OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
- OUT_RELOC(brw_obj->offset_bo,
-   I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
-   i * sizeof(uint32_t));
- ADVANCE_BATCH();
-  }
+   for (int i = 0; i < 4; i++) {
+  BEGIN_BATCH(3);
+  OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2));
+  OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
+  OUT_RELOC(brw_obj->offset_bo,
+I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+i * sizeof(uint32_t));
+  ADVANCE_BATCH();
}
 
/* Store the temporary ending value of the SO_NUM_PRIMS_WRITTEN counters.
@@ -581,17 +579,17 @@ gen7_resume_transform_feedback(struct gl_context *ctx,
struct brw_transform_feedback_object *brw_obj =
   (struct brw_transform_feedback_object *) obj;
 
+   assert(brw->gen == 7);
+
/* Reload the SOL buffer offset registers. */
-   if (brw->gen < 8) {
-  for (int i = 0; i < 4; i++) {
- BEGIN_BATCH(3);
- OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
- OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
- OUT_RELOC(brw_obj->offset_bo,
-   I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
-   i * sizeof(uint32_t));
- ADVANCE_BATCH();
-  }
+   for (int i = 0; i < 4; i++) {
+  BEGIN_BATCH(3);
+  OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
+  OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
+  OUT_RELOC(brw_obj->offset_bo,
+I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+i * sizeof(uint32_t));
+  ADVANCE_BATCH();
}
 
/* Store the new starting value of the SO_NUM_PRIMS_WRITTEN counters. */
-- 
2.11.1



[Mesa-dev] [PATCH 7/7] i965: Enable ARB_transform_feedback2 on Sandybridge.

2017-02-17 Thread Kenneth Graunke
The only feature over and above ES 3.0 is DrawTransformFeedback().

We already have to do the whole SOL_NUM_PRIMS_WRITTEN counter dance in
order to compute the SVBI value for ResumeTransformFeedback(), at which
point our existing GetTransformFeedbackVertexCount() implementation will
do the trick (though with a stall to CPU map the buffer).

Someday, we could probably implement DrawTransformFeedback() more
efficiently, using the "Load Internal Vertex Count" feature of
3DSTATE_SVB_INDEX and the 3DPRIMITIVE indirect vertex count bit.

Rumor has it this allows people to use WebGL 2.0 on Sandybridge.

Note that we don't need pipelined register writes like Gen7+ because
we use the 3DSTATE_SVB_INDEX command rather than MI_LOAD_REGISTER_MEM.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99842
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.c  | 2 ++
 src/mesa/drivers/dri/i965/intel_extensions.c | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 977c59c1c3e..17800e3fd60 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -483,6 +483,8 @@ brw_init_driver_functions(struct brw_context *brw,
   functions->EndTransformFeedback = brw_end_transform_feedback;
   functions->PauseTransformFeedback = brw_pause_transform_feedback;
   functions->ResumeTransformFeedback = brw_resume_transform_feedback;
+  functions->GetTransformFeedbackVertexCount =
+ brw_get_transform_feedback_vertex_count;
}
 
if (brw->gen >= 6)
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index f1290bf7b49..ec7ff02be84 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -158,6 +158,9 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Extensions.EXT_timer_query = true;
}
 
+   if (brw->gen == 6)
+  ctx->Extensions.ARB_transform_feedback2 = true;
+
if (brw->gen >= 6) {
   ctx->Extensions.ARB_blend_func_extended =
  !driQueryOptionb(&brw->optionCache, "disable_blend_func_extended");
-- 
2.11.1
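
As a side note on the commit message above: the vertex count reported for DrawTransformFeedback() is derived from primitive counts rather than read from hardware. A minimal sketch of that arithmetic follows, assuming only the three primitive modes transform feedback can record. The `toy_` names and the enum are hypothetical and stand in for the GL enums and driver structures; the multiplication mirrors the `vertices_per_prim * prims_generated[i]` line in the series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for GL_POINTS, GL_LINES and GL_TRIANGLES. */
enum toy_prim_mode { TOY_POINTS, TOY_LINES, TOY_TRIANGLES };

/* Number of vertices recorded per captured primitive. */
unsigned
toy_vertices_per_prim(enum toy_prim_mode mode)
{
   switch (mode) {
   case TOY_POINTS:    return 1;
   case TOY_LINES:     return 2;
   case TOY_TRIANGLES: return 3;
   }
   assert(!"unexpected primitive mode");
   return 0;
}

/* Convert per-stream primitive counts into per-stream vertex counts. */
void
toy_vertices_written(enum toy_prim_mode mode,
                     const uint64_t *prims_generated, unsigned streams,
                     uint64_t *vertices_written)
{
   const unsigned per_prim = toy_vertices_per_prim(mode);

   for (unsigned i = 0; i < streams; i++)
      vertices_written[i] = per_prim * prims_generated[i];
}
```

On Sandybridge only one stream is exposed, so `streams` would be 1 in this sketch.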



[Mesa-dev] [PATCH 3/7] i965: Use ctx->Const.MaxVertexStreams rather than BRW_XFB_MAX_STREAMS.

2017-02-17 Thread Kenneth Graunke
This way on Sandybridge we'll only do 1 stream worth of math, since
we only have one SO_NUM_PRIMS_WRITTEN counter.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen6_sol.c | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index ca06194ba15..41158bd580c 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -230,11 +230,16 @@ brw_delete_transform_feedback(struct gl_context *ctx,
  * For each stream, we subtract the pair of values (end - start) to get the
  * number of primitives generated during one section.  We accumulate these
  * values, adding them up to get the total number of primitives generated.
+ *
+ * Note that we expose one stream pre-Gen7, so the above is just (start, end).
  */
 static void
 tally_prims_generated(struct brw_context *brw,
   struct brw_transform_feedback_object *obj)
 {
+   const struct gl_context *ctx = &brw->ctx;
+   const int streams = ctx->Const.MaxVertexStreams;
+
/* If the current batch is still contributing to the number of primitives
 * generated, flush it now so the results will be present when mapped.
 */
@@ -247,15 +252,14 @@ tally_prims_generated(struct brw_context *brw,
drm_intel_bo_map(obj->prim_count_bo, false);
uint64_t *prim_counts = obj->prim_count_bo->virtual;
 
-   assert(obj->prim_count_buffer_index % (2 * BRW_MAX_XFB_STREAMS) == 0);
-   int pairs = obj->prim_count_buffer_index / (2 * BRW_MAX_XFB_STREAMS);
+   assert(obj->prim_count_buffer_index % (2 * streams) == 0);
+   int pairs = obj->prim_count_buffer_index / (2 * streams);
 
for (int i = 0; i < pairs; i++) {
-  for (int s = 0; s < BRW_MAX_XFB_STREAMS; s++) {
- obj->prims_generated[s] +=
-prim_counts[BRW_MAX_XFB_STREAMS + s] - prim_counts[s];
+  for (int s = 0; s < streams; s++) {
+ obj->prims_generated[s] += prim_counts[streams + s] - prim_counts[s];
   }
-  prim_counts += 2 * BRW_MAX_XFB_STREAMS; /* move to the next pair */
+  prim_counts += 2 * streams; /* move to the next pair */
}
 
drm_intel_bo_unmap(obj->prim_count_bo);
@@ -279,7 +283,8 @@ void
 brw_save_primitives_written_counters(struct brw_context *brw,
  struct brw_transform_feedback_object *obj)
 {
-   const int streams = BRW_MAX_XFB_STREAMS;
+   const struct gl_context *ctx = &brw->ctx;
+   const int streams = ctx->Const.MaxVertexStreams;
 
/* Check if there's enough space for a new pair of four values. */
if (obj->prim_count_bo != NULL &&
@@ -310,6 +315,8 @@ void
 brw_compute_xfb_vertices_written(struct brw_context *brw,
  struct brw_transform_feedback_object *obj)
 {
+   const struct gl_context *ctx = &brw->ctx;
+
if (obj->vertices_written_valid || !obj->base.EndedAnytime)
   return;
 
@@ -332,7 +339,7 @@ brw_compute_xfb_vertices_written(struct brw_context *brw,
/* Get the number of primitives generated. */
tally_prims_generated(brw, obj);
 
-   for (int i = 0; i < BRW_MAX_XFB_STREAMS; i++) {
+   for (int i = 0; i < ctx->Const.MaxVertexStreams; i++) {
   obj->vertices_written[i] = vertices_per_prim * obj->prims_generated[i];
}
obj->vertices_written_valid = true;
@@ -354,7 +361,7 @@ brw_get_transform_feedback_vertex_count(struct gl_context *ctx,
   (struct brw_transform_feedback_object *) obj;
 
assert(obj->EndedAnytime);
-   assert(stream < BRW_MAX_XFB_STREAMS);
+   assert(stream < ctx->Const.MaxVertexStreams);
 
brw_compute_xfb_vertices_written(brw, brw_obj);
return brw_obj->vertices_written[stream];
-- 
2.11.1
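
The (start, end) pair layout that this patch parameterizes on the stream count can be illustrated outside the driver. The following is a sketch only, assuming a plain array in place of the mapped buffer object; `tally_pairs()` is a hypothetical name, and the loop body follows the one shown in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* 'counts' holds 'index' uint64_t values laid out as consecutive pairs of
 * (start[streams], end[streams]).  For each stream, add (end - start) of
 * every pair to prims_generated[]. */
void
tally_pairs(const uint64_t *counts, unsigned index, unsigned streams,
            uint64_t *prims_generated)
{
   assert(streams > 0 && index % (2 * streams) == 0);
   const unsigned pairs = index / (2 * streams);

   for (unsigned i = 0; i < pairs; i++) {
      for (unsigned s = 0; s < streams; s++)
         prims_generated[s] += counts[streams + s] - counts[s];
      counts += 2 * streams; /* move to the next pair */
   }
}
```

With one stream the buffer degenerates to a flat list of start, end, start, end values, which is the pre-Gen7 case the added comment describes.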



[Mesa-dev] [PATCH 6/7] i965: Properly reset SVBI counters on ResumeTransformFeedback().

2017-02-17 Thread Kenneth Graunke
This fixes Piglit's ARB_transform_feedback2/change-objects-while-paused
GLES 3.0 test.  When resuming the transform feedback object, we need to
reset the SVBI counters so we continue writing at the correct point in
the buffer.

Instead of SO_WRITE_OFFSET counters (with a DWord offset), we have the
Streamed Vertex Buffer Index (SVBI) counters, which contain a count of
vertices emitted.

Unfortunately, there's no straightforward way to store the current SVBI
counter values to a buffer.  They're not available in a register.  You
can use a bit in the 3DSTATE_SVB_INDEX packet to copy them to another
internal counter which 3DPRIMITIVE can use...but there's no good way to
extract that either.

So, once again, we use SO_NUM_PRIMS_WRITTEN to calculate the vertex
numbers.  Thankfully, we can reuse most of the existing Gen7+ code.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.c |   2 +
 src/mesa/drivers/dri/i965/brw_context.h |   6 ++
 src/mesa/drivers/dri/i965/gen6_sol.c| 116 +++-
 3 files changed, 107 insertions(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 7240b1f4455..977c59c1c3e 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -481,6 +481,8 @@ brw_init_driver_functions(struct brw_context *brw,
} else {
   functions->BeginTransformFeedback = brw_begin_transform_feedback;
   functions->EndTransformFeedback = brw_end_transform_feedback;
+  functions->PauseTransformFeedback = brw_pause_transform_feedback;
+  functions->ResumeTransformFeedback = brw_resume_transform_feedback;
}
 
if (brw->gen >= 6)
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 25c90645cea..df406c0d772 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1471,6 +1471,12 @@ void
 brw_end_transform_feedback(struct gl_context *ctx,
struct gl_transform_feedback_object *obj);
 void
+brw_pause_transform_feedback(struct gl_context *ctx,
+ struct gl_transform_feedback_object *obj);
+void
+brw_resume_transform_feedback(struct gl_context *ctx,
+  struct gl_transform_feedback_object *obj);
+void
 brw_save_primitives_written_counters(struct brw_context *brw,
  struct brw_transform_feedback_object *obj);
 void
diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index f1cc2d59fd4..702cb2830f0 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -314,18 +314,12 @@ brw_save_primitives_written_counters(struct brw_context *brw,
obj->prim_count_buffer_index += streams;
 }
 
-/**
- * Compute the number of vertices written by this transform feedback operation.
- */
-void
-brw_compute_xfb_vertices_written(struct brw_context *brw,
- struct brw_transform_feedback_object *obj)
+static void
+compute_vertices_written_so_far(struct brw_context *brw,
+struct brw_transform_feedback_object *obj,
+uint64_t *vertices_written)
 {
 const struct gl_context *ctx = &brw->ctx;
-
-   if (obj->vertices_written_valid || !obj->base.EndedAnytime)
-  return;
-
unsigned vertices_per_prim = 0;
 
switch (obj->primitive_mode) {
@@ -346,8 +340,22 @@ brw_compute_xfb_vertices_written(struct brw_context *brw,
tally_prims_generated(brw, obj);
 
for (int i = 0; i < ctx->Const.MaxVertexStreams; i++) {
-  obj->vertices_written[i] = vertices_per_prim * obj->prims_generated[i];
+  vertices_written[i] = vertices_per_prim * obj->prims_generated[i];
}
+}
+
+/**
+ * Compute the number of vertices written by this transform feedback operation.
+ */
+void
+brw_compute_xfb_vertices_written(struct brw_context *brw,
+ struct brw_transform_feedback_object *obj)
+{
+   if (obj->vertices_written_valid || !obj->base.EndedAnytime)
+  return;
+
+   compute_vertices_written_so_far(brw, obj, obj->vertices_written);
+
obj->vertices_written_valid = true;
 }
 
@@ -423,18 +431,92 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
   OUT_BATCH(0x);
   ADVANCE_BATCH();
}
+
+   /* We're about to lose the information needed to compute the number of
+* vertices written during the last Begin/EndTransformFeedback section,
+* so we can't delay it any further.
+*/
+   brw_compute_xfb_vertices_written(brw, brw_obj);
+
+   /* No primitives have been generated yet. */
+   for (int i = 0; i < BRW_MAX_XFB_STREAMS; i++) {
+  brw_obj->prims_generated[i] = 0;
+   }
+
+   /* Store the starting value of the SO_NUM_PRIMS_WRITTEN counters. */
+   brw_save_primitives_written_counters(brw, 

[Mesa-dev] [PATCH 2/7] i965: Move some code from gen7_sol_state.c to gen6_sol.c.

2017-02-17 Thread Kenneth Graunke
I plan to use these functions on Sandybridge soon.  I changed the prefix
on a couple of functions to "brw" instead of "gen7" as in theory they
should be usable all the way back to G45.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.h|   6 ++
 src/mesa/drivers/dri/i965/gen6_sol.c   | 140 +++
 src/mesa/drivers/dri/i965/gen7_sol_state.c | 148 +
 3 files changed, 150 insertions(+), 144 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 01e651b09f0..8d9a75f884b 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1464,6 +1464,12 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
 void
 brw_end_transform_feedback(struct gl_context *ctx,
struct gl_transform_feedback_object *obj);
+void
+brw_save_primitives_written_counters(struct brw_context *brw,
+ struct brw_transform_feedback_object *obj);
+void
+brw_compute_xfb_vertices_written(struct brw_context *brw,
+ struct brw_transform_feedback_object *obj);
 GLsizei
 brw_get_transform_feedback_vertex_count(struct gl_context *ctx,
  struct gl_transform_feedback_object *obj,
diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index 6f1d2c2fd04..ca06194ba15 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -220,6 +220,146 @@ brw_delete_transform_feedback(struct gl_context *ctx,
free(brw_obj);
 }
 
+/**
+ * Tally the number of primitives generated so far.
+ *
+ * The buffer contains a series of pairs:
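+ * (<start0, start1, start2, start3>, <end0, end1, end2, end3>) ;
+ * (<start0, start1, start2, start3>, <end0, end1, end2, end3>) ;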
+ *
+ * For each stream, we subtract the pair of values (end - start) to get the
+ * number of primitives generated during one section.  We accumulate these
+ * values, adding them up to get the total number of primitives generated.
+ */
+static void
+tally_prims_generated(struct brw_context *brw,
+  struct brw_transform_feedback_object *obj)
+{
+   /* If the current batch is still contributing to the number of primitives
+* generated, flush it now so the results will be present when mapped.
+*/
+   if (drm_intel_bo_references(brw->batch.bo, obj->prim_count_bo))
+  intel_batchbuffer_flush(brw);
+
+   if (unlikely(brw->perf_debug && drm_intel_bo_busy(obj->prim_count_bo)))
+  perf_debug("Stalling for # of transform feedback primitives written.\n");
+
+   drm_intel_bo_map(obj->prim_count_bo, false);
+   uint64_t *prim_counts = obj->prim_count_bo->virtual;
+
+   assert(obj->prim_count_buffer_index % (2 * BRW_MAX_XFB_STREAMS) == 0);
+   int pairs = obj->prim_count_buffer_index / (2 * BRW_MAX_XFB_STREAMS);
+
+   for (int i = 0; i < pairs; i++) {
+  for (int s = 0; s < BRW_MAX_XFB_STREAMS; s++) {
+ obj->prims_generated[s] +=
+prim_counts[BRW_MAX_XFB_STREAMS + s] - prim_counts[s];
+  }
+  prim_counts += 2 * BRW_MAX_XFB_STREAMS; /* move to the next pair */
+   }
+
+   drm_intel_bo_unmap(obj->prim_count_bo);
+
+   /* We've already gathered up the old data; we can safely overwrite it now. */
+   obj->prim_count_buffer_index = 0;
+}
+
+/**
+ * Store the SO_NUM_PRIMS_WRITTEN counters for each stream (4 uint64_t values)
+ * to prim_count_bo.
+ *
+ * If prim_count_bo is out of space, gather up the results so far into
+ * prims_generated[] and allocate a new buffer with enough space.
+ *
+ * The number of primitives written is used to compute the number of vertices
+ * written to a transform feedback stream, which is required to implement
+ * DrawTransformFeedback().
+ */
+void
+brw_save_primitives_written_counters(struct brw_context *brw,
+ struct brw_transform_feedback_object *obj)
+{
+   const int streams = BRW_MAX_XFB_STREAMS;
+
+   /* Check if there's enough space for a new pair of four values. */
+   if (obj->prim_count_bo != NULL &&
+   obj->prim_count_buffer_index + 2 * streams >= 4096 / sizeof(uint64_t)) {
+  /* Gather up the results so far and release the BO. */
+  tally_prims_generated(brw, obj);
+   }
+
+   /* Flush any drawing so that the counters have the right values. */
+   brw_emit_mi_flush(brw);
+
+   /* Emit MI_STORE_REGISTER_MEM commands to write the values. */
+   for (int i = 0; i < streams; i++) {
+  int offset = (obj->prim_count_buffer_index + i) * sizeof(uint64_t);
+  brw_store_register_mem64(brw, obj->prim_count_bo,
+   GEN7_SO_NUM_PRIMS_WRITTEN(i),
+   offset);
+   }
+
+   /* Update where to write data to. */
+   obj->prim_count_buffer_index += streams;
+}
+
+/**
+ * Compute the number of 

[Mesa-dev] [PATCH 5/7] i965: Save max_index in brw_transform_feedback_object.

2017-02-17 Thread Kenneth Graunke
I'm going to need this in a new Resume hook shortly.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.h | 6 ++
 src/mesa/drivers/dri/i965/gen6_sol.c| 6 --
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 8d9a75f884b..25c90645cea 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -567,6 +567,12 @@ struct brw_transform_feedback_object {
GLenum primitive_mode;
 
/**
+* The maximum number of vertices that we can write without overflowing
+* any of the buffers currently being used for transform feedback.
+*/
+   unsigned max_index;
+
+   /**
 * Count of primitives generated during this transform feedback operation.
 *  @{
 */
diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index 8adac92d07d..f1cc2d59fd4 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -382,6 +382,8 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
const struct gl_transform_feedback_info *linked_xfb_info;
struct gl_transform_feedback_object *xfb_obj =
   ctx->TransformFeedback.CurrentObject;
+   struct brw_transform_feedback_object *brw_obj =
+  (struct brw_transform_feedback_object *) xfb_obj;
 
assert(brw->gen == 6);
 
@@ -397,7 +399,7 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
/* Compute the maximum number of vertices that we can write without
 * overflowing any of the buffers currently being used for feedback.
 */
-   unsigned max_index
+   brw_obj->max_index
   = _mesa_compute_max_transform_feedback_vertices(ctx, xfb_obj,
   linked_xfb_info);
 
@@ -406,7 +408,7 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
OUT_BATCH(_3DSTATE_GS_SVB_INDEX << 16 | (4 - 2));
OUT_BATCH(0); /* SVBI 0 */
OUT_BATCH(0); /* starting index */
-   OUT_BATCH(max_index);
+   OUT_BATCH(brw_obj->max_index);
ADVANCE_BATCH();
 
/* Initialize the rest of the unused streams to sane values.  Otherwise,
-- 
2.11.1



[Mesa-dev] [PATCH 4/7] i965: Update brw_save_primitives_written_counters for pre-Gen7.

2017-02-17 Thread Kenneth Graunke
Sandybridge and earlier only have a single counter.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen6_sol.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c b/src/mesa/drivers/dri/i965/gen6_sol.c
index 41158bd580c..8adac92d07d 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -297,11 +297,17 @@ brw_save_primitives_written_counters(struct brw_context *brw,
brw_emit_mi_flush(brw);
 
/* Emit MI_STORE_REGISTER_MEM commands to write the values. */
-   for (int i = 0; i < streams; i++) {
-  int offset = (obj->prim_count_buffer_index + i) * sizeof(uint64_t);
+   if (brw->gen >= 7) {
+  for (int i = 0; i < streams; i++) {
+ int offset = (obj->prim_count_buffer_index + i) * sizeof(uint64_t);
+ brw_store_register_mem64(brw, obj->prim_count_bo,
+  GEN7_SO_NUM_PRIMS_WRITTEN(i),
+  offset);
+  }
+   } else {
   brw_store_register_mem64(brw, obj->prim_count_bo,
-   GEN7_SO_NUM_PRIMS_WRITTEN(i),
-   offset);
+   GEN6_SO_NUM_PRIMS_WRITTEN,
+   obj->prim_count_buffer_index * sizeof(uint64_t));
}
 
/* Update where to write data to. */
-- 
2.11.1
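
To make the offset bookkeeping in this patch concrete, here is a small sketch of where each snapshot lands in the counter buffer. It assumes the stream count is 1 before Gen7 and 4 from Gen7 on, as the series describes; `snapshot_offsets()` is a hypothetical helper, not a driver function, and it only computes byte offsets instead of emitting MI_STORE_REGISTER_MEM commands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the byte offset at which each stream's 64-bit counter snapshot
 * would be stored, starting at slot 'buffer_index'.  Returns the buffer
 * index to use for the next snapshot. */
unsigned
snapshot_offsets(unsigned buffer_index, unsigned streams, size_t *offsets)
{
   assert(streams >= 1 && streams <= 4);

   for (unsigned i = 0; i < streams; i++)
      offsets[i] = (buffer_index + i) * sizeof(uint64_t);

   return buffer_index + streams;
}
```

With a single stream this reduces to the `prim_count_buffer_index * sizeof(uint64_t)` expression used in the new else branch.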



Re: [Mesa-dev] [PATCH 1/2] radv: Never try to create more than max_sets descriptor sets.

2017-02-17 Thread Nicolai Hähnle

On 16.02.2017 21:26, Bas Nieuwenhuizen wrote:

We only use the freed ones after all free space has been used. If
the app only allocates small descriptor sets, we might go over
max_sets before the memory is full.

Signed-off-by: Bas Nieuwenhuizen 
CC: 
Fixes: f4e499ec79147f4172f3669ae9dafd941aaeeb65


I think it would be good to follow the kernel style of quoting the title 
of the fixed commit, in this case:


Fixes: f4e499ec7914 ("radv: add initial non-conformant radv vulkan driver")


---
 src/amd/vulkan/radv_descriptor_set.c | 7 +--
 src/amd/vulkan/radv_private.h| 1 +
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_descriptor_set.c b/src/amd/vulkan/radv_descriptor_set.c
index 6d89d601de0..81291d10037 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -275,12 +275,13 @@ radv_descriptor_set_create(struct radv_device *device,
uint32_t layout_size = align_u32(layout->size, 32);
set->size = layout->size;
if (!cmd_buffer) {
-   if (pool->current_offset + layout_size <= pool->size) {
+   if (pool->current_offset + layout_size <= pool->size &&
+   pool->allocated_sets < pool->max_sets) {
set->bo = pool->bo;
set->mapped_ptr = (uint32_t*)(pool->mapped_ptr + pool->current_offset);
set->va = device->ws->buffer_get_va(set->bo) + pool->current_offset;
pool->current_offset += layout_size;
-
+   ++pool->allocated_sets;
} else {
int entry = pool->free_list, prev_entry = -1;
uint32_t offset;
@@ -417,6 +418,7 @@ VkResult radv_CreateDescriptorPool(
pool->full_list = 0;
pool->free_nodes[max_sets - 1].next = -1;
pool->max_sets = max_sets;
+   pool->allocated_sets = 0;

for (int i = 0; i  + 1 < max_sets; ++i)
pool->free_nodes[i].next = i + 1;
@@ -494,6 +496,7 @@ VkResult radv_ResetDescriptorPool(
radv_descriptor_set_destroy(device, pool, set, false);
}

+   pool->allocated_sets = 0;
pool->current_offset = 0;
pool->free_list = -1;
pool->full_list = 0;
diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index 7b1d8fb1f45..9c326dcef83 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -564,6 +564,7 @@ struct radv_descriptor_pool {
int free_list;
int full_list;
uint32_t max_sets;
+   uint32_t allocated_sets;
struct radv_descriptor_pool_free_node free_nodes[];
 };






Re: [Mesa-dev] [PATCH 1/5] st/mesa: stop using TGSI_OPCODE_CLAMP

2017-02-17 Thread Nicolai Hähnle

With Roland's comment on #5 addressed, the series is:

Reviewed-by: Nicolai Hähnle 

On 16.02.2017 23:00, Marek Olšák wrote:

From: Marek Olšák 

---
 src/mesa/state_tracker/st_atifs_to_tgsi.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/src/mesa/state_tracker/st_atifs_to_tgsi.c b/src/mesa/state_tracker/st_atifs_to_tgsi.c
index 9c4218e..64879f1 100644
--- a/src/mesa/state_tracker/st_atifs_to_tgsi.c
+++ b/src/mesa/state_tracker/st_atifs_to_tgsi.c
@@ -612,21 +612,20 @@ st_init_atifs_prog(struct gl_context *ctx, struct gl_program *prog)
 prog->arb.NumParameters = MAX_NUM_FRAGMENT_CONSTANTS_ATI + 2; /* 2 state variables for fog */
 }


 struct tgsi_atifs_transform {
struct tgsi_transform_context base;
struct tgsi_shader_info info;
const struct st_fp_variant_key *key;
bool first_instruction_emitted;
unsigned fog_factor_temp;
-   unsigned fog_clamp_imm;
 };

 static inline struct tgsi_atifs_transform *
 tgsi_atifs_transform(struct tgsi_transform_context *tctx)
 {
return (struct tgsi_atifs_transform *)tctx;
 }

 /* copied from st_cb_drawpixels_shader.c */
 static void
@@ -669,24 +668,20 @@ transform_instr(struct tgsi_transform_context *tctx,

if (ctx->first_instruction_emitted)
   goto transform_inst;

ctx->first_instruction_emitted = true;

if (ctx->key->fog) {
   /* add a new temp for the fog factor */
   ctx->fog_factor_temp = ctx->info.file_max[TGSI_FILE_TEMPORARY] + 1;
   tgsi_transform_temp_decl(tctx, ctx->fog_factor_temp);
-
-  /* add immediates for clamp */
-  ctx->fog_clamp_imm = ctx->info.immediate_count;
-  tgsi_transform_immediate_decl(tctx, 1.0f, 0.0f, 0.0f, 0.0f);
}

 transform_inst:
if (current_inst->Instruction.Opcode == TGSI_OPCODE_TEX) {
   /* fix texture target */
   unsigned newtarget = 
ctx->key->texture_targets[current_inst->Src[1].Register.Index];
   if (newtarget)
  current_inst->Texture.Texture = newtarget;

} else if (ctx->key->fog && current_inst->Instruction.Opcode == TGSI_OPCODE_MOV 
&&
@@ -783,31 +778,30 @@ transform_inst:
  inst.Instruction.Opcode = TGSI_OPCODE_EX2;
  inst.Instruction.NumDstRegs = 1;
  inst.Dst[0].Register.File  = TGSI_FILE_TEMPORARY;
  inst.Dst[0].Register.Index = ctx->fog_factor_temp;
  inst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZW;
  inst.Instruction.NumSrcRegs = 1;
  SET_SRC(&inst, 0, TGSI_FILE_TEMPORARY, ctx->fog_factor_temp, X, Y, Z, 
W);
  inst.Src[0].Register.Negate ^= 1;
  tctx->emit_instruction(tctx, &inst);
   }
-  /* f = CLAMP(f, 0.0, 1.0) */
+  /* f = saturate(f) */
   inst = tgsi_default_full_instruction();
-  inst.Instruction.Opcode = TGSI_OPCODE_CLAMP;
+  inst.Instruction.Opcode = TGSI_OPCODE_MOV;
   inst.Instruction.NumDstRegs = 1;
+  inst.Instruction.Saturate = 1;
   inst.Dst[0].Register.File  = TGSI_FILE_TEMPORARY;
   inst.Dst[0].Register.Index = ctx->fog_factor_temp;
   inst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZW;
-  inst.Instruction.NumSrcRegs = 3;
+  inst.Instruction.NumSrcRegs = 1;
   SET_SRC(&inst, 0, TGSI_FILE_TEMPORARY, ctx->fog_factor_temp, X, Y, Z, W);
-  SET_SRC(&inst, 1, TGSI_FILE_IMMEDIATE, ctx->fog_clamp_imm, Y, Y, Y, Y); 
// 0.0
-  SET_SRC(&inst, 2, TGSI_FILE_IMMEDIATE, ctx->fog_clamp_imm, X, X, X, X); 
// 1.0
   tctx->emit_instruction(tctx, &inst);

   /* REG0 = LRP(f, REG0, fogcolor) */
   inst = tgsi_default_full_instruction();
   inst.Instruction.Opcode = TGSI_OPCODE_LRP;
   inst.Instruction.NumDstRegs = 1;
   inst.Dst[0].Register.File  = TGSI_FILE_TEMPORARY;
   inst.Dst[0].Register.Index = reg0_index;
   inst.Dst[0].Register.WriteMask = TGSI_WRITEMASK_XYZW;
   inst.Instruction.NumSrcRegs = 3;
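
The reason the immediate declaration can go away is that a saturating MOV is just a clamp with the bounds fixed at 0.0 and 1.0, so no extra source operands are needed. A minimal sketch in plain C, assuming ordinary (non-NaN) inputs; clamp_ref and saturate are illustrative names and not TGSI or Mesa functions:

```c
#include <assert.h>

/* What the removed CLAMP computed, with lo and hi coming from immediates. */
static float
clamp_ref(float f, float lo, float hi)
{
   assert(lo <= hi);
   return f < lo ? lo : (f > hi ? hi : f);
}

/* What MOV with the Saturate modifier computes: the bounds are implicit. */
static float
saturate(float f)
{
   return clamp_ref(f, 0.0f, 1.0f);
}
```

For the fog factor this means one source register instead of three, and one immediate fewer in the shader.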





Re: [Mesa-dev] [PATCH 15/18] radeonsi: upload constants into VRAM instead of GTT

2017-02-17 Thread Nicolai Hähnle

On 16.02.2017 23:36, Marek Olšák wrote:

On Thu, Feb 16, 2017 at 4:21 PM, Nicolai Hähnle  wrote:

On 16.02.2017 13:53, Marek Olšák wrote:


From: Marek Olšák 

This lowers lgkm wait cycles by 30% on VI and normal conditions.
There might be a measurable improvement when CE is disabled (radeon)
or under L2 thrashing.



Good idea. I'm just wondering if all the users of const upload end up as
streaming writes? I hope we don't accidentally hit some place where writes
from the CPU end up extremely slow, e.g. where st/mesa uploads some
structures.


I think constant buffers always benefit from being in VRAM. If every
CU loads a value from a constant buffer, you'll get at least 16 TC L2
read requests on Fiji (each group of 4 CUs submits one), which can be
misses under thrashing. This is very different from "streaming" where
you expect to get exactly 1 read request for each piece of data.


Good point.



The small problem with VRAM uploads may be write combining. I don't
know the alignment at which it operates and how exactly it works. E.g.
if we get 2 16-byte uploads aligned to 32, there is an untouched hole
of 16 bytes. Does the hole have any effect on upload performance?
u_upload_mgr could fill all holes if it was a problem.


So some quick googling found this: 
https://fgiesen.wordpress.com/2013/01/29/write-combining-is-not-your-friend/ 
with the main three rules in the conclusion:


- Never read from write-combined memory.
- Try to keep writes sequential. This is good style even when it’s not 
strictly necessary. On processors with picky write-combining logic, you 
might also need to use volatile or some other way to cause the compiler 
not to reorder instructions.

- Don’t leave holes. Always write large, contiguous ranges.

All uses of u_upload_data should be fine (it's just a memcpy). Your 
example of 2 16-byte uploads aligned to 32 isn't ideal, but it's 
probably not terrible.
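
To make the three rules concrete, here is a minimal sketch of the "build locally, then copy once" pattern that a plain u_upload_data-style memcpy gives you. It assumes dst points at a mapped write-combined buffer; struct vert_sketch and upload_quad are hypothetical names, not st/mesa code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical vertex layout, three floats with no padding. */
struct vert_sketch {
   float x, y, z;
};

/*
 * 'dst' stands in for a mapped write-combined buffer (an assumption of
 * this sketch). The vertices are assembled in ordinary cached memory,
 * where the compiler may reorder stores freely, and then written out
 * with one sequential, hole-free copy. 'dst' is never read.
 */
static void
upload_quad(void *dst, size_t dst_size,
            float x0, float y0, float x1, float y1, float z)
{
   struct vert_sketch verts[4];

   /* Deliberately filled out of order; it does not matter here. */
   verts[3] = (struct vert_sketch){ x0, y1, z };
   verts[1] = (struct vert_sketch){ x1, y0, z };
   verts[0] = (struct vert_sketch){ x0, y0, z };
   verts[2] = (struct vert_sketch){ x1, y1, z };

   assert(dst_size >= sizeof(verts));
   memcpy(dst, verts, sizeof(verts));
}
```

Code that writes individual vertex fields straight into the mapping gives up that guarantee, which is the concern with the call sites listed below.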


Scanning st/mesa, I see the following potentially questionable pieces of 
code:


1) st_DrawAtlasBitmaps: If the compiler reorders the verts->XYZ writes, 
that could be bad. But this is obsolete functionality, so we don't care.

2) st_DrawTex: similar
3) st_draw_quad: Used for scissored/windowed clears. Re-ordering by the 
compiler could potentially hurt, we may want to look into that.

4) st_pbo_draw: same as st_draw_quad

The blitter only uses u_upload_data, which is fine.

Also, none of the issues affect large uploads, so I think we're good.

The patch is: Reviewed-by: Nicolai Hähnle 



Also, Feral's games upload directly to VRAM all the time. This patch
is nothing compared to what they're doing.

Marek





Re: [Mesa-dev] V2 GLSL IR & TGSI on-disk shader cache

2017-02-17 Thread Timothy Arceri



On 17/02/17 19:26, Nicolai Hähnle wrote:

On 16.02.2017 23:55, Timothy Arceri wrote:

On 17/02/17 01:27, Nicolai Hähnle wrote:

Hi Timothy,

thank you for the update. I had a look at all the patches now, and
especially the glsl parts look basically ready to go. There are only
minor comments for which I don't need a full resend of the series, and
an open question on patch 22 where it would be nice to get a proper
answer.


Thanks! It's a relief to finally get this stuff reviewed.


It's definitely good to have concrete progress. There's always the
possibility that we still missed some metadata, so I wonder what your
testing plan is for this. Games, obviously, but perhaps consecutive runs
of piglit give useful info as well?


Yes, I've been doing consecutive piglit runs throughout development. I 
still need to do some runs of the CTS, but at this point I'm pretty 
confident we should have everything. I've done a huge amount of 
re-factoring of the structs that store this meta-data to reduce the 
amount of data we need to store both in the cache and at run time, so if 
I missed anything it should be fairly minor and easy to fix.






I'm not really
sure about an authoritative source for patch 22, although I'm happy to
add the group check, I think it makes some sense.


Please push the patch; in its current form it's better than nothing, and
even the loader doesn't do fancier checks. It has my R-b.


ok, thanks :)





On 14.02.2017 01:52, Timothy Arceri wrote:

Changes in V2:

- no longer mess around storing/restoring any pointers
- implemented support for compute shaders
- dropped some patches only needed by i965 for now
- add fallback support for shader source that is changed after it's
compiled (piglit test on the list)
- simplify cache enable for r600/radeonsi by unconditionally creating
the cache in screen_create.


Remind me how each part of the cache can be disabled?


We can't really enable GLSL IR cache by itself (I guess we could enable
TGSI but that wouldn't make much sense). The code simply checks if
ctx->Cache != NULL in various locations which means cache is enabled.


There should still be some environment variable to disable the cache.
R600_DEBUG=nocache, for example, since it's the driver that creates the
actual disk cache structure now.


oh, that would be MESA_GLSL_CACHE_DISABLE=1 which is checked when the 
driver calls disk_cache_create()
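
A minimal sketch of how the two pieces fit together: the creation function consults the environment and returns NULL when the cache is disabled, and every user only tests the pointer. The helper names and struct cache_sketch are hypothetical, and the parsing is simplified to the literal value 1 mentioned above; Mesa's own environment handling may accept more spellings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified parser (assumption): only the literal "1" disables the cache. */
static bool
env_disables_cache(const char *value)
{
   return value != NULL && strcmp(value, "1") == 0;
}

/* Hypothetical stand-in for the real disk cache object. */
struct cache_sketch {
   int placeholder;
};

/* Returns NULL when disabled, so callers only need a pointer check. */
static struct cache_sketch *
cache_create_sketch(void)
{
   static struct cache_sketch cache;

   if (env_disables_cache(getenv("MESA_GLSL_CACHE_DISABLE")))
      return NULL;
   return &cache;
}

/* The kind of check the code performs "in various locations". */
static bool
cache_is_enabled(const struct cache_sketch *cache)
{
   return cache != NULL;
}
```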




Cheers,
Nicolai





Thanks,
Nicolai



- make glsl version (the version reported as supported by the
implementation at
  compile time) part of the sha1 input rather than adding mesa string
to the cache object itself.
  This avoids fallbacks and should be more reliable.
- add any drirc options as sha1 inputs
- some other tidy ups suggested by Nicolai and Marek

In future we probably want to check what other env vars have been set,
but for now the gl/glsl version and drirc options should cover most
things.









Re: [Mesa-dev] [PATCH] mesa/formatquery: use consistent local function names

2017-02-17 Thread Samuel Iglesias Gonsálvez
Reviewed-by: Samuel Iglesias Gonsálvez 

On Sat, 2017-02-11 at 17:21 +0100, Alejandro Piñeiro wrote:
> ---
>  src/mesa/main/formatquery.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/src/mesa/main/formatquery.c
> b/src/mesa/main/formatquery.c
> index 29df958..598d34d 100644
> --- a/src/mesa/main/formatquery.c
> +++ b/src/mesa/main/formatquery.c
> @@ -718,8 +718,8 @@ _mesa_query_internal_format_default(struct
> gl_context *ctx, GLenum target,
>   * arb_internalformat_query2 spec.
>   */
>  static GLenum
> -equivalentSizePname(GLenum target,
> -GLenum pname)
> +_equivalent_size_pname(GLenum target,
> +   GLenum pname)
>  {
> switch (target) {
> case GL_TEXTURE_1D:
> @@ -763,7 +763,7 @@ equivalentSizePname(GLenum target,
>   * per-se, so we can't just call _mesa_get_texture_dimension
> directly.
>   */
>  static GLint
> -get_target_dimensions(GLenum target)
> +_get_target_dimensions(GLenum target)
>  {
> switch(target) {
> case GL_TEXTURE_BUFFER:
> @@ -788,7 +788,7 @@ get_target_dimensions(GLenum target)
>   *  ."
>   */
>  static GLint
> -get_min_dimensions(GLenum pname)
> +_get_min_dimensions(GLenum pname)
>  {
> switch(pname) {
> case GL_MAX_WIDTH:
> @@ -807,7 +807,7 @@ get_min_dimensions(GLenum pname)
>   * dimensions.
>   */
>  static bool
> -is_multisample_target(GLenum target)
> +_is_multisample_target(GLenum target)
>  {
> switch(target) {
> case GL_TEXTURE_2D_MULTISAMPLE:
> @@ -1016,12 +1016,12 @@ _mesa_GetInternalformativ(GLenum target,
> GLenum internalformat, GLenum pname,
> * "If the resource does not have at least two dimensions, or
> if the
> * resource is unsupported, zero is returned."
> */
> -  dimensions = get_target_dimensions(target);
> -  min_dimensions = get_min_dimensions(pname);
> +  dimensions = _get_target_dimensions(target);
> +  min_dimensions = _get_min_dimensions(pname);
>    if (dimensions < min_dimensions)
>   goto end;
>  
> -  get_pname = equivalentSizePname(target, pname);
> +  get_pname = _equivalent_size_pname(target, pname);
>    if (get_pname == 0)
>   goto end;
>  
> @@ -1055,7 +1055,7 @@ _mesa_GetInternalformativ(GLenum target, GLenum
> internalformat, GLenum pname,
> * returned as MAX_HEIGHT or MAX_DEPTH */
>    for (i = 0; i < 4; i++) {
>   if (max_dimensions_pnames[i] == GL_SAMPLES &&
> - !is_multisample_target(target))
> + !_is_multisample_target(target))
>  continue;
>  
>   _mesa_GetInternalformativ(target, internalformat,



Re: [Mesa-dev] [PATCH 12/18] radeonsi: use a clever alignment for descriptor uploads

2017-02-17 Thread Nicolai Hähnle

On 16.02.2017 23:16, Marek Olšák wrote:

On Thu, Feb 16, 2017 at 4:17 PM, Nicolai Hähnle  wrote:

On 16.02.2017 13:53, Marek Olšák wrote:


From: Marek Olšák 

Non-VBO descriptors won't be smaller than the cache line, so simply use
the cache line size.



What about SSBOs? Those are just 16 bytes.

Also, shader images are just 32 bytes (though we may have to bump this to 64


We always upload the whole list for non-VBO descriptors, which is
num_slot * slot_size. That's a lot more than a cache line. We could
certainly optimize this for both CE and non-CE paths. The CE path
evicts more cache lines needlessly, while the non-CE path has to
upload more data.

Since only the necessary number of VBO descriptors is uploaded, we can
hang the hardware if the vertex shader is using more inputs than the
vertex element state, which luckily can't happen with st/mesa.


Ah, thanks for the reminder. This patch has my R-b as well.



bytes for multisample image support -- except that it's unclear how to write
to a multisample shader image while keeping the FMASK).


I wouldn't like to support MSAA image stores.


We may not have a choice if games expect to be able to use shader image 
functionality with MSAA. But I agree that it's ugly.


Nicolai


Re: [Mesa-dev] [PATCH 09/18] radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start

2017-02-17 Thread Nicolai Hähnle

On 16.02.2017 23:06, Marek Olšák wrote:

On Thu, Feb 16, 2017 at 4:10 PM, Nicolai Hähnle  wrote:

On 16.02.2017 13:53, Marek Olšák wrote:


From: Marek Olšák 

start can only be non-zero with MultiDrawElements, which is unlikely
to occur with UNSIGNED_BYTE indices.
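
As an illustration of what such a fallback has to get right, here is a minimal sketch of widening 8-bit indices to 16-bit while honouring a non-zero start. This is an assumption about the shape of the conversion, not the radeonsi code; convert_ubyte_indices is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Widen 'count' indices beginning at element 'start' of an 8-bit index
 * buffer. The converted buffer starts at 0, so a draw using it must use
 * start = 0 afterwards.
 */
static void
convert_ubyte_indices(const uint8_t *in, unsigned start, unsigned count,
                      uint16_t *out)
{
   assert(in != NULL && out != NULL);

   for (unsigned i = 0; i < count; i++)
      out[i] = in[start + i];
}
```

Reading from in[i] instead of in[start + i] is the kind of mistake that only shows up when start is non-zero, which is why a dedicated test would be useful.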



Do we have a test case for this?


Sadly we don't.


Could you add one?

Nicolai


Marek





Re: [Mesa-dev] [PATCH] mesa/formatquery: use consistent local function names

2017-02-17 Thread Alejandro Piñeiro
Gentle ping for a really trivial patch.

On 11/02/17 17:21, Alejandro Piñeiro wrote:
> ---
>  src/mesa/main/formatquery.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/src/mesa/main/formatquery.c b/src/mesa/main/formatquery.c
> index 29df958..598d34d 100644
> --- a/src/mesa/main/formatquery.c
> +++ b/src/mesa/main/formatquery.c
> @@ -718,8 +718,8 @@ _mesa_query_internal_format_default(struct gl_context 
> *ctx, GLenum target,
>   * arb_internalformat_query2 spec.
>   */
>  static GLenum
> -equivalentSizePname(GLenum target,
> -GLenum pname)
> +_equivalent_size_pname(GLenum target,
> +   GLenum pname)
>  {
> switch (target) {
> case GL_TEXTURE_1D:
> @@ -763,7 +763,7 @@ equivalentSizePname(GLenum target,
>   * per-se, so we can't just call _mesa_get_texture_dimension directly.
>   */
>  static GLint
> -get_target_dimensions(GLenum target)
> +_get_target_dimensions(GLenum target)
>  {
> switch(target) {
> case GL_TEXTURE_BUFFER:
> @@ -788,7 +788,7 @@ get_target_dimensions(GLenum target)
>   *  ."
>   */
>  static GLint
> -get_min_dimensions(GLenum pname)
> +_get_min_dimensions(GLenum pname)
>  {
> switch(pname) {
> case GL_MAX_WIDTH:
> @@ -807,7 +807,7 @@ get_min_dimensions(GLenum pname)
>   * dimensions.
>   */
>  static bool
> -is_multisample_target(GLenum target)
> +_is_multisample_target(GLenum target)
>  {
> switch(target) {
> case GL_TEXTURE_2D_MULTISAMPLE:
> @@ -1016,12 +1016,12 @@ _mesa_GetInternalformativ(GLenum target, GLenum 
> internalformat, GLenum pname,
> * "If the resource does not have at least two dimensions, or if the
> * resource is unsupported, zero is returned."
> */
> -  dimensions = get_target_dimensions(target);
> -  min_dimensions = get_min_dimensions(pname);
> +  dimensions = _get_target_dimensions(target);
> +  min_dimensions = _get_min_dimensions(pname);
>if (dimensions < min_dimensions)
>   goto end;
>  
> -  get_pname = equivalentSizePname(target, pname);
> +  get_pname = _equivalent_size_pname(target, pname);
>if (get_pname == 0)
>   goto end;
>  
> @@ -1055,7 +1055,7 @@ _mesa_GetInternalformativ(GLenum target, GLenum 
> internalformat, GLenum pname,
> * returned as MAX_HEIGHT or MAX_DEPTH */
>for (i = 0; i < 4; i++) {
>   if (max_dimensions_pnames[i] == GL_SAMPLES &&
> - !is_multisample_target(target))
> + !_is_multisample_target(target))
>  continue;
>  
>   _mesa_GetInternalformativ(target, internalformat,



Re: [Mesa-dev] [PATCH] mesa: Clamp GetUniformuiv values to be >= 0.

2017-02-17 Thread Nicolai Hähnle

On 16.02.2017 20:24, Antía Puentes wrote:

On lun, 2016-12-12 at 10:43 +0100, Nicolai Hähnle wrote:

On 12.12.2016 00:25, Kenneth Graunke wrote:


Section 2.2.2 (Data Conversions For State Query Commands) of the
OpenGL 4.5 October 24th 2016 specification says:

"If a command returning unsigned integer data is called, such as
 GetSamplerParameterIuiv, negative values are clamped to zero."

Fixes GL44-CTS.gpu_shader_fp64.state_query.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/uniform_query.cpp | 48
+
 1 file changed, 39 insertions(+), 9 deletions(-)

Hey Nicolai,

I wrote a similar patch a while back, but never got around to
sending it,
since I realized that the gl45release branch expects our current
behavior,
and the change to make the CTS expect clamping is only on the
master branch.

Apparently I made some additional changes, compared to yours.  I
figured
I'd send this along and let you see if you think any of my extra
changes
are still necessary.  If so, feel free to fold them into your
patch.

I also think we need to fix several other glGet* commands...it's
just that
this is the only one currently tested.  A bunch work because the
values
returned can't be negative.

I think your patch is a strict superset of what mine does and should
be
used instead. I do have one comment below, with that fixed it has my
R-b.


This patch was never pushed, was it? and GL45-CTS.gpu_shader_fp64.state_query
fails in the new vk-gl-cts repository because it expects these negative
values to be clamped.
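
For reference, the clamping rule quoted from section 2.2.2 comes down to something like the sketch below. The function names are hypothetical; the truncation and the saturation at the upper bound in the double variant are choices made for this sketch to keep the cast well-defined, not something taken from the spec text quoted above.

```c
#include <stdint.h>

/* Negative integers are returned as zero from an unsigned query. */
static uint32_t
int_to_uint_query(int32_t v)
{
   return v < 0 ? 0u : (uint32_t)v;
}

/* Same rule for floating-point state such as fp64 uniforms. */
static uint32_t
double_to_uint_query(double v)
{
   if (!(v > 0.0))            /* negative values, zero and NaN */
      return 0u;
   if (v >= 4294967295.0)     /* sketch-only guard against overflow */
      return UINT32_MAX;
   return (uint32_t)v;        /* truncates; rounding mode is an assumption */
}
```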


Fine with me. But again, take Ken's patch, not mine :)

Cheers,
Nicolai


Re: [Mesa-dev] [PATCH 09/13] gallium: do not #include foo.h within extern C {}

2017-02-17 Thread Nicolai Hähnle

On 16.02.2017 16:16, Emil Velikov wrote:

From: Emil Velikov 

Analogous to previous commit.

Signed-off-by: Emil Velikov 


Patches 2-7 & 9:

Reviewed-by: Nicolai Hähnle 


---
 src/gallium/auxiliary/tgsi/tgsi_util.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.h 
b/src/gallium/auxiliary/tgsi/tgsi_util.h
index 83a930b69c..aa4606d0b2 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_util.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_util.h
@@ -28,12 +28,12 @@
 #ifndef TGSI_UTIL_H
 #define TGSI_UTIL_H

+#include "pipe/p_shader_tokens.h"
+
 #if defined __cplusplus
 extern "C" {
 #endif

-#include "pipe/p_shader_tokens.h"
-
 struct tgsi_src_register;
 struct tgsi_full_src_register;
 struct tgsi_full_instruction;
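
The pattern the patch moves to, shown on a small self-contained header. This is a sketch only: the header name and sketch_pack are made up for illustration. The point is that every #include sits above the linkage block, so an included header is never pulled into extern "C" by accident when the file is compiled as C++.

```c
/* sketch_util.h (hypothetical header, for illustration only) */
#ifndef SKETCH_UTIL_H
#define SKETCH_UTIL_H

/* Includes come first, outside any linkage block. */
#include <stdint.h>

#if defined __cplusplus
extern "C" {
#endif

/* Only this header's own declarations live inside the block. */
static inline uint32_t
sketch_pack(uint8_t lo, uint8_t hi)
{
   return (uint32_t)lo | ((uint32_t)hi << 8);
}

#if defined __cplusplus
}
#endif

#endif /* SKETCH_UTIL_H */
```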





Re: [Mesa-dev] [PATCH] configure.ac: Drop LLVM compiler flags more radically

2017-02-17 Thread Michel Dänzer
On 17/02/17 06:01 PM, Eero Tamminen wrote:
> Hi,
> 
> On 10.02.2017 02:59, Michel Dänzer wrote:
>> On 09/02/17 10:50 PM, Emil Velikov wrote:
>>> On 9 February 2017 at 08:07, Michel Dänzer  wrote:
>>>> From: Michel Dänzer 
>>>>
>>>> Drop all -m*, -W*, -O*, -g* and -f* flags, with the exception of
>>>> -fno-rtti, which must be used if it's part of the llvm-config
>>>> --cxxflags
>>>> output. We don't want LLVM to dictate the flags we use, and it can even
>>>> cause build failures, e.g. if LLVM and Mesa are built with different
>>>> compilers.
> 
> Out of curiosity, where this stuff is applied?
> 
> (If you're removing all of them, result is hard to debug, non-optimized
> binary...)
> 
> 
>>> Yes, please !
>>> Reviewed-by: Emil Velikov 
>>
>>
>>> Out of curiosity:
>>> Are you speaking of personal experience ? What was stored in the
>>> c/cpp/cxx flags that triggered build failure ?
>>
>> Building LLVM with clang 4.0 resulted in llvm-config --cxxflags
>> containing -Wstring-conversion and -fcolor-diagnostics, which aren't
>> supported by gcc 6.3.
> 
> There are generic flags that have been supported for a few decades, which
> are not going to change, like -O[1-3], -Wall, -g.  Only the specific
> optimization & warning flags are something that differ between compiler
> versions (as new ones get added).

strip_unwanted_llvm_flags() only affects LLVM_CXXFLAGS, so this change
doesn't remove any compiler flags used by Mesa itself, it just prevents
llvm-config from unnecessarily adding potentially harmful compiler flags
when compiling some LLVM related Mesa code.
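
The filtering rule from the commit message, expressed in C purely for illustration. The real strip_unwanted_llvm_flags() lives in configure.ac and is not written in C; keep_llvm_flag is a hypothetical name and this is only a sketch of the described behaviour.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Decide whether one token of the llvm-config --cxxflags output would
 * survive: -m*, -W*, -O*, -g* and -f* are dropped, -fno-rtti is kept.
 */
static bool
keep_llvm_flag(const char *flag)
{
   static const char *const dropped_prefixes[] = {
      "-m", "-W", "-O", "-g", "-f"
   };
   const size_t n = sizeof(dropped_prefixes) / sizeof(dropped_prefixes[0]);

   /* The one exception named in the commit message. */
   if (strcmp(flag, "-fno-rtti") == 0)
      return true;

   for (size_t i = 0; i < n; i++) {
      if (strncmp(flag, dropped_prefixes[i],
                  strlen(dropped_prefixes[i])) == 0)
         return false;
   }

   /* Everything else, such as -I and -D options, passes through. */
   return true;
}
```

Flags like -Wstring-conversion and -fcolor-diagnostics are filtered out this way, while Mesa's own CFLAGS and CXXFLAGS are never passed through the filter.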


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH] configure.ac: Drop LLVM compiler flags more radically

2017-02-17 Thread Eero Tamminen

Hi,

On 10.02.2017 02:59, Michel Dänzer wrote:

On 09/02/17 10:50 PM, Emil Velikov wrote:

On 9 February 2017 at 08:07, Michel Dänzer  wrote:

From: Michel Dänzer 

Drop all -m*, -W*, -O*, -g* and -f* flags, with the exception of
-fno-rtti, which must be used if it's part of the llvm-config --cxxflags
output. We don't want LLVM to dictate the flags we use, and it can even
cause build failures, e.g. if LLVM and Mesa are built with different
compilers.


Out of curiosity, where this stuff is applied?

(If you're removing all of them, result is hard to debug, non-optimized 
binary...)




Yes, please !
Reviewed-by: Emil Velikov 




Out of curiosity:
Are you speaking of personal experience ? What was stored in the
c/cpp/cxx flags that triggered build failure ?


Building LLVM with clang 4.0 resulted in llvm-config --cxxflags
containing -Wstring-conversion and -fcolor-diagnostics, which aren't
supported by gcc 6.3.


There are generic flags that have been supported for a few decades, which 
are not going to change, like -O[1-3], -Wall, -g.  Only the specific 
optimization & warning flags are something that differ between compiler 
versions (as new ones get added).



- Eero



Re: [Mesa-dev] [RFC] spec: MESA_program_binary

2017-02-17 Thread Nicolai Hähnle

On 17.02.2017 09:31, Ernst Sjöstrand wrote:

Also, what if the user switches between say AMDGPU-PRO and RadeonSI?


I'd expect radeonsi to use the single Mesa enum, while AMDGPU-PRO 
obviously uses an AMD-assigned enum.


Cheers,
Nicolai



Regards
//Ernst

2017-02-17 1:33 GMT+01:00 Timothy Arceri :



On 17/02/17 10:44, Ian Romanick wrote:

On 02/15/2017 11:58 PM, Timothy Arceri wrote:



On 16/02/17 17:55, Tapani Pälli wrote:


On 02/16/2017 04:52 AM, Timothy Arceri wrote:

In order to add functionality to ARB_get_program_binary
we need
binary format enums.


I've understood that this is a driver internal
enumeration. When
application gets the binary it also receives enum
(integer value) what
format we gave. Then when loading application needs to
query what
formats are supported by the implementation and load the
correct binary.
We just need to internally make agreement on format list
and return
correct one matching the current driver in use?


Not that it's actually likely to happen but if we were to
only have a
single MESA enum an application could only distribute a
single binary.


Applications really, really, *REALLY* should not distribute binaries
retrieved from the driver.  The intention of this extension is for
applications to implement their own shader cache, for example, at
application installation.  The driver can reject the binary at
any time
for any reason.  Driver changes, hardware changes, OS changes,
phase of
the moon, etc.

Looking at the GLES extension registry, it appears that the other
vendors have just a single binary for all the hardware they
make.  Based
on that, having a single Mesa enum isn't an insane idea.  We
would just
need to agree on the format of the header so that the driver
receiving
the blob could determine which driver generated the blob.
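
A minimal sketch of what such an agreed header could look like with a single Mesa enum. Nothing here is a proposed or existing format: struct blob_header_sketch, its fields and blob_matches_driver are all hypothetical. The receiving driver would reject any blob that it did not produce itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical common header placed in front of every program binary. */
struct blob_header_sketch {
   char driver_name[16];   /* NUL-terminated, e.g. "radeonsi" */
   uint32_t payload_size;  /* number of bytes following the header */
};

/* Returns true only if the blob is well-formed and names 'driver'. */
static bool
blob_matches_driver(const void *blob, size_t size, const char *driver)
{
   struct blob_header_sketch hdr;

   if (blob == NULL || size < sizeof(hdr))
      return false;

   /* Copy out instead of casting, the blob may be unaligned. */
   memcpy(&hdr, blob, sizeof(hdr));

   if (hdr.driver_name[sizeof(hdr.driver_name) - 1] != '\0')
      return false;
   if (hdr.payload_size != size - sizeof(hdr))
      return false;

   return strcmp(hdr.driver_name, driver) == 0;
}
```

A real header would presumably also carry a build or version identifier, since the driver can reject a binary for any reason.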


The only other thing to consider with a single enum is that it will
require a laptop with an Intel cpu and Nvidia gpu for example to
recompile the binary if the user were to switch between using the
Intel and Nvidia gpus. This might happen depending on if the laptop
is plugged into a power source or not.

If we don't care about this then one enum is fine.



e.g either for AMD, INTEL or NVIDIA but not one for each.
That is unless
we were to compile and pack all gpu vendor binaries at the
same time
which seems overly complicated and expensive.

I could see an internal id being used for gpu generations
from hardware
vendors.

---

Techland games such as Dead Island and Dying Light
make use of
GetProgramBinary(). My current guess is the Dead
Island crash
https://bugs.freedesktop.org/show_bug.cgi?id=85564

is caused
due to buggy handling of this feature not being
available.

Anyway I'm not sure how we go about getting Khronos
to assign
enums for the binary formats but thought I'd send
this to the
list for discussion.


There's a two step process:

1. Vendors request a block of values via the Khronos internal
bugzilla.

2. When the spec is ready, another bug is submitted requesting
the spec
be published.

Mesa might still have some available enums assigned to it.  I'll
have to
check...

 docs/specs/MESA_program_binary.txt | 78
++
 1 file changed, 78 insertions(+)
 create mode 100644 docs/specs/MESA_program_binary.txt

diff --git a/docs/specs/MESA_program_binary.txt
b/docs/specs/MESA_program_binary.txt
new file mode 100644
index 000..b34e42e
--- /dev/null
+++ b/docs/specs/MESA_program_binary.txt
@@ -0,0 +1,78 @@
+Name
+
+MESA_program_binary
+
+Name Strings
+
+GL_MESA_program_binary
+
  

Re: [Mesa-dev] [RFC] spec: MESA_program_binary

2017-02-17 Thread Ernst Sjöstrand
Also, what if the user switches between say AMDGPU-PRO and RadeonSI?

Regards
//Ernst

2017-02-17 1:33 GMT+01:00 Timothy Arceri :

>
>
> On 17/02/17 10:44, Ian Romanick wrote:
>
>> On 02/15/2017 11:58 PM, Timothy Arceri wrote:
>>
>>>
>>>
>>> On 16/02/17 17:55, Tapani Pälli wrote:
>>>

 On 02/16/2017 04:52 AM, Timothy Arceri wrote:

> In order to add functionality to ARB_get_program_binary we need
> binary format enums.
>

 I've understood that this is a driver internal enumeration. When
 application gets the binary it also receives enum (integer value) what
 format we gave. Then when loading application needs to query what
 formats are supported by the implementation and load the correct binary.
 We just need to internally make agreement on format list and return
 correct one matching the current driver in use?

>>>
>>> Not that it's actually likely to happen but if we were to only have a
>>> single MESA enum an application could only distribute a single binary.
>>>
>>
>> Applications really, really, *REALLY* should not distribute binaries
>> retrieved from the driver.  The intention of this extension is for
>> applications to implement their own shader cache, for example, at
>> application installation.  The driver can reject the binary at any time
>> for any reason.  Driver changes, hardware changes, OS changes, phase of
>> the moon, etc.
>>
>> Looking at the GLES extension registry, it appears that the other
>> vendors have just a single binary for all the hardware they make.  Based
>> on that, having a single Mesa enum isn't an insane idea.  We would just
>> need to agree on the format of the header so that the driver receiving
>> the blob could determine which driver generated the blob.
>>
>
> The only other thing to consider with a single enum is that it will
> require a laptop with an Intel cpu and Nvidia gpu for example to recompile
> the binary if the user were to switch between using the Intel and Nvidia
> gpus. This might happen depending on if the laptop is plugged into a power
> source or not.
>
> If we don't care about this then one enum is fine.
>
>
>
>> e.g either for AMD, INTEL or NVIDIA but not one for each. That is unless
>>> we were to compile and pack all gpu vendor binaries at the same time
>>> which seems overly complicated and expensive.
>>>
>>> I could see an internal id being used for gpu generations from hardware
>>> vendors.
>>>
>>> ---
>
> Techland games such as Dead Island and Dying Light make use of
> GetProgramBinary(). My current guess is the Dead Island crash
> https://bugs.freedesktop.org/show_bug.cgi?id=85564 is caused
> due to buggy handling of this feature not being available.
>
> Anyway I'm not sure how we go about getting Khronos to assign
> enums for the binary formats but thought I'd send this to the
> list for discussion.
>

>> There's a two step process:
>>
>> 1. Vendors request a block of values via the Khronos internal bugzilla.
>>
>> 2. When the spec is ready, another bug is submitted requesting the spec
>> be published.
>>
>> Mesa might still have some available enums assigned to it.  I'll have to
>> check...
>>
>>  docs/specs/MESA_program_binary.txt | 78
> ++
>  1 file changed, 78 insertions(+)
>  create mode 100644 docs/specs/MESA_program_binary.txt
>
> diff --git a/docs/specs/MESA_program_binary.txt
> b/docs/specs/MESA_program_binary.txt
> new file mode 100644
> index 000..b34e42e
> --- /dev/null
> +++ b/docs/specs/MESA_program_binary.txt
> @@ -0,0 +1,78 @@
> +Name
> +
> +MESA_program_binary
> +
> +Name Strings
> +
> +GL_MESA_program_binary
> +
> +Contact
> +
> +Timothy Arceri (tarceri 'at' itsqueeze.com)
> +
> +Status
> +
> +Complete.
> +
> +Version
> +
> +Last Modified Date: February 16, 2017
> +Revision: #1
> +
> +Number
> +
> +???
> +
> +Dependencies
> +
> +OpenGL ES 2.0 is required.
> +
> +Written based on the wording of the OpenGL ES 2.0 specification.
> +
> +This extension interacts with OES_get_program_binary.
> +
> +Overview
> +
> +MESA provides drivers for multiple hardware vendors. This
> extension
> +provides binary formats in order to avoid conflicts between
> drivers when
> +loading precompiled binaries.
> +
> +New Procedures and Functions
> +
> +None.
> +
> +New Tokens
> +
> +Accepted by the <binaryformat> parameter of ShaderBinary:
> +
> +MESA_PROGRAM_BINARY_AMD
> +MESA_PROGRAM_BINARY_NV 
> +MESA_PROGRAM_BINARY_INTEL  
> +MESA_PROGRAM_BINARY_BCOM   
> +

Re: [Mesa-dev] V2 GLSL IR & TGSI on-disk shader cache

2017-02-17 Thread Nicolai Hähnle

On 16.02.2017 23:55, Timothy Arceri wrote:

On 17/02/17 01:27, Nicolai Hähnle wrote:

Hi Timothy,

thank you for the update. I had a look at all the patches now, and
especially the glsl parts look basically ready to go. There are only
minor comments for which I don't need a full resend of the series, and
an open question on patch 22 where it would be nice to get a proper
answer.


Thanks! It's a relief to finally get this stuff reviewed.


It's definitely good to have concrete progress. There's always the 
possibility that we still missed some metadata, so I wonder what your 
testing plan is for this. Games, obviously, but perhaps consecutive runs 
of piglit give useful info as well?




I'm not really
sure about an authoritative source for patch 22, although I'm happy to
add the group check, I think it makes some sense.


Please push the patch; in its current form it's better than nothing, and 
even the loader doesn't do fancier checks. It has my R-b.




On 14.02.2017 01:52, Timothy Arceri wrote:

Changes in V2:

- no longer mess around storing/restoring any pointers
- implemented support for compute shaders
- dropped some patches only needed by i965 for now
- add fallback support for shader source that is changed after it's
compiled (piglit test on the list)
- simplify cache enable for r600/radeonsi by unconditionally creating
the cache in screen_create.


Remind me how each part of the cache can be disabled?


We can't really enable GLSL IR cache by itself (I guess we could enable
TGSI but that wouldn't make much sense). The code simply checks if
ctx->Cache != NULL in various locations which means cache is enabled.


There should still be some environment variable to disable the cache. 
R600_DEBUG=nocache, for example, since it's the driver that creates the 
actual disk cache structure now.


Cheers,
Nicolai





Thanks,
Nicolai



- make glsl version (the version reported as supported by the
implementation at
  compile time) part of the sha1 input rather than adding mesa string
to the cache object itself.
  This avoids fallbacks and should be more reliable.
- add any drirc options as sha1 inputs
- some other tidy ups suggested by Nicolai and Marek

In future we probably want to check what other env vars have been set,
but for now the gl/glsl version and drirc options should cover most
things.





