[Mesa-dev] [PATCH] llvmpipe: Link tests with CLOCK_LIB.

2016-12-02 Thread Vinson Lee
Fix linking error with 'make check'.

  CXXLD  lp_test_format
../../../../src/gallium/auxiliary/.libs/libgallium.a(os_time.o): In function `os_time_get_nano':
/home/jenkins/workspace/mesa/src/gallium/auxiliary/os/os_time.c:59: undefined reference to `clock_gettime'

Signed-off-by: Vinson Lee 
---
 src/gallium/drivers/llvmpipe/Makefile.am |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/llvmpipe/Makefile.am b/src/gallium/drivers/llvmpipe/Makefile.am
index 85ae0ae13d89..562c2d6321c5 100644
--- a/src/gallium/drivers/llvmpipe/Makefile.am
+++ b/src/gallium/drivers/llvmpipe/Makefile.am
@@ -54,7 +54,8 @@ TEST_LIBS = \
$(top_builddir)/src/util/libmesautil.la \
$(LLVM_LIBS) \
$(DLOPEN_LIBS) \
-   $(PTHREAD_LIBS)
+   $(PTHREAD_LIBS) \
+   $(CLOCK_LIB)
 
 lp_test_format_SOURCES = lp_test_format.c lp_test_main.c
 lp_test_format_LDADD = $(TEST_LIBS)
-- 
1.7.9.5
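
For readers unfamiliar with the failure above: on glibc releases older than 2.17, clock_gettime() lives in librt rather than libc, so any binary that pulls in a caller of it has to link with -lrt. The sketch below is a minimal illustration of such a caller. The helper name example_time_get_nano is hypothetical and is not Mesa's os_time_get_nano(); it is assumed here that CLOCK_LIB expands to the needed library on affected systems.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: returns a monotonic timestamp in nanoseconds.
 * Any program containing this call needs -lrt on older glibc, which is
 * the kind of dependency the patch above adds to the test link line. */
static int64_t
example_time_get_nano(void)
{
   struct timespec ts;
   int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

   assert(ret == 0);
   (void)ret; /* avoid an unused-variable warning when NDEBUG is set */
   return (int64_t)ts.tv_sec * INT64_C(1000000000) + ts.tv_nsec;
}
```

Building this with "cc example.c" on such a system would be expected to fail with the same undefined reference, while "cc example.c -lrt" links.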

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] configure.ac: Strip patch version from LLVM version.

2016-12-02 Thread Vinson Lee
HAVE_LLVM variable included the patch version if the LLVM version had a
patch version.

For LLVM version '4.0.0', HAVE_LLVM would be '0x0400.0'.

Fixes: 45574ab2e92f ("configure.ac: better detection of LLVM version")
Signed-off-by: Vinson Lee 
---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index f62bc61e5025..3b8b32485ae1 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2195,7 +2195,7 @@ if test "x$enable_gallium_llvm" = xyes || test "x$HAVE_RADEON_VULKAN" = xyes; th
 if test -n "${LLVM_VERSION_MAJOR}"; then
 LLVM_VERSION_INT="${LLVM_VERSION_MAJOR}0${LLVM_VERSION_MINOR}"
 else
-LLVM_VERSION_INT=`echo $LLVM_VERSION | sed -e 's/\([[0-9]]\)\.\([[0-9]]\)/\10\2/g'`
+LLVM_VERSION_INT=`echo $LLVM_VERSION | sed -e 's/\([[0-9]]\)\.\([[0-9]]\).*/\10\2/g'`
 fi
 
 LLVM_REQUIRED_VERSION_MAJOR="3"
-- 
2.7.4
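
The effect of the fix can be restated outside of sed. The following is only a C analogue of the intended result (major digit, a literal 0, minor digit, with anything after the minor version ignored); the function name example_llvm_version_int is made up for illustration, and the configure script itself keeps using sed as shown in the diff.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical analogue of the fixed sed expression: parse "major.minor"
 * and ignore any trailing ".patch" component.
 * Returns -1 when the string does not start with "<int>.<int>". */
static int
example_llvm_version_int(const char *version)
{
   int major = 0, minor = 0;

   assert(version != NULL);
   if (sscanf(version, "%d.%d", &major, &minor) != 2)
      return -1;
   return major * 100 + minor;
}
```

With this, "4.0.0" and "4.0" both map to 400, which corresponds to the patch-free value the commit message asks for.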



Re: [Mesa-dev] [PATCH] i915: Stop claiming GL 2.1 support.

2016-12-02 Thread Matt Turner
On Fri, Dec 2, 2016 at 12:22 PM, Emil Velikov  wrote:
> On 2 December 2016 at 19:49, Matt Turner  wrote:
>> A user reporting an unrelated bug (98964) said that he has to set
>> MESA_GL_VERSION_OVERRIDE=1.4 when running Chromium otherwise it's too
>> slow. I presume that it's attempting to use GL 2.0/2.1 features that
>> aren't hardware-supported on i915.
> Ubuntu has been carrying a slightly different patch for a while [1].
> JFYI - I cannot comment which one is the better option.
>
> -Emil
>
> [1] 
> https://anonscm.debian.org/git/pkg-xorg/lib/mesa.git/tree/debian/patches/i915-dont-default-to-2.1.patch?h=ubuntu

Yeah, reverting the patch directly is probably better. Thanks for the heads up.


Re: [Mesa-dev] [PATCH v2 3/7] intel/blorp_blit: Adjust blorp surface parameters for split blits

2016-12-02 Thread Jordan Justen
On 2016-12-01 16:19:49, Jason Ekstrand wrote:
> On Wed, Nov 30, 2016 at 8:12 PM, Jordan Justen 
> wrote:
> 
> > +
> > +   x_offset_sa = (uint32_t)*x0 * px_size_sa.w + info->tile_x_sa;
> > +   y_offset_sa = (uint32_t)*y0 * px_size_sa.h + info->tile_y_sa;
> > +   isl_tiling_get_intratile_offset_sa(dev, info->surf.tiling,
> > +  info->surf.format,
> > info->surf.row_pitch,
> > +  x_offset_sa, y_offset_sa,
> > +  _offset,
> > +  >tile_x_sa, >tile_y_sa);
> >
> 
> If we're going to do things this early, we should just make our own
> temporary variables for tile_x/y instead of trying to re-use the ones from
> info
> 

Note that blorp_copy may have called surf_convert_to_uncompressed
which calls surf_convert_to_single_slice and therefore tile_x/y may
already be non-zero.

This code handles tile_x/y already being non-zero by adding them into
x/y_offset_sa.

After discovering the new offset, I adjust the coords such that we can
use 0 for the tile offsets. So, we'll eventually need to 0 the
offsets.

-Jordan
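
The first step described above (adding an already non-zero tile offset into the sample offset, then splitting the sum again) can be sketched in one dimension. This is a simplified illustration under stated assumptions: the names example_split and example_fold_and_split are hypothetical, the tile extent is an arbitrary parameter, and the real isl helper works in two dimensions and also returns a byte offset.

```c
#include <assert.h>
#include <stdint.h>

struct example_split {
   uint32_t tile_index;   /* whole tiles skipped along this axis */
   uint32_t intratile_sa; /* leftover offset inside the tile, in samples */
};

/* Fold an existing intratile offset into the coordinate-derived offset,
 * then split the total into "whole tiles" and "remainder within a tile". */
static struct example_split
example_fold_and_split(uint32_t coord_px, uint32_t px_size_sa,
                       uint32_t old_tile_off_sa, uint32_t tile_extent_sa)
{
   struct example_split s;
   uint32_t offset_sa;

   assert(tile_extent_sa > 0);
   offset_sa = coord_px * px_size_sa + old_tile_off_sa;
   s.tile_index = offset_sa / tile_extent_sa;
   s.intratile_sa = offset_sa % tile_extent_sa;
   return s;
}
```

For a coordinate of 100 px, one sample per pixel, a pre-existing offset of 5 and a 32-sample tile, the total of 105 samples splits into 3 whole tiles plus 9 samples.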


[Mesa-dev] [Bug 98974] Can't see borders/empires in Stellaris

2016-12-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98974

Dylan Baker  changed:

           What    |Removed                         |Added
----------------------------------------------------------------------------
         QA Contact|dri-devel@lists.freedesktop.org |mesa-dev@lists.freedesktop.org
           Assignee|dri-devel@lists.freedesktop.org |mesa-dev@lists.freedesktop.org
          Component|Drivers/Gallium/r600            |Mesa core

--- Comment #1 from Dylan Baker  ---
I can verify this bug on Archlinux using the i965 driver on both HSW and SKL,
thus moving to mesa core.

On i965 I'm using low settings but see no borders for my own empire, but other
empires get a light grey border.

I can also verify that the game works correctly on the Nvidia blob on SteamOS.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 98974] Can't see borders/empires in Stellaris

2016-12-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98974

Dylan Baker  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |baker.dyla...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 06/23] radeonsi: update all GSVS ring descriptors for new buffer allocations

2016-12-02 Thread Edward O'Callaghan


On 12/01/2016 12:35 AM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
> 
> Fixes 
> GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_geometry_instanced.
Reviewed-by: Edward O'Callaghan 

> 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/gallium/drivers/radeonsi/si_state_shaders.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c
> index 0afc3b4..ea71569 100644
> --- a/src/gallium/drivers/radeonsi/si_state_shaders.c
> +++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
> @@ -2031,24 +2031,29 @@ static bool si_update_gs_ring_buffers(struct si_context *sctx)
>  
>   /* Set ring bindings. */
>   if (sctx->esgs_ring) {
>   si_set_ring_buffer(>b.b, SI_ES_RING_ESGS,
>  sctx->esgs_ring, 0, sctx->esgs_ring->width0,
>  true, true, 4, 64, 0);
>   si_set_ring_buffer(>b.b, SI_GS_RING_ESGS,
>  sctx->esgs_ring, 0, sctx->esgs_ring->width0,
>  false, false, 0, 0, 0);
>   }
> - if (sctx->gsvs_ring)
> + if (sctx->gsvs_ring) {
>   si_set_ring_buffer(>b.b, SI_VS_RING_GSVS,
>  sctx->gsvs_ring, 0, sctx->gsvs_ring->width0,
>  false, false, 0, 0, 0);
> +
> + /* Also update SI_GS_RING_GSVSi descriptors. */
> + sctx->last_gsvs_itemsize = 0;
> + }
> +
>   return true;
>  }
>  
>  static void si_update_gsvs_ring_bindings(struct si_context *sctx)
>  {
>   unsigned gsvs_itemsize = sctx->gs_shader.cso->max_gsvs_emit_size;
>   uint64_t offset;
>  
>   if (!sctx->gsvs_ring || gsvs_itemsize == sctx->last_gsvs_itemsize)
>   return;
> 





Re: [Mesa-dev] [PATCH 1/3] cso: don't release sampler states that are bound

2016-12-02 Thread Edward O'Callaghan
patches 1 & 2 are,
Reviewed-by: Edward O'Callaghan 

On 12/03/2016 07:38 AM, Marek Olšák wrote:
> From: Marek Olšák 
> 
> This fixes random radeonsi GPU hangs in Batman Arkham: Origins (Wine) and
> probably many other games too.
> 
> cso_cache deletes sampler states when the cache size is too big and doesn't
> check which sampler states are bound, causing use-after-free in drivers.
> Because of that, radeonsi uploaded garbage sampler states and the hardware
> went bananas. Other drivers may have experienced similar issues.
> 
> Cc: 13.0 12.0 
> ---
>  src/gallium/auxiliary/cso_cache/cso_cache.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/auxiliary/cso_cache/cso_cache.c b/src/gallium/auxiliary/cso_cache/cso_cache.c
> index b240c93..1f3be4b 100644
> --- a/src/gallium/auxiliary/cso_cache/cso_cache.c
> +++ b/src/gallium/auxiliary/cso_cache/cso_cache.c
> @@ -181,21 +181,23 @@ static inline void sanitize_cb(struct cso_hash *hash, enum cso_cache_type type,
>--to_remove;
> }
>  }
>  
>  struct cso_hash_iter
>  cso_insert_state(struct cso_cache *sc,
>   unsigned hash_key, enum cso_cache_type type,
>   void *state)
>  {
> struct cso_hash *hash = _cso_hash_for_type(sc, type);
> -   sanitize_hash(sc, hash, type, sc->max_size);
> +
> +   if (type != CSO_SAMPLER)
> +  sanitize_hash(sc, hash, type, sc->max_size);
>  
> return cso_hash_insert(hash, hash_key, state);
>  }
>  
>  struct cso_hash_iter
>  cso_find_state(struct cso_cache *sc,
> unsigned hash_key, enum cso_cache_type type)
>  {
> struct cso_hash *hash = _cso_hash_for_type(sc, type);
>  
> 
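
The behaviour change in the quoted patch can be modelled with a toy cache that only tracks entry counts. Everything below is a hypothetical sketch (example_cache, example_sanitize and the two-entry type enum are invented for illustration, and trimming straight down to max_size is a simplification, not what sanitize_cb does); it only shows the shape of the fix: sampler entries are exempt from eviction because the cache cannot tell which of them are still bound.

```c
#include <assert.h>
#include <stddef.h>

enum example_cache_type { EXAMPLE_BLEND = 0, EXAMPLE_SAMPLER = 1 };

struct example_cache {
   size_t count[2];   /* number of cached states per type */
   size_t max_size;   /* eviction threshold */
};

/* Returns how many entries were evicted before an insert of 'type'. */
static size_t
example_sanitize(struct example_cache *c, enum example_cache_type type)
{
   size_t to_remove;

   assert(c != NULL);
   /* Mirrors the patch: never evict samplers, since a bound sampler
    * freed here would leave the driver with a dangling pointer. */
   if (type == EXAMPLE_SAMPLER)
      return 0;
   if (c->count[type] <= c->max_size)
      return 0;

   to_remove = c->count[type] - c->max_size;
   c->count[type] -= to_remove;
   return to_remove;
}
```

The trade-off is that the sampler cache can grow without bound in this model; tracking which entries are bound and skipping only those would be an alternative design, but the quoted patch does not do that.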





Re: [Mesa-dev] [PATCH] anv: Reject VkMemoryAllocateInfo::allocationSize == 0

2016-12-02 Thread Nanley Chery
On Fri, Dec 02, 2016 at 02:37:32PM -0800, Chad Versace wrote:
> ---
>  src/intel/vulkan/anv_device.c | 7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)

This patch is,
Reviewed-by: Nanley Chery 

> 
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index d594df7d3b..e3d278df73 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -1246,11 +1246,8 @@ VkResult anv_AllocateMemory(
>  
> assert(pAllocateInfo->sType == VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO);
>  
> -   if (pAllocateInfo->allocationSize == 0) {
> -  /* Apparently, this is allowed */
> -  *pMem = VK_NULL_HANDLE;
> -  return VK_SUCCESS;
> -   }
> +   /* The Vulkan 1.0.33 spec says "allocationSize must be greater than 0". */
> +   assert(pAllocateInfo->allocationSize > 0);
>  
> /* We support exactly one memory heap. */
> assert(pAllocateInfo->memoryTypeIndex == 0 ||
> -- 
> 2.11.0.rc2
> 


Re: [Mesa-dev] [PATCH 16/22] anv/hiz: Perform HiZ resolves for all partial renders

2016-12-02 Thread Nanley Chery
On Fri, Oct 14, 2016 at 12:46:31PM -0700, Jason Ekstrand wrote:
> On Wed, Oct 12, 2016 at 9:01 AM, Nanley Chery  wrote:
> 
> > On Tue, Oct 11, 2016 at 06:55:53PM -0700, Jason Ekstrand wrote:
> > > On Tue, Oct 11, 2016 at 6:16 PM, Nanley Chery 
> > wrote:
> > >
> > > > On Mon, Oct 10, 2016 at 06:00:49PM -0700, Jason Ekstrand wrote:
> > > > > On Mon, Oct 10, 2016 at 2:23 PM, Nanley Chery  > >
> > > > wrote:
> > > > >
> > > > > > On Fri, Oct 07, 2016 at 09:41:14PM -0700, Jason Ekstrand wrote:
> > > > > > > If we don't, we can end up with corruption in the portion of the depth
> > > > > > > buffer that lies outside the render area when we do a HiZ resolve at the
> > > > > > > end.  The only reason we weren't seeing this before was that all of the
> > > > > > > meta-based clears such as vkCmdClearDepthStencilImage were internally using
> > > > > > > HiZ so the HiZ buffer never truly got out-of-sync.  If the CTS ever tested
> > > > > > > a depth upload (which doesn't care about HiZ) and then a partial render we
> > > > > > > would have seen problems.  Soon, we will be using blorp to do depth clears
> > > > > > > and it won't bother with HiZ so we would get CTS regressions without this.
> > > > > > >
> > > > > >
> > > > > > I understand the problem, but I think this solution unnecessarily
> > > > > > penalizes the user's renderpass.
> > > > > >
> > > > > > Since depth buffer updates via vkCopy*ToImage and
> > > > > > vkCmdClearDepthStencilImage cause the HiZ buffer to become stale,
> > > > > > calling
> > > > > >
> > > > > > genX(cmd_buffer_emit_hz_op)(cmd_buffer, BLORP_HIZ_OP_HIZ_RESOLVE);
> > > > > >
> > > > > > at the bottom of those commands should fix the issue without the extra
> > > > > > penalty. I'd imagine that as a prerequisite, blorp would have to learn to
> > > > > > emit enough depth stencil state for this command.
> > > > > >
> > > > >
> > > > > I think that's dangerously mixing HiZ data validity models.  There are 3
> > > > > basic aux data validity models that we've thrown around:
> > > > >
> > > > >  1) AUX is always correct.
> > > > >  2) AUX is correct within a render pass and invalid outside.
> > > > >  3) Track whether or not AUX is valid and resolve only as needed.
> > > > >
> > > >
> > > > What is the definition of correct here? I'd assume you mean that the
> > > > data matches what's in the depth buffer, but that sometimes may not be
> > > > the case (STORE_OP_DONTCARE) yet the program behavior is correct
> > > > nonetheless.
> > > >
> > >
> > > By "correct" I mean "consistent with the depth buffer" or, more precisely,
> > > "all well-defined pixels of the depth buffer are consistent with the HiZ
> > > buffer".  We *may* be able to avoid the depth resolve at the end if you
> > > have STORE_OP_DONT_CARE.  However, we would probably not do anything
> > > interesting with LOAD_OP_DONT_CARE.
> > >
> >
> >
> > With this definition of correct (accessing either buffer will give you
> > the correct value due to their being consistent with each other), the
> > current implementation is arguably more a coarse-grained version of (3) (no
> > tracking, let's call this 4) than it is (2). The HiZ buffer is only
> > consistent with the depth buffer when a user performs an operation that
> > likely requires it to be so. For example:
> >
> > * LOAD_OP_LOAD -> HiZ Resolve (consistent)
> > * LOAD_OP_CLEAR -> No resolve, Fast Depth Clear (inconsistent)
> > * vkCmdDraw* -> No resolve (inconsistent)
> > * STORE_OP_STORE -> Depth Resolve (consistent)
> >
> > >
> > > > Also, could you please explain where the danger comes into play?
> > > >
> > >
> > > We need to have a solid mental model of when HiZ and depth are consistent.
> > > Otherwise, we'll make mistakes, things will get inconsistent, and we'll
> >
> > Agreed.
> >
> > > have weird bugs.  This bug is a good example of this.  Our mental model (2)
> > > works fine except that we were leaking garbage depth from DONT_CARE when we
> > > have a partial area.  Just doing a HiZ resolve after a blorp clear "fixes"
> > > the bug by making things always consistent (mental model 1).  But then it
> > As mentioned above, I'm not advocating mixing 1 and 2, but covering a
> > missed case in 4. Whether or not that mental model is solid seems like
> > a subjective claim.
> >
> > > means that if we have LOAD_OP_LOAD, we're doing two HiZ resolves which we
> > > don't want either.
> > >
> >
> > I wouldn't expect Vulkan apps to submit image copies as frequently as
> > render passes, so my thinking is that an extra HiZ resolve at the end of
> > an Image copy should have less of an impact on FPS than performing the
> > resolve on every clearing RP that doesn't use a full render area. I
> > did not write the patch to test my suggestion, but I was able to get a
> > 
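
The four-entry list quoted above (what Nanley calls model 4) can be written down as a small lookup. This is only a restatement of that list under assumed names; the enums and functions below are hypothetical and do not correspond to anv or blorp identifiers.

```c
#include <assert.h>

enum example_event {
   EXAMPLE_LOAD_OP_LOAD,
   EXAMPLE_LOAD_OP_CLEAR,
   EXAMPLE_DRAW,
   EXAMPLE_STORE_OP_STORE
};

enum example_resolve {
   EXAMPLE_RESOLVE_NONE,
   EXAMPLE_RESOLVE_HIZ,   /* make HiZ match the depth buffer */
   EXAMPLE_RESOLVE_DEPTH  /* make the depth buffer match HiZ */
};

/* Which resolve, if any, the quoted list associates with each event. */
static enum example_resolve
example_resolve_for(enum example_event ev)
{
   switch (ev) {
   case EXAMPLE_LOAD_OP_LOAD:   return EXAMPLE_RESOLVE_HIZ;
   case EXAMPLE_STORE_OP_STORE: return EXAMPLE_RESOLVE_DEPTH;
   case EXAMPLE_LOAD_OP_CLEAR:  /* fast depth clear, no resolve */
   case EXAMPLE_DRAW:
      return EXAMPLE_RESOLVE_NONE;
   }
   assert(!"unknown event");
   return EXAMPLE_RESOLVE_NONE;
}

/* Per the list, HiZ and depth are consistent only right after a resolve. */
static int
example_consistent_after(enum example_event ev)
{
   return example_resolve_for(ev) != EXAMPLE_RESOLVE_NONE;
}
```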

[Mesa-dev] [PATCH] anv: Reject VkMemoryAllocateInfo::allocationSize == 0

2016-12-02 Thread Chad Versace
---
 src/intel/vulkan/anv_device.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index d594df7d3b..e3d278df73 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1246,11 +1246,8 @@ VkResult anv_AllocateMemory(
 
assert(pAllocateInfo->sType == VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO);
 
-   if (pAllocateInfo->allocationSize == 0) {
-  /* Apparently, this is allowed */
-  *pMem = VK_NULL_HANDLE;
-  return VK_SUCCESS;
-   }
+   /* The Vulkan 1.0.33 spec says "allocationSize must be greater than 0". */
+   assert(pAllocateInfo->allocationSize > 0);
 
/* We support exactly one memory heap. */
assert(pAllocateInfo->memoryTypeIndex == 0 ||
-- 
2.11.0.rc2



Re: [Mesa-dev] Mesa 12.0.5 release candidate

2016-12-02 Thread Nanley Chery
On Fri, Dec 02, 2016 at 09:27:42PM +, Emil Velikov wrote:
> On 2 December 2016 at 21:16, Emil Velikov  wrote:
> > On 2 December 2016 at 20:53, Nanley Chery  wrote:
> >> On Fri, Dec 02, 2016 at 08:15:16PM +, Emil Velikov wrote:
> >>> Hello list,
> >>>
> >>> The candidate for the Mesa 12.0.5 is now available. Currently we have:
> >>>  - 25 queued
> >>>  - 0 nominated (outstanding)
> >>>  - and 1 rejected patches
> >>>
> >>> Take a look at section "Mesa stable queue" for more information.
> >>>
> >>> Note: This is the final planned release for the 12.0 stable branch.
> >>>
> >>>
> >>> Testing reports/general approval
> >>> 
> >>> Any testing reports (or general approval of the state of the branch) will be
> >>> greatly appreciated.
> >>>
> >>> The plan is to have 12.0.5 this Sunday (4th of December), around or shortly
> >>> after 20:00 GMT.
> >>>
> >>> If you have any questions or suggestions - be that about the current patch
> >>> queue or otherwise, please go ahead.
> >>>
> >>
> >> Hello,
> >>
> >> I nominated the patch, "mesa/fbobject: Update CubeMapFace when reusing
> >> textures," [1] for stable but did not leave the Cc: in the commit
> >> message when pushing [2]. Is that why it's not listed here?
> >>
> > Because it's nominated in a way that's explicitly mentioned as not
> > recommended [1].
> > Afaict it was never "the way" even though it mostly worked.
> >

I did refer to the Submitting Patches section of
http://mesa3d.org/devinfo.html before submitting the patch, but it seems
like it's out-of-date compared to what's in the repo. I'll make sure to
check my local web page in the future.

> I stand corrected - seems like you've explicitly dropped the stable
> tag before committing.
> Note that doing that effectively cancels your nomination.

Oh okay, I wasn't aware of this. This seems like it would be a good
bullet to add in the list under "Criteria for accepting patches to the
stable branch." I just discovered it, so forgive me if this info is
already there in some form or fashion.

> 
> Either way... it's on the back-burner being tested.

Thanks!

-Nanley


Re: [Mesa-dev] [PATCH 3/6] winsys/amdgpu: use drmGetDevice[s]2 API

2016-12-02 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Fri, Dec 2, 2016 at 5:31 PM, Emil Velikov  wrote:
> From: Emil Velikov 
>
> Analogous to previous commit
>
> Cc: Michel Dänzer 
> Signed-off-by: Emil Velikov 
> ---
>  src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> index 98d72bd..d3df66f 100644
> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> @@ -108,9 +108,9 @@ static bool do_winsys_init(struct amdgpu_winsys *ws, int fd)
> drmDevicePtr devinfo;
>
> /* Get PCI info. */
> -   r = drmGetDevice(fd, );
> +   r = drmGetDevice2(fd, 0, );
> if (r) {
> -  fprintf(stderr, "amdgpu: drmGetDevice failed.\n");
> +  fprintf(stderr, "amdgpu: drmGetDevice2 failed.\n");
>goto fail;
> }
> ws->info.pci_domain = devinfo->businfo.pci->domain;
> --
> 2.10.2
>


Re: [Mesa-dev] [PATCH 1/4] gallium: add pipe_screen::resource_changed

2016-12-02 Thread Marek Olšák
Shouldn't this be in pipe_context if it does a copy? It's basically
the opposite of flush_resource, right?

Marek

On Fri, Dec 2, 2016 at 4:27 PM, Philipp Zabel  wrote:
> Add a hook to tell drivers that an imported resource may have changed
> and they need to update their internal derived resources.
>
> Signed-off-by: Philipp Zabel 
> ---
>  src/gallium/include/pipe/p_screen.h | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/gallium/include/pipe/p_screen.h b/src/gallium/include/pipe/p_screen.h
> index 255647e..e21229e 100644
> --- a/src/gallium/include/pipe/p_screen.h
> +++ b/src/gallium/include/pipe/p_screen.h
> @@ -224,6 +224,12 @@ struct pipe_screen {
>   struct winsys_handle *handle,
>   unsigned usage);
>
> +   /**
> +* Trigger recreation of derived internal resources. This can be used for
> +* reimporting external images that can't be directly used as texture
> +* sampler source.
> +*/
> +   void (*resource_changed)(struct pipe_screen *, struct pipe_resource *pt);
>
> void (*resource_destroy)(struct pipe_screen *,
> struct pipe_resource *pt);
> --
> 2.10.2
>


Re: [Mesa-dev] [PATCH 3/3] st/mesa: round lod_bias to a multiple of 1/256

2016-12-02 Thread Marek Olšák
On Fri, Dec 2, 2016 at 10:18 PM, Roland Scheidegger  wrote:
> Ideally this wouldn't be tied to specific hardware... That said, I
> believe the clamping limits are sane (d3d10 will use these too). If GL
> has some requirements for lod accuracy or if it's queryable, it should
> probably honor this (d3d10 only would require 6 fractional bits),
> although I'd guess that 8 fractional bits is probably safe...

AMD DX10 GPUs have 6 fractional bits. DX11 and later GPUs have 8
fractional bits.

Marek
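
Rounding an LOD bias to a fixed number of fractional bits, as the patch subject describes for 1/256 (8 bits), can be sketched as below. The function name and the frac_bits parameter are assumptions made for illustration; the sketch also assumes the bias has already been clamped to a small range so the intermediate integer cannot overflow.

```c
#include <assert.h>

/* Hypothetical helper: round lod_bias to the nearest multiple of
 * 1 / 2^frac_bits (8 bits gives 1/256, 6 bits gives 1/64).
 * Ties round away from zero. */
static float
example_round_lod_bias(float lod_bias, unsigned frac_bits)
{
   float scale, scaled;
   long steps;

   assert(frac_bits <= 16);
   scale = (float)(1u << frac_bits);
   scaled = lod_bias * scale;
   steps = (long)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
   return (float)steps / scale;
}
```

Two biases that differ by less than half a step then land on the same value, which can help when the rounded value is used as part of a sampler-state cache key.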


Re: [Mesa-dev] [PATCH 02/23] tgsi: add Stream{X, Y, Z, W} fields to tgsi_declaration_semantic

2016-12-02 Thread Roland Scheidegger
On 02.12.2016 at 20:44, Nicolai Hähnle wrote:
> On 02.12.2016 19:46, Roland Scheidegger wrote:
>> On 02.12.2016 at 18:23, Nicolai Hähnle wrote:
>>> On 30.11.2016 21:37, Roland Scheidegger wrote:
>>>> On 30.11.2016 at 20:19, Nicolai Hähnle wrote:
>>>>> On 30.11.2016 19:06, Roland Scheidegger wrote:
>>>>>> On 30.11.2016 at 14:35, Nicolai Hähnle wrote:
>>>>>>> From: Nicolai Hähnle
>>>>>>>
>>>>>>> This is for geometry shader outputs. Without it, drivers have no way of
>>>>>>> knowing which stream each output is intended for, and have to
>>>>>>> conservatively write all outputs to all streams.
>>>>>>>
>>>>>>> Separate stream numbers for each component are required due to output
>>>>>>> packing.
>>>>>> Are you sure this is true?
>>>>>> This is an area I don't know much about, but
>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.opengl.org_wiki_Layout-5FQualifier-5F-28GLSL-29=DgIDaQ=uilaK90D4TOVoH58JNXRgQ=_QIjpv-UJ77xEQY8fIYoQtr5qv8wKrPJc7v7_-CYAb0=fVpTGTYN2KTEhU17RpFTxEULrsIfC3bdpEin0k8NIYE=uamnHj-9Xr12ctr0gHDfCMIMHq8DyUBtKIwHQQpjDLs=
>>>>>>
>>>>>>
>>>>>> tells me "Stream
>>>>>> assignments for a geometry shader are required to be the same for all
>>>>>> members of a block, but offsets are not."
>>>>>>
>>>>>> Therefore I don't think output packing should ever happen across
>>>>>> multiple streams. I think it would be MUCH nicer if the semantic needed
>>>>>> just one stream member...
>>>>>
>>>>> There are two variants of that question, I guess.
>>>>>
>>>>> The answer to the first variant is: Yes, this is currently true.
>>>>> lower_packed_varyings will happily pack outputs from different vertex
>>>>> streams into the same vec4. This affects quite a lot of programs, e.g.
>>>>> you see it in piglit arb_gpu_shader5-xfb-streams.
>>>>>
>>>>> The second question is: Do we want it to be true? I agree that it would
>>>>> be convenient to be able to use a single Stream member. Also, isolating
>>>>> the stream0 components from the rest would lead to slightly more
>>>>> efficient shaders for us in some cases.
>>>>>
>>>>> I opted against it so far because I didn't want to think through the
>>>>> implications of changing lower_packed_varyings. The main question I have
>>>>> is: if you account for the size of the GS output in # of components,
>>>>> then it could happen that the number of output vec4s ends up being
>>>>> larger than (max # of output components) / 4. Will that be a problem
>>>>> somewhere?
>>>>
>>>> I don't know if that would be a problem, but if it is I'd assume this
>>>> would be fixable (since the number of actual components ultimately
>>>> doesn't change).
>>>> Having outputs belonging to multiple streams in a single output just
>>>> seems weird...
>>>> That said, I wonder if it actually would be possible to do that with
>>>> d3d11 too.
>>>> With shader model 5 you'd have:
>>>> dcl_stream 0
>>>> dcl_output o0.xy
>>>> dcl_stream 1
>>>> dcl_output o0.zw // legal or not???
>>>>
>>>> Though the shader model 4/5 rules are a bit weird for packing
>>>> inputs/outputs, I'm not even sure two dcl_output are legal for the same
>>>> reg without a dcl_stream in between them (but you can pack system values
>>>> together with ordinary inputs/outputs).
>>>>
>>>> So maybe just allowing this is the right solution...
>>>
>>> I played around with the DX shader compiler, and I have some annoying
>>> news. SM5 actually uses not just the same output register but even the
>>> same component for multiple streams -- see the output I've pasted at the
>>> end.
>>>
>>> So how to proceed? To simplify things going forward, I'm mostly
>>> convinced that the GLSL output packing should be changed to pack outputs
>>> by stream. As I mentioned previously, this has other minor advantages
>>> for us anyway.
>>>
>>> Then one possibility to accommodate SM5 would be to have a Stream
>>> bitmask, one bit per stream, as part of the output semantics. The
>>> downside of this is that I wanted to use the WriteMask as an additional
>>> optimization to avoid writing out unused components, and you'd then need
>>> separate WriteMasks for each stream.
>>>
>>> The other possibility, which I prefer, would be to have just a single
>>> Stream field indicating one stream number per output register, and
>>> aliasing is just not allowed despite what SM5 wants.
> 
> I have to go back on that unfortunately: I forgot that it's possible to
> create location aliasing across vertex streams via ARB_enhanced_layouts.
> I looked hard and found nothing in the spec that would forbid it, and
> our closed source driver also allows it.
Oh hmm it looks like you can basically assign individual locations
to anything, as long as components don't overlap the same component in
another declaration?
So yes I guess you're right this is indeed needed.
Just too bad it's complex and still not quite enough to meet d3d11
demands, but looks reasonable then.

Roland
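
To make the per-component stream idea from this thread concrete, here is a small sketch. The StreamX/Y/Z/W field names come from the patch subject; the struct name, the other fields, the 2-bit widths (enough for four streams) and both helper functions are assumptions for illustration and are not the actual tgsi_declaration_semantic layout.

```c
#include <assert.h>

/* Hypothetical declaration semantic with one stream index per component. */
struct example_decl_semantic {
   unsigned Name:8;
   unsigned Index:16;
   unsigned StreamX:2;
   unsigned StreamY:2;
   unsigned StreamZ:2;
   unsigned StreamW:2;
};

/* Stream assigned to component 'chan' (0 = X ... 3 = W). */
static unsigned
example_stream_for_component(const struct example_decl_semantic *sem,
                             unsigned chan)
{
   assert(sem != 0 && chan < 4);
   switch (chan) {
   case 0:  return sem->StreamX;
   case 1:  return sem->StreamY;
   case 2:  return sem->StreamZ;
   default: return sem->StreamW;
   }
}

/* Subset of 'usage_mask' whose components belong to 'stream'. A driver
 * could use this to skip writing components to streams that do not
 * consume them. */
static unsigned
example_writemask_for_stream(const struct example_decl_semantic *sem,
                             unsigned usage_mask, unsigned stream)
{
   unsigned chan, mask = 0;

   for (chan = 0; chan < 4; chan++) {
      if ((usage_mask & (1u << chan)) &&
          example_stream_for_component(sem, chan) == stream)
         mask |= 1u << chan;
   }
   return mask;
}
```

With .xy on stream 0 and .zw on stream 1, a full write mask of 0xf splits into 0x3 for stream 0 and 0xc for stream 1. This covers different components of a register going to different streams, not the SM5 case mentioned above where the same component is shared by several streams.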


> So my plan now is to leave the 

Re: [Mesa-dev] [PATCH 2/3] gallium: decrease the size of pipe_sampler_state fields

2016-12-02 Thread Roland Scheidegger
Reviewed-by: Roland Scheidegger 

Not that it really makes a difference...


On 02.12.2016 at 21:38, Marek Olšák wrote:
> From: Marek Olšák 
> 
> We've had unused bits.
> ---
>  src/gallium/include/pipe/p_state.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h
> index 46df196..d501a93 100644
> --- a/src/gallium/include/pipe/p_state.h
> +++ b/src/gallium/include/pipe/p_state.h
> @@ -357,27 +357,27 @@ struct pipe_framebuffer_state
>  
>  
>  /**
>   * Texture sampler state.
>   */
>  struct pipe_sampler_state
>  {
> unsigned wrap_s:3;/**< PIPE_TEX_WRAP_x */
> unsigned wrap_t:3;/**< PIPE_TEX_WRAP_x */
> unsigned wrap_r:3;/**< PIPE_TEX_WRAP_x */
> -   unsigned min_img_filter:2;/**< PIPE_TEX_FILTER_x */
> +   unsigned min_img_filter:1;/**< PIPE_TEX_FILTER_x */
> unsigned min_mip_filter:2;/**< PIPE_TEX_MIPFILTER_x */
> -   unsigned mag_img_filter:2;/**< PIPE_TEX_FILTER_x */
> +   unsigned mag_img_filter:1;/**< PIPE_TEX_FILTER_x */
> unsigned compare_mode:1;  /**< PIPE_TEX_COMPARE_x */
> unsigned compare_func:3;  /**< PIPE_FUNC_x */
> unsigned normalized_coords:1; /**< Are coords normalized to [0,1]? */
> -   unsigned max_anisotropy:6;
> +   unsigned max_anisotropy:5;
> unsigned seamless_cube_map:1;
> float lod_bias;   /**< LOD/lambda bias */
> float min_lod, max_lod;   /**< LOD clamp range, after bias */
> union pipe_color_union border_color;
>  };
>  
>  union pipe_surface_desc {
> struct {
>unsigned level;
>unsigned first_layer:16;
> 



Re: [Mesa-dev] Mesa 12.0.5 release candidate

2016-12-02 Thread Emil Velikov
On 2 December 2016 at 21:16, Emil Velikov  wrote:
> On 2 December 2016 at 20:53, Nanley Chery  wrote:
>> On Fri, Dec 02, 2016 at 08:15:16PM +, Emil Velikov wrote:
>>> Hello list,
>>>
>>> The candidate for the Mesa 12.0.5 is now available. Currently we have:
>>>  - 25 queued
>>>  - 0 nominated (outstanding)
>>>  - and 1 rejected patches
>>>
>>> Take a look at section "Mesa stable queue" for more information.
>>>
>>> Note: This is the final planned release for the 12.0 stable branch.
>>>
>>>
>>> Testing reports/general approval
>>> 
>>> Any testing reports (or general approval of the state of the branch) will be
>>> greatly appreciated.
>>>
>>> The plan is to have 12.0.5 this Sunday (4th of December), around or shortly
>>> after 20:00 GMT.
>>>
>>> If you have any questions or suggestions - be that about the current patch
>>> queue or otherwise, please go ahead.
>>>
>>
>> Hello,
>>
>> I nominated the patch, "mesa/fbobject: Update CubeMapFace when reusing
>> textures," [1] for stable but did not leave the Cc: in the commit
>> message when pushing [2]. Is that why it's not listed here?
>>
> Because it's nominated in a way that's explicitly mentioned as not
> recommended [1].
> Afaict it was never "the way" even though it mostly worked.
>
I stand corrected - seems like you've explicitly dropped the stable
tag before committing.
Note that doing that effectively cancels your nomination.

Either way... it's on the back-burner being tested.
Emil


Re: [Mesa-dev] [PATCH] spirv: Builtin Layer is an input for fragment shaders

2016-12-02 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Fri, Dec 2, 2016 at 5:16 AM, Iago Toral Quiroga 
wrote:

> This change makes it so we emit a load_input intrinsic when Layer
> is read in a fragment shader.
> ---
>
> Even with this, layered rendering does not seem to work in the Vulkan
> driver, so there is something else that is broken. We are probably
> not mapping the Layer input correctly somewhere.
>

I'm not sure how fragment shader layers work in GL today.  I did recently
add a NIR intrinsic for layer_id and hook it up in the FS backend.  We can
probably just plumb that through.  It would be good to check GL first
though.

--Jason


>  src/compiler/spirv/vtn_variables.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
> index 14366dc..c6d73a7 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -819,7 +819,12 @@ vtn_get_builtin_location(struct vtn_builder *b,
>break;
> case SpvBuiltInLayer:
>*location = VARYING_SLOT_LAYER;
> -  *mode = nir_var_shader_out;
> +  if (b->shader->stage == MESA_SHADER_FRAGMENT)
> + *mode = nir_var_shader_in;
> +  else if (b->shader->stage == MESA_SHADER_GEOMETRY)
> + *mode = nir_var_shader_out;
> +  else
> + unreachable("invalid stage for SpvBuiltInLayer");
>break;
> case SpvBuiltInViewportIndex:
>*location = VARYING_SLOT_VIEWPORT;
> --
> 2.7.4
>
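
The stage-dependent choice in the quoted hunk can be sketched in isolation. This is only an illustration: the enums `sketch_stage` and `sketch_mode` and the helper `layer_builtin_mode` are hypothetical stand-ins, not the NIR or SPIR-V front-end types, and the sketch returns a status where the patch would call unreachable().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the shader stage and variable mode enums. */
enum sketch_stage {
   SKETCH_STAGE_VERTEX,
   SKETCH_STAGE_GEOMETRY,
   SKETCH_STAGE_FRAGMENT
};

enum sketch_mode {
   SKETCH_MODE_SHADER_IN,
   SKETCH_MODE_SHADER_OUT
};

/* Layer is an output of the geometry stage and an input of the fragment
 * stage. Returns false for any other stage, which the patch treats as
 * unreachable.
 */
static bool
layer_builtin_mode(enum sketch_stage stage, enum sketch_mode *mode)
{
   switch (stage) {
   case SKETCH_STAGE_FRAGMENT:
      *mode = SKETCH_MODE_SHADER_IN;
      return true;
   case SKETCH_STAGE_GEOMETRY:
      *mode = SKETCH_MODE_SHADER_OUT;
      return true;
   default:
      return false;
   }
}
```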


Re: [Mesa-dev] [PATCH 3/3] st/mesa: round lod_bias to a multiple of 1/256

2016-12-02 Thread Roland Scheidegger
Ideally this wouldn't be tied to specific hardware... That said, I
believe the clamping limits are sane (d3d10 will use these too). If GL
has some requirements for lod accuracy or if it's queryable, it should
probably honor this (d3d10 only would require 6 fractional bits),
although I'd guess that 8 fractional bits is probably safe...

Roland

Am 02.12.2016 um 21:38 schrieb Marek Olšák:
> From: Marek Olšák 
> 
> This reduces the number of sampler states 3.6x in Batman Arkham: Origins.
> (from ~7200 to ~2000)
> ---
>  src/mesa/state_tracker/st_atom_sampler.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/src/mesa/state_tracker/st_atom_sampler.c 
> b/src/mesa/state_tracker/st_atom_sampler.c
> index 4568630..daf98c3 100644
> --- a/src/mesa/state_tracker/st_atom_sampler.c
> +++ b/src/mesa/state_tracker/st_atom_sampler.c
> @@ -152,20 +152,26 @@ convert_sampler(struct st_context *st,
> sampler->wrap_r = gl_wrap_xlate(msamp->WrapR);
>  
> sampler->min_img_filter = gl_filter_to_img_filter(msamp->MinFilter);
> sampler->min_mip_filter = gl_filter_to_mip_filter(msamp->MinFilter);
> sampler->mag_img_filter = gl_filter_to_img_filter(msamp->MagFilter);
>  
> if (texobj->Target != GL_TEXTURE_RECTANGLE_ARB)
>sampler->normalized_coords = 1;
>  
> sampler->lod_bias = ctx->Texture.Unit[texUnit].LodBias + msamp->LodBias;
> +   /* Reduce the number of states by allowing only the values that AMD GCN
> +* can represent. Apps use lod_bias for smooth transitions to bigger 
> mipmap
> +* levels.
> +*/
> +   sampler->lod_bias = CLAMP(sampler->lod_bias, -16, 16);
> +   sampler->lod_bias = floorf(sampler->lod_bias * 256) / 256;
>  
> sampler->min_lod = MAX2(msamp->MinLod, 0.0f);
> sampler->max_lod = msamp->MaxLod;
> if (sampler->max_lod < sampler->min_lod) {
>/* The GL spec doesn't seem to specify what to do in this case.
> * Swap the values.
> */
>float tmp = sampler->max_lod;
>sampler->max_lod = sampler->min_lod;
>sampler->min_lod = tmp;
> 
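
The clamp-and-round step in the quoted hunk can be tried standalone. A minimal sketch, assuming a hypothetical helper name `quantize_lod_bias` and a local `SKETCH_CLAMP` macro standing in for Mesa's CLAMP. The patch uses floorf(); the sketch rounds down with an integer cast instead so that it builds without libm, which is safe here because the clamped value times 256 always fits in an int.

```c
/* Local stand-in for Mesa's CLAMP macro. */
#define SKETCH_CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))

/* Clamp to [-16, 16] and round down to a multiple of 1/256, which keeps
 * 8 fractional bits. Assumes a finite input.
 */
static float
quantize_lod_bias(float bias)
{
   float scaled;
   int steps;

   bias = SKETCH_CLAMP(bias, -16.0f, 16.0f);
   scaled = bias * 256.0f;

   /* The cast truncates toward zero; step down once for negative
    * non-integers to get floor semantics.
    */
   steps = (int)scaled;
   if ((float)steps > scaled)
      steps--;

   return (float)steps / 256.0f;
}
```

With this, a bias of 0.3 becomes 76/256 = 0.296875, so nearby biases collapse onto the same sampler state.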



Re: [Mesa-dev] Mesa 12.0.5 release candidate

2016-12-02 Thread Emil Velikov
On 2 December 2016 at 20:53, Nanley Chery  wrote:
> On Fri, Dec 02, 2016 at 08:15:16PM +, Emil Velikov wrote:
>> Hello list,
>>
>> The candidate for the Mesa 12.0.5 is now available. Currently we have:
>>  - 25 queued
>>  - 0 nominated (outstanding)
>>  - and 1 rejected patches
>>
>> Take a look at section "Mesa stable queue" for more information.
>>
>> Note: This is the final planned release for the 12.0 stable branch.
>>
>>
>> Testing reports/general approval
>> 
>> Any testing reports (or general approval of the state of the branch) will be
>> greatly appreciated.
>>
>> The plan is to have 12.0.5 this Sunday (4th of December), around or shortly
>> after 20:00 GMT.
>>
>> If you have any questions or suggestions - be that about the current patch
>> queue or otherwise, please go ahead.
>>
>
> Hello,
>
> I nominated the patch, "mesa/fbobject: Update CubeMapFace when reusing
> textures," [1] for stable but did not leave the Cc: in the commit
> message when pushing [2]. Is that why it's not listed here?
>
Because it's nominated in a way that's explicitly mentioned as not
recommended [1].
Afaict it was never "the way" even though it mostly worked.

That said, I've picked it up and [barring any regressions] it will be
in the release.

Thanks
Emil

[1] https://cgit.freedesktop.org/mesa/mesa/tree/docs/submittingpatches.html#n218


Re: [Mesa-dev] Mesa 12.0.5 release candidate

2016-12-02 Thread Nanley Chery
On Fri, Dec 02, 2016 at 08:15:16PM +, Emil Velikov wrote:
> Hello list,
> 
> The candidate for the Mesa 12.0.5 is now available. Currently we have:
>  - 25 queued
>  - 0 nominated (outstanding)
>  - and 1 rejected patches
> 
> Take a look at section "Mesa stable queue" for more information.
> 
> Note: This is the final planned release for the 12.0 stable branch.
> 
> 
> Testing reports/general approval
> 
> Any testing reports (or general approval of the state of the branch) will be
> greatly appreciated.
> 
> The plan is to have 12.0.5 this Sunday (4th of December), around or shortly
> after 20:00 GMT.
> 
> If you have any questions or suggestions - be that about the current patch
> queue or otherwise, please go ahead.
> 

Hello,

I nominated the patch, "mesa/fbobject: Update CubeMapFace when reusing
textures," [1] for stable but did not leave the Cc: in the commit
message when pushing [2]. Is that why it's not listed here?

Thanks,
Nanley

[1]: https://patchwork.freedesktop.org/patch/121882/
[2]: 
https://cgit.freedesktop.org/mesa/mesa/commit/?id=63318d34acd4a5edb271d57adf3b01e2e52552f8

> 
> Trivial merge conflicts
> ---
> None
> 
> 
> Cheers,
> Emil
> 
> 
> Mesa stable queue
> -
> 
> Nominated (0)
> =
> 
> 
> Queued (25)
> ===
> 
> Adam Jackson (2):
>   glx/glvnd: Don't modify the dummy slot in the dispatch table
>   glx/glvnd: Fix dispatch function names and indices
> 
> Anuj Phogat (1):
>   i965: Fix GPU hang related to multiple render targets and alpha testing
> 
> Emil Velikov (3):
>   docs: add release notes for 12.0.4
>   docs: add sha256 checksums for 12.0.4
>   cherry-ignore: add reverted LLVM_LIBDIR patch
> 
> Haixia Shi (1):
>   mesa: change state query return value for RGB565
> 
> Jason Ekstrand (3):
>   i965/fs/generator: Don't use the address immediate for MOV_INDIRECT
>   anv/cmd_buffer: Take a command buffer instead of a batch in two helpers
>   anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4
> 
> Kenneth Graunke (1):
>   intel: Fix pixel shader scratch space allocation on Gen9+ platforms.
> 
> Marek Olšák (9):
>   gallium/radeon: fix behavior of GLSL findLSB(0)
>   gallium/radeon: make sure HTILE address is aligned properly
>   radeonsi: fix an assertion failure in 
> si_decompress_sampler_color_textures
>   gallium/radeon: unify viewport emission code
>   gallium/radeon: set VPORT_ZMIN/MAX registers correctly
>   radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader
>   radeonsi: fix a crash in imageSize for cubemap arrays
>   radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it
>   gallium/radeon: add support for sharing textures with DCC
> between processes
> 
> Matt Turner (1):
>   anv: Replace "abi_versions" with correct "api_version".
> 
> Steinar H. Gunderson (1):
>   Fix races during _mesa_HashWalk().
> 
> Tim Rowley (3):
>   swr: [rasterizer jitter] cleanup supporting different llvm versions
>   swr: [rasterizer jitter] fix llvm-3.7 compile
>   swr: [rasterizer] add support for llvm-3.9
> 
> 
> Rejected (1)
> 
> 
> Emil Velikov (1)
>   a39ad18 configure.ac: honour LLVM_LIBDIR when linking against LLVM
> 
> Reason: The patch was reverted shortly after it was merged.


Re: [Mesa-dev] [PATCH 05/10] radeonsi: check for sampler state CSO corruption

2016-12-02 Thread Gustaw Smolarczyk
2016-12-02 21:39 GMT+01:00 Marek Olšák :

> From: Marek Olšák 
>
> It really happens.
> ---
>  src/gallium/drivers/radeonsi/si_descriptors.c | 1 +
>  src/gallium/drivers/radeonsi/si_pipe.h| 3 +++
>  src/gallium/drivers/radeonsi/si_state.c   | 5 +
>  3 files changed, 9 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
> b/src/gallium/drivers/radeonsi/si_descriptors.c
> index 8b6e0bb..4f78b1a 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -796,20 +796,21 @@ static void si_bind_sampler_states(struct
> pipe_context *ctx,
> if (!count || shader >= SI_NUM_SHADERS)
> return;
>
> for (i = 0; i < count; i++) {
> unsigned slot = start + i;
>
> if (!sstates[i] ||
> sstates[i] == samplers->views.sampler_states[slot])
> continue;
>
> +   assert(sstates[i]->magic == SI_SAMPLER_STATE_MAGIC);
> samplers->views.sampler_states[slot] = sstates[i];
>
> /* If FMASK is bound, don't overwrite it.
>  * The sampler state will be set after FMASK is unbound.
>  */
> if (samplers->views.views[slot] &&
> samplers->views.views[slot]->texture &&
> samplers->views.views[slot]->texture->target !=
> PIPE_BUFFER &&
> ((struct r600_texture*)samplers->views.
> views[slot]->texture)->fmask.size)
> continue;
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
> index 42cbecb..a7985e7 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.h
> +++ b/src/gallium/drivers/radeonsi/si_pipe.h
> @@ -130,21 +130,24 @@ struct si_sampler_view {
>  /* [0..7] = image descriptor
>   * [4..7] = buffer descriptor */
> uint32_tstate[8];
> uint32_tfmask_state[8];
> const struct radeon_surf_level  *base_level_info;
> unsignedbase_level;
> unsignedblock_width;
> bool is_stencil_sampler;
>  };
>
> +#define SI_SAMPLER_STATE_MAGIC 0x34f1c35a
> +
>  struct si_sampler_state {
> +   unsignedmagic;
>

How about wrapping it in #ifndef NDEBUG/#endif? Here and the other places.
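
A minimal sketch of what that could look like, using hypothetical names (`sketch_sampler_state`, `sketch_state_init`, `sketch_state_is_valid`, `sketch_state_poison`) rather than the driver's own. The magic field and its checks exist only when assertions are enabled, so release builds carry neither the extra dword nor the stores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_STATE_MAGIC 0x34f1c35a

struct sketch_sampler_state {
#ifndef NDEBUG
   unsigned magic;   /* only present when assertions are enabled */
#endif
   uint32_t val[4];
};

static void
sketch_state_init(struct sketch_sampler_state *s)
{
#ifndef NDEBUG
   s->magic = SKETCH_STATE_MAGIC;
#endif
   for (int i = 0; i < 4; i++)
      s->val[i] = 0;
}

/* Always true when NDEBUG is defined, since there is nothing to check. */
static bool
sketch_state_is_valid(const struct sketch_sampler_state *s)
{
#ifndef NDEBUG
   return s->magic == SKETCH_STATE_MAGIC;
#else
   (void)s;
   return true;
#endif
}

/* Called just before freeing, so a later use of the stale pointer fails
 * the validity check in debug builds.
 */
static void
sketch_state_poison(struct sketch_sampler_state *s)
{
#ifndef NDEBUG
   s->magic = 0;
#else
   (void)s;
#endif
}
```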


> uint32_tval[4];
>  };
>
>  struct si_cs_shader_state {
> struct si_compute   *program;
> struct si_compute   *emitted_program;
> unsignedoffset;
> boolinitialized;
> booluses_scratch;
>  };
> diff --git a/src/gallium/drivers/radeonsi/si_state.c
> b/src/gallium/drivers/radeonsi/si_state.c
> index 1ccf5b6..7ff9f8c 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -3240,20 +3240,21 @@ static void *si_create_sampler_state(struct
> pipe_context *ctx,
> util_memcpy_cpu_to_le32(
> &sctx->border_color_map[i],
> &state->border_color,
> sizeof(state->border_color));
> sctx->border_color_count++;
> }
>
> border_color_index = i;
> }
> }
>
> +   rstate->magic = SI_SAMPLER_STATE_MAGIC;
> rstate->val[0] = (S_008F30_CLAMP_X(si_tex_wrap(state->wrap_s)) |
>   S_008F30_CLAMP_Y(si_tex_wrap(state->wrap_t)) |
>   S_008F30_CLAMP_Z(si_tex_wrap(state->wrap_r)) |
>   S_008F30_MAX_ANISO_RATIO(max_aniso_ratio) |
>   S_008F30_DEPTH_COMPARE_FUNC(
> si_tex_compare(state->compare_func)) |
>   
> S_008F30_FORCE_UNNORMALIZED(!state->normalized_coords)
> |
>   S_008F30_ANISO_THRESHOLD(max_aniso_ratio >> 1) |
>   S_008F30_ANISO_BIAS(max_aniso_ratio) |
>   
> S_008F30_DISABLE_CUBE_WRAP(!state->seamless_cube_map)
> |
>   S_008F30_COMPAT_MODE(sctx->b.chip_class >= VI));
> @@ -3296,20 +3297,24 @@ static void si_emit_sample_mask(struct si_context
> *sctx, struct r600_atom *atom)
> assert(mask == 0x || sctx->framebuffer.nr_samples > 1 ||
>(mask & 1 && sctx->blitter->running));
>
> radeon_set_context_reg_seq(cs, R_028C38_PA_SC_AA_MASK_X0Y0_X1Y0,
> 2);
> radeon_emit(cs, mask | (mask << 16));
> radeon_emit(cs, mask | (mask << 16));
>  }
>
>  static void si_delete_sampler_state(struct pipe_context *ctx, void *state)
>  {
> +   struct si_sampler_state *s = state;
> +
> +   assert(s->magic == SI_SAMPLER_STATE_MAGIC);
> +   s->magic = 0;
>

[Mesa-dev] [PATCH 08/10] radeonsi: wait for outstanding memory instructions in TCS barriers

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

Cc: 13.0 12.0 
---
 src/gallium/drivers/radeonsi/si_shader.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index b914efb..45896bd 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3176,28 +3176,32 @@ static void build_type_name_for_intr(
 }
 
 static void build_tex_intrinsic(const struct lp_build_tgsi_action *action,
struct lp_build_tgsi_context *bld_base,
struct lp_build_emit_data *emit_data);
 
 /* Prevent optimizations (at least of memory accesses) across the current
  * point in the program by emitting empty inline assembly that is marked as
  * having side effects.
  */
+#if 0 /* unused currently */
 static void emit_optimization_barrier(struct si_shader_context *ctx)
 {
LLVMBuilderRef builder = ctx->gallivm.builder;
LLVMTypeRef ftype = LLVMFunctionType(ctx->voidt, NULL, 0, false);
LLVMValueRef inlineasm = LLVMConstInlineAsm(ftype, "", "", true, false);
LLVMBuildCall(builder, inlineasm, NULL, 0, "");
 }
+#endif
 
+/* Combine these with & instead of |. */
+#define LGKM_CNT 0x07f
 #define VM_CNT 0xf70
 
 static void emit_waitcnt(struct si_shader_context *ctx, unsigned simm16)
 {
struct gallivm_state *gallivm = &ctx->gallivm;
LLVMBuilderRef builder = gallivm->builder;
LLVMValueRef args[1] = {
lp_build_const_int32(gallivm, simm16)
};
lp_build_intrinsic(builder, "llvm.amdgcn.s.waitcnt",
@@ -5228,21 +5232,21 @@ static void si_llvm_emit_barrier(const struct 
lp_build_tgsi_action *action,
 struct lp_build_tgsi_context *bld_base,
 struct lp_build_emit_data *emit_data)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
 
/* The real barrier instruction isn’t needed, because an entire patch
 * always fits into a single wave.
 */
if (ctx->type == PIPE_SHADER_TESS_CTRL) {
-   emit_optimization_barrier(ctx);
+   emit_waitcnt(ctx, LGKM_CNT & VM_CNT);
return;
}
 
lp_build_intrinsic(gallivm->builder,
   HAVE_LLVM >= 0x0309 ? "llvm.amdgcn.s.barrier"
   : "llvm.AMDGPU.barrier.local",
   ctx->voidt, NULL, 0, 0);
 }
 
 static const struct lp_build_tgsi_action tex_action = {
-- 
2.7.4
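
On why the masks are combined with & rather than |: going by the constants in this patch, a counter field of all ones means "do not wait on this counter" and a field of zero means "wait until it drains". The sketch below illustrates that reading. The field positions (VM in bits 0-3, LGKM in bits 8-11) are inferred from the two constants, not taken from the ISA documentation, and the helper names are hypothetical.

```c
#include <stdbool.h>

#define LGKM_CNT 0x07f
#define VM_CNT   0xf70

/* Assumed field positions, inferred from the two constants above. */
static bool
waits_for_vm(unsigned simm16)
{
   return (simm16 & 0x00f) == 0;
}

static bool
waits_for_lgkm(unsigned simm16)
{
   return (simm16 & 0xf00) == 0;
}
```

Under these assumptions `LGKM_CNT & VM_CNT` is 0x070, which waits on both counters, while `LGKM_CNT | VM_CNT` is 0xf7f, which waits on neither.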



[Mesa-dev] [PATCH 07/10] radeonsi: allow specifying simm16 of emit_waitcnt at call sites

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

The next commit will use this.

Cc: 13.0 12.0 
---
 src/gallium/drivers/radeonsi/si_shader.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 145de9f..b914efb 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3184,39 +3184,41 @@ static void build_tex_intrinsic(const struct 
lp_build_tgsi_action *action,
  * having side effects.
  */
 static void emit_optimization_barrier(struct si_shader_context *ctx)
 {
LLVMBuilderRef builder = ctx->gallivm.builder;
LLVMTypeRef ftype = LLVMFunctionType(ctx->voidt, NULL, 0, false);
LLVMValueRef inlineasm = LLVMConstInlineAsm(ftype, "", "", true, false);
LLVMBuildCall(builder, inlineasm, NULL, 0, "");
 }
 
-static void emit_waitcnt(struct si_shader_context *ctx)
+#define VM_CNT 0xf70
+
+static void emit_waitcnt(struct si_shader_context *ctx, unsigned simm16)
 {
struct gallivm_state *gallivm = &ctx->gallivm;
LLVMBuilderRef builder = gallivm->builder;
LLVMValueRef args[1] = {
-   lp_build_const_int32(gallivm, 0xf70)
+   lp_build_const_int32(gallivm, simm16)
};
lp_build_intrinsic(builder, "llvm.amdgcn.s.waitcnt",
   ctx->voidt, args, 1, 0);
 }
 
 static void membar_emit(
const struct lp_build_tgsi_action *action,
struct lp_build_tgsi_context *bld_base,
struct lp_build_emit_data *emit_data)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
 
-   emit_waitcnt(ctx);
+   emit_waitcnt(ctx, VM_CNT);
 }
 
 static LLVMValueRef
 shader_buffer_fetch_rsrc(struct si_shader_context *ctx,
 const struct tgsi_full_src_register *reg)
 {
LLVMValueRef index;
LLVMValueRef rsrc_ptr = LLVMGetParam(ctx->main_fn,
 SI_PARAM_SHADER_BUFFERS);
 
@@ -3606,21 +3608,21 @@ static void load_emit(
LLVMBuilderRef builder = gallivm->builder;
const struct tgsi_full_instruction * inst = emit_data->inst;
char intrinsic_name[64];
 
if (inst->Src[0].Register.File == TGSI_FILE_MEMORY) {
load_emit_memory(ctx, emit_data);
return;
}
 
if (inst->Memory.Qualifier & TGSI_MEMORY_VOLATILE)
-   emit_waitcnt(ctx);
+   emit_waitcnt(ctx, VM_CNT);
 
if (inst->Src[0].Register.File == TGSI_FILE_BUFFER) {
load_emit_buffer(ctx, emit_data);
return;
}
 
if (inst->Memory.Texture == TGSI_TEXTURE_BUFFER) {
emit_data->output[emit_data->chan] =
lp_build_intrinsic(
builder, 
"llvm.amdgcn.buffer.load.format.v4f32", emit_data->dst_type,
@@ -3815,21 +3817,21 @@ static void store_emit(
const struct tgsi_full_instruction * inst = emit_data->inst;
unsigned target = inst->Memory.Texture;
char intrinsic_name[64];
 
if (inst->Dst[0].Register.File == TGSI_FILE_MEMORY) {
store_emit_memory(ctx, emit_data);
return;
}
 
if (inst->Memory.Qualifier & TGSI_MEMORY_VOLATILE)
-   emit_waitcnt(ctx);
+   emit_waitcnt(ctx, VM_CNT);
 
if (inst->Dst[0].Register.File == TGSI_FILE_BUFFER) {
store_emit_buffer(ctx, emit_data);
return;
}
 
if (target == TGSI_TEXTURE_BUFFER) {
emit_data->output[emit_data->chan] = lp_build_intrinsic(
builder, "llvm.amdgcn.buffer.store.format.v4f32",
emit_data->dst_type, emit_data->args,
-- 
2.7.4



[Mesa-dev] [PATCH 09/10] tgsi: fix the src type of TGSI_OPCODE_MEMBAR

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

It's a literal integer. The next commit will need this.

Cc: 13.0 12.0 
---
 src/gallium/auxiliary/tgsi/tgsi_info.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 18e1bc8..37549aa 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -478,20 +478,21 @@ tgsi_opcode_infer_src_type( uint opcode )
case TGSI_OPCODE_U2F:
case TGSI_OPCODE_U2D:
case TGSI_OPCODE_UADD:
case TGSI_OPCODE_SWITCH:
case TGSI_OPCODE_CASE:
case TGSI_OPCODE_SAMPLE_I:
case TGSI_OPCODE_SAMPLE_I_MS:
case TGSI_OPCODE_UMUL_HI:
case TGSI_OPCODE_UP2H:
case TGSI_OPCODE_U2I64:
+   case TGSI_OPCODE_MEMBAR:
   return TGSI_TYPE_UNSIGNED;
case TGSI_OPCODE_IMUL_HI:
case TGSI_OPCODE_I2F:
case TGSI_OPCODE_I2D:
case TGSI_OPCODE_I2I64:
   return TGSI_TYPE_SIGNED;
case TGSI_OPCODE_ARL:
case TGSI_OPCODE_ARR:
case TGSI_OPCODE_TXQ_LZ:
case TGSI_OPCODE_F2D:
-- 
2.7.4



[Mesa-dev] [PATCH 10/10] radeonsi: wait for outstanding LDS instructions in memory barriers if needed

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

Cc: 13.0 12.0 
---
 src/gallium/drivers/radeonsi/si_shader.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 45896bd..dc5c67a 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3187,42 +3187,58 @@ static void build_tex_intrinsic(const struct 
lp_build_tgsi_action *action,
 static void emit_optimization_barrier(struct si_shader_context *ctx)
 {
LLVMBuilderRef builder = ctx->gallivm.builder;
LLVMTypeRef ftype = LLVMFunctionType(ctx->voidt, NULL, 0, false);
LLVMValueRef inlineasm = LLVMConstInlineAsm(ftype, "", "", true, false);
LLVMBuildCall(builder, inlineasm, NULL, 0, "");
 }
 #endif
 
 /* Combine these with & instead of |. */
+#define NOOP_WAITCNT 0xf7f
 #define LGKM_CNT 0x07f
 #define VM_CNT 0xf70
 
 static void emit_waitcnt(struct si_shader_context *ctx, unsigned simm16)
 {
struct gallivm_state *gallivm = &ctx->gallivm;
LLVMBuilderRef builder = gallivm->builder;
LLVMValueRef args[1] = {
lp_build_const_int32(gallivm, simm16)
};
lp_build_intrinsic(builder, "llvm.amdgcn.s.waitcnt",
   ctx->voidt, args, 1, 0);
 }
 
 static void membar_emit(
const struct lp_build_tgsi_action *action,
struct lp_build_tgsi_context *bld_base,
struct lp_build_emit_data *emit_data)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
+   LLVMValueRef src0 = lp_build_emit_fetch(bld_base, emit_data->inst, 0, 
0);
+   unsigned flags = LLVMConstIntGetZExtValue(src0);
+   unsigned waitcnt = NOOP_WAITCNT;
 
-   emit_waitcnt(ctx, VM_CNT);
+   if (flags & TGSI_MEMBAR_THREAD_GROUP)
+   waitcnt &= VM_CNT & LGKM_CNT;
+
+   if (flags & (TGSI_MEMBAR_ATOMIC_BUFFER |
+TGSI_MEMBAR_SHADER_BUFFER |
+TGSI_MEMBAR_SHADER_IMAGE))
+   waitcnt &= VM_CNT;
+
+   if (flags & TGSI_MEMBAR_SHARED)
+   waitcnt &= LGKM_CNT;
+
+   if (waitcnt != NOOP_WAITCNT)
+   emit_waitcnt(ctx, waitcnt);
 }
 
 static LLVMValueRef
 shader_buffer_fetch_rsrc(struct si_shader_context *ctx,
 const struct tgsi_full_src_register *reg)
 {
LLVMValueRef index;
LLVMValueRef rsrc_ptr = LLVMGetParam(ctx->main_fn,
 SI_PARAM_SHADER_BUFFERS);
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/10] radeonsi: always restore sampler states when unbinding sampler views

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

Cc: 13.0 12.0 
---
 src/gallium/drivers/radeonsi/si_descriptors.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 097ffcd..8777f36 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -406,27 +406,27 @@ void si_set_mutable_tex_desc_fields(struct r600_texture 
*tex,
 }
 
 static void si_set_sampler_view(struct si_context *sctx,
unsigned shader,
unsigned slot, struct pipe_sampler_view *view,
bool disallow_early_out)
 {
struct si_sampler_views *views = &sctx->samplers[shader].views;
struct si_sampler_view *rview = (struct si_sampler_view*)view;
struct si_descriptors *descs = si_sampler_descriptors(sctx, shader);
+   uint32_t *desc = descs->list + slot * 16;
 
if (views->views[slot] == view && !disallow_early_out)
return;
 
if (view) {
struct r600_texture *rtex = (struct r600_texture 
*)view->texture;
-   uint32_t *desc = descs->list + slot * 16;
 
assert(rtex); /* views with texture == NULL aren't supported */
pipe_sampler_view_reference(&views->views[slot], view);
memcpy(desc, rview->state, 8*4);
 
if (rtex->resource.b.b.target == PIPE_BUFFER) {
rtex->resource.bind_history |= PIPE_BIND_SAMPLER_VIEW;
 
si_set_buf_desc_address(&rtex->resource,
view->u.buf.offset,
@@ -461,23 +461,28 @@ static void si_set_sampler_view(struct si_context *sctx,
 
views->enabled_mask |= 1u << slot;
 
/* Since this can flush, it must be done after enabled_mask is
 * updated. */
si_sampler_view_add_buffer(sctx, view->texture,
   RADEON_USAGE_READ,
   rview->is_stencil_sampler, true);
} else {
pipe_sampler_view_reference(&views->views[slot], NULL);
-   memcpy(descs->list + slot*16, null_texture_descriptor, 8*4);
+   memcpy(desc, null_texture_descriptor, 8*4);
/* Only clear the lower dwords of FMASK. */
-   memcpy(descs->list + slot*16 + 8, null_texture_descriptor, 4*4);
+   memcpy(desc + 8, null_texture_descriptor, 4*4);
+   /* Re-set the sampler state if we are transitioning from FMASK. 
*/
+   if (views->sampler_states[slot])
+   memcpy(desc + 12,
+  views->sampler_states[slot], 4*4);
+
views->enabled_mask &= ~(1u << slot);
}
 
descs->dirty_mask |= 1u << slot;
sctx->descriptors_dirty |= 1u << si_sampler_descriptors_idx(shader);
 }
 
 static bool is_compressed_colortex(struct r600_texture *rtex)
 {
return rtex->cmask.size || rtex->fmask.size ||
-- 
2.7.4
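
The slot layout this patch relies on can be sketched as follows: 16 dwords per slot, with the image descriptor in [0..7], FMASK in [8..15], and the sampler state sharing [12..15] when FMASK is not bound. The names `sketch_unbind_view`, `SKETCH_SLOT_DWORDS` and `sketch_null_desc` are hypothetical, and a zero-filled array stands in for null_texture_descriptor, whose real contents are not shown in this patch.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_SLOT_DWORDS 16

/* Zero-filled stand-in for the driver's null texture descriptor. */
static const uint32_t sketch_null_desc[8] = {0};

/* Unbind the view in a slot, then put the sampler state back into
 * dwords [12..15] in case FMASK had been occupying them.
 */
static void
sketch_unbind_view(uint32_t *list, unsigned slot, const uint32_t *sampler_val)
{
   uint32_t *desc = list + slot * SKETCH_SLOT_DWORDS;

   memcpy(desc, sketch_null_desc, 8 * 4);
   /* Only clear the lower dwords of FMASK. */
   memcpy(desc + 8, sketch_null_desc, 4 * 4);
   if (sampler_val)
      memcpy(desc + 12, sampler_val, 4 * 4);
}
```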



[Mesa-dev] [PATCH 05/10] radeonsi: check for sampler state CSO corruption

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

It really happens.
---
 src/gallium/drivers/radeonsi/si_descriptors.c | 1 +
 src/gallium/drivers/radeonsi/si_pipe.h| 3 +++
 src/gallium/drivers/radeonsi/si_state.c   | 5 +
 3 files changed, 9 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 8b6e0bb..4f78b1a 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -796,20 +796,21 @@ static void si_bind_sampler_states(struct pipe_context 
*ctx,
if (!count || shader >= SI_NUM_SHADERS)
return;
 
for (i = 0; i < count; i++) {
unsigned slot = start + i;
 
if (!sstates[i] ||
sstates[i] == samplers->views.sampler_states[slot])
continue;
 
+   assert(sstates[i]->magic == SI_SAMPLER_STATE_MAGIC);
samplers->views.sampler_states[slot] = sstates[i];
 
/* If FMASK is bound, don't overwrite it.
 * The sampler state will be set after FMASK is unbound.
 */
if (samplers->views.views[slot] &&
samplers->views.views[slot]->texture &&
samplers->views.views[slot]->texture->target != PIPE_BUFFER 
&&
((struct 
r600_texture*)samplers->views.views[slot]->texture)->fmask.size)
continue;
diff --git a/src/gallium/drivers/radeonsi/si_pipe.h 
b/src/gallium/drivers/radeonsi/si_pipe.h
index 42cbecb..a7985e7 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -130,21 +130,24 @@ struct si_sampler_view {
 /* [0..7] = image descriptor
  * [4..7] = buffer descriptor */
uint32_tstate[8];
uint32_tfmask_state[8];
const struct radeon_surf_level  *base_level_info;
unsignedbase_level;
unsignedblock_width;
bool is_stencil_sampler;
 };
 
+#define SI_SAMPLER_STATE_MAGIC 0x34f1c35a
+
 struct si_sampler_state {
+   unsignedmagic;
uint32_tval[4];
 };
 
 struct si_cs_shader_state {
struct si_compute   *program;
struct si_compute   *emitted_program;
unsignedoffset;
boolinitialized;
booluses_scratch;
 };
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 1ccf5b6..7ff9f8c 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3240,20 +3240,21 @@ static void *si_create_sampler_state(struct 
pipe_context *ctx,

util_memcpy_cpu_to_le32(&sctx->border_color_map[i],
&state->border_color,
sizeof(state->border_color));
sctx->border_color_count++;
}
 
border_color_index = i;
}
}
 
+   rstate->magic = SI_SAMPLER_STATE_MAGIC;
rstate->val[0] = (S_008F30_CLAMP_X(si_tex_wrap(state->wrap_s)) |
  S_008F30_CLAMP_Y(si_tex_wrap(state->wrap_t)) |
  S_008F30_CLAMP_Z(si_tex_wrap(state->wrap_r)) |
  S_008F30_MAX_ANISO_RATIO(max_aniso_ratio) |
  
S_008F30_DEPTH_COMPARE_FUNC(si_tex_compare(state->compare_func)) |
  
S_008F30_FORCE_UNNORMALIZED(!state->normalized_coords) |
  S_008F30_ANISO_THRESHOLD(max_aniso_ratio >> 1) |
  S_008F30_ANISO_BIAS(max_aniso_ratio) |
  S_008F30_DISABLE_CUBE_WRAP(!state->seamless_cube_map) 
|
  S_008F30_COMPAT_MODE(sctx->b.chip_class >= VI));
@@ -3296,20 +3297,24 @@ static void si_emit_sample_mask(struct si_context 
*sctx, struct r600_atom *atom)
assert(mask == 0x || sctx->framebuffer.nr_samples > 1 ||
   (mask & 1 && sctx->blitter->running));
 
radeon_set_context_reg_seq(cs, R_028C38_PA_SC_AA_MASK_X0Y0_X1Y0, 2);
radeon_emit(cs, mask | (mask << 16));
radeon_emit(cs, mask | (mask << 16));
 }
 
 static void si_delete_sampler_state(struct pipe_context *ctx, void *state)
 {
+   struct si_sampler_state *s = state;
+
+   assert(s->magic == SI_SAMPLER_STATE_MAGIC);
+   s->magic = 0;
free(state);
 }
 
 /*
  * Vertex elements & buffers
  */
 
 static void *si_create_vertex_elements(struct pipe_context *ctx,
   unsigned count,
   const struct 

[Mesa-dev] [PATCH 04/10] radeonsi: properly declare context sampler states

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_blit.c| 2 +-
 src/gallium/drivers/radeonsi/si_descriptors.c | 4 ++--
 src/gallium/drivers/radeonsi/si_state.h   | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index f5f49c1..83870e5 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -71,21 +71,21 @@ static void si_blitter_begin(struct pipe_context *ctx, enum 
si_blitter_op op)
util_blitter_save_viewport(sctx->blitter, &sctx->b.viewports.states[0]);
util_blitter_save_scissor(sctx->blitter, &sctx->b.scissors.states[0]);
}
 
if (op & SI_SAVE_FRAMEBUFFER)
util_blitter_save_framebuffer(sctx->blitter, &sctx->framebuffer.state);
 
if (op & SI_SAVE_TEXTURES) {
util_blitter_save_fragment_sampler_states(
sctx->blitter, 2,
-   
sctx->samplers[PIPE_SHADER_FRAGMENT].views.sampler_states);
+   
(void**)sctx->samplers[PIPE_SHADER_FRAGMENT].views.sampler_states);
 
util_blitter_save_fragment_sampler_views(sctx->blitter, 2,
sctx->samplers[PIPE_SHADER_FRAGMENT].views.views);
}
 
if (op & SI_DISABLE_RENDER_COND)
sctx->b.render_cond_force_off = true;
 }
 
 static void si_blitter_end(struct pipe_context *ctx)
diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index cf66102..8b6e0bb 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -449,39 +449,39 @@ static void si_set_sampler_view(struct si_context *sctx,
rtex->fmask.size) {
memcpy(desc + 8,
   rview->fmask_state, 8*4);
} else {
/* Disable FMASK and bind sampler state in [12:15]. */
memcpy(desc + 8,
   null_texture_descriptor, 4*4);
 
if (views->sampler_states[slot])
memcpy(desc + 12,
-  views->sampler_states[slot], 4*4);
+  views->sampler_states[slot]->val, 4*4);
}
 
views->enabled_mask |= 1u << slot;
 
/* Since this can flush, it must be done after enabled_mask is
 * updated. */
si_sampler_view_add_buffer(sctx, view->texture,
   RADEON_USAGE_READ,
   rview->is_stencil_sampler, true);
} else {
pipe_sampler_view_reference(&views->views[slot], NULL);
memcpy(desc, null_texture_descriptor, 8*4);
/* Only clear the lower dwords of FMASK. */
memcpy(desc + 8, null_texture_descriptor, 4*4);
/* Re-set the sampler state if we are transitioning from FMASK. 
*/
if (views->sampler_states[slot])
memcpy(desc + 12,
-  views->sampler_states[slot], 4*4);
+  views->sampler_states[slot]->val, 4*4);
 
views->enabled_mask &= ~(1u << slot);
}
 
descs->dirty_mask |= 1u << slot;
sctx->descriptors_dirty |= 1u << si_sampler_descriptors_idx(shader);
 }
 
 static bool is_compressed_colortex(struct r600_texture *rtex)
 {
diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h
index 3a9f0cf..eb7a69f 100644
--- a/src/gallium/drivers/radeonsi/si_state.h
+++ b/src/gallium/drivers/radeonsi/si_state.h
@@ -238,21 +238,21 @@ struct si_descriptors {
 
/* The shader userdata offset within a shader where the 64-bit pointer to the descriptor
 * array will be stored. */
unsigned shader_userdata_offset;
/* Whether the pointer should be re-emitted. */
bool pointer_dirty;
 };
 
 struct si_sampler_views {
struct pipe_sampler_view *views[SI_NUM_SAMPLERS];
-   void *sampler_states[SI_NUM_SAMPLERS];
+   struct si_sampler_state *sampler_states[SI_NUM_SAMPLERS];
 
/* The i-th bit is set if that element is enabled (non-NULL resource). */
unsigned enabled_mask;
 };
 
 struct si_buffer_resources {
enum radeon_bo_usage shader_usage; /* READ, WRITE, or READWRITE */
enum radeon_bo_priority priority;
struct pipe_resource **buffers; /* this has num_buffers elements */
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 03/10] radeonsi: fix incorrect FMASK checking in bind_sampler_states

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

Cc: 13.0 12.0 
---
 src/gallium/drivers/radeonsi/si_descriptors.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c
index 8777f36..cf66102 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -801,24 +801,24 @@ static void si_bind_sampler_states(struct pipe_context *ctx,
 
if (!sstates[i] ||
sstates[i] == samplers->views.sampler_states[slot])
continue;
 
samplers->views.sampler_states[slot] = sstates[i];
 
/* If FMASK is bound, don't overwrite it.
 * The sampler state will be set after FMASK is unbound.
 */
-   if (samplers->views.views[i] &&
-   samplers->views.views[i]->texture &&
-   samplers->views.views[i]->texture->target != PIPE_BUFFER &&
-   ((struct r600_texture*)samplers->views.views[i]->texture)->fmask.size)
+   if (samplers->views.views[slot] &&
+   samplers->views.views[slot]->texture &&
+   samplers->views.views[slot]->texture->target != PIPE_BUFFER &&
+   ((struct r600_texture*)samplers->views.views[slot]->texture)->fmask.size)
continue;
 
memcpy(desc->list + slot * 16 + 12, sstates[i]->val, 4*4);
desc->dirty_mask |= 1u << slot;
sctx->descriptors_dirty |= 1u << si_sampler_descriptors_idx(shader);
}
 }
 
 /* BUFFER RESOURCES */
 
-- 
2.7.4
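
Editorial illustration of the indexing issue this patch fixes. The sketch below is a toy model, not the radeonsi code: struct toy_samplers, toy_bind_sampler_states and NUM_SLOTS are hypothetical names, and it assumes that the incoming states array is indexed by i while the bound views are indexed by slot = start + i. The FMASK check therefore has to look at the slot being written, which only differs from i when start is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SLOTS 8 /* hypothetical; stands in for SI_NUM_SAMPLERS */

/* Hypothetical, simplified stand-in for the per-shader sampler bindings. */
struct toy_samplers {
   bool        has_fmask[NUM_SLOTS];      /* a view with FMASK is bound here */
   const void *sampler_states[NUM_SLOTS];
   unsigned    written_mask;              /* slots whose descriptor was rewritten */
};

/* Bind `count` states starting at `start`; states[i] goes to slot start + i. */
static void toy_bind_sampler_states(struct toy_samplers *s, unsigned start,
                                    unsigned count, const void **states)
{
   assert(start + count <= NUM_SLOTS);

   for (unsigned i = 0; i < count; i++) {
      unsigned slot = start + i;

      if (!states[i] || states[i] == s->sampler_states[slot])
         continue;

      s->sampler_states[slot] = states[i];

      /* Check the view bound to the destination slot, not to index i.
       * Using has_fmask[i] here would consult an unrelated slot whenever
       * start != 0. */
      if (s->has_fmask[slot])
         continue;

      s->written_mask |= 1u << slot;
   }
}
```

With start = 2, a check on index i would look at slots 0 and 1 instead of slots 2 and 3, so a state could be written over a bound FMASK or wrongly skipped.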



[Mesa-dev] [PATCH 06/10] radeonsi: write shader descriptors into hang reports

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_debug.c   | 114 ++
 src/gallium/drivers/radeonsi/si_descriptors.c |   1 +
 src/gallium/drivers/radeonsi/si_state.h   |   2 +
 3 files changed, 117 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_debug.c b/src/gallium/drivers/radeonsi/si_debug.c
index b2c3574..1090dda 100644
--- a/src/gallium/drivers/radeonsi/si_debug.c
+++ b/src/gallium/drivers/radeonsi/si_debug.c
@@ -667,37 +667,151 @@ static void si_dump_framebuffer(struct si_context *sctx, FILE *f)
}
 
if (state->zsbuf) {
rtex = (struct r600_texture*)state->zsbuf->texture;
fprintf(f, COLOR_YELLOW "Depth-stencil buffer:" COLOR_RESET "\n");
r600_print_texture_info(rtex, f);
fprintf(f, "\n");
}
 }
 
+static void si_dump_descriptor_list(struct si_descriptors *desc,
+   const char *shader_name,
+   const char *elem_name,
+   unsigned num_elements,
+   FILE *f)
+{
+   unsigned i, j;
+   uint32_t *cpu_list = desc->list;
+   uint32_t *gpu_list = desc->gpu_list;
+   const char *list_note = "GPU list";
+
+   if (!gpu_list) {
+   gpu_list = cpu_list;
+   list_note = "CPU list";
+   }
+
+   for (i = 0; i < num_elements; i++) {
+   fprintf(f, COLOR_GREEN "%s%s slot %u (%s):" COLOR_RESET "\n",
+   shader_name, elem_name, i, list_note);
+
+   switch (desc->element_dw_size) {
+   case 4:
+   for (j = 0; j < 4; j++)
+   si_dump_reg(f, R_008F00_SQ_BUF_RSRC_WORD0 + j*4,
+   gpu_list[j], 0xffffffff);
+   break;
+   case 8:
+   for (j = 0; j < 8; j++)
+   si_dump_reg(f, R_008F10_SQ_IMG_RSRC_WORD0 + j*4,
+   gpu_list[j], 0xffffffff);
+
+   fprintf(f, COLOR_CYAN "Buffer:" COLOR_RESET "\n");
+   for (j = 0; j < 4; j++)
+   si_dump_reg(f, R_008F00_SQ_BUF_RSRC_WORD0 + j*4,
+   gpu_list[4+j], 0xffffffff);
+   break;
+   case 16:
+   for (j = 0; j < 8; j++)
+   si_dump_reg(f, R_008F10_SQ_IMG_RSRC_WORD0 + j*4,
+   gpu_list[j], 0xffffffff);
+
+   fprintf(f, COLOR_CYAN "Buffer:" COLOR_RESET "\n");
+   for (j = 0; j < 4; j++)
+   si_dump_reg(f, R_008F00_SQ_BUF_RSRC_WORD0 + j*4,
+   gpu_list[4+j], 0xffffffff);
+
+   fprintf(f, COLOR_CYAN "FMASK:" COLOR_RESET "\n");
+   for (j = 0; j < 8; j++)
+   si_dump_reg(f, R_008F10_SQ_IMG_RSRC_WORD0 + j*4,
+   gpu_list[8+j], 0xffffffff);
+
+   fprintf(f, COLOR_CYAN "Sampler state:" COLOR_RESET "\n");
+   for (j = 0; j < 4; j++)
+   si_dump_reg(f, R_008F30_SQ_IMG_SAMP_WORD0 + j*4,
+   gpu_list[12+j], 0xffffffff);
+   break;
+   }
+
+   if (memcmp(gpu_list, cpu_list, desc->element_dw_size * 4) != 0) {
+   fprintf(f, COLOR_RED "! This slot was corrupted in GPU memory !"
+   COLOR_RESET "\n");
+   }
+
+   fprintf(f, "\n");
+   gpu_list += desc->element_dw_size;
+   cpu_list += desc->element_dw_size;
+   }
+}
+
+static void si_dump_descriptors(struct si_context *sctx,
+   struct si_shader_ctx_state *state,
+   FILE *f)
+{
+   if (!state->cso || !state->current)
+   return;
+
+   unsigned type = state->cso->type;
+   const struct tgsi_shader_info *info = &state->cso->info;
+   struct si_descriptors *descs =
+   &sctx->descriptors[SI_DESCS_FIRST_SHADER +
+  type * SI_NUM_SHADER_DESCS];
+   static const char *shader_name[] = {"VS", "PS", "GS", "TCS", "TES", "CS"};
+
+   static const char *elem_name[] = {
+   " - Constant buffer",
+   " - Shader buffer",
+   " - Sampler",
+   " - Image",
+   };
+   unsigned num_elements[] = {
+   util_last_bit(info->const_buffers_declared),
+   util_last_bit(info->shader_buffers_declared),
+   util_last_bit(info->samplers_declared),
+   

[Mesa-dev] [PATCH 01/10] radeonsi: take LDS into account for compute shader occupancy stats

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 44a4dd2..145de9f 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5943,61 +5943,71 @@ static void si_shader_dump_disassembly(const struct radeon_shader_binary *binary
fprintf(file, "Shader %s binary:\n", name);
for (i = 0; i < binary->code_size; i += 4) {
fprintf(file, "@0x%x: %02x%02x%02x%02x\n", i,
binary->code[i + 3], binary->code[i + 2],
binary->code[i + 1], binary->code[i]);
}
}
 }
 
 static void si_shader_dump_stats(struct si_screen *sscreen,
-struct si_shader_config *conf,
-unsigned num_inputs,
-unsigned code_size,
+struct si_shader *shader,
 struct pipe_debug_callback *debug,
 unsigned processor,
 FILE *file)
 {
+   struct si_shader_config *conf = &shader->config;
+   unsigned num_inputs = shader->selector ? shader->selector->info.num_inputs : 0;
+   unsigned code_size = si_get_shader_binary_size(shader);
unsigned lds_increment = sscreen->b.chip_class >= CIK ? 512 : 256;
unsigned lds_per_wave = 0;
unsigned max_simd_waves = 10;
 
/* Compute LDS usage for PS. */
-   if (processor == PIPE_SHADER_FRAGMENT) {
+   switch (processor) {
+   case PIPE_SHADER_FRAGMENT:
/* The minimum usage per wave is (num_inputs * 48). The maximum
 * usage is (num_inputs * 48 * 16).
 * We can get anything in between and it varies between waves.
 *
 * The 48 bytes per input for a single primitive is equal to
 * 4 bytes/component * 4 components/input * 3 points.
 *
 * Other stages don't know the size at compile time or don't
 * allocate LDS per wave, but instead they do it per thread group.
 */
lds_per_wave = conf->lds_size * lds_increment +
   align(num_inputs * 48, lds_increment);
+   break;
+   case PIPE_SHADER_COMPUTE:
+   if (shader->selector) {
+   unsigned max_workgroup_size =
+   si_get_max_workgroup_size(shader);
+   lds_per_wave = (conf->lds_size * lds_increment) /
+  DIV_ROUND_UP(max_workgroup_size, 64);
+   }
+   break;
}
 
/* Compute the per-SIMD wave counts. */
if (conf->num_sgprs) {
if (sscreen->b.chip_class >= VI)
max_simd_waves = MIN2(max_simd_waves, 800 / conf->num_sgprs);
else
max_simd_waves = MIN2(max_simd_waves, 512 / conf->num_sgprs);
}
 
if (conf->num_vgprs)
max_simd_waves = MIN2(max_simd_waves, 256 / conf->num_vgprs);
 
-   /* LDS is 64KB per CU (4 SIMDs), divided into 16KB blocks per SIMD
-* that PS can use.
-*/
+   /* LDS is 64KB per CU (4 SIMDs), which is 16KB per SIMD (usage above
+* 16KB makes some SIMDs unoccupied). */
if (lds_per_wave)
max_simd_waves = MIN2(max_simd_waves, 16384 / lds_per_wave);
 
if (file != stderr ||
r600_can_dump_shader(&sscreen->b, processor)) {
if (processor == PIPE_SHADER_FRAGMENT) {
fprintf(file, "*** SHADER CONFIG ***\n"
"SPI_PS_INPUT_ADDR = 0x%04x\n"
"SPI_PS_INPUT_ENA  = 0x%04x\n",
conf->spi_ps_input_addr, conf->spi_ps_input_ena);
@@ -6087,24 +6097,21 @@ void si_shader_dump(struct si_screen *sscreen, struct si_shader *shader,
   debug, "prolog", file);
 
si_shader_dump_disassembly(&shader->binary, debug, "main", file);
 
if (shader->epilog)
si_shader_dump_disassembly(&shader->epilog->binary,
   debug, "epilog", file);
fprintf(file, "\n");
}
 
-   si_shader_dump_stats(sscreen, &shader->config,
-shader->selector ? shader->selector->info.num_inputs : 0,
-si_get_shader_binary_size(shader), debug, processor,
-file);
+   si_shader_dump_stats(sscreen, shader, debug, processor, file);
 }
 
 int 
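
Editorial illustration of the occupancy arithmetic in this patch. This is a minimal standalone sketch, assuming the constants visible in the diff (64 threads per wave, 16384 bytes of LDS per SIMD, 800 or 512 SGPRs, 256 VGPRs, at most 10 waves per SIMD). The toy_* helper names and the flat parameter list are hypothetical and stand in for si_shader_config and si_get_max_workgroup_size().

```c
#include <assert.h>

static unsigned toy_min(unsigned a, unsigned b) { return a < b ? a : b; }
static unsigned toy_div_round_up(unsigned a, unsigned b) { return (a + b - 1) / b; }

/* Compute shaders allocate LDS per thread group, so divide the group's
 * allocation by the number of 64-thread waves in the largest group. */
static unsigned toy_compute_lds_per_wave(unsigned lds_bytes_per_group,
                                         unsigned max_workgroup_size)
{
   assert(max_workgroup_size > 0);
   return lds_bytes_per_group / toy_div_round_up(max_workgroup_size, 64);
}

/* Each resource caps the wave count independently; the minimum wins. */
static unsigned toy_max_simd_waves(unsigned num_sgprs, unsigned num_vgprs,
                                   unsigned lds_per_wave, int is_vi_or_later)
{
   unsigned max_simd_waves = 10;

   if (num_sgprs)
      max_simd_waves = toy_min(max_simd_waves,
                               (is_vi_or_later ? 800u : 512u) / num_sgprs);
   if (num_vgprs)
      max_simd_waves = toy_min(max_simd_waves, 256 / num_vgprs);
   if (lds_per_wave)
      max_simd_waves = toy_min(max_simd_waves, 16384 / lds_per_wave);

   return max_simd_waves;
}
```

For example, a 256-thread group using 32 KB of LDS has 4 waves, so 8 KB per wave, which limits a SIMD to 2 waves even when the register counts would allow 10.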

[Mesa-dev] [PATCH 2/3] gallium: decrease the size of pipe_sampler_state fields

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

We've had unused bits.
---
 src/gallium/include/pipe/p_state.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h
index 46df196..d501a93 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -357,27 +357,27 @@ struct pipe_framebuffer_state
 
 
 /**
  * Texture sampler state.
  */
 struct pipe_sampler_state
 {
unsigned wrap_s:3;/**< PIPE_TEX_WRAP_x */
unsigned wrap_t:3;/**< PIPE_TEX_WRAP_x */
unsigned wrap_r:3;/**< PIPE_TEX_WRAP_x */
-   unsigned min_img_filter:2;/**< PIPE_TEX_FILTER_x */
+   unsigned min_img_filter:1;/**< PIPE_TEX_FILTER_x */
unsigned min_mip_filter:2;/**< PIPE_TEX_MIPFILTER_x */
-   unsigned mag_img_filter:2;/**< PIPE_TEX_FILTER_x */
+   unsigned mag_img_filter:1;/**< PIPE_TEX_FILTER_x */
unsigned compare_mode:1;  /**< PIPE_TEX_COMPARE_x */
unsigned compare_func:3;  /**< PIPE_FUNC_x */
unsigned normalized_coords:1; /**< Are coords normalized to [0,1]? */
-   unsigned max_anisotropy:6;
+   unsigned max_anisotropy:5;
unsigned seamless_cube_map:1;
float lod_bias;   /**< LOD/lambda bias */
float min_lod, max_lod;   /**< LOD clamp range, after bias */
union pipe_color_union border_color;
 };
 
 union pipe_surface_desc {
struct {
   unsigned level;
   unsigned first_layer:16;
-- 
2.7.4
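
Editorial illustration of the bit budget behind this change. The sketch assumes that the image filter enum has only two values (so its largest value is 1) and that the largest anisotropy value drivers need to store is 16; both are assumptions made for the example, not statements from the patch. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bits needed to store values in the range [0, max_value]. */
static unsigned toy_bits_needed(unsigned max_value)
{
   unsigned bits = 1;

   while (max_value >>= 1)
      bits++;
   return bits;
}

/* Sum of bitfield widths, i.e. how many bits the packed fields occupy. */
static unsigned toy_total_bits(const unsigned *widths, size_t count)
{
   unsigned total = 0;

   assert(widths != NULL);
   for (size_t i = 0; i < count; i++)
      total += widths[i];
   return total;
}
```

Summing the widths listed in the diff gives 27 bits before the patch and 24 bits after it, so three more bits of the 32-bit word become available for other fields.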



[Mesa-dev] [PATCH 1/3] cso: don't release sampler states that are bound

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

This fixes random radeonsi GPU hangs in Batman Arkham: Origins (Wine) and
probably many other games too.

cso_cache deletes sampler states when the cache size is too big and doesn't
check which sampler states are bound, causing use-after-free in drivers.
Because of that, radeonsi uploaded garbage sampler states and the hardware
went bananas. Other drivers may have experienced similar issues.

Cc: 13.0 12.0 
---
 src/gallium/auxiliary/cso_cache/cso_cache.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/cso_cache/cso_cache.c b/src/gallium/auxiliary/cso_cache/cso_cache.c
index b240c93..1f3be4b 100644
--- a/src/gallium/auxiliary/cso_cache/cso_cache.c
+++ b/src/gallium/auxiliary/cso_cache/cso_cache.c
@@ -181,21 +181,23 @@ static inline void sanitize_cb(struct cso_hash *hash, enum cso_cache_type type,
   --to_remove;
}
 }
 
 struct cso_hash_iter
 cso_insert_state(struct cso_cache *sc,
  unsigned hash_key, enum cso_cache_type type,
  void *state)
 {
struct cso_hash *hash = _cso_hash_for_type(sc, type);
-   sanitize_hash(sc, hash, type, sc->max_size);
+
+   if (type != CSO_SAMPLER)
+  sanitize_hash(sc, hash, type, sc->max_size);
 
return cso_hash_insert(hash, hash_key, state);
 }
 
 struct cso_hash_iter
 cso_find_state(struct cso_cache *sc,
unsigned hash_key, enum cso_cache_type type)
 {
struct cso_hash *hash = _cso_hash_for_type(sc, type);
 
-- 
2.7.4
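
Editorial illustration of the behaviour change. The sketch is a toy counter model, not cso_cache: it tracks only how many entries of each type exist, and its eviction policy (drop just enough entries to make room) is a simplifying assumption. All toy_* names are hypothetical. The point it shows is that the size-based eviction is skipped for samplers, so a sampler state that a driver still has bound can no longer be freed behind its back.

```c
#include <assert.h>

enum toy_cache_type { TOY_BLEND, TOY_SAMPLER, TOY_TYPE_COUNT };

struct toy_cache {
   unsigned count[TOY_TYPE_COUNT];     /* live entries per type */
   unsigned evictions[TOY_TYPE_COUNT]; /* entries freed by the size limit */
   unsigned max_size;
};

/* Simplified size limit: free entries until one more fits. */
static void toy_sanitize(struct toy_cache *c, enum toy_cache_type type)
{
   while (c->count[type] >= c->max_size) {
      c->count[type]--;
      c->evictions[type]++;
   }
}

static void toy_insert_state(struct toy_cache *c, enum toy_cache_type type)
{
   assert(type < TOY_TYPE_COUNT);
   assert(c->max_size > 0);

   /* Samplers are exempt: the cache cannot tell which ones are bound. */
   if (type != TOY_SAMPLER)
      toy_sanitize(c, type);

   c->count[type]++;
}
```

The trade-off is that the sampler cache can grow without bound, which is presumably why the rest of this series reduces the number of distinct sampler states.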



[Mesa-dev] [PATCH 3/3] st/mesa: round lod_bias to a multiple of 1/256

2016-12-02 Thread Marek Olšák
From: Marek Olšák 

This reduces the number of sampler states 3.6x in Batman Arkham: Origins.
(from ~7200 to ~2000)
---
 src/mesa/state_tracker/st_atom_sampler.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/state_tracker/st_atom_sampler.c b/src/mesa/state_tracker/st_atom_sampler.c
index 4568630..daf98c3 100644
--- a/src/mesa/state_tracker/st_atom_sampler.c
+++ b/src/mesa/state_tracker/st_atom_sampler.c
@@ -152,20 +152,26 @@ convert_sampler(struct st_context *st,
sampler->wrap_r = gl_wrap_xlate(msamp->WrapR);
 
sampler->min_img_filter = gl_filter_to_img_filter(msamp->MinFilter);
sampler->min_mip_filter = gl_filter_to_mip_filter(msamp->MinFilter);
sampler->mag_img_filter = gl_filter_to_img_filter(msamp->MagFilter);
 
if (texobj->Target != GL_TEXTURE_RECTANGLE_ARB)
   sampler->normalized_coords = 1;
 
sampler->lod_bias = ctx->Texture.Unit[texUnit].LodBias + msamp->LodBias;
+   /* Reduce the number of states by allowing only the values that AMD GCN
+* can represent. Apps use lod_bias for smooth transitions to bigger mipmap
+* levels.
+*/
+   sampler->lod_bias = CLAMP(sampler->lod_bias, -16, 16);
+   sampler->lod_bias = floorf(sampler->lod_bias * 256) / 256;
 
sampler->min_lod = MAX2(msamp->MinLod, 0.0f);
sampler->max_lod = msamp->MaxLod;
if (sampler->max_lod < sampler->min_lod) {
   /* The GL spec doesn't seem to specify what to do in this case.
* Swap the values.
*/
   float tmp = sampler->max_lod;
   sampler->max_lod = sampler->min_lod;
   sampler->min_lod = tmp;
-- 
2.7.4
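
Editorial illustration of the quantization added above: clamp to [-16, 16], then round down to a multiple of 1/256. The function name is hypothetical, and the sketch replaces floorf() with an integer cast plus a correction for negative values so that it does not depend on the math library; that substitution is only safe because the clamped, scaled value always fits in an int.

```c
#include <assert.h>

static float toy_quantize_lod_bias(float lod_bias)
{
   assert(lod_bias == lod_bias); /* NaN is not handled by this sketch */

   /* Same range as the CLAMP(..., -16, 16) in the patch. */
   if (lod_bias < -16.0f)
      lod_bias = -16.0f;
   if (lod_bias > 16.0f)
      lod_bias = 16.0f;

   float scaled = lod_bias * 256.0f; /* now within [-4096, 4096] */
   int steps = (int)scaled;          /* truncates toward zero */

   /* Turn truncation into floor for negative non-integers. */
   if ((float)steps > scaled)
      steps--;

   return (float)steps / 256.0f;
}
```

Nearby biases collapse onto the same value, for example 0.3 becomes 76/256 = 0.296875, which is what lets many almost-identical sampler states share one cache entry.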



Re: [Mesa-dev] [PATCH] swr: Fix active_queries count

2016-12-02 Thread Rowley, Timothy O
Reviewed-by: Tim Rowley

On Dec 1, 2016, at 7:08 PM, Bruce Cherniak wrote:

The active_query count was incorrect for query types that don't require
a begin_query.  Removed the unnecessary assert.
---
src/gallium/drivers/swr/swr_query.cpp | 13 +++--
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_query.cpp b/src/gallium/drivers/swr/swr_query.cpp
index a95e0d8..6eb0781 100644
--- a/src/gallium/drivers/swr/swr_query.cpp
+++ b/src/gallium/drivers/swr/swr_query.cpp
@@ -165,8 +165,9 @@ swr_begin_query(struct pipe_context *pipe, struct pipe_query *q)
   /* Initialize Results */
   memset(&pq->result, 0, sizeof(pq->result));
   switch (pq->type) {
+   case PIPE_QUERY_GPU_FINISHED:
   case PIPE_QUERY_TIMESTAMP:
-  /* nothing to do */
+  /* nothing to do, but don't want the default */
  break;
   case PIPE_QUERY_TIME_ELAPSED:
  pq->result.timestamp_start = swr_get_timestamp(pipe->screen);
@@ -181,10 +182,10 @@ swr_begin_query(struct pipe_context *pipe, struct pipe_query *q)
 SwrEnableStatsFE(ctx->swrContext, TRUE);
 SwrEnableStatsBE(ctx->swrContext, TRUE);
  }
+  ctx->active_queries++;
  break;
   }

-   ctx->active_queries++;

   return true;
}
@@ -195,11 +196,10 @@ swr_end_query(struct pipe_context *pipe, struct pipe_query *q)
   struct swr_context *ctx = swr_context(pipe);
   struct swr_query *pq = swr_query(q);

-   assert(ctx->active_queries
-  && "swr_end_query, there are no active queries!");
-   ctx->active_queries--;
-
   switch (pq->type) {
+   case PIPE_QUERY_GPU_FINISHED:
+  /* nothing to do, but don't want the default */
+  break;
   case PIPE_QUERY_TIMESTAMP:
   case PIPE_QUERY_TIME_ELAPSED:
  pq->result.timestamp_end = swr_get_timestamp(pipe->screen);
@@ -214,6 +214,7 @@ swr_end_query(struct pipe_context *pipe, struct pipe_query *q)
  swr_fence_submit(ctx, pq->fence);

  /* Only change stat collection if there are no active queries */
+  ctx->active_queries--;
  if (ctx->active_queries == 0) {
 SwrEnableStatsFE(ctx->swrContext, FALSE);
 SwrEnableStatsBE(ctx->swrContext, FALSE);
--
2.7.4
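
Editorial illustration of the counting rule after this patch. The sketch is a toy model in C with hypothetical toy_* names, and it assumes that stats collection is switched on when the first counted query begins and off when the last one ends. Only queries that reach the default branch touch the counter, so a timestamp query that is ended without ever being begun cannot unbalance it. The assert in the default branch is an addition of this sketch; the patch itself removes the driver's assert.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_query_type {
   TOY_QUERY_GPU_FINISHED,
   TOY_QUERY_TIMESTAMP,
   TOY_QUERY_TIME_ELAPSED,
   TOY_QUERY_OCCLUSION /* stands in for every stats-based query type */
};

struct toy_ctx {
   unsigned active_queries;
   bool     stats_enabled;
};

static void toy_begin_query(struct toy_ctx *ctx, enum toy_query_type type)
{
   switch (type) {
   case TOY_QUERY_GPU_FINISHED:
   case TOY_QUERY_TIMESTAMP:
   case TOY_QUERY_TIME_ELAPSED:
      /* Not counted: these do not rely on stats collection. */
      break;
   default:
      if (ctx->active_queries == 0)
         ctx->stats_enabled = true;
      ctx->active_queries++;
      break;
   }
}

static void toy_end_query(struct toy_ctx *ctx, enum toy_query_type type)
{
   switch (type) {
   case TOY_QUERY_GPU_FINISHED:
   case TOY_QUERY_TIMESTAMP:
   case TOY_QUERY_TIME_ELAPSED:
      break;
   default:
      assert(ctx->active_queries > 0);
      ctx->active_queries--;
      if (ctx->active_queries == 0)
         ctx->stats_enabled = false;
      break;
   }
}
```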



Re: [Mesa-dev] [PATCH] i915: Stop claiming GL 2.1 support.

2016-12-02 Thread Emil Velikov
On 2 December 2016 at 19:49, Matt Turner  wrote:
> A user reporting an unrelated bug (98964) said that he has to set
> MESA_GL_VERSION_OVERRIDE=1.4 when running Chromium otherwise it's too
> slow. I presume that it's attempting to use GL 2.0/2.1 features that
> aren't hardware-supported on i915.
Ubuntu has been carrying a slightly different patch for a while [1].
JFYI - I cannot comment which one is the better option.

-Emil

[1] https://anonscm.debian.org/git/pkg-xorg/lib/mesa.git/tree/debian/patches/i915-dont-default-to-2.1.patch?h=ubuntu


Re: [Mesa-dev] [PATCH 02/23] tgsi: add Stream{X, Y, Z, W} fields to tgsi_declaration_semantic

2016-12-02 Thread Nicolai Hähnle

On 02.12.2016 19:46, Roland Scheidegger wrote:

Am 02.12.2016 um 18:23 schrieb Nicolai Hähnle:

On 30.11.2016 21:37, Roland Scheidegger wrote:

Am 30.11.2016 um 20:19 schrieb Nicolai Hähnle:

On 30.11.2016 19:06, Roland Scheidegger wrote:

Am 30.11.2016 um 14:35 schrieb Nicolai Hähnle:

From: Nicolai Hähnle 

This is for geometry shader outputs. Without it, drivers have no
way of
knowing which stream each output is intended for, and have to
conservatively write all outputs to all streams.

Separate stream numbers for each component are required due to output
packing.

Are you sure this is true?
This is an area I don't know much about, but
https://www.opengl.org/wiki/Layout_Qualifier_(GLSL)

tells me "Stream
assignments for a geometry shader are required to be the same for all
members of a block, but offsets are not."

Therefore I don't think output packing should ever happen across
multiple streams. I think it would be MUCH nicer if the semantic needed
just one stream member...


There are two variants of that question, I guess.

The answer to the first variant is: Yes, this is currently true.
lower_packed_varyings will happily pack outputs from different vertex
streams into the same vec4. This affects quite a lot of programs, e.g.
you see it in piglit arb_gpu_shader5-xfb-streams.

The second question is: Do we want it to be true? I agree that it would
be convenient to be able to use a single Stream member. Also, isolating
the stream0 components from the rest would lead to slightly more
efficient shaders for us in some cases.

I opted against it so far because I didn't want to think through the
implications of changing lower_packed_varyings. The main question I have
is: if you account for the size of the GS output in # of components,
then it could happen that the number of output vec4s ends up being
larger than (max # of output components) / 4. Will that be a problem
somewhere?


I don't know if that would be a problem, but if it is I'd assume this
would be fixable (since the number of actual components ultimately
doesn't change).
Having outputs belonging to multiple streams in a single output just
seems weird...
That said, I wonder if it actually would be possible to do that with
d3d11 too.
With shader model 5 you'd have:
dcl_stream 0
dcl_output o0.xy
dcl_stream 1
dcl_output o0.zw // legal or not???

Though the shader model 4/5 rules are a bit weird for packing
inputs/outputs, I'm not even sure two dcl_output are legal for the same
reg without a dcl_stream in between them (but you can pack system values
together with ordinary inputs/outputs).

So maybe just allowing this is the right solution...


I played around with the DX shader compiler, and I have some annoying
news. SM5 actually uses not just the same output register but even the
same component for multiple streams -- see the output I've pasted at the
end.

So how to proceed? To simplify things going forward, I'm mostly
convinced that the GLSL output packing should be changed to pack outputs
by stream. As I mentioned previously, this has other minor advantages
for us anyway.

Then one possibility to accommodate SM5 would be to have a Stream
bitmask, one bit per stream, as part of the output semantics. The
downside of this is that I wanted to use the WriteMask as an additional
optimization to avoid writing out unused components, and you'd then need
separate WriteMasks for each stream.

The other possibility, which I prefer, would be to have just a single
Stream field indicating one stream number per output register, and
aliasing is just not allowed despite what SM5 wants.


I have to go back on that unfortunately: I forgot that it's possible to 
create location aliasing across vertex streams via ARB_enhanced_layouts. 
I looked hard and found nothing in the spec that would forbid it, and 
our closed source driver also allows it.


So my plan now is to leave the StreamXYZW stuff as is. I will send 
around a v2 of this series to account for this use case (because there's 
still a problem in the GLSL-to-TGSI translation), plus some 
radeonsi-specific additions.


I'm also going to send a piglit test around that exercises this.

Cheers,
Nicolai



TGSI -> SM5 conversion is trivial.

SM5 -> TGSI conversion is also possible despite the aliasing on the DX
side, because the doc says this about emit_stream: "Af[t]er the emit,
all data in all output registers for all streams become uninitialized,
not just the stream emitted to."

Oh that's pretty interesting, since emit didn't have that part about
outputs becoming uninitialized. Maybe that's just what was needed to
keep implementations sane when allowing the crazy "same output multiple
stream" stuff... Or I suppose it's not actually that crazy then...
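
Editorial illustration of the per-component stream fields discussed in this thread. The struct below is a hypothetical stand-in, not the actual tgsi_declaration_semantic layout; it assumes four 2-bit fields, one per component, as the patch title suggests. The helper derives, for a given stream, the write mask of the components that belong to it, which is the information a driver needs when one packed vec4 carries outputs of several streams.

```c
#include <assert.h>

/* Hypothetical layout: one 2-bit stream index per output component. */
struct toy_decl_semantic {
   unsigned Name:8;
   unsigned Index:16;
   unsigned StreamX:2;
   unsigned StreamY:2;
   unsigned StreamZ:2;
   unsigned StreamW:2;
};

/* Returns a mask with bit 0 = X, bit 1 = Y, bit 2 = Z, bit 3 = W set for
 * every component that is assigned to `stream`. */
static unsigned toy_stream_writemask(const struct toy_decl_semantic *sem,
                                     unsigned stream)
{
   unsigned mask = 0;

   assert(stream < 4);

   if (sem->StreamX == stream) mask |= 1u << 0;
   if (sem->StreamY == stream) mask |= 1u << 1;
   if (sem->StreamZ == stream) mask |= 1u << 2;
   if (sem->StreamW == stream) mask |= 1u << 3;

   return mask;
}
```

An output whose .xy belongs to stream 0 and whose .zw belongs to stream 1 then yields mask 0x3 for stream 0 and 0xc for stream 1, so each stream's copy only needs to write its own components.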




[Mesa-dev] Mesa 12.0.5 release candidate

2016-12-02 Thread Emil Velikov
Hello list,

The candidate for the Mesa 12.0.5 is now available. Currently we have:
 - 25 queued
 - 0 nominated (outstanding)
 - and 1 rejected patches

Take a look at section "Mesa stable queue" for more information.

Note: This is the final planned release for the 12.0 stable branch.


Testing reports/general approval
--------------------------------
Any testing reports (or general approval of the state of the branch) will be
greatly appreciated.

The plan is to have 12.0.5 this Sunday (4th of December), around or shortly
after 20:00 GMT.

If you have any questions or suggestions - be that about the current patch
queue or otherwise, please go ahead.


Trivial merge conflicts
-----------------------
None


Cheers,
Emil


Mesa stable queue
-----------------

Nominated (0)
=============


Queued (25)
===========

Adam Jackson (2):
  glx/glvnd: Don't modify the dummy slot in the dispatch table
  glx/glvnd: Fix dispatch function names and indices

Anuj Phogat (1):
  i965: Fix GPU hang related to multiple render targets and alpha testing

Emil Velikov (3):
  docs: add release notes for 12.0.4
  docs: add sha256 checksums for 12.0.4
  cherry-ignore: add reverted LLVM_LIBDIR patch

Haixia Shi (1):
  mesa: change state query return value for RGB565

Jason Ekstrand (3):
  i965/fs/generator: Don't use the address immediate for MOV_INDIRECT
  anv/cmd_buffer: Take a command buffer instead of a batch in two helpers
  anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4

Kenneth Graunke (1):
  intel: Fix pixel shader scratch space allocation on Gen9+ platforms.

Marek Olšák (9):
  gallium/radeon: fix behavior of GLSL findLSB(0)
  gallium/radeon: make sure HTILE address is aligned properly
  radeonsi: fix an assertion failure in si_decompress_sampler_color_textures
  gallium/radeon: unify viewport emission code
  gallium/radeon: set VPORT_ZMIN/MAX registers correctly
  radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader
  radeonsi: fix a crash in imageSize for cubemap arrays
  radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it
  gallium/radeon: add support for sharing textures with DCC between processes

Matt Turner (1):
  anv: Replace "abi_versions" with correct "api_version".

Steinar H. Gunderson (1):
  Fix races during _mesa_HashWalk().

Tim Rowley (3):
  swr: [rasterizer jitter] cleanup supporting different llvm versions
  swr: [rasterizer jitter] fix llvm-3.7 compile
  swr: [rasterizer] add support for llvm-3.9


Rejected (1)
============

Emil Velikov (1)
  a39ad18 configure.ac: honour LLVM_LIBDIR when linking against LLVM

Reason: The patch was reverted shortly after it was merged.


Re: [Mesa-dev] [PATCH v2] swr: [rasterizer jitter] include cstdarg in builder_misc.cpp

2016-12-02 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak 

> On Dec 2, 2016, at 1:28 PM, Rowley, Timothy O wrote:
> 
> Fixes build problem with llvm-svn.
> 
> v2: use cstdarg instead of stdarg.h
> ---
> src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
> index d755cc3..8120a2f 100644
> --- a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
> +++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
> @@ -30,6 +30,7 @@
> #include "builder.h"
> #include "common/rdtsc_buckets.h"
> 
> +#include <cstdarg>
> 
> namespace SwrJit
> {
> @@ -1623,4 +1624,4 @@ namespace SwrJit
> }
> }
> 
> -}
> \ No newline at end of file
> +}
> -- 
> 2.7.4
> 



Re: [Mesa-dev] [PATCH 6/6] anv: do not open random render node(s)

2016-12-02 Thread Jason Ekstrand
I haven't reviewed in any sort of detail, but I'm fine with this change
FWIW.  I glanced at the drm entrypoints you're calling, and that's not code
we want to duplicate in anv.

On Fri, Dec 2, 2016 at 10:31 AM, Emil Velikov wrote:

> On 2 December 2016 at 17:33, Eric Engestrom wrote:
> > On Friday, 2016-12-02 16:31:49 +, Emil Velikov wrote:
> >> From: Emil Velikov 
> >>
> >> drmGetDevices2() provides us with enough flexibility to build heuristics
> >> upon. Opening a random node on the other hand will wake up the device,
> >> regardless if it's the one we're intereseted or not.
> >
> > "interested"
> > (same in the previous patch)
> >
> >>
> >> Cc: Jason Ekstrand 
> >> Signed-off-by: Emil Velikov 
> >> ---
> >>  src/intel/vulkan/Makefile.am  |  3 ++-
> >>  src/intel/vulkan/anv_device.c | 53 ++
> +
> >>  2 files changed, 40 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
> >> index df7645f..e309491 100644
> >> --- a/src/intel/vulkan/Makefile.am
> >> +++ b/src/intel/vulkan/Makefile.am
> >> @@ -66,7 +66,7 @@ AM_CPPFLAGS += \
> >>  endif
> >>
> >>  AM_CPPFLAGS += \
> >> - $(INTEL_CFLAGS) \
> >> + $(LIBDRM_CFLAGS) \
> >>   $(VALGRIND_CFLAGS) \
> >>   $(DEFINES)
> >>
> >> @@ -131,6 +131,7 @@ VULKAN_LIB_DEPS += \
> >>   $(top_builddir)/src/intel/isl/libisl.la \
> >>   $(top_builddir)/src/intel/blorp/libblorp.la \
> >>   $(PER_GEN_LIBS) \
> >> + $(LIBDRM_LIBS) \
> >>   $(PTHREAD_LIBS) \
> >>   $(DLOPEN_LIBS) \
> >>   -lm
> >
> > Unrelated -> separate patch?
> >
> It's related: libvulkan_intel.so does not pull libdrm.so as dependency
> (up-to this patch). As we use it below, we ought to add the link here.
> Otherwise we'll end up with intermittent breakage.
>
> Note: libvulkan_radeon.so already depends on libdrm, which is why the
> hunk is missing from that patch.
>
> >> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> >> index d594df7..9927ac2 100644
> >> --- a/src/intel/vulkan/anv_device.c
> >> +++ b/src/intel/vulkan/anv_device.c
> >> @@ -29,6 +29,7 @@
> >>  #include 
> >>  #include 
> >>  #include 
> >> +#include 
> >>
> >>  #include "anv_private.h"
> >>  #include "util/strtod.h"
> >> @@ -375,6 +376,40 @@ void anv_DestroyInstance(
> >> vk_free(&instance->alloc, instance);
> >>  }
> >>
> >> +static VkResult
> >> +anv_enumerate_devices(struct anv_instance *instance)
> >> +{
> >> +   /* TODO: Check for more devices ? */
> >> +   drmDevicePtr devices[8];
> >> +   VkResult result = VK_SUCCESS;
> >> +   int max_devices;
> >> +
> >> +   max_devices = drmGetDevices2(0, devices, sizeof(devices));
> >> +   if (max_devices < 1)
> >> +  return VK_ERROR_INCOMPATIBLE_DRIVER;
> >> +
> >> +   for (unsigned i = 0; i < (unsigned)max_devices; i++) {
> >> +  if (devices[i]->available_nodes & 1 << DRM_NODE_RENDER &&
> >> +  devices[i]->bustype == DRM_BUS_PCI &&
> >> +  devices[i]->deviceinfo.pci->vendor_id == 0x8086) {
> >
> > Yay, magic values!
> > I feel like we should replace all those with PCI_VENDOR_INTEL or something.
> >
> We have another three instances of these for each Vulkan driver.
> Barring any objections I'll do that as cleanup on top ?
>
> Thanks
> Emil
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
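
Editorial illustration of the device filter in the quoted patch, with the magic number given a name as suggested in the review. The sketch uses mock types and constants so that it builds without libdrm: struct toy_device, the TOY_* values and TOY_PCI_VENDOR_INTEL are all hypothetical stand-ins for drmDevice, DRM_NODE_RENDER, DRM_BUS_PCI and a vendor-id define. It also passes an element count to the search helper; as far as I know the last argument of drmGetDevices2() is likewise a number of array entries, so an ARRAY_SIZE-style count may be more appropriate there than sizeof(devices).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define TOY_PCI_VENDOR_INTEL 0x8086 /* hypothetical name for the magic value */

/* Mock values; the real ones come from the libdrm headers. */
enum { TOY_NODE_PRIMARY = 0, TOY_NODE_CONTROL = 1, TOY_NODE_RENDER = 2 };
enum { TOY_BUS_PCI = 0 };

struct toy_device {
   int      available_nodes; /* bitmask of node types the device exposes */
   int      bustype;
   uint16_t vendor_id;
};

static bool toy_is_intel_render_device(const struct toy_device *dev)
{
   assert(dev != NULL);

   return (dev->available_nodes & (1 << TOY_NODE_RENDER)) &&
          dev->bustype == TOY_BUS_PCI &&
          dev->vendor_id == TOY_PCI_VENDOR_INTEL;
}

/* Returns the index of the first matching device, or -1 if there is none. */
static int toy_find_intel_render_device(const struct toy_device *devs,
                                        size_t num_devices)
{
   for (size_t i = 0; i < num_devices; i++) {
      if (toy_is_intel_render_device(&devs[i]))
         return (int)i;
   }
   return -1;
}
```

Because the decision is made from enumeration data alone, no device node has to be opened, and therefore no unrelated GPU is woken up, until a match is found.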


Re: [Mesa-dev] [PATCH 19/22] anv/nir: add support for dvec3/4 consuming two locations

2016-12-02 Thread Jason Ekstrand
On Thu, Dec 1, 2016 at 10:22 PM, Jason Ekstrand wrote:

> +Ken
>
> On Thu, Dec 1, 2016 at 10:17 PM, Jason Ekstrand wrote:
>> I'm not sure how I feel about this one.  It seems like it would almost be
>> easier to just pick one convention or the other for NIR and adjust one of
>> the drivers accordingly.  I don't know that I have a huge preference which
>> convention we choose.  I guess the Vulkan convention matches our hardware a
>> bit better.  In either case, converting from one to the other should be a
>> simple matter of building a remap table or a creative use of popcount.
>>
>
As another data point, TGSI uses 2 slots for dvec3/4 inputs.  I think the
simplest and most consistent thing to do is make NIR use 2 slots and just
change the GL driver to work that way.  That way we can keep GL's oddity as
contained as possible.
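
Editorial illustration of the "creative use of popcount" mentioned in the quoted text below. It is a minimal sketch, assuming a bitmask in which bit i is set when the attribute at one-slot location i is a dvec3/dvec4; the function names and the mask parameter are hypothetical. Every such attribute below a location pushes it up by one slot in the two-slot convention.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count: clears the lowest set bit per iteration. */
static unsigned toy_popcount64(uint64_t v)
{
   unsigned n = 0;

   while (v) {
      v &= v - 1;
      n++;
   }
   return n;
}

/* Map a location in the one-slot convention to the first slot it occupies
 * in the two-slot convention. */
static unsigned toy_remap_location(unsigned location, uint64_t dual_slot_mask)
{
   assert(location < 64);

   uint64_t below = dual_slot_mask & ((UINT64_C(1) << location) - 1);

   return location + toy_popcount64(below);
}
```

With dvec4 attributes at locations 0 and 2 (mask 0x5), locations 0, 1, 2 and 3 map to slots 0, 2, 3 and 5.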


> On Fri, Nov 25, 2016 at 12:52 AM, Juan A. Suarez Romero <
>> jasua...@igalia.com> wrote:
>>
>>> One difference between OpenGL and Vulkan regarding 64-bit vertex
>>> attribute types is that dvec3 and dvec4 consumes just one location in
>>> OpenGL, while in Vulkan it consumes two locations.
>>>
>>> Thus, in OpenGL for each dvec3/dvec4 vertex attrib we mark just one bit
>>> in our internal inputs_read bitmap (and also the corresponding bit in
>>> double_inputs_read bitmap) while in Vulkan we mark two consecutive bits
>>> in both bitmaps.
>>>
>>> This is handled with a nir option called "dvec3_consumes_two_locations",
>>> which is set to True for Vulkan code. And all the computation regarding
>>> emitting vertices as well as the mapping between attributes and physical
>>> registers use this option to correctly do the work.
>>> ---
>>>  src/amd/vulkan/radv_pipeline.c   |  1 +
>>>  src/compiler/nir/nir.h   |  5 +++
>>>  src/compiler/nir/nir_gather_info.c   |  6 +--
>>>  src/gallium/drivers/freedreno/ir3/ir3_nir.c  |  1 +
>>>  src/intel/vulkan/anv_device.c|  2 +-
>>>  src/intel/vulkan/genX_pipeline.c | 62
>>> +---
>>>  src/mesa/drivers/dri/i965/brw_compiler.c | 23 ++-
>>>  src/mesa/drivers/dri/i965/brw_compiler.h |  2 +-
>>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 14 +--
>>>  src/mesa/drivers/dri/i965/brw_nir.c  | 18 +---
>>>  src/mesa/drivers/dri/i965/brw_vec4.cpp   | 13 --
>>>  src/mesa/drivers/dri/i965/intel_screen.c |  3 +-
>>>  12 files changed, 105 insertions(+), 45 deletions(-)
>>>
>>> diff --git a/src/amd/vulkan/radv_pipeline.c
>>> b/src/amd/vulkan/radv_pipeline.c
>>> index ee5d812..90d4650 100644
>>> --- a/src/amd/vulkan/radv_pipeline.c
>>> +++ b/src/amd/vulkan/radv_pipeline.c
>>> @@ -59,6 +59,7 @@ static const struct nir_shader_compiler_options
>>> nir_options = {
>>> .lower_unpack_unorm_4x8 = true,
>>> .lower_extract_byte = true,
>>> .lower_extract_word = true,
>>> +   .dvec3_consumes_two_locations = true,
>>>  };
>>>
>>>  VkResult radv_CreateShaderModule(
>>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>>> index 1679d89..0fc8f39 100644
>>> --- a/src/compiler/nir/nir.h
>>> +++ b/src/compiler/nir/nir.h
>>> @@ -1794,6 +1794,11 @@ typedef struct nir_shader_compiler_options {
>>>  * information must be inferred from the list of input nir_variables.
>>>  */
>>> bool use_interpolated_input_intrinsics;
>>> +
>>> +   /**
>>> +* In Vulkan, a dvec3/dvec4 consumes two locations instead of just one.
>>> +*/
>>> +   bool dvec3_consumes_two_locations;
>>>  } nir_shader_compiler_options;
>>>
>>>  typedef struct nir_shader {
>>> diff --git a/src/compiler/nir/nir_gather_info.c
>>> b/src/compiler/nir/nir_gather_info.c
>>> index 07c9949..8c80671 100644
>>> --- a/src/compiler/nir/nir_gather_info.c
>>> +++ b/src/compiler/nir/nir_gather_info.c
>>> @@ -96,7 +96,7 @@ mark_whole_variable(nir_shader *shader, nir_variable
>>> *var)
>>>
>>> const unsigned slots =
>>>var->data.compact ? DIV_ROUND_UP(glsl_get_length(type), 4)
>>> -: glsl_count_attribute_slots(type,
>>> is_vertex_input);
>>> +: glsl_count_attribute_slots(type,
>>> is_vertex_input && !shader->options->dvec3_consumes_two_locations);
>>>
>>
>> This makes no sense. Why are we passing is_vertex_input &&
>> !dvec3_consumes_two_locations to an argument labeled is_vertex_input?
>>
>>
>>>
>>> set_io_mask(shader, var, 0, slots);
>>>  }
>>> @@ -168,7 +168,7 @@ try_mask_partial_io(nir_shader *shader,
>>> nir_deref_var *deref)
>>> var->data.mode == nir_var_shader_in)
>>>is_vertex_input = true;
>>>
>>> -   unsigned offset = get_io_offset(deref, is_vertex_input);
>>> +   unsigned offset = get_io_offset(deref, is_vertex_input &&
>>> !shader->options->dvec3_consumes_two_locations);
>>>
>>
>> Same here
>>
>>
>>> if (offset == -1)
>>>return false;
>>>
>>> @@ -184,7 +184,7 @@ try_mask_partial_io(nir_shader *shader,
>>> 

[Mesa-dev] [PATCH] i915: Stop claiming GL 2.1 support.

2016-12-02 Thread Matt Turner
A user reporting an unrelated bug (98964) said that he has to set
MESA_GL_VERSION_OVERRIDE=1.4 when running Chromium otherwise it's too
slow. I presume that it's attempting to use GL 2.0/2.1 features that
aren't hardware-supported on i915.
---
 src/mesa/drivers/dri/i915/intel_screen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i915/intel_screen.c 
b/src/mesa/drivers/dri/i915/intel_screen.c
index 1b80df0..e31e9c0 100644
--- a/src/mesa/drivers/dri/i915/intel_screen.c
+++ b/src/mesa/drivers/dri/i915/intel_screen.c
@@ -1127,7 +1127,7 @@ set_max_gl_versions(struct intel_screen *screen)
case 3:
   psp->max_gl_core_version = 0;
   psp->max_gl_es1_version = 11;
-  psp->max_gl_compat_version = 21;
+  psp->max_gl_compat_version = 14;
   psp->max_gl_es2_version = 20;
   break;
case 2:
-- 
2.7.3



[Mesa-dev] [PATCH v2] swr: [rasterizer jitter] include cstdarg in builder_misc.cpp

2016-12-02 Thread Tim Rowley
Fixes build problem with llvm-svn.

v2: use cstdarg instead of stdarg.h
---
 src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
index d755cc3..8120a2f 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
@@ -30,6 +30,7 @@
 #include "builder.h"
 #include "common/rdtsc_buckets.h"
 
+#include <cstdarg>
 
 namespace SwrJit
 {
@@ -1623,4 +1624,4 @@ namespace SwrJit
 }
 }
 
-}
\ No newline at end of file
+}
-- 
2.7.4



Re: [Mesa-dev] [PATCH 3/3] i965: Delete the meta-base CopyImageSubData implementation

2016-12-02 Thread Jason Ekstrand
On Fri, Dec 2, 2016 at 11:19 AM, Anuj Phogat  wrote:

> On Thu, Dec 1, 2016 at 10:35 AM, Jason Ekstrand 
> wrote:
> >
> > When I originally implemented the ARB_copy_image extension, the fast-path
> > was written in meta using texture views.  This path only worked if both
> > images were uncompressed color images.  All of the other cases fell back
> to
> > the blitter or, in the worst case, mapping and memcpy on the CPU.  Now
> that
> > we have the blorp path, it handles all copies ever and the old meta,
> > blitter, and CPU paths are only used on gen5 and below.  The primary
> reason
> > why we needed the meta path (apart from having a slow blitter on later
> > hardware) was to handle multisampling which gen5 and earlier don't
> support
> > anyway.  Since the blitter is reasonably fast on gen5, we can just delete
> > the meta path and get rid of all that terrible code.
> >
> > If we decide that we're ok with just disabling ARB_copy_image on gen5 and
> > earlier (I personally am), then we could get rid of another 300 lines or
> so
> > of semi-hairy code.
> > ---
> >  src/mesa/Makefile.sources|   1 -
> >  src/mesa/drivers/common/meta.h   |  10 -
> >  src/mesa/drivers/common/meta_copy_image.c| 307
> ---
> >  src/mesa/drivers/dri/i965/intel_copy_image.c |  10 -
> >  4 files changed, 328 deletions(-)
> >  delete mode 100644 src/mesa/drivers/common/meta_copy_image.c
> >
> > diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
> > index 410a61a..ee737b0 100644
> > --- a/src/mesa/Makefile.sources
> > +++ b/src/mesa/Makefile.sources
> > @@ -621,7 +621,6 @@ COMMON_DRIVER_FILES =   \
> > drivers/common/driverfuncs.c\
> > drivers/common/driverfuncs.h\
> > drivers/common/meta_blit.c  \
> > -   drivers/common/meta_copy_image.c\
> > drivers/common/meta_generate_mipmap.c   \
> > drivers/common/meta_tex_subimage.c  \
> > drivers/common/meta.c \
> > diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/
> meta.h
> > index a7018f5..0a913e9 100644
> > --- a/src/mesa/drivers/common/meta.h
> > +++ b/src/mesa/drivers/common/meta.h
> > @@ -492,16 +492,6 @@ _mesa_meta_and_swrast_BlitFramebuffer(struct
> gl_context *ctx,
> >GLint dstX1, GLint dstY1,
> >GLbitfield mask, GLenum filter);
> >
> > -bool
> > -_mesa_meta_CopyImageSubData_uncompressed(struct gl_context *ctx,
> > - struct gl_texture_image
> *src_tex_image,
> > - struct gl_renderbuffer
> *src_renderbuffer,
> > - int src_x, int src_y, int
> src_z,
> > - struct gl_texture_image
> *dst_tex_image,
> > - struct gl_renderbuffer
> *dst_renderbuffer,
> > - int dst_x, int dst_y, int
> dst_z,
> > - int src_width, int src_height);
> > -
> >  extern void
> >  _mesa_meta_Clear(struct gl_context *ctx, GLbitfield buffers);
> >
> > diff --git a/src/mesa/drivers/common/meta_copy_image.c
> b/src/mesa/drivers/common/meta_copy_image.c
> > deleted file mode 100644
> > index e1c90a3..000
> > --- a/src/mesa/drivers/common/meta_copy_image.c
> > +++ /dev/null
> > @@ -1,307 +0,0 @@
> > -/*
> > - * Mesa 3-D graphics library
> > - *
> > - * Copyright (C) 2014 Intel Corporation.  All Rights Reserved.
> > - *
> > - * Permission is hereby granted, free of charge, to any person
> obtaining a
> > - * copy of this software and associated documentation files (the
> "Software"),
> > - * to deal in the Software without restriction, including without
> limitation
> > - * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> > - * and/or sell copies of the Software, and to permit persons to whom the
> > - * Software is furnished to do so, subject to the following conditions:
> > - *
> > - * The above copyright notice and this permission notice shall be
> included
> > - * in all copies or substantial portions of the Software.
> > - *
> > - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS
> > - * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> > - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
> SHALL
> > - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> > - * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > - * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > - * OTHER DEALINGS IN THE SOFTWARE.
> > - */
> > -
> > -#include "glheader.h"
> > -#include "context.h"
> > -#include "enums.h"
> > -#include "imports.h"
> > -#include "macros.h"
> > -#include 

Re: [Mesa-dev] [PATCH v2] swr: Fix type to match parameters of std::max()

2016-12-02 Thread Rowley, Timothy O
Should have parens on the zsbuf test line to match your corresponding change 
for cbuf attachments.

With that change, Reviewed-by: Tim Rowley
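
For context on why the patch quoted below changes `layers` to `unsigned`: `std::max` is a function template taking two arguments of the same type, so mixing `int` and `unsigned` makes template argument deduction fail. A minimal sketch, with a made-up `surface_range` struct standing in for the `u.tex` layer fields:

```cpp
#include <algorithm>

// Hypothetical stand-in for the first_layer/last_layer pair of a surface.
struct surface_range {
   unsigned first_layer;
   unsigned last_layer;
};

// Running maximum of the layer count. Both std::max arguments are unsigned,
// so deduction succeeds; with 'int layers' the same call would not compile.
static unsigned max_layers(unsigned layers, const surface_range &s)
{
   return std::max(layers, s.last_layer - s.first_layer + 1u);
}

// True when no layer beyond relative index 'i' remains in the surface.
// Written as last <= first + i, the rearranged form used in the patch.
static bool out_of_layers(const surface_range &s, unsigned i)
{
   return s.last_layer <= s.first_layer + i;
}
```

The explicit parentheses around the comparison are only a readability choice here; they do not change the result.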

On Dec 2, 2016, at 1:18 PM, George Kyriazis wrote:

Include propagation of comparisons further down.
---
src/gallium/drivers/swr/swr_clear.cpp | 14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_clear.cpp 
b/src/gallium/drivers/swr/swr_clear.cpp
index f59179f..08eead8 100644
--- a/src/gallium/drivers/swr/swr_clear.cpp
+++ b/src/gallium/drivers/swr/swr_clear.cpp
@@ -35,7 +35,7 @@ swr_clear(struct pipe_context *pipe,
   struct pipe_framebuffer_state *fb = &ctx->framebuffer;

   UINT clearMask = 0;
-   int layers = 0;
+   unsigned layers = 0;

   if (!swr_check_render_cond(pipe))
  return;
@@ -47,20 +47,20 @@ swr_clear(struct pipe_context *pipe,
 if (fb->cbufs[i] && (buffers & (PIPE_CLEAR_COLOR0 << i))) {
clearMask |= (SWR_ATTACHMENT_COLOR0_BIT << i);
layers = std::max(layers, fb->cbufs[i]->u.tex.last_layer -
-  fb->cbufs[i]->u.tex.first_layer + 1);
+  fb->cbufs[i]->u.tex.first_layer + 1u);
 }
   }

   if (buffers & PIPE_CLEAR_DEPTH && fb->zsbuf) {
  clearMask |= SWR_ATTACHMENT_DEPTH_BIT;
  layers = std::max(layers, fb->zsbuf->u.tex.last_layer -
-fb->zsbuf->u.tex.first_layer + 1);
+fb->zsbuf->u.tex.first_layer + 1u);
   }

   if (buffers & PIPE_CLEAR_STENCIL && fb->zsbuf) {
  clearMask |= SWR_ATTACHMENT_STENCIL_BIT;
  layers = std::max(layers, fb->zsbuf->u.tex.last_layer -
-fb->zsbuf->u.tex.first_layer + 1);
+fb->zsbuf->u.tex.first_layer + 1u);
   }

#if 0 // XXX HACK, override clear color alpha. On ubuntu, clears are
@@ -68,7 +68,7 @@ swr_clear(struct pipe_context *pipe,
   ((union pipe_color_union *)color)->f[3] = 1.0; /* cast off your const'd-ness 
*/
#endif

-   for (int i = 0; i < layers; ++i) {
+   for (unsigned i = 0; i < layers; ++i) {
  swr_update_draw_context(ctx);
  SwrClearRenderTarget(ctx->swrContext, clearMask, i,
   color->f, depth, stencil,
@@ -76,11 +76,11 @@ swr_clear(struct pipe_context *pipe,

  // Mask out the attachments that are out of layers.
  if (fb->zsbuf &&
-  fb->zsbuf->u.tex.last_layer - fb->zsbuf->u.tex.first_layer <= i)
+  fb->zsbuf->u.tex.last_layer <= fb->zsbuf->u.tex.first_layer + i)
 clearMask &= ~(SWR_ATTACHMENT_DEPTH_BIT | SWR_ATTACHMENT_STENCIL_BIT);
  for (unsigned c = 0; c < fb->nr_cbufs; ++c) {
 const struct pipe_surface *sf = fb->cbufs[c];
- if (sf && sf->u.tex.last_layer - sf->u.tex.first_layer <= i)
+ if (sf && (sf->u.tex.last_layer <= sf->u.tex.first_layer + i))
clearMask &= ~(SWR_ATTACHMENT_COLOR0_BIT << c);
  }
   }
--
2.10.0.windows.1



Re: [Mesa-dev] [PATCH 3/3] i965: Delete the meta-base CopyImageSubData implementation

2016-12-02 Thread Anuj Phogat
On Thu, Dec 1, 2016 at 10:35 AM, Jason Ekstrand  wrote:
>
> When I originally implemented the ARB_copy_image extension, the fast-path
> was written in meta using texture views.  This path only worked if both
> images were uncompressed color images.  All of the other cases fell back to
> the blitter or, in the worst case, mapping and memcpy on the CPU.  Now that
> we have the blorp path, it handles all copies ever and the old meta,
> blitter, and CPU paths are only used on gen5 and below.  The primary reason
> why we needed the meta path (apart from having a slow blitter on later
> hardware) was to handle multisampling which gen5 and earlier don't support
> anyway.  Since the blitter is reasonably fast on gen5, we can just delete
> the meta path and get rid of all that terrible code.
>
> If we decide that we're ok with just disabling ARB_copy_image on gen5 and
> earlier (I personally am), then we could get rid of another 300 lines or so
> of semi-hairy code.
> ---
>  src/mesa/Makefile.sources|   1 -
>  src/mesa/drivers/common/meta.h   |  10 -
>  src/mesa/drivers/common/meta_copy_image.c| 307 
> ---
>  src/mesa/drivers/dri/i965/intel_copy_image.c |  10 -
>  4 files changed, 328 deletions(-)
>  delete mode 100644 src/mesa/drivers/common/meta_copy_image.c
>
> diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
> index 410a61a..ee737b0 100644
> --- a/src/mesa/Makefile.sources
> +++ b/src/mesa/Makefile.sources
> @@ -621,7 +621,6 @@ COMMON_DRIVER_FILES =   \
> drivers/common/driverfuncs.c\
> drivers/common/driverfuncs.h\
> drivers/common/meta_blit.c  \
> -   drivers/common/meta_copy_image.c\
> drivers/common/meta_generate_mipmap.c   \
> drivers/common/meta_tex_subimage.c  \
> drivers/common/meta.c \
> diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h
> index a7018f5..0a913e9 100644
> --- a/src/mesa/drivers/common/meta.h
> +++ b/src/mesa/drivers/common/meta.h
> @@ -492,16 +492,6 @@ _mesa_meta_and_swrast_BlitFramebuffer(struct gl_context 
> *ctx,
>GLint dstX1, GLint dstY1,
>GLbitfield mask, GLenum filter);
>
> -bool
> -_mesa_meta_CopyImageSubData_uncompressed(struct gl_context *ctx,
> - struct gl_texture_image 
> *src_tex_image,
> - struct gl_renderbuffer 
> *src_renderbuffer,
> - int src_x, int src_y, int src_z,
> - struct gl_texture_image 
> *dst_tex_image,
> - struct gl_renderbuffer 
> *dst_renderbuffer,
> - int dst_x, int dst_y, int dst_z,
> - int src_width, int src_height);
> -
>  extern void
>  _mesa_meta_Clear(struct gl_context *ctx, GLbitfield buffers);
>
> diff --git a/src/mesa/drivers/common/meta_copy_image.c 
> b/src/mesa/drivers/common/meta_copy_image.c
> deleted file mode 100644
> index e1c90a3..000
> --- a/src/mesa/drivers/common/meta_copy_image.c
> +++ /dev/null
> @@ -1,307 +0,0 @@
> -/*
> - * Mesa 3-D graphics library
> - *
> - * Copyright (C) 2014 Intel Corporation.  All Rights Reserved.
> - *
> - * Permission is hereby granted, free of charge, to any person obtaining a
> - * copy of this software and associated documentation files (the "Software"),
> - * to deal in the Software without restriction, including without limitation
> - * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> - * and/or sell copies of the Software, and to permit persons to whom the
> - * Software is furnished to do so, subject to the following conditions:
> - *
> - * The above copyright notice and this permission notice shall be included
> - * in all copies or substantial portions of the Software.
> - *
> - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> - * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
> MERCHANTABILITY,
> - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> - * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
> - * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> - * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> - * OTHER DEALINGS IN THE SOFTWARE.
> - */
> -
> -#include "glheader.h"
> -#include "context.h"
> -#include "enums.h"
> -#include "imports.h"
> -#include "macros.h"
> -#include "teximage.h"
> -#include "texobj.h"
> -#include "fbobject.h"
> -#include "framebuffer.h"
> -#include "buffers.h"
> -#include "state.h"
> -#include "mtypes.h"
> -#include "meta.h"
> -
> -/**
> - * Create a texture image that wraps a renderbuffer.
> - */
> -static struct gl_texture_image *
> 

[Mesa-dev] [PATCH v2] swr: Fix type to match parameters of std::max()

2016-12-02 Thread George Kyriazis
Include propagation of comparisons further down.
---
 src/gallium/drivers/swr/swr_clear.cpp | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_clear.cpp 
b/src/gallium/drivers/swr/swr_clear.cpp
index f59179f..08eead8 100644
--- a/src/gallium/drivers/swr/swr_clear.cpp
+++ b/src/gallium/drivers/swr/swr_clear.cpp
@@ -35,7 +35,7 @@ swr_clear(struct pipe_context *pipe,
struct pipe_framebuffer_state *fb = &ctx->framebuffer;
 
UINT clearMask = 0;
-   int layers = 0;
+   unsigned layers = 0;
 
if (!swr_check_render_cond(pipe))
   return;
@@ -47,20 +47,20 @@ swr_clear(struct pipe_context *pipe,
  if (fb->cbufs[i] && (buffers & (PIPE_CLEAR_COLOR0 << i))) {
 clearMask |= (SWR_ATTACHMENT_COLOR0_BIT << i);
 layers = std::max(layers, fb->cbufs[i]->u.tex.last_layer -
-  fb->cbufs[i]->u.tex.first_layer + 1);
+  fb->cbufs[i]->u.tex.first_layer + 1u);
  }
}
 
if (buffers & PIPE_CLEAR_DEPTH && fb->zsbuf) {
   clearMask |= SWR_ATTACHMENT_DEPTH_BIT;
   layers = std::max(layers, fb->zsbuf->u.tex.last_layer -
-fb->zsbuf->u.tex.first_layer + 1);
+fb->zsbuf->u.tex.first_layer + 1u);
}
 
if (buffers & PIPE_CLEAR_STENCIL && fb->zsbuf) {
   clearMask |= SWR_ATTACHMENT_STENCIL_BIT;
   layers = std::max(layers, fb->zsbuf->u.tex.last_layer -
-fb->zsbuf->u.tex.first_layer + 1);
+fb->zsbuf->u.tex.first_layer + 1u);
}
 
 #if 0 // XXX HACK, override clear color alpha. On ubuntu, clears are
@@ -68,7 +68,7 @@ swr_clear(struct pipe_context *pipe,
((union pipe_color_union *)color)->f[3] = 1.0; /* cast off your 
const'd-ness */
 #endif
 
-   for (int i = 0; i < layers; ++i) {
+   for (unsigned i = 0; i < layers; ++i) {
   swr_update_draw_context(ctx);
   SwrClearRenderTarget(ctx->swrContext, clearMask, i,
color->f, depth, stencil,
@@ -76,11 +76,11 @@ swr_clear(struct pipe_context *pipe,
 
   // Mask out the attachments that are out of layers.
   if (fb->zsbuf &&
-  fb->zsbuf->u.tex.last_layer - fb->zsbuf->u.tex.first_layer <= i)
+  fb->zsbuf->u.tex.last_layer <= fb->zsbuf->u.tex.first_layer + i)
  clearMask &= ~(SWR_ATTACHMENT_DEPTH_BIT | SWR_ATTACHMENT_STENCIL_BIT);
   for (unsigned c = 0; c < fb->nr_cbufs; ++c) {
  const struct pipe_surface *sf = fb->cbufs[c];
- if (sf && sf->u.tex.last_layer - sf->u.tex.first_layer <= i)
+ if (sf && (sf->u.tex.last_layer <= sf->u.tex.first_layer + i))
 clearMask &= ~(SWR_ATTACHMENT_COLOR0_BIT << c);
   }
}
-- 
2.10.0.windows.1



Re: [Mesa-dev] [PATCH 02/23] tgsi: add Stream{X, Y, Z, W} fields to tgsi_declaration_semantic

2016-12-02 Thread Roland Scheidegger
On 02.12.2016 18:23, Nicolai Hähnle wrote:
> On 30.11.2016 21:37, Roland Scheidegger wrote:
>> On 30.11.2016 20:19, Nicolai Hähnle wrote:
>>> On 30.11.2016 19:06, Roland Scheidegger wrote:
>>>> On 30.11.2016 14:35, Nicolai Hähnle wrote:
>>>>> From: Nicolai Hähnle
>>>>>
>>>>> This is for geometry shader outputs. Without it, drivers have no
>>>>> way of
>>>>> knowing which stream each output is intended for, and have to
>>>>> conservatively write all outputs to all streams.
>>>>>
>>>>> Separate stream numbers for each component are required due to output
>>>>> packing.
>>>> Are you sure this is true?
>>>> This is an area I don't know much about, but
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.opengl.org_wiki_Layout-5FQualifier-5F-28GLSL-29=DgIDaQ=uilaK90D4TOVoH58JNXRgQ=_QIjpv-UJ77xEQY8fIYoQtr5qv8wKrPJc7v7_-CYAb0=fVpTGTYN2KTEhU17RpFTxEULrsIfC3bdpEin0k8NIYE=uamnHj-9Xr12ctr0gHDfCMIMHq8DyUBtKIwHQQpjDLs=
>>>>
>>>> tells me "Stream
>>>> assignments for a geometry shader are required to be the same for all
>>>> members of a block, but offsets are not."
>>>>
>>>> Therefore I don't think output packing should ever happen across
>>>> multiple streams. I think it would be MUCH nicer if the semantic needed
>>>> just one stream member...
>>>
>>> There are two variants of that question, I guess.
>>>
>>> The answer to the first variant is: Yes, this is currently true.
>>> lower_packed_varyings will happily pack outputs from different vertex
>>> streams into the same vec4. This affects quite a lot of programs, e.g.
>>> you see it in piglit arb_gpu_shader5-xfb-streams.
>>>
>>> The second question is: Do we want it to be true? I agree that it would
>>> be convenient to be able to use a single Stream member. Also, isolating
>>> the stream0 components from the rest would lead to slightly more
>>> efficient shaders for us in some cases.
>>>
>>> I opted against it so far because I didn't want to think through the
>>> implications of changing lower_packed_varyings. The main question I have
>>> is: if you account for the size of the GS output in # of components,
>>> then it could happen that the number of output vec4s ends up being
>>> larger than (max # of output components) / 4. Will that be a problem
>>> somewhere?
>>
>> I don't know if that would be a problem, but if it is I'd assume this
>> would be fixable (since the number of actual components ultimately
>> doesn't change).
>> Having outputs belonging to multiple streams in a single output just
>> seems weird...
>> That said, I wonder if it actually would be possible to do that with
>> d3d11 too.
>> With shader model 5 you'd have:
>> dcl_stream 0
>> dcl_output o0.xy
>> dcl_stream 1
>> dcl_output o0.zw // legal or not???
>>
>> Though the shader model 4/5 rules are a bit weird for packing
>> inputs/outputs, I'm not even sure two dcl_output are legal for the same
>> reg without a dcl_stream in between them (but you can pack system values
>> together with ordinary inputs/outputs).
>>
>> So maybe just allowing this is the right solution...
> 
> I played around with the DX shader compiler, and I have some annoying
> news. SM5 actually uses not just the same output register but even the
> same component for multiple streams -- see the output I've pasted at the
> end.
> 
> So how to proceed? To simplify things going forward, I'm mostly
> convinced that the GLSL output packing should be changed to pack outputs
> by stream. As I mentioned previously, this has other minor advantages
> for us anyway.
> 
> Then one possibility to accommodate SM5 would be to have a Stream
> bitmask, one bit per stream, as part of the output semantics. The
> downside of this is that I wanted to use the WriteMask as an additional
> optimization to avoid writing out unused components, and you'd then need
> separate WriteMasks for each stream.
> 
> The other possibility, which I prefer, would be to have just a single
> Stream field indicating one stream number per output register, and
> aliasing is just not allowed despite what SM5 wants.
> 
> TGSI -> SM5 conversion is trivial.
> 
> SM5 -> TGSI conversion is also possible despite the aliasing on the DX
> side, because the doc says this about emit_stream: "Af[t]er the emit,
> all data in all output registers for all streams become uninitialized,
> not just the stream emitted to."
Oh that's pretty interesting, since emit didn't have that part about
outputs becoming uninitialized. Maybe that's just what was needed to
keep implementations sane when allowing the crazy "same output multiple
stream" stuff... Or I suppose it's not actually that crazy then...
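
To make the trade-off concrete, here is a sketch of per-component stream fields and of the check that a single per-register Stream field would rely on. The struct is a simplified stand-in invented for this example, not the actual `tgsi_declaration_semantic` layout; it assumes four vertex streams, hence 2 bits per component.

```cpp
// Simplified, hypothetical stand-in: one 2-bit stream number per component.
struct output_streams {
   unsigned stream_x : 2;
   unsigned stream_y : 2;
   unsigned stream_z : 2;
   unsigned stream_w : 2;
};

// Stream number of component c (0 = x ... 3 = w).
static unsigned component_stream(const output_streams &s, unsigned c)
{
   switch (c) {
   case 0: return s.stream_x;
   case 1: return s.stream_y;
   case 2: return s.stream_z;
   default: return s.stream_w;
   }
}

// Returns true if every component selected by 'writemask' (bit 0 = x)
// belongs to the same stream, i.e. the register could be described by a
// single Stream field. '*stream' receives that stream when true.
static bool single_stream(const output_streams &s, unsigned writemask,
                          unsigned *stream)
{
   bool found = false;
   unsigned first = 0;
   for (unsigned c = 0; c < 4; ++c) {
      if (!(writemask & (1u << c)))
         continue;
      const unsigned cs = component_stream(s, c);
      if (!found) {
         first = cs;
         found = true;
      } else if (cs != first) {
         return false;
      }
   }
   *stream = first;
   return true;
}
```

If output packing were changed to pack by stream, a check like this would presumably hold for every register, and the four fields could collapse into one.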


> (https://urldefense.proofpoint.com/v2/url?u=https-3A__msdn.microsoft.com_en-2Dus_library_windows_desktop_hh447051-28v-3Dvs.85-29.aspx=DgIDaQ=uilaK90D4TOVoH58JNXRgQ=_QIjpv-UJ77xEQY8fIYoQtr5qv8wKrPJc7v7_-CYAb0=EBMBRMVpTcLbno2cH7eaI5WJW9VY3tec7RBNULl1btw=HJ2sRJpROX7JfDvjHycEwHAx6YzJa8RUa1biVttH-zM=
> ). So you have to look-ahead to the next emit_stream for 

Re: [Mesa-dev] [PATCH 2/3] st/va: remove unused variable pbuff

2016-12-02 Thread tournier.elie
Reviewed-by: Elie Tournier 


2016-12-02 17:26 GMT+01:00 Emil Velikov :

> From: Emil Velikov 
>
> Signed-off-by: Emil Velikov 
> ---
>  src/gallium/state_trackers/va/surface.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/va/surface.c
> b/src/gallium/state_trackers/va/surface.c
> index f8513d9..357e85e 100644
> --- a/src/gallium/state_trackers/va/surface.c
> +++ b/src/gallium/state_trackers/va/surface.c
> @@ -94,7 +94,6 @@ vlVaSyncSurface(VADriverContextP ctx, VASurfaceID
> render_target)
> vlVaDriver *drv;
> vlVaContext *context;
> vlVaSurface *surf;
> -   void *pbuff;
>
> if (!ctx)
>return VA_STATUS_ERROR_INVALID_CONTEXT;
> --
> 2.10.2
>


Re: [Mesa-dev] [PATCH 6/6] anv: do not open random render node(s)

2016-12-02 Thread Emil Velikov
On 2 December 2016 at 17:33, Eric Engestrom  wrote:
> On Friday, 2016-12-02 16:31:49 +, Emil Velikov wrote:
>> From: Emil Velikov 
>>
>> drmGetDevices2() provides us with enough flexibility to build heuristics
>> upon. Opening a random node on the other hand will wake up the device,
>> regardless if it's the one we're intereseted or not.
>
> "interested"
> (same in the previous patch)
>
>>
>> Cc: Jason Ekstrand 
>> Signed-off-by: Emil Velikov 
>> ---
>>  src/intel/vulkan/Makefile.am  |  3 ++-
>>  src/intel/vulkan/anv_device.c | 53 
>> +++
>>  2 files changed, 40 insertions(+), 16 deletions(-)
>>
>> diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
>> index df7645f..e309491 100644
>> --- a/src/intel/vulkan/Makefile.am
>> +++ b/src/intel/vulkan/Makefile.am
>> @@ -66,7 +66,7 @@ AM_CPPFLAGS += \
>>  endif
>>
>>  AM_CPPFLAGS += \
>> - $(INTEL_CFLAGS) \
>> + $(LIBDRM_CFLAGS) \
>>   $(VALGRIND_CFLAGS) \
>>   $(DEFINES)
>>
>> @@ -131,6 +131,7 @@ VULKAN_LIB_DEPS += \
>>   $(top_builddir)/src/intel/isl/libisl.la \
>>   $(top_builddir)/src/intel/blorp/libblorp.la \
>>   $(PER_GEN_LIBS) \
>> + $(LIBDRM_LIBS) \
>>   $(PTHREAD_LIBS) \
>>   $(DLOPEN_LIBS) \
>>   -lm
>
> Unrelated -> separate patch?
>
It's related: libvulkan_intel.so does not pull in libdrm.so as a dependency
(up to this patch). As we use it below, we ought to add the link here.
Otherwise we'll end up with intermittent breakage.

Note: libvulkan_radeon.so already depends on libdrm, which is why the
hunk is missing from that patch.

>> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
>> index d594df7..9927ac2 100644
>> --- a/src/intel/vulkan/anv_device.c
>> +++ b/src/intel/vulkan/anv_device.c
>> @@ -29,6 +29,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include <xf86drm.h>
>>
>>  #include "anv_private.h"
>>  #include "util/strtod.h"
>> @@ -375,6 +376,40 @@ void anv_DestroyInstance(
>> vk_free(&instance->alloc, instance);
>>  }
>>
>> +static VkResult
>> +anv_enumerate_devices(struct anv_instance *instance)
>> +{
>> +   /* TODO: Check for more devices ? */
>> +   drmDevicePtr devices[8];
>> +   VkResult result = VK_SUCCESS;
>> +   int max_devices;
>> +
>> +   max_devices = drmGetDevices2(0, devices, sizeof(devices));
>> +   if (max_devices < 1)
>> +  return VK_ERROR_INCOMPATIBLE_DRIVER;
>> +
>> +   for (unsigned i = 0; i < (unsigned)max_devices; i++) {
>> +  if (devices[i]->available_nodes & 1 << DRM_NODE_RENDER &&
>> +  devices[i]->bustype == DRM_BUS_PCI &&
>> +  devices[i]->deviceinfo.pci->vendor_id == 0x8086) {
>
> Yay, magic values!
> I feel like we should replace all those with PCI_VENDOR_INTEL or something.
>
We have another three instances of these for each Vulkan driver.
Barring any objections I'll do that as cleanup on top ?
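
One possible shape for that cleanup, sketched with local stand-in types so that it builds without libdrm. `PCI_VENDOR_INTEL` is the name suggested above; the `demo_` types and constants are invented for this example and only mirror the fields used in the patch. Their enum values are illustrative, and real code would take the definitions from `xf86drm.h`.

```cpp
#include <cstdint>

// Named constant instead of a bare 0x8086 in the condition.
static const std::uint16_t PCI_VENDOR_INTEL = 0x8086;

// Illustrative stand-ins; real code would use the libdrm definitions.
enum demo_node { DEMO_NODE_PRIMARY = 0, DEMO_NODE_CONTROL = 1, DEMO_NODE_RENDER = 2 };
enum demo_bus { DEMO_BUS_PCI = 0, DEMO_BUS_OTHER = 1 };

struct demo_device {
   int available_nodes;      // bitmask, one bit per node type
   int bustype;
   std::uint16_t vendor_id;
};

// Same test as in the patch: has a render node, sits on PCI, is an Intel part.
static bool is_intel_render_device(const demo_device &dev)
{
   return (dev.available_nodes & (1 << DEMO_NODE_RENDER)) &&
          dev.bustype == DEMO_BUS_PCI &&
          dev.vendor_id == PCI_VENDOR_INTEL;
}
```

Pulling the condition into a helper like this would also let each Vulkan driver share the check and differ only in the vendor constant.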

Thanks
Emil


Re: [Mesa-dev] [PATCH 02/27] gbm: Fix width height getters return type (trivial)

2016-12-02 Thread Daniel Stone
Hi Ben,

On 2 December 2016 at 18:17, Ben Widawsky  wrote:
> On 16-12-02 18:07:22, Daniel Stone wrote:
>> On 2 December 2016 at 17:56, Eric Engestrom  
>> wrote:
>> I have to admit I didn't catch this one. It doesn't help on 64-bit
>> since unsigned int is still 32-bit there, and in any case it's library
>> ABI, so if it doesn't change anything then it doesn't help, and if it
>> does then it's an ABI break, so NAK from me.
>
> It was like the patch says, meant to match the definition of the
> implementation.
> The exported symbol is defined as unsigned int. It had nothing to do with
> 64-bit.
>
> GBM_EXPORT unsigned int
> gbm_bo_get_height(struct gbm_bo *bo)
>
> I'd say they should match, and both can be uint32_t. I don't care much
> either
> way.

Right. Given that we have multiple implementations of libgbm in the
wild, I'd be much more comfortable saying that the existing gbm.h is
canonical, and changing the Mesa implementation to match.

Cheers,
Daniel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] st/va: automake: cleanup C{PP,}FLAGS

2016-12-02 Thread Christian König

On 02.12.2016 17:26, Emil Velikov wrote:

From: Emil Velikov 

Remove some transitional left overs from the gallium pipe-loader rework
and kill off unneeded AM_CPPFLAGS.

Signed-off-by: Emil Velikov 


Reviewed-by: Christian König


---
  src/gallium/state_trackers/va/Makefile.am | 12 
  1 file changed, 12 deletions(-)

diff --git a/src/gallium/state_trackers/va/Makefile.am 
b/src/gallium/state_trackers/va/Makefile.am
index 348cfe1..a70eede5 100644
--- a/src/gallium/state_trackers/va/Makefile.am
+++ b/src/gallium/state_trackers/va/Makefile.am
@@ -30,18 +30,6 @@ AM_CFLAGS = \
$(VA_CFLAGS) \
-DVA_DRIVER_INIT_FUNC="__vaDriverInit_$(VA_MAJOR)_$(VA_MINOR)"
  
-AM_CFLAGS += \

-   $(GALLIUM_PIPE_LOADER_DEFINES) \
-   -DPIPE_SEARCH_DIR=\"$(libdir)/gallium-pipe\"
-
-if HAVE_GALLIUM_STATIC_TARGETS
-AM_CFLAGS += \
-   -DGALLIUM_STATIC_TARGETS=1
-endif
-
-AM_CPPFLAGS = \
-   -I$(top_srcdir)/include
-
  noinst_LTLIBRARIES = libvatracker.la
  
  libvatracker_la_SOURCES = $(C_SOURCES)



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/27] gbm: Fix width height getters return type (trivial)

2016-12-02 Thread Ben Widawsky

On 16-12-02 18:07:22, Daniel Stone wrote:

Hi,

On 2 December 2016 at 17:56, Eric Engestrom  wrote:

On Thursday, 2016-12-01 14:09:43 -0800, Ben Widawsky wrote:

--- a/src/gbm/main/gbm.h
+++ b/src/gbm/main/gbm.h
@@ -294,10 +294,10 @@ gbm_bo_map(struct gbm_bo *bo,
 void
 gbm_bo_unmap(struct gbm_bo *bo, void *map_data);

-uint32_t
+unsigned int
 gbm_bo_get_width(struct gbm_bo *bo);

-uint32_t
+unsigned int
 gbm_bo_get_height(struct gbm_bo *bo);


I'm not sure I understand this change. Why would you want to remove the
information of the type size? If the point is to increase it on 64-bit
machines, I'd go with an explicit `uint64_t` instead.


I have to admit I didn't catch this one. It doesn't help on 64-bit
since unsigned int is still 32-bit there, and in any case it's library
ABI, so if it doesn't change anything then it doesn't help, and if it
does then it's an ABI break, so NAK from me.

Cheers,
Daniel


As the patch says, it was meant to match the definition in the implementation.
The exported symbol is defined as unsigned int. It had nothing to do with
64-bit.

GBM_EXPORT unsigned int
gbm_bo_get_height(struct gbm_bo *bo)

I'd say they should match, and both can be uint32_t. I don't care much either
way.
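
To make the size argument above concrete, here is a minimal C sketch. It assumes a C11 compiler; `struct demo_bo` and `demo_bo_get_width` are hypothetical stand-ins, not the real gbm types or entry points. It checks at compile time that `unsigned int` is 32 bits wide on the build target, and keeps the declaration and the definition on the same fixed-width type.

```c
#include <limits.h>
#include <stdint.h>

/* Compile-time check: on this target unsigned int is 32 bits wide, so it
 * has the same width as uint32_t. The build fails where that is not true. */
_Static_assert(sizeof(unsigned int) * CHAR_BIT == 32,
               "unsigned int is not 32 bits on this target");

/* Hypothetical stand-in for a buffer object; not the real struct gbm_bo. */
struct demo_bo {
   uint32_t width;
   uint32_t height;
};

/* Declaration, as it would appear in a public header. */
uint32_t demo_bo_get_width(const struct demo_bo *bo);

/* Definition uses the same fixed-width return type as the declaration. */
uint32_t
demo_bo_get_width(const struct demo_bo *bo)
{
   return bo->width;
}
```

With both sides spelled `uint32_t`, the header and the exported symbol cannot drift apart, and the width stays explicit for readers of the header.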


Re: [Mesa-dev] [PATCH 02/27] gbm: Fix width height getters return type (trivial)

2016-12-02 Thread Daniel Stone
Hi,

On 2 December 2016 at 17:56, Eric Engestrom  wrote:
> On Thursday, 2016-12-01 14:09:43 -0800, Ben Widawsky wrote:
>> --- a/src/gbm/main/gbm.h
>> +++ b/src/gbm/main/gbm.h
>> @@ -294,10 +294,10 @@ gbm_bo_map(struct gbm_bo *bo,
>>  void
>>  gbm_bo_unmap(struct gbm_bo *bo, void *map_data);
>>
>> -uint32_t
>> +unsigned int
>>  gbm_bo_get_width(struct gbm_bo *bo);
>>
>> -uint32_t
>> +unsigned int
>>  gbm_bo_get_height(struct gbm_bo *bo);
>
> I'm not sure I understand this change. Why would you want to remove the
> information of the type size? If the point is to increase it on 64-bit
> machines, I'd go with an explicit `uint64_t` instead.

I have to admit I didn't catch this one. It doesn't help on 64-bit
since unsigned int is still 32-bit there, and in any case it's library
ABI, so if it doesn't change anything then it doesn't help, and if it
does then it's an ABI break, so NAK from me.

Cheers,
Daniel


Re: [Mesa-dev] [PATCH 08/27] dri: Add an image creation with modifiers

2016-12-02 Thread Eric Engestrom
On Thursday, 2016-12-01 14:09:49 -0800, Ben Widawsky wrote:
> From: Ben Widawsky 
> 
> Modifiers will be obtained or guessed by the client and passed in during
> image creation/import.
> 
> This requires bumping the DRIimage version.
> 
> Signed-off-by: Ben Widawsky 
> ---
>  include/GL/internal/dri_interface.h  | 28 +++-
>  src/gallium/state_trackers/dri/dri2.c|  1 +
>  src/mesa/drivers/dri/i965/intel_screen.c | 26 +-
>  3 files changed, 53 insertions(+), 2 deletions(-)
> 
> diff --git a/include/GL/internal/dri_interface.h 
> b/include/GL/internal/dri_interface.h
> index d0b1bc6..657e158 100644
> --- a/include/GL/internal/dri_interface.h
> +++ b/include/GL/internal/dri_interface.h
> @@ -1094,7 +1094,7 @@ struct __DRIdri2ExtensionRec {
>   * extensions.
>   */
>  #define __DRI_IMAGE "DRI_IMAGE"
> -#define __DRI_IMAGE_VERSION 13
> +#define __DRI_IMAGE_VERSION 14
>  
>  /**
>   * These formats correspond to the similarly named MESA_FORMAT_*
> @@ -1209,6 +1209,8 @@ struct __DRIdri2ExtensionRec {
>  #define __DRI_IMAGE_ATTRIB_NUM_PLANES   0x2009 /* available in versions 11 */
>  
>  #define __DRI_IMAGE_ATTRIB_OFFSET 0x200A /* available in versions 13 */
> +#define __DRI_IMAGE_ATTRIB_MODIFIER_LOWER 0x200B /* available in versions 14 
> */
> +#define __DRI_IMAGE_ATTRIB_MODIFIER_UPPER 0x200C /* available in versions 14 
> */
>  
>  enum __DRIYUVColorSpace {
> __DRI_YUV_COLOR_SPACE_UNDEFINED = 0,
> @@ -1420,6 +1422,30 @@ struct __DRIimageExtensionRec {
>  */
> void (*unmapImage)(__DRIcontext *context, __DRIimage *image, void *data);
>  
> +
> +   /**
> +* Creates an image with implementations favorite modifiers.
> +*
> +* This acts like createImage except there is a list of modifiers passed 
> in
> +* which the implementation may selectively use to create the DRIimage. 
> The
> +* result should be the implementation selects one modifier (perhaps it 
> would
> +* hold on to a few and later pick).
> +*
> +* The created image should be destroyed with destroyImage().
> +*
> +* Returns the new DRIimage. The chosen modifier can be obtained later on
> +* through some API visible functionality if required.
> +*
> +* \sa __DRIimageRec::createImage
> +*
> +* \since 14
> +*/
> +   __DRIimage *(*createImageWithModifiers)(__DRIscreen *screen,
> +   int width, int height, int format,
> +   unsigned int use,
> +   const uint64_t *modifiers,
> +   const unsigned int modifier_count,
> +   void *loaderPrivate);
>  };
>  
>  
> diff --git a/src/gallium/state_trackers/dri/dri2.c 
> b/src/gallium/state_trackers/dri/dri2.c
> index 9ec069b..c9fbe84 100644
> --- a/src/gallium/state_trackers/dri/dri2.c
> +++ b/src/gallium/state_trackers/dri/dri2.c
> @@ -1409,6 +1409,7 @@ static __DRIimageExtension dri2ImageExtension = {
>  .getCapabilities  = dri2_get_capabilities,
>  .mapImage = dri2_map_image,
>  .unmapImage   = dri2_unmap_image,
> +.createImageWithModifiers = NULL,
>  };
>  
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index 5808bde..b5bb4a0 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -538,9 +538,11 @@ intel_destroy_image(__DRIimage *image)
>  }
>  
>  static __DRIimage *
> -intel_create_image(__DRIscreen *dri_screen,
> +__intel_create_image(__DRIscreen *dri_screen,
>  int width, int height, int format,
>  unsigned int use,
> +   const uint64_t *modifiers,
> +   unsigned count,
>  void *loaderPrivate)
>  {
> __DRIimage *image;
> @@ -578,6 +580,27 @@ intel_create_image(__DRIscreen *dri_screen,
> return image;
>  }
>  
> +static __DRIimage *
> +intel_create_image(__DRIscreen *dri_screen,
> +int width, int height, int format,
> +unsigned int use,
> +void *loaderPrivate)
> +{
> +   return __intel_create_image(dri_screen, width, height, format, use, NULL, 
> 0, loaderPrivate);
> +}
> +
> +static __DRIimage *
> +intel_create_image_with_modifiers(__DRIscreen *dri_screen,
> +  int width, int height, int format,
> +  unsigned int use,
> +  const uint64_t *modifiers,
> +  const unsigned count,
> +  void *loaderPrivate)
> +{
> +   return __intel_create_image(dri_screen, width, height, format, use, NULL, 
> 0,

I think you meant to use `modifiers` and `count` here :P
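
A rough sketch of the forwarding being suggested, in plain C. All names here (`struct demo_image`, `demo_create_image_common` and the two wrappers) are hypothetical and only mirror the shape of the patch; "pick the first modifier" is a placeholder policy for the sketch, not what the driver does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the image; not the real __DRIimage. */
struct demo_image {
   int width, height;
   uint64_t modifier;       /* modifier picked at creation time */
   unsigned modifier_count; /* how many candidates the caller offered */
};

/* Common helper: records the count and picks the first offered modifier,
 * or 0 when none are given (placeholder policy for this sketch). */
static struct demo_image
demo_create_image_common(int width, int height,
                         const uint64_t *modifiers, unsigned count)
{
   struct demo_image img = { width, height, 0, count };

   if (modifiers != NULL && count > 0)
      img.modifier = modifiers[0];
   return img;
}

/* Legacy entry point: no modifiers to offer. */
struct demo_image
demo_create_image(int width, int height)
{
   return demo_create_image_common(width, height, NULL, 0);
}

/* New entry point: forwards the caller's list instead of NULL, 0. */
struct demo_image
demo_create_image_with_modifiers(int width, int height,
                                 const uint64_t *modifiers, unsigned count)
{
   return demo_create_image_common(width, height, modifiers, count);
}
```

The only difference between the two wrappers is the pair of arguments they hand to the common helper, which is the spot the comment above points at.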

If you really want to leave them 

Re: [Mesa-dev] [PATCH 07/27] i965/dri: Store the screen associated with the image

2016-12-02 Thread Eric Engestrom
On Thursday, 2016-12-01 14:09:48 -0800, Ben Widawsky wrote:
> From: Ben Widawsky 
> 
> I intend to need to get to the devinfo structure, and storing the screen
> is an easy way to do that.
> 
> It seems to be the consensus that you cannot share an image between
> multiple screens.
> 
> Scape-goat: Rob Clark 

Do we need to teach git about this new tag? xD

> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/intel_image.h  |  1 +
>  src/mesa/drivers/dri/i965/intel_screen.c | 16 ++--
>  2 files changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_image.h 
> b/src/mesa/drivers/dri/i965/intel_image.h
> index 9b3816e..fd63919 100644
> --- a/src/mesa/drivers/dri/i965/intel_image.h
> +++ b/src/mesa/drivers/dri/i965/intel_image.h
> @@ -65,6 +65,7 @@ struct intel_image_format {
>  };
>  
>  struct __DRIimageRec {
> +   struct intel_screen *screen;
> drm_intel_bo *bo;
> uint32_t pitch; /**< in bytes */
> GLenum internal_format;
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> b/src/mesa/drivers/dri/i965/intel_screen.c
> index e1c3c19..5808bde 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -349,7 +349,8 @@ static boolean intel_lookup_fourcc(int dri_format, int 
> *fourcc)
>  }
>  
>  static __DRIimage *
> -intel_allocate_image(int dri_format, void *loaderPrivate)
> +intel_allocate_image(struct intel_screen *screen, int dri_format,
> + void *loaderPrivate)
>  {
>  __DRIimage *image;
>  
> @@ -357,6 +358,7 @@ intel_allocate_image(int dri_format, void *loaderPrivate)
>  if (image == NULL)
>   return NULL;
>  
> +image->screen = screen;
>  image->dri_format = dri_format;
>  image->offset = 0;
>  
> @@ -407,7 +409,7 @@ intel_create_image_from_name(__DRIscreen *dri_screen,
>  __DRIimage *image;
>  int cpp;
>  
> -image = intel_allocate_image(format, loaderPrivate);
> +image = intel_allocate_image(screen, format, loaderPrivate);
>  if (image == NULL)
> return NULL;
>  
> @@ -557,7 +559,7 @@ intel_create_image(__DRIscreen *dri_screen,
> if (use & __DRI_IMAGE_USE_LINEAR)
>tiling = I915_TILING_NONE;
>  
> -   image = intel_allocate_image(format, loaderPrivate);
> +   image = intel_allocate_image(screen, format, loaderPrivate);
> if (image == NULL)
>return NULL;
>  
> @@ -719,9 +721,11 @@ intel_create_image_from_fds(__DRIscreen *dri_screen,
>return NULL;
>  
> if (f->nplanes == 1)
> -  image = intel_allocate_image(f->planes[0].dri_format, loaderPrivate);
> +  image = intel_allocate_image(screen, f->planes[0].dri_format,
> +   loaderPrivate);
> else
> -  image = intel_allocate_image(__DRI_IMAGE_FORMAT_NONE, loaderPrivate);
> +  image = intel_allocate_image(screen, __DRI_IMAGE_FORMAT_NONE,
> +   loaderPrivate);
>  
> if (image == NULL)
>return NULL;
> @@ -824,7 +828,7 @@ intel_from_planar(__DRIimage *parent, int plane, void 
> *loaderPrivate)
>  offset = parent->offsets[index];
>  stride = parent->strides[index];
>  
> -image = intel_allocate_image(dri_format, loaderPrivate);
> +image = intel_allocate_image(parent->screen, dri_format, loaderPrivate);
>  if (image == NULL)
> return NULL;
>  
> -- 
> 2.10.2
> 


Re: [Mesa-dev] [PATCH 04/27] gbm: Create a gbm_device getter for stride

2016-12-02 Thread Eric Engestrom
On Thursday, 2016-12-01 14:09:45 -0800, Ben Widawsky wrote:
> From: Ben Widawsky 
> 
> This will be used so we can query information per plane.
> 
> Signed-off-by: Ben Widawsky 
> ---
>  src/gbm/backends/dri/gbm_dri.c | 7 +++
>  src/gbm/main/gbm.c | 2 +-
>  src/gbm/main/gbmint.h  | 1 +
>  3 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
> index c61d56b..f3ca228 100644
> --- a/src/gbm/backends/dri/gbm_dri.c
> +++ b/src/gbm/backends/dri/gbm_dri.c
> @@ -622,6 +622,12 @@ gbm_dri_bo_get_planes(struct gbm_bo *_bo)
> return get_number_planes(dri, bo->image);
>  }
>  
> +static uint32_t
> +gbm_dri_bo_get_stride(struct gbm_bo *_bo, int plane)

`unsigned plane`?
Same in the next patches.

There is a very weird mix of sized- and unsized-types in these patches
(see also comment on patch #2); what is the reasoning here?
(For instance, the return type is sized but the input plane isn't.)
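
For illustration, a small C sketch of a per-plane getter with an unsigned plane index. `struct demo_planar_bo`, `DEMO_MAX_PLANES` and `demo_bo_get_stride_for_plane` are hypothetical, and returning 0 for an out-of-range plane is a convention chosen for this sketch only.

```c
#include <stdint.h>

#define DEMO_MAX_PLANES 4   /* hypothetical limit for this sketch */

/* Hypothetical multi-planar buffer object; not the real struct gbm_bo. */
struct demo_planar_bo {
   unsigned num_planes;
   uint32_t strides[DEMO_MAX_PLANES];   /* stride of each plane, in bytes */
};

/* With an unsigned index there is no negative range to reject, so a single
 * upper-bound comparison covers every invalid input. */
uint32_t
demo_bo_get_stride_for_plane(const struct demo_planar_bo *bo, unsigned plane)
{
   if (plane >= bo->num_planes)
      return 0;
   return bo->strides[plane];
}
```

Whether the index should be `unsigned` or a sized type such as `uint32_t` is the open question in the review; the sketch only shows that either choice removes the negative case.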

> +{
> +   return _bo->stride;
> +}
> +
>  static void
>  gbm_dri_bo_destroy(struct gbm_bo *_bo)
>  {
> @@ -1080,6 +1086,7 @@ dri_device_create(int fd)
> dri->base.base.bo_write = gbm_dri_bo_write;
> dri->base.base.bo_get_fd = gbm_dri_bo_get_fd;
> dri->base.base.bo_get_planes = gbm_dri_bo_get_planes;
> +   dri->base.base.bo_get_stride = gbm_dri_bo_get_stride;
> dri->base.base.bo_destroy = gbm_dri_bo_destroy;
> dri->base.base.destroy = dri_destroy;
> dri->base.base.surface_create = gbm_dri_surface_create;
> diff --git a/src/gbm/main/gbm.c b/src/gbm/main/gbm.c
> index b5e0316..14c31ad 100644
> --- a/src/gbm/main/gbm.c
> +++ b/src/gbm/main/gbm.c
> @@ -165,7 +165,7 @@ gbm_bo_get_height(struct gbm_bo *bo)
>  GBM_EXPORT uint32_t
>  gbm_bo_get_stride(struct gbm_bo *bo)
>  {
> -   return bo->stride;
> +   return bo->gbm->bo_get_stride(bo, 0);
>  }
>  
>  /** Get the format of the buffer object
> diff --git a/src/gbm/main/gbmint.h b/src/gbm/main/gbmint.h
> index c6a6701..35d3bcb 100644
> --- a/src/gbm/main/gbmint.h
> +++ b/src/gbm/main/gbmint.h
> @@ -77,6 +77,7 @@ struct gbm_device {
> int (*bo_write)(struct gbm_bo *bo, const void *buf, size_t data);
> int (*bo_get_fd)(struct gbm_bo *bo);
> int (*bo_get_planes)(struct gbm_bo *bo);
> +   uint32_t (*bo_get_stride)(struct gbm_bo *bo, int plane);
> void (*bo_destroy)(struct gbm_bo *bo);
>  
> struct gbm_surface *(*surface_create)(struct gbm_device *gbm,
> -- 
> 2.10.2
> 


Re: [Mesa-dev] [PATCH 03/27] gbm: Export a plane getter function

2016-12-02 Thread Eric Engestrom
On Thursday, 2016-12-01 14:09:44 -0800, Ben Widawsky wrote:
> From: Ben Widawsky 
> 
> This will be used by clients that need to know the number of planes
> allocated for them on behalf of the GL or other API. The best current
> example of this is when an extra "plane" is allocated to store
> compression data for the primary plane.
> 
> Cc: Daniel Stone 
> Signed-off-by: Ben Widawsky 
> ---
>  src/gbm/backends/dri/gbm_dri.c | 25 +
>  src/gbm/gbm-symbols-check  |  1 +
>  src/gbm/main/gbm.c | 10 ++
>  src/gbm/main/gbm.h |  3 +++
>  src/gbm/main/gbmint.h  |  1 +
>  5 files changed, 40 insertions(+)
> 
> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
> index 45cb42a..c61d56b 100644
> --- a/src/gbm/backends/dri/gbm_dri.c
> +++ b/src/gbm/backends/dri/gbm_dri.c
> @@ -598,6 +598,30 @@ gbm_dri_bo_get_fd(struct gbm_bo *_bo)
> return fd;
>  }
>  
> +static int
> +get_number_planes(struct gbm_dri_device *dri, __DRIimage *image)
> +{
> +   int num_planes = 0;
> +   dri->image->queryImage(image, __DRI_IMAGE_ATTRIB_NUM_PLANES, &num_planes);
> +
> +   if (num_planes <= 0)
> +  num_planes = 1;

When __DRI_IMAGE_ATTRIB_NUM_PLANES is invalid, why hide this and
return 1 anyway?
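
One possible alternative, sketched in C under stated assumptions: the query callback is assumed to return a boolean success flag, and `demo_query_fn`, `demo_get_number_planes` and the two example callbacks are hypothetical names. The attribute value 0x2009 is the one shown for `__DRI_IMAGE_ATTRIB_NUM_PLANES` in the dri_interface.h hunk quoted elsewhere in this series.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ATTRIB_NUM_PLANES 0x2009

/* Hypothetical query callback: returns true and fills *value on success. */
typedef bool (*demo_query_fn)(void *image, int attrib, int *value);

/* Returns the plane count, or -1 when the query fails or reports a
 * non-positive count, instead of silently substituting 1. */
int
demo_get_number_planes(demo_query_fn query, void *image)
{
   int num_planes = 0;

   if (!query(image, DEMO_ATTRIB_NUM_PLANES, &num_planes))
      return -1;
   if (num_planes <= 0)
      return -1;
   return num_planes;
}

/* Example callback: an implementation that knows the attribute. */
bool
demo_query_two_planes(void *image, int attrib, int *value)
{
   (void)image;
   if (attrib != DEMO_ATTRIB_NUM_PLANES)
      return false;
   *value = 2;
   return true;
}

/* Example callback: an implementation that does not know the attribute. */
bool
demo_query_unsupported(void *image, int attrib, int *value)
{
   (void)image;
   (void)attrib;
   (void)value;
   return false;
}
```

The caller can then tell "one plane" apart from "the driver could not say", at the cost of having to handle the -1 case.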

> +
> +   return num_planes;
> +}
> +
> +static int
> +gbm_dri_bo_get_planes(struct gbm_bo *_bo)
> +{
> +   struct gbm_dri_device *dri = gbm_dri_device(_bo->gbm);
> +   struct gbm_dri_bo *bo = gbm_dri_bo(_bo);
> +
> +   if (bo->image == NULL)
> +  return -1;
> +
> +   return get_number_planes(dri, bo->image);
> +}
> +
>  static void
>  gbm_dri_bo_destroy(struct gbm_bo *_bo)
>  {
> @@ -1055,6 +1079,7 @@ dri_device_create(int fd)
> dri->base.base.is_format_supported = gbm_dri_is_format_supported;
> dri->base.base.bo_write = gbm_dri_bo_write;
> dri->base.base.bo_get_fd = gbm_dri_bo_get_fd;
> +   dri->base.base.bo_get_planes = gbm_dri_bo_get_planes;
> dri->base.base.bo_destroy = gbm_dri_bo_destroy;
> dri->base.base.destroy = dri_destroy;
> dri->base.base.surface_create = gbm_dri_surface_create;
> diff --git a/src/gbm/gbm-symbols-check b/src/gbm/gbm-symbols-check
> index 5a333ff..8c4da1b 100755
> --- a/src/gbm/gbm-symbols-check
> +++ b/src/gbm/gbm-symbols-check
> @@ -18,6 +18,7 @@ gbm_bo_get_format
>  gbm_bo_get_device
>  gbm_bo_get_handle
>  gbm_bo_get_fd
> +gbm_bo_get_plane_count
>  gbm_bo_write
>  gbm_bo_set_user_data
>  gbm_bo_get_user_data
> diff --git a/src/gbm/main/gbm.c b/src/gbm/main/gbm.c
> index 00113fa..b5e0316 100644
> --- a/src/gbm/main/gbm.c
> +++ b/src/gbm/main/gbm.c
> @@ -223,6 +223,16 @@ gbm_bo_get_fd(struct gbm_bo *bo)
> return bo->gbm->bo_get_fd(bo);
>  }
>  
> +/** Get the number of planes for the given bo.
> + *
> + * \param bo The buffer object
> + * \return The number of planes
> + */
> +GBM_EXPORT int
> +gbm_bo_get_plane_count(struct gbm_bo *bo)
> +{
> +   return bo->gbm->bo_get_planes(bo);
> +}
>  
>  /** Write data into the buffer object
>   *
> diff --git a/src/gbm/main/gbm.h b/src/gbm/main/gbm.h
> index efb329e..b4873ab 100644
> --- a/src/gbm/main/gbm.h
> +++ b/src/gbm/main/gbm.h
> @@ -316,6 +316,9 @@ int
>  gbm_bo_get_fd(struct gbm_bo *bo);
>  
>  int
> +gbm_bo_get_plane_count(struct gbm_bo *bo);
> +
> +int
>  gbm_bo_write(struct gbm_bo *bo, const void *buf, size_t count);
>  
>  void
> diff --git a/src/gbm/main/gbmint.h b/src/gbm/main/gbmint.h
> index cfef5ee..c6a6701 100644
> --- a/src/gbm/main/gbmint.h
> +++ b/src/gbm/main/gbmint.h
> @@ -76,6 +76,7 @@ struct gbm_device {
> void (*bo_unmap)(struct gbm_bo *bo, void *map_data);
> int (*bo_write)(struct gbm_bo *bo, const void *buf, size_t data);
> int (*bo_get_fd)(struct gbm_bo *bo);
> +   int (*bo_get_planes)(struct gbm_bo *bo);
> void (*bo_destroy)(struct gbm_bo *bo);
>  
> struct gbm_surface *(*surface_create)(struct gbm_device *gbm,
> -- 
> 2.10.2
> 


Re: [Mesa-dev] [PATCH 02/27] gbm: Fix width height getters return type (trivial)

2016-12-02 Thread Eric Engestrom
On Thursday, 2016-12-01 14:09:43 -0800, Ben Widawsky wrote:
> From: Ben Widawsky 
> 
> Signed-off-by: Ben Widawsky 
> ---
>  src/gbm/main/gbm.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gbm/main/gbm.h b/src/gbm/main/gbm.h
> index 59daaa1..efb329e 100644
> --- a/src/gbm/main/gbm.h
> +++ b/src/gbm/main/gbm.h
> @@ -294,10 +294,10 @@ gbm_bo_map(struct gbm_bo *bo,
>  void
>  gbm_bo_unmap(struct gbm_bo *bo, void *map_data);
>  
> -uint32_t
> +unsigned int
>  gbm_bo_get_width(struct gbm_bo *bo);
>  
> -uint32_t
> +unsigned int
>  gbm_bo_get_height(struct gbm_bo *bo);

I'm not sure I understand this change. Why would you want to remove the
information of the type size? If the point is to increase it on 64-bit
machines, I'd go with an explicit `uint64_t` instead.

>  
>  uint32_t
> -- 
> 2.10.2
> 


Re: [Mesa-dev] [PATCH] swr: [rasterizer jitter] include stdarg.h in builder_misc.cpp

2016-12-02 Thread Jan Vesely
On Fri, 2016-12-02 at 10:53 -0600, Tim Rowley wrote:
> Fixes build problem with llvm-svn.
> ---
>  src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp 
> b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
> index d755cc3..fce68c8 100644
> --- a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
> +++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
> @@ -30,6 +30,7 @@
>  #include "builder.h"
>  #include "common/rdtsc_buckets.h"
>  
> +#include <stdarg.h>

wouldn't a <cstdarg> be better for c++ source?

Jan
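
For context, a minimal sketch of the kind of code that needs this header. It is written in C rather than C++ and `demo_format` is a hypothetical helper, not a function from the swr sources. `va_list`, `va_start` and `va_end` are declared by `<stdarg.h>`; in C++ the same facilities are also available through `<cstdarg>`.

```c
#include <stdarg.h>
#include <stdio.h>

/* Formats into buf like snprintf. Without <stdarg.h> (or another header
 * that happens to pull it in), va_list and va_start are undeclared. */
int
demo_format(char *buf, size_t size, const char *fmt, ...)
{
   va_list args;
   int written;

   va_start(args, fmt);
   written = vsnprintf(buf, size, fmt, args);
   va_end(args);
   return written;
}
```

Code like this builds by accident as long as some other header includes the declarations indirectly, and breaks when that header changes, which is consistent with a failure that appears only against a newer LLVM.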

>  
>  namespace SwrJit
>  {
> @@ -1623,4 +1624,4 @@ namespace SwrJit
>  }
>  }
>  
> -}
> \ No newline at end of file
> +}




Re: [Mesa-dev] [PATCH 6/6] anv: do not open random render node(s)

2016-12-02 Thread Eric Engestrom
On Friday, 2016-12-02 16:31:49 +, Emil Velikov wrote:
> From: Emil Velikov 
> 
> drmGetDevices2() provides us with enough flexibility to build heuristics
> upon. Opening a random node on the other hand will wake up the device,
> regardless if it's the one we're intereseted or not.

"interested"
(same in the previous patch)

> 
> Cc: Jason Ekstrand 
> Signed-off-by: Emil Velikov 
> ---
>  src/intel/vulkan/Makefile.am  |  3 ++-
>  src/intel/vulkan/anv_device.c | 53 
> +++
>  2 files changed, 40 insertions(+), 16 deletions(-)
> 
> diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
> index df7645f..e309491 100644
> --- a/src/intel/vulkan/Makefile.am
> +++ b/src/intel/vulkan/Makefile.am
> @@ -66,7 +66,7 @@ AM_CPPFLAGS += \
>  endif
>  
>  AM_CPPFLAGS += \
> - $(INTEL_CFLAGS) \
> + $(LIBDRM_CFLAGS) \
>   $(VALGRIND_CFLAGS) \
>   $(DEFINES)
>  
> @@ -131,6 +131,7 @@ VULKAN_LIB_DEPS += \
>   $(top_builddir)/src/intel/isl/libisl.la \
>   $(top_builddir)/src/intel/blorp/libblorp.la \
>   $(PER_GEN_LIBS) \
> + $(LIBDRM_LIBS) \
>   $(PTHREAD_LIBS) \
>   $(DLOPEN_LIBS) \
>   -lm

Unrelated -> separate patch?

> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index d594df7..9927ac2 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "anv_private.h"
>  #include "util/strtod.h"
> @@ -375,6 +376,40 @@ void anv_DestroyInstance(
> vk_free(&instance->alloc, instance);
>  }
>  
> +static VkResult
> +anv_enumerate_devices(struct anv_instance *instance)
> +{
> +   /* TODO: Check for more devices ? */
> +   drmDevicePtr devices[8];
> +   VkResult result = VK_SUCCESS;
> +   int max_devices;
> +
> +   max_devices = drmGetDevices2(0, devices, sizeof(devices));
> +   if (max_devices < 1)
> +  return VK_ERROR_INCOMPATIBLE_DRIVER;
> +
> +   for (unsigned i = 0; i < (unsigned)max_devices; i++) {
> +  if (devices[i]->available_nodes & 1 << DRM_NODE_RENDER &&
> +  devices[i]->bustype == DRM_BUS_PCI &&
> +  devices[i]->deviceinfo.pci->vendor_id == 0x8086) {

Yay, magic values!
I feel like we should replace all those with PCI_VENDOR_INTEL or something.
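
A sketch of what that could look like, in self-contained C. Everything prefixed `demo_` or `DEMO_` is hypothetical: `struct demo_device` only mimics the three fields the loop above tests and is not the libdrm `drmDevice` layout, and the node enum values are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Named constant instead of a bare 0x8086 in the condition. */
#define DEMO_PCI_VENDOR_INTEL 0x8086

enum demo_bus { DEMO_BUS_PCI, DEMO_BUS_OTHER };
enum { DEMO_NODE_PRIMARY = 0, DEMO_NODE_RENDER = 2 };

/* Hypothetical device record; mirrors only the fields checked above. */
struct demo_device {
   unsigned available_nodes;   /* bitmask, one bit per node type */
   enum demo_bus bustype;
   uint16_t vendor_id;
};

/* True for a PCI device from the named vendor that exposes a render node. */
bool
demo_is_intel_render_device(const struct demo_device *dev)
{
   return (dev->available_nodes & (1u << DEMO_NODE_RENDER)) != 0 &&
          dev->bustype == DEMO_BUS_PCI &&
          dev->vendor_id == DEMO_PCI_VENDOR_INTEL;
}
```

Pulling the test into a named predicate also leaves the enumeration loop with a single readable condition.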

Anyway, series is
Reviewed-by: Eric Engestrom 

> +
> + result = anv_physical_device_init(&instance->physicalDevice,
> +instance,
> +devices[i]->nodes[DRM_NODE_RENDER]);
> + if (result != VK_ERROR_INCOMPATIBLE_DRIVER)
> +break;
> +  }
> +   }
> +
> +   if (result == VK_ERROR_INCOMPATIBLE_DRIVER)
> +  instance->physicalDeviceCount = 0;
> +   else if (result == VK_SUCCESS)
> +  instance->physicalDeviceCount = 1;
> +
> +   return result;
> +}
> +
> +
>  VkResult anv_EnumeratePhysicalDevices(
>  VkInstance  _instance,
>  uint32_t*   pPhysicalDeviceCount,
> @@ -384,22 +419,10 @@ VkResult anv_EnumeratePhysicalDevices(
> VkResult result;
>  
> if (instance->physicalDeviceCount < 0) {
> -  char path[20];
> -  for (unsigned i = 0; i < 8; i++) {
> - snprintf(path, sizeof(path), "/dev/dri/renderD%d", 128 + i);
> - result = anv_physical_device_init(&instance->physicalDevice,
> -   instance, path);
> - if (result != VK_ERROR_INCOMPATIBLE_DRIVER)
> -break;
> -  }
> -
> -  if (result == VK_ERROR_INCOMPATIBLE_DRIVER) {
> - instance->physicalDeviceCount = 0;
> -  } else if (result == VK_SUCCESS) {
> - instance->physicalDeviceCount = 1;
> -  } else {
> +  result = anv_enumerate_devices(instance);
> +  if (result != VK_SUCCESS &&
> +  result != VK_ERROR_INCOMPATIBLE_DRIVER)
>   return result;
> -  }
> }
>  
> /* pPhysicalDeviceCount is an out parameter if pPhysicalDevices is NULL;
> -- 
> 2.10.2
> 


Re: [Mesa-dev] [PATCH 02/23] tgsi: add Stream{X, Y, Z, W} fields to tgsi_declaration_semantic

2016-12-02 Thread Nicolai Hähnle

On 30.11.2016 21:37, Roland Scheidegger wrote:

On 30.11.2016 at 20:19, Nicolai Hähnle wrote:

On 30.11.2016 19:06, Roland Scheidegger wrote:

On 30.11.2016 at 14:35, Nicolai Hähnle wrote:

From: Nicolai Hähnle 

This is for geometry shader outputs. Without it, drivers have no way of
knowing which stream each output is intended for, and have to
conservatively write all outputs to all streams.

Separate stream numbers for each component are required due to output
packing.

Are you sure this is true?
This is an area I don't know much about, but
https://www.opengl.org/wiki/Layout_Qualifier_(GLSL)
tells me "Stream
assignments for a geometry shader are required to be the same for all
members of a block, but offsets are not."

Therefore I don't think output packing should ever happen across
multiple streams. I think it would be MUCH nicer if the semantic needed
just one stream member...


There are two variants of that question, I guess.

The answer to the first variant is: Yes, this is currently true.
lower_packed_varyings will happily pack outputs from different vertex
streams into the same vec4. This affects quite a lot of programs, e.g.
you see it in piglit arb_gpu_shader5-xfb-streams.

The second question is: Do we want it to be true? I agree that it would
be convenient to be able to use a single Stream member. Also, isolating
the stream0 components from the rest would lead to slightly more
efficient shaders for us in some cases.

I opted against it so far because I didn't want to think through the
implications of changing lower_packed_varyings. The main question I have
is: if you account for the size of the GS output in # of components,
then it could happen that the number of output vec4s ends up being
larger than (max # of output components) / 4. Will that be a problem
somewhere?


I don't know if that would be a problem, but if it is I'd assume this
would be fixable (since the number of actual components ultimately
doesn't change).
Having outputs belonging to multiple streams in a single output just
seems weird...
That said, I wonder if it actually would be possible to do that with
d3d11 too.
With shader model 5 you'd have:
dcl_stream 0
dcl_output o0.xy
dcl_stream 1
dcl_output o0.zw // legal or not???

Though the shader model 4/5 rules are a bit weird for packing
inputs/outputs, I'm not even sure two dcl_output are legal for the same
reg without a dcl_stream in between them (but you can pack system values
together with ordinary inputs/outputs).

So maybe just allowing this is the right solution...


I played around with the DX shader compiler, and I have some annoying 
news. SM5 actually uses not just the same output register but even the 
same component for multiple streams -- see the output I've pasted at the 
end.


So how to proceed? To simplify things going forward, I'm mostly 
convinced that the GLSL output packing should be changed to pack outputs 
by stream. As I mentioned previously, this has other minor advantages 
for us anyway.


Then one possibility to accommodate SM5 would be to have a Stream 
bitmask, one bit per stream, as part of the output semantics. The 
downside of this is that I wanted to use the WriteMask as an additional 
optimization to avoid writing out unused components, and you'd then need 
separate WriteMasks for each stream.


The other possibility, which I prefer, would be to have just a single 
Stream field indicating one stream number per output register, and 
aliasing is just not allowed despite what SM5 wants.


TGSI -> SM5 conversion is trivial.

SM5 -> TGSI conversion is also possible despite the aliasing on the DX 
side, because the doc says this about emit_stream: "Af[t]er the emit, 
all data in all output registers for all streams become uninitialized, 
not just the stream emitted to." 
(https://msdn.microsoft.com/en-us/library/windows/desktop/hh447051(v=vs.85).aspx). 
So you have to look-ahead to the next emit_stream for disambiguation, 
but it's clearly doable.
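
The look-ahead can be sketched roughly as follows, in C. This is only an illustration on a straight-line instruction list: `struct demo_inst`, `enum demo_op` and `demo_resolve_streams` are hypothetical, control flow is ignored, and writes with no emit after them are marked -1.

```c
#include <stddef.h>

enum demo_op { DEMO_OP_WRITE_OUTPUT, DEMO_OP_EMIT_STREAM };

/* Hypothetical instruction record for this sketch. */
struct demo_inst {
   enum demo_op op;
   int reg_or_stream;     /* output register for writes, stream for emits */
   int resolved_stream;   /* result; -1 if no emit follows a write */
};

/* Walk the list backwards so that every output write is tagged with the
 * stream of the nearest emit_stream that follows it. Since all output
 * registers become uninitialized after each emit, a write can only be
 * meant for that next emit. */
void
demo_resolve_streams(struct demo_inst *insts, size_t count)
{
   int next_stream = -1;

   for (size_t i = count; i-- > 0; ) {
      if (insts[i].op == DEMO_OP_EMIT_STREAM)
         next_stream = insts[i].reg_or_stream;
      insts[i].resolved_stream = next_stream;
   }
}
```

After this pass the same output register can carry a different stream number in different parts of the program, so a real converter would still need to give each (register, stream) pair its own non-aliased output.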


Any objections to that approach?

Thanks,
Nicolai
---
//
// Generated by Microsoft (R) HLSL Shader Compiler 10.0.10011.16384
//
//
//
// Input signature:
//
// Name Index   Mask Register SysValue  Format   Used
//  - --   --- --
// SV_POSITION  0   xyzw0  POS   float   xyzw
// TEXCOORD 0   xyz 1 NONE   float
// TEXCOORD 1   xy  2 NONE   float
//
//
// Output signature:
//
// Name Index   Mask Register SysValue  Format   Used
//  - --   --- --
// m0:TEXCOORD  0   x   0 NONE   float   x
// m0:TEXCOORD  1y  0  

Re: [Mesa-dev] [PATCH] swr: [rasterizer jitter] include stdarg.h in builder_misc.cpp

2016-12-02 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak  

> On Dec 2, 2016, at 10:53 AM, Rowley, Timothy O  
> wrote:
> 
> Fixes build problem with llvm-svn.
> ---
> src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp 
> b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
> index d755cc3..fce68c8 100644
> --- a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
> +++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
> @@ -30,6 +30,7 @@
> #include "builder.h"
> #include "common/rdtsc_buckets.h"
> 
> +#include <stdarg.h>
> 
> namespace SwrJit
> {
> @@ -1623,4 +1624,4 @@ namespace SwrJit
> }
> }
> 
> -}
> \ No newline at end of file
> +}
> -- 
> 2.9.3
> 



[Mesa-dev] [PATCH] swr: [rasterizer jitter] include stdarg.h in builder_misc.cpp

2016-12-02 Thread Tim Rowley
Fixes build problem with llvm-svn.
---
 src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
index d755cc3..fce68c8 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_misc.cpp
@@ -30,6 +30,7 @@
 #include "builder.h"
 #include "common/rdtsc_buckets.h"
 
+#include <stdarg.h>
 
 namespace SwrJit
 {
@@ -1623,4 +1624,4 @@ namespace SwrJit
 }
 }
 
-}
\ No newline at end of file
+}
-- 
2.9.3



[Mesa-dev] [PATCH 3/6] winsys/amdgpu: use drmGetDevice[s]2 API

2016-12-02 Thread Emil Velikov
From: Emil Velikov 

Analogous to previous commit

Cc: Michel Dänzer 
Signed-off-by: Emil Velikov 
---
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
index 98d72bd..d3df66f 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
@@ -108,9 +108,9 @@ static bool do_winsys_init(struct amdgpu_winsys *ws, int fd)
drmDevicePtr devinfo;
 
/* Get PCI info. */
-   r = drmGetDevice(fd, &devinfo);
+   r = drmGetDevice2(fd, 0, &devinfo);
if (r) {
-  fprintf(stderr, "amdgpu: drmGetDevice failed.\n");
+  fprintf(stderr, "amdgpu: drmGetDevice2 failed.\n");
   goto fail;
}
ws->info.pci_domain = devinfo->businfo.pci->domain;
-- 
2.10.2



[Mesa-dev] [PATCH 6/6] anv: do not open random render node(s)

2016-12-02 Thread Emil Velikov
From: Emil Velikov 

drmGetDevices2() provides us with enough flexibility to build heuristics
upon. Opening a random node on the other hand will wake up the device,
regardless if it's the one we're intereseted or not.

Cc: Jason Ekstrand 
Signed-off-by: Emil Velikov 
---
 src/intel/vulkan/Makefile.am  |  3 ++-
 src/intel/vulkan/anv_device.c | 53 +++
 2 files changed, 40 insertions(+), 16 deletions(-)

diff --git a/src/intel/vulkan/Makefile.am b/src/intel/vulkan/Makefile.am
index df7645f..e309491 100644
--- a/src/intel/vulkan/Makefile.am
+++ b/src/intel/vulkan/Makefile.am
@@ -66,7 +66,7 @@ AM_CPPFLAGS += \
 endif
 
 AM_CPPFLAGS += \
-   $(INTEL_CFLAGS) \
+   $(LIBDRM_CFLAGS) \
$(VALGRIND_CFLAGS) \
$(DEFINES)
 
@@ -131,6 +131,7 @@ VULKAN_LIB_DEPS += \
$(top_builddir)/src/intel/isl/libisl.la \
$(top_builddir)/src/intel/blorp/libblorp.la \
$(PER_GEN_LIBS) \
+   $(LIBDRM_LIBS) \
$(PTHREAD_LIBS) \
$(DLOPEN_LIBS) \
-lm
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index d594df7..9927ac2 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "anv_private.h"
 #include "util/strtod.h"
@@ -375,6 +376,40 @@ void anv_DestroyInstance(
vk_free(&instance->alloc, instance);
 }
 
+static VkResult
+anv_enumerate_devices(struct anv_instance *instance)
+{
+   /* TODO: Check for more devices ? */
+   drmDevicePtr devices[8];
+   VkResult result = VK_SUCCESS;
+   int max_devices;
+
+   max_devices = drmGetDevices2(0, devices, sizeof(devices));
+   if (max_devices < 1)
+  return VK_ERROR_INCOMPATIBLE_DRIVER;
+
+   for (unsigned i = 0; i < (unsigned)max_devices; i++) {
+  if (devices[i]->available_nodes & 1 << DRM_NODE_RENDER &&
+  devices[i]->bustype == DRM_BUS_PCI &&
+  devices[i]->deviceinfo.pci->vendor_id == 0x8086) {
+
+ result = anv_physical_device_init(&instance->physicalDevice,
+instance,
+devices[i]->nodes[DRM_NODE_RENDER]);
+ if (result != VK_ERROR_INCOMPATIBLE_DRIVER)
+break;
+  }
+   }
+
+   if (result == VK_ERROR_INCOMPATIBLE_DRIVER)
+  instance->physicalDeviceCount = 0;
+   else if (result == VK_SUCCESS)
+  instance->physicalDeviceCount = 1;
+
+   return result;
+}
+
+
 VkResult anv_EnumeratePhysicalDevices(
 VkInstance  _instance,
 uint32_t*   pPhysicalDeviceCount,
@@ -384,22 +419,10 @@ VkResult anv_EnumeratePhysicalDevices(
VkResult result;
 
if (instance->physicalDeviceCount < 0) {
-  char path[20];
-  for (unsigned i = 0; i < 8; i++) {
- snprintf(path, sizeof(path), "/dev/dri/renderD%d", 128 + i);
- result = anv_physical_device_init(&instance->physicalDevice,
-   instance, path);
- if (result != VK_ERROR_INCOMPATIBLE_DRIVER)
-break;
-  }
-
-  if (result == VK_ERROR_INCOMPATIBLE_DRIVER) {
- instance->physicalDeviceCount = 0;
-  } else if (result == VK_SUCCESS) {
- instance->physicalDeviceCount = 1;
-  } else {
+  result = anv_enumerate_devices(instance);
+  if (result != VK_SUCCESS &&
+  result != VK_ERROR_INCOMPATIBLE_DRIVER)
  return result;
-  }
}
 
/* pPhysicalDeviceCount is an out parameter if pPhysicalDevices is NULL;
-- 
2.10.2



[Mesa-dev] [PATCH 2/6] loader: use drmGetDevice[s]2 API

2016-12-02 Thread Emil Velikov
From: Emil Velikov 

By default this allows us to fetch the device list/info _without_ the
revision field. At the moment retrieving that wakes up the device.

Note: kernel patch to resolve that should be in 4.10.

Cc: Michel Dänzer 
Signed-off-by: Emil Velikov 
---
 src/loader/loader.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/loader/loader.c b/src/loader/loader.c
index 449ff54..b369dec 100644
--- a/src/loader/loader.c
+++ b/src/loader/loader.c
@@ -145,7 +145,7 @@ static char *drm_get_id_path_tag_for_fd(int fd)
drmDevicePtr device;
char *tag;
 
-   if (drmGetDevice(fd, &device) != 0)
+   if (drmGetDevice2(fd, 0, &device) != 0)
return NULL;
 
tag = drm_construct_id_path_tag(device);
@@ -179,7 +179,7 @@ int loader_get_user_preferred_fd(int default_fd, int *different_device)
if (default_tag == NULL)
   goto err;
 
-   num_devices = drmGetDevices(devices, MAX_DRM_DEVICES);
+   num_devices = drmGetDevices2(0, devices, MAX_DRM_DEVICES);
if (num_devices < 0)
   goto err;
 
@@ -275,7 +275,7 @@ drm_get_pci_id_for_fd(int fd, int *vendor_id, int *chip_id)
drmDevicePtr device;
int ret;
 
-   if (drmGetDevice(fd, &device) == 0) {
+   if (drmGetDevice2(fd, 0, &device) == 0) {
   if (device->bustype == DRM_BUS_PCI) {
  *vendor_id = device->deviceinfo.pci->vendor_id;
  *chip_id = device->deviceinfo.pci->device_id;
-- 
2.10.2



[Mesa-dev] [PATCH 5/6] radv: do not open random render node(s)

2016-12-02 Thread Emil Velikov
From: Emil Velikov 

drmGetDevices2() provides us with enough flexibility to build heuristics
upon. Opening a random node on the other hand will wake up the device,
regardless of whether it's the one we're interested in.

Cc: Michel Dänzer 
Cc: Dave Airlie 
Signed-off-by: Emil Velikov 
---
AFAICT there is no system with more than one Intel GPU, but on the other
hand one can easily have a setup with many AMD cards.

Dave, any reason why we are capped at 1 device ?
---
 src/amd/vulkan/radv_device.c | 51 +++-
 1 file changed, 36 insertions(+), 15 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 0defc0f..3eea0cd 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -300,6 +300,39 @@ void radv_DestroyInstance(
vk_free(&instance->alloc, instance);
 }
 
+static VkResult
+radv_enumerate_devices(struct radv_instance *instance)
+{
+   /* TODO: Check for more devices ? */
+   drmDevicePtr devices[8];
+   VkResult result = VK_SUCCESS;
+   int max_devices;
+
+   max_devices = drmGetDevices2(0, devices, ARRAY_SIZE(devices));
+   if (max_devices < 1)
+   return VK_ERROR_INCOMPATIBLE_DRIVER;
+
+   for (unsigned i = 0; i < (unsigned)max_devices; i++) {
+   if (devices[i]->available_nodes & 1 << DRM_NODE_RENDER &&
+   devices[i]->bustype == DRM_BUS_PCI &&
+   devices[i]->deviceinfo.pci->vendor_id == 0x1002) {
+
+   result = radv_physical_device_init(&instance->physicalDevice,
+  instance,
+  devices[i]->nodes[DRM_NODE_RENDER]);
+   if (result != VK_ERROR_INCOMPATIBLE_DRIVER)
+   break;
+   }
+   }
+
+   if (result == VK_ERROR_INCOMPATIBLE_DRIVER)
+   instance->physicalDeviceCount = 0;
+   else if (result == VK_SUCCESS)
+   instance->physicalDeviceCount = 1;
+
+   return result;
+}
+
 VkResult radv_EnumeratePhysicalDevices(
VkInstance  _instance,
uint32_t*   pPhysicalDeviceCount,
@@ -309,22 +342,10 @@ VkResult radv_EnumeratePhysicalDevices(
VkResult result;
 
if (instance->physicalDeviceCount < 0) {
-   char path[20];
-   for (unsigned i = 0; i < 8; i++) {
-   snprintf(path, sizeof(path), "/dev/dri/renderD%d", 128 + i);
-   result = radv_physical_device_init(&instance->physicalDevice,
-  instance, path);
-   if (result != VK_ERROR_INCOMPATIBLE_DRIVER)
-   break;
-   }
-
-   if (result == VK_ERROR_INCOMPATIBLE_DRIVER) {
-   instance->physicalDeviceCount = 0;
-   } else if (result == VK_SUCCESS) {
-   instance->physicalDeviceCount = 1;
-   } else {
+   result = radv_enumerate_devices(instance);
+   if (result != VK_SUCCESS &&
+   result != VK_ERROR_INCOMPATIBLE_DRIVER)
return result;
-   }
}
 
/* pPhysicalDeviceCount is an out parameter if pPhysicalDevices is NULL;
-- 
2.10.2



[Mesa-dev] [PATCH 4/6] radv/winsys: use drmGetDevice[s]2 API

2016-12-02 Thread Emil Velikov
From: Emil Velikov 

Analogous to the previous commit.

Cc: Michel Dänzer 
Cc: Dave Airlie 
Signed-off-by: Emil Velikov 
---
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c 
b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c
index b2e171a..014e4e9 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c
@@ -122,9 +122,9 @@ do_winsys_init(struct radv_amdgpu_winsys *ws, int fd)
int r;
int i, j;
/* Get PCI info. */
-   r = drmGetDevice(fd, &devinfo);
+   r = drmGetDevice2(fd, 0, &devinfo);
if (r) {
-   fprintf(stderr, "amdgpu: drmGetDevice failed.\n");
+   fprintf(stderr, "amdgpu: drmGetDevice2 failed.\n");
goto fail;
}
ws->info.pci_domain = devinfo->businfo.pci->domain;
-- 
2.10.2



[Mesa-dev] [PATCH 1/6] autoconf/scons: bump libdrm to 2.4.75

2016-12-02 Thread Emil Velikov
From: Emil Velikov 

We'll be using the drmGetDevice[s]2 API with the next patch.

Cc: Michel Dänzer 
Signed-off-by: Emil Velikov 
---
The libdrm patches have not landed !
Version number is preliminary !
---
 configure.ac | 2 +-
 scons/gallium.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index adca49d..2ecb070 100644
--- a/configure.ac
+++ b/configure.ac
@@ -68,7 +68,7 @@ OPENCL_VERSION=1
 AC_SUBST([OPENCL_VERSION])
 
 dnl Versions for external dependencies
-LIBDRM_REQUIRED=2.4.66
+LIBDRM_REQUIRED=2.4.75
 LIBDRM_RADEON_REQUIRED=2.4.56
 LIBDRM_AMDGPU_REQUIRED=2.4.63
 LIBDRM_INTEL_REQUIRED=2.4.61
diff --git a/scons/gallium.py b/scons/gallium.py
index dc7fdce..a8ebab0 100755
--- a/scons/gallium.py
+++ b/scons/gallium.py
@@ -651,7 +651,7 @@ def generate(env):
 env.PkgCheckModules('X11', ['x11', 'xext', 'xdamage', 'xfixes', 'glproto >= 1.4.13'])
 env.PkgCheckModules('XCB', ['x11-xcb', 'xcb-glx >= 1.8.1', 'xcb-dri2 >= 1.8'])
 env.PkgCheckModules('XF86VIDMODE', ['xxf86vm'])
-env.PkgCheckModules('DRM', ['libdrm >= 2.4.66'])
+env.PkgCheckModules('DRM', ['libdrm >= 2.4.75'])
 
 if env['x11']:
 env.Append(CPPPATH = env['X11_CPPPATH'])
-- 
2.10.2



[Mesa-dev] [PATCH 2/3] st/va: remove unused variable pbuff

2016-12-02 Thread Emil Velikov
From: Emil Velikov 

Signed-off-by: Emil Velikov 
---
 src/gallium/state_trackers/va/surface.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index f8513d9..357e85e 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -94,7 +94,6 @@ vlVaSyncSurface(VADriverContextP ctx, VASurfaceID render_target)
vlVaDriver *drv;
vlVaContext *context;
vlVaSurface *surf;
-   void *pbuff;
 
if (!ctx)
   return VA_STATUS_ERROR_INVALID_CONTEXT;
-- 
2.10.2



[Mesa-dev] [PATCH 3/3] st/va: move vlVaBuffer declaration further up

2016-12-02 Thread Emil Velikov
From: Emil Velikov 

This allows us to remove the struct vlVaBuffer reference, which led the
compiler to emit an "assignment from incompatible pointer type" warning.
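
To illustrate the warning (a minimal sketch with hypothetical names, not the
actual st/va types): when a struct is only declared through an anonymous
"typedef struct { ... } name;", writing "struct name *" refers to a
different, incomplete type, so assigning between the two triggers the
incompatible-pointer diagnostic. Declaring the typedef first lets later
structs use the typedef name directly.

```c
/* Hypothetical miniature of the situation; names are illustrative only. */
typedef struct {
   unsigned int size;
   void *data;
} example_buffer;   /* anonymous struct: there is no 'struct example_buffer' */

typedef struct {
   /* 'struct example_buffer *coded_buf;' here would name an unrelated,
    * incomplete type, and assigning an 'example_buffer *' to it would warn.
    * Using the typedef, which is already declared above, avoids that. */
   example_buffer *coded_buf;
} example_context;

static void
example_context_set_buffer(example_context *ctx, example_buffer *buf)
{
   ctx->coded_buf = buf;   /* same type on both sides, no warning */
}
```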

Signed-off-by: Emil Velikov 
---
 src/gallium/state_trackers/va/va_private.h | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/src/gallium/state_trackers/va/va_private.h 
b/src/gallium/state_trackers/va/va_private.h
index c9a6a41..054cfb3 100644
--- a/src/gallium/state_trackers/va/va_private.h
+++ b/src/gallium/state_trackers/va/va_private.h
@@ -220,6 +220,20 @@ typedef struct {
 } vlVaSubpicture;
 
 typedef struct {
+   VABufferType type;
+   unsigned int size;
+   unsigned int num_elements;
+   void *data;
+   struct {
+  struct pipe_resource *resource;
+  struct pipe_transfer *transfer;
+   } derived_surface;
+   unsigned int export_refcount;
+   VABufferInfo export_state;
+   unsigned int coded_size;
+} vlVaBuffer;
+
+typedef struct {
struct pipe_video_codec templat, *decoder;
struct pipe_video_buffer *target;
union {
@@ -242,7 +256,7 @@ typedef struct {
} mpeg4;
 
struct vl_deint_filter *deint;
-   struct vlVaBuffer *coded_buf;
+   vlVaBuffer *coded_buf;
int target_id;
 } vlVaContext;
 
@@ -254,20 +268,6 @@ typedef struct {
 } vlVaConfig;
 
 typedef struct {
-   VABufferType type;
-   unsigned int size;
-   unsigned int num_elements;
-   void *data;
-   struct {
-  struct pipe_resource *resource;
-  struct pipe_transfer *transfer;
-   } derived_surface;
-   unsigned int export_refcount;
-   VABufferInfo export_state;
-   unsigned int coded_size;
-} vlVaBuffer;
-
-typedef struct {
struct pipe_video_buffer templat, *buffer;
struct util_dynarray subpics; /* vlVaSubpicture */
VAContextID ctx;
-- 
2.10.2



[Mesa-dev] [PATCH 1/3] st/va: automake: cleanup C{PP,}FLAGS

2016-12-02 Thread Emil Velikov
From: Emil Velikov 

Remove some transitional leftovers from the gallium pipe-loader rework
and kill off unneeded AM_CPPFLAGS.

Signed-off-by: Emil Velikov 
---
 src/gallium/state_trackers/va/Makefile.am | 12 
 1 file changed, 12 deletions(-)

diff --git a/src/gallium/state_trackers/va/Makefile.am 
b/src/gallium/state_trackers/va/Makefile.am
index 348cfe1..a70eede5 100644
--- a/src/gallium/state_trackers/va/Makefile.am
+++ b/src/gallium/state_trackers/va/Makefile.am
@@ -30,18 +30,6 @@ AM_CFLAGS = \
$(VA_CFLAGS) \
-DVA_DRIVER_INIT_FUNC="__vaDriverInit_$(VA_MAJOR)_$(VA_MINOR)"
 
-AM_CFLAGS += \
-   $(GALLIUM_PIPE_LOADER_DEFINES) \
-   -DPIPE_SEARCH_DIR=\"$(libdir)/gallium-pipe\"
-
-if HAVE_GALLIUM_STATIC_TARGETS
-AM_CFLAGS += \
-   -DGALLIUM_STATIC_TARGETS=1
-endif
-
-AM_CPPFLAGS = \
-   -I$(top_srcdir)/include
-
 noinst_LTLIBRARIES = libvatracker.la
 
 libvatracker_la_SOURCES = $(C_SOURCES)
-- 
2.10.2



[Mesa-dev] [PATCH 1/4] gallium: add pipe_screen::resource_changed

2016-12-02 Thread Philipp Zabel
Add a hook to tell drivers that an imported resource may have changed
and they need to update their internal derived resources.

Signed-off-by: Philipp Zabel 
---
 src/gallium/include/pipe/p_screen.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/include/pipe/p_screen.h 
b/src/gallium/include/pipe/p_screen.h
index 255647e..e21229e 100644
--- a/src/gallium/include/pipe/p_screen.h
+++ b/src/gallium/include/pipe/p_screen.h
@@ -224,6 +224,12 @@ struct pipe_screen {
  struct winsys_handle *handle,
  unsigned usage);
 
+   /**
+* Trigger recreation of derived internal resources. This can be used for
+* reimporting external images that can't be directly used as texture
+* sampler source.
+*/
+   void (*resource_changed)(struct pipe_screen *, struct pipe_resource *pt);
 
void (*resource_destroy)(struct pipe_screen *,
struct pipe_resource *pt);
-- 
2.10.2



[Mesa-dev] [PATCH 4/4] etnaviv: implement resource_changed to invalidate internal resources derived from imported buffers

2016-12-02 Thread Philipp Zabel
Implement the new resource_changed pipe callback to invalidate internal
resources derived from imported buffers. This is needed to update the
texture for re-imported renderables that may contain new contents.

Signed-off-by: Philipp Zabel 
---
 src/gallium/drivers/etnaviv/etnaviv_resource.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_resource.c 
b/src/gallium/drivers/etnaviv/etnaviv_resource.c
index a8858c5..20ec8f8 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_resource.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_resource.c
@@ -275,6 +275,18 @@ etna_resource_create(struct pipe_screen *pscreen,
 }
 
 static void
+etna_resource_changed(struct pipe_screen *pscreen, struct pipe_resource *prsc)
+{
+   struct etna_resource *res = etna_resource(prsc);
+
+   /* Make sure texture is older than the imported renderable buffer,
+* so etna_update_sampler_source will copy the pixel data again.
+*/
+   if (res->texture)
+  etna_resource(res->texture)->seqno = res->seqno - 1;
+}
+
+static void
 etna_resource_destroy(struct pipe_screen *pscreen, struct pipe_resource *prsc)
 {
struct etna_resource *rsc = etna_resource(prsc);
@@ -436,5 +448,6 @@ etna_resource_screen_init(struct pipe_screen *pscreen)
pscreen->resource_create = etna_resource_create;
pscreen->resource_from_handle = etna_resource_from_handle;
pscreen->resource_get_handle = etna_resource_get_handle;
+   pscreen->resource_changed = etna_resource_changed;
pscreen->resource_destroy = etna_resource_destroy;
 }
-- 
2.10.2



[Mesa-dev] [PATCH 3/4] etnaviv: initialize seqno of imported resources

2016-12-02 Thread Philipp Zabel
Imported resources already have contents that we want to be copied to
texture resources derived from them. Set initial seqno of imported
resources to 1, just as if they had already been rendered to.

Signed-off-by: Philipp Zabel 
---
 src/gallium/drivers/etnaviv/etnaviv_resource.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_resource.c 
b/src/gallium/drivers/etnaviv/etnaviv_resource.c
index aefe65b..a8858c5 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_resource.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_resource.c
@@ -325,6 +325,8 @@ etna_resource_from_handle(struct pipe_screen *pscreen,
if (!rsc->bo)
   goto fail;
 
+   rsc->seqno = 1;
+
level->width = tmpl->width0;
level->height = tmpl->height0;
 
-- 
2.10.2



[Mesa-dev] [PATCH 0/4] etnaviv: update derived texture resources of (re)imported buffers

2016-12-02 Thread Philipp Zabel
Hi,

to get weston / wayland_egl working on etnaviv, we need to update the texture
resources derived from imported buffers every time they are re-imported.

This patchset is based on the github-etnaviv/for_mainline_v1 branch and adds
a new pipe_screen::resource_changed callback that is called inside
dri2_from_planar and instructs the pipe driver to invalidate the internal
(texture) resources that are derived from the re-imported resource.

The etnaviv implementation of resource_changed just sets the texture seqno
to the resource seqno - 1. The initial seqno of imported resources is set to 1
so that texture resources created from them are actually older and trigger the
resolve on first use.
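
The seqno scheme described above can be sketched roughly as follows. This is
only an illustration under stated assumptions: the struct, the 32-bit seqno
width, the starting texture seqno of 0 and the wraparound-tolerant "older"
comparison are all hypothetical and not taken from the etnaviv sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical miniature of a resource with a derived texture copy. */
struct example_rsc {
   uint32_t seqno;                 /* bumped whenever contents change */
   struct example_rsc *texture;    /* derived copy, may be NULL */
};

/* Assumed comparison: 'a' is older than 'b' if its seqno is behind,
 * using a signed difference so that wraparound is tolerated. */
static bool
example_older(const struct example_rsc *a, const struct example_rsc *b)
{
   return (int32_t)(a->seqno - b->seqno) < 0;
}

/* Mirrors the idea of resource_changed: force the texture to look stale. */
static void
example_resource_changed(struct example_rsc *res)
{
   if (res->texture)
      res->texture->seqno = res->seqno - 1;
}
```

With an imported resource at seqno 1 and a fresh texture at seqno 0, the
texture compares as older, so the first use would trigger the copy; after a
re-import, example_resource_changed() makes it compare as older again.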

regards
Philipp

Philipp Zabel (4):
  gallium: add pipe_screen::resource_changed
  st/dri: ask the driver to update its internal copies on reimport
  etnaviv: initialize seqno of imported resources
  etnaviv: implement resource_changed to invalidate internal resources
derived from imported buffers

 src/gallium/drivers/etnaviv/etnaviv_resource.c | 15 +++
 src/gallium/include/pipe/p_screen.h|  6 ++
 src/gallium/state_trackers/dri/dri2.c  |  4 
 3 files changed, 25 insertions(+)

-- 
2.10.2



[Mesa-dev] [PATCH 2/4] st/dri: ask the driver to update its internal copies on reimport

2016-12-02 Thread Philipp Zabel
For imported buffers that can't be used directly as a source to the
texture samplers, the pipe driver might need to create an internal
copy, for example in a different tiling layout. When buffers are
reimported they may contain new image data, so the driver internal
copies need to be recreated.
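
The optional-hook pattern this relies on can be sketched as below. The
example_* types and functions are hypothetical, cut-down stand-ins rather
than the gallium structures; the point is only that the caller checks the
function pointer before using it, so drivers without derived copies need no
changes.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the screen/resource pair. */
struct example_resource;

struct example_screen {
   /* Optional hook: drivers that keep no derived copies leave it NULL. */
   void (*resource_changed)(struct example_screen *screen,
                            struct example_resource *res);
};

struct example_resource {
   struct example_screen *screen;
   int derived_copy_valid;
};

/* Example driver implementation: mark the derived copy as stale. */
static void
example_resource_changed(struct example_screen *screen,
                         struct example_resource *res)
{
   (void)screen;
   res->derived_copy_valid = 0;
}

/* What the import path does: call the hook only if the driver set it. */
static void
example_notify_reimport(struct example_resource *res)
{
   if (res->screen->resource_changed)
      res->screen->resource_changed(res->screen, res);
}
```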

Signed-off-by: Philipp Zabel 
---
 src/gallium/state_trackers/dri/dri2.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 9ec069b..a216e83 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -1168,6 +1168,10 @@ dri2_from_planar(__DRIimage *image, int plane, void *loaderPrivate)
if (img == NULL)
   return NULL;
 
+   if (img->texture->screen->resource_changed)
+  img->texture->screen->resource_changed(img->texture->screen,
+ img->texture);
+
/* set this to 0 for sub images. */
img->dri_components = 0;
return img;
-- 
2.10.2



[Mesa-dev] [PATCH] anv: bump push constant max size to 512bytes

2016-12-02 Thread Lionel Landwerlin
This is the size selected by the i965 driver.
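
For reference, the arithmetic behind the new value can be spelled out as
below. The EXAMPLE_* names are illustrative only, and reading the factors as
"bytes per dword x dwords per register x registers" is an assumption based
on the comment in the patch (16 registers of 256 bits each).

```c
/* Illustrative names only; the sizes follow the comment in the patch:
 * 16 registers of 256 bits each. */
#define EXAMPLE_BYTES_PER_DWORD   4
#define EXAMPLE_DWORDS_PER_REG    8    /* 8 * 32 bits = 256 bits */
#define EXAMPLE_PUSH_REGS         16
#define EXAMPLE_PUSH_CONSTANTS_SIZE \
   (EXAMPLE_BYTES_PER_DWORD * EXAMPLE_DWORDS_PER_REG * EXAMPLE_PUSH_REGS)

/* Compile-time check of the arithmetic (C11). */
_Static_assert(EXAMPLE_PUSH_CONSTANTS_SIZE == 512,
               "16 registers x 32 bytes should be 512 bytes");
```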

Signed-off-by: Lionel Landwerlin 
Cc: Kenneth Graunke 
---
 src/intel/vulkan/anv_private.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 1f03b68..ce4eb4d 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -77,7 +77,7 @@ extern "C" {
 #define MAX_RTS  8
 #define MAX_VIEWPORTS   16
 #define MAX_SCISSORS16
-#define MAX_PUSH_CONSTANTS_SIZE 128
+#define MAX_PUSH_CONSTANTS_SIZE (4 * 8 * 16) /* 16 (256-bits) registers */
 #define MAX_DYNAMIC_BUFFERS 16
 #define MAX_IMAGES 8
 #define MAX_SAMPLES_LOG2 4 /* SKL supports 16 samples */
-- 
2.10.2



Re: [Mesa-dev] Mesa 13.1.0 release plan

2016-12-02 Thread Emil Velikov
On 1 December 2016 at 10:00, Christian Gmeiner
 wrote:
> 2016-11-30 21:23 GMT+01:00 Emil Velikov :
>> Hi all,
>>
>> With holidays not far off, it might be a nice idea to consider the
>> branchpoint/release schedule for the next release.
>>
>> I will be having limited internet access during 20 Dec - 7 Jan, thus
>> I'm leaning towards the following:
>>  Jan 13 2017 - Feature freeze/Release candidate 1
>>  Jan 20 2017 - Release candidate 2
>>  Jan 27 2017 - Release candidate 3
>>  Feb 03 2017 - Release candidate 4/final release
>>
>> How does this align with people's schedules ?
>>
>> Please let me know if you have any work we want to land before the
>> next branchpoint.
>>
>
> I am interested in landing etnaviv.
>
Ack. Duly noted.

There are a few pretty minor (in terms of work) suggestions, and with
those in I'm pretty sure we can land it.

Thanks
Emil


[Mesa-dev] [PATCH] spirv: Builtin Layer is an input for fragment shaders

2016-12-02 Thread Iago Toral Quiroga
This change makes it so we emit a load_input intrinsic when Layer
is read in a fragment shader.
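
The stage-dependent choice can be sketched in isolation as follows. The
example_* enums and example_layer_mode() are hypothetical stand-ins, not the
NIR or SPIR-V definitions, and the sketch returns an explicit "invalid"
value where the patch itself uses unreachable().

```c
/* Hypothetical stand-ins for the shader stage and variable mode enums. */
enum example_stage {
   EXAMPLE_STAGE_VERTEX,
   EXAMPLE_STAGE_GEOMETRY,
   EXAMPLE_STAGE_FRAGMENT
};

enum example_mode {
   EXAMPLE_MODE_INVALID,
   EXAMPLE_MODE_SHADER_IN,
   EXAMPLE_MODE_SHADER_OUT
};

/* Layer is written by the geometry stage and read by the fragment stage. */
static enum example_mode
example_layer_mode(enum example_stage stage)
{
   switch (stage) {
   case EXAMPLE_STAGE_FRAGMENT:
      return EXAMPLE_MODE_SHADER_IN;
   case EXAMPLE_STAGE_GEOMETRY:
      return EXAMPLE_MODE_SHADER_OUT;
   default:
      return EXAMPLE_MODE_INVALID;
   }
}
```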
---

Even with this, layered rendering does not seem to work in the Vulkan
driver, so there is something else that is broken. We are probably
not mapping the Layer input correctly somewhere.

 src/compiler/spirv/vtn_variables.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 14366dc..c6d73a7 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -819,7 +819,12 @@ vtn_get_builtin_location(struct vtn_builder *b,
   break;
case SpvBuiltInLayer:
   *location = VARYING_SLOT_LAYER;
-  *mode = nir_var_shader_out;
+  if (b->shader->stage == MESA_SHADER_FRAGMENT)
+ *mode = nir_var_shader_in;
+  else if (b->shader->stage == MESA_SHADER_GEOMETRY)
+ *mode = nir_var_shader_out;
+  else
+ unreachable("invalid stage for SpvBuiltInLayer");
   break;
case SpvBuiltInViewportIndex:
   *location = VARYING_SLOT_VIEWPORT;
-- 
2.7.4



Re: [Mesa-dev] Mesa 13.1.0 release plan

2016-12-02 Thread Emil Velikov
On 1 December 2016 at 09:35, Nicolai Hähnle  wrote:
> On 30.11.2016 21:23, Emil Velikov wrote:
>>
>> Hi all,
>>
>> With holidays not far off, it might be a nice idea to consider the
>> branchpoint/release schedule for the next release.
>
>
> +1 on the 17.0 question.
>
I'd prefer to keep different things separate - scheme vs schedule ;-)
Will check with the versioning scheme thread in a day or so. Might
poke some distro maintainers to collect some feedback (mostly
checking for objections).

>
>> I will be having limited internet access during 20 Dec - 7 Jan, thus
>> I'm leaning towards the following:
>>  Jan 13 2017 - Feature freeze/Release candidate 1
>>  Jan 20 2017 - Release candidate 2
>>  Jan 27 2017 - Release candidate 3
>>  Feb 03 2017 - Release candidate 4/final release
>>
>> How does this align with people's schedules ?
>>
>> Please let me know if you have any work we want to land before the
>> next branchpoint.
>
>
> I was hoping to get GLCTS failures for radeonsi down to 0. We're currently
> at 18 (including patches not on master and some pending LLVM changes). For
> some of the failures this may need spec clarification feedback, which tends
> to be not the fastest process in the world (e.g. I'm thinking of the
> program_interface_query stuff). Apart from those, which are a big unknown,
> the schedule is probably doable.
>
Ack. If there is any noticeable work needed please try to land this
before the branchpoint. This way one can toggle between codepath A and
B during the RCs ;-)

Thanks
Emil


Re: [Mesa-dev] [PATCH 00/27] Renderbuffer Decompression (and GBM modifiers)

2016-12-02 Thread Rob Clark
On Thu, Dec 1, 2016 at 5:09 PM, Ben Widawsky
 wrote:
> When Kristian's interface is ready, kmscube can be modified to make use of it.
>
> Rob: are you interested in a PR for kmscube?

sure, from a quick look seems like it should be backwards compatible..
probably we should set up a git tree on fd.o for kmscube

It does make me realize that I do need to figure out what to do w/ the
atomic/fences branches.. maybe I should just make a legacy branch
which sticks with the legacy APIs for hw that doesn't support atomic
and old kernels.  Otherwise I guess kmscube maybe needs to get split
into more than one file to keep it from being too much of a mess ;-)

btw, interesting that you went the route of an extra plane for
"metadata"..  I have something similar w/ a5xx, and was assuming I'd
just have to go single-plane with well known formula for calculating
offset of color data from aux data, to avoid confusing dri2/dri3 too
badly.

BR,
-R


Re: [Mesa-dev] [PATCH 11/27] gbm: Get modifiers from DRI

2016-12-02 Thread Daniel Stone
Hi Ben,

On 1 December 2016 at 22:09, Ben Widawsky  wrote:
> @@ -678,6 +679,28 @@ gbm_dri_bo_get_offset(struct gbm_bo *_bo, int plane)
> return (uint32_t)offset;
>  }
>
> +static uint64_t
> +gbm_dri_bo_get_modifier(struct gbm_bo *_bo)
> +{
> +   struct gbm_dri_device *dri = gbm_dri_device(_bo->gbm);
> +   struct gbm_dri_bo *bo = gbm_dri_bo(_bo);
> +
> +   if (!dri->image || dri->image->base.version < 14) {
> +  errno = ENOSYS;
> +  return 0;
> +   }

Sticking this here prevents my cursor crash:
+   /* Dumb buffers have no modifiers */
+   if (!bo->image)
+  return 0;

Cheers,
Daniel


Re: [Mesa-dev] [PATCH 09/27] gbm: Introduce modifiers into surface/bo creation

2016-12-02 Thread Daniel Stone
Hi Ben,

On 1 December 2016 at 22:09, Ben Widawsky  wrote:
> @@ -996,13 +997,22 @@ gbm_dri_bo_create(struct gbm_device *gbm,
> dri_use |= __DRI_IMAGE_USE_SHARE;
>
> bo->image =
> -  dri->image->createImage(dri->screen,
> -  width, height,
> -  dri_format, dri_use,
> -  bo);
> +  dri->image->createImageWithModifiers(dri->screen,
> +   width, height,
> +   dri_format, dri_use,
> +   modifiers, count,
> +   bo);
> if (bo->image == NULL)
>goto failed;
>
> +   bo->base.base.modifiers = calloc(count, sizeof(*modifiers));
> +   if (!bo->base.base.modifiers) {

if (count && ...)
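
Presumably the point is that calloc() with a count of zero may legitimately
return NULL, so a bare NULL check would report a failure when no modifiers
were passed. A minimal sketch of that idea, with a hypothetical helper name
and signature that are not part of GBM:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate a modifier list.
 * Returns 0 on success, -1 on allocation failure. */
static int
example_copy_modifiers(const uint64_t *modifiers, unsigned count,
                       uint64_t **out)
{
   uint64_t *copy = calloc(count, sizeof(*copy));

   /* calloc(0, ...) may legitimately return NULL, so only treat NULL
    * as an error when something was actually requested. */
   if (count && !copy)
      return -1;

   if (count)
      memcpy(copy, modifiers, count * sizeof(*copy));

   *out = copy;
   return 0;
}
```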

Cheers,
Daniel


Re: [Mesa-dev] [PATCH 00/27] Renderbuffer Decompression (and GBM modifiers)

2016-12-02 Thread Daniel Stone
Hey Ben,
Sorry I didn't get to testing this before now; have been tied up with
all manner of stuff.

On 1 December 2016 at 22:09, Ben Widawsky  wrote:
> The overall strategy is that the buffer/surface is created with a list of
> modifiers. The list of modifiers the hardware is capable of using will come
> from a new kernel API that is aware of the hardware and general constraints.
> A client will request the list of modifiers and pass it directly back in
> during buffer creation (potentially the client can prune the list, but as of
> now there is no reason to.) This new API is being developed by Kristian. I
> did not get far enough to play with that.
>
> For EGL, a similar mechanism would exist whereby when importing a buffer into
> EGL, one would provide a modifier and probably a pointer to the auxiliary data
> upon import. (Import therefore might require multiple dma-buf fds), but for
> i965 and Intel, this wouldn't be necessary.

Right, we have EGL_EXT_image_dma_buf_import_modifiers; Varad has a
series on the list already for this which just needs some reviews
(ahem).

> Here is a brief description of the series:
> 1-6 Adds support in GBM for per plane functions where necessary. This is
> required because the kernel expects the auxiliary buffer to be passed along
> as a plane. It has its own offset, and stride, and the client shouldn't need
> to calculate those.

This is missing gbm_bo_get_handle_for_plane(); as you say, a lot of
other hardware tends to use separate buffers rather than
adjacent/offset. So adding that would be nice. Having
gbm_bo_get_plane_count() is really nice though, since it allows us to
have a completely agnostic client (i.e. I don't have to have a map
inside Weston with every exotic format/modifier combination).

> 7-9 Adds support in GBM to understand modifiers. When creating a buffer or
> surface, the client is expected to pass in a list of modifiers that the driver
> will optimally choose from. As a result of this, the GBM APIs need to support
> modifiers.

This bit seems good, and like a reasonable fit for the draft of
GETPLANE2 which is kicking around.

> 10-12 Support Y-tiled modifier. Y-tiling was already a modifier exposed by the
> kernel. With the previous patches in place, it's easy to support this too.

And it works! \o/

> 13-26 Plumbing to support sending CCS buffers to display. Leveraging much of
> the existing code for MCS buffers, these patches create an MCS for the
> scanout buffer. The trickery here is that a single BO contains both the main
> surface and the auxiliary data. Previously, auxiliary data always lived in
> its own BO.
>
> 27 Support CCS-modifier. Finally, the code can parse the CCS fb modifier(s)
> and realize the bandwidth savings that come with it.

I've not rebuilt my kernel to test the new CCS bits, so I haven't tested this.

> This was tested using kmscube
> (https://github.com/bwidawsk/kmscube/tree/modifiers). The kmscube
> implementation is missing support for GET_PLANE2 - which is currently being
> worked on by Kristian.

There's also a Weston branch here:
https://git.collabora.com/cgit/user/daniels/weston.git/log/?h=wip/2016-11/gbm-planes-modifiers

This works with Y-tiling for me, but with the same need for
GET_PLANE2; also the branch as-is will provoke a segfault inside
gbm_dri_bo_get_modifier(), which ends up calling intel_query_image()
with image == NULL, when using cursor images. To get it to succeed,
you need to shove an early 'return -1' inside
drm_output_init_cursor_egl() so we fall back to software (well OK, GL)
cursors.

The branch is broken with multihead, but that's the branch it's based
on being broken/WIP, not a result of these patches.

> Upstream plan:
> 1. All of the patches up through 26 should be mergeable today after review.
> 2. After 1-12 land, client support of Y-tiling should be achievable.
> Modesetting driver can probably be updated as can things like Weston. Clients
> assuming a new enough kernel should be able to blindly set the y tiled
> modifier.
> 3. Once kernel and libdrm support for CCS modifiers, patch 27 can land,
> however CCS isn't yet usable, it is only available as a prototype.
> 4. Kristian's GET_PLANE2 interface needs to be solidified and land.
> 5. Clients will utilize #3 and #4 to use CCS.
> 6. Protocol work, EGL, Wayland, DRIX - etc

Wayland has modifier support already; there are patches out for review
for Weston to support this via the EGL extension above, as well as
inside KMS (part of the atomic branch).

> When Kristian's interface is ready, kmscube can be modified to make use of it.

And I'll modify Weston to use it as well.

Thanks for this, and sorry for the tardy review.

Cheers,
Daniel


Re: [Mesa-dev] [PATCH 1/2] st/va: force to submit two consecutive single jobs

2016-12-02 Thread Christian König

Am 29.11.2016 um 20:43 schrieb boyuan.zh...@amd.com:

From: Boyuan Zhang 

The gop_size in rate control is the budget window for the internal rate
control calculation, and shouldn't always be equal to the IDR period. Define
a coefficient so that the budget window contains a number of IDR periods for
proper rate control calculation. Adjust the number of remaining I/P frames
accordingly.

v2: fixed regression issues introduced by previous version

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005

Signed-off-by: Boyuan Zhang 


Acked-by: Christian König . for the series.


---
  src/gallium/state_trackers/va/picture.c| 24 +++-
  src/gallium/state_trackers/va/surface.c|  8 ++--
  src/gallium/state_trackers/va/va_private.h |  2 ++
  3 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index a8102a4..592cdef 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -413,7 +413,6 @@ handleVAEncPictureParameterBufferType(vlVaDriver *drv, 
vlVaContext *context, vlV
 context->desc.h264enc.quant_i_frames = h264->pic_init_qp;
 context->desc.h264enc.quant_b_frames = h264->pic_init_qp;
 context->desc.h264enc.quant_p_frames = h264->pic_init_qp;
-   context->desc.h264enc.frame_num_cnt++;
 context->desc.h264enc.gop_cnt++;
 if (context->desc.h264enc.gop_cnt == context->desc.h264enc.gop_size)
context->desc.h264enc.gop_cnt = 0;
@@ -569,18 +568,33 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
 if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
coded_buf = context->coded_buf;
getEncParamPreset(context);
+  context->desc.h264enc.frame_num_cnt++;
context->decoder->begin_frame(context->decoder, context->target, &context->desc.base);
context->decoder->encode_bitstream(context->decoder, context->target,
   coded_buf->derived_surface.resource, &feedback);
-  surf->frame_num_cnt = context->desc.h264enc.frame_num_cnt;
surf->feedback = feedback;
surf->coded_buf = coded_buf;
 }
  
 context->decoder->end_frame(context->decoder, context->target, &context->desc.base);

-   if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE &&
-   context->desc.h264enc.p_remain == 1)
-  context->decoder->flush(context->decoder);
+   if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
+  surf->frame_num_cnt = context->desc.h264enc.frame_num_cnt;
+  surf->force_flushed = false;
+  if (context->first_single_submitted) {
+ context->decoder->flush(context->decoder);
+ context->first_single_submitted = false;
+ surf->force_flushed = true;
+  }
+  if (context->desc.h264enc.p_remain == 1) {
+ if ((context->desc.h264enc.frame_num_cnt % 2) != 0) {
+context->decoder->flush(context->decoder);
+context->first_single_submitted = true;
+ }
+ else
+context->first_single_submitted = false;
+ surf->force_flushed = true;
+  }
+   }
 pipe_mutex_unlock(drv->mutex);
 return VA_STATUS_SUCCESS;
  }
diff --git a/src/gallium/state_trackers/va/surface.c 
b/src/gallium/state_trackers/va/surface.c
index f8513d9..38b3151 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -125,12 +125,16 @@ vlVaSyncSurface(VADriverContextP ctx, VASurfaceID 
render_target)
  
 if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {

int frame_diff;
-  if (context->desc.h264enc.frame_num_cnt > surf->frame_num_cnt)
+  if (context->desc.h264enc.frame_num_cnt >= surf->frame_num_cnt)
   frame_diff = context->desc.h264enc.frame_num_cnt - 
surf->frame_num_cnt;
else
   frame_diff = 0x - surf->frame_num_cnt + 1 + 
context->desc.h264enc.frame_num_cnt;
-  if (frame_diff < 2)
+  if ((frame_diff == 0) &&
+  (surf->force_flushed == false) &&
+  (context->desc.h264enc.frame_num_cnt % 2 != 0)) {
   context->decoder->flush(context->decoder);
+ context->first_single_submitted = true;
+  }
context->decoder->get_feedback(context->decoder, surf->feedback, 
&(surf->coded_buf->coded_size));
surf->feedback = NULL;
 }
diff --git a/src/gallium/state_trackers/va/va_private.h 
b/src/gallium/state_trackers/va/va_private.h
index c9a6a41..9e3ba03 100644
--- a/src/gallium/state_trackers/va/va_private.h
+++ b/src/gallium/state_trackers/va/va_private.h
@@ -244,6 +244,7 @@ typedef struct {
 struct vl_deint_filter *deint;
 struct vlVaBuffer *coded_buf;
 int target_id;
+   bool first_single_submitted;
  } vlVaContext;
  
  typedef struct {

@@ -274,6 +275,7 @@ typedef struct {
 vlVaBuffer *coded_buf;
 void 
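
The frame_diff computation in the vlVaSyncSurface hunk above measures how far the encoder's frame counter has advanced past the value recorded on the surface, including the case where the counter has wrapped around. A minimal sketch of that idea follows, assuming both counters are 32-bit unsigned values; the helper names are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: distance from the frame count recorded on a surface
 * (submitted) to the encoder's current frame count, assuming both are
 * 32-bit unsigned counters that wrap around. */
static uint32_t frame_distance(uint32_t current, uint32_t submitted)
{
    /* Unsigned subtraction is defined modulo 2^32, so this also covers
     * the case where current has already wrapped past zero. */
    return current - submitted;
}

/* The explicit two-branch form, written out for comparison. */
static uint32_t frame_distance_branchy(uint32_t current, uint32_t submitted)
{
    if (current >= submitted)
        return current - submitted;
    /* current wrapped: count up to the maximum, one more step to reach
     * zero, then on to current. */
    return UINT32_MAX - submitted + 1u + current;
}
```

Both helpers return the same value for every pair of inputs; the single subtraction simply relies on modular arithmetic instead of spelling out the wrapped case.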

Re: [Mesa-dev] Mesa 13.1.0 release plan

2016-12-02 Thread Juan A. Suarez Romero
On Thu, 2016-12-01 at 21:29 +0100, Marek Olšák wrote:
> On Wed, Nov 30, 2016 at 9:23 PM, Emil Velikov wrote:
> > Hi all,
> > 
> > With holidays not far off, it might be a nice idea to consider the
> > branchpoint/release schedule for the next release.
> > 
> > I will be having limited internet access during 20 Dec - 7 Jan,
> > thus I'm leaning towards the following:
> >  Jan 13 2017 - Feature freeze/Release candidate 1
> >  Jan 20 2017 - Release candidate 2
> >  Jan 27 2017 - Release candidate 3
> >  Feb 03 2017 - Release candidate 4/final release
> 
> Sounds like a good plan for 17.0. :)

+1

J.A.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 16/22] isl: fix VA64 support for double and dvecN vertex attributes

2016-12-02 Thread Samuel Iglesias Gonsálvez
On Thu, 2016-12-01 at 18:58 -0800, Jason Ekstrand wrote:
> On Fri, Nov 25, 2016 at 12:52 AM, Juan A. Suarez Romero wrote:
> > From: Samuel Iglesias Gonsálvez 
> > 
> > We use *64*_PASSTHRU formats to upload vertex attributes of 64 bits
> > to avoid conversions. From the BDW PRM, Volume 2d, page 586
> > (VERTEX_ELEMENT_STATE):
> > 
> >      "When SourceElementFormat is set to one of the *64*_PASSTHRU
> >      formats, 64-bit components are stored in the URB without any
> >      conversion. In this case, vertex elements must be written as 128
> >      or 256 bits, with VFCOMP_STORE_0 being used to pad the output
> >      as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red
> >      component into the URB, Component 1 must be specified as
> >      VFCOMP_STORE_0 (with Components 2,3 set to VFCOMP_NOSTORE)
> >      in order to output a 128-bit vertex element, or Components 1-3 must
> >      be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex
> >      element. Likewise, use of R64G64B64_PASSTHRU requires Component 3
> >      to be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex
> >      element."
> > 
> > Signed-off-by: Samuel Iglesias Gonsálvez 
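
The padding rule in the PRM quote above can be sketched as a small helper that, for a given number of 64-bit source components, picks the component controls for the smallest legal element size. This is only an illustration under stated assumptions: the enum and the function name are hypothetical stand-ins, VFCOMP_STORE_SRC is assumed here to be the "store the source component" control, and the optional 256-bit layout for one or two components is not covered.

```c
#include <assert.h>

/* Local stand-ins for the component-control values named in the PRM text;
 * this enum is hypothetical and does not carry the hardware encodings. */
enum vfcomp {
    VFCOMP_NOSTORE,
    VFCOMP_STORE_SRC,
    VFCOMP_STORE_0
};

/* Fill comp[0..3] for a *64*_PASSTHRU element with n 64-bit source
 * components (1..4). Returns the output element size in bits (128 or 256),
 * or 0 if n is out of range. */
static int passthru_components(int n, enum vfcomp comp[4])
{
    if (n < 1 || n > 4)
        return 0;

    /* One or two 64-bit components fit in 128 bits; three or four need 256. */
    int slots = (n <= 2) ? 2 : 4;

    for (int i = 0; i < 4; i++) {
        if (i < n)
            comp[i] = VFCOMP_STORE_SRC;  /* real data from the vertex buffer */
        else if (i < slots)
            comp[i] = VFCOMP_STORE_0;    /* zero padding up to 128/256 bits */
        else
            comp[i] = VFCOMP_NOSTORE;    /* beyond the element: not written */
    }
    return slots * 64;
}
```

For R64_PASSTHRU (n = 1) this yields STORE_SRC, STORE_0, NOSTORE, NOSTORE and a 128-bit element; for R64G64B64_PASSTHRU (n = 3) only Component 3 is padded with STORE_0, giving 256 bits, matching the two examples in the quoted text.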
> > 
> > ---
> >  src/intel/isl/isl_format.c          | 4 ++--
> >  src/intel/isl/isl_format_layout.csv | 3 ---
> >  src/intel/vulkan/anv_formats.c      | 8 
> >  3 files changed, 6 insertions(+), 9 deletions(-)
> > 
> > diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
> > index 98806f4..92b630a 100644
> > --- a/src/intel/isl/isl_format.c
> > +++ b/src/intel/isl/isl_format.c
> > @@ -97,7 +97,7 @@ static const struct surface_format_info format_info[] = {
> >     SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,    x,   R32G32B32A32_SSCALED)
> >     SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,    x,   R32G32B32A32_USCALED)
> >     SF( x,  x,  x,  x,  x,  x, 75,  x,  x,    x,   R32G32B32A32_SFIXED)
> > -   SF( x,  x,  x,  x,  x,  x,  x,  x,  x,    x,   R64G64_PASSTHRU)
> > +   SF( x,  x,  x,  x,  x,  x, 80,  x,  x,    x,   R64G64_PASSTHRU)
> >     SF( Y, 50,  x,  x,  x,  x,  Y,  Y,  x,    x,   R32G32B32_FLOAT)
> >     SF( Y,  x,  x,  x,  x,  x,  Y,  Y,  x,    x,   R32G32B32_SINT)
> >     SF( Y,  x,  x,  x,  x,  x,  Y,  Y,  x,    x,   R32G32B32_UINT)
> > @@ -131,7 +131,7 @@ static const struct surface_format_info format_info[] = {
> >     SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,    x,   R32G32_SSCALED)
> >     SF( x,  x,  x,  x,  x,  x,  Y,  x,  x,    x,   R32G32_USCALED)
> >     SF( x,  x,  x,  x,  x,  x, 75,  x,  x,    x,   R32G32_SFIXED)
> > -   SF( x,  x,  x,  x,  x,  x,  x,  x,  x,    x,   R64_PASSTHRU)
> > +   SF( x,  x,  x,  x,  x,  x, 80,  x,  x,    x,   R64_PASSTHRU)
> >     SF( Y,  Y,  x,  Y,  Y,  Y,  Y,  x, 60,   90,   B8G8R8A8_UNORM)
> >     SF( Y,  Y,  x,  x,  Y,  Y,  x,  x,  x,    x,   B8G8R8A8_UNORM_SRGB)
> >  /* smpl filt shad CK  RT  AB  VB  SO  color ccs_e */
> > 
> > diff --git a/src/intel/isl/isl_format_layout.csv
> > b/src/intel/isl/isl_format_layout.csv
> > 
> > index f0f31c7..b1e298b 100644
> > 
> > --- a/src/intel/isl/isl_format_layout.csv
> > 
> > +++ b/src/intel/isl/isl_format_layout.csv
> > 
> > @@ -96,7 +96,6 @@ X32_TYPELESS_G8X24_UINT     ,  64,  1,  1,  1, 
> > x32,  ui8,  x24,     ,     ,
> > 
> >  L32A32_FLOAT                ,  64,  1,  1,  1,     ,     ,     ,
> > sf32, sf32,     ,    , linear,
> > 
> >  R32G32_UNORM                ,  64,  1,  1,  1, un32, un32,     , 
> >    ,     ,     ,    , linear,
> > 
> >  R32G32_SNORM                ,  64,  1,  1,  1, sn32, sn32,     , 
> >    ,     ,     ,    , linear,
> > 
> > -R64_FLOAT                   ,  64,  1,  1,  1, sf64,     ,     , 
> >    ,     ,     ,    , linear,
> > 
> >  R16G16B16X16_UNORM          ,  64,  1,  1,  1, un16, un16, un16, 
> > x16,     ,     ,    , linear,
> > 
> >  R16G16B16X16_FLOAT          ,  64,  1,  1,  1, sf16, sf16, sf16, 
> > x16,     ,     ,    , linear,
> > 
> >  A32X32_FLOAT                ,  64,  1,  1,  1,     ,     ,     ,
> > sf32,  x32,     ,    ,  alpha,
> > 
> > @@ -243,8 +242,6 @@ R8G8B8_UNORM                ,  24,  1,  1,  1, 
> > un8,  un8,  un8,     ,     ,
> > 
> >  R8G8B8_SNORM                ,  24,  1,  1,  1,  sn8,  sn8,  sn8, 
> >    ,     ,     ,    , linear,
> > 
> >  R8G8B8_SSCALED              ,  24,  1,  1,  1,  ss8,  ss8,  ss8, 
> >    ,     ,     ,    , linear,
> > 
> >  R8G8B8_USCALED              ,  24,  1,  1,  1,  us8,  us8,  us8, 
> >    ,     ,     ,    , linear,
> > 
> > -R64G64B64A64_FLOAT          , 256,  1,  1,  1, sf64, sf64, sf64,
> > sf64,     ,     ,    , linear,
> > 
> > -R64G64B64_FLOAT             , 196,  1,  1,  1, sf64, 

Re: [Mesa-dev] [PATCH 04/22] spirv: add DF support to vtn_const_ssa_value()

2016-12-02 Thread Samuel Iglesias Gonsálvez
On Thu, 2016-12-01 at 18:55 -0800, Jason Ekstrand wrote:
> If you don't mind rebasing on it, my "get rid of nir_constant_data"
> patch should let you drop most of this and patch 5.
> 
> 

OK, thanks!

Sam

On Fri, Nov 25, 2016 at 12:52 AM, Juan A. Suarez Romero wrote:
From: Samuel Iglesias Gonsálvez 


Signed-off-by: Samuel Iglesias Gonsálvez 

---

 src/compiler/spirv/spirv_to_nir.c | 24 +---

 1 file changed, 17 insertions(+), 7 deletions(-)


diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c

index dadf7fc..8569bc8 100644

--- a/src/compiler/spirv/spirv_to_nir.c

+++ b/src/compiler/spirv/spirv_to_nir.c

@@ -98,14 +98,19 @@ vtn_const_ssa_value(struct vtn_builder *b, nir_constant 
*constant,

    case GLSL_TYPE_UINT:

    case GLSL_TYPE_BOOL:

    case GLSL_TYPE_FLOAT:

-   case GLSL_TYPE_DOUBLE:

+   case GLSL_TYPE_DOUBLE: {

+      int bit_size = glsl_get_bit_size(type);

       if (glsl_type_is_vector_or_scalar(type)) {

          unsigned num_components = glsl_get_vector_elements(val->type);

          nir_load_const_instr *load =

-            nir_load_const_instr_create(b->shader, num_components, 32);

+            nir_load_const_instr_create(b->shader, num_components, bit_size);


-         for (unsigned i = 0; i < num_components; i++)

-            load->value.u32[i] = constant->value.u[i];

+         for (unsigned i = 0; i < num_components; i++) {

+            if (bit_size == 64)

+               load->value.f64[i] = constant->value.d[i];

+            else

+               load->value.u32[i] = constant->value.u[i];

+         }


          nir_instr_insert_before_cf_list(&b->impl->body, &load->instr);

          val->def = &load->def;

@@ -119,10 +124,14 @@ vtn_const_ssa_value(struct vtn_builder *b, nir_constant 
*constant,

             struct vtn_ssa_value *col_val = rzalloc(b, struct vtn_ssa_value);

             col_val->type = glsl_get_column_type(val->type);

             nir_load_const_instr *load =

-               nir_load_const_instr_create(b->shader, rows, 32);

+               nir_load_const_instr_create(b->shader, rows, bit_size);


-            for (unsigned j = 0; j < rows; j++)

-               load->value.u32[j] = constant->value.u[rows * i + j];

+            for (unsigned j = 0; j < rows; j++) {

+               if (bit_size == 64)

+                  load->value.f64[j] = constant->value.d[rows * i + j];

+               else

+                  load->value.u32[j] = constant->value.u[rows * i + j];

+            }


             nir_instr_insert_before_cf_list(&b->impl->body, &load->instr);

             col_val->def = &load->def;

@@ -131,6 +140,7 @@ vtn_const_ssa_value(struct vtn_builder *b, nir_constant 
*constant,

          }

       }

       break;

+   }


    case GLSL_TYPE_ARRAY: {

       unsigned elems = glsl_get_length(val->type);
--

2.7.4


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/22] spirv/nir: implement DF conversions

2016-12-02 Thread Samuel Iglesias Gonsálvez
On Thu, 2016-12-01 at 18:50 -0800, Jason Ekstrand wrote:
> On Fri, Nov 25, 2016 at 12:52 AM, Juan A. Suarez Romero wrote:
> > From: Samuel Iglesias Gonsálvez 
> > 
> > SPIR-V does not have special opcodes for DF conversions. We need to identify
> > them by checking the bit size of the operand and the result.
> > 
> > Signed-off-by: Samuel Iglesias Gonsálvez 
> > ---
> >  src/compiler/spirv/spirv_to_nir.c | 29 ++---
> >  src/compiler/spirv/vtn_alu.c      | 37 +++--
> >  src/compiler/spirv/vtn_private.h  |  3 ++-
> >  3 files changed, 51 insertions(+), 18 deletions(-)
> > 
> > diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
> > index a13f72a..81c73da 100644
> > --- a/src/compiler/spirv/spirv_to_nir.c
> > +++ b/src/compiler/spirv/spirv_to_nir.c
> > @@ -1211,12 +1211,21 @@ vtn_handle_constant(struct vtn_builder *b, SpvOp opcode,
> > 
> >        default: {
> >           bool swap;
> > -         nir_op op = vtn_nir_alu_op_for_spirv_opcode(opcode, &swap);
> > -
> > -         unsigned num_components = glsl_get_vector_elements(val->const_type);
> >           unsigned bit_size =
> >              glsl_get_bit_size(val->const_type);
> > 
> > +         bool is_double_dst = bit_size == 64;
> > +         bool is_double_src = is_double_dst;
> > +         /* We assume there is no double conversion here */
> > +         assert(bit_size != 64 ||
> > +                (opcode != SpvOpConvertFToU && opcode != SpvOpConvertFToS &&
> > +                 opcode != SpvOpConvertSToF && opcode != SpvOpConvertUToF &&
> > +                 opcode != SpvOpFConvert));
> > +         nir_op op =
> > +            vtn_nir_alu_op_for_spirv_opcode(opcode, &swap,
> > +                                            is_double_dst, is_double_src);
> > +
> > +         unsigned num_components = glsl_get_vector_elements(val->const_type);
> >           nir_const_value src[4];
> >           assert(count <= 7);
> >           for (unsigned i = 0; i < count - 4; i++) {
> > @@ -1224,16 +1233,22 @@ vtn_handle_constant(struct vtn_builder *b, SpvOp opcode,
> >                 vtn_value(b, w[4 + i], vtn_value_type_constant)->constant;
> > 
> >              unsigned j = swap ? 1 - i : i;
> > -            assert(bit_size == 32);
> >              for (unsigned k = 0; k < num_components; k++)
> > -               src[j].u32[k] = c->value.u[k];
> > +               if (!is_double_src)
> > +                  src[j].u32[k] = c->value.u[k];
> > +               else
> > +                  src[j].f64[k] = c->value.d[k];
> >           }
> > 
> >           nir_const_value res = nir_eval_const_opcode(op, num_components,
> >                                                       bit_size, src);
> > 
> > -         for (unsigned k = 0; k < num_components; k++)
> > -            val->constant->value.u[k] = res.u32[k];
> > +         for (unsigned k = 0; k < num_components; k++) {
> > +            if (!is_double_dst)
> > +               val->constant->value.u[k] = res.u32[k];
> > +            else
> > +               val->constant->value.d[k] = res.f64[k];
> > +         }
> > 
> >           break;
> >        } /* default */
> > diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
> > index 95ff2b1..e444d3f 100644
> > --- a/src/compiler/spirv/vtn_alu.c
> > +++ b/src/compiler/spirv/vtn_alu.c
> > @@ -211,7 +211,8 @@ vtn_handle_matrix_alu(struct vtn_builder *b, SpvOp opcode,
> >  }
> > 
> >  nir_op
> > -vtn_nir_alu_op_for_spirv_opcode(SpvOp opcode, bool *swap)
> > +vtn_nir_alu_op_for_spirv_opcode(SpvOp opcode, bool *swap,
> > +                                bool is_double_dst, bool is_double_src)
> 
> I think it would be better if we did this as dst_bit_size and
> src_bit_size.  That would make this simpler for basically every
> caller.  Also, it makes it more 8/16-bit ready.
>  

OK.
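
As an illustration of the suggestion above, here is a minimal sketch of selecting a conversion op from source and destination bit sizes instead of two booleans. The enums and the function name are hypothetical stand-ins, not the real SPIR-V or NIR definitions; only the float/double split visible in the quoted patch is modelled.

```c
#include <assert.h>

/* Hypothetical stand-ins for the SPIR-V conversion opcodes. */
enum conv_opcode { CONV_F_TO_U, CONV_F_TO_S, CONV_S_TO_F, CONV_U_TO_F };

/* Hypothetical stand-ins for the ALU ops named in the patch. */
enum alu_op {
    OP_F2U, OP_F2I, OP_I2F, OP_U2F,   /* 32-bit float variants */
    OP_D2U, OP_D2I, OP_I2D, OP_U2D,   /* 64-bit double variants */
    OP_INVALID
};

/* Pick the ALU op from the opcode plus operand/result bit sizes.
 * Float-to-int conversions depend on the source size; int-to-float
 * conversions depend on the destination size. */
static enum alu_op
conv_op_for_bit_sizes(enum conv_opcode opcode,
                      unsigned src_bit_size, unsigned dst_bit_size)
{
    switch (opcode) {
    case CONV_F_TO_U: return src_bit_size == 64 ? OP_D2U : OP_F2U;
    case CONV_F_TO_S: return src_bit_size == 64 ? OP_D2I : OP_F2I;
    case CONV_S_TO_F: return dst_bit_size == 64 ? OP_I2D : OP_I2F;
    case CONV_U_TO_F: return dst_bit_size == 64 ? OP_U2D : OP_U2F;
    }
    return OP_INVALID;
}
```

A caller that already has the bit sizes at hand can pass them straight through, and further sizes such as 8 or 16 bits could later be handled inside the switch without changing the signature.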

> >  {
> >     /* Indicates that the first two arguments should be swapped.  This is
> >      * used for implementing greater-than and less-than-or-equal.
> > @@ -284,16 +285,21 @@ vtn_nir_alu_op_for_spirv_opcode(SpvOp opcode, bool *swap)
> >     case SpvOpFUnordGreaterThanEqual:               return nir_op_fge;
> > 
> >     /* Conversions: */
> > -   case SpvOpConvertFToU:           return nir_op_f2u;
> > -   case SpvOpConvertFToS:           return nir_op_f2i;
> > -   case SpvOpConvertSToF:           return nir_op_i2f;
> > -   case SpvOpConvertUToF:           return nir_op_u2f;
> > +   case SpvOpConvertFToU:           return is_double_src ? nir_op_d2u : nir_op_f2u;
> > +   case SpvOpConvertFToS:           return is_double_src ? nir_op_d2i : nir_op_f2i;
> > +   case SpvOpConvertSToF:           return is_double_dst ? nir_op_i2d : nir_op_i2f;
> > +   case SpvOpConvertUToF:           return is_double_dst ? nir_op_u2d : nir_op_u2f;
> 
> The