Re: [Mesa-dev] [Mesa-stable] [PATCH v2 1/3] i965/fs: fix indirect load DF uniforms on BSW/BXT

2017-02-21 Thread Samuel Iglesias Gonsálvez


On 21/02/17 21:07, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez  writes:
> 
>> On 20/02/17 21:31, Francisco Jerez wrote:
>>> Samuel Iglesias Gonsálvez  writes:
>>>
 On Mon, 2017-02-20 at 08:58 +0100, Samuel Iglesias Gonsálvez wrote:
> On Sat, 2017-02-18 at 18:58 -0800, Francisco Jerez wrote:
>> Samuel Iglesias Gonsálvez  writes:
>>
>>> The lowered BSW/BXT indirect move instructions had incorrect
>>> source types, which luckily wasn't causing incorrect assembly to
>>> be
>>> generated due to the bug fixed in the next patch, but would have
>>> confused the remaining back-end IR infrastructure due to the
>>> mismatch
>>> between the IR source types and the emitted machine code.
>>>
>>> v2:
>>> - Improve commit log (Curro)
>>> - Fix read_size (Curro)
>>> - Fix DF uniform array detection in assign_constant_locations() when
>>>   it is accessed with 32-bit MOV_INDIRECTs in BSW/BXT.
>>>
>>> Signed-off-by: Samuel Iglesias Gonsálvez 
>>> Cc: "17.0" 
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 11 -
>>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 41 --
>>> --
>>>  2 files changed, 30 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> index c348bc7138d..93ab84b5845 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>>> @@ -1944,10 +1944,19 @@ fs_visitor::assign_constant_locations()
>>>  unsigned last = constant_nr + (inst->src[2].ud / 4) - 1;
>>>  assert(last < uniforms);
>>>  
>>> +bool supports_64bit_indirects =
>>> +   !devinfo->is_cherryview && !devinfo->is_broxton;
>>> +/* Detect if this is as a result 64-bit MOV INDIRECT. In case
>>> + * of BSW/BXT, we substitute each DF MOV INDIRECT by two 32-bit MOV
>>> + * INDIRECT.
>>> + */
>>> +bool mov_indirect_64bit = (type_sz(inst->src[i].type) == 8) ||
>>> +   (!supports_64bit_indirects && inst->dst.type == BRW_REGISTER_TYPE_UD &&
>>> +inst->src[0].type == BRW_REGISTER_TYPE_UD && inst->dst.stride == 2);
>>
>> This seems kind of fragile, I don't think the optimizer gives you
>> any
>> guarantees that the stride of a lowered 64-bit indirect move will
>> remain
>> equal to two, or that the destination stride of an actual 32-bit
>> indirect uniform load will never end up being two as well.  That
>> said,
>> because you access these with 32-bit indirect moves, I don't see
>> why
>> you'd need to treat them as 64-bit uniforms here, the usual
>> alignment
>> requirements for 64-bit uniforms no longer apply, so you can treat
>> them
>> as regular 32-bit uniforms AFAICT.  Why did you add this hunk?
>>
>
> I added it because of this case: if we access one DF uniform array
> element with a normal MOV and the rest with MOV INDIRECT, we will mark
> the former as a live 64-bit variable. Then the array ends up scattered,
> as part of it is uploaded as a 64-bit uniform and the rest as 32-bit.
> Even if we avoid this by uploading everything together as 32-bit, the
> access to that DF might not be aligned to 64 bits.
>
> So my idea was to find a way to identify those MOV INDIRECTs in
> BSW and mark the whole array as a 64-bit one.
>

 Mmm, maybe I can fix this case without the hack I did. I can add the
 following code after marking all live variables accessed by the
 instructions.

 It is very similar to the code that detects live variables, but it fixes
 the case where a MOV INDIRECT in BSW accesses a uniform array
 of DF elements where one of these elements is directly accessed by
 another instruction.

 What do you think?

>>>
>>> Looks somewhat better, but I don't think this is correct if you have
>>> multiple overlapping indirect loads of the same uniform array and only
>>> one of them overlaps with a direct 64-bit load.
>>
>> In that case, I think this is correct. The two 32-bit MOV INDIRECTs were
>> emitted as a "lowering" of an unsupported 64-bit MOV INDIRECT. They both
>> keep the 'read_size' of the original one, so they both overlap with any
>> other direct 64-bit load of that array, just like the original
>> instruction. If neither of them overlaps with the direct 64-bit access, then I
>> think they can be handled as non-contiguous to the latter without any issue.
>>
> 
> What if you have two 

Re: [Mesa-dev] [RFC] spec: MESA_program_binary

2017-02-21 Thread Timothy Arceri



On 22/02/17 06:11, Ian Romanick wrote:

On 02/16/2017 04:33 PM, Timothy Arceri wrote:

On 17/02/17 10:44, Ian Romanick wrote:

On 02/15/2017 11:58 PM, Timothy Arceri wrote:

On 16/02/17 17:55, Tapani Pälli wrote:


On 02/16/2017 04:52 AM, Timothy Arceri wrote:

In order to add functionality to ARB_get_program_binary we need
binary format enums.


My understanding is that this is a driver-internal enumeration. When the
application gets the binary, it also receives an enum (integer value) telling
which format we gave it. Then, when loading, the application needs to query
which formats are supported by the implementation and load the correct
binary.
We just need to agree internally on the format list and return
the correct one matching the current driver in use?


Not that it's actually likely to happen but if we were to only have a
single MESA enum an application could only distribute a single binary.


Applications really, really, *REALLY* should not distribute binaries
retrieved from the driver.  The intention of this extension is for
applications to implement their own shader cache, for example, at
application installation.  The driver can reject the binary at any time
for any reason.  Driver changes, hardware changes, OS changes, phase of
the moon, etc.

Looking at the GLES extension registry, it appears that the other
vendors have just a single binary for all the hardware they make.  Based
on that, having a single Mesa enum isn't an insane idea.  We would just
need to agree on the format of the header so that the driver receiving
the blob could determine which driver generated the blob.


The only other thing to consider with a single enum is that it will
require a laptop with an Intel cpu and Nvidia gpu for example to
recompile the binary if the user were to switch between using the Intel
and Nvidia gpus. This might happen depending on if the laptop is plugged
into a power source or not.

If we don't care about this then one enum is fine.


Hm... I think we care, but I don't think multiple enums will help
existing apps... but maybe?  I imagine the usual scenario is:

- User runs first time on nouveau.

- Application saves binaries from nouveau.

- User runs second time on i965.

- Application submits binary from nouveau.

- Application deletes its binary cache, resubmits from source, resaves
binaries from i965.

- User runs third time on nouveau.

- Application submits binary from i965.

- Application deletes its binary cache, resubmits from source, resaves
binaries from nouveau.

- Lather

- Rinse

- Repeat

It seems like if we actually care about this configuration, we'd need a
more complex solution.  It's not 100% clear what that solution would be
or how we would be able to implement it.  I think the right solution is
a driver-side shader cache that is smart enough to track binaries from
multiple drivers without stomping on each other.  Right? :)


Yeah ok, I hadn't thought about apps ignoring the enum and just keeping
a single copy. I guess you're right, we should eventually have a
driver-side cache by default, so we can just package up the cache items and
return them.


I'm happy with a one enum solution :)





e.g. either for AMD, INTEL or NVIDIA but not one for each. That is unless
we were to compile and pack all GPU vendor binaries at the same time,
which seems overly complicated and expensive.

I could see an internal id being used for GPU generations from hardware
vendors.
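
To illustrate the single-enum idea discussed above: one way the receiving driver could tell which driver produced a blob is a small header at the start of the binary. This is only a sketch under assumptions. The struct layout, the magic value, and the names `mesa_blob_header` and `blob_matches_driver` are hypothetical and are not part of any Mesa API or of the proposed extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header placed at the start of every program binary. */
struct mesa_blob_header {
   uint32_t magic;           /* constant identifying a Mesa blob */
   char     driver_name[16]; /* e.g. "i965" or "nouveau", NUL padded */
   uint32_t driver_version;  /* bumped whenever the binary layout changes */
};

#define MESA_BLOB_MAGIC 0x4d455341u /* "MESA" in ASCII (assumed value) */

/* Returns true only if the blob was produced by the same driver and version.
 * Anything else is rejected, and the application falls back to source. */
static bool
blob_matches_driver(const void *blob, size_t size,
                    const char *driver_name, uint32_t driver_version)
{
   struct mesa_blob_header hdr;

   assert(driver_name != NULL);
   if (blob == NULL || size < sizeof(hdr))
      return false;

   /* Copy out the header so unaligned blobs are handled safely. */
   memcpy(&hdr, blob, sizeof(hdr));
   if (hdr.magic != MESA_BLOB_MAGIC)
      return false;
   if (strncmp(hdr.driver_name, driver_name, sizeof(hdr.driver_name)) != 0)
      return false;
   return hdr.driver_version == driver_version;
}
```

With a check like this, a blob saved while running on nouveau would simply be rejected when loaded on i965, which matches the "driver can reject the binary at any time" rule quoted earlier in the thread.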



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/8] gallium: add get_disk_shader_cache() callback

2017-02-21 Thread Michel Dänzer
On 22/02/17 12:45 PM, Timothy Arceri wrote:
> 
> +get_disk_shader_cache
> +^^^^^^^^^^^^^^^^^^^^^
> +
> +Returns a pointer to driver-specific on-disk shader cache. If the driver
> +failed to create the cache or does not support an on-disk shader cache NULL is
> +returned.

[...]

> +   /**
> +* Returns a pointer to driver-specific on-disk shader cache. If the driver
> +* failed to create the cache or does not support an on-disk shader cache
> +* NULL is returned.
> +*/
> +   struct disk_cache *(*get_disk_shader_cache)(struct pipe_screen *screen);
>  };

Drivers which don't support an on-disk shader cache don't set this
callback in the first place, right? :) (Just a suggestion for
improvement before landing this patch, not a blocker, no need to resend)
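
A minimal sketch of the point above, assuming callers are expected to tolerate an unset callback: the state tracker checks the function pointer before calling it, so "callback not set" and "callback returned NULL" both mean "no cache". The struct definitions are cut-down stand-ins for the real gallium types, and `screen_get_disk_cache` and `example_get_cache` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the real gallium types; only what the sketch needs. */
struct disk_cache { int unused; };

struct pipe_screen {
   struct disk_cache *(*get_disk_shader_cache)(struct pipe_screen *screen);
};

/* Hypothetical helper: treat "callback not set" the same as "no cache". */
static struct disk_cache *
screen_get_disk_cache(struct pipe_screen *screen)
{
   assert(screen != NULL);
   if (screen->get_disk_shader_cache == NULL)
      return NULL;
   return screen->get_disk_shader_cache(screen);
}

/* Example driver-side callback that hands out a single static cache. */
static struct disk_cache example_cache;

static struct disk_cache *
example_get_cache(struct pipe_screen *screen)
{
   (void)screen;
   return &example_cache;
}
```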


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH] anv: implement pipeline statistics queries

2017-02-21 Thread Ilia Mirkin
On Wed, Feb 22, 2017 at 12:07 AM, Jason Ekstrand  wrote:
> Hey Look.  I'm actually reading your patch now!
>
> I read through the whole thing and over-all I think it looks fairly good.
>
> On Sun, Nov 27, 2016 at 11:23 AM, Ilia Mirkin  wrote:
>>
>> The strategy is to just keep n anv_query_pool_slot entries per query
>> instead of one. The available bit is only valid in the last one.
>
>
> Seems like a reasonable approach.  To be honest, I'm not a huge fan of the
> "available" bit (or 64 bits as the case may be) but I'm not sure how we'd
> get away without it.
>
> Maybe it would be better to do something like:
>
> struct anv_query_entry {
>uint64_t begin;
>uint64_t end;
> };
>
> struct anv_query_pool_slot {
>uint64_t available;
>struct anv_query_entry entries[0];
> };
>
> Food for thought.

Seems reasonable.

>
>>
>> Signed-off-by: Ilia Mirkin 
>> ---
>>
>> I think this is in a pretty good state now. I've tested both the direct
>> and
>> buffer paths with a hacked up cube application, and I'm seeing
>> non-ridiculous
>> values for the various counters, although I haven't 100% verified them for
>> accuracy.
>>
>> This also implements the hsw/bdw workaround for dividing frag invocations
>> by 4, copied from hsw_queryobj. I tested this on SKL and it seems to divide
>> the values as expected.
>>
>> The cube patch I've been testing with is at
>> http://paste.debian.net/899374/
>> You can flip between copying to a buffer and explicit retrieval by
>> commenting
>> out the relevant function calls.
>>
>>  src/intel/vulkan/anv_device.c  |   2 +-
>>  src/intel/vulkan/anv_private.h |   4 +
>>  src/intel/vulkan/anv_query.c   |  99 ++
>>  src/intel/vulkan/genX_cmd_buffer.c | 260
>> -
>>  4 files changed, 308 insertions(+), 57 deletions(-)
>>
>> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
>> index 99eb73c..7ad1970 100644
>> --- a/src/intel/vulkan/anv_device.c
>> +++ b/src/intel/vulkan/anv_device.c
>> @@ -427,7 +427,7 @@ void anv_GetPhysicalDeviceFeatures(
>>.textureCompressionASTC_LDR   = pdevice->info.gen >= 9,
>> /* FINISHME CHV */
>>.textureCompressionBC = true,
>>.occlusionQueryPrecise= true,
>> -  .pipelineStatisticsQuery  = false,
>> +  .pipelineStatisticsQuery  = true,
>>.fragmentStoresAndAtomics = true,
>>.shaderTessellationAndGeometryPointSize   = true,
>>.shaderImageGatherExtended= false,
>> diff --git a/src/intel/vulkan/anv_private.h
>> b/src/intel/vulkan/anv_private.h
>> index 2fc543d..7271609 100644
>> --- a/src/intel/vulkan/anv_private.h
>> +++ b/src/intel/vulkan/anv_private.h
>> @@ -1763,6 +1763,8 @@ struct anv_render_pass {
>> struct anv_subpass   subpasses[0];
>>  };
>>
>> +#define ANV_PIPELINE_STATISTICS_COUNT 11
>> +
>>  struct anv_query_pool_slot {
>> uint64_t begin;
>> uint64_t end;
>> @@ -1772,6 +1774,8 @@ struct anv_query_pool_slot {
>>  struct anv_query_pool {
>> VkQueryType  type;
>> uint32_t slots;
>> +   uint32_t pipeline_statistics;
>> +   uint32_t slot_stride;
>> struct anv_bo bo;
>>  };
>>
>> diff --git a/src/intel/vulkan/anv_query.c b/src/intel/vulkan/anv_query.c
>> index 293257b..dc00859 100644
>> --- a/src/intel/vulkan/anv_query.c
>> +++ b/src/intel/vulkan/anv_query.c
>> @@ -38,8 +38,10 @@ VkResult anv_CreateQueryPool(
>> ANV_FROM_HANDLE(anv_device, device, _device);
>> struct anv_query_pool *pool;
>> VkResult result;
>> -   uint32_t slot_size;
>> -   uint64_t size;
>> +   uint32_t slot_size = sizeof(struct anv_query_pool_slot);
>>
>> +   uint32_t slot_stride = 1;
>
>
> Strides are usually in bytes, not slots...

slot_pitch? :) IMHO stride/pitch doesn't have a unit implied by the
name itself. I'm pretty sure I've seen it refer to both bytes and
higher-level units.
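
To make the units question concrete, here is a small sketch of the size computation with the stride kept in slots and converted to bytes in one place, after the stride is known. It is written under assumptions: `query_slot` is assumed to mirror `anv_query_pool_slot` with a third `available` field, `bitcount32` stands in for `_mesa_bitcount()`, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STATISTICS_COUNT 11 /* mirrors ANV_PIPELINE_STATISTICS_COUNT in the patch */

/* Assumed shape of anv_query_pool_slot: begin, end, available. */
struct query_slot {
   uint64_t begin;
   uint64_t end;
   uint64_t available;
};

/* Portable population count; the patch uses _mesa_bitcount() for this. */
static uint32_t
bitcount32(uint32_t v)
{
   uint32_t n = 0;
   for (; v != 0; v &= v - 1) /* clear the lowest set bit each iteration */
      n++;
   return n;
}

/* Stride expressed in slots: one slot per enabled statistic. */
static uint32_t
slot_stride_in_slots(uint32_t statistics_mask)
{
   return bitcount32(statistics_mask & ((1u << STATISTICS_COUNT) - 1));
}

/* Total pool size in bytes, computed once the stride is known. */
static uint64_t
pool_size_bytes(uint32_t query_count, uint32_t statistics_mask)
{
   uint64_t stride_bytes =
      (uint64_t)slot_stride_in_slots(statistics_mask) * sizeof(struct query_slot);

   assert(stride_bytes > 0); /* at least one statistic must be requested */
   return (uint64_t)query_count * stride_bytes;
}
```

Naming the helper after its unit (`_in_slots`, `_bytes`) sidesteps the stride/pitch ambiguity without changing the arithmetic.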

>
>>
>> +   uint64_t size = pCreateInfo->queryCount * slot_size;
>
>
> Might make sense to move this to after we compute the slot_stride.
>
>>
>> +   uint32_t pipeline_statistics = 0;
>>
>> assert(pCreateInfo->sType ==
>> VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO);
>>
>> @@ -48,12 +50,16 @@ VkResult anv_CreateQueryPool(
>> case VK_QUERY_TYPE_TIMESTAMP:
>>break;
>> case VK_QUERY_TYPE_PIPELINE_STATISTICS:
>> -  return VK_ERROR_INCOMPATIBLE_DRIVER;
>> +  pipeline_statistics = pCreateInfo->pipelineStatistics &
>> + ((1 << ANV_PIPELINE_STATISTICS_COUNT) - 1);
>> +  slot_stride = _mesa_bitcount(pipeline_statistics);
>> +  size *= slot_stride;
>> +  break;
>> default:
>>assert(!"Invalid query type");
>> +

Re: [Mesa-dev] [PATCH] android: glsl: build shader cache sources

2017-02-21 Thread Tapani Pälli



On 02/20/2017 07:10 PM, Emil Velikov wrote:

On 20 February 2017 at 09:31, Tapani Pälli  wrote:

On 02/19/2017 03:06 AM, Timothy Arceri wrote:


I would have thought this commit [1] should have fixed it for android as
well as scons.

[1]

https://cgit.freedesktop.org/mesa/mesa/commit/?id=172c48cc15e2a7b42a7de8ff9164ad8733155667



Problem is that we have ENABLE_SHADER_CACHE on because it is linked to
having SHA1 available, see following commit:

9f8dc3bf03ec825bae7041858dda6ca2e9a34363

Not sure how we should deal with shader cache on Android; it will probably
require some custom location to write to, otherwise I guess it should work
similarly to desktop.


Until 'the correct' place is established one can use
MESA_GLSL_CACHE_DIR envvar ;-)



That's true, we can have some testing done with this to figure out a 
good place.


Thanks;

// Tapani


Re: [Mesa-dev] [PATCH] anv: implement pipeline statistics queries

2017-02-21 Thread Jason Ekstrand
Hey Look.  I'm actually reading your patch now!

I read through the whole thing and over-all I think it looks fairly good.

On Sun, Nov 27, 2016 at 11:23 AM, Ilia Mirkin  wrote:

> The strategy is to just keep n anv_query_pool_slot entries per query
> instead of one. The available bit is only valid in the last one.
>

Seems like a reasonable approach.  To be honest, I'm not a huge fan of the
"available" bit (or 64 bits as the case may be) but I'm not sure how we'd
get away without it.

Maybe it would be better to do something like:

struct anv_query_entry {
   uint64_t begin;
   uint64_t end;
};

struct anv_query_pool_slot {
   uint64_t available;
   struct anv_query_entry entries[0];
};

Food for thought.
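
As a rough sketch of what the layout suggested above would imply for sizing and addressing: a single `available` word followed by one begin/end pair per enabled statistic. The names `query_entry`, `query_slot_fam`, `query_slot_size` and `query_entry_offset` are hypothetical, and the sketch uses a C99 flexible array member where the suggestion uses the GNU `entries[0]` form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct query_entry {
   uint64_t begin;
   uint64_t end;
};

struct query_slot_fam {
   uint64_t available;
   struct query_entry entries[]; /* C99 flexible array member */
};

/* Bytes needed for one query holding entry_count begin/end pairs. */
static size_t
query_slot_size(uint32_t entry_count)
{
   return sizeof(struct query_slot_fam) +
          (size_t)entry_count * sizeof(struct query_entry);
}

/* Byte offset of entry `entry` of query `query` inside the pool buffer. */
static size_t
query_entry_offset(uint32_t query, uint32_t entry, uint32_t entry_count)
{
   assert(entry < entry_count);
   return (size_t)query * query_slot_size(entry_count) +
          offsetof(struct query_slot_fam, entries) +
          (size_t)entry * sizeof(struct query_entry);
}
```

Compared with n full slots per query, this stores the availability word once per query instead of once per statistic.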


> Signed-off-by: Ilia Mirkin 
> ---
>
> I think this is in a pretty good state now. I've tested both the direct and
> buffer paths with a hacked up cube application, and I'm seeing
> non-ridiculous
> values for the various counters, although I haven't 100% verified them for
> accuracy.
>
> This also implements the hsw/bdw workaround for dividing frag invocations
> by 4, copied from hsw_queryobj. I tested this on SKL and it seems to divide
> the values as expected.
>
> The cube patch I've been testing with is at http://paste.debian.net/899374/
> You can flip between copying to a buffer and explicit retrieval by
> commenting
> out the relevant function calls.
>
>  src/intel/vulkan/anv_device.c  |   2 +-
>  src/intel/vulkan/anv_private.h |   4 +
>  src/intel/vulkan/anv_query.c   |  99 ++
>  src/intel/vulkan/genX_cmd_buffer.c | 260 ++
> ++-
>  4 files changed, 308 insertions(+), 57 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 99eb73c..7ad1970 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -427,7 +427,7 @@ void anv_GetPhysicalDeviceFeatures(
>.textureCompressionASTC_LDR   = pdevice->info.gen >= 9,
> /* FINISHME CHV */
>.textureCompressionBC = true,
>.occlusionQueryPrecise= true,
> -  .pipelineStatisticsQuery  = false,
> +  .pipelineStatisticsQuery  = true,
>.fragmentStoresAndAtomics = true,
>.shaderTessellationAndGeometryPointSize   = true,
>.shaderImageGatherExtended= false,
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index 2fc543d..7271609 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -1763,6 +1763,8 @@ struct anv_render_pass {
> struct anv_subpass   subpasses[0];
>  };
>
> +#define ANV_PIPELINE_STATISTICS_COUNT 11
> +
>  struct anv_query_pool_slot {
> uint64_t begin;
> uint64_t end;
> @@ -1772,6 +1774,8 @@ struct anv_query_pool_slot {
>  struct anv_query_pool {
> VkQueryType  type;
> uint32_t slots;
> +   uint32_t pipeline_statistics;
> +   uint32_t slot_stride;
> struct anv_bo bo;
>  };
>
> diff --git a/src/intel/vulkan/anv_query.c b/src/intel/vulkan/anv_query.c
> index 293257b..dc00859 100644
> --- a/src/intel/vulkan/anv_query.c
> +++ b/src/intel/vulkan/anv_query.c
> @@ -38,8 +38,10 @@ VkResult anv_CreateQueryPool(
> ANV_FROM_HANDLE(anv_device, device, _device);
> struct anv_query_pool *pool;
> VkResult result;
> -   uint32_t slot_size;
> -   uint64_t size;
> +   uint32_t slot_size = sizeof(struct anv_query_pool_slot);
>
> +   uint32_t slot_stride = 1;
>

Strides are usually in bytes, not slots...


> +   uint64_t size = pCreateInfo->queryCount * slot_size;
>

Might make sense to move this to after we compute the slot_stride.


> +   uint32_t pipeline_statistics = 0;
>
> assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO);
>
> @@ -48,12 +50,16 @@ VkResult anv_CreateQueryPool(
> case VK_QUERY_TYPE_TIMESTAMP:
>break;
> case VK_QUERY_TYPE_PIPELINE_STATISTICS:
> -  return VK_ERROR_INCOMPATIBLE_DRIVER;
> +  pipeline_statistics = pCreateInfo->pipelineStatistics &
> + ((1 << ANV_PIPELINE_STATISTICS_COUNT) - 1);
> +  slot_stride = _mesa_bitcount(pipeline_statistics);
> +  size *= slot_stride;
> +  break;
> default:
>assert(!"Invalid query type");
> +  return VK_ERROR_INCOMPATIBLE_DRIVER;
> }
>
> -   slot_size = sizeof(struct anv_query_pool_slot);
> pool = vk_alloc2(&device->alloc, pAllocator, sizeof(*pool), 8,
>   VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
> if (pool == NULL)
> @@ -61,8 +67,9 @@ VkResult anv_CreateQueryPool(
>
> pool->type = pCreateInfo->queryType;
> pool->slots = pCreateInfo->queryCount;
> +   

[Mesa-dev] [PATCH v3] radeonsi, r600g: Alias 'R600_DEBUG' with 'RADEON_DEBUG'

2017-02-21 Thread Edward O'Callaghan
The name has become a little misleading now that it applies
to both r600g and radeonsi.

V.2: Michel Dänzer - R600_DEBUG must continue to work.
V.3: fixup missed case in V.2.

Signed-off-by: Edward O'Callaghan 
---
 src/gallium/drivers/r600/r600_pipe.c  | 1 +
 src/gallium/drivers/radeon/r600_pipe_common.c | 2 ++
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 +++-
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 4 +++-
 4 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_pipe.c 
b/src/gallium/drivers/r600/r600_pipe.c
index 1803c26..f4ab0ee 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -641,6 +641,7 @@ struct pipe_screen *r600_screen_create(struct radeon_winsys 
*ws)
}
 
rscreen->b.debug_flags |= debug_get_flags_option("R600_DEBUG", 
r600_debug_options, 0);
+   rscreen->b.debug_flags |= debug_get_flags_option("RADEON_DEBUG", 
r600_debug_options, 0);
if (debug_get_bool_option("R600_DEBUG_COMPUTE", FALSE))
rscreen->b.debug_flags |= DBG_COMPUTE;
if (debug_get_bool_option("R600_DUMP_SHADERS", FALSE))
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 1781584..5670c41 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -1257,7 +1257,9 @@ bool r600_common_screen_init(struct r600_common_screen 
*rscreen,
rscreen->ws = ws;
rscreen->family = rscreen->info.family;
rscreen->chip_class = rscreen->info.chip_class;
+
rscreen->debug_flags = debug_get_flags_option("R600_DEBUG", 
common_debug_options, 0);
+   rscreen->debug_flags |= debug_get_flags_option("RADEON_DEBUG", 
common_debug_options, 0);
 
slab_create_parent(&rscreen->pool_transfers, sizeof(struct r600_transfer), 64);
 
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
index da9371d..0531f92 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
@@ -372,7 +372,9 @@ static bool do_winsys_init(struct amdgpu_winsys *ws, int fd)
if (ws->info.chip_class == SI)
   ws->info.gfx_ib_pad_with_type2 = TRUE;
 
-   ws->check_vm = strstr(debug_get_option("R600_DEBUG", ""), "check_vm") != 
NULL;
+   if ((strstr(debug_get_option("R600_DEBUG", ""), "check_vm") != NULL) ||
+   (strstr(debug_get_option("RADEON_DEBUG", ""), "check_vm") != NULL))
+  ws->check_vm = true;
 
return true;
 
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index a39a7be..e070d29 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -526,7 +526,9 @@ static bool do_winsys_init(struct radeon_drm_winsys *ws)
  ws->accel_working2 < 3);
 ws->info.tcc_cache_line_size = 64; /* TC L2 line size on GCN */
 
-ws->check_vm = strstr(debug_get_option("R600_DEBUG", ""), "check_vm") != 
NULL;
+if ((strstr(debug_get_option("R600_DEBUG", ""), "check_vm") != NULL) ||
+   (strstr(debug_get_option("RADEON_DEBUG", ""), "check_vm") != NULL))
+   ws->check_vm = true;
 
 return true;
 }
-- 
2.9.3
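
The aliasing pattern used in the winsys hunks above can be sketched in isolation with plain `getenv()`: a flag counts as set if it appears in either variable. This is an illustration only. It does not use Mesa's `debug_get_option()` helpers, and `env_has_flag` and `debug_flag_enabled` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* True if `flag` appears in the value of environment variable `name`.
 * This is a substring match, like the strstr() calls in the patch. */
static bool
env_has_flag(const char *name, const char *flag)
{
   const char *value = getenv(name);
   return value != NULL && strstr(value, flag) != NULL;
}

/* The old name keeps working; the new name is accepted as an alias. */
static bool
debug_flag_enabled(const char *flag)
{
   assert(flag != NULL);
   return env_has_flag("R600_DEBUG", flag) ||
          env_has_flag("RADEON_DEBUG", flag);
}
```

Because the two lookups are OR-ed together, flags given in both variables accumulate rather than one variable overriding the other, which is also what the `|=` in the driver hunks does.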



[Mesa-dev] [PATCH] radv: fetch sample index via fmask for image coord as well.

2017-02-21 Thread Dave Airlie
From: Dave Airlie 

This follows the txf_ms code, I can't figure out why amdgpu-pro
doesn't do this in their shaders, they must know something we don't.

This fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_id.*

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 180 
 1 file changed, 126 insertions(+), 54 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index d02531b..1e6c979 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2367,60 +2367,6 @@ static int image_type_to_components_count(enum 
glsl_sampler_dim dim, bool array)
return 0;
 }
 
-static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
-nir_intrinsic_instr *instr)
-{
-   const struct glsl_type *type = instr->variables[0]->var->type;
-   if(instr->variables[0]->deref.child)
-   type = instr->variables[0]->deref.child->type;
-
-   LLVMValueRef src0 = get_src(ctx, instr->src[0]);
-   LLVMValueRef coords[4];
-   LLVMValueRef masks[] = {
-   LLVMConstInt(ctx->i32, 0, false), LLVMConstInt(ctx->i32, 1, 
false),
-   LLVMConstInt(ctx->i32, 2, false), LLVMConstInt(ctx->i32, 3, 
false),
-   };
-   LLVMValueRef res;
-   int count;
-   enum glsl_sampler_dim dim = glsl_get_sampler_dim(type);
-   bool add_frag_pos = (dim == GLSL_SAMPLER_DIM_SUBPASS ||
-dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
-   bool is_ms = (dim == GLSL_SAMPLER_DIM_MS ||
- dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
-
-   count = image_type_to_components_count(dim,
-  
glsl_sampler_type_is_array(type));
-
-   if (count == 1) {
-   if (instr->src[0].ssa->num_components)
-   res = LLVMBuildExtractElement(ctx->builder, src0, 
masks[0], "");
-   else
-   res = src0;
-   } else {
-   int chan;
-   if (is_ms)
-   count--;
-   for (chan = 0; chan < count; ++chan) {
-   coords[chan] = LLVMBuildExtractElement(ctx->builder, 
src0, masks[chan], "");
-   }
-
-   if (add_frag_pos) {
-   for (chan = 0; chan < count; ++chan)
-   coords[chan] = LLVMBuildAdd(ctx->builder, 
coords[chan], LLVMBuildFPToUI(ctx->builder, ctx->frag_pos[chan], ctx->i32, ""), 
"");
-   }
-   if (is_ms) {
-   coords[count] = llvm_extract_elem(ctx, get_src(ctx, 
instr->src[1]), 0);
-   count++;
-   }
-
-   if (count == 3) {
-   coords[3] = LLVMGetUndef(ctx->i32);
-   count = 4;
-   }
-   res = ac_build_gather_values(&ctx->ac, coords, count);
-   }
-   return res;
-}
 
 static void build_type_name_for_intr(
 LLVMTypeRef type,
@@ -2483,6 +2429,132 @@ static void get_image_intr_name(const char *base_name,
 }
 }
 
+static LLVMValueRef get_image_coords(struct nir_to_llvm_context *ctx,
+nir_intrinsic_instr *instr)
+{
+   const struct glsl_type *type = instr->variables[0]->var->type;
+   if(instr->variables[0]->deref.child)
+   type = instr->variables[0]->deref.child->type;
+
+   LLVMValueRef src0 = get_src(ctx, instr->src[0]);
+   LLVMValueRef coords[4];
+   LLVMValueRef masks[] = {
+   LLVMConstInt(ctx->i32, 0, false), LLVMConstInt(ctx->i32, 1, 
false),
+   LLVMConstInt(ctx->i32, 2, false), LLVMConstInt(ctx->i32, 3, 
false),
+   };
+   LLVMValueRef res;
+   LLVMValueRef sample_index = llvm_extract_elem(ctx, get_src(ctx, 
instr->src[1]), 0);
+
+   int count;
+   enum glsl_sampler_dim dim = glsl_get_sampler_dim(type);
+   bool add_frag_pos = (dim == GLSL_SAMPLER_DIM_SUBPASS ||
+dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
+   bool is_ms = (dim == GLSL_SAMPLER_DIM_MS ||
+ dim == GLSL_SAMPLER_DIM_SUBPASS_MS);
+
+   count = image_type_to_components_count(dim,
+  
glsl_sampler_type_is_array(type));
+
+   if (is_ms) {
+   LLVMValueRef fmask_load_address[4];
+   LLVMValueRef params[7];
+   LLVMValueRef glc = LLVMConstInt(ctx->i1, 0, false);
+   LLVMValueRef slc = LLVMConstInt(ctx->i1, 0, false);
+   LLVMValueRef da = ctx->i32zero;
+   char intrinsic_name[64];
+   int chan;
+   fmask_load_address[0] = LLVMBuildExtractElement(ctx->builder, 
src0, masks[0], "");
+   fmask_load_address[1] = LLVMBuildExtractElement(ctx->builder, 
src0, masks[1], "");
+

Re: [Mesa-dev] V4 TGSI on-disk shader cache

2017-02-21 Thread Edward O'Callaghan
The rest of this series is,
Reviewed-by: Edward O'Callaghan 

On 02/22/2017 02:45 PM, Timothy Arceri wrote:
> Changes in V4:
> 
> - split tgsi cache code into its own file
> - add missing fallback for tgsi cache miss
> - share the sha1 generated by the load function with the 
>   store function like in the glsl ir cache.
> - add get_disk_shader_cache() to the pass-throughs
> - add get_disk_shader_cache() description to screen.rst 
> - bug fix for old cache dir deletion
> 
> 





[Mesa-dev] [AppVeyor] mesa master #3526 completed

2017-02-21 Thread AppVeyor


Build mesa 3526 completed



Commit b87ef9e606 by Brian Paul on 2/21/2017 10:52 PM:

util: fix MSVC build issue in disk_cache.h

Windows doesn't have dlfcn.h.  Protect the code in question
with #if ENABLE_SHADER_CACHE test.  And fix indentation.

Reviewed-by: Timothy Arceri
Reviewed-by: Roland Scheidegger




Re: [Mesa-dev] [PATCH] util: fix MSVC build issue in disk_cache.h

2017-02-21 Thread Brian Paul

On 02/21/2017 05:48 PM, Timothy Arceri wrote:



On 22/02/17 09:59, Brian Paul wrote:

On 02/21/2017 03:57 PM, Brian Paul wrote:

Windows doesn't have dlfcn.h.  Protect the code in question
with #if ENABLE_SHADER_CACHE test.
---
  src/util/disk_cache.h | 26 --
  1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/util/disk_cache.h b/src/util/disk_cache.h
index 8b6fc0d..7f4da80 100644
--- a/src/util/disk_cache.h
+++ b/src/util/disk_cache.h
@@ -24,7 +24,9 @@
  #ifndef DISK_CACHE_H
  #define DISK_CACHE_H

+#ifdef ENABLE_SHADER_CACHE
  #include <dlfcn.h>
+#endif
  #include 
  #include 
  #include 
@@ -43,16 +45,20 @@ struct disk_cache;
  static inline bool
  disk_cache_get_function_timestamp(void *ptr, uint32_t* timestamp)
  {
-Dl_info info;
-struct stat st;
-if (!dladdr(ptr, &info) || !info.dli_fname) {
-return false;
-}
-if (stat(info.dli_fname, &st)) {
-return false;
-}
-*timestamp = st.st_mtim.tv_sec;
-return true;
+#ifdef ENABLE_SHADER_CACHE
+   Dl_info info;
+   struct stat st;
+   if (!dladdr(ptr, &info) || !info.dli_fname) {
+      return false;
+   }
+   if (stat(info.dli_fname, &st)) {
+  return false;
+   }
+   *timestamp = st.st_mtim.tv_sec;
+   return true;
+#else
+   return false;
+#endif
  }

  /* Provide inlined stub functions if the shader cache is disabled. */




Timothy,

Does this function really need to be inlined?  AFAICT, it's not called
on a performance critical path.  Moving it into the .c file would seem
to be cleaner.


No not really, feel free to move it. Either way this patch is:

Reviewed-by: Timothy Arceri 


Thanks.  I pushed this patch as-is.

I also tried moving the function to the .c file but I get a link error 
that dladdr() is undefined.  It's a _GNU_SOURCE extension.  It looks 
like _GNU_SOURCE is defined, but something else is wrong.  My WIP is 
attached if you want to take a crack at it.


-Brian


diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index 2f138da..7b6b460 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -24,6 +24,7 @@
 #ifdef ENABLE_SHADER_CACHE
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -37,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "util/u_atomic.h"
 #include "util/mesa-sha1.h"
@@ -187,6 +189,21 @@ create_mesa_cache_dir(void *mem_ctx, char *path, const char *timestamp,
return new_path;
 }
 
+bool
+disk_cache_get_function_timestamp(void *ptr, uint32_t *timestamp)
+{
+   Dl_info info;
+   struct stat st;
+   if (!dladdr(ptr, &info) || !info.dli_fname) {
+      return false;
+   }
+   if (stat(info.dli_fname, &st)) {
+  return false;
+   }
+   *timestamp = st.st_mtim.tv_sec;
+   return true;
+}
+
 struct disk_cache *
 disk_cache_create(const char *gpu_name, const char *timestamp)
 {
diff --git a/src/util/disk_cache.h b/src/util/disk_cache.h
index 7f4da80..9623f95 100644
--- a/src/util/disk_cache.h
+++ b/src/util/disk_cache.h
@@ -24,12 +24,8 @@
 #ifndef DISK_CACHE_H
 #define DISK_CACHE_H
 
-#ifdef ENABLE_SHADER_CACHE
-#include <dlfcn.h>
-#endif
 #include 
 #include 
-#include 
 
 #ifdef __cplusplus
 extern "C" {
@@ -42,29 +38,13 @@ typedef uint8_t cache_key[CACHE_KEY_SIZE];
 
 struct disk_cache;
 
-static inline bool
-disk_cache_get_function_timestamp(void *ptr, uint32_t* timestamp)
-{
-#ifdef ENABLE_SHADER_CACHE
-   Dl_info info;
-   struct stat st;
-   if (!dladdr(ptr, &info) || !info.dli_fname) {
-      return false;
-   }
-   if (stat(info.dli_fname, &st)) {
-  return false;
-   }
-   *timestamp = st.st_mtim.tv_sec;
-   return true;
-#else
-   return false;
-#endif
-}
-
 /* Provide inlined stub functions if the shader cache is disabled. */
 
 #ifdef ENABLE_SHADER_CACHE
 
+bool
+disk_cache_get_function_timestamp(void *ptr, uint32_t *timestamp);
+
 /**
  * Create a new cache object.
  *
@@ -162,6 +142,12 @@ disk_cache_has_key(struct disk_cache *cache, cache_key key);
 
 #else
 
+static inline bool
+disk_cache_get_function_timestamp(void *ptr, uint32_t* timestamp)
+{
+   return false;
+}
+
 static inline struct disk_cache *
 disk_cache_create(const char *gpu_name, const char *timestamp)
 {
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
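
For anyone reading the patch without the tree at hand, here is a minimal standalone sketch of the technique the moved function relies on: dladdr() maps a code address to the file it was loaded from, and stat() then gives that file's modification time. The name get_function_timestamp is hypothetical, and the sketch assumes a glibc-style dladdr() (a GNU/BSD extension, not plain POSIX). It reads st_mtime rather than st_mtim.tv_sec, which is equivalent for whole seconds.

```c
#define _GNU_SOURCE   /* dladdr() and Dl_info are extensions in glibc */
#include <dlfcn.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical helper: report the modification time (in seconds) of the
 * file that contains the code at `ptr`.  Returns false if the address
 * cannot be mapped to a file or the file cannot be stat'ed.
 */
bool
get_function_timestamp(void *ptr, uint32_t *timestamp)
{
   Dl_info info;
   struct stat st;

   /* dladdr() returns 0 on failure; dli_fname may also be NULL. */
   if (!dladdr(ptr, &info) || !info.dli_fname)
      return false;

   /* stat() returns 0 on success. */
   if (stat(info.dli_fname, &st) != 0)
      return false;

   *timestamp = (uint32_t)st.st_mtime;
   return true;
}
```

On older glibc versions this may need -ldl at link time.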


[Mesa-dev] [PATCH 5/8] gallium: add get_disk_shader_cache() callback

2017-02-21 Thread Timothy Arceri
V2: Provide more detail in callback description and add description to
screen.rst
---
 src/gallium/docs/source/screen.rst  | 9 +
 src/gallium/include/pipe/p_screen.h | 8 
 2 files changed, 17 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 74c8cec..c75786b 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -763,6 +763,15 @@ The driver-specific query group is described with the
 pipe_driver_query_group_info structure.
 
 
+
+get_disk_shader_cache
+^^^^^^^^^^^^^^^^^^^^^
+
+Returns a pointer to driver-specific on-disk shader cache. If the driver
+failed to create the cache or does not support an on-disk shader cache NULL is
+returned.
+
+
 Thread safety
 -------------
 
diff --git a/src/gallium/include/pipe/p_screen.h 
b/src/gallium/include/pipe/p_screen.h
index b6203f1..40c0887 100644
--- a/src/gallium/include/pipe/p_screen.h
+++ b/src/gallium/include/pipe/p_screen.h
@@ -58,6 +58,7 @@ struct pipe_surface;
 struct pipe_transfer;
 struct pipe_box;
 struct pipe_memory_info;
+struct disk_cache;
 
 
 /**
@@ -318,6 +319,13 @@ struct pipe_screen {
const void *(*get_compiler_options)(struct pipe_screen *screen,
   enum pipe_shader_ir ir,
   unsigned shader);
+
+   /**
+* Returns a pointer to driver-specific on-disk shader cache. If the driver
+* failed to create the cache or does not support an on-disk shader cache
+* NULL is returned.
+*/
+   struct disk_cache *(*get_disk_shader_cache)(struct pipe_screen *screen);
 };
 
 
-- 
2.9.3
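
A small sketch of how a caller might consume the new callback, given the description above: the hook can be left unset by a driver, and even when set it may return NULL. The struct and function names here (example_screen, example_driver_screen, example_get_cache) are hypothetical, trimmed-down stand-ins and not the real pipe_screen definition.

```c
#include <stddef.h>

struct disk_cache;   /* opaque to callers, as in p_screen.h */

/* Hypothetical stand-in for struct pipe_screen with only the new hook. */
struct example_screen {
   struct disk_cache *(*get_disk_shader_cache)(struct example_screen *screen);
};

/* Hypothetical driver screen that embeds the base screen first, so a
 * pointer to the base can be cast back to the driver struct.
 */
struct example_driver_screen {
   struct example_screen base;
   struct disk_cache *cache;   /* NULL if cache creation failed */
};

/* Driver side: simply hand back whatever was created at screen init. */
static struct disk_cache *
example_driver_get_cache(struct example_screen *screen)
{
   return ((struct example_driver_screen *)screen)->cache;
}

/* Caller side: tolerate both a missing hook and a NULL result. */
static struct disk_cache *
example_get_cache(struct example_screen *screen)
{
   if (!screen->get_disk_shader_cache)
      return NULL;
   return screen->get_disk_shader_cache(screen);
}
```

Either way the caller ends up with NULL when there is no usable cache, so the rest of the code only needs a single check.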



[Mesa-dev] [PATCH 8/8] r600/radeonsi: enable glsl/tgsi on-disk cache

2017-02-21 Thread Timothy Arceri
For GPU generations that use LLVM, we create a timestamp string
containing both the LLVM and Mesa build times; otherwise we just
use the Mesa build time.

Reviewed-by: Marek Olšák 
Reviewed-by: Edward O'Callaghan 
---
 src/gallium/drivers/radeon/r600_pipe_common.c | 43 +++
 src/gallium/drivers/radeon/r600_pipe_common.h |  3 ++
 2 files changed, 46 insertions(+)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 1781584..bae6d6f 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -43,6 +43,10 @@
 #define HAVE_LLVM 0
 #endif
 
+#if HAVE_LLVM
+#include 
+#endif
+
 #ifndef MESA_LLVM_VERSION_PATCH
 #define MESA_LLVM_VERSION_PATCH 0
 #endif
@@ -779,6 +783,41 @@ static const char* r600_get_chip_name(struct 
r600_common_screen *rscreen)
}
 }
 
+static void r600_disk_cache_create(struct r600_common_screen *rscreen)
+{
+   uint32_t mesa_timestamp;
+   if (disk_cache_get_function_timestamp(r600_disk_cache_create,
+ &mesa_timestamp)) {
+   char *timestamp_str;
+   int res = -1;
+   if (rscreen->chip_class < SI) {
+   res = asprintf(&timestamp_str, "%u", mesa_timestamp);
+   }
+#if HAVE_LLVM
+   else {
+   uint32_t llvm_timestamp;
+   if (disk_cache_get_function_timestamp(LLVMInitializeAMDGPUTargetInfo,
+ &llvm_timestamp)) {
+   res = asprintf(&timestamp_str, "%u_%u",
+  mesa_timestamp, llvm_timestamp);
+   }
+   }
+#endif
+   if (res != -1) {
+   rscreen->disk_shader_cache =
+   disk_cache_create(r600_get_chip_name(rscreen),
+ timestamp_str);
+   free(timestamp_str);
+   }
+   }
+}
+
+static struct disk_cache *r600_get_disk_shader_cache(struct pipe_screen 
*pscreen)
+{
+   struct r600_common_screen *rscreen = (struct 
r600_common_screen*)pscreen;
+   return rscreen->disk_shader_cache;
+}
+
 static const char* r600_get_name(struct pipe_screen* pscreen)
 {
struct r600_common_screen *rscreen = (struct 
r600_common_screen*)pscreen;
@@ -1234,6 +1273,7 @@ bool r600_common_screen_init(struct r600_common_screen 
*rscreen,
rscreen->b.get_name = r600_get_name;
rscreen->b.get_vendor = r600_get_vendor;
rscreen->b.get_device_vendor = r600_get_device_vendor;
+   rscreen->b.get_disk_shader_cache = r600_get_disk_shader_cache;
rscreen->b.get_compute_param = r600_get_compute_param;
rscreen->b.get_paramf = r600_get_paramf;
rscreen->b.get_timestamp = r600_get_timestamp;
@@ -1259,6 +1299,8 @@ bool r600_common_screen_init(struct r600_common_screen 
*rscreen,
rscreen->chip_class = rscreen->info.chip_class;
rscreen->debug_flags = debug_get_flags_option("R600_DEBUG", 
common_debug_options, 0);
 
+   r600_disk_cache_create(rscreen);
+
slab_create_parent(&rscreen->pool_transfers, sizeof(struct r600_transfer), 64);
 
rscreen->force_aniso = MIN2(16, debug_get_num_option("R600_TEX_ANISO", 
-1));
@@ -1324,6 +1366,7 @@ void r600_destroy_common_screen(struct r600_common_screen 
*rscreen)
 
slab_destroy_parent(&rscreen->pool_transfers);
 
+   disk_cache_destroy(rscreen->disk_shader_cache);
rscreen->ws->destroy(rscreen->ws);
FREE(rscreen);
 }
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index a977dc1..94cf0fc 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -36,6 +36,7 @@
 
 #include "radeon/radeon_winsys.h"
 
+#include "util/disk_cache.h"
 #include "util/u_blitter.h"
 #include "util/list.h"
 #include "util/u_range.h"
@@ -405,6 +406,8 @@ struct r600_common_screen {
boolhas_cp_dma;
boolhas_streamout;
 
+   struct disk_cache   *disk_shader_cache;
+
struct slab_parent_pool pool_transfers;
 
/* Texture filter settings. */
-- 
2.9.3
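
To illustrate the key format described in the commit message, here is a minimal sketch that builds the timestamp string with snprintf() into a caller-supplied buffer instead of asprintf(). The function name format_cache_timestamp and the have_llvm flag are hypothetical; the patch itself decides based on chip_class and HAVE_LLVM.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write "<mesa>" or "<mesa>_<llvm>" into buf.
 * Returns false if formatting failed or the result did not fit.
 */
bool
format_cache_timestamp(char *buf, size_t size, uint32_t mesa_ts,
                       bool have_llvm, uint32_t llvm_ts)
{
   int n;

   if (have_llvm)
      n = snprintf(buf, size, "%" PRIu32 "_%" PRIu32, mesa_ts, llvm_ts);
   else
      n = snprintf(buf, size, "%" PRIu32, mesa_ts);

   /* snprintf() returns the length it wanted to write, so a value
    * >= size means the output was truncated.
    */
   return n >= 0 && (size_t)n < size;
}
```

Including the LLVM build time in the key means a rebuilt LLVM gets a fresh cache directory, so binaries compiled by the old build are not reused.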



[Mesa-dev] [PATCH 3/8] st/mesa: add sha1 field to st program structs

2017-02-21 Thread Timothy Arceri
This will be used to share the sha1 computed by the tgsi load
function with the tgsi write function.
---
 src/mesa/state_tracker/st_program.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/mesa/state_tracker/st_program.h 
b/src/mesa/state_tracker/st_program.h
index 9f9777a..70664d1 100644
--- a/src/mesa/state_tracker/st_program.h
+++ b/src/mesa/state_tracker/st_program.h
@@ -150,6 +150,9 @@ struct st_fragment_program
struct gl_shader_program *shader_program;
 
struct st_fp_variant *variants;
+
+   /** SHA1 hash of linked tgsi shader program, used for on-disk cache */
+   unsigned char sha1[20];
 };
 
 
@@ -220,6 +223,9 @@ struct st_vertex_program
/** List of translated variants of this vertex program.
 */
struct st_vp_variant *variants;
+
+   /** SHA1 hash of linked tgsi shader program, used for on-disk cache */
+   unsigned char sha1[20];
 };
 
 
@@ -256,6 +262,9 @@ struct st_geometry_program
uint64_t affected_states; /**< ST_NEW_* flags to mark dirty when binding */
 
struct st_basic_variant *variants;
+
+   /** SHA1 hash of linked tgsi shader program, used for on-disk cache */
+   unsigned char sha1[20];
 };
 
 
@@ -270,6 +279,9 @@ struct st_tessctrl_program
uint64_t affected_states; /**< ST_NEW_* flags to mark dirty when binding */
 
struct st_basic_variant *variants;
+
+   /** SHA1 hash of linked tgsi shader program, used for on-disk cache */
+   unsigned char sha1[20];
 };
 
 
@@ -284,6 +296,9 @@ struct st_tesseval_program
uint64_t affected_states; /**< ST_NEW_* flags to mark dirty when binding */
 
struct st_basic_variant *variants;
+
+   /** SHA1 hash of linked tgsi shader program, used for on-disk cache */
+   unsigned char sha1[20];
 };
 
 
@@ -298,6 +313,9 @@ struct st_compute_program
uint64_t affected_states; /**< ST_NEW_* flags to mark dirty when binding */
 
struct st_basic_variant *variants;
+
+   /** SHA1 hash of linked tgsi shader program, used for on-disk cache */
+   unsigned char sha1[20];
 };
 
 
-- 
2.9.3



[Mesa-dev] [PATCH 7/8] st/mesa: get on-disk shader cache

2017-02-21 Thread Timothy Arceri
Reviewed-by: Marek Olšák 
Reviewed-by: Edward O'Callaghan 
---
 src/mesa/state_tracker/st_context.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index f4ad6d8..6321309 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -538,6 +538,9 @@ struct st_context *st_create_context(gl_api api, struct 
pipe_context *pipe,
   return NULL;
}
 
+   if (pipe->screen->get_disk_shader_cache)
+  ctx->Cache = pipe->screen->get_disk_shader_cache(pipe->screen);
+
st_init_driver_flags(&ctx->DriverFlags);
 
/* XXX: need a capability bit in gallium to query if the pipe
-- 
2.9.3



[Mesa-dev] [PATCH 1/8] util/disk_cache: fix bug with deleting old cache dirs

2017-02-21 Thread Timothy Arceri
If there was more than a single directory in the .cache/mesa dir
then it would only remove one (or none) of the directories.

Apparently Valgrind was also reporting:
Conditional jump or move depends on uninitialised value
---
 src/util/disk_cache.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index 2f138da..b172b8b 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -152,14 +152,15 @@ remove_old_cache_directories(void *mem_ctx, char *path, 
const char *timestamp)
struct dirent* d_entry;
while((d_entry = readdir(dir)) != NULL)
{
+  char *full_path =
+ ralloc_asprintf(mem_ctx, "%s/%s", path, d_entry->d_name);
+
   struct stat sb;
-  stat(d_entry->d_name, &sb);
+  stat(full_path, &sb);
   if (S_ISDIR(sb.st_mode) &&
   strcmp(d_entry->d_name, timestamp) != 0 &&
   strcmp(d_entry->d_name, "..") != 0 &&
   strcmp(d_entry->d_name, ".") != 0) {
- char *full_path =
-ralloc_asprintf(mem_ctx, "%s/%s", path, d_entry->d_name);
  nftw(full_path, remove_dir, 20, FTW_DEPTH);
   }
}
-- 
2.9.3
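
A standalone sketch of the pitfall this patch fixes: readdir() returns bare entry names, so stat() on d_name alone is resolved against the current working directory, usually fails, and leaves the stat buffer uninitialised (which would explain the Valgrind report). The function name count_stale_subdirs and the fixed-size path buffer are assumptions for illustration. Unlike the patch, this sketch also checks the return value of stat() before reading the buffer.

```c
#define _POSIX_C_SOURCE 200809L   /* opendir(), readdir(), stat(), S_ISDIR() */
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: count the sub-directories of `path` other than
 * ".", ".." and `keep`.  Returns -1 if `path` cannot be opened.
 */
int
count_stale_subdirs(const char *path, const char *keep)
{
   DIR *dir = opendir(path);
   struct dirent *d_entry;
   int count = 0;

   if (!dir)
      return -1;

   while ((d_entry = readdir(dir)) != NULL) {
      char full_path[4096];
      struct stat sb;

      /* Build "<path>/<name>"; stat() on the bare name would be
       * relative to the current working directory.
       */
      int n = snprintf(full_path, sizeof(full_path), "%s/%s",
                       path, d_entry->d_name);
      if (n < 0 || (size_t)n >= sizeof(full_path))
         continue;

      /* Only look at sb when stat() succeeded. */
      if (stat(full_path, &sb) != 0)
         continue;

      if (S_ISDIR(sb.st_mode) &&
          strcmp(d_entry->d_name, keep) != 0 &&
          strcmp(d_entry->d_name, "..") != 0 &&
          strcmp(d_entry->d_name, ".") != 0)
         count++;
   }

   closedir(dir);
   return count;
}
```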



[Mesa-dev] [PATCH 2/8] st/mesa: move set_prog_affected_state_flags() to st_program.c

2017-02-21 Thread Timothy Arceri
We want to use this in the new tgsi shader cache so we move it here
and make it available externally.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 142 +
 src/mesa/state_tracker/st_program.c| 141 
 src/mesa/state_tracker/st_program.h|   2 +
 3 files changed, 144 insertions(+), 141 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 630f5af..476d185 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -6809,146 +6809,6 @@ get_mesa_program_tgsi(struct gl_context *ctx,
return prog;
 }
 
-static void
-set_affected_state_flags(uint64_t *states,
- struct gl_program *prog,
- uint64_t new_constants,
- uint64_t new_sampler_views,
- uint64_t new_samplers,
- uint64_t new_images,
- uint64_t new_ubos,
- uint64_t new_ssbos,
- uint64_t new_atomics)
-{
-   if (prog->Parameters->NumParameters)
-  *states |= new_constants;
-
-   if (prog->info.num_textures)
-  *states |= new_sampler_views | new_samplers;
-
-   if (prog->info.num_images)
-  *states |= new_images;
-
-   if (prog->info.num_ubos)
-  *states |= new_ubos;
-
-   if (prog->info.num_ssbos)
-  *states |= new_ssbos;
-
-   if (prog->info.num_abos)
-  *states |= new_atomics;
-}
-
-static void
-set_prog_affected_state_flags(struct gl_program *prog)
-{
-   uint64_t *states;
-
-   /* This determines which states will be updated when the shader is bound.
-*/
-   switch (prog->info.stage) {
-   case MESA_SHADER_VERTEX:
-  states = &((struct st_vertex_program*)prog)->affected_states;
-
-  *states = ST_NEW_VS_STATE |
-ST_NEW_RASTERIZER |
-ST_NEW_VERTEX_ARRAYS;
-
-  set_affected_state_flags(states, prog,
-   ST_NEW_VS_CONSTANTS,
-   ST_NEW_VS_SAMPLER_VIEWS,
-   ST_NEW_RENDER_SAMPLERS,
-   ST_NEW_VS_IMAGES,
-   ST_NEW_VS_UBOS,
-   ST_NEW_VS_SSBOS,
-   ST_NEW_VS_ATOMICS);
-  break;
-
-   case MESA_SHADER_TESS_CTRL:
-  states = &((struct st_tessctrl_program*)prog)->affected_states;
-
-  *states = ST_NEW_TCS_STATE;
-
-  set_affected_state_flags(states, prog,
-   ST_NEW_TCS_CONSTANTS,
-   ST_NEW_TCS_SAMPLER_VIEWS,
-   ST_NEW_RENDER_SAMPLERS,
-   ST_NEW_TCS_IMAGES,
-   ST_NEW_TCS_UBOS,
-   ST_NEW_TCS_SSBOS,
-   ST_NEW_TCS_ATOMICS);
-  break;
-
-   case MESA_SHADER_TESS_EVAL:
-  states = &((struct st_tesseval_program*)prog)->affected_states;
-
-  *states = ST_NEW_TES_STATE |
-ST_NEW_RASTERIZER;
-
-  set_affected_state_flags(states, prog,
-   ST_NEW_TES_CONSTANTS,
-   ST_NEW_TES_SAMPLER_VIEWS,
-   ST_NEW_RENDER_SAMPLERS,
-   ST_NEW_TES_IMAGES,
-   ST_NEW_TES_UBOS,
-   ST_NEW_TES_SSBOS,
-   ST_NEW_TES_ATOMICS);
-  break;
-
-   case MESA_SHADER_GEOMETRY:
-  states = &((struct st_geometry_program*)prog)->affected_states;
-
-  *states = ST_NEW_GS_STATE |
-ST_NEW_RASTERIZER;
-
-  set_affected_state_flags(states, prog,
-   ST_NEW_GS_CONSTANTS,
-   ST_NEW_GS_SAMPLER_VIEWS,
-   ST_NEW_RENDER_SAMPLERS,
-   ST_NEW_GS_IMAGES,
-   ST_NEW_GS_UBOS,
-   ST_NEW_GS_SSBOS,
-   ST_NEW_GS_ATOMICS);
-  break;
-
-   case MESA_SHADER_FRAGMENT:
-  states = &((struct st_fragment_program*)prog)->affected_states;
-
-  /* gl_FragCoord and glDrawPixels always use constants. */
-  *states = ST_NEW_FS_STATE |
-ST_NEW_SAMPLE_SHADING |
-ST_NEW_FS_CONSTANTS;
-
-  set_affected_state_flags(states, prog,
-   ST_NEW_FS_CONSTANTS,
-   ST_NEW_FS_SAMPLER_VIEWS,
-   ST_NEW_RENDER_SAMPLERS,
-   ST_NEW_FS_IMAGES,
-   ST_NEW_FS_UBOS,
-   ST_NEW_FS_SSBOS,
-   ST_NEW_FS_ATOMICS);
-  break;
-
-   case MESA_SHADER_COMPUTE:
-  states = &((struct 

[Mesa-dev] [PATCH 4/8] st/mesa: implement a tgsi on-disk shader cache

2017-02-21 Thread Timothy Arceri
Implements a tgsi cache for the OpenGL state tracker.

V2: add support for compute shaders
---
 src/mesa/Makefile.sources  |   2 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   6 +
 src/mesa/state_tracker/st_program.c|  27 +-
 src/mesa/state_tracker/st_shader_cache.c   | 406 +
 src/mesa/state_tracker/st_shader_cache.h   |  46 
 5 files changed, 481 insertions(+), 6 deletions(-)
 create mode 100644 src/mesa/state_tracker/st_shader_cache.c
 create mode 100644 src/mesa/state_tracker/st_shader_cache.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index ee737b0..9262967 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -514,6 +514,8 @@ STATETRACKER_FILES = \
state_tracker/st_sampler_view.h \
state_tracker/st_scissor.c \
state_tracker/st_scissor.h \
+   state_tracker/st_shader_cache.c \
+   state_tracker/st_shader_cache.h \
state_tracker/st_texture.c \
state_tracker/st_texture.h \
state_tracker/st_tgsi_lower_yuv.c \
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 476d185..d43d821 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -54,6 +54,7 @@
 #include "st_format.h"
 #include "st_glsl_types.h"
 #include "st_nir.h"
+#include "st_shader_cache.h"
 
 #include 
 
@@ -6870,6 +6871,11 @@ extern "C" {
 GLboolean
 st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog)
 {
+   /* Return early if we are loading the shader from on-disk cache */
+   if (st_load_tgsi_from_disk_cache(ctx, prog)) {
+  return GL_TRUE;
+   }
+
struct pipe_screen *pscreen = ctx->st->pipe->screen;
assert(prog->data->LinkStatus);
 
diff --git a/src/mesa/state_tracker/st_program.c 
b/src/mesa/state_tracker/st_program.c
index 3795f25..4d9250b 100644
--- a/src/mesa/state_tracker/st_program.c
+++ b/src/mesa/state_tracker/st_program.c
@@ -58,6 +58,7 @@
 #include "st_mesa_to_tgsi.h"
 #include "st_atifs_to_tgsi.h"
 #include "st_nir.h"
+#include "st_shader_cache.h"
 #include "cso_cache/cso_context.h"
 
 
@@ -364,7 +365,6 @@ st_release_cp_variants(struct st_context *st, struct 
st_compute_program *stcp)
}
 }
 
-
 /**
  * Translate a vertex program.
  */
@@ -583,7 +583,6 @@ st_translate_vertex_program(struct st_context *st,
   &stvp->tgsi.stream_output);
 
   free_glsl_to_tgsi_visitor(stvp->glsl_to_tgsi);
-  stvp->glsl_to_tgsi = NULL;
} else
   error = st_translate_mesa_program(st->ctx,
 PIPE_SHADER_VERTEX,
@@ -608,8 +607,15 @@ st_translate_vertex_program(struct st_context *st,
   return false;
}
 
-   stvp->tgsi.tokens = ureg_get_tokens(ureg, NULL);
+   unsigned num_tokens;
+   stvp->tgsi.tokens = ureg_get_tokens(ureg, &num_tokens);
ureg_destroy(ureg);
+
+   if (stvp->glsl_to_tgsi) {
+  stvp->glsl_to_tgsi = NULL;
+  st_store_tgsi_in_disk_cache(st, &stvp->Base, NULL, num_tokens);
+   }
+
return stvp->tgsi.tokens != NULL;
 }
 
@@ -1031,7 +1037,6 @@ st_translate_fragment_program(struct st_context *st,
fs_output_semantic_index);
 
   free_glsl_to_tgsi_visitor(stfp->glsl_to_tgsi);
-  stfp->glsl_to_tgsi = NULL;
} else if (stfp->ati_fs)
   st_translate_atifs_program(ureg,
  stfp->ati_fs,
@@ -1064,8 +1069,15 @@ st_translate_fragment_program(struct st_context *st,
 fs_output_semantic_name,
 fs_output_semantic_index);
 
-   stfp->tgsi.tokens = ureg_get_tokens(ureg, NULL);
+   unsigned num_tokens;
+   stfp->tgsi.tokens = ureg_get_tokens(ureg, &num_tokens);
ureg_destroy(ureg);
+
+   if (stfp->glsl_to_tgsi) {
+  stfp->glsl_to_tgsi = NULL;
+  st_store_tgsi_in_disk_cache(st, &stfp->Base, NULL, num_tokens);
+   }
+
return stfp->tgsi.tokens != NULL;
 }
 
@@ -1600,13 +1612,16 @@ st_translate_program_common(struct st_context *st,
 output_semantic_name,
 output_semantic_index);
 
-   out_state->tokens = ureg_get_tokens(ureg, NULL);
+   unsigned num_tokens;
+   out_state->tokens = ureg_get_tokens(ureg, &num_tokens);
ureg_destroy(ureg);
 
st_translate_stream_output_info(glsl_to_tgsi,
outputMapping,
&out_state->stream_output);
 
+   st_store_tgsi_in_disk_cache(st, prog, out_state, num_tokens);
+
if ((ST_DEBUG & DEBUG_TGSI) && (ST_DEBUG & DEBUG_MESA)) {
   _mesa_print_program(prog);
   debug_printf("\n");
diff --git a/src/mesa/state_tracker/st_shader_cache.c 
b/src/mesa/state_tracker/st_shader_cache.c
new file mode 100644
index 000..84ad94e
--- /dev/null
+++ b/src/mesa/state_tracker/st_shader_cache.c
@@ -0,0 +1,406 @@
+/*
+ * Copyright © 2017 Timothy Arceri
+ *
+ * Permission is hereby granted, free of 

[Mesa-dev] V4 TGSI on-disk shader cache

2017-02-21 Thread Timothy Arceri
Changes in V4:

- split tgsi cache code into its own file
- add missing fallback for tgsi cache miss
- share the sha1 generated by the load function with the 
  store function like in the glsl ir cache.
- add get_disk_shader_cache() to the pass-throughs
- add get_disk_shader_cache() description to screen.rst 
- bug fix for old cache dir deletion



[Mesa-dev] [PATCH 6/8] ddebug/rbug/trace: add get_disk_shader_cache() to pass-throughs

2017-02-21 Thread Timothy Arceri
---
 src/gallium/drivers/ddebug/dd_screen.c |  9 +
 src/gallium/drivers/rbug/rbug_screen.c |  9 +
 src/gallium/drivers/trace/tr_screen.c  | 21 +
 3 files changed, 39 insertions(+)

diff --git a/src/gallium/drivers/ddebug/dd_screen.c 
b/src/gallium/drivers/ddebug/dd_screen.c
index 58e496a..996ff85 100644
--- a/src/gallium/drivers/ddebug/dd_screen.c
+++ b/src/gallium/drivers/ddebug/dd_screen.c
@@ -55,6 +55,14 @@ dd_screen_get_device_vendor(struct pipe_screen *_screen)
return screen->get_device_vendor(screen);
 }
 
+static struct disk_cache *
+dd_screen_get_disk_shader_cache(struct pipe_screen *_screen)
+{
+   struct pipe_screen *screen = dd_screen(_screen)->screen;
+
+   return screen->get_disk_shader_cache(screen);
+}
+
 static int
 dd_screen_get_param(struct pipe_screen *_screen,
 enum pipe_cap param)
@@ -378,6 +386,7 @@ ddebug_screen_create(struct pipe_screen *screen)
dscreen->base.get_name = dd_screen_get_name;
dscreen->base.get_vendor = dd_screen_get_vendor;
dscreen->base.get_device_vendor = dd_screen_get_device_vendor;
+   dscreen->base.get_disk_shader_cache = dd_screen_get_disk_shader_cache;
dscreen->base.get_param = dd_screen_get_param;
dscreen->base.get_paramf = dd_screen_get_paramf;
dscreen->base.get_compute_param = dd_screen_get_compute_param;
diff --git a/src/gallium/drivers/rbug/rbug_screen.c 
b/src/gallium/drivers/rbug/rbug_screen.c
index 8fbbe73..0ea5139 100644
--- a/src/gallium/drivers/rbug/rbug_screen.c
+++ b/src/gallium/drivers/rbug/rbug_screen.c
@@ -77,6 +77,14 @@ rbug_screen_get_device_vendor(struct pipe_screen *_screen)
return screen->get_device_vendor(screen);
 }
 
+static struct disk_cache *
+rbug_screen_get_disk_shader_cache(struct pipe_screen *_screen)
+{
+   struct pipe_screen *screen = rbug_screen(_screen)->screen;
+
+   return screen->get_disk_shader_cache(screen);
+}
+
 static int
 rbug_screen_get_param(struct pipe_screen *_screen,
   enum pipe_cap param)
@@ -283,6 +291,7 @@ rbug_screen_create(struct pipe_screen *screen)
rb_screen->base.destroy = rbug_screen_destroy;
rb_screen->base.get_name = rbug_screen_get_name;
rb_screen->base.get_vendor = rbug_screen_get_vendor;
+   rb_screen->base.get_disk_shader_cache = rbug_screen_get_disk_shader_cache;
rb_screen->base.get_device_vendor = rbug_screen_get_device_vendor;
rb_screen->base.get_param = rbug_screen_get_param;
rb_screen->base.get_shader_param = rbug_screen_get_shader_param;
diff --git a/src/gallium/drivers/trace/tr_screen.c 
b/src/gallium/drivers/trace/tr_screen.c
index aaf2e26..4256855 100644
--- a/src/gallium/drivers/trace/tr_screen.c
+++ b/src/gallium/drivers/trace/tr_screen.c
@@ -103,6 +103,26 @@ trace_screen_get_device_vendor(struct pipe_screen *_screen)
 }
 
 
+static struct disk_cache *
+trace_screen_get_disk_shader_cache(struct pipe_screen *_screen)
+{
+   struct trace_screen *tr_scr = trace_screen(_screen);
+   struct pipe_screen *screen = tr_scr->screen;
+
+   trace_dump_call_begin("pipe_screen", "get_disk_shader_cache");
+
+   trace_dump_arg(ptr, screen);
+
+   struct disk_cache *result = screen->get_disk_shader_cache(screen);
+
+   trace_dump_ret(ptr, result);
+
+   trace_dump_call_end();
+
+   return result;
+}
+
+
 static int
 trace_screen_get_param(struct pipe_screen *_screen,
enum pipe_cap param)
@@ -525,6 +545,7 @@ trace_screen_create(struct pipe_screen *screen)
tr_scr->base.get_name = trace_screen_get_name;
tr_scr->base.get_vendor = trace_screen_get_vendor;
tr_scr->base.get_device_vendor = trace_screen_get_device_vendor;
+   tr_scr->base.get_disk_shader_cache = trace_screen_get_disk_shader_cache;
tr_scr->base.get_param = trace_screen_get_param;
tr_scr->base.get_shader_param = trace_screen_get_shader_param;
tr_scr->base.get_paramf = trace_screen_get_paramf;
-- 
2.9.3



[Mesa-dev] [PATCH] anv/Makefile: Gather all the genX files into one place

2017-02-21 Thread Jason Ekstrand
While we're here, we also fix the alphabetization of the list of
genx_* files.
---
 src/intel/vulkan/Makefile.sources | 31 +--
 1 file changed, 9 insertions(+), 22 deletions(-)

diff --git a/src/intel/vulkan/Makefile.sources 
b/src/intel/vulkan/Makefile.sources
index 21c04cd..fd149b2 100644
--- a/src/intel/vulkan/Makefile.sources
+++ b/src/intel/vulkan/Makefile.sources
@@ -60,39 +60,26 @@ VULKAN_GENERATED_FILES := \
anv_entrypoints.c \
anv_entrypoints.h
 
-
-GEN7_FILES := \
-   gen7_cmd_buffer.c \
-   genX_cmd_buffer.c \
+GENX_FILES := \
genX_blorp_exec.c \
+   genX_cmd_buffer.c \
genX_gpu_memcpy.c \
genX_pipeline.c \
genX_query.c \
genX_state.c
 
+GEN7_FILES := \
+   gen7_cmd_buffer.c \
+$(GENX_FILES)
+
 GEN75_FILES := \
gen7_cmd_buffer.c \
-   genX_cmd_buffer.c \
-   genX_blorp_exec.c \
-   genX_gpu_memcpy.c \
-   genX_pipeline.c \
-   genX_query.c \
-   genX_state.c
+$(GENX_FILES)
 
 GEN8_FILES := \
gen8_cmd_buffer.c \
-   genX_cmd_buffer.c \
-   genX_blorp_exec.c \
-   genX_gpu_memcpy.c \
-   genX_pipeline.c \
-   genX_query.c \
-   genX_state.c
+$(GENX_FILES)
 
 GEN9_FILES := \
gen8_cmd_buffer.c \
-   genX_cmd_buffer.c \
-   genX_blorp_exec.c \
-   genX_gpu_memcpy.c \
-   genX_pipeline.c \
-   genX_query.c \
-   genX_state.c
+$(GENX_FILES)
-- 
2.5.0.400.gff86faf



Re: [Mesa-dev] [PATCH v2 5/6] anv/Makefile: alphabetize

2017-02-21 Thread Jason Ekstrand
On Tue, Feb 21, 2017 at 5:04 PM, Mike Lothian  wrote:

> Should genX_blorp be above genX_cmd?
>

You are absolutely right!  I'll send another patch.


> On Mon, 20 Feb 2017 at 19:26 Jason Ekstrand  wrote:
>
>> ---
>>  src/intel/vulkan/Makefile.sources | 8 
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/intel/vulkan/Makefile.sources
>> b/src/intel/vulkan/Makefile.sources
>> index bd78805..b99 100644
>> --- a/src/intel/vulkan/Makefile.sources
>> +++ b/src/intel/vulkan/Makefile.sources
>> @@ -63,33 +63,33 @@ VULKAN_GENERATED_FILES := \
>>
>>
>>  GEN7_FILES := \
>> +   gen7_cmd_buffer.c \
>> genX_cmd_buffer.c \
>> genX_blorp_exec.c \
>> genX_gpu_memcpy.c \
>> genX_pipeline.c \
>> -   gen7_cmd_buffer.c \
>> genX_state.c
>>
>>  GEN75_FILES := \
>> +   gen7_cmd_buffer.c \
>> genX_cmd_buffer.c \
>> genX_blorp_exec.c \
>> genX_gpu_memcpy.c \
>> genX_pipeline.c \
>> -   gen7_cmd_buffer.c \
>> genX_state.c
>>
>>  GEN8_FILES := \
>> +   gen8_cmd_buffer.c \
>> genX_cmd_buffer.c \
>> genX_blorp_exec.c \
>> genX_gpu_memcpy.c \
>> genX_pipeline.c \
>> -   gen8_cmd_buffer.c \
>> genX_state.c
>>
>>  GEN9_FILES := \
>> +   gen8_cmd_buffer.c \
>> genX_cmd_buffer.c \
>> genX_blorp_exec.c \
>> genX_gpu_memcpy.c \
>> genX_pipeline.c \
>> -   gen8_cmd_buffer.c \
>> genX_state.c
>> --
>> 2.5.0.400.gff86faf
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>


[Mesa-dev] [PATCH] anv/blorp/clear_subpass: Only set surface clear color for fast clears

2017-02-21 Thread Jason Ekstrand
Not all clear colors are valid.  In particular, on Broadwell and
earlier, only 0/1 colors are allowed in surface state.  No CTS tests are
affected outright by this because, apparently, the CTS coverage for
different clear colors is pretty terrible.  However, when multisample
compression is enabled, we do hit it with CTS tests and this commit
prevents regressions when enabling MCS on Broadwell and earlier.

Cc: "13.0 17.0" 
---
 src/intel/vulkan/anv_blorp.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
index 4e7078b..8db03e4 100644
--- a/src/intel/vulkan/anv_blorp.c
+++ b/src/intel/vulkan/anv_blorp.c
@@ -1198,9 +1198,10 @@ anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer 
*cmd_buffer)
   struct blorp_surf surf;
   get_blorp_surf_for_anv_image(image, VK_IMAGE_ASPECT_COLOR_BIT,
att_state->aux_usage, &surf);
-  surf.clear_color = vk_to_isl_color(att_state->clear_value.color);
 
   if (att_state->fast_clear) {
+ surf.clear_color = vk_to_isl_color(att_state->clear_value.color);
+
  blorp_fast_clear(&batch, &surf, iview->isl.format,
   iview->isl.base_level,
   iview->isl.base_array_layer, fb->layers,
@@ -1224,7 +1225,7 @@ anv_cmd_buffer_clear_subpass(struct anv_cmd_buffer 
*cmd_buffer)
  render_area.offset.x, render_area.offset.y,
  render_area.offset.x + render_area.extent.width,
  render_area.offset.y + render_area.extent.height,
- surf.clear_color, NULL);
+ vk_to_isl_color(att_state->clear_value.color), NULL);
   }
 
   att_state->pending_clear_aspects = 0;
-- 
2.5.0.400.gff86faf
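
As a rough illustration of the "only 0/1 colors" restriction mentioned in the commit message, a check along the following lines tells whether a float clear color could be expressed that way. This is only a sketch: the function name clear_color_is_zero_one is hypothetical, it assumes four float channels, and the real driver's fast-clear eligibility logic lives elsewhere and depends on the format and hardware generation.

```c
#include <stdbool.h>

/* Hypothetical check: true when every channel is exactly 0.0 or 1.0. */
bool
clear_color_is_zero_one(const float color[4])
{
   for (int i = 0; i < 4; i++) {
      if (color[i] != 0.0f && color[i] != 1.0f)
         return false;
   }
   return true;
}
```

The patch sidesteps the problem for slow clears by not storing the color in the surface state at all and passing it directly to the clear call instead.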



Re: [Mesa-dev] [PATCH v2] i965: Implement INTEL_performance_query backend

2017-02-21 Thread Kenneth Graunke
On Thursday, February 16, 2017 5:20:37 AM PST Robert Bragg wrote:
[snip]
> +   switch(obj->query->kind) {

Space after "switch" please.

Patch 3 is:
Reviewed-by: Kenneth Graunke 




Re: [Mesa-dev] [PATCH 2/3] Model INTEL perf query backend after query object BE

2017-02-21 Thread Kenneth Graunke
On Wednesday, February 15, 2017 1:37:36 PM PST Robert Bragg wrote:
> Instead of using the same backend interface as AMD_performance_monitor
> this defines a dedicated INTEL_performance_query interface that is
> modelled more on the ARB_query_buffer_object interface (considering the
> similarity of the extensions) with the addition of vfuncs for
> initializing and enumerating query and counter info.

Patches 1 and 2's commit titles should start with "mesa: ".

> Compared to the previous backend, some notable differences are:
> 
> - The backend is free to represent counters using whatever data
>   structures are optimal/convenient since queries and counters are
>   enumerated via an iterator api instead of declaring them using
>   structures directly shared with the frontend.
> 
>   This is also done to help us support the full range of data and
>   semantic types available with INTEL_performance_query which is awkward
>   while using a structure shared with the AMD_performance_monitor
>   backend since neither extension's types are a subset of the other.
> 
> - The backend must support waiting for a query instead of the frontend
>   simply using glFinish().
> 
> - Objects go through 'Active' and 'Ready' states consistent with the
>   query object backend (hopefully making them more familiar). There is
>   no 'Ended' state (which used to show that a query has ended at least
>   once for a given object). There is a new 'Used' state similar to the
>   'EverBound' state of query objects, set when a query is first begun
>   which implies that we are expecting to get results back for the object
>   at some point.

That's a little different from EverBound, which is used to answer stupid
glIsFoo() queries - where glGenFoo() doesn't actually "create" a Foo,
but glBindFoo() does.  An awkward concept.

> The INTEL_performance_query and AMD_performance_monitor extensions are
> now completely orthogonal within Mesa main (though a driver could
> optionally choose to implement both extensions within a unified backend
> if that were convenient for the sake of sharing state/code).
> 
> v2: (Samuel Pitoiset)
> - init PerfQuery.NumQueries in frontend
> - s/return_string/output_clipped_string/
> - s/backed/backend/ typo
> - remove redundant *bytesWritten = 0
> v3:
> - Add InitPerfQueryInfo for lazy probing of available queries
> 
> Signed-off-by: Robert Bragg 
> ---
>  src/mesa/main/dd.h|  41 +++
>  src/mesa/main/mtypes.h|  24 +-
>  src/mesa/main/performance_query.c | 625 
> ++
>  src/mesa/main/performance_query.h |   6 +-
>  4 files changed, 295 insertions(+), 401 deletions(-)
> 
> diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
> index 7ebd084ca3..e77df31cf2 100644
> --- a/src/mesa/main/dd.h
> +++ b/src/mesa/main/dd.h
> @@ -780,6 +780,47 @@ struct dd_function_table {
> /*@}*/
>  
> /**
> +* \name Performance Query objects
> +*/
> +   /*@{*/
> +   GLuint (*InitPerfQueryInfo)(struct gl_context *ctx);
> +   void (*GetPerfQueryInfo)(struct gl_context *ctx,
> +int queryIndex,
> +const char **name,
> +GLuint *dataSize,
> +GLuint *numCounters,
> +GLuint *numActive);
> +   void (*GetPerfCounterInfo)(struct gl_context *ctx,
> +  int queryIndex,
> +  int counterIndex,
> +  const char **name,
> +  const char **desc,
> +  GLuint *offset,
> +  GLuint *data_size,
> +  GLuint *type_enum,
> +  GLuint *data_type_enum,
> +  GLuint64 *raw_max);
> +   struct gl_perf_query_object * (*NewPerfQueryObject)(struct gl_context *ctx,
> +   int queryIndex);
> +   void (*DeletePerfQuery)(struct gl_context *ctx,
> +   struct gl_perf_query_object *obj);
> +   GLboolean (*BeginPerfQuery)(struct gl_context *ctx,
> +   struct gl_perf_query_object *obj);
> +   void (*EndPerfQuery)(struct gl_context *ctx,
> +struct gl_perf_query_object *obj);
> +   void (*WaitPerfQuery)(struct gl_context *ctx,
> + struct gl_perf_query_object *obj);
> +   GLboolean (*IsPerfQueryReady)(struct gl_context *ctx,
> + struct gl_perf_query_object *obj);
> +   void (*GetPerfQueryData)(struct gl_context *ctx,
> +struct gl_perf_query_object *obj,
> +GLsizei dataSize,
> +GLuint *data,
> +GLuint *bytesWritten);
> +   /*@}*/
> +
> +
> +   /**
>  * \name GREMEDY debug/marker functions
>  */
> 

Re: [Mesa-dev] [PATCH mesa] glx: add GLXdispatchIndex sort check

2017-02-21 Thread Ilia Mirkin
Please set LANG=C (or LC_ALL=C) when sorting, to avoid any locale-dependent ordering issues.

On Feb 21, 2017 11:58 AM, "Eric Engestrom" wrote:

> Signed-off-by: Eric Engestrom 
> ---
>  src/glx/tests/Makefile.am  |  2 +-
>  src/glx/tests/dispatch-index-check | 24 
>  2 files changed, 25 insertions(+), 1 deletion(-)
>  create mode 100755 src/glx/tests/dispatch-index-check
>
> diff --git a/src/glx/tests/Makefile.am b/src/glx/tests/Makefile.am
> index bdc78c0d5a..8874c20b01 100644
> --- a/src/glx/tests/Makefile.am
> +++ b/src/glx/tests/Makefile.am
> @@ -12,7 +12,7 @@ AM_CPPFLAGS = \
> $(LIBDRM_CFLAGS) \
> $(X11_INCLUDES)
>
> -TESTS = glx-test
> +TESTS = glx-test dispatch-index-check
>  check_PROGRAMS = glx-test
>
>  glx_test_SOURCES = \
> diff --git a/src/glx/tests/dispatch-index-check b/src/glx/tests/dispatch-index-check
> new file mode 100755
> index 00..e2b5faff09
> --- /dev/null
> +++ b/src/glx/tests/dispatch-index-check
> @@ -0,0 +1,24 @@
> +#!/bin/sh
> +
> +# extract enum definition
> +dispatch_list=$(sed '/__GLXdispatchIndex/,/__GLXdispatchIndex/!d' \
> +  "$srcdir"/../g_glxglvnddispatchindices.h)
> +
> +# extract values inside of enum
> +dispatch_list=$(sed '1d;$d' <<< "$dispatch_list")
> +
> +# remove indentation
> +dispatch_list=$(sed 's/^\s\+//' <<< "$dispatch_list")
> +
> +# extract function names
> +dispatch_list=$(sed 's/DI_//;s/,//' <<< "$dispatch_list")
> +
> +# same for commented functions, we want to keep them sorted too
> +dispatch_list=$(sed 's#// ##;s/ implemented by [a-z]\+//' <<< "$dispatch_list")
> +
> +# remove LAST_INDEX, as it will not be in alphabetical order
> +dispatch_list=$(sed '/LAST_INDEX/d' <<< "$dispatch_list")
> +
> +sorted=$(sort <<< "$dispatch_list")
> +
> +test "$dispatch_list" = "$sorted"
> --
> Cheers,
>   Eric
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 5/6] anv/Makefile: alphabetize

2017-02-21 Thread Mike Lothian
Should genX_blorp be above genX_cmd?

On Mon, 20 Feb 2017 at 19:26 Jason Ekstrand  wrote:

> ---
>  src/intel/vulkan/Makefile.sources | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/intel/vulkan/Makefile.sources
> b/src/intel/vulkan/Makefile.sources
> index bd78805..b99 100644
> --- a/src/intel/vulkan/Makefile.sources
> +++ b/src/intel/vulkan/Makefile.sources
> @@ -63,33 +63,33 @@ VULKAN_GENERATED_FILES := \
>
>
>  GEN7_FILES := \
> +   gen7_cmd_buffer.c \
> genX_cmd_buffer.c \
> genX_blorp_exec.c \
> genX_gpu_memcpy.c \
> genX_pipeline.c \
> -   gen7_cmd_buffer.c \
> genX_state.c
>
>  GEN75_FILES := \
> +   gen7_cmd_buffer.c \
> genX_cmd_buffer.c \
> genX_blorp_exec.c \
> genX_gpu_memcpy.c \
> genX_pipeline.c \
> -   gen7_cmd_buffer.c \
> genX_state.c
>
>  GEN8_FILES := \
> +   gen8_cmd_buffer.c \
> genX_cmd_buffer.c \
> genX_blorp_exec.c \
> genX_gpu_memcpy.c \
> genX_pipeline.c \
> -   gen8_cmd_buffer.c \
> genX_state.c
>
>  GEN9_FILES := \
> +   gen8_cmd_buffer.c \
> genX_cmd_buffer.c \
> genX_blorp_exec.c \
> genX_gpu_memcpy.c \
> genX_pipeline.c \
> -   gen8_cmd_buffer.c \
> genX_state.c
> --
> 2.5.0.400.gff86faf
>


Re: [Mesa-dev] [PATCH] util: fix MSVC build issue in disk_cache.h

2017-02-21 Thread Timothy Arceri



On 22/02/17 09:59, Brian Paul wrote:

On 02/21/2017 03:57 PM, Brian Paul wrote:

Windows doesn't have dlfcn.h.  Protect the code in question
with an #ifdef ENABLE_SHADER_CACHE test.
---
  src/util/disk_cache.h | 26 --
  1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/util/disk_cache.h b/src/util/disk_cache.h
index 8b6fc0d..7f4da80 100644
--- a/src/util/disk_cache.h
+++ b/src/util/disk_cache.h
@@ -24,7 +24,9 @@
  #ifndef DISK_CACHE_H
  #define DISK_CACHE_H

+#ifdef ENABLE_SHADER_CACHE
  #include <dlfcn.h>
+#endif
  #include 
  #include 
  #include 
@@ -43,16 +45,20 @@ struct disk_cache;
  static inline bool
  disk_cache_get_function_timestamp(void *ptr, uint32_t* timestamp)
  {
-Dl_info info;
-struct stat st;
-if (!dladdr(ptr, &info) || !info.dli_fname) {
-return false;
-}
-if (stat(info.dli_fname, &st)) {
-return false;
-}
-*timestamp = st.st_mtim.tv_sec;
-return true;
+#ifdef ENABLE_SHADER_CACHE
+   Dl_info info;
+   struct stat st;
+   if (!dladdr(ptr, &info) || !info.dli_fname) {
+  return false;
+   }
+   if (stat(info.dli_fname, &st)) {
+  return false;
+   }
+   *timestamp = st.st_mtim.tv_sec;
+   return true;
+#else
+   return false;
+#endif
  }

  /* Provide inlined stub functions if the shader cache is disabled. */




Timothy,

Does this function really need to be inlined?  AFAICT, it's not called
on a performance critical path.  Moving it into the .c file would seem
to be cleaner.


No, not really; feel free to move it. Either way, this patch is:

Reviewed-by: Timothy Arceri 



-Brian



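For reference, a rough sketch of what the out-of-line variant suggested above could look like: the same guard-plus-stub pattern as in the patch, but as an ordinary function that would live in a .c file. SKETCH_ENABLE_CACHE and sketch_get_function_timestamp are placeholder names (not the real Mesa macro or function), and st_mtime is used here instead of st_mtim for portability; treat it as an illustration under those assumptions, not as the committed code.

```c
#ifdef SKETCH_ENABLE_CACHE
#define _GNU_SOURCE /* dladdr() is an extension on glibc */
#include <dlfcn.h>
#include <sys/stat.h>
#endif
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder name; stands in for the helper discussed in the thread. */
bool
sketch_get_function_timestamp(void *ptr, uint32_t *timestamp)
{
#ifdef SKETCH_ENABLE_CACHE
   Dl_info info;
   struct stat st;

   /* dladdr() returns 0 on failure; stat() returns non-zero on failure. */
   if (!dladdr(ptr, &info) || !info.dli_fname)
      return false;
   if (stat(info.dli_fname, &st))
      return false;

   *timestamp = (uint32_t) st.st_mtime;
   return true;
#else
   /* Platforms without dlfcn.h: report "no timestamp" and leave the
    * output untouched.
    */
   (void) ptr;
   (void) timestamp;
   return false;
#endif
}
```

With the guard macro undefined, nothing from dlfcn.h is referenced, so the file should build on toolchains that lack that header.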

Re: [Mesa-dev] [PATCH] util: fix MSVC build issue in disk_cache.h

2017-02-21 Thread Roland Scheidegger
Looks good to me.
I guess ideally there'd be some os abstraction so the cache stuff could
actually build/work on windows but that's another story...

Roland

Am 21.02.2017 um 23:57 schrieb Brian Paul:
> Windows doesn't have dlfcn.h.  Protect the code in question
> with an #ifdef ENABLE_SHADER_CACHE test.
> ---
>  src/util/disk_cache.h | 26 --
>  1 file changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/src/util/disk_cache.h b/src/util/disk_cache.h
> index 8b6fc0d..7f4da80 100644
> --- a/src/util/disk_cache.h
> +++ b/src/util/disk_cache.h
> @@ -24,7 +24,9 @@
>  #ifndef DISK_CACHE_H
>  #define DISK_CACHE_H
>  
> +#ifdef ENABLE_SHADER_CACHE
>  #include <dlfcn.h>
> +#endif
>  #include 
>  #include 
>  #include 
> @@ -43,16 +45,20 @@ struct disk_cache;
>  static inline bool
>  disk_cache_get_function_timestamp(void *ptr, uint32_t* timestamp)
>  {
> - Dl_info info;
> - struct stat st;
> - if (!dladdr(ptr, &info) || !info.dli_fname) {
> - return false;
> - }
> - if (stat(info.dli_fname, &st)) {
> - return false;
> - }
> - *timestamp = st.st_mtim.tv_sec;
> - return true;
> +#ifdef ENABLE_SHADER_CACHE
> +   Dl_info info;
> +   struct stat st;
> +   if (!dladdr(ptr, &info) || !info.dli_fname) {
> +  return false;
> +   }
> +   if (stat(info.dli_fname, &st)) {
> +  return false;
> +   }
> +   *timestamp = st.st_mtim.tv_sec;
> +   return true;
> +#else
> +   return false;
> +#endif
>  }
>  
>  /* Provide inlined stub functions if the shader cache is disabled. */
> 



[Mesa-dev] [PATCH mesa 9/9] eglapi: replace linear entrypoint search with binary search

2017-02-21 Thread Eric Engestrom
Tested with dEQP-EGL.functional.get_proc_address.*

Signed-off-by: Eric Engestrom 
---
 src/egl/main/eglapi.c | 37 -
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 77ec5426ec..5694b5a4ca 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -157,6 +157,12 @@
_EGL_CHECK_OBJECT(disp, Sync, s, ret, drv)
 
 
+struct _egl_entrypoint {
+   const char *name;
+   _EGLProc function;
+};
+
+
 static inline _EGLDriver *
 _eglCheckDisplay(_EGLDisplay *disp, const char *msg)
 {
@@ -2350,34 +2356,39 @@ eglQueryDebugKHR(EGLint attribute, EGLAttrib *value)
return EGL_TRUE;
 }
 
+static int
+_eglFunctionCompare(const void *key, const void *elem)
+{
+   const char *procname = key;
+   const struct _egl_entrypoint *entrypoint = elem;
+   return strcmp(procname, entrypoint->name);
+}
+
 __eglMustCastToProperFunctionPointerType EGLAPIENTRY
 eglGetProcAddress(const char *procname)
 {
-   static const struct {
-  const char *name;
-  _EGLProc function;
-   } egl_functions[] = {
+   static const struct _egl_entrypoint egl_functions[] = {
 #define EGL_ENTRYPOINT(f) { .name = #f, .function = (_EGLProc) f },
 #include "eglentrypoint.h"
 #undef EGL_ENTRYPOINT
};
-   EGLint i;
-   _EGLProc ret;
+   _EGLProc ret = NULL;
 
if (!procname)
   RETURN_EGL_SUCCESS(NULL, NULL);
 
_EGL_FUNC_START(NULL, EGL_NONE, NULL, NULL);
 
-   ret = NULL;
if (strncmp(procname, "egl", 3) == 0) {
-  for (i = 0; egl_functions[i].name; i++) {
- if (strcmp(egl_functions[i].name, procname) == 0) {
-ret = egl_functions[i].function;
-break;
- }
-  }
+  const struct _egl_entrypoint *entrypoint =
+ bsearch(procname,
+ egl_functions, ARRAY_SIZE(egl_functions),
+ sizeof(egl_functions[0]),
+ _eglFunctionCompare);
+  if (entrypoint)
+ ret = entrypoint->function;
}
+
if (!ret)
   ret = _eglGetDriverProc(procname);
 
-- 
Cheers,
  Eric


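For readers who want to try the lookup outside of Mesa, the following is a self-contained sketch of the same bsearch() technique. The sketch_* names and the three dummy functions are placeholders, not EGL entrypoints; the comparator has the same shape as _eglFunctionCompare in the patch (key first, table element second).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef void (*sketch_proc)(void);

struct sketch_entrypoint {
   const char *name;
   sketch_proc function;
};

/* Dummy functions standing in for real entrypoints. */
static void sketchBindAPI(void) {}
static void sketchGetError(void) {}
static void sketchWaitSync(void) {}

/* Must stay sorted in strcmp() order, or bsearch() may miss entries. */
static const struct sketch_entrypoint sketch_table[] = {
   { "sketchBindAPI",  sketchBindAPI },
   { "sketchGetError", sketchGetError },
   { "sketchWaitSync", sketchWaitSync },
};

/* bsearch() passes the key first and a table element second. */
static int
sketch_compare(const void *key, const void *elem)
{
   const char *procname = key;
   const struct sketch_entrypoint *entrypoint = elem;
   return strcmp(procname, entrypoint->name);
}

/* Returns the matching function, or NULL when the name is unknown. */
static sketch_proc
sketch_lookup(const char *procname)
{
   const struct sketch_entrypoint *entrypoint =
      bsearch(procname, sketch_table,
              sizeof(sketch_table) / sizeof(sketch_table[0]),
              sizeof(sketch_table[0]), sketch_compare);
   return entrypoint ? entrypoint->function : NULL;
}
```

Note that the table has no NULL terminator: unlike the old linear loop, bsearch() is given the element count explicitly.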

[Mesa-dev] [PATCH mesa 4/9] eglapi: use macro to map entrypoints to functions

2017-02-21 Thread Eric Engestrom
As of the last 3 commits, there's a function for each entrypoint.

Signed-off-by: Eric Engestrom 
---
 src/egl/main/eglapi.c | 149 +-
 1 file changed, 75 insertions(+), 74 deletions(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index e44375a106..6c90333998 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -2357,84 +2357,85 @@ eglGetProcAddress(const char *procname)
   const char *name;
   _EGLProc function;
} egl_functions[] = {
+#define EGL_ENTRYPOINT(f) { .name = #f, .function = (_EGLProc) f },
   /* core functions queryable in the presence of
* EGL_KHR_get_all_proc_addresses or EGL 1.5
*/
   /* alphabetical order */
-  { "eglBindAPI", (_EGLProc) eglBindAPI },
-  { "eglBindTexImage", (_EGLProc) eglBindTexImage },
-  { "eglChooseConfig", (_EGLProc) eglChooseConfig },
-  { "eglCopyBuffers", (_EGLProc) eglCopyBuffers },
-  { "eglCreateContext", (_EGLProc) eglCreateContext },
-  { "eglCreatePbufferFromClientBuffer", (_EGLProc) eglCreatePbufferFromClientBuffer },
-  { "eglCreatePbufferSurface", (_EGLProc) eglCreatePbufferSurface },
-  { "eglCreatePixmapSurface", (_EGLProc) eglCreatePixmapSurface },
-  { "eglCreateWindowSurface", (_EGLProc) eglCreateWindowSurface },
-  { "eglDestroyContext", (_EGLProc) eglDestroyContext },
-  { "eglDestroySurface", (_EGLProc) eglDestroySurface },
-  { "eglGetConfigAttrib", (_EGLProc) eglGetConfigAttrib },
-  { "eglGetConfigs", (_EGLProc) eglGetConfigs },
-  { "eglGetCurrentContext", (_EGLProc) eglGetCurrentContext },
-  { "eglGetCurrentDisplay", (_EGLProc) eglGetCurrentDisplay },
-  { "eglGetCurrentSurface", (_EGLProc) eglGetCurrentSurface },
-  { "eglGetDisplay", (_EGLProc) eglGetDisplay },
-  { "eglGetError", (_EGLProc) eglGetError },
-  { "eglGetProcAddress", (_EGLProc) eglGetProcAddress },
-  { "eglInitialize", (_EGLProc) eglInitialize },
-  { "eglMakeCurrent", (_EGLProc) eglMakeCurrent },
-  { "eglQueryAPI", (_EGLProc) eglQueryAPI },
-  { "eglQueryContext", (_EGLProc) eglQueryContext },
-  { "eglQueryString", (_EGLProc) eglQueryString },
-  { "eglQuerySurface", (_EGLProc) eglQuerySurface },
-  { "eglReleaseTexImage", (_EGLProc) eglReleaseTexImage },
-  { "eglReleaseThread", (_EGLProc) eglReleaseThread },
-  { "eglSurfaceAttrib", (_EGLProc) eglSurfaceAttrib },
-  { "eglSwapBuffers", (_EGLProc) eglSwapBuffers },
-  { "eglSwapInterval", (_EGLProc) eglSwapInterval },
-  { "eglTerminate", (_EGLProc) eglTerminate },
-  { "eglWaitClient", (_EGLProc) eglWaitClient },
-  { "eglWaitGL", (_EGLProc) eglWaitGL },
-  { "eglWaitNative", (_EGLProc) eglWaitNative },
-  { "eglCreateSync", (_EGLProc) eglCreateSync },
-  { "eglDestroySync", (_EGLProc) eglDestroySync },
-  { "eglClientWaitSync", (_EGLProc) eglClientWaitSync },
-  { "eglGetSyncAttrib", (_EGLProc) eglGetSyncAttrib },
-  { "eglWaitSync", (_EGLProc) eglWaitSync },
-  { "eglCreateImage", (_EGLProc) eglCreateImage },
-  { "eglDestroyImage", (_EGLProc) eglDestroyImage },
-  { "eglGetPlatformDisplay", (_EGLProc) eglGetPlatformDisplay },
-  { "eglCreatePlatformWindowSurface", (_EGLProc) eglCreatePlatformWindowSurface },
-  { "eglCreatePlatformPixmapSurface", (_EGLProc) eglCreatePlatformPixmapSurface },
-  { "eglCreateImageKHR", (_EGLProc) eglCreateImageKHR },
-  { "eglDestroyImageKHR", (_EGLProc) eglDestroyImageKHR },
-  { "eglCreateSyncKHR", (_EGLProc) eglCreateSyncKHR },
-  { "eglCreateSync64KHR", (_EGLProc) eglCreateSync64KHR },
-  { "eglDestroySyncKHR", (_EGLProc) eglDestroySyncKHR },
-  { "eglClientWaitSyncKHR", (_EGLProc) eglClientWaitSyncKHR },
-  { "eglWaitSyncKHR", (_EGLProc) eglWaitSyncKHR },
-  { "eglSignalSyncKHR", (_EGLProc) eglSignalSyncKHR },
-  { "eglGetSyncAttribKHR", (_EGLProc) eglGetSyncAttribKHR },
-  { "eglSwapBuffersRegionNOK", (_EGLProc) eglSwapBuffersRegionNOK },
-  { "eglCreateDRMImageMESA", (_EGLProc) eglCreateDRMImageMESA },
-  { "eglExportDRMImageMESA", (_EGLProc) eglExportDRMImageMESA },
-  { "eglBindWaylandDisplayWL", (_EGLProc) eglBindWaylandDisplayWL },
-  { "eglUnbindWaylandDisplayWL", (_EGLProc) eglUnbindWaylandDisplayWL },
-  { "eglQueryWaylandBufferWL", (_EGLProc) eglQueryWaylandBufferWL },
-  { "eglCreateWaylandBufferFromImageWL", (_EGLProc) eglCreateWaylandBufferFromImageWL },
-  { "eglPostSubBufferNV", (_EGLProc) eglPostSubBufferNV },
-  { "eglSwapBuffersWithDamageEXT", (_EGLProc) eglSwapBuffersWithDamageEXT },
-  { "eglSwapBuffersWithDamageKHR", (_EGLProc) eglSwapBuffersWithDamageKHR },
-  { "eglGetPlatformDisplayEXT", (_EGLProc) eglGetPlatformDisplayEXT },
-  { "eglCreatePlatformWindowSurfaceEXT", (_EGLProc) eglCreatePlatformWindowSurfaceEXT },
-  { "eglCreatePlatformPixmapSurfaceEXT", 

[Mesa-dev] [PATCH mesa 7/9] egl: distribute all tests

2017-02-21 Thread Eric Engestrom
Signed-off-by: Eric Engestrom 
---
 src/egl/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index bd8903f666..d36a786ab4 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -129,7 +129,7 @@ egl_HEADERS = \
 TESTS = egl-symbols-check
 
 EXTRA_DIST = \
-   egl-symbols-check \
+   $(TESTS) \
SConscript \
drivers/haiku \
main/egl.def \
-- 
Cheers,
  Eric



[Mesa-dev] [PATCH mesa v2 8/9] egl: make sure entrypoints list is always sorted

2017-02-21 Thread Eric Engestrom
Starting with the next commit, a badly sorted list will break
eglGetProcAddress().

Signed-off-by: Eric Engestrom 
---
v2: use sh instead of bash (Emil)
---
 src/egl/Makefile.am  | 3 ++-
 src/egl/egl-entrypoint-check | 5 +
 2 files changed, 7 insertions(+), 1 deletion(-)
 create mode 100755 src/egl/egl-entrypoint-check

diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index d36a786ab4..3477f797d7 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -126,7 +126,8 @@ egl_HEADERS = \
$(top_srcdir)/include/EGL/eglmesaext.h \
$(top_srcdir)/include/EGL/eglplatform.h
 
-TESTS = egl-symbols-check
+TESTS = egl-symbols-check \
+   egl-entrypoint-check
 
 EXTRA_DIST = \
$(TESTS) \
diff --git a/src/egl/egl-entrypoint-check b/src/egl/egl-entrypoint-check
new file mode 100755
index 00..ec33d8e97f
--- /dev/null
+++ b/src/egl/egl-entrypoint-check
@@ -0,0 +1,5 @@
+#!/bin/sh
+
+entrypoints=$(grep EGL_ENTRYPOINT "$srcdir"/main/eglentrypoint.h)
+sorted=$(printf '%s\n' "$entrypoints" | sort)
+test "$entrypoints" = "$sorted"
-- 
Cheers,
  Eric


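As a side note on what "sorted" has to mean here: a bsearch() driven by strcmp() needs the list in byte-wise order, which is what sort(1) produces in the C locale but not necessarily in others. The sketch below is an assumption-laden illustration, not part of the series; names_sorted_for_bsearch and the two sample arrays are invented for it. It checks a list of names the same way the lookup will compare them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when every name compares strictly greater than its predecessor
 * under strcmp(), i.e. the order a strcmp()-based bsearch() expects.
 */
static bool
names_sorted_for_bsearch(const char *const *names, size_t count)
{
   for (size_t i = 1; i < count; i++) {
      /* ">= 0" also rejects duplicates, which would make lookups ambiguous. */
      if (strcmp(names[i - 1], names[i]) >= 0)
         return false;
   }
   return true;
}

/* '6' (0x36) sorts before 'K' (0x4B) in byte order. */
static const char *const good_order[] = {
   "eglCreateSync", "eglCreateSync64KHR", "eglCreateSyncKHR",
};

static const char *const bad_order[] = {
   "eglCreateSync", "eglCreateSyncKHR", "eglCreateSync64KHR",
};
```

A check like this could also run as a debug-build assertion at startup, which sidesteps the locale question entirely.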

[Mesa-dev] [PATCH mesa 1/9] eglapi: add entrypoint for eglDestroyImageKHR

2017-02-21 Thread Eric Engestrom
Signed-off-by: Eric Engestrom 
---
Note for Ilia: I'm not opposed to removing the first 3 patches and
adding a macro for the special cases, but I'll wait until someone else
wants it too.
And if the request comes after this lands, these 3 patches are easy
enough to revert (just ignore the last hunk and you're good).
---
 src/egl/main/eglapi.c | 28 +---
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index cab05c2301..251855cc3b 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -1574,16 +1574,12 @@ eglCreateImage(EGLDisplay dpy, EGLContext ctx, EGLenum target,
 }
 
 
-EGLBoolean EGLAPIENTRY
-eglDestroyImage(EGLDisplay dpy, EGLImage image)
+static EGLBoolean
+_eglDestroyImageCommon(_EGLDisplay *disp, _EGLImage *img)
 {
-   _EGLDisplay *disp = _eglLockDisplay(dpy);
-   _EGLImage *img = _eglLookupImage(image, disp);
_EGLDriver *drv;
EGLBoolean ret;
 
-   _EGL_FUNC_START(disp, EGL_OBJECT_IMAGE_KHR, img, EGL_FALSE);
-
_EGL_CHECK_DISPLAY(disp, EGL_FALSE, drv);
if (!disp->Extensions.KHR_image_base)
   RETURN_EGL_EVAL(disp, EGL_FALSE);
@@ -1596,6 +1592,24 @@ eglDestroyImage(EGLDisplay dpy, EGLImage image)
RETURN_EGL_EVAL(disp, ret);
 }
 
+EGLBoolean EGLAPIENTRY
+eglDestroyImage(EGLDisplay dpy, EGLImage image)
+{
+   _EGLDisplay *disp = _eglLockDisplay(dpy);
+   _EGLImage *img = _eglLookupImage(image, disp);
+   _EGL_FUNC_START(disp, EGL_OBJECT_IMAGE_KHR, img, EGL_FALSE);
+   return _eglDestroyImageCommon(disp, img);
+}
+
+static EGLBoolean EGLAPIENTRY
+eglDestroyImageKHR(EGLDisplay dpy, EGLImage image)
+{
+   _EGLDisplay *disp = _eglLockDisplay(dpy);
+   _EGLImage *img = _eglLookupImage(image, disp);
+   _EGL_FUNC_START(disp, EGL_OBJECT_IMAGE_KHR, img, EGL_FALSE);
+   return _eglDestroyImageCommon(disp, img);
+}
+
 
 static EGLSync
 _eglCreateSync(_EGLDisplay *disp, EGLenum type, const EGLAttrib *attrib_list,
@@ -2361,7 +2375,7 @@ eglGetProcAddress(const char *procname)
   { "eglCreatePlatformWindowSurface", (_EGLProc) eglCreatePlatformWindowSurface },
   { "eglCreatePlatformPixmapSurface", (_EGLProc) eglCreatePlatformPixmapSurface },
   { "eglCreateImageKHR", (_EGLProc) eglCreateImageKHR },
-  { "eglDestroyImageKHR", (_EGLProc) eglDestroyImage },
+  { "eglDestroyImageKHR", (_EGLProc) eglDestroyImageKHR },
   { "eglCreateSyncKHR", (_EGLProc) eglCreateSyncKHR },
   { "eglCreateSync64KHR", (_EGLProc) eglCreateSync64KHR },
   { "eglDestroySyncKHR", (_EGLProc) eglDestroySync },
-- 
Cheers,
  Eric


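The shape of this patch (and of 2/9 and 3/9) is a shared static helper plus one thin wrapper per public name. A stripped-down sketch of that pattern follows. All demo_* identifiers are invented, and recording the caller's name in a global is only an assumed stand-in for whatever per-entrypoint bookkeeping _EGL_FUNC_START performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical object; not an EGL type. */
struct demo_image {
   bool alive;
};

/* Assumed stand-in for per-entrypoint setup: remember who was called. */
static const char *demo_current_entrypoint;

/* Shared implementation, free of entrypoint-specific bookkeeping. */
static bool
demo_destroy_image_common(struct demo_image *img)
{
   if (!img || !img->alive)
      return false;
   img->alive = false;
   return true;
}

/* Core entrypoint: does its own setup, then defers to the helper. */
bool
demoDestroyImage(struct demo_image *img)
{
   demo_current_entrypoint = "demoDestroyImage";
   return demo_destroy_image_common(img);
}

/* Extension alias: same behaviour, but reported under its own name. */
bool
demoDestroyImageKHR(struct demo_image *img)
{
   demo_current_entrypoint = "demoDestroyImageKHR";
   return demo_destroy_image_common(img);
}
```

The cost is a few duplicated lines per alias; the gain is that every name in the lookup table maps to a function of the same name, which is what lets a later patch generate the table from a macro.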

[Mesa-dev] [PATCH mesa 5/9] eglapi: sort entrypoints list

2017-02-21 Thread Eric Engestrom
Let's make that comment true.
It will also be necessary in a couple of commits (using bsearch).

Signed-off-by: Eric Engestrom 
---
 src/egl/main/eglapi.c | 74 +--
 1 file changed, 37 insertions(+), 37 deletions(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 6c90333998..4597e10821 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -2364,15 +2364,38 @@ eglGetProcAddress(const char *procname)
   /* alphabetical order */
   EGL_ENTRYPOINT(eglBindAPI)
   EGL_ENTRYPOINT(eglBindTexImage)
+  EGL_ENTRYPOINT(eglBindWaylandDisplayWL)
   EGL_ENTRYPOINT(eglChooseConfig)
+  EGL_ENTRYPOINT(eglClientWaitSync)
+  EGL_ENTRYPOINT(eglClientWaitSyncKHR)
   EGL_ENTRYPOINT(eglCopyBuffers)
   EGL_ENTRYPOINT(eglCreateContext)
+  EGL_ENTRYPOINT(eglCreateDRMImageMESA)
+  EGL_ENTRYPOINT(eglCreateImage)
+  EGL_ENTRYPOINT(eglCreateImageKHR)
   EGL_ENTRYPOINT(eglCreatePbufferFromClientBuffer)
   EGL_ENTRYPOINT(eglCreatePbufferSurface)
   EGL_ENTRYPOINT(eglCreatePixmapSurface)
+  EGL_ENTRYPOINT(eglCreatePlatformPixmapSurface)
+  EGL_ENTRYPOINT(eglCreatePlatformPixmapSurfaceEXT)
+  EGL_ENTRYPOINT(eglCreatePlatformWindowSurface)
+  EGL_ENTRYPOINT(eglCreatePlatformWindowSurfaceEXT)
+  EGL_ENTRYPOINT(eglCreateSync)
+  EGL_ENTRYPOINT(eglCreateSync64KHR)
+  EGL_ENTRYPOINT(eglCreateSyncKHR)
+  EGL_ENTRYPOINT(eglCreateWaylandBufferFromImageWL)
   EGL_ENTRYPOINT(eglCreateWindowSurface)
+  EGL_ENTRYPOINT(eglDebugMessageControlKHR)
   EGL_ENTRYPOINT(eglDestroyContext)
+  EGL_ENTRYPOINT(eglDestroyImage)
+  EGL_ENTRYPOINT(eglDestroyImageKHR)
   EGL_ENTRYPOINT(eglDestroySurface)
+  EGL_ENTRYPOINT(eglDestroySync)
+  EGL_ENTRYPOINT(eglDestroySyncKHR)
+  EGL_ENTRYPOINT(eglDupNativeFenceFDANDROID)
+  EGL_ENTRYPOINT(eglExportDMABUFImageMESA)
+  EGL_ENTRYPOINT(eglExportDMABUFImageQueryMESA)
+  EGL_ENTRYPOINT(eglExportDRMImageMESA)
   EGL_ENTRYPOINT(eglGetConfigAttrib)
   EGL_ENTRYPOINT(eglGetConfigs)
   EGL_ENTRYPOINT(eglGetCurrentContext)
@@ -2380,61 +2403,38 @@ eglGetProcAddress(const char *procname)
   EGL_ENTRYPOINT(eglGetCurrentSurface)
   EGL_ENTRYPOINT(eglGetDisplay)
   EGL_ENTRYPOINT(eglGetError)
+  EGL_ENTRYPOINT(eglGetPlatformDisplay)
+  EGL_ENTRYPOINT(eglGetPlatformDisplayEXT)
   EGL_ENTRYPOINT(eglGetProcAddress)
+  EGL_ENTRYPOINT(eglGetSyncAttrib)
+  EGL_ENTRYPOINT(eglGetSyncAttribKHR)
+  EGL_ENTRYPOINT(eglGetSyncValuesCHROMIUM)
   EGL_ENTRYPOINT(eglInitialize)
+  EGL_ENTRYPOINT(eglLabelObjectKHR)
   EGL_ENTRYPOINT(eglMakeCurrent)
+  EGL_ENTRYPOINT(eglPostSubBufferNV)
   EGL_ENTRYPOINT(eglQueryAPI)
   EGL_ENTRYPOINT(eglQueryContext)
+  EGL_ENTRYPOINT(eglQueryDebugKHR)
   EGL_ENTRYPOINT(eglQueryString)
   EGL_ENTRYPOINT(eglQuerySurface)
+  EGL_ENTRYPOINT(eglQueryWaylandBufferWL)
   EGL_ENTRYPOINT(eglReleaseTexImage)
   EGL_ENTRYPOINT(eglReleaseThread)
+  EGL_ENTRYPOINT(eglSignalSyncKHR)
   EGL_ENTRYPOINT(eglSurfaceAttrib)
   EGL_ENTRYPOINT(eglSwapBuffers)
+  EGL_ENTRYPOINT(eglSwapBuffersRegionNOK)
+  EGL_ENTRYPOINT(eglSwapBuffersWithDamageEXT)
+  EGL_ENTRYPOINT(eglSwapBuffersWithDamageKHR)
   EGL_ENTRYPOINT(eglSwapInterval)
   EGL_ENTRYPOINT(eglTerminate)
+  EGL_ENTRYPOINT(eglUnbindWaylandDisplayWL)
   EGL_ENTRYPOINT(eglWaitClient)
   EGL_ENTRYPOINT(eglWaitGL)
   EGL_ENTRYPOINT(eglWaitNative)
-  EGL_ENTRYPOINT(eglCreateSync)
-  EGL_ENTRYPOINT(eglDestroySync)
-  EGL_ENTRYPOINT(eglClientWaitSync)
-  EGL_ENTRYPOINT(eglGetSyncAttrib)
   EGL_ENTRYPOINT(eglWaitSync)
-  EGL_ENTRYPOINT(eglCreateImage)
-  EGL_ENTRYPOINT(eglDestroyImage)
-  EGL_ENTRYPOINT(eglGetPlatformDisplay)
-  EGL_ENTRYPOINT(eglCreatePlatformWindowSurface)
-  EGL_ENTRYPOINT(eglCreatePlatformPixmapSurface)
-  EGL_ENTRYPOINT(eglCreateImageKHR)
-  EGL_ENTRYPOINT(eglDestroyImageKHR)
-  EGL_ENTRYPOINT(eglCreateSyncKHR)
-  EGL_ENTRYPOINT(eglCreateSync64KHR)
-  EGL_ENTRYPOINT(eglDestroySyncKHR)
-  EGL_ENTRYPOINT(eglClientWaitSyncKHR)
   EGL_ENTRYPOINT(eglWaitSyncKHR)
-  EGL_ENTRYPOINT(eglSignalSyncKHR)
-  EGL_ENTRYPOINT(eglGetSyncAttribKHR)
-  EGL_ENTRYPOINT(eglSwapBuffersRegionNOK)
-  EGL_ENTRYPOINT(eglCreateDRMImageMESA)
-  EGL_ENTRYPOINT(eglExportDRMImageMESA)
-  EGL_ENTRYPOINT(eglBindWaylandDisplayWL)
-  EGL_ENTRYPOINT(eglUnbindWaylandDisplayWL)
-  EGL_ENTRYPOINT(eglQueryWaylandBufferWL)
-  EGL_ENTRYPOINT(eglCreateWaylandBufferFromImageWL)
-  EGL_ENTRYPOINT(eglPostSubBufferNV)
-  EGL_ENTRYPOINT(eglSwapBuffersWithDamageEXT)
-  EGL_ENTRYPOINT(eglSwapBuffersWithDamageKHR)
-  EGL_ENTRYPOINT(eglGetPlatformDisplayEXT)
-  EGL_ENTRYPOINT(eglCreatePlatformWindowSurfaceEXT)
-  

[Mesa-dev] [PATCH mesa 3/9] eglapi: add entrypoint for eglClientWaitSyncKHR

2017-02-21 Thread Eric Engestrom
Signed-off-by: Eric Engestrom 
---
 src/egl/main/eglapi.c | 31 ---
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index e149c0f8d1..e44375a106 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -1762,16 +1762,13 @@ eglDestroySyncKHR(EGLDisplay dpy, EGLSync sync)
 }
 
 
-EGLint EGLAPIENTRY
-eglClientWaitSync(EGLDisplay dpy, EGLSync sync, EGLint flags, EGLTime timeout)
+static EGLint
+_eglClientWaitSyncCommon(_EGLDisplay *disp, EGLDisplay dpy,
+ _EGLSync *s, EGLint flags, EGLTime timeout)
 {
-   _EGLDisplay *disp = _eglLockDisplay(dpy);
-   _EGLSync *s = _eglLookupSync(sync, disp);
_EGLDriver *drv;
EGLint ret;
 
-   _EGL_FUNC_START(disp, EGL_OBJECT_SYNC_KHR, s, EGL_FALSE);
-
_EGL_CHECK_SYNC(disp, s, EGL_FALSE, drv);
assert(disp->Extensions.KHR_reusable_sync ||
   disp->Extensions.KHR_fence_sync ||
@@ -1800,6 +1797,26 @@ eglClientWaitSync(EGLDisplay dpy, EGLSync sync, EGLint flags, EGLTime timeout)
   RETURN_EGL_EVAL(disp, ret);
 }
 
+EGLint EGLAPIENTRY
+eglClientWaitSync(EGLDisplay dpy, EGLSync sync,
+  EGLint flags, EGLTime timeout)
+{
+   _EGLDisplay *disp = _eglLockDisplay(dpy);
+   _EGLSync *s = _eglLookupSync(sync, disp);
+   _EGL_FUNC_START(disp, EGL_OBJECT_SYNC_KHR, s, EGL_FALSE);
+   return _eglClientWaitSyncCommon(disp, dpy, s, flags, timeout);
+}
+
+static EGLint EGLAPIENTRY
+eglClientWaitSyncKHR(EGLDisplay dpy, EGLSync sync,
+ EGLint flags, EGLTime timeout)
+{
+   _EGLDisplay *disp = _eglLockDisplay(dpy);
+   _EGLSync *s = _eglLookupSync(sync, disp);
+   _EGL_FUNC_START(disp, EGL_OBJECT_SYNC_KHR, s, EGL_FALSE);
+   return _eglClientWaitSyncCommon(disp, dpy, s, flags, timeout);
+}
+
 
 static EGLint
 _eglWaitSyncCommon(_EGLDisplay *disp, _EGLSync *s, EGLint flags)
@@ -2393,7 +2410,7 @@ eglGetProcAddress(const char *procname)
   { "eglCreateSyncKHR", (_EGLProc) eglCreateSyncKHR },
   { "eglCreateSync64KHR", (_EGLProc) eglCreateSync64KHR },
   { "eglDestroySyncKHR", (_EGLProc) eglDestroySyncKHR },
-  { "eglClientWaitSyncKHR", (_EGLProc) eglClientWaitSync },
+  { "eglClientWaitSyncKHR", (_EGLProc) eglClientWaitSyncKHR },
   { "eglWaitSyncKHR", (_EGLProc) eglWaitSyncKHR },
   { "eglSignalSyncKHR", (_EGLProc) eglSignalSyncKHR },
   { "eglGetSyncAttribKHR", (_EGLProc) eglGetSyncAttribKHR },
-- 
Cheers,
  Eric



[Mesa-dev] [PATCH mesa v2 6/9] eglapi: move entrypoints list out to its own file

2017-02-21 Thread Eric Engestrom
This will allow us to make sure the list is always sorted in the next
commit.

Signed-off-by: Eric Engestrom 
---
v2: use .h extension for the new file, and track it in LIBEGL_C_FILES (Emil)
---
 src/egl/Makefile.sources |  1 +
 src/egl/main/eglapi.c| 78 +---
 src/egl/main/eglentrypoint.h | 77 +++
 3 files changed, 79 insertions(+), 77 deletions(-)
 create mode 100644 src/egl/main/eglentrypoint.h

diff --git a/src/egl/Makefile.sources b/src/egl/Makefile.sources
index 48db8518f8..e6fd3f114c 100644
--- a/src/egl/Makefile.sources
+++ b/src/egl/Makefile.sources
@@ -26,6 +26,7 @@ LIBEGL_C_FILES := \
main/eglsurface.h \
main/eglsync.c \
main/eglsync.h \
+   main/eglentrypoint.h \
main/egltypedefs.h
 
 dri2_backend_core_FILES := \
diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 4597e10821..77ec5426ec 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -2358,83 +2358,7 @@ eglGetProcAddress(const char *procname)
   _EGLProc function;
} egl_functions[] = {
 #define EGL_ENTRYPOINT(f) { .name = #f, .function = (_EGLProc) f },
-  /* core functions queryable in the presence of
-   * EGL_KHR_get_all_proc_addresses or EGL 1.5
-   */
-  /* alphabetical order */
-  EGL_ENTRYPOINT(eglBindAPI)
-  EGL_ENTRYPOINT(eglBindTexImage)
-  EGL_ENTRYPOINT(eglBindWaylandDisplayWL)
-  EGL_ENTRYPOINT(eglChooseConfig)
-  EGL_ENTRYPOINT(eglClientWaitSync)
-  EGL_ENTRYPOINT(eglClientWaitSyncKHR)
-  EGL_ENTRYPOINT(eglCopyBuffers)
-  EGL_ENTRYPOINT(eglCreateContext)
-  EGL_ENTRYPOINT(eglCreateDRMImageMESA)
-  EGL_ENTRYPOINT(eglCreateImage)
-  EGL_ENTRYPOINT(eglCreateImageKHR)
-  EGL_ENTRYPOINT(eglCreatePbufferFromClientBuffer)
-  EGL_ENTRYPOINT(eglCreatePbufferSurface)
-  EGL_ENTRYPOINT(eglCreatePixmapSurface)
-  EGL_ENTRYPOINT(eglCreatePlatformPixmapSurface)
-  EGL_ENTRYPOINT(eglCreatePlatformPixmapSurfaceEXT)
-  EGL_ENTRYPOINT(eglCreatePlatformWindowSurface)
-  EGL_ENTRYPOINT(eglCreatePlatformWindowSurfaceEXT)
-  EGL_ENTRYPOINT(eglCreateSync)
-  EGL_ENTRYPOINT(eglCreateSync64KHR)
-  EGL_ENTRYPOINT(eglCreateSyncKHR)
-  EGL_ENTRYPOINT(eglCreateWaylandBufferFromImageWL)
-  EGL_ENTRYPOINT(eglCreateWindowSurface)
-  EGL_ENTRYPOINT(eglDebugMessageControlKHR)
-  EGL_ENTRYPOINT(eglDestroyContext)
-  EGL_ENTRYPOINT(eglDestroyImage)
-  EGL_ENTRYPOINT(eglDestroyImageKHR)
-  EGL_ENTRYPOINT(eglDestroySurface)
-  EGL_ENTRYPOINT(eglDestroySync)
-  EGL_ENTRYPOINT(eglDestroySyncKHR)
-  EGL_ENTRYPOINT(eglDupNativeFenceFDANDROID)
-  EGL_ENTRYPOINT(eglExportDMABUFImageMESA)
-  EGL_ENTRYPOINT(eglExportDMABUFImageQueryMESA)
-  EGL_ENTRYPOINT(eglExportDRMImageMESA)
-  EGL_ENTRYPOINT(eglGetConfigAttrib)
-  EGL_ENTRYPOINT(eglGetConfigs)
-  EGL_ENTRYPOINT(eglGetCurrentContext)
-  EGL_ENTRYPOINT(eglGetCurrentDisplay)
-  EGL_ENTRYPOINT(eglGetCurrentSurface)
-  EGL_ENTRYPOINT(eglGetDisplay)
-  EGL_ENTRYPOINT(eglGetError)
-  EGL_ENTRYPOINT(eglGetPlatformDisplay)
-  EGL_ENTRYPOINT(eglGetPlatformDisplayEXT)
-  EGL_ENTRYPOINT(eglGetProcAddress)
-  EGL_ENTRYPOINT(eglGetSyncAttrib)
-  EGL_ENTRYPOINT(eglGetSyncAttribKHR)
-  EGL_ENTRYPOINT(eglGetSyncValuesCHROMIUM)
-  EGL_ENTRYPOINT(eglInitialize)
-  EGL_ENTRYPOINT(eglLabelObjectKHR)
-  EGL_ENTRYPOINT(eglMakeCurrent)
-  EGL_ENTRYPOINT(eglPostSubBufferNV)
-  EGL_ENTRYPOINT(eglQueryAPI)
-  EGL_ENTRYPOINT(eglQueryContext)
-  EGL_ENTRYPOINT(eglQueryDebugKHR)
-  EGL_ENTRYPOINT(eglQueryString)
-  EGL_ENTRYPOINT(eglQuerySurface)
-  EGL_ENTRYPOINT(eglQueryWaylandBufferWL)
-  EGL_ENTRYPOINT(eglReleaseTexImage)
-  EGL_ENTRYPOINT(eglReleaseThread)
-  EGL_ENTRYPOINT(eglSignalSyncKHR)
-  EGL_ENTRYPOINT(eglSurfaceAttrib)
-  EGL_ENTRYPOINT(eglSwapBuffers)
-  EGL_ENTRYPOINT(eglSwapBuffersRegionNOK)
-  EGL_ENTRYPOINT(eglSwapBuffersWithDamageEXT)
-  EGL_ENTRYPOINT(eglSwapBuffersWithDamageKHR)
-  EGL_ENTRYPOINT(eglSwapInterval)
-  EGL_ENTRYPOINT(eglTerminate)
-  EGL_ENTRYPOINT(eglUnbindWaylandDisplayWL)
-  EGL_ENTRYPOINT(eglWaitClient)
-  EGL_ENTRYPOINT(eglWaitGL)
-  EGL_ENTRYPOINT(eglWaitNative)
-  EGL_ENTRYPOINT(eglWaitSync)
-  EGL_ENTRYPOINT(eglWaitSyncKHR)
+#include "eglentrypoint.h"
 #undef EGL_ENTRYPOINT
};
EGLint i;
diff --git a/src/egl/main/eglentrypoint.h b/src/egl/main/eglentrypoint.h
new file mode 100644
index 00..e6318b9311
--- /dev/null
+++ b/src/egl/main/eglentrypoint.h
@@ -0,0 +1,77 @@
+/* core functions queryable in the presence of
+ * EGL_KHR_get_all_proc_addresses or EGL 1.5
+ */
+/* alphabetical order */
+EGL_ENTRYPOINT(eglBindAPI)
+EGL_ENTRYPOINT(eglBindTexImage)
+EGL_ENTRYPOINT(eglBindWaylandDisplayWL)

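The macro-plus-list arrangement used in this series is the classic X-macro idiom. A compact, single-file sketch of it is below. To keep the example self-contained, the list is held in a DEMO_ENTRYPOINT_LIST macro rather than in a separate included header; that macro, the demo_* functions and demo_proc are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*demo_proc)(void);

static void demoBindAPI(void) {}
static void demoChooseConfig(void) {}
static void demoTerminate(void) {}

/* Stand-in for the contents of a separate list file: one macro call
 * per entrypoint, kept in alphabetical order.
 */
#define DEMO_ENTRYPOINT_LIST \
   DEMO_ENTRYPOINT(demoBindAPI) \
   DEMO_ENTRYPOINT(demoChooseConfig) \
   DEMO_ENTRYPOINT(demoTerminate)

static const struct {
   const char *name;
   demo_proc function;
} demo_functions[] = {
   /* #f stringifies the identifier, so name and pointer cannot drift apart. */
#define DEMO_ENTRYPOINT(f) { .name = #f, .function = (demo_proc) f },
   DEMO_ENTRYPOINT_LIST
#undef DEMO_ENTRYPOINT
};

static const size_t demo_function_count =
   sizeof(demo_functions) / sizeof(demo_functions[0]);
```

Keeping the list in its own file, as the patch does, has the extra benefit that a plain grep and sort over that file is enough to check its order.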
[Mesa-dev] [PATCH mesa 2/9] eglapi: add entrypoint for eglDestroySyncKHR

2017-02-21 Thread Eric Engestrom
Signed-off-by: Eric Engestrom 
---
 src/egl/main/eglapi.c | 28 +---
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 251855cc3b..e149c0f8d1 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -1726,16 +1726,12 @@ eglCreateSync(EGLDisplay dpy, EGLenum type, const EGLAttrib *attrib_list)
 }
 
 
-EGLBoolean EGLAPIENTRY
-eglDestroySync(EGLDisplay dpy, EGLSync sync)
+static EGLBoolean
+_eglDestroySync(_EGLDisplay *disp, _EGLSync *s)
 {
-   _EGLDisplay *disp = _eglLockDisplay(dpy);
-   _EGLSync *s = _eglLookupSync(sync, disp);
_EGLDriver *drv;
EGLBoolean ret;
 
-   _EGL_FUNC_START(disp, EGL_OBJECT_SYNC_KHR, s, EGL_FALSE);
-
_EGL_CHECK_SYNC(disp, s, EGL_FALSE, drv);
assert(disp->Extensions.KHR_reusable_sync ||
   disp->Extensions.KHR_fence_sync ||
@@ -1747,6 +1743,24 @@ eglDestroySync(EGLDisplay dpy, EGLSync sync)
RETURN_EGL_EVAL(disp, ret);
 }
 
+EGLBoolean EGLAPIENTRY
+eglDestroySync(EGLDisplay dpy, EGLSync sync)
+{
+   _EGLDisplay *disp = _eglLockDisplay(dpy);
+   _EGLSync *s = _eglLookupSync(sync, disp);
+   _EGL_FUNC_START(disp, EGL_OBJECT_SYNC_KHR, s, EGL_FALSE);
+   return _eglDestroySync(disp, s);
+}
+
+static EGLBoolean EGLAPIENTRY
+eglDestroySyncKHR(EGLDisplay dpy, EGLSync sync)
+{
+   _EGLDisplay *disp = _eglLockDisplay(dpy);
+   _EGLSync *s = _eglLookupSync(sync, disp);
+   _EGL_FUNC_START(disp, EGL_OBJECT_SYNC_KHR, s, EGL_FALSE);
+   return _eglDestroySync(disp, s);
+}
+
 
 EGLint EGLAPIENTRY
 eglClientWaitSync(EGLDisplay dpy, EGLSync sync, EGLint flags, EGLTime timeout)
@@ -2378,7 +2392,7 @@ eglGetProcAddress(const char *procname)
   { "eglDestroyImageKHR", (_EGLProc) eglDestroyImageKHR },
   { "eglCreateSyncKHR", (_EGLProc) eglCreateSyncKHR },
   { "eglCreateSync64KHR", (_EGLProc) eglCreateSync64KHR },
-  { "eglDestroySyncKHR", (_EGLProc) eglDestroySync },
+  { "eglDestroySyncKHR", (_EGLProc) eglDestroySyncKHR },
   { "eglClientWaitSyncKHR", (_EGLProc) eglClientWaitSync },
   { "eglWaitSyncKHR", (_EGLProc) eglWaitSyncKHR },
   { "eglSignalSyncKHR", (_EGLProc) eglSignalSyncKHR },
-- 
Cheers,
  Eric

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] [swr] fix index buffers with non-zero indices

2017-02-21 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak 

> On Feb 17, 2017, at 2:30 PM, George Kyriazis  
> wrote:
> 
> Fix issue with index buffers that do not contain a 0 index.  0 index
> can be a non-valid index if the (copied) vertex buffers are a subset of the
> user's (which happens because we only copy the range between min & max).
> Core will use an index passed in from the driver to replace invalid indices.
> 
> Only do this for calls that contain non-zero indices, to minimize performance
> cost.
> ---
> src/gallium/drivers/swr/rasterizer/core/state.h|  1 +
> .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 60 +++---
> .../drivers/swr/rasterizer/jitter/fetch_jit.h  |  2 +
> src/gallium/drivers/swr/swr_draw.cpp   |  1 +
> src/gallium/drivers/swr/swr_state.cpp  |  4 ++
> 5 files changed, 62 insertions(+), 6 deletions(-)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/core/state.h 
> b/src/gallium/drivers/swr/rasterizer/core/state.h
> index 2f3b913..05347dc 100644
> --- a/src/gallium/drivers/swr/rasterizer/core/state.h
> +++ b/src/gallium/drivers/swr/rasterizer/core/state.h
> @@ -524,6 +524,7 @@ struct SWR_VERTEX_BUFFER_STATE
> const uint8_t *pData;
> uint32_t size;
> uint32_t numaNode;
> +uint32_t minVertex; // min vertex (for bounds checking)
> uint32_t maxVertex; // size / pitch.  precalculated value 
> used by fetch shader for OOB checks
> uint32_t partialInboundsSize;   // size % pitch.  precalculated value 
> used by fetch shader for partially OOB vertices
> };
> diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp 
> b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
> index 901bce6..ffa7605 100644
> --- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
> +++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.cpp
> @@ -309,11 +309,29 @@ void FetchJit::JitLoadVertices(const 
> FETCH_COMPILE_STATE , Value* str
> 
> Value* startVertexOffset = MUL(Z_EXT(startOffset, mInt64Ty), stride);
> 
> +Value *minVertex = NULL;
> +Value *minVertexOffset = NULL;
> +if (fetchState.bPartialVertexBuffer) {
> +// fetch min index for low bounds checking
> +minVertex = GEP(streams, {C(ied.StreamIndex), 
> C(SWR_VERTEX_BUFFER_STATE_minVertex)});
> +minVertex = LOAD(minVertex);
> +if (!fetchState.bDisableIndexOOBCheck) {
> +minVertexOffset = MUL(Z_EXT(minVertex, mInt64Ty), stride);
> +}
> +}
> +
> // Load from the stream.
> for(uint32_t lane = 0; lane < mVWidth; ++lane)
> {
> // Get index
> Value* index = VEXTRACT(vCurIndices, C(lane));
> +
> +if (fetchState.bPartialVertexBuffer) {
> +// clamp below minvertex
> +Value *isBelowMin = ICMP_SLT(index, minVertex);
> +index = SELECT(isBelowMin, minVertex, index);
> +}
> +
> index = Z_EXT(index, mInt64Ty);
> 
> Value*offset = MUL(index, stride);
> @@ -321,10 +339,14 @@ void FetchJit::JitLoadVertices(const 
> FETCH_COMPILE_STATE , Value* str
> offset = ADD(offset, startVertexOffset);
> 
> if (!fetchState.bDisableIndexOOBCheck) {
> -// check for out of bound access, including partial OOB, and 
> mask them to 0
> +// check for out of bound access, including partial OOB, and 
> replace them with minVertex
> Value *endOffset = ADD(offset, C((int64_t)info.Bpp));
> Value *oob = ICMP_ULE(endOffset, size);
> -offset = SELECT(oob, offset, ConstantInt::get(mInt64Ty, 0));
> +if (fetchState.bPartialVertexBuffer) {
> +offset = SELECT(oob, offset, minVertexOffset);
> +} else {
> +offset = SELECT(oob, offset, ConstantInt::get(mInt64Ty, 
> 0));
> +}
> }
> 
> Value*pointer = GEP(stream, offset);
> @@ -732,6 +754,13 @@ void FetchJit::JitGatherVertices(const 
> FETCH_COMPILE_STATE ,
> Value *maxVertex = GEP(streams, {C(ied.StreamIndex), 
> C(SWR_VERTEX_BUFFER_STATE_maxVertex)});
> maxVertex = LOAD(maxVertex);
> 
> +Value *minVertex = NULL;
> +if (fetchState.bPartialVertexBuffer) {
> +// min vertex index for low bounds OOB checking
> +minVertex = GEP(streams, {C(ied.StreamIndex), 
> C(SWR_VERTEX_BUFFER_STATE_minVertex)});
> +minVertex = LOAD(minVertex);
> +}
> +
> Value *vCurIndices;
> Value *startOffset;
> if(ied.InstanceEnable)
> @@ -769,9 +798,16 @@ void FetchJit::JitGatherVertices(const 
> FETCH_COMPILE_STATE ,
> 
> // if we have a start offset, subtract from max vertex. Used for OOB 
> check
> maxVertex = SUB(Z_EXT(maxVertex, mInt64Ty), 
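
The commit message above describes two substitutions: an index below the first copied vertex is clamped up to minVertex, and an out-of-bounds fetch falls back to the minVertex offset instead of offset 0. A minimal C sketch of that idea follows. It is not the JIT code from the patch; the helper names clamp_to_min_vertex and safe_offset are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: an index below the first copied vertex is replaced
 * by min_vertex, mirroring the ICMP_SLT/SELECT pair in the patch. */
static int32_t clamp_to_min_vertex(int32_t index, int32_t min_vertex)
{
    return index < min_vertex ? min_vertex : index;
}

/* Hypothetical helper: returns the byte offset to fetch from. If the
 * element would end past `size`, fall back to the offset of min_vertex
 * rather than to offset 0. */
static uint64_t safe_offset(uint64_t offset, uint64_t elem_bytes,
                            uint64_t size, uint64_t min_vertex_offset)
{
    assert(elem_bytes > 0);
    uint64_t end = offset + elem_bytes;
    return end <= size ? offset : min_vertex_offset;
}
```

With a 64-byte buffer and 16-byte elements, an element starting at byte 48 is fetched as-is, while one starting at byte 56 is redirected to the fallback offset.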

Re: [Mesa-dev] [PATCH 1/2] [swr] Add fetch shader cache

2017-02-21 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak 

> On Feb 17, 2017, at 2:30 PM, George Kyriazis  
> wrote:
> 
> For now, the cache key is all of FETCH_COMPILE_STATE.
> 
> Use new/delete for swr_vertex_element_state, since we have to call the
> constructors/destructors of the struct elements.
> ---
> src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h |  2 +-
> src/gallium/drivers/swr/swr_draw.cpp  | 19 +++
> src/gallium/drivers/swr/swr_shader.cpp| 14 ++
> src/gallium/drivers/swr/swr_shader.h  | 15 +++
> src/gallium/drivers/swr/swr_state.cpp |  6 --
> src/gallium/drivers/swr/swr_state.h   |  9 +
> 6 files changed, 50 insertions(+), 15 deletions(-)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h 
> b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h
> index 1547453..622608a 100644
> --- a/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h
> +++ b/src/gallium/drivers/swr/rasterizer/jitter/fetch_jit.h
> @@ -94,7 +94,7 @@ enum ComponentControl
> //
> struct FETCH_COMPILE_STATE
> {
> -uint32_t numAttribs;
> +uint32_t numAttribs {0};
> INPUT_ELEMENT_DESC layout[KNOB_NUM_ATTRIBUTES];
> SWR_FORMAT indexType;
> uint32_t cutIndex{ 0x };
> diff --git a/src/gallium/drivers/swr/swr_draw.cpp 
> b/src/gallium/drivers/swr/swr_draw.cpp
> index c4d5e5c..4bdd3bb 100644
> --- a/src/gallium/drivers/swr/swr_draw.cpp
> +++ b/src/gallium/drivers/swr/swr_draw.cpp
> @@ -141,19 +141,22 @@ swr_draw_vbo(struct pipe_context *pipe, const struct 
> pipe_draw_info *info)
>}
> 
>struct swr_vertex_element_state *velems = ctx->velems;
> -   if (!velems->fsFunc
> -   || (velems->fsState.cutIndex != info->restart_index)
> -   || (velems->fsState.bEnableCutIndex != info->primitive_restart)) {
> -
> -  velems->fsState.cutIndex = info->restart_index;
> -  velems->fsState.bEnableCutIndex = info->primitive_restart;
> -
> -  /* Create Fetch Shader */
> +   velems->fsState.cutIndex = info->restart_index;
> +   velems->fsState.bEnableCutIndex = info->primitive_restart;
> +
> +   swr_jit_fetch_key key;
> +   swr_generate_fetch_key(key, velems);
> +   auto search = velems->map.find(key);
> +   if (search != velems->map.end()) {
> +  velems->fsFunc = search->second;
> +   } else {
>   HANDLE hJitMgr = swr_screen(ctx->pipe.screen)->hJitMgr;
>   velems->fsFunc = JitCompileFetch(hJitMgr, velems->fsState);
> 
>   debug_printf("fetch shader %p\n", velems->fsFunc);
>   assert(velems->fsFunc && "Error: FetchShader = NULL");
> +
> +  velems->map.insert(std::make_pair(key, velems->fsFunc));
>}
> 
>SwrSetFetchFunc(ctx->swrContext, velems->fsFunc);
> diff --git a/src/gallium/drivers/swr/swr_shader.cpp 
> b/src/gallium/drivers/swr/swr_shader.cpp
> index 979a28b..676938c 100644
> --- a/src/gallium/drivers/swr/swr_shader.cpp
> +++ b/src/gallium/drivers/swr/swr_shader.cpp
> @@ -61,6 +61,11 @@ bool operator==(const swr_jit_vs_key , const 
> swr_jit_vs_key )
>return !memcmp(, , sizeof(lhs));
> }
> 
> +bool operator==(const swr_jit_fetch_key , const swr_jit_fetch_key )
> +{
> +   return !memcmp(, , sizeof(lhs));
> +}
> +
> static void
> swr_generate_sampler_key(const struct lp_tgsi_info ,
>  struct swr_context *ctx,
> @@ -157,6 +162,15 @@ swr_generate_vs_key(struct swr_jit_vs_key ,
>swr_generate_sampler_key(swr_vs->info, ctx, PIPE_SHADER_VERTEX, key);
> }
> 
> +void
> +swr_generate_fetch_key(struct swr_jit_fetch_key ,
> +   struct swr_vertex_element_state *velems)
> +{
> +   memset(, 0, sizeof(key));
> +
> +   key.fsState = velems->fsState;
> +}
> +
> struct BuilderSWR : public Builder {
>BuilderSWR(JitManager *pJitMgr, const char *pName)
>   : Builder(pJitMgr)
> diff --git a/src/gallium/drivers/swr/swr_shader.h 
> b/src/gallium/drivers/swr/swr_shader.h
> index 7e3399c..266573f 100644
> --- a/src/gallium/drivers/swr/swr_shader.h
> +++ b/src/gallium/drivers/swr/swr_shader.h
> @@ -42,6 +42,9 @@ void swr_generate_vs_key(struct swr_jit_vs_key ,
>  struct swr_context *ctx,
>  swr_vertex_shader *swr_vs);
> 
> +void swr_generate_fetch_key(struct swr_jit_fetch_key ,
> +struct swr_vertex_element_state *velems);
> +
> struct swr_jit_sampler_key {
>unsigned nr_samplers;
>unsigned nr_sampler_views;
> @@ -60,6 +63,10 @@ struct swr_jit_vs_key : swr_jit_sampler_key {
>unsigned clip_plane_mask; // from rasterizer state & vs_info
> };
> 
> +struct swr_jit_fetch_key {
> +   FETCH_COMPILE_STATE fsState;
> +};
> +
> namespace std
> {
> template <> struct hash {
> @@ -75,7 +82,15 @@ template <> struct hash {
>   return util_hash_crc32(, sizeof(k));
>}
> };
> +
> +template <> struct hash 
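
The cache above keys on the whole compile state and compares keys with memcmp, which is why swr_generate_fetch_key zeroes the key before filling it in. A minimal C sketch of that pattern follows, assuming a small stand-in struct; fetch_key, make_fetch_key and fetch_key_equal are hypothetical names, not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a compile-state key. */
struct fetch_key {
    uint32_t enable_cut_index;
    uint32_t cut_index;
    uint32_t num_attribs;
};

/* Zero the whole key first so every byte, including any padding a
 * compiler might add, is deterministic; then fill in the fields. */
static void make_fetch_key(struct fetch_key *key, bool enable_cut_index,
                           uint32_t cut_index, uint32_t num_attribs)
{
    assert(key != NULL);
    memset(key, 0, sizeof(*key));
    key->enable_cut_index = enable_cut_index ? 1u : 0u;
    key->cut_index = cut_index;
    key->num_attribs = num_attribs;
}

/* Byte-wise comparison, as in the memcmp-based operator== of the patch. */
static bool fetch_key_equal(const struct fetch_key *a,
                            const struct fetch_key *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

Two keys built from the same inputs compare equal even if the storage held stale bytes beforehand, which is what lets a byte-wise hash and comparison stand in for a field-by-field one.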

Re: [Mesa-dev] [PATCH 2/3] isl: add MCS width constraint 16 samples

2017-02-21 Thread Jason Ekstrand
On Tue, Feb 21, 2017 at 3:44 PM, Chad Versace 
wrote:

> On Mon 20 Feb 2017, Jason Ekstrand wrote:
> > On Mon, Feb 20, 2017 at 10:33 AM, Lionel Landwerlin <
> > lionel.g.landwer...@intel.com> wrote:
> >
> > >> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > >> index 1a47da5..6979063 100644
> > >> --- a/src/intel/isl/isl.c
> > >> +++ b/src/intel/isl/isl.c
> > >> @@ -1417,6 +1417,16 @@ isl_surf_get_mcs_surf(const struct isl_device
> *dev,
> > >>  assert(surf->levels == 1);
> > >>  assert(surf->logical_level0_px.depth == 1);
> > >>   +   /* The "Auxiliary Surface Pitch" field in RENDER_SURFACE_STATE
> is
> > >> only 9
> > >> +* bits which means the maximum pitch of a compression surface is
> 512
> > >> +* tiles or 64KB (since MCS is always Y-tiled).  Since a 16x MCS
> > >> buffer is
> > >> +* 64bpp, this gives us a maximum width of 8192 pixels.  We can
> create
> > >> +* larger multisampled surfaces, we just can't compress them.
>  For
> > >> 2x, 4x,
> > >> +* and 8x, we have enough room for the full 16k supported by the
> > >> hardware.
> > >> +*/
> > >> +   if (surf->samples == 16 && surf->width > 8192)
> > >> +  return false;
> > >> +
> > >>
> > >
> > > I was about to write something like this :
> > >
> > >struct isl_tile_info tile_info;
> > >isl_surf_get_tile_info(dev, surf, _info);
> > >if ((surf->row_pitch / tile_info.phys_extent_B.width) > 512)
> > >   return false;
> > >
> >
> > That would work too and it is a bit more general.  However, ISL currently
> > doesn't touch the isl_surf if creation fails.  I wouldn't mind keeping
> > that.  Also, I like that the end result of the restriction is clearly
> > spelled out with the old check.  I can't say that I care all that much
> one
> > way or the other so long as both the effect (16x 16k surfaces not
> working)
> > and the reason (pitch) are documented.
>
> I don't understand how Lionel's suggestion would modify the isl_surf on
> failure. The isl_surf parameter to isl_surf_get_tile_info() is const.
>

It modifies the out isl_surf regardless of whether the function succeeds or
not, so you get a partial fill-out.  Meh.


> Either way looks good to me.
> Reviewed-by: Chad Versace 
>


Re: [Mesa-dev] [PATCH 2/3] isl: add MCS width constraint 16 samples

2017-02-21 Thread Chad Versace
On Mon 20 Feb 2017, Jason Ekstrand wrote:
> On Mon, Feb 20, 2017 at 10:33 AM, Lionel Landwerlin <
> lionel.g.landwer...@intel.com> wrote:
> 
> >> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> >> index 1a47da5..6979063 100644
> >> --- a/src/intel/isl/isl.c
> >> +++ b/src/intel/isl/isl.c
> >> @@ -1417,6 +1417,16 @@ isl_surf_get_mcs_surf(const struct isl_device *dev,
> >>  assert(surf->levels == 1);
> >>  assert(surf->logical_level0_px.depth == 1);
> >>   +   /* The "Auxiliary Surface Pitch" field in RENDER_SURFACE_STATE is
> >> only 9
> >> +* bits which means the maximum pitch of a compression surface is 512
> >> +* tiles or 64KB (since MCS is always Y-tiled).  Since a 16x MCS
> >> buffer is
> >> +* 64bpp, this gives us a maximum width of 8192 pixels.  We can create
> >> +* larger multisampled surfaces, we just can't compress them.   For
> >> 2x, 4x,
> >> +* and 8x, we have enough room for the full 16k supported by the
> >> hardware.
> >> +*/
> >> +   if (surf->samples == 16 && surf->width > 8192)
> >> +  return false;
> >> +
> >>
> >
> > I was about to write something like this :
> >
> >struct isl_tile_info tile_info;
> >isl_surf_get_tile_info(dev, surf, _info);
> >if ((surf->row_pitch / tile_info.phys_extent_B.width) > 512)
> >   return false;
> >
> 
> That would work too and it is a bit more general.  However, ISL currently
> doesn't touch the isl_surf if creation fails.  I wouldn't mind keeping
> that.  Also, I like that the end result of the restriction is clearly
> spelled out with the old check.  I can't say that I care all that much one
> way or the other so long as both the effect (16x 16k surfaces not working)
> and the reason (pitch) are documented.

I don't understand how Lionel's suggestion would modify the isl_surf on
failure. The isl_surf parameter to isl_surf_get_tile_info() is const.

Either way looks good to me.
Reviewed-by: Chad Versace 
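
The arithmetic in the quoted patch comment can be checked with a short C sketch. The 9-bit field width and the 64bpp size of a 16x MCS come from the comment itself, and a Y tile is 128 bytes wide; the function name max_mcs_width_px is hypothetical, and the 32bpp figure mentioned for 8x afterwards is my recollection of the MCS formats rather than something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

enum {
    AUX_PITCH_BITS     = 9,    /* width of the pitch field, per the patch comment */
    Y_TILE_WIDTH_BYTES = 128   /* a Y tile is 128 bytes wide */
};

/* Maximum surface width in pixels whose MCS row pitch still fits in
 * the pitch field, for a given MCS element size. */
static uint32_t max_mcs_width_px(uint32_t mcs_bits_per_pixel)
{
    assert(mcs_bits_per_pixel >= 8 && mcs_bits_per_pixel % 8 == 0);
    uint32_t max_pitch_tiles = 1u << AUX_PITCH_BITS;                  /* 512 */
    uint32_t max_pitch_bytes = max_pitch_tiles * Y_TILE_WIDTH_BYTES;  /* 65536 */
    return max_pitch_bytes / (mcs_bits_per_pixel / 8);
}
```

For a 64bpp element this gives 8192 pixels, matching the limit in the patch. A 32bpp element, which is what I believe 8x uses, gives 16384 and so still covers the full 16k.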


Re: [Mesa-dev] [PATCH 1/3] isl: Return surface creation success from aux helpers

2017-02-21 Thread Chad Versace
On Mon 20 Feb 2017, Jason Ekstrand wrote:
> The isl_surf_init call that each of these helpers make can, in theory,
> fail.  We should propagate that up to the caller rather than just
> silently ignoring it.
> 
> Reviewed-by: Topi Pohjolainen 
> ---
>  src/intel/isl/isl.c  | 72 
> +---
>  src/intel/isl/isl.h  |  4 +--
>  src/intel/vulkan/anv_image.c |  5 +--
>  3 files changed, 40 insertions(+), 41 deletions(-)

Thank you for adding my bools back to the hiz funcs :)

Reviewed-by: Chad Versace 


Re: [Mesa-dev] [PATCH mesa 7/8] eglapi: make sure list is always sorted

2017-02-21 Thread Eric Engestrom
On Monday, 2017-02-20 19:49:59 +, Emil Velikov wrote:
> On 19 February 2017 at 23:23, Eric Engestrom  wrote:
> > Starting with the next commit, an incorrectly sorted list will break
> > eglGetProcAddress().
> >
> > Signed-off-by: Eric Engestrom 
> > ---
> >  src/egl/Makefile.am  | 3 ++-
> >  src/egl/egl-entrypoint-check | 4 
> >  2 files changed, 6 insertions(+), 1 deletion(-)
> >  create mode 100755 src/egl/egl-entrypoint-check
> >
> > diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
> > index bd8903f666..6c0548d856 100644
> > --- a/src/egl/Makefile.am
> > +++ b/src/egl/Makefile.am
> > @@ -126,7 +126,8 @@ egl_HEADERS = \
> > $(top_srcdir)/include/EGL/eglmesaext.h \
> > $(top_srcdir)/include/EGL/eglplatform.h
> >
> > -TESTS = egl-symbols-check
> > +TESTS = egl-symbols-check \
> > +   egl-entrypoint-check
> >
> >  EXTRA_DIST = \
> > egl-symbols-check \
> Maybe:
> 
> - egl-symbols-check \
> +$(TESTS) \

Coming as a separate patch.

> 
> > diff --git a/src/egl/egl-entrypoint-check b/src/egl/egl-entrypoint-check
> > new file mode 100755
> > index 00..d3757aae3c
> > --- /dev/null
> > +++ b/src/egl/egl-entrypoint-check
> > @@ -0,0 +1,4 @@
> > +#!/bin/bash
> Please add a blank line.
> 
> > +entrypoints=$(grep EGL_ENTRYPOINT "$srcdir"/main/eglentrypoint.def)
> > +sorted=$(sort <<< "$entrypoints")
> > +test "$entrypoints" = "$sorted"
> Cannot spot any bashisms here. checkbashisms also cannot find any.
> s|bash|sh| in the shebang ?

Fixed locally; I'll send a v2 later.
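
To illustrate why the ordering matters, here is a minimal C sketch, assuming the follow-up commit looks names up with a binary search (the thread only says that a badly sorted list breaks eglGetProcAddress()). The helper names names_are_sorted, compare_name and name_is_listed are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Returns true if names[0..count) is in strictly ascending strcmp() order. */
static bool names_are_sorted(const char *const *names, size_t count)
{
    for (size_t i = 1; i < count; i++) {
        if (strcmp(names[i - 1], names[i]) >= 0)
            return false;
    }
    return true;
}

/* bsearch() comparator: `key` is the string searched for, `elem` points
 * at one array slot. */
static int compare_name(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Binary search; the result is only meaningful when names_are_sorted() holds. */
static bool name_is_listed(const char *name, const char *const *names,
                           size_t count)
{
    assert(name != NULL);
    return bsearch(name, names, count, sizeof(names[0]), compare_name) != NULL;
}
```

One caveat for the shell check: sort(1) orders by the current locale, which can differ from strcmp() byte order unless LC_ALL=C is set.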

> 
> Thanks for the cleanup, Eric!
> Emil


[Mesa-dev] [Bug 99886] [radv] No actual multithreading in a Vulkan multithreading demo

2017-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99886

Jan Ziak <0xe2.0x9a.0...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] util: fix MSVC build issue in disk_cache.h

2017-02-21 Thread Brian Paul

On 02/21/2017 03:57 PM, Brian Paul wrote:

Windows doesn't have dlfcn.h.  Protect the code in question
with an #ifdef ENABLE_SHADER_CACHE test.
---
  src/util/disk_cache.h | 26 --
  1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/util/disk_cache.h b/src/util/disk_cache.h
index 8b6fc0d..7f4da80 100644
--- a/src/util/disk_cache.h
+++ b/src/util/disk_cache.h
@@ -24,7 +24,9 @@
  #ifndef DISK_CACHE_H
  #define DISK_CACHE_H

+#ifdef ENABLE_SHADER_CACHE
  #include 
+#endif
  #include 
  #include 
  #include 
@@ -43,16 +45,20 @@ struct disk_cache;
  static inline bool
  disk_cache_get_function_timestamp(void *ptr, uint32_t* timestamp)
  {
-   Dl_info info;
-   struct stat st;
-   if (!dladdr(ptr, ) || !info.dli_fname) {
-   return false;
-   }
-   if (stat(info.dli_fname, )) {
-   return false;
-   }
-   *timestamp = st.st_mtim.tv_sec;
-   return true;
+#ifdef ENABLE_SHADER_CACHE
+   Dl_info info;
+   struct stat st;
+   if (!dladdr(ptr, ) || !info.dli_fname) {
+  return false;
+   }
+   if (stat(info.dli_fname, )) {
+  return false;
+   }
+   *timestamp = st.st_mtim.tv_sec;
+   return true;
+#else
+   return false;
+#endif
  }

  /* Provide inlined stub functions if the shader cache is disabled. */




Timothy,

Does this function really need to be inlined?  AFAICT, it's not called 
on a performance critical path.  Moving it into the .c file would seem 
to be cleaner.


-Brian



[Mesa-dev] [PATCH] util: fix MSVC build issue in disk_cache.h

2017-02-21 Thread Brian Paul
Windows doesn't have dlfcn.h.  Protect the code in question
with an #ifdef ENABLE_SHADER_CACHE test.
---
 src/util/disk_cache.h | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/util/disk_cache.h b/src/util/disk_cache.h
index 8b6fc0d..7f4da80 100644
--- a/src/util/disk_cache.h
+++ b/src/util/disk_cache.h
@@ -24,7 +24,9 @@
 #ifndef DISK_CACHE_H
 #define DISK_CACHE_H
 
+#ifdef ENABLE_SHADER_CACHE
 #include 
+#endif
 #include 
 #include 
 #include 
@@ -43,16 +45,20 @@ struct disk_cache;
 static inline bool
 disk_cache_get_function_timestamp(void *ptr, uint32_t* timestamp)
 {
-   Dl_info info;
-   struct stat st;
-   if (!dladdr(ptr, ) || !info.dli_fname) {
-   return false;
-   }
-   if (stat(info.dli_fname, )) {
-   return false;
-   }
-   *timestamp = st.st_mtim.tv_sec;
-   return true;
+#ifdef ENABLE_SHADER_CACHE
+   Dl_info info;
+   struct stat st;
+   if (!dladdr(ptr, ) || !info.dli_fname) {
+  return false;
+   }
+   if (stat(info.dli_fname, )) {
+  return false;
+   }
+   *timestamp = st.st_mtim.tv_sec;
+   return true;
+#else
+   return false;
+#endif
 }
 
 /* Provide inlined stub functions if the shader cache is disabled. */
-- 
1.9.1



Re: [Mesa-dev] [PATCH v4 0/6] Add support for ARB_transform_feedback_overflow_query.

2017-02-21 Thread Kenneth Graunke
On Friday, January 20, 2017 9:53:21 AM PST Rafael Antognolli wrote:
> This patch series implements the ARB_transform_feedback_overflow_query
> extension for i965.
> 
> Changes for v4:
> - Reuse of MI_MATH calcs from hsw_queryobj.c in brw_conditional_render.c
> - Renamed a couple functions as suggested by Kenneth
> - Fallback to CPU-side conditional rendering if MI_MATH is not available.
> 
> The series is available on github here:
> 
> https://github.com/rantogno/mesa/tree/review/overflow_query-v04
> 
> There are also piglit tests available for it here:
> 
> https://github.com/rantogno/piglit/tree/review/overflow_query-v05
> 
> Regards,
> Rafael

Looks great!  Series is:

Reviewed-by: Kenneth Graunke 

I'm guessing you don't have commit access, so I'll try and push it...




[Mesa-dev] [AppVeyor] mesa master #3522 failed

2017-02-21 Thread AppVeyor



Build mesa 3522 failed


Commit 0441e6bc8b by Timothy Arceri on 2/21/2017 5:34 AM:

util/disk_cache: create timestamp and gpu_id dirs when MESA_GLSL_CACHE_DIR is used

The make check test is also updated to make sure these dirs are created.

Reviewed-by: Nicolai Hähnle




[Mesa-dev] [PATCH 8/8] gallium/vl: Simplify the matrix filter fragment shader

2017-02-21 Thread Thomas Hellstrom
It looks like it was partly copied from the median filter fragment shader
and unnecessarily saved a lot of temporary values.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Brian Paul 
---
 src/gallium/auxiliary/vl/vl_matrix_filter.c | 56 +
 1 file changed, 16 insertions(+), 40 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_matrix_filter.c 
b/src/gallium/auxiliary/vl/vl_matrix_filter.c
index 11ec816..b498d1f 100644
--- a/src/gallium/auxiliary/vl/vl_matrix_filter.c
+++ b/src/gallium/auxiliary/vl/vl_matrix_filter.c
@@ -81,15 +81,13 @@ create_frag_shader(struct vl_matrix_filter *filter, 
unsigned num_offsets,
struct ureg_program *shader;
struct ureg_src i_vtex;
struct ureg_src sampler;
-   struct ureg_dst *t_array = MALLOC(sizeof(struct ureg_dst) * num_offsets);
+   struct ureg_dst tmp;
struct ureg_dst t_sum;
struct ureg_dst o_fragment;
-   bool first;
unsigned i;
 
shader = ureg_create(PIPE_SHADER_FRAGMENT);
if (!shader) {
-  FREE(t_array);
   return NULL;
}
 
@@ -101,54 +99,32 @@ create_frag_shader(struct vl_matrix_filter *filter, 
unsigned num_offsets,
   TGSI_RETURN_TYPE_FLOAT,
   TGSI_RETURN_TYPE_FLOAT);
 
-   for (i = 0; i < num_offsets; ++i)
-  if (matrix_values[i] != 0.0f)
- t_array[i] = ureg_DECL_temporary(shader);
-
+   tmp = ureg_DECL_temporary(shader);
+   t_sum = ureg_DECL_temporary(shader);
o_fragment = ureg_DECL_output(shader, TGSI_SEMANTIC_COLOR, 0);
 
-   /*
-* t_array[0..*] = vtex + offset[0..*]
-* t_array[0..*] = tex(t_array[0..*], sampler)
-* o_fragment = sum(t_array[0..*] * matrix_values[0..*])
-*/
-
+   ureg_MOV(shader, t_sum, ureg_imm1f(shader, 0.0f));
for (i = 0; i < num_offsets; ++i) {
-  if (matrix_values[i] != 0.0f && !is_vec_zero(offsets[i])) {
- ureg_ADD(shader, ureg_writemask(t_array[i], TGSI_WRITEMASK_XY),
+  if (matrix_values[i] == 0.0f)
+ continue;
+
+  if (!is_vec_zero(offsets[i])) {
+ ureg_ADD(shader, ureg_writemask(tmp, TGSI_WRITEMASK_XY),
   i_vtex, ureg_imm2f(shader, offsets[i].x, offsets[i].y));
- ureg_MOV(shader, ureg_writemask(t_array[i], TGSI_WRITEMASK_ZW),
+ ureg_MOV(shader, ureg_writemask(tmp, TGSI_WRITEMASK_ZW),
   ureg_imm1f(shader, 0.0f));
+ ureg_TEX(shader, tmp, TGSI_TEXTURE_2D, ureg_src(tmp), sampler);
+  } else {
+ ureg_TEX(shader, tmp, TGSI_TEXTURE_2D, i_vtex, sampler);
   }
+  ureg_MAD(shader, t_sum, ureg_src(tmp), ureg_imm1f(shader, 
matrix_values[i]),
+   ureg_src(t_sum));
}
 
-   for (i = 0; i < num_offsets; ++i) {
-  if (matrix_values[i] != 0.0f) {
- struct ureg_src src = is_vec_zero(offsets[i]) ? i_vtex : 
ureg_src(t_array[i]);
- ureg_TEX(shader, t_array[i], TGSI_TEXTURE_2D, src, sampler);
-  }
-   }
-
-   for (i = 0, first = true; i < num_offsets; ++i) {
-  if (matrix_values[i] != 0.0f) {
- if (first) {
-t_sum = t_array[i];
-ureg_MUL(shader, t_sum, ureg_src(t_array[i]),
- ureg_imm1f(shader, matrix_values[i]));
-first = false;
- } else
-ureg_MAD(shader, t_sum, ureg_src(t_array[i]),
- ureg_imm1f(shader, matrix_values[i]), ureg_src(t_sum));
-  }
-   }
-   if (first)
-  ureg_MOV(shader, o_fragment, ureg_imm1f(shader, 0.0f));
-   else
-  ureg_MOV(shader, o_fragment, ureg_src(t_sum));
+   ureg_MOV(shader, o_fragment, ureg_src(t_sum));
 
ureg_END(shader);
 
-   FREE(t_array);
return ureg_create_shader_and_destroy(shader, filter->pipe);
 }
 
-- 
2.5.0
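
The restructured shader keeps one temporary and one running sum instead of an array of per-sample temporaries. A minimal CPU-side C sketch of the same accumulation follows; weighted_sum is a hypothetical name, and the arrays stand in for the texture samples and matrix_values of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU-side analogue of the shader loop: one running sum,
 * one multiply-add per non-zero weight, no per-sample storage. */
static float weighted_sum(const float *samples, const float *weights,
                          size_t count)
{
    assert(samples != NULL && weights != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < count; i++) {
        if (weights[i] == 0.0f)
            continue;                /* zero weights contribute nothing */
        sum = samples[i] * weights[i] + sum;
    }
    return sum;
}
```

Starting the sum at zero also removes the need for the old "first" flag: when every weight is zero, the result is simply 0.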



[Mesa-dev] [PATCH 1/8] gallium/vl: declare sampler views in compositor shaders

2017-02-21 Thread Thomas Hellstrom
The svga driver relies on the existence of these sampler views.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Brian Paul 
---
 src/gallium/auxiliary/vl/vl_compositor.c | 37 +++-
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor.c 
b/src/gallium/auxiliary/vl/vl_compositor.c
index f98b185..693d685 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.c
+++ b/src/gallium/auxiliary/vl/vl_compositor.c
@@ -137,9 +137,15 @@ create_frag_shader_weave(struct ureg_program *shader, 
struct ureg_dst fragment)
i_tc[0] = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTOP, 
TGSI_INTERPOLATE_LINEAR);
i_tc[1] = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VBOTTOM, 
TGSI_INTERPOLATE_LINEAR);
 
-   for (i = 0; i < 3; ++i)
+   for (i = 0; i < 3; ++i) {
   sampler[i] = ureg_DECL_sampler(shader, i);
-
+  ureg_DECL_sampler_view(shader, i, TGSI_TEXTURE_2D_ARRAY,
+ TGSI_RETURN_TYPE_FLOAT,
+ TGSI_RETURN_TYPE_FLOAT,
+ TGSI_RETURN_TYPE_FLOAT,
+ TGSI_RETURN_TYPE_FLOAT);
+   }
+   
for (i = 0; i < 2; ++i) {
   t_tc[i] = ureg_DECL_temporary(shader);
   t_texel[i] = ureg_DECL_temporary(shader);
@@ -248,9 +254,15 @@ create_frag_shader_video_buffer(struct vl_compositor *c)
   return false;
 
tc = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTEX, 
TGSI_INTERPOLATE_LINEAR);
-   for (i = 0; i < 3; ++i)
+   for (i = 0; i < 3; ++i) {
   sampler[i] = ureg_DECL_sampler(shader, i);
-
+  ureg_DECL_sampler_view(shader, i, TGSI_TEXTURE_2D_ARRAY,
+ TGSI_RETURN_TYPE_FLOAT,
+ TGSI_RETURN_TYPE_FLOAT,
+ TGSI_RETURN_TYPE_FLOAT,
+ TGSI_RETURN_TYPE_FLOAT);
+   }
+   
texel = ureg_DECL_temporary(shader);
fragment = ureg_DECL_output(shader, TGSI_SEMANTIC_COLOR, 0);
 
@@ -342,8 +354,18 @@ create_frag_shader_palette(struct vl_compositor *c, bool 
include_cc)
 
tc = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTEX, 
TGSI_INTERPOLATE_LINEAR);
sampler = ureg_DECL_sampler(shader, 0);
+   ureg_DECL_sampler_view(shader, 0, TGSI_TEXTURE_2D,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT);
palette = ureg_DECL_sampler(shader, 1);
-
+   ureg_DECL_sampler_view(shader, 1, TGSI_TEXTURE_1D,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT);
+   
texel = ureg_DECL_temporary(shader);
fragment = ureg_DECL_output(shader, TGSI_SEMANTIC_COLOR, 0);
 
@@ -384,6 +406,11 @@ create_frag_shader_rgba(struct vl_compositor *c)
tc = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTEX, 
TGSI_INTERPOLATE_LINEAR);
color = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_COLOR, VS_O_COLOR, 
TGSI_INTERPOLATE_LINEAR);
sampler = ureg_DECL_sampler(shader, 0);
+   ureg_DECL_sampler_view(shader, 0, TGSI_TEXTURE_2D,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT);
texel = ureg_DECL_temporary(shader);
fragment = ureg_DECL_output(shader, TGSI_SEMANTIC_COLOR, 0);
 
-- 
2.5.0



[Mesa-dev] [PATCH 2/8] gallium/vl: Add sampler views to video filter fragment shaders

2017-02-21 Thread Thomas Hellstrom
Needed for at least the svga driver.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Brian Paul 
---
 src/gallium/auxiliary/vl/vl_bicubic_filter.c | 5 +
 src/gallium/auxiliary/vl/vl_matrix_filter.c  | 5 +
 src/gallium/auxiliary/vl/vl_median_filter.c  | 5 +
 3 files changed, 15 insertions(+)

diff --git a/src/gallium/auxiliary/vl/vl_bicubic_filter.c 
b/src/gallium/auxiliary/vl/vl_bicubic_filter.c
index 570f153..ae29208 100644
--- a/src/gallium/auxiliary/vl/vl_bicubic_filter.c
+++ b/src/gallium/auxiliary/vl/vl_bicubic_filter.c
@@ -174,6 +174,11 @@ create_frag_shader(struct vl_bicubic_filter *filter, 
unsigned video_width,
 
i_vtex = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTEX, 
TGSI_INTERPOLATE_LINEAR);
sampler = ureg_DECL_sampler(shader, 0);
+   ureg_DECL_sampler_view(shader, 0, TGSI_TEXTURE_2D,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT);
 
for (i = 0; i < 23; ++i)
   t_array[i] = ureg_DECL_temporary(shader);
diff --git a/src/gallium/auxiliary/vl/vl_matrix_filter.c 
b/src/gallium/auxiliary/vl/vl_matrix_filter.c
index e331cb7..11ec816 100644
--- a/src/gallium/auxiliary/vl/vl_matrix_filter.c
+++ b/src/gallium/auxiliary/vl/vl_matrix_filter.c
@@ -95,6 +95,11 @@ create_frag_shader(struct vl_matrix_filter *filter, unsigned 
num_offsets,
 
i_vtex = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTEX, 
TGSI_INTERPOLATE_LINEAR);
sampler = ureg_DECL_sampler(shader, 0);
+   ureg_DECL_sampler_view(shader, 0, TGSI_TEXTURE_2D,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT);
 
for (i = 0; i < num_offsets; ++i)
   if (matrix_values[i] != 0.0f)
diff --git a/src/gallium/auxiliary/vl/vl_median_filter.c 
b/src/gallium/auxiliary/vl/vl_median_filter.c
index f7477b7..0183b875 100644
--- a/src/gallium/auxiliary/vl/vl_median_filter.c
+++ b/src/gallium/auxiliary/vl/vl_median_filter.c
@@ -107,6 +107,11 @@ create_frag_shader(struct vl_median_filter *filter,
 
i_vtex = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTEX, 
TGSI_INTERPOLATE_LINEAR);
sampler = ureg_DECL_sampler(shader, 0);
+   ureg_DECL_sampler_view(shader, 0, TGSI_TEXTURE_2D,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT,
+  TGSI_RETURN_TYPE_FLOAT);
 
for (i = 0; i < num_offsets; ++i)
   t_array[i] = ureg_DECL_temporary(shader);
-- 
2.5.0



[Mesa-dev] [PATCH 3/4] gallium/hud: prevent an infinite loop

2017-02-21 Thread Marek Olšák
From: Marek Olšák 

v2: use UINT64_MAX / 11

---
 src/gallium/auxiliary/hud/hud_context.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index aaa52d5..c44f8c0 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -724,23 +724,24 @@ hud_pane_set_max_value(struct hud_pane *pane, uint64_t 
value)
uint64_t exp10;
int i;
 
/* The following code determines the max_value in the graph as well as
 * how many describing lines are drawn. The max_value is rounded up,
 * so that all drawn numbers are rounded for readability.
 * We want to print multiples of a simple number instead of multiples of
 * hard-to-read numbers like 1.753.
 */
 
-   /* Find the left-most digit. */
+   /* Find the left-most digit. Make sure exp10 * 10 and fixup_bytes doesn't
+* overflow. (11 is safe) */
exp10 = 1;
-   for (i = 0; value > 9 * exp10; i++) {
+   for (i = 0; exp10 <= UINT64_MAX / 11 && exp10 * 9 < value; i++) {
   exp10 *= 10;
   fixup_bytes(pane->type, i + 1, &exp10);
}
 
leftmost_digit = DIV_ROUND_UP(value, exp10);
 
/* Round 9 to 10. */
if (leftmost_digit == 9) {
   leftmost_digit = 1;
   exp10 *= 10;
-- 
2.7.4
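
The guard added to the loop condition in this patch can be looked at in isolation. What follows is a minimal sketch, not the HUD code itself: `find_exp10` is a hypothetical helper that keeps only the loop from hud_pane_set_max_value() and leaves out the fixup_bytes() call. Without the `UINT64_MAX / 11` check, `exp10 * 9` wraps around for values close to UINT64_MAX, the comparison stays true, and the loop never terminates.

```c
#include <stdint.h>

/* Hypothetical helper: the power of ten used to round `value` up.
 * Mirrors the patched loop condition; fixup_bytes() is omitted. */
static uint64_t find_exp10(uint64_t value)
{
   uint64_t exp10 = 1;

   /* The first test keeps exp10 * 10 (and exp10 * 9) within 64 bits. */
   while (exp10 <= UINT64_MAX / 11 && exp10 * 9 < value)
      exp10 *= 10;

   return exp10;
}
```

Called with UINT64_MAX, the sketch stops once exp10 has reached 10^19, which is the largest power of ten that still fits in a uint64_t.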



[Mesa-dev] [PATCH 6/8] gallium/vl: Parameter substitution in the csc matrix computation

2017-02-21 Thread Thomas Hellstrom
Makes the code significantly more readable.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Sinclair Yeh 
---
 src/gallium/auxiliary/vl/vl_csc.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_csc.c 
b/src/gallium/auxiliary/vl/vl_csc.c
index d70ab14..e4395d0 100644
--- a/src/gallium/auxiliary/vl/vl_csc.c
+++ b/src/gallium/auxiliary/vl/vl_csc.c
@@ -158,6 +158,7 @@ void vl_csc_get_matrix(enum VL_CSC_COLOR_STANDARD cs,
float s = p->saturation;
float b = p->brightness;
float h = p->hue;
+   float x, y;
 
const vl_csc_matrix *cstd;
 
@@ -167,6 +168,10 @@ void vl_csc_get_matrix(enum VL_CSC_COLOR_STANDARD cs,
   b -= c * 16.0f  / 255.0f; /* Adjust for the y bias */
}
 
+   /* Parameter substitutions */
+   x = c * s * cosf(h);
+   y = c * s * sinf(h);
+
assert(matrix);
 
switch (cs) {
@@ -187,23 +192,23 @@ void vl_csc_get_matrix(enum VL_CSC_COLOR_STANDARD cs,
}
 
(*matrix)[0][0] = c * (*cstd)[0][0];
-   (*matrix)[0][1] = c * (*cstd)[0][1] * s * cosf(h) - c * (*cstd)[0][2] * s * 
sinf(h);
-   (*matrix)[0][2] = c * (*cstd)[0][2] * s * cosf(h) + c * (*cstd)[0][1] * s * 
sinf(h);
+   (*matrix)[0][1] = (*cstd)[0][1] * x - (*cstd)[0][2] * y;
+   (*matrix)[0][2] = (*cstd)[0][2] * x + (*cstd)[0][1] * y;
(*matrix)[0][3] = (*cstd)[0][3] + (*cstd)[0][0] * b +
- (*cstd)[0][1] * (c * cbbias * s * cosf(h) + c * crbias * 
s * sinf(h)) +
- (*cstd)[0][2] * (c * crbias * s * cosf(h) - c * cbbias * 
s * sinf(h));
+ (*cstd)[0][1] * (x * cbbias + y * crbias) +
+ (*cstd)[0][2] * (x * crbias - y * cbbias);
 
(*matrix)[1][0] = c * (*cstd)[1][0];
-   (*matrix)[1][1] = c * (*cstd)[1][1] * s * cosf(h) - c * (*cstd)[1][2] * s * 
sinf(h);
-   (*matrix)[1][2] = c * (*cstd)[1][2] * s * cosf(h) + c * (*cstd)[1][1] * s * 
sinf(h);
+   (*matrix)[1][1] = (*cstd)[1][1] * x - (*cstd)[1][2] * y;
+   (*matrix)[1][2] = (*cstd)[1][2] * x + (*cstd)[1][1] * y;
(*matrix)[1][3] = (*cstd)[1][3] + (*cstd)[1][0] * b +
- (*cstd)[1][1] * (c * cbbias * s * cosf(h) + c * crbias * 
s * sinf(h)) +
- (*cstd)[1][2] * (c * crbias * s * cosf(h) - c * cbbias * 
s * sinf(h));
+ (*cstd)[1][1] * (x * cbbias + y * crbias) +
+ (*cstd)[1][2] * (x * crbias - y * cbbias);
 
(*matrix)[2][0] = c * (*cstd)[2][0];
-   (*matrix)[2][1] = c * (*cstd)[2][1] * s * cosf(h) - c * (*cstd)[2][2] * s * 
sinf(h);
-   (*matrix)[2][2] = c * (*cstd)[2][2] * s * cosf(h) + c * (*cstd)[2][1] * s * 
sinf(h);
+   (*matrix)[2][1] = (*cstd)[2][1] * x - (*cstd)[2][2] * y;
+   (*matrix)[2][2] = (*cstd)[2][2] * x + (*cstd)[2][1] * y;
(*matrix)[2][3] = (*cstd)[2][3] + (*cstd)[2][0] * b +
- (*cstd)[2][1] * (c * cbbias * s * cosf(h) + c * crbias * 
s * sinf(h)) +
- (*cstd)[2][2] * (c * crbias * s * cosf(h) - c * cbbias * 
s * sinf(h));
+ (*cstd)[2][1] * (x * cbbias + y * crbias) +
+ (*cstd)[2][2] * (x * crbias - y * cbbias);
 }
-- 
2.5.0
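
Written out, the substitution made by this patch looks as follows. The symbols are mine, chosen for illustration: M stands for the selected standard matrix (`*cstd`), and b_cb, b_cr for `cbbias` and `crbias`. The rows i = 0, 1, 2 all follow the same pattern.

```latex
\begin{aligned}
x &= c\,s\cos h, \qquad y = c\,s\sin h \\
\mathrm{matrix}_{i1} &= M_{i1}\,x - M_{i2}\,y \\
\mathrm{matrix}_{i2} &= M_{i2}\,x + M_{i1}\,y \\
\mathrm{matrix}_{i3} &= M_{i3} + M_{i0}\,b
   + M_{i1}\,(x\,b_{cb} + y\,b_{cr})
   + M_{i2}\,(x\,b_{cr} - y\,b_{cb})
\end{aligned}
```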



[Mesa-dev] [PATCH 7/8] st/vdpau: Fix multithreading

2017-02-21 Thread Thomas Hellstrom
The vdpau state tracker allows multiple threads access to the same gallium
context simultaneously. We can fix this either by locking the same mutex
each time the context is used or by using a different gallium context for
each mutex domain. Here we do the latter, although I'm not sure that's really
the best option.
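
The approach described above (one context per mutex domain, plus an explicit flush where the two domains hand work to each other) can be sketched as follows. This is an illustration only: `sketch_context`, `sketch_domain` and `sketch_flush` are hypothetical stand-ins, not gallium types, and the counters merely model queued and submitted work.

```c
#include <pthread.h>

/* Hypothetical model of a context: work is queued, then submitted on flush. */
struct sketch_context {
   int pending;     /* commands queued but not yet submitted */
   int submitted;   /* commands handed to the hardware */
};

/* One mutex domain: a lock and the context that only this lock protects. */
struct sketch_domain {
   pthread_mutex_t mutex;
   struct sketch_context ctx;
};

static void sketch_flush(struct sketch_context *ctx)
{
   ctx->submitted += ctx->pending;
   ctx->pending = 0;
}

/* Decode one frame. The device context is flushed first so that work
 * queued there is submitted before the decoder context submits its own. */
static void sketch_decoder_render(struct sketch_domain *device,
                                  struct sketch_domain *decoder)
{
   pthread_mutex_lock(&device->mutex);
   sketch_flush(&device->ctx);
   pthread_mutex_unlock(&device->mutex);

   pthread_mutex_lock(&decoder->mutex);
   decoder->ctx.pending++;          /* stands in for begin/decode/end frame */
   sketch_flush(&decoder->ctx);     /* stands in for dec->flush(dec) */
   pthread_mutex_unlock(&decoder->mutex);
}
```

Each context is only ever touched while its own mutex is held, so no lock has to be shared between the two domains.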

Signed-off-by: Thomas Hellstrom 
Acked-by: Sinclair Yeh 
---
 src/gallium/state_trackers/vdpau/decode.c | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/vdpau/decode.c 
b/src/gallium/state_trackers/vdpau/decode.c
index f85bce8..2a07d17 100644
--- a/src/gallium/state_trackers/vdpau/decode.c
+++ b/src/gallium/state_trackers/vdpau/decode.c
@@ -68,7 +68,6 @@ vlVdpDecoderCreate(VdpDevice device,
if (!dev)
   return VDP_STATUS_INVALID_HANDLE;
 
-   pipe = dev->context;
screen = dev->vscreen->pscreen;
 
pipe_mutex_lock(dev->mutex);
@@ -123,6 +122,12 @@ vlVdpDecoderCreate(VdpDevice device,
   templat.level = u_get_h264_level(templat.width, templat.height,
 _references);
 
+   pipe = screen->context_create(screen, dev->vscreen, 0);
+   if (!pipe) {
+  ret = VDP_STATUS_RESOURCES;
+  goto error_context;
+   }
+
vldecoder->decoder = pipe->create_video_codec(pipe, &templat);
 
if (!vldecoder->decoder) {
@@ -145,6 +150,8 @@ error_handle:
vldecoder->decoder->destroy(vldecoder->decoder);
 
 error_decoder:
+   pipe->destroy(pipe);
+error_context:
pipe_mutex_unlock(dev->mutex);
DeviceReference(>device, NULL);
FREE(vldecoder);
@@ -158,15 +165,18 @@ VdpStatus
 vlVdpDecoderDestroy(VdpDecoder decoder)
 {
vlVdpDecoder *vldecoder;
+   struct pipe_context *pipe;
 
vldecoder = (vlVdpDecoder *)vlGetDataHTAB(decoder);
if (!vldecoder)
   return VDP_STATUS_INVALID_HANDLE;
 
+   pipe = vldecoder->decoder->context;
pipe_mutex_lock(vldecoder->mutex);
vldecoder->decoder->destroy(vldecoder->decoder);
pipe_mutex_unlock(vldecoder->mutex);
pipe_mutex_destroy(vldecoder->mutex);
+   pipe->destroy(pipe);
 
vlRemoveDataHTAB(decoder);
DeviceReference(>device, NULL);
@@ -674,10 +684,20 @@ vlVdpDecoderRender(VdpDecoder decoder,
if (ret != VDP_STATUS_OK)
   return ret;
 
+   /*
+* Since we use separate contexts for the two mutex domains, we need
+* to flush to make sure rendering operations happen in order.
+* In particular, so that a frame is rendered before it is presented.
+*/
+   pipe_mutex_lock(vldecoder->device->mutex);
+   vldecoder->device->context->flush(vldecoder->device->context, NULL, 0);
+   pipe_mutex_unlock(vldecoder->device->mutex);
+
pipe_mutex_lock(vldecoder->mutex);
dec->begin_frame(dec, vlsurf->video_buffer, );
dec->decode_bitstream(dec, vlsurf->video_buffer, , 
bitstream_buffer_count, buffers, sizes);
dec->end_frame(dec, vlsurf->video_buffer, );
+   dec->flush(dec);
pipe_mutex_unlock(vldecoder->mutex);
return ret;
 }
-- 
2.5.0



[Mesa-dev] [PATCH 3/8] gallium/vl: Don't map vertex buffers on creation

2017-02-21 Thread Thomas Hellstrom
It will cause multiple simultaneous maps of the same vertex buffer and
flushed-while-mapped warnings.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Brian Paul 
---
 src/gallium/auxiliary/vl/vl_vertex_buffers.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/gallium/auxiliary/vl/vl_vertex_buffers.c 
b/src/gallium/auxiliary/vl/vl_vertex_buffers.c
index 13d3f8c..1721227 100644
--- a/src/gallium/auxiliary/vl/vl_vertex_buffers.c
+++ b/src/gallium/auxiliary/vl/vl_vertex_buffers.c
@@ -241,7 +241,6 @@ vl_vb_init(struct vl_vertex_buffer *buffer, struct 
pipe_context *pipe,
  goto error_mv;
}
 
-   vl_vb_map(buffer, pipe);
return true;
 
 error_mv:
-- 
2.5.0



[Mesa-dev] [PATCH 5/8] gallium/vl: Simplify usage of full range matrices

2017-02-21 Thread Thomas Hellstrom
When looking at the full range matrices, it becomes obvious that the difference
between the standard matrices and the full range matrices is that the full
range matrices are multiplied by 1.164. Together with offsetting the y value
with -16/255, this will scale and offset RGB with the desired quantities.

However, the standard SMPTE 240M matrix seems to differ a bit since the
U and V coefficients are only multiplied with 1.138 to get the full range
matrix. This would actually alter the color somewhat so I figure that's an
error. The full range matrix is consistent with Nvidia's VDPAU implementation.

We can also incorporate the ybias in the brightness simplifying the
calculation somewhat.
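
The reasoning above can be written as a short derivation. The notation is mine and only illustrative: S is a standard (limited range) matrix, F the corresponding full range matrix that this patch removes, and c', b' are the adjusted contrast and brightness. The approximation signs reflect that the tabulated coefficients are rounded to three decimals.

```latex
\begin{aligned}
F_{ij} &\approx 1.164\,S_{ij}, \qquad y_{\mathrm{bias}} = -\tfrac{16}{255} \\
c' &= 1.164\,c, \qquad
b' = 1.164\,b - c'\,\tfrac{16}{255} = 1.164\,(b + c\,y_{\mathrm{bias}}) \\
c'\,S_{ij} &= c\,(1.164\,S_{ij}) \approx c\,F_{ij} \\
S_{i0}\,b' &= 1.164\,S_{i0}\,(b + c\,y_{\mathrm{bias}}) \approx F_{i0}\,(b + c\,y_{\mathrm{bias}})
\end{aligned}
```

So feeding c' and b' through the standard matrix reproduces both the scaled coefficients and the old `(b + c * ybias)` offset term.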

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Sinclair Yeh 
---
 src/gallium/auxiliary/vl/vl_csc.c | 55 ---
 1 file changed, 17 insertions(+), 38 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_csc.c 
b/src/gallium/auxiliary/vl/vl_csc.c
index 1587e6c..d70ab14 100644
--- a/src/gallium/auxiliary/vl/vl_csc.c
+++ b/src/gallium/auxiliary/vl/vl_csc.c
@@ -108,18 +108,6 @@ static const vl_csc_matrix bt_601 =
 };
 
 /*
- * Converts ITU-R BT.601 YCbCr pixels to RGB pixels where:
- * Y is in [16,235], Cb and Cr are in [16,240]
- * R, G, and B are in [0,255]
- */
-static const vl_csc_matrix bt_601_full =
-{
-   { 1.164f,  0.0f,1.596f, 0.0f, },
-   { 1.164f, -0.391f, -0.813f, 0.0f, },
-   { 1.164f,  2.018f,  0.0f,   0.0f, }
-};
-
-/*
  * Converts ITU-R BT.709 YCbCr pixels to RGB pixels where:
  * Y is in [16,235], Cb and Cr are in [16,240]
  * R, G, and B are in [16,235]
@@ -132,29 +120,15 @@ static const vl_csc_matrix bt_709 =
 };
 
 /*
- * Converts ITU-R BT.709 YCbCr pixels to RGB pixels where:
+ * Converts SMPTE 240M YCbCr pixels to RGB pixels where:
  * Y is in [16,235], Cb and Cr are in [16,240]
- * R, G, and B are in [0,255]
+ * R, G, and B are in [16,235]
  */
-static const vl_csc_matrix bt_709_full =
-{
-   { 1.164f,  0.0f,1.793f, 0.0f, },
-   { 1.164f, -0.213f, -0.534f, 0.0f, },
-   { 1.164f,  2.115f,  0.0f,   0.0f, }
-};
-
 static const vl_csc_matrix smpte240m =
 {
-   { 1.0f,  0.0f,1.582f, 0.0f, },
-   { 1.0f, -0.228f, -0.478f, 0.0f, },
-   { 1.0f,  1.833f,  0.0f,   0.0f, }
-};
-
-static const vl_csc_matrix smpte240m_full =
-{
-   { 1.164f,  0.0f,1.794f, 0.0f, },
-   { 1.164f, -0.258f, -0.543f, 0.0f, },
-   { 1.164f,  2.079f,  0.0f,   0.0f, }
+   { 1.0f,  0.0f,1.541f, 0.0f, },
+   { 1.0f, -0.221f, -0.466f, 0.0f, },
+   { 1.0f,  1.785f,  0.0f,   0.0f, }
 };
 
 static const vl_csc_matrix identity =
@@ -176,7 +150,6 @@ void vl_csc_get_matrix(enum VL_CSC_COLOR_STANDARD cs,
bool full_range,
vl_csc_matrix *matrix)
 {
-   float ybias = full_range ? -16.0f/255.0f : 0.0f;
float cbbias = -128.0f/255.0f;
float crbias = -128.0f/255.0f;
 
@@ -188,17 +161,23 @@ void vl_csc_get_matrix(enum VL_CSC_COLOR_STANDARD cs,
 
const vl_csc_matrix *cstd;
 
+   if (full_range) {
+  c *= 1.164f;  /* Adjust for the y range */
+  b *= 1.164f;  /* Adjust for the y range */
+  b -= c * 16.0f  / 255.0f; /* Adjust for the y bias */
+   }
+
assert(matrix);
 
switch (cs) {
   case VL_CSC_COLOR_STANDARD_BT_601:
- cstd = full_range ? &bt_601_full : &bt_601;
+ cstd = &bt_601;
  break;
   case VL_CSC_COLOR_STANDARD_BT_709:
- cstd = full_range ? &bt_709_full : &bt_709;
+ cstd = &bt_709;
  break;
   case VL_CSC_COLOR_STANDARD_SMPTE_240M:
- cstd = full_range ? &smpte240m_full : &smpte240m;
+ cstd = &smpte240m;
  break;
   case VL_CSC_COLOR_STANDARD_IDENTITY:
   default:
@@ -210,21 +189,21 @@ void vl_csc_get_matrix(enum VL_CSC_COLOR_STANDARD cs,
(*matrix)[0][0] = c * (*cstd)[0][0];
(*matrix)[0][1] = c * (*cstd)[0][1] * s * cosf(h) - c * (*cstd)[0][2] * s * 
sinf(h);
(*matrix)[0][2] = c * (*cstd)[0][2] * s * cosf(h) + c * (*cstd)[0][1] * s * 
sinf(h);
-   (*matrix)[0][3] = (*cstd)[0][3] + (*cstd)[0][0] * (b + c * ybias) +
+   (*matrix)[0][3] = (*cstd)[0][3] + (*cstd)[0][0] * b +
  (*cstd)[0][1] * (c * cbbias * s * cosf(h) + c * crbias * 
s * sinf(h)) +
  (*cstd)[0][2] * (c * crbias * s * cosf(h) - c * cbbias * 
s * sinf(h));
 
(*matrix)[1][0] = c * (*cstd)[1][0];
(*matrix)[1][1] = c * (*cstd)[1][1] * s * cosf(h) - c * (*cstd)[1][2] * s * 
sinf(h);
(*matrix)[1][2] = c * (*cstd)[1][2] * s * cosf(h) + c * (*cstd)[1][1] * s * 
sinf(h);
-   (*matrix)[1][3] = (*cstd)[1][3] + (*cstd)[1][0] * (b + c * ybias) +
+   (*matrix)[1][3] = (*cstd)[1][3] + (*cstd)[1][0] * b +
  (*cstd)[1][1] * (c * cbbias * s * cosf(h) + c * crbias * 
s * sinf(h)) +
  (*cstd)[1][2] * (c * crbias * s * cosf(h) - c * cbbias * 
s * sinf(h));
 
(*matrix)[2][0] = c * (*cstd)[2][0];
(*matrix)[2][1] = c * (*cstd)[2][1] * s * cosf(h) - c * (*cstd)[2][2] 

[Mesa-dev] [PATCH 4/8] gallium/vl: Fix brightness matrix description

2017-02-21 Thread Thomas Hellstrom
The brightness matrix doesn't actually match the procamp matrix and
what's calculated in vl_csc_get_matrix.

Signed-off-by: Thomas Hellstrom 
Reviewed-by: Sinclair Yeh 
---
 src/gallium/auxiliary/vl/vl_csc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_csc.c 
b/src/gallium/auxiliary/vl/vl_csc.c
index c8efe28..1587e6c 100644
--- a/src/gallium/auxiliary/vl/vl_csc.c
+++ b/src/gallium/auxiliary/vl/vl_csc.c
@@ -59,10 +59,10 @@
  * [ 0, 0, 0, 1]
  *
  * brightness
- * [ 1, 0, 0, b]
- * [ 0, 1, 0, 0]
- * [ 0, 0, 1, 0]
- * [ 0, 0, 0, 1]
+ * [ 1, 0, 0, b/c]
+ * [ 0, 1, 0,   0]
+ * [ 0, 0, 1,   0]
+ * [ 0, 0, 0,   1]
  *
  * saturation
  * [ 1, 0, 0, 0]
-- 
2.5.0



[Mesa-dev] [PATCH 0/8] gallium: A number of video code fixes

2017-02-21 Thread Thomas Hellstrom
A couple of fixes / improvements for things I've encountered while looking
through and testing the video code in preparation for a virtual hardware video
driver.



[Mesa-dev] [PATCH 1/4] gallium/u_queue: fix random crashes when the app calls exit()

2017-02-21 Thread Marek Olšák
From: Marek Olšák 

This fixes:
vdpauinfo: ../lib/CodeGen/TargetPassConfig.cpp:579: virtual void
llvm::TargetPassConfig::addMachinePasses(): Assertion `TPI && IPI &&
"Pass ID not registered!"' failed.

v2: use list_head, switch the call order in destroy
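
The mechanism this patch adds (a global list of live queues that an atexit() handler walks) can be sketched in a few lines. This is a simplified, single-threaded illustration under stated assumptions: `sketch_queue` and the `sketch_*` functions are hypothetical, a plain flag replaces call_once(), and the mutex around the list is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct util_queue; only what the sketch needs. */
struct sketch_queue {
   struct sketch_queue *next;   /* intrusive list link */
   bool killed;                 /* set once the worker threads were stopped */
};

static struct sketch_queue *queue_list;   /* all live queues */
static bool handler_registered;

/* Stop every queue that is still registered when exit() runs. */
static void sketch_atexit_handler(void)
{
   for (struct sketch_queue *q = queue_list; q != NULL; q = q->next)
      q->killed = true;   /* the real code signals and joins the threads */
}

static void sketch_add_to_atexit_list(struct sketch_queue *queue)
{
   if (!handler_registered) {   /* the patch uses call_once() for this */
      atexit(sketch_atexit_handler);
      handler_registered = true;
   }
   queue->next = queue_list;
   queue_list = queue;
}

static void sketch_remove_from_atexit_list(struct sketch_queue *queue)
{
   for (struct sketch_queue **p = &queue_list; *p != NULL; p = &(*p)->next) {
      if (*p == queue) {
         *p = queue->next;
         queue->next = NULL;
         break;
      }
   }
}
```

A queue that is destroyed normally removes itself from the list first, so the handler only ever sees queues that are still alive at exit.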

Cc: 13.0 17.0 
---
 src/gallium/auxiliary/util/u_queue.c | 76 +++-
 src/gallium/auxiliary/util/u_queue.h |  4 ++
 2 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_queue.c 
b/src/gallium/auxiliary/util/u_queue.c
index 4da5d8e..52cfc0a 100644
--- a/src/gallium/auxiliary/util/u_queue.c
+++ b/src/gallium/auxiliary/util/u_queue.c
@@ -22,20 +22,82 @@
  * The above copyright notice and this permission notice (including the
  * next paragraph) shall be included in all copies or substantial portions
  * of the Software.
  */
 
 #include "u_queue.h"
 #include "u_memory.h"
 #include "u_string.h"
 #include "os/os_time.h"
 
+static void util_queue_killall_and_wait(struct util_queue *queue);
+
+/*
+ * Wait for all queues to assert idle when exit() is called.
+ *
+ * Otherwise, C++ static variable destructors can be called while threads
+ * are using the static variables.
+ */
+
+static once_flag atexit_once_flag = ONCE_FLAG_INIT;
+static struct list_head queue_list;
+pipe_static_mutex(exit_mutex);
+
+static void
+atexit_handler(void)
+{
+   struct util_queue *iter;
+
+   pipe_mutex_lock(exit_mutex);
+   /* Wait for all queues to assert idle. */
+   LIST_FOR_EACH_ENTRY(iter, &queue_list, head) {
+  util_queue_killall_and_wait(iter);
+   }
+   pipe_mutex_unlock(exit_mutex);
+}
+
+static void
+global_init(void)
+{
+   LIST_INITHEAD(&queue_list);
+   atexit(atexit_handler);
+}
+
+static void
+add_to_atexit_list(struct util_queue *queue)
+{
+   call_once(&atexit_once_flag, global_init);
+
+   pipe_mutex_lock(exit_mutex);
+   LIST_ADD(&queue->head, &queue_list);
+   pipe_mutex_unlock(exit_mutex);
+}
+
+static void
+remove_from_atexit_list(struct util_queue *queue)
+{
+   struct util_queue *iter, *tmp;
+
+   pipe_mutex_lock(exit_mutex);
+   LIST_FOR_EACH_ENTRY_SAFE(iter, tmp, &queue_list, head) {
+  if (iter == queue) {
+ LIST_DEL(&iter->head);
+ break;
+  }
+   }
+   pipe_mutex_unlock(exit_mutex);
+}
+
+/*
+ * util_queue implementation
+ */
+
 static void
 util_queue_fence_signal(struct util_queue_fence *fence)
 {
pipe_mutex_lock(fence->mutex);
fence->signalled = true;
pipe_condvar_broadcast(fence->cond);
pipe_mutex_unlock(fence->mutex);
 }
 
 void
@@ -97,20 +159,21 @@ static PIPE_THREAD_ROUTINE(util_queue_thread_func, input)
}
 
/* signal remaining jobs before terminating */
pipe_mutex_lock(queue->lock);
while (queue->jobs[queue->read_idx].job) {
   util_queue_fence_signal(queue->jobs[queue->read_idx].fence);
 
   queue->jobs[queue->read_idx].job = NULL;
   queue->read_idx = (queue->read_idx + 1) % queue->max_jobs;
}
+   queue->num_queued = 0; /* reset this when exiting the thread */
pipe_mutex_unlock(queue->lock);
return 0;
 }
 
 bool
 util_queue_init(struct util_queue *queue,
 const char *name,
 unsigned max_jobs,
 unsigned num_threads)
 {
@@ -150,49 +213,58 @@ util_queue_init(struct util_queue *queue,
  if (i == 0) {
 /* no threads created, fail */
 goto fail;
  } else {
 /* at least one thread created, so use it */
 queue->num_threads = i+1;
 break;
  }
   }
}
+
+   add_to_atexit_list(queue);
return true;
 
 fail:
FREE(queue->threads);
 
if (queue->jobs) {
   pipe_condvar_destroy(queue->has_space_cond);
   pipe_condvar_destroy(queue->has_queued_cond);
   pipe_mutex_destroy(queue->lock);
   FREE(queue->jobs);
}
/* also util_queue_is_initialized can be used to check for success */
memset(queue, 0, sizeof(*queue));
return false;
 }
 
-void
-util_queue_destroy(struct util_queue *queue)
+static void
+util_queue_killall_and_wait(struct util_queue *queue)
 {
unsigned i;
 
/* Signal all threads to terminate. */
pipe_mutex_lock(queue->lock);
queue->kill_threads = 1;
pipe_condvar_broadcast(queue->has_queued_cond);
pipe_mutex_unlock(queue->lock);
 
for (i = 0; i < queue->num_threads; i++)
   pipe_thread_wait(queue->threads[i]);
+}
+
+void
+util_queue_destroy(struct util_queue *queue)
+{
+   util_queue_killall_and_wait(queue);
+   remove_from_atexit_list(queue);
 
pipe_condvar_destroy(queue->has_space_cond);
pipe_condvar_destroy(queue->has_queued_cond);
pipe_mutex_destroy(queue->lock);
FREE(queue->jobs);
FREE(queue->threads);
 }
 
 void
 util_queue_fence_init(struct util_queue_fence *fence)
diff --git 

[Mesa-dev] [PATCH 7/7] util: Change the pointer hashing function

2017-02-21 Thread Thomas Helland
Use our knowledge that pointers are at least 4 byte aligned to remove
the useless digits. Then shift by 6, 10, and 14 bits and add this to
the original pointer, effectively folding in the entropy of the higher
bits of the pointer into a 4-bit section. Stopping at 14 means we can
add the entropy from 18 bits, or a 256 Kbyte (2^18 byte) section of memory.
Assuming that ralloc allocates from a linearly allocated heap less than
this we can make a very efficient pointer hashing function for our usecase.

The 4 bit increment on the shifts is chosen rather arbitrarily; if we
had chosen a 3 bit increment we would need to add another shift+add to
get a decent amount of memory covered. Increasing it to 5 bits would
spread our entropy more, possibly hurting us with more collisions on
hash tables of size less than 32. With a hash table of size 16 there
are a max of 11 entries, and we can assume that with such a small table
collisions are not that painful.

This allows us to hash the whole 32 or 64 bit pointer at once,
instead of running FNV1a, looping through each byte and doing
increments, decrements, muls, and xors on every byte. This cuts
_mesa_hash_data from 1.5 % on profiles, to making _mesa_hash_pointer
show up with a 0.09% share. Collisions on insertion actually seems to be
ever so slightly lower with this hash function, as found by printing the
quad_hash loop counter, sorting and counting the data.
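
As a standalone illustration of the folding described above, here is a minimal sketch that follows the commit message (drop the two alignment bits, then add the copies shifted by 6, 10 and 14). `sketch_hash_pointer` is a hypothetical name, not the function in hash_table.h.

```c
#include <stdint.h>

/* Hypothetical sketch: fold 4-bit groups of a pointer into a 32-bit hash.
 * Assumes pointers are at least 4-byte aligned, so bits 0-1 carry nothing. */
static inline uint32_t sketch_hash_pointer(const void *pointer)
{
   uintptr_t num = (uintptr_t) pointer;

   /* Each further shift moves the next 4-bit group down onto the low bits. */
   return (uint32_t) ((num >> 2) + (num >> 6) + (num >> 10) + (num >> 14));
}
```

Two neighbouring 4-byte-aligned addresses differ in the lowest bit of `num >> 2`, so they land in different buckets even in a very small table.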
---
 src/util/hash_table.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/util/hash_table.h b/src/util/hash_table.h
index fa0e241b03..9bf6092ef9 100644
--- a/src/util/hash_table.h
+++ b/src/util/hash_table.h
@@ -104,7 +104,8 @@ static inline uint32_t _mesa_key_hash_string(const void 
*key)
 
 static inline uint32_t _mesa_hash_pointer(const void *pointer)
 {
-   return _mesa_hash_data(&pointer, sizeof(pointer));
+   uintptr_t num = (uintptr_t) pointer;
+   return (uint32_t) ((num >> 2) + (num >> 6) + (num >> 10) + (num >> 14));
 }
 
 enum {
-- 
2.11.1



[Mesa-dev] [PATCH 5/7] util: Increase start size of hash table and set to 8

2017-02-21 Thread Thomas Helland
---
 src/util/hash_table.c | 2 +-
 src/util/set.c| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/util/hash_table.c b/src/util/hash_table.c
index a93326ec27..e1255a2484 100644
--- a/src/util/hash_table.c
+++ b/src/util/hash_table.c
@@ -83,7 +83,7 @@ _mesa_hash_table_create(void *mem_ctx,
if (ht == NULL)
   return NULL;
 
-   ht->size_iteration = 2;
+   ht->size_iteration = 3;
ht->size = 1 << ht->size_iteration;
ht->max_entries = ht->size * 0.9;
ht->key_hash_function = key_hash_function;
diff --git a/src/util/set.c b/src/util/set.c
index 110f182244..31014984dc 100644
--- a/src/util/set.c
+++ b/src/util/set.c
@@ -83,7 +83,7 @@ _mesa_set_create(void *mem_ctx,
if (ht == NULL)
   return NULL;
 
-   ht->size_iteration = 2;
+   ht->size_iteration = 3;
ht->size = 1 << ht->size_iteration;
ht->max_entries = ht->size * 0.9;
ht->key_hash_function = key_hash_function;
-- 
2.11.1



[Mesa-dev] [PATCH 4/7] util: Use set_foreach instead of rolling our own

2017-02-21 Thread Thomas Helland
This follows the same pattern as in the hash_table.
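
For readers unfamiliar with the pattern, a foreach macro of this kind hides the "skip slots that are not present" check inside an iterator function. The sketch below is a simplified assumption of how such a macro can be built: the `sketch_*` types and names are hypothetical, and a NULL key marks a free slot (the real set also has to skip deleted entries).

```c
#include <stddef.h>

/* Hypothetical, simplified versions of the set types. */
struct sketch_entry { const void *key; };   /* key == NULL means free slot */
struct sketch_set { struct sketch_entry *table; size_t size; };

/* Return the first present entry after `entry`, or the first one overall
 * when `entry` is NULL. Returns NULL once the table is exhausted. */
static struct sketch_entry *
sketch_next_entry(const struct sketch_set *set, struct sketch_entry *entry)
{
   entry = entry ? entry + 1 : set->table;
   for (; entry != set->table + set->size; entry++) {
      if (entry->key != NULL)
         return entry;
   }
   return NULL;
}

#define sketch_set_foreach(set, entry)                 \
   for (entry = sketch_next_entry(set, NULL);          \
        entry != NULL;                                 \
        entry = sketch_next_entry(set, entry))

/* Example user: count the present entries. */
static size_t sketch_count(const struct sketch_set *set)
{
   struct sketch_entry *entry;
   size_t n = 0;

   sketch_set_foreach(set, entry)
      n++;

   return n;
}
```

With the check inside the iterator, a caller such as a rehash loop only needs the loop body.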

Reviewed-by: Jason Ekstrand 
---
 src/util/set.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/src/util/set.c b/src/util/set.c
index 99c04369c5..110f182244 100644
--- a/src/util/set.c
+++ b/src/util/set.c
@@ -196,12 +196,8 @@ set_rehash(struct set *ht, uint32_t new_size_iteration)
ht->entries = 0;
ht->deleted_entries = 0;
 
-   for (entry = old_ht.table;
-entry != old_ht.table + old_ht.size;
-entry++) {
-  if (entry_is_present(entry)) {
- set_add(ht, entry->hash, entry->key);
-  }
+   set_foreach(&old_ht, entry) {
+  set_add(ht, entry->hash, entry->key);
}
 
ralloc_free(old_ht.table);
-- 
2.11.1



[Mesa-dev] [PATCH 2/7] util: Change hash_table to use quadratic probing

2017-02-21 Thread Thomas Helland
This will allow us to remove the large static table and use a
power of two hash table size that we can compute on the fly.
We can use bitmasking instead of modulo to fit our hash in the table,
and it's less code.

By using the algorithm hash = sh + i/2 + i*i/2
we are guaranteed that all retries from the quad probing
are distinct, and so we should be able to completely fill the table.
This has been verified separately by Eric Anholt and Jason Ekstrand.
This passes the test added to exercise a worst case collision scenario.
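
The distinctness claim can be checked with a small standalone program. This is a sketch under my reading of the formula, namely that i/2 + i*i/2 denotes the triangular number i*(i+1)/2; the `sketch_*` names are hypothetical and not taken from hash_table.c.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Slot visited on retry i, for a table whose size is a power of two.
 * The bitmask replaces the modulo used with prime table sizes. */
static uint32_t sketch_probe(uint32_t start, uint32_t i, uint32_t size)
{
   return (start + (i + i * i) / 2) & (size - 1);
}

/* True if the first `size` probes hit `size` distinct slots.
 * Intended for small tables; i * i wraps once i reaches 65536. */
static bool sketch_probe_covers_table(uint32_t start, uint32_t size)
{
   bool *seen = calloc(size, sizeof(*seen));
   bool ok = seen != NULL;

   for (uint32_t i = 0; ok && i < size; i++) {
      uint32_t slot = sketch_probe(start, i, size);

      if (seen[slot])
         ok = false;   /* a slot was visited twice */
      seen[slot] = true;
   }

   free(seen);
   return ok;
}
```

For a table of size 8 starting at slot 0, the probe order is 0, 1, 3, 6, 2, 7, 5, 4, so every slot is reached within eight attempts.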

This also exhibits much better collision avoidance performance than
the existing implementation; the worst case collision I've found by
running shader-db on a subset of my shader-collection is 78 runs of
the double hashing loop before insertion succeeded. With this
implementation the worst case is 25. With the existing implementation
99% of insertions had succeeded after 10 retries, while with the new
implementation that is dropped to 7. 99.5% of insertions succeeded after
13 retries, while with the new implementation that number is down to 8-9.
While the new implementation performs better on collisions in general,
the biggest difference is that the old one uses more retries to insert
an entry when it exhibits collisions. While the new implementation sees
12506 insertions that have not found a location after 10 retries,
that drops to only 74 after 20 retries. The old implementation sees
13864 insertions not having a spot after 16 retries. 1370 of these
have still not found a spot in the table after 26 retries.

V4: Feedback from Eric Anholt
   - Don't change load factor or starting size.

V3: Feedback from Connor Abbott
   - Remove hash_size table
   - Correct comment-style

   Feedback from Eric Anholt
   - Correct quadratic probing algorithm

   Feedback from Jason Ekstrand
   - Add "unreachable" if we fail to insert in table
---
 src/util/hash_table.c | 102 +++---
 src/util/hash_table.h |   3 +-
 2 files changed, 32 insertions(+), 73 deletions(-)

diff --git a/src/util/hash_table.c b/src/util/hash_table.c
index 9e643af8b2..a93326ec27 100644
--- a/src/util/hash_table.c
+++ b/src/util/hash_table.c
@@ -33,11 +33,14 @@
  */
 
 /**
- * Implements an open-addressing, linear-reprobing hash table.
+ * Implements an open-addressing, quadratic probing hash table.
  *
- * For more information, see:
- *
- * http://cgit.freedesktop.org/~anholt/hash_table/tree/README
+ * We choose table sizes that's a power of two.
+ * This is computationally less expensive than primes.
+ * As a bonus the size and free space can be calculated instead of looked up.
+ * FNV-1a has good avalanche properties, so collision is not an issue.
+ * These tables are sized to have an extra 10% free to avoid
+ * exponential performance degradation as the hash table fills.
  */
 
 #include 
@@ -50,47 +53,6 @@
 
 static const uint32_t deleted_key_value;
 
-/**
- * From Knuth -- a good choice for hash/rehash values is p, p-2 where
- * p and p-2 are both prime.  These tables are sized to have an extra 10%
- * free to avoid exponential performance degradation as the hash table fills
- */
-static const struct {
-   uint32_t max_entries, size, rehash;
-} hash_sizes[] = {
-   { 2,5,  3 },
-   { 4,7,  5 },
-   { 8,13, 11},
-   { 16,   19, 17},
-   { 32,   43, 41},
-   { 64,   73, 71},
-   { 128,  151,149   },
-   { 256,  283,281   },
-   { 512,  571,569   },
-   { 1024, 1153,   1151  },
-   { 2048, 2269,   2267  },
-   { 4096, 4519,   4517  },
-   { 8192, 9013,   9011  },
-   { 16384,18043,  18041 },
-   { 32768,36109,  36107 },
-   { 65536,72091,  72089 },
-   { 131072,   144409, 144407},
-   { 262144,   288361, 288359},
-   { 524288,   576883, 576881},
-   { 1048576,  1153459,1153457   },
-   { 2097152,  2307163,2307161   },
-   { 4194304,  4613893,4613891   },
-   { 8388608,  9227641,9227639   },
-   { 16777216, 18455029,   18455027  },
-   { 33554432, 36911011,   36911009  },
-   { 67108864, 73819861,   73819859  },
-   { 134217728,147639589,  147639587 },
-   { 268435456,295279081,  295279079 },
-   { 536870912,590559793,  590559791 },
-   { 1073741824,   1181116273, 1181116271},
-   { 2147483648ul, 2362232233ul,   2362232231ul}
-};
-
 static int
 entry_is_free(const struct hash_entry 

[Mesa-dev] [PATCH 1/7] util/tests: Expand collision test for hash table

2017-02-21 Thread Thomas Helland
Add a test to exercise a worst case collision scenario
that may cause us to not be able to find an empty
slot in the table even though it is not full.
This hits the bug in my last revision of the series
converting the hash table to quadratic probing.

V2: Feedback from Emil Velikov
-Don't include code in the assert
---
 src/util/tests/hash_table/collision.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/util/tests/hash_table/collision.c 
b/src/util/tests/hash_table/collision.c
index 69a4c29eca..51a537757b 100644
--- a/src/util/tests/hash_table/collision.c
+++ b/src/util/tests/hash_table/collision.c
@@ -91,5 +91,19 @@ main(int argc, char **argv)
 
_mesa_hash_table_destroy(ht, NULL);
 
+   /* Try inserting multiple items with the same hash
+* This exercises a worst case scenario where we might fail to find
+* an empty slot in the table, even though there is free space
+*/
+   ht = _mesa_hash_table_create(NULL, NULL, _mesa_key_string_equal);
+   for (i = 0; i < 100; i++) {
+  char *key = malloc(10);
+  sprintf(key, "spam%d", i);
+  entry2 = _mesa_hash_table_insert_pre_hashed(ht, bad_hash, key, NULL);
+  assert(entry2 != NULL);
+   }
+
+   _mesa_hash_table_destroy(ht, NULL);
+
return 0;
 }
-- 
2.11.1



[Mesa-dev] [PATCH 6/7] util: Use a starting load factor of 7/8 entries

2017-02-21 Thread Thomas Helland
---
 src/util/hash_table.c | 2 +-
 src/util/set.c| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/util/hash_table.c b/src/util/hash_table.c
index e1255a2484..8121c8e67a 100644
--- a/src/util/hash_table.c
+++ b/src/util/hash_table.c
@@ -85,7 +85,7 @@ _mesa_hash_table_create(void *mem_ctx,
 
ht->size_iteration = 3;
ht->size = 1 << ht->size_iteration;
-   ht->max_entries = ht->size * 0.9;
+   ht->max_entries = ht->size - 1;
ht->key_hash_function = key_hash_function;
ht->key_equals_function = key_equals_function;
ht->table = rzalloc_array(ht, struct hash_entry, ht->size);
diff --git a/src/util/set.c b/src/util/set.c
index 31014984dc..98b5670834 100644
--- a/src/util/set.c
+++ b/src/util/set.c
@@ -85,7 +85,7 @@ _mesa_set_create(void *mem_ctx,
 
ht->size_iteration = 3;
ht->size = 1 << ht->size_iteration;
-   ht->max_entries = ht->size * 0.9;
+   ht->max_entries = ht->size - 1;
ht->key_hash_function = key_hash_function;
ht->key_equals_function = key_equals_function;
ht->table = rzalloc_array(ht, struct set_entry, ht->size);
-- 
2.11.1



[Mesa-dev] [PATCH 0/7] Another take on the hash table

2017-02-21 Thread Thomas Helland
I think this should pretty much be the patches in their final form.
A recap of the history as of now: The minecraft tests that Eric Anholt
did were based on a replay of an apitrace. So that is likely not all
that representative of real life workloads. Name lookups were the one
big thing that the minecraft replay abused badly. So if someone
has a favourite testcase for this then I'm all ears.

Some numbers from running shader-db single threaded:
hash_table_search: 3.88% -> 1.84%
hash_table_insert: 2.26% -> 1.16%
set_add: 0.70% -> 0.35%
set_search: 0.59% -> 0.27%
_mesa_hash_data is cut from 1.52% to basically nothing, and is replaced
with _mesa_hash_pointer at 0.11%

Runtime of shader-db, five runs (left is before, right is after)
195.06  -->  187.62
194.39  -->  182.20
194.12  -->  181.79
199.27  -->  182.08
194.18  -->  182.16

There are some outliers here, but I thought the general trend
was clear enough that I didn't pursue more runs. Approximately 6%
runtime improvement, which matches quite well with the findings from
years ago, where I actually used ministat to get statistically
significant numbers on this.
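
For reference, the means of the five runs listed above work out as follows. This is plain arithmetic on the posted numbers, not ministat output, and it ignores run-to-run variance.

```latex
\begin{aligned}
\bar{t}_{\mathrm{before}} &= \tfrac{1}{5}\,(195.06 + 194.39 + 194.12 + 199.27 + 194.18) = 195.40 \\
\bar{t}_{\mathrm{after}} &= \tfrac{1}{5}\,(187.62 + 182.20 + 181.79 + 182.08 + 182.16) = 183.17 \\
\frac{\bar{t}_{\mathrm{before}} - \bar{t}_{\mathrm{after}}}{\bar{t}_{\mathrm{before}}}
   &= \frac{12.23}{195.40} \approx 6.3\,\%
\end{aligned}
```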

I've done some tests on my i3-6100 / RX460 combo running the
metro ll, cs:go, dota2 and talos principle test profiles using the
phoronix test suite. No significant changes were detected.
So if you have a favourite workload that could benefit, let me know,
or give the series a spin yourself =)

New in this version is the last patch, that reworks our pointer hashing.
This patch has been tested on shader-db, but has not been tested on
other workloads, like the above mentioned games. That's on my todo-list.

Thomas Helland (7):
  util/tests: Expand collision test for hash table
  util: Change hash_table to use quadratic probing
  util: Change util/set to use quadratic probing
  util: Use set_foreach instead of rolling our own
  util: Increase start size of hash table and set to 8
  util: Use a starting load factor of 7/8 entries
  util: Change the pointer hashing function

 src/util/hash_table.c | 102 +
 src/util/hash_table.h |   6 +-
 src/util/set.c| 118 --
 src/util/set.h|   3 +-
 src/util/tests/hash_table/collision.c |  14 
 5 files changed, 91 insertions(+), 152 deletions(-)

-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/7] util: Change util/set to use quadratic probing

2017-02-21 Thread Thomas Helland
The same rationale applies here as for the hash table.
Power of two size should give better performance,
and using the algorithm hash = sh + i/2 + i*i/2
should result in only distinct hash values when hitting collisions.
Collision performance is also a lot better.
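
A small sketch of the probing scheme, assuming `sh` is the starting address
(the hash masked to the table size) and `i` is the probe count. The helper
names are hypothetical and not taken from set.c. i/2 + i*i/2 is written as
(i + i*i)/2, the i-th triangular number, so the integer division is exact.
With a power-of-two size the first `size` probes touch every slot exactly
once:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: address of the i-th probe for a power-of-two table.
 * (i + i*i) / 2 is always an integer because i + i*i is even. */
static uint32_t
probe_address(uint32_t start, uint32_t i, uint32_t size)
{
   assert(size != 0 && (size & (size - 1)) == 0);
   return (start + (i + i * i) / 2) & (size - 1);
}

/* Returns true if the first `size` probes visit every slot exactly once. */
static bool
probe_sequence_is_permutation(uint32_t start, uint32_t size)
{
   bool seen[1024];

   assert(size <= 1024);
   memset(seen, 0, sizeof(seen));

   for (uint32_t i = 0; i < size; i++) {
      uint32_t addr = probe_address(start, i, size);
      if (seen[addr])
         return false;   /* a slot was probed twice */
      seen[addr] = true;
   }
   return true;
}
```

For a table of size 8 starting at slot 0 the probe order is
0, 1, 3, 6, 2, 7, 5, 4, so a search never revisits a slot before it has
looked at all of them.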

V4: Feedback from Jason Ekstrand
   - Split cleanup of set_rehash into separate commit

V3: Feedback from Eric Anholt
   - Don't change load factor and starting size.

V2: Feedback from Connor Abbott
   - Don't set initial hash address before potential rehash
   - Remove hash_sizes table
   - Correct the quadratic hashing algorithm
   - Use correct comment style

   Feedback from Jason Ekstrand
   - Use unreachable() to detect if we fail to insert

Reviewed-by: Jason Ekstrand 
---
 src/util/set.c | 110 +
 src/util/set.h |   3 +-
 2 files changed, 41 insertions(+), 72 deletions(-)

diff --git a/src/util/set.c b/src/util/set.c
index 99abefd063..99c04369c5 100644
--- a/src/util/set.c
+++ b/src/util/set.c
@@ -32,6 +32,17 @@
  *Keith Packard 
  */
 
+/**
+ * Implements an open-addressing, quadratic probing hash-set.
+ *
+ * We choose set sizes that's a power of two.
+ * This is computationally less expensive than primes.
+ * As a bonus the size and free space can be calculated instead of looked up.
+ * FNV-1a has good avalanche properties, so collision is not an issue.
+ * These sets are sized to have an extra 10% free to avoid
+ * exponential performance degradation as the set fills.
+ */
+
 #include 
 #include 
 
@@ -39,51 +50,9 @@
 #include "ralloc.h"
 #include "set.h"
 
-/*
- * From Knuth -- a good choice for hash/rehash values is p, p-2 where
- * p and p-2 are both prime.  These tables are sized to have an extra 10%
- * free to avoid exponential performance degradation as the hash table fills
- */
-
 uint32_t deleted_key_value;
 const void *deleted_key = &deleted_key_value;
 
-static const struct {
-   uint32_t max_entries, size, rehash;
-} hash_sizes[] = {
-   { 2,5,3},
-   { 4,7,5},
-   { 8,13,   11   },
-   { 16,   19,   17   },
-   { 32,   43,   41   },
-   { 64,   73,   71   },
-   { 128,  151,  149  },
-   { 256,  283,  281  },
-   { 512,  571,  569  },
-   { 1024, 1153, 1151 },
-   { 2048, 2269, 2267 },
-   { 4096, 4519, 4517 },
-   { 8192, 9013, 9011 },
-   { 16384,18043,18041},
-   { 32768,36109,36107},
-   { 65536,72091,72089},
-   { 131072,   144409,   144407   },
-   { 262144,   288361,   288359   },
-   { 524288,   576883,   576881   },
-   { 1048576,  1153459,  1153457  },
-   { 2097152,  2307163,  2307161  },
-   { 4194304,  4613893,  4613891  },
-   { 8388608,  9227641,  9227639  },
-   { 16777216, 18455029, 18455027 },
-   { 33554432, 36911011, 36911009 },
-   { 67108864, 73819861, 73819859 },
-   { 134217728,147639589,147639587},
-   { 268435456,295279081,295279079},
-   { 536870912,590559793,590559791},
-   { 1073741824,   1181116273,   1181116271   },
-   { 2147483648ul, 2362232233ul, 2362232231ul }
-};
-
 static int
 entry_is_free(struct set_entry *entry)
 {
@@ -114,10 +83,9 @@ _mesa_set_create(void *mem_ctx,
if (ht == NULL)
   return NULL;
 
-   ht->size_index = 0;
-   ht->size = hash_sizes[ht->size_index].size;
-   ht->rehash = hash_sizes[ht->size_index].rehash;
-   ht->max_entries = hash_sizes[ht->size_index].max_entries;
+   ht->size_iteration = 2;
+   ht->size = 1 << ht->size_iteration;
+   ht->max_entries = ht->size * 0.9;
ht->key_hash_function = key_hash_function;
ht->key_equals_function = key_equals_function;
ht->table = rzalloc_array(ht, struct set_entry, ht->size);
@@ -163,12 +131,11 @@ _mesa_set_destroy(struct set *ht, void 
(*delete_function)(struct set_entry *entr
 static struct set_entry *
 set_search(const struct set *ht, uint32_t hash, const void *key)
 {
-   uint32_t hash_address;
+   uint32_t start_hash_address = hash & (ht->size - 1);
+   uint32_t hash_address = start_hash_address;
+   uint32_t quad_hash = 1;
 
-   hash_address = hash % ht->size;
do {
-  uint32_t double_hash;
-
   struct set_entry *entry = ht->table + hash_address;
 
   if (entry_is_free(entry)) {
@@ -179,10 +146,10 @@ set_search(const struct set *ht, uint32_t hash, const 
void *key)
  }
   }
 
-  double_hash = 1 + hash % ht->rehash;
-
-  hash_address = (hash_address + double_hash) % ht->size;
-   } while (hash_address != hash % ht->size);
+  

[Mesa-dev] [Bug 99886] [radv] No actual multithreading in a Vulkan multithreading demo

2017-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99886

Mike Lothian  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |m...@fireburn.co.uk

--- Comment #2 from Mike Lothian  ---
If you use atop it's easier to see that the load is spread over all cores
evenly

Confirmed the demo is working fine here with radv and each core adds up to
~100% usage

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] st/nine: make use of common uploaders v4

2017-02-21 Thread Constantine Charlamov
On 21.02.2017 23:28, Axel Davy wrote:
> This looks fine to me.
>
> Reviewed-by: Axel Davy 
>
> I think the patch requires your Signed-off-by though.
>
> Axel
>

v2: fixed formatting, broken due to thunderbird configuration

v3: per Axel comment: added a comment into NineDevice9_DrawPrimitiveUP

v4: per Axel comment: changed style of the comment

Signed-off-by: Constantine Charlamov 

 ---
 src/gallium/state_trackers/nine/device9.c| 50 +---
 src/gallium/state_trackers/nine/device9.h|  5 ---
 src/gallium/state_trackers/nine/nine_ff.c|  8 ++---
 src/gallium/state_trackers/nine/nine_state.c | 48 +-
 4 files changed, 37 insertions(+), 74 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index b9b7a637d7..86c8e38535 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -477,31 +477,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
 This->driver_caps.user_cbufs = GET_PCAP(USER_CONSTANT_BUFFERS);
 This->driver_caps.user_sw_vbufs = 
This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_VERTEX_BUFFERS);
 This->driver_caps.user_sw_cbufs = 
This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_CONSTANT_BUFFERS);
-
-/* Implicit use of context pipe for vertex and index uploaded when
- * csmt is not active. Does not need to sync since csmt is unactive,
- * thus no need to call NineDevice9_GetPipe at each upload. */
-if (!This->driver_caps.user_vbufs)
-This->vertex_uploader = u_upload_create(This->csmt_active ?
-This->pipe_secondary : 
This->context.pipe,
-65536,
-PIPE_BIND_VERTEX_BUFFER, 
PIPE_USAGE_STREAM);
-This->vertex_sw_uploader = u_upload_create(This->pipe_sw, 65536,
-PIPE_BIND_VERTEX_BUFFER, 
PIPE_USAGE_STREAM);
-if (!This->driver_caps.user_ibufs)
-This->index_uploader = u_upload_create(This->csmt_active ?
-This->pipe_secondary : 
This->context.pipe,
-   128 * 1024,
-   PIPE_BIND_INDEX_BUFFER, 
PIPE_USAGE_STREAM);
-if (!This->driver_caps.user_cbufs) {
+if (!This->driver_caps.user_cbufs)
 This->constbuf_alignment = GET_PCAP(CONSTANT_BUFFER_OFFSET_ALIGNMENT);
-This->constbuf_uploader = u_upload_create(This->context.pipe, 
This->vs_const_size,
-  PIPE_BIND_CONSTANT_BUFFER, 
PIPE_USAGE_STREAM);
-}
-
-This->constbuf_sw_uploader = u_upload_create(This->pipe_sw, 128 * 1024,
- PIPE_BIND_CONSTANT_BUFFER, 
PIPE_USAGE_STREAM);
-
 This->driver_caps.window_space_position_support = 
GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
 This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, 
PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
 This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, 
PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
@@ -552,17 +529,6 @@ NineDevice9_dtor( struct NineDevice9 *This )
 nine_state_clear(&This->state, TRUE);
 nine_context_clear(This);
 
-if (This->vertex_uploader)
-u_upload_destroy(This->vertex_uploader);
-if (This->index_uploader)
-u_upload_destroy(This->index_uploader);
-if (This->constbuf_uploader)
-u_upload_destroy(This->constbuf_uploader);
-if (This->vertex_sw_uploader)
-u_upload_destroy(This->vertex_sw_uploader);
-if (This->constbuf_sw_uploader)
-u_upload_destroy(This->constbuf_sw_uploader);
-
 nine_bind(&This->record, NULL);
 
 pipe_sampler_view_reference(&This->dummy_sampler_view, NULL);
@@ -2852,15 +2818,17 @@ NineDevice9_DrawPrimitiveUP( struct NineDevice9 *This,
 vtxbuf.buffer = NULL;
 vtxbuf.user_buffer = pVertexStreamZeroData;
 
+/* csmt is unactive when user vertex or index buffers are used, thus no
+ * need to call NineDevice9_GetPipe. */
 if (!This->driver_caps.user_vbufs) {
-u_upload_data(This->vertex_uploader,
+u_upload_data(This->context.pipe->stream_uploader,
   0,
   (prim_count_to_vertex_count(PrimitiveType, 
PrimitiveCount)) * VertexStreamZeroStride, /* XXX */
   4,
   vtxbuf.user_buffer,
   &vtxbuf.buffer_offset,
   &vtxbuf.buffer);
-u_upload_unmap(This->vertex_uploader);
+u_upload_unmap(This->context.pipe->stream_uploader);
 vtxbuf.user_buffer = NULL;
 }
 
@@ -2916,27 +2884,27 @@ NineDevice9_DrawIndexedPrimitiveUP( struct NineDevice9 
*This,
 
 if (!This->driver_caps.user_vbufs) {
 const unsigned base = 

Re: [Mesa-dev] [PATCH] anv: implement pipeline statistics queries

2017-02-21 Thread Ilia Mirkin
Looks like after nearly 3 months of no reviews that led to R-b's[1],
this patch no longer applies to master. I'm abandoning it. If anyone's
interested, feel free to pick it up and make it your own.

Cheers,

  -ilia

[1] Robert did look at it, but didn't have significant feedback or say
what I needed to do to get a R-b, nor is he one of the main anv
contributors, so I'd have to get someone else's anyways.


On Tue, Jan 24, 2017 at 2:48 PM, Ilia Mirkin  wrote:
> 2-month ping. [ok, it hasn't been 2 months on the dot, but ... close.]
>
> On Tue, Jan 10, 2017 at 5:49 PM, Ilia Mirkin  wrote:
>> ping.
>>
>> On Thu, Dec 22, 2016 at 11:14 AM, Ilia Mirkin  wrote:
>>> Ping? Any further comments/feedback/reviews?
>>>
>>>
>>> On Dec 5, 2016 11:22 AM, "Ilia Mirkin"  wrote:
>>>
>>> On Mon, Dec 5, 2016 at 11:11 AM, Robert Bragg  wrote:


 On Sun, Nov 27, 2016 at 7:23 PM, Ilia Mirkin  wrote:
>
> The strategy is to just keep n anv_query_pool_slot entries per query
> instead of one. The available bit is only valid in the last one.
>
> Signed-off-by: Ilia Mirkin 
> ---
>
> I think this is in a pretty good state now. I've tested both the direct
> and
> buffer paths with a hacked up cube application, and I'm seeing
> non-ridiculous
> values for the various counters, although I haven't 100% verified them
> for
> accuracy.
>
> This also implements the hsw/bdw workaround for dividing frag invocations
> by 4,
> copied from hsw_queryobj. I tested this on SKL and it seems to divide the
> values
> as expected.
>
> The cube patch I've been testing with is at
> http://paste.debian.net/899374/
> You can flip between copying to a buffer and explicit retrieval by
> commenting
> out the relevant function calls.
>
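
The "n slots per query" layout described above can be sketched roughly as
follows. This is only an illustration under stated assumptions: the struct
and helper names are hypothetical, the third `available` field is assumed
(the hunk quoted below only shows `begin` and `end`), and taking the stride
to be the number of enabled statistics bits is a guess at how `slot_stride`
would be derived, not something the quoted patch confirms.

```c
#include <assert.h>
#include <stdint.h>

#define ANV_PIPELINE_STATISTICS_COUNT 11

/* Hypothetical stand-in for anv_query_pool_slot; `available` is assumed. */
struct query_slot_sketch {
   uint64_t begin;
   uint64_t end;
   uint64_t available;
};

/* Assumption: one slot per enabled statistic, i.e. the popcount of the mask. */
static uint32_t
slots_per_query(uint32_t pipeline_statistics)
{
   uint32_t mask = pipeline_statistics &
                   ((1u << ANV_PIPELINE_STATISTICS_COUNT) - 1);
   uint32_t n = 0;

   for (; mask != 0; mask &= mask - 1)   /* clear the lowest set bit */
      n++;
   return n;
}

/* Total backing storage for a pool with query_count queries. */
static uint64_t
pool_size_bytes(uint32_t query_count, uint32_t pipeline_statistics)
{
   return (uint64_t)query_count * slots_per_query(pipeline_statistics) *
          sizeof(struct query_slot_sketch);
}

/* Only the last slot of a query carries a valid "available" flag. */
static uint32_t
available_slot_index(uint32_t query, uint32_t stride)
{
   assert(stride > 0);
   return query * stride + (stride - 1);
}
```

With three statistics enabled, query 1 would occupy slots 3..5 and its
availability would be read from slot 5.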
>  src/intel/vulkan/anv_device.c  |   2 +-
>  src/intel/vulkan/anv_private.h |   4 +
>  src/intel/vulkan/anv_query.c   |  99 ++
>  src/intel/vulkan/genX_cmd_buffer.c | 260
> -
>  4 files changed, 308 insertions(+), 57 deletions(-)
>
>
> diff --git a/src/intel/vulkan/anv_device.c
> b/src/intel/vulkan/anv_device.c
> index 99eb73c..7ad1970 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -427,7 +427,7 @@ void anv_GetPhysicalDeviceFeatures(
>.textureCompressionASTC_LDR   = pdevice->info.gen >=
> 9,
> /* FINISHME CHV */
>.textureCompressionBC = true,
>.occlusionQueryPrecise= true,
> -  .pipelineStatisticsQuery  = false,
> +  .pipelineStatisticsQuery  = true,
>.fragmentStoresAndAtomics = true,
>.shaderTessellationAndGeometryPointSize   = true,
>.shaderImageGatherExtended= false,
> diff --git a/src/intel/vulkan/anv_private.h
> b/src/intel/vulkan/anv_private.h
> index 2fc543d..7271609 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -1763,6 +1763,8 @@ struct anv_render_pass {
> struct anv_subpass   subpasses[0];
>  };
>
> +#define ANV_PIPELINE_STATISTICS_COUNT 11
> +
>  struct anv_query_pool_slot {
> uint64_t begin;
> uint64_t end;
> @@ -1772,6 +1774,8 @@ struct anv_query_pool_slot {
>  struct anv_query_pool {
> VkQueryType  type;
> uint32_t slots;
> +   uint32_t pipeline_statistics;
> +   uint32_t slot_stride;
> struct anv_bo bo;
>  };
>
> diff --git a/src/intel/vulkan/anv_query.c b/src/intel/vulkan/anv_query.c
> index 293257b..dc00859 100644
> --- a/src/intel/vulkan/anv_query.c
> +++ b/src/intel/vulkan/anv_query.c
> @@ -38,8 +38,10 @@ VkResult anv_CreateQueryPool(
> ANV_FROM_HANDLE(anv_device, device, _device);
> struct anv_query_pool *pool;
> VkResult result;
> -   uint32_t slot_size;
> -   uint64_t size;
> +   uint32_t slot_size = sizeof(struct anv_query_pool_slot);
> +   uint32_t slot_stride = 1;
> +   uint64_t size = pCreateInfo->queryCount * slot_size;
> +   uint32_t pipeline_statistics = 0;
>
> assert(pCreateInfo->sType ==
> VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO);
>
> @@ -48,12 +50,16 @@ VkResult anv_CreateQueryPool(
> case VK_QUERY_TYPE_TIMESTAMP:
>break;
> case VK_QUERY_TYPE_PIPELINE_STATISTICS:
> -  return 

Re: [Mesa-dev] [PATCH] st/nine: make use of common uploaders v4

2017-02-21 Thread Axel Davy

This looks fine to me.

Reviewed-by: Axel Davy 

I think the patch requires your Signed-off-by though.

Axel

On 21/02/2017 05:31, Constantine Charlamov wrote:

Make use of common uploaders that landed recently to Mesa

v2: fixed formatting, broken due to thunderbird configuration

v3: per Axel comment: added a comment into NineDevice9_DrawPrimitiveUP

v4: per Axel comment: changed style of the comment

---
  src/gallium/state_trackers/nine/device9.c| 50 +---
  src/gallium/state_trackers/nine/device9.h|  5 ---
  src/gallium/state_trackers/nine/nine_ff.c|  8 ++---
  src/gallium/state_trackers/nine/nine_state.c | 48 +-
  4 files changed, 37 insertions(+), 74 deletions(-)

diff --git a/src/gallium/state_trackers/nine/device9.c 
b/src/gallium/state_trackers/nine/device9.c
index b9b7a637d7..86c8e38535 100644
--- a/src/gallium/state_trackers/nine/device9.c
+++ b/src/gallium/state_trackers/nine/device9.c
@@ -477,31 +477,8 @@ NineDevice9_ctor( struct NineDevice9 *This,
  This->driver_caps.user_cbufs = GET_PCAP(USER_CONSTANT_BUFFERS);
  This->driver_caps.user_sw_vbufs = 
This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_VERTEX_BUFFERS);
  This->driver_caps.user_sw_cbufs = 
This->screen_sw->get_param(This->screen_sw, PIPE_CAP_USER_CONSTANT_BUFFERS);
-
-/* Implicit use of context pipe for vertex and index uploaded when
- * csmt is not active. Does not need to sync since csmt is unactive,
- * thus no need to call NineDevice9_GetPipe at each upload. */
-if (!This->driver_caps.user_vbufs)
-This->vertex_uploader = u_upload_create(This->csmt_active ?
-This->pipe_secondary : 
This->context.pipe,
-65536,
-PIPE_BIND_VERTEX_BUFFER, 
PIPE_USAGE_STREAM);
-This->vertex_sw_uploader = u_upload_create(This->pipe_sw, 65536,
-PIPE_BIND_VERTEX_BUFFER, 
PIPE_USAGE_STREAM);
-if (!This->driver_caps.user_ibufs)
-This->index_uploader = u_upload_create(This->csmt_active ?
-This->pipe_secondary : 
This->context.pipe,
-   128 * 1024,
-   PIPE_BIND_INDEX_BUFFER, 
PIPE_USAGE_STREAM);
-if (!This->driver_caps.user_cbufs) {
+if (!This->driver_caps.user_cbufs)
  This->constbuf_alignment = GET_PCAP(CONSTANT_BUFFER_OFFSET_ALIGNMENT);
-This->constbuf_uploader = u_upload_create(This->context.pipe, 
This->vs_const_size,
-  PIPE_BIND_CONSTANT_BUFFER, 
PIPE_USAGE_STREAM);
-}
-
-This->constbuf_sw_uploader = u_upload_create(This->pipe_sw, 128 * 1024,
- PIPE_BIND_CONSTANT_BUFFER, 
PIPE_USAGE_STREAM);
-
  This->driver_caps.window_space_position_support = 
GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION);
  This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, 
PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS);
  This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, 
PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS);
@@ -552,17 +529,6 @@ NineDevice9_dtor( struct NineDevice9 *This )
  nine_state_clear(&This->state, TRUE);
  nine_context_clear(This);
  
-if (This->vertex_uploader)
-u_upload_destroy(This->vertex_uploader);
-if (This->index_uploader)
-u_upload_destroy(This->index_uploader);
-if (This->constbuf_uploader)
-u_upload_destroy(This->constbuf_uploader);
-if (This->vertex_sw_uploader)
-u_upload_destroy(This->vertex_sw_uploader);
-if (This->constbuf_sw_uploader)
-u_upload_destroy(This->constbuf_sw_uploader);
-
  nine_bind(&This->record, NULL);
  
  pipe_sampler_view_reference(&This->dummy_sampler_view, NULL);
@@ -2852,15 +2818,17 @@ NineDevice9_DrawPrimitiveUP( struct NineDevice9 *This,
  vtxbuf.buffer = NULL;
  vtxbuf.user_buffer = pVertexStreamZeroData;
  
+/* csmt is unactive when user vertex or index buffers are used, thus no
+ * need to call NineDevice9_GetPipe. */
  if (!This->driver_caps.user_vbufs) {
-u_upload_data(This->vertex_uploader,
+u_upload_data(This->context.pipe->stream_uploader,
0,
(prim_count_to_vertex_count(PrimitiveType, 
PrimitiveCount)) * VertexStreamZeroStride, /* XXX */
4,
vtxbuf.user_buffer,
&vtxbuf.buffer_offset,
&vtxbuf.buffer);
-u_upload_unmap(This->vertex_uploader);
+u_upload_unmap(This->context.pipe->stream_uploader);
  vtxbuf.user_buffer = NULL;
  }
  
@@ -2916,27 +2884,27 @@ NineDevice9_DrawIndexedPrimitiveUP( struct NineDevice9 *This,
  
  if (!This->driver_caps.user_vbufs) 

Re: [Mesa-dev] [Mesa-stable] [PATCH v2 1/3] i965/fs: fix indirect load DF uniforms on BSW/BXT

2017-02-21 Thread Francisco Jerez
Samuel Iglesias Gonsálvez  writes:

> On 20/02/17 21:31, Francisco Jerez wrote:
>> Samuel Iglesias Gonsálvez  writes:
>> 
>>> On Mon, 2017-02-20 at 08:58 +0100, Samuel Iglesias Gonsálvez wrote:
 On Sat, 2017-02-18 at 18:58 -0800, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez  writes:
>
>> The lowered BSW/BXT indirect move instructions had incorrect
>> source types, which luckily wasn't causing incorrect assembly to
>> be
>> generated due to the bug fixed in the next patch, but would have
>> confused the remaining back-end IR infrastructure due to the
>> mismatch
>> between the IR source types and the emitted machine code.
>>
>> v2:
>> - Improve commit log (Curro)
>> - Fix read_size (Curro)
>> - Fix DF uniform array detection in assign_constant_locations()
>> when
>>   it is accessed with 32-bit MOV_INDIRECTs in BSW/BXT.
>>
>> Signed-off-by: Samuel Iglesias Gonsálvez 
>> Cc: "17.0" 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 11 -
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 41 --
>> --
>>  2 files changed, 30 insertions(+), 22 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index c348bc7138d..93ab84b5845 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -1944,10 +1944,19 @@ fs_visitor::assign_constant_locations()
>>  unsigned last = constant_nr + (inst->src[2].ud / 4)
>> -
>> 1;
>>  assert(last < uniforms);
>>  
>> +bool supports_64bit_indirects =
>> +   !devinfo->is_cherryview && !devinfo->is_broxton;
>> +/* Detect if this is as a result 64-bit MOV
>> INDIRECT.
>> In case
>> + * of BSW/BXT, we substitute each DF MOV INDIRECT by
>> two 32-bit MOV
>> + * INDIRECT.
>> + */
>> +bool mov_indirect_64bit = (type_sz(inst-
>>> src[i].type)
>> == 8) ||
>> +   (!supports_64bit_indirects && inst->dst.type ==
>> BRW_REGISTER_TYPE_UD &&
>> +inst->src[0].type == BRW_REGISTER_TYPE_UD &&
>> inst-
>>> dst.stride == 2);
>
> This seems kind of fragile, I don't think the optimizer gives you
> any
> guarantees that the stride of a lowered 64-bit indirect move will
> remain
> equal to two, or that the destination stride of an actual 32-bit
> indirect uniform load will never end up being two as well.  That
> said,
> because you access these with 32-bit indirect moves, I don't see
> why
> you'd need to treat them as 64-bit uniforms here, the usual
> alignment
> requirements for 64-bit uniforms no longer apply, so you can treat
> them
> as regular 32-bit uniforms AFAICT.  Why did you add this hunk?
>

 I added it because of this case: if we access one DF uniform array
 element with a normal MOV and the rest with MOV INDIRECT, we will
 mark
 the former as a live 64-bit variable. Then the array ends up scattered,
 with part of it uploaded as a 64-bit uniform and the rest as 32-bit.
 Even if we avoid this by uploading everything together as 32-bit,
 the access to that DF might then not be aligned to 64 bits.

 So my idea was to find a way to identify somehow those MOV INDIRECT
 in
 BSW to mark all the array as a 64-bit one.

>>>
>>> Mmm, maybe I can fix this case without the hack I did. I can add the
>>> following code after marking all live variables accessed by the
>>> instructions.
>>>
>>> It is very similar to the one that detects live variables, but it fixes
>>> the case where any MOV INDIRECT in BSW accesses a uniform array
>>> of DF elements where one of these elements is directly accessed by
>>> another instruction.
>>>
>>> What do you think?
>>>
>> 
>> Looks somewhat better, but I don't think this is correct if you have
>> multiple overlapping indirect loads of the same uniform array and only
>> one of them overlaps with a direct 64-bit load.
>
> In that case, I think this is correct. The two 32-bit MOV INDIRECTs were
> emitted as a "lowering" of an unsupported 64-bit MOV INDIRECT. They both
> keep the 'read_size' of the original one, so they both overlap with any
> other direct 64-bit load from that array, just like the original
> instruction. If neither of them overlaps with the direct 64-bit access, then I
> think they can be handled as non-contiguous to the latter without any issue.
>

What if you have two lowered indirect loads with overlapping but not
identical range, and only the second one contains elements accessed with
regular 64-bit moves?  Wouldn't your 

Re: [Mesa-dev] [PATCH mesa 0/5] gallium/docs: formatting fixes

2017-02-21 Thread Ilia Mirkin
Series is

Reviewed-by: Ilia Mirkin 

I actually had fixes for some of the tgsi.rst stuff but never mailed
them out. We'll see what the rebase leaves me with once you push them.

On Tue, Feb 21, 2017 at 9:15 AM, Eric Engestrom
 wrote:
> I noticed a bunch of warnings and errors when compiling the docs, so
> I fixed the ones I knew how.
>
> There's a couple warnings left to fix, if anyone's interested:
>
> - src/gallium/docs/source/drivers/freedreno/ir3-notes.rst:195:
>   WARNING: Could not lex literal_block as "c". Highlighting skipped.
>   Should this block be marked as assembly? (Rob Clark? you wrote this
>   a couple years ago)
>
> - src/gallium/docs/source/distro.rst:104:
>   WARNING: undefined label: egl
>   There is no EGL section in distro.rst, if anyone wants to write it :)
>
> - src/gallium/docs/source/conf.py:126:
>   WARNING: html_static_path entry 'src/gallium/docs/source/_static' does not 
> exist
>   This one can be safely ignored, or silenced by creating the directory
>   locally. I could create the directory in git by putting some dummy file
>   in there (git ignores empty directories), but that doesn't feel right.
>
>
> Cc: Rob Clark 
>
> Eric Engestrom (5):
>   gallium/docs: fix sublist formatting
>   gallium/docs: add missing math formatting
>   gallium/docs: add missing newlines
>   gallium/docs: fix section title formatting
>   gallium/docs: use imgmath instead of pngmath
>
>  src/gallium/docs/source/conf.py |  2 +-
>  src/gallium/docs/source/context.rst |  2 ++
>  src/gallium/docs/source/tgsi.rst| 41 
> +++--
>  3 files changed, 42 insertions(+), 3 deletions(-)
>
> --
> Cheers,
>   Eric
>


[Mesa-dev] [Bug 99886] [radv] No actual multithreading in a Vulkan multithreading demo

2017-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99886

--- Comment #1 from david becerra  ---
on anv the load is distributed between cores
but it is gpu bound so no core reaches 50% :/

what does the demo say using 4 threads, 1...

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 4/4] radv/entrypoints: Only generate entrypoints for supported features

2017-02-21 Thread Dave Airlie
On 19 February 2017 at 21:49, Emil Velikov  wrote:
> This changes the way radv_entrypoints_gen.py works from generating a
> table containing every single entrypoint in the XML to just the ones
> that we actually need.  There's no reason for us to burn entrypoint
> table space on a bunch of NV extensions we never plan to implement.
>
> RADV implements VK_AMD_draw_indirect_count, so add that to the list.

Acked-by: Dave Airlie 


Re: [Mesa-dev] [PATCH 1/2] vulkan/wsi: move image count to shared structure.

2017-02-21 Thread Dave Airlie
On 21 February 2017 at 23:06, Edward O'Callaghan
 wrote:
> wait, why is this needed at all Dave?
>
> The application should be querying and picking the correct GPU as you
> well know. This seems unwise to tamper with the mechanism defined by the
> specification.

how do you propose we get the frames from the discrete GPU to the
integrated GPU for display then?

This doesn't actually tamper with the querying or picking, the layer I
am hoping to write will do that.

Dave.


Re: [Mesa-dev] [RFC] spec: MESA_program_binary

2017-02-21 Thread Ian Romanick
On 02/16/2017 04:33 PM, Timothy Arceri wrote:
> On 17/02/17 10:44, Ian Romanick wrote:
>> On 02/15/2017 11:58 PM, Timothy Arceri wrote:
>>> On 16/02/17 17:55, Tapani Pälli wrote:

 On 02/16/2017 04:52 AM, Timothy Arceri wrote:
> In order to add functionality to ARB_get_program_binary we need
> binary format enums.

 I've understood that this is a driver internal enumeration. When
 application gets the binary it also receives enum (integer value) what
 format we gave. Then when loading application needs to query what
 formats are supported by the implementation and load the correct
 binary.
 We just need to internally make agreement on format list and return
 correct one matching the current driver in use?
>>>
>>> Not that it's actually likely to happen but if we were to only have a
>>> single MESA enum an application could only distribute a single binary.
>>
>> Applications really, really, *REALLY* should not distribute binaries
>> retrieved from the driver.  The intention of this extension is for
>> applications to implement their own shader cache, for example, at
>> application installation.  The driver can reject the binary at any time
>> for any reason.  Driver changes, hardware changes, OS changes, phase of
>> the moon, etc.
>>
>> Looking at the GLES extension registry, it appears that the other
>> vendors have just a single binary for all the hardware they make.  Based
>> on that, having a single Mesa enum isn't an insane idea.  We would just
>> need to agree on the format of the header so that the driver receiving
>> the blob could determine which driver generated the blob.
> 
> The only other thing to consider with a single enum is that it will
> require a laptop with an Intel cpu and Nvidia gpu for example to
> recompile the binary if the user were to switch between using the Intel
> and Nvidia gpus. This might happen depending on if the laptop is plugged
> into a power source or not.
> 
> If we don't care about this then one enum is fine.

Hm... I think we care, but I don't think multiple enums will help
existing apps... but maybe?  I imagine the usual scenario is:

- User runs first time on nouveau.

- Application saves binaries from nouveau.

- User runs second time on i965.

- Application submits binary from nouveau.

- Application deletes its binary cache, resubmits from source, resaves
binaries from i965.

- User runs third time on nouveau.

- Application submits binary from i965.

- Application deletes its binary cache, resubmits from source, resaves
binaries from nouveau.

- Lather

- Rinse

- Repeat

It seems like if we actually care about this configuration, we'd need a
more complex solution.  It's not 100% clear what that solution would be
or how we would be able to implement it.  I think the right solution is
a driver-side shader cache that is smart enough to track binaries from
multiple drivers without stomping on each other.  Right? :)
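
To make the header idea from the quoted text a bit more concrete, here is a
rough sketch of how a blob could carry enough information for the receiving
driver to decide whether it produced it. Everything here is hypothetical:
the magic value, the field names and sizes, and the helper functions are
invented for illustration and are not an agreed Mesa format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOB_MAGIC 0x4d455341u   /* "MESA" in ASCII; hypothetical value */

/* Hypothetical header placed at the start of every program binary. */
struct blob_header {
   uint32_t magic;
   char driver[16];        /* e.g. "i965" or "nouveau", NUL padded */
   uint8_t build_id[20];   /* identifies the exact driver build */
};

static void
blob_header_init(struct blob_header *h, const char *driver,
                 const uint8_t build_id[20])
{
   memset(h, 0, sizeof(*h));
   h->magic = BLOB_MAGIC;
   strncpy(h->driver, driver, sizeof(h->driver) - 1);
   memcpy(h->build_id, build_id, sizeof(h->build_id));
}

/* Returns true only if the blob was written by the currently running
 * driver build; anything else should be rejected so the application
 * falls back to compiling from source. */
static bool
blob_matches_driver(const void *blob, size_t size,
                    const struct blob_header *current)
{
   struct blob_header h;

   if (blob == NULL || size < sizeof(h))
      return false;

   memcpy(&h, blob, sizeof(h));
   return h.magic == BLOB_MAGIC &&
          memcmp(h.driver, current->driver, sizeof(h.driver)) == 0 &&
          memcmp(h.build_id, current->build_id, sizeof(h.build_id)) == 0;
}
```

A cache that wants to survive switching between GPUs could use the same
fields as part of its lookup key, so binaries from different drivers live
side by side instead of overwriting each other.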

>>> e.g. either for AMD, INTEL or NVIDIA but not one for each. That is unless
>>> we were to compile and pack all gpu vendor binaries at the same time
>>> which seems overly complicated and expensive.
>>>
>>> I could see an internal id being used for gpu generations from hardware
>>> vendors.



Re: [Mesa-dev] [PATCH] glsl/tests: Add UINT64 and INT64 types

2017-02-21 Thread Emil Velikov
On 21 February 2017 at 09:54, tournier.elie  wrote:
> On 13 February 2017 at 20:35, Rhys Kidd  wrote:
>>
>>
>> On Mon, Feb 13, 2017 at 9:43 AM Elie Tournier 
>> wrote:
>>>
>>> Seems good to me.
>>>
>>> Reviewed-by: Elie Tournier 
>>
>>
>> Thanks Elie.
>>
>> Given we are both "new-er" Mesa contributors I might wait for one more
>> Reviewed-by before asking for it to be merged.
>
>
> Ping.
> Are you OK with this patch? If yes, can you merge it please?

Patch looks perfectly reasonable and make check passes on my end.
Thanks guys, I've pushed it to master.

-Emil


Re: [Mesa-dev] [PATCH v3 1/5] intel/blorp: Explicitly flush all allocated state

2017-02-21 Thread Lionel Landwerlin

Reviewed-by: Lionel Landwerlin 

On 21/02/17 15:47, Jason Ekstrand wrote:

Found by inspection.  However, I expect it fixes real bugs when using
blorp from Vulkan on little-core platforms.
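
For readers unfamiliar with what flushing a range involves: on platforms
without a coherent CPU cache, each cache line touched by the written state
has to be flushed. The sketch below only shows the range walk and leaves the
actual flush to a caller-supplied callback. The 64-byte line size, the
function name, and the callback type are assumptions for illustration; this
is not the blorp_flush_range implementation from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_SIZE 64u   /* assumed line size */

/* Hypothetical callback that flushes one cache line at line_addr. */
typedef void (*flush_line_fn)(uintptr_t line_addr, void *ctx);

/* Walks every cache line overlapping [start, start + size) and returns
 * how many lines were visited. `flush` may be NULL to just count. */
static size_t
flush_range_sketch(uintptr_t start, size_t size, flush_line_fn flush,
                   void *ctx)
{
   size_t lines = 0;

   if (size == 0)
      return 0;

   assert(start + size >= start);   /* no address wrap-around */

   /* Round the start down to the beginning of its cache line. */
   uintptr_t p = start & ~(uintptr_t)(CACHELINE_SIZE - 1);
   uintptr_t end = start + size;

   for (; p < end; p += CACHELINE_SIZE) {
      if (flush != NULL)
         flush(p, ctx);
      lines++;
   }
   return lines;
}
```

Note that an 8-byte write at offset 60 straddles two lines, which is why the
walk starts from the rounded-down address rather than from `start` itself.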

Cc: "13.0 17.0" 
---
  src/intel/blorp/blorp_genX_exec.h   | 19 ++-
  src/intel/vulkan/genX_blorp_exec.c  | 11 +++
  src/mesa/drivers/dri/i965/genX_blorp_exec.c |  8 
  3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/src/intel/blorp/blorp_genX_exec.h 
b/src/intel/blorp/blorp_genX_exec.h
index a673ab8..f0c4f38 100644
--- a/src/intel/blorp/blorp_genX_exec.h
+++ b/src/intel/blorp/blorp_genX_exec.h
@@ -66,6 +66,10 @@ blorp_alloc_binding_table(struct blorp_batch *batch, 
unsigned num_entries,
unsigned state_size, unsigned state_alignment,
uint32_t *bt_offset, uint32_t *surface_offsets,
void **surface_maps);
+
+static void
+blorp_flush_range(struct blorp_batch *batch, void *start, size_t size);
+
  static void
  blorp_surface_reloc(struct blorp_batch *batch, uint32_t ss_offset,
  struct blorp_address address, uint32_t delta);
@@ -182,6 +186,7 @@ blorp_emit_vertex_data(struct blorp_batch *batch,
 void *data = blorp_alloc_vertex_buffer(batch, sizeof(vertices), addr);
 memcpy(data, vertices, sizeof(vertices));
 *size = sizeof(vertices);
+   blorp_flush_range(batch, data, *size);
  }
  
  static void

@@ -199,7 +204,8 @@ blorp_emit_input_varying_data(struct blorp_batch *batch,
 *size = 16 + num_varyings * vec4_size_in_bytes;
  
 const uint32_t *const inputs_src = (const uint32_t *)&params->wm_inputs;

-   uint32_t *inputs = blorp_alloc_vertex_buffer(batch, *size, addr);
+   void *data = blorp_alloc_vertex_buffer(batch, *size, addr);
+   uint32_t *inputs = data;
  
 /* Copy in the VS inputs */

 assert(sizeof(params->vs_inputs) == 16);
@@ -223,6 +229,8 @@ blorp_emit_input_varying_data(struct blorp_batch *batch,
   inputs += 4;
}
 }
+
+   blorp_flush_range(batch, data, *size);
  }
  
  static void

@@ -906,6 +914,7 @@ blorp_emit_blend_state(struct blorp_batch *batch,
 GENX(BLEND_STATE_length) * 4,
 64, );
 GENX(BLEND_STATE_pack)(NULL, state, );
+   blorp_flush_range(batch, state, GENX(BLEND_STATE_length) * 4);
  
  #if GEN_GEN >= 7

 blorp_emit(batch, GENX(3DSTATE_BLEND_STATE_POINTERS), sp) {
@@ -940,6 +949,7 @@ blorp_emit_color_calc_state(struct blorp_batch *batch,
 GENX(COLOR_CALC_STATE_length) * 4,
 64, );
 GENX(COLOR_CALC_STATE_pack)(NULL, state, );
+   blorp_flush_range(batch, state, GENX(COLOR_CALC_STATE_length) * 4);
  
  #if GEN_GEN >= 7

 blorp_emit(batch, GENX(3DSTATE_CC_STATE_POINTERS), sp) {
@@ -1016,6 +1026,7 @@ blorp_emit_depth_stencil_state(struct blorp_batch *batch,
 GENX(DEPTH_STENCIL_STATE_length) * 
4,
 64, );
 GENX(DEPTH_STENCIL_STATE_pack)(NULL, state, );
+   blorp_flush_range(batch, state, GENX(DEPTH_STENCIL_STATE_length) * 4);
  #endif
  
  #if GEN_GEN == 7

@@ -1068,6 +1079,8 @@ blorp_emit_surface_state(struct blorp_batch *batch,
blorp_surface_reloc(batch, state_offset + isl_dev->ss.aux_addr_offset,
surface->aux_addr, *aux_addr);
 }
+
+   blorp_flush_range(batch, state, GENX(RENDER_SURFACE_STATE_length) * 4);
  }
  
  static void

@@ -1098,6 +1111,8 @@ blorp_emit_null_surface_state(struct blorp_batch *batch,
 };
  
 GENX(RENDER_SURFACE_STATE_pack)(NULL, state, );

+
+   blorp_flush_range(batch, state, GENX(RENDER_SURFACE_STATE_length) * 4);
  }
  
  static void

@@ -1181,6 +1196,7 @@ blorp_emit_sampler_state(struct blorp_batch *batch,
 GENX(SAMPLER_STATE_length) * 4,
 32, );
 GENX(SAMPLER_STATE_pack)(NULL, state, );
+   blorp_flush_range(batch, state, GENX(SAMPLER_STATE_length) * 4);
  
  #if GEN_GEN >= 7

 blorp_emit(batch, GENX(3DSTATE_SAMPLER_STATE_POINTERS_PS), ssp) {
@@ -1333,6 +1349,7 @@ blorp_emit_viewport_state(struct blorp_batch *batch,
   .MinimumDepth = 0.0,
   .MaximumDepth = 1.0,
});
+   blorp_flush_range(batch, state, GENX(CC_VIEWPORT_length) * 4);
  
  #if GEN_GEN >= 7

 blorp_emit(batch, GENX(3DSTATE_VIEWPORT_STATE_POINTERS_CC), vsp) {
diff --git a/src/intel/vulkan/genX_blorp_exec.c 
b/src/intel/vulkan/genX_blorp_exec.c
index 6f0b063..139c387 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -100,6 +100,9 @@ blorp_alloc_binding_table(struct blorp_batch *batch, 
unsigned num_entries,
surface_offsets[i] = surface_state.offset;
  

[Mesa-dev] [PATCH mesa] glx: add GLXdispatchIndex sort check

2017-02-21 Thread Eric Engestrom
Signed-off-by: Eric Engestrom 
---
 src/glx/tests/Makefile.am  |  2 +-
 src/glx/tests/dispatch-index-check | 24 
 2 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100755 src/glx/tests/dispatch-index-check

diff --git a/src/glx/tests/Makefile.am b/src/glx/tests/Makefile.am
index bdc78c0d5a..8874c20b01 100644
--- a/src/glx/tests/Makefile.am
+++ b/src/glx/tests/Makefile.am
@@ -12,7 +12,7 @@ AM_CPPFLAGS = \
$(LIBDRM_CFLAGS) \
$(X11_INCLUDES)
 
-TESTS = glx-test
+TESTS = glx-test dispatch-index-check
 check_PROGRAMS = glx-test
 
 glx_test_SOURCES = \
diff --git a/src/glx/tests/dispatch-index-check 
b/src/glx/tests/dispatch-index-check
new file mode 100755
index 00..e2b5faff09
--- /dev/null
+++ b/src/glx/tests/dispatch-index-check
@@ -0,0 +1,24 @@
+#!/bin/sh
+
+# extract enum definition
+dispatch_list=$(sed '/__GLXdispatchIndex/,/__GLXdispatchIndex/!d' \
+  "$srcdir"/../g_glxglvnddispatchindices.h)
+
+# extract values inside of enum
+dispatch_list=$(sed '1d;$d' <<< "$dispatch_list")
+
+# remove indentation
+dispatch_list=$(sed 's/^\s\+//' <<< "$dispatch_list")
+
+# extract function names
+dispatch_list=$(sed 's/DI_//;s/,//' <<< "$dispatch_list")
+
+# same for commented functions, we want to keep them sorted too
+dispatch_list=$(sed 's#// ##;s/ implemented by [a-z]\+//' <<< "$dispatch_list")
+
+# remove LAST_INDEX, as it will not be in alphabetical order
+dispatch_list=$(sed '/LAST_INDEX/d' <<< "$dispatch_list")
+
+sorted=$(sort <<< "$dispatch_list")
+
+test "$dispatch_list" = "$sorted"
-- 
Cheers,
  Eric
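
For readers who want the idea without the sed pipeline, here is a minimal C sketch of the same sortedness check. It is not part of the patch: first_unsorted() is a hypothetical helper, and it compares with strcmp() (plain byte order), which may differ from what sort does under a non-C locale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch.
 * Returns the index of the first entry that sorts before its predecessor
 * in strcmp() order, or n if the whole list is sorted. */
static size_t
first_unsorted(const char *const *names, size_t n)
{
   assert(names != NULL || n == 0);

   for (size_t i = 1; i < n; i++) {
      if (strcmp(names[i - 1], names[i]) > 0)
         return i;
   }
   return n;
}
```

Returning the offending index rather than a plain pass/fail would let a test report which entry is out of place; the shell script only reports success or failure.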



Re: [Mesa-dev] [PATCH] vbo: kill primitive restart lowering in glDrawArrays

2017-02-21 Thread Kenneth Graunke
On Monday, February 20, 2017 10:35:36 AM PST Marek Olšák wrote:
> From: Marek Olšák 
> 
> ---
>  src/mesa/vbo/vbo_exec_array.c | 56 
> ++-
>  1 file changed, 7 insertions(+), 49 deletions(-)
> 
> diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
> index 6a96167..30c52d5 100644
> --- a/src/mesa/vbo/vbo_exec_array.c
> +++ b/src/mesa/vbo/vbo_exec_array.c
> @@ -404,77 +404,35 @@ vbo_bind_arrays(struct gl_context *ctx)
>   */
>  static void
>  vbo_draw_arrays(struct gl_context *ctx, GLenum mode, GLint start,
>  GLsizei count, GLuint numInstances, GLuint baseInstance)
>  {
> struct vbo_context *vbo = vbo_context(ctx);
> struct _mesa_prim prim[2];
>  
> vbo_bind_arrays(ctx);
>  
> -   /* init most fields to zero */
> +   /* OpenGL 4.5 says that primitive restart is ignored with non-indexed
> +* draws.
> +*/

Cool!  Yes it does.  ES 3.2 also seems to agree.

i965 hardware doesn't support primitive restart on non-indexed draws (yet).
Apparently neither does AMD hardware.  NVIDIA does.  So both of us would
need this lowering...if it were required.  But it isn't, so goodbye
code! :)

Reviewed-by: Kenneth Graunke 




Re: [Mesa-dev] [PATCH 1/2] vulkan/wsi: move image count to shared structure.

2017-02-21 Thread Jason Ekstrand
On Tue, Feb 21, 2017 at 5:06 AM, Edward O'Callaghan <
funfunc...@folklore1984.net> wrote:

> wait, why is this needed at all Dave?
>
> The application should be querying and picking the correct GPU as you
> well know. This seems unwise to tamper with the mechanism defined by the
> specification.
>

Yes, and no.  This is something we've discussed quite a bit internally
within Khronos.  The crux of the problem is that there's no way, without
something like prime, for an application to render with the discrete card
and display on the integrated card.  Also, we want some sort of mechanism
for the user to set preferences about which applications land on which GPU
regardless of which one is used for display.

--Jason


> Kindly,
> Edward.
>
> On 02/21/2017 01:47 PM, Jason Ekstrand wrote:
> > Fine by me
> >
> > Reviewed-by: Jason Ekstrand
> >
> > On Mon, Feb 20, 2017 at 6:26 PM, Dave Airlie wrote:
> >
> > From: Dave Airlie
> >
> > For prime support I need to access this, so move it in advance.
> >
> > Signed-off-by: Dave Airlie
> > ---
> >  src/vulkan/wsi/wsi_common.h |  1 +
> >  src/vulkan/wsi/wsi_common_wayland.c | 20 +---
> >  src/vulkan/wsi/wsi_common_x11.c | 29
> ++---
> >  3 files changed, 24 insertions(+), 26 deletions(-)
> >
> > diff --git a/src/vulkan/wsi/wsi_common.h
> b/src/vulkan/wsi/wsi_common.h
> > index ae9e587..1a22935 100644
> > --- a/src/vulkan/wsi/wsi_common.h
> > +++ b/src/vulkan/wsi/wsi_common.h
> > @@ -54,6 +54,7 @@ struct wsi_swapchain {
> > const struct wsi_image_fns *image_fns;
> > VkFence fences[3];
> > VkPresentModeKHR present_mode;
> > +   int image_count;
> >
> > VkResult (*destroy)(struct wsi_swapchain *swapchain,
> > const VkAllocationCallbacks *pAllocator);
> > diff --git a/src/vulkan/wsi/wsi_common_wayland.c
> > b/src/vulkan/wsi/wsi_common_wayland.c
> > index 4489736..e6490ee 100644
> > --- a/src/vulkan/wsi/wsi_common_wayland.c
> > +++ b/src/vulkan/wsi/wsi_common_wayland.c
> > @@ -495,7 +495,6 @@ struct wsi_wl_swapchain {
> > VkPresentModeKHR present_mode;
> > bool fifo_ready;
> >
> > -   uint32_t image_count;
> > struct wsi_wl_image  images[0];
> >  };
> >
> > @@ -508,13 +507,13 @@ wsi_wl_swapchain_get_images(struct
> > wsi_swapchain *wsi_chain,
> > VkResult result;
> >
> > if (pSwapchainImages == NULL) {
> > -  *pCount = chain->image_count;
> > +  *pCount = chain->base.image_count;
> >return VK_SUCCESS;
> > }
> >
> > result = VK_SUCCESS;
> > -   ret_count = chain->image_count;
> > -   if (chain->image_count > *pCount) {
> > +   ret_count = chain->base.image_count;
> > +   if (chain->base.image_count > *pCount) {
> >   ret_count = *pCount;
> >   result = VK_INCOMPLETE;
> > }
> > @@ -543,7 +542,7 @@ wsi_wl_swapchain_acquire_next_image(struct
> > wsi_swapchain *wsi_chain,
> >return VK_ERROR_OUT_OF_DATE_KHR;
> >
> > while (1) {
> > -  for (uint32_t i = 0; i < chain->image_count; i++) {
> > +  for (uint32_t i = 0; i < chain->base.image_count; i++) {
> >   if (!chain->images[i].busy) {
> >  /* We found a non-busy image */
> >  *image_index = i;
> > @@ -591,7 +590,7 @@ wsi_wl_swapchain_queue_present(struct
> > wsi_swapchain *wsi_chain,
> >}
> > }
> >
> > -   assert(image_index < chain->image_count);
> > +   assert(image_index < chain->base.image_count);
> > wl_surface_attach(chain->surface,
> > chain->images[image_index].buffer, 0, 0);
> > wl_surface_damage(chain->surface, 0, 0, INT32_MAX, INT32_MAX);
> >
> > @@ -679,7 +678,7 @@ wsi_wl_swapchain_destroy(struct wsi_swapchain
> > *wsi_chain,
> >  {
> > struct wsi_wl_swapchain *chain = (struct wsi_wl_swapchain
> > *)wsi_chain;
> >
> > -   for (uint32_t i = 0; i < chain->image_count; i++) {
> > +   for (uint32_t i = 0; i < chain->base.image_count; i++) {
> >if (chain->images[i].buffer)
> >   chain->base.image_fns->free_wsi_image(chain->base.device,
> > pAllocator,
> >
>  chain->images[i].image,
> > @@ -724,6 +723,7 @@ wsi_wl_surface_create_swapchain(VkIcdSurfaceBase
> > *icd_surface,
> > chain->base.queue_present = wsi_wl_swapchain_queue_present;
> > chain->base.image_fns = image_fns;
> > chain->base.present_mode = pCreateInfo->presentMode;

Re: [Mesa-dev] [PATCH 1/2] vulkan/wsi: move image count to shared structure.

2017-02-21 Thread Jason Ekstrand
On Tue, Feb 21, 2017 at 1:01 AM, Gustaw Smolarczyk 
wrote:

> On 21 Feb 2017 03:47, "Jason Ekstrand" wrote:
>
> Fine by me
>
> Reviewed-by: Jason Ekstrand 
>
> On Mon, Feb 20, 2017 at 6:26 PM, Dave Airlie  wrote:
>
>> From: Dave Airlie 
>>
>> For prime support I need to access this, so move it in advance.
>>
>> Signed-off-by: Dave Airlie 
>> ---
>>  src/vulkan/wsi/wsi_common.h |  1 +
>>  src/vulkan/wsi/wsi_common_wayland.c | 20 +---
>>  src/vulkan/wsi/wsi_common_x11.c | 29 ++---
>>  3 files changed, 24 insertions(+), 26 deletions(-)
>>
>> diff --git a/src/vulkan/wsi/wsi_common.h b/src/vulkan/wsi/wsi_common.h
>> index ae9e587..1a22935 100644
>> --- a/src/vulkan/wsi/wsi_common.h
>> +++ b/src/vulkan/wsi/wsi_common.h
>> @@ -54,6 +54,7 @@ struct wsi_swapchain {
>> const struct wsi_image_fns *image_fns;
>> VkFence fences[3];
>> VkPresentModeKHR present_mode;
>> +   int image_count;
>>
>> VkResult (*destroy)(struct wsi_swapchain *swapchain,
>> const VkAllocationCallbacks *pAllocator);
>> diff --git a/src/vulkan/wsi/wsi_common_wayland.c
>> b/src/vulkan/wsi/wsi_common_wayland.c
>> index 4489736..e6490ee 100644
>> --- a/src/vulkan/wsi/wsi_common_wayland.c
>> +++ b/src/vulkan/wsi/wsi_common_wayland.c
>> @@ -495,7 +495,6 @@ struct wsi_wl_swapchain {
>> VkPresentModeKHR present_mode;
>> bool fifo_ready;
>>
>> -   uint32_t image_count;
>> struct wsi_wl_image  images[0];
>>  };
>>
>> @@ -508,13 +507,13 @@ wsi_wl_swapchain_get_images(struct wsi_swapchain
>> *wsi_chain,
>> VkResult result;
>>
>> if (pSwapchainImages == NULL) {
>> -  *pCount = chain->image_count;
>> +  *pCount = chain->base.image_count;
>>return VK_SUCCESS;
>> }
>>
>> result = VK_SUCCESS;
>> -   ret_count = chain->image_count;
>> -   if (chain->image_count > *pCount) {
>> +   ret_count = chain->base.image_count;
>> +   if (chain->base.image_count > *pCount) {
>>   ret_count = *pCount;
>>   result = VK_INCOMPLETE;
>> }
>> @@ -543,7 +542,7 @@ wsi_wl_swapchain_acquire_next_image(struct
>> wsi_swapchain *wsi_chain,
>>return VK_ERROR_OUT_OF_DATE_KHR;
>>
>> while (1) {
>> -  for (uint32_t i = 0; i < chain->image_count; i++) {
>> +  for (uint32_t i = 0; i < chain->base.image_count; i++) {
>>
>
> Looks like a comparison between signed and unsigned. Not sure if you care
> about this (it produces a warning at -Wall or -Wextra IIRC).
>

Good point.  All we need to do is tweak it to store a uint32_t instead of
an int.
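
As a minimal sketch of the point above (standalone C, not Mesa code; the struct and function names are hypothetical): with the count stored as uint32_t, the loop index and the bound have the same type, so the comparison no longer mixes signed and unsigned operands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared swapchain structure. */
struct swapchain_sketch {
   uint32_t image_count;   /* unsigned, matching the uint32_t loop index */
};

/* Returns the index of the first non-busy image, or image_count if all
 * images are busy. */
static uint32_t
find_free_image(const struct swapchain_sketch *chain, const int *busy)
{
   assert(chain != NULL && busy != NULL);

   for (uint32_t i = 0; i < chain->image_count; i++) {
      if (!busy[i])
         return i;
   }
   return chain->image_count;
}
```

With an int field, the same comparison would convert the int to unsigned before comparing, so a negative count would behave like a very large one; that is the case the warning is meant to catch.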


[Mesa-dev] [PATCH] nir: Delete unused arg in get_iteration

2017-02-21 Thread Elie Tournier
The nir_alu_instr argument is not needed in get_iteration

Signed-off-by: Elie Tournier 
---
I don't have git access. Please push it for me.
BR,
Elie
---
 src/compiler/nir/nir_loop_analyze.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/nir/nir_loop_analyze.c 
b/src/compiler/nir/nir_loop_analyze.c
index a5f464a45d..6afad9e603 100644
--- a/src/compiler/nir/nir_loop_analyze.c
+++ b/src/compiler/nir/nir_loop_analyze.c
@@ -359,7 +359,7 @@ find_loop_terminators(loop_info_state *state)
 
 static int32_t
 get_iteration(nir_op cond_op, nir_const_value *initial, nir_const_value *step,
-  nir_const_value *limit, nir_alu_instr *alu)
+  nir_const_value *limit)
 {
int32_t iter;
 
@@ -490,7 +490,7 @@ calculate_iterations(nir_const_value *initial, 
nir_const_value *step,
   trip_offset = 1;
}
 
-   int iter_int = get_iteration(cond_alu->op, initial, step, limit, alu);
+   int iter_int = get_iteration(cond_alu->op, initial, step, limit);
 
/* If iter_int is negative the loop is ill-formed or is the conditional is
 * unsigned with a huge iteration count so don't bother going any further.
-- 
2.11.0



Re: [Mesa-dev] Gallium: Removal of set_index_buffer (discussion)

2017-02-21 Thread Roland Scheidegger
On 21.02.2017 at 16:49, Axel Davy wrote:
> On 21/02/2017 16:00, Roland Scheidegger wrote:
>> On 21.02.2017 at 11:46, Marek Olšák wrote:
>>> On Tue, Feb 21, 2017 at 4:36 AM, Roland Scheidegger
>>>  wrote:
>>>> On 20.02.2017 at 21:58, Marek Olšák wrote:
>>>>> On Mon, Feb 20, 2017 at 9:28 PM, Roland Scheidegger
>>>>> wrote:
>>>>>> On 20.02.2017 at 20:56, Marek Olšák wrote:
>>>>>>> On Mon, Feb 20, 2017 at 8:29 PM, Axel Davy wrote:
>>>>>>>> On 20/02/2017 20:11, Ilia Mirkin wrote:
>>>>>>>>> On Mon, Feb 20, 2017 at 2:01 PM, Marek Olšák
>>>>>>>>> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'd like to remove pipe_context::set_index_buffer. It's not
>>>>>>>>>> useful to
>>>>>>>>>> most drivers and the interface is inconvenient for Mesa/OpenGL,
>>>>>>>>>> because it's a draw state that is set with a separate driver
>>>>>>>>>> callback,
>>>>>>>>>> which is an unnecessary driver roundtrip taking some CPU
>>>>>>>>>> cycles. I'd
>>>>>>>>>> prefer to pass the index buffer via pipe_draw_info.
>>>>>>>>>>
>>>>>>>>>> I'm aware that the interface was inherited from DX10, but I don't
>>>>>>>>>> think that makes any difference here. DX10 state trackers can
>>>>>>>>>> pass the
>>>>>>>>>> index buffer via pipe_draw_info too.
>>>>>>>>>>
>>>>>>>>>> This is my proposal:
>>>>>>>>>>
>>>>>>>>>> diff --git a/src/gallium/include/pipe/p_state.h
>>>>>>>>>> b/src/gallium/include/pipe/p_state.h
>>>>>>>>>> index ce19b92..cffbb33 100644
>>>>>>>>>> --- a/src/gallium/include/pipe/p_state.h
>>>>>>>>>> +++ b/src/gallium/include/pipe/p_state.h
>>>>>>>>>> @@ -635,7 +635,7 @@ struct pipe_index_buffer
>>>>>>>>>> */
>>>>>>>>>>    struct pipe_draw_info
>>>>>>>>>>    {
>>>>>>>>>> -   boolean indexed;  /**< use index buffer */
>>>>>>>>>> +   ubyte index_size;  /**< 0 = non-indexed */
>>>>>>>> Isn't that enough to say non-index when index_buffer and
>>>>>>>> user_indices are
>>>>>>>> NULL ?
>>>>>>> We still need index_size and it's only 8 bits as opposed to 64 bits.
>>>>>> FWIW at least in d3d10 you can actually have indexed rendering
>>>>>> without
>>>>>> an index buffer bound. This is perfectly valid, you're just
>>>>>> expected to
>>>>>> return always zero for all indices... Albeit I believe we actually
>>>>>> deal
>>>>>> with this with a dummy buffer.
>>>>>>
>>>>>>>>>>   enum pipe_prim_type mode;  /**< the mode of the
>>>>>>>>>> primitive */
>>>>>>>>>>   boolean primitive_restart;
>>>>>>>>>>   ubyte vertices_per_patch; /**< the number of vertices
>>>>>>>>>> per patch */
>>>>>>>>>> @@ -666,12 +666,18 @@ struct pipe_draw_info
>>>>>>>>>>
>>>>>>>>>>   unsigned indirect_params_offset; /**< must be 4 byte
>>>>>>>>>> aligned */
>>>>>>>>>>
>>>>>>>>>> +   /**
>>>>>>>>>> +* Index buffer. Only one can be non-NULL.
>>>>>>>>>> +*/
>>>>>>>>>> +   struct pipe_resource *index_buffer; /* "start" is the
>>>>>>>>>> offset */
>>>>>>>>> Works for me. Is start the offset in bytes or is start *
>>>>>>>>> index_size
>>>>>>>>> the offset in bytes?
>>>>>>>> Same question here. My understanding is that start is in terms
>>>>>>>> of start *
>>>>>>>> index_size bytes.
>>>>>>> offset = start * index_size;
>>>>>>>
>>>>>>>> But we really want to have a byte offset.
>>>>>>> The offset should be aligned to index_size, otherwise hardware
>>>>>>> won't work.
>>>>>> Are you sure of that? d3d10 doesn't seem to have such a
>>>>>> requirement, or
>>>>>> if it has I can't find it now (so, the startIndex really is in
>>>>>> terms of
>>>>>> index units, but the offset of the buffer is in terms of bytes,
>>>>>> and the
>>>>>> docs don't seem to mention it's limited to index alignment).
>>>>>> I don't actually see such a limitation in GL neither, albeit some
>>>>>> quick
>>>>>> googling seems to suggest YMMV (e.g.
>>>>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__irrlicht.sourceforge.net_forum_viewtopic.php-3Ff-3D7-26t-3D51444=DwIFaQ=uilaK90D4TOVoH58JNXRgQ=_QIjpv-UJ77xEQY8fIYoQtr5qv8wKrPJc7v7_-CYAb0=_nDDEb8aspFcYmYCdx9G-Pfs4rRVzx4rodfxnJNkNyc=XPduNXrH7SGk7lVUD2izbWAOfERG60bJWTsI600UWCg=
>>>>>> ).
>>>>>> So, I can't quite tell right now if we really need byte offsets...
>>>>> It's a natural requirement of hardware. It doesn't have to be
>>>>> documented IMO. CPUs might not support it either.
>>>> I did some quick tests and I believe d3d10 doesn't actually require
>>>> offsets not aligned to index size, but don't quote me on that.
>>>> Nevertheless, I've got the feeling this might be expected to work with
>>>> GL - doesn't look like it would be an error if your indices
>>>> "pointer" in
>>>> glDrawElements() isn't aligned, and if it's not an error I don't see
>>>> why
>>>> it wouldn't be well defined?
>>>> (FWIW x86 supports this fine, but indeed not all cpu archs might. Even
>>>> AVX2 gather supports non-aligned lookups - of course just for uint
>>>> indices since gather doesn't support smaller than 32bit gathers.)
>>>  From GL 4.4 Core, section 6.2:
>>>
>>> "Clients must align 

Re: [Mesa-dev] [PATCH v2] softpipe: implement clear_texture

2017-02-21 Thread Roland Scheidegger
On 21.02.2017 at 16:44, Lars Hamre wrote:
> That does seem nicer, then pipe->clear_texture could just be set to
> util_clear_texture.
> Tangentially, do you see anything preventing this solution being
> utilized by llvmpipe?
No, that should work just fine. llvmpipe doesn't do anything special
for clear_rt and clear_ds either.

Roland


> On Mon, Feb 20, 2017 at 2:31 PM, Roland Scheidegger  
> wrote:
>> On 20.02.2017 at 18:01, Lars Hamre wrote:
>>> v2: rework util clear functions such that they operate on a resource
>>> instead of a surface (Roland Scheidegger)
>>>
>>> Implements the ARB_clear_texture extension for softpipe.
>>> Passes all corresponding piglit tests.
>>>
>>> Signed-off-by: Lars Hamre 
>>>
>>> ---
>>>
>>> CC: Roland Scheidegger 
>>>
>>> NOTE: someone with access will need to commit this post
>>>   review process
>>>
>>>  src/gallium/auxiliary/util/u_surface.c| 339 
>>> +-
>>>  src/gallium/auxiliary/util/u_surface.h|  17 ++
>>>  src/gallium/drivers/softpipe/sp_screen.c  |   3 +-
>>>  src/gallium/drivers/softpipe/sp_texture.c |  51 +
>>>  4 files changed, 262 insertions(+), 148 deletions(-)
>>>
>>> diff --git a/src/gallium/auxiliary/util/u_surface.c 
>>> b/src/gallium/auxiliary/util/u_surface.c
>>> index a9ed006..1ec168f 100644
>>> --- a/src/gallium/auxiliary/util/u_surface.c
>>> +++ b/src/gallium/auxiliary/util/u_surface.c
>>> @@ -388,7 +388,66 @@ no_src_map:
>>> ;
>>>  }
>>>
>>> +static void
>>> +util_clear_texture_helper(struct pipe_transfer *dst_trans,
>>> +  ubyte *dst_map,
>>> +  enum pipe_format format,
>>> +  const union pipe_color_union *color,
>>> +  unsigned width, unsigned height, unsigned depth)
>>> +{
>>> +   union util_color uc;
>>> +
>>> +   assert(dst_trans->stride > 0);
>>> +
>>> +   if (util_format_is_pure_integer(format)) {
>>> +  /*
>>> +   * We expect int/uint clear values here, though some APIs
>>> +   * might disagree (but in any case util_pack_color()
>>> +   * couldn't handle it)...
>>> +   */
>>> +  if (util_format_is_pure_sint(format)) {
>>> + util_format_write_4i(format, color->i, 0, , 0, 0, 0, 1, 1);
>>> +  } else {
>>> + assert(util_format_is_pure_uint(format));
>>> + util_format_write_4ui(format, color->ui, 0, , 0, 0, 0, 1, 1);
>>> +  }
>>> +   } else {
>>> +  util_pack_color(color->f, format, );
>>> +   }
>>> +
>>> +   util_fill_box(dst_map, format,
>>> + dst_trans->stride, dst_trans->layer_stride,
>>> + 0, 0, 0, width, height, depth, );
>>> +}
>>> +
>>> +void
>>> +util_clear_texture(struct pipe_context *pipe,
>>> +   struct pipe_resource *texture,
>>> +   const union pipe_color_union *color,
>>> +   unsigned level,
>>> +   unsigned dstx, unsigned dsty, unsigned dstz,
>>> +   unsigned width, unsigned height, unsigned depth)
>>> +{
>>> +   struct pipe_transfer *dst_trans;
>>> +   ubyte *dst_map;
>>> +   enum pipe_format format = texture->format;
>>>
>>> +   dst_map = pipe_transfer_map_3d(pipe,
>>> +  texture,
>>> +  level,
>>> +  PIPE_TRANSFER_WRITE,
>>> +  dstx, dsty, dstz,
>>> +  width, height, depth,
>>> +  _trans);
>>> +   if (!dst_map)
>>> +  return;
>>> +
>>> +   if (dst_trans->stride > 0) {
>>> +  util_clear_texture_helper(dst_trans, dst_map, format, color,
>>> +width, height, depth);
>>> +   }
>>> +   pipe->transfer_unmap(pipe, dst_trans);
>>> +}
>>>
>>>  #define UBYTE_TO_USHORT(B) ((B) | ((B) << 8))
>>>
>>> @@ -410,8 +469,6 @@ util_clear_render_target(struct pipe_context *pipe,
>>>  {
>>> struct pipe_transfer *dst_trans;
>>> ubyte *dst_map;
>>> -   union util_color uc;
>>> -   unsigned max_layer;
>>>
>>> assert(dst->texture);
>>> if (!dst->texture)
>>> @@ -426,54 +483,150 @@ util_clear_render_target(struct pipe_context *pipe,
>>>unsigned pixstride = util_format_get_blocksize(dst->format);
>>>dx = (dst->u.buf.first_element + dstx) * pixstride;
>>>w = width * pixstride;
>>> -  max_layer = 0;
>>>dst_map = pipe_transfer_map(pipe,
>>>dst->texture,
>>>0, 0,
>>>PIPE_TRANSFER_WRITE,
>>>dx, 0, w, 1,
>>>_trans);
>>> +  if (dst_map) {
>>> + util_clear_texture_helper(dst_trans, dst_map, dst->format, color,
>>> +   width, height, 1);
>>> + pipe->transfer_unmap(pipe, dst_trans);
>>> + 

Re: [Mesa-dev] Gallium: Removal of set_index_buffer (discussion)

2017-02-21 Thread Axel Davy

On 21/02/2017 16:00, Roland Scheidegger wrote:

On 21.02.2017 at 11:46, Marek Olšák wrote:

On Tue, Feb 21, 2017 at 4:36 AM, Roland Scheidegger wrote:

On 20.02.2017 at 21:58, Marek Olšák wrote:

On Mon, Feb 20, 2017 at 9:28 PM, Roland Scheidegger wrote:

On 20.02.2017 at 20:56, Marek Olšák wrote:

On Mon, Feb 20, 2017 at 8:29 PM, Axel Davy  wrote:

On 20/02/2017 20:11, Ilia Mirkin wrote:

On Mon, Feb 20, 2017 at 2:01 PM, Marek Olšák  wrote:

Hi,

I'd like to remove pipe_context::set_index_buffer. It's not useful to
most drivers and the interface is inconvenient for Mesa/OpenGL,
because it's a draw state that is set with a separate driver callback,
which is an unnecessary driver roundtrip taking some CPU cycles. I'd
prefer to pass the index buffer via pipe_draw_info.

I'm aware that the interface was inherited from DX10, but I don't
think that makes any difference here. DX10 state trackers can pass the
index buffer via pipe_draw_info too.

This is my proposal:

diff --git a/src/gallium/include/pipe/p_state.h
b/src/gallium/include/pipe/p_state.h
index ce19b92..cffbb33 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -635,7 +635,7 @@ struct pipe_index_buffer
*/
   struct pipe_draw_info
   {
-   boolean indexed;  /**< use index buffer */
+   ubyte index_size;  /**< 0 = non-indexed */

Isn't that enough to say non-index when index_buffer and user_indices are
NULL ?

We still need index_size and it's only 8 bits as opposed to 64 bits.

FWIW at least in d3d10 you can actually have indexed rendering without
an index buffer bound. This is perfectly valid, you're just expected to
return always zero for all indices... Albeit I believe we actually deal
with this with a dummy buffer.


  enum pipe_prim_type mode;  /**< the mode of the primitive */
  boolean primitive_restart;
  ubyte vertices_per_patch; /**< the number of vertices per patch */
@@ -666,12 +666,18 @@ struct pipe_draw_info

  unsigned indirect_params_offset; /**< must be 4 byte aligned */

+   /**
+* Index buffer. Only one can be non-NULL.
+*/
+   struct pipe_resource *index_buffer; /* "start" is the offset */

Works for me. Is start the offset in bytes or is start * index_size
the offset in bytes?

Same question here. My understanding is that start is in terms of start *
index_size bytes.

offset = start * index_size;


But we really want to have a byte offset.

The offset should be aligned to index_size, otherwise hardware won't work.

Are you sure of that? d3d10 doesn't seem to have such a requirement, or
if it has I can't find it now (so, the startIndex really is in terms of
index units, but the offset of the buffer is in terms of bytes, and the
docs don't seem to mention it's limited to index alignment).
I don't actually see such a limitation in GL neither, albeit some quick
googling seems to suggest YMMV (e.g.
https://urldefense.proofpoint.com/v2/url?u=http-3A__irrlicht.sourceforge.net_forum_viewtopic.php-3Ff-3D7-26t-3D51444=DwIFaQ=uilaK90D4TOVoH58JNXRgQ=_QIjpv-UJ77xEQY8fIYoQtr5qv8wKrPJc7v7_-CYAb0=_nDDEb8aspFcYmYCdx9G-Pfs4rRVzx4rodfxnJNkNyc=XPduNXrH7SGk7lVUD2izbWAOfERG60bJWTsI600UWCg=
 ).
So, I can't quite tell right now if we really need byte offsets...

It's a natural requirement of hardware. It doesn't have to be
documented IMO. CPUs might not support it either.

I did some quick tests and I believe d3d10 doesn't actually require
offsets not aligned to index size, but don't quote me on that.
Nevertheless, I've got the feeling this might be expected to work with
GL - doesn't look like it would be an error if your indices "pointer" in
glDrawElements() isn't aligned, and if it's not an error I don't see why
it wouldn't be well defined?
(FWIW x86 supports this fine, but indeed not all cpu archs might. Even
AVX2 gather supports non-aligned lookups - of course just for uint
indices since gather doesn't support smaller than 32bit gathers.)

 From GL 4.4 Core, section 6.2:

"Clients must align data elements consistent with the requirements of
the client platform, with an additional base-level requirement that an
offset within a buffer to a datum comprising N basic machine units be
a multiple of N."
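
The two statements in this thread, "offset = start * index_size" and the alignment rule quoted above, can be sketched in C as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and a basic machine unit is assumed to be one byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Byte offset of the first index, per "offset = start * index_size". */
static size_t
index_byte_offset(unsigned start, unsigned index_size)
{
   return (size_t)start * index_size;
}

/* True if byte_offset is a multiple of index_size, i.e. it satisfies the
 * quoted rule for a datum of index_size units (assumed to be bytes). */
static bool
index_offset_is_aligned(size_t byte_offset, unsigned index_size)
{
   assert(index_size != 0);   /* 0 means non-indexed in the proposal */
   return byte_offset % index_size == 0;
}
```

Any offset built as start * index_size is aligned by construction, so only a free-form byte offset (for example 3 bytes with 2-byte indices) could violate the rule.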


Ah, nice find (not in the section I was looking for...). I suppose it
isn't expected to work then (albeit it's interesting that it's still not
an actual gl error anywhere).

Roland

What is a basic machine unit? Is it supposed to be index_size, or just
the required alignment for the platform?



Axel



[Mesa-dev] [PATCH v3 1/5] intel/blorp: Explicitly flush all allocated state

2017-02-21 Thread Jason Ekstrand
Found by inspection.  However, I expect it fixes real bugs when using
blorp from Vulkan on little-core platforms.

Cc: "13.0 17.0" 
---
 src/intel/blorp/blorp_genX_exec.h   | 19 ++-
 src/intel/vulkan/genX_blorp_exec.c  | 11 +++
 src/mesa/drivers/dri/i965/genX_blorp_exec.c |  8 
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/src/intel/blorp/blorp_genX_exec.h 
b/src/intel/blorp/blorp_genX_exec.h
index a673ab8..f0c4f38 100644
--- a/src/intel/blorp/blorp_genX_exec.h
+++ b/src/intel/blorp/blorp_genX_exec.h
@@ -66,6 +66,10 @@ blorp_alloc_binding_table(struct blorp_batch *batch, 
unsigned num_entries,
   unsigned state_size, unsigned state_alignment,
   uint32_t *bt_offset, uint32_t *surface_offsets,
   void **surface_maps);
+
+static void
+blorp_flush_range(struct blorp_batch *batch, void *start, size_t size);
+
 static void
 blorp_surface_reloc(struct blorp_batch *batch, uint32_t ss_offset,
 struct blorp_address address, uint32_t delta);
@@ -182,6 +186,7 @@ blorp_emit_vertex_data(struct blorp_batch *batch,
void *data = blorp_alloc_vertex_buffer(batch, sizeof(vertices), addr);
memcpy(data, vertices, sizeof(vertices));
*size = sizeof(vertices);
+   blorp_flush_range(batch, data, *size);
 }
 
 static void
@@ -199,7 +204,8 @@ blorp_emit_input_varying_data(struct blorp_batch *batch,
*size = 16 + num_varyings * vec4_size_in_bytes;
 
const uint32_t *const inputs_src = (const uint32_t *)&params->wm_inputs;
-   uint32_t *inputs = blorp_alloc_vertex_buffer(batch, *size, addr);
+   void *data = blorp_alloc_vertex_buffer(batch, *size, addr);
+   uint32_t *inputs = data;
 
/* Copy in the VS inputs */
assert(sizeof(params->vs_inputs) == 16);
@@ -223,6 +229,8 @@ blorp_emit_input_varying_data(struct blorp_batch *batch,
  inputs += 4;
   }
}
+
+   blorp_flush_range(batch, data, *size);
 }
 
 static void
@@ -906,6 +914,7 @@ blorp_emit_blend_state(struct blorp_batch *batch,
GENX(BLEND_STATE_length) * 4,
64, );
GENX(BLEND_STATE_pack)(NULL, state, );
+   blorp_flush_range(batch, state, GENX(BLEND_STATE_length) * 4);
 
 #if GEN_GEN >= 7
blorp_emit(batch, GENX(3DSTATE_BLEND_STATE_POINTERS), sp) {
@@ -940,6 +949,7 @@ blorp_emit_color_calc_state(struct blorp_batch *batch,
GENX(COLOR_CALC_STATE_length) * 4,
64, );
GENX(COLOR_CALC_STATE_pack)(NULL, state, );
+   blorp_flush_range(batch, state, GENX(COLOR_CALC_STATE_length) * 4);
 
 #if GEN_GEN >= 7
blorp_emit(batch, GENX(3DSTATE_CC_STATE_POINTERS), sp) {
@@ -1016,6 +1026,7 @@ blorp_emit_depth_stencil_state(struct blorp_batch *batch,
GENX(DEPTH_STENCIL_STATE_length) * 
4,
64, );
GENX(DEPTH_STENCIL_STATE_pack)(NULL, state, );
+   blorp_flush_range(batch, state, GENX(DEPTH_STENCIL_STATE_length) * 4);
 #endif
 
 #if GEN_GEN == 7
@@ -1068,6 +1079,8 @@ blorp_emit_surface_state(struct blorp_batch *batch,
   blorp_surface_reloc(batch, state_offset + isl_dev->ss.aux_addr_offset,
   surface->aux_addr, *aux_addr);
}
+
+   blorp_flush_range(batch, state, GENX(RENDER_SURFACE_STATE_length) * 4);
 }
 
 static void
@@ -1098,6 +1111,8 @@ blorp_emit_null_surface_state(struct blorp_batch *batch,
};
 
GENX(RENDER_SURFACE_STATE_pack)(NULL, state, );
+
+   blorp_flush_range(batch, state, GENX(RENDER_SURFACE_STATE_length) * 4);
 }
 
 static void
@@ -1181,6 +1196,7 @@ blorp_emit_sampler_state(struct blorp_batch *batch,
GENX(SAMPLER_STATE_length) * 4,
32, );
GENX(SAMPLER_STATE_pack)(NULL, state, );
+   blorp_flush_range(batch, state, GENX(SAMPLER_STATE_length) * 4);
 
 #if GEN_GEN >= 7
blorp_emit(batch, GENX(3DSTATE_SAMPLER_STATE_POINTERS_PS), ssp) {
@@ -1333,6 +1349,7 @@ blorp_emit_viewport_state(struct blorp_batch *batch,
  .MinimumDepth = 0.0,
  .MaximumDepth = 1.0,
   });
+   blorp_flush_range(batch, state, GENX(CC_VIEWPORT_length) * 4);
 
 #if GEN_GEN >= 7
blorp_emit(batch, GENX(3DSTATE_VIEWPORT_STATE_POINTERS_CC), vsp) {
diff --git a/src/intel/vulkan/genX_blorp_exec.c 
b/src/intel/vulkan/genX_blorp_exec.c
index 6f0b063..139c387 100644
--- a/src/intel/vulkan/genX_blorp_exec.c
+++ b/src/intel/vulkan/genX_blorp_exec.c
@@ -100,6 +100,9 @@ blorp_alloc_binding_table(struct blorp_batch *batch, 
unsigned num_entries,
   surface_offsets[i] = surface_state.offset;
   surface_maps[i] = surface_state.map;
}
+
+   if (!cmd_buffer->device->info.has_llc)
+  anv_state_clflush(bt_state);
 }
 
 static void *
@@ -119,6 +122,14 @@ 
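
The patch above calls blorp_flush_range() after each piece of state is written, and the Vulkan hunk flushes only when the device has no LLC. As a rough sketch of what flushing a range involves (this is not the Mesa implementation; the 64-byte line size, flush_range_sketch() and the callback type are all assumptions made for illustration), the range is rounded down to a cache-line boundary and every overlapping line is visited:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_SIZE 64u   /* assumed line size, not taken from the patch */

/* Hypothetical callback that flushes the one cache line containing p. */
typedef void (*flush_line_fn)(const void *p, void *user);

/* Example callback that only counts calls; a real one would issue a
 * cache-line flush instruction instead. */
static void
count_line(const void *p, void *user)
{
   (void)p;
   (*(size_t *)user)++;
}

/* Visit every cache line overlapping [start, start + size).
 * Returns the number of lines visited. */
static size_t
flush_range_sketch(const void *start, size_t size,
                   flush_line_fn flush_line, void *user)
{
   assert(flush_line != NULL);
   if (size == 0)
      return 0;

   /* Round the start address down to a cache-line boundary. */
   uintptr_t p = (uintptr_t)start & ~(uintptr_t)(CACHELINE_SIZE - 1);
   const uintptr_t end = (uintptr_t)start + size;
   size_t lines = 0;

   for (; p < end; p += CACHELINE_SIZE) {
      flush_line((const void *)p, user);
      lines++;
   }
   return lines;
}
```

The callback keeps the sketch portable; an 8-byte range that straddles a line boundary visits two lines, which is why the start is rounded down instead of stepping from the unaligned address.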

Re: [Mesa-dev] [PATCH v2] softpipe: implement clear_texture

2017-02-21 Thread Lars Hamre
That does seem nicer, then pipe->clear_texture could just be set to
util_clear_texture.
Tangentially, do you see anything preventing this solution being
utilized by llvmpipe?
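
To make the idea concrete, here is a minimal sketch of hooking a shared helper directly into a context's function pointer. Everything in it is hypothetical: mock_context, mock_util_clear_texture and mock_driver_init are stand-ins invented for illustration and are not the Gallium types or the real util_clear_texture signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver context; NOT struct pipe_context. */
struct mock_context;
typedef void (*clear_texture_fn)(struct mock_context *ctx, unsigned level);

struct mock_context {
   clear_texture_fn clear_texture; /* per-driver hook */
   unsigned clears_done;           /* bookkeeping for the demo only */
   unsigned last_level;
};

/* Hypothetical shared helper, playing the role of a util_* function. */
static void
mock_util_clear_texture(struct mock_context *ctx, unsigned level)
{
   assert(ctx != NULL);
   ctx->clears_done++;
   ctx->last_level = level;
}

/* A driver that needs no special handling points its hook straight at
 * the shared helper instead of wrapping it in a driver-specific function.
 */
static void
mock_driver_init(struct mock_context *ctx)
{
   ctx->clear_texture = mock_util_clear_texture;
   ctx->clears_done = 0;
   ctx->last_level = 0;
}
```

With this shape, a second software driver could reuse the same helper by making the same one-line assignment in its own init function, provided the helper relies only on generic transfer map/unmap behaviour.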

On Mon, Feb 20, 2017 at 2:31 PM, Roland Scheidegger  wrote:
> Am 20.02.2017 um 18:01 schrieb Lars Hamre:
>> v2: rework util clear functions such that they operate on a resource
>> instead of a surface (Roland Scheidegger)
>>
>> Implements the ARB_clear_texture extension for softpipe.
>> Passes all corresponding piglit tests.
>>
>> Signed-off-by: Lars Hamre 
>>
>> ---
>>
>> CC: Roland Scheidegger 
>>
>> NOTE: someone with access will need to commit this post
>>   review process
>>
>>  src/gallium/auxiliary/util/u_surface.c| 339 
>> +-
>>  src/gallium/auxiliary/util/u_surface.h|  17 ++
>>  src/gallium/drivers/softpipe/sp_screen.c  |   3 +-
>>  src/gallium/drivers/softpipe/sp_texture.c |  51 +
>>  4 files changed, 262 insertions(+), 148 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_surface.c 
>> b/src/gallium/auxiliary/util/u_surface.c
>> index a9ed006..1ec168f 100644
>> --- a/src/gallium/auxiliary/util/u_surface.c
>> +++ b/src/gallium/auxiliary/util/u_surface.c
>> @@ -388,7 +388,66 @@ no_src_map:
>> ;
>>  }
>>
>> +static void
>> +util_clear_texture_helper(struct pipe_transfer *dst_trans,
>> +  ubyte *dst_map,
>> +  enum pipe_format format,
>> +  const union pipe_color_union *color,
>> +  unsigned width, unsigned height, unsigned depth)
>> +{
>> +   union util_color uc;
>> +
>> +   assert(dst_trans->stride > 0);
>> +
>> +   if (util_format_is_pure_integer(format)) {
>> +  /*
>> +   * We expect int/uint clear values here, though some APIs
>> +   * might disagree (but in any case util_pack_color()
>> +   * couldn't handle it)...
>> +   */
>> +  if (util_format_is_pure_sint(format)) {
>> + util_format_write_4i(format, color->i, 0, , 0, 0, 0, 1, 1);
>> +  } else {
>> + assert(util_format_is_pure_uint(format));
>> + util_format_write_4ui(format, color->ui, 0, , 0, 0, 0, 1, 1);
>> +  }
>> +   } else {
>> +  util_pack_color(color->f, format, );
>> +   }
>> +
>> +   util_fill_box(dst_map, format,
>> + dst_trans->stride, dst_trans->layer_stride,
>> + 0, 0, 0, width, height, depth, );
>> +}
>> +
>> +void
>> +util_clear_texture(struct pipe_context *pipe,
>> +   struct pipe_resource *texture,
>> +   const union pipe_color_union *color,
>> +   unsigned level,
>> +   unsigned dstx, unsigned dsty, unsigned dstz,
>> +   unsigned width, unsigned height, unsigned depth)
>> +{
>> +   struct pipe_transfer *dst_trans;
>> +   ubyte *dst_map;
>> +   enum pipe_format format = texture->format;
>>
>> +   dst_map = pipe_transfer_map_3d(pipe,
>> +  texture,
>> +  level,
>> +  PIPE_TRANSFER_WRITE,
>> +  dstx, dsty, dstz,
>> +  width, height, depth,
>> +  _trans);
>> +   if (!dst_map)
>> +  return;
>> +
>> +   if (dst_trans->stride > 0) {
>> +  util_clear_texture_helper(dst_trans, dst_map, format, color,
>> +width, height, depth);
>> +   }
>> +   pipe->transfer_unmap(pipe, dst_trans);
>> +}
>>
>>  #define UBYTE_TO_USHORT(B) ((B) | ((B) << 8))
>>
>> @@ -410,8 +469,6 @@ util_clear_render_target(struct pipe_context *pipe,
>>  {
>> struct pipe_transfer *dst_trans;
>> ubyte *dst_map;
>> -   union util_color uc;
>> -   unsigned max_layer;
>>
>> assert(dst->texture);
>> if (!dst->texture)
>> @@ -426,54 +483,150 @@ util_clear_render_target(struct pipe_context *pipe,
>>unsigned pixstride = util_format_get_blocksize(dst->format);
>>dx = (dst->u.buf.first_element + dstx) * pixstride;
>>w = width * pixstride;
>> -  max_layer = 0;
>>dst_map = pipe_transfer_map(pipe,
>>dst->texture,
>>0, 0,
>>PIPE_TRANSFER_WRITE,
>>dx, 0, w, 1,
>>_trans);
>> +  if (dst_map) {
>> + util_clear_texture_helper(dst_trans, dst_map, dst->format, color,
>> +   width, height, 1);
>> + pipe->transfer_unmap(pipe, dst_trans);
>> +  }
>> }
>> else {
>> -  max_layer = dst->u.tex.last_layer - dst->u.tex.first_layer;
>> -  dst_map = pipe_transfer_map_3d(pipe,
>> - dst->texture,
>> - dst->u.tex.level,
>> -

Re: [Mesa-dev] [PATCH 1/5] intel/blorp: Explicitly flush all allocated state

2017-02-21 Thread Jason Ekstrand
On Tue, Feb 21, 2017 at 4:05 AM, Lionel Landwerlin <
lionel.g.landwer...@intel.com> wrote:

> On 20/02/17 19:21, Jason Ekstrand wrote:
>
>> Found by inspection.  However, I expect it fixes real bugs when using
>> blorp from Vulkan on little-core platforms.
>>
>> Cc: "13.0 17.0" 
>> ---
>>   src/intel/blorp/blorp_genX_exec.h   | 17 -
>>   src/intel/vulkan/genX_blorp_exec.c  |  8 
>>   src/mesa/drivers/dri/i965/genX_blorp_exec.c |  8 
>>   3 files changed, 32 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/intel/blorp/blorp_genX_exec.h
>> b/src/intel/blorp/blorp_genX_exec.h
>> index a673ab8..1e6b05c 100644
>> --- a/src/intel/blorp/blorp_genX_exec.h
>> +++ b/src/intel/blorp/blorp_genX_exec.h
>> @@ -66,6 +66,10 @@ blorp_alloc_binding_table(struct blorp_batch *batch,
>> unsigned num_entries,
>> unsigned state_size, unsigned state_alignment,
>> uint32_t *bt_offset, uint32_t
>> *surface_offsets,
>> void **surface_maps);
>> +
>> +static void
>> +blorp_flush_range(struct blorp_batch *batch, void *start, size_t size);
>> +
>>   static void
>>   blorp_surface_reloc(struct blorp_batch *batch, uint32_t ss_offset,
>>   struct blorp_address address, uint32_t delta);
>> @@ -182,6 +186,7 @@ blorp_emit_vertex_data(struct blorp_batch *batch,
>>  void *data = blorp_alloc_vertex_buffer(batch, sizeof(vertices),
>> addr);
>>  memcpy(data, vertices, sizeof(vertices));
>>  *size = sizeof(vertices);
>> +   blorp_flush_range(batch, data, *size);
>>   }
>> static void
>> @@ -199,7 +204,8 @@ blorp_emit_input_varying_data(struct blorp_batch
>> *batch,
>>  *size = 16 + num_varyings * vec4_size_in_bytes;
>>const uint32_t *const inputs_src = (const uint32_t
>> *)>wm_inputs;
>> -   uint32_t *inputs = blorp_alloc_vertex_buffer(batch, *size, addr);
>> +   void *data = blorp_alloc_vertex_buffer(batch, *size, addr);
>> +   uint32_t *inputs = data;
>>/* Copy in the VS inputs */
>>  assert(sizeof(params->vs_inputs) == 16);
>> @@ -223,6 +229,8 @@ blorp_emit_input_varying_data(struct blorp_batch
>> *batch,
>>inputs += 4;
>> }
>>  }
>> +
>> +   blorp_flush_range(batch, data, *size);
>>   }
>> static void
>> @@ -906,6 +914,7 @@ blorp_emit_blend_state(struct blorp_batch *batch,
>>  GENX(BLEND_STATE_length) * 4,
>>  64, );
>>  GENX(BLEND_STATE_pack)(NULL, state, );
>> +   blorp_flush_range(batch, state, GENX(BLEND_STATE_length) * 4);
>> #if GEN_GEN >= 7
>>  blorp_emit(batch, GENX(3DSTATE_BLEND_STATE_POINTERS), sp) {
>> @@ -940,6 +949,7 @@ blorp_emit_color_calc_state(struct blorp_batch
>> *batch,
>>
>>  GENX(COLOR_CALC_STATE_length) * 4,
>>  64, );
>>  GENX(COLOR_CALC_STATE_pack)(NULL, state, );
>> +   blorp_flush_range(batch, state, GENX(COLOR_CALC_STATE_length) * 4);
>> #if GEN_GEN >= 7
>>  blorp_emit(batch, GENX(3DSTATE_CC_STATE_POINTERS), sp) {
>> @@ -1016,6 +1026,7 @@ blorp_emit_depth_stencil_state(struct blorp_batch
>> *batch,
>>  GENX(DEPTH_STENCIL_STATE_length)
>> * 4,
>>  64, );
>>  GENX(DEPTH_STENCIL_STATE_pack)(NULL, state, );
>> +   blorp_flush_range(batch, state, GENX(DEPTH_STENCIL_STATE_length) *
>> 4);
>>   #endif
>> #if GEN_GEN == 7
>> @@ -1068,6 +1079,8 @@ blorp_emit_surface_state(struct blorp_batch *batch,
>> blorp_surface_reloc(batch, state_offset +
>> isl_dev->ss.aux_addr_offset,
>> surface->aux_addr, *aux_addr);
>>  }
>> +
>> +   blorp_flush_range(batch, state, GENX(RENDER_SURFACE_STATE_length) *
>> 4);
>>   }
>>
>
> Do we need a flush in blorp_emit_null_surface_state() too?


Yes we do.  I just realized we may have another missing flush as well. :/
I'll send a v3
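
For readers following along, a rough sketch of what flushing a range of state involves on a platform without a shared last-level cache. This is not the Mesa implementation: the 64-byte line size, the sketch_* names and the callback are assumptions made for illustration, and a real flush would issue a cache-line flush instruction (plus the required memory fences) where the callback is invoked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; real code should take this from the platform. */
#define SKETCH_CACHELINE_SIZE 64u

/* Hypothetical per-line callback standing in for the actual flush. */
typedef void (*flush_line_fn)(const void *line, void *user);

/* Visit every cache line overlapping [start, start + size) and return
 * the number of lines visited.
 */
static size_t
sketch_flush_range(const void *start, size_t size,
                   flush_line_fn flush_line, void *user)
{
   uintptr_t p, end;
   size_t lines = 0;

   assert(flush_line != NULL);
   if (size == 0)
      return 0;

   /* Round the start address down to the containing cache line. */
   p = (uintptr_t)start & ~(uintptr_t)(SKETCH_CACHELINE_SIZE - 1);
   end = (uintptr_t)start + size;

   for (; p < end; p += SKETCH_CACHELINE_SIZE) {
      flush_line((const void *)p, user);
      lines++;
   }
   return lines;
}

/* Demo callback: only counts how many lines would have been flushed. */
static void
sketch_count_line(const void *line, void *user)
{
   (void)line;
   (*(size_t *)user)++;
}
```

The point of rounding down is that state which is small but straddles a line boundary still needs both lines flushed, which is why the flush takes a start pointer and a size rather than a single address.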


>
> static void
>> @@ -1181,6 +1194,7 @@ blorp_emit_sampler_state(struct blorp_batch *batch,
>>  GENX(SAMPLER_STATE_length) *
>> 4,
>>  32, );
>>  GENX(SAMPLER_STATE_pack)(NULL, state, );
>> +   blorp_flush_range(batch, state, GENX(SAMPLER_STATE_length) * 4);
>> #if GEN_GEN >= 7
>>  blorp_emit(batch, GENX(3DSTATE_SAMPLER_STATE_POINTERS_PS), ssp) {
>> @@ -1333,6 +1347,7 @@ blorp_emit_viewport_state(struct blorp_batch
>> *batch,
>>.MinimumDepth = 0.0,
>>.MaximumDepth = 1.0,
>> });
>> +   blorp_flush_range(batch, state, GENX(CC_VIEWPORT_length) * 4);
>> #if GEN_GEN >= 7
>>  blorp_emit(batch, GENX(3DSTATE_VIEWPORT_STATE_POINTERS_CC), vsp) {
>> diff --git a/src/intel/vulkan/genX_blorp_exec.c
>> b/src/intel/vulkan/genX_blorp_exec.c
>> index 6f0b063..f7969e5 100644
>> --- 

Re: [Mesa-dev] Gallium: Removal of set_index_buffer (discussion)

2017-02-21 Thread Roland Scheidegger
Am 21.02.2017 um 11:46 schrieb Marek Olšák:
> On Tue, Feb 21, 2017 at 4:36 AM, Roland Scheidegger  
> wrote:
>> Am 20.02.2017 um 21:58 schrieb Marek Olšák:
>>> On Mon, Feb 20, 2017 at 9:28 PM, Roland Scheidegger  
>>> wrote:
 Am 20.02.2017 um 20:56 schrieb Marek Olšák:
> On Mon, Feb 20, 2017 at 8:29 PM, Axel Davy  wrote:
>> On 20/02/2017 20:11, Ilia Mirkin wrote:
>>>
>>> On Mon, Feb 20, 2017 at 2:01 PM, Marek Olšák  wrote:

 Hi,

 I'd like to remove pipe_context::set_index_buffer. It's not useful to
 most drivers and the interface is inconvenient for Mesa/OpenGL,
 because it's a draw state that is set with a separate driver callback,
 which is an unnecessary driver roundtrip taking some CPU cycles. I'd
 prefer to pass the index buffer via pipe_draw_info.

 I'm aware that the interface was inherited from DX10, but I don't
 think that makes any difference here. DX10 state trackers can pass the
 index buffer via pipe_draw_info too.

 This is my proposal:

 diff --git a/src/gallium/include/pipe/p_state.h
 b/src/gallium/include/pipe/p_state.h
 index ce19b92..cffbb33 100644
 --- a/src/gallium/include/pipe/p_state.h
 +++ b/src/gallium/include/pipe/p_state.h
 @@ -635,7 +635,7 @@ struct pipe_index_buffer
*/
   struct pipe_draw_info
   {
 -   boolean indexed;  /**< use index buffer */
 +   ubyte index_size;  /**< 0 = non-indexed */
>>
>> Isn't that enough to say non-index when index_buffer and user_indices are
>> NULL ?
>
> We still need index_size and it's only 8 bits as opposed to 64 bits.
 FWIW at least in d3d10 you can actually have indexed rendering without
 an index buffer bound. This is perfectly valid, you're just expected to
 return always zero for all indices... Albeit I believe we actually deal
 with this with a dummy buffer.

>

  enum pipe_prim_type mode;  /**< the mode of the primitive */
  boolean primitive_restart;
  ubyte vertices_per_patch; /**< the number of vertices per patch */
 @@ -666,12 +666,18 @@ struct pipe_draw_info

  unsigned indirect_params_offset; /**< must be 4 byte aligned */

 +   /**
 +* Index buffer. Only one can be non-NULL.
 +*/
 +   struct pipe_resource *index_buffer; /* "start" is the offset */
>>>
>>> Works for me. Is start the offset in bytes or is start * index_size
>>> the offset in bytes?
>>
>> Same question here. My understanding is that start is in terms of start *
>> index_size bytes.
>
> offset = start * index_size;
>
>> But we really want to have a byte offset.
>
> The offset should be aligned to index_size, otherwise hardware won't work.
 Are you sure of that? d3d10 doesn't seem to have such a requirement, or
 if it has I can't find it now (so, the startIndex really is in terms of
 index units, but the offset of the buffer is in terms of bytes, and the
 docs don't seem to mention it's limited to index alignment).
 I don't actually see such a limitation in GL either, albeit some quick
 googling seems to suggest YMMV (e.g.
 http://irrlicht.sourceforge.net/forum/viewtopic.php?f=7&t=51444
 ).
 So, I can't quite tell right now if we really need byte offsets...
>>>
>>> It's a natural requirement of hardware. It doesn't have to be
>>> documented IMO. CPUs might not support it either.
>> I did some quick tests and I believe d3d10 doesn't actually require
>> offsets not aligned to index size to work, but don't quote me on that.
>> Nevertheless, I've got the feeling this might be expected to work with
>> GL - doesn't look like it would be an error if your indices "pointer" in
>> glDrawElements() isn't aligned, and if it's not an error I don't see why
>> it wouldn't be well defined?
>> (FWIW x86 supports this fine, but indeed not all cpu archs might. Even
>> AVX2 gather supports non-aligned lookups - of course just for uint
>> indices since gather doesn't support smaller than 32bit gathers.)
> 
> From GL 4.4 Core, section 6.2:
> 
> "Clients must align data elements consistent with the requirements of
> the client platform, with an additional base-level requirement that an
> offset within a buffer to a datum comprising N basic machine units be
> a multiple of N."
> 
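
As a small illustration of the two points above (offset = start * index_size, and the quoted GL rule that a datum of N basic machine units must sit at a multiple of N), here is a sketch of the arithmetic. The helper names are made up for this example and are not part of Gallium or GL; index sizes of 1, 2 and 4 bytes are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Byte offset of the first index when "start" counts indices rather
 * than bytes, i.e. offset = start * index_size.
 */
static size_t
index_byte_offset(unsigned start, unsigned index_size)
{
   /* Assumption: only 1-, 2- and 4-byte indices are in use. */
   assert(index_size == 1 || index_size == 2 || index_size == 4);
   return (size_t)start * index_size;
}

/* The quoted GL 4.4 rule: an offset to a datum of N basic machine units
 * must be a multiple of N.
 */
static bool
offset_is_aligned(size_t byte_offset, unsigned index_size)
{
   return index_size != 0 && byte_offset % index_size == 0;
}
```

One consequence: an offset derived as start * index_size is aligned by construction, so the alignment question only arises for interfaces that accept an arbitrary byte offset.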

Ah, nice find (not in the section I was looking for...). I suppose it
isn't expected to work then (albeit it's interesting that it's still not
an actual gl error 

[Mesa-dev] [Bug 99856] OpenCL Hello world returns "unsupported call to function get_local_size"

2017-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99856

--- Comment #3 from Vedran Miletić  ---
On Fedora 25 with custom-compiled LLVM, libclc, and Mesa:

$ gcc -lOpenCL -lm hello.c
hello.c: In function ‘main’:
hello.c:269:5: warning: ‘clCreateCommandQueue’ is deprecated
[-Wdeprecated-declarations]
 commands = clCreateCommandQueue(context, device_id, 0, );
 ^~~~
In file included from hello.c:121:0:
/usr/include/CL/cl.h:1427:1: note: declared here
 clCreateCommandQueue(cl_context /* context */,
 ^~~~

$ ./hello
Computed '1024/1024' correct values!

I would say this is a distribution issue.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH mesa v2] docs: fix gamma correction link

2017-02-21 Thread Emil Velikov
On 14 February 2017 at 22:48, Eric Engestrom  wrote:
> From: Eric Engestrom 
>
> That link has been dead for 15 years...
> We could link to Archive.org [1] to get the last time this page existed,
> but I feel like Wikipedia is a better choice.
>
> [1] 
> http://web.archive.org/web/20021211151318/http://www.inforamp.net/~poynton/notes/colour_and_gamma/GammaFAQ.html
>
> Signed-off-by: Eric Engestrom 
> ---
> v2: fix link name as well

Reviewed-by: Emil Velikov 

-Emil


Re: [Mesa-dev] [PATCH mesa] docs: add link to gallium doc

2017-02-21 Thread Emil Velikov
Reviewed-by: Emil Velikov 

-Emil


[Mesa-dev] [PATCH mesa 3/5] gallium/docs: add missing newlines

2017-02-21 Thread Eric Engestrom
Without these, MathJax treats each line as a continuation of the
previous one.

Signed-off-by: Eric Engestrom 
---
 src/gallium/docs/source/tgsi.rst | 33 +
 1 file changed, 33 insertions(+)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index b9b9d6ca34..de73e5291a 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -1806,6 +1806,7 @@ two-component vectors with doubled precision in each component.
 .. math::
 
   dst.xy = |src0.xy|
+
   dst.zw = |src0.zw|
 
 .. opcode:: DADD - Add
@@ -2060,6 +2061,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = |src0.xy|
+
   dst.zw = |src0.zw|
 
 .. opcode:: I64NEG - 64-bit Integer Negate
@@ -2069,6 +2071,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = -src.xy
+
   dst.zw = -src.zw
 
 .. opcode:: I64SSG - 64-bit Integer Set Sign
@@ -2076,6 +2079,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = (src0.xy < 0) ? -1 : (src0.xy > 0) ? 1 : 0
+
   dst.zw = (src0.zw < 0) ? -1 : (src0.zw > 0) ? 1 : 0
 
 .. opcode:: U64ADD - 64-bit Integer Add
@@ -2083,6 +2087,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = src0.xy + src1.xy
+
   dst.zw = src0.zw + src1.zw
 
 .. opcode:: U64MUL - 64-bit Integer Multiply
@@ -2090,6 +2095,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = src0.xy * src1.xy
+
   dst.zw = src0.zw * src1.zw
 
 .. opcode:: U64SEQ - 64-bit Integer Set on Equal
@@ -2097,6 +2103,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.x = src0.xy == src1.xy ? \sim 0 : 0
+
   dst.z = src0.zw == src1.zw ? \sim 0 : 0
 
 .. opcode:: U64SNE - 64-bit Integer Set on Not Equal
@@ -2104,6 +2111,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.x = src0.xy != src1.xy ? \sim 0 : 0
+
   dst.z = src0.zw != src1.zw ? \sim 0 : 0
 
 .. opcode:: U64SLT - 64-bit Unsigned Integer Set on Less Than
@@ -2111,6 +2119,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.x = src0.xy < src1.xy ? \sim 0 : 0
+
   dst.z = src0.zw < src1.zw ? \sim 0 : 0
 
 .. opcode:: U64SGE - 64-bit Unsigned Integer Set on Greater Equal
@@ -2118,6 +2127,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.x = src0.xy >= src1.xy ? \sim 0 : 0
+
   dst.z = src0.zw >= src1.zw ? \sim 0 : 0
 
 .. opcode:: I64SLT - 64-bit Signed Integer Set on Less Than
@@ -2125,6 +2135,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.x = src0.xy < src1.xy ? \sim 0 : 0
+
   dst.z = src0.zw < src1.zw ? \sim 0 : 0
 
 .. opcode:: I64SGE - 64-bit Signed Integer Set on Greater Equal
@@ -2132,6 +2143,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.x = src0.xy >= src1.xy ? \sim 0 : 0
+
   dst.z = src0.zw >= src1.zw ? \sim 0 : 0
 
 .. opcode:: I64MIN - Minimum of 64-bit Signed Integers
@@ -2139,6 +2151,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = min(src0.xy, src1.xy)
+
   dst.zw = min(src0.zw, src1.zw)
 
 .. opcode:: U64MIN - Minimum of 64-bit Unsigned Integers
@@ -2146,6 +2159,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = min(src0.xy, src1.xy)
+
   dst.zw = min(src0.zw, src1.zw)
 
 .. opcode:: I64MAX - Maximum of 64-bit Signed Integers
@@ -2153,6 +2167,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = max(src0.xy, src1.xy)
+
   dst.zw = max(src0.zw, src1.zw)
 
 .. opcode:: U64MAX - Maximum of 64-bit Unsigned Integers
@@ -2160,6 +2175,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = max(src0.xy, src1.xy)
+
   dst.zw = max(src0.zw, src1.zw)
 
 .. opcode:: U64SHL - Shift Left 64-bit Unsigned Integer
@@ -2169,6 +2185,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = src0.xy << (0x3f \& src1.x)
+
   dst.zw = src0.zw << (0x3f \& src1.y)
 
 .. opcode:: I64SHR - Arithmetic Shift Right (of 64-bit Signed Integer)
@@ -2178,6 +2195,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = src0.xy >> (0x3f \& src1.x)
+
   dst.zw = src0.zw >> (0x3f \& src1.y)
 
 .. opcode:: U64SHR - Logical Shift Right (of 64-bit Unsigned Integer)
@@ -2187,6 +2205,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = src0.xy >> (unsigned) (0x3f \& src1.x)
+
   dst.zw = src0.zw >> (unsigned) (0x3f \& src1.y)
 
 .. opcode:: I64DIV - 64-bit Signed Integer Division
@@ -2194,6 +2213,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = src0.xy \ src1.xy
+
   dst.zw = src0.zw \ src1.zw
 
 .. opcode:: U64DIV - 64-bit Unsigned Integer Division
@@ -2201,6 +2221,7 @@ two-component vectors with 64-bits in each component.
 .. math::
 
   dst.xy = src0.xy \ src1.xy

[Mesa-dev] [PATCH mesa 5/5] gallium/docs: use imgmath instead of pngmath

2017-02-21 Thread Eric Engestrom
WARNING: sphinx.ext.pngmath has been deprecated. Please use
sphinx.ext.imgmath instead.

Signed-off-by: Eric Engestrom 
---
 src/gallium/docs/source/conf.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/docs/source/conf.py b/src/gallium/docs/source/conf.py
index 5e8173d869..c6039fbe8d 100644
--- a/src/gallium/docs/source/conf.py
+++ b/src/gallium/docs/source/conf.py
@@ -22,7 +22,7 @@
 
 # Add any Sphinx extension module names here, as strings. They can be extensions
 # coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
-extensions = ['sphinx.ext.pngmath', 'sphinx.ext.graphviz', 'formatting']
+extensions = ['sphinx.ext.imgmath', 'sphinx.ext.graphviz', 'formatting']
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
-- 
Cheers,
  Eric



[Mesa-dev] [PATCH mesa 0/5] gallium/docs: formatting fixes

2017-02-21 Thread Eric Engestrom
I noticed a bunch of warnings and errors when compiling the docs, so
I fixed the ones I knew how to fix.

There are a couple of warnings left to fix, if anyone's interested:

- src/gallium/docs/source/drivers/freedreno/ir3-notes.rst:195:
  WARNING: Could not lex literal_block as "c". Highlighting skipped.
  Should this block be marked as assembly? (Rob Clark? you wrote this
  a couple years ago)

- src/gallium/docs/source/distro.rst:104:
  WARNING: undefined label: egl
  There is no EGL section in distro.rst, if anyone wants to write it :)

- src/gallium/docs/source/conf.py:126:
  WARNING: html_static_path entry 'src/gallium/docs/source/_static' does not exist
  This one can be safely ignored, or silenced by creating the directory
  locally. I could create the directory in git by putting some dummy file
  in there (git ignores empty directories), but that doesn't feel right.


Cc: Rob Clark 

Eric Engestrom (5):
  gallium/docs: fix sublist formatting
  gallium/docs: add missing math formatting
  gallium/docs: add missing newlines
  gallium/docs: fix section title formatting
  gallium/docs: use imgmath instead of pngmath

 src/gallium/docs/source/conf.py |  2 +-
 src/gallium/docs/source/context.rst |  2 ++
 src/gallium/docs/source/tgsi.rst| 41 +++--
 3 files changed, 42 insertions(+), 3 deletions(-)

-- 
Cheers,
  Eric



[Mesa-dev] [PATCH mesa 1/5] gallium/docs: fix sublist formatting

2017-02-21 Thread Eric Engestrom
src/gallium/docs/source/context.rst:95: ERROR: Unexpected indentation.

Sublists need to be surrounded by blank lines.

Signed-off-by: Eric Engestrom 
---
 src/gallium/docs/source/context.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/docs/source/context.rst b/src/gallium/docs/source/context.rst
index d70e34e7aa..a053193722 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -91,10 +91,12 @@ objects. They all follow simple, one-method binding calls, e.g.
   blits. (Blits have their own way to pass the requisite rectangles
   in.)
 * ``set_tess_state`` configures the default tessellation parameters:
+
   * ``default_outer_level`` is the default value for the outer tessellation
 levels. This corresponds to GL's ``PATCH_DEFAULT_OUTER_LEVEL``.
   * ``default_inner_level`` is the default value for the inner tessellation
 levels. This corresponds to GL's ``PATCH_DEFAULT_INNER_LEVEL``.
+
 * ``set_debug_callback`` sets the callback to be used for reporting
   various debug messages, eventually reported via KHR_debug and
   similar mechanisms.
-- 
Cheers,
  Eric



[Mesa-dev] [Bug 99886] [radv] No actual multithreading in a Vulkan multithreading demo

2017-02-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99886

Bug ID: 99886
   Summary: [radv] No actual multithreading in a Vulkan
multithreading demo
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Vulkan/radeon
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: 0xe2.0x9a.0...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Steps to reproduce the issue:

1. Clone and compile https://github.com/SaschaWillems/Vulkan
2. cd bin
3. ./multithreading

Expected CPU usage on a quad-core CPU: 400%
Actual CPU usage on a quad-core CPU: 100%

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

