date:20180220

Re: [Mesa-dev] [PATCH] i965/vec4: use a temp register to compute offsets for pull loads

2018-02-20 Thread Iago Toral

Yes, I agree, thanks for bringing it up.

Iago

On Tue, 2018-02-20 at 16:38 +0200, Andres Gomez wrote:
> Iago, this looks like a good candidate to nominate for inclusion in the
> 17.3 stable queue.
> 
> What do you think?
> 
> On Wed, 2017-11-29 at 11:49 +0100, Iago Toral Quiroga wrote:
> > 64-bit pull loads are implemented by emitting 2 separate
> > 32-bit pull load messages, where the second message loads from
> > an offset at +16B.
> > 
> > That addition of 16B to the original offset should not alter the
> > original offset register used as source for the pull load instruction
> > though, since the compiler might use that same offset register in other
> > instructions (for example, for other pull loads in the shader code
> > that take that same offset as reference).
> > 
> > If the pull load is 32-bit then we only need to emit one message and
> > we don't need to do offset calculations, but in that case the optimizer
> > should be able to drop the redundant MOV.
> > 
> > Fixes the following test on Haswell:
> > KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components
> > ---
> >  src/intel/compiler/brw_vec4_nir.cpp | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/intel/compiler/brw_vec4_nir.cpp b/src/intel/compiler/brw_vec4_nir.cpp
> > index 0a1caa9fad..84f5b37a9d 100644
> > --- a/src/intel/compiler/brw_vec4_nir.cpp
> > +++ b/src/intel/compiler/brw_vec4_nir.cpp
> > @@ -888,7 +888,9 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
> >if (const_offset) {
> >   offset_reg = brw_imm_ud(const_offset->u32[0] & ~15);
> >} else {
> > - offset_reg = get_nir_src(instr->src[1], nir_type_uint32, 1);
> > + offset_reg = src_reg(this, glsl_type::uint_type);
> > + emit(MOV(dst_reg(offset_reg),
> > +  get_nir_src(instr->src[1], nir_type_uint32, 1)));
> >}
> >  
> >src_reg packed_consts;
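
For readers outside the backend, the aliasing problem described in the commit message can be modelled in a few lines of C. This is only a sketch under stated assumptions, not the vec4 code: `struct reg`, `load64_in_place` and `load64_with_temp` are hypothetical names, and a "message" is reduced to recording the byte offset it would read from.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a register that holds a byte offset. */
struct reg {
   uint32_t value;
};

/* Problematic pattern: the +16B for the second 32-bit message is added to
 * the caller's register, so later users of that register see offset + 16.
 */
static void
load64_in_place(struct reg *offset, uint32_t msg_offsets[2])
{
   assert(offset);
   msg_offsets[0] = offset->value;   /* first 32-bit message */
   offset->value += 16;              /* clobbers the shared register */
   msg_offsets[1] = offset->value;   /* second 32-bit message */
}

/* Pattern from the patch: copy the offset into a temporary first (the MOV)
 * and do the arithmetic on the copy only.
 */
static void
load64_with_temp(const struct reg *offset, uint32_t msg_offsets[2])
{
   assert(offset);
   struct reg tmp = *offset;         /* plays the role of the added MOV */
   msg_offsets[0] = tmp.value;
   tmp.value += 16;
   msg_offsets[1] = tmp.value;
}
```

With an offset of 32, both variants issue messages at 32 and 48, but only the second leaves the original register at 32 for any other instruction that reads it.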
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] Update: Vulkan modifiers extension VK_EXT_image_drm_format_modifier

2018-02-20 Thread Chad Versace

As many of you know, I've been writing a Vulkan extension for DRM format
modifiers, named VK_EXT_image_drm_format_modifier.

The extension is very close to completion. I've submitted a branch to
Khronos for review. It's receiving active review inside Khronos from
some non-Mesa closed-source window-system-integration people, and Mesa
people too (namely jekstrand).

You should take a look at the spec if you care about modifiers and Vulkan.
I try to keep up-to-date urls to everything related to this extension at
.

There remain only two unresolved issues from my perspective:

- The exact definition of members of the array
VkImageDrmFormatModifierExplicitCreateInfoEXT::pPlaneLayouts.
Should the extension re-use VkSubresourceLayout as the array
members? Or should it define a new struct with fewer fields than
VkSubresourceLayout?

- If an image has a modifier that requires an extra plane (such as
a color-compression plane), should the extension allow such an
image to be disjoint? Specifically, if a modifier requires an
extra plane, should the extension allow the modifier's
drmFormatModifierTilingFeatures to contain
VK_FORMAT_FEATURE_DISJOINT_KHR?

I've tentatively concluded "no": images with extra planes must be
non-disjoint. Though we could lift this restriction in a future
extension.

Branches
==

I maintain a public branch of the Vulkan spec, branch
1.0-VK_EXT_image_drm_format_modifier, which is synchronized with the
Khronos-internal branch of the same name. I like cgit; other people like
Github; so I keep a mirror at both.

cgit:
http://git.kiwitree.net/cgit/~chadv/vulkan-spec/log?h=1.0-VK_EXT_image_drm_format_modifier
github:
https://github.com/chadversary/vulkan-spec/commits/1.0-VK_EXT_image_drm_format_modifier
khronos-internal:
https://gitlab.khronos.org/vulkan/vulkan/merge_requests/2555

Prebuilt Specs
==
I maintain a public build of the branch. The built headers and HTML
specification are synchronized with the git branch thanks to the magic
of shell scripts.

vulkan.h:
http://git.kiwitree.net/cgit/~chadv/vulkan-spec/tree/src/vulkan/vulkan.h?h=1.0-VK_EXT_image_drm_format_modifier
spec appendix:
http://kiwitree.net/~chadv/vulkan/1.0-VK_EXT_image_drm_format_modifier/html/vkspec.html#VK_EXT_image_drm_format_modifier
full spec:
http://kiwitree.net/~chadv/vulkan/1.0-VK_EXT_image_drm_format_modifier/html/vkspec.html

Where to start reading
==
Here's a short reading guide for people unfamiliar with the extension:

- Don't start with the specification. You'll quickly get lost.

- First, read the VK_EXT_external_memory_dma_buf and
VK_EXT_queue_family_foreign extensions. They're small extensions.
They're intended to be used with VK_EXT_image_drm_format_modifier.
Despite that intent, the three extensions are independent from
the Vulkan specification's perspective.

- Read the appendix chapter for VK_EXT_image_drm_format_modifier.
I've written an "introduction to modifiers" section in the
appendix. If you already intimately understand modifiers, then
you can briefly scan this section, skipping over the boring stuff.

- Read the issue section in the appendix.

- Now for the tofu. Study the new structs and functions. Find them
under `#define VK_EXT_image_drm_format_modifier` in vulkan.h.
Also, study the new enum values. Find them by grepping 'DRM.*EXT'
in vulkan.h.

- Finally, go read the specification text for the new structs,
functions, and enums.

How to send feedback
==

I honestly don't know. You _could_ comment on the merge request
in the Khronos-internal Gitlab. But you probably (and rightfully so)
want to keep the discussion public.

You could provide general feedback by replying to this thread.

You could leave comments on my Github branch. I don't like Github, but
I can't think of a better solution, other than...

I could send my specification patches to mesa-dev. If people want that,
say so.

So... yeah. I don't know how you should provide feedback. Just do it,
and we'll iron out any problems as they arise.

Re: [Mesa-dev] [PATCH v3] anv/blorp: multisample resolve all attachment layers

2018-02-20 Thread Iago Toral

Hi Nanley,

thanks for having a look at this, you're right that we should use the
framebuffer dimensions to decide on the number of layers to resolve. 

I'll send a new version with the fix.

Iago

On Tue, 2018-02-20 at 15:18 -0800, Nanley Chery wrote:
> On Thu, Feb 15, 2018 at 09:40:16AM +0100, Iago Toral Quiroga wrote:
> > We were only resolving the first.
> > 
> > v2:
> >   - Do not require that the number of layers on dst and src are an
> > exact match, it is okay if the dst has more layers so long as
> > it has at least the same that we are going to resolve.
> >   - Do not always resolve array_len layers, we should resolve
> > only from base_array_layer to array_len.
> > 
> > v3:
> >   - v2 was assuming that array_len represented the total number of
> > layers in the image, but it represents the number of layers
> > starting at the base array layer.
> > 
> > Fixes new CTS tests for multisampled layered rendering:
> > dEQP-VK.renderpass.multisample_resolve.layers_*
> > ---
> >  src/intel/vulkan/anv_blorp.c | 30 +++---
> >  1 file changed, 19 insertions(+), 11 deletions(-)
> > 
> > diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> > index d38b343671..df566773a4 100644
> > --- a/src/intel/vulkan/anv_blorp.c
> > +++ b/src/intel/vulkan/anv_blorp.c
> > @@ -1543,25 +1543,33 @@ anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer *cmd_buffer)
> >   get_blorp_surf_for_anv_image(cmd_buffer->device, dst_iview->image,
> >VK_IMAGE_ASPECT_COLOR_BIT,
> >dst_aux_usage, _surf);
> > +
> > + uint32_t base_src_layer = src_iview->planes[0].isl.base_array_layer;
> > + uint32_t base_dst_layer = dst_iview->planes[0].isl.base_array_layer;
> > + uint32_t num_layers = src_iview->planes[0].isl.array_len;
> 
> num_layers should be equal to fb->layers. As seen in the definition of
> renderArea in the Vulkan spec, resolve operations are limited to the
> renderArea, which extends to all layers of the framebuffer.
> 
>    renderArea is the render area that is affected by the render pass
>    instance. The effects of attachment load, store and multisample resolve
>    operations are restricted to the pixels whose x and y coordinates fall
>    within the render area on all attachments. The render area extends to
>    all layers of framebuffer.
> 
> > + assert(num_layers <= dst_iview->planes[0].isl.array_len);
> > +
> 
> This assertion is false. The spec allows having an arrayed multisample
> source image view and a non-arrayed single-sampled destination image
> view as long as the framebuffer is non-arrayed.
> 
>    Each element of pAttachments must have dimensions at least as large as
>    the corresponding framebuffer dimension
> 
> -Nanley
> 
> >   anv_cmd_buffer_mark_image_written(cmd_buffer, dst_iview->image,
> > VK_IMAGE_ASPECT_COLOR_BIT,
> > dst_surf.aux_usage,
> > dst_iview->planes[0].isl.base_level,
> > -   dst_iview->planes[0].isl.base_array_layer, 1);
> > +   base_dst_layer, num_layers);
> >  
> >   assert(!src_iview->image->format->can_ycbcr);
> >   assert(!dst_iview->image->format->can_ycbcr);
> >  
> > - resolve_surface(,
> > - _surf,
> > - src_iview->planes[0].isl.base_level,
> > - src_iview->planes[0].isl.base_array_layer,
> > - _surf,
> > - dst_iview->planes[0].isl.base_level,
> > - dst_iview->planes[0].isl.base_array_layer,
> > - render_area.offset.x, render_area.offset.y,
> > - render_area.offset.x, render_area.offset.y,
> > - render_area.extent.width, render_area.extent.height);
> > + for (uint32_t i = 0; i < num_layers; i++) {
> > +resolve_surface(,
> > +_surf,
> > +src_iview->planes[0].isl.base_level,
> > +base_src_layer + i,
> > +_surf,
> > +dst_iview->planes[0].isl.base_level,
> > +base_dst_layer + i,
> > +render_area.offset.x, render_area.offset.y,
> > +render_area.offset.x, render_area.offset.y,
> > +render_area.extent.width, render_area.extent.height);
> > + }
> >}
> >  
> >blorp_batch_finish();
> > -- 
> > 2.14.1
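
The review point above (iterate over the framebuffer's layer count, not the source view's array_len) can be sketched roughly as follows. This is not the anv code: `struct view` and `resolve_layer_pair` are hypothetical, reduced stand-ins, and the asserts encode the spec sentence Nanley quotes, that every attachment is at least as large as the framebuffer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced image view: only the fields the loop needs. */
struct view {
   uint32_t base_array_layer;
   uint32_t array_len;
};

/* Compute which source and destination layers iteration i of the resolve
 * loop would touch when the loop runs over fb_layers framebuffer layers.
 */
static void
resolve_layer_pair(const struct view *src, const struct view *dst,
                   uint32_t fb_layers, uint32_t i,
                   uint32_t *src_layer, uint32_t *dst_layer)
{
   /* Each attachment must cover at least the framebuffer's layers. */
   assert(fb_layers <= src->array_len);
   assert(fb_layers <= dst->array_len);
   assert(i < fb_layers);

   *src_layer = src->base_array_layer + i;
   *dst_layer = dst->base_array_layer + i;
}
```

With an arrayed source view (base layer 2, 4 layers), a non-arrayed destination view and a single-layer framebuffer, the loop runs once and pairs source layer 2 with destination layer 0, which is the case the removed assertion would have rejected.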

Re: [Mesa-dev] Allocator Nouveau driver, Mesa EXT_external_objects, and DRM metadata import interfaces

2018-02-20 Thread Chad Versace

On Thu 21 Dec 2017, Daniel Vetter wrote:
> On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen  
> wrote:
>> On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico  
>> wrote:
>>> On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg  
>>> wrote:
>>>> I'd like to see concrete examples of actual display controllers
>>>> supporting more format layouts than what can be specified with a 64
>>>> bit modifier.
>>>
>>> The main problem is our tiling and other metadata parameters can't
>>> generally fit in a modifier, so we find passing a blob of metadata a
>>> more suitable mechanism.
>>
>> I understand that you may have n knobs with a total of more than
>> 56 bits that configure your tiling/swizzling for color buffers. What I don't
>> buy is that you need all those combinations when passing buffers around
>> between codecs, cameras and display controllers. Even if you're sharing
>> between the same 3D drivers in different processes, I expect just locking
>> down, say, 64 different combinations (you can add more over time) and
>> assigning each a modifier would be sufficient. I doubt you'd extract
>> meaningful performance gains from going all the way to a blob.

I agree with Kristian above. In my opinion, choosing to encode in
modifiers a precise description of every possible tiling/compression
layout is not technically incorrect, but I believe it misses the point.
The intention behind modifiers is not to exhaustively describe all
possibilities.

I summarized this opinion in VK_EXT_image_drm_format_modifier,
where I wrote an "introduction to modifiers" section. Here's an excerpt:

One goal of modifiers in the Linux ecosystem is to enumerate for each
vendor a reasonably sized set of tiling formats that are appropriate for
images shared across processes, APIs, and/or devices, where each
participating component may possibly be from different vendors.
A non-goal is to enumerate all tiling formats supported by all vendors.
Some tiling formats used internally by vendors are inappropriate for
sharing; no modifiers should be assigned to such tiling formats.

> Tegra just redesigned its modifier space from an ungodly amount of
> bits to just a few layouts. Not even just the ones in use, but simply
> limiting to the ones that make sense (there's dependencies apparently)
> Also note that the modifier alone doesn't need to describe the layout
> precisely, it only makes sense together with a specific pixel format
> and size. E.g. a bunch of the i915 layouts change layout depending
> upon bpp.
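
As background for the "56 bits" figure in the thread: to my understanding of drm_fourcc.h, a modifier is a 64-bit value whose top 8 bits name the vendor and whose low 56 bits are vendor-defined. A minimal sketch of that packing, with hypothetical helper names (`make_modifier`, `modifier_vendor`, `modifier_value`) that are not part of any kernel or Mesa header:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MOD_VENDOR_SHIFT 56
#define SKETCH_MOD_VALUE_MASK   0x00ffffffffffffffULL

/* Pack an 8-bit vendor id and a 56-bit vendor-defined value. */
static uint64_t
make_modifier(uint8_t vendor, uint64_t value)
{
   /* The value must fit in the 56 bits left over after the vendor id. */
   assert((value & ~SKETCH_MOD_VALUE_MASK) == 0);
   return ((uint64_t)vendor << SKETCH_MOD_VENDOR_SHIFT) |
          (value & SKETCH_MOD_VALUE_MASK);
}

/* Extract the vendor id from the top 8 bits. */
static uint8_t
modifier_vendor(uint64_t modifier)
{
   return (uint8_t)(modifier >> SKETCH_MOD_VENDOR_SHIFT);
}

/* Extract the vendor-defined 56-bit value. */
static uint64_t
modifier_value(uint64_t modifier)
{
   return modifier & SKETCH_MOD_VALUE_MASK;
}
```

Under this scheme, enumerating a few dozen shareable layouts per vendor means assigning small integers in the value field, rather than encoding every tiling knob as a bitfield.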

Re: [Mesa-dev] [PATCH] nir: remove old assert

2018-02-20 Thread Ian Romanick

That makes sense.  I guess whoever changed that aspect didn't remove the
assert.  I only noticed it because I build with -Wextra, so it's not
surprising that nobody else noticed.

Reviewed-by: Ian Romanick 

On 02/20/2018 07:42 PM, Timothy Arceri wrote:
> This was originally intended to make sure the remap location
> was not -1. However the code has changed a lot since then,
> the location is now never set to -1 and we also handle
> components meaning this old assert has been doing comparisons
> with the pointer to the array of component data.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183
> ---
>  src/compiler/nir/nir_linking_helpers.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/src/compiler/nir/nir_linking_helpers.c b/src/compiler/nir/nir_linking_helpers.c
> index 6459c6a24d..2b0a2668a3 100644
> --- a/src/compiler/nir/nir_linking_helpers.c
> +++ b/src/compiler/nir/nir_linking_helpers.c
> @@ -283,7 +283,6 @@ remap_slots_and_components(struct exec_list *var_list, gl_shader_stage stage,
>if (var->data.location >= VARYING_SLOT_VAR0 &&
>var->data.location - VARYING_SLOT_VAR0 < 32) {
>   assert(var->data.location - VARYING_SLOT_VAR0 < 32);
> - assert(remap[var->data.location - VARYING_SLOT_VAR0] >= 0);
>  
>   const struct glsl_type *type = var->type;
>   if (nir_is_per_vertex_io(var, stage)) {
> 


[Mesa-dev] [PATCH 09/17 v2] spirv: Silence compiler warning about undefined srcs[0]

2018-02-20 Thread Eric Anholt

v2: Use assume() at the srcs[] definition instead.

Cc: Jason Ekstrand 
Cc: Ian Romanick 
Cc: Eric Engestrom 
---
 src/compiler/spirv/spirv_to_nir.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
index c6df764682ec..e22fe25a2e82 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -2922,6 +2922,7 @@ vtn_handle_composite(struct vtn_builder *b, SpvOp opcode,
 
case SpvOpCompositeConstruct: {
   unsigned elems = count - 3;
+  assume(elems >= 1);
   if (glsl_type_is_vector_or_scalar(type)) {
  nir_ssa_def *srcs[4];
  for (unsigned i = 0; i < elems; i++)
-- 
2.15.0
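
For context, an assume()-style macro typically asserts in debug builds and tells the compiler the false branch is unreachable in release builds, which is what lets it conclude srcs[0] is always written. The sketch below is an approximation under that assumption, not a copy of Mesa's macro; `sketch_assume`, `SKETCH_UNREACHABLE` and `first_source` are hypothetical names.

```c
#include <assert.h>

#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNREACHABLE() __builtin_unreachable()
#else
#define SKETCH_UNREACHABLE() ((void)0)
#endif

/* Debug builds: assert. Release builds (NDEBUG): promise the compiler that
 * expr holds, so code depending on it is not flagged as maybe-uninitialized.
 */
#define sketch_assume(expr) \
   ((expr) ? (void)0 : (assert(!"assumption failed"), SKETCH_UNREACHABLE()))

/* Same shape as the patched code: a fixed-size array filled by a loop whose
 * trip count the compiler cannot otherwise prove to be non-zero.
 */
static int
first_source(const int *values, unsigned elems)
{
   sketch_assume(elems >= 1);

   int srcs[4];
   for (unsigned i = 0; i < elems && i < 4; i++)
      srcs[i] = values[i];

   return srcs[0];
}
```

Calling `first_source` with two values {7, 8} returns 7; calling it with elems == 0 would trip the assertion in a debug build.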


[Mesa-dev] [PATCH] radeonsi/nir: collect more accurate output_usagemask

2018-02-20 Thread Timothy Arceri

Fixes assert in glsl-1.50-gs-max-output-components piglit test.

Note that the double handling will only work for doubles that
don't take up multiple slots i.e. double and dvec2. However
dual slot double handling is an existing bug which is made no
worse by this patch.
---
 src/gallium/drivers/radeonsi/si_shader_nir.c | 56 +---
 1 file changed, 43 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c b/src/gallium/drivers/radeonsi/si_shader_nir.c
index 3294019cea..7b10410dd7 100644
--- a/src/gallium/drivers/radeonsi/si_shader_nir.c
+++ b/src/gallium/drivers/radeonsi/si_shader_nir.c
@@ -462,21 +462,35 @@ void si_nir_scan_shader(const struct nir_shader *nir,
}
 
i = variable->data.driver_location;
-   if (processed_outputs & ((uint64_t)1 << i))
-   continue;
-
-   processed_outputs |= ((uint64_t)1 << i);
-   num_outputs++;
-
-   info->output_semantic_name[i] = semantic_name;
-   info->output_semantic_index[i] = semantic_index;
-   info->output_usagemask[i] = TGSI_WRITEMASK_XYZW;
 
unsigned num_components = 4;
unsigned vector_elements = glsl_get_vector_elements(glsl_without_array(variable->type));
if (vector_elements)
num_components = vector_elements;
 
+   if (glsl_type_is_64bit(glsl_without_array(variable->type)))
+   num_components = MIN2(num_components * 2, 4);
+
+   ubyte usagemask = 0;
+   for (unsigned j = 0; j < num_components; j++) {
+   switch (j + variable->data.location_frac) {
+   case 0:
+   usagemask |= TGSI_WRITEMASK_X;
+   break;
+   case 1:
+   usagemask |= TGSI_WRITEMASK_Y;
+   break;
+   case 2:
+   usagemask |= TGSI_WRITEMASK_Z;
+   break;
+   case 3:
+   usagemask |= TGSI_WRITEMASK_W;
+   break;
+   default:
+   unreachable("error calculating component index");
+   }
+   }
+
unsigned gs_out_streams;
if (variable->data.stream & (1u << 31)) {
gs_out_streams = variable->data.stream & ~(1u << 31);
@@ -492,23 +506,39 @@ void si_nir_scan_shader(const struct nir_shader *nir,
unsigned streamz = (gs_out_streams >> 4) & 3;
unsigned streamw = (gs_out_streams >> 6) & 3;
 
-   if (info->output_usagemask[i] & TGSI_WRITEMASK_X) {
+   if (usagemask & TGSI_WRITEMASK_X) {
+   info->output_usagemask[i] |= TGSI_WRITEMASK_X;
info->output_streams[i] |= streamx;
info->num_stream_output_components[streamx]++;
}
-   if (info->output_usagemask[i] & TGSI_WRITEMASK_Y) {
+   if (usagemask & TGSI_WRITEMASK_Y) {
+   info->output_usagemask[i] |= TGSI_WRITEMASK_Y;
info->output_streams[i] |= streamy << 2;
info->num_stream_output_components[streamy]++;
}
-   if (info->output_usagemask[i] & TGSI_WRITEMASK_Z) {
+   if (usagemask & TGSI_WRITEMASK_Z) {
+   info->output_usagemask[i] |= TGSI_WRITEMASK_Z;
info->output_streams[i] |= streamz << 4;
info->num_stream_output_components[streamz]++;
}
-   if (info->output_usagemask[i] & TGSI_WRITEMASK_W) {
+   if (usagemask & TGSI_WRITEMASK_W) {
+   info->output_usagemask[i] |= TGSI_WRITEMASK_W;
info->output_streams[i] |= streamw << 6;
info->num_stream_output_components[streamw]++;
}
 
+   /* make sure we only count this location once against the
+* num_outputs counter.
+*/
+   if (processed_outputs & ((uint64_t)1 << i))
+   continue;
+
+   processed_outputs |= ((uint64_t)1 << i);
+   num_outputs++;
+
+   info->output_semantic_name[i] = semantic_name;
+   info->output_semantic_index[i] = semantic_index;
+
switch (semantic_name) {
case TGSI_SEMANTIC_PRIMID:
info->writes_primid = true;
-- 
2.14.3
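
The per-variable mask computed by the switch in this patch can be restated compactly. The following is a standalone sketch, not the radeonsi code: `output_usagemask` and the `SKETCH_WRITEMASK_*` constants are hypothetical, with the constants assumed to use the usual X=1, Y=2, Z=4, W=8 bit layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
   SKETCH_WRITEMASK_X = 1 << 0,
   SKETCH_WRITEMASK_Y = 1 << 1,
   SKETCH_WRITEMASK_Z = 1 << 2,
   SKETCH_WRITEMASK_W = 1 << 3,
};

/* Build the component mask for one output variable.
 * vector_elements == 0 means "not a vector or scalar", treated as 4 components.
 * 64-bit types take two components per element, capped at one slot (4).
 */
static uint8_t
output_usagemask(unsigned vector_elements, bool is_64bit,
                 unsigned location_frac)
{
   unsigned num_components = vector_elements ? vector_elements : 4;

   if (is_64bit) {
      num_components *= 2;
      if (num_components > 4)
         num_components = 4;
   }

   /* Corresponds to the unreachable() default case in the patch. */
   assert(location_frac + num_components <= 4);

   return (uint8_t)(((1u << num_components) - 1u) << location_frac);
}
```

A vec2 packed at component 1 yields Y|Z, so two variables sharing one location now OR their masks together instead of the first one claiming XYZW.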


[Mesa-dev] [PATCH] nvc0: fix writing query results into buffer

2018-02-20 Thread Ilia Mirkin

We need to mark the range as valid, and validate the resource using a
helper to ensure that the buffer status is marked properly.

Fixes some CTS pipeline stats query tests.

Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c
index 7568eeb94db..ef5f939319a 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_query_hw.c
@@ -473,10 +473,10 @@ nvc0_hw_get_query_result_resource(struct nvc0_context *nvc0,
PUSH_DATAh(push, buf->address + offset);
PUSH_DATA (push, buf->address + offset);
 
-   if (buf->mm) {
-  nouveau_fence_ref(nvc0->screen->base.fence.current, >fence);
-  nouveau_fence_ref(nvc0->screen->base.fence.current, >fence_wr);
-   }
+   util_range_add(>valid_buffer_range, offset,
+  offset + (result_type >= PIPE_QUERY_TYPE_I64 ? 8 : 4));
+
+   nvc0_resource_validate(buf, NOUVEAU_BO_WR);
 }
 
 static const struct nvc0_query_funcs hw_query_funcs = {
-- 
2.16.1
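
"Marking the range as valid" amounts to growing a [start, end) interval so that it covers the bytes the query result was written to. A rough model of that bookkeeping, loosely patterned on the idea behind util_range_add but with hypothetical names (`struct byte_range`, `range_set_empty`, `range_add`, `query_result_size`) and no locking:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Half-open interval [start, end) of bytes known to hold valid data. */
struct byte_range {
   unsigned start;
   unsigned end;
};

/* An empty range is encoded as start > end. */
static void
range_set_empty(struct byte_range *r)
{
   r->start = UINT_MAX;
   r->end = 0;
}

/* Grow the range so that it also covers [start, end). */
static void
range_add(struct byte_range *r, unsigned start, unsigned end)
{
   assert(start <= end);
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}

/* 64-bit result types occupy 8 bytes, everything else 4, as in the patch. */
static unsigned
query_result_size(bool is_64bit)
{
   return is_64bit ? 8u : 4u;
}
```

Writing a 64-bit result at offset 16 makes the valid range [16, 24); a later 32-bit result at offset 4 widens it to [4, 24).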


[Mesa-dev] [PATCH 16/17] intel/compiler: Disable Align16 tests on Gen11+

2018-02-20 Thread Matt Turner

Align16 is no more.
---
 src/intel/compiler/test_eu_validate.cpp | 16 
 1 file changed, 16 insertions(+)

diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp
index cb2fcd3d40f..f6c2b35625e 100644
--- a/src/intel/compiler/test_eu_validate.cpp
+++ b/src/intel/compiler/test_eu_validate.cpp
@@ -374,6 +374,10 @@ TEST_P(validation_test, dst_horizontal_stride_0)
 
clear_instructions(p);
 
+   /* Align16 does not exist on Gen11+ */
+   if (devinfo.gen >= 11)
+  return;
+
brw_set_default_access_mode(p, BRW_ALIGN_16);
 
brw_ADD(p, g0, g0, g0);
@@ -421,6 +425,10 @@ TEST_P(validation_test, must_not_cross_grf_boundary_in_a_width)
 /* Destination Horizontal must be 1 in Align16 */
 TEST_P(validation_test, dst_hstride_on_align16_must_be_1)
 {
+   /* Align16 does not exist on Gen11+ */
+   if (devinfo.gen >= 11)
+  return;
+
brw_set_default_access_mode(p, BRW_ALIGN_16);
 
brw_ADD(p, g0, g0, g0);
@@ -439,6 +447,10 @@ TEST_P(validation_test, dst_hstride_on_align16_must_be_1)
 /* VertStride must be 0 or 4 in Align16 */
 TEST_P(validation_test, vstride_on_align16_must_be_0_or_4)
 {
+   /* Align16 does not exist on Gen11+ */
+   if (devinfo.gen >= 11)
+  return;
+
const struct {
   enum brw_vertical_stride vstride;
   bool expected_result;
@@ -1419,6 +1431,10 @@ TEST_P(validation_test, align16_64_bit_integer)
if (devinfo.gen < 8)
   return;
 
+   /* Align16 does not exist on Gen11+ */
+   if (devinfo.gen >= 11)
+  return;
+
brw_set_default_access_mode(p, BRW_ALIGN_16);
 
for (unsigned i = 0; i < sizeof(inst) / sizeof(inst[0]); i++) {
-- 
2.16.1


[Mesa-dev] [PATCH 17/17] intel/compiler: Add ICL to test_eu_validate.cpp

2018-02-20 Thread Matt Turner

With the Align16 tests now disabled, we can run the rest of the tests in
ICL mode (and see them pass!)
---
 src/intel/compiler/test_eu_validate.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp
index f6c2b35625e..d987311ef84 100644
--- a/src/intel/compiler/test_eu_validate.cpp
+++ b/src/intel/compiler/test_eu_validate.cpp
@@ -56,6 +56,7 @@ static const struct gen_info {
{ "glk", 9, IS_GLK },
{ "cfl", 9, IS_CFL },
{ "cnl", 10 },
+   { "icl", 11 },
 };
 
 class validation_test: public ::testing::TestWithParam {
-- 
2.16.1


[Mesa-dev] [PATCH 08/17] intel/compiler/fs: Fix application of cmod and saturate to LINE/MAC pair

2018-02-20 Thread Matt Turner

---
 src/intel/compiler/brw_fs_generator.cpp | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index 0854709b272..f2bdac7d731 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -673,6 +673,7 @@ fs_generator::generate_linterp(fs_inst *inst,
struct brw_reg delta_x = src[0];
struct brw_reg delta_y = offset(src[0], inst->exec_size / 8);
struct brw_reg interp = src[1];
+   brw_inst *i[2];
 
if (devinfo->gen >= 11) {
   struct brw_reg acc = retype(brw_acc_reg(8), BRW_REGISTER_TYPE_NF);
@@ -727,11 +728,19 @@ fs_generator::generate_linterp(fs_inst *inst,
 
   return false;
} else {
-  brw_LINE(p, brw_null_reg(), interp, delta_x);
-  brw_MAC(p, dst, suboffset(interp, 1), delta_y);
-
-  return true;
+  i[0] = brw_LINE(p, brw_null_reg(), interp, delta_x);
+  i[1] = brw_MAC(p, dst, suboffset(interp, 1), delta_y);
}
+
+   brw_inst_set_cond_modifier(p->devinfo, i[1], inst->conditional_mod);
+
+   /* brw_set_default_saturate() is called before emitting instructions, so the
+* saturate bit is set in each instruction, so we need to unset it on the
+* first instruction.
+*/
+   brw_inst_set_saturate(p->devinfo, i[0], false);
+
+   return true;
 }
 
 void
-- 
2.16.1
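
Why saturate has to be cleared on the first instruction of the pair can be seen with plain arithmetic. The sketch below treats the LINE/MAC pair as "partial sum, then accumulate", which is an assumption made for illustration; `clampf`, `sat`, `pair_sat_last_only` and `pair_sat_both` are hypothetical names.

```c
#include <assert.h>

/* Clamp x to [lo, hi]. */
static float
clampf(float x, float lo, float hi)
{
   assert(lo <= hi);
   return x < lo ? lo : (x > hi ? hi : x);
}

/* Model of the saturate modifier: clamp to [0, 1]. */
static float
sat(float x)
{
   return clampf(x, 0.0f, 1.0f);
}

/* Intended behavior: only the final result of the pair is saturated. */
static float
pair_sat_last_only(float first_term, float second_term)
{
   return sat(first_term + second_term);
}

/* What happens if the saturate bit is left set on both instructions:
 * the intermediate value is clamped before the second term is added.
 */
static float
pair_sat_both(float first_term, float second_term)
{
   return sat(sat(first_term) + second_term);
}
```

With a first term of 1.5 and a second term of -1.0, saturating only the final result gives 0.5, while saturating both steps gives 0.0. The conditional modifier is similar in spirit: it should reflect the final value, so it belongs on the last instruction only.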


[Mesa-dev] [PATCH 13/17] intel/compiler: Lower flrp32 on Gen11+

2018-02-20 Thread Matt Turner

The LRP instruction is no more.
---
 src/intel/compiler/brw_compiler.c   | 35 +
 src/intel/compiler/brw_fs_builder.h |  2 +-
 src/intel/compiler/brw_fs_generator.cpp |  2 +-
 src/intel/compiler/brw_vec4_builder.h   |  2 +-
 src/intel/compiler/brw_vec4_visitor.cpp |  2 +-
 5 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/src/intel/compiler/brw_compiler.c b/src/intel/compiler/brw_compiler.c
index e515559acb6..b651ba14f1b 100644
--- a/src/intel/compiler/brw_compiler.c
+++ b/src/intel/compiler/brw_compiler.c
@@ -45,20 +45,28 @@
.use_interpolated_input_intrinsics = true, \
.vertex_id_zero_based = true
 
+#define COMMON_SCALAR_OPTIONS \
+   .lower_pack_half_2x16 = true,  \
+   .lower_pack_snorm_2x16 = true, \
+   .lower_pack_snorm_4x8 = true,  \
+   .lower_pack_unorm_2x16 = true, \
+   .lower_pack_unorm_4x8 = true,  \
+   .lower_unpack_half_2x16 = true,\
+   .lower_unpack_snorm_2x16 = true,   \
+   .lower_unpack_snorm_4x8 = true,\
+   .lower_unpack_unorm_2x16 = true,   \
+   .lower_unpack_unorm_4x8 = true,\
+   .max_unroll_iterations = 32
+
 static const struct nir_shader_compiler_options scalar_nir_options = {
COMMON_OPTIONS,
-   .lower_pack_half_2x16 = true,
-   .lower_pack_snorm_2x16 = true,
-   .lower_pack_snorm_4x8 = true,
-   .lower_pack_unorm_2x16 = true,
-   .lower_pack_unorm_4x8 = true,
-   .lower_unpack_half_2x16 = true,
-   .lower_unpack_snorm_2x16 = true,
-   .lower_unpack_snorm_4x8 = true,
-   .lower_unpack_unorm_2x16 = true,
-   .lower_unpack_unorm_4x8 = true,
-   .vs_inputs_dual_locations = true,
-   .max_unroll_iterations = 32,
+   COMMON_SCALAR_OPTIONS,
+};
+
+static const struct nir_shader_compiler_options scalar_nir_options_gen11 = {
+   COMMON_OPTIONS,
+   COMMON_SCALAR_OPTIONS,
+   .lower_flrp32 = true,
 };
 
 static const struct nir_shader_compiler_options vector_nir_options = {
@@ -148,7 +156,8 @@ brw_compiler_create(void *mem_ctx, const struct gen_device_info *devinfo)
   compiler->glsl_compiler_options[i].OptimizeForAOS = !is_scalar;
 
   if (is_scalar) {
- compiler->glsl_compiler_options[i].NirOptions = _nir_options;
+ compiler->glsl_compiler_options[i].NirOptions =
+devinfo->gen < 11 ? _nir_options : _nir_options_gen11;
   } else {
  compiler->glsl_compiler_options[i].NirOptions =
 devinfo->gen < 6 ? _nir_options : _nir_options_gen6;
diff --git a/src/intel/compiler/brw_fs_builder.h b/src/intel/compiler/brw_fs_builder.h
index 87394bc17b3..874272b7afd 100644
--- a/src/intel/compiler/brw_fs_builder.h
+++ b/src/intel/compiler/brw_fs_builder.h
@@ -540,7 +540,7 @@ namespace brw {
   LRP(const dst_reg , const src_reg , const src_reg ,
   const src_reg ) const
   {
- if (shader->devinfo->gen >= 6) {
+ if (shader->devinfo->gen >= 6 && shader->devinfo->gen <= 10) {
 /* The LRP instruction actually does op1 * op0 + op2 * (1 - op0), so
  * we need to reorder the operands.
  */
diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index ffc46972420..9817e317cb8 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1857,7 +1857,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
 break;
 
   case BRW_OPCODE_LRP:
- assert(devinfo->gen >= 6);
+ assert(devinfo->gen >= 6 && devinfo->gen <= 10);
  if (devinfo->gen < 10)
 brw_set_default_access_mode(p, BRW_ALIGN_16);
  brw_LRP(p, dst, src[0], src[1], src[2]);
diff --git a/src/intel/compiler/brw_vec4_builder.h b/src/intel/compiler/brw_vec4_builder.h
index 4c3efe8457b..5c880c19f52 100644
--- a/src/intel/compiler/brw_vec4_builder.h
+++ b/src/intel/compiler/brw_vec4_builder.h
@@ -501,7 +501,7 @@ namespace brw {
   LRP(const dst_reg , const src_reg , const src_reg ,
   const src_reg ) const
   {
- if (shader->devinfo->gen >= 6) {
+ if (shader->devinfo->gen >= 6 && shader->devinfo->gen <= 10) {
 /* The LRP instruction actually does op1 * op0 + op2 * (1 - op0), so
  * we need to reorder the operands.
  */
diff --git a/src/intel/compiler/brw_vec4_visitor.cpp b/src/intel/compiler/brw_vec4_visitor.cpp
index 53f6a5ed546..e683a8c51db 100644
--- a/src/intel/compiler/brw_vec4_visitor.cpp
+++ b/src/intel/compiler/brw_vec4_visitor.cpp
@@
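
For reference, what "lowering flrp" means arithmetically: the interpolation is expanded into multiplies and adds, and the hardware LRP takes its operands in a different order (per the comment in the builder hunks above). This is a sketch with hypothetical function names, and the expansion shown is just one algebraically equivalent form, not necessarily the exact sequence NIR emits.

```c
#include <assert.h>

/* flrp(x, y, a): linear interpolation, expanded into mul/add. */
static float
flrp_lowered(float x, float y, float a)
{
   return x * (1.0f - a) + y * a;
}

/* Operand order quoted in the patch comment:
 * op1 * op0 + op2 * (1 - op0).
 */
static float
hw_lrp(float op0, float op1, float op2)
{
   return op1 * op0 + op2 * (1.0f - op0);
}
```

So flrp(x, y, a) corresponds to hw_lrp(a, y, x), which is the reordering the builder performs on Gen6 through Gen10; on Gen11+ only the expanded form is available.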

[Mesa-dev] [PATCH 12/17] intel/compiler/fs: Implement ddy without using align16 for Gen11+

2018-02-20 Thread Matt Turner

Align16 is no more. We previously generated an align16 ADD instruction
to calculate DDY:

   add(8) g11<1>F  -g10<4>.xyxyF  g10<4>.zwzwF  { align16 1Q };

Without align16, we now implement it as two align1 instructions:

   add(4) g11<2>F   -g10<4,2,0>Fg10.2<4,2,0>F  { align1 1N };
   add(4) g11.1<2>F -g10.1<4,2,0>F  g10.3<4,2,0>F  { align1 1N };
---
 src/intel/compiler/brw_fs_generator.cpp | 70 ++---
 1 file changed, 56 insertions(+), 14 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index 013d2c820a0..ffc46972420 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1192,23 +1192,65 @@ fs_generator::generate_ddy(const fs_inst *inst,
 {
if (inst->opcode == FS_OPCODE_DDY_FINE) {
   /* produce accurate derivatives */
-  struct brw_reg src0 = src;
-  struct brw_reg src1 = src;
+  if (devinfo->gen >= 11) {
+ struct brw_reg x = src;
+ struct brw_reg y = src;
+ struct brw_reg z = src;
+ struct brw_reg w = src;
+ struct brw_reg dst_e = dst;
+ struct brw_reg dst_o = dst;
+
+ x.vstride = BRW_VERTICAL_STRIDE_4;
+ y.vstride = BRW_VERTICAL_STRIDE_4;
+ z.vstride = BRW_VERTICAL_STRIDE_4;
+ w.vstride = BRW_VERTICAL_STRIDE_4;
+
+ x.width = BRW_WIDTH_2;
+ y.width = BRW_WIDTH_2;
+ z.width = BRW_WIDTH_2;
+ w.width = BRW_WIDTH_2;
+
+ x.hstride = BRW_HORIZONTAL_STRIDE_0;
+ y.hstride = BRW_HORIZONTAL_STRIDE_0;
+ z.hstride = BRW_HORIZONTAL_STRIDE_0;
+ w.hstride = BRW_HORIZONTAL_STRIDE_0;
+
+ x.subnr = 0 * sizeof(float);
+ y.subnr = 1 * sizeof(float);
+ z.subnr = 2 * sizeof(float);
+ w.subnr = 3 * sizeof(float);
+
+ dst_e.hstride = BRW_HORIZONTAL_STRIDE_2;
+ dst_o.hstride = BRW_HORIZONTAL_STRIDE_2;
+ dst_o.subnr = sizeof(float);
 
-  src0.swizzle = BRW_SWIZZLE_XYXY;
-  src0.vstride = BRW_VERTICAL_STRIDE_4;
-  src0.width   = BRW_WIDTH_4;
-  src0.hstride = BRW_HORIZONTAL_STRIDE_1;
+ brw_push_insn_state(p);
+ if (inst->exec_size == 8)
+brw_set_default_exec_size(p, BRW_EXECUTE_4);
+ else
+brw_set_default_exec_size(p, BRW_EXECUTE_8);
+ brw_ADD(p, dst_e, negate(x), z);
+ brw_ADD(p, dst_o, negate(y), w);
+ brw_pop_insn_state(p);
+  } else {
+ struct brw_reg src0 = src;
+ struct brw_reg src1 = src;
 
-  src1.swizzle = BRW_SWIZZLE_ZWZW;
-  src1.vstride = BRW_VERTICAL_STRIDE_4;
-  src1.width   = BRW_WIDTH_4;
-  src1.hstride = BRW_HORIZONTAL_STRIDE_1;
+ src0.swizzle = BRW_SWIZZLE_XYXY;
+ src0.vstride = BRW_VERTICAL_STRIDE_4;
+ src0.width   = BRW_WIDTH_4;
+ src0.hstride = BRW_HORIZONTAL_STRIDE_1;
 
-  brw_push_insn_state(p);
-  brw_set_default_access_mode(p, BRW_ALIGN_16);
-  brw_ADD(p, dst, negate(src0), src1);
-  brw_pop_insn_state(p);
+ src1.swizzle = BRW_SWIZZLE_ZWZW;
+ src1.vstride = BRW_VERTICAL_STRIDE_4;
+ src1.width   = BRW_WIDTH_4;
+ src1.hstride = BRW_HORIZONTAL_STRIDE_1;
+
+ brw_push_insn_state(p);
+ brw_set_default_access_mode(p, BRW_ALIGN_16);
+ brw_ADD(p, dst, negate(src0), src1);
+ brw_pop_insn_state(p);
+  }
} else {
   /* replicate the derivative at the top-left pixel to other pixels */
   struct brw_reg src0 = src;
-- 
2.16.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 07/17] intel/compiler/fs: Return multiple_instructions_emitted from generate_linterp

2018-02-20 Thread Matt Turner

If multiple instructions are emitted, special handling of things like
conditional mod, saturate, and NoDDClr/NoDDChk needs to be performed.

I noticed that conditional mods were misapplied when adding support for
Gen11 (in the previous patch). The next patch fixes the same bug in the
Gen4 LINE/MAC case, though I was not able to trigger it.
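
To illustrate why the caller has to know about this, here is a toy model
(not the Mesa code; struct toy_inst, toy_emit_mad_pair() and the field
names are made up for this sketch). It assumes one IR instruction expands
into a pair of hardware instructions, and that the conditional modifier
and saturate belong only on the final instruction of the pair:

```c
#include <stdbool.h>

/* Toy stand-in for an encoded hardware instruction (hypothetical). */
struct toy_inst {
   int  cond_mod;   /* 0 means "no conditional modifier" */
   bool saturate;
};

/* Pretend to emit a two-instruction sequence for one IR instruction.
 * Both entries first receive the default saturate state, as they would
 * after a "set default saturate" style call. The intermediate result
 * then has saturate cleared and carries no conditional modifier.
 *
 * Returns true to tell the caller that more than one instruction was
 * emitted, so generic single-instruction fix-ups must be skipped.
 */
bool
toy_emit_mad_pair(struct toy_inst out[2], int cond_mod, bool saturate)
{
   out[0].cond_mod = 0;
   out[0].saturate = saturate;
   out[1].cond_mod = 0;
   out[1].saturate = saturate;

   out[0].saturate = false;      /* intermediate value: never clamp */
   out[1].cond_mod = cond_mod;   /* flags come from the final result */

   return true;
}
```

A caller would store the return value and, when it is true, leave the
per-instruction state exactly as the helper set it.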
---
 src/intel/compiler/brw_fs.h |  2 +-
 src/intel/compiler/brw_fs_generator.cpp | 12 +---
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
index 63373580ee4..37106ccb284 100644
--- a/src/intel/compiler/brw_fs.h
+++ b/src/intel/compiler/brw_fs.h
@@ -409,7 +409,7 @@ private:
void generate_urb_write(fs_inst *inst, struct brw_reg payload);
void generate_cs_terminate(fs_inst *inst, struct brw_reg payload);
void generate_barrier(fs_inst *inst, struct brw_reg src);
-   void generate_linterp(fs_inst *inst, struct brw_reg dst,
+   bool generate_linterp(fs_inst *inst, struct brw_reg dst,
 struct brw_reg *src);
void generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src,
  struct brw_reg surface_index,
diff --git a/src/intel/compiler/brw_fs_generator.cpp 
b/src/intel/compiler/brw_fs_generator.cpp
index 54869bc3ebc..0854709b272 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -646,9 +646,9 @@ fs_generator::generate_barrier(fs_inst *inst, struct 
brw_reg src)
brw_WAIT(p);
 }
 
-void
+bool
 fs_generator::generate_linterp(fs_inst *inst,
-struct brw_reg dst, struct brw_reg *src)
+   struct brw_reg dst, struct brw_reg *src)
 {
/* PLN reads:
 *  /   in SIMD16   \
@@ -719,12 +719,18 @@ fs_generator::generate_linterp(fs_inst *inst,
  brw_inst_set_saturate(p->devinfo, i[0], false);
  brw_inst_set_saturate(p->devinfo, i[2], false);
   }
+
+  return true;
} else if (devinfo->has_pln &&
   (devinfo->gen >= 7 || (delta_x.nr & 1) == 0)) {
   brw_PLN(p, dst, interp, delta_x);
+
+  return false;
} else {
   brw_LINE(p, brw_null_reg(), interp, delta_x);
   brw_MAC(p, dst, suboffset(interp, 1), delta_y);
+
+  return true;
}
 }
 
@@ -1999,7 +2005,7 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
 brw_MOV(p, dst, src[0]);
 break;
   case FS_OPCODE_LINTERP:
-generate_linterp(inst, dst, src);
+multiple_instructions_emitted = generate_linterp(inst, dst, src);
 break;
   case FS_OPCODE_PIXEL_X:
  assert(src[0].type == BRW_REGISTER_TYPE_UW);
-- 
2.16.1


[Mesa-dev] [PATCH 11/17] intel/compiler/fs: Simplify ddx/ddy code generation

2018-02-20 Thread Matt Turner

The brw_reg() constructor just obfuscates things here, in my opinion.
---
 src/intel/compiler/brw_fs_generator.cpp | 77 +++--
 1 file changed, 35 insertions(+), 42 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp 
b/src/intel/compiler/brw_fs_generator.cpp
index e5a5a76a932..013d2c820a0 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1168,20 +1168,17 @@ fs_generator::generate_ddx(const fs_inst *inst,
   width = BRW_WIDTH_4;
}
 
-   struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
- src.negate, src.abs,
-BRW_REGISTER_TYPE_F,
-vstride,
-width,
-BRW_HORIZONTAL_STRIDE_0,
-BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
-   struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
- src.negate, src.abs,
-BRW_REGISTER_TYPE_F,
-vstride,
-width,
-BRW_HORIZONTAL_STRIDE_0,
-BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
+   struct brw_reg src0 = src;
+   struct brw_reg src1 = src;
+
+   src0.subnr   = sizeof(float);
+   src0.vstride = vstride;
+   src0.width   = width;
+   src0.hstride = BRW_HORIZONTAL_STRIDE_0;
+   src1.vstride = vstride;
+   src1.width   = width;
+   src1.hstride = BRW_HORIZONTAL_STRIDE_0;
+
brw_ADD(p, dst, src0, negate(src1));
 }
 
@@ -1195,40 +1192,36 @@ fs_generator::generate_ddy(const fs_inst *inst,
 {
if (inst->opcode == FS_OPCODE_DDY_FINE) {
   /* produce accurate derivatives */
-  struct brw_reg src0 = brw_reg(src.file, src.nr, 0,
-src.negate, src.abs,
-BRW_REGISTER_TYPE_F,
-BRW_VERTICAL_STRIDE_4,
-BRW_WIDTH_4,
-BRW_HORIZONTAL_STRIDE_1,
-BRW_SWIZZLE_XYXY, WRITEMASK_XYZW);
-  struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
-src.negate, src.abs,
-BRW_REGISTER_TYPE_F,
-BRW_VERTICAL_STRIDE_4,
-BRW_WIDTH_4,
-BRW_HORIZONTAL_STRIDE_1,
-BRW_SWIZZLE_ZWZW, WRITEMASK_XYZW);
+  struct brw_reg src0 = src;
+  struct brw_reg src1 = src;
+
+  src0.swizzle = BRW_SWIZZLE_XYXY;
+  src0.vstride = BRW_VERTICAL_STRIDE_4;
+  src0.width   = BRW_WIDTH_4;
+  src0.hstride = BRW_HORIZONTAL_STRIDE_1;
+
+  src1.swizzle = BRW_SWIZZLE_ZWZW;
+  src1.vstride = BRW_VERTICAL_STRIDE_4;
+  src1.width   = BRW_WIDTH_4;
+  src1.hstride = BRW_HORIZONTAL_STRIDE_1;
+
   brw_push_insn_state(p);
   brw_set_default_access_mode(p, BRW_ALIGN_16);
   brw_ADD(p, dst, negate(src0), src1);
   brw_pop_insn_state(p);
} else {
   /* replicate the derivative at the top-left pixel to other pixels */
-  struct brw_reg src0 = brw_reg(src.file, src.nr, 0,
-src.negate, src.abs,
-BRW_REGISTER_TYPE_F,
-BRW_VERTICAL_STRIDE_4,
-BRW_WIDTH_4,
-BRW_HORIZONTAL_STRIDE_0,
-BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
-  struct brw_reg src1 = brw_reg(src.file, src.nr, 2,
-src.negate, src.abs,
-BRW_REGISTER_TYPE_F,
-BRW_VERTICAL_STRIDE_4,
-BRW_WIDTH_4,
-BRW_HORIZONTAL_STRIDE_0,
-BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
+  struct brw_reg src0 = src;
+  struct brw_reg src1 = src;
+
+  src0.vstride = BRW_VERTICAL_STRIDE_4;
+  src0.width   = BRW_WIDTH_4;
+  src0.hstride = BRW_HORIZONTAL_STRIDE_0;
+  src1.vstride = BRW_VERTICAL_STRIDE_4;
+  src1.width   = BRW_WIDTH_4;
+  src1.hstride = BRW_HORIZONTAL_STRIDE_0;
+  src1.subnr   = 2 * sizeof(float);
+
   brw_ADD(p, dst, negate(src0), src1);
}
 }
-- 
2.16.1


[Mesa-dev] [PATCH 14/17] intel/compiler: Mark line, pln, and lrp as removed on Gen11+

2018-02-20 Thread Matt Turner

---
 src/intel/compiler/brw_eu.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_eu.c b/src/intel/compiler/brw_eu.c
index bc297a21b32..3646076a8e8 100644
--- a/src/intel/compiler/brw_eu.c
+++ b/src/intel/compiler/brw_eu.c
@@ -384,7 +384,8 @@ enum gen {
GEN75 = (1 << 5),
GEN8  = (1 << 6),
GEN9  = (1 << 7),
-   GEN10  = (1 << 8),
+   GEN10 = (1 << 8),
+   GEN11 = (1 << 9),
GEN_ALL = ~0
 };
 
@@ -628,16 +629,16 @@ static const struct opcode_desc opcode_descs[128] = {
},
/* Reserved 88 */
[BRW_OPCODE_LINE] = {
-  .name = "line",.nsrc = 2, .ndst = 1, .gens = GEN_ALL,
+  .name = "line",.nsrc = 2, .ndst = 1, .gens = GEN_LE(GEN10),
},
[BRW_OPCODE_PLN] = {
-  .name = "pln", .nsrc = 2, .ndst = 1, .gens = GEN_GE(GEN45),
+  .name = "pln", .nsrc = 2, .ndst = 1, .gens = GEN_GE(GEN45) & 
GEN_LE(GEN10),
},
[BRW_OPCODE_MAD] = {
   .name = "mad", .nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN6),
},
[BRW_OPCODE_LRP] = {
-  .name = "lrp", .nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN6),
+  .name = "lrp", .nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN6) & 
GEN_LE(GEN10),
},
[93] = {
   .name = "madm",.nsrc = 3, .ndst = 1, .gens = GEN_GE(GEN8),
@@ -662,6 +663,7 @@ gen_from_devinfo(const struct gen_device_info *devinfo)
case 8: return GEN8;
case 9: return GEN9;
case 10: return GEN10;
+   case 11: return GEN11;
default:
   unreachable("not reached");
}
-- 
2.16.1


[Mesa-dev] [PATCH 10/17] intel/compiler/fs: Pass fs_inst to generate_ddx/ddy instead of opcode

2018-02-20 Thread Matt Turner

In a future patch, generate_ddy will want to inspect inst->exec_size.
Change generate_ddx as well for consistency.
---
 src/intel/compiler/brw_fs.h |  6 --
 src/intel/compiler/brw_fs_generator.cpp | 12 ++--
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
index 37106ccb284..76ad76e08b7 100644
--- a/src/intel/compiler/brw_fs.h
+++ b/src/intel/compiler/brw_fs.h
@@ -417,8 +417,10 @@ private:
void generate_get_buffer_size(fs_inst *inst, struct brw_reg dst,
  struct brw_reg src,
  struct brw_reg surf_index);
-   void generate_ddx(enum opcode op, struct brw_reg dst, struct brw_reg src);
-   void generate_ddy(enum opcode op, struct brw_reg dst, struct brw_reg src);
+   void generate_ddx(const fs_inst *inst,
+ struct brw_reg dst, struct brw_reg src);
+   void generate_ddy(const fs_inst *inst,
+ struct brw_reg dst, struct brw_reg src);
void generate_scratch_write(fs_inst *inst, struct brw_reg src);
void generate_scratch_read(fs_inst *inst, struct brw_reg dst);
void generate_scratch_read_gen7(fs_inst *inst, struct brw_reg dst);
diff --git a/src/intel/compiler/brw_fs_generator.cpp 
b/src/intel/compiler/brw_fs_generator.cpp
index f2bdac7d731..e5a5a76a932 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1153,12 +1153,12 @@ fs_generator::generate_tex(fs_inst *inst, struct 
brw_reg dst, struct brw_reg src
  * appropriate swizzling.
  */
 void
-fs_generator::generate_ddx(enum opcode opcode,
+fs_generator::generate_ddx(const fs_inst *inst,
struct brw_reg dst, struct brw_reg src)
 {
unsigned vstride, width;
 
-   if (opcode == FS_OPCODE_DDX_FINE) {
+   if (inst->opcode == FS_OPCODE_DDX_FINE) {
   /* produce accurate derivatives */
   vstride = BRW_VERTICAL_STRIDE_2;
   width = BRW_WIDTH_2;
@@ -1190,10 +1190,10 @@ fs_generator::generate_ddx(enum opcode opcode,
  * left.
  */
 void
-fs_generator::generate_ddy(enum opcode opcode,
+fs_generator::generate_ddy(const fs_inst *inst,
struct brw_reg dst, struct brw_reg src)
 {
-   if (opcode == FS_OPCODE_DDY_FINE) {
+   if (inst->opcode == FS_OPCODE_DDY_FINE) {
   /* produce accurate derivatives */
   struct brw_reg src0 = brw_reg(src.file, src.nr, 0,
 src.negate, src.abs,
@@ -2049,11 +2049,11 @@ fs_generator::generate_code(const cfg_t *cfg, int 
dispatch_width)
 break;
   case FS_OPCODE_DDX_COARSE:
   case FS_OPCODE_DDX_FINE:
- generate_ddx(inst->opcode, dst, src[0]);
+ generate_ddx(inst, dst, src[0]);
  break;
   case FS_OPCODE_DDY_COARSE:
   case FS_OPCODE_DDY_FINE:
- generate_ddy(inst->opcode, dst, src[0]);
+ generate_ddy(inst, dst, src[0]);
 break;
 
   case SHADER_OPCODE_GEN4_SCRATCH_WRITE:
-- 
2.16.1


[Mesa-dev] [PATCH 09/17] intel/compiler/fs: Don't generate integer DWord multiply on Gen11

2018-02-20 Thread Matt Turner

Like CHV et al., Gen11 does not support 32x32 -> 32/64-bit integer
multiplies.
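
For illustration only, the arithmetic behind lowering such a multiply
can be sketched in plain C: a 32x32 -> 32-bit product built from two
multiplies whose second operand is only 16 bits wide. The name
toy_mul32_lowered() is made up, and this is not the code in
lower_integer_multiplication(); it only shows the identity that kind of
lowering relies on.

```c
#include <stdint.h>

/* a * b (mod 2^32), using only products where the second factor fits
 * in 16 bits:  a * b = a * b_lo + ((a * b_hi) << 16).
 */
uint32_t
toy_mul32_lowered(uint32_t a, uint32_t b)
{
   uint32_t lo = a * (b & 0xffffu);   /* a times the low 16 bits of b  */
   uint32_t hi = a * (b >> 16);       /* a times the high 16 bits of b */

   return lo + (hi << 16);            /* unsigned math wraps mod 2^32  */
}
```

Platforms with has_integer_dword_mul set skip this kind of split and
issue a single MUL instead.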
---
 src/intel/common/gen_device_info.c | 4 
 src/intel/common/gen_device_info.h | 1 +
 src/intel/compiler/brw_fs.cpp  | 6 +-
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/src/intel/common/gen_device_info.c 
b/src/intel/common/gen_device_info.c
index 465d4c783a1..c4b78e032a3 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -323,6 +323,7 @@ static const struct gen_device_info gen_device_info_hsw_gt3 
= {
.has_llc = true, \
.has_sample_with_hiz = false,\
.has_pln = true, \
+   .has_integer_dword_mul = true,   \
.has_64bit_types = true, \
.supports_simd16_3src = true,\
.has_surface_tile_offset = true, \
@@ -405,6 +406,7 @@ static const struct gen_device_info gen_device_info_bdw_gt3 
= {
 static const struct gen_device_info gen_device_info_chv = {
GEN8_FEATURES, .is_cherryview = 1, .gt = 1,
.has_llc = false,
+   .has_integer_dword_mul = false,
.num_slices = 1,
.num_subslices = { 2, },
.num_thread_per_eu = 7,
@@ -455,6 +457,7 @@ static const struct gen_device_info gen_device_info_chv = {
 #define GEN9_LP_FEATURES   \
GEN8_FEATURES,  \
GEN9_HW_INFO,   \
+   .has_integer_dword_mul = false, \
.gt = 1,\
.has_llc = false,   \
.has_sample_with_hiz = true,\
@@ -759,6 +762,7 @@ static const struct gen_device_info gen_device_info_cnl_5x8 
= {
GEN8_FEATURES,   \
GEN11_HW_INFO,   \
.has_64bit_types = false,\
+   .has_integer_dword_mul = false,  \
.gt = _gt, .num_slices = _slices, .l3_banks = _l3
 
 static const struct gen_device_info gen_device_info_icl_8x8 = {
diff --git a/src/intel/common/gen_device_info.h 
b/src/intel/common/gen_device_info.h
index 7761eeba7e0..edd910faee7 100644
--- a/src/intel/common/gen_device_info.h
+++ b/src/intel/common/gen_device_info.h
@@ -60,6 +60,7 @@ struct gen_device_info
 
bool has_pln;
bool has_64bit_types;
+   bool has_integer_dword_mul;
bool has_compr4;
bool has_surface_tile_offset;
bool supports_simd16_3src;
diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 6fb46e7374c..3b61fe9178c 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -3549,11 +3549,7 @@ fs_visitor::lower_integer_multiplication()
   inst->dst.type != BRW_REGISTER_TYPE_UD))
 continue;
 
- /* Gen8's MUL instruction can do a 32-bit x 32-bit -> 32-bit
-  * operation directly, but CHV/BXT cannot.
-  */
- if (devinfo->gen >= 8 &&
- !devinfo->is_cherryview && !gen_device_info_is_9lp(devinfo))
+ if (devinfo->has_integer_dword_mul)
 continue;
 
  if (inst->src[1].file == IMM &&
-- 
2.16.1


[Mesa-dev] [PATCH 15/17] intel/compiler: Add instruction compaction support on Gen11

2018-02-20 Thread Matt Turner

Gen11 only differs from SKL+ in that it uses a new datatype index table.
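
As a rough sketch of what "uses a new datatype index table" means in
practice: compaction replaces a group of instruction bits with a small
index into a per-generation table, and can only do so when the exact bit
pattern is present in that table. Everything named toy_* below is
hypothetical, and the table contents are invented placeholders rather
than real encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Invented placeholder tables; the real ones have 32 entries each. */
static const uint32_t toy_gen8_datatype_table[]  = { 0x05, 0x44, 0x45 };
static const uint32_t toy_gen11_datatype_table[] = { 0x05, 0x4d, 0x55 };

/* Return the index of `bits` in `table`, or -1 if it is absent, in which
 * case the instruction would have to stay uncompacted.
 */
int
toy_find_index(const uint32_t *table, size_t count, uint32_t bits)
{
   for (size_t i = 0; i < count; i++) {
      if (table[i] == bits)
         return (int)i;
   }
   return -1;
}

/* Select the table by generation, in the spirit of the switch in
 * brw_init_compaction_tables().
 */
int
toy_datatype_index(int gen, uint32_t bits)
{
   if (gen >= 11)
      return toy_find_index(toy_gen11_datatype_table, 3, bits);

   return toy_find_index(toy_gen8_datatype_table, 3, bits);
}
```

The same bit pattern can therefore be compactable on one generation and
not on another, which is why Gen11 needs its own case.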
---
 src/intel/compiler/brw_eu_compact.c | 42 +
 1 file changed, 42 insertions(+)

diff --git a/src/intel/compiler/brw_eu_compact.c 
b/src/intel/compiler/brw_eu_compact.c
index 8d33e2adffc..ae14ef10ec0 100644
--- a/src/intel/compiler/brw_eu_compact.c
+++ b/src/intel/compiler/brw_eu_compact.c
@@ -637,6 +637,41 @@ static const uint16_t gen8_src_index_table[32] = {
0b010110001000
 };
 
+static const uint32_t gen11_datatype_table[32] = {
+   0b00101,
+   0b001000100,
+   0b001000101,
+   0b001001101,
+   0b0010101100101,
+   0b0010010100101,
+   0b0010010010101,
+   0b00100100101000101,
+   0b00100100101100101,
+   0b001010101,
+   0b001110100,
+   0b001110101,
+   0b001000101000101000101,
+   0b001000111000101000100,
+   0b001000111000101000101,
+   0b001100100100101100101,
+   0b001100101100100100101,
+   0b001100101100101100100,
+   0b001100101100101100101,
+   0b00110000101100100,
+   0b001001100,
+   0b0010001100101,
+   0b0010101000101,
+   0b001010100,
+   0b001000101000101000100,
+   0b00100011100010100,
+   0b00100100100101001,
+   0b00110100101100101,
+   0b00110000101100101,
+   0b00100001101001100,
+   0b001001001001001001000,
+   0b001001011001001001000,
+};
+
 /* This is actually the control index table for Cherryview (26 bits), but the
  * only difference from Broadwell (24 bits) is that it has two extra 0-bits at
  * the start.
@@ -1450,8 +1485,15 @@ brw_init_compaction_tables(const struct gen_device_info 
*devinfo)
assert(gen8_datatype_table[ARRAY_SIZE(gen8_datatype_table) - 1] != 0);
assert(gen8_subreg_table[ARRAY_SIZE(gen8_subreg_table) - 1] != 0);
assert(gen8_src_index_table[ARRAY_SIZE(gen8_src_index_table) - 1] != 0);
+   assert(gen11_datatype_table[ARRAY_SIZE(gen11_datatype_table) - 1] != 0);
 
switch (devinfo->gen) {
+   case 11:
+  control_index_table = gen8_control_index_table;
+  datatype_table = gen11_datatype_table;
+  subreg_table = gen8_subreg_table;
+  src_index_table = gen8_src_index_table;
+  break;
case 10:
case 9:
case 8:
-- 
2.16.1


[Mesa-dev] [PATCH 06/17] intel/compiler/fs: Implement FS_OPCODE_LINTERP with MADs on Gen11+

2018-02-20 Thread Matt Turner

The PLN instruction is no more. Its functionality is now implemented
using two MAD instructions with the new native-float type. Instead of

   pln(16) r20.0<1>:F r10.4<0;1,0>:F r4.0<8;8,1>:F

we now have

   mad(8) acc0<1>:NF r10.7<0;1,0>:F r4.0<8;8,1>:F r10.4<0;1,0>:F
   mad(8) r20.0<1>:F acc0<8;8,1>:NF r5.0<8;8,1>:F r10.5<0;1,0>:F
   mad(8) acc0<1>:NF r10.7<0;1,0>:F r6.0<8;8,1>:F r10.4<0;1,0>:F
   mad(8) r21.0<1>:F acc0<8;8,1>:NF r7.0<8;8,1>:F r10.5<0;1,0>:F

... and in the case of SIMD8 only the first pair of MAD instructions is
used.
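
Per channel, each pair of MADs above computes R + x*P + y*Q, where P, Q
and R are the interpolation coefficients (r10.4, r10.5 and r10.7 in the
example) and x, y are the deltas. A scalar C sketch of that, with a
made-up function name and with double standing in for the wider :NF
accumulator (an assumption for illustration, not a statement about the
hardware format):

```c
/* One channel of plane interpolation done as two multiply-adds.
 * MAD computes src0 + src1 * src2.
 */
float
toy_linterp_two_mads(float P, float Q, float R, float x, float y)
{
   /* mad acc0:NF, R, x, P   -> intermediate kept at higher precision */
   double acc = (double)R + (double)x * (double)P;

   /* mad dst:F, acc0:NF, y, Q -> final result rounded to float */
   return (float)(acc + (double)y * (double)Q);
}
```

In SIMD16 the same pair is simply issued twice, once per group of eight
channels.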
---
 src/intel/compiler/brw_eu_emit.c|  2 +-
 src/intel/compiler/brw_fs_generator.cpp | 49 +++--
 2 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index ec871e5aa75..a96fe43556e 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -968,7 +968,7 @@ ALU2(DP4)
 ALU2(DPH)
 ALU2(DP3)
 ALU2(DP2)
-ALU3F(MAD)
+ALU3(MAD)
 ALU3F(LRP)
 ALU1(BFREV)
 ALU3(BFE)
diff --git a/src/intel/compiler/brw_fs_generator.cpp 
b/src/intel/compiler/brw_fs_generator.cpp
index cd5be054f69..54869bc3ebc 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -674,8 +674,53 @@ fs_generator::generate_linterp(fs_inst *inst,
struct brw_reg delta_y = offset(src[0], inst->exec_size / 8);
struct brw_reg interp = src[1];
 
-   if (devinfo->has_pln &&
-   (devinfo->gen >= 7 || (delta_x.nr & 1) == 0)) {
+   if (devinfo->gen >= 11) {
+  struct brw_reg acc = retype(brw_acc_reg(8), BRW_REGISTER_TYPE_NF);
+  struct brw_reg dwP = suboffset(interp, 0);
+  struct brw_reg dwQ = suboffset(interp, 1);
+  struct brw_reg dwR = suboffset(interp, 3);
+
+  brw_set_default_access_mode(p, BRW_ALIGN_1);
+  brw_set_default_exec_size(p, BRW_EXECUTE_8);
+
+  if (inst->exec_size == 8) {
+ brw_inst *i[2];
+
+ i[0] = brw_MAD(p,acc, dwR, offset(delta_x, 0), dwP);
+ i[1] = brw_MAD(p, offset(dst, 0), acc, offset(delta_y, 0), dwQ);
+
+ brw_inst_set_cond_modifier(p->devinfo, i[1], inst->conditional_mod);
+
+ /* brw_set_default_saturate() is called before emitting instructions,
+  * so the saturate bit is set in each instruction, so we need to unset
+  * it on the first instruction of each pair.
+  */
+ brw_inst_set_saturate(p->devinfo, i[0], false);
+  } else {
+ brw_inst *i[4];
+
+ brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
+ i[0] = brw_MAD(p,acc, dwR, offset(delta_x, 0), dwP);
+ i[1] = brw_MAD(p, offset(dst, 0), acc, offset(delta_x, 1), dwQ);
+
+ brw_set_default_compression_control(p, BRW_COMPRESSION_2NDHALF);
+ i[2] = brw_MAD(p,acc, dwR, offset(delta_y, 0), dwP);
+ i[3] = brw_MAD(p, offset(dst, 1), acc, offset(delta_y, 1), dwQ);
+
+ brw_set_default_compression_control(p, BRW_COMPRESSION_COMPRESSED);
+
+ brw_inst_set_cond_modifier(p->devinfo, i[1], inst->conditional_mod);
+ brw_inst_set_cond_modifier(p->devinfo, i[3], inst->conditional_mod);
+
+ /* brw_set_default_saturate() is called before emitting instructions,
+  * so the saturate bit is set in each instruction, so we need to unset
+  * it on the first instruction of each pair.
+  */
+ brw_inst_set_saturate(p->devinfo, i[0], false);
+ brw_inst_set_saturate(p->devinfo, i[2], false);
+  }
+   } else if (devinfo->has_pln &&
+  (devinfo->gen >= 7 || (delta_x.nr & 1) == 0)) {
   brw_PLN(p, dst, interp, delta_x);
} else {
   brw_LINE(p, brw_null_reg(), interp, delta_x);
-- 
2.16.1


[Mesa-dev] [PATCH 00/17] intel/compiler: Ice Lake support

2018-02-20 Thread Matt Turner

[PATCH 01/17] intel: Add a preliminary device for Ice Lake
[PATCH 02/17] intel: Add icl pci id for INTEL_DEVID_OVERRIDE
[PATCH 03/17] intel: Disable 64-bit extensions on platforms without
[PATCH 04/17] intel/compiler: Add Gen11 register types
[PATCH 05/17] intel/compiler: Add Gen11+ native float type
[PATCH 06/17] intel/compiler/fs: Implement FS_OPCODE_LINTERP with
[PATCH 07/17] intel/compiler/fs: Return multiple_instructions_emitted
[PATCH 08/17] intel/compiler/fs: Fix application of cmod and saturate
[PATCH 09/17] intel/compiler/fs: Don't generate integer DWord
[PATCH 10/17] intel/compiler/fs: Pass fs_inst to generate_ddx/ddy
[PATCH 11/17] intel/compiler/fs: Simplify ddx/ddy code generation
[PATCH 12/17] intel/compiler/fs: Implement ddy without using align16
[PATCH 13/17] intel/compiler: Lower flrp32 on Gen11+
[PATCH 14/17] intel/compiler: Mark line, pln, and lrp as removed on
[PATCH 15/17] intel/compiler: Add instruction compaction support on
[PATCH 16/17] intel/compiler: Disable Align16 tests on Gen11+
[PATCH 17/17] intel/compiler: Add ICL to test_eu_validate.cpp

[Mesa-dev] [PATCH 01/17] intel: Add a preliminary device for Ice Lake

2018-02-20 Thread Matt Turner

From: Anuj Phogat 

Signed-off-by: Anuj Phogat 
---
 include/pci_ids/i965_pci_ids.h |  9 ++
 src/intel/common/gen_device_info.c | 56 +-
 2 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index feb9c582b19..81c9a5f13fb 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -196,3 +196,12 @@ CHIPSET(0x5A50, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 
5x8 GT2)")
 CHIPSET(0x5A51, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
 CHIPSET(0x5A52, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
 CHIPSET(0x5A54, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
+CHIPSET(0x8A50, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
+CHIPSET(0x8A51, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
+CHIPSET(0x8A52, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
+CHIPSET(0x8A5A, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
+CHIPSET(0x8A5B, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
+CHIPSET(0x8A5C, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
+CHIPSET(0x8A5D, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
+CHIPSET(0x8A71, icl_1x8, "Intel(R) HD Graphics (Ice Lake 1x8 GT0.5)")
+CHIPSET(0xFF05, icl_8x8, "Intel(R) HD Graphics (Ice Lake Simulation)")
diff --git a/src/intel/common/gen_device_info.c 
b/src/intel/common/gen_device_info.c
index a08a13a32a4..8bf4b6b9bb0 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -731,6 +731,49 @@ static const struct gen_device_info 
gen_device_info_cnl_5x8 = {
.is_cannonlake = true,
 };
 
+#define GEN11_HW_INFO   \
+   .gen = 11,   \
+   .has_pln = false,\
+   .max_vs_threads = 364,   \
+   .max_gs_threads = 224,   \
+   .max_tcs_threads = 224,  \
+   .max_tes_threads = 364,  \
+   .max_cs_threads = 56,\
+   .urb = { \
+  .size = 1024, \
+  .min_entries = {  \
+ [MESA_SHADER_VERTEX]= 64,  \
+ [MESA_SHADER_TESS_EVAL] = 34,  \
+  },\
+  .max_entries = {  \
+ [MESA_SHADER_VERTEX]= 2384,\
+ [MESA_SHADER_TESS_CTRL] = 1032,\
+ [MESA_SHADER_TESS_EVAL] = 2384,\
+ [MESA_SHADER_GEOMETRY]  = 1032,\
+  },\
+   }
+
+#define GEN11_FEATURES(_gt, _slices, _l3)   \
+   GEN8_FEATURES,   \
+   GEN11_HW_INFO,   \
+   .gt = _gt, .num_slices = _slices, .l3_banks = _l3
+
+static const struct gen_device_info gen_device_info_icl_8x8 = {
+   GEN11_FEATURES(2, 1, 8),
+};
+
+static const struct gen_device_info gen_device_info_icl_6x8 = {
+   GEN11_FEATURES(1, 1, 6),
+};
+
+static const struct gen_device_info gen_device_info_icl_4x8 = {
+   GEN11_FEATURES(1, 1, 6),
+};
+
+static const struct gen_device_info gen_device_info_icl_1x8 = {
+   GEN11_FEATURES(1, 1, 6),
+};
+
 bool
 gen_get_device_info(int devid, struct gen_device_info *devinfo)
 {
@@ -757,10 +800,21 @@ gen_get_device_info(int devid, struct gen_device_info 
*devinfo)
 * Extra padding can be necessary depending how the thread IDs are
 * calculated for a particular shader stage.
 */
-   if (devinfo->gen >= 9) {
+
+   switch(devinfo->gen) {
+   case 9:
+   case 10:
   devinfo->max_wm_threads = 64 /* threads-per-PSD */
   * devinfo->num_slices
   * 4; /* effective subslices per slice */
+  break;
+   case 11:
+  devinfo->max_wm_threads = 128 /* threads-per-PSD */
+  * devinfo->num_slices
+  * 8; /* subslices per slice */
+  break;
+   default:
+  break;
}
 
assert(devinfo->num_slices <= ARRAY_SIZE(devinfo->num_subslices));
-- 
2.16.1


[Mesa-dev] [PATCH 04/17] intel/compiler: Add Gen11 register types

2018-02-20 Thread Matt Turner

The hardware register types' encodings have changed on Gen11. Good thing
we have that superfluous-looking brw_reg_type abstraction lying around!
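
A cut-down sketch of the idea, using only the DF and HF encodings that
appear in this patch (6 and 10 before Gen11, 10 and 8 on Gen11). The
toy_* names are hypothetical; the real code keeps separate register and
immediate encodings per type.

```c
/* Abstract, generation-independent type (hypothetical subset). */
enum toy_reg_type {
   TOY_TYPE_DF,
   TOY_TYPE_HF,
   TOY_TYPE_COUNT
};

/* One encoding table per hardware generation range. */
static const int toy_pre_gen11_encoding[TOY_TYPE_COUNT] = {
   [TOY_TYPE_DF] = 6,
   [TOY_TYPE_HF] = 10,
};

static const int toy_gen11_encoding[TOY_TYPE_COUNT] = {
   [TOY_TYPE_DF] = 10,
   [TOY_TYPE_HF] = 8,
};

/* Translate the abstract type to the bits the hardware expects. */
int
toy_reg_type_to_hw(int gen, enum toy_reg_type type)
{
   const int *table = gen >= 11 ? toy_gen11_encoding
                                : toy_pre_gen11_encoding;
   return table[type];
}
```

Because the rest of the compiler only ever sees the abstract type, a
change of encodings stays confined to the table selection.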
---
 src/intel/compiler/brw_reg_type.c | 73 ++-
 1 file changed, 65 insertions(+), 8 deletions(-)

diff --git a/src/intel/compiler/brw_reg_type.c 
b/src/intel/compiler/brw_reg_type.c
index b7fff0867f4..c4f8eedeb4b 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -40,6 +40,18 @@ enum hw_reg_type {
BRW_HW_REG_TYPE_B   = 5,
GEN7_HW_REG_TYPE_DF = 6,
GEN8_HW_REG_TYPE_HF = 10,
+
+   GEN11_HW_REG_TYPE_UD = 0,
+   GEN11_HW_REG_TYPE_D  = 1,
+   GEN11_HW_REG_TYPE_UW = 2,
+   GEN11_HW_REG_TYPE_W  = 3,
+   GEN11_HW_REG_TYPE_UB = 4,
+   GEN11_HW_REG_TYPE_B  = 5,
+   GEN11_HW_REG_TYPE_UQ = 6,
+   GEN11_HW_REG_TYPE_Q  = 7,
+   GEN11_HW_REG_TYPE_HF = 8,
+   GEN11_HW_REG_TYPE_F  = 9,
+   GEN11_HW_REG_TYPE_DF = 10,
 };
 
 enum hw_imm_type {
@@ -56,9 +68,22 @@ enum hw_imm_type {
BRW_HW_IMM_TYPE_V   = 6,
GEN8_HW_IMM_TYPE_DF = 10,
GEN8_HW_IMM_TYPE_HF = 11,
+
+   GEN11_HW_IMM_TYPE_UD = 0,
+   GEN11_HW_IMM_TYPE_D  = 1,
+   GEN11_HW_IMM_TYPE_UW = 2,
+   GEN11_HW_IMM_TYPE_W  = 3,
+   GEN11_HW_IMM_TYPE_UV = 4,
+   GEN11_HW_IMM_TYPE_V  = 5,
+   GEN11_HW_IMM_TYPE_UQ = 6,
+   GEN11_HW_IMM_TYPE_Q  = 7,
+   GEN11_HW_IMM_TYPE_HF = 8,
+   GEN11_HW_IMM_TYPE_F  = 9,
+   GEN11_HW_IMM_TYPE_DF = 10,
+   GEN11_HW_IMM_TYPE_VF = 11,
 };
 
-static const struct {
+static const struct hw_type {
enum hw_reg_type reg_type;
enum hw_imm_type imm_type;
 } gen4_hw_type[] = {
@@ -77,6 +102,22 @@ static const struct {
[BRW_REGISTER_TYPE_UB] = { BRW_HW_REG_TYPE_UB,  INVALID },
[BRW_REGISTER_TYPE_V]  = { INVALID, BRW_HW_IMM_TYPE_V   },
[BRW_REGISTER_TYPE_UV] = { INVALID, BRW_HW_IMM_TYPE_UV  },
+}, gen11_hw_type[] = {
+   [BRW_REGISTER_TYPE_DF] = { GEN11_HW_REG_TYPE_DF, GEN11_HW_IMM_TYPE_DF },
+   [BRW_REGISTER_TYPE_F]  = { GEN11_HW_REG_TYPE_F,  GEN11_HW_IMM_TYPE_F  },
+   [BRW_REGISTER_TYPE_HF] = { GEN11_HW_REG_TYPE_HF, GEN11_HW_IMM_TYPE_HF },
+   [BRW_REGISTER_TYPE_VF] = { INVALID,  GEN11_HW_IMM_TYPE_VF },
+
+   [BRW_REGISTER_TYPE_Q]  = { GEN11_HW_REG_TYPE_Q,  GEN11_HW_IMM_TYPE_Q  },
+   [BRW_REGISTER_TYPE_UQ] = { GEN11_HW_REG_TYPE_UQ, GEN11_HW_IMM_TYPE_UQ },
+   [BRW_REGISTER_TYPE_D]  = { GEN11_HW_REG_TYPE_D,  GEN11_HW_IMM_TYPE_D  },
+   [BRW_REGISTER_TYPE_UD] = { GEN11_HW_REG_TYPE_UD, GEN11_HW_IMM_TYPE_UD },
+   [BRW_REGISTER_TYPE_W]  = { GEN11_HW_REG_TYPE_W,  GEN11_HW_IMM_TYPE_W  },
+   [BRW_REGISTER_TYPE_UW] = { GEN11_HW_REG_TYPE_UW, GEN11_HW_IMM_TYPE_UW },
+   [BRW_REGISTER_TYPE_B]  = { GEN11_HW_REG_TYPE_B,  INVALID  },
+   [BRW_REGISTER_TYPE_UB] = { GEN11_HW_REG_TYPE_UB, INVALID  },
+   [BRW_REGISTER_TYPE_V]  = { INVALID,  GEN11_HW_IMM_TYPE_V  },
+   [BRW_REGISTER_TYPE_UV] = { INVALID,  GEN11_HW_IMM_TYPE_UV },
 };
 
 /* SNB adds 3-src instructions (MAD and LRP) that only operate on floats, so
@@ -147,14 +188,22 @@ brw_reg_type_to_hw_type(const struct gen_device_info 
*devinfo,
 enum brw_reg_file file,
 enum brw_reg_type type)
 {
-   assert(type < ARRAY_SIZE(gen4_hw_type));
+   const struct hw_type *table;
+
+   if (devinfo->gen >= 11) {
+  assert(type < ARRAY_SIZE(gen11_hw_type));
+  table = gen11_hw_type;
+   } else {
+  assert(type < ARRAY_SIZE(gen4_hw_type));
+  table = gen4_hw_type;
+   }
 
if (file == BRW_IMMEDIATE_VALUE) {
-  assert(gen4_hw_type[type].imm_type != (enum hw_imm_type)INVALID);
-  return gen4_hw_type[type].imm_type;
+  assert(table[type].imm_type != (enum hw_imm_type)INVALID);
+  return table[type].imm_type;
} else {
-  assert(gen4_hw_type[type].reg_type != (enum hw_reg_type)INVALID);
-  return gen4_hw_type[type].reg_type;
+  assert(table[type].reg_type != (enum hw_reg_type)INVALID);
+  return table[type].reg_type;
}
 }
 
@@ -167,15 +216,23 @@ enum brw_reg_type
 brw_hw_type_to_reg_type(const struct gen_device_info *devinfo,
 enum brw_reg_file file, unsigned hw_type)
 {
+   const struct hw_type *table;
+
+   if (devinfo->gen >= 11) {
+  table = gen11_hw_type;
+   } else {
+  table = gen4_hw_type;
+   }
+
if (file == BRW_IMMEDIATE_VALUE) {
   for (enum brw_reg_type i = 0; i <= BRW_REGISTER_TYPE_LAST; i++) {
- if (gen4_hw_type[i].imm_type == (enum hw_imm_type)hw_type) {
+ if (table[i].imm_type == (enum hw_imm_type)hw_type) {
 return i;
  }
   }
} else {
   for (enum brw_reg_type i = 0; i <= BRW_REGISTER_TYPE_LAST; i++) {
- if (gen4_hw_type[i].reg_type == (enum hw_reg_type)hw_type) {
+ if (table[i].reg_type == (enum hw_reg_type)hw_type) {
 return i;
  }
   }
-- 
2.16.1


[Mesa-dev] [PATCH 05/17] intel/compiler: Add Gen11+ native float type

2018-02-20 Thread Matt Turner

This new type exposes the additional precision offered by the
accumulator register and will be used in the next patch to implement the
functionality of the PLN instruction using a pair of MAD instructions.

One weird thing to note: align1 ternary instructions may only have an
accumulator in the dst or src1 normally, but when src0's type is :NF
the accumulator is read.
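
The src0 rule can be written as a small predicate. This is a sketch with
made-up enums and a made-up function name; it mirrors the relaxed
assertion that brw_alu3() gets in this patch but is not that code.

```c
#include <stdbool.h>

enum toy_file { TOY_GRF, TOY_IMM, TOY_ARF };   /* register files */
enum toy_type { TOY_F, TOY_NF };               /* operand types  */

/* src0 of an align1 three-source instruction may be a general register
 * or an immediate. An architecture register (the accumulator) is only
 * accepted when the operand type is the native float type.
 */
bool
toy_src0_allowed(enum toy_file file, enum toy_type type)
{
   return file == TOY_GRF ||
          file == TOY_IMM ||
          (file == TOY_ARF && type == TOY_NF);
}
```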
---
 src/intel/compiler/brw_disasm.c  |  7 +++
 src/intel/compiler/brw_eu_emit.c | 10 --
 src/intel/compiler/brw_eu_validate.c |  1 +
 src/intel/compiler/brw_reg_type.c|  8 
 src/intel/compiler/brw_reg_type.h|  2 ++
 src/intel/compiler/brw_shader.cpp|  6 ++
 6 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index 429ed781404..a9a108f8acd 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -1035,6 +1035,12 @@ src0_3src(FILE *file, const struct gen_device_info 
*devinfo, const brw_inst *ins
  reg_nr = brw_inst_3src_src0_reg_nr(devinfo, inst);
  subreg_nr = brw_inst_3src_a1_src0_subreg_nr(devinfo, inst);
  type = brw_inst_3src_a1_src0_type(devinfo, inst);
+  } else if (brw_inst_3src_a1_src0_type(devinfo, inst) ==
+ BRW_REGISTER_TYPE_NF) {
+ _file = BRW_ARCHITECTURE_REGISTER_FILE;
+ reg_nr = brw_inst_3src_src0_reg_nr(devinfo, inst);
+ subreg_nr = brw_inst_3src_a1_src0_subreg_nr(devinfo, inst);
+ type = brw_inst_3src_a1_src0_type(devinfo, inst);
   } else {
  _file = BRW_IMMEDIATE_VALUE;
  uint16_t imm_val = brw_inst_3src_a1_src0_imm(devinfo, inst);
@@ -1288,6 +1294,7 @@ imm(FILE *file, const struct gen_device_info *devinfo, 
enum brw_reg_type type,
case BRW_REGISTER_TYPE_HF:
   string(file, "Half Float IMM");
   break;
+   case BRW_REGISTER_TYPE_NF:
case BRW_REGISTER_TYPE_UB:
case BRW_REGISTER_TYPE_B:
   format(file, "*** invalid immediate type %d ", type);
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index c25d8d6eda0..ec871e5aa75 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -771,7 +771,11 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct 
brw_reg dest,
 to_3src_align1_hstride(src2.hstride));
 
   brw_inst_set_3src_a1_src0_subreg_nr(devinfo, inst, src0.subnr);
-  brw_inst_set_3src_src0_reg_nr(devinfo, inst, src0.nr);
+  if (src0.type == BRW_REGISTER_TYPE_NF) {
+ brw_inst_set_3src_src0_reg_nr(devinfo, inst, BRW_ARF_ACCUMULATOR);
+  } else {
+ brw_inst_set_3src_src0_reg_nr(devinfo, inst, src0.nr);
+  }
   brw_inst_set_3src_src0_abs(devinfo, inst, src0.abs);
   brw_inst_set_3src_src0_negate(devinfo, inst, src0.negate);
 
@@ -790,7 +794,9 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct 
brw_reg dest,
   brw_inst_set_3src_src2_negate(devinfo, inst, src2.negate);
 
   assert(src0.file == BRW_GENERAL_REGISTER_FILE ||
- src0.file == BRW_IMMEDIATE_VALUE);
+ src0.file == BRW_IMMEDIATE_VALUE ||
+ (src0.file == BRW_ARCHITECTURE_REGISTER_FILE &&
+  src0.type == BRW_REGISTER_TYPE_NF));
   assert(src1.file == BRW_GENERAL_REGISTER_FILE ||
  src1.file == BRW_ARCHITECTURE_REGISTER_FILE);
   assert(src2.file == BRW_GENERAL_REGISTER_FILE ||
diff --git a/src/intel/compiler/brw_eu_validate.c 
b/src/intel/compiler/brw_eu_validate.c
index 6ee6b4ffbe7..d3189d1ef5e 100644
--- a/src/intel/compiler/brw_eu_validate.c
+++ b/src/intel/compiler/brw_eu_validate.c
@@ -277,6 +277,7 @@ static enum brw_reg_type
 execution_type_for_type(enum brw_reg_type type)
 {
switch (type) {
+   case BRW_REGISTER_TYPE_NF:
case BRW_REGISTER_TYPE_DF:
case BRW_REGISTER_TYPE_F:
case BRW_REGISTER_TYPE_HF:
diff --git a/src/intel/compiler/brw_reg_type.c 
b/src/intel/compiler/brw_reg_type.c
index c4f8eedeb4b..3c82eb0a76f 100644
--- a/src/intel/compiler/brw_reg_type.c
+++ b/src/intel/compiler/brw_reg_type.c
@@ -52,6 +52,7 @@ enum hw_reg_type {
GEN11_HW_REG_TYPE_HF = 8,
GEN11_HW_REG_TYPE_F  = 9,
GEN11_HW_REG_TYPE_DF = 10,
+   GEN11_HW_REG_TYPE_NF = 11,
 };
 
 enum hw_imm_type {
@@ -87,6 +88,8 @@ static const struct hw_type {
enum hw_reg_type reg_type;
enum hw_imm_type imm_type;
 } gen4_hw_type[] = {
+   [0 ... BRW_REGISTER_TYPE_LAST] = { INVALID, INVALID },
+
[BRW_REGISTER_TYPE_DF] = { GEN7_HW_REG_TYPE_DF, GEN8_HW_IMM_TYPE_DF },
[BRW_REGISTER_TYPE_F]  = { BRW_HW_REG_TYPE_F,   BRW_HW_IMM_TYPE_F   },
[BRW_REGISTER_TYPE_HF] = { GEN8_HW_REG_TYPE_HF, GEN8_HW_IMM_TYPE_HF },
@@ -103,6 +106,7 @@ static const struct hw_type {
[BRW_REGISTER_TYPE_V]  = { INVALID, BRW_HW_IMM_TYPE_V   },
[BRW_REGISTER_TYPE_UV] = { INVALID, BRW_HW_IMM_TYPE_UV  },
 }, gen11_hw_type[] = {
+   [BRW_REGISTER_TYPE_NF]

[Mesa-dev] [PATCH 03/17] intel: Disable 64-bit extensions on platforms without 64-bit types

2018-02-20 Thread Matt Turner

Gen11 does not support DF, Q, UQ types in hardware. As a result, we have
to disable some GL extensions until they can be reimplemented.
---
 src/intel/common/gen_device_info.c   | 3 +++
 src/intel/common/gen_device_info.h   | 1 +
 src/mesa/drivers/dri/i965/intel_extensions.c | 9 +
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/intel/common/gen_device_info.c 
b/src/intel/common/gen_device_info.c
index 8bf4b6b9bb0..465d4c783a1 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -138,6 +138,7 @@ static const struct gen_device_info gen_device_info_snb_gt2 
= {
.must_use_separate_stencil = true,   \
.has_llc = true, \
.has_pln = true, \
+   .has_64bit_types = true, \
.has_surface_tile_offset = true, \
.timestamp_frequency = 1250
 
@@ -322,6 +323,7 @@ static const struct gen_device_info gen_device_info_hsw_gt3 
= {
.has_llc = true, \
.has_sample_with_hiz = false,\
.has_pln = true, \
+   .has_64bit_types = true, \
.supports_simd16_3src = true,\
.has_surface_tile_offset = true, \
.max_vs_threads = 504,   \
@@ -756,6 +758,7 @@ static const struct gen_device_info gen_device_info_cnl_5x8 
= {
 #define GEN11_FEATURES(_gt, _slices, _l3)   \
GEN8_FEATURES,   \
GEN11_HW_INFO,   \
+   .has_64bit_types = false,\
.gt = _gt, .num_slices = _slices, .l3_banks = _l3
 
 static const struct gen_device_info gen_device_info_icl_8x8 = {
diff --git a/src/intel/common/gen_device_info.h 
b/src/intel/common/gen_device_info.h
index fd9c17531db..7761eeba7e0 100644
--- a/src/intel/common/gen_device_info.h
+++ b/src/intel/common/gen_device_info.h
@@ -59,6 +59,7 @@ struct gen_device_info
bool has_llc;
 
bool has_pln;
+   bool has_64bit_types;
bool has_compr4;
bool has_surface_tile_offset;
bool supports_simd16_3src;
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index cc961e051fd..3f5f4dab411 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -217,7 +217,7 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Extensions.ARB_derivative_control = true;
   ctx->Extensions.ARB_framebuffer_no_attachments = true;
   ctx->Extensions.ARB_gpu_shader5 = true;
-  ctx->Extensions.ARB_gpu_shader_fp64 = true;
+  ctx->Extensions.ARB_gpu_shader_fp64 = devinfo->has_64bit_types;
   ctx->Extensions.ARB_shader_atomic_counters = true;
   ctx->Extensions.ARB_shader_atomic_counter_ops = true;
   ctx->Extensions.ARB_shader_clock = true;
@@ -229,7 +229,7 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Extensions.ARB_texture_compression_bptc = true;
   ctx->Extensions.ARB_texture_view = true;
   ctx->Extensions.ARB_shader_storage_buffer_object = true;
-  ctx->Extensions.ARB_vertex_attrib_64bit = true;
+  ctx->Extensions.ARB_vertex_attrib_64bit = devinfo->has_64bit_types;
   ctx->Extensions.EXT_shader_samples_identical = true;
   ctx->Extensions.OES_primitive_bounding_box = true;
   ctx->Extensions.OES_texture_buffer = true;
@@ -279,8 +279,9 @@ intelInitExtensions(struct gl_context *ctx)
}
 
if (devinfo->gen >= 8) {
-  ctx->Extensions.ARB_gpu_shader_int64 = true;
-  ctx->Extensions.ARB_shader_ballot = true; /* requires 
ARB_gpu_shader_int64 */
+  ctx->Extensions.ARB_gpu_shader_int64 = devinfo->has_64bit_types;
+  /* requires ARB_gpu_shader_int64 */
+  ctx->Extensions.ARB_shader_ballot = devinfo->has_64bit_types;
   ctx->Extensions.ARB_ES3_2_compatibility = true;
}
 
-- 
2.16.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
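
The gating in PATCH 03/17 above can be sketched in isolation. This is only an illustration, not the driver code: `struct device_info`, `struct extensions` and `init_64bit_extensions()` are hypothetical, trimmed-down stand-ins, and the sketch ignores the other conditions under which the real code enables these extensions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for gen_device_info and the extension table;
 * only the fields needed for the illustration are present. */
struct device_info {
   int gen;
   bool has_64bit_types;
};

struct extensions {
   bool ARB_gpu_shader_fp64;
   bool ARB_vertex_attrib_64bit;
   bool ARB_gpu_shader_int64;
   bool ARB_shader_ballot;
};

/* Enable the 64-bit extensions only when the hardware has native
 * 64-bit types, following the shape of the change in the patch. */
static void
init_64bit_extensions(const struct device_info *devinfo,
                      struct extensions *ext)
{
   ext->ARB_gpu_shader_fp64 = devinfo->has_64bit_types;
   ext->ARB_vertex_attrib_64bit = devinfo->has_64bit_types;

   /* In the patch these two sit under the gen >= 8 branch. */
   bool gen8_64bit = devinfo->gen >= 8 && devinfo->has_64bit_types;
   ext->ARB_gpu_shader_int64 = gen8_64bit;
   /* ARB_shader_ballot requires ARB_gpu_shader_int64. */
   ext->ARB_shader_ballot = gen8_64bit;
}
```

With a Gen11-like `device_info` (`has_64bit_types = false`) all four flags come out false, which is the effect the commit message describes.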

[Mesa-dev] [PATCH 02/17] intel: Add icl pci id for INTEL_DEVID_OVERRIDE

2018-02-20 Thread Matt Turner

From: Anuj Phogat 

Reviewed-by: Matt Turner 
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/intel_screen.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index ef5aee894fa..0367feb47c2 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -2380,6 +2380,7 @@ parse_devid_override(const char *devid_override)
   { "kbl", 0x5912 },
   { "glk", 0x3185 },
   { "cnl", 0x5a52 },
+  { "icl", 0x8a52 }
};
 
for (unsigned i = 0; i < ARRAY_SIZE(name_map); i++) {
-- 
2.16.1
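
For readers without the surrounding file at hand, the table lookup this hunk extends can be sketched as below. It is a minimal sketch, not the body of `parse_devid_override()`: the function name `lookup_devid_override()` is hypothetical, only the entries visible in the hunk are listed, and the numeric fallback via `strtol()` is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Resolve a platform name to a PCI ID. Only the entries visible in
 * the hunk above are listed here. */
static int
lookup_devid_override(const char *devid_override)
{
   static const struct {
      const char *name;
      int pci_id;
   } name_map[] = {
      { "kbl", 0x5912 },
      { "glk", 0x3185 },
      { "cnl", 0x5a52 },
      { "icl", 0x8a52 },
   };

   for (unsigned i = 0; i < ARRAY_SIZE(name_map); i++) {
      if (strcmp(name_map[i].name, devid_override) == 0)
         return name_map[i].pci_id;
   }

   /* Assumption: an unknown name is parsed as a numeric PCI ID. */
   return (int)strtol(devid_override, NULL, 0);
}
```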


Re: [Mesa-dev] [PATCH] intel/gen9+: Enable object level preemption.

2018-02-20 Thread Ben Widawsky


On 18-02-20 09:15:01, Antognolli, Rafael wrote:

On Tue, Feb 20, 2018 at 08:11:14AM -0800, Rafael Antognolli wrote:

On Fri, Feb 16, 2018 at 06:37:55PM -0800, Ben Widawsky wrote:
> On 18-02-16 13:44:00, Antognolli, Rafael wrote:
> > "This field controls the granularity of the replay mechanism when
> > coming back into a previously preempted context."
> >
> > The kernel disables this bit but whitelists the register, and it's a
> > context register. So enable it and take advantage of finer granularity
> > when preemption is available.
> >
>
> Does the kernel actually disable it? I thought the kernel just doesn't touch
> it (I don't think it's whitelisted by the kernel either, it's just writable).

I see it being disabled by WaDisable3DMidCmdPreemption, which seems to have
been in effect since commit 5152defe4a53ad15e6d96c422440152302c8abd7.

And it's whitelisted by WaEnablePreemptionGranularityControlByUMD.
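
As background for the CS_CHICKEN1 defines in the quoted patch below, here is a small sketch of how a masked register write is usually composed. It assumes the common convention that the upper 16 bits of the dword enable writes to the matching lower 16 bits; the `SKETCH_*` names and both helper functions are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the upper 16 bits of a masked register select which of
 * the lower 16 bits a write is allowed to change. */
#define SKETCH_REG_MASK(bits)         ((uint32_t)(bits) << 16)

#define SKETCH_REPLAY_MODE_MIDBUFFER  (0u << 0)
#define SKETCH_REPLAY_MODE_MIDOBJECT  (1u << 0)
#define SKETCH_REPLAY_MODE_MASK       SKETCH_REG_MASK(1u << 0)

/* Dword to load into the register to select object-level replay. */
static uint32_t
replay_mode_midobject_dword(void)
{
   return SKETCH_REPLAY_MODE_MASK | SKETCH_REPLAY_MODE_MIDOBJECT;
}

/* Software model of a masked write: only the bits enabled in the
 * upper half are updated, all other bits keep their old value. */
static uint32_t
apply_masked_write(uint32_t old_value, uint32_t dword)
{
   uint32_t mask = dword >> 16;
   return (old_value & ~mask & 0xffffu) | (dword & mask);
}
```

The point of the mask is that the write touches only the replay-mode bit and leaves the other bits of the register as they were.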

> > Signed-off-by: Rafael Antognolli 
> > Cc: Ben Widawsky 
> > ---
> >
> > This patch still needs more testing (only ran it through CI and also did
> > some basic tests on my machine to make sure it's not breaking anything).
> >
> > src/intel/genxml/gen10.xml   |  8 
> > src/intel/genxml/gen11.xml   |  8 
> > src/intel/genxml/gen9.xml|  8 
> > src/intel/vulkan/genX_state.c| 18 ++
> > src/mesa/drivers/dri/i965/brw_defines.h  |  5 +
> > src/mesa/drivers/dri/i965/brw_state_upload.c | 10 ++
> > 6 files changed, 57 insertions(+)
> >
> > diff --git a/src/intel/genxml/gen10.xml b/src/intel/genxml/gen10.xml
> > index 47c679a3fa9..42ac6e82696 100644
> > --- a/src/intel/genxml/gen10.xml
> > +++ b/src/intel/genxml/gen10.xml
> > @@ -3692,6 +3692,14 @@
> > 
> >   
> >
> > +  
> > +
> > +  
> > +  
> > +
> > +
> > +  
> > +
> >   
> > 
> >   
> > diff --git a/src/intel/genxml/gen11.xml b/src/intel/genxml/gen11.xml
> > index 9a8a2fe21e3..e6ce42b2bfb 100644
> > --- a/src/intel/genxml/gen11.xml
> > +++ b/src/intel/genxml/gen11.xml
> > @@ -3688,6 +3688,14 @@
> > 
> >   
> >
> > +  
> > +
> > +  
> > +  
> > +
> > +
> > +  
> > +
> >   
> > 
> >   
> > diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
> > index 7eef4bee013..45e1fddeb50 100644
> > --- a/src/intel/genxml/gen9.xml
> > +++ b/src/intel/genxml/gen9.xml
> > @@ -3638,6 +3638,14 @@
> > 
> >   
> >
> > +  
> > +
> > +  
> > +  
> > +
> > +
> > +  
> > +
> >   
> > 
> >   
> > diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
> > index 54fb8634fdc..83b6c6387f3 100644
> > --- a/src/intel/vulkan/genX_state.c
> > +++ b/src/intel/vulkan/genX_state.c
> > @@ -169,6 +169,24 @@ genX(init_device_state)(struct anv_device *device)
> >gen10_emit_wa_lri_to_cache_mode_zero();
> > #endif
> >
> > +#if GEN_GEN >= 9
> > +   /* A fixed function pipe flush is required before modifying this field 
*/
> > +   anv_batch_emit(, GENX(PIPE_CONTROL), pipe) {
> > +  pipe.PipeControlFlushEnable = true;
> > +   }
> > +
> > +   /* enable object level preemption */
> > +   uint32_t csc1;
> > +
> > +   anv_pack_struct(, GENX(CS_CHICKEN1),
> > +   .ReplayMode = ObjectLevelPreemption,
> > +   .ReplayModeMask = 1);
> > +   anv_batch_emit(, GENX(MI_LOAD_REGISTER_IMM), lri) {
> > +  lri.RegisterOffset   = GENX(CS_CHICKEN1_num);
> > +  lri.DataDWord= csc1;
> > +   }
> > +#endif
> > +
> >anv_batch_emit(, GENX(MI_BATCH_BUFFER_END), bbe);
> >
> >assert(batch.next <= batch.end);
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
> > index 8bf6f68b67c..f0994d3b139 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -1661,4 +1661,9 @@ enum brw_pixel_shader_coverage_mask_mode {
> > # define GLK_SCEC_BARRIER_MODE_3D_HULL (1 << 7)
> > # define GLK_SCEC_BARRIER_MODE_MASKREG_MASK(1 << 7)
> >
> > +#define CS_CHICKEN10x2580 /* Gen9+ */
> > +# define GEN9_REPLAY_MODE_MIDBUFFER (0 << 0)
> > +# define GEN9_REPLAY_MODE_MIDOBJECT (1 << 0)
> > +# define GEN9_REPLAY_MODE_MASK  REG_MASK(1 << 0)
> > +
> > #endif
> > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
> > index 86c12e4d357..a90dc01d87b 100644
> > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> > @@ -115,6 +115,16 @@ brw_upload_initial_gpu_state(struct brw_context *brw)
> >   OUT_BATCH(0);
> >   ADVANCE_BATCH();
> >}
> > +
> > +   if (devinfo->gen >= 9) {
> > +  /* A fixed function pipe flush is required before modifying this 
field */
> > +  brw_emit_pipe_control_flush(brw,

[Mesa-dev] [PATCH] nv50,nvc0: fix clear buffer acceleration

2018-02-20 Thread Ilia Mirkin

Two things were off:
 - valid range was not updated, which could affect waiting for future
   maps
 - fencing was done manually instead of using the *_resource_validate
   helper, so the dirty buffer flag was never set

Fixes: KHR-GL45.direct_state_access.buffers_clear
Signed-off-by: Ilia Mirkin 
---

Untested on pre-kepler paths. Pretty similar overall.

 src/gallium/drivers/nouveau/nv50/nv50_surface.c | 20 
 src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 25 +
 2 files changed, 17 insertions(+), 28 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_surface.c 
b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
index 908c534b92e..037e14a4d60 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_surface.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_surface.c
@@ -672,10 +672,7 @@ nv50_clear_buffer_push(struct pipe_context *pipe,
   count -= nr;
}
 
-   if (buf->mm) {
-  nouveau_fence_ref(nv50->screen->base.fence.current, >fence);
-  nouveau_fence_ref(nv50->screen->base.fence.current, >fence_wr);
-   }
+   nv50_resource_validate(buf, NOUVEAU_BO_WR);
 
nouveau_bufctx_reset(nv50->bufctx, 0);
 }
@@ -727,6 +724,8 @@ nv50_clear_buffer(struct pipe_context *pipe,
   return;
}
 
+   util_range_add(>valid_buffer_range, offset, offset + size);
+
assert(size % data_size == 0);
 
if (offset & 0xff) {
@@ -747,10 +746,10 @@ nv50_clear_buffer(struct pipe_context *pipe,
assert(width > 0);
 
BEGIN_NV04(push, NV50_3D(CLEAR_COLOR(0)), 4);
-   PUSH_DATAf(push, color.f[0]);
-   PUSH_DATAf(push, color.f[1]);
-   PUSH_DATAf(push, color.f[2]);
-   PUSH_DATAf(push, color.f[3]);
+   PUSH_DATA (push, color.ui[0]);
+   PUSH_DATA (push, color.ui[1]);
+   PUSH_DATA (push, color.ui[2]);
+   PUSH_DATA (push, color.ui[3]);
 
if (nouveau_pushbuf_space(push, 64, 1, 0))
   return;
@@ -796,10 +795,7 @@ nv50_clear_buffer(struct pipe_context *pipe,
BEGIN_NV04(push, NV50_3D(COND_MODE), 1);
PUSH_DATA (push, nv50->cond_condmode);
 
-   if (buf->mm) {
-  nouveau_fence_ref(nv50->screen->base.fence.current, >fence);
-  nouveau_fence_ref(nv50->screen->base.fence.current, >fence_wr);
-   }
+   nv50_resource_validate(buf, NOUVEAU_BO_WR);
 
if (width * height != elements) {
   offset += width * height * data_size;
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
index 9445c05f3ab..0f86c11b7f4 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
@@ -403,10 +403,7 @@ nvc0_clear_buffer_push_nvc0(struct pipe_context *pipe,
   size -= nr * 4;
}
 
-   if (buf->mm) {
-  nouveau_fence_ref(nvc0->screen->base.fence.current, >fence);
-  nouveau_fence_ref(nvc0->screen->base.fence.current, >fence_wr);
-   }
+   nvc0_resource_validate(buf, NOUVEAU_BO_WR);
 
nouveau_bufctx_reset(nvc0->bufctx, 0);
 }
@@ -453,10 +450,7 @@ nvc0_clear_buffer_push_nve4(struct pipe_context *pipe,
   size -= nr * 4;
}
 
-   if (buf->mm) {
-  nouveau_fence_ref(nvc0->screen->base.fence.current, >fence);
-  nouveau_fence_ref(nvc0->screen->base.fence.current, >fence_wr);
-   }
+   nvc0_resource_validate(buf, NOUVEAU_BO_WR);
 
nouveau_bufctx_reset(nvc0->bufctx, 0);
 }
@@ -540,6 +534,8 @@ nvc0_clear_buffer(struct pipe_context *pipe,
   return;
}
 
+   util_range_add(>valid_buffer_range, offset, offset + size);
+
assert(size % data_size == 0);
 
if (data_size == 12) {
@@ -570,10 +566,10 @@ nvc0_clear_buffer(struct pipe_context *pipe,
PUSH_REFN (push, buf->bo, buf->domain | NOUVEAU_BO_WR);
 
BEGIN_NVC0(push, NVC0_3D(CLEAR_COLOR(0)), 4);
-   PUSH_DATAf(push, color.f[0]);
-   PUSH_DATAf(push, color.f[1]);
-   PUSH_DATAf(push, color.f[2]);
-   PUSH_DATAf(push, color.f[3]);
+   PUSH_DATA (push, color.ui[0]);
+   PUSH_DATA (push, color.ui[1]);
+   PUSH_DATA (push, color.ui[2]);
+   PUSH_DATA (push, color.ui[3]);
BEGIN_NVC0(push, NVC0_3D(SCREEN_SCISSOR_HORIZ), 2);
PUSH_DATA (push, width << 16);
PUSH_DATA (push, height << 16);
@@ -600,10 +596,7 @@ nvc0_clear_buffer(struct pipe_context *pipe,
 
IMMED_NVC0(push, NVC0_3D(COND_MODE), nvc0->cond_condmode);
 
-   if (buf->mm) {
-  nouveau_fence_ref(nvc0->screen->base.fence.current, >fence);
-  nouveau_fence_ref(nvc0->screen->base.fence.current, >fence_wr);
-   }
+   nvc0_resource_validate(buf, NOUVEAU_BO_WR);
 
if (width * height != elements) {
   offset += width * height * data_size;
-- 
2.16.1
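
To illustrate the first point of the commit message, here is a rough sketch of valid-range tracking. It is not the Mesa `util_range` implementation: `struct valid_range` and the three helpers are hypothetical, and the idea that a later map consults the range to decide whether it must wait is an assumption based on the commit message.

```c
#include <assert.h>

/* Hypothetical stand-in for a buffer's valid range: one half-open
 * interval [start, end), empty when start >= end. */
struct valid_range {
   unsigned start;
   unsigned end;
};

static void
range_set_empty(struct valid_range *r)
{
   r->start = ~0u;
   r->end = 0;
}

/* Grow the interval so that it also covers [start, end). */
static void
range_add(struct valid_range *r, unsigned start, unsigned end)
{
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}

/* Non-zero if [start, end) overlaps data that may have been written. */
static int
range_intersects(const struct valid_range *r, unsigned start, unsigned end)
{
   return r->start < r->end && start < r->end && end > r->start;
}
```

If a clear does not extend the range, a later query for the cleared bytes reports no overlap, which is the kind of missed dependency the patch addresses by calling `util_range_add()`.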


Re: [Mesa-dev] [PATCH v1 0/7] Implement common gralloc_handle_t in libdrm

2018-02-20 Thread Tomasz Figa

On Wed, Feb 21, 2018 at 4:03 AM, Rob Herring  wrote:
> On Tue, Feb 20, 2018 at 4:26 AM, Tomasz Figa  wrote:
>> On Tue, Feb 20, 2018 at 6:51 PM, Robert Foss  
>> wrote:
>>> Hey Tomasz,
>>>
>>> On 02/20/2018 09:55 AM, Tomasz Figa wrote:

 Hi Rob,

 On Fri, Feb 16, 2018 at 11:48 PM, Tomasz Figa  wrote:
>
> On Fri, Feb 16, 2018 at 11:33 PM, Robert Foss 
> wrote:
>>
>> Hey Tomasz,
>>
>>
>> On 02/16/2018 05:10 AM, Tomasz Figa wrote:
>>>
>>>
>>> On Fri, Feb 9, 2018 at 11:06 PM, Rob Herring  wrote:


 On Fri, Feb 9, 2018 at 3:58 AM, Tomasz Figa >>
>>> On Fri, Feb 2, 2018 at 2:01 AM, Tomasz Figa 
>>>
>>> wrote:


 Hi Rob,

 On Tue, Jan 30, 2018 at 9:36 PM, Robert Foss
  wrote:
>>
>>
>> uint32_t (*get_fd)(buffer_handle_t handle, uint32_t
>> plane);
>> uint64_t (*get_modifier)(buffer_handle_t handle,
>> uint32_t
>> plane);
>> uint32_t (*get_offsets)(buffer_handle_t handle,
>> uint32_t
>> plane);
>> uint32_t (*get_stride)(buffer_handle_t handle,
>> uint32_t
>> plane);
>> ...
>> } gralloc_funcs_t;
>>
>>
>>
>>
>> These ones? >
>> Yeah, if we could retrieve such function pointer struct using
>> perform
>> or any equivalent (like the implementation-specific methods in
>> gralloc1, but not sure if that's going to be used in practice
>> anywhere), it could work for us.
>
>
>
>
> So this is where you and Rob Herring lose me, I don't think I
> understand
> quite how the gralloc1 call would be used, and how it would tie
> into
> this
> handle struct. I think I could do with some guidance on this.



 This would be very similar to gralloc0 perform call. gralloc1
 implementations need to provide getFunction() callback [1], which
 returns a pointer to given function. The list of standard
 functions
 is
 defined in the gralloc1.h header [2], but we could take some
 random
 big number and use it for our function that fills in provided
 gralloc_funcs_t struct with necessary pointers.

 [1]

 https://android.googlesource.com/platform/hardware/libhardware/+/master/include/hardware/gralloc1.h#300
 [2]

 https://android.googlesource.com/platform/hardware/libhardware/+/master/include/hardware/gralloc1.h#134
>>>
>>>
>>>
>>> This is a dead end because it won't work with a HIDL-based
>>> implementation (aka gralloc 2.0). You can't set function pointers
>>> (or
>>> any pointers) because gralloc runs in a different process. Yes,
>>> currently gralloc is a pass-thru HAL, but AIUI that will go away.
>>
>>
>>
>> Part of it. I can't see IMapper being implemented by a separate
>> process. You can't map a buffer into one process from another
>> process.
>>
>> But anyway, it's a good point, thanks, I almost forgot about its
>> existence. I'll do further investigation.
>
>
>
> Okay, so IMapper indeed breaks the approach I suggested. I'm not sure
> at the moment what we could do about it. (The idea of a dynamic
> library of a pre-defined name, exporting functions we specify, might
> still work, though.)
>
> Note that the DRM_GRALLOC_GET_FD used currently by Mesa will also be
> impossible to implement with IAllocator/IMapper. (Although I still
> think Mesa and Gralloc are free to have separate logic for choosing
> the DRM device to use.)



 I think the need for GET_FD goes away when the render node is used. We
 may still need the card node for s/w rendering (if I can ever get that
 working) though. Of course, if we use the vgem approach like CrOS then
 we wouldn't.
>>>
>>>
>>>
>>> Hmm, if so, then we probably wouldn't have any strict need for these
>>> function pointers anymore. We already have a makeshift format resolve
>>> in place and

Re: [Mesa-dev] [PATCH v5 01/34] st/glsl_to_nir: run lower_output_reads on !PIPE_CAP_TGSI_CAN_READ_OUTPUTS

2018-02-20 Thread Timothy Arceri


Reviewed-by: Timothy Arceri 

On 21/02/18 08:02, Karol Herbst wrote:

this is required for Nouveau

Signed-off-by: Karol Herbst 
---
  src/mesa/state_tracker/st_glsl_to_nir.cpp | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp 
b/src/mesa/state_tracker/st_glsl_to_nir.cpp
index 765c827d93..f6f55afe40 100644
--- a/src/mesa/state_tracker/st_glsl_to_nir.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp
@@ -43,6 +43,7 @@
  #include "compiler/glsl_types.h"
  #include "compiler/glsl/glsl_to_nir.h"
  #include "compiler/glsl/ir.h"
+#include "compiler/glsl/ir_optimization.h"
  #include "compiler/glsl/string_to_uint_map.h"
  
  
@@ -471,6 +472,7 @@ st_nir_get_mesa_program(struct gl_context *ctx,

  struct gl_linked_shader *shader)
  {
 struct st_context *st = st_context(ctx);
+   struct pipe_screen *pscreen = ctx->st->pipe->screen;
 struct gl_program *prog;
  
 validate_ir_tree(shader->ir);

@@ -483,6 +485,10 @@ st_nir_get_mesa_program(struct gl_context *ctx,
 _mesa_generate_parameters_list_for_uniforms(ctx, shader_program, shader,
 prog->Parameters);
  
+   /* Remove reads from output registers. */

+   if (!pscreen->get_param(pscreen, PIPE_CAP_TGSI_CAN_READ_OUTPUTS))
+  lower_output_reads(shader->Stage, shader->ir);
+
 if (ctx->_Shader->Flags & GLSL_DUMP) {
_mesa_log("\n");
_mesa_log("GLSL IR for linked %s program %d:\n",



Re: [Mesa-dev] [PATCH 1/2] ac/nir: set the DA field when performing atomics on 3D images

2018-02-20 Thread Timothy Arceri


On 21/02/18 07:29, Samuel Pitoiset wrote:
On VI, 3D images are considered as 2D arrays. RadeonSI sets DA for 
loads/stores/atomics and RADV only for loads/stores, so I guess there is 
a reason for that?


I've changed the nir->llvm code recently in order to fix some piglit
tests on the radeonsi nir backend.


[1] 
https://cgit.freedesktop.org/mesa/mesa/commit/?id=e68150de263156a3f3d1b609b6506c5649967f61
[2] 
https://cgit.freedesktop.org/mesa/mesa/commit/?id=82adf53308c137ce0dc5f2d5da4e7cc40c5b808c




Anyway, there is a potential issue on the RADV side I think.

On 02/20/2018 04:43 PM, Nicolai Hähnle wrote:

Why? 3D images are not arrays.

On 20.02.2018 11:11, Samuel Pitoiset wrote:

This doesn't fix anything known but it should definitely be set.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/common/ac_nir_to_llvm.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c 
b/src/amd/common/ac_nir_to_llvm.c

index dc471de977..9244f8bc7b 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -3764,7 +3764,8 @@ static LLVMValueRef visit_image_atomic(struct 
ac_nir_context *ctx,

  char coords_type[8];
  bool da = glsl_sampler_type_is_array(type) ||
-  glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_CUBE;
+  glsl_get_sampler_dim(type) == 
GLSL_SAMPLER_DIM_CUBE ||

+  glsl_get_sampler_dim(type) == GLSL_SAMPLER_DIM_3D;
  LLVMValueRef coords = params[param_count++] = 
get_image_coords(ctx, instr);
  params[param_count++] = get_sampler_desc(ctx, 
instr->variables[0], AC_DESC_IMAGE,
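
The condition changed by this hunk can be read as a small predicate. The sketch below uses a hypothetical `enum sampler_dim` and a hypothetical `atomic_needs_da()` helper rather than the GLSL type API, and it encodes the behaviour proposed by the patch; whether 3D images should set DA at all is exactly what the thread is debating.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of sampler dimensions, for illustration only. */
enum sampler_dim { DIM_1D, DIM_2D, DIM_3D, DIM_CUBE };

/* Proposed rule from the patch: set DA for array images, cube images
 * and, newly, 3D images. */
static bool
atomic_needs_da(enum sampler_dim dim, bool is_array)
{
   return is_array || dim == DIM_CUBE || dim == DIM_3D;
}
```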







Re: [Mesa-dev] [PATCH 18.0] i965: Disable ARB_get_program_binary for compat profiles

2018-02-20 Thread Timothy Arceri


On 21/02/18 13:21, Ilia Mirkin wrote:

Is this worth doing for st/mesa as well? Some quick grepping suggests
it's enabled on the 18.0 branch there too, but it's behind a
conditional which perhaps is never set.


Yes, the st will need a change too, as it will be enabled for any driver
that enables the disk cache (which is most drivers). The Qt bug has been
observed on radeonsi.




On Tue, Feb 20, 2018 at 9:12 PM, Jordan Justen
 wrote:

The QT framework has a bug in their shader program cache, which is
built on GL_ARB_get_program_binary.

In an effort to allow them to fix the bug we don't enable more than 1
binary format for compatibility profiles.

This is only being done on the 18.0 release branch.

Ref: https://bugreports.qt.io/browse/QTBUG-66420
Ref: https://bugs.freedesktop.org/show_bug.cgi?id=105065
Cc: "18.0" 
Cc: Mark Janes 
Cc: Kenneth Graunke 
Cc: Scott D Phillips 
Signed-off-by: Jordan Justen 
---
  docs/relnotes/17.4.0.html   | 2 +-
  src/mesa/drivers/dri/i965/brw_context.c | 9 -
  2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/docs/relnotes/17.4.0.html b/docs/relnotes/17.4.0.html
index 412c0fc455e..fecdfe77969 100644
--- a/docs/relnotes/17.4.0.html
+++ b/docs/relnotes/17.4.0.html
@@ -53,7 +53,7 @@ Note: some of the new features are only available with 
certain drivers.
  GL_ARB_enhanced_layouts on r600/evergreen+
  GL_ARB_bindless_texture on nvc0/kepler
  OpenGL 4.3 on r600/evergreen with hw fp64 support
-Support 1 binary format for GL_ARB_get_program_binary on i965
+Support 1 binary format for GL_ARB_get_program_binary on i965 (except in GL 
compatibility profiles)
  

  Bug fixes
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index e9358b7bc9c..58527d77263 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -704,7 +704,14 @@ brw_initialize_context_constants(struct brw_context *brw)
ctx->Const.AllowMappedBuffersDuringExecution = true;

 /* GL_ARB_get_program_binary */
-   ctx->Const.NumProgramBinaryFormats = 1;
+   /* The QT framework has a bug in their shader program cache, which is built
+* on GL_ARB_get_program_binary. In an effort to allow them to fix the bug
+* we don't enable more than 1 binary format for compatibility profiles.
+* This is only being done on the 18.0 release branch.
+*/
+   if (ctx->API != API_OPENGL_COMPAT) {
+  ctx->Const.NumProgramBinaryFormats = 1;
+   }
  }

  static void
--
2.16.1


[Mesa-dev] [PATCH] nir: remove old assert

2018-02-20 Thread Timothy Arceri

This was originally intended to make sure the remap location
was not -1. However, the code has changed a lot since then:
the location is now never set to -1 and we also handle
components, meaning this old assert has been doing comparisons
with the pointer to the array of component data.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183
---
 src/compiler/nir/nir_linking_helpers.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/compiler/nir/nir_linking_helpers.c 
b/src/compiler/nir/nir_linking_helpers.c
index 6459c6a24d..2b0a2668a3 100644
--- a/src/compiler/nir/nir_linking_helpers.c
+++ b/src/compiler/nir/nir_linking_helpers.c
@@ -283,7 +283,6 @@ remap_slots_and_components(struct exec_list *var_list, 
gl_shader_stage stage,
   if (var->data.location >= VARYING_SLOT_VAR0 &&
   var->data.location - VARYING_SLOT_VAR0 < 32) {
  assert(var->data.location - VARYING_SLOT_VAR0 < 32);
- assert(remap[var->data.location - VARYING_SLOT_VAR0] >= 0);
 
  const struct glsl_type *type = var->type;
  if (nir_is_per_vertex_io(var, stage)) {
-- 
2.14.3
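
To make the commit message concrete: once the remap table holds one entry per component, `remap[slot]` is itself an array, so in an expression it decays to a pointer and comparing it against 0 says nothing about the stored data. The sketch below shows what a meaningful check would have to look like. `struct remap_entry`, its `used` field and the helper are hypothetical and are not taken from nir_linking_helpers.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_SLOTS 32
#define NUM_COMPONENTS 4

/* Hypothetical per-component remap entry. */
struct remap_entry {
   uint8_t location;
   uint8_t component;
   bool used;
};

/* A check on the table has to inspect a field of a specific entry;
 * testing remap[slot] alone only tests the address of the row. */
static bool
slot_component_is_remapped(struct remap_entry remap[NUM_SLOTS][NUM_COMPONENTS],
                           unsigned slot, unsigned component)
{
   assert(slot < NUM_SLOTS && component < NUM_COMPONENTS);
   return remap[slot][component].used;
}
```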


Re: [Mesa-dev] [PATCH 5/7] ac/radeonsi: pass bindless bool to load_sampler_desc()

2018-02-20 Thread Marek Olšák

On Wed, Feb 21, 2018 at 2:21 AM, Timothy Arceri  wrote:
>
>
> On 21/02/18 12:10, Marek Olšák wrote:
>>
>> On Wed, Feb 21, 2018 at 12:50 AM, Timothy Arceri 
>> wrote:
>>>
>>> On 21/02/18 10:33, Marek Olšák wrote:


 On Tue, Feb 20, 2018 at 11:51 PM, Timothy Arceri 
 wrote:
>
>
> On 21/02/18 09:46, Marek Olšák wrote:
>>
>>
>>
>> On Tue, Feb 20, 2018 at 11:42 PM, Marek Olšák 
>> wrote:
>>>
>>>
>>>
>>> For patches 1-5:
>>>
>>> Reviewed-by: Marek Olšák 
>>
>>
>>
>>
>> Actually no. Only patches 1, 3, 5 are reviewed by me.
>>
>> Marek
>
>
>
>
> Do you have an issue with patch 4?



 No, I'm just not sure if it's correct. It calls
 st_nir_lookup_parameter_index, but bindless handles are just
 variables. I think it should just visit the whole expression leading
 to the bindless variable in a generic way and not treat it as a
 uniform.
>>>
>>>
>>>
>>> I'm not sure I understand. We use uniform storage for bindless in tgsi
>>> also.
>>
>>
>> A bindless (sampler or buffer) variable is represented as a 64-bit
>> number in the GL API. It can be passed to shaders in many different
>> ways. For example, a bindless sampler2D variable can be a vertex
>> shader input (loaded from a vertex buffer).
>
>
> Right I should have specified this series does not yet handle bindless
> input/output support, that will require more updates to nir itself as those
> shaders currently trip asserts. Patch 4 however is specifically about
> bindless uniforms.

OK. Patch 4 also has my Rb.

Marek

Re: [Mesa-dev] [PATCH 2/2] virgl: reduce some default capset limits.

2018-02-20 Thread Stéphane Marchesin

On Tue, Feb 20, 2018 at 5:49 PM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> Since v2 might take a while to roll out, we should reduce
> these inside some gathered minimums and then v2 can increase
> them using host values.
>
> Signed-off-by: Dave Airlie 

Reviewed-by: Stéphane Marchesin 

> ---
>  src/gallium/drivers/virgl/virgl_winsys.h | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/src/gallium/drivers/virgl/virgl_winsys.h 
> b/src/gallium/drivers/virgl/virgl_winsys.h
> index d633678597b..95e21a8afde 100644
> --- a/src/gallium/drivers/virgl/virgl_winsys.h
> +++ b/src/gallium/drivers/virgl/virgl_winsys.h
> @@ -114,17 +114,17 @@ struct virgl_winsys {
>   */
>  static inline void virgl_ws_fill_new_caps_defaults(struct virgl_drm_caps 
> *caps)
>  {
> -   caps->caps.v2.min_aliased_point_size = 0.f;
> +   caps->caps.v2.min_aliased_point_size = 1.f;
> caps->caps.v2.max_aliased_point_size = 255.f;
> -   caps->caps.v2.min_smooth_point_size = 0.f;
> -   caps->caps.v2.max_smooth_point_size = 255.f;
> -   caps->caps.v2.min_aliased_line_width = 0.f;
> -   caps->caps.v2.max_aliased_line_width = 255.f;
> +   caps->caps.v2.min_smooth_point_size = 1.f;
> +   caps->caps.v2.max_smooth_point_size = 190.f;
> +   caps->caps.v2.min_aliased_line_width = 1.f;
> +   caps->caps.v2.max_aliased_line_width = 10.f;
> caps->caps.v2.min_smooth_line_width = 0.f;
> -   caps->caps.v2.max_smooth_line_width = 255.f;
> -   caps->caps.v2.max_texture_lod_bias = 16.0f;
> +   caps->caps.v2.max_smooth_line_width = 10.f;
> +   caps->caps.v2.max_texture_lod_bias = 15.0f;
> caps->caps.v2.max_geom_output_vertices = 256;
> -   caps->caps.v2.max_geom_total_output_components = 16384;
> +   caps->caps.v2.max_geom_total_output_components = 1024;
> caps->caps.v2.max_vertex_outputs = 32;
> caps->caps.v2.max_vertex_attribs = 16;
> caps->caps.v2.max_shader_patch_varyings = 0;
> --
> 2.14.3
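
The pattern behind the two virgl patches, conservative defaults first and host values on top when they are available, can be sketched as follows. `struct caps_v2` is a hypothetical four-field subset, `get_caps()` is not the winsys function, and the all-or-nothing host report is an assumption; the default values are the reduced ones from the hunk above.

```c
#include <assert.h>

/* Hypothetical subset of the v2 caps touched by the patch. */
struct caps_v2 {
   float max_smooth_point_size;
   float max_aliased_line_width;
   float max_texture_lod_bias;
   unsigned max_geom_total_output_components;
};

/* Conservative defaults, using the reduced values from the patch. */
static void
fill_caps_defaults(struct caps_v2 *caps)
{
   caps->max_smooth_point_size = 190.f;
   caps->max_aliased_line_width = 10.f;
   caps->max_texture_lod_bias = 15.0f;
   caps->max_geom_total_output_components = 1024;
}

/* Assumption: the host either reports a full v2 capset (non-NULL) or
 * nothing at all, in which case the defaults stay in place. */
static void
get_caps(struct caps_v2 *caps, const struct caps_v2 *host_caps)
{
   fill_caps_defaults(caps);
   if (host_caps)
      *caps = *host_caps;
}
```

Keeping the defaults low means a guest that never receives v2 caps does not advertise limits the host cannot honour, while a newer host can still raise them.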

Re: [Mesa-dev] [PATCH 1/2] virgl: handle getting new capsets.

2018-02-20 Thread Stéphane Marchesin

On Tue, Feb 20, 2018 at 5:49 PM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> This checks the kernel api is new enough and asks for the
> larger caps size since the kernel won't mess it up now.
>
> Signed-off-by: Dave Airlie 

Reviewed-by: Stéphane Marchesin 

> ---
>  src/gallium/drivers/virgl/virgl_winsys.h   | 25 ++-
>  src/gallium/winsys/virgl/drm/virgl_drm_winsys.c| 52 
> ++
>  src/gallium/winsys/virgl/drm/virgl_drm_winsys.h|  1 +
>  src/gallium/winsys/virgl/drm/virtgpu_drm.h |  1 +
>  .../winsys/virgl/vtest/virgl_vtest_socket.c|  2 +-
>  .../winsys/virgl/vtest/virgl_vtest_winsys.c|  2 +
>  6 files changed, 52 insertions(+), 31 deletions(-)
>
> diff --git a/src/gallium/drivers/virgl/virgl_winsys.h 
> b/src/gallium/drivers/virgl/virgl_winsys.h
> index ea21f2b6712..d633678597b 100644
> --- a/src/gallium/drivers/virgl/virgl_winsys.h
> +++ b/src/gallium/drivers/virgl/virgl_winsys.h
> @@ -109,5 +109,28 @@ struct virgl_winsys {
>   struct pipe_box *sub_box);
>  };
>
> -
> +/* this defaults all newer caps,
> + * the kernel will overwrite these if newer version is available.
> + */
> +static inline void virgl_ws_fill_new_caps_defaults(struct virgl_drm_caps 
> *caps)
> +{
> +   caps->caps.v2.min_aliased_point_size = 0.f;
> +   caps->caps.v2.max_aliased_point_size = 255.f;
> +   caps->caps.v2.min_smooth_point_size = 0.f;
> +   caps->caps.v2.max_smooth_point_size = 255.f;
> +   caps->caps.v2.min_aliased_line_width = 0.f;
> +   caps->caps.v2.max_aliased_line_width = 255.f;
> +   caps->caps.v2.min_smooth_line_width = 0.f;
> +   caps->caps.v2.max_smooth_line_width = 255.f;
> +   caps->caps.v2.max_texture_lod_bias = 16.0f;
> +   caps->caps.v2.max_geom_output_vertices = 256;
> +   caps->caps.v2.max_geom_total_output_components = 16384;
> +   caps->caps.v2.max_vertex_outputs = 32;
> +   caps->caps.v2.max_vertex_attribs = 16;
> +   caps->caps.v2.max_shader_patch_varyings = 0;
> +   caps->caps.v2.min_texel_offset = -8;
> +   caps->caps.v2.max_texel_offset = 7;
> +   caps->caps.v2.min_texture_gather_offset = -8;
> +   caps->caps.v2.max_texture_gather_offset = 7;
> +}
>  #endif
> diff --git a/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c 
> b/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c
> index fd6ae98a515..77854680e59 100644
> --- a/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c
> +++ b/src/gallium/winsys/virgl/drm/virgl_drm_winsys.c
> @@ -705,46 +705,28 @@ static int virgl_drm_get_caps(struct virgl_winsys *vws,
> struct virgl_drm_winsys *vdws = virgl_drm_winsys(vws);
> struct drm_virtgpu_get_caps args;
> int ret;
> -   bool fill_v2 = false;
>
> -   memset(&args, 0, sizeof(args));
> +   virgl_ws_fill_new_caps_defaults(caps);
>
> -   args.cap_set_id = 1;
> +   memset(&args, 0, sizeof(args));
> +   if (vdws->has_capset_query_fix) {
> +  /* if we have the query fix - try and get cap set id 2 first */
> +  args.cap_set_id = 2;
> +  args.size = sizeof(union virgl_caps);
> +   } else {
> +  args.cap_set_id = 1;
> +  args.size = sizeof(struct virgl_caps_v1);
> +   }
> args.addr = (unsigned long)&caps->caps;
> -   args.size = sizeof(union virgl_caps);
>
> ret = drmIoctl(vdws->fd, DRM_IOCTL_VIRTGPU_GET_CAPS, &args);
> -
> if (ret == -1 && errno == EINVAL) {
>/* Fallback to v1 */
> +  args.cap_set_id = 1;
>args.size = sizeof(struct virgl_caps_v1);
>ret = drmIoctl(vdws->fd, DRM_IOCTL_VIRTGPU_GET_CAPS, &args);
>if (ret == -1)
>return ret;
> -  fill_v2 = true;
> -   }
> -   if (caps->caps.max_version == 1)
> -   fill_v2 = true;
> -
> -   if (fill_v2) {
> -  caps->caps.v2.min_aliased_point_size = 0.f;
> -  caps->caps.v2.max_aliased_point_size = 255.f;
> -  caps->caps.v2.min_smooth_point_size = 0.f;
> -  caps->caps.v2.max_smooth_point_size = 255.f;
> -  caps->caps.v2.min_aliased_line_width = 0.f;
> -  caps->caps.v2.max_aliased_line_width = 255.f;
> -  caps->caps.v2.min_smooth_line_width = 0.f;
> -  caps->caps.v2.max_smooth_line_width = 255.f;
> -  caps->caps.v2.max_texture_lod_bias = 16.0f;
> -  caps->caps.v2.max_geom_output_vertices = 256;
> -  caps->caps.v2.max_geom_total_output_components = 16384;
> -  caps->caps.v2.max_vertex_outputs = 32;
> -  caps->caps.v2.max_vertex_attribs = 16;
> -  caps->caps.v2.max_shader_patch_varyings = 0;
> -  caps->caps.v2.min_texel_offset = -8;
> -  caps->caps.v2.max_texel_offset = 7;
> -  caps->caps.v2.min_texture_gather_offset = -8;
> -  caps->caps.v2.max_texture_gather_offset = 7;
> }
> return ret;
>  }
> @@ -813,6 +795,8 @@ static struct virgl_winsys *
>  virgl_drm_winsys_create(int drmFD)
>  {
> struct virgl_drm_winsys *qdws;
> +   int ret;
> +   struct drm_virtgpu_getparam getparam = {0};
>
> qdws = CALLOC_STRUCT(virgl_drm_winsys);
> if (!qdws)
> @@
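
The query-then-fall-back flow in the virgl_drm_get_caps hunk quoted above can be sketched in isolation. This is only a minimal sketch under stated assumptions: fake_get_caps stands in for drmIoctl() with DRM_IOCTL_VIRTGPU_GET_CAPS, kernel_knows_v2 is an invented switch for the mock, and the fake_* structures are hypothetical placeholders, not the real virgl definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real virgl caps structures. */
struct fake_caps_v1 { int max_version; };
struct fake_caps_v2 { struct fake_caps_v1 v1; float max_texture_lod_bias; };

/* Hypothetical stand-in for struct drm_virtgpu_get_caps. */
struct fake_args { int cap_set_id; size_t size; };

/* Mock "kernel" switch: when false, requests for capset 2 are rejected. */
static bool kernel_knows_v2;

/* Mock of the GET_CAPS ioctl: fails with EINVAL for an unknown capset. */
static int fake_get_caps(const struct fake_args *args)
{
   if (args->cap_set_id == 2 && !kernel_knows_v2) {
      errno = EINVAL;
      return -1;
   }
   return 0;
}

/* Returns the capset id that was obtained, or -1 on failure. */
static int query_caps(bool has_capset_query_fix)
{
   struct fake_args args = {0};
   int ret;

   if (has_capset_query_fix) {
      /* With the query fix, try capset 2 and the larger size first. */
      args.cap_set_id = 2;
      args.size = sizeof(struct fake_caps_v2);
   } else {
      args.cap_set_id = 1;
      args.size = sizeof(struct fake_caps_v1);
   }

   ret = fake_get_caps(&args);
   if (ret == -1 && errno == EINVAL) {
      /* Fall back to v1: reset both the id and the size. */
      args.cap_set_id = 1;
      args.size = sizeof(struct fake_caps_v1);
      ret = fake_get_caps(&args);
   }
   return ret == 0 ? args.cap_set_id : -1;
}
```

In the patch itself the v2 defaults are filled in before any query, so a v1-only result still leaves the newer fields with usable values; the sketch leaves that part out.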

Re: [Mesa-dev] [PATCH 18.0] i965: Disable ARB_get_program_binary for compat profiles

2018-02-20 Thread Ilia Mirkin

Is this worth doing for st/mesa as well? Some quick grepping suggests
it's enabled on the 18.0 branch there too, but it's behind a
conditional which perhaps is never set.

On Tue, Feb 20, 2018 at 9:12 PM, Jordan Justen
 wrote:
> The QT framework has a bug in their shader program cache, which is
> built on GL_ARB_get_program_binary.
>
> In an effort to allow them to fix the bug we don't enable more than 1
> binary format for compatibility profiles.
>
> This is only being done on the 18.0 release branch.
>
> Ref: https://bugreports.qt.io/browse/QTBUG-66420
> Ref: https://bugs.freedesktop.org/show_bug.cgi?id=105065
> Cc: "18.0" 
> Cc: Mark Janes 
> Cc: Kenneth Graunke 
> Cc: Scott D Phillips 
> Signed-off-by: Jordan Justen 
> ---
>  docs/relnotes/17.4.0.html   | 2 +-
>  src/mesa/drivers/dri/i965/brw_context.c | 9 -
>  2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/docs/relnotes/17.4.0.html b/docs/relnotes/17.4.0.html
> index 412c0fc455e..fecdfe77969 100644
> --- a/docs/relnotes/17.4.0.html
> +++ b/docs/relnotes/17.4.0.html
> @@ -53,7 +53,7 @@ Note: some of the new features are only available with 
> certain drivers.
>  GL_ARB_enhanced_layouts on r600/evergreen+
>  GL_ARB_bindless_texture on nvc0/kepler
>  OpenGL 4.3 on r600/evergreen with hw fp64 support
> -Support 1 binary format for GL_ARB_get_program_binary on i965
> +Support 1 binary format for GL_ARB_get_program_binary on i965 (except in 
> GL compatibility profiles)
>  
>
>  Bug fixes
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index e9358b7bc9c..58527d77263 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -704,7 +704,14 @@ brw_initialize_context_constants(struct brw_context *brw)
>ctx->Const.AllowMappedBuffersDuringExecution = true;
>
> /* GL_ARB_get_program_binary */
> -   ctx->Const.NumProgramBinaryFormats = 1;
> +   /* The QT framework has a bug in their shader program cache, which is 
> built
> +* on GL_ARB_get_program_binary. In an effort to allow them to fix the bug
> +* we don't enable more than 1 binary format for compatibility profiles.
> +* This is only being done on the 18.0 release branch.
> +*/
> +   if (ctx->API != API_OPENGL_COMPAT) {
> +  ctx->Const.NumProgramBinaryFormats = 1;
> +   }
>  }
>
>  static void
> --
> 2.16.1

[Mesa-dev] [Bug 105183] Weird assertion in NIR linker

2018-02-20 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=105183

Bug ID: 105183
   Summary: Weird assertion in NIR linker
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: glsl-compiler
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: i...@freedesktop.org
QA Contact: intel-3d-b...@lists.freedesktop.org

GCC issues the following warning in my build:

In file included from ../../SOURCE/master/src/compiler/glsl_types.h:29:0,
 from ../../SOURCE/master/src/compiler/nir_types.h:36,
 from ../../SOURCE/master/src/compiler/nir/nir.h:39,
 from
../../SOURCE/master/src/compiler/nir/nir_linking_helpers.c:24:
../../SOURCE/master/src/compiler/nir/nir_linking_helpers.c: In function
‘remap_slots_and_components’:
../../SOURCE/master/src/compiler/nir/nir_linking_helpers.c:286:63: warning:
ordered comparison of pointer with integer zero [-Wextra]
  assert(remap[var->data.location - VARYING_SLOT_VAR0] >= 0);
   ^

I looked at the code, and remap is declared as "struct varying_loc
(*remap)[4]".  It is comparing that a pointer to an array of 4 structures is >=
0.  I'm not sure what the original intention was, but this tautology isn't
doing it.
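
A minimal sketch of why the assertion cannot fail, assuming only the declarator quoted above. The fields of struct varying_loc and both helper names are hypothetical and chosen for illustration. Indexing a pointer-to-array yields an array of 4 structs, which decays to a pointer to its first element, so a comparison against zero tests an address rather than any stored value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the real struct varying_loc may differ. */
struct varying_loc {
   uint8_t component;
   uint32_t location;
};

/* remap[slot] has type "array of 4 struct varying_loc", not an integer.
 * sizeof does not evaluate its operand, so this only inspects the type. */
static bool slot_is_array_of_four(struct varying_loc (*remap)[4], int slot)
{
   return sizeof(remap[slot]) == 4 * sizeof(struct varying_loc);
}

/* In a comparison the array decays to a pointer to its first element.
 * For any valid table that address is non-null, so a ">= 0" style check
 * on it says nothing about the contents. */
static bool decayed_pointer_is_non_null(struct varying_loc (*remap)[4], int slot)
{
   const struct varying_loc *first = remap[slot];
   return first != NULL;
}
```

Both helpers return true for every slot of a valid table, whatever the table holds, which matches the tautology described in the report.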

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [Bug 105183] Weird assertion in NIR linker

2018-02-20 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=105183

Ian Romanick  changed:

   What|Removed |Added

 CC||t_arc...@yahoo.com.au


[Mesa-dev] [PATCH 18.0] i965: Disable ARB_get_program_binary for compat profiles

2018-02-20 Thread Jordan Justen

The QT framework has a bug in their shader program cache, which is
built on GL_ARB_get_program_binary.

In an effort to allow them to fix the bug we don't enable more than 1
binary format for compatibility profiles.

This is only being done on the 18.0 release branch.

Ref: https://bugreports.qt.io/browse/QTBUG-66420
Ref: https://bugs.freedesktop.org/show_bug.cgi?id=105065
Cc: "18.0" 
Cc: Mark Janes 
Cc: Kenneth Graunke 
Cc: Scott D Phillips 
Signed-off-by: Jordan Justen 
---
 docs/relnotes/17.4.0.html   | 2 +-
 src/mesa/drivers/dri/i965/brw_context.c | 9 -
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/docs/relnotes/17.4.0.html b/docs/relnotes/17.4.0.html
index 412c0fc455e..fecdfe77969 100644
--- a/docs/relnotes/17.4.0.html
+++ b/docs/relnotes/17.4.0.html
@@ -53,7 +53,7 @@ Note: some of the new features are only available with 
certain drivers.
 GL_ARB_enhanced_layouts on r600/evergreen+
 GL_ARB_bindless_texture on nvc0/kepler
 OpenGL 4.3 on r600/evergreen with hw fp64 support
-Support 1 binary format for GL_ARB_get_program_binary on i965
+Support 1 binary format for GL_ARB_get_program_binary on i965 (except in 
GL compatibility profiles)
 
 
 Bug fixes
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index e9358b7bc9c..58527d77263 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -704,7 +704,14 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.AllowMappedBuffersDuringExecution = true;
 
/* GL_ARB_get_program_binary */
-   ctx->Const.NumProgramBinaryFormats = 1;
+   /* The QT framework has a bug in their shader program cache, which is built
+* on GL_ARB_get_program_binary. In an effort to allow them to fix the bug
+* we don't enable more than 1 binary format for compatibility profiles.
+* This is only being done on the 18.0 release branch.
+*/
+   if (ctx->API != API_OPENGL_COMPAT) {
+  ctx->Const.NumProgramBinaryFormats = 1;
+   }
 }
 
 static void
-- 
2.16.1
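
The effect of the brw_context.c hunk can be sketched without a GL context. This is a hedged illustration only: fake_api, num_program_binary_formats and should_use_binary_cache are hypothetical names, and the sketch assumes the constant stays at zero when the driver does not set it. The application-side check reflects that GL_ARB_get_program_binary permits an implementation to report zero binary formats.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the context API enum used in the patch. */
enum fake_api { FAKE_API_CORE, FAKE_API_COMPAT };

/* Mirrors the patched logic: one format, except in compatibility profiles.
 * Assumes the value is otherwise left at zero. */
static unsigned num_program_binary_formats(enum fake_api api)
{
   unsigned formats = 0;

   if (api != FAKE_API_COMPAT)
      formats = 1;
   return formats;
}

/* Application side: only cache program binaries when the implementation
 * reports at least one format. */
static bool should_use_binary_cache(unsigned num_formats)
{
   return num_formats > 0;
}
```

An application that guards its cache this way skips the program binary path in a compatibility profile on this branch.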

