date:20171122

[Mesa-dev] [PATCH] mesa: add AllowGLSLCrossStageInterpolationMismatch workaround

2017-11-22 Thread Tapani Pälli

This fixes issues seen with certain versions of the Unreal Engine 4 editor
and games built with it that use GLSL 4.30.

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97852
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103801
---
 src/compiler/glsl/link_varyings.cpp | 51 +++--
 src/gallium/include/state_tracker/st_api.h  |  1 +
 src/gallium/state_trackers/dri/dri_screen.c |  2 ++
 src/mesa/drivers/dri/i965/brw_context.c |  3 ++
 src/mesa/drivers/dri/i965/intel_screen.c|  1 +
 src/mesa/main/mtypes.h  |  5 +++
 src/mesa/state_tracker/st_extensions.c  |  2 ++
 src/util/drirc  |  8 +
 src/util/xmlpool/t_options.h|  4 +++
 9 files changed, 59 insertions(+), 18 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp b/src/compiler/glsl/link_varyings.cpp
index 72309365a0..0f53cd4aa9 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -189,7 +189,8 @@ process_xfb_layout_qualifiers(void *mem_ctx, const gl_linked_shader *sh,
  * matching input to another stage.
  */
 static void
-cross_validate_types_and_qualifiers(struct gl_shader_program *prog,
+cross_validate_types_and_qualifiers(struct gl_context *ctx,
+struct gl_shader_program *prog,
 const ir_variable *input,
 const ir_variable *output,
 gl_shader_stage consumer_stage,
@@ -343,17 +344,30 @@ cross_validate_types_and_qualifiers(struct gl_shader_program *prog,
}
if (input_interpolation != output_interpolation &&
prog->data->Version < 440) {
-  linker_error(prog,
-   "%s shader output `%s' specifies %s "
-   "interpolation qualifier, "
-   "but %s shader input specifies %s "
-   "interpolation qualifier\n",
-   _mesa_shader_stage_to_string(producer_stage),
-   output->name,
-   interpolation_string(output->data.interpolation),
-   _mesa_shader_stage_to_string(consumer_stage),
-   interpolation_string(input->data.interpolation));
-  return;
+  if (!ctx->Const.AllowGLSLCrossStageInterpolationMismatch) {
+ linker_error(prog,
+  "%s shader output `%s' specifies %s "
+  "interpolation qualifier, "
+  "but %s shader input specifies %s "
+  "interpolation qualifier\n",
+  _mesa_shader_stage_to_string(producer_stage),
+  output->name,
+  interpolation_string(output->data.interpolation),
+  _mesa_shader_stage_to_string(consumer_stage),
+  interpolation_string(input->data.interpolation));
+ return;
+  } else {
+ linker_warning(prog,
+"%s shader output `%s' specifies %s "
+"interpolation qualifier, "
+"but %s shader input specifies %s "
+"interpolation qualifier\n",
+_mesa_shader_stage_to_string(producer_stage),
+output->name,
+interpolation_string(output->data.interpolation),
+_mesa_shader_stage_to_string(consumer_stage),
+interpolation_string(input->data.interpolation));
+  }
}
 }
 
@@ -361,7 +375,8 @@ cross_validate_types_and_qualifiers(struct gl_shader_program *prog,
  * Validate front and back color outputs against single color input
  */
 static void
-cross_validate_front_and_back_color(struct gl_shader_program *prog,
+cross_validate_front_and_back_color(struct gl_context *ctx,
+struct gl_shader_program *prog,
 const ir_variable *input,
 const ir_variable *front_color,
 const ir_variable *back_color,
@@ -369,11 +384,11 @@ cross_validate_front_and_back_color(struct gl_shader_program *prog,
 gl_shader_stage producer_stage)
 {
if (front_color != NULL && front_color->data.assigned)
-  cross_validate_types_and_qualifiers(prog, input, front_color,
+  cross_validate_types_and_qualifiers(ctx, prog, input, front_color,
   consumer_stage, producer_stage);
 
if (back_color != NULL && back_color->data.assigned)
-  cross_validate_types_and_qualifiers(prog, input, back_color,
+  cross_validate_types_and_qualifiers(ctx, prog, input, back_color,
   consumer_stage, producer_stage);
 }
 
@@ -710,7 +725,7 @@ cross_validate_outputs_to_inputs(struct
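
The behaviour change in the hunk above can be illustrated outside of Mesa with a minimal sketch. Everything here (LinkResult, Diagnostic, check_interpolation) is a hypothetical stand-in and not Mesa's linker API; it only assumes what the diff shows: below GLSL 4.40 a cross-stage interpolation mismatch is an error, unless the workaround flag downgrades it to a warning.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the linker state; not Mesa's real structures.
enum class Severity { Warning, Error };

struct Diagnostic {
    Severity severity;
    std::string message;
};

struct LinkResult {
    bool ok = true;
    std::vector<Diagnostic> log;
};

// Report an interpolation-qualifier mismatch between two stages. For GLSL
// versions below 4.40 this fails the link unless the workaround is enabled,
// in which case only a warning is logged and linking continues.
inline void check_interpolation(LinkResult &result, int glsl_version,
                                bool allow_mismatch,
                                const std::string &out_qual,
                                const std::string &in_qual)
{
    if (out_qual == in_qual || glsl_version >= 440)
        return;

    const std::string msg = "output specifies " + out_qual +
                            " interpolation qualifier, but input specifies " +
                            in_qual + " interpolation qualifier";
    if (allow_mismatch) {
        result.log.push_back({Severity::Warning, msg});
    } else {
        result.log.push_back({Severity::Error, msg});
        result.ok = false;
    }
}
```

With the flag off, a flat/smooth mismatch at version 430 leaves result.ok false; with it on, the same mismatch only adds a warning to the log.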

Re: [Mesa-dev] V2 Initial GS NIR support for radeonsi

2017-11-22 Thread Timothy Arceri


On 23/11/17 15:09, Dieter Nützel wrote:

On 22.11.2017 10:29, Timothy Arceri wrote:

This series depends on [1] and [2].

V2
 - use driver_location as per Nicolai's suggestion
 - tidy ups as per Marek's suggestions
 - bug fixes (many more piglit tests now passing)

[1] https://patchwork.freedesktop.org/series/34131/
[2] https://patchwork.freedesktop.org/series/34132/


Hello Timothy,

I could run Unigine_Heaven-4.0 (with tess disabled of course) and 
Unigine_Valley-1.0 with all 3 together on my RX580.
If I try to switch to wireframe, the 'game' window disappears (as 
expected, too).


SOURCE/Unigine_Valley-1.0> echo $R600_DEBUG
nir

So here is my

Tested-by: Dieter Nützel 

on all _3_ series.


Cool. Thanks for testing.



GREAT work!
Dieter

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] V2 Initial GS NIR support for radeonsi

2017-11-22 Thread Dieter Nützel


On 22.11.2017 10:29, Timothy Arceri wrote:

This series depends on [1] and [2].

V2
 - use driver_location as per Nicolai's suggestion
 - tidy ups as per Marek's suggestions
 - bug fixes (many more piglit tests now passing)

[1] https://patchwork.freedesktop.org/series/34131/
[2] https://patchwork.freedesktop.org/series/34132/


Hello Timothy,

I could run Unigine_Heaven-4.0 (with tess disabled of course) and 
Unigine_Valley-1.0 with all 3 together on my RX580.
If I try to switch to wireframe, the 'game' window disappears (as 
expected, too).


SOURCE/Unigine_Valley-1.0> echo $R600_DEBUG
nir

So here is my

Tested-by: Dieter Nützel 

on all _3_ series.

GREAT work!
Dieter

Re: [Mesa-dev] [PATCH 06/20] vbo: decrease the size of vbo_context slightly

2017-11-22 Thread Marek Olšák

On Wed, Nov 22, 2017 at 10:53 PM, Ian Romanick  wrote:
> On 11/21/2017 10:01 AM, Marek Olšák wrote:
>> From: Marek Olšák 
>>
>> vbo_context: 21520 -> 20344 bytes
>> ---
>>  src/mesa/main/mtypes.h   | 8 
>>  src/mesa/vbo/vbo_context.h   | 4 ++--
>>  src/mesa/vbo/vbo_exec_draw.c | 2 +-
>>  src/mesa/vbo/vbo_save_draw.c | 2 +-
>>  4 files changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>> index 67711d8..660b1a5 100644
>> --- a/src/mesa/main/mtypes.h
>> +++ b/src/mesa/main/mtypes.h
>> @@ -1452,31 +1452,31 @@ struct gl_pixelstore_attrib
>>  };
>>
>>
>>  /**
>>   * Vertex array information which is derived from gl_array_attributes
>>   * and gl_vertex_buffer_binding information.  Used by the VBO module and
>>   * device drivers.
>>   */
>>  struct gl_vertex_array
>>  {
>> -   GLint Size;  /**< components per element (1,2,3,4) */
>> GLenum16 Type;   /**< datatype: GL_FLOAT, GL_INT, etc */
>> GLenum16 Format; /**< default: GL_RGBA, but may be GL_BGRA */
>> -   GLsizei StrideB;  /**< actual stride in bytes */
>> -   GLuint _ElementSize; /**< size of each element in bytes */
>> -   const GLubyte *Ptr;  /**< Points to array data */
>> +   GLshort StrideB;  /**< actual stride in bytes */
>
> It looks like the largest value anyone currently advertises for
> MaxVertexAttribStride is 2048.  We should probably have a check
> somewhere that someone doesn't try to use 65537.

I'm not sure where the check should be.

Marek
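
One possible shape for such a check, as a minimal sketch only: narrow_stride is a hypothetical helper, not an existing Mesa function, and it assumes the stride arrives as a plain int and that the destination field is a signed 16-bit integer (as GLshort is). Where it would be called from is exactly the open question above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true and writes the narrowed value only if
// `stride` is non-negative and representable in a signed 16-bit field.
// On failure `out` is left untouched.
inline bool narrow_stride(int stride, std::int16_t &out)
{
    if (stride < 0 || stride > std::numeric_limits<std::int16_t>::max())
        return false;
    out = static_cast<std::int16_t>(stride);
    return true;
}
```

A stride of 2048 passes, while 32768 or 65537 is rejected instead of being silently truncated.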

Re: [Mesa-dev] [PATCH 14/14] anv: Add support for MSAA fast-clears

2017-11-22 Thread Nanley Chery

On Mon, Nov 13, 2017 at 08:12:54AM -0800, Jason Ekstrand wrote:
> This speeds up the Sascha Willems multisampling demo by around 25% when
> using 8x MSAA.
> ---
>  src/intel/vulkan/anv_blorp.c   |  6 ++
>  src/intel/vulkan/genX_cmd_buffer.c | 22 --
>  2 files changed, 18 insertions(+), 10 deletions(-)
> 


For better performance, I think this patch should include a hunk in
cmd_buffer_subpass_sync_fast_clear_values() to set the needs_resolve
dword to false if the fast clear color is 0. We currently do this for
CCS_E.

-Nanley
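
The rule suggested above can be written down as a small predicate. This is a sketch under stated assumptions, not the anv code: ClearColor, clear_color_is_zero and needs_resolve_after_fast_clear are hypothetical names, and the clear color is modelled as four raw 32-bit channels rather than the driver's own clear-value type.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a fast-clear color (four raw channel values).
using ClearColor = std::array<std::uint32_t, 4>;

// True when every channel of the clear color is zero.
inline bool clear_color_is_zero(const ClearColor &color)
{
    for (std::uint32_t channel : color) {
        if (channel != 0)
            return false;
    }
    return true;
}

// Value to store in the per-image "needs resolve" flag after a fast clear,
// following the suggestion in the review: an all-zero clear color means the
// flag can be set to false.
inline bool needs_resolve_after_fast_clear(const ClearColor &color)
{
    return !clear_color_is_zero(color);
}
```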

> diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> index 266cb9a..1e15797 100644
> --- a/src/intel/vulkan/anv_blorp.c
> +++ b/src/intel/vulkan/anv_blorp.c
> @@ -1549,6 +1549,12 @@ anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer 
> *cmd_buffer)
>   get_blorp_surf_for_anv_image(src_iview->image,
>VK_IMAGE_ASPECT_COLOR_BIT,
>src_aux_usage, &src_surf);
> + if (src_aux_usage == ISL_AUX_USAGE_MCS) {
> +src_surf.clear_color_addr = anv_to_blorp_address(
> +   anv_image_get_clear_color_addr(cmd_buffer->device,
> +  src_iview->image,
> +  VK_IMAGE_ASPECT_COLOR_BIT, 0));
> + }
>   get_blorp_surf_for_anv_image(dst_iview->image,
>VK_IMAGE_ASPECT_COLOR_BIT,
>dst_aux_usage, &dst_surf);
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> b/src/intel/vulkan/genX_cmd_buffer.c
> index 2491b1d..1c1c644 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -244,8 +244,6 @@ color_attachment_compute_aux_usage(struct anv_device * 
> device,
> } else if (iview->image->planes[0].aux_usage == ISL_AUX_USAGE_MCS) {
>att_state->aux_usage = ISL_AUX_USAGE_MCS;
>att_state->input_aux_usage = ISL_AUX_USAGE_MCS;
> -  att_state->fast_clear = false;
> -  return;
> } else if (iview->image->planes[0].aux_usage == ISL_AUX_USAGE_CCS_E) {
>att_state->aux_usage = ISL_AUX_USAGE_CCS_E;
>att_state->input_aux_usage = ISL_AUX_USAGE_CCS_E;
> @@ -281,7 +279,8 @@ color_attachment_compute_aux_usage(struct anv_device * 
> device,
>}
> }
>  
> -   assert(iview->image->planes[0].aux_surface.isl.usage & 
> ISL_SURF_USAGE_CCS_BIT);
> +   assert(iview->image->planes[0].aux_surface.isl.usage &
> +(ISL_SURF_USAGE_CCS_BIT | ISL_SURF_USAGE_MCS_BIT));
>  
> att_state->clear_color_is_zero_one =
>color_is_zero_one(att_state->clear_value.color, 
> iview->planes[0].isl.format);
> @@ -726,9 +725,6 @@ transition_color_buffer(struct anv_cmd_buffer *cmd_buffer,
> * if the initial layout is COLOR_ATTACHMENT_OPTIMAL.
> */
>return;
> -   } else if (image->samples > 1) {
> -  /* MCS buffers don't need resolving. */
> -  return;
> }
>  
> /* Perform a resolve to synchronize data between the main and aux buffer.
> @@ -760,10 +756,16 @@ transition_color_buffer(struct anv_cmd_buffer 
> *cmd_buffer,
>  
>genX(load_needs_resolve_predicate)(cmd_buffer, image, aspect, level);
>  
> -  anv_ccs_resolve(cmd_buffer, image, aspect, level, base_layer, 
> layer_count,
> -  image->planes[plane].aux_usage == ISL_AUX_USAGE_CCS_E ?
> -  BLORP_FAST_CLEAR_OP_RESOLVE_PARTIAL :
> -  BLORP_FAST_CLEAR_OP_RESOLVE_FULL);
> +  if (image->samples > 1) {
> + anv_mcs_partial_resolve(cmd_buffer, image, aspect,
> + base_layer, layer_count);
> +  } else {
> + anv_ccs_resolve(cmd_buffer, image, aspect,
> + level, base_layer, layer_count,
> + image->planes[plane].aux_usage == 
> ISL_AUX_USAGE_CCS_E ?
> + BLORP_FAST_CLEAR_OP_RESOLVE_PARTIAL :
> + BLORP_FAST_CLEAR_OP_RESOLVE_FULL);
> +  }
>  
>genX(set_image_needs_resolve)(cmd_buffer, image, aspect, level, false);
> }
> -- 
> 2.5.0.400.gff86faf
> 

Re: [Mesa-dev] [PATCH 00/28] vulkan/wsi: Rework WSI to look a lot more like a layer

2017-11-22 Thread Jason Ekstrand

On November 22, 2017 15:12:20 Grazvydas Ignotas  wrote:

On Wed, Nov 22, 2017 at 7:54 AM, Jason Ekstrand  wrote:

On Tue, Nov 21, 2017 at 1:21 PM, Grazvydas Ignotas 
wrote:

On Mon, Nov 20, 2017 at 6:08 PM, Jason Ekstrand 
wrote:
> On Sun, Nov 19, 2017 at 5:07 AM, Grazvydas Ignotas 
> wrote:
>>
>> On Sun, Nov 19, 2017 at 1:51 AM, Jason Ekstrand 
>> wrote:
>> >
>> > I force-pushed the branch again with an added commit: "radv: Move wsi
>> > initialization later in physical_device_init" that fixes the memory
>> > type
>> > issue with radv.  I've tested both radv + radeon and anv + radeon on
>> > my
>> > HSW
>> > + Rx550 and they both work now.  I'm having a bit of trouble actually
>> > getting my system to start up on the Intel card so I'll have to leave
>> > testing radv on Intel for another day.
>>
>> Radv is working now on both displays, however "display on amd + anv"
>> case still acts the same (black window on most, but not all
>> SaschaWillems demos). I'm using xf86-video-amdgpu 1.4.0, 4.14 kernel
>> and xorg-server 1.18.4, if that makes a difference.
>
>
> I'm completely unable to reproduce.  Here's my setup:
>
>  - Fedora 27
>  - X.org 1.19.5
>  - xf86-video-amdgpu 1.3.0
>  - Linux 4.13.12
>  - Intel Haswell
>  - AMD RX550
>
> I've tried with amdgpu, modesetting, and XWayland all running on the AMD
> card and anv works on all three.  I'm a little weirded out by the fact
> that
> my X server is newer but my xf86-video-amdgpu is older.

Well I compiled my own xf86-video-amdgpu. Not sure why.

> Two things I'd like you to try if you can:
>
>  1) Use modesetting.  It may be a bug in your version of amdgpu.

Same results (black window), plus all the tearing all over I usually
get with it. Also tried the distro kernel (4.10).

>  2) Try the attached patch with radv + display on AMD.  It will make
> radv
> use the prime path regardless of the fact that it's displaying on the
> same
> GPU.

Still works fine, albeit a bit slower (as expected I guess).
Maybe something specific to SKL?

I think it may be, weirdly enough.  More specifically, I think you're
probably hitting a bug I found today in the Sascha demos that probably only
actually shows up on SKL.  Give this PR a try:

https://github.com/SaschaWillems/Vulkan/pull/400

That helps, thanks.

Cool, glad to see things are working better than that.

But textoverlay and subpasses still show a black
screen on the amd display, and multisampling is dark even on intel. All are
fine on radv on both displays.

Here are some weird ones:
- texture3d works on anv on amd display, but triggers
"vulkan/anv_batch_chain.c:1139: adjust_relocations_from_state_pool:
Assertion `last_pool_center_bo_offset <=
pool->block_pool.center_bo_offset' failed." on intel display.
- multisampling on radv isn't clearing the background on intel
display, but is fine on amd display.

Thanks for the info, I'll look into it on Monday.

--Jason

In other tests with this series, Talos Principle and Serious Sam
Fusion work on anv+amd display, while DOOM is failing with
VK_ERROR_FEATURE_NOT_PRESENT on any display (and has done so for many
months now, probably since 5dd96b1156, but I haven't tried looking
into what the feature is).

Gražvydas


Re: [Mesa-dev] [PATCH 01/20] mesa: replace GLenum with GLenum16 in common structures (v2)

2017-11-22 Thread Fredrik Höglund

On Wednesday 22 November 2017, Ian Romanick wrote:
> There are a couple small nits below.
> 
> I haven't reviewed this with the rigor that I would like because there
> are just so many enums changed all at once.  I fear the day someone has
> a bug that bisects to this commit.
> 
> The difficult part was matching the changes in get_hash_params.py to the
> changes in mtypes.h.  It seems like modern C or C++ should have a way
> that we could let the compiler sort some of this out.  GCC has typeof(),
> but it looks like the equivalent things in Visual Studio are all
> run-time checks (typeid()) or deprecated (__typeof()).  Ugh.

C++11 has decltype(). It's supported by Visual Studio 2010 and later.

Fredrik
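
As a minimal sketch of how decltype() could let the compiler do that matching: the structure and the FIELD_HAS_TYPE macro below are hypothetical and only illustrate the idea; they are not taken from mtypes.h or get_hash_params.py. The checks are evaluated entirely at compile time.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical structure standing in for state whose field was narrowed.
struct example_state {
    std::uint16_t Mode;   // imagine this was a 32-bit enum before
    std::uint32_t Mask;
};

// True when the named member has exactly the expected type.
#define FIELD_HAS_TYPE(type, field, expected) \
    std::is_same<decltype(type::field), expected>::value

// Compile-time checks: a table entry that still assumed a 32-bit Mode
// would fail to build instead of misbehaving at run time.
static_assert(FIELD_HAS_TYPE(example_state, Mode, std::uint16_t),
              "Mode is expected to be 16 bits wide");
static_assert(!FIELD_HAS_TYPE(example_state, Mode, std::uint32_t),
              "a stale 32-bit assumption would be caught here");
```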


Re: [Mesa-dev] [PATCH 13/14] anv/blorp: Add an mcs_partial_resolve helper

2017-11-22 Thread Nanley Chery

On Mon, Nov 13, 2017 at 08:12:53AM -0800, Jason Ekstrand wrote:
> ---
>  src/intel/vulkan/anv_blorp.c   | 31 +++
>  src/intel/vulkan/anv_private.h |  6 ++
>  2 files changed, 37 insertions(+)
> 
> diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> index 27320c2..266cb9a 100644
> --- a/src/intel/vulkan/anv_blorp.c
> +++ b/src/intel/vulkan/anv_blorp.c
> @@ -1696,3 +1696,34 @@ anv_ccs_resolve(struct anv_cmd_buffer * const 
> cmd_buffer,
>  
> blorp_batch_finish(&batch);
>  }
> +
> +void
> +anv_mcs_partial_resolve(struct anv_cmd_buffer * const cmd_buffer,
> +const struct anv_image * const image,
> +VkImageAspectFlagBits aspect,
> +const uint32_t start_layer, const uint32_t 
> layer_count)
> +{
> +   assert(cmd_buffer && image);
> +
> +   uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
> +
> +   /* The resolved subresource range must have a CCS buffer. */
^
MCS

With that fixed, this patch is
Reviewed-by: Nanley Chery 

> +   assert(aspect == VK_IMAGE_ASPECT_COLOR_BIT);
> +   assert(start_layer + layer_count <= anv_image_aux_layers(image, aspect, 
> 0));
> +   assert(image->samples > 1);
> +
> +   struct blorp_batch batch;
> +   blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
> +BLORP_BATCH_PREDICATE_ENABLE);
> +
> +   struct blorp_surf surf;
> +   get_blorp_surf_for_anv_image(image, aspect, ISL_AUX_USAGE_MCS, &surf);
> +   surf.clear_color_addr = anv_to_blorp_address(
> +  anv_image_get_clear_color_addr(cmd_buffer->device, image, aspect, 0));
> +
> +   blorp_mcs_partial_resolve(&batch, &surf,
> + image->planes[plane].surface.isl.format,
> + start_layer, layer_count);
> +
> +   blorp_batch_finish(&batch);
> +}
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index a1b1d48..6be7e58 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -2534,6 +2534,12 @@ anv_ccs_resolve(struct anv_cmd_buffer * const 
> cmd_buffer,
>  const enum blorp_fast_clear_op op);
>  
>  void
> +anv_mcs_partial_resolve(struct anv_cmd_buffer * const cmd_buffer,
> +const struct anv_image * const image,
> +VkImageAspectFlagBits aspect,
> +const uint32_t start_layer, const uint32_t 
> layer_count);
> +
> +void
>  anv_image_fast_clear(struct anv_cmd_buffer *cmd_buffer,
>   const struct anv_image *image,
>   VkImageAspectFlagBits aspect,
> -- 
> 2.5.0.400.gff86faf
> 

Re: [Mesa-dev] [PATCH 10/14] intel/blorp: Drop blorp_resolve_ccs_attachment

2017-11-22 Thread Nanley Chery

On Mon, Nov 13, 2017 at 08:12:50AM -0800, Jason Ekstrand wrote:
> The only reason why we needed that version was because the Vulkan driver
> needed to be able to create the surface states so it could handle
> indirect clear colors.  Now that blorp handles them natively, there's no
> need for the extra entrypoint.
> ---
>  src/intel/blorp/blorp.h   | 11 ---
>  src/intel/blorp/blorp_clear.c | 70 
> +--
>  2 files changed, 20 insertions(+), 61 deletions(-)
> 

This patch is
Reviewed-by: Nanley Chery 

> diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
> index 690e65f..a95b6a7 100644
> --- a/src/intel/blorp/blorp.h
> +++ b/src/intel/blorp/blorp.h
> @@ -207,17 +207,6 @@ blorp_ccs_resolve(struct blorp_batch *batch,
>enum isl_format format,
>enum blorp_fast_clear_op resolve_op);
>  
> -/* Resolves subresources of the image subresource range specified in the
> - * binding table.
> - */
> -void
> -blorp_ccs_resolve_attachment(struct blorp_batch *batch,
> - const uint32_t binding_table_offset,
> - struct blorp_surf * const surf,
> - const uint32_t level, const uint32_t num_layers,
> - const enum isl_format format,
> - const enum blorp_fast_clear_op resolve_op);
> -
>  void
>  blorp_mcs_partial_resolve(struct blorp_batch *batch,
>struct blorp_surf *surf,
> diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
> index 56cc3dd..8e7bc9f 100644
> --- a/src/intel/blorp/blorp_clear.c
> +++ b/src/intel/blorp/blorp_clear.c
> @@ -715,17 +715,18 @@ blorp_clear_attachments(struct blorp_batch *batch,
> batch->blorp->exec(batch, &params);
>  }
>  
> -static void
> -prepare_ccs_resolve(struct blorp_batch * const batch,
> -struct blorp_params * const params,
> -const struct blorp_surf * const surf,
> -const uint32_t level, const uint32_t layer,
> -const enum isl_format format,
> -const enum blorp_fast_clear_op resolve_op)
> +void
> +blorp_ccs_resolve(struct blorp_batch *batch,
> +  struct blorp_surf *surf, uint32_t level,
> +  uint32_t start_layer, uint32_t num_layers,
> +  enum isl_format format,
> +  enum blorp_fast_clear_op resolve_op)
>  {
> -   blorp_params_init(params);
> -   brw_blorp_surface_info_init(batch->blorp, &params->dst, surf,
> -   level, layer, format, true);
> +   struct blorp_params params;
> +
> +   blorp_params_init(&params);
> +   brw_blorp_surface_info_init(batch->blorp, &params.dst, surf,
> +   level, start_layer, format, true);
>  
> /* From the Ivy Bridge PRM, Vol2 Part1 11.9 "Render Target Resolve":
>  *
> @@ -737,7 +738,7 @@ prepare_ccs_resolve(struct blorp_batch * const batch,
>  * multiply by 8 and 16. On Sky Lake, we multiply by 8.
>  */
> const struct isl_format_layout *aux_fmtl =
> -  isl_format_get_layout(params->dst.aux_surf.format);
> +  isl_format_get_layout(params.dst.aux_surf.format);
> assert(aux_fmtl->txc == ISL_TXC_CCS);
>  
> unsigned x_scaledown, y_scaledown;
> @@ -751,11 +752,11 @@ prepare_ccs_resolve(struct blorp_batch * const batch,
>x_scaledown = aux_fmtl->bw / 2;
>y_scaledown = aux_fmtl->bh / 2;
> }
> -   params->x0 = params->y0 = 0;
> -   params->x1 = minify(params->dst.aux_surf.logical_level0_px.width, level);
> -   params->y1 = minify(params->dst.aux_surf.logical_level0_px.height, level);
> -   params->x1 = ALIGN(params->x1, x_scaledown) / x_scaledown;
> -   params->y1 = ALIGN(params->y1, y_scaledown) / y_scaledown;
> +   params.x0 = params.y0 = 0;
> +   params.x1 = minify(params.dst.aux_surf.logical_level0_px.width, level);
> +   params.y1 = minify(params.dst.aux_surf.logical_level0_px.height, level);
> +   params.x1 = ALIGN(params.x1, x_scaledown) / x_scaledown;
> +   params.y1 = ALIGN(params.y1, y_scaledown) / y_scaledown;
>  
> if (batch->blorp->isl_dev->info->gen >= 9) {
>assert(resolve_op == BLORP_FAST_CLEAR_OP_RESOLVE_FULL ||
> @@ -764,7 +765,8 @@ prepare_ccs_resolve(struct blorp_batch * const batch,
>/* Broadwell and earlier do not have a partial resolve */
>assert(resolve_op == BLORP_FAST_CLEAR_OP_RESOLVE_FULL);
> }
> -   params->fast_clear_op = resolve_op;
> +   params.fast_clear_op = resolve_op;
> +   params.num_layers = num_layers;
>  
> /* Note: there is no need to initialize push constants because it doesn't
>  * matter what data gets dispatched to the render target.  However, we 
> must
> @@ -772,40 +774,8 @@ prepare_ccs_resolve(struct blorp_batch * const batch,
>  * color" message.
>  */
>  
> -   if (!blorp_params_get_clear_kernel(batch->blorp, params, true))
>

Re: [Mesa-dev] [PATCH 15/15] radeonsi: enable gs support for nir backend

2017-11-22 Thread Timothy Arceri


On 23/11/17 00:24, Ilia Mirkin wrote:

On Wed, Nov 22, 2017 at 5:30 AM, Timothy Arceri  wrote:

---
  src/gallium/drivers/radeonsi/si_pipe.c   |  3 ++-
  src/gallium/drivers/radeonsi/si_shader_nir.c | 10 --
  src/mesa/state_tracker/st_glsl_to_nir.cpp| 12 
  3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index b3d8ae508b..dffdab5d41 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -542,21 +542,21 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
 case PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT:
 case PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT:
 case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
 case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
 case PIPE_CAP_MAX_VERTEX_STREAMS:
 case PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT:
 return 4;

 case PIPE_CAP_GLSL_FEATURE_LEVEL:
 if (sscreen->b.debug_flags & DBG(NIR))
-   return 140; /* no geometry and tessellation shaders yet 
*/
+   return 150; /* no tessellation shaders yet */


330 presumably?


The thing is that even though the GLSL version doesn't support tess, the 
extension is actually enabled, as we don't have (and shouldn't bother 
implementing) a different extension table for the tgsi and nir backends. 
Just bumping this to 150 causes a bunch of tess piglit tests to start 
running (and crashing). I'd rather just keep this at 150 for now until I 
can fix more bugs and enable tess support.
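
The interaction described above can be sketched as a small predicate. This is an illustration under assumptions, not piglit's or Mesa's actual logic: TestRequirements, DriverCaps and test_would_run are hypothetical names, and the sketch assumes a tess test asks for GLSL 1.50 plus the tessellation extension.

```cpp
#include <cassert>

// Hypothetical description of what a shader test asks for.
struct TestRequirements {
    int glsl_version;           // e.g. 150 for "GLSL >= 1.50"
    bool needs_tess_extension;  // test wants the tessellation extension
};

// Hypothetical description of what a driver advertises.
struct DriverCaps {
    int glsl_feature_level;     // value reported as the GLSL feature level
    bool tess_extension;        // extension table shared by both backends
};

// A test runs when the advertised level and extensions cover its needs.
inline bool test_would_run(const DriverCaps &caps, const TestRequirements &req)
{
    if (caps.glsl_feature_level < req.glsl_version)
        return false;
    if (req.needs_tess_extension && !caps.tess_extension)
        return false;
    return true;
}
```

Under these assumptions, a driver reporting 140 with the extension enabled skips such a test, while the same driver reporting 150 starts running it even though the backend cannot handle tessellation yet.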


Re: [Mesa-dev] [PATCH 09/14] anv: Let blorp handle indirect clear colors for CCS resolves

2017-11-22 Thread Nanley Chery

On Mon, Nov 13, 2017 at 08:12:49AM -0800, Jason Ekstrand wrote:
> ---
>  src/intel/vulkan/anv_blorp.c   | 32 +---
>  src/intel/vulkan/anv_private.h |  4 +--
>  src/intel/vulkan/genX_cmd_buffer.c | 51 
> +-
>  3 files changed, 20 insertions(+), 67 deletions(-)
> 

This patch is
Reviewed-by: Nanley Chery 

> diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> index 6deb350..27320c2 100644
> --- a/src/intel/vulkan/anv_blorp.c
> +++ b/src/intel/vulkan/anv_blorp.c
> @@ -177,6 +177,15 @@ get_blorp_surf_for_anv_buffer(struct anv_device *device,
>  
>  #define ANV_AUX_USAGE_DEFAULT ((enum isl_aux_usage)0xff)
>  
> +static struct blorp_address
> +anv_to_blorp_address(struct anv_address addr)
> +{
> +   return (struct blorp_address) {
> +  .buffer = addr.bo,
> +  .offset = addr.offset,
> +   };
> +}
> +
>  static void
>  get_blorp_surf_for_anv_image(const struct anv_image *image,
>   VkImageAspectFlags aspect,
> @@ -1655,10 +1664,10 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer 
> *cmd_buffer,
>  
>  void
>  anv_ccs_resolve(struct anv_cmd_buffer * const cmd_buffer,
> -const struct anv_state surface_state,
>  const struct anv_image * const image,
>  VkImageAspectFlagBits aspect,
> -const uint8_t level, const uint32_t layer_count,
> +const uint8_t level,
> +const uint32_t start_layer, const uint32_t layer_count,
>  const enum blorp_fast_clear_op op)
>  {
> assert(cmd_buffer && image);
> @@ -1667,17 +1676,10 @@ anv_ccs_resolve(struct anv_cmd_buffer * const 
> cmd_buffer,
>  
> /* The resolved subresource range must have a CCS buffer. */
> assert(level < anv_image_aux_levels(image, aspect));
> -   assert(layer_count <= anv_image_aux_layers(image, aspect, level));
> +   assert(start_layer + layer_count <=
> +  anv_image_aux_layers(image, aspect, level));
> assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV && 
> image->samples == 1);
>  
> -   /* Create a binding table for this surface state. */
> -   uint32_t binding_table;
> -   VkResult result =
> -  binding_table_for_surface_state(cmd_buffer, surface_state,
> -  &binding_table);
> -   if (result != VK_SUCCESS)
> -  return;
> -
> struct blorp_batch batch;
> blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer,
>  BLORP_BATCH_PREDICATE_ENABLE);
> @@ -1686,11 +1688,11 @@ anv_ccs_resolve(struct anv_cmd_buffer * const 
> cmd_buffer,
> get_blorp_surf_for_anv_image(image, aspect,
>  fast_clear_aux_usage(image, aspect),
>  &surf);
> +   surf.clear_color_addr = anv_to_blorp_address(
> +  anv_image_get_clear_color_addr(cmd_buffer->device, image, aspect, 
> level));
>  
> -   blorp_ccs_resolve_attachment(&batch, binding_table, &surf, level,
> -layer_count,
> -image->planes[plane].surface.isl.format,
> -op);
> +   blorp_ccs_resolve(&batch, &surf, level, start_layer, layer_count,
> + image->planes[plane].surface.isl.format, op);
>  
> blorp_batch_finish(&batch);
>  }
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index 6eed057..a1b1d48 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -2527,10 +2527,10 @@ anv_gen8_hiz_op_resolve(struct anv_cmd_buffer 
> *cmd_buffer,
>  enum blorp_hiz_op op);
>  void
>  anv_ccs_resolve(struct anv_cmd_buffer * const cmd_buffer,
> -const struct anv_state surface_state,
>  const struct anv_image * const image,
>  VkImageAspectFlagBits aspect,
> -const uint8_t level, const uint32_t layer_count,
> +const uint8_t level,
> +const uint32_t start_layer, const uint32_t layer_count,
>  const enum blorp_fast_clear_op op);
>  
>  void
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> b/src/intel/vulkan/genX_cmd_buffer.c
> index d7e4f23..2491b1d 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -179,29 +179,6 @@ add_surface_state_reloc(struct anv_cmd_buffer 
> *cmd_buffer,
>  }
>  
>  static void
> -add_image_relocs(struct anv_cmd_buffer *cmd_buffer,
> - const struct anv_image *image,
> - const uint32_t plane,
> - struct anv_surface_state state)
> -{
> -   const struct isl_device *isl_dev = &cmd_buffer->device->isl_dev;
> -
> -   add_surface_state_reloc(cmd_buffer, state.state,
> -   image->planes[plane].bo, state.address);
> -
> -   if (state.aux_address) {
> -  VkResult result =
> -

[Mesa-dev] [Bug 103586] OpenCL/Clover: AMD Turks: corrupt output buffer (depending on dimension order?)

2017-11-22 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=103586

--- Comment #16 from Jan Vesely  ---
(In reply to Dave Gilbert from comment #15)
> Hi Jan,
>   Yes, doing:
> --- a/ocl.cpp
> +++ b/ocl.cpp
> @@ -65,6 +65,7 @@ static int got_dev(cl::Platform ,
> std::vector , cl::Dev
>  events.push_back(event);
>  cl::Event eventMap;
>  queue.enqueueBarrierWithWaitList();
> +event.wait();
>  mapped = (cl_uint*)queue.enqueueMapBuffer(output, CL_TRUE /* blocking
> */, CL_MAP_READ,
> 0 /* offset */, 
> SIZE * SIZE * SIZE * sizeof(cl_uint) /* size */,
> 
> does seem to work.

Thanks, that means the kernel work event works correctly.
I'll need to double-check the specs wrt synchronization points. We either miss
a wait, or fail to update mapped buffers after the kernel finishes execution.

> 
> Vedran: I've only got a Turks to play with; feel free to try my test on
> something else.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

[Mesa-dev] [Bug 103852] Rendering errors when running dolphin-emu with Vulkan backend, radv (Super Smash Bros. Melee)

2017-11-22 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=103852

Bug ID: 103852
   Summary: Rendering errors when running dolphin-emu with Vulkan
backend, radv (Super Smash Bros. Melee)
   Product: Mesa
   Version: 17.2
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Drivers/Vulkan/radeon
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: benclap...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 135676
  --> https://bugs.freedesktop.org/attachment.cgi?id=135676&action=edit
text file containing output from glxinfo and vulkaninfo

The version of dolphin-emu used for testing was 5.0-5874 (this commit:
https://github.com/dolphin-emu/dolphin/commit/01794126ade973a125161ca0ea9904197bccedc3)

OS used is Debian 10 Buster (the current testing branch of Debian).
I've attached the output of glxinfo and vulkaninfo, from which you can see I'm
currently on Mesa 17.2.5.
The GPU used is an RX 580.

When playing Super Smash Bros. Melee (NTSC, version 1.02), a number of minor
rendering issues/errors can be observed when using the Vulkan backend:
* The game's title screen does not render correctly.
* The background does not render correctly for some stages (Fountain of Dreams,
Final Destination, etc...)
* The background for the trophy gallery does not render correctly
* The background for the small screen showing fighters clapping in the results
screen seems to render in the wrong color (if playing as P1 against a CPU, player
1 should render red, not light-blue)
* Turning on cropping (Options -> Graphics Settings -> Advanced -> Misc ->
Crop) results in a black screen.

None of the aforementioned bugs occur when using the OpenGL backend, or when
using the OpenGL or Vulkan backends using NVIDIA's closed-source drivers on a
different computer w/GTX 960, so I suspect these are bugs in radv.

Below is a video I recorded showing the rendering errors and steps to reproduce
the above issues:
https://youtu.be/mOhB-17b0rg

For comparison, here is the game running on the OpenGL backend, which does not
have these rendering issues:
https://youtu.be/owA8TOa6LcQ

As an aside, you may notice a disproportionate number of dropped frames
compared to the FPS indicator in dolphin-emu at certain points in the videos I
recorded.
I highly suspect this to be due to the following bug in GNOME3's Mutter
compositor, and not related to mesa/radv (it occurs even when not recording):
https://bugzilla.gnome.org/show_bug.cgi?id=745032

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] r600: set DX10_CLAMP for compute shader too

2017-11-22 Thread Dave Airlie

On 22 November 2017 at 13:27,   wrote:
> From: Roland Scheidegger 
>
> I really intended to set this for all shader stages by
> 3835009796166968750ff46cf209f6d4208cda86 but missed it for compute shaders
> (because it's in a different source file...).

Reviewed-by: Dave Airlie 

Re: [Mesa-dev] [PATCH 04/20] mesa: decrease the array size of ctx->Texture.FixedFuncUnit to 8

2017-11-22 Thread Marek Olšák

On Wed, Nov 22, 2017 at 10:28 PM, Ian Romanick  wrote:
> On 11/21/2017 10:01 AM, Marek Olšák wrote:
>> From: Marek Olšák 
>>
>> This makes piglit/texunits fail, because we can't do glTexEnv and
>> glGetTexEnv with 192 texture units anymore. Not that it ever made sense.
>
> It's just the test_texture_env subtest that fails, right?  Does this
> test pass on any closed-source drivers?  I'm wondering if the test is
> just wrong.  glTexEnv and glGetTexEnv are about fragment processing.
> The GL spec is pretty clear that the limit advertised by
> GL_MAX_TEXTURE_IMAGE_UNITS applies to fragment shader usage.  It seems
> logical that fixed-function fragment processing would have the same unit
> (though I haven't found anything to support this theory yet).
>
> Further, it's quite clear that GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS is
> the count of total users of all texture units.  Section 11.1.3.5
> (Texture Access) of the OpenGL 4.5 compatibility profile spec says:
>
> All active shaders, and fixed-function fragment processing if no
> fragment shader is active, combined cannot use more than the value
> of MAX_COMBINED_TEXTURE_IMAGE_UNITS texture image units. If more
> than one pipeline stage accesses the same texture image unit, each
> such access counts separately against the
> MAX_COMBINED_TEXTURE_IMAGE_UNITS limit.
>
> Does that subtest pass with this change if you
> s/MaxTextureCombinedUnits/MaxTextureImageUnits/ on lines 282 and 291?

It passes if I use MaxTextureCoordUnits.

Marek

Re: [Mesa-dev] [PATCH 08/14] anv: Move get_fast_clear_state_address into anv_private.h

2017-11-22 Thread Nanley Chery

On Mon, Nov 13, 2017 at 08:12:48AM -0800, Jason Ekstrand wrote:
> While we're at it, we break it into two nicely named functions.
> ---
>  src/intel/vulkan/anv_private.h | 27 ++
>  src/intel/vulkan/genX_cmd_buffer.c | 56 
> --
>  2 files changed, 33 insertions(+), 50 deletions(-)
> 

This patch is
Reviewed-by: Nanley Chery 
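
For readers following the quoted patch, the address arithmetic in the two new helpers can be sketched as below. This is a simplified illustration only: `example_layout` and the `example_*` functions are hypothetical stand-ins, not the anv structures, and the reading of the trailing 4 bytes of each entry as the needs-resolve flag is inferred from the offsets in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct example_layout {
   uint32_t bo_offset;               /* start of the image in its BO */
   uint32_t fast_clear_state_offset; /* start of the per-level entries */
   uint32_t clear_value_size;        /* bytes used by one clear color */
};

/* One entry holds a clear color followed by 4 more bytes (presumably
 * the needs-resolve flag). */
static uint32_t
example_entry_size(const struct example_layout *l)
{
   return l->clear_value_size + 4;
}

/* Offset of the clear color for a given miplevel. */
static uint32_t
example_clear_color_offset(const struct example_layout *l, unsigned level)
{
   return l->bo_offset + l->fast_clear_state_offset +
          example_entry_size(l) * level;
}

/* The flag sits directly after the clear color of the same level. */
static uint32_t
example_needs_resolve_offset(const struct example_layout *l, unsigned level)
{
   return example_clear_color_offset(l, level) + l->clear_value_size;
}
```

Splitting the lookup this way means the second helper is just the first plus a constant, which is what lets the patch drop the enum and the switch statement.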

> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index e17a52a..6eed057 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -2480,6 +2480,33 @@ anv_fast_clear_state_entry_size(const struct 
> anv_device *device)
> return device->isl_dev.ss.clear_value_size + 4;
>  }
>  
> +static inline struct anv_address
> +anv_image_get_clear_color_addr(const struct anv_device *device,
> +   const struct anv_image *image,
> +   VkImageAspectFlagBits aspect,
> +   unsigned level)
> +{
> +   uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
> +   return (struct anv_address) {
> +  .bo = image->planes[plane].bo,
> +  .offset = image->planes[plane].bo_offset +
> +image->planes[plane].fast_clear_state_offset +
> +anv_fast_clear_state_entry_size(device) * level,
> +   };
> +}
> +
> +static inline struct anv_address
> +anv_image_get_needs_resolve_addr(const struct anv_device *device,
> + const struct anv_image *image,
> + VkImageAspectFlagBits aspect,
> + unsigned level)
> +{
> +   struct anv_address addr =
> +  anv_image_get_clear_color_addr(device, image, aspect, level);
> +   addr.offset += device->isl_dev.ss.clear_value_size;
> +   return addr;
> +}
> +
>  /* Returns true if a HiZ-enabled depth buffer can be sampled from. */
>  static inline bool
>  anv_can_sample_with_hiz(const struct gen_device_info * const devinfo,
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> b/src/intel/vulkan/genX_cmd_buffer.c
> index 9eb6074..d7e4f23 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -426,45 +426,6 @@ transition_depth_buffer(struct anv_cmd_buffer 
> *cmd_buffer,
>anv_gen8_hiz_op_resolve(cmd_buffer, image, hiz_op);
>  }
>  
> -enum fast_clear_state_field {
> -   FAST_CLEAR_STATE_FIELD_CLEAR_COLOR,
> -   FAST_CLEAR_STATE_FIELD_NEEDS_RESOLVE,
> -};
> -
> -static inline struct anv_address
> -get_fast_clear_state_address(const struct anv_device *device,
> - const struct anv_image *image,
> - VkImageAspectFlagBits aspect,
> - unsigned level,
> - enum fast_clear_state_field field)
> -{
> -   assert(device && image);
> -   assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
> -   assert(level < anv_image_aux_levels(image, aspect));
> -
> -   uint32_t plane = anv_image_aspect_to_plane(image->aspects, aspect);
> -
> -   /* Refer to the definition of anv_image for the memory layout. */
> -   uint32_t offset = image->planes[plane].fast_clear_state_offset;
> -
> -   offset += anv_fast_clear_state_entry_size(device) * level;
> -
> -   switch (field) {
> -   case FAST_CLEAR_STATE_FIELD_NEEDS_RESOLVE:
> -  offset += device->isl_dev.ss.clear_value_size;
> -  /* Fall-through */
> -   case FAST_CLEAR_STATE_FIELD_CLEAR_COLOR:
> -  break;
> -   }
> -
> -   assert(offset < image->planes[plane].surface.offset + 
> image->planes[plane].size);
> -
> -   return (struct anv_address) {
> -  .bo = image->planes[plane].bo,
> -  .offset = image->planes[plane].bo_offset + offset,
> -   };
> -}
> -
>  #define MI_PREDICATE_SRC0  0x2400
>  #define MI_PREDICATE_SRC1  0x2408
>  
> @@ -481,16 +442,13 @@ genX(set_image_needs_resolve)(struct anv_cmd_buffer 
> *cmd_buffer,
> assert(image->aspects & VK_IMAGE_ASPECT_ANY_COLOR_BIT_ANV);
> assert(level < anv_image_aux_levels(image, aspect));
>  
> -   const struct anv_address resolve_flag_addr =
> -  get_fast_clear_state_address(cmd_buffer->device, image, aspect, level,
> -   FAST_CLEAR_STATE_FIELD_NEEDS_RESOLVE);
> -
> /* The HW docs say that there is no way to guarantee the completion of
>  * the following command. We use it nevertheless because it shows no
>  * issues in testing is currently being used in the GL driver.
>  */
> anv_batch_emit(&cmd_buffer->batch, GENX(MI_STORE_DATA_IMM), sdi) {
> -  sdi.Address = resolve_flag_addr;
> +  sdi.Address = anv_image_get_needs_resolve_addr(cmd_buffer->device,
> + image, aspect, level);
>sdi.ImmediateData = needs_resolve;
> }
>  }
> @@ -506,8 +464,8 @@ genX(load_needs_resolve_predicate)(struct anv_cmd_buffer 
> *cmd_buffer,
> assert(level <

Re: [Mesa-dev] [PATCH 00/28] vulkan/wsi: Rework WSI to look a lot more like a layer

2017-11-22 Thread Grazvydas Ignotas

On Wed, Nov 22, 2017 at 7:54 AM, Jason Ekstrand  wrote:
> On Tue, Nov 21, 2017 at 1:21 PM, Grazvydas Ignotas 
> wrote:
>>
>> On Mon, Nov 20, 2017 at 6:08 PM, Jason Ekstrand 
>> wrote:
>> > On Sun, Nov 19, 2017 at 5:07 AM, Grazvydas Ignotas 
>> > wrote:
>> >>
>> >> On Sun, Nov 19, 2017 at 1:51 AM, Jason Ekstrand 
>> >> wrote:
>> >> >
>> >> > I force-pushed the branch again with an added commit: "radv: Move wsi
>> >> > initialization later in physical_device_init" that fixes the memory
>> >> > type
>> >> > issue with radv.  I've tested both radv + radeon and anv + radeon on
>> >> > my
>> >> > HSW
>> >> > + Rx550 and they both work now.  I'm having a bit of trouble actually
>> >> > getting my system to start up on the Intel card so I'll have to leave
>> >> > testing radv on Intel for another day.
>> >>
>> >> Radv is working now on both displays, however "display on amd + anv"
>> >> case still acts the same (black window on most, but not all
>> >> SaschaWillems demos). I'm using xf86-video-amdgpu 1.4.0, 4.14 kernel
>> >> and xorg-server 1.18.4, if that makes a difference.
>> >
>> >
>> > I'm completely unable to reproduce.  Here's my setup:
>> >
>> >  - Fedora 27
>> >  - X.org 1.19.5
>> >  - xf86-video-amdgpu 1.3.0
>> >  - Linux 4.13.12
>> >  - Intel Haswell
>> >  - AMD RX550
>> >
>> > I've tried with amdgpu, modesetting, and XWayland all running on the AMD
>> > card and anv works on all three.  I'm a little weirded out by the fact
>> > that
>> > my X server is newer but my xf86-video-amdgpu is older.
>>
>> Well I compiled my own xf86-video-amdgpu. Not sure why.
>>
>> > Two things I'd like you to try if you can:
>> >
>> >  1) Use modesetting.  It may be a bug in your version of amdgpu.
>>
>> Same results (black window), plus all the tearing all over I usually
>> get with it. Also tried the distro kernel (4.10).
>>
>> >  2) Try the attached patch with radv + display on AMD.  It will make
>> > radv
>> > use the prime path regardless of the fact that it's displaying on the
>> > same
>> > GPU.
>>
>> Still works fine, albeit a bit slower (as expected I guess).
>> Maybe something specific to SKL?
>
>
> I think it may be, weirdly enough.  More specifically, I think you're
> probably hitting a bug I found today in the Sascha demos that probably only
> actually shows up on SKL.  Give this PR a try:
>
> https://github.com/SaschaWillems/Vulkan/pull/400

That helps, thanks. But textoverlay and subpasses still show black
screen on amd display and multisampling is dark even on intel. All are
fine on radv on both displays.

Here are some weird ones:
- texture3d works on anv on amd display, but triggers
"vulkan/anv_batch_chain.c:1139: adjust_relocations_from_state_pool:
Assertion `last_pool_center_bo_offset <=
pool->block_pool.center_bo_offset' failed." on intel display.
- multisampling on radv isn't clearing the background on intel
display, but is fine on amd display.

In other tests with this series, Talos Principle and Serious Sam
Fusion work on anv+amd display, while DOOM is failing with
VK_ERROR_FEATURE_NOT_PRESENT on any display (and has done so for many
months now, probably since 5dd96b1156, but I haven't looked into
what the feature is).

Gražvydas

Re: [Mesa-dev] [PATCH 00/20] Mesa: Reducing sizes of gl_context etc.

2017-11-22 Thread Marek Olšák

On Wed, Nov 22, 2017 at 10:18 PM, Ian Romanick  wrote:
> On 11/22/2017 11:02 AM, Nicolai Hähnle wrote:
>> On 21.11.2017 19:01, Marek Olšák wrote:
>>> Hi,
>>>
>>> This series reduces sizes of many driver structures. For example:
>>>
>>> gl_context: 152488 -> 72944 bytes
>>> vbo_context: 22696 -> 20008 bytes
>>> st_context: 10120 ->  3704 bytes
>>>
>>> The idea is to decrease CPU cache usage on smaller CPUs. I have not
>>> noticed a performance difference with microbenchmarks. It might be
>>> a different story with complex apps.
>>>
>>> There are some good cleanups as well as some controversial changes,
>>> so some piglit regressions are to be expected, but we can change
>>> piglit not to test those silly cases if we all agree that this is
>>> the right thing to do. Feel free to discuss them on the commit
>>> threads.
>>>
>>> Branch for testing:
>>>  git://people.freedesktop.org/~mareko/mesa context-reduce-size
>>>
>>> Expected piglit failures where piglit might need changes:
>>> - spec/!opengl 1.3/texunits
>>> - spec/arb_viewport_array/viewport-indices
>>
>> Can you explain where these failures come from?
>
> Patch 4 explains the first, and, while I haven't gotten there yet, I'll
> guess that patch 14 explains the second.

Yes, the patches explain the failures.

Marek

Re: [Mesa-dev] [PATCH 11/20] mesa: reduce the size of gl_vertex_array_object

2017-11-22 Thread Ian Romanick

On 11/21/2017 10:01 AM, Marek Olšák wrote:
> From: Marek Olšák 
> 
> RelativeOffset should actually be uint, not intptr,
> according to ARB_vertex_attrib_binding.
> 
> gl_vertex_array_object: 3632 -> 3112 bytes
> ---
>  src/mesa/main/mtypes.h | 17 -
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 6ddef05..773fa57 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -1492,32 +1492,33 @@ struct gl_vertex_array
>   *
>   * Note that the Stride field corresponds to VERTEX_ATTRIB_ARRAY_STRIDE
>   * and is only present for backwards compatibility reasons.
>   * Rendering always uses VERTEX_BINDING_STRIDE.
>   * The gl*Pointer() functions will set VERTEX_ATTRIB_ARRAY_STRIDE
>   * and VERTEX_BINDING_STRIDE to the same value, while
>   * glBindVertexBuffer() will only set VERTEX_BINDING_STRIDE.
>   */
>  struct gl_array_attributes
>  {
> -   GLint Size;  /**< Components per element (1,2,3,4) */
> +   GLuint RelativeOffset; /**< Offset of the first element relative to the 
> binding offset */
^
More spaces before the comment.

> GLenum16 Type;   /**< Datatype: GL_FLOAT, GL_INT, etc */
> GLenum16 Format; /**< Default: GL_RGBA, but may be GL_BGRA */
> -   GLsizei Stride;  /**< Stride as specified with gl*Pointer() */
> -   const GLubyte *Ptr;  /**< Points to client array data. Not used when 
> a VBO is bound */
> -   GLintptr RelativeOffset; /**< Offset of the first element relative to the 
> binding offset */
> +   GLshort Stride;  /**< Stride as specified with gl*Pointer() */
> +   GLubyte Size;/**< Components per element (1,2,3,4) */
> GLboolean Enabled;   /**< Whether the array is enabled */
> GLboolean Normalized;/**< Fixed-point values are normalized when 
> converted to floats */
> GLboolean Integer;   /**< Fixed-point values are not converted to 
> floats */
> GLboolean Doubles;   /**< double precision values are not converted 
> to floats */
> GLuint _ElementSize; /**< Size of each element in bytes */
> GLuint BufferBindingIndex;/**< Vertex buffer binding */

Do these two fields need to be 32-bits?
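
To illustrate why field width and ordering matter for the sizes quoted in this series, here is a minimal sketch. Both structs are hypothetical (they are not Mesa's), narrowing the last two fields to 8 bits is purely illustrative, and exact sizes depend on the ABI, so only the relative comparison is meaningful.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with wide fields and a pointer in the middle,
 * which forces padding around it on most 64-bit ABIs. */
struct attribs_wide {
   int32_t size;
   uint16_t type;
   uint16_t format;
   int32_t stride;
   const uint8_t *ptr;
   intptr_t relative_offset;
   uint8_t enabled;
   uint32_t element_size;
   uint32_t binding_index;
};

/* Same information with narrower fields grouped together and the
 * pointer moved to the end. */
struct attribs_narrow {
   uint32_t relative_offset;
   uint16_t type;
   uint16_t format;
   int16_t stride;
   uint8_t size;
   uint8_t enabled;
   uint8_t element_size;
   uint8_t binding_index;
   const uint8_t *ptr;
};

/* Bytes saved per element by the narrower layout. */
static size_t
bytes_saved(void)
{
   return sizeof(struct attribs_wide) - sizeof(struct attribs_narrow);
}
```

Since the VAO holds VERT_ATTRIB_MAX of these, any per-element saving is multiplied accordingly.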

> +
> +   const GLubyte *Ptr;  /**< Points to client array data. Not used when 
> a VBO is bound */
>  };
>  
>  
>  /**
>   * This describes the buffer object used for a vertex array (or
>   * multiple vertex arrays).  If BufferObj points to the default/null
>   * buffer object, then the vertex array lives in user memory and not a VBO.
>   */
>  struct gl_vertex_buffer_binding
>  {
> @@ -1536,38 +1537,36 @@ struct gl_vertex_buffer_binding
>  struct gl_vertex_array_object
>  {
> /** Name of the VAO as received from glGenVertexArray. */
> GLuint Name;
>  
> GLint RefCount;
>  
> GLchar *Label;   /**< GL_KHR_debug */
>  
> /**
> -* Has this array object been bound?
> -*/
> -   GLboolean EverBound;
> -
> -   /**
>  * Derived vertex attribute arrays
>  *
>  * This is a legacy data structure created from gl_vertex_attrib_array and
>  * gl_vertex_buffer_binding, for compatibility with existing driver code.
>  */
> struct gl_vertex_array _VertexAttrib[VERT_ATTRIB_MAX];
>  
> /** Vertex attribute arrays */
> struct gl_array_attributes VertexAttrib[VERT_ATTRIB_MAX];
>  
> /** Vertex buffer bindings */
> struct gl_vertex_buffer_binding BufferBinding[VERT_ATTRIB_MAX];
>  
> +   /** Has this array object been bound? */
> +   GLboolean EverBound;
> +
> /** Mask indicating which vertex arrays have vertex buffer associated. */
> GLbitfield VertexAttribBufferMask;
>  
> /** Mask of VERT_BIT_* values indicating which arrays are enabled */
> GLbitfield _Enabled;
>  
> /** Mask of VERT_BIT_* values indicating changed/dirty arrays */
> GLbitfield NewArrays;
>  
> /** The index buffer (also known as the element array buffer in OpenGL). 
> */
> 


Re: [Mesa-dev] [PATCH 03/20] mesa: separate legacy stuff from gl_texture_unit into gl_fixedfunc_texture_unit

2017-11-22 Thread Marek Olšák

On Wed, Nov 22, 2017 at 10:15 PM, Ian Romanick  wrote:
> On 11/21/2017 10:01 AM, Marek Olšák wrote:
>> From: Marek Olšák 
>>
>> ---
>>  src/mesa/drivers/common/meta.c  | 18 +++---
>>  src/mesa/drivers/dri/i915/i830_texblend.c   |  3 +-
>>  src/mesa/drivers/dri/nouveau/nouveau_util.h |  2 +-
>>  src/mesa/drivers/dri/nouveau/nv04_context.c | 14 +++--
>>  src/mesa/drivers/dri/nouveau/nv04_state_frag.c  |  6 +-
>>  src/mesa/drivers/dri/nouveau/nv10_state_frag.c  |  4 +-
>>  src/mesa/drivers/dri/nouveau/nv10_state_tex.c   |  5 +-
>>  src/mesa/drivers/dri/nouveau/nv20_state_tex.c   |  3 +-
>>  src/mesa/drivers/dri/r200/r200_tex.c|  3 +-
>>  src/mesa/drivers/dri/r200/r200_texstate.c   | 31 +
>>  src/mesa/drivers/dri/radeon/radeon_maos_verts.c |  2 +-
>>  src/mesa/drivers/dri/radeon/radeon_state.c  |  2 +-
>>  src/mesa/drivers/dri/radeon/radeon_tex.c|  3 +-
>>  src/mesa/drivers/dri/radeon/radeon_texstate.c   | 13 ++--
>>  src/mesa/main/attrib.c  | 17 ++---
>>  src/mesa/main/context.c |  6 +-
>>  src/mesa/main/enable.c  | 23 ---
>>  src/mesa/main/ff_fragment_shader.cpp|  3 +-
>>  src/mesa/main/ffvertex_prog.c   |  5 +-
>>  src/mesa/main/get.c |  7 ++-
>>  src/mesa/main/get_hash_params.py| 10 +--
>>  src/mesa/main/mtypes.h  | 44 +++--
>>  src/mesa/main/rastpos.c |  5 +-
>>  src/mesa/main/texenv.c  | 40 +++-
>>  src/mesa/main/texgen.c  | 18 +++---
>>  src/mesa/main/texstate.c| 83 
>> ++---
>>  src/mesa/main/texstate.h| 15 +
>>  src/mesa/program/prog_statevars.c   | 20 +++---
>>  src/mesa/swrast/s_context.c |  2 +-
>>  src/mesa/swrast/s_texcombine.c  |  3 +-
>>  src/mesa/swrast/s_triangle.c| 10 +--
>>  src/mesa/tnl/t_vb_texgen.c  | 10 +--
>>  32 files changed, 253 insertions(+), 177 deletions(-)
>>
>
> [snip]
>
>> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
>> index 83838d8..cc18ac1 100644
>> --- a/src/mesa/main/get.c
>> +++ b/src/mesa/main/get.c
>> @@ -1357,21 +1357,20 @@ static const struct value_desc error_value =
>>   *
>>   * \return the struct value_desc corresponding to the enum or a struct
>>   * value_desc of TYPE_INVALID if not found.  This lets the calling
>>   * glGet*v() function jump right into a switch statement and
>>   * handle errors there instead of having to check for NULL.
>>   */
>>  static const struct value_desc *
>>  find_value(const char *func, GLenum pname, void **p, union value *v)
>>  {
>> GET_CURRENT_CONTEXT(ctx);
>> -   struct gl_texture_unit *unit;
>> int mask, hash;
>> const struct value_desc *d;
>> int api;
>>
>> api = ctx->API;
>> /* We index into the table_set[] list of per-API hash tables using the 
>> API's
>>  * value in the gl_api enum. Since GLES 3 doesn't have an API_OPENGL* 
>> enum
>>  * value since it's compatible with GLES2 its entry in table_set[] is at 
>> the
>>  * end.
>>  */
>> @@ -1412,22 +1411,24 @@ find_value(const char *func, GLenum pname, void **p, 
>> union value *v)
>> case LOC_BUFFER:
>>*p = ((char *) ctx->DrawBuffer + d->offset);
>>return d;
>> case LOC_CONTEXT:
>>*p = ((char *) ctx + d->offset);
>>return d;
>> case LOC_ARRAY:
>>*p = ((char *) ctx->Array.VAO + d->offset);
>>return d;
>> case LOC_TEXUNIT:
>> -  unit = &ctx->Texture.Unit[ctx->Texture.CurrentUnit];
>> -  *p = ((char *) unit + d->offset);
>> +  if (ctx->Texture.CurrentUnit < 
>> ARRAY_SIZE(ctx->Texture.FixedFuncUnit)) {
>> + unsigned index = ctx->Texture.CurrentUnit;
>> + *p = ((char *)&ctx->Texture.FixedFuncUnit[index] + d->offset);
>> +  }
>
> Presumably this is an error?  Looking at the surrounding code, it's not
> obvious that the error is signaled.  If we hit this path, nothing ever
> initializes *p, and _mesa_GetIntegerv, for example, would dereference
> random junk.
>
> If the error was previously signaled, then execution shouldn't even get
> here in the ctx->Texture.CurrentUnit >=
> ARRAY_SIZE(ctx->Texture.FixedFuncUnit) case, right?

This is only used by GL_TEXTURE_GEN_* queries. It seems to be a
discrepancy in the GL spec. It's allowed to set and get the gen enables
for 192 combined texture units, but only 8 are usable in practice.
Reporting an error would be out-of-spec, returning garbage (like here)
is also out-of-spec, but it's an unusable state and therefore wasted memory.

If ARRAY_SIZE(ctx->Texture.FixedFuncUnit) ==
MAX_COMBINED_TEXTURE_IMAGE_UNITS, the condition is always true.
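
One way to avoid leaving `*p` uninitialized in the out-of-range case is sketched below. This is not what the patch does; it only illustrates the concern raised above, and every type and name in it (`example_ctx`, `example_lookup`, `EXAMPLE_FF_UNITS`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_FF_UNITS 8   /* hypothetical fixed-function unit count */

struct example_ff_unit {
   unsigned texgen_enabled;
};

struct example_ctx {
   unsigned current_unit;
   struct example_ff_unit ff_unit[EXAMPLE_FF_UNITS];
};

/* Zeroed unit returned for out-of-range queries so callers never
 * dereference an uninitialized pointer. */
static const struct example_ff_unit example_dummy_unit = { 0 };

/* Returns a pointer to the field at `offset` within the current unit,
 * or within the zeroed dummy unit if the current unit is out of range. */
static const void *
example_lookup(const struct example_ctx *ctx, size_t offset)
{
   const struct example_ff_unit *unit = &example_dummy_unit;

   if (ctx->current_unit < EXAMPLE_FF_UNITS)
      unit = &ctx->ff_unit[ctx->current_unit];

   return (const char *)unit + offset;
}
```

With this shape a query on an unusable unit reads back zeros rather than garbage, without raising an error.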

If

Re: [Mesa-dev] [PATCH 10/20] mesa: reduce the size of gl_image_unit

2017-11-22 Thread Ian Romanick

On 11/21/2017 10:01 AM, Marek Olšák wrote:
> From: Marek Olšák 
> 
> gl_context::ImageUnits: 6144 -> 4608 bytes
> gl_context: 74608 -> 73072 bytes
> ---
>  src/mesa/main/mtypes.h | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index f16ff4e..6ddef05 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -4605,59 +4605,58 @@ struct gl_buffer_binding
>  struct gl_image_unit
>  {
> /**
>  * Texture object bound to this unit.
>  */
> struct gl_texture_object *TexObj;
>  
> /**
>  * Level of the texture object bound to this unit.
>  */
> -   GLuint Level;
> +   GLubyte Level;
>  
> /**
>  * \c GL_TRUE if the whole level is bound as an array of layers, \c
>  * GL_FALSE if only some specific layer of the texture is bound.
>  * \sa Layer
>  */
> GLboolean Layered;
>  
> /**
>  * Layer of the texture object bound to this unit as specified by the
>  * application.
>  */
> -   GLuint Layer;
> +   GLushort Layer;
>  
> /**
> -* Layer of the texture object bound to this unit, or zero if the
> -* whole level is bound.
> +* Layer of the texture object bound to this unit, or zero if
> +* Layered == false.
>  */
> -   GLuint _Layer;
> +   GLushort _Layer;
>  
> /**
>  * Access allowed to this texture image.  Either \c GL_READ_ONLY,
>  * \c GL_WRITE_ONLY or \c GL_READ_WRITE.
>  */
> GLenum16 Access;
>  
> /**
>  * GL internal format that determines the interpretation of the
>  * image memory when shader image operations are performed through
>  * this unit.
>  */
> GLenum16 Format;
>  
> /**
>  * Mesa format corresponding to \c Format.
>  */
> -   mesa_format _ActualFormat;
> -
> +   mesa_format _ActualFormat:16;

We should either add checks using Brian's new macro or just make
mesa_format packed on GCC / clang builds.
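
A minimal sketch of the kind of check meant here, assuming a C11 compiler. The enum `example_format` and its `EXAMPLE_FORMAT_COUNT` sentinel are hypothetical stand-ins for mesa_format, and `_Static_assert` is used in place of whichever macro Mesa provides. The limit is 15 bits rather than 16 so the check still holds on compilers that treat an enum bitfield as signed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a format enum; not Mesa's mesa_format. */
enum example_format {
   EXAMPLE_FORMAT_NONE,
   EXAMPLE_FORMAT_RGBA8,
   EXAMPLE_FORMAT_R32F,
   EXAMPLE_FORMAT_COUNT   /* one past the last valid value */
};

/* Fail the build if a valid enumerator might not fit in the 16-bit
 * bitfield below, even when that bitfield is signed. */
_Static_assert(EXAMPLE_FORMAT_COUNT <= (1 << 15),
               "example_format does not fit in a 16-bit bitfield");

struct example_image_unit {
   uint16_t access;
   uint16_t format;
   enum example_format actual_format:16;
};

/* Store a format in the bitfield and read it back. */
static enum example_format
roundtrip_format(enum example_format f)
{
   struct example_image_unit u = { 0 };
   u.actual_format = f;
   return u.actual_format;
}
```

The compile-time check costs nothing at run time and catches the day someone adds enough formats to overflow the field.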

>  };
>  
>  /**
>   * Shader subroutines storage
>   */
>  struct gl_subroutine_index_binding
>  {
> GLuint NumIndex;
> GLuint *IndexPtr;
>  };
> 


Re: [Mesa-dev] [PATCH 07/20] mesa: don't assign numbers to vertex attrib enums manually

2017-11-22 Thread Ian Romanick

Patches 7, 8, and 9 are

Reviewed-by: Ian Romanick 

On 11/21/2017 10:01 AM, Marek Olšák wrote:
> From: Marek Olšák 
> 
> I plan to remove one of them.
> ---
>  src/compiler/shader_enums.h |  68 ++--
>  src/mesa/tnl/t_context.h| 106 
> ++--
>  src/mesa/vbo/vbo_attrib.h   |  92 +++---
>  3 files changed, 133 insertions(+), 133 deletions(-)
> 
> diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h
> index 9d229d4..17b236e 100644
> --- a/src/compiler/shader_enums.h
> +++ b/src/compiler/shader_enums.h
> @@ -67,54 +67,54 @@ const char *_mesa_shader_stage_to_abbrev(unsigned stage);
>  
>  /**
>   * Indexes for vertex program attributes.
>   * GL_NV_vertex_program aliases generic attributes over the conventional
>   * attributes.  In GL_ARB_vertex_program shader the aliasing is optional.
>   * In GL_ARB_vertex_shader / OpenGL 2.0 the aliasing is disallowed (the
>   * generic attributes are distinct/separate).
>   */
>  typedef enum
>  {
> -   VERT_ATTRIB_POS = 0,
> -   VERT_ATTRIB_WEIGHT = 1,
> -   VERT_ATTRIB_NORMAL = 2,
> -   VERT_ATTRIB_COLOR0 = 3,
> -   VERT_ATTRIB_COLOR1 = 4,
> -   VERT_ATTRIB_FOG = 5,
> -   VERT_ATTRIB_COLOR_INDEX = 6,
> -   VERT_ATTRIB_EDGEFLAG = 7,
> -   VERT_ATTRIB_TEX0 = 8,
> -   VERT_ATTRIB_TEX1 = 9,
> -   VERT_ATTRIB_TEX2 = 10,
> -   VERT_ATTRIB_TEX3 = 11,
> -   VERT_ATTRIB_TEX4 = 12,
> -   VERT_ATTRIB_TEX5 = 13,
> -   VERT_ATTRIB_TEX6 = 14,
> -   VERT_ATTRIB_TEX7 = 15,
> -   VERT_ATTRIB_POINT_SIZE = 16,
> -   VERT_ATTRIB_GENERIC0 = 17,
> -   VERT_ATTRIB_GENERIC1 = 18,
> -   VERT_ATTRIB_GENERIC2 = 19,
> -   VERT_ATTRIB_GENERIC3 = 20,
> -   VERT_ATTRIB_GENERIC4 = 21,
> -   VERT_ATTRIB_GENERIC5 = 22,
> -   VERT_ATTRIB_GENERIC6 = 23,
> -   VERT_ATTRIB_GENERIC7 = 24,
> -   VERT_ATTRIB_GENERIC8 = 25,
> -   VERT_ATTRIB_GENERIC9 = 26,
> -   VERT_ATTRIB_GENERIC10 = 27,
> -   VERT_ATTRIB_GENERIC11 = 28,
> -   VERT_ATTRIB_GENERIC12 = 29,
> -   VERT_ATTRIB_GENERIC13 = 30,
> -   VERT_ATTRIB_GENERIC14 = 31,
> -   VERT_ATTRIB_GENERIC15 = 32,
> -   VERT_ATTRIB_MAX = 33
> +   VERT_ATTRIB_POS,
> +   VERT_ATTRIB_WEIGHT,
> +   VERT_ATTRIB_NORMAL,
> +   VERT_ATTRIB_COLOR0,
> +   VERT_ATTRIB_COLOR1,
> +   VERT_ATTRIB_FOG,
> +   VERT_ATTRIB_COLOR_INDEX,
> +   VERT_ATTRIB_EDGEFLAG,
> +   VERT_ATTRIB_TEX0,
> +   VERT_ATTRIB_TEX1,
> +   VERT_ATTRIB_TEX2,
> +   VERT_ATTRIB_TEX3,
> +   VERT_ATTRIB_TEX4,
> +   VERT_ATTRIB_TEX5,
> +   VERT_ATTRIB_TEX6,
> +   VERT_ATTRIB_TEX7,
> +   VERT_ATTRIB_POINT_SIZE,
> +   VERT_ATTRIB_GENERIC0,
> +   VERT_ATTRIB_GENERIC1,
> +   VERT_ATTRIB_GENERIC2,
> +   VERT_ATTRIB_GENERIC3,
> +   VERT_ATTRIB_GENERIC4,
> +   VERT_ATTRIB_GENERIC5,
> +   VERT_ATTRIB_GENERIC6,
> +   VERT_ATTRIB_GENERIC7,
> +   VERT_ATTRIB_GENERIC8,
> +   VERT_ATTRIB_GENERIC9,
> +   VERT_ATTRIB_GENERIC10,
> +   VERT_ATTRIB_GENERIC11,
> +   VERT_ATTRIB_GENERIC12,
> +   VERT_ATTRIB_GENERIC13,
> +   VERT_ATTRIB_GENERIC14,
> +   VERT_ATTRIB_GENERIC15,
> +   VERT_ATTRIB_MAX
>  } gl_vert_attrib;
>  
>  const char *gl_vert_attrib_name(gl_vert_attrib attrib);
>  
>  /**
>   * Symbolic constats to help iterating over
>   * specific blocks of vertex attributes.
>   *
>   * VERT_ATTRIB_FF
>   *   includes all fixed function attributes as well as
> diff --git a/src/mesa/tnl/t_context.h b/src/mesa/tnl/t_context.h
> index e7adb5f..67a87f2 100644
> --- a/src/mesa/tnl/t_context.h
> +++ b/src/mesa/tnl/t_context.h
> @@ -69,84 +69,84 @@
>   * number of bits allocated for these numbers in places like vertex
>   * program instruction formats and register layouts.
>   */
>  /* The bit space exhaustion is a fact now, done by _TNL_ATTRIB_ATTRIBUTE* for
>   * GLSL vertex shader which cannot be aliased with conventional vertex 
> attribs.
>   * Compacting _TNL_ATTRIB_MAT_* attribs would not work, they would not give
>   * as many free bits (11 plus already 1 free bit) as _TNL_ATTRIB_ATTRIBUTE*
>   * attribs want (16).
>   */
>  enum {
> - _TNL_ATTRIB_POS = 0,
> - _TNL_ATTRIB_WEIGHT = 1,
> - _TNL_ATTRIB_NORMAL = 2,
> - _TNL_ATTRIB_COLOR0 = 3,
> - _TNL_ATTRIB_COLOR1 = 4,
> - _TNL_ATTRIB_FOG = 5,
> - _TNL_ATTRIB_COLOR_INDEX = 6,
> - _TNL_ATTRIB_EDGEFLAG = 7,
> - _TNL_ATTRIB_TEX0 = 8,
> - _TNL_ATTRIB_TEX1 = 9,
> - _TNL_ATTRIB_TEX2 = 10,
> - _TNL_ATTRIB_TEX3 = 11,
> - _TNL_ATTRIB_TEX4 = 12,
> - _TNL_ATTRIB_TEX5 = 13,
> - _TNL_ATTRIB_TEX6 = 14,
> - _TNL_ATTRIB_TEX7 = 15,
> -
> - _TNL_ATTRIB_GENERIC0 = 17, /* doesn't really exist! */
> - _TNL_ATTRIB_GENERIC1 = 18,
> - _TNL_ATTRIB_GENERIC2 = 19,
> - _TNL_ATTRIB_GENERIC3 = 20,
> - _TNL_ATTRIB_GENERIC4 = 21,
> - _TNL_ATTRIB_GENERIC5 = 22,
> - _TNL_ATTRIB_GENERIC6 = 23,
> - _TNL_ATTRIB_GENERIC7 = 24,
> - _TNL_ATTRIB_GENERIC8 = 25,
> - _TNL_ATTRIB_GENERIC9 = 26,
> - _TNL_ATTRIB_GENERIC10 = 27,
> - _TNL_ATTRIB_GENERIC11 = 28,

Re: [Mesa-dev] [PATCH 06/20] vbo: decrease the size of vbo_context slightly

2017-11-22 Thread Ian Romanick

On 11/21/2017 10:01 AM, Marek Olšák wrote:
> From: Marek Olšák 
> 
> vbo_context: 21520 -> 20344 bytes
> ---
>  src/mesa/main/mtypes.h   | 8 
>  src/mesa/vbo/vbo_context.h   | 4 ++--
>  src/mesa/vbo/vbo_exec_draw.c | 2 +-
>  src/mesa/vbo/vbo_save_draw.c | 2 +-
>  4 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 67711d8..660b1a5 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -1452,31 +1452,31 @@ struct gl_pixelstore_attrib
>  };
>  
>  
>  /**
>   * Vertex array information which is derived from gl_array_attributes
>   * and gl_vertex_buffer_binding information.  Used by the VBO module and
>   * device drivers.
>   */
>  struct gl_vertex_array
>  {
> -   GLint Size;  /**< components per element (1,2,3,4) */
> GLenum16 Type;   /**< datatype: GL_FLOAT, GL_INT, etc */
> GLenum16 Format; /**< default: GL_RGBA, but may be GL_BGRA */
> -   GLsizei StrideB;  /**< actual stride in bytes */
> -   GLuint _ElementSize; /**< size of each element in bytes */
> -   const GLubyte *Ptr;  /**< Points to array data */
> +   GLshort StrideB;  /**< actual stride in bytes */

It looks like the largest value anyone currently advertises for
MaxVertexAttribStride is 2048.  We should probably have a check
somewhere that someone doesn't try to use 65537.
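
A minimal sketch of such a check, assuming the narrowed field is a signed 16-bit integer (as GLshort normally is), so the largest representable stride is 32767. The struct and function names are hypothetical, not Mesa's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical narrowed field, mirroring the idea of a 16-bit stride. */
struct example_array {
   int16_t stride_b;
};

/* Returns true and stores the stride if it fits in a signed 16-bit
 * field; returns false and leaves the struct untouched otherwise. */
static bool
example_set_stride(struct example_array *a, int stride)
{
   if (stride < 0 || stride > INT16_MAX)
      return false;
   a->stride_b = (int16_t)stride;
   return true;
}
```

A driver-side equivalent could instead be a one-time assertion that the advertised MaxVertexAttribStride does not exceed the field's range.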

> +   GLubyte Size;/**< components per element (1,2,3,4) */
> +   GLubyte _ElementSize;/**< size of each element in bytes */
> GLboolean Normalized;/**< GL_ARB_vertex_program */
> GLboolean Integer;   /**< Integer-valued? */
> GLboolean Doubles;   /**< double precision values are not converted 
> to floats */
> GLuint InstanceDivisor;  /**< GL_ARB_instanced_arrays */
>  
> +   const GLubyte *Ptr;  /**< Points to array data */
> struct gl_buffer_object *BufferObj;/**< GL_ARB_vertex_buffer_object */
>  };
>  
>  
>  /**
>   * Attributes to describe a vertex array.
>   *
>   * Contains the size, type, format and normalization flag,
>   * along with the index of a vertex buffer binding point.
>   *
> diff --git a/src/mesa/vbo/vbo_context.h b/src/mesa/vbo/vbo_context.h
> index 70757d0..04079b7 100644
> --- a/src/mesa/vbo/vbo_context.h
> +++ b/src/mesa/vbo/vbo_context.h
> @@ -60,22 +60,22 @@
>  #include "main/macros.h"
>  
>  #ifdef __cplusplus
>  extern "C" {
>  #endif
>  
>  struct vbo_context {
> struct gl_vertex_array currval[VBO_ATTRIB_MAX];
> 
> /** Map VERT_ATTRIB_x to VBO_ATTRIB_y */
> -   GLuint map_vp_none[VERT_ATTRIB_MAX];
> -   GLuint map_vp_arb[VERT_ATTRIB_MAX];
> +   GLubyte map_vp_none[VERT_ATTRIB_MAX];
> +   GLubyte map_vp_arb[VERT_ATTRIB_MAX];
>  
> struct vbo_exec_context exec;
> struct vbo_save_context save;
>  
> /* Callback into the driver.  This must always succeed, the driver
>  * is responsible for initiating any fallback actions required:
>  */
> vbo_draw_func draw_prims;
>  
> /* Optional callback for indirect draws. This allows multidraws to not be
> diff --git a/src/mesa/vbo/vbo_exec_draw.c b/src/mesa/vbo/vbo_exec_draw.c
> index df34f05..f34b591 100644
> --- a/src/mesa/vbo/vbo_exec_draw.c
> +++ b/src/mesa/vbo/vbo_exec_draw.c
> @@ -168,21 +168,21 @@ vbo_copy_vertices( struct vbo_exec_context *exec )
>  
>  
>  /* TODO: populate these as the vertex is defined:
>   */
>  static void
>  vbo_exec_bind_arrays( struct gl_context *ctx )
>  {
> struct vbo_context *vbo = vbo_context(ctx);
> struct vbo_exec_context *exec = &vbo->exec;
> struct gl_vertex_array *arrays = exec->vtx.arrays;
> -   const GLuint *map;
> +   const GLubyte *map;
> GLuint attr;
> GLbitfield64 varying_inputs = 0x0;
> bool swap_pos = false;
>  
> /* Install the default (ie Current) attributes first, then overlay
>  * all active ones.
>  */
> switch (get_program_mode(exec->ctx)) {
> case VP_NONE:
>for (attr = 0; attr < VERT_ATTRIB_FF_MAX; attr++) {
> diff --git a/src/mesa/vbo/vbo_save_draw.c b/src/mesa/vbo/vbo_save_draw.c
> index 3fad4c7..02920c9 100644
> --- a/src/mesa/vbo/vbo_save_draw.c
> +++ b/src/mesa/vbo/vbo_save_draw.c
> @@ -130,21 +130,21 @@ _playback_copy_to_current(struct gl_context *ctx,
>   * Treat the vertex storage as a VBO, define vertex arrays pointing
>   * into it:
>   */
>  static void vbo_bind_vertex_list(struct gl_context *ctx,
>   const struct vbo_save_vertex_list *node)
>  {
> struct vbo_context *vbo = vbo_context(ctx);
> struct vbo_save_context *save = &vbo->save;
> struct gl_vertex_array *arrays = save->arrays;
> GLuint buffer_offset = node->buffer_offset;
> -   const GLuint *map;
> +   const GLubyte *map;
> GLuint attr;
> GLubyte node_attrsz[VBO_ATTRIB_MAX];  /* copy of node->attrsz[] */
> GLenum16 node_attrtype[VBO_ATTRIB_MAX];  /* copy of

Re: [Mesa-dev] [PATCH 07/14] intel/blorp: Take a range of layers in blorp_ccs_resolve

2017-11-22 Thread Nanley Chery

On Mon, Nov 13, 2017 at 08:12:47AM -0800, Jason Ekstrand wrote:
> ---
>  src/intel/blorp/blorp.h   | 3 ++-
>  src/intel/blorp/blorp_clear.c | 7 +--
>  src/mesa/drivers/dri/i965/brw_blorp.c | 2 +-
>  3 files changed, 8 insertions(+), 4 deletions(-)

This patch is
Reviewed-by: Nanley Chery 

> 
> diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
> index c3077aa..690e65f 100644
> --- a/src/intel/blorp/blorp.h
> +++ b/src/intel/blorp/blorp.h
> @@ -202,7 +202,8 @@ enum blorp_fast_clear_op {
>  
>  void
>  blorp_ccs_resolve(struct blorp_batch *batch,
> -  struct blorp_surf *surf, uint32_t level, uint32_t layer,
> +  struct blorp_surf *surf, uint32_t level,
> +  uint32_t start_layer, uint32_t num_layers,
>enum isl_format format,
>enum blorp_fast_clear_op resolve_op);
>  
> diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
> index 8d758df..56cc3dd 100644
> --- a/src/intel/blorp/blorp_clear.c
> +++ b/src/intel/blorp/blorp_clear.c
> @@ -778,13 +778,16 @@ prepare_ccs_resolve(struct blorp_batch * const batch,
>  
>  void
>  blorp_ccs_resolve(struct blorp_batch *batch,
> -  struct blorp_surf *surf, uint32_t level, uint32_t layer,
> +  struct blorp_surf *surf, uint32_t level,
> +  uint32_t start_layer, uint32_t num_layers,
>enum isl_format format,
>enum blorp_fast_clear_op resolve_op)
>  {
> struct blorp_params params;
>  
> -   prepare_ccs_resolve(batch, &params, surf, level, layer, format, resolve_op);
> +   prepare_ccs_resolve(batch, &params, surf, level, start_layer,
> +   format, resolve_op);
> +   params.num_layers = num_layers;
>  
> batch->blorp->exec(batch, &params);
>  }
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index eae8aaa..0736583 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -1486,7 +1486,7 @@ brw_blorp_resolve_color(struct brw_context *brw, struct 
> intel_mipmap_tree *mt,
>  
> struct blorp_batch batch;
> blorp_batch_init(&brw->blorp, &batch, brw, 0);
> -   blorp_ccs_resolve(&batch, &surf, level, layer,
> +   blorp_ccs_resolve(&batch, &surf, level, layer, 1,
>   brw_blorp_to_isl_format(brw, format, true),
>   resolve_op);
> blorp_batch_finish(&batch);
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/9] ac: change legacy_surf_level::slice_size to dword units

2017-11-22 Thread Marek Olšák

On Wed, Nov 22, 2017 at 7:46 PM, Nicolai Hähnle  wrote:
> On 21.11.2017 18:30, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> The next commit will reduce the size even more.
>> ---
>>   src/amd/common/ac_surface.c|  2 +-
>>   src/amd/common/ac_surface.h|  2 +-
>>   src/amd/vulkan/radv_image.c|  8 
>>   src/gallium/drivers/r600/evergreen_state.c |  8 
>>   src/gallium/drivers/r600/r600_state.c  |  8 
>>   src/gallium/drivers/r600/r600_texture.c| 14 +++---
>>   src/gallium/drivers/r600/radeon_uvd.c  |  2 +-
>>   src/gallium/drivers/radeon/r600_texture.c  | 14 +++---
>>   src/gallium/drivers/radeon/radeon_uvd.c|  2 +-
>>   src/gallium/drivers/radeonsi/cik_sdma.c|  4 ++--
>>   src/gallium/drivers/radeonsi/si_dma.c  |  8 
>>   src/gallium/winsys/radeon/drm/radeon_drm_surface.c |  4 ++--
>>   12 files changed, 38 insertions(+), 38 deletions(-)
>>
>> diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c
>> index f7600a3..2b6c3fb 100644
>> --- a/src/amd/common/ac_surface.c
>> +++ b/src/amd/common/ac_surface.c
>> @@ -297,21 +297,21 @@ static int gfx6_compute_level(ADDR_HANDLE addrlib,
>> ret = AddrComputeSurfaceInfo(addrlib,
>>  AddrSurfInfoIn,
>>  AddrSurfInfoOut);
>> if (ret != ADDR_OK) {
>> return ret;
>> }
>> surf_level = is_stencil ? &surf->u.legacy.stencil_level[level] :
>> &surf->u.legacy.level[level];
>> surf_level->offset = align64(surf->surf_size,
>> AddrSurfInfoOut->baseAlign);
>> -   surf_level->slice_size = AddrSurfInfoOut->sliceSize;
>> +   surf_level->slice_size_dw = AddrSurfInfoOut->sliceSize / 4;
>> surf_level->nblk_x = AddrSurfInfoOut->pitch;
>> surf_level->nblk_y = AddrSurfInfoOut->height;
>> switch (AddrSurfInfoOut->tileMode) {
>> case ADDR_TM_LINEAR_ALIGNED:
>> surf_level->mode = RADEON_SURF_MODE_LINEAR_ALIGNED;
>> break;
>> case ADDR_TM_1D_TILED_THIN1:
>> surf_level->mode = RADEON_SURF_MODE_1D;
>> break;
>> diff --git a/src/amd/common/ac_surface.h b/src/amd/common/ac_surface.h
>> index 1dc95cd..ec89f6b 100644
>> --- a/src/amd/common/ac_surface.h
>> +++ b/src/amd/common/ac_surface.h
>> @@ -64,21 +64,21 @@ enum radeon_micro_mode {
>>   /* bits 19 and 20 are reserved for libdrm_radeon, don't use them */
>>   #define RADEON_SURF_FMASK   (1 << 21)
>>   #define RADEON_SURF_DISABLE_DCC (1 << 22)
>>   #define RADEON_SURF_TC_COMPATIBLE_HTILE (1 << 23)
>>   #define RADEON_SURF_IMPORTED (1 << 24)
>>   #define RADEON_SURF_OPTIMIZE_FOR_SPACE  (1 << 25)
>>   #define RADEON_SURF_SHAREABLE   (1 << 26)
>> struct legacy_surf_level {
>>   uint64_t offset;
>> -uint64_t slice_size;
>> +uint32_t slice_size_dw; /* in dwords; max = 4GB / 4. */
>>   uint32_t dcc_offset; /* relative offset within DCC mip tree */
>>   uint32_t dcc_fast_clear_size;
>>   uint16_t nblk_x;
>>   uint16_t nblk_y;
>>   enum radeon_surf_mode   mode;
>>   };
>> struct legacy_surf_layout {
>>   unsigned bankw:4;  /* max 8 */
>>   unsigned bankh:4;  /* max 8 */
>> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
>> index b532aa9..fb7bbde 100644
>> --- a/src/amd/vulkan/radv_image.c
>> +++ b/src/amd/vulkan/radv_image.c
>> @@ -1149,25 +1149,25 @@ void radv_GetImageSubresourceLayout(
>> if (device->physical_device->rad_info.chip_class >= GFX9) {
>> pLayout->offset = surface->u.gfx9.offset[level] +
>> surface->u.gfx9.surf_slice_size * layer;
>> pLayout->rowPitch = surface->u.gfx9.surf_pitch *
>> surface->bpe;
>> pLayout->arrayPitch = surface->u.gfx9.surf_slice_size;
>> pLayout->depthPitch = surface->u.gfx9.surf_slice_size;
>> pLayout->size = surface->u.gfx9.surf_slice_size;
>> if (image->type == VK_IMAGE_TYPE_3D)
>> pLayout->size *= u_minify(image->info.depth,
>> level);
>> } else {
>> -   pLayout->offset = surface->u.legacy.level[level].offset +
>> surface->u.legacy.level[level].slice_size * layer;
>> +   pLayout->offset = surface->u.legacy.level[level].offset +
>> surface->u.legacy.level[level].slice_size_dw * 4 * layer;
>
>
> I believe the maximum slice size in bytes is (with an RGBA32 texture)
>
> 16384 * 16384 * 16 = 2^14 * 2^14 * 2^4 = 2^32
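
The arithmetic above can be checked with a small C sketch. The helper names below are hypothetical and are not part of Mesa; the sketch only assumes an unpadded 16384 x 16384 surface with 16 bytes per texel, as in the quoted figure. It shows that the byte count is exactly 2^32 (one more than a uint32_t can hold), that the same size in dwords is 2^30 and does fit, and that converting the dword count back to bytes wraps to 0 if the multiplication is done in 32 bits. Whether that last point is what the reply goes on to raise cannot be recovered from the truncated text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes in one unpadded slice. */
static uint64_t slice_size_bytes(uint64_t width, uint64_t height,
                                 uint64_t bytes_per_texel)
{
   return width * height * bytes_per_texel;
}

/* Nonzero if the value can be stored in a uint32_t without truncation. */
static int fits_in_u32(uint64_t value)
{
   return value <= UINT32_MAX;
}

/* Dwords to bytes, with the result truncated to 32 bits. */
static uint32_t dwords_to_bytes_32(uint32_t dwords)
{
   return (uint32_t)(dwords * 4u);
}

/* Dwords to bytes, widened to 64 bits before multiplying. */
static uint64_t dwords_to_bytes_64(uint32_t dwords)
{
   return (uint64_t)dwords * 4u;
}
```

In this sketch, widening to 64 bits before multiplying by 4 (and by the layer index) is what keeps the worst-case size from wrapping.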
>
> The problem with this

Re: [Mesa-dev] [PATCH 04/20] mesa: decrease the array size of ctx->Texture.FixedFuncUnit to 8

2017-11-22 Thread Ian Romanick

On 11/21/2017 10:01 AM, Marek Olšák wrote:
> From: Marek Olšák 
> 
> This makes piglit/texunits fail, because we can't do glTexEnv and
> glGetTexEnv with 192 texture units anymore. Not that it ever made sense.

It's just the test_texture_env subtest that fails, right?  Does this
test pass on any closed-source drivers?  I'm wondering if the test is
just wrong.  glTexEnv and glGetTexEnv are about fragment processing.
The GL spec is pretty clear that the limit advertised by
GL_MAX_TEXTURE_IMAGE_UNITS applies to fragment shader usage.  It seems
logical that fixed-function fragment processing would have the same unit
(though I haven't found anything to support this theory yet).

Further, it's quite clear that GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS is
the count of total users of all texture units.  Section 11.1.3.5
(Texture Access) of the OpenGL 4.5 compatibility profile spec says:

All active shaders, and fixed-function fragment processing if no
fragment shader is active, combined cannot use more than the value
of MAX_COMBINED_TEXTURE_IMAGE_UNITS texture image units. If more
than one pipeline stage accesses the same texture image unit, each
such access counts separately against the
MAX_COMBINED_TEXTURE_IMAGE_UNITS limit.
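
The counting rule in the quoted spec text can be illustrated with a short C sketch. Everything here is hypothetical (the bitmask representation and the function names are not Mesa's): each pipeline stage is described by a bitmask of the texture image units it accesses, and a unit accessed by two stages is counted twice, as the spec paragraph says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Number of set bits in a 32-bit mask. */
static unsigned count_bits(uint32_t mask)
{
   unsigned n = 0;
   while (mask) {
      mask &= mask - 1; /* clear the lowest set bit */
      n++;
   }
   return n;
}

/* Sum of per-stage unit counts.  A unit used by several stages is
 * counted once per stage, matching the quoted spec wording. */
static unsigned combined_units_used(const uint32_t *stage_masks,
                                    size_t num_stages)
{
   unsigned total = 0;
   for (size_t i = 0; i < num_stages; i++)
      total += count_bits(stage_masks[i]);
   return total;
}

/* True if the combined usage stays within the advertised limit. */
static bool within_combined_limit(const uint32_t *stage_masks,
                                  size_t num_stages, unsigned max_combined)
{
   return combined_units_used(stage_masks, num_stages) <= max_combined;
}
```

For example, a vertex stage using units 0 and 1 plus a fragment stage using unit 0 counts as three uses, not two.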

Does that subtest pass with this change if you
s/MaxTextureCombinedUnits/MaxTextureImageUnits/ on lines 282 and 291?

> If people are OK with this, we can adjust the test to check only 8 texture
> units, so that it doesn't fail. See also code comments.
> 
> gl_context: 136896 -> 75072 bytes
> ---
>  src/mesa/main/enable.c   |  5 +
>  src/mesa/main/mtypes.h   |  2 +-
>  src/mesa/main/texenv.c   | 27 +++
>  src/mesa/main/texstate.c |  2 +-
>  src/mesa/main/texstate.h |  3 +++
>  5 files changed, 37 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/main/enable.c b/src/mesa/main/enable.c
> index 93ffb0d..54b3b65 100644
> --- a/src/mesa/main/enable.c
> +++ b/src/mesa/main/enable.c
> @@ -213,20 +213,22 @@ get_texcoord_unit(struct gl_context *ctx)
>  /**
>   * Helper function to enable or disable a texture target.
>   * \param bit  one of the TEXTURE_x_BIT values
>   * \return GL_TRUE if state is changing or GL_FALSE if no change
>   */
>  static GLboolean
>  enable_texture(struct gl_context *ctx, GLboolean state, GLbitfield texBit)
>  {
> struct gl_fixedfunc_texture_unit *texUnit =
>_mesa_get_current_fixedfunc_tex_unit(ctx);
> +   if (!texUnit)
> +  return false;
>  
> const GLbitfield newenabled = state
>? (texUnit->Enabled | texBit) : (texUnit->Enabled & ~texBit);
>  
> if (texUnit->Enabled == newenabled)
> return GL_FALSE;
>  
> FLUSH_VERTICES(ctx, _NEW_TEXTURE_STATE);
> texUnit->Enabled = newenabled;
> return GL_TRUE;
> @@ -1291,20 +1293,23 @@ _mesa_IsEnabledi( GLenum cap, GLuint index )
>  
>  /**
>   * Helper function to determine whether a texture target is enabled.
>   */
>  static GLboolean
>  is_texture_enabled(struct gl_context *ctx, GLbitfield bit)
>  {
> const struct gl_fixedfunc_texture_unit *const texUnit =
>_mesa_get_current_fixedfunc_tex_unit(ctx);
>  
> +   if (!texUnit)
> +  return false;
> +
> return (texUnit->Enabled & bit) ? GL_TRUE : GL_FALSE;
>  }
>  
>  
>  /**
>   * Return simple enable/disable state.
>   *
>   * \param cap  state variable to query.
>   *
>   * Returns the state of the specified capability from the current GL context.
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 42b9721..23abc68 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -1314,21 +1314,21 @@ struct gl_texture_attrib
> /** Bitwise-OR of all Texture.Unit[i]._GenFlags */
> GLbitfield _GenFlags;
>  
> /** Largest index of a texture unit with _Current != NULL. */
> GLint _MaxEnabledTexImageUnit;
>  
> /** Largest index + 1 of texture units that have had any CurrentTex set. 
> */
> GLint NumCurrentTexUsed;
>  
> struct gl_texture_unit Unit[MAX_COMBINED_TEXTURE_IMAGE_UNITS];
> -   struct gl_fixedfunc_texture_unit 
> FixedFuncUnit[MAX_COMBINED_TEXTURE_IMAGE_UNITS];
> +   struct gl_fixedfunc_texture_unit FixedFuncUnit[MAX_TEXTURE_COORD_UNITS];
>  };
>  
>  
>  /**
>   * Data structure representing a single clip plane (e.g. one of the elements
>   * of the ctx->Transform.EyeUserPlane or ctx->Transform._ClipUserPlane 
> array).
>   */
>  typedef GLfloat gl_clip_plane[4];
>  
>  
> diff --git a/src/mesa/main/texenv.c b/src/mesa/main/texenv.c
> index 9018ce9..22fc8da 100644
> --- a/src/mesa/main/texenv.c
> +++ b/src/mesa/main/texenv.c
> @@ -393,20 +393,29 @@ _mesa_TexEnvfv( GLenum target, GLenum pname, const 
> GLfloat *param )
>? ctx->Const.MaxTextureCoordUnits : 
> ctx->Const.MaxCombinedTextureImageUnits;
> if (ctx->Texture.CurrentUnit >= maxUnit) {
>_mesa_error(ctx, GL_INVALID_OPERATION, "glTexEnvfv(current unit)");
>return;
> }
>  
> if (target ==

Re: [Mesa-dev] [PATCH v3 06/15] meson: build virgl driver

2017-11-22 Thread Eric Anholt

Dylan Baker  writes:

> Build tested only.

I haven't done a detailed comparison to the autotools build, but this
all looks fine.  Patches 1-6:

Reviewed-by: Eric Anholt 



Re: [Mesa-dev] [PATCH 00/20] Mesa: Reducing sizes of gl_context etc.

2017-11-22 Thread Ian Romanick

On 11/22/2017 11:02 AM, Nicolai Hähnle wrote:
> On 21.11.2017 19:01, Marek Olšák wrote:
>> Hi,
>>
>> This series reduces sizes of many driver structures. For example:
>>
>> gl_context: 152488 -> 72944 bytes
>> vbo_context: 22696 -> 20008 bytes
>> st_context: 10120 ->  3704 bytes
>>
>> The idea is to decrease CPU cache usage on smaller CPUs. I have not
>> noticed a performance difference with microbenchmarks. It might be
>> a different story with complex apps.
>>
>> There are some good cleanups as well as some controversial changes,
>> so some piglit regressions are to be expected, but we can change
>> piglit not to test those silly cases if we all agree that this is
>> the right thing to do. Feel free to discuss them on the commit
>> threads.
>>
>> Branch for testing:
>>  git://people.freedesktop.org/~mareko/mesa context-reduce-size
>>
>> Expected piglit failures where piglit might need changes:
>> - spec/!opengl 1.3/texunits
>> - spec/arb_viewport_array/viewport-indices
> 
> Can you explain where these failures come from?

Patch 4 explains the first, and, while I haven't gotten there yet, I'll
guess that patch 14 explains the second.

> Will look at the patches themselves later / tomorrow.
> 
> Cheers,
> Nicolai
> 
> 
>>
>> Please review.
>>
>> Thanks,
>> Marek

Re: [Mesa-dev] [PATCH 03/20] mesa: separate legacy stuff from gl_texture_unit into gl_fixedfunc_texture_unit

2017-11-22 Thread Ian Romanick

On 11/21/2017 10:01 AM, Marek Olšák wrote:
> From: Marek Olšák 
> 
> ---
>  src/mesa/drivers/common/meta.c  | 18 +++---
>  src/mesa/drivers/dri/i915/i830_texblend.c   |  3 +-
>  src/mesa/drivers/dri/nouveau/nouveau_util.h |  2 +-
>  src/mesa/drivers/dri/nouveau/nv04_context.c | 14 +++--
>  src/mesa/drivers/dri/nouveau/nv04_state_frag.c  |  6 +-
>  src/mesa/drivers/dri/nouveau/nv10_state_frag.c  |  4 +-
>  src/mesa/drivers/dri/nouveau/nv10_state_tex.c   |  5 +-
>  src/mesa/drivers/dri/nouveau/nv20_state_tex.c   |  3 +-
>  src/mesa/drivers/dri/r200/r200_tex.c|  3 +-
>  src/mesa/drivers/dri/r200/r200_texstate.c   | 31 +
>  src/mesa/drivers/dri/radeon/radeon_maos_verts.c |  2 +-
>  src/mesa/drivers/dri/radeon/radeon_state.c  |  2 +-
>  src/mesa/drivers/dri/radeon/radeon_tex.c|  3 +-
>  src/mesa/drivers/dri/radeon/radeon_texstate.c   | 13 ++--
>  src/mesa/main/attrib.c  | 17 ++---
>  src/mesa/main/context.c |  6 +-
>  src/mesa/main/enable.c  | 23 ---
>  src/mesa/main/ff_fragment_shader.cpp|  3 +-
>  src/mesa/main/ffvertex_prog.c   |  5 +-
>  src/mesa/main/get.c |  7 ++-
>  src/mesa/main/get_hash_params.py| 10 +--
>  src/mesa/main/mtypes.h  | 44 +++--
>  src/mesa/main/rastpos.c |  5 +-
>  src/mesa/main/texenv.c  | 40 +++-
>  src/mesa/main/texgen.c  | 18 +++---
>  src/mesa/main/texstate.c| 83 
> ++---
>  src/mesa/main/texstate.h| 15 +
>  src/mesa/program/prog_statevars.c   | 20 +++---
>  src/mesa/swrast/s_context.c |  2 +-
>  src/mesa/swrast/s_texcombine.c  |  3 +-
>  src/mesa/swrast/s_triangle.c| 10 +--
>  src/mesa/tnl/t_vb_texgen.c  | 10 +--
>  32 files changed, 253 insertions(+), 177 deletions(-)
> 

[snip]

> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 83838d8..cc18ac1 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -1357,21 +1357,20 @@ static const struct value_desc error_value =
>   *
>   * \return the struct value_desc corresponding to the enum or a struct
>   * value_desc of TYPE_INVALID if not found.  This lets the calling
>   * glGet*v() function jump right into a switch statement and
>   * handle errors there instead of having to check for NULL.
>   */
>  static const struct value_desc *
>  find_value(const char *func, GLenum pname, void **p, union value *v)
>  {
> GET_CURRENT_CONTEXT(ctx);
> -   struct gl_texture_unit *unit;
> int mask, hash;
> const struct value_desc *d;
> int api;
>  
> api = ctx->API;
> /* We index into the table_set[] list of per-API hash tables using the 
> API's
>  * value in the gl_api enum. Since GLES 3 doesn't have an API_OPENGL* enum
>  * value since it's compatible with GLES2 its entry in table_set[] is at 
> the
>  * end.
>  */
> @@ -1412,22 +1411,24 @@ find_value(const char *func, GLenum pname, void **p, 
> union value *v)
> case LOC_BUFFER:
>*p = ((char *) ctx->DrawBuffer + d->offset);
>return d;
> case LOC_CONTEXT:
>*p = ((char *) ctx + d->offset);
>return d;
> case LOC_ARRAY:
>*p = ((char *) ctx->Array.VAO + d->offset);
>return d;
> case LOC_TEXUNIT:
> -  unit = &ctx->Texture.Unit[ctx->Texture.CurrentUnit];
> -  *p = ((char *) unit + d->offset);
> +  if (ctx->Texture.CurrentUnit < ARRAY_SIZE(ctx->Texture.FixedFuncUnit)) {
> + unsigned index = ctx->Texture.CurrentUnit;
> + *p = ((char *)&ctx->Texture.FixedFuncUnit[index] + d->offset);
> +  }

Presumably this is an error?  Looking at the surrounding code, it's not
obvious that the error is signaled.  If we hit this path, nothing ever
initializes *p, and _mesa_GetIntegerv, for example, would dereference
random junk.

If the error was previously signaled, then execution shouldn't even get
here in the ctx->Texture.CurrentUnit >=
ARRAY_SIZE(ctx->Texture.FixedFuncUnit) case, right?
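
The concern above can be shown in isolation with a small C sketch. The types and names are hypothetical stand-ins, not Mesa's find_value(): the point is only that a lookup with an out-parameter should either write a defined value or report failure on every path, so the caller never dereferences an uninitialized pointer.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_FIXED_FUNC_UNITS 8 /* hypothetical array size */

struct unit_state {
   int enabled;
};

struct context {
   unsigned current_unit;
   struct unit_state fixed_func_unit[NUM_FIXED_FUNC_UNITS];
};

/* Returns 1 and stores the unit's address in *p on success.
 * Returns 0 and stores NULL in *p when current_unit is out of range,
 * so *p is always written before the function returns. */
static int lookup_current_unit(struct context *ctx, void **p)
{
   if (ctx->current_unit >= NUM_FIXED_FUNC_UNITS) {
      *p = NULL;
      return 0;
   }
   *p = &ctx->fixed_func_unit[ctx->current_unit];
   return 1;
}
```

A caller would check the return value (or test for NULL) and raise the appropriate GL error instead of reading through *p.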

>return d;
> case LOC_CUSTOM:
>find_custom_value(ctx, d, v);
>*p = v;
>return d;
> default:
>assert(0);
>break;
> }
>  

[big snip]

> diff --git a/src/mesa/swrast/s_triangle.c b/src/mesa/swrast/s_triangle.c
> index a4113e5..d3ad89d 100644
> --- a/src/mesa/swrast/s_triangle.c
> +++ b/src/mesa/swrast/s_triangle.c
> @@ -532,21 +532,21 @@ affine_span(struct gl_context *ctx, SWspan *span,
>  #define NAME affine_textured_triangle
>  #define INTERP_Z 1
>  #define INTERP_RGB 1
>  #define INTERP_ALPHA 1
>  #define INTERP_INT_TEX 1
>  #define S_SCALE twidth
>  #define T_SCALE theight
>  
>  #define SETUP_CODE

Re: [Mesa-dev] [PATCH 1/9] ac: pack ac_surface better

2017-11-22 Thread Marek Olšák

On Wed, Nov 22, 2017 at 7:48 PM, Nicolai Hähnle  wrote:
> On 21.11.2017 18:30, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> r600_texture: 1736 -> 1488 bytes
>> ---
>>   src/amd/common/ac_surface.h   |  9 +
>>   src/gallium/drivers/r600/r600_texture.c   |  2 +-
>>   src/gallium/drivers/radeon/r600_texture.c | 12 ++--
>>   3 files changed, 12 insertions(+), 11 deletions(-)
>>
>> diff --git a/src/amd/common/ac_surface.h b/src/amd/common/ac_surface.h
>> index 7ac4737..1dc95cd 100644
>> --- a/src/amd/common/ac_surface.h
>> +++ b/src/amd/common/ac_surface.h
>> @@ -65,22 +65,22 @@ enum radeon_micro_mode {
>>   #define RADEON_SURF_FMASK   (1 << 21)
>>   #define RADEON_SURF_DISABLE_DCC (1 << 22)
>>   #define RADEON_SURF_TC_COMPATIBLE_HTILE (1 << 23)
>>   #define RADEON_SURF_IMPORTED (1 << 24)
>>   #define RADEON_SURF_OPTIMIZE_FOR_SPACE  (1 << 25)
>>   #define RADEON_SURF_SHAREABLE   (1 << 26)
>> struct legacy_surf_level {
>>   uint64_t offset;
>>   uint64_t slice_size;
>> -uint64_t dcc_offset;
>> -uint64_t dcc_fast_clear_size;
>> +uint32_t dcc_offset; /* relative offset within DCC mip tree */
>
>
> What about array textures? Those can get rather large.

In order to get dcc_size = 4GB, the color miptree has to have 1TB.

Marek
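
The two figures in the reply imply a ratio of 1 TB / 4 GB = 256 bytes of color data per byte of DCC. A minimal C sketch of that arithmetic follows; the constant and function name are hypothetical, the ratio is taken only from the numbers quoted above, and GB and TB are read as powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Ratio implied by the reply: 1 TB of color data per 4 GB of DCC. */
#define COLOR_BYTES_PER_DCC_BYTE UINT64_C(256)

/* Color miptree size needed to produce a DCC buffer of dcc_bytes. */
static uint64_t color_bytes_for_dcc_size(uint64_t dcc_bytes)
{
   return dcc_bytes * COLOR_BYTES_PER_DCC_BYTE;
}
```

Under that assumption, a 32-bit dcc_size field would only overflow once the color miptree itself reaches 2^40 bytes.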

>
> Apart from that, this patch is:
>
> Reviewed-by: Nicolai Hähnle 
>
>
>> +uint32_t dcc_fast_clear_size;
>>   uint16_t nblk_x;
>>   uint16_t nblk_y;
>>   enum radeon_surf_mode   mode;
>>   };
>> struct legacy_surf_layout {
>>   unsigned bankw:4;  /* max 8 */
>>   unsigned bankh:4;  /* max 8 */
>>   unsigned mtilea:4; /* max 8 */
>>   unsigned tile_split:13; /* max 4K */
>> @@ -180,22 +180,23 @@ struct radeon_surf {
>>* Only these surfaces are allowed to set it:
>>* - color (if it doesn't have to be displayable)
>>* - DCC (same tile swizzle as color)
>>* - FMASK
>>* - CMASK if it's TC-compatible or if the gen is GFX9
>>* - depth/stencil if HTILE is not TC-compatible and if the gen is
>> not GFX9
>>*/
>>   uint8_t tile_swizzle;
>> uint64_t surf_size;
>> -uint64_t dcc_size;
>> -uint64_t htile_size;
>> +/* DCC and HTILE are very small. */
>> +uint32_t dcc_size;
>> +uint32_t htile_size;
>> uint32_t htile_slice_size;
>> uint32_t surf_alignment;
>>   uint32_t dcc_alignment;
>>   uint32_t htile_alignment;
>> union {
>>   /* R600-VI return values.
>>*
>> diff --git a/src/gallium/drivers/r600/r600_texture.c
>> b/src/gallium/drivers/r600/r600_texture.c
>> index ee6ed64..f7c9b63 100644
>> --- a/src/gallium/drivers/r600/r600_texture.c
>> +++ b/src/gallium/drivers/r600/r600_texture.c
>> @@ -830,21 +830,21 @@ void r600_print_texture_info(struct
>> r600_common_screen *rscreen,
>> rtex->fmask.pitch_in_pixels,
>> rtex->fmask.bank_height,
>> rtex->fmask.slice_tile_max,
>> rtex->fmask.tile_mode_index);
>> if (rtex->cmask.size)
>> u_log_printf(log, "  CMask: offset=%"PRIu64",
>> size=%"PRIu64", alignment=%u, "
>> "slice_tile_max=%u\n",
>> rtex->cmask.offset, rtex->cmask.size,
>> rtex->cmask.alignment,
>> rtex->cmask.slice_tile_max);
>> if (rtex->htile_offset)
>> -   u_log_printf(log, "  HTile: offset=%"PRIu64",
>> size=%"PRIu64", "
>> +   u_log_printf(log, "  HTile: offset=%"PRIu64", size=%u "
>> "alignment=%u\n",
>>  rtex->htile_offset, rtex->surface.htile_size,
>>  rtex->surface.htile_alignment);
>> for (i = 0; i <= rtex->resource.b.b.last_level; i++)
>> u_log_printf(log, "  Level[%i]: offset=%"PRIu64",
>> slice_size=%"PRIu64", "
>> "npix_x=%u, npix_y=%u, npix_z=%u, nblk_x=%u,
>> nblk_y=%u, "
>> "mode=%u, tiling_index = %u\n",
>> i, rtex->surface.u.legacy.level[i].offset,
>> rtex->surface.u.legacy.level[i].slice_size,
>> diff --git a/src/gallium/drivers/radeon/r600_texture.c
>> b/src/gallium/drivers/radeon/r600_texture.c
>> index 5f6e913..38d2470 100644
>> --- a/src/gallium/drivers/radeon/r600_texture.c
>> +++

Re: [Mesa-dev] [PATCH 05/14] i965/blorp: Use a designated initializer for blorp_surf

2017-11-22 Thread Nanley Chery

On Mon, Nov 13, 2017 at 08:12:45AM -0800, Jason Ekstrand wrote:
> This way uninitialized fields get automatically zeroed and it's safe to
> add more fields to blorp_surf.
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c | 15 ---
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 

This patch is
Reviewed-by: Nanley Chery 
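
The zeroing behaviour the commit message relies on is standard C: members omitted from a designated initializer are initialized as if they had static storage duration, i.e. to zero or NULL. A minimal sketch with hypothetical struct and function names (not the real blorp types):

```c
#include <assert.h>
#include <stddef.h>

struct address {
   void *buffer;
   unsigned offset;
   unsigned reloc_flags;
};

struct surf_info {
   const void *surf;
   struct address addr;
   int aux_usage;
   struct address clear_color_addr; /* stands in for a field added later */
};

/* Assigning a compound literal overwrites the whole struct.  Members
 * not named in the initializer (addr.reloc_flags, clear_color_addr)
 * end up zero or NULL regardless of what *out held before. */
static void init_surf_info(struct surf_info *out, const void *surf,
                           void *buffer, unsigned offset, int aux_usage)
{
   *out = (struct surf_info) {
      .surf = surf,
      .addr = (struct address) {
         .buffer = buffer,
         .offset = offset,
      },
      .aux_usage = aux_usage,
   };
}
```

With field-by-field assignment, by contrast, a newly added member would keep whatever value the caller's storage happened to contain.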

> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index 58e1f8a..eae8aaa 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -154,15 +154,16 @@ blorp_surf_for_miptree(struct brw_context *brw,
>   intel_miptree_check_level_layer(mt, *level, start_layer + i);
> }
>  
> -   surf->surf = &mt->surf;
> -   surf->addr = (struct blorp_address) {
> -  .buffer = mt->bo,
> -  .offset = mt->offset,
> -  .reloc_flags = is_render_target ? EXEC_OBJECT_WRITE : 0,
> +   *surf = (struct blorp_surf) {
> +  .surf = &mt->surf,
> +  .addr = (struct blorp_address) {
> + .buffer = mt->bo,
> + .offset = mt->offset,
> + .reloc_flags = is_render_target ? EXEC_OBJECT_WRITE : 0,
> +  },
> +  .aux_usage = aux_usage,
> };
>  
> -   surf->aux_usage = aux_usage;
> -
> struct isl_surf *aux_surf = NULL;
> if (mt->mcs_buf)
>aux_surf = &mt->mcs_buf->surf;
> -- 
> 2.5.0.400.gff86faf
> 

Re: [Mesa-dev] [PATCH 06/14] intel/blorp: Add initial support for indirect clear colors

2017-11-22 Thread Nanley Chery

On Mon, Nov 13, 2017 at 08:12:46AM -0800, Jason Ekstrand wrote:
> ---
>  src/intel/blorp/blorp.c |  1 +
>  src/intel/blorp/blorp.h |  7 +++
>  src/intel/blorp/blorp_genX_exec.h   | 77 
> +
>  src/intel/blorp/blorp_priv.h|  1 +
>  src/intel/vulkan/genX_blorp_exec.c  | 10 
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c | 13 +
>  6 files changed, 109 insertions(+)
> 
> diff --git a/src/intel/blorp/blorp.c b/src/intel/blorp/blorp.c
> index 5faba75..8a9d2fd 100644
> --- a/src/intel/blorp/blorp.c
> +++ b/src/intel/blorp/blorp.c
> @@ -100,6 +100,7 @@ brw_blorp_surface_info_init(struct blorp_context *blorp,
> }
>  
> info->clear_color = surf->clear_color;
> +   info->clear_color_addr = surf->clear_color_addr;
>  
> info->view = (struct isl_view) {
>.usage = is_render_target ? ISL_SURF_USAGE_RENDER_TARGET_BIT :
> diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
> index 9716c66..c3077aa 100644
> --- a/src/intel/blorp/blorp.h
> +++ b/src/intel/blorp/blorp.h
> @@ -106,6 +106,13 @@ struct blorp_surf
> enum isl_aux_usage aux_usage;
>  
> union isl_color_value clear_color;
> +
> +   /** If set (bo != NULL), clear_color is ignored and the actual clear color

The first line of the comment should be blank.

> +* this is fetched from this address.  On gen7-8, this is all of dword 7 
> of
 ^
 Extra word.

> +* RENDER_SURFACE_STATE and is the responsibility of the caller to ensure
> +* that it contains a swizzle of RGBA and resource min LOD of 0.
> +*/
> +   struct blorp_address clear_color_addr;
>  };
>  
>  void
> diff --git a/src/intel/blorp/blorp_genX_exec.h 
> b/src/intel/blorp/blorp_genX_exec.h
> index 5389262..4f88650 100644
> --- a/src/intel/blorp/blorp_genX_exec.h
> +++ b/src/intel/blorp/blorp_genX_exec.h
> @@ -78,6 +78,11 @@ static void
>  blorp_surface_reloc(struct blorp_batch *batch, uint32_t ss_offset,
>  struct blorp_address address, uint32_t delta);
>  
> +#if GEN_GEN >= 7
> +static struct blorp_address
> +blorp_get_surface_base_address(struct blorp_batch *batch);
> +#endif
> +
>  static void
>  blorp_emit_urb_config(struct blorp_batch *batch,
>unsigned vs_entry_size, unsigned sf_entry_size);
> @@ -1202,6 +1207,42 @@ blorp_emit_pipeline(struct blorp_batch *batch,
>  
>  #endif /* GEN_GEN >= 6 */
>  
> +#if GEN_GEN >= 7 && GEN_GEN <= 10
> +static void
> +blorp_emit_memcpy(struct blorp_batch *batch,
> +  struct blorp_address dst,
> +  struct blorp_address src,
> +  uint32_t size)
> +{
> +   assert(size % 4 == 0);
> +
> +   for (unsigned dw = 0; dw < size; dw += 4) {
> +#if GEN_GEN >= 8
> +  blorp_emit(batch, GENX(MI_COPY_MEM_MEM), cp) {
> + cp.DestinationMemoryAddress = dst;
> + cp.SourceMemoryAddress = src;
> +  }
> +#else
> +  /* IVB does not have a general purpose register for command streamer
> +   * commands. Therefore, we use an alternate temporary register.
> +   */
> +#define BLORP_TEMP_REG 0x2440 /* GEN7_3DPRIM_BASE_VERTEX */
> +  blorp_emit(batch, GENX(MI_LOAD_REGISTER_MEM), load) {
> + load.RegisterAddress = BLORP_TEMP_REG;
> + load.MemoryAddress = src;
> +  }
> +  blorp_emit(batch, GENX(MI_STORE_REGISTER_MEM), store) {
> + store.RegisterAddress = BLORP_TEMP_REG;
> + store.MemoryAddress = dst;
> +  }
> +#undef BLORP_TEMP_REG
> +#endif
> +  dst.offset += 4;
> +  src.offset += 4;
> +   }
> +}
> +#endif
> +
>  static void
>  blorp_emit_surface_state(struct blorp_batch *batch,
>   const struct brw_blorp_surface_info *surface,
> @@ -1259,6 +1300,20 @@ blorp_emit_surface_state(struct blorp_batch *batch,
> }
>  
> blorp_flush_range(batch, state, GENX(RENDER_SURFACE_STATE_length) * 4);
> +
> +#if GEN_GEN > 10
> +#  error("Implement indirect clear support on gen11+")
> +#elif GEN_GEN >= 7 && GEN_GEN <= 10

Could we move the #if/elif block under the if statement below and use
assert(!str) instead of #error? Otherwise we can't run any tests on
gen11+ until this feature is implemented.

-Nanley
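
One reading of the suggestion, sketched in self-contained C with hypothetical names: the preprocessor check moves inside the branch that is only taken when an indirect clear color is actually present, and the unimplemented case becomes a runtime assert rather than a compile-time #error. GEN_GEN is defined to 11 here purely for illustration, and the real copy is replaced by a return value.

```c
#include <assert.h>
#include <stddef.h>

#ifndef GEN_GEN
#define GEN_GEN 11 /* hypothetical: pretend this builds for gen11 */
#endif

struct address {
   void *buffer;
   unsigned offset;
};

/* Returns 1 if an indirect clear color would be copied, 0 otherwise.
 * The build succeeds on every gen; only callers that actually pass an
 * indirect clear color on an unsupported gen hit the assert. */
static int emit_indirect_clear(const struct address *clear_color_addr)
{
   if (clear_color_addr->buffer == NULL)
      return 0;

#if GEN_GEN >= 7 && GEN_GEN <= 10
   /* The real code would emit the memory copy here. */
   return 1;
#else
   assert(!"indirect clear colors are not implemented on this gen");
   return 0;
#endif
}
```

Structured this way, tests that never use indirect clear colors can still run on a gen where the feature is missing.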

> +   if (surface->clear_color_addr.buffer) {
> +  struct blorp_address dst_addr = blorp_get_surface_base_address(batch);
> +  dst_addr.offset += state_offset + isl_dev->ss.clear_value_offset;
> +  blorp_emit_memcpy(batch, dst_addr, surface->clear_color_addr,
> +isl_dev->ss.clear_value_size);
> +   }
> +#else
> +   /* Indirect clears are only supported on gen7+ */
> +   assert(surface->clear_color_addr.buffer == NULL);
> +#endif
>  }
>  
>  static void
> @@ -1303,6 +1358,7 @@ blorp_emit_surface_states(struct blorp_batch *batch,
> uint32_t bind_offset, surface_offsets[2];
> void *surface_maps[2];
>  
> +   MAYBE_UNUSED bool has_indirect_clear_color = false;
> if

Re: [Mesa-dev] [PATCH 11/12] spirv: Generate code to track SPIR-V capability dependencies

2017-11-22 Thread Dylan Baker

With the one thing Eric pointed out changed, this patch and the next one are:
Reviewed-by: Dylan Baker 

Thanks for making those changes for me.

Quoting Ian Romanick (2017-11-20 17:24:09)
> From: Ian Romanick 
> 
> v2: Clean ups.  Remove some functions that never ended up being used.
> 
> v3: After updating spirv.core.grammar.json, fix the handling of
> ShaderViewportMaskNV.  See the comment around line 71 of
> spirv_capabilities_h.py.
> 
> v4: Many Python style changes based on feedback from Dylan.
> 
> Signed-off-by: Ian Romanick 
> ---
>  src/compiler/Makefile.sources  |   2 +
>  src/compiler/Makefile.spirv.am |   8 +
>  src/compiler/glsl/meson.build  |   5 +-
>  src/compiler/spirv/.gitignore  |   2 +
>  src/compiler/spirv/meson.build |  14 ++
>  src/compiler/spirv/spirv_capabilities_h.py | 365 
> +
>  6 files changed, 394 insertions(+), 2 deletions(-)
>  create mode 100644 src/compiler/spirv/spirv_capabilities_h.py
> 
> diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
> index 2ab8e16..1d67cba 100644
> --- a/src/compiler/Makefile.sources
> +++ b/src/compiler/Makefile.sources
> @@ -287,6 +287,8 @@ NIR_FILES = \
> nir/nir_worklist.h
>  
>  SPIRV_GENERATED_FILES = \
> +   spirv/spirv_capabilities.cpp \
> +   spirv/spirv_capabilities.h \
> spirv/spirv_info.c
>  
>  SPIRV_FILES = \
> diff --git a/src/compiler/Makefile.spirv.am b/src/compiler/Makefile.spirv.am
> index 9841004..4bc684a 100644
> --- a/src/compiler/Makefile.spirv.am
> +++ b/src/compiler/Makefile.spirv.am
> @@ -20,6 +20,14 @@
>  # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
>  # IN THE SOFTWARE.
>  
> +spirv/spirv_capabilities.cpp: spirv/spirv_capabilities_h.py 
> spirv/spirv.core.grammar.json
> +   $(MKDIR_GEN)
> +   $(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py 
> $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false)
> +
> +spirv/spirv_capabilities.h: spirv/spirv_capabilities_h.py 
> spirv/spirv.core.grammar.json
> +   $(MKDIR_GEN)
> +   $(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py 
> $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false)
> +
>  spirv/spirv_info.c: spirv/spirv_info_c.py spirv/spirv.core.grammar.json
> $(MKDIR_GEN)
> $(PYTHON_GEN) $(srcdir)/spirv/spirv_info_c.py 
> $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false)
> diff --git a/src/compiler/glsl/meson.build b/src/compiler/glsl/meson.build
> index 5b505c0..6e43f80 100644
> --- a/src/compiler/glsl/meson.build
> +++ b/src/compiler/glsl/meson.build
> @@ -198,11 +198,12 @@ files_libglsl_standalone = files(
>  libglsl = static_library(
>'glsl',
>[files_libglsl, glsl_parser, glsl_lexer_cpp, ir_expression_operation_h,
> -   ir_expression_operation_strings_h, ir_expression_operation_constant_h],
> +   ir_expression_operation_strings_h, ir_expression_operation_constant_h,
> +   spirv_capabilities_cpp, spirv_capabilities_h],
>c_args : [c_vis_args, c_msvc_compat_args, no_override_init_args],
>cpp_args : [cpp_vis_args, cpp_msvc_compat_args],
>link_with : [libnir, libglcpp],
> -  include_directories : [inc_common, inc_compiler, inc_nir],
> +  include_directories : [inc_common, inc_compiler, inc_nir, inc_spirv],
>build_by_default : false,
>  )
>  
> diff --git a/src/compiler/spirv/.gitignore b/src/compiler/spirv/.gitignore
> index f723c31..6b5ef0a 100644
> --- a/src/compiler/spirv/.gitignore
> +++ b/src/compiler/spirv/.gitignore
> @@ -1 +1,3 @@
> +/spirv_capabilities.cpp
> +/spirv_capabilities.h
>  /spirv_info.c
> diff --git a/src/compiler/spirv/meson.build b/src/compiler/spirv/meson.build
> index 8f1a02e..8b6071a 100644
> --- a/src/compiler/spirv/meson.build
> +++ b/src/compiler/spirv/meson.build
> @@ -24,3 +24,17 @@ spirv_info_c = custom_target(
>output : 'spirv_info.c',
>command : [prog_python2, '@INPUT0@', '@INPUT1@', '@OUTPUT@'],
>  )
> +
> +spirv_capabilities_cpp = custom_target(
> +  'spirv_capabilities.cpp',
> +  input : files('spirv_capabilities_h.py', 'spirv.core.grammar.json'),
> +  output : 'spirv_capabilities.cpp',
> +  command : [prog_python2, '@INPUT0@', '--gen-cpp', '@OUTPUT@', '@INPUT1@'],
> +)
> +
> +spirv_capabilities_h = custom_target(
> +  'spirv_capabilities.h',
> +  input : files('spirv_capabilities_h.py', 'spirv.core.grammar.json'),
> +  output : 'spirv_capabilities.h',
> +  command : [prog_python2, '@INPUT0@', '--gen-h', '@OUTPUT@', '@INPUT1@'],
> +)
> diff --git a/src/compiler/spirv/spirv_capabilities_h.py 
> b/src/compiler/spirv/spirv_capabilities_h.py
> new file mode 100644
> index 000..78b1166
> --- /dev/null
> +++ b/src/compiler/spirv/spirv_capabilities_h.py
> @@ -0,0 +1,365 @@
> +COPYRIGHT = """\
> +/*
> + * Copyright (C) 2017 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge,

Re: [Mesa-dev] [PATCH 12/12] spirv: Generate SPIR-V builder infrastructure

2017-11-22 Thread Ian Romanick

On 11/21/2017 03:56 AM, Eric Engestrom wrote:
>> diff --git a/src/compiler/Makefile.spirv.am b/src/compiler/Makefile.spirv.am
>> index 4bc684a..dc3c01c 100644
>> --- a/src/compiler/Makefile.spirv.am
>> +++ b/src/compiler/Makefile.spirv.am
>> @@ -20,13 +20,17 @@
>>  # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
>> DEALINGS
>>  # IN THE SOFTWARE.
>>  
>> +spirv/spirv_builder_functions.h: spirv/spirv_builder_h.py 
>> spirv/spirv.core.grammar.json
>> +$(MKDIR_GEN)
>> +$(PYTHON_GEN) $(srcdir)/spirv/spirv_builder_h.py 
>> $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false)
>> +
>>  spirv/spirv_capabilities.cpp: spirv/spirv_capabilities_h.py 
>> spirv/spirv.core.grammar.json
>>  $(MKDIR_GEN)
>> -$(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py 
>> $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false)
>> +$(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py --gen-cpp $@ 
>> $(srcdir)/spirv/spirv.core.grammar.json || ($(RM) $@; false)
>>  
>>  spirv/spirv_capabilities.h: spirv/spirv_capabilities_h.py 
>> spirv/spirv.core.grammar.json
>>  $(MKDIR_GEN)
>> -$(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py 
>> $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false)
>> +$(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py --gen-h $@ 
>> $(srcdir)/spirv/spirv.core.grammar.json || ($(RM) $@; false)
> 
> Oh, I see you fixed it already, the hunk just ended up in the wrong
> commit (:

Mystery solved!  I'll move this change to the previous commit... where
it belongs.  Thanks for catching this!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 11/12] spirv: Generate code to track SPIR-V capability dependencies

2017-11-22 Thread Ian Romanick

On 11/21/2017 03:29 AM, Eric Engestrom wrote:
> On Monday, 2017-11-20 17:24:09 -0800, Ian Romanick wrote:
>> diff --git a/src/compiler/Makefile.spirv.am b/src/compiler/Makefile.spirv.am
>> index 9841004..4bc684a 100644
>> --- a/src/compiler/Makefile.spirv.am
>> +++ b/src/compiler/Makefile.spirv.am
>> @@ -20,6 +20,14 @@
>>  # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
>> DEALINGS
>>  # IN THE SOFTWARE.
>>  
>> +spirv/spirv_capabilities.cpp: spirv/spirv_capabilities_h.py 
>> spirv/spirv.core.grammar.json
>> +$(MKDIR_GEN)
>> +$(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py 
>> $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false)
>> +
>> +spirv/spirv_capabilities.h: spirv/spirv_capabilities_h.py 
>> spirv/spirv.core.grammar.json
>> +$(MKDIR_GEN)
>> +$(PYTHON_GEN) $(srcdir)/spirv/spirv_capabilities_h.py 
>> $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false)
> 
> Missing `--gen-cpp`/`--gen-h`

That's weird.  I made that change after it failed to build on our CI
system.  This also prompted me to poke at Dylan about when we can remove
the autotools build system so that I only have to update 1 build instead
of 2. :)  I will look into this...

>> +
>>  spirv/spirv_info.c: spirv/spirv_info_c.py spirv/spirv.core.grammar.json
>>  $(MKDIR_GEN)
>>  $(PYTHON_GEN) $(srcdir)/spirv/spirv_info_c.py 
>> $(srcdir)/spirv/spirv.core.grammar.json $@ || ($(RM) $@; false)

Re: [Mesa-dev] [PATCH v2] meson: add logic to select apple and windows dri

2017-11-22 Thread Dylan Baker

Quoting Eric Engestrom (2017-11-22 03:16:17)
> On Tuesday, 2017-11-21 10:50:29 -0800, Dylan Baker wrote:
> > Quoting Eric Engestrom (2017-11-21 10:38:25)
> > > On Tuesday, 2017-11-21 10:21:07 -0800, Dylan Baker wrote:
> > > > This is still not fully correct (Haiku and BSD in particular are probably
> > > > not correct), but Linux is not regressed and this should be correct for
> > > > macOS and Windows.
> > > > 
> > > > v2: - set the dri_platform to windows on Cygwin as well (Jon)
> > > 
> > > R-b stands
> > > 
> > > > 
> > > > Signed-off-by: Dylan Baker 
> > > > ---
> > > >  meson.build | 15 +--
> > > >  1 file changed, 13 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/meson.build b/meson.build
> > > > index 52f2c1cb0d0..4248cbcfd7e 100644
> > > > --- a/meson.build
> > > > +++ b/meson.build
> > > > @@ -187,8 +187,19 @@ if with_dri_i915
> > > >dep_libdrm_intel = dependency('libdrm_intel', version : '>= 2.4.75')
> > > >  endif
> > > >  
> > > > -# TODO: other OSes
> > > > -with_dri_platform = 'drm'
> > > > +# TODO: gnu 
> > > 
> > > I missed that comment the first time around; I don't understand what it
> > > means?
> > 
> > The autotools build has a handler for setting the dri_platform to 'none' on
> > gnu* (which I assume to be hurd). See configure.ac:1513
> > 
> > As far as I know meson doesn't support hurd ATM (though I doubt they'd turn 
> > away
> > patches for it).
> > 
> > We can drop the TODO if you'd prefer, I just like to note things in the
> > autotools/scons build that aren't currently supported in the meson build.
> 
> No, I think keeping the TODO is good to indicate something that is
> handled by the other build system(s), even if they might never be
> "fixed" (eg. if meson never gets ported to hurd).
> 
> I guess this case is already covered by your `else` though, so maybe
> move the comment there, and make it more than 3 letters? :P
> 
> Don't let this stop you from pushing it though, it's really a nitpick ;)

I went ahead and moved the FIXME into the else block and gave a longer (and
hopefully better) comment than `gnu`

Dylan



Re: [Mesa-dev] [PATCH mesa 0/7] remove upstreamed specs

2017-11-22 Thread Eric Anholt

Jordan Justen  writes:

> On 2017-11-22 09:59:34, Eric Engestrom wrote:
>> A recent thread [1] made me check our local specs to see which ones were
>> upstream. This series removes the ones that are identical upstream
>> (modulo "TBD" extension numbers in some cases).
>
> While I don't have too strong of an opinion on it, I think we should
> keep a copy of Mesa specs that are in the upstream registry.
>
> I think it makes sense to send a patch to mesa-dev for new Mesa specs
> or changes to Mesa specs. Having a copy in docs/specs works well for
> that.

The downside is that that process means that we'll inevitably keep stale
or divergent copies in Mesa, when the canonical location for GL specs is
Khronos.  We do have a reasonable process for modifying Khronos's specs
now, which we didn't before.

I think we should move all our specs out of Mesa and into the Khronos registry.


[Mesa-dev] [PATCH 1/2] radv/winsys: do not try to create a BO list with 0 buffers

2017-11-22 Thread Samuel Pitoiset

This happens when all BOs have the RADEON_FLAG_NO_INTERPROCESS_SHARING
(DRM version >= 3.23) flag set. This flag is mainly used for reducing
overhead on the userspace side because we don't have to put those BOs
inside the list.

Though, if the driver tries to create a list with 0 buffers inside it,
libdrm returns -EINVAL and the app just crashes.

This fixes a bunch of CTS dEQP-VK.sparse_resources.* failures (~100).

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c 
b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
index 4e4a82a1f1..67dc4a8ccc 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
@@ -518,7 +518,8 @@ static int radv_amdgpu_create_bo_list(struct 
radv_amdgpu_winsys *ws,
  struct radeon_winsys_cs *extra_cs,
  amdgpu_bo_list_handle *bo_list)
 {
-   int r;
+   int r = 0;
+
if (ws->debug_all_bos) {
struct radv_amdgpu_winsys_bo *bo;
amdgpu_bo_handle *handles;
@@ -636,8 +637,13 @@ static int radv_amdgpu_create_bo_list(struct 
radv_amdgpu_winsys *ws,
}
}
}
-   r = amdgpu_bo_list_create(ws->dev, unique_bo_count, handles,
- priorities, bo_list);
+
+   if (unique_bo_count > 0) {
+   r = amdgpu_bo_list_create(ws->dev, unique_bo_count, 
handles,
+ priorities, bo_list);
+   } else {
+   *bo_list = 0;
+   }
 
free(handles);
free(priorities);
-- 
2.15.0
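
Editorial sketch of the guard described in the commit message above, assuming a stand-in for the libdrm call: `mock_bo_list_create`, `mock_bo_list_handle` and `create_bo_list_guarded` are hypothetical names used only for illustration, and the only behaviour taken from the message is that creating a list with 0 buffers fails with -EINVAL.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the library's list handle type. */
typedef uintptr_t mock_bo_list_handle;

/* Hypothetical stand-in for the list-creation call: as described in the
 * commit message, an empty list is rejected with -EINVAL. */
static int mock_bo_list_create(unsigned count, mock_bo_list_handle *out)
{
    if (count == 0)
        return -EINVAL;
    *out = (mock_bo_list_handle)count; /* any non-zero token will do here */
    return 0;
}

/* Only call into the library when there is at least one buffer;
 * otherwise report success and hand back a null handle. */
static int create_bo_list_guarded(unsigned unique_bo_count,
                                  mock_bo_list_handle *bo_list)
{
    int r = 0;

    if (unique_bo_count > 0)
        r = mock_bo_list_create(unique_bo_count, bo_list);
    else
        *bo_list = 0;

    return r;
}
```

Callers of such a helper then have to tolerate a null list handle at submission time, which is presumably why the patch initialises `r` to 0 and writes `*bo_list = 0` explicitly.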


[Mesa-dev] [PATCH 2/2] radv/winsys: improve error messages when the buffer list creation failed

2017-11-22 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c 
b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
index 67dc4a8ccc..e5ea312aee 100644
--- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
+++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c
@@ -711,7 +711,8 @@ static int radv_amdgpu_winsys_cs_submit_chained(struct 
radeon_winsys_ctx *_ctx,
 
r = radv_amdgpu_create_bo_list(cs0->ws, cs_array, cs_count, NULL, 
initial_preamble_cs, &bo_list);
if (r) {
-   fprintf(stderr, "amdgpu: Failed to created the BO list for 
submission\n");
+   fprintf(stderr, "amdgpu: buffer list creation failed for the "
+   "chained submission(%d)\n", r);
return r;
}
 
@@ -778,7 +779,8 @@ static int radv_amdgpu_winsys_cs_submit_fallback(struct 
radeon_winsys_ctx *_ctx,
 		r = radv_amdgpu_create_bo_list(cs0->ws, &cs_array[i], cnt, NULL,
 		                               preamble_cs, &bo_list);
if (r) {
-   fprintf(stderr, "amdgpu: Failed to created the BO list 
for submission\n");
+   fprintf(stderr, "amdgpu: buffer list creation failed "
+   "for the fallback submission (%d)\n", 
r);
return r;
}
 
@@ -900,7 +902,8 @@ static int radv_amdgpu_winsys_cs_submit_sysmem(struct 
radeon_winsys_ctx *_ctx,
   (struct 
radv_amdgpu_winsys_bo*)bo,
 		                               preamble_cs, &bo_list);
if (r) {
-   fprintf(stderr, "amdgpu: Failed to created the BO list 
for submission\n");
+   fprintf(stderr, "amdgpu: buffer list creation failed "
+   "for the sysmem submission (%d)\n", r);
return r;
}
 
-- 
2.15.0


Re: [Mesa-dev] [PATCH 00/20] Mesa: Reducing sizes of gl_context etc.

2017-11-22 Thread Nicolai Hähnle


On 21.11.2017 19:01, Marek Olšák wrote:

Hi,

This series reduces sizes of many driver structures. For example:

gl_context: 152488 -> 72944 bytes
vbo_context: 22696 -> 20008 bytes
st_context: 10120 ->  3704 bytes

The idea is to decrease CPU cache usage on smaller CPUs. I have not
noticed a performance difference with microbenchmarks. It might be
a different story with complex apps.

There are some good cleanups as well as some controversial changes,
so some piglit regressions are to be expected, but we can change
piglit not to test those silly cases if we all agree that this is
the right thing to do. Feel free to discuss them on the commit
threads.

Branch for testing:
 git://people.freedesktop.org/~mareko/mesa context-reduce-size

Expected piglit failures where piglit might need changes:
- spec/!opengl 1.3/texunits
- spec/arb_viewport_array/viewport-indices


Can you explain where these failures come from?

Will look at the patches themselves later / tomorrow.

Cheers,
Nicolai




Please review.

Thanks,
Marek




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.

Re: [Mesa-dev] [PATCH 12/12] gallium/hud: add HUD sharing within a context share group

2017-11-22 Thread Nicolai Hähnle


On 21.11.2017 18:46, Marek Olšák wrote:

From: Marek Olšák 

This is needed for profiling multi-context applications like Chrome.
One context can record queries and another context can draw the HUD.
---
  src/gallium/auxiliary/hud/hud_context.c  | 103 +++
  src/gallium/auxiliary/hud/hud_context.h  |   3 +
  src/gallium/auxiliary/hud/hud_private.h  |   2 +
  src/gallium/state_trackers/dri/dri_context.c |  12 +++-
  4 files changed, 106 insertions(+), 14 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index 783dafd..2586b5f 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -673,26 +673,59 @@ hud_stop_queries(struct hud_context *hud, struct 
pipe_context *pipe)
   }
}
  
hud_pane_accumulate_vertices(hud, pane);

 }
  
 /* unmap the uploader's vertex buffer before drawing */

 u_upload_unmap(pipe->stream_uploader);
  }
  
+/**

+ * Record queries and draw the HUD. The "cso" parameter acts as a filter.
+ * If "cso" is not the recording context, recording is skipped.
+ * If "cso" is not the drawing context, drawing is skipped.
+ * cso == NULL ignores the filter.
+ */
  void
  hud_run(struct hud_context *hud, struct cso_context *cso,
  struct pipe_resource *tex)
  {
+   struct pipe_context *pipe = cso ? cso_get_pipe_context(cso) : NULL;
+
+   /* If "cso" is the recording or drawing context or NULL, execute
+* the operation. Otherwise, don't do anything.
+*/
+   if (hud->record_pipe && (!pipe || pipe == hud->record_pipe))
+  hud_stop_queries(hud, hud->record_pipe);
+
+   if (hud->cso && (!cso || cso == hud->cso))
+  hud_draw_results(hud, tex);
+
+   if (hud->record_pipe && (!pipe || pipe == hud->record_pipe))
+  hud_start_queries(hud, hud->record_pipe);
+}
+
+/**
+ * Record query results and assemble vertices if "pipe" is a recording but
+ * not drawing context.
+ */
+void
+hud_record_only(struct hud_context *hud, struct pipe_context *pipe)
+{
+   assert(pipe);
+
+   /* If it's a drawing context, only hud_run() records query results. */
+   if (pipe == hud->pipe || pipe != hud->record_pipe)
+  return;
+
 hud_stop_queries(hud, hud->record_pipe);
-   hud_draw_results(hud, tex);
 hud_start_queries(hud, hud->record_pipe);
  }
  
  static void

  fixup_bytes(enum pipe_driver_query_type type, int position, uint64_t *exp10)
  {
 if (type == PIPE_DRIVER_QUERY_TYPE_BYTES && position % 3 == 0)
*exp10 = (*exp10 / 1000) * 1024;
  }
  
@@ -1677,23 +1710,60 @@ hud_unset_record_context(struct hud_context *hud)

    hud_batch_query_cleanup(&hud->batch_query, pipe);
 hud->record_pipe = NULL;
  }
  
  static void

  hud_set_record_context(struct hud_context *hud, struct pipe_context *pipe)
  {
 hud->record_pipe = pipe;
  }
  
+/**

+ * Create the HUD.
+ *
+ * If "share" is non-NULL and GALLIUM_HUD_SHARE=x,y is set, increment the
+ * reference counter of "share", set "cso" as the recording or drawing context
+ * according to the environment variable, and return "share".
+ * This allows sharing the HUD instance within a multi-context share group,
+ * record queries in one context and draw them in another.
+ */
  struct hud_context *
  hud_create(struct cso_context *cso, struct hud_context *share)
  {
+   const char *share_env = debug_get_option("GALLIUM_HUD_SHARE", NULL);
+   unsigned record_ctx, draw_ctx;
+
+   if (share_env && sscanf(share_env, "%u,%u", &record_ctx, &draw_ctx) != 2)
+  share_env = NULL;
+
+   if (share && share_env) {
+  /* All contexts in a share group share the HUD instance.
+   * Only one context can record queries and only one context
+   * can draw the HUD.
+   *
+   * GALLIUM_HUD_SHARE=x,y determines the context indices.
+   */
+  int context_id = p_atomic_inc_return(&share->refcount) - 1;
+
+  if (context_id == record_ctx) {
+ assert(!share->record_pipe);
+ hud_set_record_context(share, cso_get_pipe_context(cso));
+  }
+
+  if (context_id == draw_ctx) {
+ assert(!share->pipe);
+ hud_set_draw_context(share, cso);
+  }
+
+  return share;
+   }
+
 struct pipe_screen *screen = cso_get_pipe_context(cso)->screen;
 struct hud_context *hud;
 unsigned i;
 const char *env = debug_get_option("GALLIUM_HUD", NULL);
  #ifdef PIPE_OS_UNIX
 unsigned signo = debug_get_num_option("GALLIUM_HUD_TOGGLE_SIGNAL", 0);
 static boolean sig_handled = FALSE;
 struct sigaction action = {};
  #endif
 huds_visible = debug_get_bool_option("GALLIUM_HUD_VISIBLE", TRUE);
@@ -1710,20 +1780,21 @@ hud_create(struct cso_context *cso, struct hud_context 
*share)
 if (!hud)
return NULL;
  
 /* font (the context is only used for the texture upload) */

 if (!util_font_create(cso_get_pipe_context(cso),
                          UTIL_FONT_FIXED_8X13, &hud->font)) {
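
Editorial sketch of the GALLIUM_HUD_SHARE=x,y handling described in the hud_create() comment above. It is a simplification under stated assumptions: `parse_hud_share`, `hud_roles_for` and the `HUD_ROLE_*` flags are hypothetical names, and the real patch sets the record and draw contexts on the shared HUD rather than returning flags.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: parse "x,y" as (record index, draw index).
 * Returns false for NULL or malformed input, in which case sharing
 * would simply be disabled. */
static bool parse_hud_share(const char *env, unsigned *record_ctx,
                            unsigned *draw_ctx)
{
    return env && sscanf(env, "%u,%u", record_ctx, draw_ctx) == 2;
}

/* Hypothetical role flags for one context in the share group. */
enum { HUD_ROLE_RECORD = 1, HUD_ROLE_DRAW = 2 };

/* context_id is the position at which a context joined the share group
 * (0 for the first one). At most one index records and one index draws;
 * the same context may do both. */
static unsigned hud_roles_for(unsigned context_id, unsigned record_ctx,
                              unsigned draw_ctx)
{
    unsigned roles = 0;

    if (context_id == record_ctx)
        roles |= HUD_ROLE_RECORD;
    if (context_id == draw_ctx)
        roles |= HUD_ROLE_DRAW;

    return roles;
}
```

With GALLIUM_HUD_SHARE=1,0, for example, the second context to join would record queries and the first would draw the HUD; every other context in the group does neither.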

Re: [Mesa-dev] [PATCH 1/9] ac: pack ac_surface better

2017-11-22 Thread Nicolai Hähnle


On 21.11.2017 18:30, Marek Olšák wrote:

From: Marek Olšák 

r600_texture: 1736 -> 1488 bytes
---
  src/amd/common/ac_surface.h   |  9 +
  src/gallium/drivers/r600/r600_texture.c   |  2 +-
  src/gallium/drivers/radeon/r600_texture.c | 12 ++--
  3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/amd/common/ac_surface.h b/src/amd/common/ac_surface.h
index 7ac4737..1dc95cd 100644
--- a/src/amd/common/ac_surface.h
+++ b/src/amd/common/ac_surface.h
@@ -65,22 +65,22 @@ enum radeon_micro_mode {
  #define RADEON_SURF_FMASK   (1 << 21)
  #define RADEON_SURF_DISABLE_DCC (1 << 22)
  #define RADEON_SURF_TC_COMPATIBLE_HTILE (1 << 23)
  #define RADEON_SURF_IMPORTED(1 << 24)
  #define RADEON_SURF_OPTIMIZE_FOR_SPACE  (1 << 25)
  #define RADEON_SURF_SHAREABLE   (1 << 26)
  
  struct legacy_surf_level {

  uint64_toffset;
  uint64_tslice_size;
-uint64_tdcc_offset;
-uint64_tdcc_fast_clear_size;
+uint32_tdcc_offset; /* relative offset within DCC mip 
tree */


What about array textures? Those can get rather large.

Apart from that, this patch is:

Reviewed-by: Nicolai Hähnle 


+uint32_tdcc_fast_clear_size;
  uint16_tnblk_x;
  uint16_tnblk_y;
  enum radeon_surf_mode   mode;
  };
  
  struct legacy_surf_layout {

  unsignedbankw:4;  /* max 8 */
  unsignedbankh:4;  /* max 8 */
  unsignedmtilea:4; /* max 8 */
  unsignedtile_split:13; /* max 4K */
@@ -180,22 +180,23 @@ struct radeon_surf {
   * Only these surfaces are allowed to set it:
   * - color (if it doesn't have to be displayable)
   * - DCC (same tile swizzle as color)
   * - FMASK
   * - CMASK if it's TC-compatible or if the gen is GFX9
   * - depth/stencil if HTILE is not TC-compatible and if the gen is not 
GFX9
   */
  uint8_t tile_swizzle;
  
  uint64_tsurf_size;

-uint64_tdcc_size;
-uint64_thtile_size;
+/* DCC and HTILE are very small. */
+uint32_tdcc_size;
+uint32_thtile_size;
  
  uint32_thtile_slice_size;
  
  uint32_tsurf_alignment;

  uint32_tdcc_alignment;
  uint32_thtile_alignment;
  
  union {

  /* R600-VI return values.
   *
diff --git a/src/gallium/drivers/r600/r600_texture.c 
b/src/gallium/drivers/r600/r600_texture.c
index ee6ed64..f7c9b63 100644
--- a/src/gallium/drivers/r600/r600_texture.c
+++ b/src/gallium/drivers/r600/r600_texture.c
@@ -830,21 +830,21 @@ void r600_print_texture_info(struct r600_common_screen 
*rscreen,
rtex->fmask.pitch_in_pixels, rtex->fmask.bank_height,
rtex->fmask.slice_tile_max, 
rtex->fmask.tile_mode_index);
  
  	if (rtex->cmask.size)

u_log_printf(log, "  CMask: offset=%"PRIu64", size=%"PRIu64", 
alignment=%u, "
"slice_tile_max=%u\n",
rtex->cmask.offset, rtex->cmask.size, 
rtex->cmask.alignment,
rtex->cmask.slice_tile_max);
  
  	if (rtex->htile_offset)

-   u_log_printf(log, "  HTile: offset=%"PRIu64", size=%"PRIu64", "
+   u_log_printf(log, "  HTile: offset=%"PRIu64", size=%u "
"alignment=%u\n",
 rtex->htile_offset, rtex->surface.htile_size,
 rtex->surface.htile_alignment);
  
  	for (i = 0; i <= rtex->resource.b.b.last_level; i++)

u_log_printf(log, "  Level[%i]: offset=%"PRIu64", 
slice_size=%"PRIu64", "
"npix_x=%u, npix_y=%u, npix_z=%u, nblk_x=%u, nblk_y=%u, 
"
"mode=%u, tiling_index = %u\n",
i, rtex->surface.u.legacy.level[i].offset,
rtex->surface.u.legacy.level[i].slice_size,
diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 5f6e913..38d2470 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -998,31 +998,31 @@ void si_print_texture_info(struct r600_common_screen 
*rscreen,
u_log_printf(log, "  CMask: offset=%"PRIu64", 
size=%"PRIu64", "
"alignment=%u, rb_aligned=%u, 
pipe_aligned=%u\n",
rtex->cmask.offset,
rtex->surface.u.gfx9.cmask_size,
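
Editorial sketch of the size reduction in the legacy_surf_level hunk above: narrowing two 64-bit fields to 32 bits shrinks each per-level record. The struct names are hypothetical, `mode` is modelled as a plain `int` instead of the real enum, and exact sizes depend on the ABI, so the sketch only compares them.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout before the patch: four 64-bit fields. */
struct level_wide {
    uint64_t offset;
    uint64_t slice_size;
    uint64_t dcc_offset;
    uint64_t dcc_fast_clear_size;
    uint16_t nblk_x;
    uint16_t nblk_y;
    int      mode; /* stand-in for enum radeon_surf_mode */
};

/* Layout after the patch: the two DCC fields narrowed to 32 bits. */
struct level_narrow {
    uint64_t offset;
    uint64_t slice_size;
    uint32_t dcc_offset;
    uint32_t dcc_fast_clear_size;
    uint16_t nblk_x;
    uint16_t nblk_y;
    int      mode; /* stand-in for enum radeon_surf_mode */
};

/* Bytes saved across an array of n per-level records. */
static size_t level_bytes_saved(size_t n)
{
    return n * (sizeof(struct level_wide) - sizeof(struct level_narrow));
}
```

The saving only holds if every value still fits in 32 bits, which is the point of the review question about array textures.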

Re: [Mesa-dev] [PATCH 9/9] radeonsi: expose all CB performance counters on Stoney

2017-11-22 Thread Nicolai Hähnle


Patches 3-9:

Reviewed-by: Nicolai Hähnle 

On 21.11.2017 18:30, Marek Olšák wrote:

From: Marek Olšák 

---
  src/gallium/drivers/radeonsi/si_perfcounter.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_perfcounter.c 
b/src/gallium/drivers/radeonsi/si_perfcounter.c
index 56af0a0..5029af0 100644
--- a/src/gallium/drivers/radeonsi/si_perfcounter.c
+++ b/src/gallium/drivers/radeonsi/si_perfcounter.c
@@ -366,21 +366,21 @@ static struct si_pc_block groups_CIK[] = {
{ _IA, 22 },
{ _MC, 22 },
{ _SRBM, 19 },
{ _WD, 22 },
{ _CPG, 46 },
{ _CPC, 22 },
  
  };
  
  static struct si_pc_block groups_VI[] = {

-   { _CB, 396, 4 },
+   { _CB, 405, 4 },
{ _CPF, 19 },
{ _DB, 257, 4 },
{ _GRBM, 34 },
{ _GRBMSE, 15 },
{ _PA_SU, 153 },
{ _PA_SC, 397 },
{ _SPI, 197 },
{ _SQ, 273 },
{ _SX, 34 },
{ _TA, 119, 16 },




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.

Re: [Mesa-dev] [PATCH 2/9] ac: change legacy_surf_level::slice_size to dword units

2017-11-22 Thread Nicolai Hähnle


On 21.11.2017 18:30, Marek Olšák wrote:

From: Marek Olšák 

The next commit will reduce the size even more.
---
  src/amd/common/ac_surface.c|  2 +-
  src/amd/common/ac_surface.h|  2 +-
  src/amd/vulkan/radv_image.c|  8 
  src/gallium/drivers/r600/evergreen_state.c |  8 
  src/gallium/drivers/r600/r600_state.c  |  8 
  src/gallium/drivers/r600/r600_texture.c| 14 +++---
  src/gallium/drivers/r600/radeon_uvd.c  |  2 +-
  src/gallium/drivers/radeon/r600_texture.c  | 14 +++---
  src/gallium/drivers/radeon/radeon_uvd.c|  2 +-
  src/gallium/drivers/radeonsi/cik_sdma.c|  4 ++--
  src/gallium/drivers/radeonsi/si_dma.c  |  8 
  src/gallium/winsys/radeon/drm/radeon_drm_surface.c |  4 ++--
  12 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c
index f7600a3..2b6c3fb 100644
--- a/src/amd/common/ac_surface.c
+++ b/src/amd/common/ac_surface.c
@@ -297,21 +297,21 @@ static int gfx6_compute_level(ADDR_HANDLE addrlib,
  
  	ret = AddrComputeSurfaceInfo(addrlib,

 AddrSurfInfoIn,
 AddrSurfInfoOut);
if (ret != ADDR_OK) {
return ret;
}
  
  	surf_level = is_stencil ? &surf->u.legacy.stencil_level[level] : &surf->u.legacy.level[level];

surf_level->offset = align64(surf->surf_size, 
AddrSurfInfoOut->baseAlign);
-   surf_level->slice_size = AddrSurfInfoOut->sliceSize;
+   surf_level->slice_size_dw = AddrSurfInfoOut->sliceSize / 4;
surf_level->nblk_x = AddrSurfInfoOut->pitch;
surf_level->nblk_y = AddrSurfInfoOut->height;
  
  	switch (AddrSurfInfoOut->tileMode) {

case ADDR_TM_LINEAR_ALIGNED:
surf_level->mode = RADEON_SURF_MODE_LINEAR_ALIGNED;
break;
case ADDR_TM_1D_TILED_THIN1:
surf_level->mode = RADEON_SURF_MODE_1D;
break;
diff --git a/src/amd/common/ac_surface.h b/src/amd/common/ac_surface.h
index 1dc95cd..ec89f6b 100644
--- a/src/amd/common/ac_surface.h
+++ b/src/amd/common/ac_surface.h
@@ -64,21 +64,21 @@ enum radeon_micro_mode {
  /* bits 19 and 20 are reserved for libdrm_radeon, don't use them */
  #define RADEON_SURF_FMASK   (1 << 21)
  #define RADEON_SURF_DISABLE_DCC (1 << 22)
  #define RADEON_SURF_TC_COMPATIBLE_HTILE (1 << 23)
  #define RADEON_SURF_IMPORTED(1 << 24)
  #define RADEON_SURF_OPTIMIZE_FOR_SPACE  (1 << 25)
  #define RADEON_SURF_SHAREABLE   (1 << 26)
  
  struct legacy_surf_level {

  uint64_toffset;
-uint64_tslice_size;
+uint32_tslice_size_dw; /* in dwords; max = 4GB / 4. */
  uint32_tdcc_offset; /* relative offset within DCC mip 
tree */
  uint32_tdcc_fast_clear_size;
  uint16_tnblk_x;
  uint16_tnblk_y;
  enum radeon_surf_mode   mode;
  };
  
  struct legacy_surf_layout {

  unsignedbankw:4;  /* max 8 */
  unsignedbankh:4;  /* max 8 */
diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index b532aa9..fb7bbde 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -1149,25 +1149,25 @@ void radv_GetImageSubresourceLayout(
  
  	if (device->physical_device->rad_info.chip_class >= GFX9) {

pLayout->offset = surface->u.gfx9.offset[level] + 
surface->u.gfx9.surf_slice_size * layer;
pLayout->rowPitch = surface->u.gfx9.surf_pitch * surface->bpe;
pLayout->arrayPitch = surface->u.gfx9.surf_slice_size;
pLayout->depthPitch = surface->u.gfx9.surf_slice_size;
pLayout->size = surface->u.gfx9.surf_slice_size;
if (image->type == VK_IMAGE_TYPE_3D)
pLayout->size *= u_minify(image->info.depth, level);
} else {
-   pLayout->offset = surface->u.legacy.level[level].offset + 
surface->u.legacy.level[level].slice_size * layer;
+   pLayout->offset = surface->u.legacy.level[level].offset + 
surface->u.legacy.level[level].slice_size_dw * 4 * layer;


I believe the maximum slice size in bytes is (with an RGBA32 texture)

16384 * 16384 * 16 = 2^14 * 2^14 * 2^4 = 2^32

The problem with this code is that the multiplication is now performed 
as uint32_t and can therefore wrap around. So an explicit cast to 
64 bits is required.


In practice, I guess this rather becomes an issue with smaller slice 
sizes but larger layer indices. We really need a test case to exercise >= 
4 GB textures...
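
Editorial sketch of the wrap-around described above, assuming a platform where `unsigned int` is 32 bits wide. The function names are hypothetical; the expressions mirror `slice_size_dw * 4 * layer` from the hunk, with and without a widening cast.

```c
#include <stdint.h>

/* Byte offset of a layer computed entirely in 32-bit unsigned arithmetic:
 * the product wraps modulo 2^32 before it is widened by the return. */
static uint64_t layer_offset_wrapping(uint32_t slice_size_dw, uint32_t layer)
{
    return slice_size_dw * 4 * layer;
}

/* Casting the first operand makes the whole product 64-bit. */
static uint64_t layer_offset_widened(uint32_t slice_size_dw, uint32_t layer)
{
    return (uint64_t)slice_size_dw * 4 * layer;
}
```

A slice of 2^30 dwords (2^32 bytes, the maximum computed above) already wraps at layer 1, and a 4 MiB slice wraps once the layer index reaches 1024, which matches the remark that smaller slices with larger layer indices are the more likely trigger.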


Cheers,
Nicolai



pLayout->rowPitch =

Re: [Mesa-dev] [PATCH v2 04/17] etnaviv: Use only DRAW_INSTANCED on GC3000+

2017-11-22 Thread Wladimir J. van der Laan


> I would really like to know what's wrong with this patch, as using the
> new draw command should be fine on GC3000 and we certainly want to
> support instanced drawing at some point.

Did you also apply the "etnaviv: Emit SCALE for vertex attributes" patch?

If so maybe there's something wrong with that one - DRAW_INSTANCED seems to be
ignored if state 00780+attr stays at 0.

Wladimir

Re: [Mesa-dev] [PATCH mesa 0/7] remove upstreamed specs

2017-11-22 Thread Jordan Justen

On 2017-11-22 09:59:34, Eric Engestrom wrote:
> A recent thread [1] made me check our local specs to see which ones were
> upstream. This series removes the ones that are identical upstream
> (modulo "TBD" extension numbers in some cases).

While I don't have too strong of an opinion on it, I think we should
keep a copy of Mesa specs that are in the upstream registry.

I think it makes sense to send a patch to mesa-dev for new Mesa specs
or changes to Mesa specs. Having a copy in docs/specs works well for
that.

I don't think we should have specs under docs/specs that are not
published in the Khronos registry. This creates confusion.
Non-published specs of that nature should probably live on the
developer's branch.

-Jordan

> There are a few more specs left that are upstream, but have typo fixes
> that I'm going to submit to Khronos, and I'll remove the local copies
> once the fixes have been upstreamed:
> - EGL_MESA_drm_image
> - GLX_MESA_copy_sub_buffer
> - GLX_MESA_query_renderer
> - GLX_MESA_release_buffers
> - MESA_pack_invert
> - MESA_window_pos
> - MESA_ycbcr_texture
> 
> [1] https://lists.freedesktop.org/archives/mesa-dev/2017-November/177861.html
> 
> Eric Engestrom (7):
>   docs/specs: remove upstreamed spec EGL_MESA_platform_surfaceless
>   docs/specs: remove upstreamed spec MESA_image_dma_buf_export
>   docs/specs: remove upstreamed spec MESA_shader_integer_functions
>   docs/specs: remove upstreamed spec EXT_shader_integer_mix
>   docs/specs: remove upstreamed spec MESA_agp_offset
>   docs/specs: remove upstreamed spec MESA_pixmap_colormap
>   docs/specs: remove upstreamed spec MESA_set_3dfx_mode
> 
>  docs/specs/EGL_MESA_platform_surfaceless.txt | 120 --
>  docs/specs/EXT_shader_integer_mix.spec   | 138 ---
>  docs/specs/MESA_agp_offset.spec  |  95 -
>  docs/specs/MESA_image_dma_buf_export.txt | 147 
>  docs/specs/MESA_pixmap_colormap.spec |  90 -
>  docs/specs/MESA_set_3dfx_mode.spec   |  85 -
>  docs/specs/MESA_shader_integer_functions.txt | 522 
> ---
>  7 files changed, 1197 deletions(-)
>  delete mode 100644 docs/specs/EGL_MESA_platform_surfaceless.txt
>  delete mode 100644 docs/specs/EXT_shader_integer_mix.spec
>  delete mode 100644 docs/specs/MESA_agp_offset.spec
>  delete mode 100644 docs/specs/MESA_image_dma_buf_export.txt
>  delete mode 100644 docs/specs/MESA_pixmap_colormap.spec
>  delete mode 100644 docs/specs/MESA_set_3dfx_mode.spec
>  delete mode 100644 docs/specs/MESA_shader_integer_functions.txt
> 
> -- 
> Cheers,
>   Eric
> 

Re: [Mesa-dev] [PATCH mesa] genxml: fix assert guards

2017-11-22 Thread Kenneth Graunke

On Wednesday, November 22, 2017 3:11:22 AM PST Eric Engestrom wrote:
> This removes a few hundred warnings on debug builds with asserts off.
> 
> Signed-off-by: Eric Engestrom 
> ---
>  src/intel/genxml/gen_pack_header.py | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)

#ifndef NDEBUG is the right thing to use for guarding asserts.

Reviewed-by: Kenneth Graunke 
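
Editorial sketch of the guard pattern endorsed above, not the actual genxml code: `pack_field` is a hypothetical helper. Values that exist only to feed an assert() are wrapped in `#ifndef NDEBUG`, so they disappear together with the assert and leave no unused-variable warnings when asserts are compiled out.

```c
#include <assert.h>
#include <stdint.h>

/* Place `value` at bit position `start` of a 32-bit word; bits
 * [start, end] are the field it is expected to fit in. */
static uint32_t pack_field(uint32_t value, unsigned start, unsigned end)
{
#ifndef NDEBUG
    /* Only needed by the assert, so compiled out along with it. */
    const unsigned width = end - start + 1;
    const uint64_t max = (UINT64_C(1) << width) - 1;

    assert(value <= max);
#else
    (void)end; /* only consulted by the check above */
#endif

    return value << start;
}
```

Guarding on a project-specific DEBUG macro instead would leave `width` and `max` defined but unused in a debug build that also defines NDEBUG, which is the kind of warning the patch description mentions.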


[Mesa-dev] [PATCH mesa 4/7] docs/specs: remove upstreamed spec EXT_shader_integer_mix

2017-11-22 Thread Eric Engestrom

Spec is now available on Khronos:
https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_shader_integer_mix.txt

Signed-off-by: Eric Engestrom 
---
 docs/specs/EXT_shader_integer_mix.spec | 138 -
 1 file changed, 138 deletions(-)
 delete mode 100644 docs/specs/EXT_shader_integer_mix.spec

diff --git a/docs/specs/EXT_shader_integer_mix.spec b/docs/specs/EXT_shader_integer_mix.spec
deleted file mode 100644
index 92cec64dccfec317ee04..
--- a/docs/specs/EXT_shader_integer_mix.spec
+++ /dev/null
@@ -1,138 +0,0 @@
-Name
-
-EXT_shader_integer_mix
-
-Name Strings
-
-GL_EXT_shader_integer_mix
-
-Contact
-
-Matt Turner (matt.turner 'at' intel.com)
-
-Contributors
-
-Matt Turner, Intel
-Ian Romanick, Intel
-
-Status
-
-Shipping
-
-Version
-
-Last Modified Date: 09/12/2013
-Author Revision:6
-
-Number
-
-TBD
-
-Dependencies
-
-OpenGL 3.0 or OpenGL ES 3.0 is required. This extension interacts with
-GL_ARB_ES3_compatibility.
-
-This extension is written against the OpenGL 4.4 (core) specification
-and the GLSL 4.40 specification.
-
-Overview
-
-GLSL 1.30 (and GLSL ES 3.00) expanded the mix() built-in function to
-operate on a boolean third argument that does not interpolate but
-selects. This extension extends mix() to select between int, uint,
-and bool components.
-
-New Procedures and Functions
-
-None.
-
-New Tokens
-
-None.
-
-Additions to Chapter 8 of the GLSL 4.40 Specification (Built-in Functions)
-
-Modify Section 8.3, Common Functions
-
-Additions to the table listing common built-in functions:
-
-      Syntax                       Description
-      ---------------------------  --------------------------------------------------
-      genIType mix(genIType x,     Selects which vector each returned component comes
-                   genIType y,     from. For a component of a that is false, the
-                   genBType a)     corresponding component of x is returned. For a
-      genUType mix(genUType x,     component of a that is true, the corresponding
-                   genUType y,     component of y is returned.
-                   genBType a)
-      genBType mix(genBType x,
-                   genBType y,
-                   genBType a)
-
-Additions to the AGL/GLX/WGL Specifications
-
-None.
-
-Modifications to The OpenGL Shading Language Specification, Version 4.40
-
-Including the following line in a shader can be used to control the
-language features described in this extension:
-
-  #extension GL_EXT_shader_integer_mix : <behavior>
-
-where <behavior> is as specified in section 3.3.
-
-New preprocessor #defines are added to the OpenGL Shading Language:
-
-  #define GL_EXT_shader_integer_mix1
-
-Interactions with ARB_ES3_compatibility
-
-On desktop implementations that support ARB_ES3_compatibility,
-GL_EXT_shader_integer_mix can be enabled (and the new functions
-used) in shaders declared with '#version 300 es'.
-
-GLX Protocol
-
-None.
-
-Errors
-
-None.
-
-New State
-
-None.
-
-New Implementation Dependent State
-
-None.
-
-Issues
-
-1) Should we allow linear interpolation of integers via a non-boolean
-   third component?
-
-RESOLVED: No.
-
-2) Should we allow mix() to select between boolean components?
-
-RESOLVED: Yes. Implementing the same functionality using casts would be
-possible but ugly.
-
-Revision History
-
-Rev.  Date        Author    Changes
-----  ----------  --------  ---------------------------------------------
-  6   09/12/2013  idr       After discussions in Khronos, change vendor
-                            prefix to EXT.
-
-  5   09/09/2013  idr       Add ARB_ES3_compatibility interaction.
-
-  4   09/06/2013  mattst88  Allow extension on OpenGL ES 3.0.
-
-  3   08/28/2013  mattst88  Add #extension/#define changes.
-
-  2   08/26/2013  mattst88  Change vendor prefix to MESA. Add mix() that
-                            selects between boolean components.
-  1   08/26/2013  mattst88  Initial revision
-- 
Cheers,
  Eric
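
For readers skimming the removed spec, the selection semantics of the added mix() overloads can be sketched in plain C. This is only a model of the behaviour described in the spec's table (false selects x, true selects y, per component); the names mix_int and mix_ivec4 are made up for illustration and are not Mesa functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Scalar model of mix(x, y, a) with a boolean selector:
 * a == false returns x, a == true returns y. No interpolation. */
static inline int32_t
mix_int(int32_t x, int32_t y, bool a)
{
   return a ? y : x;
}

/* Component-wise model of the ivec4 overload. */
static void
mix_ivec4(const int32_t x[4], const int32_t y[4],
          const bool a[4], int32_t out[4])
{
   for (int i = 0; i < 4; i++)
      out[i] = mix_int(x[i], y[i], a[i]);
}
```

The uint and bool overloads follow the same pattern with a different element type.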


[Mesa-dev] [PATCH mesa 5/7] docs/specs: remove upstreamed spec MESA_agp_offset

2017-11-22 Thread Eric Engestrom

Spec is now available on Khronos:
https://www.khronos.org/registry/OpenGL/extensions/MESA/GLX_MESA_agp_offset.txt

Signed-off-by: Eric Engestrom 
---
 docs/specs/MESA_agp_offset.spec | 95 -
 1 file changed, 95 deletions(-)
 delete mode 100644 docs/specs/MESA_agp_offset.spec

diff --git a/docs/specs/MESA_agp_offset.spec b/docs/specs/MESA_agp_offset.spec
deleted file mode 100644
index 06e1d902edd31740a1ea..
--- a/docs/specs/MESA_agp_offset.spec
+++ /dev/null
@@ -1,95 +0,0 @@
-Name
-
-MESA_agp_offset
-
-Name Strings
-
-GLX_MESA_agp_offset
-
-Contact
-
-Brian Paul, Tungsten Graphics, Inc. (brian.paul 'at' tungstengraphics.com)
-Keith Whitwell, Tungsten Graphics, Inc.  (keith 'at' tungstengraphics.com)
-
-Status
-
-Shipping (Mesa 4.0.4 and later.  Only implemented in particular
-XFree86/DRI drivers.)
-
-Version
-
-1.0
-
-Number
-
-TBD
-
-Dependencies
-
-OpenGL 1.0 or later is required
-GLX_NV_vertex_array_range is required.
-This extensions is written against the OpenGL 1.4 Specification.
-
-Overview
-
-This extensions provides a way to convert pointers in an AGP memory
-region into byte offsets into the AGP aperture.
-Note, this extension depends on GLX_NV_vertex_array_range, for which
-no real specification exists.  See GL_NV_vertex_array_range for more
-information.
-
-IP Status
-
-None
-
-Issues
-
-None
-
-New Procedures and Functions
-
-unsigned int glXGetAGPOffsetMESA( const void *pointer )
-
-New Tokens
-
-None
-
-Additions to the OpenGL 1.4 Specification
-
-None
-
-Additions to Chapter 3 the GLX 1.4 Specification (Functions and Errors)
-
-Add a new section, 3.6 as follows:
-
-3.6 AGP Memory Access
-
-On "PC" computers, AGP memory can be allocated with glXAllocateMemoryNV
-and freed with glXFreeMemoryNV.  Sometimes it's useful to know where a
-block of AGP memory is located with respect to the start of the AGP
-aperture.  The function
-
-GLuint glXGetAGPOffsetMESA( const GLvoid *pointer )
-
-Returns the offset of the given memory block from the start of AGP
-memory in basic machine units (i.e. bytes).  If pointer is invalid
-the value ~0 will be returned.
-
-GLX Protocol
-
-None.  This is a client side-only extension.
-
-Errors
-
-glXGetAGPOffsetMESA will return ~0 if the pointer does not point to
-an AGP memory region.
-
-New State
-
-None
-
-Revision History
-
-20 September 2002 - Initial draft
-2 October 2002 - finished GLX chapter 3 additions
-27 July 2004 - use unsigned int instead of GLuint, void instead of GLvoid
-- 
Cheers,
  Eric
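
The contract of glXGetAGPOffsetMESA described in the removed spec (byte offset from the start of the aperture, or ~0 for a pointer outside it) can be modelled without any GLX call. The struct and function below are hypothetical stand-ins written for this note, not part of Mesa or GLX.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a mapped AGP region; not a Mesa structure. */
struct agp_region {
   const unsigned char *base;  /* start of the mapped aperture */
   size_t size;                /* size in bytes */
};

/* Returns the byte offset of `pointer` inside the region, or ~0u if the
 * pointer does not fall inside it, mirroring the spec's error value. */
static unsigned int
agp_offset_of(const struct agp_region *r, const void *pointer)
{
   const uintptr_t p = (uintptr_t) pointer;
   const uintptr_t b = (uintptr_t) r->base;

   if (p < b || p - b >= r->size)
      return ~0u;

   return (unsigned int) (p - b);
}
```

Callers therefore have to compare the result against ~0 before using it as an offset.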


[Mesa-dev] [PATCH mesa 6/7] docs/specs: remove upstreamed spec MESA_pixmap_colormap

2017-11-22 Thread Eric Engestrom

Spec is now available on Khronos:
https://www.khronos.org/registry/OpenGL/extensions/MESA/GLX_MESA_pixmap_colormap.txt

Signed-off-by: Eric Engestrom 
---
 docs/specs/MESA_pixmap_colormap.spec | 90 
 1 file changed, 90 deletions(-)
 delete mode 100644 docs/specs/MESA_pixmap_colormap.spec

diff --git a/docs/specs/MESA_pixmap_colormap.spec b/docs/specs/MESA_pixmap_colormap.spec
deleted file mode 100644
index fb0b441cc58aee974ff8..
--- a/docs/specs/MESA_pixmap_colormap.spec
+++ /dev/null
@@ -1,90 +0,0 @@
-Name
-
-MESA_pixmap_colormap
-
-Name Strings
-
-GLX_MESA_pixmap_colormap
-
-Contact
-
-Brian Paul (brian.paul 'at' tungstengraphics.com)
-
-Status
-
-Shipping since Mesa 1.2.8 in May, 1996.
-
-Version
-
-Last Modified Date:  8 June 2000
-
-Number
-
-216
-
-Dependencies
-
-OpenGL 1.0 or later is required.
-GLX 1.0 or later is required.
-
-Overview
-
-Since Mesa allows RGB rendering into drawables with PseudoColor,
-StaticColor, GrayScale and StaticGray visuals, Mesa needs a colormap
-in order to compute pixel values during rendering.
-
-The colormap associated with a window can be queried with normal
-Xlib functions but there is no colormap associated with pixmaps.
-
-The glXCreateGLXPixmapMESA function is an alternative to glXCreateGLXPixmap
-which allows specification of a colormap.
-
-IP Status
-
-Open-source; freely implementable.
-
-Issues
-
-None.
-
-New Procedures and Functions
-
-GLXPixmap glXCreateGLXPixmapMESA( Display *dpy, XVisualInfo *visual,
- Pixmap pixmap, Colormap cmap );
-
-New Tokens
-
-None.
-
-Additions to Chapter 3 of the GLX 1.3 Specification (Functions and Errors)
-
-Add to section 3.4.2 Off Screen Rendering
-
-The Mesa implementation of GLX allows RGB rendering into X windows and
-pixmaps of any visual class, not just TrueColor or DirectColor.  In order
-to compute pixel values from RGB values Mesa requires a colormap.
-
-The function
-
-   GLXPixmap glXCreateGLXPixmapMESA( Display *dpy, XVisualInfo *visual,
- Pixmap pixmap, Colormap cmap );
-
-allows one to create a GLXPixmap with a specific colormap.  The image
-rendered into the pixmap may then be copied to a window (which uses the
-same colormap and visual) with the expected results.
-
-GLX Protocol
-
-None since this is a client-side extension.
-
-Errors
-
-None.
-
-New State
-
-None.
-
-Revision History
-
-8 June 2000 - initial specification
-- 
Cheers,
  Eric


[Mesa-dev] [PATCH mesa 7/7] docs/specs: remove upstreamed spec MESA_set_3dfx_mode

2017-11-22 Thread Eric Engestrom

Spec is now available on Khronos:
https://www.khronos.org/registry/OpenGL/extensions/MESA/GLX_MESA_set_3dfx_mode.txt

Signed-off-by: Eric Engestrom 
---
 docs/specs/MESA_set_3dfx_mode.spec | 85 --
 1 file changed, 85 deletions(-)
 delete mode 100644 docs/specs/MESA_set_3dfx_mode.spec

diff --git a/docs/specs/MESA_set_3dfx_mode.spec b/docs/specs/MESA_set_3dfx_mode.spec
deleted file mode 100644
index 06d97ca021feb846c00f..
--- a/docs/specs/MESA_set_3dfx_mode.spec
+++ /dev/null
@@ -1,85 +0,0 @@
-Name
-
-MESA_set_3dfx_mode
-
-Name Strings
-
-GLX_MESA_set_3dfx_mode
-
-Contact
-
-Brian Paul (brian.paul 'at' tungstengraphics.com)
-
-Status
-
-Shipping since Mesa 2.6 in February, 1998.
-
-Version
-
-Last Modified Date:  8 June 2000
-
-Number
-
-218
-
-Dependencies
-
-OpenGL 1.0 or later is required.
-GLX 1.0 or later is required.
-
-Overview
-
-The Mesa Glide driver allows full-screen rendering or rendering into
-an X window.  The glXSet3DfxModeMESA() function allows an application
-to switch between full-screen and windowed rendering.
-
-IP Status
-
-Open-source; freely implementable.
-
-Issues
-
-None.
-
-New Procedures and Functions
-
-GLboolean glXSet3DfxModeMESA( GLint mode );
-
-New Tokens
-
-GLX_3DFX_WINDOW_MODE_MESA  0x1
-GLX_3DFX_FULLSCREEN_MODE_MESA   0x2
-
-Additions to Chapter 3 of the GLX 1.3 Specification (Functions and Errors)
-
-The Mesa Glide device driver allows either rendering in full-screen
-mode or rendering into an X window.  An application can switch between
-full-screen and window rendering with the command:
-
-   GLboolean glXSet3DfxModeMESA( GLint mode );
-
-<mode> may either be GLX_3DFX_WINDOW_MODE_MESA to indicate window
-rendering or GLX_3DFX_FULLSCREEN_MODE_MESA to indicate full-screen mode.
-
-GL_TRUE is returned if <mode> is valid and the operation completed
-normally.  GL_FALSE is returned if <mode> is invalid or if the Glide
-driver is not being used.
-
-Note that only one drawable and context can be created at any given
-time with the Mesa Glide driver.
-
-GLX Protocol
-
-None since this is a client-side extension.
-
-Errors
-
-None.
-
-New State
-
-None.
-
-Revision History
-
-8 June 2000 - initial specification
-- 
Cheers,
  Eric


[Mesa-dev] [PATCH mesa 3/7] docs/specs: remove upstreamed spec MESA_shader_integer_functions

2017-11-22 Thread Eric Engestrom

Spec is now available on Khronos:
https://www.khronos.org/registry/OpenGL/extensions/MESA/MESA_shader_integer_functions.txt

Signed-off-by: Eric Engestrom 
---
 docs/specs/MESA_shader_integer_functions.txt | 522 ---
 1 file changed, 522 deletions(-)
 delete mode 100644 docs/specs/MESA_shader_integer_functions.txt

diff --git a/docs/specs/MESA_shader_integer_functions.txt b/docs/specs/MESA_shader_integer_functions.txt
deleted file mode 100644
index 9fcc9b4c5da012be2a24..
--- a/docs/specs/MESA_shader_integer_functions.txt
+++ /dev/null
@@ -1,522 +0,0 @@
-Name
-
-MESA_shader_integer_functions
-
-Name Strings
-
-GL_MESA_shader_integer_functions
-
-Contact
-
-Ian Romanick 
-
-Contributors
-
-All the contributors of GL_ARB_gpu_shader5
-
-Status
-
-Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later
-
-Version
-
-Version 3, March 31, 2017
-
-Number
-
-OpenGL Extension #495
-
-Dependencies
-
-This extension is written against the OpenGL 3.2 (Compatibility Profile)
-Specification.
-
-This extension is written against Version 1.50 (Revision 09) of the OpenGL
-Shading Language Specification.
-
-GLSL 1.30 (OpenGL) or GLSL ES 3.00 (OpenGL ES) is required.
-
-This extension interacts with ARB_gpu_shader5.
-
-This extension interacts with ARB_gpu_shader_fp64.
-
-This extension interacts with NV_gpu_shader5.
-
-Overview
-
-GL_ARB_gpu_shader5 extends GLSL in a number of useful ways.  Much of this
-added functionality requires significant hardware support.  There are many
-aspects, however, that can be easily implmented on any GPU with "real"
-integer support (as opposed to simulating integers using floating point
-calculations).
-
-This extension provides a set of new features to the OpenGL Shading
-Language to support capabilities of these GPUs, extending the
-capabilities of version 1.30 of the OpenGL Shading Language and version
-3.00 of the OpenGL ES Shading Language.  Shaders using the new
-functionality provided by this extension should enable this
-functionality via the construct
-
-  #extension GL_MESA_shader_integer_functions : require   (or enable)
-
-This extension provides a variety of new features for all shader types,
-including:
-
-  * support for implicitly converting signed integer types to unsigned
-types, as well as more general implicit conversion and function
-overloading infrastructure to support new data types introduced by
-other extensions;
-
-  * new built-in functions supporting:
-
-* splitting a floating-point number into a significand and exponent
-  (frexp), or building a floating-point number from a significand and
-  exponent (ldexp);
-
-* integer bitfield manipulation, including functions to find the
-  position of the most or least significant set bit, count the number
-  of one bits, and bitfield insertion, extraction, and reversal;
-
-* extended integer precision math, including add with carry, subtract
-  with borrow, and extenended multiplication;
-
-The resulting extension is a strict subset of GL_ARB_gpu_shader5.
-
-IP Status
-
-No known IP claims.
-
-New Procedures and Functions
-
-None
-
-New Tokens
-
-None
-
-Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
-(OpenGL Operation)
-
-None.
-
-Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
-(Rasterization)
-
-None.
-
-Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
-(Per-Fragment Operations and the Frame Buffer)
-
-None.
-
-Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
-(Special Functions)
-
-None.
-
-Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
-(State and State Requests)
-
-None.
-
-Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
-Specification (Invariance)
-
-None.
-
-Additions to the AGL/GLX/WGL Specifications
-
-None.
-
-Modifications to The OpenGL Shading Language Specification, Version 1.50
-(Revision 09)
-
-Including the following line in a shader can be used to control the
-language features described in this extension:
-
-  #extension GL_MESA_shader_integer_functions : <behavior>
-
-where <behavior> is as specified in section 3.3.
-
-New preprocessor #defines are added to the OpenGL Shading Language:
-
-  #define GL_MESA_shader_integer_functions1
-
-
-Modify Section 4.1.10, Implicit Conversions, p. 27
-
-(modify table of implicit conversions)
-
-                            Can be implicitly
-Type of expression          converted to
----------------------       -------------------
-int                         uint, float
-ivec2
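
The overview of the removed spec lists extended-precision integer math and bit scanning among the new built-ins. A minimal C sketch of what those operations compute is given below, assuming the semantics of the corresponding GLSL built-ins (uaddCarry, usubBorrow, umulExtended, findMSB) on 32-bit unsigned values. The C function names are invented for this note.

```c
#include <stdint.h>

/* Add with carry: returns (x + y) mod 2^32, carry is 1 on overflow. */
static uint32_t
uadd_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
   const uint32_t sum = x + y;
   *carry = sum < x;
   return sum;
}

/* Subtract with borrow: returns (x - y) mod 2^32, borrow is 1 if x < y. */
static uint32_t
usub_borrow(uint32_t x, uint32_t y, uint32_t *borrow)
{
   *borrow = x < y;
   return x - y;
}

/* Extended multiplication: full 64-bit product split into two halves. */
static void
umul_extended(uint32_t x, uint32_t y, uint32_t *msb, uint32_t *lsb)
{
   const uint64_t product = (uint64_t) x * y;
   *msb = (uint32_t) (product >> 32);
   *lsb = (uint32_t) product;
}

/* Position of the most significant set bit, or -1 when no bit is set. */
static int
find_msb_uint(uint32_t v)
{
   int pos = -1;
   while (v) {
      pos++;
      v >>= 1;
   }
   return pos;
}
```

None of this needs more than native 32-bit integer support, which matches the spec's argument that these functions are cheap on any GPU with real integers.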

[Mesa-dev] [PATCH mesa 1/7] docs/specs: remove upstreamed spec EGL_MESA_platform_surfaceless

2017-11-22 Thread Eric Engestrom

Spec is now available on Khronos:
https://www.khronos.org/registry/EGL/extensions/MESA/EGL_MESA_platform_surfaceless.txt

Signed-off-by: Eric Engestrom 
---
 docs/specs/EGL_MESA_platform_surfaceless.txt | 120 ---
 1 file changed, 120 deletions(-)
 delete mode 100644 docs/specs/EGL_MESA_platform_surfaceless.txt

diff --git a/docs/specs/EGL_MESA_platform_surfaceless.txt b/docs/specs/EGL_MESA_platform_surfaceless.txt
deleted file mode 100644
index 871ee509c55e13c55987..
--- a/docs/specs/EGL_MESA_platform_surfaceless.txt
+++ /dev/null
@@ -1,120 +0,0 @@
-Name
-
-MESA_platform_surfaceless
-
-Name Strings
-
-EGL_MESA_platform_surfaceless
-
-Contributors
-
-Chad Versace 
-Haixia Shi 
-Stéphane Marchesin 
-Zach Reizner 
-Gurchetan Singh 
-
-Contacts
-
-Chad Versace 
-
-Status
-
-DRAFT
-
-Version
-
-Version 2, 2016-10-13
-
-Number
-
-EGL Extension #TODO
-
-Extension Type
-
-EGL client extension
-
-Dependencies
-
-Requires EGL 1.5 or later; or EGL 1.4 with EGL_EXT_platform_base.
-
-This extension is written against the EGL 1.5 Specification (draft
-20140122).
-
-This extension interacts with EGL_EXT_platform_base as follows. If the
-implementation supports EGL_EXT_platform_base, then text regarding
-eglGetPlatformDisplay applies also to eglGetPlatformDisplayEXT;
-eglCreatePlatformWindowSurface to eglCreatePlatformWindowSurfaceEXT; and
-eglCreatePlatformPixmapSurface to eglCreatePlatformPixmapSurfaceEXT.
-
-Overview
-
-This extension defines a new EGL platform, the "surfaceless" platform. This
-platfom's defining property is that it has no native surfaces, and hence
-neither eglCreatePlatformWindowSurface nor eglCreatePlatformPixmapSurface
-can be used. The platform is independent of any native window system.
-
-The platform's intended use case is for enabling OpenGL and OpenGL ES
-applications on systems where no window system exists. However, the
-platform's permitted usage is not restricted to this case.  Since the
-platform is independent of any native window system, it may also be used on
-systems where a window system is present.
-
-New Types
-
-None
-
-New Procedures and Functions
-
-None
-
-New Tokens
-
-Accepted as the <platform> argument of eglGetPlatformDisplay:
-
-EGL_PLATFORM_SURFACELESS_MESA   0x31DD
-
-Additions to the EGL Specification
-
-None.
-
-New Behavior
-
-To determine if the EGL implementation supports this extension, clients
-should query the EGL_EXTENSIONS string of EGL_NO_DISPLAY.
-
-To obtain an EGLDisplay on the surfaceless platform, call
-eglGetPlatformDisplay with <platform> set to EGL_PLATFORM_SURFACELESS_MESA.
-The <native_display> parameter must be EGL_DEFAULT_DISPLAY.
-
-eglCreatePlatformWindowSurface fails when called with a <dpy> that
-belongs to the surfaceless platform. It returns EGL_NO_SURFACE and
-generates EGL_BAD_NATIVE_WINDOW. The justification for this unconditional
-failure is that the surfaceless platform has no native windows, and
-therefore the <native_window> parameter is always invalid.
-
-Likewise, eglCreatePlatformPixmapSurface also fails when called with a
-<dpy> that belongs to the surfaceless platform.  It returns
-EGL_NO_SURFACE and generates EGL_BAD_NATIVE_PIXMAP.
-
-The surfaceless platform imposes no platform-specific restrictions on the
-creation of pbuffers, as eglCreatePbufferSurface has no native surface
-parameter.  Specifically, if the EGLDisplay advertises an EGLConfig whose
-EGL_SURFACE_TYPE attribute contains EGL_PBUFFER_BIT, then the EGLDisplay
-permits the creation of pbuffers with that config.
-
-Issues
-
-None.
-
-Revision History
-
-Version 2, 2016-10-13 (Chad Versace)
-- Assign enum values
-- Define interfactions with EGL 1.4 and EGL_EXT_platform_base.
-- Add Gurchetan as contributor, as he implemented the pbuffer support.
-
-Version 1, 2016-09-23 (Chad Versace)
-- Initial version
-- Posted for review at
-  https://lists.freedesktop.org/archives/mesa-dev/2016-September/129549.html
-- 
Cheers,
  Eric
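
The "New Behavior" section of the removed spec tells clients to look for the extension in the client extension string. A naive strstr() can match a longer name that merely contains the one being searched for, so a whole-token check is usually wanted. The helper below is a sketch written for this note (has_extension is not an EGL or Mesa function).

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if `name` appears as a complete, space-delimited token
 * in the extension string `extensions`. A NULL string yields false. */
static bool
has_extension(const char *extensions, const char *name)
{
   const size_t len = strlen(name);
   const char *p = extensions;

   if (len == 0)
      return false;

   while (p && (p = strstr(p, name)) != NULL) {
      const bool starts = (p == extensions) || (p[-1] == ' ');
      const bool ends = (p[len] == ' ') || (p[len] == '\0');
      if (starts && ends)
         return true;
      p += len;
   }
   return false;
}
```

In a real client the string would come from eglQueryString(EGL_NO_DISPLAY, EGL_EXTENSIONS), and on success the display would be obtained with eglGetPlatformDisplay(EGL_PLATFORM_SURFACELESS_MESA, EGL_DEFAULT_DISPLAY, NULL).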


[Mesa-dev] [PATCH mesa 2/7] docs/specs: remove upstreamed spec MESA_image_dma_buf_export

2017-11-22 Thread Eric Engestrom

Spec is now available on Khronos:
https://www.khronos.org/registry/EGL/extensions/MESA/EGL_MESA_image_dma_buf_export.txt

Signed-off-by: Eric Engestrom 
---
 docs/specs/MESA_image_dma_buf_export.txt | 147 ---
 1 file changed, 147 deletions(-)
 delete mode 100644 docs/specs/MESA_image_dma_buf_export.txt

diff --git a/docs/specs/MESA_image_dma_buf_export.txt b/docs/specs/MESA_image_dma_buf_export.txt
deleted file mode 100644
index cc9497e437d449a56abc..
--- a/docs/specs/MESA_image_dma_buf_export.txt
+++ /dev/null
@@ -1,147 +0,0 @@
-Name
-
-MESA_image_dma_buf_export
-
-Name Strings
-
-EGL_MESA_image_dma_buf_export
-
-Contributors
-
-Dave Airlie
-
-Contact
-
-Dave Airlie (airlied 'at' redhat 'dot' com)
-
-Status
-
-Complete, shipping.
-
-Version
-
-Version 3, May 5, 2015
-
-Number
-
-EGL Extension #87
-
-Dependencies
-
-Requires EGL 1.4 or later.  This extension is written against the
-wording of the EGL 1.4 specification.
-
-EGL_KHR_base_image is required.
-
-The EGL implementation must be running on a Linux kernel supporting the
-dma_buf buffer sharing mechanism.
-
-Overview
-
-This extension provides entry points for integrating EGLImage with the
-dma-buf infrastructure.  The extension allows creating a Linux dma_buf
-file descriptor or multiple file descriptors, in the case of multi-plane
-YUV image, from an EGLImage.
-
-It is designed to provide the complementary functionality to
-EGL_EXT_image_dma_buf_import.
-
-IP Status
-
-Open-source; freely implementable.
-
-New Types
-
-This extension uses the 64-bit unsigned integer type EGLuint64KHR
-first introduced by the EGL_KHR_stream extension, but does not
-depend on that extension. The typedef may be reproduced separately
-for this extension, if not already present in eglext.h.
-
-typedef khronos_uint64_t EGLuint64KHR;
-
-New Procedures and Functions
-
-EGLBoolean eglExportDMABUFImageQueryMESA(EGLDisplay dpy,
-  EGLImageKHR image,
- int *fourcc,
- int *num_planes,
- EGLuint64KHR *modifiers);
-
-EGLBoolean eglExportDMABUFImageMESA(EGLDisplay dpy,
-EGLImageKHR image,
-int *fds,
-   EGLint *strides,
-   EGLint *offsets);
-
-New Tokens
-
-None
-
-
-Additions to the EGL 1.4 Specification:
-
-To mirror the import extension, this extension attempts to return
-enough information to enable an exported dma-buf to be imported
-via eglCreateImageKHR and EGL_LINUX_DMA_BUF_EXT token.
-
-Retrieving the information is a two step process, so two APIs
-are required.
-
-The first entrypoint
-   EGLBoolean eglExportDMABUFImageQueryMESA(EGLDisplay dpy,
-  EGLImageKHR image,
- int *fourcc,
- int *num_planes,
- EGLuint64KHR *modifiers);
-
-is used to retrieve the pixel format of the buffer, as specified by
-drm_fourcc.h, the number of planes in the image and the Linux
-drm modifiers. <fourcc>, <num_planes> and <modifiers> may be NULL,
-in which case no value is retrieved.
-
-The second entrypoint retrieves the dma_buf file descriptors,
-strides and offsets for the image. The caller should pass
-arrays sized according to the num_planes values retrieved previously.
-Passing arrays of the wrong size will have undefined results.
-If the number of fds is less than the number of planes, then
-subsequent fd slots should contain -1.
-
-EGLBoolean eglExportDMABUFImageMESA(EGLDisplay dpy,
- EGLImageKHR image,
-int *fds,
- EGLint *strides,
- EGLint *offsets);
-
-<fds>, <strides>, <offsets> can be NULL if the infomatation isn't
-required by the caller.
-
-Issues
-
-1. Should the API look more like an attribute getting API?
-
-ANSWER: No, from a user interface pov, having to iterate across calling
-the API up to 12 times using attribs seems like the wrong solution.
-
-2. Should the API take a plane and just get the fd/stride/offset for that
-   plane?
-
-ANSWER: UNKNOWN,this might be just as valid an API.
-
-3. Does ownership of the file descriptor remain with the app?
-
-ANSWER: Yes, the app is responsible for closing any fds retrieved.
-
-4. If number of planes and number of fds differ what should we do?
-
-ANSWER: Return -1 for the secondary slots, as this avoids having
-to dup the fd extra times to make the interface sane.
-
-Revision History
-
-Version 3, May, 2015
-Just use the KHR 64-bit type.
-Version
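
Issues 3 and 4 of the removed spec say that the application owns the exported file descriptors and that unused slots are filled with -1. A cleanup helper following those two rules might look like the sketch below. The function name is hypothetical, and the fds array is assumed to have been filled by eglExportDMABUFImageMESA with num_planes entries.

```c
#include <unistd.h>

/* Closes every valid descriptor in fds[0..num_planes-1], skipping the
 * -1 placeholders, and resets each slot to -1 so a second call is safe.
 * Returns the number of descriptors that were closed successfully. */
static int
close_exported_fds(int *fds, int num_planes)
{
   int closed = 0;

   for (int i = 0; i < num_planes; i++) {
      if (fds[i] < 0)
         continue;
      if (close(fds[i]) == 0)
         closed++;
      fds[i] = -1;
   }
   return closed;
}
```

Resetting the slots after closing is a design choice of this sketch, not something the spec requires.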

[Mesa-dev] [PATCH mesa 0/7] remove upstreamed specs

2017-11-22 Thread Eric Engestrom

A recent thread [1] made me check our local specs to see which ones were
upstream. This series removes the ones that are identical upstream
(modulo "TBD" extension numbers in some cases).

There are a few more specs left that are upstream, but have typo fixes
that I'm going to submit to Khronos, and I'll remove the local copies
once the fixes have been upstreamed:
- EGL_MESA_drm_image
- GLX_MESA_copy_sub_buffer
- GLX_MESA_query_renderer
- GLX_MESA_release_buffers
- MESA_pack_invert
- MESA_window_pos
- MESA_ycbcr_texture

[1] https://lists.freedesktop.org/archives/mesa-dev/2017-November/177861.html

Eric Engestrom (7):
  docs/specs: remove upstreamed spec EGL_MESA_platform_surfaceless
  docs/specs: remove upstreamed spec MESA_image_dma_buf_export
  docs/specs: remove upstreamed spec MESA_shader_integer_functions
  docs/specs: remove upstreamed spec EXT_shader_integer_mix
  docs/specs: remove upstreamed spec MESA_agp_offset
  docs/specs: remove upstreamed spec MESA_pixmap_colormap
  docs/specs: remove upstreamed spec MESA_set_3dfx_mode

 docs/specs/EGL_MESA_platform_surfaceless.txt | 120 --
 docs/specs/EXT_shader_integer_mix.spec   | 138 ---
 docs/specs/MESA_agp_offset.spec  |  95 -
 docs/specs/MESA_image_dma_buf_export.txt | 147 
 docs/specs/MESA_pixmap_colormap.spec |  90 -
 docs/specs/MESA_set_3dfx_mode.spec   |  85 -
 docs/specs/MESA_shader_integer_functions.txt | 522 ---
 7 files changed, 1197 deletions(-)
 delete mode 100644 docs/specs/EGL_MESA_platform_surfaceless.txt
 delete mode 100644 docs/specs/EXT_shader_integer_mix.spec
 delete mode 100644 docs/specs/MESA_agp_offset.spec
 delete mode 100644 docs/specs/MESA_image_dma_buf_export.txt
 delete mode 100644 docs/specs/MESA_pixmap_colormap.spec
 delete mode 100644 docs/specs/MESA_set_3dfx_mode.spec
 delete mode 100644 docs/specs/MESA_shader_integer_functions.txt

-- 
Cheers,
  Eric


Re: [Mesa-dev] [PATCH] radeonsi: try flushing unflushed fences in si_fence_finish even when timeout == 0

2017-11-22 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Wed, Nov 22, 2017 at 5:52 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> Under certain conditions, waiting on a GL sync object should act like
> a flush, regardless of the timeout.
>
> Portal 2, CS:GO, and presumably other Source engine games rely on this
> behavior and hang during loading without this fix.
>
> Fixes: bc65dcab3bc4 ("radeonsi: avoid syncing the driver thread in si_fence_finish")
> ---
>  src/gallium/drivers/radeonsi/si_fence.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_fence.c b/src/gallium/drivers/radeonsi/si_fence.c
> index 9d6bcfe1027..b835ed649ee 100644
> --- a/src/gallium/drivers/radeonsi/si_fence.c
> +++ b/src/gallium/drivers/radeonsi/si_fence.c
> @@ -184,36 +184,36 @@ static void si_fine_fence_set(struct si_context *ctx,
>  static boolean si_fence_finish(struct pipe_screen *screen,
>struct pipe_context *ctx,
>struct pipe_fence_handle *fence,
>uint64_t timeout)
>  {
> struct radeon_winsys *rws = ((struct r600_common_screen*)screen)->ws;
> struct si_multi_fence *rfence = (struct si_multi_fence *)fence;
> int64_t abs_timeout = os_time_get_absolute_timeout(timeout);
>
> if (!util_queue_fence_is_signalled(&rfence->ready)) {
> -   if (!timeout)
> -   return false;
> -
> if (rfence->tc_token) {
> /* Ensure that si_flush_from_st will be called for
>  * this fence, but only if we're in the API thread
>  * where the context is current.
>  *
>  * Note that the batch containing the flush may already
>  * be in flight in the driver thread, so the fence
>  * may not be ready yet when this call returns.
>  */
> threaded_context_flush(ctx, rfence->tc_token,
>timeout == 0);
> }
>
> +   if (!timeout)
> +   return false;
> +
> if (timeout == PIPE_TIMEOUT_INFINITE) {
> util_queue_fence_wait(&rfence->ready);
> } else {
> if (!util_queue_fence_wait_timeout(&rfence->ready, abs_timeout))
> return false;
> }
>
> if (timeout && timeout != PIPE_TIMEOUT_INFINITE) {
> int64_t time = os_time_get_nano();
> timeout = abs_timeout > time ? abs_timeout - time : 0;
> --
> 2.11.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
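
The effect of the reordering in the quoted patch can be shown with a small stand-alone model. Everything below is a simplification written for this note: struct model_fence and model_fence_finish are hypothetical and only mimic the control flow of si_fence_finish, with a boolean standing in for the deferred flush.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy fence: not the radeonsi structure, just enough state to show
 * the ordering of "request flush" versus "return on zero timeout". */
struct model_fence {
   bool ready;            /* already signalled? */
   bool has_token;        /* models rfence->tc_token being set */
   bool flush_requested;  /* set when the deferred flush is kicked */
};

static bool
model_fence_finish(struct model_fence *f, uint64_t timeout)
{
   if (!f->ready) {
      /* Patched order: kick the flush first, whatever the timeout. */
      if (f->has_token)
         f->flush_requested = true;

      /* A zero timeout is a poll, so report "not done" without waiting. */
      if (timeout == 0)
         return false;

      /* Stands in for the real wait completing within the timeout. */
      f->ready = true;
   }
   return true;
}
```

With the pre-patch order (early return before the token check), an application polling with a zero timeout would never trigger the flush, which is consistent with the hang described in the commit message.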

Re: [Mesa-dev] [RFC] gallium/u_blitter: support blitting PIPE_BUFFERs

2017-11-22 Thread Roland Scheidegger

On 22.11.2017 at 16:56, Rob Clark wrote:
> So, I could potentially do this for a5xx, which actually has a sort of
> blit engine.  But for earlier gen's, everything is a glDraw(), and I
> don't want to duplicate the functionality of u_blitter.
> 
> I suppose I could bypass util_blitter_blit() and use
> util_blitter_blit_generic() directly..

I still don't really see the need for such gross hacks.
gallium fully supports rendering to buffers (most drivers don't,
llvmpipe does), and texturing from them. Therefore, I think you could
just use a proper buffer sampler view template (using the offset/size
fields instead of level/layer). Likewise, don't use the level/layer
fields for surface views, but the first_element/last_element fields
(looks like right now you're actually relying on the blit code using the
wrong fields for buffers...).
You just need to handle sampling from buffers and rendering to buffers
properly in your driver, and make sure the blitter code handles the
restrictions appropriately (e.g. must use txf, not ordinary texture
lookups).

In any case, you're still restricted to tiny buffers: while the
first_element in surface views can give you access to large buffers (no
idea if your hw could comply with that...), you're still restricted to
16k or so (whatever your viewport limits are) wrt actually writing in
one shot. (Technically, you could get 16 times more than that limit
since buffers are compatible with any format, so you can use rgba32ui,
but then your offsets and sizes need to be aligned, and you can't
communicate the format to the blitter code.)
If you want to support large buffers, you probably need to wrap them in
a fake (untiled) 2d resource.
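
To put numbers on the limits above, here is a small sketch of the arithmetic, assuming a 16384-pixel maximum render width (the "16k or so" figure; the real value is driver-specific). Both helpers are hypothetical and exist only to illustrate the calculation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytes that one buffer blit can cover when the buffer is rendered as a
 * single row of `max_width_px` pixels of `bytes_per_px` bytes each. */
static uint64_t
max_bytes_per_blit(uint32_t max_width_px, uint32_t bytes_per_px)
{
   return (uint64_t) max_width_px * bytes_per_px;
}

/* A wider format is only usable if both the offset and the size of the
 * copy are multiples of the pixel size. */
static bool
can_use_wide_format(uint64_t offset, uint64_t size, uint32_t bytes_per_px)
{
   return offset % bytes_per_px == 0 && size % bytes_per_px == 0;
}
```

With these assumptions an R8 view covers 16384 bytes per blit and an RGBA32UI view covers 262144 bytes, but only for copies whose offset and size are 16-byte aligned.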

Roland

> BR,
> -R
> 
> On Wed, Nov 22, 2017 at 10:14 AM, Roland Scheidegger wrote:
>> I don't think this is a good idea.
>> 1D and buffer resources are fundamentally incompatible, it is highly
>> illegal to use targets in views which are incompatible, so I'd rather
>> not see such atrocities in shared code.
>> I think you really want to fix up your resource_copy_region
>> implementation one way or another so it can do gpu copies for buffers.
>>
>> Roland
>>
>> On 22.11.2017 at 15:43, Rob Clark wrote:
>>> It is useful for staging/shadow transfers for drivers to be able to blit
>>> BUFFERs.  Treat them as R8 1D textures for this purpose.
>>>
>>> Signed-off-by: Rob Clark 
>>> ---
>>> This works at least if 1D textures are linear, so I suppose might not
>>> work for all drivers.  Although I'm not entirely sure what the point
>>> of a tiled 1D texture is.  And I guess drivers for which this wouldn't
>>> work could continue to just not use u_blitter for BUFFERs.
>>>
>>>  src/gallium/auxiliary/util/u_blitter.c | 12 ++--
>>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/gallium/auxiliary/util/u_blitter.c 
>>> b/src/gallium/auxiliary/util/u_blitter.c
>>> index 476ef08737e..7ba7b5aa57d 100644
>>> --- a/src/gallium/auxiliary/util/u_blitter.c
>>> +++ b/src/gallium/auxiliary/util/u_blitter.c
>>> @@ -1445,7 +1445,10 @@ void util_blitter_default_dst_texture(struct 
>>> pipe_surface *dst_templ,
>>>unsigned dstz)
>>>  {
>>> memset(dst_templ, 0, sizeof(*dst_templ));
>>> -   dst_templ->format = util_format_linear(dst->format);
>>> +   if (dst->target == PIPE_BUFFER)
>>> +  dst_templ->format = PIPE_FORMAT_R8_UINT;
>>> +   else
>>> +  dst_templ->format = util_format_linear(dst->format);
>>> dst_templ->u.tex.level = dstlevel;
>>> dst_templ->u.tex.first_layer = dstz;
>>> dst_templ->u.tex.last_layer = dstz;
>>> @@ -1482,7 +1485,12 @@ void util_blitter_default_src_texture(struct 
>>> blitter_context *blitter,
>>> else
>>>src_templ->target = src->target;
>>>
>>> -   src_templ->format = util_format_linear(src->format);
>>> +   if (src->target  == PIPE_BUFFER) {
>>> +  src_templ->target = PIPE_TEXTURE_1D;
>>> +  src_templ->format = PIPE_FORMAT_R8_UINT;
>>> +   } else {
>>> +  src_templ->format = util_format_linear(src->format);
>>> +   }
>>> src_templ->u.tex.first_level = srclevel;
>>> src_templ->u.tex.last_level = srclevel;
>>> src_templ->u.tex.first_layer = 0;
>>>
>>

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC] gallium/u_blitter: support blitting PIPE_BUFFERs

2017-11-22 Thread Marek Olšák

I don't have anything against driver-specific hacks in u_blitter,
because it already contains a lot of that, although separated in
special functions.

For buffer blitting, I recommend using DMA. Your hw should have it,
because it's roughly based on radeon, which has had CP DMA since r200.

If not, I recommend using a compute shader for the copy. Pixel shaders
are also possible (with buffer stores), but your maximum width is 16K
at best, and the 3D engine also adds overhead.

The preferred method of hacking blits is to use
util_blitter_blit_generic. You start by inlining util_blitter_blit in
your driver and then you modify that. That's why
util_blitter_blit_generic exists.

Marek

On Wed, Nov 22, 2017 at 3:43 PM, Rob Clark  wrote:
> It is useful for staging/shadow transfers for drivers to be able to blit
> BUFFERs.  Treat them as R8 1D textures for this purpose.
>
> Signed-off-by: Rob Clark 
> ---
> This works at least if 1D textures are linear, so I suppose might not
> work for all drivers.  Although I'm not entirely sure what the point
> of a tiled 1D texture is.  And I guess drivers for which this wouldn't
> work could continue to just not use u_blitter for BUFFERs.
>
>  src/gallium/auxiliary/util/u_blitter.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_blitter.c 
> b/src/gallium/auxiliary/util/u_blitter.c
> index 476ef08737e..7ba7b5aa57d 100644
> --- a/src/gallium/auxiliary/util/u_blitter.c
> +++ b/src/gallium/auxiliary/util/u_blitter.c
> @@ -1445,7 +1445,10 @@ void util_blitter_default_dst_texture(struct 
> pipe_surface *dst_templ,
>unsigned dstz)
>  {
> memset(dst_templ, 0, sizeof(*dst_templ));
> -   dst_templ->format = util_format_linear(dst->format);
> +   if (dst->target == PIPE_BUFFER)
> +  dst_templ->format = PIPE_FORMAT_R8_UINT;
> +   else
> +  dst_templ->format = util_format_linear(dst->format);
> dst_templ->u.tex.level = dstlevel;
> dst_templ->u.tex.first_layer = dstz;
> dst_templ->u.tex.last_layer = dstz;
> @@ -1482,7 +1485,12 @@ void util_blitter_default_src_texture(struct 
> blitter_context *blitter,
> else
>src_templ->target = src->target;
>
> -   src_templ->format = util_format_linear(src->format);
> +   if (src->target  == PIPE_BUFFER) {
> +  src_templ->target = PIPE_TEXTURE_1D;
> +  src_templ->format = PIPE_FORMAT_R8_UINT;
> +   } else {
> +  src_templ->format = util_format_linear(src->format);
> +   }
> src_templ->u.tex.first_level = srclevel;
> src_templ->u.tex.last_level = srclevel;
> src_templ->u.tex.first_layer = 0;
> --
> 2.13.6
>

Re: [Mesa-dev] [PATCH v2 04/17] etnaviv: Use only DRAW_INSTANCED on GC3000+

2017-11-22 Thread Lucas Stach

On Wednesday, 22.11.2017 at 18:16 +0100, Wladimir J. van der Laan wrote:
> Hello Lucas,
> 
> On Wed, Nov 22, 2017 at 02:29:33PM +0100, Lucas Stach wrote:
> > Hi Wladimir,
> > 
> > On Saturday, 18.11.2017 at 10:44 +0100, Wladimir J. van der Laan wrote:
> > > The blob does this, as DRAW_INSTANCED can fully replace all the other
> > > draw commands. It is also required to handle integer vertex formats.
> > > The other path is only there for compatibility and might go away (or at
> > > least rot to become buggy due to disuse) in newer hardware.
> > > 
> > > As a side effect this changes the behavior for GC3000-, by no longer using
> > > the index offset for DRAW_INDEXED but instead adding it to INDEX_ADDR.
> > > This should make no difference.
> > > 
> > > Preparation for GC7000 support.
> > 
> > I haven't looked into it much yet, but this commit breaks QT5 GUI
> > rendering on GC3000 for me. It seems like the DRAWs get dropped on the
> > floor with nothing being rendered. I didn't spot anything obviously
> > wrong in this patch from a quick look, so would be glad if you could
> > look into this.
> 
> Are you possibly running an older kernel that doesn't allow DRAW_INSTANCED
> in the command stream filter?

No, this is a 4.14 kernel and it doesn't complain about the cmdstream.

> I did test this patch series on GC3000, fairly sure that included qt5.
> 
> But yes let's leave out this patch for now, or have the change in behavior
> only for GC7000.

I would really like to know what's wrong with this patch, as using the
new draw command should be fine on GC3000 and we certainly want to
support instanced drawing at some point.

Regards,
Lucas
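
For readers following the DRAW_INDEXED discussion quoted above: the commit message says the index offset is no longer passed to the draw command but added to INDEX_ADDR instead. A minimal sketch of that address arithmetic, assuming the offset is counted in indices and the address in bytes. The function name is made up for illustration and is not etnaviv code, and whether the hardware puts alignment requirements on INDEX_ADDR is not stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Fold a draw's starting index into the index buffer address.
 * index_size is the size of one index in bytes (1, 2 or 4).
 */
static uint64_t
index_addr_with_start(uint64_t index_buffer_addr, uint32_t start,
                      uint32_t index_size)
{
   assert(index_size == 1 || index_size == 2 || index_size == 4);
   return index_buffer_addr + (uint64_t)start * index_size;
}
```

For example, starting at index 3 of a 16-bit index buffer located at 0x1000 yields 0x1006. If the two approaches really are equivalent on the hardware, a draw with start = 0 is unaffected either way.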

Re: [Mesa-dev] [PATCH v2 04/17] etnaviv: Use only DRAW_INSTANCED on GC3000+

2017-11-22 Thread Wladimir J. van der Laan

Hello Lucas,

On Wed, Nov 22, 2017 at 02:29:33PM +0100, Lucas Stach wrote:
> Hi Wladimir,
> 
> On Saturday, 18.11.2017 at 10:44 +0100, Wladimir J. van der Laan wrote:
> > The blob does this, as DRAW_INSTANCED can fully replace all the other
> > draw commands. It is also required to handle integer vertex formats.
> > The other path is only there for compatibility and might go away (or at
> > least rot to become buggy due to disuse) in newer hardware.
> > 
> > As a side effect this changes the behavior for GC3000-, by no longer using
> > the index offset for DRAW_INDEXED but instead adding it to INDEX_ADDR.
> > This should make no difference.
> > 
> > Preparation for GC7000 support.
> 
> I haven't looked into it much yet, but this commit breaks QT5 GUI
> rendering on GC3000 for me. It seems like the DRAWs get dropped on the
> floor with nothing being rendered. I didn't spot anything obviously
> wrong in this patch from a quick look, so would be glad if you could
> look into this.

Are you possibly running an older kernel that doesn't allow DRAW_INSTANCED
in the command stream filter?

I did test this patch series on GC3000, fairly sure that included qt5.

But yes let's leave out this patch for now, or have the change in behavior
only for GC7000.

Regards,
Wladimir

Re: [Mesa-dev] [PATCH v2 01/17] docs/specs: Add GL_MESA_program_binary_formats extension spec

2017-11-22 Thread Eric Engestrom

On Tuesday, 2017-11-21 11:24:48 -0800, Jordan Justen wrote:
> On 2017-11-21 05:45:36, Eric Engestrom wrote:
> > On Monday, 2017-11-20 14:27:27 -0800, Jordan Justen wrote:
> > > Similar idea to Tim's "spec: MESA_program_binary", but simplified and
> > > written to support both ARB_get_program_binary and
> > > OES_get_program_binary.
> > > 
> > > This spec was merged into the OpenGL Registry in version
> > > 667c5a253781834b40a6ae9eb19d05af4542cfe1.
> > > 
> > > Ref: https://github.com/KhronosGroup/OpenGL-Registry/pull/127
> > > Signed-off-by: Jordan Justen 
> > > Cc: Ian Romanick 
> > > Cc: Timothy Arceri 
> > > Reviewed-by: Ian Romanick 
> > > Reviewed-by: Nicolai Hähnle 
> > > ---
> > >  docs/specs/MESA_program_binary_formats.txt | 88 ++
> > 
> > As far as I can tell, this file ^ is identical to the one upstream [1].
> > Why do we want a copy here?
> 
> I assumed that we kept a copy in Mesa as well. Why would we keep spec
> docs in Mesa that are not in the Khronos registry?

My understanding is that we only kept specs that are not upstream (yet),
but I might be wrong.

Why would we want a copy of some of the upstream specs? At best it's an
unnecessary duplicate, at worst it's a non-authoritative differing spec.

I had a quick look, and it seems about half of the specs we have in that
folder are now upstream; I'll double check that they're identical and
send patches to remove them.

Let's see what people say when I send that series.

> 
> -Jordan
> 
> > (ack on the rest of the patch though)
> > 
> > [1] https://www.khronos.org/registry/OpenGL/extensions/MESA/MESA_program_binary_formats.txt
> > 
> > >  docs/specs/enums.txt   |  3 +
> > >  src/mapi/glapi/registry/gl.xml |  7 ++-
> > >  3 files changed, 97 insertions(+), 1 deletion(-)
> > >  create mode 100644 docs/specs/MESA_program_binary_formats.txt
> > > 
> > > diff --git a/docs/specs/MESA_program_binary_formats.txt 
> > > b/docs/specs/MESA_program_binary_formats.txt
> > > new file mode 100644
> > > index 000..937e8ef4bf3
> > > --- /dev/null
> > > +++ b/docs/specs/MESA_program_binary_formats.txt
> > > @@ -0,0 +1,88 @@
> > > +Name
> > > +
> > > +MESA_program_binary_formats
> > > +
> > > +Name Strings
> > > +
> > > +GL_MESA_program_binary_formats
> > > +
> > > +Contributors
> > > +
> > > +Ian Romanick
> > > +Jordan Justen
> > > +Timothy Arceri
> > > +
> > > +Contact
> > > +
> > > +Jordan Justen (jordan.l.justen 'at' intel.com)
> > > +
> > > +Status
> > > +
> > > +Complete.
> > > +
> > > +Version
> > > +
> > > +Last Modified Date: November 10, 2017
> > > +Revision: #2
> > > +
> > > +Number
> > > +
> > > +OpenGL Extension #516
> > > +OpenGL ES Extension #294
> > > +
> > > +Dependencies
> > > +
> > > +For use with the OpenGL ARB_get_program_binary extension, or the
> > > +OpenGL ES OES_get_program_binary extension.
> > > +
> > > +Overview
> > > +
> > > +The get_program_binary extensions require a GLenum binaryFormat.
> > > +This extension documents that format for use with Mesa.
> > > +
> > > +New Procedures and Functions
> > > +
> > > +None.
> > > +
> > > +New Tokens
> > > +
> > > +GL_PROGRAM_BINARY_FORMAT_MESA   0x875F
> > > +
> > > +For ARB_get_program_binary, GL_PROGRAM_BINARY_FORMAT_MESA may be
> > > +returned from GetProgramBinary calls in the <binaryFormat>
> > > +parameter and when retrieving the value of PROGRAM_BINARY_FORMATS.
> > > +
> > > +For OES_get_program_binary, GL_PROGRAM_BINARY_FORMAT_MESA may be
> > > +returned from GetProgramBinaryOES calls in the <binaryFormat>
> > > +parameter and when retrieving the value of
> > > +PROGRAM_BINARY_FORMATS_OES.
> > > +
> > > +New State
> > > +
> > > +None.
> > > +
> > > +Issues
> > > +
> > > +(1) Should we have a different format for each driver?
> > > +
> > > +  RESOLVED. Since Mesa supports multiple hardware drivers, having
> > > +  a single format may cause separate drivers to have to reject a
> > > +  binary for another type of hardware on the same machine. This
> > > +  could lead to an application having to invalidate and get a new
> > > +  binary more often.
> > > +
> > > +  This extension, at least initially, does not attempt to
> > > +  define a new token for each driver since systems that run
> > > +  multiple drivers are not the common case.
> > > +
> > > +  Additionally, drivers in Mesa are now gaining the ability to
> > > +  transparently cache shader programs. Therefore, although they
> > > +  may need to provide the application with a new binary more
> > > +  often, they likely can retrieve the program from the cache
> > > +  rather than performing an expensive recompile.
> > > +
> > > +Revision History
> > > +
> > > +#02    11/10/2017    Jordan
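
As a rough illustration of how an application might use the token from the quoted spec: it would query the supported formats (for instance with glGetIntegerv and GL_NUM_PROGRAM_BINARY_FORMATS / GL_PROGRAM_BINARY_FORMATS) and look for 0x875F in the result. The sketch below only shows the lookup on an already retrieved list, so that it stays self-contained. The macro and function names are invented here, and only the numeric value comes from the spec text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Value of GL_PROGRAM_BINARY_FORMAT_MESA as given in the spec text. */
#define SKETCH_PROGRAM_BINARY_FORMAT_MESA 0x875F

/* Return true if the list of binary formats reported by the driver
 * contains the Mesa format.
 */
static bool
formats_include_mesa(const int *formats, size_t count)
{
   assert(formats != NULL || count == 0);
   for (size_t i = 0; i < count; i++) {
      if (formats[i] == SKETCH_PROGRAM_BINARY_FORMAT_MESA)
         return true;
   }
   return false;
}
```

Since the spec deliberately uses one token for all Mesa drivers, finding the token does not guarantee that a previously saved binary will load. The application still has to be prepared for the driver to reject the binary and fall back to compiling from source.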

[Mesa-dev] [PATCH] radeonsi: try flushing unflushed fences in si_fence_finish even when timeout == 0

2017-11-22 Thread Nicolai Hähnle

From: Nicolai Hähnle 

Under certain conditions, waiting on a GL sync object should act like
a flush, regardless of the timeout.

Portal 2, CS:GO, and presumably other Source engine games rely on this
behavior and hang during loading without this fix.

Fixes: bc65dcab3bc4 ("radeonsi: avoid syncing the driver thread in si_fence_finish")
---
 src/gallium/drivers/radeonsi/si_fence.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_fence.c 
b/src/gallium/drivers/radeonsi/si_fence.c
index 9d6bcfe1027..b835ed649ee 100644
--- a/src/gallium/drivers/radeonsi/si_fence.c
+++ b/src/gallium/drivers/radeonsi/si_fence.c
@@ -184,36 +184,36 @@ static void si_fine_fence_set(struct si_context *ctx,
 static boolean si_fence_finish(struct pipe_screen *screen,
   struct pipe_context *ctx,
   struct pipe_fence_handle *fence,
   uint64_t timeout)
 {
struct radeon_winsys *rws = ((struct r600_common_screen*)screen)->ws;
struct si_multi_fence *rfence = (struct si_multi_fence *)fence;
int64_t abs_timeout = os_time_get_absolute_timeout(timeout);
 
if (!util_queue_fence_is_signalled(&rfence->ready)) {
-   if (!timeout)
-   return false;
-
if (rfence->tc_token) {
/* Ensure that si_flush_from_st will be called for
 * this fence, but only if we're in the API thread
 * where the context is current.
 *
 * Note that the batch containing the flush may already
 * be in flight in the driver thread, so the fence
 * may not be ready yet when this call returns.
 */
threaded_context_flush(ctx, rfence->tc_token,
   timeout == 0);
}
 
+   if (!timeout)
+   return false;
+
if (timeout == PIPE_TIMEOUT_INFINITE) {
util_queue_fence_wait(&rfence->ready);
} else {
if (!util_queue_fence_wait_timeout(&rfence->ready, abs_timeout))
return false;
}
 
if (timeout && timeout != PIPE_TIMEOUT_INFINITE) {
int64_t time = os_time_get_nano();
timeout = abs_timeout > time ? abs_timeout - time : 0;
-- 
2.11.0


Re: [Mesa-dev] [RFC] gallium/u_blitter: support blitting PIPE_BUFFERs

2017-11-22 Thread Rob Clark

So, I could potentially do this for a5xx, which actually has a sort of
blit engine.  But for earlier gen's, everything is a glDraw(), and I
don't want to duplicate the functionality of u_blitter.

I suppose I could bypass util_blitter_blit() and use
util_blitter_blit_generic() directly..

BR,
-R

On Wed, Nov 22, 2017 at 10:14 AM, Roland Scheidegger  wrote:
> I don't think this is a good idea.
> 1D and buffer resources are fundamentally incompatible, it is highly
> illegal to use targets in views which are incompatible, so I'd rather
> not see such atrocities in shared code.
> I think you really want to fix up your resource_copy_region
> implementation one way or another so it can do gpu copies for buffers.
>
> Roland
>
> On 22.11.2017 at 15:43, Rob Clark wrote:
>> It is useful for staging/shadow transfers for drivers to be able to blit
>> BUFFERs.  Treat them as R8 1D textures for this purpose.
>>
>> Signed-off-by: Rob Clark 
>> ---
>> This works at least if 1D textures are linear, so I suppose might not
>> work for all drivers.  Although I'm not entirely sure what the point
>> of a tiled 1D texture is.  And I guess drivers for which this wouldn't
>> work could continue to just not use u_blitter for BUFFERs.
>>
>>  src/gallium/auxiliary/util/u_blitter.c | 12 ++--
>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_blitter.c 
>> b/src/gallium/auxiliary/util/u_blitter.c
>> index 476ef08737e..7ba7b5aa57d 100644
>> --- a/src/gallium/auxiliary/util/u_blitter.c
>> +++ b/src/gallium/auxiliary/util/u_blitter.c
>> @@ -1445,7 +1445,10 @@ void util_blitter_default_dst_texture(struct 
>> pipe_surface *dst_templ,
>>unsigned dstz)
>>  {
>> memset(dst_templ, 0, sizeof(*dst_templ));
>> -   dst_templ->format = util_format_linear(dst->format);
>> +   if (dst->target == PIPE_BUFFER)
>> +  dst_templ->format = PIPE_FORMAT_R8_UINT;
>> +   else
>> +  dst_templ->format = util_format_linear(dst->format);
>> dst_templ->u.tex.level = dstlevel;
>> dst_templ->u.tex.first_layer = dstz;
>> dst_templ->u.tex.last_layer = dstz;
>> @@ -1482,7 +1485,12 @@ void util_blitter_default_src_texture(struct 
>> blitter_context *blitter,
>> else
>>src_templ->target = src->target;
>>
>> -   src_templ->format = util_format_linear(src->format);
>> +   if (src->target  == PIPE_BUFFER) {
>> +  src_templ->target = PIPE_TEXTURE_1D;
>> +  src_templ->format = PIPE_FORMAT_R8_UINT;
>> +   } else {
>> +  src_templ->format = util_format_linear(src->format);
>> +   }
>> src_templ->u.tex.first_level = srclevel;
>> src_templ->u.tex.last_level = srclevel;
>> src_templ->u.tex.first_layer = 0;
>>
>

Re: [Mesa-dev] [RFC] gallium/u_blitter: support blitting PIPE_BUFFERs

2017-11-22 Thread Roland Scheidegger

I don't think this is a good idea.
1D and buffer resources are fundamentally incompatible, it is highly
illegal to use targets in views which are incompatible, so I'd rather
not see such atrocities in shared code.
I think you really want to fix up your resource_copy_region
implementation one way or another so it can do gpu copies for buffers.

Roland

On 22.11.2017 at 15:43, Rob Clark wrote:
> It is useful for staging/shadow transfers for drivers to be able to blit
> BUFFERs.  Treat them as R8 1D textures for this purpose.
> 
> Signed-off-by: Rob Clark 
> ---
> This works at least if 1D textures are linear, so I suppose might not
> work for all drivers.  Although I'm not entirely sure what the point
> of a tiled 1D texture is.  And I guess drivers for which this wouldn't
> work could continue to just not use u_blitter for BUFFERs.
> 
>  src/gallium/auxiliary/util/u_blitter.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/util/u_blitter.c 
> b/src/gallium/auxiliary/util/u_blitter.c
> index 476ef08737e..7ba7b5aa57d 100644
> --- a/src/gallium/auxiliary/util/u_blitter.c
> +++ b/src/gallium/auxiliary/util/u_blitter.c
> @@ -1445,7 +1445,10 @@ void util_blitter_default_dst_texture(struct 
> pipe_surface *dst_templ,
>unsigned dstz)
>  {
> memset(dst_templ, 0, sizeof(*dst_templ));
> -   dst_templ->format = util_format_linear(dst->format);
> +   if (dst->target == PIPE_BUFFER)
> +  dst_templ->format = PIPE_FORMAT_R8_UINT;
> +   else
> +  dst_templ->format = util_format_linear(dst->format);
> dst_templ->u.tex.level = dstlevel;
> dst_templ->u.tex.first_layer = dstz;
> dst_templ->u.tex.last_layer = dstz;
> @@ -1482,7 +1485,12 @@ void util_blitter_default_src_texture(struct 
> blitter_context *blitter,
> else
>src_templ->target = src->target;
>  
> -   src_templ->format = util_format_linear(src->format);
> +   if (src->target  == PIPE_BUFFER) {
> +  src_templ->target = PIPE_TEXTURE_1D;
> +  src_templ->format = PIPE_FORMAT_R8_UINT;
> +   } else {
> +  src_templ->format = util_format_linear(src->format);
> +   }
> src_templ->u.tex.first_level = srclevel;
> src_templ->u.tex.last_level = srclevel;
> src_templ->u.tex.first_layer = 0;
> 


[Mesa-dev] [PATCH] Revert "radv: remove unnecessary memset() in radv_AllocateCommandBuffers()"

2017-11-22 Thread Samuel Pitoiset

This fixes two CTS regressions:
- dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
- dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary

These two tests are part of the mustpass lists, so presumably they
are correct and my change was wrong.

This reverts commit 0f68208f1d1d3b7b2963dab40e84c60212518692.
---
 src/amd/vulkan/radv_cmd_buffer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 7d86eee979..bd72ba2a87 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2139,6 +2139,9 @@ VkResult radv_AllocateCommandBuffers(
VkResult result = VK_SUCCESS;
uint32_t i;
 
+   memset(pCommandBuffers, 0,
+          sizeof(*pCommandBuffers)*pAllocateInfo->commandBufferCount);
+
for (i = 0; i < pAllocateInfo->commandBufferCount; i++) {
 
if (!list_empty(&pool->free_cmd_buffers)) {
-- 
2.15.0
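
A small model of why clearing the output array matters for the two tests named above, which make allocation fail partway through a multi-object request. The Vulkan specification requires vkAllocateCommandBuffers to free what it already allocated and set all entries of pCommandBuffers to NULL on failure. The sketch below is not radv code: the allocator, its failure injection and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator that fails on request number fail_at. */
static void *
mock_alloc(uint32_t i, uint32_t fail_at)
{
   return i == fail_at ? NULL : malloc(1);
}

/* Allocate count objects into out[].  On failure, free everything that
 * was allocated so far and leave every entry of out[] set to NULL.
 */
static bool
alloc_all_or_nothing(void **out, uint32_t count, uint32_t fail_at)
{
   assert(out != NULL);

   /* Without this, entries after the failing index keep whatever the
    * caller had in the array before the call.
    */
   memset(out, 0, sizeof(*out) * count);

   for (uint32_t i = 0; i < count; i++) {
      out[i] = mock_alloc(i, fail_at);
      if (!out[i]) {
         for (uint32_t j = 0; j < i; j++) {
            free(out[j]);
            out[j] = NULL;
         }
         return false;
      }
   }
   return true;
}
```

If the caller passes an array holding stale pointers and the third of four allocations fails, the fourth entry is never written by the loop. Only the up-front memset guarantees it ends up NULL, which is presumably the property the CTS tests check.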


Re: [Mesa-dev] [RFC] gallium/u_blitter: support blitting PIPE_BUFFERs

2017-11-22 Thread Rob Clark

yes, because that ends up being a cpu copy for BUFFER (ie. stall),
which is what I'm trying to avoid in the first place ;-)

Maybe this needs to map things to a linear 2d buffer for larger sizes,
not sure.  One way or another at the hw level this gets treated like a
2d render target (which might have height=1).  I was trying to think
of a way to do this hack in driver, but it is awkward without frob'ing
prsc->target (which isn't really safe to do if the resource is shared
between contexts on different threads).

Or maybe other option is to fall back to memcpy for BUFFERs that are
too large.  In practice, this isn't a problem, at least not with the
games I'm looking at.

BR,
-R
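
The "fall back to memcpy for BUFFERs that are too large" option mentioned above boils down to a size check before picking a path. A minimal sketch, assuming the buffer is viewed as R8 (one byte per texel) so that the byte size can be compared directly against the maximum 1D width. The enum and function names are hypothetical and do not exist in freedreno or u_blitter.

```c
#include <assert.h>
#include <stdint.h>

enum buffer_copy_path {
   COPY_PATH_GPU_BLIT,   /* treat the buffer as a 1D R8 surface */
   COPY_PATH_CPU_MEMCPY, /* map and copy on the CPU (may stall) */
};

/* Use the GPU path only when the whole copy fits into one row. */
static enum buffer_copy_path
choose_buffer_copy_path(uint64_t size_bytes, uint32_t max_1d_width)
{
   assert(max_1d_width > 0);
   return size_bytes <= max_1d_width ? COPY_PATH_GPU_BLIT
                                     : COPY_PATH_CPU_MEMCPY;
}
```

With a 16k limit, a 4 KiB staging copy would take the GPU path while a 1 MiB copy would still stall on the CPU, so this only helps if large buffer copies are rare, as claimed for the games in question.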

On Wed, Nov 22, 2017 at 9:47 AM, Ilia Mirkin  wrote:
> Are you sure you're not looking for resource_copy_region? BUFFERs can
> be wide (128MB in many impls), 1D textures can't. There are probably
> other differences.
>
> On Wed, Nov 22, 2017 at 9:43 AM, Rob Clark  wrote:
>> It is useful for staging/shadow transfers for drivers to be able to blit
>> BUFFERs.  Treat them as R8 1D textures for this purpose.
>>
>> Signed-off-by: Rob Clark 
>> ---
>> This works at least if 1D textures are linear, so I suppose might not
>> work for all drivers.  Although I'm not entirely sure what the point
>> of a tiled 1D texture is.  And I guess drivers for which this wouldn't
>> work could continue to just not use u_blitter for BUFFERs.
>>
>>  src/gallium/auxiliary/util/u_blitter.c | 12 ++--
>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_blitter.c 
>> b/src/gallium/auxiliary/util/u_blitter.c
>> index 476ef08737e..7ba7b5aa57d 100644
>> --- a/src/gallium/auxiliary/util/u_blitter.c
>> +++ b/src/gallium/auxiliary/util/u_blitter.c
>> @@ -1445,7 +1445,10 @@ void util_blitter_default_dst_texture(struct 
>> pipe_surface *dst_templ,
>>unsigned dstz)
>>  {
>> memset(dst_templ, 0, sizeof(*dst_templ));
>> -   dst_templ->format = util_format_linear(dst->format);
>> +   if (dst->target == PIPE_BUFFER)
>> +  dst_templ->format = PIPE_FORMAT_R8_UINT;
>> +   else
>> +  dst_templ->format = util_format_linear(dst->format);
>> dst_templ->u.tex.level = dstlevel;
>> dst_templ->u.tex.first_layer = dstz;
>> dst_templ->u.tex.last_layer = dstz;
>> @@ -1482,7 +1485,12 @@ void util_blitter_default_src_texture(struct 
>> blitter_context *blitter,
>> else
>>src_templ->target = src->target;
>>
>> -   src_templ->format = util_format_linear(src->format);
>> +   if (src->target  == PIPE_BUFFER) {
>> +  src_templ->target = PIPE_TEXTURE_1D;
>> +  src_templ->format = PIPE_FORMAT_R8_UINT;
>> +   } else {
>> +  src_templ->format = util_format_linear(src->format);
>> +   }
>> src_templ->u.tex.first_level = srclevel;
>> src_templ->u.tex.last_level = srclevel;
>> src_templ->u.tex.first_layer = 0;
>> --
>> 2.13.6
>>

Re: [Mesa-dev] [RFC] gallium/u_blitter: support blitting PIPE_BUFFERs

2017-11-22 Thread Ilia Mirkin

Are you sure you're not looking for resource_copy_region? BUFFERs can
be wide (128MB in many impls), 1D textures can't. There are probably
other differences.

On Wed, Nov 22, 2017 at 9:43 AM, Rob Clark  wrote:
> It is useful for staging/shadow transfers for drivers to be able to blit
> BUFFERs.  Treat them as R8 1D textures for this purpose.
>
> Signed-off-by: Rob Clark 
> ---
> This works at least if 1D textures are linear, so I suppose might not
> work for all drivers.  Although I'm not entirely sure what the point
> of a tiled 1D texture is.  And I guess drivers for which this wouldn't
> work could continue to just not use u_blitter for BUFFERs.
>
>  src/gallium/auxiliary/util/u_blitter.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_blitter.c 
> b/src/gallium/auxiliary/util/u_blitter.c
> index 476ef08737e..7ba7b5aa57d 100644
> --- a/src/gallium/auxiliary/util/u_blitter.c
> +++ b/src/gallium/auxiliary/util/u_blitter.c
> @@ -1445,7 +1445,10 @@ void util_blitter_default_dst_texture(struct 
> pipe_surface *dst_templ,
>unsigned dstz)
>  {
> memset(dst_templ, 0, sizeof(*dst_templ));
> -   dst_templ->format = util_format_linear(dst->format);
> +   if (dst->target == PIPE_BUFFER)
> +  dst_templ->format = PIPE_FORMAT_R8_UINT;
> +   else
> +  dst_templ->format = util_format_linear(dst->format);
> dst_templ->u.tex.level = dstlevel;
> dst_templ->u.tex.first_layer = dstz;
> dst_templ->u.tex.last_layer = dstz;
> @@ -1482,7 +1485,12 @@ void util_blitter_default_src_texture(struct 
> blitter_context *blitter,
> else
>src_templ->target = src->target;
>
> -   src_templ->format = util_format_linear(src->format);
> +   if (src->target  == PIPE_BUFFER) {
> +  src_templ->target = PIPE_TEXTURE_1D;
> +  src_templ->format = PIPE_FORMAT_R8_UINT;
> +   } else {
> +  src_templ->format = util_format_linear(src->format);
> +   }
> src_templ->u.tex.first_level = srclevel;
> src_templ->u.tex.last_level = srclevel;
> src_templ->u.tex.first_layer = 0;
> --
> 2.13.6
>

[Mesa-dev] [RFC] gallium/u_blitter: support blitting PIPE_BUFFERs

2017-11-22 Thread Rob Clark

It is useful for staging/shadow transfers for drivers to be able to blit
BUFFERs.  Treat them as R8 1D textures for this purpose.

Signed-off-by: Rob Clark 
---
This works at least if 1D textures are linear, so I suppose might not
work for all drivers.  Although I'm not entirely sure what the point
of a tiled 1D texture is.  And I guess drivers for which this wouldn't
work could continue to just not use u_blitter for BUFFERs.

 src/gallium/auxiliary/util/u_blitter.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_blitter.c 
b/src/gallium/auxiliary/util/u_blitter.c
index 476ef08737e..7ba7b5aa57d 100644
--- a/src/gallium/auxiliary/util/u_blitter.c
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -1445,7 +1445,10 @@ void util_blitter_default_dst_texture(struct 
pipe_surface *dst_templ,
   unsigned dstz)
 {
memset(dst_templ, 0, sizeof(*dst_templ));
-   dst_templ->format = util_format_linear(dst->format);
+   if (dst->target == PIPE_BUFFER)
+  dst_templ->format = PIPE_FORMAT_R8_UINT;
+   else
+  dst_templ->format = util_format_linear(dst->format);
dst_templ->u.tex.level = dstlevel;
dst_templ->u.tex.first_layer = dstz;
dst_templ->u.tex.last_layer = dstz;
@@ -1482,7 +1485,12 @@ void util_blitter_default_src_texture(struct 
blitter_context *blitter,
else
   src_templ->target = src->target;
 
-   src_templ->format = util_format_linear(src->format);
+   if (src->target  == PIPE_BUFFER) {
+  src_templ->target = PIPE_TEXTURE_1D;
+  src_templ->format = PIPE_FORMAT_R8_UINT;
+   } else {
+  src_templ->format = util_format_linear(src->format);
+   }
src_templ->u.tex.first_level = srclevel;
src_templ->u.tex.last_level = srclevel;
src_templ->u.tex.first_layer = 0;
-- 
2.13.6


[Mesa-dev] [PATCH 2/2] st/glsl_to_tgsi: Add support for SYSTEM_VALUE_BASE_VERTEX_ID

2017-11-22 Thread Neil Roberts

SYSTEM_VALUE_BASE_VERTEX has changed to be the correct value for
gl_BaseVertex, which means it will be zero when used with a
non-indexed call. The new BASE_VERTEX_ID value can be used as before
as an offset to calculate a value for gl_VertexID. These values should
be different, but this patch just makes them the same for now in order to
at least retain the previous behaviour and not break gl_BaseVertexID
and gl_VertexID entirely on radeonsi.

Note, this hasn’t been tested beyond verifying that it compiles.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 0772b73..3dfed19 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5385,6 +5385,11 @@ _mesa_sysval_to_semantic(unsigned sysval)
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
   return TGSI_SEMANTIC_VERTEXID_NOBASE;
case SYSTEM_VALUE_BASE_VERTEX:
+   case SYSTEM_VALUE_BASE_VERTEX_ID:
+  /* FIXME: These two values are actually supposed to be different. The
+   * one used for gl_BaseVertex is supposed to be zero when a non-indexed
+   * draw call is used.
+   */
   return TGSI_SEMANTIC_BASEVERTEX;
case SYSTEM_VALUE_BASE_INSTANCE:
   return TGSI_SEMANTIC_BASEINSTANCE;
-- 
2.9.5
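
The difference between the two system values described in the commit message can be written down as plain arithmetic. This is a sketch of the semantics as described by ARB_shader_draw_parameters, not state tracker code: the struct and function names are invented, and zero_based_id stands for the fetched index on indexed draws or the running vertex counter on non-indexed draws.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the draw parameters that matter here. */
struct draw_params {
   bool indexed;       /* glDrawElements*-style draw */
   int32_t first;      /* "first" of a non-indexed draw */
   int32_t basevertex; /* basevertex of an indexed draw */
};

/* Value visible as gl_BaseVertex: zero for non-indexed draws. */
static int32_t
base_vertex(const struct draw_params *d)
{
   return d->indexed ? d->basevertex : 0;
}

/* Offset that has to be added to a zero-based ID to get gl_VertexID. */
static int32_t
base_vertex_id(const struct draw_params *d)
{
   return d->indexed ? d->basevertex : d->first;
}

static int32_t
vertex_id(const struct draw_params *d, int32_t zero_based_id)
{
   return zero_based_id + base_vertex_id(d);
}
```

For an indexed draw both helpers return the same number. For a non-indexed draw starting at vertex 5 they return 0 and 5 respectively, which is the case the FIXME in the patch says is not yet handled by mapping both values to TGSI_SEMANTIC_BASEVERTEX.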


[Mesa-dev] [PATCH 1/2] freedreno: Update to handle rename of the base vertex ID intrinsic

2017-11-22 Thread Neil Roberts

The old intrinsic called base_vertex that is used to add to
gl_VertexID is now called base_vertex_id so that base_vertex can be
used for the value of gl_BaseVertex, which is different. As far as I
can tell freedreno doesn’t support GL_ARB_shader_draw_parameters so it
won’t need any changes to generate the new base_vertex intrinsic.

I haven’t tested this at all beyond verifying that it compiles.

Cc: Rob Clark 
---
 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index da4aeaa..e6fbf45 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -2071,10 +2071,10 @@ emit_intrinsic(struct ir3_context *ctx, 
nir_intrinsic_instr *intr)
ctx->ir->outputs[n] = src[i];
}
break;
-   case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_base_vertex_id:
if (!ctx->basevertex) {
ctx->basevertex = create_driver_param(ctx, 
IR3_DP_VTXID_BASE);
-   add_sysval_input(ctx, SYSTEM_VALUE_BASE_VERTEX,
+   add_sysval_input(ctx, SYSTEM_VALUE_BASE_VERTEX_ID,
ctx->basevertex);
}
dst[0] = ctx->basevertex;
-- 
2.9.5


[Mesa-dev] [PATCH 0/2] Fix radeonsi and freedreno for the BaseVertex patches

2017-11-22 Thread Neil Roberts

Here are two bonus patches to address the concern that the BaseVertex
patches might break radeonsi and freedreno. The radeonsi patch doesn’t
solve the problem that the value for gl_BaseVertex is presumably wrong,
but it should at least stop it from breaking gl_VertexID entirely. I
haven’t been able to test either patch, so this is more of a request
for comments.

Neil Roberts (2):
  freedreno: Update to handle rename of the base vertex ID intrinsic
  st/glsl_to_tgsi: Add support for SYSTEM_VALUE_BASE_VERTEX_ID

 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 4 ++--
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 5 +
 2 files changed, 7 insertions(+), 2 deletions(-)

-- 
2.9.5


Re: [Mesa-dev] [PATCH v2 08/17] etnaviv: GC7000: BLT engine blitting support

2017-11-22 Thread Christian Gmeiner

2017-11-18 10:44 GMT+01:00 Wladimir J. van der Laan :
> Add an implementation of key clear_blit functions using the BLT engine
> that replaced the RS on GC7000.
>
> Also set level->size correctly for imported resources. This is important
> for the BLT resolve-in-place path to work for them.
>
> Signed-off-by: Wladimir J. van der Laan 

Reviewed-by: Christian Gmeiner 

> ---
>  src/gallium/drivers/etnaviv/Makefile.sources |   3 +
>  src/gallium/drivers/etnaviv/etnaviv_blt.c| 562 
> +++
>  src/gallium/drivers/etnaviv/etnaviv_blt.h| 100 
>  src/gallium/drivers/etnaviv/etnaviv_clear_blit.c |   8 +-
>  src/gallium/drivers/etnaviv/etnaviv_context.c|   6 +-
>  src/gallium/drivers/etnaviv/etnaviv_internal.h   |   2 +
>  src/gallium/drivers/etnaviv/etnaviv_resource.c   |   1 +
>  src/gallium/drivers/etnaviv/etnaviv_screen.c |   2 +
>  src/gallium/drivers/etnaviv/meson.build  |   3 +
>  9 files changed, 684 insertions(+), 3 deletions(-)
>  create mode 100644 src/gallium/drivers/etnaviv/etnaviv_blt.c
>  create mode 100644 src/gallium/drivers/etnaviv/etnaviv_blt.h
>
> - Code style issues resolved
> - Update both meson and makefile
> - Remove copy_buffer, compute_mipmaps for now
> - Make etnaviv_blt self-contained like etnaviv_rs, make functions that could 
> be static static
> - No more etnaviv_clear_blit_blt.c
> - Set level->size correctly for imported resources. This is important for the 
> BLT resolve-in-place path to work for them
>
> diff --git a/src/gallium/drivers/etnaviv/Makefile.sources 
> b/src/gallium/drivers/etnaviv/Makefile.sources
> index aafcc38..78029ad 100644
> --- a/src/gallium/drivers/etnaviv/Makefile.sources
> +++ b/src/gallium/drivers/etnaviv/Makefile.sources
> @@ -4,12 +4,15 @@ C_SOURCES :=  \
> hw/common_3d.xml.h \
> hw/isa.xml.h \
> hw/state_3d.xml.h \
> +   hw/state_blt.xml.h \
> hw/state.xml.h \
> \
> etnaviv_asm.c \
> etnaviv_asm.h \
> etnaviv_blend.c \
> etnaviv_blend.h \
> +   etnaviv_blt.c \
> +   etnaviv_blt.h \
> etnaviv_clear_blit.c \
> etnaviv_clear_blit.h \
> etnaviv_compiler.c \
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_blt.c 
> b/src/gallium/drivers/etnaviv/etnaviv_blt.c
> new file mode 100644
> index 000..ec3eac9
> --- /dev/null
> +++ b/src/gallium/drivers/etnaviv/etnaviv_blt.c
> @@ -0,0 +1,562 @@
> +/*
> + * Copyright (c) 2017 Etnaviv Project
> + * Copyright (C) 2017 Zodiac Inflight Innovations
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sub license,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> + * next paragraph) shall be included in all copies or substantial portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Authors:
> + *Wladimir J. van der Laan 
> + */
> +#include "etnaviv_blt.h"
> +
> +#include "etnaviv_emit.h"
> +#include "etnaviv_clear_blit.h"
> +#include "etnaviv_context.h"
> +#include "etnaviv_emit.h"
> +#include "etnaviv_format.h"
> +#include "etnaviv_resource.h"
> +#include "etnaviv_surface.h"
> +#include "etnaviv_translate.h"
> +
> +#include "util/u_math.h"
> +#include "pipe/p_defines.h"
> +#include "pipe/p_state.h"
> +#include "util/u_blitter.h"
> +#include "util/u_inlines.h"
> +#include "util/u_memory.h"
> +#include "util/u_surface.h"
> +
> +#include "hw/common_3d.xml.h"
> +#include "hw/state_blt.xml.h"
> +#include "hw/common.xml.h"
> +
> +#include 
> +
> +/* Currently, used BLT formats overlap 100% with RS formats */
> +#define translate_blt_format translate_rs_format
> +
> +static inline uint32_t
> +blt_compute_stride_bits(const struct blt_imginfo *img)
> +{
> +   return VIVS_BLT_DEST_STRIDE_TILING(img->tiling == ETNA_LAYOUT_LINEAR ? 0 
> : 3) | /* 1/3? */
> +  VIVS_BLT_DEST_STRIDE_FORMAT(img->format) |
> +  VIVS_BLT_DEST_STRIDE_STRIDE(img->stride);
> +}
> +
> +static inline uint32_t
> +blt_compute_img_config_bits(const struct blt_imginfo *img,

Re: [Mesa-dev] [PATCH v2 07/17] etnaviv: GC7000: Factor out RS blit functionality

2017-11-22 Thread Christian Gmeiner

2017-11-18 10:44 GMT+01:00 Wladimir J. van der Laan :
> Prepare for BLT-based blitting path by moving RS-based
> blitting to the RS implementation file, making this
> self-contained.
>
> Signed-off-by: Wladimir J. van der Laan 

Reviewed-by: Christian Gmeiner 

> ---
>  src/gallium/drivers/etnaviv/etnaviv_clear_blit.c | 558 +--
>  src/gallium/drivers/etnaviv/etnaviv_clear_blit.h |   6 +
>  src/gallium/drivers/etnaviv/etnaviv_emit.c   |  79 ---
>  src/gallium/drivers/etnaviv/etnaviv_emit.h   |   3 -
>  src/gallium/drivers/etnaviv/etnaviv_rs.c | 665 
> ++-
>  src/gallium/drivers/etnaviv/etnaviv_rs.h |   4 +-
>  6 files changed, 677 insertions(+), 638 deletions(-)
>
> - Made etnaviv_rs.c self-contained, make functions that could be static static
>   and local (idea by Christian Gmeiner)
>
> - No more etnaviv_clear_blit_rs.c, so no build system changes needed here.
>
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_clear_blit.c 
> b/src/gallium/drivers/etnaviv/etnaviv_clear_blit.c
> index ff37a6b..ae5300a 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_clear_blit.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_clear_blit.c
> @@ -30,9 +30,9 @@
>
>  #include "etnaviv_context.h"
>  #include "etnaviv_emit.h"
> -#include "etnaviv_emit.h"
>  #include "etnaviv_format.h"
>  #include "etnaviv_resource.h"
> +#include "etnaviv_rs.h"
>  #include "etnaviv_surface.h"
>  #include "etnaviv_translate.h"
>
> @@ -44,7 +44,7 @@
>  #include "util/u_surface.h"
>
>  /* Save current state for blitter operation */
> -static void
> +void
>  etna_blit_save_state(struct etna_context *ctx)
>  {
> util_blitter_save_vertex_buffer_slot(ctx->blitter, ctx->vertex_buffer.vb);
> @@ -65,43 +65,8 @@ etna_blit_save_state(struct etna_context *ctx)
>   ctx->num_fragment_sampler_views, ctx->sampler_view);
>  }
>
> -/* Generate clear command for a surface (non-fast clear case) */
> -void
> -etna_rs_gen_clear_surface(struct etna_context *ctx, struct etna_surface 
> *surf,
> -  uint32_t clear_value)
> -{
> -   struct etna_resource *dst = etna_resource(surf->base.texture);
> -   uint32_t format = translate_rs_format(surf->base.format);
> -
> -   if (format == ETNA_NO_MATCH) {
> -  BUG("etna_rs_gen_clear_surface: Unhandled clear fmt %s", 
> util_format_name(surf->base.format));
> -  format = RS_FORMAT_A8R8G8B8;
> -  assert(0);
> -   }
> -
> -   /* use tiled clear if width is multiple of 16 */
> -   bool tiled_clear = (surf->surf.padded_width & ETNA_RS_WIDTH_MASK) == 0 &&
> -  (surf->surf.padded_height & ETNA_RS_HEIGHT_MASK) == 0;
> -
> -   etna_compile_rs_state( ctx, >clear_command, &(struct rs_state) {
> -  .source_format = format,
> -  .dest_format = format,
> -  .dest = dst->bo,
> -  .dest_offset = surf->surf.offset,
> -  .dest_stride = surf->surf.stride,
> -  .dest_padded_height = surf->surf.padded_height,
> -  .dest_tiling = tiled_clear ? dst->layout : ETNA_LAYOUT_LINEAR,
> -  .dither = {0x, 0x},
> -  .width = surf->surf.padded_width, /* These must be padded to 16x4 if 
> !LINEAR, otherwise RS will hang */
> -  .height = surf->surf.padded_height,
> -  .clear_value = {clear_value},
> -  .clear_mode = VIVS_RS_CLEAR_CONTROL_MODE_ENABLED1,
> -  .clear_bits = 0x
> -   });
> -}
> -
> -static inline uint32_t
> -pack_rgba(enum pipe_format format, const float *rgba)
> +uint32_t
> +etna_clear_blit_pack_rgba(enum pipe_format format, const float *rgba)
>  {
> union util_color uc;
> util_pack_color(rgba, format, &uc);
> @@ -112,152 +77,6 @@ pack_rgba(enum pipe_format format, const float *rgba)
>  }
>
>  static void
> -etna_blit_clear_color(struct pipe_context *pctx, struct pipe_surface *dst,
> -  const union pipe_color_union *color)
> -{
> -   struct etna_context *ctx = etna_context(pctx);
> -   struct etna_surface *surf = etna_surface(dst);
> -   uint32_t new_clear_value = pack_rgba(surf->base.format, color->f);
> -
> -   if (surf->surf.ts_size) { /* TS: use precompiled clear command */
> -  ctx->framebuffer.TS_COLOR_CLEAR_VALUE = new_clear_value;
> -
> -  if (VIV_FEATURE(ctx->screen, chipMinorFeatures1, AUTO_DISABLE)) {
> - /* Set number of color tiles to be filled */
> - etna_set_state(ctx->stream, VIVS_TS_COLOR_AUTO_DISABLE_COUNT,
> -surf->surf.padded_width * surf->surf.padded_height / 
> 16);
> - ctx->framebuffer.TS_MEM_CONFIG |= 
> VIVS_TS_MEM_CONFIG_COLOR_AUTO_DISABLE;
> -  }
> -
> -  surf->level->ts_valid = true;
> -  ctx->dirty |= ETNA_DIRTY_TS | ETNA_DIRTY_DERIVE_TS;
> -   } else if (unlikely(new_clear_value != surf->level->clear_value)) { /* 
> Queue normal RS clear for non-TS surfaces */
> -  /* If clear color changed, re-generate stored command */
> -  etna_rs_gen_clear_surface(ctx, surf,

Re: [Mesa-dev] [PATCH v2 05/17] etnaviv: GC7000: Support BLT as recipient for etna_stall

2017-11-22 Thread Christian Gmeiner

2017-11-18 10:44 GMT+01:00 Wladimir J. van der Laan :
> When the BLT is involved as source or target, add an extra BLT
> enable/disable sequence around the sync sequence.
>
> Signed-off-by: Wladimir J. van der Laan 

Reviewed-by: Christian Gmeiner 

> ---
>  src/gallium/drivers/etnaviv/etnaviv_emit.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> Unchanged since v1.
>
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_emit.c 
> b/src/gallium/drivers/etnaviv/etnaviv_emit.c
> index 3b460a0..98f7baa 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_emit.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_emit.c
> @@ -41,6 +41,7 @@
>  #include "etnaviv_zsa.h"
>  #include "hw/common.xml.h"
>  #include "hw/state.xml.h"
> +#include "hw/state_blt.xml.h"
>  #include "util/u_math.h"
>
>  struct etna_coalesce {
> @@ -60,8 +61,15 @@ CMD_STALL(struct etna_cmd_stream *stream, uint32_t from, 
> uint32_t to)
>  void
>  etna_stall(struct etna_cmd_stream *stream, uint32_t from, uint32_t to)
>  {
> -   etna_cmd_stream_reserve(stream, 4);
> +   bool blt = (from == SYNC_RECIPIENT_BLT) || (to == SYNC_RECIPIENT_BLT);
> +   etna_cmd_stream_reserve(stream, blt ? 8 : 4);
>
> +   if (blt) {
> +  etna_emit_load_state(stream, VIVS_BLT_ENABLE >> 2, 1, 0);
> +  etna_cmd_stream_emit(stream, 1);
> +   }
> +
> +   /* TODO: set bit 28/29 of token after BLT COPY_BUFFER */
> etna_emit_load_state(stream, VIVS_GL_SEMAPHORE_TOKEN >> 2, 1, 0);
> etna_cmd_stream_emit(stream, VIVS_GL_SEMAPHORE_TOKEN_FROM(from) | 
> VIVS_GL_SEMAPHORE_TOKEN_TO(to));
>
> @@ -73,6 +81,11 @@ etna_stall(struct etna_cmd_stream *stream, uint32_t from, 
> uint32_t to)
>etna_emit_load_state(stream, VIVS_GL_STALL_TOKEN >> 2, 1, 0);
>etna_cmd_stream_emit(stream, VIVS_GL_STALL_TOKEN_FROM(from) | 
> VIVS_GL_STALL_TOKEN_TO(to));
> }
> +
> +   if (blt) {
> +  etna_emit_load_state(stream, VIVS_BLT_ENABLE >> 2, 1, 0);
> +  etna_cmd_stream_emit(stream, 0);
> +   }
>  }
>
>  static void
> --
> 2.7.4
>



-- 
greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info
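
For readers following along, the shape of the patched etna_stall() can be
sketched as below. This is only an illustration: the opcodes are placeholders
(the real register addresses come from the rnndb-generated headers), and the
stream type and function names are hypothetical, not the etnaviv API. It
assumes every state write costs two words (header plus value), which is what
the change from reserving 4 words to reserving 8 suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder opcodes for the sketch only. */
enum sketch_op {
   SKETCH_OP_BLT_ENABLE,
   SKETCH_OP_SEMAPHORE,
   SKETCH_OP_STALL,
};

struct sketch_stream {
   uint32_t words[16];
   size_t count;
};

/* Each state write in this sketch takes two words: header, then value. */
static void
sketch_emit(struct sketch_stream *s, enum sketch_op op, uint32_t value)
{
   assert(s->count + 2 <= sizeof(s->words) / sizeof(s->words[0]));
   s->words[s->count++] = (uint32_t)op;
   s->words[s->count++] = value;
}

/* The semaphore/stall pair is bracketed by BLT enable (1) and BLT
 * disable (0) only when the BLT engine is the source or the target. */
static size_t
sketch_stall(struct sketch_stream *s, bool blt_involved, uint32_t token)
{
   size_t before = s->count;

   if (blt_involved)
      sketch_emit(s, SKETCH_OP_BLT_ENABLE, 1);

   sketch_emit(s, SKETCH_OP_SEMAPHORE, token);
   sketch_emit(s, SKETCH_OP_STALL, token);

   if (blt_involved)
      sketch_emit(s, SKETCH_OP_BLT_ENABLE, 0);

   return s->count - before; /* 4 words without BLT, 8 with */
}
```

The returned word count matches the two reserve sizes used in the patch.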

Re: [Mesa-dev] [PATCH v2 02/17] etnaviv: Put HALTI level in specs

2017-11-22 Thread Lucas Stach

I've pushed the first 2 patches of this series to upstream, as they
look fine, have seen enough review and I want to have them out of the
way should we need another rev of the series.

Regards,
Lucas

Am Samstag, den 18.11.2017, 10:44 +0100 schrieb Wladimir J. van der Laan:
> The HALTI level is an indication of the gross architecture of the GPU.
> It determines to a significant extent what feature level the GPU has, what
> state (especially frontend state) exists, and where it is located.
> 
> > Signed-off-by: Wladimir J. van der Laan 
> > Reviewed-by: Christian Gmeiner 
> ---
>  src/gallium/drivers/etnaviv/etnaviv_internal.h |  2 ++
>  src/gallium/drivers/etnaviv/etnaviv_screen.c   | 21 +
>  2 files changed, 23 insertions(+)
> 
> Unchanged since v1.
> 
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_internal.h 
> b/src/gallium/drivers/etnaviv/etnaviv_internal.h
> index 707a1e0..48dd5bf 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_internal.h
> +++ b/src/gallium/drivers/etnaviv/etnaviv_internal.h
> @@ -60,6 +60,8 @@
>  
>  /* GPU chip 3D specs */
>  struct etna_specs {
> +   /* HALTI (gross architecture) level. -1 for pre-HALTI. */
> +   int halti : 8;
> /* supports SUPERTILE (64x64) tiling? */
> unsigned can_supertile : 1;
> /* needs z=(z+w)/2, for older GCxxx */
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
> b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> index eaf3ca2..9a957ab 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> @@ -690,6 +690,27 @@ etna_get_specs(struct etna_screen *screen)
> }
> screen->specs.num_constants = val;
>  
> +   /* Figure out gross GPU architecture. See rnndb/common.xml for a specific
> +* description of the differences. */
> +   if (VIV_FEATURE(screen, chipMinorFeatures5, HALTI5))
> +  screen->specs.halti = 5; /* New GC7000/GC8x00  */
> +   else if (VIV_FEATURE(screen, chipMinorFeatures5, HALTI4))
> +  screen->specs.halti = 4; /* Old GC7000/GC7400 */
> +   else if (VIV_FEATURE(screen, chipMinorFeatures5, HALTI3))
> +  screen->specs.halti = 3; /* None? */
> +   else if (VIV_FEATURE(screen, chipMinorFeatures4, HALTI2))
> +  screen->specs.halti = 2; /* GC2500/GC3000/GC5000/GC6400 */
> +   else if (VIV_FEATURE(screen, chipMinorFeatures2, HALTI1))
> +  screen->specs.halti = 1; /* GC900/GC4000/GC7000UL */
> +   else if (VIV_FEATURE(screen, chipMinorFeatures1, HALTI0))
> +  screen->specs.halti = 0; /* GC880/GC2000/GC7000TM */
> +   else
> +  screen->specs.halti = -1; /* GC7000nanolite / pre-GC2000 except GC880 
> */
> +   if (screen->specs.halti >= 0)
> +  DBG("etnaviv: GPU arch: HALTI%d\n", screen->specs.halti);
> +   else
> +  DBG("etnaviv: GPU arch: pre-HALTI\n");
> +
> screen->specs.can_supertile =
>    VIV_FEATURE(screen, chipMinorFeatures0, SUPER_TILED);
> screen->specs.bits_per_tile =
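
The detection cascade quoted above boils down to "take the highest HALTI bit
that is set, else -1". A minimal sketch of that idea follows. The bit layout
is hypothetical: the driver reads the real bits from several
chipMinorFeatures words through VIV_FEATURE(), while this sketch packs them
into one word for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: HALTI<n> is bit n of a single feature word. */
#define SKETCH_HALTI_BIT(n) (1u << (n))

struct sketch_specs {
   /* Explicitly signed so that -1 (pre-HALTI) is representable. */
   signed int halti : 8;
};

/* Return the highest HALTI level whose bit is set, or -1 for pre-HALTI. */
static int
sketch_halti_level(uint32_t features)
{
   for (int level = 5; level >= 0; level--) {
      if (features & SKETCH_HALTI_BIT(level))
         return level;
   }
   return -1;
}
```

Checking from the highest level downwards matters because newer GPUs
presumably also advertise the older HALTI bits.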

Re: [Mesa-dev] [PATCH v2 04/17] etnaviv: Use only DRAW_INSTANCED on GC3000+

2017-11-22 Thread Lucas Stach

Hi Wladimir,

Am Samstag, den 18.11.2017, 10:44 +0100 schrieb Wladimir J. van der Laan:
> The blob does this, as DRAW_INSTANCED can fully replace all the other
> draw commands. It is also required to handle integer vertex formats.
> The other path is only there for compatibility and might go away (or at
> least rot and become buggy through disuse) in newer hardware.
> 
> As a side effect this changes the behavior for hardware older than GC3000,
> by no longer using the index offset for DRAW_INDEXED but instead adding it
> to INDEX_ADDR. This should make no difference.
> 
> Preparation for GC7000 support.

I haven't looked into it much yet, but this commit breaks QT5 GUI
rendering on GC3000 for me. It seems like the DRAWs get dropped on the
floor with nothing being rendered. I didn't spot anything obviously
wrong in this patch from a quick look, so would be glad if you could
look into this.

Regards,
Lucas

> Signed-off-by: Wladimir J. van der Laan 
> > Reviewed-by: Philipp Zabel 
> ---
>  src/gallium/drivers/etnaviv/etnaviv_context.c | 16 
>  src/gallium/drivers/etnaviv/etnaviv_emit.h| 21 +
>  2 files changed, 33 insertions(+), 4 deletions(-)
> 
> Unchanged since v1, only commit message updated as noted by Philipp Zabel.
> 
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.c 
> b/src/gallium/drivers/etnaviv/etnaviv_context.c
> index 65c20d2..5aa9c66 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_context.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_context.c
> @@ -188,6 +188,8 @@ etna_draw_vbo(struct pipe_context *pctx, const struct 
> pipe_draw_info *info)
>   BUG("Index buffer upload failed.");
>   return;
>    }
> +  /* Add start to index offset, when rendering indexed */
> +  index_offset += info->start * info->index_size;
>  
>    ctx->index_buffer.FE_INDEX_STREAM_BASE_ADDR.bo = 
> etna_resource(indexbuf)->bo;
>    ctx->index_buffer.FE_INDEX_STREAM_BASE_ADDR.offset = index_offset;
> @@ -273,10 +275,16 @@ etna_draw_vbo(struct pipe_context *pctx, const struct 
> pipe_draw_info *info)
> /* First, sync state, then emit DRAW_PRIMITIVES or 
> DRAW_INDEXED_PRIMITIVES */
> etna_emit_state(ctx);
>  
> -   if (info->index_size)
> -  etna_draw_indexed_primitives(ctx->stream, draw_mode, info->start, 
> prims, info->index_bias);
> -   else
> -  etna_draw_primitives(ctx->stream, draw_mode, info->start, prims);
> +   if (ctx->specs.halti >= 2) {
> +  /* On HALTI2+ (GC3000 and higher) only use instanced drawing commands, 
> as the blob does */
> +  etna_draw_instanced(ctx->stream, info->index_size, draw_mode, 1,
> + info->count, info->index_size ? info->index_bias : info->start);
> +   } else {
> +  if (info->index_size)
> + etna_draw_indexed_primitives(ctx->stream, draw_mode, 0, prims, 
> info->index_bias);
> +  else
> + etna_draw_primitives(ctx->stream, draw_mode, info->start, prims);
> +   }
>  
> if (DBG_ENABLED(ETNA_DBG_DRAW_STALL)) {
>    /* Stall the FE after every draw operation.  This allows better
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_emit.h 
> b/src/gallium/drivers/etnaviv/etnaviv_emit.h
> index e0c0eda..3c3d129 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_emit.h
> +++ b/src/gallium/drivers/etnaviv/etnaviv_emit.h
> @@ -117,6 +117,27 @@ etna_draw_indexed_primitives(struct etna_cmd_stream 
> *stream,
> etna_cmd_stream_emit(stream, 0);
>  }
>  
> +/* important: this takes a vertex count, not a primitive count */
> +static inline void
> +etna_draw_instanced(struct etna_cmd_stream *stream,
> +uint32_t indexed, uint32_t primitive_type,
> +uint32_t instance_count,
> +uint32_t vertex_count, uint32_t offset)
> +{
> +   etna_cmd_stream_reserve(stream, 3 + 1);
> +   etna_cmd_stream_emit(stream,
> +  VIV_FE_DRAW_INSTANCED_HEADER_OP_DRAW_INSTANCED |
> +  COND(indexed, VIV_FE_DRAW_INSTANCED_HEADER_INDEXED) |
> +  VIV_FE_DRAW_INSTANCED_HEADER_TYPE(primitive_type) |
> +  VIV_FE_DRAW_INSTANCED_HEADER_INSTANCE_COUNT_LO(instance_count & 
> 0xffff));
> +   etna_cmd_stream_emit(stream,
> +  VIV_FE_DRAW_INSTANCED_COUNT_INSTANCE_COUNT_HI(instance_count >> 16) |
> +  VIV_FE_DRAW_INSTANCED_COUNT_VERTEX_COUNT(vertex_count));
> +   etna_cmd_stream_emit(stream,
> +  VIV_FE_DRAW_INSTANCED_START_INDEX(offset));
> +   etna_cmd_stream_emit(stream, 0);
> +}
> +
>  void
>  etna_emit_state(struct etna_context *ctx);
>  
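
Two bits of arithmetic in the quoted patch may be easier to check in
isolation: folding info->start into the index buffer offset, and splitting
the instance count across two command words. The sketch below uses
hypothetical helper names, and the 16-bit low mask is an assumption inferred
from the ">> 16" used for the high part.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset into the index buffer once "start" is folded into the base
 * address instead of being passed to the draw command. */
static uint32_t
sketch_index_offset(uint32_t base_offset, uint32_t start, uint32_t index_size)
{
   return base_offset + start * index_size;
}

/* Split the instance count: the low 16 bits go into the header word, the
 * remaining high bits into the count word. */
static void
sketch_split_instance_count(uint32_t instance_count,
                            uint32_t *lo, uint32_t *hi)
{
   *lo = instance_count & 0xffff;
   *hi = instance_count >> 16;
}
```

With 16-bit indices (index_size 2) and start 3, the base address moves
forward by 6 bytes, so the draw itself can begin at index 0.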

Re: [Mesa-dev] [PATCH 15/15] radeonsi: enable gs support for nir backend

2017-11-22 Thread Ilia Mirkin

On Wed, Nov 22, 2017 at 5:30 AM, Timothy Arceri  wrote:
> ---
>  src/gallium/drivers/radeonsi/si_pipe.c   |  3 ++-
>  src/gallium/drivers/radeonsi/si_shader_nir.c | 10 --
>  src/mesa/state_tracker/st_glsl_to_nir.cpp| 12 
>  3 files changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
> b/src/gallium/drivers/radeonsi/si_pipe.c
> index b3d8ae508b..dffdab5d41 100644
> --- a/src/gallium/drivers/radeonsi/si_pipe.c
> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
> @@ -542,21 +542,21 @@ static int si_get_param(struct pipe_screen* pscreen, 
> enum pipe_cap param)
> case PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT:
> case PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT:
> case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
> case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
> case PIPE_CAP_MAX_VERTEX_STREAMS:
> case PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT:
> return 4;
>
> case PIPE_CAP_GLSL_FEATURE_LEVEL:
> if (sscreen->b.debug_flags & DBG(NIR))
> -   return 140; /* no geometry and tessellation shaders 
> yet */
> +   return 150; /* no tessellation shaders yet */

330 presumably?
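
One way to read this comment: as far as the shader stages go, GLSL 1.50 and
3.30 are equivalent, since geometry shaders arrived in 1.50 and tessellation
only in 4.00. A small sketch of that stage table follows; the enum and
function names are hypothetical, and it says nothing about the other
features that separate 1.50 from 3.30.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_stage {
   SKETCH_STAGE_VERTEX,
   SKETCH_STAGE_FRAGMENT,
   SKETCH_STAGE_GEOMETRY,
   SKETCH_STAGE_TESSELLATION,
   SKETCH_STAGE_COMPUTE,
};

/* Is the stage part of core GLSL at this version (extensions ignored)? */
static bool
sketch_stage_in_core_glsl(int glsl_version, enum sketch_stage stage)
{
   switch (stage) {
   case SKETCH_STAGE_GEOMETRY:
      return glsl_version >= 150;
   case SKETCH_STAGE_TESSELLATION:
      return glsl_version >= 400;
   case SKETCH_STAGE_COMPUTE:
      return glsl_version >= 430;
   default:
      return true; /* vertex and fragment are always available */
   }
}
```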

Re: [Mesa-dev] [PATCH v2] meson: add logic to select apple and windows dri

2017-11-22 Thread Eric Engestrom

On Tuesday, 2017-11-21 10:50:29 -0800, Dylan Baker wrote:
> Quoting Eric Engestrom (2017-11-21 10:38:25)
> > On Tuesday, 2017-11-21 10:21:07 -0800, Dylan Baker wrote:
> > > This is still not fully correct (Haiku and BSD, notably, are probably
> > > not correct), but Linux is not regressed and this should be correct for
> > > macOS and Windows.
> > > 
> > > v2: - set the dri_platform to windows on Cygwin as well (Jon)
> > 
> > R-b stands
> > 
> > > 
> > > Signed-off-by: Dylan Baker 
> > > ---
> > >  meson.build | 15 +--
> > >  1 file changed, 13 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/meson.build b/meson.build
> > > index 52f2c1cb0d0..4248cbcfd7e 100644
> > > --- a/meson.build
> > > +++ b/meson.build
> > > @@ -187,8 +187,19 @@ if with_dri_i915
> > >dep_libdrm_intel = dependency('libdrm_intel', version : '>= 2.4.75')
> > >  endif
> > >  
> > > -# TODO: other OSes
> > > -with_dri_platform = 'drm'
> > > +# TODO: gnu 
> > 
> > I missed that comment the first time around; I don't understand what it
> > means?
> 
> The autotools build has a handler for setting the dri_platform to 'none' on
> gnu* (which I assume to be Hurd). See configure.ac:1513
> 
> As far as I know meson doesn't support Hurd ATM (though I doubt they'd turn
> away patches for it).
> 
> We can drop the TODO if you'd prefer, I just like to note things in the
> autotools/scons build that aren't currently supported in the meson build.

No, I think keeping the TODO is good to indicate something that is
handled by the other build system(s), even if they might never be
"fixed" (eg. if meson never gets ported to hurd).

I guess this case is already covered by your `else` though, so maybe
move the comment there, and make it more than 3 letters? :P

Don't let this stop you from pushing it though, it's really a nitpick ;)
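
For anyone skimming the thread, the selection being discussed can be sketched
as a plain lookup. This is written in C purely for illustration (the real
logic lives in meson.build), the function name is hypothetical, and the
fall-through to "none" is my assumption about what the `else` branch does.

```c
#include <assert.h>
#include <string.h>

/* Map a host system name (as meson's host_machine.system() would report
 * it) to a dri platform string. */
static const char *
sketch_dri_platform(const char *host_system)
{
   if (strcmp(host_system, "darwin") == 0)
      return "apple";
   if (strcmp(host_system, "windows") == 0 ||
       strcmp(host_system, "cygwin") == 0)
      return "windows";
   if (strcmp(host_system, "linux") == 0)
      return "drm";
   /* Assumption: everything else, including gnu (Hurd), ends up here. */
   return "none";
}
```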

[Mesa-dev] [PATCH mesa] genxml: fix assert guards

2017-11-22 Thread Eric Engestrom

This removes a few hundred warnings on debug builds with asserts off.

Signed-off-by: Eric Engestrom 
---
 src/intel/genxml/gen_pack_header.py | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/intel/genxml/gen_pack_header.py 
b/src/intel/genxml/gen_pack_header.py
index 1a5d193d228b0fc4cefa..e6cea8646ff3aabc6db6 100644
--- a/src/intel/genxml/gen_pack_header.py
+++ b/src/intel/genxml/gen_pack_header.py
@@ -73,7 +73,7 @@
 {
__gen_validate_value(v);
 
-#if DEBUG
+#ifndef NDEBUG
const int width = end - start + 1;
if (width < 64) {
   const uint64_t max = (1ull << width) - 1;
@@ -91,7 +91,7 @@
 
__gen_validate_value(v);
 
-#if DEBUG
+#ifndef NDEBUG
if (width < 64) {
   const int64_t max = (1ll << (width - 1)) - 1;
   const int64_t min = -(1ll << (width - 1));
@@ -108,7 +108,7 @@
 __gen_offset(uint64_t v, uint32_t start, uint32_t end)
 {
__gen_validate_value(v);
-#if DEBUG
+#ifndef NDEBUG
uint64_t mask = (~0ull >> (64 - (end - start + 1))) << start;
 
assert((v & ~mask) == 0);
@@ -131,7 +131,7 @@
 
const float factor = (1 << fract_bits);
 
-#if DEBUG
+#ifndef NDEBUG
const float max = ((1 << (end - start)) - 1) / factor;
const float min = -(1 << (end - start)) / factor;
assert(min <= v && v <= max);
@@ -150,7 +150,7 @@
 
const float factor = (1 << fract_bits);
 
-#if DEBUG
+#ifndef NDEBUG
const float max = ((1 << (end - start + 1)) - 1) / factor;
const float min = 0.0f;
assert(min <= v && v <= max);
-- 
Cheers,
  Eric
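
For context, here is a minimal standalone sketch of the pattern, assuming the
warnings come from variables that exist only to feed an assert(). With
`#if DEBUG`, a build that defines both DEBUG and NDEBUG still declares them
while assert() expands to nothing. The function name is hypothetical and only
loosely modelled on the generated helpers.

```c
#include <assert.h>
#include <stdint.h>

static inline uint64_t
sketch_pack_uint(uint64_t v, uint32_t start, uint32_t end)
{
#ifndef NDEBUG
   /* Only needed by the assert below, so it shares the assert's guard. */
   const int width = end - start + 1;
   if (width < 64) {
      const uint64_t max = (1ull << width) - 1;
      assert(v <= max);
   }
#else
   (void)end; /* keep the parameter "used" when asserts are compiled out */
#endif

   return v << start;
}
```

Guarding on NDEBUG ties the helper variables to the exact condition under
which assert() is active, so the two can never get out of sync.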


Re: [Mesa-dev] [PATCH 15/15] radeonsi: enable gs support for nir backend

2017-11-22 Thread Timothy Arceri




On 22/11/17 21:30, Timothy Arceri wrote:

---
  src/gallium/drivers/radeonsi/si_pipe.c   |  3 ++-
  src/gallium/drivers/radeonsi/si_shader_nir.c | 10 --
  src/mesa/state_tracker/st_glsl_to_nir.cpp| 12 
  3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c 
b/src/gallium/drivers/radeonsi/si_pipe.c
index b3d8ae508b..dffdab5d41 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -542,21 +542,21 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
case PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT:
case PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT:
case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
case PIPE_CAP_MAX_VERTEX_STREAMS:
case PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT:
return 4;
  
  	case PIPE_CAP_GLSL_FEATURE_LEVEL:

if (sscreen->b.debug_flags & DBG(NIR))
-   return 140; /* no geometry and tessellation shaders yet 
*/
+   return 150; /* no tessellation shaders yet */
if (si_have_tgsi_compute(sscreen))
return 450;
return 420;
  
  	case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:

return MIN2(sscreen->b.info.max_alloc_size, INT_MAX);
  
  	case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:

case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
@@ -746,20 +746,21 @@ static int si_get_shader_param(struct pipe_screen* 
pscreen,
return SI_NUM_SAMPLERS;
case PIPE_SHADER_CAP_MAX_SHADER_BUFFERS:
return SI_NUM_SHADER_BUFFERS;
case PIPE_SHADER_CAP_MAX_SHADER_IMAGES:
return SI_NUM_IMAGES;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
case PIPE_SHADER_CAP_PREFERRED_IR:
if (sscreen->b.debug_flags & DBG(NIR) &&
(shader == PIPE_SHADER_VERTEX ||
+shader == PIPE_SHADER_GEOMETRY ||
 shader == PIPE_SHADER_FRAGMENT))
return PIPE_SHADER_IR_NIR;
return PIPE_SHADER_IR_TGSI;
case PIPE_SHADER_CAP_LOWER_IF_THRESHOLD:
return 4;
  
  	/* Supported boolean features. */

case PIPE_SHADER_CAP_TGSI_CONT_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED:
case PIPE_SHADER_CAP_INDIRECT_TEMP_ADDR:
diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c 
b/src/gallium/drivers/radeonsi/si_shader_nir.c
index 54f79ba0c3..9396403bf0 100644
--- a/src/gallium/drivers/radeonsi/si_shader_nir.c
+++ b/src/gallium/drivers/radeonsi/si_shader_nir.c
@@ -123,20 +123,21 @@ static void scan_instruction(struct tgsi_shader_info 
*info,
}
  }
  
  void si_nir_scan_shader(const struct nir_shader *nir,

struct tgsi_shader_info *info)
  {
nir_function *func;
unsigned i;
  
  	assert(nir->info.stage == MESA_SHADER_VERTEX ||

+  nir->info.stage == MESA_SHADER_GEOMETRY ||
   nir->info.stage == MESA_SHADER_FRAGMENT);
  
  	info->processor = pipe_shader_type_from_mesa(nir->info.stage);

info->num_tokens = 2; /* indicate that the shader is non-empty */
info->num_instructions = 2;
  
  	if (nir->info.stage == MESA_SHADER_GEOMETRY) {

info->properties[TGSI_PROPERTY_GS_INPUT_PRIM] = 
nir->info.gs.input_primitive;
info->properties[TGSI_PROPERTY_GS_OUTPUT_PRIM] = 
nir->info.gs.output_primitive;
info->properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES] = 
nir->info.gs.vertices_out;
@@ -144,29 +145,30 @@ void si_nir_scan_shader(const struct nir_shader *nir,
}
  
  	i = 0;

uint64_t processed_inputs = 0;
unsigned num_inputs = 0;
nir_foreach_variable(variable, &nir->inputs) {
unsigned semantic_name, semantic_index;
unsigned attrib_count = 
glsl_count_attribute_slots(variable->type,
   
nir->info.stage == MESA_SHADER_VERTEX);
  
-		assert(attrib_count == 1 && "not implemented");

-
/* Vertex shader inputs don't have semantics. The state
 * tracker has already mapped them to attributes via
 * variable->data.driver_location.
 */
if (nir->info.stage == MESA_SHADER_VERTEX)
continue;
  
+		assert(nir->info.stage != MESA_SHADER_FRAGMENT ||

+  (attrib_count == 1 && "not implemented"));
+
/* Fragment shader position is a system value. */
if (nir->info.stage == MESA_SHADER_FRAGMENT &&
variable->data.location == VARYING_SLOT_POS) {

[Mesa-dev] [PATCH 10/15] radeonsi: pass llvm type to lds_load()

2017-11-22 Thread Timothy Arceri

v2: use LLVMBuildBitCast() directly
---
 src/gallium/drivers/radeonsi/si_shader.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index c68ffad8ea..ce90c7beb5 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1077,50 +1077,49 @@ static LLVMValueRef buffer_load(struct 
lp_build_tgsi_context *bld_base,
 }
 
 /**
  * Load from LDS.
  *
  * \param type output value type
  * \param swizzle  offset (typically 0..3); it can be ~0, which loads a 
vec4
  * \param dw_addr  address in dwords
  */
 static LLVMValueRef lds_load(struct lp_build_tgsi_context *bld_base,
-enum tgsi_opcode_type type, unsigned swizzle,
+LLVMTypeRef type, unsigned swizzle,
 LLVMValueRef dw_addr)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
LLVMValueRef value;
 
if (swizzle == ~0) {
LLVMValueRef values[TGSI_NUM_CHANNELS];
 
for (unsigned chan = 0; chan < TGSI_NUM_CHANNELS; chan++)
values[chan] = lds_load(bld_base, type, chan, dw_addr);
 
return lp_build_gather_values(&ctx->gallivm, values,
  TGSI_NUM_CHANNELS);
}
 
dw_addr = lp_build_add(&bld_base->uint_bld, dw_addr,
LLVMConstInt(ctx->i32, swizzle, 0));
 
value = ac_lds_load(&ctx->ac, dw_addr);
-   if (tgsi_type_is_64bit(type)) {
+   if (llvm_type_is_64bit(ctx, type)) {
LLVMValueRef value2;
dw_addr = lp_build_add(&bld_base->uint_bld, dw_addr,
   ctx->i32_1);
value2 = ac_lds_load(&ctx->ac, dw_addr);
-   return si_llvm_emit_fetch_64bit(bld_base, 
tgsi2llvmtype(bld_base, type),
-   value, value2);
+   return si_llvm_emit_fetch_64bit(bld_base, type, value, value2);
}
 
-   return bitcast(bld_base, type, value);
+   return LLVMBuildBitCast(ctx->ac.builder, value, type, "");
 }
 
 /**
  * Store to LDS.
  *
  * \param swizzle  offset (typically 0..3)
  * \param dw_addr  address in dwords
  * \param valuevalue to store
  */
 static void lds_store(struct si_shader_context *ctx,
@@ -1162,41 +1161,41 @@ static LLVMValueRef fetch_input_tcs(
const struct tgsi_full_src_register *reg,
enum tgsi_opcode_type type, unsigned swizzle)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
LLVMValueRef dw_addr, stride;
 
stride = get_tcs_in_vertex_dw_stride(ctx);
dw_addr = get_tcs_in_current_patch_offset(ctx);
dw_addr = get_dw_address(ctx, NULL, reg, stride, dw_addr);
 
-   return lds_load(bld_base, type, swizzle, dw_addr);
+   return lds_load(bld_base, tgsi2llvmtype(bld_base, type), swizzle, 
dw_addr);
 }
 
 static LLVMValueRef fetch_output_tcs(
struct lp_build_tgsi_context *bld_base,
const struct tgsi_full_src_register *reg,
enum tgsi_opcode_type type, unsigned swizzle)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
LLVMValueRef dw_addr, stride;
 
if (reg->Register.Dimension) {
stride = get_tcs_out_vertex_dw_stride(ctx);
dw_addr = get_tcs_out_current_patch_offset(ctx);
dw_addr = get_dw_address(ctx, NULL, reg, stride, dw_addr);
} else {
dw_addr = get_tcs_out_current_patch_data_offset(ctx);
dw_addr = get_dw_address(ctx, NULL, reg, NULL, dw_addr);
}
 
-   return lds_load(bld_base, type, swizzle, dw_addr);
+   return lds_load(bld_base, tgsi2llvmtype(bld_base, type), swizzle, dw_addr);
 }
 
 static LLVMValueRef fetch_input_tes(
struct lp_build_tgsi_context *bld_base,
const struct tgsi_full_src_register *reg,
enum tgsi_opcode_type type, unsigned swizzle)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
LLVMValueRef buffer, base, addr;
 
@@ -1346,21 +1345,22 @@ static LLVMValueRef fetch_input_gs(
vtx_offset = unpack_param(ctx, 
ctx->param_gs_vtx45_offset,
  index % 2 ? 16 : 0, 16);
break;
default:
assert(0);
return NULL;
}
 
vtx_offset = LLVMBuildAdd(ctx->ac.builder, vtx_offset,
  LLVMConstInt(ctx->i32, param * 4, 0), 
"");
-   return lds_load(bld_base, type, swizzle, vtx_offset);
+   return lds_load(bld_base, tgsi2llvmtype(bld_base, type),
+   swizzle,

[Mesa-dev] [PATCH 11/15] radeonsi: create si_llvm_load_input_gs()

2017-11-22 Thread Timothy Arceri

This creates a common function that can be shared by the tgsi
and nir backends.

v2: use LLVMBuildBitCast() directly
---
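For readers following the GFX9 path of the shared function: the vertex
offset is picked out of three packed parameters (vtx01, vtx23, vtx45), each
carrying two 16-bit offsets, and the LDS dword address is then the offset
plus param * 4 plus the swizzle. A minimal sketch of that arithmetic in plain
C, assuming unpack_param() is a shift-and-mask; unpack_bits(),
gs_vertex_offset() and gfx9_lds_dword_addr() are hypothetical names, not
driver functions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for unpack_param(): extract `bitwidth` bits
 * starting at bit `rshift` (assumed shift-and-mask behaviour). */
static uint32_t unpack_bits(uint32_t value, unsigned rshift, unsigned bitwidth)
{
	assert(bitwidth > 0 && bitwidth < 32);
	return (value >> rshift) & ((1u << bitwidth) - 1u);
}

/* packed[0..2] model the vtx01, vtx23 and vtx45 offset parameters.
 * index / 2 selects the parameter, index % 2 selects the 16-bit half. */
static uint32_t gs_vertex_offset(const uint32_t packed[3], unsigned index)
{
	assert(index < 6);
	return unpack_bits(packed[index / 2], index % 2 ? 16 : 0, 16);
}

/* Dword address used for the LDS load: vertex offset, plus four dwords
 * per input slot, plus the channel (swizzle). */
static uint32_t gfx9_lds_dword_addr(uint32_t vtx_offset, unsigned param,
				    unsigned swizzle)
{
	return vtx_offset + param * 4 + swizzle;
}
```

With packed = { 0x00200010, 0x00400030, 0x00600050 }, vertex 1 resolves to
offset 0x20 and vertex 4 to 0x50.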
 src/gallium/drivers/radeonsi/si_shader.c  | 61 ++-
 src/gallium/drivers/radeonsi/si_shader_internal.h |  6 +++
 2 files changed, 44 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index ce90c7beb5..c2338089b3 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1297,47 +1297,42 @@ static void store_output_tcs(struct 
lp_build_tgsi_context *bld_base,
}
 
if (reg->Register.WriteMask == 0xF && !is_tess_factor) {
LLVMValueRef value = lp_build_gather_values(&ctx->gallivm,
values, 4);
ac_build_buffer_store_dword(&ctx->ac, buffer, value, 4, buf_addr,
base, 0, 1, 0, true, false);
}
 }
 
-static LLVMValueRef fetch_input_gs(
-   struct lp_build_tgsi_context *bld_base,
-   const struct tgsi_full_src_register *reg,
-   enum tgsi_opcode_type type,
-   unsigned swizzle)
+LLVMValueRef si_llvm_load_input_gs(struct ac_shader_abi *abi,
+  unsigned input_index,
+  unsigned vtx_offset_param,
+  LLVMTypeRef type,
+  unsigned swizzle)
 {
-   struct si_shader_context *ctx = si_shader_context(bld_base);
+   struct si_shader_context *ctx = si_shader_context_from_abi(abi);
+   struct lp_build_tgsi_context *bld_base = &ctx->bld_base;
struct si_shader *shader = ctx->shader;
struct lp_build_context *uint = &ctx->bld_base.uint_bld;
LLVMValueRef vtx_offset, soffset;
struct tgsi_shader_info *info = &shader->selector->info;
-   unsigned semantic_name = info->input_semantic_name[reg->Register.Index];
-   unsigned semantic_index = info->input_semantic_index[reg->Register.Index];
+   unsigned semantic_name = info->input_semantic_name[input_index];
+   unsigned semantic_index = info->input_semantic_index[input_index];
unsigned param;
LLVMValueRef value;
 
-   if (swizzle != ~0 && semantic_name == TGSI_SEMANTIC_PRIMID)
-   return get_primitive_id(ctx, swizzle);
-
-   if (!reg->Register.Dimension)
-   return NULL;
-
param = si_shader_io_get_unique_index(semantic_name, semantic_index);
 
/* GFX9 has the ESGS ring in LDS. */
if (ctx->screen->b.chip_class >= GFX9) {
-   unsigned index = reg->Dimension.Index;
+   unsigned index = vtx_offset_param;
 
switch (index / 2) {
case 0:
vtx_offset = unpack_param(ctx, 
ctx->param_gs_vtx01_offset,
  index % 2 ? 16 : 0, 16);
break;
case 1:
vtx_offset = unpack_param(ctx, 
ctx->param_gs_vtx23_offset,
  index % 2 ? 16 : 0, 16);
break;
@@ -1345,56 +1340,76 @@ static LLVMValueRef fetch_input_gs(
vtx_offset = unpack_param(ctx, 
ctx->param_gs_vtx45_offset,
  index % 2 ? 16 : 0, 16);
break;
default:
assert(0);
return NULL;
}
 
vtx_offset = LLVMBuildAdd(ctx->ac.builder, vtx_offset,
  LLVMConstInt(ctx->i32, param * 4, 0), 
"");
-   return lds_load(bld_base, tgsi2llvmtype(bld_base, type),
-   swizzle, vtx_offset);
+   return lds_load(bld_base, type, swizzle, vtx_offset);
}
 
/* GFX6: input load from the ESGS ring in memory. */
if (swizzle == ~0) {
LLVMValueRef values[TGSI_NUM_CHANNELS];
unsigned chan;
for (chan = 0; chan < TGSI_NUM_CHANNELS; chan++) {
-   values[chan] = fetch_input_gs(bld_base, reg, type, chan);
+   values[chan] = si_llvm_load_input_gs(abi, input_index, vtx_offset_param,
+type, chan);
}
return lp_build_gather_values(&ctx->gallivm, values,
  TGSI_NUM_CHANNELS);
}
 
/* Get the vertex offset parameter on GFX6. */
-   unsigned vtx_offset_param = reg->Dimension.Index;
LLVMValueRef gs_vtx_offset = ctx->gs_vtx_offset[vtx_offset_param];
 
vtx_offset = lp_build_mul_imm(uint, gs_vtx_offset, 4);
 
soffset = LLVMConstInt(ctx->i32, (param * 4 + swizzle) * 256, 0);
 
value = ac_build_buffer_load(&ctx->ac,

[Mesa-dev] [PATCH 12/15] ac: add basic nir -> llvm type helper

2017-11-22 Thread Timothy Arceri

---
 src/amd/common/ac_nir_to_llvm.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 1ecdeca063..a38db0c9b7 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -151,20 +151,42 @@ struct nir_to_llvm_context {
uint64_t tess_patch_outputs_written;
 };
 
 static inline struct nir_to_llvm_context *
 nir_to_llvm_context_from_abi(struct ac_shader_abi *abi)
 {
struct nir_to_llvm_context *ctx = NULL;
return container_of(abi, ctx, abi);
 }
 
+static LLVMTypeRef
+nir2llvmtype(struct ac_nir_context *ctx,
+const struct glsl_type *type)
+{
+   switch (glsl_get_base_type(glsl_without_array(type))) {
+   case GLSL_TYPE_UINT:
+   case GLSL_TYPE_INT:
+   return ctx->ac.i32;
+   case GLSL_TYPE_UINT64:
+   case GLSL_TYPE_INT64:
+   return ctx->ac.i64;
+   case GLSL_TYPE_DOUBLE:
+   return ctx->ac.f64;
+   case GLSL_TYPE_FLOAT:
+   return ctx->ac.f32;
+   default:
+   assert(!"Unsupported type in nir2llvmtype()");
+   break;
+   }
+   return 0;
+}
+
 static LLVMValueRef get_sampler_desc(struct ac_nir_context *ctx,
 const nir_deref_var *deref,
 enum ac_descriptor_type desc_type,
 const nir_tex_instr *instr,
 bool image, bool write);
 
 static unsigned radeon_llvm_reg_index_soa(unsigned index, unsigned chan)
 {
return (index * 4) + chan;
 }
-- 
2.14.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 15/15] radeonsi: enable gs support for nir backend

2017-11-22 Thread Timothy Arceri

---
 src/gallium/drivers/radeonsi/si_pipe.c   |  3 ++-
 src/gallium/drivers/radeonsi/si_shader_nir.c | 10 --
 src/mesa/state_tracker/st_glsl_to_nir.cpp| 12 
 3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index b3d8ae508b..dffdab5d41 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -542,21 +542,21 @@ static int si_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
case PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT:
case PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT:
case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
case PIPE_CAP_MAX_VERTEX_STREAMS:
case PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT:
return 4;
 
case PIPE_CAP_GLSL_FEATURE_LEVEL:
if (sscreen->b.debug_flags & DBG(NIR))
-   return 140; /* no geometry and tessellation shaders yet */
+   return 150; /* no tessellation shaders yet */
if (si_have_tgsi_compute(sscreen))
return 450;
return 420;
 
case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
return MIN2(sscreen->b.info.max_alloc_size, INT_MAX);
 
case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY:
case PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY:
@@ -746,20 +746,21 @@ static int si_get_shader_param(struct pipe_screen* 
pscreen,
return SI_NUM_SAMPLERS;
case PIPE_SHADER_CAP_MAX_SHADER_BUFFERS:
return SI_NUM_SHADER_BUFFERS;
case PIPE_SHADER_CAP_MAX_SHADER_IMAGES:
return SI_NUM_IMAGES;
case PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT:
return 32;
case PIPE_SHADER_CAP_PREFERRED_IR:
if (sscreen->b.debug_flags & DBG(NIR) &&
(shader == PIPE_SHADER_VERTEX ||
+shader == PIPE_SHADER_GEOMETRY ||
 shader == PIPE_SHADER_FRAGMENT))
return PIPE_SHADER_IR_NIR;
return PIPE_SHADER_IR_TGSI;
case PIPE_SHADER_CAP_LOWER_IF_THRESHOLD:
return 4;
 
/* Supported boolean features. */
case PIPE_SHADER_CAP_TGSI_CONT_SUPPORTED:
case PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED:
case PIPE_SHADER_CAP_INDIRECT_TEMP_ADDR:
diff --git a/src/gallium/drivers/radeonsi/si_shader_nir.c b/src/gallium/drivers/radeonsi/si_shader_nir.c
index 54f79ba0c3..9396403bf0 100644
--- a/src/gallium/drivers/radeonsi/si_shader_nir.c
+++ b/src/gallium/drivers/radeonsi/si_shader_nir.c
@@ -123,20 +123,21 @@ static void scan_instruction(struct tgsi_shader_info 
*info,
}
 }
 
 void si_nir_scan_shader(const struct nir_shader *nir,
struct tgsi_shader_info *info)
 {
nir_function *func;
unsigned i;
 
assert(nir->info.stage == MESA_SHADER_VERTEX ||
+  nir->info.stage == MESA_SHADER_GEOMETRY ||
   nir->info.stage == MESA_SHADER_FRAGMENT);
 
info->processor = pipe_shader_type_from_mesa(nir->info.stage);
info->num_tokens = 2; /* indicate that the shader is non-empty */
info->num_instructions = 2;
 
if (nir->info.stage == MESA_SHADER_GEOMETRY) {
info->properties[TGSI_PROPERTY_GS_INPUT_PRIM] = 
nir->info.gs.input_primitive;
info->properties[TGSI_PROPERTY_GS_OUTPUT_PRIM] = 
nir->info.gs.output_primitive;
info->properties[TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES] = 
nir->info.gs.vertices_out;
@@ -144,29 +145,30 @@ void si_nir_scan_shader(const struct nir_shader *nir,
}
 
i = 0;
uint64_t processed_inputs = 0;
unsigned num_inputs = 0;
nir_foreach_variable(variable, &nir->inputs) {
unsigned semantic_name, semantic_index;
unsigned attrib_count = 
glsl_count_attribute_slots(variable->type,
   
nir->info.stage == MESA_SHADER_VERTEX);
 
-   assert(attrib_count == 1 && "not implemented");
-
/* Vertex shader inputs don't have semantics. The state
 * tracker has already mapped them to attributes via
 * variable->data.driver_location.
 */
if (nir->info.stage == MESA_SHADER_VERTEX)
continue;
 
+   assert(nir->info.stage != MESA_SHADER_FRAGMENT ||
+  (attrib_count == 1 && "not implemented"));
+
/* Fragment shader position is a system value. */
if (nir->info.stage == MESA_SHADER_FRAGMENT &&
variable->data.location == VARYING_SLOT_POS) {
if

[Mesa-dev] [PATCH 09/15] radeonsi: add llvm_type_is_64bit() helper

2017-11-22 Thread Timothy Arceri

---
 src/gallium/drivers/radeonsi/si_shader.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 0df9454edb..c68ffad8ea 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -96,20 +96,29 @@ static void si_build_ps_epilog_function(struct 
si_shader_context *ctx,
 /* Ideally pass the sample mask input to the PS epilog as v14, which
  * is its usual location, so that the shader doesn't have to add v_mov.
  */
 #define PS_EPILOG_SAMPLEMASK_MIN_LOC 14
 
 enum {
CONST_ADDR_SPACE = 2,
LOCAL_ADDR_SPACE = 3,
 };
 
+static bool llvm_type_is_64bit(struct si_shader_context *ctx,
+  LLVMTypeRef type)
+{
+   if (type == ctx->ac.i64 || type == ctx->ac.f64)
+   return true;
+
+   return false;
+}
+
 static bool is_merged_shader(struct si_shader *shader)
 {
if (shader->selector->screen->b.chip_class <= VI)
return false;
 
return shader->key.as_ls ||
   shader->key.as_es ||
   shader->selector->type == PIPE_SHADER_TESS_CTRL ||
   shader->selector->type == PIPE_SHADER_GEOMETRY;
 }
-- 
2.14.3


[Mesa-dev] [PATCH 08/15] radeonsi: pass llvm type to si_llvm_emit_fetch_64bit()

2017-11-22 Thread Timothy Arceri

v2: use LLVMBuildBitCast() directly
---
 src/gallium/drivers/radeonsi/si_shader.c| 11 +++
 src/gallium/drivers/radeonsi/si_shader_internal.h   |  2 +-
 src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c | 17 ++---
 3 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 3dc988693e..0df9454edb 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1056,21 +1056,22 @@ static LLVMValueRef buffer_load(struct 
lp_build_tgsi_context *bld_base,
return LLVMBuildExtractElement(ctx->ac.builder, value,
LLVMConstInt(ctx->i32, swizzle, 0), "");
}
 
value = ac_build_buffer_load(&ctx->ac, buffer, 1, NULL, base, offset,
  swizzle * 4, 1, 0, can_speculate, false);
 
value2 = ac_build_buffer_load(&ctx->ac, buffer, 1, NULL, base, offset,
   swizzle * 4 + 4, 1, 0, can_speculate, false);
 
-   return si_llvm_emit_fetch_64bit(bld_base, type, value, value2);
+   return si_llvm_emit_fetch_64bit(bld_base, tgsi2llvmtype(bld_base, type),
+   value, value2);
 }
 
 /**
  * Load from LDS.
  *
  * \param type output value type
  * \param swizzle  offset (typically 0..3); it can be ~0, which loads a 
vec4
  * \param dw_addr  address in dwords
  */
 static LLVMValueRef lds_load(struct lp_build_tgsi_context *bld_base,
@@ -1092,21 +1093,22 @@ static LLVMValueRef lds_load(struct 
lp_build_tgsi_context *bld_base,
 
dw_addr = lp_build_add(&bld_base->uint_bld, dw_addr,
LLVMConstInt(ctx->i32, swizzle, 0));
 
value = ac_lds_load(&ctx->ac, dw_addr);
if (tgsi_type_is_64bit(type)) {
LLVMValueRef value2;
dw_addr = lp_build_add(&bld_base->uint_bld, dw_addr,
   ctx->i32_1);
value2 = ac_lds_load(&ctx->ac, dw_addr);
-   return si_llvm_emit_fetch_64bit(bld_base, type, value, value2);
+   return si_llvm_emit_fetch_64bit(bld_base, tgsi2llvmtype(bld_base, type),
+   value, value2);
}
 
return bitcast(bld_base, type, value);
 }
 
 /**
  * Store to LDS.
  *
  * \param swizzle  offset (typically 0..3)
  * \param dw_addr  address in dwords
@@ -1366,21 +1368,21 @@ static LLVMValueRef fetch_input_gs(
 
value = ac_build_buffer_load(&ctx->ac, ctx->esgs_ring, 1, ctx->i32_0,
 vtx_offset, soffset, 0, 1, 0, true, false);
if (tgsi_type_is_64bit(type)) {
LLVMValueRef value2;
soffset = LLVMConstInt(ctx->i32, (param * 4 + swizzle + 1) * 256, 0);
 
value2 = ac_build_buffer_load(&ctx->ac, ctx->esgs_ring, 1,
  ctx->i32_0, vtx_offset, soffset,
  0, 1, 0, true, false);
-   return si_llvm_emit_fetch_64bit(bld_base, type,
+   return si_llvm_emit_fetch_64bit(bld_base, tgsi2llvmtype(bld_base, type),
value, value2);
}
return bitcast(bld_base, type, value);
 }
 
 static int lookup_interp_param_index(unsigned interpolate, unsigned location)
 {
switch (interpolate) {
case TGSI_INTERPOLATE_CONSTANT:
return 0;
@@ -1969,21 +1971,22 @@ static LLVMValueRef fetch_constant(
 
return lp_build_gather_values(&ctx->gallivm, values, 4);
}
 
/* Split 64-bit loads. */
if (tgsi_type_is_64bit(type)) {
LLVMValueRef lo, hi;
 
lo = fetch_constant(bld_base, reg, TGSI_TYPE_UNSIGNED, swizzle);
hi = fetch_constant(bld_base, reg, TGSI_TYPE_UNSIGNED, swizzle + 1);
-   return si_llvm_emit_fetch_64bit(bld_base, type, lo, hi);
+   return si_llvm_emit_fetch_64bit(bld_base, tgsi2llvmtype(bld_base, type),
+   lo, hi);
}
 
idx = reg->Register.Index * 4 + swizzle;
if (reg->Register.Indirect) {
addr = si_get_indirect_index(ctx, ireg, 16, idx * 4);
} else {
addr = LLVMConstInt(ctx->i32, idx * 4, 0);
}
 
/* Fast path when user data SGPRs point to constant buffer 0 directly. 
*/
diff --git a/src/gallium/drivers/radeonsi/si_shader_internal.h b/src/gallium/drivers/radeonsi/si_shader_internal.h
index ebe11fad56..5219aa0f1d 100644
--- a/src/gallium/drivers/radeonsi/si_shader_internal.h
+++ b/src/gallium/drivers/radeonsi/si_shader_internal.h
@@ -263,21 +263,21 @@ void si_llvm_context_set_tgsi(struct si_shader_context 
*ctx,
 void si_llvm_create_func(struct si_shader_context *ctx,
 const char *name,

[Mesa-dev] [PATCH 06/15] radeonsi: add nir support for es epilogue

2017-11-22 Thread Timothy Arceri

v2: make use of existing si_tgsi_emit_epilogue()
---
 src/gallium/drivers/radeonsi/si_shader.c | 29 +
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index defe833c30..8682c91edd 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3175,55 +3175,56 @@ static void si_llvm_emit_ls_epilogue(struct 
ac_shader_abi *abi,
for (chan = 0; chan < 4; chan++) {
lds_store(ctx, chan, dw_addr,
  LLVMBuildLoad(ctx->ac.builder, addrs[4 * i + 
chan], ""));
}
}
 
if (ctx->screen->b.chip_class >= GFX9)
si_set_ls_return_value_for_tcs(ctx);
 }
 
-static void si_llvm_emit_es_epilogue(struct lp_build_tgsi_context *bld_base)
+static void si_llvm_emit_es_epilogue(struct ac_shader_abi *abi,
+unsigned max_outputs,
+LLVMValueRef *addrs)
 {
-   struct si_shader_context *ctx = si_shader_context(bld_base);
+   struct si_shader_context *ctx = si_shader_context_from_abi(abi);
struct si_shader *es = ctx->shader;
struct tgsi_shader_info *info = &es->selector->info;
LLVMValueRef soffset = LLVMGetParam(ctx->main_fn,
ctx->param_es2gs_offset);
LLVMValueRef lds_base = NULL;
unsigned chan;
int i;
 
if (ctx->screen->b.chip_class >= GFX9 && info->num_outputs) {
unsigned itemsize_dw = es->selector->esgs_itemsize / 4;
LLVMValueRef vertex_idx = ac_get_thread_id(&ctx->ac);
LLVMValueRef wave_idx = unpack_param(ctx, ctx->param_merged_wave_info, 24, 4);
vertex_idx = LLVMBuildOr(ctx->ac.builder, vertex_idx,
 LLVMBuildMul(ctx->ac.builder, wave_idx,
  LLVMConstInt(ctx->i32, 
64, false), ""), "");
lds_base = LLVMBuildMul(ctx->ac.builder, vertex_idx,
LLVMConstInt(ctx->i32, itemsize_dw, 0), 
"");
}
 
for (i = 0; i < info->num_outputs; i++) {
-   LLVMValueRef *out_ptr = ctx->outputs[i];
int param;
 
if (info->output_semantic_name[i] == 
TGSI_SEMANTIC_VIEWPORT_INDEX ||
info->output_semantic_name[i] == TGSI_SEMANTIC_LAYER)
continue;
 
param = 
si_shader_io_get_unique_index(info->output_semantic_name[i],
  
info->output_semantic_index[i]);
 
for (chan = 0; chan < 4; chan++) {
-   LLVMValueRef out_val = LLVMBuildLoad(ctx->ac.builder, out_ptr[chan], "");
+   LLVMValueRef out_val = LLVMBuildLoad(ctx->ac.builder, addrs[4 * i + chan], "");
out_val = ac_to_integer(&ctx->ac, out_val);
 
/* GFX9 has the ESGS ring in LDS. */
if (ctx->screen->b.chip_class >= GFX9) {
lds_store(ctx, param * 4 + chan, lds_base, 
out_val);
continue;
}
 
ac_build_buffer_store_dword(&ctx->ac,
ctx->esgs_ring,
@@ -4420,21 +4421,20 @@ static void create_function(struct si_shader_context 
*ctx)
 
/* VGPRs */
declare_vs_input_vgprs(ctx, &fninfo, &num_prolog_vgprs);
break;
}
 
declare_per_stage_desc_pointers(ctx, &fninfo, true);
declare_vs_specific_input_sgprs(ctx, &fninfo);
 
if (shader->key.as_es) {
-   assert(!shader->selector->nir);
ctx->param_es2gs_offset = add_arg(&fninfo, ARG_SGPR, ctx->i32);
} else if (shader->key.as_ls) {
/* no extra parameters */
} else {
if (shader->is_gs_copy_shader) {
fninfo.num_params = ctx->param_rw_buffers + 1;
fninfo.num_sgpr_params = fninfo.num_params;
}
 
/* The locations of the other parameters are assigned 
dynamically. */
@@ -5727,44 +5727,41 @@ static bool si_compile_tgsi_main(struct 
si_shader_context *ctx,
 bool is_monolithic)
 {
struct si_shader *shader = ctx->shader;
struct si_shader_selector *sel = shader->selector;
struct lp_build_tgsi_context *bld_base = &ctx->bld_base;
 
// TODO clean all this up!
switch (ctx->type) {
case PIPE_SHADER_VERTEX:
ctx->load_input = declare_input_vs;
-   if

[Mesa-dev] [PATCH 14/15] ac: add si_nir_load_input_gs() to the abi

2017-11-22 Thread Timothy Arceri

V2: make use of driver_location and don't expose NIR to the ABI.
---
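The address arithmetic in load_gs_input() differs between the two ESGS ring
layouts. The sketch below restates it with plain integers instead of LLVM
values; struct gs_input_addr and gs_input_address() are hypothetical names
used only for illustration, and the constants (4 and 256) are taken from the
diff below rather than derived here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type, for illustration only. */
struct gs_input_addr {
	uint32_t vtx_offset; /* gs_vtx_offset scaled by 4 (pre-GFX9 only) */
	uint32_t offset;     /* LDS dword address (GFX9) or soffset (pre-GFX9) */
};

static struct gs_input_addr
gs_input_address(bool is_gfx9, uint32_t gs_vtx_offset, unsigned param,
		 unsigned chan, unsigned const_index)
{
	struct gs_input_addr addr = {0, 0};
	/* One input slot spans four dwords; chan and const_index select
	 * the dword within (or past) that slot. */
	unsigned slot_dword = param * 4 + chan + const_index;

	if (is_gfx9) {
		/* ESGS ring in LDS: address in dwords. */
		addr.offset = gs_vtx_offset + slot_dword;
	} else {
		/* ESGS ring in memory: the vertex offset is scaled by 4
		 * and the per-dword offset by 256, as in the patch. */
		addr.vtx_offset = gs_vtx_offset * 4;
		addr.offset = slot_dword * 256;
	}
	return addr;
}
```

For param = 2, chan = 1, const_index = 0 this gives dword 9: an LDS address
of gs_vtx_offset + 9 on GFX9, and an soffset of 2304 otherwise.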
 src/amd/common/ac_nir_to_llvm.c   | 39 +++
 src/amd/common/ac_shader_abi.h|  9 ++
 src/gallium/drivers/radeonsi/si_shader.c  |  1 +
 src/gallium/drivers/radeonsi/si_shader_internal.h |  9 ++
 src/gallium/drivers/radeonsi/si_shader_nir.c  | 20 
 5 files changed, 64 insertions(+), 14 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 2c30652288..6dc74409a8 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2875,41 +2875,43 @@ load_tes_input(struct nir_to_llvm_context *ctx,
buf_addr = LLVMBuildAdd(ctx->builder, buf_addr, comp_offset, "");
 
result = ac_build_buffer_load(&ctx->ac, ctx->hs_ring_tess_offchip, instr->num_components, NULL,
  buf_addr, ctx->oc_lds, is_compact ? (4 * const_index) : 0, 1, 0, true, false);
result = trim_vector(&ctx->ac, result, instr->num_components);
result = LLVMBuildBitCast(ctx->builder, result, get_def_type(ctx->nir, &instr->dest.ssa), "");
return result;
 }
 
 static LLVMValueRef
-load_gs_input(struct nir_to_llvm_context *ctx,
- nir_intrinsic_instr *instr)
+load_gs_input(struct ac_shader_abi *abi,
+ unsigned location,
+ unsigned driver_location,
+ unsigned component,
+ unsigned num_components,
+ unsigned vertex_index,
+ unsigned const_index,
+ LLVMTypeRef type)
 {
-   LLVMValueRef indir_index, vtx_offset;
-   unsigned const_index;
+   struct nir_to_llvm_context *ctx = nir_to_llvm_context_from_abi(abi);
+   LLVMValueRef vtx_offset;
LLVMValueRef args[9];
unsigned param, vtx_offset_param;
LLVMValueRef value[4], result;
-   unsigned vertex_index;
-   get_deref_offset(ctx->nir, instr->variables[0],
-false, &vertex_index, NULL,
-&const_index, &indir_index);
+
vtx_offset_param = vertex_index;
assert(vtx_offset_param < 6);
vtx_offset = LLVMBuildMul(ctx->builder, 
ctx->gs_vtx_offset[vtx_offset_param],
  LLVMConstInt(ctx->ac.i32, 4, false), "");
 
-   param = shader_io_get_unique_index(instr->variables[0]->var->data.location);
+   param = shader_io_get_unique_index(location);
 
-   unsigned comp = instr->variables[0]->var->data.location_frac;
-   for (unsigned i = comp; i < instr->num_components + comp; i++) {
+   for (unsigned i = component; i < num_components + component; i++) {
if (ctx->ac.chip_class >= GFX9) {
LLVMValueRef dw_addr = ctx->gs_vtx_offset[vtx_offset_param];
dw_addr = LLVMBuildAdd(ctx->ac.builder, dw_addr,
   LLVMConstInt(ctx->ac.i32, param * 4 + i + const_index, 0), "");
value[i] = ac_lds_load(&ctx->ac, dw_addr);
} else {
args[0] = ctx->esgs_ring;
args[1] = vtx_offset;
args[2] = LLVMConstInt(ctx->ac.i32, (param * 4 + i + 
const_index) * 256, false);
args[3] = ctx->ac.i32_0;
@@ -2918,21 +2920,21 @@ load_gs_input(struct nir_to_llvm_context *ctx,
args[6] = ctx->ac.i32_1; /* GLC */
args[7] = ctx->ac.i32_0; /* SLC */
args[8] = ctx->ac.i32_0; /* TFE */
 
value[i] = ac_build_intrinsic(&ctx->ac, "llvm.SI.buffer.load.dword.i32.i32",
  ctx->ac.i32, args, 9,
  AC_FUNC_ATTR_READONLY |
  AC_FUNC_ATTR_LEGACY);
}
}
-   result = ac_build_varying_gather_values(&ctx->ac, value, instr->num_components, comp);
+   result = ac_build_varying_gather_values(&ctx->ac, value, num_components, component);
 
return result;
 }
 
 static LLVMValueRef
 build_gep_for_deref(struct ac_nir_context *ctx,
nir_deref_var *deref)
 {
struct hash_entry *entry = _mesa_hash_table_search(ctx->vars, 
deref->var);
assert(entry->data);
@@ -2987,21 +2989,30 @@ static LLVMValueRef visit_load_var(struct 
ac_nir_context *ctx,
if (instr->dest.ssa.bit_size == 64)
ve *= 2;
 
switch (instr->variables[0]->var->data.mode) {
case nir_var_shader_in:
if (ctx->stage == MESA_SHADER_TESS_CTRL)
return load_tcs_input(ctx->nctx, instr);
if (ctx->stage == MESA_SHADER_TESS_EVAL)
return load_tes_input(ctx->nctx, instr);
if (ctx->stage == MESA_SHADER_GEOMETRY) {
-   return load_gs_input(ctx->nctx,

[Mesa-dev] [PATCH 13/15] ac: move build_varying_gather_values() to ac_llvm_build.h and expose

2017-11-22 Thread Timothy Arceri

---
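For reference, the indexing done by the helper being moved can be modelled
without LLVM: it takes value_count entries starting at values[component] and
places them at positions 0..value_count-1 of the result. A minimal sketch
with ints standing in for LLVMValueRef; gather_varying() is a hypothetical
name. Note that the patch creates the undef vector under `if (!i)`, which
only triggers when component is 0; the sketch writes into a caller-provided
array and does not model that detail.

```c
#include <assert.h>

/* Copy value_count entries, starting at values[component], into
 * out[0 .. value_count - 1]. Returns the number of entries written.
 * Models only the index arithmetic (i - component) of the helper. */
static unsigned gather_varying(const int *values, unsigned value_count,
			       unsigned component, int *out)
{
	assert(value_count > 0);

	for (unsigned i = component; i < value_count + component; i++)
		out[i - component] = values[i];

	return value_count;
}
```

Gathering two values at component 1 from { 10, 11, 12, 13 } yields
{ 11, 12 }.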
 src/amd/common/ac_llvm_build.c  | 22 ++
 src/amd/common/ac_llvm_build.h  |  4 
 src/amd/common/ac_nir_to_llvm.c | 34 ++
 3 files changed, 32 insertions(+), 28 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 5640a23b8a..b2bf1bf7b5 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -363,20 +363,42 @@ ac_build_vote_eq(struct ac_llvm_context *ctx, 
LLVMValueRef value)
LLVMValueRef vote_set = ac_build_ballot(ctx, value);
 
LLVMValueRef all = LLVMBuildICmp(ctx->builder, LLVMIntEQ,
 vote_set, active_set, "");
LLVMValueRef none = LLVMBuildICmp(ctx->builder, LLVMIntEQ,
  vote_set,
  LLVMConstInt(ctx->i64, 0, 0), "");
return LLVMBuildOr(ctx->builder, all, none, "");
 }
 
+LLVMValueRef
+ac_build_varying_gather_values(struct ac_llvm_context *ctx, LLVMValueRef *values,
+  unsigned value_count, unsigned component)
+{
+   LLVMValueRef vec = NULL;
+
+   if (value_count == 1) {
+   return values[component];
+   } else if (!value_count)
+   unreachable("value_count is 0");
+
+   for (unsigned i = component; i < value_count + component; i++) {
+   LLVMValueRef value = values[i];
+
+   if (!i)
+   vec = LLVMGetUndef( LLVMVectorType(LLVMTypeOf(value), value_count));
+   LLVMValueRef index = LLVMConstInt(ctx->i32, i - component, false);
+   vec = LLVMBuildInsertElement(ctx->builder, vec, value, index, "");
+   }
+   return vec;
+}
+
 LLVMValueRef
 ac_build_gather_values_extended(struct ac_llvm_context *ctx,
LLVMValueRef *values,
unsigned value_count,
unsigned value_stride,
bool load,
bool always_vector)
 {
LLVMBuilderRef builder = ctx->builder;
LLVMValueRef vec = NULL;
diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h
index 1f51937c9e..655dc1dcc8 100644
--- a/src/amd/common/ac_llvm_build.h
+++ b/src/amd/common/ac_llvm_build.h
@@ -105,20 +105,24 @@ void ac_build_optimization_barrier(struct ac_llvm_context 
*ctx,
   LLVMValueRef *pvgpr);
 
 LLVMValueRef ac_build_ballot(struct ac_llvm_context *ctx, LLVMValueRef value);
 
 LLVMValueRef ac_build_vote_all(struct ac_llvm_context *ctx, LLVMValueRef 
value);
 
 LLVMValueRef ac_build_vote_any(struct ac_llvm_context *ctx, LLVMValueRef 
value);
 
 LLVMValueRef ac_build_vote_eq(struct ac_llvm_context *ctx, LLVMValueRef value);
 
+LLVMValueRef
+ac_build_varying_gather_values(struct ac_llvm_context *ctx, LLVMValueRef *values,
+  unsigned value_count, unsigned component);
+
 LLVMValueRef
 ac_build_gather_values_extended(struct ac_llvm_context *ctx,
LLVMValueRef *values,
unsigned value_count,
unsigned value_stride,
bool load,
bool always_vector);
 LLVMValueRef
 ac_build_gather_values(struct ac_llvm_context *ctx,
   LLVMValueRef *values,
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index a38db0c9b7..2c30652288 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -2694,42 +2694,20 @@ get_dw_address(struct nir_to_llvm_context *ctx,
 
dw_addr = LLVMBuildAdd(ctx->builder, dw_addr,
   LLVMConstInt(ctx->ac.i32, param * 4, false), "");
 
if (const_index && compact_const_index)
dw_addr = LLVMBuildAdd(ctx->builder, dw_addr,
   LLVMConstInt(ctx->ac.i32, const_index, 
false), "");
return dw_addr;
 }
 
-static LLVMValueRef
-build_varying_gather_values(struct ac_llvm_context *ctx, LLVMValueRef *values,
-   unsigned value_count, unsigned component)
-{
-   LLVMValueRef vec = NULL;
-
-   if (value_count == 1) {
-   return values[component];
-   } else if (!value_count)
-   unreachable("value_count is 0");
-
-   for (unsigned i = component; i < value_count + component; i++) {
-   LLVMValueRef value = values[i];
-
-   if (!i)
-   vec = LLVMGetUndef( LLVMVectorType(LLVMTypeOf(value), value_count));
-   LLVMValueRef index = LLVMConstInt(ctx->i32, i - component, false);
-   vec = LLVMBuildInsertElement(ctx->builder, vec, value, index, "");
-   }
-   return vec;
-}
-
 static LLVMValueRef
 load_tcs_input(struct nir_to_llvm_context *ctx,

[Mesa-dev] [PATCH 07/15] radeonsi: add nir support for gs epilogue

2017-11-22 Thread Timothy Arceri

v2: add emit_gs_epilogue() helper function to reduce duplication.
---
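The shape of the v2 change, two entry points with different signatures
funnelling into one helper, can be sketched in isolation. Everything below is
a hypothetical miniature (struct shader_context, CONTAINER_OF and the
function names are illustrative, not the driver's definitions); it only
assumes that both the ABI struct and the TGSI context are embedded in a
single shader context:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the two callback interfaces. */
struct shader_abi {
	void (*emit_outputs)(struct shader_abi *abi, unsigned max_outputs);
};

struct tgsi_context {
	void (*emit_epilogue)(struct tgsi_context *bld_base);
};

/* Both interfaces are embedded in one context object. */
struct shader_context {
	struct tgsi_context bld_base;
	struct shader_abi abi;
	unsigned epilogues_emitted;
};

/* Recover the enclosing struct from a pointer to one of its members. */
#define CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Shared helper: the work both entry points have in common. */
static void emit_gs_epilogue(struct shader_context *ctx)
{
	ctx->epilogues_emitted++;
}

/* Entry point used through the ABI (the NIR path in the patch). */
static void abi_emit_gs_epilogue(struct shader_abi *abi, unsigned max_outputs)
{
	struct shader_context *ctx =
		CONTAINER_OF(abi, struct shader_context, abi);
	(void)max_outputs;
	emit_gs_epilogue(ctx);
}

/* Entry point used by the TGSI path. */
static void tgsi_emit_gs_epilogue(struct tgsi_context *bld_base)
{
	struct shader_context *ctx =
		CONTAINER_OF(bld_base, struct shader_context, bld_base);
	emit_gs_epilogue(ctx);
}
```

Each frontend installs its own wrapper, so neither needs to know about the
other's context type.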
 src/gallium/drivers/radeonsi/si_shader.c | 25 +
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 8682c91edd..3dc988693e 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3239,31 +3239,47 @@ static void si_llvm_emit_es_epilogue(struct 
ac_shader_abi *abi,
 }
 
 static LLVMValueRef si_get_gs_wave_id(struct si_shader_context *ctx)
 {
if (ctx->screen->b.chip_class >= GFX9)
return unpack_param(ctx, ctx->param_merged_wave_info, 16, 8);
else
return LLVMGetParam(ctx->main_fn, ctx->param_gs_wave_id);
 }
 
-static void si_llvm_emit_gs_epilogue(struct lp_build_tgsi_context *bld_base)
+static void emit_gs_epilogue(struct si_shader_context *ctx)
 {
-   struct si_shader_context *ctx = si_shader_context(bld_base);
-
ac_build_sendmsg(&ctx->ac, AC_SENDMSG_GS_OP_NOP | AC_SENDMSG_GS_DONE,
 si_get_gs_wave_id(ctx));
 
if (ctx->screen->b.chip_class >= GFX9)
lp_build_endif(&ctx->merged_wrap_if_state);
 }
 
+static void si_llvm_emit_gs_epilogue(struct ac_shader_abi *abi,
+unsigned max_outputs,
+LLVMValueRef *addrs)
+{
+   struct si_shader_context *ctx = si_shader_context_from_abi(abi);
+   struct tgsi_shader_info UNUSED *info = &ctx->shader->selector->info;
+
+   assert(info->num_outputs <= max_outputs);
+
+   emit_gs_epilogue(ctx);
+}
+
+static void si_tgsi_emit_gs_epilogue(struct lp_build_tgsi_context *bld_base)
+{
+   struct si_shader_context *ctx = si_shader_context(bld_base);
+   emit_gs_epilogue(ctx);
+}
+
 static void si_llvm_emit_vs_epilogue(struct ac_shader_abi *abi,
 unsigned max_outputs,
 LLVMValueRef *addrs)
 {
struct si_shader_context *ctx = si_shader_context_from_abi(abi);
struct tgsi_shader_info *info = &ctx->shader->selector->info;
struct si_shader_output_values *outputs = NULL;
int i,j;
 
assert(!ctx->shader->is_gs_copy_shader);
@@ -5752,21 +5768,22 @@ static bool si_compile_tgsi_main(struct 
si_shader_context *ctx,
bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_tes;
if (shader->key.as_es)
ctx->abi.emit_outputs = si_llvm_emit_es_epilogue;
else
ctx->abi.emit_outputs = si_llvm_emit_vs_epilogue;
bld_base->emit_epilogue = si_tgsi_emit_epilogue;
break;
case PIPE_SHADER_GEOMETRY:
bld_base->emit_fetch_funcs[TGSI_FILE_INPUT] = fetch_input_gs;
ctx->abi.emit_vertex = si_llvm_emit_vertex;
-   bld_base->emit_epilogue = si_llvm_emit_gs_epilogue;
+   ctx->abi.emit_outputs = si_llvm_emit_gs_epilogue;
+   bld_base->emit_epilogue = si_tgsi_emit_gs_epilogue;
break;
case PIPE_SHADER_FRAGMENT:
ctx->load_input = declare_input_fs;
ctx->abi.emit_outputs = si_llvm_return_fs_outputs;
bld_base->emit_epilogue = si_tgsi_emit_epilogue;
break;
case PIPE_SHADER_COMPUTE:
break;
default:
assert(!"Unsupported shader type");
-- 
2.14.3


[Mesa-dev] [PATCH 01/15] nir: add array lowering function that assumes there are no indirects

2017-11-22 Thread Timothy Arceri

The gallium glsl->nir pass currently lowers away all indirects on both inputs
and outputs. This function allows us to lower vs inputs and fs outputs, and
also to lower things one stage at a time, as we don't need to worry about
indirects on the other side of the shader interface.
---
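The vs_in flag added to get_io_offset() matters because the number of
attribute slots a type occupies can depend on whether it is a vertex shader
input. A minimal sketch of the offset arithmetic, assuming the usual GLSL
rule that a 64-bit vector with more than two components takes two slots
except as a vertex input; struct elem_type, count_slots() and
array_element_offset() are hypothetical stand-ins, not the
glsl_count_attribute_slots() implementation:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of an array element type. */
struct elem_type {
	unsigned components; /* 1..4 */
	bool is_64bit;       /* double-precision element */
};

/* Assumed slot rule: dvec3/dvec4 take two slots, unless the variable
 * is a vertex shader input, where they take one. */
static unsigned count_slots(struct elem_type t, bool is_vertex_input)
{
	if (t.is_64bit && t.components > 2 && !is_vertex_input)
		return 2;
	return 1;
}

/* Offset contributed by a direct array index, mirroring
 * offset += size * base_offset in get_io_offset(). */
static unsigned array_element_offset(struct elem_type t, unsigned index,
				     bool vs_in)
{
	return count_slots(t, vs_in) * index;
}
```

Under that assumption, element 3 of a dvec4 array sits at offset 3 as a
vertex shader input and at offset 6 anywhere else, which is why passing
false unconditionally would be wrong for vs inputs.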
 src/compiler/nir/nir.h |  1 +
 src/compiler/nir/nir_lower_io_arrays_to_elements.c | 44 +-
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 5832c05680..97f38e7c73 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2478,20 +2478,21 @@ bool nir_lower_constant_initializers(nir_shader *shader,
 
 bool nir_move_vec_src_uses_to_dest(nir_shader *shader);
 bool nir_lower_vec_to_movs(nir_shader *shader);
 void nir_lower_alpha_test(nir_shader *shader, enum compare_func func,
   bool alpha_to_one);
 bool nir_lower_alu_to_scalar(nir_shader *shader);
 bool nir_lower_load_const_to_scalar(nir_shader *shader);
 bool nir_lower_read_invocation_to_scalar(nir_shader *shader);
 bool nir_lower_phis_to_scalar(nir_shader *shader);
 void nir_lower_io_arrays_to_elements(nir_shader *producer, nir_shader 
*consumer);
+void nir_lower_io_arrays_to_elements_no_indirects(nir_shader *shader);
 void nir_lower_io_to_scalar(nir_shader *shader, nir_variable_mode mask);
 void nir_lower_io_to_scalar_early(nir_shader *shader, nir_variable_mode mask);
 
 bool nir_lower_samplers(nir_shader *shader,
 const struct gl_shader_program *shader_program);
 bool nir_lower_samplers_as_deref(nir_shader *shader,
  const struct gl_shader_program 
*shader_program);
 
 typedef struct nir_lower_subgroups_options {
uint8_t subgroup_size;
diff --git a/src/compiler/nir/nir_lower_io_arrays_to_elements.c 
b/src/compiler/nir/nir_lower_io_arrays_to_elements.c
index c41f300edb..29076bf79b 100644
--- a/src/compiler/nir/nir_lower_io_arrays_to_elements.c
+++ b/src/compiler/nir/nir_lower_io_arrays_to_elements.c
@@ -28,38 +28,41 @@
  *
  * Split arrays/matrices with direct indexing into individual elements. This
  * will allow optimisation passes to better clean up unused elements.
  *
  */
 
 static unsigned
 get_io_offset(nir_builder *b, nir_deref_var *deref, nir_variable *var,
   unsigned *element_index)
 {
+   bool vs_in = (b->shader->info.stage == MESA_SHADER_VERTEX) &&
+(var->data.mode == nir_var_shader_in);
+
nir_deref *tail = &deref->deref;
 
/* For per-vertex input arrays (i.e. geometry shader inputs), skip the
 * outermost array index.  Process the rest normally.
 */
if (nir_is_per_vertex_io(var, b->shader->info.stage)) {
   tail = tail->child;
}
 
unsigned offset = 0;
while (tail->child != NULL) {
   tail = tail->child;
 
   if (tail->deref_type == nir_deref_type_array) {
  nir_deref_array *deref_array = nir_deref_as_array(tail);
  assert(deref_array->deref_array_type != 
nir_deref_array_type_indirect);
 
- unsigned size = glsl_count_attribute_slots(tail->type, false);
+ unsigned size = glsl_count_attribute_slots(tail->type, vs_in);
  offset += size * deref_array->base_offset;
 
  unsigned num_elements = glsl_type_is_array(tail->type) ?
 glsl_get_aoa_size(tail->type) : 1;
 
  num_elements *= glsl_type_is_matrix(glsl_without_array(tail->type)) ?
 glsl_get_matrix_columns(glsl_without_array(tail->type)) : 1;
 
  *element_index += num_elements * deref_array->base_offset;
   } else if (tail->deref_type == nir_deref_type_struct) {
@@ -324,20 +327,59 @@ lower_io_arrays_to_elements(nir_shader *shader, 
nir_variable_mode mask,
   break;
default:
   break;
}
 }
  }
   }
}
 }
 
+void
+nir_lower_io_arrays_to_elements_no_indirects(nir_shader *shader)
+{
+   struct hash_table *split_inputs =
+  _mesa_hash_table_create(NULL, _mesa_hash_pointer,
+  _mesa_key_pointer_equal);
+   struct hash_table *split_outputs =
+  _mesa_hash_table_create(NULL, _mesa_hash_pointer,
+  _mesa_key_pointer_equal);
+
+   uint64_t indirects[4] = {0}, patch_indirects[4] = {0};
+
+   lower_io_arrays_to_elements(shader, nir_var_shader_out, indirects,
+   patch_indirects, split_outputs);
+
+   lower_io_arrays_to_elements(shader, nir_var_shader_in, indirects,
+   patch_indirects, split_inputs);
+
+   /* Remove old input from the shaders inputs list */
+   struct hash_entry *entry;
+   hash_table_foreach(split_inputs, entry) {
+  nir_variable *var = (nir_variable *) entry->key;
+  exec_node_remove(&var->node);
+
+  free(entry->data);
+   }
+
+   /* Remove old output from the shaders outputs list */
+   hash_table_foreach(split_outputs, entry) {
+
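
A note on the vs_in flag passed to glsl_count_attribute_slots() above: as far
as I understand it, a double-precision vec3/vec4 occupies two slots as a
varying but only one as a vertex shader input, so the per-element offset
depends on the stage. The sketch below only illustrates that arithmetic. It is
not Mesa code: struct toy_type and both toy_* helpers are made-up stand-ins
for glsl_type and the real slot-counting function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified type description (not glsl_type). */
struct toy_type {
   unsigned components;  /* vector width, 1..4 */
   bool is_64bit;        /* true for double-based types */
   unsigned array_len;   /* 0 for non-array types */
};

/* Count slots the way the patch relies on: 64-bit vectors wider than two
 * components take two slots, except when they are vertex shader inputs.
 */
static unsigned
toy_count_attribute_slots(const struct toy_type *t, bool is_vertex_input)
{
   assert(t->components >= 1 && t->components <= 4);

   unsigned slots = 1;
   if (t->is_64bit && t->components > 2 && !is_vertex_input)
      slots = 2;

   return slots * (t->array_len ? t->array_len : 1);
}

/* Slot offset of element `index` inside an array, for a direct index. */
static unsigned
toy_element_offset(const struct toy_type *array, unsigned index, bool vs_in)
{
   struct toy_type elem = *array;
   elem.array_len = 0; /* look at a single element */

   return toy_count_attribute_slots(&elem, vs_in) * index;
}
```

For a dvec4[3], element 2 starts at slot 2 when the array is a vertex input
and at slot 4 otherwise, which is why passing a hard-coded false would give
the wrong offset for vs inputs.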

[Mesa-dev] [PATCH 05/15] radeonsi: add nir support for ls epilogue

2017-11-22 Thread Timothy Arceri

v2: make use of existing si_tgsi_emit_epilogue()
---
 src/gallium/drivers/radeonsi/si_shader.c | 29 ++---
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index e6b14f9205..defe833c30 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -1105,27 +1105,25 @@ static LLVMValueRef lds_load(struct 
lp_build_tgsi_context *bld_base,
return bitcast(bld_base, type, value);
 }
 
 /**
  * Store to LDS.
  *
  * \param swizzle  offset (typically 0..3)
  * \param dw_addr  address in dwords
  * \param value    value to store
  */
-static void lds_store(struct lp_build_tgsi_context *bld_base,
+static void lds_store(struct si_shader_context *ctx,
  unsigned dw_offset_imm, LLVMValueRef dw_addr,
  LLVMValueRef value)
 {
-   struct si_shader_context *ctx = si_shader_context(bld_base);
-
-   dw_addr = lp_build_add(&bld_base->uint_bld, dw_addr,
+   dw_addr = lp_build_add(&ctx->bld_base.uint_bld, dw_addr,
LLVMConstInt(ctx->i32, dw_offset_imm, 0));
 
ac_lds_store(&ctx->ac, dw_addr, value);
 }
 
 static LLVMValueRef desc_from_addr_base64k(struct si_shader_context *ctx,
  unsigned param)
 {
LLVMBuilderRef builder = ctx->ac.builder;
 
@@ -1257,21 +1255,21 @@ static void store_output_tcs(struct 
lp_build_tgsi_context *bld_base,
uint32_t writemask = reg->Register.WriteMask;
while (writemask) {
chan_index = u_bit_scan(&writemask);
LLVMValueRef value = dst[chan_index];
 
if (inst->Instruction.Saturate)
value = ac_build_clamp(&ctx->ac, value);
 
/* Skip LDS stores if there is no LDS read of this output. */
if (!skip_lds_store)
-   lds_store(bld_base, chan_index, dw_addr, value);
+   lds_store(ctx, chan_index, dw_addr, value);
 
value = ac_to_integer(&ctx->ac, value);
values[chan_index] = value;
 
if (reg->Register.WriteMask != 0xF && !is_tess_factor) {
ac_build_buffer_store_dword(&ctx->ac, buffer, value, 1,
buf_addr, base,
4 * chan_index, 1, 0, true, 
false);
}
 
@@ -3124,36 +3122,37 @@ static void si_set_es_return_value_for_gs(struct 
si_shader_context *ctx)
   8 + 
GFX9_SGPR_GS_SAMPLERS_AND_IMAGES);
 
unsigned vgpr = 8 + GFX9_GS_NUM_USER_SGPR;
for (unsigned i = 0; i < 5; i++) {
unsigned param = ctx->param_gs_vtx01_offset + i;
ret = si_insert_input_ret_float(ctx, ret, param, vgpr++);
}
ctx->return_value = ret;
 }
 
-static void si_llvm_emit_ls_epilogue(struct lp_build_tgsi_context *bld_base)
+static void si_llvm_emit_ls_epilogue(struct ac_shader_abi *abi,
+unsigned max_outputs,
+LLVMValueRef *addrs)
 {
-   struct si_shader_context *ctx = si_shader_context(bld_base);
+   struct si_shader_context *ctx = si_shader_context_from_abi(abi);
struct si_shader *shader = ctx->shader;
struct tgsi_shader_info *info = &shader->selector->info;
unsigned i, chan;
LLVMValueRef vertex_id = LLVMGetParam(ctx->main_fn,
  ctx->param_rel_auto_id);
LLVMValueRef vertex_dw_stride = get_tcs_in_vertex_dw_stride(ctx);
LLVMValueRef base_dw_addr = LLVMBuildMul(ctx->ac.builder, vertex_id,
 vertex_dw_stride, "");
 
/* Write outputs to LDS. The next shader (TCS aka HS) will read
 * its inputs from it. */
for (i = 0; i < info->num_outputs; i++) {
-   LLVMValueRef *out_ptr = ctx->outputs[i];
unsigned name = info->output_semantic_name[i];
unsigned index = info->output_semantic_index[i];
 
/* The ARB_shader_viewport_layer_array spec contains the
 * following issue:
 *
 *2) What happens if gl_ViewportIndex or gl_Layer is
 *written in the vertex shader and a geometry shader is
 *present?
 *
@@ -3167,22 +3166,22 @@ static void si_llvm_emit_ls_epilogue(struct 
lp_build_tgsi_context *bld_base)
 */
if (name == TGSI_SEMANTIC_LAYER ||
name == TGSI_SEMANTIC_VIEWPORT_INDEX)
continue;
 
int param = si_shader_io_get_unique_index(name, index);
LLVMValueRef dw_addr = LLVMBuildAdd(ctx->ac.builder, 
base_dw_addr,

[Mesa-dev] V2 Initial GS NIR support for radeonsi

2017-11-22 Thread Timothy Arceri

This series depends on [1] and [2].

V2
 - use driver_location as per Nicolai's suggestion
 - tidy ups as per Marek's suggestions
 - bug fixes (many more piglit tests now passing)

[1] https://patchwork.freedesktop.org/series/34131/
[2] https://patchwork.freedesktop.org/series/34132/


[Mesa-dev] [PATCH 03/15] st/glsl_to_nir: use nir_lower_io_arrays_to_elements() to lower arrays

2017-11-22 Thread Timothy Arceri

This pass is more fully featured: it supports geometry and tessellation
shaders, as well as interpolation intrinsics.
---
 src/mesa/state_tracker/st_glsl_to_nir.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp 
b/src/mesa/state_tracker/st_glsl_to_nir.cpp
index 53e07a78e2..af6b6e2607 100644
--- a/src/mesa/state_tracker/st_glsl_to_nir.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp
@@ -628,21 +628,21 @@ st_link_nir(struct gl_context *ctx,
  * variant lowering.
  */
 void
 st_finalize_nir(struct st_context *st, struct gl_program *prog,
 struct gl_shader_program *shader_program, nir_shader *nir)
 {
struct pipe_screen *screen = st->pipe->screen;
 
NIR_PASS_V(nir, nir_split_var_copies);
NIR_PASS_V(nir, nir_lower_var_copies);
-   NIR_PASS_V(nir, nir_lower_io_types);
+   NIR_PASS_V(nir, nir_lower_io_arrays_to_elements_no_indirects);
 
if (nir->info.stage == MESA_SHADER_VERTEX) {
   /* Needs special handling so drvloc matches the vbo state: */
   st_nir_assign_vs_in_locations(prog, nir);
   /* Re-lower global vars, to deal with any dead VS inputs. */
   NIR_PASS_V(nir, nir_lower_global_vars_to_local);
 
   sort_varyings(&nir->outputs);
   st_nir_assign_var_locations(&nir->outputs,
   &nir->num_outputs);
-- 
2.14.3


[Mesa-dev] [PATCH 02/15] nir: allow builtin arrays to be lowered

2017-11-22 Thread Timothy Arceri

Gallium's nir drivers expect this to be done.
---
 src/compiler/nir/nir_lower_io_arrays_to_elements.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/src/compiler/nir/nir_lower_io_arrays_to_elements.c 
b/src/compiler/nir/nir_lower_io_arrays_to_elements.c
index 29076bf79b..dfd8e85d96 100644
--- a/src/compiler/nir/nir_lower_io_arrays_to_elements.c
+++ b/src/compiler/nir/nir_lower_io_arrays_to_elements.c
@@ -249,21 +249,22 @@ create_indirects_mask(nir_shader *shader, uint64_t 
*indirects,
}
 }
  }
   }
}
 }
 
 static void
 lower_io_arrays_to_elements(nir_shader *shader, nir_variable_mode mask,
 uint64_t *indirects, uint64_t *patch_indirects,
-struct hash_table *varyings)
+struct hash_table *varyings,
+bool after_cross_stage_opts)
 {
nir_foreach_function(function, shader) {
   if (function->impl) {
  nir_builder b;
  nir_builder_init(&b, function->impl);
 
  nir_foreach_block(block, function->impl) {
 nir_foreach_instr_safe(instr, block) {
if (instr->type != nir_instr_type_intrinsic)
   continue;
@@ -298,28 +299,30 @@ lower_io_arrays_to_elements(nir_shader *shader, 
nir_variable_mode mask,
}
 
/* Skip types we cannot split.
 *
 * TODO: Add support for struct splitting.
 */
if ((!glsl_type_is_array(type) && !glsl_type_is_matrix(type))||
glsl_type_is_struct(glsl_without_array(type)))
   continue;
 
-   if (var->data.location < VARYING_SLOT_VAR0 &&
+   /* Skip builtins */
+   if (!after_cross_stage_opts &&
+   var->data.location < VARYING_SLOT_VAR0 &&
var->data.location >= 0)
   continue;
 
/* Don't bother splitting if we can't opt away any unused
 * elements.
 */
-   if (var->data.always_active_io)
+   if (!after_cross_stage_opts && var->data.always_active_io)
   continue;
 
switch (intr->intrinsic) {
case nir_intrinsic_interp_var_at_centroid:
case nir_intrinsic_interp_var_at_sample:
case nir_intrinsic_interp_var_at_offset:
case nir_intrinsic_load_var:
case nir_intrinsic_store_var:
   if ((mask & nir_var_shader_in && mode == nir_var_shader_in) 
||
   (mask & nir_var_shader_out && mode == 
nir_var_shader_out))
@@ -340,24 +343,24 @@ nir_lower_io_arrays_to_elements_no_indirects(nir_shader 
*shader)
struct hash_table *split_inputs =
   _mesa_hash_table_create(NULL, _mesa_hash_pointer,
   _mesa_key_pointer_equal);
struct hash_table *split_outputs =
   _mesa_hash_table_create(NULL, _mesa_hash_pointer,
   _mesa_key_pointer_equal);
 
uint64_t indirects[4] = {0}, patch_indirects[4] = {0};
 
lower_io_arrays_to_elements(shader, nir_var_shader_out, indirects,
-   patch_indirects, split_outputs);
+   patch_indirects, split_outputs, true);
 
lower_io_arrays_to_elements(shader, nir_var_shader_in, indirects,
-   patch_indirects, split_inputs);
+   patch_indirects, split_inputs, true);
 
/* Remove old input from the shaders inputs list */
struct hash_entry *entry;
hash_table_foreach(split_inputs, entry) {
   nir_variable *var = (nir_variable *) entry->key;
   exec_node_remove(&var->node);
 
   free(entry->data);
}
 
@@ -381,24 +384,24 @@ nir_lower_io_arrays_to_elements(nir_shader *producer, 
nir_shader *consumer)
   _mesa_key_pointer_equal);
struct hash_table *split_outputs =
   _mesa_hash_table_create(NULL, _mesa_hash_pointer,
   _mesa_key_pointer_equal);
 
uint64_t indirects[4] = {0}, patch_indirects[4] = {0};
create_indirects_mask(producer, indirects, patch_indirects);
create_indirects_mask(consumer, indirects, patch_indirects);
 
lower_io_arrays_to_elements(producer, nir_var_shader_out, indirects,
-   patch_indirects, split_outputs);
+   patch_indirects, split_outputs, false);
 
lower_io_arrays_to_elements(consumer, nir_var_shader_in, indirects,
-   patch_indirects, split_inputs);
+   patch_indirects, split_inputs, false);
 
/* Remove old input from the shaders inputs list */
struct hash_entry *entry;
hash_table_foreach(split_inputs, entry) {
   nir_variable *var = (nir_variable *) entry->key;

[Mesa-dev] [PATCH 04/15] st/glsl_to_nir: add gs support to st_nir_assign_var_locations()

2017-11-22 Thread Timothy Arceri

---
 src/mesa/state_tracker/st_glsl_to_nir.cpp | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp 
b/src/mesa/state_tracker/st_glsl_to_nir.cpp
index af6b6e2607..d1cd2ec3ab 100644
--- a/src/mesa/state_tracker/st_glsl_to_nir.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp
@@ -115,40 +115,48 @@ st_nir_assign_vs_in_locations(struct gl_program *prog, 
nir_shader *nir)
   * set.
   */
  exec_node_remove(&var->node);
  var->data.mode = nir_var_global;
  exec_list_push_tail(&nir->globals, &var->node);
   }
}
 }
 
 static void
-st_nir_assign_var_locations(struct exec_list *var_list, unsigned *size)
+st_nir_assign_var_locations(struct exec_list *var_list, unsigned *size,
+gl_shader_stage stage)
 {
unsigned location = 0;
unsigned assigned_locations[VARYING_SLOT_MAX];
uint64_t processed_locs = 0;
 
nir_foreach_variable(var, var_list) {
+
+  const struct glsl_type *type = var->type;
+  if (nir_is_per_vertex_io(var, stage)) {
+ assert(glsl_type_is_array(type));
+ type = glsl_get_array_element(type);
+  }
+
   /* Because component packing allows varyings to share the same location
* we may have already have processed this location.
*/
   if (var->data.location >= VARYING_SLOT_VAR0 &&
   processed_locs & ((uint64_t)1 << var->data.location)) {
  var->data.driver_location = assigned_locations[var->data.location];
- *size += type_size(var->type);
+ *size += type_size(type);
  continue;
   }
 
   assigned_locations[var->data.location] = location;
   var->data.driver_location = location;
-  location += type_size(var->type);
+  location += type_size(type);
 
   processed_locs |= ((uint64_t)1 << var->data.location);
}
 
*size += location;
 }
 
 static int
 st_nir_lookup_parameter_index(const struct gl_program_parameter_list *params,
   const char *name)
@@ -638,29 +646,32 @@ st_finalize_nir(struct st_context *st, struct gl_program 
*prog,
NIR_PASS_V(nir, nir_lower_io_arrays_to_elements_no_indirects);
 
if (nir->info.stage == MESA_SHADER_VERTEX) {
   /* Needs special handling so drvloc matches the vbo state: */
   st_nir_assign_vs_in_locations(prog, nir);
   /* Re-lower global vars, to deal with any dead VS inputs. */
   NIR_PASS_V(nir, nir_lower_global_vars_to_local);
 
   sort_varyings(&nir->outputs);
   st_nir_assign_var_locations(&nir->outputs,
-  &nir->num_outputs);
+  &nir->num_outputs,
+  nir->info.stage);
   st_nir_fixup_varying_slots(st, &nir->outputs);
} else if (nir->info.stage == MESA_SHADER_FRAGMENT) {
   sort_varyings(&nir->inputs);
   st_nir_assign_var_locations(&nir->inputs,
-  &nir->num_inputs);
+  &nir->num_inputs,
+  nir->info.stage);
   st_nir_fixup_varying_slots(st, &nir->inputs);
   st_nir_assign_var_locations(&nir->outputs,
-  &nir->num_outputs);
+  &nir->num_outputs,
+  nir->info.stage);
} else if (nir->info.stage == MESA_SHADER_COMPUTE) {
/* TODO? */
} else {
   unreachable("invalid shader type for tgsi bypass\n");
}
 
NIR_PASS_V(nir, nir_lower_atomics_to_ssbo,
  st->ctx->Const.Program[nir->info.stage].MaxAtomicBuffers);
 
st_nir_assign_uniform_locations(prog, shader_program,
-- 
2.14.3


Re: [Mesa-dev] [Mesa-stable] [PATCH] gallium/winsys/kms: fix leaking gem handle.

2017-11-22 Thread 吴涛@Eng

Thanks for your review. I got a reply from somebody else saying that these
functions don't actually work with the mainline kernel and are only supposed
to work with the modified kernel on Chrome OS. So the right fix probably
belongs on the kernel side rather than the Mesa side, and I will withdraw
this patch for now.

Here is the thread:
https://chromium-review.googlesource.com/c/chromiumos/overlays/chromiumos-overlay/+/764820

On Mon, Nov 13, 2017 at 7:17 AM, Emil Velikov  wrote:
> Hi Tao Wu,
>
> Welcome to Mesa!
>
> On 12 November 2017 at 07:00, Tao Wu  wrote:
>> When the handle was obtained from drmPrimeFDToHandle, it can't be destroyed
>> with destroy_dumb. Change to use drm_gem_close to release it. Otherwise
>> video ram could get leaked.
>>
>> Signed-off-by: Tao Wu 
>> CC: 
>> ---
>>  src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 21 -
>>  1 file changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c 
>> b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
>> index 22e1c936ac5..f7cd09b6aa9 100644
>> --- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
>> +++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
>> @@ -27,6 +27,7 @@
>>   *
>>   **/
>>
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -172,7 +173,25 @@ kms_sw_displaytarget_destroy(struct sw_winsys *ws,
>>
>> memset(&destroy_req, 0, sizeof destroy_req);
>> destroy_req.handle = kms_sw_dt->handle;
>> -   drmIoctl(kms_sw->fd, DRM_IOCTL_MODE_DESTROY_DUMB, &destroy_req);
>> +   int ret = drmIoctl(kms_sw->fd, DRM_IOCTL_MODE_DESTROY_DUMB, 
>> &destroy_req);
>> +   /* If handle was from drmPrimeFDToHandle, then kms_sw->fd is connected
>> +* as render, we have to use drm_gem_close to release it.
>> +*/
>> +   if (ret < 0) {
>> +  if (errno == EACCES) {
> The patch should resolve the leak, although I'm unsure about the
> explicit EACCES check.
> AFAICT there is no documentation stating that this errno will always
> (and only) be returned in this case.
>
> A more robust solution is to keep a separate bo_list for the
> prime/imported buffers. That way we can be sure to use the correct ioctl.
> What do you think, can you give that a spin?
>
> Thanks
> Emil
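
To make the suggestion above concrete, here is a rough sketch of choosing the
release path from how the buffer was created instead of from errno. It swaps
the separate bo_list Emil proposes for a simpler per-buffer flag, and it does
not touch libdrm at all: toy_displaytarget, its `imported` field and
toy_pick_release_op() are hypothetical names, and the enum values only name
which ioctl a real implementation would issue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which release path a real winsys would take for a given buffer. */
enum toy_release_op {
   TOY_RELEASE_DESTROY_DUMB, /* buffer was created as a dumb buffer */
   TOY_RELEASE_GEM_CLOSE     /* handle came from a prime fd import */
};

/* Hypothetical, minimal stand-in for the winsys display target. */
struct toy_displaytarget {
   unsigned handle;
   bool imported; /* set at import time, never inferred from errno */
};

/* Decide the release path up front, so no ioctl has to fail first. */
static enum toy_release_op
toy_pick_release_op(const struct toy_displaytarget *dt)
{
   assert(dt != NULL);
   return dt->imported ? TOY_RELEASE_GEM_CLOSE : TOY_RELEASE_DESTROY_DUMB;
}
```

Recording the origin when the handle is created avoids relying on an
undocumented errno value to tell the two cases apart.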