Re: [Mesa-dev] [PATCH] util/build-id: Fix address comparison for binaries with LOAD vaddr > 0

2018-01-25 Thread Tapani Pälli
I've verified this gets the correct address. Very nice work figuring 
this out Stephan!


Reviewed-by: Tapani Pälli 

On 01/24/2018 04:13 PM, Stephan Gerhold wrote:

build_id_find_nhdr_for_addr() fails to find the build-id if the first LOAD
segment has a virtual address other than 0x0.

For most shared libraries, the first LOAD segment has vaddr=0x0:

 Type   Offset   VirtAddr   PhysAddr   FileSiz  MemSiz   Flg Align
 LOAD   0x000000 0x00000000 0x00000000 0x2d2e26 0x2d2e26 R E 0x1000
 LOAD   0x2d2e54 0x002d3e54 0x002d3e54 0x2e248  0x2f148  RW  0x1000

However, compiling the Intel Vulkan driver as 32-bit binary on Android produces
the following ELF header with vaddr=0x8000 instead:

 Type   Offset   VirtAddr   PhysAddr   FileSiz  MemSiz   Flg Align
 PHDR   0x000034 0x00008034 0x00008034 0x00100  0x00100  R   0x4
 LOAD   0x000000 0x00008000 0x00008000 0x224a04 0x224a04 R E 0x1000
 LOAD   0x225710 0x0022e710 0x0022e710 0x25988  0x27364  RW  0x1000

build_id_find_nhdr_callback() compares the address of dli_fbase from dladdr()
and dlpi_addr from dl_iterate_phdr(). With vaddr > 0, these point to a
different memory address, e.g.:

 dli_fbase=0xd8395000 (offset 0x8000)
 dlpi_addr=0xd838d000

At least on glibc and bionic (Android) dli_fbase refers to the address where
the shared object is mapped into the process space, whereas dlpi_addr is just
the base address for the vaddrs declared in the ELF header.

To compare them correctly, we need to calculate the start of the mapping
by adding the vaddr of the first LOAD segment to the base address.
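
As a rough illustration of the calculation described above, here is a minimal standalone sketch. The `struct segment` type, `SEG_LOAD` and both function names are hypothetical stand-ins for the `dl_phdr_info` fields the patch reads; the numbers are the ones from the Android example in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two program header fields the patch uses. */
struct segment {
   uint32_t type;   /* plays the role of p_type */
   uint64_t vaddr;  /* plays the role of p_vaddr */
};

enum { SEG_PHDR = 6, SEG_LOAD = 1 }; /* same values as PT_PHDR and PT_LOAD */

/* Return base + vaddr of the first LOAD segment, or 0 if there is none. */
static uint64_t
first_load_start(uint64_t base, const struct segment *segs, size_t count)
{
   assert(segs != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (segs[i].type == SEG_LOAD)
         return base + segs[i].vaddr;
   }
   return 0;
}

/* Program headers of the 32-bit Android build quoted in the message. */
static uint64_t
android_example(void)
{
   const struct segment segs[] = {
      { SEG_PHDR, 0x8034 },
      { SEG_LOAD, 0x8000 },
      { SEG_LOAD, 0x22e710 },
   };

   /* dlpi_addr=0xd838d000 is the base reported by dl_iterate_phdr(). */
   return first_load_start(0xd838d000u, segs, 3);
}
```

With these inputs the result equals the dli_fbase value quoted above (0xd8395000), which is why the comparison succeeds once the vaddr is added.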

Cc: Chad Versace 
Cc: Emil Velikov 
Cc: Tapani Pälli 
Cc: 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104642
Fixes: 5c98d38 "util: Query build-id by symbol address, not library name"
---
  src/util/build_id.c | 13 ++++++++++++-
  1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/util/build_id.c b/src/util/build_id.c
index 536c74360e..fb67d160e3 100644
--- a/src/util/build_id.c
+++ b/src/util/build_id.c
@@ -58,7 +58,18 @@ build_id_find_nhdr_callback(struct dl_phdr_info *info, size_t size, void *data_)
  {
 struct callback_data *data = data_;
  
-   if ((void *)info->dlpi_addr != data->dli_fbase)

+   /* Calculate address where shared object is mapped into the process space.
+    * (Using the base address and the virtual address of the first LOAD segment)
+    */
+   void *map_start = NULL;
+   for (unsigned i = 0; i < info->dlpi_phnum; i++) {
+  if (info->dlpi_phdr[i].p_type == PT_LOAD) {
+ map_start = (void *)(info->dlpi_addr + info->dlpi_phdr[i].p_vaddr);
+ break;
+  }
+   }
+
+   if (map_start != data->dli_fbase)
return 0;
  
 for (unsigned i = 0; i < info->dlpi_phnum; i++) {



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 0.5/5] i965/tiled_memcpy: linear_to_ytiled a cache line at a time

2018-01-25 Thread Jason Ekstrand
The OCD in me is seeing a couple more places you could micro-optimize.
Before I actually point them out,

Reviewed-by: Jason Ekstrand 

On Thu, Jan 25, 2018 at 8:23 AM, Scott D Phillips <
scott.d.phill...@intel.com> wrote:

> TileY's low 6 address bits are: v1 v0 u3 u2 u1 u0
> Thus a cache line in the tiled surface is composed of a 2d area of
> 16x4 bytes of the linear surface.
>
> Add a special case where the area being copied is 4-line aligned
> and a multiple of 4-lines so that entire cache lines will be
> written at a time.
>
> On Apollolake, this increases tiling throughput to wc maps by
> 84.8512% +/- 0.935379%
>

Nice!


> v2: Split [y0, y1) and [y2, y3) loops apart for clarity (Jason Ekstrand)
> ---
>  src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 80
> +++---
>  1 file changed, 72 insertions(+), 8 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> index 53a5679691..9e6bafa4b4 100644
> --- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> +++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> @@ -287,8 +287,8 @@ linear_to_xtiled(uint32_t x0, uint32_t x1, uint32_t
> x2, uint32_t x3,
>   */
>  static inline void
>  linear_to_ytiled(uint32_t x0, uint32_t x1, uint32_t x2, uint32_t x3,
> - uint32_t y0, uint32_t y1,
> - char *dst, const char *src,
> + uint32_t y0, uint32_t y3,
> + char *dst, const char *src0,
>   int32_t src_pitch,
>   uint32_t swizzle_bit,
>   mem_copy_fn mem_copy,
> @@ -306,6 +306,9 @@ linear_to_ytiled(uint32_t x0, uint32_t x1, uint32_t
> x2, uint32_t x3,
> const uint32_t column_width = ytile_span;
> const uint32_t bytes_per_column = column_width * ytile_height;
>
> +   uint32_t y1 = ALIGN_UP(y0, 4);
> +   uint32_t y2 = ALIGN_DOWN(y3, 4);
> +
> uint32_t xo0 = (x0 % ytile_span) + (x0 / ytile_span) *
> bytes_per_column;
> uint32_t xo1 = (x1 % ytile_span) + (x1 / ytile_span) *
> bytes_per_column;
>
> @@ -319,26 +322,87 @@ linear_to_ytiled(uint32_t x0, uint32_t x1, uint32_t
> x2, uint32_t x3,
>
> uint32_t x, yo;
>
> -   src += (ptrdiff_t)y0 * src_pitch;
> +   const char *src = src0 + (ptrdiff_t)y0 * src_pitch;
>
> -   for (yo = y0 * column_width; yo < y1 * column_width; yo +=
> column_width) {
> +   if (y0 != y1) {
> +  for (yo = y0 * column_width; yo < y1 * column_width; yo +=
> column_width) {
> + uint32_t xo = xo1;
> + uint32_t swizzle = swizzle1;
> +
> + mem_copy(dst + ((xo0 + yo) ^ swizzle0), src + x0, x1 - x0);
> +
> + /* Step by spans/columns.  As it happens, the swizzle bit flips
> +  * at each step so we don't need to calculate it explicitly.
> +  */
> + for (x = x1; x < x2; x += ytile_span) {
> +mem_copy_align16(dst + ((xo + yo) ^ swizzle), src + x,
> ytile_span);
> +xo += bytes_per_column;
> +swizzle ^= swizzle_bit;
> + }
> +
> + mem_copy_align16(dst + ((xo + yo) ^ swizzle), src + x2, x3 - x2);
> +
> + src += src_pitch;
> +  }
> +   }
> +
> +   src = src0 + (ptrdiff_t)y1 * src_pitch;
> +
> +   for (yo = y1 * column_width; yo < y2 * column_width; yo += 4 *
> column_width) {
>uint32_t xo = xo1;
>uint32_t swizzle = swizzle1;
>
> -  mem_copy(dst + ((xo0 + yo) ^ swizzle0), src + x0, x1 - x0);
> +  if (x0 != x1) {
> + mem_copy(dst + ((xo0 + yo + 0 * column_width) ^ swizzle0), src +
> x0 + 0 * src_pitch, x1 - x0);
> + mem_copy(dst + ((xo0 + yo + 1 * column_width) ^ swizzle0), src +
> x0 + 1 * src_pitch, x1 - x0);
> + mem_copy(dst + ((xo0 + yo + 2 * column_width) ^ swizzle0), src +
> x0 + 2 * src_pitch, x1 - x0);
> + mem_copy(dst + ((xo0 + yo + 3 * column_width) ^ swizzle0), src +
> x0 + 3 * src_pitch, x1 - x0);
> +  }
>
>/* Step by spans/columns.  As it happens, the swizzle bit flips
> * at each step so we don't need to calculate it explicitly.
> */
>for (x = x1; x < x2; x += ytile_span) {
> - mem_copy_align16(dst + ((xo + yo) ^ swizzle), src + x,
> ytile_span);
> + mem_copy_align16(dst + ((xo + yo + 0 * column_width) ^ swizzle),
> src + x + 0 * src_pitch, ytile_span);
> + mem_copy_align16(dst + ((xo + yo + 1 * column_width) ^ swizzle),
> src + x + 1 * src_pitch, ytile_span);
> + mem_copy_align16(dst + ((xo + yo + 2 * column_width) ^ swizzle),
> src + x + 2 * src_pitch, ytile_span);
> + mem_copy_align16(dst + ((xo + yo + 3 * column_width) ^ swizzle),
> src + x + 3 * src_pitch, ytile_span);
>   xo += bytes_per_column;
>   swizzle ^= swizzle_bit;
>}
>
> -  mem_copy_align16(dst + ((xo + yo) ^ swizzle), src + x2, x3 - x2);
> +  if (x2 != x3) {
> + mem_copy_align16(dst + ((xo + yo + 0 * column_width) ^ swizzle),
> src + x2 + 0 * src_pitch, x3 - 

[Mesa-dev] [PATCH] meson: Add new picture_{h264, hevc}_enc.c files to meson too

2018-01-25 Thread Christoph Haag
---

Very nice that this finally arrives. Can you add the files to meson too,
something like this patch?
I can't test it because I only have Polaris here.


 src/gallium/state_trackers/va/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/meson.build b/src/gallium/state_trackers/va/meson.build
index 35da5ab532..2eb312ce4c 100644
--- a/src/gallium/state_trackers/va/meson.build
+++ b/src/gallium/state_trackers/va/meson.build
@@ -26,7 +26,7 @@ libva_st = static_library(
 'buffer.c', 'config.c', 'context.c', 'display.c', 'image.c', 'picture.c',
 'picture_mpeg12.c', 'picture_mpeg4.c', 'picture_h264.c', 'picture_hevc.c',
 'picture_vc1.c', 'picture_mjpeg.c', 'postproc.c', 'subpicture.c',
-'surface.c',
+'surface.c', 'picture_h264_enc.c', 'picture_hevc_enc.c'
   ),
   c_args : [
 c_vis_args,
-- 
2.16.1



[Mesa-dev] [PATCH 1/2] mesa: add blob overrun check to program binary reads

2018-01-25 Thread Timothy Arceri
---
 src/mesa/main/program_binary.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/src/mesa/main/program_binary.c b/src/mesa/main/program_binary.c
index 2786487362..68a15ec258 100644
--- a/src/mesa/main/program_binary.c
+++ b/src/mesa/main/program_binary.c
@@ -287,5 +287,19 @@ _mesa_program_binary(struct gl_context *ctx, struct gl_shader_program *sh_prog,
   return;
}
 
+   if (blob.current != blob.end || blob.overrun) {
+  /* Something has gone wrong; ignore the binary and set link status to
+   * failure.
+   */
+  assert(!"Invalid program binary cache item!");
+
+  if (ctx->_Shader->Flags & GLSL_CACHE_INFO) {
+ fprintf(stderr, "Error reading program from program binary\n");
+  }
+  sh_prog->data->LinkStatus = linking_failure;
+
+  return;
+   }
+
sh_prog->data->LinkStatus = linking_success;
 }
-- 
2.14.3
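
For readers unfamiliar with the check this patch adds, the following is a minimal sketch of the idea, assuming a simplified reader. `struct mini_reader` and the `mini_*` and `demo_read` functions are hypothetical and are not Mesa's blob API; only the three fields mirror the ones tested in the patch (`current`, `end`, `overrun`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reader with the same three fields the patch inspects. */
struct mini_reader {
   const uint8_t *current;
   const uint8_t *end;
   bool overrun;
};

/* Read one uint32_t; on a short buffer set the overrun flag and return 0. */
static uint32_t
mini_read_u32(struct mini_reader *r)
{
   uint32_t value = 0;

   if (r->overrun || (size_t)(r->end - r->current) < sizeof(value)) {
      r->overrun = true;
      return 0;
   }
   memcpy(&value, r->current, sizeof(value));
   r->current += sizeof(value);
   return value;
}

/* The inverse of the patch's failure condition: the whole payload was
 * consumed and no read ran past the end.
 */
static bool
mini_fully_consumed(const struct mini_reader *r)
{
   return r->current == r->end && !r->overrun;
}

/* Perform `reads` reads on a payload of `payload_size` bytes (at most 8). */
static bool
demo_read(size_t payload_size, unsigned reads)
{
   static const uint8_t payload[8] = { 0 };

   assert(payload_size <= sizeof(payload));
   struct mini_reader r = { payload, payload + payload_size, false };

   for (unsigned i = 0; i < reads; i++)
      (void)mini_read_u32(&r);

   return mini_fully_consumed(&r);
}
```

Both failure modes are covered by the single condition: reading too little leaves `current` short of `end`, and reading too much sets `overrun`.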



[Mesa-dev] [PATCH 2/2] st/shader_cache: restore num_tgsi_tokens when loading from cache

2018-01-25 Thread Timothy Arceri
Without this we will fail to correctly serialise programs when
using glGetProgramBinary() if the program was retrieved from
the disk cache rather than freshly compiled.

Fixes: c69b0dd6817b "st/glsl_to_tgsi: store num_tgsi_tokens in st_*_program"

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104762
---
 src/mesa/state_tracker/st_shader_cache.c | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/src/mesa/state_tracker/st_shader_cache.c b/src/mesa/state_tracker/st_shader_cache.c
index a971b0d7ee..92c633d450 100644
--- a/src/mesa/state_tracker/st_shader_cache.c
+++ b/src/mesa/state_tracker/st_shader_cache.c
@@ -142,10 +142,11 @@ read_stream_out_from_cache(struct blob_reader *blob_reader,
 
 static void
 read_tgsi_from_cache(struct blob_reader *blob_reader,
- const struct tgsi_token **tokens)
+ const struct tgsi_token **tokens,
+ unsigned *num_tokens)
 {
-   uint32_t num_tokens  = blob_read_uint32(blob_reader);
-   unsigned tokens_size = num_tokens * sizeof(struct tgsi_token);
+   *num_tokens  = blob_read_uint32(blob_reader);
+   unsigned tokens_size = *num_tokens * sizeof(struct tgsi_token);
*tokens = (const struct tgsi_token*) MALLOC(tokens_size);
blob_copy_bytes(blob_reader, (uint8_t *) *tokens, tokens_size);
 }
@@ -175,7 +176,8 @@ st_deserialise_tgsi_program(struct gl_context *ctx,
   sizeof(stvp->result_to_output));
 
   read_stream_out_from_cache(&blob_reader, &stvp->tgsi);
-  read_tgsi_from_cache(&blob_reader, &stvp->tgsi.tokens);
+  read_tgsi_from_cache(&blob_reader, &stvp->tgsi.tokens,
+                       &stvp->num_tgsi_tokens);
 
   if (st->vp == stvp)
  st->dirty |= ST_NEW_VERTEX_PROGRAM(st, stvp);
@@ -189,7 +191,8 @@ st_deserialise_tgsi_program(struct gl_context *ctx,
 &sttcp->variants, &sttcp->tgsi);
 
   read_stream_out_from_cache(&blob_reader, &sttcp->tgsi);
-  read_tgsi_from_cache(&blob_reader, &sttcp->tgsi.tokens);
+  read_tgsi_from_cache(&blob_reader, &sttcp->tgsi.tokens,
+                       &sttcp->num_tgsi_tokens);
 
   if (st->tcp == sttcp)
  st->dirty |= sttcp->affected_states;
@@ -203,7 +206,8 @@ st_deserialise_tgsi_program(struct gl_context *ctx,
 &sttep->variants, &sttep->tgsi);
 
   read_stream_out_from_cache(&blob_reader, &sttep->tgsi);
-  read_tgsi_from_cache(&blob_reader, &sttep->tgsi.tokens);
+  read_tgsi_from_cache(&blob_reader, &sttep->tgsi.tokens,
+                       &sttep->num_tgsi_tokens);
 
   if (st->tep == sttep)
  st->dirty |= sttep->affected_states;
@@ -217,7 +221,8 @@ st_deserialise_tgsi_program(struct gl_context *ctx,
 &stgp->tgsi);
 
   read_stream_out_from_cache(&blob_reader, &stgp->tgsi);
-  read_tgsi_from_cache(&blob_reader, &stgp->tgsi.tokens);
+  read_tgsi_from_cache(&blob_reader, &stgp->tgsi.tokens,
+                       &stgp->num_tgsi_tokens);
 
   if (st->gp == stgp)
  st->dirty |= stgp->affected_states;
@@ -229,7 +234,8 @@ st_deserialise_tgsi_program(struct gl_context *ctx,
 
   st_release_fp_variants(st, stfp);
 
-  read_tgsi_from_cache(&blob_reader, &stfp->tgsi.tokens);
+  read_tgsi_from_cache(&blob_reader, &stfp->tgsi.tokens,
+                       &stfp->num_tgsi_tokens);
 
   if (st->fp == stfp)
  st->dirty |= stfp->affected_states;
@@ -242,7 +248,8 @@ st_deserialise_tgsi_program(struct gl_context *ctx,
   st_release_cp_variants(st, stcp);
 
   read_tgsi_from_cache(&blob_reader,
-   (const struct tgsi_token**) &stcp->tgsi.prog);
+   (const struct tgsi_token**) &stcp->tgsi.prog,
+   &stcp->num_tgsi_tokens);
 
   stcp->tgsi.req_local_mem = stcp->Base.info.cs.shared_size;
   stcp->tgsi.req_private_mem = 0;
-- 
2.14.3



Re: [Mesa-dev] [PATCH 1/2] radeon/vcn: add and manage render picture list

2018-01-25 Thread Zhang, Boyuan
Hi Emil,

I cherry-picked these 2 patches over to the 17.3 branch, then tested locally 
and confirmed that they work fine.

I also added the Bugzilla reference to the patches. Please see the details in 
the links below:

https://lists.freedesktop.org/archives/mesa-dev/2018-January/183485.html
https://lists.freedesktop.org/archives/mesa-dev/2018-January/183484.html

Regards,
Boyuan


-----Original Message-----
From: Emil Velikov [mailto:emil.l.veli...@gmail.com] 
Sent: January-24-18 8:38 AM
To: Zhang, Boyuan; ML mesa-stable
Cc: ML mesa-dev
Subject: Re: [Mesa-dev] [PATCH 1/2] radeon/vcn: add and manage render picture 
list

On 11 December 2017 at 16:47,   wrote:
> From: Boyuan Zhang 
>
> Create a list in decoder to store all render picture buffer pointers 
> that are currently being used in reference picture lists.
>
> During get message buffer call, check each pointer in 
> render_pic_list[] within given pic->ref[] list, remove any pointer that is 
> no longer being used by pic->ref[]. Then add current render surface pointer 
> to the pic->render_pic_list[] and assign the associated index to 
> result.curr_idx.
>
> As a result, result.curr_idx will have the correct index to represent 
> the current render picture, instead of the previous incrementing values.
>
> Signed-off-by: Boyuan Zhang 
> Reviewed-by: Christian König 
> ---
We'd want this and 2/2 (sha's below) in stable. Otherwise people will 
experience regressions when updating their firmware.

f2bfd1cbb7e radeon/vcn: add and manage render picture list
2ec48039b8a radeon/uvd: add and manage render picture list

Including the bugzilla reference will be great.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104745

-Emil


[Mesa-dev] [PATCH] mesa: silence MinGW 'may be used uninitialized' warning in get.c

2018-01-25 Thread Brian Paul
The warning happens on line 2114 for the memcpy(data, p, size) call.
I'm not sure why that generates the warning but not the earlier use
of p in the code.
---
 src/mesa/main/get.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 7f2d72a..5fee9a6 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -2051,7 +2051,7 @@ _mesa_GetUnsignedBytevEXT(GLenum pname, GLubyte *data)
const struct value_desc *d;
union value v;
int shift;
-   void *p;
+   void *p = NULL;
GLsizei size;
const char *func = "glGetUnsignedBytevEXT";
 
-- 
2.7.4



[Mesa-dev] [PATCH 2/2] radeon/uvd: add and manage render picture list

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Create a list in decoder to store all render picture buffer pointers that
are currently being used in reference picture lists.

During get message buffer call, check each pointer in render_pic_list[]
within given pic->ref[] list, remove any pointer that is no longer being used
by pic->ref[]. Then add current render surface pointer to the render_pic_list[]
and assign the associated index to result.curr_idx.

As a result, result.curr_idx will have the correct index to represent the
current render picture, instead of the previous incrementing values.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104745

Signed-off-by: Boyuan Zhang 
Reviewed-by: Christian König 
Cc: mesa-sta...@lists.freedesktop.org
(cherry picked from commit 2ec48039b8aa1f6a5e16f3f12483b88981d0f5d3)
---
 src/gallium/drivers/radeon/radeon_uvd.c | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_uvd.c b/src/gallium/drivers/radeon/radeon_uvd.c
index 032ed7c..87e7858 100644
--- a/src/gallium/drivers/radeon/radeon_uvd.c
+++ b/src/gallium/drivers/radeon/radeon_uvd.c
@@ -97,6 +97,8 @@ struct ruvd_decoder {
unsignedcmd;
unsignedcntl;
} reg;
+
+   void*render_pic_list[16];
 };
 
 /* flush IB to the hardware */
@@ -596,7 +598,7 @@ static struct ruvd_h265 get_h265_msg(struct ruvd_decoder *dec, struct pipe_video
 struct pipe_h265_picture_desc *pic)
 {
struct ruvd_h265 result;
-   unsigned i;
+   unsigned i, j;
 
memset(&result, 0, sizeof(result));
 
@@ -676,11 +678,28 @@ static struct ruvd_h265 get_h265_msg(struct ruvd_decoder *dec, struct pipe_video
result.row_height_minus1[i] = pic->pps->row_height_minus1[i];
 
result.num_delta_pocs_ref_rps_idx = pic->NumDeltaPocsOfRefRpsIdx;
-   result.curr_idx = pic->CurrPicOrderCntVal;
result.curr_poc = pic->CurrPicOrderCntVal;
 
+   for (i = 0 ; i < 16 ; i++) {
+   for (j = 0; (pic->ref[j] != NULL) && (j < 16) ; j++) {
+   if (dec->render_pic_list[i] == pic->ref[j])
+   break;
+   if (j == 15)
+   dec->render_pic_list[i] = NULL;
+   else if (pic->ref[j+1] == NULL)
+   dec->render_pic_list[i] = NULL;
+   }
+   }
+   for (i = 0 ; i < 16 ; i++) {
+   if (dec->render_pic_list[i] == NULL) {
+   dec->render_pic_list[i] = target;
+   result.curr_idx = i;
+   break;
+   }
+   }
+
vl_video_buffer_set_associated_data(target, &dec->base,
-   (void *)(uintptr_t)pic->CurrPicOrderCntVal,
+   (void *)(uintptr_t)result.curr_idx,
&ruvd_destroy_associated_data);
 
for (i = 0; i < 16; ++i) {
@@ -723,7 +742,7 @@ static struct ruvd_h265 get_h265_msg(struct ruvd_decoder *dec, struct pipe_video
memcpy(dec->it + 864, pic->pps->sps->ScalingList32x32, 2 * 64);
 
for (i = 0 ; i < 2 ; i++) {
-   for (int j = 0 ; j < 15 ; j++)
+   for (j = 0 ; j < 15 ; j++)
result.direct_reflist[i][j] = pic->RefPicList[i][j];
}
 
@@ -1407,6 +1426,8 @@ struct pipe_video_codec *si_common_uvd_create_decoder(struct pipe_context *conte
goto error;
}
 
+   for (i = 0; i < 16; i++)
+dec->render_pic_list[i] = NULL;
dec->fb_size = (info.family == CHIP_TONGA) ? FB_BUFFER_SIZE_TONGA :
FB_BUFFER_SIZE;
bs_buf_size = width * height * (512 / (16 * 16));
-- 
2.7.4
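
The slot bookkeeping described in the commit message can be sketched in isolation as follows. This is only an illustration under stated assumptions: `LIST_SIZE`, `ref_in_use`, `assign_render_slot` and `demo_slot_reuse` are hypothetical names, the arrays stand in for `dec->render_pic_list[]` and `pic->ref[]`, and the inner loop tests the index bound before dereferencing, which is a deliberate difference from the loop condition in the patch.

```c
#include <stddef.h>

#define LIST_SIZE 16

/* Return 1 if p appears in the NULL-terminated (or full) reference list. */
static int
ref_in_use(void *const refs[LIST_SIZE], const void *p)
{
   for (unsigned j = 0; j < LIST_SIZE && refs[j] != NULL; j++) {
      if (refs[j] == p)
         return 1;
   }
   return 0;
}

/* Drop entries that are no longer referenced, then store target in the
 * first free slot. Returns the slot index, or -1 if the list is full.
 */
static int
assign_render_slot(void *list[LIST_SIZE], void *const refs[LIST_SIZE],
                   void *target)
{
   /* Like the patch, leave the list untouched when there are no references. */
   if (refs[0] != NULL) {
      for (unsigned i = 0; i < LIST_SIZE; i++) {
         if (list[i] != NULL && !ref_in_use(refs, list[i]))
            list[i] = NULL;
      }
   }

   for (unsigned i = 0; i < LIST_SIZE; i++) {
      if (list[i] == NULL) {
         list[i] = target;
         return (int)i;
      }
   }
   return -1;
}

/* Three pictures decoded in sequence; returns the slots as s0*100 + s1*10 + s2. */
static int
demo_slot_reuse(void)
{
   int a, b, c;                       /* dummy render surfaces */
   void *list[LIST_SIZE] = { NULL };
   void *refs[LIST_SIZE] = { NULL };

   int s0 = assign_render_slot(list, refs, &a);  /* no references yet */
   refs[0] = &a;
   int s1 = assign_render_slot(list, refs, &b);  /* a is still referenced */
   refs[0] = &b;
   int s2 = assign_render_slot(list, refs, &c);  /* a was dropped, slot reused */

   return s0 * 100 + s1 * 10 + s2;
}
```

The point of the scheme is visible in the last call: once a surface drops out of the reference list its slot is handed out again, so the index stays within 0..15 instead of growing with the picture order count.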



[Mesa-dev] [PATCH 1/2] radeon/vcn: add and manage render picture list

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Create a list in decoder to store all render picture buffer pointers that
are currently being used in reference picture lists.

During get message buffer call, check each pointer in render_pic_list[]
within given pic->ref[] list, remove any pointer that is no longer being used
by pic->ref[]. Then add current render surface pointer to the render_pic_list[]
and assign the associated index to result.curr_idx.

As a result, result.curr_idx will have the correct index to represent the
current render picture, instead of the previous incrementing values.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104745

Signed-off-by: Boyuan Zhang 
Reviewed-by: Christian König 
Cc: mesa-sta...@lists.freedesktop.org
(cherry picked from commit f2bfd1cbb7e72945ca192845a1ad28426c7aea89)
---
 src/gallium/drivers/radeon/radeon_vcn_dec.c | 28 
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_dec.c b/src/gallium/drivers/radeon/radeon_vcn_dec.c
index 2ece4a3..8010010 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_dec.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_dec.c
@@ -78,6 +78,7 @@ struct radeon_decoder {
 
unsignedbs_size;
unsignedcur_buffer;
+   void*render_pic_list[16];
 };
 
 static rvcn_dec_message_avc_t get_h264_msg(struct radeon_decoder *dec,
@@ -186,7 +187,7 @@ static rvcn_dec_message_hevc_t get_h265_msg(struct radeon_decoder *dec,
struct pipe_h265_picture_desc *pic)
 {
rvcn_dec_message_hevc_t result;
-   unsigned i;
+   unsigned i, j;
 
memset(&result, 0, sizeof(result));
result.sps_info_flags = 0;
@@ -273,11 +274,28 @@ static rvcn_dec_message_hevc_t get_h265_msg(struct radeon_decoder *dec,
result.row_height_minus1[i] = pic->pps->row_height_minus1[i];
 
result.num_delta_pocs_ref_rps_idx = pic->NumDeltaPocsOfRefRpsIdx;
-   result.curr_idx = pic->CurrPicOrderCntVal;
result.curr_poc = pic->CurrPicOrderCntVal;
 
+   for (i = 0 ; i < 16 ; i++) {
+   for (j = 0; (pic->ref[j] != NULL) && (j < 16) ; j++) {
+   if (dec->render_pic_list[i] == pic->ref[j])
+   break;
+   if (j == 15)
+   dec->render_pic_list[i] = NULL;
+   else if (pic->ref[j+1] == NULL)
+   dec->render_pic_list[i] = NULL;
+   }
+   }
+   for (i = 0 ; i < 16 ; i++) {
+   if (dec->render_pic_list[i] == NULL) {
+   dec->render_pic_list[i] = target;
+   result.curr_idx = i;
+   break;
+   }
+   }
+
vl_video_buffer_set_associated_data(target, &dec->base,
-   (void *)(uintptr_t)pic->CurrPicOrderCntVal,
+   (void *)(uintptr_t)result.curr_idx,
&radeon_dec_destroy_associated_data);
 
for (i = 0; i < 16; ++i) {
@@ -320,7 +338,7 @@ static rvcn_dec_message_hevc_t get_h265_msg(struct radeon_decoder *dec,
memcpy(dec->it + 864, pic->pps->sps->ScalingList32x32, 2 * 64);
 
for (i = 0 ; i < 2 ; i++) {
-   for (int j = 0 ; j < 15 ; j++)
+   for (j = 0 ; j < 15 ; j++)
result.direct_reflist[i][j] = pic->RefPicList[i][j];
}
 
@@ -1236,6 +1254,8 @@ struct pipe_video_codec *radeon_create_decoder(struct pipe_context *context,
goto error;
}
 
+   for (i = 0; i < 16; i++)
+   dec->render_pic_list[i] = NULL;
bs_buf_size = width * height * (512 / (16 * 16));
for (i = 0; i < NUM_BUFFERS; ++i) {
unsigned msg_fb_it_size = FB_BUFFER_OFFSET + FB_BUFFER_SIZE;
-- 
2.7.4



[Mesa-dev] [PATCH v2 2/5] i965/miptree: Use cpu tiling/detiling when mapping

2018-01-25 Thread Scott D Phillips
Rename the (un)map_gtt functions to (un)map_map (map by
returning a map) and add new functions (un)map_tiled_memcpy that
return a shadow buffer populated with the intel_tiled_memcpy
functions.
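
The extent arithmetic that the new tile_extents() helper in this patch performs can be tried out on its own. The sketch below follows the same formulas, but `struct extents`, `div_round_up` and `compute_extents` are hypothetical names, and the mapping rectangle, block size and image offset are passed in directly instead of being read from the miptree.

```c
#include <assert.h>

/* x values are in bytes, y values are in rows of blocks (strides). */
struct extents {
   unsigned x1, x2, y1, y2;
};

static unsigned
div_round_up(unsigned n, unsigned d)
{
   return (n + d - 1) / d;
}

/* x, y, w, h: mapped rectangle in pixels.
 * block_w, block_h, block_bytes: format block dimensions and size.
 * x0_el, y0_el: image offset of the level/slice, in blocks.
 */
static struct extents
compute_extents(unsigned x, unsigned y, unsigned w, unsigned h,
                unsigned block_w, unsigned block_h, unsigned block_bytes,
                unsigned x0_el, unsigned y0_el)
{
   /* The start of the rectangle must sit on a block boundary. */
   assert(x % block_w == 0);
   assert(y % block_h == 0);

   struct extents e;
   e.x1 = (x / block_w + x0_el) * block_bytes;
   e.y1 = y / block_h + y0_el;
   /* The far edge is rounded up to a whole block. */
   e.x2 = (div_round_up(x + w, block_w) + x0_el) * block_bytes;
   e.y2 = div_round_up(y + h, block_h) + y0_el;
   return e;
}
```

For example, a 10x5 pixel rectangle at (4, 8) in a format with 4x4 blocks of 8 bytes, at an image offset of (2, 3) blocks, gives x1=24, x2=48, y1=5, y2=7.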

v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson)
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 95 ---
 1 file changed, 86 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index c480eade93..85297cb0c1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -31,6 +31,7 @@
 #include "intel_image.h"
 #include "intel_mipmap_tree.h"
 #include "intel_tex.h"
+#include "intel_tiled_memcpy.h"
 #include "intel_blit.h"
 #include "intel_fbo.h"
 
@@ -3028,10 +3029,10 @@ intel_miptree_unmap_raw(struct intel_mipmap_tree *mt)
 }
 
 static void
-intel_miptree_map_gtt(struct brw_context *brw,
- struct intel_mipmap_tree *mt,
- struct intel_miptree_map *map,
- unsigned int level, unsigned int slice)
+intel_miptree_map_map(struct brw_context *brw,
+  struct intel_mipmap_tree *mt,
+  struct intel_miptree_map *map,
+  unsigned int level, unsigned int slice)
 {
unsigned int bw, bh;
void *base;
@@ -3049,7 +3050,7 @@ intel_miptree_map_gtt(struct brw_context *brw,
y /= bh;
x /= bw;
 
-   base = intel_miptree_map_raw(brw, mt, map->mode);
+   base = intel_miptree_map_raw(brw, mt, map->mode | MAP_RAW);
 
if (base == NULL)
   map->ptr = NULL;
@@ -3075,11 +3076,80 @@ intel_miptree_map_gtt(struct brw_context *brw,
 }
 
 static void
-intel_miptree_unmap_gtt(struct intel_mipmap_tree *mt)
+intel_miptree_unmap_map(struct intel_mipmap_tree *mt)
 {
intel_miptree_unmap_raw(mt);
 }
 
+/* Compute extent parameters for use with tiled_memcpy functions.
+ * xs are in units of bytes and ys are in units of strides. */
+static inline void
+tile_extents(struct intel_mipmap_tree *mt, struct intel_miptree_map *map,
+ unsigned int level, unsigned int slice, unsigned int *x1,
+ unsigned int *x2, unsigned int *y1, unsigned int *y2)
+{
+   unsigned int block_width, block_height, block_bytes;
+   unsigned int x0_el, y0_el;
+
+   _mesa_get_format_block_size(mt->format, &block_width, &block_height);
+   block_bytes = _mesa_get_format_bytes(mt->format);
+
+   assert(map->x % block_width == 0);
+   assert(map->y % block_height == 0);
+
+   intel_miptree_get_image_offset(mt, level, slice, &x0_el, &y0_el);
+   *x1 = (map->x / block_width + x0_el) * block_bytes;
+   *y1 = map->y / block_height + y0_el;
+   *x2 = (DIV_ROUND_UP(map->x + map->w, block_width) + x0_el) * block_bytes;
+   *y2 = DIV_ROUND_UP(map->y + map->h, block_height) + y0_el;
+}
+
+static void
+intel_miptree_map_tiled_memcpy(struct brw_context *brw,
+   struct intel_mipmap_tree *mt,
+   struct intel_miptree_map *map,
+   unsigned int level, unsigned int slice)
+{
+   unsigned int x1, x2, y1, y2;
+   tile_extents(mt, map, level, slice, &x1, &x2, &y1, &y2);
+   map->stride = _mesa_format_row_stride(mt->format, map->w);
+   map->buffer = map->ptr = malloc(map->stride * (y2 - y1));
+
+   if (!(map->mode & GL_MAP_INVALIDATE_RANGE_BIT)) {
+  char *src = intel_miptree_map_raw(brw, mt, map->mode | MAP_RAW);
+  src += mt->offset;
+
+  tiled_to_linear(x1, x2, y1, y2, map->ptr, src, map->stride,
+  mt->surf.row_pitch, brw->has_swizzling, mt->surf.tiling,
+  memcpy);
+
+  intel_miptree_unmap_raw(mt);
+   }
+}
+
+static void
+intel_miptree_unmap_tiled_memcpy(struct brw_context *brw,
+ struct intel_mipmap_tree *mt,
+ struct intel_miptree_map *map,
+ unsigned int level,
+ unsigned int slice)
+{
+   if (map->mode & GL_MAP_WRITE_BIT) {
+  unsigned int x1, x2, y1, y2;
+  tile_extents(mt, map, level, slice, &x1, &x2, &y1, &y2);
+
+  char *dst = intel_miptree_map_raw(brw, mt, map->mode | MAP_RAW);
+  dst += mt->offset;
+
+  linear_to_tiled(x1, x2, y1, y2, dst, map->ptr, mt->surf.row_pitch,
+  map->stride, brw->has_swizzling, mt->surf.tiling, memcpy);
+
+  intel_miptree_unmap_raw(mt);
+   }
+   free(map->buffer);
+   map->buffer = map->ptr = NULL;
+}
+
 static void
 intel_miptree_map_blit(struct brw_context *brw,
   struct intel_mipmap_tree *mt,
@@ -3637,8 +3707,10 @@ intel_miptree_map(struct brw_context *brw,
   (mt->surf.row_pitch % 16 == 0)) {
   intel_miptree_map_movntdqa(brw, mt, map, level, slice);
 #endif
+   } else if (mt->surf.tiling != ISL_TILING_LINEAR) {
+  intel_miptree_map_tiled_memcpy(brw, mt, map, level, slice);
} else {
-  

[Mesa-dev] [PATCH v2 0.5/5] i965/tiled_memcpy: linear_to_ytiled a cache line at a time

2018-01-25 Thread Scott D Phillips
TileY's low 6 address bits are: v1 v0 u3 u2 u1 u0
Thus a cache line in the tiled surface is composed of a 2d area of
16x4 bytes of the linear surface.

Add a special case where the area being copied is 4-line aligned
and a multiple of 4-lines so that entire cache lines will be
written at a time.
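
To make the two ideas above concrete, here is a small sketch. It assumes u is the byte column and v the row within the 16x4 area, and all names (`cacheline_offset`, `struct row_split`, `split_rows`, the `align_*4` helpers) are hypothetical. The two clamps in `split_rows` are an assumption added here so that short ranges stay well-formed; the patch itself calls ALIGN_UP and ALIGN_DOWN without them.

```c
#include <assert.h>
#include <stdint.h>

/* Offset inside one 64-byte cache line, with address bits v1 v0 u3 u2 u1 u0. */
static uint32_t
cacheline_offset(uint32_t u, uint32_t v)
{
   return ((v & 3u) << 4) | (u & 15u);
}

static uint32_t align_down4(uint32_t v) { return v & ~3u; }
static uint32_t align_up4(uint32_t v)   { return (v + 3u) & ~3u; }

/* Rows [y0, y1) and [y2, y3) are copied one line at a time,
 * rows [y1, y2) four lines (one cache line per span) at a time.
 */
struct row_split {
   uint32_t y0, y1, y2, y3;
};

static struct row_split
split_rows(uint32_t y0, uint32_t y3)
{
   assert(y0 <= y3);

   struct row_split s = { y0, align_up4(y0), align_down4(y3), y3 };

   /* Clamps added for this sketch only, see the note above. */
   if (s.y1 > y3)
      s.y1 = y3;
   if (s.y2 < s.y1)
      s.y2 = s.y1;
   return s;
}
```

For rows 1 to 10 this yields a head of [1, 4), an aligned body of [4, 8) and a tail of [8, 10); a range that is already 4-line aligned has an empty head and tail.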

On Apollolake, this increases tiling throughput to wc maps by
84.8512% +/- 0.935379%

v2: Split [y0, y1) and [y2, y3) loops apart for clarity (Jason Ekstrand)
---
 src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 80 +++---
 1 file changed, 72 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
index 53a5679691..9e6bafa4b4 100644
--- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
+++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
@@ -287,8 +287,8 @@ linear_to_xtiled(uint32_t x0, uint32_t x1, uint32_t x2, uint32_t x3,
  */
 static inline void
 linear_to_ytiled(uint32_t x0, uint32_t x1, uint32_t x2, uint32_t x3,
- uint32_t y0, uint32_t y1,
- char *dst, const char *src,
+ uint32_t y0, uint32_t y3,
+ char *dst, const char *src0,
  int32_t src_pitch,
  uint32_t swizzle_bit,
  mem_copy_fn mem_copy,
@@ -306,6 +306,9 @@ linear_to_ytiled(uint32_t x0, uint32_t x1, uint32_t x2, uint32_t x3,
const uint32_t column_width = ytile_span;
const uint32_t bytes_per_column = column_width * ytile_height;
 
+   uint32_t y1 = ALIGN_UP(y0, 4);
+   uint32_t y2 = ALIGN_DOWN(y3, 4);
+
uint32_t xo0 = (x0 % ytile_span) + (x0 / ytile_span) * bytes_per_column;
uint32_t xo1 = (x1 % ytile_span) + (x1 / ytile_span) * bytes_per_column;
 
@@ -319,26 +322,87 @@ linear_to_ytiled(uint32_t x0, uint32_t x1, uint32_t x2, uint32_t x3,
 
uint32_t x, yo;
 
-   src += (ptrdiff_t)y0 * src_pitch;
+   const char *src = src0 + (ptrdiff_t)y0 * src_pitch;
 
-   for (yo = y0 * column_width; yo < y1 * column_width; yo += column_width) {
+   if (y0 != y1) {
+  for (yo = y0 * column_width; yo < y1 * column_width; yo += column_width) {
+ uint32_t xo = xo1;
+ uint32_t swizzle = swizzle1;
+
+ mem_copy(dst + ((xo0 + yo) ^ swizzle0), src + x0, x1 - x0);
+
+ /* Step by spans/columns.  As it happens, the swizzle bit flips
+  * at each step so we don't need to calculate it explicitly.
+  */
+ for (x = x1; x < x2; x += ytile_span) {
+mem_copy_align16(dst + ((xo + yo) ^ swizzle), src + x, ytile_span);
+xo += bytes_per_column;
+swizzle ^= swizzle_bit;
+ }
+
+ mem_copy_align16(dst + ((xo + yo) ^ swizzle), src + x2, x3 - x2);
+
+ src += src_pitch;
+  }
+   }
+
+   src = src0 + (ptrdiff_t)y1 * src_pitch;
+
+   for (yo = y1 * column_width; yo < y2 * column_width; yo += 4 * column_width) {
   uint32_t xo = xo1;
   uint32_t swizzle = swizzle1;
 
-  mem_copy(dst + ((xo0 + yo) ^ swizzle0), src + x0, x1 - x0);
+  if (x0 != x1) {
+ mem_copy(dst + ((xo0 + yo + 0 * column_width) ^ swizzle0), src + x0 + 0 * src_pitch, x1 - x0);
+ mem_copy(dst + ((xo0 + yo + 1 * column_width) ^ swizzle0), src + x0 + 1 * src_pitch, x1 - x0);
+ mem_copy(dst + ((xo0 + yo + 2 * column_width) ^ swizzle0), src + x0 + 2 * src_pitch, x1 - x0);
+ mem_copy(dst + ((xo0 + yo + 3 * column_width) ^ swizzle0), src + x0 + 3 * src_pitch, x1 - x0);
+  }
 
   /* Step by spans/columns.  As it happens, the swizzle bit flips
* at each step so we don't need to calculate it explicitly.
*/
   for (x = x1; x < x2; x += ytile_span) {
- mem_copy_align16(dst + ((xo + yo) ^ swizzle), src + x, ytile_span);
+ mem_copy_align16(dst + ((xo + yo + 0 * column_width) ^ swizzle), src + x + 0 * src_pitch, ytile_span);
+ mem_copy_align16(dst + ((xo + yo + 1 * column_width) ^ swizzle), src + x + 1 * src_pitch, ytile_span);
+ mem_copy_align16(dst + ((xo + yo + 2 * column_width) ^ swizzle), src + x + 2 * src_pitch, ytile_span);
+ mem_copy_align16(dst + ((xo + yo + 3 * column_width) ^ swizzle), src + x + 3 * src_pitch, ytile_span);
  xo += bytes_per_column;
  swizzle ^= swizzle_bit;
   }
 
-  mem_copy_align16(dst + ((xo + yo) ^ swizzle), src + x2, x3 - x2);
+  if (x2 != x3) {
+ mem_copy_align16(dst + ((xo + yo + 0 * column_width) ^ swizzle), src + x2 + 0 * src_pitch, x3 - x2);
+ mem_copy_align16(dst + ((xo + yo + 1 * column_width) ^ swizzle), src + x2 + 1 * src_pitch, x3 - x2);
+ mem_copy_align16(dst + ((xo + yo + 2 * column_width) ^ swizzle), src + x2 + 2 * src_pitch, x3 - x2);
+ mem_copy_align16(dst + ((xo + yo + 3 * column_width) ^ swizzle), src + x2 + 3 * src_pitch, x3 - x2);
+  }
 
-  src += src_pitch;
+  src += 4 * src_pitch;
+   }
+
+   if (y2 != y3) {
+  src = src0 + (ptrdiff_t)y2 

[Mesa-dev] [PATCH] ac/nir: Correctly handle imod with different signs.

2018-01-25 Thread Bas Nieuwenhuizen
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102032
---
 src/amd/common/ac_nir_to_llvm.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 8ae8650a7b..4f1e4af37b 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -1671,6 +1671,23 @@ static LLVMValueRef emit_ddxy_interp(
return ac_build_gather_values(&ctx->ac, result, 4);
 }
 
+static LLVMValueRef emit_imod(struct ac_llvm_context *ctx, LLVMValueRef src0, LLVMValueRef src1)
+{
+   /* The imod result should have the same sign as src1 when not 0. */
+
+   LLVMValueRef result = LLVMBuildSRem(ctx->builder, src0, src1, "");
+
+   LLVMValueRef diff_sign = LLVMBuildXor(ctx->builder, result, src1, "");
+   diff_sign = LLVMBuildICmp(ctx->builder, LLVMIntSLT, diff_sign, ctx->i32_0, "");
+
+   LLVMValueRef nonzero = LLVMBuildICmp(ctx->builder, LLVMIntNE, result, ctx->i32_0, "");
+
+   LLVMValueRef cond = LLVMBuildAnd(ctx->builder, diff_sign, nonzero, "");
+   LLVMValueRef offset = LLVMBuildSelect(ctx->builder, cond, src1, ctx->i32_0, "");
+
+   return LLVMBuildAdd(ctx->builder, result, offset, "");
+}
+
 static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
 {
LLVMValueRef src[4], result = NULL;
@@ -1733,7 +1750,7 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
result = LLVMBuildMul(ctx->ac.builder, src[0], src[1], "");
break;
case nir_op_imod:
-   result = LLVMBuildSRem(ctx->ac.builder, src[0], src[1], "");
+   result = emit_imod(&ctx->ac, src[0], src[1]);
break;
case nir_op_umod:
result = LLVMBuildURem(ctx->ac.builder, src[0], src[1], "");
-- 
2.16.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
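
The sign fix-up that the imod patch above builds as LLVM IR can be sketched in plain C. This is only an illustration, not code from the patch: the names `srem_c` and `imod_c` are hypothetical, and the sketch assumes `b != 0` and ignores the `INT32_MIN % -1` overflow case.

```c
#include <stdint.h>

/* C's % truncates toward zero, like LLVM's srem: a nonzero result takes
 * the sign of the dividend (a). */
static int32_t srem_c(int32_t a, int32_t b)
{
   return a % b;
}

/* Hypothetical scalar model of emit_imod(): start from the truncated
 * remainder and add b when the remainder is nonzero and its sign
 * differs from the sign of b. */
static int32_t imod_c(int32_t a, int32_t b)
{
   int32_t result = srem_c(a, b);
   int diff_sign = (result ^ b) < 0;   /* sign bits differ */
   int nonzero = result != 0;
   int32_t offset = (diff_sign && nonzero) ? b : 0;

   return result + offset;
}
```

With these definitions, `srem_c(-7, 3)` yields -1 while `imod_c(-7, 3)` yields 2, so the result follows the sign of the second operand as the comment in the patch requires.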


[Mesa-dev] [Bug 102032] nir_op_imod is incorrectly implemented as LLVM's srem

2018-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102032

--- Comment #1 from Bas Nieuwenhuizen  ---
I went looking into why there were no good CTS tests for this and found this in
the Vulkan spec:

For the OpSRem and OpSMod instructions, if either operand is negative the
result is undefined.

Note
While the OpSRem and OpSMod instructions are supported by the Vulkan
environment, they require non-negative values and thus do not enable additional
functionality beyond what OpUMod provides.


While I'm open to fixing this, you may want to rethink what you are doing.



Re: [Mesa-dev] [PATCH 4/4] anv/pipeline: remove the pipeline layout field from anv_pipeline

2018-01-25 Thread Jason Ekstrand
3 and 4 are

Reviewed-by: Jason Ekstrand 

On Thu, Jan 25, 2018 at 4:24 AM, Iago Toral Quiroga 
wrote:

> It no longer has any users.
>
> Suggested-by: Jason Ekstrand 
> ---
>  src/intel/vulkan/anv_pipeline.c  | 2 --
>  src/intel/vulkan/anv_private.h   | 1 -
>  src/intel/vulkan/genX_pipeline.c | 1 -
>  3 files changed, 4 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
> index 4dc18096af5..43ae9f5ef91 100644
> --- a/src/intel/vulkan/anv_pipeline.c
> +++ b/src/intel/vulkan/anv_pipeline.c
> @@ -1297,8 +1297,6 @@ anv_pipeline_init(struct anv_pipeline *pipeline,
> assert(pCreateInfo->subpass < render_pass->subpass_count);
> pipeline->subpass = &render_pass->subpasses[pCreateInfo->subpass];
>
> -   pipeline->layout = anv_pipeline_layout_from_handle(pCreateInfo->layout);
> -
> result = anv_reloc_list_init(&pipeline->batch_relocs, alloc);
> if (result != VK_SUCCESS)
>return result;
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index ae99cd51ff4..ea3af3a0f2b 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -2147,7 +2147,6 @@ struct anv_pipeline {
> struct anv_dynamic_state dynamic_state;
>
> struct anv_subpass * subpass;
> -   struct anv_pipeline_layout * layout;
>
> bool needs_data_cache;
>
> diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
> index 82fdf206a95..91cc37de04a 100644
> --- a/src/intel/vulkan/genX_pipeline.c
> +++ b/src/intel/vulkan/genX_pipeline.c
> @@ -1756,7 +1756,6 @@ compute_pipeline_create(
>return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
>
> pipeline->device = device;
> -   pipeline->layout = anv_pipeline_layout_from_handle(pCreateInfo->layout);
>
> pipeline->blend_state.map = NULL;
>
> --
> 2.14.1
>
>


Re: [Mesa-dev] [PATCH 2/4] anv/pipeline: don't take the layout from the pipeline to compile shaders

2018-01-25 Thread Jason Ekstrand
I had a few nits below.  With those fixed,

Reviewed-by: Jason Ekstrand 


On Thu, Jan 25, 2018 at 4:24 AM, Iago Toral Quiroga 
wrote:

> The Vulkan spec states that VkPipelineLayout objects must not be
> destroyed while any command buffer that uses them is in the recording
> state, but it permits them to be destroyed otherwise. This means that
> applications are allowed to free pipeline layouts after command recording
> is finished even if there are pipeline objects that still exist and were
> created with these layouts.
>
> There are two solutions to this, one is to use reference counting on
> pipeline layout objects. The other is to avoid holding references to
> pipeline layouts where they are not really needed.
>
> This patch takes a step towards the second option by making the
> pipeline shader compile code take pipeline layout from the
> VkGraphicsPipelineCreateInfo provided rather than the pipeline
> object.
>
> A follow-up patch will remove any remaining uses of the layout field
> so we can remove it from the pipeline object and avoid the need
> for reference counting.
>
> Suggested-by: Jason Ekstrand 
> ---
>  src/intel/vulkan/anv_nir.h   |  3 +-
>  src/intel/vulkan/anv_nir_apply_pipeline_layout.c |  2 +-
>  src/intel/vulkan/anv_nir_lower_ycbcr_textures.c  |  9 ++--
>  src/intel/vulkan/anv_pipeline.c  | 54
> 
>  4 files changed, 44 insertions(+), 24 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_nir.h b/src/intel/vulkan/anv_nir.h
> index 8ac0a119dac..ce95b40b014 100644
> --- a/src/intel/vulkan/anv_nir.h
> +++ b/src/intel/vulkan/anv_nir.h
> @@ -38,9 +38,10 @@ void anv_nir_lower_push_constants(nir_shader *shader);
>  bool anv_nir_lower_multiview(nir_shader *shader, uint32_t view_mask);
>
>  bool anv_nir_lower_ycbcr_textures(nir_shader *shader,
> -  struct anv_pipeline *pipeline);
> +  struct anv_pipeline_layout *layout);
>
>  void anv_nir_apply_pipeline_layout(struct anv_pipeline *pipeline,
> +   struct anv_pipeline_layout *layout,
> nir_shader *shader,
> struct brw_stage_prog_data *prog_data,
> struct anv_pipeline_bind_map *map);
> diff --git a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
> index 6775f9b464e..acabc5426be 100644
> --- a/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
> +++ b/src/intel/vulkan/anv_nir_apply_pipeline_layout.c
> @@ -326,11 +326,11 @@ setup_vec4_uniform_value(uint32_t *params, uint32_t offset, unsigned n)
>
>  void
>  anv_nir_apply_pipeline_layout(struct anv_pipeline *pipeline,
> +  struct anv_pipeline_layout *layout,
>nir_shader *shader,
>struct brw_stage_prog_data *prog_data,
>struct anv_pipeline_bind_map *map)
>  {
> -   struct anv_pipeline_layout *layout = pipeline->layout;
> gl_shader_stage stage = shader->info.stage;
>
> struct apply_pipeline_layout_state state = {
> diff --git a/src/intel/vulkan/anv_nir_lower_ycbcr_textures.c b/src/intel/vulkan/anv_nir_lower_ycbcr_textures.c
> index 028f24e2f60..ad793ee0a0c 100644
> --- a/src/intel/vulkan/anv_nir_lower_ycbcr_textures.c
> +++ b/src/intel/vulkan/anv_nir_lower_ycbcr_textures.c
> @@ -316,13 +316,13 @@ swizzle_channel(struct isl_swizzle swizzle, unsigned channel)
>  }
>
>  static bool
> -try_lower_tex_ycbcr(struct anv_pipeline *pipeline,
> +try_lower_tex_ycbcr(struct anv_pipeline_layout *layout,
>  nir_builder *builder,
>  nir_tex_instr *tex)
>  {
> nir_variable *var = tex->texture->var;
> const struct anv_descriptor_set_layout *set_layout =
> -  pipeline->layout->set[var->data.descriptor_set].layout;
> +  layout->set[var->data.descriptor_set].layout;
> const struct anv_descriptor_set_binding_layout *binding =
>       &set_layout->binding[var->data.binding];
>
> @@ -440,7 +440,8 @@ try_lower_tex_ycbcr(struct anv_pipeline *pipeline,
>  }
>
>  bool
> -anv_nir_lower_ycbcr_textures(nir_shader *shader, struct anv_pipeline *pipeline)
> +anv_nir_lower_ycbcr_textures(nir_shader *shader,
> + struct anv_pipeline_layout *layout)
>  {
> bool progress = false;
>
> @@ -458,7 +459,7 @@ anv_nir_lower_ycbcr_textures(nir_shader *shader, struct anv_pipeline *pipeline)
> continue;
>
>  nir_tex_instr *tex = nir_instr_as_tex(instr);
> -function_progress |= try_lower_tex_ycbcr(pipeline, ,
> tex);
> +function_progress |= try_lower_tex_ycbcr(layout, ,
> tex);
>   }
>}
>
> diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
> index 

Re: [Mesa-dev] [PATCH 3/4] r600: add ARB_query_buffer_object support

2018-01-25 Thread Roland Scheidegger
On 25.01.2018 at 01:40, Dave Airlie wrote:
> From: Dave Airlie 
> 
> This uses a different shader than radeonsi, as we can't address non-256
> aligned ssbos, which the radeonsi code does. This passes some extra
> offsets into the shader.
Couldn't you just require the query buffers to have sufficient alignment
in the first place, which would simplify this? SSBOs need to have 256B
alignment as well, as do UBOs.
That said, I can't really see what GL would require; buffer object alignment
is quite a mystery to me in general...

> 
> It also contains a set of u64 instruction implementations that may
> or may not be complete (at least the u64div is definitely not something
> that works outside this use-case). If r600 grows 64-bit integers,
> it will use the GLSL lowering for divmod.
> 
> Signed-off-by: Dave Airlie 
> ---



...
> +static int emit_u64add(struct r600_shader_ctx *ctx, int op,
> +int treg,
> +int src0_sel, int src0_chan,
> +int src1_sel, int src1_chan)
> +{
> + struct r600_bytecode_alu alu;
> + int r;
> + int opc;
> +
> + if (op == ALU_OP2_ADD_INT)
> + opc = ALU_OP2_ADDC_UINT;
> + else
> + opc = ALU_OP2_SUBB_UINT;
> +
> + memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> + alu.op = op;;
> + alu.dst.sel = treg;
> + alu.dst.chan = 0;
> + alu.dst.write = 1;
> + alu.src[0].sel = src0_sel;
> + alu.src[0].chan = src0_chan + 0;
> + alu.src[1].sel = src1_sel;
> + alu.src[1].chan = src1_chan + 0;
> + alu.src[1].neg = 0;
> + r = r600_bytecode_add_alu(ctx->bc, &alu);
> + if (r)
> + return r;
> +
> + memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> + alu.op = op;
> + alu.dst.sel = treg;
> + alu.dst.chan = 1;
> + alu.dst.write = 1;
> + alu.src[0].sel = src0_sel;
> + alu.src[0].chan = src0_chan + 1;
> + alu.src[1].sel = src1_sel;
> + alu.src[1].chan = src1_chan + 1;
> + alu.src[1].neg = 0;
> + r = r600_bytecode_add_alu(ctx->bc, &alu);
> + if (r)
> + return r;
> +
> + memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> + alu.op = opc;
> + alu.dst.sel = treg;
> + alu.dst.chan = 2;
> + alu.dst.write = 1;
> + alu.last = 1;
> + alu.src[0].sel = src0_sel;
> + alu.src[0].chan = src0_chan + 0;
> + alu.src[1].sel = src1_sel;
> + alu.src[1].chan = src1_chan + 0;
> + alu.src[1].neg = 0;
> + r = r600_bytecode_add_alu(ctx->bc, &alu);
> + if (r)
> + return r;
> +
> + memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> + alu.op = op;
> + alu.dst.sel = treg;
> + alu.dst.chan = 1;
> + alu.dst.write = 1;
> + alu.src[0].sel = treg;
> + alu.src[0].chan = 1;
> + alu.src[1].sel = treg;
> + alu.src[1].chan = 2;
> + alu.last = 1;
> + r = r600_bytecode_add_alu(ctx->bc, &alu);
> + if (r)
> + return r;
> + return 0;
> +}
> +
> +static int egcm_u64add(struct r600_shader_ctx *ctx)
Couldn't you call into emit_u64add for performing the actual add?
Or maybe it wouldn't really be simpler...
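
For reference, the four ALU operations that emit_u64add issues for the add case can be modelled in plain C as below. This is a hedged sketch, not driver code: `struct u64_pair` and `u64_add_pair` are hypothetical names, and it assumes ADDC_UINT yields the carry-out (0 or 1) of the low-word add, which is how the quoted patch appears to use it.

```c
#include <stdint.h>

/* Hypothetical helper type: a 64-bit value held as two 32-bit channels. */
struct u64_pair {
   uint32_t lo;
   uint32_t hi;
};

/* Model of the add path: low add, high add, carry of the low add,
 * then fold the carry into the high word. */
static struct u64_pair u64_add_pair(struct u64_pair a, struct u64_pair b)
{
   struct u64_pair r;
   uint32_t carry;

   r.lo = a.lo + b.lo;            /* low words, wraps modulo 2^32 */
   r.hi = a.hi + b.hi;            /* high words, carry not yet applied */
   carry = r.lo < a.lo ? 1 : 0;   /* unsigned wrap means a carry occurred */
   r.hi += carry;                 /* final high word */

   return r;
}
```

Adding `{lo = 0xFFFFFFFF, hi = 0}` and `{lo = 1, hi = 0}` gives `{lo = 0, hi = 1}`, which is the case the separate carry instruction exists to handle.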


> +{
> + struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
> + struct r600_bytecode_alu alu;
> + int r;
> + int treg = ctx->temp_reg;
> + int op = ALU_OP2_ADD_INT, opc = ALU_OP2_ADDC_UINT;
> +
> + if (ctx->src[1].neg) {
> + op = ALU_OP2_SUB_INT;
> + opc = ALU_OP2_SUBB_UINT;
> + }
> + memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> + alu.op = op;;
> + alu.dst.sel = treg;
> + alu.dst.chan = 0;
> + alu.dst.write = 1;
> + r600_bytecode_src(&alu.src[0], &ctx->src[0], 0);
> + r600_bytecode_src(&alu.src[1], &ctx->src[1], 0);
> + alu.src[1].neg = 0;
> + r = r600_bytecode_add_alu(ctx->bc, &alu);
> + if (r)
> + return r;
> +
> + memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> + alu.op = op;
> + alu.dst.sel = treg;
> + alu.dst.chan = 1;
> + alu.dst.write = 1;
> + r600_bytecode_src(&alu.src[0], &ctx->src[0], 1);
> + r600_bytecode_src(&alu.src[1], &ctx->src[1], 1);
> + alu.src[1].neg = 0;
> + r = r600_bytecode_add_alu(ctx->bc, &alu);
> + if (r)
> + return r;
> +
> + memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> + alu.op = opc  ;
> + alu.dst.sel = treg;
> + alu.dst.chan = 2;
> + alu.dst.write = 1;
> + alu.last = 1;
> + r600_bytecode_src(&alu.src[0], &ctx->src[0], 0);
> + r600_bytecode_src(&alu.src[1], &ctx->src[1], 0);
> + alu.src[1].neg = 0;
> + r = r600_bytecode_add_alu(ctx->bc, &alu);
> + if (r)
> + return r;
> +
> + memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> + alu.op = op;
> + tgsi_dst(ctx, &inst->Dst[0], 1, &alu.dst);
> + alu.src[0].sel = treg;
> + alu.src[0].chan = 1;
> + alu.src[1].sel = treg;
> + alu.src[1].chan = 2;
> + alu.last = 1;
> + r = r600_bytecode_add_alu(ctx->bc, &alu);
> + if (r)
> +  

Re: [Mesa-dev] [PATCH 1/4] anv/descriptor_set: add reference counting for descriptor set layouts

2018-01-25 Thread Jason Ekstrand
On Thu, Jan 25, 2018 at 4:24 AM, Iago Toral Quiroga 
wrote:

> The spec states that descriptor set layouts can be destroyed almost
> at any time:
>
>"VkDescriptorSetLayout objects may be accessed by commands that
> operate on descriptor sets allocated using that layout, and those
> descriptor sets must not be updated with vkUpdateDescriptorSets
> after the descriptor set layout has been destroyed. Otherwise,
> descriptor set layouts can be destroyed any time they are not in
> use by an API command."
>
> Fixes the following work-in-progress CTS tests:
> dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.graphics
> dEQP-VK.api.descriptor_set.descriptor_set_layout_lifetime.compute
>
> Suggested-by: Jason Ekstrand 
> ---
>  src/intel/vulkan/anv_cmd_buffer.c |  6 ++
>  src/intel/vulkan/anv_descriptor_set.c | 17 ++---
>  src/intel/vulkan/anv_private.h| 26 --
>  3 files changed, 40 insertions(+), 9 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_cmd_buffer.c b/src/intel/vulkan/anv_cmd_buffer.c
> index bf80061c6d4..521cf6b6a54 100644
> --- a/src/intel/vulkan/anv_cmd_buffer.c
> +++ b/src/intel/vulkan/anv_cmd_buffer.c
> @@ -913,8 +913,7 @@ void anv_CmdPushDescriptorSetKHR(
>
> assert(_set < MAX_SETS);
>
> -   const struct anv_descriptor_set_layout *set_layout =
> -  layout->set[_set].layout;
> +   struct anv_descriptor_set_layout *set_layout = layout->set[_set].layout;
>
> struct anv_push_descriptor_set *push_set =
>anv_cmd_buffer_get_push_descriptor_set(cmd_buffer,
> @@ -1006,8 +1005,7 @@ void anv_CmdPushDescriptorSetWithTemplateKHR(
>
> assert(_set < MAX_PUSH_DESCRIPTORS);
>
> -   const struct anv_descriptor_set_layout *set_layout =
> -  layout->set[_set].layout;
> +   struct anv_descriptor_set_layout *set_layout = layout->set[_set].layout;
>
> struct anv_push_descriptor_set *push_set =
>anv_cmd_buffer_get_push_descriptor_set(cmd_buffer,
> diff --git a/src/intel/vulkan/anv_descriptor_set.c b/src/intel/vulkan/anv_descriptor_set.c
> index 1d4df264ae6..99122aed229 100644
> --- a/src/intel/vulkan/anv_descriptor_set.c
> +++ b/src/intel/vulkan/anv_descriptor_set.c
> @@ -67,6 +67,8 @@ VkResult anv_CreateDescriptorSetLayout(
>return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
>
> memset(set_layout, 0, sizeof(*set_layout));
> +   set_layout->ref_cnt = 1;
> +   set_layout->allocator = pAllocator;
>

There's a sticky bit here around allocators.  Because we're reference
counting, there's no guarantee that the layout will actually be freed when
the application calls vkDestroyDescriptorSetLayout.  This means that
VK_SYSTEM_ALLOCATION_SCOPE_OBJECT is not appropriate.  Instead, we should
be using VK_SYSTEM_ALLOCATION_SCOPE_DEVICE and allocating it off the device
allocator, ignoring pAllocator.  That will probably cause a CTS warning (not
an error) in the allocation tests, but I think it's the right thing to do.

Other than that, looks good!
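
As an illustration of the scheme under discussion (not code from the patch), a reference-counted object whose storage always comes from one allocator could look like the sketch below. The names `struct layout`, `layout_create`, `layout_ref` and `layout_unref` are hypothetical, and `calloc`/`free` merely stand in for the device allocator.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted layout object. */
struct layout {
   atomic_uint ref_cnt;
   uint32_t binding_count;
};

static struct layout *layout_create(uint32_t binding_count)
{
   /* calloc stands in for the device allocator in this sketch. */
   struct layout *l = calloc(1, sizeof(*l));
   if (!l)
      return NULL;

   atomic_init(&l->ref_cnt, 1);   /* the creator holds the first reference */
   l->binding_count = binding_count;
   return l;
}

static void layout_ref(struct layout *l)
{
   atomic_fetch_add(&l->ref_cnt, 1);
}

/* Returns 1 if this call dropped the last reference and freed the object,
 * 0 otherwise. The free may happen long after the "destroy" call. */
static int layout_unref(struct layout *l)
{
   if (atomic_fetch_sub(&l->ref_cnt, 1) == 1) {
      free(l);
      return 1;
   }
   return 0;
}
```

Because the last `layout_unref` may come from a pipeline layout or descriptor set rather than from the destroy entry point, the memory cannot be tied to the allocator passed to that entry point, which is the reason for preferring a device-scope allocation here.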


> set_layout->binding_count = max_binding + 1;
>
> for (uint32_t b = 0; b <= max_binding; b++) {
> @@ -204,7 +206,8 @@ void anv_DestroyDescriptorSetLayout(
> if (!set_layout)
>return;
>
> -   vk_free2(&device->alloc, pAllocator, set_layout);
> +   assert(pAllocator == set_layout->allocator);
> +   anv_descriptor_set_layout_unref(device, set_layout);
>  }
>
>  static void
> @@ -246,6 +249,7 @@ VkResult anv_CreatePipelineLayout(
>ANV_FROM_HANDLE(anv_descriptor_set_layout, set_layout,
>pCreateInfo->pSetLayouts[set]);
>layout->set[set].layout = set_layout;
> +  anv_descriptor_set_layout_ref(set_layout);
>
>layout->set[set].dynamic_offset_start = dynamic_offset_count;
>for (uint32_t b = 0; b < set_layout->binding_count; b++) {
> @@ -290,6 +294,9 @@ void anv_DestroyPipelineLayout(
> if (!pipeline_layout)
>return;
>
> +   for (uint32_t i = 0; i < pipeline_layout->num_sets; i++)
> +  anv_descriptor_set_layout_unref(device, pipeline_layout->set[i].layout);
> +
> vk_free2(&device->alloc, pAllocator, pipeline_layout);
>  }
>
> @@ -423,7 +430,7 @@ struct surface_state_free_list_entry {
>  VkResult
>  anv_descriptor_set_create(struct anv_device *device,
>struct anv_descriptor_pool *pool,
> -  const struct anv_descriptor_set_layout *layout,
> +  struct anv_descriptor_set_layout *layout,
>struct anv_descriptor_set **out_set)
>  {
> struct anv_descriptor_set *set;
> @@ -455,8 +462,10 @@ anv_descriptor_set_create(struct anv_device *device,
>}
> }
>
> -   set->size = size;
> set->layout = layout;
> +   anv_descriptor_set_layout_ref(layout);
> +
> +   set->size = size;
> set->buffer_views =
>(struct anv_buffer_view *) &set->descriptors[layout->size];
> set->buffer_count = layout->buffer_count;
> @@ -512,6 +521,8 @@ 

Re: [Mesa-dev] [Mesa-stable] [PATCH] anv/pipeline: Don't look at blend state unless we have an attachment

2018-01-25 Thread Jason Ekstrand
It landed as 4b69ba381766cd911eb1284f1b0332a139ec8a75

On Thu, Jan 25, 2018 at 3:27 AM, Emil Velikov 
wrote:

> On 18 January 2018 at 01:16, Jason Ekstrand  wrote:
> > Without this, we may end up dereferencing blend before we check for
> > binding->index != UINT32_MAX.  However, Vulkan allows the blend state to
> > be NULL so long as you don't have any color attachments.  This fixes a
> > segfault when running The Talos Principle.
> >
> > Fixes: 12f4e00b69e724a23504b7bd3958fb75dc462950
> > Cc: mesa-sta...@lists.freedesktop.org
> > ---
> Jason, did this fall through the cracks or it has been
> superseded/rejected for some reason?
>
> -Emil
>


[Mesa-dev] [PATCH 09/12] st/va: enable dual instances encode only for H264

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Logic related to dual-instance encode should only be applied to H264,
not to other codecs.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/picture.c |  3 ++-
 src/gallium/state_trackers/va/surface.c | 23 +--
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c b/src/gallium/state_trackers/va/picture.c
index 77d379b..537e931 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -650,7 +650,8 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
}
 
context->decoder->end_frame(context->decoder, context->target, &context->desc.base);
-   if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
+   if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE &&
+  u_reduce_video_profile(context->templat.profile) == PIPE_VIDEO_FORMAT_MPEG4_AVC) {
   int idr_period = context->desc.h264enc.gop_size / context->gop_coeff;
   int p_remain_in_idr = idr_period - context->desc.h264enc.frame_num;
   surf->frame_num_cnt = context->desc.h264enc.frame_num_cnt;
diff --git a/src/gallium/state_trackers/va/surface.c b/src/gallium/state_trackers/va/surface.c
index 636505b..9823232 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -36,6 +36,7 @@
 #include "util/u_rect.h"
 #include "util/u_sampler.h"
 #include "util/u_surface.h"
+#include "util/u_video.h"
 
 #include "vl/vl_compositor.h"
 #include "vl/vl_video_buffer.h"
@@ -122,16 +123,18 @@ vlVaSyncSurface(VADriverContextP ctx, VASurfaceID render_target)
}
 
if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
-  int frame_diff;
-  if (context->desc.h264enc.frame_num_cnt >= surf->frame_num_cnt)
- frame_diff = context->desc.h264enc.frame_num_cnt - surf->frame_num_cnt;
-  else
- frame_diff = 0x - surf->frame_num_cnt + 1 + 
context->desc.h264enc.frame_num_cnt;
-  if ((frame_diff == 0) &&
-  (surf->force_flushed == false) &&
-  (context->desc.h264enc.frame_num_cnt % 2 != 0)) {
- context->decoder->flush(context->decoder);
- context->first_single_submitted = true;
+  if (u_reduce_video_profile(context->templat.profile) == PIPE_VIDEO_FORMAT_MPEG4_AVC) {
+ int frame_diff;
+ if (context->desc.h264enc.frame_num_cnt >= surf->frame_num_cnt)
+frame_diff = context->desc.h264enc.frame_num_cnt - surf->frame_num_cnt;
+ else
+frame_diff = 0x - surf->frame_num_cnt + 1 + 
context->desc.h264enc.frame_num_cnt;
+ if ((frame_diff == 0) &&
+ (surf->force_flushed == false) &&
+ (context->desc.h264enc.frame_num_cnt % 2 != 0)) {
+context->decoder->flush(context->decoder);
+context->first_single_submitted = true;
+ }
   }
   context->decoder->get_feedback(context->decoder, surf->feedback, &(surf->coded_buf->coded_size));
   surf->feedback = NULL;
-- 
2.7.4



[Mesa-dev] [PATCH 06/12] st/va: move H264 enc functions into separate file

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Move all H264 encode-related functions into a separate file. As on the
VAAPI decode side, there will be a separate file for each codec on the
encode side as well.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/Makefile.sources   |   1 +
 src/gallium/state_trackers/va/picture.c  | 146 +++-
 src/gallium/state_trackers/va/picture_h264_enc.c | 163 +++
 src/gallium/state_trackers/va/va_private.h   |   5 +
 4 files changed, 218 insertions(+), 97 deletions(-)
 create mode 100644 src/gallium/state_trackers/va/picture_h264_enc.c

diff --git a/src/gallium/state_trackers/va/Makefile.sources b/src/gallium/state_trackers/va/Makefile.sources
index 2d6546b..8a69828 100644
--- a/src/gallium/state_trackers/va/Makefile.sources
+++ b/src/gallium/state_trackers/va/Makefile.sources
@@ -8,6 +8,7 @@ C_SOURCES := \
picture_mpeg12.c \
picture_mpeg4.c \
picture_h264.c \
+   picture_h264_enc.c \
picture_hevc.c \
picture_vc1.c \
picture_mjpeg.c \
diff --git a/src/gallium/state_trackers/va/picture.c b/src/gallium/state_trackers/va/picture.c
index 8951573..77d379b 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -349,55 +349,52 @@ handleVASliceDataBufferType(vlVaContext *context, vlVaBuffer *buf)
 static VAStatus
 handleVAEncMiscParameterTypeRateControl(vlVaContext *context, VAEncMiscParameterBuffer *misc)
 {
-   VAEncMiscParameterRateControl *rc = (VAEncMiscParameterRateControl *)misc->data;
-   if (context->desc.h264enc.rate_ctrl.rate_ctrl_method ==
-   PIPE_H264_ENC_RATE_CONTROL_METHOD_CONSTANT)
-  context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second;
-   else
-  context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second * (rc->target_percentage / 100.0);
-   context->desc.h264enc.rate_ctrl.peak_bitrate = rc->bits_per_second;
-   if (context->desc.h264enc.rate_ctrl.target_bitrate < 200)
-  context->desc.h264enc.rate_ctrl.vbv_buffer_size = 
MIN2((context->desc.h264enc.rate_ctrl.target_bitrate * 2.75), 200);
-   else
-  context->desc.h264enc.rate_ctrl.vbv_buffer_size = 
context->desc.h264enc.rate_ctrl.target_bitrate;
+   VAStatus status = VA_STATUS_SUCCESS;
 
-   return VA_STATUS_SUCCESS;
+   switch (u_reduce_video_profile(context->templat.profile)) {
+   case PIPE_VIDEO_FORMAT_MPEG4_AVC:
+  status = vlVaHandleVAEncMiscParameterTypeRateControlH264(context, misc);
+  break;
+
+   default:
+  break;
+   }
+
+   return status;
 }
 
 static VAStatus
 handleVAEncMiscParameterTypeFrameRate(vlVaContext *context, VAEncMiscParameterBuffer *misc)
 {
-   VAEncMiscParameterFrameRate *fr = (VAEncMiscParameterFrameRate *)misc->data;
-   if (fr->framerate & 0x) {
-  context->desc.h264enc.rate_ctrl.frame_rate_num = fr->framerate   & 
0x;
-  context->desc.h264enc.rate_ctrl.frame_rate_den = fr->framerate >> 16 & 
0x;
-   } else {
-  context->desc.h264enc.rate_ctrl.frame_rate_num = fr->framerate;
-  context->desc.h264enc.rate_ctrl.frame_rate_den = 1;
+   VAStatus status = VA_STATUS_SUCCESS;
+
+   switch (u_reduce_video_profile(context->templat.profile)) {
+   case PIPE_VIDEO_FORMAT_MPEG4_AVC:
+  status = vlVaHandleVAEncMiscParameterTypeFrameRateH264(context, misc);
+  break;
+
+   default:
+  break;
}
-   return VA_STATUS_SUCCESS;
+
+   return status;
 }
 
 static VAStatus
 handleVAEncSequenceParameterBufferType(vlVaDriver *drv, vlVaContext *context, vlVaBuffer *buf)
 {
-   VAEncSequenceParameterBufferH264 *h264 = (VAEncSequenceParameterBufferH264 *)buf->data;
-   if (!context->decoder) {
-  context->templat.max_references = h264->max_num_ref_frames;
-  context->templat.level = h264->level_idc;
-  context->decoder = drv->pipe->create_video_codec(drv->pipe, &context->templat);
-  if (!context->decoder)
- return VA_STATUS_ERROR_ALLOCATION_FAILED;
+   VAStatus status = VA_STATUS_SUCCESS;
+
+   switch (u_reduce_video_profile(context->templat.profile)) {
+   case PIPE_VIDEO_FORMAT_MPEG4_AVC:
+  status = vlVaHandleVAEncSequenceParameterBufferTypeH264(drv, context, buf);
+  break;
+
+   default:
+  break;
}
 
-   context->gop_coeff = ((1024 + h264->intra_idr_period - 1) / h264->intra_idr_period + 1) / 2 * 2;
-   if (context->gop_coeff > VL_VA_ENC_GOP_COEFF)
-  context->gop_coeff = VL_VA_ENC_GOP_COEFF;
-   context->desc.h264enc.gop_size = h264->intra_idr_period * context->gop_coeff;
-   context->desc.h264enc.rate_ctrl.frame_rate_num = h264->time_scale / 2;
-   context->desc.h264enc.rate_ctrl.frame_rate_den = h264->num_units_in_tick;
-   context->desc.h264enc.pic_order_cnt_type = h264->seq_fields.bits.pic_order_cnt_type;
-   return VA_STATUS_SUCCESS;
+   return status;
 }
 
 static VAStatus
@@ -426,80 +423,35 @@ handleVAEncMiscParameterBufferType(vlVaContext *context, 

[Mesa-dev] [PATCH 10/12] st/va: add HEVC encode functions

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Add a separate file for HEVC encode functions.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/Makefile.sources   |  1 +
 src/gallium/state_trackers/va/picture.c  | 56 +--
 src/gallium/state_trackers/va/picture_hevc_enc.c | 69 
 src/gallium/state_trackers/va/va_private.h   |  5 ++
 4 files changed, 128 insertions(+), 3 deletions(-)
 create mode 100644 src/gallium/state_trackers/va/picture_hevc_enc.c

diff --git a/src/gallium/state_trackers/va/Makefile.sources b/src/gallium/state_trackers/va/Makefile.sources
index 8a69828..f3a13f2 100644
--- a/src/gallium/state_trackers/va/Makefile.sources
+++ b/src/gallium/state_trackers/va/Makefile.sources
@@ -10,6 +10,7 @@ C_SOURCES := \
picture_h264.c \
picture_h264_enc.c \
picture_hevc.c \
+   picture_hevc_enc.c \
picture_vc1.c \
picture_mjpeg.c \
postproc.c \
diff --git a/src/gallium/state_trackers/va/picture.c b/src/gallium/state_trackers/va/picture.c
index 537e931..e26996c 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -139,6 +139,31 @@ getEncParamPreset(vlVaContext *context)
context->desc.h264enc.ref_pic_mode = 0x0201;
 }
 
+static void
+getEncParamPresetH265(vlVaContext *context)
+{
+   //rate control
+   context->desc.h265enc.rc.vbv_buffer_size = 2000;
+   context->desc.h265enc.rc.vbv_buf_lv = 48;
+   context->desc.h265enc.rc.fill_data_enable = 1;
+   context->desc.h265enc.rc.enforce_hrd = 1;
+   if (context->desc.h265enc.rc.frame_rate_num == 0 ||
+ context->desc.h265enc.rc.frame_rate_den == 0) {
+  context->desc.h265enc.rc.frame_rate_num = 30;
+  context->desc.h265enc.rc.frame_rate_den = 1;
+   }
+   context->desc.h265enc.rc.target_bits_picture =
+  context->desc.h265enc.rc.target_bitrate *
+  ((float)context->desc.h265enc.rc.frame_rate_den /
+   context->desc.h265enc.rc.frame_rate_num);
+   context->desc.h265enc.rc.peak_bits_picture_integer =
+  context->desc.h265enc.rc.peak_bitrate *
+  ((float)context->desc.h265enc.rc.frame_rate_den /
+   context->desc.h265enc.rc.frame_rate_num);
+
+   context->desc.h265enc.rc.peak_bits_picture_fraction = 0;
+}
+
 static VAStatus
 handlePictureParameterBuffer(vlVaDriver *drv, vlVaContext *context, vlVaBuffer *buf)
 {
@@ -356,6 +381,10 @@ handleVAEncMiscParameterTypeRateControl(vlVaContext *context, VAEncMiscParameter
   status = vlVaHandleVAEncMiscParameterTypeRateControlH264(context, misc);
   break;
 
+   case PIPE_VIDEO_FORMAT_HEVC:
+  status = vlVaHandleVAEncMiscParameterTypeRateControlHEVC(context, misc);
+  break;
+
default:
   break;
}
@@ -373,6 +402,10 @@ handleVAEncMiscParameterTypeFrameRate(vlVaContext *context, VAEncMiscParameterBu
   status = vlVaHandleVAEncMiscParameterTypeFrameRateH264(context, misc);
   break;
 
+   case PIPE_VIDEO_FORMAT_HEVC:
+  status = vlVaHandleVAEncMiscParameterTypeFrameRateHEVC(context, misc);
+  break;
+
default:
   break;
}
@@ -390,6 +423,10 @@ handleVAEncSequenceParameterBufferType(vlVaDriver *drv, vlVaContext *context, vl
   status = vlVaHandleVAEncSequenceParameterBufferTypeH264(drv, context, buf);
   break;
 
+   case PIPE_VIDEO_FORMAT_HEVC:
+  status = vlVaHandleVAEncSequenceParameterBufferTypeHEVC(drv, context, buf);
+  break;
+
default:
   break;
}
@@ -430,6 +467,10 @@ handleVAEncPictureParameterBufferType(vlVaDriver *drv, vlVaContext *context, vlV
   status = vlVaHandleVAEncPictureParameterBufferTypeH264(drv, context, buf);
   break;
 
+   case PIPE_VIDEO_FORMAT_HEVC:
+  status = vlVaHandleVAEncPictureParameterBufferTypeHEVC(drv, context, buf);
+  break;
+
default:
   break;
}
@@ -447,6 +488,10 @@ handleVAEncSliceParameterBufferType(vlVaDriver *drv, vlVaContext *context, vlVaB
   status = vlVaHandleVAEncSliceParameterBufferTypeH264(drv, context, buf);
   break;
 
+   case PIPE_VIDEO_FORMAT_HEVC:
+  status = vlVaHandleVAEncSliceParameterBufferTypeHEVC(drv, context, buf);
+  break;
+
default:
   break;
}
@@ -640,8 +685,11 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
 
if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
   coded_buf = context->coded_buf;
-  getEncParamPreset(context);
-  context->desc.h264enc.frame_num_cnt++;
+  if (u_reduce_video_profile(context->templat.profile) == PIPE_VIDEO_FORMAT_MPEG4_AVC) {
+ getEncParamPreset(context);
+ context->desc.h264enc.frame_num_cnt++;
+  } else if (u_reduce_video_profile(context->templat.profile) == PIPE_VIDEO_FORMAT_HEVC)
+ getEncParamPresetH265(context);
   context->decoder->begin_frame(context->decoder, context->target, &context->desc.base);
   context->decoder->encode_bitstream(context->decoder, 

[Mesa-dev] [PATCH 12/12] radeonsi: enable vcn encode for HEVC main

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Enable vcn encode for HEVC main profile on Raven.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/drivers/radeonsi/si_get.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_get.c 
b/src/gallium/drivers/radeonsi/si_get.c
index 1c84a25..8382721 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -588,8 +588,10 @@ static int si_get_video_param(struct pipe_screen *screen,
if (entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
switch (param) {
case PIPE_VIDEO_CAP_SUPPORTED:
-   return codec == PIPE_VIDEO_FORMAT_MPEG4_AVC &&
+   return (codec == PIPE_VIDEO_FORMAT_MPEG4_AVC &&
(si_vce_is_fw_version_supported(sscreen) ||
+   sscreen->info.family == CHIP_RAVEN)) ||
+   (profile == PIPE_VIDEO_PROFILE_HEVC_MAIN &&
sscreen->info.family == CHIP_RAVEN);
case PIPE_VIDEO_CAP_NPOT_TEXTURES:
return 1;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 11/12] st/va: implement HEVC encode functions

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Implement HEVC encode functions based on VAAPI HEVC encode interface.
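
As a rough illustration of the picture-type selection done in the picture
parameter handler, the sketch below maps a VAAPI-style coding type and IDR
flag to an encoder picture type. The names pic_type and map_coding_type are
hypothetical stand-ins, not Mesa identifiers; only the numeric mapping
(1 = intra, 2 = predicted, 3 to 5 rejected because B frames are not
supported) follows the patch. Treating every other value as unsupported is
an assumption of the sketch.

```c
/* Hypothetical stand-ins for PIPE_H265_ENC_PICTURE_TYPE_*; the numeric
 * values mirror the enum added in patch 01/12. */
enum pic_type {
   PIC_TYPE_UNSUPPORTED = -1,
   PIC_TYPE_P = 0,
   PIC_TYPE_B = 1,
   PIC_TYPE_I = 2,
   PIC_TYPE_IDR = 3
};

/* Map a coding type (1 = I, 2 = P, 3..5 = B variants) to a picture type.
 * Assumption: anything other than 1 or 2 is reported as unsupported. */
enum pic_type map_coding_type(unsigned coding_type, int idr_pic_flag)
{
   switch (coding_type) {
   case 1:
      /* Intra picture: the IDR flag decides between IDR and plain I. */
      return idr_pic_flag ? PIC_TYPE_IDR : PIC_TYPE_I;
   case 2:
      return PIC_TYPE_P;
   default:
      /* B frames (3, 4, 5) and unknown values are rejected. */
      return PIC_TYPE_UNSUPPORTED;
   }
}
```

A caller would translate PIC_TYPE_UNSUPPORTED into an error status such as
VA_STATUS_ERROR_UNIMPLEMENTED before touching any encoder state.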

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/picture_hevc_enc.c | 130 ++-
 1 file changed, 125 insertions(+), 5 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture_hevc_enc.c 
b/src/gallium/state_trackers/va/picture_hevc_enc.c
index 144bb8c..1f14098 100644
--- a/src/gallium/state_trackers/va/picture_hevc_enc.c
+++ b/src/gallium/state_trackers/va/picture_hevc_enc.c
@@ -32,7 +32,50 @@
 VAStatus
 vlVaHandleVAEncPictureParameterBufferTypeHEVC(vlVaDriver *drv, vlVaContext 
*context, vlVaBuffer *buf)
 {
-   /* TODO */
+   VAEncPictureParameterBufferHEVC *h265;
+   vlVaBuffer *coded_buf;
+   int i;
+
+   h265 = buf->data;
+   context->desc.h265enc.decoded_curr_pic = h265->decoded_curr_pic.picture_id;
+
+   for (i = 0; i < 15; i++)
+  context->desc.h265enc.reference_frames[i] = 
h265->reference_frames[i].picture_id;
+
+   context->desc.h265enc.pic_order_cnt = h265->decoded_curr_pic.pic_order_cnt;
+   coded_buf = handle_table_get(drv->htab, h265->coded_buf);
+
+   if (!coded_buf->derived_surface.resource)
+  coded_buf->derived_surface.resource = 
pipe_buffer_create(drv->pipe->screen, PIPE_BIND_VERTEX_BUFFER,
+PIPE_USAGE_STREAM, 
coded_buf->size);
+
+   context->coded_buf = coded_buf;
+   context->desc.h265enc.pic.log2_parallel_merge_level_minus2 = 
h265->log2_parallel_merge_level_minus2;
+   context->desc.h265enc.pic.nal_unit_type = h265->nal_unit_type;
+
+   switch(h265->pic_fields.bits.coding_type) {
+   case 1:
+  if (h265->pic_fields.bits.idr_pic_flag)
+ context->desc.h265enc.picture_type = PIPE_H265_ENC_PICTURE_TYPE_IDR;
+  else
+ context->desc.h265enc.picture_type = PIPE_H265_ENC_PICTURE_TYPE_I;
+  break;
+   case 2:
+  context->desc.h265enc.picture_type = PIPE_H265_ENC_PICTURE_TYPE_P;
+  break;
+   case 3:
+   case 4:
+   case 5:
+  return VA_STATUS_ERROR_UNIMPLEMENTED; //no b frame support
+  break;
+   }
+
+   context->desc.h265enc.pic.constrained_intra_pred_flag = 
h265->pic_fields.bits.constrained_intra_pred_flag;
+   context->desc.h265enc.pic.loop_filter_across_tiles_enabled_flag = 
h265->pic_fields.bits.loop_filter_across_tiles_enabled_flag;
+
+   util_hash_table_set(context->desc.h265enc.frame_idx,
+   UINT_TO_PTR(h265->decoded_curr_pic.picture_id),
+   UINT_TO_PTR(context->desc.h265enc.frame_num));
 
return VA_STATUS_SUCCESS;
 }
@@ -40,7 +83,33 @@ vlVaHandleVAEncPictureParameterBufferTypeHEVC(vlVaDriver 
*drv, vlVaContext *cont
 VAStatus
 vlVaHandleVAEncSliceParameterBufferTypeHEVC(vlVaDriver *drv, vlVaContext 
*context, vlVaBuffer *buf)
 {
-   /* TODO */
+   VAEncSliceParameterBufferHEVC *h265;
+
+   h265 = buf->data;
+   context->desc.h265enc.ref_idx_l0 = VA_INVALID_ID;
+   context->desc.h265enc.ref_idx_l1 = VA_INVALID_ID;
+
+   for (int i = 0; i < 15; i++) {
+  if (h265->ref_pic_list0[i].picture_id != VA_INVALID_ID) {
+ if (context->desc.h265enc.ref_idx_l0 == VA_INVALID_ID)
+context->desc.h265enc.ref_idx_l0 = 
PTR_TO_UINT(util_hash_table_get(context->desc.h265enc.frame_idx,
+   
UINT_TO_PTR(h265->ref_pic_list0[i].picture_id)));
+  }
+  if (h265->ref_pic_list1[i].picture_id != VA_INVALID_ID && 
h265->slice_type == 1) {
+ if (context->desc.h265enc.ref_idx_l1 == VA_INVALID_ID)
+context->desc.h265enc.ref_idx_l1 = 
PTR_TO_UINT(util_hash_table_get(context->desc.h265enc.frame_idx,
+   
UINT_TO_PTR(h265->ref_pic_list1[i].picture_id)));
+  }
+   }
+
+   context->desc.h265enc.slice.max_num_merge_cand = h265->max_num_merge_cand;
+   context->desc.h265enc.slice.slice_cb_qp_offset = h265->slice_cb_qp_offset;
+   context->desc.h265enc.slice.slice_cr_qp_offset = h265->slice_cr_qp_offset;
+   context->desc.h265enc.slice.slice_beta_offset_div2 = 
h265->slice_beta_offset_div2;
+   context->desc.h265enc.slice.slice_tc_offset_div2 = 
h265->slice_tc_offset_div2;
+   context->desc.h265enc.slice.cabac_init_flag = 
h265->slice_fields.bits.cabac_init_flag;
+   context->desc.h265enc.slice.slice_deblocking_filter_disabled_flag = 
h265->slice_fields.bits.slice_deblocking_filter_disabled_flag;
+   context->desc.h265enc.slice.slice_loop_filter_across_slices_enabled_flag = 
h265->slice_fields.bits.slice_loop_filter_across_slices_enabled_flag;
 
return VA_STATUS_SUCCESS;
 }
@@ -48,14 +117,57 @@ vlVaHandleVAEncSliceParameterBufferTypeHEVC(vlVaDriver 
*drv, vlVaContext *contex
 VAStatus
 vlVaHandleVAEncSequenceParameterBufferTypeHEVC(vlVaDriver *drv, vlVaContext 
*context, vlVaBuffer *buf)
 {
-   /* TODO */
+   VAEncSequenceParameterBufferHEVC *h265 = (VAEncSequenceParameterBufferHEVC 
*)buf->data;
+
+   if (!context->decoder) {
+  context->templat.level = 

[Mesa-dev] [PATCH 02/12] radeon/vcn: add vcn encode interface for HEVC

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Add vcn encode interface for HEVC, and rename radeon_enc_h264_enc_pic
to radeon_enc_pic since radeon_enc_pic is used by both H264 and HEVC.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/drivers/radeon/radeon_vcn_enc.h | 82 -
 1 file changed, 80 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_enc.h 
b/src/gallium/drivers/radeon/radeon_vcn_enc.h
index 0385860..2ec42e4 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_enc.h
+++ b/src/gallium/drivers/radeon/radeon_vcn_enc.h
@@ -48,6 +48,10 @@
 #define RENCODE_IB_PARAM_FEEDBACK_BUFFER   0x0010
 #define RENCODE_IB_PARAM_DIRECT_OUTPUT_NALU 0x0020
 
+#define RENCODE_HEVC_IB_PARAM_SLICE_CONTROL 0x0011
+#define RENCODE_HEVC_IB_PARAM_SPEC_MISC 0x0012
+#define RENCODE_HEVC_IB_PARAM_DEBLOCKING_FILTER 0x0013
+
 #define RENCODE_H264_IB_PARAM_SLICE_CONTROL 0x0021
 #define RENCODE_H264_IB_PARAM_SPEC_MISC 0x0022
 #define RENCODE_H264_IB_PARAM_ENCODE_PARAMS 0x0023
@@ -67,6 +71,7 @@
 #define RENCODE_IF_MINOR_VERSION_MASK  0x
 #define RENCODE_IF_MINOR_VERSION_SHIFT 0
 
+#define RENCODE_ENCODE_STANDARD_HEVC   0
 #define RENCODE_ENCODE_STANDARD_H264   1
 
 #define RENCODE_PREENCODE_MODE_NONE0x
@@ -77,6 +82,9 @@
 #define RENCODE_H264_SLICE_CONTROL_MODE_FIXED_MBS  0x
 #define RENCODE_H264_SLICE_CONTROL_MODE_FIXED_BITS 0x0001
 
+#define RENCODE_HEVC_SLICE_CONTROL_MODE_FIXED_CTBS 0x
+#define RENCODE_HEVC_SLICE_CONTROL_MODE_FIXED_BITS 0x0001
+
 #define RENCODE_RATE_CONTROL_METHOD_NONE   0x
 #define RENCODE_RATE_CONTROL_METHOD_LATENCY_CONSTRAINED_VBR 0x0001
 #define RENCODE_RATE_CONTROL_METHOD_PEAK_CONSTRAINED_VBR   0x0002
@@ -95,6 +103,11 @@
 #define RENCODE_HEADER_INSTRUCTION_END 0x
 #define RENCODE_HEADER_INSTRUCTION_COPY 0x0001
 
+#define RENCODE_HEVC_HEADER_INSTRUCTION_DEPENDENT_SLICE_END 0x0001
+#define RENCODE_HEVC_HEADER_INSTRUCTION_FIRST_SLICE 0x00010001
+#define RENCODE_HEVC_HEADER_INSTRUCTION_SLICE_SEGMENT  0x00010002
+#define RENCODE_HEVC_HEADER_INSTRUCTION_SLICE_QP_DELTA 0x00010003
+
 #define RENCODE_H264_HEADER_INSTRUCTION_FIRST_MB   0x0002
 #define RENCODE_H264_HEADER_INSTRUCTION_SLICE_QP_DELTA 0x00020001
 
@@ -181,6 +194,25 @@ typedef struct rvcn_enc_h264_slice_control_s
 };
 } rvcn_enc_h264_slice_control_t;
 
+typedef struct rvcn_enc_hevc_slice_control_s
+{
+uint32_t   slice_control_mode;
+union
+{
+struct
+{
+uint32_t   num_ctbs_per_slice;
+uint32_t   num_ctbs_per_slice_segment;
+} fixed_ctbs_per_slice;
+
+struct
+{
+uint32_t   num_bits_per_slice;
+uint32_t   num_bits_per_slice_segment;
+} fixed_bits_per_slice;
+};
+} rvcn_enc_hevc_slice_control_t;
+
 typedef struct rvcn_enc_h264_spec_misc_s
 {
 uint32_t   constrained_intra_pred_flag;
@@ -192,6 +224,17 @@ typedef struct rvcn_enc_h264_spec_misc_s
 uint32_t   level_idc;
 } rvcn_enc_h264_spec_misc_t;
 
+typedef struct rvcn_enc_hevc_spec_misc_s
+{
+uint32_t   log2_min_luma_coding_block_size_minus3;
+uint32_t   amp_disabled;
+uint32_t   strong_intra_smoothing_enabled;
+uint32_t   constrained_intra_pred_flag;
+uint32_t   cabac_init_flag;
+uint32_t   half_pel_enabled;
+uint32_t   quarter_pel_enabled;
+} rvcn_enc_hevc_spec_misc_t;
+
 typedef struct rvcn_enc_rate_ctl_session_init_s
 {
 uint32_t   rate_control_method;
@@ -276,6 +319,16 @@ typedef struct rvcn_enc_h264_deblocking_filter_s
 int32_t cr_qp_offset;
 } rvcn_enc_h264_deblocking_filter_t;
 
+typedef struct rvcn_enc_hevc_deblocking_filter_s
+{
+uint32_t   loop_filter_across_slices_enabled;
+int32_t deblocking_filter_disabled;
+int32_t beta_offset_div2;
+int32_t tc_offset_div2;
+int32_t cb_qp_offset;
+int32_t cr_qp_offset;
+} rvcn_enc_hevc_deblocking_filter_t;
+
 typedef struct rvcn_enc_intra_refresh_s
 {
 uint32_t   intra_refresh_mode;
@@ -331,7 +384,7 @@ struct pipe_video_codec *radeon_create_encoder(struct 
pipe_context *context,
struct radeon_winsys* ws,
radeon_enc_get_buffer get_buffer);
 
-struct radeon_enc_h264_enc_pic {
+struct radeon_enc_pic {
enum pipe_h264_enc_picture_type picture_type;
 
unsigned frame_num;
@@ -343,21 +396,46 @@ struct radeon_enc_h264_enc_pic {
unsigned crop_right;
unsigned

[Mesa-dev] [PATCH 03/12] radeon/vcn: support picture parameters for HEVC

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Pass pipe_picture_desc instead of pipe_h264_enc_picture_desc so that
it can be used for different codecs. Add functions to handle picture
parameters that will be used for HEVC encode.
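
The pattern described above, passing a common base descriptor and
downcasting once the codec is known, can be sketched as follows. The types
and the function get_frame_num are simplified, hypothetical stand-ins for
pipe_picture_desc and its codec-specific variants; the sketch relies only on
the C guarantee that a pointer to a struct may be converted to a pointer to
its first member and back.

```c
/* Hypothetical, simplified stand-ins for the gallium descriptors. */
enum codec { CODEC_AVC, CODEC_HEVC };

struct picture_desc {
   enum codec codec;            /* plays the role of the reduced profile */
};

struct avc_enc_desc {
   struct picture_desc base;    /* must be the first member */
   unsigned frame_num;
};

struct hevc_enc_desc {
   struct picture_desc base;    /* must be the first member */
   unsigned frame_num;
   unsigned intra_period;
};

/* Accept the generic descriptor and downcast only after checking the
 * codec, so one entry point can serve both encoders. */
unsigned get_frame_num(const struct picture_desc *picture)
{
   switch (picture->codec) {
   case CODEC_AVC:
      return ((const struct avc_enc_desc *)picture)->frame_num;
   case CODEC_HEVC:
      return ((const struct hevc_enc_desc *)picture)->frame_num;
   }
   return 0; /* unknown codec: nothing to report */
}
```

The cast is only valid when the base struct is the first member of the
codec-specific struct, which is why the real descriptors embed
pipe_picture_desc as "base" at offset zero.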

Signed-off-by: Boyuan Zhang 
---
 src/gallium/drivers/radeon/radeon_vcn_enc.c | 73 +++--
 src/gallium/drivers/radeon/radeon_vcn_enc.h |  2 +-
 src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c | 11 ++--
 3 files changed, 65 insertions(+), 21 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_enc.c 
b/src/gallium/drivers/radeon/radeon_vcn_enc.c
index 06579c8..20be5e6 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_enc.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_enc.c
@@ -38,20 +38,62 @@
 #include "radeon_video.h"
 #include "radeon_vcn_enc.h"
 
-static void radeon_vcn_enc_get_param(struct radeon_encoder *enc, struct 
pipe_h264_enc_picture_desc *pic)
+static void radeon_vcn_enc_get_param(struct radeon_encoder *enc, struct 
pipe_picture_desc *picture)
 {
-   enc->enc_pic.picture_type = pic->picture_type;
-   enc->enc_pic.frame_num = pic->frame_num;
-   enc->enc_pic.pic_order_cnt = pic->pic_order_cnt;
-   enc->enc_pic.pic_order_cnt_type = pic->pic_order_cnt_type;
-   enc->enc_pic.ref_idx_l0 = pic->ref_idx_l0;
-   enc->enc_pic.ref_idx_l1 = pic->ref_idx_l1;
-   enc->enc_pic.not_referenced = pic->not_referenced;
-   enc->enc_pic.is_idr = (pic->picture_type == 
PIPE_H264_ENC_PICTURE_TYPE_IDR);
-   enc->enc_pic.crop_left = 0;
-   enc->enc_pic.crop_right = (align(enc->base.width, 16) - 
enc->base.width) / 2;
-   enc->enc_pic.crop_top = 0;
-   enc->enc_pic.crop_bottom = (align(enc->base.height, 16) - 
enc->base.height) / 2;
+   if (u_reduce_video_profile(picture->profile) == 
PIPE_VIDEO_FORMAT_MPEG4_AVC) {
+  struct pipe_h264_enc_picture_desc *pic = (struct 
pipe_h264_enc_picture_desc *)picture;
+  enc->enc_pic.picture_type = pic->picture_type;
+  enc->enc_pic.frame_num = pic->frame_num;
+  enc->enc_pic.pic_order_cnt = pic->pic_order_cnt;
+  enc->enc_pic.pic_order_cnt_type = pic->pic_order_cnt_type;
+  enc->enc_pic.ref_idx_l0 = pic->ref_idx_l0;
+  enc->enc_pic.ref_idx_l1 = pic->ref_idx_l1;
+  enc->enc_pic.not_referenced = pic->not_referenced;
+  enc->enc_pic.is_idr = (pic->picture_type == 
PIPE_H264_ENC_PICTURE_TYPE_IDR);
+  enc->enc_pic.crop_left = 0;
+  enc->enc_pic.crop_right = (align(enc->base.width, 16) - enc->base.width) 
/ 2;
+  enc->enc_pic.crop_top = 0;
+  enc->enc_pic.crop_bottom = (align(enc->base.height, 16) - 
enc->base.height) / 2;
+   } else if (u_reduce_video_profile(picture->profile) == 
PIPE_VIDEO_FORMAT_HEVC) {
+  struct pipe_h265_enc_picture_desc *pic = (struct 
pipe_h265_enc_picture_desc *)picture;
+  enc->enc_pic.picture_type = pic->picture_type;
+  enc->enc_pic.frame_num = pic->frame_num;
+  enc->enc_pic.pic_order_cnt = pic->pic_order_cnt;
+  enc->enc_pic.pic_order_cnt_type = pic->pic_order_cnt_type;
+  enc->enc_pic.ref_idx_l0 = pic->ref_idx_l0;
+  enc->enc_pic.ref_idx_l1 = pic->ref_idx_l1;
+  enc->enc_pic.not_referenced = pic->not_referenced;
+  enc->enc_pic.is_idr = (pic->picture_type == 
PIPE_H265_ENC_PICTURE_TYPE_IDR) ||
+(pic->picture_type == 
PIPE_H265_ENC_PICTURE_TYPE_I);
+  enc->enc_pic.crop_left = 0;
+  enc->enc_pic.crop_right = (align(enc->base.width, 16) - enc->base.width) 
/ 2;
+  enc->enc_pic.crop_top = 0;
+  enc->enc_pic.crop_bottom = (align(enc->base.height, 16) - 
enc->base.height) / 2;
+  enc->enc_pic.general_tier_flag = pic->seq.general_tier_flag;
+  enc->enc_pic.general_profile_idc = pic->seq.general_profile_idc;
+  enc->enc_pic.general_level_idc = pic->seq.general_level_idc;
+  enc->enc_pic.max_poc = pic->seq.intra_period;
+  enc->enc_pic.log2_max_poc = 0;
+  for (int i = enc->enc_pic.max_poc; i != 0; enc->enc_pic.log2_max_poc++)
+ i = (i >> 1);
+  enc->enc_pic.chroma_format_idc = pic->seq.chroma_format_idc;
+  enc->enc_pic.pic_width_in_luma_samples = 
pic->seq.pic_width_in_luma_samples;
+  enc->enc_pic.pic_height_in_luma_samples = 
pic->seq.pic_height_in_luma_samples;
+  enc->enc_pic.log2_diff_max_min_luma_coding_block_size = 
pic->seq.log2_diff_max_min_luma_coding_block_size;
+  enc->enc_pic.log2_min_transform_block_size_minus2 = 
pic->seq.log2_min_transform_block_size_minus2;
+  enc->enc_pic.log2_diff_max_min_transform_block_size = 
pic->seq.log2_diff_max_min_transform_block_size;
+  enc->enc_pic.max_transform_hierarchy_depth_inter = 
pic->seq.max_transform_hierarchy_depth_inter;
+  enc->enc_pic.max_transform_hierarchy_depth_intra = 
pic->seq.max_transform_hierarchy_depth_intra;
+  enc->enc_pic.log2_parallel_merge_level_minus2 = 
pic->pic.log2_parallel_merge_level_minus2;
+  enc->enc_pic.bit_depth_luma_minus8 = 

[Mesa-dev] [PATCH 05/12] radeon/vcn: add header implementations for HEVC

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Implement encoding of sps, pps, vps, aud, and slice headers for HEVC
based on HEVC specs.
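
Most SPS and PPS syntax elements are written either as fixed-width fields
or as unsigned Exp-Golomb codes, ue(v). The following is a minimal,
self-contained sketch of both writers. The names bit_writer, put_bits and
put_ue are hypothetical and are not the radeon_enc_code_fixed_bits or
radeon_enc_code_ue helpers used in the patch; emulation prevention is not
modelled, and the buffer is assumed to be zero-initialised by the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit writer over a caller-zeroed buffer. */
struct bit_writer {
   uint8_t *buf;
   size_t cap;      /* capacity in bytes */
   size_t bitpos;   /* number of bits written so far */
};

/* Append the low nbits of value, most significant bit first.
 * Bits that would not fit in the buffer are dropped. */
void put_bits(struct bit_writer *w, uint32_t value, unsigned nbits)
{
   for (unsigned i = nbits; i-- > 0; ) {
      size_t byte = w->bitpos / 8;
      if (byte >= w->cap)
         return;
      unsigned bit = (value >> i) & 1u;
      w->buf[byte] |= (uint8_t)(bit << (7 - w->bitpos % 8));
      w->bitpos++;
   }
}

/* Unsigned Exp-Golomb: write (len - 1) zero bits, then value + 1 in
 * len bits, where len is the bit length of value + 1.
 * Assumes value < UINT32_MAX so that value + 1 does not wrap. */
void put_ue(struct bit_writer *w, uint32_t value)
{
   uint32_t code = value + 1;
   unsigned len = 0;

   for (uint32_t t = code; t != 0; t >>= 1)
      len++;

   put_bits(w, 0, len - 1);
   put_bits(w, code, len);
}
```

With this scheme 0 is coded as "1", 1 as "010", 2 as "011" and 3 as
"00100", so small values, which dominate header fields, cost very few bits.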

Signed-off-by: Boyuan Zhang 
---
 src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c | 348 +++-
 1 file changed, 347 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c 
b/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
index a651f7e..74c4a08 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
@@ -551,6 +551,86 @@ static void radeon_enc_nalu_sps(struct radeon_encoder *enc)
RADEON_ENC_END();
 }
 
+static void radeon_enc_nalu_sps_hevc(struct radeon_encoder *enc)
+{
+   RADEON_ENC_BEGIN(RENCODE_IB_PARAM_DIRECT_OUTPUT_NALU);
+   RADEON_ENC_CS(RENCODE_DIRECT_OUTPUT_NALU_TYPE_SPS);
+   uint32_t *size_in_bytes = &enc->cs->current.buf[enc->cs->current.cdw++];
+   int i;
+
+   radeon_enc_reset(enc);
+   radeon_enc_set_emulation_prevention(enc, false);
+   radeon_enc_code_fixed_bits(enc, 0x0001, 32);
+   radeon_enc_code_fixed_bits(enc, 0x4201, 16);
+   radeon_enc_byte_align(enc);
+   radeon_enc_set_emulation_prevention(enc, true);
+   radeon_enc_code_fixed_bits(enc, 0x0, 4);
+   radeon_enc_code_fixed_bits(enc, 
enc->enc_pic.layer_ctrl.max_num_temporal_layers - 1, 3);
+   radeon_enc_code_fixed_bits(enc, 0x1, 1);
+   radeon_enc_code_fixed_bits(enc, 0x0, 2);
+   radeon_enc_code_fixed_bits(enc, enc->enc_pic.general_tier_flag, 1);
+   radeon_enc_code_fixed_bits(enc, enc->enc_pic.general_profile_idc, 5);
+   radeon_enc_code_fixed_bits(enc, 0x6000, 32);
+   radeon_enc_code_fixed_bits(enc, 0xb000, 32);
+   radeon_enc_code_fixed_bits(enc, 0x0, 16);
+   radeon_enc_code_fixed_bits(enc, enc->enc_pic.general_level_idc, 8);
+
+   for (i = 0; i < (enc->enc_pic.layer_ctrl.max_num_temporal_layers - 1) ; 
i++)
+   radeon_enc_code_fixed_bits(enc, 0x0, 2);
+
+   if ((enc->enc_pic.layer_ctrl.max_num_temporal_layers - 1) > 0) {
+   for (i = (enc->enc_pic.layer_ctrl.max_num_temporal_layers - 1); 
i < 8; i++)
+   radeon_enc_code_fixed_bits(enc, 0x0, 2);
+   }
+
+   radeon_enc_code_ue(enc, 0x0);
+   radeon_enc_code_ue(enc, enc->enc_pic.chroma_format_idc);
+   radeon_enc_code_ue(enc, enc->enc_pic.pic_width_in_luma_samples);
+   radeon_enc_code_ue(enc, enc->enc_pic.pic_height_in_luma_samples);
+   radeon_enc_code_fixed_bits(enc, 0x0, 1);
+   radeon_enc_code_ue(enc, enc->enc_pic.bit_depth_luma_minus8);
+   radeon_enc_code_ue(enc, enc->enc_pic.bit_depth_chroma_minus8);
+   radeon_enc_code_ue(enc, enc->enc_pic.log2_max_poc - 4);
+   radeon_enc_code_fixed_bits(enc, 0x0, 1);
+   radeon_enc_code_ue(enc, 1);
+   radeon_enc_code_ue(enc, 0x0);
+   radeon_enc_code_ue(enc, 0x0);
+   radeon_enc_code_ue(enc, 
enc->enc_pic.hevc_spec_misc.log2_min_luma_coding_block_size_minus3);
+   //Only support CTBSize 64
+   radeon_enc_code_ue(enc, 6 - 
(enc->enc_pic.hevc_spec_misc.log2_min_luma_coding_block_size_minus3 + 3));
+   radeon_enc_code_ue(enc, 
enc->enc_pic.log2_min_transform_block_size_minus2);
+   radeon_enc_code_ue(enc, 
enc->enc_pic.log2_diff_max_min_transform_block_size);
+   radeon_enc_code_ue(enc, 
enc->enc_pic.max_transform_hierarchy_depth_inter);
+   radeon_enc_code_ue(enc, 
enc->enc_pic.max_transform_hierarchy_depth_intra);
+
+   radeon_enc_code_fixed_bits(enc, 0x0, 1);
+   radeon_enc_code_fixed_bits(enc, 
!enc->enc_pic.hevc_spec_misc.amp_disabled, 1);
+   radeon_enc_code_fixed_bits(enc, 
enc->enc_pic.sample_adaptive_offset_enabled_flag, 1);
+   radeon_enc_code_fixed_bits(enc, enc->enc_pic.pcm_enabled_flag, 1);
+
+   radeon_enc_code_ue(enc, 1);
+   radeon_enc_code_ue(enc, 1);
+   radeon_enc_code_ue(enc, 0);
+   radeon_enc_code_ue(enc, 0);
+   radeon_enc_code_fixed_bits(enc, 0x1, 1);
+
+   radeon_enc_code_fixed_bits(enc, 0x0, 1);
+
+   radeon_enc_code_fixed_bits(enc, 0, 1);
+   radeon_enc_code_fixed_bits(enc, 
enc->enc_pic.hevc_spec_misc.strong_intra_smoothing_enabled, 1);
+
+   radeon_enc_code_fixed_bits(enc, 0x0, 1);
+
+   radeon_enc_code_fixed_bits(enc, 0x0, 1);
+
+   radeon_enc_code_fixed_bits(enc, 0x1, 1);
+
+   radeon_enc_byte_align(enc);
+   radeon_enc_flush_headers(enc);
+   *size_in_bytes = (enc->bits_output + 7) / 8;
+   RADEON_ENC_END();
+}
+
 static void radeon_enc_nalu_pps(struct radeon_encoder *enc)
 {
RADEON_ENC_BEGIN(RENCODE_IB_PARAM_DIRECT_OUTPUT_NALU);
@@ -586,6 +666,150 @@ static void radeon_enc_nalu_pps(struct radeon_encoder 
*enc)
RADEON_ENC_END();
 }
 
+static void radeon_enc_nalu_pps_hevc(struct radeon_encoder *enc)
+{
+   RADEON_ENC_BEGIN(RENCODE_IB_PARAM_DIRECT_OUTPUT_NALU);
+   RADEON_ENC_CS(RENCODE_DIRECT_OUTPUT_NALU_TYPE_PPS);
+

[Mesa-dev] [PATCH 07/12] st/va: add HEVC picture desc

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Add HEVC picture desc, and add codec check when creating and destroying
context.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/context.c| 26 ++
 src/gallium/state_trackers/va/va_private.h |  1 +
 2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index 78e1f19..f03b326 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -284,8 +284,18 @@ vlVaCreateContext(VADriverContextP ctx, VAConfigID 
config_id, int picture_width,
context->desc.base.profile = config->profile;
context->desc.base.entry_point = config->entrypoint;
if (config->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
-  context->desc.h264enc.rate_ctrl.rate_ctrl_method = config->rc;
-  context->desc.h264enc.frame_idx = util_hash_table_create(handle_hash, 
handle_compare);
+  switch (u_reduce_video_profile(context->templat.profile)) {
+  case PIPE_VIDEO_FORMAT_MPEG4_AVC:
+ context->desc.h264enc.rate_ctrl.rate_ctrl_method = config->rc;
+ context->desc.h264enc.frame_idx = util_hash_table_create(handle_hash, 
handle_compare);
+ break;
+  case PIPE_VIDEO_FORMAT_HEVC:
+ context->desc.h265enc.rc.rate_ctrl_method = config->rc;
+ context->desc.h265enc.frame_idx = util_hash_table_create(handle_hash, 
handle_compare);
+ break;
+  default:
+ break;
+  }
}
 
mtx_lock(&drv->mutex);
@@ -314,8 +324,16 @@ vlVaDestroyContext(VADriverContextP ctx, VAContextID 
context_id)
 
if (context->decoder) {
   if (context->desc.base.entry_point == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
- if (context->desc.h264enc.frame_idx)
-util_hash_table_destroy (context->desc.h264enc.frame_idx);
+ if (u_reduce_video_profile(context->decoder->profile) ==
+ PIPE_VIDEO_FORMAT_MPEG4_AVC) {
+if (context->desc.h264enc.frame_idx)
+   util_hash_table_destroy (context->desc.h264enc.frame_idx);
+ }
+ if (u_reduce_video_profile(context->decoder->profile) ==
+ PIPE_VIDEO_FORMAT_HEVC) {
+if (context->desc.h265enc.frame_idx)
+   util_hash_table_destroy (context->desc.h265enc.frame_idx);
+ }
   } else {
  if (u_reduce_video_profile(context->decoder->profile) ==
PIPE_VIDEO_FORMAT_MPEG4_AVC) {
diff --git a/src/gallium/state_trackers/va/va_private.h 
b/src/gallium/state_trackers/va/va_private.h
index 520f970..c022feb 100644
--- a/src/gallium/state_trackers/va/va_private.h
+++ b/src/gallium/state_trackers/va/va_private.h
@@ -270,6 +270,7 @@ typedef struct {
   struct pipe_h265_picture_desc h265;
   struct pipe_mjpeg_picture_desc mjpeg;
   struct pipe_h264_enc_picture_desc h264enc;
+  struct pipe_h265_enc_picture_desc h265enc;
} desc;
 
struct {
-- 
2.7.4



[Mesa-dev] [PATCH 08/12] st/va: add entrypoint check for HEVC

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Add entrypoint check for HEVC to differentiate decode and encode jobs.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/context.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index f03b326..f567f54 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -263,16 +263,18 @@ vlVaCreateContext(VADriverContextP ctx, VAConfigID 
config_id, int picture_width,
 
  case PIPE_VIDEO_FORMAT_HEVC:
  context->templat.max_references = num_render_targets;
- context->desc.h265.pps = CALLOC_STRUCT(pipe_h265_pps);
- if (!context->desc.h265.pps) {
-FREE(context);
-return VA_STATUS_ERROR_ALLOCATION_FAILED;
- }
- context->desc.h265.pps->sps = CALLOC_STRUCT(pipe_h265_sps);
- if (!context->desc.h265.pps->sps) {
-FREE(context->desc.h265.pps);
-FREE(context);
-return VA_STATUS_ERROR_ALLOCATION_FAILED;
+ if (config->entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE) {
+context->desc.h265.pps = CALLOC_STRUCT(pipe_h265_pps);
+if (!context->desc.h265.pps) {
+   FREE(context);
+   return VA_STATUS_ERROR_ALLOCATION_FAILED;
+}
+context->desc.h265.pps->sps = CALLOC_STRUCT(pipe_h265_sps);
+if (!context->desc.h265.pps->sps) {
+   FREE(context->desc.h265.pps);
+   FREE(context);
+   return VA_STATUS_ERROR_ALLOCATION_FAILED;
+}
  }
  break;
 
-- 
2.7.4



[Mesa-dev] [PATCH 04/12] radeon/vcn: add ib implementations for HEVC

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Implement required ibs for vcn HEVC encode.
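
For orientation, the size arithmetic used by the session-init and
slice-control packets in this patch can be sketched in isolation. The
struct hevc_session_dims and the functions align_pot and compute_hevc_dims
are hypothetical names introduced for this illustration; the alignments
(64 for width, 16 for height, 64x64 CTBs for the slice size) are taken from
the diff below.

```c
#include <stdint.h>

/* Round value up to a multiple of alignment.
 * alignment must be a power of two. */
uint32_t align_pot(uint32_t value, uint32_t alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical container for the derived sizes. */
struct hevc_session_dims {
   uint32_t aligned_width;
   uint32_t aligned_height;
   uint32_t padding_width;
   uint32_t padding_height;
   uint32_t num_ctbs;   /* CTBs per slice when one slice covers the frame */
};

struct hevc_session_dims compute_hevc_dims(uint32_t width, uint32_t height)
{
   struct hevc_session_dims d;

   d.aligned_width = align_pot(width, 64);
   d.aligned_height = align_pot(height, 16);
   d.padding_width = d.aligned_width - width;
   d.padding_height = d.aligned_height - height;
   /* One slice per frame: count the 64x64 CTBs covering the picture. */
   d.num_ctbs = (align_pot(width, 64) / 64) * (align_pot(height, 64) / 64);
   return d;
}
```

For a 1920x1080 frame this gives an aligned size of 1920x1088, 8 rows of
padding, and 30 * 17 = 510 CTBs in the single slice.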

Signed-off-by: Boyuan Zhang 
---
 src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c | 267 
 1 file changed, 222 insertions(+), 45 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c 
b/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
index 06b8092..a651f7e 100644
--- a/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
+++ b/src/gallium/drivers/radeon/radeon_vcn_enc_1_2.c
@@ -231,6 +231,27 @@ static void radeon_enc_session_init(struct radeon_encoder 
*enc)
RADEON_ENC_END();
 }
 
+static void radeon_enc_session_init_hevc(struct radeon_encoder *enc)
+{
+   enc->enc_pic.session_init.encode_standard = 
RENCODE_ENCODE_STANDARD_HEVC;
+   enc->enc_pic.session_init.aligned_picture_width = 
align(enc->base.width, 64);
+   enc->enc_pic.session_init.aligned_picture_height = 
align(enc->base.height, 16);
+   enc->enc_pic.session_init.padding_width = 
enc->enc_pic.session_init.aligned_picture_width - enc->base.width;
+   enc->enc_pic.session_init.padding_height = 
enc->enc_pic.session_init.aligned_picture_height - enc->base.height;
+   enc->enc_pic.session_init.pre_encode_mode = RENCODE_PREENCODE_MODE_NONE;
+   enc->enc_pic.session_init.pre_encode_chroma_enabled = false;
+
+   RADEON_ENC_BEGIN(RENCODE_IB_PARAM_SESSION_INIT);
+   RADEON_ENC_CS(enc->enc_pic.session_init.encode_standard);
+   RADEON_ENC_CS(enc->enc_pic.session_init.aligned_picture_width);
+   RADEON_ENC_CS(enc->enc_pic.session_init.aligned_picture_height);
+   RADEON_ENC_CS(enc->enc_pic.session_init.padding_width);
+   RADEON_ENC_CS(enc->enc_pic.session_init.padding_height);
+   RADEON_ENC_CS(enc->enc_pic.session_init.pre_encode_mode);
+   RADEON_ENC_CS(enc->enc_pic.session_init.pre_encode_chroma_enabled);
+   RADEON_ENC_END();
+}
+
 static void radeon_enc_layer_control(struct radeon_encoder *enc)
 {
enc->enc_pic.layer_ctrl.max_num_temporal_layers = 1;
@@ -262,6 +283,19 @@ static void radeon_enc_slice_control(struct radeon_encoder 
*enc)
RADEON_ENC_END();
 }
 
+static void radeon_enc_slice_control_hevc(struct radeon_encoder *enc)
+{
+   enc->enc_pic.hevc_slice_ctrl.slice_control_mode = 
RENCODE_HEVC_SLICE_CONTROL_MODE_FIXED_CTBS;
+   enc->enc_pic.hevc_slice_ctrl.fixed_ctbs_per_slice.num_ctbs_per_slice = 
align(enc->base.width, 64) / 64 * align(enc->base.height, 64) / 64;
+   
enc->enc_pic.hevc_slice_ctrl.fixed_ctbs_per_slice.num_ctbs_per_slice_segment = 
enc->enc_pic.hevc_slice_ctrl.fixed_ctbs_per_slice.num_ctbs_per_slice;
+
+   RADEON_ENC_BEGIN(RENCODE_HEVC_IB_PARAM_SLICE_CONTROL);
+   RADEON_ENC_CS(enc->enc_pic.hevc_slice_ctrl.slice_control_mode);
+   
RADEON_ENC_CS(enc->enc_pic.hevc_slice_ctrl.fixed_ctbs_per_slice.num_ctbs_per_slice);
+   
RADEON_ENC_CS(enc->enc_pic.hevc_slice_ctrl.fixed_ctbs_per_slice.num_ctbs_per_slice_segment);
+   RADEON_ENC_END();
+}
+
 static void radeon_enc_spec_misc(struct radeon_encoder *enc)
 {
enc->enc_pic.spec_misc.constrained_intra_pred_flag = 0;
@@ -283,27 +317,68 @@ static void radeon_enc_spec_misc(struct radeon_encoder 
*enc)
RADEON_ENC_END();
 }
 
+static void radeon_enc_spec_misc_hevc(struct radeon_encoder *enc, struct 
pipe_picture_desc *picture)
+{
+   struct pipe_h265_enc_picture_desc *pic = (struct 
pipe_h265_enc_picture_desc *)picture;
+   enc->enc_pic.hevc_spec_misc.log2_min_luma_coding_block_size_minus3 = 
pic->seq.log2_min_luma_coding_block_size_minus3;
+   enc->enc_pic.hevc_spec_misc.amp_disabled = !pic->seq.amp_enabled_flag;
+   enc->enc_pic.hevc_spec_misc.strong_intra_smoothing_enabled = 
pic->seq.strong_intra_smoothing_enabled_flag;
+   enc->enc_pic.hevc_spec_misc.constrained_intra_pred_flag = 
pic->pic.constrained_intra_pred_flag;
+   enc->enc_pic.hevc_spec_misc.cabac_init_flag = 
pic->slice.cabac_init_flag;
+   enc->enc_pic.hevc_spec_misc.half_pel_enabled = 1;
+   enc->enc_pic.hevc_spec_misc.quarter_pel_enabled = 1;
+
+   RADEON_ENC_BEGIN(RENCODE_HEVC_IB_PARAM_SPEC_MISC);
+   
RADEON_ENC_CS(enc->enc_pic.hevc_spec_misc.log2_min_luma_coding_block_size_minus3);
+   RADEON_ENC_CS(enc->enc_pic.hevc_spec_misc.amp_disabled);
+   
RADEON_ENC_CS(enc->enc_pic.hevc_spec_misc.strong_intra_smoothing_enabled);
+   RADEON_ENC_CS(enc->enc_pic.hevc_spec_misc.constrained_intra_pred_flag);
+   RADEON_ENC_CS(enc->enc_pic.hevc_spec_misc.cabac_init_flag);
+   RADEON_ENC_CS(enc->enc_pic.hevc_spec_misc.half_pel_enabled);
+   RADEON_ENC_CS(enc->enc_pic.hevc_spec_misc.quarter_pel_enabled);
+   RADEON_ENC_END();
+}
+
 static void radeon_enc_rc_session_init(struct radeon_encoder *enc, struct 
pipe_picture_desc *picture)
 {
-   struct pipe_h264_enc_picture_desc *pic = (struct 
pipe_h264_enc_picture_desc *)picture;
-   switch(pic->rate_ctrl.rate_ctrl_method) {
-   

[Mesa-dev] [PATCH 01/12] vl: add parameters for HEVC encode

2018-01-25 Thread boyuan.zhang
From: Boyuan Zhang 

Add HEVC encode interface.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/include/pipe/p_video_state.h | 100 +++
 1 file changed, 100 insertions(+)

diff --git a/src/gallium/include/pipe/p_video_state.h 
b/src/gallium/include/pipe/p_video_state.h
index 5a88e6c..26e0acf 100644
--- a/src/gallium/include/pipe/p_video_state.h
+++ b/src/gallium/include/pipe/p_video_state.h
@@ -120,6 +120,15 @@ enum pipe_h264_enc_picture_type
PIPE_H264_ENC_PICTURE_TYPE_SKIP = 0x04
 };
 
+enum pipe_h265_enc_picture_type
+{
+   PIPE_H265_ENC_PICTURE_TYPE_P = 0x00,
+   PIPE_H265_ENC_PICTURE_TYPE_B = 0x01,
+   PIPE_H265_ENC_PICTURE_TYPE_I = 0x02,
+   PIPE_H265_ENC_PICTURE_TYPE_IDR = 0x03,
+   PIPE_H265_ENC_PICTURE_TYPE_SKIP = 0x04
+};
+
 enum pipe_h264_enc_rate_control_method
 {
PIPE_H264_ENC_RATE_CONTROL_METHOD_DISABLE = 0x00,
@@ -129,6 +138,15 @@ enum pipe_h264_enc_rate_control_method
PIPE_H264_ENC_RATE_CONTROL_METHOD_VARIABLE = 0x04
 };
 
+enum pipe_h265_enc_rate_control_method
+{
+   PIPE_H265_ENC_RATE_CONTROL_METHOD_DISABLE = 0x00,
+   PIPE_H265_ENC_RATE_CONTROL_METHOD_CONSTANT_SKIP = 0x01,
+   PIPE_H265_ENC_RATE_CONTROL_METHOD_VARIABLE_SKIP = 0x02,
+   PIPE_H265_ENC_RATE_CONTROL_METHOD_CONSTANT = 0x03,
+   PIPE_H265_ENC_RATE_CONTROL_METHOD_VARIABLE = 0x04
+};
+
 struct pipe_picture_desc
 {
enum pipe_video_profile profile;
@@ -412,6 +430,88 @@ struct pipe_h264_enc_picture_desc
 
 };
 
+struct pipe_h265_enc_seq_param
+{
+   uint8_t  general_profile_idc;
+   uint8_t  general_level_idc;
+   uint8_t  general_tier_flag;
+   uint32_t intra_period;
+   uint16_t pic_width_in_luma_samples;
+   uint16_t pic_height_in_luma_samples;
+   uint32_t chroma_format_idc;
+   uint32_t bit_depth_luma_minus8;
+   uint32_t bit_depth_chroma_minus8;
+   bool strong_intra_smoothing_enabled_flag;
+   bool amp_enabled_flag;
+   bool sample_adaptive_offset_enabled_flag;
+   bool pcm_enabled_flag;
+   bool sps_temporal_mvp_enabled_flag;
+   uint8_t  log2_min_luma_coding_block_size_minus3;
+   uint8_t  log2_diff_max_min_luma_coding_block_size;
+   uint8_t  log2_min_transform_block_size_minus2;
+   uint8_t  log2_diff_max_min_transform_block_size;
+   uint8_t  max_transform_hierarchy_depth_inter;
+   uint8_t  max_transform_hierarchy_depth_intra;
+};
+
+struct pipe_h265_enc_pic_param
+{
+   uint8_t log2_parallel_merge_level_minus2;
+   uint8_t nal_unit_type;
+   bool constrained_intra_pred_flag;
+   bool loop_filter_across_tiles_enabled_flag;
+};
+
+struct pipe_h265_enc_slice_param
+{
+   uint8_t max_num_merge_cand;
+   int8_t slice_cb_qp_offset;
+   int8_t slice_cr_qp_offset;
+   int8_t slice_beta_offset_div2;
+   int8_t slice_tc_offset_div2;
+   bool cabac_init_flag;
+   uint32_t slice_deblocking_filter_disabled_flag;
+   bool slice_loop_filter_across_slices_enabled_flag;
+};
+
+struct pipe_h265_enc_rate_control
+{
+   enum pipe_h265_enc_rate_control_method rate_ctrl_method;
+   unsigned target_bitrate;
+   unsigned peak_bitrate;
+   unsigned frame_rate_num;
+   unsigned frame_rate_den;
+   unsigned quant_i_frames;
+   unsigned vbv_buffer_size;
+   unsigned vbv_buf_lv;
+   unsigned target_bits_picture;
+   unsigned peak_bits_picture_integer;
+   unsigned peak_bits_picture_fraction;
+   unsigned fill_data_enable;
+   unsigned enforce_hrd;
+};
+
+struct pipe_h265_enc_picture_desc
+{
+   struct pipe_picture_desc base;
+
+   struct pipe_h265_enc_seq_param seq;
+   struct pipe_h265_enc_pic_param pic;
+   struct pipe_h265_enc_slice_param slice;
+   struct pipe_h265_enc_rate_control rc;
+
+   enum pipe_h265_enc_picture_type picture_type;
+   unsigned decoded_curr_pic;
+   unsigned reference_frames[16];
+   unsigned frame_num;
+   unsigned pic_order_cnt;
+   unsigned pic_order_cnt_type;
+   unsigned ref_idx_l0;
+   unsigned ref_idx_l1;
+   bool not_referenced;
+   struct util_hash_table *frame_idx;
+};
+
 struct pipe_h265_sps
 {
uint8_t chroma_format_idc;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Reset the register file to VGRF in lower_integer_multiplication

2018-01-25 Thread Jason Ekstrand
On Thu, Jan 25, 2018 at 10:08 AM, Matt Turner  wrote:

> On Fri, Dec 15, 2017 at 5:12 PM, Jason Ekstrand 
> wrote:
> > 18fde36ced4279f2577097a1a7d31b55f2f5f141 changed the way temporary
> > registers were allocated in lower_integer_multiplication so that we
> > allocate regs_written(inst) space and keep the stride of the original
> > destination register.  This was to ensure that any MUL which originally
> > followed the CHV/BXT integer multiply regioning restrictions would
> > continue to follow those restrictions even after lowering.  This works
> > fine except that I forgot to reset the register file to VGRF so, even
> > though they were assigned a number from alloc.allocate(), they had the
> > wrong register file.  This caused some GLES 3.0 CTS tests to start
> > failing on Sandy Bridge due to attempted reads from the MRF:
> >
> > ES3-CTS.functional.shaders.precision.int.highp_mul_fragment.snbm64
> > ES3-CTS.functional.shaders.precision.int.mediump_mul_fragment.snbm64
> > ES3-CTS.functional.shaders.precision.int.lowp_mul_fragment.snbm64
> > ES3-CTS.functional.shaders.precision.uint.highp_mul_fragment.snbm64
> > ES3-CTS.functional.shaders.precision.uint.mediump_mul_fragment.snbm64
> > ES3-CTS.functional.shaders.precision.uint.lowp_mul_fragment.snbm64
> >
> > This commit remedies this problem by, instead of copying inst->dst and
> > overwriting nr, just make a new register and set the region to match
> > inst->dst.
> >
> > Cc: Matt Turner 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103626
> > Fixes: 18fde36ced4279f2577097a1a7d31b55f2f5f141
> > Cc: "17.3" 
>
> Thanks. Sorry this got lost. Looks like it was sent the day I started
> vacation.
>
> Reviewed-by: Matt Turner 
>

Thanks!  I'll give it one more run through Jenkins and land it.
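
The failure mode described in the quoted commit message can be sketched in a
few lines of C. This is only an illustration: the names below (reg_file,
struct reg, temp_by_copy, temp_fresh) are hypothetical stand-ins and not the
i965 fs_reg API. It shows why copying the destination and overwriting only the
register number keeps the wrong register file, while building a new register
and copying just the region does not.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two register files named in the thread. */
enum reg_file { FILE_VGRF, FILE_MRF };

struct reg {
   enum reg_file file; /* which register file the number refers to */
   unsigned nr;        /* register number within that file */
   unsigned stride;    /* region stride that lowering must preserve */
};

/* Buggy pattern: copy the destination and overwrite only nr.
 * The file of the original destination (possibly MRF) leaks through. */
static struct reg temp_by_copy(struct reg dst, unsigned new_nr)
{
   struct reg tmp = dst;
   tmp.nr = new_nr;
   return tmp;
}

/* Fixed pattern: start from a fresh VGRF and copy only the region. */
static struct reg temp_fresh(struct reg dst, unsigned new_nr)
{
   struct reg tmp = { FILE_VGRF, new_nr, dst.stride };
   return tmp;
}
```

With an MRF destination, temp_by_copy() returns a temporary that still claims
to live in the MRF, whereas temp_fresh() returns a VGRF with the same stride.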


Re: [Mesa-dev] [PATCH] vulkan: Update the XML and headers to 1.0.68

2018-01-25 Thread Jason Ekstrand
pushed

On Thu, Jan 25, 2018 at 10:30 AM, Chad Versace 
wrote:

> On Wed 24 Jan 2018, Jason Ekstrand wrote:
> > ---
> >  include/vulkan/vulkan.h| 54 ---
> >  src/vulkan/registry/vk.xml | 91 ++
> +++-
> >  2 files changed, 130 insertions(+), 15 deletions(-)
>
> Acked-by: Chad Versace 
>


Re: [Mesa-dev] [PATCH] mesa: Fix function pointers initialization in status tracker

2018-01-25 Thread Brian Paul

On 01/25/2018 01:09 PM, Eleni Maria Stea wrote:

We assigned the function that gets the device uuid to the GetDriverUuid
function pointer and the function that gets the driver uuid to the
GetDeviceUuid function pointer inside the state tracker. Exchanged the
pointers.
---
  src/mesa/state_tracker/st_context.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index 3ba4847926..d3e7d3fb7f 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -757,8 +757,8 @@ st_init_driver_functions(struct pipe_screen *screen,
 functions->UpdateState = st_invalidate_state;
 functions->QueryMemoryInfo = st_query_memory_info;
 functions->SetBackgroundContext = st_set_background_context;
-   functions->GetDriverUuid = st_get_device_uuid;
-   functions->GetDeviceUuid = st_get_driver_uuid;
+   functions->GetDriverUuid = st_get_driver_uuid;
+   functions->GetDeviceUuid = st_get_device_uuid;
  
 /* GL_ARB_get_program_binary */

 functions->GetProgramBinaryDriverSHA1 = st_get_program_binary_driver_sha1;



Reviewed-by: Brian Paul 

I'll also cc mesa-stable on it and push it soon.

-Brian
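
For readers skimming the thread, here is a reduced C sketch of the kind of
dispatch table the patch touches. The struct and function names are
hypothetical simplifications (the real hooks take a context and a byte
buffer), so treat it as an illustration of the swap rather than the state
tracker code itself.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, reduced dispatch table with the two hooks from the patch. */
struct driver_functions {
   void (*GetDriverUuid)(char *uuid);
   void (*GetDeviceUuid)(char *uuid);
};

/* Toy implementations that just label their output. */
static void get_driver_uuid(char *uuid) { strcpy(uuid, "driver"); }
static void get_device_uuid(char *uuid) { strcpy(uuid, "device"); }

/* Each hook is assigned the implementation with the matching name,
 * which is the pairing the patch restores. */
static void init_functions(struct driver_functions *f)
{
   f->GetDriverUuid = get_driver_uuid;
   f->GetDeviceUuid = get_device_uuid;
}
```

Because both hooks share one signature, the compiler cannot catch a swapped
assignment; only a check on the returned values would.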


[Mesa-dev] [PATCH] mesa: Fix function pointers initialization in status tracker

2018-01-25 Thread Eleni Maria Stea
We assigned the function that gets the device uuid to the GetDriverUuid
function pointer and the function that gets the driver uuid to the
GetDeviceUuid function pointer inside the state tracker. Exchanged the
pointers.
---
 src/mesa/state_tracker/st_context.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index 3ba4847926..d3e7d3fb7f 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -757,8 +757,8 @@ st_init_driver_functions(struct pipe_screen *screen,
functions->UpdateState = st_invalidate_state;
functions->QueryMemoryInfo = st_query_memory_info;
functions->SetBackgroundContext = st_set_background_context;
-   functions->GetDriverUuid = st_get_device_uuid;
-   functions->GetDeviceUuid = st_get_driver_uuid;
+   functions->GetDriverUuid = st_get_driver_uuid;
+   functions->GetDeviceUuid = st_get_device_uuid;
 
/* GL_ARB_get_program_binary */
functions->GetProgramBinaryDriverSHA1 = st_get_program_binary_driver_sha1;
-- 
2.15.0



Re: [Mesa-dev] [PATCH] radv: emit a cache flush before enabling predication

2018-01-25 Thread Dave Airlie
On 26 Jan. 2018 01:10, "Matthew Nicholls" 
wrote:

Otherwise cache flushes could get conditionally disabled while still clearing
the flush_bits, and thus flushes due to application pipeline barriers may never
get executed.


I wonder whether we would be better off not predicating flushes. I added that
as an extra optimization, but it might have been the wrong move.

Dave.


Cc: mesa-sta...@lists.freedesktop.org
---
 src/amd/vulkan/radv_meta_fast_clear.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/amd/vulkan/radv_meta_fast_clear.c
b/src/amd/vulkan/radv_meta_fast_clear.c
index fdeeaeedbf..f4353fd889 100644
--- a/src/amd/vulkan/radv_meta_fast_clear.c
+++ b/src/amd/vulkan/radv_meta_fast_clear.c
@@ -602,6 +602,8 @@ radv_emit_color_decompress(struct radv_cmd_buffer
*cmd_buffer,
}

if (!decompress_dcc && image->surface.dcc_size) {
+   si_emit_cache_flush(cmd_buffer);
+
radv_emit_set_predication_state_from_image(cmd_buffer,
image, true);
cmd_buffer->state.predicating = true;
}
--
2.13.6
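
The hazard discussed above can be modelled with a toy command buffer in C.
Everything here is an assumption made for illustration (the struct, the
predicate_passes parameter and the function names are not radv code): pending
flush bits are cleared on the CPU side whenever a flush is emitted, but a
flush emitted under predication only takes effect if the predicate passes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a command buffer; not the radv data structure. */
struct cmd_buffer {
   bool predicating;          /* packets emitted now are predicated */
   unsigned flush_bits;       /* flushes requested but not yet emitted */
   unsigned executed_flushes; /* flushes that actually took effect */
};

/* Emit pending flushes. Under predication they only take effect when the
 * predicate passes, yet the pending bits are cleared either way. */
static void emit_cache_flush(struct cmd_buffer *cmd, bool predicate_passes)
{
   if (!cmd->predicating || predicate_passes)
      cmd->executed_flushes |= cmd->flush_bits;
   cmd->flush_bits = 0;
}

/* Enable predication, optionally emitting pending flushes first, which is
 * the ordering the patch introduces. */
static void begin_predication(struct cmd_buffer *cmd, bool flush_first)
{
   if (flush_first)
      emit_cache_flush(cmd, true);
   cmd->predicating = true;
}
```

Without the early flush, a barrier's flush bits can be emitted under a failing
predicate and then forgotten; with it, they are executed before predication
starts.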


Re: [Mesa-dev] [PATCH v1 0/7] Implement commont gralloc_handle_t in libdrm

2018-01-25 Thread Rob Herring
On Thu, Jan 25, 2018 at 10:21 AM, Robert Foss  wrote:
> Hey Tomasz,
>
> On 01/24/2018 11:04 AM, Tomasz Figa wrote:
>>
>> Hi Robert,
>>
>> On Wed, Jan 17, 2018 at 2:36 AM, Robert Foss 
>> wrote:
>>>
>>> This series moves {gbm,drm,cros}_gralloc_handle_t struct to libdrm,
>>> since at least 4 implementations exist, and share a lot of contents.
>>> The idea is to keep the common stuff defined in one place, and libdrm
>>> is the common codebase to all of these platforms.
>>>
>>>
>>> Additionally, having this struct defined in libdrm will make it
>>> easier for mesa and gralloc implementations to communicate.
>>>
>>> Robert Foss (7):
>>>android: Move gralloc handle struct to libdrm
>>>android: Add version variable to gralloc_handle_t
>>>android: Mark gralloc_handle_t magic variable as const
>>>android: Remove member name from gralloc_handle_t
>>>android: Change gralloc_handle_t format from Android format to fourcc
>>>android: Change gralloc_handle_t members to be fixed width
>>>android: Add accessor functions for gralloc_handle_t variables
>>
>>
>> Again, thanks for working on this.
>>
>> I looked through the series and it seems to be much different from
>> what I imagined when writing my previous reply. I must have
>> misunderstood your proposal back then.
>
>
> Ah, glad we caught it before v2 then :)
>
>>
>> Generally, current series doesn't solve Chromium OS main concern of
>> locking down the handle struct. Even though accessors are added, they
>> are implemented in libdrm and refer to the exact handle layout as per
>> the handle struct defined by libdrm.
>
>
> So solving the problems of multiple projects is the goal, so reconsidering
> is probably the way forward.
>
>>
>> What I had in my mind, would be creating a secondary struct,
>> consisting only of callbacks, which would be filled in by particular
>> gralloc implementation running in the system with its accessors. This
>> would completely eliminate any dependencies on the handle struct
>> itself from consumers of gralloc buffers.
>
>
> So just to sketch out the solution, it would look something like this?
>
> struct gralloc_handle_t {

This is not a handle...

> uint32_t (*get_fd)(buffer_handle_t handle, uint32_t plane);
> uint64_t (*get_modifier)(buffer_handle_t handle, uint32_t plane);
> uint32_t (*get_offsets)(buffer_handle_t handle, uint32_t plane);
> uint32_t (*get_stride)(buffer_handle_t handle, uint32_t plane);
> ...
> } gralloc_funcs_t;
>
> struct gralloc_handle_t {
> native_handle_t base;
>
> /* api variables */
> const int magic; /* differentiate between allocator impls */
> const int version; /* api version */
>
> gralloc_funcs_t funcs;

This doesn't go in the handle, but rather you would retrieve this
struct I guess with a "perform" call to gralloc AIUI.

Of course, if you have 1 perform call, then why not just a perform op
for each accessor. Does perform even exist in a gralloc 2
implementation?

> ...
> } gralloc_handle_t;
>
> For reasons of backwards compatibility gralloc_handle_t should probably
> contain whatever gbm_gralloc_handle_t contains now too.

Being backwards compatible with upstream (to the extent there is one)
was a goal. You can't really have that and what Tomasz proposes.

> Since we're going to version this struct, we can always drop extraneous
> variables later.
> Since we'll be able to drop variables, we could add more variables to
> support the cros minigbm variables or even the intel minigbm ones.
> This would be a bit high churn, but probably ease adoption.

I've yet to hear technical reasons why the handle struct needs to be different.

> Additionally the gralloc buffer registering mechanism doesn't exist in any
> of the gralloc implementations, so being able to start out with something
> that works on all platforms would be nice.
>
>
> Rob.
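
As a rough sketch of the callback idea debated in this thread: a table of
accessors kept separate from the handle, so consumers never look at the handle
layout. All names below (buffer_handle, struct gralloc_funcs, impl_a_*) are
hypothetical, and how a consumer would obtain the table (for example through a
"perform" call) is left open, as it is in the discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Opaque to consumers; hypothetical stand-in for buffer_handle_t. */
typedef const void *buffer_handle;

/* Accessor table filled in by whichever gralloc implementation is running. */
struct gralloc_funcs {
   uint32_t (*get_stride)(buffer_handle handle, uint32_t plane);
   uint64_t (*get_modifier)(buffer_handle handle, uint32_t plane);
};

/* Private layout of one imaginary implementation. */
struct impl_a_handle {
   uint32_t stride[4];
   uint64_t modifier;
};

static uint32_t impl_a_get_stride(buffer_handle h, uint32_t plane)
{
   const struct impl_a_handle *p = h;
   return plane < 4 ? p->stride[plane] : 0; /* 0 for out-of-range planes */
}

static uint64_t impl_a_get_modifier(buffer_handle h, uint32_t plane)
{
   const struct impl_a_handle *p = h;
   (void)plane; /* this toy layout uses one modifier for all planes */
   return p->modifier;
}

static const struct gralloc_funcs impl_a_funcs = {
   .get_stride = impl_a_get_stride,
   .get_modifier = impl_a_get_modifier,
};

/* A consumer only needs the table and the opaque handle. */
static uint32_t consumer_stride(const struct gralloc_funcs *f, buffer_handle h)
{
   return f->get_stride(h, 0);
}
```

The trade-off raised in the replies still applies: this decouples consumers
from the handle layout, but it is not backwards compatible with code that
reads the existing struct fields directly.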


Re: [Mesa-dev] [Mesa-stable] [PATCH v2] configure.ac: add missing llvm dependencies to .pc files

2018-01-25 Thread Chuck Atkins
Hi Emil,


I'll squash it before pushing
>>
>
> Thanks!  Hopefully once my new account goes through I can push on my own.
>

It looks like my account finally went through so I can just take care of
pushing it myself.

- Chuck


Re: [Mesa-dev] [PATCH] swr/rast: Optimize DumpToFile output size

2018-01-25 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak 

> On Jan 24, 2018, at 2:50 PM, George Kyriazis  
> wrote:
> 
> Modify DumpToFile to only dump the function, not the entire module.
> Reduces file sizes and speeds up the dumping.
> ---
> src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp 
> b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
> index 675438b..7105766 100644
> --- a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
> +++ b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
> @@ -421,8 +421,7 @@ void JitManager::DumpToFile(Function *f, const char 
> *fileName)
> sprintf(fName, "%s.%s.ll", funcName, fileName);
> #endif
> raw_fd_ostream fd(fName, EC, llvm::sys::fs::F_None);
> -Module* pModule = f->getParent();
> -pModule->print(fd, nullptr);
> +f->print(fd, nullptr);
> 
> #if defined(_WIN32)
> sprintf(fName, "%s\\cfg.%s.%s.dot", outDir.c_str(), funcName, 
> fileName);
> -- 
> 2.7.4
> 


Re: [Mesa-dev] [PATCH] svga: s/Bool/SVGA3dBool/ in SVGA3dDevCapResult

2018-01-25 Thread Charmaine Lee

Reviewed-by: Charmaine Lee 


From: Brian Paul 
Sent: Thursday, January 25, 2018 10:38:51 AM
To: mesa-dev@lists.freedesktop.org
Cc: Charmaine Lee; Neha Bhende
Subject: [PATCH] svga: s/Bool/SVGA3dBool/ in SVGA3dDevCapResult

And fix whitespace.  To sync up with in-house code.
---
 src/gallium/drivers/svga/include/svga3d_devcaps.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/svga/include/svga3d_devcaps.h 
b/src/gallium/drivers/svga/include/svga3d_devcaps.h
index ade210b..4e2f6bf 100644
--- a/src/gallium/drivers/svga/include/svga3d_devcaps.h
+++ b/src/gallium/drivers/svga/include/svga3d_devcaps.h
@@ -448,10 +448,10 @@ typedef enum {
SVGADX_DXFMT_MULTISAMPLE_8 )

 typedef union {
-   Bool   b;
+   SVGA3dBool b;
uint32 u;
-   int32  i;
-   float  f;
+   int32 i;
+   float f;
 } SVGA3dDevCapResult;

 #endif /* _SVGA3D_DEVCAPS_H_ */
--
2.7.4



Re: [Mesa-dev] [PATCH 04/10] swrast: remove non-applicable GLX_SWAP_COPY_OML comment

2018-01-25 Thread Emil Velikov
On 12 December 2017 at 00:19, Ian Romanick  wrote:
> On 12/07/2017 09:07 AM, Emil Velikov wrote:
>> From: Emil Velikov 
>>
>> Noticed while skimming for GLX_ instances i the dri codebase.
>
> in
>
> With that fixed, this patch is also
>
> Reviewed-by: Ian Romanick 
>
Thank you Ian, tweaked and pushed the first four patches.
If anyone is feeling a bit bored and wants to skim through the rest
[1] that would be appreciated.

-Emil
[1] https://patchwork.freedesktop.org/series/35051/


[Mesa-dev] [PATCH] svga: s/Bool/SVGA3dBool/ in SVGA3dDevCapResult

2018-01-25 Thread Brian Paul
And fix whitespace.  To sync up with in-house code.
---
 src/gallium/drivers/svga/include/svga3d_devcaps.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/svga/include/svga3d_devcaps.h 
b/src/gallium/drivers/svga/include/svga3d_devcaps.h
index ade210b..4e2f6bf 100644
--- a/src/gallium/drivers/svga/include/svga3d_devcaps.h
+++ b/src/gallium/drivers/svga/include/svga3d_devcaps.h
@@ -448,10 +448,10 @@ typedef enum {
SVGADX_DXFMT_MULTISAMPLE_8 )
 
 typedef union {
-   Bool   b;
+   SVGA3dBool b;
uint32 u;
-   int32  i;
-   float  f;
+   int32 i;
+   float f;
 } SVGA3dDevCapResult;
 
 #endif /* _SVGA3D_DEVCAPS_H_ */
-- 
2.7.4



Re: [Mesa-dev] [PATCH] vulkan: Update the XML and headers to 1.0.68

2018-01-25 Thread Chad Versace
On Wed 24 Jan 2018, Jason Ekstrand wrote:
> ---
>  include/vulkan/vulkan.h| 54 ---
>  src/vulkan/registry/vk.xml | 91 
> +-
>  2 files changed, 130 insertions(+), 15 deletions(-)

Acked-by: Chad Versace 


[Mesa-dev] [Bug 104141] include/c11/threads_posix.h:96: undefined reference to `pthread_once'

2018-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=104141

Vinson Lee  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 1/5] swr/rast: Support USE_SIMD16_FRONTEND=0 for EarlyRast

2018-01-25 Thread Cherniak, Bruce
Series Reviewed-by: Bruce Cherniak  

> On Jan 24, 2018, at 9:31 AM, George Kyriazis  
> wrote:
> 
> Early Rasterization did not initially work with USE_SIMD16_FRONTEND=0.
> Fix it so it works there, too.  Please note that the default setting
> is USE_SIMD16_FRONTEND=1.
> ---
> .../drivers/swr/rasterizer/core/frontend.cpp   | 66 +++---
> 1 file changed, 33 insertions(+), 33 deletions(-)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp 
> b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp
> index 9600f78..66c4b74 100644
> --- a/src/gallium/drivers/swr/rasterizer/core/frontend.cpp
> +++ b/src/gallium/drivers/swr/rasterizer/core/frontend.cpp
> @@ -1032,31 +1032,31 @@ static void GeometryShaderStage(
> simdscalari vPrimId = 
> _simd_set1_epi32(pPrimitiveId[inputPrim]);
> 
> // Gather data from the SVG if provided.
> -simdscalari vViewportIdx = 
> SIMD16::setzero_si();
> -simdscalari vRtIdx = SIMD16::setzero_si();
> -SIMD8::Vec4 svgAttrib[4];
> +simdscalari vViewportIdx = 
> SIMD::setzero_si();
> +simdscalari vRtIdx = SIMD::setzero_si();
> +SIMD::Vec4 svgAttrib[4];
> 
> if (state.backendState.readViewportArrayIndex 
> || state.backendState.readRenderTargetArrayIndex)
> {
> -tessPa.Assemble(VERTEX_SGV_SLOT, 
> svgAttrib);
> +gsPa.Assemble(VERTEX_SGV_SLOT, 
> svgAttrib);
> }
> 
> 
> if (state.backendState.readViewportArrayIndex)
> {
> -vViewportIdx = 
> SIMD8::castps_si(svgAttrib[0][VERTEX_SGV_VAI_COMP]);
> +vViewportIdx = 
> SIMD::castps_si(svgAttrib[0][VERTEX_SGV_VAI_COMP]);
> 
> // OOB VPAI indices => forced to zero.
> -vViewportIdx = 
> SIMD8::max_epi32(vViewportIdx, SIMD8::setzero_si());
> -simd16scalari vNumViewports = 
> SIMD8::set1_epi32(KNOB_NUM_VIEWPORTS_SCISSORS);
> -simd16scalari vClearMask = 
> SIMD8::cmplt_epi32(vViewportIdx, vNumViewports);
> -vViewportIdx = SIMD8::and_si(vClearMask, 
> vViewportIdx);
> -tessPa.viewportArrayActive = true;
> +vViewportIdx = 
> SIMD::max_epi32(vViewportIdx, SIMD::setzero_si());
> +simdscalari vNumViewports = 
> SIMD::set1_epi32(KNOB_NUM_VIEWPORTS_SCISSORS);
> +simdscalari vClearMask = 
> SIMD::cmplt_epi32(vViewportIdx, vNumViewports);
> +vViewportIdx = SIMD::and_si(vClearMask, 
> vViewportIdx);
> +gsPa.viewportArrayActive = true;
> }
> if 
> (state.backendState.readRenderTargetArrayIndex)
> {
> -vRtIdx = 
> SIMD8::castps_si(svgAttrib[0][VERTEX_SGV_RTAI_COMP]);
> -tessPa.rtArrayActive = true;
> +vRtIdx = 
> SIMD::castps_si(svgAttrib[0][VERTEX_SGV_RTAI_COMP]);
> +gsPa.rtArrayActive = true;
> }
> 
> pfnClipFunc(pDC, gsPa, workerId, attrib, 
> GenMask(gsPa.NumPrims()), vPrimId, vViewportIdx, vRtIdx);
> @@ -1437,9 +1437,9 @@ static void TessellationStages(
> }
> #else
> // Gather data from the SVG if provided.
> -simdscalari vViewportIdx = SIMD16::setzero_si();
> -simdscalari vRtIdx = SIMD16::setzero_si();
> -SIMD8::Vec4 svgAttrib[4];
> +simdscalari vViewportIdx = SIMD::setzero_si();
> +simdscalari vRtIdx = SIMD::setzero_si();
> +SIMD::Vec4 svgAttrib[4];
> 
> if (state.backendState.readViewportArrayIndex || 
> state.backendState.readRenderTargetArrayIndex)
> {
> @@ -1448,18 +1448,18 @@ static void TessellationStages(
> 
> if (state.backendState.readViewportArrayIndex)
> {
> -vViewportIdx = 
> SIMD8::castps_si(svgAttrib[0][VERTEX_SGV_VAI_COMP]);
> +vViewportIdx = 
> SIMD::castps_si(svgAttrib[0][VERTEX_SGV_VAI_COMP]);
> 
> // OOB VPAI 

[Mesa-dev] [Bug 104710] [swrast] piglit draw-batch regression

2018-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=104710

Vinson Lee  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #2 from Vinson Lee  ---
commit 365a48abddcabf6596c2e34a784d91c8ab929918
Author: Brian Paul 
Date:   Tue Jan 23 10:48:51 2018 -0700

vbo: fix incorrect min/max_index values in display list draw call

This fixes another regression from commit 8e4efdc895ea ("vbo: optimize
some display list drawing").  The problem was the min_index, max_index
values passed to the vbo drawing function were not computed to compensate
for the biased prim::start values.

https://bugs.freedesktop.org/show_bug.cgi?id=104746
https://bugs.freedesktop.org/show_bug.cgi?id=104742
https://bugs.freedesktop.org/show_bug.cgi?id=104690
Tested-by: Clayton Craft 
Fixes: 8e4efdc895ea ("vbo: optimize some display list drawing")
Reviewed-by: Emil Velikov 

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [PATCH v3 2/8] compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics

2018-01-25 Thread Antia Puentes
This VS system value will contain the value passed as <basevertex>
for indexed draw calls or the value passed as <first> for non-indexed
draw calls. It can be used to calculate the gl_VertexID as
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX.

From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays":

-  Page 352:
"The index of any element transferred to the GL by DrawArraysOneInstance is
referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID.
The vertex ID of the ith element transferred is first + i."

  - Page 355:
"The index of any element transferred to the GL by DrawElementsOneInstance is
referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID.
The vertex ID of the ith element transferred is the sum of basevertex and the
value stored in the currently bound element array buffer at offset indices + i."

Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX, but this
will have to change when the value of gl_BaseVertex is fixed. Its value is
currently broken for non-indexed draw calls: it must be zero, but we are
setting it to <first>.
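
The arithmetic in the spec quotes above can be written out as a small C
sketch. The helper names are hypothetical, and the sketch assumes that the
zero-based vertex ID is i for non-indexed draws and the fetched index for
indexed draws; it is meant only to show that the zero-based ID plus the first
vertex reproduces gl_VertexID in both cases.

```c
#include <assert.h>
#include <stdint.h>

/* Non-indexed draw: the vertex ID of the ith element is first + i. */
static int32_t vertex_id_arrays(int32_t first, int32_t i)
{
   return first + i;
}

/* Indexed draw: the vertex ID of the ith element is basevertex plus the
 * value stored in the element array at position i. */
static int32_t vertex_id_elements(int32_t basevertex,
                                  const uint32_t *indices, int32_t i)
{
   return basevertex + (int32_t)indices[i];
}

/* The decomposition from the commit message:
 * gl_VertexID = VERTEX_ID_ZERO_BASE + FIRST_VERTEX. */
static int32_t vertex_id_from_sysvals(int32_t vertex_id_zero_base,
                                      int32_t first_vertex)
{
   return vertex_id_zero_base + first_vertex;
}
```

For a non-indexed draw with first = 10, element 2 gets ID 12 either way; for
an indexed draw with basevertex = 100 and index value 7, both routes give 107.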

v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of
SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth).

Reviewed-by: Neil Roberts 
Reviewed-by: Kenneth Graunke 
---
 src/compiler/nir/nir.c |  4 
 src/compiler/nir/nir_gather_info.c |  1 +
 src/compiler/nir/nir_intrinsics.h  |  1 +
 src/compiler/shader_enums.c|  1 +
 src/compiler/shader_enums.h| 14 ++
 5 files changed, 21 insertions(+)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index bdd8960403c..e69c2accbbf 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -1919,6 +1919,8 @@ nir_intrinsic_from_system_value(gl_system_value val)
   return nir_intrinsic_load_base_instance;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
   return nir_intrinsic_load_vertex_id_zero_base;
+   case SYSTEM_VALUE_FIRST_VERTEX:
+  return nir_intrinsic_load_first_vertex;
case SYSTEM_VALUE_BASE_VERTEX:
   return nir_intrinsic_load_base_vertex;
case SYSTEM_VALUE_INVOCATION_ID:
@@ -1990,6 +1992,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
   return SYSTEM_VALUE_BASE_INSTANCE;
case nir_intrinsic_load_vertex_id_zero_base:
   return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
+   case nir_intrinsic_load_first_vertex:
+  return SYSTEM_VALUE_FIRST_VERTEX;
case nir_intrinsic_load_base_vertex:
   return SYSTEM_VALUE_BASE_VERTEX;
case nir_intrinsic_load_invocation_id:
diff --git a/src/compiler/nir/nir_gather_info.c 
b/src/compiler/nir/nir_gather_info.c
index 946939657ec..555ae77b1d3 100644
--- a/src/compiler/nir/nir_gather_info.c
+++ b/src/compiler/nir/nir_gather_info.c
@@ -247,6 +247,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, 
nir_shader *shader)
case nir_intrinsic_load_vertex_id:
case nir_intrinsic_load_vertex_id_zero_base:
case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
case nir_intrinsic_load_base_instance:
case nir_intrinsic_load_instance_id:
case nir_intrinsic_load_sample_id:
diff --git a/src/compiler/nir/nir_intrinsics.h 
b/src/compiler/nir/nir_intrinsics.h
index ede29277876..7d3421f0e30 100644
--- a/src/compiler/nir/nir_intrinsics.h
+++ b/src/compiler/nir/nir_intrinsics.h
@@ -333,6 +333,7 @@ SYSTEM_VALUE(frag_coord, 4, 0, xx, xx, xx)
 SYSTEM_VALUE(front_face, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(vertex_id, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(vertex_id_zero_base, 1, 0, xx, xx, xx)
+SYSTEM_VALUE(first_vertex, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(base_vertex, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(instance_id, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(base_instance, 1, 0, xx, xx, xx)
diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c
index 2179c475abd..5e123f29f37 100644
--- a/src/compiler/shader_enums.c
+++ b/src/compiler/shader_enums.c
@@ -214,6 +214,7 @@ gl_system_value_name(gl_system_value sysval)
  ENUM(SYSTEM_VALUE_INSTANCE_ID),
  ENUM(SYSTEM_VALUE_INSTANCE_INDEX),
  ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE),
+ ENUM(SYSTEM_VALUE_FIRST_VERTEX),
  ENUM(SYSTEM_VALUE_BASE_VERTEX),
  ENUM(SYSTEM_VALUE_BASE_INSTANCE),
  ENUM(SYSTEM_VALUE_DRAW_ID),
diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h
index ffe551ab20f..9f71194c146 100644
--- a/src/compiler/shader_enums.h
+++ b/src/compiler/shader_enums.h
@@ -472,6 +472,20 @@ typedef enum
 */
SYSTEM_VALUE_BASE_VERTEX,
 
+   /**
+* Depending on the type of the draw call (indexed or non-indexed),
+* is the value of \c basevertex passed to \c glDrawElementsBaseVertex and
+* similar, or is the value of \c first passed to \c glDrawArrays and
+* similar.
+*
+* \note
+* It can be used to calculate the \c SYSTEM_VALUE_VERTEX_ID as
+* \c SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus \c SYSTEM_VALUE_FIRST_VERTEX.
+*
+* \sa 

[Mesa-dev] [PATCH v3 8/8] i965: gl_BaseVertex must be zero for non-indexed draw calls

2018-01-25 Thread Antia Puentes
We keep 'firstvertex' as it is and move gl_BaseVertex to the drawID vertex
element. The previous Vertex Elements order was:

  * VE 1: 
  * VE 2: 

and now it is:

  * VE 1: 
  * VE 2: 

Moving BaseVertex while keeping VE 1 as it is lets the vertex buffer associated
with VE 1 keep pointing to the indirect buffer for indirect draw calls.

From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification:

  "gl_BaseVertex holds the integer value passed to the baseVertex parameter to
  the command that resulted in the current shader invocation. In the case where
  the command has no baseVertex parameter, the value of gl_BaseVertex is zero."
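
A short C sketch of the distinction this patch relies on, under the assumption
that a draw call can be summarised by the hypothetical struct below (it is not
an i965 type): gl_BaseVertex is zero when the command has no baseVertex
parameter, while the first-vertex value still carries first for non-indexed
draws.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of a draw call's relevant parameters. */
struct draw_params {
   bool indexed;       /* true for DrawElements*-style commands */
   int32_t first;      /* used by non-indexed draws */
   int32_t basevertex; /* used by indexed draws */
};

/* gl_BaseVertex: zero when the command has no baseVertex parameter. */
static int32_t gl_base_vertex(const struct draw_params *d)
{
   return d->indexed ? d->basevertex : 0;
}

/* First-vertex value: basevertex for indexed draws, first otherwise. */
static int32_t first_vertex(const struct draw_params *d)
{
   return d->indexed ? d->basevertex : d->first;
}
```

The two values agree for indexed draws and differ only for non-indexed draws
with a non-zero first.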

Fixes CTS tests:

  * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters
  * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters
  * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters
  * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters
  * KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678
---
 src/intel/compiler/brw_nir.c  | 14 +
 src/intel/compiler/brw_vec4.cpp   | 14 +
 src/mesa/drivers/dri/i965/brw_context.h   | 32 ++-
 src/mesa/drivers/dri/i965/brw_draw.c  | 45 ++-
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 24 --
 src/mesa/drivers/dri/i965/genX_state_upload.c | 38 +++---
 6 files changed, 105 insertions(+), 62 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 34b1e44adf0..c10fa73f4fc 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -238,8 +238,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 */
const bool has_sgvs =
   nir->info.system_values_read &
-  (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-   BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
+  (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
@@ -279,7 +278,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 
nir_intrinsic_set_base(load, num_inputs);
switch (intrin->intrinsic) {
-   case nir_intrinsic_load_base_vertex:
case nir_intrinsic_load_first_vertex:
   nir_intrinsic_set_component(load, 0);
   break;
@@ -293,11 +291,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
   nir_intrinsic_set_component(load, 3);
   break;
case nir_intrinsic_load_draw_id:
-  /* gl_DrawID is stored right after gl_VertexID and friends
-   * if any of them exist.
+   case nir_intrinsic_load_base_vertex:
+  /* gl_DrawID and gl_BaseVertex are stored right after
+ gl_VertexID and friends if any of them exist.
*/
   nir_intrinsic_set_base(load, num_inputs + has_sgvs);
-  nir_intrinsic_set_component(load, 0);
+  if (intrin->intrinsic == nir_intrinsic_load_draw_id)
+ nir_intrinsic_set_component(load, 0);
+  else
+ nir_intrinsic_set_component(load, 1);
   break;
default:
   unreachable("Invalid system value intrinsic");
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 06c79630119..3b4b3c01b57 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2787,14 +2787,19 @@ brw_compile_vs(const struct brw_compiler *compiler, 
void *log_data,
 * incoming vertex attribute.  So, add an extra slot.
 */
if (shader->info.system_values_read &
-   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
+   (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
 BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
 BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
 BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
   nr_attribute_slots++;
}
 
+   if (shader->info.system_values_read &
+   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID))) {
+  nr_attribute_slots++;
+   }
+
if (shader->info.system_values_read &
BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
   prog_data->uses_basevertex = true;
@@ -2815,12 +2820,9 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))
   prog_data->uses_instanceid = true;
 
-   /* gl_DrawID has its very own vec4 */
if (shader->info.system_values_read &
-   

[Mesa-dev] [PATCH v3 3/8] intel/compiler: Add a uses_firstvertex flag

2018-01-25 Thread Antia Puentes
From: Neil Roberts 

Reviewed-by: Kenneth Graunke 
---
 src/intel/compiler/brw_compiler.h | 1 +
 src/intel/compiler/brw_vec4.cpp   | 4 
 2 files changed, 5 insertions(+)

diff --git a/src/intel/compiler/brw_compiler.h 
b/src/intel/compiler/brw_compiler.h
index b1086bbcee5..0afe5757945 100644
--- a/src/intel/compiler/brw_compiler.h
+++ b/src/intel/compiler/brw_compiler.h
@@ -966,6 +966,7 @@ struct brw_vs_prog_data {
bool uses_vertexid;
bool uses_instanceid;
bool uses_basevertex;
+   bool uses_firstvertex;
bool uses_baseinstance;
bool uses_drawid;
 };
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index ad6d8f9d6bc..36e17d77d47 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2798,6 +2798,10 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
   prog_data->uses_basevertex = true;
 
+   if (shader->info.system_values_read &
+   BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX))
+  prog_data->uses_firstvertex = true;
+
if (shader->info.system_values_read &
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE))
   prog_data->uses_baseinstance = true;
-- 
2.14.1



[Mesa-dev] [PATCH v3 5/8] spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX

2018-01-25 Thread Antia Puentes
From: Neil Roberts 

The base vertex in Vulkan is different from GL in that for non-indexed
primitives the value is taken from the firstVertex parameter instead
of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX
instead of BASE_VERTEX.
---
 src/compiler/spirv/vtn_variables.c |  2 +-
 src/intel/vulkan/genX_cmd_buffer.c | 16 
 src/intel/vulkan/genX_pipeline.c   |  2 ++
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index eb306d0c4a8..3e5686af1d9 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1279,7 +1279,7 @@ vtn_get_builtin_location(struct vtn_builder *b,
   set_mode_system_value(b, mode);
   break;
case SpvBuiltInBaseVertex:
-  *location = SYSTEM_VALUE_BASE_VERTEX;
+  *location = SYSTEM_VALUE_FIRST_VERTEX;
   set_mode_system_value(b, mode);
   break;
case SpvBuiltInBaseInstance:
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index c23a54fb7b9..9fc281bf4eb 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2223,7 +2223,9 @@ void genX(CmdDraw)(
 
genX(cmd_buffer_flush_state)(cmd_buffer);
 
-   if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+   if (vs_prog_data->uses_firstvertex ||
+   vs_prog_data->uses_basevertex ||
+   vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance);
if (vs_prog_data->uses_drawid)
   emit_draw_index(cmd_buffer, 0);
@@ -2261,7 +2263,9 @@ void genX(CmdDrawIndexed)(
 
genX(cmd_buffer_flush_state)(cmd_buffer);
 
-   if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+   if (vs_prog_data->uses_firstvertex ||
+   vs_prog_data->uses_basevertex ||
+   vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, vertexOffset, firstInstance);
if (vs_prog_data->uses_drawid)
   emit_draw_index(cmd_buffer, 0);
@@ -2417,7 +2421,9 @@ void genX(CmdDrawIndirect)(
   struct anv_bo *bo = buffer->bo;
   uint32_t bo_offset = buffer->offset + offset;
 
-  if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+  if (vs_prog_data->uses_firstvertex ||
+  vs_prog_data->uses_basevertex ||
+  vs_prog_data->uses_baseinstance)
  emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 8);
   if (vs_prog_data->uses_drawid)
  emit_draw_index(cmd_buffer, i);
@@ -2456,7 +2462,9 @@ void genX(CmdDrawIndexedIndirect)(
   uint32_t bo_offset = buffer->offset + offset;
 
   /* TODO: We need to stomp base vertex to 0 somehow */
-  if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+  if (vs_prog_data->uses_firstvertex ||
+  vs_prog_data->uses_basevertex ||
+  vs_prog_data->uses_baseinstance)
  emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 12);
   if (vs_prog_data->uses_drawid)
  emit_draw_index(cmd_buffer, i);
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 82fdf206a95..5f4cf58b83d 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -98,6 +98,7 @@ emit_vertex_input(struct anv_pipeline *pipeline,
const bool needs_svgs_elem = vs_prog_data->uses_vertexid ||
 vs_prog_data->uses_instanceid ||
 vs_prog_data->uses_basevertex ||
+vs_prog_data->uses_firstvertex ||
 vs_prog_data->uses_baseinstance;
 
uint32_t elem_count = __builtin_popcount(elements) -
@@ -178,6 +179,7 @@ emit_vertex_input(struct anv_pipeline *pipeline,
* well.  Just do all or nothing.
*/
   uint32_t base_ctrl = (vs_prog_data->uses_basevertex ||
+vs_prog_data->uses_firstvertex ||
 vs_prog_data->uses_baseinstance) ?
VFCOMP_STORE_SRC : VFCOMP_STORE_0;
 
-- 
2.14.1
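
For readers following the series, the distinction this patch relies on can be sketched as below. This is only an illustration under stated assumptions: `struct draw_call` and both helper names are hypothetical and do not exist in Mesa. It models the GL rule that the base vertex is zero for non-indexed draws, and the first-vertex rule the commit message describes for Vulkan's BaseVertex built-in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one draw call; not a Mesa type. */
struct draw_call {
   bool indexed;        /* true for indexed draws (DrawElements*, CmdDrawIndexed) */
   int32_t first;       /* 'first' / firstVertex; used by non-indexed draws */
   int32_t base_vertex; /* basevertex / vertexOffset; used by indexed draws */
};

/* GL-style base vertex: zero when the draw call is not indexed. */
static int32_t
base_vertex_value(const struct draw_call *draw)
{
   return draw->indexed ? draw->base_vertex : 0;
}

/* First vertex: same as base vertex for indexed draws, 'first' otherwise. */
static int32_t
first_vertex_value(const struct draw_call *draw)
{
   return draw->indexed ? draw->base_vertex : draw->first;
}
```

The two helpers only disagree for non-indexed draws with a non-zero 'first', which is the case the SPIR-V change is meant to cover.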



[Mesa-dev] [PATCH v3 6/8] i965: Don't request GLSL IR lowering of gl_VertexID

2018-01-25 Thread Antia Puentes
From: Ian Romanick 

Let the lowering in NIR handle it instead.

This hurts one shader that occurs twice in shader-db (SynMark GSCloth)
on IVB and HSW.  No other shaders or platforms were affected.

total cycles in shared programs: 253438422 -> 253438426 (0.00%)
cycles in affected programs: 412 -> 416 (0.97%)
helped: 0
HURT: 2

Signed-off-by: Ian Romanick 
Reviewed-by: Antia Puentes 
---
 src/mesa/drivers/dri/i965/brw_context.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 9ed8bc64bb3..7775468f98a 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -590,7 +590,6 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.QuadsFollowProvokingVertexConvention = false;
 
ctx->Const.NativeIntegers = true;
-   ctx->Const.VertexID_is_zero_based = true;
 
/* Regarding the CMP instruction, the Ivybridge PRM says:
 *
-- 
2.14.1



[Mesa-dev] [PATCH v3 7/8] nir: Offset vertex_id by first_vertex instead of base_vertex

2018-01-25 Thread Antia Puentes
From: Neil Roberts 

base_vertex will be zero for non-indexed calls and in that case we
need vertex_id to be offset by the ‘first’ parameter instead. That is
what we get with first_vertex. This is true for both GL and Vulkan.

The freedreno driver is also setting vertex_id_zero_based on
nir_options. In order to avoid breakage this patch switches the
relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can
retain the same behavior.

v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from
SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth).
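
As a rough illustration of why the offset matters (a sketch only; `rebuild_vertex_ids` is a hypothetical helper, not NIR or driver code): for a non-indexed draw with first = 10, first_vertex is 10 while base_vertex is 0, so adding base_vertex to the zero-based IDs would yield 0, 1, 2 instead of 10, 11, 12.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: rebuild gl_VertexID values from zero-based IDs.
 * 'offset' stands in for whichever system value the lowering adds.
 */
static void
rebuild_vertex_ids(int32_t *out, unsigned count, int32_t offset)
{
   for (unsigned i = 0; i < count; i++)
      out[i] = (int32_t)i + offset; /* vertex_id_zero_base + offset */
}
```

Passing first_vertex as the offset gives the expected IDs for both indexed and non-indexed draws, since the two values coincide in the indexed case.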

Cc: Rob Clark 
Cc: Marek Olšák 
---
 src/compiler/nir/nir_lower_system_values.c   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_emit.c| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_emit.c| 2 +-
 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 5 ++---
 src/intel/vulkan/genX_cmd_buffer.c   | 4 
 src/intel/vulkan/genX_pipeline.c | 4 +---
 6 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/src/compiler/nir/nir_lower_system_values.c 
b/src/compiler/nir/nir_lower_system_values.c
index 3594f4ae5ce..6f4fb8233ab 100644
--- a/src/compiler/nir/nir_lower_system_values.c
+++ b/src/compiler/nir/nir_lower_system_values.c
@@ -105,7 +105,7 @@ convert_block(nir_block *block, nir_builder *b)
  if (b->shader->options->vertex_id_zero_based) {
 sysval = nir_iadd(b,
   nir_load_vertex_id_zero_base(b),
-  nir_load_base_vertex(b));
+  nir_load_first_vertex(b));
  } else {
 sysval = nir_load_vertex_id(b);
  }
diff --git a/src/gallium/drivers/freedreno/a3xx/fd3_emit.c 
b/src/gallium/drivers/freedreno/a3xx/fd3_emit.c
index b9e1af00e2c..3419ba86d46 100644
--- a/src/gallium/drivers/freedreno/a3xx/fd3_emit.c
+++ b/src/gallium/drivers/freedreno/a3xx/fd3_emit.c
@@ -374,7 +374,7 @@ fd3_emit_vertex_bufs(struct fd_ringbuffer *ring, struct 
fd3_emit *emit)
continue;
if (vp->inputs[i].sysval) {
switch(vp->inputs[i].slot) {
-   case SYSTEM_VALUE_BASE_VERTEX:
+   case SYSTEM_VALUE_FIRST_VERTEX:
/* handled elsewhere */
break;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
diff --git a/src/gallium/drivers/freedreno/a4xx/fd4_emit.c 
b/src/gallium/drivers/freedreno/a4xx/fd4_emit.c
index 5fec2b6b08a..42268ceea71 100644
--- a/src/gallium/drivers/freedreno/a4xx/fd4_emit.c
+++ b/src/gallium/drivers/freedreno/a4xx/fd4_emit.c
@@ -378,7 +378,7 @@ fd4_emit_vertex_bufs(struct fd_ringbuffer *ring, struct 
fd4_emit *emit)
continue;
if (vp->inputs[i].sysval) {
switch(vp->inputs[i].slot) {
-   case SYSTEM_VALUE_BASE_VERTEX:
+   case SYSTEM_VALUE_FIRST_VERTEX:
/* handled elsewhere */
break;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index 15a3aa4c802..d3a8dbec14e 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -2073,11 +2073,10 @@ emit_intrinsic(struct ir3_context *ctx, 
nir_intrinsic_instr *intr)
ctx->ir->outputs[n] = src[i];
}
break;
-   case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
if (!ctx->basevertex) {
ctx->basevertex = create_driver_param(ctx, 
IR3_DP_VTXID_BASE);
-   add_sysval_input(ctx, SYSTEM_VALUE_BASE_VERTEX,
-   ctx->basevertex);
+   add_sysval_input(ctx, SYSTEM_VALUE_FIRST_VERTEX, 
ctx->basevertex);
}
dst[0] = ctx->basevertex;
break;
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 9fc281bf4eb..d7dc14f387b 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2224,7 +2224,6 @@ void genX(CmdDraw)(
genX(cmd_buffer_flush_state)(cmd_buffer);
 
if (vs_prog_data->uses_firstvertex ||
-   vs_prog_data->uses_basevertex ||
vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance);
if (vs_prog_data->uses_drawid)
@@ -2264,7 +2263,6 @@ void genX(CmdDrawIndexed)(
genX(cmd_buffer_flush_state)(cmd_buffer);
 
if (vs_prog_data->uses_firstvertex ||
-   vs_prog_data->uses_basevertex ||
vs_prog_data->uses_baseinstance)
   

[Mesa-dev] [PATCH v3 1/8] i965: allocate a SGVS element when VertexID or InstanceID are read

2018-01-25 Thread Antia Puentes
From: Iago Toral Quiroga 

Although on gen8+ platforms we can in theory use 3DSTATE_VF_SGVS
to put these beyond the last vertex element, it seems that we still
need to allocate the SGVS element; otherwise we have observed cases
where we end up reading garbage. Specifically, the CTS test mentioned
below was flaky with a fail rate of ~1% on some gen9+ platforms caused
by reading garbage for the gl_InstanceID value. The flakiness goes
away as soon as we start allocating the SGVS element.

v2:
  - Do this for gen8+, not just gen9+, and pull the boolean
outside the #if block (Jason)

Fixes flaky test:
KHR-GL45.vertex_attrib_64bit.limits_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104335

Reviewed-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 17 ++---
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 50ac5bc59ff..d0a980f9730 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -486,26 +486,13 @@ genX(emit_vertices)(struct brw_context *brw)
} else {
   brw_batch_emit(brw, GENX(3DSTATE_VF_SGVS), vfs);
}
+#endif
 
-   /* Normally we don't need an element for the SGVS attribute because the
-* 3DSTATE_VF_SGVS instruction lets you store the generated attribute in an
-* element that is past the list in 3DSTATE_VERTEX_ELEMENTS. However if
-* we're using draw parameters then we need an element for the those
-* values.  Additionally if there is an edge flag element then the SGVS
-* can't be inserted past that so we need a dummy element to ensure that
-* the edge flag is the last one.
-*/
-   const bool needs_sgvs_element = (vs_prog_data->uses_basevertex ||
-vs_prog_data->uses_baseinstance ||
-((vs_prog_data->uses_instanceid ||
-  vs_prog_data->uses_vertexid)
- && uses_edge_flag));
-#else
const bool needs_sgvs_element = (vs_prog_data->uses_basevertex ||
 vs_prog_data->uses_baseinstance ||
 vs_prog_data->uses_instanceid ||
 vs_prog_data->uses_vertexid);
-#endif
+
unsigned nr_elements =
   brw->vb.nr_enabled + needs_sgvs_element + vs_prog_data->uses_drawid;
 
-- 
2.14.1
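
The effect of the change on the element count can be sketched as follows. This is a simplified model, assuming a hypothetical `struct vs_flags` in place of brw_vs_prog_data; the two predicates mirror the removed gen8+ condition and the condition that remains after the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the brw_vs_prog_data flags the patch reads. */
struct vs_flags {
   bool uses_vertexid;
   bool uses_instanceid;
   bool uses_basevertex;
   bool uses_baseinstance;
   bool uses_drawid;
};

/* gen8+ rule removed by the patch: IDs alone only count with an edge flag. */
static bool
needs_sgvs_element_old_gen8(const struct vs_flags *vs, bool uses_edge_flag)
{
   return vs->uses_basevertex || vs->uses_baseinstance ||
          ((vs->uses_instanceid || vs->uses_vertexid) && uses_edge_flag);
}

/* Rule that remains for every generation after the patch. */
static bool
needs_sgvs_element_new(const struct vs_flags *vs)
{
   return vs->uses_basevertex || vs->uses_baseinstance ||
          vs->uses_instanceid || vs->uses_vertexid;
}

/* Same sum as nr_elements in the diff above. */
static unsigned
count_elements(unsigned nr_enabled, bool needs_sgvs_element, bool uses_drawid)
{
   return nr_enabled + needs_sgvs_element + uses_drawid;
}
```

For a shader that reads only gl_InstanceID and has no edge flag, the old gen8+ rule allocates no extra element while the new rule allocates one, which matches the flaky case described in the commit message.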



[Mesa-dev] [PATCH v3 4/8] intel: Handle firstvertex in an identical way to BaseVertex

2018-01-25 Thread Antia Puentes
Until we set gl_BaseVertex to zero for non-indexed draw calls
both have an identical value.

The Vertex Elements are kept like that:
* VE 1: 
* VE 2: 
---
 src/intel/compiler/brw_nir.c  |  3 +++
 src/intel/compiler/brw_vec4.cpp   |  1 +
 src/mesa/drivers/dri/i965/brw_context.h   |  8 ++--
 src/mesa/drivers/dri/i965/brw_draw.c  | 14 +-
 src/mesa/drivers/dri/i965/brw_draw_upload.c   |  7 +--
 src/mesa/drivers/dri/i965/genX_state_upload.c | 11 +++
 6 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index dbddef0d04d..34b1e44adf0 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -239,6 +239,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
const bool has_sgvs =
   nir->info.system_values_read &
   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+   BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
@@ -261,6 +262,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 
 switch (intrin->intrinsic) {
 case nir_intrinsic_load_base_vertex:
+case nir_intrinsic_load_first_vertex:
 case nir_intrinsic_load_base_instance:
 case nir_intrinsic_load_vertex_id_zero_base:
 case nir_intrinsic_load_instance_id:
@@ -278,6 +280,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
nir_intrinsic_set_base(load, num_inputs);
switch (intrin->intrinsic) {
case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
   nir_intrinsic_set_component(load, 0);
   break;
case nir_intrinsic_load_base_instance:
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 36e17d77d47..06c79630119 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2788,6 +2788,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
 */
if (shader->info.system_values_read &
(BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
 BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
 BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
 BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 9046acd175c..0a20706567e 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -881,8 +881,12 @@ struct brw_context
 
struct {
   struct {
- /** The value of gl_BaseVertex for the current _mesa_prim. */
- int gl_basevertex;
+ /**
+  * Either the value of gl_BaseVertex for indexed draw calls or the
+  * value of the argument  for non-indexed draw calls for the
+  * current _mesa_prim.
+  */
+ int firstvertex;
 
  /** The value of gl_BaseInstance for the current _mesa_prim. */
  int gl_baseinstance;
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 50cf8b12c74..a1a5161fd35 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -816,25 +816,29 @@ brw_draw_single_prim(struct gl_context *ctx,
 * always flag if the shader uses one of the values. For direct draws,
 * we only flag if the values change.
 */
-   const int new_basevertex =
+   const int new_firstvertex =
   prim->indexed ? prim->basevertex : prim->start;
const int new_baseinstance = prim->base_instance;
const struct brw_vs_prog_data *vs_prog_data =
   brw_vs_prog_data(brw->vs.base.prog_data);
if (prim_id > 0) {
-  const bool uses_draw_parameters =
+  const bool uses_firstvertex =
  vs_prog_data->uses_basevertex ||
+ vs_prog_data->uses_firstvertex;
+
+  const bool uses_draw_parameters =
+ uses_firstvertex ||
  vs_prog_data->uses_baseinstance;
 
   if ((uses_draw_parameters && prim->is_indirect) ||
-  (vs_prog_data->uses_basevertex &&
-   brw->draw.params.gl_basevertex != new_basevertex) ||
+  (uses_firstvertex &&
+   brw->draw.params.firstvertex != new_firstvertex) ||
   (vs_prog_data->uses_baseinstance &&
brw->draw.params.gl_baseinstance != new_baseinstance))
  brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
}
 
-   brw->draw.params.gl_basevertex = new_basevertex;
+   brw->draw.params.firstvertex = new_firstvertex;
brw->draw.params.gl_baseinstance = new_baseinstance;
brw_bo_unreference(brw->draw.draw_params_bo);
 
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 

Re: [Mesa-dev] [PATCH] i965/fs: Reset the register file to VGRF in lower_integer_multiplication

2018-01-25 Thread Matt Turner
On Fri, Dec 15, 2017 at 5:12 PM, Jason Ekstrand  wrote:
> 18fde36ced4279f2577097a1a7d31b55f2f5f141 changed the way temporary
> registers were allocated in lower_integer_multiplication so that we
> allocate regs_written(inst) space and keep the stride of the original
> destination register.  This was to ensure that any MUL which originally
> followed the CHV/BXT integer multiply regioning restrictions would
> continue to follow those restrictions even after lowering.  This works
> fine except that I forgot to reset the register file to VGRF so, even
> though they were assigned a number from alloc.allocate(), they had the
> wrong register file.  This caused some GLES 3.0 CTS tests to start
> failing on Sandy Bridge due to attempted reads from the MRF:
>
> ES3-CTS.functional.shaders.precision.int.highp_mul_fragment.snbm64
> ES3-CTS.functional.shaders.precision.int.mediump_mul_fragment.snbm64
> ES3-CTS.functional.shaders.precision.int.lowp_mul_fragment.snbm64
> ES3-CTS.functional.shaders.precision.uint.highp_mul_fragment.snbm64
> ES3-CTS.functional.shaders.precision.uint.mediump_mul_fragment.snbm64
> ES3-CTS.functional.shaders.precision.uint.lowp_mul_fragment.snbm64
>
> This commit remedies this problem by, instead of copying inst->dst and
> overwriting nr, just make a new register and set the region to match
> inst->dst.
>
> Cc: Matt Turner 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103626
> Fixes: 18fde36ced4279f2577097a1a7d31b55f2f5f141
> Cc: "17.3" 

Thanks. Sorry this got lost. Looks like it was sent the day I started vacation.

Reviewed-by: Matt Turner 
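
The bug described in the quoted commit message can be modelled in a few lines. This is a deliberately simplified sketch: `struct reg`, `enum reg_file` and both helpers are hypothetical and far smaller than the real fs_reg; they only show why copying the destination and overwriting its number keeps the wrong register file.

```c
#include <assert.h>

/* Hypothetical, heavily reduced register description. */
enum reg_file { FILE_VGRF, FILE_MRF };

struct reg {
   enum reg_file file;
   unsigned nr;
   unsigned stride;
};

/* Problematic pattern: copy dst and only overwrite the number. */
static struct reg
temp_by_copy(struct reg dst, unsigned new_nr)
{
   struct reg tmp = dst;
   tmp.nr = new_nr; /* file is still whatever dst used, e.g. MRF */
   return tmp;
}

/* Pattern described by the fix: new VGRF, region taken from dst. */
static struct reg
temp_fresh(struct reg dst, unsigned new_nr)
{
   struct reg tmp = { FILE_VGRF, new_nr, dst.stride };
   return tmp;
}
```

With an MRF destination, the first helper yields a temporary that still claims to live in the MRF, which is consistent with the attempted MRF reads reported on Sandy Bridge.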


Re: [Mesa-dev] [PATCH] mesa: add missing RGB9_E5 format in _mesa_base_fbo_format

2018-01-25 Thread Roland Scheidegger
Am 25.01.2018 um 17:56 schrieb Roland Scheidegger:
> Am 25.01.2018 um 16:30 schrieb Michel Dänzer:
>> On 2018-01-24 05:38 PM, Juan A. Suarez Romero wrote:
>>> This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5.
>>> ---
>>>  src/mesa/main/fbobject.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
>>> index d23916d1ad7..c72204e11a0 100644
>>> --- a/src/mesa/main/fbobject.c
>>> +++ b/src/mesa/main/fbobject.c
>>> @@ -1976,6 +1976,9 @@ _mesa_base_fbo_format(const struct gl_context *ctx, 
>>> GLenum internalFormat)
>>> ctx->Extensions.ARB_texture_float) ||
>>>_mesa_is_gles3(ctx) /* EXT_color_buffer_float */ )
>>>   ? GL_RGBA : 0;
>>> +   case GL_RGB9_E5:
>>> +  return (_mesa_is_desktop_gl(ctx) && 
>>> ctx->Extensions.EXT_texture_shared_exponent)
>>> + ? GL_RGB: 0;
>>> case GL_ALPHA16F_ARB:
>>> case GL_ALPHA32F_ARB:
>>>return ctx->API == API_OPENGL_COMPAT &&
>>>
>>
>> Unfortunately, this broke the "spec@arb_internalformat_query2@samples
>> and num_sample_counts pname checks" piglit tests with radeonsi and
>> llvmpipe, see below.
>>
>> Any idea what might need to be done in Gallium to fix this?
>>
>>
>> 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
>> GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), 
>> supported=1
>> 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
>> GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
>> (1,GL_TRUE), supported=1
>> 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
>> GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
>> (1,GL_TRUE), supported=1
>> 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
>> GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), 
>> supported=1
>> 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
>> GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
>> (1,GL_TRUE), supported=1
>> 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
>> GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
>> (1,GL_TRUE), supported=1
>> PIGLIT: {"subtest": {"GL_NUM_SAMPLE_COUNTS" : "fail"}}
>> 32 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, 
>> internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1
>> 32 bit failing case: pname = GL_SAMPLES, target = 
>> GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
>> (1,GL_TRUE), supported=1
>> 32 bit failing case: pname = GL_SAMPLES, target = 
>> GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
>> (1,GL_TRUE), supported=1
>> 64 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, 
>> internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1
>> 64 bit failing case: pname = GL_SAMPLES, target = 
>> GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
>> (1,GL_TRUE), supported=1
>> 64 bit failing case: pname = GL_SAMPLES, target = 
>> GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
>> (1,GL_TRUE), supported=1
>> PIGLIT: {"subtest": {"GL_SAMPLES" : "fail"}}
>> PIGLIT: {"result": "fail" }
>>
>>
> 
> Purely coincidentally, I was trying to clean up the formatquery code
> recently (should help some failures with r600 too), and I think these
> cleanups would fix it.
> Basically outright say "no" to target/pname combinations which don't
> make sense rather than trying to find a format suitable for another
> target and then asking the driver for the nonsense combination, plus
> some other small bits like not validating things again (sometimes, a
> third time...).
> Albeit it will cause some breakage with the piglit test, which I believe
> is a test error, but that might be open for debate...
> (For TEXTURE_BUFFER and the internalformat size/type queries, do you
> return valid values or unsupported? The problem here is ARB_tbo says you
> can't get these values via the equivalent GetTexLevelParameter queries,
> whereas with GL 3.1 you can. And internalformat_query2 says it returns
> "the same information" as GetTexLevelParameter, albeit it's not entirely
> true in any case since the equivalent of the internalformat stencil type
> doesn't even exist. My stance would be that valid values should be
> reported even without GL 3.1, but the piglit test thinks differently.)
> 

Err, actually this won't fix it I suppose - because rgb9e5 now is a
valid fbo format. Was that commit really correct? It does not make sense
to me, rgb9e5 cannot be a fbo/renderable format. Or was this just
working around issues in formatquery.c (which I try to address with this
patch)?

Roland




Re: [Mesa-dev] [PATCH] st/mesa: expand glDrawPixels cache to handle multiple images

2018-01-25 Thread Gustaw Smolarczyk
2018-01-25 18:02 GMT+01:00 Roland Scheidegger :

> Am 25.01.2018 um 16:55 schrieb Brian Paul:
> > The newest version of WSI Fusion makes several glDrawPixels calls
> > per frame.  By caching more than one image, we get better performance
> > when panning/zomming the map.
> Still zooming :-)
>
>
>
>
> >
> > v2: move pixel unpack param checking out of cache search loop, per Roland
> > ---
> >  src/mesa/state_tracker/st_cb_drawpixels.c | 196
> +-
> >  src/mesa/state_tracker/st_context.c   |   4 -
> >  src/mesa/state_tracker/st_context.h   |  22 +++-
> >  3 files changed, 154 insertions(+), 68 deletions(-)
> >
> > diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c
> b/src/mesa/state_tracker/st_cb_drawpixels.c
> > index 1d88976..e63f6f7 100644
> > --- a/src/mesa/state_tracker/st_cb_drawpixels.c
> > +++ b/src/mesa/state_tracker/st_cb_drawpixels.c
> > @@ -375,6 +375,131 @@ alloc_texture(struct st_context *st, GLsizei
> width, GLsizei height,
> >
> >
> >  /**
> > + * Search the cache for an image which matches the given parameters.
> > + * \return  pipe_resource pointer if found, NULL if not found.
> > + */
> > +static struct pipe_resource *
> > +search_drawpixels_cache(struct st_context *st,
> > +GLsizei width, GLsizei height,
> > +GLenum format, GLenum type,
> > +const struct gl_pixelstore_attrib *unpack,
> > +const void *pixels)
> > +{
> > +   struct pipe_resource *pt = NULL;
> > +   const GLint bpp = _mesa_bytes_per_pixel(format, type);
> > +   unsigned i;
> > +
> > +   if ((unpack->RowLength != 0 && unpack->RowLength != width) ||
> > +   unpack->SkipPixels != 0 ||
> > +   unpack->SkipRows != 0 ||
> > +   unpack->SwapBytes) {
> > +  /* we don't allow non-default pixel unpacking values */
> > +  return NULL;
> > +   }
> > +
> > +   /* Search cache entries for a match */
> > +   for (i = 0; i < ARRAY_SIZE(st->drawpix_cache.entries); i++) {
> > +  struct drawpix_cache_entry *entry = &st->drawpix_cache.entries[i];
> > +
> > +  if (width == entry->width &&
> > +  height == entry->height &&
> > +  format == entry->format &&
> > +  type == entry->type &&
> > +  pixels == entry->user_pointer &&
> > +  !_mesa_is_bufferobj(unpack->BufferObj) &&
> Move this line as well?
>
>
>
> > +  entry->image) {
> > + assert(entry->texture);
> > +
> > + /* check if the pixel data is the same */
> > + if (memcmp(pixels, entry->image, width * height * bpp) == 0) {
> > +/* Success - found a cache match */
> > +pipe_resource_reference(&pt, entry->texture);
> > +/* refcount of returned texture should be at least two
> here.  One
> > + * reference for the cache to hold on to, one for the
> caller (which
> > + * it will release), and possibly more held by the driver.
> > + */
> > +assert(pt->reference.count >= 2);
> > +
> > +/* update the age of this entry */
> > +entry->age = ++st->drawpix_cache.age;
> > +
> > +return pt;
> > + }
> > +  }
> > +   }
> > +
> > +   /* no cache match found */
> > +   return NULL;
> > +}
> > +
> > +
> > +/**
> > + * Find the oldest entry in the glDrawPixels cache.  We'll replace this
> > + * one when we need to store a new image.
> > + */
> > +static struct drawpix_cache_entry *
> > +find_oldest_drawpixels_cache_entry(struct st_context *st)
> > +{
> > +   unsigned oldest_age = ~0u, oldest_index = ~0u;
> > +   unsigned i;
> > +
> > +   /* Find entry with oldest (lowest) age */
> > +   for (i = 0; i < ARRAY_SIZE(st->drawpix_cache.entries); i++) {
> > +  const struct drawpix_cache_entry *entry =
> &st->drawpix_cache.entries[i];
> > +  if (entry->age < oldest_age) {
> > + oldest_age = entry->age;
> > + oldest_index = i;
> > +  }
> > +   }
> > +
> > +   assert(oldest_age != ~0u);
> Ok, if it takes 2 years to hit it, that's probably ok...
>
> Reviewed-by: Roland Scheidegger 
>

Note that at 13000fps (maximum I could achieve with glxgears) it would take
less than 4 days. Though I guess if you run glDrawPixels each frame you
couldn't achieve such fps value.

Gustaw Smolarczyk
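
The estimates in this exchange can be checked with a small calculation. This is a sketch under the assumption that the 32-bit age counter is bumped once per frame; the helper names are hypothetical, and an application issuing several glDrawPixels calls per frame would reach ~0u proportionally sooner.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a 32-bit counter bumped at the given rate reaches ~0u. */
static double
seconds_until_max(double increments_per_second)
{
   return (double)UINT32_MAX / increments_per_second;
}

/* Convert seconds to days. */
static double
to_days(double seconds)
{
   return seconds / 86400.0;
}
```

At 13000 increments per second this comes to a little under 4 days, and at 60 per second to roughly 828 days, which lines up with the "less than 4 days" and "2 years" figures above.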


>
> > +   assert(oldest_index != ~0u);
> > +
> > +   return &st->drawpix_cache.entries[oldest_index];
> > +}
> > +
> > +
> > +/**
> > + * Try to save the given glDrawPixels image in the cache.
> > + */
> > +static void
> > +cache_drawpixels_image(struct st_context *st,
> > +   GLsizei width, GLsizei height,
> > +   GLenum format, GLenum type,
> > +   const struct gl_pixelstore_attrib *unpack,
> > +   const void *pixels,
> > +   struct pipe_resource *pt)
> > +{
> > +   if ((unpack->RowLength == 0 || unpack->RowLength == width) &&
> > 

Re: [Mesa-dev] [PATCH] configure.ac: correct driglx-direct help text

2018-01-25 Thread Daniel Stone
On 20 December 2017 at 17:34, Emil Velikov  wrote:
> The default was toggled a while back, but the text wasn't updated.

Reviewed-by: Daniel Stone 


Re: [Mesa-dev] [PATCH 1/5] configure.ac: add Wundef to the build flags

2018-01-25 Thread Emil Velikov
On 24 November 2017 at 18:26, Eric Engestrom  wrote:
> On Friday, 2017-11-24 18:14:41 +, Emil Velikov wrote:
>> On 24 November 2017 at 14:32, Eric Engestrom  
>> wrote:
>> > On Friday, 2017-11-24 14:25:02 +, Emil Velikov wrote:
>> >> From: Emil Velikov 
>> >>
>> >> From the manual:
>> >>   Warn if an undefined identifier is evaluated in an `#if' directive.
>> >>
>> >> This is something we want to know and address. Otherwise we can end up
>> >> with subtle issues, in the less commonly used codepaths.
>> >>
>> >> Note: this will trigger a lot of extra warnings, with ~60 of those being
>> >> unique. Once all those are resolved we'd want to promote the warning to
>> >> an error.
>> >
>> > Yes please; series is
>> > Reviewed-by: Eric Engestrom 
>> >
>> Thanks. I think we should hold these off, until some (say 1/3?) of the
>> issues are resolved.
>> Otherwise devs might get a bit annoyed by the massive amount of warnings.
>
> Agreed. The series I just sent fixes 99% of the warnings already,
> because c99_{compat,math}.h is included everywhere.
>
> Once that series and your gtest patches land, I think it should be good
> enough, and individual devs can take care of the rest.
>
> The next biggest offender is Nouveau, and I haven't had a proper look
> but at a glance I think it looked like it was probably just a few places
> generating many warnings.

FTR, I haven't forgotten about this one. Upstream gtest has not
replied to the series that I posted nearly 2 months ago [1].
I'd love to address any feedback and flow things naturally into Mesa.
Alternatively we could pull it locally, although next time we update
gtest things might be fiddly.

-Emil

[1] https://github.com/google/googletest/pull/1335
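
For context, the kind of construct -Wundef is meant to flag looks roughly like the sketch below. HYPOTHETICAL_FEATURE is a made-up macro used only for illustration and is deliberately never defined; it is not taken from Mesa or gtest.

```c
#include <assert.h>

/* HYPOTHETICAL_FEATURE is a made-up macro that is never defined here. */

static int
feature_enabled(void)
{
   /* A bare "#if HYPOTHETICAL_FEATURE" would silently evaluate to 0 and
    * is what -Wundef reports. Checking defined() first states the intent
    * explicitly and should keep the warning quiet.
    */
#if defined(HYPOTHETICAL_FEATURE) && HYPOTHETICAL_FEATURE
   return 1;
#else
   return 0;
#endif
}
```

The subtle issue mentioned in the thread is that the unguarded form compiles cleanly without the flag, so a misspelled or missing macro simply disables the code path.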


Re: [Mesa-dev] [PATCH 3/3] mesa: whitespace fixes in varray.h

2018-01-25 Thread Neha Bhende
Looks good.


For the series,

Reviewed-by: Neha Bhende


Regards,

Neha


From: Brian Paul 
Sent: Thursday, January 25, 2018 8:48:00 AM
To: mesa-dev@lists.freedesktop.org
Cc: Neha Bhende; Charmaine Lee; Roland Scheidegger
Subject: [PATCH 3/3] mesa: whitespace fixes in varray.h

---
 src/mesa/main/varray.h | 55 ++
 1 file changed, 29 insertions(+), 26 deletions(-)

diff --git a/src/mesa/main/varray.h b/src/mesa/main/varray.h
index 03d81d0..93f2f47 100644
--- a/src/mesa/main/varray.h
+++ b/src/mesa/main/varray.h
@@ -44,9 +44,10 @@ _mesa_vertex_attrib_address(const struct gl_array_attributes 
*array,
if (_mesa_is_bufferobj(binding->BufferObj))
   return (const GLubyte *) (binding->Offset + array->RelativeOffset);
else
-  return array->Ptr;
+  return array->Ptr;
 }

+
 /**
  * Sets the fields in a gl_vertex_array to values derived from a
  * gl_array_attributes and a gl_vertex_buffer_binding.
@@ -70,6 +71,7 @@ _mesa_update_client_array(struct gl_context *ctx,
_mesa_reference_buffer_object(ctx, >BufferObj, binding->BufferObj);
 }

+
 static inline bool
 _mesa_attr_zero_aliases_vertex(const struct gl_context *ctx)
 {
@@ -190,7 +192,7 @@ _mesa_SecondaryColorPointer_no_error(GLint size, GLenum 
type,
  GLsizei stride, const GLvoid *ptr);
 extern void GLAPIENTRY
 _mesa_SecondaryColorPointer(GLint size, GLenum type,
-  GLsizei stride, const GLvoid *ptr);
+GLsizei stride, const GLvoid *ptr);


 extern void GLAPIENTRY
@@ -206,8 +208,8 @@ _mesa_VertexAttribPointer_no_error(GLuint index, GLint 
size, GLenum type,
const GLvoid *pointer);
 extern void GLAPIENTRY
 _mesa_VertexAttribPointer(GLuint index, GLint size, GLenum type,
- GLboolean normalized, GLsizei stride,
- const GLvoid *pointer);
+  GLboolean normalized, GLsizei stride,
+  const GLvoid *pointer);

 void GLAPIENTRY
 _mesa_VertexAttribIPointer_no_error(GLuint index, GLint size, GLenum type,
@@ -295,35 +297,35 @@ _mesa_InterleavedArrays(GLenum format, GLsizei stride, 
const GLvoid *pointer);


 extern void GLAPIENTRY
-_mesa_MultiDrawArrays( GLenum mode, const GLint *first,
-  const GLsizei *count, GLsizei primcount );
+_mesa_MultiDrawArrays(GLenum mode, const GLint *first,
+  const GLsizei *count, GLsizei primcount);

 extern void GLAPIENTRY
-_mesa_MultiDrawElementsEXT( GLenum mode, const GLsizei *count, GLenum type,
-const GLvoid **indices, GLsizei primcount );
+_mesa_MultiDrawElementsEXT(GLenum mode, const GLsizei *count, GLenum type,
+   const GLvoid **indices, GLsizei primcount);

 extern void GLAPIENTRY
-_mesa_MultiDrawElementsBaseVertex( GLenum mode,
-  const GLsizei *count, GLenum type,
-  const GLvoid **indices, GLsizei primcount,
-  const GLint *basevertex);
+_mesa_MultiDrawElementsBaseVertex(GLenum mode,
+  const GLsizei *count, GLenum type,
+  const GLvoid **indices, GLsizei primcount,
+  const GLint *basevertex);

 extern void GLAPIENTRY
-_mesa_MultiModeDrawArraysIBM( const GLenum * mode, const GLint * first,
- const GLsizei * count,
- GLsizei primcount, GLint modestride );
+_mesa_MultiModeDrawArraysIBM(const GLenum * mode, const GLint * first,
+ const GLsizei * count,
+ GLsizei primcount, GLint modestride );


 extern void GLAPIENTRY
-_mesa_MultiModeDrawElementsIBM( const GLenum * mode, const GLsizei * count,
-   GLenum type, const GLvoid * const * indices,
-   GLsizei primcount, GLint modestride );
+_mesa_MultiModeDrawElementsIBM(const GLenum * mode, const GLsizei * count,
+   GLenum type, const GLvoid * const * indices,
+   GLsizei primcount, GLint modestride );

 extern void GLAPIENTRY
 _mesa_LockArraysEXT(GLint first, GLsizei count);

 extern void GLAPIENTRY
-_mesa_UnlockArraysEXT( void );
+_mesa_UnlockArraysEXT(void);


 extern void GLAPIENTRY
@@ -343,13 +345,13 @@ _mesa_DrawRangeElements(GLenum mode, GLuint start, GLuint 
end, GLsizei count,

 extern void GLAPIENTRY
 _mesa_DrawElementsBaseVertex(GLenum mode, GLsizei count, GLenum type,
-const GLvoid *indices, GLint basevertex);
+ const GLvoid *indices, GLint basevertex);

 extern void GLAPIENTRY
 _mesa_DrawRangeElementsBaseVertex(GLenum mode, GLuint start, GLuint end,

Re: [Mesa-dev] [PATCH] configure.ac: correct driglx-direct help text

2018-01-25 Thread Emil Velikov
On 20 December 2017 at 17:34, Emil Velikov  wrote:
> From: Emil Velikov 
>
> The default was toggled a while back, but the text wasn't updated.
>
> Fixes: bd526ec9e1b ("configure: Always default to
> --enable-driglx-direct")
> Cc: Jon TURNEY 
> Signed-off-by: Emil Velikov 
> ---
>  configure.ac | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index 79f275d3914..cadbe4bce3c 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1597,7 +1597,7 @@ fi
>  AC_ARG_ENABLE([driglx-direct],
>  [AS_HELP_STRING([--disable-driglx-direct],
>  [disable direct rendering in GLX and EGL for DRI \
> -@<:@default=auto@:>@])],
> +@<:@default=enabled@:>@])],
>  [driglx_direct="$enableval"],
>  [driglx_direct="yes"])
Humble ping, anyone?

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] radv: fix RADV_DEBUG=syncshaders on GFX9

2018-01-25 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Thu, Jan 25, 2018 at 3:46 PM, Samuel Pitoiset 
wrote:

> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_
> buffer.c
> index ba5fd92f2a1..b694174de68 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -433,13 +433,22 @@ radv_cmd_buffer_after_draw(struct radv_cmd_buffer
> *cmd_buffer,
>enum radv_cmd_flush_bits flags)
>  {
> if (cmd_buffer->device->instance->debug_flags &
> RADV_DEBUG_SYNC_SHADERS) {
> +   uint32_t *ptr = NULL;
> +   uint64_t va = 0;
> +
> assert(flags & (RADV_CMD_FLAG_PS_PARTIAL_FLUSH |
> RADV_CMD_FLAG_CS_PARTIAL_FLUSH));
>
> +   if (cmd_buffer->device->physical_device->rad_info.chip_class
> == GFX9) {
> +   va = radv_buffer_get_va(cmd_buffer->gfx9_fence_bo)
> +
> +cmd_buffer->gfx9_fence_offset;
> +   ptr = &cmd_buffer->gfx9_fence_idx;
> +   }
> +
> /* Force wait for graphics or compute engines to be idle.
> */
> si_cs_emit_cache_flush(cmd_buffer->cs, false,
>cmd_buffer->device->physical_
> device->rad_info.chip_class,
> -  NULL, 0,
> +  ptr, va,
>radv_cmd_buffer_uses_mec(cmd_
> buffer),
>flags);
> }
> --
> 2.16.1


Re: [Mesa-dev] [PATCH 1/2] radv: fix a GPU hang with RADV_DEBUG=syncshaders

2018-01-25 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Thu, Jan 25, 2018 at 3:46 PM, Samuel Pitoiset 
wrote:

> The GPU hangs when the driver forces a PS_PARTIAL_FLUSH after
> a dispatch call (and vice versa for graphics). Something has
> changed in the kernel driver because it used to work.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_
> buffer.c
> index 6d512c6070a..ba5fd92f2a1 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -429,15 +429,14 @@ void radv_cmd_buffer_trace_emit(struct
> radv_cmd_buffer *cmd_buffer)
>  }
>
>  static void
> -radv_cmd_buffer_after_draw(struct radv_cmd_buffer *cmd_buffer)
> +radv_cmd_buffer_after_draw(struct radv_cmd_buffer *cmd_buffer,
> +  enum radv_cmd_flush_bits flags)
>  {
> if (cmd_buffer->device->instance->debug_flags &
> RADV_DEBUG_SYNC_SHADERS) {
> -   enum radv_cmd_flush_bits flags;
> -
> -   /* Force wait for graphics/compute engines to be idle. */
> -   flags = RADV_CMD_FLAG_PS_PARTIAL_FLUSH |
> -   RADV_CMD_FLAG_CS_PARTIAL_FLUSH;
> +   assert(flags & (RADV_CMD_FLAG_PS_PARTIAL_FLUSH |
> +   RADV_CMD_FLAG_CS_PARTIAL_FLUSH));
>
> +   /* Force wait for graphics or compute engines to be idle.
> */
> si_cs_emit_cache_flush(cmd_buffer->cs, false,
>cmd_buffer->device->physical_
> device->rad_info.chip_class,
>NULL, 0,
> @@ -3501,7 +3500,7 @@ radv_draw(struct radv_cmd_buffer *cmd_buffer,
> }
>
> assert(cmd_buffer->cs->cdw <= cdw_max);
> -   radv_cmd_buffer_after_draw(cmd_buffer);
> +   radv_cmd_buffer_after_draw(cmd_buffer, RADV_CMD_FLAG_PS_PARTIAL_
> FLUSH);
>  }
>
>  void radv_CmdDraw(
> @@ -3821,7 +3820,7 @@ radv_dispatch(struct radv_cmd_buffer *cmd_buffer,
> radv_emit_dispatch_packets(cmd_buffer, info);
> }
>
> -   radv_cmd_buffer_after_draw(cmd_buffer);
> +   radv_cmd_buffer_after_draw(cmd_buffer, RADV_CMD_FLAG_CS_PARTIAL_
> FLUSH);
>  }
>
>  void radv_CmdDispatch(
> --
> 2.16.1
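
To illustrate the idea behind patch 1/2: rather than always requesting both partial flushes, the caller passes only the one that matches the work it just submitted (PS after a draw, CS after a dispatch). The following is a minimal sketch under that reading; every sketch_* name and flag value here is made up for illustration and is not the radv definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; not the real radv enum. */
enum sketch_flush_bits {
   SKETCH_FLAG_PS_PARTIAL_FLUSH = 1u << 0,
   SKETCH_FLAG_CS_PARTIAL_FLUSH = 1u << 1,
};

/* Hypothetical tag for the kind of work that was just recorded. */
enum sketch_work_kind {
   SKETCH_WORK_DRAW,
   SKETCH_WORK_DISPATCH,
};

/*
 * Pick the single partial flush matching the submitted work.
 * Returns 0 when the sync-shaders debug option is off, meaning
 * no forced wait is emitted at all.
 */
static uint32_t
sketch_sync_flags(bool sync_shaders, enum sketch_work_kind kind)
{
   if (!sync_shaders)
      return 0;

   return kind == SKETCH_WORK_DRAW ? SKETCH_FLAG_PS_PARTIAL_FLUSH
                                   : SKETCH_FLAG_CS_PARTIAL_FLUSH;
}
```

With this shape, a draw path would call sketch_sync_flags(enabled, SKETCH_WORK_DRAW) and a dispatch path would use SKETCH_WORK_DISPATCH, so the two flushes are never combined.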


Re: [Mesa-dev] [PATCH] st/mesa: expand glDrawPixels cache to handle multiple images

2018-01-25 Thread Roland Scheidegger
On 25.01.2018 at 16:55, Brian Paul wrote:
> The newest version of WSI Fusion makes several glDrawPixels calls
> per frame.  By caching more than one image, we get better performance
> when panning/zomming the map.
Still zooming :-)




> 
> v2: move pixel unpack param checking out of cache search loop, per Roland
> ---
>  src/mesa/state_tracker/st_cb_drawpixels.c | 196 
> +-
>  src/mesa/state_tracker/st_context.c   |   4 -
>  src/mesa/state_tracker/st_context.h   |  22 +++-
>  3 files changed, 154 insertions(+), 68 deletions(-)
> 
> diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
> b/src/mesa/state_tracker/st_cb_drawpixels.c
> index 1d88976..e63f6f7 100644
> --- a/src/mesa/state_tracker/st_cb_drawpixels.c
> +++ b/src/mesa/state_tracker/st_cb_drawpixels.c
> @@ -375,6 +375,131 @@ alloc_texture(struct st_context *st, GLsizei width, 
> GLsizei height,
>  
>  
>  /**
> + * Search the cache for an image which matches the given parameters.
> + * \return  pipe_resource pointer if found, NULL if not found.
> + */
> +static struct pipe_resource *
> +search_drawpixels_cache(struct st_context *st,
> +GLsizei width, GLsizei height,
> +GLenum format, GLenum type,
> +const struct gl_pixelstore_attrib *unpack,
> +const void *pixels)
> +{
> +   struct pipe_resource *pt = NULL;
> +   const GLint bpp = _mesa_bytes_per_pixel(format, type);
> +   unsigned i;
> +
> +   if ((unpack->RowLength != 0 && unpack->RowLength != width) ||
> +   unpack->SkipPixels != 0 ||
> +   unpack->SkipRows != 0 ||
> +   unpack->SwapBytes) {
> +  /* we don't allow non-default pixel unpacking values */
> +  return NULL;
> +   }
> +
> +   /* Search cache entries for a match */
> +   for (i = 0; i < ARRAY_SIZE(st->drawpix_cache.entries); i++) {
> +      struct drawpix_cache_entry *entry = &st->drawpix_cache.entries[i];
> +
> +  if (width == entry->width &&
> +  height == entry->height &&
> +  format == entry->format &&
> +  type == entry->type &&
> +  pixels == entry->user_pointer &&
> +  !_mesa_is_bufferobj(unpack->BufferObj) &&
Move this line as well?



> +  entry->image) {
> + assert(entry->texture);
> +
> + /* check if the pixel data is the same */
> + if (memcmp(pixels, entry->image, width * height * bpp) == 0) {
> +/* Success - found a cache match */
> +pipe_resource_reference(&pt, entry->texture);
> +/* refcount of returned texture should be at least two here.  One
> + * reference for the cache to hold on to, one for the caller 
> (which
> + * it will release), and possibly more held by the driver.
> + */
> +assert(pt->reference.count >= 2);
> +
> +/* update the age of this entry */
> +entry->age = ++st->drawpix_cache.age;
> +
> +return pt;
> + }
> +  }
> +   }
> +
> +   /* no cache match found */
> +   return NULL;
> +}
> +
> +
> +/**
> + * Find the oldest entry in the glDrawPixels cache.  We'll replace this
> + * one when we need to store a new image.
> + */
> +static struct drawpix_cache_entry *
> +find_oldest_drawpixels_cache_entry(struct st_context *st)
> +{
> +   unsigned oldest_age = ~0u, oldest_index = ~0u;
> +   unsigned i;
> +
> +   /* Find entry with oldest (lowest) age */
> +   for (i = 0; i < ARRAY_SIZE(st->drawpix_cache.entries); i++) {
> +      const struct drawpix_cache_entry *entry = 
> &st->drawpix_cache.entries[i];
> +  if (entry->age < oldest_age) {
> + oldest_age = entry->age;
> + oldest_index = i;
> +  }
> +   }
> +
> +   assert(oldest_age != ~0u);
Ok, if it takes 2 years to hit it, that's probably ok...

Reviewed-by: Roland Scheidegger 


> +   assert(oldest_index != ~0u);
> +
> +   return &st->drawpix_cache.entries[oldest_index];
> +}
> +
> +
> +/**
> + * Try to save the given glDrawPixels image in the cache.
> + */
> +static void
> +cache_drawpixels_image(struct st_context *st,
> +   GLsizei width, GLsizei height,
> +   GLenum format, GLenum type,
> +   const struct gl_pixelstore_attrib *unpack,
> +   const void *pixels,
> +   struct pipe_resource *pt)
> +{
> +   if ((unpack->RowLength == 0 || unpack->RowLength == width) &&
> +   unpack->SkipPixels == 0 &&
> +   unpack->SkipRows == 0) {
> +  const GLint bpp = _mesa_bytes_per_pixel(format, type);
> +  struct drawpix_cache_entry *entry =
> + find_oldest_drawpixels_cache_entry(st);
> +  assert(entry);
> +  entry->width = width;
> +  entry->height = height;
> +  entry->format = format;
> +  entry->type = type;
> +  entry->user_pointer = pixels;
> +  free(entry->image);
> +  entry->image = malloc(width * height * bpp);
> +  if 
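
The caching scheme discussed in this thread (a small fixed array of entries, a global age counter bumped on every hit or store, and replacement of the entry with the lowest age) can be sketched in isolation. This is only an illustration under simplifying assumptions: the sketch_* names and the four-entry size are invented here, the texture and refcounting parts are left out, and unlike the patch it compares pixel contents only and does not first check the user pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_CACHE_ENTRIES 4   /* arbitrary size chosen for the sketch */

struct sketch_entry {
   int width, height;
   unsigned age;      /* 0 means the slot was never used */
   void *image;       /* private copy of the pixel data */
   size_t size;       /* size of the copy in bytes */
};

struct sketch_cache {
   unsigned age;      /* monotonically increasing use counter */
   struct sketch_entry entries[SKETCH_CACHE_ENTRIES];
};

/* Return the matching entry and refresh its age, or NULL on a miss. */
static struct sketch_entry *
sketch_find(struct sketch_cache *c, int w, int h,
            const void *pixels, size_t size)
{
   for (unsigned i = 0; i < SKETCH_CACHE_ENTRIES; i++) {
      struct sketch_entry *e = &c->entries[i];

      if (e->image && e->width == w && e->height == h &&
          e->size == size && memcmp(pixels, e->image, size) == 0) {
         e->age = ++c->age;
         return e;
      }
   }
   return NULL;
}

/* Return the entry with the lowest age; unused slots (age 0) win. */
static struct sketch_entry *
sketch_oldest(struct sketch_cache *c)
{
   struct sketch_entry *oldest = &c->entries[0];

   for (unsigned i = 1; i < SKETCH_CACHE_ENTRIES; i++) {
      if (c->entries[i].age < oldest->age)
         oldest = &c->entries[i];
   }
   return oldest;
}

/* Replace the oldest entry with a copy of the image; 1 on success. */
static int
sketch_store(struct sketch_cache *c, int w, int h,
             const void *pixels, size_t size)
{
   struct sketch_entry *e = sketch_oldest(c);
   void *copy = malloc(size);

   if (!copy)
      return 0;

   memcpy(copy, pixels, size);
   free(e->image);
   e->image = copy;
   e->width = w;
   e->height = h;
   e->size = size;
   e->age = ++c->age;
   return 1;
}
```

Seeding the search with entry 0 instead of a ~0u sentinel is a design choice of this sketch; it sidesteps the assert on the sentinel value that the review comment about "2 years" refers to, at the cost of not detecting an empty array.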

[Mesa-dev] [Bug 104749] rasterizer/jitter/JitManager.cpp:252:91: error: no matching function for call to ‘llvm::DIBuilder::createBasicType(const char [8], int, llvm::dwarf::TypeKind)’

2018-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=104749

Emil Velikov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #5 from Emil Velikov  ---
Should be resolved as of

commit 0e879aad2fd1dac102c13d680edf455aa068d5df
Author: George Kyriazis 
Date:   Tue Jan 23 16:12:42 2018 -0600

swr/rast: support llvm 3.9 type declarations

LLVM 3.9 was not taken into account in initial check-in.

Fixes: 01ab218bbc ("swr/rast: Initial work for debugging support.")
cc: mesa-sta...@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104749

-- 
You are receiving this mail because:
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 3/3] mesa: whitespace fixes in varray.h

2018-01-25 Thread Charmaine Lee

For the series, Reviewed-by: Charmaine Lee 


From: Brian Paul 
Sent: Thursday, January 25, 2018 8:48:00 AM
To: mesa-dev@lists.freedesktop.org
Cc: Neha Bhende; Charmaine Lee; Roland Scheidegger
Subject: [PATCH 3/3] mesa: whitespace fixes in varray.h

---
 src/mesa/main/varray.h | 55 ++
 1 file changed, 29 insertions(+), 26 deletions(-)

diff --git a/src/mesa/main/varray.h b/src/mesa/main/varray.h
index 03d81d0..93f2f47 100644
--- a/src/mesa/main/varray.h
+++ b/src/mesa/main/varray.h
@@ -44,9 +44,10 @@ _mesa_vertex_attrib_address(const struct gl_array_attributes 
*array,
if (_mesa_is_bufferobj(binding->BufferObj))
   return (const GLubyte *) (binding->Offset + array->RelativeOffset);
else
-  return array->Ptr;
+  return array->Ptr;
 }

+
 /**
  * Sets the fields in a gl_vertex_array to values derived from a
  * gl_array_attributes and a gl_vertex_buffer_binding.
@@ -70,6 +71,7 @@ _mesa_update_client_array(struct gl_context *ctx,
_mesa_reference_buffer_object(ctx, &dst->BufferObj, binding->BufferObj);
 }

+
 static inline bool
 _mesa_attr_zero_aliases_vertex(const struct gl_context *ctx)
 {
@@ -190,7 +192,7 @@ _mesa_SecondaryColorPointer_no_error(GLint size, GLenum 
type,
  GLsizei stride, const GLvoid *ptr);
 extern void GLAPIENTRY
 _mesa_SecondaryColorPointer(GLint size, GLenum type,
-  GLsizei stride, const GLvoid *ptr);
+GLsizei stride, const GLvoid *ptr);


 extern void GLAPIENTRY
@@ -206,8 +208,8 @@ _mesa_VertexAttribPointer_no_error(GLuint index, GLint 
size, GLenum type,
const GLvoid *pointer);
 extern void GLAPIENTRY
 _mesa_VertexAttribPointer(GLuint index, GLint size, GLenum type,
- GLboolean normalized, GLsizei stride,
- const GLvoid *pointer);
+  GLboolean normalized, GLsizei stride,
+  const GLvoid *pointer);

 void GLAPIENTRY
 _mesa_VertexAttribIPointer_no_error(GLuint index, GLint size, GLenum type,
@@ -295,35 +297,35 @@ _mesa_InterleavedArrays(GLenum format, GLsizei stride, 
const GLvoid *pointer);


 extern void GLAPIENTRY
-_mesa_MultiDrawArrays( GLenum mode, const GLint *first,
-  const GLsizei *count, GLsizei primcount );
+_mesa_MultiDrawArrays(GLenum mode, const GLint *first,
+  const GLsizei *count, GLsizei primcount);

 extern void GLAPIENTRY
-_mesa_MultiDrawElementsEXT( GLenum mode, const GLsizei *count, GLenum type,
-const GLvoid **indices, GLsizei primcount );
+_mesa_MultiDrawElementsEXT(GLenum mode, const GLsizei *count, GLenum type,
+   const GLvoid **indices, GLsizei primcount);

 extern void GLAPIENTRY
-_mesa_MultiDrawElementsBaseVertex( GLenum mode,
-  const GLsizei *count, GLenum type,
-  const GLvoid **indices, GLsizei primcount,
-  const GLint *basevertex);
+_mesa_MultiDrawElementsBaseVertex(GLenum mode,
+  const GLsizei *count, GLenum type,
+  const GLvoid **indices, GLsizei primcount,
+  const GLint *basevertex);

 extern void GLAPIENTRY
-_mesa_MultiModeDrawArraysIBM( const GLenum * mode, const GLint * first,
- const GLsizei * count,
- GLsizei primcount, GLint modestride );
+_mesa_MultiModeDrawArraysIBM(const GLenum * mode, const GLint * first,
+ const GLsizei * count,
+ GLsizei primcount, GLint modestride );


 extern void GLAPIENTRY
-_mesa_MultiModeDrawElementsIBM( const GLenum * mode, const GLsizei * count,
-   GLenum type, const GLvoid * const * indices,
-   GLsizei primcount, GLint modestride );
+_mesa_MultiModeDrawElementsIBM(const GLenum * mode, const GLsizei * count,
+   GLenum type, const GLvoid * const * indices,
+   GLsizei primcount, GLint modestride );

 extern void GLAPIENTRY
 _mesa_LockArraysEXT(GLint first, GLsizei count);

 extern void GLAPIENTRY
-_mesa_UnlockArraysEXT( void );
+_mesa_UnlockArraysEXT(void);


 extern void GLAPIENTRY
@@ -343,13 +345,13 @@ _mesa_DrawRangeElements(GLenum mode, GLuint start, GLuint 
end, GLsizei count,

 extern void GLAPIENTRY
 _mesa_DrawElementsBaseVertex(GLenum mode, GLsizei count, GLenum type,
-const GLvoid *indices, GLint basevertex);
+ const GLvoid *indices, GLint basevertex);

 extern void GLAPIENTRY
 _mesa_DrawRangeElementsBaseVertex(GLenum mode, GLuint start, GLuint end,
-

[Mesa-dev] [Bug 104710] [swrast] piglit draw-batch regression

2018-01-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=104710

--- Comment #1 from Emil Velikov  ---
Vinson I'm suspecting that this should be fixed with
365a48abddcabf6596c2e34a784d91c8ab929918. Can you please confirm?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH] mesa: add missing RGB9_E5 format in _mesa_base_fbo_format

2018-01-25 Thread Roland Scheidegger
On 25.01.2018 at 16:30, Michel Dänzer wrote:
> On 2018-01-24 05:38 PM, Juan A. Suarez Romero wrote:
>> This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5.
>> ---
>>  src/mesa/main/fbobject.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
>> index d23916d1ad7..c72204e11a0 100644
>> --- a/src/mesa/main/fbobject.c
>> +++ b/src/mesa/main/fbobject.c
>> @@ -1976,6 +1976,9 @@ _mesa_base_fbo_format(const struct gl_context *ctx, 
>> GLenum internalFormat)
>> ctx->Extensions.ARB_texture_float) ||
>>_mesa_is_gles3(ctx) /* EXT_color_buffer_float */ )
>>   ? GL_RGBA : 0;
>> +   case GL_RGB9_E5:
>> +  return (_mesa_is_desktop_gl(ctx) && 
>> ctx->Extensions.EXT_texture_shared_exponent)
>> + ? GL_RGB: 0;
>> case GL_ALPHA16F_ARB:
>> case GL_ALPHA32F_ARB:
>>return ctx->API == API_OPENGL_COMPAT &&
>>
> 
> Unfortunately, this broke the "spec@arb_internalformat_query2@samples
> and num_sample_counts pname checks" piglit tests with radeonsi and
> llvmpipe, see below.
> 
> Any idea what might need to be done in Gallium to fix this?
> 
> 
> 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
> GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), 
> supported=1
> 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
> GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
> (1,GL_TRUE), supported=1
> 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
> GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
> (1,GL_TRUE), supported=1
> 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
> GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), 
> supported=1
> 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
> GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
> (1,GL_TRUE), supported=1
> 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
> GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
> (1,GL_TRUE), supported=1
> PIGLIT: {"subtest": {"GL_NUM_SAMPLE_COUNTS" : "fail"}}
> 32 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, 
> internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1
> 32 bit failing case: pname = GL_SAMPLES, target = 
> GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
> (1,GL_TRUE), supported=1
> 32 bit failing case: pname = GL_SAMPLES, target = 
> GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
> (1,GL_TRUE), supported=1
> 64 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, 
> internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1
> 64 bit failing case: pname = GL_SAMPLES, target = 
> GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
> (1,GL_TRUE), supported=1
> 64 bit failing case: pname = GL_SAMPLES, target = 
> GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
> (1,GL_TRUE), supported=1
> PIGLIT: {"subtest": {"GL_SAMPLES" : "fail"}}
> PIGLIT: {"result": "fail" }
> 
> 

Purely coincidentally, I was trying to clean up the formatquery code
recently (should help some failures with r600 too), and I think these
cleanups would fix it.
Basically outright say "no" to target/pname combinations which don't
make sense rather than trying to find a format suitable for another
target and then asking the driver for the nonsense combination, plus
some other small bits like not validating things again (sometimes, a
third time...).
Albeit it will cause some breakage with the piglit test, which I believe
is a test error, but that might be open for debate...
(For TEXTURE_BUFFER and the internalformat size/type queries, do you
return valid values or unsupported? The problem here is ARB_tbo says you
can't get these values via the equivalent GetTexLevelParameter queries,
whereas with GL 3.1 you can. And internalformat_query2 says it returns
"the same information" as GetTexLevelParameter, albeit it's not entirely
true in any case since the equivalent of the internalformat stencil type
doesn't even exist. My stance would be that valid values should be
reported even without GL 3.1, but the piglit test thinks differently.)

Roland
diff --git a/src/mesa/main/formatquery.c b/src/mesa/main/formatquery.c
index 61f798c88f..3f5da272c3 100644
--- a/src/mesa/main/formatquery.c
+++ b/src/mesa/main/formatquery.c
@@ -398,8 +398,6 @@ _is_target_supported(struct gl_context *ctx, GLenum target)
case GL_TEXTURE_1D:
case GL_TEXTURE_2D:
case GL_TEXTURE_3D:
-  if (!_mesa_is_desktop_gl(ctx))
- return false;
   break;
 
case GL_TEXTURE_1D_ARRAY:
@@ -560,15 +558,29 @@ _is_internalformat_supported(struct gl_context *ctx, GLenum target,
 * implementation accepts it for any texture specification commands, and
 * - 

[Mesa-dev] [PATCH 3/3] mesa: whitespace fixes in varray.h

2018-01-25 Thread Brian Paul
---
 src/mesa/main/varray.h | 55 ++
 1 file changed, 29 insertions(+), 26 deletions(-)

diff --git a/src/mesa/main/varray.h b/src/mesa/main/varray.h
index 03d81d0..93f2f47 100644
--- a/src/mesa/main/varray.h
+++ b/src/mesa/main/varray.h
@@ -44,9 +44,10 @@ _mesa_vertex_attrib_address(const struct gl_array_attributes 
*array,
if (_mesa_is_bufferobj(binding->BufferObj))
   return (const GLubyte *) (binding->Offset + array->RelativeOffset);
else
-  return array->Ptr;   
+  return array->Ptr;
 }
 
+
 /**
  * Sets the fields in a gl_vertex_array to values derived from a
  * gl_array_attributes and a gl_vertex_buffer_binding.
@@ -70,6 +71,7 @@ _mesa_update_client_array(struct gl_context *ctx,
_mesa_reference_buffer_object(ctx, &dst->BufferObj, binding->BufferObj);
 }
 
+
 static inline bool
 _mesa_attr_zero_aliases_vertex(const struct gl_context *ctx)
 {
@@ -190,7 +192,7 @@ _mesa_SecondaryColorPointer_no_error(GLint size, GLenum 
type,
  GLsizei stride, const GLvoid *ptr);
 extern void GLAPIENTRY
 _mesa_SecondaryColorPointer(GLint size, GLenum type,
-  GLsizei stride, const GLvoid *ptr);
+GLsizei stride, const GLvoid *ptr);
 
 
 extern void GLAPIENTRY
@@ -206,8 +208,8 @@ _mesa_VertexAttribPointer_no_error(GLuint index, GLint 
size, GLenum type,
const GLvoid *pointer);
 extern void GLAPIENTRY
 _mesa_VertexAttribPointer(GLuint index, GLint size, GLenum type,
- GLboolean normalized, GLsizei stride,
- const GLvoid *pointer);
+  GLboolean normalized, GLsizei stride,
+  const GLvoid *pointer);
 
 void GLAPIENTRY
 _mesa_VertexAttribIPointer_no_error(GLuint index, GLint size, GLenum type,
@@ -295,35 +297,35 @@ _mesa_InterleavedArrays(GLenum format, GLsizei stride, 
const GLvoid *pointer);
 
 
 extern void GLAPIENTRY
-_mesa_MultiDrawArrays( GLenum mode, const GLint *first,
-  const GLsizei *count, GLsizei primcount );
+_mesa_MultiDrawArrays(GLenum mode, const GLint *first,
+  const GLsizei *count, GLsizei primcount);
 
 extern void GLAPIENTRY
-_mesa_MultiDrawElementsEXT( GLenum mode, const GLsizei *count, GLenum type,
-const GLvoid **indices, GLsizei primcount );
+_mesa_MultiDrawElementsEXT(GLenum mode, const GLsizei *count, GLenum type,
+   const GLvoid **indices, GLsizei primcount);
 
 extern void GLAPIENTRY
-_mesa_MultiDrawElementsBaseVertex( GLenum mode,
-  const GLsizei *count, GLenum type,
-  const GLvoid **indices, GLsizei primcount,
-  const GLint *basevertex);
+_mesa_MultiDrawElementsBaseVertex(GLenum mode,
+  const GLsizei *count, GLenum type,
+  const GLvoid **indices, GLsizei primcount,
+  const GLint *basevertex);
 
 extern void GLAPIENTRY
-_mesa_MultiModeDrawArraysIBM( const GLenum * mode, const GLint * first,
- const GLsizei * count,
- GLsizei primcount, GLint modestride );
+_mesa_MultiModeDrawArraysIBM(const GLenum * mode, const GLint * first,
+ const GLsizei * count,
+ GLsizei primcount, GLint modestride );
 
 
 extern void GLAPIENTRY
-_mesa_MultiModeDrawElementsIBM( const GLenum * mode, const GLsizei * count,
-   GLenum type, const GLvoid * const * indices,
-   GLsizei primcount, GLint modestride );
+_mesa_MultiModeDrawElementsIBM(const GLenum * mode, const GLsizei * count,
+   GLenum type, const GLvoid * const * indices,
+   GLsizei primcount, GLint modestride );
 
 extern void GLAPIENTRY
 _mesa_LockArraysEXT(GLint first, GLsizei count);
 
 extern void GLAPIENTRY
-_mesa_UnlockArraysEXT( void );
+_mesa_UnlockArraysEXT(void);
 
 
 extern void GLAPIENTRY
@@ -343,13 +345,13 @@ _mesa_DrawRangeElements(GLenum mode, GLuint start, GLuint 
end, GLsizei count,
 
 extern void GLAPIENTRY
 _mesa_DrawElementsBaseVertex(GLenum mode, GLsizei count, GLenum type,
-const GLvoid *indices, GLint basevertex);
+ const GLvoid *indices, GLint basevertex);
 
 extern void GLAPIENTRY
 _mesa_DrawRangeElementsBaseVertex(GLenum mode, GLuint start, GLuint end,
- GLsizei count, GLenum type,
- const GLvoid *indices,
- GLint basevertex);
+  GLsizei count, GLenum type,
+  const GLvoid *indices,
+

[Mesa-dev] [PATCH 2/3] mesa: include mtypes.h in varray.h

2018-01-25 Thread Brian Paul
We actually use some of the types from mtypes.h so include it directly
instead of relying on indirectly including it via bufferobj.h
---
 src/mesa/main/varray.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/mesa/main/varray.h b/src/mesa/main/varray.h
index 6dcf1db..03d81d0 100644
--- a/src/mesa/main/varray.h
+++ b/src/mesa/main/varray.h
@@ -28,11 +28,9 @@
 #define VARRAY_H
 
 
-#include "glheader.h"
+#include "mtypes.h"
 #include "bufferobj.h"
 
-struct gl_vertex_array;
-struct gl_context;
 
 /**
  * Returns a pointer to the vertex attribute data in a client array,
-- 
2.7.4



[Mesa-dev] [PATCH 1/3] mesa: s/gl_vertex_attrib_array/gl_array_attributes/ in comments

2018-01-25 Thread Brian Paul
The structure type was renamed some time ago, but some comments
were not updated.
---
 src/mesa/main/arrayobj.c | 2 +-
 src/mesa/main/mtypes.h   | 2 +-
 src/mesa/main/varray.h   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/arrayobj.c b/src/mesa/main/arrayobj.c
index d6dc82d..2810647 100644
--- a/src/mesa/main/arrayobj.c
+++ b/src/mesa/main/arrayobj.c
@@ -307,7 +307,7 @@ _mesa_initialize_vao(struct gl_context *ctx,
 
 
 /**
- * Updates the derived gl_vertex_arrays when a gl_vertex_attrib_array
+ * Updates the derived gl_vertex_arrays when a gl_array_attributes
  * or a gl_vertex_buffer_binding has changed.
  */
 void
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index ce4fd4c..66c56a9 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1541,7 +1541,7 @@ struct gl_vertex_array_object
/**
 * Derived vertex attribute arrays
 *
-* This is a legacy data structure created from gl_vertex_attrib_array and
+* This is a legacy data structure created from gl_array_attributes and
 * gl_vertex_buffer_binding, for compatibility with existing driver code.
 */
struct gl_vertex_array _VertexAttrib[VERT_ATTRIB_MAX];
diff --git a/src/mesa/main/varray.h b/src/mesa/main/varray.h
index 8ec6d30..6dcf1db 100644
--- a/src/mesa/main/varray.h
+++ b/src/mesa/main/varray.h
@@ -51,7 +51,7 @@ _mesa_vertex_attrib_address(const struct gl_array_attributes 
*array,
 
 /**
  * Sets the fields in a gl_vertex_array to values derived from a
- * gl_vertex_attrib_array and a gl_vertex_buffer_binding.
+ * gl_array_attributes and a gl_vertex_buffer_binding.
  */
 static inline void
 _mesa_update_client_array(struct gl_context *ctx,
-- 
2.7.4



Re: [Mesa-dev] [Mesa-stable] [PATCH v2] configure.ac: add missing llvm dependencies to .pc files

2018-01-25 Thread Chuck Atkins
>
> > +if test "x$enable_glx" == xgallium-xlib; then
> > +GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $LLVM_LIBS"
> > +fi
> > +if test "x$enable_gallium_osmesa" = xyes; then
> > +OSMESA_PC_LIB_PRIV="$OSMESA_PC_LIB_PRIV $LLVM_LIBS"
> > +fi
> I'm itching to add a comment above these two, since Eric brought it up.
> Modulo any objections


None from me.  Please go ahead.


I'll squash it before pushing.
>

Thanks!  Hopefully once my new account goes through I can push on my own.

- Chuck


Re: [Mesa-dev] [PATCH v1 0/7] Implement commont gralloc_handle_t in libdrm

2018-01-25 Thread Robert Foss

Hey Tomasz,

On 01/24/2018 11:04 AM, Tomasz Figa wrote:

Hi Robert,

On Wed, Jan 17, 2018 at 2:36 AM, Robert Foss  wrote:

This series moves {gbm,drm,cros}_gralloc_handle_t struct to libdrm,
since at least 4 implementations exist, and share a lot of contents.
The idea is to keep the common stuff defined in one place, and libdrm
is the common codebase to all of these platforms.


Additionally, having this struct defined in libdrm will make it
easier for mesa and gralloc implementations to communicate.

Robert Foss (7):
   android: Move gralloc handle struct to libdrm
   android: Add version variable to gralloc_handle_t
   android: Mark gralloc_handle_t magic variable as const
   android: Remove member name from gralloc_handle_t
   android: Change gralloc_handle_t format from Android format to fourcc
   android: Change gralloc_handle_t members to be fixed width
   android: Add accessor functions for gralloc_handle_t variables


Again, thanks for working on this.

I looked through the series and it seems to be much different from
what I imagined when writing my previous reply. I must have
misunderstood your proposal back then.


Ah, glad we caught it before v2 then :)



Generally, the current series doesn't solve Chromium OS's main concern, which
is locking down the handle struct. Even though accessors are added, they
are implemented in libdrm and refer to the exact handle layout as per
the handle struct defined by libdrm.


Solving the problems of multiple projects is the goal, so reconsidering is 
probably the way forward.




What I had in mind was creating a secondary struct, consisting only of
callbacks, which the particular gralloc implementation running in the
system would fill in with its accessors. This would completely eliminate
any dependencies on the handle struct itself from consumers of gralloc
buffers.


So just to sketch out the solution, it would look something like this?

typedef struct gralloc_funcs_t {
    uint32_t (*get_fd)(buffer_handle_t handle, uint32_t plane);
    uint64_t (*get_modifier)(buffer_handle_t handle, uint32_t plane);
    uint32_t (*get_offsets)(buffer_handle_t handle, uint32_t plane);
    uint32_t (*get_stride)(buffer_handle_t handle, uint32_t plane);
    ...
} gralloc_funcs_t;

typedef struct gralloc_handle_t {
    native_handle_t base;

    /* api variables */
    const int magic; /* differentiate between allocator impls */
    const int version; /* api version */

    gralloc_funcs_t funcs;

    ...
} gralloc_handle_t;

For reasons of backwards compatibility gralloc_handle_t should probably
contain whatever gbm_gralloc_handle_t contains now too.
Since we're going to version this struct, we can always drop extraneous
variables later.
Since we'll be able to drop variables, we could add more variables to support
the cros minigbm variables or even the Intel minigbm ones.

This would be a bit high churn, but probably ease adoption.

Additionally the gralloc buffer registering mechanism doesn't exist in any of 
the gralloc implementations, so being able to start out with something that 
works on all platforms would be nice.
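
To make the callback idea above a bit more concrete, here is a minimal sketch
of how a consumer could read a stride without knowing the handle layout. This
is only an illustration: every name ending in _sketch, the example_handle
struct and its fields are made up for this example and are not part of libdrm,
gbm_gralloc or minigbm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Android's buffer_handle_t; opaque to consumers. */
typedef const void *buffer_handle_sketch_t;

/* Callback table filled in by whichever gralloc implementation is running. */
typedef struct gralloc_funcs_sketch {
   int32_t  (*get_fd)(buffer_handle_sketch_t handle, uint32_t plane);
   uint32_t (*get_stride)(buffer_handle_sketch_t handle, uint32_t plane);
} gralloc_funcs_sketch_t;

/* One made-up implementation with its own private handle layout. */
struct example_handle {
   int32_t  fds[2];
   uint32_t strides[2];
};

static int32_t
example_get_fd(buffer_handle_sketch_t handle, uint32_t plane)
{
   const struct example_handle *h = handle;
   return plane < 2 ? h->fds[plane] : -1;
}

static uint32_t
example_get_stride(buffer_handle_sketch_t handle, uint32_t plane)
{
   const struct example_handle *h = handle;
   return plane < 2 ? h->strides[plane] : 0;
}

static const gralloc_funcs_sketch_t example_funcs = {
   .get_fd = example_get_fd,
   .get_stride = example_get_stride,
};

/* A consumer only goes through the callbacks, never the struct fields. */
static uint32_t
consumer_plane_stride(const gralloc_funcs_sketch_t *funcs,
                      buffer_handle_sketch_t handle, uint32_t plane)
{
   assert(funcs && funcs->get_stride);
   return funcs->get_stride(handle, plane);
}
```

With this shape, a different gralloc could swap in its own example_handle
equivalent and callbacks without the consumer being rebuilt.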



Rob.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH mesa] egl: keep extension list sorted, per comment at the top

2018-01-25 Thread Adam Jackson
On Thu, 2018-01-25 at 10:14 +, Eric Engestrom wrote:
> Signed-off-by: Eric Engestrom 

Reviewed-by: Adam Jackson 

- ajax


[Mesa-dev] [PATCH] st/mesa: expand glDrawPixels cache to handle multiple images

2018-01-25 Thread Brian Paul
The newest version of WSI Fusion makes several glDrawPixels calls
per frame.  By caching more than one image, we get better performance
when panning/zooming the map.

v2: move pixel unpack param checking out of cache search loop, per Roland
---
 src/mesa/state_tracker/st_cb_drawpixels.c | 196 +-
 src/mesa/state_tracker/st_context.c   |   4 -
 src/mesa/state_tracker/st_context.h   |  22 +++-
 3 files changed, 154 insertions(+), 68 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
b/src/mesa/state_tracker/st_cb_drawpixels.c
index 1d88976..e63f6f7 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -375,6 +375,131 @@ alloc_texture(struct st_context *st, GLsizei width, 
GLsizei height,
 
 
 /**
+ * Search the cache for an image which matches the given parameters.
+ * \return  pipe_resource pointer if found, NULL if not found.
+ */
+static struct pipe_resource *
+search_drawpixels_cache(struct st_context *st,
+GLsizei width, GLsizei height,
+GLenum format, GLenum type,
+const struct gl_pixelstore_attrib *unpack,
+const void *pixels)
+{
+   struct pipe_resource *pt = NULL;
+   const GLint bpp = _mesa_bytes_per_pixel(format, type);
+   unsigned i;
+
+   if ((unpack->RowLength != 0 && unpack->RowLength != width) ||
+   unpack->SkipPixels != 0 ||
+   unpack->SkipRows != 0 ||
+   unpack->SwapBytes) {
+  /* we don't allow non-default pixel unpacking values */
+  return NULL;
+   }
+
+   /* Search cache entries for a match */
+   for (i = 0; i < ARRAY_SIZE(st->drawpix_cache.entries); i++) {
+  struct drawpix_cache_entry *entry = &st->drawpix_cache.entries[i];
+
+  if (width == entry->width &&
+  height == entry->height &&
+  format == entry->format &&
+  type == entry->type &&
+  pixels == entry->user_pointer &&
+  !_mesa_is_bufferobj(unpack->BufferObj) &&
+  entry->image) {
+ assert(entry->texture);
+
+ /* check if the pixel data is the same */
+ if (memcmp(pixels, entry->image, width * height * bpp) == 0) {
+/* Success - found a cache match */
+pipe_resource_reference(&pt, entry->texture);
+/* refcount of returned texture should be at least two here.  One
+ * reference for the cache to hold on to, one for the caller (which
+ * it will release), and possibly more held by the driver.
+ */
+assert(pt->reference.count >= 2);
+
+/* update the age of this entry */
+entry->age = ++st->drawpix_cache.age;
+
+return pt;
+ }
+  }
+   }
+
+   /* no cache match found */
+   return NULL;
+}
+
+
+/**
+ * Find the oldest entry in the glDrawPixels cache.  We'll replace this
+ * one when we need to store a new image.
+ */
+static struct drawpix_cache_entry *
+find_oldest_drawpixels_cache_entry(struct st_context *st)
+{
+   unsigned oldest_age = ~0u, oldest_index = ~0u;
+   unsigned i;
+
+   /* Find entry with oldest (lowest) age */
+   for (i = 0; i < ARRAY_SIZE(st->drawpix_cache.entries); i++) {
+  const struct drawpix_cache_entry *entry = &st->drawpix_cache.entries[i];
+  if (entry->age < oldest_age) {
+ oldest_age = entry->age;
+ oldest_index = i;
+  }
+   }
+
+   assert(oldest_age != ~0u);
+   assert(oldest_index != ~0u);
+
+   return &st->drawpix_cache.entries[oldest_index];
+}
+
+
+/**
+ * Try to save the given glDrawPixels image in the cache.
+ */
+static void
+cache_drawpixels_image(struct st_context *st,
+   GLsizei width, GLsizei height,
+   GLenum format, GLenum type,
+   const struct gl_pixelstore_attrib *unpack,
+   const void *pixels,
+   struct pipe_resource *pt)
+{
+   if ((unpack->RowLength == 0 || unpack->RowLength == width) &&
+   unpack->SkipPixels == 0 &&
+   unpack->SkipRows == 0) {
+  const GLint bpp = _mesa_bytes_per_pixel(format, type);
+  struct drawpix_cache_entry *entry =
+ find_oldest_drawpixels_cache_entry(st);
+  assert(entry);
+  entry->width = width;
+  entry->height = height;
+  entry->format = format;
+  entry->type = type;
+  entry->user_pointer = pixels;
+  free(entry->image);
+  entry->image = malloc(width * height * bpp);
+  if (entry->image) {
+ memcpy(entry->image, pixels, width * height * bpp);
+ pipe_resource_reference(&entry->texture, pt);
+ entry->age = ++st->drawpix_cache.age;
+  }
+  else {
+ /* out of memory, free/disable cached texture */
+ entry->width = 0;
+ entry->height = 0;
+ pipe_resource_reference(&entry->texture, NULL);
+  }
+   }
+}
+
+
+/**
  * Make texture containing an image for glDrawPixels image.
  * If 'pixels' is 

Re: [Mesa-dev] [Mesa-stable] [PATCH v2] configure.ac: add missing llvm dependencies to .pc files

2018-01-25 Thread Emil Velikov
On 25 January 2018 at 14:43, Chuck Atkins  wrote:
> v2: Only add as dependencies for gallium-osmesa and gallium-xlib
>
> CC: 
> Signed-off-by: Chuck Atkins 
Reviewed-by: Emil Velikov 

> ---
>  configure.ac | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/configure.ac b/configure.ac
> index 7c1fbe0ed1..448bd3a6ba 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2780,6 +2780,12 @@ if test "x$enable_llvm" = xyes; then
>  fi
>  fi
>  fi
> +if test "x$enable_glx" = xgallium-xlib; then
> +GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $LLVM_LIBS"
> +fi
> +if test "x$enable_gallium_osmesa" = xyes; then
> +OSMESA_PC_LIB_PRIV="$OSMESA_PC_LIB_PRIV $LLVM_LIBS"
> +fi
I'm itching to add a comment above these two, since Eric brought it up.
Modulo any objections I'll squash it before pushing.

The following two targets embed the swr/llvmpipe driver into the final binary.

Adding LLVM_LIBS propagates the LLVM libraries into the Libs.private of
the respective .pc file, which is used when statically linking the
respective targets into other projects.

-Emil


[Mesa-dev] [PATCH] radeonsi: Export signalled sync file instead of -1.

2018-01-25 Thread Bas Nieuwenhuizen
-1 is considered an error for EGL_ANDROID_native_fence_sync, so
we need to actually create a sync file.

Fixes: f536f45250 "radeonsi: implement sync_file import/export"
---
 src/gallium/drivers/radeon/radeon_winsys.h |  5 +
 src/gallium/drivers/radeonsi/si_fence.c|  2 ++
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c  | 23 +++
 3 files changed, 30 insertions(+)

diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index d1c761f4ee..307f8efaec 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -610,6 +610,11 @@ struct radeon_winsys {
 int (*fence_export_sync_file)(struct radeon_winsys *ws,
  struct pipe_fence_handle *fence);
 
+/**
+ * Return a sync file FD that is already signalled.
+ */
+int (*export_signalled_sync_file)(struct radeon_winsys *ws);
+
 /**
  * Initialize surface
  *
diff --git a/src/gallium/drivers/radeonsi/si_fence.c 
b/src/gallium/drivers/radeonsi/si_fence.c
index 5f320803aa..47d68dbc33 100644
--- a/src/gallium/drivers/radeonsi/si_fence.c
+++ b/src/gallium/drivers/radeonsi/si_fence.c
@@ -356,6 +356,8 @@ static int si_fence_get_fd(struct pipe_screen *screen,
 
/* If we don't have FDs at this point, it means we don't have fences
 * either. */
+   if (sdma_fd == -1 && gfx_fd == -1)
+   return ws->export_signalled_sync_file(ws);
if (sdma_fd == -1)
return gfx_fd;
if (gfx_fd == -1)
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
index 63cd63287f..b60574cfdd 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
@@ -114,6 +114,28 @@ static int amdgpu_fence_export_sync_file(struct 
radeon_winsys *rws,
return fd;
 }
 
+static int amdgpu_export_signalled_sync_file(struct radeon_winsys *rws)
+{
+   struct amdgpu_winsys *ws = amdgpu_winsys(rws);
+   uint32_t syncobj;
+   int fd = -1;
+
+   int r = amdgpu_cs_create_syncobj2(ws->dev, DRM_SYNCOBJ_CREATE_SIGNALED,
+ &syncobj);
+   if (r) {
+  return -1;
+   }
+
+   r = amdgpu_cs_syncobj_export_sync_file(ws->dev, syncobj, &fd);
+   if (r) {
+  fd = -1;
+   }
+
+   amdgpu_cs_destroy_syncobj(ws->dev, syncobj);
+   return fd;
+}
+
+
 static void amdgpu_fence_submitted(struct pipe_fence_handle *fence,
uint64_t seq_no,
uint64_t *user_fence_cpu_address)
@@ -1560,4 +1582,5 @@ void amdgpu_cs_init_functions(struct amdgpu_winsys *ws)
ws->base.fence_reference = amdgpu_fence_reference;
ws->base.fence_import_sync_file = amdgpu_fence_import_sync_file;
ws->base.fence_export_sync_file = amdgpu_fence_export_sync_file;
+   ws->base.export_signalled_sync_file = amdgpu_export_signalled_sync_file;
 }
-- 
2.16.0.rc1.238.g530d649a79-goog
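
For readers following along, the fd selection that the si_fence.c hunk above
changes can be sketched on its own. This is only an illustration under stated
assumptions: fence_ops_sketch, pick_fence_fd and the two fake_* helpers are
hypothetical names, and the merge step is reduced to a callback rather than
the real sync-file merge.

```c
#include <assert.h>

/* Hypothetical hooks standing in for the winsys; not the real Mesa API. */
struct fence_ops_sketch {
   int (*export_signalled)(void);    /* returns a valid, already-signalled fd */
   int (*merge)(int fd_a, int fd_b); /* returns an fd that waits for both */
};

/* Never hand -1 back to the caller just because there is nothing to wait on:
 * -1 is reserved for errors, so export an already-signalled fd instead.
 */
static int
pick_fence_fd(const struct fence_ops_sketch *ops, int sdma_fd, int gfx_fd)
{
   assert(ops && ops->export_signalled && ops->merge);

   if (sdma_fd == -1 && gfx_fd == -1)
      return ops->export_signalled();
   if (sdma_fd == -1)
      return gfx_fd;
   if (gfx_fd == -1)
      return sdma_fd;
   return ops->merge(sdma_fd, gfx_fd);
}

/* Test doubles with made-up return values, only for exercising the logic. */
static int fake_signalled(void) { return 100; }
static int fake_merge(int a, int b) { return a + b; }
```

The point of the sketch is the first branch: "no fences" and "error" are kept
distinguishable for the caller.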



Re: [Mesa-dev] [PATCH] st/mesa: expand glDrawPixels cache to handle multiple images

2018-01-25 Thread Brian Paul

On 01/24/2018 09:06 PM, Roland Scheidegger wrote:

Am 25.01.2018 um 00:19 schrieb Brian Paul:

The newest version of WSI Fusion makes several glDrawPixels calls
per frame.  By caching more than one image, we get better performance
when panning/zomming the map.

zooming


---
  src/mesa/state_tracker/st_cb_drawpixels.c | 192 +-
  src/mesa/state_tracker/st_context.c   |   4 -
  src/mesa/state_tracker/st_context.h   |  22 +++-
  3 files changed, 150 insertions(+), 68 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
b/src/mesa/state_tracker/st_cb_drawpixels.c
index 1d88976..2e4e89d 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -375,6 +375,127 @@ alloc_texture(struct st_context *st, GLsizei width, 
GLsizei height,
  
  
  /**

+ * Search the cache for an image which matches the given parameters.
+ * \return  pipe_resource pointer if found, NULL if not found.
+ */
+static struct pipe_resource *
+search_drawpixels_cache(struct st_context *st,
+GLsizei width, GLsizei height,
+GLenum format, GLenum type,
+const struct gl_pixelstore_attrib *unpack,
+const void *pixels)
+{
+   struct pipe_resource *pt = NULL;
+   const GLint bpp = _mesa_bytes_per_pixel(format, type);
+   unsigned i;
+
+   /* Search cache entries for a match */
+   for (i = 0; i < ARRAY_SIZE(st->drawpix_cache.entries); i++) {
+  struct drawpix_cache_entry *entry = &st->drawpix_cache.entries[i];
+
+  if (width == entry->width &&
+  height == entry->height &&
+  format == entry->format &&
+  type == entry->type &&
+  pixels == entry->user_pointer &&
+  !_mesa_is_bufferobj(unpack->BufferObj) &&
+  (unpack->RowLength == 0 || unpack->RowLength == width) &&
+  unpack->SkipPixels == 0 &&
+  unpack->SkipRows == 0 &&
+  unpack->SwapBytes == GL_FALSE &&

Maybe factor out all these unpack parameter (which don't change) into
their own var? Would make it more obvious which parameter you're
actually comparing in the entries. And if that combined unpack var isn't
true, you should probably skip the for loop in the first place.


Yeah, I'll lift those out of the loop.
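
A minimal sketch of what lifting the unpack checks out of the loop could look
like, assuming trimmed-down stand-in structs. unpack_sketch, entry_sketch and
search_cache_sketch are hypothetical names for this illustration and are not
the Mesa types; the format/type comparison is left out to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-ins for the Mesa structs. */
struct unpack_sketch {
   int row_length, skip_pixels, skip_rows;
   bool swap_bytes;
};

struct entry_sketch {
   int width, height;
   const void *user_pointer;   /* pointer the application passed in */
   const void *image;          /* cached copy of the pixel data */
};

/* Loop-invariant part: depends only on this call's unpack state. */
static bool
unpack_is_default(const struct unpack_sketch *u, int width)
{
   return (u->row_length == 0 || u->row_length == width) &&
          u->skip_pixels == 0 && u->skip_rows == 0 && !u->swap_bytes;
}

/* Returns the index of the matching entry, or -1 if there is none. */
static int
search_cache_sketch(const struct entry_sketch *entries, int count,
                    int width, int height, size_t bytes,
                    const struct unpack_sketch *unpack, const void *pixels)
{
   assert(entries && unpack && pixels);

   if (!unpack_is_default(unpack, width))
      return -1;   /* checked once, before the loop */

   for (int i = 0; i < count; i++) {
      const struct entry_sketch *e = &entries[i];
      if (e->width == width && e->height == height &&
          e->user_pointer == pixels && e->image &&
          memcmp(pixels, e->image, bytes) == 0)
         return i;
   }
   return -1;
}
```

Only per-entry fields are compared inside the loop, which also makes it more
obvious what an entry is being matched on.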






+  entry->image) {
+ assert(entry->texture);
+
+ /* check if the pixel data is the same */
+ if (memcmp(pixels, entry->image, width * height * bpp) == 0) {
+/* Success- found a cache match */

whitespace before -


+pipe_resource_reference(&pt, entry->texture);
+/* refcount of returned texture should be at least two here.  One
+ * reference for the cache to hold on to, one for the caller (which
+ * it will release), and possibly more held by the driver.
+ */
+assert(pt->reference.count >= 2);
+
+/* update the age of this entry */
+entry->age = ++st->drawpix_cache.age;
+
+return pt;
+ }
+  }
+   }
+
+   /* no cache match found */
+   return NULL;
+}
+
+
+/**
+ * Find the oldest entry in the glDrawPixels cache.  We'll replace this
+ * one when we need to store a new image.
+ */
+static struct drawpix_cache_entry *
+find_oldest_drawpixels_cache_entry(struct st_context *st)
+{
+   unsigned oldest_age = ~0u, oldest_index = ~0u;
+   unsigned i;
+
+   /* Find entry with oldest (lowest) age */
+   for (i = 0; i < ARRAY_SIZE(st->drawpix_cache.entries); i++) {
+  const struct drawpix_cache_entry *entry = &st->drawpix_cache.entries[i];
+  if (entry->age < oldest_age) {
+ oldest_age = entry->age;
+ oldest_index = i;
+  }
+   }
+
+   assert(oldest_age != ~0u);

Couldn't you hit that with 32bit wraparound of age? I think the logic
should be pretty safe against wraparound (would just not return the
oldest entry).


Yeah, I can drop that.  Though, even at 60 draws/second, it'd take over 
2 years to hit wrap-around. :)
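
The wrap-around estimate can be checked with a few lines of arithmetic. The
sketch below assumes the age counter is a 32-bit unsigned value bumped once
per draw and a 365-day year; years_until_wrap is a hypothetical helper, not
something from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Time for a 32-bit counter to wrap at a given increment rate, in years. */
static double
years_until_wrap(double increments_per_second)
{
   const double counter_values = (double)UINT32_MAX + 1.0;   /* 2^32 */
   const double seconds_per_year = 365.0 * 24.0 * 60.0 * 60.0;

   assert(increments_per_second > 0.0);
   return counter_values / increments_per_second / seconds_per_year;
}
```

At 60 increments per second this gives roughly 2.27 years. If the counter is
bumped more than once per draw, the time shrinks accordingly.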




Reviewed-by: Roland Scheidegger 


Thanks.  I'll post a v2.

-Brian


+   assert(oldest_index != ~0u);
+
+   return &st->drawpix_cache.entries[oldest_index];
+}
+
+
+/**
+ * Try to save the given glDrawPixels image in the cache.
+ */
+static void
+cache_drawpixels_image(struct st_context *st,
+   GLsizei width, GLsizei height,
+   GLenum format, GLenum type,
+   const struct gl_pixelstore_attrib *unpack,
+   const void *pixels,
+   struct pipe_resource *pt)
+{
+   if ((unpack->RowLength == 0 || unpack->RowLength == width) &&
+   unpack->SkipPixels == 0 &&
+   unpack->SkipRows == 0) {
+  const GLint bpp = _mesa_bytes_per_pixel(format, type);
+  struct drawpix_cache_entry *entry =
+ find_oldest_drawpixels_cache_entry(st);
+  assert(entry);
+  entry->width = width;
+  entry->height = 

Re: [Mesa-dev] [PATCH 2/3] anv/gen10: Ignore push constant packets during context restore.

2018-01-25 Thread Rafael Antognolli
On Wed, Jan 24, 2018 at 05:08:54PM -0800, Jason Ekstrand wrote:
> On Wed, Jan 24, 2018 at 4:33 PM, Rafael Antognolli wrote:
> 
> Similar to the GL driver, ignore 3DSTATE_CONSTANT_* packets when doing a
> context restore.
> 
> Signed-off-by: Rafael Antognolli 
> Cc: Jason Ekstrand 
> Cc: "18.0" 
> ---
>  src/intel/vulkan/anv_private.h |  1 +
>  src/intel/vulkan/genX_cmd_buffer.c | 47 ++
> 
>  2 files changed, 48 insertions(+)
> 
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> private.h
> index b351c6f63b3..a4c84d2c295 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -1458,6 +1458,7 @@ enum anv_pipe_bits {
> ANV_PIPE_CONSTANT_CACHE_INVALIDATE_BIT= (1 << 3),
> ANV_PIPE_VF_CACHE_INVALIDATE_BIT  = (1 << 4),
> ANV_PIPE_DATA_CACHE_FLUSH_BIT = (1 << 5),
> +   ANV_PIPE_ISP_DISABLE_BIT  = (1 << 9),
> 
> 
> Let's drop this if we're not going to use it.

OK.

> 
> ANV_PIPE_TEXTURE_CACHE_INVALIDATE_BIT = (1 << 10),
> ANV_PIPE_INSTRUCTION_CACHE_INVALIDATE_BIT = (1 << 11),
> ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT= (1 << 12),
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/
> genX_cmd_buffer.c
> index c23a54fb7b9..7028c1ce9df 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -1008,6 +1008,50 @@ genX(BeginCommandBuffer)(
> return result;
>  }
> 
> +/**
> + * From the PRM, Volume 2a:
> + *
> + *"Indirect State Pointers Disable
> + *
> + *At the completion of the post-sync operation associated with this
> pipe
> + *control packet, the indirect state pointers in the hardware are
> + *considered invalid; the indirect pointers are not saved in the
> context.
> + *If any new indirect state commands are executed in the command
> stream
> + *while the pipe control is pending, the new indirect state commands
> are
> + *preserved.
> + *
> + *[DevIVB+]: Using Invalidate State Pointer (ISP) only inhibits
> context
> + *restoring of Push Constant (3DSTATE_CONSTANT_*) commands. Push
> Constant
> + *commands are only considered as Indirect State Pointers. Once ISP 
> is
> + *issued in a context, SW must initialize by programming push 
> constant
> + *commands for all the shaders (at least to zero length) before
> attempting
> + *any rendering operation for the same context."
> + *
> + * 3DSTATE_CONSTANT_* packets are restored during a context restore,
> + * even though they point to a BO that has been already unreferenced at
> + * the end of the previous batch buffer. This has been fine so far since
> + * we are protected by the scratch page (every address not covered by
> + * a BO should be pointing to the scratch page). But on CNL, it is
> + * causing a GPU hang during context restore at the 3DSTATE_CONSTANT_*
> + * instruction.
> + *
> + * The flag "Indirect State Pointers Disable" in PIPE_CONTROL tells the
> + * hardware to ignore previous 3DSTATE_CONSTANT_* packets during a
> + * context restore, so the mentioned hang doesn't happen. However,
> + * software must program push constant commands for all stages prior to
> + * rendering anything, so we flag them as dirty.
> 
> 
> And... The next command buffer won't.  I just looked at it and we won't set up
> push constants again until we use them.  We could either set 
> 3DSTATE_CONSTANT_*
> instead or we can make sure that push constants are flagged as dirty in
> BeginCommandBuffer.

Oh, I understood that anv was always sending them at every command
buffer. OK, will check this again.

> + */
> +static void
> +emit_isp_disable(struct anv_cmd_buffer *cmd_buffer)
> +{
> +   anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
> + pc.IndirectStatePointersDisable = true;
> + pc.PostSyncOperation = WriteImmediateData;
> + pc.Address   =
> +(struct anv_address) { &cmd_buffer->device->workaround_bo, 0 };
> 
> 
> Is the W/A BO write needed?

That's what I understood from

"At the completion of the post-sync operation associated with this pipe
control packet..."

> 
> +   }
> +}
> +
>  VkResult
>  genX(EndCommandBuffer)(
>  VkCommandBuffer commandBuffer)
> @@ -1024,6 +1068,9 @@ genX(EndCommandBuffer)(
> 
> genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
> 
> +   if (GEN_GEN == 10)
> +  emit_isp_disable(cmd_buffer);
> +
> 

Re: [Mesa-dev] [PATCH] mesa: add missing RGB9_E5 format in _mesa_base_fbo_format

2018-01-25 Thread Michel Dänzer
On 2018-01-24 05:38 PM, Juan A. Suarez Romero wrote:
> This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5.
> ---
>  src/mesa/main/fbobject.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
> index d23916d1ad7..c72204e11a0 100644
> --- a/src/mesa/main/fbobject.c
> +++ b/src/mesa/main/fbobject.c
> @@ -1976,6 +1976,9 @@ _mesa_base_fbo_format(const struct gl_context *ctx, 
> GLenum internalFormat)
> ctx->Extensions.ARB_texture_float) ||
>_mesa_is_gles3(ctx) /* EXT_color_buffer_float */ )
>   ? GL_RGBA : 0;
> +   case GL_RGB9_E5:
> +  return (_mesa_is_desktop_gl(ctx) && 
> ctx->Extensions.EXT_texture_shared_exponent)
> + ? GL_RGB: 0;
> case GL_ALPHA16F_ARB:
> case GL_ALPHA32F_ARB:
>return ctx->API == API_OPENGL_COMPAT &&
> 

Unfortunately, this broke the "spec@arb_internalformat_query2@samples
and num_sample_counts pname checks" piglit tests with radeonsi and
llvmpipe, see below.

Any idea what might need to be done in Gallium to fix this?


32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), 
supported=1
32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), 
supported=1
64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
PIGLIT: {"subtest": {"GL_NUM_SAMPLE_COUNTS" : "fail"}}
32 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, 
internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1
32 bit failing case: pname = GL_SAMPLES, target = 
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
32 bit failing case: pname = GL_SAMPLES, target = 
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
64 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, 
internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1
64 bit failing case: pname = GL_SAMPLES, target = 
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
64 bit failing case: pname = GL_SAMPLES, target = 
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
PIGLIT: {"subtest": {"GL_SAMPLES" : "fail"}}
PIGLIT: {"result": "fail" }


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [PATCH] mesa: Correctly print glTexImage dimensions

2018-01-25 Thread Elie Tournier
texture_format_error_check_gles() displays errors like "glTexImage%dD".
This patch replaces the %d with the correct dimension.

Signed-off-by: Elie Tournier 
---
 src/mesa/main/teximage.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index e5f8bb0718..cc329e6410 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -1787,7 +1787,6 @@ texture_formats_agree(GLenum internalFormat,
  * \param format pixel data format given by the user.
  * \param type pixel data type given by the user.
  * \param internalFormat internal format given by the user.
- * \param dimensions texture image dimensions (must be 1, 2 or 3).
  * \param callerName name of the caller function to print in the error message
  *
  * \return true if a error is found, false otherwise
@@ -1796,8 +1795,7 @@ texture_formats_agree(GLenum internalFormat,
  */
 static bool
 texture_format_error_check_gles(struct gl_context *ctx, GLenum format,
-GLenum type, GLenum internalFormat,
-GLuint dimensions, const char *callerName)
+GLenum type, GLenum internalFormat, const char 
*callerName)
 {
GLenum err = _mesa_es3_error_check_format_and_type(ctx, format, type,
   internalFormat);
@@ -1911,9 +1909,11 @@ texture_error_check( struct gl_context *ctx,
 * Formats and types that require additional extensions (e.g., GL_FLOAT
 * requires GL_OES_texture_float) are filtered elsewhere.
 */
+   char bufCallerName[20];
+   snprintf(bufCallerName, 20, "glTexImage%dD", dimensions);
if (_mesa_is_gles(ctx) &&
-   texture_format_error_check_gles(ctx, format, type, internalFormat,
-   dimensions, "glTexImage%dD")) {
+   texture_format_error_check_gles(ctx, format, type,
+   internalFormat, bufCallerName)) {
   return GL_TRUE;
}
 
@@ -2234,8 +2234,7 @@ texsubimage_error_check(struct gl_context *ctx, GLuint 
dimensions,
 */
if (_mesa_is_gles(ctx) &&
texture_format_error_check_gles(ctx, format, type,
-   internalFormat,
-   dimensions, callerName)) {
+   internalFormat, callerName)) {
   return GL_TRUE;
}
 
-- 
2.16.1
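
As a standalone illustration of the approach in the texture_error_check()
hunk, the caller name can be formatted before it is handed to the error
helper. format_teximage_name is a hypothetical helper for this sketch, not a
function in teximage.c, and passing the buffer size explicitly is a choice
made here rather than what the patch does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "glTexImage1D" .. "glTexImage3D" into buf.
 * Returns what snprintf() returns: the length the full string needs,
 * excluding the terminating NUL.
 */
static int
format_teximage_name(char *buf, size_t size, unsigned dimensions)
{
   assert(dimensions >= 1 && dimensions <= 3);
   return snprintf(buf, size, "glTexImage%uD", dimensions);
}
```

Calling it with sizeof(buf) instead of a repeated literal keeps the size and
the declaration from drifting apart.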



Re: [Mesa-dev] [PATCH] meson: fix some more defines meson.build

2018-01-25 Thread Eric Engestrom
On Thursday, 2018-01-25 10:31:12 +0100, Marc Dietrich wrote:
> Am Donnerstag, 25. Januar 2018, 10:28:26 CET schrieb Marc Dietrich:
> > Am Donnerstag, 25. Januar 2018, 10:18:16 CET schrieb Eric Engestrom:
> > > On Wednesday, 2018-01-24 22:02:42 +0100, Marc Dietrich wrote:
> > > > Btw, there is still some strange problem in PACKAGE_BUGREPORT as it
> > > > includes a "$" in the url. I don't know where this comes from.
> > > 
> > > Where do you see this "$"?
> > > I've looked at the code and it looks all good to me.
> > 
> > yes, code is fine, output is not (see build.ninja):
> > 
> > '-DPACKAGE_BUGREPORT="https$://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"'
> > 
> > maybe some escaping required?
> > 
> > Marc
> 
> arr, I just checked the resulting binary and it seems to be ok there, so false
> alarm. Still puzzling where it came from and where it went to.

This is some ninja-specific escaping:
https://ninja-build.org/manual.html#_lexical_syntax
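
To show why the "$" disappears again, here is a simplified sketch of decoding
the "$"-escapes described in that section of the ninja manual ("$$", "$:" and
"$ "). ninja_unescape_sketch is a hypothetical helper for illustration; it is
not ninja's or meson's code, and it does not expand variable references such
as $name or ${name}.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Decodes "$$" -> "$", "$:" -> ":" and "$ " -> " "; everything else is
 * copied unchanged. Returns the decoded length, or (size_t)-1 if the
 * output buffer is too small.
 */
static size_t
ninja_unescape_sketch(const char *in, char *out, size_t out_size)
{
   size_t n = 0;

   if (out_size == 0)
      return (size_t)-1;

   for (size_t i = 0; in[i] != '\0'; i++) {
      char c = in[i];
      if (c == '$' &&
          (in[i + 1] == '$' || in[i + 1] == ':' || in[i + 1] == ' '))
         c = in[++i];          /* drop the '$', keep the escaped character */
      if (n + 1 >= out_size)
         return (size_t)-1;    /* no room for this char plus the NUL */
      out[n++] = c;
   }
   out[n] = '\0';
   return n;
}
```

So "https$://" in build.ninja reads back as "https://" once ninja has parsed
the file, which matches the binary looking fine.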

> 
> Marc
> 




Re: [Mesa-dev] [PATCH] mesa: simplify _mesa_delete_list() a bit, add some assertions

2018-01-25 Thread Brian Paul

On 01/24/2018 09:41 PM, Roland Scheidegger wrote:

Am 25.01.2018 um 00:19 schrieb Brian Paul:

All but two cases of the switch did the same n += InstSize[n[0].opcode]
instruction.  Just move it after the switch.

Add some sanity check assertions.
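
The shape of the change can be sketched independently of dlist.c: each case
only frees its payload, and the pointer advance happens once after the switch.
This is a toy model under stated assumptions: the opcodes, node_sketch and
inst_size table are made up, and treating the end-of-list opcode as the case
that does not advance is an assumption for the example.

```c
#include <assert.h>
#include <stdlib.h>

enum { OP_NOP, OP_BLOB, OP_END, OP_COUNT };

/* Toy instruction node: either an opcode or a malloc'd payload pointer. */
union node_sketch {
   int opcode;
   void *data;
};

/* Size of each instruction in nodes, indexed by opcode. */
static const unsigned inst_size[OP_COUNT] = {
   [OP_NOP] = 1, [OP_BLOB] = 2, [OP_END] = 1
};

/* Walks the list, frees payloads, returns the number of instructions seen. */
static unsigned
delete_list_sketch(union node_sketch *n)
{
   unsigned visited = 0;

   for (;;) {
      const int opcode = n[0].opcode;
      assert(opcode >= 0 && opcode < OP_COUNT);
      visited++;

      switch (opcode) {
      case OP_BLOB:
         free(n[1].data);   /* only the per-opcode cleanup lives here */
         break;
      case OP_END:
         return visited;    /* the one case that must not advance */
      default:
         break;
      }

      /* common advance, done once for every non-terminating opcode */
      assert(inst_size[opcode] > 0);
      n += inst_size[opcode];
   }
}
```

The assertion on inst_size guards against an opcode with a missing table
entry, which would otherwise loop forever.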
---
  src/mesa/main/dlist.c | 39 +++
  1 file changed, 11 insertions(+), 28 deletions(-)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index a6b212e..7b8e0f6 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -961,79 +961,60 @@ _mesa_delete_list(struct gl_context *ctx, struct 
gl_display_list *dlist)
  /* for some commands, we need to free malloc'd memory */
   case OPCODE_MAP1:
  free(get_pointer(&n[6]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_MAP2:
  free(get_pointer(&n[10]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_CALL_LISTS:
  free(get_pointer(&n[3]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_DRAW_PIXELS:
  free(get_pointer(&n[5]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_BITMAP:
  free(get_pointer(&n[7]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_POLYGON_STIPPLE:
  free(get_pointer(&n[1]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_TEX_IMAGE1D:
  free(get_pointer(&n[8]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_TEX_IMAGE2D:
  free(get_pointer(&n[9]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_TEX_IMAGE3D:
  free(get_pointer(&n[10]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_TEX_SUB_IMAGE1D:
  free(get_pointer(&n[7]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_TEX_SUB_IMAGE2D:
  free(get_pointer(&n[9]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_TEX_SUB_IMAGE3D:
  free(get_pointer(&n[11]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_COMPRESSED_TEX_IMAGE_1D:
  free(get_pointer(&n[7]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_COMPRESSED_TEX_IMAGE_2D:
  free(get_pointer(&n[8]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_COMPRESSED_TEX_IMAGE_3D:
  free(get_pointer(&n[9]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_COMPRESSED_TEX_SUB_IMAGE_1D:
  free(get_pointer(&n[7]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_COMPRESSED_TEX_SUB_IMAGE_2D:
  free(get_pointer(&n[9]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_COMPRESSED_TEX_SUB_IMAGE_3D:
  free(get_pointer(&n[11]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_PROGRAM_STRING_ARB:
  free(get_pointer(&n[4]));  /* program string */
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_UNIFORM_1FV:
   case OPCODE_UNIFORM_2FV:
@@ -1048,7 +1029,6 @@ _mesa_delete_list(struct gl_context *ctx, struct 
gl_display_list *dlist)
   case OPCODE_UNIFORM_3UIV:
   case OPCODE_UNIFORM_4UIV:
  free(get_pointer(&n[3]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_UNIFORM_MATRIX22:
   case OPCODE_UNIFORM_MATRIX33:
@@ -1060,7 +1040,6 @@ _mesa_delete_list(struct gl_context *ctx, struct 
gl_display_list *dlist)
   case OPCODE_UNIFORM_MATRIX34:
   case OPCODE_UNIFORM_MATRIX43:
  free(get_pointer(&n[4]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_PROGRAM_UNIFORM_1FV:
   case OPCODE_PROGRAM_UNIFORM_2FV:
@@ -1075,7 +1054,6 @@ _mesa_delete_list(struct gl_context *ctx, struct 
gl_display_list *dlist)
   case OPCODE_PROGRAM_UNIFORM_3UIV:
   case OPCODE_PROGRAM_UNIFORM_4UIV:
  free(get_pointer(&n[4]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_PROGRAM_UNIFORM_MATRIX22F:
   case OPCODE_PROGRAM_UNIFORM_MATRIX33F:
@@ -1087,15 +1065,12 @@ _mesa_delete_list(struct gl_context *ctx, struct 
gl_display_list *dlist)
   case OPCODE_PROGRAM_UNIFORM_MATRIX34F:
   case OPCODE_PROGRAM_UNIFORM_MATRIX43F:
  free(get_pointer(&n[5]));
-n += InstSize[n[0].opcode];
  break;
   case OPCODE_PIXEL_MAP:
  free(get_pointer(&n[3]));
-n += InstSize[n[0].opcode];
  break;

Re: [Mesa-dev] [Mesa-stable] [PATCH] util/build-id: Fix address comparison for binaries with LOAD vaddr > 0

2018-01-25 Thread Stephan Gerhold
On Thu, Jan 25, 2018 at 11:22:10AM +, Emil Velikov wrote:
> On 24 January 2018 at 14:13, Stephan Gerhold  wrote:
> > build_id_find_nhdr_for_addr() fails to find the build-id if the first LOAD
> > segment has a virtual address other than 0x0.
> >
> > For most shared libraries, the first LOAD segment has vaddr=0x0:
> >
> > Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
> > LOAD   0x00 0x 0x 0x2d2e26 0x2d2e26 R E 
> > 0x1000
> > LOAD   0x2d2e54 0x002d3e54 0x002d3e54 0x2e248 0x2f148 RW  0x1000
> >
> > However, compiling the Intel Vulkan driver as 32-bit binary on Android 
> > produces
> > the following ELF header with vaddr=0x8000 instead:
> >
> > Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
> > PHDR   0x34 0x8034 0x8034 0x00100 0x00100 R   0x4
> > LOAD   0x00 0x8000 0x8000 0x224a04 0x224a04 R E 
> > 0x1000
> > LOAD   0x225710 0x0022e710 0x0022e710 0x25988 0x27364 RW  0x1000
> >
> > build_id_find_nhdr_callback() compares the address of dli_fbase from 
> > dladdr()
> > and dlpi_addr from dl_iterate_phdr(). With vaddr > 0, these point to a
> > different memory address, e.g.:
> >
> > dli_fbase=0xd8395000 (offset 0x8000)
> > dlpi_addr=0xd838d000
> >
> > At least on glibc and bionic (Android) dli_fbase refers to the address where
> > the shared object is mapped into the process space, whereas dlpi_addr is 
> > just
> > the base address for the vaddrs declared in the ELF header.
> >
> > To compare them correctly, we need to calculate the start of the mapping
> > by adding the vaddr of the first LOAD segment to the base address.
> >
> > Cc: Chad Versace 
> > Cc: Emil Velikov 
> > Cc: Tapani Pälli 
> > Cc: 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104642
> > Fixes: 5c98d38 "util: Query build-id by symbol address, not library name"
> > ---
> Based on my observation of glibc code and reading of the spec, I think
> this is correct.
> Admittedly the man page could be improved.
> 
> FWIW I've poked the #musl people about this change last night, and
> haven't heard any feedback yet.
> Be that about a) our understanding of how it should work or b) musl's
> implementation on the topic.

I found a related discussion about the implementation of dli_fbase on the
musl mailing list[1]. The FreeBSD man page for dladdr()[2] linked in the
message on the musl mailing list is a bit more specific about dli_fbase:

"The base address at which the shared object is mapped into the
address space of the calling process."

... which is - at least as far as I understand it - exactly how glibc and
bionic behave and the reason why we need this patch for LOAD vaddrs != 0.

However, from what I've noticed when testing with musl, they seem to handle
it differently from glibc, bionic and the FreeBSD man page: musl always returns
the base address without the offset where the shared object is mapped.

Technically, this means that this patch will break on musl in the rare
situation that you actually link a shared library with LOAD vaddr != 0.
However, considering that only they seem to handle it differently, this might
be worth reporting to them instead?

[1]: http://www.openwall.com/lists/musl/2013/01/16/10
[2]: https://www.unix.com/man-page/FreeBSD/3/dladdr/
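
To make the address arithmetic concrete, here is a minimal C sketch. It is an
illustration only, not the patch itself: sketch_phdr and sketch_map_start()
are hypothetical stand-ins for the program headers that dl_iterate_phdr()
reports, and the sketch assumes the glibc/bionic behaviour described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one ELF program header entry. */
struct sketch_phdr {
   uint32_t type;    /* segment type, e.g. SKETCH_PT_LOAD */
   uintptr_t vaddr;  /* p_vaddr as declared in the ELF file */
};

#define SKETCH_PT_LOAD 1u /* same numeric value as PT_LOAD in <elf.h> */
#define SKETCH_PT_PHDR 6u /* same numeric value as PT_PHDR in <elf.h> */

/* Start of the mapping = base address + vaddr of the first LOAD segment.
 * With a first LOAD vaddr of 0 this is simply the base address. */
static uintptr_t
sketch_map_start(uintptr_t base, const struct sketch_phdr *phdr, size_t count)
{
   assert(phdr != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (phdr[i].type == SKETCH_PT_LOAD)
         return base + phdr[i].vaddr;
   }
   return base; /* no LOAD segment found: fall back to the base address */
}
```

With the values from the commit message (dlpi_addr=0xd838d000, first LOAD
vaddr=0x8000) this gives 0xd8395000, which is the dli_fbase shown there.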

> Patch looks sensible, although input from Chad/Matt would be appreciated.
> Reviewed-by: Emil Velikov 
> 
> -Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radv: emit a cache flush before enabling predication

2018-01-25 Thread Matthew Nicholls
Otherwise cache flushes could get conditionally disabled while still clearing
the flush_bits, and thus flushes due to application pipeline barriers may never
get executed.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/amd/vulkan/radv_meta_fast_clear.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/amd/vulkan/radv_meta_fast_clear.c b/src/amd/vulkan/radv_meta_fast_clear.c
index fdeeaeedbf..f4353fd889 100644
--- a/src/amd/vulkan/radv_meta_fast_clear.c
+++ b/src/amd/vulkan/radv_meta_fast_clear.c
@@ -602,6 +602,8 @@ radv_emit_color_decompress(struct radv_cmd_buffer *cmd_buffer,
}
 
if (!decompress_dcc && image->surface.dcc_size) {
+   si_emit_cache_flush(cmd_buffer);
+
radv_emit_set_predication_state_from_image(cmd_buffer, image, true);
cmd_buffer->state.predicating = true;
}
-- 
2.13.6
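
The ordering problem described in the commit message can be modelled with a
small C sketch. This is a toy model under stated assumptions, not RADV code:
toy_cmd_buffer and its fields are hypothetical, and the model simply assumes
that a flush emitted while predication is enabled may be skipped by the GPU
while the driver still clears its pending bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command buffer state for this sketch only. */
struct toy_cmd_buffer {
   unsigned flush_bits;     /* flushes requested but not yet emitted */
   unsigned executed_bits;  /* flushes the GPU actually performed */
   bool predicating;        /* later packets may be skipped by the GPU */
   bool predicate_passes;   /* would the GPU run predicated packets? */
};

/* Emit the pending flushes. The pending bits are cleared in every case,
 * but the GPU only performs the flush if the packet is not skipped. */
static void
toy_emit_cache_flush(struct toy_cmd_buffer *cb)
{
   assert(cb != NULL);

   if (!cb->predicating || cb->predicate_passes)
      cb->executed_bits |= cb->flush_bits;
   cb->flush_bits = 0;
}

/* Enable predication, optionally flushing first as the patch does. */
static void
toy_enable_predication(struct toy_cmd_buffer *cb, bool flush_first)
{
   assert(cb != NULL);

   if (flush_first)
      toy_emit_cache_flush(cb);
   cb->predicating = true;
}
```

In this model, flushing after predication is enabled loses the barrier when
the predicate fails, whereas flushing first always executes it.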



Re: [Mesa-dev] [PATCH] gallivm: fix crash with seamless cube filtering with different min/mag filter

2018-01-25 Thread Jose Fonseca

Looks great.

Reviewed-by: Jose Fonseca 


On 25/01/18 03:33, srol...@vmware.com wrote:

From: Roland Scheidegger 

We are not allowed to modify the incoming coords values, or things may
crash (as we may be inside a llvm conditional and the values may be used
in another branch).
I recently broke this when fixing an issue with NaNs and seamless cube
map filtering, and it causes crashes when doing cubemap filtering
if the min and mag filters are different.
Add const to the pointers passed in to prevent this mishap in the future.

Fixes: a485ad0bcd ("gallivm: fix an issue with NaNs with seamless cube filtering")
---
  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 38 +--
  1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
index ff8cbf6..8f760f5 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
@@ -857,7 +857,7 @@ lp_build_sample_image_nearest(struct lp_build_sample_context *bld,
LLVMValueRef img_stride_vec,
LLVMValueRef data_ptr,
LLVMValueRef mipoffsets,
-  LLVMValueRef *coords,
+  const LLVMValueRef *coords,
const LLVMValueRef *offsets,
LLVMValueRef colors_out[4])
  {
@@ -1004,7 +1004,7 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
   LLVMValueRef img_stride_vec,
   LLVMValueRef data_ptr,
   LLVMValueRef mipoffsets,
- LLVMValueRef *coords,
+ const LLVMValueRef *coords,
   const LLVMValueRef *offsets,
   LLVMValueRef colors_out[4])
  {
@@ -1106,7 +1106,7 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
struct lp_build_if_state edge_if;
LLVMTypeRef int1t;
LLVMValueRef new_faces[4], new_xcoords[4][2], new_ycoords[4][2];
-  LLVMValueRef coord, have_edge, have_corner;
+  LLVMValueRef coord0, coord1, have_edge, have_corner;
LLVMValueRef fall_off_ym_notxm, fall_off_ym_notxp, fall_off_x, fall_off_y;
LLVMValueRef fall_off_yp_notxm, fall_off_yp_notxp;
LLVMValueRef x0, x1, y0, y1, y0_clamped, y1_clamped;
@@ -1130,20 +1130,20 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
 * other values might be bogus in the end too).
 * So kill off the NaNs here.
 */
-  coords[0] = lp_build_max_ext(coord_bld, coords[0], coord_bld->zero,
-   GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
-  coords[1] = lp_build_max_ext(coord_bld, coords[1], coord_bld->zero,
-   GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
-  coord = lp_build_mul(coord_bld, coords[0], flt_width_vec);
+  coord0 = lp_build_max_ext(coord_bld, coords[0], coord_bld->zero,
+GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
+  coord0 = lp_build_mul(coord_bld, coord0, flt_width_vec);
/* instead of clamp, build mask if overflowed */
-  coord = lp_build_sub(coord_bld, coord, half);
+  coord0 = lp_build_sub(coord_bld, coord0, half);
/* convert to int, compute lerp weight */
/* not ideal with AVX (and no AVX2) */
-  lp_build_ifloor_fract(coord_bld, coord, &x0, &s_fpart);
+  lp_build_ifloor_fract(coord_bld, coord0, &x0, &s_fpart);
x1 = lp_build_add(ivec_bld, x0, ivec_bld->one);
-  coord = lp_build_mul(coord_bld, coords[1], flt_height_vec);
-  coord = lp_build_sub(coord_bld, coord, half);
-  lp_build_ifloor_fract(coord_bld, coord, &y0, &t_fpart);
+  coord1 = lp_build_max_ext(coord_bld, coords[1], coord_bld->zero,
+GALLIVM_NAN_RETURN_OTHER_SECOND_NONNAN);
+  coord1 = lp_build_mul(coord_bld, coord1, flt_height_vec);
+  coord1 = lp_build_sub(coord_bld, coord1, half);
+  lp_build_ifloor_fract(coord_bld, coord1, &y0, &t_fpart);
y1 = lp_build_add(ivec_bld, y0, ivec_bld->one);
  
fall_off[0] = lp_build_cmp(ivec_bld, PIPE_FUNC_LESS, x0, ivec_bld->zero);

@@ -1747,7 +1747,7 @@ lp_build_sample_mipmap(struct lp_build_sample_context *bld,
 unsigned img_filter,
 unsigned mip_filter,
 boolean is_gather,
-   LLVMValueRef *coords,
+   const LLVMValueRef *coords,
 const LLVMValueRef *offsets,
 LLVMValueRef ilevel0,
 LLVMValueRef ilevel1,
@@ -1820,6 +1820,7 @@ lp_build_sample_mipmap(struct lp_build_sample_context 
*bld,
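
Illustration only, not part of the patch above: a minimal C sketch of the
idea behind the fix, namely taking the coordinates through a const pointer
and doing the clamp in a local so the caller's values stay intact.
toy_scale_coord() is a hypothetical helper that works on plain floats
instead of LLVM values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: scale coordinate `idx` to texel space.
 * `coords` is const, so the clamp has to go into a local and the
 * caller's array stays untouched for any other code path. */
static float
toy_scale_coord(const float *coords, size_t idx, float size)
{
   assert(coords != NULL);

   /* max(coord, 0) written so that a NaN input yields 0: the comparison
    * is false for NaN, so the second operand is picked. */
   float coord = coords[idx] > 0.0f ? coords[idx] : 0.0f;

   coord = coord * size;  /* normalized -> texel units */
   coord = coord - 0.5f;  /* offset by half a texel */
   return coord;
}
```

Because the parameter is const-qualified, an accidental write such as
coords[0] = ... inside the helper is rejected by the compiler.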
  

[Mesa-dev] [PATCH 1/2] radv: fix a GPU hang with RADV_DEBUG=syncshaders

2018-01-25 Thread Samuel Pitoiset
The GPU hangs when the driver forces a PS_PARTIAL_FLUSH after
a dispatch call (and vice versa for graphics). Something has
changed in the kernel driver because it used to work.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 6d512c6070a..ba5fd92f2a1 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -429,15 +429,14 @@ void radv_cmd_buffer_trace_emit(struct radv_cmd_buffer *cmd_buffer)
 }
 
 static void
-radv_cmd_buffer_after_draw(struct radv_cmd_buffer *cmd_buffer)
+radv_cmd_buffer_after_draw(struct radv_cmd_buffer *cmd_buffer,
+  enum radv_cmd_flush_bits flags)
 {
if (cmd_buffer->device->instance->debug_flags & RADV_DEBUG_SYNC_SHADERS) {
-   enum radv_cmd_flush_bits flags;
-
-   /* Force wait for graphics/compute engines to be idle. */
-   flags = RADV_CMD_FLAG_PS_PARTIAL_FLUSH |
-   RADV_CMD_FLAG_CS_PARTIAL_FLUSH;
+   assert(flags & (RADV_CMD_FLAG_PS_PARTIAL_FLUSH |
+   RADV_CMD_FLAG_CS_PARTIAL_FLUSH));
 
+   /* Force wait for graphics or compute engines to be idle. */
si_cs_emit_cache_flush(cmd_buffer->cs, false,
   cmd_buffer->device->physical_device->rad_info.chip_class,
   NULL, 0,
@@ -3501,7 +3500,7 @@ radv_draw(struct radv_cmd_buffer *cmd_buffer,
}
 
assert(cmd_buffer->cs->cdw <= cdw_max);
-   radv_cmd_buffer_after_draw(cmd_buffer);
+   radv_cmd_buffer_after_draw(cmd_buffer, RADV_CMD_FLAG_PS_PARTIAL_FLUSH);
 }
 
 void radv_CmdDraw(
@@ -3821,7 +3820,7 @@ radv_dispatch(struct radv_cmd_buffer *cmd_buffer,
radv_emit_dispatch_packets(cmd_buffer, info);
}
 
-   radv_cmd_buffer_after_draw(cmd_buffer);
+   radv_cmd_buffer_after_draw(cmd_buffer, RADV_CMD_FLAG_CS_PARTIAL_FLUSH);
 }
 
 void radv_CmdDispatch(
-- 
2.16.1



[Mesa-dev] [PATCH 2/2] radv: fix RADV_DEBUG=syncshaders on GFX9

2018-01-25 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index ba5fd92f2a1..b694174de68 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -433,13 +433,22 @@ radv_cmd_buffer_after_draw(struct radv_cmd_buffer *cmd_buffer,
   enum radv_cmd_flush_bits flags)
 {
if (cmd_buffer->device->instance->debug_flags & RADV_DEBUG_SYNC_SHADERS) {
+   uint32_t *ptr = NULL;
+   uint64_t va = 0;
+
assert(flags & (RADV_CMD_FLAG_PS_PARTIAL_FLUSH |
RADV_CMD_FLAG_CS_PARTIAL_FLUSH));
 
+   if (cmd_buffer->device->physical_device->rad_info.chip_class == GFX9) {
+   va = radv_buffer_get_va(cmd_buffer->gfx9_fence_bo) +
+cmd_buffer->gfx9_fence_offset;
+   ptr = &cmd_buffer->gfx9_fence_idx;
+   }
+
/* Force wait for graphics or compute engines to be idle. */
si_cs_emit_cache_flush(cmd_buffer->cs, false,
   cmd_buffer->device->physical_device->rad_info.chip_class,
-  NULL, 0,
+  ptr, va,
   radv_cmd_buffer_uses_mec(cmd_buffer),
   flags);
}
-- 
2.16.1



[Mesa-dev] [PATCH v2] configure.ac: add missing llvm dependencies to .pc files

2018-01-25 Thread Chuck Atkins
v2: Only add as dependencies for gallium-osmesa and gallium-xlib

CC: 
Signed-off-by: Chuck Atkins 
---
 configure.ac | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/configure.ac b/configure.ac
index 7c1fbe0ed1..448bd3a6ba 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2780,6 +2780,12 @@ if test "x$enable_llvm" = xyes; then
 fi
 fi
 fi
+if test "x$enable_glx" = xgallium-xlib; then
+GL_PC_LIB_PRIV="$GL_PC_LIB_PRIV $LLVM_LIBS"
+fi
+if test "x$enable_gallium_osmesa" = xyes; then
+OSMESA_PC_LIB_PRIV="$OSMESA_PC_LIB_PRIV $LLVM_LIBS"
+fi
 fi
 
 AM_CONDITIONAL(HAVE_GALLIUM_SVGA, test "x$HAVE_GALLIUM_SVGA" = xyes)
-- 
2.14.3



Re: [Mesa-dev] [PATCH 4/7] freedreno: a2xx: Support TEXTURE_RECT

2018-01-25 Thread Wladimir J. van der Laan
On Thu, Jan 25, 2018 at 08:41:11AM -0500, Ilia Mirkin wrote:
> Should you also expose PIPE_CAP_TEXTURE_RECTANGLE? (Or whatever it's
> called... I forget.)

Yes, good point, will add that.

Wladimir


Re: [Mesa-dev] [Freedreno] [PATCH 1/7] freedreno: a2xx: Update rnndb header

2018-01-25 Thread Wladimir J. van der Laan
On Thu, Jan 25, 2018 at 08:40:00AM -0500, Ilia Mirkin wrote:
> On Thu, Jan 25, 2018 at 8:29 AM, Wladimir J. van der Laan
>  wrote:
> > Also update BLEND_ to BLEND2_ opcodes to accommodate.
> 
> Are you saying this doesn't compile right now? I would have expected
> the accompanying change to a2xx.xml.h for that. Perhaps this landed
> into the wrong commit?

There used to be a rename from BLEND_ to BLEND2_ here; it probably made it in
with an earlier patch? It does compile as it is now, but I think the change is
correct:

BLEND_* is a3xx_rb_blend_opcode
BLEND2_* is a2xx_rb_blend_opcode

However, it happens that BLEND2_DST_PLUS_SRC and BLEND_DST_PLUS_SRC have the
same value, so it's a nop either way.

> Also it's odd that the formats are so different than originally
> entered. Any opinion on how that happened?

I do not know where the original values come from - mine come from the yamato
register headers that are part of the amd-gpu kernel driver
(see freedreno envytools commit 1b32c444f82cd7144d71602106462f59f146c1d0,
and also:
https://github.com/jaketesler/UDOO_Kernel/blob/master/drivers/mxc/amd-gpu/include/reg/yamato/22/yamato_enum.h#L1799 )

I've checked on a20x that for example ETC1 ones check out, but obviously not
every single one of them.

Regards,
Wladimir

> 
> >
> > Signed-off-by: Wladimir J. van der Laan 
> > ---
> >  src/gallium/drivers/freedreno/a2xx/a2xx.xml.h | 33 
> > +++
> >  src/gallium/drivers/freedreno/a2xx/fd2_gmem.c |  4 ++--
> >  2 files changed, 15 insertions(+), 22 deletions(-)
> >
> > diff --git a/src/gallium/drivers/freedreno/a2xx/a2xx.xml.h b/src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
> > index 55a4355..279a652 100644
> > --- a/src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
> > +++ b/src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
> > @@ -84,13 +84,12 @@ enum a2xx_sq_surfaceformat {
> > FMT_5_5_5_1 = 13,
> > FMT_8_8_8_8_A = 14,
> > FMT_4_4_4_4 = 15,
> > -   FMT_10_11_11 = 16,
> > -   FMT_11_11_10 = 17,
> > +   FMT_8_8_8 = 16,
> > FMT_DXT1 = 18,
> > FMT_DXT2_3 = 19,
> > FMT_DXT4_5 = 20,
> > +   FMT_10_10_10_2 = 21,
> > FMT_24_8 = 22,
> > -   FMT_24_8_FLOAT = 23,
> > FMT_16 = 24,
> > FMT_16_16 = 25,
> > FMT_16_16_16_16 = 26,
> > @@ -106,29 +105,23 @@ enum a2xx_sq_surfaceformat {
> > FMT_32_FLOAT = 36,
> > FMT_32_32_FLOAT = 37,
> > FMT_32_32_32_32_FLOAT = 38,
> > -   FMT_32_AS_8 = 39,
> > -   FMT_32_AS_8_8 = 40,
> > -   FMT_16_MPEG = 41,
> > -   FMT_16_16_MPEG = 42,
> > -   FMT_8_INTERLACED = 43,
> > -   FMT_32_AS_8_INTERLACED = 44,
> > -   FMT_32_AS_8_8_INTERLACED = 45,
> > -   FMT_16_INTERLACED = 46,
> > -   FMT_16_MPEG_INTERLACED = 47,
> > -   FMT_16_16_MPEG_INTERLACED = 48,
> > +   FMT_ATI_TC_RGB = 39,
> > +   FMT_ATI_TC_RGBA = 40,
> > +   FMT_ATI_TC_555_565_RGB = 41,
> > +   FMT_ATI_TC_555_565_RGBA = 42,
> > +   FMT_ATI_TC_RGBA_INTERP = 43,
> > +   FMT_ATI_TC_555_565_RGBA_INTERP = 44,
> > +   FMT_ETC1_RGBA_INTERP = 46,
> > +   FMT_ETC1_RGB = 47,
> > +   FMT_ETC1_RGBA = 48,
> > FMT_DXN = 49,
> > -   FMT_8_8_8_8_AS_16_16_16_16 = 50,
> > -   FMT_DXT1_AS_16_16_16_16 = 51,
> > -   FMT_DXT2_3_AS_16_16_16_16 = 52,
> > -   FMT_DXT4_5_AS_16_16_16_16 = 53,
> > +   FMT_2_3_3 = 51,
> > FMT_2_10_10_10_AS_16_16_16_16 = 54,
> > -   FMT_10_11_11_AS_16_16_16_16 = 55,
> > -   FMT_11_11_10_AS_16_16_16_16 = 56,
> > +   FMT_10_10_10_2_AS_16_16_16_16 = 55,
> > FMT_32_32_32_FLOAT = 57,
> > FMT_DXT3A = 58,
> > FMT_DXT5A = 59,
> > FMT_CTX1 = 60,
> > -   FMT_DXT3A_AS_1_1_1_1 = 61,
> >  };
> >
> >  enum a2xx_sq_ps_vtx_mode {
> > diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c b/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
> > index 0905ab6..46a7d18 100644
> > --- a/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
> > +++ b/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
> > @@ -293,10 +293,10 @@ fd2_emit_tile_mem2gmem(struct fd_batch *batch, struct 
> > fd_tile *tile)
> > OUT_PKT3(ring, CP_SET_CONSTANT, 2);
> > OUT_RING(ring, CP_REG(REG_A2XX_RB_BLEND_CONTROL));
> > OUT_RING(ring, A2XX_RB_BLEND_CONTROL_COLOR_SRCBLEND(FACTOR_ONE) |
> > -   A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(BLEND_DST_PLUS_SRC) |
> > +   A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(BLEND2_DST_PLUS_SRC) |
> > A2XX_RB_BLEND_CONTROL_COLOR_DESTBLEND(FACTOR_ZERO) |
> > A2XX_RB_BLEND_CONTROL_ALPHA_SRCBLEND(FACTOR_ONE) |
> > -   A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(BLEND_DST_PLUS_SRC) |
> > +   A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(BLEND2_DST_PLUS_SRC) |
> > A2XX_RB_BLEND_CONTROL_ALPHA_DESTBLEND(FACTOR_ZERO));
> >
> > 

Re: [Mesa-dev] [PATCH 4/7] freedreno: a2xx: Support TEXTURE_RECT

2018-01-25 Thread Ilia Mirkin
Should you also expose PIPE_CAP_TEXTURE_RECTANGLE? (Or whatever it's
called... I forget.)

On Thu, Jan 25, 2018 at 8:29 AM, Wladimir J. van der Laan
 wrote:
> Denormalized texture coordinates are required for text rendering in
> GALLIUM_HUD.
>
> Signed-off-by: Wladimir J. van der Laan 
> ---
>  src/gallium/drivers/freedreno/a2xx/fd2_compiler.c | 3 ++-
>  src/gallium/drivers/freedreno/a2xx/ir-a2xx.c  | 1 +
>  src/gallium/drivers/freedreno/a2xx/ir-a2xx.h  | 1 +
>  3 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c 
> b/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c
> index 2ffd8cd..9f2fc61 100644
> --- a/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c
> +++ b/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c
> @@ -791,6 +791,7 @@ translate_tex(struct fd2_compile_context *ctx,
> instr = ir2_instr_create(next_exec_cf(ctx), IR2_FETCH);
> instr->fetch.opc = TEX_FETCH;
> instr->fetch.is_cube = (inst->Texture.Texture == TGSI_TEXTURE_3D);
> +   instr->fetch.is_rect = (inst->Texture.Texture == TGSI_TEXTURE_RECT);
> assert(inst->Texture.NumOffsets <= 1); // TODO what to do in other 
> cases?
>
> /* save off the tex fetch to be patched later with correct const_idx: 
> */
> @@ -802,7 +803,7 @@ translate_tex(struct fd2_compile_context *ctx,
> reg = add_src_reg(ctx, instr, coord);
>
> /* blob compiler always sets 3rd component to same as 1st for 2d: */
> -   if (inst->Texture.Texture == TGSI_TEXTURE_2D)
> +   if (inst->Texture.Texture == TGSI_TEXTURE_2D || inst->Texture.Texture 
> == TGSI_TEXTURE_RECT)
> reg->swizzle[2] = reg->swizzle[0];
>
> /* dst register needs to be marked for sync: */
> diff --git a/src/gallium/drivers/freedreno/a2xx/ir-a2xx.c 
> b/src/gallium/drivers/freedreno/a2xx/ir-a2xx.c
> index 163c282..3666a7e 100644
> --- a/src/gallium/drivers/freedreno/a2xx/ir-a2xx.c
> +++ b/src/gallium/drivers/freedreno/a2xx/ir-a2xx.c
> @@ -341,6 +341,7 @@ static int instr_emit_fetch(struct ir2_instruction *instr,
> tex->use_comp_lod = 1;
> tex->use_reg_lod = !instr->fetch.is_cube;
> tex->sample_location = SAMPLE_CENTER;
> +tex->tx_coord_denorm = instr->fetch.is_rect;
>
> if (instr->pred != IR2_PRED_NONE) {
> tex->pred_select = 1;
> diff --git a/src/gallium/drivers/freedreno/a2xx/ir-a2xx.h 
> b/src/gallium/drivers/freedreno/a2xx/ir-a2xx.h
> index 36ed204..c4b6c18 100644
> --- a/src/gallium/drivers/freedreno/a2xx/ir-a2xx.h
> +++ b/src/gallium/drivers/freedreno/a2xx/ir-a2xx.h
> @@ -74,6 +74,7 @@ struct ir2_instruction {
> unsigned const_idx;
> /* texture fetch specific: */
> bool is_cube : 1;
> +   bool is_rect : 1;
> /* vertex fetch specific: */
> unsigned const_idx_sel;
> enum a2xx_sq_surfaceformat fmt;
> --
> 2.7.4
>


Re: [Mesa-dev] [Freedreno] [PATCH 1/7] freedreno: a2xx: Update rnndb header

2018-01-25 Thread Ilia Mirkin
On Thu, Jan 25, 2018 at 8:29 AM, Wladimir J. van der Laan
 wrote:
> Also update BLEND_ to BLEND2_ opcodes to accommodate.

Are you saying this doesn't compile right now? I would have expected
the accompanying change to a2xx.xml.h for that. Perhaps this landed
into the wrong commit?

Also it's odd that the formats are so different than originally
entered. Any opinion on how that happened?

>
> Signed-off-by: Wladimir J. van der Laan 
> ---
>  src/gallium/drivers/freedreno/a2xx/a2xx.xml.h | 33 
> +++
>  src/gallium/drivers/freedreno/a2xx/fd2_gmem.c |  4 ++--
>  2 files changed, 15 insertions(+), 22 deletions(-)
>
> diff --git a/src/gallium/drivers/freedreno/a2xx/a2xx.xml.h 
> b/src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
> index 55a4355..279a652 100644
> --- a/src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
> +++ b/src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
> @@ -84,13 +84,12 @@ enum a2xx_sq_surfaceformat {
> FMT_5_5_5_1 = 13,
> FMT_8_8_8_8_A = 14,
> FMT_4_4_4_4 = 15,
> -   FMT_10_11_11 = 16,
> -   FMT_11_11_10 = 17,
> +   FMT_8_8_8 = 16,
> FMT_DXT1 = 18,
> FMT_DXT2_3 = 19,
> FMT_DXT4_5 = 20,
> +   FMT_10_10_10_2 = 21,
> FMT_24_8 = 22,
> -   FMT_24_8_FLOAT = 23,
> FMT_16 = 24,
> FMT_16_16 = 25,
> FMT_16_16_16_16 = 26,
> @@ -106,29 +105,23 @@ enum a2xx_sq_surfaceformat {
> FMT_32_FLOAT = 36,
> FMT_32_32_FLOAT = 37,
> FMT_32_32_32_32_FLOAT = 38,
> -   FMT_32_AS_8 = 39,
> -   FMT_32_AS_8_8 = 40,
> -   FMT_16_MPEG = 41,
> -   FMT_16_16_MPEG = 42,
> -   FMT_8_INTERLACED = 43,
> -   FMT_32_AS_8_INTERLACED = 44,
> -   FMT_32_AS_8_8_INTERLACED = 45,
> -   FMT_16_INTERLACED = 46,
> -   FMT_16_MPEG_INTERLACED = 47,
> -   FMT_16_16_MPEG_INTERLACED = 48,
> +   FMT_ATI_TC_RGB = 39,
> +   FMT_ATI_TC_RGBA = 40,
> +   FMT_ATI_TC_555_565_RGB = 41,
> +   FMT_ATI_TC_555_565_RGBA = 42,
> +   FMT_ATI_TC_RGBA_INTERP = 43,
> +   FMT_ATI_TC_555_565_RGBA_INTERP = 44,
> +   FMT_ETC1_RGBA_INTERP = 46,
> +   FMT_ETC1_RGB = 47,
> +   FMT_ETC1_RGBA = 48,
> FMT_DXN = 49,
> -   FMT_8_8_8_8_AS_16_16_16_16 = 50,
> -   FMT_DXT1_AS_16_16_16_16 = 51,
> -   FMT_DXT2_3_AS_16_16_16_16 = 52,
> -   FMT_DXT4_5_AS_16_16_16_16 = 53,
> +   FMT_2_3_3 = 51,
> FMT_2_10_10_10_AS_16_16_16_16 = 54,
> -   FMT_10_11_11_AS_16_16_16_16 = 55,
> -   FMT_11_11_10_AS_16_16_16_16 = 56,
> +   FMT_10_10_10_2_AS_16_16_16_16 = 55,
> FMT_32_32_32_FLOAT = 57,
> FMT_DXT3A = 58,
> FMT_DXT5A = 59,
> FMT_CTX1 = 60,
> -   FMT_DXT3A_AS_1_1_1_1 = 61,
>  };
>
>  enum a2xx_sq_ps_vtx_mode {
> diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c 
> b/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
> index 0905ab6..46a7d18 100644
> --- a/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
> +++ b/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
> @@ -293,10 +293,10 @@ fd2_emit_tile_mem2gmem(struct fd_batch *batch, struct 
> fd_tile *tile)
> OUT_PKT3(ring, CP_SET_CONSTANT, 2);
> OUT_RING(ring, CP_REG(REG_A2XX_RB_BLEND_CONTROL));
> OUT_RING(ring, A2XX_RB_BLEND_CONTROL_COLOR_SRCBLEND(FACTOR_ONE) |
> -   A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(BLEND_DST_PLUS_SRC) |
> +   A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(BLEND2_DST_PLUS_SRC) |
> A2XX_RB_BLEND_CONTROL_COLOR_DESTBLEND(FACTOR_ZERO) |
> A2XX_RB_BLEND_CONTROL_ALPHA_SRCBLEND(FACTOR_ONE) |
> -   A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(BLEND_DST_PLUS_SRC) |
> +   A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(BLEND2_DST_PLUS_SRC) |
> A2XX_RB_BLEND_CONTROL_ALPHA_DESTBLEND(FACTOR_ZERO));
>
> OUT_PKT3(ring, CP_SET_CONSTANT, 3);
> --
> 2.7.4
>
> ___
> Freedreno mailing list
> freedr...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: add missing llvm dependencies to .pc files

2018-01-25 Thread Chuck Atkins
> Should be used only for gallium-xlib based glx, since it embeds the
> swr/llvmpipe driver.
>
...
...

> There is no LLVM specific code in these - ^^ should not be needed.
>

Correct.  This was initially to address the problem for OSMesa but I
realized it was likely an issue for more than just OSMesa.  After a bit of
debugging I see that I was indeed a bit overzealous on that.  Will fix in
v2.



>
> >> +if test "x$enable_osmesa$enable_gallium_osmesa" != xnono; then
> >> +OSMESA_PC_LIB_PRIV="$OSMESA_PC_LIB_PRIV $OSMESA_PC_LIB_PRIV
> $LLVM_LIBS"
> >
> > I'm assuming the duplicate `$OSMESA_PC_LIB_PRIV` wasn't intended?
>

Will fix in v2.



> These variables have the dependency libs (-lfoo) that the respective
> libraries (libGL.so/libGLES*so/etc.) need.
> They are then stored in the .pc Libs.private section - thus anyone
> static linking said libraries will reuse them.
>

This is indeed the use case here: building a static libGL or static
libOSMesa and having proper dependency info available.

I'll push a corrected v2 shortly.

Thanks for the review.


[Mesa-dev] [PATCH 6/7] freedreno: a2xx: implement SEQ/SNE instructions

2018-01-25 Thread Wladimir J. van der Laan
Extend translate_sge_slt to emit these, in analogous fashion
but using CNDEv.

Signed-off-by: Wladimir J. van der Laan 
---
 src/gallium/drivers/freedreno/a2xx/fd2_compiler.c | 23 ---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c b/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c
index 9f2fc61..52f0aba 100644
--- a/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c
+++ b/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c
@@ -829,8 +829,10 @@ translate_tex(struct fd2_compile_context *ctx,
 
 /* SGE(a,b) = GTE((b - a), 1.0, 0.0) */
 /* SLT(a,b) = GTE((b - a), 0.0, 1.0) */
+/* SEQ(a,b) = EQU((b - a), 1.0, 0.0) */
+/* SNE(a,b) = EQU((b - a), 0.0, 1.0) */
 static void
-translate_sge_slt(struct fd2_compile_context *ctx,
+translate_sge_slt_seq_sne(struct fd2_compile_context *ctx,
struct tgsi_full_instruction *inst, unsigned opc)
 {
struct ir2_instruction *instr;
@@ -838,6 +840,7 @@ translate_sge_slt(struct fd2_compile_context *ctx,
struct tgsi_src_register tmp_src;
struct tgsi_src_register tmp_const;
float c0, c1;
+instr_vector_opc_t vopc;
 
switch (opc) {
default:
@@ -845,10 +848,22 @@ translate_sge_slt(struct fd2_compile_context *ctx,
case TGSI_OPCODE_SGE:
c0 = 1.0;
c1 = 0.0;
+vopc = CNDGTEv;
break;
case TGSI_OPCODE_SLT:
c0 = 0.0;
c1 = 1.0;
+vopc = CNDGTEv;
+   break;
+   case TGSI_OPCODE_SEQ:
+   c0 = 0.0;
+   c1 = 1.0;
+vopc = CNDEv;
+   break;
+   case TGSI_OPCODE_SNE:
+   c0 = 1.0;
+   c1 = 0.0;
+vopc = CNDEv;
break;
}
 
@@ -859,7 +874,7 @@ translate_sge_slt(struct fd2_compile_context *ctx,
add_src_reg(ctx, instr, &inst->Src[0].Register)->flags |= IR2_REG_NEGATE;
add_src_reg(ctx, instr, &inst->Src[1].Register);
 
-   instr = ir2_instr_create_alu(next_exec_cf(ctx), CNDGTEv, ~0);
+   instr = ir2_instr_create_alu(next_exec_cf(ctx), vopc, ~0);
add_dst_reg(ctx, instr, &inst->Dst[0].Register);
/* maybe should re-arrange the syntax some day, but
 * in assembler/disassembler and what ir.c expects
@@ -1057,7 +1072,9 @@ translate_instruction(struct fd2_compile_context *ctx,
break;
case TGSI_OPCODE_SLT:
case TGSI_OPCODE_SGE:
-   translate_sge_slt(ctx, inst, opc);
+case TGSI_OPCODE_SEQ:
+case TGSI_OPCODE_SNE:
+   translate_sge_slt_seq_sne(ctx, inst, opc);
break;
case TGSI_OPCODE_MAD:
instr = ir2_instr_create_alu(cf, MULADDv, ~0);
-- 
2.7.4
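
For readers unfamiliar with the trick, here is a minimal C sketch of building
set-on-compare results from conditional selects. It is an illustration under
stated assumptions, not the driver code: the toy_* names are hypothetical, the
select semantics are assumed as written in the comments, and the sketch
compares (a - b) with its own operand order rather than the one the hardware
instructions use.

```c
#include <assert.h>

/* Assumed select semantics for this sketch (hypothetical helpers):
 *   toy_cndgte(x, p, q) = (x >= 0) ? p : q
 *   toy_cnde(x, p, q)   = (x == 0) ? p : q */
static float toy_cndgte(float x, float p, float q) { return x >= 0.0f ? p : q; }
static float toy_cnde(float x, float p, float q)   { return x == 0.0f ? p : q; }

enum toy_op { TOY_SGE, TOY_SLT, TOY_SEQ, TOY_SNE };

/* Set-on-compare: the result is 1.0 when the comparison holds, else 0.0.
 * Each opcode is one subtraction followed by one select; only the select
 * kind and the order of the two constants differ. */
static float
toy_set_on_compare(enum toy_op op, float a, float b)
{
   float diff = a - b;

   switch (op) {
   case TOY_SGE: return toy_cndgte(diff, 1.0f, 0.0f); /* a >= b */
   case TOY_SLT: return toy_cndgte(diff, 0.0f, 1.0f); /* a <  b */
   case TOY_SEQ: return toy_cnde(diff, 1.0f, 0.0f);   /* a == b */
   case TOY_SNE: return toy_cnde(diff, 0.0f, 1.0f);   /* a != b */
   }
   assert(!"unknown opcode");
   return 0.0f;
}
```

The point of the design is that SEQ/SNE reuse the SGE/SLT structure and only
swap the select instruction, which is what the patch does with CNDEv.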



[Mesa-dev] [PATCH 7/7] freedreno: a2xx: Implement DP2 instruction

2018-01-25 Thread Wladimir J. van der Laan
Use DOT2ADDv instruction with 0.0f constant add.

Signed-off-by: Wladimir J. van der Laan 
---
 src/gallium/drivers/freedreno/a2xx/fd2_compiler.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c b/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c
index 52f0aba..ce0b33a 100644
--- a/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c
+++ b/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c
@@ -987,6 +987,24 @@ translate_trig(struct fd2_compile_context *ctx,
add_src_reg(ctx, instr, &tmp_src);
 }
 
+static void
+translate_dp2(struct fd2_compile_context *ctx,
+   struct tgsi_full_instruction *inst,
+   unsigned opc)
+{
+struct tgsi_src_register tmp_const;
+struct ir2_instruction *instr;
+/* DP2ADD c,a,b -> dot2(a,b) + c */
+/* for c we use the constant 0.0 */
+instr = ir2_instr_create_alu(next_exec_cf(ctx), DOT2ADDv, ~0);
+get_immediate(ctx, &tmp_const, fui(0.0f));
+add_dst_reg(ctx, instr, &inst->Dst[0].Register);
+add_src_reg(ctx, instr, &tmp_const);
+add_src_reg(ctx, instr, &inst->Src[0].Register);
+add_src_reg(ctx, instr, &inst->Src[1].Register);
+add_vector_clamp(inst, instr);
+}
+
 /*
  * Main part of compiler/translator:
  */
@@ -1054,6 +1072,9 @@ translate_instruction(struct fd2_compile_context *ctx,
instr = ir2_instr_create_alu(cf, ADDv, ~0);
add_regs_vector_2(ctx, inst, instr);
break;
+   case TGSI_OPCODE_DP2:
+   translate_dp2(ctx, inst, opc);
+   break;
case TGSI_OPCODE_DP3:
instr = ir2_instr_create_alu(cf, DOT3v, ~0);
add_regs_vector_2(ctx, inst, instr);
-- 
2.7.4
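
The arithmetic behind this is simple enough to show as a C sketch. This is an
illustration only: toy_dot2add() and toy_dp2() are hypothetical scalar models
and say nothing about how the hardware orders its operands.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of "2-component dot product plus a scalar addend". */
static float
toy_dot2add(const float a[2], const float b[2], float c)
{
   assert(a != NULL && b != NULL);
   return a[0] * b[0] + a[1] * b[1] + c;
}

/* DP2 is the same operation with the addend fixed at 0.0. */
static float
toy_dp2(const float a[2], const float b[2])
{
   return toy_dot2add(a, b, 0.0f);
}
```

For a = (1, 2) and b = (3, 4), toy_dp2() returns 11.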



[Mesa-dev] [PATCH 2/7] freedreno: a2xx: Fix fd2_tex_swiz

2018-01-25 Thread Wladimir J. van der Laan
Compose swizzles using util_format_compose_swizzles instead
of the custom code (which somehow had a bug).

This makes the GL_ALPHA internal format work.

Signed-off-by: Wladimir J. van der Laan 
---
 src/gallium/drivers/freedreno/a2xx/fd2_util.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_util.c b/src/gallium/drivers/freedreno/a2xx/fd2_util.c
index 0bdcfcd..25f2bf4 100644
--- a/src/gallium/drivers/freedreno/a2xx/fd2_util.c
+++ b/src/gallium/drivers/freedreno/a2xx/fd2_util.c
@@ -309,14 +309,14 @@ fd2_tex_swiz(enum pipe_format format, unsigned swizzle_r, unsigned swizzle_g,
 {
const struct util_format_description *desc =
util_format_description(format);
-   uint8_t swiz[] = {
-   swizzle_r, swizzle_g, swizzle_b, swizzle_a,
-   PIPE_SWIZZLE_0, PIPE_SWIZZLE_1,
-   PIPE_SWIZZLE_1, PIPE_SWIZZLE_1,
-   };
+   unsigned char swiz[4] = {
+   swizzle_r, swizzle_g, swizzle_b, swizzle_a,
+   }, rswiz[4];
 
-   return A2XX_SQ_TEX_3_SWIZ_X(tex_swiz(swiz[desc->swizzle[0]])) |
-   A2XX_SQ_TEX_3_SWIZ_Y(tex_swiz(swiz[desc->swizzle[1]])) |
-   A2XX_SQ_TEX_3_SWIZ_Z(tex_swiz(swiz[desc->swizzle[2]])) |
-   A2XX_SQ_TEX_3_SWIZ_W(tex_swiz(swiz[desc->swizzle[3]]));
+   util_format_compose_swizzles(desc->swizzle, swiz, rswiz);
+
+   return A2XX_SQ_TEX_3_SWIZ_X(tex_swiz(rswiz[0])) |
+   A2XX_SQ_TEX_3_SWIZ_Y(tex_swiz(rswiz[1])) |
+   A2XX_SQ_TEX_3_SWIZ_Z(tex_swiz(rswiz[2])) |
+   A2XX_SQ_TEX_3_SWIZ_W(tex_swiz(rswiz[3]));
 }
-- 
2.7.4
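
As an illustration of what composing two swizzles means, here is a minimal C
sketch. It is not Mesa's util_format_compose_swizzles(): the toy_* names and
the enum numbering are assumptions made for this example, and only the
argument order (format swizzle, view swizzle, result) follows the call in the
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical swizzle selectors; the numbering is an assumption of this
 * sketch: four channel selectors first, then the two constants. */
enum toy_swizzle { TOY_X, TOY_Y, TOY_Z, TOY_W, TOY_ZERO, TOY_ONE };

/* Compose two swizzles: `outer` is applied to the result of `inner`.
 * Channel selectors in `outer` are looked up in `inner`; constants in
 * `outer` pass through unchanged. */
static void
toy_compose_swizzles(const unsigned char inner[4],
                     const unsigned char outer[4],
                     unsigned char dst[4])
{
   assert(inner != NULL && outer != NULL && dst != NULL);

   for (size_t i = 0; i < 4; i++)
      dst[i] = outer[i] <= TOY_W ? inner[outer[i]] : outer[i];
}
```

For an alpha-only format described as (0, 0, 0, X), an identity view swizzle
keeps (0, 0, 0, X), while a view swizzle of (W, W, W, 1) resolves to
(X, X, X, 1).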



[Mesa-dev] [PATCH 3/7] freedreno: a2xx: Prevent crash in emit_texture if view is not set

2018-01-25 Thread Wladimir J. van der Laan
Textures will sometimes be updated while the texture view state is unset.
Without this change that causes an assertion failure or a segfault.

Signed-off-by: Wladimir J. van der Laan 
---
 src/gallium/drivers/freedreno/a2xx/fd2_emit.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_emit.c b/src/gallium/drivers/freedreno/a2xx/fd2_emit.c
index 5a1db13..ebe698f 100644
--- a/src/gallium/drivers/freedreno/a2xx/fd2_emit.c
+++ b/src/gallium/drivers/freedreno/a2xx/fd2_emit.c
@@ -125,6 +125,7 @@ emit_texture(struct fd_ringbuffer *ring, struct fd_context *ctx,
 {
unsigned const_idx = fd2_get_const_idx(ctx, tex, samp_id);
static const struct fd2_sampler_stateobj dummy_sampler = {};
+   static const struct fd2_pipe_sampler_view dummy_view = {};
const struct fd2_sampler_stateobj *sampler;
struct fd2_pipe_sampler_view *view;
 
@@ -134,13 +135,19 @@ emit_texture(struct fd_ringbuffer *ring, struct fd_context *ctx,
sampler = tex->samplers[samp_id] ?
fd2_sampler_stateobj(tex->samplers[samp_id]) :
&dummy_sampler;
-   view = fd2_pipe_sampler_view(tex->textures[samp_id]);
+   view = tex->textures[samp_id] ?
+   fd2_pipe_sampler_view(tex->textures[samp_id]) :
+   &dummy_view;
 
OUT_PKT3(ring, CP_SET_CONSTANT, 7);
OUT_RING(ring, 0x0001 + (0x6 * const_idx));
 
OUT_RING(ring, sampler->tex0 | view->tex0);
-   OUT_RELOC(ring, fd_resource(view->base.texture)->bo, 0, view->fmt, 0);
+   if (view->base.texture)
+   OUT_RELOC(ring, fd_resource(view->base.texture)->bo, 0, view->fmt, 0);
+   else
+   OUT_RING(ring, 0);
+
OUT_RING(ring, view->tex2);
OUT_RING(ring, sampler->tex3 | view->tex3);
OUT_RING(ring, sampler->tex4);
-- 
2.7.4
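
The fallback used here is the usual "dummy object" pattern. A minimal C sketch
follows, as an illustration only: toy_view, toy_get_view() and
toy_needs_reloc() are hypothetical and much simpler than the driver's
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified sampler view. */
struct toy_view {
   unsigned tex0;        /* packed state word, 0 in the dummy */
   const void *texture;  /* NULL when no texture is bound */
};

/* Return the bound view, or a zero-initialized dummy so that callers can
 * read the state words without a NULL check. */
static const struct toy_view *
toy_get_view(const struct toy_view *bound)
{
   static const struct toy_view dummy_view = { 0 };
   return bound ? bound : &dummy_view;
}

/* Only a view with a texture needs a buffer relocation; otherwise the
 * caller would emit a plain zero instead. */
static bool
toy_needs_reloc(const struct toy_view *bound)
{
   const struct toy_view *view = toy_get_view(bound);

   assert(view != NULL);
   return view->texture != NULL;
}
```

The state words of the dummy are all zero, so OR-ing them into the sampler
state is harmless, and the texture pointer check decides between a relocation
and a literal 0.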



[Mesa-dev] [PATCH 4/7] freedreno: a2xx: Support TEXTURE_RECT

2018-01-25 Thread Wladimir J. van der Laan
Denormalized texture coordinates are required for text rendering in
GALLIUM_HUD.

Signed-off-by: Wladimir J. van der Laan 
---
 src/gallium/drivers/freedreno/a2xx/fd2_compiler.c | 3 ++-
 src/gallium/drivers/freedreno/a2xx/ir-a2xx.c  | 1 +
 src/gallium/drivers/freedreno/a2xx/ir-a2xx.h  | 1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c b/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c
index 2ffd8cd..9f2fc61 100644
--- a/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c
+++ b/src/gallium/drivers/freedreno/a2xx/fd2_compiler.c
@@ -791,6 +791,7 @@ translate_tex(struct fd2_compile_context *ctx,
instr = ir2_instr_create(next_exec_cf(ctx), IR2_FETCH);
instr->fetch.opc = TEX_FETCH;
instr->fetch.is_cube = (inst->Texture.Texture == TGSI_TEXTURE_3D);
+   instr->fetch.is_rect = (inst->Texture.Texture == TGSI_TEXTURE_RECT);
assert(inst->Texture.NumOffsets <= 1); // TODO what to do in other cases?
 
/* save off the tex fetch to be patched later with correct const_idx: */
@@ -802,7 +803,7 @@ translate_tex(struct fd2_compile_context *ctx,
reg = add_src_reg(ctx, instr, coord);
 
/* blob compiler always sets 3rd component to same as 1st for 2d: */
-   if (inst->Texture.Texture == TGSI_TEXTURE_2D)
+   if (inst->Texture.Texture == TGSI_TEXTURE_2D || inst->Texture.Texture == TGSI_TEXTURE_RECT)
reg->swizzle[2] = reg->swizzle[0];
 
/* dst register needs to be marked for sync: */
diff --git a/src/gallium/drivers/freedreno/a2xx/ir-a2xx.c b/src/gallium/drivers/freedreno/a2xx/ir-a2xx.c
index 163c282..3666a7e 100644
--- a/src/gallium/drivers/freedreno/a2xx/ir-a2xx.c
+++ b/src/gallium/drivers/freedreno/a2xx/ir-a2xx.c
@@ -341,6 +341,7 @@ static int instr_emit_fetch(struct ir2_instruction *instr,
tex->use_comp_lod = 1;
tex->use_reg_lod = !instr->fetch.is_cube;
tex->sample_location = SAMPLE_CENTER;
+tex->tx_coord_denorm = instr->fetch.is_rect;
 
if (instr->pred != IR2_PRED_NONE) {
tex->pred_select = 1;
diff --git a/src/gallium/drivers/freedreno/a2xx/ir-a2xx.h b/src/gallium/drivers/freedreno/a2xx/ir-a2xx.h
index 36ed204..c4b6c18 100644
--- a/src/gallium/drivers/freedreno/a2xx/ir-a2xx.h
+++ b/src/gallium/drivers/freedreno/a2xx/ir-a2xx.h
@@ -74,6 +74,7 @@ struct ir2_instruction {
unsigned const_idx;
/* texture fetch specific: */
bool is_cube : 1;
+   bool is_rect : 1;
/* vertex fetch specific: */
unsigned const_idx_sel;
enum a2xx_sq_surfaceformat fmt;
-- 
2.7.4
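
For readers unfamiliar with the distinction, the difference between normalized and denormalized coordinates can be sketched as below. This is a generic illustration, not driver code: both helpers and their parameters are hypothetical. A rect texture is addressed in texel units, while a regular 2D texture is addressed in the [0, 1] range.

```c
/* Hypothetical helper: map a texel-space coordinate to the [0, 1] range. */
static float
example_normalize_coord(float texel, unsigned size)
{
    return texel / (float)size;
}

/*
 * Coordinate handed to the fetch: texel units for rect textures,
 * normalized otherwise. is_rect mirrors the flag added by the patch,
 * but this function itself is only an illustration.
 */
static float
example_fetch_coord(float texel, unsigned size, int is_rect)
{
    return is_rect ? texel : example_normalize_coord(texel, size);
}
```

For a 256-texel-wide texture, texel 64 is fetched at 64.0 when the texture is a rect and at 0.25 otherwise.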



[Mesa-dev] [PATCH 5/7] freedreno: a2xx: Compressed textures support

2018-01-25 Thread Wladimir J. van der Laan
Add support for:

- PIPE_FORMAT_ETC1_RGB8
- PIPE_FORMAT_DXT1_RGB
- PIPE_FORMAT_DXT1_RGBA
- PIPE_FORMAT_DXT3_RGBA
- PIPE_FORMAT_DXT5_RGBA

Signed-off-by: Wladimir J. van der Laan 
---
 src/gallium/drivers/freedreno/a2xx/fd2_util.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_util.c b/src/gallium/drivers/freedreno/a2xx/fd2_util.c
index 25f2bf4..60e5c39 100644
--- a/src/gallium/drivers/freedreno/a2xx/fd2_util.c
+++ b/src/gallium/drivers/freedreno/a2xx/fd2_util.c
@@ -183,6 +183,17 @@ fd2_pipe2surface(enum pipe_format format)
case PIPE_FORMAT_R32G32B32A32_FLOAT:
return FMT_32_32_32_32_FLOAT;
 
+/* Compressed textures. */
+case PIPE_FORMAT_ETC1_RGB8:
+return FMT_ETC1_RGB;
+case PIPE_FORMAT_DXT1_RGB:
+case PIPE_FORMAT_DXT1_RGBA:
+return FMT_DXT1;
+case PIPE_FORMAT_DXT3_RGBA:
+return FMT_DXT2_3;
+case PIPE_FORMAT_DXT5_RGBA:
+return FMT_DXT4_5;
+
/* YUV buffers. */
case PIPE_FORMAT_UYVY:
return FMT_Cr_Y1_Cb_Y0;
-- 
2.7.4
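
The mapping added above can be sketched as a standalone function. The enum names with the example_/EX_ prefix are hypothetical; the numeric hardware values are the ones listed for FMT_DXT1, FMT_DXT2_3, FMT_DXT4_5 and FMT_ETC1_RGB in the rnndb header update from patch 1/7. The sketch mainly shows that both DXT1 variants share one hardware format.

```c
/* Hypothetical stand-ins for the gallium pipe formats named in the patch. */
enum example_format {
    EX_ETC1_RGB8,
    EX_DXT1_RGB,
    EX_DXT1_RGBA,
    EX_DXT3_RGBA,
    EX_DXT5_RGBA,
    EX_UNKNOWN
};

/* Values copied from the a2xx_sq_surfaceformat enum in patch 1/7. */
enum example_hw_format {
    EX_FMT_INVALID  = -1,
    EX_FMT_DXT1     = 18,
    EX_FMT_DXT2_3   = 19,
    EX_FMT_DXT4_5   = 20,
    EX_FMT_ETC1_RGB = 47
};

/* Map a compressed format to its hardware surface format. */
static enum example_hw_format
example_to_hw(enum example_format format)
{
    switch (format) {
    case EX_ETC1_RGB8:
        return EX_FMT_ETC1_RGB;
    case EX_DXT1_RGB:
    case EX_DXT1_RGBA:   /* same block layout, alpha is signalled per block */
        return EX_FMT_DXT1;
    case EX_DXT3_RGBA:
        return EX_FMT_DXT2_3;
    case EX_DXT5_RGBA:
        return EX_FMT_DXT4_5;
    default:
        return EX_FMT_INVALID;
    }
}
```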



[Mesa-dev] [PATCH 1/7] freedreno: a2xx: Update rnndb header

2018-01-25 Thread Wladimir J. van der Laan
Also rename the BLEND_ opcodes to BLEND2_ to accommodate the updated header.

Signed-off-by: Wladimir J. van der Laan 
---
 src/gallium/drivers/freedreno/a2xx/a2xx.xml.h | 33 +++
 src/gallium/drivers/freedreno/a2xx/fd2_gmem.c |  4 ++--
 2 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/src/gallium/drivers/freedreno/a2xx/a2xx.xml.h b/src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
index 55a4355..279a652 100644
--- a/src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
+++ b/src/gallium/drivers/freedreno/a2xx/a2xx.xml.h
@@ -84,13 +84,12 @@ enum a2xx_sq_surfaceformat {
FMT_5_5_5_1 = 13,
FMT_8_8_8_8_A = 14,
FMT_4_4_4_4 = 15,
-   FMT_10_11_11 = 16,
-   FMT_11_11_10 = 17,
+   FMT_8_8_8 = 16,
FMT_DXT1 = 18,
FMT_DXT2_3 = 19,
FMT_DXT4_5 = 20,
+   FMT_10_10_10_2 = 21,
FMT_24_8 = 22,
-   FMT_24_8_FLOAT = 23,
FMT_16 = 24,
FMT_16_16 = 25,
FMT_16_16_16_16 = 26,
@@ -106,29 +105,23 @@ enum a2xx_sq_surfaceformat {
FMT_32_FLOAT = 36,
FMT_32_32_FLOAT = 37,
FMT_32_32_32_32_FLOAT = 38,
-   FMT_32_AS_8 = 39,
-   FMT_32_AS_8_8 = 40,
-   FMT_16_MPEG = 41,
-   FMT_16_16_MPEG = 42,
-   FMT_8_INTERLACED = 43,
-   FMT_32_AS_8_INTERLACED = 44,
-   FMT_32_AS_8_8_INTERLACED = 45,
-   FMT_16_INTERLACED = 46,
-   FMT_16_MPEG_INTERLACED = 47,
-   FMT_16_16_MPEG_INTERLACED = 48,
+   FMT_ATI_TC_RGB = 39,
+   FMT_ATI_TC_RGBA = 40,
+   FMT_ATI_TC_555_565_RGB = 41,
+   FMT_ATI_TC_555_565_RGBA = 42,
+   FMT_ATI_TC_RGBA_INTERP = 43,
+   FMT_ATI_TC_555_565_RGBA_INTERP = 44,
+   FMT_ETC1_RGBA_INTERP = 46,
+   FMT_ETC1_RGB = 47,
+   FMT_ETC1_RGBA = 48,
FMT_DXN = 49,
-   FMT_8_8_8_8_AS_16_16_16_16 = 50,
-   FMT_DXT1_AS_16_16_16_16 = 51,
-   FMT_DXT2_3_AS_16_16_16_16 = 52,
-   FMT_DXT4_5_AS_16_16_16_16 = 53,
+   FMT_2_3_3 = 51,
FMT_2_10_10_10_AS_16_16_16_16 = 54,
-   FMT_10_11_11_AS_16_16_16_16 = 55,
-   FMT_11_11_10_AS_16_16_16_16 = 56,
+   FMT_10_10_10_2_AS_16_16_16_16 = 55,
FMT_32_32_32_FLOAT = 57,
FMT_DXT3A = 58,
FMT_DXT5A = 59,
FMT_CTX1 = 60,
-   FMT_DXT3A_AS_1_1_1_1 = 61,
 };
 
 enum a2xx_sq_ps_vtx_mode {
diff --git a/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c b/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
index 0905ab6..46a7d18 100644
--- a/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
+++ b/src/gallium/drivers/freedreno/a2xx/fd2_gmem.c
@@ -293,10 +293,10 @@ fd2_emit_tile_mem2gmem(struct fd_batch *batch, struct fd_tile *tile)
OUT_PKT3(ring, CP_SET_CONSTANT, 2);
OUT_RING(ring, CP_REG(REG_A2XX_RB_BLEND_CONTROL));
OUT_RING(ring, A2XX_RB_BLEND_CONTROL_COLOR_SRCBLEND(FACTOR_ONE) |
-   A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(BLEND_DST_PLUS_SRC) |
+   A2XX_RB_BLEND_CONTROL_COLOR_COMB_FCN(BLEND2_DST_PLUS_SRC) |
A2XX_RB_BLEND_CONTROL_COLOR_DESTBLEND(FACTOR_ZERO) |
A2XX_RB_BLEND_CONTROL_ALPHA_SRCBLEND(FACTOR_ONE) |
-   A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(BLEND_DST_PLUS_SRC) |
+   A2XX_RB_BLEND_CONTROL_ALPHA_COMB_FCN(BLEND2_DST_PLUS_SRC) |
A2XX_RB_BLEND_CONTROL_ALPHA_DESTBLEND(FACTOR_ZERO));
 
OUT_PKT3(ring, CP_SET_CONSTANT, 3);
-- 
2.7.4


