Re: [Mesa-dev] [PATCH 3/3] i965/fs: emit DIM instruction to load 64-bit immediates in HSW

2016-07-13 Thread Matt Turner
On Wed, Jul 13, 2016 at 10:52 PM, Samuel Iglesias Gonsálvez
 wrote:
>
>
> On 14/07/16 03:46, Matt Turner wrote:
>> On Wed, Jul 13, 2016 at 5:06 PM, Matt Turner  wrote:
>>> On Tue, Jul 12, 2016 at 11:42 PM, Samuel Iglesias Gonsálvez
>>>  wrote:
>>>> Signed-off-by: Samuel Iglesias Gonsálvez 
>>>> ---
>>>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 
>>>>  1 file changed, 12 insertions(+)
>>>>
>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>>>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>>>> index a65c273..bf32dfd 100644
>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>>>> @@ -4558,6 +4558,18 @@ setup_imm_df(const fs_builder &bld, double v)
>>>> if (devinfo->gen >= 8)
>>>>return brw_imm_df(v);
>>>>
>>>> +   /* gen7.5 does not support DF immediates straightforward but the DIM
>>>> +* instruction allows to set the 64-bit immediate value.
>>>> +*/
>>>> +   if (devinfo->is_haswell) {
>>>> +  const fs_builder ubld = bld.exec_all();
>>>> +  fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_DF, 1);
>>>> +  struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_F);
>>>> +  imm.df = v;
>>>> +  ubld.DIM(dst, imm);
>>>
>>> I know the hardware is strange and requires that src0's type is F, but
>>> I don't think we need to model that in the IR. I think that using a DF
>>> type in the IR
>>> would require otherwise unnecessary changes to dump_instructions().
>>>
>>> With the above three lines changed to just
>>>
>>>ubld.DIM(dst, brw_imm_df(v));
>>>
>>> this patch is:
>>>
>>> Reviewed-by: Matt Turner 
>>>
>>> Patch 1 I sent comments on. With those addressed it is also
>>>
>>> Reviewed-by: Matt Turner 
>>>
>>> I believe with my comments addressed on 1/3 and 3/3 that 2/3 is unnecessary.
>>
>> Actually, I guess 2/3 is necessary since the type is changed before
>> brw_eu_emit? Sorry, not thinking very clearly.
>>
>
> Exactly, patch 2/3 is needed because the type has been changed before
> brw_eu_emit.
>
> Does patch 2/3 get your R-b then?

Yes. Thank you!
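
For reference, the idea agreed on in this thread (keep the immediate typed as DF in the IR and relabel it as F only at emission time) can be modeled in a few lines of plain C. This is only an illustrative sketch, not Mesa code: `enum reg_type`, `struct reg`, `imm_df()`, `retype()` and `emit_dim_src()` are hypothetical stand-ins for the real `brw_reg` helpers, and the sketch assumes a 64-bit IEEE 754 `double`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the register types discussed above. */
enum reg_type { TYPE_F, TYPE_DF };

struct reg {
   enum reg_type type;
   uint64_t bits;      /* raw 64-bit immediate payload */
};

/* Build a DF immediate: this is what the IR would carry. */
static struct reg
imm_df(double v)
{
   struct reg r = { TYPE_DF, 0 };
   memcpy(&r.bits, &v, sizeof(v));   /* copy the bit pattern, no conversion */
   return r;
}

/* Relabel the type without touching the payload. */
static struct reg
retype(struct reg r, enum reg_type type)
{
   r.type = type;
   return r;
}

/* Emission step: the instruction wants src0 typed as F. */
static struct reg
emit_dim_src(struct reg src)
{
   return retype(src, TYPE_F);
}
```

The point of the split is that only the last step needs to know about the hardware quirk; everything that inspects or prints the IR keeps seeing a DF immediate.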
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] i965/fs: emit DIM instruction to load 64-bit immediates in HSW

2016-07-13 Thread Samuel Iglesias Gonsálvez


On 14/07/16 03:46, Matt Turner wrote:
> On Wed, Jul 13, 2016 at 5:06 PM, Matt Turner  wrote:
>> On Tue, Jul 12, 2016 at 11:42 PM, Samuel Iglesias Gonsálvez
>>  wrote:
>>> Signed-off-by: Samuel Iglesias Gonsálvez 
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 
>>>  1 file changed, 12 insertions(+)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>>> index a65c273..bf32dfd 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>>> @@ -4558,6 +4558,18 @@ setup_imm_df(const fs_builder &bld, double v)
>>> if (devinfo->gen >= 8)
>>>return brw_imm_df(v);
>>>
>>> +   /* gen7.5 does not support DF immediates straightforward but the DIM
>>> +* instruction allows to set the 64-bit immediate value.
>>> +*/
>>> +   if (devinfo->is_haswell) {
>>> +  const fs_builder ubld = bld.exec_all();
>>> +  fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_DF, 1);
>>> +  struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_F);
>>> +  imm.df = v;
>>> +  ubld.DIM(dst, imm);
>>
>> I know the hardware is strange and requires that src0's type is F, but
>> I don't think we need to model that in the IR. I think that using a DF
>> type in the IR
>> would require otherwise unnecessary changes to dump_instructions().
>>
>> With the above three lines changed to just
>>
>>ubld.DIM(dst, brw_imm_df(v));
>>
>> this patch is:
>>
>> Reviewed-by: Matt Turner 
>>
>> Patch 1 I sent comments on. With those addressed it is also
>>
>> Reviewed-by: Matt Turner 
>>
>> I believe with my comments addressed on 1/3 and 3/3 that 2/3 is unnecessary.
> 
> Actually, I guess 2/3 is necessary since the type is changed before
> brw_eu_emit? Sorry, not thinking very clearly.
> 

Exactly, patch 2/3 is needed because the type has been changed before
brw_eu_emit.

Does patch 2/3 get your R-b then?

Thanks for the review!

Sam





Re: [Mesa-dev] [PATCH 1/3] i965: enable the emission of the DIM instruction

2016-07-13 Thread Samuel Iglesias Gonsálvez


On 14/07/16 02:04, Matt Turner wrote:
> On Tue, Jul 12, 2016 at 11:42 PM, Samuel Iglesias Gonsálvez
>  wrote:
>> Signed-off-by: Samuel Iglesias Gonsálvez 
>> ---
>>  src/mesa/drivers/dri/i965/brw_defines.h  | 2 +-
>>  src/mesa/drivers/dri/i965/brw_eu.c   | 2 +-
>>  src/mesa/drivers/dri/i965/brw_eu.h   | 1 +
>>  src/mesa/drivers/dri/i965/brw_eu_emit.c  | 1 +
>>  src/mesa/drivers/dri/i965/brw_fs_builder.h   | 1 +
>>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 7 +++
>>  src/mesa/drivers/dri/i965/brw_vec4.h | 2 ++
>>  src/mesa/drivers/dri/i965/brw_vec4_builder.h | 1 +
>>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 7 +++
>>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 1 +
>>  10 files changed, 23 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
>> b/src/mesa/drivers/dri/i965/brw_defines.h
>> index d2cd53a..740d03d 100644
>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>> @@ -857,7 +857,7 @@ enum opcode {
>> BRW_OPCODE_XOR =7,
>> BRW_OPCODE_SHR =8,
>> BRW_OPCODE_SHL =9,
>> -   // BRW_OPCODE_DIM = 10,  /**< Gen7.5 only */ /* Reused */
>> +   BRW_OPCODE_DIM =10,  /**< Gen7.5 only */ /* Reused */
>> // BRW_OPCODE_SMOV =10,  /**< Gen8+   */ /* Reused */
>> /* Reserved - 11 */
>> BRW_OPCODE_ASR =12,
>> diff --git a/src/mesa/drivers/dri/i965/brw_eu.c 
>> b/src/mesa/drivers/dri/i965/brw_eu.c
>> index cc252de..3a309dc 100644
>> --- a/src/mesa/drivers/dri/i965/brw_eu.c
>> +++ b/src/mesa/drivers/dri/i965/brw_eu.c
>> @@ -421,7 +421,7 @@ enum gen {
>>  #define GEN_LE(gen) (GEN_LT(gen) | (gen))
>>
>>  static const struct opcode_desc opcode_10_descs[] = {
>> -   { .name = "dim",   .nsrc = 0, .ndst = 0, .gens = GEN75 },
>> +   { .name = "dim",   .nsrc = 1, .ndst = 1, .gens = GEN75 },
>> { .name = "smov",  .nsrc = 0, .ndst = 0, .gens = GEN_GE(GEN8) },
>>  };
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
>> b/src/mesa/drivers/dri/i965/brw_eu.h
>> index b057f17..09f51db 100644
>> --- a/src/mesa/drivers/dri/i965/brw_eu.h
>> +++ b/src/mesa/drivers/dri/i965/brw_eu.h
>> @@ -157,6 +157,7 @@ ALU2(OR)
>>  ALU2(XOR)
>>  ALU2(SHR)
>>  ALU2(SHL)
>> +ALU1(DIM)
>>  ALU2(ASR)
>>  ALU1(F32TO16)
>>  ALU1(F16TO32)
>> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
>> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> index 2a8e661..f2f55410 100644
>> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> @@ -1064,6 +1064,7 @@ ALU2(OR)
>>  ALU2(XOR)
>>  ALU2(SHR)
>>  ALU2(SHL)
>> +ALU1(DIM)
>>  ALU2(ASR)
>>  ALU1(FRC)
>>  ALU1(RNDD)
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h 
>> b/src/mesa/drivers/dri/i965/brw_fs_builder.h
>> index f22903e..8e43484 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_builder.h
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
>> @@ -460,6 +460,7 @@ namespace brw {
>>ALU1(CBIT)
>>ALU2(CMPN)
>>ALU3(CSEL)
>> +  ALU1(DIM)
>>ALU2(DP2)
>>ALU2(DP3)
>>ALU2(DP4)
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> index ce1ec0a..ba213b1 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> @@ -2082,6 +2082,13 @@ fs_generator::generate_code(const cfg_t *cfg, int 
>> dispatch_width)
>>  generate_barrier(inst, src[0]);
>>  break;
>>
>> +  case BRW_OPCODE_DIM:
>> + assert(devinfo->is_haswell);
>> + assert(src[0].type == BRW_REGISTER_TYPE_F);
> 
> As I say in reply to PATCH 3/3, I think it's better to use type DF for
> src0 in the IR, and just fix the type to F here in the generator.
> 
> I would just assert that src[0].type is DF, and then...
> 
>> + assert(dst.type == BRW_REGISTER_TYPE_DF);
>> + brw_DIM(p, dst, src[0]);
> 
>brw_DIM(p, dst, retype(src[0], BRW_REGISTER_TYPE_F));
> 
>> +break;
> 
> The indentation looks wrong here.
> 

Right. Thanks for the comments!

Sam

>> +
>>default:
>>   unreachable("Unsupported opcode");
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
>> b/src/mesa/drivers/dri/i965/brw_vec4.h
>> index 76dea04..3043147 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
>> @@ -213,6 +213,8 @@ public:
>> EMIT3(MAD)
>> EMIT2(ADDC)
>> EMIT2(SUBB)
>> +   EMIT1(DIM)
>> +
>>  #undef EMIT1
>>  #undef EMIT2
>>  #undef EMIT3
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_builder.h 
>> b/src/mesa/drivers/dri/i965/brw_vec4_builder.h
>> index 3a8617e..d25a87a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_builder.h
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_builder.h
>> @@ -373,6 +373,7 @@ namespace brw {
>>ALU1(CBIT)
>>ALU2(CMPN)
>>ALU3(CSEL)
>> + 
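
The opcode table change quoted above relies on opcode 10 meaning different things on different hardware generations. A minimal C sketch of that kind of lookup follows. It is an assumption-laden model, not the code in brw_eu.c: the `enum gen` values, the `GEN_LT`/`GEN_GE` definitions and `lookup_opcode_10()` are hypothetical and only mirror the bitmask style visible in the quoted diff.

```c
#include <stddef.h>

/* Hypothetical generation bits, one bit per hardware generation. */
enum gen { GEN7 = 1 << 0, GEN75 = 1 << 1, GEN8 = 1 << 2, GEN9 = 1 << 3 };

/* Assumed helper macros: all bits below gen, and gen plus everything above. */
#define GEN_LT(gen) ((gen) - 1)
#define GEN_GE(gen) (~GEN_LT(gen))

struct opcode_desc {
   const char *name;
   int nsrc;
   int ndst;
   int gens;   /* bitmask of generations where this meaning applies */
};

/* Two meanings for the same opcode number, selected by generation. */
static const struct opcode_desc opcode_10_descs[] = {
   { "dim",  1, 1, GEN75 },
   { "smov", 0, 0, GEN_GE(GEN8) },
};

/* Return the descriptor valid for the given generation, or NULL. */
static const struct opcode_desc *
lookup_opcode_10(int gen)
{
   size_t n = sizeof(opcode_10_descs) / sizeof(opcode_10_descs[0]);

   for (size_t i = 0; i < n; i++) {
      if (opcode_10_descs[i].gens & gen)
         return &opcode_10_descs[i];
   }
   return NULL;
}
```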

Re: [Mesa-dev] [PATCH] isl/state: Divide the aux qpitch by 2

2016-07-13 Thread Pohjolainen, Topi

The subject says: "isl/state: Divide the aux qpitch by 2". It should say
"divide by 4" or "shift by 2".

Otherwise:

Reviewed-by: Topi Pohjolainen 
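
To make the arithmetic explicit: a right shift by 2 divides an unsigned value by 4, which matches a field expressed in multiples of 4. A tiny C sketch, where `qpitch_field_from_rows()` is a hypothetical helper name and not an isl function:

```c
#include <stdint.h>

/* Hypothetical helper: convert an array pitch given in rows into a
 * hardware field that is expressed in multiples of 4 rows. */
static uint32_t
qpitch_field_from_rows(uint32_t pitch_rows)
{
   return pitch_rows >> 2;   /* same as pitch_rows / 4 for unsigned values */
}
```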

On Wed, Jul 13, 2016 at 04:45:09PM -0700, Jason Ekstrand wrote:
> The field is in multiples of 4 like regular QPitch.
> ---
>  src/intel/isl/isl_surface_state.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/intel/isl/isl_surface_state.c 
> b/src/intel/isl/isl_surface_state.c
> index d40f2c1..1c656c9 100644
> --- a/src/intel/isl/isl_surface_state.c
> +++ b/src/intel/isl/isl_surface_state.c
> @@ -445,7 +445,7 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
> void *state,
> * in units of samples on the main surface.
> */
>s.AuxiliarySurfaceQPitch =
> - isl_surf_get_array_pitch_sa_rows(info->aux_surf);
> + isl_surf_get_array_pitch_sa_rows(info->aux_surf) >> 2;
>s.AuxiliarySurfaceBaseAddress = info->aux_address;
>s.AuxiliarySurfaceMode = isl_to_gen_aux_mode[info->aux_usage];
>  #else
> -- 
> 2.5.0.400.gff86faf
> 


Re: [Mesa-dev] [PATCH mesa v2] vl: fix memory leak

2016-07-13 Thread Nayan Deshmukh
Reviewed-by: Nayan Deshmukh 


On Thu, Jul 14, 2016 at 3:20 AM, Eric Engestrom  wrote:

> CovID: 1363008
> Signed-off-by: Eric Engestrom 
> ---
>
> v2: avoid using malloc() altogether (Christian König)
>
> ---
>  src/gallium/auxiliary/vl/vl_bicubic_filter.c | 8 +---
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/src/gallium/auxiliary/vl/vl_bicubic_filter.c
> b/src/gallium/auxiliary/vl/vl_bicubic_filter.c
> index 25bc58c..51a0019 100644
> --- a/src/gallium/auxiliary/vl/vl_bicubic_filter.c
> +++ b/src/gallium/auxiliary/vl/vl_bicubic_filter.c
> @@ -242,7 +242,7 @@ vl_bicubic_filter_init(struct vl_bicubic_filter
> *filter, struct pipe_context *pi
>  {
> struct pipe_rasterizer_state rs_state;
> struct pipe_blend_state blend;
> -   struct vertex2f *offsets = NULL;
> +   struct vertex2f offsets[16];
> struct pipe_sampler_state sampler;
> struct pipe_vertex_element ve;
> unsigned i;
> @@ -301,10 +301,6 @@ vl_bicubic_filter_init(struct vl_bicubic_filter
> *filter, struct pipe_context *pi
> if (!filter->ves)
>goto error_ves;
>
> -   offsets = MALLOC(sizeof(struct vertex2f) * 16);
> -   if (!offsets)
> -  goto error_offsets;
> -
> offsets[0].x = -1.0f; offsets[0].y = -1.0f;
> offsets[1].x = 0.0f; offsets[1].y = -1.0f;
> offsets[2].x = 1.0f; offsets[2].y = -1.0f;
> @@ -344,8 +340,6 @@ vl_bicubic_filter_init(struct vl_bicubic_filter
> *filter, struct pipe_context *pi
> pipe->delete_vs_state(pipe, filter->vs);
>
>  error_vs:
> -
> -error_offsets:
> pipe->delete_vertex_elements_state(pipe, filter->ves);
>
>  error_ves:
> --
> 2.9.0
>
>
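
The pattern used in v2 can be sketched in isolation: a fixed-size array with automatic storage needs no free() on any exit path, so early returns cannot leak it. The function below is a hypothetical stand-in, not the real vl_bicubic_filter_init(); the 4x4 offset grid is an assumption extrapolated from the first three entries visible in the patch.

```c
#include <stdbool.h>

struct vertex2f { float x, y; };

/* Hypothetical init routine with an early-exit error path. */
static bool
filter_init(bool fail_later, float *sum_out)
{
   struct vertex2f offsets[16];   /* automatic storage: released on every return */
   unsigned i;

   /* Assumed 4x4 grid starting at (-1, -1). */
   for (i = 0; i < 16; i++) {
      offsets[i].x = (float)(i % 4) - 1.0f;
      offsets[i].y = (float)(i / 4) - 1.0f;
   }

   if (fail_later)
      return false;   /* no free() needed, nothing to leak */

   *sum_out = 0.0f;
   for (i = 0; i < 16; i++)
      *sum_out += offsets[i].x + offsets[i].y;

   return true;
}
```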


[Mesa-dev] [PATCH v2] egl/dri2: dri2_make_current: Set EGL error if bindContext fails

2016-07-13 Thread Nicolas Boichat
From: Nicolas Boichat 

Without this, if a configuration is, say, available only on GLES2/3, but
not on GLES1, and is rejected by the dri module's bindContext call,
eglMakeCurrent fails with error "EGL_SUCCESS".

In this patch, we set error to EGL_BAD_MATCH, which is what CTS/dEQP
dEQP-EGL.functional.surfaceless_context expects.

Cc: "11.2 12.0" 
Signed-off-by: Nicolas Boichat 
Reviewed-by: Emil Velikov 
---
 src/egl/drivers/dri2/egl_dri2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index bfde640..3cbdd0a 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -1231,6 +1231,7 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, 
_EGLSurface *dsurf,
   _eglPutSurface(old_rsurf);
   _eglPutContext(old_ctx);
 
+  _eglError(EGL_BAD_MATCH, "eglMakeCurrent error");
   return EGL_FALSE;
}
 }
-- 
2.8.0.rc3.226.g39d4020
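
As a standalone illustration of the problem being fixed: if a failure path returns false without recording an error, the caller later reads back "success" as the error code. The C sketch below uses hypothetical names (`ERR_SUCCESS`, `ERR_BAD_MATCH`, `last_error`, `record_error()`, `make_current()`) and is not the egl_dri2.c code.

```c
#include <stdbool.h>

/* Hypothetical error codes and a single error slot. */
enum { ERR_SUCCESS = 0, ERR_BAD_MATCH = 1 };

static int last_error = ERR_SUCCESS;

/* Record an error code so the caller can query it afterwards. */
static void
record_error(int code)
{
   last_error = code;
}

/* Hypothetical make-current: bind_ok models the driver's bind call. */
static bool
make_current(bool bind_ok)
{
   last_error = ERR_SUCCESS;

   if (!bind_ok) {
      /* Without this call the function would fail with ERR_SUCCESS. */
      record_error(ERR_BAD_MATCH);
      return false;
   }

   return true;
}
```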



Re: [Mesa-dev] V3 On disk shader cache for i965 (Now with real world results!)

2016-07-13 Thread Timothy Arceri
On Wed, 2016-07-13 at 23:51 +0300, Grazvydas Ignotas wrote:
> On Wed, Jul 13, 2016 at 2:56 AM, Timothy Arceri
>  wrote:
> > On Sat, 2016-07-09 at 20:21 +0300, Grazvydas Ignotas wrote:
> > > 
> > > I think I still have some more:
> > > - running 32bit program after 64bit version of the same thing (or
> > > vice
> > > versa) leads to segfaults and assert hits. Probably easiest to
> > > build
> > > 32bit and 64bit apitrace and play some traces.
> > > - a trace from comment 2 of bug 96624 doesn't seem to render on
> > > cache.
> > 
> > I can't seem to reproduce this. I created some traces using the 32-
> > bit
> > and 64-bit versions of The Talos Principle but they seems to run
> > with
> > no issues. I've pushed updates to the shader-cache branch maybe I
> > fixed
> > something.
> 
> Still crashes here. What I do:
> 1. build 64bit and 32bit versions of apitrace
> 2. get the trace from bug 96425
> 3. rm -rf ~/.cache/mesa/
> 4. MESA_GLSL_CACHE_ENABLE=1 build64/glretrace -b Talos_flicker2.trace
> 5. MESA_GLSL_CACHE_ENABLE=1 build32/glretrace -b Talos_flicker2.trace

OK, so I had used different commits for the 32-bit and 64-bit builds,
which caused the shader cache to fall back and recompile the shaders, so
I didn't hit the errors. Whoops.

Anyway, I've pushed a fix to the shader-cache branch.

Thanks,
Tim

> 
> Program received signal SIGSEGV, Segmentation fault.
> 0xf751b8cb in read_atomic_buffers (prog=,
> metadata=0xba34) at glsl/shader_cache.cpp:405
> 405*stage_buff_list[j] = &prog->AtomicBuffers[i];
> (gdb) bt
> #0  0xf751b8cb in read_atomic_buffers (prog=,
> metadata=0xba34) at glsl/shader_cache.cpp:405
> #1  shader_cache_read_program_metadata (ctx=0xf6d18020,
> prog=0x88ac610) at glsl/shader_cache.cpp:1283
> #2  0xf748eeaf in link_shaders (ctx=0xf6d18020, prog=0x88ac610,
> is_cache_fallback=false) at glsl/linker.cpp:4541
> #3  0xf73d289c in _mesa_glsl_link_shader (ctx=0xf6d18020,
> prog=0x88ac610, is_cache_fallback=false)
> at program/ir_to_mesa.cpp:3073
> #4  0xf72c682a in _mesa_link_program (ctx=0xf6d18020,
> shProg=0x88ac610) at main/shaderapi.c:1099
> #5  0xf72c7e47 in _mesa_LinkProgram (programObj=14) at
> main/shaderapi.c:1603
> #6  0x08264000 in _get_glLinkProgram(unsigned int) ()
> #7  0x0810ef17 in retrace_glLinkProgram(trace::Call&) ()
> 
> Gražvydas


[Mesa-dev] [PATCH 2/3] i915/sync: Implement DRI2_Fence extension

2016-07-13 Thread Mauro Rossi
This is a port of the corresponding i965 patch,
i.e. commit c636284 i965/sync: Implement DRI2_Fence extension

Part of the original commit message by Chad Versace follows:

"This enables EGL_KHR_fence_sync and EGL_KHR_wait_sync."

Cc: "12.0" 
---
 src/mesa/drivers/dri/i915/intel_screen.c  |   1 +
 src/mesa/drivers/dri/i915/intel_screen.h  |   1 +
 src/mesa/drivers/dri/i915/intel_syncobj.c | 180 +-
 3 files changed, 152 insertions(+), 30 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/intel_screen.c 
b/src/mesa/drivers/dri/i915/intel_screen.c
index 77af328..fa96c6d 100644
--- a/src/mesa/drivers/dri/i915/intel_screen.c
+++ b/src/mesa/drivers/dri/i915/intel_screen.c
@@ -786,6 +786,7 @@ static const __DRI2rendererQueryExtension 
intelRendererQueryExtension = {
 
 static const __DRIextension *intelScreenExtensions[] = {
 &intelTexBufferExtension.base,
+&intelFenceExtension.base,
 &intelFlushExtension.base,
 &intelImageExtension.base,
 &intelRendererQueryExtension.base,
diff --git a/src/mesa/drivers/dri/i915/intel_screen.h 
b/src/mesa/drivers/dri/i915/intel_screen.h
index 3518572..891894d 100644
--- a/src/mesa/drivers/dri/i915/intel_screen.h
+++ b/src/mesa/drivers/dri/i915/intel_screen.h
@@ -162,6 +162,7 @@ extern void intelDestroyContext(__DRIcontext * 
driContextPriv);
 extern GLboolean intelUnbindContext(__DRIcontext * driContextPriv);
 
 const __DRIextension **__driDriverGetExtensions_i915(void);
+extern const __DRI2fenceExtension intelFenceExtension;
 
 extern GLboolean
 intelMakeCurrent(__DRIcontext * driContextPriv,
diff --git a/src/mesa/drivers/dri/i915/intel_syncobj.c 
b/src/mesa/drivers/dri/i915/intel_syncobj.c
index 33f2fd4..18d1546 100644
--- a/src/mesa/drivers/dri/i915/intel_syncobj.c
+++ b/src/mesa/drivers/dri/i915/intel_syncobj.c
@@ -25,11 +25,11 @@
  *
  */
 
-/** @file intel_syncobj.c
+/**
+ * \file
+ * \brief Support for GL_ARB_sync and EGL_KHR_fence_sync.
  *
- * Support for ARB_sync
- *
- * ARB_sync is implemented by flushing the current batchbuffer and keeping a
+ * GL_ARB_sync is implemented by flushing the current batchbuffer and keeping a
  * reference on it.  We can then check for completion or wait for completion
  * using the normal buffer object mechanisms.  This does mean that if an
  * application is using many sync objects, it will emit small batchbuffers
@@ -45,13 +45,94 @@
 #include "intel_batchbuffer.h"
 #include "intel_reg.h"
 
+struct intel_fence {
+   /** The fence waits for completion of this batch. */
+   drm_intel_bo *batch_bo;
+
+   bool signalled;
+};
+
 struct intel_gl_sync_object {
struct gl_sync_object Base;
-
-   /** Batch associated with this sync object */
-   drm_intel_bo *bo;
+   struct intel_fence fence;
 };
 
+static void
+intel_fence_finish(struct intel_fence *fence)
+{
+   if (fence->batch_bo)
+  drm_intel_bo_unreference(fence->batch_bo);
+}
+
+static void
+intel_fence_insert(struct intel_context *intel, struct intel_fence *fence)
+{
+   assert(!fence->batch_bo);
+   assert(!fence->signalled);
+
+   intel_batchbuffer_emit_mi_flush(intel);
+   fence->batch_bo = intel->batch.bo;
+   drm_intel_bo_reference(fence->batch_bo);
+   intel_batchbuffer_flush(intel);
+}
+
+static bool
+intel_fence_has_completed(struct intel_fence *fence)
+{
+   if (fence->signalled)
+  return true;
+
+   if (fence->batch_bo && !drm_intel_bo_busy(fence->batch_bo)) {
+  drm_intel_bo_unreference(fence->batch_bo);
+  fence->batch_bo = NULL;
+  fence->signalled = true;
+  return true;
+   }
+
+   return false;
+}
+
+/**
+ * Return true if the function successfully signals or has already signalled.
+ * (This matches the behavior expected from __DRI2fence::client_wait_sync).
+ */
+static bool
+intel_fence_client_wait(struct intel_context *intel, struct intel_fence *fence,
+  uint64_t timeout)
+{
+   if (fence->signalled)
+  return true;
+
+   assert(fence->batch_bo);
+
+   /* DRM_IOCTL_I915_GEM_WAIT uses a signed 64 bit timeout and returns
+* immediately for timeouts <= 0.  The best we can do is to clamp the
+* timeout to INT64_MAX.  This limits the maximum timeout from 584 years to
+* 292 years - likely not a big deal.
+*/
+   if (timeout > INT64_MAX)
+  timeout = INT64_MAX;
+
+   if (drm_intel_gem_bo_wait(fence->batch_bo, timeout) != 0)
+  return false;
+
+   fence->signalled = true;
+   drm_intel_bo_unreference(fence->batch_bo);
+   fence->batch_bo = NULL;
+
+   return true;
+}
+
+static void
+intel_fence_server_wait(struct intel_context *intel, struct intel_fence *fence)
+{
+   /* We have nothing to do for WaitSync.  Our GL command stream is sequential,
+* so given that the sync object has already flushed the batchbuffer, any
+* batchbuffers coming after this waitsync will naturally not occur until
+* the previous one is done.
+*/
+}
+
 static struct gl_sync_object *
 intel_gl_new_sync_object(struct gl_context *ctx, GLuint id)
 {
@@ -69,9 +1
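
The timeout clamp in intel_fence_client_wait() above can be looked at on its own. The sketch below is a minimal, assumed restatement in plain C; `clamp_wait_timeout()` is a hypothetical name. An unsigned 64-bit timeout is clamped to INT64_MAX before being handed to an interface that takes a signed 64-bit value, which halves the representable range (the 584 versus 292 years mentioned in the comment, for a timeout in nanoseconds).

```c
#include <stdint.h>

/* Hypothetical helper: clamp an unsigned timeout so that it stays
 * positive when converted to a signed 64-bit value. */
static int64_t
clamp_wait_timeout(uint64_t timeout_ns)
{
   if (timeout_ns > (uint64_t)INT64_MAX)
      timeout_ns = INT64_MAX;

   return (int64_t)timeout_ns;
}
```

Without the clamp, a very large unsigned timeout would turn negative after conversion and the wait would return immediately.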

[Mesa-dev] [PATCH 3/3] i915: store reference to the context within struct intel_fence

2016-07-13 Thread Mauro Rossi
This is a port of the corresponding experimental i965 patch,
i.e. commit 67adb45 in external/mesa project branch x86/marshmallow-x86
in the Android-x86 repo.

Here follows the original commit message by Emil Velikov:

"As the spec allows for {server,client}_wait_sync to be called without
currently bound context, while our implementation requires context pointer.

Untested"

The changes have since been tested for some months.

Cc: "12.0" 
---
 src/mesa/drivers/dri/i915/intel_syncobj.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/intel_syncobj.c 
b/src/mesa/drivers/dri/i915/intel_syncobj.c
index 18d1546..39bb9ee 100644
--- a/src/mesa/drivers/dri/i915/intel_syncobj.c
+++ b/src/mesa/drivers/dri/i915/intel_syncobj.c
@@ -46,6 +46,7 @@
 #include "intel_reg.h"
 
 struct intel_fence {
+   struct intel_context *intel;
/** The fence waits for completion of this batch. */
drm_intel_bo *batch_bo;
 
@@ -215,6 +216,7 @@ intel_dri_create_fence(__DRIcontext *ctx)
if (!fence)
   return NULL;
 
+   fence->intel = intel;
intel_fence_insert(intel, fence);
 
return fence;
@@ -233,19 +235,17 @@ static GLboolean
 intel_dri_client_wait_sync(__DRIcontext *ctx, void *driver_fence, unsigned 
flags,
uint64_t timeout)
 {
-   struct intel_context *intel = ctx->driverPrivate;
struct intel_fence *fence = driver_fence;
 
-   return intel_fence_client_wait(intel, fence, timeout);
+   return intel_fence_client_wait(fence->intel, fence, timeout);
 }
 
 static void
 intel_dri_server_wait_sync(__DRIcontext *ctx, void *driver_fence, unsigned 
flags)
 {
-   struct intel_context *intel = ctx->driverPrivate;
struct intel_fence *fence = driver_fence;
 
-   intel_fence_server_wait(intel, fence);
+   intel_fence_server_wait(fence->intel, fence);
 }
 
 const __DRI2fenceExtension intelFenceExtension = {
-- 
2.7.4
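
The design choice in this patch can be shown with a small self-contained C sketch. The names (`struct context`, `struct fence`, `fence_init()`, `fence_wait_context()`) are hypothetical and not the i915 driver types: the fence remembers the context that created it, so a later wait does not depend on whichever context (possibly none) is current.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver context and fence. */
struct context { int id; };

struct fence {
   struct context *ctx;   /* context that created the fence */
   bool signalled;
};

/* Store the creating context at fence creation time. */
static void
fence_init(struct fence *f, struct context *creator)
{
   f->ctx = creator;
   f->signalled = false;
}

/* Pick the context to use for a wait. 'current' may be NULL when no
 * context is bound, so it is deliberately ignored. */
static struct context *
fence_wait_context(struct context *current, const struct fence *f)
{
   (void)current;
   return f->ctx;
}
```

A trade-off of this sketch is that the fence must not outlive the context it points to; the sketch does not address that lifetime question.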



[Mesa-dev] [PATCH 1/3] i915/sync: Replace prefix 'intel_sync' -> 'intel_gl_sync'

2016-07-13 Thread Mauro Rossi
This is a port of the corresponding i965 patch,
i.e. commit 2516d83 i965/sync: Replace prefix 'intel_sync' -> 'intel_gl_sync'

The only difference from the i965 patch is that intel_check_sync() is renamed
to intel_gl_check_sync() here, as that is more appropriate.

The original commit message by Chad Versace follows:

"I'm about to implement DRI2_Fence in intel_syncobj.c.  To prevent
madness, we need to prefix functions for GL_ARB_sync with 'gl' and
functions for DRI2_Fence with 'dri'. Otherwise, the file will become
a jumble of similarly named functions.

For example:
old-name:  intel_client_wait_sync()
new-name:  intel_gl_client_wait_sync()
soon-to-come:  intel_dri_client_wait_sync()

I wrote this renaming commit separately from the commit that implements
DRI2_Fence because I wanted the latter diff to be reviewable."

Cc: "12.0" 
---
 src/mesa/drivers/dri/i915/intel_context.h |  7 -
 src/mesa/drivers/dri/i915/intel_syncobj.c | 49 +++
 2 files changed, 30 insertions(+), 26 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/intel_context.h 
b/src/mesa/drivers/dri/i915/intel_context.h
index 39b328a..5832169 100644
--- a/src/mesa/drivers/dri/i915/intel_context.h
+++ b/src/mesa/drivers/dri/i915/intel_context.h
@@ -104,13 +104,6 @@ extern void intelFallback(struct intel_context *intel, 
GLbitfield bit,
 #endif
 #endif
 
-struct intel_sync_object {
-   struct gl_sync_object Base;
-
-   /** Batch associated with this sync object */
-   drm_intel_bo *bo;
-};
-
 struct intel_batchbuffer {
/** Current batchbuffer being queued up. */
drm_intel_bo *bo;
diff --git a/src/mesa/drivers/dri/i915/intel_syncobj.c 
b/src/mesa/drivers/dri/i915/intel_syncobj.c
index 92b5b63..33f2fd4 100644
--- a/src/mesa/drivers/dri/i915/intel_syncobj.c
+++ b/src/mesa/drivers/dri/i915/intel_syncobj.c
@@ -45,12 +45,19 @@
 #include "intel_batchbuffer.h"
 #include "intel_reg.h"
 
+struct intel_gl_sync_object {
+   struct gl_sync_object Base;
+
+   /** Batch associated with this sync object */
+   drm_intel_bo *bo;
+};
+
 static struct gl_sync_object *
-intel_new_sync_object(struct gl_context *ctx, GLuint id)
+intel_gl_new_sync_object(struct gl_context *ctx, GLuint id)
 {
-   struct intel_sync_object *sync;
+   struct intel_gl_sync_object *sync;
 
-   sync = calloc(1, sizeof(struct intel_sync_object));
+   sync = calloc(1, sizeof(*sync));
if (!sync)
   return NULL;
 
@@ -58,9 +65,9 @@ intel_new_sync_object(struct gl_context *ctx, GLuint id)
 }
 
 static void
-intel_delete_sync_object(struct gl_context *ctx, struct gl_sync_object *s)
+intel_gl_delete_sync_object(struct gl_context *ctx, struct gl_sync_object *s)
 {
-   struct intel_sync_object *sync = (struct intel_sync_object *)s;
+   struct intel_gl_sync_object *sync = (struct intel_gl_sync_object *)s;
 
if (sync->bo)
   drm_intel_bo_unreference(sync->bo);
@@ -69,11 +76,11 @@ intel_delete_sync_object(struct gl_context *ctx, struct 
gl_sync_object *s)
 }
 
 static void
-intel_fence_sync(struct gl_context *ctx, struct gl_sync_object *s,
+intel_gl_fence_sync(struct gl_context *ctx, struct gl_sync_object *s,
   GLenum condition, GLbitfield flags)
 {
struct intel_context *intel = intel_context(ctx);
-   struct intel_sync_object *sync = (struct intel_sync_object *)s;
+   struct intel_gl_sync_object *sync = (struct intel_gl_sync_object *)s;
 
assert(condition == GL_SYNC_GPU_COMMANDS_COMPLETE);
intel_batchbuffer_emit_mi_flush(intel);
@@ -84,10 +91,11 @@ intel_fence_sync(struct gl_context *ctx, struct 
gl_sync_object *s,
intel_flush(ctx);
 }
 
-static void intel_client_wait_sync(struct gl_context *ctx, struct 
gl_sync_object *s,
+static void
+intel_gl_client_wait_sync(struct gl_context *ctx, struct gl_sync_object *s,
 GLbitfield flags, GLuint64 timeout)
 {
-   struct intel_sync_object *sync = (struct intel_sync_object *)s;
+   struct intel_gl_sync_object *sync = (struct intel_gl_sync_object *)s;
 
if (sync->bo && drm_intel_gem_bo_wait(sync->bo, timeout) == 0) {
   s->StatusFlag = 1;
@@ -101,14 +109,16 @@ static void intel_client_wait_sync(struct gl_context 
*ctx, struct gl_sync_object
  * any batchbuffers coming after this waitsync will naturally not occur until
  * the previous one is done.
  */
-static void intel_server_wait_sync(struct gl_context *ctx, struct 
gl_sync_object *s,
+static void
+intel_gl_server_wait_sync(struct gl_context *ctx, struct gl_sync_object *s,
 GLbitfield flags, GLuint64 timeout)
 {
 }
 
-static void intel_check_sync(struct gl_context *ctx, struct gl_sync_object *s)
+static void
+intel_gl_check_sync(struct gl_context *ctx, struct gl_sync_object *s)
 {
-   struct intel_sync_object *sync = (struct intel_sync_object *)s;
+   struct intel_gl_sync_object *sync = (struct intel_gl_sync_object *)s;
 
if (sync->bo && !drm_intel_bo_busy(sync->bo)) {
   drm_intel_bo_unreference(sync->bo);
@@ -117,12 +127,13 @@ sta

[Mesa-dev] i915: Enable EGL_KHR_{fence,wait}_sync

2016-07-13 Thread Mauro Rossi
These patches implement the DRI2_Fence extension for i915,
to enable EGL_KHR_fence_sync and EGL_KHR_wait_sync.

This is a step-by-step port of i965/sync to i915,
plus a patch to support {server,client}_wait_sync without a bound context.

[PATCH 1/3] i915/sync: Replace prefix 'intel_sync' -> 'intel_gl_sync'
[PATCH 2/3] i915/sync: Implement DRI2_Fence extension
[PATCH 3/3] i915: store reference to the context within struct intel_fence

The motivation is to support (web)chromium, which requires
EGL_KHR_{fence,wait}_sync.
Tested with a marshmallow-x86 build on an Asus Eee PC 1015PEM.

Spotted by Paulo Travaglia 

Cc: "12.0" 


Re: [Mesa-dev] [PATCH] egl/dri2: dri2_make_current: Set EGL error if bindContext fails

2016-07-13 Thread Nicolas Boichat
Hi,

On Wed, Jul 13, 2016 at 11:21 PM, Emil Velikov  wrote:
> On 7 June 2016 at 11:14, Nicolas Boichat  wrote:
>> Without this, if a configuration is, say, available only on GLES2/3, but
>> not on GLES1, eglMakeCurrent fails with error "EGL_SUCCESS".
>>
>> In this patch, we set error to EGL_BAD_MATCH, which is what CTS/dEQP
>> dEQP-EGL.functional.surfaceless_context expect.
>>
> Since all the EGL_KHR_surfaceless_context particulars are/should be
> handled by _eglBindContext(_eglCheckMakeCurrent actually), this patch
> covers the case when the dri module fails in bindContext(), correct ?
> Can you please mention that in the commit message.

Yes, that's correct. The problem here is a failure in the dri module,
and since it is a dri module, we can't report the EGL error there.

> Please add the stable tag:
> Cc: "11.2 12.0" 
>
>> Signed-off-by: Nicolas Boichat 
>> ---
>>  src/egl/drivers/dri2/egl_dri2.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/src/egl/drivers/dri2/egl_dri2.c 
>> b/src/egl/drivers/dri2/egl_dri2.c
>> index bfde640..1a38421 100644
>> --- a/src/egl/drivers/dri2/egl_dri2.c
>> +++ b/src/egl/drivers/dri2/egl_dri2.c
>> @@ -1231,6 +1231,7 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, 
>> _EGLSurface *dsurf,
>>_eglPutSurface(old_rsurf);
>>_eglPutContext(old_ctx);
>>
>> +  _eglError(EGL_BAD_MATCH, "bindContext error");
> Please use "eglMakeCurrent" as error string.
>
> Related: the error paths looks a bit confusing so any ideas how to
> untangle this will be appreciated. Not a requirement for this to land
> though.

Yes, and we are hitting other issues with this specific function, when
dri2_dpy is NULL (if you call eglTerminate() and then
eglReleaseThread(), like some CTS tests do). So we'll need to add more
sanity checks, and error handling will need to be reworked. We'll look
at it in a follow-up patch.

> Considering my understanding is correct, with the above two
> suggestions the patch is:
> Reviewed-by: Emil Velikov 

Will fix and respin, thanks.

Best,


Re: [Mesa-dev] [PATCH] gallium/dri: Add shared glapi to LIBADD on Android

2016-07-13 Thread Tomasz Figa
Hi Emil,

On Thu, Jul 14, 2016 at 1:28 AM, Emil Velikov  wrote:
> On 13 July 2016 at 04:29, Nicolas Boichat  wrote:
>> From: Tomasz Figa 
>>
>> An earlier patch fixed the problem for classic drivers, however Gallium
>> was still left broken. This patch applies the same workaround to
>> Gallium, when compiled for Android. Following is a quote from the
>> original patch:
>>
>> 0cbc90c57cfc mesa: dri: Add shared glapi to LIBADD on Android
>>
>> /system/vendor/lib/dri/*_dri.so actually depend on libglapi: without
>> this, loading the so file fails with:
>> cannot locate symbol "__emutls_v._glapi_tls_Context"
>>
>> On non-Android (non-bionic) platform, EGL uses the following
>> workflow, which works fine:
>>   dlopen("libglapi.so", RTLD_LAZY | RTLD_GLOBAL);
>>   dlopen("dri/_dri.so", RTLD_NOW | RTLD_GLOBAL);
>>
>> However, bionic does not respect the RTLD_GLOBAL flag, and the dri
>> library cannot find symbols in libglapi.so, so we need to link
>> to libglapi.so explicitly. Android.mk already does this.
>>
> I believe we want to have this alongside the classic patch, thus
> Cc: "12.0" 
>
>> Signed-off-by: Tomasz Figa 
>> Signed-off-by: Nicolas Boichat 
> For this and the "Remove unused variables" patch
> Reviewed-by: Emil Velikov 
>
> Out of curiosity: which driver was this tested with/against ?

softpipe with the kms_swrast target. I have a patch that adds a fallback
to it in the Android EGL platform.

>
> Humble suggestion: if you re-spin the third_party/mesa, arc branch(es)
> against 12.0 you'll see that many of your local patches are
> merged/superseded. From a brief look most/all of the remaining are
> also suitable for upstream, albeit they might need a bit of polish.

I'm working on this (together with the getBuffers return value check
for i915 you requested). :) Just my time budget has been a little
tight lately.

We try to keep our stuff rebased on reasonably current Mesa master,
making sure that there are no significant dEQP regressions. You can
find our last rebase to 12.1.0-devel here
https://chromium-review.googlesource.com/#/c/358315/ , but I'm working
on polishing things up and sending to mailing lists.

Things are a bit tricky because the gralloc supported in upstream Mesa
currently provides some custom APIs, such as
gralloc_drm_get_gem_handle() or GRALLOC_MODULE_PERFORM_GET_DRM_FD.
Ours doesn't have any of this custom stuff and supports only sharing
buffers by PRIME. It also doesn't share the DRI file descriptor with
the client, because we found this was not behaving correctly on i965
(at least; we don't use other drivers) with gralloc stepping over
Mesa in certain conditions. To sum up, with our gralloc we don't
include gralloc_drm*.h from Mesa and need to enumerate and open the
respective render node from the Android EGL platform backend.

Also, Android requires the drivers to support xBGR visuals, while the
last attempt to enable them in gallium broke some X11 apps. We need to
figure out how to add them in such a way that those apps are unaffected
(even if they are broken by design, because they expect a certain
channel order without requesting it).

>
> Let us know how we can help out with those and/or the .pc business
> suggested earlier.

I guess it would be useful if we could chat on IRC, but I suppose time
zone difference could make this a bit difficult. Could you let me know
your typical online hours?

Best regards,
Tomasz


Re: [Mesa-dev] [PATCH 08/12] st/va: add functions for VAAPI encode

2016-07-13 Thread Zhang, Boyuan
As discussed, we will improve this in a separate patch later.


Regards,

Boyuan


From: Christian König 
Sent: July 1, 2016 9:03:13 AM
To: Zhang, Boyuan; mesa-dev@lists.freedesktop.org
Subject: Re: [PATCH 08/12] st/va: add functions for VAAPI encode

Am 30.06.2016 um 20:30 schrieb Boyuan Zhang:
> Signed-off-by: Boyuan Zhang 
> ---
>   src/gallium/state_trackers/va/buffer.c |   6 +
>   src/gallium/state_trackers/va/picture.c| 170 
> -
>   src/gallium/state_trackers/va/va_private.h |   3 +
>   3 files changed, 177 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/state_trackers/va/buffer.c 
> b/src/gallium/state_trackers/va/buffer.c
> index 7d3167b..dfcebbe 100644
> --- a/src/gallium/state_trackers/va/buffer.c
> +++ b/src/gallium/state_trackers/va/buffer.c
> @@ -133,6 +133,12 @@ vlVaMapBuffer(VADriverContextP ctx, VABufferID buf_id, 
> void **pbuff)
> if (!buf->derived_surface.transfer || !*pbuff)
>return VA_STATUS_ERROR_INVALID_BUFFER;
>
> +  if (buf->type == VAEncCodedBufferType) {
> + ((VACodedBufferSegment*)buf->data)->buf = *pbuff;
> + ((VACodedBufferSegment*)buf->data)->size = buf->coded_size;
> + ((VACodedBufferSegment*)buf->data)->next = NULL;
> + *pbuff = buf->data;
> +  }
>  } else {
> pipe_mutex_unlock(drv->mutex);
> *pbuff = buf->data;
> diff --git a/src/gallium/state_trackers/va/picture.c 
> b/src/gallium/state_trackers/va/picture.c
> index 89ac024..26205b1 100644
> --- a/src/gallium/state_trackers/va/picture.c
> +++ b/src/gallium/state_trackers/va/picture.c
> @@ -78,7 +78,8 @@ vlVaBeginPicture(VADriverContextP ctx, VAContextID 
> context_id, VASurfaceID rende
> return VA_STATUS_SUCCESS;
>  }
>
> -   context->decoder->begin_frame(context->decoder, context->target, 
> &context->desc.base);
> +   if (context->decoder->entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE)
> +  context->decoder->begin_frame(context->decoder, context->target, 
> &context->desc.base);
>
>  return VA_STATUS_SUCCESS;
>   }
> @@ -278,6 +279,140 @@ handleVASliceDataBufferType(vlVaContext *context, 
> vlVaBuffer *buf)
> num_buffers, (const void * const*)buffers, sizes);
>   }
>
> +static VAStatus
> +handleVAEncMiscParameterTypeRateControl(vlVaContext *context, 
> VAEncMiscParameterBuffer *misc)
> +{
> +   VAEncMiscParameterRateControl *rc = (VAEncMiscParameterRateControl 
> *)misc->data;
> +   if (context->desc.h264enc.rate_ctrl.rate_ctrl_method ==
> +   PIPE_H264_ENC_RATE_CONTROL_METHOD_CONSTANT)
> +  context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second;
> +   else
> +  context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second * 
> rc->target_percentage;
> +   context->desc.h264enc.rate_ctrl.peak_bitrate = rc->bits_per_second;
> +   if (context->desc.h264enc.rate_ctrl.target_bitrate < 200)
> +  context->desc.h264enc.rate_ctrl.vbv_buffer_size = 
> MIN2((context->desc.h264enc.rate_ctrl.target_bitrate * 2.75), 200);
> +   else
> +  context->desc.h264enc.rate_ctrl.vbv_buffer_size = 
> context->desc.h264enc.rate_ctrl.target_bitrate;
> +
> +   return VA_STATUS_SUCCESS;
> +}
> +
> +static VAStatus
> +handleVAEncSequenceParameterBufferType(vlVaDriver *drv, vlVaContext 
> *context, vlVaBuffer *buf)
> +{
> +   VAEncSequenceParameterBufferH264 *h264 = 
> (VAEncSequenceParameterBufferH264 *)buf->data;
> +   if (!context->decoder) {
> +  context->templat.max_references = h264->max_num_ref_frames;
> +  context->templat.level = h264->level_idc;
> +  context->decoder = drv->pipe->create_video_codec(drv->pipe, 
> &context->templat);
> +  if (!context->decoder)
> + return VA_STATUS_ERROR_ALLOCATION_FAILED;
> +   }
> +   context->desc.h264enc.gop_size = h264->intra_idr_period;
> +   return VA_STATUS_SUCCESS;
> +}
> +
> +static VAStatus
> +handleVAEncMiscParameterBufferType(vlVaContext *context, vlVaBuffer *buf)
> +{
> +   VAStatus vaStatus = VA_STATUS_SUCCESS;
> +   VAEncMiscParameterBuffer *misc;
> +   misc = buf->data;
> +
> +   switch (misc->type) {
> +   case VAEncMiscParameterTypeRateControl:
> +  vaStatus = handleVAEncMiscParameterTypeRateControl(context, misc);
> +  break;
> +
> +   default:
> +  break;
> +   }
> +
> +   return vaStatus;
> +}
> +
> +static VAStatus
> +handleVAEncPictureParameterBufferType(vlVaDriver *drv, vlVaContext *context, 
> vlVaBuffer *buf)
> +{
> +   VAEncPictureParameterBufferH264 *h264;
> +   vlVaBuffer *coded_buf;
> +
> +   h264 = buf->data;
> +   context->desc.h264enc.frame_num = h264->frame_num;
> +   context->desc.h264enc.not_referenced = false;
> +   context->desc.h264enc.is_idr = (h264->pic_fields.bits.idr_pic_flag == 1);
> +   context->desc.h264enc.pic_order_cnt = h264->CurrPic.TopFieldOrderCnt / 2;
> +   if (context->desc.h264enc.is_idr)
> +  context->desc.h264enc.i_remain = 1;
> +   else
> +  context->desc.h264enc.i_remain = 0;
>

Re: [Mesa-dev] [PATCH 06/12] st/va: colorspace conversion when image is yv12 and surface is nv12

2016-07-13 Thread Zhang, Boyuan
Hi Christian,


Style issue is fixed.


Also, I checked the utility function. It seems that the existing yv12 to nv12 
function can't be used for the "copying from image to surface" case, so I added 
a new function to the utility code to do this job. Please see the newly 
submitted patch set.


For the IYUV case, the image is already converted to yv12 (by swapping the u 
and v fields) before the colorspace conversion call, so the IYUV case should 
also work.
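
To make the chroma handling concrete, here is a minimal sketch of the interleave step, assuming tightly packed planes (no row pitch). NV12 stores a single chroma plane with U and V samples alternating, U first, while YV12 and IYUV keep U and V in separate planes. The name interleave_chroma is hypothetical; this is neither u_copy_yv12_to_nv12() nor the new utility function from the patch set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a Mesa function: interleaves separate U and V
 * planes into one NV12-style chroma plane, U first.
 * n is the number of samples in each input plane; dst must hold 2 * n bytes.
 * Assumption: planes are tightly packed, so no row pitch is handled here.
 */
static void
interleave_chroma(uint8_t *dst, const uint8_t *u, const uint8_t *v, size_t n)
{
   assert(dst && u && v);
   for (size_t k = 0; k < n; k++) {
      dst[2 * k]     = u[k];   /* even bytes carry U */
      dst[2 * k + 1] = v[k];   /* odd bytes carry V */
   }
}
```

Since the caller passes the U and V pointers explicitly, a YV12 source (V plane stored before U) and an IYUV source (U before V) differ only in which plane pointer goes into which argument.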


Regards,

Boyuan



From: Christian König 
Sent: July 1, 2016 8:51 AM
To: Zhang, Boyuan; mesa-dev@lists.freedesktop.org
Subject: Re: [PATCH 06/12] st/va: colorspace conversion when image is yv12 and 
surface is nv12

Am 30.06.2016 um 20:30 schrieb Boyuan Zhang:
> Signed-off-by: Boyuan Zhang 
> ---
>   src/gallium/state_trackers/va/image.c | 48 
> +--
>   1 file changed, 40 insertions(+), 8 deletions(-)
>
> diff --git a/src/gallium/state_trackers/va/image.c 
> b/src/gallium/state_trackers/va/image.c
> index 3c8cc9c..1f68169 100644
> --- a/src/gallium/state_trackers/va/image.c
> +++ b/src/gallium/state_trackers/va/image.c
> @@ -499,7 +499,7 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, 
> VAImageID image,
>  VAImage *vaimage;
>  struct pipe_sampler_view **views;
>  enum pipe_format format;
> -   void *data[3];
> +   uint8_t *data[3];
>  unsigned pitches[3], i, j;
>
>  if (!ctx)
> @@ -539,7 +539,9 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, 
> VAImageID image,
> return VA_STATUS_ERROR_OPERATION_FAILED;
>  }
>
> -   if (format != surf->buffer->buffer_format) {
> +   if ((format != surf->buffer->buffer_format) &&
> +  ((format != PIPE_FORMAT_YV12) || (surf->buffer->buffer_format != 
> PIPE_FORMAT_NV12)) &&
> +  ((format != PIPE_FORMAT_IYUV) || (surf->buffer->buffer_format != 
> PIPE_FORMAT_NV12))) {
> struct pipe_video_buffer *tmp_buf;
> struct pipe_video_buffer templat = surf->templat;
>
> @@ -581,12 +583,42 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, 
> VAImageID image,
> unsigned width, height;
> if (!views[i]) continue;
> vlVaVideoSurfaceSize(surf, i, &width, &height);
> -  for (j = 0; j < views[i]->texture->array_size; ++j) {
> - struct pipe_box dst_box = {0, 0, j, width, height, 1};
> - drv->pipe->transfer_inline_write(drv->pipe, views[i]->texture, 0,
> -PIPE_TRANSFER_WRITE, &dst_box,
> -data[i] + pitches[i] * j,
> -pitches[i] * views[i]->texture->array_size, 0);
> +  if ((format == PIPE_FORMAT_YV12) || (format == PIPE_FORMAT_IYUV) &&
> + (surf->buffer->buffer_format == PIPE_FORMAT_NV12) && (i == 1)) {
> + struct pipe_transfer *transfer = NULL;
> + uint8_t *map = NULL;
> + struct pipe_box dst_box_1 = {0, 0, 0, width, height, 1};
> + map = drv->pipe->transfer_map(drv->pipe,
> +   views[i]->texture,
> +   0,
> +   PIPE_TRANSFER_DISCARD_RANGE,
> +   &dst_box_1, &transfer);
> + if (map == NULL)
> +return VA_STATUS_ERROR_OPERATION_FAILED;
> +
> + bool odd = false;
> + for (unsigned int k = 0; k < ((vaimage->offsets[1])/2) ; k++){
> +if (odd == false) {
> +   map[k] = data[i][k/2];
> +   odd = true;
> +}
> +else {
> +   map[k] = data[i+1][k/2];
> +   odd = false;
> +}
> + }
> + pipe_transfer_unmap(drv->pipe, transfer);
> + pipe_mutex_unlock(drv->mutex);
> + return VA_STATUS_SUCCESS;
> +  }
> +  else {

Style issue, the "}" and the "else {" should be on the same line.

Apart from that please use the u_copy_yv12_to_nv12() functions for the
conversion instead of coding it manually.

Also, the code doesn't look like it handles IYUV correctly.

Regards,
Christian.

> + for (j = 0; j < views[i]->texture->array_size; ++j) {
> +struct pipe_box dst_box = {0, 0, j, width, height, 1};
> +drv->pipe->transfer_inline_write(drv->pipe, views[i]->texture, 0,
> +   PIPE_TRANSFER_WRITE, &dst_box,
> +   data[i] + pitches[i] * j,
> +   pitches[i] * views[i]->texture->array_size, 0);
> + }
> }
>  }
>  pipe_mutex_unlock(drv->mutex);



Re: [Mesa-dev] [PATCH] i965: fix compiler warnings for 32bit build

2016-07-13 Thread Matt Turner
Reviewed-by: Matt Turner 


[Mesa-dev] [PATCH] i965: fix compiler warnings for 32bit build

2016-07-13 Thread Timothy Arceri
---
 src/mesa/drivers/dri/i965/brw_state_upload.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
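
For context, a minimal sketch of why the format string changes. "%lx" expects an unsigned long, which is typically 32 bits wide on a 32-bit build, while the bit field being printed is assumed here to be a uint64_t. PRIx64 from <inttypes.h> expands to the matching conversion on either ABI. The helper name format_dirty_bit is hypothetical and not part of Mesa.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not a Mesa function: formats a 64-bit dirty bit and
 * its count into a caller-provided buffer, using the same conversions as
 * the patched fprintf(). Returns what snprintf() returns.
 */
static int
format_dirty_bit(char *buf, size_t len, uint64_t bit, int count)
{
   assert(buf != NULL && len > 0);
   /* PRIx64 is a string literal ("lx", "llx", ...) chosen per platform. */
   return snprintf(buf, len, "0x%016" PRIx64 ": %12d", bit, count);
}
```

The adjacent string literals are concatenated by the compiler, so the resulting format is a single string with the right length modifier for the target.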

diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index 98c62d5..3eb9259 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -668,7 +668,7 @@ brw_print_dirty_count(struct dirty_bit_map *bit_map)
 {
for (int i = 0; bit_map[i].bit != 0; i++) {
   if (bit_map[i].count > 1) {
- fprintf(stderr, "0x%016lx: %12d (%s)\n",
+ fprintf(stderr, "0x%016"PRIx64": %12d (%s)\n",
  bit_map[i].bit, bit_map[i].count, bit_map[i].name);
   }
}
-- 
2.7.4



Re: [Mesa-dev] [PATCH 3/3] i965/fs: emit DIM instruction to load 64-bit immediates in HSW

2016-07-13 Thread Matt Turner
On Wed, Jul 13, 2016 at 5:06 PM, Matt Turner  wrote:
> On Tue, Jul 12, 2016 at 11:42 PM, Samuel Iglesias Gonsálvez
>  wrote:
>> Signed-off-by: Samuel Iglesias Gonsálvez 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index a65c273..bf32dfd 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -4558,6 +4558,18 @@ setup_imm_df(const fs_builder &bld, double v)
>> if (devinfo->gen >= 8)
>>return brw_imm_df(v);
>>
>> +   /* gen7.5 does not support DF immediates straightforwardly but the DIM
>> +* instruction allows to set the 64-bit immediate value.
>> +*/
>> +   if (devinfo->is_haswell) {
>> +  const fs_builder ubld = bld.exec_all();
>> +  fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_DF, 1);
>> +  struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_F);
>> +  imm.df = v;
>> +  ubld.DIM(dst, imm);
>
> I know the hardware is strange and requires that src0's type is F, but
> I don't think we need to model that in the IR. I think that using a DF
> type in the IR
> would require otherwise unnecessary changes to dump_instructions().
>
> With the above three lines changed to just
>
>ubld.DIM(dst, brw_imm_df(v));
>
> this patch is:
>
> Reviewed-by: Matt Turner 
>
> Patch 1 I sent comments on. With those addressed it is also
>
> Reviewed-by: Matt Turner 
>
> I believe with my comments addressed on 1/3 and 3/3 that 2/3 is unnecessary.

Actually, I guess 2/3 is necessary since the type is changed before
brw_eu_emit? Sorry, not thinking very clearly.
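
To illustrate the distinction between the register type tag and the immediate payload discussed above, here is a small sketch. It assumes IEEE 754 doubles, and the struct and function names are hypothetical stand-ins, not Mesa's brw_reg or its helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register model: a type tag plus an immediate payload. */
enum reg_type { REG_TYPE_F, REG_TYPE_DF };

struct imm_reg {
   enum reg_type type;
   union {
      float f;
      double df;
      uint64_t u64;
   } imm;
};

/* Builds an immediate that carries the 64 bits of v while being tagged
 * with an arbitrary type, similar in spirit to tagging the DIM source as F
 * and then storing a double in it.
 */
static struct imm_reg
imm_df_tagged_as(enum reg_type type, double v)
{
   struct imm_reg r = { .type = type };
   r.imm.df = v;
   return r;
}
```

The tag and the payload are independent here: the payload keeps all 64 bits of the double no matter which type is recorded, which is why the tag can be adjusted late, just before the instruction is encoded.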


Re: [Mesa-dev] [PATCH 01/12] vl: add parameters for VAAPI encode

2016-07-13 Thread Zhang, Boyuan
Hi Emil,


Thanks for the suggestion. I added a brief message to each patch to 
explain what it does. Please see the new patch set I just submitted.



Hi Christian,


The unused ref_pic-related definitions have been removed from this patch.


Regarding the concern about the is_idr flag, I checked the behavior of VA-API 
and the code again. VA-API treats both "idr-iframe" and "non-idr-iframe" as an 
i-frame in the picture type, and it uses a separate flag to tell the driver 
whether this i-frame is idr or not. So from the picture type alone, we can't 
tell whether it's idr or not. Since VCE needs this information, I think we 
still need to add this flag.


Regards,

Boyuan


From: Christian König 
Sent: July 1, 2016 8:21 AM
To: Zhang, Boyuan; mesa-dev@lists.freedesktop.org
Subject: Re: [PATCH 01/12] vl: add parameters for VAAPI encode

Hi Boyuan,

as Emil wrote as well, try to add some commit messages to the set. For
this patch something like the following should do it:

Allow specifying more parameters in the encoding interface which were
previously just hardcoded in the encoder.

In addition to that, we need to reorder the patches a bit: first the
interface changes, then the OMX changes to fill in the previously
hardcoded values, then the radeon backend changes, and then last the
VA-API changes to use the new interface.

In addition to that, a few notes below.

Am 30.06.2016 um 20:30 schrieb Boyuan Zhang:
> Signed-off-by: Boyuan Zhang 
> ---
>   src/gallium/include/pipe/p_video_state.h | 36 
> 
>   1 file changed, 36 insertions(+)
>
> diff --git a/src/gallium/include/pipe/p_video_state.h 
> b/src/gallium/include/pipe/p_video_state.h
> index d353be6..9cd489b 100644
> --- a/src/gallium/include/pipe/p_video_state.h
> +++ b/src/gallium/include/pipe/p_video_state.h
> @@ -352,9 +352,29 @@ struct pipe_h264_enc_rate_control
>  unsigned frame_rate_num;
>  unsigned frame_rate_den;
>  unsigned vbv_buffer_size;
> +   unsigned vbv_buf_lv;
>  unsigned target_bits_picture;
>  unsigned peak_bits_picture_integer;
>  unsigned peak_bits_picture_fraction;
> +   unsigned fill_data_enable;
> +   unsigned enforce_hrd;
> +};
> +
> +struct pipe_h264_enc_motion_estimation
> +{
> +   unsigned motion_est_quarter_pixel;
> +   unsigned enc_disable_sub_mode;
> +   unsigned lsmvert;
> +   unsigned enc_en_ime_overw_dis_subm;
> +   unsigned enc_ime_overw_dis_subm_no;
> +   unsigned enc_ime2_search_range_x;
> +   unsigned enc_ime2_search_range_y;
> +};
> +
> +struct pipe_h264_enc_pic_control
> +{
> +   unsigned enc_cabac_enable;
> +   unsigned enc_constraint_set_flags;
>   };
>
>   struct pipe_h264_enc_picture_desc
> @@ -363,17 +383,33 @@ struct pipe_h264_enc_picture_desc
>
>  struct pipe_h264_enc_rate_control rate_ctrl;
>
> +   struct pipe_h264_enc_motion_estimation motion_est;
> +   struct pipe_h264_enc_pic_control pic_ctrl;
> +
>  unsigned quant_i_frames;
>  unsigned quant_p_frames;
>  unsigned quant_b_frames;
>
>  enum pipe_h264_enc_picture_type picture_type;
>  unsigned frame_num;
> +   unsigned frame_num_cnt;
> +   unsigned p_remain;
> +   unsigned i_remain;
> +   unsigned idr_pic_id;
> +   unsigned gop_cnt;
>  unsigned pic_order_cnt;
>  unsigned ref_idx_l0;
>  unsigned ref_idx_l1;
> +   unsigned gop_size;
> +   unsigned ref_pic_mode;
>
>  bool not_referenced;
> +   bool is_idr;

Why can't this be inferred from the encoded picture type?

> +   bool has_ref_pic_list;
> +   bool enable_vui;
> +   unsigned int ref_pic_list_0[32];
> +   unsigned int ref_pic_list_1[32];
> +   unsigned int frame_idx[32];

I thought we wanted to drop the ref_pic_list handling for now. If that
is still the case please drop those fields here as well.

Regards,
Christian.

>   };
>
>   struct pipe_h265_sps



Re: [Mesa-dev] [PATCH] i965/disasm: fix compiler warnings for 32bit build

2016-07-13 Thread Matt Turner
Thanks Tim.

Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 2/2] anv/device: Fix max buffer range limits

2016-07-13 Thread Jason Ekstrand
Thanks for fixing this!  Both are

Reviewed-by: Jason Ekstrand 
Cc: "12.0" 

On Wed, Jul 13, 2016 at 5:32 PM, Nanley Chery  wrote:

> Set limits that are consistent with ISL's assertions in
> isl_genX(buffer_fill_state_s)() and Anvil's format-DescriptorType
> mapping in anv_isl_format_for_descriptor_type().
>
> Fixes the following new crucible tests:
> * stress.limits.buffer-update.range.uniform
> * stress.limits.buffer-update.range.storage
>
> These tests are in this patch:
> https://patchwork.freedesktop.org/patch/98726/
>
> Signed-off-by: Nanley Chery 
> ---
>  src/intel/vulkan/anv_device.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index dd941b6..f181eb7 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -438,6 +438,10 @@ void anv_GetPhysicalDeviceProperties(
>
> const float time_stamp_base = devinfo->gen >= 9 ? 83.333 : 80.0;
>
> +   /* See assertions made when programming the buffer surface state. */
> +   const uint32_t max_raw_buffer_sz = devinfo->gen >= 7 ?
> +  (1ul << 30) : (1ul << 27);
> +
> VkSampleCountFlags sample_counts =
>isl_device_get_sample_counts(&pdevice->isl_dev);
>
> @@ -448,8 +452,8 @@ void anv_GetPhysicalDeviceProperties(
>.maxImageDimensionCube= (1 << 14),
>.maxImageArrayLayers  = (1 << 11),
>.maxTexelBufferElements   = 128 * 1024 * 1024,
> -  .maxUniformBufferRange= UINT32_MAX,
> -  .maxStorageBufferRange= UINT32_MAX,
> +  .maxUniformBufferRange= (1ul << 27),
> +  .maxStorageBufferRange= max_raw_buffer_sz,
>.maxPushConstantsSize = MAX_PUSH_CONSTANTS_SIZE,
>.maxMemoryAllocationCount = UINT32_MAX,
>.maxSamplerAllocationCount= 64 * 1024,
> --
> 2.9.0
>
>


[Mesa-dev] [PATCH 1/2] isl: Fix assert on raw buffer surface state size

2016-07-13 Thread Nanley Chery
See inline PRM reference.

Signed-off-by: Nanley Chery 
---
 src/intel/isl/isl_surface_state.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)
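
As a rough sketch of the limits quoted from the PRM in the diff below, the two cases can be written as a single predicate. The helper name buffer_num_elements_ok is hypothetical and not part of ISL, and the explicit lower-bound check is an addition based on the "from 1" wording rather than something the patch asserts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not an ISL function: returns true when a gen7+
 * buffer surface with this many elements fits the quoted PRM ranges.
 * For raw buffers one element is one byte.
 */
static bool
buffer_num_elements_ok(uint64_t num_elements, bool is_raw)
{
   /* The PRM ranges start at 1 (assumption: zero-sized is rejected). */
   if (num_elements == 0)
      return false;

   if (is_raw) {
      /* 1 to 2^30 bytes, and a multiple of 4 as in the existing assert. */
      return num_elements <= (UINT64_C(1) << 30) && (num_elements & 3) == 0;
   }

   /* Typed and structured buffers: 1 to 2^27 entries. */
   return num_elements <= (UINT64_C(1) << 27);
}
```

With the old bound of 1ull << 31, a raw buffer of 2^31 bytes would have passed the assertion even though it exceeds the documented range.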

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index fc7e1ba..58e9af5 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -460,8 +460,15 @@ isl_genX(buffer_fill_state_s)(void *state,
uint32_t num_elements = info->size / info->stride;
 
if (GEN_GEN >= 7) {
+  /* From the IVB PRM, SURFACE_STATE::Height,
+   *
+   *For typed buffer and structured buffer surfaces, the number
+   *of entries in the buffer ranges from 1 to 2^27. For raw buffer
+   *surfaces, the number of entries in the buffer is the number of 
bytes
+   *which can range from 1 to 2^30.
+   */
   if (info->format == ISL_FORMAT_RAW) {
- assert(num_elements <= (1ull << 31));
+ assert(num_elements <= (1ull << 30));
  assert((num_elements & 3) == 0);
   } else {
  assert(num_elements <= (1ull << 27));
-- 
2.9.0



[Mesa-dev] [PATCH 2/2] anv/device: Fix max buffer range limits

2016-07-13 Thread Nanley Chery
Set limits that are consistent with ISL's assertions in
isl_genX(buffer_fill_state_s)() and Anvil's format-DescriptorType
mapping in anv_isl_format_for_descriptor_type().

Fixes the following new crucible tests:
* stress.limits.buffer-update.range.uniform
* stress.limits.buffer-update.range.storage

These tests are in this patch: https://patchwork.freedesktop.org/patch/98726/

Signed-off-by: Nanley Chery 
---
 src/intel/vulkan/anv_device.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index dd941b6..f181eb7 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -438,6 +438,10 @@ void anv_GetPhysicalDeviceProperties(
 
const float time_stamp_base = devinfo->gen >= 9 ? 83.333 : 80.0;
 
+   /* See assertions made when programming the buffer surface state. */
+   const uint32_t max_raw_buffer_sz = devinfo->gen >= 7 ?
+  (1ul << 30) : (1ul << 27);
+
VkSampleCountFlags sample_counts =
   isl_device_get_sample_counts(&pdevice->isl_dev);
 
@@ -448,8 +452,8 @@ void anv_GetPhysicalDeviceProperties(
   .maxImageDimensionCube= (1 << 14),
   .maxImageArrayLayers  = (1 << 11),
   .maxTexelBufferElements   = 128 * 1024 * 1024,
-  .maxUniformBufferRange= UINT32_MAX,
-  .maxStorageBufferRange= UINT32_MAX,
+  .maxUniformBufferRange= (1ul << 27),
+  .maxStorageBufferRange= max_raw_buffer_sz,
   .maxPushConstantsSize = MAX_PUSH_CONSTANTS_SIZE,
   .maxMemoryAllocationCount = UINT32_MAX,
   .maxSamplerAllocationCount= 64 * 1024,
-- 
2.9.0



[Mesa-dev] [PATCH] i965/disasm: fix compiler warnings for 32bit build

2016-07-13 Thread Timothy Arceri
---
 src/mesa/drivers/dri/i965/brw_disasm.c | 50 +-
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c 
b/src/mesa/drivers/dri/i965/brw_disasm.c
index 068c120..d74d5d5 100644
--- a/src/mesa/drivers/dri/i965/brw_disasm.c
+++ b/src/mesa/drivers/dri/i965/brw_disasm.c
@@ -719,7 +719,7 @@ dest(FILE *file, const struct brw_device_info *devinfo, 
brw_inst *inst)
  if (err == -1)
 return 0;
  if (brw_inst_dst_da1_subreg_nr(devinfo, inst))
-format(file, ".%ld", brw_inst_dst_da1_subreg_nr(devinfo, inst) /
+format(file, ".%"PRIu64, brw_inst_dst_da1_subreg_nr(devinfo, inst) 
/
reg_type_size[brw_inst_dst_reg_type(devinfo, inst)]);
  string(file, "<");
  err |= control(file, "horiz stride", horiz_stride,
@@ -730,7 +730,7 @@ dest(FILE *file, const struct brw_device_info *devinfo, 
brw_inst *inst)
   } else {
  string(file, "g[a0");
  if (brw_inst_dst_ia_subreg_nr(devinfo, inst))
-format(file, ".%ld", brw_inst_dst_ia_subreg_nr(devinfo, inst) /
+format(file, ".%"PRIu64, brw_inst_dst_ia_subreg_nr(devinfo, inst) /
reg_type_size[brw_inst_dst_reg_type(devinfo, inst)]);
  if (brw_inst_dst_ia1_addr_imm(devinfo, inst))
 format(file, " %d", brw_inst_dst_ia1_addr_imm(devinfo, inst));
@@ -748,7 +748,7 @@ dest(FILE *file, const struct brw_device_info *devinfo, 
brw_inst *inst)
  if (err == -1)
 return 0;
  if (brw_inst_dst_da16_subreg_nr(devinfo, inst))
-format(file, ".%ld", brw_inst_dst_da16_subreg_nr(devinfo, inst) /
+format(file, ".%"PRIu64, brw_inst_dst_da16_subreg_nr(devinfo, 
inst) /
reg_type_size[brw_inst_dst_reg_type(devinfo, inst)]);
  string(file, "<1>");
  err |= control(file, "writemask", writemask,
@@ -779,7 +779,7 @@ dest_3src(FILE *file, const struct brw_device_info 
*devinfo, brw_inst *inst)
if (err == -1)
   return 0;
if (brw_inst_3src_dst_subreg_nr(devinfo, inst))
-  format(file, ".%ld", brw_inst_3src_dst_subreg_nr(devinfo, inst));
+  format(file, ".%"PRIu64, brw_inst_3src_dst_subreg_nr(devinfo, inst));
string(file, "<1>");
err |= control(file, "writemask", writemask,
   brw_inst_3src_dst_writemask(devinfo, inst), NULL);
@@ -1216,9 +1216,9 @@ brw_disassemble_inst(FILE *file, const struct 
brw_device_info *devinfo,
   string(file, "(");
   err |= control(file, "predicate inverse", pred_inv,
  brw_inst_pred_inv(devinfo, inst), NULL);
-  format(file, "f%ld", devinfo->gen >= 7 ? brw_inst_flag_reg_nr(devinfo, 
inst) : 0);
+  format(file, "f%"PRIu64, devinfo->gen >= 7 ? 
brw_inst_flag_reg_nr(devinfo, inst) : 0);
   if (brw_inst_flag_subreg_nr(devinfo, inst))
- format(file, ".%ld", brw_inst_flag_subreg_nr(devinfo, inst));
+ format(file, ".%"PRIu64, brw_inst_flag_subreg_nr(devinfo, inst));
   if (brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1) {
  err |= control(file, "predicate control align1", pred_ctrl_align1,
 brw_inst_pred_control(devinfo, inst), NULL);
@@ -1252,10 +1252,10 @@ brw_disassemble_inst(FILE *file, const struct 
brw_device_info *devinfo,
   (devinfo->gen < 6 || (opcode != BRW_OPCODE_SEL &&
 opcode != BRW_OPCODE_IF &&
 opcode != BRW_OPCODE_WHILE))) {
- format(file, ".f%ld",
+ format(file, ".f%"PRIu64,
 devinfo->gen >= 7 ? brw_inst_flag_reg_nr(devinfo, inst) : 0);
  if (brw_inst_flag_subreg_nr(devinfo, inst))
-format(file, ".%ld", brw_inst_flag_subreg_nr(devinfo, inst));
+format(file, ".%"PRIu64, brw_inst_flag_subreg_nr(devinfo, inst));
   }
}
 
@@ -1267,7 +1267,7 @@ brw_disassemble_inst(FILE *file, const struct 
brw_device_info *devinfo,
}
 
if (opcode == BRW_OPCODE_SEND && devinfo->gen < 6)
-  format(file, " %ld", brw_inst_base_mrf(devinfo, inst));
+  format(file, " %"PRIu64, brw_inst_base_mrf(devinfo, inst));
 
if (has_uip(devinfo, opcode)) {
   /* Instructions that have UIP also have JIP. */
@@ -1288,7 +1288,7 @@ brw_disassemble_inst(FILE *file, const struct 
brw_device_info *devinfo,
   pad(file, 16);
   format(file, "Jump: %d", brw_inst_gen4_jump_count(devinfo, inst));
   pad(file, 32);
-  format(file, "Pop: %ld", brw_inst_gen4_pop_count(devinfo, inst));
+  format(file, "Pop: %"PRIu64, brw_inst_gen4_pop_count(devinfo, inst));
} else if (devinfo->gen < 6 && (opcode == BRW_OPCODE_IF ||
opcode == BRW_OPCODE_IFF ||
opcode == BRW_OPCODE_HALT)) {
@@ -1296,7 +1296,7 @@ brw_disassemble_inst(FILE *file, const struct 
brw_device_info *devinfo,
   format(file, "Jump: %d", brw_inst_gen4_jump_c

[Mesa-dev] [PATCH v4 21/34] i965/state: Use ISL for emitting image surfaces

2016-07-13 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 32 
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 65a1f3c..5873ea5 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1400,22 +1400,32 @@ update_image_surface(struct brw_context *brw,
access != GL_READ_ONLY);
 
  } else {
-const unsigned min_layer = obj->MinLayer + u->_Layer;
-const unsigned min_level = obj->MinLevel + u->Level;
 const unsigned num_layers = (!u->Layered ? 1 :
  obj->Target == GL_TEXTURE_CUBE_MAP ? 
6 :
  mt->logical_depth0);
-const GLenum target = (obj->Target == GL_TEXTURE_CUBE_MAP ||
-   obj->Target == GL_TEXTURE_CUBE_MAP_ARRAY ?
-   GL_TEXTURE_2D_ARRAY : obj->Target);
+
+struct isl_view view = {
+   .format = format,
+   .base_level = obj->MinLevel + u->Level,
+   .levels = 1,
+   .base_array_layer = obj->MinLayer + u->_Layer,
+   .array_len = num_layers,
+   .channel_select = {
+  ISL_CHANNEL_SELECT_RED,
+  ISL_CHANNEL_SELECT_GREEN,
+  ISL_CHANNEL_SELECT_BLUE,
+  ISL_CHANNEL_SELECT_ALPHA,
+   },
+   .usage = ISL_SURF_USAGE_STORAGE_BIT,
+};
+
 const int surf_index = surf_offset - &brw->wm.base.surf_offset[0];
 
-brw->vtbl.emit_texture_surface_state(
-   brw, mt, target,
-   min_layer, min_layer + num_layers,
-   min_level, min_level + 1,
-   format, SWIZZLE_XYZW,
-   surf_offset, surf_index, access != GL_READ_ONLY, false);
+brw_emit_surface_state(brw, mt, &view,
+   surface_state_infos[brw->gen].rb_mocs, 
false,
+   surf_offset, surf_index,
+   I915_GEM_DOMAIN_SAMPLER,
+   I915_GEM_DOMAIN_SAMPLER);
  }
 
  update_texture_image_param(brw, u, surface_idx, param);
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v4 31/34] i965/state: Account for the element size in emit_buffer_surface_state

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 11 ++-
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c |  9 +
 src/mesa/drivers/dri/i965/gen8_surface_state.c|  9 +
 3 files changed, 16 insertions(+), 13 deletions(-)
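
For readers following the bit twiddling in the diff below, here is a minimal sketch of the gen7 variant. The caller passes the size in bytes, the element count is derived from the pitch, and (elements - 1) is split across the width, height and depth fields using the masks from the patch. The struct and function names are hypothetical; the real code writes the fields straight into the surface state dwords.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three size fields. */
struct buffer_dims {
   uint32_t width;   /* bits 6:0 of (elements - 1) */
   uint32_t height;  /* bits 20:7 of (elements - 1) */
   uint32_t depth;   /* bits 30:21 for raw, bits 26:21 otherwise */
};

/* Splits the element count of a buffer the way the gen7 path does. */
static struct buffer_dims
gen7_split_elements(uint32_t buffer_size, uint32_t pitch, bool is_raw)
{
   assert(pitch > 0 && buffer_size >= pitch);
   const uint32_t last = buffer_size / pitch - 1;
   struct buffer_dims d;

   d.width = last & 0x7f;
   d.height = (last >> 7) & 0x3fff;
   d.depth = (last >> 21) & (is_raw ? 0x3ff : 0x3f);
   return d;
}

/* Inverse of the split, useful for checking that no bits were lost. */
static uint32_t
gen7_join_elements(struct buffer_dims d)
{
   return (d.width | d.height << 7 | d.depth << 21) + 1;
}
```

The field widths add up to 31 bits for raw buffers and 27 bits for other formats, so the non-raw case can describe at most 2^27 elements.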

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 01c9802..f94aca2 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -492,6 +492,7 @@ gen4_emit_buffer_surface_state(struct brw_context *brw,
unsigned pitch,
bool rw)
 {
+   unsigned elements = buffer_size / pitch;
uint32_t *surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
 6 * 4, 32, out_offset);
memset(surf, 0, 6 * 4);
@@ -500,9 +501,9 @@ gen4_emit_buffer_surface_state(struct brw_context *brw,
  surface_format << BRW_SURFACE_FORMAT_SHIFT |
  (brw->gen >= 6 ? BRW_SURFACE_RC_READ_WRITE : 0);
surf[1] = (bo ? bo->offset64 : 0) + buffer_offset; /* reloc */
-   surf[2] = ((buffer_size - 1) & 0x7f) << BRW_SURFACE_WIDTH_SHIFT |
- (((buffer_size - 1) >> 7) & 0x1fff) << BRW_SURFACE_HEIGHT_SHIFT;
-   surf[3] = (((buffer_size - 1) >> 20) & 0x7f) << BRW_SURFACE_DEPTH_SHIFT |
+   surf[2] = ((elements - 1) & 0x7f) << BRW_SURFACE_WIDTH_SHIFT |
+ (((elements - 1) >> 7) & 0x1fff) << BRW_SURFACE_HEIGHT_SHIFT;
+   surf[3] = (((elements - 1) >> 20) & 0x7f) << BRW_SURFACE_DEPTH_SHIFT |
  (pitch - 1) << BRW_SURFACE_PITCH_SHIFT;
 
/* Emit relocation to surface contents.  The 965 PRM, Volume 4, section
@@ -545,7 +546,7 @@ brw_update_buffer_texture_surface(struct gl_context *ctx,
brw->vtbl.emit_buffer_surface_state(brw, surf_offset, bo,
tObj->BufferOffset,
brw_format,
-   size / texel_size,
+   size,
texel_size,
false /* rw */);
 }
@@ -1476,7 +1477,7 @@ update_image_surface(struct brw_context *brw,
 
  brw->vtbl.emit_buffer_surface_state(
 brw, surf_offset, intel_obj->buffer, obj->BufferOffset,
-format, intel_obj->Base.Size / texel_size, texel_size,
+format, intel_obj->Base.Size, texel_size,
 access != GL_READ_ONLY);
 
  update_buffer_image_param(brw, u, surface_idx, param);
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index bb94f2d..65a1cb0 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -135,6 +135,7 @@ gen7_emit_buffer_surface_state(struct brw_context *brw,
unsigned pitch,
bool rw)
 {
+   unsigned elements = buffer_size / pitch;
uint32_t *surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
 8 * 4, 32, out_offset);
memset(surf, 0, 8 * 4);
@@ -143,12 +144,12 @@ gen7_emit_buffer_surface_state(struct brw_context *brw,
  surface_format << BRW_SURFACE_FORMAT_SHIFT |
  BRW_SURFACE_RC_READ_WRITE;
surf[1] = (bo ? bo->offset64 : 0) + buffer_offset; /* reloc */
-   surf[2] = SET_FIELD((buffer_size - 1) & 0x7f, GEN7_SURFACE_WIDTH) |
- SET_FIELD(((buffer_size - 1) >> 7) & 0x3fff, GEN7_SURFACE_HEIGHT);
+   surf[2] = SET_FIELD((elements - 1) & 0x7f, GEN7_SURFACE_WIDTH) |
+ SET_FIELD(((elements - 1) >> 7) & 0x3fff, GEN7_SURFACE_HEIGHT);
if (surface_format == BRW_SURFACEFORMAT_RAW)
-  surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3ff, BRW_SURFACE_DEPTH);
+  surf[3] = SET_FIELD(((elements - 1) >> 21) & 0x3ff, BRW_SURFACE_DEPTH);
else
-  surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH);
+  surf[3] = SET_FIELD(((elements - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH);
surf[3] |= (pitch - 1);
 
surf[5] = SET_FIELD(GEN7_MOCS_L3, GEN7_SURFACE_MOCS);
diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 00e4c48..9ac8a48 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -63,6 +63,7 @@ gen8_emit_buffer_surface_state(struct brw_context *brw,
unsigned pitch,
bool rw)
 {
+   unsigned elements = buffer_size / pitch;
const unsigned mocs = brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB;
uint32_t *surf = gen8_allocate_surface_state(brw, out_offset, -1);
 
@@ -71,12 +72,12 @@ gen8_emit_buffer_surface_state(struct brw_context *brw,
  BRW_SURFACE_RC_READ_WRITE;
surf
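
A note on the arithmetic in the patch above: callers now pass the buffer size in bytes, and the emit functions derive the element count as buffer_size / pitch before splitting (elements - 1) across the width, height and depth fields. The sketch below reproduces only the gen4 split (7, 13 and 7 bits) as it appears in the diff. The struct and function names are hypothetical, not Mesa API, and the register shifts are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three fields; not a Mesa type. */
struct buffer_extent {
   uint32_t width;   /* bits 6:0 of (elements - 1) */
   uint32_t height;  /* bits 19:7 of (elements - 1) */
   uint32_t depth;   /* bits 26:20 of (elements - 1) */
};

/* Split a byte size into the gen4-style width/height/depth fields,
 * using the masks and shifts shown in the patch.
 */
static struct buffer_extent
split_buffer_elements(unsigned buffer_size, unsigned pitch)
{
   assert(pitch > 0 && buffer_size >= pitch);

   /* The hardware fields encode the index of the last element. */
   const uint32_t last = buffer_size / pitch - 1;

   struct buffer_extent e = {
      .width = last & 0x7f,
      .height = (last >> 7) & 0x1fff,
      .depth = (last >> 20) & 0x7f,
   };
   return e;
}
```

For a 4096-byte buffer with a 16-byte pitch this gives 256 elements, so the last index is 255: width 127, height 1, depth 0.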

[Mesa-dev] [PATCH v4 28/34] i965/gen6: Use the generic ISL-based path for renderbuffer surfaces

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/gen6_surface_state.c | 100 +
 1 file changed, 1 insertion(+), 99 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_surface_state.c b/src/mesa/drivers/dri/i965/gen6_surface_state.c
index d892c93..84b8ef4 100644
--- a/src/mesa/drivers/dri/i965/gen6_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_surface_state.c
@@ -40,107 +40,9 @@
 #include "brw_defines.h"
 #include "brw_wm.h"
 
-/**
- * Sets up a surface state structure to point at the given region.
- * While it is only used for the front/back buffer currently, it should be
- * usable for further buffers when doing ARB_draw_buffer support.
- */
-static uint32_t
-gen6_update_renderbuffer_surface(struct brw_context *brw,
- struct gl_renderbuffer *rb,
- bool layered, unsigned unit /* unused */,
- uint32_t surf_index)
-{
-   struct gl_context *ctx = &brw->ctx;
-   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
-   struct intel_mipmap_tree *mt = irb->mt;
-   uint32_t *surf;
-   uint32_t format = 0;
-   uint32_t offset;
-   /* _NEW_BUFFERS */
-   mesa_format rb_format = _mesa_get_render_format(ctx, intel_rb_format(irb));
-   uint32_t surftype;
-   int depth = MAX2(irb->layer_count, 1);
-   const GLenum gl_target =
-  rb->TexImage ? rb->TexImage->TexObject->Target : GL_TEXTURE_2D;
-
-   intel_miptree_used_for_rendering(irb->mt);
-
-   surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, 6 * 4, 32, &offset);
-
-   format = brw->render_target_format[rb_format];
-   if (unlikely(!brw->format_supported_as_render_target[rb_format])) {
-  _mesa_problem(ctx, "%s: renderbuffer format %s unsupported\n",
-__func__, _mesa_get_format_name(rb_format));
-   }
-
-   switch (gl_target) {
-   case GL_TEXTURE_CUBE_MAP_ARRAY:
-   case GL_TEXTURE_CUBE_MAP:
-  surftype = BRW_SURFACE_2D;
-  depth *= 6;
-  break;
-   case GL_TEXTURE_3D:
-  depth = MAX2(irb->mt->logical_depth0, 1);
-  /* fallthrough */
-   default:
-  surftype = translate_tex_target(gl_target);
-  break;
-   }
-
-   const int min_array_element = irb->mt_layer;
-   assert(!layered || irb->mt_layer == 0);
-
-   surf[0] = SET_FIELD(surftype, BRW_SURFACE_TYPE) |
- SET_FIELD(format, BRW_SURFACE_FORMAT);
-
-   /* reloc */
-   assert(mt->offset % mt->cpp == 0);
-   surf[1] = mt->bo->offset64 + mt->offset;
-
-   /* In the gen6 PRM Volume 1 Part 1: Graphics Core, Section 7.18.3.7.1
-* (Surface Arrays For all surfaces other than separate stencil buffer):
-*
-* "[DevSNB] Errata: Sampler MSAA Qpitch will be 4 greater than the value
-*  calculated in the equation above , for every other odd Surface Height
-*  starting from 1 i.e. 1,5,9,13"
-*
-* Since this Qpitch errata only impacts the sampler, we have to adjust the
-* input for the rendering surface to achieve the same qpitch. For the
-* affected heights, we increment the height by 1 for the rendering
-* surface.
-*/
-   int height0 = irb->mt->logical_height0;
-   if (brw->gen == 6 && irb->mt->num_samples > 1 && (height0 % 4) == 1)
-  height0++;
-
-   surf[2] = SET_FIELD(mt->logical_width0 - 1, BRW_SURFACE_WIDTH) |
- SET_FIELD(height0 - 1, BRW_SURFACE_HEIGHT) |
- SET_FIELD(irb->mt_level - irb->mt->first_level, BRW_SURFACE_LOD);
-
-   surf[3] = brw_get_surface_tiling_bits(mt->tiling) |
- SET_FIELD(depth - 1, BRW_SURFACE_DEPTH) |
- SET_FIELD(mt->pitch - 1, BRW_SURFACE_PITCH);
-
-   surf[4] = brw_get_surface_num_multisamples(mt->num_samples) |
- SET_FIELD(min_array_element, BRW_SURFACE_MIN_ARRAY_ELEMENT) |
- SET_FIELD(depth - 1, BRW_SURFACE_RENDER_TARGET_VIEW_EXTENT);
-
-   surf[5] = (mt->valign == 4 ? BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0);
-
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   offset + 4,
-   mt->bo,
-   surf[1] - mt->bo->offset64,
-   I915_GEM_DOMAIN_RENDER,
-   I915_GEM_DOMAIN_RENDER);
-
-   return offset;
-}
-
 void
 gen6_init_vtable_surface_functions(struct brw_context *brw)
 {
gen4_init_vtable_surface_functions(brw);
-   brw->vtbl.update_renderbuffer_surface = gen6_update_renderbuffer_surface;
+   brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
 }
-- 
2.5.0.400.gff86faf
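
For reference, the Qpitch workaround carried by the removed gen6 function bumps the render surface height by one for multisampled surfaces whose height is 1, 5, 9, 13 and so on. A minimal sketch of that check in isolation follows; the function name and its plain-integer parameters are hypothetical stand-ins for the brw_context and miptree fields used in the diff.

```c
#include <assert.h>

/* Hypothetical helper mirroring the removed condition:
 *    brw->gen == 6 && num_samples > 1 && (height0 % 4) == 1
 */
static int
adjust_render_height(int gen, unsigned num_samples, int height)
{
   assert(height > 0);

   /* Only gen6 multisampled surfaces with an affected height change. */
   if (gen == 6 && num_samples > 1 && (height % 4) == 1)
      return height + 1;

   return height;
}
```

With this sketch a 4x MSAA surface of height 5 on gen6 is programmed as height 6, while single-sampled surfaces and other generations are left untouched.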



[Mesa-dev] [PATCH v4 34/34] i965/context: Remove some unnecessary vfuncs

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_context.h   | 17 -
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  |  3 +--
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c |  1 -
 src/mesa/drivers/dri/i965/gen8_surface_state.c|  1 -
 4 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index df1f177..028bffc 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -744,27 +744,10 @@ struct brw_context
 
struct
{
-  void (*update_texture_surface)(struct gl_context *ctx,
- unsigned unit,
- uint32_t *surf_offset,
- bool for_gather, uint32_t plane);
   uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
   struct gl_renderbuffer *rb,
   bool layered, unsigned unit,
   uint32_t surf_index);
-
-  void (*emit_texture_surface_state)(struct brw_context *brw,
- struct intel_mipmap_tree *mt,
- GLenum target,
- unsigned min_layer,
- unsigned max_layer,
- unsigned min_level,
- unsigned max_level,
- unsigned format,
- unsigned swizzle,
- uint32_t *surf_offset,
- int surf_index,
- bool rw, bool for_gather);
   void (*emit_null_surface_state)(struct brw_context *brw,
   unsigned width,
   unsigned height,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 8ed43589..809f5c5 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1004,7 +1004,7 @@ update_stage_texture_surfaces(struct brw_context *brw,
 
  /* _NEW_TEXTURE */
  if (ctx->Texture.Unit[unit]._Current) {
-brw->vtbl.update_texture_surface(ctx, unit, surf_offset + s, for_gather, plane);
+brw_update_texture_surface(ctx, unit, surf_offset + s, for_gather, plane);
  }
   }
}
@@ -1583,7 +1583,6 @@ const struct brw_tracked_state brw_wm_image_surfaces = {
 void
 gen4_init_vtable_surface_functions(struct brw_context *brw)
 {
-   brw->vtbl.update_texture_surface = brw_update_texture_surface;
brw->vtbl.update_renderbuffer_surface = gen4_update_renderbuffer_surface;
brw->vtbl.emit_null_surface_state = brw_emit_null_surface_state;
 }
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 742ac0e..5587a02 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -176,7 +176,6 @@ gen7_emit_null_surface_state(struct brw_context *brw,
 void
 gen7_init_vtable_surface_functions(struct brw_context *brw)
 {
-   brw->vtbl.update_texture_surface = brw_update_texture_surface;
brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
brw->vtbl.emit_null_surface_state = gen7_emit_null_surface_state;
 }
diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 1f86557..08f83f3 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -80,7 +80,6 @@ gen8_emit_null_surface_state(struct brw_context *brw,
 void
 gen8_init_vtable_surface_functions(struct brw_context *brw)
 {
-   brw->vtbl.update_texture_surface = brw_update_texture_surface;
brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
brw->vtbl.emit_null_surface_state = gen8_emit_null_surface_state;
 }
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v4 27/34] i965/gen7: Use the generic ISL-based path for renderbuffer surfaces

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_state.h |   7 -
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 194 +-
 2 files changed, 1 insertion(+), 200 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index b4ddeaf..4aa6b50 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -297,13 +297,6 @@ void brw_update_renderbuffer_surfaces(struct brw_context *brw,
   uint32_t *surf_offset);
 
 /* gen7_wm_surface_state.c */
-uint32_t gen7_surface_tiling_mode(uint32_t tiling);
-uint32_t gen7_surface_msaa_bits(unsigned num_samples, enum intel_msaa_layout l);
-void gen7_set_surface_mcs_info(struct brw_context *brw,
-   uint32_t *surf,
-   uint32_t surf_offset,
-   const struct intel_mipmap_tree *mcs_mt,
-   bool is_render_target);
 void gen7_check_surface_setup(uint32_t *surf, bool is_render_target);
 void gen7_init_vtable_surface_functions(struct brw_context *brw);
 
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index bdb4f66..bb94f2d 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -39,79 +39,6 @@
 #include "brw_defines.h"
 #include "brw_wm.h"
 
-uint32_t
-gen7_surface_tiling_mode(uint32_t tiling)
-{
-   switch (tiling) {
-   case I915_TILING_X:
-  return GEN7_SURFACE_TILING_X;
-   case I915_TILING_Y:
-  return GEN7_SURFACE_TILING_Y;
-   default:
-  return GEN7_SURFACE_TILING_NONE;
-   }
-}
-
-
-uint32_t
-gen7_surface_msaa_bits(unsigned num_samples, enum intel_msaa_layout layout)
-{
-   uint32_t ss4 = 0;
-
-   assert(num_samples <= 16);
-
-   /* The SURFACE_MULTISAMPLECOUNT_X enums are simply log2(num_samples) << 3. */
-   ss4 |= (ffs(MAX2(num_samples, 1)) - 1) << 3;
-
-   if (layout == INTEL_MSAA_LAYOUT_IMS)
-  ss4 |= GEN7_SURFACE_MSFMT_DEPTH_STENCIL;
-   else
-  ss4 |= GEN7_SURFACE_MSFMT_MSS;
-
-   return ss4;
-}
-
-
-void
-gen7_set_surface_mcs_info(struct brw_context *brw,
-  uint32_t *surf,
-  uint32_t surf_offset,
-  const struct intel_mipmap_tree *mcs_mt,
-  bool is_render_target)
-{
-   /* From the Ivy Bridge PRM, Vol4 Part1 p76, "MCS Base Address":
-*
-* "The MCS surface must be stored as Tile Y."
-*/
-   assert(mcs_mt->tiling == I915_TILING_Y);
-
-   /* Compute the pitch in units of tiles.  To do this we need to divide the
-* pitch in bytes by 128, since a single Y-tile is 128 bytes wide.
-*/
-   unsigned pitch_tiles = mcs_mt->pitch / 128;
-
-   /* The upper 20 bits of surface state DWORD 6 are the upper 20 bits of the
-* GPU address of the MCS buffer; the lower 12 bits contain other control
-* information.  Since buffer addresses are always on 4k boundaries (and
-* thus have their lower 12 bits zero), we can use an ordinary reloc to do
-* the necessary address translation.
-*/
-   assert ((mcs_mt->bo->offset64 & 0xfff) == 0);
-
-   surf[6] = GEN7_SURFACE_MCS_ENABLE |
- SET_FIELD(pitch_tiles - 1, GEN7_SURFACE_MCS_PITCH) |
- mcs_mt->bo->offset64;
-
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   surf_offset + 6 * 4,
-   mcs_mt->bo,
-   surf[6] & 0xfff,
-   is_render_target ? I915_GEM_DOMAIN_RENDER
-   : I915_GEM_DOMAIN_SAMPLER,
-   is_render_target ? I915_GEM_DOMAIN_RENDER : 0);
-}
-
-
 void
 gen7_check_surface_setup(uint32_t *surf, bool is_render_target)
 {
@@ -291,130 +218,11 @@ gen7_emit_null_surface_state(struct brw_context *brw,
gen7_check_surface_setup(surf, true /* is_render_target */);
 }
 
-/**
- * Sets up a surface state structure to point at the given region.
- * While it is only used for the front/back buffer currently, it should be
- * usable for further buffers when doing ARB_draw_buffer support.
- */
-static uint32_t
-gen7_update_renderbuffer_surface(struct brw_context *brw,
- struct gl_renderbuffer *rb,
- bool layered, unsigned unit /* unused */,
- uint32_t surf_index)
-{
-   struct gl_context *ctx = &brw->ctx;
-   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
-   struct intel_mipmap_tree *mt = irb->mt;
-   uint32_t format;
-   /* _NEW_BUFFERS */
-   mesa_format rb_format = _mesa_get_render_format(ctx, intel_rb_format(irb));
-   uint32_t surftype;
-   bool is_array = false;
-   int depth = MAX2(irb->layer_count, 1);
-   const uint8_t mocs = GEN7_MOCS_L3;
-   uint32_t offset;
-
-   int min_array_element = irb->mt_laye
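
Two small pieces of arithmetic in the removed gen7 helpers may be worth spelling out: the multisample count field is log2(num_samples) shifted left by 3, and the MCS pitch is expressed in Y-tiles of 128 bytes, stored minus one. The sketch below restates both with hypothetical function names. It uses a plain loop rather than ffs() so that it depends only on the C standard library, and it leaves out the layout and enable bits that the real helpers OR in.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns log2(num_samples) << 3, treating 0 as 1,
 * which is what (ffs(MAX2(num_samples, 1)) - 1) << 3 computes for powers
 * of two.
 */
static uint32_t
multisample_count_bits(unsigned num_samples)
{
   assert(num_samples <= 16);
   if (num_samples == 0)
      num_samples = 1;

   /* Sample counts are expected to be powers of two. */
   assert((num_samples & (num_samples - 1)) == 0);

   uint32_t log2_samples = 0;
   while ((1u << log2_samples) < num_samples)
      log2_samples++;

   return log2_samples << 3;
}

/* Hypothetical helper: pitch in bytes to the "tiles minus one" value,
 * given that a single Y-tile is 128 bytes wide.
 */
static uint32_t
mcs_pitch_field(uint32_t pitch_bytes)
{
   assert(pitch_bytes >= 128 && pitch_bytes % 128 == 0);
   return pitch_bytes / 128 - 1;
}
```

For example, 4 samples yield 2 << 3 = 16, and a 512-byte pitch spans four tiles and is stored as 3.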

[Mesa-dev] [PATCH v4 18/34] i965/blorp: Use the generic ISL path for texture surfaces on gen6

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/gen6_blorp.c | 76 +-
 1 file changed, 2 insertions(+), 74 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.c b/src/mesa/drivers/dri/i965/gen6_blorp.c
index 1af898d..70dc9f6 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.c
@@ -350,78 +350,6 @@ gen6_blorp_emit_cc_state_pointers(struct brw_context *brw,
ADVANCE_BATCH();
 }
 
-/* SURFACE_STATE for renderbuffer or texture surface (see
- * brw_update_renderbuffer_surface and brw_update_texture_surface)
- */
-static uint32_t
-gen6_blorp_emit_surface_state(struct brw_context *brw,
-  const struct brw_blorp_params *params,
-  const struct brw_blorp_surface_info *surface,
-  uint32_t read_domains, uint32_t write_domain)
-{
-   uint32_t wm_surf_offset;
-   uint32_t width = surface->width;
-   uint32_t height = surface->height;
-   if (surface->num_samples > 1) {
-  /* Since gen6 uses INTEL_MSAA_LAYOUT_IMS, width and height are measured
-   * in samples.  But SURFACE_STATE wants them in pixels, so we need to
-   * divide them each by 2.
-   */
-  width /= 2;
-  height /= 2;
-   }
-   struct intel_mipmap_tree *mt = surface->mt;
-   uint32_t tile_x, tile_y;
-
-   uint32_t *surf = (uint32_t *)
-  brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, 6 * 4, 32,
-  &wm_surf_offset);
-
-   surf[0] = (BRW_SURFACE_2D << BRW_SURFACE_TYPE_SHIFT |
-  BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT |
-  BRW_SURFACE_CUBEFACE_ENABLES |
-  surface->brw_surfaceformat << BRW_SURFACE_FORMAT_SHIFT);
-
-   /* reloc */
-   surf[1] = (brw_blorp_compute_tile_offsets(surface, &tile_x, &tile_y) +
-  mt->bo->offset64);
-
-   surf[2] = (0 << BRW_SURFACE_LOD_SHIFT |
-  (width - 1) << BRW_SURFACE_WIDTH_SHIFT |
-  (height - 1) << BRW_SURFACE_HEIGHT_SHIFT);
-
-   uint32_t tiling = surface->map_stencil_as_y_tiled
-  ? BRW_SURFACE_TILED | BRW_SURFACE_TILED_Y
-  : brw_get_surface_tiling_bits(mt->tiling);
-   uint32_t pitch_bytes = mt->pitch;
-   if (surface->map_stencil_as_y_tiled)
-  pitch_bytes *= 2;
-   surf[3] = (tiling |
-  0 << BRW_SURFACE_DEPTH_SHIFT |
-  (pitch_bytes - 1) << BRW_SURFACE_PITCH_SHIFT);
-
-   surf[4] = brw_get_surface_num_multisamples(surface->num_samples);
-
-   /* Note that the low bits of these fields are missing, so
-* there's the possibility of getting in trouble.
-*/
-   assert(tile_x % 4 == 0);
-   assert(tile_y % 2 == 0);
-   surf[5] = ((tile_x / 4) << BRW_SURFACE_X_OFFSET_SHIFT |
-  (tile_y / 2) << BRW_SURFACE_Y_OFFSET_SHIFT |
-  (surface->mt->valign == 4 ?
-   BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0));
-
-   /* Emit relocation to surface contents */
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   wm_surf_offset + 4,
-   mt->bo,
-   surf[1] - mt->bo->offset64,
-   read_domains, write_domain);
-
-   return wm_surf_offset;
-}
-
 
 /* BINDING_TABLE.  See brw_wm_binding_table(). */
 uint32_t
@@ -1035,8 +963,8 @@ gen6_blorp_exec(struct brw_context *brw,
   I915_GEM_DOMAIN_RENDER, true);
   if (params->src.mt) {
  wm_surf_offset_texture =
-gen6_blorp_emit_surface_state(brw, params, &params->src,
-  I915_GEM_DOMAIN_SAMPLER, 0);
+brw_blorp_emit_surface_state(brw, &params->src,
+ I915_GEM_DOMAIN_SAMPLER, 0, false);
   }
   wm_bind_bo_offset =
  gen6_blorp_emit_binding_table(brw,
-- 
2.5.0.400.gff86faf
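
The removed blorp function notes that the low bits of the X/Y offset fields are missing: tile_x is stored divided by 4 and tile_y divided by 2, so unaligned offsets cannot be represented. A minimal sketch of that encoding follows. The struct and function are hypothetical, and the sketch reports failure through its return value where the original code asserted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the encoded offsets; not a Mesa type. */
struct tile_offset_fields {
   uint32_t x;  /* tile_x / 4 */
   uint32_t y;  /* tile_y / 2 */
};

/* Encode intra-tile offsets, returning false when the offsets have low
 * bits set that the fields cannot hold.
 */
static bool
encode_tile_offsets(uint32_t tile_x, uint32_t tile_y,
                    struct tile_offset_fields *out)
{
   assert(out != NULL);

   if (tile_x % 4 != 0 || tile_y % 2 != 0)
      return false;

   out->x = tile_x / 4;
   out->y = tile_y / 2;
   return true;
}
```

Offsets (8, 6) encode as (2, 3), while (5, 2) is rejected because tile_x is not a multiple of 4.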



[Mesa-dev] [PATCH v4 25/34] i965/gen8: Use the generic ISL-based path for renderbuffer surfaces

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_state.h  |  16 --
 src/mesa/drivers/dri/i965/gen8_surface_state.c | 249 +
 2 files changed, 2 insertions(+), 263 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 71f6c89..b4ddeaf 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -325,22 +325,6 @@ void gen8_upload_3dstate_so_buffers(struct brw_context *brw);
 
 void gen8_init_vtable_surface_functions(struct brw_context *brw);
 
-unsigned gen8_surface_tiling_mode(uint32_t tiling);
-unsigned gen8_vertical_alignment(const struct brw_context *brw,
- const struct intel_mipmap_tree *mt,
- uint32_t surf_type);
-unsigned gen8_horizontal_alignment(const struct brw_context *brw,
-   const struct intel_mipmap_tree *mt,
-   uint32_t surf_type);
-uint32_t *gen8_allocate_surface_state(struct brw_context *brw,
-  uint32_t *out_offset, int index);
-
-void gen8_emit_fast_clear_color(const struct brw_context *brw,
-const struct intel_mipmap_tree *mt,
-uint32_t *surf);
-uint32_t gen8_get_aux_mode(const struct brw_context *brw,
-   const struct intel_mipmap_tree *mt);
-
 /* brw_sampler_state.c */
 void brw_emit_sampler_state(struct brw_context *brw,
 uint32_t *sampler_state,
diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index ed26271..00e4c48 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -42,83 +42,7 @@
 #include "brw_wm.h"
 #include "isl/isl.h"
 
-static uint32_t
-surface_tiling_resource_mode(uint32_t tr_mode)
-{
-   switch (tr_mode) {
-   case INTEL_MIPTREE_TRMODE_YF:
-  return GEN9_SURFACE_TRMODE_TILEYF;
-   case INTEL_MIPTREE_TRMODE_YS:
-  return GEN9_SURFACE_TRMODE_TILEYS;
-   default:
-  return GEN9_SURFACE_TRMODE_NONE;
-   }
-}
-
-uint32_t
-gen8_surface_tiling_mode(uint32_t tiling)
-{
-   switch (tiling) {
-   case I915_TILING_X:
-  return GEN8_SURFACE_TILING_X;
-   case I915_TILING_Y:
-  return GEN8_SURFACE_TILING_Y;
-   default:
-  return GEN8_SURFACE_TILING_NONE;
-   }
-}
-
-unsigned
-gen8_vertical_alignment(const struct brw_context *brw,
-const struct intel_mipmap_tree *mt,
-uint32_t surf_type)
-{
-   /* On Gen9+ vertical alignment is ignored for 1D surfaces and when
-* tr_mode is not TRMODE_NONE. Set to an arbitrary non-reserved value.
-*/
-   if (brw->gen > 8 &&
-   (mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE ||
-surf_type == BRW_SURFACE_1D))
-  return GEN8_SURFACE_VALIGN_4;
-
-   switch (mt->valign) {
-   case 4:
-  return GEN8_SURFACE_VALIGN_4;
-   case 8:
-  return GEN8_SURFACE_VALIGN_8;
-   case 16:
-  return GEN8_SURFACE_VALIGN_16;
-   default:
-  unreachable("Unsupported vertical surface alignment.");
-   }
-}
-
-unsigned
-gen8_horizontal_alignment(const struct brw_context *brw,
-  const struct intel_mipmap_tree *mt,
-  uint32_t surf_type)
-{
-   /* On Gen9+ horizontal alignment is ignored when tr_mode is not
-* TRMODE_NONE. Set to an arbitrary non-reserved value.
-*/
-   if (brw->gen > 8 &&
-   (mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE ||
-gen9_use_linear_1d_layout(brw, mt)))
-  return GEN8_SURFACE_HALIGN_4;
-
-   switch (mt->halign) {
-   case 4:
-  return GEN8_SURFACE_HALIGN_4;
-   case 8:
-  return GEN8_SURFACE_HALIGN_8;
-   case 16:
-  return GEN8_SURFACE_HALIGN_16;
-   default:
-  unreachable("Unsupported horizontal surface alignment.");
-   }
-}
-
-uint32_t *
+static uint32_t *
 gen8_allocate_surface_state(struct brw_context *brw,
 uint32_t *out_offset, int index)
 {
@@ -169,44 +93,6 @@ gen8_emit_buffer_surface_state(struct brw_context *brw,
}
 }
 
-void
-gen8_emit_fast_clear_color(const struct brw_context *brw,
-   const struct intel_mipmap_tree *mt,
-   uint32_t *surf)
-{
-   if (brw->gen >= 9) {
-  surf[12] = mt->gen9_fast_clear_color.ui[0];
-  surf[13] = mt->gen9_fast_clear_color.ui[1];
-  surf[14] = mt->gen9_fast_clear_color.ui[2];
-  surf[15] = mt->gen9_fast_clear_color.ui[3];
-   } else
-  surf[7] |= mt->fast_clear_color_value;
-}
-
-uint32_t
-gen8_get_aux_mode(const struct brw_context *brw,
-  const struct intel_mipmap_tree *mt)
-{
-   if (mt->mcs_mt == NULL)
-  return GEN8_SURFACE_AUX_MODE_NONE;
-
-   /*
-* From the BDW PRM, Volume 2d, page 260 (RENDER_SURFACE_STATE):
-* "When MCS is enabled for non-MSRT, HALIG

[Mesa-dev] [PATCH v4 23/34] i965/state: Add generic surface update functions based on ISL

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_state.h|   9 ++
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 184 +++
 2 files changed, 193 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 91ebce9..71f6c89 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -282,6 +282,15 @@ void brw_emit_surface_state(struct brw_context *brw,
 uint32_t *surf_offset, int surf_index,
 unsigned read_domains, unsigned write_domains);
 
+void brw_update_texture_surface(struct gl_context *ctx,
+unsigned unit, uint32_t *surf_offset,
+bool for_gather, uint32_t plane);
+
+uint32_t brw_update_renderbuffer_surface(struct brw_context *brw,
+ struct gl_renderbuffer *rb,
+ bool layered, unsigned unit,
+ uint32_t surf_index);
+
 void brw_update_renderbuffer_surfaces(struct brw_context *brw,
   const struct gl_framebuffer *fb,
   uint32_t render_target_start,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 82a8537..084bd8c 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -131,6 +131,54 @@ brw_emit_surface_state(struct brw_context *brw,
}
 }
 
+uint32_t
+brw_update_renderbuffer_surface(struct brw_context *brw,
+struct gl_renderbuffer *rb,
+bool layered, unsigned unit /* unused */,
+uint32_t surf_index)
+{
+   struct gl_context *ctx = &brw->ctx;
+   struct intel_renderbuffer *irb = intel_renderbuffer(rb);
+   struct intel_mipmap_tree *mt = irb->mt;
+
+   assert(brw_render_target_supported(brw, rb));
+   intel_miptree_used_for_rendering(mt);
+
+   mesa_format rb_format = _mesa_get_render_format(ctx, intel_rb_format(irb));
+   if (unlikely(!brw->format_supported_as_render_target[rb_format])) {
+  _mesa_problem(ctx, "%s: renderbuffer format %s unsupported\n",
+__func__, _mesa_get_format_name(rb_format));
+   }
+
+   const unsigned layer_multiplier =
+  (irb->mt->msaa_layout == INTEL_MSAA_LAYOUT_UMS ||
+   irb->mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) ?
+  MAX2(irb->mt->num_samples, 1) : 1;
+
+   struct isl_view view = {
+  .format = brw->render_target_format[rb_format],
+  .base_level = irb->mt_level - irb->mt->first_level,
+  .levels = 1,
+  .base_array_layer = irb->mt_layer / layer_multiplier,
+  .array_len = MAX2(irb->layer_count, 1),
+  .channel_select = {
+ ISL_CHANNEL_SELECT_RED,
+ ISL_CHANNEL_SELECT_GREEN,
+ ISL_CHANNEL_SELECT_BLUE,
+ ISL_CHANNEL_SELECT_ALPHA,
+  },
+  .usage = ISL_SURF_USAGE_RENDER_TARGET_BIT,
+   };
+
+   uint32_t offset;
+   brw_emit_surface_state(brw, mt, &view,
+  surface_state_infos[brw->gen].rb_mocs, false,
+  &offset, surf_index,
+  I915_GEM_DOMAIN_RENDER,
+  I915_GEM_DOMAIN_RENDER);
+   return offset;
+}
+
 GLuint
 translate_tex_target(GLenum target)
 {
@@ -298,6 +346,142 @@ brw_get_texture_swizzle(const struct gl_context *ctx,
 swizzles[GET_SWZ(t->_Swizzle, 3)]);
 }
 
+/**
+ * Convert a swizzle enumeration (i.e. SWIZZLE_X) to one of the Gen7.5+
+ * "Shader Channel Select" enumerations (i.e. HSW_SCS_RED).  The mappings are
+ *
+ * SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_ZERO, SWIZZLE_ONE
+ *         0          1          2          3             4            5
+ *         4          5          6          7             0            1
+ *   SCS_RED, SCS_GREEN,  SCS_BLUE, SCS_ALPHA,     SCS_ZERO,     SCS_ONE
+ *
+ * which is simply adding 4 then modding by 8 (or anding with 7).
+ *
+ * We then may need to apply workarounds for textureGather hardware bugs.
+ */
+static unsigned
+swizzle_to_scs(GLenum swizzle, bool need_green_to_blue)
+{
+   unsigned scs = (swizzle + 4) & 7;
+
+   return (need_green_to_blue && scs == HSW_SCS_GREEN) ? HSW_SCS_BLUE : scs;
+}
+
+void
+brw_update_texture_surface(struct gl_context *ctx,
+   unsigned unit,
+   uint32_t *surf_offset,
+   bool for_gather,
+   uint32_t plane)
+{
+   struct brw_context *brw = brw_context(ctx);
+   struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
+
+   if (obj->Target == GL_TEXTURE_BUFFER) {
+  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
+
+   } else {
+  struct intel_texture_object *intel
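
The swizzle-to-SCS mapping described in the comment above can be checked in isolation. The sketch below uses local stand-in enums whose values are taken from the comment's table (SWIZZLE_X..SWIZZLE_ONE as 0..5, SCS_ZERO and SCS_ONE as 0 and 1, SCS_RED..SCS_ALPHA as 4..7). The enum and function names are hypothetical, chosen so as not to collide with the real Mesa definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the swizzle enumerations; values follow the comment. */
enum { SWZ_X = 0, SWZ_Y, SWZ_Z, SWZ_W, SWZ_ZERO, SWZ_ONE };

/* Stand-ins for the "Shader Channel Select" values from the comment. */
enum {
   SCS_ZERO = 0, SCS_ONE = 1,
   SCS_RED = 4, SCS_GREEN = 5, SCS_BLUE = 6, SCS_ALPHA = 7
};

/* Add 4 and wrap modulo 8, then optionally redirect green to blue for
 * the textureGather workaround mentioned in the patch.
 */
static unsigned
swizzle_to_scs_sketch(unsigned swizzle, bool need_green_to_blue)
{
   assert(swizzle <= SWZ_ONE);

   unsigned scs = (swizzle + 4) & 7;

   return (need_green_to_blue && scs == SCS_GREEN) ? SCS_BLUE : scs;
}
```

Adding 4 moves X..W onto RED..ALPHA (4..7), and the mask wraps ZERO and ONE (4 and 5) around to 0 and 1.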

[Mesa-dev] [PATCH v4 32/34] i965: Use ISL for emitting buffer surface states

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c|  2 +-
 src/mesa/drivers/dri/i965/brw_context.h   |  8 --
 src/mesa/drivers/dri/i965/brw_state.h |  9 +++
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 93 +++
 src/mesa/drivers/dri/i965/gen7_cs_state.c |  2 +-
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 47 
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 42 --
 7 files changed, 55 insertions(+), 148 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index 3bf2255..9ca841a 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -100,7 +100,7 @@ brw_upload_binding_table(struct brw_context *brw,
} else {
   /* Upload a new binding table. */
   if (INTEL_DEBUG & DEBUG_SHADER_TIME) {
- brw->vtbl.emit_buffer_surface_state(
+ brw_emit_buffer_surface_state(
 brw, &stage_state->surf_offset[
 prog_data->binding_table.shader_time_start],
 brw->shader_time.bo, 0, BRW_SURFACEFORMAT_RAW,
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index c0cdd7d..df1f177 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -765,14 +765,6 @@ struct brw_context
  uint32_t *surf_offset,
  int surf_index,
  bool rw, bool for_gather);
-  void (*emit_buffer_surface_state)(struct brw_context *brw,
-uint32_t *out_offset,
-drm_intel_bo *bo,
-unsigned buffer_offset,
-unsigned surface_format,
-unsigned buffer_size,
-unsigned pitch,
-bool rw);
   void (*emit_null_surface_state)(struct brw_context *brw,
   unsigned width,
   unsigned height,
diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 4aa6b50..6ba3710 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -282,6 +282,15 @@ void brw_emit_surface_state(struct brw_context *brw,
 uint32_t *surf_offset, int surf_index,
 unsigned read_domains, unsigned write_domains);
 
+void brw_emit_buffer_surface_state(struct brw_context *brw,
+   uint32_t *out_offset,
+   drm_intel_bo *bo,
+   unsigned buffer_offset,
+   unsigned surface_format,
+   unsigned buffer_size,
+   unsigned pitch,
+   bool rw);
+
 void brw_update_texture_surface(struct gl_context *ctx,
 unsigned unit, uint32_t *surf_offset,
 bool for_gather, uint32_t plane);
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index f94aca2..d2b4b5e 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -482,36 +482,32 @@ brw_update_texture_surface(struct gl_context *ctx,
}
 }
 
-static void
-gen4_emit_buffer_surface_state(struct brw_context *brw,
-   uint32_t *out_offset,
-   drm_intel_bo *bo,
-   unsigned buffer_offset,
-   unsigned surface_format,
-   unsigned buffer_size,
-   unsigned pitch,
-   bool rw)
+void
+brw_emit_buffer_surface_state(struct brw_context *brw,
+  uint32_t *out_offset,
+  drm_intel_bo *bo,
+  unsigned buffer_offset,
+  unsigned surface_format,
+  unsigned buffer_size,
+  unsigned pitch,
+  bool rw)
 {
-   unsigned elements = buffer_size / pitch;
-   uint32_t *surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
-6 * 4, 32, out_offset);
-   memset(surf, 0, 6 * 4);
+   const struct surface_state_info ss_info = surface_state_infos[brw->gen];
+
+   uint32_t *dw = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
+  ss_info.num_dwords * 4,

[Mesa-dev] [PATCH v4 30/34] isl/formats: Mark RAW as having a block size of 1 byte

2016-07-13 Thread Jason Ekstrand
---
 src/intel/isl/isl_format_layout.csv | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_format_layout.csv b/src/intel/isl/isl_format_layout.csv
index f0f31c7..c1c98e8 100644
--- a/src/intel/isl/isl_format_layout.csv
+++ b/src/intel/isl/isl_format_layout.csv
@@ -285,7 +285,7 @@ ETC2_EAC_RGBA8  , 128,  4,  4,  1,  un8,  un8,  un8,  un8, ,
 ETC2_EAC_SRGB8_A8   , 128,  4,  4,  1,  un8,  un8,  un8,  un8, ,     ,,   srgb,  etc2
 R8G8B8_UINT ,  24,  1,  1,  1,  ui8,  ui8,  ui8, , ,     ,, linear,
 R8G8B8_SINT ,  24,  1,  1,  1,  si8,  si8,  si8, , ,     ,, linear,
-RAW ,   0,  0,  0,  0, , , , , ,     ,,   ,
+RAW ,   8,  0,  0,  0, , , , , ,     ,,   ,
 ASTC_LDR_2D_4X4_U8SRGB  , 128,  4,  4,  1,  un8,  un8,  un8,  un8, ,     ,,   srgb,  astc
 ASTC_LDR_2D_5X4_U8SRGB  , 128,  5,  4,  1,  un8,  un8,  un8,  un8, ,     ,,   srgb,  astc
 ASTC_LDR_2D_5X5_U8SRGB  , 128,  5,  5,  1,  un8,  un8,  un8,  un8, ,     ,,   srgb,  astc
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v4 33/34] i965: Get rid of gen6_surface_state.c

2016-07-13 Thread Jason Ekstrand
The only useful thing left was gen6_init_vtable_surface_functions which we
can easily put in brw_wm_surface_state.c.

Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/Makefile.sources   |  1 -
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  7 
 src/mesa/drivers/dri/i965/gen6_surface_state.c   | 48 
 3 files changed, 7 insertions(+), 49 deletions(-)
 delete mode 100644 src/mesa/drivers/dri/i965/gen6_surface_state.c

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
index 0d2f4f1..ca7591f 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -186,7 +186,6 @@ i965_FILES = \
gen6_scissor_state.c \
gen6_sf_state.c \
gen6_sol.c \
-   gen6_surface_state.c \
gen6_urb.c \
gen6_viewport_state.c \
gen6_vs_state.c \
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index d2b4b5e..8ed43589 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1588,6 +1588,13 @@ gen4_init_vtable_surface_functions(struct brw_context *brw)
brw->vtbl.emit_null_surface_state = brw_emit_null_surface_state;
 }
 
+void
+gen6_init_vtable_surface_functions(struct brw_context *brw)
+{
+   gen4_init_vtable_surface_functions(brw);
+   brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
+}
+
 static void
 brw_upload_cs_work_groups_surface(struct brw_context *brw)
 {
diff --git a/src/mesa/drivers/dri/i965/gen6_surface_state.c b/src/mesa/drivers/dri/i965/gen6_surface_state.c
deleted file mode 100644
index 84b8ef4..000
--- a/src/mesa/drivers/dri/i965/gen6_surface_state.c
+++ /dev/null
@@ -1,48 +0,0 @@
-/*
- * Copyright (c) 2014 Intel Corporation
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
- */
-
-
-#include "main/context.h"
-#include "main/blend.h"
-#include "main/mtypes.h"
-#include "main/samplerobj.h"
-#include "main/texformat.h"
-#include "program/prog_parameter.h"
-
-#include "intel_mipmap_tree.h"
-#include "intel_batchbuffer.h"
-#include "intel_tex.h"
-#include "intel_fbo.h"
-#include "intel_buffer_objects.h"
-
-#include "brw_context.h"
-#include "brw_state.h"
-#include "brw_defines.h"
-#include "brw_wm.h"
-
-void
-gen6_init_vtable_surface_functions(struct brw_context *brw)
-{
-   gen4_init_vtable_surface_functions(brw);
-   brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
-}
-- 
2.5.0.400.gff86faf
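
For readers skimming the archive, the vtable setup that this patch moves can be pictured with a small standalone sketch. It is not the driver code: `demo_context`, `demo_vtbl` and the `demo_*`/`generic_*` functions are hypothetical stand-ins, and the integer return values exist only so the chosen entry can be observed. The shape is the same as in the patch: the gen6 initializer first applies the gen4 defaults, then overrides a single entry.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for the vtbl member of brw_context. */
struct demo_vtbl {
   int (*update_texture_surface)(void);
   int (*update_renderbuffer_surface)(void);
};

struct demo_context {
   int gen;
   struct demo_vtbl vtbl;
};

/* Placeholder hooks; the return value only identifies which one ran. */
static int generic_texture(void)      { return 1; }
static int gen4_renderbuffer(void)    { return 4; }
static int generic_renderbuffer(void) { return 6; }

static void
demo_gen4_init_vtable(struct demo_context *ctx)
{
   ctx->vtbl.update_texture_surface = generic_texture;
   ctx->vtbl.update_renderbuffer_surface = gen4_renderbuffer;
}

/* Start from the gen4 defaults, then replace one entry. */
static void
demo_gen6_init_vtable(struct demo_context *ctx)
{
   demo_gen4_init_vtable(ctx);
   ctx->vtbl.update_renderbuffer_surface = generic_renderbuffer;
}
```

Because the override happens after the shared initializer runs, any hook the gen4 path gains later is inherited by gen6 without further changes.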



[Mesa-dev] [PATCH v4 19/34] i965/state: Add a helper for emitting a surface state using isl

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_state.h|  8 +++
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 79 
 2 files changed, 87 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index a16e876..91ebce9 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -274,6 +274,14 @@ GLuint translate_tex_format(struct brw_context *brw,
 int brw_get_texture_swizzle(const struct gl_context *ctx,
 const struct gl_texture_object *t);
 
+struct isl_view;
+void brw_emit_surface_state(struct brw_context *brw,
+struct intel_mipmap_tree *mt,
+const struct isl_view *view,
+uint32_t mocs, bool for_gather,
+uint32_t *surf_offset, int surf_index,
+unsigned read_domains, unsigned write_domains);
+
 void brw_update_renderbuffer_surfaces(struct brw_context *brw,
   const struct gl_framebuffer *fb,
   uint32_t render_target_start,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index c101e05..65a1f3c 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -35,6 +35,7 @@
 #include "main/mtypes.h"
 #include "main/samplerobj.h"
 #include "main/shaderimage.h"
+#include "main/teximage.h"
 #include "program/prog_parameter.h"
 #include "program/prog_instruction.h"
 #include "main/framebuffer.h"
@@ -52,6 +53,84 @@
 #include "brw_defines.h"
 #include "brw_wm.h"
 
+struct surface_state_info {
+   unsigned num_dwords;
+   unsigned ss_align; /* Required alignment of RENDER_SURFACE_STATE in bytes */
+   unsigned reloc_dw;
+   unsigned aux_reloc_dw;
+   unsigned tex_mocs;
+   unsigned rb_mocs;
+};
+
+static const struct surface_state_info surface_state_infos[] = {
+   [4] = {6,  32, 1,  0},
+   [5] = {6,  32, 1,  0},
+   [6] = {6,  32, 1,  0},
+   [7] = {8,  32, 1,  6,  GEN7_MOCS_L3, GEN7_MOCS_L3},
+   [8] = {13, 64, 8,  10, BDW_MOCS_WB,  BDW_MOCS_PTE},
+   [9] = {16, 64, 8,  10, SKL_MOCS_WB,  SKL_MOCS_PTE},
+};
+
+void
+brw_emit_surface_state(struct brw_context *brw,
+   struct intel_mipmap_tree *mt,
+   const struct isl_view *view,
+   uint32_t mocs, bool for_gather,
+   uint32_t *surf_offset, int surf_index,
+   unsigned read_domains, unsigned write_domains)
+{
+   const struct surface_state_info ss_info = surface_state_infos[brw->gen];
+
+   struct isl_surf surf;
+   intel_miptree_get_isl_surf(brw, mt, &surf);
+
+   union isl_color_value clear_color = { .u32 = { 0, 0, 0, 0 } };
+
+   struct isl_surf *aux_surf = NULL, aux_surf_s;
+   uint64_t aux_offset = 0;
+   enum isl_aux_usage aux_usage = ISL_AUX_USAGE_NONE;
+   if (mt->mcs_mt &&
+   ((view->usage & ISL_SURF_USAGE_RENDER_TARGET_BIT) ||
+mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_RESOLVED)) {
+  intel_miptree_get_aux_isl_surf(brw, mt, &aux_surf_s, &aux_usage);
+  aux_surf = &aux_surf_s;
+  assert(mt->mcs_mt->offset == 0);
+  aux_offset = mt->mcs_mt->bo->offset64;
+
+  /* We only really need a clear color if we also have an auxiliary
+   * surface.  Without one, it does nothing.
+   */
+  clear_color = intel_miptree_get_isl_clear_color(brw, mt);
+   }
+
+   uint32_t *dw = __brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
+ss_info.num_dwords * 4, ss_info.ss_align,
+surf_index, surf_offset);
+
+   isl_surf_fill_state(&brw->isl_dev, dw, .surf = &surf, .view = view,
+   .address = mt->bo->offset64 + mt->offset,
+   .aux_surf = aux_surf, .aux_usage = aux_usage,
+   .aux_address = aux_offset,
+   .mocs = mocs, .clear_color = clear_color);
+
+   drm_intel_bo_emit_reloc(brw->batch.bo,
+   *surf_offset + 4 * ss_info.reloc_dw,
+   mt->bo, mt->offset,
+   read_domains, write_domains);
+
+   if (aux_surf) {
+  /* On gen7 and prior, the bottom 12 bits of the MCS base address are
+   * used to store other information.  This should be ok, however, because
+   * surface buffer addresses are always 4K page aligned.
+   */
+  assert((aux_offset & 0xfff) == 0);
+  drm_intel_bo_emit_reloc(brw->batch.bo,
+  *surf_offset + 4 * ss_info.aux_reloc_dw,
+  mt->mcs_mt->bo, dw[ss_info.aux_reloc_dw] & 0xfff,
+  read_domains, write_domains);
+   }
+}
+
 GLuint
 translate_tex_target(GLenum target)
 {
-- 
2.5.0.400.gff86faf
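
The per-generation table in this patch relies on C99 designated initializers indexed by the hardware generation. A minimal standalone sketch of that lookup follows, using the dword counts, alignments and relocation dwords from the patch but dropping the MOCS fields. The `demo_*` names and both helper functions are illustrative assumptions and do not exist in Mesa.

```c
#include <assert.h>

/* Cut-down copy of the table from the patch (MOCS fields omitted). */
struct demo_ss_info {
   unsigned num_dwords;   /* size of RENDER_SURFACE_STATE in dwords */
   unsigned ss_align;     /* required alignment in bytes */
   unsigned reloc_dw;     /* dword holding the surface address */
   unsigned aux_reloc_dw; /* dword holding the aux surface address */
};

static const struct demo_ss_info demo_ss_infos[] = {
   [4] = {6,  32, 1, 0},
   [5] = {6,  32, 1, 0},
   [6] = {6,  32, 1, 0},
   [7] = {8,  32, 1, 6},
   [8] = {13, 64, 8, 10},
   [9] = {16, 64, 8, 10},
};

#define DEMO_ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Bytes to reserve in the batch for one surface state. */
static unsigned
demo_state_size_bytes(unsigned gen)
{
   assert(gen >= 4 && gen < DEMO_ARRAY_LEN(demo_ss_infos));
   return demo_ss_infos[gen].num_dwords * 4;
}

/* Byte offset in the batch at which the main relocation is emitted. */
static unsigned
demo_reloc_offset(unsigned surf_offset, unsigned gen)
{
   assert(gen >= 4 && gen < DEMO_ARRAY_LEN(demo_ss_infos));
   return surf_offset + 4 * demo_ss_infos[gen].reloc_dw;
}
```

Entries below index 4 are zero-initialized by the compiler, which is why the sketch range-checks `gen` before indexing; the patch itself indexes with `brw->gen` directly.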


[Mesa-dev] [PATCH v4 24/34] i965/gen8: Use the generic ISL-based path for texture surfaces

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/gen8_surface_state.c | 214 +
 1 file changed, 1 insertion(+), 213 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index bd9e2a1..ed26271 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -42,23 +42,6 @@
 #include "brw_wm.h"
 #include "isl/isl.h"
 
-/**
- * Convert an swizzle enumeration (i.e. SWIZZLE_X) to one of the Gen7.5+
- * "Shader Channel Select" enumerations (i.e. HSW_SCS_RED).  The mappings are
- *
- * SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_ZERO, SWIZZLE_ONE
- * 0  1  2  3 45
- * 4  5  6  7 01
- *   SCS_RED, SCS_GREEN,  SCS_BLUE, SCS_ALPHA, SCS_ZERO, SCS_ONE
- *
- * which is simply adding 4 then modding by 8 (or anding with 7).
- */
-static unsigned
-swizzle_to_scs(unsigned swizzle)
-{
-   return (swizzle + 4) & 7;
-}
-
 static uint32_t
 surface_tiling_resource_mode(uint32_t tr_mode)
 {
@@ -224,200 +207,6 @@ gen8_get_aux_mode(const struct brw_context *brw,
return GEN8_SURFACE_AUX_MODE_MCS;
 }
 
-static void
-gen8_emit_texture_surface_state(struct brw_context *brw,
-struct intel_mipmap_tree *mt,
-GLenum target,
-unsigned min_layer, unsigned max_layer,
-unsigned min_level, unsigned max_level,
-unsigned format,
-unsigned swizzle,
-uint32_t *surf_offset, int surf_index,
-bool rw, bool for_gather)
-{
-   const unsigned depth = max_layer - min_layer;
-   struct intel_mipmap_tree *aux_mt = mt->mcs_mt;
-   uint32_t mocs_wb = brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB;
-   unsigned tiling_mode, pitch;
-   const unsigned tr_mode = surface_tiling_resource_mode(mt->tr_mode);
-   const uint32_t surf_type = translate_tex_target(target);
-   uint32_t aux_mode = gen8_get_aux_mode(brw, mt);
-
-   if (mt->format == MESA_FORMAT_S_UINT8) {
-  tiling_mode = GEN8_SURFACE_TILING_W;
-  pitch = 2 * mt->pitch;
-   } else {
-  tiling_mode = gen8_surface_tiling_mode(mt->tiling);
-  pitch = mt->pitch;
-   }
-
-   /* Prior to Gen9, MCS is not uploaded for single-sampled surfaces because
-* the color buffer should always have been resolved before it is used as
-* a texture so there is no need for it. On Gen9 it will be uploaded when
-* the surface is losslessly compressed (CCS_E).
-* However, sampling engine is not capable of re-interpreting the
-* underlying color buffer in non-compressible formats when the surface
-* is configured as compressed. Therefore state upload has made sure the
-* buffer is in resolved state allowing the surface to be configured as
-* non-compressed.
-*/
-   if (mt->num_samples <= 1 &&
-   (aux_mode != GEN9_SURFACE_AUX_MODE_CCS_E ||
-!isl_format_supports_lossless_compression(
-brw->intelScreen->devinfo, format))) {
-  assert(!mt->mcs_mt ||
- mt->fast_clear_state == INTEL_FAST_CLEAR_STATE_RESOLVED);
-  aux_mt = NULL;
-  aux_mode = GEN8_SURFACE_AUX_MODE_NONE;
-   }
-
-   uint32_t *surf = gen8_allocate_surface_state(brw, surf_offset, surf_index);
-
-   surf[0] = SET_FIELD(surf_type, BRW_SURFACE_TYPE) |
- format << BRW_SURFACE_FORMAT_SHIFT |
- gen8_vertical_alignment(brw, mt, surf_type) |
- gen8_horizontal_alignment(brw, mt, surf_type) |
- tiling_mode;
-
-   if (surf_type == BRW_SURFACE_CUBE) {
-  surf[0] |= BRW_SURFACE_CUBEFACE_ENABLES;
-   }
-
-   /* From the CHV PRM, Volume 2d, page 321 (RENDER_SURFACE_STATE dword 0
-* bit 9 "Sampler L2 Bypass Mode Disable" Programming Notes):
-*
-*This bit must be set for the following surface types: BC2_UNORM
-*BC3_UNORM BC5_UNORM BC5_SNORM BC7_UNORM
-*/
-   if ((brw->gen >= 9 || brw->is_cherryview) &&
-   (format == BRW_SURFACEFORMAT_BC2_UNORM ||
-format == BRW_SURFACEFORMAT_BC3_UNORM ||
-format == BRW_SURFACEFORMAT_BC5_UNORM ||
-format == BRW_SURFACEFORMAT_BC5_SNORM ||
-format == BRW_SURFACEFORMAT_BC7_UNORM))
-  surf[0] |= GEN8_SURFACE_SAMPLER_L2_BYPASS_DISABLE;
-
-   if (mt->target != GL_TEXTURE_3D)
-  surf[0] |= GEN8_SURFACE_IS_ARRAY;
-
-   surf[1] = SET_FIELD(mocs_wb, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
-
-   surf[2] = SET_FIELD(mt->logical_width0 - 1, GEN7_SURFACE_WIDTH) |
- SET_FIELD(mt->logical_height0 - 1, GEN7_SURFACE_HEIGHT);
-
-   surf[3] = SET_FIELD(depth - 1, BRW_SURFACE_DEPTH) | (pitch - 1);
-
-   surf[4] = gen7_surface_msaa_bits(mt->num_samples, mt->msaa_layout) |
- SET_FIELD(min_layer, GEN7_SURFACE

[Mesa-dev] [PATCH v4 26/34] i965/gen7: Use the generic ISL-based path for texture surfaces

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 168 +-
 1 file changed, 1 insertion(+), 167 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 932e62e..bdb4f66 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -39,27 +39,6 @@
 #include "brw_defines.h"
 #include "brw_wm.h"
 
-/**
- * Convert an swizzle enumeration (i.e. SWIZZLE_X) to one of the Gen7.5+
- * "Shader Channel Select" enumerations (i.e. HSW_SCS_RED).  The mappings are
- *
- * SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_ZERO, SWIZZLE_ONE
- * 0  1  2  3 45
- * 4  5  6  7 01
- *   SCS_RED, SCS_GREEN,  SCS_BLUE, SCS_ALPHA, SCS_ZERO, SCS_ONE
- *
- * which is simply adding 4 then modding by 8 (or anding with 7).
- *
- * We then may need to apply workarounds for textureGather hardware bugs.
- */
-static unsigned
-swizzle_to_scs(GLenum swizzle, bool need_green_to_blue)
-{
-   unsigned scs = (swizzle + 4) & 7;
-
-   return (need_green_to_blue && scs == HSW_SCS_GREEN) ? HSW_SCS_BLUE : scs;
-}
-
 uint32_t
 gen7_surface_tiling_mode(uint32_t tiling)
 {
@@ -264,150 +243,6 @@ gen7_emit_buffer_surface_state(struct brw_context *brw,
gen7_check_surface_setup(surf, false /* is_render_target */);
 }
 
-static void
-gen7_emit_texture_surface_state(struct brw_context *brw,
-struct intel_mipmap_tree *mt,
-GLenum target,
-unsigned min_layer, unsigned max_layer,
-unsigned min_level, unsigned max_level,
-unsigned format,
-unsigned swizzle,
-uint32_t *surf_offset,
-int surf_index /* unused */,
-bool rw, bool for_gather)
-{
-   const unsigned depth = max_layer - min_layer;
-   uint32_t *surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
-8 * 4, 32, surf_offset);
-
-   memset(surf, 0, 8 * 4);
-
-   surf[0] = translate_tex_target(target) << BRW_SURFACE_TYPE_SHIFT |
- format << BRW_SURFACE_FORMAT_SHIFT |
- gen7_surface_tiling_mode(mt->tiling);
-
-   /* mask of faces present in cube map; for other surfaces MBZ. */
-   if (target == GL_TEXTURE_CUBE_MAP || target == GL_TEXTURE_CUBE_MAP_ARRAY)
-  surf[0] |= BRW_SURFACE_CUBEFACE_ENABLES;
-
-   if (mt->valign == 4)
-  surf[0] |= GEN7_SURFACE_VALIGN_4;
-   if (mt->halign == 8)
-  surf[0] |= GEN7_SURFACE_HALIGN_8;
-
-   if (mt->target != GL_TEXTURE_3D)
-  surf[0] |= GEN7_SURFACE_IS_ARRAY;
-
-   if (mt->array_layout == ALL_SLICES_AT_EACH_LOD)
-  surf[0] |= GEN7_SURFACE_ARYSPC_LOD0;
-
-   surf[1] = mt->bo->offset64 + mt->offset; /* reloc */
-
-   surf[2] = SET_FIELD(mt->logical_width0 - 1, GEN7_SURFACE_WIDTH) |
- SET_FIELD(mt->logical_height0 - 1, GEN7_SURFACE_HEIGHT);
-
-   surf[3] = SET_FIELD(depth - 1, BRW_SURFACE_DEPTH) |
- (mt->pitch - 1);
-
-   if (brw->is_haswell && _mesa_is_format_integer(mt->format))
-  surf[3] |= HSW_SURFACE_IS_INTEGER_FORMAT;
-
-   surf[4] = gen7_surface_msaa_bits(mt->num_samples, mt->msaa_layout) |
- SET_FIELD(min_layer, GEN7_SURFACE_MIN_ARRAY_ELEMENT) |
- SET_FIELD(depth - 1, GEN7_SURFACE_RENDER_TARGET_VIEW_EXTENT);
-
-   surf[5] = (SET_FIELD(GEN7_MOCS_L3, GEN7_SURFACE_MOCS) |
-  SET_FIELD(min_level - mt->first_level, GEN7_SURFACE_MIN_LOD) |
-  /* mip count */
-  (max_level - min_level - 1));
-
-   surf[7] = mt->fast_clear_color_value;
-
-   if (brw->is_haswell) {
-  const bool need_scs_green_to_blue = for_gather && format == BRW_SURFACEFORMAT_R32G32_FLOAT_LD;
-
-  surf[7] |=
- SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 0), need_scs_green_to_blue), GEN7_SURFACE_SCS_R) |
- SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 1), need_scs_green_to_blue), GEN7_SURFACE_SCS_G) |
- SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 2), need_scs_green_to_blue), GEN7_SURFACE_SCS_B) |
- SET_FIELD(swizzle_to_scs(GET_SWZ(swizzle, 3), need_scs_green_to_blue), GEN7_SURFACE_SCS_A);
-   }
-
-   if (mt->mcs_mt) {
-  gen7_set_surface_mcs_info(brw, surf, *surf_offset,
-mt->mcs_mt, false /* is RT */);
-   }
-
-   /* Emit relocation to surface contents */
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   *surf_offset + 4,
-   mt->bo,
-   surf[1] - mt->bo->offset64,
-   I915_GEM_DOMAIN_SAMPLER,
-   (rw ? I915_GEM_DOMAIN_SAMPLER : 0));
-
-   gen7_check_surface_se

[Mesa-dev] [PATCH v4 29/34] i965/gen4-6: Use the generic ISL-based path for texture surfaces

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 94 +---
 1 file changed, 1 insertion(+), 93 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 084bd8c..01c9802 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -550,98 +550,6 @@ brw_update_buffer_texture_surface(struct gl_context *ctx,
false /* rw */);
 }
 
-static void
-gen4_update_texture_surface(struct gl_context *ctx,
-unsigned unit,
-uint32_t *surf_offset,
-bool for_gather,
-uint32_t plane)
-{
-   struct brw_context *brw = brw_context(ctx);
-   struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
-   struct intel_texture_object *intelObj = intel_texture_object(tObj);
-   struct intel_mipmap_tree *mt = intelObj->mt;
-   struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
-   uint32_t *surf;
-
-   /* BRW_NEW_TEXTURE_BUFFER */
-   if (tObj->Target == GL_TEXTURE_BUFFER) {
-  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
-  return;
-   }
-
-   if (plane > 0) {
-  if (mt->plane[plane - 1] == NULL)
- return;
-  mt = mt->plane[plane - 1];
-   }
-
-   surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
- 6 * 4, 32, surf_offset);
-
-   mesa_format mesa_fmt = plane == 0 ? intelObj->_Format : mt->format;
-   uint32_t tex_format = translate_tex_format(brw, mesa_fmt,
-  sampler->sRGBDecode);
-
-   if (for_gather) {
-  /* Sandybridge's gather4 message is broken for integer formats.
-   * To work around this, we pretend the surface is UNORM for
-   * 8 or 16-bit formats, and emit shader instructions to recover
-   * the real INT/UINT value.  For 32-bit formats, we pretend
-   * the surface is FLOAT, and simply reinterpret the resulting
-   * bits.
-   */
-  switch (tex_format) {
-  case BRW_SURFACEFORMAT_R8_SINT:
-  case BRW_SURFACEFORMAT_R8_UINT:
- tex_format = BRW_SURFACEFORMAT_R8_UNORM;
- break;
-
-  case BRW_SURFACEFORMAT_R16_SINT:
-  case BRW_SURFACEFORMAT_R16_UINT:
- tex_format = BRW_SURFACEFORMAT_R16_UNORM;
- break;
-
-  case BRW_SURFACEFORMAT_R32_SINT:
-  case BRW_SURFACEFORMAT_R32_UINT:
- tex_format = BRW_SURFACEFORMAT_R32_FLOAT;
- break;
-
-  default:
- break;
-  }
-   }
-
-   surf[0] = (translate_tex_target(tObj->Target) << BRW_SURFACE_TYPE_SHIFT |
- BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT |
- BRW_SURFACE_CUBEFACE_ENABLES |
- tex_format << BRW_SURFACE_FORMAT_SHIFT);
-
-   surf[1] = mt->bo->offset64 + mt->offset; /* reloc */
-
-   surf[2] = ((intelObj->_MaxLevel - tObj->BaseLevel) << BRW_SURFACE_LOD_SHIFT |
- (mt->logical_width0 - 1) << BRW_SURFACE_WIDTH_SHIFT |
- (mt->logical_height0 - 1) << BRW_SURFACE_HEIGHT_SHIFT);
-
-   surf[3] = (brw_get_surface_tiling_bits(mt->tiling) |
- (mt->logical_depth0 - 1) << BRW_SURFACE_DEPTH_SHIFT |
- (mt->pitch - 1) << BRW_SURFACE_PITCH_SHIFT);
-
-   const unsigned min_lod = tObj->MinLevel + tObj->BaseLevel - mt->first_level;
-   surf[4] = (brw_get_surface_num_multisamples(mt->num_samples) |
-  SET_FIELD(min_lod, BRW_SURFACE_MIN_LOD) |
-  SET_FIELD(tObj->MinLayer, BRW_SURFACE_MIN_ARRAY_ELEMENT));
-
-   surf[5] = mt->valign == 4 ? BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0;
-
-   /* Emit relocation to surface contents */
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   *surf_offset + 4,
-   mt->bo,
-   surf[1] - mt->bo->offset64,
-   I915_GEM_DOMAIN_SAMPLER, 0);
-}
-
 /**
  * Create the constant buffer surface.  Vertex/fragment shader constants will be
  * read from this buffer with Data Port Read instructions/messages.
@@ -1678,7 +1586,7 @@ const struct brw_tracked_state brw_wm_image_surfaces = {
 void
 gen4_init_vtable_surface_functions(struct brw_context *brw)
 {
-   brw->vtbl.update_texture_surface = gen4_update_texture_surface;
+   brw->vtbl.update_texture_surface = brw_update_texture_surface;
brw->vtbl.update_renderbuffer_surface = gen4_update_renderbuffer_surface;
brw->vtbl.emit_null_surface_state = brw_emit_null_surface_state;
brw->vtbl.emit_buffer_surface_state = gen4_emit_buffer_surface_state;
-- 
2.5.0.400.gff86faf
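
The comment in the removed gen4 function describes a format substitution used only for gather4: 8-bit and 16-bit integer formats are presented as UNORM, and 32-bit integer formats as FLOAT. The sketch below restates that rule in isolation. The `demo_format` tokens and `demo_gather_format` are hypothetical; the real code switches on `BRW_SURFACEFORMAT_*` values, and this sketch does not show where the generic ISL path handles the case.

```c
#include <assert.h>

/* Hypothetical format tokens; the real BRW_SURFACEFORMAT_* values differ. */
enum demo_format {
   DEMO_R8_SINT, DEMO_R8_UINT, DEMO_R8_UNORM,
   DEMO_R16_SINT, DEMO_R16_UINT, DEMO_R16_UNORM,
   DEMO_R32_SINT, DEMO_R32_UINT, DEMO_R32_FLOAT,
   DEMO_R8G8B8A8_UNORM,
};

/* Pick the format to program into the surface state for a gather4 lookup. */
static enum demo_format
demo_gather_format(enum demo_format format, int for_gather)
{
   if (!for_gather)
      return format;

   switch (format) {
   case DEMO_R8_SINT:
   case DEMO_R8_UINT:
      return DEMO_R8_UNORM;   /* shader recovers the integer value */
   case DEMO_R16_SINT:
   case DEMO_R16_UINT:
      return DEMO_R16_UNORM;  /* shader recovers the integer value */
   case DEMO_R32_SINT:
   case DEMO_R32_UINT:
      return DEMO_R32_FLOAT;  /* bits are reinterpreted unchanged */
   default:
      return format;
   }
}
```

Formats outside the three integer groups pass through untouched, matching the `default: break;` arm of the removed switch.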



[Mesa-dev] [PATCH v4 10/34] i965/miptree: Add a helper for getting an isl_surf from a miptree

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 175 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |   6 +
 2 files changed, 179 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b6265dc..8519f46 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -26,8 +26,6 @@
 #include 
 #include 
 
-#include "isl/isl.h"
-
 #include "intel_batchbuffer.h"
 #include "intel_mipmap_tree.h"
 #include "intel_resolve_map.h"
@@ -2999,3 +2997,176 @@ intel_miptree_unmap(struct brw_context *brw,
 
intel_miptree_release_map(mt, level, slice);
 }
+
+void
+intel_miptree_get_isl_surf(struct brw_context *brw,
+   const struct intel_mipmap_tree *mt,
+   struct isl_surf *surf)
+{
+   switch (mt->target) {
+   case GL_TEXTURE_1D:
+   case GL_TEXTURE_1D_ARRAY: {
+  surf->dim = ISL_SURF_DIM_1D;
+  if (brw->gen >= 9 && mt->tiling == I915_TILING_NONE)
+ surf->dim_layout = ISL_DIM_LAYOUT_GEN9_1D;
+  else
+ surf->dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
+  break;
+   }
+   case GL_TEXTURE_2D:
+   case GL_TEXTURE_2D_ARRAY:
+   case GL_TEXTURE_RECTANGLE:
+   case GL_TEXTURE_CUBE_MAP:
+   case GL_TEXTURE_CUBE_MAP_ARRAY:
+   case GL_TEXTURE_2D_MULTISAMPLE:
+   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
+   case GL_TEXTURE_EXTERNAL_OES:
+  surf->dim = ISL_SURF_DIM_2D;
+  surf->dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
+  break;
+   case GL_TEXTURE_3D:
+  surf->dim = ISL_SURF_DIM_3D;
+  if (brw->gen >= 9)
+ surf->dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
+  else
+ surf->dim_layout = ISL_DIM_LAYOUT_GEN4_3D;
+  break;
+   default:
+  unreachable("Invalid texture target");
+   }
+
+   if (mt->num_samples > 1) {
+  switch (mt->msaa_layout) {
+  case INTEL_MSAA_LAYOUT_NONE:
+ surf->msaa_layout = ISL_MSAA_LAYOUT_NONE;
+ break;
+  case INTEL_MSAA_LAYOUT_IMS:
+ surf->msaa_layout = ISL_MSAA_LAYOUT_INTERLEAVED;
+ break;
+  case INTEL_MSAA_LAYOUT_UMS:
+  case INTEL_MSAA_LAYOUT_CMS:
+ surf->msaa_layout = ISL_MSAA_LAYOUT_ARRAY;
+ break;
+  default:
+ unreachable("Invalid MSAA layout");
+  }
+   } else {
+  surf->msaa_layout = ISL_MSAA_LAYOUT_NONE;
+   }
+
+   if (mt->format == MESA_FORMAT_S_UINT8) {
+  surf->tiling = ISL_TILING_W;
+  /* The ISL definition of row_pitch matches the surface state pitch field
+   * a bit better than intel_mipmap_tree.  In particular, ISL incorporates
+   * the factor of 2 for W-tiling in row_pitch.
+   */
+  surf->row_pitch = 2 * mt->pitch;
+   } else {
+  switch (mt->tiling) {
+  case I915_TILING_NONE:
+ surf->tiling = ISL_TILING_LINEAR;
+ break;
+  case I915_TILING_X:
+ surf->tiling = ISL_TILING_X;
+ break;
+  case I915_TILING_Y:
+ switch (mt->tr_mode) {
+ case INTEL_MIPTREE_TRMODE_NONE:
+surf->tiling = ISL_TILING_Y0;
+break;
+ case INTEL_MIPTREE_TRMODE_YF:
+surf->tiling = ISL_TILING_Yf;
+break;
+ case INTEL_MIPTREE_TRMODE_YS:
+surf->tiling = ISL_TILING_Ys;
+break;
+ }
+ break;
+  default:
+ unreachable("Invalid tiling mode");
+  }
+
+  surf->row_pitch = mt->pitch;
+   }
+
+   surf->format = translate_tex_format(brw, mt->format, false);
+
+   if (brw->gen >= 9) {
+  if (surf->dim == ISL_SURF_DIM_1D && surf->tiling == ISL_TILING_LINEAR) {
+ /* For gen9 1-D surfaces, intel_mipmap_tree has a bogus alignment. */
+ surf->image_alignment_el = isl_extent3d(64, 1, 1);
+  } else {
+ /* On gen9+, intel_mipmap_tree stores the horizontal and vertical
+  * alignment in terms of surface elements like we want.
+  */
+ surf->image_alignment_el = isl_extent3d(mt->halign, mt->valign, 1);
+  }
+   } else {
+  /* On earlier gens it's stored in pixels. */
+  unsigned bw, bh;
+  _mesa_get_format_block_size(mt->format, &bw, &bh);
+  surf->image_alignment_el =
+ isl_extent3d(mt->halign / bw, mt->valign / bh, 1);
+   }
+
+   surf->logical_level0_px.width = mt->logical_width0;
+   surf->logical_level0_px.height = mt->logical_height0;
+   if (surf->dim == ISL_SURF_DIM_3D) {
+  surf->logical_level0_px.depth = mt->logical_depth0;
+  surf->logical_level0_px.array_len = 1;
+   } else if (mt->target == GL_TEXTURE_CUBE_MAP ||
+  mt->target == GL_TEXTURE_CUBE_MAP_ARRAY) {
+  /* For cube maps, mt->logical_depth0 is in number of cubes */
+  surf->logical_level0_px.depth = 1;
+  surf->logical_level0_px.array_len = mt->logical_depth0 * 6;
+   } else {
+  surf->logical_level0_px.depth = 1;
+  surf->logical_level0_px.array_len 

[Mesa-dev] [PATCH v4 20/34] i965/blorp: Use a generic ISL path for texture surfaces on gen8

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/gen8_blorp.c | 47 +++---
 1 file changed, 38 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_blorp.c b/src/mesa/drivers/dri/i965/gen8_blorp.c
index 498c9f6..870b67f 100644
--- a/src/mesa/drivers/dri/i965/gen8_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen8_blorp.c
@@ -475,6 +475,25 @@ gen8_blorp_emit_depth_stencil_state(struct brw_context *brw,
ADVANCE_BATCH();
 }
 
+/**
+ * Convert a swizzle enumeration (i.e. SWIZZLE_X) to one of the Gen7.5+
+ * "Shader Channel Select" enumerations (i.e. HSW_SCS_RED).  The mappings are
+ *
+ * SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_ZERO, SWIZZLE_ONE
+ * 0  1  2  3 45
+ * 4  5  6  7 01
+ *   SCS_RED, SCS_GREEN,  SCS_BLUE, SCS_ALPHA, SCS_ZERO, SCS_ONE
+ *
+ * which is simply adding 4 then modding by 8 (or anding with 7).
+ *
+ * We then may need to apply workarounds for textureGather hardware bugs.
+ */
+static unsigned
+swizzle_to_scs(GLenum swizzle)
+{
+   return (swizzle + 4) & 7;
+}
+
 static uint32_t
 gen8_blorp_emit_surface_states(struct brw_context *brw,
const struct brw_blorp_params *params)
@@ -507,21 +526,31 @@ gen8_blorp_emit_surface_states(struct brw_context *brw,
   mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) ?
  MAX2(mt->num_samples, 1) : 1;
 
-  /* Cube textures are sampled as 2D array. */
   const bool is_cube = mt->target == GL_TEXTURE_CUBE_MAP_ARRAY ||
mt->target == GL_TEXTURE_CUBE_MAP;
   const unsigned depth = (is_cube ? 6 : 1) * mt->logical_depth0;
-  const GLenum target = is_cube ? GL_TEXTURE_2D_ARRAY : mt->target;
   const unsigned layer = mt->target != GL_TEXTURE_3D ?
 surface->layer / layer_divider : 0;
 
-  brw->vtbl.emit_texture_surface_state(brw, mt, target,
-   layer, depth,
-   surface->level, mt->last_level + 1,
-   surface->brw_surfaceformat,
-   surface->swizzle,
-   &wm_surf_offset_texture,
-   -1, false, false);
+  struct isl_view view = {
+ .format = surface->brw_surfaceformat,
+ .base_level = surface->level,
+ .levels = mt->last_level - surface->level + 1,
+ .base_array_layer = layer,
+ .array_len = depth - layer,
+ .channel_select = {
+swizzle_to_scs(GET_SWZ(surface->swizzle, 0)),
+swizzle_to_scs(GET_SWZ(surface->swizzle, 1)),
+swizzle_to_scs(GET_SWZ(surface->swizzle, 2)),
+swizzle_to_scs(GET_SWZ(surface->swizzle, 3)),
+ },
+ .usage = ISL_SURF_USAGE_TEXTURE_BIT,
+  };
+
+  brw_emit_surface_state(brw, mt, &view,
+ brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB,
+ false, &wm_surf_offset_texture, -1,
+ I915_GEM_DOMAIN_SAMPLER, 0);
}
 
return gen6_blorp_emit_binding_table(brw,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v4 13/34] i965/blorp: Add a generic ISL-based surface state emit path

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 157 ++
 src/mesa/drivers/dri/i965/brw_blorp.h |   6 ++
 2 files changed, 163 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index 04c10b6..282a5b2 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -231,6 +231,163 @@ brw_blorp_compile_nir_shader(struct brw_context *brw, struct nir_shader *nir,
return program;
 }
 
+static enum isl_msaa_layout
+get_isl_msaa_layout(enum intel_msaa_layout layout)
+{
+   switch (layout) {
+   case INTEL_MSAA_LAYOUT_NONE:
+  return ISL_MSAA_LAYOUT_NONE;
+   case INTEL_MSAA_LAYOUT_IMS:
+  return ISL_MSAA_LAYOUT_INTERLEAVED;
+   case INTEL_MSAA_LAYOUT_UMS:
+   case INTEL_MSAA_LAYOUT_CMS:
+  return ISL_MSAA_LAYOUT_ARRAY;
+   default:
+  unreachable("Invalid MSAA layout");
+   }
+}
+
+struct surface_state_info {
+   unsigned num_dwords;
+   unsigned ss_align; /* Required alignment of RENDER_SURFACE_STATE in bytes */
+   unsigned reloc_dw;
+   unsigned aux_reloc_dw;
+   unsigned tex_mocs;
+   unsigned rb_mocs;
+};
+
+static const struct surface_state_info surface_state_infos[] = {
+   [6] = {6,  32, 1,  0},
+   [7] = {8,  32, 1,  6,  GEN7_MOCS_L3, GEN7_MOCS_L3},
+   [8] = {13, 64, 8,  10, BDW_MOCS_WB,  BDW_MOCS_PTE},
+   [9] = {16, 64, 8,  10, SKL_MOCS_WB,  SKL_MOCS_PTE},
+};
+
+uint32_t
+brw_blorp_emit_surface_state(struct brw_context *brw,
+ const struct brw_blorp_surface_info *surface,
+ uint32_t read_domains, uint32_t write_domain,
+ bool is_render_target)
+{
+   const struct surface_state_info ss_info = surface_state_infos[brw->gen];
+
+   struct isl_surf surf;
+   intel_miptree_get_isl_surf(brw, surface->mt, &surf);
+
+   /* Stomp surface dimensions and tiling (if needed) with info from blorp */
+   surf.dim = ISL_SURF_DIM_2D;
+   surf.dim_layout = ISL_DIM_LAYOUT_GEN4_2D;
+   surf.msaa_layout = get_isl_msaa_layout(surface->msaa_layout);
+   surf.logical_level0_px.width = surface->width;
+   surf.logical_level0_px.height = surface->height;
+   surf.logical_level0_px.depth = 1;
+   surf.logical_level0_px.array_len = 1;
+   surf.levels = 1;
+   surf.samples = MAX2(surface->num_samples, 1);
+
+   /* Alignment doesn't matter since we have 1 miplevel and 1 array slice so
+* just pick something that works for everybody.
+*/
+   surf.image_alignment_el = isl_extent3d(4, 4, 1);
+
+   if (brw->gen == 6 && surface->num_samples > 1) {
+  /* Since gen6 uses INTEL_MSAA_LAYOUT_IMS, width and height are measured
+   * in samples.  But SURFACE_STATE wants them in pixels, so we need to
+   * divide them each by 2.
+   */
+  surf.logical_level0_px.width /= 2;
+  surf.logical_level0_px.height /= 2;
+   }
+
+   if (brw->gen == 6 && surf.image_alignment_el.height > 4) {
+  /* This can happen on stencil buffers on Sandy Bridge due to the
+   * single-LOD work-around.  It's fairly harmless as long as we don't
+   * pass a bogus value into isl_surf_fill_state().
+   */
+  surf.image_alignment_el = isl_extent3d(4, 2, 1);
+   }
+
+   /* We need to fake W-tiling with Y-tiling */
+   if (surface->map_stencil_as_y_tiled)
+  surf.tiling = ISL_TILING_Y0;
+
+   union isl_color_value clear_color = { .u32 = { 0, 0, 0, 0 } };
+
+   struct isl_surf *aux_surf = NULL, aux_surf_s;
+   uint64_t aux_offset = 0;
+   enum isl_aux_usage aux_usage = ISL_AUX_USAGE_NONE;
+   if (surface->mt->mcs_mt) {
+  /* We should probably do similar stomping to above but most of the aux
+   * surf gets ignored when we fill out the surface state anyway so
+   * there's no point.
+   */
+  intel_miptree_get_aux_isl_surf(brw, surface->mt, &aux_surf_s, &aux_usage);
+  aux_surf = &aux_surf_s;
+  assert(surface->mt->mcs_mt->offset == 0);
+  aux_offset = surface->mt->mcs_mt->bo->offset64;
+
+  /* We only really need a clear color if we also have an auxiliary
+   * surface.  Without one, it does nothing.
+   */
+  clear_color = intel_miptree_get_isl_clear_color(brw, surface->mt);
+   }
+
+   struct isl_view view = {
+  .format = surface->brw_surfaceformat,
+  .base_level = 0,
+  .levels = 1,
+  .base_array_layer = 0,
+  .array_len = 1,
+  .channel_select = {
+ ISL_CHANNEL_SELECT_RED,
+ ISL_CHANNEL_SELECT_GREEN,
+ ISL_CHANNEL_SELECT_BLUE,
+ ISL_CHANNEL_SELECT_ALPHA,
+  },
+  .usage = is_render_target ? ISL_SURF_USAGE_RENDER_TARGET_BIT :
+  ISL_SURF_USAGE_TEXTURE_BIT,
+   };
+
+   uint32_t offset, tile_x, tile_y;
+   offset = brw_blorp_compute_tile_offsets(surface, &tile_x, &tile_y);
+
+   uint32_t surf_offset;
+   uint32_t *dw = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
+  ss_info.num_dwords * 4, ss
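
The per-generation table in the hunk above replaces several hand-written packers with one lookup. A minimal sketch of that idea follows, assuming the four leading columns mean what the struct comments in the patch suggest. The MOCS columns are left out, and the names `ss_layout`, `ss_size_bytes` and `ss_reloc_offset_bytes` are hypothetical, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative copy of the per-generation layout table from the patch.
 * The MOCS columns are omitted; only the numeric layout data is kept. */
struct ss_layout {
   unsigned num_dwords;   /* size of the surface state in dwords */
   unsigned ss_align;     /* required alignment in bytes */
   unsigned reloc_dw;     /* dword index of the surface address */
   unsigned aux_reloc_dw; /* dword index of the aux address (0 for gen6) */
};

static const struct ss_layout ss_layouts[] = {
   [6] = {6,  32, 1, 0},
   [7] = {8,  32, 1, 6},
   [8] = {13, 64, 8, 10},
   [9] = {16, 64, 8, 10},
};

/* Hypothetical helper: byte size of the state for a given generation. */
static unsigned
ss_size_bytes(unsigned gen)
{
   assert(gen >= 6 && gen < sizeof(ss_layouts) / sizeof(ss_layouts[0]));
   return ss_layouts[gen].num_dwords * 4;
}

/* Hypothetical helper: byte offset of the main address relocation. */
static unsigned
ss_reloc_offset_bytes(unsigned gen)
{
   assert(gen >= 6 && gen < sizeof(ss_layouts) / sizeof(ss_layouts[0]));
   return ss_layouts[gen].reloc_dw * 4;
}
```

The byte offsets this yields (4 on gen7, 32 on gen8) agree with the `wm_surf_offset + 4` and `wm_surf_offset + 8 * 4` relocations in the gen7 and gen8 functions that later patches in the series remove.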

[Mesa-dev] [PATCH v4 17/34] i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen6

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/gen6_blorp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.c b/src/mesa/drivers/dri/i965/gen6_blorp.c
index c38952e..1af898d 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.c
@@ -1030,9 +1030,9 @@ gen6_blorp_exec(struct brw_context *brw,
 
   intel_miptree_used_for_rendering(params->dst.mt);
   wm_surf_offset_renderbuffer =
- gen6_blorp_emit_surface_state(brw, params, &params->dst,
-   I915_GEM_DOMAIN_RENDER,
-   I915_GEM_DOMAIN_RENDER);
+ brw_blorp_emit_surface_state(brw, &params->dst,
+  I915_GEM_DOMAIN_RENDER,
+  I915_GEM_DOMAIN_RENDER, true);
   if (params->src.mt) {
  wm_surf_offset_texture =
 gen6_blorp_emit_surface_state(brw, params, &params->src,
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v4 12/34] i965/miptree: Add a helper for getting the aux isl_surf from a miptree

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 120 ++
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |   5 ++
 2 files changed, 125 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 7d3cec2..114959e 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3171,6 +3171,126 @@ intel_miptree_get_isl_surf(struct brw_context *brw,
surf->usage = 0; /* TODO */
 }
 
+/* WARNING: THE SURFACE CREATED BY THIS FUNCTION IS NOT COMPLETE AND CANNOT BE
+ * USED FOR ANY REAL CALCULATIONS.  THE ONLY VALID USE OF SUCH A SURFACE IS TO
+ * PASS IT INTO isl_surf_fill_state.
+ */
+void
+intel_miptree_get_aux_isl_surf(struct brw_context *brw,
+   const struct intel_mipmap_tree *mt,
+   struct isl_surf *surf,
+   enum isl_aux_usage *usage)
+{
+   /* Much is the same as the regular surface */
+   intel_miptree_get_isl_surf(brw, mt->mcs_mt, surf);
+
+   /* Figure out the layout */
+   if (_mesa_get_format_base_format(mt->format) == GL_DEPTH_COMPONENT) {
+  *usage = ISL_AUX_USAGE_HIZ;
+   } else if (mt->num_samples > 1) {
+  if (mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS)
+ *usage = ISL_AUX_USAGE_MCS;
+  else
+ *usage = ISL_AUX_USAGE_NONE;
+   } else if (intel_miptree_is_lossless_compressed(brw, mt)) {
+  assert(brw->gen >= 9);
+  *usage = ISL_AUX_USAGE_CCS_E;
+   } else if (mt->fast_clear_state != INTEL_FAST_CLEAR_STATE_NO_MCS) {
+  *usage = ISL_AUX_USAGE_CCS_D;
+   } else {
+  /* Can we even get here? */
+  *usage = ISL_AUX_USAGE_NONE;
+   }
+
+   /* Figure out the format and tiling of the auxiliary surface */
+   switch (*usage) {
+   case ISL_AUX_USAGE_NONE:
+  /* Can we even get here? */
+  break;
+
+   case ISL_AUX_USAGE_HIZ:
+  surf->format = ISL_FORMAT_HIZ;
+  surf->tiling = ISL_TILING_HIZ;
+  surf->usage = ISL_SURF_USAGE_HIZ_BIT;
+  break;
+
+   case ISL_AUX_USAGE_MCS:
+  /*
+   * From the SKL PRM:
+   *"When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E,
+   *HALIGN 16 must be used."
+   */
+  if (brw->gen >= 9)
+ assert(mt->halign == 16);
+
+  surf->usage = ISL_SURF_USAGE_MCS_BIT;
+
+  switch (mt->num_samples) {
+  case 2:  surf->format = ISL_FORMAT_MCS_2X;   break;
+  case 4:  surf->format = ISL_FORMAT_MCS_4X;   break;
+  case 8:  surf->format = ISL_FORMAT_MCS_8X;   break;
+  case 16: surf->format = ISL_FORMAT_MCS_16X;  break;
+  default:
+ unreachable("Invalid number of samples");
+  }
+  break;
+
+   case ISL_AUX_USAGE_CCS_D:
+   case ISL_AUX_USAGE_CCS_E:
+  /*
+   * From the BDW PRM, Volume 2d, page 260 (RENDER_SURFACE_STATE):
+   *
+   *"When MCS is enabled for non-MSRT, HALIGN_16 must be used"
+   *
+   * From the hardware spec for GEN9:
+   *
+   *"When Auxiliary Surface Mode is set to AUX_CCS_D or AUX_CCS_E,
+   *HALIGN 16 must be used."
+   */
+  if (brw->gen >= 9 || mt->num_samples == 1)
+ assert(mt->halign == 16);
+
+  surf->tiling = ISL_TILING_CCS;
+  surf->usage = ISL_SURF_USAGE_CCS_BIT;
+
+  if (brw->gen >= 9) {
+ assert(mt->tiling == I915_TILING_Y);
+ switch (_mesa_get_format_bytes(mt->format)) {
+ case 4:  surf->format = ISL_FORMAT_GEN9_CCS_32BPP;   break;
+ case 8:  surf->format = ISL_FORMAT_GEN9_CCS_64BPP;   break;
+ case 16: surf->format = ISL_FORMAT_GEN9_CCS_128BPP;  break;
+ default:
+unreachable("Invalid format size for color compression");
+ }
+  } else if (mt->tiling == I915_TILING_Y) {
+ switch (_mesa_get_format_bytes(mt->format)) {
+ case 4:  surf->format = ISL_FORMAT_GEN7_CCS_32BPP_Y;break;
+ case 8:  surf->format = ISL_FORMAT_GEN7_CCS_64BPP_Y;break;
+ case 16: surf->format = ISL_FORMAT_GEN7_CCS_128BPP_Y;   break;
+ default:
+unreachable("Invalid format size for color compression");
+ }
+  } else {
+ assert(mt->tiling == I915_TILING_X);
+ switch (_mesa_get_format_bytes(mt->format)) {
+ case 4:  surf->format = ISL_FORMAT_GEN7_CCS_32BPP_X;break;
+ case 8:  surf->format = ISL_FORMAT_GEN7_CCS_64BPP_X;break;
+ case 16: surf->format = ISL_FORMAT_GEN7_CCS_128BPP_X;   break;
+ default:
+unreachable("Invalid format size for color compression");
+ }
+  }
+  break;
+   }
+
+   /* Auxiliary surfaces in ISL have compressed formats so array_pitch_el_rows
+* is in elements.  This doesn't match intel_mipmap_tree::qpitch which is
+* in elements of the primary color surface so we have to divide by the
+* compression block height.
+*/
+   surf->array_pitch_el_
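
The if/else chain at the top of intel_miptree_get_aux_isl_surf decides which kind of auxiliary surface a miptree has. A standalone sketch of that decision order is given here, with the miptree reduced to a handful of booleans. The types `aux_usage` and `tex_desc` and the function `pick_aux_usage` are hypothetical stand-ins, not Mesa or ISL names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the ISL enum and the miptree fields. */
enum aux_usage { AUX_NONE, AUX_HIZ, AUX_MCS, AUX_CCS_D, AUX_CCS_E };

struct tex_desc {
   bool is_depth;            /* base format is a depth component */
   unsigned num_samples;
   bool msaa_compressed;     /* multisample layout is the compressed one */
   bool lossless_compressed;
   bool fast_clear_capable;  /* fast clear state is not "no MCS" */
};

/* Mirrors the order of the checks in the patch: depth first, then
 * multisampling, then lossless compression, then plain fast clears. */
static enum aux_usage
pick_aux_usage(const struct tex_desc *t, unsigned gen)
{
   if (t->is_depth)
      return AUX_HIZ;
   if (t->num_samples > 1)
      return t->msaa_compressed ? AUX_MCS : AUX_NONE;
   if (t->lossless_compressed) {
      assert(gen >= 9);   /* the patch asserts this as well */
      return AUX_CCS_E;
   }
   if (t->fast_clear_capable)
      return AUX_CCS_D;
   return AUX_NONE;
}
```

The order matters: a multisampled surface never reaches the CCS branches, which is why the MSAA check comes before the compression checks.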

[Mesa-dev] [PATCH v4 09/34] i965: Add an isl_device to the brw_context

2016-07-13 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_context.c | 2 ++
 src/mesa/drivers/dri/i965/brw_context.h | 4 
 2 files changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 3f0c2e3..cb74200 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -896,6 +896,8 @@ brwCreateContext(gl_api api,
brw->must_use_separate_stencil = devinfo->must_use_separate_stencil;
brw->has_swizzling = screen->hw_has_swizzling;
 
+   isl_device_init(&brw->isl_dev, devinfo, screen->hw_has_swizzling);
+
brw->vs.base.stage = MESA_SHADER_VERTEX;
brw->tcs.base.stage = MESA_SHADER_TESS_CTRL;
brw->tes.base.stage = MESA_SHADER_TESS_EVAL;
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 6353ca1..c0cdd7d 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -40,6 +40,8 @@
 #include "brw_compiler.h"
 #include "intel_aub.h"
 
+#include "isl/isl.h"
+
 #ifdef __cplusplus
 extern "C" {
/* Evil hack for using libdrm in a c++ compiler. */
@@ -914,6 +916,8 @@ struct brw_context
 */
bool needs_unlit_centroid_workaround;
 
+   struct isl_device isl_dev;
+
GLuint NewGLState;
struct {
   struct brw_state_flags pipelines[BRW_NUM_PIPELINES];
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v4 03/34] genxml: Add enough XML for gens 4, 4.5, and 5 to get SURFACE_STATE

2016-07-13 Thread Jason Ekstrand
Acked-by: Chad Versace 
---
 src/intel/genxml/Android.mk   | 15 +++
 src/intel/genxml/Makefile.am  |  3 +++
 src/intel/genxml/Makefile.sources |  3 +++
 src/intel/genxml/gen4.xml | 52 
 src/intel/genxml/gen45.xml| 56 +++
 src/intel/genxml/gen5.xml | 56 +++
 6 files changed, 185 insertions(+)
 create mode 100644 src/intel/genxml/gen4.xml
 create mode 100644 src/intel/genxml/gen45.xml
 create mode 100644 src/intel/genxml/gen5.xml

diff --git a/src/intel/genxml/Android.mk b/src/intel/genxml/Android.mk
index e5b7597..39b5e2c 100644
--- a/src/intel/genxml/Android.mk
+++ b/src/intel/genxml/Android.mk
@@ -49,6 +49,21 @@ define header-gen
$(hide) $(PRIVATE_SCRIPT) $(PRIVATE_XML) > $@
 endef
 
+$(intermediates)/genxml/gen4_pack.h: PRIVATE_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/gen_pack_header.py
+$(intermediates)/genxml/gen4_pack.h: PRIVATE_XML := $(LOCAL_PATH)/gen4.xml
+$(intermediates)/genxml/gen4_pack.h: $(LOCAL_PATH)/gen4.xml $(LOCAL_PATH)/gen_pack_header.py
+   $(call header-gen)
+
+$(intermediates)/genxml/gen45_pack.h: PRIVATE_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/gen_pack_header.py
+$(intermediates)/genxml/gen45_pack.h: PRIVATE_XML := $(LOCAL_PATH)/gen45.xml
+$(intermediates)/genxml/gen45_pack.h: $(LOCAL_PATH)/gen45.xml $(LOCAL_PATH)/gen_pack_header.py
+   $(call header-gen)
+
+$(intermediates)/genxml/gen5_pack.h: PRIVATE_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/gen_pack_header.py
+$(intermediates)/genxml/gen5_pack.h: PRIVATE_XML := $(LOCAL_PATH)/gen5.xml
+$(intermediates)/genxml/gen5_pack.h: $(LOCAL_PATH)/gen5.xml $(LOCAL_PATH)/gen_pack_header.py
+   $(call header-gen)
+
 $(intermediates)/genxml/gen6_pack.h: PRIVATE_SCRIPT := $(MESA_PYTHON2) $(LOCAL_PATH)/gen_pack_header.py
 $(intermediates)/genxml/gen6_pack.h: PRIVATE_XML := $(LOCAL_PATH)/gen6.xml
 $(intermediates)/genxml/gen6_pack.h: $(LOCAL_PATH)/gen6.xml $(LOCAL_PATH)/gen_pack_header.py
diff --git a/src/intel/genxml/Makefile.am b/src/intel/genxml/Makefile.am
index d6c1c5b..95c1ff9 100644
--- a/src/intel/genxml/Makefile.am
+++ b/src/intel/genxml/Makefile.am
@@ -35,6 +35,9 @@ $(BUILT_SOURCES): gen_pack_header.py
 CLEANFILES = $(BUILT_SOURCES)
 
 EXTRA_DIST = \
+   gen4.xml \
+   gen45.xml \
+   gen5.xml \
gen6.xml \
gen7.xml \
gen75.xml \
diff --git a/src/intel/genxml/Makefile.sources b/src/intel/genxml/Makefile.sources
index 9298b4a..86c0bbe 100644
--- a/src/intel/genxml/Makefile.sources
+++ b/src/intel/genxml/Makefile.sources
@@ -1,4 +1,7 @@
 GENXML_GENERATED_FILES = \
+   gen4_pack.h \
+   gen45_pack.h \
+   gen5_pack.h \
gen6_pack.h \
gen7_pack.h \
gen75_pack.h \
diff --git a/src/intel/genxml/gen4.xml b/src/intel/genxml/gen4.xml
new file mode 100644
index 0000000..1f89b1d
--- /dev/null
+++ b/src/intel/genxml/gen4.xml
@@ -0,0 +1,52 @@
+[52 added lines of XML markup for gen4 SURFACE_STATE; the markup was not preserved in this archive copy]
diff --git a/src/intel/genxml/gen45.xml b/src/intel/genxml/gen45.xml
new file mode 100644
index 0000000..973b3bb
--- /dev/null
+++ b/src/intel/genxml/gen45.xml
@@ -0,0 +1,56 @@
+[56 added lines of XML markup for gen4.5 SURFACE_STATE; the markup was not preserved in this archive copy]
diff --git a/src/intel/genxml/gen5.xml b/src/intel/genxml/gen5.xml
new file mode 100644
index 0000000..37e1ac4
--- /dev/null
+++ b/src/intel/genxml/gen5.xml
@@ -0,0 +1,56 @@
+[56 added lines of XML markup for gen5 SURFACE_STATE; the markup was not preserved in this archive copy]
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v4 16/34] i965/blorp: Use the generic ISL path for texture surfaces on gen7

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/gen7_blorp.c | 96 ++
 1 file changed, 3 insertions(+), 93 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.c b/src/mesa/drivers/dri/i965/gen7_blorp.c
index 600684e..07b622d 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.c
@@ -138,96 +138,6 @@ gen7_blorp_emit_depth_stencil_state_pointers(struct brw_context *brw,
 }
 
 
-/* SURFACE_STATE for renderbuffer or texture surface (see
- * brw_update_renderbuffer_surface and brw_update_texture_surface)
- */
-static uint32_t
-gen7_blorp_emit_surface_state(struct brw_context *brw,
-  const struct brw_blorp_surface_info *surface,
-  uint32_t read_domains, uint32_t write_domain,
-  bool is_render_target)
-{
-   uint32_t wm_surf_offset;
-   uint32_t width = surface->width;
-   uint32_t height = surface->height;
-   /* Note: since gen7 uses INTEL_MSAA_LAYOUT_CMS or INTEL_MSAA_LAYOUT_UMS for
-* color surfaces, width and height are measured in pixels; we don't need
-* to divide them by 2 as we do for Gen6 (see
-* gen6_blorp_emit_surface_state).
-*/
-   struct intel_mipmap_tree *mt = surface->mt;
-   uint32_t tile_x, tile_y;
-   const uint8_t mocs = GEN7_MOCS_L3;
-
-   uint32_t tiling = surface->map_stencil_as_y_tiled
-  ? I915_TILING_Y : mt->tiling;
-
-   uint32_t *surf = (uint32_t *)
-  brw_state_batch(brw, AUB_TRACE_SURFACE_STATE, 8 * 4, 32, &wm_surf_offset);
-   memset(surf, 0, 8 * 4);
-
-   surf[0] = BRW_SURFACE_2D << BRW_SURFACE_TYPE_SHIFT |
- surface->brw_surfaceformat << BRW_SURFACE_FORMAT_SHIFT |
- gen7_surface_tiling_mode(tiling);
-
-   if (surface->mt->valign == 4)
-  surf[0] |= GEN7_SURFACE_VALIGN_4;
-   if (surface->mt->halign == 8)
-  surf[0] |= GEN7_SURFACE_HALIGN_8;
-
-   if (surface->array_layout == ALL_SLICES_AT_EACH_LOD)
-  surf[0] |= GEN7_SURFACE_ARYSPC_LOD0;
-   else
-  surf[0] |= GEN7_SURFACE_ARYSPC_FULL;
-
-   /* reloc */
-   surf[1] = brw_blorp_compute_tile_offsets(surface, &tile_x, &tile_y) +
- mt->bo->offset64;
-
-   /* Note that the low bits of these fields are missing, so
-* there's the possibility of getting in trouble.
-*/
-   assert(tile_x % 4 == 0);
-   assert(tile_y % 2 == 0);
-   surf[5] = SET_FIELD(tile_x / 4, BRW_SURFACE_X_OFFSET) |
- SET_FIELD(tile_y / 2, BRW_SURFACE_Y_OFFSET) |
- SET_FIELD(mocs, GEN7_SURFACE_MOCS);
-
-   surf[2] = SET_FIELD(width - 1, GEN7_SURFACE_WIDTH) |
- SET_FIELD(height - 1, GEN7_SURFACE_HEIGHT);
-
-   uint32_t pitch_bytes = mt->pitch;
-   if (surface->map_stencil_as_y_tiled)
-  pitch_bytes *= 2;
-   surf[3] = pitch_bytes - 1;
-
-   surf[4] = gen7_surface_msaa_bits(surface->num_samples, surface->msaa_layout);
-   if (surface->mt->mcs_mt) {
-  gen7_set_surface_mcs_info(brw, surf, wm_surf_offset, surface->mt->mcs_mt,
-is_render_target);
-   }
-
-   surf[7] = surface->mt->fast_clear_color_value;
-
-   if (brw->is_haswell) {
-  surf[7] |= (SET_FIELD(HSW_SCS_RED,   GEN7_SURFACE_SCS_R) |
-  SET_FIELD(HSW_SCS_GREEN, GEN7_SURFACE_SCS_G) |
-  SET_FIELD(HSW_SCS_BLUE,  GEN7_SURFACE_SCS_B) |
-  SET_FIELD(HSW_SCS_ALPHA, GEN7_SURFACE_SCS_A));
-   }
-
-   /* Emit relocation to surface contents */
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   wm_surf_offset + 4,
-   mt->bo,
-   surf[1] - mt->bo->offset64,
-   read_domains, write_domain);
-
-   gen7_check_surface_setup(surf, is_render_target);
-
-   return wm_surf_offset;
-}
-
 /* Hardware seems to try to fetch the constants even though the corresponding
  * stage gets disabled. Therefore make sure the settings for the constant
  * buffer are valid.
@@ -787,9 +697,9 @@ gen7_blorp_exec(struct brw_context *brw,
   true /* is_render_target */);
   if (params->src.mt) {
  wm_surf_offset_texture =
-gen7_blorp_emit_surface_state(brw, &params->src,
-  I915_GEM_DOMAIN_SAMPLER, 0,
-  false /* is_render_target */);
+brw_blorp_emit_surface_state(brw, &params->src,
+ I915_GEM_DOMAIN_SAMPLER, 0,
+ false /* is_render_target */);
   }
   wm_bind_bo_offset =
  gen6_blorp_emit_binding_table(brw,
-- 
2.5.0.400.gff86faf
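
The removed gen7 function encodes two details that are easy to miss: the pitch field stores the pitch minus one (doubled when a stencil buffer is mapped as Y-tiled), and the X/Y offset fields drop their low bits, which is why the function asserts on alignment first. A small sketch of just that arithmetic follows. The helper names `pitch_field` and `gen7_xy_offset_fields` and the struct `xy_offset_fields` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pitch field: stored as (pitch in bytes - 1). The pitch is doubled when
 * a stencil buffer is mapped as Y-tiled, as in the removed code. */
static uint32_t
pitch_field(uint32_t pitch_bytes, bool map_stencil_as_y_tiled)
{
   assert(pitch_bytes > 0);
   if (map_stencil_as_y_tiled)
      pitch_bytes *= 2;
   return pitch_bytes - 1;
}

struct xy_offset_fields {
   uint32_t x;
   uint32_t y;
};

/* X/Y offset fields on gen7: the hardware fields lack the low bits, so
 * the inputs must be multiples of 4 (x) and 2 (y) before dividing. */
static struct xy_offset_fields
gen7_xy_offset_fields(uint32_t tile_x, uint32_t tile_y)
{
   assert(tile_x % 4 == 0);
   assert(tile_y % 2 == 0);
   return (struct xy_offset_fields){ tile_x / 4, tile_y / 2 };
}
```

The gen8 function removed elsewhere in the series uses the same scheme but divides tile_y by 4 instead of 2, so a generic path has to pick the divisor per generation.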



[Mesa-dev] [PATCH v4 15/34] i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen7

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/gen7_blorp.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.c b/src/mesa/drivers/dri/i965/gen7_blorp.c
index 72ab082..600684e 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.c
@@ -781,10 +781,10 @@ gen7_blorp_exec(struct brw_context *brw,
 
   intel_miptree_used_for_rendering(params->dst.mt);
   wm_surf_offset_renderbuffer =
- gen7_blorp_emit_surface_state(brw, &params->dst,
-   I915_GEM_DOMAIN_RENDER,
-   I915_GEM_DOMAIN_RENDER,
-   true /* is_render_target */);
+ brw_blorp_emit_surface_state(brw, &params->dst,
+  I915_GEM_DOMAIN_RENDER,
+  I915_GEM_DOMAIN_RENDER,
+  true /* is_render_target */);
   if (params->src.mt) {
  wm_surf_offset_texture =
 gen7_blorp_emit_surface_state(brw, &params->src,
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH v4 14/34] i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen8-9

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/gen8_blorp.c | 99 ++
 1 file changed, 4 insertions(+), 95 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_blorp.c b/src/mesa/drivers/dri/i965/gen8_blorp.c
index f68aba5..498c9f6 100644
--- a/src/mesa/drivers/dri/i965/gen8_blorp.c
+++ b/src/mesa/drivers/dri/i965/gen8_blorp.c
@@ -33,97 +33,6 @@
 
 #include "brw_blorp.h"
 
-
-/* SURFACE_STATE for renderbuffer or texture surface (see
- * brw_update_renderbuffer_surface and brw_update_texture_surface)
- */
-static uint32_t
-gen8_blorp_emit_surface_state(struct brw_context *brw,
-  const struct brw_blorp_surface_info *surface,
-  uint32_t read_domains, uint32_t write_domain,
-  bool is_render_target)
-{
-   uint32_t wm_surf_offset;
-   const struct intel_mipmap_tree *mt = surface->mt;
-   const uint32_t mocs_wb = is_render_target ?
-   (brw->gen >= 9 ? SKL_MOCS_PTE : BDW_MOCS_PTE) :
-   (brw->gen >= 9 ? SKL_MOCS_WB : BDW_MOCS_WB);
-   const uint32_t tiling = surface->map_stencil_as_y_tiled
-  ? I915_TILING_Y : mt->tiling;
-   uint32_t tile_x, tile_y;
-
-   uint32_t *surf = gen8_allocate_surface_state(brw, &wm_surf_offset, -1);
-
-   surf[0] = BRW_SURFACE_2D << BRW_SURFACE_TYPE_SHIFT |
- surface->brw_surfaceformat << BRW_SURFACE_FORMAT_SHIFT |
- gen8_vertical_alignment(brw, mt, BRW_SURFACE_2D) |
- gen8_horizontal_alignment(brw, mt, BRW_SURFACE_2D) |
- gen8_surface_tiling_mode(tiling);
-
-   surf[1] = SET_FIELD(mocs_wb, GEN8_SURFACE_MOCS) | mt->qpitch >> 2;
-
-   surf[2] = SET_FIELD(surface->width - 1, GEN7_SURFACE_WIDTH) |
- SET_FIELD(surface->height - 1, GEN7_SURFACE_HEIGHT);
-
-   uint32_t pitch_bytes = mt->pitch;
-   if (surface->map_stencil_as_y_tiled)
-  pitch_bytes *= 2;
-   surf[3] = pitch_bytes - 1;
-
-   surf[4] = gen7_surface_msaa_bits(surface->num_samples,
-surface->msaa_layout);
-
-   if (surface->mt->mcs_mt) {
-  surf[6] = SET_FIELD(surface->mt->qpitch / 4, GEN8_SURFACE_AUX_QPITCH) |
-SET_FIELD((surface->mt->mcs_mt->pitch / 128) - 1,
-  GEN8_SURFACE_AUX_PITCH) |
-gen8_get_aux_mode(brw, mt);
-   } else {
-  surf[6] = 0;
-   }
-
-   gen8_emit_fast_clear_color(brw, mt, surf);
-   surf[7] |= SET_FIELD(HSW_SCS_RED,   GEN7_SURFACE_SCS_R) |
-  SET_FIELD(HSW_SCS_GREEN, GEN7_SURFACE_SCS_G) |
-  SET_FIELD(HSW_SCS_BLUE,  GEN7_SURFACE_SCS_B) |
-  SET_FIELD(HSW_SCS_ALPHA, GEN7_SURFACE_SCS_A);
-
-/* reloc */
-   *((uint64_t *)&surf[8]) =
-  brw_blorp_compute_tile_offsets(surface, &tile_x, &tile_y) +
-  mt->bo->offset64;
-
-   /* Note that the low bits of these fields are missing, so there's the
-* possibility of getting in trouble.
-*/
-   assert(tile_x % 4 == 0);
-   assert(tile_y % 4 == 0);
-   surf[5] = SET_FIELD(tile_x / 4, BRW_SURFACE_X_OFFSET) |
- SET_FIELD(tile_y / 4, GEN8_SURFACE_Y_OFFSET);
-
-   if (brw->gen >= 9) {
-  /* Disable Mip Tail by setting a large value. */
-  surf[5] |= SET_FIELD(15, GEN9_SURFACE_MIP_TAIL_START_LOD);
-   }
-
-   if (surface->mt->mcs_mt) {
-  *((uint64_t *) &surf[10]) = surface->mt->mcs_mt->bo->offset64;
-  drm_intel_bo_emit_reloc(brw->batch.bo,
-  wm_surf_offset + 10 * 4,
-  surface->mt->mcs_mt->bo, 0,
-  read_domains, write_domain);
-   }
-
-   /* Emit relocation to surface contents */
-   drm_intel_bo_emit_reloc(brw->batch.bo,
-   wm_surf_offset + 8 * 4,
-   mt->bo,
-   surf[8] - mt->bo->offset64,
-   read_domains, write_domain);
-
-   return wm_surf_offset;
-}
-
 static uint32_t
 gen8_blorp_emit_blend_state(struct brw_context *brw,
 const struct brw_blorp_params *params)
@@ -576,10 +485,10 @@ gen8_blorp_emit_surface_states(struct brw_context *brw,
intel_miptree_used_for_rendering(params->dst.mt);
 
wm_surf_offset_renderbuffer =
-  gen8_blorp_emit_surface_state(brw, &params->dst,
-I915_GEM_DOMAIN_RENDER,
-I915_GEM_DOMAIN_RENDER,
-true /* is_render_target */);
+  brw_blorp_emit_surface_state(brw, &params->dst,
+   I915_GEM_DOMAIN_RENDER,
+   I915_GEM_DOMAIN_RENDER,
+   true /* is_render_target */);
if (params->src.mt) {
   const struct brw_blorp_surface_info *surface = &params->src;
   struct intel_mipmap_tree *mt = surface->mt;
-- 
2.5.0.400.gff86faf
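
The removed gen8 function picks its MOCS value with a nested ternary on the generation and on whether the surface is a render target. The same choice is written out below as a plain function, which may be easier to compare against the tex_mocs and rb_mocs columns of a per-generation table. The enum values are placeholders: the real SKL_ and BDW_ constants are numeric values defined by the driver and are not reproduced here, and `pick_mocs` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder identifiers standing in for the driver's MOCS constants. */
enum mocs_choice { MOCS_BDW_WB, MOCS_BDW_PTE, MOCS_SKL_WB, MOCS_SKL_PTE };

/* Render targets get the PTE setting, textures get write-back; gen9 and
 * later use the SKL constants, gen8 uses the BDW ones. */
static enum mocs_choice
pick_mocs(unsigned gen, bool is_render_target)
{
   assert(gen >= 8);
   if (is_render_target)
      return gen >= 9 ? MOCS_SKL_PTE : MOCS_BDW_PTE;
   return gen >= 9 ? MOCS_SKL_WB : MOCS_BDW_WB;
}
```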


[Mesa-dev] [PATCH v4 11/34] i965/miptree: Add a helper for getting the ISL clear color from a miptree

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 24 
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  4 
 2 files changed, 28 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 8519f46..7d3cec2 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -3170,3 +3170,27 @@ intel_miptree_get_isl_surf(struct brw_context *brw,
 
surf->usage = 0; /* TODO */
 }
+
+union isl_color_value
+intel_miptree_get_isl_clear_color(struct brw_context *brw,
+  const struct intel_mipmap_tree *mt)
+{
+   union isl_color_value clear_color;
+   if (brw->gen >= 9) {
+  clear_color.i32[0] = mt->gen9_fast_clear_color.i[0];
+  clear_color.i32[1] = mt->gen9_fast_clear_color.i[1];
+  clear_color.i32[2] = mt->gen9_fast_clear_color.i[2];
+  clear_color.i32[3] = mt->gen9_fast_clear_color.i[3];
+   } else if (_mesa_is_format_integer(mt->format)) {
+  clear_color.i32[0] = (mt->fast_clear_color_value & (1u << 31)) != 0;
+  clear_color.i32[1] = (mt->fast_clear_color_value & (1u << 30)) != 0;
+  clear_color.i32[2] = (mt->fast_clear_color_value & (1u << 29)) != 0;
+  clear_color.i32[3] = (mt->fast_clear_color_value & (1u << 28)) != 0;
+   } else {
+  clear_color.f32[0] = (mt->fast_clear_color_value & (1u << 31)) != 0;
+  clear_color.f32[1] = (mt->fast_clear_color_value & (1u << 30)) != 0;
+  clear_color.f32[2] = (mt->fast_clear_color_value & (1u << 29)) != 0;
+  clear_color.f32[3] = (mt->fast_clear_color_value & (1u << 28)) != 0;
+   }
+   return clear_color;
+}
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index cf5d1a6..a50f181 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -802,6 +802,10 @@ intel_miptree_get_isl_surf(struct brw_context *brw,
const struct intel_mipmap_tree *mt,
struct isl_surf *surf);
 
+union isl_color_value
+intel_miptree_get_isl_clear_color(struct brw_context *brw,
+  const struct intel_mipmap_tree *mt);
+
 void
 intel_get_image_dims(struct gl_texture_image *image,
  int *width, int *height, int *depth);
-- 
2.5.0.400.gff86faf
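
Before gen9 the helper reads the clear color from four bits packed at the top of a 32-bit word: bit 31 maps to channel 0, down to bit 28 for channel 3. A minimal sketch of that decoding for the float case, written as a loop instead of four statements, is shown here. The function name `decode_clear_bits` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a packed fast-clear value: bit (31 - i) says whether channel i
 * is 1.0 (bit set) or 0.0 (bit clear). */
static void
decode_clear_bits(uint32_t packed, float rgba[4])
{
   for (int i = 0; i < 4; i++)
      rgba[i] = (packed & (1u << (31 - i))) != 0 ? 1.0f : 0.0f;
}
```

For example, a packed value of 0xA0000000 has bits 31 and 29 set, so channels 0 and 2 decode to 1.0 and channels 1 and 3 to 0.0.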



[Mesa-dev] [PATCH v4 22/34] i965/surface_state: Rename brw_update to gen4_update

2016-07-13 Thread Jason Ekstrand
We're about to add generic versions that work across gens, and those should
have the brw name.

Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 5873ea5..82a8537 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -367,11 +367,11 @@ brw_update_buffer_texture_surface(struct gl_context *ctx,
 }
 
 static void
-brw_update_texture_surface(struct gl_context *ctx,
-   unsigned unit,
-   uint32_t *surf_offset,
-   bool for_gather,
-   uint32_t plane)
+gen4_update_texture_surface(struct gl_context *ctx,
+unsigned unit,
+uint32_t *surf_offset,
+bool for_gather,
+uint32_t plane)
 {
struct brw_context *brw = brw_context(ctx);
struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
@@ -715,10 +715,10 @@ brw_emit_null_surface_state(struct brw_context *brw,
  * usable for further buffers when doing ARB_draw_buffer support.
  */
 static uint32_t
-brw_update_renderbuffer_surface(struct brw_context *brw,
-struct gl_renderbuffer *rb,
-bool layered, unsigned unit,
-uint32_t surf_index)
+gen4_update_renderbuffer_surface(struct brw_context *brw,
+ struct gl_renderbuffer *rb,
+ bool layered, unsigned unit,
+ uint32_t surf_index)
 {
struct gl_context *ctx = &brw->ctx;
struct intel_renderbuffer *irb = intel_renderbuffer(rb);
@@ -1494,8 +1494,8 @@ const struct brw_tracked_state brw_wm_image_surfaces = {
 void
 gen4_init_vtable_surface_functions(struct brw_context *brw)
 {
-   brw->vtbl.update_texture_surface = brw_update_texture_surface;
-   brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
+   brw->vtbl.update_texture_surface = gen4_update_texture_surface;
+   brw->vtbl.update_renderbuffer_surface = gen4_update_renderbuffer_surface;
brw->vtbl.emit_null_surface_state = brw_emit_null_surface_state;
brw->vtbl.emit_buffer_surface_state = gen4_emit_buffer_surface_state;
 }
-- 
2.5.0.400.gff86faf
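
The rename keeps the gen4-specific functions behind the vtable while freeing the brw_ prefix for generic code. One possible shape of that arrangement is sketched below with a cut-down context. How the generic versions are actually wired up is not shown in this message, so the `brw_update_texture_surface` dispatcher here is purely an assumption, as are the `ctx` and `surface_vtbl` types and the marker return value.

```c
#include <assert.h>
#include <stddef.h>

struct ctx;

/* Hypothetical, cut-down vtable in the style of brw->vtbl. */
struct surface_vtbl {
   int (*update_texture_surface)(struct ctx *c, unsigned unit);
};

struct ctx {
   int gen;
   struct surface_vtbl vtbl;
};

/* Gen-specific implementation carries the gen4_ prefix. */
static int
gen4_update_texture_surface(struct ctx *c, unsigned unit)
{
   (void)c;
   return (int)unit + 4000;   /* arbitrary marker value for the example */
}

static void
gen4_init_vtable_surface_functions(struct ctx *c)
{
   c->vtbl.update_texture_surface = gen4_update_texture_surface;
}

/* The brw_ name is now free for an entry point that works on any gen;
 * here it simply dispatches through the vtable. */
static int
brw_update_texture_surface(struct ctx *c, unsigned unit)
{
   assert(c->vtbl.update_texture_surface != NULL);
   return c->vtbl.update_texture_surface(c, unit);
}
```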



[Mesa-dev] [PATCH v4 02/34] isl/state: Divide the aux qpitch by 4

2016-07-13 Thread Jason Ekstrand
The field is in multiples of 4, like regular QPitch, so the pitch in rows is
shifted right by 2 (that is, divided by 4).
---
 src/intel/isl/isl_surface_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_surface_state.c b/src/intel/isl/isl_surface_state.c
index fc7e1ba..ed964bd 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -385,7 +385,7 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, void *state,
* in units of samples on the main surface.
*/
   s.AuxiliarySurfaceQPitch =
- isl_surf_get_array_pitch_sa_rows(info->aux_surf);
+ isl_surf_get_array_pitch_sa_rows(info->aux_surf) >> 2;
   s.AuxiliarySurfaceBaseAddress = info->aux_address;
   s.AuxiliarySurfaceMode = isl_to_gen_aux_mode[info->aux_usage];
 #else
-- 
2.5.0.400.gff86faf
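
The one-line fix relies on the field counting in units of 4 rows, so the raw row pitch has to be shifted right by 2 before it is stored. A tiny sketch of that encoding follows. The function name `qpitch_field` is hypothetical, and the alignment assert is an added assumption that the patch itself does not check.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a row pitch for a hardware field expressed in multiples of 4. */
static uint32_t
qpitch_field(uint32_t array_pitch_rows)
{
   /* Assumption for this sketch: the pitch is already 4-row aligned, so
    * no information is lost by the shift. */
   assert(array_pitch_rows % 4 == 0);
   return array_pitch_rows >> 2;   /* same as dividing by 4 */
}
```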



[Mesa-dev] [PATCH v4 07/34] isl: Add support for filling out surface states all the way back to gen4

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Chad Versace 
---
 src/intel/isl/Android.mk  | 60 +++
 src/intel/isl/Makefile.am | 12 
 src/intel/isl/Makefile.sources| 13 +++--
 src/intel/isl/isl.c   | 22 ++
 src/intel/isl/isl_priv.h  | 24 
 src/intel/isl/isl_surface_state.c | 56 ++--
 6 files changed, 182 insertions(+), 5 deletions(-)

diff --git a/src/intel/isl/Android.mk b/src/intel/isl/Android.mk
index c828c5c..e2771ad 100644
--- a/src/intel/isl/Android.mk
+++ b/src/intel/isl/Android.mk
@@ -30,6 +30,63 @@ LIBISL_GENX_COMMON_INCLUDES := \
$(MESA_TOP)/src/mesa/drivers/dri/i965
 
 # ---
+# Build libisl_gen4
+# ---
+
+include $(CLEAR_VARS)
+
+LOCAL_MODULE := libmesa_isl_gen4
+
+LOCAL_SRC_FILES := $(ISL_GEN4_FILES)
+
+LOCAL_CFLAGS := -DGEN_VERSIONx10=40
+
+LOCAL_C_INCLUDES := $(LIBISL_GENX_COMMON_INCLUDES)
+
+LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_genxml
+
+include $(MESA_COMMON_MK)
+include $(BUILD_STATIC_LIBRARY)
+
+# ---
+# Build libisl_gen5
+# ---
+
+include $(CLEAR_VARS)
+
+LOCAL_MODULE := libmesa_isl_gen5
+
+LOCAL_SRC_FILES := $(ISL_GEN5_FILES)
+
+LOCAL_CFLAGS := -DGEN_VERSIONx10=50
+
+LOCAL_C_INCLUDES := $(LIBISL_GENX_COMMON_INCLUDES)
+
+LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_genxml
+
+include $(MESA_COMMON_MK)
+include $(BUILD_STATIC_LIBRARY)
+
+# ---
+# Build libisl_gen6
+# ---
+
+include $(CLEAR_VARS)
+
+LOCAL_MODULE := libmesa_isl_gen6
+
+LOCAL_SRC_FILES := $(ISL_GEN6_FILES)
+
+LOCAL_CFLAGS := -DGEN_VERSIONx10=60
+
+LOCAL_C_INCLUDES := $(LIBISL_GENX_COMMON_INCLUDES)
+
+LOCAL_WHOLE_STATIC_LIBRARIES := libmesa_genxml
+
+include $(MESA_COMMON_MK)
+include $(BUILD_STATIC_LIBRARY)
+
+# ---
 # Build libisl_gen7
 # ---
 
@@ -125,6 +182,9 @@ LOCAL_C_INCLUDES := \
 LOCAL_EXPORT_C_INCLUDE_DIRS := $(MESA_TOP)/src/intel
 
 LOCAL_WHOLE_STATIC_LIBRARIES := \
+   libmesa_isl_gen4 \
+   libmesa_isl_gen5 \
+   libmesa_isl_gen6 \
libmesa_isl_gen7 \
libmesa_isl_gen75 \
libmesa_isl_gen8 \
diff --git a/src/intel/isl/Makefile.am b/src/intel/isl/Makefile.am
index 1fd6683..7c22324 100644
--- a/src/intel/isl/Makefile.am
+++ b/src/intel/isl/Makefile.am
@@ -22,6 +22,9 @@
 include Makefile.sources
 
 ISL_GEN_LIBS =   \
+   libisl-gen4.la   \
+   libisl-gen5.la   \
+   libisl-gen6.la   \
libisl-gen7.la   \
libisl-gen75.la  \
libisl-gen8.la   \
@@ -52,6 +55,15 @@ libisl_la_LIBADD = $(ISL_GEN_LIBS)
 
 libisl_la_SOURCES = $(ISL_FILES) $(ISL_GENERATED_FILES)
 
+libisl_gen4_la_SOURCES = $(ISL_GEN4_FILES)
+libisl_gen4_la_CFLAGS = $(libisl_la_CFLAGS) -DGEN_VERSIONx10=40
+
+libisl_gen5_la_SOURCES = $(ISL_GEN5_FILES)
+libisl_gen5_la_CFLAGS = $(libisl_la_CFLAGS) -DGEN_VERSIONx10=50
+
+libisl_gen6_la_SOURCES = $(ISL_GEN6_FILES)
+libisl_gen6_la_CFLAGS = $(libisl_la_CFLAGS) -DGEN_VERSIONx10=60
+
 libisl_gen7_la_SOURCES = $(ISL_GEN7_FILES)
 libisl_gen7_la_CFLAGS = $(libisl_la_CFLAGS) -DGEN_VERSIONx10=70
 
diff --git a/src/intel/isl/Makefile.sources b/src/intel/isl/Makefile.sources
index 89b1418..aa20ed4 100644
--- a/src/intel/isl/Makefile.sources
+++ b/src/intel/isl/Makefile.sources
@@ -2,12 +2,21 @@ ISL_FILES = \
isl.c \
isl.h \
isl_format.c \
+   isl_priv.h \
+   isl_storage_image.c
+
+ISL_GEN4_FILES = \
isl_gen4.c \
isl_gen4.h \
+   isl_surface_state.c
+
+ISL_GEN5_FILES = \
+   isl_surface_state.c
+
+ISL_GEN6_FILES = \
isl_gen6.c \
isl_gen6.h \
-   isl_priv.h \
-   isl_storage_image.c
+   isl_surface_state.c
 
 ISL_GEN7_FILES = \
isl_gen7.c \
diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 363c5d5..9e01b8d 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1279,6 +1279,20 @@ isl_surf_fill_state_s(const struct isl_device *dev, void *state,
}
 
switch (ISL_DEV_GEN(dev)) {
+   case 4:
+  if (ISL_DEV_IS_G4X(dev)) {
+ /* G45 surface state is the same as gen5 */
+ isl_gen5_surf_fill_state_s(dev, state, info);
+  } else {
+ isl_gen4_surf_fill_state_s(dev, state, info);
+  }
+  break;
+   case 5:
+  isl_gen5_surf_fill_state_s(dev, state, info);
+  break;
+   case 6:
+  isl_gen6_surf_fill_state_s(dev, state, info);
+  break;
case 7:
   if (ISL_DEV_IS_HASWELL(dev)) {
  isl_gen75_surf_fill_state_s(dev, state, info);
@@ -1302,6 +1316,14 @@ isl_buffer_f

[Mesa-dev] [PATCH v4 06/34] isl: Add an ISL_DEV_IS_G4X macro

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Chad Versace 
---
 src/intel/isl/isl.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index b5884be..3064bd8 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -65,6 +65,10 @@ struct brw_image_param;
(assert(ISL_DEV_GEN(__dev) == (__dev)->info->gen))
 #endif
 
+#ifndef ISL_DEV_IS_G4X
+#define ISL_DEV_IS_G4X(__dev) ((__dev)->info->is_g4x)
+#endif
+
 #ifndef ISL_DEV_IS_HASWELL
 /**
  * @brief Get the hardware generation of isl_device.
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v4 04/34] genxml: Make X/Y Offset field of SURFACE_STATE a uint

2016-07-13 Thread Jason Ekstrand
The offset type has special implications: it is intended to be some form
of aligned memory address.  These assumptions allow it to handle the case
where there is some alignment requirement on the offset and the bottom bits
are used for other things.  However, the offsets in the surface state field
are really just unsigned integers.

Reviewed-by: Chad Versace 
---
 src/intel/genxml/gen45.xml | 4 ++--
 src/intel/genxml/gen5.xml  | 4 ++--
 src/intel/genxml/gen6.xml  | 4 ++--
 src/intel/genxml/gen7.xml  | 4 ++--
 src/intel/genxml/gen75.xml | 4 ++--
 src/intel/genxml/gen8.xml  | 4 ++--
 src/intel/genxml/gen9.xml  | 4 ++--
 7 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/intel/genxml/gen45.xml b/src/intel/genxml/gen45.xml
index 973b3bb..ae483b7 100644
--- a/src/intel/genxml/gen45.xml
+++ b/src/intel/genxml/gen45.xml
@@ -50,7 +50,7 @@
 
 
 
-
-
+
+
   
 
diff --git a/src/intel/genxml/gen5.xml b/src/intel/genxml/gen5.xml
index 37e1ac4..cb6a7b6 100644
--- a/src/intel/genxml/gen5.xml
+++ b/src/intel/genxml/gen5.xml
@@ -50,7 +50,7 @@
 
 
 
-
-
+
+
   
 
diff --git a/src/intel/genxml/gen6.xml b/src/intel/genxml/gen6.xml
index 44e2804..595492f 100644
--- a/src/intel/genxml/gen6.xml
+++ b/src/intel/genxml/gen6.xml
@@ -355,12 +355,12 @@
   
 
 
-
+
 
   
   
 
-
+
 
 
   
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index 2bbfcb7..66f4f94 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -388,8 +388,8 @@
 
 
 
-
-
+
+
 
 
 
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 9ab432c..841573a 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -399,8 +399,8 @@
 
 
 
-
-
+
+
 
 
 
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 80d40fb..97af191 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -317,8 +317,8 @@
   
 
 
-
-
+
+
 
 
   
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 94b7d28..5e3e2e1 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -324,8 +324,8 @@
   
 
 
-
-
+
+
 
 
   
-- 
2.5.0.400.gff86faf
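
The distinction drawn in the commit message can be sketched in plain C. This is only an illustration, assuming (as the message implies) that an offset-style field keeps an aligned value in place while a uint field is shifted into its bit range. `pack_uint` and `pack_offset` are hypothetical helpers, not the generated genxml pack functions, and the bit positions used with them are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: place an unsigned integer into bits [start, end]. */
static uint64_t pack_uint(uint64_t v, unsigned start, unsigned end)
{
   const unsigned width = end - start + 1;
   /* The value must fit in the field. */
   assert(width == 64 || v < (UINT64_C(1) << width));
   return v << start;
}

/* Hypothetical helper: an offset-style field stores an aligned address
 * in place. Bits outside [start, end] must already be zero, which leaves
 * the low bits free for other fields.
 */
static uint64_t pack_offset(uint64_t v, unsigned start, unsigned end)
{
   const unsigned width = end - start + 1;
   const uint64_t mask =
      (width == 64 ? ~UINT64_C(0) : (UINT64_C(1) << width) - 1) << start;
   assert((v & ~mask) == 0);
   return v;
}
```

With these helpers, `pack_uint(3, 25, 31)` shifts the value into place, whereas `pack_offset(0x1000, 12, 31)` returns the address unchanged. A small X/Y offset is a plain count, so it fits the first behaviour.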



[Mesa-dev] [PATCH v4 05/34] genxml: Add macros and #includes for gens 4-6

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Chad Versace 
---
 src/intel/genxml/genX_pack.h  | 10 +-
 src/intel/genxml/gen_macros.h | 15 ++-
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/src/intel/genxml/genX_pack.h b/src/intel/genxml/genX_pack.h
index 7967c29..0c25c4e 100644
--- a/src/intel/genxml/genX_pack.h
+++ b/src/intel/genxml/genX_pack.h
@@ -27,7 +27,15 @@
 #  error "The GEN_VERSIONx10 macro must be defined"
 #endif
 
-#if (GEN_VERSIONx10 == 70)
+#if (GEN_VERSIONx10 == 40)
+#  include "genxml/gen4_pack.h"
+#elif (GEN_VERSIONx10 == 45)
+#  include "genxml/gen45_pack.h"
+#elif (GEN_VERSIONx10 == 50)
+#  include "genxml/gen5_pack.h"
+#elif (GEN_VERSIONx10 == 60)
+#  include "genxml/gen6_pack.h"
+#elif (GEN_VERSIONx10 == 70)
 #  include "genxml/gen7_pack.h"
 #elif (GEN_VERSIONx10 == 75)
 #  include "genxml/gen75_pack.h"
diff --git a/src/intel/genxml/gen_macros.h b/src/intel/genxml/gen_macros.h
index 868bc22..1d591fa 100644
--- a/src/intel/genxml/gen_macros.h
+++ b/src/intel/genxml/gen_macros.h
@@ -57,9 +57,22 @@
 
 #define GEN_GEN ((GEN_VERSIONx10) / 10)
 #define GEN_IS_HASWELL ((GEN_VERSIONx10) == 75)
+#define GEN_IS_G4X ((GEN_VERSIONx10) == 45)
 
 /* Prefixing macros */
-#if (GEN_VERSIONx10 == 70)
+#if (GEN_VERSIONx10 == 40)
+#  define GENX(X) GEN4_##X
+#  define genX(x) gen4_##x
+#elif (GEN_VERSIONx10 == 45)
+#  define GENX(X) GEN45_##X
+#  define genX(x) gen45_##x
+#elif (GEN_VERSIONx10 == 50)
+#  define GENX(X) GEN5_##X
+#  define genX(x) gen5_##x
+#elif (GEN_VERSIONx10 == 60)
+#  define GENX(X) GEN6_##X
+#  define genX(x) gen6_##x
+#elif (GEN_VERSIONx10 == 70)
 #  define GENX(X) GEN7_##X
 #  define genX(x) gen7_##x
 #elif (GEN_VERSIONx10 == 75)
-- 
2.5.0.400.gff86faf
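
The prefixing scheme can be tried in isolation. A minimal sketch, assuming a build with -DGEN_VERSIONx10=60; the function name `example_gen` is hypothetical and only the macro shapes mirror the patch above.

```c
#include <assert.h>

/* Normally supplied by the build system, e.g. -DGEN_VERSIONx10=60. */
#ifndef GEN_VERSIONx10
#  define GEN_VERSIONx10 60
#endif

#define GEN_GEN ((GEN_VERSIONx10) / 10)
#define GEN_IS_G4X ((GEN_VERSIONx10) == 45)

/* Prefixing macro, reduced to the two cases this sketch handles. */
#if (GEN_VERSIONx10 == 45)
#  define genX(x) gen45_##x
#elif (GEN_VERSIONx10 == 60)
#  define genX(x) gen6_##x
#else
#  error "this sketch only handles GEN_VERSIONx10 == 45 or 60"
#endif

/* Hypothetical per-gen function. With GEN_VERSIONx10 == 60 the name
 * expands to gen6_example_gen.
 */
static int genX(example_gen)(void)
{
   return GEN_GEN;
}
```

Because GEN_GEN is an integer division, a GEN_VERSIONx10 of 45 yields 4, which is why a separate GEN_IS_G4X check is useful.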



[Mesa-dev] [PATCH v4 08/34] isl/state: Add support for OffsetX/Y in surface state

2016-07-13 Thread Jason Ekstrand
Reviewed-by: Chad Versace 
---
 src/intel/isl/isl.h   |  3 +++
 src/intel/isl/isl_surface_state.c | 28 
 2 files changed, 31 insertions(+)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index 3064bd8..4a9d90e 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -917,6 +917,9 @@ struct isl_surf_fill_state_info {
 * Valid values depend on hardware generation.
 */
union isl_color_value clear_color;
+
+   /* Intra-tile offset */
+   uint16_t x_offset_sa, y_offset_sa;
 };
 
 struct isl_buffer_fill_state_info {
diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index de56e4f..1c656c9 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -402,6 +402,34 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
s.MOCS = info->mocs;
 #endif
 
+#if GEN_GEN > 4 || GEN_IS_G4X
+   if (info->x_offset_sa != 0 || info->y_offset_sa != 0) {
+  /* There are fairly strict rules about when the offsets can be used.
+   * These are mostly taken from the Sky Lake PRM documentation for
+   * RENDER_SURFACE_STATE.
+   */
+  assert(info->surf->tiling != ISL_TILING_LINEAR);
+  assert(info->surf->dim == ISL_SURF_DIM_2D);
+  assert(isl_is_pow2(isl_format_get_layout(info->view->format)->bpb));
+  assert(info->surf->levels == 1);
+  assert(info->surf->logical_level0_px.array_len == 1);
+  assert(info->aux_usage == ISL_AUX_USAGE_NONE);
+#if GEN_GEN >= 7
+  s.SurfaceArray = false;
+#endif
+   }
+
+   const unsigned x_div = 4;
+   const unsigned y_div = GEN_GEN >= 8 ? 4 : 2;
+   assert(info->x_offset_sa % x_div == 0);
+   assert(info->y_offset_sa % y_div == 0);
+   s.XOffset = info->x_offset_sa / x_div;
+   s.YOffset = info->y_offset_sa / y_div;
+#else
+   assert(info->x_offset_sa == 0);
+   assert(info->y_offset_sa == 0);
+#endif
+
 #if GEN_GEN >= 7
if (info->aux_surf && info->aux_usage != ISL_AUX_USAGE_NONE) {
   struct isl_tile_info tile_info;
-- 
2.5.0.400.gff86faf
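
The offset encoding in the hunk above can be sketched as a standalone function. This is an illustration only: `encode_tile_offsets` and `struct tile_offset_fields` are hypothetical names, and the hardware generation is passed as a runtime parameter instead of the compile-time GEN_GEN.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two encoded fields. */
struct tile_offset_fields {
   uint32_t x_offset;
   uint32_t y_offset;
};

/* Convert intra-tile offsets in samples to field values, using the same
 * divisors as the patch: X in units of 4, Y in units of 4 on gen8+ and
 * units of 2 before that.
 */
static struct tile_offset_fields
encode_tile_offsets(unsigned gen, uint16_t x_offset_sa, uint16_t y_offset_sa)
{
   const unsigned x_div = 4;
   const unsigned y_div = gen >= 8 ? 4 : 2;

   /* Offsets that are not multiples of the divisor cannot be encoded. */
   assert(x_offset_sa % x_div == 0);
   assert(y_offset_sa % y_div == 0);

   struct tile_offset_fields f = {
      x_offset_sa / x_div,
      y_offset_sa / y_div,
   };
   return f;
}
```

For example, an offset of (8, 6) samples on gen7 encodes as (2, 3), while on gen8 a Y offset of 6 would trip the alignment assertion.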



[Mesa-dev] [PATCH v4 00/34] i965: Use ISL for emitting surface state

2016-07-13 Thread Jason Ekstrand
This is a resend of the latest version of this patch series, mostly for
Chad's benefit.  Hopefully we can finally get this reviewed and landed
this week.

The first two patches are because some untested bits of the isl patches I
pushed today didn't work as intended.  They're pretty trivial but needed
for the rest of the series.  The rest is a rebase of the series on top of
the latest isl changes.

Jason Ekstrand (34):
  isl: Fix the bs assertion in isl_tiling_get_info
  isl/state: Divide the aux qpitch by 2
  genxml: Add enough XML for gens 4, 4.5, and 5 to get SURFACE_STATE
  genxml: Make X/Y Offset field of SURFACE_STATE a uint
  genxml: Add macros and #includes for gens 4-6
  isl: Add an ISL_DEV_IS_G4X macro
  isl: Add support for filling out surface states all the way back to
gen4
  isl/state: Add support for OffsetX/Y in surface state
  i965: Add an isl_device to the brw_context
  i965/miptree: Add a helper for getting an isl_surf from a miptree
  i965/miptree: Add a helper for getting the ISL clear color from a
miptree
  i965/miptree: Add a helper for getting the aux isl_surf from a miptree
  i965/blorp: Add a generic ISL-based surface state emit path
  i965/blorp: Use the generic ISL path for renderbuffer surfaces on
gen8-9
  i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen7
  i965/blorp: Use the generic ISL path for texture surfaces on gen7
  i965/blorp: Use the generic ISL path for renderbuffer surfaces on gen6
  i965/blorp: Use the generic ISL path for texture surfaces on gen6
  i965/state: Add a helper for emitting a surface state using isl
  i965/blorp: Use a generic ISL path for texture surfaces on gen8
  i965/state: Use ISL for emitting image surfaces
  i965/surface_state: Rename brw_update to gen4_update
  i965/state: Add generic surface update functions based on ISL
  i965/gen8: Use the generic ISL-based path for texture surfaces
  i965/gen8: Use the generic ISL-based path for renderbuffer surfaces
  i965/gen7: Use the generic ISL-based path for texture surfaces
  i965/gen7: Use the generic ISL-based path for renderbuffer surfaces
  i965/gen6: Use the generic ISL-based path for renderbuffer surfaces
  i965/gen4-6: Use the generic ISL-based path for texture surfaces
  isl/formats: Mark RAW as having a block size of 1 byte
  i965/state: Account for the element size in emit_buffer_surface_state
  i965: Use ISL for emitting buffer surface states
  i965: Get rid of gen6_surface_state.c
  i965/context: Remove some unnecessary vfuncs

 src/intel/genxml/Android.mk   |  15 +
 src/intel/genxml/Makefile.am  |   3 +
 src/intel/genxml/Makefile.sources |   3 +
 src/intel/genxml/gen4.xml |  52 +++
 src/intel/genxml/gen45.xml|  56 +++
 src/intel/genxml/gen5.xml |  56 +++
 src/intel/genxml/gen6.xml |   4 +-
 src/intel/genxml/gen7.xml |   4 +-
 src/intel/genxml/gen75.xml|   4 +-
 src/intel/genxml/gen8.xml |   4 +-
 src/intel/genxml/gen9.xml |   4 +-
 src/intel/genxml/genX_pack.h  |  10 +-
 src/intel/genxml/gen_macros.h |  15 +-
 src/intel/isl/Android.mk  |  60 +++
 src/intel/isl/Makefile.am |  12 +
 src/intel/isl/Makefile.sources|  13 +-
 src/intel/isl/isl.c   |  29 +-
 src/intel/isl/isl.h   |   7 +
 src/intel/isl/isl_format_layout.csv   |   2 +-
 src/intel/isl/isl_priv.h  |  24 ++
 src/intel/isl/isl_surface_state.c |  86 +++-
 src/mesa/drivers/dri/i965/Makefile.sources|   1 -
 src/mesa/drivers/dri/i965/brw_binding_tables.c|   2 +-
 src/mesa/drivers/dri/i965/brw_blorp.c | 157 +++
 src/mesa/drivers/dri/i965/brw_blorp.h |   6 +
 src/mesa/drivers/dri/i965/brw_context.c   |   2 +
 src/mesa/drivers/dri/i965/brw_context.h   |  29 +-
 src/mesa/drivers/dri/i965/brw_state.h |  49 ++-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 501 ++---
 src/mesa/drivers/dri/i965/gen6_blorp.c|  82 +---
 src/mesa/drivers/dri/i965/gen6_surface_state.c| 146 ---
 src/mesa/drivers/dri/i965/gen7_blorp.c| 104 +
 src/mesa/drivers/dri/i965/gen7_cs_state.c |   2 +-
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 407 +
 src/mesa/drivers/dri/i965/gen8_blorp.c| 146 ++-
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 503 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 319 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  15 +
 38 files changed, 1370 insertions(+), 1564 deletions(-)
 create mode 100644 src/intel/genxml/gen4.xml
 create 

[Mesa-dev] [PATCH v4 01/34] isl: Fix the bs assertion in isl_tiling_get_info

2016-07-13 Thread Jason Ekstrand
---
 src/intel/isl/isl.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 48ff8ce..363c5d5 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -113,21 +113,23 @@ isl_tiling_get_info(const struct isl_device *dev,
const uint32_t bs = format_bpb / 8;
struct isl_extent2d logical_el, phys_B;
 
-   assert(bs > 0);
-   assert(tiling == ISL_TILING_LINEAR || isl_is_pow2(bs));
+   assert(tiling == ISL_TILING_LINEAR || isl_is_pow2(format_bpb));
 
switch (tiling) {
case ISL_TILING_LINEAR:
+  assert(bs > 0);
   logical_el = isl_extent2d(1, 1);
   phys_B = isl_extent2d(bs, 1);
   break;
 
case ISL_TILING_X:
+  assert(bs > 0);
   logical_el = isl_extent2d(512 / bs, 8);
   phys_B = isl_extent2d(512, 8);
   break;
 
case ISL_TILING_Y0:
+  assert(bs > 0);
   logical_el = isl_extent2d(128 / bs, 32);
   phys_B = isl_extent2d(128, 32);
   break;
@@ -159,6 +161,7 @@ isl_tiling_get_info(const struct isl_device *dev,
 
   bool is_Ys = tiling == ISL_TILING_Ys;
 
+  assert(bs > 0);
   unsigned width = 1 << (6 + (ffs(bs) / 2) + (2 * is_Ys));
   unsigned height = 1 << (6 - (ffs(bs) / 2) + (2 * is_Ys));
 
-- 
2.5.0.400.gff86faf
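
The reason the assertion moves can be seen with a small standalone sketch of the X-tiling case. The assumption here is that bs is computed as format_bpb / 8, as in the hunk above, so it is 0 for any format narrower than 8 bits per block; presumably the patch keeps tilings that never read bs from being rejected for such formats. `is_pow2`, `struct extent2d`, and `xtile_logical_el` are hypothetical stand-ins, not the isl functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a power-of-two check. */
static bool is_pow2(uint32_t n)
{
   return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical stand-in for a 2D extent. */
struct extent2d {
   uint32_t w, h;
};

/* Logical extent, in elements, of one X tile (512 bytes by 8 rows).
 * The bs > 0 check sits next to the division that needs it.
 */
static struct extent2d xtile_logical_el(uint32_t format_bpb)
{
   const uint32_t bs = format_bpb / 8;

   assert(is_pow2(format_bpb));
   assert(bs > 0);

   struct extent2d e = { 512 / bs, 8 };
   return e;
}
```

A 32 bpb format gives a tile of 128 by 8 elements. A 4 bpb format passes the power-of-two check but has bs equal to 0, so only code paths that divide by bs need to reject it.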



Re: [Mesa-dev] [PATCH 3/3] i965/fs: emit DIM instruction to load 64-bit immediates in HSW

2016-07-13 Thread Matt Turner
On Tue, Jul 12, 2016 at 11:42 PM, Samuel Iglesias Gonsálvez
 wrote:
> Signed-off-by: Samuel Iglesias Gonsálvez 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index a65c273..bf32dfd 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -4558,6 +4558,18 @@ setup_imm_df(const fs_builder &bld, double v)
> if (devinfo->gen >= 8)
>return brw_imm_df(v);
>
> +   /* gen7.5 does not support DF immediates straighforward but the DIM
> +* instruction allows to set the 64-bit immediate value.
> +*/
> +   if (devinfo->is_haswell) {
> +  const fs_builder ubld = bld.exec_all();
> +  fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_DF, 1);
> +  struct brw_reg imm = brw_imm_reg(BRW_REGISTER_TYPE_F);
> +  imm.df = v;
> +  ubld.DIM(dst, imm);

I know the hardware is strange and requires that src0's type is F, but
I don't think we need to model that in the IR. I think that using an F
type in the IR would require otherwise unnecessary changes to
dump_instructions().

With the above three lines changed to just

   ubld.DIM(dst, brw_imm_df(v));

this patch is:

Reviewed-by: Matt Turner 

Patch 1 I sent comments on. With those addressed it is also

Reviewed-by: Matt Turner 

I believe with my comments addressed on 1/3 and 3/3 that 2/3 is unnecessary.


Re: [Mesa-dev] [PATCH 2/3] i965/eu: set DF imm value to the source of DIM

2016-07-13 Thread Matt Turner
On Tue, Jul 12, 2016 at 11:42 PM, Samuel Iglesias Gonsálvez
 wrote:
> According to HSW's PRM, vol02b, the DIM instruction has the following
> restriction:
>
> "Restriction : src0 must be immediate. src0 must specify the :f (F, Float)
> type encoding but is an immediate 64-bit DF (Double Float) value. dst
> must have type DF."
>
> This commit allows to upload the immediate 64-bit DF value to the source
> of a DIM instruction even when it is of float type encoding.
>
> Signed-off-by: Samuel Iglesias Gonsálvez 
> ---
>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index f2f55410..fc187d1 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -350,7 +350,8 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, 
> struct brw_reg reg)
> brw_inst_set_src0_address_mode(devinfo, inst, reg.address_mode);
>
> if (reg.file == BRW_IMMEDIATE_VALUE) {
> -  if (reg.type == BRW_REGISTER_TYPE_DF)
> +  if (reg.type == BRW_REGISTER_TYPE_DF ||
> +  brw_inst_opcode(devinfo, inst) == BRW_OPCODE_DIM)

I don't think this patch is needed if we treat src[0] as type-DF in the IR.
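
For reference, the 64-bit payload that the quoted PRM restriction describes is just the raw bit pattern of the double. A minimal sketch in plain C, independent of the i965 code and assuming IEEE 754 doubles; `df_imm_bits` is a hypothetical helper, not a Mesa function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw 64-bit pattern of a double, i.e.
 * the value that would be placed in the immediate slot regardless of
 * the type encoding used for the source.
 */
static uint64_t df_imm_bits(double v)
{
   uint64_t bits;

   assert(sizeof(bits) == sizeof(v));
   /* memcpy avoids the aliasing problems of a pointer cast. */
   memcpy(&bits, &v, sizeof(bits));
   return bits;
}
```

For example, `df_imm_bits(1.0)` is 0x3FF0000000000000 on an IEEE 754 platform.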


Re: [Mesa-dev] [PATCH 1/3] i965: enable the emission of the DIM instruction

2016-07-13 Thread Matt Turner
On Tue, Jul 12, 2016 at 11:42 PM, Samuel Iglesias Gonsálvez
 wrote:
> Signed-off-by: Samuel Iglesias Gonsálvez 
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h  | 2 +-
>  src/mesa/drivers/dri/i965/brw_eu.c   | 2 +-
>  src/mesa/drivers/dri/i965/brw_eu.h   | 1 +
>  src/mesa/drivers/dri/i965/brw_eu_emit.c  | 1 +
>  src/mesa/drivers/dri/i965/brw_fs_builder.h   | 1 +
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 7 +++
>  src/mesa/drivers/dri/i965/brw_vec4.h | 2 ++
>  src/mesa/drivers/dri/i965/brw_vec4_builder.h | 1 +
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 7 +++
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 1 +
>  10 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index d2cd53a..740d03d 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -857,7 +857,7 @@ enum opcode {
> BRW_OPCODE_XOR =7,
> BRW_OPCODE_SHR =8,
> BRW_OPCODE_SHL =9,
> -   // BRW_OPCODE_DIM = 10,  /**< Gen7.5 only */ /* Reused */
> +   BRW_OPCODE_DIM =10,  /**< Gen7.5 only */ /* Reused */
> // BRW_OPCODE_SMOV =10,  /**< Gen8+   */ /* Reused */
> /* Reserved - 11 */
> BRW_OPCODE_ASR =12,
> diff --git a/src/mesa/drivers/dri/i965/brw_eu.c 
> b/src/mesa/drivers/dri/i965/brw_eu.c
> index cc252de..3a309dc 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu.c
> @@ -421,7 +421,7 @@ enum gen {
>  #define GEN_LE(gen) (GEN_LT(gen) | (gen))
>
>  static const struct opcode_desc opcode_10_descs[] = {
> -   { .name = "dim",   .nsrc = 0, .ndst = 0, .gens = GEN75 },
> +   { .name = "dim",   .nsrc = 1, .ndst = 1, .gens = GEN75 },
> { .name = "smov",  .nsrc = 0, .ndst = 0, .gens = GEN_GE(GEN8) },
>  };
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
> b/src/mesa/drivers/dri/i965/brw_eu.h
> index b057f17..09f51db 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu.h
> +++ b/src/mesa/drivers/dri/i965/brw_eu.h
> @@ -157,6 +157,7 @@ ALU2(OR)
>  ALU2(XOR)
>  ALU2(SHR)
>  ALU2(SHL)
> +ALU1(DIM)
>  ALU2(ASR)
>  ALU1(F32TO16)
>  ALU1(F16TO32)
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 2a8e661..f2f55410 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -1064,6 +1064,7 @@ ALU2(OR)
>  ALU2(XOR)
>  ALU2(SHR)
>  ALU2(SHL)
> +ALU1(DIM)
>  ALU2(ASR)
>  ALU1(FRC)
>  ALU1(RNDD)
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h 
> b/src/mesa/drivers/dri/i965/brw_fs_builder.h
> index f22903e..8e43484 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_builder.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
> @@ -460,6 +460,7 @@ namespace brw {
>ALU1(CBIT)
>ALU2(CMPN)
>ALU3(CSEL)
> +  ALU1(DIM)
>ALU2(DP2)
>ALU2(DP3)
>ALU2(DP4)
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index ce1ec0a..ba213b1 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -2082,6 +2082,13 @@ fs_generator::generate_code(const cfg_t *cfg, int 
> dispatch_width)
>  generate_barrier(inst, src[0]);
>  break;
>
> +  case BRW_OPCODE_DIM:
> + assert(devinfo->is_haswell);
> + assert(src[0].type == BRW_REGISTER_TYPE_F);

As I say in reply to PATCH 3/3, I think it's better to use type DF for
src0 in the IR, and just fix the type to F here in the generator.

I would just assert that src[0].type is DF, and then...

> + assert(dst.type == BRW_REGISTER_TYPE_DF);
> + brw_DIM(p, dst, src[0]);

   brw_DIM(p, dst, retype(src[0], BRW_REGISTER_TYPE_F));

> +break;

The indentation looks wrong here.

> +
>default:
>   unreachable("Unsupported opcode");
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_vec4.h
> index 76dea04..3043147 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -213,6 +213,8 @@ public:
> EMIT3(MAD)
> EMIT2(ADDC)
> EMIT2(SUBB)
> +   EMIT1(DIM)
> +
>  #undef EMIT1
>  #undef EMIT2
>  #undef EMIT3
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_builder.h 
> b/src/mesa/drivers/dri/i965/brw_vec4_builder.h
> index 3a8617e..d25a87a 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_builder.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_builder.h
> @@ -373,6 +373,7 @@ namespace brw {
>ALU1(CBIT)
>ALU2(CMPN)
>ALU3(CSEL)
> +  ALU1(DIM)
>ALU2(DP2)
>ALU2(DP3)
>ALU2(DP4)
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index bb0254e..bcddafe 100644
> --- a/src/me

[Mesa-dev] [Bug 96835] "gallium: Force blend color to 16-byte alignment" crash with "-march=native -O3" causes some 32bit games to crash

2016-07-13 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96835

--- Comment #11 from i...@yahoo.com ---
I would actually recommend solving the problem at its root.

Disable auto vectorization.

FFmpeg recently tried removing "-fno-tree-vectorize" for gcc >= 4.9.
After a few weeks they reverted the change.
It caused a bunch of strange regressions, breakages and ICEs.

That feature has been around for nine years and it is still buggy.
Save yourself some headaches and disable it.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 1/3 v3] anv/descriptor_set: Fix binding partly undefined descriptor sets

2016-07-13 Thread Jason Ekstrand
LGTM

On Wed, Jul 13, 2016 at 4:36 PM, Nanley Chery  wrote:

> Section 13.2.3. of the Vulkan spec requires that implementations be able to
> bind sparsely-defined Descriptor Sets without any errors or exceptions.
>
> When binding a descriptor set that contains a dynamic buffer
> binding/descriptor,
> the driver attempts to dereference the descriptor's buffer_view field if
> it is
> non-NULL. It currently segfaults on undefined descriptors as this field is
> never
> zero-initialized. Zero undefined descriptors to avoid segfaulting. This
> solution was suggested by Jason Ekstrand.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96850
> Cc: 12.0 
> Signed-off-by: Nanley Chery 
> ---
>
> v3: memset all descriptors in the DescriptorSet (Jason Ekstrand)
>
>  src/intel/vulkan/anv_descriptor_set.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/intel/vulkan/anv_descriptor_set.c
> b/src/intel/vulkan/anv_descriptor_set.c
> index 448ae0e..bd3ebed 100644
> --- a/src/intel/vulkan/anv_descriptor_set.c
> +++ b/src/intel/vulkan/anv_descriptor_set.c
> @@ -409,6 +409,11 @@ anv_descriptor_set_create(struct anv_device *device,
>(struct anv_buffer_view *) &set->descriptors[layout->size];
> set->buffer_count = layout->buffer_count;
>
> +   /* By defining the descriptors to be zero now, we can later verify that
> +* a descriptor has not been populated with user data.
> +*/
> +   memset(set->descriptors, 0, sizeof(struct anv_descriptor) *
> layout->size);
> +
> /* Go through and fill out immutable samplers if we have any */
> struct anv_descriptor *desc = set->descriptors;
> for (uint32_t b = 0; b < layout->binding_count; b++) {
> --
> 2.9.0
>
>


[Mesa-dev] [PATCH] isl/state: Divide the aux qpitch by 2

2016-07-13 Thread Jason Ekstrand
The field is in multiples of 4 like regular QPitch.
---
 src/intel/isl/isl_surface_state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index d40f2c1..1c656c9 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -445,7 +445,7 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
* in units of samples on the main surface.
*/
   s.AuxiliarySurfaceQPitch =
- isl_surf_get_array_pitch_sa_rows(info->aux_surf);
+ isl_surf_get_array_pitch_sa_rows(info->aux_surf) >> 2;
   s.AuxiliarySurfaceBaseAddress = info->aux_address;
   s.AuxiliarySurfaceMode = isl_to_gen_aux_mode[info->aux_usage];
 #else
-- 
2.5.0.400.gff86faf
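
The unit conversion in this one-line change can be written out as a tiny helper. A sketch only: `aux_qpitch_field` is a hypothetical name, and the alignment assertion is an added assumption that the patch itself does not check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert an array pitch in sample rows to a field
 * expressed in multiples of 4 rows. A right shift by 2 is a division
 * by 4.
 */
static uint32_t aux_qpitch_field(uint32_t array_pitch_sa_rows)
{
   /* Assumption for this sketch: the pitch is already 4-row aligned. */
   assert(array_pitch_sa_rows % 4 == 0);
   return array_pitch_sa_rows >> 2;
}
```

A pitch of 64 rows therefore encodes as 16.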



[Mesa-dev] [PATCH 1/3 v3] anv/descriptor_set: Fix binding partly undefined descriptor sets

2016-07-13 Thread Nanley Chery
Section 13.2.3. of the Vulkan spec requires that implementations be able to
bind sparsely-defined Descriptor Sets without any errors or exceptions.

When binding a descriptor set that contains a dynamic buffer binding/descriptor,
the driver attempts to dereference the descriptor's buffer_view field if it is
non-NULL. It currently segfaults on undefined descriptors as this field is never
zero-initialized. Zero undefined descriptors to avoid segfaulting. This
solution was suggested by Jason Ekstrand.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96850
Cc: 12.0 
Signed-off-by: Nanley Chery 
---

v3: memset all descriptors in the DescriptorSet (Jason Ekstrand)

 src/intel/vulkan/anv_descriptor_set.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/intel/vulkan/anv_descriptor_set.c 
b/src/intel/vulkan/anv_descriptor_set.c
index 448ae0e..bd3ebed 100644
--- a/src/intel/vulkan/anv_descriptor_set.c
+++ b/src/intel/vulkan/anv_descriptor_set.c
@@ -409,6 +409,11 @@ anv_descriptor_set_create(struct anv_device *device,
   (struct anv_buffer_view *) &set->descriptors[layout->size];
set->buffer_count = layout->buffer_count;
 
+   /* By defining the descriptors to be zero now, we can later verify that
+* a descriptor has not been populated with user data.
+*/
+   memset(set->descriptors, 0, sizeof(struct anv_descriptor) * layout->size);
+
/* Go through and fill out immutable samplers if we have any */
struct anv_descriptor *desc = set->descriptors;
for (uint32_t b = 0; b < layout->binding_count; b++) {
-- 
2.9.0
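
The idea behind the fix can be shown outside the driver. A minimal sketch with hypothetical types (`struct descriptor`, `struct buffer_view`) and helpers, not the anv structures; it assumes, as is true on the platforms Mesa targets, that all-zero bytes represent a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the driver structures. */
struct buffer_view {
   int placeholder;
};

struct descriptor {
   int type;
   struct buffer_view *buffer_view;
};

/* Allocate a descriptor array and zero it, so that every entry starts
 * out in a recognizable "undefined" state.
 */
static struct descriptor *descriptors_create(size_t count)
{
   struct descriptor *d = malloc(count * sizeof(*d));
   if (d == NULL)
      return NULL;

   memset(d, 0, count * sizeof(*d));
   return d;
}

/* With zero-initialization in place, a NULL check is enough to tell
 * whether the descriptor was ever populated.
 */
static int descriptor_is_populated(const struct descriptor *d)
{
   return d->buffer_view != NULL;
}
```

Without the memset, `buffer_view` would hold whatever malloc returned, and a non-NULL garbage value would be dereferenced at bind time.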



[Mesa-dev] [PATCH 1/3 v2] anv/descriptor_set: Fix binding partly undefined descriptor sets

2016-07-13 Thread Nanley Chery
Section 13.2.3. of the Vulkan spec requires that implementations be able to
bind sparsely-defined Descriptor Sets without any errors or exceptions.

When binding a descriptor set that contains a dynamic buffer binding/descriptor,
the driver attempts to dereference the descriptor's buffer_view field if it is
non-NULL. It currently segfaults on undefined descriptors as this field is never
zero-initialized. Zero undefined descriptors to avoid segfaulting. This
solution was suggested by Jason Ekstrand.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96850
Cc: 12.0 
Signed-off-by: Nanley Chery 
---

v2: memset all descriptor array elements at once (Jason Ekstrand)

 src/intel/vulkan/anv_descriptor_set.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/intel/vulkan/anv_descriptor_set.c 
b/src/intel/vulkan/anv_descriptor_set.c
index 448ae0e..39d4dde 100644
--- a/src/intel/vulkan/anv_descriptor_set.c
+++ b/src/intel/vulkan/anv_descriptor_set.c
@@ -424,6 +424,12 @@ anv_descriptor_set_create(struct anv_device *device,
.sampler = layout->binding[b].immutable_samplers[i],
 };
  }
+  } else {
+ /* By defining the descriptors to be zero now, we can later verify 
that
+  * the descriptor has not been populated with user data.
+  */
+ memset(desc, 0,
+   sizeof(struct anv_descriptor) * layout->binding[b].array_size);
   }
   desc += layout->binding[b].array_size;
}
-- 
2.9.0



[Mesa-dev] [PATCH] isl: Fix the bs assertion in isl_tiling_get_info

2016-07-13 Thread Jason Ekstrand
---
 src/intel/isl/isl.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 2003e2a..9e01b8d 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -113,21 +113,23 @@ isl_tiling_get_info(const struct isl_device *dev,
const uint32_t bs = format_bpb / 8;
struct isl_extent2d logical_el, phys_B;
 
-   assert(bs > 0);
-   assert(tiling == ISL_TILING_LINEAR || isl_is_pow2(bs));
+   assert(tiling == ISL_TILING_LINEAR || isl_is_pow2(format_bpb));
 
switch (tiling) {
case ISL_TILING_LINEAR:
+  assert(bs > 0);
   logical_el = isl_extent2d(1, 1);
   phys_B = isl_extent2d(bs, 1);
   break;
 
case ISL_TILING_X:
+  assert(bs > 0);
   logical_el = isl_extent2d(512 / bs, 8);
   phys_B = isl_extent2d(512, 8);
   break;
 
case ISL_TILING_Y0:
+  assert(bs > 0);
   logical_el = isl_extent2d(128 / bs, 32);
   phys_B = isl_extent2d(128, 32);
   break;
@@ -159,6 +161,7 @@ isl_tiling_get_info(const struct isl_device *dev,
 
   bool is_Ys = tiling == ISL_TILING_Ys;
 
+  assert(bs > 0);
   unsigned width = 1 << (6 + (ffs(bs) / 2) + (2 * is_Ys));
   unsigned height = 1 << (6 - (ffs(bs) / 2) + (2 * is_Ys));
 
-- 
2.5.0.400.gff86faf



[Mesa-dev] [PATCH 03/11] radeon/omx: assign previous values to new structure

2016-07-13 Thread Boyuan Zhang
Assign previously hardcoded values for OMX to newly defined structure. As a 
result, OMX behaviour will not change at all.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/omx/vid_enc.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/gallium/state_trackers/omx/vid_enc.c 
b/src/gallium/state_trackers/omx/vid_enc.c
index d70439a..bbc7941 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -1006,6 +1006,14 @@ static void enc_ScaleInput(omx_base_PortType *port, 
struct pipe_video_buffer **v
priv->current_scale_buffer %= OMX_VID_ENC_NUM_SCALING_BUFFERS;
 }
 
+static void enc_GetPictureParamPreset(struct pipe_h264_enc_picture_desc 
*picture)
+{
+   picture->motion_est.enc_disable_sub_mode = 0x00fe;
+   picture->motion_est.enc_ime2_search_range_x = 0x0001;
+   picture->motion_est.enc_ime2_search_range_y = 0x0001;
+   picture->pic_ctrl.enc_constraint_set_flags = 0x0040;
+}
+
 static void enc_ControlPicture(omx_base_PortType *port, struct 
pipe_h264_enc_picture_desc *picture)
 {
OMX_COMPONENTTYPE* comp = port->standCompContainer;
@@ -1064,6 +1072,8 @@ static void enc_ControlPicture(omx_base_PortType *port, 
struct pipe_h264_enc_pic
picture->frame_num = priv->frame_num;
picture->ref_idx_l0 = priv->ref_idx_l0;
picture->ref_idx_l1 = priv->ref_idx_l1;
+   picture->enable_vui = (picture->rate_ctrl.frame_rate_num != 0);
+   enc_GetPictureParamPreset(picture);
 }
 
 static void enc_HandleTask(omx_base_PortType *port, struct encode_task *task,
-- 
2.7.4



[Mesa-dev] [PATCH 07/11] st/va: add conversion for yv12 to nv12 in putimage

2016-07-13 Thread Boyuan Zhang
For a putimage call, if the image format is yv12 (or IYUV, which has the U
and V fields swapped) and the surface format is nv12, we need to convert
yv12 to nv12 and then copy the converted data from the image to the surface.
We can't use the existing logic where the surface is destroyed and re-created
with yv12 format.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/image.c | 33 ++---
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/src/gallium/state_trackers/va/image.c 
b/src/gallium/state_trackers/va/image.c
index 1b956e3..47895ee 100644
--- a/src/gallium/state_trackers/va/image.c
+++ b/src/gallium/state_trackers/va/image.c
@@ -471,7 +471,9 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, 
VAImageID image,
   return VA_STATUS_ERROR_OPERATION_FAILED;
}
 
-   if (format != surf->buffer->buffer_format) {
+   if ((format != surf->buffer->buffer_format) &&
+ ((format != PIPE_FORMAT_YV12) || (surf->buffer->buffer_format != 
PIPE_FORMAT_NV12)) &&
+ ((format != PIPE_FORMAT_IYUV) || (surf->buffer->buffer_format != 
PIPE_FORMAT_NV12))) {
   struct pipe_video_buffer *tmp_buf;
   struct pipe_video_buffer templat = surf->templat;
 
@@ -513,12 +515,29 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, 
VAImageID image,
   unsigned width, height;
   if (!views[i]) continue;
   vlVaVideoSurfaceSize(surf, i, &width, &height);
-  for (j = 0; j < views[i]->texture->array_size; ++j) {
- struct pipe_box dst_box = {0, 0, j, width, height, 1};
- drv->pipe->transfer_inline_write(drv->pipe, views[i]->texture, 0,
-PIPE_TRANSFER_WRITE, &dst_box,
-data[i] + pitches[i] * j,
-pitches[i] * views[i]->texture->array_size, 0);
+  if (((format == PIPE_FORMAT_YV12) || (format == PIPE_FORMAT_IYUV)) &&
+(surf->buffer->buffer_format == PIPE_FORMAT_NV12)) {
+ struct pipe_transfer *transfer = NULL;
+ uint8_t *map = NULL;
+ struct pipe_box dst_box_1 = {0, 0, 0, width, height, 1};
+ map = drv->pipe->transfer_map(drv->pipe,
+   views[i]->texture,
+   0,
+   PIPE_TRANSFER_DISCARD_RANGE,
+   &dst_box_1, &transfer);
+ if (map == NULL)
+return VA_STATUS_ERROR_OPERATION_FAILED;
+
+ u_copy_yv12_img_to_nv12_surf (data, map, vaimage->offsets, i);
+ pipe_transfer_unmap(drv->pipe, transfer);
+  } else {
+ for (j = 0; j < views[i]->texture->array_size; ++j) {
+struct pipe_box dst_box = {0, 0, j, width, height, 1};
+drv->pipe->transfer_inline_write(drv->pipe, views[i]->texture, 0,
+ PIPE_TRANSFER_WRITE, &dst_box,
+ data[i] + pitches[i] * j,
+ pitches[i] * 
views[i]->texture->array_size, 0);
+ }
   }
}
pipe_mutex_unlock(drv->mutex);
-- 
2.7.4



Re: [Mesa-dev] [PATCH 1/3] anv/descriptor_set: Fix binding partly undefined descriptor sets

2016-07-13 Thread Nanley Chery
On Wed, Jul 13, 2016 at 03:56:42PM -0700, Jason Ekstrand wrote:
> On Wed, Jul 13, 2016 at 3:34 PM, Nanley Chery  wrote:
> 
> > Section 13.2.3. of the Vulkan spec requires that implementations be able to
> > bind sparsely-defined Descriptor Sets without any errors or exceptions.
> >
> > When binding a descriptor set that contains a dynamic buffer
> > binding/descriptor,
> > the driver attempts to dereference the descriptor's buffer_view field if
> > it is
> > non-NULL. It currently segfaults on undefined descriptors as this field is
> > never
> > zero-initialized. Zero undefined descriptors to avoid segfaulting. This
> > solution was suggested by Jason Ekstrand.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96850
> > Cc: 12.0 
> > Signed-off-by: Nanley Chery 
> > ---
> >  src/intel/vulkan/anv_descriptor_set.c | 9 +++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_descriptor_set.c
> > b/src/intel/vulkan/anv_descriptor_set.c
> > index 448ae0e..f06d2e4 100644
> > --- a/src/intel/vulkan/anv_descriptor_set.c
> > +++ b/src/intel/vulkan/anv_descriptor_set.c
> > @@ -412,8 +412,8 @@ anv_descriptor_set_create(struct anv_device *device,
> > /* Go through and fill out immutable samplers if we have any */
> > struct anv_descriptor *desc = set->descriptors;
> > for (uint32_t b = 0; b < layout->binding_count; b++) {
> > -  if (layout->binding[b].immutable_samplers) {
> > - for (uint32_t i = 0; i < layout->binding[b].array_size; i++) {
> > +  for (uint32_t i = 0; i < layout->binding[b].array_size; i++) {
> > + if (layout->binding[b].immutable_samplers) {
> >  /* The type will get changed to COMBINED_IMAGE_SAMPLER in
> >   * UpdateDescriptorSets if needed.  However, if the descriptor
> >   * set has an immutable sampler, UpdateDescriptorSets may
> > never
> > @@ -423,6 +423,11 @@ anv_descriptor_set_create(struct anv_device *device,
> > .type = VK_DESCRIPTOR_TYPE_SAMPLER,
> > .sampler = layout->binding[b].immutable_samplers[i],
> >  };
> > + } else {
> > +/* By defining the descriptors to be zero now, we can later
> > verify that
> > + * the descriptor has not been populated with user data.
> > + */
> > +zero(desc[i]);
> >
> 
> I think I'd rather just zero the whole thing rather than one descriptor at
> a time.  Given that the memset will use vectorized memory operations, it's
> almost certainly just as fast if not faster to do so and I think it's more
> clear.
> 

I agree. I overlooked the fact that we could memset all descriptor array
elements in a binding.

- Nanley

> --Jason
> 
> 
> >   }
> >}
> >desc += layout->binding[b].array_size;
> > --
> > 2.9.0
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >


Re: [Mesa-dev] [PATCH 1/3] anv/descriptor_set: Fix binding partly undefined descriptor sets

2016-07-13 Thread Jason Ekstrand
On Wed, Jul 13, 2016 at 3:34 PM, Nanley Chery  wrote:

> Section 13.2.3. of the Vulkan spec requires that implementations be able to
> bind sparsely-defined Descriptor Sets without any errors or exceptions.
>
> When binding a descriptor set that contains a dynamic buffer
> binding/descriptor,
> the driver attempts to dereference the descriptor's buffer_view field if
> it is
> non-NULL. It currently segfaults on undefined descriptors as this field is
> never
> zero-initialized. Zero undefined descriptors to avoid segfaulting. This
> solution was suggested by Jason Ekstrand.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96850
> Cc: 12.0 
> Signed-off-by: Nanley Chery 
> ---
>  src/intel/vulkan/anv_descriptor_set.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_descriptor_set.c
> b/src/intel/vulkan/anv_descriptor_set.c
> index 448ae0e..f06d2e4 100644
> --- a/src/intel/vulkan/anv_descriptor_set.c
> +++ b/src/intel/vulkan/anv_descriptor_set.c
> @@ -412,8 +412,8 @@ anv_descriptor_set_create(struct anv_device *device,
> /* Go through and fill out immutable samplers if we have any */
> struct anv_descriptor *desc = set->descriptors;
> for (uint32_t b = 0; b < layout->binding_count; b++) {
> -  if (layout->binding[b].immutable_samplers) {
> - for (uint32_t i = 0; i < layout->binding[b].array_size; i++) {
> +  for (uint32_t i = 0; i < layout->binding[b].array_size; i++) {
> + if (layout->binding[b].immutable_samplers) {
>  /* The type will get changed to COMBINED_IMAGE_SAMPLER in
>   * UpdateDescriptorSets if needed.  However, if the descriptor
>   * set has an immutable sampler, UpdateDescriptorSets may
> never
> @@ -423,6 +423,11 @@ anv_descriptor_set_create(struct anv_device *device,
> .type = VK_DESCRIPTOR_TYPE_SAMPLER,
> .sampler = layout->binding[b].immutable_samplers[i],
>  };
> + } else {
> +/* By defining the descriptors to be zero now, we can later
> verify that
> + * the descriptor has not been populated with user data.
> + */
> +zero(desc[i]);
>

I think I'd rather just zero the whole thing rather than one descriptor at
a time.  Given that the memset will use vectorized memory operations, it's
almost certainly just as fast if not faster to do so and I think it's more
clear.
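
A minimal sketch of the difference being discussed, not the actual anv code: struct sketch_descriptor and both helper names are hypothetical stand-ins, and the sketch assumes that an all-zero bit pattern reads back as a NULL pointer, which holds on the platforms Mesa targets. It zeroes every descriptor of a binding with one memset instead of one element at a time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct anv_descriptor; the real one has more fields. */
struct sketch_descriptor {
   int type;
   const void *sampler;
   const void *buffer_view;
};

/* Zero all descriptors of one binding with a single memset. */
void
sketch_zero_binding(struct sketch_descriptor *desc, uint32_t array_size)
{
   assert(desc);
   memset(desc, 0, sizeof(*desc) * array_size);
}

/* A descriptor that was never populated still has a NULL buffer_view,
 * so code that binds the set can skip it instead of dereferencing garbage.
 */
int
sketch_descriptor_is_undefined(const struct sketch_descriptor *desc)
{
   return desc->buffer_view == NULL;
}
```

The same call could cover the whole descriptor array of a set by passing the total descriptor count rather than a single binding's array_size.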

--Jason


>   }
>}
>desc += layout->binding[b].array_size;
> --
> 2.9.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


[Mesa-dev] [PATCH 02/11] vl: add entry point

2016-07-13 Thread Boyuan Zhang
Add an entry point field so that encoding can be selected; the entry point was
previously hardcoded for decoding only.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/include/pipe/p_video_state.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/include/pipe/p_video_state.h 
b/src/gallium/include/pipe/p_video_state.h
index 754d013..39b3905 100644
--- a/src/gallium/include/pipe/p_video_state.h
+++ b/src/gallium/include/pipe/p_video_state.h
@@ -131,6 +131,7 @@ enum pipe_h264_enc_rate_control_method
 struct pipe_picture_desc
 {
enum pipe_video_profile profile;
+   enum pipe_video_entrypoint entry_point;
 };
 
 struct pipe_quant_matrix
-- 
2.7.4



[Mesa-dev] [PATCH 04/11] radeon/vce: handle newly added parameters

2016-07-13 Thread Boyuan Zhang
Replace the previously hardcoded values with the newly defined parameters.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/drivers/radeon/radeon_vce_52.c | 33 ++
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce_52.c 
b/src/gallium/drivers/radeon/radeon_vce_52.c
index 7d33313..7986eb8 100644
--- a/src/gallium/drivers/radeon/radeon_vce_52.c
+++ b/src/gallium/drivers/radeon/radeon_vce_52.c
@@ -48,13 +48,14 @@ static void get_rate_control_param(struct rvce_encoder 
*enc, struct pipe_h264_en
enc->enc_pic.rc.quant_i_frames = pic->quant_i_frames;
enc->enc_pic.rc.quant_p_frames = pic->quant_p_frames;
enc->enc_pic.rc.quant_b_frames = pic->quant_b_frames;
+   enc->enc_pic.rc.gop_size = pic->gop_size;
enc->enc_pic.rc.frame_rate_num = pic->rate_ctrl.frame_rate_num;
enc->enc_pic.rc.frame_rate_den = pic->rate_ctrl.frame_rate_den;
enc->enc_pic.rc.max_qp = 51;
enc->enc_pic.rc.vbv_buffer_size = pic->rate_ctrl.vbv_buffer_size;
-   enc->enc_pic.rc.vbv_buf_lv = 0;
-   enc->enc_pic.rc.fill_data_enable = 0;
-   enc->enc_pic.rc.enforce_hrd = 0;
+   enc->enc_pic.rc.vbv_buf_lv = pic->rate_ctrl.vbv_buf_lv;
+   enc->enc_pic.rc.fill_data_enable = pic->rate_ctrl.fill_data_enable;
+   enc->enc_pic.rc.enforce_hrd = pic->rate_ctrl.enforce_hrd;
enc->enc_pic.rc.target_bits_picture = 
pic->rate_ctrl.target_bits_picture;
enc->enc_pic.rc.peak_bits_picture_integer = 
pic->rate_ctrl.peak_bits_picture_integer;
enc->enc_pic.rc.peak_bits_picture_fraction = 
pic->rate_ctrl.peak_bits_picture_fraction;
@@ -62,13 +63,13 @@ static void get_rate_control_param(struct rvce_encoder 
*enc, struct pipe_h264_en
 
 static void get_motion_estimation_param(struct rvce_encoder *enc, struct 
pipe_h264_enc_picture_desc *pic)
 {
-   enc->enc_pic.me.motion_est_quarter_pixel = 0x00000000;
-   enc->enc_pic.me.enc_disable_sub_mode = 0x00fe;
-   enc->enc_pic.me.lsmvert = 0x00000000;
-   enc->enc_pic.me.enc_en_ime_overw_dis_subm = 0x00000000;
-   enc->enc_pic.me.enc_ime_overw_dis_subm_no = 0x00000000;
-   enc->enc_pic.me.enc_ime2_search_range_x = 0x0001;
-   enc->enc_pic.me.enc_ime2_search_range_y = 0x0001;
+   enc->enc_pic.me.motion_est_quarter_pixel = 
pic->motion_est.motion_est_quarter_pixel;
+   enc->enc_pic.me.enc_disable_sub_mode = 
pic->motion_est.enc_disable_sub_mode;
+   enc->enc_pic.me.lsmvert = pic->motion_est.lsmvert;
+   enc->enc_pic.me.enc_en_ime_overw_dis_subm = 
pic->motion_est.enc_en_ime_overw_dis_subm;
+   enc->enc_pic.me.enc_ime_overw_dis_subm_no = 
pic->motion_est.enc_ime_overw_dis_subm_no;
+   enc->enc_pic.me.enc_ime2_search_range_x = 
pic->motion_est.enc_ime2_search_range_x;
+   enc->enc_pic.me.enc_ime2_search_range_y = 
pic->motion_est.enc_ime2_search_range_y;
enc->enc_pic.me.enc_ime_decimation_search = 0x0001;
enc->enc_pic.me.motion_est_half_pixel = 0x0001;
enc->enc_pic.me.enc_search_range_x = 0x0010;
@@ -90,8 +91,8 @@ static void get_pic_control_param(struct rvce_encoder *enc, 
struct pipe_h264_enc
enc->enc_pic.pc.enc_max_num_ref_frames = enc->base.max_references + 1;
enc->enc_pic.pc.enc_num_default_active_ref_l0 = 0x0001;
enc->enc_pic.pc.enc_num_default_active_ref_l1 = 0x0001;
-   enc->enc_pic.pc.enc_cabac_enable = 0x00000000;
-   enc->enc_pic.pc.enc_constraint_set_flags = 0x0040;
+   enc->enc_pic.pc.enc_cabac_enable = pic->pic_ctrl.enc_cabac_enable;
+   enc->enc_pic.pc.enc_constraint_set_flags = 
pic->pic_ctrl.enc_constraint_set_flags;
enc->enc_pic.pc.enc_num_default_active_ref_l0 = 0x0001;
enc->enc_pic.pc.enc_num_default_active_ref_l1 = 0x0001;
 }
@@ -113,7 +114,7 @@ static void get_config_ext_param(struct rvce_encoder *enc)
 
 static void get_vui_param(struct rvce_encoder *enc, struct 
pipe_h264_enc_picture_desc *pic)
 {
-   enc->enc_pic.enable_vui = (pic->rate_ctrl.frame_rate_num != 0);
+   enc->enc_pic.enable_vui = pic->enable_vui;
enc->enc_pic.vui.video_format = 0x0005;
enc->enc_pic.vui.color_prim = 0x0002;
enc->enc_pic.vui.transfer_char = 0x0002;
@@ -149,10 +150,16 @@ void radeon_vce_52_get_param(struct rvce_encoder *enc, 
struct pipe_h264_enc_pict
 
enc->enc_pic.picture_type = pic->picture_type;
enc->enc_pic.frame_num = pic->frame_num;
+   enc->enc_pic.frame_num_cnt = pic->frame_num_cnt;
+   enc->enc_pic.p_remain = pic->p_remain;
+   enc->enc_pic.i_remain = pic->i_remain;
+   enc->enc_pic.gop_cnt = pic->gop_cnt;
enc->enc_pic.pic_order_cnt = pic->pic_order_cnt;
enc->enc_pic.ref_idx_l0 = pic->ref_idx_l0;
enc->enc_pic.ref_idx_l1 = pic->ref_idx_l1;
enc->enc_pic.not_referenced = pic->not_referenced;
+   enc->enc_pic.addrmode_arraymode_disrdo_distwoinstants = 
pic->ref_pic_mode;
+

[Mesa-dev] [PATCH 01/11] vl: add parameters for VAAPI encode

2016-07-13 Thread Boyuan Zhang
Allow more parameters to be specified through the encoding interface; these
were previously hardcoded in the encoder.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/include/pipe/p_video_state.h | 33 
 1 file changed, 33 insertions(+)

diff --git a/src/gallium/include/pipe/p_video_state.h 
b/src/gallium/include/pipe/p_video_state.h
index d353be6..754d013 100644
--- a/src/gallium/include/pipe/p_video_state.h
+++ b/src/gallium/include/pipe/p_video_state.h
@@ -352,9 +352,29 @@ struct pipe_h264_enc_rate_control
unsigned frame_rate_num;
unsigned frame_rate_den;
unsigned vbv_buffer_size;
+   unsigned vbv_buf_lv;
unsigned target_bits_picture;
unsigned peak_bits_picture_integer;
unsigned peak_bits_picture_fraction;
+   unsigned fill_data_enable;
+   unsigned enforce_hrd;
+};
+
+struct pipe_h264_enc_motion_estimation
+{
+   unsigned motion_est_quarter_pixel;
+   unsigned enc_disable_sub_mode;
+   unsigned lsmvert;
+   unsigned enc_en_ime_overw_dis_subm;
+   unsigned enc_ime_overw_dis_subm_no;
+   unsigned enc_ime2_search_range_x;
+   unsigned enc_ime2_search_range_y;
+};
+
+struct pipe_h264_enc_pic_control
+{
+   unsigned enc_cabac_enable;
+   unsigned enc_constraint_set_flags;
 };
 
 struct pipe_h264_enc_picture_desc
@@ -363,17 +383,30 @@ struct pipe_h264_enc_picture_desc
 
struct pipe_h264_enc_rate_control rate_ctrl;
 
+   struct pipe_h264_enc_motion_estimation motion_est;
+   struct pipe_h264_enc_pic_control pic_ctrl;
+
unsigned quant_i_frames;
unsigned quant_p_frames;
unsigned quant_b_frames;
 
enum pipe_h264_enc_picture_type picture_type;
unsigned frame_num;
+   unsigned frame_num_cnt;
+   unsigned p_remain;
+   unsigned i_remain;
+   unsigned idr_pic_id;
+   unsigned gop_cnt;
unsigned pic_order_cnt;
unsigned ref_idx_l0;
unsigned ref_idx_l1;
+   unsigned gop_size;
+   unsigned ref_pic_mode;
 
bool not_referenced;
+   bool is_idr;
+   bool enable_vui;
+   unsigned int frame_idx[32];
 };
 
 struct pipe_h265_sps
-- 
2.7.4



[Mesa-dev] [PATCH 06/11] st/va: add copy function for yv12 image to nv12 surface

2016-07-13 Thread Boyuan Zhang
Add a function to copy from a yv12 image to an nv12 surface for the VAAPI
putimage call. The existing function only works for copying from a yv12 surface
to an nv12 image.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/auxiliary/util/u_video.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_video.h 
b/src/gallium/auxiliary/util/u_video.h
index 9196afc..6e835d8 100644
--- a/src/gallium/auxiliary/util/u_video.h
+++ b/src/gallium/auxiliary/util/u_video.h
@@ -130,6 +130,28 @@ u_copy_yv12_to_nv12(void *const *destination_data,
 }
 
 static inline void
+u_copy_yv12_img_to_nv12_surf(uint8_t *const *src, uint8_t *dest, int *offset, 
int field)
+{
+   if (field == 0) {
+  for (int i = 0; i < offset[1] ; i++)
+  dest[i] = src[field][i];
+   }
+   else if (field == 1) {
+  bool odd = false;
+  for (int k = 0; k < (offset[1]/2) ; k++){
+  if (odd == false) {
+   dest[k] = src[field][k/2];
+   odd = true;
+  }
+  else {
+   dest[k] = src[field+1][k/2];
+   odd = false;
+  }
+  }
+   }
+}
+
+static inline void
 u_copy_swap422_packed(void *const *destination_data,
uint32_t const *destination_pitches,
int src_plane, int src_field,
-- 
2.7.4
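
A minimal sketch of the chroma step in such a conversion, under stated assumptions: it is a standalone illustration, not the helper added by this patch, and sketch_interleave_chroma and its parameters are hypothetical names. YV12 keeps two separate chroma planes (V before U in memory), while NV12 keeps one plane with U and V samples interleaved, U first, so the caller decides which plane to pass as "first".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interleave two planar chroma planes into one NV12-style plane.
 * Samples from "first" land in even bytes, samples from "second" in odd
 * bytes; dest must hold 2 * samples_per_plane bytes.
 */
void
sketch_interleave_chroma(const uint8_t *first, const uint8_t *second,
                         uint8_t *dest, size_t samples_per_plane)
{
   assert(first && second && dest);
   for (size_t i = 0; i < samples_per_plane; i++) {
      dest[2 * i] = first[i];
      dest[2 * i + 1] = second[i];
   }
}
```

The luma plane needs no reordering and can be copied as-is; only the chroma planes change layout between the two formats.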



[Mesa-dev] [PATCH 05/11] st/va: add encode entrypoint

2016-07-13 Thread Boyuan Zhang
VAAPI passes PIPE_VIDEO_ENTRYPOINT_ENCODE as the entry point for the encoding
case. Save this encode entry point instead of always hardcoding
PIPE_VIDEO_ENTRYPOINT_BITSTREAM, as was previously done for the decoding case.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/config.c | 61 +++---
 src/gallium/state_trackers/va/context.c| 57 
 src/gallium/state_trackers/va/surface.c| 12 --
 src/gallium/state_trackers/va/va_private.h |  5 +++
 4 files changed, 103 insertions(+), 32 deletions(-)

diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index 9ca0aa8..73704a1 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -34,6 +34,8 @@
 
 #include "va_private.h"
 
+#include "util/u_handle_table.h"
+
 DEBUG_GET_ONCE_BOOL_OPTION(mpeg4, "VAAPI_MPEG4_ENABLED", false)
 
 VAStatus
@@ -128,14 +130,27 @@ VAStatus
 vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, VAEntrypoint 
entrypoint,
  VAConfigAttrib *attrib_list, int num_attribs, VAConfigID 
*config_id)
 {
+   vlVaDriver *drv;
+   vlVaConfig *config;
struct pipe_screen *pscreen;
enum pipe_video_profile p;
 
if (!ctx)
   return VA_STATUS_ERROR_INVALID_CONTEXT;
 
+   drv = VL_VA_DRIVER(ctx);
+
+   if (!drv)
+  return VA_STATUS_ERROR_INVALID_CONTEXT;
+
+   config = CALLOC(1, sizeof(vlVaConfig));
+   if (!config)
+  return VA_STATUS_ERROR_ALLOCATION_FAILED;
+
if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
-  *config_id = PIPE_VIDEO_PROFILE_UNKNOWN;
+  config->entrypoint = VAEntrypointVideoProc;
+  config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
+  *config_id = handle_table_add(drv->htab, config);
   return VA_STATUS_SUCCESS;
}
 
@@ -150,7 +165,14 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
if (entrypoint != VAEntrypointVLD)
   return VA_STATUS_ERROR_UNSUPPORTED_ENTRYPOINT;
 
-   *config_id = p;
+   if (entrypoint == VAEntrypointEncSlice || entrypoint == 
VAEntrypointEncPicture)
+  config->entrypoint = PIPE_VIDEO_ENTRYPOINT_ENCODE;
+   else
+  config->entrypoint = PIPE_VIDEO_ENTRYPOINT_BITSTREAM;
+
+   config->profile = p;
+
+   *config_id = handle_table_add(drv->htab, config);
 
return VA_STATUS_SUCCESS;
 }
@@ -158,9 +180,25 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
 VAStatus
 vlVaDestroyConfig(VADriverContextP ctx, VAConfigID config_id)
 {
+   vlVaDriver *drv;
+   vlVaConfig *config;
+
if (!ctx)
   return VA_STATUS_ERROR_INVALID_CONTEXT;
 
+   drv = VL_VA_DRIVER(ctx);
+
+   if (!drv)
+  return VA_STATUS_ERROR_INVALID_CONTEXT;
+
+   config = handle_table_get(drv->htab, config_id);
+
+   if (!config)
+  return VA_STATUS_ERROR_INVALID_CONFIG;
+
+   FREE(config);
+   handle_table_remove(drv->htab, config_id);
+
return VA_STATUS_SUCCESS;
 }
 
@@ -168,18 +206,31 @@ VAStatus
 vlVaQueryConfigAttributes(VADriverContextP ctx, VAConfigID config_id, 
VAProfile *profile,
   VAEntrypoint *entrypoint, VAConfigAttrib 
*attrib_list, int *num_attribs)
 {
+   vlVaDriver *drv;
+   vlVaConfig *config;
+
if (!ctx)
   return VA_STATUS_ERROR_INVALID_CONTEXT;
 
-   *profile = PipeToProfile(config_id);
+   drv = VL_VA_DRIVER(ctx);
+
+   if (!drv)
+  return VA_STATUS_ERROR_INVALID_CONTEXT;
+
+   config = handle_table_get(drv->htab, config_id);
+
+   if (!config)
+  return VA_STATUS_ERROR_INVALID_CONFIG;
+
+   *profile = PipeToProfile(config->profile);
 
-   if (config_id == PIPE_VIDEO_PROFILE_UNKNOWN) {
+   if (config->profile == PIPE_VIDEO_PROFILE_UNKNOWN) {
   *entrypoint = VAEntrypointVideoProc;
   *num_attribs = 0;
   return VA_STATUS_SUCCESS;
}
 
-   *entrypoint = VAEntrypointVLD;
+   *entrypoint = config->entrypoint;
 
*num_attribs = 1;
attrib_list[0].type = VAConfigAttribRTFormat;
diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index 402fbb2..b4334f4 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -195,18 +195,21 @@ vlVaCreateContext(VADriverContextP ctx, VAConfigID 
config_id, int picture_width,
 {
vlVaDriver *drv;
vlVaContext *context;
+   vlVaConfig *config;
int is_vpp;
 
if (!ctx)
   return VA_STATUS_ERROR_INVALID_CONTEXT;
 
-   is_vpp = config_id == PIPE_VIDEO_PROFILE_UNKNOWN && !picture_width &&
+   drv = VL_VA_DRIVER(ctx);
+   config = handle_table_get(drv->htab, config_id);
+
+   is_vpp = config->profile == PIPE_VIDEO_PROFILE_UNKNOWN && !picture_width &&
 !picture_height && !flag && !render_targets && !num_render_targets;
 
if (!(picture_width && picture_height) && !is_vpp)
   return VA_STATUS_ERROR_INVALID_IMAGE_FORMAT;
 
-   drv = VL_VA_DRIVER(ctx);
context = CALLOC(1, sizeof(vlVaContext));
if (

[Mesa-dev] [PATCH 10/11] st/va: add preset values for VAAPI encode

2016-07-13 Thread Boyuan Zhang
Add some hardcoded values that the hardware needs, mainly for rate control.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/picture.c | 36 +
 1 file changed, 36 insertions(+)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index 12b3cd1..343afd7 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -95,6 +95,41 @@ vlVaGetReferenceFrame(vlVaDriver *drv, VASurfaceID 
surface_id,
   *ref_frame = NULL;
 }
 
+static void
+getEncParamPreset(vlVaContext *context)
+{
+   //motion estimation preset
+   context->desc.h264enc.motion_est.motion_est_quarter_pixel = 0x0001;
+   context->desc.h264enc.motion_est.lsmvert = 0x0002;
+   context->desc.h264enc.motion_est.enc_disable_sub_mode = 0x0078;
+   context->desc.h264enc.motion_est.enc_en_ime_overw_dis_subm = 0x0001;
+   context->desc.h264enc.motion_est.enc_ime_overw_dis_subm_no = 0x0001;
+   context->desc.h264enc.motion_est.enc_ime2_search_range_x = 0x0004;
+   context->desc.h264enc.motion_est.enc_ime2_search_range_y = 0x0004;
+
+   //pic control preset
+   context->desc.h264enc.pic_ctrl.enc_cabac_enable = 0x0001;
+   context->desc.h264enc.pic_ctrl.enc_constraint_set_flags = 0x0040;
+
+   //rate control
+   context->desc.h264enc.rate_ctrl.vbv_buffer_size = 2000;
+   if (context->desc.h264enc.rate_ctrl.frame_rate_num == 0) {
+  context->desc.h264enc.rate_ctrl.frame_rate_num = 30;
+  context->desc.h264enc.rate_ctrl.frame_rate_den = 1;
+   }
+   context->desc.h264enc.rate_ctrl.vbv_buf_lv = 48;
+   context->desc.h264enc.rate_ctrl.fill_data_enable = 1;
+   context->desc.h264enc.rate_ctrl.enforce_hrd = 1;
+   context->desc.h264enc.enable_vui = false;
+   context->desc.h264enc.rate_ctrl.target_bits_picture =
+  context->desc.h264enc.rate_ctrl.target_bitrate / 
context->desc.h264enc.rate_ctrl.frame_rate_num;
+   context->desc.h264enc.rate_ctrl.peak_bits_picture_integer =
+  context->desc.h264enc.rate_ctrl.peak_bitrate / 
context->desc.h264enc.rate_ctrl.frame_rate_num;
+   context->desc.h264enc.rate_ctrl.peak_bits_picture_fraction = 0;
+
+   context->desc.h264enc.ref_pic_mode = 0x0201;
+}
+
 static VAStatus
 handlePictureParameterBuffer(vlVaDriver *drv, vlVaContext *context, vlVaBuffer 
*buf)
 {
@@ -513,6 +548,7 @@ vlVaEndPicture(VADriverContextP ctx, VAContextID context_id)
 
if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
   coded_buf = context->coded_buf;
+  getEncParamPreset(context);
   context->decoder->begin_frame(context->decoder, context->target, 
&context->desc.base);
   context->decoder->encode_bitstream(context->decoder, context->target,
  coded_buf->derived_surface.resource, 
&feedback);
-- 
2.7.4



[Mesa-dev] [PATCH 08/11] st/va: get rate control method from configattrib

2016-07-13 Thread Boyuan Zhang
The rate control method is passed from the application to the driver through
the config attribute list.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/config.c | 11 +++
 src/gallium/state_trackers/va/context.c|  3 ++-
 src/gallium/state_trackers/va/va_private.h |  1 +
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index 73704a1..ea838c0 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -172,6 +172,17 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
 
config->profile = p;
 
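+   for (int i = 0; i < num_attribs; i++) {
+  if (attrib_list[i].type == VAConfigAttribRateControl) {
+ if (attrib_list[i].value == VA_RC_CBR)
+config->rc = PIPE_H264_ENC_RATE_CONTROL_METHOD_CONSTANT;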
+ else if (attrib_list[i].value == VA_RC_VBR)
+config->rc = PIPE_H264_ENC_RATE_CONTROL_METHOD_VARIABLE;
+ else
+config->rc = PIPE_H264_ENC_RATE_CONTROL_METHOD_DISABLE;
+  }
+   }
+
*config_id = handle_table_add(drv->htab, config);
 
return VA_STATUS_SUCCESS;
diff --git a/src/gallium/state_trackers/va/context.c 
b/src/gallium/state_trackers/va/context.c
index b4334f4..c67ed1f 100644
--- a/src/gallium/state_trackers/va/context.c
+++ b/src/gallium/state_trackers/va/context.c
@@ -274,7 +274,8 @@ vlVaCreateContext(VADriverContextP ctx, VAConfigID 
config_id, int picture_width,
 
context->desc.base.profile = config->profile;
context->desc.base.entry_point = config->entrypoint;
-
+   if (config->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE)
+  context->desc.h264enc.rate_ctrl.rate_ctrl_method = config->rc;
pipe_mutex_lock(drv->mutex);
*context_id = handle_table_add(drv->htab, context);
pipe_mutex_unlock(drv->mutex);
diff --git a/src/gallium/state_trackers/va/va_private.h 
b/src/gallium/state_trackers/va/va_private.h
index 723983d..ad9010a 100644
--- a/src/gallium/state_trackers/va/va_private.h
+++ b/src/gallium/state_trackers/va/va_private.h
@@ -246,6 +246,7 @@ typedef struct {
 typedef struct {
VAEntrypoint entrypoint;
enum pipe_video_profile profile;
+   enum pipe_h264_enc_rate_control_method rc;
 } vlVaConfig;
 
 typedef struct {
-- 
2.7.4



[Mesa-dev] [PATCH 11/11] st/va: enable h264 VAAPI encode

2016-07-13 Thread Boyuan Zhang
Enable H.264 VAAPI encoding through config.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/config.c | 32 ++--
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/src/gallium/state_trackers/va/config.c 
b/src/gallium/state_trackers/va/config.c
index ea838c0..04d214d 100644
--- a/src/gallium/state_trackers/va/config.c
+++ b/src/gallium/state_trackers/va/config.c
@@ -74,6 +74,7 @@ vlVaQueryConfigEntrypoints(VADriverContextP ctx, VAProfile 
profile,
 {
struct pipe_screen *pscreen;
enum pipe_video_profile p;
+   int va_status = VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
 
if (!ctx)
   return VA_STATUS_ERROR_INVALID_CONTEXT;
@@ -90,12 +91,18 @@ vlVaQueryConfigEntrypoints(VADriverContextP ctx, VAProfile 
profile,
   return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
 
pscreen = VL_VA_PSCREEN(ctx);
-   if (!pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_BITSTREAM, 
PIPE_VIDEO_CAP_SUPPORTED))
-  return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
-
-   entrypoint_list[(*num_entrypoints)++] = VAEntrypointVLD;
+   if (pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_BITSTREAM, 
PIPE_VIDEO_CAP_SUPPORTED)) {
+  entrypoint_list[(*num_entrypoints)++] = VAEntrypointVLD;
+  va_status = VA_STATUS_SUCCESS;
+   }
+   if (pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_ENCODE, 
PIPE_VIDEO_CAP_SUPPORTED) &&
+   p == PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE) {
+  entrypoint_list[(*num_entrypoints)++] = VAEntrypointEncSlice;
+  entrypoint_list[(*num_entrypoints)++] = VAEntrypointEncPicture;
+  va_status = VA_STATUS_SUCCESS;
+   }
 
-   return VA_STATUS_SUCCESS;
+   return va_status;
 }
 
 VAStatus
@@ -114,7 +121,7 @@ vlVaGetConfigAttributes(VADriverContextP ctx, VAProfile 
profile, VAEntrypoint en
  value = VA_RT_FORMAT_YUV420;
  break;
   case VAConfigAttribRateControl:
- value = VA_RC_NONE;
+ value = VA_RC_CQP | VA_RC_CBR;
  break;
   default:
  value = VA_ATTRIB_NOT_SUPPORTED;
@@ -159,10 +166,15 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
VAEntrypoint entrypoin
   return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
 
pscreen = VL_VA_PSCREEN(ctx);
-   if (!pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_BITSTREAM, 
PIPE_VIDEO_CAP_SUPPORTED))
-  return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
-
-   if (entrypoint != VAEntrypointVLD)
+   if (entrypoint == VAEntrypointVLD) {
+  if (!pscreen->get_video_param(pscreen, p, 
PIPE_VIDEO_ENTRYPOINT_BITSTREAM, PIPE_VIDEO_CAP_SUPPORTED))
+ return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
+   }
+   else if (entrypoint == VAEntrypointEncSlice) {
+  if (!pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_ENCODE, 
PIPE_VIDEO_CAP_SUPPORTED))
+ return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
+   }
+   else
   return VA_STATUS_ERROR_UNSUPPORTED_ENTRYPOINT;
 
if (entrypoint == VAEntrypointEncSlice || entrypoint == 
VAEntrypointEncPicture)
-- 
2.7.4



[Mesa-dev] [PATCH 09/11] st/va: add functions for VAAPI encode

2016-07-13 Thread Boyuan Zhang
Add the functions and changes to buffer and picture handling that VAAPI
encoding needs, without affecting decode behaviour.

Signed-off-by: Boyuan Zhang 
---
 src/gallium/state_trackers/va/buffer.c |   6 ++
 src/gallium/state_trackers/va/picture.c| 161 -
 src/gallium/state_trackers/va/va_private.h |   3 +
 3 files changed, 168 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/va/buffer.c 
b/src/gallium/state_trackers/va/buffer.c
index 7d3167b..dfcebbe 100644
--- a/src/gallium/state_trackers/va/buffer.c
+++ b/src/gallium/state_trackers/va/buffer.c
@@ -133,6 +133,12 @@ vlVaMapBuffer(VADriverContextP ctx, VABufferID buf_id, 
void **pbuff)
   if (!buf->derived_surface.transfer || !*pbuff)
  return VA_STATUS_ERROR_INVALID_BUFFER;
 
+  if (buf->type == VAEncCodedBufferType) {
+ ((VACodedBufferSegment*)buf->data)->buf = *pbuff;
+ ((VACodedBufferSegment*)buf->data)->size = buf->coded_size;
+ ((VACodedBufferSegment*)buf->data)->next = NULL;
+ *pbuff = buf->data;
+  }
} else {
   pipe_mutex_unlock(drv->mutex);
   *pbuff = buf->data;
diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index 89ac024..12b3cd1 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -78,7 +78,8 @@ vlVaBeginPicture(VADriverContextP ctx, VAContextID 
context_id, VASurfaceID rende
   return VA_STATUS_SUCCESS;
}
 
-   context->decoder->begin_frame(context->decoder, context->target, 
&context->desc.base);
+   if (context->decoder->entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE)
+  context->decoder->begin_frame(context->decoder, context->target, 
&context->desc.base);
 
return VA_STATUS_SUCCESS;
 }
@@ -278,6 +279,131 @@ handleVASliceDataBufferType(vlVaContext *context, 
vlVaBuffer *buf)
   num_buffers, (const void * const*)buffers, sizes);
 }
 
+static VAStatus
+handleVAEncMiscParameterTypeRateControl(vlVaContext *context, VAEncMiscParameterBuffer *misc)
+{
+   VAEncMiscParameterRateControl *rc = (VAEncMiscParameterRateControl *)misc->data;
+   if (context->desc.h264enc.rate_ctrl.rate_ctrl_method ==
+   PIPE_H264_ENC_RATE_CONTROL_METHOD_CONSTANT)
+  context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second;
+   else
+  context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second * rc->target_percentage;
+   context->desc.h264enc.rate_ctrl.peak_bitrate = rc->bits_per_second;
+   if (context->desc.h264enc.rate_ctrl.target_bitrate < 200)
+  context->desc.h264enc.rate_ctrl.vbv_buffer_size = MIN2((context->desc.h264enc.rate_ctrl.target_bitrate * 2.75), 200);
+   else
+  context->desc.h264enc.rate_ctrl.vbv_buffer_size = context->desc.h264enc.rate_ctrl.target_bitrate;
+
+   return VA_STATUS_SUCCESS;
+}
+
+static VAStatus
+handleVAEncSequenceParameterBufferType(vlVaDriver *drv, vlVaContext *context, 
vlVaBuffer *buf)
+{
+   VAEncSequenceParameterBufferH264 *h264 = (VAEncSequenceParameterBufferH264 
*)buf->data;
+   if (!context->decoder) {
+  context->templat.max_references = h264->max_num_ref_frames;
+  context->templat.level = h264->level_idc;
+  context->decoder = drv->pipe->create_video_codec(drv->pipe, 
&context->templat);
+  if (!context->decoder)
+ return VA_STATUS_ERROR_ALLOCATION_FAILED;
+   }
+   context->desc.h264enc.gop_size = h264->intra_idr_period;
+   return VA_STATUS_SUCCESS;
+}
+
+static VAStatus
+handleVAEncMiscParameterBufferType(vlVaContext *context, vlVaBuffer *buf)
+{
+   VAStatus vaStatus = VA_STATUS_SUCCESS;
+   VAEncMiscParameterBuffer *misc;
+   misc = buf->data;
+
+   switch (misc->type) {
+   case VAEncMiscParameterTypeRateControl:
+  vaStatus = handleVAEncMiscParameterTypeRateControl(context, misc);
+  break;
+
+   default:
+  break;
+   }
+
+   return vaStatus;
+}
+
+static VAStatus
+handleVAEncPictureParameterBufferType(vlVaDriver *drv, vlVaContext *context, 
vlVaBuffer *buf)
+{
+   VAEncPictureParameterBufferH264 *h264;
+   vlVaBuffer *coded_buf;
+
+   h264 = buf->data;
+   context->desc.h264enc.frame_num = h264->frame_num;
+   context->desc.h264enc.not_referenced = false;
+   context->desc.h264enc.is_idr = (h264->pic_fields.bits.idr_pic_flag == 1);
+   context->desc.h264enc.pic_order_cnt = h264->CurrPic.TopFieldOrderCnt / 2;
+   if (context->desc.h264enc.is_idr)
+  context->desc.h264enc.i_remain = 1;
+   else
+  context->desc.h264enc.i_remain = 0;
+
+   context->desc.h264enc.p_remain = context->desc.h264enc.gop_size - 
context->desc.h264enc.gop_cnt - context->desc.h264enc.i_remain;
+
+   coded_buf = handle_table_get(drv->htab, h264->coded_buf);
+   coded_buf->derived_surface.resource = pipe_buffer_create(drv->pipe->screen, 
PIPE_BIND_VERTEX_BUFFER,
+ PIPE_USAGE_STREAM, coded_buf->size);
+   context->coded_buf = coded_buf;
+
+   context->desc.h264enc.frame_idx[h264->CurrPic.pi

[Mesa-dev] [PATCH 3/3] anv/cmd_buffer: Simplify range member assignment

2016-07-13 Thread Nanley Chery
A ternary is clearer because the range member is assigned one of two values
dependent on one condition.

Signed-off-by: Nanley Chery 
---
 src/intel/vulkan/anv_cmd_buffer.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/intel/vulkan/anv_cmd_buffer.c 
b/src/intel/vulkan/anv_cmd_buffer.c
index 9ceb194..6256df8 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -646,11 +646,9 @@ void anv_CmdBindDescriptorSets(
 
unsigned array_size = set_layout->binding[b].array_size;
for (unsigned j = 0; j < array_size; j++) {
-  uint32_t range = 0;
-  if (desc->buffer_view)
- range = desc->buffer_view->range;
   push->dynamic[d].offset = *(offsets++);
-  push->dynamic[d].range = range;
+  push->dynamic[d].range = (desc->buffer_view) ?
+desc->buffer_view->range : 0;
   desc++;
   d++;
}
-- 
2.9.0



[Mesa-dev] [PATCH 0/3] Fix binding sparsely-defined descriptor sets and misc cleanups

2016-07-13 Thread Nanley Chery
The first patch of this series fixes a bug in the driver. The remaining two
make the related code easier to understand. 

Nanley Chery (3):
  anv/descriptor_set: Fix binding partly undefined descriptor sets
  anv/cmd_buffer: Remove unused variable
  anv/cmd_buffer: Simplify range member assignment

 src/intel/vulkan/anv_cmd_buffer.c | 9 +++--
 src/intel/vulkan/anv_descriptor_set.c | 9 +++--
 2 files changed, 10 insertions(+), 8 deletions(-)

-- 
2.9.0



[Mesa-dev] [PATCH 1/3] anv/descriptor_set: Fix binding partly undefined descriptor sets

2016-07-13 Thread Nanley Chery
Section 13.2.3. of the Vulkan spec requires that implementations be able to
bind sparsely-defined Descriptor Sets without any errors or exceptions.

When binding a descriptor set that contains a dynamic buffer binding/descriptor,
the driver attempts to dereference the descriptor's buffer_view field if it is
non-NULL. It currently segfaults on undefined descriptors as this field is never
zero-initialized. Zero undefined descriptors to avoid segfaulting. This
solution was suggested by Jason Ekstrand.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96850
Cc: 12.0 
Signed-off-by: Nanley Chery 
---
 src/intel/vulkan/anv_descriptor_set.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_descriptor_set.c 
b/src/intel/vulkan/anv_descriptor_set.c
index 448ae0e..f06d2e4 100644
--- a/src/intel/vulkan/anv_descriptor_set.c
+++ b/src/intel/vulkan/anv_descriptor_set.c
@@ -412,8 +412,8 @@ anv_descriptor_set_create(struct anv_device *device,
/* Go through and fill out immutable samplers if we have any */
struct anv_descriptor *desc = set->descriptors;
for (uint32_t b = 0; b < layout->binding_count; b++) {
-  if (layout->binding[b].immutable_samplers) {
- for (uint32_t i = 0; i < layout->binding[b].array_size; i++) {
+  for (uint32_t i = 0; i < layout->binding[b].array_size; i++) {
+ if (layout->binding[b].immutable_samplers) {
 /* The type will get changed to COMBINED_IMAGE_SAMPLER in
  * UpdateDescriptorSets if needed.  However, if the descriptor
  * set has an immutable sampler, UpdateDescriptorSets may never
@@ -423,6 +423,11 @@ anv_descriptor_set_create(struct anv_device *device,
.type = VK_DESCRIPTOR_TYPE_SAMPLER,
.sampler = layout->binding[b].immutable_samplers[i],
 };
+ } else {
+/* By defining the descriptors to be zero now, we can later verify that
+ * the descriptor has not been populated with user data.
+ */
+zero(desc[i]);
  }
   }
   desc += layout->binding[b].array_size;
-- 
2.9.0
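
The idea behind the fix can be sketched in isolation. This is a minimal illustration, not the driver code: the sketch_* types and functions below are hypothetical stand-ins for the anv structures, and memset() stands in for the zero() helper used in the patch. Once every descriptor is zeroed at creation time, an unpopulated slot has a NULL buffer_view, so the bind-time check is safe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the driver structures; not the real anv types. */
struct sketch_buffer_view {
   uint32_t range;
};

struct sketch_descriptor {
   int type;
   struct sketch_buffer_view *buffer_view;
};

/* Zero every descriptor at creation so unpopulated slots read as NULL. */
static void
sketch_descriptors_init(struct sketch_descriptor *desc, unsigned count)
{
   memset(desc, 0, count * sizeof(*desc));
}

/* Bind-time lookup: an undefined descriptor yields a range of 0 instead of
 * dereferencing an uninitialized pointer.
 */
static uint32_t
sketch_dynamic_range(const struct sketch_descriptor *desc)
{
   return desc->buffer_view ? desc->buffer_view->range : 0;
}
```

Without the init step, buffer_view would hold whatever bytes were in the allocation, and the NULL check in the lookup would offer no protection.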



[Mesa-dev] [PATCH 2/3] anv/cmd_buffer: Remove unused variable

2016-07-13 Thread Nanley Chery
This became unused due to commit 612e35b2c65c99773b73e53d0e6fd112b1a7431f .

Signed-off-by: Nanley Chery 
---
 src/intel/vulkan/anv_cmd_buffer.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_cmd_buffer.c 
b/src/intel/vulkan/anv_cmd_buffer.c
index 20d3af1..9ceb194 100644
--- a/src/intel/vulkan/anv_cmd_buffer.c
+++ b/src/intel/vulkan/anv_cmd_buffer.c
@@ -620,7 +620,6 @@ void anv_CmdBindDescriptorSets(
 
assert(firstSet + descriptorSetCount < MAX_SETS);
 
-   uint32_t dynamic_slot = 0;
for (uint32_t i = 0; i < descriptorSetCount; i++) {
   ANV_FROM_HANDLE(anv_descriptor_set, set, pDescriptorSets[i]);
   set_layout = layout->set[firstSet + i].layout;
@@ -638,7 +637,7 @@ void anv_CmdBindDescriptorSets(
cmd_buffer->state.push_constants[s];
 
 unsigned d = layout->set[firstSet + i].dynamic_offset_start;
-const uint32_t *offsets = pDynamicOffsets + dynamic_slot;
+const uint32_t *offsets = pDynamicOffsets;
 struct anv_descriptor *desc = set->descriptors;
 
 for (unsigned b = 0; b < set_layout->binding_count; b++) {
-- 
2.9.0



Re: [Mesa-dev] [PATCH] st/mesa: fix reference counting bug in st_vdpau

2016-07-13 Thread Leo Liu

I have tested this patch and confirmed that it fixes the leak.
Looking again, the leak actually happens in the
st_vdpau_output_surface_dma_buf case.


Sorry for the noise.

The patch is
Tested-and-Reviewed by: Leo Liu 

Regards,
Leo


On 07/13/2016 10:08 AM, Leo Liu wrote:



On 07/13/2016 08:56 AM, Christian König wrote:

From: Christian König 

Otherwise we leak the resources created for the DMA-buf descriptors.

Signed-off-by: Christian König 
Cc: 12.0 
---
  src/mesa/state_tracker/st_vdpau.c | 10 --
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_vdpau.c 
b/src/mesa/state_tracker/st_vdpau.c

index dffa52f..4f599dd 100644
--- a/src/mesa/state_tracker/st_vdpau.c
+++ b/src/mesa/state_tracker/st_vdpau.c
@@ -65,6 +65,7 @@ st_vdpau_video_surface_gallium(struct gl_context 
*ctx, const void *vdpSurface,

   struct pipe_video_buffer *buffer;
 struct pipe_sampler_view **samplers;
+   struct pipe_resource *res = NULL;
   getProcAddr = (void *)ctx->vdpGetProcAddress;
 if (getProcAddr(device, VDP_FUNC_ID_VIDEO_SURFACE_GALLIUM, 
(void**)&f))
@@ -82,7 +83,8 @@ st_vdpau_video_surface_gallium(struct gl_context 
*ctx, const void *vdpSurface,

 if (!sv)
return NULL;
  -   return sv->texture;
+   pipe_resource_reference(&res, sv->texture);
+   return res;
  }
static struct pipe_resource *
@@ -90,13 +92,15 @@ st_vdpau_output_surface_gallium(struct gl_context 
*ctx, const void *vdpSurface)

  {
 int (*getProcAddr)(uint32_t device, uint32_t id, void **ptr);
 uint32_t device = (uintptr_t)ctx->vdpDevice;
+   struct pipe_resource *res = NULL;
 VdpOutputSurfaceGallium *f;
   getProcAddr = (void *)ctx->vdpGetProcAddress;
 if (getProcAddr(device, VDP_FUNC_ID_OUTPUT_SURFACE_GALLIUM, 
(void**)&f))

return NULL;
  -   return f((uintptr_t)vdpSurface);
+   pipe_resource_reference(&res, f((uintptr_t)vdpSurface));
+   return res;
  }
static struct pipe_resource *
@@ -208,6 +212,7 @@ st_vdpau_map_surface(struct gl_context *ctx, 
GLenum target, GLenum access,

 /* do we have different screen objects ? */
 if (res->screen != st->pipe->screen) {
_mesa_error(ctx, GL_INVALID_OPERATION, "VDPAUMapSurfacesNV");
+  pipe_resource_reference(&res, NULL);
return;
 }
  @@ -241,6 +246,7 @@ st_vdpau_map_surface(struct gl_context *ctx, 
GLenum target, GLenum access,

 stObj->surface_format = res->format;
   _mesa_dirty_texobj(ctx, texObj);
+   pipe_resource_reference(&res, NULL);


Will this be a problem when the same surface is mapped again?
Also, will the leak happen in the st_vdpau_output_surface_dma_buf
case?


Thanks,
Leo


  }
static void
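
The ownership rule the patch relies on can be sketched as follows. This is a simplified illustration with hypothetical sketch_* names; it only mimics the counting behaviour of pipe_resource_reference() and leaves out destruction when the count reaches zero. The getter hands back a new reference, and the caller drops it when done.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical resource with a plain (non-atomic) reference count. */
struct sketch_resource {
   int refcount;
};

/* Point *dst at src: take a reference on src, drop the one held by *dst. */
static void
sketch_reference(struct sketch_resource **dst, struct sketch_resource *src)
{
   if (src)
      src->refcount++;
   if (*dst)
      (*dst)->refcount--;
   *dst = src;
}

/* Getter that returns a new reference owned by the caller. */
static struct sketch_resource *
sketch_get_surface(struct sketch_resource *owned_elsewhere)
{
   struct sketch_resource *res = NULL;

   sketch_reference(&res, owned_elsewhere);
   return res;
}
```

With every getter following the same rule, the caller can unconditionally release its reference on each exit path, which is what the added pipe_resource_reference(&res, NULL) calls do.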




[Mesa-dev] [PATCH 1/3] st/mesa: check for immutable texture in st_TestProxyTexImage()

2016-07-13 Thread Brian Paul
If it's immutable, we don't have to guess about the number of mipmap
levels.
---
 src/mesa/state_tracker/st_cb_texture.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_texture.c 
b/src/mesa/state_tracker/st_cb_texture.c
index 1474d97..b91a2ae 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -2716,8 +2716,13 @@ st_TestProxyTexImage(struct gl_context *ctx, GLenum 
target,
   &pt.width0, &pt.height0,
   &pt.depth0, &pt.array_size);
 
-  if (level == 0 && (texObj->Sampler.MinFilter == GL_LINEAR ||
- texObj->Sampler.MinFilter == GL_NEAREST)) {
+  if (texObj->Immutable) {
+ /* For immutable textures we know the final number of mip levels */
+ assert(texObj->NumLevels > 0);
+ pt.last_level = texObj->NumLevels - 1;
+  }
+  else if (level == 0 && (texObj->Sampler.MinFilter == GL_LINEAR ||
+  texObj->Sampler.MinFilter == GL_NEAREST)) {
  /* assume just one mipmap level */
  pt.last_level = 0;
   }
-- 
1.9.1



[Mesa-dev] [PATCH 2/3] mesa: use _mesa_clear_texture_image() in clear_texture_fields()

2016-07-13 Thread Brian Paul
This avoids a failed assert(img->_BaseFormat != -1) in
init_teximage_fields_ms() because the internalFormat argument is GL_NONE.
This was hit when using glTexStorage() to do a proxy texture test.

Fixes a failure with the updated Piglit tex3d-maxsize test.

Cc: 
---
 src/mesa/main/texstorage.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/mesa/main/texstorage.c b/src/mesa/main/texstorage.c
index f4a0760..72ed869 100644
--- a/src/mesa/main/texstorage.c
+++ b/src/mesa/main/texstorage.c
@@ -179,9 +179,7 @@ clear_texture_fields(struct gl_context *ctx,
 return;
 }
 
- _mesa_init_teximage_fields(ctx, texImage,
-0, 0, 0, 0, /* w, h, d, border */
-GL_NONE, MESA_FORMAT_NONE);
+ _mesa_clear_texture_image(ctx, texImage);
   }
}
 }
-- 
1.9.1



[Mesa-dev] [PATCH 3/3] mesa: fix issues with glTexStorage() and proxy textures

2016-07-13 Thread Brian Paul
When we call ctx->Driver.TestProxyTexImage() we want to have the
texture object's Immutable and NumLevels fields initialized so that
the function doesn't have to guess about the number of mipmap levels.

But since the proxy textures are shared by the old glTexImage() and
new glTexStorage(), we can't just set these fields and forget them.
We have to save/restore them.

This allows glTexStorage(GL_PROXY_TEXTURE_x) calls to produce a more
accurate result.

An alternative would be to change the ctx->Driver.TestProxyTexImage()
function to take extra immutable, numLevels arguments.
---
 src/mesa/main/texstorage.c | 41 +++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/texstorage.c b/src/mesa/main/texstorage.c
index 72ed869..ea23876 100644
--- a/src/mesa/main/texstorage.c
+++ b/src/mesa/main/texstorage.c
@@ -367,6 +367,42 @@ tex_storage_error_check(struct gl_context *ctx,
 
 
 /**
+ * Wrapper for ctx->Driver.TestProxyTexImage() which sets/restores
+ * immutable texture state.
+ * We do this so that the TestProxyTexImage() function can know that
+ * we're testing with an immutable texture and we know the number of
+ * mipmap levels.
+ */
+static GLboolean
+test_proxy_tex_image(struct gl_context *ctx,
+ struct gl_texture_object *texObj, GLenum target,
+ mesa_format texFormat, GLsizei levels,
+ GLsizei width, GLsizei height, GLsizei depth)
+{
+   GLboolean immutableSave;
+   GLuint numLevelsSave;
+   GLboolean sizeOK;
+
+   /* save current vals */
+   immutableSave = texObj->Immutable;
+   numLevelsSave = texObj->NumLevels;
+
+   /* set immutable fields */
+   texObj->Immutable = GL_TRUE;
+   texObj->NumLevels = levels;
+
+   sizeOK = ctx->Driver.TestProxyTexImage(ctx, target, 0, texFormat,
+  width, height, depth, 0);
+
+   /* restore */
+   texObj->Immutable = immutableSave;
+   texObj->NumLevels = numLevelsSave;
+
+   return sizeOK;
+}
+
+
+/**
  * Helper that does the storage allocation for _mesa_TexStorage1/2/3D()
  * and _mesa_TextureStorage1/2/3D().
  */
@@ -396,8 +432,8 @@ _mesa_texture_storage(struct gl_context *ctx, GLuint dims,
dimensionsOK = _mesa_legal_texture_dimensions(ctx, target, 0,
   width, height, depth, 0);
 
-   sizeOK = ctx->Driver.TestProxyTexImage(ctx, target, 0, texFormat,
-  width, height, depth, 0);
+   sizeOK = test_proxy_tex_image(ctx, texObj, target, texFormat, levels,
+ width, height, depth);
 
if (_mesa_is_proxy_texture(target)) {
   if (dimensionsOK && sizeOK) {
@@ -447,6 +483,7 @@ _mesa_texture_storage(struct gl_context *ctx, GLuint dims,
  return;
   }
 
+  /* This will set the gl_texture_object->Immutable flag to true */
   _mesa_set_texture_view_state(ctx, texObj, target, levels);
 
   update_fbo_texture(ctx, texObj);
-- 
1.9.1



[Mesa-dev] [PATCH 2/2] glsl/types: Use _mesa_hash_data for hashing function types

2016-07-13 Thread Jason Ekstrand
This is way better than the stupid string approach especially since you
could overflow the string.  Again, I thought I had something better at one
point but it obviously got lost.

Signed-off-by: Jason Ekstrand 
Cc: "12.0" 
---
 src/compiler/glsl_types.cpp | 16 ++--
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
index fa27135..e9b58dd 100644
--- a/src/compiler/glsl_types.cpp
+++ b/src/compiler/glsl_types.cpp
@@ -1097,20 +1097,8 @@ static uint32_t
 function_key_hash(const void *a)
 {
const glsl_type *const key = (glsl_type *) a;
-   char hash_key[128];
-   unsigned size = 0;
-
-   size = snprintf(hash_key, sizeof(hash_key), "%08x", key->length);
-
-   for (unsigned i = 0; i < key->length; i++) {
-  if (size >= sizeof(hash_key))
-break;
-
-  size += snprintf(& hash_key[size], sizeof(hash_key) - size,
-  "%p", (void *) key->fields.structure[i].type);
-   }
-
-   return _mesa_hash_string(hash_key);
+   return _mesa_hash_data(key->fields.parameters,
+  (key->length + 1) * sizeof(*key->fields.parameters));
 }
 
 const glsl_type *
-- 
2.5.0.400.gff86faf
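
For illustration, hashing the raw bytes of the parameter array might look like the sketch below. FNV-1a is used here only as a stand-in; this is not a claim about how _mesa_hash_data is implemented, and sketch_param is a hypothetical simplification of the real parameter struct. The point is that no fixed-size string buffer is involved, so there is nothing to overflow or truncate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a over an arbitrary byte range; a stand-in, not Mesa's helper. */
static uint32_t
sketch_hash_bytes(const void *data, size_t size)
{
   const uint8_t *bytes = data;
   uint32_t hash = 2166136261u;

   for (size_t i = 0; i < size; i++) {
      hash ^= bytes[i];
      hash *= 16777619u;
   }
   return hash;
}

/* Hypothetical parameter entry: just a pointer to a type. */
struct sketch_param {
   const void *type;
};

/* Hash length + 1 entries, mirroring the size expression in the patch. */
static uint32_t
sketch_function_key_hash(const struct sketch_param *params, unsigned length)
{
   return sketch_hash_bytes(params, (length + 1) * sizeof(*params));
}
```

Hashing raw bytes like this is only sound when the struct has no padding with indeterminate contents, which holds for the single-pointer struct assumed here.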



[Mesa-dev] [PATCH 1/2] glsl/types: Fix function type comparison function

2016-07-13 Thread Jason Ekstrand
It was returning true if the function types have different lengths rather
than false.  This was new with the SPIR-V to NIR pass and I thought I'd
fixed it a while ago but it may have gotten lost in rebasing somewhere.

Signed-off-by: Jason Ekstrand 
Cc: "12.0" 
---
 src/compiler/glsl_types.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
index 066a74e..fa27135 100644
--- a/src/compiler/glsl_types.cpp
+++ b/src/compiler/glsl_types.cpp
@@ -1086,7 +1086,7 @@ function_key_compare(const void *a, const void *b)
const glsl_type *const key2 = (glsl_type *) b;
 
if (key1->length != key2->length)
-  return 1;
+  return false;
 
return memcmp(key1->fields.parameters, key2->fields.parameters,
  (key1->length + 1) * sizeof(*key1->fields.parameters)) == 0;
-- 
2.5.0.400.gff86faf
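
A stand-alone sketch of the corrected contract, assuming a hash-table equality callback that must return true only for matching keys. The sketch_key struct and its fixed-size storage are hypothetical; the real key is a glsl_type.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical key: a parameter count plus fixed-size storage. */
struct sketch_key {
   unsigned length;        /* number of parameters, at most 3 here */
   const void *params[4];  /* length + 1 entries are compared */
};

/* Equality callback: true only when both keys describe the same type. */
static bool
sketch_key_equal(const struct sketch_key *a, const struct sketch_key *b)
{
   if (a->length != b->length)
      return false;   /* different lengths can never be equal */

   return memcmp(a->params, b->params,
                 (a->length + 1) * sizeof(a->params[0])) == 0;
}
```

Returning 1 in the length-mismatch branch, as the old code did, would tell the table that two function types with different parameter counts are the same entry.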



Re: [Mesa-dev] [PATCH 2/5] radeonsi: set dereferenceable attribute on descriptor arrays

2016-07-13 Thread Matt Arsenault

> On Jul 13, 2016, at 12:36, Marek Olšák  wrote:
> 
> On Wed, Jul 13, 2016 at 9:25 PM, Tom Stellard wrote:
>> On Wed, Jul 13, 2016 at 03:20:55PM -0400, Tom Stellard wrote:
>>> On Tue, Jul 12, 2016 at 10:52:35PM +0200, Marek Olšák wrote:
 From: Marek Olšák 
 
 This allows moving the loads arbitrarily in the Sinking pass.
 
 26002 shaders in 14643 tests
 Totals:
 SGPRS: 2080160 -> 2080160 (0.00 %)
 VGPRS: 798875 -> 797826 (-0.13 %)
 Spilled SGPRs: 108485 -> 79165 (-27.03 %)
 Spilled VGPRs: 327 -> 327 (0.00 %)
 Scratch VGPRs: 1656 -> 1652 (-0.24 %) dwords per thread
 Code Size: 36127192 -> 35559780 (-1.57 %) bytes
 LDS: 767 -> 767 (0.00 %) blocks
 Max Waves: 212464 -> 212672 (0.10 %)
 Wait states: 0 -> 0 (0.00 %)
 
 PERCENTAGES / App       Shaders     SGPRs     VGPRs SpillSGPR SpillVGPR   Scratch  CodeSize  MaxWaves     Waits
 (unknown)                     4         .         .         .         .         .         .         .         .
 0ad                           6         .         .         .         .         .         .         .         .
 alien_isolation            2938         .    0.04 %   -8.53 %         .         .   -0.71 %   -0.06 %         .
 anholt                       10         .         .         .         .         .         .         .         .
 batman_arkham_origins       589         .   -0.58 %  -79.54 %         .         .   -6.72 %    0.57 %         .
 bioshock-infinite          1769         .   -0.65 %  -89.32 %         .         .   -4.73 %    0.48 %         .
 borderlands2               3968         .   -0.31 %  -51.21 %         .         .   -4.09 %    0.22 %         .
 brutal-legend               338         .   -0.03 %   -2.95 %         .         .   -0.06 %         .         .
 civilization_beyond..       116         .         .  -14.17 %         .         .   -0.88 %         .         .
 counter_strike_glob..      1142         .         .         .         .         .         .         .         .
 dirt-showdown               541         .   -0.56 %  -40.14 %         .   -3.45 %   -1.82 %    0.35 %         .
 dolphin                      22         .         .         .         .         .    0.16 %         .         .
 dota2                      1747         .         .         .         .         .    0.01 %         .         .
 europa_universalis_4         76         .   -0.23 %  -42.11 %         .         .   -0.96 %         .         .
 f1-2015                     774         .   -0.09 %  -28.89 %         .         .   -2.60 %    0.09 %         .
 furmark-0.7.0                 4         .         .         .         .         .         .         .         .
 gimark-0.7.0                 10         .         .         .         .         .         .         .         .
 glamor                       16         .         .         .         .         .         .         .         .
 humus-celshading              4         .         .         .         .         .         .         .         .
 humus-domino                  6         .         .         .         .         .         .         .         .
 humus-dynamicbranching       24         .    0.71 %         .         .         .    0.29 %   -0.45 %         .
 humus-hdr                    10         .         .         .         .         .         .         .         .
 humus-portals                 2         .         .         .         .         .         .         .         .
 humus-volumetricfog..         6         .         .         .         .         .         .         .         .
 left_4_dead_2              1762         .         .         .         .         .         .         .         .
 metro_2033_redux           2670         .   -0.10 %   -7.15 %         .         .   -0.03 %         .         .
 nexuiz                       80         .         .         .         .         .         .         .         .
 pixmark-julia-fp32            2         .         .         .         .         .         .         .         .
 pixmark-julia-fp64            2         .         .         .         .         .         .         .         .
 pixmark-piano-0.7.0           2         .         .         .         .         .         .         .         .
 pixmark-volplosion-..         2         .         .         .         .         .         .         .         .
 plot3d-0.7.0                  8         .         .         .         .         .         .         .         .
 portal                      474         .         .         .         .         .         .         .         .
 sauerbraten                   7         .         .         .         .         .         .         .         .
 serious_sam_3_bfe           392         .         .  -13.20 %         .         .   -1.81 %         .         .

Re: [Mesa-dev] V3 On disk shader cache for i965 (Now with real world results!)

2016-07-13 Thread Grazvydas Ignotas
On Wed, Jul 13, 2016 at 2:56 AM, Timothy Arceri
 wrote:
> On Sat, 2016-07-09 at 20:21 +0300, Grazvydas Ignotas wrote:
>>
>> I think I still have some more:
>> - running 32bit program after 64bit version of the same thing (or vice
>> versa) leads to segfaults and assert hits. Probably easiest to build
>> 32bit and 64bit apitrace and play some traces.
>> - a trace from comment 2 of bug 96624 doesn't seem to render on cache.
>
> I can't seem to reproduce this. I created some traces using the 32-bit
> and 64-bit versions of The Talos Principle but they seem to run with
> no issues. I've pushed updates to the shader-cache branch; maybe I fixed
> something.

Still crashes here. What I do:
1. build 64bit and 32bit versions of apitrace
2. get the trace from bug 96425
3. rm -rf ~/.cache/mesa/
4. MESA_GLSL_CACHE_ENABLE=1 build64/glretrace -b Talos_flicker2.trace
5. MESA_GLSL_CACHE_ENABLE=1 build32/glretrace -b Talos_flicker2.trace

Program received signal SIGSEGV, Segmentation fault.
0xf751b8cb in read_atomic_buffers (prog=,
metadata=0xba34) at glsl/shader_cache.cpp:405
405         *stage_buff_list[j] = &prog->AtomicBuffers[i];
(gdb) bt
#0  0xf751b8cb in read_atomic_buffers (prog=,
metadata=0xba34) at glsl/shader_cache.cpp:405
#1  shader_cache_read_program_metadata (ctx=0xf6d18020,
prog=0x88ac610) at glsl/shader_cache.cpp:1283
#2  0xf748eeaf in link_shaders (ctx=0xf6d18020, prog=0x88ac610,
is_cache_fallback=false) at glsl/linker.cpp:4541
#3  0xf73d289c in _mesa_glsl_link_shader (ctx=0xf6d18020,
prog=0x88ac610, is_cache_fallback=false)
at program/ir_to_mesa.cpp:3073
#4  0xf72c682a in _mesa_link_program (ctx=0xf6d18020,
shProg=0x88ac610) at main/shaderapi.c:1099
#5  0xf72c7e47 in _mesa_LinkProgram (programObj=14) at main/shaderapi.c:1603
#6  0x08264000 in _get_glLinkProgram(unsigned int) ()
#7  0x0810ef17 in retrace_glLinkProgram(trace::Call&) ()

Gražvydas


Re: [Mesa-dev] [PATCH] gallium/u_queue: add optional cleanup callback

2016-07-13 Thread Marek Olšák
On Wed, Jul 13, 2016 at 6:28 PM, Rob Clark  wrote:
> Adds a second optional cleanup callback, called after the fence is
> signaled.  This is needed if, for example, the queue has the last
> reference to the object that embeds the util_queue_fence.  In this
> case we cannot drop the ref in the main callback, since that would
> result in the fence being destroyed before it is signaled.
>
> Signed-off-by: Rob Clark 
> ---
> Maybe adding util_queue_add_job2() is a bit overkill.. although I
> think Marek has some in-flight stuff using u_queue, so maybe this
> approach is less conflicty?

I have nothing in flight at the moment. My experimental patch doesn't count.

This looks good but I'd like to have only one util_queue_add_job function.

Marek
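
The ordering problem described in the quoted commit message can be sketched as below. All sketch_* names are hypothetical and the queue is reduced to a single-threaded runner; this is not the u_queue API. The cleanup callback is optional and runs only after the fence is signalled, so it is the safe place to drop the last reference to an object that embeds the fence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_fence {
   bool signalled;
};

typedef void (*sketch_job_func)(void *data);

struct sketch_job {
   void *data;
   struct sketch_fence *fence;
   sketch_job_func execute;
   sketch_job_func cleanup;   /* optional, may be NULL */
};

/* Run one job: execute, signal the fence, then clean up. */
static void
sketch_run_job(struct sketch_job *job)
{
   job->execute(job->data);
   job->fence->signalled = true;   /* fence memory must still be valid here */
   if (job->cleanup)
      job->cleanup(job->data);     /* only now may the last ref be dropped */
}

/* Hypothetical object that embeds its own fence. */
struct sketch_object {
   struct sketch_fence fence;
   int refcount;
   int work_done;
};

static void
sketch_execute(void *data)
{
   ((struct sketch_object *)data)->work_done = 1;
}

static void
sketch_cleanup(void *data)
{
   struct sketch_object *obj = data;

   assert(obj->fence.signalled);   /* the fence was signalled first */
   obj->refcount--;
}
```

A single add-job entry point that accepts a NULL cleanup pointer, as in sketch_job above, would be one way to satisfy the request for only one util_queue_add_job function.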


Re: [Mesa-dev] [PATCH] main: memcpy larger chunks in _mesa_propagate_uniforms_to_driver_storage

2016-07-13 Thread Kenneth Graunke
On Wednesday, July 13, 2016 1:53:26 PM PDT Nils Wallménius wrote:
> When possible, do the memcpy on larger blocks. This reduces cycles
> spent in _mesa_propagate_uniforms_to_driver_storage from
> 1.51 % to 0.62% according to perf during the Unigine Heaven benchmark.
> It did not affect the framerate of the benchmark. The system used for
> testing was an i5 6600K with a Radeon R9 380.
> 
> Piglit hangs randomly on this system both with and without the patch
> so I could not make a comparison.
> 
> Signed-off-by: Nils Wallménius 

Huh.  I didn't think any drivers used the driver_storage mechanism...
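
The patch itself is not quoted in this reply, so the following is only a generic sketch of the "memcpy larger chunks" idea under an assumed layout: the source is tightly packed and the destination has a stride. The function name and parameters are hypothetical. When the destination stride equals the element size, the whole range is contiguous and can be copied with one call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `count` elements of `elem_size` bytes from a packed source into a
 * destination whose elements are `dst_stride` bytes apart.
 */
static void
sketch_copy_elements(uint8_t *dst, size_t dst_stride,
                     const uint8_t *src, size_t elem_size, size_t count)
{
   if (dst_stride == elem_size) {
      /* Destination is tightly packed too: one large copy. */
      memcpy(dst, src, elem_size * count);
      return;
   }

   /* Otherwise fall back to one copy per element. */
   for (size_t i = 0; i < count; i++)
      memcpy(dst + i * dst_stride, src + i * elem_size, elem_size);
}
```

The saving comes from replacing many small memcpy calls with one, which matches the drop in cycles reported above without necessarily changing the frame rate.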




Re: [Mesa-dev] [PATCH] glsl: look for frag data bindings with [0] tacked onto the end for arrays

2016-07-13 Thread Ilia Mirkin
Thanks for confirming, Corentin.

Ian, do you have any opinions on this? Seems like a fairly innocuous
thing to be doing...

On Fri, Jul 8, 2016 at 3:39 PM, Corentin Wallez  wrote:
> Not sure how reviews work in Mesa, but this patch LGTM. I also tested that
> it fixes the relevant tests failures it is supposed to address.
>
> On Wed, Jul 6, 2016 at 7:40 PM, Ilia Mirkin  wrote:
>>
>> The GL spec is very unclear on this point. Apparently this is discussed
>> without resolution in the closed Khronos bugtracker at
>> https://cvs.khronos.org/bugzilla/show_bug.cgi?id=7829 . The
>> recommendation is to allow dropping the [0] for looking up the bindings.
>>
>> The approach taken in this patch is to instead tack on [0]'s for each
>> arrayness level of the output's type, and doing the lookup again. That
>> way, for
>>
>> out vec4 foo[2][2][2]
>>
>> we will end up looking for bindings for foo, foo[0], foo[0][0], and
>> foo[0][0][0], in that order of preference.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96765
>> Signed-off-by: Ilia Mirkin 
>> ---
>>  src/compiler/glsl/linker.cpp | 39 ---
>>  1 file changed, 28 insertions(+), 11 deletions(-)
>>
>> diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
>> index d963f54..9d54c2f 100644
>> --- a/src/compiler/glsl/linker.cpp
>> +++ b/src/compiler/glsl/linker.cpp
>> @@ -2566,6 +2566,7 @@ find_available_slots(unsigned used_mask, unsigned
>> needed_count)
>>  /**
>>   * Assign locations for either VS inputs or FS outputs
>>   *
>> + * \param mem_ctx   Temporary ralloc context used for linking
>>   * \param prog  Shader program whose variables need locations
>> assigned
>>   * \param constants Driver specific constant values for the program.
>>   * \param target_index  Selector for the program target to receive
>> location
>> @@ -2577,7 +2578,8 @@ find_available_slots(unsigned used_mask, unsigned
>> needed_count)
>>   * error is emitted to the shader link log and false is returned.
>>   */
>>  bool
>> -assign_attribute_or_color_locations(gl_shader_program *prog,
>> +assign_attribute_or_color_locations(void *mem_ctx,
>> +gl_shader_program *prog,
>>  struct gl_constants *constants,
>>  unsigned target_index)
>>  {
>> @@ -2680,16 +2682,31 @@
>> assign_attribute_or_color_locations(gl_shader_program *prog,
>>} else if (target_index == MESA_SHADER_FRAGMENT) {
>>  unsigned binding;
>>  unsigned index;
>> + const char *name = var->name;
>> + const glsl_type *type = var->type;
>> +
>> + while (type) {
>> +/* Check if there's a binding for the variable name */
>> +if (prog->FragDataBindings->get(binding, name)) {
>> +   assert(binding >= FRAG_RESULT_DATA0);
>> +   var->data.location = binding;
>> +   var->data.is_unmatched_generic_inout = 0;
>> +
>> +   if (prog->FragDataIndexBindings->get(index, name)) {
>> +  var->data.index = index;
>> +   }
>> +   break;
>> +}
>>
>> -if (prog->FragDataBindings->get(binding, var->name)) {
>> -   assert(binding >= FRAG_RESULT_DATA0);
>> -   var->data.location = binding;
>> -var->data.is_unmatched_generic_inout = 0;
>> +/* If not, but it's an array type, look for name[0] */
>> +if (type->is_array()) {
>> +   name = ralloc_asprintf(mem_ctx, "%s[0]", name);
>> +   type = type->fields.array;
>> +   continue;
>> +}
>>
>> -   if (prog->FragDataIndexBindings->get(index, var->name)) {
>> -  var->data.index = index;
>> -   }
>> -}
>> +break;
>> + }
>>}
>>
>>/* From GL4.5 core spec, section 15.2 (Shader Execution):
>> @@ -4816,12 +4833,12 @@ link_shaders(struct gl_context *ctx, struct
>> gl_shader_program *prog)
>>prev = i;
>> }
>>
>> -   if (!assign_attribute_or_color_locations(prog, &ctx->Const,
>> +   if (!assign_attribute_or_color_locations(mem_ctx, prog, &ctx->Const,
>>  MESA_SHADER_VERTEX)) {
>>goto done;
>> }
>>
>> -   if (!assign_attribute_or_color_locations(prog, &ctx->Const,
>> +   if (!assign_attribute_or_color_locations(mem_ctx, prog, &ctx->Const,
>>  MESA_SHADER_FRAGMENT)) {
>>goto done;
>> }
>> --
>> 2.7.3
>>
>
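
The lookup order described in the commit message (foo, foo[0], foo[0][0], ...) can be illustrated with a small stand-alone helper. This sketch uses a caller-supplied buffer and snprintf rather than ralloc_asprintf, and sketch_binding_name is a hypothetical name, not a function in the linker.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the candidate name for a given array depth, e.g. depth 2 gives
 * "foo[0][0]". Returns 0 on success, -1 if the buffer is too small.
 */
static int
sketch_binding_name(char *buf, size_t size, const char *name, unsigned depth)
{
   int len = snprintf(buf, size, "%s", name);

   if (len < 0 || (size_t)len >= size)
      return -1;

   for (unsigned i = 0; i < depth; i++) {
      /* Need room for "[0]" plus the terminating NUL. */
      if ((size_t)len + 3 >= size)
         return -1;
      memcpy(buf + len, "[0]", 4);
      len += 3;
   }
   return 0;
}
```

A caller would try depth 0 first and increase the depth once per array level of the output's type, stopping at the first name that has a binding.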


Re: [Mesa-dev] [PATCH 2/5] radeonsi: set dereferenceable attribute on descriptor arrays

2016-07-13 Thread Marek Olšák
On Wed, Jul 13, 2016 at 9:25 PM, Tom Stellard  wrote:
> On Wed, Jul 13, 2016 at 03:20:55PM -0400, Tom Stellard wrote:
>> On Tue, Jul 12, 2016 at 10:52:35PM +0200, Marek Olšák wrote:
>> > From: Marek Olšák 
>> >
>> > This allows moving the loads arbitrarily in the Sinking pass.
>> >
>> > 26002 shaders in 14643 tests
>> > Totals:
>> > SGPRS: 2080160 -> 2080160 (0.00 %)
>> > VGPRS: 798875 -> 797826 (-0.13 %)
>> > Spilled SGPRs: 108485 -> 79165 (-27.03 %)
>> > Spilled VGPRs: 327 -> 327 (0.00 %)
>> > Scratch VGPRs: 1656 -> 1652 (-0.24 %) dwords per thread
>> > Code Size: 36127192 -> 35559780 (-1.57 %) bytes
>> > LDS: 767 -> 767 (0.00 %) blocks
>> > Max Waves: 212464 -> 212672 (0.10 %)
>> > Wait states: 0 -> 0 (0.00 %)
>> >
>> >  PERCENTAGES / App       Shaders     SGPRs     VGPRs SpillSGPR SpillVGPR   Scratch  CodeSize  MaxWaves     Waits
>> >  (unknown)                     4         .         .         .         .         .         .         .         .
>> >  0ad                           6         .         .         .         .         .         .         .         .
>> >  alien_isolation            2938         .    0.04 %   -8.53 %         .         .   -0.71 %   -0.06 %         .
>> >  anholt                       10         .         .         .         .         .         .         .         .
>> >  batman_arkham_origins       589         .   -0.58 %  -79.54 %         .         .   -6.72 %    0.57 %         .
>> >  bioshock-infinite          1769         .   -0.65 %  -89.32 %         .         .   -4.73 %    0.48 %         .
>> >  borderlands2               3968         .   -0.31 %  -51.21 %         .         .   -4.09 %    0.22 %         .
>> >  brutal-legend               338         .   -0.03 %   -2.95 %         .         .   -0.06 %         .         .
>> >  civilization_beyond..       116         .         .  -14.17 %         .         .   -0.88 %         .         .
>> >  counter_strike_glob..      1142         .         .         .         .         .         .         .         .
>> >  dirt-showdown               541         .   -0.56 %  -40.14 %         .   -3.45 %   -1.82 %    0.35 %         .
>> >  dolphin                      22         .         .         .         .         .    0.16 %         .         .
>> >  dota2                      1747         .         .         .         .         .    0.01 %         .         .
>> >  europa_universalis_4         76         .   -0.23 %  -42.11 %         .         .   -0.96 %         .         .
>> >  f1-2015                     774         .   -0.09 %  -28.89 %         .         .   -2.60 %    0.09 %         .
>> >  furmark-0.7.0                 4         .         .         .         .         .         .         .         .
>> >  gimark-0.7.0                 10         .         .         .         .         .         .         .         .
>> >  glamor                       16         .         .         .         .         .         .         .         .
>> >  humus-celshading              4         .         .         .         .         .         .         .         .
>> >  humus-domino                  6         .         .         .         .         .         .         .         .
>> >  humus-dynamicbranching       24         .    0.71 %         .         .         .    0.29 %   -0.45 %         .
>> >  humus-hdr                    10         .         .         .         .         .         .         .         .
>> >  humus-portals                 2         .         .         .         .         .         .         .         .
>> >  humus-volumetricfog..         6         .         .         .         .         .         .         .         .
>> >  left_4_dead_2              1762         .         .         .         .         .         .         .         .
>> >  metro_2033_redux           2670         .   -0.10 %   -7.15 %         .         .   -0.03 %         .         .
>> >  nexuiz                       80         .         .         .         .         .         .         .         .
>> >  pixmark-julia-fp32            2         .         .         .         .         .         .         .         .
>> >  pixmark-julia-fp64            2         .         .         .         .         .         .         .         .
>> >  pixmark-piano-0.7.0           2         .         .         .         .         .         .         .         .
>> >  pixmark-volplosion-..         2         .         .         .         .         .         .         .         .
>> >  plot3d-0.7.0                  8         .         .         .         .         .         .         .         .
>> >  portal                      474         .         .         .         .         .         .         .         .
>> >  sauerbraten                   7         .         .         .         .         .         .         .         .
>> >  serious_sam_3_bfe           392         .         .  -13.20 %         .         .   -1.81 %         .         .
>> >  supertuxkart   4 . .  

Re: [Mesa-dev] [PATCH 2/5] radeonsi: set dereferenceable attribute on descriptor arrays

2016-07-13 Thread Tom Stellard
On Wed, Jul 13, 2016 at 03:20:55PM -0400, Tom Stellard wrote:
> On Tue, Jul 12, 2016 at 10:52:35PM +0200, Marek Olšák wrote:
> > From: Marek Olšák 
> > 
> > This allows moving the loads arbitrarily in the Sinking pass.
> > 
> > 26002 shaders in 14643 tests
> > Totals:
> > SGPRS: 2080160 -> 2080160 (0.00 %)
> > VGPRS: 798875 -> 797826 (-0.13 %)
> > Spilled SGPRs: 108485 -> 79165 (-27.03 %)
> > Spilled VGPRs: 327 -> 327 (0.00 %)
> > Scratch VGPRs: 1656 -> 1652 (-0.24 %) dwords per thread
> > Code Size: 36127192 -> 35559780 (-1.57 %) bytes
> > LDS: 767 -> 767 (0.00 %) blocks
> > Max Waves: 212464 -> 212672 (0.10 %)
> > Wait states: 0 -> 0 (0.00 %)
> > 
> >  PERCENTAGES / App       Shaders     SGPRs     VGPRs SpillSGPR SpillVGPR   Scratch  CodeSize  MaxWaves     Waits
> >  (unknown)                     4         .         .         .         .         .         .         .         .
> >  0ad                           6         .         .         .         .         .         .         .         .
> >  alien_isolation            2938         .    0.04 %   -8.53 %         .         .   -0.71 %   -0.06 %         .
> >  anholt                       10         .         .         .         .         .         .         .         .
> >  batman_arkham_origins       589         .   -0.58 %  -79.54 %         .         .   -6.72 %    0.57 %         .
> >  bioshock-infinite          1769         .   -0.65 %  -89.32 %         .         .   -4.73 %    0.48 %         .
> >  borderlands2               3968         .   -0.31 %  -51.21 %         .         .   -4.09 %    0.22 %         .
> >  brutal-legend               338         .   -0.03 %   -2.95 %         .         .   -0.06 %         .         .
> >  civilization_beyond..       116         .         .  -14.17 %         .         .   -0.88 %         .         .
> >  counter_strike_glob..      1142         .         .         .         .         .         .         .         .
> >  dirt-showdown               541         .   -0.56 %  -40.14 %         .   -3.45 %   -1.82 %    0.35 %         .
> >  dolphin                      22         .         .         .         .         .    0.16 %         .         .
> >  dota2                      1747         .         .         .         .         .    0.01 %         .         .
> >  europa_universalis_4         76         .   -0.23 %  -42.11 %         .         .   -0.96 %         .         .
> >  f1-2015                     774         .   -0.09 %  -28.89 %         .         .   -2.60 %    0.09 %         .
> >  furmark-0.7.0                 4         .         .         .         .         .         .         .         .
> >  gimark-0.7.0                 10         .         .         .         .         .         .         .         .
> >  glamor                       16         .         .         .         .         .         .         .         .
> >  humus-celshading              4         .         .         .         .         .         .         .         .
> >  humus-domino                  6         .         .         .         .         .         .         .         .
> >  humus-dynamicbranching       24         .    0.71 %         .         .         .    0.29 %   -0.45 %         .
> >  humus-hdr                    10         .         .         .         .         .         .         .         .
> >  humus-portals                 2         .         .         .         .         .         .         .         .
> >  humus-volumetricfog..         6         .         .         .         .         .         .         .         .
> >  left_4_dead_2              1762         .         .         .         .         .         .         .         .
> >  metro_2033_redux           2670         .   -0.10 %   -7.15 %         .         .   -0.03 %         .         .
> >  nexuiz                       80         .         .         .         .         .         .         .         .
> >  pixmark-julia-fp32            2         .         .         .         .         .         .         .         .
> >  pixmark-julia-fp64            2         .         .         .         .         .         .         .         .
> >  pixmark-piano-0.7.0           2         .         .         .         .         .         .         .         .
> >  pixmark-volplosion-..         2         .         .         .         .         .         .         .         .
> >  plot3d-0.7.0                  8         .         .         .         .         .         .         .         .
> >  portal                      474         .         .         .         .         .         .         .         .
> >  sauerbraten                   7         .         .         .         .         .         .         .         .
> >  serious_sam_3_bfe           392         .         .  -13.20 %         .         .   -1.81 %         .         .
> >  supertuxkart                  4         .         .         .         .         .         .         .         .
> >  talos_principle             324         .   -0.21 %  -18.39 %         .         .
> 

Re: [Mesa-dev] [PATCH 3/5] radeonsi: replace !tbaa with !invariant.load

2016-07-13 Thread Tom Stellard
On Tue, Jul 12, 2016 at 10:52:36PM +0200, Marek Olšák wrote:
> From: Marek Olšák 
> 

Reviewed-by: Tom Stellard 
> no change in generated code thanks to dereferenceable(n)
> ---
>  src/gallium/drivers/radeonsi/si_shader.c | 17 +
>  1 file changed, 5 insertions(+), 12 deletions(-)
> 
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
> index b23c7c6..ee63b95 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -101,10 +101,9 @@ struct si_shader_context
>  
>   LLVMTargetMachineRef tm;
>  
> + unsigned invariant_load_md_kind;
>   unsigned range_md_kind;
> - unsigned tbaa_md_kind;
>   unsigned uniform_md_kind;
> - LLVMValueRef tbaa_const_md;
>   LLVMValueRef empty_md;
>  
>   LLVMValueRef const_buffers[SI_NUM_CONST_BUFFERS];
> @@ -418,7 +417,7 @@ static LLVMValueRef build_indexed_load_const(
>   LLVMValueRef base_ptr, LLVMValueRef index)
>  {
>   LLVMValueRef result = build_indexed_load(ctx, base_ptr, index, true);
> - LLVMSetMetadata(result, ctx->tbaa_md_kind, ctx->tbaa_const_md);
> + LLVMSetMetadata(result, ctx->invariant_load_md_kind, ctx->empty_md);
>   return result;
>  }
>  
> @@ -5315,7 +5314,7 @@ static void si_create_function(struct si_shader_context *ctx,
>   /* The combination of:
>* - ByVal
>* - dereferenceable
> -  * - tbaa
> +  * - invariant.load
>* allows the optimization passes to move loads and reduces
>* SGPR spilling significantly.
>*/
> @@ -5346,21 +5345,15 @@ static void si_create_function(struct si_shader_context *ctx,
>  static void create_meta_data(struct si_shader_context *ctx)
>  {
>   struct gallivm_state *gallivm = ctx->radeon_bld.soa.bld_base.base.gallivm;
> - LLVMValueRef tbaa_const[3];
>  
> + ctx->invariant_load_md_kind = LLVMGetMDKindIDInContext(gallivm->context,
> +                                                        "invariant.load", 14);
>   ctx->range_md_kind = LLVMGetMDKindIDInContext(gallivm->context,
>"range", 5);
> - ctx->tbaa_md_kind = LLVMGetMDKindIDInContext(gallivm->context,
> -  "tbaa", 4);
>   ctx->uniform_md_kind = LLVMGetMDKindIDInContext(gallivm->context,
>   "amdgpu.uniform", 14);
>  
>   ctx->empty_md = LLVMMDNodeInContext(gallivm->context, NULL, 0);
> -
> - tbaa_const[0] = LLVMMDStringInContext(gallivm->context, "const", 5);
> - tbaa_const[1] = 0;
> - tbaa_const[2] = lp_build_const_int32(gallivm, 1);
> - ctx->tbaa_const_md = LLVMMDNodeInContext(gallivm->context, tbaa_const, 3);
>  }
>  
>  static void declare_streamout_params(struct si_shader_context *ctx,
> -- 
> 2.7.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
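
A side note on the metadata lookups in the patch above: each call passes the kind name together with a hand-counted length ("invariant.load", 14; "range", 5; "amdgpu.uniform", 14). The sketch below shows one way such a pair could be derived from the string literal so the two cannot drift apart. MD_NAME and checked_name_length are hypothetical names invented for this illustration; they are not part of Mesa or LLVM, and the stand-in function only mimics the (name, length) parameter shape so that the sketch builds without LLVM.

```c
#include <string.h>

/* Hypothetical helper (not Mesa or LLVM API): expands a string literal
 * into a "pointer, length" argument pair. sizeof counts the trailing
 * NUL, hence the "- 1". */
#define MD_NAME(lit) (lit), ((unsigned)(sizeof(lit) - 1))

/* Stand-in for a function that takes a name and its length as separate
 * arguments. It returns the length when it matches strlen(name) and 0
 * otherwise, so a miscounted length is easy to spot. */
static unsigned checked_name_length(const char *name, unsigned len)
{
    return strlen(name) == len ? len : 0;
}
```

With this, checked_name_length(MD_NAME("invariant.load")) yields the same 14 that the patch spells out by hand.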


Re: [Mesa-dev] [PATCH 2/5] radeonsi: set dereferenceable attribute on descriptor arrays

2016-07-13 Thread Tom Stellard
On Tue, Jul 12, 2016 at 10:52:35PM +0200, Marek Olšák wrote:
> From: Marek Olšák 
> 
> This allows moving the loads arbitrarily in the Sinking pass.
> 
> 26002 shaders in 14643 tests
> Totals:
> SGPRS: 2080160 -> 2080160 (0.00 %)
> VGPRS: 798875 -> 797826 (-0.13 %)
> Spilled SGPRs: 108485 -> 79165 (-27.03 %)
> Spilled VGPRs: 327 -> 327 (0.00 %)
> Scratch VGPRs: 1656 -> 1652 (-0.24 %) dwords per thread
> Code Size: 36127192 -> 35559780 (-1.57 %) bytes
> LDS: 767 -> 767 (0.00 %) blocks
> Max Waves: 212464 -> 212672 (0.10 %)
> Wait states: 0 -> 0 (0.00 %)
> 
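
For reference, the percentages in the totals above are the relative change from the old value to the new one. A minimal sketch of that arithmetic follows, assuming "0 -> 0" is meant to read as 0.00 %; percent_change and format_total are illustrative names chosen here, not taken from the shader-db scripts.

```c
#include <stdio.h>

/* Relative change from `before` to `after`, in percent.
 * Returns 0.0 when `before` is 0 so that a "0 -> 0" row reads as
 * 0.00 % (an assumption made for this sketch). */
static double percent_change(double before, double after)
{
    if (before == 0.0)
        return 0.0;
    return (after - before) / before * 100.0;
}

/* Format one totals line in the style of the report above,
 * e.g. "Spilled SGPRs: 108485 -> 79165 (-27.03 %)". */
static int format_total(char *buf, size_t len, const char *name,
                        long before, long after)
{
    return snprintf(buf, len, "%s: %ld -> %ld (%.2f %%)",
                    name, before, after,
                    percent_change((double)before, (double)after));
}
```

For the spilled SGPRs this gives (79165 - 108485) / 108485 * 100, which rounds to the -27.03 % quoted in the totals.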
>  PERCENTAGES / App       Shaders     SGPRs     VGPRs SpillSGPR SpillVGPR   Scratch  CodeSize  MaxWaves     Waits
>  (unknown)                     4         .         .         .         .         .         .         .         .
>  0ad                           6         .         .         .         .         .         .         .         .
>  alien_isolation            2938         .    0.04 %   -8.53 %         .         .   -0.71 %   -0.06 %         .
>  anholt                       10         .         .         .         .         .         .         .         .
>  batman_arkham_origins       589         .   -0.58 %  -79.54 %         .         .   -6.72 %    0.57 %         .
>  bioshock-infinite          1769         .   -0.65 %  -89.32 %         .         .   -4.73 %    0.48 %         .
>  borderlands2               3968         .   -0.31 %  -51.21 %         .         .   -4.09 %    0.22 %         .
>  brutal-legend               338         .   -0.03 %   -2.95 %         .         .   -0.06 %         .         .
>  civilization_beyond..       116         .         .  -14.17 %         .         .   -0.88 %         .         .
>  counter_strike_glob..      1142         .         .         .         .         .         .         .         .
>  dirt-showdown               541         .   -0.56 %  -40.14 %         .   -3.45 %   -1.82 %    0.35 %         .
>  dolphin                      22         .         .         .         .         .    0.16 %         .         .
>  dota2                      1747         .         .         .         .         .    0.01 %         .         .
>  europa_universalis_4         76         .   -0.23 %  -42.11 %         .         .   -0.96 %         .         .
>  f1-2015                     774         .   -0.09 %  -28.89 %         .         .   -2.60 %    0.09 %         .
>  furmark-0.7.0                 4         .         .         .         .         .         .         .         .
>  gimark-0.7.0                 10         .         .         .         .         .         .         .         .
>  glamor                       16         .         .         .         .         .         .         .         .
>  humus-celshading              4         .         .         .         .         .         .         .         .
>  humus-domino                  6         .         .         .         .         .         .         .         .
>  humus-dynamicbranching       24         .    0.71 %         .         .         .    0.29 %   -0.45 %         .
>  humus-hdr                    10         .         .         .         .         .         .         .         .
>  humus-portals                 2         .         .         .         .         .         .         .         .
>  humus-volumetricfog..         6         .         .         .         .         .         .         .         .
>  left_4_dead_2              1762         .         .         .         .         .         .         .         .
>  metro_2033_redux           2670         .   -0.10 %   -7.15 %         .         .   -0.03 %         .         .
>  nexuiz                       80         .         .         .         .         .         .         .         .
>  pixmark-julia-fp32            2         .         .         .         .         .         .         .         .
>  pixmark-julia-fp64            2         .         .         .         .         .         .         .         .
>  pixmark-piano-0.7.0           2         .         .         .         .         .         .         .         .
>  pixmark-volplosion-..         2         .         .         .         .         .         .         .         .
>  plot3d-0.7.0                  8         .         .         .         .         .         .         .         .
>  portal                      474         .         .         .         .         .         .         .         .
>  sauerbraten                   7         .         .         .         .         .         .         .         .
>  serious_sam_3_bfe           392         .         .  -13.20 %         .         .   -1.81 %         .         .
>  supertuxkart                  4         .         .         .         .         .         .         .         .
>  talos_principle             324         .   -0.21 %  -18.39 %         .         .   -2.73 %    0.14 %         .
>  team_fortress_2             808         .         .         .         .         .         .         .         .
>  tesseract                   430         .    0.08 %  -68.57 %         .         .   -0.45 %         .
