Re: [Mesa-dev] [PATCH 1/2] intel/icl: Disable combining of vertices from separate instances

2018-10-29 Thread Kenneth Graunke
On Monday, October 29, 2018 5:38:57 AM PDT Topi Pohjolainen wrote:
> This is new hardware feature, and should be disabled until it is
> clear our VS kernels are prepared for it. Thread payload has new
> bit (See Bspec: Pipeline Stages - 3D Pipeline Geometry -
> Vertex Shader (VS) Stage - Payloads - SIMD8 Payload [BDW+])
> that vertex shaders could consult.
> 
> CC: Jason Ekstrand 
> CC: Kenneth Graunke 
> CC: Anuj Phogat 
> Signed-off-by: Topi Pohjolainen 
> ---
>  src/intel/blorp/blorp_genX_exec.h | 6 ++
>  src/intel/vulkan/genX_pipeline.c  | 6 ++
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 6 ++
>  3 files changed, 18 insertions(+)
> 
> diff --git a/src/intel/blorp/blorp_genX_exec.h 
> b/src/intel/blorp/blorp_genX_exec.h
> index 50341ab0ec..10865b9c15 100644
> --- a/src/intel/blorp/blorp_genX_exec.h
> +++ b/src/intel/blorp/blorp_genX_exec.h
> @@ -629,6 +629,12 @@ blorp_emit_vs_config(struct blorp_batch *batch,
>  #if GEN_GEN >= 8
>   vs.SIMD8DispatchEnable =
>  vs_prog_data->base.dispatch_mode == DISPATCH_MODE_SIMD8;
> +#endif
> +#if GEN_GEN >= 11
> + /* TODO: Disable combining of instances until it is clear VS kernels
> +  * are prepared for it.
> +  */
> + vs.SIMD8SingleInstanceDispatchEnable = vs.SIMD8DispatchEnable;
>  #endif
>}
> }
> diff --git a/src/intel/vulkan/genX_pipeline.c 
> b/src/intel/vulkan/genX_pipeline.c
> index 33f1f7832a..9762fc78b5 100644
> --- a/src/intel/vulkan/genX_pipeline.c
> +++ b/src/intel/vulkan/genX_pipeline.c
> @@ -1157,6 +1157,12 @@ emit_3dstate_vs(struct anv_pipeline *pipeline)
>vs.SIMD8DispatchEnable  =
>   vs_prog_data->base.dispatch_mode == DISPATCH_MODE_SIMD8;
>  #endif
> +#if GEN_GEN >= 11
> +  /* TODO: Disable combining of instances until it is clear VS kernels
> +   * are prepared for it.
> +   */
> +  vs.SIMD8SingleInstanceDispatchEnable = vs.SIMD8DispatchEnable;
> +#endif
>  
>assert(!vs_prog_data->base.base.use_alt_mode);
>  #if GEN_GEN < 11
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index 740cb0c4d2..9198a2953a 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -2277,6 +2277,12 @@ genX(upload_vs_state)(struct brw_context *brw)
>  
>vs.UserClipDistanceCullTestEnableBitmask =
>   vue_prog_data->cull_distance_mask;
> +#endif
> +#if GEN_GEN >= 11
> +  /* TODO: Disable combining of instances until it is clear VS kernels
> +   * are prepared for it.
> +   */
> +  vs.SIMD8SingleInstanceDispatchEnable = vs.SIMD8DispatchEnable;
>  #endif
> }
>  
> 

Hey Topi,

I don't foresee any problems with dispatching multiple instances in
a single thread, today.

Ian and I tried to come up with some theoretical cases where you'd
hit problems, and we couldn't come up with much.  Our thinking was
that since instanced rendering is spec'd as a loop, if a shader has
side effects, it might be able to observe this.  But, I don't think
any ordering guarantees exist anyway.  In particular, the HW might
already dispatch threads for instance N and instance N+1, and have
them execute in parallel, even if they're separate threads.

It turns out there's a much more obvious reason.  If you have a
uniform array,

   uniform vec4 foo[...];

and read based on gl_InstanceID:

   foo[gl_InstanceID]

then, because of the thread-dispatch condition, you know that
gl_InstanceID is actually dynamically uniform, so you can do a
single load instead of loading all 8 channels.  More efficient.

The thread combining breaks that optimization.  So, you'd have
to set this bit to turn it off.

We ought to do that optimization!  But, we don't today, so we
should be OK to leave the Icelake feature enabled.

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] intel/icl: Disable combining of vertices from separate instances

2018-10-29 Thread Topi Pohjolainen
This is new hardware feature, and should be disabled until it is
clear our VS kernels are prepared for it. Thread payload has new
bit (See Bspec: Pipeline Stages - 3D Pipeline Geometry -
Vertex Shader (VS) Stage - Payloads - SIMD8 Payload [BDW+])
that vertex shaders could consult.

CC: Jason Ekstrand 
CC: Kenneth Graunke 
CC: Anuj Phogat 
Signed-off-by: Topi Pohjolainen 
---
 src/intel/blorp/blorp_genX_exec.h | 6 ++
 src/intel/vulkan/genX_pipeline.c  | 6 ++
 src/mesa/drivers/dri/i965/genX_state_upload.c | 6 ++
 3 files changed, 18 insertions(+)

diff --git a/src/intel/blorp/blorp_genX_exec.h 
b/src/intel/blorp/blorp_genX_exec.h
index 50341ab0ec..10865b9c15 100644
--- a/src/intel/blorp/blorp_genX_exec.h
+++ b/src/intel/blorp/blorp_genX_exec.h
@@ -629,6 +629,12 @@ blorp_emit_vs_config(struct blorp_batch *batch,
 #if GEN_GEN >= 8
  vs.SIMD8DispatchEnable =
 vs_prog_data->base.dispatch_mode == DISPATCH_MODE_SIMD8;
+#endif
+#if GEN_GEN >= 11
+ /* TODO: Disable combining of instances until it is clear VS kernels
+  * are prepared for it.
+  */
+ vs.SIMD8SingleInstanceDispatchEnable = vs.SIMD8DispatchEnable;
 #endif
   }
}
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 33f1f7832a..9762fc78b5 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -1157,6 +1157,12 @@ emit_3dstate_vs(struct anv_pipeline *pipeline)
   vs.SIMD8DispatchEnable  =
  vs_prog_data->base.dispatch_mode == DISPATCH_MODE_SIMD8;
 #endif
+#if GEN_GEN >= 11
+  /* TODO: Disable combining of instances until it is clear VS kernels
+   * are prepared for it.
+   */
+  vs.SIMD8SingleInstanceDispatchEnable = vs.SIMD8DispatchEnable;
+#endif
 
   assert(!vs_prog_data->base.base.use_alt_mode);
 #if GEN_GEN < 11
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 740cb0c4d2..9198a2953a 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -2277,6 +2277,12 @@ genX(upload_vs_state)(struct brw_context *brw)
 
   vs.UserClipDistanceCullTestEnableBitmask =
  vue_prog_data->cull_distance_mask;
+#endif
+#if GEN_GEN >= 11
+  /* TODO: Disable combining of instances until it is clear VS kernels
+   * are prepared for it.
+   */
+  vs.SIMD8SingleInstanceDispatchEnable = vs.SIMD8DispatchEnable;
 #endif
}
 
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev