Re: [Mesa-dev] [PATCH] panfrost: Refactor blend descriptors

2019-05-05 Thread Alyssa Rosenzweig
> No, it doesn't. The high (64 - 24) bits have to be exactly the same.
> So if your 16 MB allocation is not aligned, you could wind up with two
> shaders crossing that boundary by sheer bad luck and then things go
> boom. ARM's kernel driver dealt with it by aligning all executable
> memory allocations to 2^24, but the upstream driver doesn't have a
> solution yet.

I see, thank you. I'll make a mental note of a TODO for that.

> I could test it, but most of the freedreno tests that exercise blend
> shaders wouldn't work because you never implemented capturing the
> memory for each submission separately. Have you done that since?

I haven't gotten around to working on the pantrace/wrap infrastructure
yet, no. That said, a trivial way to trigger a blend shader is to use
divergent factors. In even just test-triangle-smoothed/cube/etc, this
should trigger a blend shader (at least on Midg):

glEnable(GL_BLEND);
glBlendFunc(GL_SRC_COLOR, GL_DST_COLOR)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] panfrost: Refactor blend descriptors

2019-05-05 Thread Connor Abbott
On Sun, May 5, 2019 at 1:27 AM Alyssa Rosenzweig  wrote:
>
> > The blend shader enable bit is already described in the comments in
> > the header; the blend shader is enabled when unk2 == 0.
>
> I'm pretty sure that comment was from you, but thank you ;)
>
> > (the blend shader has
> > to be allocated within the same 2^24 byte range as the main shader for
> > it to work properly anyways, even on Midgard, which is probably not
> > implemented properly on mainline).
>
> Indeed. Mainline Midgard blend shaders work (well, stubbed out so they
> just do passthrough without any real blending, but the hardware is
> correct). That said, we cap shader memory at 16MB upfront, which
> "resolves" this problem.

No, it doesn't. The high (64 - 24) bits have to be exactly the same.
So if your 16 MB allocation is not aligned, you could wind up with two
shaders crossing that boundary by sheer bad luck and then things go
boom. ARM's kernel driver dealt with it by aligning all executable
memory allocations to 2^24, but the upstream driver doesn't have a
solution yet.

>
> > Maybe it would be better if these functions got passed the
> > mali_shader_descriptor itself?
>
> Possibly. I don't have access to any Bifrost machines right now, so I
> can't test that.

I could test it, but most of the freedreno tests that exercise blend
shaders wouldn't work because you never implemented capturing the
memory for each submission separately. Have you done that since?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] panfrost: Refactor blend descriptors

2019-05-04 Thread Alyssa Rosenzweig
> The blend shader enable bit is already described in the comments in
> the header; the blend shader is enabled when unk2 == 0.

I'm pretty sure that comment was from you, but thank you ;)

> (the blend shader has
> to be allocated within the same 2^24 byte range as the main shader for
> it to work properly anyways, even on Midgard, which is probably not
> implemented properly on mainline).

Indeed. Mainline Midgard blend shaders work (well, stubbed out so they
just do passthrough without any real blending, but the hardware is
correct). That said, we cap shader memory at 16MB upfront, which
"resolves" this problem.

> Maybe it would be better if these functions got passed the
> mali_shader_descriptor itself?

Possibly. I don't have access to any Bifrost machines right now, so I
can't test that.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] panfrost: Refactor blend descriptors

2019-05-04 Thread Connor Abbott
On Sun, May 5, 2019 at 12:14 AM Alyssa Rosenzweig  wrote:
>
> This commit does a fairly large cleanup of blend descriptors, although
> there should not be any functional changes. In particular, we split
> apart the Midgard and Bifrost blend descriptors, since they are
> radically different. From there, we can identify that the Midgard
> descriptor as previously written was really two render targets'
> descriptors stuck together. From this observation, we split the Midgard
> descriptor into what a single RT actually needs. This enables us to
> correctly dump blending configuration for MRT samples on Midgard. It
> also allows the Midgard and Bifrost blend code to peacefully coexist,
> with runtime selection rather than a #ifdef. So, as a bonus, this will
> help the future Bifrost effort, eliminating one major source of
> compile-time architectural divergence.
>
> Signed-off-by: Alyssa Rosenzweig 
> ---
>  .../drivers/panfrost/include/panfrost-job.h   |  56 ---
>  src/gallium/drivers/panfrost/pan_context.c|  31 ++--
>  .../drivers/panfrost/pandecode/decode.c   | 155 +-
>  3 files changed, 122 insertions(+), 120 deletions(-)
>
> diff --git a/src/gallium/drivers/panfrost/include/panfrost-job.h 
> b/src/gallium/drivers/panfrost/include/panfrost-job.h
> index c2d922678b8..71ac054f7c3 100644
> --- a/src/gallium/drivers/panfrost/include/panfrost-job.h
> +++ b/src/gallium/drivers/panfrost/include/panfrost-job.h
> @@ -415,25 +415,37 @@ enum mali_format {
>  #define MALI_READS_ZS (1 << 12)
>  #define MALI_READS_TILEBUFFER (1 << 16)
>
> -struct mali_blend_meta {
> -#ifndef BIFROST
> -/* Base value of 0x200.
> +/* The raw Midgard blend payload can either be an equation or a shader
> + * address, depending on the context */
> +
> +union midgard_blend {
> +mali_ptr shader;
> +struct mali_blend_equation equation;
> +};
> +
> +/* On MRT Midgard systems (using an MFBD), each render target gets its own
> + * blend descriptor */
> +
> +struct midgard_blend_rt {
> +/* Flags base value of 0x200 to enable the render target.
>   * OR with 0x1 for blending (anything other than REPLACE).
> - * OR with 0x2 for programmable blending
> + * OR with 0x2 for programmable blending with 0-2 registers
> + * OR with 0x3 for programmable blending with 2+ registers
>   */
>
> -u64 unk1;
> +u64 flags;
> +union midgard_blend blend;
> +} __attribute__((packed));
>
> -union {
> -struct mali_blend_equation blend_equation_1;
> -mali_ptr blend_shader;
> -};
> +/* On Bifrost systems (all MRT), each render target gets one of these
> + * descriptors */
> +
> +struct bifrost_blend_rt {
> +/* This is likely an analogue of the flags on
> + * midgard_blend_rt */
>
> -u64 zero2;
> -struct mali_blend_equation blend_equation_2;
> -#else
>  u32 unk1; // = 0x200
> -struct mali_blend_equation blend_equation;
> +struct mali_blend_equation equation;
>  /*
>   * - 0x19 normally
>   * - 0x3 when this slot is unused (everything else is 0 except the 
> index)
> @@ -479,11 +491,13 @@ struct mali_blend_meta {
>  * in the same pool as the original shader. The kernel will
>  * make sure this allocation is aligned to 2^24 bytes.
>  */
> -   u32 blend_shader;
> +   u32 shader;
> };
> -#endif
>  } __attribute__((packed));
>
> +/* Descriptor for the shader. Following this is at least one, up to four 
> blend
> + * descriptors for each active render target */
> +
>  struct mali_shader_meta {
>  mali_ptr shader;
>  u16 texture_count;
> @@ -584,17 +598,7 @@ struct mali_shader_meta {
>   * MALI_HAS_BLEND_SHADER to decide how to interpret.
>   */
>
> -union {
> -mali_ptr blend_shader;
> -struct mali_blend_equation blend_equation;
> -};
> -
> -/* There can be up to 4 blend_meta's. None of them are required for
> - * vertex shaders or the non-MRT case for Midgard (so the blob 
> doesn't
> - * allocate any space).
> - */
> -struct mali_blend_meta blend_meta[];
> -
> +union midgard_blend blend;
>  } __attribute__((packed));
>
>  /* This only concerns hardware jobs */
> diff --git a/src/gallium/drivers/panfrost/pan_context.c 
> b/src/gallium/drivers/panfrost/pan_context.c
> index 17b5b75db92..cab7c89ac8b 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -1000,7 +1000,7 @@ panfrost_emit_for_draw(struct panfrost_context *ctx, 
> bool with_vertex_data)
>   * maybe both are read...?) */
>
>  if (ctx->blend->has_blend_shader) {
> -ctx->fragment_shader_core.blend_shader = 
> ctx->blend->blend_shader;
> + 

[Mesa-dev] [PATCH] panfrost: Refactor blend descriptors

2019-05-04 Thread Alyssa Rosenzweig
This commit does a fairly large cleanup of blend descriptors, although
there should not be any functional changes. In particular, we split
apart the Midgard and Bifrost blend descriptors, since they are
radically different. From there, we can identify that the Midgard
descriptor as previously written was really two render targets'
descriptors stuck together. From this observation, we split the Midgard
descriptor into what a single RT actually needs. This enables us to
correctly dump blending configuration for MRT samples on Midgard. It
also allows the Midgard and Bifrost blend code to peacefully coexist,
with runtime selection rather than a #ifdef. So, as a bonus, this will
help the future Bifrost effort, eliminating one major source of
compile-time architectural divergence.

Signed-off-by: Alyssa Rosenzweig 
---
 .../drivers/panfrost/include/panfrost-job.h   |  56 ---
 src/gallium/drivers/panfrost/pan_context.c|  31 ++--
 .../drivers/panfrost/pandecode/decode.c   | 155 +-
 3 files changed, 122 insertions(+), 120 deletions(-)

diff --git a/src/gallium/drivers/panfrost/include/panfrost-job.h 
b/src/gallium/drivers/panfrost/include/panfrost-job.h
index c2d922678b8..71ac054f7c3 100644
--- a/src/gallium/drivers/panfrost/include/panfrost-job.h
+++ b/src/gallium/drivers/panfrost/include/panfrost-job.h
@@ -415,25 +415,37 @@ enum mali_format {
 #define MALI_READS_ZS (1 << 12)
 #define MALI_READS_TILEBUFFER (1 << 16)
 
-struct mali_blend_meta {
-#ifndef BIFROST
-/* Base value of 0x200.
+/* The raw Midgard blend payload can either be an equation or a shader
+ * address, depending on the context */
+
+union midgard_blend {
+mali_ptr shader;
+struct mali_blend_equation equation;
+};
+
+/* On MRT Midgard systems (using an MFBD), each render target gets its own
+ * blend descriptor */
+
+struct midgard_blend_rt {
+/* Flags base value of 0x200 to enable the render target.
  * OR with 0x1 for blending (anything other than REPLACE).
- * OR with 0x2 for programmable blending
+ * OR with 0x2 for programmable blending with 0-2 registers
+ * OR with 0x3 for programmable blending with 2+ registers
  */
 
-u64 unk1;
+u64 flags;
+union midgard_blend blend;
+} __attribute__((packed));
 
-union {
-struct mali_blend_equation blend_equation_1;
-mali_ptr blend_shader;
-};
+/* On Bifrost systems (all MRT), each render target gets one of these
+ * descriptors */
+
+struct bifrost_blend_rt {
+/* This is likely an analogue of the flags on
+ * midgard_blend_rt */
 
-u64 zero2;
-struct mali_blend_equation blend_equation_2;
-#else
 u32 unk1; // = 0x200
-struct mali_blend_equation blend_equation;
+struct mali_blend_equation equation;
 /*
  * - 0x19 normally
  * - 0x3 when this slot is unused (everything else is 0 except the 
index)
@@ -479,11 +491,13 @@ struct mali_blend_meta {
 * in the same pool as the original shader. The kernel will
 * make sure this allocation is aligned to 2^24 bytes.
 */
-   u32 blend_shader;
+   u32 shader;
};
-#endif
 } __attribute__((packed));
 
+/* Descriptor for the shader. Following this is at least one, up to four blend
+ * descriptors for each active render target */
+
 struct mali_shader_meta {
 mali_ptr shader;
 u16 texture_count;
@@ -584,17 +598,7 @@ struct mali_shader_meta {
  * MALI_HAS_BLEND_SHADER to decide how to interpret.
  */
 
-union {
-mali_ptr blend_shader;
-struct mali_blend_equation blend_equation;
-};
-
-/* There can be up to 4 blend_meta's. None of them are required for
- * vertex shaders or the non-MRT case for Midgard (so the blob doesn't
- * allocate any space).
- */
-struct mali_blend_meta blend_meta[];
-
+union midgard_blend blend;
 } __attribute__((packed));
 
 /* This only concerns hardware jobs */
diff --git a/src/gallium/drivers/panfrost/pan_context.c 
b/src/gallium/drivers/panfrost/pan_context.c
index 17b5b75db92..cab7c89ac8b 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -1000,7 +1000,7 @@ panfrost_emit_for_draw(struct panfrost_context *ctx, bool 
with_vertex_data)
  * maybe both are read...?) */
 
 if (ctx->blend->has_blend_shader) {
-ctx->fragment_shader_core.blend_shader = 
ctx->blend->blend_shader;
+ctx->fragment_shader_core.blend.shader = 
ctx->blend->blend_shader;
 }
 
 if (ctx->require_sfbd) {
@@ -1010,7 +1010,7 @@ panfrost_emit_for_draw(struct panfrost_context *ctx, bool 
with_vertex_data)
  * modes (so we're able to read back