Re: [Mesa-dev] [HACK] i965/fs: Fix ordering of src0 alpha and oMask in the framebuffer write payload.

2015-07-10 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Jul 9, 2015 7:57 AM, Francisco Jerez curroje...@riseup.net wrote:

 We were passing src0 alpha and oMask in reverse order.  There seems to
 be no good way to pass them in the correct order to the new-style
 LOAD_PAYLOAD (how surprising) because src0 alpha is per-channel while
 oMask is not.  Just split src0 alpha in fixed-width registers and pass
 them to LOAD_PAYLOAD as if they were part of the header as work-around
 for now.

 Bah... I came across this when I did the LOAD_PAYLOAD rework but thought it
 was only theoretical.  I wasn't very familiar with what omask actually did
 and, since piglit didn't hit it, I wasn't sure if it was a real problem or
 not.  I probably should have done more digging and written a piglit test at
 the time. My bad.

 One solution that I proposed at the time was to turn header_size into
 header_mask in the obvious way. We can still use 8 bits because we should
 never have a header source higher than 8.


So your idea is to have one bit per source indicating whether it's
header-like or per-channel?  I don't think that extends to instructions
other than LOAD_PAYLOAD (e.g. FB_WRITE) where the same source is at the
same time header and payload.  One bit per 32B register would extend
easily but it would be rather ugly to deal with if you want to keep your
code SIMD width-invariant.

I think if you go with the per-source flag you'll want it to be in its
own subclass of fs_inst.  With its own subclass you could even have an
array of per-source sizes determining the number of registers read for
each source, which would be rather nice for the visitor (no need to
split vectors into components while passing them to LOAD_PAYLOAD).

Still I think the most elegant solution would be to simply get rid of
the header/payload distinction by using force_writemask_all and, if it
proves to be necessary, fix the optimizer to get rid of redundant
force_writemask_all flags where it doesn't do it already.

 Thoughts?
 --Jason

 I've written a piglit test that demonstrates the problem by using
 gl_SampleMask from a fragment shader with multiple color outputs [1].

 [1] http://lists.freedesktop.org/archives/piglit/2015-July/016499.html
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26
 +-
  1 file changed, 17 insertions(+), 9 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 94d6a58..304ae74 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -1535,6 +1535,19 @@ fs_visitor::emit_single_fb_write(const fs_builder
 bld,
length++;
 }

 +   if (src0_alpha.file != BAD_FILE  color0.file != BAD_FILE) {
 +  /* Neat, we need to chop the src0 alpha component and pass it as
 part of
 +   * the header even though it has per-channel semantics, because
 the next
 +   * optional field is header-like and LOAD_PAYLOAD requires all such
 +   * fields to form a contiguous segment at the beginning of the
 message.
 +   */
 +  for (unsigned i = 0; i  exec_size / 8; i++) {
 + setup_color_payload(sources[length], src0_alpha, 1, 8,
 + use_2nd_half || i == 1);
 + length++;
 +  }
 +   }
 +
 prog_data-uses_omask =
prog-OutputsWritten  BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK);
 if (prog_data-uses_omask) {
 @@ -1561,19 +1574,14 @@ fs_visitor::emit_single_fb_write(const fs_builder
 bld,
   offset(this-outputs[0], bld, 3),
   1, exec_size, false);
length += 4;
 -   } else if (color1.file == BAD_FILE) {
 -  if (src0_alpha.file != BAD_FILE) {
 - setup_color_payload(sources[length], src0_alpha, 1, exec_size,
 false);
 - length++;
 -  }
 -
 -  setup_color_payload(sources[length], color0, components,
 -  exec_size, use_2nd_half);
 -  length += 4;
 } else {
setup_color_payload(sources[length], color0, components,
exec_size, use_2nd_half);
length += 4;
 +
 +   }
 +
 +   if (color1.file != BAD_FILE) {
setup_color_payload(sources[length], color1, components,
exec_size, use_2nd_half);
length += 4;
 --
 2.4.3



signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [HACK] i965/fs: Fix ordering of src0 alpha and oMask in the framebuffer write payload.

2015-07-10 Thread Jason Ekstrand
On Fri, Jul 10, 2015 at 5:25 AM, Francisco Jerez curroje...@riseup.net wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Jul 9, 2015 7:57 AM, Francisco Jerez curroje...@riseup.net wrote:

 We were passing src0 alpha and oMask in reverse order.  There seems to
 be no good way to pass them in the correct order to the new-style
 LOAD_PAYLOAD (how surprising) because src0 alpha is per-channel while
 oMask is not.  Just split src0 alpha in fixed-width registers and pass
 them to LOAD_PAYLOAD as if they were part of the header as work-around
 for now.

 Bah... I came across this when I did the LOAD_PAYLOAD rework but thought it
 was only theoretical.  I wasn't very familiar with what omask actually did
 and, since piglit didn't hit it, I wasn't sure if it was a real problem or
 not.  I probably should have done more digging and written a piglit test at
 the time. My bad.

 One solution that I proposed at the time was to turn header_size into
 header_mask in the obvious way. We can still use 8 bits because we should
 never have a header source higher than 8.


 So your idea is to have one bit per source indicating whether it's
 header-like or per-channel?  I don't think that extends to instructions
 other than LOAD_PAYLOAD (e.g. FB_WRITE) where the same source is at the
 same time header and payload.

You're right, it doesn't.  We really shouldn't be conflating them.  We
should have header_mask and header_present be different fields.  Maybe
use a union to save space, but they should have different semantic
meaning and different names.  We should probably also have a
compr4_mask and get rid of the hackery there.

 One bit per 32B register would extend
 easily but it would be rather ugly to deal with if you want to keep your
 code SIMD width-invariant.

 I think if you go with the per-source flag you'll want it to be in its
 own subclass of fs_inst.  With its own subclass you could even have an
 array of per-source sizes determining the number of registers read for
 each source, which would be rather nice for the visitor (no need to
 split vectors into components while passing them to LOAD_PAYLOAD).

 Still I think the most elegant solution would be to simply get rid of
 the header/payload distinction by using force_writemask_all and, if it
 proves to be necessary, fix the optimizer to get rid of redundant
 force_writemask_all flags where it doesn't do it already.

I really don't think that's a good long-term or short-term solution.

How badly are you blocking on this?   I don't really have a lot of
extra time to work on this at the moment but can carve some out if
needed.
--jason

 Thoughts?
 --Jason

 I've written a piglit test that demonstrates the problem by using
 gl_SampleMask from a fragment shader with multiple color outputs [1].

 [1] http://lists.freedesktop.org/archives/piglit/2015-July/016499.html
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26
 +-
  1 file changed, 17 insertions(+), 9 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 94d6a58..304ae74 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -1535,6 +1535,19 @@ fs_visitor::emit_single_fb_write(const fs_builder
 bld,
length++;
 }

 +   if (src0_alpha.file != BAD_FILE  color0.file != BAD_FILE) {
 +  /* Neat, we need to chop the src0 alpha component and pass it as
 part of
 +   * the header even though it has per-channel semantics, because
 the next
 +   * optional field is header-like and LOAD_PAYLOAD requires all such
 +   * fields to form a contiguous segment at the beginning of the
 message.
 +   */
 +  for (unsigned i = 0; i  exec_size / 8; i++) {
 + setup_color_payload(sources[length], src0_alpha, 1, 8,
 + use_2nd_half || i == 1);
 + length++;
 +  }
 +   }
 +
 prog_data-uses_omask =
prog-OutputsWritten  BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK);
 if (prog_data-uses_omask) {
 @@ -1561,19 +1574,14 @@ fs_visitor::emit_single_fb_write(const fs_builder
 bld,
   offset(this-outputs[0], bld, 3),
   1, exec_size, false);
length += 4;
 -   } else if (color1.file == BAD_FILE) {
 -  if (src0_alpha.file != BAD_FILE) {
 - setup_color_payload(sources[length], src0_alpha, 1, exec_size,
 false);
 - length++;
 -  }
 -
 -  setup_color_payload(sources[length], color0, components,
 -  exec_size, use_2nd_half);
 -  length += 4;
 } else {
setup_color_payload(sources[length], color0, components,
exec_size, use_2nd_half);
length += 4;
 +
 +   }
 +
 +   if (color1.file != BAD_FILE) {
setup_color_payload(sources[length], color1, components,
exec_size, use_2nd_half);
length += 4;
 --

Re: [Mesa-dev] [HACK] i965/fs: Fix ordering of src0 alpha and oMask in the framebuffer write payload.

2015-07-10 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Fri, Jul 10, 2015 at 5:25 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Jul 9, 2015 7:57 AM, Francisco Jerez curroje...@riseup.net wrote:

 We were passing src0 alpha and oMask in reverse order.  There seems to
 be no good way to pass them in the correct order to the new-style
 LOAD_PAYLOAD (how surprising) because src0 alpha is per-channel while
 oMask is not.  Just split src0 alpha in fixed-width registers and pass
 them to LOAD_PAYLOAD as if they were part of the header as work-around
 for now.

 Bah... I came across this when I did the LOAD_PAYLOAD rework but thought it
 was only theoretical.  I wasn't very familiar with what omask actually did
 and, since piglit didn't hit it, I wasn't sure if it was a real problem or
 not.  I probably should have done more digging and written a piglit test at
 the time. My bad.

 One solution that I proposed at the time was to turn header_size into
 header_mask in the obvious way. We can still use 8 bits because we should
 never have a header source higher than 8.


 So your idea is to have one bit per source indicating whether it's
 header-like or per-channel?  I don't think that extends to instructions
 other than LOAD_PAYLOAD (e.g. FB_WRITE) where the same source is at the
 same time header and payload.

 You're right, it doesn't.  We really shouldn't be conflating them.  We
 should have header_mask and header_present be different fields.  Maybe
 use a union to save space, but they should have different semantic
 meaning and different names.  We should probably also have a
 compr4_mask and get rid of the hackery there.

 One bit per 32B register would extend
 easily but it would be rather ugly to deal with if you want to keep your
 code SIMD width-invariant.

 I think if you go with the per-source flag you'll want it to be in its
 own subclass of fs_inst.  With its own subclass you could even have an
 array of per-source sizes determining the number of registers read for
 each source, which would be rather nice for the visitor (no need to
 split vectors into components while passing them to LOAD_PAYLOAD).

 Still I think the most elegant solution would be to simply get rid of
 the header/payload distinction by using force_writemask_all and, if it
 proves to be necessary, fix the optimizer to get rid of redundant
 force_writemask_all flags where it doesn't do it already.

 I really don't think that's a good long-term or short-term solution.

 How badly are you blocking on this?   I don't really have a lot of
 extra time to work on this at the moment but can carve some out if
 needed.

I'm not blocking on this at all, feel free to fix it however you like,
or just go with this hack for the moment if you have higher priority
stuff to work on right now, I honestly don't care.

 --jason

 Thoughts?
 --Jason

 I've written a piglit test that demonstrates the problem by using
 gl_SampleMask from a fragment shader with multiple color outputs [1].

 [1] http://lists.freedesktop.org/archives/piglit/2015-July/016499.html
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26
 +-
  1 file changed, 17 insertions(+), 9 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 94d6a58..304ae74 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -1535,6 +1535,19 @@ fs_visitor::emit_single_fb_write(const fs_builder
 bld,
length++;
 }

 +   if (src0_alpha.file != BAD_FILE  color0.file != BAD_FILE) {
 +  /* Neat, we need to chop the src0 alpha component and pass it as
 part of
 +   * the header even though it has per-channel semantics, because
 the next
 +   * optional field is header-like and LOAD_PAYLOAD requires all such
 +   * fields to form a contiguous segment at the beginning of the
 message.
 +   */
 +  for (unsigned i = 0; i  exec_size / 8; i++) {
 + setup_color_payload(sources[length], src0_alpha, 1, 8,
 + use_2nd_half || i == 1);
 + length++;
 +  }
 +   }
 +
 prog_data-uses_omask =
prog-OutputsWritten  BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK);
 if (prog_data-uses_omask) {
 @@ -1561,19 +1574,14 @@ fs_visitor::emit_single_fb_write(const fs_builder
 bld,
   offset(this-outputs[0], bld, 3),
   1, exec_size, false);
length += 4;
 -   } else if (color1.file == BAD_FILE) {
 -  if (src0_alpha.file != BAD_FILE) {
 - setup_color_payload(sources[length], src0_alpha, 1, exec_size,
 false);
 - length++;
 -  }
 -
 -  setup_color_payload(sources[length], color0, components,
 -  exec_size, use_2nd_half);
 -  length += 4;
 } else {
setup_color_payload(sources[length], color0, components,
 

Re: [Mesa-dev] [HACK] i965/fs: Fix ordering of src0 alpha and oMask in the framebuffer write payload.

2015-07-10 Thread Jason Ekstrand
On Fri, Jul 10, 2015 at 9:53 AM, Francisco Jerez curroje...@riseup.net wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Fri, Jul 10, 2015 at 5:25 AM, Francisco Jerez curroje...@riseup.net 
 wrote:
 Jason Ekstrand ja...@jlekstrand.net writes:

 On Jul 9, 2015 7:57 AM, Francisco Jerez curroje...@riseup.net wrote:

 We were passing src0 alpha and oMask in reverse order.  There seems to
 be no good way to pass them in the correct order to the new-style
 LOAD_PAYLOAD (how surprising) because src0 alpha is per-channel while
 oMask is not.  Just split src0 alpha in fixed-width registers and pass
 them to LOAD_PAYLOAD as if they were part of the header as work-around
 for now.

 Bah... I came across this when I did the LOAD_PAYLOAD rework but thought it
 was only theoretical.  I wasn't very familiar with what omask actually did
 and, since piglit didn't hit it, I wasn't sure if it was a real problem or
 not.  I probably should have done more digging and written a piglit test at
 the time. My bad.

 One solution that I proposed at the time was to turn header_size into
 header_mask in the obvious way. We can still use 8 bits because we should
 never have a header source higher than 8.


 So your idea is to have one bit per source indicating whether it's
 header-like or per-channel?  I don't think that extends to instructions
 other than LOAD_PAYLOAD (e.g. FB_WRITE) where the same source is at the
 same time header and payload.

 You're right, it doesn't.  We really shouldn't be conflating them.  We
 should have header_mask and header_present be different fields.  Maybe
 use a union to save space, but they should have different semantic
 meaning and different names.  We should probably also have a
 compr4_mask and get rid of the hackery there.

 One bit per 32B register would extend
 easily but it would be rather ugly to deal with if you want to keep your
 code SIMD width-invariant.

 I think if you go with the per-source flag you'll want it to be in its
 own subclass of fs_inst.  With its own subclass you could even have an
 array of per-source sizes determining the number of registers read for
 each source, which would be rather nice for the visitor (no need to
 split vectors into components while passing them to LOAD_PAYLOAD).

 Still I think the most elegant solution would be to simply get rid of
 the header/payload distinction by using force_writemask_all and, if it
 proves to be necessary, fix the optimizer to get rid of redundant
 force_writemask_all flags where it doesn't do it already.

 I really don't think that's a good long-term or short-term solution.

 How badly are you blocking on this?   I don't really have a lot of
 extra time to work on this at the moment but can carve some out if
 needed.

 I'm not blocking on this at all, feel free to fix it however you like,
 or just go with this hack for the moment if you have higher priority
 stuff to work on right now, I honestly don't care.

That's good to hear.  I'll try and take a look at this in a couple of
weeks.  Thanks for bringing it up and writing the piglit test!
--Jason

 --jason

 Thoughts?
 --Jason

 I've written a piglit test that demonstrates the problem by using
 gl_SampleMask from a fragment shader with multiple color outputs [1].

 [1] http://lists.freedesktop.org/archives/piglit/2015-July/016499.html
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26
 +-
  1 file changed, 17 insertions(+), 9 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 94d6a58..304ae74 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -1535,6 +1535,19 @@ fs_visitor::emit_single_fb_write(const fs_builder
 bld,
length++;
 }

 +   if (src0_alpha.file != BAD_FILE  color0.file != BAD_FILE) {
 +  /* Neat, we need to chop the src0 alpha component and pass it as
 part of
 +   * the header even though it has per-channel semantics, because
 the next
 +   * optional field is header-like and LOAD_PAYLOAD requires all such
 +   * fields to form a contiguous segment at the beginning of the
 message.
 +   */
 +  for (unsigned i = 0; i  exec_size / 8; i++) {
 + setup_color_payload(sources[length], src0_alpha, 1, 8,
 + use_2nd_half || i == 1);
 + length++;
 +  }
 +   }
 +
 prog_data-uses_omask =
prog-OutputsWritten  BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK);
 if (prog_data-uses_omask) {
 @@ -1561,19 +1574,14 @@ fs_visitor::emit_single_fb_write(const fs_builder
 bld,
   offset(this-outputs[0], bld, 3),
   1, exec_size, false);
length += 4;
 -   } else if (color1.file == BAD_FILE) {
 -  if (src0_alpha.file != BAD_FILE) {
 - setup_color_payload(sources[length], src0_alpha, 1, exec_size,
 false);
 - length++;
 -  }
 -
 -  

[Mesa-dev] [HACK] i965/fs: Fix ordering of src0 alpha and oMask in the framebuffer write payload.

2015-07-09 Thread Francisco Jerez
We were passing src0 alpha and oMask in reverse order.  There seems to
be no good way to pass them in the correct order to the new-style
LOAD_PAYLOAD (how surprising) because src0 alpha is per-channel while
oMask is not.  Just split src0 alpha in fixed-width registers and pass
them to LOAD_PAYLOAD as if they were part of the header as work-around
for now.

I've written a piglit test that demonstrates the problem by using
gl_SampleMask from a fragment shader with multiple color outputs [1].

[1] http://lists.freedesktop.org/archives/piglit/2015-July/016499.html
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 94d6a58..304ae74 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1535,6 +1535,19 @@ fs_visitor::emit_single_fb_write(const fs_builder bld,
   length++;
}
 
+   if (src0_alpha.file != BAD_FILE  color0.file != BAD_FILE) {
+  /* Neat, we need to chop the src0 alpha component and pass it as part of
+   * the header even though it has per-channel semantics, because the next
+   * optional field is header-like and LOAD_PAYLOAD requires all such
+   * fields to form a contiguous segment at the beginning of the message.
+   */
+  for (unsigned i = 0; i  exec_size / 8; i++) {
+ setup_color_payload(sources[length], src0_alpha, 1, 8,
+ use_2nd_half || i == 1);
+ length++;
+  }
+   }
+
prog_data-uses_omask =
   prog-OutputsWritten  BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK);
if (prog_data-uses_omask) {
@@ -1561,19 +1574,14 @@ fs_visitor::emit_single_fb_write(const fs_builder bld,
  offset(this-outputs[0], bld, 3),
  1, exec_size, false);
   length += 4;
-   } else if (color1.file == BAD_FILE) {
-  if (src0_alpha.file != BAD_FILE) {
- setup_color_payload(sources[length], src0_alpha, 1, exec_size, 
false);
- length++;
-  }
-
-  setup_color_payload(sources[length], color0, components,
-  exec_size, use_2nd_half);
-  length += 4;
} else {
   setup_color_payload(sources[length], color0, components,
   exec_size, use_2nd_half);
   length += 4;
+
+   }
+
+   if (color1.file != BAD_FILE) {
   setup_color_payload(sources[length], color1, components,
   exec_size, use_2nd_half);
   length += 4;
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [HACK] i965/fs: Fix ordering of src0 alpha and oMask in the framebuffer write payload.

2015-07-09 Thread Jason Ekstrand
On Jul 9, 2015 7:57 AM, Francisco Jerez curroje...@riseup.net wrote:

 We were passing src0 alpha and oMask in reverse order.  There seems to
 be no good way to pass them in the correct order to the new-style
 LOAD_PAYLOAD (how surprising) because src0 alpha is per-channel while
 oMask is not.  Just split src0 alpha in fixed-width registers and pass
 them to LOAD_PAYLOAD as if they were part of the header as work-around
 for now.

Bah... I came across this when I did the LOAD_PAYLOAD rework but thought it
was only theoretical.  I wasn't very familiar with what omask actually did
and, since piglit didn't hit it, I wasn't sure if it was a real problem or
not.  I probably should have done more digging and written a piglit test at
the time. My bad.

One solution that I proposed at the time was to turn header_size into
header_mask in the obvious way. We can still use 8 bits because we should
never have a header source higher than 8.

Thoughts?
--Jason

 I've written a piglit test that demonstrates the problem by using
 gl_SampleMask from a fragment shader with multiple color outputs [1].

 [1] http://lists.freedesktop.org/archives/piglit/2015-July/016499.html
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 26
+-
  1 file changed, 17 insertions(+), 9 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 94d6a58..304ae74 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -1535,6 +1535,19 @@ fs_visitor::emit_single_fb_write(const fs_builder
bld,
length++;
 }

 +   if (src0_alpha.file != BAD_FILE  color0.file != BAD_FILE) {
 +  /* Neat, we need to chop the src0 alpha component and pass it as
part of
 +   * the header even though it has per-channel semantics, because
the next
 +   * optional field is header-like and LOAD_PAYLOAD requires all such
 +   * fields to form a contiguous segment at the beginning of the
message.
 +   */
 +  for (unsigned i = 0; i  exec_size / 8; i++) {
 + setup_color_payload(sources[length], src0_alpha, 1, 8,
 + use_2nd_half || i == 1);
 + length++;
 +  }
 +   }
 +
 prog_data-uses_omask =
prog-OutputsWritten  BITFIELD64_BIT(FRAG_RESULT_SAMPLE_MASK);
 if (prog_data-uses_omask) {
 @@ -1561,19 +1574,14 @@ fs_visitor::emit_single_fb_write(const fs_builder
bld,
   offset(this-outputs[0], bld, 3),
   1, exec_size, false);
length += 4;
 -   } else if (color1.file == BAD_FILE) {
 -  if (src0_alpha.file != BAD_FILE) {
 - setup_color_payload(sources[length], src0_alpha, 1, exec_size,
false);
 - length++;
 -  }
 -
 -  setup_color_payload(sources[length], color0, components,
 -  exec_size, use_2nd_half);
 -  length += 4;
 } else {
setup_color_payload(sources[length], color0, components,
exec_size, use_2nd_half);
length += 4;
 +
 +   }
 +
 +   if (color1.file != BAD_FILE) {
setup_color_payload(sources[length], color1, components,
exec_size, use_2nd_half);
length += 4;
 --
 2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev