From: "Juan A. Suarez Romero" <jasua...@igalia.com>

When converting a DF to 32-bit conversions, we set dst stride to 2,
to fulfill alignment restrictions because the upper Dword of every
Qword will be written with undefined value.

But in IVB/BYT, this is not necessary, as each DF conversion already
writes 2, the first one the real value, and the second one a 0.
That is, IVB/BYT already set stride = 2 implicitly, so we must set it to
1 explicitly to avoid ending up with stride = 4.

v2:
- Fix typo (Matt)

v3:
- Fix stride in the destination's brw_reg, don't modity IR (Curro)

v4:
- Remove 'is_dst' argument of brw_reg_from_fs_reg() (Curro)
- Fix comment (Curro).
- Relax hstride assert (Curro)

Signed-off-by: Juan A. Suarez Romero <jasua...@igalia.com>
Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
 src/intel/compiler/brw_fs_generator.cpp | 68 ++++++++++++++++++++-------------
 1 file changed, 41 insertions(+), 27 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp 
b/src/intel/compiler/brw_fs_generator.cpp
index bc15fd11d96..9e530699a08 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -93,33 +93,47 @@ brw_reg_from_fs_reg(const struct gen_device_info *devinfo, 
fs_inst *inst,
          const unsigned width = MIN2(reg_width, phys_width);
          brw_reg = brw_vecn_reg(width, brw_file_from_reg(reg), reg->nr, 0);
          brw_reg = stride(brw_reg, width * reg->stride, width, reg->stride);
-         /* From the IvyBridge PRM (EU Changes by Processor Generation, page 
13):
-          *  "Each DF (Double Float) operand uses an element size of 4 rather
-          *   than 8 and all regioning parameters are twice what the values
-          *   would be based on the true element size: ExecSize, Width,
-          *   HorzStride, and VertStride. Each DF operand uses a pair of
-          *   channels and all masking and swizzing should be adjusted
-          *   appropriately."
-          *
-          * From the IvyBridge PRM (Special Requirements for Handling Double
-          * Precision Data Types, page 71):
-          *  "In Align1 mode, all regioning parameters like stride, execution
-          *   size, and width must use the syntax of a pair of packed
-          *   floats. The offsets for these data types must be 64-bit
-          *   aligned. The execution size and regioning parameters are in terms
-          *   of floats."
-          *
-          * Summarized: when handling DF-typed arguments, ExecSize,
-          * VertStride, and Width must be doubled.
-          *
-          * It applies to BayTrail too.
-          */
-         if (devinfo->gen == 7 && !devinfo->is_haswell &&
-             type_sz(reg->type) == 8) {
-            brw_reg.width++;
-            if (brw_reg.vstride > 0)
-               brw_reg.vstride++;
-            assert(brw_reg.hstride == BRW_HORIZONTAL_STRIDE_1);
+
+         if (devinfo->gen == 7 && !devinfo->is_haswell) {
+            /* From the IvyBridge PRM (EU Changes by Processor Generation, 
page 13):
+             *  "Each DF (Double Float) operand uses an element size of 4 
rather
+             *   than 8 and all regioning parameters are twice what the values
+             *   would be based on the true element size: ExecSize, Width,
+             *   HorzStride, and VertStride. Each DF operand uses a pair of
+             *   channels and all masking and swizzing should be adjusted
+             *   appropriately."
+             *
+             * From the IvyBridge PRM (Special Requirements for Handling Double
+             * Precision Data Types, page 71):
+             *  "In Align1 mode, all regioning parameters like stride, 
execution
+             *   size, and width must use the syntax of a pair of packed
+             *   floats. The offsets for these data types must be 64-bit
+             *   aligned. The execution size and regioning parameters are in 
terms
+             *   of floats."
+             *
+             * Summarized: when handling DF-typed arguments, ExecSize,
+             * VertStride, and Width must be doubled.
+             *
+             * It applies to BayTrail too.
+             */
+            if (type_sz(reg->type) == 8) {
+               brw_reg.width++;
+               if (brw_reg.vstride > 0)
+                  brw_reg.vstride++;
+               assert(brw_reg.hstride == BRW_HORIZONTAL_STRIDE_1);
+            }
+
+            /* When converting from DF->F, we set destination's stride as 2 as 
an
+             * because each d2f conversion implicitly writes 2 F,
+             * being the first one the converted value. IVB/BYT actually writes
+             * two F components per SIMD channel, and every other component is
+             * filled with garbage.
+             */
+            if (reg == &inst->dst && get_exec_type_size(inst) == 8 &&
+                type_sz(inst->dst.type) < 8) {
+               assert(brw_reg.hstride > BRW_HORIZONTAL_STRIDE_1);
+               brw_reg.hstride--;
+            }
          }
       }
 
-- 
2.11.0

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to