Sorry, I didn't investigate why. Maybe some pass is executed twice?
Probably. Well when I wrote the function the idea was to figure out (once) if a result of MIMG is used or not and if it's not used to adjust the mask, but that doesn't seems to be the case any more.

Whatever, the patch itself looks good to me and obviously fixes a bug, so feel free to commit it. We should just keep in mind to maybe rework that at some point, cause it definitely only needs to be done once.

Christian.

Am 24.09.2013 12:25, schrieb Marek Olšák:
Sorry, I didn't investigate why. Maybe some pass is executed twice?

Marek

On Tue, Sep 24, 2013 at 11:48 AM, Christian König
<deathsim...@vodafone.de> wrote:
Sorry, my fault let me refine the question: Why the heck is the function
applied twice?

Christian.

Am 24.09.2013 11:44, schrieb Marek Olšák:

If the TGSI writemask is .xzw and the initial dmask is 0xf, the first
application of the function sees 4 VGPRs and correctly sets the dmask
to 13 (xzw) because the second VGPR is unused, but the second
application of the function now sees 3 VGPRs, ignores the fact that
dmask is 13 meaning xzw, and sets the writemask to 7 meaning xyz,
which breaks the texture instruction. I thought it was obvious from
the comments I added.

Marek

On Tue, Sep 24, 2013 at 9:00 AM, Christian König
<deathsim...@vodafone.de> wrote:
Why should that be necessary?

Christian.

Am 23.09.2013 21:09, schrieb mar...@gmail.com:

From: Marek Olšák <marek.ol...@amd.com>

This fixes piglit:
- shaders/glsl-fs-texture2d-masked
- shaders/glsl-fs-texture2d-masked-4

Signed-off-by: Marek Olšák <marek.ol...@amd.com>
---
    lib/Target/R600/SIISelLowering.cpp | 27 +++++++++++++++++++++------
    1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/lib/Target/R600/SIISelLowering.cpp
b/lib/Target/R600/SIISelLowering.cpp
index 2174753..891a51b 100644
--- a/lib/Target/R600/SIISelLowering.cpp
+++ b/lib/Target/R600/SIISelLowering.cpp
@@ -1065,7 +1065,9 @@ static unsigned SubIdx2Lane(unsigned Idx) {
    void SITargetLowering::adjustWritemask(MachineSDNode *&Node,
                                           SelectionDAG &DAG) const {
      SDNode *Users[4] = { };
-  unsigned Writemask = 0, Lane = 0;
+  unsigned Lane = 0;
+  unsigned OldDmask = Node->getConstantOperandVal(0);
+  unsigned NewDmask = 0;
        // Try to figure out the used register components
      for (SDNode::use_iterator I = Node->use_begin(), E =
Node->use_end();
@@ -1076,29 +1078,42 @@ void
SITargetLowering::adjustWritemask(MachineSDNode *&Node,
            I->getMachineOpcode() != TargetOpcode::EXTRACT_SUBREG)
          return;
    +    /* Lane means which subreg of %VGPRa_VGPRb_VGPRc_VGPRd is used.
+     * Note that subregs are packed, i.e. Lane==0 is the first bit set
+     * in OldDmask, so it can be any of X,Y,Z,W; Lane==1 is the second
bit
+     * set, etc. */
        Lane = SubIdx2Lane(I->getConstantOperandVal(1));
    +    // Set which texture component corresponds to the lane.
+    unsigned Comp;
+    for (unsigned i = 0, Dmask = OldDmask; i <= Lane; i++) {
+      assert(Dmask);
+      Comp = ffs(Dmask)-1;
+      Dmask &= ~(1 << Comp);
+    }
+
        // Abort if we have more than one user per component
        if (Users[Lane])
          return;
          Users[Lane] = *I;
-    Writemask |= 1 << Lane;
+    NewDmask |= 1 << Comp;
      }
    -  // Abort if all components are used
-  if (Writemask == 0xf)
+  // Abort if there's no change
+  if (NewDmask == OldDmask)
        return;
        // Adjust the writemask in the node
      std::vector<SDValue> Ops;
-  Ops.push_back(DAG.getTargetConstant(Writemask, MVT::i32));
+  Ops.push_back(DAG.getTargetConstant(NewDmask, MVT::i32));
      for (unsigned i = 1, e = Node->getNumOperands(); i != e; ++i)
        Ops.push_back(Node->getOperand(i));
      Node = (MachineSDNode*)DAG.UpdateNodeOperands(Node, Ops.data(),
Ops.size());
        // If we only got one lane, replace it with a copy
-  if (Writemask == (1U << Lane)) {
+  // (if NewDmask has only one bit set...)
+  if (NewDmask && (NewDmask & (NewDmask-1)) == 0) {
        SDValue RC = DAG.getTargetConstant(AMDGPU::VReg_32RegClassID,
MVT::i32);
        SDNode *Copy = DAG.getMachineNode(TargetOpcode::COPY_TO_REGCLASS,
                                          SDLoc(),
Users[Lane]->getValueType(0),


_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to