http://llvm.org/bugs/show_bug.cgi?id=14824

             Bug #: 14824
           Summary: Optimization arm_ldst_opt inserts newly generated
                    instruction vldmia at incorrect position
           Product: libraries
           Version: trunk
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: ARM
        AssignedTo: [email protected]
        ReportedBy: [email protected]
                CC: [email protected]
    Classification: Unclassified


Created attachment 9821
  --> http://llvm.org/bugs/attachment.cgi?id=9821
The small test case verifying the bug in arm_ldst_opt

Hi,

The arm_ldst_opt optimization inserts the newly generated vldmia instruction
at an incorrect position.

For the attached small test case, ldst_opt_bug.ll, the command line

llc -mcpu=cortex-a9 -mattr=+neon,+neonfp ldst_opt_bug.ll

produces an incorrect instruction sequence in ldst_opt_bug.s:

        vldr    s0, [r0, #432]
        vldr    s5, [r1, #412]
        vldr    s14, [r1, #440]
        vldr    s10, [r0, #440]
        vldmia  r3, {s0, s1, s2, s3}
        add.w   r3, r2, #432
        vldr    s7, [r1, #420]

That is, the instruction "vldmia  r3, {s0, s1, s2, s3}" overwrites the s0 just
loaded by "vldr    s0, [r0, #432]", which is incorrect according to the
semantics of the original LLVM IR in ldst_opt_bug.ll.

In arm_ldst_opt, before the vldmia instruction is generated, the machine
instructions are as follows:

(1) %S0<def> = VLDRS %R0, 102, pred:14, pred:%noreg, %Q0<imp-def>;
    mem:LD4[%arrayidx67+24](align=8)
(2) %S1<def> = VLDRS %R0, 103, pred:14, pred:%noreg, %Q0<imp-use,kill>,
    %Q0<imp-def>; mem:LD4[%arrayidx67+28]
(3) %S11<def> = VLDRS %R0, 111, pred:14, pred:%noreg, %Q2<imp-use,kill>,
    %Q2<imp-def>; mem:LD4[%arrayidx67+60]
(4) %S10<def> = VLDRS %R0, 110, pred:14, pred:%noreg, %Q2<imp-use,kill>,
    %Q2<imp-def>; mem:LD4[%arrayidx67+56](align=8)
(5) %S1<def> = VLDRS %R0, 109, pred:14, pred:%noreg, %Q0<imp-use,kill>,
    %Q0<imp-def>; mem:LD4[%arrayidx67+52]
(6) %S0<def> = VLDRS %R0, 108, pred:14, pred:%noreg, %Q0<imp-use,kill>,
    %Q0<imp-def>; mem:LD4[%arrayidx67+48](align=16)
(7) %S3<def> = VLDRS %R0, 105, pred:14, pred:%noreg, %Q0<imp-use,kill>,
    %Q0<imp-def>; mem:LD4[%arrayidx67+36]
(8) %S2<def> = VLDRS %R0, 104, pred:14, pred:%noreg, %Q0<imp-use,kill>,
    %Q0<imp-def>; mem:LD4[%arrayidx67+32](align=32)
(9) %S7<def> = VLDRS %R1, 105, pred:14, pred:%noreg, %Q1<imp-use,kill>,
    %Q1<imp-def>; mem:LD4[%arrayidx64+36]

The optimization tries to hoist instructions (7) and (8) so that they can be
merged with (1) and (2) into a single vldm, because the four loads read
consecutive memory at byte offsets 102*4, 103*4, 104*4, and 105*4 (408 through
420) from R0. The intent of the optimization is itself correct.
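The contiguity argument can be checked mechanically. The following is a
minimal C++ sketch, not code from the ARM backend: the Load struct and the
helper names are hypothetical, and the requirement that the destination
registers be consecutive is a simplifying assumption about what a single vldm
can encode.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one VLDRS from the listing above.
struct Load {
    int id;       // label from the listing, (1)..(9)
    int sreg;     // destination register number, e.g. 0 for S0
    int wordOff;  // immediate operand, counted in 4-byte words
};

// Byte offset addressed by a VLDRS immediate: words * 4.
int byteOffset(const Load &l) { return l.wordOff * 4; }

// True if, once sorted by offset, both the offsets and the destination
// registers increase by exactly one from each load to the next.
bool formsContiguousRun(std::vector<Load> loads) {
    assert(!loads.empty());
    std::sort(loads.begin(), loads.end(),
              [](const Load &a, const Load &b) { return a.wordOff < b.wordOff; });
    for (std::size_t i = 1; i < loads.size(); ++i) {
        if (loads[i].wordOff != loads[i - 1].wordOff + 1) return false;
        if (loads[i].sreg != loads[i - 1].sreg + 1) return false;
    }
    return true;
}

// Instructions (1), (2), (7), (8): S0..S3 from word offsets 102..105.
std::vector<Load> candidateLoads() {
    return {{1, 0, 102}, {2, 1, 103}, {7, 3, 105}, {8, 2, 104}};
}
```

Calling formsContiguousRun(candidateLoads()) reports that the four loads are
mergeable in this model; it says nothing about where the merged instruction
may legally be placed.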

After hoisting, the algorithm first builds an internal instruction sequence:

(1)
(2)
(7)
(8)

The problem is that the newly generated vldm is inserted after the original
position of instruction (8). This violates the data dependence on instruction
(6), which redefines S0 after instruction (1); instruction (5) likewise
redefines S1 after instruction (2).
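The dependence violation can be reproduced with a toy register-file model.
This is only a sketch under stated assumptions: Op, memWord, and run are
hypothetical names, memory is modelled so that every word holds a value unique
to its (base, offset) pair, and only the nine loads listed above are
considered. The third sequence places the merged load at the position of
instruction (1) purely for comparison; it is not claimed to be the right fix
for the pass.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical model of a load: count == 1 is a VLDRS, count > 1 a VLDM.
struct Op {
    int base;      // base register number (0 for R0, 1 for R1)
    int wordOff;   // offset in 4-byte words
    int firstReg;  // first destination S register
    int count;     // number of consecutive registers loaded
};

using RegFile = std::array<int, 32>;

// Toy memory: each word holds a value unique to its (base, offset) pair.
int memWord(int base, int wordOff) { return base * 1000 + wordOff; }

// Execute the loads in order and return the final S-register contents.
RegFile run(const std::vector<Op> &ops) {
    RegFile s;
    s.fill(-1);  // -1 marks a register that was never written
    for (const Op &op : ops) {
        assert(op.count >= 1 && op.firstReg + op.count <= 32);
        for (int i = 0; i < op.count; ++i)
            s[op.firstReg + i] = memWord(op.base, op.wordOff + i);
    }
    return s;
}

// Instructions (1)..(9) in their original order.
std::vector<Op> originalOrder() {
    return {{0, 102, 0, 1}, {0, 103, 1, 1}, {0, 111, 11, 1},
            {0, 110, 10, 1}, {0, 109, 1, 1}, {0, 108, 0, 1},
            {0, 105, 3, 1}, {0, 104, 2, 1}, {1, 105, 7, 1}};
}

// (1), (2), (7), (8) merged and placed where (8) used to be.
std::vector<Op> mergedAfter8() {
    return {{0, 111, 11, 1}, {0, 110, 10, 1}, {0, 109, 1, 1},
            {0, 108, 0, 1}, {0, 102, 0, 4}, {1, 105, 7, 1}};
}

// Same merge, placed where (1) used to be (comparison only).
std::vector<Op> mergedAt1() {
    return {{0, 102, 0, 4}, {0, 111, 11, 1}, {0, 110, 10, 1},
            {0, 109, 1, 1}, {0, 108, 0, 1}, {1, 105, 7, 1}};
}
```

In this model, run(originalOrder()) and run(mergedAfter8()) disagree on S0 and
S1, while run(originalOrder()) and run(mergedAt1()) produce the same register
contents.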

The bug appears to originate in the function
ARMLoadStoreOpt::MergeOpsUpdate:

  // Try to do the merge.
  MachineBasicBlock::iterator Loc = memOps[insertAfter].MBBI;
  ++Loc;
  if (!MergeOps(MBB, Loc, Offset, Base, BaseKill, Opcode,
                Pred, PredReg, Scratch, dl, Regs, ImpDefs))
    return;

When Loc points at (8), ++Loc points at (9), so the merged instruction is
placed immediately before (9).
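The iterator arithmetic can be illustrated with a std::list standing in for
the basic block. This is a sketch, not LLVM code: kMerged, insertMergedAfter,
and positionOf are hypothetical names, and it assumes the merged instruction
is inserted before the iterator passed in, which is how std::list::insert
behaves.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical label for the merged vldm in this model.
constexpr int kMerged = 100;

// Mimic "Loc = memOps[insertAfter].MBBI; ++Loc;" followed by an insertion
// before Loc. The block is a list of instruction labels.
std::list<int> insertMergedAfter(std::list<int> block, int insertAfterLabel) {
    auto loc = std::find(block.begin(), block.end(), insertAfterLabel);
    assert(loc != block.end());
    ++loc;                        // now refers to the following instruction
    block.insert(loc, kMerged);   // std::list::insert places it before loc
    return block;
}

// Zero-based position of a label within the block.
int positionOf(const std::list<int> &block, int label) {
    auto it = std::find(block.begin(), block.end(), label);
    assert(it != block.end());
    return static_cast<int>(std::distance(block.begin(), it));
}

// Instructions (1)..(9) in their original order.
std::list<int> exampleBlock() { return {1, 2, 3, 4, 5, 6, 7, 8, 9}; }
```

With insertAfterLabel set to 8, the merged entry ends up between (8) and (9),
that is, after (5) and (6), which matches the placement described in this
report.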

