https://bugs.llvm.org/show_bug.cgi?id=49974

            Bug ID: 49974
           Summary: [arm disassembler] Incorrect number of operands in
                    MCInst generated by disassembler
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: ARM
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected], [email protected],
                    [email protected]

This ticket covers two bugs; they are similar enough that I combined them into
one report.

First, given this binary instruction "0x26,0x00,0x00,0xeb", we can disassemble
it with the following command:
```
$ echo "0x26 0x00 0x00 0xeb" | llvm-mc --disassemble -triple=armv7 -o -
        .text
        bl      #152
$
```
Although the output above looks normal, the disassembled `MCInst` tells a
different story. (The debug output of `llvm-mc --disassemble` currently doesn't
print the disassembled `MCInst`, but you can observe it in other ways, such as
with gdb.) It looks like this:
```
<MCInst #703 BL <MCOperand Imm:152> <MCOperand Imm:14> <MCOperand Reg:0>>
```
According to the instruction definition of `BL`, it takes only 1 operand rather
than 3. The latter two are predicate operands (the second operand represents
`ARMCC::AL` and the third is the predicate register it depends on) inserted by
mistake.

Another input that triggers a similar bug is "0xad 0xf2 0x7c 0x4d":
```
$ echo "0xad 0xf2 0x7c 0x4d" | llvm-mc --disassemble -triple=thumbv7 -o -
        .text
        subw    sp, sp, #1148
$
```
Again, the disassembled text is benign, but the disassembled `MCInst` looks
like this:
```
<MCInst #4193 t2SUBspImm12 \
              <MCOperand Reg:15> <MCOperand Reg:15> \
              <MCOperand Imm:1148> \
              <MCOperand Imm:14> <MCOperand Reg:0> \
              <MCOperand Reg:0> <MCOperand Reg:0>>
```
According to the instruction definition of `t2SUBspImm12`, there should be only
5 operands rather than 7. The last two operands are inserted by mistake.

These bugs affect users that directly consume the disassembled `MCInst` object.
For example, feeding the disassembled `MCInst` into LLVM MCA causes MCA to
choke, because MCA is more sensitive to the total number of operands in an
`MCInst`.

These two bugs were never caught because we never directly test the in-memory
`MCInst` object (or its textual format). The testing infrastructure we have
translates the `MCInst` into assembly code before checking it. But as you can
see above, this cannot detect surplus operands appended at the _end_.
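
As a rough illustration of what a check on the textual format could look like,
here is a minimal sketch that counts the `<MCOperand ...>` entries in an
`MCInst` dump and compares the count with the expected one. The function names
and the `EXPECTED_OPERANDS` table are hypothetical; the expected counts are
simply the ones stated in this report, not values read from LLVM's instruction
tables.
```python
import re

# Hypothetical table: expected operand counts as stated in this report
# (BL: 1, t2SUBspImm12: 5). A real check would take them from the
# instruction definitions instead.
EXPECTED_OPERANDS = {"BL": 1, "t2SUBspImm12": 5}


def parse_mcinst(dump):
    """Return (opcode_name, operand_count) for a textual MCInst dump."""
    header = re.search(r"<MCInst\s+#\d+\s+(\w+)", dump)
    if header is None:
        raise ValueError("not an MCInst dump")
    # Every operand is printed as "<MCOperand ...>", so counting the opening
    # markers gives the number of operands, including trailing ones.
    return header.group(1), len(re.findall(r"<MCOperand\b", dump))


def surplus_operands(dump):
    """Return how many operands the dump has beyond the expected count."""
    name, count = parse_mcinst(dump)
    return count - EXPECTED_OPERANDS[name]


if __name__ == "__main__":
    bl = "<MCInst #703 BL <MCOperand Imm:152> <MCOperand Imm:14> <MCOperand Reg:0>>"
    sub = r"""<MCInst #4193 t2SUBspImm12 \
              <MCOperand Reg:15> <MCOperand Reg:15> \
              <MCOperand Imm:1148> \
              <MCOperand Imm:14> <MCOperand Reg:0> \
              <MCOperand Reg:0> <MCOperand Reg:0>>"""
    print(surplus_operands(bl))   # prints 2
    print(surplus_operands(sub))  # prints 2
```
Unlike a comparison of the printed assembly, a check of this kind would flag
both dumps above, because it looks at the total operand count rather than at
the operands the printer happens to use.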
