[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-28 Thread Kristof Beyls via llvm-branch-commits


@@ -355,6 +389,46 @@ class SrcSafetyAnalysis {
 return Regs;
   }
 
+  // Returns all registers made trusted by this instruction.
+  SmallVector getRegsMadeTrusted(const MCInst &Point,
+const SrcState &Cur) const {
+SmallVector Regs;
+const MCPhysReg NoReg = BC.MIB->getNoRegister();
+
+// An authenticated pointer can be checked, or
+MCPhysReg CheckedReg =
+BC.MIB->getAuthCheckedReg(Point, /*MayOverwrite=*/false);
+if (CheckedReg != NoReg && Cur.SafeToDerefRegs[CheckedReg])
+  Regs.push_back(CheckedReg);
+
+if (CheckerSequenceInfo.contains(&Point)) {
+  MCPhysReg CheckedReg;
+  const MCInst *FirstCheckerInst;
+  std::tie(CheckedReg, FirstCheckerInst) = CheckerSequenceInfo.at(&Point);
+
+  // FirstCheckerInst should belong to the same basic block, meaning
+  // it was deterministically processed a few steps before this 
instruction.
+  const SrcState &StateBeforeChecker =
+  getStateBefore(*FirstCheckerInst).get();

kbeyls wrote:

Thanks, that all makes sense.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-28 Thread Kristof Beyls via llvm-branch-commits


@@ -591,7 +591,9 @@ obscure_indirect_call_arg_nocfg:
 .globl  safe_lr_at_function_entry_nocfg
 .type   safe_lr_at_function_entry_nocfg,@function
 safe_lr_at_function_entry_nocfg:
-// CHECK-NOT: safe_lr_at_function_entry_nocfg
+// Due to state being reset after a label, paciasp is reported as
+// a signing oracle - this is a known false positive, ignore it.
+// CHECK-NOT: non-protected call{{.*}}safe_lr_at_function_entry_nocfg
 cbz x0, 1f
 ret// LR is safe at the start of the 
function
 1:

kbeyls wrote:

Thanks, that's useful info to know!
FWIW, my experience on pac-ret is that most code generated by the compiler 
follows mostly the same very regular structure, so I'm not surprised that 
you're not getting many false positives on llvm-test-suite.
In my experience with pac-ret scanning, you get most false positives when 
scanning across a full distribution, on pieces of code that were not generated 
by a popular compiler, such as hand-written assembly...

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-28 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls approved this pull request.

Thanks for your patience with my many questions!
This looks good to merge to me now.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-28 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-28 Thread Kristof Beyls via llvm-branch-commits


@@ -462,7 +563,22 @@ class DataflowSrcSafetyAnalysis
 return DFParent::getStateBefore(Inst);
   }
 
-  void run() override { DFParent::run(); }
+  void run() override {
+for (BinaryBasicBlock &BB : Func) {
+  if (auto CheckerInfo = BC.MIB->getAuthCheckedReg(BB)) {
+MCInst *LastInstOfChecker = BB.getLastNonPseudoInstr();
+LLVM_DEBUG({
+  dbgs() << "Found pointer checking sequence in " << BB.getName()
+ << ":\n";
+  traceReg(BC, "Checked register", CheckerInfo->first);
+  traceInst(BC, "First instruction", *CheckerInfo->second);
+  traceInst(BC, "Last instruction", *LastInstOfChecker);
+});
+CheckerSequenceInfo[LastInstOfChecker] = *CheckerInfo;
+  }
+}

kbeyls wrote:

Fair enough, let's leave it as it is.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-26 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -591,7 +591,9 @@ obscure_indirect_call_arg_nocfg:
 .globl  safe_lr_at_function_entry_nocfg
 .type   safe_lr_at_function_entry_nocfg,@function
 safe_lr_at_function_entry_nocfg:
-// CHECK-NOT: safe_lr_at_function_entry_nocfg
+// Due to state being reset after a label, paciasp is reported as
+// a signing oracle - this is a known false positive, ignore it.
+// CHECK-NOT: non-protected call{{.*}}safe_lr_at_function_entry_nocfg
 cbz x0, 1f
 ret// LR is safe at the start of the 
function
 1:

atrosinenko wrote:

I test gadget scanner on llvm-test-suite built at -O2 optimization level from 
time to time. Surprisingly, there doesn't seem to be any issues reported for 
functions without CFG information.

By the way, another issue came up when I was implementing #137224. I have no 
statistics on real-world functions for which BOLT is unable to reconstruct the 
CFG, but leaf functions are probably more widespread than shrink-wrap optimized 
ones. For leaf functions without CFG, it turned out to be quite easy to improve 
handling of the LR.

Considering false positives in general, IIRC there was something like 10 false 
positives reported for llvm-test-suite by my prototype implementation. There 
are still quite a number of false positives reported by top-of-the-patch-stack, 
I plan upstreaming more patches to fix these.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-25 Thread Kristof Beyls via llvm-branch-commits


@@ -462,7 +563,22 @@ class DataflowSrcSafetyAnalysis
 return DFParent::getStateBefore(Inst);
   }
 
-  void run() override { DFParent::run(); }
+  void run() override {
+for (BinaryBasicBlock &BB : Func) {
+  if (auto CheckerInfo = BC.MIB->getAuthCheckedReg(BB)) {
+MCInst *LastInstOfChecker = BB.getLastNonPseudoInstr();
+LLVM_DEBUG({
+  dbgs() << "Found pointer checking sequence in " << BB.getName()
+ << ":\n";
+  traceReg(BC, "Checked register", CheckerInfo->first);
+  traceInst(BC, "First instruction", *CheckerInfo->second);
+  traceInst(BC, "Last instruction", *LastInstOfChecker);
+});
+CheckerSequenceInfo[LastInstOfChecker] = *CheckerInfo;
+  }
+}

kbeyls wrote:

Another nit-pick.
This to me looks like it's initializing the `CheckerSequenceInfo` variable of 
the `SrcSafetyAnalysis` parent class.
Wouldn't it be cleaner to do this initializing in the constructor for 
`SrcSafetyAnalysis`, rather than in this `run` method?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-25 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

I finally managed to read through this patch end-to-end.
I only have 3 very nit-picky questions left.
This looks almost ready to merge.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-25 Thread Kristof Beyls via llvm-branch-commits


@@ -355,6 +389,46 @@ class SrcSafetyAnalysis {
 return Regs;
   }
 
+  // Returns all registers made trusted by this instruction.
+  SmallVector getRegsMadeTrusted(const MCInst &Point,
+const SrcState &Cur) const {
+SmallVector Regs;
+const MCPhysReg NoReg = BC.MIB->getNoRegister();
+
+// An authenticated pointer can be checked, or
+MCPhysReg CheckedReg =
+BC.MIB->getAuthCheckedReg(Point, /*MayOverwrite=*/false);
+if (CheckedReg != NoReg && Cur.SafeToDerefRegs[CheckedReg])
+  Regs.push_back(CheckedReg);
+
+if (CheckerSequenceInfo.contains(&Point)) {
+  MCPhysReg CheckedReg;
+  const MCInst *FirstCheckerInst;
+  std::tie(CheckedReg, FirstCheckerInst) = CheckerSequenceInfo.at(&Point);
+
+  // FirstCheckerInst should belong to the same basic block, meaning
+  // it was deterministically processed a few steps before this 
instruction.
+  const SrcState &StateBeforeChecker =
+  getStateBefore(*FirstCheckerInst).get();

kbeyls wrote:

This is a nitpick.
I was trying to get my head around whether it's guaranteed to get to a fixed 
point in a dataflow analysis, where next to using just the previous state on 
the current instruction, also a state on another instruction is used as input 
to compute the next state for the current instruction.
I'm assuming this is probably fine (probably, because as you write in the 
comment, this instruction should be in the same basic block).
I thought I'd just ask if you ended up checking this somewhere else and, if so, 
you'd happen to have a pointer to something stating that you can also take 
state from another instruction in the same basic block, and still be guaranteed 
to reach a fixed point in a dataflow analysis?

Orthogonal to that: would it be hard to add an assert here that 
`*FirstCheckerInst` is indeed in the same basic block?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-25 Thread Kristof Beyls via llvm-branch-commits


@@ -591,7 +591,9 @@ obscure_indirect_call_arg_nocfg:
 .globl  safe_lr_at_function_entry_nocfg
 .type   safe_lr_at_function_entry_nocfg,@function
 safe_lr_at_function_entry_nocfg:
-// CHECK-NOT: safe_lr_at_function_entry_nocfg
+// Due to state being reset after a label, paciasp is reported as
+// a signing oracle - this is a known false positive, ignore it.
+// CHECK-NOT: non-protected call{{.*}}safe_lr_at_function_entry_nocfg
 cbz x0, 1f
 ret// LR is safe at the start of the 
function
 1:

kbeyls wrote:

[Re: lines 
+594 to +600]
I'm wondering if this false positive pattern could end up appearing quite a few 
times in real code, specifically in code that has been shrink-wrap optimized?
Did you run this scanner on a larger code base? How many and what kind of false 
positives did you see?
See this comment 
inline on https://app.graphite.dev/github/pr/llvm/llvm-project/134146?utm_source=unchanged-line-comment";>Graphite.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-25 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-24 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-24 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-24 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-24 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-24 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko commented:

@kbeyls Thank you for the comments! I have updated the comments accordingly.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-24 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -339,6 +369,198 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back:
+//
+// ; good:
+// autdza  x1   ; x1 is authenticated (may fail

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-24 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-23 Thread Kristof Beyls via llvm-branch-commits


@@ -339,6 +369,198 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back:
+//
+// ; good:
+// autdza  x1   ; x1 is authenticated (may fail

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-23 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

Apologies for only reviewing piece-meal. I'm struggling a bit at the time to 
reserve longer blocks of time to review this fully in one go.
I hope my comments still make sense though

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-23 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-23 Thread Kristof Beyls via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-23 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-23 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko commented:

Partially addressed the comments.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -307,8 +340,10 @@ class SrcSafetyAnalysis {
 
   SrcState createEntryState() {
 SrcState S(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs())
-  S.SafeToDerefRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs()) {
+  S.TrustedRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+  S.SafeToDerefRegs = S.TrustedRegs;
+}

atrosinenko wrote:

Fixed, thank you!

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

Just adding a few more comments as I'm reading through the patch.
I haven't yet managed to read the full patch in detail yet

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits


@@ -307,8 +340,10 @@ class SrcSafetyAnalysis {
 
   SrcState createEntryState() {
 SrcState S(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs())
-  S.SafeToDerefRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs()) {
+  S.TrustedRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+  S.SafeToDerefRegs = S.TrustedRegs;
+}

kbeyls wrote:

Nit pick: I guess
`S.SafeToDerefRegs = S.TrustedRegs` could be move to after the loop, rather 
than inside the loop?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional>

atrosinenko wrote:

I expect all the check methods listed in 
[`AArch64PointerAuth.h`](https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64PointerAuth.h)
 to be detected by one of these two methods. On the other hand, the 
`getAuthCheckedReg(BB)` method has a closed list of hard-coded patterns to 
recognize - if some other compilers or some other versions of LLVM would use 
other instruction sequences, it should be possible to update this method as 
appropriate. For example, LLVM used to emit an "inverted" variant of the checks 
in the past: the fall-through basic block was executed on success and on 
failure a jump was performed to a basic block containing a BRK instruction. Not 
sure if these variants should be supported, as later the mainline backend was 
aligned with the original variants of checks emitted by the prototype from the 
Ahmed Bougacha's branch (these are the patterns detected in this PR), but if I 
would have to support the inverted variants, the only change needed would 
probably be accepting TBNZ in addition to TBZ and B.NE in addition to B.EQ.

Considering the `getAuthCheckedReg(Inst, bool)` method, it could obviously be 
improved by handling store instructions (though, I doubt it would change 
anything for `pauthtest` ABI). False positives (not recognizing some load 
instruction as "checking") are technically possible, but I would expect them to 
be due to the existing implementation of `AArch64MCPlusBuilder::mayLoad()` 
which hardcodes a number of opcodes.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional>

kbeyls wrote:

Thank you for those improved comments and the test case, that makes a lot more 
sense to me now!

One thing I'm still wondering about before resolving this review thread, is if 
you have any thoughts about whether any instruction patterns that guarantee the 
program will terminate on an authentication fault might not be detected by 
either of the 2 versions of `getAuthCheckedReg`?

It makes sense to update the documentation in a separate PR, but it might be 
useful to record the answer to the above question in a review comment to later 
help write up in the documentation what classes of false positive/negatives the 
user may still expect to see?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-18 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-18 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional>

atrosinenko wrote:

Updated the comment and added a test case in 
56ea6bcb91b5dd7191e0a2857303ad86ff09fe74.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-18 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional>

atrosinenko wrote:

Oh, got it - it is the description of `getAuthCheckedReg` which is not as 
detailed as it should be. There are two overloaded methods - my current 
assumption is that two different kinds of "pointer checkers" can be detected:
* single instructions, such as `ldr w0, [x1]`
* a number of hardcoded instruction sequences - all these sequences are 
contiguous and involve branching depending on the result of validation, so they 
span some suffix of a basic block. This is a bit hackish but it does work for 
AArch64 code emitted by LLVM, aside from being unsupported when CFG is not 
available

I hope this is an acceptable approach, at least at first, but of course the 
documentation of `getAuthCheckedReg(BB)` should mention that this method does 
not summarize the results of `getAuthCheckedReg(Inst)` across the basic block, 
it detects completely different patterns.

Thus, considering your example, `getAuthCheckedReg(Inst)` should report 
`nullopt`, `nullopt`, `x0`, `x2` for the four instructions of `bb1`, and 
`getAuthCheckedReg(BB)` should report `nullopt` for `bb1`.

Initially, I planned significantly updating `bolt/docs/BinaryAnalysis.md` via a 
separate PR after most of the changes land. I will add your example as a test 
case and update the description of `getAuthCheckedReg` in this PR, of course.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-18 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional>

kbeyls wrote:

Off the top of my head, for the following example code (which probably wouldn't 
be produced by any currently existing compiler, but that could be written by 
hand in assembly?), I guess that's a single basic block for which 
`getAuthCheckedReg` would have to return 2 pairs?:
```
bb1:
autda x0, x1
autda x2, x3
ldr xzr, [x0]
ldr xzr, [x2]
```

I'm OK with not handling this case in this PR, but then maybe there should be a 
test case with a basic block similar to the above to show how BOLT will either 
produce an error message or otherwise fails to handle this case?

In any case, this PR should probably also document in 
https://github.com/llvm/llvm-project/blob/main/bolt/docs/BinaryAnalysis.md what 
the user of this analysis can expect in terms of false positives and true 
negatives.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-18 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

Thanks for the updates.
I managed to glance through the whole PR now.
I will do a more detailed review once my main currently remaining question is 
answered about which cases are supported and which ones are not supported. I 
think this should be written up as part of this PR in the user-facing 
documentation at bolt/docs/BinaryAnalysis.md .

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-10 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.

atrosinenko wrote:

Fixed, thanks!

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-10 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or

atrosinenko wrote:

It seems that "safe-to-dereference" and "trusted" are quite complex terms which 
are better to keep private to `PAuthGadgetScanner.cpp`. Moreover, the 
definitions of these properties can evolve over time.

I updated the descriptions to be more like "if the register was last written to 
by an authentication instruction, the program should be terminated or that 
authentication should be known to succeed". If the "input" of the checker 
sequence was produced in any other way (which is probably meaningless, though), 
this is out of scope of this function.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-08 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-08 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-08 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.

atrosinenko wrote:

Replaced with "authenticated successfully" (same as below), thanks!

> Analyzes if this basic block fully authenticates a signed pointer, including 
> triggering program termination when invalidly signed

Strictly speaking, it is perfectly correct for a pointer to be authenticated 
much earlier, provided its "safe-to-dereference" status was propagated here.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-08 Thread Anatoly Trosinenko via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional>

atrosinenko wrote:

On AArch64, at most one pair can be returned so far. Given that this method can 
hardly be inlined, I would rather keep `std::optional` for now.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-08 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko commented:

Aside from addressing the review comments, updated the tests:
* resolved FIXME in `gs-pauth-address-materialization.s`
* updated the `safe_lr_at_function_entry_nocfg` test case introduced in 
ceb9d04b92a843788e22e17dca5b542ab2780f4b

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-07 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or

kbeyls wrote:

I think that "trusted" isn't defined before this sentence?
I'm assuming that "trusted" means "has been authenticated and program 
termination is guaranteed if authentication failed"?
Maybe we should have a specific term for that?
It seems to me that "trusted" might be a too-generic term, and becomes too 
confusing to use for this.
Off the top of my head, I'm thinking maybe "fully-authenticated" might work? 
Not sure if @jacobbramley has a better suggestion?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-07 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional>

kbeyls wrote:

Before reading the rest of this patch, I would guess that in theory there could 
be multiple such (Reg, Inst) pairs in a single basic block? If so, should this 
return a list/vector/... instead of an optional?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-07 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-07 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

I only reviewed a small part of this PR so far, but will probably not have more 
time today or tomorrow to review further, so I thought I'd share my thoughts so 
far.

I think that the documentation at 
https://github.com/llvm/llvm-project/blob/main/bolt/docs/BinaryAnalysis.md 
should probably be updated to describe the added functionality in this, and 
maybe also some earlier, PRs.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-07 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.

kbeyls wrote:

"a pointer being valid" could mean many things in this target-agnostic header 
file.
Maybe the words "pointer authentication" and "correctly authenticated" or 
something similar should be used in this first sentence?
Maybe
`/// Analyzes if this basic block fully authenticates a signed pointer, 
including triggering program termination when invalidly signed` or something 
along those lines?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-07 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.

kbeyls wrote:

either "a signing..." or "oracles"?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits