================
@@ -2660,6 +2644,133 @@ void X86AsmPrinter::emitCallInstruction(const 
llvm::MCInst &MCI) {
   OutStreamer->emitInstruction(MCI, getSubtargetInfo());
 }
 
+// Checks whether a NOP is required after a CALL and inserts the NOP, if
+// necessary.
+void X86AsmPrinter::emitNopAfterCallForWindowsEH(const MachineInstr *MI) {
+  if (needsNopAfterCallForWindowsEH(MI))
+    EmitAndCountInstruction(MCInstBuilder(X86::NOOP));
+}
+
+// Determines whether a NOP is required after a CALL, so that Windows EH
+// IP2State tables have the correct information.
+//
+// On most Windows platforms (AMD64, ARM64, ARM32, IA64, but *not* x86-32),
+// exception handling works by looking up instruction pointers in lookup
+// tables. These lookup tables are stored in .xdata sections in executables.
+// One element of the lookup tables are the "IP2State" tables (Instruction
+// Pointer to State).
+//
+// If a function has any instructions that require cleanup during exception
+// unwinding, then it will have an IP2State table. Each entry in the IP2State
+// table describes a range of bytes in the function's instruction stream, and
+// associates an "EH state number" with that range of instructions. A value of
+// -1 means "the null state", which does not require any code to execute.
+// A value other than -1 is an index into the State table.
+//
+// The entries in the IP2State table contain byte offsets within the 
instruction
+// stream of the function. The Windows ABI requires that these offsets are
+// aligned to instruction boundaries; they are not permitted to point to a byte
+// that is not the first byte of an instruction.
+//
+// Unfortunately, CALL instructions present a problem during unwinding. CALL
+// instructions push the address of the instruction after the CALL instruction,
+// so that execution can resume after the CALL. If the CALL is the last
+// instruction within an IP2State region, then the return address (on the 
stack)
+// points to the *next* IP2State region. This means that the unwinder will
+// use the wrong cleanup funclet during unwinding.
+//
+// To fix this problem, MSVC will insert a NOP after a CALL instruction, if the
+// CALL instruction is the last instruction within an IP2State region. The NOP
+// is placed within the same IP2State region as the CALL, so that the return
+// address points to the NOP and the unwinder will locate the correct region.
+//
+// Previously, LLVM fixed this by adding 1 to the instruction offsets in the
+// IP2State table. This caused the instruction boundary to point *within* the
+// instruction after a CALL. This works for the purposes of unwinding, since
+// there are no AMD64 instructions that can be encoded in a single byte and
+// which throw C++ exceptions. Unfortunately, this violates the Windows ABI
+// specification, which requires that the IP2State table entries point to the
+// boundaries between exceptions.
+//
+// To fix this properly, LLVM will now insert a 1-byte NOP after CALL
+// instructions, in the same situations that MSVC does. In performance tests,
+// the NOP has no detectable significance. The NOP is rarely inserted, since
+// it is only inserted when the CALL is the last instruction before an IP2State
+// transition or the CALL is the last instruction before the function epilogue.
+//
+// NOP padding is only necessary on Windows AMD64 targets. On ARM64 and ARM32,
+// instructions have a fixed size so the unwinder knows how to "back up" by
+// one instruction.
+//
+// Interaction with Import Call Optimization (ICO):
+//
+// Import Call Optimization (ICO) is a compiler + OS feature on Windows which
+// improves the performance and security of DLL imports. ICO relies on using a
+// specific CALL idiom that can be replaced by the OS DLL loader. This removes
+// a load and indirect CALL and replaces it with a single direct CALL.
+//
+// To achieve this, ICO also inserts NOPs after the CALL instruction. If the
+// end of the CALL is aligned with an EH state transition, we *also* insert
+// a single-byte NOP.  **Both forms of NOPs must be preserved.**  They cannot
+// be combined into a single larger NOP; nor can the second NOP be removed.
+//
+// This is necessary because, if ICO is active and the call site is modified
+// by the loader, the loader will end up overwriting the NOPs that were 
inserted
+// for ICO. That means that those NOPs cannot be used for the correct
+// termination of the exception handling region (the IP2State transition),
+// so we still need an additional NOP instruction.  The NOPs cannot be combined
+// into a longer NOP (which is ordinarily desirable) because then ICO would
+// split one instruction, producing a malformed instruction after the ICO call.
----------------
sivadeilra wrote:

But since the purpose of the larger NOP is to be overwritten during DLL import 
resolution, that would mean that the final NOP byte would be overwritten, so 
the IP2State mapping would be incorrect. 

If you mean that an "even larger" NOP could be encoded that way, then I suppose 
that's right, but I don't see any utility in that.  It would be confusing to 
debug, too.

https://github.com/llvm/llvm-project/pull/144745
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to