[PATCH] D119296: KCFI sanitizer

Sami Tolvanen via Phabricator via cfe-commits Mon, 27 Jun 2022 14:10:17 -0700

samitolvanen added inline comments.


================
Comment at: llvm/lib/Target/X86/X86ISelLowering.h:83
+    KCFI_NT_CALL,
+    KCFI_TC_RETURN,
+
----------------
samitolvanen wrote:
> pcc wrote:
> > samitolvanen wrote:
> > > joaomoreira wrote:
> > > > samitolvanen wrote:
> > > > > joaomoreira wrote:
> > > > > > I did not revise the entire patch yet. With this said, IMHO, this 
> > > > > > looks like an overcomplication of a simple problem. Is there a 
> > > > > > reason why you really need specific KCFI_ nodes instead of only 
> > > > > > embedding the hash information into an attribute at the Machine 
> > > > > > Instruction? Then, if hash == 0, it just means it is a call that 
> > > > > > doesn't need instrumentation.
> > > > > > 
> > > > > > This latter approach will require less code and should be easier to 
> > > > > > maintain compatible with other CFI approaches. If the reason is 
> > > > > > because you don't want to have a useless attribute for non-call 
> > > > > > instructions, then you could possibly have a map where you bind the 
> > > > > > call instruction with a respective hash.
> > > > > > 
> > > > > > Unless there is a strong reason for these, I would much better 
> > > > > > prefer the slim approach suggested. Either way, if there is a 
> > > > > > reason for this, I would also suggest that you at least don't name 
> > > > > > these as "KCFI_something", as in the future others might want to 
> > > > > > reuse the same structure for other CFI approaches.
> > > > > > Is there a reason why you really need specific KCFI_ nodes instead 
> > > > > > of only embedding the hash information into an attribute at the 
> > > > > > Machine Instruction?
> > > > > 
> > > > > This implementation is similar to `CALL_RVMARKER`, `CALL_BTI` and 
> > > > > basically all other pseudo call instructions in LLVM. Is adding an 
> > > > > attribute to `MachineInstr` the preferred approach instead?
> > > > > 
> > > > > > I would also suggest that you at least don't name these as 
> > > > > > "KCFI_something", as in the future others might want to reuse the 
> > > > > > same structure for other CFI approaches.
> > > > > 
> > > > > Always happy to hear suggestions for alternative naming. Did you have 
> > > > > something in mind?
> > > > > This implementation is similar to `CALL_RVMARKER`, `CALL_BTI` and 
> > > > > basically all other pseudo call instructions in LLVM. Is adding an 
> > > > > attribute to `MachineInstr` the preferred approach instead?
> > > > 
> > > > My understanding is that if, every time a new mitigation or 
> > > > optimization comes in, you create a new opcode for it, it will 
> > > > eventually bloat to non-feasibility.
> > > > 
> > > > Imagine you have some mitigation like [[ 
> > > > https://www.cs.columbia.edu/~vpk/research/kguard/ | kguard  ]] being 
> > > > implemented. Now you can have calls which are KCFI checked but not 
> > > > KGUARD checked; then KCFI not-checked but KGUARD checked; then KCFI and 
> > > > KGUARD checked.; then none-checked. And then you need all these 
> > > > variations for tail calls (which imho is a first, minor, instance of 
> > > > the problem)...
> > > > 
> > > > So, in general, my understanding is that this approach works, yeah, but 
> > > > that in the long term it could become a hassle... so ideally we should 
> > > > use attributes to define these sub-specific instructions instead of 
> > > > opcodes.
> > > > 
> > > > > 
> > > > > > I would also suggest that you at least don't name these as 
> > > > > > "KCFI_something", as in the future others might want to reuse the 
> > > > > > same structure for other CFI approaches.
> > > > > 
> > > > > Always happy to hear suggestions for alternative naming. Did you have 
> > > > > something in mind?
> > > > 
> > > > I think switching from KCFI into CFI would already be good enough, as 
> > > > in the end these are all implementing the [[ 
> > > > https://dl.acm.org/doi/10.1145/1102120.1102165 | control-flow integrity 
> > > > ]] concept.
> > > > My understanding is that if, every time a new mitigation or 
> > > > optimization comes in, you create a new opcode for it, it will 
> > > > eventually bloat to non-feasibility.
> > > 
> > > I do agree, if there are enough call variants that can be combined, this 
> > > will quickly get messy. So far this probably just hasn't been an issue. 
> > > I'm not particularly happy about the multiple pseudo instructions here 
> > > either.
> > > 
> > > @pcc, any thoughts about the best way to make this cleaner? Should we 
> > > switch to a `MachineInstr` attribute, or perhaps pass another operand 
> > > with the specific call type, or something else?
> > > 
> > > > so ideally we should use attributes to define these sub-specific 
> > > > instructions instead of opcodes.
> > > 
> > > One concern I have with using an attribute is that we have to ensure that 
> > > no pass accidentally drops it while transforming the call instruction or 
> > > replacing it with something else. That's not an issue if we have a 
> > > separate pseudo instruction with the type as an operand. Did you run into 
> > > any issues with this?
> > > 
> > > Also, how do you serialize the type attribute in MIR? Something similar 
> > > to `debug-instr-number`?
> > > 
> > > > I think switching from KCFI into CFI would already be good enough
> > > 
> > > Sure, sounds good to me. I'll change the naming once we have a consensus 
> > > on the correct approach here.
> > Would it make sense to add a field to `MachineInstr::ExtraInfo` to hold the 
> > additional information here? That way you can avoid increasing the size of 
> > `MachineInstr`.
> I'll look into it. Per offline discussion, we might need a similar 
> construction for `SDNode` to avoid increasing memory usage too much should we 
> go with type hash attributes instead.
In my testing, adding a `uint32_t` attribute to `SDNode` increased max RSS by 
~0.15% during a kernel build. As there's no `ExtraInfo` type construction in 
`SDNode` currently, adding one would increase memory usage more. Perhaps we can 
look into that if more attributes are needed?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119296/new/

https://reviews.llvm.org/D119296

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D119296: KCFI sanitizer

Reply via email to