> On May 14, 2019, at 10:15 AM, Andy Lutomirski <[email protected]> wrote:
>
>
>
> On May 14, 2019, at 10:00 AM, Nadav Amit <[email protected]> wrote:
>
>>> On May 14, 2019, at 1:00 AM, Paul Turner <[email protected]> wrote:
>>>
>>> From: Nadav Amit <[email protected]>
>>> Date: Fri, May 10, 2019 at 7:45 PM
>>> To: <[email protected]>
>>> Cc: Borislav Petkov, <[email protected]>, Nadav Amit, Andy
>>> Lutomirsky, Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Jann Horn
>>>
>>>> It may be useful to check in runtime whether certain assertions are
>>>> violated even during speculative execution. This can allow to avoid
>>>> adding unnecessary memory fences and at the same time check that no data
>>>> leak channels exist.
>>>>
>>>> For example, adding such checks can show that allocating zeroed pages
>>>> can return speculatively non-zeroed pages (the first qword is not
>>>> zero). [This might be a problem when the page-fault handler performs
>>>> software page-walk, for example.]
>>>>
>>>> Introduce SPEC_WARN_ON(), which checks in runtime whether a certain
>>>> condition is violated during speculative execution. The condition should
>>>> be computed without branches, e.g., using bitwise operators. The check
>>>> will wait for the condition to be realized (i.e., not speculated), and
>>>> if the assertion is violated, a warning will be thrown.
>>>>
>>>> Warnings can be provided in one of two modes: precise and imprecise.
>>>> Both mode are not perfect. The precise mode does not always make it easy
>>>> to understand which assertion was broken, but instead points to a point
>>>> in the execution somewhere around the point in which the assertion was
>>>> violated. In addition, it prints a warning for each violation (unlike
>>>> WARN_ONCE() like behavior).
>>>>
>>>> The imprecise mode, on the other hand, can sometimes throw the wrong
>>>> indication, specifically if the control flow has changed between the
>>>> speculative execution and the actual one. Note that it is not a
>>>> false-positive, it just means that the output would mislead the user to
>>>> think the wrong assertion was broken.
>>>>
>>>> There are some more limitations. Since the mechanism requires an
>>>> indirect branch, it should not be used in production systems that are
>>>> susceptible for Spectre v2. The mechanism requires TSX and performance
>>>> counters that are only available in skylake+. There is a hidden
>>>> assumption that TSX is not used in the kernel for anything else, other
>>>> than this mechanism.
>>>
>>> Nice trick!
>>
>> “Illusion." [ ignore if you don’t know the reference ]
>>
>>> Can you eliminate the indirect call by forcing an access fault to
>>> abort the transaction instead, e.g. "cmove 0, $1”?
>>>
>>> (If this works, it may also allow support on older architectures as
>>> the RTM_RETIRED.ABORT* events go back further I believe?)
>>
>> I don’t think it would work. The whole problem is that we need a counter
>> that is updated during execution and not retirement. I tried several
>> counters and could not find other appropriate ones.
>>
>> The idea behind the implementation is to affect the control flow through
>> data dependency. I may be able to do something similar without an indirect
>> branch. I’ll take a page, put the XABORT on the page and make the page NX.
>> Then, a direct jump would go to this page. The conditional-mov would change
>> the PTE to X if the assertion is violated. There should be a page-walk even
>> if the CPU finds the entry in the TLB, since this entry is NX.
>
> I think you only get a page walk if the TLB entry is not-present. I’d be a
> bit surprised if the CPU is willing to execute, even speculatively, from
> speculatively written data. Good luck!

I guess you are right (although I didn’t try it). IIRC, Jann Horn once explained to me that if CPUs used PTEs that were written speculatively, it would be a correctness issue, since the PTE needs to get to the TLB before it is used. I’ll try a different path (no concrete idea yet which one), assuming there is interest.
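
As a side note on the "computed without branches" requirement quoted above, here is a minimal userspace sketch of what such a condition might look like for the zeroed-page example. It is only an illustration: is_nonzero() and zeroed_page_violated() are hypothetical names that are not part of the patch, and nothing here shows how SPEC_WARN_ON() itself consumes the value. The compiler is also free to lower this differently, so the generated code would need to be inspected to confirm no conditional branch was emitted.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: returns 1 if v is
 * non-zero and 0 otherwise, using only arithmetic and bitwise operators.
 * For any non-zero v, either v or its two's-complement negation has the
 * top bit set, so OR-ing them and shifting down by 63 yields 1.
 */
static inline uint64_t is_nonzero(uint64_t v)
{
	return (v | (0 - v)) >> 63;
}

/*
 * Hypothetical condition for the zeroed-page example: the assertion is
 * "violated" (returns 1) if the first qword of the page is not zero.
 */
static inline uint64_t zeroed_page_violated(const uint64_t *page)
{
	return is_nonzero(page[0]);
}
```

Several such conditions can be combined with & and | instead of && and ||, which avoids the short-circuit evaluation that would normally introduce a branch.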

