* Thomas Gleixner <t...@linutronix.de> wrote:
> On Sun, 25 Nov 2018, Andy Lutomirski wrote: > > > On Nov 25, 2018, at 2:20 PM, Thomas Gleixner <t...@linutronix.de> wrote: > > > On Sun, 25 Nov 2018, Andi Kleen wrote: > > > > > >>> The current check whether two tasks belong to the same context is using > > >>> the > > >>> tasks context id. While correct, it's simpler to use the mm pointer > > >>> because > > >>> it allows to mangle the TIF_SPEC_IB bit into it. The context id based > > >>> mechanism requires extra storage, which creates worse code. > > >> > > >> [We tried similar in some really early versions, but it was replaced > > >> with the context id later.] > > >> > > >> One issue with using the pointer is that the pointer can be reused > > >> when the original mm_struct is freed, and then gets reallocated > > >> immediately to an attacker. Then the attacker may avoid the IBPB. > > >> > > >> Given it's probably hard to generate any reasonable leak bandwidth with > > >> such a complex scenario, but it still seemed better to close the hole. > > > > > > Sorry, but that's really a purely academic exercise. > > > > I would guess that it’s actually very easy to force mm_struct* reuse. > > Don’t the various allocators try to allocate hot memory? There’s nothing > > hotter than a just-freed allocation of the same size. > > Sure, but this is about a indirect branch predictor attack against > something which reuses the mm. > > So you'd need to pull off: > > P1 poisons branch predictor > P1 exit > > P2 starts and resuses mm(P1) and uses the poisoned branch predictor > > the only thing between P1 and P2 is either idle or some other kernel > thread, but no other user task. If that happens then the code would not > issue IBPB as it assumes to switch back to the same process. > > Even if you can pull that off the speculation would hit the startup code of > P2, which is truly a source of secret information. Creating a valuable > attack based on mm reuse is really less proabable than a lottery jackpot. > > So using mm is really good enough and results in better assembly code which > is surely more valuable than addressing some hypothetical hole. OTOH we could probably close even this with very little cost if we added an IBPB to non-threaded fork() and vfork()+exec() paths? Those are really slow paths compared to all the context switch paths we are trying to optimize here. Alternatively we could IBPB on the post-exit() final task struct freeing, which too is a relative slow path compared to the context switch paths. But no strong opinion. Thanks, Ingo