On Tue, Mar 09, 2021 at 10:04:21AM +0800, Aili Yao wrote:
> On Mon, 8 Mar 2021 14:55:04 -0800
> "Luck, Tony" <tony.l...@intel.com> wrote:
>
> > There can be races when multiple CPUs consume poison from the same
> > page. The first into memory_failure() atomically sets the HWPoison
> > page flag and begins hunting for tasks that map this page. Eventually
> > it invalidates those mappings and may send a SIGBUS to the affected
> > tasks.
> >
> > But while all that work is going on, other CPUs see a "success"
> > return code from memory_failure() and so they believe the error
> > has been handled and continue executing.
> >
> > Fix by wrapping most of the internal parts of memory_failure() in
> > a mutex.
> >
> > Signed-off-by: Tony Luck <tony.l...@intel.com>
> > ---
...
>
> If others are OK with this method, then I am OK too.
> But I have two concerns; may you take them into account:
>
> 1. When memory_failure() returns 0 for the race condition,
>    kill_me_maybe() goes into this branch:
>
>     if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags) &&
>         !(p->mce_kflags & MCE_IN_KERNEL_COPYIN)) {
>             set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
>             sync_core();
>             return;
>     }
>
>    We place set_mce_nospec() here for a reason; please see commit
>    fd0e786d9d09024f67b.
>
> 2. When memory_failure() returns 0, the task may return to user space
>    and re-execute the instruction that triggered the previous fault.
>    This behavior implicitly depends on the related PTE having been
>    correctly set; if it has not, it will lead to an infinite loop again.
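The race and the mutex fix Tony describes can be sketched as a userspace analogue (a pthread mutex standing in for the kernel mutex; memory_failure_sketch(), page_hwpoison, and recovery_done are hypothetical names, not the actual kernel symbols):

```c
#include <pthread.h>
#include <stdatomic.h>

/* Userspace sketch of the fix: serialize the recovery work so that a
 * second CPU consuming poison from the same page waits until the first
 * has finished invalidating mappings and signalling tasks, instead of
 * seeing a premature "success" return from memory_failure(). */

static pthread_mutex_t mf_mutex = PTHREAD_MUTEX_INITIALIZER;
static atomic_int page_hwpoison;   /* stand-in for the PageHWPoison flag */
static int recovery_done;          /* set once mappings are invalidated */

int memory_failure_sketch(void)
{
    int ret;

    pthread_mutex_lock(&mf_mutex);
    if (atomic_exchange(&page_hwpoison, 1)) {
        /* Another CPU already claimed this page.  Because we waited on
         * the mutex, its recovery work is guaranteed complete before we
         * report a result, closing the window the patch describes. */
        ret = recovery_done ? 0 : -1;
        pthread_mutex_unlock(&mf_mutex);
        return ret;
    }
    /* First caller: hunt tasks mapping the page, invalidate the
     * mappings, send SIGBUS where needed... (elided in this sketch) */
    recovery_done = 1;
    pthread_mutex_unlock(&mf_mutex);
    return 0;
}
```

Without the mutex, the second caller could test the flag, see it already set, and return 0 while the first caller was still mid-recovery; with it, "success" is only observable after the work is done.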
These seem to be separate issues from memory_failure()'s concurrency issue, so I'm still expecting your patch to be merged. Do you want to update it based on the discussion (if it's concluded)?

Thanks,
Naoya Horiguchi