On Wed, 2017-07-05 at 11:02 +1000, Michael Neuling wrote:
> On Tue, 2017-07-04 at 16:45 -0400, Gustavo Romero wrote:
> > Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
> > due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.
> 
Hi Mikey,

This completely fell off my radar, we do need something merged! For what
it's worth I like the original patch.

> tm_reclaim() should have no state live in the registers once it
> returns. It should all be saved in the thread struct. The above is not
> an issue in my book.
> 

Yeah, this is something I agree with, however, if that is the case then
why have tm_recheckpoint() do partial reloads? A partial reload only
makes sense if we can be sure that reclaim will have left the state at
least (partially) correct - not with (as is the case today) one
corrupted FP or Altivec reg.

> Having a quick look at the code, I think there's an issue but we need
> something more like this (completely untested).
> 
> When we recheckpoint inside an fp unavail, we need to recheckpoint vec
> if it was enabled. Currently we only ever recheckpoint the FP which
> seems like a bug. Vice versa for the other way around.
> 

In your example, we don't need to reload VEC if we can trust that
reclaim left the checkpointed regs on the CPU correctly - this patch
achieves this. Of course I'm more than happy to reduce complexity and
not have this optimisation at all but then we should remove the entire
parameter to tm_recheckpoint(). Any in between feels dangerous.

Cyril

> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index d4e545d27e..d1184264e2 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -1589,7 +1589,7 @@ void fp_unavailable_tm(struct pt_regs *regs)
>  	 * If VMX is in use, the VRs now hold checkpointed values,
>  	 * so we don't want to load the VRs from the thread_struct.
>  	 */
> -	tm_recheckpoint(&current->thread, MSR_FP);
> +	tm_recheckpoint(&current->thread, regs->msr);
>  
>  	/* If VMX is in use, get the transactional values back */
>  	if (regs->msr & MSR_VEC) {
> @@ -1611,7 +1611,7 @@ void altivec_unavailable_tm(struct pt_regs *regs)
>  		 regs->nip, regs->msr);
>  	tm_reclaim_current(TM_CAUSE_FAC_UNAV);
>  	regs->msr |= MSR_VEC;
> -	tm_recheckpoint(&current->thread, MSR_VEC);
> +	tm_recheckpoint(&current->thread, regs->msr);
>  	current->thread.used_vr = 1;
>  
>  	if (regs->msr & MSR_FP) {

> > Later, we recheckpoint trusting that the live state of FP and VEC are ok
> > depending on the MSR.FP and MSR.VEC bits, i.e. if MSR.FP is enabled that
> > means the FP registers checkpointed when we entered in TM are correct and
> > after a treclaim, we can trust the FP live state. Similarly for VEC regs.
> > However if tm_reclaim() does not return a sane state then
> > tm_recheckpoint() will recheckpoint a corrupted state from live state
> > back to the checkpoint area.
> > 
> > This commit fixes the corruption by restoring vs0 and vs32 from the
> > ckfp_state and ckvr_state after they are used to save FPSCR and VSCR,
> > respectively.
> > 
> > The effect of the issue described above is observed, for instance, once a
> > VSX unavailable exception is caught in the middle of a transaction with
> > MSR.FP = 1 or MSR.VEC = 1. If MSR.FP = 1, then after getting back to user
> > space FP state is corrupted. If MSR.VEC = 1, then VEC state is corrupted.
> > 
> > The issue does not occur if MSR.FP = 0 and MSR.VEC = 0 because ckfp_state
> > and ckvr_state are both copied from fp_state and vr_state, respectively,
> > and on recheckpointing both states will be restored from these thread
> > structures and not from the live state.
> > 
> > The issue also does not occur if MSR.FP = 1 and MSR.VEC = 1 because that
> > implies MSR.VSX = 1, and in that case the VSX unavailable exception does
> > not happen in the middle of the transactional block.
> > 
> > Finally, this commit also fixes the MSR used to check if FP and VEC bits
> > are enabled once we are in tm_reclaim_thread(). ckpt_regs.msr is valid
> > only if giveup_all() is called *before* using ckpt_regs.msr for checks,
> > because check_if_tm_restore_required() in giveup_all() will copy
> > regs->msr to ckpt_regs.msr and so ckpt_regs.msr reflects exactly the
> > MSR that the thread had when it came off the processor.
> > 
> > No regression was observed on powerpc/tm selftests after this fix.
> > 
> > Signed-off-by: Gustavo Romero <grom...@linux.vnet.ibm.com>
> > Signed-off-by: Breno Leitao <lei...@debian.org>
> > ---
> >  arch/powerpc/kernel/process.c |  9 +++++++--
> >  arch/powerpc/kernel/tm.S      | 14 ++++++++++++++
> >  2 files changed, 21 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > index 2ad725e..ac1fc51 100644
> > --- a/arch/powerpc/kernel/process.c
> > +++ b/arch/powerpc/kernel/process.c
> > @@ -864,6 +864,13 @@ static void tm_reclaim_thread(struct thread_struct *thr,
> >  	if (!MSR_TM_SUSPENDED(mfmsr()))
> >  		return;
> >  
> > +	/* Copy regs->msr to ckpt_regs.msr, making the latter valid for
> > +	 * the checks below. check_if_tm_restore_required() in
> > +	 * giveup_all() will take care of it. Also update fp_state
> > +	 * and vr_state from live state if the live state is valid.
> > +	 */
> > +	giveup_all(container_of(thr, struct task_struct, thread));
> > +
> >  	/*
> >  	 * If we are in a transaction and FP is off then we can't have
> >  	 * used FP inside that transaction.
> >  	 * Hence the checkpointed
> > @@ -883,8 +890,6 @@ static void tm_reclaim_thread(struct thread_struct *thr,
> >  		memcpy(&thr->ckvr_state, &thr->vr_state,
> >  		       sizeof(struct thread_vr_state));
> >  
> > -	giveup_all(container_of(thr, struct task_struct, thread));
> > -
> >  	tm_reclaim(thr, thr->ckpt_regs.msr, cause);
> >  }
> >  
> > diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
> > index 3a2d041..5dfbddb 100644
> > --- a/arch/powerpc/kernel/tm.S
> > +++ b/arch/powerpc/kernel/tm.S
> > @@ -259,9 +259,17 @@ _GLOBAL(tm_reclaim)
> >  
> >  	addi	r7, r3, THREAD_CKVRSTATE
> >  	SAVE_32VRS(0, r6, r7)	/* r6 scratch, r7 transact vr state */
> > +
> > +	/* Corrupting v0 (vs32). Should restore it later. */
> >  	mfvscr	v0
> >  	li	r6, VRSTATE_VSCR
> >  	stvx	v0, r7, r6
> > +
> > +	/* Restore v0 (vs32) from ckvr_state.vr[0], otherwise we might
> > +	 * recheckpoint the wrong live value.
> > +	 */
> > +	LXVD2X_ROT(32, R0, R7)
> > +
> > dont_backup_vec:
> >  	mfspr	r0, SPRN_VRSAVE
> >  	std	r0, THREAD_CKVRSAVE(r3)
> > @@ -272,9 +280,15 @@ dont_backup_vec:
> >  	addi	r7, r3, THREAD_CKFPSTATE
> >  	SAVE_32FPRS_VSRS(0, R6, R7)	/* r6 scratch, r7 transact fp state */
> >  
> > +	/* Corrupting fr0 (vs0). Should restore it later. */
> >  	mffs	fr0
> >  	stfd	fr0, FPSTATE_FPSCR(r7)
> > +
> > +	/* Restore fr0 (vs0) from ckfp_state.fpr[0], otherwise we might
> > +	 * recheckpoint the wrong live value.
> > +	 */
> > +	LXVD2X_ROT(0, R0, R7)
> > +
> > dont_backup_fp:
> >  
> >  	/* TM regs, incl TEXASR -- these live in thread_struct. Note they've