RE: [PATCH v2 3/8] iommu/vt-d: Implement device_pasid domain attach ops
> I would expect real applications will try to use the same PASID for > the same IOVA map to optimize IOTLB caching. On Intel a ring3 application only gets to use one PASID. The ENQCMD instruction pick up the PASID for the process from the IA32_PASID MSR (set by OS when first access, and on context switches thereafter). Kernel users (ring0) can supply any PASID when they use the ENQCMDS instruction. Is that what you mean when you say "real applications"? -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
On Thu, Feb 10, 2022 at 08:27:50AM -0800, Fenghua Yu wrote: > Hi, Jacob, > > On Wed, Feb 09, 2022 at 07:16:14PM -0800, Jacob Pan wrote: > > Hi Fenghua, > > > > On Mon, 7 Feb 2022 15:02:48 -0800, Fenghua Yu wrote: > > > > > @@ -1047,8 +1040,6 @@ struct iommu_sva *intel_svm_bind(struct device > > > *dev, struct mm_struct *mm, void } > > > > > > sva = intel_svm_bind_mm(iommu, dev, mm, flags); > > > - if (IS_ERR_OR_NULL(sva)) > > > - intel_svm_free_pasid(mm); > > If bind fails, the PASID has no IOMMU nor CPU context. It should be safe to > > free here. > > The PASID can not be freed even if bind fails. The PASID allocated earlier > (either in this thread or in another thread) might be populated to other > threads already and being used now. > > Without freeing the PASID on bind failure, the worst case is the PASID might > not be used in the process (and will be freed on process exit anyway). > > This all matches with the PASID life time described in the commit message. But what does this mean for the user that failed that intel_svm_bind_mm()? Here's a scenario: Process sets up to use PASID capable device #1. Everything works, so the process has mm->pasid, and the IOMMU has the tables to map virtual addresses coming from device #1 using that PASID. Now the same process asks to start using PASID capable device #2, but there is a failure at intel_svm_bind_mm(). Fenghua is right that we shouldn't free the PASID. It is in use by at least one thread of the process to access device #1. But what happens with device #2? Does the caller of intel_svm_bind() do the right thing with the IS_ERR_OR_NULL return value to let the user know that device #2 isn't accessible? -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting
On Wed, Sep 29, 2021 at 06:07:22PM +, Fenghua Yu wrote: > Add > +#ifdef CONFIG_IOMMU_SUPPORT > because mm->pasid and current-pasid_activated are defined only if > CONFIG_IOMMU_SUPPORT is defined. > > > + if (user_mode(regs) && current->mm->pasid && !current->pasid_activated) > > { > > Maybe need to add "&& cpu_feature_enabled(X86_FEATURE_ENQCMD)" because > the IA32_PASID MSR is only used when ENQCMD is enabled? > > > + current->pasid_activated = 1; > > + wrmsrl(MSR_IA32_PASID, current->mm->pasid | > > MSR_IA32_PASID_VALID); > > + return; > > + } > > + > > +endif New version that addresses those issues. Has ugly #ifdef in C code :-( If that's unacceptable, then could create some stub functions, or add a call to __try_fixup_pasid() that's in a file in the iommu code that is only built when CONFIG_IOMMU_SUPPORT is set. But either of those move the details far away from the #GP handler so make extra work for anyone trying to follow along with what is happening here. -Tony --- diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index a58800973aed..5a3c87fd65de 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -528,6 +528,32 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs, #define GPFSTR "general protection fault" +/* + * When a user executes the ENQCMD instruction it will #GP + * fault if the IA32_PASID MSR has not been set up with a + * valid PASID. + * So if the process has been allocated a PASID (mm->pasid) + * AND the IA32_PASID MSR has not been initialized, try to + * fix this #GP by initializing the IA32_PASID MSR. + * If the #GP was for some other reason, it will trigger + * again, but this routine will return false and the #GP + * will be processed. + */ +static void try_fixup_pasid(void) +{ + if (!cpu_feature_enabled(X86_FEATURE_ENQCMD)) + return false; + +#ifdef CONFIG_IOMMU_SUPPORT + if (current->mm->pasid && !current->pasid_activated) { + current->pasid_activated = 1; + wrmsrl(MSR_IA32_PASID, current->mm->pasid); + return true; + } +#endif + return false; +} + DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) { char desc[sizeof(GPFSTR) + 50 + 2*sizeof(unsigned long) + 1] = GPFSTR; @@ -536,6 +562,9 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) unsigned long gp_addr; int ret; + if (user_mode(regs) && try_fixup_pasid()) + return; + cond_local_irq_enable(regs); if (static_cpu_has(X86_FEATURE_UMIP)) { ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting
On Wed, Sep 29, 2021 at 07:15:53PM +0200, Thomas Gleixner wrote: > On Wed, Sep 29 2021 at 09:59, Andy Lutomirski wrote: > > On 9/29/21 05:28, Thomas Gleixner wrote: > >> Looking at that patch again, none of this muck in fpu__pasid_write() is > >> required at all. The whole exception fixup is: > >> > >> if (!user_mode(regs)) > >> return false; > >> > >> if (!current->mm->pasid) > >> return false; > >> > >> if (current->pasid_activated) > >> return false; > > > > <-- preemption or BH here: kaboom. > > Sigh, this had obviously to run in the early portion of #GP, i.e. before > enabling interrupts. Like this? Obviously with some comment about why this is being done. diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index a58800973aed..a848a59291e7 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -536,6 +536,12 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) unsigned long gp_addr; int ret; + if (user_mode(regs) && current->mm->pasid && !current->pasid_activated) { + current->pasid_activated = 1; + wrmsrl(MSR_IA32_PASID, current->mm->pasid | MSR_IA32_PASID_VALID); + return; + } + cond_local_irq_enable(regs); if (static_cpu_has(X86_FEATURE_UMIP)) { -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
On Wed, Sep 29, 2021 at 09:58:22AM -0700, Andy Lutomirski wrote: > On 9/28/21 16:10, Luck, Tony wrote: > > Moving beyond pseudo-code and into compiles-but-probably-broken-code. > > > > > > The intent of the functions below is that Fenghua should be able to > > do: > > > > void fpu__pasid_write(u32 pasid) > > { > > u64 msr_val = pasid | MSR_IA32_PASID_VALID; > > struct ia32_pasid_state *addr; > > > > addr = begin_update_one_xsave_feature(current, XFEATURE_PASID, true); > > addr->pasid = msr_val; > > finish_update_one_xsave_feature(current); > > } > > > > This gets gnarly because we would presumably like to optimize the case where > we can do the update directly in registers. I wonder if we can do it with a > bit of macro magic in a somewhat generic way: Can we defere the optimizations until there is a use case that cares? The existing use case (fixing up the #GP fault by setting the PASID MSR) isn't performance critical. Let's just have something that is clear (or as clear as any xsave code can be) and correct. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting
> There is zero requirement to look at TIF_NEED_FPU_LOAD or > fpregs_state_valid() simply because the #GP comes straight from user > space which means the FPU registers contain the current tasks user space > state. Just to double confirm ... there is no point in the #GP handler up to this point where pre-emption can occur? -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
>> if (!(xsave->header.xfeatures & fmask)) { >> xsave->header.xfeatures |= fmask; //< >> xsaves(xsave, fmask); >> } > > I'm not sure why the FPU state is initialized here. > > For updating the PASID state, it's unnecessary to init the PASID state. > > Maybe it is necessary in other cases? Dave had suggested initializing feature state when it is unknown (could be garbage). This is my attempt to follow that guidance. I'm not confident that my tests for "is the state in registers, in memory, or is garbage" really capture all the cases. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
> But the helpers seem to be generic. They take "task" as a parameter and > handle the task as non-current case. So the helpers are not for PASID only. > They may be used by others to modify a running task's FPU state. But > It's not safe to do so. > > At least need some comments/restriction for the helpers to be used on > a running task? Fenghua, Correct. When I add some comments I'll make it very clear that it is the responsibility of the caller to make sure that the task that is targeted cannot run. Earlier in this thread Dave suggested there are two cases where these helpers might be useful: 1) Fork/clone - to set up some xsave state in the child ... but this would be done before the child is allowed to run. 2) ptrace - this is a "maybe" because ptrace already has code to handle all the xsave state as a single entity. Perhaps someone might want to change it to only modify a single feature ... but this seems unlikely. In any case the ptrace code already "stops" the target process while it is reading/writing state. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
>> fpregs_lock(); > > I'm afraid we may hit the same locking issue when we send IPI to notify > another task to modify its > PASID state. Here the API is called to modify another running task's PASID > state as well without a right lock. > fpregs_lock() is not enough to deal with this, I'm afraid. We don't send IPI any more to change PASID state. The only place that the current patch series touches the PASID MSR is in the #GP fault handler. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
On Tue, Sep 28, 2021 at 11:50:37PM +, Fenghua Yu wrote: > If xfeatures's feature bit is 0, xsaves will not write its init value to the > memory due to init optimization. So the xsaves will do nothing and the > state is not initialized and may have random data. > Setting TIF_NEED_FPU_LOAD cannot guaranteed to execute XRSTORS on exiting > to user. In fpregs_restore_userregs(): > if (!fpregs_state_valid(fpu, cpu)) { > ... > __restore_fpregs_from_fpstate(&fpu->state, mask); > ... > } > > fpregs state should be invalid to get the XRSTROS executed. > > So setting TIF_NEED_FPU_LOAD may get the FPU register unchanged on exiting > to user. Does this help? Changed lines marked with //< -Tony void *begin_update_one_xsave_feature(struct task_struct *tsk, enum xfeature xfeature, bool full) { struct xregs_state *xsave = &tsk->thread.fpu.state.xsave; struct xregs_state *xinit = &init_fpstate.xsave; u64 fmask = 1ull << xfeature; void *addr; BUG_ON(!(xsave->header.xcomp_bv & fmask)); fpregs_lock(); addr = __raw_xsave_addr(xsave, xfeature); if (full || tsk != current) { memcpy(addr, __raw_xsave_addr(xinit, xfeature), xstate_sizes[xfeature]); goto out; } if (!(xsave->header.xfeatures & fmask)) { xsave->header.xfeatures |= fmask; //< xsaves(xsave, fmask); } out: xsave->header.xfeatures |= fmask; return addr; } void finish_update_one_xsave_feature(struct task_struct *tsk) { set_ti_thread_flag(task_thread_info(tsk), TIF_NEED_FPU_LOAD); if (tsk == current) //< __cpu_invalidate_fpregs_state();//< fpregs_unlock(); } ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
Moving beyond pseudo-code and into compiles-but-probably-broken-code. The intent of the functions below is that Fenghua should be able to do: void fpu__pasid_write(u32 pasid) { u64 msr_val = pasid | MSR_IA32_PASID_VALID; struct ia32_pasid_state *addr; addr = begin_update_one_xsave_feature(current, XFEATURE_PASID, true); addr->pasid = msr_val; finish_update_one_xsave_feature(current); } So here's the two new functions that would be added to arch/x86/kernel/fpu/xstate.c void *begin_update_one_xsave_feature(struct task_struct *tsk, enum xfeature xfeature, bool full) { struct xregs_state *xsave = &tsk->thread.fpu.state.xsave; struct xregs_state *xinit = &init_fpstate.xsave; u64 fmask = 1ull << xfeature; void *addr; BUG_ON(!(xsave->header.xcomp_bv & fmask)); fpregs_lock(); addr = __raw_xsave_addr(xsave, xfeature); if (full || tsk != current) { memcpy(addr, __raw_xsave_addr(xinit, xfeature), xstate_sizes[xfeature]); goto out; } /* could optimize some cases where xsaves() isn't fastest option */ if (!(xsave->header.xfeatures & fmask)) xsaves(xsave, fmask); out: xsave->header.xfeatures |= fmask; return addr; } void finish_update_one_xsave_feature(struct task_struct *tsk) { set_ti_thread_flag(task_thread_info(tsk), TIF_NEED_FPU_LOAD); fpregs_unlock(); } -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
On Tue, Sep 28, 2021 at 12:19:22PM -0700, Dave Hansen wrote: > On 9/28/21 11:50 AM, Luck, Tony wrote: > > On Mon, Sep 27, 2021 at 04:51:25PM -0700, Dave Hansen wrote: > ... > >> 1. Hide whether we need to write to real registers > >> 2. Hide whether we need to update the in-memory image > >> 3. Hide other FPU infrastructure like the TIF flag. > >> 4. Make the users deal with a *whole* state in the replace API > > > > Is that difference just whether you need to save the > > state from registers to memory (for the "update" case) > > or not (for the "replace" case ... where you can ignore > > the current register, overwrite the whole per-feature > > xsave area and mark it to be restored to registers). > > > > If so, just a "bool full" argument might do the trick? > > I want to be able to hide the complexity of where the old state comes > from. It might be in registers or it might be in memory or it might be > *neither*. It's possible we're running with stale register state and a > current->...->xsave buffer that has XFEATURES&XFEATURE_FOO 0. > > In that case, the "old" copy might be memcpy'd out of the init_task. > Or, for pkeys, we might build it ourselves with init_pkru_val. So should there be an error case if there isn't an "old" state, and the user calls: p = begin_update_one_xsave_feature(XFEATURE_something, false); Maybe instead of an error, just fill it in with the init state for the feature? > > Also - you have a "tsk" argument in your pseudo code. Is > > this needed? Are there places where we need to perform > > these operations on something other than "current"? > > Two cases come to mind: > 1. Fork/clone where we are doing things to our child's XSAVE buffer > 2. ptrace() where we are poking into another task's state > > ptrace() goes for the *whole* buffer now. I'm not sure it would need > this per-feature API. I just call it out as something that we might > need in the future. Ok - those seem ok ... it is up to the caller to make sure that the target task is in some "not running, and can't suddenly start running" state before calling these functions. > > > pseudo-code: > > > > void *begin_update_one_xsave_feature(enum xfeature xfeature, bool full) > > { > > void *addr; > > > > BUG_ON(!(xsave->header.xcomp_bv & xfeature)); > > > > addr = __raw_xsave_addr(xsave, xfeature); > > > > fpregs_lock(); > > > > if (full) > > return addr; > > If the feature is marked as in the init state in the buffer > (XSTATE_BV[feature]==0), this addr *could* contain total garbage. So, > we'd want to make sure that the memory contents have the init state > written before handing them back to the caller. That's not strictly > required if the user is writing the whole thing, but it's the nice thing > to do. Nice guys waste CPU cycles writing to memory that is just going to get written again. > > > if (xfeature registers are "live") > > xsaves(xstate, 1 << xfeature); > > One little note: I don't think we would necessarily need to do an XSAVES > here. For PKRU, for instance, we could just do a rdpkru. Like this? if (tsk == current) { switch (xfeature) { case XFEATURE_PKRU: *(u32 *)addr = rdpkru(); break; case XFEATURE_PASID: rdmsrl(MSR_IA32_PASID, msr); *(u64 *)addr = msr; break; ... any other "easy" states ... default: xsaves(xstate, 1 << xfeature); break; } } > > > return addr; > > } > > > > void finish_update_one_xsave_feature(enum xfeature xfeature) > > { > > mark feature modified > > I think we'd want to do this at the "begin" time. Also, do you mean we > should set XSTATE_BV[feature]? Begin? End? It's all inside fpregs_lock(). But whatever seems best. Yes, I think that this means set XSTATE_BV[feature] ... but I'm relying on you as the xsave expert to help get the subtle bits right so the Andy Lutomirski can smile at this code. > > set TIF bit > > Since the XSAVE buffer was updated, it now contains the canonical FPU > state. It may have diverged from the register state, thus we need to > set TIF_NEED_FPU_LOAD. Yes, that's the TIF bit my pseudo-code intended. &g
Re: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
On Mon, Sep 27, 2021 at 04:51:25PM -0700, Dave Hansen wrote: > That's close to what we want. > > The size should probably be implicit. If it isn't implicit, it needs to > at least be double-checked against the state sizes. > > Not to get too fancy, but I think we also want to have a "replace" > operation which is separate from the "update". Think of a case where > you are trying to set a bit: > > struct pkru_state *pk = start_update_xstate(tsk, XSTATE_PKRU); > pk->pkru |= 0x100; > finish_update_xstate(tsk, XSTATE_PKRU, pk); > > versus setting a whole new value: > > struct pkru_state *pk = start_replace_xstate(tsk, XSTATE_PKRU); > memset(pkru, sizeof(*pk), 0); > pk->pkru = 0x1234; > finish_replace_xstate(tsk, XSTATE_PKRU, pk); > > They look similar, but they actually might have very different costs for > big states like AMX. We can also do some very different debugging for > them. In normal operation, these _can_ just return pointers directly to > the fpu...->xstate in some case. But, if we're debugging, we could > kmalloc() a buffer and do sanity checking before it goes into the task > buffer. > > We don't have to do any of that fancy stuff now. We don't even need to > do an "update" if all we need for now for XFEATURE_PASID is a "replace". > > 1. Hide whether we need to write to real registers > 2. Hide whether we need to update the in-memory image > 3. Hide other FPU infrastructure like the TIF flag. > 4. Make the users deal with a *whole* state in the replace API Is that difference just whether you need to save the state from registers to memory (for the "update" case) or not (for the "replace" case ... where you can ignore the current register, overwrite the whole per-feature xsave area and mark it to be restored to registers). If so, just a "bool full" argument might do the trick? Also - you have a "tsk" argument in your pseudo code. Is this needed? Are there places where we need to perform these operations on something other than "current"? pseudo-code: void *begin_update_one_xsave_feature(enum xfeature xfeature, bool full) { void *addr; BUG_ON(!(xsave->header.xcomp_bv & xfeature)); addr = __raw_xsave_addr(xsave, xfeature); fpregs_lock(); if (full) return addr; if (xfeature registers are "live") xsaves(xstate, 1 << xfeature); return addr; } void finish_update_one_xsave_feature(enum xfeature xfeature) { mark feature modified set TIF bit fpregs_unlock(); } -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
On Thu, Sep 23, 2021 at 04:17:05PM -0700, Andy Lutomirski wrote: > > + fpregs_lock(); > > + > > + /* > > +* If the task's FPU doesn't need to be loaded or is valid, directly > > +* write the IA32_PASID MSR. Otherwise, write the PASID state and > > +* the MSR will be loaded from the PASID state before returning to > > +* the user. > > +*/ > > + if (!test_thread_flag(TIF_NEED_FPU_LOAD) || > > + fpregs_state_valid(fpu, smp_processor_id())) { > > + wrmsrl(MSR_IA32_PASID, msr_val); > > Let me try to decode this. > > If the current task's FPU state is live or if the state is in memory but the > CPU regs just happen to match the copy in memory, then write the MSR. Else > write the value to memory. > > This is wrong. If !TIF_NEED_FPU_LOAD && fpregs_state_valid, you MUST NOT > MODIFY FPU STATE. This is not negotiable -- you will break coherence between > CPU regs and the memory image. The way you modify the current task's state > is either you modify it in CPU regs (if the kernel knows that the CPU regs > are the one and only source of truth) OR you modify it in memory and > invalidate any preserved copies (by zapping last_cpu). > > In any event, that particular bit of logic really doesn't belong in here -- > it belongs in some helper that gets it right, once. Andy, A helper sounds like a good idea. Can you flesh out what you would like that to look like? Is it just the "where is the live register state?" so the above could be written: if (xsave_state_in_memory(args ...)) update pasid bit of xsave state in memory else wrmsrl(MSR_IA32_PASID, msr_val); Or are you thinking of a helper that does both the check and the update ... so the code here could be: update_one_xsave_feature(XFEATURE_PASID, &msr_val, sizeof(msr_val)); With the helper being something like: void update_one_xsave_feature(enum xfeature xfeature, void *data, size_t size) { if (xsave_state_in_memory(args ...)) { addr = get_xsave_addr(xsave, xfeature); memcpy(addr, data, size); xsave->header.xfeatures |= (1 << xfeature); return; } switch (xfeature) { case XFEATURE_PASID: wrmsrl(MSR_IA32_PASID, *(u64 *)data); break; case each_of_the_other_XFEATURE_enums: code to update registers for that XFEATURE } } either way needs the definitive correct coding for xsave_state_in_memory() -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting
On Fri, Sep 24, 2021 at 04:03:53PM -0700, Andy Lutomirski wrote: > 1. The context switch code needs to resync PASID. Unfortunately, this adds > some overhead to every context switch, although a static_branch could > minimize it for non-PASID users. Any solution that adds to context switch time isn't going to meet the definition of "perfect" either. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting
On Fri, Sep 24, 2021 at 03:18:12PM +0200, Thomas Gleixner wrote: > On Thu, Sep 23 2021 at 19:48, Thomas Gleixner wrote: > > On Thu, Sep 23 2021 at 09:40, Tony Luck wrote: > > > > fpu_write_task_pasid() can just grab the pasid from current->mm->pasid > > and be done with it. > > > > The task exit code can just call iommu_pasid_put_task_ref() from the > > generic code and not from within x86. > > But OTOH why do you need a per task reference count on the PASID at all? > > The PASID is fundamentaly tied to the mm and the mm can't go away before > the threads have gone away unless this magically changed after I checked > that ~20 years ago. It would be possible to avoid a per-task reference to the PASID by taking an extra reference when mm->pasid is first allocated using the CONFIG_PASID_TASK_REFS you proposed yesterday to define a function to take the extra reference, and another to drop it when the mm is finally freed ... with stubs to do nothing on architectures that don't create per-task PASID context. This solution works, but is functionally different from Fenghua's proposal for this case: Process clones a task task binds a device task accesses device using PASID task unbinds device task exits (but process lives on) Fenghua will free the PASID because the reference count goes back to zero. The "take an extra reference and keep until the mm is freed" option would needlessly hold onto the PASID. This seems like an odd usage case ... even if it does exist, a process that does this may spawn another task that does the same thing. If you think it is sufficiently simpler to use the "extra reference" option (invoking the "perfect is the enemy of good enough" rule) then we can change. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
On Fri, Sep 24, 2021 at 03:37:22PM +0200, Peter Zijlstra wrote: > On Thu, Sep 23, 2021 at 10:14:42AM -0700, Luck, Tony wrote: > > On Wed, Sep 22, 2021 at 11:07:22PM +0200, Peter Zijlstra wrote: > > > On Mon, Sep 20, 2021 at 07:23:45PM +, Fenghua Yu wrote: > > > > @@ -538,6 +547,9 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) > > > > > > > > cond_local_irq_enable(regs); > > > > > > > > + if (user_mode(regs) && fixup_pasid_exception()) > > > > + goto exit; > > > > + > > > > > So you're eating any random #GP that might or might not be PASID > > > related. And all that witout a comment... Enlighten? > > > > This is moderately well commented inside the fixup_pasid_exception() > > function. Another copy of the comments here at the call-site seems > > overkill. > > +static bool fixup_pasid_exception(void) > +{ > + if (!cpu_feature_enabled(X86_FEATURE_ENQCMD)) > + return false; > + > + return __fixup_pasid_exception(); > +} > > /me goes looking for comments in that function, lemme get out the > electron microscope, because I can't seem to spot them with the naked > eye. If you have ctags installed then a ctrl-] on that __fixup_pasid_exception() will take you to the function with the comments. No electron microscope needed. + +/* + * Try to figure out if there is a PASID MSR value to propagate to the + * thread taking the #GP. + */ +bool __fixup_pasid_exception(void) +{ + u32 pasid; + + /* +* This function is called only when this #GP was triggered from user +* space. So the mm cannot be NULL. +*/ + pasid = current->mm->pasid; + + /* If no PASID is allocated, there is nothing to propagate. */ + if (pasid == PASID_DISABLED) + return false; + + /* +* If the current task already has a valid PASID MSR, then the #GP +* fault must be for some non-ENQCMD related reason. +*/ + if (current->has_valid_pasid) + return false; + + /* Fix up the MSR by the PASID in the mm. */ + fpu__pasid_write(pasid); + current->has_valid_pasid = 1; + + return true; +} -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting
On Thu, Sep 23, 2021 at 04:09:18PM -0700, Andy Lutomirski wrote: > On Mon, Sep 20, 2021, at 12:23 PM, Fenghua Yu wrote: > I think this is unnecessarily complicated because it's buying in to the > existing ISA misconception that PASID has anything to do with a task. > A PASID belongs to an mm, full stop. Now the ISA is nasty and we have > tasks that have *noticed* that their mm has a PASID and tasks that have > not noticed this fact, but that should be irrelevant to essentially > everything except the fault handler. > > So just refcount the thing the obvious way: take a reference when you > stick the PASID in the mm_struct and drop the reference in __mmdrop(). > Problem solved. You could probably drop it more aggressively in > __mmput(), and the comment explaining why is left as an exercise to the > reader -- if a kernel thread starts doing ENQCMD, we have worse things > to worry about :) That doesn't match well with the non-x86 usage of PASIDs. The code there bumps the reference count on each device bind, and decrements on each device unbind. If we don't keep a reference count for each task that has IA32_PASID set up we could have this sequence 1) Process binds to a PASID capable device 2) Task uses ENQCMD, so PASID MSR is set up. 3) Process unbinds the device, reference count on PASID goes to zero. PASID is freed. PASID is reallocated to another task. 4) Task from step #2 uses ENQCMD to submit a descriptor and device now processes virtual addresses based on mappings in the new task. Now you might say that at step 3 we need to hunt down all the tasks that have PASID enabled and disabled ... but that's the same painful code that we avoided when we said that we would not make Linux hand out a PASID to all existing tasks in a process on the first bind operation. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
On Wed, Sep 22, 2021 at 11:07:22PM +0200, Peter Zijlstra wrote: > On Mon, Sep 20, 2021 at 07:23:45PM +, Fenghua Yu wrote: > > @@ -538,6 +547,9 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) > > > > cond_local_irq_enable(regs); > > > > + if (user_mode(regs) && fixup_pasid_exception()) > > + goto exit; > > + > So you're eating any random #GP that might or might not be PASID > related. And all that witout a comment... Enlighten? This is moderately well commented inside the fixup_pasid_exception() function. Another copy of the comments here at the call-site seems overkill. Would it help to change the name to try_fixup_pasid_exception() to make it clearer that this is just a heuristic that may or may not fix this particular #GP? -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting
On Thu, Sep 23, 2021 at 04:36:50PM +0200, Thomas Gleixner wrote: > On Mon, Sep 20 2021 at 19:23, Fenghua Yu wrote: > > > > +#ifdef CONFIG_INTEL_IOMMU_SVM > > +void pasid_put(struct task_struct *tsk, struct mm_struct *mm); > > +#else > > +static inline void pasid_put(struct task_struct *tsk, struct mm_struct > > *mm) { } > > +#endif > > This code is again defining that PASID is entirely restricted to > INTEL. It's true, that no other vendor supports this, but PASID is > a non-vendor specific concept. > > Sticking this into INTEL code means that any other PASID implementation > has to rip it out again from INTEL code and make it a run time property. > > The refcounting issue should be the same for all PASID mechanisms which > attach PASID to a mm. What's INTEL specific about that? > > So can we pretty please do that correct right away? It's a bit messy (surprise). There are two reasons to hold a refcount on a PASID 1) The process has done a bind on a device that uses PASIDs This one isn't dependent on Intel. 2) A task within a process is using ENQCMD (and thus holds a reference on the PASID because IA32_PASID MSR for this task has the PASID value loaded with the enable bit set). This is (currently) Intel specific (until others implement an ENQCMD-like feature to allow apps to access PASID enabled devices without going through the OS). Perhaps some better function naming might help? E.g. have a task_pasid_put() function that handles the process exit case separatley from the device unbind case. void task_pasid_put(void) { if (!cpu_feature_enabled(X86_FEATURE_ENQCMD)) return; if (current->has_valid_pasid) { mutex_lock(&pasid_mutex); iommu_sva_free_pasid(mm); mutex_unlock(&pasid_mutex); } } -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 4/8] x86/traps: Demand-populate PASID MSR via #GP
>> > +static bool fixup_pasid_exception(void) >> > +{ >> > + if (!cpu_feature_enabled(X86_FEATURE_ENQCMD)) >> > + return false; >> > + >> > + return __fixup_pasid_exception(); >> > +} > > That is, shouldn't the above at the very least decode the instruction > causing the #GP and check it's this ENQCMD thing? It can't reliably do that because some other thread in the process may have re-written the memory that regs->ip points at (bizarre case, but I think Dave Hansen brought it up). So it would just add extra code, and still only be a hint. Without the check this sequence is possible: 1) Process binds an accelerator (so mm->pasid is set) 2) Task in the process executes something other than ENQCMD that gets a #GP 3) Kernel says "Oh, mm->pasid is set, I'll initialize the IA32_PASID MSR to see if that fixes it" 4) Nope. Re-executing the instruction at step #2 just gives another #GP 5) Kernel says "I already set IA32_PASID, so this must be something else ... do regular #GP actions" Now if the task catches the signal that results from step #5 and avoids termination, it will have IA32_PASID set ... but to the right value should it go on to actually execute ENQCMD at some future point. So the corner case from not knowing whether this #GP was from ENQCMD or not is harmless. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid()
On Wed, Jun 09, 2021 at 04:34:31PM -0700, Andy Lutomirski wrote: > On 6/9/21 10:32 AM, Luck, Tony wrote: > >>> I've told Fenghua to dig out the previous iteration of this patch where > >>> the plan was to lazily fix the PASID_MSR in other threads in the #GP > >>> handler. > >> > >> Blech. Also this won't work for other PASID-like features. > >> > >> I have a half-written patch to fix this up for real. Stay tuned. > > > > Andy: Any update on this? > > > > -Tony > > > > Let me try to merge my pile with tglx's pile and come up with something > halfway sane. Looks like Thomas' pile is done (well done enough to be queued in TIP). Can we see your "pile" soon? -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH] x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid()
>> I've told Fenghua to dig out the previous iteration of this patch where >> the plan was to lazily fix the PASID_MSR in other threads in the #GP >> handler. > > Blech. Also this won't work for other PASID-like features. > > I have a half-written patch to fix this up for real. Stay tuned. Andy: Any update on this? -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH] x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid()
>> ... so on a PASID system, your trivial reproducer would theoretically >> fire the same way and corrupt FPU state just as well. > > This is worse and you can't selftest it because the IPI can just hit in > the middle of _any_ FPU state operation and corrupt state. That sounds like we should abandon the "IPI all the other threads to force enable the PASID for them" approach. It would just be a nightmare of papering over cracks when the IPI was delivered at some inconvenient moment when the recipient was in the middle of touching xsave state. I've told Fenghua to dig out the previous iteration of this patch where the plan was to lazily fix the PASID_MSR in other threads in the #GP handler. That algorithm is very simple and easy to check. Pseudo-code: #GP if (usermode && current->mm->pasid && rdmsr(PASID_MSR) != valid) { wrmsr(PASID_MSR, current->mm->pasid | PASID_VALID); return; } Worst case is that some thread of a multi-threaded process that is using PASID takes some unrelated #GP ... this code will try to fix it by enabling the PASID_MSR. That will just #GP a second time and this test will see the MSR is already set, so fall into the usual #GP handling code. Seems like a better direction than trying to fix the IPI method. The virtualization folks will like this way more because IPI in guest causes a couple of VMEXIT so is somewhat expensive. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 1/2] iommu/sva: Tighten SVA bind API with explicit flags
On Thu, May 13, 2021 at 12:46:21PM -0700, Jacob Pan wrote: > It seems there are two options: > 1. Add a new IOMMU API to set up a system PASID with a *separate* IOMMU page > table/domain, mark the device is PASID only with a flag. Use DMA APIs > to explicit map/unmap. Based on this PASID-only flag, Vendor IOMMU driver > will decide whether to use system PASID domain during map/unmap. Not clear > if we also need to make IOVA==kernel VA. > > 2. Add a new IOMMU API to setup a system PASID which points to init_mm.pgd. > This API only allows trusted device to bind with the system PASID at its > own risk. There is no need for DMA API. This is the same as the current > code except with an explicit API. > > Which option? Option #1 looks cleaner to me. Option #2 gives access to bits of memory that the users of system PASID shouldn't ever need to touch ... just map regions of memory that the kernel has a "struct page" for. What does "use DMA APIs to explicitly map/unmap" mean? Is that for the whole region? I'm expecting that once this system PASID has been initialized, then any accelerator device with a kernel use case would use the same PASID. I.e. DSA for page clearing, IAX for ZSwap compression & decompression, etc. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v4 1/2] iommu/sva: Tighten SVA bind API with explicit flags
> If you want this then be explicit about what it is you are making when > building the API. Don't try to hide it under some generic idea of > "kernel PCI DMA SVA" So, a special API call (that Dave can call from IDXD) to set up this kernel PASID. With suitable documentation to explain the scope. Maybe with a separate CONFIG option so it can be completely stubbed out (IDXD does *NOT* "select" this option ... users have to explicitly pick it). > I could easily see an admin option to "turn this off" entirely as > being too dangerous, especially if the users have no interest in IDXD. And a kernel command line option to block IDXD from using that special API ... for users on generic kernels who want to block this use model (but still use IDXD in non-kernel cases). Users who don't want IDXD at all can block loading of the driver. Would that work? -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 1/2] iommu/sva: Tighten SVA bind API with explicit flags
On Thu, May 13, 2021 at 02:33:03PM -0300, Jason Gunthorpe wrote: > The page table under the kernel PASID should behave the same way that > the kernel would operate the page table assigned to a kernel RID. > > If the kernel has security off then the PASID should map to all > physical memory, just like the RID does. > > If security is on then every DMA map needs to be loaded into the > PASID's io page table no different than a RID page table. > > "kernel SVA" is, IMHO, not a desirable thing, it completely destroys > the kernel's DMA security model. > > > If people want to use an accelerator on memory allocated by vmalloc() > > things will get more complicated. But maybe we can delay solving that > > problem until someone comes up with a real use case that needs to > > do this? > > If you have a HW limitation that the device can only issue TLPs > with a PASID, even for kernel users, then I think the proper thing is > to tell the IOMMU layer than a certain 'struct device' enters > PASID-only mode and the IOMMU layer should construct an appropriate > PASID and flow the dma operations through it. > > Pretending the DMA layer doesn't exist and that PASID gets a free pass > is not OK in the kernel. I can see why a tight security model is needed to stop random devices having access to mamory that they should not be able to access. Now that PCIe devices can be plugged into Thunderbolt ports on computers, nobody wants to repeat the disaster that Firewire ports created for systems over a decade ago. But I'd like to challege the one-size-fits-all policy. There's a big difference between a random device plugged into a port (which may even lie about its VendorID/DeviceID) and a device that is part of the CPU socket. I'd like people to think of DSA as an extension to the instruction set. It implements asynchronous instructions like "MEMFILL" and "MEMCOPY". These can be limited in scope when executed in user processes or guests. But when executed by the host OS ring0 code they can have all the same access that ring0 code has when it dereferences a pointer. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v4 1/2] iommu/sva: Tighten SVA bind API with explicit flags
> For shared workqueue, it can only generate DMA request with PASID. The > submission is done by ENQCMDS (S for supervisor) instruction. > > If we were not to share page tables with init_mm, we need a system PASID > that doing the same direct mapping in IOMMU page tables. Note that for the currently envisioned kernel use cases for accelerators it would be OK for this system PASID to just provide either: 1) A 1:1 mapping for physical addresses. Kernel users of the accelerators would provide physical addresses in descriptors. 2) The same mapping that the kernel uses for its "1:1" map of all physical memory. Users would use kernel virtual addresses in that "1:1" range (e.g. those obtained from page_to_virt(struct page *p);) If people want to use an accelerator on memory allocated by vmalloc() things will get more complicated. But maybe we can delay solving that problem until someone comes up with a real use case that needs to do this? -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v4 12/12] x86/traps: Fix up invalid PASID
On Fri, Jun 26, 2020 at 11:44:50AM +0200, Peter Zijlstra wrote: > On Thu, Jun 25, 2020 at 01:17:22PM -0700, Fenghua Yu wrote: > > > +static bool fixup_pasid_exception(void) > > +{ > > + if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM)) > > + return false; > > + if (!static_cpu_has(X86_FEATURE_ENQCMD)) > > + return false; > > elsewhere you had another variation: > > + if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM)) > + return; > + > + if (!cpu_feature_enabled(X86_FEATURE_ENQCMD)) > + return; > > Which is it, and why do we need the CONFIG thing when combined with the > enabled thing? Do we have a standard way of coding for a feature that depends on multiple other features? For this case the system needs to both suport the ENQCMD instruction, and also have kernel code that programs the IOMMU. And/or guidance on when to use each of the very somewhat simlar looking boot_cpu_has() static_cpu_has() IS_ENABLED() cpu_feature_enabled() options? -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v3 01/13] iommu: Change type of pasid to unsigned int
> Is PASID in the uapi at all? $ grep pasid include/uapi/linux/iommu.h * @pasid: Process Address Space ID __u32 pasid; * @pasid: Process Address Space ID __u32 pasid; * @pasid: Process Address Space ID __u32 pasid; * - If the PASID bit is set, the @pasid field is populated and the invalidation * @pasid: process address space ID __u64 pasid; * struct iommu_inv_pasid_info - PASID Selective Invalidation Structure * - If the PASID bit is set, the @pasid field is populated and the invalidation * @pasid: process address space ID struct iommu_inv_pasid_info { __u64 pasid; * @pasid_info: invalidation data when @granularity is %IOMMU_INV_GRANU_PASID struct iommu_inv_pasid_info pasid_info; * struct iommu_gpasid_bind_data_vtd - Intel VT-d specific data on device and guest struct iommu_gpasid_bind_data_vtd { * struct iommu_gpasid_bind_data - Information about device and guest PASID binding * @hpasid: Process address space ID used for the guest mm in host IOMMU * @gpasid: Process address space ID used for the guest mm in guest IOMMU struct iommu_gpasid_bind_data { __u64 hpasid; __u64 gpasid; struct iommu_gpasid_bind_data_vtd vtd; ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v2 12/12] x86/traps: Fix up invalid PASID
> So what’s the RDMSR for? Surely you > have some state somewhere that says “this task has a PASID.” > Can’t you just make sure that stays in sync with the MSR? Then, on #GP, if > the task already has a PASID, you know the MSR is set. We have state that says the process ("mm") has been allocated a PASID. But not for each task. E.g. a process may clone a bunch of tasks, then one of them opens a device that needs a PASID. We did internally have patches to go hunt down all those other tasks and force a PASID onto each. But the code was big and ugly. Also maybe the wrong thing to do if those threads didn't ever need to access the device. PeterZ suggested that we can add a bit to the task structure for this purpose. Fenghua is hesitant about adding an x86 only bit there. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v2 12/12] x86/traps: Fix up invalid PASID
> Are we planning to keep PASID live once a task has used it once or are we > going to swap it lazily? If the latter, a percpu variable might be better. Current plan is "touch it once and the task owns it until exit(2)" Maybe someday in the future when we have data on how applications actually use accelerators we could look at something more complex if usage patterns look like it would be beneficial. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH v2 12/12] x86/traps: Fix up invalid PASID
>> The heuristic always initializes the MSR with the per mm PASID IIF the >> mm has a valid PASID but the MSR doesn't have one. This heuristic usually >> happens only once on the first #GP in a thread. > > But it doesn't guarantee the PASID is the right one. Suppose both the mm > has a PASID and the MSR has a VALID one, but the MSR isn't the mm one. > Then we NO-OP. So if the exception was due to us having the wrong PASID, > we stuck. ENQCMD only checks the 'valid' bit of the IA32_PASID MSR to decide whether to #GP or not. H/W has no concept of the "right" pasid value. If IA32_PASID is valid with the wrong value ... then the system is about to see some major corruption because the operations in the accelerator are not going to translate to the physical addresses for pages owned by the process that issued the ENQCMD. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 5/7] x86/mmu: Allocate/free PASID
> There are two users of a PASID, mm and device driver(FD). If > either one is not done with the PASID, it cannot be reclaimed. As you > mentioned, it could take a long time for the driver to abort. If the > abort ends *after* mmdrop, we are in trouble. > If driver drops reference after abort/drain PASID is done, then we are > safe. I don't think there should be an abort ... suppose the application requested the DSA to copy some large block of important results from DDR4 to persistent memory. Driver should wait for that copy operation to complete. Note that for the operation to succeed, the kernel should still be processing and fixing page faults for the "mm" (some parts of the data that the user wanted to save to persistent memory may have been paged out). The wait by the DSA diver needs to by synchronous ... the "mm" cannot be freed until DSA says all the pending operations have completed. Even without persistent memory, there are cases where you want the operations to complete (mmap'd files, shared memory with other processes). -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 5/7] x86/mmu: Allocate/free PASID
>> So the driver needs to use flush/drain operations to make sure all >> the in-flight work has completed before releasing/re-using the PASID. >> > Are you suggesting we should let driver also hold a reference of the > PASID? The sequence for bare metal is: process is queuing requests to DSA process exits (either deliberately, or crashes, or is killed) kernel does exit processing DSA driver is called as part of tear down of "mm" issues drain/flush commands to ensure that all queued operations on the PASID for this mm have completed PASID can be freed There's a 1:1 map from "mm" to PASID ... so reference counting seems like overkill. Once the kernel is in the "exit" path, we know that no more work can be queued using this PASID. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 5/7] x86/mmu: Allocate/free PASID
> If fd release cleans up then how should there be something in flight at > the final mmdrop? ENQCMD from the user is only synchronous in that it lets the user know their request has been added to a queue (or not). Execution of the request may happen later (if the device is busy working on requests for other users). The request will take some time to complete. Someone told me the theoretical worst case once, which I've since forgotten, but it can be a long time. So the driver needs to use flush/drain operations to make sure all the in-flight work has completed before releasing/re-using the PASID. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 6/7] x86/traps: Fix up invalid PASID
> Just for the record I also suggested to have a proper errorcode in the > #GP for ENQCMD and I surely did not suggest to avoid decoding the user > instructions. Thomas, Is the heuristic to avoid decoding the user instructions OK (you are just pointing out that you should not be given credit for this part of the idea)? Or are you saying that you'd like to see the instruction checked to see that it was an ENQCMD? -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC] avoid indirect calls for DMA direct mappings v2
> But that might not be your fault. My ancient system is getting flaky. A v4.19 > build that > has booted before is also resetting :-( After a power-cycle (and some time to let the machine cool off). System now boots with your patch series plus the __phys_to_pfn() #define So if you can figure the right way to fix that, you are good to go. Tested-by: Tony Luck ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC] avoid indirect calls for DMA direct mappings v2
> This should fix it: ... > +#include Not quite. Still have an issue with __phys_to_pfn(paddr) Trying ti #include gave we a raft of redefined macros. So I just added #define __phys_to_pfn(paddr)PHYS_PFN(paddr) to arch/ia64/mm/init.c That made the build work. But boot spontaneously resets after: mptsas: ioc1: attaching ssp device: fw_channel 0, fw_id 6, phy 6, sas_addr 0x5000c5000ecada69 scsi 5:0:0:0: Direct-Access SEAGATE ST9146802SS 0003 PQ: 0 ANSI: 5 EFI Variables Facility v0.08 2004-May-17 sd 5:0:0:0: [sdb] 286749488 512-byte logical blocks: (147 GB/137 GiB) sd 5:0:0:0: [sdb] Write Protect is off sd 5:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA sdb: sdb1 sdb2 sd 5:0:0:0: [sdb] Attached SCSI disk But that might not be your fault. My ancient system is getting flaky. A v4.19 build that has booted before is also resetting :-( -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC] avoid indirect calls for DMA direct mappings v2
On Fri, Dec 07, 2018 at 11:07:05AM -0800, Christoph Hellwig wrote: > This works is based on the dma-mapping tree, so you probably want to > want this git tree for testing: > > git://git.infradead.org/users/hch/misc.git dma-direct-calls.2 Pulled this tree. Got HEAD 33b9fc015171 ("dma-mapping: bypass indirect calls for dma-direct") But the ia64 build fails with: arch/ia64/mm/init.c:75:21: warning: 'enum dma_data_direction' declared inside parameter list [enabled by default] arch/ia64/mm/init.c:75:21: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] arch/ia64/mm/init.c:75:40: error: parameter 4 ('dir') has incomplete type arch/ia64/mm/init.c:74:6: error: function declaration isn't a prototype [-Werror=strict-prototypes] arch/ia64/mm/init.c: In function 'arch_sync_dma_for_cpu': arch/ia64/mm/init.c:77:2: error: implicit declaration of function '__phys_to_pfn' [-Werror=implicit-function-declaration] -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: misc ia64 cleanups (resend)
> A couple random cleanups I stumbled upon when doing dma related work. Christoph, Thanks. Applied. Will send pull request for next merge window. -Tony ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu