Re: [REPOST PATCH 1/4] ARM: dts: DRA7: Add dsp1_system syscon node
* Tony Lindgren [151012 15:57]:
> * Suman Anna [151012 15:37]:
> > Hi Tony,
> >
> > On 10/12/2015 04:43 PM, Tony Lindgren wrote:
> > > * Suman Anna [151002 16:27]:
> > >> The DSP_SYSTEM sub-module is a dedicated system control logic
> > >> module present within a DRA7 DSP processor sub-system. This
> > >> module is responsible for power management, clock generation
> > >> and connection to the device PRCM module.
> > >>
> > >> Add a syscon node for this module for the DSP1 processor
> > >> sub-system. This is added as a syscon node as it is a common
> > >> configuration module that can be used by the different IOMMU
> > >> instances and the corresponding remoteproc device.
> > >>
> > >> The node is added to the common dra7.dtsi file, as the DSP1
> > >> processor sub-system is mostly common across all the variants
> > >> of the DRA7 SoC family.
> > >>
> > >> Signed-off-by: Suman Anna
> > >> ---
> > >>  arch/arm/boot/dts/dra7.dtsi | 5 +
> > >>  1 file changed, 5 insertions(+)
> > >>
> > >> diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
> > >> index e289c706d27d..62055094e8d5 100644
> > >> --- a/arch/arm/boot/dts/dra7.dtsi
> > >> +++ b/arch/arm/boot/dts/dra7.dtsi
> > >> @@ -292,6 +292,11 @@
> > >>  			#thermal-sensor-cells = <1>;
> > >>  		};
> > >>
> > >> +		dsp1_system: dsp_system@40d0 {
> > >> +			compatible = "syscon";
> > >> +			reg = <0x40d0 0x100>;
> > >> +		};
> > >> +
> > >>  		sdma: dma-controller@4a056000 {
> > >>  			compatible = "ti,omap4430-sdma";
> > >>  			reg = <0x4a056000 0x1000>;
> > >
> > > Hmm so why would you want to set up a complete device as a syscon
> > > mapping rather than just doing ioremap on it?
> > >
> > > What drivers will be sharing access to these registers?
> >
> > Two different instances of the MMU for now, both get probed
> > independently. But there are other registers which a remoteproc driver
> > will mostly be interested in (like DSP_SYS_STAT for knowing the C66x
> > idle/active status).
>
> OK makes sense to me then.

And applying these into omap-for-v4.4/dt.

Tony
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [REPOST PATCH 1/4] ARM: dts: DRA7: Add dsp1_system syscon node
* Suman Anna [151012 15:37]:
> Hi Tony,
>
> On 10/12/2015 04:43 PM, Tony Lindgren wrote:
> > * Suman Anna [151002 16:27]:
> >> The DSP_SYSTEM sub-module is a dedicated system control logic
> >> module present within a DRA7 DSP processor sub-system. This
> >> module is responsible for power management, clock generation
> >> and connection to the device PRCM module.
> >>
> >> Add a syscon node for this module for the DSP1 processor
> >> sub-system. This is added as a syscon node as it is a common
> >> configuration module that can be used by the different IOMMU
> >> instances and the corresponding remoteproc device.
> >>
> >> The node is added to the common dra7.dtsi file, as the DSP1
> >> processor sub-system is mostly common across all the variants
> >> of the DRA7 SoC family.
> >>
> >> Signed-off-by: Suman Anna
> >> ---
> >>  arch/arm/boot/dts/dra7.dtsi | 5 +
> >>  1 file changed, 5 insertions(+)
> >>
> >> diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
> >> index e289c706d27d..62055094e8d5 100644
> >> --- a/arch/arm/boot/dts/dra7.dtsi
> >> +++ b/arch/arm/boot/dts/dra7.dtsi
> >> @@ -292,6 +292,11 @@
> >>  			#thermal-sensor-cells = <1>;
> >>  		};
> >>
> >> +		dsp1_system: dsp_system@40d0 {
> >> +			compatible = "syscon";
> >> +			reg = <0x40d0 0x100>;
> >> +		};
> >> +
> >>  		sdma: dma-controller@4a056000 {
> >>  			compatible = "ti,omap4430-sdma";
> >>  			reg = <0x4a056000 0x1000>;
> >
> > Hmm so why would you want to set up a complete device as a syscon
> > mapping rather than just doing ioremap on it?
> >
> > What drivers will be sharing access to these registers?
>
> Two different instances of the MMU for now, both get probed
> independently. But there are other registers which a remoteproc driver
> will mostly be interested in (like DSP_SYS_STAT for knowing the C66x
> idle/active status).

OK makes sense to me then.

Thanks,
Tony
Re: [REPOST PATCH 1/4] ARM: dts: DRA7: Add dsp1_system syscon node
Hi Tony,

On 10/12/2015 04:43 PM, Tony Lindgren wrote:
> * Suman Anna [151002 16:27]:
>> The DSP_SYSTEM sub-module is a dedicated system control logic
>> module present within a DRA7 DSP processor sub-system. This
>> module is responsible for power management, clock generation
>> and connection to the device PRCM module.
>>
>> Add a syscon node for this module for the DSP1 processor
>> sub-system. This is added as a syscon node as it is a common
>> configuration module that can be used by the different IOMMU
>> instances and the corresponding remoteproc device.
>>
>> The node is added to the common dra7.dtsi file, as the DSP1
>> processor sub-system is mostly common across all the variants
>> of the DRA7 SoC family.
>>
>> Signed-off-by: Suman Anna
>> ---
>>  arch/arm/boot/dts/dra7.dtsi | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
>> index e289c706d27d..62055094e8d5 100644
>> --- a/arch/arm/boot/dts/dra7.dtsi
>> +++ b/arch/arm/boot/dts/dra7.dtsi
>> @@ -292,6 +292,11 @@
>>  			#thermal-sensor-cells = <1>;
>>  		};
>>
>> +		dsp1_system: dsp_system@40d0 {
>> +			compatible = "syscon";
>> +			reg = <0x40d0 0x100>;
>> +		};
>> +
>>  		sdma: dma-controller@4a056000 {
>>  			compatible = "ti,omap4430-sdma";
>>  			reg = <0x4a056000 0x1000>;
>
> Hmm so why would you want to set up a complete device as a syscon
> mapping rather than just doing ioremap on it?
>
> What drivers will be sharing access to these registers?

Two different instances of the MMU for now, both get probed
independently. But there are other registers which a remoteproc driver
will mostly be interested in (like DSP_SYS_STAT for knowing the C66x
idle/active status).

regards
Suman
Re: [REPOST PATCH 1/4] ARM: dts: DRA7: Add dsp1_system syscon node
* Suman Anna [151002 16:27]:
> The DSP_SYSTEM sub-module is a dedicated system control logic
> module present within a DRA7 DSP processor sub-system. This
> module is responsible for power management, clock generation
> and connection to the device PRCM module.
>
> Add a syscon node for this module for the DSP1 processor
> sub-system. This is added as a syscon node as it is a common
> configuration module that can be used by the different IOMMU
> instances and the corresponding remoteproc device.
>
> The node is added to the common dra7.dtsi file, as the DSP1
> processor sub-system is mostly common across all the variants
> of the DRA7 SoC family.
>
> Signed-off-by: Suman Anna
> ---
>  arch/arm/boot/dts/dra7.dtsi | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
> index e289c706d27d..62055094e8d5 100644
> --- a/arch/arm/boot/dts/dra7.dtsi
> +++ b/arch/arm/boot/dts/dra7.dtsi
> @@ -292,6 +292,11 @@
>  			#thermal-sensor-cells = <1>;
>  		};
>
> +		dsp1_system: dsp_system@40d0 {
> +			compatible = "syscon";
> +			reg = <0x40d0 0x100>;
> +		};
> +
>  		sdma: dma-controller@4a056000 {
>  			compatible = "ti,omap4430-sdma";
>  			reg = <0x4a056000 0x1000>;

Hmm so why would you want to set up a complete device as a syscon
mapping rather than just doing ioremap on it?

What drivers will be sharing access to these registers?

Regards,
Tony
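[Editorial note: Suman's answer above is the textbook syscon case: several
independently probed drivers sharing one small register block. Consumers
reference the syscon node by phandle and share a single regmap. As an
illustration only, a consumer node might look like the sketch below; the
`ti,dra7-dsp-iommu` compatible and `ti,syscon-mmuconfig` property match how
the DRA7 IOMMU binding later consumed this node in mainline, but the unit
address and offsets here are illustrative, not taken from this patch.]

```dts
mmu0_dsp1: mmu@40d01000 {
	compatible = "ti,dra7-dsp-iommu";
	reg = <0x40d01000 0x100>;
	#iommu-cells = <0>;
	/* phandle to the shared DSP_SYSTEM syscon, plus an instance index */
	ti,syscon-mmuconfig = <&dsp1_system 0x0>;
};
```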
Re: [RFC PATCH 2/2] vfio: Include no-iommu mode
On Mon, 2015-10-12 at 11:46 -0600, Alex Williamson wrote:
> On Mon, 2015-10-12 at 19:27 +0300, Michael S. Tsirkin wrote:
> > On Mon, Oct 12, 2015 at 08:56:07AM -0700, Stephen Hemminger wrote:
> > > On Fri, 09 Oct 2015 12:41:10 -0600
> > > Alex Williamson wrote:
> > > > There is really no way to safely give a user full access to a PCI
> > > > device without an IOMMU to protect the host from errant DMA. There
> > > > is also no way to provide DMA translation, for use cases such as
> > > > device assignment to virtual machines. However, there are still
> > > > those users that want userspace drivers under those conditions.
> > > > The UIO driver exists for this use case, but does not provide the
> > > > degree of device access and programming that VFIO has. In an
> > > > effort to avoid code duplication, this introduces a No-IOMMU mode
> > > > for VFIO.
> > > >
> > > > This mode requires enabling CONFIG_VFIO_NOIOMMU and loading the
> > > > vfio module with the option "enable_unsafe_pci_noiommu_mode". This
> > > > should make it very clear that this mode is not safe. In this
> > > > mode, there is no support for unprivileged users; CAP_SYS_ADMIN is
> > > > required for access to the necessary dev files. Mixing no-iommu
> > > > and secure VFIO is also unsupported, as are any VFIO IOMMU
> > > > backends other than the vfio-noiommu backend. Furthermore, unsafe
> > > > group files are relocated to /dev/vfio-noiommu/. Upon successful
> > > > loading in this mode, the kernel is tainted due to the dummy IOMMU
> > > > put in place. Unloading of the module in this mode is also
> > > > unsupported and will BUG due to the lack of support for
> > > > unregistering an IOMMU for a bus type.
> > > >
> > > > Signed-off-by: Alex Williamson
> > >
> > > Will this work for distros where changing kernel command line options
> > > is really not that practical? We need to boot with one command line
> > > and then decide to use IOMMU (or not) later on during the service
> > > startup of the dataplane application.
> >
> > On open? That's too late in my opinion. But maybe the flag can be
> > tweaked so that it will probe for iommu, if there - do the
> > right thing, but if that fails, enable the dummy one.
> > And maybe defer tainting until device open.

I forgot to address the tainting point; I think we were previously
talking about tainting at the point where bus master is enabled, but I
chose to do it much earlier here because the act of registering a dummy
iommu_ops for a bus type is pretty much the point at which we have a
good chance of breaking the system. I also considered that some devices
can manipulate their config space registers using device specific
registers (such as the example of GPUs that mirror config space in
mmio). It's therefore not always possible to taint at the point where
we think the user has done something bad. The best case would be
tainting at the point where the device file descriptor is opened, as
you suggested, but we can't do that while we're exposing dummy
iommu_ops to the whole bus type. Maybe another option would be to
create vfio wrappers for the iommu callbacks to keep the iommu facade
more local to vfio. My main requirement is that I do not want to be
disruptive to the existing vfio code or add a significant amount of
code that needs to be maintained for the purpose of supporting a use
mode that we don't really think is supportable. Thanks,

Alex

> The vfio mechanics are that a vfio bus driver, such as vfio-pci, binds
> to a device. In the probe function, we check for an iommu group, which
> vfio-core then uses to create the vfio group. So there's nothing to
> open(); the iommu association needs to be made prior to even binding
> the device to vfio-pci. Probing for an iommu can also only be done on
> a per bus_type basis, which will likely eventually become a per bus
> instance to support heterogeneous iommus, so vfio can't simply
> determine that an iommu is not present globally. This is why the new
> module option includes the word "pci", so that it can probe for and
> attach the dummy iommu specifically on the pci_bus_type.
>
> We can still consider if there are better points at which to initiate
> the fake iommu group. Trying to think through vfio-pci doing it on
> probe(), but it seems pretty ugly.
>
> In this RFC, I specifically avoided making the vfio no-iommu iommu
> driver just another modular iommu backend; I wanted it to be tied to a
> vfio module option such that vfio behaves differently with open()s and
> certain ioctls. I think it would be really confusing to users if safe
> and unsafe modes could be used concurrently for different devices.
>
> > Won't address the "old IOMMUs add performance overhead"
> > usecase but I'm not very impressed by that in any case.
>
> Yep, me neither, certainly not for static mappings. There's a lot of
> FUD left over from latencies in the streaming DMA mapping paths where
> mappings are created and destroyed at a high rate. That has more to do
> with flushing mappings out of the hardware than with iotlb miss
> latency or actual translation, which is all that should be in play for
> most uses here.
Re: [RFC PATCH 2/2] vfio: Include no-iommu mode
On Mon, 2015-10-12 at 19:27 +0300, Michael S. Tsirkin wrote:
> On Mon, Oct 12, 2015 at 08:56:07AM -0700, Stephen Hemminger wrote:
> > On Fri, 09 Oct 2015 12:41:10 -0600
> > Alex Williamson wrote:
> > > There is really no way to safely give a user full access to a PCI
> > > device without an IOMMU to protect the host from errant DMA. There
> > > is also no way to provide DMA translation, for use cases such as
> > > device assignment to virtual machines. However, there are still
> > > those users that want userspace drivers under those conditions. The
> > > UIO driver exists for this use case, but does not provide the
> > > degree of device access and programming that VFIO has. In an effort
> > > to avoid code duplication, this introduces a No-IOMMU mode for
> > > VFIO.
> > >
> > > This mode requires enabling CONFIG_VFIO_NOIOMMU and loading the
> > > vfio module with the option "enable_unsafe_pci_noiommu_mode". This
> > > should make it very clear that this mode is not safe. In this mode,
> > > there is no support for unprivileged users; CAP_SYS_ADMIN is
> > > required for access to the necessary dev files. Mixing no-iommu and
> > > secure VFIO is also unsupported, as are any VFIO IOMMU backends
> > > other than the vfio-noiommu backend. Furthermore, unsafe group
> > > files are relocated to /dev/vfio-noiommu/. Upon successful loading
> > > in this mode, the kernel is tainted due to the dummy IOMMU put in
> > > place. Unloading of the module in this mode is also unsupported and
> > > will BUG due to the lack of support for unregistering an IOMMU for
> > > a bus type.
> > >
> > > Signed-off-by: Alex Williamson
> >
> > Will this work for distros where changing kernel command line options
> > is really not that practical? We need to boot with one command line
> > and then decide to use IOMMU (or not) later on during the service
> > startup of the dataplane application.
>
> On open? That's too late in my opinion. But maybe the flag can be
> tweaked so that it will probe for iommu, if there - do the
> right thing, but if that fails, enable the dummy one.
> And maybe defer tainting until device open.

The vfio mechanics are that a vfio bus driver, such as vfio-pci, binds
to a device. In the probe function, we check for an iommu group, which
vfio-core then uses to create the vfio group. So there's nothing to
open(); the iommu association needs to be made prior to even binding the
device to vfio-pci. Probing for an iommu can also only be done on a per
bus_type basis, which will likely eventually become a per bus instance
to support heterogeneous iommus, so vfio can't simply determine that an
iommu is not present globally. This is why the new module option
includes the word "pci", so that it can probe for and attach the dummy
iommu specifically on the pci_bus_type.

We can still consider if there are better points at which to initiate
the fake iommu group. Trying to think through vfio-pci doing it on
probe(), but it seems pretty ugly.

In this RFC, I specifically avoided making the vfio no-iommu iommu
driver just another modular iommu backend; I wanted it to be tied to a
vfio module option such that vfio behaves differently with open()s and
certain ioctls. I think it would be really confusing to users if safe
and unsafe modes could be used concurrently for different devices.

> Won't address the "old IOMMUs add performance overhead"
> usecase but I'm not very impressed by that in any case.

Yep, me neither, certainly not for static mappings. There's a lot of
FUD left over from latencies in the streaming DMA mapping paths where
mappings are created and destroyed at a high rate. That has more to do
with flushing mappings out of the hardware than with iotlb miss latency
or actual translation, which is all that should be in play for most
uses here.

> > Recent experience is that IOMMUs
> > are broken on many platforms so the only way to make a DPDK
> > application work is to write a test program that can be used to check
> > if VFIO+IOMMU works first.
>
> In userspace? Well that's just piling up work-arounds. And assuming
> hardware is broken, who knows what's going on security-wise. These
> broken systems need to be identified and black-listed in kernel.
>
> > Also, although you think the long option will set the bar high
> > enough it probably will not satisfy anyone. It is annoying enough
> > that I would just carry a patch to remove the silly requirement.
>
> That sounds reasonable. Anyone who can carry a kernel patch
> does not need the warning.
>
> > And the people who believe
> > all user mode DMA is evil won't be satisfied either.
> > But I really like having the same consistent API for handling device
> > access with IOMMU and when IOMMU will/won't work.
>
> I agree that's good. Makes it easier to migrate applications to
> the safe configuration down the road.
>
> Thanks Alex!
Re: [PATCH 1/6] mmput: use notifier chain to call subsystem exit handler.
Note that I am no longer actively pushing this patch series, but I
believe the solution it provides to be needed in one form or another,
so I still think discussion on this is useful. See below for my answer.

On Sun, Oct 11, 2015 at 08:03:29PM +0100, David Woodhouse wrote:
> On Tue, 2014-07-08 at 13:03 -0400, Jerome Glisse wrote:
> >
> > Now regarding the device side, if we were to cleanup inside the file
> > release callback then we would be broken in front of fork. Imagine
> > the following:
> >   - process A opens a device file and mirrors its address space (hmm
> >     or kfd) through a device file.
> >   - process A forks itself (child is B) while having the device file
> >     open.
> >   - process A quits
> >
> > Now the file release will not be called until child B exits, which
> > might be never. Thus we would be leaking memory. As we already
> > pointed out, we can not free the resources from the mmu_notifier
> > ->release callback.
>
> So if your library just registers a pthread_atfork() handler to close
> the file descriptor in the child, that problem goes away? Like any
> other "if we continue to hold file descriptors open when we should
> close them, resources don't get freed" problem?

I was just pointing out the existing device driver usage pattern, where
user space opens a device file and does ioctls on it without
necessarily caring about the mm struct. The new use case, where the
device actually runs threads against a specific process mm, is
different and requires proper synchronization, as file lifetime differs
from process lifetime in many cases when fork is involved.

> Perhaps we could even handle that automatically in the kernel, with
> something like an O_CLOFORK flag on the file descriptor. Although
> that's a little horrid.
>
> You've argued that the amdkfd code is special and not just a device
> driver. I'm not going to contradict you there, but now we *are* going
> to see device drivers doing this kind of thing. And we definitely
> *don't* want to be calling back into device driver code from the
> mmu_notifier's .release() function.

Well, that's the current solution: call back into the device driver
from the mmu_notifier release() callback. Since the changes to
mmu_notifier this is a workable solution (thanks to
mmu_notifier_unregister_no_release()).

> I think amdkfd absolutely is *not* a useful precedent for normal device
> drivers, and we *don't* want to follow this model in the general case.
>
> As we try to put together a generic API for device access to processes'
> address space, I definitely think we want to stick with the model that
> we take a reference on the mm, and we *keep* it until the device driver
> unbinds from the mm (because its file descriptor is closed, or
> whatever).

Well, I think when a process is killed (for whatever reason) we do want
to also kill all device threads at the same time. For instance we do
not want to have zombie GPU threads that can keep running indefinitely.
This is why the mmu_notifier.release() callback is kind of the right
place, as it will be called once all threads using a given mm are
killed. The exit_mm() or do_exit() are also places from where we could
add a callback to let the device know that it must kill all threads
related to a given mm.

> Perhaps you can keep a back door into the AMD IOMMU code to continue to
> do what you're doing, or perhaps the explicit management of off-cpu
> tasks that is being posited will give you a sane cleanup path that
> *doesn't* involve the IOMMU's mmu_notifier calling back to you. But
> either way, I *really* don't want this to be the way it works for
> device drivers.

So at kernel summit we are supposedly gonna have a discussion about
device threads and scheduling, and I think device thread lifetime
belongs to that discussion too. My opinion is that device threads must
be killed when a process quits.

> > One hacky way to do it would be to schedule some delayed job from
> > the ->release callback, but then again we would have no way to
> > properly synchronize ourselves with other mm destruction code, ie
> > the delayed job could run concurrently with other mm destruction
> > code and interfere badly.
>
> With the RCU-based free of the struct containing the notifier, your
> 'schedule some delayed job' is basically what we have now, isn't it?
>
> I note that you also have your *own* notifier which does other things,
> and has to cope with being called before or after the IOMMU's notifier.
>
> Seriously, we don't want device drivers having to do this. We really
> need to keep it simple.

So right now in HMM what happens is that the device driver gets a
callback as a result of mmu_notifier.release() and must then schedule,
through whatever means (a workqueue or a kernel thread), a call to the
unregister function. Basically I am hiding the issue inside the device
driver until we can agree on a common proper way to do this.

Cheers,
Jérôme
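[Editorial note: David's pthread_atfork() suggestion above can be sketched
in plain userspace C. Everything here is hypothetical illustration — the
`dev_fd` global, the function names, and the use of /dev/null as a stand-in
for a device file are not from any real driver library. The child-side
handler drops the child's reference so the device fd's lifetime tracks the
parent alone.]

```c
#include <fcntl.h>
#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical device fd held by a userspace driver library. */
static int dev_fd = -1;

/* Runs in the child immediately after fork(): drop our reference so the
 * parent's exit is enough for the kernel to release the device. */
static void close_dev_in_child(void)
{
    if (dev_fd >= 0) {
        close(dev_fd);
        dev_fd = -1;
    }
}

/* Open the "device", fork, and report whether the child saw the fd
 * already closed by the atfork handler. Returns 0 on success. */
int fd_closed_in_child_after_fork(const char *path)
{
    dev_fd = open(path, O_RDONLY);
    if (dev_fd < 0)
        return -1;

    /* A real library would register this once, at init time. */
    pthread_atfork(NULL, NULL, close_dev_in_child);

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: the atfork child handler already ran. */
        _exit(dev_fd >= 0 ? 1 : 0);
    }

    int status;
    waitpid(pid, &status, 0);
    if (dev_fd >= 0) {         /* parent cleanup */
        close(dev_fd);
        dev_fd = -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

This only solves the accidental-inheritance case; as Jérôme notes, it does
not help when the child deliberately keeps the fd or when the process is
killed, which is where the mmu_notifier discussion picks up.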
Re: [RFC PATCH 2/2] vfio: Include no-iommu mode
On 10/12/2015 07:23 PM, Alex Williamson wrote:
> > Also, although you think the long option will set the bar high
> > enough it probably will not satisfy anyone. It is annoying enough
> > that I would just carry a patch to remove the silly requirement.
> > And the people who believe
> > all user mode DMA is evil won't be satisfied either.
>
> I find that many users blindly follow howtos and only sometimes do
> they question the options if they sound scary enough. So yeah, I would
> intend to make the option upstream sound scary enough for people to
> think twice about using it and maybe even read the description. That
> still doesn't prevent pasting it into modprobe.d and forgetting about
> it.

I think we need to allow for packages to work with this, i.e. drop
files into config directories rather than require editing of config
files. I think this is mostly workable via modprobe.d.

(For our own use case, we don't require the extreme performance of many
L2/L3 DPDK apps, so we'll work with regular IOMMU'd VFIO for bare-metal
installations and ship prebuilt virtual machine images for clouds, so
it doesn't matter that much to me.)
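[Editorial note: the packaging approach Avi describes would be a modprobe.d
drop-in along these lines. The file name is arbitrary, and the option name
is the one proposed in this RFC — it may well change before any merge:]

```
# /etc/modprobe.d/vfio-noiommu.conf
# Opt in to VFIO's unsafe no-IOMMU mode (RFC option name).
# CAP_SYS_ADMIN is still required to open the group files under
# /dev/vfio-noiommu/.
options vfio enable_unsafe_pci_noiommu_mode=1
```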
Re: [RFC PATCH 2/2] vfio: Include no-iommu mode
On Mon, Oct 12, 2015 at 08:56:07AM -0700, Stephen Hemminger wrote:
> On Fri, 09 Oct 2015 12:41:10 -0600
> Alex Williamson wrote:
> > There is really no way to safely give a user full access to a PCI
> > device without an IOMMU to protect the host from errant DMA. There is
> > also no way to provide DMA translation, for use cases such as device
> > assignment to virtual machines. However, there are still those users
> > that want userspace drivers under those conditions. The UIO driver
> > exists for this use case, but does not provide the degree of device
> > access and programming that VFIO has. In an effort to avoid code
> > duplication, this introduces a No-IOMMU mode for VFIO.
> >
> > This mode requires enabling CONFIG_VFIO_NOIOMMU and loading the vfio
> > module with the option "enable_unsafe_pci_noiommu_mode". This should
> > make it very clear that this mode is not safe. In this mode, there is
> > no support for unprivileged users; CAP_SYS_ADMIN is required for
> > access to the necessary dev files. Mixing no-iommu and secure VFIO is
> > also unsupported, as are any VFIO IOMMU backends other than the
> > vfio-noiommu backend. Furthermore, unsafe group files are relocated
> > to /dev/vfio-noiommu/. Upon successful loading in this mode, the
> > kernel is tainted due to the dummy IOMMU put in place. Unloading of
> > the module in this mode is also unsupported and will BUG due to the
> > lack of support for unregistering an IOMMU for a bus type.
> >
> > Signed-off-by: Alex Williamson
>
> Will this work for distros where changing kernel command line options
> is really not that practical? We need to boot with one command line
> and then decide to use IOMMU (or not) later on during the service
> startup of the dataplane application.

On open? That's too late in my opinion. But maybe the flag can be
tweaked so that it will probe for iommu, if there - do the
right thing, but if that fails, enable the dummy one.
And maybe defer tainting until device open.

Won't address the "old IOMMUs add performance overhead"
usecase but I'm not very impressed by that in any case.

> Recent experience is that IOMMUs
> are broken on many platforms so the only way to make a DPDK
> application work is to write a test program that can be used to check
> if VFIO+IOMMU works first.

In userspace? Well that's just piling up work-arounds. And assuming
hardware is broken, who knows what's going on security-wise. These
broken systems need to be identified and black-listed in kernel.

> Also, although you think the long option will set the bar high
> enough it probably will not satisfy anyone. It is annoying enough that
> I would just carry a patch to remove the silly requirement.

That sounds reasonable. Anyone who can carry a kernel patch
does not need the warning.

> And the people who believe
> all user mode DMA is evil won't be satisfied either.
> But I really like having the same consistent API for handling device
> access with IOMMU and when IOMMU will/won't work.

I agree that's good. Makes it easier to migrate applications to
the safe configuration down the road.

Thanks Alex!

-- 
mST
Re: [RFC PATCH 2/2] vfio: Include no-iommu mode
On Mon, 2015-10-12 at 08:56 -0700, Stephen Hemminger wrote:
> On Fri, 09 Oct 2015 12:41:10 -0600
> Alex Williamson wrote:
> > There is really no way to safely give a user full access to a PCI
> > device without an IOMMU to protect the host from errant DMA. There is
> > also no way to provide DMA translation, for use cases such as device
> > assignment to virtual machines. However, there are still those users
> > that want userspace drivers under those conditions. The UIO driver
> > exists for this use case, but does not provide the degree of device
> > access and programming that VFIO has. In an effort to avoid code
> > duplication, this introduces a No-IOMMU mode for VFIO.
> >
> > This mode requires enabling CONFIG_VFIO_NOIOMMU and loading the vfio
> > module with the option "enable_unsafe_pci_noiommu_mode". This should
> > make it very clear that this mode is not safe. In this mode, there is
> > no support for unprivileged users; CAP_SYS_ADMIN is required for
> > access to the necessary dev files. Mixing no-iommu and secure VFIO is
> > also unsupported, as are any VFIO IOMMU backends other than the
> > vfio-noiommu backend. Furthermore, unsafe group files are relocated
> > to /dev/vfio-noiommu/. Upon successful loading in this mode, the
> > kernel is tainted due to the dummy IOMMU put in place. Unloading of
> > the module in this mode is also unsupported and will BUG due to the
> > lack of support for unregistering an IOMMU for a bus type.
> >
> > Signed-off-by: Alex Williamson
>
> Will this work for distros where changing kernel command line options
> is really not that practical? We need to boot with one command line
> and then decide to use IOMMU (or not) later on during the service
> startup of the dataplane application. Recent experience is that IOMMUs
> are broken on many platforms so the only way to make a DPDK
> application work is to write a test program that can be used to check
> if VFIO+IOMMU works first.

There's no kernel command line dependency; if you find that VFIO+IOMMU
doesn't work, unload the vfio modules and reload with the no-iommu
option and try again. VFIO itself cannot simply fall back to no-iommu,
that definitely needs to be a user opt-in. The userspace app though
should easily be able to tell when the type1 model is not available and
fall back to no-iommu.

> Also, although you think the long option will set the bar high
> enough it probably will not satisfy anyone. It is annoying enough that
> I would just carry a patch to remove the silly requirement.
> And the people who believe
> all user mode DMA is evil won't be satisfied either.

I find that many users blindly follow howtos and only sometimes do they
question the options if they sound scary enough. So yeah, I would
intend to make the option upstream sound scary enough for people to
think twice about using it and maybe even read the description. That
still doesn't prevent pasting it into modprobe.d and forgetting about
it.

I don't see non-IOMMU protected usermode DMA as evil, it's just
unsupportable, and I want to be able to immediately know that it's a
possibility if I'm looking at a kernel bug.

> But I really like having the same consistent API for handling device
> access with IOMMU and when IOMMU will/won't work.

We still need to hear from the IOMMU folks; AIUI there are plans to
implement dma_ops via iommu_ops, which would make squatting on
iommu_ops much more difficult for a hack like this. Thanks,

Alex
Re: [dm-devel] AMD-Vi IO_PAGE_FAULTs and ata3.00: failed command: READ FPDMA QUEUED errors since Linux 4.0
On Fri, 9 Oct 2015, Andreas Hartmann wrote:
> Hello Jörg,
>
> On 10/09/2015 at 04:59 PM, Joerg Roedel wrote:
> > On Fri, Oct 09, 2015 at 11:15:05AM +0200, Andreas Hartmann wrote:
> >> v4.3-rc4 isn't usable at all for me as long as it hangs the machine
> >> on the necessary PCI passthrough for VMs (I need them).
> >
> > If the fix I just sent you works, could you please test this again
> > with a (patched) v4.3-rc4 kernel?
>
> Your IOMMU-patch works fine - but the ata-problem can be seen here,
> too. Same behavior as with 4.1.10.

Could you try another ata disk? (Copy the whole filesystem to it and
run the same test.) It may be a bug in the disk's firmware.

Mikulas
Re: [dm-devel] AMD-Vi IO_PAGE_FAULTs and ata3.00: failed command: READ FPDMA QUEUED errors since Linux 4.0
On 10/11/2015 at 02:23 PM, Andreas Hartmann wrote:
> On 10/09/2015 at 07:46 PM, Andreas Hartmann wrote:
>> Hello Jörg,
>>
>> On 10/09/2015 at 04:59 PM, Joerg Roedel wrote:
>>> On Fri, Oct 09, 2015 at 11:15:05AM +0200, Andreas Hartmann wrote:
>>>> v4.3-rc4 isn't usable at all for me as long as it hangs the machine
>>>> on the necessary PCI passthrough for VMs (I need them).
>>>
>>> If the fix I just sent you works, could you please test this again
>>> with a (patched) v4.3-rc4 kernel?
>>
>> Your IOMMU-patch works fine - but the ata-problem can be seen here,
>> too. Same behavior as with 4.1.10.
>
> Ok, this patch seems to fix the ata errors (I did a lot of tests until
> now w/ v4.1.10 - but anyway I'm cautious):
>
> http://thread.gmane.org/gmane.linux.scsi/104141/focus=104267

-> Forget it - doesn't fix it.

Regards,
Andreas
Re: [RFC PATCH 2/2] vfio: Include no-iommu mode
On 10/11/2015 11:57 AM, Michael S. Tsirkin wrote:
> On Sun, Oct 11, 2015 at 11:12:14AM +0300, Avi Kivity wrote:
>>> Mixing no-iommu and secure VFIO is also unsupported, as are any VFIO
>>> IOMMU backends other than the vfio-noiommu backend. Furthermore,
>>> unsafe group files are relocated to /dev/vfio-noiommu/. Upon
>>> successful loading in this mode, the kernel is tainted due to the
>>> dummy IOMMU put in place. Unloading of the module in this mode is
>>> also unsupported and will BUG due to the lack of support for
>>> unregistering an IOMMU for a bus type.
>>
>> I did not see an API for detecting whether memory translation is
>> provided or not. We can have the caller guess this by looking at the
>> device name, or by requiring the user to specify this, but I think
>> it's cleaner to provide programmatic access to this attribute.
>
> It seems that the caller can just check for VFIO_NOIOMMU_IOMMU.
> Isn't this why it's there?

That just means the capability is there, not that it's active. But since you must pass the same value to open(), you already know that you're using noiommu.

> VFIO_IOMMU_MAP_DMA, VFIO_IOMMU_ENABLE and VFIO_IOMMU_DISABLE will
> probably also fail ...

Don't you have to call MAP_DMA to pin the memory?
Re: [RFC PATCH 2/2] vfio: Include no-iommu mode
On 10/09/2015 09:41 PM, Alex Williamson wrote:
> There is really no way to safely give a user full access to a PCI
> device without an IOMMU to protect the host from errant DMA. There is
> also no way to provide DMA translation, for use cases such as device
> assignment to virtual machines. However, there are still those users
> that want userspace drivers under those conditions. The UIO driver
> exists for this use case, but does not provide the degree of device
> access and programming that VFIO has. In an effort to avoid code
> duplication, this introduces a No-IOMMU mode for VFIO. This mode
> requires enabling CONFIG_VFIO_NOIOMMU and loading the vfio module
> with the option "enable_unsafe_pci_noiommu_mode". This should make it
> very clear that this mode is not safe. In this mode, there is no
> support for unprivileged users; CAP_SYS_ADMIN is required for access
> to the necessary dev files.

CAP_SYS_RAWIO seems a better match (in particular, it allows access to /dev/mem, which is the same thing).

> Mixing no-iommu and secure VFIO is also unsupported, as are any VFIO
> IOMMU backends other than the vfio-noiommu backend. Furthermore,
> unsafe group files are relocated to /dev/vfio-noiommu/. Upon
> successful loading in this mode, the kernel is tainted due to the
> dummy IOMMU put in place. Unloading of the module in this mode is
> also unsupported and will BUG due to the lack of support for
> unregistering an IOMMU for a bus type.

I did not see an API for detecting whether memory translation is provided or not. We can have the caller guess this by looking at the device name, or by requiring the user to specify this, but I think it's cleaner to provide programmatic access to this attribute.
Re: [RFC PATCH 2/2] vfio: Include no-iommu mode
On Sun, Oct 11, 2015 at 12:19:54PM +0300, Michael S. Tsirkin wrote:
> > But since you must pass the same value to open(), you already know
> > that you're using noiommu.
> >
> > > VFIO_IOMMU_MAP_DMA, VFIO_IOMMU_ENABLE and VFIO_IOMMU_DISABLE
> > > will probably also fail ...
> >
> > Don't you have to call MAP_DMA to pin the memory?
>
> Well check it out - the patch in question doesn't implement this
> ioctl. In fact it doesn't implement anything except CHECK_EXTENSION.
>
> And this makes sense to me: MAP_DMA maps a virtual address to an io
> address, and that doesn't work for the dummy iommu.
>
> You can pin memory using many other ways, including mlock and
> hugetlbfs.

mlock() does not pin memory.

--
Gleb.
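Gleb's objection rests on a distinction worth spelling out: mlock() only guarantees residency (the range won't be swapped out), while the kernel remains free to migrate the page — compaction, NUMA balancing — changing its physical address, which is fatal for device DMA. A minimal sketch of the userspace side; real pinning for DMA happens in the kernel via get_user_pages() (which is what VFIO's type1 MAP_DMA path relies on), not via any userspace call shown here.

```c
#include <sys/mman.h>
#include <stdlib.h>

/* mlock() keeps the range resident but does NOT pin it: page migration
 * may still move the data to a different physical page behind
 * userspace's back.  Returns 0 on success with the locked buffer in
 * *out. */
int lock_buffer(void **out, size_t len)
{
    void *buf;

    if (posix_memalign(&buf, 4096, len))    /* page-aligned allocation */
        return -1;
    if (mlock(buf, len)) {                  /* resident, not pinned */
        free(buf);
        return -1;
    }
    *out = buf;
    return 0;
}
```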
Re: [Intel-gfx] [PATCH 7/7] iommu/vt-d: Implement page request handling
On Fri, Oct 09, 2015 at 12:54:07AM +0100, David Woodhouse wrote:
> Signed-off-by: David Woodhouse
> ---
>  drivers/iommu/intel-svm.c   | 115 +++-
>  include/linux/intel-iommu.h |  21
>  2 files changed, 134 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 1260e87..ace1e32 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> static irqreturn_t prq_event_thread(int irq, void *d)
> {
> +	if (svm) {
> +		mutex_lock(&pasid_mutex);
> +		kref_put(&svm->kref, &intel_mm_free);
> +		mutex_unlock(&pasid_mutex);

kref_put_mutex(&svm->kref, intel_mm_free, &pasid_mutex); ?

-Chris

--
Chris Wilson, Intel Open Source Technology Centre
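Chris's suggestion replaces the lock/put/unlock triple with kref_put_mutex(), which skips the mutex entirely unless this put may be the final one, and holds it across the release so a concurrent lookup under the same mutex can't resurrect a dying object. A simplified userspace analogue of that semantic, using C11 atomics and pthreads — not the kernel implementation (in the real API, release() itself is expected to drop the mutex; here the wrapper unlocks for brevity):

```c
#include <pthread.h>
#include <stdatomic.h>

struct kref { atomic_int refcount; };

/* Analogue of refcount_dec_and_mutex_lock(): stay off the mutex on the
 * fast path, and only acquire it when the count may drop to zero.
 * Returns 1 with the lock held if this was the last reference. */
static int dec_and_mutex_lock(atomic_int *r, pthread_mutex_t *lock)
{
    int old = atomic_load(r);

    while (old > 1)                         /* fast path: not the last ref */
        if (atomic_compare_exchange_weak(r, &old, old - 1))
            return 0;
    pthread_mutex_lock(lock);
    if (atomic_fetch_sub(r, 1) == 1)
        return 1;                           /* last ref: lock stays held */
    pthread_mutex_unlock(lock);             /* raced with a new reference */
    return 0;
}

/* release() runs under the mutex; we drop the lock afterwards. */
int kref_put_mutex(struct kref *k, void (*release)(struct kref *),
                   pthread_mutex_t *lock)
{
    if (dec_and_mutex_lock(&k->refcount, lock)) {
        release(k);
        pthread_mutex_unlock(lock);
        return 1;
    }
    return 0;
}
```

The open-coded lock/kref_put/unlock in the patch is correct but takes pasid_mutex on every put; the helper confines the lock to the teardown path.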