Re: [REPOST PATCH 1/4] ARM: dts: DRA7: Add dsp1_system syscon node

2015-10-12 Thread Tony Lindgren
* Tony Lindgren  [151012 15:57]:
> * Suman Anna  [151012 15:37]:
> > Hi Tony,
> > 
> > On 10/12/2015 04:43 PM, Tony Lindgren wrote:
> > > * Suman Anna  [151002 16:27]:
> > >> The DSP_SYSTEM sub-module is a dedicated system control logic
> > >> module present within a DRA7 DSP processor sub-system. This
> > >> module is responsible for power management, clock generation
> > >> and connection to the device PRCM module.
> > >>
> > >> Add a syscon node for this module for the DSP1 processor
> > >> sub-system. This is added as a syscon node as it is a common
> > >> configuration module that can be used by the different IOMMU
> > >> instances and the corresponding remoteproc device.
> > >>
> > >> The node is added to the common dra7.dtsi file, as the DSP1
> > >> processor sub-system is mostly common across all the variants
> > >> of the DRA7 SoC family.
> > >>
> > >> Signed-off-by: Suman Anna 
> > >> ---
> > >>  arch/arm/boot/dts/dra7.dtsi | 5 +
> > >>  1 file changed, 5 insertions(+)
> > >>
> > >> diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
> > >> index e289c706d27d..62055094e8d5 100644
> > >> --- a/arch/arm/boot/dts/dra7.dtsi
> > >> +++ b/arch/arm/boot/dts/dra7.dtsi
> > >> @@ -292,6 +292,11 @@
> > >>  #thermal-sensor-cells = <1>;
> > >>  };
> > >>  
> > >> +dsp1_system: dsp_system@40d0 {
> > >> +compatible = "syscon";
> > >> +reg = <0x40d0 0x100>;
> > >> +};
> > >> +
> > >>  sdma: dma-controller@4a056000 {
> > >>  compatible = "ti,omap4430-sdma";
> > >>  reg = <0x4a056000 0x1000>;
> > > 
> > > Hmm so why would you want to set up a complete device as a syscon
> > > mapping rather than just doing ioremap on it?
> > > 
> > > What drivers will be sharing access to these registers?
> > 
> > Two different instances of the MMU for now, both get probed
> > independently. But there are other registers which a remoteproc driver
> > will mostly be interested in (like DSP_SYS_STAT for knowing the C66x
> > idle/active status).
> 
> OK makes sense to me then.

And applying these into omap-for-v4.4/dt.

Tony
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [REPOST PATCH 1/4] ARM: dts: DRA7: Add dsp1_system syscon node

2015-10-12 Thread Tony Lindgren
* Suman Anna  [151012 15:37]:
> Hi Tony,
> 
> On 10/12/2015 04:43 PM, Tony Lindgren wrote:
> > * Suman Anna  [151002 16:27]:
> >> The DSP_SYSTEM sub-module is a dedicated system control logic
> >> module present within a DRA7 DSP processor sub-system. This
> >> module is responsible for power management, clock generation
> >> and connection to the device PRCM module.
> >>
> >> Add a syscon node for this module for the DSP1 processor
> >> sub-system. This is added as a syscon node as it is a common
> >> configuration module that can be used by the different IOMMU
> >> instances and the corresponding remoteproc device.
> >>
> >> The node is added to the common dra7.dtsi file, as the DSP1
> >> processor sub-system is mostly common across all the variants
> >> of the DRA7 SoC family.
> >>
> >> Signed-off-by: Suman Anna 
> >> ---
> >>  arch/arm/boot/dts/dra7.dtsi | 5 +
> >>  1 file changed, 5 insertions(+)
> >>
> >> diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
> >> index e289c706d27d..62055094e8d5 100644
> >> --- a/arch/arm/boot/dts/dra7.dtsi
> >> +++ b/arch/arm/boot/dts/dra7.dtsi
> >> @@ -292,6 +292,11 @@
> >>#thermal-sensor-cells = <1>;
> >>};
> >>  
> >> +  dsp1_system: dsp_system@40d0 {
> >> +  compatible = "syscon";
> >> +  reg = <0x40d0 0x100>;
> >> +  };
> >> +
> >>sdma: dma-controller@4a056000 {
> >>compatible = "ti,omap4430-sdma";
> >>reg = <0x4a056000 0x1000>;
> > 
> > Hmm so why would you want to set up a complete device as a syscon
> > mapping rather than just doing ioremap on it?
> > 
> > What drivers will be sharing access to these registers?
> 
> Two different instances of the MMU for now, both get probed
> independently. But there are other registers which a remoteproc driver
> will mostly be interested in (like DSP_SYS_STAT for knowing the C66x
> idle/active status).

OK makes sense to me then.

Thanks,

Tony
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [REPOST PATCH 1/4] ARM: dts: DRA7: Add dsp1_system syscon node

2015-10-12 Thread Suman Anna
Hi Tony,

On 10/12/2015 04:43 PM, Tony Lindgren wrote:
> * Suman Anna  [151002 16:27]:
>> The DSP_SYSTEM sub-module is a dedicated system control logic
>> module present within a DRA7 DSP processor sub-system. This
>> module is responsible for power management, clock generation
>> and connection to the device PRCM module.
>>
>> Add a syscon node for this module for the DSP1 processor
>> sub-system. This is added as a syscon node as it is a common
>> configuration module that can be used by the different IOMMU
>> instances and the corresponding remoteproc device.
>>
>> The node is added to the common dra7.dtsi file, as the DSP1
>> processor sub-system is mostly common across all the variants
>> of the DRA7 SoC family.
>>
>> Signed-off-by: Suman Anna 
>> ---
>>  arch/arm/boot/dts/dra7.dtsi | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
>> index e289c706d27d..62055094e8d5 100644
>> --- a/arch/arm/boot/dts/dra7.dtsi
>> +++ b/arch/arm/boot/dts/dra7.dtsi
>> @@ -292,6 +292,11 @@
>>  #thermal-sensor-cells = <1>;
>>  };
>>  
>> +dsp1_system: dsp_system@40d0 {
>> +compatible = "syscon";
>> +reg = <0x40d0 0x100>;
>> +};
>> +
>>  sdma: dma-controller@4a056000 {
>>  compatible = "ti,omap4430-sdma";
>>  reg = <0x4a056000 0x1000>;
> 
> Hmm so why would you want to set up a complete device as a syscon
> mapping rather than just doing ioremap on it?
> 
> What drivers will be sharing access to these registers?

Two different instances of the MMU for now, both get probed
independently. But there are other registers which a remoteproc driver
will mostly be interested in (like DSP_SYS_STAT for knowing the C66x
idle/active status).

regards
Suman
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [REPOST PATCH 1/4] ARM: dts: DRA7: Add dsp1_system syscon node

2015-10-12 Thread Tony Lindgren
* Suman Anna  [151002 16:27]:
> The DSP_SYSTEM sub-module is a dedicated system control logic
> module present within a DRA7 DSP processor sub-system. This
> module is responsible for power management, clock generation
> and connection to the device PRCM module.
> 
> Add a syscon node for this module for the DSP1 processor
> sub-system. This is added as a syscon node as it is a common
> configuration module that can be used by the different IOMMU
> instances and the corresponding remoteproc device.
> 
> The node is added to the common dra7.dtsi file, as the DSP1
> processor sub-system is mostly common across all the variants
> of the DRA7 SoC family.
> 
> Signed-off-by: Suman Anna 
> ---
>  arch/arm/boot/dts/dra7.dtsi | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
> index e289c706d27d..62055094e8d5 100644
> --- a/arch/arm/boot/dts/dra7.dtsi
> +++ b/arch/arm/boot/dts/dra7.dtsi
> @@ -292,6 +292,11 @@
>   #thermal-sensor-cells = <1>;
>   };
>  
> + dsp1_system: dsp_system@40d0 {
> + compatible = "syscon";
> + reg = <0x40d0 0x100>;
> + };
> +
>   sdma: dma-controller@4a056000 {
>   compatible = "ti,omap4430-sdma";
>   reg = <0x4a056000 0x1000>;

Hmm so why would you want to set up a complete device as a syscon
mapping rather than just doing ioremap on it?

What drivers will be sharing access to these registers?

Regards,

Tony
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 2/2] vfio: Include no-iommu mode

2015-10-12 Thread Alex Williamson
On Mon, 2015-10-12 at 11:46 -0600, Alex Williamson wrote:
> On Mon, 2015-10-12 at 19:27 +0300, Michael S. Tsirkin wrote:
> > On Mon, Oct 12, 2015 at 08:56:07AM -0700, Stephen Hemminger wrote:
> > > On Fri, 09 Oct 2015 12:41:10 -0600
> > > Alex Williamson  wrote:
> > > 
> > > > There is really no way to safely give a user full access to a PCI
> > > > without an IOMMU to protect the host from errant DMA.  There is also
> > > > no way to provide DMA translation, for use cases such as devices
> > > > assignment to virtual machines.  However, there are still those users
> > > > that want userspace drivers under those conditions.  The UIO driver
> > > > exists for this use case, but does not provide the degree of device
> > > > access and programming that VFIO has.  In an effort to avoid code
> > > > duplication, this introduces a No-IOMMU mode for VFIO.
> > > > 
> > > > This mode requires enabling CONFIG_VFIO_NOIOMMU and loading the vfio
> > > > module with the option "enable_unsafe_pci_noiommu_mode".  This should
> > > > make it very clear that this mode is not safe.  In this mode, there is
> > > > no support for unprivileged users, CAP_SYS_ADMIN is required for
> > > > access to the necessary dev files.  Mixing no-iommu and secure VFIO is
> > > > also unsupported, as are any VFIO IOMMU backends other than the
> > > > vfio-noiommu backend.  Furthermore, unsafe group files are relocated
> > > > to /dev/vfio-noiommu/.  Upon successful loading in this mode, the
> > > > kernel is tainted due to the dummy IOMMU put in place.  Unloading of
> > > > the module in this mode is also unsupported and will BUG due to the
> > > > lack of support for unregistering an IOMMU for a bus type.
> > > > 
> > > > Signed-off-by: Alex Williamson 
> > > 
> > > Will this work for distro's where chaning kernel command line options
> > > is really not that practical. We need to boot with one command line
> > > and then decide to use IOMMU (or not) later on during the service
> > > startup of the dataplane application.
> > 
> > On open? That's too late in my opinion. But maybe the flag can be
> > tweaked so that it will probe for iommu, if there - do the
> > right thing, but if that fails, enable the dummy one.
> > And maybe defer tainting until device open.

I forgot to address the tainting point; I think we were previously
talking about tainting at the point where bus master is enabled, but I
chose to do it much earlier here because the act of registering a dummy
iommu_ops for a bus type is pretty much the point at which we have a
good chance of breaking the system.  I also considered that some devices
can manipulate their config space registers using device specific
registers (such as the example of GPUs that mirror config space in
mmio).  It's therefore not always possible to taint at the point where
we think the user has done something bad.  The best case would be
tainting at the point where the device file descriptor is opened as you
suggested, but we can't do that while we're exposing dummy iommu_ops to
the whole bus type.  Maybe another option would be to create vfio
wrappers for the iommu callbacks to have the iommu facade more local to
vfio.

My main requirements are that I do not want be disruptive to the
existing vfio code or add a significant amount of code that needs to be
maintained for the purpose of supporting a use mode that we don't really
think is supportable.  Thanks,

Alex

> The vfio mechanics are that a vfio bus driver, such as vfio-pci binds to
> a device.  In the probe function, we check for an iommu group, which
> vfio-core then uses to create the vfio group.  So there's nothing to
> open(), the iommu association needs to be made prior to even binding the
> device to vfio-pci.  Probing for an iommu can also only be done on a per
> bus_type basis, which will likely eventually become a per bus instance
> to support heterogeneous iommus, so vfio can't simply determine that an
> iommu is not present globally.  This is why the new module option
> includes the word "pci", so that it can probe for and attach the dummy
> iommu specifically on the pci_bus_type.
> 
> We can still consider if there are better points at which to initiate
> the fake iommu group.  Trying to think through vfio-pci doing it on
> probe(), but it seems pretty ugly.
> 
> In this RFC, I specifically avoided making the vfio no-iommu iommu
> driver just another modular iommu backend, I wanted it to be tied to a
> vfio module option such that vfio behaves differently with open()s and
> certain ioctls.  I think it would be really confusing to users if safe
> and unsafe modes could be used concurrently for different devices.
> 
> > Won't address the "old IOMMUs add performance overhead"
> > usecase but I'm not very impressed by that in any case.
> 
> Yep, me neither, certainly not for static mappings.  There's a lot of
> FUD left over from latencies in the streaming DMA mapping paths where
> mappings are created and destroyed at a high rate.  That has mo

Re: [RFC PATCH 2/2] vfio: Include no-iommu mode

2015-10-12 Thread Alex Williamson
On Mon, 2015-10-12 at 19:27 +0300, Michael S. Tsirkin wrote:
> On Mon, Oct 12, 2015 at 08:56:07AM -0700, Stephen Hemminger wrote:
> > On Fri, 09 Oct 2015 12:41:10 -0600
> > Alex Williamson  wrote:
> > 
> > > There is really no way to safely give a user full access to a PCI
> > > without an IOMMU to protect the host from errant DMA.  There is also
> > > no way to provide DMA translation, for use cases such as devices
> > > assignment to virtual machines.  However, there are still those users
> > > that want userspace drivers under those conditions.  The UIO driver
> > > exists for this use case, but does not provide the degree of device
> > > access and programming that VFIO has.  In an effort to avoid code
> > > duplication, this introduces a No-IOMMU mode for VFIO.
> > > 
> > > This mode requires enabling CONFIG_VFIO_NOIOMMU and loading the vfio
> > > module with the option "enable_unsafe_pci_noiommu_mode".  This should
> > > make it very clear that this mode is not safe.  In this mode, there is
> > > no support for unprivileged users, CAP_SYS_ADMIN is required for
> > > access to the necessary dev files.  Mixing no-iommu and secure VFIO is
> > > also unsupported, as are any VFIO IOMMU backends other than the
> > > vfio-noiommu backend.  Furthermore, unsafe group files are relocated
> > > to /dev/vfio-noiommu/.  Upon successful loading in this mode, the
> > > kernel is tainted due to the dummy IOMMU put in place.  Unloading of
> > > the module in this mode is also unsupported and will BUG due to the
> > > lack of support for unregistering an IOMMU for a bus type.
> > > 
> > > Signed-off-by: Alex Williamson 
> > 
> > Will this work for distro's where chaning kernel command line options
> > is really not that practical. We need to boot with one command line
> > and then decide to use IOMMU (or not) later on during the service
> > startup of the dataplane application.
> 
> On open? That's too late in my opinion. But maybe the flag can be
> tweaked so that it will probe for iommu, if there - do the
> right thing, but if that fails, enable the dummy one.
> And maybe defer tainting until device open.

The vfio mechanics are that a vfio bus driver, such as vfio-pci binds to
a device.  In the probe function, we check for an iommu group, which
vfio-core then uses to create the vfio group.  So there's nothing to
open(), the iommu association needs to be made prior to even binding the
device to vfio-pci.  Probing for an iommu can also only be done on a per
bus_type basis, which will likely eventually become a per bus instance
to support heterogeneous iommus, so vfio can't simply determine that an
iommu is not present globally.  This is why the new module option
includes the word "pci", so that it can probe for and attach the dummy
iommu specifically on the pci_bus_type.

We can still consider if there are better points at which to initiate
the fake iommu group.  Trying to think through vfio-pci doing it on
probe(), but it seems pretty ugly.

In this RFC, I specifically avoided making the vfio no-iommu iommu
driver just another modular iommu backend, I wanted it to be tied to a
vfio module option such that vfio behaves differently with open()s and
certain ioctls.  I think it would be really confusing to users if safe
and unsafe modes could be used concurrently for different devices.

> Won't address the "old IOMMUs add performance overhead"
> usecase but I'm not very impressed by that in any case.

Yep, me neither, certainly not for static mappings.  There's a lot of
FUD left over from latencies in the streaming DMA mapping paths where
mappings are created and destroyed at a high rate.  That has more to do
with flushing mappings out of the hardware than with iotlb miss latency
or actual translation, which is all that should be in play for most uses
here.

> > Recent experience is that IOMMU's
> > are broken on many platforms so the only way to make a DPDK application
> > it to write a test program that can be used to check if VFIO+IOMMU
> > works first.
> 
> In userspace? Well that's just piling up work-arounds.  And assuming
> hardware is broken, who knows what's going on security-wise.  These
> broken systems need to be identified and black-listed in kernel.
> 
> > Also, although you think the long option will set the bar high
> > enough it probably will not satisfy anyone. It is annoying enough, that
> > I would just carry a patch to remove it the silly requirement.
> 
> That sounds reasonable. Anyone who can carry a kernel patch
> does not need the warning.
> 
> > And the the people who believe
> > all user mode DMA is evil won't be satisfied either.
> > But I really like having the same consistent API for handling device
> > access with IOMMU and when IOMMU will/won't work.
> 
> I agree that's good. Makes it easier to migrate applications to
> the safe configuration down the road. Thanks Alex!
> 



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.

Re: [PATCH 1/6] mmput: use notifier chain to call subsystem exit handler.

2015-10-12 Thread Jerome Glisse

Note that i am no longer actively pushing this patch serie but i believe the
solution it provides to be needed in one form or another. So I still think
discussion on this to be useful so see below for my answer.

On Sun, Oct 11, 2015 at 08:03:29PM +0100, David Woodhouse wrote:
> On Tue, 2014-07-08 at 13:03 -0400, Jerome Glisse wrote:
> > 
> > Now regarding the device side, if we were to cleanup inside the file release
> > callback than we would be broken in front of fork. Imagine the following :
> >   - process A open device file and mirror its address space (hmm or kfd)
> > through a device file.
> >   - process A forks itself (child is B) while having the device file open.
> >   - process A quits
> > 
> > Now the file release will not be call until child B exit which might 
> > infinite.
> > Thus we would be leaking memory. As we already pointed out we can not free 
> > the
> > resources from the mmu_notifier >release callback.
> 
> So if your library just registers a pthread_atfork() handler to close
> the file descriptor in the child, that problem goes away? Like any
> other "if we continue to hold file descriptors open when we should
> close them, resources don't get freed" problem?


I was just pointing out existing device driver usage pattern where user
space open device file and do ioctl on it without necessarily caring
about the mm struct.

New usecase where device actually run thread against a specific process
mm is different and require proper synchronization as file lifetime is
different from process lifetime in many case when fork is involve.

> 
> Perhaps we could even handled that automatically in the kernel, with
> something like an O_CLOFORK flag on the file descriptor. Although
> that's a little horrid.
> 
> You've argued that the amdkfd code is special and not just a device
> driver. I'm not going to contradict you there, but now we *are* going
> to see device drivers doing this kind of thing. And we definitely
> *don't* want to be calling back into device driver code from the
> mmu_notifier's .release() function.

Well that's the current solution, call back into device driver from the
mmu_notifer release() call back. Since changes to mmu_notifier this is
a workable solution (thanks to mmu_notifier_unregister_no_release()).

> 
> I think amdkfd absolutely is *not* a useful precedent for normal device
> drivers, and we *don't* want to follow this model in the general case.
> 
> As we try to put together a generic API for device access to processes'
> address space, I definitely think we want to stick with the model that
> we take a reference on the mm, and we *keep* it until the device driver
> unbinds from the mm (because its file descriptor is closed, or
> whatever).

Well i think when a process is kill (for whatever reasons) we do want to
also kill all device threads at the same time. For instance we do not want
to have zombie GPU threads that can keep running indefinitly. This is why
use mmu_notifier.release() callback is kind of right place as it will be
call once all threads using a given mm are killed.

The exit_mm() or do_exit() are also places from where we could a callback
to let device know that it must kill all thread related to a given mm.

>
> Perhaps you can keep a back door into the AMD IOMMU code to continue to
> do what you're doing, or perhaps the explicit management of off-cpu
> tasks that is being posited will give you a sane cleanup path that
> *doesn't* involve the IOMMU's mmu_notifier calling back to you. But
> either way, I *really* don't want this to be the way it works for
> device drivers.

So at kernel summit we are supposedly gonna have a discussion about device
thread and scheduling and i think device thread lifetime belongs to that
discussion too. My opinion is that device threads must be kill when a
process quits.


> > One hacky way to do it would be to schedule some delayed job from 
> > >release callback but then again we would have no way to properly 
> > synchronize ourself with other mm destruction code ie the delayed job 
> > could run concurently with other mm destruction code and interfer
> > badly.
> 
> With the RCU-based free of the struct containing the notifier, your
> 'schedule some delayed job' is basically what we have now, isn't it?
> 
> I note that you also have your *own* notifier which does other things,
> and has to cope with being called before or after the IOMMU's notifier.
> 
> Seriously, we don't want device drivers having to do this. We really
> need to keep it simple.

So right now in HMM what happens is that device driver get a callback as
a result of mmu_notifier.release() and can call the unregister functions
and device driver must then schedule through whatever means a call to the
unregister function (can be a workqueue or other a kernel thread).

Basicly i am hidding the issue inside the device driver until we can agree
on a common proper way to do this.

Cheers,
Jérôme
___
io

Re: [RFC PATCH 2/2] vfio: Include no-iommu mode

2015-10-12 Thread Avi Kivity



On 10/12/2015 07:23 PM, Alex Williamson wrote:

Also, although you think the long option will set the bar high
enough it probably will not satisfy anyone. It is annoying enough, that
I would just carry a patch to remove it the silly requirement.
And the the people who believe
all user mode DMA is evil won't be satisfied either.

I find that many users blindly follow howtos and only sometimes do they
question the options if they sound scary enough.  So yeah, I would
intend to make the option upstream sound scary enough for people to
think twice about using it and maybe even read the description.  That
still doesn't prevent pasting it into modprobe.d and forgetting about
it.


I think we need to allow for packages to work with this.  I.e. drop 
files into config directories rather than require editing of config 
files.  I think this is mostly workable via modprobe.d.


(for our own use case, we don't require the extreme performance of many 
L2/L3 dpdk apps so we'll work with regular iommued vfio for bare-metal 
installations and ship prebuilt virtual machine images for clouds, so it 
doesn't matter that much to me).


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 2/2] vfio: Include no-iommu mode

2015-10-12 Thread Michael S. Tsirkin
On Mon, Oct 12, 2015 at 08:56:07AM -0700, Stephen Hemminger wrote:
> On Fri, 09 Oct 2015 12:41:10 -0600
> Alex Williamson  wrote:
> 
> > There is really no way to safely give a user full access to a PCI
> > without an IOMMU to protect the host from errant DMA.  There is also
> > no way to provide DMA translation, for use cases such as devices
> > assignment to virtual machines.  However, there are still those users
> > that want userspace drivers under those conditions.  The UIO driver
> > exists for this use case, but does not provide the degree of device
> > access and programming that VFIO has.  In an effort to avoid code
> > duplication, this introduces a No-IOMMU mode for VFIO.
> > 
> > This mode requires enabling CONFIG_VFIO_NOIOMMU and loading the vfio
> > module with the option "enable_unsafe_pci_noiommu_mode".  This should
> > make it very clear that this mode is not safe.  In this mode, there is
> > no support for unprivileged users, CAP_SYS_ADMIN is required for
> > access to the necessary dev files.  Mixing no-iommu and secure VFIO is
> > also unsupported, as are any VFIO IOMMU backends other than the
> > vfio-noiommu backend.  Furthermore, unsafe group files are relocated
> > to /dev/vfio-noiommu/.  Upon successful loading in this mode, the
> > kernel is tainted due to the dummy IOMMU put in place.  Unloading of
> > the module in this mode is also unsupported and will BUG due to the
> > lack of support for unregistering an IOMMU for a bus type.
> > 
> > Signed-off-by: Alex Williamson 
> 
> Will this work for distro's where chaning kernel command line options
> is really not that practical. We need to boot with one command line
> and then decide to use IOMMU (or not) later on during the service
> startup of the dataplane application.

On open? That's too late in my opinion. But maybe the flag can be
tweaked so that it will probe for iommu, if there - do the
right thing, but if that fails, enable the dummy one.
And maybe defer tainting until device open.

Won't address the "old IOMMUs add performance overhead"
usecase but I'm not very impressed by that in any case.

> Recent experience is that IOMMU's
> are broken on many platforms so the only way to make a DPDK application
> it to write a test program that can be used to check if VFIO+IOMMU
> works first.

In userspace? Well that's just piling up work-arounds.  And assuming
hardware is broken, who knows what's going on security-wise.  These
broken systems need to be identified and black-listed in kernel.

> Also, although you think the long option will set the bar high
> enough it probably will not satisfy anyone. It is annoying enough, that
> I would just carry a patch to remove it the silly requirement.

That sounds reasonable. Anyone who can carry a kernel patch
does not need the warning.

> And the the people who believe
> all user mode DMA is evil won't be satisfied either.
> But I really like having the same consistent API for handling device
> access with IOMMU and when IOMMU will/won't work.

I agree that's good. Makes it easier to migrate applications to
the safe configuration down the road. Thanks Alex!

-- 
mST
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 2/2] vfio: Include no-iommu mode

2015-10-12 Thread Alex Williamson
On Mon, 2015-10-12 at 08:56 -0700, Stephen Hemminger wrote:
> On Fri, 09 Oct 2015 12:41:10 -0600
> Alex Williamson  wrote:
> 
> > There is really no way to safely give a user full access to a PCI
> > without an IOMMU to protect the host from errant DMA.  There is also
> > no way to provide DMA translation, for use cases such as devices
> > assignment to virtual machines.  However, there are still those users
> > that want userspace drivers under those conditions.  The UIO driver
> > exists for this use case, but does not provide the degree of device
> > access and programming that VFIO has.  In an effort to avoid code
> > duplication, this introduces a No-IOMMU mode for VFIO.
> > 
> > This mode requires enabling CONFIG_VFIO_NOIOMMU and loading the vfio
> > module with the option "enable_unsafe_pci_noiommu_mode".  This should
> > make it very clear that this mode is not safe.  In this mode, there is
> > no support for unprivileged users, CAP_SYS_ADMIN is required for
> > access to the necessary dev files.  Mixing no-iommu and secure VFIO is
> > also unsupported, as are any VFIO IOMMU backends other than the
> > vfio-noiommu backend.  Furthermore, unsafe group files are relocated
> > to /dev/vfio-noiommu/.  Upon successful loading in this mode, the
> > kernel is tainted due to the dummy IOMMU put in place.  Unloading of
> > the module in this mode is also unsupported and will BUG due to the
> > lack of support for unregistering an IOMMU for a bus type.
> > 
> > Signed-off-by: Alex Williamson 
> 
> Will this work for distro's where chaning kernel command line options
> is really not that practical. We need to boot with one command line
> and then decide to use IOMMU (or not) later on during the service
> startup of the dataplane application. Recent experience is that IOMMU's
> are broken on many platforms so the only way to make a DPDK application
> it to write a test program that can be used to check if VFIO+IOMMU
> works first.

There's no kernel command line dependency, if you find that VFIO+IOMMU
doesn't work, unload the vfio modules and reload with the no-iommu
option and try again.  VFIO itself cannot simply fall back to no-iommu,
that definitely needs to be a user opt-in.  The userspace app though
should easily be able to tell when the type1 model is not available and
fall back to no-iommu.

> Also, although you think the long option will set the bar high
> enough it probably will not satisfy anyone. It is annoying enough, that
> I would just carry a patch to remove it the silly requirement.
> And the the people who believe
> all user mode DMA is evil won't be satisfied either.

I find that many users blindly follow howtos and only sometimes do they
question the options if they sound scary enough.  So yeah, I would
intend to make the option upstream sound scary enough for people to
think twice about using it and maybe even read the description.  That
still doesn't prevent pasting it into modprobe.d and forgetting about
it.

I don't see non-IOMMU protected usermode DMA as evil, it's just
unsupportable and I want to be able to immediately know that it's a
possibility if I'm looking at a kernel bug.

> But I really like having the same consistent API for handling device
> access with IOMMU and when IOMMU will/won't work.

We still need to hear from IOMMU folks, AIUI there are plans to
implement dma_iops via iommu_ops, which would make squatting on
iommu_ops much more difficult for a hack like this.  Thanks,

Alex

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [dm-devel] AMD-Vi IO_PAGE_FAULTs and ata3.00: failed command: READ FPDMA QUEUED errors since Linux 4.0

2015-10-12 Thread Mikulas Patocka


On Fri, 9 Oct 2015, Andreas Hartmann wrote:

> Hello Jörg,
> 
> On 10/09/2015 at 04:59 PM, Joerg Roedel wrote:
> > On Fri, Oct 09, 2015 at 11:15:05AM +0200, Andreas Hartmann wrote:
> >> v4.3-rc4 isn't usable at all for me as long as is hangs the machine on
> >> the necessary PCI passthrough for VMs (I need them).
> > 
> > If the fix I just sent you works, could you please test this again with
> > a (patched) v4.3-rc4 kernel?
> 
> Your IOMMU-patch works fine - but the ata-problem can be seen here, too.
> Same behavior as with 4.1.10.

Could you try another ata disk? (copy the whole filesystem to it and run 
the same test)

It may be bug in disk's firmware.

Mikulas___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [dm-devel] AMD-Vi IO_PAGE_FAULTs and ata3.00: failed command: READ FPDMA QUEUED errors since Linux 4.0

2015-10-12 Thread Andreas Hartmann
On 10/11/2015 at 02:23 PM, Andreas Hartmann wrote:
> On 10/09/2015 at 07:46 PM, Andreas Hartmann wrote:
>> Hello Jörg,
>>
>> On 10/09/2015 at 04:59 PM, Joerg Roedel wrote:
>>> On Fri, Oct 09, 2015 at 11:15:05AM +0200, Andreas Hartmann wrote:
 v4.3-rc4 isn't usable at all for me as long as is hangs the machine on
 the necessary PCI passthrough for VMs (I need them).
>>>
>>> If the fix I just sent you works, could you please test this again with
>>> a (patched) v4.3-rc4 kernel?
>>
>> Your IOMMU-patch works fine - but the ata-problem can be seen here, too.
>> Same behavior as with 4.1.10.
>>
> 
> Ok, this patch seems to fix the ata errors (I did a lot of tests until
> now w/ v4.1.10 - but anyway I'm cautious):
> 
> http://thread.gmane.org/gmane.linux.scsi/104141/focus=104267

-> Forget it - doesn't fix it.


Regards,
Andreas
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 2/2] vfio: Include no-iommu mode

2015-10-12 Thread Avi Kivity



On 10/11/2015 11:57 AM, Michael S. Tsirkin wrote:

On Sun, Oct 11, 2015 at 11:12:14AM +0300, Avi Kivity wrote:

   Mixing no-iommu and secure VFIO is
also unsupported, as are any VFIO IOMMU backends other than the
vfio-noiommu backend.  Furthermore, unsafe group files are relocated
to /dev/vfio-noiommu/.  Upon successful loading in this mode, the
kernel is tainted due to the dummy IOMMU put in place.  Unloading of
the module in this mode is also unsupported and will BUG due to the
lack of support for unregistering an IOMMU for a bus type.

I did not see an API for detecting whether memory translation is provided or
not.  We can have the caller guess this by looking at the device name, or by
requiring the user to specify this, but I think it's cleaner to provide
programmatic access to this attribute.

It seems that caller can just check for VFIO_NOIOMMU_IOMMU.

Isn't this why it's there?


That's just means the capability is there, not that it's active.

But since you must pass the same value to open(), you already know that 
you're using noiommu.



VFIO_IOMMU_MAP_DMA, VFIO_IOMMU_ENABLE and VFIO_IOMMU_DISABLE
will probably also fail ...



Don't you have to call MAP_DMA to pin the memory?

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 2/2] vfio: Include no-iommu mode

2015-10-12 Thread Avi Kivity



On 10/09/2015 09:41 PM, Alex Williamson wrote:

There is really no way to safely give a user full access to a PCI
without an IOMMU to protect the host from errant DMA.  There is also
no way to provide DMA translation, for use cases such as devices
assignment to virtual machines.  However, there are still those users
that want userspace drivers under those conditions.  The UIO driver
exists for this use case, but does not provide the degree of device
access and programming that VFIO has.  In an effort to avoid code
duplication, this introduces a No-IOMMU mode for VFIO.

This mode requires enabling CONFIG_VFIO_NOIOMMU and loading the vfio
module with the option "enable_unsafe_pci_noiommu_mode".  This should
make it very clear that this mode is not safe.  In this mode, there is
no support for unprivileged users, CAP_SYS_ADMIN is required for
access to the necessary dev files.


CAP_SYS_RAWIO seems a better match (in particular, it allows access to 
/dev/mem, which is the same thing).



   Mixing no-iommu and secure VFIO is
also unsupported, as are any VFIO IOMMU backends other than the
vfio-noiommu backend.  Furthermore, unsafe group files are relocated
to /dev/vfio-noiommu/.  Upon successful loading in this mode, the
kernel is tainted due to the dummy IOMMU put in place.  Unloading of
the module in this mode is also unsupported and will BUG due to the
lack of support for unregistering an IOMMU for a bus type.


I did not see an API for detecting whether memory translation is 
provided or not.  We can have the caller guess this by looking at the 
device name, or by requiring the user to specify this, but I think it's 
cleaner to provide programmatic access to this attribute.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 2/2] vfio: Include no-iommu mode

2015-10-12 Thread Gleb Natapov
On Sun, Oct 11, 2015 at 12:19:54PM +0300, Michael S. Tsirkin wrote:
> > But since you must pass the same value to open(), you already know that
> > you're using noiommu.
> > 
> > >VFIO_IOMMU_MAP_DMA, VFIO_IOMMU_ENABLE and VFIO_IOMMU_DISABLE
> > >will probably also fail ...
> > >
> > 
> > Don't you have to call MAP_DMA to pin the memory?
> 
> Well check it out - the patch in question doesn't implement this ioctl.
> In fact it doesn't implement anything except CHECK_EXTENSION.
> 
> And this makes sense to me: MAP_DMA
> maps a virtual address to io address, and that doesn't
> work for the dummy iommu.
> 
> You can pin memory using many other ways, including
> mlock and hugetlbfs.
> 
mlock() does not pin memory.

--
Gleb.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Intel-gfx] [PATCH 7/7] iommu/vt-d: Implement page request handling

2015-10-12 Thread Chris Wilson
On Fri, Oct 09, 2015 at 12:54:07AM +0100, David Woodhouse wrote:
> Signed-off-by: David Woodhouse 
> ---
>  drivers/iommu/intel-svm.c   | 115 
> +++-
>  include/linux/intel-iommu.h |  21 
>  2 files changed, 134 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 1260e87..ace1e32 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
>  static irqreturn_t prq_event_thread(int irq, void *d)
>  {

> + if (svm) {
> + mutex_lock(&pasid_mutex);
> + kref_put(&svm->kref, &intel_mm_free);
> + mutex_unlock(&pasid_mutex);

kref_put_mutex(&svm->kref, intel_mm_free, &pasid_mutex); ?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu