Re: [PATCH 1/1] iommu/vt-d: Leave scalable mode default off

2019-01-24 Thread Lu Baolu

Hi Joerg,

On 1/24/19 9:22 PM, Joerg Roedel wrote:

On Thu, Jan 24, 2019 at 10:31:32AM +0800, Lu Baolu wrote:

Commit 765b6a98c1de3 ("iommu/vt-d: Enumerate the scalable
mode capability") enables VT-d scalable mode if the hardware
advertises the capability. As we will bring up different
features and use cases upstream in different patch
series, this will leave some intermediate kernel versions
which support only partial features. Hence, end users might run
into problems when they use such kernels on bare metal
or in virtualization environments.


I don't get it, can you be more specific about the problems that users
might run into?


Sorry, I didn't make it clear in the message.

Around VT-d scalable mode, we plan to enable several features.
For example,

(1) basic scalable mode support;
(2) aux domain;
(3) system-level PASID allocation.


Since they will be submitted in different patch series for review and
merging, users would face compatibility problems. For example, when users
run kernel v5.0, they might fail to assign an ADI (Assignable Device
Interface) to a VM because aux domain support is not included yet. They
will complain: "I have a kernel that claims to support scalable mode, but
when I tried to assign an ADI to a VM, ...".

So we decided to leave it off by default and turn it on by default later,
when all the features are merged. Users can try scalable mode features
with "intel_iommu=sm_on" on the kernel command line.


And is this patch needed as a fix for v5.0 or is it just
a precaution because future patches might break something for users?


It would be better if it could go in as a fix for v5.0.

Best regards,
Lu Baolu


Re: [PATCH] dcdbas: Fix Intel-IOMMU domain allocation failure

2019-01-24 Thread Szabolcs Fruhwald via iommu
Hi Andy,

I took a quick look at arch_setup_pdev_archdata(), overrode it in
dcdbas.c, and it works well (despite being called twice, since it's
called from both platform_device_alloc and platform_device_register).

However, I am not super familiar with ELF weak symbol references,
especially the scope resolution / versioning part. As I see it, this
weak function was introduced mainly for arch/** specific hooks.
Is it safe to override this function from driver code when, let's say,
there's another implementation in the x86 arch code (currently there
isn't)?

So while I agree that archdata should ideally be written from within
this hook, to make sure the value is there when the bus notifiers call
the iommu code (currently not registered for platform_bus), I am just
a bit worried that this might mask a future generic arch
implementation and could cause issues. What do you think?
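
(For context, the weak-reference mechanism in question works roughly as
sketched below; the dcdbas-specific override here is hypothetical, not
the posted patch:)

#include <linux/platform_device.h>
#include <linux/string.h>

/*
 * drivers/base/platform.c provides the fallback:
 *
 *	void __weak arch_setup_pdev_archdata(struct platform_device *pdev) { }
 *
 * A single strong definition anywhere else in the image (arch code or,
 * as experimented with here, a driver) wins at link time.  A second
 * strong definition added later, e.g. a future x86 arch hook, would
 * then clash with this one -- the masking concern raised above.
 */
void arch_setup_pdev_archdata(struct platform_device *pdev)
{
	/* Hypothetical dcdbas-specific override, per the experiment above */
	if (!strcmp(pdev->name, "dcdbas"))
		pdev->dev.archdata.iommu = (void *)-1;	/* dummy domain */
}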

Best Regards,
Szabolcs

On Thu, Jan 24, 2019 at 1:44 PM Andy Shevchenko wrote:
>
> On Thu, Jan 24, 2019 at 10:31 PM Szabolcs Fruhwald wrote:
> >
> > (+iommu list for visibility and confirmation of the intended constant
> > externalization, see 2nd point below)
>
> > Absolutely, I thought so too. But since the actual need to force
> > ID-mapping comes from the lack of support for non-PCI buses,
> > particularly in the Intel IOMMU implementation, it seemed a bit odd to
> > move this constant into iommu.h. Other platforms' implementations
> > usually handle and hook into other buses, e.g. the platform bus.
> >
> > However, even these other hw platform IOMMU implementations use
> > ACPI or other (BIOS-related) tables to generate source IDs, which
> > might still be an issue with drivers like dcdbas, with no real device
> > info in these tables. Not to mention that forcing ID-mapping might be
> > a useful tool in driver developers' hands for other reasons too.
> > Therefore, I came to the conclusion that it is indeed useful to
> > have this constant present in the common IOMMU header file (but with a
> > more expressive name).
> >
> > Especially as I just found that dcdbas is *not* the first one to
> > implement this workaround:
> > https://github.com/torvalds/linux/blob/71f3a82fab1b631ae9cb1feb677f498d4ca5007d/drivers/gpu/drm/i915/selftests/mock_gem_device.c#L154
>
> > Let me come up with a v2 patch set, containing a separate patch
> > externalizing this constant from intel-iommu.c to iommu.h and making
> > the above code use it too, followed by this change in dcdbas.
>
> Wait... It sounds to me like a part of arch code where we define
> arch_setup_pdev_archdata() and use this dummy domain.
> Though the dummy domain definition should come from the IOMMU framework.
>
> --
> With Best Regards,
> Andy Shevchenko


Re: [PATCH 1/5] iommu/tegra-smmu: Fix domain_alloc

2019-01-24 Thread navneet kumar
Thanks for the feedback, Robin and Dmitry.
I mostly agree with all your comments; please see my responses inline.
I'll make these fixes in V2.

On 1/17/19 8:50 AM, Robin Murphy wrote:
> On 17/01/2019 15:13, Dmitry Osipenko wrote:
>> 16.01.2019 23:50, Navneet Kumar wrote:
>>> * Allocate dma iova cookie for a domain while adding dma iommu
>>> devices.
>>> * Perform a stricter check for domain type parameter.
>>>
>>
>> Commit message should tell what exactly is getting "fixed". Apparently 
>> you're trying to support T132 ARM64 here.

I'll fix it.

>>
>>> Signed-off-by: Navneet Kumar 
>>> ---
>>>   drivers/iommu/tegra-smmu.c | 43 
>>> +++
>>>   1 file changed, 27 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
>>> index 543f7c9..ee4d8a8 100644
>>> --- a/drivers/iommu/tegra-smmu.c
>>> +++ b/drivers/iommu/tegra-smmu.c
>>> @@ -16,6 +16,7 @@
>>>   #include 
>>>   #include 
>>>   #include 
>>> +#include 
>>>     #include 
>>>   #include 
>>> @@ -271,8 +272,10 @@ static bool tegra_smmu_capable(enum iommu_cap cap)
>>>   static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type)
>>>   {
>>>   struct tegra_smmu_as *as;
>>> +    int ret;
>>>   -    if (type != IOMMU_DOMAIN_UNMANAGED)
>>> +    if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA &&
>>> +    type != IOMMU_DOMAIN_IDENTITY)
>>>   return NULL;
>>
>> Should be better to write these lines like this for the sake of readability:
>>
>> if (type != IOMMU_DOMAIN_UNMANAGED &&
>>     type != IOMMU_DOMAIN_DMA &&
>>     type != IOMMU_DOMAIN_IDENTITY)
>>     return NULL;
> 
> Furthermore, AFAICS there's no special handling being added for identity 
> domains - don't pretend to support them by giving back a regular translation 
> domain, that's just misleading and broken.

Agreed.

> 
>>
>>
>>>     as = kzalloc(sizeof(*as), GFP_KERNEL);
>>> @@ -281,26 +284,22 @@ static struct iommu_domain 
>>> *tegra_smmu_domain_alloc(unsigned type)
>>>     as->attr = SMMU_PD_READABLE | SMMU_PD_WRITABLE | SMMU_PD_NONSECURE;
>>> +    ret = (type == IOMMU_DOMAIN_DMA) ?
>>> +        iommu_get_dma_cookie(&as->domain) : -ENODEV;
>>
>> This makes the allocation fail for any domain type other than
>> IOMMU_DOMAIN_DMA.
>>
>> Should be:
>>
>> if (type == IOMMU_DOMAIN_DMA) {
>>     err = iommu_get_dma_cookie(&as->domain);
>>     if (err)
>>         goto free_as;
>> }
>>

Agreed.

>>
>>> +    if (ret)
>>> +    goto free_as;
>>> +
>>>   as->pd = alloc_page(GFP_KERNEL | __GFP_DMA | __GFP_ZERO);
>>> -    if (!as->pd) {
>>> -    kfree(as);
>>> -    return NULL;
>>> -    }
>>> +    if (!as->pd)
>>> +    goto put_dma_cookie;
>>>     as->count = kcalloc(SMMU_NUM_PDE, sizeof(u32), GFP_KERNEL);
>>> -    if (!as->count) {
>>> -    __free_page(as->pd);
>>> -    kfree(as);
>>> -    return NULL;
>>> -    }
>>> +    if (!as->count)
>>> +    goto free_pd_range;
>>>     as->pts = kcalloc(SMMU_NUM_PDE, sizeof(*as->pts), GFP_KERNEL);
>>> -    if (!as->pts) {
>>> -    kfree(as->count);
>>> -    __free_page(as->pd);
>>> -    kfree(as);
>>> -    return NULL;
>>> -    }
>>> +    if (!as->pts)
>>> +    goto free_pts;
>>>     /* setup aperture */
>>>   as->domain.geometry.aperture_start = 0;
>>> @@ -308,6 +307,18 @@ static struct iommu_domain 
>>> *tegra_smmu_domain_alloc(unsigned type)
>>>   as->domain.geometry.force_aperture = true;
>>>     return &as->domain;
>>> +
>>> +free_pts:
>>> +    kfree(as->pts);
>>> +free_pd_range:
>>> +    __free_page(as->pd);
>>> +put_dma_cookie:
>>> +    if (type == IOMMU_DOMAIN_DMA)
> 
> FWIW you don't strictly need that check - since domain->iova_cookie won't be 
> set for other domain types anyway, iommu_put_dma_cookie() will simply ignore 
> them.
> 

I'll still keep that check rather than rely solely on the put_dma_cookie API
to handle this case.
This will keep the code robust even if the put_dma_cookie API changes in the
future (for whatever reason).
It also looks like the canonical usage in other drivers.

>>> +    iommu_put_dma_cookie(&as->domain);
>>> +free_as:
>>> +    kfree(as);
>>> +
>>> +    return NULL;
>>>   }
>>>     static void tegra_smmu_domain_free(struct iommu_domain *domain)
> 
> How about not leaking the cookie when you free a DMA ops domain?
> 

Agreed.
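
Something like this in the free path, perhaps (a rough sketch; it assumes
the rest of the teardown stays as-is):

static void tegra_smmu_domain_free(struct iommu_domain *domain)
{
	struct tegra_smmu_as *as = to_smmu_as(domain);

	/* A no-op for non-DMA domains, where no cookie was allocated */
	iommu_put_dma_cookie(domain);

	/* existing teardown of page tables and bookkeeping */
	kfree(as->count);
	kfree(as->pts);
	kfree(as);
}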

> Robin.

Re: [PATCH] dcdbas: Fix Intel-IOMMU domain allocation failure

2019-01-24 Thread Andy Shevchenko
On Thu, Jan 24, 2019 at 10:31 PM Szabolcs Fruhwald  wrote:
>
> (+iommu list for visibility and confirmation of the intended constant
> externalization, see 2nd point below)

> Absolutely, I thought so too. But since the actual need to force
> ID-mapping comes from the lack of support for non-PCI buses,
> particularly in the Intel IOMMU implementation, it seemed a bit odd to
> move this constant into iommu.h. Other platforms' implementations
> usually handle and hook into other buses, e.g. the platform bus.
>
> However, even these other hw platform IOMMU implementations use
> ACPI or other (BIOS-related) tables to generate source IDs, which
> might still be an issue with drivers like dcdbas, with no real device
> info in these tables. Not to mention that forcing ID-mapping might be
> a useful tool in driver developers' hands for other reasons too.
> Therefore, I came to the conclusion that it is indeed useful to
> have this constant present in the common IOMMU header file (but with a
> more expressive name).
>
> Especially as I just found that dcdbas is *not* the first one to
> implement this workaround:
> https://github.com/torvalds/linux/blob/71f3a82fab1b631ae9cb1feb677f498d4ca5007d/drivers/gpu/drm/i915/selftests/mock_gem_device.c#L154

> Let me come up with a v2 patch set, containing a separate patch
> externalizing this constant from intel-iommu.c to iommu.h and making
> the above code use it too, followed by this change in dcdbas.

Wait... It sounds to me like a part of arch code where we define
arch_setup_pdev_archdata() and use this dummy domain.
Though the dummy domain definition should come from the IOMMU framework.

-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH 2/5] iommu/tegra-smmu: Use non-secure register for flushing

2019-01-24 Thread Dmitry Osipenko
24.01.2019 21:29, navneet kumar wrote:
> On 1/17/19 7:25 AM, Dmitry Osipenko wrote:
>> 16.01.2019 23:50, Navneet Kumar wrote:
>>> Use PTB_ASID instead of SMMU_CONFIG to flush smmu.
>>> PTB_ASID can be accessed from non-secure mode, SMMU_CONFIG cannot be.
>>> Using SMMU_CONFIG could pose a problem when the kernel doesn't have
>>> secure mode access enabled from boot.
>>>
>>> Signed-off-by: Navneet Kumar 
>>> ---
>>>  drivers/iommu/tegra-smmu.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
>>> index ee4d8a8..fa175d9 100644
>>> --- a/drivers/iommu/tegra-smmu.c
>>> +++ b/drivers/iommu/tegra-smmu.c
>>> @@ -235,7 +235,7 @@ static inline void smmu_flush_tlb_group(struct 
>>> tegra_smmu *smmu,
>>>  
>>>  static inline void smmu_flush(struct tegra_smmu *smmu)
>>>  {
>>> -   smmu_readl(smmu, SMMU_CONFIG);
>>> +   smmu_readl(smmu, SMMU_PTB_ASID);
>>>  }
>>>  
>>>  static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int 
>>> *idp)
>>>
>>
>> The driver writes to SMMU_CONFIG during probing. Looks like it's better
>> to drop this patch for now and make it part of a complete solution that
>> will add proper support for stricter insecure-mode platforms.
>>
> Thanks for the comment, Dmitry. On secure platforms, writing to SMMU_CONFIG
> will be a no-op and will pose no harm. Having this patch is important
> because it fixes the flushing behavior on such platforms, which is critical.
> 
> I propose to keep this patch as is; however, I can add more explanation to
> the commit message, stating the case of probe and how it will not have any
> ill effects. Please ACK/NACK, and I shall post a V2.
> 

Nothing breaks with this change, at least on T30. Please extend the commit
message and add a clarifying comment to the code, thanks.
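
Something like this, perhaps (a sketch of the requested comment, not the
final wording):

static inline void smmu_flush(struct tegra_smmu *smmu)
{
	/*
	 * Read back an SMMU register to make sure all preceding writes
	 * have been posted to hardware.  SMMU_PTB_ASID is used because,
	 * unlike SMMU_CONFIG, it is also accessible from non-secure mode.
	 */
	smmu_readl(smmu, SMMU_PTB_ASID);
}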

With the clarifying message being addressed:

Reviewed-by: Dmitry Osipenko 
Tested-by: Dmitry Osipenko 

Re: [PATCH] dcdbas: Fix Intel-IOMMU domain allocation failure

2019-01-24 Thread Szabolcs Fruhwald via iommu
(+iommu list for visibility and confirmation of the intended constant
externalization, see 2nd point below)

Hi Mario, thanks for your comments; see my answers inline below.

On Thu, Jan 24, 2019 at 7:16 AM  wrote:
>
> > -----Original Message-----
> > From: platform-driver-x86-ow...@vger.kernel.org On Behalf Of Szabolcs Fruhwald
> > Sent: Wednesday, January 23, 2019 5:02 PM
> > To: Stuart Hayes
> > Cc: platform-driver-...@vger.kernel.org; Szabolcs Fruhwald
> > Subject: [PATCH] dcdbas: Fix Intel-IOMMU domain allocation failure
> >
> >
> > [EXTERNAL EMAIL]
> >
> > On Dell systems with intel_iommu enabled, dcdbas platform driver's DMA
> > calls fail with ENOMEM / "DMAR: Allocating domain for dcdbas failed".
>
> As a curiosity, was this from manually turning on the IOMMU or the automatic
> IOMMU enablement caused by commit 89a6079df791aeace2044ea93be1b397195824ec?
>

In our case it was a manual turn-on. We need to deal with several hw
platforms (preferably with a unified sw configuration), and on one
particular hw platform (R730XD) we had problems even with
intel_iommu=off (no USB). However, we need VT-d / IOMMU support for
security reasons (preventing DMA attacks) rather than for other
aspects, e.g. virtualization. So we didn't try to completely turn it
off.
These issues came up in the past year as we are in the process of
upgrading to v4 kernels, and IOMMU support (and complexity) has
improved significantly since v3.

> >
> > This is because the intel-iommu driver only supports the PCI bus / devices;
> > therefore it is not (yet) possible to properly allocate an IOMMU
> > group/domain and attach devices with no PCI parent device.
> >
> > This workaround is forcing the intel_iommu implementation to use identity
> > mapping for this device, using the DUMMY_DEVICE_DOMAIN_INFO constant value
> > defined and used in intel-iommu.c.
>
> I would think it makes sense to export this out to a common include that both 
> files can
> use if it's using the same constant value as drivers/iommu/intel-iommu.c
>

Absolutely, I thought so too. But since the actual need to force
ID-mapping comes from the lack of support for non-PCI buses,
particularly in the Intel IOMMU implementation, it seemed a bit odd to
move this constant into iommu.h. Other platforms' implementations
usually handle and hook into other buses, e.g. the platform bus.

However, even these other hw platform IOMMU implementations use
ACPI or other (BIOS-related) tables to generate source IDs, which
might still be an issue with drivers like dcdbas, with no real device
info in these tables. Not to mention that forcing ID-mapping might be
a useful tool in driver developers' hands for other reasons too.
Therefore, I came to the conclusion that it is indeed useful to
have this constant present in the common IOMMU header file (but with a
more expressive name).

Especially as I just found that dcdbas is *not* the first one to
implement this workaround:
https://github.com/torvalds/linux/blob/71f3a82fab1b631ae9cb1feb677f498d4ca5007d/drivers/gpu/drm/i915/selftests/mock_gem_device.c#L154

Let me come up with a v2 patch set, containing a separate patch
externalizing this constant from intel-iommu.c to iommu.h and making
the above code use it too, followed by this change in dcdbas.

> At least to me it seems likely that dcdbas is just one of the first drivers 
> that will cause this.
>
> >
> > Signed-off-by: Szabolcs Fruhwald 
> > ---
> >  drivers/platform/x86/dcdbas.c | 5 +
> >  drivers/platform/x86/dcdbas.h | 2 +-
> >  2 files changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/platform/x86/dcdbas.c b/drivers/platform/x86/dcdbas.c
> > index 88bd7efafe14..a5d6bb1bc590 100644
> > --- a/drivers/platform/x86/dcdbas.c
> > +++ b/drivers/platform/x86/dcdbas.c
> > @@ -652,6 +652,11 @@ static int dcdbas_probe(struct platform_device *dev)
> >
> >   dcdbas_pdev = dev;
> >
> > + /* Intel-IOMMU workaround: platform-bus unsupported, force ID-mapping */
> > + #if IS_ENABLED(CONFIG_IOMMU_API) && defined(CONFIG_INTEL_IOMMU)
> > + dev->dev.archdata.iommu = INTEL_IOMMU_DUMMY_DOMAIN;
> > + #endif
> > +
> >   /* Check if ACPI WSMT table specifies protected SMI buffer address */
> >   error = dcdbas_check_wsmt();
> >   if (error < 0)
> > diff --git a/drivers/platform/x86/dcdbas.h b/drivers/platform/x86/dcdbas.h
> > index 52729a494b00..373eb277933a 100644
> > --- a/drivers/platform/x86/dcdbas.h
> > +++ b/drivers/platform/x86/dcdbas.h
> > @@ -54,6 +54,7 @@
> >
> >  #define SMI_CMD_MAGIC    (0x534D4931)
> >  #define SMM_EPS_SIG  "$SCB"
> > +#define INTEL_IOMMU_DUMMY_DOMAIN    ((void *)-1)
> >
> >  #define DCDBAS_DEV_ATTR_RW(_name) \
> >   DEVICE_ATTR(_name,0600,_name##_show,_name##_store);
> > @@ -114,4 +115,3 @@ struct smm_eps_table {
> >  } __packed;
> >
> >  #endif /* _DCDBAS_H_ */
> > -
> > --
> > 

Re: [PATCH 1/3] PCI: iproc: Add feature to set order mode

2019-01-24 Thread Bjorn Helgaas
On Thu, Jan 24, 2019 at 02:10:18PM +0530, Srinath Mannam wrote:
> On Fri, Jan 18, 2019 at 8:37 PM Bjorn Helgaas  wrote:
> > On Fri, Jan 18, 2019 at 09:53:21AM +0530, Srinath Mannam wrote:
> > > Order mode in RX header of incoming pcie packets can be override to
> > > strict or loose order based on requirement.
> > > Sysfs entry is provided to set dynamic and default order modes of upstream
> > > traffic.
> > ...
> > Are you talking about the Relaxed Ordering bit in the TLP Attributes
> > field?  If so, please use the exact language used in the spec and
> > include a reference, e.g., to PCIe r4.0, sec 2.2.6.4, 2.4, etc.
> >
> Yes, the Relaxed Ordering bit in the TLP. I will use the spec reference. Thanks.
> > I'm hesitant about a new sysfs file for this.  That automatically
> > creates a user-space dependency on this iProc feature.  Who would use
> > this file?
> >
> This is a specific feature provided in iProc, used to improve
> performance for endpoints which do not support ordering
> configuration at their end.
> This is the reason we used a sysfs file, which allows the user to change
> ordering based on the endpoint used and the requirement.
> We are using these sysfs files to configure ordering to improve
> performance in NVMe endpoints with SPDK/DPDK drivers.
> If we enable this by default in the kernel, then it applies to all
> connected endpoints. But it may not be required for all endpoints.

Normally, relaxed ordering is used only when an endpoint sets the
"Relaxed Ordering" attribute bit in a TLP.  The endpoint is only
allowed to do that if relaxed ordering is enabled in the Device
Control register.
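
(For reference, the Enable Relaxed Ordering state of an endpoint can be
checked through the standard PCIe capability accessors; a minimal sketch:)

#include <linux/pci.h>

/* True if the EP may set the Relaxed Ordering attribute bit in its TLPs */
static bool ep_relaxed_ordering_enabled(struct pci_dev *pdev)
{
	u16 devctl;

	pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &devctl);
	return devctl & PCI_EXP_DEVCTL_RELAX_EN;
}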

If I understand correctly, this sysfs knob would let you configure the
iProc RC so it handles *all* TLPs (or just those in certain address
ranges) with relaxed ordering, regardless of whether the endpoint has
relaxed ordering enabled, or even whether it supports relaxed ordering at all.

My gut feeling is that this is a messy hack, and if you want the
performance benefits of relaxed ordering, you should just choose an
NVMe endpoint that supports it correctly.

I assume the iProc RC does actually have the smarts to pay attention
to the Relaxed Ordering bit in TLPs if the endpoint sets it?

> > > To improve performance in a few endpoints we are required to modify the
> > > ordering attributes. Using this feature we can override the order modes
> > > of RX packets at the iProc RC.
> > >
> > > Ex:
> > > In PCIe-based NVMe SSD endpoints, data reads/writes from disk use
> > > queue pairs (submission/completion). After completion of a block
> > > read/write, the EP writes a completion command to the completion queue
> > > to notify the RC. So all data transfers before the completion command
> > > write do not require strict ordering, except the completion command
> > > itself. It means all previous packets before the completion command
> > > write to the RC should be written to memory and acknowledged.
> >
> > IIUC, if Enable Relaxed Ordering in the endpoint's Device Control
> > register is set, the device is permitted to set the Relaxed Ordering
> > bit in TLPs it initiates.  So I would think that if we set Enable
> > Relaxed Ordering correctly, the NVMe endpoint should be able to
> > use Relaxed Ordering for the data transfers and strict ordering (the
> > default) for the completion command.  What am I missing?
> >
> As per the NVMe spec, Enable Relaxed Ordering is implementation specific;
> not all cards may support it.

Relaxed ordering support is optional for *every* PCIe endpoint, not
just NVMe.  If an endpoint doesn't support relaxed ordering at all, it
should hardwire the Enable Relaxed Ordering bit to zero.

> > This sysfs file apparently affects the Root Port/Root Complex
> > behavior, not the Endpoint's behavior.  Does that mean iProc ignores
> > the Relaxed Ordering bit by default, and you're providing a knob that
> >
> With the sysfs file setting, the ordering attributes of all TLPs directed
> to a specific memory window can be overridden at the iProc layer based on
> the settings.  TLPs to one memory window can be kept in strict order while
> other memory windows can be set to relaxed order.
>
> Using the sysfs file, ordering settings can be configured and reverted
> back to the default.
>
> > makes it pay attention to it?  If that's the case, why wouldn't you
> > enable iProc support for Relaxed Ordering always, by default?
> >
> The option is given to the user because a few endpoints may support
> ordering configuration internally and a few EPs don't. Based on the
> requirement, the user can configure the ordering settings.
> 
> > > Signed-off-by: Srinath Mannam 
> > > Reviewed-by: Ray Jui 
> > > ---
> > >  drivers/pci/controller/pcie-iproc.c | 128 
> > > 
> > >  drivers/pci/controller/pcie-iproc.h |  16 +
> > >  2 files changed, 144 insertions(+)
> > >
> > > diff --git a/drivers/pci/controller/pcie-iproc.c 
> > > b/drivers/pci/controller/pcie-iproc.c
> > > index c20fd6b..13ce80f 100644
> > > --- a/drivers/pci/controller/pcie-iproc.c
> > > +++ b/drivers/pci/controller/pcie-iproc.c
> > > @@ -57,6 +57,9 @@
> 

Re: [PATCH 2/5] iommu/tegra-smmu: Use non-secure register for flushing

2019-01-24 Thread navneet kumar
On 1/17/19 7:25 AM, Dmitry Osipenko wrote:
> 16.01.2019 23:50, Navneet Kumar wrote:
>> Use PTB_ASID instead of SMMU_CONFIG to flush smmu.
>> PTB_ASID can be accessed from non-secure mode, SMMU_CONFIG cannot be.
>> Using SMMU_CONFIG could pose a problem when the kernel doesn't have
>> secure mode access enabled from boot.
>>
>> Signed-off-by: Navneet Kumar 
>> ---
>>  drivers/iommu/tegra-smmu.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
>> index ee4d8a8..fa175d9 100644
>> --- a/drivers/iommu/tegra-smmu.c
>> +++ b/drivers/iommu/tegra-smmu.c
>> @@ -235,7 +235,7 @@ static inline void smmu_flush_tlb_group(struct 
>> tegra_smmu *smmu,
>>  
>>  static inline void smmu_flush(struct tegra_smmu *smmu)
>>  {
>> -   smmu_readl(smmu, SMMU_CONFIG);
>> +   smmu_readl(smmu, SMMU_PTB_ASID);
>>  }
>>  
>>  static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
>>
> 
> The driver writes to SMMU_CONFIG during probing. Looks like it's better to
> drop this patch for now and make it part of a complete solution that will
> add proper support for stricter insecure-mode platforms.
> 
Thanks for the comment, Dmitry. On secure platforms, writing to SMMU_CONFIG
will be a no-op and will pose no harm. Having this patch is important because
it fixes the flushing behavior on such platforms, which is critical.

I propose to keep this patch as is; however, I can add more explanation to
the commit message, stating the case of probe and how it will not have any
ill effects. Please ACK/NACK, and I shall post a V2.


Re: [PATCH v7 0/7] Add virtio-iommu driver

2019-01-24 Thread Jean-Philippe Brucker
Hi Joerg,

On 23/01/2019 08:34, Joerg Roedel wrote:
> Hi Jean-Philippe,
> 
> thanks for all your hard work on this!
> 
> On Tue, Jan 15, 2019 at 12:19:52PM +, Jean-Philippe Brucker wrote:
>> Implement the virtio-iommu driver, following specification v0.9 [1].
> 
> To make progress on this I think the spec needs to be close to something
> that can be included into the official virtio-specification. Have you
> proposed the specification for inclusion there?

I haven't yet. I did send a few drafts of the spec to the mailing list,
using arbitrary version numbers (0.1 - 0.9), and received excellent
feedback from Eric, Kevin, Ashok and others [2], but I hadn't formally
asked for inclusion yet. Since I haven't made any major change to the
interface in a while, I'll get on that.

> This is because I can't merge a driver that might be incompatible to
> future implementations because the specification needs to be changed on
> its way to an official standard.

Makes sense, though I think other virtio devices have been developed a
little more organically: device and driver code got upstreamed first,
and then the specification describing their interface got merged into
the standard. For example I believe that code for crypto, input and GPU
devices were upstreamed long before the specification was merged. Once
an implementation is upstream, the interface is expected to be
backward-compatible (all subsequent changes are introduced using feature
bits).

So I've been working with this process in mind, also described by Jens
at KVM forum 2017 [3]:
(1) Reserve a device ID, and get that merged into virtio (ID 23 for
virtio-iommu was reserved last year)
(2) Open-source an implementation (this driver and Eric's device)
(3) Formalize and upstream the device specification

But I get that some overlap between (2) and (3) would have been better.
So far the spec document has been reviewed mainly from the IOMMU point
of view, and might require more changes to be in line with the other
virtio devices -- hopefully just wording changes. I'll kick off step
(3), but I think the virtio folks are a bit busy with finalizing the 1.1
spec so I expect it to take a while.

Thanks,
Jean

[2] RFC https://markmail.org/thread/l6b2rpc46nua4egs
0.4 https://markmail.org/thread/f5k37mab7tnrslin
0.5 https://markmail.org/thread/tz65oolu5do7hi6n
0.6 https://markmail.org/thread/dppbg6owzrx2km2n
0.7 https://markmail.org/thread/dgdy4hicswpakmsq

[3] The future of virtio: riddles, myths and surprises
https://www.linux-kvm.org/images/0/03/Virtio_fall_2017.pdf
https://www.youtube.com/watch?v=z9cWwgYH97A


> I had a short discussion with Michael S. Tsirkin about that and from
> what I understood the spec needs to be proposed for inclusion on the
> virtio-comment[1] mailing list and later the TC needs to vote on it.
> Please work with Michael on this to get the specification official (or
> at least pretty close to something that will be part of the official
> virtio standard).
> 
> Regards,
> 
>   Joerg
> 
> [1] https://www.oasis-open.org/committees/comments/index.php?wg_abbrev=virtio



Re: [PATCH 2/5] swiotlb: Add is_swiotlb_active() function

2019-01-24 Thread Joerg Roedel
On Thu, Jan 24, 2019 at 09:41:07AM +0100, Christoph Hellwig wrote:
> On Thu, Jan 24, 2019 at 09:29:23AM +0100, Joerg Roedel wrote:
> > > As I've just introduced and fixed a bug in this area in the current
> > > cycle - I don't think no_iotlb_memory is what your want (and maybe
> > > not useful at all): if the arch valls swiotlb_exit after previously
> > > initializing a buffer it won't be set.  You probably want to check
> > > for non-zero io_tlb_start and/or io_tlb_end.
> > 
> > Okay, but that requires that I also set io_tlb_start and friends back to
> > zero in the failure path of swiotlb_init(). Otherwise it could be left
> > non-zero in case swiotlb_init_with_tbl() returns an error.
> 
> Indeed, and we'll need to do that anyway as otherwise the dma mapping
> path might cause problems similar to the one when swiotlb_exit is
> called that I fixed.

Turns out the error path in swiotlb_init() is redundant because it
will never be executed. If the function returns at all, it will always
return 0, because in case of failure it just panics (through
memblock_alloc).

I'll clean that up in a separate patch set. There are more users of that
function, and all of them panic when the function fails.
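
I.e., the structure is roughly this (a simplified sketch, not the literal
source):

void __init swiotlb_init(int verbose)
{
	unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT;

	/* memblock_alloc() panics on failure, so vstart is never NULL... */
	void *vstart = memblock_alloc(PAGE_ALIGN(bytes), PAGE_SIZE);

	if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, verbose))
		return;

	/* ...which makes this error path effectively dead code. */
	pr_warn("Cannot allocate buffer");
	no_iotlb_memory = true;
}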


Joerg


Re: [PATCH v3] iommu: amd: Fix IOMMU page flush when detach device from a domain

2019-01-24 Thread j...@8bytes.org
On Thu, Jan 24, 2019 at 02:17:34PM +, Suthikulpanit, Suravee wrote:
> On 1/24/19 9:11 PM, j...@8bytes.org wrote:
> > On Thu, Jan 24, 2019 at 04:16:45AM +, Suthikulpanit, Suravee wrote:
> >>   drivers/iommu/amd_iommu.c | 15 +++
> >>   1 file changed, 11 insertions(+), 4 deletions(-)
> > 
> > Applied, thanks Suravee.
> > 
> 
> Thanks. Also, should this be backported to the stable tree as well?

I added a Fixes tag, so stable should pick it up.


Joerg


Re: [PATCH v3] iommu: amd: Fix IOMMU page flush when detach device from a domain

2019-01-24 Thread Suthikulpanit, Suravee
Joerg,

On 1/24/19 9:11 PM, j...@8bytes.org wrote:
> On Thu, Jan 24, 2019 at 04:16:45AM +, Suthikulpanit, Suravee wrote:
>>   drivers/iommu/amd_iommu.c | 15 +++
>>   1 file changed, 11 insertions(+), 4 deletions(-)
> 
> Applied, thanks Suravee.
> 

Thanks. Also, should this be backported to the stable tree as well?

Suravee


Re: [PATCH] iommu/dma: Remove unused variable

2019-01-24 Thread Joerg Roedel
On Thu, Jan 24, 2019 at 03:10:02PM +0800, Shaokun Zhang wrote:
>  drivers/iommu/dma-iommu.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)

Applied, thanks.


Re: [PATCH v3] iommu: amd: Fix IOMMU page flush when detach device from a domain

2019-01-24 Thread j...@8bytes.org
On Thu, Jan 24, 2019 at 04:16:45AM +, Suthikulpanit, Suravee wrote:
>  drivers/iommu/amd_iommu.c | 15 +++
>  1 file changed, 11 insertions(+), 4 deletions(-)

Applied, thanks Suravee.


Re: [PATCH 1/1] iommu/vt-d: Leave scalable mode default off

2019-01-24 Thread Joerg Roedel
On Thu, Jan 24, 2019 at 10:31:32AM +0800, Lu Baolu wrote:
> Commit 765b6a98c1de3 ("iommu/vt-d: Enumerate the scalable
> mode capability") enables VT-d scalable mode if the hardware
> advertises the capability. As we will bring up different
> features and use cases upstream in different patch
> series, this will leave some intermediate kernel versions
> which support only partial features. Hence, end users might run
> into problems when they use such kernels on bare metal
> or in virtualization environments.

I don't get it, can you be more specific about the problems that users
might run into? And is this patch needed as a fix for v5.0 or is it just
a precaution because future patches might break something for users?


Regards,

Joerg


Re: [PATCH v5 2/8] iommu/vt-d: Add per-device IOMMU feature ops entries

2019-01-24 Thread Joerg Roedel
Hi Lu Baolu,

On Thu, Jan 24, 2019 at 02:47:39PM +0800, Lu Baolu wrote:
> bool iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX)?

Looks good. Having a function to check for enabled features is certainly
a good thing.
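
(For reference, such a helper could look roughly like this; the exact
signature and the ops hook name in the v5 series are assumptions here:)

#include <linux/iommu.h>

bool iommu_dev_feature_enabled(struct device *dev, enum iommu_dev_features feat)
{
	const struct iommu_ops *ops = dev->bus->iommu_ops;

	/* Defer to the IOMMU driver; default to "not enabled" */
	if (ops && ops->dev_feat_enabled)
		return ops->dev_feat_enabled(dev, feat);

	return false;
}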

Regards,

Joerg


Re: [PATCH 5/5] virtio-blk: Consider virtio_max_dma_size() for maximum segment size

2019-01-24 Thread Joerg Roedel
On Thu, Jan 24, 2019 at 09:42:21AM +0100, Christoph Hellwig wrote:
> Yes.  But more importantly it would fix the limit for all other block
> drivers that set large segment sizes when running over swiotlb.

True, so it would be something like the diff below? I haven't worked on
the block layer, so I don't know if that needs additional checks for
->dev or anything.

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 3e7038e475ee..9a927280c904 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -1,6 +1,7 @@
 /*
  * Functions related to setting various queue properties from drivers
  */
+#include 
 #include 
 #include 
 #include 
@@ -303,13 +304,17 @@ EXPORT_SYMBOL_GPL(blk_queue_max_discard_segments);
  **/
 void blk_queue_max_segment_size(struct request_queue *q, unsigned int max_size)
 {
+   unsigned int dma_max_size;
+
if (max_size < PAGE_SIZE) {
max_size = PAGE_SIZE;
printk(KERN_INFO "%s: set to minimum %d\n",
   __func__, max_size);
}
 
-   q->limits.max_segment_size = max_size;
+   dma_max_size = dma_max_mapping_size(q->backing_dev_info->dev);
+
+   q->limits.max_segment_size = min(max_size, dma_max_size);
 }
 EXPORT_SYMBOL(blk_queue_max_segment_size);
 


Re: [PATCH 1/5] swiotlb: Introduce swiotlb_max_mapping_size()

2019-01-24 Thread Christoph Hellwig
On Thu, Jan 24, 2019 at 09:24:31AM +0100, Joerg Roedel wrote:
> On Wed, Jan 23, 2019 at 10:28:13PM +0100, Christoph Hellwig wrote:
> > On Wed, Jan 23, 2019 at 05:30:45PM +0100, Joerg Roedel wrote:
> > > +extern size_t swiotlb_max_mapping_size(struct device *dev);
> > 
> > No need for the extern keyword on function declarations in headers.
> 
> Right, but all other function declarations in that header file have
> 'extern' too, so I added it also for that one.

Your patch 3 also doesn't use an extern.  And I have to say I
much prefer it that way.


Re: [PATCH 2/3] PCI: iproc: CRS state check in config request

2019-01-24 Thread Srinath Mannam via iommu
Hi Bjorn,

Thanks for the review; please see my comments inline below.

On Fri, Jan 18, 2019 at 8:38 PM Bjorn Helgaas  wrote:
>
> On Fri, Jan 18, 2019 at 09:53:22AM +0530, Srinath Mannam wrote:
> > In the current implementation, config read of 0x0001 data
> > is assumed as CRS completion. but sometimes 0x0001 can be
> > a valid data.
> > IPROC PCIe RC has a register to show config request status flags
> > like SC, UR, CRS and CA.
> > So that extra check is added in the code to confirm the CRS
> > state using this register before reissue config request.
>
> s/. but/.  But/  (Sentences start with a capital letter.)
will change. Thanks.
>
> Please wrap this text correctly.  If it's a single paragraph, wrap it so
> the lines are filled.  It *looks* like it's intended to be separate
> paragraphs; they should be separated by blank lines.
will change. Thanks.
>
> > Signed-off-by: Srinath Mannam 
> > Reviewed-by: Ray Jui 
> > ---
> >  drivers/pci/controller/pcie-iproc.c | 23 +--
> >  1 file changed, 21 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pcie-iproc.c 
> > b/drivers/pci/controller/pcie-iproc.c
> > index 13ce80f..ee89d56 100644
> > --- a/drivers/pci/controller/pcie-iproc.c
> > +++ b/drivers/pci/controller/pcie-iproc.c
> > @@ -63,6 +63,10 @@
> >  #define APB_ERR_EN_SHIFT 0
> >  #define APB_ERR_EN   BIT(APB_ERR_EN_SHIFT)
> >
> > +#define CFG_RD_SUCCESS   0
> > +#define CFG_RD_UR1
> > +#define CFG_RD_CRS   2
> > +#define CFG_RD_CA3
> >  #define CFG_RETRY_STATUS 0x0001
> >  #define CFG_RETRY_STATUS_TIMEOUT_US  500000 /* 500 milliseconds */
> >
> > @@ -300,6 +304,9 @@ enum iproc_pcie_reg {
> >   IPROC_PCIE_IARR4,
> >   IPROC_PCIE_IMAP4,
> >
> > + /* config read status */
> > + IPROC_PCIE_CFG_RD_STATUS,
> > +
> >   /* link status */
> >   IPROC_PCIE_LINK_STATUS,
> >
> > @@ -370,6 +377,7 @@ static const u16 iproc_pcie_reg_paxb_v2[] = {
> >   [IPROC_PCIE_IMAP3]  = 0xe08,
> >   [IPROC_PCIE_IARR4]  = 0xe68,
> >   [IPROC_PCIE_IMAP4]  = 0xe70,
> > + [IPROC_PCIE_CFG_RD_STATUS]  = 0xee0,
> >   [IPROC_PCIE_LINK_STATUS]= 0xf0c,
> >   [IPROC_PCIE_APB_ERR_EN] = 0xf40,
> >   [IPROC_PCIE_ORDERING_CFG]   = 0x2000,
> > @@ -501,10 +509,12 @@ static void __iomem *iproc_pcie_map_ep_cfg_reg(struct 
> > iproc_pcie *pcie,
> >   return (pcie->base + offset);
> >  }
> >
> > -static unsigned int iproc_pcie_cfg_retry(void __iomem *cfg_data_p)
> > +static unsigned int iproc_pcie_cfg_retry(struct iproc_pcie *pcie,
> > +  void __iomem *cfg_data_p)
> >  {
> >   int timeout = CFG_RETRY_STATUS_TIMEOUT_US;
> >   unsigned int data;
> > + u32 status;
> >
> >   /*
> >* As per PCIe spec r3.1, sec 2.3.2, CRS Software Visibility only
> > @@ -525,6 +535,15 @@ static unsigned int iproc_pcie_cfg_retry(void __iomem 
> > *cfg_data_p)
> >*/
> >   data = readl(cfg_data_p);
> >   while (data == CFG_RETRY_STATUS && timeout--) {
> > + /*
> > +  * CRS state is set in CFG_RD status register
> > +  * This will handle the case where CFG_RETRY_STATUS is
> > +  * valid config data.
> > +  */
> > + status = iproc_pcie_read_reg(pcie, IPROC_PCIE_CFG_RD_STATUS);
> > + if (status != CFG_RD_CRS)
> > + return data;
> > +
> >   udelay(1);
> >   data = readl(cfg_data_p);
> >   }
> > @@ -603,7 +622,7 @@ static int iproc_pcie_config_read(struct pci_bus *bus, 
> > unsigned int devfn,
> >   if (!cfg_data_p)
> >   return PCIBIOS_DEVICE_NOT_FOUND;
> >
> > - data = iproc_pcie_cfg_retry(cfg_data_p);
> > + data = iproc_pcie_cfg_retry(pcie, cfg_data_p);
> >
> >   *val = data;
> >   if (size <= 2)
> > --
> > 2.7.4
> >


Re: [PATCH 5/5] virtio-blk: Consider virtio_max_dma_size() for maximum segment size

2019-01-24 Thread Christoph Hellwig
On Thu, Jan 24, 2019 at 09:40:11AM +0100, Joerg Roedel wrote:
> > I wonder if we should just move the dma max segment size check
> > into blk_queue_max_segment_size so that all block drivers benefit
> > from it.  Even if not I think at least the SCSI midlayer should
> > be updated to support it.
> 
> In that case the limit would also apply to virtio-blk even if it doesn't
> use the DMA-API. If that is acceptable we can move the check to
> blk_queue_max_segment_size().

Yes.  But more importantly it would fix the limit for all other block
drivers that set large segment sizes when running over swiotlb.


Re: [PATCH 2/5] swiotlb: Add is_swiotlb_active() function

2019-01-24 Thread Christoph Hellwig
On Thu, Jan 24, 2019 at 09:29:23AM +0100, Joerg Roedel wrote:
> > As I've just introduced and fixed a bug in this area in the current
> > cycle - I don't think no_iotlb_memory is what you want (and maybe
> > not useful at all): if the arch calls swiotlb_exit after previously
> > initializing a buffer it won't be set.  You probably want to check
> > for non-zero io_tlb_start and/or io_tlb_end.
> 
> Okay, but that requires that I also set io_tlb_start and friends back to
> zero in the failure path of swiotlb_init(). Otherwise it could be left
> non-zero in case swiotlb_init_with_tbl() returns an error.

Indeed, and we'll need to do that anyway as otherwise the dma mapping
path might cause problems similar to the one when swiotlb_exit is
called that I fixed.


Re: [PATCH 1/3] PCI: iproc: Add feature to set order mode

2019-01-24 Thread Srinath Mannam via iommu
Hi Bjorn,

Thanks for the review; please see my comments inline below.

On Fri, Jan 18, 2019 at 8:37 PM Bjorn Helgaas  wrote:
>
> On Fri, Jan 18, 2019 at 09:53:21AM +0530, Srinath Mannam wrote:
> > Order mode in RX header of incoming pcie packets can be override to
> > strict or loose order based on requirement.
> > Sysfs entry is provided to set dynamic and default order modes of upstream
> > traffic.
>
> s/pcie/PCIe/
will change. Thanks.
>
> If this is two paragraphs, insert a blank line between.  If it's one
> paragraph, reflow it so it reads like one paragraph.
>
will change. Thanks.
> Are you talking about the Relaxed Ordering bit in the TLP Attributes
> field?  If so, please use the exact language used in the spec and
> include a reference, e.g., to PCIe r4.0, sec 2.2.6.4, 2.4, etc.
>
Yes, the Relaxed Ordering bit in the TLP. I will use the spec reference. Thanks.
> I'm hesitant about a new sysfs file for this.  That automatically
> creates a user-space dependency on this iProc feature.  Who would use
> this file?
>
This is a specific feature provided in iProc, used to improve
performance for endpoints which do not support ordering
configuration at their end.
This is the reason we used a sysfs file, which allows the user to change
ordering based on the endpoint used and the requirement.
We are using these sysfs files to configure ordering to improve
performance in NVMe endpoints with SPDK/DPDK drivers.
If we enable this by default in the kernel, then it applies to all
connected endpoints. But it may not be required for all endpoints.

> > To improve performance in a few endpoints we are required to modify the
> > ordering attributes. Using this feature we can override the order modes
> > of RX packets at the iProc RC.
> >
> > Ex:
> > In PCIe-based NVMe SSD endpoints, data reads/writes from disk use
> > queue pairs (submission/completion). After completion of a block
> > read/write, the EP writes a completion command to the completion queue
> > to notify the RC. So all data transfers before the completion command
> > write do not require strict ordering, except the completion command
> > itself. It means all previous packets before the completion command
> > write to the RC should be written to memory and acknowledged.
>
> IIUC, if Enable Relaxed Ordering in the endpoint's Device Control
> register is set, the device is permitted to set the Relaxed Ordering
> bit in TLPs it initiates.  So I would think that if we set Enable
> Relaxed Ordering correctly, the NVMe endpoint should be able to
> use Relaxed Ordering for the data transfers and strict ordering (the
> default) for the completion command.  What am I missing?
>
As per the NVMe spec, Enable Relaxed Ordering is implementation specific;
not all cards may support it.

> This sysfs file apparently affects the Root Port/Root Complex
> behavior, not the Endpoint's behavior.  Does that mean iProc ignores
> the Relaxed Ordering bit by default, and you're providing a knob that
With the sysfs file setting, the ordering attributes of all TLPs directed to
a specific memory window can be overridden at the iProc layer based on the
settings.
TLPs to one memory window can be kept in strict order while other memory
windows can be set to relaxed order.
Using the sysfs file, ordering settings can be configured and reverted back
to the default.
> makes it pay attention to it?  If that's the case, why wouldn't you
> enable iProc support for Relaxed Ordering always, by default?
>
The option is given to the user because a few endpoints may support
ordering configuration internally and a few EPs don't. Based on the
requirement, the user can configure the ordering settings.
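
(For illustration, the two-bits-per-window encoding implied by the
RO_FIELD()/RO_VALUE() macros in the patch below; the helper and the
"relaxed" reading of RO_VALUE() are assumptions, not part of the posted
patch:)

#include <linux/bits.h>

#define RO_FIELD(window)	BIT((window) << 1)		/* override enable */
#define RO_VALUE(window)	BIT(((window) << 1) + 1)	/* assumed: relaxed */

static u32 iproc_set_window_order(u32 reg, unsigned int window, bool relaxed)
{
	reg |= RO_FIELD(window);	/* take control of this window's ordering */
	if (relaxed)
		reg |= RO_VALUE(window);
	else
		reg &= ~RO_VALUE(window);
	return reg;
}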

> > Signed-off-by: Srinath Mannam 
> > Reviewed-by: Ray Jui 
> > ---
> >  drivers/pci/controller/pcie-iproc.c | 128 
> > 
> >  drivers/pci/controller/pcie-iproc.h |  16 +
> >  2 files changed, 144 insertions(+)
> >
> > diff --git a/drivers/pci/controller/pcie-iproc.c 
> > b/drivers/pci/controller/pcie-iproc.c
> > index c20fd6b..13ce80f 100644
> > --- a/drivers/pci/controller/pcie-iproc.c
> > +++ b/drivers/pci/controller/pcie-iproc.c
> > @@ -57,6 +57,9 @@
> >  #define PCIE_DL_ACTIVE_SHIFT 2
> >  #define PCIE_DL_ACTIVE   BIT(PCIE_DL_ACTIVE_SHIFT)
> >
> > +#define MPS_MRRS_CFG_MPS_SHIFT   0
> > +#define MPS_MRRS_CFG_MRRS_SHIFT  16
> > +
> >  #define APB_ERR_EN_SHIFT 0
> >  #define APB_ERR_EN   BIT(APB_ERR_EN_SHIFT)
> >
> > @@ -91,6 +94,14 @@
> >
> >  #define IPROC_PCIE_REG_INVALID   0x
> >
> > +#define RO_FIELD(window) BIT((window) << 1)
> > +#define RO_VALUE(window) BIT(((window) << 1) + 1)
> > +/* All Windows are allowed */
> > +#define RO_ALL_WINDOW0x
> > +/* Wait on All Windows */
> > +#define RO_FIELD_ALL_WINDOW  0x
> > +#define DYNAMIC_ORDER_MODE   0x5
> > +
> >  /**
> >   * iProc PCIe outbound mapping controller specific parameters
> >   *
> > @@ -295,6 +306,15 @@ enum iproc_pcie_reg {
> >   /* enable APB error for unsupported 

Re: [PATCH 5/5] virtio-blk: Consider virtio_max_dma_size() for maximum segment size

2019-01-24 Thread Joerg Roedel
On Wed, Jan 23, 2019 at 10:31:39PM +0100, Christoph Hellwig wrote:
> On Wed, Jan 23, 2019 at 05:30:49PM +0100, Joerg Roedel wrote:
> > +   max_size = virtio_max_dma_size(vdev);
> > +
> > /* Host can optionally specify maximum segment size and number of
> >  * segments. */
> > err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SIZE_MAX,
> >struct virtio_blk_config, size_max, &v);
> > if (!err)
> > -   blk_queue_max_segment_size(q, v);
> > -   else
> > -   blk_queue_max_segment_size(q, -1U);
> > +   max_size = min(max_size, v);
> > +
> > +   blk_queue_max_segment_size(q, max_size);
> 
> I wonder if we should just move the dma max segment size check
> into blk_queue_max_segment_size so that all block drivers benefit
> from it.  Even if not I think at least the SCSI midlayer should
> be updated to support it.

In that case the limit would also apply to virtio-blk even if it doesn't
use the DMA-API. If that is acceptable we can move the check to
blk_queue_max_segment_size().

Regards,

Joerg


Re: [PATCH 2/5] swiotlb: Add is_swiotlb_active() function

2019-01-24 Thread Joerg Roedel
On Wed, Jan 23, 2019 at 10:27:55PM +0100, Christoph Hellwig wrote:
> On Wed, Jan 23, 2019 at 05:30:46PM +0100, Joerg Roedel wrote:
> > +bool is_swiotlb_active(void)
> > +{
> > +   return !no_iotlb_memory;
> > +}
> 
> As I've just introduced and fixed a bug in this area in the current
> cycle - I don't think no_iotlb_memory is what you want (and maybe
> not useful at all): if the arch calls swiotlb_exit after previously
> initializing a buffer it won't be set.  You probably want to check
> for non-zero io_tlb_start and/or io_tlb_end.

Okay, but that requires that I also set io_tlb_start and friends back to
zero in the failure path of swiotlb_init(). Otherwise it could be left
non-zero in case swiotlb_init_with_tbl() returns an error.
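
Concretely, something like this (a sketch, assuming the check moves over
to the buffer pointers as suggested):

bool is_swiotlb_active(void)
{
	/*
	 * io_tlb_end is only non-zero while a bounce buffer is actually
	 * set up; swiotlb_exit() and the init failure path clear it.
	 */
	return io_tlb_end != 0;
}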


Regards,

Joerg


Re: [PATCH 1/5] swiotlb: Introduce swiotlb_max_mapping_size()

2019-01-24 Thread Joerg Roedel
On Wed, Jan 23, 2019 at 10:28:13PM +0100, Christoph Hellwig wrote:
> On Wed, Jan 23, 2019 at 05:30:45PM +0100, Joerg Roedel wrote:
> > +extern size_t swiotlb_max_mapping_size(struct device *dev);
> 
> No need for the extern keyword on function declarations in headers.

Right, but all other function declarations in that header file have
'extern' too, so I added it also for that one.

Regards,

Joerg


Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-24 Thread j...@8bytes.org
Hi Suravee,

On Thu, Jan 24, 2019 at 03:25:19AM +, Suthikulpanit, Suravee wrote:
> Actually, I just noticed that device_flush_dte() has already handled flushing
> the DTE for the alias device as well. Please see the link below.
> 
> https://elixir.bootlin.com/linux/latest/source/drivers/iommu/amd_iommu.c#L1219
> 
> So, we don't need to call device_flush_dte() for the alias device in do_detach().

You are right, I missed that.

Thanks,

Joerg