Re: [PATCH] dma-contiguous: Prioritize restricted DMA pool over shared DMA pool

2022-02-16 Thread Florian Fainelli
On 2/16/22 11:48 AM, Robin Murphy wrote:
> On 2022-02-16 17:37, Florian Fainelli wrote:
>> On 2/16/22 3:13 AM, Robin Murphy wrote:
>>> On 2022-02-15 22:43, Florian Fainelli wrote:
>>>> Some platforms might define the same memory region to be suitable
>>>> for use by a kernel supporting CONFIG_DMA_RESTRICTED_POOL while
>>>> maintaining compatibility with older kernels that do not support
>>>> that. This is achieved by declaring the node this way:
>>>
>>> Those platforms are doing something inexplicably wrong, then.
>>
>> Matter of perspective I guess.
>>
>>>
>>> "restricted-dma-pool" says that DMA for the device has to be bounced
>>> through a dedicated pool because it can't be trusted with visibility of
>>> regular OS memory. "reusable" tells the OS that it's safe to use the
>>> pool as regular OS memory while it's idle. Do you see how those concepts
>>> are fundamentally incompatible?
>>
>> From the perspective of the software or firmware agent that is
>> responsible for setting up the appropriate protection on the reserved
>> memory, it does not matter what the compatible string is, the only
>> properties that matter are the base address, size, and possibly whether
>> 'no-map' is specified or not to set-up appropriate protection for the
>> various memory controller agents (CPU, PCIe, everything else).
>>
>> Everything else is just information provided to the OS in order to
>> provide a different implementation keyed off the compatible string. So
>> with that in mind, you can imagine that before the introduction of
>> 'restricted-dma-pool' in 5.15, some platforms already had such a concept
>> of a reserved DMA region, that was backed by a device private CMA pool,
>> they would allocate memory from that region and would create their own
>> middle layer for bounce buffering if they liked to. This is obviously
>> not ideal on a number of levels starting from not being done at the
>> appropriate level but it was done.
>>
>> Now that 'restricted-dma-pool' is supported, transitioning them over is
>> obviously better, and updating the compatible string for those specific
>> regions to include the more descriptive 'restricted-dma-pool' sounded
>> to me like an acceptable way to maintain forward/backward DTB
>> compatibility, rather than doubly reserving these regions, once with the
>> "old" compatible and once with the "new" compatible; not that the system
>> is even capable of doing that anyway, so we would have had to
>> essentially make them adjacent.
>>
>> And no, we are not bringing Linux version awareness to our boot loader
>> mangling the Device Tree blob, that's not happening, hence this patch.
> 
> If the patch was adding a "brcm,insecure-dma-pool" compatible and
> hooking it up, I'd be less bothered. As it is, I remain unconvinced that
> describing two things that are not interchangeable with each other as
> interchangeable with each other is in any way "better".

We most definitely should have done that, but we did not because we
like to carry as few patches as possible against the mainline kernel,
believe it or not. Also, it would have been fun to explain why we went
with our own compatible string to obtain the same semantics from the
kernel as the generic one, but I will stick a pin in that idea.

> 
>>> Linux is right to reject contradictory information rather than attempt
>>> to guess at what might be safe or not.
>>
>> The piece of contradictory information here specifically is the
>> 'reusable' boolean property, but as I explained in the commit message,
>> if you have a "properly formed" 'restricted-dma-pool' region then it
>> should not have that property in the first place, but even if it does,
>> it does not harm anything to have it.
>>
>>>
>>> Furthermore, down at the practical level, a SWIOTLB pool is used for
>>> bouncing streaming DMA API mappings, while a coherent/CMA pool is used
>>> for coherent DMA API allocations; they are not functionally
>>> interchangeable either. If a device depends on coherent allocations
>>> rather than streaming DMA, it should still have a coherent pool even
>>> under a physical memory protection scheme, and if it needs both
>>> streaming DMA and coherent buffers then it can have both types of pool -
>>> the bindings explicitly call that out. It's true that SWIOTLB pools can
>>> act as an emergency fallback for small allocations for I/O-coherent
>>> devices, but that's not really intended to be relied upon heavily.
>>>
>>> So no, I do not see anything wrong with the current handling of
>>> nonsensical DTs here, sorry.
>>
>> There is nothing wrong in the current code, but with changes that have
>> no adverse effect on "properly" constructed reserved memory entries we
>> can accept both types of reservation and maintain forward/backward
>> compatibility in our case. So why not?
> 
> Would you be happy to give me blanket permission to point a gun at your
> foot and pull the trigger at any point in the future, if right now I
> show you an unloaded gun?

Re: [PATCH] dma-contiguous: Prioritize restricted DMA pool over shared DMA pool

2022-02-16 Thread Robin Murphy

On 2022-02-16 17:37, Florian Fainelli wrote:

On 2/16/22 3:13 AM, Robin Murphy wrote:

On 2022-02-15 22:43, Florian Fainelli wrote:

Some platforms might define the same memory region to be suitable for
use by a kernel supporting CONFIG_DMA_RESTRICTED_POOL while maintaining
compatibility with older kernels that do not support that. This is
achieved by declaring the node this way:


Those platforms are doing something inexplicably wrong, then.


Matter of perspective I guess.



"restricted-dma-pool" says that DMA for the device has to be bounced
through a dedicated pool because it can't be trusted with visibility of
regular OS memory. "reusable" tells the OS that it's safe to use the
pool as regular OS memory while it's idle. Do you see how those concepts
are fundamentally incompatible?


From the perspective of the software or firmware agent that is
responsible for setting up the appropriate protection on the reserved
memory, it does not matter what the compatible string is, the only
properties that matter are the base address, size, and possibly whether
'no-map' is specified or not to set-up appropriate protection for the
various memory controller agents (CPU, PCIe, everything else).

Everything else is just information provided to the OS in order to
provide a different implementation keyed off the compatible string. So
with that in mind, you can imagine that before the introduction of
'restricted-dma-pool' in 5.15, some platforms already had such a concept
of a reserved DMA region, that was backed by a device private CMA pool,
they would allocate memory from that region and would create their own
middle layer for bounce buffering if they liked to. This is obviously
not ideal on a number of levels starting from not being done at the
appropriate level but it was done.

Now that 'restricted-dma-pool' is supported, transitioning them over is
obviously better, and updating the compatible string for those specific
regions to include the more descriptive 'restricted-dma-pool' sounded
to me like an acceptable way to maintain forward/backward DTB
compatibility, rather than doubly reserving these regions, once with the
"old" compatible and once with the "new" compatible; not that the system
is even capable of doing that anyway, so we would have had to
essentially make them adjacent.

And no, we are not bringing Linux version awareness to our boot loader
mangling the Device Tree blob, that's not happening, hence this patch.


If the patch was adding a "brcm,insecure-dma-pool" compatible and 
hooking it up, I'd be less bothered. As it is, I remain unconvinced that 
describing two things that are not interchangeable with each other as 
interchangeable with each other is in any way "better".



Linux is right to reject contradictory information rather than attempt
to guess at what might be safe or not.


The piece of contradictory information here specifically is the
'reusable' boolean property, but as I explained in the commit message,
if you have a "properly formed" 'restricted-dma-pool' region then it
should not have that property in the first place, but even if it does,
it does not harm anything to have it.



Furthermore, down at the practical level, a SWIOTLB pool is used for
bouncing streaming DMA API mappings, while a coherent/CMA pool is used
for coherent DMA API allocations; they are not functionally
interchangeable either. If a device depends on coherent allocations
rather than streaming DMA, it should still have a coherent pool even
under a physical memory protection scheme, and if it needs both
streaming DMA and coherent buffers then it can have both types of pool -
the bindings explicitly call that out. It's true that SWIOTLB pools can
act as an emergency fallback for small allocations for I/O-coherent
devices, but that's not really intended to be relied upon heavily.
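As a sketch of what the bindings describe here, a device that needs both streaming bounce buffers and coherent allocations would reference two separate reserved-memory pools. All node names, addresses, and sizes below are made up for illustration; only the compatible strings and the multi-entry `memory-region` usage come from the bindings:

```dts
reserved-memory {
        #address-cells = <1>;
        #size-cells = <1>;
        ranges;

        /* bounce pool for streaming DMA API mappings (SWIOTLB) */
        restricted_dma: restricted-dma@50000000 {
                compatible = "restricted-dma-pool";
                reg = <0x50000000 0x400000>;
        };

        /* pool for coherent DMA API allocations */
        coherent_pool: coherent-pool@50400000 {
                compatible = "shared-dma-pool";
                reg = <0x50400000 0x400000>;
                no-map;
        };
};

some_device: device@60000000 {
        /* ... */
        memory-region = <&restricted_dma>, <&coherent_pool>;
};
```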

So no, I do not see anything wrong with the current handling of
nonsensical DTs here, sorry.


There is nothing wrong in the current code, but with changes that have
no adverse effect on "properly" constructed reserved memory entries we
can accept both types of reservation and maintain forward/backward
compatibility in our case. So why not?


Would you be happy to give me blanket permission to point a gun at your 
foot and pull the trigger at any point in the future, if right now I 
show you an unloaded gun?


Security and lazy shortcuts do not mix well. You are literally arguing 
that mainline Linux should support a back-door ABI for illegal DT 
properties which at worst has the potential to defeat a generic security 
feature. The "restricted-dma-pool" binding explicitly says "When using 
this, the no-map and reusable properties must not be set" (I should spin 
up a patch enforcing that in the schema...). No matter how convinced you
are that no OS, past, present or future, could possibly ever behave
differently from the particular downstream software stack you care
about, NAK to subverting the "restricted-dma-pool" compatible in any

[PATCH] iommu/vt-d: Enable ATS for the devices in SATC table

2022-02-16 Thread Yian Chen
Starting from Intel VT-d v3.2, Intel platform BIOS can provide
additional SATC table structure. SATC table includes a list of
SoC integrated devices that support ATC (Address translation
cache).

Enabling ATC (via the ATS capability) can be a functional requirement
for SATC device operation, or optional, to enhance device
performance/functionality. This is determined by the ATC_REQUIRED
bit in the SATC table. When the IOMMU is working in scalable
mode, software chooses to always enable ATS for every device in
the SATC table, because Intel SoC devices in the SATC table are
trusted to use ATS.

On the other hand, if the IOMMU is in legacy mode, ATS of SATC
capable devices can work transparently to software and be
automatically enabled by IOMMU hardware. As a result,
there is no need for software to enable ATS on these devices.

Signed-off-by: Yian Chen 
---
 drivers/iommu/intel/iommu.c | 53 ++---
 include/linux/intel-iommu.h |  2 +-
 2 files changed, 51 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 92fea3fbbb11..58a80cec50bb 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -872,7 +872,6 @@ static bool iommu_is_dummy(struct intel_iommu *iommu, 
struct device *dev)
 
return false;
 }
-
 struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn)
 {
struct dmar_drhd_unit *drhd = NULL;
@@ -2684,7 +2683,7 @@ static struct dmar_domain 
*dmar_insert_one_dev_info(struct intel_iommu *iommu,
 
if (ecap_dev_iotlb_support(iommu->ecap) &&
pci_ats_supported(pdev) &&
-   dmar_find_matched_atsr_unit(pdev))
+   dmar_ats_supported(pdev, iommu))
info->ats_supported = 1;
 
if (sm_supported(iommu)) {
@@ -4020,7 +4019,42 @@ static void intel_iommu_free_dmars(void)
}
 }
 
-int dmar_find_matched_atsr_unit(struct pci_dev *dev)
+/* dev_satc_state - Find if dev is in a DMAR SATC table
+ *
+ * return value:
+ *1: dev is in SATC table and ATS is required
+ *0: dev is in SATC table and ATS is optional
+ *-1: dev isn't in SATC table
+ */
+static int dev_satc_state(struct pci_dev *dev)
+{
+   int i, ret = -1;
+   struct device *tmp;
+   struct dmar_satc_unit *satcu;
+   struct acpi_dmar_satc *satc;
+
+   dev = pci_physfn(dev);
+   rcu_read_lock();
+
+   list_for_each_entry_rcu(satcu, &dmar_satc_units, list) {
+   satc = container_of(satcu->hdr, struct acpi_dmar_satc, header);
+   if (satc->segment != pci_domain_nr(dev->bus))
+   continue;
+   for_each_dev_scope(satcu->devices, satcu->devices_cnt, i, tmp)
+   if (to_pci_dev(tmp) == dev) {
+   if (satc->flags)
+   ret = 1;
+   else
+   ret = 0;
+   goto out;
+   }
+   }
+out:
+   rcu_read_unlock();
+   return ret;
+}
+
+int dmar_ats_supported(struct pci_dev *dev, struct intel_iommu *iommu)
 {
int i, ret = 1;
struct pci_bus *bus;
@@ -4030,6 +4064,19 @@ int dmar_find_matched_atsr_unit(struct pci_dev *dev)
struct dmar_atsr_unit *atsru;
 
dev = pci_physfn(dev);
+   i = dev_satc_state(dev);
+   if (i >= 0) {
+   /* This dev supports ATS as it is in the SATC table!
+    * When the IOMMU is in legacy mode, enabling ATS is done
+    * automatically by HW for the device that requires
+    * ATS, hence the OS should not enable ATS for this device,
+    * to avoid duplicated TLB invalidation.
+    */
+   if (i && !sm_supported(iommu))
+   ret = 0;
+   return ret;
+   }
+
for (bus = dev->bus; bus; bus = bus->parent) {
bridge = bus->self;
/* If it's an integrated device, allow ATS */
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 69230fd695ea..fe9fd417d611 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -717,7 +717,7 @@ static inline int nr_pte_to_next_page(struct dma_pte *pte)
 }
 
 extern struct dmar_drhd_unit * dmar_find_matched_drhd_unit(struct pci_dev 
*dev);
-extern int dmar_find_matched_atsr_unit(struct pci_dev *dev);
+extern int dmar_ats_supported(struct pci_dev *dev, struct intel_iommu *iommu);
 
 extern int dmar_enable_qi(struct intel_iommu *iommu);
 extern void dmar_disable_qi(struct intel_iommu *iommu);
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] dma-contiguous: Prioritize restricted DMA pool over shared DMA pool

2022-02-16 Thread Florian Fainelli
On 2/16/22 3:13 AM, Robin Murphy wrote:
> On 2022-02-15 22:43, Florian Fainelli wrote:
>> Some platforms might define the same memory region to be suitable for
>> use by a kernel supporting CONFIG_DMA_RESTRICTED_POOL while maintaining
>> compatibility with older kernels that do not support that. This is
>> achieved by declaring the node this way:
> 
> Those platforms are doing something inexplicably wrong, then.

Matter of perspective I guess.

> 
> "restricted-dma-pool" says that DMA for the device has to be bounced
> through a dedicated pool because it can't be trusted with visibility of
> regular OS memory. "reusable" tells the OS that it's safe to use the
> pool as regular OS memory while it's idle. Do you see how those concepts
> are fundamentally incompatible?

From the perspective of the software or firmware agent that is
responsible for setting up the appropriate protection on the reserved
memory, it does not matter what the compatible string is, the only
properties that matter are the base address, size, and possibly whether
'no-map' is specified or not to set-up appropriate protection for the
various memory controller agents (CPU, PCIe, everything else).

Everything else is just information provided to the OS in order to
provide a different implementation keyed off the compatible string. So
with that in mind, you can imagine that before the introduction of
'restricted-dma-pool' in 5.15, some platforms already had such a concept
of a reserved DMA region, that was backed by a device private CMA pool,
they would allocate memory from that region and would create their own
middle layer for bounce buffering if they liked to. This is obviously
not ideal on a number of levels starting from not being done at the
appropriate level but it was done.

Now that 'restricted-dma-pool' is supported, transitioning them over is
obviously better, and updating the compatible string for those specific
regions to include the more descriptive 'restricted-dma-pool' sounded
to me like an acceptable way to maintain forward/backward DTB
compatibility, rather than doubly reserving these regions, once with the
"old" compatible and once with the "new" compatible; not that the system
is even capable of doing that anyway, so we would have had to
essentially make them adjacent.

And no, we are not bringing Linux version awareness to our boot loader
mangling the Device Tree blob, that's not happening, hence this patch.

> 
> Linux is right to reject contradictory information rather than attempt
> to guess at what might be safe or not.

The piece of contradictory information here specifically is the
'reusable' boolean property, but as I explained in the commit message,
if you have a "properly formed" 'restricted-dma-pool' region then it
should not have that property in the first place, but even if it does,
it does not harm anything to have it.

> 
> Furthermore, down at the practical level, a SWIOTLB pool is used for
> bouncing streaming DMA API mappings, while a coherent/CMA pool is used
> for coherent DMA API allocations; they are not functionally
> interchangeable either. If a device depends on coherent allocations
> rather than streaming DMA, it should still have a coherent pool even
> under a physical memory protection scheme, and if it needs both
> streaming DMA and coherent buffers then it can have both types of pool -
> the bindings explicitly call that out. It's true that SWIOTLB pools can
> act as an emergency fallback for small allocations for I/O-coherent
> devices, but that's not really intended to be relied upon heavily.
> 
> So no, I do not see anything wrong with the current handling of
> nonsensical DTs here, sorry.

There is nothing wrong in the current code, but with changes that have
no adverse effect on "properly" constructed reserved memory entries we
can accept both types of reservation and maintain forward/backward
compatibility in our case. So why not?
-- 
Florian


[PATCH] drivers/iommu: use struct_size over open coded arithmetic

2022-02-16 Thread cgel . zte
From: Minghao Chi (CGEL ZTE) 

Replace zero-length array with flexible-array member and make use
of the struct_size() helper in kzalloc(). For example:

struct viommu_request {
...
unsigned intlen;
charbuf[];
};

Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.

Reported-by: Zeal Robot 
Signed-off-by: Minghao Chi (CGEL ZTE) 
---
 drivers/iommu/virtio-iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index f2aa34f57454..0996d9c7c358 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -231,7 +231,7 @@ static int __viommu_add_req(struct viommu_dev *viommu, void 
*buf, size_t len,
if (write_offset <= 0)
return -EINVAL;
 
-   req = kzalloc(sizeof(*req) + len, GFP_ATOMIC);
+   req = kzalloc(struct_size(req, buf, len), GFP_ATOMIC);
if (!req)
return -ENOMEM;
 
-- 
2.25.1



Re: [PATCH v1 3/8] iommu: Extend iommu_at[de]tach_device() for multi-device groups

2022-02-16 Thread Jason Gunthorpe via iommu
On Wed, Feb 16, 2022 at 02:28:09PM +0800, Lu Baolu wrote:

> It seems everyone agrees that for device assignment (where the I/O
> address is owned by the user-space application), the iommu_group-based
> APIs should always be used. Otherwise, the isolation and protection are
> not guaranteed.

This group/device split is all just driven by VFIO. There is nothing
preventing a struct device * API from being used with user-space, and
Robin has been pushing that way. With enough fixing of VFIO we can do
it.

eg the device-centric VFIO patches should be able to eventually work
entirely on an iommu device API.

> Another proposal (as suggested by Joerg) is to introduce the concept of
> "sub-group". An iommu group could have one or multiple sub-groups with
> non-aliased devices sitting in different sub-groups and use different
> domains.

I still don't see how sub-groups help or really change anything here.

The API already has the concept of 'ownership' separated from the
concept of 'attach a domain to a device'.

Ownership works on the ACS group and attach works on the 'same RID'
group.

The API can take in the struct device and select which internal group
to use based on which action is being done.

Jason


Re: [PATCH] iommu/tegra-smmu: Fix missing put_device() call in tegra_smmu_find

2022-02-16 Thread Thierry Reding
On Fri, Jan 07, 2022 at 08:09:11AM +, Miaoqian Lin wrote:
> The reference taken by 'of_find_device_by_node()' must be released when
> not needed anymore.
> Add the corresponding 'put_device()' in the error handling path.
> 
> Fixes: 765a9d1d02b2 ("iommu/tegra-smmu: Fix mc errors on tegra124-nyan")
> Signed-off-by: Miaoqian Lin 
> ---
>  drivers/iommu/tegra-smmu.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index e900e3c46903..2561ce8a2ce8 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -808,8 +808,10 @@ static struct tegra_smmu *tegra_smmu_find(struct 
> device_node *np)
>   return NULL;
>  
>   mc = platform_get_drvdata(pdev);
> - if (!mc)
> + if (!mc) {
> + put_device(&pdev->dev);
>   return NULL;
> + }
>  
>   return mc->smmu;
>  }

Sorry for the late reply, looks correct. We probably also need a similar
call in ->release_device(). I also wonder if we should be returning an
-EPROBE_DEFER here, which is technically the correct thing to do, though
in practice that will likely never happen because these pointers are set
during an arch_initcall, so should always be available by the time a
driver tries to attach to an IOMMU.

Acked-by: Thierry Reding 



Re: [PATCH v2 1/4] dt-bindings: arm-smmu: Document nvidia,memory-controller property

2022-02-16 Thread Thierry Reding
On Thu, Dec 09, 2021 at 05:35:57PM +0100, Thierry Reding wrote:
> From: Thierry Reding 
> 
> On NVIDIA SoC's the ARM SMMU needs to interact with the memory
> controller in order to map memory clients to the corresponding stream
> IDs. Document how the nvidia,memory-controller property can be used to
> achieve this.
> 
> Note that this is a backwards-incompatible change that is, however,
> necessary to ensure correctness. Without the new property, most of the
> devices would still work but it is not guaranteed that all will.
> 
> Signed-off-by: Thierry Reding 
> ---
> Changes in v2:
> - clarify why the new nvidia,memory-controller property is required
> 
>  .../devicetree/bindings/iommu/arm,smmu.yaml | 17 +
>  1 file changed, 17 insertions(+)

Hi Joerg,

can you pick up patches 1-3 of this series? DT bindings have been
reviewed by Rob and Will acked the ARM SMMU change. I can take the
device tree changes (patch 4) through the Tegra tree.

Thanks,
Thierry



[PATCH -next] iommu/vt-d: Remove unused function intel_svm_capable()

2022-02-16 Thread YueHaibing via iommu
This is unused since commit 404837741416 ("iommu/vt-d: Use 
iommu_sva_alloc(free)_pasid() helpers")

Signed-off-by: YueHaibing 
---
 drivers/iommu/intel/svm.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 5b5d69b04fcc..2c53689da461 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -168,11 +168,6 @@ int intel_svm_finish_prq(struct intel_iommu *iommu)
return 0;
 }
 
-static inline bool intel_svm_capable(struct intel_iommu *iommu)
-{
-   return iommu->flags & VTD_FLAG_SVM_CAPABLE;
-}
-
 void intel_svm_check(struct intel_iommu *iommu)
 {
if (!pasid_supported(iommu))
-- 
2.17.1



Re: [PATCH] dma-contiguous: Prioritize restricted DMA pool over shared DMA pool

2022-02-16 Thread Robin Murphy

On 2022-02-15 22:43, Florian Fainelli wrote:

Some platforms might define the same memory region to be suitable for
use by a kernel supporting CONFIG_DMA_RESTRICTED_POOL while maintaining
compatibility with older kernels that do not support that. This is
achieved by declaring the node this way:


Those platforms are doing something inexplicably wrong, then.

"restricted-dma-pool" says that DMA for the device has to be bounced 
through a dedicated pool because it can't be trusted with visibility of 
regular OS memory. "reusable" tells the OS that it's safe to use the 
pool as regular OS memory while it's idle. Do you see how those concepts 
are fundamentally incompatible?


Linux is right to reject contradictory information rather than attempt 
to guess at what might be safe or not.


Furthermore, down at the practical level, a SWIOTLB pool is used for 
bouncing streaming DMA API mappings, while a coherent/CMA pool is used 
for coherent DMA API allocations; they are not functionally 
interchangeable either. If a device depends on coherent allocations 
rather than streaming DMA, it should still have a coherent pool even 
under a physical memory protection scheme, and if it needs both 
streaming DMA and coherent buffers then it can have both types of pool - 
the bindings explicitly call that out. It's true that SWIOTLB pools can 
act as an emergency fallback for small allocations for I/O-coherent 
devices, but that's not really intended to be relied upon heavily.


So no, I do not see anything wrong with the current handling of 
nonsensical DTs here, sorry.


Robin.


cma@4000 {
compatible = "restricted-dma-pool", "shared-dma-pool";
reusable;
...
};

A newer kernel would leverage the 'restricted-dma-pool' compatible
string for that reserved region, while an older kernel would use the
'shared-dma-pool' compatibility string.

Due to the link ordering between files under kernel/dma/ however,
contiguous.c would try to claim the region and we would never get a
chance for swiotlb.c to claim that reserved memory region.

To that extent, update kernel/dma/contiguous.c in order to check
specifically for the 'restricted-dma-pool' compatibility string when
CONFIG_DMA_RESTRICTED_POOL is enabled and give up that region such that
kernel/dma/swiotlb.c has a chance to claim it.

Similarly, kernel/dma/swiotlb.c is updated to remove the check for the
'reusable' property because that check is not useful. When a region is
defined with a compatible string set to 'restricted-dma-pool', no code
under kernel/dma/{contiguous,coherent}.c will match that region since
there is no 'shared-dma-pool' compatible string. If a region is defined
with a compatible string set as above with both 'restricted-dma-pool'
*and* 'shared-dma-pool' however, checking for 'reusable' will prevent
kernel/dma/swiotlb.c from claiming the region while it is still
perfectly suitable, since contiguous.c gave it up.

Signed-off-by: Florian Fainelli 
---
  kernel/dma/contiguous.c | 7 +++
  kernel/dma/swiotlb.c| 3 +--
  2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 3d63d91cba5c..3c418af6d306 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -416,6 +416,13 @@ static int __init rmem_cma_setup(struct reserved_mem *rmem)
of_get_flat_dt_prop(node, "no-map", NULL))
return -EINVAL;
  
+#ifdef CONFIG_DMA_RESTRICTED_POOL
+   if (of_flat_dt_is_compatible(node, "restricted-dma-pool")) {
+   pr_warn("Giving up %pa for restricted DMA pool\n", &rmem->base);
+   return -ENOENT;
+   }
+#endif
+
if ((rmem->base & mask) || (rmem->size & mask)) {
pr_err("Reserved memory: incorrect alignment of CMA region\n");
return -EINVAL;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index f1e7ea160b43..9d6e4ae74d04 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -883,8 +883,7 @@ static int __init rmem_swiotlb_setup(struct reserved_mem 
*rmem)
  {
unsigned long node = rmem->fdt_node;
  
-	if (of_get_flat_dt_prop(node, "reusable", NULL) ||
-	    of_get_flat_dt_prop(node, "linux,cma-default", NULL) ||
+	if (of_get_flat_dt_prop(node, "linux,cma-default", NULL) ||
of_get_flat_dt_prop(node, "linux,dma-default", NULL) ||
of_get_flat_dt_prop(node, "no-map", NULL))
return -EINVAL;



Re: [PATCH v3 8/8] iommu/arm-smmu-v3: Make default domain type of HiSilicon PTT device to identity

2022-02-16 Thread Yicong Yang via iommu
On 2022/2/15 22:29, Robin Murphy wrote:
> On 2022-02-15 13:42, Will Deacon wrote:
>> On Tue, Feb 15, 2022 at 01:30:26PM +, Robin Murphy wrote:
>>> On 2022-02-15 13:00, Will Deacon wrote:
 On Mon, Feb 14, 2022 at 08:55:20PM +0800, Yicong Yang wrote:
> On 2022/1/24 21:11, Yicong Yang wrote:
>> The DMA of HiSilicon PTT device can only work with identical
>> mapping. So add a quirk for the device to force the domain
>> passthrough.
>>
>> Signed-off-by: Yicong Yang 
>> ---
>>    drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 
>>    1 file changed, 16 insertions(+)
>>
>> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
>> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> index 6dc6d8b6b368..6f67a2b1dd27 100644
>> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
>> @@ -2838,6 +2838,21 @@ static int arm_smmu_dev_disable_feature(struct 
>> device *dev,
>>    }
>>    }
>> +#define IS_HISI_PTT_DEVICE(pdev)    ((pdev)->vendor == 
>> PCI_VENDOR_ID_HUAWEI && \
>> + (pdev)->device == 0xa12e)
>> +
>> +static int arm_smmu_def_domain_type(struct device *dev)
>> +{
>> +    if (dev_is_pci(dev)) {
>> +    struct pci_dev *pdev = to_pci_dev(dev);
>> +
>> +    if (IS_HISI_PTT_DEVICE(pdev))
>> +    return IOMMU_DOMAIN_IDENTITY;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>    static struct iommu_ops arm_smmu_ops = {
>>    .capable    = arm_smmu_capable,
>>    .domain_alloc    = arm_smmu_domain_alloc,
>> @@ -2863,6 +2878,7 @@ static struct iommu_ops arm_smmu_ops = {
>>    .sva_unbind    = arm_smmu_sva_unbind,
>>    .sva_get_pasid    = arm_smmu_sva_get_pasid,
>>    .page_response    = arm_smmu_page_response,
>> +    .def_domain_type    = arm_smmu_def_domain_type,
>>    .pgsize_bitmap    = -1UL, /* Restricted during device attach 
>> */
>>    .owner    = THIS_MODULE,
>>    };
>>
>
> Is this quirk ok with the SMMU v3 driver? Just want to confirm that I'm 
> on the
> right way to dealing with the issue of our device.

 I don't think the quirk should be in the SMMUv3 driver. Assumedly, you 
 would
 have the exact same problem if you stuck the PTT device behind a different
 type of IOMMU, and so the quirk should be handled by a higher level of the
 stack.
>>>
>>> Conceptually, yes, but I'm inclined to be pragmatic here. Default domain
>>> quirks could only move out as far as the other end of the call from
>>> iommu_get_def_domain_type() - it's not like we could rely on some flag in a
>>> driver which may not even be loaded yet, let alone matched to the device.
>>> And even then there's an equal and opposite argument for why the core code
>>> should have to maintain a list of platform-specific quirks rather than code
>>> specific to the relevant platforms. The fact is that a HiSilicon RCiEP is
>>> not going to end up behind anything other than a HiSilicon IOMMU, and if
>>> those ever stop being SMMUv3 *and* such a quirk still exists we can worry
>>> about it then.
>>

It's true that this RCiEP only appears behind HiSilicon's IOMMU, which uses
the SMMUv3 driver.

>> Perhaps, but you know that by adding this hook it's only a matter of time
>> before we get random compatible string matches in there, so I'd rather keep
>> the flood gates closed as long as we can.
>>
>> Given that this is a PCI device, why can't we have a PCI quirk for devices
>> which require an identity mapping and then handle that in the IOMMU core?
> 

As Robin mentioned below, not only PCI devices but also some platform devices
want to bypass the IOMMU. I noticed there are already some fields describing
the device's DMA information in struct device, so following your point, could
the quirk go there if we're going to make it more generic?

Anyway, if we're going to collect all these quirks in a more generic place,
I'm willing to add this device there and test it.

> Oh, don't think I *like* having quirks in the driver, it just seems like the 
> least-worst choice from a bad bunch. All of the default domain quirks so far 
> (including this one) exist for integrated devices and/or dodgy firmware 
> setups such that they are platform-specific, so there is no technical reason 
> for trying to split *some* of them off into a generic mechanism when the 
> driver-based platform-specific mechanism still needs to exist anyway (some of 
> them do depend on driver state as well).
> 
> Feel free to test the waters with a patch punting qcom_smmu_def_domain_type() 
> to core code, but I think you'll struggle to find a reason to give in the 
> commit message other than "I don't like it".
> 
>>> Ugly as it is, this is the status quo. I don't recall anyone ever arguing
>>> that 

[PATCH v2 1/1] iommu/vt-d: Fix list_add double add when enabling VMD and scalable mode

2022-02-16 Thread Adrian Huang
From: Adrian Huang 

When enabling VMD and IOMMU scalable mode, the following kernel panic
call trace/kernel log is shown in Eagle Stream platform (Sapphire Rapids
CPU) during booting:

pci :59:00.5: Adding to iommu group 42
...
vmd :59:00.5: PCI host bridge to bus 1:80
pci 1:80:01.0: [8086:352a] type 01 class 0x060400
pci 1:80:01.0: reg 0x10: [mem 0x-0x0001 64bit]
pci 1:80:01.0: enabling Extended Tags
pci 1:80:01.0: PME# supported from D0 D3hot D3cold
pci 1:80:01.0: DMAR: Setup RID2PASID failed
pci 1:80:01.0: Failed to add to iommu group 42: -16
pci 1:80:03.0: [8086:352b] type 01 class 0x060400
pci 1:80:03.0: reg 0x10: [mem 0x-0x0001 64bit]
pci 1:80:03.0: enabling Extended Tags
pci 1:80:03.0: PME# supported from D0 D3hot D3cold
list_add double add: new=ff4d61160b74b8a0, prev=ff4d611d8e245c10, 
next=ff4d61160b74b8a0.
[ cut here ]
kernel BUG at lib/list_debug.c:29!
invalid opcode:  [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.17.0-rc3+ #7
Hardware name: Lenovo ThinkSystem SR650V3/SB27A86647, BIOS ESE101Y-1.00 
01/13/2022
Workqueue: events work_for_cpu_fn
RIP: 0010:__list_add_valid.cold+0x26/0x3f
Code: 9a 4a ab ff 4c 89 c1 48 c7 c7 40 0c d9 9e e8 b9 b1 fe ff 0f 0b 48 89 f2 
4c 89 c1 48 89 fe 48 c7 c7 f0 0c d9 9e e8 a2 b1 fe ff <0f> 0b 48 89 d1 4c 89 c6 
4c 89 ca 48 c7 c7 98 0c d9 9e e8 8b b1 fe
RSP: :ff5ad434865b3a40 EFLAGS: 00010246
RAX: 0058 RBX: ff4d61160b74b880 RCX: ff4d61255e1fffa8
RDX:  RSI: fffe RDI: 9fd34f20
RBP: ff4d611d8e245c00 R08:  R09: ff5ad434865b3888
R10: ff5ad434865b3880 R11: ff4d61257fdc6fe8 R12: ff4d61160b74b8a0
R13: ff4d61160b74b8a0 R14: ff4d611d8e245c10 R15: ff4d611d8001ba70
FS:  () GS:ff4d611d5ea0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: ff4d611fa1401000 CR3: 000aa0210001 CR4: 00771ef0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe07f0 DR7: 0400
PKRU: 5554
Call Trace:
 
 intel_pasid_alloc_table+0x9c/0x1d0
 dmar_insert_one_dev_info+0x423/0x540
 ? device_to_iommu+0x12d/0x2f0
 intel_iommu_attach_device+0x116/0x290
 __iommu_attach_device+0x1a/0x90
 iommu_group_add_device+0x190/0x2c0
 __iommu_probe_device+0x13e/0x250
 iommu_probe_device+0x24/0x150
 iommu_bus_notifier+0x69/0x90
 blocking_notifier_call_chain+0x5a/0x80
 device_add+0x3db/0x7b0
 ? arch_memremap_can_ram_remap+0x19/0x50
 ? memremap+0x75/0x140
 pci_device_add+0x193/0x1d0
 pci_scan_single_device+0xb9/0xf0
 pci_scan_slot+0x4c/0x110
 pci_scan_child_bus_extend+0x3a/0x290
 vmd_enable_domain.constprop.0+0x63e/0x820
 vmd_probe+0x163/0x190
 local_pci_probe+0x42/0x80
 work_for_cpu_fn+0x13/0x20
 process_one_work+0x1e2/0x3b0
 worker_thread+0x1c4/0x3a0
 ? rescuer_thread+0x370/0x370
 kthread+0xc7/0xf0
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x1f/0x30
 
Modules linked in:
---[ end trace  ]---
...
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x1ca0 from 0x8100 (relocation range: 
0x8000-0xbfff)
---[ end Kernel panic - not syncing: Fatal exception ]---

The following 'lspci' output shows devices '1:80:*' are subdevices of
the VMD device :59:00.5:

  $ lspci
  ...
  :59:00.5 RAID bus controller: Intel Corporation Volume Management Device 
NVMe RAID Controller (rev 20)
  ...
  1:80:01.0 PCI bridge: Intel Corporation Device 352a (rev 03)
  1:80:03.0 PCI bridge: Intel Corporation Device 352b (rev 03)
  1:80:05.0 PCI bridge: Intel Corporation Device 352c (rev 03)
  1:80:07.0 PCI bridge: Intel Corporation Device 352d (rev 03)
  1:81:00.0 Non-Volatile memory controller: Intel Corporation NVMe 
Datacenter SSD [3DNAND, Beta Rock Controller]
  1:82:00.0 Non-Volatile memory controller: Intel Corporation NVMe 
Datacenter SSD [3DNAND, Beta Rock Controller]

The 'list_add double add' symptom is triggered by the following failure
messages:

  pci 1:80:01.0: DMAR: Setup RID2PASID failed
  pci 1:80:01.0: Failed to add to iommu group 42: -16
  pci 1:80:03.0: [8086:352b] type 01 class 0x060400

Device 1:80:01.0 is a subdevice of the VMD device :59:00.5,
so invoking intel_pasid_alloc_table() gets the pasid_table of the VMD
device :59:00.5. Here is the call path:

  intel_pasid_alloc_table
pci_for_each_dma_alias
 get_alias_pasid_table
   search_pasid_table

pci_real_dma_dev() in pci_for_each_dma_alias() gets the real DMA device,
which is the VMD device :59:00.5. However, the PASID entry of the VMD
device :59:00.5 was already configured at the message "pci :59:00.5:
Adding to iommu group 42". So, -EBUSY is returned when configuring the
PASID entry for device 1:80:01.0.

It then invokes dmar_remove_one_dev_info() to release
'struct device_domain_info *' from 

Re: [PATCH v4 08/35] iommu/mediatek: Use kmalloc for protect buffer

2022-02-16 Thread Yong Wu via iommu
On Wed, 2022-02-16 at 14:59 +0900, Tomasz Figa wrote:
> On Wed, Feb 16, 2022 at 2:55 PM Yong Wu  wrote:
> > 
> > On Thu, 2022-01-27 at 12:08 +0100, AngeloGioacchino Del Regno
> > wrote:
> > > Il 25/01/22 09:56, Yong Wu ha scritto:
> > > > No need to zero the protect buffer, which is only accessed by the
> > > > IOMMU HW when a translation fault happens.
> > > > 
> > > > Signed-off-by: Yong Wu 
> > > 
> > > I would rather keep this a devm_kzalloc instead... the cost is very
> > > minimal, and this will be handy when new hardware is introduced, as
> > > it may require a bigger buffer: in that case, "older" platforms will
> > > use only part of it and we may get garbage data at the end.
> > 
> > Currently this is to avoid zeroing 512 bytes on all platforms.
> > 
> > Sorry, I don't understand why it is unnecessary when the new hardware
> > requires a bigger buffer. If the buffer becomes bigger, then clearing
> > it to 0 costs more, so this patch is even more helpful?
> > 
> > The content of this buffer is garbage; we won't care about or analyse
> > it.
> 
> I think we should zero it for security reasons regardless of any
> other
> aspects. With this patch it's leaking kernel data to the hardware.
> 
> At the same time, we're talking here about something executed just 1
> time when the driver probes. I don't think the cost would really
> matter.

OK. I will remove this patch in next version.

Thanks.

> 
> Best regards,
> Tomasz
> 