Re: [PATCH] swiotlb: Remove redundant swiotlb_force

2022-06-22 Thread Steven Price
On 22/06/2022 15:32, Christoph Hellwig wrote:
> On Wed, Jun 22, 2022 at 03:29:52PM +0100, Steven Price wrote:
>> The variable (and enum) was removed in commit c6af2aa9ffc9 ("swiotlb:
>> make the swiotlb_init interface more useful") but the declaration was
>> left in swiotlb.h. Tidy up by removing the declaration as well.
>>
>> Signed-off-by: Steven Price 
> 
> I just applied an identical patch from Dongli Zhang a few hours ago.

Ah, I missed that. Sorry for the noise!

Steve
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] swiotlb: Remove redundant swiotlb_force

2022-06-22 Thread Steven Price
The variable (and enum) was removed in commit c6af2aa9ffc9 ("swiotlb:
make the swiotlb_init interface more useful") but the declaration was
left in swiotlb.h. Tidy up by removing the declaration as well.

Signed-off-by: Steven Price 
---
 include/linux/swiotlb.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 7ed35dd3de6e..b1f5ace37502 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -60,8 +60,6 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys,
size_t size, enum dma_data_direction dir, unsigned long attrs);
 
 #ifdef CONFIG_SWIOTLB
-extern enum swiotlb_force swiotlb_force;
-
 /**
  * struct io_tlb_mem - IO TLB Memory Pool Descriptor
  *
-- 
2.25.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v13 0/9] ACPI/IORT: Support for IORT RMR node

2022-06-17 Thread Steven Price
On 15/06/2022 11:10, Shameer Kolothum wrote:
> Hi
> 
> v12 --> v13
>   -No changes. Rebased to 5.19-rc1.
>   -Picked up tags received from Laurentiu, Hanjun and Will. Thanks!.

You've already got my Tested-by tags, but just to confirm I gave this a
spin and it works fine.

Thanks,

Steve

> 
> Thanks,
> Shameer
> 
> From old:
> We have faced issues with 3408iMR RAID controller cards which
> fail to boot when SMMU is enabled. This is because these
> controllers make use of host memory for various caching related
> purposes and when SMMU is enabled the iMR firmware fails to
> access these memory regions as there is no mapping for them.
> IORT RMR provides a way for UEFI to describe and report these
> memory regions so that the kernel can make a unity mapping for
> these in SMMU.
> 
> Change History:
> 
> v11 --> v12
>   -Minor fix in patch #4 to address the issue reported by the kernel test 
> robot.
>   -Added R-by tags by Christoph(patch #1) and Lorenzo(patch #4).
>   -Added T-by from Steve to all relevant patches. Many thanks!.
> 
> v10 --> v11
>  -Addressed Christoph's comments. We now have a  callback to 
>   struct iommu_resv_region to free all related memory and also dropped
>   the FW specific union and now has a container struct iommu_iort_rmr_data.
>   See patches #1 & #4
>  -Added R-by from Christoph.
>  -Dropped R-by from Lorenzo for patches #4 & #5 due to the above changes.
>  -Also dropped T-by from Steve and Laurentiu. Many thanks for your test
>   efforts. I have done basic sanity testing on my platform but please
>   do it again at your end.
> 
> v9 --> v10
>  - Dropped patch #1 ("Add temporary RMR node flag definitions") since
>the ACPICA header updates patch is now in the mailing list
>  - Based on the suggestion from Christoph, introduced a 
>resv_region_free_fw_data() callback in struct iommu_resv_region and
>used that to free RMR specific memory allocations.
> 
> v8 --> v9
>  - Adressed comments from Robin on interfaces.
>  - Addressed comments from Lorenzo.
> 
> v7 --> v8
>   - Patch #1 has temp definitions for RMR related changes till
>     the ACPICA header changes are part of kernel.
>   - No early parsing of RMR node info and is only parsed at the
>     time of use.
>   - Changes to the RMR get/put API format compared to the
>     previous version.
>   - Support for RMR descriptor shared by multiple stream IDs.
> 
> v6 --> v7
>  -fix pointed out by Steve to the SMMUv2 SMR bypass install in patch #8.
> 
> v5 --> v6
> - Addressed comments from Robin & Lorenzo.
>   : Moved iort_parse_rmr() to acpi_iort_init() from
>     iort_init_platform_devices().
>   : Removed use of struct iort_rmr_entry during the initial
>     parse. Using struct iommu_resv_region instead.
>   : Report RMR address alignment and overlap errors, but continue.
>   : Reworked arm_smmu_init_bypass_stes() (patch # 6).
> - Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8).
> - Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based
>   on Type of RMR region. Suggested by Jon N.
> 
> v4 --> v5
>  -Added a fw_data union to struct iommu_resv_region and removed
>   struct iommu_rmr (Based on comments from Joerg/Robin).
>  -Added iommu_put_rmrs() to release mem.
>  -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by
>   yet because of the above changes.
> 
> v3 -->v4
> -Included the SMMUv2 SMR bypass install changes suggested by
>  Steve(patch #7)
> -As per Robin's comments, RMR reserve implementation is now
>  more generic  (patch #8) and dropped v3 patches 8 and 10.
> -Rebase to 5.13-rc1
> 
> RFC v2 --> v3
>  -Dropped RFC tag as the ACPICA header changes are now ready to be
>   part of 5.13[0]. But this series still has a dependency on that patch.
>  -Added IORT E.b related changes(node flags, _DSM function 5 checks for
>   PCIe).
>  -Changed RMR to stream id mapping from M:N to M:1 as per the spec and
>   discussion here[1].
>  -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!)
> 
> Jon Nettleton (1):
>   iommu/arm-smmu: Get associated RMR info and install bypass SMR
> 
> Shameer Kolothum (8):
>   iommu: Introduce a callback to struct iommu_resv_region
>   ACPI/IORT: Make iort_iommu_msi_get_resv_regions() return void
>   ACPI/IORT: Provide a generic helper to retrieve reserve regions
>   ACPI/IORT: Add support to retrieve IORT RMR reserved regions
>   ACPI/IORT: Add a helper to retrieve RMR info directly
>   iommu/arm-smmu-v3: Introduce strtab init helper
>   iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force
> bypass
>   iommu/arm-smmu-v3: Get associated RMR info and install bypass STE
> 
>  drivers/acpi/arm64/iort.c   | 360 ++--
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  78 -
>  drivers/iommu/arm/arm-smmu/arm-smmu.c   |  52 +++
>  drivers/iommu/dma-iommu.c   |   2 +-
>  drivers/iommu/iommu.c   |  16 +-
>  include/linux/acpi_iort.h   |  14 +-
>  

Re: [RESEND PATCH v8 01/11] iommu: Add DMA ownership management interfaces

2022-06-15 Thread Steven Price
On 15/06/2022 11:57, Robin Murphy wrote:
> On 2022-06-15 10:53, Steven Price wrote:
>> On 18/04/2022 01:49, Lu Baolu wrote:
>>> Multiple devices may be placed in the same IOMMU group because they
>>> cannot be isolated from each other. These devices must either be
>>> entirely under kernel control or userspace control, never a mixture.
>>>
>>> This adds dma ownership management in iommu core and exposes several
>>> interfaces for the device drivers and the device userspace assignment
>>> framework (i.e. VFIO), so that any conflict between user and kernel
>>> controlled dma could be detected at the beginning.
>>>
>>> The device driver oriented interfaces are,
>>>
>>> int iommu_device_use_default_domain(struct device *dev);
>>> void iommu_device_unuse_default_domain(struct device *dev);
>>>
>>> By calling iommu_device_use_default_domain(), the device driver tells
>>> the iommu layer that the device dma is handled through the kernel DMA
>>> APIs. The iommu layer will manage the IOVA and use the default domain
>>> for DMA address translation.
>>>
>>> The device user-space assignment framework oriented interfaces are,
>>>
>>> int iommu_group_claim_dma_owner(struct iommu_group *group,
>>>     void *owner);
>>> void iommu_group_release_dma_owner(struct iommu_group *group);
>>> bool iommu_group_dma_owner_claimed(struct iommu_group *group);
>>>
>>> The device userspace assignment must be disallowed if the DMA owner
>>> claiming interface returns failure.
>>>
>>> Signed-off-by: Jason Gunthorpe 
>>> Signed-off-by: Kevin Tian 
>>> Signed-off-by: Lu Baolu 
>>> Reviewed-by: Robin Murphy 
>>
>> I'm seeing a regression that I've bisected to this commit on a Firefly
>> RK3288 board. The display driver fails to probe properly because
>> __iommu_attach_group() returns -EBUSY. This causes long hangs and splats
>> as the display flips timeout.
>>
>> The call stack to __iommu_attach_group() is:
>>
>>   __iommu_attach_group from iommu_attach_device+0x64/0xb4
>>   iommu_attach_device from rockchip_drm_dma_attach_device+0x20/0x50
>>   rockchip_drm_dma_attach_device from vop_crtc_atomic_enable+0x10c/0xa64
>>   vop_crtc_atomic_enable from
>> drm_atomic_helper_commit_modeset_enables+0xa8/0x290
>>   drm_atomic_helper_commit_modeset_enables from
>> drm_atomic_helper_commit_tail_rpm+0x44/0x8c
>>   drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x180
>>   commit_tail from drm_atomic_helper_commit+0x164/0x18c
>>   drm_atomic_helper_commit from drm_atomic_commit+0xac/0xe4
>>   drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x284
>>   drm_client_modeset_commit_atomic from
>> drm_client_modeset_commit_locked+0x60/0x1c8
>>   drm_client_modeset_commit_locked from
>> drm_client_modeset_commit+0x24/0x40
>>   drm_client_modeset_commit from drm_fb_helper_set_par+0xb8/0xf8
>>   drm_fb_helper_set_par from drm_fb_helper_hotplug_event.part.0+0xa8/0xc0
>>   drm_fb_helper_hotplug_event.part.0 from output_poll_execute+0xb8/0x224
>>
>>> @@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct
>>> iommu_domain *domain,
>>>   {
>>>   int ret;
>>>   -    if (group->default_domain && group->domain !=
>>> group->default_domain)
>>> +    if (group->domain && group->domain != group->default_domain)
>>>   return -EBUSY;
>>>     ret = __iommu_group_for_each_dev(group, domain,
>>
>> Reverting this 'fixes' the problem for me. The follow up 0286300e6045
>> ("iommu: iommu_group_claim_dma_owner() must always assign a domain")
>> doesn't help.
>>
>> Adding some debug printks I can see that domain is a valid pointer, but
>> both default_domain and blocking_domain are NULL.
>>
>> I'm using the DTB from the kernel tree (rk3288-firefly.dtb).
>>
>> Any ideas?
> 
> Hmm, TBH I'm not sure how that worked previously... it'll be complaining
> because the ARM DMA domain is still attached, but even when the attach
> goes ahead and replaces the ARM domain with the driver's new one, it's
> not using the special arm_iommu_detach_device() interface anywhere so
> the device would still be left with the wrong DMA ops :/
> 
> I guess the most pragmatic option is probably to give rockchip-drm a
> similar bodge to exynos and tegra, to explicitly remove the ARM domain
> before attaching its own.

A bodge like below indeed '

Re: [RESEND PATCH v8 01/11] iommu: Add DMA ownership management interfaces

2022-06-15 Thread Steven Price
On 18/04/2022 01:49, Lu Baolu wrote:
> Multiple devices may be placed in the same IOMMU group because they
> cannot be isolated from each other. These devices must either be
> entirely under kernel control or userspace control, never a mixture.
> 
> This adds dma ownership management in iommu core and exposes several
> interfaces for the device drivers and the device userspace assignment
> framework (i.e. VFIO), so that any conflict between user and kernel
> controlled dma could be detected at the beginning.
> 
> The device driver oriented interfaces are,
> 
>   int iommu_device_use_default_domain(struct device *dev);
>   void iommu_device_unuse_default_domain(struct device *dev);
> 
> By calling iommu_device_use_default_domain(), the device driver tells
> the iommu layer that the device dma is handled through the kernel DMA
> APIs. The iommu layer will manage the IOVA and use the default domain
> for DMA address translation.
> 
> The device user-space assignment framework oriented interfaces are,
> 
>   int iommu_group_claim_dma_owner(struct iommu_group *group,
>   void *owner);
>   void iommu_group_release_dma_owner(struct iommu_group *group);
>   bool iommu_group_dma_owner_claimed(struct iommu_group *group);
> 
> The device userspace assignment must be disallowed if the DMA owner
> claiming interface returns failure.
> 
> Signed-off-by: Jason Gunthorpe 
> Signed-off-by: Kevin Tian 
> Signed-off-by: Lu Baolu 
> Reviewed-by: Robin Murphy 

I'm seeing a regression that I've bisected to this commit on a Firefly
RK3288 board. The display driver fails to probe properly because
__iommu_attach_group() returns -EBUSY. This causes long hangs and splats
as the display flips timeout.

The call stack to __iommu_attach_group() is:

 __iommu_attach_group from iommu_attach_device+0x64/0xb4
 iommu_attach_device from rockchip_drm_dma_attach_device+0x20/0x50
 rockchip_drm_dma_attach_device from vop_crtc_atomic_enable+0x10c/0xa64
 vop_crtc_atomic_enable from drm_atomic_helper_commit_modeset_enables+0xa8/0x290
 drm_atomic_helper_commit_modeset_enables from 
drm_atomic_helper_commit_tail_rpm+0x44/0x8c
 drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x180
 commit_tail from drm_atomic_helper_commit+0x164/0x18c
 drm_atomic_helper_commit from drm_atomic_commit+0xac/0xe4
 drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x284
 drm_client_modeset_commit_atomic from 
drm_client_modeset_commit_locked+0x60/0x1c8
 drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40
 drm_client_modeset_commit from drm_fb_helper_set_par+0xb8/0xf8
 drm_fb_helper_set_par from drm_fb_helper_hotplug_event.part.0+0xa8/0xc0
 drm_fb_helper_hotplug_event.part.0 from output_poll_execute+0xb8/0x224

> @@ -2109,7 +2115,7 @@ static int __iommu_attach_group(struct iommu_domain 
> *domain,
>  {
>   int ret;
>  
> - if (group->default_domain && group->domain != group->default_domain)
> + if (group->domain && group->domain != group->default_domain)
>   return -EBUSY;
>  
>   ret = __iommu_group_for_each_dev(group, domain,

Reverting this 'fixes' the problem for me. The follow up 0286300e6045
("iommu: iommu_group_claim_dma_owner() must always assign a domain")
doesn't help.

Adding some debug printks I can see that domain is a valid pointer, but
both default_domain and blocking_domain are NULL.

I'm using the DTB from the kernel tree (rk3288-firefly.dtb).

Any ideas?

Thanks,

Steve
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v11 0/9] ACPI/IORT: Support for IORT RMR node

2022-04-28 Thread Steven Price
On 22/04/2022 17:28, Shameer Kolothum wrote:
> Hi
> 
> v9 --> v10
>  -Addressed Christoph's comments. We now have a callback to 
>   struct iommu_resv_region to free all related memory and also dropped
>   the FW specific union and now has a container struct iommu_iort_rmr_data.
>   See patches #1 & #4
>  -Added R-by from Christoph.
>  -Dropped R-by from Lorenzo for patches #4 & #5 due to the above changes.
>  -Also dropped T-by from Steve and Laurentiu. Many thanks for your test
>   efforts. I have done basic sanity testing on my platform but please
>   give it a try at your end as well.

I'm back in the office and given it a spin, it's all good:

Tested-by: Steven Price 

Thanks,

Steve

> 
> As mentioned in v10, this now has a dependency on the ACPICA header patch
> here[1]. 
> 
> Please take a look and let me know.
> 
> Thanks,
> Shameer
> [1] https://lore.kernel.org/all/44610361.fMDQidcC6G@kreacher/
> 
> From old:
> We have faced issues with 3408iMR RAID controller cards which
> fail to boot when SMMU is enabled. This is because these
> controllers make use of host memory for various caching related
> purposes and when SMMU is enabled the iMR firmware fails to
> access these memory regions as there is no mapping for them.
> IORT RMR provides a way for UEFI to describe and report these
> memory regions so that the kernel can make a unity mapping for
> these in SMMU.
> 
> Change History:
> 
> v9 --> v10
>  - Dropped patch #1 ("Add temporary RMR node flag definitions") since
>the ACPICA header updates patch is now in the mailing list
>  - Based on the suggestion from Christoph, introduced a 
>resv_region_free_fw_data() callback in struct iommu_resv_region and
>used that to free RMR specific memory allocations.
> 
> v8 --> v9
>  - Adressed comments from Robin on interfaces.
>  - Addressed comments from Lorenzo.
> 
> v7 --> v8
>   - Patch #1 has temp definitions for RMR related changes till
>     the ACPICA header changes are part of kernel.
>   - No early parsing of RMR node info and is only parsed at the
>     time of use.
>   - Changes to the RMR get/put API format compared to the
>     previous version.
>   - Support for RMR descriptor shared by multiple stream IDs.
> 
> v6 --> v7
>  -fix pointed out by Steve to the SMMUv2 SMR bypass install in patch #8.
> 
> v5 --> v6
> - Addressed comments from Robin & Lorenzo.
>   : Moved iort_parse_rmr() to acpi_iort_init() from
>     iort_init_platform_devices().
>   : Removed use of struct iort_rmr_entry during the initial
>     parse. Using struct iommu_resv_region instead.
>   : Report RMR address alignment and overlap errors, but continue.
>   : Reworked arm_smmu_init_bypass_stes() (patch # 6).
> - Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8).
> - Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based
>   on Type of RMR region. Suggested by Jon N.
> 
> v4 --> v5
>  -Added a fw_data union to struct iommu_resv_region and removed
>   struct iommu_rmr (Based on comments from Joerg/Robin).
>  -Added iommu_put_rmrs() to release mem.
>  -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by
>   yet because of the above changes.
> 
> v3 -->v4
> -Included the SMMUv2 SMR bypass install changes suggested by
>  Steve(patch #7)
> -As per Robin's comments, RMR reserve implementation is now
>  more generic  (patch #8) and dropped v3 patches 8 and 10.
> -Rebase to 5.13-rc1
> 
> RFC v2 --> v3
>  -Dropped RFC tag as the ACPICA header changes are now ready to be
>   part of 5.13[0]. But this series still has a dependency on that patch.
>  -Added IORT E.b related changes(node flags, _DSM function 5 checks for
>   PCIe).
>  -Changed RMR to stream id mapping from M:N to M:1 as per the spec and
>   discussion here[1].
>  -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!)
> 
> Jon Nettleton (1):
>   iommu/arm-smmu: Get associated RMR info and install bypass SMR
> 
> Shameer Kolothum (8):
>   iommu: Introduce a callback to struct iommu_resv_region
>   ACPI/IORT: Make iort_iommu_msi_get_resv_regions() return void
>   ACPI/IORT: Provide a generic helper to retrieve reserve regions
>   ACPI/IORT: Add support to retrieve IORT RMR reserved regions
>   ACPI/IORT: Add a helper to retrieve RMR info directly
>   iommu/arm-smmu-v3: Introduce strtab init helper
>   iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force
> bypass
>   iommu/arm-smmu-v3: Get associated RMR info and install bypass STE
> 
>  drivers/acpi/arm64/iort.c   | 359 ++--
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  78 -
>  drivers/iommu

Re: [PATCH v10 0/9] ACPI/IORT: Support for IORT RMR node

2022-04-21 Thread Steven Price
On 20/04/2022 17:48, Shameer Kolothum wrote:
> Hi
> 
> v9 --> v10
>  - Dropped patch #1 ("Add temporary RMR node flag definitions") since
>the ACPICA header updates patch is now in the mailing list[1]
>  - Based on the suggestion from Christoph, introduced a 
>resv_region_free_fw_data() callback in struct iommu_resv_region and
>used that to free RMR specific memory allocations.
> 
> Though there is a small change from v9 with respect to how we free up
> the FW specific data, I have taken the liberty to pick up the R-by and
> T-by tags from Lorenzo, Steve and Laurentiu. But please do take a look
> again and let me know.

I've given this a go and it works fine on my Juno setup. So do keep my
T-by tag.

Sami has been kind enough to give me an updated firmware which also
fixes the RMR node in the IORT. Although as mentioned before the details
of the RMR node are currently being ignored so this doesn't change the
functionality but silences the warning.

My concern is that with the RMR region effectively ignored we may see
more broken firmware, and while a length of zero produces a warning, an
otherwise incorrect length will currently "silently work" but mean that
any future tightening would cause problems. For example if the SMMU
driver were to recreate the mappings to only cover the region specified
in the RMR it may not be large enough if the RMR base/length are not
correct. It's up to the maintainers as to whether they see this as a
problem or not.

Thanks,

Steve

> Thanks,
> Shameer
> [1] https://lore.kernel.org/all/44610361.fMDQidcC6G@kreacher/
> 
> From old:
> We have faced issues with 3408iMR RAID controller cards which
> fail to boot when SMMU is enabled. This is because these
> controllers make use of host memory for various caching related
> purposes and when SMMU is enabled the iMR firmware fails to
> access these memory regions as there is no mapping for them.
> IORT RMR provides a way for UEFI to describe and report these
> memory regions so that the kernel can make a unity mapping for
> these in SMMU.
> 
> Change History:
> 
> v8 --> v9
>  - Adressed comments from Robin on interfaces.
>  - Addressed comments from Lorenzo.
> 
> v7 --> v8
>   - Patch #1 has temp definitions for RMR related changes till
>     the ACPICA header changes are part of kernel.
>   - No early parsing of RMR node info and is only parsed at the
>     time of use.
>   - Changes to the RMR get/put API format compared to the
>     previous version.
>   - Support for RMR descriptor shared by multiple stream IDs.
> 
> v6 --> v7
>  -fix pointed out by Steve to the SMMUv2 SMR bypass install in patch #8.
> 
> v5 --> v6
> - Addressed comments from Robin & Lorenzo.
>   : Moved iort_parse_rmr() to acpi_iort_init() from
>     iort_init_platform_devices().
>   : Removed use of struct iort_rmr_entry during the initial
>     parse. Using struct iommu_resv_region instead.
>   : Report RMR address alignment and overlap errors, but continue.
>   : Reworked arm_smmu_init_bypass_stes() (patch # 6).
> - Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8).
> - Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based
>   on Type of RMR region. Suggested by Jon N.
> 
> v4 --> v5
>  -Added a fw_data union to struct iommu_resv_region and removed
>   struct iommu_rmr (Based on comments from Joerg/Robin).
>  -Added iommu_put_rmrs() to release mem.
>  -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by
>   yet because of the above changes.
> 
> v3 -->v4
> -Included the SMMUv2 SMR bypass install changes suggested by
>  Steve(patch #7)
> -As per Robin's comments, RMR reserve implementation is now
>  more generic  (patch #8) and dropped v3 patches 8 and 10.
> -Rebase to 5.13-rc1
> 
> RFC v2 --> v3
>  -Dropped RFC tag as the ACPICA header changes are now ready to be
>   part of 5.13[0]. But this series still has a dependency on that patch.
>  -Added IORT E.b related changes(node flags, _DSM function 5 checks for
>   PCIe).
>  -Changed RMR to stream id mapping from M:N to M:1 as per the spec and
>   discussion here[1].
>  -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!)
> 
> Jon Nettleton (1):
>   iommu/arm-smmu: Get associated RMR info and install bypass SMR
> 
> Shameer Kolothum (8):
>   iommu: Introduce a union to struct iommu_resv_region
>   ACPI/IORT: Make iort_iommu_msi_get_resv_regions() return void
>   ACPI/IORT: Provide a generic helper to retrieve reserve regions
>   ACPI/IORT: Add support to retrieve IORT RMR reserved regions
>   ACPI/IORT: Add a helper to retrieve RMR info directly
>   iommu/arm-smmu-v3: Introduce strtab init helper
>   iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force
> bypass
>   iommu/arm-smmu-v3: Get associated RMR info and install bypass STE
> 
>  drivers/acpi/arm64/iort.c   | 335 ++--
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  78 -
>  drivers/iommu/arm/arm-smmu/arm-smmu.c   |  52 +++
>  

Re: [PATCH v9 00/11] ACPI/IORT: Support for IORT RMR node

2022-04-07 Thread Steven Price
Hi Shameer,

I've given it a spin on my Juno board and the EFIFB works fine with it.
However I am getting a warning:

ACPI: IORT: [Firmware Bug]: RMR descriptor[0xf80d] with zero length,
continue anyway

Which on examination of the IORT table is correct - the firmware does
indeed seem to have a bug and the length in the IORT table is 0,
hopefully I can get that fixed. However since it "all works" that points
out that the reserved memory region isn't actually used. Instead the
existing entries from the SMMU are reused (even though they don't match
the address/size region in the RMR).

I'm not sure if that matters but I thought it worth pointing out as it's
not currently spelt out that the RMR descriptor's content is currently
actually ignored.

Anyway, FWIW:

Tested-by: Steven Price 

Steve

On 04/04/2022 13:41, Shameer Kolothum wrote:
> Hi
> 
> v8 --> v9
>  - Adressed comments from Robin on interfaces as discussed here[0].
>  - Addressed comments from Lorenzo.
>  
> Though functionally there aren't any major changes, the interfaces have
> changed from v8 and for that reason not included the T-by tags from
> Steve and Eric yet(Many thanks for that). Appreciate it if you could
> give this a spin and let me know.
> 
> (The revised ACPICA pull request for IORT E.d related changes is
> here[1] and this is now merged to acpica:master.)
> 
> Please take a look and let me know your thoughts.
> 
> Thanks,
> Shameer
> [0] 
> https://lore.kernel.org/linux-arm-kernel/c982f1d7-c565-769a-abae-79c962969...@arm.com/
> [1] https://github.com/acpica/acpica/pull/765
> 
> From old:
> We have faced issues with 3408iMR RAID controller cards which
> fail to boot when SMMU is enabled. This is because these
> controllers make use of host memory for various caching related
> purposes and when SMMU is enabled the iMR firmware fails to
> access these memory regions as there is no mapping for them.
> IORT RMR provides a way for UEFI to describe and report these
> memory regions so that the kernel can make a unity mapping for
> these in SMMU.
> 
> Change History:
> 
> v7 --> v8
>   - Patch #1 has temp definitions for RMR related changes till
> the ACPICA header changes are part of kernel.
>   - No early parsing of RMR node info and is only parsed at the
> time of use.
>   - Changes to the RMR get/put API format compared to the
> previous version.
>   - Support for RMR descriptor shared by multiple stream IDs.
> 
> v6 --> v7
>  -fix pointed out by Steve to the SMMUv2 SMR bypass install in patch #8.
> 
> v5 --> v6
> - Addressed comments from Robin & Lorenzo.
>   : Moved iort_parse_rmr() to acpi_iort_init() from
> iort_init_platform_devices().
>   : Removed use of struct iort_rmr_entry during the initial
> parse. Using struct iommu_resv_region instead.
>   : Report RMR address alignment and overlap errors, but continue.
>   : Reworked arm_smmu_init_bypass_stes() (patch # 6).
> - Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8).
> - Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based
>   on Type of RMR region. Suggested by Jon N.
> 
> v4 --> v5
>  -Added a fw_data union to struct iommu_resv_region and removed
>   struct iommu_rmr (Based on comments from Joerg/Robin).
>  -Added iommu_put_rmrs() to release mem.
>  -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by
>   yet because of the above changes.
> 
> v3 -->v4
> -Included the SMMUv2 SMR bypass install changes suggested by
>  Steve(patch #7)
> -As per Robin's comments, RMR reserve implementation is now
>  more generic  (patch #8) and dropped v3 patches 8 and 10.
> -Rebase to 5.13-rc1
> 
> RFC v2 --> v3
>  -Dropped RFC tag as the ACPICA header changes are now ready to be
>   part of 5.13[0]. But this series still has a dependency on that patch.
>  -Added IORT E.b related changes(node flags, _DSM function 5 checks for
>   PCIe).
>  -Changed RMR to stream id mapping from M:N to M:1 as per the spec and
>   discussion here[1].
>  -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!)
> 
> Jon Nettleton (1):
>   iommu/arm-smmu: Get associated RMR info and install bypass SMR
> 
> Shameer Kolothum (10):
>   ACPI/IORT: Add temporary RMR node flag definitions
>   iommu: Introduce a union to struct iommu_resv_region
>   ACPI/IORT: Make iort_iommu_msi_get_resv_regions() return void
>   ACPI/IORT: Provide a generic helper to retrieve reserve regions
>   iommu/dma: Introduce a helper to remove reserved regions
>   ACPI/IORT: Add support to retrieve IORT RMR reserved regions
>   ACPI/IORT: Add a helper to retrieve RMR info directly
>   iommu/arm-smmu-v3: Introduce strtab init helper
>   iommu/arm-smmu-v3: Refactor arm_s

Re: [PATCH v8 00/11] ACPI/IORT: Support for IORT RMR node

2022-03-03 Thread Steven Price
On 21/02/2022 15:43, Shameer Kolothum wrote:
> Hi,
> 
> Since we now have an updated verion[0] of IORT spec(E.d) which
> addresses the memory attributes issues discussed here [1],
> this series now make use of it.
> 
> The pull request for ACPICA E.d related changes are already
> raised and can be found here,
> https://github.com/acpica/acpica/pull/752
> 
> v7 --> v8
>   - Patch #1 has temp definitions for RMR related changes till
> the ACPICA header changes are part of kernel.
>   - No early parsing of RMR node info and is only parsed at the
> time of use.
>   - Changes to the RMR get/put API format compared to the
> previous version.
>   - Support for RMR descriptor shared by multiple stream IDs.
> 
> Please take a look and let me know your thoughts.

Hi Shameer,

I've now been able to test this on the Juno platform with a modified
firmware supporting the newer spec (thanks Sami!). Everything works, so
feel free to add my:

Tested-by: Steven Price 

(Note that I haven't tested the smmu-v3 support)

Thanks,

Steve

> Thanks,
> Shameer
> [0] https://developer.arm.com/documentation/den0049/ed/
> [1] https://lore.kernel.org/linux-acpi/20210805160319.GB23085@lpieralisi/
> 
> From old:
> We have faced issues with 3408iMR RAID controller cards which
> fail to boot when SMMU is enabled. This is because these
> controllers make use of host memory for various caching related
> purposes and when SMMU is enabled the iMR firmware fails to
> access these memory regions as there is no mapping for them.
> IORT RMR provides a way for UEFI to describe and report these
> memory regions so that the kernel can make a unity mapping for
> these in SMMU.
> 
> Change History:
> 
> v6 --> v7
>  -fix pointed out by Steve to the SMMUv2 SMR bypass install in patch #8.
> 
> v5 --> v6
> - Addressed comments from Robin & Lorenzo.
>   : Moved iort_parse_rmr() to acpi_iort_init() from
> iort_init_platform_devices().
>   : Removed use of struct iort_rmr_entry during the initial
> parse. Using struct iommu_resv_region instead.
>   : Report RMR address alignment and overlap errors, but continue.
>   : Reworked arm_smmu_init_bypass_stes() (patch # 6).
> - Updated SMMUv2 bypass SMR code. Thanks to Jon N (patch #8).
> - Set IOMMU protection flags(IOMMU_CACHE, IOMMU_MMIO) based
>   on Type of RMR region. Suggested by Jon N.
> 
> v4 --> v5
>  -Added a fw_data union to struct iommu_resv_region and removed
>   struct iommu_rmr (Based on comments from Joerg/Robin).
>  -Added iommu_put_rmrs() to release mem.
>  -Thanks to Steve for verifying on SMMUv2, but not added the Tested-by
>   yet because of the above changes.
> 
> v3 -->v4
> -Included the SMMUv2 SMR bypass install changes suggested by
>  Steve(patch #7)
> -As per Robin's comments, RMR reserve implementation is now
>  more generic  (patch #8) and dropped v3 patches 8 and 10.
> -Rebase to 5.13-rc1
> 
> RFC v2 --> v3
>  -Dropped RFC tag as the ACPICA header changes are now ready to be
>   part of 5.13[0]. But this series still has a dependency on that patch.
>  -Added IORT E.b related changes(node flags, _DSM function 5 checks for
>   PCIe).
>  -Changed RMR to stream id mapping from M:N to M:1 as per the spec and
>   discussion here[1].
>  -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!)
> 
> Jon Nettleton (1):
>   iommu/arm-smmu: Get associated RMR info and install bypass SMR
> 
> Shameer Kolothum (10):
>   ACPI/IORT: Add temporary RMR node flag definitions
>   iommu: Introduce a union to struct iommu_resv_region
>   ACPI/IORT: Add helper functions to parse RMR nodes
>   iommu/dma: Introduce generic helper to retrieve RMR info
>   ACPI/IORT: Add a helper to retrieve RMR memory regions
>   iommu/arm-smmu-v3: Introduce strtab init helper
>   iommu/arm-smmu-v3: Refactor arm_smmu_init_bypass_stes() to force
> bypass
>   iommu/arm-smmu-v3: Get associated RMR info and install bypass STE
>   iommu/arm-smmu-v3: Reserve any RMR regions associated with a dev
>   iommu/arm-smmu: Reserve any RMR regions associated with a dev
> 
>  drivers/acpi/arm64/iort.c   | 305 
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  91 --
>  drivers/iommu/arm/arm-smmu/arm-smmu.c   |  65 -
>  drivers/iommu/dma-iommu.c   |  25 ++
>  include/linux/acpi_iort.h   |  14 +
>  include/linux/dma-iommu.h   |  14 +
>  include/linux/iommu.h   |   9 +
>  7 files changed, 504 insertions(+), 19 deletions(-)
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 4/5] drm/panfrost: Add a PANFROST_BO_GPUONLY flag

2021-10-01 Thread Steven Price
On 01/10/2021 15:34, Boris Brezillon wrote:
> This lets the driver reduce shareability domain of the MMU mapping,
> which can in turn reduce access time and save power on cache-coherent
> systems.
> 
> Signed-off-by: Boris Brezillon 

This seems reasonable to me - it matches the kbase
BASE_MEM_COHERENT_SYSTEM (only backwards obviously) and it worked
reasonably well for the blob.

But I'm wondering if we need to do anything special to deal with the
fact we will now have some non-coherent mappings on an otherwise
coherent device.

There are certainly some oddities around how these buffers will be
mapped into user space if requested, e.g. panfrost_gem_create_object()
sets 'map_wc' based on pfdev->coherent which is arguably wrong for
GPUONLY. So there are two things we could consider:

a) Actually prevent user space mapping GPUONLY flagged buffers. Which
matches the intention of the name.

b) Attempt to provide user space with the tools to safely interact with
the buffers (this is the kbase approach). This does have the benefit of
allowing *mostly* GPU access. An example here is the tiler heap where
the CPU could zero out as necessary but mostly the GPU has ownership and
the CPU never reads the contents. GPUONLY/DEVONLY might not be the best
name in that case.

Any thoughts?

Thanks,

Steve

> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 3 ++-
>  drivers/gpu/drm/panfrost/panfrost_gem.c | 1 +
>  drivers/gpu/drm/panfrost/panfrost_gem.h | 1 +
>  drivers/gpu/drm/panfrost/panfrost_mmu.c | 3 +++
>  include/uapi/drm/panfrost_drm.h | 1 +
>  5 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
> b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index b29ac942ae2d..b176921b9392 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -77,7 +77,8 @@ static int panfrost_ioctl_get_param(struct drm_device 
> *ddev, void *data, struct
>  
>  #define PANFROST_BO_FLAGS \
>   (PANFROST_BO_NOEXEC | PANFROST_BO_HEAP | \
> -  PANFROST_BO_NOREAD | PANFROST_BO_NOWRITE)
> +  PANFROST_BO_NOREAD | PANFROST_BO_NOWRITE | \
> +  PANFROST_BO_GPUONLY)
>  
>  static int panfrost_ioctl_create_bo(struct drm_device *dev, void *data,
>   struct drm_file *file)
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c 
> b/drivers/gpu/drm/panfrost/panfrost_gem.c
> index d6c1bb1445f2..4b1f85c0b98f 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
> @@ -254,6 +254,7 @@ panfrost_gem_create_with_handle(struct drm_file 
> *file_priv,
>   bo->noread = !!(flags & PANFROST_BO_NOREAD);
>   bo->nowrite = !!(flags & PANFROST_BO_NOWRITE);
>   bo->is_heap = !!(flags & PANFROST_BO_HEAP);
> + bo->gpuonly = !!(flags & PANFROST_BO_GPUONLY);
>  
>   /*
>* Allocate an id of idr table where the obj is registered
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h 
> b/drivers/gpu/drm/panfrost/panfrost_gem.h
> index 6246b5fef446..e332d5a4c24f 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gem.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_gem.h
> @@ -40,6 +40,7 @@ struct panfrost_gem_object {
>   bool noread :1;
>   bool nowrite:1;
>   bool is_heap:1;
> + bool gpuonly:1;
>  };
>  
>  struct panfrost_gem_mapping {
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
> b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index 6a5c9d94d6f2..89eee8e80aa5 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -321,6 +321,9 @@ int panfrost_mmu_map(struct panfrost_gem_mapping *mapping)
>   if (!bo->noread)
>   prot |= IOMMU_READ;
>  
> + if (bo->gpuonly)
> + prot |= IOMMU_DEVONLY;
> +
>   sgt = drm_gem_shmem_get_pages_sgt(obj);
>   if (WARN_ON(IS_ERR(sgt)))
>   return PTR_ERR(sgt);
> diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
> index a2de81225125..538b58b2d095 100644
> --- a/include/uapi/drm/panfrost_drm.h
> +++ b/include/uapi/drm/panfrost_drm.h
> @@ -88,6 +88,7 @@ struct drm_panfrost_wait_bo {
>  #define PANFROST_BO_HEAP 2
>  #define PANFROST_BO_NOREAD   4
>  #define PANFROST_BO_NOWRITE  8
> +#define PANFROST_BO_GPUONLY  0x10
>  
>  /**
>   * struct drm_panfrost_create_bo - ioctl argument for creating Panfrost BOs.
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 1/9] iommu: Introduce a union to struct iommu_resv_region

2021-08-20 Thread Steven Price
On 05/08/2021 09:07, Shameer Kolothum wrote:
> A union is introduced to struct iommu_resv_region to hold
> any firmware specific data. This is in preparation to add
> support for IORT RMR reserve regions and the union now holds
> the RMR specific information.
> 
> Signed-off-by: Shameer Kolothum 
> ---
>  include/linux/iommu.h | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 32d448050bf7..bd0e4641c569 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -114,6 +114,13 @@ enum iommu_resv_type {
>   IOMMU_RESV_SW_MSI,
>  };
>  
> +struct iommu_iort_rmr_data {
> +#define IOMMU_RMR_REMAP_PERMITTED(1 << 0)
> + u32 flags;
> + u32 sid;/* Stream Id associated with RMR entry */
> + void *smmu; /* Associated IORT SMMU node pointer */
> +};
> +
>  /**
>   * struct iommu_resv_region - descriptor for a reserved memory region
>   * @list: Linked list pointers
> @@ -121,6 +128,7 @@ enum iommu_resv_type {
>   * @length: Length of the region in bytes
>   * @prot: IOMMU Protection flags (READ/WRITE/...)
>   * @type: Type of the reserved region
> + * @rmr: ACPI IORT RMR specific data

NIT: This will provoke a kernel-doc warning as the field name in the
structure is 'fw_data' not 'rmr' ('rmr being a field of the anonymous
union).

I've also retested this series on my Juno setup, so feel free to add:

Tested-by: Steven Price 

Thanks,

Steve

>   */
>  struct iommu_resv_region {
>   struct list_headlist;
> @@ -128,6 +136,9 @@ struct iommu_resv_region {
>   size_t  length;
>   int prot;
>   enum iommu_resv_typetype;
> + union {
> + struct iommu_iort_rmr_data rmr;
> + } fw_data;
>  };
>  
>  /**
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 8/9] iommu/arm-smmu: Get associated RMR info and install bypass SMR

2021-07-16 Thread Steven Price
On 16/07/2021 09:34, Shameer Kolothum wrote:
> From: Jon Nettleton 
> 
> Check if there is any RMR info associated with the devices behind
> the SMMU and if any, install bypass SMRs for them. This is to
> keep any ongoing traffic associated with these devices alive
> when we enable/reset SMMU during probe().
> 
> Signed-off-by: Jon Nettleton 
> Signed-off-by: Steven Price 
> Signed-off-by: Shameer Kolothum 
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 48 +++
>  1 file changed, 48 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index f22dbeb1e510..e9fb3d962a86 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -2063,6 +2063,50 @@ err_reset_platform_ops: __maybe_unused;
>   return err;
>  }
>  
> +static void arm_smmu_rmr_install_bypass_smr(struct arm_smmu_device *smmu)
> +{
> + struct list_head rmr_list;
> + struct iommu_resv_region *e;
> + int i, cnt = 0;
> + u32 reg;
> +
> + INIT_LIST_HEAD(_list);
> + if (iommu_dma_get_rmrs(dev_fwnode(smmu->dev), _list))
> + return;
> +
> + /*
> +  * Rather than trying to look at existing mappings that
> +  * are setup by the firmware and then invalidate the ones
> +  * that do no have matching RMR entries, just disable the
> +  * SMMU until it gets enabled again in the reset routine.
> +  */
> + reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sCR0);
> + reg &= ~ARM_SMMU_sCR0_CLIENTPD;

This looks backwards, the spec states:

  Client Port Disable. The possible values of this bit are:
  0 - Each SMMU client access is subject to SMMU translation, access
  control, and attribute generation.
  1 - Each SMMU client access bypasses SMMU translation, access control,
  and attribute generation.
  This bit resets to 1.

And indeed with the current code if sCR0_USFCFG was set when
sCR0_CLIENTPD is cleared then I get a blank screen until the smmu is reset.

So I believe this should be ORing in the value, i.e.

  reg |= ARM_SMMU_sCR0_CLIENTPD;

And in my testing that works fine even if sCR0_USFCFG is set.

Steve

> + arm_smmu_gr0_write(smmu, ARM_SMMU_GR0_sCR0, reg);
> +
> + list_for_each_entry(e, _list, list) {
> + u32 sid = e->fw_data.rmr.sid;
> +
> + i = arm_smmu_find_sme(smmu, sid, ~0);
> + if (i < 0)
> + continue;
> + if (smmu->s2crs[i].count == 0) {
> + smmu->smrs[i].id = sid;
> + smmu->smrs[i].mask = 0;
> + smmu->smrs[i].valid = true;
> + }
> + smmu->s2crs[i].count++;
> + smmu->s2crs[i].type = S2CR_TYPE_BYPASS;
> + smmu->s2crs[i].privcfg = S2CR_PRIVCFG_DEFAULT;
> +
> + cnt++;
> + }
> +
> + dev_notice(smmu->dev, "\tpreserved %d boot mapping%s\n", cnt,
> +cnt == 1 ? "" : "s");
> + iommu_dma_put_rmrs(dev_fwnode(smmu->dev), _list);
> +}
> +
>  static int arm_smmu_device_probe(struct platform_device *pdev)
>  {
>   struct resource *res;
> @@ -2189,6 +2233,10 @@ static int arm_smmu_device_probe(struct 
> platform_device *pdev)
>   }
>  
>   platform_set_drvdata(pdev, smmu);
> +
> + /* Check for RMRs and install bypass SMRs if any */
> + arm_smmu_rmr_install_bypass_smr(smmu);
> +
>   arm_smmu_device_reset(smmu);
>   arm_smmu_test_smr_masks(smmu);
>  
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 7/8] iommu/arm-smmu: Get associated RMR info and install bypass SMR

2021-06-03 Thread Steven Price
On 03/06/2021 09:52, Jon Nettleton wrote:
> On Mon, May 24, 2021 at 1:04 PM Shameer Kolothum
>  wrote:
>>
>> From: Jon Nettleton 
>>
>> Check if there is any RMR info associated with the devices behind
>> the SMMU and if any, install bypass SMRs for them. This is to
>> keep any ongoing traffic associated with these devices alive
>> when we enable/reset SMMU during probe().
>>
>> Signed-off-by: Jon Nettleton 
>> Signed-off-by: Steven Price 
>> Signed-off-by: Shameer Kolothum 
>> ---
>>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 65 +++
>>  1 file changed, 65 insertions(+)
>>
>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
>> b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> index 6f72c4d208ca..56db3d3238fc 100644
>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
>> @@ -2042,6 +2042,67 @@ err_reset_platform_ops: __maybe_unused;
>> return err;
>>  }
>>
>> +static void arm_smmu_rmr_install_bypass_smr(struct arm_smmu_device *smmu)
>> +{
>> +   struct list_head rmr_list;
>> +   struct iommu_resv_region *e;
>> +   int i, cnt = 0;
>> +   u32 smr;
>> +   u32 reg;
>> +
>> +   INIT_LIST_HEAD(_list);
>> +   if (iommu_dma_get_rmrs(dev_fwnode(smmu->dev), _list))
>> +   return;
>> +
>> +   reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sCR0);
>> +
>> +   if ((reg & ARM_SMMU_sCR0_USFCFG) && !(reg & ARM_SMMU_sCR0_CLIENTPD)) 
>> {
>> +   /*
>> +* SMMU is already enabled and disallowing bypass, so 
>> preserve
>> +* the existing SMRs
>> +*/
>> +   for (i = 0; i < smmu->num_mapping_groups; i++) {
>> +   smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
>> +   if (!FIELD_GET(ARM_SMMU_SMR_VALID, smr))
>> +   continue;
>> +   smmu->smrs[i].id = FIELD_GET(ARM_SMMU_SMR_ID, smr);
>> +   smmu->smrs[i].mask = FIELD_GET(ARM_SMMU_SMR_MASK, 
>> smr);
>> +   smmu->smrs[i].valid = true;
>> +   }
>> +   }
>> +
>> +   list_for_each_entry(e, _list, list) {
>> +   u32 sid = e->fw_data.rmr.sid;
>> +
>> +   i = arm_smmu_find_sme(smmu, sid, ~0);
>> +   if (i < 0)
>> +   continue;
>> +   if (smmu->s2crs[i].count == 0) {
>> +   smmu->smrs[i].id = sid;
>> +   smmu->smrs[i].mask = ~0;

Looking at this code again, that mask looks wrong! According to the SMMU
spec MASK[i]==1 means ID[i] is ignored. I.e. this means completely
ignore the ID...

I'm not at all sure why they designed the hardware that way round.

>> +   smmu->smrs[i].valid = true;
>> +   }
>> +   smmu->s2crs[i].count++;
>> +   smmu->s2crs[i].type = S2CR_TYPE_BYPASS;
>> +   smmu->s2crs[i].privcfg = S2CR_PRIVCFG_DEFAULT;
>> +   smmu->s2crs[i].cbndx = 0xff;
>> +
>> +   cnt++;
>> +   }
>> +
>> +   if ((reg & ARM_SMMU_sCR0_USFCFG) && !(reg & ARM_SMMU_sCR0_CLIENTPD)) 
>> {
>> +   /* Remove the valid bit for unused SMRs */
>> +   for (i = 0; i < smmu->num_mapping_groups; i++) {
>> +   if (smmu->s2crs[i].count == 0)
>> +   smmu->smrs[i].valid = false;
>> +   }
>> +   }
>> +
>> +   dev_notice(smmu->dev, "\tpreserved %d boot mapping%s\n", cnt,
>> +  cnt == 1 ? "" : "s");
>> +   iommu_dma_put_rmrs(dev_fwnode(smmu->dev), _list);
>> +}
>> +
>>  static int arm_smmu_device_probe(struct platform_device *pdev)
>>  {
>> struct resource *res;
>> @@ -2168,6 +2229,10 @@ static int arm_smmu_device_probe(struct 
>> platform_device *pdev)
>> }
>>
>> platform_set_drvdata(pdev, smmu);
>> +
>> +   /* Check for RMRs and install bypass SMRs if any */
>> +   arm_smmu_rmr_install_bypass_smr(smmu);
>> +
>> arm_smmu_device_reset(smmu);
>> arm_smmu_test_smr_masks(smmu);
>>
>> --
>> 2.17.1
>>
> 
> Shameer and Robin
> 
> I finally got aroun

Re: [PATCH v5 0/8] ACPI/IORT: Support for IORT RMR node

2021-05-24 Thread Steven Price
On 24/05/2021 12:02, Shameer Kolothum wrote:
> Hi,
> 
> v4 --> v5
>  -Added a fw_data union to struct iommu_resv_region and removed
>   struct iommu_rmr (Based on comments from Joerg/Robin).
>  -Added iommu_put_rmrs() to release mem.
>  -Thanks to Steve for verifying v4 on SMMUv2, but not added the
>   Tested-by yet because of the above changes.

I've retested with this series (Juno with SMMU in front of display
controller and EFI framebuffer), and it still works, so:

Tested-by: Steven Price 

Thanks,

Steve

> 
> v3 -->v4
> -Included the SMMUv2 SMR bypass install changes suggested by
>  Steve(patch #7)
> -As per Robin's comments, RMR reserve implementation is now
>  more generic  (patch #8) and dropped v3 patches 8 and 10.
> -Rebase to 5.13-rc1 
> 
> RFC v2 --> v3
>  -Dropped RFC tag as the ACPICA header changes are now ready to be
>   part of 5.13[0]. But this series still has a dependency on that patch.
>  -Added IORT E.b related changes(node flags, _DSM function 5 checks for
>   PCIe).
>  -Changed RMR to stream id mapping from M:N to M:1 as per the spec and
>   discussion here[1].
>  -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!) 
> 
> Sanity tested on a HiSilicon D06. Further testing and feedback is greatly
> appreciated.
> 
> The whole series can be found here,
> https://github.com/hisilicon/kernel-dev/tree/private-v5.12-rc8-rmr-v3
> 
> Thanks,
> Shameer
> 
> [0] 
> https://lore.kernel.org/linux-acpi/20210406213028.718796-22-erik.kan...@intel.com/
> [1] 
> https://op-lists.linaro.org/pipermail/linaro-open-discussions/2021-April/000150.html
> 
> RFC v1 --> v2:
>  - Added a generic interface for IOMMU drivers to retrieve all the 
>    RMR info associated with a given IOMMU.
>  - SMMUv3 driver gets the RMR list during probe() and installs
>    bypass STEs for all the SIDs in the RMR list. This is to keep
>    the ongoing traffic alive(if any) during SMMUv3 reset. This is
>based on the suggestions received for v1 to take care of the
>EFI framebuffer use case. Only sanity tested for now.
>  - During the probe/attach device, SMMUv3 driver reserves any
>    RMR region associated with the device such that there is a unity
>    mapping for them in SMMU.
> ---    
> 
> From RFC v1:
> -
> The series adds support to IORT RMR nodes specified in IORT
> Revision E -ARM DEN 0049E[0]. RMR nodes are used to describe memory
> ranges that are used by endpoints and require a unity mapping
> in SMMU.
> 
> We have faced issues with 3408iMR RAID controller cards which
> fail to boot when SMMU is enabled. This is because these controllers
> make use of host memory for various caching related purposes and when
> SMMU is enabled the iMR firmware fails to access these memory regions
> as there is no mapping for them. IORT RMR provides a way for UEFI to
> describe and report these memory regions so that the kernel can make
> a unity mapping for these in SMMU.
> 
> Tests:
> 
> With a UEFI, that reports the RMR for the dev,
> 
> [16F0h 5872   1] Type : 06
> [16F1h 5873   2]   Length : 007C
> [16F3h 5875   1] Revision : 00
> [1038h 0056   2] Reserved : 
> [1038h 0056   2]   Identifier : 
> [16F8h 5880   4]Mapping Count : 0001
> [16FCh 5884   4]   Mapping Offset : 0040
> 
> [1700h 5888   4]Number of RMR Descriptors : 0002
> [1704h 5892   4]RMR Descriptor Offset : 0018
> 
> [1708h 5896   8]  Base Address of RMR : E640
> [1710h 5904   8]Length of RMR : 0010
> [1718h 5912   4] Reserved : 
> 
> [171Ch 5916   8]  Base Address of RMR : 27B0
> [1724h 5924   8]Length of RMR : 00C0
> [172Ch 5932   4] Reserved : 
> 
> [1730h 5936   4]   Input base : 
> [1734h 5940   4] ID Count : 0001
> [1738h 5944   4]  Output Base : 0003
> [173Ch 5948   4] Output Reference : 0064
> [1740h 5952   4]Flags (decoded below) : 0001
>Single Mapping : 1
> ...
> 
> Without the series the RAID controller initialization fails as
> below,
> 
> ...
> [   12.631117] megaraid_sas :03:00.0: FW supports sync cache: Yes 
>   
> [   12.637360] megaraid_sas :03:00.0: megasas_disable_intr_fusion is 
> called outbound_intr_mask:0x4009  
>  
> [   18.776377] megaraid_sas :03:00.0: Init

Re: [PATCH v4 0/8] ACPI/IORT: Support for IORT RMR node

2021-05-21 Thread Steven Price
On 13/05/2021 14:45, Shameer Kolothum wrote:
> Hi,
> 
> v3 -->v4
> -Included the SMMUv2 SMR bypass install changes suggested by
>  Steve(patch #7)
> -As per Robin's comments, RMR reserve implementation is now
>  more generic  (patch #8) and dropped v3 patches 8 and 10.
> -Rebase to 5.13-rc1 
> 
> The whole series is available here,
> https://github.com/hisilicon/kernel-dev/tree/private-v5.13-rc1-rmr-v4-ext
> 
> RFC v2 --> v3
>  -Dropped RFC tag as the ACPICA header changes are now ready to be
>   part of 5.13[0]. But this series still has a dependency on that patch.
>  -Added IORT E.b related changes(node flags, _DSM function 5 checks for
>   PCIe).
>  -Changed RMR to stream id mapping from M:N to M:1 as per the spec and
>   discussion here[1].
>  -Last two patches add support for SMMUv2(Thanks to Jon Nettleton!) 
> 
> Sanity tested on a HiSilicon D06. Further testing and feedback is greatly
> appreciated.

With the updated SMMUv2 support this works fine on my Juno with EFIFB
(and corresponding patches to the firmware to expose the ACPI tables).
Feel free to add

Tested-by: Steven Price 

Thanks,

Steve

> https://github.com/hisilicon/kernel-dev/tree/private-v5.12-rc8-rmr-v3
> 
> Thanks,
> Shameer
> 
> [0] 
> https://lore.kernel.org/linux-acpi/20210406213028.718796-22-erik.kan...@intel.com/
> [1] 
> https://op-lists.linaro.org/pipermail/linaro-open-discussions/2021-April/000150.html
> 
> RFC v1 --> v2:
>  - Added a generic interface for IOMMU drivers to retrieve all the 
>    RMR info associated with a given IOMMU.
>  - SMMUv3 driver gets the RMR list during probe() and installs
>    bypass STEs for all the SIDs in the RMR list. This is to keep
>    the ongoing traffic alive(if any) during SMMUv3 reset. This is
>based on the suggestions received for v1 to take care of the
>EFI framebuffer use case. Only sanity tested for now.
>  - During the probe/attach device, SMMUv3 driver reserves any
>    RMR region associated with the device such that there is a unity
>    mapping for them in SMMU.
> ---    
> 
> From RFC v1:
> -
> The series adds support to IORT RMR nodes specified in IORT
> Revision E -ARM DEN 0049E[0]. RMR nodes are used to describe memory
> ranges that are used by endpoints and require a unity mapping
> in SMMU.
> 
> We have faced issues with 3408iMR RAID controller cards which
> fail to boot when SMMU is enabled. This is because these controllers
> make use of host memory for various caching related purposes and when
> SMMU is enabled the iMR firmware fails to access these memory regions
> as there is no mapping for them. IORT RMR provides a way for UEFI to
> describe and report these memory regions so that the kernel can make
> a unity mapping for these in SMMU.
> 
> Tests:
> 
> With a UEFI, that reports the RMR for the dev,
> 
> [16F0h 5872   1] Type : 06
> [16F1h 5873   2]   Length : 007C
> [16F3h 5875   1] Revision : 00
> [1038h 0056   2] Reserved : 
> [1038h 0056   2]   Identifier : 
> [16F8h 5880   4]Mapping Count : 0001
> [16FCh 5884   4]   Mapping Offset : 0040
> 
> [1700h 5888   4]Number of RMR Descriptors : 0002
> [1704h 5892   4]RMR Descriptor Offset : 0018
> 
> [1708h 5896   8]  Base Address of RMR : E640
> [1710h 5904   8]Length of RMR : 0010
> [1718h 5912   4] Reserved : 
> 
> [171Ch 5916   8]  Base Address of RMR : 27B0
> [1724h 5924   8]Length of RMR : 00C0
> [172Ch 5932   4] Reserved : 
> 
> [1730h 5936   4]   Input base : 
> [1734h 5940   4] ID Count : 0001
> [1738h 5944   4]  Output Base : 0003
> [173Ch 5948   4] Output Reference : 0064
> [1740h 5952   4]Flags (decoded below) : 0001
>Single Mapping : 1
> ...
> 
> Without the series the RAID controller initialization fails as
> below,
> 
> ...
> [   12.631117] megaraid_sas :03:00.0: FW supports sync cache: Yes 
>   
> [   12.637360] megaraid_sas :03:00.0: megasas_disable_intr_fusion is 
> called outbound_intr_mask:0x4009  
>  
> [   18.776377] megaraid_sas :03:00.0: Init cmd return status FAILED for 
> SCSI host 0   
>   
> [   23.019383] megaraid_sas :03:00.0: Waiting for FW to come to ready 
> state 
>

Re: [PATCH v3 09/10] iommu/arm-smmu: Get associated RMR info and install bypass SMR

2021-05-06 Thread Steven Price

On 20/04/2021 09:27, Shameer Kolothum wrote:

From: Jon Nettleton 

Check if there is any RMR info associated with the devices behind
the SMMU and if any, install bypass SMRs for them. This is to
keep any ongoing traffic associated with these devices alive
when we enable/reset SMMU during probe().

Signed-off-by: Jon Nettleton 
Signed-off-by: Shameer Kolothum 
---
  drivers/iommu/arm/arm-smmu/arm-smmu.c | 42 +++
  drivers/iommu/arm/arm-smmu/arm-smmu.h |  2 ++
  2 files changed, 44 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index d8c6bfde6a61..4d2f91626d87 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -2102,6 +2102,43 @@ err_reset_platform_ops: __maybe_unused;
return err;
  }
  
+static void arm_smmu_rmr_install_bypass_smr(struct arm_smmu_device *smmu)

+{
+   struct iommu_rmr *e;
+   int i, cnt = 0;
+   u32 smr;
+
+   for (i = 0; i < smmu->num_mapping_groups; i++) {
+   smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
+   if (!FIELD_GET(ARM_SMMU_SMR_VALID, smr))
+   continue;
+
+   list_for_each_entry(e, >rmr_list, list) {
+   if (FIELD_GET(ARM_SMMU_SMR_ID, smr) != e->sid)
+   continue;
+
+   smmu->smrs[i].id = FIELD_GET(ARM_SMMU_SMR_ID, smr);
+   smmu->smrs[i].mask = FIELD_GET(ARM_SMMU_SMR_MASK, smr);
+   smmu->smrs[i].valid = true;
+
+   smmu->s2crs[i].type = S2CR_TYPE_BYPASS;
+   smmu->s2crs[i].privcfg = S2CR_PRIVCFG_DEFAULT;
+   smmu->s2crs[i].cbndx = 0xff;
+
+   cnt++;
+   }
+   }


If I understand this correctly - this is looking at the current
(hardware) configuration of the SMMU and attempting to preserve any
bypass SMRs. However from what I can tell it suffers from the following
two problems:

 (a) Only the ID of the SMR is being checked, not the MASK. So if the
firmware has setup an SMR matching a number of streams this will break.

 (b) The SMMU might not be enabled at all (CLIENTPD==1) or bypass
enabled for unmatched streams (USFCFG==0).

Certainly in my test setup case (b) applies and so this doesn't work.
Perhaps something like the below would work better? (It works in the
case of the SMMU not enabled - I've not tested case (a)).

Steve

8<
static void arm_smmu_rmr_install_bypass_smr(struct arm_smmu_device *smmu)
{
struct iommu_rmr *e;
int i, cnt = 0;
u32 smr;
u32 reg;

reg = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_sCR0);

if ((reg & ARM_SMMU_sCR0_USFCFG) && !(reg & ARM_SMMU_sCR0_CLIENTPD)) {
/*
 * SMMU is already enabled and disallowing bypass, so preserve
 * the existing SMRs
 */
for (i = 0; i < smmu->num_mapping_groups; i++) {
smr = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_SMR(i));
if (!FIELD_GET(ARM_SMMU_SMR_VALID, smr))
continue;
smmu->smrs[i].id = FIELD_GET(ARM_SMMU_SMR_ID, smr);
smmu->smrs[i].mask = FIELD_GET(ARM_SMMU_SMR_MASK, smr);
smmu->smrs[i].valid = true;
}
}

list_for_each_entry(e, >rmr_list, list) {
u32 sid = e->sid;

i = arm_smmu_find_sme(smmu, sid, ~0);
if (i < 0)
continue;
if (smmu->s2crs[i].count == 0) {
smmu->smrs[i].id = sid;
smmu->smrs[i].mask = ~0;
smmu->smrs[i].valid = true;
}
smmu->s2crs[i].count++;
smmu->s2crs[i].type = S2CR_TYPE_BYPASS;
smmu->s2crs[i].privcfg = S2CR_PRIVCFG_DEFAULT;
smmu->s2crs[i].cbndx = 0xff;

cnt++;
}

if ((reg & ARM_SMMU_sCR0_USFCFG) && !(reg & ARM_SMMU_sCR0_CLIENTPD)) {
/* Remove the valid bit for unused SMRs */
for (i = 0; i < smmu->num_mapping_groups; i++) {
if (smmu->s2crs[i].count == 0)
smmu->smrs[i].valid = false;
}
}

dev_notice(smmu->dev, "\tpreserved %d boot mapping%s\n", cnt,
   cnt == 1 ? "" : "s");
}
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v5 05/16] swiotlb: Add restricted DMA pool initialization

2021-04-28 Thread Steven Price

On 26/04/2021 17:37, Claire Chang wrote:

On Fri, Apr 23, 2021 at 7:34 PM Steven Price  wrote:

[...]


But even then if it's not and we have the situation where debugfs==NULL
then the debugfs_create_dir() here will cause a subsequent attempt in
swiotlb_create_debugfs() to fail (directory already exists) leading to
mem->debugfs being assigned an error value. I suspect the creation of
the debugfs directory needs to be separated from io_tlb_default_mem
being set.


debugfs creation should move into the if (!mem) {...} above to avoid
duplication.
I think having a separated struct dentry pointer for the default
debugfs should be enough?

if (!debugfs)
 debugfs = debugfs_create_dir("swiotlb", NULL);
swiotlb_create_debugfs(mem, rmem->name, debugfs);


Yes that looks like a good solution to me. Although I'd name the 
variable something a bit more descriptive than just "debugfs" e.g. 
"debugfs_dir" or "debugfs_root".


Thanks,

Steve
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 05/16] swiotlb: Add restricted DMA pool initialization

2021-04-23 Thread Steven Price

On 22/04/2021 09:14, Claire Chang wrote:

Add the initialization function to create restricted DMA pools from
matching reserved-memory nodes.

Signed-off-by: Claire Chang 
---
  include/linux/device.h  |  4 +++
  include/linux/swiotlb.h |  3 +-
  kernel/dma/swiotlb.c| 80 +
  3 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index 38a2071cf776..4987608ea4ff 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -416,6 +416,7 @@ struct dev_links_info {
   * @dma_pools:Dma pools (if dma'ble device).
   * @dma_mem:  Internal for coherent mem override.
   * @cma_area: Contiguous memory area for dma allocations
+ * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
   * @archdata: For arch-specific additions.
   * @of_node:  Associated device tree node.
   * @fwnode:   Associated device node supplied by platform firmware.
@@ -521,6 +522,9 @@ struct device {
  #ifdef CONFIG_DMA_CMA
struct cma *cma_area;   /* contiguous memory area for dma
   allocations */
+#endif
+#ifdef CONFIG_DMA_RESTRICTED_POOL
+   struct io_tlb_mem *dma_io_tlb_mem;
  #endif
/* arch specific additions */
struct dev_archdata archdata;
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 216854a5e513..03ad6e3b4056 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -72,7 +72,8 @@ extern enum swiotlb_force swiotlb_force;
   *range check to see if the memory was in fact allocated by this
   *API.
   * @nslabs:   The number of IO TLB blocks (in groups of 64) between @start and
- * @end. This is command line adjustable via setup_io_tlb_npages.
+ * @end. For default swiotlb, this is command line adjustable via
+ * setup_io_tlb_npages.
   * @used: The number of used IO TLB block.
   * @list: The free list describing the number of free entries available
   *from each index.
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 57a9adb920bf..ffbb8724e06c 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -39,6 +39,13 @@
  #ifdef CONFIG_DEBUG_FS
  #include 
  #endif
+#ifdef CONFIG_DMA_RESTRICTED_POOL
+#include 
+#include 
+#include 
+#include 
+#include 
+#endif
  
  #include 

  #include 
@@ -681,3 +688,76 @@ static int __init swiotlb_create_default_debugfs(void)
  late_initcall(swiotlb_create_default_debugfs);
  
  #endif

+
+#ifdef CONFIG_DMA_RESTRICTED_POOL
+static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
+   struct device *dev)
+{
+   struct io_tlb_mem *mem = rmem->priv;
+   unsigned long nslabs = rmem->size >> IO_TLB_SHIFT;
+
+   if (dev->dma_io_tlb_mem)
+   return 0;
+
+   /* Since multiple devices can share the same pool, the private data,
+* io_tlb_mem struct, will be initialized by the first device attached
+* to it.
+*/
+   if (!mem) {
+   mem = kzalloc(struct_size(mem, slots, nslabs), GFP_KERNEL);
+   if (!mem)
+   return -ENOMEM;
+#ifdef CONFIG_ARM
+   if (!PageHighMem(pfn_to_page(PHYS_PFN(rmem->base {
+   kfree(mem);
+   return -EINVAL;
+   }
+#endif /* CONFIG_ARM */
+   swiotlb_init_io_tlb_mem(mem, rmem->base, nslabs, false);
+
+   rmem->priv = mem;
+   }
+
+#ifdef CONFIG_DEBUG_FS
+   if (!io_tlb_default_mem->debugfs)
+   io_tlb_default_mem->debugfs =
+   debugfs_create_dir("swiotlb", NULL);


At this point it's possible for io_tlb_default_mem to be NULL, leading 
to a splat.


But even then if it's not and we have the situation where debugfs==NULL 
then the debugfs_create_dir() here will cause a subsequent attempt in 
swiotlb_create_debugfs() to fail (directory already exists) leading to 
mem->debugfs being assigned an error value. I suspect the creation of 
the debugfs directory needs to be separated from io_tlb_default_mem 
being set.


Other than that I gave this series a go with our prototype of Arm's 
Confidential Computer Architecture[1] - since the majority of the 
guest's memory is protected from the host the restricted DMA pool allows 
(only) a small area to be shared with the host.


After fixing (well hacking round) the above it all seems to be working 
fine with virtio drivers.


Thanks,

Steve

[1] 
https://www.arm.com/why-arm/architecture/security-features/arm-confidential-compute-architecture

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node

2020-12-14 Thread Steven Price

On 14/12/2020 12:33, Robin Murphy wrote:

On 2020-12-14 10:55, Shameerali Kolothum Thodi wrote:

Hi Steve,


-Original Message-
From: Steven Price [mailto:steven.pr...@arm.com]
Sent: 10 December 2020 10:26
To: Shameerali Kolothum Thodi ;
linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
iommu@lists.linux-foundation.org; de...@acpica.org
Cc: Linuxarm ; lorenzo.pieral...@arm.com;
j...@8bytes.org; robin.mur...@arm.com; wanghuiqiang
; Guohanjun (Hanjun Guo)
; Jonathan Cameron
; sami.muja...@arm.com
Subject: Re: [RFC PATCH v2 0/8] ACPI/IORT: Support for IORT RMR node

On 19/11/2020 12:11, Shameer Kolothum wrote:

RFC v1 --> v2:
   - Added a generic interface for IOMMU drivers to retrieve all the
     RMR info associated with a given IOMMU.
   - SMMUv3 driver gets the RMR list during probe() and installs
     bypass STEs for all the SIDs in the RMR list. This is to keep
     the ongoing traffic alive(if any) during SMMUv3 reset. This is
 based on the suggestions received for v1 to take care of the
 EFI framebuffer use case. Only sanity tested for now.


Hi Shameer,

Sorry for not looking at this before.

Do you have any plans to implement support in the SMMUv2 driver? The
platform I've been testing the EFI framebuffer support on has the
display controller behind SMMUv2, so as it stands this series doesn't
work. I did hack something up for SMMUv2 so I was able to test the first
4 patches.


Thanks for taking a look. Sure, I can look into adding the support for 
SMMUv2.


Great, thanks!




   - During the probe/attach device, SMMUv3 driver reserves any
     RMR region associated with the device such that there is a unity
     mapping for them in SMMU.


For the EFI framebuffer use case there is no device to attach so I
believe we are left with just the stream ID in bypass mode - which is
definitely an improvement (the display works!)


Cool. That’s good to know.

  but not actually a unity

mapping of the RMR range. I'm not sure whether it's worth fixing this or
not, but I just wanted to point out there's still a need for a driver
for the device before the bypass mode is replaced with the unity 
mapping.


I am not sure either. My idea was we will have bypass STE setup for 
all devices
with RMR initially and when the corresponding driver takes over(if 
that happens)
we will have the unity mapping setup properly for the RMR regions. And 
for cases

like the above, it will remain in the bypass mode.

Do you see any problem(security?) if the dev streams remain in bypass 
mode for
this dev? Or is it possible to have a stub driver for this dev, so 
that we will have

the probe/attach invoked and everything will fall in place?


The downside is that if a driver never binds to that device, it remains 
bypassed. If some other externally-controlled malicious device could 
manage to spoof that device's requester ID, that could potentially be 
exploited.


TBH, I haven't looked into creating a temp domain for these types of 
the devices
and also not sure how we benefit from that compared to the STE bypass 
mode.


That said, setting up temporary translation contexts that ensure any 
access can *only* be to RMR regions until a driver takes over is an 
awful lot more work. I'm inclined to be pragmatic here and say we should 
get things working at all with the simple bypass STE/S2CR method, then 
look at adding the higher-security effort on top.


Right now systems that need this are either broken (but effectively 
secure) or using default bypass with SMMUv2. People who prefer to 
maintain security over functionality in the interim can maintain that 
status quo by simply continuing to not describe any RMRs.


I agree with Robin, let's get this working with the bypass mode (until 
the device binds) like you've currently got. It's much better than what 
we have otherwise. Then once that is merged we can look at the temporary 
translation context or stub driver. The temporary translation context 
would be 'neatest', but I'm only aware of the EFI framebuffer use case 
and for that it might be possible to do something simpler - if indeed 
anything is needed at all. I'm not sure how much we need to be worried 
about malicious devices spoofing requester IDs.


Thanks,

Steve
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu/io-pgtable: Remove tlb_flush_leaf

2020-11-26 Thread Steven Price

On 25/11/2020 17:29, Robin Murphy wrote:

The only user of tlb_flush_leaf is a particularly hairy corner of the
Arm short-descriptor code, which wants a synchronous invalidation to
minimise the races inherent in trying to split a large page mapping.
This is already far enough into "here be dragons" territory that no
sensible caller should ever hit it, and thus it really doesn't need
optimising. Although using tlb_flush_walk there may technically be
more heavyweight than needed, it does the job and saves everyone else
having to carry around useless baggage.

Signed-off-by: Robin Murphy 


LGTM

Reviewed-by: Steven Price 


---

Reviewing the Mediatek TLB optimisation patches just left me thinking
"why do we even have this?"... Panfrost folks, this has zero functional
impact to you, merely wants an ack for straying outside drivers/iommu.

Robin.

  drivers/gpu/drm/msm/msm_iommu.c |  1 -
  drivers/gpu/drm/panfrost/panfrost_mmu.c |  7 --
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  7 --
  drivers/iommu/arm/arm-smmu/arm-smmu.c   | 25 +++--
  drivers/iommu/arm/arm-smmu/qcom_iommu.c |  8 ---
  drivers/iommu/io-pgtable-arm-v7s.c  |  3 +--
  drivers/iommu/io-pgtable-arm.c  |  1 -
  drivers/iommu/ipmmu-vmsa.c  |  1 -
  drivers/iommu/msm_iommu.c   |  7 --
  drivers/iommu/mtk_iommu.c   |  1 -
  include/linux/io-pgtable.h  | 11 -
  11 files changed, 4 insertions(+), 68 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 22ac7c692a81..50d881794758 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -139,7 +139,6 @@ static void msm_iommu_tlb_add_page(struct 
iommu_iotlb_gather *gather,
  static const struct iommu_flush_ops null_tlb_ops = {
.tlb_flush_all = msm_iommu_tlb_flush_all,
.tlb_flush_walk = msm_iommu_tlb_flush_walk,
-   .tlb_flush_leaf = msm_iommu_tlb_flush_walk,
.tlb_add_page = msm_iommu_tlb_add_page,
  };

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 776448c527ea..c186914cc4f9 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -347,16 +347,9 @@ static void mmu_tlb_flush_walk(unsigned long iova, size_t 
size, size_t granule,
mmu_tlb_sync_context(cookie);
  }

-static void mmu_tlb_flush_leaf(unsigned long iova, size_t size, size_t granule,
-  void *cookie)
-{
-   mmu_tlb_sync_context(cookie);
-}
-
  static const struct iommu_flush_ops mmu_tlb_ops = {
.tlb_flush_all  = mmu_tlb_inv_context_s1,
.tlb_flush_walk = mmu_tlb_flush_walk,
-   .tlb_flush_leaf = mmu_tlb_flush_leaf,
  };

  int panfrost_mmu_pgtable_alloc(struct panfrost_file_priv *priv)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e634bbe60573..fb684a393118 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1741,16 +1741,9 @@ static void arm_smmu_tlb_inv_walk(unsigned long iova, 
size_t size,
arm_smmu_tlb_inv_range(iova, size, granule, false, cookie);
  }

-static void arm_smmu_tlb_inv_leaf(unsigned long iova, size_t size,
- size_t granule, void *cookie)
-{
-   arm_smmu_tlb_inv_range(iova, size, granule, true, cookie);
-}
-
  static const struct iommu_flush_ops arm_smmu_flush_ops = {
.tlb_flush_all  = arm_smmu_tlb_inv_context,
.tlb_flush_walk = arm_smmu_tlb_inv_walk,
-   .tlb_flush_leaf = arm_smmu_tlb_inv_leaf,
.tlb_add_page   = arm_smmu_tlb_inv_page_nosync,
  };

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index dad7fa86fbd4..0b8c59922a2b 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -333,14 +333,6 @@ static void arm_smmu_tlb_inv_walk_s1(unsigned long iova, 
size_t size,
arm_smmu_tlb_sync_context(cookie);
  }

-static void arm_smmu_tlb_inv_leaf_s1(unsigned long iova, size_t size,
-size_t granule, void *cookie)
-{
-   arm_smmu_tlb_inv_range_s1(iova, size, granule, cookie,
- ARM_SMMU_CB_S1_TLBIVAL);
-   arm_smmu_tlb_sync_context(cookie);
-}
-
  static void arm_smmu_tlb_add_page_s1(struct iommu_iotlb_gather *gather,
 unsigned long iova, size_t granule,
 void *cookie)
@@ -357,14 +349,6 @@ static void arm_smmu_tlb_inv_walk_s2(unsigned long iova, 
size_t size,
arm_smmu_tlb_sync_context(cookie);
  }

-static void arm_smmu_tlb_inv_leaf_s2(unsigned long iova, size_t size,
-size_t granule, void *cookie)
-{
-   arm_smmu_tlb_inv

Re: [RFC PATCH 0/4] ACPI/IORT: Support for IORT RMR node

2020-11-06 Thread Steven Price

On 06/11/2020 16:17, Shameerali Kolothum Thodi wrote:




-Original Message-
From: Steven Price [mailto:steven.pr...@arm.com]
Sent: 06 November 2020 15:22
To: Shameerali Kolothum Thodi ;
linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
iommu@lists.linux-foundation.org; de...@acpica.org
Cc: lorenzo.pieral...@arm.com; j...@8bytes.org; Jonathan Cameron
; Linuxarm ;
Guohanjun (Hanjun Guo) ; Sami Mujawar
; robin.mur...@arm.com; wanghuiqiang

Subject: Re: [RFC PATCH 0/4] ACPI/IORT: Support for IORT RMR node

On 28/10/2020 18:24, Shameerali Kolothum Thodi wrote:

Hi Steve,


-Original Message-
From: Steven Price [mailto:steven.pr...@arm.com]
Sent: 28 October 2020 16:44
To: Shameerali Kolothum Thodi ;
linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
iommu@lists.linux-foundation.org; de...@acpica.org
Cc: lorenzo.pieral...@arm.com; j...@8bytes.org; Jonathan Cameron
; Linuxarm ;
Guohanjun (Hanjun Guo) ;

robin.mur...@arm.com;

wanghuiqiang ; Sami Mujawar

Subject: Re: [RFC PATCH 0/4] ACPI/IORT: Support for IORT RMR node

On 27/10/2020 11:26, Shameer Kolothum wrote:

The series adds support to IORT RMR nodes specified in IORT
Revision E -ARM DEN 0049E[0]. RMR nodes are used to describe memory
ranges that are used by endpoints and require a unity mapping
in SMMU.


Hi Shameer,

I've also been taking a look at RMR, and Sami is helping me get set up
so that I can do some testing. We're hoping to be able to test an EFI
framebuffer or splash screen - which has the added complication of the
unity mapping becoming redundant if a native display driver takes over
the display controller.

I've looked through your series and the code looks correct to me.


Thanks for taking a look and the details.


Hopefully I'll be able to give it some testing soon.


Cool. Please update once you get a chance run the tests.


Hi Shameer,


Hi Steve,


Just to update on this, for the EFI framebuffer use case I hit exactly
the issue that Robin has mentioned in another thread - the RMR is
effectively ignored because the display controller isn't being handled
by Linux (so there's no device to link it to).


Thanks for the update. Here, by "ignored "you meant get_resv_regions()
is not called or not?


get_resv_regions() isn't called.


  The splash screen might

similarly flicker as the SMMU reset will initially block the traffic
before the RMR region is enabled.


Does that mean you somehow managed to make the unity
mapping but there was flicker during the SMMU reset to
unity mapping setup period. Sorry I am trying to understand
the exact behavior observed in this case.


I haven't yet got this completely working (on the board which I'm 
testing the display controller doesn't have any existing ACPI bindings). 
However from what I understand the SMMU reset would block all memory 
access for the display controller before the RMR region would be setup. 
I'm going to try to get the display controller working with ACPI so I 
can test this properly.


Steve
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH 0/4] ACPI/IORT: Support for IORT RMR node

2020-11-06 Thread Steven Price

On 28/10/2020 18:24, Shameerali Kolothum Thodi wrote:

Hi Steve,


-Original Message-
From: Steven Price [mailto:steven.pr...@arm.com]
Sent: 28 October 2020 16:44
To: Shameerali Kolothum Thodi ;
linux-arm-ker...@lists.infradead.org; linux-a...@vger.kernel.org;
iommu@lists.linux-foundation.org; de...@acpica.org
Cc: lorenzo.pieral...@arm.com; j...@8bytes.org; Jonathan Cameron
; Linuxarm ;
Guohanjun (Hanjun Guo) ; robin.mur...@arm.com;
wanghuiqiang ; Sami Mujawar

Subject: Re: [RFC PATCH 0/4] ACPI/IORT: Support for IORT RMR node

On 27/10/2020 11:26, Shameer Kolothum wrote:

The series adds support to IORT RMR nodes specified in IORT
Revision E -ARM DEN 0049E[0]. RMR nodes are used to describe memory
ranges that are used by endpoints and require a unity mapping
in SMMU.


Hi Shameer,

I've also been taking a look at RMR, and Sami is helping me get set up
so that I can do some testing. We're hoping to be able to test an EFI
framebuffer or splash screen - which has the added complication of the
unity mapping becoming redundant if a native display driver takes over
the display controller.

I've looked through your series and the code looks correct to me.


Thanks for taking a look and the details.


Hopefully I'll be able to give it some testing soon.


Cool. Please update once you get a chance run the tests.


Hi Shameer,

Just to update on this, for the EFI framebuffer use case I hit exactly 
the issue that Robin has mentioned in another thread - the RMR is 
effectively ignored because the display controller isn't being handled 
by Linux (so there's no device to link it to). The splash screen might 
similarly flicker as the SMMU reset will initially block the traffic 
before the RMR region is enabled.


Steve
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH 0/4] ACPI/IORT: Support for IORT RMR node

2020-10-28 Thread Steven Price

On 27/10/2020 11:26, Shameer Kolothum wrote:

The series adds support to IORT RMR nodes specified in IORT
Revision E -ARM DEN 0049E[0]. RMR nodes are used to describe memory
ranges that are used by endpoints and require a unity mapping
in SMMU.


Hi Shameer,

I've also been taking a look at RMR, and Sami is helping me get set up 
so that I can do some testing. We're hoping to be able to test an EFI 
framebuffer or splash screen - which has the added complication of the 
unity mapping becoming redundant if a native display driver takes over 
the display controller.


I've looked through your series and the code looks correct to me. 
Hopefully I'll be able to give it some testing soon.


Thanks,

Steve


We have faced issues with 3408iMR RAID controller cards which
fail to boot when SMMU is enabled. This is because these controllers
make use of host memory for various caching related purposes and when
SMMU is enabled the iMR firmware fails to access these memory regions
as there is no mapping for them. IORT RMR provides a way for UEFI to
describe and report these memory regions so that the kernel can make
a unity mapping for these in SMMU.

RFC because, Patch #1 is to update the actbl2.h and should be done
through acpica update. I have send out a pull request[1] for that.

Tests:

With a UEFI, that reports the RMR for the dev,

[16F0h 5872   1] Type : 06
[16F1h 5873   2]   Length : 007C
[16F3h 5875   1] Revision : 00
[1038h 0056   2] Reserved : 
[1038h 0056   2]   Identifier : 
[16F8h 5880   4]Mapping Count : 0001
[16FCh 5884   4]   Mapping Offset : 0040

[1700h 5888   4]Number of RMR Descriptors : 0002
[1704h 5892   4]RMR Descriptor Offset : 0018

[1708h 5896   8]  Base Address of RMR : E640
[1710h 5904   8]Length of RMR : 0010
[1718h 5912   4] Reserved : 

[171Ch 5916   8]  Base Address of RMR : 27B0
[1724h 5924   8]Length of RMR : 00C0
[172Ch 5932   4] Reserved : 

[1730h 5936   4]   Input base : 
[1734h 5940   4] ID Count : 0001
[1738h 5944   4]  Output Base : 0003
[173Ch 5948   4] Output Reference : 0064
[1740h 5952   4]Flags (decoded below) : 0001
Single Mapping : 1
...

Without the series the RAID controller initilization fails as
below,

...
[   12.631117] megaraid_sas :03:00.0: FW supports sync cache: Yes
[   12.637360] megaraid_sas :03:00.0: megasas_disable_intr_fusion is called 
outbound_intr_mask:0x4009
[   18.776377] megaraid_sas :03:00.0: Init cmd return status FAILED for 
SCSI host 0
[   23.019383] megaraid_sas :03:00.0: Waiting for FW to come to ready state
[  106.684281] megaraid_sas :03:00.0: FW in FAULT state, Fault code:0x1 
subcode:0x0 func:megasas_transition_to_ready
[  106.695186] megaraid_sas :03:00.0: System Register set:
[  106.889787] megaraid_sas :03:00.0: Failed to transition controller to 
ready for scsi0.
[  106.910475] megaraid_sas :03:00.0: Failed from megasas_init_fw 6407
estuary:/$

With the series, now the kernel has direct mapping for the dev as
below,

estuary:/$ cat /sys/kernel/iommu_groups/0/reserved_regions
0x0800 0x080f msi
0x27b0 0x286f direct
0xe640 0xe64f direct
estuary:/$


[   12.254318] megaraid_sas :03:00.0: megasas_disable_intr_fusion is called 
outbound_intr_mask:0x4009
[   12.739089] megaraid_sas :03:00.0: FW provided supportMaxExtLDs: 0  
max_lds: 32
[   12.746628] megaraid_sas :03:00.0: controller type   : iMR(0MB)
[   12.752694] megaraid_sas :03:00.0: Online Controller Reset(OCR)  : 
Enabled
[   12.759798] megaraid_sas :03:00.0: Secure JBOD support   : Yes
[   12.765778] megaraid_sas :03:00.0: NVMe passthru support : Yes
[   12.771931] megaraid_sas :03:00.0: FW provided TM TaskAbort/Reset 
timeou: 6 secs/60 secs
[   12.780503] megaraid_sas :03:00.0: JBOD sequence map support : Yes
[   12.787000] megaraid_sas :03:00.0: PCI Lane Margining support: No
[   12.819179] megaraid_sas :03:00.0: NVME page size: (4096)
[   12.825672] megaraid_sas :03:00.0: megasas_enable_intr_fusion is called 
outbound_intr_mask:0x4000
[   12.835199] megaraid_sas :03:00.0: INIT adapter done
[   12.873932] megaraid_sas :03:00.0: pci id: 
(0x1000)/(0x0017)/(0x19e5)/(0xd213)
[   12.881644] megaraid_sas :03:00.0: unevenspan support: no
[   12.887451] megaraid_sas :03:00.0: firmware crash dump   : no
[   12.893344] megaraid_sas :03:00.0: JBOD sequence map : enabled

RAID controller init is now success and 

Re: [PATCH v2 1/3] iommu/io-pgtable-arm: Support coherency for Mali LPAE

2020-10-05 Thread Steven Price

On 05/10/2020 15:50, Boris Brezillon wrote:

On Tue, 22 Sep 2020 15:16:48 +0100
Robin Murphy  wrote:


Midgard GPUs have ACE-Lite master interfaces which allows systems to
integrate them in an I/O-coherent manner. It seems that from the GPU's
viewpoint, the rest of the system is its outer shareable domain, and so
even when snoop signals are wired up, they are only emitted for outer
shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does
indeed get coherent pagetable walks working nicely for the coherent
T620 in the Arm Juno SoC.

Reviewed-by: Steven Price 
Tested-by: Neil Armstrong 
Signed-off-by: Robin Murphy 
---
  drivers/iommu/io-pgtable-arm.c | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index dc7bcf858b6d..b4072a18e45d 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -440,7 +440,13 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct 
arm_lpae_io_pgtable *data,
<< ARM_LPAE_PTE_ATTRINDX_SHIFT);
}
  
-	if (prot & IOMMU_CACHE)

+   /*
+* Also Mali has its own notions of shareability wherein its Inner
+* domain covers the cores within the GPU, and its Outer domain is
+* "outside the GPU" (i.e. either the Inner or System domain in CPU
+* terms, depending on coherency).
+*/
+   if (prot & IOMMU_CACHE && data->iop.fmt != ARM_MALI_LPAE)
pte |= ARM_LPAE_PTE_SH_IS;
else
pte |= ARM_LPAE_PTE_SH_OS;


Actually, it still doesn't work on s922x :-/. For it to work I
correctly, I need to drop the outer shareable flag here.


The logic here does seem a bit odd. Originally it was:

IOMMU_CACHE -> Inner shared (value 3)
!IOMMU_CACHE -> Outer shared (value 2)

For Mali we're forcing everything to the second option. But Mali being 
Mali doesn't do things the same as LPAE, so for Mali we have:


0 - not shared
1 - reserved
2 - inner(*) and outer shareable
3 - inner shareable only

(*) where "inner" means internal to the GPU, and "outer" means shared 
with the CPU "inner". Very confusing!


So originally we had:
IOMMU_CACHE -> not shared with CPU (only internally in the GPU)
!IOMMU_CACHE -> shared with CPU

The change above gets us to "always shared", dropping the SH_OS bit 
would get us to not even shareable between cores (which doesn't sound 
like what we want).


It's not at all clear to me why the change helps, but I suspect we want 
at least "inner" shareable.


Steve


@@ -1049,6 +1055,9 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, 
void *cookie)
cfg->arm_mali_lpae_cfg.transtab = virt_to_phys(data->pgd) |
  ARM_MALI_LPAE_TTBR_READ_INNER |
  ARM_MALI_LPAE_TTBR_ADRMODE_TABLE;
+   if (cfg->coherent_walk)
+   cfg->arm_mali_lpae_cfg.transtab |= 
ARM_MALI_LPAE_TTBR_SHARE_OUTER;
+
return >iop;
  
  out_free_data:




___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 3/3] arm64: dts: meson: Describe G12b GPU as coherent

2020-10-05 Thread Steven Price

On 05/10/2020 09:39, Boris Brezillon wrote:

On Mon, 5 Oct 2020 09:34:06 +0100
Steven Price  wrote:


On 05/10/2020 09:15, Boris Brezillon wrote:

Hi Robin, Neil,

On Wed, 16 Sep 2020 10:26:43 +0200
Neil Armstrong  wrote:
   

Hi Robin,

On 16/09/2020 01:51, Robin Murphy wrote:

According to a downstream commit I found in the Khadas vendor kernel,
the GPU on G12b is wired up for ACE-lite, so (now that Panfrost knows
how to handle this properly) we should describe it as such. Otherwise
the mismatch leads to all manner of fun with mismatched attributes and
inadvertently snooping stale data from caches, which would account for
at least some of the brokenness observed on this platform.

Signed-off-by: Robin Murphy 
---
   arch/arm64/boot/dts/amlogic/meson-g12b.dtsi | 4 
   1 file changed, 4 insertions(+)

diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi 
b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
index 9b8548e5f6e5..ee8fcae9f9f0 100644
--- a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
@@ -135,3 +135,7 @@ map1 {
};
};
   };
+
+ {
+   dma-coherent;
+};
  


Thanks a lot for digging, I'll run a test to confirm it fixes the issue !


Sorry for the late reply. I triggered a dEQP run with this patch applied
and I see a bunch of "panfrost ffe4.gpu: matching BO is not heap type"
errors (see below for a full backtrace). That doesn't seem to happen when
we drop this dma-coherent property.

[  690.945731] [ cut here ]
[  690.950003] panfrost ffe4.gpu: matching BO is not heap type (GPU VA = 
319a000)
[  690.950051] WARNING: CPU: 0 PID: 120 at 
drivers/gpu/drm/panfrost/panfrost_mmu.c:465 
panfrost_mmu_irq_handler_thread+0x47c/0x650
[  690.968854] Modules linked in:
[  690.971878] CPU: 0 PID: 120 Comm: irq/27-panfrost Tainted: GW
 5.9.0-rc5-02434-g7d8109ec5a42 #784
[  690.981964] Hardware name: Khadas VIM3 (DT)
[  690.986107] pstate: 6005 (nZCv daif -PAN -UAO BTYPE=--)
[  690.991627] pc : panfrost_mmu_irq_handler_thread+0x47c/0x650
[  690.997232] lr : panfrost_mmu_irq_handler_thread+0x47c/0x650
[  691.002836] sp : 800011bcbcd0
[  691.006114] x29: 800011bcbcf0 x28: f3fe3800
[  691.011375] x27: ceaf5350 x26: ca5fc500
[  691.016636] x25: f32409c0 x24: 0001
[  691.021897] x23: f3240880 x22: f3e3a800
[  691.027159] x21:  x20: 
[  691.032420] x19: 00010001 x18: 0020
[  691.037681] x17:  x16: 
[  691.042942] x15: f3fe3c70 x14: 
[  691.048204] x13: 8000116c2428 x12: 8000116c2086
[  691.053466] x11: 800011bcbcd0 x10: 800011bcbcd0
[  691.058727] x9 : fffe x8 : 
[  691.063988] x7 : 7420706165682074 x6 : 8000116c1816
[  691.069249] x5 :  x4 : 
[  691.074510] x3 :  x2 : 8000e348c000
[  691.079771] x1 : f1b91ff9af2df000 x0 : 
[  691.085033] Call trace:
[  691.087452]  panfrost_mmu_irq_handler_thread+0x47c/0x650
[  691.092712]  irq_thread_fn+0x2c/0xa0
[  691.096246]  irq_thread+0x184/0x208
[  691.099699]  kthread+0x140/0x160
[  691.102890]  ret_from_fork+0x10/0x34
[  691.106424] ---[ end trace b5dd8c2dfada8236 ]---
   


It's quite possible this is caused by the GPU seeing a stale page table
entry, so perhaps coherency isn't working as well as it should...

Do you get an "Unhandled Page fault" message after this?


Yep (see below).

--->8---


[...]

[  689.805864] panfrost ffe4.gpu: Unhandled Page fault in AS0 at VA 
0x03146080
[  689.805864] Reason: TODO
[  689.805864] raw fault status: 0x10003C3
[  689.805864] decoded fault status: SLAVE FAULT
[  689.805864] exception type 0xC3: TRANSLATION_FAULT_LEVEL3
[  689.805864] access type 0x3: WRITE
[  689.805864] source id 0x100
[  690.170419] panfrost ffe4.gpu: gpu sched timeout, js=1, config=0x7300, 
status=0x8, head=0x3101100, tail=0x3101100, sched_job=4b442768
[  690.770373] panfrost ffe4.gpu: error powering up gpu shader
[  690.945123] panfrost ffe4.gpu: error powering up gpu shader
[  690.945731] [ cut here ]



That's a write fault from level 3 of the page table triggered by shader
core 0 in a fragment job. So could be writing out the frame buffer.

It would be interesting to see if a patch like below would work round
it:

8<
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index e8f7b11352d2..5144860afdea 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -460,9 +460,12 @@ static int panfrost_mmu_map_fault_addr(struct 
panfrost_device *pfdev, int as,
 
 	bo = bomapping->obj;

if (!bo->is_heap) {
-   dev_WARN(pfdev->dev, "matc

Re: [PATCH 3/3] arm64: dts: meson: Describe G12b GPU as coherent

2020-10-05 Thread Steven Price

On 05/10/2020 09:15, Boris Brezillon wrote:

Hi Robin, Neil,

On Wed, 16 Sep 2020 10:26:43 +0200
Neil Armstrong  wrote:


Hi Robin,

On 16/09/2020 01:51, Robin Murphy wrote:

According to a downstream commit I found in the Khadas vendor kernel,
the GPU on G12b is wired up for ACE-lite, so (now that Panfrost knows
how to handle this properly) we should describe it as such. Otherwise
the mismatch leads to all manner of fun with mismatched attributes and
inadvertently snooping stale data from caches, which would account for
at least some of the brokenness observed on this platform.

Signed-off-by: Robin Murphy 
---
  arch/arm64/boot/dts/amlogic/meson-g12b.dtsi | 4 
  1 file changed, 4 insertions(+)

diff --git a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi 
b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
index 9b8548e5f6e5..ee8fcae9f9f0 100644
--- a/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-g12b.dtsi
@@ -135,3 +135,7 @@ map1 {
};
};
  };
+
+ {
+   dma-coherent;
+};
   


Thanks a lot for digging, I'll run a test to confirm it fixes the issue !


Sorry for the late reply. I triggered a dEQP run with this patch applied
and I see a bunch of "panfrost ffe4.gpu: matching BO is not heap type"
errors (see below for a full backtrace). That doesn't seem to happen when
we drop this dma-coherent property.

[  690.945731] [ cut here ]
[  690.950003] panfrost ffe4.gpu: matching BO is not heap type (GPU VA = 
319a000)
[  690.950051] WARNING: CPU: 0 PID: 120 at 
drivers/gpu/drm/panfrost/panfrost_mmu.c:465 
panfrost_mmu_irq_handler_thread+0x47c/0x650
[  690.968854] Modules linked in:
[  690.971878] CPU: 0 PID: 120 Comm: irq/27-panfrost Tainted: GW
 5.9.0-rc5-02434-g7d8109ec5a42 #784
[  690.981964] Hardware name: Khadas VIM3 (DT)
[  690.986107] pstate: 6005 (nZCv daif -PAN -UAO BTYPE=--)
[  690.991627] pc : panfrost_mmu_irq_handler_thread+0x47c/0x650
[  690.997232] lr : panfrost_mmu_irq_handler_thread+0x47c/0x650
[  691.002836] sp : 800011bcbcd0
[  691.006114] x29: 800011bcbcf0 x28: f3fe3800
[  691.011375] x27: ceaf5350 x26: ca5fc500
[  691.016636] x25: f32409c0 x24: 0001
[  691.021897] x23: f3240880 x22: f3e3a800
[  691.027159] x21:  x20: 
[  691.032420] x19: 00010001 x18: 0020
[  691.037681] x17:  x16: 
[  691.042942] x15: f3fe3c70 x14: 
[  691.048204] x13: 8000116c2428 x12: 8000116c2086
[  691.053466] x11: 800011bcbcd0 x10: 800011bcbcd0
[  691.058727] x9 : fffe x8 : 
[  691.063988] x7 : 7420706165682074 x6 : 8000116c1816
[  691.069249] x5 :  x4 : 
[  691.074510] x3 :  x2 : 8000e348c000
[  691.079771] x1 : f1b91ff9af2df000 x0 : 
[  691.085033] Call trace:
[  691.087452]  panfrost_mmu_irq_handler_thread+0x47c/0x650
[  691.092712]  irq_thread_fn+0x2c/0xa0
[  691.096246]  irq_thread+0x184/0x208
[  691.099699]  kthread+0x140/0x160
[  691.102890]  ret_from_fork+0x10/0x34
[  691.106424] ---[ end trace b5dd8c2dfada8236 ]---



It's quite possible this is caused by the GPU seeing a stale page table 
entry, so perhaps coherency isn't working as well as it should...


Do you get an "Unhandled Page fault" message after this? It might give 
some clues. Coherency issues are a pain to debug though and it's of 
course possible there are issues on this specific platform.


Steve
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/3] drm: panfrost: Coherency support

2020-09-17 Thread Steven Price

On 17/09/2020 11:51, Tomeu Vizoso wrote:

On 9/17/20 12:38 PM, Steven Price wrote:

On 16/09/2020 18:46, Rob Herring wrote:

On Wed, Sep 16, 2020 at 11:04 AM Alyssa Rosenzweig
 wrote:


So I get a performance regression with the dma-coherent approach, 
even if it's

clearly the cleaner.


That's bizarre -- this should really be the faster of the two.


Coherency may not be free. CortexA9 had something like 4x slower
memcpy if SMP was enabled as an example. I don't know if there's
anything going on like that specifically here. If there's never any
CPU accesses mixed in with kmscube, then there would be no benefit to
coherency.


The DDK blob has the ability to mark only certain areas of memory as 
coherent for performance reasons. For simple things like kmscube I 
would expect that it's basically write-only from the CPU and almost 
all memory the GPU touches isn't touched by the CPU. I.e. coherency 
isn't helping and the coherency traffic is probably expensive. Whether 
the complexity is worth it for "real" content I don't know - it may 
just be silly benchmarks that benefit.


Or maybe it's only a problem for applications that do silly things? I 
don't think kmscube was ever optimized for performance.


Well doing silly things is almost the definition of a benchmark ;) A lot 
of the mobile graphics benchmarks suffer from not being very intelligent 
in how they render (e.g. many have meshes that are far too detailed so 
the triangles are smaller than the pixels).


Of course there are also applications that get things wrong too.

Steve
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 0/3] drm: panfrost: Coherency support

2020-09-17 Thread Steven Price

On 16/09/2020 18:46, Rob Herring wrote:

On Wed, Sep 16, 2020 at 11:04 AM Alyssa Rosenzweig
 wrote:



So I get a performance regression with the dma-coherent approach, even if it's
clearly the cleaner.


That's bizarre -- this should really be the faster of the two.


Coherency may not be free. CortexA9 had something like 4x slower
memcpy if SMP was enabled as an example. I don't know if there's
anything going on like that specifically here. If there's never any
CPU accesses mixed in with kmscube, then there would be no benefit to
coherency.


The DDK blob has the ability to mark only certain areas of memory as 
coherent for performance reasons. For simple things like kmscube I would 
expect that it's basically write-only from the CPU and almost all memory 
the GPU touches isn't touched by the CPU. I.e. coherency isn't helping 
and the coherency traffic is probably expensive. Whether the complexity 
is worth it for "real" content I don't know - it may just be silly 
benchmarks that benefit.


Steve
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/3] drm/panfrost: Support cache-coherent integrations

2020-09-17 Thread Steven Price

On 16/09/2020 00:51, Robin Murphy wrote:

When the GPU's ACE-Lite interface is fully wired up and capable of
snooping CPU caches, it may be described as "dma-coherent" in
devicetree, which will already inform the DMA layer not to perform
unnecessary cache maintenance. However, we still need to ensure that
the GPU uses the appropriate cacheable outer-shareable attributes in
order to generate the requisite snoop signals, and that CPU mappings
don't create a mismatch by using a non-cacheable type either.

Signed-off-by: Robin Murphy 


LGTM

Reviewed-by: Steven Price 


---
  drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
  drivers/gpu/drm/panfrost/panfrost_drv.c| 2 ++
  drivers/gpu/drm/panfrost/panfrost_gem.c| 2 ++
  drivers/gpu/drm/panfrost/panfrost_mmu.c| 1 +
  4 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
b/drivers/gpu/drm/panfrost/panfrost_device.h
index c30c719a8059..b31f45315e96 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -84,6 +84,7 @@ struct panfrost_device {
/* pm_domains for devices with more than one. */
struct device *pm_domain_devs[MAX_PM_DOMAINS];
struct device_link *pm_domain_links[MAX_PM_DOMAINS];
+   bool coherent;
  
  	struct panfrost_features features;

const struct panfrost_compatible *comp;
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index ada51df9a7a3..2a6f2f716b2f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -588,6 +588,8 @@ static int panfrost_probe(struct platform_device *pdev)
if (!pfdev->comp)
return -ENODEV;
  
+	pfdev->coherent = device_get_dma_attr(>dev) == DEV_DMA_COHERENT;

+
/* Allocate and initialze the DRM device. */
ddev = drm_dev_alloc(_drm_driver, >dev);
if (IS_ERR(ddev))
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c 
b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 33355dd302f1..cdf1a8754eba 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -220,6 +220,7 @@ static const struct drm_gem_object_funcs panfrost_gem_funcs 
= {
   */
  struct drm_gem_object *panfrost_gem_create_object(struct drm_device *dev, 
size_t size)
  {
+   struct panfrost_device *pfdev = dev->dev_private;
struct panfrost_gem_object *obj;
  
  	obj = kzalloc(sizeof(*obj), GFP_KERNEL);

@@ -229,6 +230,7 @@ struct drm_gem_object *panfrost_gem_create_object(struct 
drm_device *dev, size_t
INIT_LIST_HEAD(>mappings.list);
mutex_init(>mappings.lock);
obj->base.base.funcs = _gem_funcs;
+   obj->base.map_cached = pfdev->coherent;
  
  	return >base.base;

  }
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index e8f7b11352d2..8852fd378f7a 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -371,6 +371,7 @@ int panfrost_mmu_pgtable_alloc(struct panfrost_file_priv 
*priv)
.pgsize_bitmap  = SZ_4K | SZ_2M,
.ias= FIELD_GET(0xff, pfdev->features.mmu_features),
.oas= FIELD_GET(0xff00, 
pfdev->features.mmu_features),
+   .coherent_walk  = pfdev->coherent,
.tlb= _tlb_ops,
.iommu_dev  = pfdev->dev,
};



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 10/25] drm: panfrost: fix common struct sg_table related issues

2020-05-11 Thread Steven Price

On 05/05/2020 09:45, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
numer of the created entries in the DMA address space. However the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of the entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. A common mistake was to ignore a result
of the dma_map_sg function and don't use the sg_table->orig_nents at all.

To avoid such issues, lets use common dma-mapping wrappers operating
directly on the struct sg_table objects and adjust references to the
nents and orig_nents respectively.

Signed-off-by: Marek Szyprowski 


The change looks good to me:

Reviewed-by: Steven Price 

Although I would have appreciated the commit message being modified to 
match the specifics of Panfrost - the return of dma_mpa_sg() wasn't 
being ignored, but the use of orig_nents/nents was indeed wrong.


Steve


---
For more information, see '[PATCH v3 00/25] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/5/187
---
  drivers/gpu/drm/panfrost/panfrost_gem.c | 4 ++--
  drivers/gpu/drm/panfrost/panfrost_mmu.c | 5 ++---
  2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c 
b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 17b654e..95d7e80 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -41,8 +41,8 @@ static void panfrost_gem_free_object(struct drm_gem_object 
*obj)
  
  		for (i = 0; i < n_sgt; i++) {

if (bo->sgts[i].sgl) {
-   dma_unmap_sg(pfdev->dev, bo->sgts[i].sgl,
-bo->sgts[i].nents, 
DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(pfdev->dev, >sgts[i],
+ DMA_BIDIRECTIONAL);
sg_free_table(>sgts[i]);
}
}
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index ed28aeb..9926111 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -517,10 +517,9 @@ static int panfrost_mmu_map_fault_addr(struct 
panfrost_device *pfdev, int as,
if (ret)
goto err_pages;
  
-	if (!dma_map_sg(pfdev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL)) {

-   ret = -EINVAL;
+   ret = dma_map_sgtable(pfdev->dev, sgt, DMA_BIDIRECTIONAL);
+   if (ret)
goto err_map;
-   }
  
  	mmu_map_sg(pfdev, bomapping->mmu, addr,

   IOMMU_WRITE | IOMMU_READ | IOMMU_NOEXEC, sgt);



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 09/21] drm: panfrost: fix sg_table nents vs. orig_nents misuse

2020-05-04 Thread Steven Price

On 04/05/2020 13:53, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
numer of the created entries in the DMA address space. However the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.


I find this commit message a bit confusing, but AFAICT the problem with 
the Panfrost code is really in mmu_map_sg() where we don't have the 
return value from dma_map_sg() and the for_each_sg() loop could (in 
theory) run off the end of the list.


The fix seems correct - store the return where it's meant to be (nents) 
and make sure when unmapping we use the original (orig_nents). So you 
might also consider adding:


Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")

Even better would be the wrappers you mention in the cover letter! ;)

Reviewed-by: Steven Price 



Signed-off-by: Marek Szyprowski 
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
  drivers/gpu/drm/panfrost/panfrost_gem.c | 3 ++-
  drivers/gpu/drm/panfrost/panfrost_mmu.c | 4 +++-
  2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c 
b/drivers/gpu/drm/panfrost/panfrost_gem.c
index 17b654e..22fec7c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -42,7 +42,8 @@ static void panfrost_gem_free_object(struct drm_gem_object 
*obj)
for (i = 0; i < n_sgt; i++) {
if (bo->sgts[i].sgl) {
dma_unmap_sg(pfdev->dev, bo->sgts[i].sgl,
-bo->sgts[i].nents, 
DMA_BIDIRECTIONAL);
+bo->sgts[i].orig_nents,
+DMA_BIDIRECTIONAL);
sg_free_table(>sgts[i]);
}
}
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index ed28aeb..2d9b1f9 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -517,7 +517,9 @@ static int panfrost_mmu_map_fault_addr(struct 
panfrost_device *pfdev, int as,
if (ret)
goto err_pages;
  
-	if (!dma_map_sg(pfdev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL)) {

+   sgt->nents = dma_map_sg(pfdev->dev, sgt->sgl, sgt->orig_nents,
+   DMA_BIDIRECTIONAL);
+   if (!sgt->nents) {
ret = -EINVAL;
goto err_map;
}



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 08/10] iommu/io-pgtable-arm: Rationalise TTBRn handling

2019-10-28 Thread Steven Price
On 25/10/2019 19:08, Robin Murphy wrote:
> TTBR1 values have so far been redundant since no users implement any
> support for split address spaces. Crucially, though, one of the main
> reasons for wanting to do so is to be able to manage each half entirely
> independently, e.g. context-switching one set of mappings without
> disturbing the other. Thus it seems unlikely that tying two tables
> together in a single io_pgtable_cfg would ever be particularly desirable
> or useful.
> 
> Streamline the configs to just a single conceptual TTBR value
> representing the allocated table. This paves the way for future users to
> support split address spaces by simply allocating a table and dealing
> with the detailed TTBRn logistics themselves.
> 
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/arm-smmu-v3.c|  2 +-
>  drivers/iommu/arm-smmu.c   |  9 -
>  drivers/iommu/io-pgtable-arm-v7s.c | 16 +++-
>  drivers/iommu/io-pgtable-arm.c |  5 ++---
>  drivers/iommu/ipmmu-vmsa.c |  2 +-
>  drivers/iommu/msm_iommu.c  |  4 ++--
>  drivers/iommu/mtk_iommu.c  |  4 ++--
>  drivers/iommu/qcom_iommu.c |  3 +--
>  include/linux/io-pgtable.h |  4 ++--
>  9 files changed, 22 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 3f20e548f1ec..da31e607698f 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2170,7 +2170,7 @@ static int arm_smmu_domain_finalise_s1(struct 
> arm_smmu_domain *smmu_domain,
>   }
>  
>   cfg->cd.asid= (u16)asid;
> - cfg->cd.ttbr= pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
> + cfg->cd.ttbr= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
>   cfg->cd.tcr = pgtbl_cfg->arm_lpae_s1_cfg.tcr;
>   cfg->cd.mair= pgtbl_cfg->arm_lpae_s1_cfg.mair;
>   return 0;
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 2bc3e93b11e6..a249e4e49ead 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -534,13 +534,12 @@ static void arm_smmu_init_context_bank(struct 
> arm_smmu_domain *smmu_domain,
>   /* TTBRs */
>   if (stage1) {
>   if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
> - cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr[0];
> - cb->ttbr[1] = pgtbl_cfg->arm_v7s_cfg.ttbr[1];
> + cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr;
> + cb->ttbr[1] = 0;
>   } else {
> - cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
> + cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
>   cb->ttbr[0] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
> - cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1];
> - cb->ttbr[1] |= FIELD_PREP(TTBRn_ASID, cfg->asid);
> + cb->ttbr[1] = FIELD_PREP(TTBRn_ASID, cfg->asid);
>   }
>   } else {
>   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
> b/drivers/iommu/io-pgtable-arm-v7s.c
> index 7c3bd2c3cdca..4d2c1e7f67c4 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -822,15 +822,13 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
> io_pgtable_cfg *cfg,
>   /* Ensure the empty pgd is visible before any actual TTBR write */
>   wmb();
>  
> - /* TTBRs */
> - cfg->arm_v7s_cfg.ttbr[0] = virt_to_phys(data->pgd) |
> -ARM_V7S_TTBR_S | ARM_V7S_TTBR_NOS |
> -(cfg->coherent_walk ?
> -(ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) |
> - ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_WBWA)) :
> -(ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_NC) |
> - ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_NC)));
> - cfg->arm_v7s_cfg.ttbr[1] = 0;
> + /* TTBR */
> + cfg->arm_v7s_cfg.ttbr = virt_to_phys(data->pgd) | ARM_V7S_TTBR_S |
> + (cfg->coherent_walk ? (ARM_V7S_TTBR_NOS |
> +   ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) |
> +   ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_WBWA)) :
> +  (ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_NC) |
> +   ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_NC)));

ARM_V7S_TTBR_NOS seems to have sneaked into the cfg->coherent_walk
condition here - which you haven't mentioned in the commit log, so it
doesn't look like it should be in this commit.

Steve

>   return >iop;
>  
>  out_free_data:
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 1795df8f7a51..bc0841040ebe 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -872,9 +872,8 @@ 

Re: [PATCH 3/3] iommu/io-pgtable-arm: Allow coherent walks for Mali

2019-09-12 Thread Steven Price
On 11/09/2019 15:42, Robin Murphy wrote:
> Midgard GPUs have ACE-Lite master interfaces which allows systems to
> integrate them in an I/O-coherent manner. It seems that from the GPU's
> viewpoint, the rest of the system is its outer shareable domain, and it
> will only emit snoop signals for outer shareable accesses. As such,
> setting the TTBR_SHARE_OUTER bit does indeed get coherent pagetable
> walks working nicely.
> 
> Making data accesses coherent seems to be more of a challenge...
> 
> Signed-off-by: Robin Murphy 

Reviewed-by: Steven Price 

Note the terminology in the GPU is *very* confusing here. Midgard refers
to the system's inner shareable domain as "outer shareable", and uses
"inner shareable" to mean purely within the GPU.

For data access kbase sets up a different default MEMATTR if ACE is
available:

/* Set to implementation defined, outer caching */
#define AS_MEMATTR_LPAE_OUTER_IMPL_DEF0x88ull
[...]
#define AS_MEMATTR_INDEX_DEFAULT_ACE   3
[...]
/* Outer coherent, inner implementation defined policy */
#define AS_MEMATTR_INDEX_OUTER_IMPL_DEF3

Steve

> ---
>  drivers/iommu/io-pgtable-arm.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 77f41c9dd9be..2794d4661339 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -1061,6 +1061,9 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, 
> void *cookie)
>   cfg->arm_mali_lpae_cfg.transtab = virt_to_phys(data->pgd) |
> ARM_MALI_LPAE_TTBR_READ_INNER |
> ARM_MALI_LPAE_TTBR_ADRMODE_TABLE;
> + if (cfg->coherent_walk)
> + cfg->arm_mali_lpae_cfg.transtab |= 
> ARM_MALI_LPAE_TTBR_SHARE_OUTER;
> +
>   return >iop;
>  
>  out_free_data:
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/3] iommu/io-pgtable-arm: Support more Mali configurations

2019-09-12 Thread Steven Price
On 11/09/2019 15:42, Robin Murphy wrote:
> In principle, Midgard GPUs supporting smaller VA sizes should only
> require 3-level pagetables, since the address bits resolved at level 0
> (47:40) will never change. However, the kbase driver does not appear to
> have any notion of a variable start level, and empirically T720 and T820
> rapidly blow up with translation faults unless given a full 4-level
> table, despite only supporting a 33-bit VA size.

Midgard 'LPAE' isn't really LPAE and does indeed always require all
levels of page tables. The 33-bit VA size is really only limiting the
storage of virtual addresses in the GPU and not affecting the MMU.

> The 'real' IAS value is still valuable in terms of validating addresses
> on map/unmap, so tweak the allocator to allow smaller values while still
> forcing the resultant tables to the full 4 levels.
> 
> Signed-off-by: Robin Murphy 

Reviewed-by: Steven Price 

Steve

> ---
>  drivers/iommu/io-pgtable-arm.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 9e35cd991f06..77f41c9dd9be 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -1022,7 +1022,7 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, 
> void *cookie)
>   if (cfg->quirks)
>   return NULL;
>  
> - if (cfg->ias != 48 || cfg->oas > 40)
> + if (cfg->ias > 48 || cfg->oas > 40)
>   return NULL;
>  
>   cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G);
> @@ -1031,6 +1031,11 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg 
> *cfg, void *cookie)
>   if (!data)
>   return NULL;
>  
> + /* Mali seems to need a full 4-level table regardless of IAS */
> + if (data->levels < ARM_LPAE_MAX_LEVELS) {
> + data->levels = ARM_LPAE_MAX_LEVELS;
> + data->pgd_size = sizeof(arm_lpae_iopte);
> + }
>   /*
>* MEMATTR: Mali has no actual notion of a non-cacheable type, so the
>* best we can do is mimic the out-of-tree driver and hope that the
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/3] iommu/io-pgtable-arm: Correct Mali attributes

2019-09-12 Thread Steven Price
On 11/09/2019 15:42, Robin Murphy wrote:
> Whilst Midgard's MEMATTR follows a similar principle to the VMSA MAIR,
> the actual attribute values differ, so although it currently appears to
> work to some degree, we probably shouldn't be using our standard stage 1
> MAIR for that. Instead, generate a reasonable MEMATTR with attribute
> values borrowed from the kbase driver; at this point we'll be overriding
> or ignoring pretty much all of the LPAE config, so just implement these
> Mali details in a dedicated allocator instead of pretending to subclass
> the standard VMSA format.
> 
> Signed-off-by: Robin Murphy 

The Midgard MMU "uses concepts" from LPAE but really isn't LPAE, so this
seems like a good tidy up.

Reviewed-by: Steven Price 

Steve

> ---
>  drivers/iommu/io-pgtable-arm.c | 53 +-
>  1 file changed, 40 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 161a7d56264d..9e35cd991f06 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -167,6 +167,9 @@
>  #define ARM_MALI_LPAE_TTBR_READ_INNERBIT(2)
>  #define ARM_MALI_LPAE_TTBR_SHARE_OUTER   BIT(4)
>  
> +#define ARM_MALI_LPAE_MEMATTR_IMP_DEF0x88ULL
> +#define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL
> +
>  /* IOPTE accessors */
>  #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
>  
> @@ -1013,27 +1016,51 @@ arm_32_lpae_alloc_pgtable_s2(struct io_pgtable_cfg 
> *cfg, void *cookie)
>  static struct io_pgtable *
>  arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
>  {
> - struct io_pgtable *iop;
> + struct arm_lpae_io_pgtable *data;
> +
> + /* No quirks for Mali (hopefully) */
> + if (cfg->quirks)
> + return NULL;
>  
>   if (cfg->ias != 48 || cfg->oas > 40)
>   return NULL;
>  
>   cfg->pgsize_bitmap &= (SZ_4K | SZ_2M | SZ_1G);
> - iop = arm_64_lpae_alloc_pgtable_s1(cfg, cookie);
> - if (iop) {
> - u64 mair, ttbr;
>  
> - /* Copy values as union fields overlap */
> - mair = cfg->arm_lpae_s1_cfg.mair[0];
> - ttbr = cfg->arm_lpae_s1_cfg.ttbr[0];
> + data = arm_lpae_alloc_pgtable(cfg);
> + if (!data)
> + return NULL;
>  
> - cfg->arm_mali_lpae_cfg.memattr = mair;
> - cfg->arm_mali_lpae_cfg.transtab = ttbr |
> - ARM_MALI_LPAE_TTBR_READ_INNER |
> - ARM_MALI_LPAE_TTBR_ADRMODE_TABLE;
> - }
> + /*
> +  * MEMATTR: Mali has no actual notion of a non-cacheable type, so the
> +  * best we can do is mimic the out-of-tree driver and hope that the
> +  * "implementation-defined caching policy" is good enough. Similarly,
> +  * we'll use it for the sake of a valid attribute for our 'device'
> +  * index, although callers should never request that in practice.
> +  */
> + cfg->arm_mali_lpae_cfg.memattr =
> + (ARM_MALI_LPAE_MEMATTR_IMP_DEF
> +  << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_NC)) |
> + (ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC
> +  << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_CACHE)) |
> + (ARM_MALI_LPAE_MEMATTR_IMP_DEF
> +  << ARM_LPAE_MAIR_ATTR_SHIFT(ARM_LPAE_MAIR_ATTR_IDX_DEV));
>  
> - return iop;
> + data->pgd = __arm_lpae_alloc_pages(data->pgd_size, GFP_KERNEL, cfg);
> + if (!data->pgd)
> + goto out_free_data;
> +
> + /* Ensure the empty pgd is visible before TRANSTAB can be written */
> + wmb();
> +
> + cfg->arm_mali_lpae_cfg.transtab = virt_to_phys(data->pgd) |
> +   ARM_MALI_LPAE_TTBR_READ_INNER |
> +   ARM_MALI_LPAE_TTBR_ADRMODE_TABLE;
> + return >iop;
> +
> +out_free_data:
> + kfree(data);
> + return NULL;
>  }
>  
>  struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s1_init_fns = {
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver

2019-04-15 Thread Steven Price
On 15/04/2019 10:18, Daniel Vetter wrote:
> On Fri, Apr 05, 2019 at 05:42:33PM +0100, Steven Price wrote:
>> On 05/04/2019 17:16, Alyssa Rosenzweig wrote:
>>> acronym once ever and have it as a "??"), I'm not sure how to respond to
>>> that... We don't know how to allocate memory for the GPU-internal data
>>> structures (the tiler heap, for instance, but also a few others I've
>>> just named "misc_0" and "scratchpad" -- guessing one of those is for
>>> "TLS"). With kbase, I took the worst-case strategy of allocating
>>> gigantic chunks on startup with tiny commit counts and GROW_ON_GPF set.
>>> With the new driver, well, our memory consumption is scary since
>>> implementing GROW_ON_GPF in an upstream-friendly way is a bit more work
>>> and isn't expected to hit the 5.2 window.
>>
>> Yes GROW_ON_GPF is pretty much required for the tiler heap - it's not
>> (reasonably) possible to determine how big it should be. The Arm user
>> space driver does the same approach (tiny commit count, but allow it to
>> grow).
> 
> Jumping in here with a drive through comment ...
> 
> Growing gem bo and dma-buf is going to be endless amounts of fun, since we
> hard-coded that their size is invariant.
> 
> I think the only reasonable way to implement this is if you allocate a
> really huge bo, map it, but only put the pages in on faulting. Or when
> really evil userspace tries to export it. Actually changing the underlying
> buffer size is not going to work I think.

Yes the idea is that you allocate a large amount of virtual address
space, but only put a few physical pages in. If the GPU needs more you
fault them in as necessary. The "buffer size" (i.e. virtual address
region) never changes size.

> Note: I didn't read kbase, so might be totally wrong in how GROW_ON_GPF
> works.

For kbase we simply don't support exporting this type of memory, and are
fairly restrictive about mapping it into user space (user space
shouldn't normally need to read it).

Since Panfrost is using GEM BO it will have to deal with malicious user
space. But it would be sufficient to simply fully back the region in
that case.

Recent version of kbase also support what is termed JIT (Just In Time
memory allocation). Basically this involves the kernel driver
allocating/freeing memory regions just before the job is loaded onto the
GPU. These regions might also be GROW_ON_GPF. The intention is that when
there isn't memory pressure these regions can be kept between jobs, but
under memory pressure they can be discarded and recreated if they turn
out to be needed again.

Given the differences between the Panfrost and the proprietary user
space I'm not sure exactly what form this will need to be for Panfrost,
but Vulkan makes memory management "more interesting"! Allocating
upfront for the worst case can become prohibitively expensive.

Steve
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver

2019-04-10 Thread Steven Price
On 09/04/2019 17:15, Rob Herring wrote:
> On Tue, Apr 9, 2019 at 10:56 AM Tomeu Vizoso  
> wrote:
>>
>> On Mon, 8 Apr 2019 at 23:04, Rob Herring  wrote:
>>>
>>> On Fri, Apr 5, 2019 at 7:30 AM Steven Price  wrote:
>>>>
>>>> On 01/04/2019 08:47, Rob Herring wrote:
>>>>> This adds the initial driver for panfrost which supports Arm Mali
>>>>> Midgard and Bifrost family of GPUs. Currently, only the T860 and
>>>>> T760 Midgard GPUs have been tested.
>>>
>>> [...]
>>>>> +
>>>>> + if (status & JOB_INT_MASK_ERR(j)) {
>>>>> + job_write(pfdev, JS_COMMAND_NEXT(j), 
>>>>> JS_COMMAND_NOP);
>>>>> + job_write(pfdev, JS_COMMAND(j), 
>>>>> JS_COMMAND_HARD_STOP_0);
>>>>
>>>> Hard-stopping an already completed job isn't likely to do very much :)
>>>> Also you are using the "_0" version which is only valid when "job chain
>>>> disambiguation" is present.
>>
>> Yeah, guess that can be removed.
> 
> Steven, TBC, are you suggesting removing both lines or leaving
> JS_COMMAND_NOP? I don't think we can ever have a pending job at this
> point as there's no queuing. So the NOP probably isn't needed, but
> doesn't hurt to have it either.

Both lines are redundant and can be removed. But equally neither will
cause any problems.

Writing NOP into the next register is basically only needed if you know
there's a job there which you no longer want to execute.

kbase does this in certain situations. The main one is on a GPU without
command chain disambiguation if you want to kill a particular job
there's a potential race. For example:

 * Submit job A, followed by job B to slot 0. Job A is currently
executing, job B is waiting in the _NEXT registers.

 * Kernel decides it wants to kill job A (it's taking too long, or the
application has quit).

 * Simply executing a JS_COMMAND_HARD_STOP is racy. If job A finishes
just before doing the register write, then it's actually job B that gets
killed (and it's not always safe to just re-execute a killed job).

 * Instead write NOP to JS_COMMAND_NEXT, then check (again) whether the
job currently running is the one you want. When you then HARD_STOP you
either hit the correct job, or 'miss' and do nothing.

Job chain disambiguation solves this problem by allowing the kernel to
tag each job with a flag, the hard-stop can then be targetted at the job
with the correct flag. Writing NOP into JS_COMMAND_NEXT is also useful
if in the above situation you want to kill job B. In that case you can't
hard-stop it (it hasn't start), so you simply want to remove it from the
_NEXT registers.

>>>> I suspect in this case you should also be signalling the fence? At the
>>>> moment you rely on the GPU timeout recovering from the fault.
>>>
>>> I'll defer to Tomeu who wrote this (IIRC).
>>
>> Yes, that would be an improvement.
> 
> Actually, I think that would break recovery because the job timeout
> will bail out if the done fence is signaled already. Perhaps we want
> to timeout immediately if that is possible with the scheduler.

Ideally you would signal the fence with an error code (which is
presumably what recovery does). There's no actual need to trigger a
timeout. I'm not sure quite how the DRM infrastructure handles this though.

Steve
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver

2019-04-10 Thread Steven Price
On 08/04/2019 22:04, Rob Herring wrote:
> On Fri, Apr 5, 2019 at 7:30 AM Steven Price  wrote:
>>
>> On 01/04/2019 08:47, Rob Herring wrote:
>>> This adds the initial driver for panfrost which supports Arm Mali
>>> Midgard and Bifrost family of GPUs. Currently, only the T860 and
>>> T760 Midgard GPUs have been tested.
> 
> [...]
> 
>>> + case 0xD3: return "TRANSTAB_BUS_FAULT_LEVEL3";
>>> + case 0xD4: return "TRANSTAB_BUS_FAULT_LEVEL4";
>>> + case 0xD8: return "ACCESS_FLAG";
>>> + }
>>> +
>>> + if (panfrost_has_hw_feature(pfdev, HW_FEATURE_AARCH64_MMU)) {
>>
>> There's not a great deal of point in this conditional - you won't get
>> the below exception codes from hardware which doesn't support it AARCH64
>> page tables.
> 
> I wasn't sure if "UNKNOWN" types could happen or not.
> 
>>
>>> + switch (exception_code) {
>>> + case 0xC9 ... 0xCF: return "PERMISSION_FAULT";
>>> + case 0xD9 ... 0xDF: return "ACCESS_FLAG";
>>> + case 0xE0 ... 0xE7: return "ADDRESS_SIZE_FAULT";
>>> + case 0xE8 ... 0xEF: return "MEMORY_ATTRIBUTES_FAULT";
>>> + }
>>> + }
>>> +
>>> + return "UNKNOWN";
>>> +}
> 
>>> +static inline int panfrost_model_cmp(struct panfrost_device *pfdev, s32 id)
>>> +{
>>> + s32 match_id = pfdev->features.id;
>>> +
>>> + if (match_id & 0xf000)
>>> + match_id &= 0xf00f;
>>
>> I'm puzzled by this, and really not sure if it's going to work out
>> ignoring the ARCH_MINOR/ARCH_REV fields. But for now there's no real
>> Bifrost support.
> 
> That seemed to be enough for kbase to select features/issues.

I can't deny that it seems to work for now, and indeed looking more
closely at kbase that does seem to be the effect of the current code.

>>> + switch (param->param) {
>>> + case DRM_PANFROST_PARAM_GPU_ID:
>>> + param->value = pfdev->features.id;
>>
>> This is unfortunate naming - this parameter is *not* the GPU_ID. It is
>> the top half of the GPU_ID which kbase calls the PRODUCT_ID.
> 
> I can rename it.

It would help prevent confusion, thanks!

>> I'm also somewhat surprised that you don't need loads of other
>> properties from the GPU - in particular knowing the number of shader
>> cores is useful for allocating the right amount of memory for TLS (and
>> can't be obtained purely from the GPU_ID).
> 
> We'll add them as userspace needs them.

Fair enough. I'm not sure how much you want to provide "forward
compatibility" by providing them early - the basic definitions are
already in kbase. I found it a bit surprising that some feature
registers are printed on probe, but not available to be queried.

>>> +static int
>>> +panfrost_lookup_bos(struct drm_device *dev,
>>> +   struct drm_file *file_priv,
>>> +   struct drm_panfrost_submit *args,
>>> +   struct panfrost_job *job)
>>> +{
>>> + u32 *handles;
>>> + int ret = 0;
>>> +
>>> + job->bo_count = args->bo_handle_count;
>>> +
>>> + if (!job->bo_count)
>>> + return 0;
>>> +
>>> + job->bos = kvmalloc_array(job->bo_count,
>>> +   sizeof(struct drm_gem_object *),
>>> +   GFP_KERNEL | __GFP_ZERO);
>>> + if (!job->bos)
>>
>> If we return here then job->bo_count has been set, but job->bos is
>> invalid - this means that panfrost_job_cleanup() will then dereference
>> NULL. Although maybe this is better fixed in panfrost_job_cleanup().
> 
> Good catch. The fence arrays have the same problem. As does V3D which we 
> copied.
> 
>>> + return -ENOMEM;
>>> +
>>> + job->implicit_fences = kvmalloc_array(job->bo_count,
>>> +   sizeof(struct dma_fence *),
>>> +   GFP_KERNEL | __GFP_ZERO);
>>> + if (!job->implicit_fences)
>>> + return -ENOMEM;
> 
>>> +static irqreturn_t panfrost_gpu_irq_handler(int irq, void *data)
>>> +{
>>> + struct panfrost_device *pfdev = data;
>>> + u32 state = gpu_read(pfdev, GPU_INT_STAT);
>>> + u32 fault_status = gpu_read(pfdev, GPU_FAULT_STATUS);

Re: [PATCH v3 1/3] iommu: io-pgtable: Add ARM Mali midgard MMU page table format

2019-04-08 Thread Steven Price
On 05/04/2019 11:36, Steven Price wrote:
> On 05/04/2019 10:51, Robin Murphy wrote:
>> Hi Steve,
>>
>> On 05/04/2019 10:42, Steven Price wrote:
>>> First let me say congratulations to everyone working on Panfrost - it's
>>> an impressive achievement!
>>>
>>> Full disclosure: I used to work on the Mali kbase driver. And have been
>>> playing around with running the Mali user-space blob with the Panfrost
>>> kernel driver.
>>>
>>> On 01/04/2019 08:47, Rob Herring wrote:
>>>> ARM Mali midgard GPU is similar to standard 64-bit stage 1 page
>>>> tables, but
>>>> have a few differences. Add a new format type to represent the
>>>> format. The
>>>> input address size is 48-bits and the output address size is 40-bits
>>>> (and
>>>> possibly less?). Note that the later bifrost GPUs follow the standard
>>>> 64-bit stage 1 format.
>>>>
>>>> The differences in the format compared to 64-bit stage 1 format are:
>>>>
>>>> The 3rd level page entry bits are 0x1 instead of 0x3 for page entries.
>>>>
>>>> The access flags are not read-only and unprivileged, but read and write.
>>>> This is similar to stage 2 entries, but the memory attributes field
>>>> matches
>>>> stage 1 being an index.
>>>>
>>>> The nG bit is not set by the vendor driver. This one didn't seem to
>>>> matter,
>>>> but we'll keep it aligned to the vendor driver.
>>>
>>> The nG bit should be ignored by the hardware.
>>>
>>> The MMU in Midgard/Bifrost has a quirk similar to
>>> IO_PGTABLE_QUIRK_TLBI_ON_MAP - you must perform a cache flush for the
>>> GPU to (reliably) pick up new page table mappings.
>>>
>>> You may not have seen this because of the use of the JS_CONFIG_START_MMU
>>> bit - this effectively performs a cache flush and TLB invalidate before
>>> starting a job, however when using a GPU like T760 (e.g. on the Firefly
>>> RK3288) this bit isn't being set. In my testing on the Firefly board I
>>> saw GPU page faults because of this.
>>>
>>> There's two options for fixing this - a patch like below adds the quirk
>>> mode to the MMU. Or alternatively always set JS_CONFIG_START_MMU on
>>> jobs. In my testing both options solve the page faults.
>>>
>>> To be honest I don't know the reasoning behind kbase making the
>>> JS_CONFIG_START_MMU bit conditional - I'm not aware of any reason why it
>>> can't always be set. My guess is performance, but I haven't benchmarked
>>> the difference between this and JS_CONFIG_START_MMU.
>>>
>>> -8<--
>>>  From e3f75c7f04e43238dfc579029b8c11fb6b4a0c18 Mon Sep 17 00:00:00 2001
>>> From: Steven Price 
>>> Date: Thu, 4 Apr 2019 15:53:17 +0100
>>> Subject: [PATCH] iommu: io-pgtable: IO_PGTABLE_QUIRK_TLBI_ON_MAP for LPAE
>>>
>>> Midgard/Bifrost GPUs require a TLB invalidation when mapping pages,
>>> implement IO_PGTABLE_QUIRK_TLBI_ON_MAP for LPAE iommu page table
>>> formats and add the quirk bit to Panfrost.
>>>
>>> Signed-off-by: Steven Price 
>>> ---
>>>   drivers/gpu/drm/panfrost/panfrost_mmu.c |  1 +
>>>   drivers/iommu/io-pgtable-arm.c  | 11 +--
>>>   2 files changed, 10 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c
>>> b/drivers/gpu/drm/panfrost/panfrost_mmu.c
>>> index f3aad8591cf4..094312074d66 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
>>> @@ -343,6 +343,7 @@ int panfrost_mmu_init(struct panfrost_device *pfdev)
>>>   mmu_write(pfdev, MMU_INT_MASK, ~0);
>>>
>>>   pfdev->mmu->pgtbl_cfg = (struct io_pgtable_cfg) {
>>> +    .quirks    = IO_PGTABLE_QUIRK_TLBI_ON_MAP,
>>>   .pgsize_bitmap    = SZ_4K, // | SZ_2M | SZ_1G),
>>>   .ias    = 48,
>>>   .oas    = 40,    /* Should come from dma mask? */
>>> diff --git a/drivers/iommu/io-pgtable-arm.c
>>> b/drivers/iommu/io-pgtable-arm.c
>>> index 84beea1f47a7..45fd7bbdf9aa 100644
>>> --- a/drivers/iommu/io-pgtable-arm.c
>>> +++ b/drivers/iommu/io-pgtable-arm.c
>>> @@ -505,7 +505,13 @@ static int arm_lpae_map(struct io_pgtable_ops *ops,
>>> unsigned long iova,
>>>    * Synchronise all PTE updates for the new mapping befo

Re: [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver

2019-04-05 Thread Steven Price
On 05/04/2019 17:16, Alyssa Rosenzweig wrote:
>> I'm also somewhat surprised that you don't need loads of other
>> properties from the GPU - in particular knowing the number of shader
>> cores is useful for allocating the right amount of memory for TLS (and
>> can't be obtained purely from the GPU_ID).
> 
> Since I have no idea what TLS is (and in my notes, I've come across the

Sorry - "Thread Local Storage" - e.g. registers spilled to memory from a
shader program.

> acronym once ever and have it as a "??"), I'm not sure how to respond to
> that... We don't know how to allocate memory for the GPU-internal data
> structures (the tiler heap, for instance, but also a few others I've
> just named "misc_0" and "scratchpad" -- guessing one of those is for
> "TLS"). With kbase, I took the worst-case strategy of allocating
> gigantic chunks on startup with tiny commit counts and GROW_ON_GPF set.
> With the new driver, well, our memory consumption is scary since
> implementing GROW_ON_GPF in an upstream-friendly way is a bit more work
> and isn't expected to hit the 5.2 window.

Yes GROW_ON_GPF is pretty much required for the tiler heap - it's not
(reasonably) possible to determine how big it should be. The Arm user
space driver does the same approach (tiny commit count, but allow it to
grow).

> 
> Given this is kernel facing, I'm hoping 're able to share some answers
> here?

At the moment I don't have any permission to share details which aren't
already public in the kbase driver. Hopefully that situation will
change. I'm also very much not an expert on anything but the kernel
driver (I tried to stay away from shader compilers and all that graphics
knowledge...). The details of the job descriptors is only really
publicly documented in terms of the "replay workaround" which is quite
limited.

> 
>> I think this comment might have survived since the very earliest version
>> of the Midgard driver! :)
> 
> ^_^
> 
>> But I'm not sure anything will attempt to lock a region spanning two
>> pages like that.
> 
> At least at the moment, I align everything kernel-facing to page
> granularity in userspace, so it's not a cornercase I'm going to hit
> anytime soon. Still probably better to have it technically correct.
> 
>> To be fair only for BASE_HW_ISSUE_6367/T60X - but yes it's not a
>> pleasant workaround. There's no way on that hardware to reliably drain
>> the write buffer other than waiting.
> 
> *wishing T60X disappeared intensifies* ;)

I think we all felt like that :) Still the Nexus 10 wasn't a bad tablet,
and the Chromebook was an exciting first!

> Granted there are enough other errata specific to it that aren't worked
> around here that, well, it makes you wonder ;)

A lot of the errata are things you only hit with soak testing. So to a
large extent you "get lucky".

>> Do we have a good way of user space determining which requirements are
>> supported by the driver? At the moment it's just the one. kbase outgeew
>> the original u16 and has an ugly "compat_core_req", so I suspect you're
>> going to need to add several more along the way.
> 
> Oh, so that's why compat_/core_req is split off! I thought somebody just
> thought it would be "fun" to break the UABI ;)

No that's a case of us actually not breaking the UABI for once :)

> I've definitely issues using the wrong core_req field for the kbase I
> had setup, that set me back a little bit on RK3399/T860 bringup *purses
> lips*
> 
> To be fair, bunches of the kbase reqs are for soft jobs, which I don't
> feel are a good fit for how the Panfrost kernel works. If we need to
> implement functionality corresponding to softjobs, that would likely be
> done with dedicated ioctl(s), instead of affecting the core_req field.
> 
> On that note
> 
>> You might also want to consider being able to submit more than one job
>> chain at a time - but that could easily be implemented using a new ioctl
>> in the future.
> 
> The issue with that at the bottom is having to introduce something akin
> to kbase's annoyingly intra-job-chain dependency management (read: I
> still don't understand how FBOs are supposed to work with kbase ;) ),
> which AFAIK we push off to userspace right now via standard fencing. If
> we want to submit batches at a time, we would potentially need to
> express those somewhat complex dependency trees, which is a lot of work
> for diminishing returns at this stage. Future ioctl indeed...

You should be able to express the dependencies using fences. At the time
kbase was started there was no fence mechanism in the kernel. We
invented horrible things like UMP[1] and KDS[2] for cross-driver sharing.

It all comes down to how small your job chains are - if you don't need
to squeeze too many through the hardware you should be fine. But there's
going to be some performance gain to be had implementing it.

[1] I forget what it actually stands for, but was an attempt to do
something like dma_buf

[2] "Kernel Dependency System" - a not-so-good version of 

Re: [PATCH v3 1/3] iommu: io-pgtable: Add ARM Mali midgard MMU page table format

2019-04-05 Thread Steven Price
On 05/04/2019 10:51, Robin Murphy wrote:
> Hi Steve,
> 
> On 05/04/2019 10:42, Steven Price wrote:
>> First let me say congratulations to everyone working on Panfrost - it's
>> an impressive achievement!
>>
>> Full disclosure: I used to work on the Mali kbase driver. And have been
>> playing around with running the Mali user-space blob with the Panfrost
>> kernel driver.
>>
>> On 01/04/2019 08:47, Rob Herring wrote:
>>> ARM Mali midgard GPU is similar to standard 64-bit stage 1 page
>>> tables, but
>>> have a few differences. Add a new format type to represent the
>>> format. The
>>> input address size is 48-bits and the output address size is 40-bits
>>> (and
>>> possibly less?). Note that the later bifrost GPUs follow the standard
>>> 64-bit stage 1 format.
>>>
>>> The differences in the format compared to 64-bit stage 1 format are:
>>>
>>> The 3rd level page entry bits are 0x1 instead of 0x3 for page entries.
>>>
>>> The access flags are not read-only and unprivileged, but read and write.
>>> This is similar to stage 2 entries, but the memory attributes field
>>> matches
>>> stage 1 being an index.
>>>
>>> The nG bit is not set by the vendor driver. This one didn't seem to
>>> matter,
>>> but we'll keep it aligned to the vendor driver.
>>
>> The nG bit should be ignored by the hardware.
>>
>> The MMU in Midgard/Bifrost has a quirk similar to
>> IO_PGTABLE_QUIRK_TLBI_ON_MAP - you must perform a cache flush for the
>> GPU to (reliably) pick up new page table mappings.
>>
>> You may not have seen this because of the use of the JS_CONFIG_START_MMU
>> bit - this effectively performs a cache flush and TLB invalidate before
>> starting a job, however when using a GPU like T760 (e.g. on the Firefly
>> RK3288) this bit isn't being set. In my testing on the Firefly board I
>> saw GPU page faults because of this.
>>
>> There's two options for fixing this - a patch like below adds the quirk
>> mode to the MMU. Or alternatively always set JS_CONFIG_START_MMU on
>> jobs. In my testing both options solve the page faults.
>>
>> To be honest I don't know the reasoning behind kbase making the
>> JS_CONFIG_START_MMU bit conditional - I'm not aware of any reason why it
>> can't always be set. My guess is performance, but I haven't benchmarked
>> the difference between this and JS_CONFIG_START_MMU.
>>
>> -8<--
>>  From e3f75c7f04e43238dfc579029b8c11fb6b4a0c18 Mon Sep 17 00:00:00 2001
>> From: Steven Price 
>> Date: Thu, 4 Apr 2019 15:53:17 +0100
>> Subject: [PATCH] iommu: io-pgtable: IO_PGTABLE_QUIRK_TLBI_ON_MAP for LPAE
>>
>> Midgard/Bifrost GPUs require a TLB invalidation when mapping pages,
>> implement IO_PGTABLE_QUIRK_TLBI_ON_MAP for LPAE iommu page table
>> formats and add the quirk bit to Panfrost.
>>
>> Signed-off-by: Steven Price 
>> ---
>>   drivers/gpu/drm/panfrost/panfrost_mmu.c |  1 +
>>   drivers/iommu/io-pgtable-arm.c  | 11 +--
>>   2 files changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> b/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> index f3aad8591cf4..094312074d66 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> @@ -343,6 +343,7 @@ int panfrost_mmu_init(struct panfrost_device *pfdev)
>>   mmu_write(pfdev, MMU_INT_MASK, ~0);
>>
>>   pfdev->mmu->pgtbl_cfg = (struct io_pgtable_cfg) {
>> +    .quirks    = IO_PGTABLE_QUIRK_TLBI_ON_MAP,
>>   .pgsize_bitmap    = SZ_4K, // | SZ_2M | SZ_1G),
>>   .ias    = 48,
>>   .oas    = 40,    /* Should come from dma mask? */
>> diff --git a/drivers/iommu/io-pgtable-arm.c
>> b/drivers/iommu/io-pgtable-arm.c
>> index 84beea1f47a7..45fd7bbdf9aa 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -505,7 +505,13 @@ static int arm_lpae_map(struct io_pgtable_ops *ops,
>> unsigned long iova,
>>    * Synchronise all PTE updates for the new mapping before there's
>>    * a chance for anything to kick off a table walk for the new iova.
>>    */
>> -    wmb();
>> +    if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_TLBI_ON_MAP) {
>> +    io_pgtable_tlb_add_flush(>iop, iova, size,
>> + ARM_LPAE_BLOCK_SIZE(2, data), false);
> 
> For correctness (in case this ever ends up used

Re: [PATCH v3 1/3] iommu: io-pgtable: Add ARM Mali midgard MMU page table format

2019-04-05 Thread Steven Price
First let me say congratulations to everyone working on Panfrost - it's
an impressive achievement!

Full disclosure: I used to work on the Mali kbase driver. And have been
playing around with running the Mali user-space blob with the Panfrost
kernel driver.

On 01/04/2019 08:47, Rob Herring wrote:
> ARM Mali midgard GPU is similar to standard 64-bit stage 1 page tables, but
> have a few differences. Add a new format type to represent the format. The
> input address size is 48-bits and the output address size is 40-bits (and
> possibly less?). Note that the later bifrost GPUs follow the standard
> 64-bit stage 1 format.
> 
> The differences in the format compared to 64-bit stage 1 format are:
> 
> The 3rd level page entry bits are 0x1 instead of 0x3 for page entries.
> 
> The access flags are not read-only and unprivileged, but read and write.
> This is similar to stage 2 entries, but the memory attributes field matches
> stage 1 being an index.
> 
> The nG bit is not set by the vendor driver. This one didn't seem to matter,
> but we'll keep it aligned to the vendor driver.

The nG bit should be ignored by the hardware.

The MMU in Midgard/Bifrost has a quirk similar to
IO_PGTABLE_QUIRK_TLBI_ON_MAP - you must perform a cache flush for the
GPU to (reliably) pick up new page table mappings.

You may not have seen this because of the use of the JS_CONFIG_START_MMU
bit - this effectively performs a cache flush and TLB invalidate before
starting a job, however when using a GPU like T760 (e.g. on the Firefly
RK3288) this bit isn't being set. In my testing on the Firefly board I
saw GPU page faults because of this.

There's two options for fixing this - a patch like below adds the quirk
mode to the MMU. Or alternatively always set JS_CONFIG_START_MMU on
jobs. In my testing both options solve the page faults.

To be honest I don't know the reasoning behind kbase making the
JS_CONFIG_START_MMU bit conditional - I'm not aware of any reason why it
can't always be set. My guess is performance, but I haven't benchmarked
the difference between this and JS_CONFIG_START_MMU.

-8<--
>From e3f75c7f04e43238dfc579029b8c11fb6b4a0c18 Mon Sep 17 00:00:00 2001
From: Steven Price 
Date: Thu, 4 Apr 2019 15:53:17 +0100
Subject: [PATCH] iommu: io-pgtable: IO_PGTABLE_QUIRK_TLBI_ON_MAP for LPAE

Midgard/Bifrost GPUs require a TLB invalidation when mapping pages,
implement IO_PGTABLE_QUIRK_TLBI_ON_MAP for LPAE iommu page table
formats and add the quirk bit to Panfrost.

Signed-off-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c |  1 +
 drivers/iommu/io-pgtable-arm.c  | 11 +--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index f3aad8591cf4..094312074d66 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -343,6 +343,7 @@ int panfrost_mmu_init(struct panfrost_device *pfdev)
mmu_write(pfdev, MMU_INT_MASK, ~0);

pfdev->mmu->pgtbl_cfg = (struct io_pgtable_cfg) {
+   .quirks = IO_PGTABLE_QUIRK_TLBI_ON_MAP,
.pgsize_bitmap  = SZ_4K, // | SZ_2M | SZ_1G),
.ias= 48,
.oas= 40,   /* Should come from dma mask? */
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 84beea1f47a7..45fd7bbdf9aa 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -505,7 +505,13 @@ static int arm_lpae_map(struct io_pgtable_ops *ops,
unsigned long iova,
 * Synchronise all PTE updates for the new mapping before there's
 * a chance for anything to kick off a table walk for the new iova.
 */
-   wmb();
+   if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_TLBI_ON_MAP) {
+   io_pgtable_tlb_add_flush(>iop, iova, size,
+ARM_LPAE_BLOCK_SIZE(2, data), false);
+   io_pgtable_tlb_sync(>iop);
+   } else {
+   wmb();
+   }

return ret;
 }
@@ -800,7 +806,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg
*cfg, void *cookie)
struct arm_lpae_io_pgtable *data;

if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA |
-   IO_PGTABLE_QUIRK_NON_STRICT))
+   IO_PGTABLE_QUIRK_NON_STRICT |
+   IO_PGTABLE_QUIRK_TLBI_ON_MAP))
return NULL;

data = arm_lpae_alloc_pgtable(cfg);
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu