Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache
On Thu, 24 Jan 2019 at 07:58, Vivek Gautam wrote:
>
> On Mon, Jan 21, 2019 at 7:55 PM Ard Biesheuvel wrote:
> >
> > On Mon, 21 Jan 2019 at 14:56, Robin Murphy wrote:
> > >
> > > On 21/01/2019 13:36, Ard Biesheuvel wrote:
> > > > On Mon, 21 Jan 2019 at 14:25, Robin Murphy wrote:
> > > >>
> > > >> On 21/01/2019 10:50, Ard Biesheuvel wrote:
> > > >>> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam wrote:
> > >
> > > Hi,
> > >
> > > > On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel wrote:
> > > >
> > > > On Mon, 21 Jan 2019 at 06:54, Vivek Gautam wrote:
> > > >>
> > > >> Qualcomm SoCs have an additional level of cache called as
> > > >> System cache, aka. Last level cache (LLC). This cache sits right
> > > >> before the DDR, and is tightly coupled with the memory controller.
> > > >> The clients using this cache request their slices from this
> > > >> system cache, make it active, and can then start using it.
> > > >> For these clients with smmu, to start using the system cache for
> > > >> buffers and, related page tables [1], memory attributes need to be
> > > >> set accordingly. This series adds the required support.
> > > >>
> > > >
> > > > Does this actually improve performance on reads from a device? The
> > > > non-cache coherent DMA routines perform an unconditional D-cache
> > > > invalidate by VA to the PoC before reading from the buffers filled by
> > > > the device, and I would expect the PoC to be defined as lying beyond
> > > > the LLC to still guarantee the architected behavior.
> > >
> > > We have seen performance improvements when running Manhattan
> > > GFXBench benchmarks.
> > >
> > > >>>
> > > >>> Ah ok, that makes sense, since in that case the data flow is mostly
> > > >>> to the device, not from the device.
> > > >>>
> > > As for the PoC, from my knowledge on sdm845 the system cache, aka
> > > Last level cache (LLC), lies beyond the point of coherency.
> > > Non-cache coherent buffers will not be cached in the system cache
> > > either, and no additional software cache maintenance ops are required
> > > for the system cache.
> > > Pratik can add more if I am missing something.
> > >
> > > To take care of the memory attributes from the DMA APIs side, we can
> > > add a DMA_ATTR definition to take care of any dma non-coherent API
> > > calls.
> > >
> > > >>>
> > > >>> So does the device use the correct inner non-cacheable, outer
> > > >>> writeback cacheable attributes if the SMMU is in pass-through?
> > > >>>
> > > >>> We have been looking into another use case where the fact that the
> > > >>> SMMU overrides memory attributes is causing issues (WC mappings used
> > > >>> by the radeon and amdgpu driver). So if the SMMU would honour the
> > > >>> existing attributes, would you still need the SMMU changes?
> > > >>
> > > >> Even if we could force a stage 2 mapping with the weakest pagetable
> > > >> attributes (such that combining would work), there would still need to
> > > >> be a way to set the TCR attributes appropriately if this behaviour is
> > > >> wanted for the SMMU's own table walks as well.
> > > >>
> > > >
> > > > Isn't that just a matter of implementing support for SMMUs that lack
> > > > the 'dma-coherent' attribute?
> > >
> > > Not quite - in general they need INC-ONC attributes in case there
> > > actually is something in the architectural outer-cacheable domain.
> >
> > But is it a problem to use INC-ONC attributes for the SMMU PTW on this
> > chip? AIUI, the reason for the SMMU changes is to avoid the
> > performance hit of snooping, which is more expensive than cache
> > maintenance of SMMU page tables. So are you saying the by-VA cache
> > maintenance is not relayed to this system cache, resulting in page
> > table updates being invisible to masters using INC-ONC attributes?
>
> The reason for these SMMU changes is that the non-coherent devices
> can't access the inner caches at all. But they have a way to allocate
> and look up in the system cache.
>
> The CPU will by default make use of the system cache when the
> inner-cacheable and outer-cacheable memory attributes are set.
>
> So, for the SMMU page tables to be visible to the PTW:
>
> -- For IO-coherent clients, the CPU cache maintenance operations are
> not required for buffers marked Normal Cached in order to achieve a
> coherent view of memory. However, client-specific cache maintenance
> may still be required for devices with local caches (for example,
> compute DSP local L1 or L2).

Why would devices need to access the SMMU page tables?

> -- For non-IO-coherent clients, the CPU cache maintenance operations
> (cleans and/or invalidates) are required at buffer handoff points for
> buffers marked as Normal Cached in any CPU page table in order to
> observe the latest updates.
>

Indeed, and this is what your non-coherent
[PATCH] iommu/dma: Remove unused variable
end_pfn is never used after commit ('iommu/iova: Make dma 32bit pfn
implicit'), so clean it up.

Cc: Joerg Roedel
Cc: Robin Murphy
Cc: Zhen Lei
Signed-off-by: Shaokun Zhang
---
 drivers/iommu/dma-iommu.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d19f3d6b43c1..77aabe637a60 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -289,7 +289,7 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
 {
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
 	struct iova_domain *iovad = &cookie->iovad;
-	unsigned long order, base_pfn, end_pfn;
+	unsigned long order, base_pfn;
 	int attr;
 
 	if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
@@ -298,7 +298,6 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
 	/* Use the smallest supported page size for IOVA granularity */
 	order = __ffs(domain->pgsize_bitmap);
 	base_pfn = max_t(unsigned long, 1, base >> order);
-	end_pfn = (base + size - 1) >> order;
 
 	/* Check the domain allows at least some access to the device... */
 	if (domain->geometry.force_aperture) {
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache
On Mon, Jan 21, 2019 at 7:55 PM Ard Biesheuvel wrote:
>
> On Mon, 21 Jan 2019 at 14:56, Robin Murphy wrote:
> >
> > On 21/01/2019 13:36, Ard Biesheuvel wrote:
> > > On Mon, 21 Jan 2019 at 14:25, Robin Murphy wrote:
> > >>
> > >> On 21/01/2019 10:50, Ard Biesheuvel wrote:
> > >>> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam wrote:
> >
> > Hi,
> >
> > > On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel wrote:
> > >
> > > On Mon, 21 Jan 2019 at 06:54, Vivek Gautam wrote:
> > >>
> > >> Qualcomm SoCs have an additional level of cache called as
> > >> System cache, aka. Last level cache (LLC). This cache sits right
> > >> before the DDR, and is tightly coupled with the memory controller.
> > >> The clients using this cache request their slices from this
> > >> system cache, make it active, and can then start using it.
> > >> For these clients with smmu, to start using the system cache for
> > >> buffers and, related page tables [1], memory attributes need to be
> > >> set accordingly. This series adds the required support.
> > >>
> > >
> > > Does this actually improve performance on reads from a device? The
> > > non-cache coherent DMA routines perform an unconditional D-cache
> > > invalidate by VA to the PoC before reading from the buffers filled by
> > > the device, and I would expect the PoC to be defined as lying beyond
> > > the LLC to still guarantee the architected behavior.
> >
> > We have seen performance improvements when running Manhattan
> > GFXBench benchmarks.
> >
> > >>>
> > >>> Ah ok, that makes sense, since in that case the data flow is mostly
> > >>> to the device, not from the device.
> > >>>
> > As for the PoC, from my knowledge on sdm845 the system cache, aka
> > Last level cache (LLC), lies beyond the point of coherency.
> > Non-cache coherent buffers will not be cached in the system cache
> > either, and no additional software cache maintenance ops are required
> > for the system cache.
> > Pratik can add more if I am missing something.
> >
> > To take care of the memory attributes from the DMA APIs side, we can
> > add a DMA_ATTR definition to take care of any dma non-coherent API
> > calls.
> >
> > >>>
> > >>> So does the device use the correct inner non-cacheable, outer
> > >>> writeback cacheable attributes if the SMMU is in pass-through?
> > >>>
> > >>> We have been looking into another use case where the fact that the
> > >>> SMMU overrides memory attributes is causing issues (WC mappings used
> > >>> by the radeon and amdgpu driver). So if the SMMU would honour the
> > >>> existing attributes, would you still need the SMMU changes?
> > >>
> > >> Even if we could force a stage 2 mapping with the weakest pagetable
> > >> attributes (such that combining would work), there would still need to
> > >> be a way to set the TCR attributes appropriately if this behaviour is
> > >> wanted for the SMMU's own table walks as well.
> > >>
> > >
> > > Isn't that just a matter of implementing support for SMMUs that lack
> > > the 'dma-coherent' attribute?
> >
> > Not quite - in general they need INC-ONC attributes in case there
> > actually is something in the architectural outer-cacheable domain.
>
> But is it a problem to use INC-ONC attributes for the SMMU PTW on this
> chip? AIUI, the reason for the SMMU changes is to avoid the
> performance hit of snooping, which is more expensive than cache
> maintenance of SMMU page tables. So are you saying the by-VA cache
> maintenance is not relayed to this system cache, resulting in page
> table updates being invisible to masters using INC-ONC attributes?

The reason for these SMMU changes is that the non-coherent devices
can't access the inner caches at all. But they have a way to allocate
and look up in the system cache.

The CPU will by default make use of the system cache when the
inner-cacheable and outer-cacheable memory attributes are set.

So, for the SMMU page tables to be visible to the PTW:

-- For IO-coherent clients, the CPU cache maintenance operations are not
required for buffers marked Normal Cached in order to achieve a coherent
view of memory. However, client-specific cache maintenance may still be
required for devices with local caches (for example, compute DSP local
L1 or L2).

-- For non-IO-coherent clients, the CPU cache maintenance operations
(cleans and/or invalidates) are required at buffer handoff points for
buffers marked as Normal Cached in any CPU page table in order to
observe the latest updates.

Regards
Vivek

> > The case of the outer cacheability being not that, but a hint to
> > control non-CPU traffic through some not-quite-transparent cache
> > behind the PoC, definitely stays wrapped up in qcom-specific magic ;)
>
> I'm not surprised ...

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of Code Aurora Forum, hosted by The Linux
Re: [PATCH v5 2/8] iommu/vt-d: Add per-device IOMMU feature ops entries
Hi Joerg,

On 1/11/19 7:16 PM, Joerg Roedel wrote:
>> +
>> +static bool
>> +intel_iommu_dev_has_feat(struct device *dev, enum iommu_dev_features feat)
>> +{
>> +	struct device_domain_info *info = dev->archdata.iommu;
>> +
>> +	if (feat == IOMMU_DEV_FEAT_AUX)
>> +		return scalable_mode_support() && info && info->auxd_enabled;
>> +
>> +	return false;
>> +}
>
> Why is this checking the auxd_enabled flag? The function should just
> return whether the device _supports_ scalable mode, not whether it is
> enabled.

Yes, as the API name implies, it should return the device capability
instead of the enable/disable status. I misused this API in the IOMMU
driver.

Since we already have iommu_dev_enable/disable_feature() to enable and
disable an iommu specific feature, is it possible to add another API to
query whether a specific feature has been enabled? How about

	bool iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX)?

This is necessary for third party drivers (like vfio) to determine
which domain attach interface they should use:

	if (iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX))
		iommu_aux_attach_device(domain, dev);
	else
		iommu_attach_device(domain, dev);

Best regards,
Lu Baolu
Re: [PATCH 0/3] Add IPROC PCIe new features
Hi Bjorn,

Thank you for the review. I will address your comments in the next
patchset.

Regards,
Srinath.

On Fri, Jan 18, 2019 at 8:11 PM Bjorn Helgaas wrote:
>
> On Fri, Jan 18, 2019 at 09:53:20AM +0530, Srinath Mannam wrote:
> > Add changes related to IPROC PCIe RC IP new features.
> >
> > This patch set is based on Linux-5.0-rc2.
> >
> > Srinath Mannam (3):
> >   PCI: iproc: Add feature to set order mode
>
> Since this apparently refers to a PCIe feature, the subject should use
> the exact terminology used in the PCIe spec.
>
> >   PCI: iproc: CRS state check in config request
>
> This subject should start with a verb, not "CRS".
>
> >   PCI: iproc: Add PCIe 32bit outbound memory configuration
> >
> >  drivers/pci/controller/pcie-iproc.c | 172 +++-
> >  drivers/pci/controller/pcie-iproc.h |  16
> >  2 files changed, 184 insertions(+), 4 deletions(-)
> >
> > --
> > 2.7.4
[PATCH v3] iommu: amd: Fix IOMMU page flush when detach device from a domain
From: Suravee Suthikulpanit

When a VM is terminated, the VFIO driver detaches all pass-through
devices from the VFIO domain by clearing the domain id and page table
root pointer from each device table entry (DTE), and then invalidates
the DTE. Then the VFIO driver unmaps pages and invalidates IOMMU pages.

Currently, the IOMMU driver keeps track of which IOMMU and how many
devices are attached to the domain. When invalidating IOMMU pages, the
driver checks if the IOMMU is still attached to the domain before
issuing the invalidate-page command. However, since VFIO has already
detached all devices from the domain, the subsequent
INVALIDATE_IOMMU_PAGES commands are being skipped as there is no IOMMU
attached to the domain. This results in data corruption and could cause
the PCI device to end up in an indeterminate state.

Fix this by invalidating IOMMU pages when detaching a device, and
before decrementing the per-domain device reference counts.

Cc: Boris Ostrovsky
Suggested-by: Joerg Roedel
Co-developed-by: Brijesh Singh
Signed-off-by: Brijesh Singh
Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd_iommu.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 87ba23a75b38..6cd4a00036c1 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1991,16 +1991,13 @@ static void do_attach(struct iommu_dev_data *dev_data,
 
 static void do_detach(struct iommu_dev_data *dev_data)
 {
+	struct protection_domain *domain = dev_data->domain;
 	struct amd_iommu *iommu;
 	u16 alias;
 
 	iommu = amd_iommu_rlookup_table[dev_data->devid];
 	alias = dev_data->alias;
 
-	/* decrease reference counters */
-	dev_data->domain->dev_iommu[iommu->index] -= 1;
-	dev_data->domain->dev_cnt -= 1;
-
 	/* Update data structures */
 	dev_data->domain = NULL;
 	list_del(&dev_data->list);
@@ -2010,6 +2007,16 @@ static void do_detach(struct iommu_dev_data *dev_data)
 
 	/* Flush the DTE entry */
 	device_flush_dte(dev_data);
+
+	/* Flush IOTLB */
+	domain_flush_tlb_pde(domain);
+
+	/* Wait for the flushes to finish */
+	domain_flush_complete(domain);
+
+	/* decrease reference counters - needs to happen after the flushes */
+	domain->dev_iommu[iommu->index] -= 1;
+	domain->dev_cnt -= 1;
 }
 
 /*
-- 
2.17.1
Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain
Joerg,

On 1/23/19 2:56 PM, j...@8bytes.org wrote:
> Hi Suravee,
>
> On Tue, Jan 22, 2019 at 03:53:18PM +, Suthikulpanit, Suravee wrote:
>> Thanks for the detail. Alright then, let's just go with the version you
>> sent on 1/16/19. Do you want me to resend v3 with those changes, or
>> would you be taking care of that?
>
> Please send me a v3 based on the diff I sent last week. Also add a
> separate patch which adds the missing DTE flush for the alias entry.
>
> Thanks,
>
> Joerg

Actually, I just noticed that device_flush_dte() already handles
flushing the DTE for the alias device as well. Please see the link
below.

https://elixir.bootlin.com/linux/latest/source/drivers/iommu/amd_iommu.c#L1219

So we don't need to call device_flush_dte() for the alias device in
do_detach().

Regards,
Suravee
[PATCH 1/1] iommu/vt-d: Leave scalable mode default off
Commit 765b6a98c1de3 ("iommu/vt-d: Enumerate the scalable mode
capability") enables VT-d scalable mode if the hardware advertises the
capability. As we will bring up different features and use cases
upstream in different patch series, there will be some intermediate
kernel versions which support only partial features. Hence, end users
might run into problems when they use such kernels on bare metal or in
virtualization environments.

This leaves scalable mode default off; end users can turn it on with
"intel-iommu=sm_on" only when they have a clear idea about which
scalable features are supported in the kernel.

Cc: Liu Yi L
Cc: Jacob Pan
Suggested-by: Ashok Raj
Suggested-by: Kevin Tian
Signed-off-by: Lu Baolu
---
 Documentation/admin-guide/kernel-parameters.txt | 7 +++----
 drivers/iommu/intel-iommu.c                     | 8 ++++----
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index b799bcf67d7b..858b6c0b9a15 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1696,12 +1696,11 @@
 			By default, super page will be supported if Intel IOMMU
 			has the capability. With this option, super page will
 			not be supported.
-		sm_off [Default Off]
-			By default, scalable mode will be supported if the
+		sm_on [Default Off]
+			By default, scalable mode will be disabled even if the
 			hardware advertises that it has support for the scalable
 			mode translation. With this option set, scalable mode
-			will not be used even on hardware which claims to
-			support it.
+			will be used on hardware which claims to support it.
 		tboot_noforce [Default Off]
 			Do not force the Intel IOMMU enabled under tboot.
 			By default, tboot will force Intel IOMMU on, which
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2bd9ac285c0d..8d31d3e25e30 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -363,7 +363,7 @@ static int dmar_map_gfx = 1;
 static int dmar_forcedac;
 static int intel_iommu_strict;
 static int intel_iommu_superpage = 1;
-static int intel_iommu_sm = 1;
+static int intel_iommu_sm;
 static int iommu_identity_mapping;
 
 #define IDENTMAP_ALL 1
@@ -456,9 +456,9 @@ static int __init intel_iommu_setup(char *str)
 		} else if (!strncmp(str, "sp_off", 6)) {
 			pr_info("Disable supported super page\n");
 			intel_iommu_superpage = 0;
-		} else if (!strncmp(str, "sm_off", 6)) {
-			pr_info("Intel-IOMMU: disable scalable mode support\n");
-			intel_iommu_sm = 0;
+		} else if (!strncmp(str, "sm_on", 5)) {
+			pr_info("Intel-IOMMU: scalable mode supported\n");
+			intel_iommu_sm = 1;
 		} else if (!strncmp(str, "tboot_noforce", 13)) {
 			printk(KERN_INFO
 				"Intel-IOMMU: not forcing on after tboot. This could expose security risk for tboot\n");
-- 
2.17.1
Re: [PATCH 5/5] virtio-blk: Consider virtio_max_dma_size() for maximum segment size
On Wed, Jan 23, 2019 at 10:31:39PM +0100, Christoph Hellwig wrote:
> On Wed, Jan 23, 2019 at 05:30:49PM +0100, Joerg Roedel wrote:
> > +	max_size = virtio_max_dma_size(vdev);
> > +
> > 	/* Host can optionally specify maximum segment size and number of
> > 	 * segments. */
> > 	err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SIZE_MAX,
> > 				   struct virtio_blk_config, size_max, &v);
> > 	if (!err)
> > -		blk_queue_max_segment_size(q, v);
> > -	else
> > -		blk_queue_max_segment_size(q, -1U);
> > +		max_size = min(max_size, v);
> > +
> > 	blk_queue_max_segment_size(q, max_size);
>
> I wonder if we should just move the dma max segment size check
> into blk_queue_max_segment_size so that all block drivers benefit
> from it. Even if not I think at least the SCSI midlayer should
> be updated to support it.
>
> Btw, I wonder why virtio-scsi sticks to the default segment size,
> unlike virtio-blk.

Well, no one bothered exposing that through that device. Why does
virtio-blk have it? Hard for me to say. The first driver version had
it already, but QEMU does not seem to use it and it seems that it
never did.

-- 
MST
Re: [PATCH 5/5] virtio-blk: Consider virtio_max_dma_size() for maximum segment size
On Wed, Jan 23, 2019 at 05:30:49PM +0100, Joerg Roedel wrote:
> +	max_size = virtio_max_dma_size(vdev);
> +
> 	/* Host can optionally specify maximum segment size and number of
> 	 * segments. */
> 	err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SIZE_MAX,
> 			   struct virtio_blk_config, size_max, &v);
> 	if (!err)
> -		blk_queue_max_segment_size(q, v);
> -	else
> -		blk_queue_max_segment_size(q, -1U);
> +		max_size = min(max_size, v);
> +
> 	blk_queue_max_segment_size(q, max_size);

I wonder if we should just move the dma max segment size check
into blk_queue_max_segment_size so that all block drivers benefit
from it. Even if not I think at least the SCSI midlayer should
be updated to support it.

Btw, I wonder why virtio-scsi sticks to the default segment size,
unlike virtio-blk.
Re: [PATCH 0/5 v3] Fix virtio-blk issue with SWIOTLB
On Wed, Jan 23, 2019 at 01:51:29PM -0500, Michael S. Tsirkin wrote:
> On Wed, Jan 23, 2019 at 05:30:44PM +0100, Joerg Roedel wrote:
> > Hi,
> >
> > here is the third version of this patch-set. Previous
> > versions can be found here:
> >
> > V1: https://lore.kernel.org/lkml/20190110134433.15672-1-j...@8bytes.org/
> >
> > V2: https://lore.kernel.org/lkml/20190115132257.6426-1-j...@8bytes.org/
> >
> > The problem solved here is a limitation of the SWIOTLB implementation,
> > which does not support allocations larger than 256kb. When the
> > virtio-blk driver tries to read/write a block larger than that, the
> > allocation of the dma-handle fails and an IO error is reported.
>
> OK looks good to me.
> I will park this in my tree for now; this way it will get
> testing in linux-next.
> Can I get an ack from DMA maintainers on the DMA bits for
> merging this in 5.0?

You got mine (SWIOTLB is my area).

> > Changes to v2 are:
> >
> > 	* Check if SWIOTLB is active before returning its limit in
> > 	  dma_direct_max_mapping_size()
> >
> > 	* Only apply the maximum segment limit in virtio-blk when
> > 	  DMA-API is used for the vring
> >
> > Please review.
> >
> > Thanks,
> >
> > 	Joerg
> >
> > Joerg Roedel (5):
> >   swiotlb: Introduce swiotlb_max_mapping_size()
> >   swiotlb: Add is_swiotlb_active() function
> >   dma: Introduce dma_max_mapping_size()
> >   virtio: Introduce virtio_max_dma_size()
> >   virtio-blk: Consider virtio_max_dma_size() for maximum segment size
> >
> >  drivers/block/virtio_blk.c   | 10 ++++++----
> >  drivers/virtio/virtio_ring.c | 10 ++++++++++
> >  include/linux/dma-mapping.h  | 16 ++++++++++++++++
> >  include/linux/swiotlb.h      | 11 +++++++++++
> >  include/linux/virtio.h       |  2 ++
> >  kernel/dma/direct.c          | 11 +++++++++++
> >  kernel/dma/swiotlb.c         | 10 ++++++++++
> >  7 files changed, 66 insertions(+), 4 deletions(-)
> >
> > --
> > 2.17.1
Re: [PATCH 3/5] dma: Introduce dma_max_mapping_size()
This looks ok, but could really use some documentation in
Documentation/DMA-API.txt.
Re: [PATCH 1/5] swiotlb: Introduce swiotlb_max_mapping_size()
On Wed, Jan 23, 2019 at 05:30:45PM +0100, Joerg Roedel wrote:
> +extern size_t swiotlb_max_mapping_size(struct device *dev);

No need for the extern keyword on function declarations in headers.
Re: [PATCH 2/5] swiotlb: Add is_swiotlb_active() function
On Wed, Jan 23, 2019 at 05:30:46PM +0100, Joerg Roedel wrote:
> +bool is_swiotlb_active(void)
> +{
> +	return !no_iotlb_memory;
> +}

As I've just introduced and fixed a bug in this area in the current
cycle, I don't think no_iotlb_memory is what you want (and maybe it is
not useful at all): if the arch calls swiotlb_exit() after previously
initializing a buffer, it won't be set. You probably want to check for
a non-zero io_tlb_start and/or io_tlb_end.
Re: [PATCH 0/5 v3] Fix virtio-blk issue with SWIOTLB
On Wed, Jan 23, 2019 at 05:30:44PM +0100, Joerg Roedel wrote:
> Hi,
>
> here is the third version of this patch-set. Previous
> versions can be found here:
>
> V1: https://lore.kernel.org/lkml/20190110134433.15672-1-j...@8bytes.org/
>
> V2: https://lore.kernel.org/lkml/20190115132257.6426-1-j...@8bytes.org/
>
> The problem solved here is a limitation of the SWIOTLB implementation,
> which does not support allocations larger than 256kb. When the
> virtio-blk driver tries to read/write a block larger than that, the
> allocation of the dma-handle fails and an IO error is reported.

OK looks good to me.
I will park this in my tree for now; this way it will get
testing in linux-next.
Can I get an ack from DMA maintainers on the DMA bits for
merging this in 5.0?

> Changes to v2 are:
>
> 	* Check if SWIOTLB is active before returning its limit in
> 	  dma_direct_max_mapping_size()
>
> 	* Only apply the maximum segment limit in virtio-blk when
> 	  DMA-API is used for the vring
>
> Please review.
>
> Thanks,
>
> 	Joerg
>
> Joerg Roedel (5):
>   swiotlb: Introduce swiotlb_max_mapping_size()
>   swiotlb: Add is_swiotlb_active() function
>   dma: Introduce dma_max_mapping_size()
>   virtio: Introduce virtio_max_dma_size()
>   virtio-blk: Consider virtio_max_dma_size() for maximum segment size
>
>  drivers/block/virtio_blk.c   | 10 ++++++----
>  drivers/virtio/virtio_ring.c | 10 ++++++++++
>  include/linux/dma-mapping.h  | 16 ++++++++++++++++
>  include/linux/swiotlb.h      | 11 +++++++++++
>  include/linux/virtio.h       |  2 ++
>  kernel/dma/direct.c          | 11 +++++++++++
>  kernel/dma/swiotlb.c         | 10 ++++++++++
>  7 files changed, 66 insertions(+), 4 deletions(-)
>
> --
> 2.17.1
Re: [PATCH 0/5 v3] Fix virtio-blk issue with SWIOTLB
On Wed, Jan 23, 2019 at 05:30:44PM +0100, Joerg Roedel wrote:
> Hi,
>
> here is the third version of this patch-set. Previous
> versions can be found here:
>
> V1: https://lore.kernel.org/lkml/20190110134433.15672-1-j...@8bytes.org/
>
> V2: https://lore.kernel.org/lkml/20190115132257.6426-1-j...@8bytes.org/
>
> The problem solved here is a limitation of the SWIOTLB implementation,
> which does not support allocations larger than 256kb. When the
> virtio-blk driver tries to read/write a block larger than that, the
> allocation of the dma-handle fails and an IO error is reported.
>
> Changes to v2 are:
>
> 	* Check if SWIOTLB is active before returning its limit in
> 	  dma_direct_max_mapping_size()
>
> 	* Only apply the maximum segment limit in virtio-blk when
> 	  DMA-API is used for the vring
>
> Please review.
>
> Thanks,
>
> 	Joerg
>
> Joerg Roedel (5):
>   swiotlb: Introduce swiotlb_max_mapping_size()
>   swiotlb: Add is_swiotlb_active() function
>   dma: Introduce dma_max_mapping_size()
>   virtio: Introduce virtio_max_dma_size()
>   virtio-blk: Consider virtio_max_dma_size() for maximum segment size
>
>  drivers/block/virtio_blk.c   | 10 ++++++----
>  drivers/virtio/virtio_ring.c | 10 ++++++++++

The kvm-devel mailing list should have been copied on those. When you
do, can you please put 'Reviewed-by: Konrad Rzeszutek Wilk' on all of
them?

Thank you!

P.S. Christopher, I am assuming you are OK with this idea; if so - and
once the owners of 'virtio' Ack - do you want to put it in my tree?

Thanks!

>  include/linux/dma-mapping.h  | 16 ++++++++++++++++
>  include/linux/swiotlb.h      | 11 +++++++++++
>  include/linux/virtio.h       |  2 ++
>  kernel/dma/direct.c          | 11 +++++++++++
>  kernel/dma/swiotlb.c         | 10 ++++++++++
>  7 files changed, 66 insertions(+), 4 deletions(-)
>
> --
> 2.17.1
Re: [PATCH] iommu/intel: quirk to disable DMAR for QM57 igfx
On Wed, Jan 23, 2019 at 05:02:38PM +0200, Joonas Lahtinen wrote:
> We have many reports where just having intel_iommu=on (and using the
> system normally, without any virtualization stuff going on) will cause
> unexplained GPU hangs. For those users, simply switching to
> intel_iommu=igfx_off solves the problems, and the debug often ends
> there.

If you can reproduce the problems on your side, then you can try to
enable CONFIG_INTEL_IOMMU_BROKEN_GFX_WA to force the GFX devices into
the identity mapping. We can also add a boot parameter and workarounds
if it turns out that this is sufficient to make the GFX devices work
with the IOMMU enabled.

Regards,

Joerg
[PATCH 0/5 v3] Fix virtio-blk issue with SWIOTLB
Hi,

here is the third version of this patch-set. Previous versions can be
found here:

V1: https://lore.kernel.org/lkml/20190110134433.15672-1-j...@8bytes.org/

V2: https://lore.kernel.org/lkml/20190115132257.6426-1-j...@8bytes.org/

The problem solved here is a limitation of the SWIOTLB implementation,
which does not support allocations larger than 256kb. When the
virtio-blk driver tries to read/write a block larger than that, the
allocation of the dma-handle fails and an IO error is reported.

Changes to v2 are:

	* Check if SWIOTLB is active before returning its limit in
	  dma_direct_max_mapping_size()

	* Only apply the maximum segment limit in virtio-blk when
	  DMA-API is used for the vring

Please review.

Thanks,

	Joerg

Joerg Roedel (5):
  swiotlb: Introduce swiotlb_max_mapping_size()
  swiotlb: Add is_swiotlb_active() function
  dma: Introduce dma_max_mapping_size()
  virtio: Introduce virtio_max_dma_size()
  virtio-blk: Consider virtio_max_dma_size() for maximum segment size

 drivers/block/virtio_blk.c   | 10 ++++++----
 drivers/virtio/virtio_ring.c | 10 ++++++++++
 include/linux/dma-mapping.h  | 16 ++++++++++++++++
 include/linux/swiotlb.h      | 11 +++++++++++
 include/linux/virtio.h       |  2 ++
 kernel/dma/direct.c          | 11 +++++++++++
 kernel/dma/swiotlb.c         | 10 ++++++++++
 7 files changed, 66 insertions(+), 4 deletions(-)

-- 
2.17.1
[PATCH 5/5] virtio-blk: Consider virtio_max_dma_size() for maximum segment size
From: Joerg Roedel

Segments can't be larger than the maximum DMA mapping size supported on
the platform. Take that into account when setting the maximum segment
size for a block device.

Signed-off-by: Joerg Roedel
---
 drivers/block/virtio_blk.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index b16a887bbd02..4bc083b7c9b5 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -723,7 +723,7 @@ static int virtblk_probe(struct virtio_device *vdev)
 	struct request_queue *q;
 	int err, index;
 
-	u32 v, blk_size, sg_elems, opt_io_size;
+	u32 v, blk_size, max_size, sg_elems, opt_io_size;
 	u16 min_io_size;
 	u8 physical_block_exp, alignment_offset;
 
@@ -826,14 +826,16 @@ static int virtblk_probe(struct virtio_device *vdev)
 	/* No real sector limit. */
 	blk_queue_max_hw_sectors(q, -1U);
 
+	max_size = virtio_max_dma_size(vdev);
+
 	/* Host can optionally specify maximum segment size and number of
 	 * segments. */
 	err = virtio_cread_feature(vdev, VIRTIO_BLK_F_SIZE_MAX,
 				   struct virtio_blk_config, size_max, &v);
 	if (!err)
-		blk_queue_max_segment_size(q, v);
-	else
-		blk_queue_max_segment_size(q, -1U);
+		max_size = min(max_size, v);
+
+	blk_queue_max_segment_size(q, max_size);
 
 	/* Host can optionally specify the block size of the device */
 	err = virtio_cread_feature(vdev, VIRTIO_BLK_F_BLK_SIZE,
-- 
2.17.1
[PATCH 3/5] dma: Introduce dma_max_mapping_size()
From: Joerg Roedel

The function returns the maximum size that can be mapped
using DMA-API functions. The patch also adds the
implementation for direct DMA and a new dma_map_ops pointer
so that other implementations can expose their limit.

Signed-off-by: Joerg Roedel
---
 include/linux/dma-mapping.h | 16 ++++++++++++++++
 kernel/dma/direct.c         | 11 +++++++++++
 2 files changed, 27 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index f6ded992c183..a3ca8a71a704 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -130,6 +130,7 @@ struct dma_map_ops {
 			enum dma_data_direction direction);
 	int (*dma_supported)(struct device *dev, u64 mask);
 	u64 (*get_required_mask)(struct device *dev);
+	size_t (*max_mapping_size)(struct device *dev);
 };
 
 #define DMA_MAPPING_ERROR		(~(dma_addr_t)0)
@@ -257,6 +258,8 @@ static inline void dma_direct_sync_sg_for_cpu(struct device *dev,
 }
 #endif
 
+size_t dma_direct_max_mapping_size(struct device *dev);
+
 #ifdef CONFIG_HAS_DMA
 #include <asm/dma-mapping.h>
 
@@ -440,6 +443,19 @@ static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 	return 0;
 }
 
+static inline size_t dma_max_mapping_size(struct device *dev)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+	size_t size = SIZE_MAX;
+
+	if (dma_is_direct(ops))
+		size = dma_direct_max_mapping_size(dev);
+	else if (ops && ops->max_mapping_size)
+		size = ops->max_mapping_size(dev);
+
+	return size;
+}
+
 void *dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
 		gfp_t flag, unsigned long attrs);
 void dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 355d16acee6d..6310ad01f915 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -380,3 +380,14 @@ int dma_direct_supported(struct device *dev, u64 mask)
 	 */
 	return mask >= __phys_to_dma(dev, min_mask);
 }
+
+size_t dma_direct_max_mapping_size(struct device *dev)
+{
+	size_t size = SIZE_MAX;
+
+	/* If SWIOTLB is active, use its maximum mapping size */
+	if (is_swiotlb_active())
+		size = swiotlb_max_mapping_size(dev);
+
+	return size;
+}
-- 
2.17.1
[PATCH 2/5] swiotlb: Add is_swiotlb_active() function
From: Joerg Roedel

This function will be used from dma_direct code to determine
the maximum segment size of a dma mapping.

Signed-off-by: Joerg Roedel
---
 include/linux/swiotlb.h | 6 ++++++
 kernel/dma/swiotlb.c    | 5 +++++
 2 files changed, 11 insertions(+)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index ceb623321f38..5c087d330b4b 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -63,6 +63,7 @@ extern void swiotlb_tbl_sync_single(struct device *hwdev,
 extern int swiotlb_dma_supported(struct device *hwdev, u64 mask);
 extern size_t swiotlb_max_mapping_size(struct device *dev);
+bool is_swiotlb_active(void);
 
 #ifdef CONFIG_SWIOTLB
 extern enum swiotlb_force swiotlb_force;
@@ -100,6 +101,11 @@ static inline size_t swiotlb_max_mapping_size(struct device *dev)
 {
 	return SIZE_MAX;
 }
+
+static inline bool is_swiotlb_active(void)
+{
+	return false;
+}
 #endif /* CONFIG_SWIOTLB */
 
 extern void swiotlb_print_info(void);
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 9cb21259cb0b..9fbd075081d9 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -667,3 +667,8 @@ size_t swiotlb_max_mapping_size(struct device *dev)
 {
 	return ((size_t)1 << IO_TLB_SHIFT) * IO_TLB_SEGSIZE;
 }
+
+bool is_swiotlb_active(void)
+{
+	return !no_iotlb_memory;
+}
-- 
2.17.1
[PATCH 1/5] swiotlb: Introduce swiotlb_max_mapping_size()
From: Joerg Roedel

The function returns the maximum size that can be remapped
by the SWIOTLB implementation. This function will be later
exposed to users through the DMA-API.

Signed-off-by: Joerg Roedel
---
 include/linux/swiotlb.h | 5 +++++
 kernel/dma/swiotlb.c    | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 7c007ed7505f..ceb623321f38 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -62,6 +62,7 @@ extern void swiotlb_tbl_sync_single(struct device *hwdev,
 extern int swiotlb_dma_supported(struct device *hwdev, u64 mask);
+extern size_t swiotlb_max_mapping_size(struct device *dev);
 
 #ifdef CONFIG_SWIOTLB
 extern enum swiotlb_force swiotlb_force;
@@ -95,6 +96,10 @@ static inline unsigned int swiotlb_max_segment(void)
 {
 	return 0;
 }
+static inline size_t swiotlb_max_mapping_size(struct device *dev)
+{
+	return SIZE_MAX;
+}
 #endif /* CONFIG_SWIOTLB */
 
 extern void swiotlb_print_info(void);
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 1fb6fd68b9c7..9cb21259cb0b 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -662,3 +662,8 @@ swiotlb_dma_supported(struct device *hwdev, u64 mask)
 {
 	return __phys_to_dma(hwdev, io_tlb_end - 1) <= mask;
 }
+
+size_t swiotlb_max_mapping_size(struct device *dev)
+{
+	return ((size_t)1 << IO_TLB_SHIFT) * IO_TLB_SEGSIZE;
+}
-- 
2.17.1
[PATCH 4/5] virtio: Introduce virtio_max_dma_size()
From: Joerg Roedel

This function returns the maximum segment size for a single
dma transaction of a virtio device. The possible limit comes
from the SWIOTLB implementation in the Linux kernel, that
has an upper limit of (currently) 256kb of contiguous memory
it can map. Other DMA-API implementations might also have
limits.

Use the new dma_max_mapping_size() function to determine the
maximum mapping size when DMA-API is in use for virtio.

Signed-off-by: Joerg Roedel
---
 drivers/virtio/virtio_ring.c | 10 ++++++++++
 include/linux/virtio.h       |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cd7e755484e3..9ca3fe6af9fa 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -266,6 +266,16 @@ static bool vring_use_dma_api(struct virtio_device *vdev)
 	return false;
 }
 
+size_t virtio_max_dma_size(struct virtio_device *vdev)
+{
+	size_t max_segment_size = SIZE_MAX;
+
+	if (vring_use_dma_api(vdev))
+		max_segment_size = dma_max_mapping_size(&vdev->dev);
+
+	return max_segment_size;
+}
+
 static void *vring_alloc_queue(struct virtio_device *vdev, size_t size,
			       dma_addr_t *dma_handle, gfp_t flag)
 {
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index fa1b5da2804e..673fe3ef3607 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -157,6 +157,8 @@ int virtio_device_freeze(struct virtio_device *dev);
 int virtio_device_restore(struct virtio_device *dev);
 #endif
 
+size_t virtio_max_dma_size(struct virtio_device *vdev);
+
 #define virtio_device_for_each_vq(vdev, vq) \
	list_for_each_entry(vq, &vdev->vqs, list)
-- 
2.17.1
Re: [PATCH] iommu/intel: quirk to disable DMAR for QM57 igfx
Quoting Joerg Roedel (2019-01-22 18:51:35)
> On Tue, Jan 22, 2019 at 04:48:26PM +0200, Joonas Lahtinen wrote:
> > According to our IOMMU folks there exists some desire to be able to
> > assign the iGFX device, aka have intel_iommu=on instead of
> > intel_iommu=igfx_off, due to how the devices might be grouped in
> > IOMMU groups. Even when you would not be using the iGFX device.
>
> You can force the igfx device into a SI domain, or does that also
> trigger the iommu issues on the chipset?

To be honest, we've had a mixture of different issues on different SKUs
that were not hit in the past, when intel_iommu was simply disabled by
default.

I know that in one group of the problems, the issue has been debugged
down to the GPU having its own set of virtualization mapping translation
hardware with caching, which fails to track changes to the mapping. So
if an identity mapping was established and never changed, I'd assume
that to fix at least that class of problems.

Would just passing intel_iommu=on already cause a non-identity mapping
to possibly be used for the integrated GPU? If it did, that would
explain quite a few of the issues.

We have many reports where just having intel_iommu=on (and using the
system normally, without any virtualization going on) will cause
unexplained GPU hangs. For those users, simply switching to
intel_iommu=igfx_off solves the problems, and the debugging often ends
there.

Regards, Joonas

> In any case, if iommu=on breaks these systems I want to make them work
> again with opt-out, even at the cost of breaking assignability.
>
> Regards,
>
> 	Joerg
Re: use generic DMA mapping code in powerpc V4
Hi Christoph,

I also compiled a kernel (zImage) for the X1000 from your Git branch
'powerpc-dma.6-debug' (both patches) today. It boots, and the P.A. Semi
Ethernet works!

I will test just the first patch tomorrow.

Thanks,
Christian

On 21 January 2019 at 3:38PM, Christian Zigotzky wrote:
> Hello Christoph,
>
> Thanks for your reply. I successfully compiled a kernel (uImage) for
> the X5000 from your Git branch 'powerpc-dma.6-debug' (both patches)
> today. It detects the SATA hard disk drive and boots without any
> problems.
Re: [PATCH] iommu/mediatek: Use correct fwspec in mtk_iommu_add_device()
Tested-by: Frank Wunderlich
Re: [PATCH v7 0/7] Add virtio-iommu driver
Hi Jean-Philippe,

thanks for all your hard work on this!

On Tue, Jan 15, 2019 at 12:19:52PM +0000, Jean-Philippe Brucker wrote:
> Implement the virtio-iommu driver, following specification v0.9 [1].

To make progress on this, I think the spec needs to be close to
something that can be included in the official virtio specification.
Have you proposed the specification for inclusion there?

I ask because I can't merge a driver that might be incompatible with
future implementations, should the specification need to change on its
way to becoming an official standard.

I had a short discussion with Michael S. Tsirkin about this, and from
what I understood the spec needs to be proposed for inclusion on the
virtio-comment [1] mailing list, and later the TC needs to vote on it.
Please work with Michael on this to get the specification official (or
at least pretty close to something that will be part of the official
virtio standard).

Regards, Joerg

[1] https://www.oasis-open.org/committees/comments/index.php?wg_abbrev=virtio