Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-05-25 Thread Roy Franz (Cavium)
On Tue, May 23, 2017 at 4:21 AM, Jean-Philippe Brucker
 wrote:
> On 23/05/17 09:41, Leizhen (ThunderTown) wrote:
>> On 2017/2/28 3:54, Jean-Philippe Brucker wrote:
>>> PCIe devices can implement their own TLB, named Address Translation Cache
>>> (ATC). Steps involved in the use and maintenance of such caches are:
>>>
>>> * Device sends an Address Translation Request for a given IOVA to the
>>>   IOMMU. If the translation succeeds, the IOMMU returns the corresponding
>>>   physical address, which is stored in the device's ATC.
>>>
>>> * Device can then use the physical address directly in a transaction.
>>>   A PCIe device does so by setting the TLP AT field to 0b10 - translated.
>>>   The SMMU might check that the device is allowed to send translated
>>>   transactions, and let it pass through.
>>>
>>> * When an address is unmapped, CPU sends a CMD_ATC_INV command to the
>>>   SMMU, that is relayed to the device.
>>>
>>> In theory, this doesn't require a lot of software intervention. The IOMMU
>>> driver needs to enable ATS when adding a PCI device, and send an
>>> invalidation request when unmapping. Note that this invalidation is
>>> allowed to take up to a minute, according to the PCIe spec. In
>>> addition, the invalidation queue on the ATC side is fairly small, 32 by
>>> default, so we cannot keep many invalidations in flight (see ATS spec
>>> section 3.5, Invalidate Flow Control).
>>>
>>> Handling these constraints properly would require to postpone
>>> invalidations, and keep the stale mappings until we're certain that all
>>> devices forgot about them. This requires major work in the page table
>>> managers, and is therefore not done by this patch.
>>>
>>>   Range calculation
>>>   -
>>>
>>> The invalidation packet itself is a bit awkward: range must be naturally
>>> aligned, which means that the start address is a multiple of the range
>>> size. In addition, the size must be a power of two number of 4k pages. We
>>> have a few options to enforce this constraint:
>>>
>>> (1) Find the smallest naturally aligned region that covers the requested
>>> range. This is simple to compute and only takes one ATC_INV, but it
>>> will spill on lots of neighbouring ATC entries.
>>>
>>> (2) Align the start address to the region size (rounded up to a power of
>>> two), and send a second invalidation for the next range of the same
>>> size. Still not great, but reduces spilling.
>>>
>>> (3) Cover the range exactly with the smallest number of naturally aligned
>>> regions. This would be interesting to implement but as for (2),
>>> requires multiple ATC_INV.
>>>
>>> As I suspect ATC invalidation packets will be a very scarce resource,
>>> we'll go with option (1) for now, and only send one big invalidation.
>>>
>>> Note that with io-pgtable, the unmap function is called for each page, so
>>> this doesn't matter. The problem shows up when sharing page tables with
>>> the MMU.
>> Suppose this is true, I'd like to choose option (2). Because the worst cases 
>> of
>> both (1) and (2) will not be happened, but the code of (2) will look clearer.
>> And (2) is technically more acceptable.
>
> I agree that (2) is a bit clearer, but the question is of performance
> rather than readability. I'd like to see some benchmarks or experiment on
> my own before switching to a two-invalidation system.
>
> Intuitively one big invalidation will result in more ATC trashing and will
> bring overall device performance down. But then according to the PCI spec,
> ATC invalidations are grossly expensive, they have an upper bound of a
> minute. I agree that this is highly improbable and might depend on the
> range size, but purely from an architectural standpoint, reducing the
> number of ATC invalidation requests is the priority, because this is much
> worse than any performance slow-down incurred by ATC trashing. And for the
> moment I can only base my decisions on the architecture.
>
> So I'd like to keep (1) for now, and update it to (2) (or even (3)) once
> we have more hardware to experiment with.
>
> Thanks,
> Jean
>

I think (1) is a good place to start, as the same restricted encoding
that is used in
the invalidations is also used in the translation responses - all of
the ATC entries
were created with regions described this way.  We still may end up with nothing
but STU sized ATC entries, as TAs are free to respond to large
translation requests
with multiple STU sized translations, and in some cases this is the
best that they
can do.  Picking the optimal strategy will depend on hardware, and
maybe workload
as well.

Thanks,
Roy


>
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-05-23 Thread Jean-Philippe Brucker
On 23/05/17 09:41, Leizhen (ThunderTown) wrote:
> On 2017/2/28 3:54, Jean-Philippe Brucker wrote:
>> PCIe devices can implement their own TLB, named Address Translation Cache
>> (ATC). Steps involved in the use and maintenance of such caches are:
>>
>> * Device sends an Address Translation Request for a given IOVA to the
>>   IOMMU. If the translation succeeds, the IOMMU returns the corresponding
>>   physical address, which is stored in the device's ATC.
>>
>> * Device can then use the physical address directly in a transaction.
>>   A PCIe device does so by setting the TLP AT field to 0b10 - translated.
>>   The SMMU might check that the device is allowed to send translated
>>   transactions, and let it pass through.
>>
>> * When an address is unmapped, CPU sends a CMD_ATC_INV command to the
>>   SMMU, that is relayed to the device.
>>
>> In theory, this doesn't require a lot of software intervention. The IOMMU
>> driver needs to enable ATS when adding a PCI device, and send an
>> invalidation request when unmapping. Note that this invalidation is
>> allowed to take up to a minute, according to the PCIe spec. In
>> addition, the invalidation queue on the ATC side is fairly small, 32 by
>> default, so we cannot keep many invalidations in flight (see ATS spec
>> section 3.5, Invalidate Flow Control).
>>
>> Handling these constraints properly would require to postpone
>> invalidations, and keep the stale mappings until we're certain that all
>> devices forgot about them. This requires major work in the page table
>> managers, and is therefore not done by this patch.
>>
>>   Range calculation
>>   -
>>
>> The invalidation packet itself is a bit awkward: range must be naturally
>> aligned, which means that the start address is a multiple of the range
>> size. In addition, the size must be a power of two number of 4k pages. We
>> have a few options to enforce this constraint:
>>
>> (1) Find the smallest naturally aligned region that covers the requested
>> range. This is simple to compute and only takes one ATC_INV, but it
>> will spill on lots of neighbouring ATC entries.
>>
>> (2) Align the start address to the region size (rounded up to a power of
>> two), and send a second invalidation for the next range of the same
>> size. Still not great, but reduces spilling.
>>
>> (3) Cover the range exactly with the smallest number of naturally aligned
>> regions. This would be interesting to implement but as for (2),
>> requires multiple ATC_INV.
>>
>> As I suspect ATC invalidation packets will be a very scarce resource,
>> we'll go with option (1) for now, and only send one big invalidation.
>>
>> Note that with io-pgtable, the unmap function is called for each page, so
>> this doesn't matter. The problem shows up when sharing page tables with
>> the MMU.
> Suppose this is true, I'd like to choose option (2). Because the worst cases 
> of
> both (1) and (2) will not be happened, but the code of (2) will look clearer.
> And (2) is technically more acceptable.

I agree that (2) is a bit clearer, but the question is of performance
rather than readability. I'd like to see some benchmarks or experiment on
my own before switching to a two-invalidation system.

Intuitively one big invalidation will result in more ATC trashing and will
bring overall device performance down. But then according to the PCI spec,
ATC invalidations are grossly expensive, they have an upper bound of a
minute. I agree that this is highly improbable and might depend on the
range size, but purely from an architectural standpoint, reducing the
number of ATC invalidation requests is the priority, because this is much
worse than any performance slow-down incurred by ATC trashing. And for the
moment I can only base my decisions on the architecture.

So I'd like to keep (1) for now, and update it to (2) (or even (3)) once
we have more hardware to experiment with.

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-05-23 Thread Leizhen (ThunderTown)


On 2017/2/28 3:54, Jean-Philippe Brucker wrote:
> PCIe devices can implement their own TLB, named Address Translation Cache
> (ATC). Steps involved in the use and maintenance of such caches are:
> 
> * Device sends an Address Translation Request for a given IOVA to the
>   IOMMU. If the translation succeeds, the IOMMU returns the corresponding
>   physical address, which is stored in the device's ATC.
> 
> * Device can then use the physical address directly in a transaction.
>   A PCIe device does so by setting the TLP AT field to 0b10 - translated.
>   The SMMU might check that the device is allowed to send translated
>   transactions, and let it pass through.
> 
> * When an address is unmapped, CPU sends a CMD_ATC_INV command to the
>   SMMU, that is relayed to the device.
> 
> In theory, this doesn't require a lot of software intervention. The IOMMU
> driver needs to enable ATS when adding a PCI device, and send an
> invalidation request when unmapping. Note that this invalidation is
> allowed to take up to a minute, according to the PCIe spec. In
> addition, the invalidation queue on the ATC side is fairly small, 32 by
> default, so we cannot keep many invalidations in flight (see ATS spec
> section 3.5, Invalidate Flow Control).
> 
> Handling these constraints properly would require to postpone
> invalidations, and keep the stale mappings until we're certain that all
> devices forgot about them. This requires major work in the page table
> managers, and is therefore not done by this patch.
> 
>   Range calculation
>   -
> 
> The invalidation packet itself is a bit awkward: range must be naturally
> aligned, which means that the start address is a multiple of the range
> size. In addition, the size must be a power of two number of 4k pages. We
> have a few options to enforce this constraint:
> 
> (1) Find the smallest naturally aligned region that covers the requested
> range. This is simple to compute and only takes one ATC_INV, but it
> will spill on lots of neighbouring ATC entries.
> 
> (2) Align the start address to the region size (rounded up to a power of
> two), and send a second invalidation for the next range of the same
> size. Still not great, but reduces spilling.
> 
> (3) Cover the range exactly with the smallest number of naturally aligned
> regions. This would be interesting to implement but as for (2),
> requires multiple ATC_INV.
> 
> As I suspect ATC invalidation packets will be a very scarce resource,
> we'll go with option (1) for now, and only send one big invalidation.
> 
> Note that with io-pgtable, the unmap function is called for each page, so
> this doesn't matter. The problem shows up when sharing page tables with
> the MMU.
Suppose this is true, I'd like to choose option (2). Because the worst cases of
both (1) and (2) will not be happened, but the code of (2) will look clearer.
And (2) is technically more acceptable.

> 
>   Locking
>   ---
> 
> The atc_invalidate function is called from arm_smmu_unmap, with pgtbl_lock
> held (hardirq-safe). When sharing page tables with the MMU, we will have a
> few more call sites:
> 
> * When unbinding an address space from a device, to invalidate the whole
>   address space.
> * When a task bound to a device does an mlock, munmap, etc. This comes
>   from an MMU notifier, with mmap_sem and pte_lock held.
> 
> Given this, all locks take on the ATC invalidation path must be hardirq-
> safe.
> 
>   Timeout
>   ---
> 
> Some SMMU implementations will raise a CERROR_ATC_INV_SYNC when a CMD_SYNC
> fails because of an ATC invalidation. Some will just fail the CMD_SYNC.
> Others might let CMD_SYNC complete and have an asynchronous IMPDEF
> mechanism to record the error. When we receive a CERROR_ATC_INV_SYNC, we
> could retry sending all ATC_INV since last successful CMD_SYNC. When a
> CMD_SYNC fails without CERROR_ATC_INV_SYNC, we could retry sending *all*
> commands since last successful CMD_SYNC. This patch doesn't properly
> handle timeout, and ignores devices that don't behave. It might lead to
> memory corruption.
> 
>   Optional support
>   
> 
> For the moment, enable ATS whenever a device advertises it. Later, we
> might want to allow users to opt-in for the whole system or individual
> devices via sysfs or cmdline. Some firmware interfaces also provide a
> description of ATS capabilities in the root complex, and we might want to
> add a similar capability in DT. For instance, the following could be added
> to bindings/pci/pci-iommu.txt, as an optional property to PCI RC:
> 
> - ats-map: describe Address Translation Service support by the root
>   complex. This property is an arbitrary number of tuples of
>   (rid-base,length). Any RID in this interval is allowed to issue address
>   translation requests.
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  drivers/iommu/arm-smmu-v3.c | 262 
> ++--
>  1 file changed, 250 insertions(

Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-05-10 Thread Jean-Philippe Brucker
On 10/05/17 13:54, Tomasz Nowicki wrote:
> Hi Jean,
> 
> On 27.02.2017 20:54, Jean-Philippe Brucker wrote:
>> +/*
>> + * Returns -ENOSYS if ATS is not supported either by the device or by
>> the SMMU
>> + */
>> +static int arm_smmu_enable_ats(struct arm_smmu_master_data *master)
>> +{
>> +int ret;
>> +size_t stu;
>> +struct pci_dev *pdev;
>> +struct arm_smmu_device *smmu = master->smmu;
>> +
>> +if (!(smmu->features & ARM_SMMU_FEAT_ATS) || !dev_is_pci(master->dev))
>> +return -ENOSYS;
>> +
>> +pdev = to_pci_dev(master->dev);
>> +
>> +#ifdef CONFIG_PCI_ATS
>> +if (!pdev->ats_cap)
>> +return -ENOSYS;
>> +#else
>> +return -ENOSYS;
>> +#endif
> 
> Nit: This deserves to be another helper in ats.c like:
> 
> int pci_ats_supported(struct pci_dev *dev) {
> if (!pdev->ats_cap)
> return 0;
> 
> return 1;
> }

Indeed, although in my next version I'll remove this check altogether.
Instead I now rely on pci_enable_ats to check for ats_cap (as discussed in
patch 3). The downside is that we can't distinguish between absence of ATS
and error in enabling ATS. So we don't print a message in the latter case
anymore, we expect device drivers to check whether ATS is enabled.

Thanks,
Jean

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-05-10 Thread Tomasz Nowicki

Hi Jean,

On 27.02.2017 20:54, Jean-Philippe Brucker wrote:

+/*
+ * Returns -ENOSYS if ATS is not supported either by the device or by the SMMU
+ */
+static int arm_smmu_enable_ats(struct arm_smmu_master_data *master)
+{
+   int ret;
+   size_t stu;
+   struct pci_dev *pdev;
+   struct arm_smmu_device *smmu = master->smmu;
+
+   if (!(smmu->features & ARM_SMMU_FEAT_ATS) || !dev_is_pci(master->dev))
+   return -ENOSYS;
+
+   pdev = to_pci_dev(master->dev);
+
+#ifdef CONFIG_PCI_ATS
+   if (!pdev->ats_cap)
+   return -ENOSYS;
+#else
+   return -ENOSYS;
+#endif


Nit: This deserves to be another helper in ats.c like:

int pci_ats_supported(struct pci_dev *dev) {
if (!pdev->ats_cap)
return 0;

return 1;
}

Thanks,
Tomasz
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-04-03 Thread Jean-Philippe Brucker
On 03/04/17 12:42, Sunil Kovvuri wrote:
> On Mon, Apr 3, 2017 at 3:44 PM, Jean-Philippe Brucker
>  wrote:
>> On 03/04/17 09:34, Sunil Kovvuri wrote:
 +static size_t arm_smmu_atc_invalidate_domain(struct arm_smmu_domain 
 *smmu_domain,
 +unsigned long iova, size_t 
 size)
 +{
 +   unsigned long flags;
 +   struct arm_smmu_cmdq_ent cmd = {0};
 +   struct arm_smmu_group *smmu_group;
 +   struct arm_smmu_master_data *master;
 +   struct arm_smmu_device *smmu = smmu_domain->smmu;
 +   struct arm_smmu_cmdq_ent sync_cmd = {
 +   .opcode = CMDQ_OP_CMD_SYNC,
 +   };
 +
 +   spin_lock_irqsave(&smmu_domain->groups_lock, flags);
 +
 +   list_for_each_entry(smmu_group, &smmu_domain->groups, domain_head) 
 {
 +   if (!smmu_group->ats_enabled)
 +   continue;
>>>
>>> If ATS is not supported, this seems to increase no of cycles spent in
>>> pgtbl_lock.
>>> Can we return from this API by checking 'ARM_SMMU_FEAT_ATS' in 
>>> smmu->features ?
>>
>> Sure, I can add a check before taking the lock. Have you been able to
>> observe a significant difference in cycles between checking FEAT_ATS,
>> checking group->ats_enabled after taking the lock, and removing this
>> function call altogether?
>>
>> Thanks,
>> Jean-Philippe
> 
> No, I haven't verified, was just making an observation.

Fair enough, I think avoiding the lock when ATS isn't in use makes sense.

Thanks,
Jean-Philippe

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-04-03 Thread Sunil Kovvuri
On Mon, Apr 3, 2017 at 3:44 PM, Jean-Philippe Brucker
 wrote:
> On 03/04/17 09:34, Sunil Kovvuri wrote:
>>> +static size_t arm_smmu_atc_invalidate_domain(struct arm_smmu_domain 
>>> *smmu_domain,
>>> +unsigned long iova, size_t 
>>> size)
>>> +{
>>> +   unsigned long flags;
>>> +   struct arm_smmu_cmdq_ent cmd = {0};
>>> +   struct arm_smmu_group *smmu_group;
>>> +   struct arm_smmu_master_data *master;
>>> +   struct arm_smmu_device *smmu = smmu_domain->smmu;
>>> +   struct arm_smmu_cmdq_ent sync_cmd = {
>>> +   .opcode = CMDQ_OP_CMD_SYNC,
>>> +   };
>>> +
>>> +   spin_lock_irqsave(&smmu_domain->groups_lock, flags);
>>> +
>>> +   list_for_each_entry(smmu_group, &smmu_domain->groups, domain_head) {
>>> +   if (!smmu_group->ats_enabled)
>>> +   continue;
>>
>> If ATS is not supported, this seems to increase no of cycles spent in
>> pgtbl_lock.
>> Can we return from this API by checking 'ARM_SMMU_FEAT_ATS' in 
>> smmu->features ?
>
> Sure, I can add a check before taking the lock. Have you been able to
> observe a significant difference in cycles between checking FEAT_ATS,
> checking group->ats_enabled after taking the lock, and removing this
> function call altogether?
>
> Thanks,
> Jean-Philippe

No, I haven't verified, was just making an observation.

Thanks,
Sunil.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-04-03 Thread Jean-Philippe Brucker
On 03/04/17 09:34, Sunil Kovvuri wrote:
>> +static size_t arm_smmu_atc_invalidate_domain(struct arm_smmu_domain 
>> *smmu_domain,
>> +unsigned long iova, size_t size)
>> +{
>> +   unsigned long flags;
>> +   struct arm_smmu_cmdq_ent cmd = {0};
>> +   struct arm_smmu_group *smmu_group;
>> +   struct arm_smmu_master_data *master;
>> +   struct arm_smmu_device *smmu = smmu_domain->smmu;
>> +   struct arm_smmu_cmdq_ent sync_cmd = {
>> +   .opcode = CMDQ_OP_CMD_SYNC,
>> +   };
>> +
>> +   spin_lock_irqsave(&smmu_domain->groups_lock, flags);
>> +
>> +   list_for_each_entry(smmu_group, &smmu_domain->groups, domain_head) {
>> +   if (!smmu_group->ats_enabled)
>> +   continue;
> 
> If ATS is not supported, this seems to increase no of cycles spent in
> pgtbl_lock.
> Can we return from this API by checking 'ARM_SMMU_FEAT_ATS' in smmu->features 
> ?

Sure, I can add a check before taking the lock. Have you been able to
observe a significant difference in cycles between checking FEAT_ATS,
checking group->ats_enabled after taking the lock, and removing this
function call altogether?

Thanks,
Jean-Philippe
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-04-03 Thread Sunil Kovvuri
> +static size_t arm_smmu_atc_invalidate_domain(struct arm_smmu_domain 
> *smmu_domain,
> +unsigned long iova, size_t size)
> +{
> +   unsigned long flags;
> +   struct arm_smmu_cmdq_ent cmd = {0};
> +   struct arm_smmu_group *smmu_group;
> +   struct arm_smmu_master_data *master;
> +   struct arm_smmu_device *smmu = smmu_domain->smmu;
> +   struct arm_smmu_cmdq_ent sync_cmd = {
> +   .opcode = CMDQ_OP_CMD_SYNC,
> +   };
> +
> +   spin_lock_irqsave(&smmu_domain->groups_lock, flags);
> +
> +   list_for_each_entry(smmu_group, &smmu_domain->groups, domain_head) {
> +   if (!smmu_group->ats_enabled)
> +   continue;

If ATS is not supported, this seems to increase no of cycles spent in
pgtbl_lock.
Can we return from this API by checking 'ARM_SMMU_FEAT_ATS' in smmu->features ?

Thanks,
Sunil.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-03-21 Thread Jean-Philippe Brucker
On 08/03/17 15:26, Sinan Kaya wrote:
> On 2/27/2017 2:54 PM, Jean-Philippe Brucker wrote:
>> +ats_enabled = !arm_smmu_enable_ats(master);
>> +
> 
> You should make ats_supported field in IORT table part of the decision
> process for when to enable ATS.
> 

Agreed. I will also draft a proposal for adding the ATS property to PCI
host controller DT bindings, it seems to be missing.

Thanks,
Jean-Philippe
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-03-08 Thread Sinan Kaya
On 2/27/2017 2:54 PM, Jean-Philippe Brucker wrote:
> + ats_enabled = !arm_smmu_enable_ats(master);
> +

You should make ats_supported field in IORT table part of the decision
process for when to enable ATS.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-03-02 Thread okaya

On 2017-03-02 05:51, Jean-Philippe Brucker wrote:

Hi Sinan,

On 01/03/17 19:24, Sinan Kaya wrote:

On 2/27/2017 2:54 PM, Jean-Philippe Brucker wrote:

/* Initialise command lazily */
+   if (!cmd.opcode)
+   arm_smmu_atc_invalidate_to_cmd(smmu, iova, size, &cmd);
+
+   spin_lock(&smmu_group->devices_lock);
+
+   list_for_each_entry(master, &smmu_group->devices, group_head)
+   arm_smmu_atc_invalidate_master(master, &cmd);
+
+   /*
+* TODO: ensure we do a sync whenever we have sent 
ats_queue_depth
+* invalidations to the same device.
+*/
+   arm_smmu_cmdq_issue_cmd(smmu, &sync_cmd);
+


It is possible to observe ATS invalidation timeout up to 90 seconds 
according to PCIe

spec. How does the current code deal with this?



Currently we give up waiting for sync after 100us 
(ARM_SMMU_POLL_TIMEOUT_US).

A successful sync guarantees that all ATC invalidations since last sync
were successful, so in case of timeout we should resend all of them.
The delay itself isn't important at the moment, since we don't handle
invalidaton failure at all. It's fire and forget, and I haven't come up
with a complete solution yet.

The simplest error handling would be to retry invalidation after 90
seconds if the sync didn't complete. Then after a number of failed
attempts, maybe try to reset the device. Given that ats_invalidate is
generally called in irq-safe context, we would be blocking the CPU for
minutes at a time, which seems unwise. A proper solution would be to
postpone the unmap and return an error, although unmap errors are
usually ignored.

To avoid letting anyone remap something at that address until we're 
sure

the invalidation succeeded, we would need to repopulate the page tables
with the stale mapping, and add a delayed work that inspects the status
of the invalidation and tries again if it failed. If the invalidation
comes from the IOMMU core, we control the page tables and it should be
doable. If it comes from mm/ however, it's more complicated. MMU
notifiers only tell us that the mapping is going away, they don't
provide us with a way to hold on to them. Until all stale mappings have
been invalidated, we also need to hold on to the address space.

I think the problem is complex enough to deserve a series of its own,
once we confirm that it may happen in hardware and have a rough idea of
the timeout values encountered.


Ok, fair enough. I think arm smmuv3 driver should follow the same design 
pattern other iommu drivers are following to solve this issue as other 
drivers are already handling this.


Based on what I see, if there is a timeout happenibg; your sync 
operation will not complete (consumer! = producer) until timeout 
finishes.






Thanks,
Jean-Philippe

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-03-02 Thread Jean-Philippe Brucker
Hi Sinan,

On 01/03/17 19:24, Sinan Kaya wrote:
> On 2/27/2017 2:54 PM, Jean-Philippe Brucker wrote:
>> /* Initialise command lazily */
>> +if (!cmd.opcode)
>> +arm_smmu_atc_invalidate_to_cmd(smmu, iova, size, &cmd);
>> +
>> +spin_lock(&smmu_group->devices_lock);
>> +
>> +list_for_each_entry(master, &smmu_group->devices, group_head)
>> +arm_smmu_atc_invalidate_master(master, &cmd);
>> +
>> +/*
>> + * TODO: ensure we do a sync whenever we have sent 
>> ats_queue_depth
>> + * invalidations to the same device.
>> + */
>> +arm_smmu_cmdq_issue_cmd(smmu, &sync_cmd);
>> +
> 
> It is possible to observe ATS invalidation timeout up to 90 seconds according 
> to PCIe
> spec. How does the current code deal with this?
> 

Currently we give up waiting for sync after 100us (ARM_SMMU_POLL_TIMEOUT_US).
A successful sync guarantees that all ATC invalidations since last sync
were successful, so in case of timeout we should resend all of them.
The delay itself isn't important at the moment, since we don't handle
invalidaton failure at all. It's fire and forget, and I haven't come up
with a complete solution yet.

The simplest error handling would be to retry invalidation after 90
seconds if the sync didn't complete. Then after a number of failed
attempts, maybe try to reset the device. Given that ats_invalidate is
generally called in irq-safe context, we would be blocking the CPU for
minutes at a time, which seems unwise. A proper solution would be to
postpone the unmap and return an error, although unmap errors are
usually ignored.

To avoid letting anyone remap something at that address until we're sure
the invalidation succeeded, we would need to repopulate the page tables
with the stale mapping, and add a delayed work that inspects the status
of the invalidation and tries again if it failed. If the invalidation
comes from the IOMMU core, we control the page tables and it should be
doable. If it comes from mm/ however, it's more complicated. MMU
notifiers only tell us that the mapping is going away, they don't
provide us with a way to hold on to them. Until all stale mappings have
been invalidated, we also need to hold on to the address space.

I think the problem is complex enough to deserve a series of its own,
once we confirm that it may happen in hardware and have a rough idea of
the timeout values encountered.

Thanks,
Jean-Philippe
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS

2017-03-01 Thread Sinan Kaya
On 2/27/2017 2:54 PM, Jean-Philippe Brucker wrote:
> /* Initialise command lazily */
> + if (!cmd.opcode)
> + arm_smmu_atc_invalidate_to_cmd(smmu, iova, size, &cmd);
> +
> + spin_lock(&smmu_group->devices_lock);
> +
> + list_for_each_entry(master, &smmu_group->devices, group_head)
> + arm_smmu_atc_invalidate_master(master, &cmd);
> +
> + /*
> +  * TODO: ensure we do a sync whenever we have sent 
> ats_queue_depth
> +  * invalidations to the same device.
> +  */
> + arm_smmu_cmdq_issue_cmd(smmu, &sync_cmd);
> +

It is possible to observe ATS invalidation timeout up to 90 seconds according 
to PCIe
spec. How does the current code deal with this?

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu