Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-24 Thread j...@8bytes.org
Hi Suravee,

On Thu, Jan 24, 2019 at 03:25:19AM +0000, Suthikulpanit, Suravee wrote:
> Actually, I just noticed that device_flush_dte() already handles flushing
> the DTE for the alias device as well. Please see the link below.
> 
> https://elixir.bootlin.com/linux/latest/source/drivers/iommu/amd_iommu.c#L1219
> 
> So, we don't need to call device_flush_dte() for the alias device in do_detach().

You are right, I missed that.

Thanks,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-23 Thread Suthikulpanit, Suravee
Joerg,

On 1/23/19 2:56 PM, j...@8bytes.org wrote:
> Hi Suravee,
> 
> On Tue, Jan 22, 2019 at 03:53:18PM +0000, Suthikulpanit, Suravee wrote:
>> Thanks for the detail. Alright then, let's just go with the version you
>> sent on 1/16/19. Do you want me to resend a v3 with those changes, or
>> would you be taking care of that?
> 
> Please send me a v3 based on the diff I sent last week. Also add a
> separate patch which adds the missing dte flush for the alias entry.
> 
> Thanks,
> 
>   Joerg
> 

Actually, I just noticed that device_flush_dte() already handles flushing
the DTE for the alias device as well. Please see the link below.

https://elixir.bootlin.com/linux/latest/source/drivers/iommu/amd_iommu.c#L1219

So, we don't need to call device_flush_dte() for the alias device in do_detach().

Regards,
Suravee


Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-22 Thread j...@8bytes.org
Hi Suravee,

On Tue, Jan 22, 2019 at 03:53:18PM +0000, Suthikulpanit, Suravee wrote:
> Thanks for the detail. Alright then, let's just go with the version you
> sent on 1/16/19. Do you want me to resend a v3 with those changes, or
> would you be taking care of that?

Please send me a v3 based on the diff I sent last week. Also add a
separate patch which adds the missing dte flush for the alias entry.

Thanks,

Joerg


Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-22 Thread Suthikulpanit, Suravee
Joerg,

On 1/22/19 5:44 PM, j...@8bytes.org wrote:
> Hi Suravee,
> 
> On Thu, Jan 17, 2019 at 08:44:36AM +0000, Suthikulpanit, Suravee wrote:
>> Then, in __domain_flush_pages, we issue the command when dev_iommu[i] >= 0.
>> This should preserve the previous behavior, and only adds a flushing condition
>> for the specific IOMMU in the detached state. Please let me know what you think.
> 
> I think the whole point of VFIO detaching all devices first and only then
> unmapping pages is that no flushes are needed anymore once the devices
> are detached.
> 
> But with your proposal we would still do IOTLB flushes even when no
> devices are attached anymore. So I'd like to avoid this, given the
> implications for unmapping performance. We should just flush the
> IOMMU TLB at detach time and be done with it.
> 
> Makes sense?
> 
> Regards,
> 
>   Joerg
> 

Thanks for the detail. Alright then, let's just go with the version you
sent on 1/16/19. Do you want me to resend a v3 with those changes, or
would you be taking care of that?

Thanks,
Suravee


Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-22 Thread j...@8bytes.org
Hi Suravee,

On Thu, Jan 17, 2019 at 08:44:36AM +0000, Suthikulpanit, Suravee wrote:
> Then, in __domain_flush_pages, we issue the command when dev_iommu[i] >= 0.
> This should preserve the previous behavior, and only adds a flushing condition
> for the specific IOMMU in the detached state. Please let me know what you think.

I think the whole point of VFIO detaching all devices first and only then
unmapping pages is that no flushes are needed anymore once the devices
are detached.

But with your proposal we would still do IOTLB flushes even when no
devices are attached anymore. So I'd like to avoid this, given the
implications for unmapping performance. We should just flush the
IOMMU TLB at detach time and be done with it.

Makes sense?

Regards,

Joerg


Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-17 Thread Suthikulpanit, Suravee
Joerg,

On 1/17/19 3:44 PM, Suravee Suthikulpanit wrote:
> Joerg,
> 
> On 1/17/19 12:08 AM, j...@8bytes.org wrote:
>> On Wed, Jan 16, 2019 at 02:08:55PM +0000, Suthikulpanit, Suravee wrote:
>>> Actually, I am not sure how we would be missing the flush on the last 
>>> device.
>>> In my test, I am seeing the flush command being issued correctly during
>>> vfio_unmap_unpin(), which is after all devices are detached.
>>> Although I might be missing your point here. Could you please elaborate?
>>
>> Okay, you are right, the patch effectively adds an unconditional flush of
>> the domain on all IOMMUs when the last device is detached. So it is
>> correct in that regard. But that code path is also used in the
>> iommu_unmap() path.
>>
>> The problem now is that with your change we send flush commands to all
>> IOMMUs in the unmap path even when no device is attached to the domain.
>> This will hurt performance there, no?
>>
>> Regards,
>>
>> Joerg
>>
> 
> Sounds like we need a way to track the state of each IOMMU for a domain.
> What if we define the following:
> 
>    enum IOMMU_DOMAIN_STATES {
>        DOMAIN_FREE = -1,
>        DOMAIN_DETACHED = 0,
>        DOMAIN_ATTACHED >= 1
>    }
> 
> We should be able to use the dev_iommu[] to help track the state.
>    - In amd_iommu_domain_alloc(), we initialize the array to DOMAIN_FREE.
>    - In do_attach(), we change to DOMAIN_ATTACHED, or increment the count
>      if it is already in the DOMAIN_ATTACHED state.
>    - In do_detach(), we change to DOMAIN_DETACHED.
> 
> Then, in __domain_flush_pages, we issue the command when dev_iommu[i] >= 0.
> This should preserve the previous behavior, and only adds a flushing condition
> for the specific IOMMU in the detached state. Please let me know what you think.
> 
> Regards,
> Suravee

By the way, I just sent V2 of this patch since it might be more clear.

Regards,
Suravee

Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-17 Thread Suthikulpanit, Suravee
Joerg,

On 1/17/19 12:08 AM, j...@8bytes.org wrote:
> On Wed, Jan 16, 2019 at 02:08:55PM +0000, Suthikulpanit, Suravee wrote:
>> Actually, I am not sure how we would be missing the flush on the last device.
>> In my test, I am seeing the flush command being issued correctly during
>> vfio_unmap_unpin(), which is after all devices are detached.
>> Although I might be missing your point here. Could you please elaborate?
> 
> Okay, you are right, the patch effectively adds an unconditional flush of
> the domain on all IOMMUs when the last device is detached. So it is
> correct in that regard. But that code path is also used in the
> iommu_unmap() path.
> 
> The problem now is that with your change we send flush commands to all
> IOMMUs in the unmap path even when no device is attached to the domain.
> This will hurt performance there, no?
> 
> Regards,
> 
>   Joerg
> 

Sounds like we need a way to track the state of each IOMMU for a domain.
What if we define the following:

   enum IOMMU_DOMAIN_STATES {
       DOMAIN_FREE = -1,
       DOMAIN_DETACHED = 0,
       DOMAIN_ATTACHED >= 1
   }

We should be able to use the dev_iommu[] to help track the state.
   - In amd_iommu_domain_alloc(), we initialize the array to DOMAIN_FREE.
   - In do_attach(), we change to DOMAIN_ATTACHED, or increment the count
     if it is already in the DOMAIN_ATTACHED state.
   - In do_detach(), we change to DOMAIN_DETACHED.

Then, in __domain_flush_pages, we issue the command when dev_iommu[i] >= 0.
This should preserve the previous behavior, and only adds a flushing condition
for the specific IOMMU in the detached state. Please let me know what you think.

Regards,
Suravee


Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-16 Thread j...@8bytes.org
On Wed, Jan 16, 2019 at 02:40:59PM +0000, Suthikulpanit, Suravee wrote:
> Actually, device_flush_dte(alias) should be needed regardless of this patch.
> Are you planning to add this?

Yes, I stumbled over this while writing the diff. I'll submit that as a
separate patch.

Thanks,

Joerg


Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-16 Thread j...@8bytes.org
On Wed, Jan 16, 2019 at 02:08:55PM +0000, Suthikulpanit, Suravee wrote:
> Actually, I am not sure how we would be missing the flush on the last device.
> In my test, I am seeing the flush command being issued correctly during
> vfio_unmap_unpin(), which is after all devices are detached.
> Although I might be missing your point here. Could you please elaborate?

Okay, you are right, the patch effectively adds an unconditional flush of
the domain on all IOMMUs when the last device is detached. So it is
correct in that regard. But that code path is also used in the
iommu_unmap() path.

The problem now is that with your change we send flush commands to all
IOMMUs in the unmap path even when no device is attached to the domain.
This will hurt performance there, no?

Regards,

Joerg



Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-16 Thread Suthikulpanit, Suravee
Joerg,

On 1/16/19 8:26 PM, j...@8bytes.org wrote:
> How about the attached diff? If
> I understand the problem correctly, it should fix the problem more
> reliably.
> 
> Thanks,
> 
>   Joerg
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 87ba23a75b38..dc1e2a8a19d7 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -1991,25 +1991,36 @@ static void do_attach(struct iommu_dev_data *dev_data,
>   
>   static void do_detach(struct iommu_dev_data *dev_data)
>   {
> + struct protection_domain *domain = dev_data->domain;
>   struct amd_iommu *iommu;
>   u16 alias;
>   
>   iommu = amd_iommu_rlookup_table[dev_data->devid];
>   alias = dev_data->alias;
>   
> - /* decrease reference counters */
> - dev_data->domain->dev_iommu[iommu->index] -= 1;
> - dev_data->domain->dev_cnt -= 1;
> -
>   /* Update data structures */
>   dev_data->domain = NULL;
> list_del(&dev_data->list);
> - clear_dte_entry(dev_data->devid);
> - if (alias != dev_data->devid)
> - clear_dte_entry(alias);
>   
> + clear_dte_entry(dev_data->devid);
>   /* Flush the DTE entry */
>   device_flush_dte(dev_data);
> +
> + if (alias != dev_data->devid) {
> + clear_dte_entry(alias);
> + /* Flush the Alias DTE entry */
> + device_flush_dte(alias);
> + }
> +

Actually, device_flush_dte(alias) should be needed regardless of this patch.
Are you planning to add this?

Regards,
Suravee


Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-16 Thread Suthikulpanit, Suravee
Joerg,

On 1/16/19 8:26 PM, j...@8bytes.org wrote:
> On Wed, Jan 16, 2019 at 04:16:25AM +0000, Suthikulpanit, Suravee wrote:
>> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
>> index 525659b88ade..ab31ba75da1b 100644
>> --- a/drivers/iommu/amd_iommu.c
>> +++ b/drivers/iommu/amd_iommu.c
>> @@ -1248,7 +1248,13 @@ static void __domain_flush_pages(struct protection_domain *domain,
>>  build_inv_iommu_pages(&cmd, address, size, domain->id, pde);
>>   
>>  for (i = 0; i < amd_iommu_get_num_iommus(); ++i) {
>> -if (!domain->dev_iommu[i])
>> +/*
>> + * The dev_cnt is zero when all devices are detached
>> + * from the domain. This is the case when VFIO detaches
>> + * all devices from the group before flushing IOMMU pages.
>> + * So, always issue the flush command.
>> + */
>> +if (domain->dev_cnt && !domain->dev_iommu[i])
>>  continue;
> 
> This doesn't look like the correct fix. We still miss the flush if we
> detach the last device from the domain. 

Actually, I am not sure how we would be missing the flush on the last device.
In my test, I am seeing the flush command being issued correctly during
vfio_unmap_unpin(), which is after all devices are detached.
Although I might be missing your point here. Could you please elaborate?

> How about the attached diff? If
> I understand the problem correctly, it should fix the problem more
> reliably.
> 
> Thanks,
> 
>   Joerg
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 87ba23a75b38..dc1e2a8a19d7 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -1991,25 +1991,36 @@ static void do_attach(struct iommu_dev_data *dev_data,
>   
>   static void do_detach(struct iommu_dev_data *dev_data)
>   {
> + struct protection_domain *domain = dev_data->domain;
>   struct amd_iommu *iommu;
>   u16 alias;
>   
>   iommu = amd_iommu_rlookup_table[dev_data->devid];
>   alias = dev_data->alias;
>   
> - /* decrease reference counters */
> - dev_data->domain->dev_iommu[iommu->index] -= 1;
> - dev_data->domain->dev_cnt -= 1;
> -
>   /* Update data structures */
>   dev_data->domain = NULL;
> list_del(&dev_data->list);
> - clear_dte_entry(dev_data->devid);
> - if (alias != dev_data->devid)
> - clear_dte_entry(alias);
>   
> + clear_dte_entry(dev_data->devid);
>   /* Flush the DTE entry */
>   device_flush_dte(dev_data);
> +
> + if (alias != dev_data->devid) {
> + clear_dte_entry(alias);
> + /* Flush the Alias DTE entry */
> + device_flush_dte(alias);
> + }
> +
> + /* Flush IOTLB */
> + domain_flush_tlb_pde(domain);
> +
> + /* Wait for the flushes to finish */
> + domain_flush_complete(domain);
> +
> + /* decrease reference counters - needs to happen after the flushes */
> + domain->dev_iommu[iommu->index] -= 1;
> + domain->dev_cnt -= 1;
>   }

I have also considered this, and it would also work. But since we are already
doing page flushes during page unmapping later on, after all devices are
detached, would this still be necessary? Please see vfio_iommu_type1_detach_group().

Also, consider the case where more than one device shares the domain. This
would issue a page flush every time we detach a device, even while other
devices are still attached to the domain.

Regards,
Suravee



Re: [PATCH] iommu/amd: Fix IOMMU page flush when detach all devices from a domain

2019-01-16 Thread j...@8bytes.org
On Wed, Jan 16, 2019 at 04:16:25AM +0000, Suthikulpanit, Suravee wrote:
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 525659b88ade..ab31ba75da1b 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -1248,7 +1248,13 @@ static void __domain_flush_pages(struct protection_domain *domain,
>   build_inv_iommu_pages(&cmd, address, size, domain->id, pde);
>  
>   for (i = 0; i < amd_iommu_get_num_iommus(); ++i) {
> - if (!domain->dev_iommu[i])
> + /*
> +  * The dev_cnt is zero when all devices are detached
> +  * from the domain. This is the case when VFIO detaches
> +  * all devices from the group before flushing IOMMU pages.
> +  * So, always issue the flush command.
> +  */
> + if (domain->dev_cnt && !domain->dev_iommu[i])
>   continue;

This doesn't look like the correct fix. We still miss the flush if we
detach the last device from the domain. How about the attached diff? If
I understand the problem correctly, it should fix the problem more
reliably.

Thanks,

Joerg

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 87ba23a75b38..dc1e2a8a19d7 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1991,25 +1991,36 @@ static void do_attach(struct iommu_dev_data *dev_data,
 
 static void do_detach(struct iommu_dev_data *dev_data)
 {
+   struct protection_domain *domain = dev_data->domain;
struct amd_iommu *iommu;
u16 alias;
 
iommu = amd_iommu_rlookup_table[dev_data->devid];
alias = dev_data->alias;
 
-   /* decrease reference counters */
-   dev_data->domain->dev_iommu[iommu->index] -= 1;
-   dev_data->domain->dev_cnt -= 1;
-
/* Update data structures */
dev_data->domain = NULL;
	list_del(&dev_data->list);
-   clear_dte_entry(dev_data->devid);
-   if (alias != dev_data->devid)
-   clear_dte_entry(alias);
 
+   clear_dte_entry(dev_data->devid);
/* Flush the DTE entry */
device_flush_dte(dev_data);
+
+   if (alias != dev_data->devid) {
+   clear_dte_entry(alias);
+   /* Flush the Alias DTE entry */
+   device_flush_dte(alias);
+   }
+
+   /* Flush IOTLB */
+   domain_flush_tlb_pde(domain);
+
+   /* Wait for the flushes to finish */
+   domain_flush_complete(domain);
+
+   /* decrease reference counters - needs to happen after the flushes */
+   domain->dev_iommu[iommu->index] -= 1;
+   domain->dev_cnt -= 1;
 }
 
 /*