RE: iommu/amd: flush IOTLB for specific domains only (v2)

2018-05-15 Thread Nath, Arindam


> -Original Message-
> From: Joseph Salisbury [mailto:joseph.salisb...@canonical.com]
> Sent: Tuesday, May 15, 2018 5:40 PM
> To: Nath, Arindam <arindam.n...@amd.com>
> Cc: io...@lists.linux-foundation.org; Bridgman, John
> <john.bridg...@amd.com>; j...@8bytes.org; amd-
> g...@lists.freedesktop.org; dr...@endlessm.com; stein...@gmail.com;
> Suthikulpanit, Suravee <suravee.suthikulpa...@amd.com>; Deucher,
> Alexander <alexander.deuc...@amd.com>; Kuehling, Felix
> <felix.kuehl...@amd.com>; li...@endlessm.com; mic...@daenzer.net;
> 1747...@bugs.launchpad.net; Lendacky, Thomas
> <thomas.lenda...@amd.com>
> Subject: Re: iommu/amd: flush IOTLB for specific domains only (v2)
> 
> On 05/15/2018 04:03 AM, Nath, Arindam wrote:
> > Adding Tom.
> >
> > Hi Joe,
> >
> > My original patch was never accepted. Tom and Joerg worked on another
> patch series which was supposed to fix the issue in question in addition to do
> some code cleanups. I believe their patches are already in the mainline. If I
> remember correctly, one of the patches disabled PCI ATS for the graphics
> card which was causing the issue.
> >
> > Do you still see the issue with latest mainline kernel?
> >
> > BR,
> > Arindam
> >
> > -Original Message-
> > From: Joseph Salisbury [mailto:joseph.salisb...@canonical.com]
> > Sent: Tuesday, May 15, 2018 1:17 AM
> > To: Nath, Arindam <arindam.n...@amd.com>
> > Cc: io...@lists.linux-foundation.org; Bridgman, John
> > <john.bridg...@amd.com>; j...@8bytes.org;
> > amd-gfx@lists.freedesktop.org; dr...@endlessm.com;
> stein...@gmail.com;
> > Suthikulpanit, Suravee <suravee.suthikulpa...@amd.com>; Deucher,
> > Alexander <alexander.deuc...@amd.com>; Kuehling, Felix
> > <felix.kuehl...@amd.com>; li...@endlessm.com; mic...@daenzer.net;
> > 1747...@bugs.launchpad.net
> > Subject: iommu/amd: flush IOTLB for specific domains only (v2)
> >
> > Hello Arindam,
> >
> > There is a bug report[0] that you created a patch[1] for a while back.
> However, the patch never landed in mainline.  There is a bug reporter in
> Ubuntu[2] that is affected by this bug and is willing to test the patch.  I
> attempted to build a test kernel with the patch, but it does not apply to
> currently mainline cleanly.  Do you still think this patch may resolve this
> bug?  If so, is there a version of your patch available that will apply to 
> current
> mainline?
> >
> > Thanks,
> >
> > Joe
> >
> > [0] https://bugs.freedesktop.org/show_bug.cgi?id=101029
> > [1] https://patchwork.freedesktop.org/patch/157327/
> > [2] http://pad.lv/1747463
> >
> Hi Arindam,
> 
> Thanks for the feedback.  Yes, the latest mainline kernel was tested, and it 
> is
> reported the bug still happens in the Ubuntu kernel bug[0]. Is there any
> specific diagnostic info we can collect that might help?

Joe, I believe all the information needed is already provided in [2]. Let us 
wait for inputs from Tom and Joerg.

I could take a look at the issue locally, but it will take me some really long 
time since I am occupied with other assignments right now.

BR,
Arindam

> 
> Thanks,
> 
> Joe
> 
> [0] http://pad.lv/1747463
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: iommu/amd: flush IOTLB for specific domains only (v2)

2018-05-15 Thread Nath, Arindam
Adding Tom.

Hi Joe,

My original patch was never accepted. Tom and Joerg worked on another patch 
series which was supposed to fix the issue in question in addition to do some 
code cleanups. I believe their patches are already in the mainline. If I 
remember correctly, one of the patches disabled PCI ATS for the graphics card 
which was causing the issue.

Do you still see the issue with latest mainline kernel?

BR,
Arindam

-Original Message-
From: Joseph Salisbury [mailto:joseph.salisb...@canonical.com] 
Sent: Tuesday, May 15, 2018 1:17 AM
To: Nath, Arindam <arindam.n...@amd.com>
Cc: io...@lists.linux-foundation.org; Bridgman, John <john.bridg...@amd.com>; 
j...@8bytes.org; amd-gfx@lists.freedesktop.org; dr...@endlessm.com; 
stein...@gmail.com; Suthikulpanit, Suravee <suravee.suthikulpa...@amd.com>; 
Deucher, Alexander <alexander.deuc...@amd.com>; Kuehling, Felix 
<felix.kuehl...@amd.com>; li...@endlessm.com; mic...@daenzer.net; 
1747...@bugs.launchpad.net
Subject: iommu/amd: flush IOTLB for specific domains only (v2)

Hello Arindam,

There is a bug report[0] that you created a patch[1] for a while back. However, 
the patch never landed in mainline.  There is a bug reporter in Ubuntu[2] that 
is affected by this bug and is willing to test the patch.  I attempted to build 
a test kernel with the patch, but it does not apply to currently mainline 
cleanly.  Do you still think this patch may resolve this bug?  If so, is there 
a version of your patch available that will apply to current mainline?

Thanks,

Joe

[0] https://bugs.freedesktop.org/show_bug.cgi?id=101029
[1] https://patchwork.freedesktop.org/patch/157327/
[2] http://pad.lv/1747463

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: disable coarse grain clockgating for ST

2018-01-29 Thread Nath, Arindam
>From: S, Shirish 
>Sent: Monday, January 29, 2018 4:37 PM
>To: Deucher, Alexander <alexander.deuc...@amd.com>; 
>amd-gfx@lists.freedesktop.org
>Cc: Nath, Arindam <arindam.n...@amd.com>
>Subject: Re: [PATCH] drm/amdgpu: disable coarse grain clockgating for ST
>
>CC:arindam.n...@amd.com

Reviewed-by: Arindam Nath <arindam.n...@amd.com>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] iommu/amd: flush IOTLB for specific domains only (v3)

2017-09-12 Thread Nath, Arindam
Hi Daniel,

>-Original Message-
>From: Daniel Drake [mailto:dr...@endlessm.com]
>Sent: Tuesday, September 12, 2017 12:20 PM
>To: Nath, Arindam <arindam.n...@amd.com>
>Cc: Joerg Roedel <j...@8bytes.org>; Lendacky, Thomas
><thomas.lenda...@amd.com>; io...@lists.linux-foundation.org; amd-
>g...@lists.freedesktop.org; Deucher, Alexander
><alexander.deuc...@amd.com>; Bridgman, John
><john.bridg...@amd.com>; Suthikulpanit, Suravee
><suravee.suthikulpa...@amd.com>; li...@endlessm.com; Craig Stein
><stein...@gmail.com>; mic...@daenzer.net; Kuehling, Felix
><felix.kuehl...@amd.com>; sta...@vger.kernel.org
>Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v3)
>
>Hi,
>
>On Tue, May 30, 2017 at 3:38 PM, Nath, Arindam <arindam.n...@amd.com>
>wrote:
>>>-----Original Message-
>>>From: Joerg Roedel [mailto:j...@8bytes.org]
>>>Sent: Monday, May 29, 2017 8:09 PM
>>>To: Nath, Arindam <arindam.n...@amd.com>; Lendacky, Thomas
>>><thomas.lenda...@amd.com>
>>>Cc: io...@lists.linux-foundation.org; amd-gfx@lists.freedesktop.org;
>>>Deucher, Alexander <alexander.deuc...@amd.com>; Bridgman, John
>>><john.bridg...@amd.com>; dr...@endlessm.com; Suthikulpanit,
>Suravee
>>><suravee.suthikulpa...@amd.com>; li...@endlessm.com; Craig Stein
>>><stein...@gmail.com>; mic...@daenzer.net; Kuehling, Felix
>>><felix.kuehl...@amd.com>; sta...@vger.kernel.org
>>>Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only
>(v3)
>>>
>>>Hi Arindam,
>>>
>>>I met Tom Lendacky last week in Nuremberg last week and he told me he is
>>>working on the same area of the code that this patch is for. His reason
>>>for touching this code was to solve some locking problems. Maybe you two
>>>can work together on a joint approach to improve this?
>>
>> Sure Joerg, I will work with Tom.
>
>What was the end result here? I see that the code has been reworked in
>4.13 so your original patch no longer applies. Is the reworked version
>also expected to solve the original issue?

Yes, if the reworked patch resolves the issue, my patch won't be required 
anymore.

Thanks,
Arindam

>
>Thanks
>Daniel
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] iommu/amd: flush IOTLB for specific domains only (v3)

2017-05-30 Thread Nath, Arindam
>-Original Message-
>From: Joerg Roedel [mailto:j...@8bytes.org]
>Sent: Monday, May 29, 2017 8:09 PM
>To: Nath, Arindam <arindam.n...@amd.com>; Lendacky, Thomas
><thomas.lenda...@amd.com>
>Cc: io...@lists.linux-foundation.org; amd-gfx@lists.freedesktop.org;
>Deucher, Alexander <alexander.deuc...@amd.com>; Bridgman, John
><john.bridg...@amd.com>; dr...@endlessm.com; Suthikulpanit, Suravee
><suravee.suthikulpa...@amd.com>; li...@endlessm.com; Craig Stein
><stein...@gmail.com>; mic...@daenzer.net; Kuehling, Felix
><felix.kuehl...@amd.com>; sta...@vger.kernel.org
>Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v3)
>
>Hi Arindam,
>
>I met Tom Lendacky last week in Nuremberg last week and he told me he is
>working on the same area of the code that this patch is for. His reason
>for touching this code was to solve some locking problems. Maybe you two
>can work together on a joint approach to improve this?

Sure Joerg, I will work with Tom.

Thanks.

>
>On Mon, May 22, 2017 at 01:18:01PM +0530, arindam.n...@amd.com wrote:
>> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
>> index 63cacf5..1edeebec 100644
>> --- a/drivers/iommu/amd_iommu.c
>> +++ b/drivers/iommu/amd_iommu.c
>> @@ -2227,15 +2227,26 @@ static struct iommu_group
>*amd_iommu_device_group(struct device *dev)
>>
>>  static void __queue_flush(struct flush_queue *queue)
>>  {
>> -struct protection_domain *domain;
>> -unsigned long flags;
>>  int idx;
>>
>> -/* First flush TLB of all known domains */
>> -spin_lock_irqsave(_iommu_pd_lock, flags);
>> -list_for_each_entry(domain, _iommu_pd_list, list)
>> -domain_flush_tlb(domain);
>> -spin_unlock_irqrestore(_iommu_pd_lock, flags);
>> +/* First flush TLB of all domains which were added to flush queue */
>> +for (idx = 0; idx < queue->next; ++idx) {
>> +struct flush_queue_entry *entry;
>> +
>> +entry = queue->entries + idx;
>> +
>> +/*
>> + * There might be cases where multiple IOVA entries for the
>> + * same domain are queued in the flush queue. To avoid
>> + * flushing the same domain again, we check whether the
>> + * flag is set or not. This improves the efficiency of
>> + * flush operation.
>> + */
>> +if (!entry->dma_dom->domain.already_flushed) {
>> +entry->dma_dom->domain.already_flushed = true;
>> +domain_flush_tlb(>dma_dom->domain);
>> +}
>> +}
>
>There is also a race condition here I have to look into. It is not
>introduced by your patch, but needs fixing anyway. I'll look into this
>too.
>
>
>Regards,
>
>   Joerg

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] iommu/amd: flush IOTLB for specific domains only (v2)

2017-05-21 Thread Nath, Arindam
>-Original Message-
>From: Jan Vesely [mailto:jv...@scarletmail.rutgers.edu] On Behalf Of Jan
>Vesely
>Sent: Friday, May 19, 2017 7:06 PM
>To: Nath, Arindam; io...@lists.linux-foundation.org
>Cc: Bridgman, John; Joerg Roedel; amd-gfx@lists.freedesktop.org;
>dr...@endlessm.com; Craig Stein; Suthikulpanit, Suravee; Deucher,
>Alexander; Kuehling, Felix; li...@endlessm.com; mic...@daenzer.net
>Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v2)
>
>On Fri, 2017-05-19 at 15:32 +0530, arindam.n...@amd.com wrote:
>> From: Arindam Nath <arindam.n...@amd.com>
>>
>> Change History
>> --
>>
>> v2: changes suggested by Joerg
>> - add flush flag to improve efficiency of flush operation

Joerg, do you have any comments on the patch?

Thanks,
Arindam

>>
>> v1:
>> - The idea behind flush queues is to defer the IOTLB flushing
>>   for domains for which the mappings are no longer valid. We
>>   add such domains in queue_add(), and when the queue size
>>   reaches FLUSH_QUEUE_SIZE, we perform __queue_flush().
>>
>>   Since we have already taken lock before __queue_flush()
>>   is called, we need to make sure the IOTLB flushing is
>>   performed as quickly as possible.
>>
>>   In the current implementation, we perform IOTLB flushing
>>   for all domains irrespective of which ones were actually
>>   added in the flush queue initially. This can be quite
>>   expensive especially for domains for which unmapping is
>>   not required at this point of time.
>>
>>   This patch makes use of domain information in
>>   'struct flush_queue_entry' to make sure we only flush
>>   IOTLBs for domains who need it, skipping others.
>
>Hi,
>
>just a note, the patch also fixes "AMD-Vi: Completion-Wait loop timed
>out" boot hang on my system (carrizo + iceland) [0,1,2]. Presumably the
>old loop also tried to flush domains that included powered-off devices.
>
>regards,
>Jan
>
>[0] https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/20
>[1] https://bugs.freedesktop.org/show_bug.cgi?id=101029
>[2] https://bugzilla.redhat.com/show_bug.cgi?id=1448121
>
>>
>> Suggested-by: Joerg Roedel <j...@8bytes.org>
>> Signed-off-by: Arindam Nath <arindam.n...@amd.com>
>> ---
>>  drivers/iommu/amd_iommu.c   | 27 ---
>>  drivers/iommu/amd_iommu_types.h |  2 ++
>>  2 files changed, 22 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
>> index 63cacf5..1edeebec 100644
>> --- a/drivers/iommu/amd_iommu.c
>> +++ b/drivers/iommu/amd_iommu.c
>> @@ -2227,15 +2227,26 @@ static struct iommu_group
>*amd_iommu_device_group(struct device *dev)
>>
>>  static void __queue_flush(struct flush_queue *queue)
>>  {
>> -struct protection_domain *domain;
>> -unsigned long flags;
>>  int idx;
>>
>> -/* First flush TLB of all known domains */
>> -spin_lock_irqsave(_iommu_pd_lock, flags);
>> -list_for_each_entry(domain, _iommu_pd_list, list)
>> -domain_flush_tlb(domain);
>> -spin_unlock_irqrestore(_iommu_pd_lock, flags);
>> +/* First flush TLB of all domains which were added to flush queue */
>> +for (idx = 0; idx < queue->next; ++idx) {
>> +struct flush_queue_entry *entry;
>> +
>> +entry = queue->entries + idx;
>> +
>> +/*
>> + * There might be cases where multiple IOVA entries for the
>> + * same domain are queued in the flush queue. To avoid
>> + * flushing the same domain again, we check whether the
>> + * flag is set or not. This improves the efficiency of
>> + * flush operation.
>> + */
>> +if (!entry->dma_dom->domain.already_flushed) {
>> +entry->dma_dom->domain.already_flushed = true;
>> +domain_flush_tlb(>dma_dom->domain);
>> +}
>> +}
>>
>>  /* Wait until flushes have completed */
>>  domain_flush_complete(NULL);
>> @@ -2289,6 +2300,8 @@ static void queue_add(struct dma_ops_domain
>*dma_dom,
>>  pages = __roundup_pow_of_two(pages);
>>  address >>= PAGE_SHIFT;
>>
>> +dma_dom->domain.already_flushed = false;
>> +
>>  queue = get_cpu_ptr(_queue);
>>  spin_lock_irqsave(>lock, flags);
>>
>> diff --git a/drivers/iommu/amd

RE: [PATCH] iommu/amd: flush IOTLB for specific domains only

2017-04-05 Thread Nath, Arindam
>-Original Message-
>From: Daniel Drake [mailto:dr...@endlessm.com]
>Sent: Thursday, March 30, 2017 7:15 PM
>To: Nath, Arindam
>Cc: j...@8bytes.org; Deucher, Alexander; Bridgman, John; amd-
>g...@lists.freedesktop.org; io...@lists.linux-foundation.org; Suthikulpanit,
>Suravee; Linux Upstreaming Team
>Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only
>
>On Thu, Mar 30, 2017 at 12:23 AM, Nath, Arindam <arindam.n...@amd.com>
>wrote:
>> Daniel, did you get chance to test this patch?
>
>Not yet. Should we test it alone or alongside "PCI: Blacklist AMD
>Stoney GPU devices for ATS"?

Daniel, any luck with this patch?

Thanks,
Arindam

>
>Thanks
>Daniel
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] iommu/amd: flush IOTLB for specific domains only

2017-03-30 Thread Nath, Arindam
>-Original Message-
>From: Daniel Drake [mailto:dr...@endlessm.com]
>Sent: Thursday, March 30, 2017 7:15 PM
>To: Nath, Arindam
>Cc: j...@8bytes.org; Deucher, Alexander; Bridgman, John; amd-
>g...@lists.freedesktop.org; io...@lists.linux-foundation.org; Suthikulpanit,
>Suravee; Linux Upstreaming Team
>Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only
>
>On Thu, Mar 30, 2017 at 12:23 AM, Nath, Arindam <arindam.n...@amd.com>
>wrote:
>> Daniel, did you get chance to test this patch?
>
>Not yet. Should we test it alone or alongside "PCI: Blacklist AMD
>Stoney GPU devices for ATS"?

You should test this patch alone.

Thanks,
Arindam

>
>Thanks
>Daniel
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] iommu/amd: flush IOTLB for specific domains only

2017-03-30 Thread Nath, Arindam
>-Original Message-
>From: Nath, Arindam
>Sent: Monday, March 27, 2017 5:57 PM
>To: 'Daniel Drake'
>Cc: j...@8bytes.org; Deucher, Alexander; Bridgman, John; amd-
>g...@lists.freedesktop.org; io...@lists.linux-foundation.org; Suthikulpanit,
>Suravee; Linux Upstreaming Team
>Subject: RE: [PATCH] iommu/amd: flush IOTLB for specific domains only
>
>>-Original Message-
>>From: Daniel Drake [mailto:dr...@endlessm.com]
>>Sent: Monday, March 27, 2017 5:56 PM
>>To: Nath, Arindam
>>Cc: j...@8bytes.org; Deucher, Alexander; Bridgman, John; amd-
>>g...@lists.freedesktop.org; io...@lists.linux-foundation.org; Suthikulpanit,
>>Suravee; Linux Upstreaming Team
>>Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only
>>
>>Hi Arindam,
>>
>>You CC'd me on this - does this mean that it is a fix for the issue
>>described in the thread "amd-iommu: can't boot with amdgpu, AMD-Vi:
>>Completion-Wait loop timed out" ?
>
>Yes Daniel, please test this patch to confirm if the issue gets resolved.

Daniel, did you get chance to test this patch?

Thanks,
Arindam

>
>Thanks,
>Arindam
>
>>
>>Thanks
>>Daniel
>>
>>
>>On Mon, Mar 27, 2017 at 12:17 AM,  <arindam.n...@amd.com> wrote:
>>> From: Arindam Nath <arindam.n...@amd.com>
>>>
>>> The idea behind flush queues is to defer the IOTLB flushing
>>> for domains for which the mappings are no longer valid. We
>>> add such domains in queue_add(), and when the queue size
>>> reaches FLUSH_QUEUE_SIZE, we perform __queue_flush().
>>>
>>> Since we have already taken lock before __queue_flush()
>>> is called, we need to make sure the IOTLB flushing is
>>> performed as quickly as possible.
>>>
>>> In the current implementation, we perform IOTLB flushing
>>> for all domains irrespective of which ones were actually
>>> added in the flush queue initially. This can be quite
>>> expensive especially for domains for which unmapping is
>>> not required at this point of time.
>>>
>>> This patch makes use of domain information in
>>> 'struct flush_queue_entry' to make sure we only flush
>>> IOTLBs for domains who need it, skipping others.
>>>
>>> Signed-off-by: Arindam Nath <arindam.n...@amd.com>
>>> ---
>>>  drivers/iommu/amd_iommu.c | 15 ---
>>>  1 file changed, 8 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
>>> index 98940d1..6a9a048 100644
>>> --- a/drivers/iommu/amd_iommu.c
>>> +++ b/drivers/iommu/amd_iommu.c
>>> @@ -2227,15 +2227,16 @@ static struct iommu_group
>>*amd_iommu_device_group(struct device *dev)
>>>
>>>  static void __queue_flush(struct flush_queue *queue)
>>>  {
>>> -   struct protection_domain *domain;
>>> -   unsigned long flags;
>>> int idx;
>>>
>>> -   /* First flush TLB of all known domains */
>>> -   spin_lock_irqsave(_iommu_pd_lock, flags);
>>> -   list_for_each_entry(domain, _iommu_pd_list, list)
>>> -   domain_flush_tlb(domain);
>>> -   spin_unlock_irqrestore(_iommu_pd_lock, flags);
>>> +   /* First flush TLB of all domains which were added to flush queue */
>>> +   for (idx = 0; idx < queue->next; ++idx) {
>>> +   struct flush_queue_entry *entry;
>>> +
>>> +   entry = queue->entries + idx;
>>> +
>>> +   domain_flush_tlb(>dma_dom->domain);
>>> +   }
>>>
>>> /* Wait until flushes have completed */
>>> domain_flush_complete(NULL);
>>> --
>>> 1.9.1
>>>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] iommu/amd: flush IOTLB for specific domains only

2017-03-27 Thread Nath, Arindam
>-Original Message-
>From: Daniel Drake [mailto:dr...@endlessm.com]
>Sent: Monday, March 27, 2017 5:56 PM
>To: Nath, Arindam
>Cc: j...@8bytes.org; Deucher, Alexander; Bridgman, John; amd-
>g...@lists.freedesktop.org; io...@lists.linux-foundation.org; Suthikulpanit,
>Suravee; Linux Upstreaming Team
>Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only
>
>Hi Arindam,
>
>You CC'd me on this - does this mean that it is a fix for the issue
>described in the thread "amd-iommu: can't boot with amdgpu, AMD-Vi:
>Completion-Wait loop timed out" ?

Yes Daniel, please test this patch to confirm if the issue gets resolved.

Thanks,
Arindam

>
>Thanks
>Daniel
>
>
>On Mon, Mar 27, 2017 at 12:17 AM,  <arindam.n...@amd.com> wrote:
>> From: Arindam Nath <arindam.n...@amd.com>
>>
>> The idea behind flush queues is to defer the IOTLB flushing
>> for domains for which the mappings are no longer valid. We
>> add such domains in queue_add(), and when the queue size
>> reaches FLUSH_QUEUE_SIZE, we perform __queue_flush().
>>
>> Since we have already taken lock before __queue_flush()
>> is called, we need to make sure the IOTLB flushing is
>> performed as quickly as possible.
>>
>> In the current implementation, we perform IOTLB flushing
>> for all domains irrespective of which ones were actually
>> added in the flush queue initially. This can be quite
>> expensive especially for domains for which unmapping is
>> not required at this point of time.
>>
>> This patch makes use of domain information in
>> 'struct flush_queue_entry' to make sure we only flush
>> IOTLBs for domains who need it, skipping others.
>>
>> Signed-off-by: Arindam Nath <arindam.n...@amd.com>
>> ---
>>  drivers/iommu/amd_iommu.c | 15 ---
>>  1 file changed, 8 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
>> index 98940d1..6a9a048 100644
>> --- a/drivers/iommu/amd_iommu.c
>> +++ b/drivers/iommu/amd_iommu.c
>> @@ -2227,15 +2227,16 @@ static struct iommu_group
>*amd_iommu_device_group(struct device *dev)
>>
>>  static void __queue_flush(struct flush_queue *queue)
>>  {
>> -   struct protection_domain *domain;
>> -   unsigned long flags;
>> int idx;
>>
>> -   /* First flush TLB of all known domains */
>> -   spin_lock_irqsave(_iommu_pd_lock, flags);
>> -   list_for_each_entry(domain, _iommu_pd_list, list)
>> -   domain_flush_tlb(domain);
>> -   spin_unlock_irqrestore(_iommu_pd_lock, flags);
>> +   /* First flush TLB of all domains which were added to flush queue */
>> +   for (idx = 0; idx < queue->next; ++idx) {
>> +   struct flush_queue_entry *entry;
>> +
>> +   entry = queue->entries + idx;
>> +
>> +   domain_flush_tlb(>dma_dom->domain);
>> +   }
>>
>> /* Wait until flushes have completed */
>> domain_flush_complete(NULL);
>> --
>> 1.9.1
>>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: amd-iommu: can't boot with amdgpu, AMD-Vi: Completion-Wait loop timed out

2017-03-21 Thread Nath, Arindam
>-Original Message-
>From: 'j...@8bytes.org' [mailto:j...@8bytes.org]
>Sent: Tuesday, March 21, 2017 9:41 PM
>To: Deucher, Alexander
>Cc: Alex Deucher; Daniel Drake; Chris Chiu; amd-gfx@lists.freedesktop.org;
>Nath, Arindam; io...@lists.linux-foundation.org; Suthikulpanit, Suravee;
>Linux Upstreaming Team
>Subject: Re: amd-iommu: can't boot with amdgpu, AMD-Vi: Completion-Wait
>loop timed out
>
>On Tue, Mar 21, 2017 at 04:01:53PM +, Deucher, Alexander wrote:
>> It seems to only affect Stoney systems, but not others (Carrizo,
>> Bristol, etc.).  Maybe we could just disable it on Stoney until we
>> root cause it.
>
>Completion-wait loop timeouts indicate something is seriously wrong. How
>can I detect whether I am running on a 'Stoney' system?
>
>Other question, a shot into the dark, does the GPU on these systems have
>ATS? Probably yes, as they are likely HSA compatible.

The GPU on Stoney supports ATS.

Thanks,
Arindam

>
>
>   Joerg

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: amd-iommu: can't boot with amdgpu, AMD-Vi: Completion-Wait loop timed out

2017-03-14 Thread Nath, Arindam


>-Original Message-
>From: Deucher, Alexander
>Sent: Tuesday, March 14, 2017 1:31 AM
>To: 'Daniel Drake'; j...@8bytes.org; Suthikulpanit, Suravee; Nath, Arindam
>Cc: Chris Chiu; io...@lists.linux-foundation.org; Linux Upstreaming Team;
>amd-gfx@lists.freedesktop.org
>Subject: RE: amd-iommu: can't boot with amdgpu, AMD-Vi: Completion-Wait
>loop timed out
>
>> -Original Message-
>> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
>> Of Daniel Drake
>> Sent: Monday, March 13, 2017 3:50 PM
>> To: j...@8bytes.org
>> Cc: Chris Chiu; io...@lists.linux-foundation.org; Linux Upstreaming Team;
>> amd-gfx@lists.freedesktop.org
>> Subject: amd-iommu: can't boot with amdgpu, AMD-Vi: Completion-Wait
>> loop timed out
>>
>> Hi,
>>
>> We are unable to boot Acer Aspire E5-553G (AMD FX-9800P RADEON R7) nor
>> Acer Aspire E5-523 with standard configurations because during boot
>> the screen is flooded with the following error message over and over:
>>
>>   AMD-Vi: Completion-Wait loop timed out
>>
>> We have left the system for quite a while but the message spam does
>> not stop and the system doesn't complete the boot sequence.
>>
>> We have reproduced on Linux 4.8 and Linux 4.10.
>>
>> To avoid this, we can boot with iommu=soft or just disable the amdgpu
>> display driver.
>>
>> Looks like this may also affect HP 15-ba012no :
>> https://bugzilla.redhat.com/show_bug.cgi?id=1409201
>>
>> Earlier during boot the iommu is detected as:
>>
>> [1.274518] AMD-Vi: Found IOMMU at :00:00.2 cap 0x40
>> [1.274519] AMD-Vi: Extended features (0x37ef22294ada):
>> [1.274519]  PPR NX GT IA GA PC GA_vAPIC
>> [1.274523] AMD-Vi: Interrupt remapping enabled
>> [1.274523] AMD-Vi: virtual APIC enabled
>> [1.275144] AMD-Vi: Lazy IO/TLB flushing enabled
>> [1.276498] perf: AMD NB counters detected
>> [1.278096] LVT offset 0 assigned for vector 0x400
>> [1.278963] perf: AMD IBS detected (0x07ff)
>> [1.278977] perf: amd_iommu: Detected. (0 banks, 0 counters/bank)
>>
>> Any suggestions for how we can fix this, or get more useful debug info?
>
>+Suravee and Arindam
>
>We ran into similar issues and bisected it to commit
>b1516a14657acf81a587e9a6e733a881625eee53.  I'm not too familiar with the
>IOMMU hardware to know if this is an iommu or display driver issue yet.

Like Alex mentioned, we have not yet been able to root cause the issue. But 
another way to work-around the issue without disabling graphics driver or IOMMU 
is to use amd_iommu=fullflush kernel boot param.

Thanks,
Arindam

>
>Alex
>
>>
>> Thanks
>> Daniel
>> ___
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/1] drm/amd/amdgpu: get maximum and used UVD handles (v3)

2016-12-15 Thread Nath, Arindam
>-Original Message-
>From: Grazvydas Ignotas [mailto:nota...@gmail.com]
>Sent: Thursday, December 15, 2016 5:52 PM
>To: Nath, Arindam
>Cc: Emil Velikov; David Airlie; Deucher, Alexander; ML dri-devel; amd-gfx
>mailing list; Koenig, Christian
>Subject: Re: [PATCH 1/1] drm/amd/amdgpu: get maximum and used UVD
>handles (v3)
>
>On Thu, Dec 15, 2016 at 1:47 PM, Nath, Arindam <arindam.n...@amd.com>
>wrote:
>>>-Original Message-
>>>From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
>>>Sent: Thursday, December 15, 2016 5:01 PM
>>>To: Nath, Arindam
>>>Cc: David Airlie; Deucher, Alexander; amd-gfx mailing list; ML dri-devel;
>>>Koenig, Christian
>>>Subject: Re: [PATCH 1/1] drm/amd/amdgpu: get maximum and used UVD
>>>handles (v3)
>>>
>>>On 15 December 2016 at 07:30, Nath, Arindam <arindam.n...@amd.com>
>>>wrote:
>>>>>-Original Message-
>>>>>From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
>>>>>Sent: Wednesday, December 14, 2016 9:26 PM
>>>>>To: Nath, Arindam
>>>>>Cc: David Airlie; Deucher, Alexander; amd-gfx mailing list; ML dri-devel;
>>>>>Koenig, Christian
>>>>>Subject: Re: [PATCH 1/1] drm/amd/amdgpu: get maximum and used
>UVD
>>>>>handles (v3)
>>>>>
>>>>>On 12 December 2016 at 18:49,  <arindam.n...@amd.com> wrote:
>>>>>> From: Arindam Nath <arindam.n...@amd.com>
>>>>>>
>>>>>> Change History
>>>>>> --
>>>>>>
>>>>>> v3: changes suggested by Christian
>>>>>> - Add a check for UVD IP block using AMDGPU_HW_IP_UVD
>>>>>>   query type.
>>>>>> - Add a check for asic_type to be less than
>>>>>>   CHIP_POLARIS10 since starting Polaris, we support
>>>>>>   unlimited UVD instances.
>>>>>> - Add kerneldoc style comment for
>>>>>>   amdgpu_uvd_used_handles().
>>>>>>
>>>>>> v2: as suggested by Christian
>>>>>> - Add a new query AMDGPU_INFO_NUM_HANDLES
>>>>>> - Create a helper function to return the number
>>>>>>   of currently used UVD handles.
>>>>>> - Modify the logic to count the number of used
>>>>>>   UVD handles since handles can be freed in
>>>>>>   non-linear fashion.
>>>>>>
>>>>>> v1:
>>>>>> - User might want to query the maximum number of UVD
>>>>>>   instances supported by firmware. In addition to that,
>>>>>>   if there are multiple applications using UVD handles
>>>>>>   at the same time, he might also want to query the
>>>>>>   currently used number of handles.
>>>>>>
>>>>>>   For this we add two variables max_handles and
>>>>>>   used_handles inside drm_amdgpu_info_hw_ip. So now
>>>>>>   an application (or libdrm) can use AMDGPU_INFO IOCTL
>>>>>>   with AMDGPU_INFO_HW_IP_INFO query type to get these
>>>>>>   values.
>>>>>>
>>>>>> Signed-off-by: Arindam Nath <arindam.n...@amd.com>
>>>>>> Reviewed-by: Christian König <christian.koe...@amd.com>
>>>>>> ---
>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 21
>>>>>+
>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 25
>>>>>+
>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h |  1 +
>>>>>>  include/uapi/drm/amdgpu_drm.h   |  9 +
>>>>>>  4 files changed, 56 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>>>>b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>>>>> index 174eb59..3273d8c 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>>>>> @@ -570,6 +570,27 @@ static int amdgpu_info_ioctl(struct drm_device
>>>>>*dev, void *data, struct drm_file
>>>>>> return -EINVAL;
>>>>>> }
>>>>>> }
>>>>>> +   case AMDGPU_INFO_NUM_HANDLES: {
>>>>>> +   struct drm_amdgpu_info_num_handles handle;
&

RE: [PATCH 1/1] drm/amd/amdgpu: get maximum and used UVD handles (v3)

2016-12-14 Thread Nath, Arindam
>-Original Message-
>From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
>Sent: Wednesday, December 14, 2016 9:26 PM
>To: Nath, Arindam
>Cc: David Airlie; Deucher, Alexander; amd-gfx mailing list; ML dri-devel;
>Koenig, Christian
>Subject: Re: [PATCH 1/1] drm/amd/amdgpu: get maximum and used UVD
>handles (v3)
>
>On 12 December 2016 at 18:49,  <arindam.n...@amd.com> wrote:
>> From: Arindam Nath <arindam.n...@amd.com>
>>
>> Change History
>> --
>>
>> v3: changes suggested by Christian
>> - Add a check for UVD IP block using AMDGPU_HW_IP_UVD
>>   query type.
>> - Add a check for asic_type to be less than
>>   CHIP_POLARIS10 since starting Polaris, we support
>>   unlimited UVD instances.
>> - Add kerneldoc style comment for
>>   amdgpu_uvd_used_handles().
>>
>> v2: as suggested by Christian
>> - Add a new query AMDGPU_INFO_NUM_HANDLES
>> - Create a helper function to return the number
>>   of currently used UVD handles.
>> - Modify the logic to count the number of used
>>   UVD handles since handles can be freed in
>>   non-linear fashion.
>>
>> v1:
>> - User might want to query the maximum number of UVD
>>   instances supported by firmware. In addition to that,
>>   if there are multiple applications using UVD handles
>>   at the same time, he might also want to query the
>>   currently used number of handles.
>>
>>   For this we add two variables max_handles and
>>   used_handles inside drm_amdgpu_info_hw_ip. So now
>>   an application (or libdrm) can use AMDGPU_INFO IOCTL
>>   with AMDGPU_INFO_HW_IP_INFO query type to get these
>>   values.
>>
>> Signed-off-by: Arindam Nath <arindam.n...@amd.com>
>> Reviewed-by: Christian König <christian.koe...@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 21
>+
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 25
>+
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.h |  1 +
>>  include/uapi/drm/amdgpu_drm.h   |  9 +
>>  4 files changed, 56 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> index 174eb59..3273d8c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> @@ -570,6 +570,27 @@ static int amdgpu_info_ioctl(struct drm_device
>*dev, void *data, struct drm_file
>> return -EINVAL;
>> }
>> }
>> +   case AMDGPU_INFO_NUM_HANDLES: {
>> +   struct drm_amdgpu_info_num_handles handle;
>> +
>> +   switch (info->query_hw_ip.type) {
>> +   case AMDGPU_HW_IP_UVD:
>> +   /* Starting Polaris, we support unlimited UVD 
>> handles */
>> +   if (adev->asic_type < CHIP_POLARIS10) {
>> +   handle.uvd_max_handles = 
>> adev->uvd.max_handles;
>> +   handle.uvd_used_handles =
>amdgpu_uvd_used_handles(adev);
>> +
>> +   return copy_to_user(out, ,
>> +   min((size_t)size, sizeof(handle))) ? 
>> -EFAULT : 0;
>> +   } else {
>> +   return -EINVAL;
>Using EINVAL doesn't seem right here. As per man 3 ioctl
>
>  EINVAL The request or arg argument is not valid for this device.
>
>A bit further down you can see the one you want.
>
>  ENODEV The fildes argument refers to a valid STREAMS device, but
>the corresponding device driver does not support the ioctl() function.

Emil, ENODEV would mean the driver does not support ioctl() itself. But in our 
case ioctl() is supported.

Since we extract the query type from arg passed to ioctl(), and it is this 
query AMDGPU_INFO_NUM_HANDLES which is not supported by driver (for > Polaris), 
doesn’t returning EINVAL make more sense here?

Thanks,
Arindam
>
>Worth checking through the existing code and correcting similar thinkos ?
>
>Thanks
>Emil
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx