Re: [PATCH 2/2] mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for non-blocking

2019-05-14 Thread Alex Deucher
On Tue, May 14, 2019 at 5:12 PM Kuehling, Felix  wrote:
>
>
> On 2019-05-13 4:21 p.m., Deucher, Alexander wrote:
> > I reverted all the amdgpu HMM patches for 5.2 because they also
> > depended on this patch:
> > https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-5.2-wip&id=ce05ef71564f7cbe270cd4337c36ee720ea534db
> > which did not have a clear line of sight for 5.2 either.
>
> When was that? I saw "Use HMM for userptr" in Dave's 5.2-rc1 pull
> request to Linus.

https://patchwork.kernel.org/patch/10875587/

Alex



>
>
> Regards,
>    Felix
>
>
> >
> > Alex
> > 
> > *From:* amd-gfx  on behalf of
> > Kuehling, Felix 
> > *Sent:* Monday, May 13, 2019 3:36 PM
> > *To:* Jerome Glisse
> > *Cc:* linux...@kvack.org; airl...@gmail.com;
> > amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org;
> > alex.deuc...@amd.com
> > *Subject:* Re: [PATCH 2/2] mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for
> > non-blocking
> >
> > Hi Jerome,
> >
> > Do you want me to push the patches to your branch? Or are you going to
> > apply them yourself?
> >
> > Is your hmm-5.2-v3 branch going to make it into Linux 5.2? If so, do you
> > know when? I'd like to coordinate with Dave Airlie so that we can also
> > get that update into a drm-next branch soon.
> >
> > I see that Linus merged Dave's pull request for Linux 5.2, which
> > includes the first changes in amdgpu using HMM. They're currently broken
> > without these two patches.
> >
> > Thanks,
> >    Felix
> >
> > On 2019-05-10 4:14 p.m., Jerome Glisse wrote:
> > > On Fri, May 10, 2019 at 07:53:24PM +, Kuehling, Felix wrote:
> > >> Don't set this flag by default in hmm_vma_do_fault. It is set
> > >> conditionally just a few lines below. Setting it unconditionally
> > >> can lead to handle_mm_fault doing a non-blocking fault, returning
> > >> -EBUSY and unlocking mmap_sem unexpectedly.
> > >>
> > >> Signed-off-by: Felix Kuehling 
> > > Reviewed-by: Jérôme Glisse 
> > >
> > >> ---
> > >>   mm/hmm.c | 2 +-
> > >>   1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >> diff --git a/mm/hmm.c b/mm/hmm.c
> > >> index b65c27d5c119..3c4f1d62202f 100644
> > >> --- a/mm/hmm.c
> > >> +++ b/mm/hmm.c
> > >> @@ -339,7 +339,7 @@ struct hmm_vma_walk {
> > > >>   static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr,
> > > >>                               bool write_fault, uint64_t *pfn)
> > > >>   {
> > > >> -         unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_REMOTE;
> > > >> +         unsigned int flags = FAULT_FLAG_REMOTE;
> > > >>           struct hmm_vma_walk *hmm_vma_walk = walk->private;
> > > >>           struct hmm_range *range = hmm_vma_walk->range;
> > > >>           struct vm_area_struct *vma = walk->vma;
> > >> --
> > >> 2.17.1
> > >>
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 2/2] mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for non-blocking

2019-05-14 Thread Kuehling, Felix

On 2019-05-13 4:21 p.m., Deucher, Alexander wrote:
> I reverted all the amdgpu HMM patches for 5.2 because they also 
> depended on this patch:
> https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-5.2-wip&id=ce05ef71564f7cbe270cd4337c36ee720ea534db
> which did not have a clear line of sight for 5.2 either.

When was that? I saw "Use HMM for userptr" in Dave's 5.2-rc1 pull 
request to Linus.


Regards,
   Felix


>
> Alex
> 
> *From:* amd-gfx  on behalf of 
> Kuehling, Felix 
> *Sent:* Monday, May 13, 2019 3:36 PM
> *To:* Jerome Glisse
> *Cc:* linux...@kvack.org; airl...@gmail.com; 
> amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; 
> alex.deuc...@amd.com
> *Subject:* Re: [PATCH 2/2] mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for 
> non-blocking
>
> Hi Jerome,
>
> Do you want me to push the patches to your branch? Or are you going to
> apply them yourself?
>
> Is your hmm-5.2-v3 branch going to make it into Linux 5.2? If so, do you
> know when? I'd like to coordinate with Dave Airlie so that we can also
> get that update into a drm-next branch soon.
>
> I see that Linus merged Dave's pull request for Linux 5.2, which
> includes the first changes in amdgpu using HMM. They're currently broken
> without these two patches.
>
> Thanks,
>    Felix
>
> On 2019-05-10 4:14 p.m., Jerome Glisse wrote:
> > On Fri, May 10, 2019 at 07:53:24PM +, Kuehling, Felix wrote:
> >> Don't set this flag by default in hmm_vma_do_fault. It is set
> >> conditionally just a few lines below. Setting it unconditionally
> >> can lead to handle_mm_fault doing a non-blocking fault, returning
> >> -EBUSY and unlocking mmap_sem unexpectedly.
> >>
> >> Signed-off-by: Felix Kuehling 
> > Reviewed-by: Jérôme Glisse 
> >
> >> ---
> >>   mm/hmm.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/mm/hmm.c b/mm/hmm.c
> >> index b65c27d5c119..3c4f1d62202f 100644
> >> --- a/mm/hmm.c
> >> +++ b/mm/hmm.c
> >> @@ -339,7 +339,7 @@ struct hmm_vma_walk {
> >>   static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr,
> >>    bool write_fault, uint64_t *pfn)
> >>   {
> >> - unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_REMOTE;
> >> + unsigned int flags = FAULT_FLAG_REMOTE;
> >>    struct hmm_vma_walk *hmm_vma_walk = walk->private;
> >>    struct hmm_range *range = hmm_vma_walk->range;
> >>    struct vm_area_struct *vma = walk->vma;
> >> --
> >> 2.17.1
> >>

Re: [PATCH 2/2] mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for non-blocking

2019-05-13 Thread Kuehling, Felix
[Fixed Alex's email address, sorry for getting it wrong first]

On 2019-05-13 3:49 p.m., Jerome Glisse wrote:
> Andrew, can we get these 2 fixes lined up for 5.2?
>
> On Mon, May 13, 2019 at 07:36:44PM +, Kuehling, Felix wrote:
>> Hi Jerome,
>>
>> Do you want me to push the patches to your branch? Or are you going to
>> apply them yourself?
>>
>> Is your hmm-5.2-v3 branch going to make it into Linux 5.2? If so, do you
>> know when? I'd like to coordinate with Dave Airlie so that we can also
>> get that update into a drm-next branch soon.
>>
>> I see that Linus merged Dave's pull request for Linux 5.2, which
>> includes the first changes in amdgpu using HMM. They're currently broken
>> without these two patches.
> HMM patches do not go through any git branch; they go through the mmotm
> collection. So it is not something you can easily coordinate with a drm
> branch.
>
> By broken, I expect you mean that it breaks if NUMA balancing happens?
> Or that it might sleep when you are not expecting it to?

Without the NUMA fix we'd end up using an outdated physical address in 
the GPU page table. The problem was caught by a test that got incorrect 
computation results using OpenCL on a NUMA system.

Without the FAULT_FLAG_ALLOW_RETRY patch, there can be kernel oopses due 
to incorrect locking/unlocking of mmap_sem. It breaks the promise that 
hmm_range_fault should not unlock the mmap_sem if block==true. It takes 
some memory pressure to trigger this.

Regards,
   Felix
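
The locking problem described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not
kernel code: the toy_* names, the must_wait parameter, and the numeric
flag values are hypothetical. FAULT_FLAG_ALLOW_RETRY, FAULT_FLAG_REMOTE,
handle_mm_fault, mmap_sem and the "block" parameter come from the thread;
VM_FAULT_RETRY is the kernel's usual "lock was dropped, retry" result.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real ones are defined in kernel headers. */
#define FAULT_FLAG_ALLOW_RETRY  0x04u
#define FAULT_FLAG_REMOTE       0x80u
#define VM_FAULT_RETRY          0x400u

/* Toy stand-in for mmap_sem: counts read-side holders. */
struct toy_mmap_sem {
    int readers;
};

/*
 * Toy stand-in for handle_mm_fault(). If the fault would have to wait
 * (must_wait, e.g. under memory pressure) and the caller allowed a
 * retry, the lock is dropped and a retry is requested. Otherwise the
 * fault completes with the lock still held.
 */
static unsigned int toy_handle_mm_fault(struct toy_mmap_sem *sem,
                                        unsigned int flags, bool must_wait)
{
    assert(sem->readers > 0);   /* caller must hold the lock on entry */
    if (must_wait && (flags & FAULT_FLAG_ALLOW_RETRY)) {
        sem->readers--;         /* lock released inside the fault path */
        return VM_FAULT_RETRY;
    }
    return 0;
}

/*
 * Hypothetical caller modelled on the patched function. "patched"
 * selects the initial flags from before or after the fix. Returns true
 * if the lock is still held when the fault returns.
 */
static bool toy_fault_keeps_lock(bool block, bool patched, bool must_wait)
{
    struct toy_mmap_sem sem = { .readers = 1 };
    unsigned int flags = patched ? FAULT_FLAG_REMOTE
                                 : (FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_REMOTE);

    if (!block)
        flags |= FAULT_FLAG_ALLOW_RETRY;    /* the conditional set */

    (void)toy_handle_mm_fault(&sem, flags, must_wait);
    return sem.readers == 1;
}
```

In this model, toy_fault_keeps_lock(true, false, true) returns false: a
blocking caller on the unpatched flags loses the lock once the fault has
to wait, which is the broken promise. With patched == true the same call
returns true. A non-blocking caller (block == false) may still see the
lock dropped in both variants, which is the intended behaviour.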


>
> Cheers,
> Jérôme
>
>> Thanks,
>> Felix
>>
>> On 2019-05-10 4:14 p.m., Jerome Glisse wrote:
>>> On Fri, May 10, 2019 at 07:53:24PM +, Kuehling, Felix wrote:
>>>> Don't set this flag by default in hmm_vma_do_fault. It is set
>>>> conditionally just a few lines below. Setting it unconditionally
>>>> can lead to handle_mm_fault doing a non-blocking fault, returning
>>>> -EBUSY and unlocking mmap_sem unexpectedly.
>>>>
>>>> Signed-off-by: Felix Kuehling 
>>> Reviewed-by: Jérôme Glisse 
>>>
>>>> ---
>>>>   mm/hmm.c | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/hmm.c b/mm/hmm.c
>>>> index b65c27d5c119..3c4f1d62202f 100644
>>>> --- a/mm/hmm.c
>>>> +++ b/mm/hmm.c
>>>> @@ -339,7 +339,7 @@ struct hmm_vma_walk {
>>>>   static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr,
>>>>                               bool write_fault, uint64_t *pfn)
>>>>   {
>>>> -         unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_REMOTE;
>>>> +         unsigned int flags = FAULT_FLAG_REMOTE;
>>>>           struct hmm_vma_walk *hmm_vma_walk = walk->private;
>>>>           struct hmm_range *range = hmm_vma_walk->range;
>>>>           struct vm_area_struct *vma = walk->vma;
>>>> --
>>>> 2.17.1
>>>>

Re: [PATCH 2/2] mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for non-blocking

2019-05-13 Thread Deucher, Alexander
I reverted all the amdgpu HMM patches for 5.2 because they also depended on 
this patch:
https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-5.2-wip&id=ce05ef71564f7cbe270cd4337c36ee720ea534db
which did not have a clear line of sight for 5.2 either.

Alex

From: amd-gfx  on behalf of Kuehling, 
Felix 
Sent: Monday, May 13, 2019 3:36 PM
To: Jerome Glisse
Cc: linux...@kvack.org; airl...@gmail.com; amd-gfx@lists.freedesktop.org; 
dri-de...@lists.freedesktop.org; alex.deuc...@amd.com
Subject: Re: [PATCH 2/2] mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for 
non-blocking

Hi Jerome,

Do you want me to push the patches to your branch? Or are you going to
apply them yourself?

Is your hmm-5.2-v3 branch going to make it into Linux 5.2? If so, do you
know when? I'd like to coordinate with Dave Airlie so that we can also
get that update into a drm-next branch soon.

I see that Linus merged Dave's pull request for Linux 5.2, which
includes the first changes in amdgpu using HMM. They're currently broken
without these two patches.

Thanks,
   Felix

On 2019-05-10 4:14 p.m., Jerome Glisse wrote:
> On Fri, May 10, 2019 at 07:53:24PM +, Kuehling, Felix wrote:
>> Don't set this flag by default in hmm_vma_do_fault. It is set
>> conditionally just a few lines below. Setting it unconditionally
>> can lead to handle_mm_fault doing a non-blocking fault, returning
>> -EBUSY and unlocking mmap_sem unexpectedly.
>>
>> Signed-off-by: Felix Kuehling 
> Reviewed-by: Jérôme Glisse 
>
>> ---
>>   mm/hmm.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/hmm.c b/mm/hmm.c
>> index b65c27d5c119..3c4f1d62202f 100644
>> --- a/mm/hmm.c
>> +++ b/mm/hmm.c
>> @@ -339,7 +339,7 @@ struct hmm_vma_walk {
>>   static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr,
>>                               bool write_fault, uint64_t *pfn)
>>   {
>> -         unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_REMOTE;
>> +         unsigned int flags = FAULT_FLAG_REMOTE;
>>           struct hmm_vma_walk *hmm_vma_walk = walk->private;
>>           struct hmm_range *range = hmm_vma_walk->range;
>>           struct vm_area_struct *vma = walk->vma;
>> --
>> 2.17.1
>>

Re: [PATCH 2/2] mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for non-blocking

2019-05-13 Thread Jerome Glisse
Andrew, can we get these 2 fixes lined up for 5.2?

On Mon, May 13, 2019 at 07:36:44PM +, Kuehling, Felix wrote:
> Hi Jerome,
> 
> Do you want me to push the patches to your branch? Or are you going to 
> apply them yourself?
> 
> Is your hmm-5.2-v3 branch going to make it into Linux 5.2? If so, do you 
> know when? I'd like to coordinate with Dave Airlie so that we can also 
> get that update into a drm-next branch soon.
> 
> I see that Linus merged Dave's pull request for Linux 5.2, which 
> includes the first changes in amdgpu using HMM. They're currently broken 
> without these two patches.

HMM patches do not go through any git branch; they go through the mmotm
collection. So it is not something you can easily coordinate with a drm
branch.

By broken, I expect you mean that it breaks if NUMA balancing happens?
Or that it might sleep when you are not expecting it to?

Cheers,
Jérôme

> 
> Thanks,
>    Felix
> 
> On 2019-05-10 4:14 p.m., Jerome Glisse wrote:
> > On Fri, May 10, 2019 at 07:53:24PM +, Kuehling, Felix wrote:
> >> Don't set this flag by default in hmm_vma_do_fault. It is set
> >> conditionally just a few lines below. Setting it unconditionally
> >> can lead to handle_mm_fault doing a non-blocking fault, returning
> >> -EBUSY and unlocking mmap_sem unexpectedly.
> >>
> >> Signed-off-by: Felix Kuehling 
> > Reviewed-by: Jérôme Glisse 
> >
> >> ---
> >>   mm/hmm.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/mm/hmm.c b/mm/hmm.c
> >> index b65c27d5c119..3c4f1d62202f 100644
> >> --- a/mm/hmm.c
> >> +++ b/mm/hmm.c
> >> @@ -339,7 +339,7 @@ struct hmm_vma_walk {
> >>   static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr,
> >>                               bool write_fault, uint64_t *pfn)
> >>   {
> >> -         unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_REMOTE;
> >> +         unsigned int flags = FAULT_FLAG_REMOTE;
> >>           struct hmm_vma_walk *hmm_vma_walk = walk->private;
> >>           struct hmm_range *range = hmm_vma_walk->range;
> >>           struct vm_area_struct *vma = walk->vma;
> >> --
> >> 2.17.1
> >>

Re: [PATCH 2/2] mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for non-blocking

2019-05-13 Thread Kuehling, Felix
Hi Jerome,

Do you want me to push the patches to your branch? Or are you going to 
apply them yourself?

Is your hmm-5.2-v3 branch going to make it into Linux 5.2? If so, do you 
know when? I'd like to coordinate with Dave Airlie so that we can also 
get that update into a drm-next branch soon.

I see that Linus merged Dave's pull request for Linux 5.2, which 
includes the first changes in amdgpu using HMM. They're currently broken 
without these two patches.

Thanks,
   Felix

On 2019-05-10 4:14 p.m., Jerome Glisse wrote:
> On Fri, May 10, 2019 at 07:53:24PM +, Kuehling, Felix wrote:
>> Don't set this flag by default in hmm_vma_do_fault. It is set
>> conditionally just a few lines below. Setting it unconditionally
>> can lead to handle_mm_fault doing a non-blocking fault, returning
>> -EBUSY and unlocking mmap_sem unexpectedly.
>>
>> Signed-off-by: Felix Kuehling 
> Reviewed-by: Jérôme Glisse 
>
>> ---
>>   mm/hmm.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/hmm.c b/mm/hmm.c
>> index b65c27d5c119..3c4f1d62202f 100644
>> --- a/mm/hmm.c
>> +++ b/mm/hmm.c
>> @@ -339,7 +339,7 @@ struct hmm_vma_walk {
>>   static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr,
>>                               bool write_fault, uint64_t *pfn)
>>   {
>> -         unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_REMOTE;
>> +         unsigned int flags = FAULT_FLAG_REMOTE;
>>           struct hmm_vma_walk *hmm_vma_walk = walk->private;
>>           struct hmm_range *range = hmm_vma_walk->range;
>>           struct vm_area_struct *vma = walk->vma;
>> --
>> 2.17.1
>>

Re: [PATCH 2/2] mm/hmm: Only set FAULT_FLAG_ALLOW_RETRY for non-blocking

2019-05-10 Thread Jerome Glisse
On Fri, May 10, 2019 at 07:53:24PM +, Kuehling, Felix wrote:
> Don't set this flag by default in hmm_vma_do_fault. It is set
> conditionally just a few lines below. Setting it unconditionally
> can lead to handle_mm_fault doing a non-blocking fault, returning
> -EBUSY and unlocking mmap_sem unexpectedly.
> 
> Signed-off-by: Felix Kuehling 

Reviewed-by: Jérôme Glisse 

> ---
>  mm/hmm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/hmm.c b/mm/hmm.c
> index b65c27d5c119..3c4f1d62202f 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -339,7 +339,7 @@ struct hmm_vma_walk {
>  static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr,
>                              bool write_fault, uint64_t *pfn)
>  {
> -        unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_REMOTE;
> +        unsigned int flags = FAULT_FLAG_REMOTE;
>          struct hmm_vma_walk *hmm_vma_walk = walk->private;
>          struct hmm_range *range = hmm_vma_walk->range;
>          struct vm_area_struct *vma = walk->vma;
> -- 
> 2.17.1
> 
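
The effect of the one-line change on the flags passed to handle_mm_fault
can be sketched as two pure functions. This is an illustration under
assumptions, not the contents of mm/hmm.c: the function names
fault_flags_before and fault_flags_after are hypothetical, the numeric
flag values are illustrative, and the form of the conditional line is
inferred from the commit message ("set conditionally just a few lines
below") and the patch title ("only ... for non-blocking").

```c
#include <stdbool.h>

/* Illustrative values only; the real ones are defined in kernel headers. */
#define FAULT_FLAG_WRITE        0x01u
#define FAULT_FLAG_ALLOW_RETRY  0x04u
#define FAULT_FLAG_REMOTE       0x80u

/* Before the patch: ALLOW_RETRY is part of the initial value. */
static unsigned int fault_flags_before(bool block, bool write_fault)
{
    unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_REMOTE;

    /* Assumed conditional set; it has no effect here, the bit is already on. */
    flags |= block ? 0 : FAULT_FLAG_ALLOW_RETRY;
    flags |= write_fault ? FAULT_FLAG_WRITE : 0;
    return flags;
}

/* After the patch: ALLOW_RETRY appears only for non-blocking callers. */
static unsigned int fault_flags_after(bool block, bool write_fault)
{
    unsigned int flags = FAULT_FLAG_REMOTE;

    flags |= block ? 0 : FAULT_FLAG_ALLOW_RETRY;
    flags |= write_fault ? FAULT_FLAG_WRITE : 0;
    return flags;
}
```

For block == false the two functions return the same value, so
non-blocking callers are unaffected. They differ only for block == true,
where the patched version no longer carries FAULT_FLAG_ALLOW_RETRY.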