Re: [PATCH v2 hmm 00/11] Various revisions from a locking/code review

2019-06-14 Thread Jason Gunthorpe
On Wed, Jun 12, 2019 at 09:49:12PM +, Yang, Philip wrote:
> Rebase to https://github.com/jgunthorpe/linux.git hmm branch, need some 
> changes because of interface hmm_range_register change. Then run a quick 
> amdgpu_test. Test is finished, result is ok.

Great! Thanks

I'll add your Tested-by to the series

>  But there is below kernel BUG message, seems hmm_free_rcu calls
> down_write.
> 
> [ 1171.919921] BUG: sleeping function called from invalid context at 
> /home/yangp/git/compute_staging/kernel/kernel/locking/rwsem.c:65
> [ 1171.919933] in_atomic(): 1, irqs_disabled(): 0, pid: 53, name: 
> kworker/1:1
> [ 1171.919938] 2 locks held by kworker/1:1/53:
> [ 1171.919940]  #0: 1c7c19d4 ((wq_completion)rcu_gp){+.+.}, at: 
> process_one_work+0x20e/0x630
> [ 1171.919951]  #1: 923f2cfa 
> ((work_completion)(>work)){+.+.}, at: process_one_work+0x20e/0x630
> [ 1171.919959] CPU: 1 PID: 53 Comm: kworker/1:1 Tainted: GW 
> 5.2.0-rc1-kfd-yangp #196
> [ 1171.919961] Hardware name: ASUS All Series/Z97-PRO(Wi-Fi ac)/USB 3.1, 
> BIOS 9001 03/07/2016
> [ 1171.919965] Workqueue: rcu_gp srcu_invoke_callbacks
> [ 1171.919968] Call Trace:
> [ 1171.919974]  dump_stack+0x67/0x9b
> [ 1171.919980]  ___might_sleep+0x149/0x230
> [ 1171.919985]  down_write+0x1c/0x70
> [ 1171.919989]  hmm_free_rcu+0x24/0x80
> [ 1171.919993]  srcu_invoke_callbacks+0xc9/0x150
> [ 1171.92]  process_one_work+0x28e/0x630
> [ 1171.920008]  worker_thread+0x39/0x3f0
> [ 1171.920014]  ? process_one_work+0x630/0x630
> [ 1171.920017]  kthread+0x11c/0x140
> [ 1171.920021]  ? kthread_park+0x90/0x90
> [ 1171.920026]  ret_from_fork+0x24/0x30

Thank you Phillip, it seems the prior tests were not done with
lockdep..

Sigh, I will keep this with the gross pagetable_lock then. I updated
the patches on the git with this modification. I think we have covered
all the bases so I will send another V of the series to the list and
if no more comments then it will move ahead to hmm.git. Thanks to all.

diff --git a/mm/hmm.c b/mm/hmm.c
index 136c812faa2790..4c64d4c32f4825 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -49,16 +49,15 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 
lockdep_assert_held_exclusive(>mmap_sem);
 
+   /* Abuse the page_table_lock to also protect mm->hmm. */
+   spin_lock(>page_table_lock);
if (mm->hmm) {
-   if (kref_get_unless_zero(>hmm->kref))
+   if (kref_get_unless_zero(>hmm->kref)) {
+   spin_unlock(>page_table_lock);
return mm->hmm;
-   /*
-* The hmm is being freed by some other CPU and is pending a
-* RCU grace period, but this CPU can NULL now it since we
-* have the mmap_sem.
-*/
-   mm->hmm = NULL;
+   }
}
+   spin_unlock(>page_table_lock);
 
hmm = kmalloc(sizeof(*hmm), GFP_KERNEL);
if (!hmm)
@@ -81,7 +80,14 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
}
 
mmgrab(hmm->mm);
+
+   /*
+* We hold the exclusive mmap_sem here so we know that mm->hmm is
+* still NULL or 0 kref, and is safe to update.
+*/
+   spin_lock(>page_table_lock);
mm->hmm = hmm;
+   spin_unlock(>page_table_lock);
return hmm;
 }
 
@@ -89,10 +95,14 @@ static void hmm_free_rcu(struct rcu_head *rcu)
 {
struct hmm *hmm = container_of(rcu, struct hmm, rcu);
 
-   down_write(>mm->mmap_sem);
+   /*
+* The mm->hmm pointer is kept valid while notifier ops can be running
+* so they don't have to deal with a NULL mm->hmm value
+*/
+   spin_lock(>mm->page_table_lock);
if (hmm->mm->hmm == hmm)
hmm->mm->hmm = NULL;
-   up_write(>mm->mmap_sem);
+   spin_unlock(>mm->page_table_lock);
mmdrop(hmm->mm);
 
kfree(hmm);
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2 hmm 00/11] Various revisions from a locking/code review

2019-06-12 Thread Yang, Philip
Rebase to https://github.com/jgunthorpe/linux.git hmm branch, need some 
changes because of interface hmm_range_register change. Then run a quick 
amdgpu_test. Test is finished, result is ok. But there is below kernel 
BUG message, seems hmm_free_rcu calls down_write.

[ 1171.919921] BUG: sleeping function called from invalid context at 
/home/yangp/git/compute_staging/kernel/kernel/locking/rwsem.c:65
[ 1171.919933] in_atomic(): 1, irqs_disabled(): 0, pid: 53, name: 
kworker/1:1
[ 1171.919938] 2 locks held by kworker/1:1/53:
[ 1171.919940]  #0: 1c7c19d4 ((wq_completion)rcu_gp){+.+.}, at: 
process_one_work+0x20e/0x630
[ 1171.919951]  #1: 923f2cfa 
((work_completion)(>work)){+.+.}, at: process_one_work+0x20e/0x630
[ 1171.919959] CPU: 1 PID: 53 Comm: kworker/1:1 Tainted: GW 
5.2.0-rc1-kfd-yangp #196
[ 1171.919961] Hardware name: ASUS All Series/Z97-PRO(Wi-Fi ac)/USB 3.1, 
BIOS 9001 03/07/2016
[ 1171.919965] Workqueue: rcu_gp srcu_invoke_callbacks
[ 1171.919968] Call Trace:
[ 1171.919974]  dump_stack+0x67/0x9b
[ 1171.919980]  ___might_sleep+0x149/0x230
[ 1171.919985]  down_write+0x1c/0x70
[ 1171.919989]  hmm_free_rcu+0x24/0x80
[ 1171.919993]  srcu_invoke_callbacks+0xc9/0x150
[ 1171.92]  process_one_work+0x28e/0x630
[ 1171.920008]  worker_thread+0x39/0x3f0
[ 1171.920014]  ? process_one_work+0x630/0x630
[ 1171.920017]  kthread+0x11c/0x140
[ 1171.920021]  ? kthread_park+0x90/0x90
[ 1171.920026]  ret_from_fork+0x24/0x30

Philip

On 2019-06-12 1:54 p.m., Kuehling, Felix wrote:
> [+Philip]
> 
> Hi Jason,
> 
> I'm out of the office this week.
> 
> Hi Philip, can you give this a go? Not sure how much you've been
> following this patch series review. Message or call me on Skype to
> discuss any questions.
> 
> Thanks,
>     Felix
> 
> On 2019-06-11 12:48, Jason Gunthorpe wrote:
>> On Thu, Jun 06, 2019 at 03:44:27PM -0300, Jason Gunthorpe wrote:
>>> From: Jason Gunthorpe 
>>>
>>> For hmm.git:
>>>
>>> This patch series arised out of discussions with Jerome when looking at the
>>> ODP changes, particularly informed by use after free races we have already
>>> found and fixed in the ODP code (thanks to syzkaller) working with mmu
>>> notifiers, and the discussion with Ralph on how to resolve the lifetime 
>>> model.
>>>
>>> Overall this brings in a simplified locking scheme and easy to explain
>>> lifetime model:
>>>
>>>If a hmm_range is valid, then the hmm is valid, if a hmm is valid then 
>>> the mm
>>>is allocated memory.
>>>
>>>If the mm needs to still be alive (ie to lock the mmap_sem, find a vma, 
>>> etc)
>>>then the mmget must be obtained via mmget_not_zero().
>>>
>>> Locking of mm->hmm is shifted to use the mmap_sem consistently for all
>>> read/write and unlocked accesses are removed.
>>>
>>> The use unlocked reads on 'hmm->dead' are also eliminated in favour of using
>>> standard mmget() locking to prevent the mm from being released. Many of the
>>> debugging checks of !range->hmm and !hmm->mm are dropped in favour of 
>>> poison -
>>> which is much clearer as to the lifetime intent.
>>>
>>> The trailing patches are just some random cleanups I noticed when reviewing
>>> this code.
>>>
>>> This v2 incorporates alot of the good off list changes & feedback Jerome 
>>> had,
>>> and all the on-list comments too. However, now that we have the shared git I
>>> have kept the one line change to nouveau_svm.c rather than the compat
>>> funtions.
>>>
>>> I believe we can resolve this merge in the DRM tree now and keep the core
>>> mm/hmm.c clean. DRM maintainers, please correct me if I'm wrong.
>>>
>>> It is on top of hmm.git, and I have a git tree of this series to ease 
>>> testing
>>> here:
>>>
>>> https://github.com/jgunthorpe/linux/tree/hmm
>>>
>>> There are still some open locking issues, as I think this remains 
>>> unaddressed:
>>>
>>> https://lore.kernel.org/linux-mm/20190527195829.gb18...@mellanox.com/
>>>
>>> I'm looking for some more acks, reviews and tests so this can move ahead to
>>> hmm.git.
>> AMD Folks, this is looking pretty good now, can you please give at
>> least a Tested-by for the new driver code using this that I see in
>> linux-next?
>>
>> Thanks,
>> Jason
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2 hmm 00/11] Various revisions from a locking/code review

2019-06-12 Thread Kuehling, Felix
[+Philip]

Hi Jason,

I'm out of the office this week.

Hi Philip, can you give this a go? Not sure how much you've been 
following this patch series review. Message or call me on Skype to 
discuss any questions.

Thanks,
   Felix

On 2019-06-11 12:48, Jason Gunthorpe wrote:
> On Thu, Jun 06, 2019 at 03:44:27PM -0300, Jason Gunthorpe wrote:
>> From: Jason Gunthorpe 
>>
>> For hmm.git:
>>
>> This patch series arised out of discussions with Jerome when looking at the
>> ODP changes, particularly informed by use after free races we have already
>> found and fixed in the ODP code (thanks to syzkaller) working with mmu
>> notifiers, and the discussion with Ralph on how to resolve the lifetime 
>> model.
>>
>> Overall this brings in a simplified locking scheme and easy to explain
>> lifetime model:
>>
>>   If a hmm_range is valid, then the hmm is valid, if a hmm is valid then the 
>> mm
>>   is allocated memory.
>>
>>   If the mm needs to still be alive (ie to lock the mmap_sem, find a vma, 
>> etc)
>>   then the mmget must be obtained via mmget_not_zero().
>>
>> Locking of mm->hmm is shifted to use the mmap_sem consistently for all
>> read/write and unlocked accesses are removed.
>>
>> The use unlocked reads on 'hmm->dead' are also eliminated in favour of using
>> standard mmget() locking to prevent the mm from being released. Many of the
>> debugging checks of !range->hmm and !hmm->mm are dropped in favour of poison 
>> -
>> which is much clearer as to the lifetime intent.
>>
>> The trailing patches are just some random cleanups I noticed when reviewing
>> this code.
>>
>> This v2 incorporates alot of the good off list changes & feedback Jerome had,
>> and all the on-list comments too. However, now that we have the shared git I
>> have kept the one line change to nouveau_svm.c rather than the compat
>> funtions.
>>
>> I believe we can resolve this merge in the DRM tree now and keep the core
>> mm/hmm.c clean. DRM maintainers, please correct me if I'm wrong.
>>
>> It is on top of hmm.git, and I have a git tree of this series to ease testing
>> here:
>>
>> https://github.com/jgunthorpe/linux/tree/hmm
>>
>> There are still some open locking issues, as I think this remains 
>> unaddressed:
>>
>> https://lore.kernel.org/linux-mm/20190527195829.gb18...@mellanox.com/
>>
>> I'm looking for some more acks, reviews and tests so this can move ahead to
>> hmm.git.
> AMD Folks, this is looking pretty good now, can you please give at
> least a Tested-by for the new driver code using this that I see in
> linux-next?
>
> Thanks,
> Jason
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2 hmm 00/11] Various revisions from a locking/code review

2019-06-12 Thread Jason Gunthorpe
On Thu, Jun 06, 2019 at 03:44:27PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe 
> 
> For hmm.git:
> 
> This patch series arised out of discussions with Jerome when looking at the
> ODP changes, particularly informed by use after free races we have already
> found and fixed in the ODP code (thanks to syzkaller) working with mmu
> notifiers, and the discussion with Ralph on how to resolve the lifetime model.
> 
> Overall this brings in a simplified locking scheme and easy to explain
> lifetime model:
> 
>  If a hmm_range is valid, then the hmm is valid, if a hmm is valid then the mm
>  is allocated memory.
> 
>  If the mm needs to still be alive (ie to lock the mmap_sem, find a vma, etc)
>  then the mmget must be obtained via mmget_not_zero().
> 
> Locking of mm->hmm is shifted to use the mmap_sem consistently for all
> read/write and unlocked accesses are removed.
> 
> The use unlocked reads on 'hmm->dead' are also eliminated in favour of using
> standard mmget() locking to prevent the mm from being released. Many of the
> debugging checks of !range->hmm and !hmm->mm are dropped in favour of poison -
> which is much clearer as to the lifetime intent.
> 
> The trailing patches are just some random cleanups I noticed when reviewing
> this code.
> 
> This v2 incorporates alot of the good off list changes & feedback Jerome had,
> and all the on-list comments too. However, now that we have the shared git I
> have kept the one line change to nouveau_svm.c rather than the compat
> funtions.
> 
> I believe we can resolve this merge in the DRM tree now and keep the core
> mm/hmm.c clean. DRM maintainers, please correct me if I'm wrong.
> 
> It is on top of hmm.git, and I have a git tree of this series to ease testing
> here:
> 
> https://github.com/jgunthorpe/linux/tree/hmm
> 
> There are still some open locking issues, as I think this remains unaddressed:
> 
> https://lore.kernel.org/linux-mm/20190527195829.gb18...@mellanox.com/
> 
> I'm looking for some more acks, reviews and tests so this can move ahead to
> hmm.git.

AMD Folks, this is looking pretty good now, can you please give at
least a Tested-by for the new driver code using this that I see in
linux-next?

Thanks,
Jason
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel