On 06/07/2017 16:48, Peter Zijlstra wrote:
> On Thu, Jul 06, 2017 at 03:46:59PM +0200, Laurent Dufour wrote:
>> On 05/07/2017 20:50, Peter Zijlstra wrote:
>>> On Fri, Jun 16, 2017 at 07:52:33PM +0200, Laurent Dufour wrote:
>>>> @@ -2294,8 +2295,19 @@ static bool pte_map_lock(struct vm_fault *vmf)
>>>>    if (vma_has_changed(vmf->vma, vmf->sequence))
>>>>            goto out;
>>>>  
>>>> -  pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
>>>> -                            vmf->address, &ptl);
> 
>>>> +  ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
>>>> +  pte = pte_offset_map(vmf->pmd, vmf->address);
>>>> +  if (unlikely(!spin_trylock(ptl))) {
>>>> +          pte_unmap(pte);
>>>> +          goto out;
>>>> +  }
>>>> +
>>>>    if (vma_has_changed(vmf->vma, vmf->sequence)) {
>>>>            pte_unmap_unlock(pte, ptl);
>>>>            goto out;
>>>
>>> Right, so if you look at my earlier patches you'll see I did something
>>> quite disgusting here.
>>>
>>> Not sure that wants repeating, but I cannot remember why I thought this
>>> deadlock didn't exist anymore.
>>
>> Regarding the deadlock I did face it on my Power victim node, so I guess it
>> is still there, and the stack traces are quite explicit.
>> Am I missing something here ?
> 
> No, you are right in that the deadlock is quite real. What I cannot
> remember is what made me think to remove the really 'wonderful' code I
> had to deal with it.
> 
> That said, you might want to look at how often you terminate the
> speculation because of your trylock failing. If that shows up at all we
> might need to do something about it.

Based on the benchmarks I ran, the trylock doesn't fail very often, but I
was thinking about adding some counters here. The system already accounts
for major and minor page faults, in current->maj_flt and current->min_flt
respectively. I was wondering whether an additional type like async_flt
would be welcome, or whether there is a smarter way to get that metric.
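
To make the idea concrete, here is a minimal userspace sketch of such a
counter. It is only an illustration, not kernel code: pthread_mutex_trylock()
stands in for spin_trylock(), struct fault_stats stands in for the per-task
counters, and the spf_abort field and speculative_lock() helper are
hypothetical names that do not exist in the patch.

```c
#include <pthread.h>
#include <stdbool.h>

/*
 * Userspace stand-in for the per-task fault counters. maj_flt and min_flt
 * mirror the existing fields; spf_abort is a hypothetical new counter.
 */
struct fault_stats {
	unsigned long maj_flt;
	unsigned long min_flt;
	unsigned long spf_abort;	/* hypothetical: aborts due to trylock failure */
};

/*
 * Try to take the lock without blocking, as pte_map_lock() does with
 * spin_trylock(). On contention, count the abort and return false so the
 * caller can give up the speculative path.
 */
static bool speculative_lock(pthread_mutex_t *ptl, struct fault_stats *st)
{
	if (pthread_mutex_trylock(ptl) != 0) {
		st->spf_abort++;
		return false;
	}
	return true;
}
```

Comparing spf_abort with the total number of faults handled would show how
often the speculation is terminated by a failing trylock.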

Feel free to advise.

Thanks
Laurent.
