Re: KASAN: use-after-free Read in tcf_block_find

2018-09-27 Thread Dmitry Vyukov
On Thu, Sep 27, 2018 at 7:50 PM, Cong Wang  wrote:
> On Thu, Sep 27, 2018 at 1:11 AM Dmitry Vyukov  wrote:
>>
>> Would a stack trace for call_rcu be helpful here? I have this idea for
>> a long time, but never get around to implementing it:
>> https://bugzilla.kernel.org/show_bug.cgi?id=198437
>
> Yes. Generally speaking, showing backtrace of call_rcu()
> or schedule_work(0 etc. is very helpful, we are more interested
> in who calls call_rcu() than what that RCU callback does.
>
> BTW, yesterday I asked syzbot to test this:
> https://github.com/congwang/linux/commit/b7815584cf1c0bbb79e8f6fe3e4b66ba10375560
> I still don't get any result.

I see that test job. It's in some dead loop, now trying to run for 37'th time.
I've just pushed a fix a bug that could have caused it (fuzzing the
fuzzer, we should go deeper!):
https://github.com/google/syzkaller/commit/98b28ead6ceaf22064b9715cc1950848d2bdef0b
If it won't help, I will take a look tomorrow.


> For this specific bug, we should hold a refcnt in dev->qdisc, I don't
> even see how call_rcu() could be invoked, unless of course we mess
> up with qdisc refcnt.


Re: KASAN: use-after-free Read in tcf_block_find

2018-09-27 Thread Cong Wang
On Thu, Sep 27, 2018 at 1:11 AM Dmitry Vyukov  wrote:
>
> Would a stack trace for call_rcu be helpful here? I have this idea for
> a long time, but never get around to implementing it:
> https://bugzilla.kernel.org/show_bug.cgi?id=198437

Yes. Generally speaking, showing backtrace of call_rcu()
or schedule_work(0 etc. is very helpful, we are more interested
in who calls call_rcu() than what that RCU callback does.

BTW, yesterday I asked syzbot to test this:
https://github.com/congwang/linux/commit/b7815584cf1c0bbb79e8f6fe3e4b66ba10375560
I still don't get any result.

For this specific bug, we should hold a refcnt in dev->qdisc, I don't
even see how call_rcu() could be invoked, unless of course we mess
up with qdisc refcnt.


Re: KASAN: use-after-free Read in tcf_block_find

2018-09-27 Thread Dmitry Vyukov
On Thu, Sep 27, 2018 at 3:24 PM, Eric Dumazet  wrote:
> On 09/27/2018 06:02 AM, Dmitry Vyukov wrote:
>
>> I am not suggesting to commit this. This is just a hack for debugging.
>> It in fact lead to some warnings, but still allowed me to reproduce
>> the bug reliably.
>>
>
> Had you got more meaningful stack traces ?
>
> (Showing which context was actually doing the dst_release())

nope... I just handed off the instructions to Wei since she said she
can't reproduce it.


 +   dst_destroy_rcu(>rcu_head);
 +   }
 +   }
 }
  }
>
> Thanks.


Re: KASAN: use-after-free Read in tcf_block_find

2018-09-27 Thread Eric Dumazet



On 09/27/2018 06:02 AM, Dmitry Vyukov wrote:

> I am not suggesting to commit this. This is just a hack for debugging.
> It in fact lead to some warnings, but still allowed me to reproduce
> the bug reliably.
> 

Had you got more meaningful stack traces ?

(Showing which context was actually doing the dst_release())

>>> +   dst_destroy_rcu(>rcu_head);
>>> +   }
>>> +   }
>>> }
>>>  }

Thanks.


Re: KASAN: use-after-free Read in tcf_block_find

2018-09-27 Thread Dmitry Vyukov
On Thu, Sep 27, 2018 at 3:00 PM, Eric Dumazet  wrote:
>
>
> On 09/27/2018 01:10 AM, Dmitry Vyukov wrote:
>
>>
>> Would a stack trace for call_rcu be helpful here? I have this idea for
>> a long time, but never get around to implementing it:
>> https://bugzilla.kernel.org/show_bug.cgi?id=198437
>>
>> Also FWIW I recently used the following hack for another net bug. It
>> made that other bug involving call_rcu way more likely to fire. Maybe
>> it will be helpful here too.
>>
>> diff --git a/net/core/dst.c b/net/core/dst.c
>> index 81ccf20e28265..591a8d0aca545 100644
>> --- a/net/core/dst.c
>> +++ b/net/core/dst.c
>> @@ -187,8 +187,16 @@ void dst_release(struct dst_entry *dst)
>> if (unlikely(newrefcnt < 0))
>> net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
>>  __func__, dst, newrefcnt);
>> -   if (!newrefcnt)
>> -   call_rcu(>rcu_head, dst_destroy_rcu);
>> +   if (!newrefcnt) {
>> +   if (lock_is_held(_bh_lock_map) ||
>> +   lock_is_held(_lock_map) ||
>> +   lock_is_held(_sched_lock_map)) {
>> +   call_rcu(>rcu_head, dst_destroy_rcu);
>> +   } else {
>> +   synchronize_rcu();
>
> dst_release() can be called in context we hold a spinlock, this would be bad 
> to reschedule here.

I am not suggesting to commit this. This is just a hack for debugging.
It in fact lead to some warnings, but still allowed me to reproduce
the bug reliably.

>> +   dst_destroy_rcu(>rcu_head);
>> +   }
>> +   }
>> }
>>  }


Re: KASAN: use-after-free Read in tcf_block_find

2018-09-27 Thread Eric Dumazet



On 09/27/2018 01:10 AM, Dmitry Vyukov wrote:

> 
> Would a stack trace for call_rcu be helpful here? I have this idea for
> a long time, but never get around to implementing it:
> https://bugzilla.kernel.org/show_bug.cgi?id=198437
> 
> Also FWIW I recently used the following hack for another net bug. It
> made that other bug involving call_rcu way more likely to fire. Maybe
> it will be helpful here too.
> 
> diff --git a/net/core/dst.c b/net/core/dst.c
> index 81ccf20e28265..591a8d0aca545 100644
> --- a/net/core/dst.c
> +++ b/net/core/dst.c
> @@ -187,8 +187,16 @@ void dst_release(struct dst_entry *dst)
> if (unlikely(newrefcnt < 0))
> net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
>  __func__, dst, newrefcnt);
> -   if (!newrefcnt)
> -   call_rcu(>rcu_head, dst_destroy_rcu);
> +   if (!newrefcnt) {
> +   if (lock_is_held(_bh_lock_map) ||
> +   lock_is_held(_lock_map) ||
> +   lock_is_held(_sched_lock_map)) {
> +   call_rcu(>rcu_head, dst_destroy_rcu);
> +   } else {
> +   synchronize_rcu();

dst_release() can be called in context we hold a spinlock, this would be bad to 
reschedule here.

> +   dst_destroy_rcu(>rcu_head);
> +   }
> +   }
> }
>  }
>