On 08/03/2020 at 02:33, Andrew Doran wrote:
> On Sat, Mar 07, 2020 at 12:24:21PM +0100, Maxime Villard wrote:
>> Can we revert the "__aligned(COHERENCY_UNIT)" for now? There is no particular
>> hurry to fix this bug, however the KUBSAN instance has been down for more
>> than two months because of this, and it needs to be addressed.
> That should be quelled now.

The change is not correct, see my answer in response to the commit.

>> Similarly, the KASAN instance is currently crashing hard on:
>> https://syzkaller.appspot.com/bug?id=1aa3f789d356bf04644bcef632bf8c2373398ba2
>> Tens of thousands of times each day. This has been the case for two weeks,
>> and it too needs to be addressed.
> That's been there since I started looking last year.

Not sure what you mean? The history log indicates that it started on Jan 7th.

> I guess it's a false positive because the sanitiser probably thinks objects
> are gone once pool_cache_put() is called, but the actual point of disposal
> is the pool_cache dtor.

There is no false positive because of that. If a dtor is there, the data is
left as valid:

#ifdef KASAN
    /* If there is a ctor/dtor, leave the data as valid. */
    if (__predict_false(pc_has_ctor(pc) || pc_has_dtor(pc))) {
            return;
    }
#endif

And in all cases, the instance is compiled with POOL_QUARANTINE, which
"cancels" caches, that is, pool_cache_put directly returns the object to
the pool layer. So the buffer remains KASAN-valid, goes through dtor,
then is made KASAN-invalid, and finally lands in the pool.
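
To make that ordering concrete, here is a minimal user-space sketch. It is
a toy model and not the kernel code: every name in it (toy_obj, toy_dtor,
toy_cache_put) is hypothetical, and the "shadow" is reduced to a single
boolean. It only illustrates the sequence described above, assuming the
quarantine configuration where the put path bypasses the caches.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_obj {
	bool shadow_valid;	/* stands in for the KASAN shadow state */
	bool dtor_saw_valid;	/* records what the dtor observed */
	bool in_pool;		/* true once handed back to the pool layer */
};

static void
toy_dtor(struct toy_obj *o)
{
	/* The dtor touches the object, so it must still be valid here. */
	o->dtor_saw_valid = o->shadow_valid;
}

/* Quarantine-style put: no caching, the object goes straight to the pool. */
static void
toy_cache_put(struct toy_obj *o)
{
	assert(!o->in_pool);		/* catch a double put in the model */
	toy_dtor(o);			/* 1. dtor runs on valid memory */
	o->shadow_valid = false;	/* 2. only now mark the buffer invalid */
	o->in_pool = true;		/* 3. the object lands in the pool */
}
```

With an object that starts out valid, a single toy_cache_put() leaves
dtor_saw_valid set and shadow_valid cleared, so the dtor itself cannot be
what trips the sanitizer in this model.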

Looking at the report, it looks like the use-after-free is on this line
in mutex_oncpu():

    l = (lwp_t *)MUTEX_OWNER(owner);
    ci = l->l_cpu;

So mutex_oncpu() is called on a lock somehow (previously?) held by an LWP
that now has been freed. If you look at the different reports, this issue
is triggered in random places. Maybe one of your recent changes leaves the
mutex as owned somehow?
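
As an illustration of that hypothesis only, here is a small user-space
sketch of a lock whose owner word outlives its owner. All names (toy_lwp,
toy_mutex, TOY_FLAG_BITS, toy_owner_is_stale) are hypothetical, and the
assumption that the low bits of the owner word carry flags is mine for the
purpose of the model. The "freed" flag replaces an actual free so that the
example never dereferences released memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not the kernel implementation. */
#define TOY_FLAG_BITS	((uintptr_t)0x0f)	/* assumed: low bits hold flags */

struct toy_lwp {
	_Alignas(16) int cpu;	/* stands in for l->l_cpu */
	bool freed;		/* set when the LWP structure is released */
};

struct toy_mutex {
	uintptr_t owner;	/* owner pointer plus flag bits, 0 if unowned */
};

static void
toy_mutex_enter(struct toy_mutex *m, struct toy_lwp *l)
{
	/* The pointer must leave the flag bits free. */
	assert(((uintptr_t)l & TOY_FLAG_BITS) == 0);
	m->owner = (uintptr_t)l;
}

static void
toy_mutex_exit(struct toy_mutex *m)
{
	m->owner = 0;
}

static struct toy_lwp *
toy_mutex_owner(const struct toy_mutex *m)
{
	return (struct toy_lwp *)(m->owner & ~TOY_FLAG_BITS);
}

/* True when the owner word still names an LWP that has been freed. */
static bool
toy_owner_is_stale(const struct toy_mutex *m)
{
	struct toy_lwp *l = toy_mutex_owner(m);

	return l != NULL && l->freed;
}
```

If an LWP is marked freed between toy_mutex_enter() and toy_mutex_exit(),
toy_owner_is_stale() returns true, which is the situation a later spinner
reading the owner's CPU would run into.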

