On 2/24/26 17:03, Oleg Nesterov wrote:
> On 02/24, Pavel Tikhomirov wrote:
>>
>> On 2/24/26 13:09, Oleg Nesterov wrote:
>>> On 02/23, Pavel Tikhomirov wrote:
>>>>
>>>> To avoid possible problems related to cpu/compiler optimizations around
>>>> ->child_reaper, let's use WRITE_ONCE (additional to task_list lock)
>>>> everywhere we write it and use READ_ONCE everywhere we read it without
>>>> explicit lock.
>>>
>>> Yes, this is what I meant... but I can never recall if READ_ONCE() alone
>>> is enough to make KCSAN happy...
>>
>> AFAICS this should be fine for memory safety of accesses to ->child_reaper.
>> I would love if someone more experienced in this area would confirm.
>
> __READ_ONCE() uses volatile cast, DEFINE_TSAN_VOLATILE_READ_WRITE()
> will pass KCSAN_ACCESS_ATOMIC to check_access(), so it seems that
> READ_ONCE() should be enough...
>
> But I am not sure, I don't really understand this code.
>
>>> I won't insist, but I think it would be better to do this in a separate
>>> change for documenation purposes and for discussion.
>>
>> Ok, will do. It will be a bit ugly as I will first add READ_ONCE to the
>> pidns_for_children_get() and then remove the hunk with it in the next patch.
>
> Agreed, this is ugly. I almost regret I mentioned _ONCE() in the previous
> discussions, I only tried to "nack" another read_lock(tasklist).
Heh, I missed that in pidns_for_children_get() we have read under tasklist
lock, so I don't need to add _ONCE there so it should not be that ugly =).
I will test and send v3 with prep patches soon.
>
> So lets avoid a separate change and WRITE_ONCE()'s in
> copy_process/find_child_reaper,
> we can add them later if KCSAN complains, they are not needed for correctness.
>
> But up to you, I am fine either way.
I also noticed that
https://docs.kernel.org/dev-tools/kcsan.html#c.ASSERT_EXCLUSIVE_WRITER
asks to add ASSERT_EXCLUSIVE_WRITER, so I will try to accommodate that.
>
>>> Perhaps something like the preparational patch below makes sense ? Not
>>> sure this is actually better...
>>
>> This looks more universal at least, as instead of two checks we have one in
>> one
>> place. My only concern of putting the check where I put it was to avoid extra
>> idr_alloc_cyclic() + idr_remove(), if we already know we don't need it. But
>> it's
>> only in last pid_namespace we can have ->child_reaper unset so we do
>> alloc/remove
>> for all other namespaces anyway in error case, should not be a big deal.
>
> Yes...
>
>> I will add the preparation patches: for below patch and related to _ONCE.
>
> Again, up to you. But either way it would be nice to have a comment or at
> least a note in the changelog to explain that this is also needed to avoid
> the race between alloc_pid() + fail and another alloc_pid(). This is subtle.
Yes I will try to emphasize this.
>
> Oleg.
>
Thank you!
--
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.