On 2024/10/26 10:21, Yu Kuai wrote:
> Hi,
> 
> 在 2024/10/25 18:40, Tetsuo Handa 写道:
>> On 2024/10/25 16:05, Yang Erkun wrote:
>>> From: Yang Erkun <yanger...@huawei.com>
>>>
>>> My colleague Wupeng found the following problems during fault injection:
>>>
>>> BUG: unable to handle page fault for address: fffffbfff809d073
>>
>> Excuse me, but subject says "null pointer" whereas dmesg says
>> "not a null pointer dereference". Is this a use-after-free bug?
>> Also, what verb comes after "when modprobe brd" ?
>>
>> Is this problem happening with parallel execution? If yes, parallelly
>> running what and what?
> 
> The problem is straightforward, to be short,
> 
> T1: morprobe brd
> brd_init
>  brd_alloc
>   add_disk
>         T2: open brd
>         bdev_open
>          try_module_get
>   // err path
>   brd_cleanup
>           // dereference brd_fops() while module is freed.

Then, fault injection is irrelevant, isn't it?

If bdev_open() can grab a reference before module's initialization phase
completes is a problem, I think that we can fix the problem with just

 int bdev_open(struct block_device *bdev, blk_mode_t mode, void *holder,
              const struct blk_holder_ops *hops, struct file *bdev_file)
 {
 (...snipped...)
        ret = -ENXIO;
        if (!disk_live(disk))
                goto abort_claiming;
-       if (!try_module_get(disk->fops->owner))
+       if ((disk->fops->owner && module_is_coming(disk->fops->owner)) || 
!try_module_get(disk->fops->owner))
                goto abort_claiming;
        ret = -EBUSY;
        if (!bdev_may_open(bdev, mode))
 (...snipped...)
 }

change. It would be cleaner if we can do

 bool try_module_get(struct module *module)
 {
        bool ret = true;
 
        if (module) {
                /* Note: here, we can fail to get a reference */
-               if (likely(module_is_live(module) &&
+               if (likely(module_is_live(module) && !module_is_coming(module) 
&&
                           atomic_inc_not_zero(&module->refcnt) != 0))
                        trace_module_get(module, _RET_IP_);
                else
                        ret = false;
        }
        return ret;
 }

but I don't know if this change breaks something.
Maybe we can introduce a variant like

bool try_module_get_safe(struct module *module)
{
        bool ret = true;

        if (module) {
                /* Note: here, we can fail to get a reference */
                if (likely(module_is_live(module) && !module_is_coming(module) 
&&
                           atomic_inc_not_zero(&module->refcnt) != 0))
                        trace_module_get(module, _RET_IP_);
                else
                        ret = false;
        }
        return ret;
}

and use it from bdev_open().


Reply via email to