On 3/28/24 13:30, Christoph Hellwig wrote:
> I think this should go into the previous patch, splitting it
> out just causes confusion.
> 
>> +static void disk_free_zone_wplug(struct blk_zone_wplug *zwplug)
>> +{
>> +    struct gendisk *disk = zwplug->disk;
>> +    unsigned long flags;
>> +
>> +    if (zwplug->flags & BLK_ZONE_WPLUG_NEEDS_FREE) {
>> +            kfree(zwplug);
>> +            return;
>> +    }
>> +
>> +    spin_lock_irqsave(&disk->zone_wplugs_lock, flags);
>> +    list_add_tail(&zwplug->link, &disk->zone_wplugs_free_list);
>> +    spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags);
>> +}
>> +
>>  static bool disk_insert_zone_wplug(struct gendisk *disk,
>>                                 struct blk_zone_wplug *zwplug)
>>  {
>> @@ -630,18 +665,24 @@ static struct blk_zone_wplug 
>> *disk_get_zone_wplug(struct gendisk *disk,
>>      return zwplug;
>>  }
>>  
>> +static void disk_free_zone_wplug_rcu(struct rcu_head *rcu_head)
>> +{
>> +    struct blk_zone_wplug *zwplug =
>> +            container_of(rcu_head, struct blk_zone_wplug, rcu_head);
>> +
>> +    disk_free_zone_wplug(zwplug);
>> +}
> 
> Please verify my idea carefully, but I think we can do without the
> RCU grace period and thus the rcu_head in struct blk_zone_wplug:
> 
> When the zwplug is removed from the hash, we set the
> BLK_ZONE_WPLUG_UNHASHED flag under disk->zone_wplugs_lock.  Once
> caller see that flag any lookup that modifies the structure
> will fail/wait.  If we then just clear BLK_ZONE_WPLUG_UNHASHED after
> the final put in disk_put_zone_wplug when we know the bio list is
> empty and no other state is kept (if there might be flags left
> we should clear them before), it is perfectly fine for the
> zwplug to get reused for another zone at this point.

That was my thinking initially as well, which is why I did not have the grace
period. However, getting a reference on a plug is a not done under
disk->zone_wplugs_lock and is thus racy, albeit with a super tiny time window:
the hash table lookup may "see" a plug that has already been removed and has a
refcount dropped to 0 already. The use of atomic_inc_not_zero() prevents us from
trying to keep using that stale plug, but we *are* referencing it. So without
the grace period, I think there is a risk (again, super tiny window) that we
start reusing the plug, or kfree it while atomic_inc_not_zero() is executing...
I am overthinking this ?

-- 
Damien Le Moal
Western Digital Research


Reply via email to