On Mon 28-09-20 16:31:01, paul...@kernel.org wrote:
[...]
> This commit therefore uses preemptible() to determine whether allocation
> is possible at all for double-argument kvfree_rcu().

This deserves a comment. Because GFP_ATOMIC is possible for many
!preemptible() contexts. It is the raw_spin_lock, NMIs and likely few
others that are a problem. You are taking a conservative approach which
is fine but it would be good to articulate that explicitly.

> If !preemptible(),
> then allocation is not possible, and kvfree_rcu() falls back to using
> the less cache-friendly rcu_head approach.  Even when preemptible(),
> the caller might be involved in reclaim, so the GFP_ flags used by
> double-argument kvfree_rcu() must avoid invoking reclaim processing.

Could you be more specific? Is this about being called directly in the
reclaim context and you want to prevent a recursion? If that is the
case, do you really need to special case this in any way? Any memory
reclaim will set PF_MEMALLOC so allocations called from that context
will not perform reclaim. So if you are called from the reclaim directly
then you might want to do GFP_KERNEL | __GFP_NOMEMALLOC | __GFP_NOWARN.
That should handle both from-the-recalim and outside of reclaim contexts
just fine (assuming you don't allocated from !preemptible()) context.

> Note that single-argument kvfree_rcu() must be invoked in sleepable
> contexts, and that its fallback is the relatively high latency
> synchronize_rcu().  Single-argument kvfree_rcu() therefore uses
> GFP_KERNEL|__GFP_RETRY_MAYFAIL to allow limited sleeping within the
> memory allocator.
[...]
>  static inline bool
> -kvfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp, void *ptr)
> +add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
> +     unsigned long *flags, void *ptr, bool can_sleep)
>  {
>       struct kvfree_rcu_bulk_data *bnode;
> +     bool can_alloc_page = preemptible();
> +     gfp_t gfp = (can_sleep ? GFP_KERNEL | __GFP_RETRY_MAYFAIL : GFP_ATOMIC) 
> | __GFP_NOWARN;

This is quite confusing IMHO. At least without a further explanation.
can_sleep is not as much about sleeping as it is about the reclaim
recursion AFAIU your changelog, right?

>       int idx;
>  
> -     if (unlikely(!krcp->initialized))
> +     *krcp = krc_this_cpu_lock(flags);
> +     if (unlikely(!(*krcp)->initialized))
>               return false;
>  
> -     lockdep_assert_held(&krcp->lock);
>       idx = !!is_vmalloc_addr(ptr);
>  
>       /* Check if a new block is required. */
> -     if (!krcp->bkvhead[idx] ||
> -                     krcp->bkvhead[idx]->nr_records == KVFREE_BULK_MAX_ENTR) 
> {
> -             bnode = get_cached_bnode(krcp);
> -             if (!bnode) {
> -                     /*
> -                      * To keep this path working on raw non-preemptible
> -                      * sections, prevent the optional entry into the
> -                      * allocator as it uses sleeping locks. In fact, even
> -                      * if the caller of kfree_rcu() is preemptible, this
> -                      * path still is not, as krcp->lock is a raw spinlock.
> -                      * With additional page pre-allocation in the works,
> -                      * hitting this return is going to be much less likely.
> -                      */
> -                     if (IS_ENABLED(CONFIG_PREEMPT_RT))
> -                             return false;
> -
> -                     /*
> -                      * NOTE: For one argument of kvfree_rcu() we can
> -                      * drop the lock and get the page in sleepable
> -                      * context. That would allow to maintain an array
> -                      * for the CONFIG_PREEMPT_RT as well if no cached
> -                      * pages are available.
> -                      */
> -                     bnode = (struct kvfree_rcu_bulk_data *)
> -                             __get_free_page(GFP_NOWAIT | __GFP_NOWARN);
> +     if (!(*krcp)->bkvhead[idx] ||
> +                     (*krcp)->bkvhead[idx]->nr_records == 
> KVFREE_BULK_MAX_ENTR) {
> +             bnode = get_cached_bnode(*krcp);
> +             if (!bnode && can_alloc_page) {
> +                     krc_this_cpu_unlock(*krcp, *flags);
> +                     bnode = kmalloc(PAGE_SIZE, gfp);

What is the point of calling kmalloc  for a PAGE_SIZE object? Wouldn't
using the page allocator directly be better?
-- 
Michal Hocko
SUSE Labs

Reply via email to