Re: [External] Re: [PATCH v2 1/8] mm/cma: change cma mutex to irq safe spinlock

2021-03-30 Thread Muchun Song
On Tue, Mar 30, 2021 at 4:18 PM Michal Hocko  wrote:
>
> On Tue 30-03-21 16:08:36, Muchun Song wrote:
> > On Tue, Mar 30, 2021 at 4:01 PM Michal Hocko  wrote:
> > >
> > > On Mon 29-03-21 16:23:55, Mike Kravetz wrote:
> > > > Ideally, cma_release could be called from any context.  However, that is
> > > > not possible because a mutex is used to protect the per-area bitmap.
> > > > Change that mutex to an irq safe spinlock.
> > >
> > > I would phrase the changelog slightly differently:
> > > "
> > > cma_release is currently a sleepable operation because the bitmap
> > > manipulation is protected by the cma->lock mutex. Hugetlb code, which relies
> > > on cma_release for CMA backed (giga) hugetlb pages, however, needs it to be
> > > irq safe.
> > >
> > > The lock doesn't protect any sleepable operation, so it can be changed to
> > > an (irq aware) spin lock. The bitmap processing should be quite fast in
> > > the typical case, but if cma sizes grow to TB then we will likely need to
> > > replace the lock with a more optimized bitmap implementation.
> > > "
> > >
> > > it seems that you are overusing the irqsave variants even from contexts
> > > that are never called from IRQ context, so they do not need to store flags.
> > >
> > > [...]
> > > > @@ -391,8 +391,9 @@ static void cma_debug_show_areas(struct cma *cma)
> > > >   unsigned long start = 0;
> > > >   unsigned long nr_part, nr_total = 0;
> > > >   unsigned long nbits = cma_bitmap_maxno(cma);
> > > > + unsigned long flags;
> > > >
> > > > - mutex_lock(&cma->lock);
> > > > + spin_lock_irqsave(&cma->lock, flags);
> > >
> > > spin_lock_irq should be sufficient. This is only called from the
> > > allocation context and that is never called from IRQ context.
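The difference, sketched with hypothetical helpers (cma_debug_walk and
cma_clear_range are made-up names; only the locking calls matter):
spin_lock_irq() unconditionally re-enables interrupts on unlock, so it is only
correct where interrupts are known to be enabled, while spin_lock_irqsave()
records and restores whatever state the caller had.

        /* Hypothetical helpers, for illustration only. */

        static void cma_debug_walk(struct cma *cma)
        {
                /* Only ever reached from process context with interrupts enabled
                 * (allocation/debugfs paths), so no flags need to be stored. */
                spin_lock_irq(&cma->lock);
                /* ... walk cma->bitmap ... */
                spin_unlock_irq(&cma->lock);
        }

        static void cma_clear_range(struct cma *cma, unsigned long bit, unsigned long nbits)
        {
                unsigned long flags;

                /* May be reached with interrupts already disabled (the cma_release()
                 * side), so the previous irq state must be saved and restored. */
                spin_lock_irqsave(&cma->lock, flags);
                bitmap_clear(cma->bitmap, bit, nbits);
                spin_unlock_irqrestore(&cma->lock, flags);
        }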
> >
> > This makes me think about it more. Shouldn't a plain spin_lock be
> > sufficient?
>
> Nope. Think of the following scenario:
> spin_lock(cma->lock);
> <IRQ>
> put_page
>   __free_huge_page
>     cma_release
>       spin_lock_irqsave() DEADLOCK

Got it. Thanks.
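In other words: with a plain spin_lock the allocator would leave interrupts
enabled while holding cma->lock, and a free arriving by interrupt on the same
CPU would spin forever on a lock that can never be released. Disabling
interrupts across the short bitmap update closes that window; roughly:

        spin_lock_irq(&cma->lock);        /* interrupts off on this CPU */
          bitmap work                     /* short, never sleeps */
        spin_unlock_irq(&cma->lock);      /* interrupts back on */
        <IRQ>                             /* the pending interrupt runs now */
        put_page
          __free_huge_page
            cma_release
              spin_lock_irqsave()         /* lock is free, no deadlock */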

> --
> Michal Hocko
> SUSE Labs


Re: [External] Re: [PATCH v2 1/8] mm/cma: change cma mutex to irq safe spinlock

2021-03-30 Thread Michal Hocko
On Tue 30-03-21 16:08:36, Muchun Song wrote:
> On Tue, Mar 30, 2021 at 4:01 PM Michal Hocko  wrote:
> >
> > On Mon 29-03-21 16:23:55, Mike Kravetz wrote:
> > > Ideally, cma_release could be called from any context.  However, that is
> > > not possible because a mutex is used to protect the per-area bitmap.
> > > Change that mutex to an irq safe spinlock.
> >
> > I would phrase the changelog slightly differently:
> > "
> > cma_release is currently a sleepable operation because the bitmap
> > manipulation is protected by the cma->lock mutex. Hugetlb code, which relies
> > on cma_release for CMA backed (giga) hugetlb pages, however, needs it to be
> > irq safe.
> >
> > The lock doesn't protect any sleepable operation, so it can be changed to
> > an (irq aware) spin lock. The bitmap processing should be quite fast in
> > the typical case, but if cma sizes grow to TB then we will likely need to
> > replace the lock with a more optimized bitmap implementation.
> > "
> >
> > it seems that you are overusing the irqsave variants even from contexts
> > that are never called from IRQ context, so they do not need to store flags.
> >
> > [...]
> > > @@ -391,8 +391,9 @@ static void cma_debug_show_areas(struct cma *cma)
> > >   unsigned long start = 0;
> > >   unsigned long nr_part, nr_total = 0;
> > >   unsigned long nbits = cma_bitmap_maxno(cma);
> > > + unsigned long flags;
> > >
> > > - mutex_lock(&cma->lock);
> > > + spin_lock_irqsave(&cma->lock, flags);
> >
> > spin_lock_irq should be sufficient. This is only called from the
> > allocation context and that is never called from IRQ context.
> 
> This makes me think about it more. Shouldn't a plain spin_lock be
> sufficient?

Nope. Think of the following scenario:
spin_lock(cma->lock);
<IRQ>
put_page
  __free_huge_page
    cma_release
      spin_lock_irqsave() DEADLOCK
-- 
Michal Hocko
SUSE Labs


RE: [External] Re: [PATCH v2 1/8] mm/cma: change cma mutex to irq safe spinlock

2021-03-30 Thread Song Bao Hua (Barry Song)


> -----Original Message-----
> From: Muchun Song [mailto:songmuc...@bytedance.com]
> Sent: Tuesday, March 30, 2021 9:09 PM
> To: Michal Hocko 
> Cc: Mike Kravetz ; Linux Memory Management List
> ; LKML ; Roman Gushchin
> ; Shakeel Butt ; Oscar Salvador
> ; David Hildenbrand ; David Rientjes
> ; linmiaohe ; Peter Zijlstra
> ; Matthew Wilcox ; HORIGUCHI NAOYA
> ; Aneesh Kumar K . V ;
> Waiman Long ; Peter Xu ; Mina Almasry
> ; Hillf Danton ; Joonsoo Kim
> ; Song Bao Hua (Barry Song)
> ; Will Deacon ; Andrew Morton
> 
> Subject: Re: [External] Re: [PATCH v2 1/8] mm/cma: change cma mutex to irq safe spinlock
> 
> On Tue, Mar 30, 2021 at 4:01 PM Michal Hocko  wrote:
> >
> > On Mon 29-03-21 16:23:55, Mike Kravetz wrote:
> > > Ideally, cma_release could be called from any context.  However,
> > > that is not possible because a mutex is used to protect the per-area 
> > > bitmap.
> > > Change that mutex to an irq safe spinlock.
> >
> > I would phrase the changelog slightly differently: "
> > cma_release is currently a sleepable operation because the bitmap
> > manipulation is protected by the cma->lock mutex. Hugetlb code, which
> > relies on cma_release for CMA backed (giga) hugetlb pages, however,
> > needs it to be irq safe.
> >
> > The lock doesn't protect any sleepable operation, so it can be changed
> > to an (irq aware) spin lock. The bitmap processing should be quite fast
> > in the typical case, but if cma sizes grow to TB then we will likely need
> > to replace the lock with a more optimized bitmap implementation.
> > "
> >
> > it seems that you are overusing the irqsave variants even from contexts
> > that are never called from IRQ context, so they do not need to store
> > flags.
> >
> > [...]
> > > @@ -391,8 +391,9 @@ static void cma_debug_show_areas(struct cma *cma)
> > >   unsigned long start = 0;
> > >   unsigned long nr_part, nr_total = 0;
> > >   unsigned long nbits = cma_bitmap_maxno(cma);
> > > + unsigned long flags;
> > >
> > > - mutex_lock(&cma->lock);
> > > + spin_lock_irqsave(&cma->lock, flags);
> >
> > spin_lock_irq should be sufficient. This is only called from the
> > allocation context and that is never called from IRQ context.
> 
> This makes me think about it more. Shouldn't a plain spin_lock be sufficient?
> 

It seems Mike's point is that cma_release might be called from both
irq context and process context.

If it is running in process context, we need the irq-disable to lock out
the irq context, which might jump in and call cma_release on the same CPU
at the same time.

That said, we have never actually seen cma_release called from irq
context so far.
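Either way, the irqsave variant makes the release path correct from both
sides. A rough sketch with hypothetical call sites (today only the
process-context one is known to exist):

        /* Hypothetical call sites, for illustration only. */

        static void proc_ctx_free(struct cma *cma, struct page *page, unsigned long count)
        {
                /* Process context: cma_release() disables interrupts internally while
                 * it holds cma->lock, so an interrupt on this CPU that also wants to
                 * free CMA memory is held off until the bitmap update is done. */
                cma_release(cma, page, count);
        }

        static void irq_ctx_free(struct cma *cma, struct page *page, unsigned long count)
        {
                /* Interrupt (or irqs-off) context: the irqsave/irqrestore pair inside
                 * cma_release() keeps interrupts off and restores that same state. */
                cma_release(cma, page, count);
        }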

> 
> >
> > >   pr_info("number of available pages: ");
> > >   for (;;) {
> > >   next_zero_bit = find_next_zero_bit(cma->bitmap, nbits, start);
> > > @@ -407,7 +408,7 @@ static void cma_debug_show_areas(struct cma *cma)
> > >   start = next_zero_bit + nr_zero;
> > >   }
> > >   pr_cont("=> %lu free of %lu total pages\n", nr_total, cma->count);
> > > - mutex_unlock(&cma->lock);
> > > + spin_unlock_irqrestore(&cma->lock, flags);
> > >  }
> > >  #else
> > >  static inline void cma_debug_show_areas(struct cma *cma) { }
> > > @@ -430,6 +431,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
> > >   unsigned long pfn = -1;
> > >   unsigned long start = 0;
> > >   unsigned long bitmap_maxno, bitmap_no, bitmap_count;
> > > + unsigned long flags;
> > >   size_t i;
> > >   struct page *page = NULL;
> > >   int ret = -ENOMEM;
> > > @@ -454,12 +456,12 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
> > >   goto out;
> > >
> > >   for (;;) {
> > > - mutex_lock(&cma->lock);
> > > + spin_lock_irqsave(&cma->lock, flags);
> > >   bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap,
> > >   bitmap_maxno, start, bitmap_count, mask,
> > >   offset);
> > >   if (bitmap_no >= bitmap_maxno) {
> > > - mutex_unlock(&cma->lock);
> > > + spin_unlock_irqrestore(&cma->lock, flags);
> > >   break;
> > >   }
> > >   bitmap_set(cma->bitmap, bitmap_no, bitmap_count);

Re: [External] Re: [PATCH v2 1/8] mm/cma: change cma mutex to irq safe spinlock

2021-03-30 Thread Muchun Song
On Tue, Mar 30, 2021 at 4:01 PM Michal Hocko  wrote:
>
> On Mon 29-03-21 16:23:55, Mike Kravetz wrote:
> > Ideally, cma_release could be called from any context.  However, that is
> > not possible because a mutex is used to protect the per-area bitmap.
> > Change that mutex to an irq safe spinlock.
>
> I would phrase the changelog slightly differently:
> "
> cma_release is currently a sleepable operation because the bitmap
> manipulation is protected by the cma->lock mutex. Hugetlb code, which relies
> on cma_release for CMA backed (giga) hugetlb pages, however, needs it to be
> irq safe.
>
> The lock doesn't protect any sleepable operation, so it can be changed to
> an (irq aware) spin lock. The bitmap processing should be quite fast in
> the typical case, but if cma sizes grow to TB then we will likely need to
> replace the lock with a more optimized bitmap implementation.
> "
>
> it seems that you are overusing the irqsave variants even from contexts
> that are never called from IRQ context, so they do not need to store flags.
>
> [...]
> > @@ -391,8 +391,9 @@ static void cma_debug_show_areas(struct cma *cma)
> >   unsigned long start = 0;
> >   unsigned long nr_part, nr_total = 0;
> >   unsigned long nbits = cma_bitmap_maxno(cma);
> > + unsigned long flags;
> >
> > - mutex_lock(&cma->lock);
> > + spin_lock_irqsave(&cma->lock, flags);
>
> spin_lock_irq should be sufficient. This is only called from the
> allocation context and that is never called from IRQ context.

This makes me think about it more. Shouldn't a plain spin_lock be
sufficient?


>
> >   pr_info("number of available pages: ");
> >   for (;;) {
> >   next_zero_bit = find_next_zero_bit(cma->bitmap, nbits, start);
> > @@ -407,7 +408,7 @@ static void cma_debug_show_areas(struct cma *cma)
> >   start = next_zero_bit + nr_zero;
> >   }
> >   pr_cont("=> %lu free of %lu total pages\n", nr_total, cma->count);
> > - mutex_unlock(&cma->lock);
> > + spin_unlock_irqrestore(&cma->lock, flags);
> >  }
> >  #else
> >  static inline void cma_debug_show_areas(struct cma *cma) { }
> > @@ -430,6 +431,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, 
> > unsigned int align,
> >   unsigned long pfn = -1;
> >   unsigned long start = 0;
> >   unsigned long bitmap_maxno, bitmap_no, bitmap_count;
> > + unsigned long flags;
> >   size_t i;
> >   struct page *page = NULL;
> >   int ret = -ENOMEM;
> > @@ -454,12 +456,12 @@ struct page *cma_alloc(struct cma *cma, size_t count, 
> > unsigned int align,
> >   goto out;
> >
> >   for (;;) {
> > - mutex_lock(&cma->lock);
> > + spin_lock_irqsave(&cma->lock, flags);
> >   bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap,
> >   bitmap_maxno, start, bitmap_count, mask,
> >   offset);
> >   if (bitmap_no >= bitmap_maxno) {
> > - mutex_unlock(&cma->lock);
> > + spin_unlock_irqrestore(&cma->lock, flags);
> >   break;
> >   }
> >   bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
>
> same here.
>
> > @@ -468,7 +470,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, 
> > unsigned int align,
> >* our exclusive use. If the migration fails we will take the
> >* lock again and unmark it.
> >*/
> > - mutex_unlock(&cma->lock);
> > + spin_unlock_irqrestore(&cma->lock, flags);
> >
> >   pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
> >   ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
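The pattern in the hunk above is worth spelling out: the bits are reserved
while holding the lock, the lock is dropped for the slow (and sleepable)
migration, and only a failure re-takes the lock to unmark. A paraphrased
sketch, using the plain _irq variant suggested earlier (GFP_KERNEL and
cma_clear_bitmap are stand-ins here, not quoted from the patch):

        spin_lock_irq(&cma->lock);
        bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
        /*
         * The bits are now reserved for this caller; nobody else will pick
         * them, so the lock can be dropped across the migration below.
         */
        spin_unlock_irq(&cma->lock);

        pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
        ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, GFP_KERNEL);
        if (ret)
                /* Migration failed: take the lock again and unmark the bits. */
                cma_clear_bitmap(cma, pfn, count);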
> > diff --git a/mm/cma.h b/mm/cma.h
> > index 68ffad4e430d..2c775877eae2 100644
> > --- a/mm/cma.h
> > +++ b/mm/cma.h
> > @@ -15,7 +15,7 @@ struct cma {
> >   unsigned long   count;
> >   unsigned long   *bitmap;
> >   unsigned int order_per_bit; /* Order of pages represented by one bit 
> > */
> > - struct mutexlock;
> > + spinlock_t  lock;
> >  #ifdef CONFIG_CMA_DEBUGFS
> >   struct hlist_head mem_head;
> >   spinlock_t mem_head_lock;
> > diff --git a/mm/cma_debug.c b/mm/cma_debug.c
> > index d5bf8aa34fdc..6379cfbfd568 100644
> > --- a/mm/cma_debug.c
> > +++ b/mm/cma_debug.c
> > @@ -35,11 +35,12 @@ static int cma_used_get(void *data, u64 *val)
> >  {
> >   struct cma *cma = data;
> >   unsigned long used;
> > + unsigned long flags;
> >
> > - mutex_lock(&cma->lock);
> > + spin_lock_irqsave(&cma->lock, flags);
> >   /* pages counter is smaller than sizeof(int) */
> >   used = bitmap_weight(cma->bitmap, (int)cma_bitmap_maxno(cma));
> > - mutex_unlock(&cma->lock);
> > + spin_unlock_irqrestore(&cma->lock, flags);
> >   *val = (u64)used << cma->order_per_bit;
>
> same here
>
> >
> >   return 0;
> > @@ -52,8 +53,9 @@ static int cma_maxchunk_get(void *data, u64 *val)
> >   unsigned long maxchunk = 0;
> >   unsigned long