Re: [External] Re: [PATCH v2 1/8] mm/cma: change cma mutex to irq safe spinlock
On Tue, Mar 30, 2021 at 4:18 PM Michal Hocko wrote: > > On Tue 30-03-21 16:08:36, Muchun Song wrote: > > On Tue, Mar 30, 2021 at 4:01 PM Michal Hocko wrote: > > > > > > On Mon 29-03-21 16:23:55, Mike Kravetz wrote: > > > > Ideally, cma_release could be called from any context. However, that is > > > > not possible because a mutex is used to protect the per-area bitmap. > > > > Change the bitmap to an irq safe spinlock. > > > > > > I would phrase the changelog slightly differerent > > > " > > > cma_release is currently a sleepable operatation because the bitmap > > > manipulation is protected by cma->lock mutex. Hugetlb code which relies > > > on cma_release for CMA backed (giga) hugetlb pages, however, needs to be > > > irq safe. > > > > > > The lock doesn't protect any sleepable operation so it can be changed to > > > a (irq aware) spin lock. The bitmap processing should be quite fast in > > > typical case but if cma sizes grow to TB then we will likely need to > > > replace the lock by a more optimized bitmap implementation. > > > " > > > > > > it seems that you are overusing irqsave variants even from context which > > > are never called from the IRQ context so they do not need storing flags. > > > > > > [...] > > > > @@ -391,8 +391,9 @@ static void cma_debug_show_areas(struct cma *cma) > > > > unsigned long start = 0; > > > > unsigned long nr_part, nr_total = 0; > > > > unsigned long nbits = cma_bitmap_maxno(cma); > > > > + unsigned long flags; > > > > > > > > - mutex_lock(>lock); > > > > + spin_lock_irqsave(>lock, flags); > > > > > > spin_lock_irq should be sufficient. This is only called from the > > > allocation context and that is never called from IRQ context. > > > > This makes me think more. I think that spin_lock should be > > sufficient. Right? > > Nope. Think of the following scenario > spin_lock(cma->lock); > > put_page > __free_huge_page > cma_release > spin_lock_irqsave() DEADLOCK Got it. Thanks. > -- > Michal Hocko > SUSE Labs
Re: [External] Re: [PATCH v2 1/8] mm/cma: change cma mutex to irq safe spinlock
On Tue 30-03-21 16:08:36, Muchun Song wrote: > On Tue, Mar 30, 2021 at 4:01 PM Michal Hocko wrote: > > > > On Mon 29-03-21 16:23:55, Mike Kravetz wrote: > > > Ideally, cma_release could be called from any context. However, that is > > > not possible because a mutex is used to protect the per-area bitmap. > > > Change the bitmap to an irq safe spinlock. > > > > I would phrase the changelog slightly differerent > > " > > cma_release is currently a sleepable operatation because the bitmap > > manipulation is protected by cma->lock mutex. Hugetlb code which relies > > on cma_release for CMA backed (giga) hugetlb pages, however, needs to be > > irq safe. > > > > The lock doesn't protect any sleepable operation so it can be changed to > > a (irq aware) spin lock. The bitmap processing should be quite fast in > > typical case but if cma sizes grow to TB then we will likely need to > > replace the lock by a more optimized bitmap implementation. > > " > > > > it seems that you are overusing irqsave variants even from context which > > are never called from the IRQ context so they do not need storing flags. > > > > [...] > > > @@ -391,8 +391,9 @@ static void cma_debug_show_areas(struct cma *cma) > > > unsigned long start = 0; > > > unsigned long nr_part, nr_total = 0; > > > unsigned long nbits = cma_bitmap_maxno(cma); > > > + unsigned long flags; > > > > > > - mutex_lock(>lock); > > > + spin_lock_irqsave(>lock, flags); > > > > spin_lock_irq should be sufficient. This is only called from the > > allocation context and that is never called from IRQ context. > > This makes me think more. I think that spin_lock should be > sufficient. Right? Nope. Think of the following scenario spin_lock(cma->lock); put_page __free_huge_page cma_release spin_lock_irqsave() DEADLOCK -- Michal Hocko SUSE Labs
RE: [External] Re: [PATCH v2 1/8] mm/cma: change cma mutex to irq safe spinlock
> -Original Message- > From: Muchun Song [mailto:songmuc...@bytedance.com] > Sent: Tuesday, March 30, 2021 9:09 PM > To: Michal Hocko > Cc: Mike Kravetz ; Linux Memory Management List > ; LKML ; Roman Gushchin > ; Shakeel Butt ; Oscar Salvador > ; David Hildenbrand ; David Rientjes > ; linmiaohe ; Peter Zijlstra > ; Matthew Wilcox ; HORIGUCHI NAOYA > ; Aneesh Kumar K . V ; > Waiman Long ; Peter Xu ; Mina Almasry > ; Hillf Danton ; Joonsoo Kim > ; Song Bao Hua (Barry Song) > ; Will Deacon ; Andrew Morton > > Subject: Re: [External] Re: [PATCH v2 1/8] mm/cma: change cma mutex to irq > safe > spinlock > > On Tue, Mar 30, 2021 at 4:01 PM Michal Hocko wrote: > > > > On Mon 29-03-21 16:23:55, Mike Kravetz wrote: > > > Ideally, cma_release could be called from any context. However, > > > that is not possible because a mutex is used to protect the per-area > > > bitmap. > > > Change the bitmap to an irq safe spinlock. > > > > I would phrase the changelog slightly differerent " > > cma_release is currently a sleepable operatation because the bitmap > > manipulation is protected by cma->lock mutex. Hugetlb code which > > relies on cma_release for CMA backed (giga) hugetlb pages, however, > > needs to be irq safe. > > > > The lock doesn't protect any sleepable operation so it can be changed > > to a (irq aware) spin lock. The bitmap processing should be quite fast > > in typical case but if cma sizes grow to TB then we will likely need > > to replace the lock by a more optimized bitmap implementation. > > " > > > > it seems that you are overusing irqsave variants even from context > > which are never called from the IRQ context so they do not need storing > > flags. > > > > [...] > > > @@ -391,8 +391,9 @@ static void cma_debug_show_areas(struct cma *cma) > > > unsigned long start = 0; > > > unsigned long nr_part, nr_total = 0; > > > unsigned long nbits = cma_bitmap_maxno(cma); > > > + unsigned long flags; > > > > > > - mutex_lock(>lock); > > > + spin_lock_irqsave(>lock, flags); > > > > spin_lock_irq should be sufficient. This is only called from the > > allocation context and that is never called from IRQ context. > > This makes me think more. I think that spin_lock should be sufficient. Right? > It seems Mike's point is that cma_release might be called from both irq context and process context. If it is running in process context, we need the irq-disable to lock the irq context which might jump to call cma_release at the same time. We have never seen cma_release has been really called in irq context by now, anyway. > > > > > > pr_info("number of available pages: "); > > > for (;;) { > > > next_zero_bit = find_next_zero_bit(cma->bitmap, nbits, > > > start); @@ -407,7 +408,7 @@ static void cma_debug_show_areas(struct cma > *cma) > > > start = next_zero_bit + nr_zero; > > > } > > > pr_cont("=> %lu free of %lu total pages\n", nr_total, cma->count); > > > - mutex_unlock(>lock); > > > + spin_unlock_irqrestore(>lock, flags); > > > } > > > #else > > > static inline void cma_debug_show_areas(struct cma *cma) { } @@ > > > -430,6 +431,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, > unsigned int align, > > > unsigned long pfn = -1; > > > unsigned long start = 0; > > > unsigned long bitmap_maxno, bitmap_no, bitmap_count; > > > + unsigned long flags; > > > size_t i; > > > struct page *page = NULL; > > > int ret = -ENOMEM; > > > @@ -454,12 +456,12 @@ struct page *cma_alloc(struct cma *cma, size_t > > > count, > unsigned int align, > > > goto out; > > > > > > for (;;) { > > > - mutex_lock(>lock); > > > + spin_lock_irqsave(>lock, flags); > > > bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap, > > > bitmap_maxno, start, bitmap_count, mask, > > > offset); > > > if (bitmap_no >= bitmap_maxno) { > > > - mutex_unlock(>lock); > > > + spin_unlock_irqrestore(>lock, flags); > > > break; > > > } > > > bitmap_set(cma->bitma
Re: [External] Re: [PATCH v2 1/8] mm/cma: change cma mutex to irq safe spinlock
On Tue, Mar 30, 2021 at 4:01 PM Michal Hocko wrote: > > On Mon 29-03-21 16:23:55, Mike Kravetz wrote: > > Ideally, cma_release could be called from any context. However, that is > > not possible because a mutex is used to protect the per-area bitmap. > > Change the bitmap to an irq safe spinlock. > > I would phrase the changelog slightly differerent > " > cma_release is currently a sleepable operatation because the bitmap > manipulation is protected by cma->lock mutex. Hugetlb code which relies > on cma_release for CMA backed (giga) hugetlb pages, however, needs to be > irq safe. > > The lock doesn't protect any sleepable operation so it can be changed to > a (irq aware) spin lock. The bitmap processing should be quite fast in > typical case but if cma sizes grow to TB then we will likely need to > replace the lock by a more optimized bitmap implementation. > " > > it seems that you are overusing irqsave variants even from context which > are never called from the IRQ context so they do not need storing flags. > > [...] > > @@ -391,8 +391,9 @@ static void cma_debug_show_areas(struct cma *cma) > > unsigned long start = 0; > > unsigned long nr_part, nr_total = 0; > > unsigned long nbits = cma_bitmap_maxno(cma); > > + unsigned long flags; > > > > - mutex_lock(>lock); > > + spin_lock_irqsave(>lock, flags); > > spin_lock_irq should be sufficient. This is only called from the > allocation context and that is never called from IRQ context. This makes me think more. I think that spin_lock should be sufficient. Right? > > > pr_info("number of available pages: "); > > for (;;) { > > next_zero_bit = find_next_zero_bit(cma->bitmap, nbits, start); > > @@ -407,7 +408,7 @@ static void cma_debug_show_areas(struct cma *cma) > > start = next_zero_bit + nr_zero; > > } > > pr_cont("=> %lu free of %lu total pages\n", nr_total, cma->count); > > - mutex_unlock(>lock); > > + spin_unlock_irqrestore(>lock, flags); > > } > > #else > > static inline void cma_debug_show_areas(struct cma *cma) { } > > @@ -430,6 +431,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, > > unsigned int align, > > unsigned long pfn = -1; > > unsigned long start = 0; > > unsigned long bitmap_maxno, bitmap_no, bitmap_count; > > + unsigned long flags; > > size_t i; > > struct page *page = NULL; > > int ret = -ENOMEM; > > @@ -454,12 +456,12 @@ struct page *cma_alloc(struct cma *cma, size_t count, > > unsigned int align, > > goto out; > > > > for (;;) { > > - mutex_lock(>lock); > > + spin_lock_irqsave(>lock, flags); > > bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap, > > bitmap_maxno, start, bitmap_count, mask, > > offset); > > if (bitmap_no >= bitmap_maxno) { > > - mutex_unlock(>lock); > > + spin_unlock_irqrestore(>lock, flags); > > break; > > } > > bitmap_set(cma->bitmap, bitmap_no, bitmap_count); > > same here. > > > @@ -468,7 +470,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, > > unsigned int align, > >* our exclusive use. If the migration fails we will take the > >* lock again and unmark it. > >*/ > > - mutex_unlock(>lock); > > + spin_unlock_irqrestore(>lock, flags); > > > > pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit); > > ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, > > diff --git a/mm/cma.h b/mm/cma.h > > index 68ffad4e430d..2c775877eae2 100644 > > --- a/mm/cma.h > > +++ b/mm/cma.h > > @@ -15,7 +15,7 @@ struct cma { > > unsigned long count; > > unsigned long *bitmap; > > unsigned int order_per_bit; /* Order of pages represented by one bit > > */ > > - struct mutexlock; > > + spinlock_t lock; > > #ifdef CONFIG_CMA_DEBUGFS > > struct hlist_head mem_head; > > spinlock_t mem_head_lock; > > diff --git a/mm/cma_debug.c b/mm/cma_debug.c > > index d5bf8aa34fdc..6379cfbfd568 100644 > > --- a/mm/cma_debug.c > > +++ b/mm/cma_debug.c > > @@ -35,11 +35,12 @@ static int cma_used_get(void *data, u64 *val) > > { > > struct cma *cma = data; > > unsigned long used; > > + unsigned long flags; > > > > - mutex_lock(>lock); > > + spin_lock_irqsave(>lock, flags); > > /* pages counter is smaller than sizeof(int) */ > > used = bitmap_weight(cma->bitmap, (int)cma_bitmap_maxno(cma)); > > - mutex_unlock(>lock); > > + spin_unlock_irqrestore(>lock, flags); > > *val = (u64)used << cma->order_per_bit; > > same here > > > > > return 0; > > @@ -52,8 +53,9 @@ static int cma_maxchunk_get(void *data, u64 *val) > > unsigned long maxchunk = 0; > > unsigned long