Hi,

> +CC btrfs
> 
> On 4/1/21 12:51 PM, Wang Yugui wrote:
> > Hi,
> > 
> > an unexpected -ENOMEM from percpu_counter_init() happened when xfstest 
> > with kernel 5.11.10 and 5.10.27
> 
> Is there a dmesg log showing allocation failure or something?

When percpu_counter_init() returns the unexpected -ENOMEM, btrfs, as the
upper caller, eventually prints something to dmesg.

We also added one trace to the btrfs source to confirm that:
>     if (ret == -ENOMEM) printk("ENOMEM btrfs_drew_lock_init\n");


With the following change, the reproduction frequency dropped from >50% to
not happening at all, or only very rarely.

diff --git a/mm/percpu.c b/mm/percpu.c
index 6596a0a..0127be1 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -104,8 +104,8 @@
 /* chunks in slots below this are subject to being sidelined on failed alloc */
 #define PCPU_SLOT_FAIL_THRESHOLD       3
 
-#define PCPU_EMPTY_POP_PAGES_LOW       2
-#define PCPU_EMPTY_POP_PAGES_HIGH      4
+#define PCPU_EMPTY_POP_PAGES_LOW       8
+#define PCPU_EMPTY_POP_PAGES_HIGH      16
 
 #ifdef CONFIG_SMP
 /* default addr <-> pcpu_ptr mapping, override in asm/percpu.h if necessary */
diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 5e76af7..8cc091b 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -14,7 +14,7 @@
 
 /* enough to cover all DEFINE_PER_CPUs in modules */
 #ifdef CONFIG_MODULES
-#define PERCPU_MODULE_RESERVE          (8 << 10)
+#define PERCPU_MODULE_RESERVE          (32 << 10)
 #else
 #define PERCPU_MODULE_RESERVE          0
 #endif


Just some guesses:
1) Maybe it is related to 'vm.dirty_bytes=10737418240' being triggered.

This problem happens on
server/T7610 with E5-2660v2 *2 and SSD/SAS(6Gb/s) and 192G memory
but does not happen on
server/T620 with E5-2680v2 *2 and SSD/NVMe and 192G memory.

2) Maybe it is related to NUMA.
128G of memory is in node1 (CPU1), and 64G in node2 (CPU2).

Best Regards
Wang Yugui (wangyu...@e16-tech.com)
2021/04/07


> > direct caller:
> > int btrfs_drew_lock_init(struct btrfs_drew_lock *lock)
> > {
> >     int ret;
> > 
> >     ret = percpu_counter_init(&lock->writers, 0, GFP_KERNEL);
> >     if (ret)
> >         return ret;
> > 
> >     atomic_set(&lock->readers, 0);
> >     init_waitqueue_head(&lock->pending_readers);
> >     init_waitqueue_head(&lock->pending_writers);
> > 
> >     return 0;
> > }
> > 
> > upper caller:
> >     nofs_flag = memalloc_nofs_save();
> >     ret = btrfs_drew_lock_init(&root->snapshot_lock);
> >     memalloc_nofs_restore(nofs_flag);
> >     if (ret == -ENOMEM) printk("ENOMEM btrfs_drew_lock_init\n");
> >     if (ret)
> >         goto fail;
> > 
> > The hardware of this server:
> > CPU:  Xeon(R) CPU E5-2660 v2(10 core)  *2
> > memory:  192G, no swap
> > 
> > Only one xfstests job is running in this server, and about 7% of memory
> > is used.
> > 
> > Any advice please.
> > 
> > Best Regards
> > Wang Yugui (wangyu...@e16-tech.com)
> > 2021/04/01
> > 
> > 

