Re: [PATCH v4] debugobjects: scale the static pool size
On Mon, Nov 26, 2018 at 04:45:54AM -0500, Qian Cai wrote: > On 11/25/18 11:52 PM, Qian Cai wrote: > > BTW, calling debug_objects_mem_init() before kmemleak_init() actually > > could trigger a loop on machines with 160+ CPUs until the pool is filled > > up, > > > > debug_objects_pool_min_level += num_possible_cpus() * 4; > > > > [1] while (obj_pool_free < debug_objects_pool_min_level) > > > > kmemleak_init > > kmemleak_ignore (from replaced static debug objects) > > make_black_object > > put_object > > __call_rcu (kernel/rcu/tree.c) > > debug_rcu_head_queue > > debug_object_activate > > debug_object_init > > fill_pool > > kmemleak_ignore (looping in [1]) > > make_black_object > > ... > > > > I think until this is resolved, there is no way to move > > debug_objects_mem_init() before kmemleak_init(). > > I believe this is a separate issue that kmemleak is broken with > CONFIG_DEBUG_OBJECTS_RCU_HEAD anyway where the infinite loop above could be > triggered in the existing code as well, i.e., once the pool need be refilled > (fill_pool()) after the system boot up, debug object creation will call > kmemleak_ignore() and it will create a new rcu debug_object_init(), and then > it will call fill_pool() again and again. As the results, the system is > locking up during kernel compilations. > > Hence, I'll send out a patch for debug objects with large CPUs anyway and > deal with kmemleak + CONFIG_DEBUG_OBJECTS_RCU_HEAD issue later. I haven't hit this before but I can see why it happens. Kmemleak uses RCU for freeing its own data structures to avoid a recursive call to sl*b (kmem_cache_free()). Since we already tell kmemleak to ignore the debug_objects_cache allocations, I think we could as well add SLAB_NOLEAKTRACE to kmem_cache_create() (I haven't tried yet, this was not a documented API for kmemleak, rather used to avoid kmemleak tracking itself). -- Catalin
Re: [PATCH v4] debugobjects: scale the static pool size
On Mon, Nov 26, 2018 at 04:45:54AM -0500, Qian Cai wrote: > On 11/25/18 11:52 PM, Qian Cai wrote: > > BTW, calling debug_objects_mem_init() before kmemleak_init() actually > > could trigger a loop on machines with 160+ CPUs until the pool is filled > > up, > > > > debug_objects_pool_min_level += num_possible_cpus() * 4; > > > > [1] while (obj_pool_free < debug_objects_pool_min_level) > > > > kmemleak_init > > kmemleak_ignore (from replaced static debug objects) > > make_black_object > > put_object > > __call_rcu (kernel/rcu/tree.c) > > debug_rcu_head_queue > > debug_object_activate > > debug_object_init > > fill_pool > > kmemleak_ignore (looping in [1]) > > make_black_object > > ... > > > > I think until this is resolved, there is no way to move > > debug_objects_mem_init() before kmemleak_init(). > > I believe this is a separate issue that kmemleak is broken with > CONFIG_DEBUG_OBJECTS_RCU_HEAD anyway where the infinite loop above could be > triggered in the existing code as well, i.e., once the pool need be refilled > (fill_pool()) after the system boot up, debug object creation will call > kmemleak_ignore() and it will create a new rcu debug_object_init(), and then > it will call fill_pool() again and again. As the results, the system is > locking up during kernel compilations. > > Hence, I'll send out a patch for debug objects with large CPUs anyway and > deal with kmemleak + CONFIG_DEBUG_OBJECTS_RCU_HEAD issue later. I haven't hit this before but I can see why it happens. Kmemleak uses RCU for freeing its own data structures to avoid a recursive call to sl*b (kmem_cache_free()). Since we already tell kmemleak to ignore the debug_objects_cache allocations, I think we could as well add SLAB_NOLEAKTRACE to kmem_cache_create() (I haven't tried yet, this was not a documented API for kmemleak, rather used to avoid kmemleak tracking itself). -- Catalin
Re: [PATCH v4] debugobjects: scale the static pool size
On 11/25/18 11:52 PM, Qian Cai wrote: On 11/25/18 8:31 PM, Waiman Long wrote: On 11/25/2018 03:42 PM, Qian Cai wrote: On 11/23/18 10:01 PM, Qian Cai wrote: On Nov 22, 2018, at 4:56 PM, Thomas Gleixner wrote: On Tue, 20 Nov 2018, Qian Cai wrote: Looking deeper at that. diff --git a/lib/debugobjects.c b/lib/debugobjects.c index 70935ed91125..140571aa483c 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -23,9 +23,81 @@ #define ODEBUG_HASH_BITS 14 #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) -#define ODEBUG_POOL_SIZE 1024 +#define ODEBUG_DEFAULT_POOL 512 #define ODEBUG_POOL_MIN_LEVEL 256 +/* + * Some debug objects are allocated during the early boot. Enabling some options + * like timers or workqueue objects may increase the size required significantly + * with large number of CPUs. For example (as today, 20 Nov. 2018), + * + * No. CPUs x 2 (worker pool) objects: + * + * start_kernel + * workqueue_init_early + * init_worker_pool + * init_timer_key + * debug_object_init + * + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): + * + * sched_init + * hrtick_rq_init + * hrtimer_init + * + * CONFIG_DEBUG_OBJECTS_WORK: + * No. CPUs x 6 (workqueue) objects: + * + * workqueue_init_early + * alloc_workqueue + * __alloc_workqueue_key + * alloc_and_link_pwqs + * init_pwq + * + * Also, plus No. CPUs objects: + * + * perf_event_init + * __init_srcu_struct + * init_srcu_struct_fields + * init_srcu_struct_nodes + * __init_work None of the things are actually used or required _BEFORE_ debug_objects_mem_init() is invoked. The reason why the call is at this place in start_kernel() is historical. It's because back in the days when debugobjects were added the memory allocator was enabled way later than today. So we can just move the debug_objects_mem_init() call right before sched_init() I think. Well, now that kmemleak_init() seems complains that debug_objects_mem_init() is called before it. [ 0.078805] kmemleak: Cannot insert 0xc00dff93 into the object search tree (overlaps existing) [ 0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3 [ 0.078883] Call Trace: [ 0.078904] [c1c8fcd0] [c0c96b34] dump_stack+0xe8/0x164 (unreliable) [ 0.078935] [c1c8fd20] [c0486e84] create_object+0x344/0x380 [ 0.078962] [c1c8fde0] [c0489544] early_alloc+0x108/0x1f8 [ 0.078989] [c1c8fe20] [c109738c] kmemleak_init+0x1d8/0x3d4 [ 0.079016] [c1c8ff00] [c1054028] start_kernel+0x5c0/0x6f8 [ 0.079043] [c1c8ff90] [c000ae7c] start_here_common+0x1c/0x520 [ 0.079070] kmemleak: Kernel memory leak detector disabled [ 0.079091] kmemleak: Object 0xc00ffd587b68 (size 40): [ 0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299 [ 0.079135] kmemleak: min_count = -1 [ 0.079153] kmemleak: count = 0 [ 0.079170] kmemleak: flags = 0x5 [ 0.079188] kmemleak: checksum = 0 [ 0.079206] kmemleak: backtrace: [ 0.079227] __debug_object_init+0x688/0x700 [ 0.079250] debug_object_activate+0x1e0/0x350 [ 0.079272] __call_rcu+0x60/0x430 [ 0.079292] put_object+0x60/0x80 [ 0.079311] kmemleak_init+0x2cc/0x3d4 [ 0.079331] start_kernel+0x5c0/0x6f8 [ 0.079351] start_here_common+0x1c/0x520 [ 0.079380] kmemleak: Early log backtrace: [ 0.079399] memblock_alloc_try_nid_raw+0x90/0xcc [ 0.079421] sparse_init_nid+0x144/0x51c [ 0.079440] sparse_init+0x1a0/0x238 [ 0.079459] initmem_init+0x1d8/0x25c [ 0.079498] setup_arch+0x3e0/0x464 [ 0.079517] start_kernel+0xa4/0x6f8 [ 0.079536] start_here_common+0x1c/0x520 So this is an chicken-egg problem. Debug objects need kmemleak_init() first, so it can make use of kmemleak_ignore() for all debug objects in order to avoid the overlapping like the above. while (obj_pool_free < debug_objects_pool_min_level) { new = kmem_cache_zalloc(obj_cache, gfp); if (!new) return; kmemleak_ignore(new); However, there seems no way to move kmemleak_init() together this early in start_kernel() just before vmalloc_init() [1] because it looks like it depends on things like workqueue (schedule_work(_work)) and rcu. Hence, it needs to be after workqueue_init_early() and rcu_init() Given that, maybe the best outcome is to stick to the alternative approach that works [1] rather messing up with the order of debug_objects_mem_init() in start_kernel() which seems tricky. What do you think? [1] https://goo.gl/18N78g [2] https://goo.gl/My6ig6 Could you move kmemleak_init() and debug_objects_mem_init() as far up as possible, like before the hrtimer_init() to at least make static count calculation as simple as possible? Well, there is only 2 x NR_CPUS difference after moved both calls just after rcu_init(). Before After 64-CPU:
Re: [PATCH v4] debugobjects: scale the static pool size
On 11/25/18 11:52 PM, Qian Cai wrote: On 11/25/18 8:31 PM, Waiman Long wrote: On 11/25/2018 03:42 PM, Qian Cai wrote: On 11/23/18 10:01 PM, Qian Cai wrote: On Nov 22, 2018, at 4:56 PM, Thomas Gleixner wrote: On Tue, 20 Nov 2018, Qian Cai wrote: Looking deeper at that. diff --git a/lib/debugobjects.c b/lib/debugobjects.c index 70935ed91125..140571aa483c 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -23,9 +23,81 @@ #define ODEBUG_HASH_BITS 14 #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) -#define ODEBUG_POOL_SIZE 1024 +#define ODEBUG_DEFAULT_POOL 512 #define ODEBUG_POOL_MIN_LEVEL 256 +/* + * Some debug objects are allocated during the early boot. Enabling some options + * like timers or workqueue objects may increase the size required significantly + * with large number of CPUs. For example (as today, 20 Nov. 2018), + * + * No. CPUs x 2 (worker pool) objects: + * + * start_kernel + * workqueue_init_early + * init_worker_pool + * init_timer_key + * debug_object_init + * + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): + * + * sched_init + * hrtick_rq_init + * hrtimer_init + * + * CONFIG_DEBUG_OBJECTS_WORK: + * No. CPUs x 6 (workqueue) objects: + * + * workqueue_init_early + * alloc_workqueue + * __alloc_workqueue_key + * alloc_and_link_pwqs + * init_pwq + * + * Also, plus No. CPUs objects: + * + * perf_event_init + * __init_srcu_struct + * init_srcu_struct_fields + * init_srcu_struct_nodes + * __init_work None of the things are actually used or required _BEFORE_ debug_objects_mem_init() is invoked. The reason why the call is at this place in start_kernel() is historical. It's because back in the days when debugobjects were added the memory allocator was enabled way later than today. So we can just move the debug_objects_mem_init() call right before sched_init() I think. Well, now that kmemleak_init() seems complains that debug_objects_mem_init() is called before it. [ 0.078805] kmemleak: Cannot insert 0xc00dff93 into the object search tree (overlaps existing) [ 0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3 [ 0.078883] Call Trace: [ 0.078904] [c1c8fcd0] [c0c96b34] dump_stack+0xe8/0x164 (unreliable) [ 0.078935] [c1c8fd20] [c0486e84] create_object+0x344/0x380 [ 0.078962] [c1c8fde0] [c0489544] early_alloc+0x108/0x1f8 [ 0.078989] [c1c8fe20] [c109738c] kmemleak_init+0x1d8/0x3d4 [ 0.079016] [c1c8ff00] [c1054028] start_kernel+0x5c0/0x6f8 [ 0.079043] [c1c8ff90] [c000ae7c] start_here_common+0x1c/0x520 [ 0.079070] kmemleak: Kernel memory leak detector disabled [ 0.079091] kmemleak: Object 0xc00ffd587b68 (size 40): [ 0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299 [ 0.079135] kmemleak: min_count = -1 [ 0.079153] kmemleak: count = 0 [ 0.079170] kmemleak: flags = 0x5 [ 0.079188] kmemleak: checksum = 0 [ 0.079206] kmemleak: backtrace: [ 0.079227] __debug_object_init+0x688/0x700 [ 0.079250] debug_object_activate+0x1e0/0x350 [ 0.079272] __call_rcu+0x60/0x430 [ 0.079292] put_object+0x60/0x80 [ 0.079311] kmemleak_init+0x2cc/0x3d4 [ 0.079331] start_kernel+0x5c0/0x6f8 [ 0.079351] start_here_common+0x1c/0x520 [ 0.079380] kmemleak: Early log backtrace: [ 0.079399] memblock_alloc_try_nid_raw+0x90/0xcc [ 0.079421] sparse_init_nid+0x144/0x51c [ 0.079440] sparse_init+0x1a0/0x238 [ 0.079459] initmem_init+0x1d8/0x25c [ 0.079498] setup_arch+0x3e0/0x464 [ 0.079517] start_kernel+0xa4/0x6f8 [ 0.079536] start_here_common+0x1c/0x520 So this is an chicken-egg problem. Debug objects need kmemleak_init() first, so it can make use of kmemleak_ignore() for all debug objects in order to avoid the overlapping like the above. while (obj_pool_free < debug_objects_pool_min_level) { new = kmem_cache_zalloc(obj_cache, gfp); if (!new) return; kmemleak_ignore(new); However, there seems no way to move kmemleak_init() together this early in start_kernel() just before vmalloc_init() [1] because it looks like it depends on things like workqueue (schedule_work(_work)) and rcu. Hence, it needs to be after workqueue_init_early() and rcu_init() Given that, maybe the best outcome is to stick to the alternative approach that works [1] rather messing up with the order of debug_objects_mem_init() in start_kernel() which seems tricky. What do you think? [1] https://goo.gl/18N78g [2] https://goo.gl/My6ig6 Could you move kmemleak_init() and debug_objects_mem_init() as far up as possible, like before the hrtimer_init() to at least make static count calculation as simple as possible? Well, there is only 2 x NR_CPUS difference after moved both calls just after rcu_init(). Before After 64-CPU:
Re: [PATCH v4] debugobjects: scale the static pool size
On 11/25/18 8:31 PM, Waiman Long wrote: On 11/25/2018 03:42 PM, Qian Cai wrote: On 11/23/18 10:01 PM, Qian Cai wrote: On Nov 22, 2018, at 4:56 PM, Thomas Gleixner wrote: On Tue, 20 Nov 2018, Qian Cai wrote: Looking deeper at that. diff --git a/lib/debugobjects.c b/lib/debugobjects.c index 70935ed91125..140571aa483c 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -23,9 +23,81 @@ #define ODEBUG_HASH_BITS 14 #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) -#define ODEBUG_POOL_SIZE 1024 +#define ODEBUG_DEFAULT_POOL 512 #define ODEBUG_POOL_MIN_LEVEL 256 +/* + * Some debug objects are allocated during the early boot. Enabling some options + * like timers or workqueue objects may increase the size required significantly + * with large number of CPUs. For example (as today, 20 Nov. 2018), + * + * No. CPUs x 2 (worker pool) objects: + * + * start_kernel + * workqueue_init_early + * init_worker_pool + * init_timer_key + * debug_object_init + * + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): + * + * sched_init + * hrtick_rq_init + * hrtimer_init + * + * CONFIG_DEBUG_OBJECTS_WORK: + * No. CPUs x 6 (workqueue) objects: + * + * workqueue_init_early + * alloc_workqueue + * __alloc_workqueue_key + * alloc_and_link_pwqs + * init_pwq + * + * Also, plus No. CPUs objects: + * + * perf_event_init + * __init_srcu_struct + * init_srcu_struct_fields + * init_srcu_struct_nodes + * __init_work None of the things are actually used or required _BEFORE_ debug_objects_mem_init() is invoked. The reason why the call is at this place in start_kernel() is historical. It's because back in the days when debugobjects were added the memory allocator was enabled way later than today. So we can just move the debug_objects_mem_init() call right before sched_init() I think. Well, now that kmemleak_init() seems complains that debug_objects_mem_init() is called before it. [ 0.078805] kmemleak: Cannot insert 0xc00dff93 into the object search tree (overlaps existing) [ 0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3 [ 0.078883] Call Trace: [ 0.078904] [c1c8fcd0] [c0c96b34] dump_stack+0xe8/0x164 (unreliable) [ 0.078935] [c1c8fd20] [c0486e84] create_object+0x344/0x380 [ 0.078962] [c1c8fde0] [c0489544] early_alloc+0x108/0x1f8 [ 0.078989] [c1c8fe20] [c109738c] kmemleak_init+0x1d8/0x3d4 [ 0.079016] [c1c8ff00] [c1054028] start_kernel+0x5c0/0x6f8 [ 0.079043] [c1c8ff90] [c000ae7c] start_here_common+0x1c/0x520 [ 0.079070] kmemleak: Kernel memory leak detector disabled [ 0.079091] kmemleak: Object 0xc00ffd587b68 (size 40): [ 0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299 [ 0.079135] kmemleak: min_count = -1 [ 0.079153] kmemleak: count = 0 [ 0.079170] kmemleak: flags = 0x5 [ 0.079188] kmemleak: checksum = 0 [ 0.079206] kmemleak: backtrace: [ 0.079227] __debug_object_init+0x688/0x700 [ 0.079250] debug_object_activate+0x1e0/0x350 [ 0.079272] __call_rcu+0x60/0x430 [ 0.079292] put_object+0x60/0x80 [ 0.079311] kmemleak_init+0x2cc/0x3d4 [ 0.079331] start_kernel+0x5c0/0x6f8 [ 0.079351] start_here_common+0x1c/0x520 [ 0.079380] kmemleak: Early log backtrace: [ 0.079399] memblock_alloc_try_nid_raw+0x90/0xcc [ 0.079421] sparse_init_nid+0x144/0x51c [ 0.079440] sparse_init+0x1a0/0x238 [ 0.079459] initmem_init+0x1d8/0x25c [ 0.079498] setup_arch+0x3e0/0x464 [ 0.079517] start_kernel+0xa4/0x6f8 [ 0.079536] start_here_common+0x1c/0x520 So this is an chicken-egg problem. Debug objects need kmemleak_init() first, so it can make use of kmemleak_ignore() for all debug objects in order to avoid the overlapping like the above. while (obj_pool_free < debug_objects_pool_min_level) { new = kmem_cache_zalloc(obj_cache, gfp); if (!new) return; kmemleak_ignore(new); However, there seems no way to move kmemleak_init() together this early in start_kernel() just before vmalloc_init() [1] because it looks like it depends on things like workqueue (schedule_work(_work)) and rcu. Hence, it needs to be after workqueue_init_early() and rcu_init() Given that, maybe the best outcome is to stick to the alternative approach that works [1] rather messing up with the order of debug_objects_mem_init() in start_kernel() which seems tricky. What do you think? [1] https://goo.gl/18N78g [2] https://goo.gl/My6ig6 Could you move kmemleak_init() and debug_objects_mem_init() as far up as possible, like before the hrtimer_init() to at least make static count calculation as simple as possible? Well, there is only 2 x NR_CPUS difference after moved both calls just after rcu_init(). Before After 64-CPU: 1114 974 160-CPU: 2774 2429 256-CPU:
Re: [PATCH v4] debugobjects: scale the static pool size
On 11/25/18 8:31 PM, Waiman Long wrote: On 11/25/2018 03:42 PM, Qian Cai wrote: On 11/23/18 10:01 PM, Qian Cai wrote: On Nov 22, 2018, at 4:56 PM, Thomas Gleixner wrote: On Tue, 20 Nov 2018, Qian Cai wrote: Looking deeper at that. diff --git a/lib/debugobjects.c b/lib/debugobjects.c index 70935ed91125..140571aa483c 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -23,9 +23,81 @@ #define ODEBUG_HASH_BITS 14 #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) -#define ODEBUG_POOL_SIZE 1024 +#define ODEBUG_DEFAULT_POOL 512 #define ODEBUG_POOL_MIN_LEVEL 256 +/* + * Some debug objects are allocated during the early boot. Enabling some options + * like timers or workqueue objects may increase the size required significantly + * with large number of CPUs. For example (as today, 20 Nov. 2018), + * + * No. CPUs x 2 (worker pool) objects: + * + * start_kernel + * workqueue_init_early + * init_worker_pool + * init_timer_key + * debug_object_init + * + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): + * + * sched_init + * hrtick_rq_init + * hrtimer_init + * + * CONFIG_DEBUG_OBJECTS_WORK: + * No. CPUs x 6 (workqueue) objects: + * + * workqueue_init_early + * alloc_workqueue + * __alloc_workqueue_key + * alloc_and_link_pwqs + * init_pwq + * + * Also, plus No. CPUs objects: + * + * perf_event_init + * __init_srcu_struct + * init_srcu_struct_fields + * init_srcu_struct_nodes + * __init_work None of the things are actually used or required _BEFORE_ debug_objects_mem_init() is invoked. The reason why the call is at this place in start_kernel() is historical. It's because back in the days when debugobjects were added the memory allocator was enabled way later than today. So we can just move the debug_objects_mem_init() call right before sched_init() I think. Well, now that kmemleak_init() seems complains that debug_objects_mem_init() is called before it. [ 0.078805] kmemleak: Cannot insert 0xc00dff93 into the object search tree (overlaps existing) [ 0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3 [ 0.078883] Call Trace: [ 0.078904] [c1c8fcd0] [c0c96b34] dump_stack+0xe8/0x164 (unreliable) [ 0.078935] [c1c8fd20] [c0486e84] create_object+0x344/0x380 [ 0.078962] [c1c8fde0] [c0489544] early_alloc+0x108/0x1f8 [ 0.078989] [c1c8fe20] [c109738c] kmemleak_init+0x1d8/0x3d4 [ 0.079016] [c1c8ff00] [c1054028] start_kernel+0x5c0/0x6f8 [ 0.079043] [c1c8ff90] [c000ae7c] start_here_common+0x1c/0x520 [ 0.079070] kmemleak: Kernel memory leak detector disabled [ 0.079091] kmemleak: Object 0xc00ffd587b68 (size 40): [ 0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299 [ 0.079135] kmemleak: min_count = -1 [ 0.079153] kmemleak: count = 0 [ 0.079170] kmemleak: flags = 0x5 [ 0.079188] kmemleak: checksum = 0 [ 0.079206] kmemleak: backtrace: [ 0.079227] __debug_object_init+0x688/0x700 [ 0.079250] debug_object_activate+0x1e0/0x350 [ 0.079272] __call_rcu+0x60/0x430 [ 0.079292] put_object+0x60/0x80 [ 0.079311] kmemleak_init+0x2cc/0x3d4 [ 0.079331] start_kernel+0x5c0/0x6f8 [ 0.079351] start_here_common+0x1c/0x520 [ 0.079380] kmemleak: Early log backtrace: [ 0.079399] memblock_alloc_try_nid_raw+0x90/0xcc [ 0.079421] sparse_init_nid+0x144/0x51c [ 0.079440] sparse_init+0x1a0/0x238 [ 0.079459] initmem_init+0x1d8/0x25c [ 0.079498] setup_arch+0x3e0/0x464 [ 0.079517] start_kernel+0xa4/0x6f8 [ 0.079536] start_here_common+0x1c/0x520 So this is an chicken-egg problem. Debug objects need kmemleak_init() first, so it can make use of kmemleak_ignore() for all debug objects in order to avoid the overlapping like the above. while (obj_pool_free < debug_objects_pool_min_level) { new = kmem_cache_zalloc(obj_cache, gfp); if (!new) return; kmemleak_ignore(new); However, there seems no way to move kmemleak_init() together this early in start_kernel() just before vmalloc_init() [1] because it looks like it depends on things like workqueue (schedule_work(_work)) and rcu. Hence, it needs to be after workqueue_init_early() and rcu_init() Given that, maybe the best outcome is to stick to the alternative approach that works [1] rather messing up with the order of debug_objects_mem_init() in start_kernel() which seems tricky. What do you think? [1] https://goo.gl/18N78g [2] https://goo.gl/My6ig6 Could you move kmemleak_init() and debug_objects_mem_init() as far up as possible, like before the hrtimer_init() to at least make static count calculation as simple as possible? Well, there is only 2 x NR_CPUS difference after moved both calls just after rcu_init(). Before After 64-CPU: 1114 974 160-CPU: 2774 2429 256-CPU:
Re: [PATCH v4] debugobjects: scale the static pool size
On 11/25/2018 03:42 PM, Qian Cai wrote: > > > On 11/23/18 10:01 PM, Qian Cai wrote: >> >> >>> On Nov 22, 2018, at 4:56 PM, Thomas Gleixner >>> wrote: >>> >>> On Tue, 20 Nov 2018, Qian Cai wrote: >>> >>> Looking deeper at that. >>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c index 70935ed91125..140571aa483c 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -23,9 +23,81 @@ #define ODEBUG_HASH_BITS 14 #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) -#define ODEBUG_POOL_SIZE 1024 +#define ODEBUG_DEFAULT_POOL 512 #define ODEBUG_POOL_MIN_LEVEL 256 +/* + * Some debug objects are allocated during the early boot. Enabling some options + * like timers or workqueue objects may increase the size required significantly + * with large number of CPUs. For example (as today, 20 Nov. 2018), + * + * No. CPUs x 2 (worker pool) objects: + * + * start_kernel + * workqueue_init_early + * init_worker_pool + * init_timer_key + * debug_object_init + * + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): + * + * sched_init + * hrtick_rq_init + * hrtimer_init + * + * CONFIG_DEBUG_OBJECTS_WORK: + * No. CPUs x 6 (workqueue) objects: + * + * workqueue_init_early + * alloc_workqueue + * __alloc_workqueue_key + * alloc_and_link_pwqs + * init_pwq + * + * Also, plus No. CPUs objects: + * + * perf_event_init + * __init_srcu_struct + * init_srcu_struct_fields + * init_srcu_struct_nodes + * __init_work >>> >>> None of the things are actually used or required _BEFORE_ >>> debug_objects_mem_init() is invoked. >>> >>> The reason why the call is at this place in start_kernel() is >>> historical. It's because back in the days when debugobjects were >>> added the >>> memory allocator was enabled way later than today. So we can just >>> move the >>> debug_objects_mem_init() call right before sched_init() I think. >> >> Well, now that kmemleak_init() seems complains that >> debug_objects_mem_init() >> is called before it. >> >> [ 0.078805] kmemleak: Cannot insert 0xc00dff93 into the >> object search tree (overlaps existing) >> [ 0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3 >> [ 0.078883] Call Trace: >> [ 0.078904] [c1c8fcd0] [c0c96b34] >> dump_stack+0xe8/0x164 (unreliable) >> [ 0.078935] [c1c8fd20] [c0486e84] >> create_object+0x344/0x380 >> [ 0.078962] [c1c8fde0] [c0489544] >> early_alloc+0x108/0x1f8 >> [ 0.078989] [c1c8fe20] [c109738c] >> kmemleak_init+0x1d8/0x3d4 >> [ 0.079016] [c1c8ff00] [c1054028] >> start_kernel+0x5c0/0x6f8 >> [ 0.079043] [c1c8ff90] [c000ae7c] >> start_here_common+0x1c/0x520 >> [ 0.079070] kmemleak: Kernel memory leak detector disabled >> [ 0.079091] kmemleak: Object 0xc00ffd587b68 (size 40): >> [ 0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299 >> [ 0.079135] kmemleak: min_count = -1 >> [ 0.079153] kmemleak: count = 0 >> [ 0.079170] kmemleak: flags = 0x5 >> [ 0.079188] kmemleak: checksum = 0 >> [ 0.079206] kmemleak: backtrace: >> [ 0.079227] __debug_object_init+0x688/0x700 >> [ 0.079250] debug_object_activate+0x1e0/0x350 >> [ 0.079272] __call_rcu+0x60/0x430 >> [ 0.079292] put_object+0x60/0x80 >> [ 0.079311] kmemleak_init+0x2cc/0x3d4 >> [ 0.079331] start_kernel+0x5c0/0x6f8 >> [ 0.079351] start_here_common+0x1c/0x520 >> [ 0.079380] kmemleak: Early log backtrace: >> [ 0.079399] memblock_alloc_try_nid_raw+0x90/0xcc >> [ 0.079421] sparse_init_nid+0x144/0x51c >> [ 0.079440] sparse_init+0x1a0/0x238 >> [ 0.079459] initmem_init+0x1d8/0x25c >> [ 0.079498] setup_arch+0x3e0/0x464 >> [ 0.079517] start_kernel+0xa4/0x6f8 >> [ 0.079536] start_here_common+0x1c/0x520 >> > > So this is an chicken-egg problem. Debug objects need kmemleak_init() > first, so it can make use of kmemleak_ignore() for all debug objects > in order to avoid the overlapping like the above. > > while (obj_pool_free < debug_objects_pool_min_level) { > > new = kmem_cache_zalloc(obj_cache, gfp); > if (!new) > return; > > kmemleak_ignore(new); > > However, there seems no way to move kmemleak_init() together this > early in start_kernel() just before vmalloc_init() [1] because it > looks like it depends on things like workqueue > (schedule_work(_work)) and rcu. Hence, it needs to be after > workqueue_init_early() and rcu_init() > > Given that, maybe the best outcome is to stick to the alternative > approach that works [1] rather messing up with the order of > debug_objects_mem_init() in start_kernel() which seems
Re: [PATCH v4] debugobjects: scale the static pool size
On 11/25/2018 03:42 PM, Qian Cai wrote: > > > On 11/23/18 10:01 PM, Qian Cai wrote: >> >> >>> On Nov 22, 2018, at 4:56 PM, Thomas Gleixner >>> wrote: >>> >>> On Tue, 20 Nov 2018, Qian Cai wrote: >>> >>> Looking deeper at that. >>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c index 70935ed91125..140571aa483c 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -23,9 +23,81 @@ #define ODEBUG_HASH_BITS 14 #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) -#define ODEBUG_POOL_SIZE 1024 +#define ODEBUG_DEFAULT_POOL 512 #define ODEBUG_POOL_MIN_LEVEL 256 +/* + * Some debug objects are allocated during the early boot. Enabling some options + * like timers or workqueue objects may increase the size required significantly + * with large number of CPUs. For example (as today, 20 Nov. 2018), + * + * No. CPUs x 2 (worker pool) objects: + * + * start_kernel + * workqueue_init_early + * init_worker_pool + * init_timer_key + * debug_object_init + * + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): + * + * sched_init + * hrtick_rq_init + * hrtimer_init + * + * CONFIG_DEBUG_OBJECTS_WORK: + * No. CPUs x 6 (workqueue) objects: + * + * workqueue_init_early + * alloc_workqueue + * __alloc_workqueue_key + * alloc_and_link_pwqs + * init_pwq + * + * Also, plus No. CPUs objects: + * + * perf_event_init + * __init_srcu_struct + * init_srcu_struct_fields + * init_srcu_struct_nodes + * __init_work >>> >>> None of the things are actually used or required _BEFORE_ >>> debug_objects_mem_init() is invoked. >>> >>> The reason why the call is at this place in start_kernel() is >>> historical. It's because back in the days when debugobjects were >>> added the >>> memory allocator was enabled way later than today. So we can just >>> move the >>> debug_objects_mem_init() call right before sched_init() I think. >> >> Well, now that kmemleak_init() seems complains that >> debug_objects_mem_init() >> is called before it. >> >> [ 0.078805] kmemleak: Cannot insert 0xc00dff93 into the >> object search tree (overlaps existing) >> [ 0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3 >> [ 0.078883] Call Trace: >> [ 0.078904] [c1c8fcd0] [c0c96b34] >> dump_stack+0xe8/0x164 (unreliable) >> [ 0.078935] [c1c8fd20] [c0486e84] >> create_object+0x344/0x380 >> [ 0.078962] [c1c8fde0] [c0489544] >> early_alloc+0x108/0x1f8 >> [ 0.078989] [c1c8fe20] [c109738c] >> kmemleak_init+0x1d8/0x3d4 >> [ 0.079016] [c1c8ff00] [c1054028] >> start_kernel+0x5c0/0x6f8 >> [ 0.079043] [c1c8ff90] [c000ae7c] >> start_here_common+0x1c/0x520 >> [ 0.079070] kmemleak: Kernel memory leak detector disabled >> [ 0.079091] kmemleak: Object 0xc00ffd587b68 (size 40): >> [ 0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299 >> [ 0.079135] kmemleak: min_count = -1 >> [ 0.079153] kmemleak: count = 0 >> [ 0.079170] kmemleak: flags = 0x5 >> [ 0.079188] kmemleak: checksum = 0 >> [ 0.079206] kmemleak: backtrace: >> [ 0.079227] __debug_object_init+0x688/0x700 >> [ 0.079250] debug_object_activate+0x1e0/0x350 >> [ 0.079272] __call_rcu+0x60/0x430 >> [ 0.079292] put_object+0x60/0x80 >> [ 0.079311] kmemleak_init+0x2cc/0x3d4 >> [ 0.079331] start_kernel+0x5c0/0x6f8 >> [ 0.079351] start_here_common+0x1c/0x520 >> [ 0.079380] kmemleak: Early log backtrace: >> [ 0.079399] memblock_alloc_try_nid_raw+0x90/0xcc >> [ 0.079421] sparse_init_nid+0x144/0x51c >> [ 0.079440] sparse_init+0x1a0/0x238 >> [ 0.079459] initmem_init+0x1d8/0x25c >> [ 0.079498] setup_arch+0x3e0/0x464 >> [ 0.079517] start_kernel+0xa4/0x6f8 >> [ 0.079536] start_here_common+0x1c/0x520 >> > > So this is an chicken-egg problem. Debug objects need kmemleak_init() > first, so it can make use of kmemleak_ignore() for all debug objects > in order to avoid the overlapping like the above. > > while (obj_pool_free < debug_objects_pool_min_level) { > > new = kmem_cache_zalloc(obj_cache, gfp); > if (!new) > return; > > kmemleak_ignore(new); > > However, there seems no way to move kmemleak_init() together this > early in start_kernel() just before vmalloc_init() [1] because it > looks like it depends on things like workqueue > (schedule_work(_work)) and rcu. Hence, it needs to be after > workqueue_init_early() and rcu_init() > > Given that, maybe the best outcome is to stick to the alternative > approach that works [1] rather messing up with the order of > debug_objects_mem_init() in start_kernel() which seems
Re: [PATCH v4] debugobjects: scale the static pool size
On 11/22/18 4:56 PM, Thomas Gleixner wrote: On Tue, 20 Nov 2018, Qian Cai wrote: Looking deeper at that. diff --git a/lib/debugobjects.c b/lib/debugobjects.c index 70935ed91125..140571aa483c 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -23,9 +23,81 @@ #define ODEBUG_HASH_BITS 14 #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) -#define ODEBUG_POOL_SIZE 1024 +#define ODEBUG_DEFAULT_POOL512 #define ODEBUG_POOL_MIN_LEVEL 256 +/* + * Some debug objects are allocated during the early boot. Enabling some options + * like timers or workqueue objects may increase the size required significantly + * with large number of CPUs. For example (as today, 20 Nov. 2018), + * + * No. CPUs x 2 (worker pool) objects: + * + * start_kernel + * workqueue_init_early + * init_worker_pool + * init_timer_key + * debug_object_init + * + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): + * + * sched_init + * hrtick_rq_init + * hrtimer_init + * + * CONFIG_DEBUG_OBJECTS_WORK: + * No. CPUs x 6 (workqueue) objects: + * + * workqueue_init_early + * alloc_workqueue + * __alloc_workqueue_key + * alloc_and_link_pwqs + * init_pwq + * + * Also, plus No. CPUs objects: + * + * perf_event_init + *__init_srcu_struct + * init_srcu_struct_fields + *init_srcu_struct_nodes + * __init_work None of the things are actually used or required _BEFORE_ debug_objects_mem_init() is invoked. The reason why the call is at this place in start_kernel() is historical. It's because back in the days when debugobjects were added the memory allocator was enabled way later than today. So we can just move the debug_objects_mem_init() call right before sched_init() I think. + * Increase the number a bit more in case the implmentatins are changed in the + * future, as it is better to avoid OOM than spending a bit more kernel memory + * that will/can be freed. + * + * With all debug objects config options selected except the workqueue and the + * timers, kernel reports, + * 64-CPU: ODEBUG: 4 of 4 active objects replaced + * 256-CPU: ODEBUG: 4 of 4 active objects replaced + * + * all the options except the workqueue: + * 64-CPU: ODEBUG: 466 of 466 active objects replaced + * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced + * + * all the options except the timers: + * 64-CPU: ODEBUG: 652 of 652 active objects replaced + * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced + * + * all the options: + * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced + * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced + */ +#ifdef CONFIG_DEBUG_OBJECTS_WORK +#define ODEBUG_WORK_PCPU_CNT 10 +#else +#define ODEBUG_WORK_PCPU_CNT 0 +#endif /* CONFIG_DEBUG_OBJECTS_WORK */ + +#ifdef CONFIG_DEBUG_OBJECTS_TIMERS +#define ODEBUG_TIMERS_PCPU_CNT 10 +#else +#define ODEBUG_TIMERS_PCPU_CNT 0 +#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */ Btw, the scaling here is way off. +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \ + (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT)) CONFIG_NR_CPUS Pool size Storage size 256 256512 4M According to your list above it uses 4378 object for 256 CPUs. That's off by an factor of ~58. Hmm, it is not clear to me why this is off and where did the pool size 256512 come from. CONFIG_NR_CPUS: 256 Pool size: 1024 + 256 * (10 + 10) = 6144 Am I missing anything?
Re: [PATCH v4] debugobjects: scale the static pool size
On 11/22/18 4:56 PM, Thomas Gleixner wrote: On Tue, 20 Nov 2018, Qian Cai wrote: Looking deeper at that. diff --git a/lib/debugobjects.c b/lib/debugobjects.c index 70935ed91125..140571aa483c 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -23,9 +23,81 @@ #define ODEBUG_HASH_BITS 14 #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) -#define ODEBUG_POOL_SIZE 1024 +#define ODEBUG_DEFAULT_POOL512 #define ODEBUG_POOL_MIN_LEVEL 256 +/* + * Some debug objects are allocated during the early boot. Enabling some options + * like timers or workqueue objects may increase the size required significantly + * with large number of CPUs. For example (as today, 20 Nov. 2018), + * + * No. CPUs x 2 (worker pool) objects: + * + * start_kernel + * workqueue_init_early + * init_worker_pool + * init_timer_key + * debug_object_init + * + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): + * + * sched_init + * hrtick_rq_init + * hrtimer_init + * + * CONFIG_DEBUG_OBJECTS_WORK: + * No. CPUs x 6 (workqueue) objects: + * + * workqueue_init_early + * alloc_workqueue + * __alloc_workqueue_key + * alloc_and_link_pwqs + * init_pwq + * + * Also, plus No. CPUs objects: + * + * perf_event_init + *__init_srcu_struct + * init_srcu_struct_fields + *init_srcu_struct_nodes + * __init_work None of the things are actually used or required _BEFORE_ debug_objects_mem_init() is invoked. The reason why the call is at this place in start_kernel() is historical. It's because back in the days when debugobjects were added the memory allocator was enabled way later than today. So we can just move the debug_objects_mem_init() call right before sched_init() I think. + * Increase the number a bit more in case the implmentatins are changed in the + * future, as it is better to avoid OOM than spending a bit more kernel memory + * that will/can be freed. + * + * With all debug objects config options selected except the workqueue and the + * timers, kernel reports, + * 64-CPU: ODEBUG: 4 of 4 active objects replaced + * 256-CPU: ODEBUG: 4 of 4 active objects replaced + * + * all the options except the workqueue: + * 64-CPU: ODEBUG: 466 of 466 active objects replaced + * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced + * + * all the options except the timers: + * 64-CPU: ODEBUG: 652 of 652 active objects replaced + * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced + * + * all the options: + * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced + * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced + */ +#ifdef CONFIG_DEBUG_OBJECTS_WORK +#define ODEBUG_WORK_PCPU_CNT 10 +#else +#define ODEBUG_WORK_PCPU_CNT 0 +#endif /* CONFIG_DEBUG_OBJECTS_WORK */ + +#ifdef CONFIG_DEBUG_OBJECTS_TIMERS +#define ODEBUG_TIMERS_PCPU_CNT 10 +#else +#define ODEBUG_TIMERS_PCPU_CNT 0 +#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */ Btw, the scaling here is way off. +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \ + (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT)) CONFIG_NR_CPUS Pool size Storage size 256 256512 4M According to your list above it uses 4378 object for 256 CPUs. That's off by an factor of ~58. Hmm, it is not clear to me why this is off and where did the pool size 256512 come from. CONFIG_NR_CPUS: 256 Pool size: 1024 + 256 * (10 + 10) = 6144 Am I missing anything?
Re: [PATCH v4] debugobjects: scale the static pool size
On 11/23/18 10:01 PM, Qian Cai wrote: On Nov 22, 2018, at 4:56 PM, Thomas Gleixner wrote: On Tue, 20 Nov 2018, Qian Cai wrote: Looking deeper at that. diff --git a/lib/debugobjects.c b/lib/debugobjects.c index 70935ed91125..140571aa483c 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -23,9 +23,81 @@ #define ODEBUG_HASH_BITS14 #define ODEBUG_HASH_SIZE(1 << ODEBUG_HASH_BITS) -#define ODEBUG_POOL_SIZE 1024 +#define ODEBUG_DEFAULT_POOL512 #define ODEBUG_POOL_MIN_LEVEL 256 +/* + * Some debug objects are allocated during the early boot. Enabling some options + * like timers or workqueue objects may increase the size required significantly + * with large number of CPUs. For example (as today, 20 Nov. 2018), + * + * No. CPUs x 2 (worker pool) objects: + * + * start_kernel + * workqueue_init_early + * init_worker_pool + * init_timer_key + * debug_object_init + * + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): + * + * sched_init + * hrtick_rq_init + * hrtimer_init + * + * CONFIG_DEBUG_OBJECTS_WORK: + * No. CPUs x 6 (workqueue) objects: + * + * workqueue_init_early + * alloc_workqueue + * __alloc_workqueue_key + * alloc_and_link_pwqs + * init_pwq + * + * Also, plus No. CPUs objects: + * + * perf_event_init + *__init_srcu_struct + * init_srcu_struct_fields + *init_srcu_struct_nodes + * __init_work None of the things are actually used or required _BEFORE_ debug_objects_mem_init() is invoked. The reason why the call is at this place in start_kernel() is historical. It's because back in the days when debugobjects were added the memory allocator was enabled way later than today. So we can just move the debug_objects_mem_init() call right before sched_init() I think. Well, now that kmemleak_init() seems complains that debug_objects_mem_init() is called before it. [0.078805] kmemleak: Cannot insert 0xc00dff93 into the object search tree (overlaps existing) [0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3 [0.078883] Call Trace: [0.078904] [c1c8fcd0] [c0c96b34] dump_stack+0xe8/0x164 (unreliable) [0.078935] [c1c8fd20] [c0486e84] create_object+0x344/0x380 [0.078962] [c1c8fde0] [c0489544] early_alloc+0x108/0x1f8 [0.078989] [c1c8fe20] [c109738c] kmemleak_init+0x1d8/0x3d4 [0.079016] [c1c8ff00] [c1054028] start_kernel+0x5c0/0x6f8 [0.079043] [c1c8ff90] [c000ae7c] start_here_common+0x1c/0x520 [0.079070] kmemleak: Kernel memory leak detector disabled [0.079091] kmemleak: Object 0xc00ffd587b68 (size 40): [0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299 [0.079135] kmemleak: min_count = -1 [0.079153] kmemleak: count = 0 [0.079170] kmemleak: flags = 0x5 [0.079188] kmemleak: checksum = 0 [0.079206] kmemleak: backtrace: [0.079227] __debug_object_init+0x688/0x700 [0.079250] debug_object_activate+0x1e0/0x350 [0.079272] __call_rcu+0x60/0x430 [0.079292] put_object+0x60/0x80 [0.079311] kmemleak_init+0x2cc/0x3d4 [0.079331] start_kernel+0x5c0/0x6f8 [0.079351] start_here_common+0x1c/0x520 [0.079380] kmemleak: Early log backtrace: [0.079399]memblock_alloc_try_nid_raw+0x90/0xcc [0.079421]sparse_init_nid+0x144/0x51c [0.079440]sparse_init+0x1a0/0x238 [0.079459]initmem_init+0x1d8/0x25c [0.079498]setup_arch+0x3e0/0x464 [0.079517]start_kernel+0xa4/0x6f8 [0.079536]start_here_common+0x1c/0x520 So this is an chicken-egg problem. Debug objects need kmemleak_init() first, so it can make use of kmemleak_ignore() for all debug objects in order to avoid the overlapping like the above. while (obj_pool_free < debug_objects_pool_min_level) { new = kmem_cache_zalloc(obj_cache, gfp); if (!new) return; kmemleak_ignore(new); However, there seems no way to move kmemleak_init() together this early in start_kernel() just before vmalloc_init() [1] because it looks like it depends on things like workqueue (schedule_work(_work)) and rcu. Hence, it needs to be after workqueue_init_early() and rcu_init() Given that, maybe the best outcome is to stick to the alternative approach that works [1] rather messing up with the order of debug_objects_mem_init() in start_kernel() which seems tricky. What do you think? [1] https://goo.gl/18N78g [2] https://goo.gl/My6ig6
Re: [PATCH v4] debugobjects: scale the static pool size
On 11/23/18 10:01 PM, Qian Cai wrote: On Nov 22, 2018, at 4:56 PM, Thomas Gleixner wrote: On Tue, 20 Nov 2018, Qian Cai wrote: Looking deeper at that. diff --git a/lib/debugobjects.c b/lib/debugobjects.c index 70935ed91125..140571aa483c 100644 --- a/lib/debugobjects.c +++ b/lib/debugobjects.c @@ -23,9 +23,81 @@ #define ODEBUG_HASH_BITS14 #define ODEBUG_HASH_SIZE(1 << ODEBUG_HASH_BITS) -#define ODEBUG_POOL_SIZE 1024 +#define ODEBUG_DEFAULT_POOL512 #define ODEBUG_POOL_MIN_LEVEL 256 +/* + * Some debug objects are allocated during the early boot. Enabling some options + * like timers or workqueue objects may increase the size required significantly + * with large number of CPUs. For example (as today, 20 Nov. 2018), + * + * No. CPUs x 2 (worker pool) objects: + * + * start_kernel + * workqueue_init_early + * init_worker_pool + * init_timer_key + * debug_object_init + * + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): + * + * sched_init + * hrtick_rq_init + * hrtimer_init + * + * CONFIG_DEBUG_OBJECTS_WORK: + * No. CPUs x 6 (workqueue) objects: + * + * workqueue_init_early + * alloc_workqueue + * __alloc_workqueue_key + * alloc_and_link_pwqs + * init_pwq + * + * Also, plus No. CPUs objects: + * + * perf_event_init + *__init_srcu_struct + * init_srcu_struct_fields + *init_srcu_struct_nodes + * __init_work None of the things are actually used or required _BEFORE_ debug_objects_mem_init() is invoked. The reason why the call is at this place in start_kernel() is historical. It's because back in the days when debugobjects were added the memory allocator was enabled way later than today. So we can just move the debug_objects_mem_init() call right before sched_init() I think. Well, now that kmemleak_init() seems complains that debug_objects_mem_init() is called before it. [0.078805] kmemleak: Cannot insert 0xc00dff93 into the object search tree (overlaps existing) [0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3 [0.078883] Call Trace: [0.078904] [c1c8fcd0] [c0c96b34] dump_stack+0xe8/0x164 (unreliable) [0.078935] [c1c8fd20] [c0486e84] create_object+0x344/0x380 [0.078962] [c1c8fde0] [c0489544] early_alloc+0x108/0x1f8 [0.078989] [c1c8fe20] [c109738c] kmemleak_init+0x1d8/0x3d4 [0.079016] [c1c8ff00] [c1054028] start_kernel+0x5c0/0x6f8 [0.079043] [c1c8ff90] [c000ae7c] start_here_common+0x1c/0x520 [0.079070] kmemleak: Kernel memory leak detector disabled [0.079091] kmemleak: Object 0xc00ffd587b68 (size 40): [0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299 [0.079135] kmemleak: min_count = -1 [0.079153] kmemleak: count = 0 [0.079170] kmemleak: flags = 0x5 [0.079188] kmemleak: checksum = 0 [0.079206] kmemleak: backtrace: [0.079227] __debug_object_init+0x688/0x700 [0.079250] debug_object_activate+0x1e0/0x350 [0.079272] __call_rcu+0x60/0x430 [0.079292] put_object+0x60/0x80 [0.079311] kmemleak_init+0x2cc/0x3d4 [0.079331] start_kernel+0x5c0/0x6f8 [0.079351] start_here_common+0x1c/0x520 [0.079380] kmemleak: Early log backtrace: [0.079399]memblock_alloc_try_nid_raw+0x90/0xcc [0.079421]sparse_init_nid+0x144/0x51c [0.079440]sparse_init+0x1a0/0x238 [0.079459]initmem_init+0x1d8/0x25c [0.079498]setup_arch+0x3e0/0x464 [0.079517]start_kernel+0xa4/0x6f8 [0.079536]start_here_common+0x1c/0x520 So this is an chicken-egg problem. Debug objects need kmemleak_init() first, so it can make use of kmemleak_ignore() for all debug objects in order to avoid the overlapping like the above. while (obj_pool_free < debug_objects_pool_min_level) { new = kmem_cache_zalloc(obj_cache, gfp); if (!new) return; kmemleak_ignore(new); However, there seems no way to move kmemleak_init() together this early in start_kernel() just before vmalloc_init() [1] because it looks like it depends on things like workqueue (schedule_work(_work)) and rcu. Hence, it needs to be after workqueue_init_early() and rcu_init() Given that, maybe the best outcome is to stick to the alternative approach that works [1] rather messing up with the order of debug_objects_mem_init() in start_kernel() which seems tricky. What do you think? [1] https://goo.gl/18N78g [2] https://goo.gl/My6ig6
Re: [PATCH v4] debugobjects: scale the static pool size
> On Nov 22, 2018, at 4:56 PM, Thomas Gleixner wrote: > > On Tue, 20 Nov 2018, Qian Cai wrote: > > Looking deeper at that. > >> diff --git a/lib/debugobjects.c b/lib/debugobjects.c >> index 70935ed91125..140571aa483c 100644 >> --- a/lib/debugobjects.c >> +++ b/lib/debugobjects.c >> @@ -23,9 +23,81 @@ >> #define ODEBUG_HASH_BITS 14 >> #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) >> >> -#define ODEBUG_POOL_SIZE1024 >> +#define ODEBUG_DEFAULT_POOL 512 >> #define ODEBUG_POOL_MIN_LEVEL256 >> >> +/* >> + * Some debug objects are allocated during the early boot. Enabling some >> options >> + * like timers or workqueue objects may increase the size required >> significantly >> + * with large number of CPUs. For example (as today, 20 Nov. 2018), >> + * >> + * No. CPUs x 2 (worker pool) objects: >> + * >> + * start_kernel >> + * workqueue_init_early >> + * init_worker_pool >> + * init_timer_key >> + * debug_object_init >> + * >> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): >> + * >> + * sched_init >> + * hrtick_rq_init >> + * hrtimer_init >> + * >> + * CONFIG_DEBUG_OBJECTS_WORK: >> + * No. CPUs x 6 (workqueue) objects: >> + * >> + * workqueue_init_early >> + * alloc_workqueue >> + * __alloc_workqueue_key >> + * alloc_and_link_pwqs >> + * init_pwq >> + * >> + * Also, plus No. CPUs objects: >> + * >> + * perf_event_init >> + *__init_srcu_struct >> + * init_srcu_struct_fields >> + *init_srcu_struct_nodes >> + * __init_work > > None of the things are actually used or required _BEFORE_ > debug_objects_mem_init() is invoked. > > The reason why the call is at this place in start_kernel() is > historical. It's because back in the days when debugobjects were added the > memory allocator was enabled way later than today. So we can just move the > debug_objects_mem_init() call right before sched_init() I think. Well, now that kmemleak_init() seems complains that debug_objects_mem_init() is called before it. [0.078805] kmemleak: Cannot insert 0xc00dff93 into the object search tree (overlaps existing) [0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3 [0.078883] Call Trace: [0.078904] [c1c8fcd0] [c0c96b34] dump_stack+0xe8/0x164 (unreliable) [0.078935] [c1c8fd20] [c0486e84] create_object+0x344/0x380 [0.078962] [c1c8fde0] [c0489544] early_alloc+0x108/0x1f8 [0.078989] [c1c8fe20] [c109738c] kmemleak_init+0x1d8/0x3d4 [0.079016] [c1c8ff00] [c1054028] start_kernel+0x5c0/0x6f8 [0.079043] [c1c8ff90] [c000ae7c] start_here_common+0x1c/0x520 [0.079070] kmemleak: Kernel memory leak detector disabled [0.079091] kmemleak: Object 0xc00ffd587b68 (size 40): [0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299 [0.079135] kmemleak: min_count = -1 [0.079153] kmemleak: count = 0 [0.079170] kmemleak: flags = 0x5 [0.079188] kmemleak: checksum = 0 [0.079206] kmemleak: backtrace: [0.079227] __debug_object_init+0x688/0x700 [0.079250] debug_object_activate+0x1e0/0x350 [0.079272] __call_rcu+0x60/0x430 [0.079292] put_object+0x60/0x80 [0.079311] kmemleak_init+0x2cc/0x3d4 [0.079331] start_kernel+0x5c0/0x6f8 [0.079351] start_here_common+0x1c/0x520 [0.079380] kmemleak: Early log backtrace: [0.079399]memblock_alloc_try_nid_raw+0x90/0xcc [0.079421]sparse_init_nid+0x144/0x51c [0.079440]sparse_init+0x1a0/0x238 [0.079459]initmem_init+0x1d8/0x25c [0.079498]setup_arch+0x3e0/0x464 [0.079517]start_kernel+0xa4/0x6f8 [0.079536]start_here_common+0x1c/0x520
Re: [PATCH v4] debugobjects: scale the static pool size
> On Nov 22, 2018, at 4:56 PM, Thomas Gleixner wrote: > > On Tue, 20 Nov 2018, Qian Cai wrote: > > Looking deeper at that. > >> diff --git a/lib/debugobjects.c b/lib/debugobjects.c >> index 70935ed91125..140571aa483c 100644 >> --- a/lib/debugobjects.c >> +++ b/lib/debugobjects.c >> @@ -23,9 +23,81 @@ >> #define ODEBUG_HASH_BITS 14 >> #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) >> >> -#define ODEBUG_POOL_SIZE1024 >> +#define ODEBUG_DEFAULT_POOL 512 >> #define ODEBUG_POOL_MIN_LEVEL256 >> >> +/* >> + * Some debug objects are allocated during the early boot. Enabling some >> options >> + * like timers or workqueue objects may increase the size required >> significantly >> + * with large number of CPUs. For example (as today, 20 Nov. 2018), >> + * >> + * No. CPUs x 2 (worker pool) objects: >> + * >> + * start_kernel >> + * workqueue_init_early >> + * init_worker_pool >> + * init_timer_key >> + * debug_object_init >> + * >> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): >> + * >> + * sched_init >> + * hrtick_rq_init >> + * hrtimer_init >> + * >> + * CONFIG_DEBUG_OBJECTS_WORK: >> + * No. CPUs x 6 (workqueue) objects: >> + * >> + * workqueue_init_early >> + * alloc_workqueue >> + * __alloc_workqueue_key >> + * alloc_and_link_pwqs >> + * init_pwq >> + * >> + * Also, plus No. CPUs objects: >> + * >> + * perf_event_init >> + *__init_srcu_struct >> + * init_srcu_struct_fields >> + *init_srcu_struct_nodes >> + * __init_work > > None of the things are actually used or required _BEFORE_ > debug_objects_mem_init() is invoked. > > The reason why the call is at this place in start_kernel() is > historical. It's because back in the days when debugobjects were added the > memory allocator was enabled way later than today. So we can just move the > debug_objects_mem_init() call right before sched_init() I think. Well, now that kmemleak_init() seems complains that debug_objects_mem_init() is called before it. [0.078805] kmemleak: Cannot insert 0xc00dff93 into the object search tree (overlaps existing) [0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3 [0.078883] Call Trace: [0.078904] [c1c8fcd0] [c0c96b34] dump_stack+0xe8/0x164 (unreliable) [0.078935] [c1c8fd20] [c0486e84] create_object+0x344/0x380 [0.078962] [c1c8fde0] [c0489544] early_alloc+0x108/0x1f8 [0.078989] [c1c8fe20] [c109738c] kmemleak_init+0x1d8/0x3d4 [0.079016] [c1c8ff00] [c1054028] start_kernel+0x5c0/0x6f8 [0.079043] [c1c8ff90] [c000ae7c] start_here_common+0x1c/0x520 [0.079070] kmemleak: Kernel memory leak detector disabled [0.079091] kmemleak: Object 0xc00ffd587b68 (size 40): [0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299 [0.079135] kmemleak: min_count = -1 [0.079153] kmemleak: count = 0 [0.079170] kmemleak: flags = 0x5 [0.079188] kmemleak: checksum = 0 [0.079206] kmemleak: backtrace: [0.079227] __debug_object_init+0x688/0x700 [0.079250] debug_object_activate+0x1e0/0x350 [0.079272] __call_rcu+0x60/0x430 [0.079292] put_object+0x60/0x80 [0.079311] kmemleak_init+0x2cc/0x3d4 [0.079331] start_kernel+0x5c0/0x6f8 [0.079351] start_here_common+0x1c/0x520 [0.079380] kmemleak: Early log backtrace: [0.079399]memblock_alloc_try_nid_raw+0x90/0xcc [0.079421]sparse_init_nid+0x144/0x51c [0.079440]sparse_init+0x1a0/0x238 [0.079459]initmem_init+0x1d8/0x25c [0.079498]setup_arch+0x3e0/0x464 [0.079517]start_kernel+0xa4/0x6f8 [0.079536]start_here_common+0x1c/0x520
Re: [PATCH v4] debugobjects: scale the static pool size
On Tue, 20 Nov 2018, Qian Cai wrote: Looking deeper at that. > diff --git a/lib/debugobjects.c b/lib/debugobjects.c > index 70935ed91125..140571aa483c 100644 > --- a/lib/debugobjects.c > +++ b/lib/debugobjects.c > @@ -23,9 +23,81 @@ > #define ODEBUG_HASH_BITS 14 > #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) > > -#define ODEBUG_POOL_SIZE 1024 > +#define ODEBUG_DEFAULT_POOL 512 > #define ODEBUG_POOL_MIN_LEVEL256 > > +/* > + * Some debug objects are allocated during the early boot. Enabling some > options > + * like timers or workqueue objects may increase the size required > significantly > + * with large number of CPUs. For example (as today, 20 Nov. 2018), > + * > + * No. CPUs x 2 (worker pool) objects: > + * > + * start_kernel > + * workqueue_init_early > + * init_worker_pool > + * init_timer_key > + * debug_object_init > + * > + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): > + * > + * sched_init > + * hrtick_rq_init > + * hrtimer_init > + * > + * CONFIG_DEBUG_OBJECTS_WORK: > + * No. CPUs x 6 (workqueue) objects: > + * > + * workqueue_init_early > + * alloc_workqueue > + * __alloc_workqueue_key > + * alloc_and_link_pwqs > + * init_pwq > + * > + * Also, plus No. CPUs objects: > + * > + * perf_event_init > + *__init_srcu_struct > + * init_srcu_struct_fields > + *init_srcu_struct_nodes > + * __init_work None of the things are actually used or required _BEFORE_ debug_objects_mem_init() is invoked. The reason why the call is at this place in start_kernel() is historical. It's because back in the days when debugobjects were added the memory allocator was enabled way later than today. So we can just move the debug_objects_mem_init() call right before sched_init() I think. > + * Increase the number a bit more in case the implmentatins are changed in > the > + * future, as it is better to avoid OOM than spending a bit more kernel > memory > + * that will/can be freed. > + * > + * With all debug objects config options selected except the workqueue and > the > + * timers, kernel reports, > + * 64-CPU: ODEBUG: 4 of 4 active objects replaced > + * 256-CPU: ODEBUG: 4 of 4 active objects replaced > + * > + * all the options except the workqueue: > + * 64-CPU: ODEBUG: 466 of 466 active objects replaced > + * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced > + * > + * all the options except the timers: > + * 64-CPU: ODEBUG: 652 of 652 active objects replaced > + * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced > + * > + * all the options: > + * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced > + * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced > + */ > +#ifdef CONFIG_DEBUG_OBJECTS_WORK > +#define ODEBUG_WORK_PCPU_CNT 10 > +#else > +#define ODEBUG_WORK_PCPU_CNT 0 > +#endif /* CONFIG_DEBUG_OBJECTS_WORK */ > + > +#ifdef CONFIG_DEBUG_OBJECTS_TIMERS > +#define ODEBUG_TIMERS_PCPU_CNT 10 > +#else > +#define ODEBUG_TIMERS_PCPU_CNT 0 > +#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */ Btw, the scaling here is way off. > +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \ > + (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT)) CONFIG_NR_CPUS Pool size Storage size 256 256512 4M According to your list above it uses 4378 object for 256 CPUs. That's off by an factor of ~58. But we can spare all that and just move the init call. Thanks, tglx
Re: [PATCH v4] debugobjects: scale the static pool size
On Tue, 20 Nov 2018, Qian Cai wrote: Looking deeper at that. > diff --git a/lib/debugobjects.c b/lib/debugobjects.c > index 70935ed91125..140571aa483c 100644 > --- a/lib/debugobjects.c > +++ b/lib/debugobjects.c > @@ -23,9 +23,81 @@ > #define ODEBUG_HASH_BITS 14 > #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) > > -#define ODEBUG_POOL_SIZE 1024 > +#define ODEBUG_DEFAULT_POOL 512 > #define ODEBUG_POOL_MIN_LEVEL256 > > +/* > + * Some debug objects are allocated during the early boot. Enabling some > options > + * like timers or workqueue objects may increase the size required > significantly > + * with large number of CPUs. For example (as today, 20 Nov. 2018), > + * > + * No. CPUs x 2 (worker pool) objects: > + * > + * start_kernel > + * workqueue_init_early > + * init_worker_pool > + * init_timer_key > + * debug_object_init > + * > + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): > + * > + * sched_init > + * hrtick_rq_init > + * hrtimer_init > + * > + * CONFIG_DEBUG_OBJECTS_WORK: > + * No. CPUs x 6 (workqueue) objects: > + * > + * workqueue_init_early > + * alloc_workqueue > + * __alloc_workqueue_key > + * alloc_and_link_pwqs > + * init_pwq > + * > + * Also, plus No. CPUs objects: > + * > + * perf_event_init > + *__init_srcu_struct > + * init_srcu_struct_fields > + *init_srcu_struct_nodes > + * __init_work None of the things are actually used or required _BEFORE_ debug_objects_mem_init() is invoked. The reason why the call is at this place in start_kernel() is historical. It's because back in the days when debugobjects were added the memory allocator was enabled way later than today. So we can just move the debug_objects_mem_init() call right before sched_init() I think. > + * Increase the number a bit more in case the implmentatins are changed in > the > + * future, as it is better to avoid OOM than spending a bit more kernel > memory > + * that will/can be freed. > + * > + * With all debug objects config options selected except the workqueue and > the > + * timers, kernel reports, > + * 64-CPU: ODEBUG: 4 of 4 active objects replaced > + * 256-CPU: ODEBUG: 4 of 4 active objects replaced > + * > + * all the options except the workqueue: > + * 64-CPU: ODEBUG: 466 of 466 active objects replaced > + * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced > + * > + * all the options except the timers: > + * 64-CPU: ODEBUG: 652 of 652 active objects replaced > + * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced > + * > + * all the options: > + * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced > + * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced > + */ > +#ifdef CONFIG_DEBUG_OBJECTS_WORK > +#define ODEBUG_WORK_PCPU_CNT 10 > +#else > +#define ODEBUG_WORK_PCPU_CNT 0 > +#endif /* CONFIG_DEBUG_OBJECTS_WORK */ > + > +#ifdef CONFIG_DEBUG_OBJECTS_TIMERS > +#define ODEBUG_TIMERS_PCPU_CNT 10 > +#else > +#define ODEBUG_TIMERS_PCPU_CNT 0 > +#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */ Btw, the scaling here is way off. > +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \ > + (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT)) CONFIG_NR_CPUS Pool size Storage size 256 256512 4M According to your list above it uses 4378 object for 256 CPUs. That's off by an factor of ~58. But we can spare all that and just move the init call. Thanks, tglx
Re: [PATCH v4] debugobjects: scale the static pool size
On 11/20/2018 09:11 PM, Qian Cai wrote: > The current value of the early boot static pool size is not big enough > for systems with large number of CPUs with timer or/and workqueue > objects selected. As the results, systems have 60+ CPUs with both timer > and workqueue objects enabled could trigger "ODEBUG: Out of memory. > ODEBUG disabled". Hence, fixed it by computing it according to > CONFIG_NR_CPUS and CONFIG_DEBUG_OBJECTS_* options. > > Signed-off-by: Qian Cai > --- > > Changes since v3: > * style fixes > > Changes since v2: > * Made the calculation easier to read suggested by long...@redhat.com. > > Changes since v1: > * Removed the CONFIG_NR_CPUS check since it is always defined. > * Introduced a minimal default pool size to start with. Otherwise, the pool > size would be too low for systems only with a small number of CPUs. > * Adjusted the scale factors according to data. > > lib/debugobjects.c | 81 -- > 1 file changed, 79 insertions(+), 2 deletions(-) > > diff --git a/lib/debugobjects.c b/lib/debugobjects.c > index 70935ed91125..140571aa483c 100644 > --- a/lib/debugobjects.c > +++ b/lib/debugobjects.c > @@ -23,9 +23,81 @@ > #define ODEBUG_HASH_BITS 14 > #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) > > -#define ODEBUG_POOL_SIZE 1024 > +#define ODEBUG_DEFAULT_POOL 512 > #define ODEBUG_POOL_MIN_LEVEL256 > > +/* > + * Some debug objects are allocated during the early boot. Enabling some > options > + * like timers or workqueue objects may increase the size required > significantly > + * with large number of CPUs. For example (as today, 20 Nov. 2018), > + * > + * No. CPUs x 2 (worker pool) objects: > + * > + * start_kernel > + * workqueue_init_early > + * init_worker_pool > + * init_timer_key > + * debug_object_init > + * > + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): > + * > + * sched_init > + * hrtick_rq_init > + * hrtimer_init > + * > + * CONFIG_DEBUG_OBJECTS_WORK: > + * No. CPUs x 6 (workqueue) objects: > + * > + * workqueue_init_early > + * alloc_workqueue > + * __alloc_workqueue_key > + * alloc_and_link_pwqs > + * init_pwq > + * > + * Also, plus No. CPUs objects: > + * > + * perf_event_init > + *__init_srcu_struct > + * init_srcu_struct_fields > + *init_srcu_struct_nodes > + * __init_work > + * > + * Increase the number a bit more in case the implmentatins are changed in > the > + * future, as it is better to avoid OOM than spending a bit more kernel > memory > + * that will/can be freed. > + * > + * With all debug objects config options selected except the workqueue and > the > + * timers, kernel reports, > + * 64-CPU: ODEBUG: 4 of 4 active objects replaced > + * 256-CPU: ODEBUG: 4 of 4 active objects replaced > + * > + * all the options except the workqueue: > + * 64-CPU: ODEBUG: 466 of 466 active objects replaced > + * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced > + * > + * all the options except the timers: > + * 64-CPU: ODEBUG: 652 of 652 active objects replaced > + * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced > + * > + * all the options: > + * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced > + * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced > + */ > +#ifdef CONFIG_DEBUG_OBJECTS_WORK > +#define ODEBUG_WORK_PCPU_CNT 10 > +#else > +#define ODEBUG_WORK_PCPU_CNT 0 > +#endif /* CONFIG_DEBUG_OBJECTS_WORK */ > + > +#ifdef CONFIG_DEBUG_OBJECTS_TIMERS > +#define ODEBUG_TIMERS_PCPU_CNT 10 > +#else > +#define ODEBUG_TIMERS_PCPU_CNT 0 > +#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */ > + > +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \ > + (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT)) > + > #define ODEBUG_CHUNK_SHIFT PAGE_SHIFT > #define ODEBUG_CHUNK_SIZE(1 << ODEBUG_CHUNK_SHIFT) > #define ODEBUG_CHUNK_MASK(~(ODEBUG_CHUNK_SIZE - 1)) > @@ -58,8 +130,13 @@ static intdebug_objects_fixups > __read_mostly; > static int debug_objects_warnings __read_mostly; > static int debug_objects_enabled __read_mostly > = CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT; > +/* > + * This is only used after replaced static objects, so no need to scale it to > + * use the early boot static pool size and it has already been scaled > according > + * to actual No. CPUs in the box within debug_objects_mem_init(). > + */ > static int debug_objects_pool_size __read_mostly > - = ODEBUG_POOL_SIZE; > + = ODEBUG_DEFAULT_POOL; > static int debug_objects_pool_min_level __read_mostly > = ODEBUG_POOL_MIN_LEVEL; > static struct debug_obj_descr*descr_test __read_mostly; Acked-by: Waiman Long
Re: [PATCH v4] debugobjects: scale the static pool size
On 11/20/2018 09:11 PM, Qian Cai wrote: > The current value of the early boot static pool size is not big enough > for systems with large number of CPUs with timer or/and workqueue > objects selected. As the results, systems have 60+ CPUs with both timer > and workqueue objects enabled could trigger "ODEBUG: Out of memory. > ODEBUG disabled". Hence, fixed it by computing it according to > CONFIG_NR_CPUS and CONFIG_DEBUG_OBJECTS_* options. > > Signed-off-by: Qian Cai > --- > > Changes since v3: > * style fixes > > Changes since v2: > * Made the calculation easier to read suggested by long...@redhat.com. > > Changes since v1: > * Removed the CONFIG_NR_CPUS check since it is always defined. > * Introduced a minimal default pool size to start with. Otherwise, the pool > size would be too low for systems only with a small number of CPUs. > * Adjusted the scale factors according to data. > > lib/debugobjects.c | 81 -- > 1 file changed, 79 insertions(+), 2 deletions(-) > > diff --git a/lib/debugobjects.c b/lib/debugobjects.c > index 70935ed91125..140571aa483c 100644 > --- a/lib/debugobjects.c > +++ b/lib/debugobjects.c > @@ -23,9 +23,81 @@ > #define ODEBUG_HASH_BITS 14 > #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS) > > -#define ODEBUG_POOL_SIZE 1024 > +#define ODEBUG_DEFAULT_POOL 512 > #define ODEBUG_POOL_MIN_LEVEL256 > > +/* > + * Some debug objects are allocated during the early boot. Enabling some > options > + * like timers or workqueue objects may increase the size required > significantly > + * with large number of CPUs. For example (as today, 20 Nov. 2018), > + * > + * No. CPUs x 2 (worker pool) objects: > + * > + * start_kernel > + * workqueue_init_early > + * init_worker_pool > + * init_timer_key > + * debug_object_init > + * > + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS): > + * > + * sched_init > + * hrtick_rq_init > + * hrtimer_init > + * > + * CONFIG_DEBUG_OBJECTS_WORK: > + * No. CPUs x 6 (workqueue) objects: > + * > + * workqueue_init_early > + * alloc_workqueue > + * __alloc_workqueue_key > + * alloc_and_link_pwqs > + * init_pwq > + * > + * Also, plus No. CPUs objects: > + * > + * perf_event_init > + *__init_srcu_struct > + * init_srcu_struct_fields > + *init_srcu_struct_nodes > + * __init_work > + * > + * Increase the number a bit more in case the implmentatins are changed in > the > + * future, as it is better to avoid OOM than spending a bit more kernel > memory > + * that will/can be freed. > + * > + * With all debug objects config options selected except the workqueue and > the > + * timers, kernel reports, > + * 64-CPU: ODEBUG: 4 of 4 active objects replaced > + * 256-CPU: ODEBUG: 4 of 4 active objects replaced > + * > + * all the options except the workqueue: > + * 64-CPU: ODEBUG: 466 of 466 active objects replaced > + * 256-CPU: ODEBUG: 1810 of 1810 active objects replaced > + * > + * all the options except the timers: > + * 64-CPU: ODEBUG: 652 of 652 active objects replaced > + * 256-CPU: ODEBUG: 2572 of 2572 active objects replaced > + * > + * all the options: > + * 64-CPU: ODEBUG: 1114 of 1114 active objects replaced > + * 256-CPU: ODEBUG: 4378 of 4378 active objects replaced > + */ > +#ifdef CONFIG_DEBUG_OBJECTS_WORK > +#define ODEBUG_WORK_PCPU_CNT 10 > +#else > +#define ODEBUG_WORK_PCPU_CNT 0 > +#endif /* CONFIG_DEBUG_OBJECTS_WORK */ > + > +#ifdef CONFIG_DEBUG_OBJECTS_TIMERS > +#define ODEBUG_TIMERS_PCPU_CNT 10 > +#else > +#define ODEBUG_TIMERS_PCPU_CNT 0 > +#endif /* CONFIG_DEBUG_OBJECTS_TIMERS */ > + > +#define ODEBUG_POOL_SIZE (ODEBUG_DEFAULT_POOL + CONFIG_NR_CPUS * \ > + (ODEBUG_TIMERS_PCPU_CNT + ODEBUG_WORK_PCPU_CNT)) > + > #define ODEBUG_CHUNK_SHIFT PAGE_SHIFT > #define ODEBUG_CHUNK_SIZE(1 << ODEBUG_CHUNK_SHIFT) > #define ODEBUG_CHUNK_MASK(~(ODEBUG_CHUNK_SIZE - 1)) > @@ -58,8 +130,13 @@ static intdebug_objects_fixups > __read_mostly; > static int debug_objects_warnings __read_mostly; > static int debug_objects_enabled __read_mostly > = CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT; > +/* > + * This is only used after replaced static objects, so no need to scale it to > + * use the early boot static pool size and it has already been scaled > according > + * to actual No. CPUs in the box within debug_objects_mem_init(). > + */ > static int debug_objects_pool_size __read_mostly > - = ODEBUG_POOL_SIZE; > + = ODEBUG_DEFAULT_POOL; > static int debug_objects_pool_min_level __read_mostly > = ODEBUG_POOL_MIN_LEVEL; > static struct debug_obj_descr*descr_test __read_mostly; Acked-by: Waiman Long