Re: [PATCH 2/3] kernel/pid: Remove default pid_max value
On Mon, Apr 08, 2024 at 04:58:18PM GMT, Michal Koutný wrote: > The kernel provides mechanisms, while it should not imply policies -- > default pid_max seems to be an example of the policy that does not fit > all. At the same time pid_max must have some value assigned, so use the > end of the allowed range -- pid_max_max. > > This change thus increases initial pid_max from 32k to 4M (x86_64 > defconfig). Out of curiosity I dug out the commit acdc721fe26d ("[PATCH] pid-max-2.5.33-A0") v2.5.34~5 that introduced the 32k default. The commit message doesn't say why such a sudden change though. Previously, the limit was 1G of pids (i.e. effectively no default limit like the intention of this series). Honestly, I expected more enthusiasm or reasons against removing the default value of pid_max. Is this really not of interest to anyone? (Thanks, Andrew, for your responses. I don't plan to pursue this further should there be no more interest in having less default limit values in kernel.) Regards, Michal signature.asc Description: PGP signature
Re: Re: [PATCH 2/3] kernel/pid: Remove default pid_max value
On Thu, Apr 11, 2024 at 03:03:31PM -0700, Andrew Morton wrote: > A large increase in the maximum number of processes. The change from (some) default to effective infinity is the crux of the change. Because that is only a number. (Thus I don't find the number's 12700% increase alone a big change.) Actual maximum amount of processes is "workload dependent" and hence should be determined based on the particular workload. > Or did I misinterpret? I thought you saw an issue with projection of that number into sizings based on the default. Which of them comprises the large change in your eyes? Thanks, Michal
Re: [PATCH 2/3] kernel/pid: Remove default pid_max value
On Thu, 11 Apr 2024 17:40:02 +0200 Michal Koutný wrote: > Hello. > > On Mon, Apr 08, 2024 at 01:29:55PM -0700, Andrew Morton > wrote: > > That seems like a large change. > > In what sense is it large? A large increase in the maximum number of processes. Or did I misinterpret?
Re: Re: [PATCH 2/3] kernel/pid: Remove default pid_max value
Hello. On Mon, Apr 08, 2024 at 01:29:55PM -0700, Andrew Morton wrote: > That seems like a large change. In what sense is it large? I tried to lookup the code parts that depend on this default and either add the other patches or mention the impact (that part could be more thorough) in the commit message. > It isn't clear why we'd want to merge this patchset. Does it improve > anyone's life and if so, how? - kernel devs who don't care about policy - policy should be decided by distros/users, not in kernel - users who need many threads - current default is too low - this is one more place to look at when configuring - users who want to prevent fork-bombs - current default is ineffective (too high), false feeling of safety - i.e. they should configure appropriate mechanism appropriately I thought that the first point alone would be convincing and that only scaling impact might need clarification. Regards, Michal
Re: [PATCH 2/3] kernel/pid: Remove default pid_max value
Hi Michal, kernel test robot noticed the following build errors: [auto build test ERROR on fec50db7033ea478773b159e0e2efb135270e3b7] url: https://github.com/intel-lab-lkp/linux/commits/Michal-Koutn/tracing-Remove-dependency-of-saved_cmdlines_buffer-on-PID_MAX_DEFAULT/20240408-230031 base: fec50db7033ea478773b159e0e2efb135270e3b7 patch link: https://lore.kernel.org/r/20240408145819.8787-3-mkoutny%40suse.com patch subject: [PATCH 2/3] kernel/pid: Remove default pid_max value config: arm-allnoconfig (https://download.01.org/0day-ci/archive/20240409/202404090903.3jz667sn-...@intel.com/config) compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project 8b3b4a92adee40483c27f26c478a384cd69c6f05) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240409/202404090903.3jz667sn-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202404090903.3jz667sn-...@intel.com/ All errors (new ones prefixed by >>): In file included from kernel/sysctl.c:23: In file included from include/linux/mm.h:2208: include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion] 522 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_" | ~~~ ^ ~~~ >> kernel/sysctl.c:1819:14: error: initializing 'void *' with an expression of >> type 'const int *' discards qualifiers >> [-Werror,-Wincompatible-pointer-types-discards-qualifiers] 1819 | .extra2 = _max_max, | ^~~~ 1 warning and 1 error generated. vim +1819 kernel/sysctl.c f461d2dcd511c0 Christoph Hellwig 2020-04-24 1617 f461d2dcd511c0 Christoph Hellwig 2020-04-24 1618 static struct ctl_table kern_table[] = { ^1da177e4c3f41 Linus Torvalds 2005-04-16 1619 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1620 .procname = "panic", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1621 .data = _timeout, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1622 .maxlen = sizeof(int), 49f0ce5f92321c Jerome Marchand 2014-01-21 1623 .mode = 0644, 6d4561110a3e9f Eric W. Biederman 2009-11-16 1624 .proc_handler = proc_dointvec, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1625 }, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1626 #ifdef CONFIG_PROC_SYSCTL ^1da177e4c3f41 Linus Torvalds 2005-04-16 1627 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1628 .procname = "tainted", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1629 .maxlen = sizeof(long), ^1da177e4c3f41 Linus Torvalds 2005-04-16 1630 .mode = 0644, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1631 .proc_handler = proc_taint, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1632 }, 2da02997e08d3e David Rientjes 2009-01-06 1633 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1634 .procname = "sysctl_writes_strict", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1635 .data = _writes_strict, 9e3961a0979817 Prarit Bhargava 2014-12-10 1636 .maxlen = sizeof(int), 2da02997e08d3e David Rientjes 2009-01-06 1637 .mode = 0644, 9e3961a0979817 Prarit Bhargava 2014-12-10 1638 .proc_handler = proc_dointvec_minmax, 78e36f3b0dae58 Xiaoming Ni 2022-01-21 1639 .extra1 = SYSCTL_NEG_ONE, eec4844fae7c03 Matteo Croce2019-07-18 1640 .extra2 = SYSCTL_ONE, 2da02997e08d3e David Rientjes 2009-01-06 1641 }, 964c9dff009189 Alexander Popov 2018-08-17 1642 #endif 1efff914afac8a Theodore Ts'o 2015-03-17 1643 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1644 .procname = "print-fatal-signals", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1645 .data = _fatal_signals, 964c9dff009189 Alexander Popov 2018-08-17 1646 .maxlen = sizeof(int), 1efff914afac8a Theodore Ts'o 2015-03-17 1647 .mode = 0644, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1648 .proc_handler = proc_dointvec, 1efff914afac8a Theodore Ts'o 2015-03-17 1649 }, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1650 #ifdef CONFIG_SPARC ^1da177e4c3f41 Linus Torvalds 2005-04-16 1651 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1652 .procname = "reb
Re: [PATCH 2/3] kernel/pid: Remove default pid_max value
Hi Michal, kernel test robot noticed the following build warnings: [auto build test WARNING on fec50db7033ea478773b159e0e2efb135270e3b7] url: https://github.com/intel-lab-lkp/linux/commits/Michal-Koutn/tracing-Remove-dependency-of-saved_cmdlines_buffer-on-PID_MAX_DEFAULT/20240408-230031 base: fec50db7033ea478773b159e0e2efb135270e3b7 patch link: https://lore.kernel.org/r/20240408145819.8787-3-mkoutny%40suse.com patch subject: [PATCH 2/3] kernel/pid: Remove default pid_max value config: alpha-allnoconfig (https://download.01.org/0day-ci/archive/20240409/202404090849.mgj3z0xi-...@intel.com/config) compiler: alpha-linux-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240409/202404090849.mgj3z0xi-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202404090849.mgj3z0xi-...@intel.com/ All warnings (new ones prefixed by >>): >> kernel/sysctl.c:1819:35: warning: initialization discards 'const' qualifier >> from pointer target type [-Wdiscarded-qualifiers] 1819 | .extra2 = _max_max, | ^ vim +/const +1819 kernel/sysctl.c f461d2dcd511c0 Christoph Hellwig 2020-04-24 1617 f461d2dcd511c0 Christoph Hellwig 2020-04-24 1618 static struct ctl_table kern_table[] = { ^1da177e4c3f41 Linus Torvalds 2005-04-16 1619 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1620 .procname = "panic", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1621 .data = _timeout, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1622 .maxlen = sizeof(int), 49f0ce5f92321c Jerome Marchand 2014-01-21 1623 .mode = 0644, 6d4561110a3e9f Eric W. Biederman 2009-11-16 1624 .proc_handler = proc_dointvec, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1625 }, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1626 #ifdef CONFIG_PROC_SYSCTL ^1da177e4c3f41 Linus Torvalds 2005-04-16 1627 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1628 .procname = "tainted", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1629 .maxlen = sizeof(long), ^1da177e4c3f41 Linus Torvalds 2005-04-16 1630 .mode = 0644, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1631 .proc_handler = proc_taint, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1632 }, 2da02997e08d3e David Rientjes 2009-01-06 1633 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1634 .procname = "sysctl_writes_strict", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1635 .data = _writes_strict, 9e3961a0979817 Prarit Bhargava 2014-12-10 1636 .maxlen = sizeof(int), 2da02997e08d3e David Rientjes 2009-01-06 1637 .mode = 0644, 9e3961a0979817 Prarit Bhargava 2014-12-10 1638 .proc_handler = proc_dointvec_minmax, 78e36f3b0dae58 Xiaoming Ni 2022-01-21 1639 .extra1 = SYSCTL_NEG_ONE, eec4844fae7c03 Matteo Croce2019-07-18 1640 .extra2 = SYSCTL_ONE, 2da02997e08d3e David Rientjes 2009-01-06 1641 }, 964c9dff009189 Alexander Popov 2018-08-17 1642 #endif 1efff914afac8a Theodore Ts'o 2015-03-17 1643 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1644 .procname = "print-fatal-signals", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1645 .data = _fatal_signals, 964c9dff009189 Alexander Popov 2018-08-17 1646 .maxlen = sizeof(int), 1efff914afac8a Theodore Ts'o 2015-03-17 1647 .mode = 0644, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1648 .proc_handler = proc_dointvec, 1efff914afac8a Theodore Ts'o 2015-03-17 1649 }, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1650 #ifdef CONFIG_SPARC ^1da177e4c3f41 Linus Torvalds 2005-04-16 1651 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1652 .procname = "reboot-cmd", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1653 .data = reboot_command, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1654 .maxlen = 256, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1655 .mode = 0644, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1656 .proc_handler = proc_dostring, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1657 }, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1658 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1659 .procname
Re: [PATCH 2/3] kernel/pid: Remove default pid_max value
On Mon, 8 Apr 2024 16:58:18 +0200 Michal Koutný wrote: > The kernel provides mechanisms, while it should not imply policies -- > default pid_max seems to be an example of the policy that does not fit > all. At the same time pid_max must have some value assigned, so use the > end of the allowed range -- pid_max_max. > > This change thus increases initial pid_max from 32k to 4M (x86_64 > defconfig). That seems like a large change. It isn't clear why we'd want to merge this patchset. Does it improve anyone's life and if so, how?
[PATCH 2/3] kernel/pid: Remove default pid_max value
pid_max is a per-pidns (thus global too) limit on a number of tasks the kernel admits. The knob can be configured by admin in the range between pid_max_min and pid_max_max (sic). The default value sits between those and it typically equals max(32k, 1k*nr_cpus). The nr_cpu scaling was introduced in commit 72680a191b93 ("pids: increase pid_max based on num_possible_cpus") to accommodate kernel's own helper tasks (before workqueues). Generally, 1024 tasks/cpu cap is too much if they were all running and it is also too little when they are idle (memory being bottleneck). The kernel also provides other mechanisms to restrict number of tasks -- threads-max sysctl and RLIMIT_NPROC with memory-scaled defaults and generic pids cgroup controller (the last one being the solution of fork-bombs, with qualified limits set up by admin). The kernel provides mechanisms, while it should not imply policies -- default pid_max seems to be an example of the policy that does not fit all. At the same time pid_max must have some value assigned, so use the end of the allowed range -- pid_max_max. This change thus increases initial pid_max from 32k to 4M (x86_64 defconfig). This has effect on size of structure that alloc_pid/idr_alloc_cyclic eventually uses and structure that kernel tracing uses with 'record-tgid' (~16 MiB). Signed-off-by: Michal Koutný --- include/linux/pid.h | 4 ++-- include/linux/threads.h | 15 --- kernel/pid.c| 8 +++- 3 files changed, 9 insertions(+), 18 deletions(-) diff --git a/include/linux/pid.h b/include/linux/pid.h index a3aad9b4074c..0d191ac02958 100644 --- a/include/linux/pid.h +++ b/include/linux/pid.h @@ -106,8 +106,8 @@ extern void exchange_tids(struct task_struct *task, struct task_struct *old); extern void transfer_pid(struct task_struct *old, struct task_struct *new, enum pid_type); -extern int pid_max; -extern int pid_max_min, pid_max_max; +extern int pid_max_min, pid_max; +extern const int pid_max_max; /* * look up a PID in the hash table. Must be called with the tasklist_lock diff --git a/include/linux/threads.h b/include/linux/threads.h index c34173e6c5f1..43f8f38a0c13 100644 --- a/include/linux/threads.h +++ b/include/linux/threads.h @@ -22,25 +22,18 @@ #define MIN_THREADS_LEFT_FOR_ROOT 4 -/* - * This controls the default maximum pid allocated to a process - */ -#define PID_MAX_DEFAULT (CONFIG_BASE_SMALL ? 0x1000 : 0x8000) - /* * A maximum of 4 million PIDs should be enough for a while. * [NOTE: PID/TIDs are limited to 2^30 ~= 1 billion, see FUTEX_TID_MASK.] */ #define PID_MAX_LIMIT (CONFIG_BASE_SMALL ? PAGE_SIZE * 8 : \ - (sizeof(long) > 4 ? 4 * 1024 * 1024 : PID_MAX_DEFAULT)) + (sizeof(long) > 4 ? 4 * 1024 * 1024 : 0x8000)) /* - * Define a minimum number of pids per cpu. Heuristically based - * on original pid max of 32k for 32 cpus. Also, increase the - * minimum settable value for pid_max on the running system based - * on similar defaults. See kernel/pid.c:pid_idr_init() for details. + * Define a minimum number of pids per cpu. Mainly to accommodate + * smpboot_register_percpu_thread() kernel threads. + * See kernel/pid.c:pid_idr_init() for details. */ -#define PIDS_PER_CPU_DEFAULT 1024 #define PIDS_PER_CPU_MIN 8 #endif diff --git a/kernel/pid.c b/kernel/pid.c index da76ed1873f7..24ae505ac3b0 100644 --- a/kernel/pid.c +++ b/kernel/pid.c @@ -60,10 +60,10 @@ struct pid init_struct_pid = { }, } }; -int pid_max = PID_MAX_DEFAULT; +int pid_max = PID_MAX_LIMIT; int pid_max_min = RESERVED_PIDS + 1; -int pid_max_max = PID_MAX_LIMIT; +const int pid_max_max = PID_MAX_LIMIT; /* * Pseudo filesystems start inode numbering after one. We use Reserved * PIDs as a natural offset. @@ -652,9 +652,7 @@ void __init pid_idr_init(void) /* Verify no one has done anything silly: */ BUILD_BUG_ON(PID_MAX_LIMIT >= PIDNS_ADDING); - /* bump default and minimum pid_max based on number of cpus */ - pid_max = min(pid_max_max, max_t(int, pid_max, - PIDS_PER_CPU_DEFAULT * num_possible_cpus())); + /* bump minimum pid_max based on number of cpus */ pid_max_min = max_t(int, pid_max_min, PIDS_PER_CPU_MIN * num_possible_cpus()); pr_info("pid_max: default: %u minimum: %u\n", pid_max, pid_max_min); -- 2.44.0