Re: [PATCH 2/3] kernel/pid: Remove default pid_max value

2024-05-13 Thread Michal Koutný
On Mon, Apr 08, 2024 at 04:58:18PM GMT, Michal Koutný  wrote:
> The kernel provides mechanisms, while it should not imply policies --
> default pid_max seems to be an example of the policy that does not fit
> all. At the same time pid_max must have some value assigned, so use the
> end of the allowed range -- pid_max_max.
> 
> This change thus increases initial pid_max from 32k to 4M (x86_64
> defconfig).

Out of curiosity I dug out the commit
acdc721fe26d ("[PATCH] pid-max-2.5.33-A0") v2.5.34~5
that introduced the 32k default. The commit message doesn't say why such
a sudden change though.
Previously, the limit was 1G of pids (i.e. effectively no default limit
like the intention of this series).

Honestly, I expected more enthusiasm or reasons against removing the
default value of pid_max. Is this really not of interest to anyone?

(Thanks, Andrew, for your responses. I don't plan to pursue this further
should there be no more interest in having less default limit values in
kernel.)

Regards,
Michal


signature.asc
Description: PGP signature


Re: Re: [PATCH 2/3] kernel/pid: Remove default pid_max value

2024-04-12 Thread Michal Koutný
On Thu, Apr 11, 2024 at 03:03:31PM -0700, Andrew Morton 
 wrote:
> A large increase in the maximum number of processes.

The change from (some) default to effective infinity is the crux of the
change. Because that is only a number.
(Thus I don't find the number's 12700% increase alone a big change.)

Actual maximum amount of processes is "workload dependent" and hence
should be determined based on the particular workload.

> Or did I misinterpret?

I thought you saw an issue with projection of that number into sizings
based on the default. Which of them comprises the large change in your
eyes?

Thanks,
Michal



Re: [PATCH 2/3] kernel/pid: Remove default pid_max value

2024-04-11 Thread Andrew Morton
On Thu, 11 Apr 2024 17:40:02 +0200 Michal Koutný  wrote:

> Hello.
> 
> On Mon, Apr 08, 2024 at 01:29:55PM -0700, Andrew Morton 
>  wrote:
> > That seems like a large change.
> 
> In what sense is it large?

A large increase in the maximum number of processes.  Or did I misinterpret?





Re: Re: [PATCH 2/3] kernel/pid: Remove default pid_max value

2024-04-11 Thread Michal Koutný
Hello.

On Mon, Apr 08, 2024 at 01:29:55PM -0700, Andrew Morton 
 wrote:
> That seems like a large change.

In what sense is it large?

I tried to lookup the code parts that depend on this default and either
add the other patches or mention the impact (that part could be more
thorough) in the commit message.

> It isn't clear why we'd want to merge this patchset.  Does it improve
> anyone's life and if so, how?

- kernel devs who don't care about policy
  - policy should be decided by distros/users, not in kernel

- users who need many threads
  - current default is too low
  - this is one more place to look at when configuring

- users who want to prevent fork-bombs
  - current default is ineffective (too high), false feeling of safety
  - i.e. they should configure appropriate mechanism appropriately


I thought that the first point alone would be convincing and that only
scaling impact might need clarification.

Regards,
Michal



Re: [PATCH 2/3] kernel/pid: Remove default pid_max value

2024-04-08 Thread kernel test robot
Hi Michal,

kernel test robot noticed the following build errors:

[auto build test ERROR on fec50db7033ea478773b159e0e2efb135270e3b7]

url:
https://github.com/intel-lab-lkp/linux/commits/Michal-Koutn/tracing-Remove-dependency-of-saved_cmdlines_buffer-on-PID_MAX_DEFAULT/20240408-230031
base:   fec50db7033ea478773b159e0e2efb135270e3b7
patch link:
https://lore.kernel.org/r/20240408145819.8787-3-mkoutny%40suse.com
patch subject: [PATCH 2/3] kernel/pid: Remove default pid_max value
config: arm-allnoconfig 
(https://download.01.org/0day-ci/archive/20240409/202404090903.3jz667sn-...@intel.com/config)
compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project 
8b3b4a92adee40483c27f26c478a384cd69c6f05)
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20240409/202404090903.3jz667sn-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202404090903.3jz667sn-...@intel.com/

All errors (new ones prefixed by >>):

   In file included from kernel/sysctl.c:23:
   In file included from include/linux/mm.h:2208:
   include/linux/vmstat.h:522:36: warning: arithmetic between different 
enumeration types ('enum node_stat_item' and 'enum lru_list') 
[-Wenum-enum-conversion]
 522 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
 |   ~~~ ^ ~~~
>> kernel/sysctl.c:1819:14: error: initializing 'void *' with an expression of 
>> type 'const int *' discards qualifiers 
>> [-Werror,-Wincompatible-pointer-types-discards-qualifiers]
1819 | .extra2 = _max_max,
 |   ^~~~
   1 warning and 1 error generated.


vim +1819 kernel/sysctl.c

f461d2dcd511c0 Christoph Hellwig   2020-04-24  1617  
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1618  static struct ctl_table 
kern_table[] = {
^1da177e4c3f41 Linus Torvalds  2005-04-16  1619 {
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1620 .procname   
= "panic",
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1621 .data   
= _timeout,
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1622 .maxlen 
= sizeof(int),
49f0ce5f92321c Jerome Marchand 2014-01-21  1623 .mode   
= 0644,
6d4561110a3e9f Eric W. Biederman   2009-11-16  1624 .proc_handler   
= proc_dointvec,
^1da177e4c3f41 Linus Torvalds  2005-04-16  1625 },
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1626  #ifdef CONFIG_PROC_SYSCTL
^1da177e4c3f41 Linus Torvalds  2005-04-16  1627 {
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1628 .procname   
= "tainted",
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1629 .maxlen 
= sizeof(long),
^1da177e4c3f41 Linus Torvalds  2005-04-16  1630 .mode   
= 0644,
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1631 .proc_handler   
= proc_taint,
^1da177e4c3f41 Linus Torvalds  2005-04-16  1632 },
2da02997e08d3e David Rientjes  2009-01-06  1633 {
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1634 .procname   
= "sysctl_writes_strict",
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1635 .data   
= _writes_strict,
9e3961a0979817 Prarit Bhargava 2014-12-10  1636 .maxlen 
= sizeof(int),
2da02997e08d3e David Rientjes  2009-01-06  1637 .mode   
= 0644,
9e3961a0979817 Prarit Bhargava 2014-12-10  1638 .proc_handler   
= proc_dointvec_minmax,
78e36f3b0dae58 Xiaoming Ni 2022-01-21  1639 .extra1 
= SYSCTL_NEG_ONE,
eec4844fae7c03 Matteo Croce2019-07-18  1640 .extra2 
= SYSCTL_ONE,
2da02997e08d3e David Rientjes  2009-01-06  1641 },
964c9dff009189 Alexander Popov 2018-08-17  1642  #endif
1efff914afac8a Theodore Ts'o   2015-03-17  1643 {
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1644 .procname   
= "print-fatal-signals",
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1645 .data   
= _fatal_signals,
964c9dff009189 Alexander Popov 2018-08-17  1646 .maxlen 
= sizeof(int),
1efff914afac8a Theodore Ts'o   2015-03-17  1647 .mode   
= 0644,
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1648 .proc_handler   
= proc_dointvec,
1efff914afac8a Theodore Ts'o   2015-03-17  1649 },
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1650  #ifdef CONFIG_SPARC
^1da177e4c3f41 Linus Torvalds  2005-04-16  1651 {
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1652 .procname   
= "reb

Re: [PATCH 2/3] kernel/pid: Remove default pid_max value

2024-04-08 Thread kernel test robot
Hi Michal,

kernel test robot noticed the following build warnings:

[auto build test WARNING on fec50db7033ea478773b159e0e2efb135270e3b7]

url:
https://github.com/intel-lab-lkp/linux/commits/Michal-Koutn/tracing-Remove-dependency-of-saved_cmdlines_buffer-on-PID_MAX_DEFAULT/20240408-230031
base:   fec50db7033ea478773b159e0e2efb135270e3b7
patch link:
https://lore.kernel.org/r/20240408145819.8787-3-mkoutny%40suse.com
patch subject: [PATCH 2/3] kernel/pid: Remove default pid_max value
config: alpha-allnoconfig 
(https://download.01.org/0day-ci/archive/20240409/202404090849.mgj3z0xi-...@intel.com/config)
compiler: alpha-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20240409/202404090849.mgj3z0xi-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202404090849.mgj3z0xi-...@intel.com/

All warnings (new ones prefixed by >>):

>> kernel/sysctl.c:1819:35: warning: initialization discards 'const' qualifier 
>> from pointer target type [-Wdiscarded-qualifiers]
1819 | .extra2 = _max_max,
 |   ^


vim +/const +1819 kernel/sysctl.c

f461d2dcd511c0 Christoph Hellwig   2020-04-24  1617  
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1618  static struct ctl_table 
kern_table[] = {
^1da177e4c3f41 Linus Torvalds  2005-04-16  1619 {
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1620 .procname   
= "panic",
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1621 .data   
= _timeout,
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1622 .maxlen 
= sizeof(int),
49f0ce5f92321c Jerome Marchand 2014-01-21  1623 .mode   
= 0644,
6d4561110a3e9f Eric W. Biederman   2009-11-16  1624 .proc_handler   
= proc_dointvec,
^1da177e4c3f41 Linus Torvalds  2005-04-16  1625 },
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1626  #ifdef CONFIG_PROC_SYSCTL
^1da177e4c3f41 Linus Torvalds  2005-04-16  1627 {
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1628 .procname   
= "tainted",
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1629 .maxlen 
= sizeof(long),
^1da177e4c3f41 Linus Torvalds  2005-04-16  1630 .mode   
= 0644,
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1631 .proc_handler   
= proc_taint,
^1da177e4c3f41 Linus Torvalds  2005-04-16  1632 },
2da02997e08d3e David Rientjes  2009-01-06  1633 {
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1634 .procname   
= "sysctl_writes_strict",
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1635 .data   
= _writes_strict,
9e3961a0979817 Prarit Bhargava 2014-12-10  1636 .maxlen 
= sizeof(int),
2da02997e08d3e David Rientjes  2009-01-06  1637 .mode   
= 0644,
9e3961a0979817 Prarit Bhargava 2014-12-10  1638 .proc_handler   
= proc_dointvec_minmax,
78e36f3b0dae58 Xiaoming Ni 2022-01-21  1639 .extra1 
= SYSCTL_NEG_ONE,
eec4844fae7c03 Matteo Croce2019-07-18  1640 .extra2 
= SYSCTL_ONE,
2da02997e08d3e David Rientjes  2009-01-06  1641 },
964c9dff009189 Alexander Popov 2018-08-17  1642  #endif
1efff914afac8a Theodore Ts'o   2015-03-17  1643 {
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1644 .procname   
= "print-fatal-signals",
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1645 .data   
= _fatal_signals,
964c9dff009189 Alexander Popov 2018-08-17  1646 .maxlen 
= sizeof(int),
1efff914afac8a Theodore Ts'o   2015-03-17  1647 .mode   
= 0644,
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1648 .proc_handler   
= proc_dointvec,
1efff914afac8a Theodore Ts'o   2015-03-17  1649 },
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1650  #ifdef CONFIG_SPARC
^1da177e4c3f41 Linus Torvalds  2005-04-16  1651 {
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1652 .procname   
= "reboot-cmd",
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1653 .data   
= reboot_command,
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1654 .maxlen 
= 256,
^1da177e4c3f41 Linus Torvalds  2005-04-16  1655 .mode   
= 0644,
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1656 .proc_handler   
= proc_dostring,
^1da177e4c3f41 Linus Torvalds  2005-04-16  1657 },
^1da177e4c3f41 Linus Torvalds  2005-04-16  1658 {
f461d2dcd511c0 Christoph Hellwig   2020-04-24  1659 .procname   

Re: [PATCH 2/3] kernel/pid: Remove default pid_max value

2024-04-08 Thread Andrew Morton
On Mon,  8 Apr 2024 16:58:18 +0200 Michal Koutný  wrote:

> The kernel provides mechanisms, while it should not imply policies --
> default pid_max seems to be an example of the policy that does not fit
> all. At the same time pid_max must have some value assigned, so use the
> end of the allowed range -- pid_max_max.
> 
> This change thus increases initial pid_max from 32k to 4M (x86_64
> defconfig).

That seems like a large change.

It isn't clear why we'd want to merge this patchset.  Does it improve
anyone's life and if so, how?




[PATCH 2/3] kernel/pid: Remove default pid_max value

2024-04-08 Thread Michal Koutný
pid_max is a per-pidns (thus global too) limit on a number of tasks the
kernel admits. The knob can be configured by admin in the range between
pid_max_min and pid_max_max (sic). The default value sits between
those and it typically equals max(32k, 1k*nr_cpus).

The nr_cpu scaling was introduced in commit 72680a191b93 ("pids:
increase pid_max based on num_possible_cpus") to accommodate kernel's own
helper tasks (before workqueues). Generally, 1024 tasks/cpu cap is too
much if they were all running and it is also too little when they are
idle (memory being bottleneck).

The kernel also provides other mechanisms to restrict number of tasks --
threads-max sysctl and RLIMIT_NPROC with memory-scaled defaults and
generic pids cgroup controller (the last one being the solution of
fork-bombs, with qualified limits set up by admin).

The kernel provides mechanisms, while it should not imply policies --
default pid_max seems to be an example of the policy that does not fit
all. At the same time pid_max must have some value assigned, so use the
end of the allowed range -- pid_max_max.

This change thus increases initial pid_max from 32k to 4M (x86_64
defconfig).

This has effect on size of structure that alloc_pid/idr_alloc_cyclic
eventually uses and structure that kernel tracing uses with
'record-tgid' (~16 MiB).

Signed-off-by: Michal Koutný 
---
 include/linux/pid.h |  4 ++--
 include/linux/threads.h | 15 ---
 kernel/pid.c|  8 +++-
 3 files changed, 9 insertions(+), 18 deletions(-)

diff --git a/include/linux/pid.h b/include/linux/pid.h
index a3aad9b4074c..0d191ac02958 100644
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -106,8 +106,8 @@ extern void exchange_tids(struct task_struct *task, struct 
task_struct *old);
 extern void transfer_pid(struct task_struct *old, struct task_struct *new,
 enum pid_type);
 
-extern int pid_max;
-extern int pid_max_min, pid_max_max;
+extern int pid_max_min, pid_max;
+extern const int pid_max_max;
 
 /*
  * look up a PID in the hash table. Must be called with the tasklist_lock
diff --git a/include/linux/threads.h b/include/linux/threads.h
index c34173e6c5f1..43f8f38a0c13 100644
--- a/include/linux/threads.h
+++ b/include/linux/threads.h
@@ -22,25 +22,18 @@
 
 #define MIN_THREADS_LEFT_FOR_ROOT 4
 
-/*
- * This controls the default maximum pid allocated to a process
- */
-#define PID_MAX_DEFAULT (CONFIG_BASE_SMALL ? 0x1000 : 0x8000)
-
 /*
  * A maximum of 4 million PIDs should be enough for a while.
  * [NOTE: PID/TIDs are limited to 2^30 ~= 1 billion, see FUTEX_TID_MASK.]
  */
 #define PID_MAX_LIMIT (CONFIG_BASE_SMALL ? PAGE_SIZE * 8 : \
-   (sizeof(long) > 4 ? 4 * 1024 * 1024 : PID_MAX_DEFAULT))
+   (sizeof(long) > 4 ? 4 * 1024 * 1024 : 0x8000))
 
 /*
- * Define a minimum number of pids per cpu.  Heuristically based
- * on original pid max of 32k for 32 cpus.  Also, increase the
- * minimum settable value for pid_max on the running system based
- * on similar defaults.  See kernel/pid.c:pid_idr_init() for details.
+ * Define a minimum number of pids per cpu. Mainly to accommodate
+ * smpboot_register_percpu_thread() kernel threads.
+ * See kernel/pid.c:pid_idr_init() for details.
  */
-#define PIDS_PER_CPU_DEFAULT   1024
 #define PIDS_PER_CPU_MIN   8
 
 #endif
diff --git a/kernel/pid.c b/kernel/pid.c
index da76ed1873f7..24ae505ac3b0 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -60,10 +60,10 @@ struct pid init_struct_pid = {
}, }
 };
 
-int pid_max = PID_MAX_DEFAULT;
+int pid_max = PID_MAX_LIMIT;
 
 int pid_max_min = RESERVED_PIDS + 1;
-int pid_max_max = PID_MAX_LIMIT;
+const int pid_max_max = PID_MAX_LIMIT;
 /*
  * Pseudo filesystems start inode numbering after one. We use Reserved
  * PIDs as a natural offset.
@@ -652,9 +652,7 @@ void __init pid_idr_init(void)
/* Verify no one has done anything silly: */
BUILD_BUG_ON(PID_MAX_LIMIT >= PIDNS_ADDING);
 
-   /* bump default and minimum pid_max based on number of cpus */
-   pid_max = min(pid_max_max, max_t(int, pid_max,
-   PIDS_PER_CPU_DEFAULT * num_possible_cpus()));
+   /* bump minimum pid_max based on number of cpus */
pid_max_min = max_t(int, pid_max_min,
PIDS_PER_CPU_MIN * num_possible_cpus());
pr_info("pid_max: default: %u minimum: %u\n", pid_max, pid_max_min);
-- 
2.44.0