Re: [RFC PATCH 0/5] Introduce /proc/all/ to gather stats from all processes

2020-08-25 Thread Eugene Lubarsky
On Thu, 20 Aug 2020 10:41:39 -0700
Andrei Vagin  wrote:
> Unfortunately, I don't have enough time to lead the process of pushing
> task_diag upstream. So if it is interesting for you, you can restart
> this process, and I am ready to help as much as time permits.
>
> I think the main blocking issue was a lack of interest from a wider
> audience. The slow /proc is a problem for just a few users, but
> task_diag is a big subsystem that repeats the functionality of another
> subsystem, with all the derived problems like code duplication.

Unfortunately I don't have much time either, and yes, it sounds like
upstreaming a new interface like this will require input & enthusiasm
from more of the people who monitor large numbers of processes,
which is not really me.

A related issue is that task_diag doesn't currently cover the cgroup
filesystem, which has the same issues as /proc and is accessed very
heavily by e.g. the Kubernetes kubelet's cadvisor. Perhaps more interest
in tackling this could come from the Kubernetes community.

> 
> Another blocking issue is the new interface. There was no consensus
> on this. Initially, I suggested using netlink sockets, but developers
> from non-network subsystems objected, so the transaction file
> interface was introduced. The main idea, similar to netlink sockets,
> is that we write a request and read a response.
> 
> There were some security concerns but I think I fixed them.

There's currently a lot of momentum behind io_uring, which could not
only enable efficient enumeration and retrieval of small files but
might also be a more natural place for an API like task_diag.



Best Wishes,
Eugene


Re: [RFC PATCH 0/5] Introduce /proc/all/ to gather stats from all processes

2020-08-25 Thread Eugene Lubarsky
On Mon, 10 Aug 2020 17:41:32 +0200
Greg KH  wrote:

> On Tue, Aug 11, 2020 at 01:27:00AM +1000, Eugene Lubarsky wrote:
> > On Mon, 10 Aug 2020 17:04:53 +0200
> > Greg KH  wrote:  
> And have you benchmarked any of this?  Try working with the common
> tools that want this information and see if it actually is noticeable
> (hint, I have been doing that with the readfile work and it's
> surprising what the results are in places...)

Apologies for the delay. Here are some benchmarks with atop.

Patch to atop at: https://github.com/eug48/atop/commits/proc-all
Patch to add /proc/all/schedstat & cpuset below.
atop is not collecting threads & cmdline as /proc/all/ doesn't support them.
10,000 processes, kernel 5.8, nested KVM, 2 cores of i7-6700HQ @ 2.60GHz

# USE_PROC_ALL=0 ./atop -w test 1 &
# pidstat -p $(pidof atop) 1

01:33:05    %usr %system  %guest   %wait    %CPU   CPU  Command
01:33:06   33.66   33.66    0.00    0.99   67.33     1  atop
01:33:07   33.00   32.00    0.00    2.00   65.00     0  atop
01:33:08   34.00   31.00    0.00    1.00   65.00     0  atop
...
Average:   33.15   32.79    0.00    1.09   65.94     -  atop


# USE_PROC_ALL=1 ./atop -w test 1 &
# pidstat -p $(pidof atop) 1

01:33:33    %usr %system  %guest   %wait    %CPU   CPU  Command
01:33:34   28.00   14.00    0.00    1.00   42.00     1  atop
01:33:35   28.00   14.00    0.00    0.00   42.00     1  atop
01:33:36   26.00   13.00    0.00    0.00   39.00     1  atop
...
Average:   27.08   12.86    0.00    0.35   39.94     -  atop

So CPU usage goes down from ~65% to ~40%.

Data collection times in milliseconds are:

# xsv cat columns proc.csv procall.csv \
> | xsv stats \
> | xsv select field,min,max,mean,stddev \
> | xsv table
field           min  max  mean    stddev
/proc time      558  625  586.59  18.29
/proc/all time  231  262  243.56  8.02

Much performance optimisation can still be done, e.g. the modified atop
uses fgets, which reads only 1KB at a time, and seq_file seems to return
only one 4KB page per read. task_diag should be much faster still.
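
For what it's worth, here is a minimal sketch of the kind of bulk reader
I mean (user-space only; it assumes the /proc/all/stat file from this
series, and the 64KB buffer size is arbitrary):

/* Sketch: read /proc/all/stat with large read(2) calls instead of
 * stdio's 1KB fgets. Note that with the current patches seq_file
 * still returns at most ~4KB per read, so the kernel side also needs
 * work before the big buffer really pays off.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        static char buf[64 * 1024];
        ssize_t n;
        int fd = open("/proc/all/stat", O_RDONLY);

        if (fd < 0) {
                perror("/proc/all/stat");
                return 1;
        }
        while ((n = read(fd, buf, sizeof(buf))) > 0)
                fwrite(buf, 1, n, stdout);      /* parse stat lines here */
        close(fd);
        return 0;
}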

I'd imagine this sort of thing would be useful for daemons monitoring
large numbers of processes. I don't run such systems myself; my initial
motivation was frustration with the Kubernetes kubelet having ~2-4% CPU
usage even with just a couple of containers. Basic profiling suggests
syscalls have a lot to do with it - it's actually reading loads of tiny
cgroup files and enumerating many directories every 10 seconds. /proc
has similar issues and seemed easier to start with.

Anyway, I've read that io_uring could also help here in the near future,
which would be really cool, especially if there were a way to enumerate
directories and read many files regex-style in a single operation,
e.g. /proc/[0-9].*/(stat|statm|io)

> > Currently I'm trying to re-use the existing code in fs/proc that
> > controls which PIDs are visible, but may well be missing
> > something..  
> 
> Try it out and see if it works correctly.  And pid namespaces are not
> the only thing these days from what I recall :)
> 
I've tried `unshare --fork --pid --mount-proc cat /proc/all/stat`
which seems to behave correctly. ptrace flags are handled by the
existing code.


Best Wishes,
Eugene


From 2ffc2e388f7ce4e3f182c2442823e5f13bae03dd Mon Sep 17 00:00:00 2001
From: Eugene Lubarsky 
Date: Tue, 25 Aug 2020 12:36:41 +1000
Subject: [RFC PATCH] fs/proc: /proc/all: add schedstat and cpuset

Signed-off-by: Eugene Lubarsky 
---
 fs/proc/base.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 0bba4b3a985e..44d73f1ade4a 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3944,6 +3944,36 @@ static int proc_all_io(struct seq_file *m, void *v)
 }
 #endif
 
+#ifdef CONFIG_PROC_PID_CPUSET
+static int proc_all_cpuset(struct seq_file *m, void *v)
+{
+   struct all_iter *iter = (struct all_iter *) v;
+   struct pid_namespace *ns = iter->ns;
+   struct task_struct *task = iter->tgid_iter.task;
+   struct pid *pid = task->thread_pid;
+
+   seq_put_decimal_ull(m, "", pid_nr_ns(pid, ns));
+   seq_puts(m, " ");
+
+   return proc_cpuset_show(m, ns, pid, task);
+}
+#endif
+
+#ifdef CONFIG_SCHED_INFO
+static int proc_all_schedstat(struct seq_file *m, void *v)
+{
+   struct all_iter *iter = (struct all_iter *) v;
+   struct pid_namespace *ns = iter->ns;
+   struct task_struct *task = iter->tgid_iter.task;
+   struct pid *pid = task->thread_pid;
+
+   seq_put_decimal_ull(m, "", pid_nr_ns(pid, ns));
+   seq_puts(m, " ");
+
+   return proc_pid_schedstat(m, ns, pid, task);
+}
+#endif
+
 static int proc_all_statx(struct seq_file *m, void *v)
 {
struct all_iter *iter = (struct all_iter *) v;
@@ -3990,6 +4020,12 @@ PROC_ALL_OPS(status);
 #ifdef CONFIG_TASK_IO_ACCOUNTING
PROC_ALL_OPS(io);
 #e

Re: [RFC PATCH 0/5] Introduce /proc/all/ to gather stats from all processes

2020-08-13 Thread Eugene Lubarsky
On Wed, 12 Aug 2020 00:51:35 -0700
Andrei Vagin  wrote:

> Maybe we need to resurrect the task_diag series instead of inventing
> another less-effective interface...

I would certainly welcome the resurrection of task_diag - it is clearly
more efficient than this /proc/all/ idea. It would be good to find out
if there's anything in particular that's currently blocking it.

This RFC is mainly meant to check whether such an addition would
be acceptable from an API point of view. It currently has an obvious
performance issue in that seq_file seems to return only one page at a
time, so lots of read syscalls are still required; however, I may not
have the time to figure out a proposed fix for this by myself.
Regardless, text-based formats can't match the efficiency of task_diag,
but binary ones are also possible.



Best Wishes,
Eugene


Re: [RFC PATCH 0/5] Introduce /proc/all/ to gather stats from all processes

2020-08-10 Thread Eugene Lubarsky
On Mon, 10 Aug 2020 17:04:53 +0200
Greg KH  wrote:
> How many syscalls does this save on?
> 
> Perhaps you want my proposed readfile(2) syscall:
>   
> https://lore.kernel.org/r/20200704140250.423345-1-gre...@linuxfoundation.org
> to help out with things like this?  :)

The proposed readfile sounds great and would help, but if there are
1000 processes, wouldn't that require 1000 readfile calls to read their
proc files?

With something like this, the stats for 1000 processes could be
retrieved with an open, a few reads and a close.
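
For comparison, the status quo looks roughly like the sketch below (not
atop's or pidstat's actual code) - at least an open + read + close per
process, on top of the getdents64 calls needed to list /proc, i.e. 3000+
syscalls for 1000 processes:

/* Sketch of today's per-process pattern: one open/read/close per PID. */
#include <ctype.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        DIR *proc = opendir("/proc");
        struct dirent *de;
        char path[64], buf[4096];

        if (!proc)
                return 1;
        while ((de = readdir(proc)) != NULL) {
                if (!isdigit((unsigned char)de->d_name[0]))
                        continue;       /* not a PID directory */
                snprintf(path, sizeof(path), "/proc/%s/stat", de->d_name);
                int fd = open(path, O_RDONLY);
                if (fd < 0)
                        continue;       /* process may have exited */
                ssize_t n = read(fd, buf, sizeof(buf));
                close(fd);
                if (n > 0)
                        fwrite(buf, 1, n, stdout);
        }
        closedir(proc);
        return 0;
}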

> 
> > The proposed files in this proof-of-concept patch set are:
> > 
> > * /proc/all/stat
> 
> I think the problem will be defining "all" in the case of the specific
> namespace you are dealing with, right?  How will this handle all of
> those issues properly for all of these different statistics?
> 

Currently I'm trying to re-use the existing code in fs/proc that
controls which PIDs are visible, but may well be missing something..


Best Wishes,
Eugene


[RFC PATCH 4/5] fs/proc: Introduce /proc/all/io

2020-08-10 Thread Eugene Lubarsky
Returns io info for all visible processes.

The data is the same as /proc/[pid]/io but formatted as a series
of numbers on a single line. A PID column is also prepended.

Signed-off-by: Eugene Lubarsky 
---
 fs/proc/base.c | 66 --
 1 file changed, 59 insertions(+), 7 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 5982fd43dd21..03d48225b6d1 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2910,9 +2910,8 @@ static const struct file_operations 
proc_coredump_filter_operations = {
 #endif
 
 #ifdef CONFIG_TASK_IO_ACCOUNTING
-static int do_io_accounting(struct task_struct *task, struct seq_file *m, int 
whole)
+static int calc_io_accounting(struct task_struct *task, struct 
task_io_accounting *acct, int whole)
 {
-   struct task_io_accounting acct = task->ioac;
unsigned long flags;
int result;
 
@@ -2928,12 +2927,27 @@ static int do_io_accounting(struct task_struct *task, 
struct seq_file *m, int wh
	if (whole && lock_task_sighand(task, &flags)) {
struct task_struct *t = task;
 
-   task_io_accounting_add(&acct, &task->signal->ioac);
+   task_io_accounting_add(acct, &task->signal->ioac);
while_each_thread(task, t)
-   task_io_accounting_add(&acct, &t->ioac);
+   task_io_accounting_add(acct, &t->ioac);
 
	unlock_task_sighand(task, &flags);
}
+   result = 0;
+
+out_unlock:
+   mutex_unlock(&task->signal->exec_update_mutex);
+   return result;
+}
+static int do_io_accounting(struct task_struct *task, struct seq_file *m, int 
whole)
+{
+   struct task_io_accounting acct = task->ioac;
+   int result;
+
+   result = calc_io_accounting(task, &acct, whole);
+   if (result)
+   return result;
+
seq_printf(m,
   "rchar: %llu\n"
   "wchar: %llu\n"
@@ -2949,10 +2963,7 @@ static int do_io_accounting(struct task_struct *task, 
struct seq_file *m, int wh
   (unsigned long long)acct.read_bytes,
   (unsigned long long)acct.write_bytes,
   (unsigned long long)acct.cancelled_write_bytes);
-   result = 0;
 
-out_unlock:
-   mutex_unlock(&task->signal->exec_update_mutex);
return result;
 }
 
@@ -3896,7 +3907,42 @@ static int proc_all_statm(struct seq_file *m, void *v)
return proc_pid_statm(m, ns, pid, task);
 }
 
+#ifdef CONFIG_TASK_IO_ACCOUNTING
+static int proc_all_io_print_one(struct seq_file *m, struct task_struct *task)
+{
+   struct task_io_accounting acct = task->ioac;
+   int result;
 
+   result = calc_io_accounting(task, &acct, 1);
+   if (result)
+   return result;
+
+   seq_printf(m,
+  "%llu %llu %llu %llu %llu %llu %llu\n",
+  (unsigned long long)acct.rchar,
+  (unsigned long long)acct.wchar,
+  (unsigned long long)acct.syscr,
+  (unsigned long long)acct.syscw,
+  (unsigned long long)acct.read_bytes,
+  (unsigned long long)acct.write_bytes,
+  (unsigned long long)acct.cancelled_write_bytes);
+
+   return result;
+}
+
+static int proc_all_io(struct seq_file *m, void *v)
+{
+   struct all_iter *iter = (struct all_iter *) v;
+   struct pid_namespace *ns = iter->ns;
+   struct task_struct *task = iter->tgid_iter.task;
+   struct pid *pid = task->thread_pid;
+
+   seq_put_decimal_ull(m, "", pid_nr_ns(pid, ns));
+   seq_puts(m, " ");
+
+   return proc_all_io_print_one(m, task);
+}
+#endif
 
 static int proc_all_status(struct seq_file *m, void *v)
 {
@@ -3915,6 +3961,9 @@ static int proc_all_status(struct seq_file *m, void *v)
 PROC_ALL_OPS(stat);
 PROC_ALL_OPS(statm);
 PROC_ALL_OPS(status);
+#ifdef CONFIG_TASK_IO_ACCOUNTING
+   PROC_ALL_OPS(io);
+#endif
 
 #define PROC_ALL_CREATE(NAME) \
do { \
@@ -3932,4 +3981,7 @@ void __init proc_all_init(void)
PROC_ALL_CREATE(stat);
PROC_ALL_CREATE(statm);
PROC_ALL_CREATE(status);
+#ifdef CONFIG_TASK_IO_ACCOUNTING
+   PROC_ALL_CREATE(io);
+#endif
 }
-- 
2.25.1



[RFC PATCH 0/5] Introduce /proc/all/ to gather stats from all processes

2020-08-10 Thread Eugene Lubarsky
This is an idea for substantially reducing the number of syscalls needed
by monitoring tools whilst mostly re-using the existing API.

The proposed files in this proof-of-concept patch set are:

* /proc/all/stat
  A stat line for each process in the existing format.

* /proc/all/statm
  statm lines but starting with a PID column.

* /proc/all/status
  status info for all processes in the existing format.

* /proc/all/io
  The existing /proc/pid/io data but formatted as a single line for
  each process, similarly to stat/statm, with a PID column added.

* /proc/all/statx
  Gathers info from stat, statm and io; the purpose is actually
  not so much to reduce syscalls but to help userspace be more
  efficient by not having to store data in e.g. hashtables in order
  to gather it from separate /proc/all/ files.

  The format proposed here starts with the unchanged stat line
  and prefixes the other info with a few characters ("m", "io"),
  repeating for each process:

  ...
  25 (cat) R 1 1 0 0 -1 4194304 185 0 16 0 2 0 0 0 20 ...
  m 662 188 167 5 0 112 0
  io 4292 0 12 0 0 0 0
  ...
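
  To illustrate the hashtable point, a consumer could process this format
  in a single pass - a rough sketch, assuming exactly the layout above and
  eliding the actual field parsing:

/* Sketch: single-pass reader for the proposed /proc/all/statx format.
 * Each process record is a stat line followed by "m " and "io " lines,
 * so the three data sets can be joined without a PID-keyed hashtable.
 */
#include <stdio.h>
#include <string.h>

int main(void)
{
        FILE *f = fopen("/proc/all/statx", "r");
        char line[4096];

        if (!f)
                return 1;
        while (fgets(line, sizeof(line), f)) {
                if (!strncmp(line, "m ", 2))
                        fputs(line + 2, stdout);        /* statm fields */
                else if (!strncmp(line, "io ", 3))
                        fputs(line + 3, stdout);        /* io fields */
                else
                        fputs(line, stdout);            /* stat line starts a new record */
        }
        fclose(f);
        return 0;
}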


There has been a proposal with some overlapping goals: /proc/task-diag
(https://github.com/avagin/linux-task-diag), but I'm not sure about
its current status.



Best Wishes,

Eugene


Eugene Lubarsky (5):
  fs/proc: Introduce /proc/all/stat
  fs/proc: Introduce /proc/all/statm
  fs/proc: Introduce /proc/all/status
  fs/proc: Introduce /proc/all/io
  fs/proc: Introduce /proc/all/statx

 fs/proc/base.c | 215 +++--
 fs/proc/internal.h |   1 +
 fs/proc/root.c |   1 +
 3 files changed, 210 insertions(+), 7 deletions(-)

-- 
2.25.1



[RFC PATCH 2/5] fs/proc: Introduce /proc/all/statm

2020-08-10 Thread Eugene Lubarsky
Returns statm lines for all visible processes with prepended PIDs.

Signed-off-by: Eugene Lubarsky 
---
 fs/proc/base.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index e0f60a1528b7..8396a38ba7d2 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3884,6 +3884,18 @@ static int proc_all_stat(struct seq_file *m, void *v)
return proc_tgid_stat(m, iter->ns, iter->tgid_iter.task->thread_pid, 
iter->tgid_iter.task);
 }
 
+static int proc_all_statm(struct seq_file *m, void *v)
+{
+   struct all_iter *iter = (struct all_iter *) v;
+   struct pid_namespace *ns = iter->ns;
+   struct task_struct *task = iter->tgid_iter.task;
+   struct pid *pid = task->thread_pid;
+
+   seq_put_decimal_ull(m, "", pid_nr_ns(pid, ns));
+   seq_puts(m, " ");
+   return proc_pid_statm(m, ns, pid, task);
+}
+
 
 #define PROC_ALL_OPS(NAME) static const struct seq_operations 
proc_all_##NAME##_ops = { \
.start  = proc_all_start, \
@@ -3893,6 +3905,7 @@ static int proc_all_stat(struct seq_file *m, void *v)
 }
 
 PROC_ALL_OPS(stat);
+PROC_ALL_OPS(statm);
 
 #define PROC_ALL_CREATE(NAME) \
do { \
@@ -3908,4 +3921,5 @@ void __init proc_all_init(void)
return;
 
PROC_ALL_CREATE(stat);
+   PROC_ALL_CREATE(statm);
 }
-- 
2.25.1



[RFC PATCH 3/5] fs/proc: Introduce /proc/all/status

2020-08-10 Thread Eugene Lubarsky
Returns status lines for all visible processes in the existing format.

Signed-off-by: Eugene Lubarsky 
---
 fs/proc/base.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 8396a38ba7d2..5982fd43dd21 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3897,6 +3897,14 @@ static int proc_all_statm(struct seq_file *m, void *v)
 }
 
 
+
+static int proc_all_status(struct seq_file *m, void *v)
+{
+   struct all_iter *iter = (struct all_iter *) v;
+
+   return proc_pid_status(m, iter->ns, iter->tgid_iter.task->thread_pid, 
iter->tgid_iter.task);
+}
+
 #define PROC_ALL_OPS(NAME) static const struct seq_operations 
proc_all_##NAME##_ops = { \
.start  = proc_all_start, \
.next   = proc_all_next, \
@@ -3906,6 +3914,7 @@ static int proc_all_statm(struct seq_file *m, void *v)
 
 PROC_ALL_OPS(stat);
 PROC_ALL_OPS(statm);
+PROC_ALL_OPS(status);
 
 #define PROC_ALL_CREATE(NAME) \
do { \
@@ -3922,4 +3931,5 @@ void __init proc_all_init(void)
 
PROC_ALL_CREATE(stat);
PROC_ALL_CREATE(statm);
+   PROC_ALL_CREATE(status);
 }
-- 
2.25.1



[RFC PATCH 5/5] fs/proc: Introduce /proc/all/statx

2020-08-10 Thread Eugene Lubarsky
Gathers info from stat, statm and io files.

The purpose is not so much to reduce syscall numbers but to help
userspace not have to store data in e.g. hashtables in order to
gather it from separate /proc/all files.

The format starts with an unchanged stat line and begins the
statm & io lines with "m" or "io", repeating these for each process.

e.g.
...
25 (cat) R 1 1 0 0 -1 4194304 185 0 16 0 2 0 0 0 20 ...
m 662 188 167 5 0 112 0
io 4292 0 12 0 0 0 0
...

Signed-off-by: Eugene Lubarsky 
---
 fs/proc/base.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 03d48225b6d1..5c6010c2ea1c 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3944,6 +3944,31 @@ static int proc_all_io(struct seq_file *m, void *v)
 }
 #endif
 
+static int proc_all_statx(struct seq_file *m, void *v)
+{
+   struct all_iter *iter = (struct all_iter *) v;
+   struct pid_namespace *ns = iter->ns;
+   struct pid *pid = iter->tgid_iter.task->thread_pid;
+   struct task_struct *task = iter->tgid_iter.task;
+   int err;
+
+   err = proc_tgid_stat(m, ns, pid, task);
+   if (err)
+   return err;
+
+   seq_puts(m, "m ");
+   err = proc_pid_statm(m, ns, pid, task);
+   if (err)
+   return err;
+
+#ifdef CONFIG_TASK_IO_ACCOUNTING
+   seq_puts(m, "io ");
+   err = proc_all_io_print_one(m, task);
+#endif
+
+   return err;
+}
+
 static int proc_all_status(struct seq_file *m, void *v)
 {
struct all_iter *iter = (struct all_iter *) v;
@@ -3960,6 +3985,7 @@ static int proc_all_status(struct seq_file *m, void *v)
 
 PROC_ALL_OPS(stat);
 PROC_ALL_OPS(statm);
+PROC_ALL_OPS(statx);
 PROC_ALL_OPS(status);
 #ifdef CONFIG_TASK_IO_ACCOUNTING
PROC_ALL_OPS(io);
@@ -3980,6 +4006,7 @@ void __init proc_all_init(void)
 
PROC_ALL_CREATE(stat);
PROC_ALL_CREATE(statm);
+   PROC_ALL_CREATE(statx);
PROC_ALL_CREATE(status);
 #ifdef CONFIG_TASK_IO_ACCOUNTING
PROC_ALL_CREATE(io);
-- 
2.25.1



[RFC PATCH 1/5] fs/proc: Introduce /proc/all/stat

2020-08-10 Thread Eugene Lubarsky
Returns stat lines for all visible processes in the existing format,
aiming to substantially reduce the number of syscalls that are needed 
for this common task.

Signed-off-by: Eugene Lubarsky 
---
 fs/proc/base.c | 98 ++
 fs/proc/internal.h |  1 +
 fs/proc/root.c |  1 +
 3 files changed, 100 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index a333caeca291..e0f60a1528b7 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3811,3 +3811,101 @@ void __init set_proc_pid_nlink(void)
nlink_tid = pid_entry_nlink(tid_base_stuff, ARRAY_SIZE(tid_base_stuff));
nlink_tgid = pid_entry_nlink(tgid_base_stuff, 
ARRAY_SIZE(tgid_base_stuff));
 }
+
+
+/*
+ * /proc/all/
+ */
+
+struct all_iter {
+   struct tgid_iter tgid_iter;
+   struct proc_fs_info *fs_info;
+   struct pid_namespace *ns;
+};
+
+static void *proc_all_start(struct seq_file *m, loff_t *pos)
+{
+   struct all_iter *iter = kmalloc(sizeof(struct all_iter), GFP_KERNEL);
+
+   iter->fs_info = proc_sb_info(file_inode(m->file)->i_sb);
+   iter->ns = proc_pid_ns(file_inode(m->file)->i_sb);
+
+   iter->tgid_iter.tgid = *pos;
+   iter->tgid_iter.task = NULL;
+   iter->tgid_iter = next_tgid(iter->ns, iter->tgid_iter);
+
+   if (!iter->tgid_iter.task) {
+   kfree(iter);
+   return NULL;
+   }
+
+   return iter;
+}
+
+static void *proc_all_next(struct seq_file *m, void *v, loff_t *pos)
+{
+   struct all_iter *iter = (struct all_iter *) v;
+   struct proc_fs_info *fs_info = iter->fs_info;
+   struct tgid_iter *tgid_iter = &iter->tgid_iter;
+
+   do {
+   tgid_iter->tgid += 1;
+   *tgid_iter = next_tgid(iter->ns, *tgid_iter);
+   } while (tgid_iter->task &&
+   !has_pid_permissions(fs_info, tgid_iter->task, 
HIDEPID_INVISIBLE));
+
+   *pos = tgid_iter->tgid;
+
+   if (!tgid_iter->task) {
+   kfree(v);
+   return NULL;
+   }
+
+   return iter;
+}
+
+static void proc_all_stop(struct seq_file *m, void *v)
+{
+   if (v) {
+   struct all_iter *iter = (struct all_iter *) v;
+   struct task_struct *task = iter->tgid_iter.task;
+
+   if (task)
+   put_task_struct(task);
+
+ kfree(v);
+   }
+}
+
+static int proc_all_stat(struct seq_file *m, void *v)
+{
+   struct all_iter *iter = (struct all_iter *) v;
+
+   return proc_tgid_stat(m, iter->ns, iter->tgid_iter.task->thread_pid, 
iter->tgid_iter.task);
+}
+
+
+#define PROC_ALL_OPS(NAME) static const struct seq_operations 
proc_all_##NAME##_ops = { \
+   .start  = proc_all_start, \
+   .next   = proc_all_next, \
+   .stop   = proc_all_stop, \
+   .show   = proc_all_##NAME \
+}
+
+PROC_ALL_OPS(stat);
+
+#define PROC_ALL_CREATE(NAME) \
+   do { \
+   if (!proc_create_seq(#NAME, 0, all_dir, &proc_all_##NAME##_ops)) \
+   return; \
+   } while (0)
+
+void __init proc_all_init(void)
+{
+   struct proc_dir_entry *all_dir = proc_mkdir("all", NULL);
+
+   if (!all_dir)
+   return;
+
+   PROC_ALL_CREATE(stat);
+}
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 917cc85e3466..b22d9cb619bf 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -171,6 +171,7 @@ extern int pid_delete_dentry(const struct dentry *);
 extern int proc_pid_readdir(struct file *, struct dir_context *);
 struct dentry *proc_pid_lookup(struct dentry *, unsigned int);
 extern loff_t mem_lseek(struct file *, loff_t, int);
+extern void proc_all_init(void);
 
 /* Lookups */
 typedef struct dentry *instantiate_t(struct dentry *,
diff --git a/fs/proc/root.c b/fs/proc/root.c
index 5e444d4f9717..4b5cfd2cdc0a 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -291,6 +291,7 @@ void __init proc_root_init(void)
set_proc_pid_nlink();
proc_self_init();
proc_thread_self_init();
+   proc_all_init();
proc_symlink("mounts", NULL, "self/mounts");
 
proc_net_init();
-- 
2.25.1