Re: [PATCH] task_pid_nr_ns() breaks proc_pid_readdir()

2007-11-19 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes:

> On 11/19, Eric W. Biederman wrote:
>>
>> I think we can solve the immediate issues cleanly
>> without it, and we are pretty much in bug fixing territory now.
>
> Yes sure. Besides, the "patch" I showed for illustration is not complete,
> and it is not easy to solve the problems with exec().
>
> Could you re-send your patch with changelog to Andrew? I agree, it is better
> to pass a pointer than to add the horrible hack to proc_pid_readdir().

Definitely. I'm just testing now to make certain my code actually works.
I have respun my patch to return a structure as I figured out how to
do that cleanly.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] task_pid_nr_ns() breaks proc_pid_readdir()

2007-11-19 Thread Oleg Nesterov
On 11/19, Eric W. Biederman wrote:
>
> I think we can solve the immediate issues cleanly
> without it, and we are pretty much in bug fixing territory now.

Yes sure. Besides, the "patch" I showed for illustration is not complete,
and it is not easy to solve the problems with exec().

Could you re-send your patch with changelog to Andrew? I agree, it is better
to pass a pointer than to add the horrible hack to proc_pid_readdir().

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] task_pid_nr_ns() breaks proc_pid_readdir()

2007-11-19 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes:

> On 11/17, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <[EMAIL PROTECTED]> writes:
>> 
>> > Make sure that task_pid_nr_ns() returns !0 before updating tgid. Note that
>> > next_tgid(tgid + 1) can find the same "struct pid" again, but we shouldn't
>> > go into the endless loop because pid_task(PIDTYPE_PID) must return NULL in
>> > this case, so next_tgid() can't return the same task.
>> >
>> Oleg I think I would rather update next_tgid to return the tgid (which
>> removes the need to call task_pid_nr_ns).  This keeps all of the task
>> iteration logic together in next_tgid.
>
> Yes sure, I think your patch is also correct, please use it.
>
> 
>
> Personally, I hate the functions which use pointers to return another value.
> (yes, yes, I know, my taste is perverted). Why don't we return structure in
> this case? We can even make a common helper struct, say,
>
> Of course, another option is to rewrite the kernle in perl, in that case
> proc_pid_readdir() can just do
>
>   (task, tgid) = next_tgid();

I wish that last syntax worked in C.  That would make returning
structures so much easier.  From a compiler optimization standpoint
returning structures is ever so much nicer.  Seeing through pointers
or references to optimize things is tricky.

> 
>
>> Although looking at this in more detail, I'm half wondering if
>> proc_pid_make_inode() should take a struct pid instead of a task.
>
> Yes, I also thought about this. Needs more changes, and still not perfect.
>
> I am starting to think we need a more generic change. How about the patch
> below? With this change the stable task_struct implies we have the stable
> pids, this allows us to do a lot of cleanups.

I don't see the huge pile of opportunities to clean up.  But otherwise
I am in favor of it.  In the normal case it should only delay freeing
of struct pid (and the corresponding namespace) by an rcu interval so it
isn't a big deal.

I suspect it will be a help with killing things like the global pids
in task_struct.

Regardless I'm going to pass on the keeping struct pid on the task
struct until we free it and reference counting it there for the
immediate present as I think we can solve the immediate issues cleanly
without it, and we are pretty much in bug fixing territory now.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] task_pid_nr_ns() breaks proc_pid_readdir()

2007-11-19 Thread Eric W. Biederman
Oleg Nesterov [EMAIL PROTECTED] writes:

 On 11/17, Eric W. Biederman wrote:

 Oleg Nesterov [EMAIL PROTECTED] writes:
 
  Make sure that task_pid_nr_ns() returns !0 before updating tgid. Note that
  next_tgid(tgid + 1) can find the same struct pid again, but we shouldn't
  go into the endless loop because pid_task(PIDTYPE_PID) must return NULL in
  this case, so next_tgid() can't return the same task.
 
 Oleg I think I would rather update next_tgid to return the tgid (which
 removes the need to call task_pid_nr_ns).  This keeps all of the task
 iteration logic together in next_tgid.

 Yes sure, I think your patch is also correct, please use it.

 offtopic

 Personally, I hate the functions which use pointers to return another value.
 (yes, yes, I know, my taste is perverted). Why don't we return structure in
 this case? We can even make a common helper struct, say,

 Of course, another option is to rewrite the kernle in perl, in that case
 proc_pid_readdir() can just do

   (task, tgid) = next_tgid();

I wish that last syntax worked in C.  That would make returning
structures so much easier.  From a compiler optimization standpoint
returning structures is ever so much nicer.  Seeing through pointers
or references to optimize things is tricky.

 /offtopic

 Although looking at this in more detail, I'm half wondering if
 proc_pid_make_inode() should take a struct pid instead of a task.

 Yes, I also thought about this. Needs more changes, and still not perfect.

 I am starting to think we need a more generic change. How about the patch
 below? With this change the stable task_struct implies we have the stable
 pids, this allows us to do a lot of cleanups.

I don't see the huge pile of opportunities to clean up.  But otherwise
I am in favor of it.  In the normal case it should only delay freeing
of struct pid (and the corresponding namespace) by an rcu interval so it
isn't a big deal.

I suspect it will be a help with killing things like the global pids
in task_struct.

Regardless I'm going to pass on the keeping struct pid on the task
struct until we free it and reference counting it there for the
immediate present as I think we can solve the immediate issues cleanly
without it, and we are pretty much in bug fixing territory now.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] task_pid_nr_ns() breaks proc_pid_readdir()

2007-11-19 Thread Oleg Nesterov
On 11/19, Eric W. Biederman wrote:

 I think we can solve the immediate issues cleanly
 without it, and we are pretty much in bug fixing territory now.

Yes sure. Besides, the patch I showed for illustration is not complete,
and it is not easy to solve the problems with exec().

Could you re-send your patch with changelog to Andrew? I agree, it is better
to pass a pointer than to add the horrible hack to proc_pid_readdir().

Oleg.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] task_pid_nr_ns() breaks proc_pid_readdir()

2007-11-19 Thread Eric W. Biederman
Oleg Nesterov [EMAIL PROTECTED] writes:

 On 11/19, Eric W. Biederman wrote:

 I think we can solve the immediate issues cleanly
 without it, and we are pretty much in bug fixing territory now.

 Yes sure. Besides, the patch I showed for illustration is not complete,
 and it is not easy to solve the problems with exec().

 Could you re-send your patch with changelog to Andrew? I agree, it is better
 to pass a pointer than to add the horrible hack to proc_pid_readdir().

Definitely. I'm just testing now to make certain my code actually works.
I have respun my patch to return a structure as I figured out how to
do that cleanly.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] task_pid_nr_ns() breaks proc_pid_readdir()

2007-11-18 Thread Oleg Nesterov
On 11/17, Eric W. Biederman wrote:
>
> Oleg Nesterov <[EMAIL PROTECTED]> writes:
> 
> > Make sure that task_pid_nr_ns() returns !0 before updating tgid. Note that
> > next_tgid(tgid + 1) can find the same "struct pid" again, but we shouldn't
> > go into the endless loop because pid_task(PIDTYPE_PID) must return NULL in
> > this case, so next_tgid() can't return the same task.
> >
> Oleg I think I would rather update next_tgid to return the tgid (which
> removes the need to call task_pid_nr_ns).  This keeps all of the task
> iteration logic together in next_tgid.

Yes sure, I think your patch is also correct, please use it.



Personally, I hate the functions which use pointers to return another value.
(yes, yes, I know, my taste is perverted). Why don't we return structure in
this case? We can even make a common helper struct, say,

struct pair {
union {
long val1;
void *ptr1;
};
union {
long val2;
void *ptr2;
};
};

#define PAIR(x1, x2)(struct pair){{ . x1 }, { . x2 }}

Now, next_tgid() can do

return PAIR(ptr1 = task, val2 = tgid);

With -freg-struct-return the generated code is nice.

Of course, another option is to rewrite the kernle in perl, in that case
proc_pid_readdir() can just do

(task, tgid) = next_tgid();



> Although looking at this in more detail, I'm half wondering if
> proc_pid_make_inode() should take a struct pid instead of a task.

Yes, I also thought about this. Needs more changes, and still not perfect.

I am starting to think we need a more generic change. How about the patch
below? With this change the stable task_struct implies we have the stable
pids, this allows us to do a lot of cleanups.

Oleg.

--- kernel/pid.c2007-10-25 16:22:12.0 +0400
+++ -   2007-11-18 16:56:30.682555454 +0300
@@ -323,7 +323,7 @@ int fastcall attach_pid(struct task_stru
struct pid_link *link;
 
link = >pids[type];
-   link->pid = pid;
+   link->pid = get_pid(pid);
hlist_add_head_rcu(>node, >tasks[type]);
 
return 0;
@@ -339,7 +339,6 @@ void fastcall detach_pid(struct task_str
pid = link->pid;
 
hlist_del_rcu(>node);
-   link->pid = NULL;
 
for (tmp = PIDTYPE_MAX; --tmp >= 0; )
if (!hlist_empty(>tasks[tmp]))
@@ -348,6 +347,14 @@ void fastcall detach_pid(struct task_str
free_pid(pid);
 }
 
+void task_put_pids(struct pid_link *pids)
+{
+   int type = PIDTYPE_MAX;
+
+   while (type--)
+   put_pid(pids[type].pid);
+}
+
 /* transfer_pid is an optimization of attach_pid(new), detach_pid(old) */
 void fastcall transfer_pid(struct task_struct *old, struct task_struct *new,
   enum pid_type type)
--- kernel/fork.c   2007-11-09 12:57:31.0 +0300
+++ -   2007-11-18 16:57:34.037105563 +0300
@@ -121,6 +121,7 @@ void __put_task_struct(struct task_struc
WARN_ON(atomic_read(>usage));
WARN_ON(tsk == current);
 
+   task_put_pids(tsk->pids);
security_task_free(tsk);
free_uid(tsk->user);
put_group_info(tsk->group_info);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] task_pid_nr_ns() breaks proc_pid_readdir()

2007-11-18 Thread Oleg Nesterov
On 11/17, Eric W. Biederman wrote:

 Oleg Nesterov [EMAIL PROTECTED] writes:
 
  Make sure that task_pid_nr_ns() returns !0 before updating tgid. Note that
  next_tgid(tgid + 1) can find the same struct pid again, but we shouldn't
  go into the endless loop because pid_task(PIDTYPE_PID) must return NULL in
  this case, so next_tgid() can't return the same task.
 
 Oleg I think I would rather update next_tgid to return the tgid (which
 removes the need to call task_pid_nr_ns).  This keeps all of the task
 iteration logic together in next_tgid.

Yes sure, I think your patch is also correct, please use it.

offtopic

Personally, I hate the functions which use pointers to return another value.
(yes, yes, I know, my taste is perverted). Why don't we return structure in
this case? We can even make a common helper struct, say,

struct pair {
union {
long val1;
void *ptr1;
};
union {
long val2;
void *ptr2;
};
};

#define PAIR(x1, x2)(struct pair){{ . x1 }, { . x2 }}

Now, next_tgid() can do

return PAIR(ptr1 = task, val2 = tgid);

With -freg-struct-return the generated code is nice.

Of course, another option is to rewrite the kernle in perl, in that case
proc_pid_readdir() can just do

(task, tgid) = next_tgid();

/offtopic

 Although looking at this in more detail, I'm half wondering if
 proc_pid_make_inode() should take a struct pid instead of a task.

Yes, I also thought about this. Needs more changes, and still not perfect.

I am starting to think we need a more generic change. How about the patch
below? With this change the stable task_struct implies we have the stable
pids, this allows us to do a lot of cleanups.

Oleg.

--- kernel/pid.c2007-10-25 16:22:12.0 +0400
+++ -   2007-11-18 16:56:30.682555454 +0300
@@ -323,7 +323,7 @@ int fastcall attach_pid(struct task_stru
struct pid_link *link;
 
link = task-pids[type];
-   link-pid = pid;
+   link-pid = get_pid(pid);
hlist_add_head_rcu(link-node, pid-tasks[type]);
 
return 0;
@@ -339,7 +339,6 @@ void fastcall detach_pid(struct task_str
pid = link-pid;
 
hlist_del_rcu(link-node);
-   link-pid = NULL;
 
for (tmp = PIDTYPE_MAX; --tmp = 0; )
if (!hlist_empty(pid-tasks[tmp]))
@@ -348,6 +347,14 @@ void fastcall detach_pid(struct task_str
free_pid(pid);
 }
 
+void task_put_pids(struct pid_link *pids)
+{
+   int type = PIDTYPE_MAX;
+
+   while (type--)
+   put_pid(pids[type].pid);
+}
+
 /* transfer_pid is an optimization of attach_pid(new), detach_pid(old) */
 void fastcall transfer_pid(struct task_struct *old, struct task_struct *new,
   enum pid_type type)
--- kernel/fork.c   2007-11-09 12:57:31.0 +0300
+++ -   2007-11-18 16:57:34.037105563 +0300
@@ -121,6 +121,7 @@ void __put_task_struct(struct task_struc
WARN_ON(atomic_read(tsk-usage));
WARN_ON(tsk == current);
 
+   task_put_pids(tsk-pids);
security_task_free(tsk);
free_uid(tsk-user);
put_group_info(tsk-group_info);

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] task_pid_nr_ns() breaks proc_pid_readdir()

2007-11-17 Thread Eric W. Biederman
Oleg Nesterov <[EMAIL PROTECTED]> writes:

> proc_pid_readdir:
>
>   for (...; ...; task = next_tgid(tgid + 1, ns)) {
>   tgid = task_pid_nr_ns(task, ns);
>   ... use tgid ...
>
> The first problem is that task_pid_nr_ns() can race with RCU and read the
> freed memory.
>
> However, rcu_read_lock() can't help. next_tgid() returns a pinned task_struct,
> but the task can be released (and it's pid detached) before task_pid_nr_ns()
> reads the pid_t value. In that case task_pid_nr_ns() returns 0 thus breaking
> the whole logic.
>
> Make sure that task_pid_nr_ns() returns !0 before updating tgid. Note that
> next_tgid(tgid + 1) can find the same "struct pid" again, but we shouldn't
> go into the endless loop because pid_task(PIDTYPE_PID) must return NULL in
> this case, so next_tgid() can't return the same task.
>
> Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

Oleg I think I would rather update next_tgid to return the tgid (which
removes the need to call task_pid_nr_ns).  This keeps all of the task
iteration logic together in next_tgid.

Although looking at this in more detail, I'm half wondering if
proc_pid_make_inode() should take a struct pid instead of a task.
At first glance that looks like it might be a little simple and more
race free.  Although it doesn't do any favors to:
>   inode->i_gid = 0;
>   if (task_dumpable(task)) {
>   inode->i_uid = task->euid;
>   inode->i_gid = task->egid;
>   }
>   security_task_to_inode(task, inode);

Anyway short of rewriting the world this is what updating next_tgid
looks like.  Opinions?

Eric


diff --git a/fs/proc/base.c b/fs/proc/base.c
index a17c268..5d9328d 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2411,7 +2411,7 @@ out:
  * Find the first task with tgid >= tgid
  *
  */
-static struct task_struct *next_tgid(unsigned int tgid,
+static struct task_struct *next_tgid(unsigned int *tgid,
struct pid_namespace *ns)
 {
struct task_struct *task;
@@ -2420,9 +2420,9 @@ static struct task_struct *next_tgid(unsigned int tgid,
rcu_read_lock();
 retry:
task = NULL;
-   pid = find_ge_pid(tgid, ns);
+   pid = find_ge_pid(*tgid, ns);
if (pid) {
-   tgid = pid_nr_ns(pid, ns) + 1;
+   *tgid = pid_nr_ns(pid, ns);
task = pid_task(pid, PIDTYPE_PID);
/* What we to know is if the pid we have find is the
 * pid of a thread_group_leader.  Testing for task
@@ -2436,8 +2436,10 @@ retry:
 * found doesn't happen to be a thread group leader.
 * As we don't care in the case of readdir.
 */
-   if (!task || !has_group_leader_pid(task))
+   if (!task || !has_group_leader_pid(task)) {
+   *tgid += 1;
goto retry;
+   }
get_task_struct(task);
}
rcu_read_unlock();
@@ -2475,10 +2477,9 @@ int proc_pid_readdir(struct file * filp, void * dirent, 
filldir_t filldir)
 
ns = filp->f_dentry->d_sb->s_fs_info;
tgid = filp->f_pos - TGID_OFFSET;
-   for (task = next_tgid(tgid, ns);
+   for (task = next_tgid(, ns);
 task;
-put_task_struct(task), task = next_tgid(tgid + 1, ns)) {
-   tgid = task_pid_nr_ns(task, ns);
+put_task_struct(task), tgid += 1, task = next_tgid(, ns)) {
filp->f_pos = tgid + TGID_OFFSET;
if (proc_pid_fill_cache(filp, dirent, filldir, task, tgid) < 0) 
{
put_task_struct(task);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] task_pid_nr_ns() breaks proc_pid_readdir()

2007-11-17 Thread Oleg Nesterov
proc_pid_readdir:

for (...; ...; task = next_tgid(tgid + 1, ns)) {
tgid = task_pid_nr_ns(task, ns);
... use tgid ...

The first problem is that task_pid_nr_ns() can race with RCU and read the
freed memory.

However, rcu_read_lock() can't help. next_tgid() returns a pinned task_struct,
but the task can be released (and it's pid detached) before task_pid_nr_ns()
reads the pid_t value. In that case task_pid_nr_ns() returns 0 thus breaking
the whole logic.

Make sure that task_pid_nr_ns() returns !0 before updating tgid. Note that
next_tgid(tgid + 1) can find the same "struct pid" again, but we shouldn't
go into the endless loop because pid_task(PIDTYPE_PID) must return NULL in
this case, so next_tgid() can't return the same task.

Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

--- 24/fs/proc/base.c~pprd  2007-10-25 16:22:11.0 +0400
+++ 24/fs/proc/base.c   2007-11-17 20:58:14.0 +0300
@@ -2481,7 +2481,15 @@ int proc_pid_readdir(struct file * filp,
for (task = next_tgid(tgid, ns);
 task;
 put_task_struct(task), task = next_tgid(tgid + 1, ns)) {
-   tgid = task_pid_nr_ns(task, ns);
+   int nr;
+
+   rcu_read_lock();
+   nr = task_pid_nr_ns(task, ns);
+   rcu_read_unlock();
+   if (!nr)
+   continue;
+
+   tgid = nr;
filp->f_pos = tgid + TGID_OFFSET;
if (proc_pid_fill_cache(filp, dirent, filldir, task, tgid) < 0) 
{
put_task_struct(task);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] task_pid_nr_ns() breaks proc_pid_readdir()

2007-11-17 Thread Oleg Nesterov
proc_pid_readdir:

for (...; ...; task = next_tgid(tgid + 1, ns)) {
tgid = task_pid_nr_ns(task, ns);
... use tgid ...

The first problem is that task_pid_nr_ns() can race with RCU and read the
freed memory.

However, rcu_read_lock() can't help. next_tgid() returns a pinned task_struct,
but the task can be released (and it's pid detached) before task_pid_nr_ns()
reads the pid_t value. In that case task_pid_nr_ns() returns 0 thus breaking
the whole logic.

Make sure that task_pid_nr_ns() returns !0 before updating tgid. Note that
next_tgid(tgid + 1) can find the same struct pid again, but we shouldn't
go into the endless loop because pid_task(PIDTYPE_PID) must return NULL in
this case, so next_tgid() can't return the same task.

Signed-off-by: Oleg Nesterov [EMAIL PROTECTED]

--- 24/fs/proc/base.c~pprd  2007-10-25 16:22:11.0 +0400
+++ 24/fs/proc/base.c   2007-11-17 20:58:14.0 +0300
@@ -2481,7 +2481,15 @@ int proc_pid_readdir(struct file * filp,
for (task = next_tgid(tgid, ns);
 task;
 put_task_struct(task), task = next_tgid(tgid + 1, ns)) {
-   tgid = task_pid_nr_ns(task, ns);
+   int nr;
+
+   rcu_read_lock();
+   nr = task_pid_nr_ns(task, ns);
+   rcu_read_unlock();
+   if (!nr)
+   continue;
+
+   tgid = nr;
filp-f_pos = tgid + TGID_OFFSET;
if (proc_pid_fill_cache(filp, dirent, filldir, task, tgid)  0) 
{
put_task_struct(task);

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] task_pid_nr_ns() breaks proc_pid_readdir()

2007-11-17 Thread Eric W. Biederman
Oleg Nesterov [EMAIL PROTECTED] writes:

 proc_pid_readdir:

   for (...; ...; task = next_tgid(tgid + 1, ns)) {
   tgid = task_pid_nr_ns(task, ns);
   ... use tgid ...

 The first problem is that task_pid_nr_ns() can race with RCU and read the
 freed memory.

 However, rcu_read_lock() can't help. next_tgid() returns a pinned task_struct,
 but the task can be released (and it's pid detached) before task_pid_nr_ns()
 reads the pid_t value. In that case task_pid_nr_ns() returns 0 thus breaking
 the whole logic.

 Make sure that task_pid_nr_ns() returns !0 before updating tgid. Note that
 next_tgid(tgid + 1) can find the same struct pid again, but we shouldn't
 go into the endless loop because pid_task(PIDTYPE_PID) must return NULL in
 this case, so next_tgid() can't return the same task.

 Signed-off-by: Oleg Nesterov [EMAIL PROTECTED]

Oleg I think I would rather update next_tgid to return the tgid (which
removes the need to call task_pid_nr_ns).  This keeps all of the task
iteration logic together in next_tgid.

Although looking at this in more detail, I'm half wondering if
proc_pid_make_inode() should take a struct pid instead of a task.
At first glance that looks like it might be a little simple and more
race free.  Although it doesn't do any favors to:
   inode-i_gid = 0;
   if (task_dumpable(task)) {
   inode-i_uid = task-euid;
   inode-i_gid = task-egid;
   }
   security_task_to_inode(task, inode);

Anyway short of rewriting the world this is what updating next_tgid
looks like.  Opinions?

Eric


diff --git a/fs/proc/base.c b/fs/proc/base.c
index a17c268..5d9328d 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2411,7 +2411,7 @@ out:
  * Find the first task with tgid = tgid
  *
  */
-static struct task_struct *next_tgid(unsigned int tgid,
+static struct task_struct *next_tgid(unsigned int *tgid,
struct pid_namespace *ns)
 {
struct task_struct *task;
@@ -2420,9 +2420,9 @@ static struct task_struct *next_tgid(unsigned int tgid,
rcu_read_lock();
 retry:
task = NULL;
-   pid = find_ge_pid(tgid, ns);
+   pid = find_ge_pid(*tgid, ns);
if (pid) {
-   tgid = pid_nr_ns(pid, ns) + 1;
+   *tgid = pid_nr_ns(pid, ns);
task = pid_task(pid, PIDTYPE_PID);
/* What we to know is if the pid we have find is the
 * pid of a thread_group_leader.  Testing for task
@@ -2436,8 +2436,10 @@ retry:
 * found doesn't happen to be a thread group leader.
 * As we don't care in the case of readdir.
 */
-   if (!task || !has_group_leader_pid(task))
+   if (!task || !has_group_leader_pid(task)) {
+   *tgid += 1;
goto retry;
+   }
get_task_struct(task);
}
rcu_read_unlock();
@@ -2475,10 +2477,9 @@ int proc_pid_readdir(struct file * filp, void * dirent, 
filldir_t filldir)
 
ns = filp-f_dentry-d_sb-s_fs_info;
tgid = filp-f_pos - TGID_OFFSET;
-   for (task = next_tgid(tgid, ns);
+   for (task = next_tgid(tgid, ns);
 task;
-put_task_struct(task), task = next_tgid(tgid + 1, ns)) {
-   tgid = task_pid_nr_ns(task, ns);
+put_task_struct(task), tgid += 1, task = next_tgid(tgid, ns)) {
filp-f_pos = tgid + TGID_OFFSET;
if (proc_pid_fill_cache(filp, dirent, filldir, task, tgid)  0) 
{
put_task_struct(task);
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/