Re: [RFC] Per-thread getrusage

2008-01-29 Thread Pavel Emelyanov
Eric W. Biederman wrote:
> Pavel Emelyanov <[EMAIL PROTECTED]> writes:
>>> ...
>>> +asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru)
>>> +{
>>> +   struct task_struct *tsk;
>>> +   tsk = find_task_by_pid(tid);
>>> +   return getrusage(tsk, RUSAGE_THREAD, ru);
>>> +}
>> Well, the find_task_by_pid() is really wrong here.
> 
> And find_task_by_pid should probably just be removed.
> 
> No need to provide function with the gun firmly pointed at our feet

We are working to uncock it. If you feel you know how to do it
faster, it would be just terrific to review your patches.

> Eric
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] Per-thread getrusage

2008-01-29 Thread Pavel Emelyanov
Andrew Morton wrote:
> On Mon, 28 Jan 2008 13:43:02 -0700
> [EMAIL PROTECTED] (Eric W. Biederman) wrote:
> 
>> Pavel Emelyanov <[EMAIL PROTECTED]> writes:
>>>> ...
>>>> +asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru)
>>>> +{
>>>> +  struct task_struct *tsk;
>>>> +  tsk = find_task_by_pid(tid);
>>>> +  return getrusage(tsk, RUSAGE_THREAD, ru);
>>>> +}
>>> Well, the find_task_by_pid() is really wrong here.
>> And find_task_by_pid should probably just be removed.
> 
> That's what I was thinking.

find_task_by_pid and find_pid are to be removed, but this task
heavily depends on others.

E.g. to drop the find_pid() we need to kill the kill_proc() 
function, which in turn depends on turning the usbatm, nfs and 
lockd code into kthread API. We're currently working on this.

>> No need to provide function with the gun firmly pointed at our feet
> 
> It still has a disturbingly large number of callers.

Yes, but unfortunately simple conversion from find_xxx_pid into
find_xxx_vpid is not possible - each case is special.

Thanks,
Pavel


Re: [RFC] Per-thread getrusage

2008-01-28 Thread Andrew Morton
On Mon, 28 Jan 2008 13:43:02 -0700
[EMAIL PROTECTED] (Eric W. Biederman) wrote:

> Pavel Emelyanov <[EMAIL PROTECTED]> writes:
> >> ...
> >> +asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru)
> >> +{
> >> +  struct task_struct *tsk;
> >> +  tsk = find_task_by_pid(tid);
> >> +  return getrusage(tsk, RUSAGE_THREAD, ru);
> >> +}
> >
> > Well, the find_task_by_pid() is really wrong here.
> 
> And find_task_by_pid should probably just be removed.

That's what I was thinking.

> No need to provide function with the gun firmly pointed at our feet

It still has a disturbingly large number of callers.


Re: [RFC] Per-thread getrusage

2008-01-28 Thread Eric W. Biederman
Pavel Emelyanov <[EMAIL PROTECTED]> writes:
>> ...
>> +asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru)
>> +{
>> +struct task_struct *tsk;
>> +tsk = find_task_by_pid(tid);
>> +return getrusage(tsk, RUSAGE_THREAD, ru);
>> +}
>
> Well, the find_task_by_pid() is really wrong here.

And find_task_by_pid should probably just be removed.

No need to provide function with the gun firmly pointed at our feet

Eric


Re: [RFC] Per-thread getrusage

2008-01-28 Thread Pavel Emelyanov
Andrew Morton wrote:
> On Mon, 28 Jan 2008 12:38:17 +0300 Pavel Emelyanov <[EMAIL PROTECTED]> wrote:
> 
>>> If the code was using find_task_by_vpid() then OK (I guess).  But it is
>> Yup, find_task_by_vpid() will find the proper (i.e. in your namespace) task.
>>
>>> looking the tids up in the init_pid_ns.  Which I assume means that if it's
>>> in a new namespace and is looking up a sibling thread it will simply fail?
>> If it looks in the init_pid_ns, then it can either fail or obtain a task 
>> from different namespace. The find_task_by_pid_ns() was intended to be used
>> in proc mainly, to get tasks from the namespace pointed by the super-block
>> being explored.
>>
>> Please excuse my lamentable ignorance, but which code does such things with
>> init_pid_ns? I followed the 'per-thread rusage' thread and didn't find any.
> 
> From: Vinay Sridhar <[EMAIL PROTECTED]>
> To: linux-kernel@vger.kernel.org, [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
> Subject: [RFC] Per-thread getrusage

Ouch. Thanks, I've missed that and looked just at the Roland's patch :(

> ...
> +asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru)
> +{
> + struct task_struct *tsk;
> + tsk = find_task_by_pid(tid);
> + return getrusage(tsk, RUSAGE_THREAD, ru);
> +}

Well, the find_task_by_pid() is really wrong here.

Besides (just in case this system call is going to be developed further), 
the tsk == NULL  case is not checked inside the getrusage and may OOPS 
even if the proper namespace is used.

Thanks,
Pavel
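
[Editor's note: the missing NULL check described above can be modelled in
user space. This is only a sketch, not kernel code: struct model_task, the
model_tasks table, lookup_task() and model_thread_getrusage() are
hypothetical names introduced for illustration, and the model ignores the
locking a real task lookup needs. It shows just the pattern of failing with
-ESRCH when the lookup returns nothing, instead of passing NULL onward.]

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct: an id and one counter. */
struct model_task {
    int tid;
    long utime_ticks;
};

/* Hypothetical task table standing in for the kernel's pid lookup. */
static struct model_task model_tasks[] = {
    { 100, 7 },
    { 101, 42 },
};

/* Returns the matching task, or NULL when no task has this tid. */
static struct model_task *lookup_task(int tid)
{
    size_t i;

    for (i = 0; i < sizeof(model_tasks) / sizeof(model_tasks[0]); i++)
        if (model_tasks[i].tid == tid)
            return &model_tasks[i];
    return NULL;
}

/* Model of the syscall body: check the lookup result before using it. */
static long model_thread_getrusage(int tid, long *utime_out)
{
    struct model_task *tsk = lookup_task(tid);

    if (tsk == NULL)
        return -ESRCH;  /* no such thread: fail rather than dereference NULL */
    *utime_out = tsk->utime_ticks;
    return 0;
}
```

[With this table, model_thread_getrusage(101, &ut) succeeds and an unknown
tid is rejected without touching the output argument.]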


Re: [RFC] Per-thread getrusage

2008-01-28 Thread Andrew Morton
On Mon, 28 Jan 2008 12:38:17 +0300 Pavel Emelyanov <[EMAIL PROTECTED]> wrote:

> > If the code was using find_task_by_vpid() then OK (I guess).  But it is
> 
> Yup, find_task_by_vpid() will find the proper (i.e. in your namespace) task.
> 
> > looking the tids up in the init_pid_ns.  Which I assume means that if it's
> > in a new namespace and is looking up a sibling thread it will simply fail?
> 
> If it looks in the init_pid_ns, then it can either fail or obtain a task 
> from different namespace. The find_task_by_pid_ns() was intended to be used
> in proc mainly, to get tasks from the namespace pointed by the super-block
> being explored.
> 
> Please excuse my lamentable ignorance, but which code does such things with
> init_pid_ns? I followed the 'per-thread rusage' thread and didn't find any.

From: Vinay Sridhar <[EMAIL PROTECTED]>
To: linux-kernel@vger.kernel.org, [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: [RFC] Per-thread getrusage
...
+asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru)
+{
+   struct task_struct *tsk;
+   tsk = find_task_by_pid(tid);
+   return getrusage(tsk, RUSAGE_THREAD, ru);
+}



Re: [RFC] Per-thread getrusage

2008-01-28 Thread Pavel Emelyanov
Andrew Morton wrote:
> On Mon, 28 Jan 2008 10:48:23 +0300 Pavel Emelyanov <[EMAIL PROTECTED]> wrote:
> 
>> Andrew Morton wrote:
>>> On Thu, 17 Jan 2008 13:57:05 +0530 Vinay Sridhar <[EMAIL PROTECTED]> 
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> Last year, there was discussion about per-thread getrusage by adding
>>>> RUSAGE_THREAD flag to getrusage(). Please refer to the thread
>>>> http://lkml.org/lkml/2007/4/4/308. Ulrich had suggested that we should
>>>> design a better user-space API. Specifically, we need a
>>>> pthread_getrusage interface in the thread library, which accepts
>>>> pthread_t, converts pthread_t into the corresponding tid and passes it
>>>> down to the syscall.
>>>>
>>>> There are two ways to implement this in the kernel:
>>>> 1) Introduce an additional parameter 'tid' to sys_getrusage() and put
>>>> code in glibc to handle getrusage() and pthread_getrusage() calls
>>>> correctly.
>>>> 2) Introduce a new system call to handle pthread_getrusage() and leave
>>>> sys_getrusage() untouched.
>>>>
>>>> We implemented the second idea above, simply because it avoids touching
>>>> any existing code. We have implemented a new syscall, thread_getrusage()
>>>> and we have exposed pthread_getrusage() API to applications.
>>>>
>>>> Could you please share your thoughts on this? Does the approach look
>>>> alright? The code is hardly complete. It is just a prototype that works
>>>> on IA32 at the moment.
>>>>
>>>> ...
>>>>
>>>> +asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru);
>>> What happens if `tid' refers to a thread in a different pid namespace?
>>>
>> That's impossible. I explicitly deny namespace creation in case the
>> CLONE_THREAD is specified. So all threads of a single process always
>> live in one pid namespace.
>>
> 
> If the code was using find_task_by_vpid() then OK (I guess).  But it is

Yup, find_task_by_vpid() will find the proper (i.e. in your namespace) task.

> looking the tids up in the init_pid_ns.  Which I assume means that if it's
> in a new namespace and is looking up a sibling thread it will simply fail?

If it looks in the init_pid_ns, then it can either fail or obtain a task 
from different namespace. The find_task_by_pid_ns() was intended to be used
in proc mainly, to get tasks from the namespace pointed by the super-block
being explored.

Please excuse my lamentable ignorance, but which code does such things with
init_pid_ns? I followed the 'per-thread rusage' thread and didn't find any.

> Or am I missing something?
> 

Thanks,
Pavel
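
[Editor's note: the point above, that a lookup in init_pid_ns "can either
fail or obtain a task from different namespace", can be illustrated with a
small user-space model. Everything here is hypothetical and is not the
kernel's data structure: struct model_task, the tasks table, enum model_ns
and model_find_task() exist only for this sketch, which assumes one initial
namespace and a single child namespace.]

```c
#include <stddef.h>

/* Hypothetical model: every task has a number in the initial namespace and,
 * if it lives in the child namespace, a second number that is valid there.
 * child_pid == 0 means the task is not visible in the child namespace. */
struct model_task {
    const char *name;
    int init_pid;   /* number as seen from the initial namespace */
    int child_pid;  /* number as seen from inside the child namespace */
};

static const struct model_task tasks[] = {
    { "host-daemon",      2,    0 },
    { "container-init",   1200, 1 },
    { "container-thread", 1201, 2 },
    { "container-worker", 1202, 3 },
};

enum model_ns { NS_INIT, NS_CHILD };

/* Look up number 'nr' as interpreted in namespace 'ns'; NULL if absent. */
static const struct model_task *model_find_task(enum model_ns ns, int nr)
{
    size_t i;

    for (i = 0; i < sizeof(tasks) / sizeof(tasks[0]); i++) {
        int seen = (ns == NS_INIT) ? tasks[i].init_pid : tasks[i].child_pid;

        if (seen != 0 && seen == nr)
            return &tasks[i];
    }
    return NULL;
}
```

[A thread inside the child namespace knows its sibling as tid 2. Resolving 2
with NS_CHILD gives container-thread, while resolving the same number with
NS_INIT gives the unrelated host-daemon; resolving 3 with NS_INIT finds
nothing at all.]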


Re: [RFC] Per-thread getrusage

2008-01-28 Thread Andrew Morton
On Mon, 28 Jan 2008 13:54:12 +0530 Sripathi Kodi <[EMAIL PROTECTED]> wrote:

> Does Roland's patch (http://lkml.org/lkml/2008/1/18/589) look good to go 
> in, provided Ulrich's comment (http://lkml.org/lkml/2008/1/19/15) is 
> addressed?

Sure, it looks sane - it avoids the problematic find_task_by_pid() too.


Re: [RFC] Per-thread getrusage

2008-01-28 Thread Andrew Morton
On Mon, 28 Jan 2008 10:48:23 +0300 Pavel Emelyanov <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> > On Thu, 17 Jan 2008 13:57:05 +0530 Vinay Sridhar <[EMAIL PROTECTED]> 
> > wrote:
> > 
> >> Hi All,
> >>
> >> Last year, there was discussion about per-thread getrusage by adding
> >> RUSAGE_THREAD flag to getrusage(). Please refer to the thread
> >> http://lkml.org/lkml/2007/4/4/308. Ulrich had suggested that we should
> >> design a better user-space API. Specifically, we need a
> >> pthread_getrusage interface in the thread library, which accepts
> >> pthread_t, converts pthread_t into the corresponding tid and passes it
> >> down to the syscall.
> >>
> >> There are two ways to implement this in the kernel:
> >> 1) Introduce an additional parameter 'tid' to sys_getrusage() and put
> >> code in glibc to handle getrusage() and pthread_getrusage() calls
> >> correctly.
> >> 2) Introduce a new system call to handle pthread_getrusage() and leave
> >> sys_getrusage() untouched.
> >>
> >> We implemented the second idea above, simply because it avoids touching
> >> any existing code. We have implemented a new syscall, thread_getrusage()
> >> and we have exposed pthread_getrusage() API to applications.
> >>
> >> Could you please share your thoughts on this? Does the approach look
> >> alright? The code is hardly complete. It is just a prototype that works
> >> on IA32 at the moment.
> >>
> >> ...
> >>
> >> +asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru);
> > 
> > What happens if `tid' refers to a thread in a different pid namespace?
> > 
> 
> That's impossible. I explicitly deny namespace creation in case the
> CLONE_THREAD is specified. So all threads of a single process always
> live in one pid namespace.
> 

If the code was using find_task_by_vpid() then OK (I guess).  But it is
looking the tids up in the init_pid_ns.  Which I assume means that if it's
in a new namespace and is looking up a sibling thread it will simply fail?

Or am I missing something?


Re: [RFC] Per-thread getrusage

2008-01-28 Thread Sripathi Kodi
Hi Andrew,

On Monday 28 January 2008 11:22, Andrew Morton wrote:
> On Thu, 17 Jan 2008 13:57:05 +0530 Vinay Sridhar <[EMAIL PROTECTED]> wrote:
> > Hi All,
> >
> > Last year, there was discussion about per-thread getrusage by
> > adding RUSAGE_THREAD flag to getrusage(). Please refer to the
> > thread http://lkml.org/lkml/2007/4/4/308. Ulrich had suggested that
> > we should design a better user-space API. Specifically, we need a
> > pthread_getrusage interface in the thread library, which accepts
> > pthread_t, converts pthread_t into the corresponding tid and passes
> > it down to the syscall.
> >
> > There are two ways to implement this in the kernel:
> > 1) Introduce an additional parameter 'tid' to sys_getrusage() and
> > put code in glibc to handle getrusage() and pthread_getrusage()
> > calls correctly.
> > 2) Introduce a new system call to handle pthread_getrusage() and
> > leave sys_getrusage() untouched.
> >
> > We implemented the second idea above, simply because it avoids
> > touching any existing code. We have implemented a new syscall,
> > thread_getrusage() and we have exposed pthread_getrusage() API to
> > applications.
> >
> > Could you please share your thoughts on this? Does the approach
> > look alright? The code is hardly complete. It is just a prototype
> > that works on IA32 at the moment.
> >
> > ...
> >
> > +asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru);
>
> What happens if `tid' refers to a thread in a different pid
> namespace?

The code was only meant to be a base for discussions. It surely needs 
work. Our idea for the final version was to be able to read a thread's 
rusage from another thread strictly within the same process. The idea 
came from applications that need a cost enforcement mechanism. Having a 
mechanism for a thread to read its own usage is essential. If there is 
a way to read other threads' rusage, it is even better.

Does Roland's patch (http://lkml.org/lkml/2008/1/18/589) look good to go 
in, provided Ulrich's comment (http://lkml.org/lkml/2008/1/19/15) is 
addressed?

Thanks,
Sripathi.
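
[Editor's note: the "thread reads its own usage" case mentioned above can be
sketched from user space. This assumes a later Linux kernel and glibc where
getrusage(2) accepts RUSAGE_THREAD (the man page lists it as available since
2.6.26, under _GNU_SOURCE); it is not the prototype syscall proposed in this
thread. The helper names self_thread_cpu_usec() and over_budget() are
hypothetical, and the fallback value 1 for RUSAGE_THREAD is the Linux value,
assumed here only in case the header did not expose it.]

```c
#define _GNU_SOURCE
#include <sys/resource.h>
#include <sys/time.h>

/* Assumption: on Linux RUSAGE_THREAD is 1; used only if the header hides it. */
#ifndef RUSAGE_THREAD
#define RUSAGE_THREAD 1
#endif

/* Hypothetical helper: CPU time (user + system) used so far by the calling
 * thread, in microseconds, or -1 if getrusage() fails. */
static long long self_thread_cpu_usec(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_THREAD, &ru) != 0)
        return -1;
    return ((long long)ru.ru_utime.tv_sec + ru.ru_stime.tv_sec) * 1000000LL
           + ru.ru_utime.tv_usec + ru.ru_stime.tv_usec;
}

/* Hypothetical cost-enforcement check: nonzero when the calling thread has
 * exceeded its CPU budget, or when its usage could not be read. */
static int over_budget(long long budget_usec)
{
    long long used = self_thread_cpu_usec();

    return used < 0 || used > budget_usec;
}
```

[A worker thread could call over_budget() between units of work and stop
when it returns nonzero. Reading another thread's usage, the second case
discussed above, is not covered by this sketch.]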


Re: [RFC] Per-thread getrusage

2008-01-27 Thread Pavel Emelyanov
Andrew Morton wrote:
>   On Thu, 17 Jan 2008 13:57:05 +0530 Vinay Sridhar <[EMAIL PROTECTED]> 
> wrote:
> 
>> Hi All,
>>
>> Last year, there was discussion about per-thread getrusage by adding
>> RUSAGE_THREAD flag to getrusage(). Please refer to the thread
>> http://lkml.org/lkml/2007/4/4/308. Ulrich had suggested that we should
>> design a better user-space API. Specifically, we need a
>> pthread_getrusage interface in the thread library, which accepts
>> pthread_t, converts pthread_t into the corresponding tid and passes it
>> down to the syscall.
>>
>> There are two ways to implement this in the kernel:
>> 1) Introduce an additional parameter 'tid' to sys_getrusage() and put
>> code in glibc to handle getrusage() and pthread_getrusage() calls
>> correctly.
>> 2) Introduce a new system call to handle pthread_getrusage() and leave
>> sys_getrusage() untouched.
>>
>> We implemented the second idea above, simply because it avoids touching
>> any existing code. We have implemented a new syscall, thread_getrusage()
>> and we have exposed pthread_getrusage() API to applications.
>>
>> Could you please share your thoughts on this? Does the approach look
>> alright? The code is hardly complete. It is just a prototype that works
>> on IA32 at the moment.
>>
>> ...
>>
>> +asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru);
> 
> What happens if `tid' refers to a thread in a different pid namespace?
> 

That's impossible. I explicitly deny namespace creation in case the
CLONE_THREAD is specified. So all threads of a single process always
live in one pid namespace.

Thanks,
Pavel


Re: [RFC] Per-thread getrusage

2008-01-27 Thread Andrew Morton
On Thu, 17 Jan 2008 13:57:05 +0530 Vinay Sridhar <[EMAIL PROTECTED]> 
wrote:

> Hi All,
> 
> Last year, there was discussion about per-thread getrusage by adding
> RUSAGE_THREAD flag to getrusage(). Please refer to the thread
> http://lkml.org/lkml/2007/4/4/308. Ulrich had suggested that we should
> design a better user-space API. Specifically, we need a
> pthread_getrusage interface in the thread library, which accepts
> pthread_t, converts pthread_t into the corresponding tid and passes it
> down to the syscall.
> 
> There are two ways to implement this in the kernel:
> 1) Introduce an additional parameter 'tid' to sys_getrusage() and put
> code in glibc to handle getrusage() and pthread_getrusage() calls
> correctly.
> 2) Introduce a new system call to handle pthread_getrusage() and leave
> sys_getrusage() untouched.
> 
> We implemented the second idea above, simply because it avoids touching
> any existing code. We have implemented a new syscall, thread_getrusage()
> and we have exposed pthread_getrusage() API to applications.
> 
> Could you please share your thoughts on this? Does the approach look
> alright? The code is hardly complete. It is just a prototype that works
> on IA32 at the moment.
> 
> ...
>
> +asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru);

What happens if `tid' refers to a thread in a different pid namespace?


Re: [RFC] Per-thread getrusage

2008-01-18 Thread Roland McGrath
I agree that RUSAGE_THREAD is fine.  (In fact, if you'd pressed me to
remember without looking, I would have assumed we put it in already.)
However, in the implementation, I would keep it cleaner by moving the
identical code from inside the loop under case RUSAGE_SELF into a shared
subfunction, rather than duplicating it.  In fact, here you go (next posting).

As to getting arbitrary other threads' data, there are several problems
there.  Adding a syscall is often more trouble than it's worth.  Ulrich
cited the issues with that as the API.  You also didn't handle compat for
it correctly.  To warrant the code necessary to make this available by
whatever API, I think you need to say some more about what it's needed for.

Off hand, it seems most in keeping with other things to expose this via a
/proc file, i.e. /proc/tgid/task/tid/rusage and (/proc/tgid/rusage for the
RUSAGE_SELF behavior on a foreign process).  There we already have the
infrastructure for dealing with the security issues uniformly with how we
control other similar information.  Personally I tend to prefer a binary
interface, i.e. a virtual file whose contents are struct rusage; for that
you still need to do the extra compat work, since a 32-bit process should
have the 32-bit struct rusage layout in its /proc files.  If you put the
numbers into ascii text as some /proc interfaces do, you don't need any
special considerations for CONFIG_COMPAT.
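
As a rough illustration of the text-based option, here is a minimal user-space sketch that reads per-thread CPU time from an ASCII /proc file. The rusage files proposed above are not assumed to exist; the sketch uses the existing /proc/self/task/<tid>/stat file as a stand-in, and the helper names read_thread_ticks() and self_tid() are hypothetical.

```c
#define _GNU_SOURCE             /* for the syscall() declaration on glibc */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>

/* Hypothetical helper: kernel thread ID of the calling thread. */
static pid_t self_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/*
 * Hypothetical helper: read utime and stime (in clock ticks) for one
 * thread of the calling process from /proc/self/task/<tid>/stat.
 * Returns 0 on success, -1 on any failure.
 */
static int read_thread_ticks(pid_t tid, unsigned long *utime,
                             unsigned long *stime)
{
    char path[64], buf[1024];
    char *p;
    size_t n;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/self/task/%d/stat", (int)tid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* Field 2 (comm) is parenthesised and may contain spaces or ')',
     * so resume parsing after the last ')'. */
    p = strrchr(buf, ')');
    if (!p)
        return -1;

    /* Skip fields 3..13 (state through cmajflt); fields 14 and 15
     * are utime and stime. */
    if (sscanf(p + 1,
               " %*c %*d %*d %*d %*d %*d %*u %*lu %*lu %*lu %*lu %lu %lu",
               utime, stime) != 2)
        return -1;
    return 0;
}
```

A caller would divide the tick counts by sysconf(_SC_CLK_TCK) to get seconds. Because the values are plain decimal text, the same file reads identically from 32-bit and 64-bit processes, which is the CONFIG_COMPAT point made above.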


Thanks,
Roland


Re: [RFC] Per-thread getrusage

2008-01-17 Thread Ulrich Drepper
Vinay Sridhar wrote:
> There are two ways to implement this in the kernel:
> 1) Introduce an additional parameter 'tid' to sys_getrusage() and put
> code in glibc to handle getrusage() and pthread_getrusage() calls
> correctly.
> 2) Introduce a new system call to handle pthread_getrusage() and leave
> sys_getrusage() untouched.

You're doing two things at once:

a) provide a way to get a thread's usage

b) provide a way to get another process's/thread's usage


The former is a trivial extension and I completely agree.  RUSAGE_THREAD
is trivial to implement and should go in ASAP.

The second part isn't that easy.  The first question is: do we really
need this?  It is a new type of interface.  We have the /proc filesystem
etc for programs which want to look at other process' data.  Second,
more importantly right now, your patch seems not to include any security
support.  Correct me if I'm wrong, but find_task_by_pid will always
succeed, regardless of whether the calling thread belongs to another UID
or not.  I.e., your patch enables any process to read any other process'
usage.  That's a no-no.


I suggest that you split the patch in two.  The first should implement
RUSAGE_THREAD.  You'll immediately get an ACK from me for that.  The
second part then should introduce a way to get another process' usage.
This patch should only be used initially as a starting point for
discussions.  You'll have to argue why it is necessary in the first place.

The argument might have to do with why you want a pthread_getrusage()
interface (which, btw, is a bad name since the interface is nothing like
getrusage; getrusage doesn't allow requesting any other process' data).
Yes, for intra-process lookups relying on /proc is not a good idea.  But
then, I have not seen any reason so far why such an API is needed and
why a thread cannot just be responsible for reading its own usage data.
Anyway, if pthread_getrusage (or whatever it'll be called) is the only
user then the syscall should require that the TID parameter is from a
thread in the same process, which would solve the security problem.
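
For reference, the "thread reads its own usage" model could look roughly like this from user space. This is a minimal sketch, assuming a kernel and C library that accept RUSAGE_THREAD in getrusage() (on glibc the constant is only visible with _GNU_SOURCE defined); the helper name own_thread_cpu_time() is hypothetical.

```c
#define _GNU_SOURCE             /* glibc exposes RUSAGE_THREAD only with this */
#include <sys/resource.h>

/*
 * Hypothetical helper: CPU time consumed so far by the calling thread
 * only, split into user and system seconds.
 * Returns 0 on success, -1 if getrusage() fails.
 */
static int own_thread_cpu_time(double *user_sec, double *sys_sec)
{
    struct rusage ru;

    /* RUSAGE_THREAD restricts the report to the calling thread, so no
     * TID lookup and no cross-process permission check is involved. */
    if (getrusage(RUSAGE_THREAD, &ru) != 0)
        return -1;

    *user_sec = ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6;
    *sys_sec  = ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
    return 0;
}
```

Each thread would call this for itself and publish the result wherever the application needs it, which sidesteps the question of looking up another task by TID.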

--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖


[RFC] Per-thread getrusage

2008-01-17 Thread Vinay Sridhar
Hi All,

Last year, there was discussion about per-thread getrusage by adding
RUSAGE_THREAD flag to getrusage(). Please refer to the thread
http://lkml.org/lkml/2007/4/4/308. Ulrich had suggested that we should
design a better user-space API. Specifically, we need a
pthread_getrusage interface in the thread library, which accepts
pthread_t, converts pthread_t into the corresponding tid and passes it
down to the syscall.

There are two ways to implement this in the kernel:
1) Introduce an additional parameter 'tid' to sys_getrusage() and put
code in glibc to handle getrusage() and pthread_getrusage() calls
correctly.
2) Introduce a new system call to handle pthread_getrusage() and leave
sys_getrusage() untouched.

We implemented the second idea above, simply because it avoids touching
any existing code. We have implemented a new syscall, thread_getrusage()
and we have exposed pthread_getrusage() API to applications.

Could you please share your thoughts on this? Does the approach look
alright? The code is hardly complete. It is just a prototype that works
on IA32 at the moment.

kernel patch : 

signed-off by : Vinay Sridhar <[EMAIL PROTECTED]>
signed-off by : Sripathi Kodi <[EMAIL PROTECTED]>

diff -Nuarp linux-2.6.24-rc6_org/arch/x86/ia32/ia32entry.S linux-2.6.24-rc6/arch/x86/ia32/ia32entry.S
--- linux-2.6.24-rc6_org/arch/x86/ia32/ia32entry.S  2008-01-10 17:16:05.0 +0530
+++ linux-2.6.24-rc6/arch/x86/ia32/ia32entry.S  2008-01-14 15:54:54.0 +0530
@@ -726,4 +726,5 @@ ia32_sys_call_table:
.quad compat_sys_timerfd
.quad sys_eventfd
.quad sys32_fallocate
+   .quad sys_thread_getrusage  /* 325 */
 ia32_syscall_end:
diff -Nuarp linux-2.6.24-rc6_org/arch/x86/kernel/syscall_table_32.S linux-2.6.24-rc6/arch/x86/kernel/syscall_table_32.S
--- linux-2.6.24-rc6_org/arch/x86/kernel/syscall_table_32.S 2008-01-10 17:16:05.0 +0530
+++ linux-2.6.24-rc6/arch/x86/kernel/syscall_table_32.S 2008-01-14 15:54:17.0 +0530
@@ -324,3 +324,5 @@ ENTRY(sys_call_table)
.long sys_timerfd
.long sys_eventfd
.long sys_fallocate
+   .long sys_thread_getrusage  /* 325 */
+
diff -Nuarp linux-2.6.24-rc6_org/include/asm-x86/unistd_32.h linux-2.6.24-rc6/include/asm-x86/unistd_32.h
--- linux-2.6.24-rc6_org/include/asm-x86/unistd_32.h  2008-01-10 17:16:13.0 +0530
+++ linux-2.6.24-rc6/include/asm-x86/unistd_32.h  2008-01-14 15:58:35.0 +0530
@@ -330,10 +330,11 @@
 #define __NR_timerfd   322
 #define __NR_eventfd   323
 #define __NR_fallocate 324
+#define __NR_thread_getrusage  325
 
 #ifdef __KERNEL__
 
-#define NR_syscalls 325
+#define NR_syscalls 326
 
 #define __ARCH_WANT_IPC_PARSE_VERSION
 #define __ARCH_WANT_OLD_READDIR
diff -Nuarp linux-2.6.24-rc6_org/include/linux/syscalls.h linux-2.6.24-rc6/include/linux/syscalls.h
--- linux-2.6.24-rc6_org/include/linux/syscalls.h   2008-01-10 17:16:15.0 +0530
+++ linux-2.6.24-rc6/include/linux/syscalls.h   2008-01-14 15:59:12.0 +0530
@@ -611,7 +611,7 @@ asmlinkage long sys_timerfd(int ufd, int
const struct itimerspec __user *utmr);
 asmlinkage long sys_eventfd(unsigned int count);
 asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len);
-
+asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru);
 int kernel_execve(const char *filename, char *const argv[], char *const envp[]);
 
 #endif
diff -Nuarp linux-2.6.24-rc6_org/kernel/sys.c linux-2.6.24-rc6/kernel/sys.c
--- linux-2.6.24-rc6_org/kernel/sys.c   2008-01-10 17:16:10.0 +0530
+++ linux-2.6.24-rc6/kernel/sys.c   2008-01-17 11:00:18.0 +0530
@@ -33,6 +33,7 @@
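 #include <linux/task_io_accounting_ops.h>
 #include <linux/seccomp.h>
 #include <linux/cpu.h>
+#include <linux/sched.h>
 
 #include <linux/compat.h>
 #include <linux/syscalls.h>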
@@ -1570,6 +1571,16 @@ static void k_getrusage(struct task_stru
}
 
switch (who) {
+   case RUSAGE_THREAD:
+   utime = p->utime;
+   stime = p->stime;
+   r->ru_nvcsw = p->nvcsw;
+   r->ru_nivcsw = p->nivcsw;
+   r->ru_minflt = p->min_flt;
+   r->ru_majflt = p->maj_flt;
+   r->ru_inblock = task_io_get_inblock(p);
+   r->ru_oublock = task_io_get_oublock(p);
+   break;
case RUSAGE_BOTH:
case RUSAGE_CHILDREN:
utime = p->signal->cutime;
@@ -1627,11 +1638,19 @@ int getrusage(struct task_struct *p, int
 
 asmlinkage long sys_getrusage(int who, struct rusage __user *ru)
 {
-   if (who != RUSAGE_SELF && who != RUSAGE_CHILDREN)
+   if (who != RUSAGE_SELF && who != RUSAGE_CHILDREN &&
+   who != RUSAGE_THREAD)
return -EINVAL;
return getrusage(current, who, ru);
 }
 
+asmlinkage long sys_thread_getrusage(int tid, struct rusage __user *ru)
+{
+   struct 
