Re: [PATCH -mm] scheduler: fix the return of the first time_slice

2007-04-16 Thread Satoru Takeuchi
At Mon, 16 Apr 2007 14:45:59 +0400,
Oleg Nesterov wrote:
> 
> On 04/16, Satoru Takeuchi wrote:
> >
> > --- linux-2.6.21-rc6-mm1.tsfix.orig/kernel/exit.c   2007-04-15 
> > 16:56:12.0 +0900
> > +++ linux-2.6.21-rc6-mm1.tsfix/kernel/exit.c2007-04-15 
> > 16:56:17.0 +0900
> > @@ -680,10 +680,14 @@
> > reaper = child_reaper(father);
> > break;
> > }
> > +   if (reaper->time_slice_reaper == father)
> > +   reaper->time_slice_reaper = NULL;
> > } while (reaper->exit_state);
> >  
> > list_for_each_safe(_p, _n, >children) {
> > p = list_entry(_p, struct task_struct, sibling);
> > +   if (p->time_slice_reaper == father)
> > +   p->time_slice_reaper = NULL;
> > choose_new_parent(p, reaper);
> > reparent_thread(p, father);
> > }
> > Index: linux-2.6.21-rc6-mm1.tsfix/kernel/sched.c
> > ===
> > --- linux-2.6.21-rc6-mm1.tsfix.orig/kernel/sched.c  2007-04-15 
> > 16:56:12.0 +0900
> > +++ linux-2.6.21-rc6-mm1.tsfix/kernel/sched.c   2007-04-15 
> > 16:56:17.0 +0900
> > @@ -1693,7 +1693,7 @@
> >  * The remainder of the first timeslice might be recovered by
> >  * the parent if the child exits early enough.
> >  */
> > -   p->first_time_slice = 1;
> > +   p->time_slice_reaper = current;
> > p->timestamp = sched_clock();
> > local_irq_enable();
> 
> I am afraid this doesn't work for CLONE_THREAD. Suppose that some sub-thread 
> (not
> main thread) T1 creates another sub-thread, T2.
> 
>   T2->time_slice_reaper = T1.
> 
> then T1 exits, but forget_original_parent(T1) doesn't change 
> T2->time_slice_reaper,
> because T2 is not on T1->children list. Now T2->time_slice_reaper points to an
> already freed task_struct.

I intended that CLONE_THREAD case is managed on the do-while loop seeking 
`reaper'.
However, anyway, my patch has a bug. On T1 exiting, if there are any living 
threads
ahead of T2, T2->time_slice_reaper isn't checked. What a easy mistake ... sorry.

I'll post fixed version soon.

Satoru
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm] scheduler: fix the return of the first time_slice

2007-04-16 Thread Ingo Molnar

* Oleg Nesterov <[EMAIL PROTECTED]> wrote:

> > >* The remainder of the first timeslice might be recovered by
> > >* the parent if the child exits early enough.
> > >*/
> > > - p->first_time_slice = 1;
> > > + p->time_slice_reaper = current;
> > >   p->timestamp = sched_clock();
> > >   local_irq_enable();
> > 
> > I am afraid this doesn't work for CLONE_THREAD. Suppose that some 
> > sub-thread (not main thread) T1 creates another sub-thread, T2.

> In case I was not clear...

> To make this correct, we should iterate over all thread-group, but 
> this can slow down exit() when we have a lot of threads.
> 
> I guess we need Ingo's opinion on that.

right now my first cautious estimation seems to be that we might be able 
to get rid of this whole child/parent timeslice sharing complexity and 
do all the scheduling setup without affecting the parent - hence 
avoiding all the reaper problems as well. People reported interactivity 
improvements with this removed from CFS. (It all still needs a ton of 
validation to make sure, but the trend seems to be this.)

(the only valid component of that complexity is 'child runs first' - but 
it's not really related to the timesplice splitting thing just 
intermixed with it.)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm] scheduler: fix the return of the first time_slice

2007-04-16 Thread Oleg Nesterov
On 04/16, Oleg Nesterov wrote:
>
> On 04/16, Satoru Takeuchi wrote:
> >
> > --- linux-2.6.21-rc6-mm1.tsfix.orig/kernel/exit.c   2007-04-15 
> > 16:56:12.0 +0900
> > +++ linux-2.6.21-rc6-mm1.tsfix/kernel/exit.c2007-04-15 
> > 16:56:17.0 +0900
> > @@ -680,10 +680,14 @@
> > reaper = child_reaper(father);
> > break;
> > }
> > +   if (reaper->time_slice_reaper == father)
> > +   reaper->time_slice_reaper = NULL;
> > } while (reaper->exit_state);
> >  
> > list_for_each_safe(_p, _n, >children) {
> > p = list_entry(_p, struct task_struct, sibling);
> > +   if (p->time_slice_reaper == father)
> > +   p->time_slice_reaper = NULL;
> > choose_new_parent(p, reaper);
> > reparent_thread(p, father);
> > }
> > Index: linux-2.6.21-rc6-mm1.tsfix/kernel/sched.c
> > ===
> > --- linux-2.6.21-rc6-mm1.tsfix.orig/kernel/sched.c  2007-04-15 
> > 16:56:12.0 +0900
> > +++ linux-2.6.21-rc6-mm1.tsfix/kernel/sched.c   2007-04-15 
> > 16:56:17.0 +0900
> > @@ -1693,7 +1693,7 @@
> >  * The remainder of the first timeslice might be recovered by
> >  * the parent if the child exits early enough.
> >  */
> > -   p->first_time_slice = 1;
> > +   p->time_slice_reaper = current;
> > p->timestamp = sched_clock();
> > local_irq_enable();
> 
> I am afraid this doesn't work for CLONE_THREAD. Suppose that some sub-thread 
> (not
> main thread) T1 creates another sub-thread, T2.
> 
>   T2->time_slice_reaper = T1.
> 
> then T1 exits, but forget_original_parent(T1) doesn't change 
> T2->time_slice_reaper,
> because T2 is not on T1->children list. Now T2->time_slice_reaper points to an
> already freed task_struct.

In case I was not clear...

Note that "do {} while ()" loop in forget_original_parent() stops when it finds
a sub-thread with ->exit_state == 0. This means we can miss another sub-thread
which has ->time_slice_reaper == father.

To make this correct, we should iterate over all thread-group, but this can slow
down exit() when we have a lot of threads.

I guess we need Ingo's opinion on that.

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm] scheduler: fix the return of the first time_slice

2007-04-16 Thread Oleg Nesterov
On 04/16, Satoru Takeuchi wrote:
>
> --- linux-2.6.21-rc6-mm1.tsfix.orig/kernel/exit.c 2007-04-15 
> 16:56:12.0 +0900
> +++ linux-2.6.21-rc6-mm1.tsfix/kernel/exit.c  2007-04-15 16:56:17.0 
> +0900
> @@ -680,10 +680,14 @@
>   reaper = child_reaper(father);
>   break;
>   }
> + if (reaper->time_slice_reaper == father)
> + reaper->time_slice_reaper = NULL;
>   } while (reaper->exit_state);
>  
>   list_for_each_safe(_p, _n, >children) {
>   p = list_entry(_p, struct task_struct, sibling);
> + if (p->time_slice_reaper == father)
> + p->time_slice_reaper = NULL;
>   choose_new_parent(p, reaper);
>   reparent_thread(p, father);
>   }
> Index: linux-2.6.21-rc6-mm1.tsfix/kernel/sched.c
> ===
> --- linux-2.6.21-rc6-mm1.tsfix.orig/kernel/sched.c2007-04-15 
> 16:56:12.0 +0900
> +++ linux-2.6.21-rc6-mm1.tsfix/kernel/sched.c 2007-04-15 16:56:17.0 
> +0900
> @@ -1693,7 +1693,7 @@
>* The remainder of the first timeslice might be recovered by
>* the parent if the child exits early enough.
>*/
> - p->first_time_slice = 1;
> + p->time_slice_reaper = current;
>   p->timestamp = sched_clock();
>   local_irq_enable();

I am afraid this doesn't work for CLONE_THREAD. Suppose that some sub-thread 
(not
main thread) T1 creates another sub-thread, T2.

T2->time_slice_reaper = T1.

then T1 exits, but forget_original_parent(T1) doesn't change 
T2->time_slice_reaper,
because T2 is not on T1->children list. Now T2->time_slice_reaper points to an
already freed task_struct.

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm] scheduler: fix the return of the first time_slice

2007-04-16 Thread Oleg Nesterov
On 04/16, Satoru Takeuchi wrote:

 --- linux-2.6.21-rc6-mm1.tsfix.orig/kernel/exit.c 2007-04-15 
 16:56:12.0 +0900
 +++ linux-2.6.21-rc6-mm1.tsfix/kernel/exit.c  2007-04-15 16:56:17.0 
 +0900
 @@ -680,10 +680,14 @@
   reaper = child_reaper(father);
   break;
   }
 + if (reaper-time_slice_reaper == father)
 + reaper-time_slice_reaper = NULL;
   } while (reaper-exit_state);
  
   list_for_each_safe(_p, _n, father-children) {
   p = list_entry(_p, struct task_struct, sibling);
 + if (p-time_slice_reaper == father)
 + p-time_slice_reaper = NULL;
   choose_new_parent(p, reaper);
   reparent_thread(p, father);
   }
 Index: linux-2.6.21-rc6-mm1.tsfix/kernel/sched.c
 ===
 --- linux-2.6.21-rc6-mm1.tsfix.orig/kernel/sched.c2007-04-15 
 16:56:12.0 +0900
 +++ linux-2.6.21-rc6-mm1.tsfix/kernel/sched.c 2007-04-15 16:56:17.0 
 +0900
 @@ -1693,7 +1693,7 @@
* The remainder of the first timeslice might be recovered by
* the parent if the child exits early enough.
*/
 - p-first_time_slice = 1;
 + p-time_slice_reaper = current;
   p-timestamp = sched_clock();
   local_irq_enable();

I am afraid this doesn't work for CLONE_THREAD. Suppose that some sub-thread 
(not
main thread) T1 creates another sub-thread, T2.

T2-time_slice_reaper = T1.

then T1 exits, but forget_original_parent(T1) doesn't change 
T2-time_slice_reaper,
because T2 is not on T1-children list. Now T2-time_slice_reaper points to an
already freed task_struct.

Oleg.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm] scheduler: fix the return of the first time_slice

2007-04-16 Thread Oleg Nesterov
On 04/16, Oleg Nesterov wrote:

 On 04/16, Satoru Takeuchi wrote:
 
  --- linux-2.6.21-rc6-mm1.tsfix.orig/kernel/exit.c   2007-04-15 
  16:56:12.0 +0900
  +++ linux-2.6.21-rc6-mm1.tsfix/kernel/exit.c2007-04-15 
  16:56:17.0 +0900
  @@ -680,10 +680,14 @@
  reaper = child_reaper(father);
  break;
  }
  +   if (reaper-time_slice_reaper == father)
  +   reaper-time_slice_reaper = NULL;
  } while (reaper-exit_state);
   
  list_for_each_safe(_p, _n, father-children) {
  p = list_entry(_p, struct task_struct, sibling);
  +   if (p-time_slice_reaper == father)
  +   p-time_slice_reaper = NULL;
  choose_new_parent(p, reaper);
  reparent_thread(p, father);
  }
  Index: linux-2.6.21-rc6-mm1.tsfix/kernel/sched.c
  ===
  --- linux-2.6.21-rc6-mm1.tsfix.orig/kernel/sched.c  2007-04-15 
  16:56:12.0 +0900
  +++ linux-2.6.21-rc6-mm1.tsfix/kernel/sched.c   2007-04-15 
  16:56:17.0 +0900
  @@ -1693,7 +1693,7 @@
   * The remainder of the first timeslice might be recovered by
   * the parent if the child exits early enough.
   */
  -   p-first_time_slice = 1;
  +   p-time_slice_reaper = current;
  p-timestamp = sched_clock();
  local_irq_enable();
 
 I am afraid this doesn't work for CLONE_THREAD. Suppose that some sub-thread 
 (not
 main thread) T1 creates another sub-thread, T2.
 
   T2-time_slice_reaper = T1.
 
 then T1 exits, but forget_original_parent(T1) doesn't change 
 T2-time_slice_reaper,
 because T2 is not on T1-children list. Now T2-time_slice_reaper points to an
 already freed task_struct.

In case I was not clear...

Note that do {} while () loop in forget_original_parent() stops when it finds
a sub-thread with -exit_state == 0. This means we can miss another sub-thread
which has -time_slice_reaper == father.

To make this correct, we should iterate over all thread-group, but this can slow
down exit() when we have a lot of threads.

I guess we need Ingo's opinion on that.

Oleg.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm] scheduler: fix the return of the first time_slice

2007-04-16 Thread Ingo Molnar

* Oleg Nesterov [EMAIL PROTECTED] wrote:

  * The remainder of the first timeslice might be recovered by
  * the parent if the child exits early enough.
  */
   - p-first_time_slice = 1;
   + p-time_slice_reaper = current;
 p-timestamp = sched_clock();
 local_irq_enable();
  
  I am afraid this doesn't work for CLONE_THREAD. Suppose that some 
  sub-thread (not main thread) T1 creates another sub-thread, T2.

 In case I was not clear...

 To make this correct, we should iterate over all thread-group, but 
 this can slow down exit() when we have a lot of threads.
 
 I guess we need Ingo's opinion on that.

right now my first cautious estimation seems to be that we might be able 
to get rid of this whole child/parent timeslice sharing complexity and 
do all the scheduling setup without affecting the parent - hence 
avoiding all the reaper problems as well. People reported interactivity 
improvements with this removed from CFS. (It all still needs a ton of 
validation to make sure, but the trend seems to be this.)

(the only valid component of that complexity is 'child runs first' - but 
it's not really related to the timesplice splitting thing just 
intermixed with it.)

Ingo
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH -mm] scheduler: fix the return of the first time_slice

2007-04-16 Thread Satoru Takeuchi
At Mon, 16 Apr 2007 14:45:59 +0400,
Oleg Nesterov wrote:
 
 On 04/16, Satoru Takeuchi wrote:
 
  --- linux-2.6.21-rc6-mm1.tsfix.orig/kernel/exit.c   2007-04-15 
  16:56:12.0 +0900
  +++ linux-2.6.21-rc6-mm1.tsfix/kernel/exit.c2007-04-15 
  16:56:17.0 +0900
  @@ -680,10 +680,14 @@
  reaper = child_reaper(father);
  break;
  }
  +   if (reaper-time_slice_reaper == father)
  +   reaper-time_slice_reaper = NULL;
  } while (reaper-exit_state);
   
  list_for_each_safe(_p, _n, father-children) {
  p = list_entry(_p, struct task_struct, sibling);
  +   if (p-time_slice_reaper == father)
  +   p-time_slice_reaper = NULL;
  choose_new_parent(p, reaper);
  reparent_thread(p, father);
  }
  Index: linux-2.6.21-rc6-mm1.tsfix/kernel/sched.c
  ===
  --- linux-2.6.21-rc6-mm1.tsfix.orig/kernel/sched.c  2007-04-15 
  16:56:12.0 +0900
  +++ linux-2.6.21-rc6-mm1.tsfix/kernel/sched.c   2007-04-15 
  16:56:17.0 +0900
  @@ -1693,7 +1693,7 @@
   * The remainder of the first timeslice might be recovered by
   * the parent if the child exits early enough.
   */
  -   p-first_time_slice = 1;
  +   p-time_slice_reaper = current;
  p-timestamp = sched_clock();
  local_irq_enable();
 
 I am afraid this doesn't work for CLONE_THREAD. Suppose that some sub-thread 
 (not
 main thread) T1 creates another sub-thread, T2.
 
   T2-time_slice_reaper = T1.
 
 then T1 exits, but forget_original_parent(T1) doesn't change 
 T2-time_slice_reaper,
 because T2 is not on T1-children list. Now T2-time_slice_reaper points to an
 already freed task_struct.

I intended that CLONE_THREAD case is managed on the do-while loop seeking 
`reaper'.
However, anyway, my patch has a bug. On T1 exiting, if there are any living 
threads
ahead of T2, T2-time_slice_reaper isn't checked. What a easy mistake ... sorry.

I'll post fixed version soon.

Satoru
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/