Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU
On Tue, Apr 05, 2005 at 09:55:06AM +0800, Li Shaohua wrote:
> On Mon, 2005-04-04 at 23:33, Nathan Lynch wrote:
> > No.  It should make zero difference to the scheduler whether the "play
> > dead" cpu hotplug or "physical" hotplug is being used.
> Keeping fields like 'cpu_load' seems meaningless to me for a hot-added
> CPU. Just ignore them?

Reinitializing such things during the CPU_UP_PREPARE case in
migration_call should be sufficient, if it's not done already.

Nathan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
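A minimal user-space sketch of the suggestion above, not kernel code: stale per-CPU statistics are left alone when a CPU goes away and are cleared in the "up prepare" step of a hotplug callback. The `toy_*` names, the `TOY_EV_*` events, and the set of fields being reset are assumptions made for illustration; they only loosely mirror `migration_call`, `CPU_UP_PREPARE`, and the runqueue fields mentioned in the thread.

```c
#include <assert.h>

#define TOY_NR_CPUS 4

/* Hypothetical stand-ins for the kernel's hotplug notifier events. */
enum toy_hotplug_event { TOY_EV_UP_PREPARE, TOY_EV_ONLINE, TOY_EV_DEAD };

/* Toy runqueue holding only the fields discussed in the thread. */
struct toy_runqueue {
    unsigned long cpu_load;
    int active_balance;
    int push_cpu;
    int online;
};

static struct toy_runqueue toy_rq[TOY_NR_CPUS];

/* Toy hotplug callback: per-CPU state is reset on the way up, not on removal. */
static int toy_migration_call(enum toy_hotplug_event ev, int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    struct toy_runqueue *rq = &toy_rq[cpu];

    switch (ev) {
    case TOY_EV_UP_PREPARE:
        /* Clear whatever the previous incarnation of this CPU left behind. */
        rq->cpu_load = 0;
        rq->active_balance = 0;
        rq->push_cpu = 0;
        break;
    case TOY_EV_ONLINE:
        rq->online = 1;
        break;
    case TOY_EV_DEAD:
        /* Nothing is torn down; stale statistics stay until the next bring-up. */
        rq->online = 0;
        break;
    }
    return 0;
}
```

In this model, the same callback serves both a "play dead" removal and a physical one, because neither path needs to touch the runqueue when the CPU disappears.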
Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU
On Mon, 2005-04-04 at 23:33, Nathan Lynch wrote:
> > > I don't understand why this is needed at all.  It looks like a fair
> > > amount of code from do_exit is being duplicated here.
> > Yes, exactly. Someone who understands do_exit please help clean up the
> > code. I'd like to remove the idle thread, since the smpboot code will
> > create a new idle thread.
>
> I'd say fix the smpboot code so that it doesn't create new idle tasks
> except during boot.
I tried what you said, but I must use an ugly method to adjust
idle->thread.esp (the stack pointer on IA32); otherwise, the stack will
overflow after several rounds of hotplug. I'll take a closer look at
whether other fields in thread_info cause problems. Did you reinitialize
the idle's thread_info on ppc? I have no problem doing it on IA32, but is
this a good approach? Creating a new idle thread for an upcoming CPU looks
more graceful to me.

Thanks,
Shaohua
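The stack problem described above can be illustrated with a small user-space model. This is one plausible reading of the report, not the actual IA32 code: if a reused idle thread keeps the stack pointer it had when the CPU went down, each bring-up starts deeper in the same stack until it runs out. The `toy_*` names and both byte counts are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_BYTES   8192  /* hypothetical idle stack size */
#define TOY_BRINGUP_BYTES 3000  /* hypothetical stack consumed per bring-up */

/* Toy idle thread: only the saved stack pointer matters here.
 * It is stored as an offset from the stack base; the stack grows down. */
struct toy_idle_thread {
    size_t saved_sp;
};

/* Point the saved stack pointer back at the top of the stack. */
static void toy_reset_sp(struct toy_idle_thread *t)
{
    t->saved_sp = TOY_STACK_BYTES;
}

/* Bring the CPU up on a reused idle thread.
 * Returns 0 on success, -1 if the bring-up would overflow the stack. */
static int toy_cpu_up(struct toy_idle_thread *t, int reset_sp)
{
    assert(t != NULL);
    if (reset_sp)
        toy_reset_sp(t);
    if (t->saved_sp < TOY_BRINGUP_BYTES)
        return -1;
    t->saved_sp -= TOY_BRINGUP_BYTES;
    return 0;
}

/* Count how many hotplug rounds succeed before the stack overflows. */
static int toy_rounds_survived(int reset_sp, int max_rounds)
{
    struct toy_idle_thread idle;
    int rounds = 0;

    toy_reset_sp(&idle);
    while (rounds < max_rounds && toy_cpu_up(&idle, reset_sp) == 0)
        rounds++;
    return rounds;
}
```

With these numbers, `toy_rounds_survived(0, 100)` stops after two rounds, while `toy_rounds_survived(1, 100)` completes all 100, which is the effect the stack pointer adjustment is meant to achieve.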
Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU
Hi,
On Mon, 2005-04-04 at 23:33, Nathan Lynch wrote:
> I'd say fix the smpboot code so that it doesn't create new idle tasks
> except during boot.
I'd like the CPU hotremove case to look just like the case where the CPU
has not been booted. A non-booted CPU doesn't have an idle thread. But you
may think it's not worth doing. Anyway, I will keep the idle thread in an
updated patch, as you suggested.

> > > We've been
> > > doing cpu removal on ppc64 logical partitions for a while and never
> > > needed to do anything like this.
> > Did it remove the idle thread, or is the dead cpu in a busy idle loop?
>
> Neither.  The cpu is definitely offline, but there is no reason to
> free the idle thread.
>
> > > Maybe idle_task_exit would suffice?
> > idle_task_exit seems to just drop the mm. We need to destroy the idle
> > task for physical CPU hotplug, right?
>
> No.
>
> > > I don't understand the need for this, either.  The existing cpu
> > > hotplug notifier in the scheduler takes care of initializing the sched
> > > domains and groups appropriately for online/offline events; why do you
> > > need to touch the runqueue structures?
> > If a CPU is physically hotremoved from the system, shouldn't we clean
> > its runqueue?
>
> No.  It should make zero difference to the scheduler whether the "play
> dead" cpu hotplug or "physical" hotplug is being used.
Keeping fields like 'cpu_load' seems meaningless to me for a hot-added
CPU. Just ignore them?

Thanks,
Shaohua
Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU
Hi.

On Tue, 2005-04-05 at 08:46, Nathan Lynch wrote:
> Hi Nigel!
>
> On Tue, Apr 05, 2005 at 08:14:25AM +1000, Nigel Cunningham wrote:
> >
> > On Tue, 2005-04-05 at 01:33, Nathan Lynch wrote:
> > > > Yes, exactly. Someone who understands do_exit please help clean up the
> > > > code. I'd like to remove the idle thread, since the smpboot code will
> > > > create a new idle thread.
> > >
> > > I'd say fix the smpboot code so that it doesn't create new idle tasks
> > > except during boot.
> >
> > Would that mean that CPUs that were physically hotplugged wouldn't get
> > idle threads?
>
> No, that wouldn't work.  I am saying that there's little to gain by
> adding all this complexity for destroying the idle tasks when it's
> fairly simple to create num_possible_cpus() - 1 idle tasks* to
> accommodate any additional cpus which may come along.  This is what
> ppc64 does now, and it should be feasible on any architecture which
> supports cpu hotplug.

Ah. Ta. I was a little confused :>

Nigel

> * num_possible_cpus() - 1 because the idle task for the boot cpu is
>   created in sched_init.
-- 
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028;  Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://suspend2.net
Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU
Hi Nigel!

On Tue, Apr 05, 2005 at 08:14:25AM +1000, Nigel Cunningham wrote:
>
> On Tue, 2005-04-05 at 01:33, Nathan Lynch wrote:
> > > Yes, exactly. Someone who understands do_exit please help clean up the
> > > code. I'd like to remove the idle thread, since the smpboot code will
> > > create a new idle thread.
> >
> > I'd say fix the smpboot code so that it doesn't create new idle tasks
> > except during boot.
>
> Would that mean that CPUs that were physically hotplugged wouldn't get
> idle threads?

No, that wouldn't work.  I am saying that there's little to gain by
adding all this complexity for destroying the idle tasks when it's
fairly simple to create num_possible_cpus() - 1 idle tasks* to
accommodate any additional cpus which may come along.  This is what
ppc64 does now, and it should be feasible on any architecture which
supports cpu hotplug.

Nathan

* num_possible_cpus() - 1 because the idle task for the boot cpu is
  created in sched_init.
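The preallocation scheme described above can be sketched in user space as follows. This is a toy model under stated assumptions, not the ppc64 implementation: `TOY_POSSIBLE_CPUS` stands in for `num_possible_cpus()`, the boot CPU's idle task is a separate static object (as if created in `sched_init`), and every `toy_*` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POSSIBLE_CPUS 4  /* stand-in for num_possible_cpus() */
#define TOY_BOOT_CPU      0

/* Toy idle task: records its CPU and whether that CPU is online. */
struct toy_task {
    int cpu;
    int online;
};

/* The boot CPU's idle task exists before any others are created. */
static struct toy_task toy_boot_idle = { TOY_BOOT_CPU, 1 };

/* One idle task for every other possible CPU, created once at boot. */
static struct toy_task toy_idle_pool[TOY_POSSIBLE_CPUS - 1];
static struct toy_task *toy_idle[TOY_POSSIBLE_CPUS];
static int toy_idle_created;  /* idle tasks created besides the boot one */

/* Boot-time setup: create TOY_POSSIBLE_CPUS - 1 idle tasks up front. */
static void toy_boot(void)
{
    int cpu, slot = 0;

    toy_idle[TOY_BOOT_CPU] = &toy_boot_idle;
    toy_idle_created = 0;
    for (cpu = 0; cpu < TOY_POSSIBLE_CPUS; cpu++) {
        if (cpu == TOY_BOOT_CPU)
            continue;
        toy_idle_pool[slot].cpu = cpu;
        toy_idle_pool[slot].online = 0;
        toy_idle[cpu] = &toy_idle_pool[slot++];
        toy_idle_created++;
    }
}

/* Hot-add: reuse the preallocated idle task; nothing is created. */
static struct toy_task *toy_cpu_up(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_POSSIBLE_CPUS);
    toy_idle[cpu]->online = 1;
    return toy_idle[cpu];
}

/* Hot-remove: mark the CPU offline; the idle task is kept, not freed. */
static void toy_cpu_down(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_POSSIBLE_CPUS);
    toy_idle[cpu]->online = 0;
}
```

The trade-off in this model is a small fixed amount of memory for CPUs that may never appear, in exchange for having no idle-task teardown path at all.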
Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU
On Mon, Apr 04, 2005 at 03:46:20PM -0700, Nathan Lynch wrote:
>
>Hi Nigel!
>
>On Tue, Apr 05, 2005 at 08:14:25AM +1000, Nigel Cunningham wrote:
>>
>> On Tue, 2005-04-05 at 01:33, Nathan Lynch wrote:
>> > > Yes, exactly. Someone who understands do_exit please help clean
>
>No, that wouldn't work.  I am saying that there's little to gain by
>adding all this complexity for destroying the idle tasks when it's
>fairly simple to create num_possible_cpus() - 1 idle tasks* to
>accommodate any additional cpus which may come along.  This is what
>ppc64 does now, and it should be feasible on any architecture which
>supports cpu hotplug.
>
>Nathan
>
>* num_possible_cpus() - 1 because the idle task for the boot cpu is
>  created in sched_init.
>

On ia64 we create idle threads on demand if one is not available for the
logical cpu number, and re-use it when the same logical cpu number is
re-used.

It's just a minor improvement. I also thought about idle exit, but it
wasn't worth anything in return.

Cheers,
ashok
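The on-demand variant described above can be sketched in user space as a per-CPU cache that is filled lazily. This is an illustration only, not the ia64 code: the `toy_*` names are hypothetical, and `malloc` stands in for whatever the kernel uses to fork an idle task.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_NR_CPUS 4

/* Toy idle thread, identified by the logical CPU number it serves. */
struct toy_idle {
    int cpu;
};

/* One cached idle thread per logical CPU number; NULL until first use. */
static struct toy_idle *toy_idle_cache[TOY_NR_CPUS];
static int toy_idle_created;

/* Return the idle thread for a logical CPU, creating it only the first
 * time that CPU number is brought up. Returns NULL if allocation fails. */
static struct toy_idle *toy_get_idle(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);

    if (toy_idle_cache[cpu] == NULL) {
        struct toy_idle *t = malloc(sizeof(*t));

        if (t == NULL)
            return NULL;
        t->cpu = cpu;
        toy_idle_cache[cpu] = t;
        toy_idle_created++;
    }
    /* Later bring-ups of the same logical CPU reuse the cached thread. */
    return toy_idle_cache[cpu];
}
```

Compared with creating every idle task at boot, this model only pays for CPUs that actually show up, and in both schemes nothing is destroyed on removal.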
Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU
Hi.

On Tue, 2005-04-05 at 01:33, Nathan Lynch wrote:
> > Yes, exactly. Someone who understands do_exit please help clean up the
> > code. I'd like to remove the idle thread, since the smpboot code will
> > create a new idle thread.
>
> I'd say fix the smpboot code so that it doesn't create new idle tasks
> except during boot.

Would that mean that CPUs that were physically hotplugged wouldn't get
idle threads?

Regards,

Nigel
-- 
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com
Bus: +61 (2) 6291 9554; Hme: +61 (2) 6292 8028;  Mob: +61 (417) 100 574

Maintainer of Suspend2 Kernel Patches http://suspend2.net
Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU
On Mon, Apr 04, 2005 at 01:42:18PM +0800, Li Shaohua wrote:
> Hi,
> On Mon, 2005-04-04 at 13:28, Nathan Lynch wrote:
> > On Mon, Apr 04, 2005 at 10:07:02AM +0800, Li Shaohua wrote:
> > > Clean up all CPU states including its runqueue and idle thread,
> > > so we can use boot time code without any changes.
> > > Note this makes /sys/devices/system/cpu/cpux/online unworkable.
> >
> > In what sense does it make the online attribute unworkable?
> I removed the idle thread and other CPU state, and put the dead CPU
> into a 'halt' busy loop.
>
> > > diff -puN kernel/exit.c~cpu_state_clean kernel/exit.c
> > > --- linux-2.6.11/kernel/exit.c~cpu_state_clean	2005-03-31 10:50:27.0 +0800
> > > +++ linux-2.6.11-root/kernel/exit.c	2005-03-31 10:50:27.0 +0800
> > > @@ -845,6 +845,65 @@ fastcall NORET_TYPE void do_exit(long co
> > >  	for (;;) ;
> > >  }
> > >
> > > +#ifdef CONFIG_STR_SMP
> > > +void do_exit_idle(void)
> > > +{
> > > +	struct task_struct *tsk = current;
> > > +	int group_dead;
> > > +
> > > +	BUG_ON(tsk->pid);
> > > +	BUG_ON(tsk->mm);
> > > +
> > > +	if (tsk->io_context)
> > > +		exit_io_context();
> > > +	tsk->flags |= PF_EXITING;
> > > +	tsk->it_virt_expires = cputime_zero;
> > > +	tsk->it_prof_expires = cputime_zero;
> > > +	tsk->it_sched_expires = 0;
> > > +
> > > +	acct_update_integrals(tsk);
> > > +	update_mem_hiwater(tsk);
> > > +	group_dead = atomic_dec_and_test(&tsk->signal->live);
> > > +	if (group_dead) {
> > > +		del_timer_sync(&tsk->signal->real_timer);
> > > +		acct_process(-1);
> > > +	}
> > > +	exit_mm(tsk);
> > > +
> > > +	exit_sem(tsk);
> > > +	__exit_files(tsk);
> > > +	__exit_fs(tsk);
> > > +	exit_namespace(tsk);
> > > +	exit_thread();
> > > +	exit_keys(tsk);
> > > +
> > > +	if (group_dead && tsk->signal->leader)
> > > +		disassociate_ctty(1);
> > > +
> > > +	module_put(tsk->thread_info->exec_domain->module);
> > > +	if (tsk->binfmt)
> > > +		module_put(tsk->binfmt->module);
> > > +
> > > +	tsk->exit_code = -1;
> > > +	tsk->exit_state = EXIT_DEAD;
> > > +
> > > +	/* in release_task */
> > > +	atomic_dec(&tsk->user->processes);
> > > +	write_lock_irq(&tasklist_lock);
> > > +	__exit_signal(tsk);
> > > +	__exit_sighand(tsk);
> > > +	write_unlock_irq(&tasklist_lock);
> > > +	release_thread(tsk);
> > > +	put_task_struct(tsk);
> > > +
> > > +	tsk->flags |= PF_DEAD;
> > > +#ifdef CONFIG_NUMA
> > > +	mpol_free(tsk->mempolicy);
> > > +	tsk->mempolicy = NULL;
> > > +#endif
> > > +}
> > > +#endif
> >
> > I don't understand why this is needed at all.  It looks like a fair
> > amount of code from do_exit is being duplicated here.
> Yes, exactly. Someone who understands do_exit please help clean up the
> code. I'd like to remove the idle thread, since the smpboot code will
> create a new idle thread.

I'd say fix the smpboot code so that it doesn't create new idle tasks
except during boot.

> > We've been
> > doing cpu removal on ppc64 logical partitions for a while and never
> > needed to do anything like this.
> Did it remove the idle thread, or is the dead cpu in a busy idle loop?

Neither.  The cpu is definitely offline, but there is no reason to
free the idle thread.

> > Maybe idle_task_exit would suffice?
> idle_task_exit seems to just drop the mm. We need to destroy the idle
> task for physical CPU hotplug, right?

No.

> > I don't understand the need for this, either.  The existing cpu
> > hotplug notifier in the scheduler takes care of initializing the sched
> > domains and groups appropriately for online/offline events; why do you
> > need to touch the runqueue structures?
> If a CPU is physically hotremoved from the system, shouldn't we clean
> its runqueue?

No.  It should make zero difference to the scheduler whether the "play
dead" cpu hotplug or "physical" hotplug is being used.

Nathan
Re: [ACPI] Re: [RFC 5/6]clean cpu state after hotremove CPU
Hi,
On Mon, 2005-04-04 at 13:28, Nathan Lynch wrote:
> On Mon, Apr 04, 2005 at 10:07:02AM +0800, Li Shaohua wrote:
> > Clean up all CPU states including its runqueue and idle thread,
> > so we can use boot time code without any changes.
> > Note this makes /sys/devices/system/cpu/cpux/online unworkable.
>
> In what sense does it make the online attribute unworkable?
I removed the idle thread and other CPU state, and put the dead CPU
into a 'halt' busy loop.

> > diff -puN kernel/exit.c~cpu_state_clean kernel/exit.c
> > --- linux-2.6.11/kernel/exit.c~cpu_state_clean	2005-03-31 10:50:27.0 +0800
> > +++ linux-2.6.11-root/kernel/exit.c	2005-03-31 10:50:27.0 +0800
> > @@ -845,6 +845,65 @@ fastcall NORET_TYPE void do_exit(long co
> >  	for (;;) ;
> >  }
> >
> > +#ifdef CONFIG_STR_SMP
> > +void do_exit_idle(void)
> > +{
> > +	struct task_struct *tsk = current;
> > +	int group_dead;
> > +
> > +	BUG_ON(tsk->pid);
> > +	BUG_ON(tsk->mm);
> > +
> > +	if (tsk->io_context)
> > +		exit_io_context();
> > +	tsk->flags |= PF_EXITING;
> > +	tsk->it_virt_expires = cputime_zero;
> > +	tsk->it_prof_expires = cputime_zero;
> > +	tsk->it_sched_expires = 0;
> > +
> > +	acct_update_integrals(tsk);
> > +	update_mem_hiwater(tsk);
> > +	group_dead = atomic_dec_and_test(&tsk->signal->live);
> > +	if (group_dead) {
> > +		del_timer_sync(&tsk->signal->real_timer);
> > +		acct_process(-1);
> > +	}
> > +	exit_mm(tsk);
> > +
> > +	exit_sem(tsk);
> > +	__exit_files(tsk);
> > +	__exit_fs(tsk);
> > +	exit_namespace(tsk);
> > +	exit_thread();
> > +	exit_keys(tsk);
> > +
> > +	if (group_dead && tsk->signal->leader)
> > +		disassociate_ctty(1);
> > +
> > +	module_put(tsk->thread_info->exec_domain->module);
> > +	if (tsk->binfmt)
> > +		module_put(tsk->binfmt->module);
> > +
> > +	tsk->exit_code = -1;
> > +	tsk->exit_state = EXIT_DEAD;
> > +
> > +	/* in release_task */
> > +	atomic_dec(&tsk->user->processes);
> > +	write_lock_irq(&tasklist_lock);
> > +	__exit_signal(tsk);
> > +	__exit_sighand(tsk);
> > +	write_unlock_irq(&tasklist_lock);
> > +	release_thread(tsk);
> > +	put_task_struct(tsk);
> > +
> > +	tsk->flags |= PF_DEAD;
> > +#ifdef CONFIG_NUMA
> > +	mpol_free(tsk->mempolicy);
> > +	tsk->mempolicy = NULL;
> > +#endif
> > +}
> > +#endif
>
> I don't understand why this is needed at all.  It looks like a fair
> amount of code from do_exit is being duplicated here.
Yes, exactly. Someone who understands do_exit please help clean up the
code. I'd like to remove the idle thread, since the smpboot code will
create a new idle thread.

> We've been
> doing cpu removal on ppc64 logical partitions for a while and never
> needed to do anything like this.
Did it remove the idle thread, or is the dead cpu in a busy idle loop?

> Maybe idle_task_exit would suffice?
idle_task_exit seems to just drop the mm. We need to destroy the idle
task for physical CPU hotplug, right?

> > diff -puN kernel/sched.c~cpu_state_clean kernel/sched.c
> > --- linux-2.6.11/kernel/sched.c~cpu_state_clean	2005-03-31 10:50:27.0 +0800
> > +++ linux-2.6.11-root/kernel/sched.c	2005-04-04 09:06:40.362357104 +0800
> > @@ -4028,6 +4028,58 @@ void __devinit init_idle(task_t *idle, i
> >  }
> >
> >  /*
> > + * Initial dummy domain for early boot and for hotplug cpu. Being static,
> > + * it is initialized to zero, so all balancing flags are cleared which is
> > + * what we want.
> > + */
> > +static struct sched_domain sched_domain_dummy;
> > +
> > +#ifdef CONFIG_STR_SMP
> > +static void __devinit exit_idle(int cpu)
> > +{
> > +	runqueue_t *rq = cpu_rq(cpu);
> > +	struct task_struct *p = rq->idle;
> > +	int j, k;
> > +	prio_array_t *array;
> > +
> > +	/* init runqueue */
> > +	spin_lock_init(&rq->lock);
> > +	rq->active = rq->arrays;
> > +	rq->expired = rq->arrays + 1;
> > +	rq->best_expired_prio = MAX_PRIO;
> > +
> > +	rq->prev_mm = NULL;
> > +	rq->curr = rq->idle = NULL;
> > +	rq->expired_timestamp = 0;
> > +
> > +	rq->sd = &sched_domain_dummy;
> > +	rq->cpu_load = 0;
> > +	rq->active_balance = 0;
> > +	rq->push_cpu = 0;
> > +	rq->migration_thread = NULL;
> > +	INIT_LIST_HEAD(&rq->migration_queue);
> > +	atomic_set(&rq->nr_iowait, 0);
> > +
> > +	for (j = 0; j < 2; j++) {
> > +		array = rq->arrays + j;
> > +		for (k = 0; k < MAX_PRIO; k++) {
> > +			INIT_LIST_HEAD(array->queue + k);
> > +			__clear_bit(k, array->bitmap);
> > +		}
> > +		// delimiter for bitsearch
> > +		__set_bit(MAX_PRIO, array->bitmap);
> > +	}
> > +	/* Destroy IDLE thread.
> > +	 * it's safe now, the CPU is in busy loop
> > +	 */
> > +	if (p->active_mm)
> > +		mmdrop(p->active_mm);
> > +	p->active_mm = NULL;
> > +	put_task_struct(p);
> > +}
> > +#endif
> > +
> > +/*
> >   * In a system that switches off the HZ timer nohz_cpu_mask
> >   * indicates which cpus entered this state. This is used
> >   * in the rcu update to wait only for active cpus. For system
> > @@ -4432,6 +4484,9 @@ static int migration_call(struct notifie
> >  			complete(&req->done);
> >  		}
> >  		spin_unlock_irq(&rq->lock);
> > +#ifdef