[RFC 1/4] memblock: Introduce a for_each_reserved_mem_region iterator.
As part of initializing struct pages in 2MiB chunks, we noticed that at
the end of free_all_bootmem(), nothing had forced the reserved/allocated
4KiB pages to be initialized.

This helper function will be used for that expansion.

Signed-off-by: Robin Holt <h...@sgi.com>
Signed-off-by: Nate Zimmer <nzim...@sgi.com>
To: H. Peter Anvin <h...@zytor.com>
To: Ingo Molnar <mi...@kernel.org>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Cc: Linux MM <linux...@kvack.org>
Cc: Rob Landley <r...@landley.net>
Cc: Mike Travis <tra...@sgi.com>
Cc: Daniel J Blueman <dan...@numascale-asia.com>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: Greg KH <gre...@linuxfoundation.org>
Cc: Yinghai Lu <ying...@kernel.org>
Cc: Mel Gorman <mgor...@suse.de>
---
 include/linux/memblock.h | 18 ++
 mm/memblock.c            | 32
 2 files changed, 50 insertions(+)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index f388203..e99bbd1 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -118,6 +118,24 @@ void __next_free_mem_range_rev(u64 *idx, int nid, phys_addr_t *out_start,
 	     i != (u64)ULLONG_MAX;					\
 	     __next_free_mem_range_rev(&i, nid, p_start, p_end, p_nid))
 
+void __next_reserved_mem_region(u64 *idx, phys_addr_t *out_start,
+				phys_addr_t *out_end);
+
+/**
+ * for_each_reserved_mem_region - iterate over all reserved memblock areas
+ * @i: u64 used as loop variable
+ * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
+ * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
+ *
+ * Walks over reserved areas of memblock. Available as soon as memblock
+ * is initialized.
+ */
+#define for_each_reserved_mem_region(i, p_start, p_end)			\
+	for (i = 0UL,							\
+	     __next_reserved_mem_region(&i, p_start, p_end);		\
+	     i != (u64)ULLONG_MAX;					\
+	     __next_reserved_mem_region(&i, p_start, p_end))
+
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 int memblock_set_node(phys_addr_t base, phys_addr_t size, int nid);
 
diff --git a/mm/memblock.c b/mm/memblock.c
index c5fad93..0d7d6e7 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -564,6 +564,38 @@ int __init_memblock memblock_reserve(phys_addr_t base, phys_addr_t size)
 }
 
 /**
+ * __next_reserved_mem_region - next function for for_each_reserved_mem_region()
+ * @idx: pointer to u64 loop variable
+ * @out_start: ptr to phys_addr_t for start address of the region, can be %NULL
+ * @out_end: ptr to phys_addr_t for end address of the region, can be %NULL
+ *
+ * Iterate over all reserved memory regions.
+ */
+void __init_memblock __next_reserved_mem_region(u64 *idx,
+					   phys_addr_t *out_start,
+					   phys_addr_t *out_end)
+{
+	struct memblock_type *rsv = &memblock.reserved;
+
+	if (*idx >= 0 && *idx < rsv->cnt) {
+		struct memblock_region *r = &rsv->regions[*idx];
+		phys_addr_t base = r->base;
+		phys_addr_t size = r->size;
+
+		if (out_start)
+			*out_start = base;
+		if (out_end)
+			*out_end = base + size - 1;
+
+		*idx += 1;
+		return;
+	}
+
+	/* signal end of iteration */
+	*idx = ULLONG_MAX;
+}
+
+/**
  * __next_free_mem_range - next function for for_each_free_mem_range()
  * @idx: pointer to u64 loop variable
  * @nid: nid: node selector, %MAX_NUMNODES for all nodes
-- 
1.8.2.1
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
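For readers who want to try the iteration scheme outside the kernel, here is a minimal userspace sketch of the same cursor-plus-sentinel pattern. Everything in it (struct region, struct region_table, next_region(), for_each_region(), total_bytes(), ITER_END) is a hypothetical stand-in, not the memblock API. It assumes, as the patch does, that the reported end address is inclusive. The "*idx >= 0" test from the patch is omitted because it is always true for an unsigned cursor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's memblock region types. */
struct region {
	uint64_t base;
	uint64_t size;
};

struct region_table {
	size_t cnt;
	const struct region *regions;
};

#define ITER_END UINT64_MAX	/* plays the role of ULLONG_MAX in the patch */

/* Report the region at *idx and advance, or set *idx to ITER_END when done. */
static void next_region(const struct region_table *t, uint64_t *idx,
			uint64_t *out_start, uint64_t *out_end)
{
	if (*idx < t->cnt) {
		const struct region *r = &t->regions[*idx];

		if (out_start)
			*out_start = r->base;
		if (out_end)
			*out_end = r->base + r->size - 1;	/* inclusive end */
		*idx += 1;
		return;
	}

	/* signal end of iteration */
	*idx = ITER_END;
}

/* The macro passes the address of the cursor so the helper can update it. */
#define for_each_region(t, i, p_start, p_end) \
	for ((i) = 0, next_region((t), &(i), (p_start), (p_end)); \
	     (i) != ITER_END; \
	     next_region((t), &(i), (p_start), (p_end)))

/* Example walker: sum the sizes of all regions in the table. */
static uint64_t total_bytes(const struct region_table *t)
{
	uint64_t i, start, end, total = 0;

	for_each_region(t, i, &start, &end)
		total += end - start + 1;
	return total;
}
```

Note that inside the loop body the cursor already points one past the region being reported, so it should be treated as opaque rather than as an array index.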
[RFC 2/4] Have __free_pages_memory() free in larger chunks.
Currently, when free_all_bootmem() calls __free_pages_memory(), the
number of contiguous pages that __free_pages_memory() passes to the
buddy allocator is limited to BITS_PER_LONG.  In order to be able to
free only the first page of a 2MiB chunk, we need that to be increased
to PTRS_PER_PMD.

Signed-off-by: Robin Holt <h...@sgi.com>
Signed-off-by: Nate Zimmer <nzim...@sgi.com>
To: H. Peter Anvin <h...@zytor.com>
To: Ingo Molnar <mi...@kernel.org>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Cc: Linux MM <linux...@kvack.org>
Cc: Rob Landley <r...@landley.net>
Cc: Mike Travis <tra...@sgi.com>
Cc: Daniel J Blueman <dan...@numascale-asia.com>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: Greg KH <gre...@linuxfoundation.org>
Cc: Yinghai Lu <ying...@kernel.org>
Cc: Mel Gorman <mgor...@suse.de>
---
 mm/nobootmem.c | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index bdd3fa2..3b512ca 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -83,10 +83,10 @@ void __init free_bootmem_late(unsigned long addr, unsigned long size)
 static void __init __free_pages_memory(unsigned long start, unsigned long end)
 {
 	unsigned long i, start_aligned, end_aligned;
-	int order = ilog2(BITS_PER_LONG);
+	int order = ilog2(max(BITS_PER_LONG, PTRS_PER_PMD));
 
-	start_aligned = (start + (BITS_PER_LONG - 1)) & ~(BITS_PER_LONG - 1);
-	end_aligned = end & ~(BITS_PER_LONG - 1);
+	start_aligned = (start + ((1UL << order) - 1)) & ~((1UL << order) - 1);
+	end_aligned = end & ~((1UL << order) - 1);
 
 	if (end_aligned <= start_aligned) {
 		for (i = start; i < end; i++)
@@ -98,7 +98,7 @@ static void __init __free_pages_memory(unsigned long start, unsigned long end)
 	for (i = start; i < start_aligned; i++)
 		__free_pages_bootmem(pfn_to_page(i), 0);
 
-	for (i = start_aligned; i < end_aligned; i += BITS_PER_LONG)
+	for (i = start_aligned; i < end_aligned; i += 1 << order)
 		__free_pages_bootmem(pfn_to_page(i), order);
 
 	for (i = end_aligned; i < end; i++)
-- 
1.8.2.1
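The alignment arithmetic in this patch can be checked in isolation. The following is a rough userspace sketch, not the kernel function: struct split and split_range() are hypothetical names, and the code only counts how a pfn range [start, end) would be divided into a page-by-page head, whole 2^order-page chunks, and a page-by-page tail. It assumes end >= start.

```c
#include <assert.h>

/* How a pfn range [start, end) is split for a given order (hypothetical). */
struct split {
	unsigned long head_pages;	/* freed singly before the first boundary */
	unsigned long chunks;		/* freed as whole 2^order-page blocks */
	unsigned long tail_pages;	/* freed singly after the last boundary */
};

static struct split split_range(unsigned long start, unsigned long end,
				unsigned int order)
{
	unsigned long mask = (1UL << order) - 1;
	unsigned long start_aligned = (start + mask) & ~mask;	/* round up */
	unsigned long end_aligned = end & ~mask;		/* round down */
	struct split s = { 0, 0, 0 };

	if (end_aligned <= start_aligned) {
		/* No whole aligned chunk fits: every page goes singly. */
		s.head_pages = end - start;
		return s;
	}

	s.head_pages = start_aligned - start;
	s.chunks = (end_aligned - start_aligned) >> order;
	s.tail_pages = end - end_aligned;
	return s;
}
```

With order 9 (512 pages, which is a 2MiB chunk when pages are 4KiB), the range [100, 2000) splits into 412 head pages, 2 chunks, and 464 tail pages, which together account for all 1900 pages.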
[RFC 0/4] Transparent on-demand struct page initialization embedded in the buddy allocator
We have been working on this since we returned from shutdown and have
something to discuss now.  We restricted ourselves to 2MiB initialization
to keep the patch set a little smaller and clearer.

First, I think I want to propose getting rid of the page flag.  If I knew
of a concrete way to determine that the page has not been initialized,
this patch series would look different.  If there is no definitive
way to determine that the struct page has been initialized aside from
checking that the entire page struct is zero, then I think I would suggest
we change the page flag to indicate the page has been initialized.

The heart of the problem as I see it comes from expand().  We nearly
always see a first reference to a struct page which is in the middle
of the 2MiB region.  Due to that access, the unlikely() check that was
originally proposed really ends up referencing a different page entirely.
We actually did not introduce an unlikely and refactor the patches to
make that unlikely inside a static inline function.  Also, given the
strong warning at the head of expand(), we did not feel experienced
enough to refactor it to make things always reference the 2MiB page
first.

With this patch, we did boot a 16TiB machine.  Without the patches, the
v3.10 kernel with the same configuration took 407 seconds for
free_all_bootmem.  With the patches and operating on 2MiB pages instead
of 1GiB, it took 26 seconds, more than 15 times faster.  I have no feel
for how the 1GiB chunk size will perform.

I am on vacation for the next three days so I am sorry in advance for
my infrequent or non-existent responses.

Signed-off-by: Robin Holt <h...@sgi.com>
Signed-off-by: Nate Zimmer <nzim...@sgi.com>
To: H. Peter Anvin <h...@zytor.com>
To: Ingo Molnar <mi...@kernel.org>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>
Cc: Linux MM <linux...@kvack.org>
Cc: Rob Landley <r...@landley.net>
Cc: Mike Travis <tra...@sgi.com>
Cc: Daniel J Blueman <dan...@numascale-asia.com>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: Greg KH <gre...@linuxfoundation.org>
Cc: Yinghai Lu <ying...@kernel.org>
Cc: Mel Gorman <mgor...@suse.de>
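To make the "first reference lands in the middle of the 2MiB region" point concrete, here is a toy userspace model of on-demand group initialization. It is only a sketch under stated assumptions and is not the code from this series: struct fake_page, get_page_lazy(), GROUP_PAGES and the "initialized" marker kept on the first page of each group are all hypothetical. It does follow the suggestion above that the flag should mean "initialized", so zero-filled memory reads as not yet initialized.

```c
#include <assert.h>
#include <stdbool.h>

#define GROUP_PAGES 512UL		/* 2MiB / 4KiB; hypothetical constant */
#define NR_PAGES (4 * GROUP_PAGES)

/* Toy stand-in for struct page. */
struct fake_page {
	bool initialized;	/* meaningful only on the first page of a group */
	unsigned long pfn;
};

static struct fake_page pages[NR_PAGES];	/* zero-filled: nothing initialized */
static unsigned long groups_initialized;

/*
 * Return the page for pfn, initializing its whole aligned group on first
 * touch.  The check reads the group's first page, which is usually a
 * different struct than the one the caller asked for.
 * The caller must pass pfn < NR_PAGES.
 */
static struct fake_page *get_page_lazy(unsigned long pfn)
{
	unsigned long head = pfn & ~(GROUP_PAGES - 1);	/* round down */
	unsigned long i;

	if (!pages[head].initialized) {
		for (i = head; i < head + GROUP_PAGES; i++)
			pages[i].pfn = i;
		pages[head].initialized = true;
		groups_initialized++;
	}
	return &pages[pfn];
}
```

A reference to pfn 700 initializes the group starting at pfn 512 and nothing else. A later reference to pfn 513 finds the group already initialized.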
Re: [PATCH v2] Make transparent hugepages cpuset aware
On Wed, Jun 19, 2013 at 02:24:07PM -0700, David Rientjes wrote:
> On Wed, 19 Jun 2013, Robin Holt wrote:
>
> > The convenience being that many batch schedulers have added cpuset
> > support. They create the cpuset's and configure them as appropriate
> > for the job as determined by a mixture of input from the submitting
> > user but still under the control of the administrator. That seems like
> > a fairly significant convenience given that it took years to get the
> > batch schedulers to adopt cpusets in the first place. At this point,
> > expanding their use of cpusets is under the control of the system
> > administrator and would not require any additional development on
> > the batch scheduler developers part.
> >
>
> You can't say the same for memcg?

I am not aware of batch scheduler support for memory controllers.
The request came from our benchmarking group.

> > Here are the entries in the cpuset:
> > cgroup.event_control  mem_exclusive    memory_pressure_enabled  notify_on_release         tasks
> > cgroup.procs          mem_hardwall     memory_spread_page       release_agent
> > cpu_exclusive         memory_migrate   memory_spread_slab       sched_load_balance
> > cpus                  memory_pressure  mems                     sched_relax_domain_level
> >
> > There are scheduler, slab allocator, page_cache layout, etc controls.
>
> I think this is mostly for historical reasons since cpusets were
> introduced before cgroups.
>
> > Why _NOT_ add a thp control to that nicely contained central location?
> > It is a concise set of controls for the job.
> >
>
> All of the above seem to be for cpusets primary purpose, i.e. NUMA
> optimizations. It has nothing to do with transparent hugepages. (I'm not
> saying thp has anything to do with memcg either, but a "memory controller"
> seems more appropriate for controlling thp behavior.)

cpusets was not for NUMA. It has no preference for "nodes" or anything
like that. It was for splitting a machine into layered smaller groups.
Usually, we see one cpuset which contains the batch scheduler. The batch
scheduler then creates cpusets for jobs it starts. Has nothing to do
with nodes. That is more an administrator issue. They set the minimum
grouping of resources for scheduled jobs.

> > Maybe I am misunderstanding. Are you saying you want to put memcg
> > information into the cpuset or something like that?
> >
>
> I'm saying there's absolutely no reason to have thp controlled by a
> cpuset, or ANY cgroup for that matter, since you chose not to respond to
> the question I asked: why do you want to control thp behavior for certain
> static binaries and not others? Where is the performance regression or
> the downside? Is it because of max_ptes_none for certain jobs blowing up
> the rss? We need information, and even if were justifiable then it
> wouldn't have anything to do with ANY cgroup but rather a per-process
> control. It has nothing to do with cpusets whatsoever.

It was a request from our benchmarking group that has found some jobs
benefit from thp, while others are harmed. Let me ask them for more
details.

> (And I'm very curious why you didn't even cc the cpusets maintainer on
> this patch in the first place who would probably say the same thing.)

I didn't know there was a cpuset maintainer. Paul Jackson (SGI retired)
had originally worked to get cpusets introduced and then converted to
use cgroups. I had never known there was a maintainer after him. Sorry
for that.

Robin
Re: [PATCH v2] Make transparent hugepages cpuset aware
On Tue, Jun 18, 2013 at 05:01:23PM -0700, David Rientjes wrote:
> On Tue, 18 Jun 2013, Alex Thorlton wrote:
>
> > Thanks for your input, however, I believe the method of using a malloc
> > hook falls apart when it comes to static binaries, since we wont' have
> > any shared libraries to hook into. Although using a malloc hook is a
> > perfectly suitable solution for most cases, we're looking to implement a
> > solution that can be used in all situations.
> >
>
> I guess the question would be why you don't want your malloc memory backed
> by thp pages for certain static binaries and not others? Is it because of
> an increased rss due to khugepaged collapsing memory because of its
> default max_ptes_none value?
>
> > Aside from that particular shortcoming of the malloc hook solution,
> > there are some other situations having a cpuset-based option is a
> > much simpler and more efficient solution than the alternatives.
>
> Sure, but why should this be a cpuset based solution? What is special
> about cpusets that make certain statically allocated binaries not want
> memory backed by thp while others do? This still seems based solely on
> convenience instead of any hard requirement.

The convenience being that many batch schedulers have added cpuset
support. They create the cpusets and configure them as appropriate
for the job as determined by a mixture of input from the submitting
user but still under the control of the administrator. That seems like
a fairly significant convenience given that it took years to get the
batch schedulers to adopt cpusets in the first place. At this point,
expanding their use of cpusets is under the control of the system
administrator and would not require any additional development on
the batch scheduler developers' part.

> > One
> > such situation that comes to mind would be an environment where a batch
> > scheduler is in use to ration system resources. If an administrator
> > determines that a users jobs run more efficiently with thp always on,
> > the administrator can simply set the users jobs to always run with that
> > setting, instead of having to coordinate with that user to get them to
> > run their jobs in a different way. I feel that, for cases such as this,
> > the this additional flag is in line with the other capabilities that
> > cgroups and cpusets provide.
> >
>
> That sounds like a memcg, i.e. container, type of an issue, not a cpuset
> issue which is more geared toward NUMA optimizations. User jobs should
> always run more efficiently with thp always on, the worst-case scenario
> should be if they run with the same performance as thp set to never. In
> other words, there shouldn't be any regression that requires certain
> cpusets to disable thp because of a performance regression. If there are
> any, we'd like to investigate that separately from this patch.

Here are the entries in the cpuset:
cgroup.event_control  mem_exclusive    memory_pressure_enabled  notify_on_release         tasks
cgroup.procs          mem_hardwall     memory_spread_page       release_agent
cpu_exclusive         memory_migrate   memory_spread_slab       sched_load_balance
cpus                  memory_pressure  mems                     sched_relax_domain_level

There are scheduler, slab allocator, page_cache layout, etc controls.
Why _NOT_ add a thp control to that nicely contained central location?
It is a concise set of controls for the job.

Maybe I am misunderstanding. Are you saying you want to put memcg
information into the cpuset or something like that?

Robin
Re: [Trivial PATCH 14/33] sgi: xpc: Convert use of typedef ctl_table to struct ctl_table
On Thu, Jun 13, 2013 at 07:37:39PM -0700, Joe Perches wrote:
> This typedef is unnecessary and should just be removed.
>
> Signed-off-by: Joe Perches <j...@perches.com>

Acked-by: Robin Holt <h...@sgi.com>

> ---
>  drivers/misc/sgi-xp/xpc_main.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/misc/sgi-xp/xpc_main.c b/drivers/misc/sgi-xp/xpc_main.c
> index d971817..82dc574 100644
> --- a/drivers/misc/sgi-xp/xpc_main.c
> +++ b/drivers/misc/sgi-xp/xpc_main.c
> @@ -92,7 +92,7 @@ int xpc_disengage_timelimit = XPC_DISENGAGE_DEFAULT_TIMELIMIT;
>  static int xpc_disengage_min_timelimit;	/* = 0 */
>  static int xpc_disengage_max_timelimit = 120;
>
> -static ctl_table xpc_sys_xpc_hb_dir[] = {
> +static struct ctl_table xpc_sys_xpc_hb_dir[] = {
>  	{
>  	 .procname = "hb_interval",
>  	 .data = &xpc_hb_interval,
> @@ -111,7 +111,7 @@ static ctl_table xpc_sys_xpc_hb_dir[] = {
>  	 .extra2 = &xpc_hb_check_max_interval},
>  	{}
>  };
> -static ctl_table xpc_sys_xpc_dir[] = {
> +static struct ctl_table xpc_sys_xpc_dir[] = {
>  	{
>  	 .procname = "hb",
>  	 .mode = 0555,
> @@ -126,7 +126,7 @@ static ctl_table xpc_sys_xpc_dir[] = {
>  	 .extra2 = &xpc_disengage_max_timelimit},
>  	{}
>  };
> -static ctl_table xpc_sys_dir[] = {
> +static struct ctl_table xpc_sys_dir[] = {
>  	{
>  	 .procname = "xpc",
>  	 .mode = 0555,
> --
> 1.8.1.2.459.gbcd45b4.dirty
[PATCH -v11 resend 03/11] Remove -stable friendly PF_THREAD_BOUND define
Remove the #define that the prior patch added to ease backporting to
the stable releases.

Signed-off-by: Robin Holt
To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
---
 kernel/sys.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index 2bbd9a7..17bb8d3 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -362,11 +362,6 @@ int unregister_reboot_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_reboot_notifier);
 
-/* Add backwards compatibility for stable trees. */
-#ifndef PF_NO_SETAFFINITY
-#define PF_NO_SETAFFINITY		PF_THREAD_BOUND
-#endif
-
 static void migrate_to_reboot_cpu(void)
 {
 	/* The boot cpu is always logical cpu 0 */
-- 
1.8.2.1
[PATCH -v11 resend 02/11] Migrate shutdown/reboot to boot cpu.
We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus.  The slowdown was tracked to commit
f96972f.

The current implementation does all the work of hot removing the cpus
before halting the system.  We are switching to just migrating to the
boot cpu and then continuing with shutdown/reboot.

This also has the effect of not breaking x86's command line parameter for
specifying the reboot cpu.  Note, this code was shamelessly copied from
arch/x86/kernel/reboot.c with bits removed pertaining to the reboot_cpu
command line parameter.

Signed-off-by: Robin Holt
Tested-by: Shawn Guo
To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Cc:
---

Changes since -v8
 - Change stack parameter to make future patches cleaner.

Changes since -v6:
 - Add #define for PF_THREAD_BOUND as compatibility to make stable easier.
 - Fixup s/reboot_cpu_id/reboot_cpu/

---
 kernel/sys.c | 29 ++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index b95d3c7..2bbd9a7 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -362,6 +362,29 @@ int unregister_reboot_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_reboot_notifier);
 
+/* Add backwards compatibility for stable trees. */
+#ifndef PF_NO_SETAFFINITY
+#define PF_NO_SETAFFINITY		PF_THREAD_BOUND
+#endif
+
+static void migrate_to_reboot_cpu(void)
+{
+	/* The boot cpu is always logical cpu 0 */
+	int cpu = 0;
+
+	cpu_hotplug_disable();
+
+	/* Make certain the cpu I'm about to reboot on is online */
+	if (!cpu_online(cpu))
+		cpu = cpumask_first(cpu_online_mask);
+
+	/* Prevent races with other tasks migrating this task */
+	current->flags |= PF_NO_SETAFFINITY;
+
+	/* Make certain I only run on the appropriate processor */
+	set_cpus_allowed_ptr(current, cpumask_of(cpu));
+}
+
 /**
  * kernel_restart - reboot the system
  * @cmd: pointer to buffer containing command to execute for restart
@@ -373,7 +396,7 @@ EXPORT_SYMBOL(unregister_reboot_notifier);
 void kernel_restart(char *cmd)
 {
 	kernel_restart_prepare(cmd);
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	if (!cmd)
 		printk(KERN_EMERG "Restarting system.\n");
@@ -400,7 +423,7 @@ static void kernel_shutdown_prepare(enum system_states state)
 void kernel_halt(void)
 {
 	kernel_shutdown_prepare(SYSTEM_HALT);
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	printk(KERN_EMERG "System halted.\n");
 	kmsg_dump(KMSG_DUMP_HALT);
@@ -419,7 +442,7 @@ void kernel_power_off(void)
 	kernel_shutdown_prepare(SYSTEM_POWER_OFF);
 	if (pm_power_off_prepare)
 		pm_power_off_prepare();
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	printk(KERN_EMERG "Power down.\n");
 	kmsg_dump(KMSG_DUMP_POWEROFF);
-- 
1.8.2.1
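The cpu selection step in migrate_to_reboot_cpu() (prefer logical cpu 0, otherwise fall back to the first online cpu) can be illustrated in userspace with a plain bitmask. This is only a sketch: pick_reboot_cpu() and the 64-bit online_mask are hypothetical, and the real code uses cpu_online() and cpumask_first() on the kernel's cpumask types.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return cpu 0 when it is online, otherwise the lowest-numbered online cpu.
 * Bit n of online_mask set means cpu n is online.  Returns -1 for an empty
 * mask, a case the kernel code does not need to consider.
 */
static int pick_reboot_cpu(uint64_t online_mask)
{
	int cpu = 0;	/* the boot cpu is always logical cpu 0 */

	if (online_mask & UINT64_C(1))
		return cpu;

	for (cpu = 1; cpu < 64; cpu++)
		if (online_mask & (UINT64_C(1) << cpu))
			return cpu;

	return -1;
}
```

For a mask of 0xB (cpus 0, 1 and 3 online) the result is cpu 0. For 0xC (cpus 2 and 3 online) the fallback picks cpu 2.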
[PATCH -v11 resend 08/11] arm, Remove unused restart_mode fields from some arm subarchs
These restart_mode fields are not used at all.  Remove them to make
moving the reboot= cmdline options to the general kernel easier.

Signed-off-by: Robin Holt
To: Andrew Morton
Cc: Russell King
Cc: Russ Anderson
Cc: Robin Holt
Cc: H. Peter Anvin
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Acked-by: Russell King
---
 arch/arm/mach-ebsa110/core.c | 1 -
 arch/arm/mach-pxa/mioa701.c  | 1 -
 arch/arm/mach-pxa/spitz.c    | 3 ---
 arch/arm/mach-pxa/tosa.c     | 1 -
 4 files changed, 6 deletions(-)

diff --git a/arch/arm/mach-ebsa110/core.c b/arch/arm/mach-ebsa110/core.c
index b13cc74..69a9d5d 100644
--- a/arch/arm/mach-ebsa110/core.c
+++ b/arch/arm/mach-ebsa110/core.c
@@ -321,7 +321,6 @@ MACHINE_START(EBSA110, "EBSA110")
 	.atag_offset	= 0x400,
 	.reserve_lp0	= 1,
 	.reserve_lp2	= 1,
-	.restart_mode	= 's',
 	.map_io		= ebsa110_map_io,
 	.init_early	= ebsa110_init_early,
 	.init_irq	= ebsa110_init_irq,
diff --git a/arch/arm/mach-pxa/mioa701.c b/arch/arm/mach-pxa/mioa701.c
index f8979b9..dbea67a 100644
--- a/arch/arm/mach-pxa/mioa701.c
+++ b/arch/arm/mach-pxa/mioa701.c
@@ -756,7 +756,6 @@ static void mioa701_machine_exit(void)
 
 MACHINE_START(MIOA701, "MIO A701")
 	.atag_offset	= 0x100,
-	.restart_mode	= 's',
 	.map_io		= _map_io,
 	.nr_irqs	= PXA_NR_IRQS,
 	.init_irq	= _init_irq,
diff --git a/arch/arm/mach-pxa/spitz.c b/arch/arm/mach-pxa/spitz.c
index 362726c..c3c0042 100644
--- a/arch/arm/mach-pxa/spitz.c
+++ b/arch/arm/mach-pxa/spitz.c
@@ -979,7 +979,6 @@ static void __init spitz_fixup(struct tag *tags, char **cmdline,
 
 #ifdef CONFIG_MACH_SPITZ
 MACHINE_START(SPITZ, "SHARP Spitz")
-	.restart_mode	= 'g',
 	.fixup		= spitz_fixup,
 	.map_io		= pxa27x_map_io,
 	.nr_irqs	= PXA_NR_IRQS,
@@ -993,7 +992,6 @@ MACHINE_END
 
 #ifdef CONFIG_MACH_BORZOI
 MACHINE_START(BORZOI, "SHARP Borzoi")
-	.restart_mode	= 'g',
 	.fixup		= spitz_fixup,
 	.map_io		= pxa27x_map_io,
 	.nr_irqs	= PXA_NR_IRQS,
@@ -1007,7 +1005,6 @@ MACHINE_END
 
 #ifdef CONFIG_MACH_AKITA
 MACHINE_START(AKITA, "SHARP Akita")
-	.restart_mode	= 'g',
 	.fixup		= spitz_fixup,
 	.map_io		= pxa27x_map_io,
 	.nr_irqs	= PXA_NR_IRQS,
diff --git a/arch/arm/mach-pxa/tosa.c b/arch/arm/mach-pxa/tosa.c
index 3d91d2e..a41992f 100644
--- a/arch/arm/mach-pxa/tosa.c
+++ b/arch/arm/mach-pxa/tosa.c
@@ -969,7 +969,6 @@ static void __init fixup_tosa(struct tag *tags, char **cmdline,
 }
 
 MACHINE_START(TOSA, "SHARP Tosa")
-	.restart_mode	= 'g',
 	.fixup		= fixup_tosa,
 	.map_io		= pxa25x_map_io,
 	.nr_irqs	= TOSA_NR_IRQS,
-- 
1.8.2.1
[PATCH -v11 resend 11/11] Move arch/x86 reboot= handling to generic kernel.
Merge together the unicore32, arm, and x86 reboot= command line parameter handling. Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russell King Cc: Guan Xuetao Cc: Russ Anderson Cc: Robin Holt Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Acked-by: Ingo Molnar Acked-by: Guan Xuetao Acked-by: Russell King --- Changes since -v8 - Add missing break statements. - Change parsing so #ifdef's are no longer needed. - Switch to using simple_strtoul to make parsing cleaner. - Add handling of REBOOT_HARD/SOFT --- Documentation/kernel-parameters.txt | 14 +++- arch/arm/kernel/process.c| 10 --- arch/unicore32/kernel/process.c | 10 --- arch/x86/include/asm/emergency-restart.h | 12 arch/x86/kernel/apic/x2apic_uv_x.c | 2 +- arch/x86/kernel/reboot.c | 111 +-- include/linux/reboot.h | 17 + kernel/reboot.c | 76 - 8 files changed, 107 insertions(+), 145 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index c3bfacb..b2945ce 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2677,9 +2677,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted. Run specified binary instead of /init from the ramdisk, used for early userspace startup. See initrd. - reboot= [BUGS=X86-32,BUGS=ARM,BUGS=IA-64] Rebooting mode - Format: [,[,...]] - See arch/*/kernel/reboot.c or arch/*/kernel/process.c + reboot= [KNL] + Format (x86 or x86_64): + [w[arm] | c[old] | h[ard] | s[oft] | g[pio]] \ + [[,]s[mp] \ + [[,]b[ios] | a[cpi] | k[bd] | t[riple] | e[fi] | p[ci]] \ + [[,]f[orce] + Where reboot_mode is one of warm (soft) or cold (hard) or gpio, + reboot_type is one of bios, acpi, kbd, triple, efi, or pci, + reboot_force is either force or not specified, + reboot_cpu is s[mp] with being the processor + to be used for rebooting. relax_domain_level= [KNL, SMP] Set scheduler's default relax_domain_level. 
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c index 42856fc..304b102 100644 --- a/arch/arm/kernel/process.c +++ b/arch/arm/kernel/process.c @@ -175,16 +175,6 @@ void arch_cpu_idle(void) default_idle(); } -enum reboot_mode reboot_mode = REBOOT_HARD; - -static int __init reboot_setup(char *str) -{ - if ('s' == str[0]) - reboot_mode = REBOOT_SOFT; - return 1; -} -__setup("reboot=", reboot_setup); - void machine_shutdown(void) { #ifdef CONFIG_SMP diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c index 93dd035..778ebba 100644 --- a/arch/unicore32/kernel/process.c +++ b/arch/unicore32/kernel/process.c @@ -51,16 +51,6 @@ void arch_cpu_idle(void) local_irq_enable(); } -static enum reboot_mode reboot_mode = REBOOT_HARD; - -int __init reboot_setup(char *str) -{ - if ('s' == str[0]) - reboot_mode = REBOOT_SOFT; - return 1; -} -__setup("reboot=", reboot_setup); - void machine_halt(void) { gpio_set_value(GPO_SOFT_OFF, 0); diff --git a/arch/x86/include/asm/emergency-restart.h b/arch/x86/include/asm/emergency-restart.h index 75ce3f4..77a99ac 100644 --- a/arch/x86/include/asm/emergency-restart.h +++ b/arch/x86/include/asm/emergency-restart.h @@ -1,18 +1,6 @@ #ifndef _ASM_X86_EMERGENCY_RESTART_H #define _ASM_X86_EMERGENCY_RESTART_H -enum reboot_type { - BOOT_TRIPLE = 't', - BOOT_KBD = 'k', - BOOT_BIOS = 'b', - BOOT_ACPI = 'a', - BOOT_EFI = 'e', - BOOT_CF9 = 'p', - BOOT_CF9_COND = 'q', -}; - -extern enum reboot_type reboot_type; - extern void machine_emergency_restart(void); #endif /* _ASM_X86_EMERGENCY_RESTART_H */ diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c index 794f6eb..958e3e4 100644 --- a/arch/x86/kernel/apic/x2apic_uv_x.c +++ b/arch/x86/kernel/apic/x2apic_uv_x.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include @@ -36,7 +37,6 @@ #include #include #include -#include #include /* BMC sets a bit this MMR non-zero before sending an NMI */ diff --git 
a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c index f770340..563ed91 100644 --- a/arch/x86/kernel/reboot.c +++ b/arch/x86/kernel/reboot.c @@ -36,22 +36,6 @@ void (*pm_power_off)(void); EXPORT_SYMBOL(pm_power_o
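The diff above is cut off before the new generic parser appears, so the following is only a rough user-space sketch of how the documented reboot= format could be parsed, not the code from this patch. The struct reboot_opts type, the parse_reboot() name and the REBOOT_GPIO enumerator are assumptions made for illustration; the sketch also assumes the options are comma separated and that the CPU is given as s<N> or smp<N>.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* REBOOT_GPIO is an assumed name; the enum is not visible in the diff above. */
enum reboot_mode {
	REBOOT_COLD = 0,
	REBOOT_WARM,
	REBOOT_HARD,
	REBOOT_SOFT,
	REBOOT_GPIO,
};

/* Hypothetical container for the four settings the documentation lists. */
struct reboot_opts {
	enum reboot_mode mode;
	char type;	/* 'b', 'a', 'k', 't', 'e' or 'p' */
	int cpu;	/* processor to reboot on */
	int force;	/* non-zero when f[orce] was given */
};

/* Walk a comma-separated reboot= value; only the first letter of each
 * option is significant, except for the s<N> / smp<N> CPU selector. */
static void parse_reboot(const char *str, struct reboot_opts *o)
{
	while (str && *str) {
		switch (*str) {
		case 'w':
			o->mode = REBOOT_WARM;
			break;
		case 'c':
			o->mode = REBOOT_COLD;
			break;
		case 'h':
			o->mode = REBOOT_HARD;
			break;
		case 'g':
			o->mode = REBOOT_GPIO;
			break;
		case 's':
			/* A trailing number selects the CPU, otherwise "soft". */
			if (isdigit((unsigned char)str[1]))
				o->cpu = (int)strtoul(str + 1, NULL, 0);
			else if (str[1] == 'm' && str[2] == 'p' &&
				 isdigit((unsigned char)str[3]))
				o->cpu = (int)strtoul(str + 3, NULL, 0);
			else
				o->mode = REBOOT_SOFT;
			break;
		case 'b': case 'a': case 'k':
		case 't': case 'e': case 'p':
			o->type = *str;
			break;
		case 'f':
			o->force = 1;
			break;
		}

		/* Advance to the option after the next comma, if any. */
		str = strchr(str, ',');
		if (str)
			str++;
	}
}
```

With this sketch, "reboot=warm,smp4,kbd,force" would select a warm reboot on CPU 4 through the keyboard controller with force set. Unknown letters are silently ignored, which keeps the parser free of #ifdef'd cases.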
[PATCH -v11 resend 05/11] checkpatch.pl the new kernel/reboot.c file.
Get the new file to pass scripts/checkpatch.pl Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List --- Changes since v6: - Removed last remaining line length warning. --- include/linux/reboot.h | 2 +- kernel/reboot.c| 28 +--- 2 files changed, 14 insertions(+), 16 deletions(-) diff --git a/include/linux/reboot.h b/include/linux/reboot.h index 23b3630..c6eba21 100644 --- a/include/linux/reboot.h +++ b/include/linux/reboot.h @@ -26,7 +26,7 @@ extern void machine_shutdown(void); struct pt_regs; extern void machine_crash_shutdown(struct pt_regs *); -/* +/* * Architecture independent implemenations of sys_reboot commands. */ diff --git a/kernel/reboot.c b/kernel/reboot.c index 0616483..abb6a04 100644 --- a/kernel/reboot.c +++ b/kernel/reboot.c @@ -4,6 +4,8 @@ * Copyright (C) 2013 Linus Torvalds */ +#define pr_fmt(fmt)"reboot: " fmt + #include #include #include @@ -114,9 +116,9 @@ void kernel_restart(char *cmd) migrate_to_reboot_cpu(); syscore_shutdown(); if (!cmd) - printk(KERN_EMERG "Restarting system.\n"); + pr_emerg("Restarting system\n"); else - printk(KERN_EMERG "Restarting system with command '%s'.\n", cmd); + pr_emerg("Restarting system with command '%s'\n", cmd); kmsg_dump(KMSG_DUMP_RESTART); machine_restart(cmd); } @@ -125,7 +127,7 @@ EXPORT_SYMBOL_GPL(kernel_restart); static void kernel_shutdown_prepare(enum system_states state) { blocking_notifier_call_chain(_notifier_list, - (state == SYSTEM_HALT)?SYS_HALT:SYS_POWER_OFF, NULL); + (state == SYSTEM_HALT) ? 
SYS_HALT : SYS_POWER_OFF, NULL); system_state = state; usermodehelper_disable(); device_shutdown(); @@ -140,11 +142,10 @@ void kernel_halt(void) kernel_shutdown_prepare(SYSTEM_HALT); migrate_to_reboot_cpu(); syscore_shutdown(); - printk(KERN_EMERG "System halted.\n"); + pr_emerg("System halted\n"); kmsg_dump(KMSG_DUMP_HALT); machine_halt(); } - EXPORT_SYMBOL_GPL(kernel_halt); /** @@ -159,7 +160,7 @@ void kernel_power_off(void) pm_power_off_prepare(); migrate_to_reboot_cpu(); syscore_shutdown(); - printk(KERN_EMERG "Power down.\n"); + pr_emerg("Power down\n"); kmsg_dump(KMSG_DUMP_POWEROFF); machine_power_off(); } @@ -188,10 +189,10 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, /* For safety, we require "magic" arguments. */ if (magic1 != LINUX_REBOOT_MAGIC1 || - (magic2 != LINUX_REBOOT_MAGIC2 && - magic2 != LINUX_REBOOT_MAGIC2A && + (magic2 != LINUX_REBOOT_MAGIC2 && + magic2 != LINUX_REBOOT_MAGIC2A && magic2 != LINUX_REBOOT_MAGIC2B && - magic2 != LINUX_REBOOT_MAGIC2C)) + magic2 != LINUX_REBOOT_MAGIC2C)) return -EINVAL; /* @@ -234,7 +235,8 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, break; case LINUX_REBOOT_CMD_RESTART2: - if (strncpy_from_user([0], arg, sizeof(buffer) - 1) < 0) { + ret = strncpy_from_user([0], arg, sizeof(buffer) - 1); + if (ret < 0) { ret = -EFAULT; break; } @@ -282,7 +284,6 @@ void ctrl_alt_del(void) else kill_cad_pid(SIGINT, 1); } - char poweroff_cmd[POWEROFF_CMD_PATH_LEN] = "/sbin/poweroff"; @@ -301,14 +302,11 @@ static int __orderly_poweroff(bool force) ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC); argv_free(argv); } else { - printk(KERN_WARNING "%s failed to allocate memory for \"%s\"\n", -__func__, poweroff_cmd); ret = -ENOMEM; } if (ret && force) { - printk(KERN_WARNING "Failed to start orderly shutdown: " - "forcing the issue\n"); + pr_warn("Failed to start orderly shutdown: forcing the issue\n"); /* * I guess this should try to kick off some daemon to sync and * 
poweroff asap. Or not even bother syncing if we're doing an -- 1.8.2.1
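For anyone unfamiliar with the pr_fmt() convention this patch adopts, here is a small user-space sketch of the effect. The pr_fmt() definition matches the one added above; pr_emerg_to() is a hypothetical stand-in that formats into a buffer instead of the kernel log, and it relies on the GNU ##__VA_ARGS__ extension supported by gcc and clang.

```c
#include <stdio.h>

/* Same prefix definition the patch adds at the top of kernel/reboot.c. */
#define pr_fmt(fmt) "reboot: " fmt

/*
 * Hypothetical user-space stand-in for pr_emerg(): the format string is
 * passed through pr_fmt() first, so every message gets the prefix
 * without each call site having to repeat it.
 */
#define pr_emerg_to(buf, len, fmt, ...) \
	snprintf((buf), (len), pr_fmt(fmt), ##__VA_ARGS__)
```

Because the prefix is glued on by string-literal concatenation at compile time, the format argument has to be a literal, just as with the kernel macros.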
[PATCH -v11 resend 06/11] x86, prepare reboot_mode for moving to generic kernel code.
This patch prepares for moving the parsing of reboot= to the generic
kernel code by making reboot_mode into a more generic form.

Signed-off-by: Robin Holt
To: Andrew Morton
Cc: H. Peter Anvin
Cc: Miguel Boton
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Acked-by: Ingo Molnar
---
 arch/x86/kernel/reboot.c | 12 +++-
 include/linux/reboot.h   |  5 +
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 76fa1e9..f770340 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -36,7 +36,7 @@ void (*pm_power_off)(void);
 EXPORT_SYMBOL(pm_power_off);
 
 static const struct desc_ptr no_idt = {};
-static int reboot_mode;
+static enum reboot_mode reboot_mode;
 enum reboot_type reboot_type = BOOT_ACPI;
 int reboot_force;
 
@@ -88,11 +88,11 @@ static int __init reboot_setup(char *str)
 
 		switch (*str) {
 		case 'w':
-			reboot_mode = 0x1234;
+			reboot_mode = REBOOT_WARM;
 			break;
 
 		case 'c':
-			reboot_mode = 0;
+			reboot_mode = REBOOT_COLD;
 			break;
 
 #ifdef CONFIG_SMP
@@ -536,6 +536,7 @@ static void native_machine_emergency_restart(void)
 	int i;
 	int attempt = 0;
 	int orig_reboot_type = reboot_type;
+	unsigned short mode;
 
 	if (reboot_emergency)
 		emergency_vmx_disable_all();
@@ -543,7 +544,8 @@ static void native_machine_emergency_restart(void)
 	tboot_shutdown(TB_SHUTDOWN_REBOOT);
 
 	/* Tell the BIOS if we want cold or warm reboot */
-	*((unsigned short *)__va(0x472)) = reboot_mode;
+	mode = reboot_mode == REBOOT_WARM ? 0x1234 : 0;
+	*((unsigned short *)__va(0x472)) = mode;
 
 	for (;;) {
 		/* Could also try the reset bit in the Hammer NB */
@@ -585,7 +587,7 @@ static void native_machine_emergency_restart(void)
 
 		case BOOT_EFI:
 			if (efi_enabled(EFI_RUNTIME_SERVICES))
-				efi.reset_system(reboot_mode ?
+				efi.reset_system(reboot_mode == REBOOT_WARM ?
 						 EFI_RESET_WARM :
 						 EFI_RESET_COLD,
 						 EFI_SUCCESS, 0, NULL);
diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index c6eba21..37d56c3 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -10,6 +10,11 @@
 #define SYS_HALT	0x0002	/* Notify of system halt */
 #define SYS_POWER_OFF	0x0003	/* Notify of system power off */
 
+enum reboot_mode {
+	REBOOT_COLD = 0,
+	REBOOT_WARM,
+};
+
 extern int register_reboot_notifier(struct notifier_block *);
 extern int unregister_reboot_notifier(struct notifier_block *);
 
-- 
1.8.2.1
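The key point of the x86 change is that reboot_mode no longer holds the raw BIOS value; the translation to 0x1234 now happens at the point of use. A minimal stand-alone sketch of that translation follows. The enum matches the one added to include/linux/reboot.h above, while bios_reset_flag() is a hypothetical helper name used only for illustration.

```c
#include <stdint.h>

/* Mirrors the enum added to include/linux/reboot.h by this patch. */
enum reboot_mode {
	REBOOT_COLD = 0,
	REBOOT_WARM,
};

/*
 * Translate the generic mode into the 16-bit value the patch writes to
 * physical address 0x472 for the BIOS: 0x1234 requests a warm reboot,
 * anything else is written as 0 (cold). Hypothetical helper, not a
 * kernel function.
 */
static uint16_t bios_reset_flag(enum reboot_mode mode)
{
	return mode == REBOOT_WARM ? 0x1234 : 0;
}
```

Comparing against REBOOT_WARM explicitly, rather than testing the value for non-zero, is what keeps the EFI and BIOS paths correct once further enumerators are added to the enum.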
[PATCH -v11 resend 07/11] unicore32, prepare reboot_mode for moving to generic kernel code.
This patch prepares for moving the parsing of reboot= to the generic
kernel code by making reboot_mode into a more generic form.

Signed-off-by: Robin Holt
To: Andrew Morton
Cc: Guan Xuetao
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: H. Peter Anvin
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Acked-by: Guan Xuetao
---
Changes since -v8
 - Switched from using REBOOT_WARM/COLD to HARD/SOFT.
---
 arch/unicore32/kernel/process.c | 10 +-
 arch/unicore32/kernel/setup.h   |  2 +-
 arch/unicore32/mm/mmu.c         |  2 +-
 include/linux/reboot.h          |  2 ++
 4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c
index c944769..93dd035 100644
--- a/arch/unicore32/kernel/process.c
+++ b/arch/unicore32/kernel/process.c
@@ -51,14 +51,14 @@ void arch_cpu_idle(void)
 	local_irq_enable();
 }
 
-static char reboot_mode = 'h';
+static enum reboot_mode reboot_mode = REBOOT_HARD;
 
 int __init reboot_setup(char *str)
 {
-	reboot_mode = str[0];
+	if ('s' == str[0])
+		reboot_mode = REBOOT_SOFT;
 	return 1;
 }
-
 __setup("reboot=", reboot_setup);
 
 void machine_halt(void)
@@ -88,7 +88,7 @@ void machine_restart(char *cmd)
 	 * we may need it to insert some 1:1 mappings so that
 	 * soft boot works.
 	 */
-	setup_mm_for_reboot(reboot_mode);
+	setup_mm_for_reboot();
 
 	/* Clean and invalidate caches */
 	flush_cache_all();
@@ -102,7 +102,7 @@ void machine_restart(char *cmd)
 	/*
 	 * Now handle reboot code.
 	 */
-	if (reboot_mode == 's') {
+	if (reboot_mode == REBOOT_SOFT) {
 		/* Jump into ROM at address 0x */
 		cpu_reset(VECTORS_BASE);
 	} else {
diff --git a/arch/unicore32/kernel/setup.h b/arch/unicore32/kernel/setup.h
index 30f749d..f5c51b8 100644
--- a/arch/unicore32/kernel/setup.h
+++ b/arch/unicore32/kernel/setup.h
@@ -22,7 +22,7 @@ extern void puv3_ps2_init(void);
 extern void pci_puv3_preinit(void);
 extern void __init puv3_init_gpio(void);
 
-extern void setup_mm_for_reboot(char mode);
+extern void setup_mm_for_reboot(void);
 
 extern char __stubs_start[], __stubs_end[];
 extern char __vectors_start[], __vectors_end[];
diff --git a/arch/unicore32/mm/mmu.c b/arch/unicore32/mm/mmu.c
index 43c20b4..4f5a532 100644
--- a/arch/unicore32/mm/mmu.c
+++ b/arch/unicore32/mm/mmu.c
@@ -445,7 +445,7 @@ void __init paging_init(void)
  * the user-mode pages.  This will then ensure that we have predictable
  * results when turning the mmu off
  */
-void setup_mm_for_reboot(char mode)
+void setup_mm_for_reboot(void)
 {
 	unsigned long base_pmdval;
 	pgd_t *pgd;
diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index 37d56c3..ca29a6f 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -13,6 +13,8 @@
 enum reboot_mode {
 	REBOOT_COLD = 0,
 	REBOOT_WARM,
+	REBOOT_HARD,
+	REBOOT_SOFT,
 };
 
 extern int register_reboot_notifier(struct notifier_block *);
-- 
1.8.2.1
[PATCH -v11 resend 09/11] arm, prepare reboot_mode for moving to generic kernel code.
This patch prepares for moving the parsing of reboot= to the generic
kernel code by making reboot_mode into a more generic form.

Signed-off-by: Robin Holt
To: Andrew Morton
Cc: Russell King
Cc: Russ Anderson
Cc: Robin Holt
Cc: H. Peter Anvin
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Acked-by: Russell King
---
Changes since -v10
 - Uncommented an accidentally commented out line.

Changes since -v8
 - Switched from using REBOOT_WARM/COLD to HARD/SOFT.
---
 arch/arm/include/asm/mach/arch.h   | 3 ++-
 arch/arm/kernel/process.c          | 8
 arch/arm/kernel/setup.c            | 6 +++---
 arch/arm/mach-footbridge/cats-hw.c | 2 +-
 4 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
index 308ad7d..e2b551e 100644
--- a/arch/arm/include/asm/mach/arch.h
+++ b/arch/arm/include/asm/mach/arch.h
@@ -9,6 +9,7 @@
  */
 
 #ifndef __ASSEMBLY__
+#include
 
 struct tag;
 struct meminfo;
@@ -39,7 +40,7 @@ struct machine_desc {
 	unsigned char		reserve_lp0 :1;	/* never has lp0	*/
 	unsigned char		reserve_lp1 :1;	/* never has lp1	*/
 	unsigned char		reserve_lp2 :1;	/* never has lp2	*/
-	char			restart_mode;	/* default restart mode	*/
+	enum reboot_mode	reboot_mode;	/* default restart mode	*/
 	struct smp_operations	*smp;		/* SMP operations	*/
 	void			(*fixup)(struct tag *, char **,
 					 struct meminfo *);
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index f219703..92b47df 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -174,14 +174,14 @@ void arch_cpu_idle(void)
 	default_idle();
 }
 
-static char reboot_mode = 'h';
+enum reboot_mode reboot_mode = REBOOT_HARD;
 
-int __init reboot_setup(char *str)
+static int __init reboot_setup(char *str)
 {
-	reboot_mode = str[0];
+	if ('s' == str[0])
+		reboot_mode = REBOOT_SOFT;
 	return 1;
 }
-
 __setup("reboot=", reboot_setup);
 
 void machine_shutdown(void)
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 1522c7a..e05df42 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -73,7 +73,7 @@ __setup("fpe=", fpe_setup);
 
 extern void paging_init(struct machine_desc *desc);
 extern void sanity_check_meminfo(void);
-extern void reboot_setup(char *str);
+extern enum reboot_mode reboot_mode;
 extern void setup_dma_zone(struct machine_desc *desc);
 
 unsigned int processor_id;
@@ -769,8 +769,8 @@ void __init setup_arch(char **cmdline_p)
 
 	setup_dma_zone(mdesc);
 
-	if (mdesc->restart_mode)
-		reboot_setup(>restart_mode);
+	if (mdesc->reboot_mode != REBOOT_HARD)
+		reboot_mode = mdesc->reboot_mode;
 
 	init_mm.start_code = (unsigned long) _text;
 	init_mm.end_code   = (unsigned long) _etext;
diff --git a/arch/arm/mach-footbridge/cats-hw.c b/arch/arm/mach-footbridge/cats-hw.c
index 6987a09..9669cc0 100644
--- a/arch/arm/mach-footbridge/cats-hw.c
+++ b/arch/arm/mach-footbridge/cats-hw.c
@@ -86,7 +86,7 @@ fixup_cats(struct tag *tags, char **cmdline, struct meminfo *mi)
 MACHINE_START(CATS, "Chalice-CATS")
 	/* Maintainer: Philip Blundell */
 	.atag_offset	= 0x100,
-	.restart_mode	= 's',
+	.reboot_mode	= REBOOT_SOFT,
 	.fixup		= fixup_cats,
 	.map_io		= footbridge_map_io,
 	.init_irq	= footbridge_init_irq,
-- 
1.8.2.1
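To make the resulting precedence on arm easier to follow, here is a rough user-space sketch of how the three sources of reboot_mode combine after this patch. It assumes the machine descriptor default is applied before the reboot= argument is parsed; struct machine_desc_sketch and select_reboot_mode() are hypothetical names, and only the enum comes from the series.

```c
#include <stddef.h>

/* Enum as it stands in include/linux/reboot.h at this point in the series. */
enum reboot_mode {
	REBOOT_COLD = 0,
	REBOOT_WARM,
	REBOOT_HARD,
	REBOOT_SOFT,
};

/* Hypothetical stand-in for the one machine_desc field that matters here. */
struct machine_desc_sketch {
	enum reboot_mode reboot_mode;
};

/*
 * Combine the built-in default, the machine descriptor and the reboot=
 * argument the way the patched arm code does:
 *   1. the global starts out as REBOOT_HARD;
 *   2. a descriptor value other than REBOOT_HARD replaces it;
 *   3. a reboot= value starting with 's' selects REBOOT_SOFT, and any
 *      other value leaves the mode untouched.
 */
static enum reboot_mode
select_reboot_mode(const struct machine_desc_sketch *mdesc,
		   const char *reboot_arg)
{
	enum reboot_mode mode = REBOOT_HARD;

	if (mdesc && mdesc->reboot_mode != REBOOT_HARD)
		mode = mdesc->reboot_mode;

	if (reboot_arg && reboot_arg[0] == 's')
		mode = REBOOT_SOFT;

	return mode;
}
```

One consequence visible in the sketch: on a board such as CATS, whose descriptor selects REBOOT_SOFT, passing reboot=hard no longer switches back to a hard reboot, because only 's' is acted upon.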
[PATCH -v11 resend 01/11] CPU hotplug: Provide a generic helper to disable/enable CPU hotplug
From: "Srivatsa S. Bhat" There are instances in the kernel where we would like to disable CPU hotplug (from sysfs) during some important operation. Today the freezer code depends on this and the code to do it was kinda tailor-made for that. Restructure the code and make it generic enough to be useful for other usecases too. Signed-off-by: Srivatsa S. Bhat Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Cc: --- include/linux/cpu.h | 4 kernel/cpu.c| 55 ++--- 2 files changed, 27 insertions(+), 32 deletions(-) diff --git a/include/linux/cpu.h b/include/linux/cpu.h index c6f6e08..9f3c7e8 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -175,6 +175,8 @@ extern struct bus_type cpu_subsys; extern void get_online_cpus(void); extern void put_online_cpus(void); +extern void cpu_hotplug_disable(void); +extern void cpu_hotplug_enable(void); #define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri) #define register_hotcpu_notifier(nb) register_cpu_notifier(nb) #define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb) @@ -198,6 +200,8 @@ static inline void cpu_hotplug_driver_unlock(void) #define get_online_cpus() do { } while (0) #define put_online_cpus() do { } while (0) +#define cpu_hotplug_disable() do { } while (0) +#define cpu_hotplug_enable() do { } while (0) #define hotcpu_notifier(fn, pri) do { (void)(fn); } while (0) /* These aren't inline functions due to a GCC bug. */ #define register_hotcpu_notifier(nb) ({ (void)(nb); 0; }) diff --git a/kernel/cpu.c b/kernel/cpu.c index b5e4ab2..198a388 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -133,6 +133,27 @@ static void cpu_hotplug_done(void) mutex_unlock(_hotplug.lock); } +/* + * Wait for currently running CPU hotplug operations to complete (if any) and + * disable future CPU hotplug (from sysfs). 
The 'cpu_add_remove_lock' protects + * the 'cpu_hotplug_disabled' flag. The same lock is also acquired by the + * hotplug path before performing hotplug operations. So acquiring that lock + * guarantees mutual exclusion from any currently running hotplug operations. + */ +void cpu_hotplug_disable(void) +{ + cpu_maps_update_begin(); + cpu_hotplug_disabled = 1; + cpu_maps_update_done(); +} + +void cpu_hotplug_enable(void) +{ + cpu_maps_update_begin(); + cpu_hotplug_disabled = 0; + cpu_maps_update_done(); +} + #else /* #if CONFIG_HOTPLUG_CPU */ static void cpu_hotplug_begin(void) {} static void cpu_hotplug_done(void) {} @@ -541,36 +562,6 @@ static int __init alloc_frozen_cpus(void) core_initcall(alloc_frozen_cpus); /* - * Prevent regular CPU hotplug from racing with the freezer, by disabling CPU - * hotplug when tasks are about to be frozen. Also, don't allow the freezer - * to continue until any currently running CPU hotplug operation gets - * completed. - * To modify the 'cpu_hotplug_disabled' flag, we need to acquire the - * 'cpu_add_remove_lock'. And this same lock is also taken by the regular - * CPU hotplug path and released only after it is complete. Thus, we - * (and hence the freezer) will block here until any currently running CPU - * hotplug operation gets completed. - */ -void cpu_hotplug_disable_before_freeze(void) -{ - cpu_maps_update_begin(); - cpu_hotplug_disabled = 1; - cpu_maps_update_done(); -} - - -/* - * When tasks have been thawed, re-enable regular CPU hotplug (which had been - * disabled while beginning to freeze tasks). 
- */ -void cpu_hotplug_enable_after_thaw(void) -{ - cpu_maps_update_begin(); - cpu_hotplug_disabled = 0; - cpu_maps_update_done(); -} - -/* * When callbacks for CPU hotplug notifications are being executed, we must * ensure that the state of the system with respect to the tasks being frozen * or not, as reported by the notification, remains unchanged *throughout the @@ -589,12 +580,12 @@ cpu_hotplug_pm_callback(struct notifier_block *nb, case PM_SUSPEND_PREPARE: case PM_HIBERNATION_PREPARE: - cpu_hotplug_disable_before_freeze(); + cpu_hotplug_disable(); break; case PM_POST_SUSPEND: case PM_POST_HIBERNATION: - cpu_hotplug_enable_after_thaw(); + cpu_hotplug_enable(); break; default: -- 1.8.2.1
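The locking argument in the new comment can be illustrated with a small user-space analogue. This is only a sketch using a pthread mutex in place of cpu_add_remove_lock; hotplug_disable(), hotplug_enable() and try_hotplug() are hypothetical names, and returning -1 for a refused request is an assumption standing in for whatever error the real hotplug path reports.

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-ins for cpu_add_remove_lock and the cpu_hotplug_disabled flag. */
static pthread_mutex_t add_remove_lock = PTHREAD_MUTEX_INITIALIZER;
static bool hotplug_disabled;

/* Taking the lock waits for any hotplug operation already in flight,
 * then the flag keeps new ones from starting. */
static void hotplug_disable(void)
{
	pthread_mutex_lock(&add_remove_lock);
	hotplug_disabled = true;
	pthread_mutex_unlock(&add_remove_lock);
}

static void hotplug_enable(void)
{
	pthread_mutex_lock(&add_remove_lock);
	hotplug_disabled = false;
	pthread_mutex_unlock(&add_remove_lock);
}

/*
 * Hypothetical hotplug request. The whole operation runs under the same
 * lock, so it can never overlap with hotplug_disable(). Returns 0 when
 * the operation ran and -1 when hotplug is currently disabled.
 */
static int try_hotplug(void (*op)(void))
{
	int ret = 0;

	pthread_mutex_lock(&add_remove_lock);
	if (hotplug_disabled)
		ret = -1;
	else if (op)
		op();
	pthread_mutex_unlock(&add_remove_lock);

	return ret;
}
```

The design choice worth noting is that disable and enable are not reference counted in this patch, so nested or concurrent users would need to coordinate among themselves.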
[PATCH -v11 resend 04/11] Move shutdown/reboot related functions to kernel/reboot.c
This patch is preparatory. It moves reboot related syscall, etc functions from kernel/sys.c to kernel/reboot.c. Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List --- Changes since -v6: - Add include of linux/uaccess.h to allow building on arm. --- kernel/Makefile | 2 +- kernel/reboot.c | 347 kernel/sys.c| 331 - 3 files changed, 348 insertions(+), 332 deletions(-) create mode 100644 kernel/reboot.c diff --git a/kernel/Makefile b/kernel/Makefile index 271fd31..470839d 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -9,7 +9,7 @@ obj-y = fork.o exec_domain.o panic.o printk.o \ rcupdate.o extable.o params.o posix-timers.o \ kthread.o wait.o sys_ni.o posix-cpu-timers.o mutex.o \ hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \ - notifier.o ksysfs.o cred.o \ + notifier.o ksysfs.o cred.o reboot.o \ async.o range.o groups.o lglock.o smpboot.o ifdef CONFIG_FUNCTION_TRACER diff --git a/kernel/reboot.c b/kernel/reboot.c new file mode 100644 index 000..0616483 --- /dev/null +++ b/kernel/reboot.c @@ -0,0 +1,347 @@ +/* + * linux/kernel/reboot.c + * + * Copyright (C) 2013 Linus Torvalds + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * this indicates whether you can reboot with ctrl-alt-del: the default is yes + */ + +int C_A_D = 1; +struct pid *cad_pid; +EXPORT_SYMBOL(cad_pid); + +/* + * If set, this is used for preparing the system to power off. + */ + +void (*pm_power_off_prepare)(void); + +/** + * emergency_restart - reboot the system + * + * Without shutting down any hardware or taking any locks + * reboot the system. This is called when we know we are in + * trouble so this is our best effort to reboot. This is + * safe to call in interrupt context. 
+ */ +void emergency_restart(void) +{ + kmsg_dump(KMSG_DUMP_EMERG); + machine_emergency_restart(); +} +EXPORT_SYMBOL_GPL(emergency_restart); + +void kernel_restart_prepare(char *cmd) +{ + blocking_notifier_call_chain(_notifier_list, SYS_RESTART, cmd); + system_state = SYSTEM_RESTART; + usermodehelper_disable(); + device_shutdown(); +} + +/** + * register_reboot_notifier - Register function to be called at reboot time + * @nb: Info about notifier function to be called + * + * Registers a function with the list of functions + * to be called at reboot time. + * + * Currently always returns zero, as blocking_notifier_chain_register() + * always returns zero. + */ +int register_reboot_notifier(struct notifier_block *nb) +{ + return blocking_notifier_chain_register(_notifier_list, nb); +} +EXPORT_SYMBOL(register_reboot_notifier); + +/** + * unregister_reboot_notifier - Unregister previously registered reboot notifier + * @nb: Hook to be unregistered + * + * Unregisters a previously registered reboot + * notifier function. + * + * Returns zero on success, or %-ENOENT on failure. + */ +int unregister_reboot_notifier(struct notifier_block *nb) +{ + return blocking_notifier_chain_unregister(_notifier_list, nb); +} +EXPORT_SYMBOL(unregister_reboot_notifier); + +static void migrate_to_reboot_cpu(void) +{ + /* The boot cpu is always logical cpu 0 */ + int cpu = 0; + + cpu_hotplug_disable(); + + /* Make certain the cpu I'm about to reboot on is online */ + if (!cpu_online(cpu)) + cpu = cpumask_first(cpu_online_mask); + + /* Prevent races with other tasks migrating this task */ + current->flags |= PF_NO_SETAFFINITY; + + /* Make certain I only run on the appropriate processor */ + set_cpus_allowed_ptr(current, cpumask_of(cpu)); +} + +/** + * kernel_restart - reboot the system + * @cmd: pointer to buffer containing command to execute for restart + * or %NULL + * + * Shutdown everything and perform a clean reboot. + * This is not safe to call in interrupt context. 
+ */ +void kernel_restart(char *cmd) +{ + kernel_restart_prepare(cmd); + migrate_to_reboot_cpu(); + syscore_shutdown(); + if (!cmd) + printk(KERN_EMERG "Restarting system.\n"); + else + printk(KERN_EMERG "Restarting system with command '%s'.\n", cmd); + kmsg_dump(KMSG_DUMP_RESTART); + machine_restart(cmd); +} +EXPORT_SYMBOL_GPL(kernel_restart); + +static void kernel_shutdown_prepare(enum system_states state) +{ + blocking_notifier_call_chain(_notifier_list, + (state == SYSTEM_HALT)?SYS_HALT:SYS_POWER_OFF, NULL); + system_state = state; +
[PATCH -v11 resend 00/11] Shutdown from reboot_cpuid without stopping other cpus.
We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus. The slowdown was tracked to commit
f96972f.

The current implementation does all the work of hot removing the cpus
before halting the system. We are switching to just migrating to the
reboot_cpu and then continuing with shutdown/reboot.

The patch set is broken into eleven parts. The first two are planned for
the stable release. The others move the halt/shutdown/reboot related
functions to their own kernel/reboot.c file and then move the handling
of the reboot= kernel parameter to generic kernel code.

Changes since -v10
 - Added Russell's Acked-by for arm.
 - Fixed an accidentally commented out line in an arm header file.

Changes since -v9
 - Added Ingo's Acked-by for x86.
 - Added Guan's Acked-by for unicore32.
 - Replaced first patch with updated patch from Srivatsa S. Bhat. This
   compiles for alpha allmodconfig, all arm defconfigs, and a few test
   x86_64 defconfigs. I have not tried more.

Changes since -v8
 - Changed reboot_cpu on stack to cpu to fix bug noticed by Russell King.
 - Switched unicore32 and arm from using REBOOT_WARM/COLD to HARD/SOFT.
 - Fixed case statement bug.
 - Went to using simple_strtoul for parsing reboot_cpu=smp###.
 - Made parsing of reboot= not use any #ifdef'd code.

Changes since -v7
 - Fixed authorship for first patch.
 - Rebased to Linus' current tree (51a26ae7a).

Changes since -v6
 - Cross compiled all arm architectures (using v3.9 kernel; fails with
   current).
 - Added a #define for non-hotplug case.
 - Added a #define for PF_THREAD_BOUND as compatibility to make stable
   easier.
 - Fixup s/reboot_cpu_id/reboot_cpu/
 - Added include of linux/uaccess.h to allow building on arm.
 - Removed last remaining checkpatch.pl line length warning on
   kernel/reboot.c.
 - Fixed the duplicate handling of the reboot= kernel parameter.

Changes since -v5
 - Moved the arch/x86 reboot= up to the generic kernel code.

Changes since -v4
 - Integrated Srivatsa S. Bhat's change creating the
   cpu_hotplug_disable() function.
 - Integrated comments by Srivatsa S. Bhat.
 - Made one more comment consistent with others in function.

Changes since -v3
 - Added a tested-by for the original reporter.
 - Fixed compile failure found by Joe Perches.
 - Integrated comments by Joe Perches.

To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
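As a rough illustration of the "migrate to the reboot cpu" step described above, the CPU choice can be sketched in user space as follows. This assumes a system of at most 64 CPUs represented as a bitmask; pick_reboot_cpu() is a hypothetical name, and the real code in the series works on the kernel's cpumask API after calling cpu_hotplug_disable().

```c
#include <stdint.h>

/*
 * Choose the CPU on which the final shutdown/reboot steps will run.
 * The preferred CPU (the boot cpu, logical cpu 0, in this series) is
 * used when it is online; otherwise fall back to the first online CPU.
 * Returns -1 only if the mask is empty, which cannot happen on a
 * running system.
 */
static int pick_reboot_cpu(uint64_t online_mask, int preferred)
{
	if (preferred >= 0 && preferred < 64 &&
	    ((online_mask >> preferred) & 1))
		return preferred;

	for (int cpu = 0; cpu < 64; cpu++)
		if ((online_mask >> cpu) & 1)
			return cpu;

	return -1;
}
```

The saving comes from what is no longer done: the other CPUs are simply left alone rather than being hot removed one at a time before the machine restarts.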
Re: mmotm 2013-06-05-17-24 uploaded
On Wed, Jun 05, 2013 at 05:26:36PM -0700, a...@linux-foundation.org wrote:
> The mm-of-the-moment snapshot 2013-06-05-17-24 has been uploaded to
>
>    http://www.ozlabs.org/~akpm/mmotm/
>
> mmotm-readme.txt says
>
> README for mm-of-the-moment:
>
> http://www.ozlabs.org/~akpm/mmotm/
>
> This is a snapshot of my -mm patch queue. Uploaded at random hopefully
> more than once a week.
>
> You will need quilt to apply these patches to the latest Linus release (3.x
> or 3.x-rcY). The series file is in broken-out.tar.gz and is duplicated in
> http://ozlabs.org/~akpm/mmotm/series

It looks like the shutdown-reboot patches I sent are still not queued for
Linus. Did these just get lost in the shuffle or do they need to be
resubmitted? The first two were marked for -stable and I would really
like to get them in sometime, as it does severely affect shutdown of
large systems. I will resend them shortly.

Robin

>
> The file broken-out.tar.gz contains two datestamp files: .DATE and
> .DATE--mm-dd-hh-mm-ss. Both contain the string -mm-dd-hh-mm-ss,
> followed by the base kernel version against which this patch series is to
> be applied.
>
> This tree is partially included in linux-next. To see which patches are
> included in linux-next, consult the `series' file. Only the patches
> within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
> linux-next.
>
> A git tree which contains the memory management portion of this tree is
> maintained at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
> by Michal Hocko. It contains the patches which are between the
> "#NEXT_PATCHES_START mm" and "#NEXT_PATCHES_END" markers, from the series
> file, http://www.ozlabs.org/~akpm/mmotm/series.
>
>
> A full copy of the full kernel tree with the linux-next and mmotm patches
> already applied is available through git within an hour of the mmotm
> release. Individual mmotm releases are tagged.
The master branch always > points to the latest release, so it's constantly rebasing. > > http://git.cmpxchg.org/?p=linux-mmotm.git;a=summary > > To develop on top of mmotm git: > > $ git remote add mmotm > git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git > $ git remote update mmotm > $ git checkout -b topic mmotm/master > > $ git send-email mmotm/master.. [...] > > To rebase a branch with older patches to a new mmotm release: > > $ git remote update mmotm > $ git rebase --onto mmotm/master topic > > > > > The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second) > contains daily snapshots of the -mm tree. It is updated more frequently > than mmotm, and is untested. > > A git copy of this tree is available at > > http://git.cmpxchg.org/?p=linux-mmots.git;a=summary > > and use of this tree is similar to > http://git.cmpxchg.org/?p=linux-mmotm.git, described above. > > > This mmotm tree contains the following patches against 3.10-rc4: > (patches marked "*" will be included in linux-next) > > origin.patch > linux-next.patch > linux-next-git-rejects.patch > arch-alpha-kernel-systblss-remove-debug-check.patch > i-need-old-gcc.patch > * cpu-hotplug-provide-a-generic-helper-to-disable-enable-cpu-hotplug.patch > * cpu-hotplug-provide-a-generic-helper-to-disable-enable-cpu-hotplug-v11.patch > * migrate-shutdown-reboot-to-boot-cpu.patch > * migrate-shutdown-reboot-to-boot-cpu-v11.patch > * kmsg-honor-dmesg_restrict-sysctl-on-dev-kmsg.patch > * kmsg-honor-dmesg_restrict-sysctl-on-dev-kmsg-fix.patch > * > lib-mpi-mpicoderc-looping-issue-need-stop-when-equal-to-zero-found-by-extra_flags=-w.patch > * ocfs2-ocfs2_prep_new_orphaned_file-should-return-ret.patch > * memcg-dont-initialize-kmem-cache-destroying-work-for-root-caches.patch > * rtc-tps6586x-device-wakeup-flags-correction.patch > * drivers-rtc-rtc-cmosc-fix-accidentally-enabling-rtc-channel.patch > * drivers-rtc-rtc-cmosc-fix-accidentally-enabling-rtc-channel-fix.patch > * 
audit-wait_for_auditd-should-use-task_uninterruptible.patch > * cciss-fix-broken-mutex-usage-in-ioctl.patch > * > drivers-rtc-rtc-twlc-fix-missing-device_init_wakeup-when-booted-with-device-tree.patch > * > swap-avoid-read_swap_cache_async-race-to-deadlock-while-waiting-on-discard-i-o-completion.patch > * > fs-ocfs2-nameic-remove-unecessary-error-when-removing-non-empty-directory.patch > * rtc-at91rm9200-add-match-table-compile-guard.patch > * rtc-at91rm9200-add-configuration-support.patch > * rtc-at91rm9200-refactor-interrupt-register-handling.patch > * rtc-at91rm9200-add-shadow-interrupt-mask.patch > * rtc-at91rm9200-use-shadow-imr-on-at91sam9x5.patch > * aio-use-call_rcu-instead-of-synchronize_rcu-in-kill_ioctx.patch > * drivers-misc-sgi-gru-grufilec-fix-info-leak-in-gru_get_config_info.patch > * mm-page_alloc-fix-watermark-check-in-__zone_watermark_ok.patch > * mm-migration-add-migrate_entry_wait_huge.patch > * drivers-base-cpuc-fix-maxcpus-boot-option.patch > *
[PATCH -v11 resend 00/11] Shutdown from reboot_cpuid without stopping other cpus.
We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus.  The slowdown was tracked to commit
f96972f.

The current implementation does all the work of hot removing the cpus
before halting the system.  We are switching to just migrating to the
reboot_cpu and then continuing with shutdown/reboot.

The patch set is broken into eleven parts.  The first two are planned for
the stable release.  The others move the halt/shutdown/reboot related
functions to their own kernel/reboot.c file and then move the handling
of the reboot= kernel parameter to generic kernel code.

Changes since -v10
 - Added Russell's Acked-by for arm.
 - Fixed an accidentally commented out line in an arm header file.

Changes since -v9
 - Added Ingo's Acked-by for x86.
 - Added Guan's Acked-by for unicore32.
 - Replaced first patch with updated patch from Srivatsa S. Bhat.
   This compiles for alpha allmodconfig, all arm defconfigs, and a few
   test x86_64 defconfigs.  I have not tried more.

Changes since -v8
 - Changed reboot_cpu on stack to cpu to fix bug noticed by Russell King.
 - Switched unicore32 and arm from using REBOOT_WARM/COLD to HARD/SOFT.
 - Fixed case statement bug.
 - Went to using simple_strtoul for parsing reboot_cpu=smp###.
 - Made parsing of reboot= not use any #ifdef'd code.

Changes since -v7.
 - Fixed authorship for first patch.
 - Rebased to Linus' current tree (51a26ae7a).

Changes since -v6.
 - Cross compiled all arm architectures (using v3.9 kernel.  Fails with
   current).
 - Added a #define for non-hotplug case.
 - Add #define for PF_THREAD_BOUND as compatibility to make stable easier.
 - Fixup s/reboot_cpu_id/reboot_cpu/
 - Add include of linux/uaccess.h to allow building on arm.
 - Removed last remaining checkpatch.pl line length warning on
   kernel/reboot.c.
 - Fixed the duplicate handling of the reboot= kernel parameter.

Changes since -v5.
 - Moved the arch/x86 reboot= up to the generic kernel code.

Changes since -v4.
 - Integrated Srivatsa S. Bhat's cpu_hotplug_disable() function.
 - Integrated comments by Srivatsa S. Bhat.
 - Made one more comment consistent with others in function.

Changes since -v3.
 - Added a tested-by for the original reporter.
 - Fix compile failure found by Joe Perches.
 - Integrated comments by Joe Perches.

To: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Robin Holt <h...@sgi.com>
Cc: Russell King <li...@arm.linux.org.uk>
Cc: Guan Xuetao <g...@mprc.pku.edu.cn>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: the arch/x86 maintainers <x...@kernel.org>
Cc: Arm Mailing List <linux-arm-ker...@lists.infradead.org>
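For readers who want to see what parsing a reboot_cpu=smp### style token could look like, here is a minimal userspace sketch.  It is not the code from this series: it uses the C library's strtoul() where the changelog mentions the kernel's simple_strtoul(), and the function name parse_reboot_cpu() and its nr_cpus parameter are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace helper (not from the series): parse a token of
 * the form "smp<N>".  Returns the cpu number N, or -1 when the token does
 * not have that form or N is not below nr_cpus.
 */
int parse_reboot_cpu(const char *str, int nr_cpus)
{
	unsigned long cpu;

	assert(str != NULL);

	if (nr_cpus <= 0)
		return -1;
	if (strncmp(str, "smp", 3) != 0)
		return -1;
	/* Require at least one digit so a bare "smp" is rejected. */
	if (!isdigit((unsigned char)str[3]))
		return -1;

	/* strtoul() is the userspace stand-in for simple_strtoul(). */
	cpu = strtoul(str + 3, NULL, 10);
	if (cpu >= (unsigned long)nr_cpus)
		return -1;

	return (int)cpu;
}
```

Rejecting out-of-range numbers here is a choice made for the sketch; how the real parser treats them is not stated in this cover letter.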
[PATCH -v11 resend 07/11] unicore32, prepare reboot_mode for moving to generic kernel code.
This patch prepares for moving the parsing of reboot= to the generic
kernel code by making reboot_mode into a more generic form.

Signed-off-by: Robin Holt <h...@sgi.com>
To: Andrew Morton <a...@linux-foundation.org>
Cc: Guan Xuetao <g...@mprc.pku.edu.cn>
Cc: Russ Anderson <r...@sgi.com>
Cc: Robin Holt <h...@sgi.com>
Cc: Russell King <rmk+ker...@arm.linux.org.uk>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: the arch/x86 maintainers <x...@kernel.org>
Cc: Arm Mailing List <linux-arm-ker...@lists.infradead.org>
Acked-by: Guan Xuetao <g...@mprc.pku.edu.cn>

---

Changes since -v8
 - Switched from using REBOOT_WARM/COLD to HARD/SOFT.

---
 arch/unicore32/kernel/process.c | 10 +++++-----
 arch/unicore32/kernel/setup.h   |  2 +-
 arch/unicore32/mm/mmu.c         |  2 +-
 include/linux/reboot.h          |  2 ++
 4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c
index c944769..93dd035 100644
--- a/arch/unicore32/kernel/process.c
+++ b/arch/unicore32/kernel/process.c
@@ -51,14 +51,14 @@ void arch_cpu_idle(void)
 	local_irq_enable();
 }
 
-static char reboot_mode = 'h';
+static enum reboot_mode reboot_mode = REBOOT_HARD;
 
 int __init reboot_setup(char *str)
 {
-	reboot_mode = str[0];
+	if ('s' == str[0])
+		reboot_mode = REBOOT_SOFT;
 	return 1;
 }
-
 __setup("reboot=", reboot_setup);
 
 void machine_halt(void)
@@ -88,7 +88,7 @@ void machine_restart(char *cmd)
 	 * we may need it to insert some 1:1 mappings so that
 	 * soft boot works.
 	 */
-	setup_mm_for_reboot(reboot_mode);
+	setup_mm_for_reboot();
 
 	/* Clean and invalidate caches */
 	flush_cache_all();
@@ -102,7 +102,7 @@ void machine_restart(char *cmd)
 	/*
 	 * Now handle reboot code.
 	 */
-	if (reboot_mode == 's') {
+	if (reboot_mode == REBOOT_SOFT) {
 		/* Jump into ROM at address 0x */
 		cpu_reset(VECTORS_BASE);
 	} else {
diff --git a/arch/unicore32/kernel/setup.h b/arch/unicore32/kernel/setup.h
index 30f749d..f5c51b8 100644
--- a/arch/unicore32/kernel/setup.h
+++ b/arch/unicore32/kernel/setup.h
@@ -22,7 +22,7 @@ extern void puv3_ps2_init(void);
 extern void pci_puv3_preinit(void);
 extern void __init puv3_init_gpio(void);
 
-extern void setup_mm_for_reboot(char mode);
+extern void setup_mm_for_reboot(void);
 
 extern char __stubs_start[], __stubs_end[];
 extern char __vectors_start[], __vectors_end[];
diff --git a/arch/unicore32/mm/mmu.c b/arch/unicore32/mm/mmu.c
index 43c20b4..4f5a532 100644
--- a/arch/unicore32/mm/mmu.c
+++ b/arch/unicore32/mm/mmu.c
@@ -445,7 +445,7 @@ void __init paging_init(void)
  * the user-mode pages.  This will then ensure that we have predictable
  * results when turning the mmu off
  */
-void setup_mm_for_reboot(char mode)
+void setup_mm_for_reboot(void)
 {
 	unsigned long base_pmdval;
 	pgd_t *pgd;
diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index 37d56c3..ca29a6f 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -13,6 +13,8 @@
 enum reboot_mode {
 	REBOOT_COLD = 0,
 	REBOOT_WARM,
+	REBOOT_HARD,
+	REBOOT_SOFT,
 };
 
 extern int register_reboot_notifier(struct notifier_block *);
-- 
1.8.2.1
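The behavioural change in reboot_setup() above can be tried outside the kernel with a small sketch.  The enum is copied from the include/linux/reboot.h hunk; the function parse_reboot_mode() and its "current" parameter are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Copied from the include/linux/reboot.h hunk in this patch. */
enum reboot_mode {
	REBOOT_COLD = 0,
	REBOOT_WARM,
	REBOOT_HARD,
	REBOOT_SOFT,
};

/*
 * Hypothetical userspace stand-in for the patched reboot_setup(): only a
 * leading 's' changes the mode; anything else leaves the current mode
 * untouched.
 */
enum reboot_mode parse_reboot_mode(const char *str, enum reboot_mode current)
{
	assert(str != NULL);

	if (str[0] == 's')
		return REBOOT_SOFT;
	return current;
}
```

One visible difference from the old char-based code: the old version stored whatever first character was passed, while the enum version ignores everything except 's'.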
[PATCH -v11 resend 09/11] arm, prepare reboot_mode for moving to generic kernel code.
This patch prepares for moving the parsing of reboot= to the generic
kernel code by making reboot_mode into a more generic form.

Signed-off-by: Robin Holt <h...@sgi.com>
To: Andrew Morton <a...@linux-foundation.org>
Cc: Russell King <rmk+ker...@arm.linux.org.uk>
Cc: Russ Anderson <r...@sgi.com>
Cc: Robin Holt <h...@sgi.com>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Guan Xuetao <g...@mprc.pku.edu.cn>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: the arch/x86 maintainers <x...@kernel.org>
Cc: Arm Mailing List <linux-arm-ker...@lists.infradead.org>
Acked-by: Russell King <rmk+ker...@arm.linux.org.uk>

---

Changes since -v10
 - Uncommented an accidentally commented out line.

Changes since -v8
 - Switched from using REBOOT_WARM/COLD to HARD/SOFT.

---
 arch/arm/include/asm/mach/arch.h   | 3 ++-
 arch/arm/kernel/process.c          | 8 ++++----
 arch/arm/kernel/setup.c            | 6 +++---
 arch/arm/mach-footbridge/cats-hw.c | 2 +-
 4 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
index 308ad7d..e2b551e 100644
--- a/arch/arm/include/asm/mach/arch.h
+++ b/arch/arm/include/asm/mach/arch.h
@@ -9,6 +9,7 @@
  */
 
 #ifndef __ASSEMBLY__
+#include <linux/reboot.h>
 
 struct tag;
 struct meminfo;
@@ -39,7 +40,7 @@ struct machine_desc {
 	unsigned char		reserve_lp0 :1;	/* never has lp0	*/
 	unsigned char		reserve_lp1 :1;	/* never has lp1	*/
 	unsigned char		reserve_lp2 :1;	/* never has lp2	*/
-	char			restart_mode;	/* default restart mode	*/
+	enum reboot_mode	reboot_mode;	/* default restart mode	*/
 	struct smp_operations	*smp;		/* SMP operations	*/
 	void			(*fixup)(struct tag *, char **,
 					 struct meminfo *);
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index f219703..92b47df 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -174,14 +174,14 @@ void arch_cpu_idle(void)
 	default_idle();
 }
 
-static char reboot_mode = 'h';
+enum reboot_mode reboot_mode = REBOOT_HARD;
 
-int __init reboot_setup(char *str)
+static int __init reboot_setup(char *str)
 {
-	reboot_mode = str[0];
+	if ('s' == str[0])
+		reboot_mode = REBOOT_SOFT;
 	return 1;
 }
-
 __setup("reboot=", reboot_setup);
 
 void machine_shutdown(void)
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 1522c7a..e05df42 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -73,7 +73,7 @@ __setup("fpe=", fpe_setup);
 
 extern void paging_init(struct machine_desc *desc);
 extern void sanity_check_meminfo(void);
-extern void reboot_setup(char *str);
+extern enum reboot_mode reboot_mode;
 extern void setup_dma_zone(struct machine_desc *desc);
 
 unsigned int processor_id;
@@ -769,8 +769,8 @@ void __init setup_arch(char **cmdline_p)
 
 	setup_dma_zone(mdesc);
 
-	if (mdesc->restart_mode)
-		reboot_setup(&mdesc->restart_mode);
+	if (mdesc->reboot_mode != REBOOT_HARD)
+		reboot_mode = mdesc->reboot_mode;
 
 	init_mm.start_code = (unsigned long) _text;
 	init_mm.end_code   = (unsigned long) _etext;
diff --git a/arch/arm/mach-footbridge/cats-hw.c b/arch/arm/mach-footbridge/cats-hw.c
index 6987a09..9669cc0 100644
--- a/arch/arm/mach-footbridge/cats-hw.c
+++ b/arch/arm/mach-footbridge/cats-hw.c
@@ -86,7 +86,7 @@ fixup_cats(struct tag *tags, char **cmdline, struct meminfo *mi)
 MACHINE_START(CATS, "Chalice-CATS")
 	/* Maintainer: Philip Blundell */
 	.atag_offset	= 0x100,
-	.restart_mode	= 's',
+	.reboot_mode	= REBOOT_SOFT,
 	.fixup		= fixup_cats,
 	.map_io		= footbridge_map_io,
 	.init_irq	= footbridge_init_irq,
-- 
1.8.2.1
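The setup_arch() hunk above replaces a "non-zero char" test with a comparison against REBOOT_HARD.  The following userspace sketch replays that condition; struct machine_desc_sketch and apply_machine_default() are hypothetical, cut-down stand-ins rather than kernel code, and the enum assumes the four values that the series has in include/linux/reboot.h at this point.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to match include/linux/reboot.h after the earlier patches. */
enum reboot_mode {
	REBOOT_COLD = 0,
	REBOOT_WARM,
	REBOOT_HARD,
	REBOOT_SOFT,
};

/* Hypothetical, cut-down stand-in for struct machine_desc. */
struct machine_desc_sketch {
	const char *name;
	enum reboot_mode reboot_mode;	/* default restart mode */
};

/*
 * Mirrors the patched condition in setup_arch(): a descriptor whose
 * reboot_mode is anything other than REBOOT_HARD overrides the current
 * global mode.
 */
enum reboot_mode apply_machine_default(const struct machine_desc_sketch *mdesc,
				       enum reboot_mode current)
{
	assert(mdesc != NULL);

	if (mdesc->reboot_mode != REBOOT_HARD)
		return mdesc->reboot_mode;
	return current;
}
```

Because REBOOT_COLD is 0 in this enum, a descriptor that never assigns the field holds REBOOT_COLD, which also differs from REBOOT_HARD and so takes the override branch in this sketch.  Whether that matters on real arm machines is not something this mail establishes.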
[PATCH -v11 resend 01/11] CPU hotplug: Provide a generic helper to disable/enable CPU hotplug
From: Srivatsa S. Bhat <srivatsa.b...@linux.vnet.ibm.com>

There are instances in the kernel where we would like to disable CPU
hotplug (from sysfs) during some important operation.  Today the freezer
code depends on this and the code to do it was kinda tailor-made for that.

Restructure the code and make it generic enough to be useful for other
usecases too.

Signed-off-by: Srivatsa S. Bhat <srivatsa.b...@linux.vnet.ibm.com>
Signed-off-by: Robin Holt <h...@sgi.com>
To: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Robin Holt <h...@sgi.com>
Cc: Russell King <rmk+ker...@arm.linux.org.uk>
Cc: Guan Xuetao <g...@mprc.pku.edu.cn>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: the arch/x86 maintainers <x...@kernel.org>
Cc: Arm Mailing List <linux-arm-ker...@lists.infradead.org>
Cc: <sta...@vger.kernel.org>
---
 include/linux/cpu.h |  4 ++++
 kernel/cpu.c        | 55 +++++++++++++++++++++++--------------------------------
 2 files changed, 27 insertions(+), 32 deletions(-)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index c6f6e08..9f3c7e8 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -175,6 +175,8 @@ extern struct bus_type cpu_subsys;
 
 extern void get_online_cpus(void);
 extern void put_online_cpus(void);
+extern void cpu_hotplug_disable(void);
+extern void cpu_hotplug_enable(void);
 #define hotcpu_notifier(fn, pri)	cpu_notifier(fn, pri)
 #define register_hotcpu_notifier(nb)	register_cpu_notifier(nb)
 #define unregister_hotcpu_notifier(nb)	unregister_cpu_notifier(nb)
@@ -198,6 +200,8 @@ static inline void cpu_hotplug_driver_unlock(void)
 
 #define get_online_cpus()	do { } while (0)
 #define put_online_cpus()	do { } while (0)
+#define cpu_hotplug_disable()	do { } while (0)
+#define cpu_hotplug_enable()	do { } while (0)
 #define hotcpu_notifier(fn, pri)	do { (void)(fn); } while (0)
 /* These aren't inline functions due to a GCC bug. */
 #define register_hotcpu_notifier(nb)	({ (void)(nb); 0; })
diff --git a/kernel/cpu.c b/kernel/cpu.c
index b5e4ab2..198a388 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -133,6 +133,27 @@ static void cpu_hotplug_done(void)
 	mutex_unlock(&cpu_hotplug.lock);
 }
 
+/*
+ * Wait for currently running CPU hotplug operations to complete (if any) and
+ * disable future CPU hotplug (from sysfs). The 'cpu_add_remove_lock' protects
+ * the 'cpu_hotplug_disabled' flag. The same lock is also acquired by the
+ * hotplug path before performing hotplug operations. So acquiring that lock
+ * guarantees mutual exclusion from any currently running hotplug operations.
+ */
+void cpu_hotplug_disable(void)
+{
+	cpu_maps_update_begin();
+	cpu_hotplug_disabled = 1;
+	cpu_maps_update_done();
+}
+
+void cpu_hotplug_enable(void)
+{
+	cpu_maps_update_begin();
+	cpu_hotplug_disabled = 0;
+	cpu_maps_update_done();
+}
+
 #else /* #if CONFIG_HOTPLUG_CPU */
 static void cpu_hotplug_begin(void) {}
 static void cpu_hotplug_done(void) {}
@@ -541,36 +562,6 @@ static int __init alloc_frozen_cpus(void)
 core_initcall(alloc_frozen_cpus);
 
 /*
- * Prevent regular CPU hotplug from racing with the freezer, by disabling CPU
- * hotplug when tasks are about to be frozen. Also, don't allow the freezer
- * to continue until any currently running CPU hotplug operation gets
- * completed.
- * To modify the 'cpu_hotplug_disabled' flag, we need to acquire the
- * 'cpu_add_remove_lock'. And this same lock is also taken by the regular
- * CPU hotplug path and released only after it is complete. Thus, we
- * (and hence the freezer) will block here until any currently running CPU
- * hotplug operation gets completed.
- */
-void cpu_hotplug_disable_before_freeze(void)
-{
-	cpu_maps_update_begin();
-	cpu_hotplug_disabled = 1;
-	cpu_maps_update_done();
-}
-
-
-/*
- * When tasks have been thawed, re-enable regular CPU hotplug (which had been
- * disabled while beginning to freeze tasks).
- */
-void cpu_hotplug_enable_after_thaw(void)
-{
-	cpu_maps_update_begin();
-	cpu_hotplug_disabled = 0;
-	cpu_maps_update_done();
-}
-
-/*
  * When callbacks for CPU hotplug notifications are being executed, we must
  * ensure that the state of the system with respect to the tasks being frozen
  * or not, as reported by the notification, remains unchanged *throughout the
@@ -589,12 +580,12 @@ cpu_hotplug_pm_callback(struct notifier_block *nb,
 
 	case PM_SUSPEND_PREPARE:
 	case PM_HIBERNATION_PREPARE:
-		cpu_hotplug_disable_before_freeze();
+		cpu_hotplug_disable();
 		break;
 
 	case PM_POST_SUSPEND:
 	case PM_POST_HIBERNATION:
-		cpu_hotplug_enable_after_thaw();
+		cpu_hotplug_enable();
 		break;
 
 	default:
-- 
1.8.2.1
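The locking argument in the new comment (the flag is only changed under the same lock the hotplug path takes) can be modelled in userspace.  This is a rough analogy using a pthread mutex, not kernel code: hotplug_disable(), hotplug_enable() and try_hotplug() are invented names, and the "return -1 when disabled" behaviour of the hotplug path is an assumption made for the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Plays the role of cpu_add_remove_lock in this model. */
static pthread_mutex_t add_remove_lock = PTHREAD_MUTEX_INITIALIZER;
/* Plays the role of cpu_hotplug_disabled; only touched under the lock. */
static bool hotplug_disabled;

void hotplug_disable(void)
{
	/* Blocks until any in-flight try_hotplug() has dropped the lock. */
	pthread_mutex_lock(&add_remove_lock);
	hotplug_disabled = true;
	pthread_mutex_unlock(&add_remove_lock);
}

void hotplug_enable(void)
{
	pthread_mutex_lock(&add_remove_lock);
	hotplug_disabled = false;
	pthread_mutex_unlock(&add_remove_lock);
}

/*
 * Hypothetical hotplug path: adjusts *online_count by delta while holding
 * the lock.  Returns 0 on success, -1 if hotplug is currently disabled.
 */
int try_hotplug(int *online_count, int delta)
{
	int ret = -1;

	assert(online_count != NULL);

	pthread_mutex_lock(&add_remove_lock);
	if (!hotplug_disabled) {
		*online_count += delta;
		ret = 0;
	}
	pthread_mutex_unlock(&add_remove_lock);
	return ret;
}
```

Since both sides serialise on one mutex, a caller returning from hotplug_disable() knows no hotplug operation is half-way through, which is the property the shutdown path relies on.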
[PATCH -v11 resend 04/11] Move shutdown/reboot related functions to kernel/reboot.c
This patch is preparatory.  It moves reboot related syscall, etc
functions from kernel/sys.c to kernel/reboot.c.

Signed-off-by: Robin Holt <h...@sgi.com>
To: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Robin Holt <h...@sgi.com>
Cc: Russell King <rmk+ker...@arm.linux.org.uk>
Cc: Guan Xuetao <g...@mprc.pku.edu.cn>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: the arch/x86 maintainers <x...@kernel.org>
Cc: Arm Mailing List <linux-arm-ker...@lists.infradead.org>

---

Changes since -v6:
 - Add include of linux/uaccess.h to allow building on arm.

---
 kernel/Makefile |   2 +-
 kernel/reboot.c | 347 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/sys.c    | 331 -----------------------------------------------------
 3 files changed, 348 insertions(+), 332 deletions(-)
 create mode 100644 kernel/reboot.c

diff --git a/kernel/Makefile b/kernel/Makefile
index 271fd31..470839d 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -9,7 +9,7 @@ obj-y     = fork.o exec_domain.o panic.o printk.o \
 	    rcupdate.o extable.o params.o posix-timers.o \
 	    kthread.o wait.o sys_ni.o posix-cpu-timers.o mutex.o \
 	    hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \
-	    notifier.o ksysfs.o cred.o \
+	    notifier.o ksysfs.o cred.o reboot.o \
 	    async.o range.o groups.o lglock.o smpboot.o
 
 ifdef CONFIG_FUNCTION_TRACER
diff --git a/kernel/reboot.c b/kernel/reboot.c
new file mode 100644
index 0000000..0616483
--- /dev/null
+++ b/kernel/reboot.c
@@ -0,0 +1,347 @@
+/*
+ *  linux/kernel/reboot.c
+ *
+ *  Copyright (C) 2013  Linus Torvalds
+ */
+
+#include <linux/export.h>
+#include <linux/kexec.h>
+#include <linux/kmod.h>
+#include <linux/kmsg_dump.h>
+#include <linux/reboot.h>
+#include <linux/suspend.h>
+#include <linux/syscalls.h>
+#include <linux/syscore_ops.h>
+#include <linux/uaccess.h>
+
+/*
+ * this indicates whether you can reboot with ctrl-alt-del: the default is yes
+ */
+
+int C_A_D = 1;
+struct pid *cad_pid;
+EXPORT_SYMBOL(cad_pid);
+
+/*
+ * If set, this is used for preparing the system to power off.
+ */
+
+void (*pm_power_off_prepare)(void);
+
+/**
+ *	emergency_restart - reboot the system
+ *
+ *	Without shutting down any hardware or taking any locks
+ *	reboot the system.  This is called when we know we are in
+ *	trouble so this is our best effort to reboot.  This is
+ *	safe to call in interrupt context.
+ */
+void emergency_restart(void)
+{
+	kmsg_dump(KMSG_DUMP_EMERG);
+	machine_emergency_restart();
+}
+EXPORT_SYMBOL_GPL(emergency_restart);
+
+void kernel_restart_prepare(char *cmd)
+{
+	blocking_notifier_call_chain(&reboot_notifier_list, SYS_RESTART, cmd);
+	system_state = SYSTEM_RESTART;
+	usermodehelper_disable();
+	device_shutdown();
+}
+
+/**
+ *	register_reboot_notifier - Register function to be called at reboot time
+ *	@nb: Info about notifier function to be called
+ *
+ *	Registers a function with the list of functions
+ *	to be called at reboot time.
+ *
+ *	Currently always returns zero, as blocking_notifier_chain_register()
+ *	always returns zero.
+ */
+int register_reboot_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_register(&reboot_notifier_list, nb);
+}
+EXPORT_SYMBOL(register_reboot_notifier);
+
+/**
+ *	unregister_reboot_notifier - Unregister previously registered reboot notifier
+ *	@nb: Hook to be unregistered
+ *
+ *	Unregisters a previously registered reboot
+ *	notifier function.
+ *
+ *	Returns zero on success, or %-ENOENT on failure.
+ */
+int unregister_reboot_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_unregister(&reboot_notifier_list, nb);
+}
+EXPORT_SYMBOL(unregister_reboot_notifier);
+
+static void migrate_to_reboot_cpu(void)
+{
+	/* The boot cpu is always logical cpu 0 */
+	int cpu = 0;
+
+	cpu_hotplug_disable();
+
+	/* Make certain the cpu I'm about to reboot on is online */
+	if (!cpu_online(cpu))
+		cpu = cpumask_first(cpu_online_mask);
+
+	/* Prevent races with other tasks migrating this task */
+	current->flags |= PF_NO_SETAFFINITY;
+
+	/* Make certain I only run on the appropriate processor */
+	set_cpus_allowed_ptr(current, cpumask_of(cpu));
+}
+
+/**
+ *	kernel_restart - reboot the system
+ *	@cmd: pointer to buffer containing command to execute for restart
+ *		or %NULL
+ *
+ *	Shutdown everything and perform a clean reboot.
+ *	This is not safe to call in interrupt context.
+ */
+void kernel_restart(char *cmd)
+{
+	kernel_restart_prepare(cmd);
+	migrate_to_reboot_cpu();
+	syscore_shutdown();
+	if (!cmd)
+		printk(KERN_EMERG "Restarting system.\n");
+	else
+		printk(KERN_EMERG "Restarting system with command '%s'.\n", cmd);
+	kmsg_dump
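The cpu selection step in migrate_to_reboot_cpu() (prefer cpu 0, fall back to the first online cpu) can be sketched in userspace with a plain bitmask standing in for cpu_online_mask.  pick_reboot_cpu() is a hypothetical name, and returning -1 for an empty mask is a choice made for this sketch only.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical model of the selection in migrate_to_reboot_cpu(): bit N of
 * online_mask set means cpu N is online.  Returns the preferred cpu when
 * it is online, otherwise the lowest-numbered online cpu, or -1 when the
 * mask is empty.
 */
int pick_reboot_cpu(unsigned long online_mask, int preferred)
{
	const int nbits = (int)(sizeof(online_mask) * CHAR_BIT);
	int cpu;

	assert(preferred >= 0 && preferred < nbits);

	if (online_mask & (1UL << preferred))
		return preferred;

	/* Equivalent in spirit to cpumask_first(cpu_online_mask). */
	for (cpu = 0; cpu < nbits; cpu++)
		if (online_mask & (1UL << cpu))
			return cpu;

	return -1;
}
```

The sketch covers only the choice of cpu; pinning the task (PF_NO_SETAFFINITY and set_cpus_allowed_ptr() in the patch) has no counterpart here.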
[PATCH -v11 resend 05/11] checkpatch.pl the new kernel/reboot.c file.
Get the new file to pass scripts/checkpatch.pl

Signed-off-by: Robin Holt <h...@sgi.com>
To: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Robin Holt <h...@sgi.com>
Cc: Russell King <rmk+ker...@arm.linux.org.uk>
Cc: Guan Xuetao <g...@mprc.pku.edu.cn>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: the arch/x86 maintainers <x...@kernel.org>
Cc: Arm Mailing List <linux-arm-ker...@lists.infradead.org>

---

Changes since v6:
 - Removed last remaining line length warning.

---
 include/linux/reboot.h |  2 +-
 kernel/reboot.c        | 28 +++++++++++++---------------
 2 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index 23b3630..c6eba21 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -26,7 +26,7 @@ extern void machine_shutdown(void);
 struct pt_regs;
 extern void machine_crash_shutdown(struct pt_regs *);
 
-/* 
+/*
  * Architecture independent implemenations of sys_reboot commands.
  */
 
diff --git a/kernel/reboot.c b/kernel/reboot.c
index 0616483..abb6a04 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -4,6 +4,8 @@
  *  Copyright (C) 2013  Linus Torvalds
  */
 
+#define pr_fmt(fmt)	"reboot: " fmt
+
 #include <linux/export.h>
 #include <linux/kexec.h>
 #include <linux/kmod.h>
@@ -114,9 +116,9 @@ void kernel_restart(char *cmd)
 	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	if (!cmd)
-		printk(KERN_EMERG "Restarting system.\n");
+		pr_emerg("Restarting system\n");
 	else
-		printk(KERN_EMERG "Restarting system with command '%s'.\n", cmd);
+		pr_emerg("Restarting system with command '%s'\n", cmd);
 	kmsg_dump(KMSG_DUMP_RESTART);
 	machine_restart(cmd);
 }
@@ -125,7 +127,7 @@ EXPORT_SYMBOL_GPL(kernel_restart);
 static void kernel_shutdown_prepare(enum system_states state)
 {
 	blocking_notifier_call_chain(&reboot_notifier_list,
-		(state == SYSTEM_HALT)?SYS_HALT:SYS_POWER_OFF, NULL);
+		(state == SYSTEM_HALT) ? SYS_HALT : SYS_POWER_OFF, NULL);
 	system_state = state;
 	usermodehelper_disable();
 	device_shutdown();
@@ -140,11 +142,10 @@ void kernel_halt(void)
 	kernel_shutdown_prepare(SYSTEM_HALT);
 	migrate_to_reboot_cpu();
 	syscore_shutdown();
-	printk(KERN_EMERG "System halted.\n");
+	pr_emerg("System halted\n");
 	kmsg_dump(KMSG_DUMP_HALT);
 	machine_halt();
 }
-
 EXPORT_SYMBOL_GPL(kernel_halt);
 
 /**
@@ -159,7 +160,7 @@ void kernel_power_off(void)
 		pm_power_off_prepare();
 	migrate_to_reboot_cpu();
 	syscore_shutdown();
-	printk(KERN_EMERG "Power down.\n");
+	pr_emerg("Power down\n");
 	kmsg_dump(KMSG_DUMP_POWEROFF);
 	machine_power_off();
 }
@@ -188,10 +189,10 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
 
 	/* For safety, we require magic arguments. */
 	if (magic1 != LINUX_REBOOT_MAGIC1 ||
-	    (magic2 != LINUX_REBOOT_MAGIC2 &&
-	                magic2 != LINUX_REBOOT_MAGIC2A &&
+			(magic2 != LINUX_REBOOT_MAGIC2 &&
+			magic2 != LINUX_REBOOT_MAGIC2A &&
 			magic2 != LINUX_REBOOT_MAGIC2B &&
-	                magic2 != LINUX_REBOOT_MAGIC2C))
+			magic2 != LINUX_REBOOT_MAGIC2C))
 		return -EINVAL;
 
 	/*
@@ -234,7 +235,8 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
 		break;
 
 	case LINUX_REBOOT_CMD_RESTART2:
-		if (strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1) < 0) {
+		ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1);
+		if (ret < 0) {
 			ret = -EFAULT;
 			break;
 		}
@@ -282,7 +284,6 @@ void ctrl_alt_del(void)
 	else
 		kill_cad_pid(SIGINT, 1);
 }
-	
 char poweroff_cmd[POWEROFF_CMD_PATH_LEN] = "/sbin/poweroff";
 
 
@@ -301,14 +302,11 @@ static int __orderly_poweroff(bool force)
 		ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
 		argv_free(argv);
 	} else {
-		printk(KERN_WARNING "%s failed to allocate memory for \"%s\"\n",
-					 __func__, poweroff_cmd);
 		ret = -ENOMEM;
 	}
 
 	if (ret && force) {
-		printk(KERN_WARNING "Failed to start orderly shutdown: "
-					"forcing the issue\n");
+		pr_warn("Failed to start orderly shutdown: forcing the issue\n");
 		/*
 		 * I guess this should try to kick off some daemon to sync and
 		 * poweroff asap.  Or not even bother syncing if we're doing an
-- 
1.8.2.1
[PATCH -v11 resend 06/11] x86, prepare reboot_mode for moving to generic kernel code.
This patch prepares for moving the parsing of reboot= to the generic
kernel code by making reboot_mode into a more generic form.

Signed-off-by: Robin Holt h...@sgi.com
To: Andrew Morton a...@linux-foundation.org
Cc: H. Peter Anvin h...@zytor.com
Cc: Miguel Boton mboton.l...@gmail.com
Cc: Russ Anderson r...@sgi.com
Cc: Robin Holt h...@sgi.com
Cc: Russell King rmk+ker...@arm.linux.org.uk
Cc: Guan Xuetao g...@mprc.pku.edu.cn
Cc: Linux Kernel Mailing List linux-kernel@vger.kernel.org
Cc: the arch/x86 maintainers x...@kernel.org
Cc: Arm Mailing List linux-arm-ker...@lists.infradead.org
Acked-by: Ingo Molnar mi...@kernel.org
---
 arch/x86/kernel/reboot.c | 12 +++-
 include/linux/reboot.h   |  5 +
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 76fa1e9..f770340 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -36,7 +36,7 @@ void (*pm_power_off)(void);
 EXPORT_SYMBOL(pm_power_off);

 static const struct desc_ptr no_idt = {};
-static int reboot_mode;
+static enum reboot_mode reboot_mode;
 enum reboot_type reboot_type = BOOT_ACPI;
 int reboot_force;

@@ -88,11 +88,11 @@ static int __init reboot_setup(char *str)

 		switch (*str) {
 		case 'w':
-			reboot_mode = 0x1234;
+			reboot_mode = REBOOT_WARM;
 			break;

 		case 'c':
-			reboot_mode = 0;
+			reboot_mode = REBOOT_COLD;
 			break;

 #ifdef CONFIG_SMP
@@ -536,6 +536,7 @@ static void native_machine_emergency_restart(void)
 	int i;
 	int attempt = 0;
 	int orig_reboot_type = reboot_type;
+	unsigned short mode;

 	if (reboot_emergency)
 		emergency_vmx_disable_all();
@@ -543,7 +544,8 @@ static void native_machine_emergency_restart(void)
 	tboot_shutdown(TB_SHUTDOWN_REBOOT);

 	/* Tell the BIOS if we want cold or warm reboot */
-	*((unsigned short *)__va(0x472)) = reboot_mode;
+	mode = reboot_mode == REBOOT_WARM ? 0x1234 : 0;
+	*((unsigned short *)__va(0x472)) = mode;

 	for (;;) {
 		/* Could also try the reset bit in the Hammer NB */
@@ -585,7 +587,7 @@ static void native_machine_emergency_restart(void)

 		case BOOT_EFI:
 			if (efi_enabled(EFI_RUNTIME_SERVICES))
-				efi.reset_system(reboot_mode ?
+				efi.reset_system(reboot_mode == REBOOT_WARM ?
 						 EFI_RESET_WARM :
 						 EFI_RESET_COLD,
 						 EFI_SUCCESS, 0, NULL);
diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index c6eba21..37d56c3 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -10,6 +10,11 @@
 #define SYS_HALT	0x0002	/* Notify of system halt */
 #define SYS_POWER_OFF	0x0003	/* Notify of system power off */

+enum reboot_mode {
+	REBOOT_COLD = 0,
+	REBOOT_WARM,
+};
+
 extern int register_reboot_notifier(struct notifier_block *);
 extern int unregister_reboot_notifier(struct notifier_block *);

--
1.8.2.1
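The point of the patch above is that reboot_mode stops carrying the BIOS magic value itself and becomes an enum that is translated only at the place where the value is written to 0x472. A minimal userspace sketch of that separation follows. The enum mirrors the one the patch adds; bios_reset_flag() and parse_mode_letter() are hypothetical helper names used for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the enum added to include/linux/reboot.h by the patch. */
enum reboot_mode {
	REBOOT_COLD = 0,
	REBOOT_WARM,
};

/*
 * Hypothetical helper: the 16-bit value the x86 code stores at physical
 * address 0x472 before resetting. 0x1234 requests a warm reboot, 0 a
 * cold one, as in the native_machine_emergency_restart() hunk.
 */
static uint16_t bios_reset_flag(enum reboot_mode mode)
{
	return mode == REBOOT_WARM ? 0x1234 : 0;
}

/*
 * Hypothetical helper: map the 'w' and 'c' letters of reboot= to the
 * enum, as the reboot_setup() hunk does. Returns 0 on success and -1
 * for a letter this sketch does not handle.
 */
static int parse_mode_letter(char c, enum reboot_mode *mode)
{
	assert(mode != NULL);

	switch (c) {
	case 'w':
		*mode = REBOOT_WARM;
		return 0;
	case 'c':
		*mode = REBOOT_COLD;
		return 0;
	default:
		return -1;
	}
}
```

With this split, other consumers (the EFI path in the same patch, for example) can compare against REBOOT_WARM instead of relying on the enum's numeric value being non-zero.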
[PATCH -v11 resend 08/11] arm, Remove unused restart_mode fields from some arm subarchs
These restart_mode fields are not used at all. Remove them to make
moving the reboot= cmdline options to the general kernel easier.

Signed-off-by: Robin Holt h...@sgi.com
To: Andrew Morton a...@linux-foundation.org
Cc: Russell King rmk+ker...@arm.linux.org.uk
Cc: Russ Anderson r...@sgi.com
Cc: Robin Holt h...@sgi.com
Cc: H. Peter Anvin h...@zytor.com
Cc: Guan Xuetao g...@mprc.pku.edu.cn
Cc: Linux Kernel Mailing List linux-kernel@vger.kernel.org
Cc: the arch/x86 maintainers x...@kernel.org
Cc: Arm Mailing List linux-arm-ker...@lists.infradead.org
Acked-by: Russell King rmk+ker...@arm.linux.org.uk
---
 arch/arm/mach-ebsa110/core.c | 1 -
 arch/arm/mach-pxa/mioa701.c  | 1 -
 arch/arm/mach-pxa/spitz.c    | 3 ---
 arch/arm/mach-pxa/tosa.c     | 1 -
 4 files changed, 6 deletions(-)

diff --git a/arch/arm/mach-ebsa110/core.c b/arch/arm/mach-ebsa110/core.c
index b13cc74..69a9d5d 100644
--- a/arch/arm/mach-ebsa110/core.c
+++ b/arch/arm/mach-ebsa110/core.c
@@ -321,7 +321,6 @@ MACHINE_START(EBSA110, "EBSA110")
 	.atag_offset	= 0x400,
 	.reserve_lp0	= 1,
 	.reserve_lp2	= 1,
-	.restart_mode	= 's',
 	.map_io		= ebsa110_map_io,
 	.init_early	= ebsa110_init_early,
 	.init_irq	= ebsa110_init_irq,
diff --git a/arch/arm/mach-pxa/mioa701.c b/arch/arm/mach-pxa/mioa701.c
index f8979b9..dbea67a 100644
--- a/arch/arm/mach-pxa/mioa701.c
+++ b/arch/arm/mach-pxa/mioa701.c
@@ -756,7 +756,6 @@ static void mioa701_machine_exit(void)

 MACHINE_START(MIOA701, "MIO A701")
 	.atag_offset	= 0x100,
-	.restart_mode	= 's',
 	.map_io		= pxa27x_map_io,
 	.nr_irqs	= PXA_NR_IRQS,
 	.init_irq	= pxa27x_init_irq,
diff --git a/arch/arm/mach-pxa/spitz.c b/arch/arm/mach-pxa/spitz.c
index 362726c..c3c0042 100644
--- a/arch/arm/mach-pxa/spitz.c
+++ b/arch/arm/mach-pxa/spitz.c
@@ -979,7 +979,6 @@ static void __init spitz_fixup(struct tag *tags, char **cmdline,

 #ifdef CONFIG_MACH_SPITZ
 MACHINE_START(SPITZ, "SHARP Spitz")
-	.restart_mode	= 'g',
 	.fixup		= spitz_fixup,
 	.map_io		= pxa27x_map_io,
 	.nr_irqs	= PXA_NR_IRQS,
@@ -993,7 +992,6 @@ MACHINE_END

 #ifdef CONFIG_MACH_BORZOI
 MACHINE_START(BORZOI, "SHARP Borzoi")
-	.restart_mode	= 'g',
 	.fixup		= spitz_fixup,
 	.map_io		= pxa27x_map_io,
 	.nr_irqs	= PXA_NR_IRQS,
@@ -1007,7 +1005,6 @@ MACHINE_END

 #ifdef CONFIG_MACH_AKITA
 MACHINE_START(AKITA, "SHARP Akita")
-	.restart_mode	= 'g',
 	.fixup		= spitz_fixup,
 	.map_io		= pxa27x_map_io,
 	.nr_irqs	= PXA_NR_IRQS,
diff --git a/arch/arm/mach-pxa/tosa.c b/arch/arm/mach-pxa/tosa.c
index 3d91d2e..a41992f 100644
--- a/arch/arm/mach-pxa/tosa.c
+++ b/arch/arm/mach-pxa/tosa.c
@@ -969,7 +969,6 @@ static void __init fixup_tosa(struct tag *tags, char **cmdline,
 }

 MACHINE_START(TOSA, "SHARP Tosa")
-	.restart_mode	= 'g',
 	.fixup		= fixup_tosa,
 	.map_io		= pxa25x_map_io,
 	.nr_irqs	= TOSA_NR_IRQS,
--
1.8.2.1
[PATCH -v11 resend 11/11] Move arch/x86 reboot= handling to generic kernel.
Merge together the unicore32, arm, and x86 reboot= command line
parameter handling.

Signed-off-by: Robin Holt h...@sgi.com
To: Andrew Morton a...@linux-foundation.org
Cc: H. Peter Anvin h...@zytor.com
Cc: Russell King rmk+ker...@arm.linux.org.uk
Cc: Guan Xuetao g...@mprc.pku.edu.cn
Cc: Russ Anderson r...@sgi.com
Cc: Robin Holt h...@sgi.com
Cc: Linux Kernel Mailing List linux-kernel@vger.kernel.org
Cc: the arch/x86 maintainers x...@kernel.org
Cc: Arm Mailing List linux-arm-ker...@lists.infradead.org
Acked-by: Ingo Molnar mi...@kernel.org
Acked-by: Guan Xuetao g...@mprc.pku.edu.cn
Acked-by: Russell King rmk+ker...@arm.linux.org.uk
---
Changes since -v8
 - Add missing break statements.
 - Change parsing so #ifdef's are no longer needed.
 - Switch to using simple_strtoul to make parsing cleaner.
 - Add handling of REBOOT_HARD/SOFT
---
 Documentation/kernel-parameters.txt      |  14 +++-
 arch/arm/kernel/process.c                |  10 ---
 arch/unicore32/kernel/process.c          |  10 ---
 arch/x86/include/asm/emergency-restart.h |  12
 arch/x86/kernel/apic/x2apic_uv_x.c       |   2 +-
 arch/x86/kernel/reboot.c                 | 111 +--
 include/linux/reboot.h                   |  17 +
 kernel/reboot.c                          |  76 -
 8 files changed, 107 insertions(+), 145 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index c3bfacb..b2945ce 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2677,9 +2677,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			Run specified binary instead of /init from the ramdisk,
 			used for early userspace startup. See initrd.

-	reboot=		[BUGS=X86-32,BUGS=ARM,BUGS=IA-64] Rebooting mode
-			Format: <reboot_mode>[,<reboot_mode2>[,...]]
-			See arch/*/kernel/reboot.c or arch/*/kernel/process.c
+	reboot=		[KNL]
+			Format (x86 or x86_64):
+				[w[arm] | c[old] | h[ard] | s[oft] | g[pio]] \
+				[[,]s[mp]#### \
+				[[,]b[ios] | a[cpi] | k[bd] | t[riple] | e[fi] | p[ci]] \
+				[[,]f[orce]
+			Where reboot_mode is one of warm (soft) or cold (hard) or gpio,
+			      reboot_type is one of bios, acpi, kbd, triple, efi, or pci,
+			      reboot_force is either force or not specified,
+			      reboot_cpu is s[mp]#### with #### being the processor
+					to be used for rebooting.

 	relax_domain_level=
 			[KNL, SMP] Set scheduler's default relax_domain_level.
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 42856fc..304b102 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -175,16 +175,6 @@ void arch_cpu_idle(void)
 	default_idle();
 }

-enum reboot_mode reboot_mode = REBOOT_HARD;
-
-static int __init reboot_setup(char *str)
-{
-	if ('s' == str[0])
-		reboot_mode = REBOOT_SOFT;
-	return 1;
-}
-__setup("reboot=", reboot_setup);
-
 void machine_shutdown(void)
 {
 #ifdef CONFIG_SMP
diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c
index 93dd035..778ebba 100644
--- a/arch/unicore32/kernel/process.c
+++ b/arch/unicore32/kernel/process.c
@@ -51,16 +51,6 @@ void arch_cpu_idle(void)
 	local_irq_enable();
 }

-static enum reboot_mode reboot_mode = REBOOT_HARD;
-
-int __init reboot_setup(char *str)
-{
-	if ('s' == str[0])
-		reboot_mode = REBOOT_SOFT;
-	return 1;
-}
-__setup("reboot=", reboot_setup);
-
 void machine_halt(void)
 {
 	gpio_set_value(GPO_SOFT_OFF, 0);
diff --git a/arch/x86/include/asm/emergency-restart.h b/arch/x86/include/asm/emergency-restart.h
index 75ce3f4..77a99ac 100644
--- a/arch/x86/include/asm/emergency-restart.h
+++ b/arch/x86/include/asm/emergency-restart.h
@@ -1,18 +1,6 @@
 #ifndef _ASM_X86_EMERGENCY_RESTART_H
 #define _ASM_X86_EMERGENCY_RESTART_H

-enum reboot_type {
-	BOOT_TRIPLE = 't',
-	BOOT_KBD = 'k',
-	BOOT_BIOS = 'b',
-	BOOT_ACPI = 'a',
-	BOOT_EFI = 'e',
-	BOOT_CF9 = 'p',
-	BOOT_CF9_COND = 'q',
-};
-
-extern enum reboot_type reboot_type;
-
 extern void machine_emergency_restart(void);

 #endif /* _ASM_X86_EMERGENCY_RESTART_H */
diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c
index 794f6eb..958e3e4 100644
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -25,6 +25,7 @@
 #include <linux/kdebug.h>
 #include <linux/delay.h>
 #include <linux/crash_dump.h>
+#include <linux/reboot.h>

 #include <asm/uv/uv_mmrs.h>
 #include <asm/uv/uv_hub.h>
@@ -36,7 +37,6
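The generic parser itself (kernel/reboot.c) is cut off in this copy of the patch, so the following is only a userspace sketch of how the documented reboot= format could be walked, not the patch's code. Assumptions are called out in the comments: the struct, the enum values beyond COLD/WARM, and the function name parse_reboot() are hypothetical; tokens are taken to be comma-separated and identified by their first letter; and "s" followed by digits (or "smp" followed by digits) is read as the reboot CPU, otherwise as "soft".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical enum for this sketch; the patch's own values are not shown. */
enum reboot_mode {
	REBOOT_COLD = 0,
	REBOOT_WARM,
	REBOOT_HARD,
	REBOOT_SOFT,
	REBOOT_GPIO,
};

/* Hypothetical container for the four documented settings. */
struct reboot_opts {
	enum reboot_mode mode;
	char type;	/* 'b', 'a', 'k', 't', 'e' or 'p'; 0 if not given */
	int force;	/* 1 if f[orce] was given */
	int cpu;	/* processor number from s[mp]####; -1 if not given */
};

/* Walk a comma-separated reboot= string, keying on each token's first letter. */
static void parse_reboot(const char *str, struct reboot_opts *o)
{
	assert(str != NULL && o != NULL);

	while (*str) {
		switch (*str) {
		case 'w': o->mode = REBOOT_WARM; break;
		case 'c': o->mode = REBOOT_COLD; break;
		case 'h': o->mode = REBOOT_HARD; break;
		case 'g': o->mode = REBOOT_GPIO; break;
		case 's':
			/* Assumption: digits after "s" or "smp" select the CPU. */
			if (isdigit((unsigned char)str[1]))
				o->cpu = (int)strtoul(str + 1, NULL, 0);
			else if (strncmp(str, "smp", 3) == 0 &&
				 isdigit((unsigned char)str[3]))
				o->cpu = (int)strtoul(str + 3, NULL, 0);
			else
				o->mode = REBOOT_SOFT;
			break;
		case 'b': case 'a': case 'k':
		case 't': case 'e': case 'p':
			o->type = *str;
			break;
		case 'f':
			o->force = 1;
			break;
		}

		/* Advance to the token after the next comma, if any. */
		str = strchr(str, ',');
		if (str == NULL)
			break;
		str++;
	}
}
```

Under these assumptions, "warm,smp4,efi,force" would select a warm reboot on processor 4 through EFI with force set, and a bare "s" would select the soft mode and leave the CPU unset.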
[PATCH -v11 resend 03/11] Remove -stable friendly PF_THREAD_BOUND define
Remove the prior patch's #define for easier backporting to the stable
releases.

Signed-off-by: Robin Holt h...@sgi.com
To: Andrew Morton a...@linux-foundation.org
Cc: H. Peter Anvin h...@zytor.com
Cc: Russ Anderson r...@sgi.com
Cc: Robin Holt h...@sgi.com
Cc: Russell King rmk+ker...@arm.linux.org.uk
Cc: Guan Xuetao g...@mprc.pku.edu.cn
Cc: Linux Kernel Mailing List linux-kernel@vger.kernel.org
Cc: the arch/x86 maintainers x...@kernel.org
Cc: Arm Mailing List linux-arm-ker...@lists.infradead.org
---
 kernel/sys.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index 2bbd9a7..17bb8d3 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -362,11 +362,6 @@ int unregister_reboot_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_reboot_notifier);

-/* Add backwards compatibility for stable trees. */
-#ifndef PF_NO_SETAFFINITY
-#define PF_NO_SETAFFINITY		PF_THREAD_BOUND
-#endif
-
 static void migrate_to_reboot_cpu(void)
 {
 	/* The boot cpu is always logical cpu 0 */
--
1.8.2.1
[PATCH -v11 resend 02/11] Migrate shutdown/reboot to boot cpu.
We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus. The slowdown was tracked to commit
f96972f.

The current implementation does all the work of hot removing the cpus
before halting the system. We are switching to just migrating to the
boot cpu and then continuing with shutdown/reboot.

This also has the effect of not breaking x86's command line parameter
for specifying the reboot cpu. Note, this code was shamelessly copied
from arch/x86/kernel/reboot.c with bits removed pertaining to the
reboot_cpu command line parameter.

Signed-off-by: Robin Holt h...@sgi.com
Tested-by: Shawn Guo shawn@linaro.org
To: Andrew Morton a...@linux-foundation.org
Cc: H. Peter Anvin h...@zytor.com
Cc: Russ Anderson r...@sgi.com
Cc: Robin Holt h...@sgi.com
Cc: Russell King rmk+ker...@arm.linux.org.uk
Cc: Guan Xuetao g...@mprc.pku.edu.cn
Cc: Linux Kernel Mailing List linux-kernel@vger.kernel.org
Cc: the arch/x86 maintainers x...@kernel.org
Cc: Arm Mailing List linux-arm-ker...@lists.infradead.org
Cc: sta...@vger.kernel.org
---
Changes since -v8
 - Change stack parameter to make future patches cleaner.

Changes since -v6:
 - Add #define for PF_THREAD_BOUND as compatibility to make stable
   easier.
 - Fixup s/reboot_cpu_id/reboot_cpu/
---
 kernel/sys.c | 29 ++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index b95d3c7..2bbd9a7 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -362,6 +362,29 @@ int unregister_reboot_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_reboot_notifier);

+/* Add backwards compatibility for stable trees. */
+#ifndef PF_NO_SETAFFINITY
+#define PF_NO_SETAFFINITY		PF_THREAD_BOUND
+#endif
+
+static void migrate_to_reboot_cpu(void)
+{
+	/* The boot cpu is always logical cpu 0 */
+	int cpu = 0;
+
+	cpu_hotplug_disable();
+
+	/* Make certain the cpu I'm about to reboot on is online */
+	if (!cpu_online(cpu))
+		cpu = cpumask_first(cpu_online_mask);
+
+	/* Prevent races with other tasks migrating this task */
+	current->flags |= PF_NO_SETAFFINITY;
+
+	/* Make certain I only run on the appropriate processor */
+	set_cpus_allowed_ptr(current, cpumask_of(cpu));
+}
+
 /**
  *	kernel_restart - reboot the system
  *	@cmd: pointer to buffer containing command to execute for restart
@@ -373,7 +396,7 @@ EXPORT_SYMBOL(unregister_reboot_notifier);
 void kernel_restart(char *cmd)
 {
 	kernel_restart_prepare(cmd);
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	if (!cmd)
 		printk(KERN_EMERG "Restarting system.\n");
@@ -400,7 +423,7 @@ static void kernel_shutdown_prepare(enum system_states state)
 void kernel_halt(void)
 {
 	kernel_shutdown_prepare(SYSTEM_HALT);
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	printk(KERN_EMERG "System halted.\n");
 	kmsg_dump(KMSG_DUMP_HALT);
@@ -419,7 +442,7 @@ void kernel_power_off(void)
 	kernel_shutdown_prepare(SYSTEM_POWER_OFF);
 	if (pm_power_off_prepare)
 		pm_power_off_prepare();
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	printk(KERN_EMERG "Power down.\n");
 	kmsg_dump(KMSG_DUMP_POWEROFF);
--
1.8.2.1
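The CPU selection in migrate_to_reboot_cpu() is a two-step rule: prefer logical CPU 0, and fall back to the first online CPU only if CPU 0 is offline. A small userspace model of just that rule is sketched below. It assumes a 64-bit bitmask as a stand-in for cpu_online_mask; the type cpu_mask_t and the three function names are hypothetical and exist only for this illustration. The hotplug disabling, the PF_NO_SETAFFINITY flag and the actual migration are kernel-only and are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for cpu_online_mask: bit n set means CPU n is online. */
typedef uint64_t cpu_mask_t;

/* Models cpu_online(): non-zero if the given CPU's bit is set. */
static int cpu_is_online(cpu_mask_t online, int cpu)
{
	return (int)((online >> cpu) & 1u);
}

/* Models cpumask_first(): lowest-numbered online CPU, or -1 if none. */
static int first_online_cpu(cpu_mask_t online)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (cpu_is_online(online, cpu))
			return cpu;
	return -1;
}

/* The selection rule from the patch: CPU 0 if online, else the first online CPU. */
static int pick_reboot_cpu(cpu_mask_t online)
{
	int cpu = 0;

	/* The task making the call runs somewhere, so at least one CPU is online. */
	assert(online != 0);

	if (!cpu_is_online(online, cpu))
		cpu = first_online_cpu(online);
	return cpu;
}
```

With CPUs 2 and 3 online (mask 0xC), the rule picks CPU 2. The work is a constant number of steps for the calling task, in contrast to hot-removing every other CPU before the reboot proceeds, which is the cost the commit message describes.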
Re: Handling NUMA page migration
On Tue, Jun 04, 2013 at 02:14:45PM +0200, Frank Mehnert wrote:
> On Tuesday 04 June 2013 13:58:07 Robin Holt wrote:
> > This is probably more appropriate to be directed at the linux-mm
> > mailing list.
> >
> > On Tue, Jun 04, 2013 at 09:22:10AM +0200, Frank Mehnert wrote:
> > > Hi,
> > >
> > > our memory management on Linux hosts conflicts with NUMA page migration.
> > > I assume this problem existed for a longer time but Linux 3.8 introduced
> > > automatic NUMA page balancing which makes the problem visible on
> > > multi-node hosts leading to kernel oopses.
> > >
> > > NUMA page migration means that the physical address of a page changes.
> > > This is fatal if the application assumes that this never happens for
> > > that page as it was supposed to be pinned.
> > >
> > > We have two kind of pinned memory:
> > >
> > > A) 1. allocate memory in userland with mmap()
> > >    2. madvise(MADV_DONTFORK)
> > >    3. pin with get_user_pages().
> > >    4. flush dcache_page()
> > >    5. vm_flags |= (VM_DONTCOPY | VM_LOCKED)
> > >       (resulting flags are VM_MIXEDMAP | VM_DONTDUMP | VM_DONTEXPAND |
> > >        VM_DONTCOPY | VM_LOCKED | 0xff)
> >
> > I don't think this type of allocation should be affected.  The
> > get_user_pages() call should elevate the pages reference count which
> > should prevent migration from completing.  I would, however, wait for
> > a more definitive answer.
>
> Thanks Robin! Actually case B) is more important for us so I'm waiting
> for more feedback :)

If you have a good test case, you might want to try adding a get_page()
in there to see if that mitigates the problem.  It would at least be
interesting to know if it has an effect.

Robin

>
> Frank
>
> > > B) 1. allocate memory with alloc_pages()
> > >    2. SetPageReserved()
> > >    3. vm_mmap() to allocate a userspace mapping
> > >    4. vm_insert_page()
> > >    5. vm_flags |= (VM_DONTEXPAND | VM_DONTDUMP)
> > >       (resulting flags are VM_MIXEDMAP | VM_DONTDUMP | VM_DONTEXPAND |
> > >        0xff)
> > >
> > > At least the memory allocated like B) is affected by automatic NUMA page
> > > migration. I'm not sure about A).
> > >
> > > 1. How can I prevent automatic NUMA page migration on this memory?
> > > 2. Can NUMA page migration also be handled on such kind of memory without
> > >    preventing migration?
> > >
> > > Thanks,
> > >
> > > Frank
>
> --
> Dr.-Ing. Frank Mehnert | Software Development Director, VirtualBox
> ORACLE Deutschland B.V. & Co. KG | Werkstr. 24 | 71384 Weinstadt, Germany
>
> Headquarters: Riesstr. 25, D-80992 München
> Court of registration: Amtsgericht München, HRA 95603
> Managing director: Jürgen Kunz
>
> General partner: ORACLE Deutschland Verwaltung B.V.
> Hertogswetering 163/167, 3543 AS Utrecht, Netherlands
> Commercial register of the Midden-Nederland Chamber of Commerce, No. 30143697
> Managing directors: Alexander van der Ven, Astrid Kepper, Val Maher
Re: Handling NUMA page migration
This is probably more appropriate to be directed at the linux-mm
mailing list.

On Tue, Jun 04, 2013 at 09:22:10AM +0200, Frank Mehnert wrote:
> Hi,
>
> our memory management on Linux hosts conflicts with NUMA page migration.
> I assume this problem existed for a longer time but Linux 3.8 introduced
> automatic NUMA page balancing which makes the problem visible on
> multi-node hosts leading to kernel oopses.
>
> NUMA page migration means that the physical address of a page changes.
> This is fatal if the application assumes that this never happens for
> that page as it was supposed to be pinned.
>
> We have two kind of pinned memory:
>
> A) 1. allocate memory in userland with mmap()
>    2. madvise(MADV_DONTFORK)
>    3. pin with get_user_pages().
>    4. flush dcache_page()
>    5. vm_flags |= (VM_DONTCOPY | VM_LOCKED)
>       (resulting flags are VM_MIXEDMAP | VM_DONTDUMP | VM_DONTEXPAND |
>        VM_DONTCOPY | VM_LOCKED | 0xff)

I don't think this type of allocation should be affected.  The
get_user_pages() call should elevate the pages' reference count, which
should prevent migration from completing.  I would, however, wait for
a more definitive answer.

> B) 1. allocate memory with alloc_pages()
>    2. SetPageReserved()
>    3. vm_mmap() to allocate a userspace mapping
>    4. vm_insert_page()
>    5. vm_flags |= (VM_DONTEXPAND | VM_DONTDUMP)
>       (resulting flags are VM_MIXEDMAP | VM_DONTDUMP | VM_DONTEXPAND | 0xff)
>
> At least the memory allocated like B) is affected by automatic NUMA page
> migration. I'm not sure about A).
>
> 1. How can I prevent automatic NUMA page migration on this memory?
> 2. Can NUMA page migration also be handled on such kind of memory without
>    preventing migration?
>
> Thanks,
>
> Frank
> --
> Dr.-Ing. Frank Mehnert | Software Development Director, VirtualBox
> ORACLE Deutschland B.V. & Co. KG | Werkstr. 24 | 71384 Weinstadt, Germany
>
> Headquarters: Riesstr. 25, D-80992 München
> Court of registration: Amtsgericht München, HRA 95603
> Managing director: Jürgen Kunz
>
> General partner: ORACLE Deutschland Verwaltung B.V.
> Hertogswetering 163/167, 3543 AS Utrecht, Netherlands
> Commercial register of the Midden-Nederland Chamber of Commerce, No. 30143697
> Managing directors: Alexander van der Ven, Astrid Kepper, Val Maher
Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code
Russ,

Can we open a bug for the BIOS folks and see if we can get this addressed?

Robin

On Fri, May 24, 2013 at 08:43:31AM +0100, Matt Fleming wrote:
> On Thu, 23 May, at 03:32:34PM, Russ Anderson wrote:
> > efi: mem127: type=4, attr=0xf, range=[0x6bb22000-0x7ca9c000) (271MB)
>
> EFI_BOOT_SERVICES_CODE
>
> > efi: mem133: type=5, attr=0x800f, range=[0x7daff000-0x7dbff000) (1MB)
>
> EFI_RUNTIME_SERVICES_CODE
>
> > EFI Variables Facility v0.08 2004-May-17
> > BUG: unable to handle kernel paging request at 7ca95b10
> > IP: [88007dbf2140] 0x88007dbf213f
>
> This...
>
> > Call Trace:
> >  [81139a34] ? __alloc_pages_nodemask+0x154/0x2f0
> >  [81174f7d] ? alloc_page_interleave+0x9d/0xa0
> >  [812fe192] ? put_dec+0x72/0x90
> >  [812f6d53] ? ida_get_new_above+0xb3/0x220
> >  [812f6174] ? sub_alloc+0x74/0x1d0
> >  [812f6174] ? sub_alloc+0x74/0x1d0
> >  [812f6d53] ? ida_get_new_above+0xb3/0x220
> >  [814c8cc0] ? create_efivars_bin_attributes+0x150/0x150
>
> is junk on the stack.
>
> >  [810499b3] ? efi_call3+0x43/0x80
> >  [810492a7] ? virt_efi_get_next_variable+0x47/0x1c0
> >  [814c8cc0] ? create_efivars_bin_attributes+0x150/0x150
> >  [814c7b55] ? efivar_init+0xd5/0x390
> >  [814c8ae0] ? efivar_update_sysfs_entries+0x90/0x90
> >  [812f906b] ? kobject_uevent+0xb/0x10
> >  [812f812b] ? kset_register+0x5b/0x70
> >  [814c8cc0] ? create_efivars_bin_attributes+0x150/0x150
> >  [814c8d47] ? efivars_sysfs_init+0x87/0xf0
> >  [8100032a] ? do_one_initcall+0x15a/0x1b0
> >  [81a17831] ? do_basic_setup+0xad/0xce
> >  [81a17ae3] ? kernel_init_freeable+0x291/0x291
> >  [81a3708a] ? sched_init_smp+0x15b/0x162
> >  [81a17a5f] ? kernel_init_freeable+0x20d/0x291
> >  [81601eb0] ? rest_init+0x80/0x80
> >  [81601ebe] ? kernel_init+0xe/0x180
> >  [8162179c] ? ret_from_fork+0x7c/0xb0
> >  [81601eb0] ? rest_init+0x80/0x80
>
> Here's the real call stack leading up to the crash.
>
> What appears to be happening is that your EFI runtime services code
> is calling into the EFI boot services code, which is definitely a bug in
> your firmware because we're at runtime, but we've seen other machines
> that do similar things so we usually handle it just fine. However, what
> makes your case different, and the reason you see the above splat, is
> that it's using the physical address of the EFI boot services region,
> not the virtual one we set up with SetVirtualAddressMap(). Which is a
> second firmware bug. Again, we have seen other machines that access
> physical addresses after SetVirtualAddressMap(), but until now we
> haven't had any non-optional code that triggered them.
>
> The only reason I can see that the offending commit would introduce this
> problem is because it calls QueryVariableInfo() at boot time. I notice
> that your machine is an SGI UV one, is there any chance you could get a
> firmware fix for this? If possible, it would be also good to confirm
> that it's this chunk of code in setup_efi_vars(),
>
> 	status = efi_call_phys4(sys_table->runtime->query_variable_info,
> 				EFI_VARIABLE_NON_VOLATILE |
> 				EFI_VARIABLE_BOOTSERVICE_ACCESS |
> 				EFI_VARIABLE_RUNTIME_ACCESS, &store_size,
> 				&remaining_size, &var_size);
>
> that later makes GetNextVariable() jump to the physical address of the
> EFI Boot Services region. Because if not, we need to do some more
> digging.
>
> Borislav, how are your 1:1 mapping patches coming along? In theory, once
> those are merged we can gracefully workaround these kinds of issues.
>
> --
> Matt Fleming, Intel Open Source Technology Center
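Matt's reading of the oops can be checked against the two memory map entries quoted in the report: the faulting address 7ca95b10 lies inside the EFI_BOOT_SERVICES_CODE range taken as a physical address. The sketch below does that range check in userspace. The struct and the function region_for() are hypothetical names for this illustration; the ranges are copied from the quoted log, with the end address treated as exclusive because the log prints them as "[start-end)".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one line of the "efi: memNNN" boot log. */
struct efi_region {
	const char *name;
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive, matching the "[start-end)" log format */
};

/* The two memory map entries quoted in the report. */
static const struct efi_region regions[] = {
	{ "EFI_BOOT_SERVICES_CODE",    0x6bb22000ULL, 0x7ca9c000ULL },
	{ "EFI_RUNTIME_SERVICES_CODE", 0x7daff000ULL, 0x7dbff000ULL },
};

/* Return the region containing the physical address, or NULL if none does. */
static const struct efi_region *region_for(uint64_t phys)
{
	for (size_t i = 0; i < sizeof(regions) / sizeof(regions[0]); i++) {
		assert(regions[i].start < regions[i].end);
		if (phys >= regions[i].start && phys < regions[i].end)
			return &regions[i];
	}
	return NULL;
}
```

Looking up 0x7ca95b10 returns the EFI_BOOT_SERVICES_CODE entry, which is consistent with firmware code dereferencing a physical boot-services address at runtime. As a further, unconfirmed observation: if the low 32 bits of the reported IP (0x7dbf213f) are taken as a physical address, they fall in the EFI_RUNTIME_SERVICES_CODE range, which would match the call coming from runtime services code.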
[PATCH -v11 resend 02/11] Migrate shutdown/reboot to boot cpu.
We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus. The slowdown was tracked to commit
f96972f.

The current implementation does all the work of hot removing the cpus
before halting the system. We are switching to just migrating to the
boot cpu and then continuing with shutdown/reboot.

This also has the effect of not breaking x86's command line parameter for
specifying the reboot cpu. Note, this code was shamelessly copied from
arch/x86/kernel/reboot.c with bits removed pertaining to the reboot_cpu
command line parameter.

Signed-off-by: Robin Holt
Tested-by: Shawn Guo
To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Cc:

---

Changes since -v8
- Change stack parameter to make future patches cleaner.

Changes since -v6:
- Add #define for PF_THREAD_BOUND as compatibility to make stable easier.
- Fixup s/reboot_cpu_id/reboot_cpu/

---
 kernel/sys.c | 29 ++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index b95d3c7..2bbd9a7 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -362,6 +362,29 @@ int unregister_reboot_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_reboot_notifier);
 
+/* Add backwards compatibility for stable trees. */
+#ifndef PF_NO_SETAFFINITY
+#define PF_NO_SETAFFINITY		PF_THREAD_BOUND
+#endif
+
+static void migrate_to_reboot_cpu(void)
+{
+	/* The boot cpu is always logical cpu 0 */
+	int cpu = 0;
+
+	cpu_hotplug_disable();
+
+	/* Make certain the cpu I'm about to reboot on is online */
+	if (!cpu_online(cpu))
+		cpu = cpumask_first(cpu_online_mask);
+
+	/* Prevent races with other tasks migrating this task */
+	current->flags |= PF_NO_SETAFFINITY;
+
+	/* Make certain I only run on the appropriate processor */
+	set_cpus_allowed_ptr(current, cpumask_of(cpu));
+}
+
 /**
  *	kernel_restart - reboot the system
  *	@cmd: pointer to buffer containing command to execute for restart
@@ -373,7 +396,7 @@ EXPORT_SYMBOL(unregister_reboot_notifier);
 void kernel_restart(char *cmd)
 {
 	kernel_restart_prepare(cmd);
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	if (!cmd)
 		printk(KERN_EMERG "Restarting system.\n");
@@ -400,7 +423,7 @@ static void kernel_shutdown_prepare(enum system_states state)
 void kernel_halt(void)
 {
 	kernel_shutdown_prepare(SYSTEM_HALT);
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	printk(KERN_EMERG "System halted.\n");
 	kmsg_dump(KMSG_DUMP_HALT);
@@ -419,7 +442,7 @@ void kernel_power_off(void)
 	kernel_shutdown_prepare(SYSTEM_POWER_OFF);
 	if (pm_power_off_prepare)
 		pm_power_off_prepare();
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	printk(KERN_EMERG "Power down.\n");
 	kmsg_dump(KMSG_DUMP_POWEROFF);
-- 
1.8.2.1
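The cpu selection in migrate_to_reboot_cpu() can be modelled outside the
kernel. The sketch below is only an illustration: pick_reboot_cpu() and its
64-bit online mask are hypothetical stand-ins for cpu_online() and
cpumask_first(cpu_online_mask), and it does not pin any task. It shows the
fallback rule from the patch: use logical cpu 0 if it is online, otherwise
the lowest-numbered online cpu.

```c
#include <stdint.h>

/*
 * Hypothetical user-space model of the cpu choice made by
 * migrate_to_reboot_cpu(): prefer logical cpu 0, and fall back to the
 * lowest-numbered online cpu when cpu 0 is offline.
 *
 * online_mask: bit n set means cpu n is online (at most 64 cpus here).
 * Returns the chosen cpu, or -1 if the mask is empty.
 */
static int pick_reboot_cpu(uint64_t online_mask)
{
	int cpu = 0;	/* the boot cpu is always logical cpu 0 */

	if (online_mask & (UINT64_C(1) << cpu))
		return cpu;

	/* stand-in for cpumask_first(cpu_online_mask) */
	for (cpu = 0; cpu < 64; cpu++)
		if (online_mask & (UINT64_C(1) << cpu))
			return cpu;

	return -1;
}
```

With cpus 1 and 2 online and cpu 0 offline (mask 0x6), the function picks
cpu 1. The kernel never sees an empty online mask on this path, so the -1
case exists only to keep the model well defined.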
[PATCH -v11 resend 07/11] unicore32, prepare reboot_mode for moving to generic kernel code.
This patch prepares for moving the parsing of reboot= to the generic
kernel code by making reboot_mode into a more generic form.

Signed-off-by: Robin Holt
To: Andrew Morton
Cc: Guan Xuetao
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: H. Peter Anvin
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Acked-by: Guan Xuetao

---

Changes since -v8
- Switched from using REBOOT_WARM/COLD to HARD/SOFT.

---
 arch/unicore32/kernel/process.c | 10 +-
 arch/unicore32/kernel/setup.h   |  2 +-
 arch/unicore32/mm/mmu.c         |  2 +-
 include/linux/reboot.h          |  2 ++
 4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c
index c944769..93dd035 100644
--- a/arch/unicore32/kernel/process.c
+++ b/arch/unicore32/kernel/process.c
@@ -51,14 +51,14 @@ void arch_cpu_idle(void)
 	local_irq_enable();
 }
 
-static char reboot_mode = 'h';
+static enum reboot_mode reboot_mode = REBOOT_HARD;
 
 int __init reboot_setup(char *str)
 {
-	reboot_mode = str[0];
+	if ('s' == str[0])
+		reboot_mode = REBOOT_SOFT;
 	return 1;
 }
-
 __setup("reboot=", reboot_setup);
 
 void machine_halt(void)
@@ -88,7 +88,7 @@ void machine_restart(char *cmd)
 	 * we may need it to insert some 1:1 mappings so that
 	 * soft boot works.
 	 */
-	setup_mm_for_reboot(reboot_mode);
+	setup_mm_for_reboot();
 
 	/* Clean and invalidate caches */
 	flush_cache_all();
@@ -102,7 +102,7 @@ void machine_restart(char *cmd)
 	/*
 	 * Now handle reboot code.
 	 */
-	if (reboot_mode == 's') {
+	if (reboot_mode == REBOOT_SOFT) {
 		/* Jump into ROM at address 0x */
 		cpu_reset(VECTORS_BASE);
 	} else {
diff --git a/arch/unicore32/kernel/setup.h b/arch/unicore32/kernel/setup.h
index 30f749d..f5c51b8 100644
--- a/arch/unicore32/kernel/setup.h
+++ b/arch/unicore32/kernel/setup.h
@@ -22,7 +22,7 @@ extern void puv3_ps2_init(void);
 extern void pci_puv3_preinit(void);
 extern void __init puv3_init_gpio(void);
 
-extern void setup_mm_for_reboot(char mode);
+extern void setup_mm_for_reboot(void);
 
 extern char __stubs_start[], __stubs_end[];
 extern char __vectors_start[], __vectors_end[];
diff --git a/arch/unicore32/mm/mmu.c b/arch/unicore32/mm/mmu.c
index 43c20b4..4f5a532 100644
--- a/arch/unicore32/mm/mmu.c
+++ b/arch/unicore32/mm/mmu.c
@@ -445,7 +445,7 @@ void __init paging_init(void)
  * the user-mode pages. This will then ensure that we have predictable
  * results when turning the mmu off
  */
-void setup_mm_for_reboot(char mode)
+void setup_mm_for_reboot(void)
 {
 	unsigned long base_pmdval;
 	pgd_t *pgd;
diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index 37d56c3..ca29a6f 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -13,6 +13,8 @@
 enum reboot_mode {
 	REBOOT_COLD = 0,
 	REBOOT_WARM,
+	REBOOT_HARD,
+	REBOOT_SOFT,
 };
 
 extern int register_reboot_notifier(struct notifier_block *);
-- 
1.8.2.1
[PATCH -v11 resend 03/11] Remove -stable friendly PF_THREAD_BOUND define
Remove the prior patch's #define for easier backporting to the stable
releases.

Signed-off-by: Robin Holt
To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
---
 kernel/sys.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index 2bbd9a7..17bb8d3 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -362,11 +362,6 @@ int unregister_reboot_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_reboot_notifier);
 
-/* Add backwards compatibility for stable trees. */
-#ifndef PF_NO_SETAFFINITY
-#define PF_NO_SETAFFINITY		PF_THREAD_BOUND
-#endif
-
 static void migrate_to_reboot_cpu(void)
 {
 	/* The boot cpu is always logical cpu 0 */
-- 
1.8.2.1
[PATCH -v11 resend 05/11] checkpatch.pl the new kernel/reboot.c file.
Get the new file to pass scripts/checkpatch.pl Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List --- Changes since v6: - Removed last remaining line length warning. --- include/linux/reboot.h | 2 +- kernel/reboot.c| 28 +--- 2 files changed, 14 insertions(+), 16 deletions(-) diff --git a/include/linux/reboot.h b/include/linux/reboot.h index 23b3630..c6eba21 100644 --- a/include/linux/reboot.h +++ b/include/linux/reboot.h @@ -26,7 +26,7 @@ extern void machine_shutdown(void); struct pt_regs; extern void machine_crash_shutdown(struct pt_regs *); -/* +/* * Architecture independent implemenations of sys_reboot commands. */ diff --git a/kernel/reboot.c b/kernel/reboot.c index 0616483..abb6a04 100644 --- a/kernel/reboot.c +++ b/kernel/reboot.c @@ -4,6 +4,8 @@ * Copyright (C) 2013 Linus Torvalds */ +#define pr_fmt(fmt)"reboot: " fmt + #include #include #include @@ -114,9 +116,9 @@ void kernel_restart(char *cmd) migrate_to_reboot_cpu(); syscore_shutdown(); if (!cmd) - printk(KERN_EMERG "Restarting system.\n"); + pr_emerg("Restarting system\n"); else - printk(KERN_EMERG "Restarting system with command '%s'.\n", cmd); + pr_emerg("Restarting system with command '%s'\n", cmd); kmsg_dump(KMSG_DUMP_RESTART); machine_restart(cmd); } @@ -125,7 +127,7 @@ EXPORT_SYMBOL_GPL(kernel_restart); static void kernel_shutdown_prepare(enum system_states state) { blocking_notifier_call_chain(_notifier_list, - (state == SYSTEM_HALT)?SYS_HALT:SYS_POWER_OFF, NULL); + (state == SYSTEM_HALT) ? 
SYS_HALT : SYS_POWER_OFF, NULL); system_state = state; usermodehelper_disable(); device_shutdown(); @@ -140,11 +142,10 @@ void kernel_halt(void) kernel_shutdown_prepare(SYSTEM_HALT); migrate_to_reboot_cpu(); syscore_shutdown(); - printk(KERN_EMERG "System halted.\n"); + pr_emerg("System halted\n"); kmsg_dump(KMSG_DUMP_HALT); machine_halt(); } - EXPORT_SYMBOL_GPL(kernel_halt); /** @@ -159,7 +160,7 @@ void kernel_power_off(void) pm_power_off_prepare(); migrate_to_reboot_cpu(); syscore_shutdown(); - printk(KERN_EMERG "Power down.\n"); + pr_emerg("Power down\n"); kmsg_dump(KMSG_DUMP_POWEROFF); machine_power_off(); } @@ -188,10 +189,10 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, /* For safety, we require "magic" arguments. */ if (magic1 != LINUX_REBOOT_MAGIC1 || - (magic2 != LINUX_REBOOT_MAGIC2 && - magic2 != LINUX_REBOOT_MAGIC2A && + (magic2 != LINUX_REBOOT_MAGIC2 && + magic2 != LINUX_REBOOT_MAGIC2A && magic2 != LINUX_REBOOT_MAGIC2B && - magic2 != LINUX_REBOOT_MAGIC2C)) + magic2 != LINUX_REBOOT_MAGIC2C)) return -EINVAL; /* @@ -234,7 +235,8 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, break; case LINUX_REBOOT_CMD_RESTART2: - if (strncpy_from_user([0], arg, sizeof(buffer) - 1) < 0) { + ret = strncpy_from_user([0], arg, sizeof(buffer) - 1); + if (ret < 0) { ret = -EFAULT; break; } @@ -282,7 +284,6 @@ void ctrl_alt_del(void) else kill_cad_pid(SIGINT, 1); } - char poweroff_cmd[POWEROFF_CMD_PATH_LEN] = "/sbin/poweroff"; @@ -301,14 +302,11 @@ static int __orderly_poweroff(bool force) ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC); argv_free(argv); } else { - printk(KERN_WARNING "%s failed to allocate memory for \"%s\"\n", -__func__, poweroff_cmd); ret = -ENOMEM; } if (ret && force) { - printk(KERN_WARNING "Failed to start orderly shutdown: " - "forcing the issue\n"); + pr_warn("Failed to start orderly shutdown: forcing the issue\n"); /* * I guess this should try to kick off some daemon to sync and * 
poweroff asap. Or not even bother syncing if we're doing an -- 1.8.2.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
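The pr_fmt() definition added by this patch is what lets the pr_emerg()
and pr_warn() calls drop their own prefixes. A rough user-space sketch of
the mechanism follows. demo_pr() and demo_restart_msg() are hypothetical
names that format into a buffer rather than the kernel log, and the
`##__VA_ARGS__` form assumes a GNU-compatible compiler (gcc or clang), as
the kernel itself does.

```c
#include <stdio.h>
#include <string.h>

/* Same prefix definition the patch adds at the top of kernel/reboot.c. */
#define pr_fmt(fmt) "reboot: " fmt

/*
 * Hypothetical stand-in for pr_emerg(): formats into a caller-supplied
 * buffer instead of the kernel log. pr_fmt() is applied to the literal
 * format string, so the prefix and the message are joined by ordinary
 * string-literal concatenation at compile time.
 */
#define demo_pr(buf, size, fmt, ...) \
	snprintf(buf, size, pr_fmt(fmt), ##__VA_ARGS__)

/* Build the restart message the way kernel_restart() chooses it. */
static const char *demo_restart_msg(const char *cmd)
{
	static char buf[128];

	if (!cmd)
		demo_pr(buf, sizeof(buf), "Restarting system\n");
	else
		demo_pr(buf, sizeof(buf),
			"Restarting system with command '%s'\n", cmd);
	return buf;
}
```

Because the prefix is pasted in by the preprocessor, pr_fmt() has to be
defined before the header that declares the pr_*() helpers is included,
which is why the patch places it above the #include lines.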
[PATCH -v11 resend 04/11] Move shutdown/reboot related functions to kernel/reboot.c
This patch is preparatory. It moves reboot related syscall, etc functions from kernel/sys.c to kernel/reboot.c. Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List --- Changes since -v6: - Add include of linux/uaccess.h to allow building on arm. --- kernel/Makefile | 2 +- kernel/reboot.c | 347 kernel/sys.c| 331 - 3 files changed, 348 insertions(+), 332 deletions(-) create mode 100644 kernel/reboot.c diff --git a/kernel/Makefile b/kernel/Makefile index 271fd31..470839d 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -9,7 +9,7 @@ obj-y = fork.o exec_domain.o panic.o printk.o \ rcupdate.o extable.o params.o posix-timers.o \ kthread.o wait.o sys_ni.o posix-cpu-timers.o mutex.o \ hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \ - notifier.o ksysfs.o cred.o \ + notifier.o ksysfs.o cred.o reboot.o \ async.o range.o groups.o lglock.o smpboot.o ifdef CONFIG_FUNCTION_TRACER diff --git a/kernel/reboot.c b/kernel/reboot.c new file mode 100644 index 000..0616483 --- /dev/null +++ b/kernel/reboot.c @@ -0,0 +1,347 @@ +/* + * linux/kernel/reboot.c + * + * Copyright (C) 2013 Linus Torvalds + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * this indicates whether you can reboot with ctrl-alt-del: the default is yes + */ + +int C_A_D = 1; +struct pid *cad_pid; +EXPORT_SYMBOL(cad_pid); + +/* + * If set, this is used for preparing the system to power off. + */ + +void (*pm_power_off_prepare)(void); + +/** + * emergency_restart - reboot the system + * + * Without shutting down any hardware or taking any locks + * reboot the system. This is called when we know we are in + * trouble so this is our best effort to reboot. This is + * safe to call in interrupt context. 
+ */ +void emergency_restart(void) +{ + kmsg_dump(KMSG_DUMP_EMERG); + machine_emergency_restart(); +} +EXPORT_SYMBOL_GPL(emergency_restart); + +void kernel_restart_prepare(char *cmd) +{ + blocking_notifier_call_chain(_notifier_list, SYS_RESTART, cmd); + system_state = SYSTEM_RESTART; + usermodehelper_disable(); + device_shutdown(); +} + +/** + * register_reboot_notifier - Register function to be called at reboot time + * @nb: Info about notifier function to be called + * + * Registers a function with the list of functions + * to be called at reboot time. + * + * Currently always returns zero, as blocking_notifier_chain_register() + * always returns zero. + */ +int register_reboot_notifier(struct notifier_block *nb) +{ + return blocking_notifier_chain_register(_notifier_list, nb); +} +EXPORT_SYMBOL(register_reboot_notifier); + +/** + * unregister_reboot_notifier - Unregister previously registered reboot notifier + * @nb: Hook to be unregistered + * + * Unregisters a previously registered reboot + * notifier function. + * + * Returns zero on success, or %-ENOENT on failure. + */ +int unregister_reboot_notifier(struct notifier_block *nb) +{ + return blocking_notifier_chain_unregister(_notifier_list, nb); +} +EXPORT_SYMBOL(unregister_reboot_notifier); + +static void migrate_to_reboot_cpu(void) +{ + /* The boot cpu is always logical cpu 0 */ + int cpu = 0; + + cpu_hotplug_disable(); + + /* Make certain the cpu I'm about to reboot on is online */ + if (!cpu_online(cpu)) + cpu = cpumask_first(cpu_online_mask); + + /* Prevent races with other tasks migrating this task */ + current->flags |= PF_NO_SETAFFINITY; + + /* Make certain I only run on the appropriate processor */ + set_cpus_allowed_ptr(current, cpumask_of(cpu)); +} + +/** + * kernel_restart - reboot the system + * @cmd: pointer to buffer containing command to execute for restart + * or %NULL + * + * Shutdown everything and perform a clean reboot. + * This is not safe to call in interrupt context. 
+ */ +void kernel_restart(char *cmd) +{ + kernel_restart_prepare(cmd); + migrate_to_reboot_cpu(); + syscore_shutdown(); + if (!cmd) + printk(KERN_EMERG "Restarting system.\n"); + else + printk(KERN_EMERG "Restarting system with command '%s'.\n", cmd); + kmsg_dump(KMSG_DUMP_RESTART); + machine_restart(cmd); +} +EXPORT_SYMBOL_GPL(kernel_restart); + +static void kernel_shutdown_prepare(enum system_states state) +{ + blocking_notifier_call_chain(_notifier_list, + (state == SYSTEM_HALT)?SYS_HALT:SYS_POWER_OFF, NULL); + system_state = state; +
[PATCH -v11 resend 01/11] CPU hotplug: Provide a generic helper to disable/enable CPU hotplug
From: "Srivatsa S. Bhat"

There are instances in the kernel where we would like to disable CPU
hotplug (from sysfs) during some important operation. Today the freezer
code depends on this and the code to do it was kinda tailor-made for
that.

Restructure the code and make it generic enough to be useful for other
usecases too.

Signed-off-by: Srivatsa S. Bhat
Signed-off-by: Robin Holt
To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Cc:
---
 include/linux/cpu.h |  4
 kernel/cpu.c        | 55 ++---
 2 files changed, 27 insertions(+), 32 deletions(-)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index c6f6e08..9f3c7e8 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -175,6 +175,8 @@ extern struct bus_type cpu_subsys;
 
 extern void get_online_cpus(void);
 extern void put_online_cpus(void);
+extern void cpu_hotplug_disable(void);
+extern void cpu_hotplug_enable(void);
 #define hotcpu_notifier(fn, pri)	cpu_notifier(fn, pri)
 #define register_hotcpu_notifier(nb)	register_cpu_notifier(nb)
 #define unregister_hotcpu_notifier(nb)	unregister_cpu_notifier(nb)
@@ -198,6 +200,8 @@ static inline void cpu_hotplug_driver_unlock(void)
 
 #define get_online_cpus()	do { } while (0)
 #define put_online_cpus()	do { } while (0)
+#define cpu_hotplug_disable()	do { } while (0)
+#define cpu_hotplug_enable()	do { } while (0)
 #define hotcpu_notifier(fn, pri)	do { (void)(fn); } while (0)
 /* These aren't inline functions due to a GCC bug. */
 #define register_hotcpu_notifier(nb)	({ (void)(nb); 0; })
diff --git a/kernel/cpu.c b/kernel/cpu.c
index b5e4ab2..198a388 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -133,6 +133,27 @@ static void cpu_hotplug_done(void)
 	mutex_unlock(&cpu_hotplug.lock);
 }
 
+/*
+ * Wait for currently running CPU hotplug operations to complete (if any) and
+ * disable future CPU hotplug (from sysfs). The 'cpu_add_remove_lock' protects
+ * the 'cpu_hotplug_disabled' flag. The same lock is also acquired by the
+ * hotplug path before performing hotplug operations. So acquiring that lock
+ * guarantees mutual exclusion from any currently running hotplug operations.
+ */
+void cpu_hotplug_disable(void)
+{
+	cpu_maps_update_begin();
+	cpu_hotplug_disabled = 1;
+	cpu_maps_update_done();
+}
+
+void cpu_hotplug_enable(void)
+{
+	cpu_maps_update_begin();
+	cpu_hotplug_disabled = 0;
+	cpu_maps_update_done();
+}
+
 #else /* #if CONFIG_HOTPLUG_CPU */
 static void cpu_hotplug_begin(void) {}
 static void cpu_hotplug_done(void) {}
@@ -541,36 +562,6 @@ static int __init alloc_frozen_cpus(void)
 core_initcall(alloc_frozen_cpus);
 
 /*
- * Prevent regular CPU hotplug from racing with the freezer, by disabling CPU
- * hotplug when tasks are about to be frozen. Also, don't allow the freezer
- * to continue until any currently running CPU hotplug operation gets
- * completed.
- * To modify the 'cpu_hotplug_disabled' flag, we need to acquire the
- * 'cpu_add_remove_lock'. And this same lock is also taken by the regular
- * CPU hotplug path and released only after it is complete. Thus, we
- * (and hence the freezer) will block here until any currently running CPU
- * hotplug operation gets completed.
- */
-void cpu_hotplug_disable_before_freeze(void)
-{
-	cpu_maps_update_begin();
-	cpu_hotplug_disabled = 1;
-	cpu_maps_update_done();
-}
-
-
-/*
- * When tasks have been thawed, re-enable regular CPU hotplug (which had been
- * disabled while beginning to freeze tasks).
- */
-void cpu_hotplug_enable_after_thaw(void)
-{
-	cpu_maps_update_begin();
-	cpu_hotplug_disabled = 0;
-	cpu_maps_update_done();
-}
-
-/*
  * When callbacks for CPU hotplug notifications are being executed, we must
  * ensure that the state of the system with respect to the tasks being frozen
  * or not, as reported by the notification, remains unchanged *throughout the
@@ -589,12 +580,12 @@ cpu_hotplug_pm_callback(struct notifier_block *nb,
 
 	case PM_SUSPEND_PREPARE:
 	case PM_HIBERNATION_PREPARE:
-		cpu_hotplug_disable_before_freeze();
+		cpu_hotplug_disable();
 		break;
 
 	case PM_POST_SUSPEND:
 	case PM_POST_HIBERNATION:
-		cpu_hotplug_enable_after_thaw();
+		cpu_hotplug_enable();
 		break;
 
 	default:
-- 
1.8.2.1
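The locking argument in the new comment (the flag is only changed under the
same lock the hotplug path takes) can be tried out in user space. This is a
minimal sketch, not kernel code: all demo_* names are hypothetical, a
pthread mutex stands in for cpu_add_remove_lock, and returning -EBUSY from
the request path is an assumption about how the hotplug path reacts to the
flag.

```c
#include <errno.h>
#include <pthread.h>

/* Stand-ins for cpu_add_remove_lock and cpu_hotplug_disabled. */
static pthread_mutex_t demo_add_remove_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_hotplug_disabled;

/* Mirrors cpu_hotplug_disable(): set the flag under the lock. */
static void demo_hotplug_disable(void)
{
	pthread_mutex_lock(&demo_add_remove_lock);
	demo_hotplug_disabled = 1;
	pthread_mutex_unlock(&demo_add_remove_lock);
}

/* Mirrors cpu_hotplug_enable(): clear the flag under the lock. */
static void demo_hotplug_enable(void)
{
	pthread_mutex_lock(&demo_add_remove_lock);
	demo_hotplug_disabled = 0;
	pthread_mutex_unlock(&demo_add_remove_lock);
}

/*
 * Hypothetical hotplug request. It takes the same lock, so it can never
 * overlap a disable/enable call, and it is refused while the flag is set.
 */
static int demo_hotplug_request(void)
{
	int err = 0;

	pthread_mutex_lock(&demo_add_remove_lock);
	if (demo_hotplug_disabled)
		err = -EBUSY;
	/* a real hotplug path would bring the cpu up or down here */
	pthread_mutex_unlock(&demo_add_remove_lock);
	return err;
}
```

Because demo_hotplug_disable() has to acquire the lock, it also waits for
any request that is already in progress, which is the "wait for currently
running CPU hotplug operations" half of the comment.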
[PATCH -v11 resend 08/11] arm, Remove unused restart_mode fields from some arm subarchs
These restart_mode fields are not used at all. Remove them to make moving the reboot= cmdline options to the general kernel easier. Signed-off-by: Robin Holt To: Andrew Morton Cc: Russell King Cc: Russ Anderson Cc: Robin Holt Cc: H. Peter Anvin Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Acked-by: Russell King --- arch/arm/mach-ebsa110/core.c | 1 - arch/arm/mach-pxa/mioa701.c | 1 - arch/arm/mach-pxa/spitz.c| 3 --- arch/arm/mach-pxa/tosa.c | 1 - 4 files changed, 6 deletions(-) diff --git a/arch/arm/mach-ebsa110/core.c b/arch/arm/mach-ebsa110/core.c index b13cc74..69a9d5d 100644 --- a/arch/arm/mach-ebsa110/core.c +++ b/arch/arm/mach-ebsa110/core.c @@ -321,7 +321,6 @@ MACHINE_START(EBSA110, "EBSA110") .atag_offset= 0x400, .reserve_lp0= 1, .reserve_lp2= 1, - .restart_mode = 's', .map_io = ebsa110_map_io, .init_early = ebsa110_init_early, .init_irq = ebsa110_init_irq, diff --git a/arch/arm/mach-pxa/mioa701.c b/arch/arm/mach-pxa/mioa701.c index f8979b9..dbea67a 100644 --- a/arch/arm/mach-pxa/mioa701.c +++ b/arch/arm/mach-pxa/mioa701.c @@ -756,7 +756,6 @@ static void mioa701_machine_exit(void) MACHINE_START(MIOA701, "MIO A701") .atag_offset= 0x100, - .restart_mode = 's', .map_io = _map_io, .nr_irqs= PXA_NR_IRQS, .init_irq = _init_irq, diff --git a/arch/arm/mach-pxa/spitz.c b/arch/arm/mach-pxa/spitz.c index 362726c..c3c0042 100644 --- a/arch/arm/mach-pxa/spitz.c +++ b/arch/arm/mach-pxa/spitz.c @@ -979,7 +979,6 @@ static void __init spitz_fixup(struct tag *tags, char **cmdline, #ifdef CONFIG_MACH_SPITZ MACHINE_START(SPITZ, "SHARP Spitz") - .restart_mode = 'g', .fixup = spitz_fixup, .map_io = pxa27x_map_io, .nr_irqs= PXA_NR_IRQS, @@ -993,7 +992,6 @@ MACHINE_END #ifdef CONFIG_MACH_BORZOI MACHINE_START(BORZOI, "SHARP Borzoi") - .restart_mode = 'g', .fixup = spitz_fixup, .map_io = pxa27x_map_io, .nr_irqs= PXA_NR_IRQS, @@ -1007,7 +1005,6 @@ MACHINE_END #ifdef CONFIG_MACH_AKITA MACHINE_START(AKITA, "SHARP Akita") - .restart_mode 
= 'g', .fixup = spitz_fixup, .map_io = pxa27x_map_io, .nr_irqs= PXA_NR_IRQS, diff --git a/arch/arm/mach-pxa/tosa.c b/arch/arm/mach-pxa/tosa.c index 3d91d2e..a41992f 100644 --- a/arch/arm/mach-pxa/tosa.c +++ b/arch/arm/mach-pxa/tosa.c @@ -969,7 +969,6 @@ static void __init fixup_tosa(struct tag *tags, char **cmdline, } MACHINE_START(TOSA, "SHARP Tosa") - .restart_mode = 'g', .fixup = fixup_tosa, .map_io = pxa25x_map_io, .nr_irqs= TOSA_NR_IRQS, -- 1.8.2.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -v11 resend 06/11] x86, prepare reboot_mode for moving to generic kernel code.
This patch prepares for moving the parsing of reboot= to the generic
kernel code by making reboot_mode into a more generic form.

Signed-off-by: Robin Holt
To: Andrew Morton
Cc: H. Peter Anvin
Cc: Miguel Boton
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Acked-by: Ingo Molnar
---
 arch/x86/kernel/reboot.c | 12 +++-
 include/linux/reboot.h   |  5 +
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 76fa1e9..f770340 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -36,7 +36,7 @@ void (*pm_power_off)(void);
 EXPORT_SYMBOL(pm_power_off);
 
 static const struct desc_ptr no_idt = {};
-static int reboot_mode;
+static enum reboot_mode reboot_mode;
 enum reboot_type reboot_type = BOOT_ACPI;
 int reboot_force;
 
@@ -88,11 +88,11 @@ static int __init reboot_setup(char *str)
 
 		switch (*str) {
 		case 'w':
-			reboot_mode = 0x1234;
+			reboot_mode = REBOOT_WARM;
 			break;
 
 		case 'c':
-			reboot_mode = 0;
+			reboot_mode = REBOOT_COLD;
 			break;
 
 #ifdef CONFIG_SMP
@@ -536,6 +536,7 @@ static void native_machine_emergency_restart(void)
 	int i;
 	int attempt = 0;
 	int orig_reboot_type = reboot_type;
+	unsigned short mode;
 
 	if (reboot_emergency)
 		emergency_vmx_disable_all();
@@ -543,7 +544,8 @@ static void native_machine_emergency_restart(void)
 	tboot_shutdown(TB_SHUTDOWN_REBOOT);
 
 	/* Tell the BIOS if we want cold or warm reboot */
-	*((unsigned short *)__va(0x472)) = reboot_mode;
+	mode = reboot_mode == REBOOT_WARM ? 0x1234 : 0;
+	*((unsigned short *)__va(0x472)) = mode;
 
 	for (;;) {
 		/* Could also try the reset bit in the Hammer NB */
@@ -585,7 +587,7 @@ static void native_machine_emergency_restart(void)
 
 		case BOOT_EFI:
 			if (efi_enabled(EFI_RUNTIME_SERVICES))
-				efi.reset_system(reboot_mode ?
+				efi.reset_system(reboot_mode == REBOOT_WARM ?
 						 EFI_RESET_WARM :
 						 EFI_RESET_COLD,
 						 EFI_SUCCESS, 0, NULL);
diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index c6eba21..37d56c3 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -10,6 +10,11 @@
 #define SYS_HALT	0x0002	/* Notify of system halt */
 #define SYS_POWER_OFF	0x0003	/* Notify of system power off */
 
+enum reboot_mode {
+	REBOOT_COLD = 0,
+	REBOOT_WARM,
+};
+
 extern int register_reboot_notifier(struct notifier_block *);
 extern int unregister_reboot_notifier(struct notifier_block *);
 
-- 
1.8.2.1
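One reason the patch spells out `reboot_mode == REBOOT_WARM` instead of
testing `reboot_mode` for truth is that the enum later grows extra values
(the unicore32 patch in this series adds REBOOT_HARD and REBOOT_SOFT). The
sketch below is an illustration only. The demo_* names are hypothetical and
nothing is written to physical address 0x472. It shows the value the patch
hands to the BIOS for each mode.

```c
/* Hypothetical copy of the enum after HARD/SOFT have been added. */
enum demo_reboot_mode {
	DEMO_REBOOT_COLD = 0,
	DEMO_REBOOT_WARM,
	DEMO_REBOOT_HARD,
	DEMO_REBOOT_SOFT,
};

/*
 * Value the x86 code stores in the BIOS reset flag word: 0x1234 asks
 * for a warm reboot, anything else is treated here as cold.
 */
static unsigned short demo_bios_reset_flag(enum demo_reboot_mode mode)
{
	return mode == DEMO_REBOOT_WARM ? 0x1234 : 0;
}
```

A plain truth test on the mode would have treated the non-zero
DEMO_REBOOT_HARD as a warm reboot. The explicit comparison keeps every
non-warm mode on the cold path.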
[PATCH -v11 resend 11/11] Move arch/x86 reboot= handling to generic kernel.
Merge together the unicore32, arm, and x86 reboot= command line parameter handling. Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russell King Cc: Guan Xuetao Cc: Russ Anderson Cc: Robin Holt Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Acked-by: Ingo Molnar Acked-by: Guan Xuetao Acked-by: Russell King --- Changes since -v8 - Add missing break statements. - Change parsing so #ifdef's are no longer needed. - Switch to using simple_strtoul to make parsing cleaner. - Add handling of REBOOT_HARD/SOFT --- Documentation/kernel-parameters.txt | 14 +++- arch/arm/kernel/process.c| 10 --- arch/unicore32/kernel/process.c | 10 --- arch/x86/include/asm/emergency-restart.h | 12 arch/x86/kernel/apic/x2apic_uv_x.c | 2 +- arch/x86/kernel/reboot.c | 111 +-- include/linux/reboot.h | 17 + kernel/reboot.c | 76 - 8 files changed, 107 insertions(+), 145 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index c3bfacb..b2945ce 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2677,9 +2677,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted. Run specified binary instead of /init from the ramdisk, used for early userspace startup. See initrd. - reboot= [BUGS=X86-32,BUGS=ARM,BUGS=IA-64] Rebooting mode - Format: [,[,...]] - See arch/*/kernel/reboot.c or arch/*/kernel/process.c + reboot= [KNL] + Format (x86 or x86_64): + [w[arm] | c[old] | h[ard] | s[oft] | g[pio]] \ + [[,]s[mp] \ + [[,]b[ios] | a[cpi] | k[bd] | t[riple] | e[fi] | p[ci]] \ + [[,]f[orce] + Where reboot_mode is one of warm (soft) or cold (hard) or gpio, + reboot_type is one of bios, acpi, kbd, triple, efi, or pci, + reboot_force is either force or not specified, + reboot_cpu is s[mp] with being the processor + to be used for rebooting. relax_domain_level= [KNL, SMP] Set scheduler's default relax_domain_level. 
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c index 42856fc..304b102 100644 --- a/arch/arm/kernel/process.c +++ b/arch/arm/kernel/process.c @@ -175,16 +175,6 @@ void arch_cpu_idle(void) default_idle(); } -enum reboot_mode reboot_mode = REBOOT_HARD; - -static int __init reboot_setup(char *str) -{ - if ('s' == str[0]) - reboot_mode = REBOOT_SOFT; - return 1; -} -__setup("reboot=", reboot_setup); - void machine_shutdown(void) { #ifdef CONFIG_SMP diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c index 93dd035..778ebba 100644 --- a/arch/unicore32/kernel/process.c +++ b/arch/unicore32/kernel/process.c @@ -51,16 +51,6 @@ void arch_cpu_idle(void) local_irq_enable(); } -static enum reboot_mode reboot_mode = REBOOT_HARD; - -int __init reboot_setup(char *str) -{ - if ('s' == str[0]) - reboot_mode = REBOOT_SOFT; - return 1; -} -__setup("reboot=", reboot_setup); - void machine_halt(void) { gpio_set_value(GPO_SOFT_OFF, 0); diff --git a/arch/x86/include/asm/emergency-restart.h b/arch/x86/include/asm/emergency-restart.h index 75ce3f4..77a99ac 100644 --- a/arch/x86/include/asm/emergency-restart.h +++ b/arch/x86/include/asm/emergency-restart.h @@ -1,18 +1,6 @@ #ifndef _ASM_X86_EMERGENCY_RESTART_H #define _ASM_X86_EMERGENCY_RESTART_H -enum reboot_type { - BOOT_TRIPLE = 't', - BOOT_KBD = 'k', - BOOT_BIOS = 'b', - BOOT_ACPI = 'a', - BOOT_EFI = 'e', - BOOT_CF9 = 'p', - BOOT_CF9_COND = 'q', -}; - -extern enum reboot_type reboot_type; - extern void machine_emergency_restart(void); #endif /* _ASM_X86_EMERGENCY_RESTART_H */ diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c index 794f6eb..958e3e4 100644 --- a/arch/x86/kernel/apic/x2apic_uv_x.c +++ b/arch/x86/kernel/apic/x2apic_uv_x.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include @@ -36,7 +37,6 @@ #include #include #include -#include #include /* BMC sets a bit this MMR non-zero before sending an NMI */ diff --git 
a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c index f770340..563ed91 100644 --- a/arch/x86/kernel/reboot.c +++ b/arch/x86/kernel/reboot.c @@ -36,22 +36,6 @@ void (*pm_power_off)(void); EXPORT_SYMBOL(pm_power_o
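The reboot= format documented in this patch can be read as a
comma-separated list in which only the first letter of each field matters.
The parser below is a rough user-space sketch under that reading, not the
code from the patch (the kernel/reboot.c hunk is not visible here).
struct demo_reboot_opts and demo_parse_reboot() are hypothetical names,
strtol() replaces the kernel's simple_strtoul(), and the numeric suffix on
smp is assumed from the cover letter's mention of "smp###".

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; the kernel keeps these in separate globals. */
struct demo_reboot_opts {
	char mode;	/* 'w', 'c', 'h', 's' or 'g'; 0 if not given */
	char type;	/* 'b', 'a', 'k', 't', 'e' or 'p'; 0 if not given */
	int force;	/* 1 if f[orce] was given */
	long cpu;	/* cpu number from smp<N>; -1 if not given */
};

static struct demo_reboot_opts demo_parse_reboot(const char *str)
{
	struct demo_reboot_opts o = { 0, 0, 0, -1 };

	while (str && *str) {
		switch (*str) {
		case 'w': case 'c': case 'h': case 'g':
			o.mode = *str;
			break;
		case 's':
			/* "smp<N>" selects a cpu; any other s... means soft */
			if (strncmp(str, "smp", 3) == 0 &&
			    isdigit((unsigned char)str[3]))
				o.cpu = strtol(str + 3, NULL, 10);
			else
				o.mode = 's';
			break;
		case 'b': case 'a': case 'k': case 't': case 'e': case 'p':
			o.type = *str;
			break;
		case 'f':
			o.force = 1;
			break;
		}

		/* only the first letter counts; move on to the next field */
		str = strchr(str, ',');
		if (str)
			str++;
	}
	return o;
}
```

For example, "warm,smp12,efi,force" would yield mode 'w', cpu 12, type 'e'
and force set. Unknown letters are silently ignored in this sketch.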
[PATCH -v11 resend 09/11] arm, prepare reboot_mode for moving to generic kernel code.
This patch prepares for moving the parsing of reboot= to the generic
kernel code by making reboot_mode into a more generic form.

Signed-off-by: Robin Holt
To: Andrew Morton
Cc: Russell King
Cc: Russ Anderson
Cc: Robin Holt
Cc: H. Peter Anvin
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Acked-by: Russell King
---

Changes since -v10
 - Uncommented an accidentally commented out line.

Changes since -v8
 - Switched from using REBOOT_WARM/COLD to HARD/SOFT.

---
 arch/arm/include/asm/mach/arch.h   | 3 ++-
 arch/arm/kernel/process.c          | 8 ++++----
 arch/arm/kernel/setup.c            | 6 +++---
 arch/arm/mach-footbridge/cats-hw.c | 2 +-
 4 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
index 308ad7d..e2b551e 100644
--- a/arch/arm/include/asm/mach/arch.h
+++ b/arch/arm/include/asm/mach/arch.h
@@ -9,6 +9,7 @@
  */

 #ifndef __ASSEMBLY__
+#include <linux/reboot.h>

 struct tag;
 struct meminfo;
@@ -39,7 +40,7 @@ struct machine_desc {
 	unsigned char		reserve_lp0 :1;	/* never has lp0	*/
 	unsigned char		reserve_lp1 :1;	/* never has lp1	*/
 	unsigned char		reserve_lp2 :1;	/* never has lp2	*/
-	char			restart_mode;	/* default restart mode	*/
+	enum reboot_mode	reboot_mode;	/* default restart mode	*/
 	struct smp_operations	*smp;		/* SMP operations	*/
 	void			(*fixup)(struct tag *, char **,
 					 struct meminfo *);
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index f219703..92b47df 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -174,14 +174,14 @@ void arch_cpu_idle(void)
 		default_idle();
 }

-static char reboot_mode = 'h';
+enum reboot_mode reboot_mode = REBOOT_HARD;

-int __init reboot_setup(char *str)
+static int __init reboot_setup(char *str)
 {
-	reboot_mode = str[0];
+	if ('s' == str[0])
+		reboot_mode = REBOOT_SOFT;
 	return 1;
 }
-
 __setup("reboot=", reboot_setup);

 void machine_shutdown(void)
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 1522c7a..e05df42 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -73,7 +73,7 @@ __setup("fpe=", fpe_setup);

 extern void paging_init(struct machine_desc *desc);
 extern void sanity_check_meminfo(void);
-extern void reboot_setup(char *str);
+extern enum reboot_mode reboot_mode;
 extern void setup_dma_zone(struct machine_desc *desc);

 unsigned int processor_id;
@@ -769,8 +769,8 @@ void __init setup_arch(char **cmdline_p)

 	setup_dma_zone(mdesc);

-	if (mdesc->restart_mode)
-		reboot_setup(&mdesc->restart_mode);
+	if (mdesc->reboot_mode != REBOOT_HARD)
+		reboot_mode = mdesc->reboot_mode;

 	init_mm.start_code = (unsigned long) _text;
 	init_mm.end_code   = (unsigned long) _etext;
diff --git a/arch/arm/mach-footbridge/cats-hw.c b/arch/arm/mach-footbridge/cats-hw.c
index 6987a09..9669cc0 100644
--- a/arch/arm/mach-footbridge/cats-hw.c
+++ b/arch/arm/mach-footbridge/cats-hw.c
@@ -86,7 +86,7 @@ fixup_cats(struct tag *tags, char **cmdline, struct meminfo *mi)
 MACHINE_START(CATS, "Chalice-CATS")
 	/* Maintainer: Philip Blundell */
 	.atag_offset	= 0x100,
-	.restart_mode	= 's',
+	.reboot_mode	= REBOOT_SOFT,
 	.fixup		= fixup_cats,
 	.map_io		= footbridge_map_io,
 	.init_irq	= footbridge_init_irq,
-- 
1.8.2.1
[PATCH -v11 resend 00/11] Shutdown from reboot_cpuid without stopping other cpus.
We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus.  The slowdown was tracked to commit
f96972f.

The current implementation does all the work of hot removing the cpus
before halting the system.  We are switching to just migrating to the
reboot_cpu and then continuing with shutdown/reboot.

The patch set is broken into eleven parts.  The first two are planned
for the stable release.  The others move the halt/shutdown/reboot
related functions to their own kernel/reboot.c file and then move the
handling of the reboot= kernel parameter to generic kernel code.

Changes since -v10
 - Added Russell's Acked-by for arm.
 - Fixed an accidentally commented out line in an arm header file.

Changes since -v9
 - Added Ingo's Acked-by for x86.
 - Added Guan's Acked-by for unicore32.
 - Replaced first patch with updated patch from Srivatsa S. Bhat.  This
   compiles for alpha allmodconfig, all arm defconfigs, and a few test
   x86_64 defconfigs.  I have not tried more.

Changes since -v8
 - Changed reboot_cpu on stack to cpu to fix bug noticed by Russell King.
 - Switched unicore32 and arm from using REBOOT_WARM/COLD to HARD/SOFT.
 - Fixed case statement bug.
 - Went to using simple_strtoul for parsing reboot_cpu=smp###.
 - Made parsing of reboot= not use any #ifdef'd code.

Changes since -v7.
 - Fixed authorship for first patch.
 - Rebased to Linus' current tree (51a26ae7a).

Changes since -v6.
 - Cross compiled all arm architectures (using v3.9 kernel.  Fails with
   current).
 - Added a #define for non-hotplug case.
 - Add #define for PF_THREAD_BOUND as compatibility to make stable easier.
 - Fixup s/reboot_cpu_id/reboot_cpu/
 - Add include of linux/uaccess.h to allow building on arm.
 - Removed last remaining checkpatch.pl line length warning on
   kernel/reboot.c.
 - Fixed the duplicate handling of the reboot= kernel parameter.

Changes since -v5.
 - Moved the arch/x86 reboot= up to the generic kernel code.

Changes since -v4.
 - Integrated Srivatsa S. Bhat's change creating the
   cpu_hotplug_disable() function.
 - Integrated comments by Srivatsa S. Bhat.
 - Made one more comment consistent with others in function.

Changes since -v3.
 - Added a tested-by for the original reporter.
 - Fix compile failure found by Joe Perches.
 - Integrated comments by Joe Perches.

To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
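The idea in the cover letter (pick one cpu to finish the reboot on, rather than hot-removing every other cpu first) can be illustrated with a small userspace sketch. Everything here is hypothetical and for illustration only: `sketch_pick_reboot_cpu()` and the 64-bit `online_mask` stand in for the kernel's cpu_online()/cpumask_first() helpers, and the real series also pins the current task to the chosen cpu, which this sketch does not model.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch set): choose the cpu to reboot on.
 * Prefer the requested cpu if it is online; otherwise fall back to the first
 * online cpu.  Returns -1 if the mask is empty.
 */
int sketch_pick_reboot_cpu(uint64_t online_mask, unsigned int requested)
{
	unsigned int cpu;

	if (online_mask == 0)
		return -1;

	/* Use the requested cpu when it is representable and online. */
	if (requested < 64 && (online_mask & (UINT64_C(1) << requested)))
		return (int)requested;

	/* Otherwise take the lowest-numbered online cpu. */
	for (cpu = 0; cpu < 64; cpu++)
		if (online_mask & (UINT64_C(1) << cpu))
			return (int)cpu;

	return -1;
}
```

The point of the design is that selecting a target is constant work regardless of how many cpus are online, whereas tearing each cpu down scales with the cpu count.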
[PATCH -v11 resend 11/11] Move arch/x86 reboot= handling to generic kernel.
Merge together the unicore32, arm, and x86 reboot= command line
parameter handling.

Signed-off-by: Robin Holt h...@sgi.com
To: Andrew Morton a...@linux-foundation.org
Cc: H. Peter Anvin h...@zytor.com
Cc: Russell King rmk+ker...@arm.linux.org.uk
Cc: Guan Xuetao g...@mprc.pku.edu.cn
Cc: Russ Anderson r...@sgi.com
Cc: Robin Holt h...@sgi.com
Cc: Linux Kernel Mailing List linux-kernel@vger.kernel.org
Cc: the arch/x86 maintainers x...@kernel.org
Cc: Arm Mailing List linux-arm-ker...@lists.infradead.org
Acked-by: Ingo Molnar mi...@kernel.org
Acked-by: Guan Xuetao g...@mprc.pku.edu.cn
Acked-by: Russell King rmk+ker...@arm.linux.org.uk
---

Changes since -v8
 - Add missing break statements.
 - Change parsing so #ifdef's are no longer needed.
 - Switch to using simple_strtoul to make parsing cleaner.
 - Add handling of REBOOT_HARD/SOFT

---
 Documentation/kernel-parameters.txt      |  14 +++-
 arch/arm/kernel/process.c                |  10 ---
 arch/unicore32/kernel/process.c          |  10 ---
 arch/x86/include/asm/emergency-restart.h |  12
 arch/x86/kernel/apic/x2apic_uv_x.c       |   2 +-
 arch/x86/kernel/reboot.c                 | 111 +--
 include/linux/reboot.h                   |  17 +
 kernel/reboot.c                          |  76 -
 8 files changed, 107 insertions(+), 145 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index c3bfacb..b2945ce 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2677,9 +2677,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			Run specified binary instead of /init from the ramdisk,
 			used for early userspace startup. See initrd.

-	reboot=		[BUGS=X86-32,BUGS=ARM,BUGS=IA-64] Rebooting mode
-			Format: <reboot_mode>[,<reboot_mode2>[,...]]
-			See arch/*/kernel/reboot.c or arch/*/kernel/process.c
+	reboot=		[KNL]
+			Format (x86 or x86_64):
+				[w[arm] | c[old] | h[ard] | s[oft] | g[pio]] \
+				[[,]s[mp]#### \
+				[[,]b[ios] | a[cpi] | k[bd] | t[riple] | e[fi] | p[ci]] \
+				[[,]f[orce]
+			Where reboot_mode is one of warm (soft) or cold (hard) or gpio,
+			      reboot_type is one of bios, acpi, kbd, triple, efi, or pci,
+			      reboot_force is either force or not specified,
+			      reboot_cpu is s[mp]#### with #### being the processor
+					to be used for rebooting.

 	relax_domain_level=
 			[KNL, SMP] Set scheduler's default relax_domain_level.
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 42856fc..304b102 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -175,16 +175,6 @@ void arch_cpu_idle(void)
 		default_idle();
 }

-enum reboot_mode reboot_mode = REBOOT_HARD;
-
-static int __init reboot_setup(char *str)
-{
-	if ('s' == str[0])
-		reboot_mode = REBOOT_SOFT;
-	return 1;
-}
-__setup("reboot=", reboot_setup);
-
 void machine_shutdown(void)
 {
 #ifdef CONFIG_SMP
diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c
index 93dd035..778ebba 100644
--- a/arch/unicore32/kernel/process.c
+++ b/arch/unicore32/kernel/process.c
@@ -51,16 +51,6 @@ void arch_cpu_idle(void)
 	local_irq_enable();
 }

-static enum reboot_mode reboot_mode = REBOOT_HARD;
-
-int __init reboot_setup(char *str)
-{
-	if ('s' == str[0])
-		reboot_mode = REBOOT_SOFT;
-	return 1;
-}
-__setup("reboot=", reboot_setup);
-
 void machine_halt(void)
 {
 	gpio_set_value(GPO_SOFT_OFF, 0);
diff --git a/arch/x86/include/asm/emergency-restart.h b/arch/x86/include/asm/emergency-restart.h
index 75ce3f4..77a99ac 100644
--- a/arch/x86/include/asm/emergency-restart.h
+++ b/arch/x86/include/asm/emergency-restart.h
@@ -1,18 +1,6 @@
 #ifndef _ASM_X86_EMERGENCY_RESTART_H
 #define _ASM_X86_EMERGENCY_RESTART_H

-enum reboot_type {
-	BOOT_TRIPLE = 't',
-	BOOT_KBD = 'k',
-	BOOT_BIOS = 'b',
-	BOOT_ACPI = 'a',
-	BOOT_EFI = 'e',
-	BOOT_CF9 = 'p',
-	BOOT_CF9_COND = 'q',
-};
-
-extern enum reboot_type reboot_type;
-
 extern void machine_emergency_restart(void);

 #endif /* _ASM_X86_EMERGENCY_RESTART_H */
diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c
index 794f6eb..958e3e4 100644
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -25,6 +25,7 @@
 #include <linux/kdebug.h>
 #include <linux/delay.h>
 #include <linux/crash_dump.h>
+#include <linux/reboot.h>

 #include <asm/uv/uv_mmrs.h>
 #include <asm/uv/uv_hub.h>
@@ -36,7 +37,6
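The reboot= format documented in the kernel-parameters.txt hunk of this patch can be illustrated with a userspace sketch of a parser. This is not the code from the patch (the kernel/reboot.c hunk is not visible in this message); the `sketch_*` names and the struct are hypothetical, `strtoul()` stands in for the kernel's simple_strtoul(), and the sketch assumes options are separated by commas.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's reboot_mode values. */
enum sketch_mode { MODE_COLD, MODE_WARM, MODE_HARD, MODE_SOFT, MODE_GPIO };

struct sketch_reboot_opts {
	enum sketch_mode mode;
	char type;		/* one of 'b', 'a', 'k', 't', 'e', 'p' */
	int force;		/* 1 if f[orce] was given */
	unsigned long cpu;	/* cpu chosen by s### or smp### */
};

/* Walk a comma-separated option string, keying on each option's first letter. */
void sketch_parse_reboot(const char *str, struct sketch_reboot_opts *o)
{
	for (;;) {
		switch (*str) {
		case 'w': o->mode = MODE_WARM; break;
		case 'c': o->mode = MODE_COLD; break;
		case 'h': o->mode = MODE_HARD; break;
		case 'g': o->mode = MODE_GPIO; break;
		case 's':
			/* "s###" or "smp###" selects a cpu; anything else is s[oft]. */
			if (isdigit((unsigned char)str[1]))
				o->cpu = strtoul(str + 1, NULL, 0);
			else if (str[1] == 'm' && str[2] == 'p' &&
				 isdigit((unsigned char)str[3]))
				o->cpu = strtoul(str + 3, NULL, 0);
			else
				o->mode = MODE_SOFT;
			break;
		case 'b': case 'a': case 'k':
		case 't': case 'e': case 'p':
			o->type = *str;
			break;
		case 'f':
			o->force = 1;
			break;
		}

		/* Advance to the next comma-separated option, if any. */
		str = strchr(str, ',');
		if (!str)
			break;
		str++;
	}
}
```

Because only the leading letter of each option is examined, "w", "warm", and any other word starting with 'w' are treated alike, which matches the bracketed abbreviations in the documented format.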
[PATCH -v11 resend 06/11] x86, prepare reboot_mode for moving to generic kernel code.
This patch prepares for moving the parsing of reboot= to the generic
kernel code by making reboot_mode into a more generic form.

Signed-off-by: Robin Holt h...@sgi.com
To: Andrew Morton a...@linux-foundation.org
Cc: H. Peter Anvin h...@zytor.com
Cc: Miguel Boton mboton.l...@gmail.com
Cc: Russ Anderson r...@sgi.com
Cc: Robin Holt h...@sgi.com
Cc: Russell King rmk+ker...@arm.linux.org.uk
Cc: Guan Xuetao g...@mprc.pku.edu.cn
Cc: Linux Kernel Mailing List linux-kernel@vger.kernel.org
Cc: the arch/x86 maintainers x...@kernel.org
Cc: Arm Mailing List linux-arm-ker...@lists.infradead.org
Acked-by: Ingo Molnar mi...@kernel.org
---
 arch/x86/kernel/reboot.c | 12 +++++++-----
 include/linux/reboot.h   |  5 +++++
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 76fa1e9..f770340 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -36,7 +36,7 @@ void (*pm_power_off)(void);
 EXPORT_SYMBOL(pm_power_off);

 static const struct desc_ptr no_idt = {};
-static int reboot_mode;
+static enum reboot_mode reboot_mode;
 enum reboot_type reboot_type = BOOT_ACPI;
 int reboot_force;

@@ -88,11 +88,11 @@ static int __init reboot_setup(char *str)

 		switch (*str) {
 		case 'w':
-			reboot_mode = 0x1234;
+			reboot_mode = REBOOT_WARM;
 			break;

 		case 'c':
-			reboot_mode = 0;
+			reboot_mode = REBOOT_COLD;
 			break;

 #ifdef CONFIG_SMP
@@ -536,6 +536,7 @@ static void native_machine_emergency_restart(void)
 	int i;
 	int attempt = 0;
 	int orig_reboot_type = reboot_type;
+	unsigned short mode;

 	if (reboot_emergency)
 		emergency_vmx_disable_all();
@@ -543,7 +544,8 @@ static void native_machine_emergency_restart(void)
 	tboot_shutdown(TB_SHUTDOWN_REBOOT);

 	/* Tell the BIOS if we want cold or warm reboot */
-	*((unsigned short *)__va(0x472)) = reboot_mode;
+	mode = reboot_mode == REBOOT_WARM ? 0x1234 : 0;
+	*((unsigned short *)__va(0x472)) = mode;

 	for (;;) {
 		/* Could also try the reset bit in the Hammer NB */
@@ -585,7 +587,7 @@ static void native_machine_emergency_restart(void)

 		case BOOT_EFI:
 			if (efi_enabled(EFI_RUNTIME_SERVICES))
-				efi.reset_system(reboot_mode ?
+				efi.reset_system(reboot_mode == REBOOT_WARM ?
 						 EFI_RESET_WARM :
 						 EFI_RESET_COLD,
 						 EFI_SUCCESS, 0, NULL);
diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index c6eba21..37d56c3 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -10,6 +10,11 @@
 #define SYS_HALT	0x0002	/* Notify of system halt */
 #define SYS_POWER_OFF	0x0003	/* Notify of system power off */

+enum reboot_mode {
+	REBOOT_COLD = 0,
+	REBOOT_WARM,
+};
+
 extern int register_reboot_notifier(struct notifier_block *);
 extern int unregister_reboot_notifier(struct notifier_block *);

-- 
1.8.2.1
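The key change in this patch is that reboot_mode stops holding the raw BIOS value and becomes an enum that is translated only at the point of use. A minimal userspace sketch of that translation follows; the `sketch_` names are hypothetical, and only the 0x1234 / 0 values and the warm/cold distinction are taken from the diff.

```c
#include <stdint.h>

/* Hypothetical mirror of the enum added to include/linux/reboot.h. */
enum sketch_reboot_mode {
	SKETCH_REBOOT_COLD = 0,
	SKETCH_REBOOT_WARM,
};

/*
 * Translate the generic mode into the 16-bit word the x86 code in the diff
 * stores at physical address 0x472: 0x1234 for warm, 0 for anything else.
 */
uint16_t sketch_bios_reset_flag(enum sketch_reboot_mode mode)
{
	return mode == SKETCH_REBOOT_WARM ? 0x1234 : 0;
}
```

Keeping the magic value out of the enum is what lets other architectures share the same type without inheriting an x86 BIOS convention.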
[PATCH -v11 resend 08/11] arm, Remove unused restart_mode fields from some arm subarchs
These restart_mode fields are not used at all.  Remove them to make
moving the reboot= cmdline options to the general kernel easier.

Signed-off-by: Robin Holt h...@sgi.com
To: Andrew Morton a...@linux-foundation.org
Cc: Russell King rmk+ker...@arm.linux.org.uk
Cc: Russ Anderson r...@sgi.com
Cc: Robin Holt h...@sgi.com
Cc: H. Peter Anvin h...@zytor.com
Cc: Guan Xuetao g...@mprc.pku.edu.cn
Cc: Linux Kernel Mailing List linux-kernel@vger.kernel.org
Cc: the arch/x86 maintainers x...@kernel.org
Cc: Arm Mailing List linux-arm-ker...@lists.infradead.org
Acked-by: Russell King rmk+ker...@arm.linux.org.uk
---
 arch/arm/mach-ebsa110/core.c | 1 -
 arch/arm/mach-pxa/mioa701.c  | 1 -
 arch/arm/mach-pxa/spitz.c    | 3 ---
 arch/arm/mach-pxa/tosa.c     | 1 -
 4 files changed, 6 deletions(-)

diff --git a/arch/arm/mach-ebsa110/core.c b/arch/arm/mach-ebsa110/core.c
index b13cc74..69a9d5d 100644
--- a/arch/arm/mach-ebsa110/core.c
+++ b/arch/arm/mach-ebsa110/core.c
@@ -321,7 +321,6 @@ MACHINE_START(EBSA110, "EBSA110")
 	.atag_offset	= 0x400,
 	.reserve_lp0	= 1,
 	.reserve_lp2	= 1,
-	.restart_mode	= 's',
 	.map_io		= ebsa110_map_io,
 	.init_early	= ebsa110_init_early,
 	.init_irq	= ebsa110_init_irq,
diff --git a/arch/arm/mach-pxa/mioa701.c b/arch/arm/mach-pxa/mioa701.c
index f8979b9..dbea67a 100644
--- a/arch/arm/mach-pxa/mioa701.c
+++ b/arch/arm/mach-pxa/mioa701.c
@@ -756,7 +756,6 @@ static void mioa701_machine_exit(void)

 MACHINE_START(MIOA701, "MIO A701")
 	.atag_offset	= 0x100,
-	.restart_mode	= 's',
 	.map_io		= pxa27x_map_io,
 	.nr_irqs	= PXA_NR_IRQS,
 	.init_irq	= pxa27x_init_irq,
diff --git a/arch/arm/mach-pxa/spitz.c b/arch/arm/mach-pxa/spitz.c
index 362726c..c3c0042 100644
--- a/arch/arm/mach-pxa/spitz.c
+++ b/arch/arm/mach-pxa/spitz.c
@@ -979,7 +979,6 @@ static void __init spitz_fixup(struct tag *tags, char **cmdline,

 #ifdef CONFIG_MACH_SPITZ
 MACHINE_START(SPITZ, "SHARP Spitz")
-	.restart_mode	= 'g',
 	.fixup		= spitz_fixup,
 	.map_io		= pxa27x_map_io,
 	.nr_irqs	= PXA_NR_IRQS,
@@ -993,7 +992,6 @@ MACHINE_END

 #ifdef CONFIG_MACH_BORZOI
 MACHINE_START(BORZOI, "SHARP Borzoi")
-	.restart_mode	= 'g',
 	.fixup		= spitz_fixup,
 	.map_io		= pxa27x_map_io,
 	.nr_irqs	= PXA_NR_IRQS,
@@ -1007,7 +1005,6 @@ MACHINE_END

 #ifdef CONFIG_MACH_AKITA
 MACHINE_START(AKITA, "SHARP Akita")
-	.restart_mode	= 'g',
 	.fixup		= spitz_fixup,
 	.map_io		= pxa27x_map_io,
 	.nr_irqs	= PXA_NR_IRQS,
diff --git a/arch/arm/mach-pxa/tosa.c b/arch/arm/mach-pxa/tosa.c
index 3d91d2e..a41992f 100644
--- a/arch/arm/mach-pxa/tosa.c
+++ b/arch/arm/mach-pxa/tosa.c
@@ -969,7 +969,6 @@ static void __init fixup_tosa(struct tag *tags, char **cmdline,
 }

 MACHINE_START(TOSA, "SHARP Tosa")
-	.restart_mode	= 'g',
 	.fixup		= fixup_tosa,
 	.map_io		= pxa25x_map_io,
 	.nr_irqs	= TOSA_NR_IRQS,
-- 
1.8.2.1
[PATCH -v11 resend 01/11] CPU hotplug: Provide a generic helper to disable/enable CPU hotplug
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com

There are instances in the kernel where we would like to disable
CPU hotplug (from sysfs) during some important operation.  Today
the freezer code depends on this and the code to do it was kinda
tailor-made for that.

Restructure the code and make it generic enough to be useful for
other usecases too.

Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
Signed-off-by: Robin Holt h...@sgi.com
To: Andrew Morton a...@linux-foundation.org
Cc: H. Peter Anvin h...@zytor.com
Cc: Russ Anderson r...@sgi.com
Cc: Robin Holt h...@sgi.com
Cc: Russell King rmk+ker...@arm.linux.org.uk
Cc: Guan Xuetao g...@mprc.pku.edu.cn
Cc: Linux Kernel Mailing List linux-kernel@vger.kernel.org
Cc: the arch/x86 maintainers x...@kernel.org
Cc: Arm Mailing List linux-arm-ker...@lists.infradead.org
Cc: sta...@vger.kernel.org
---
 include/linux/cpu.h |  4 ++++
 kernel/cpu.c        | 55 +++++++++++++++++++++++--------------------------------
 2 files changed, 27 insertions(+), 32 deletions(-)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index c6f6e08..9f3c7e8 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -175,6 +175,8 @@ extern struct bus_type cpu_subsys;

 extern void get_online_cpus(void);
 extern void put_online_cpus(void);
+extern void cpu_hotplug_disable(void);
+extern void cpu_hotplug_enable(void);
 #define hotcpu_notifier(fn, pri)	cpu_notifier(fn, pri)
 #define register_hotcpu_notifier(nb)	register_cpu_notifier(nb)
 #define unregister_hotcpu_notifier(nb)	unregister_cpu_notifier(nb)
@@ -198,6 +200,8 @@ static inline void cpu_hotplug_driver_unlock(void)

 #define get_online_cpus()	do { } while (0)
 #define put_online_cpus()	do { } while (0)
+#define cpu_hotplug_disable()	do { } while (0)
+#define cpu_hotplug_enable()	do { } while (0)
 #define hotcpu_notifier(fn, pri)	do { (void)(fn); } while (0)
 /* These aren't inline functions due to a GCC bug. */
 #define register_hotcpu_notifier(nb)	({ (void)(nb); 0; })
diff --git a/kernel/cpu.c b/kernel/cpu.c
index b5e4ab2..198a388 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -133,6 +133,27 @@ static void cpu_hotplug_done(void)
 	mutex_unlock(&cpu_hotplug.lock);
 }

+/*
+ * Wait for currently running CPU hotplug operations to complete (if any) and
+ * disable future CPU hotplug (from sysfs). The 'cpu_add_remove_lock' protects
+ * the 'cpu_hotplug_disabled' flag. The same lock is also acquired by the
+ * hotplug path before performing hotplug operations. So acquiring that lock
+ * guarantees mutual exclusion from any currently running hotplug operations.
+ */
+void cpu_hotplug_disable(void)
+{
+	cpu_maps_update_begin();
+	cpu_hotplug_disabled = 1;
+	cpu_maps_update_done();
+}
+
+void cpu_hotplug_enable(void)
+{
+	cpu_maps_update_begin();
+	cpu_hotplug_disabled = 0;
+	cpu_maps_update_done();
+}
+
 #else /* #if CONFIG_HOTPLUG_CPU */
 static void cpu_hotplug_begin(void) {}
 static void cpu_hotplug_done(void) {}
@@ -541,36 +562,6 @@ static int __init alloc_frozen_cpus(void)
 core_initcall(alloc_frozen_cpus);

 /*
- * Prevent regular CPU hotplug from racing with the freezer, by disabling CPU
- * hotplug when tasks are about to be frozen. Also, don't allow the freezer
- * to continue until any currently running CPU hotplug operation gets
- * completed.
- * To modify the 'cpu_hotplug_disabled' flag, we need to acquire the
- * 'cpu_add_remove_lock'. And this same lock is also taken by the regular
- * CPU hotplug path and released only after it is complete. Thus, we
- * (and hence the freezer) will block here until any currently running CPU
- * hotplug operation gets completed.
- */
-void cpu_hotplug_disable_before_freeze(void)
-{
-	cpu_maps_update_begin();
-	cpu_hotplug_disabled = 1;
-	cpu_maps_update_done();
-}
-
-
-/*
- * When tasks have been thawed, re-enable regular CPU hotplug (which had been
- * disabled while beginning to freeze tasks).
- */
-void cpu_hotplug_enable_after_thaw(void)
-{
-	cpu_maps_update_begin();
-	cpu_hotplug_disabled = 0;
-	cpu_maps_update_done();
-}
-
-/*
  * When callbacks for CPU hotplug notifications are being executed, we must
  * ensure that the state of the system with respect to the tasks being frozen
  * or not, as reported by the notification, remains unchanged *throughout the
@@ -589,12 +580,12 @@ cpu_hotplug_pm_callback(struct notifier_block *nb,

 	case PM_SUSPEND_PREPARE:
 	case PM_HIBERNATION_PREPARE:
-		cpu_hotplug_disable_before_freeze();
+		cpu_hotplug_disable();
 		break;

 	case PM_POST_SUSPEND:
 	case PM_POST_HIBERNATION:
-		cpu_hotplug_enable_after_thaw();
+		cpu_hotplug_enable();
 		break;

 	default:
-- 
1.8.2.1
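The locking argument in the new comment (the flag and the hotplug path share one lock, so setting the flag also waits out any operation in flight) can be shown with a userspace sketch. This is an illustration under stated assumptions, not kernel code: a pthread mutex stands in for cpu_add_remove_lock, all `sketch_` names are hypothetical, and returning -EBUSY from the hotplug path while disabled is an assumption of this sketch rather than something shown in the diff.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for cpu_add_remove_lock; it protects sketch_hotplug_disabled. */
static pthread_mutex_t sketch_add_remove_lock = PTHREAD_MUTEX_INITIALIZER;
static int sketch_hotplug_disabled;

/* Blocks until any running hotplug operation drops the lock, then sets the flag. */
void sketch_hotplug_disable(void)
{
	pthread_mutex_lock(&sketch_add_remove_lock);
	sketch_hotplug_disabled = 1;
	pthread_mutex_unlock(&sketch_add_remove_lock);
}

void sketch_hotplug_enable(void)
{
	pthread_mutex_lock(&sketch_add_remove_lock);
	sketch_hotplug_disabled = 0;
	pthread_mutex_unlock(&sketch_add_remove_lock);
}

/*
 * Hypothetical hotplug path: the whole operation runs under the same lock,
 * and it refuses to start while hotplug is disabled.
 */
int sketch_cpu_down(void (*op)(void))
{
	int ret = 0;

	pthread_mutex_lock(&sketch_add_remove_lock);
	if (sketch_hotplug_disabled)
		ret = -EBUSY;
	else if (op != NULL)
		op();
	pthread_mutex_unlock(&sketch_add_remove_lock);
	return ret;
}
```

Because the operation itself runs with the lock held, a caller of the disable function cannot return while an operation is half done, which is the guarantee the reboot path relies on later in the series.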
[PATCH -v11 resend 05/11] checkpatch.pl the new kernel/reboot.c file.
Get the new file to pass scripts/checkpatch.pl

Signed-off-by: Robin Holt h...@sgi.com
To: Andrew Morton a...@linux-foundation.org
Cc: H. Peter Anvin h...@zytor.com
Cc: Russ Anderson r...@sgi.com
Cc: Robin Holt h...@sgi.com
Cc: Russell King rmk+ker...@arm.linux.org.uk
Cc: Guan Xuetao g...@mprc.pku.edu.cn
Cc: Linux Kernel Mailing List linux-kernel@vger.kernel.org
Cc: the arch/x86 maintainers x...@kernel.org
Cc: Arm Mailing List linux-arm-ker...@lists.infradead.org
---

Changes since v6:
 - Removed last remaining line length warning.

---
 include/linux/reboot.h |  2 +-
 kernel/reboot.c        | 28 +++++++++++++---------------
 2 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index 23b3630..c6eba21 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -26,7 +26,7 @@ extern void machine_shutdown(void);
 struct pt_regs;
 extern void machine_crash_shutdown(struct pt_regs *);

-/* 
+/*
  * Architecture independent implemenations of sys_reboot commands.
  */

diff --git a/kernel/reboot.c b/kernel/reboot.c
index 0616483..abb6a04 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -4,6 +4,8 @@
  *  Copyright (C) 2013  Linus Torvalds
  */

+#define pr_fmt(fmt)	"reboot: " fmt
+
 #include <linux/export.h>
 #include <linux/kexec.h>
 #include <linux/kmod.h>
@@ -114,9 +116,9 @@ void kernel_restart(char *cmd)
 	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	if (!cmd)
-		printk(KERN_EMERG "Restarting system.\n");
+		pr_emerg("Restarting system\n");
 	else
-		printk(KERN_EMERG "Restarting system with command '%s'.\n", cmd);
+		pr_emerg("Restarting system with command '%s'\n", cmd);
 	kmsg_dump(KMSG_DUMP_RESTART);
 	machine_restart(cmd);
 }
@@ -125,7 +127,7 @@ EXPORT_SYMBOL_GPL(kernel_restart);
 static void kernel_shutdown_prepare(enum system_states state)
 {
 	blocking_notifier_call_chain(&reboot_notifier_list,
-		(state == SYSTEM_HALT)?SYS_HALT:SYS_POWER_OFF, NULL);
+		(state == SYSTEM_HALT) ? SYS_HALT : SYS_POWER_OFF, NULL);
 	system_state = state;
 	usermodehelper_disable();
 	device_shutdown();
@@ -140,11 +142,10 @@ void kernel_halt(void)
 	kernel_shutdown_prepare(SYSTEM_HALT);
 	migrate_to_reboot_cpu();
 	syscore_shutdown();
-	printk(KERN_EMERG "System halted.\n");
+	pr_emerg("System halted\n");
 	kmsg_dump(KMSG_DUMP_HALT);
 	machine_halt();
 }
-
 EXPORT_SYMBOL_GPL(kernel_halt);

 /**
@@ -159,7 +160,7 @@ void kernel_power_off(void)
 		pm_power_off_prepare();
 	migrate_to_reboot_cpu();
 	syscore_shutdown();
-	printk(KERN_EMERG "Power down.\n");
+	pr_emerg("Power down\n");
 	kmsg_dump(KMSG_DUMP_POWEROFF);
 	machine_power_off();
 }
@@ -188,10 +189,10 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,

 	/* For safety, we require "magic" arguments. */
 	if (magic1 != LINUX_REBOOT_MAGIC1 ||
-	    (magic2 != LINUX_REBOOT_MAGIC2 &&
-	                magic2 != LINUX_REBOOT_MAGIC2A &&
+			(magic2 != LINUX_REBOOT_MAGIC2 &&
+			magic2 != LINUX_REBOOT_MAGIC2A &&
 			magic2 != LINUX_REBOOT_MAGIC2B &&
-	                magic2 != LINUX_REBOOT_MAGIC2C))
+			magic2 != LINUX_REBOOT_MAGIC2C))
 		return -EINVAL;

 	/*
@@ -234,7 +235,8 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
 		break;

 	case LINUX_REBOOT_CMD_RESTART2:
-		if (strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1) < 0) {
+		ret = strncpy_from_user(&buffer[0], arg, sizeof(buffer) - 1);
+		if (ret < 0) {
 			ret = -EFAULT;
 			break;
 		}
@@ -282,7 +284,6 @@ void ctrl_alt_del(void)
 	else
 		kill_cad_pid(SIGINT, 1);
 }
-	

 char poweroff_cmd[POWEROFF_CMD_PATH_LEN] = "/sbin/poweroff";

@@ -301,14 +302,11 @@ static int __orderly_poweroff(bool force)
 		ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
 		argv_free(argv);
 	} else {
-		printk(KERN_WARNING "%s failed to allocate memory for \"%s\"\n",
-					 __func__, poweroff_cmd);
 		ret = -ENOMEM;
 	}

 	if (ret && force) {
-		printk(KERN_WARNING "Failed to start orderly shutdown: "
-			"forcing the issue\n");
+		pr_warn("Failed to start orderly shutdown: forcing the issue\n");
 		/*
 		 * I guess this should try to kick off some daemon to sync and
 		 * poweroff asap.  Or not even bother syncing if we're doing an
-- 
1.8.2.1
[PATCH -v11 resend 04/11] Move shutdown/reboot related functions to kernel/reboot.c
This patch is preparatory. It moves the reboot-related syscall and helper
functions from kernel/sys.c to kernel/reboot.c.

Signed-off-by: Robin Holt <h...@sgi.com>
To: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Robin Holt <h...@sgi.com>
Cc: Russell King <rmk+ker...@arm.linux.org.uk>
Cc: Guan Xuetao <g...@mprc.pku.edu.cn>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: the arch/x86 maintainers <x...@kernel.org>
Cc: Arm Mailing List <linux-arm-ker...@lists.infradead.org>
---
Changes since -v6:
- Add include of linux/uaccess.h to allow building on arm.
---
 kernel/Makefile |   2 +-
 kernel/reboot.c | 347
 kernel/sys.c    | 331 -
 3 files changed, 348 insertions(+), 332 deletions(-)
 create mode 100644 kernel/reboot.c

diff --git a/kernel/Makefile b/kernel/Makefile
index 271fd31..470839d 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -9,7 +9,7 @@ obj-y = fork.o exec_domain.o panic.o printk.o \
 	    rcupdate.o extable.o params.o posix-timers.o \
 	    kthread.o wait.o sys_ni.o posix-cpu-timers.o mutex.o \
 	    hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \
-	    notifier.o ksysfs.o cred.o \
+	    notifier.o ksysfs.o cred.o reboot.o \
 	    async.o range.o groups.o lglock.o smpboot.o
 
 ifdef CONFIG_FUNCTION_TRACER
diff --git a/kernel/reboot.c b/kernel/reboot.c
new file mode 100644
index 000..0616483
--- /dev/null
+++ b/kernel/reboot.c
@@ -0,0 +1,347 @@
+/*
+ *  linux/kernel/reboot.c
+ *
+ *  Copyright (C) 2013  Linus Torvalds
+ */
+
+#include <linux/export.h>
+#include <linux/kexec.h>
+#include <linux/kmod.h>
+#include <linux/kmsg_dump.h>
+#include <linux/reboot.h>
+#include <linux/suspend.h>
+#include <linux/syscalls.h>
+#include <linux/syscore_ops.h>
+#include <linux/uaccess.h>
+
+/*
+ * this indicates whether you can reboot with ctrl-alt-del: the default is yes
+ */
+
+int C_A_D = 1;
+struct pid *cad_pid;
+EXPORT_SYMBOL(cad_pid);
+
+/*
+ * If set, this is used for preparing the system to power off.
+ */
+
+void (*pm_power_off_prepare)(void);
+
+/**
+ *	emergency_restart - reboot the system
+ *
+ *	Without shutting down any hardware or taking any locks
+ *	reboot the system.  This is called when we know we are in
+ *	trouble so this is our best effort to reboot.  This is
+ *	safe to call in interrupt context.
+ */
+void emergency_restart(void)
+{
+	kmsg_dump(KMSG_DUMP_EMERG);
+	machine_emergency_restart();
+}
+EXPORT_SYMBOL_GPL(emergency_restart);
+
+void kernel_restart_prepare(char *cmd)
+{
+	blocking_notifier_call_chain(&reboot_notifier_list, SYS_RESTART, cmd);
+	system_state = SYSTEM_RESTART;
+	usermodehelper_disable();
+	device_shutdown();
+}
+
+/**
+ *	register_reboot_notifier - Register function to be called at reboot time
+ *	@nb: Info about notifier function to be called
+ *
+ *	Registers a function with the list of functions
+ *	to be called at reboot time.
+ *
+ *	Currently always returns zero, as blocking_notifier_chain_register()
+ *	always returns zero.
+ */
+int register_reboot_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_register(&reboot_notifier_list, nb);
+}
+EXPORT_SYMBOL(register_reboot_notifier);
+
+/**
+ *	unregister_reboot_notifier - Unregister previously registered reboot notifier
+ *	@nb: Hook to be unregistered
+ *
+ *	Unregisters a previously registered reboot
+ *	notifier function.
+ *
+ *	Returns zero on success, or %-ENOENT on failure.
+ */
+int unregister_reboot_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_unregister(&reboot_notifier_list, nb);
+}
+EXPORT_SYMBOL(unregister_reboot_notifier);
+
+static void migrate_to_reboot_cpu(void)
+{
+	/* The boot cpu is always logical cpu 0 */
+	int cpu = 0;
+
+	cpu_hotplug_disable();
+
+	/* Make certain the cpu I'm about to reboot on is online */
+	if (!cpu_online(cpu))
+		cpu = cpumask_first(cpu_online_mask);
+
+	/* Prevent races with other tasks migrating this task */
+	current->flags |= PF_NO_SETAFFINITY;
+
+	/* Make certain I only run on the appropriate processor */
+	set_cpus_allowed_ptr(current, cpumask_of(cpu));
+}
+
+/**
+ *	kernel_restart - reboot the system
+ *	@cmd: pointer to buffer containing command to execute for restart
+ *		or %NULL
+ *
+ *	Shutdown everything and perform a clean reboot.
+ *	This is not safe to call in interrupt context.
+ */
+void kernel_restart(char *cmd)
+{
+	kernel_restart_prepare(cmd);
+	migrate_to_reboot_cpu();
+	syscore_shutdown();
+	if (!cmd)
+		printk(KERN_EMERG "Restarting system.\n");
+	else
+		printk(KERN_EMERG "Restarting system with command '%s'.\n", cmd);
+	kmsg_dump
[PATCH -v11 resend 07/11] unicore32, prepare reboot_mode for moving to generic kernel code.
This patch prepares for moving the parsing of reboot= into the generic
kernel code by converting reboot_mode to a more generic form.

Signed-off-by: Robin Holt <h...@sgi.com>
To: Andrew Morton <a...@linux-foundation.org>
Cc: Guan Xuetao <g...@mprc.pku.edu.cn>
Cc: Russ Anderson <r...@sgi.com>
Cc: Robin Holt <h...@sgi.com>
Cc: Russell King <rmk+ker...@arm.linux.org.uk>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: the arch/x86 maintainers <x...@kernel.org>
Cc: Arm Mailing List <linux-arm-ker...@lists.infradead.org>
Acked-by: Guan Xuetao <g...@mprc.pku.edu.cn>
---
Changes since -v8
- Switched from using REBOOT_WARM/COLD to HARD/SOFT.
---
 arch/unicore32/kernel/process.c | 10 +-
 arch/unicore32/kernel/setup.h   |  2 +-
 arch/unicore32/mm/mmu.c         |  2 +-
 include/linux/reboot.h          |  2 ++
 4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c
index c944769..93dd035 100644
--- a/arch/unicore32/kernel/process.c
+++ b/arch/unicore32/kernel/process.c
@@ -51,14 +51,14 @@ void arch_cpu_idle(void)
 	local_irq_enable();
 }
 
-static char reboot_mode = 'h';
+static enum reboot_mode reboot_mode = REBOOT_HARD;
 
 int __init reboot_setup(char *str)
 {
-	reboot_mode = str[0];
+	if ('s' == str[0])
+		reboot_mode = REBOOT_SOFT;
 	return 1;
 }
-
 __setup("reboot=", reboot_setup);
 
 void machine_halt(void)
@@ -88,7 +88,7 @@ void machine_restart(char *cmd)
 	 * we may need it to insert some 1:1 mappings so that
 	 * soft boot works.
 	 */
-	setup_mm_for_reboot(reboot_mode);
+	setup_mm_for_reboot();
 
 	/* Clean and invalidate caches */
 	flush_cache_all();
@@ -102,7 +102,7 @@ void machine_restart(char *cmd)
 	/*
 	 * Now handle reboot code.
 	 */
-	if (reboot_mode == 's') {
+	if (reboot_mode == REBOOT_SOFT) {
 		/* Jump into ROM at address 0x */
 		cpu_reset(VECTORS_BASE);
 	} else {
diff --git a/arch/unicore32/kernel/setup.h b/arch/unicore32/kernel/setup.h
index 30f749d..f5c51b8 100644
--- a/arch/unicore32/kernel/setup.h
+++ b/arch/unicore32/kernel/setup.h
@@ -22,7 +22,7 @@ extern void puv3_ps2_init(void);
 extern void pci_puv3_preinit(void);
 extern void __init puv3_init_gpio(void);
 
-extern void setup_mm_for_reboot(char mode);
+extern void setup_mm_for_reboot(void);
 
 extern char __stubs_start[], __stubs_end[];
 extern char __vectors_start[], __vectors_end[];
diff --git a/arch/unicore32/mm/mmu.c b/arch/unicore32/mm/mmu.c
index 43c20b4..4f5a532 100644
--- a/arch/unicore32/mm/mmu.c
+++ b/arch/unicore32/mm/mmu.c
@@ -445,7 +445,7 @@ void __init paging_init(void)
  * the user-mode pages.  This will then ensure that we have predictable
  * results when turning the mmu off
  */
-void setup_mm_for_reboot(char mode)
+void setup_mm_for_reboot(void)
 {
 	unsigned long base_pmdval;
 	pgd_t *pgd;
diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index 37d56c3..ca29a6f 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -13,6 +13,8 @@
 enum reboot_mode {
 	REBOOT_COLD = 0,
 	REBOOT_WARM,
+	REBOOT_HARD,
+	REBOOT_SOFT,
 };
 
 extern int register_reboot_notifier(struct notifier_block *);
-- 
1.8.2.1
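For readers who want to try the new parsing rule outside the kernel, here is a minimal user-space sketch of what reboot_setup() does after this patch. The enum mirrors the one in include/linux/reboot.h as extended above. The helper name sketch_parse_reboot_mode() and its NULL check are assumptions made for illustration and are not part of the patch.

```c
#include <stddef.h>

/* Mirrors enum reboot_mode from include/linux/reboot.h after this patch. */
enum reboot_mode {
	REBOOT_COLD = 0,
	REBOOT_WARM,
	REBOOT_HARD,
	REBOOT_SOFT,
};

/*
 * Hypothetical helper (not in the patch): map the text that follows
 * "reboot=" to a mode. As in the patched reboot_setup(), only a leading
 * 's' selects REBOOT_SOFT; everything else keeps the REBOOT_HARD default.
 * The NULL check is an addition for user-space safety.
 */
static enum reboot_mode sketch_parse_reboot_mode(const char *str)
{
	enum reboot_mode mode = REBOOT_HARD;

	if (str != NULL && str[0] == 's')
		mode = REBOOT_SOFT;
	return mode;
}
```

One visible consequence of the enum form: the old code stored whatever character followed reboot=, while the new code keeps only the hard/soft distinction.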
[PATCH -v11 resend 03/11] Remove -stable friendly PF_THREAD_BOUND define
Remove the #define that the prior patch added to ease backporting to
the stable releases.

Signed-off-by: Robin Holt <h...@sgi.com>
To: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Robin Holt <h...@sgi.com>
Cc: Russell King <rmk+ker...@arm.linux.org.uk>
Cc: Guan Xuetao <g...@mprc.pku.edu.cn>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: the arch/x86 maintainers <x...@kernel.org>
Cc: Arm Mailing List <linux-arm-ker...@lists.infradead.org>
---
 kernel/sys.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index 2bbd9a7..17bb8d3 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -362,11 +362,6 @@ int unregister_reboot_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_reboot_notifier);
 
-/* Add backwards compatibility for stable trees. */
-#ifndef PF_NO_SETAFFINITY
-#define PF_NO_SETAFFINITY		PF_THREAD_BOUND
-#endif
-
 static void migrate_to_reboot_cpu(void)
 {
 	/* The boot cpu is always logical cpu 0 */
-- 
1.8.2.1
[PATCH -v11 resend 02/11] Migrate shutdown/reboot to boot cpu.
We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus. The slowdown was tracked to commit
f96972f.

The current implementation does all the work of hot removing the cpus
before halting the system. We are switching to just migrating to the
boot cpu and then continuing with shutdown/reboot.

This also has the effect of not breaking x86's command line parameter
for specifying the reboot cpu. Note, this code was shamelessly copied
from arch/x86/kernel/reboot.c with bits removed pertaining to the
reboot_cpu command line parameter.

Signed-off-by: Robin Holt <h...@sgi.com>
Tested-by: Shawn Guo <shawn@linaro.org>
To: Andrew Morton <a...@linux-foundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Russ Anderson <r...@sgi.com>
Cc: Robin Holt <h...@sgi.com>
Cc: Russell King <rmk+ker...@arm.linux.org.uk>
Cc: Guan Xuetao <g...@mprc.pku.edu.cn>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: the arch/x86 maintainers <x...@kernel.org>
Cc: Arm Mailing List <linux-arm-ker...@lists.infradead.org>
Cc: <sta...@vger.kernel.org>
---
Changes since -v8
- Change stack parameter to make future patches cleaner.

Changes since -v6:
- Add #define for PF_THREAD_BOUND as compatibility to make stable easier.
- Fixup s/reboot_cpu_id/reboot_cpu/
---
 kernel/sys.c | 29 ++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index b95d3c7..2bbd9a7 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -362,6 +362,29 @@ int unregister_reboot_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_reboot_notifier);
 
+/* Add backwards compatibility for stable trees. */
+#ifndef PF_NO_SETAFFINITY
+#define PF_NO_SETAFFINITY		PF_THREAD_BOUND
+#endif
+
+static void migrate_to_reboot_cpu(void)
+{
+	/* The boot cpu is always logical cpu 0 */
+	int cpu = 0;
+
+	cpu_hotplug_disable();
+
+	/* Make certain the cpu I'm about to reboot on is online */
+	if (!cpu_online(cpu))
+		cpu = cpumask_first(cpu_online_mask);
+
+	/* Prevent races with other tasks migrating this task */
+	current->flags |= PF_NO_SETAFFINITY;
+
+	/* Make certain I only run on the appropriate processor */
+	set_cpus_allowed_ptr(current, cpumask_of(cpu));
+}
+
 /**
  *	kernel_restart - reboot the system
  *	@cmd: pointer to buffer containing command to execute for restart
@@ -373,7 +396,7 @@ EXPORT_SYMBOL(unregister_reboot_notifier);
 void kernel_restart(char *cmd)
 {
 	kernel_restart_prepare(cmd);
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	if (!cmd)
 		printk(KERN_EMERG "Restarting system.\n");
@@ -400,7 +423,7 @@ static void kernel_shutdown_prepare(enum system_states state)
 void kernel_halt(void)
 {
 	kernel_shutdown_prepare(SYSTEM_HALT);
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	printk(KERN_EMERG "System halted.\n");
 	kmsg_dump(KMSG_DUMP_HALT);
@@ -419,7 +442,7 @@ void kernel_power_off(void)
 	kernel_shutdown_prepare(SYSTEM_POWER_OFF);
 	if (pm_power_off_prepare)
 		pm_power_off_prepare();
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	printk(KERN_EMERG "Power down.\n");
 	kmsg_dump(KMSG_DUMP_POWEROFF);
-- 
1.8.2.1
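The CPU selection rule in migrate_to_reboot_cpu() (prefer logical CPU 0, otherwise fall back to the first online CPU) can be modelled in user space. The sketch below is only an illustration under stated assumptions: the online set is a 64-bit mask where bit N means CPU N is online, and sketch_pick_reboot_cpu() is a hypothetical name. The kernel uses cpu_online() and cpumask_first() on a real cpumask instead, and the empty-mask case cannot arise there.

```c
#include <stdint.h>

/*
 * Hypothetical user-space model of the choice made in
 * migrate_to_reboot_cpu(). Bit N of online_mask set means CPU N is
 * online. Returns the CPU to run the shutdown on, or -1 if the mask
 * is empty (an addition for the model only).
 */
static int sketch_pick_reboot_cpu(uint64_t online_mask)
{
	int cpu = 0;	/* the boot cpu is assumed to be logical cpu 0 */

	if (online_mask == 0)
		return -1;

	/* Prefer the boot cpu when it is online. */
	if (online_mask & 1ULL)
		return cpu;

	/* Otherwise take the lowest-numbered online cpu. */
	while (!(online_mask & (1ULL << cpu)))
		cpu++;
	return cpu;
}
```

The model shows why the approach scales: the choice is a single lookup, whereas disable_nonboot_cpus() had to hot-remove every other CPU in turn.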
Re: kmalloc warning in mlx4_buddy_init.
On Wed, May 15, 2013 at 07:15:42AM -0700, Eric Dumazet wrote:
> On Wed, 2013-05-15 at 03:23 -0500, Robin Holt wrote:
> > Roland,
> >
> > We are seeing the following when booting on a large system.
> >
> > [  171.399023] mlx4_core 0004:01:00.0: irq 2410 for MSI/MSI-X
> > [  171.406560] ------------[ cut here ]------------
> > [  171.411734] WARNING: at mm/slab_common.c:376 kmalloc_slab+0x71/0x90()
> > [  171.418919] Modules linked in: mlx4_core(+) sg lpc_ich mfd_core shpchp pci_hotplug ehci_pci ehci_hcd ioatdma i2c_i801 igb dca i2c_algo_bit i2c_core ptp pps_core mperf processor thermal_sys hwmon usbcore usb_common ext4 jbd2 crc16 sd_mod crc_t10dif qla2xxx scsi_transport_fc scsi_tgt megaraid_sas ahci libahci isci libsas libata scsi_transport_sas scsi_mod button dm_mirror dm_region_hash dm_log dm_mod gru(O) xvma(O)
> > [  171.460377] CPU: 48 PID: 2561 Comm: kworker/48:1 Tainted: GW O 3.10.0-rc1-uv-hz100-rja+ #3
> > [  171.470473] Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013
> > [  171.479720] Workqueue: events work_for_cpu_fn
> > [  171.484597]  0178 8867bb0f5ba8 814a873c 8867bb0f5be8
> > [  171.492897]  81045a7b 80d080d0 0020 88679bc7cb80
> > [  171.501205]  82d0 8867bb0f5bf8
> > [  171.509502] Call Trace:
> > [  171.512266]  [814a873c] dump_stack+0x19/0x1d
> > [  171.518007]  [81045a7b] warn_slowpath_common+0x6b/0xa0
> > [  171.524711]  [81045ac5] warn_slowpath_null+0x15/0x20
> > [  171.531230]  [811258c1] kmalloc_slab+0x71/0x90
> > [  171.537176]  [81152e10] __kmalloc+0x30/0x220
> > [  171.542989]  [a03a9f4b] ? mlx4_buddy_init+0xdb/0x1d0 [mlx4_core]
> > [  171.550699]  [a03a9f4b] mlx4_buddy_init+0xdb/0x1d0 [mlx4_core]
> > [  171.558183]  [a03aa0ef] mlx4_init_mr_table+0xaf/0x130 [mlx4_core]
> > [  171.565964]  [a03a3c48] mlx4_setup_hca+0x158/0x5a0 [mlx4_core]
> > [  171.573446]  [a03a5b90] __mlx4_init_one+0x720/0x9c0 [mlx4_core]
> > [  171.581030]  [a03a5e7c] mlx4_init_one+0x2c/0x60 [mlx4_core]
> > [  171.588232]  [8128c599] local_pci_probe+0x49/0x80
> > [  171.594458]  [810606f3] work_for_cpu_fn+0x13/0x20
> > [  171.600692]  [81064114] process_one_work+0x194/0x3d0
> > [  171.607200]  [81065464] worker_thread+0x2c4/0x410
> > [  171.613421]  [810651a0] ? manage_workers+0x190/0x190
> > [  171.619940]  [8106aee6] kthread+0xc6/0xd0
> > [  171.625392]  [8106ae20] ? kthread_freezable_should_stop+0x70/0x70
> > [  171.633182]  [814b42ec] ret_from_fork+0x7c/0xb0
> > [  171.639204]  [8106ae20] ? kthread_freezable_should_stop+0x70/0x70
> > [  171.646976] ---[ end trace 822f6d487f108023 ]---
> > [  171.715920] mlx4_core 0004:01:00.0: command 0xc failed: fw status = 0x40
> > [  171.723888] mlx4_core: Initializing 0007:02:00.0
> >
> > This looks to be a kmalloc larger than MAX_ORDER. Not sure which of the two
> > kcallocs in mlx4_buddy_init.
>
> Same problem here, its a real old problem that I mentioned.

Is there any pressure against getting this changed or an equivalent change
made upstream?

> I usually use following hack to reduce the allocation size by 50%
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
> index 0d32a82..b22f116 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/main.c
> @@ -126,7 +126,7 @@ static int log_num_vlan;
>  module_param_named(log_num_vlan, log_num_vlan, int, 0444);
>  MODULE_PARM_DESC(log_num_vlan, "Log2 max number of VLANs per ETH port (0-7)");
>  /* Log2 max number of VLANs per ETH port (0-7) */
> -#define MLX4_LOG_NUM_VLANS 7
> +#define MLX4_LOG_NUM_VLANS 6

This seems to work around the problem, but I think I might have something
else going on as well. Without this patch, it will succeed if I have the
driver configured to be built into the kernel. It will fail when I have
it configured as a loadable module. I am not certain I did not accidentally
change something else as well.

I am going to use this for my local stuff and hope a fix gets upstream.

Thanks,
Robin
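To make the "kmalloc larger than MAX_ORDER" diagnosis concrete, here is a small user-space sketch of the arithmetic involved. It assumes 4 KiB pages (PAGE_SHIFT of 12) and MAX_ORDER of 11, which were common x86_64 defaults at the time but are assumptions here, not values taken from the report. The SKETCH_* names and both helpers are hypothetical, and the exact kmalloc cap in a given kernel depends on the slab allocator configuration.

```c
#include <stdint.h>

/* Assumed values; not taken from the report above. */
#define SKETCH_PAGE_SHIFT	12	/* 4 KiB pages */
#define SKETCH_MAX_ORDER	11	/* buddy allocator orders 0..10 */

/* Largest physically contiguous block under the assumptions above. */
static uint64_t sketch_max_contig_bytes(void)
{
	return (uint64_t)1 << (SKETCH_PAGE_SHIFT + SKETCH_MAX_ORDER - 1);
}

/*
 * Hypothetical check: would a kcalloc(nmemb, size)-style request stay
 * within that limit? Returns 1 if it fits, 0 if it is too large or the
 * multiplication would overflow.
 */
static int sketch_request_fits(uint64_t nmemb, uint64_t size)
{
	if (size != 0 && nmemb > UINT64_MAX / size)
		return 0;
	return nmemb * size <= sketch_max_contig_bytes();
}
```

Under these assumptions the limit is 4 MiB, so lowering a log2-sized parameter by one, as the quoted hack does, halves the request and can bring it back under the cap.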
Re: Commit 911af505 introduced a bootmem warning.
On Wed, May 15, 2013 at 07:32:49AM -0700, Paul E. McKenney wrote:
> On Wed, May 15, 2013 at 02:57:42AM -0500, Robin Holt wrote:
> > Paul,
> >
> > When we boot Linus' current kernel we get the following warning early
> > in boot:
> >
> > --
> > [    0.000000] Memory: 63081268k/99598336k available (4832k kernel code, 34651396k absent, 1865672k reserved, 6269k data, 1672k init)
> > [    0.000000] Hierarchical RCU implementation.
> > [    0.000000] RCU dyntick-idle grace-period acceleration is enabled.
> > [    0.000000] RCU restricting CPUs from NR_CPUS=4096 to nr_cpu_ids=32.
> > [    0.000000] ------------[ cut here ]------------
> > [    0.000000] WARNING: at mm/nobootmem.c:215 ___alloc_bootmem_nopanic+0x79/0x82()
> > [    0.000000] Modules linked in:
> > [    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-rc1-uv-hz100-rja+ #3
> > [    0.000000] Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013
> > [    0.000000]  00d7 81a01e38 814a873c 81a01e78
> > [    0.000000]  81045a7b 81ced5a2
> > [    0.000000]  0040 0200 67da2bfc 81a01e88
> > [    0.000000] Call Trace:
> > [    0.000000]  [814a873c] dump_stack+0x19/0x1d
> > [    0.000000]  [81045a7b] warn_slowpath_common+0x6b/0xa0
> > [    0.000000]  [81045ac5] warn_slowpath_null+0x15/0x20
> > [    0.000000]  [81b1793a] ___alloc_bootmem_nopanic+0x79/0x82
> > [    0.000000]  [81b17a59] ___alloc_bootmem+0x11/0x3c
> > [    0.000000]  [81b17aa4] __alloc_bootmem+0x10/0x12
> > [    0.000000]  [81b233a7] alloc_bootmem_cpumask_var+0x1d/0x27
> > [    0.000000]  [81b1119b] rcu_bootup_announce_oddness+0xd0/0x153
> > [    0.000000]  [81b11735] rcu_init+0x1e/0x1e6
> > [    0.000000]  [81aedf2f] start_kernel+0x1e6/0x43c
> > [    0.000000]  [81aedb3b] ? repair_env_string+0x58/0x58
> > [    0.000000]  [81aed4d1] x86_64_start_reservations+0x1b/0x32
> > [    0.000000]  [81aed612] x86_64_start_kernel+0x12a/0x131
> > [    0.000000] ---[ end trace c8b13488e92fad65 ]---
> > [    0.000000] Experimental no-CBs for all CPUs
> > [    0.000000] Experimental no-CBs CPUs: 0-31.
> > [    0.000000] NO_HZ: Full dynticks CPUs: 1-31.
>
> Could you please try the following patch and let me know if it helps?
>
> 							Thanx, Paul
>
> rcu: Don't allocate bootmem from rcu_init()
>
> When rcu_init() is called we already have slab working, allocating
> bootmem at that point results in warnings and an allocation from
> slab.  This commit therefore changes alloc_bootmem_cpumask_var() to
> alloc_cpumask_var() in rcu_bootup_announce_oddness(), which is called
> from rcu_init().
>
> Signed-off-by: Sasha Levin <sasha.le...@oracle.com>
> Signed-off-by: Paul E. McKenney <paul...@linux.vnet.ibm.com>
> Reviewed-by: Josh Triplett <j...@joshtriplett.org>

Tested-by: Robin Holt <h...@sgi.com>

Works great.

Robin
kmalloc warning in mlx4_buddy_init.
Roland,

We are seeing the following when booting on a large system.

[  171.399023] mlx4_core 0004:01:00.0: irq 2410 for MSI/MSI-X
[  171.406560] ------------[ cut here ]------------
[  171.411734] WARNING: at mm/slab_common.c:376 kmalloc_slab+0x71/0x90()
[  171.418919] Modules linked in: mlx4_core(+) sg lpc_ich mfd_core shpchp pci_hotplug ehci_pci ehci_hcd ioatdma i2c_i801 igb dca i2c_algo_bit i2c_core ptp pps_core mperf processor thermal_sys hwmon usbcore usb_common ext4 jbd2 crc16 sd_mod crc_t10dif qla2xxx scsi_transport_fc scsi_tgt megaraid_sas ahci libahci isci libsas libata scsi_transport_sas scsi_mod button dm_mirror dm_region_hash dm_log dm_mod gru(O) xvma(O)
[  171.460377] CPU: 48 PID: 2561 Comm: kworker/48:1 Tainted: GW O 3.10.0-rc1-uv-hz100-rja+ #3
[  171.470473] Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013
[  171.479720] Workqueue: events work_for_cpu_fn
[  171.484597]  0178 8867bb0f5ba8 814a873c 8867bb0f5be8
[  171.492897]  81045a7b 80d080d0 0020 88679bc7cb80
[  171.501205]  82d0 8867bb0f5bf8
[  171.509502] Call Trace:
[  171.512266]  [814a873c] dump_stack+0x19/0x1d
[  171.518007]  [81045a7b] warn_slowpath_common+0x6b/0xa0
[  171.524711]  [81045ac5] warn_slowpath_null+0x15/0x20
[  171.531230]  [811258c1] kmalloc_slab+0x71/0x90
[  171.537176]  [81152e10] __kmalloc+0x30/0x220
[  171.542989]  [a03a9f4b] ? mlx4_buddy_init+0xdb/0x1d0 [mlx4_core]
[  171.550699]  [a03a9f4b] mlx4_buddy_init+0xdb/0x1d0 [mlx4_core]
[  171.558183]  [a03aa0ef] mlx4_init_mr_table+0xaf/0x130 [mlx4_core]
[  171.565964]  [a03a3c48] mlx4_setup_hca+0x158/0x5a0 [mlx4_core]
[  171.573446]  [a03a5b90] __mlx4_init_one+0x720/0x9c0 [mlx4_core]
[  171.581030]  [a03a5e7c] mlx4_init_one+0x2c/0x60 [mlx4_core]
[  171.588232]  [8128c599] local_pci_probe+0x49/0x80
[  171.594458]  [810606f3] work_for_cpu_fn+0x13/0x20
[  171.600692]  [81064114] process_one_work+0x194/0x3d0
[  171.607200]  [81065464] worker_thread+0x2c4/0x410
[  171.613421]  [810651a0] ? manage_workers+0x190/0x190
[  171.619940]  [8106aee6] kthread+0xc6/0xd0
[  171.625392]  [8106ae20] ? kthread_freezable_should_stop+0x70/0x70
[  171.633182]  [814b42ec] ret_from_fork+0x7c/0xb0
[  171.639204]  [8106ae20] ? kthread_freezable_should_stop+0x70/0x70
[  171.646976] ---[ end trace 822f6d487f108023 ]---
[  171.715920] mlx4_core 0004:01:00.0: command 0xc failed: fw status = 0x40
[  171.723888] mlx4_core: Initializing 0007:02:00.0

This looks to be a kmalloc larger than MAX_ORDER. Not sure which of the two
kcallocs in mlx4_buddy_init.

Thanks,
Robin Holt
Commit 911af505 introduced a bootmem warning.
Paul,

When we boot Linus' current kernel we get the following warning early
in boot:

--
[    0.000000] Memory: 63081268k/99598336k available (4832k kernel code, 34651396k absent, 1865672k reserved, 6269k data, 1672k init)
[    0.000000] Hierarchical RCU implementation.
[    0.000000] RCU dyntick-idle grace-period acceleration is enabled.
[    0.000000] RCU restricting CPUs from NR_CPUS=4096 to nr_cpu_ids=32.
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: at mm/nobootmem.c:215 ___alloc_bootmem_nopanic+0x79/0x82()
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-rc1-uv-hz100-rja+ #3
[    0.000000] Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013
[    0.000000]  00d7 81a01e38 814a873c 81a01e78
[    0.000000]  81045a7b 81ced5a2
[    0.000000]  0040 0200 67da2bfc 81a01e88
[    0.000000] Call Trace:
[    0.000000]  [814a873c] dump_stack+0x19/0x1d
[    0.000000]  [81045a7b] warn_slowpath_common+0x6b/0xa0
[    0.000000]  [81045ac5] warn_slowpath_null+0x15/0x20
[    0.000000]  [81b1793a] ___alloc_bootmem_nopanic+0x79/0x82
[    0.000000]  [81b17a59] ___alloc_bootmem+0x11/0x3c
[    0.000000]  [81b17aa4] __alloc_bootmem+0x10/0x12
[    0.000000]  [81b233a7] alloc_bootmem_cpumask_var+0x1d/0x27
[    0.000000]  [81b1119b] rcu_bootup_announce_oddness+0xd0/0x153
[    0.000000]  [81b11735] rcu_init+0x1e/0x1e6
[    0.000000]  [81aedf2f] start_kernel+0x1e6/0x43c
[    0.000000]  [81aedb3b] ? repair_env_string+0x58/0x58
[    0.000000]  [81aed4d1] x86_64_start_reservations+0x1b/0x32
[    0.000000]  [81aed612] x86_64_start_kernel+0x12a/0x131
[    0.000000] ---[ end trace c8b13488e92fad65 ]---
[    0.000000] Experimental no-CBs for all CPUs
[    0.000000] Experimental no-CBs CPUs: 0-31.
[    0.000000] NO_HZ: Full dynticks CPUs: 1-31.

$ grep RCU .config
# RCU Subsystem
CONFIG_TREE_RCU=y
# CONFIG_PREEMPT_RCU is not set
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_USER_QS=y
CONFIG_RCU_FANOUT=64
CONFIG_RCU_FANOUT_LEAF=16
# CONFIG_RCU_FANOUT_EXACT is not set
CONFIG_RCU_FAST_NO_HZ=y
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_NOCB_CPU_ALL=y
# RCU Debugging
# CONFIG_SPARSE_RCU_POINTER is not set
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=60
# CONFIG_RCU_CPU_STALL_INFO is not set
# CONFIG_RCU_TRACE is not set

$ git rev-parse HEAD
1f638766ffcd9f08209afcabb3e2df961552fe18

cca6f3931 (Paul E. McKenney 2012-05-08 21:00:28 -0700  86) 	if (nr_cpu_ids != NR_CPUS)
cca6f3931 (Paul E. McKenney 2012-05-08 21:00:28 -0700  87) 		printk(KERN_INFO "\tRCU restricting CPUs from NR_CPUS=%d to nr_cpu_ids=%d.\n", NR_CPUS, nr_cpu_ids);
3fbfbf7a3 (Paul E. McKenney 2012-08-19 21:35:53 -0700  88) #ifdef CONFIG_RCU_NOCB_CPU
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800  89) #ifndef CONFIG_RCU_NOCB_CPU_NONE
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800  90) 	if (!have_rcu_nocb_mask) {
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800  91) 		alloc_bootmem_cpumask_var(&rcu_nocb_mask);
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800  92) 		have_rcu_nocb_mask = true;
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800  93) 	}
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800  94) #ifdef CONFIG_RCU_NOCB_CPU_ZERO
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800  95) 	pr_info("\tExperimental no-CBs CPU 0\n");
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800  96) 	cpumask_set_cpu(0, rcu_nocb_mask);
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800  97) #endif /* #ifdef CONFIG_RCU_NOCB_CPU_ZERO */
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800  98) #ifdef CONFIG_RCU_NOCB_CPU_ALL
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800  99) 	pr_info("\tExperimental no-CBs for all CPUs\n");
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800 100) 	cpumask_setall(rcu_nocb_mask);
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800 101) #endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
911af505e (Paul E. McKenney 2013-02-11 10:23:27 -0800 102) #endif /* #ifndef CONFIG_RCU_NOCB_CPU_NONE */

Thanks,
Robin Holt
Re: Commit 911af505 introduced a bootmem warning.
On Wed, May 15, 2013 at 07:32:49AM -0700, Paul E. McKenney wrote: On Wed, May 15, 2013 at 02:57:42AM -0500, Robin Holt wrote: Paul, When we boot Linus' current kernel we get the following warning early in boot: -- [0.00] Memory: 63081268k/99598336k available (4832k kernel code, 34651396k absent, 1865672k reserved, 6269k data, 1672k init) [0.00] Hierarchical RCU implementation. [0.00] RCU dyntick-idle grace-period acceleration is enabled. [0.00] RCU restricting CPUs from NR_CPUS=4096 to nr_cpu_ids=32. [0.00] [ cut here ] [0.00] WARNING: at mm/nobootmem.c:215 ___alloc_bootmem_nopanic+0x79/0x82() [0.00] Modules linked in: [0.00] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-rc1-uv-hz100-rja+ #3 [0.00] Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013 [0.00] 00d7 81a01e38 814a873c 81a01e78 [0.00] 81045a7b 81ced5a2 [0.00] 0040 0200 67da2bfc 81a01e88 [0.00] Call Trace: [0.00] [814a873c] dump_stack+0x19/0x1d [0.00] [81045a7b] warn_slowpath_common+0x6b/0xa0 [0.00] [81045ac5] warn_slowpath_null+0x15/0x20 [0.00] [81b1793a] ___alloc_bootmem_nopanic+0x79/0x82 [0.00] [81b17a59] ___alloc_bootmem+0x11/0x3c [0.00] [81b17aa4] __alloc_bootmem+0x10/0x12 [0.00] [81b233a7] alloc_bootmem_cpumask_var+0x1d/0x27 [0.00] [81b1119b] rcu_bootup_announce_oddness+0xd0/0x153 [0.00] [81b11735] rcu_init+0x1e/0x1e6 [0.00] [81aedf2f] start_kernel+0x1e6/0x43c [0.00] [81aedb3b] ? repair_env_string+0x58/0x58 [0.00] [81aed4d1] x86_64_start_reservations+0x1b/0x32 [0.00] [81aed612] x86_64_start_kernel+0x12a/0x131 [0.00] ---[ end trace c8b13488e92fad65 ]--- [0.00] Experimental no-CBs for all CPUs [0.00] Experimental no-CBs CPUs: 0-31. [0.00] NO_HZ: Full dynticks CPUs: 1-31. Could you please try the following patch and let me know if it helps? Thanx, Paul rcu: Don't allocate bootmem from rcu_init() When rcu_init() is called we already have slab working, allocating bootmem at that point results in warnings and an allocation from slab. 
This commit therefore changes alloc_bootmem_cpumask_var() to alloc_cpumask_var() in rcu_bootup_announce_oddness(), which is called from rcu_init().

Signed-off-by: Sasha Levin sasha.le...@oracle.com
Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com
Reviewed-by: Josh Triplett j...@joshtriplett.org
Tested-by: Robin Holt h...@sgi.com

Works great.

Robin
Re: kmalloc warning in mlx4_buddy_init.
On Wed, May 15, 2013 at 07:15:42AM -0700, Eric Dumazet wrote:
On Wed, 2013-05-15 at 03:23 -0500, Robin Holt wrote:

Roland,

We are seeing the following when booting on a large system.

[ 171.399023] mlx4_core 0004:01:00.0: irq 2410 for MSI/MSI-X
[ 171.406560] [ cut here ]
[ 171.411734] WARNING: at mm/slab_common.c:376 kmalloc_slab+0x71/0x90()
[ 171.418919] Modules linked in: mlx4_core(+) sg lpc_ich mfd_core shpchp pci_hotplug ehci_pci ehci_hcd ioatdma i2c_i801 igb dca i2c_algo_bit i2c_core ptp pps_core mperf processor thermal_sys hwmon usbcore usb_common ext4 jbd2 crc16 sd_mod crc_t10dif qla2xxx scsi_transport_fc scsi_tgt megaraid_sas ahci libahci isci libsas libata scsi_transport_sas scsi_mod button dm_mirror dm_region_hash dm_log dm_mod gru(O) xvma(O)
[ 171.460377] CPU: 48 PID: 2561 Comm: kworker/48:1 Tainted: GW O 3.10.0-rc1-uv-hz100-rja+ #3
[ 171.470473] Hardware name: SGI UV2000/ROMLEY, BIOS SGI UV 2000/3000 series BIOS 01/15/2013
[ 171.479720] Workqueue: events work_for_cpu_fn
[ 171.484597] 0178 8867bb0f5ba8 814a873c 8867bb0f5be8
[ 171.492897] 81045a7b 80d080d0 0020 88679bc7cb80
[ 171.501205] 82d0 8867bb0f5bf8
[ 171.509502] Call Trace:
[ 171.512266] [814a873c] dump_stack+0x19/0x1d
[ 171.518007] [81045a7b] warn_slowpath_common+0x6b/0xa0
[ 171.524711] [81045ac5] warn_slowpath_null+0x15/0x20
[ 171.531230] [811258c1] kmalloc_slab+0x71/0x90
[ 171.537176] [81152e10] __kmalloc+0x30/0x220
[ 171.542989] [a03a9f4b] ? mlx4_buddy_init+0xdb/0x1d0 [mlx4_core]
[ 171.550699] [a03a9f4b] mlx4_buddy_init+0xdb/0x1d0 [mlx4_core]
[ 171.558183] [a03aa0ef] mlx4_init_mr_table+0xaf/0x130 [mlx4_core]
[ 171.565964] [a03a3c48] mlx4_setup_hca+0x158/0x5a0 [mlx4_core]
[ 171.573446] [a03a5b90] __mlx4_init_one+0x720/0x9c0 [mlx4_core]
[ 171.581030] [a03a5e7c] mlx4_init_one+0x2c/0x60 [mlx4_core]
[ 171.588232] [8128c599] local_pci_probe+0x49/0x80
[ 171.594458] [810606f3] work_for_cpu_fn+0x13/0x20
[ 171.600692] [81064114] process_one_work+0x194/0x3d0
[ 171.607200] [81065464] worker_thread+0x2c4/0x410
[ 171.613421] [810651a0] ? manage_workers+0x190/0x190
[ 171.619940] [8106aee6] kthread+0xc6/0xd0
[ 171.625392] [8106ae20] ? kthread_freezable_should_stop+0x70/0x70
[ 171.633182] [814b42ec] ret_from_fork+0x7c/0xb0
[ 171.639204] [8106ae20] ? kthread_freezable_should_stop+0x70/0x70
[ 171.646976] ---[ end trace 822f6d487f108023 ]---
[ 171.715920] mlx4_core 0004:01:00.0: command 0xc failed: fw status = 0x40
[ 171.723888] mlx4_core: Initializing 0007:02:00.0

This looks to be a kmalloc larger than MAX_ORDER. Not sure which of the two kcallocs in mlx4_buddy_init.

Same problem here, it's a real old problem that I mentioned.

Is there any pressure against getting this changed or an equivalent change made upstream?

I usually use the following hack to reduce the allocation size by 50%

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 0d32a82..b22f116 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -126,7 +126,7 @@ static int log_num_vlan;
 module_param_named(log_num_vlan, log_num_vlan, int, 0444);
 MODULE_PARM_DESC(log_num_vlan, "Log2 max number of VLANs per ETH port (0-7)");
 /* Log2 max number of VLANs per ETH port (0-7) */
-#define MLX4_LOG_NUM_VLANS 7
+#define MLX4_LOG_NUM_VLANS 6

This seems to work around the problem, but I think I might have something else going on as well. Without this patch, it will succeed if I have the driver configured to be built into the kernel. It will fail when I have it configured as a loadable module. I am not certain I did not accidentally change something else as well.

I am going to use this for my local stuff and hope a fix gets upstream.

Thanks,
Robin
Re: Full dynticks needs evtdesc set before marking cpu online.
On Mon, May 13, 2013 at 04:04:45PM +0200, Thomas Gleixner wrote:
> On Mon, 13 May 2013, Robin Holt wrote:
> > On Mon, May 13, 2013 at 03:03:55PM +0200, Thomas Gleixner wrote:
> > > On Mon, 13 May 2013, Robin Holt wrote:
> > >
> > > > On Mon, May 13, 2013 at 11:21:00AM +0200, Thomas Gleixner wrote:
> > > > > On Wed, 8 May 2013, Robin Holt wrote:
> > > > >
> > > > > > Thomas,
> > > > > >
> > > > > > We are seeing failures booting medium sized machines which I think
> > > > > > is a change in expectations that dyntick put on x86's start_secondary.
> > > > > >
> > > > > > During boot of cpus, we see an occasional panic in
> > > > > > tick_do_broadcast at
> > > > >
> > > > > http://lkml.indiana.edu/hypermail/linux/kernel/1305.0/01818.html
> > > > >
> > > > > Will hit Linus tree soon.
> > > >
> > > > I think this is really due to a sequence in start_secondary. The cpu
> > > > has been marked as online, but its evtdesc has not been initialized.
> > > > I sent a followup to this with a hack/patch.
> > >
> > > No, the real issue is that I messed up the cpumask conversion in the
> > > broadcast code, i.e. using alloc instead of zalloc, which allocated
> > > nonzeroed memory for the cpumasks, so any random bit set will crash
> > > the machine. Your patch is just papering over the issue.
> >
> > I believe I understand now. What would be the downside of moving
> > the initialization to before marking the cpu online? It seems like a
> > reasonable thing to expect as well in spite of it not being the right
> > fix to the other bug.
>
> Yes, we can move it, but it's not a required thing that the tick device
> is set up before onlining.

I tested with your patch and it does fix my problem as well.

Thank you,
Robin
Re: Full dynticks needs evtdesc set before marking cpu online.
On Mon, May 13, 2013 at 03:03:55PM +0200, Thomas Gleixner wrote:
> On Mon, 13 May 2013, Robin Holt wrote:
>
> > On Mon, May 13, 2013 at 11:21:00AM +0200, Thomas Gleixner wrote:
> > > On Wed, 8 May 2013, Robin Holt wrote:
> > >
> > > > Thomas,
> > > >
> > > > We are seeing failures booting medium sized machines which I think is
> > > > a change in expectations that dyntick put on x86's start_secondary.
> > > >
> > > > During boot of cpus, we see an occasional panic in tick_do_broadcast at
> > >
> > > http://lkml.indiana.edu/hypermail/linux/kernel/1305.0/01818.html
> > >
> > > Will hit Linus tree soon.
> >
> > I think this is really due to a sequence in start_secondary. The cpu
> > has been marked as online, but its evtdesc has not been initialized.
> > I sent a followup to this with a hack/patch.
>
> No, the real issue is that I messed up the cpumask conversion in the
> broadcast code, i.e. using alloc instead of zalloc, which allocated
> nonzeroed memory for the cpumasks, so any random bit set will crash
> the machine. Your patch is just papering over the issue.

I believe I understand now. What would be the downside of moving
the initialization to before marking the cpu online? It seems like a
reasonable thing to expect as well in spite of it not being the right
fix to the other bug.

Thanks,
Robin
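The alloc-versus-zalloc problem Thomas describes above can be illustrated with a small userspace sketch. This is not the kernel's cpumask code: mask_alloc() and mask_zalloc() are hypothetical stand-ins for alloc_cpumask_var() and zalloc_cpumask_var(), the mask size is an assumption, and the 0xA5 fill only imitates stale heap contents in a repeatable way.

```c
#include <stdlib.h>
#include <string.h>

#define NR_CPUS_SKETCH 64                     /* hypothetical CPU count */
#define MASK_BYTES     (NR_CPUS_SKETCH / 8)

/*
 * Hypothetical stand-in for alloc_cpumask_var(): the memory is NOT
 * zeroed. The 0xA5 fill imitates leftover heap data deterministically.
 */
unsigned char *mask_alloc(void)
{
	unsigned char *m = malloc(MASK_BYTES);

	if (m)
		memset(m, 0xA5, MASK_BYTES);
	return m;
}

/* Hypothetical stand-in for zalloc_cpumask_var(): all bits start clear. */
unsigned char *mask_zalloc(void)
{
	return calloc(1, MASK_BYTES);
}

/* Count how many CPUs the mask claims are set. */
int mask_weight(const unsigned char *m)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		if (m[cpu / 8] & (1u << (cpu % 8)))
			n++;
	return n;
}
```

In this sketch a mask from mask_zalloc() reports no CPUs, while a mask from mask_alloc() reports CPUs that were never added to it. Code that walks such a mask would act on those phantom CPUs, which matches the "any random bit set" failure described in the thread.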
Re: Full dynticks needs evtdesc set before marking cpu online.
On Mon, May 13, 2013 at 11:21:00AM +0200, Thomas Gleixner wrote:
> On Wed, 8 May 2013, Robin Holt wrote:
>
> > Thomas,
> >
> > We are seeing failures booting medium sized machines which I think is
> > a change in expectations that dyntick put on x86's start_secondary.
> >
> > During boot of cpus, we see an occasional panic in tick_do_broadcast at
>
> http://lkml.indiana.edu/hypermail/linux/kernel/1305.0/01818.html
>
> Will hit Linus tree soon.

I think this is really due to a sequence in start_secondary. The cpu
has been marked as online, but its evtdesc has not been initialized.
I sent a followup to this with a hack/patch. It was essentially:

--- linux.orig/arch/x86/kernel/smpboot.c
+++ linux/arch/x86/kernel/smpboot.c
@@ -264,6 +264,8 @@ notrace static void __cpuinit start_seco
 	 */
 	check_tsc_sync_target();
 
+	x86_cpuinit.setup_percpu_clockev();
+
 	/*
 	 * We need to hold vector_lock so there the set of online cpus
 	 * does not change while we are assigning vectors to cpus. Holding
@@ -281,8 +283,6 @@ notrace static void __cpuinit start_seco
 	/* to prevent fake stack check failure in clock setup */
 	boot_init_stack_canary();
 
-	x86_cpuinit.setup_percpu_clockev();
-
 	wmb();
 	cpu_startup_entry(CPUHP_ONLINE);
 }

Robin
[PATCH -v11 01/11] CPU hotplug: Provide a generic helper to disable/enable CPU hotplug
From: "Srivatsa S. Bhat" There are instances in the kernel where we would like to disable CPU hotplug (from sysfs) during some important operation. Today the freezer code depends on this and the code to do it was kinda tailor-made for that. Restructure the code and make it generic enough to be useful for other usecases too. Signed-off-by: Srivatsa S. Bhat Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Cc: --- include/linux/cpu.h | 4 kernel/cpu.c| 55 ++--- 2 files changed, 27 insertions(+), 32 deletions(-) diff --git a/include/linux/cpu.h b/include/linux/cpu.h index c6f6e08..9f3c7e8 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -175,6 +175,8 @@ extern struct bus_type cpu_subsys; extern void get_online_cpus(void); extern void put_online_cpus(void); +extern void cpu_hotplug_disable(void); +extern void cpu_hotplug_enable(void); #define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri) #define register_hotcpu_notifier(nb) register_cpu_notifier(nb) #define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb) @@ -198,6 +200,8 @@ static inline void cpu_hotplug_driver_unlock(void) #define get_online_cpus() do { } while (0) #define put_online_cpus() do { } while (0) +#define cpu_hotplug_disable() do { } while (0) +#define cpu_hotplug_enable() do { } while (0) #define hotcpu_notifier(fn, pri) do { (void)(fn); } while (0) /* These aren't inline functions due to a GCC bug. */ #define register_hotcpu_notifier(nb) ({ (void)(nb); 0; }) diff --git a/kernel/cpu.c b/kernel/cpu.c index b5e4ab2..198a388 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -133,6 +133,27 @@ static void cpu_hotplug_done(void) mutex_unlock(_hotplug.lock); } +/* + * Wait for currently running CPU hotplug operations to complete (if any) and + * disable future CPU hotplug (from sysfs). 
The 'cpu_add_remove_lock' protects + * the 'cpu_hotplug_disabled' flag. The same lock is also acquired by the + * hotplug path before performing hotplug operations. So acquiring that lock + * guarantees mutual exclusion from any currently running hotplug operations. + */ +void cpu_hotplug_disable(void) +{ + cpu_maps_update_begin(); + cpu_hotplug_disabled = 1; + cpu_maps_update_done(); +} + +void cpu_hotplug_enable(void) +{ + cpu_maps_update_begin(); + cpu_hotplug_disabled = 0; + cpu_maps_update_done(); +} + #else /* #if CONFIG_HOTPLUG_CPU */ static void cpu_hotplug_begin(void) {} static void cpu_hotplug_done(void) {} @@ -541,36 +562,6 @@ static int __init alloc_frozen_cpus(void) core_initcall(alloc_frozen_cpus); /* - * Prevent regular CPU hotplug from racing with the freezer, by disabling CPU - * hotplug when tasks are about to be frozen. Also, don't allow the freezer - * to continue until any currently running CPU hotplug operation gets - * completed. - * To modify the 'cpu_hotplug_disabled' flag, we need to acquire the - * 'cpu_add_remove_lock'. And this same lock is also taken by the regular - * CPU hotplug path and released only after it is complete. Thus, we - * (and hence the freezer) will block here until any currently running CPU - * hotplug operation gets completed. - */ -void cpu_hotplug_disable_before_freeze(void) -{ - cpu_maps_update_begin(); - cpu_hotplug_disabled = 1; - cpu_maps_update_done(); -} - - -/* - * When tasks have been thawed, re-enable regular CPU hotplug (which had been - * disabled while beginning to freeze tasks). 
- */ -void cpu_hotplug_enable_after_thaw(void) -{ - cpu_maps_update_begin(); - cpu_hotplug_disabled = 0; - cpu_maps_update_done(); -} - -/* * When callbacks for CPU hotplug notifications are being executed, we must * ensure that the state of the system with respect to the tasks being frozen * or not, as reported by the notification, remains unchanged *throughout the @@ -589,12 +580,12 @@ cpu_hotplug_pm_callback(struct notifier_block *nb, case PM_SUSPEND_PREPARE: case PM_HIBERNATION_PREPARE: - cpu_hotplug_disable_before_freeze(); + cpu_hotplug_disable(); break; case PM_POST_SUSPEND: case PM_POST_HIBERNATION: - cpu_hotplug_enable_after_thaw(); + cpu_hotplug_enable(); break; default: -- 1.8.2.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
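The locking argument in the patch comment (the flag is only changed under the same lock the hotplug path holds) can be modelled in userspace. The sketch below is an assumption-laden model, not kernel code: a pthread mutex stands in for cpu_add_remove_lock, and try_cpu_down() is a hypothetical caller that mimics a sysfs-triggered hotplug request checking the flag under that lock.

```c
#include <errno.h>
#include <pthread.h>

/* Userspace stand-ins: names mirror the patch, bodies are a model only. */
static pthread_mutex_t cpu_add_remove_lock = PTHREAD_MUTEX_INITIALIZER;
static int cpu_hotplug_disabled;

void cpu_maps_update_begin(void)
{
	pthread_mutex_lock(&cpu_add_remove_lock);
}

void cpu_maps_update_done(void)
{
	pthread_mutex_unlock(&cpu_add_remove_lock);
}

/* Blocks until any in-flight hotplug operation drops the lock. */
void cpu_hotplug_disable(void)
{
	cpu_maps_update_begin();
	cpu_hotplug_disabled = 1;
	cpu_maps_update_done();
}

void cpu_hotplug_enable(void)
{
	cpu_maps_update_begin();
	cpu_hotplug_disabled = 0;
	cpu_maps_update_done();
}

/*
 * Hypothetical hotplug request: it holds the lock for its whole
 * duration and refuses to run while hotplug is disabled.
 */
int try_cpu_down(unsigned int cpu)
{
	int err = 0;

	(void)cpu;                      /* the model has no real CPUs */
	cpu_maps_update_begin();
	if (cpu_hotplug_disabled)
		err = -EBUSY;
	/* else: a real implementation would take the CPU down here */
	cpu_maps_update_done();
	return err;
}
```

Because the request holds the lock from start to finish, cpu_hotplug_disable() cannot return while a request is in progress, and any request that starts afterwards sees the flag set and backs off.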
[PATCH -v11 03/11] Remove -stable friendly PF_THREAD_BOUND define
Remove the #define that the prior patch added to ease backporting to the stable releases.

Signed-off-by: Robin Holt
To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
---
 kernel/sys.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index 2bbd9a7..17bb8d3 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -362,11 +362,6 @@ int unregister_reboot_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_reboot_notifier);
 
-/* Add backwards compatibility for stable trees. */
-#ifndef PF_NO_SETAFFINITY
-#define PF_NO_SETAFFINITY PF_THREAD_BOUND
-#endif
-
 static void migrate_to_reboot_cpu(void)
 {
 	/* The boot cpu is always logical cpu 0 */
--
1.8.2.1
[PATCH -v11 05/11] checkpatch.pl the new kernel/reboot.c file.
Get the new file to pass scripts/checkpatch.pl Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List --- Changes since v6: - Removed last remaining line length warning. --- include/linux/reboot.h | 2 +- kernel/reboot.c| 28 +--- 2 files changed, 14 insertions(+), 16 deletions(-) diff --git a/include/linux/reboot.h b/include/linux/reboot.h index 23b3630..c6eba21 100644 --- a/include/linux/reboot.h +++ b/include/linux/reboot.h @@ -26,7 +26,7 @@ extern void machine_shutdown(void); struct pt_regs; extern void machine_crash_shutdown(struct pt_regs *); -/* +/* * Architecture independent implemenations of sys_reboot commands. */ diff --git a/kernel/reboot.c b/kernel/reboot.c index 0616483..abb6a04 100644 --- a/kernel/reboot.c +++ b/kernel/reboot.c @@ -4,6 +4,8 @@ * Copyright (C) 2013 Linus Torvalds */ +#define pr_fmt(fmt)"reboot: " fmt + #include #include #include @@ -114,9 +116,9 @@ void kernel_restart(char *cmd) migrate_to_reboot_cpu(); syscore_shutdown(); if (!cmd) - printk(KERN_EMERG "Restarting system.\n"); + pr_emerg("Restarting system\n"); else - printk(KERN_EMERG "Restarting system with command '%s'.\n", cmd); + pr_emerg("Restarting system with command '%s'\n", cmd); kmsg_dump(KMSG_DUMP_RESTART); machine_restart(cmd); } @@ -125,7 +127,7 @@ EXPORT_SYMBOL_GPL(kernel_restart); static void kernel_shutdown_prepare(enum system_states state) { blocking_notifier_call_chain(_notifier_list, - (state == SYSTEM_HALT)?SYS_HALT:SYS_POWER_OFF, NULL); + (state == SYSTEM_HALT) ? 
SYS_HALT : SYS_POWER_OFF, NULL); system_state = state; usermodehelper_disable(); device_shutdown(); @@ -140,11 +142,10 @@ void kernel_halt(void) kernel_shutdown_prepare(SYSTEM_HALT); migrate_to_reboot_cpu(); syscore_shutdown(); - printk(KERN_EMERG "System halted.\n"); + pr_emerg("System halted\n"); kmsg_dump(KMSG_DUMP_HALT); machine_halt(); } - EXPORT_SYMBOL_GPL(kernel_halt); /** @@ -159,7 +160,7 @@ void kernel_power_off(void) pm_power_off_prepare(); migrate_to_reboot_cpu(); syscore_shutdown(); - printk(KERN_EMERG "Power down.\n"); + pr_emerg("Power down\n"); kmsg_dump(KMSG_DUMP_POWEROFF); machine_power_off(); } @@ -188,10 +189,10 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, /* For safety, we require "magic" arguments. */ if (magic1 != LINUX_REBOOT_MAGIC1 || - (magic2 != LINUX_REBOOT_MAGIC2 && - magic2 != LINUX_REBOOT_MAGIC2A && + (magic2 != LINUX_REBOOT_MAGIC2 && + magic2 != LINUX_REBOOT_MAGIC2A && magic2 != LINUX_REBOOT_MAGIC2B && - magic2 != LINUX_REBOOT_MAGIC2C)) + magic2 != LINUX_REBOOT_MAGIC2C)) return -EINVAL; /* @@ -234,7 +235,8 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, break; case LINUX_REBOOT_CMD_RESTART2: - if (strncpy_from_user([0], arg, sizeof(buffer) - 1) < 0) { + ret = strncpy_from_user([0], arg, sizeof(buffer) - 1); + if (ret < 0) { ret = -EFAULT; break; } @@ -282,7 +284,6 @@ void ctrl_alt_del(void) else kill_cad_pid(SIGINT, 1); } - char poweroff_cmd[POWEROFF_CMD_PATH_LEN] = "/sbin/poweroff"; @@ -301,14 +302,11 @@ static int __orderly_poweroff(bool force) ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC); argv_free(argv); } else { - printk(KERN_WARNING "%s failed to allocate memory for \"%s\"\n", -__func__, poweroff_cmd); ret = -ENOMEM; } if (ret && force) { - printk(KERN_WARNING "Failed to start orderly shutdown: " - "forcing the issue\n"); + pr_warn("Failed to start orderly shutdown: forcing the issue\n"); /* * I guess this should try to kick off some daemon to sync and * 
poweroff asap. Or not even bother syncing if we're doing an -- 1.8.2.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
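The pr_fmt() definition added by this patch works through string-literal concatenation: the prefix is glued onto every format string at compile time. A minimal userspace sketch of that mechanism follows. pr_emerg_to() and format_restart() are hypothetical helpers that write into a buffer instead of the kernel log, and the ##__VA_ARGS__ form is a GNU extension (accepted by GCC and Clang), so treat this as an illustration rather than the kernel's implementation.

```c
#include <stdio.h>
#include <string.h>

/* Defined before the logging macro is used, as in the patch. */
#define pr_fmt(fmt) "reboot: " fmt

/*
 * Hypothetical stand-in for pr_emerg(): formats into a caller-supplied
 * buffer. Adjacent string literals are merged by the compiler, so
 * pr_fmt("x\n") becomes "reboot: x\n".
 */
#define pr_emerg_to(buf, len, fmt, ...) \
	snprintf((buf), (len), pr_fmt(fmt), ##__VA_ARGS__)

/* Mirrors the two messages kernel_restart() prints after the patch. */
int format_restart(char *buf, size_t len, const char *cmd)
{
	if (!cmd)
		return pr_emerg_to(buf, len, "Restarting system\n");
	return pr_emerg_to(buf, len,
			   "Restarting system with command '%s'\n", cmd);
}
```

With this arrangement each call site only spells out the message itself, and changing the prefix for the whole file means editing a single #define.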
[PATCH -v11 00/11] Shutdown from reboot_cpuid without stopping other cpus.
We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus. The slowdown was tracked to commit
f96972f.

The current implementation does all the work of hot removing the cpus
before halting the system. We are switching to just migrating to the
reboot_cpu and then continuing with shutdown/reboot.

The patch set is broken into eleven parts. The first two are planned
for the stable release. The others move the halt/shutdown/reboot
related functions to their own kernel/reboot.c file and then move the
handling of the reboot= kernel parameter to generic kernel code.

Changes since -v10
- Added Russell's Acked-by for arm.
- Fixed an accidentally commented out line in an arm header file.

Changes since -v9
- Added Ingo's Acked-by for x86.
- Added Guan's Acked-by for unicore32.
- Replaced first patch with updated patch from Srivatsa S. Bhat.
  This compiles for alpha allmodconfig, all arm defconfigs, and a few
  test x86_64 defconfigs. I have not tried more.

Changes since -v8
- Changes reboot_cpu on stack to cpu to fix bug noticed by Russell King.
- Switched unicore32 and arm from using REBOOT_WARM/COLD to HARD/SOFT.
- Fixed case statement bug.
- Went to using simple_strtoul for parsing reboot_cpu=smp###.
- Made parsing of reboot= not use any #ifdef'd code.

Changes since -v7.
- Fixed authorship for first patch.
- Rebased to Linus' current tree (51a26ae7a).

Changes since -v6.
- Cross compiled all arm architectures (using v3.9 kernel. Fails with
  current).
- Added a #define for non-hotplug case.
- Add #define for PF_THREAD_BOUND as compatibility to make stable easier.
- Fixup s/reboot_cpu_id/reboot_cpu/
- Add include of linux/uaccess.h to allow building on arm.
- Removed last remaining checkpatch.pl line length warning on
  kernel/reboot.c.
- Fixed the duplicate handling of the reboot= kernel parameter.

Changes since -v5.
- Moved the arch/x86 reboot= up to the generic kernel code.

Changes since -v4.
- Integrated Srivatsa S. Bhat's patch creating the cpu_hotplug_disable()
  function.
- Integrated comments by Srivatsa S. Bhat.
- Made one more comment consistent with others in function.

Changes since -v3.
- Added a tested-by for the original reporter.
- Fix compile failure found by Joe Perches.
- Integrated comments by Joe Perches.

To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
[PATCH -v11 07/11] unicore32, prepare reboot_mode for moving to generic kernel code.
This patch prepares for the moving the parsing of reboot= to the generic kernel code by making reboot_mode into a more generic form. Signed-off-by: Robin Holt To: Andrew Morton Cc: Guan Xuetao Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: H. Peter Anvin Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Acked-by: Guan Xuetao --- Changes since -v8 - Switched from using REBOOT_WARM/COLD to HARD/SOFT. --- arch/unicore32/kernel/process.c | 10 +- arch/unicore32/kernel/setup.h | 2 +- arch/unicore32/mm/mmu.c | 2 +- include/linux/reboot.h | 2 ++ 4 files changed, 9 insertions(+), 7 deletions(-) diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c index c944769..93dd035 100644 --- a/arch/unicore32/kernel/process.c +++ b/arch/unicore32/kernel/process.c @@ -51,14 +51,14 @@ void arch_cpu_idle(void) local_irq_enable(); } -static char reboot_mode = 'h'; +static enum reboot_mode reboot_mode = REBOOT_HARD; int __init reboot_setup(char *str) { - reboot_mode = str[0]; + if ('s' == str[0]) + reboot_mode = REBOOT_SOFT; return 1; } - __setup("reboot=", reboot_setup); void machine_halt(void) @@ -88,7 +88,7 @@ void machine_restart(char *cmd) * we may need it to insert some 1:1 mappings so that * soft boot works. */ - setup_mm_for_reboot(reboot_mode); + setup_mm_for_reboot(); /* Clean and invalidate caches */ flush_cache_all(); @@ -102,7 +102,7 @@ void machine_restart(char *cmd) /* * Now handle reboot code. 
*/ - if (reboot_mode == 's') { + if (reboot_mode == REBOOT_SOFT) { /* Jump into ROM at address 0x */ cpu_reset(VECTORS_BASE); } else { diff --git a/arch/unicore32/kernel/setup.h b/arch/unicore32/kernel/setup.h index 30f749d..f5c51b8 100644 --- a/arch/unicore32/kernel/setup.h +++ b/arch/unicore32/kernel/setup.h @@ -22,7 +22,7 @@ extern void puv3_ps2_init(void); extern void pci_puv3_preinit(void); extern void __init puv3_init_gpio(void); -extern void setup_mm_for_reboot(char mode); +extern void setup_mm_for_reboot(void); extern char __stubs_start[], __stubs_end[]; extern char __vectors_start[], __vectors_end[]; diff --git a/arch/unicore32/mm/mmu.c b/arch/unicore32/mm/mmu.c index 43c20b4..4f5a532 100644 --- a/arch/unicore32/mm/mmu.c +++ b/arch/unicore32/mm/mmu.c @@ -445,7 +445,7 @@ void __init paging_init(void) * the user-mode pages. This will then ensure that we have predictable * results when turning the mmu off */ -void setup_mm_for_reboot(char mode) +void setup_mm_for_reboot(void) { unsigned long base_pmdval; pgd_t *pgd; diff --git a/include/linux/reboot.h b/include/linux/reboot.h index 37d56c3..ca29a6f 100644 --- a/include/linux/reboot.h +++ b/include/linux/reboot.h @@ -13,6 +13,8 @@ enum reboot_mode { REBOOT_COLD = 0, REBOOT_WARM, + REBOOT_HARD, + REBOOT_SOFT, }; extern int register_reboot_notifier(struct notifier_block *); -- 1.8.2.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
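The switch from a raw character to an enum in this patch can be tried outside the kernel. The sketch below copies the enum values shown in the include/linux/reboot.h hunk and models the patched reboot_setup(). It drops the __init and __setup() plumbing, takes a const pointer for convenience, and adds current_reboot_mode() as a hypothetical accessor for demonstration, so it is a model of the parsing logic only.

```c
/* Values as listed in the include/linux/reboot.h hunk of this patch. */
enum reboot_mode {
	REBOOT_COLD = 0,
	REBOOT_WARM,
	REBOOT_HARD,
	REBOOT_SOFT,
};

/* Default matches the patch: hard reboot unless told otherwise. */
static enum reboot_mode reboot_mode = REBOOT_HARD;

/*
 * Modelled on the patched unicore32 reboot_setup(): only a leading 's'
 * changes the mode, any other value leaves it untouched.
 */
int reboot_setup(const char *str)
{
	if ('s' == str[0])
		reboot_mode = REBOOT_SOFT;
	return 1;
}

/* Hypothetical accessor, present only so the sketch can be exercised. */
enum reboot_mode current_reboot_mode(void)
{
	return reboot_mode;
}
```

One property visible in this model: because only 's' is acted on, a later call with a different value does not switch the mode back to REBOOT_HARD. Whether that matters in practice depends on how often the parameter can be supplied, which the patch does not discuss.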
[PATCH -v11 08/11] arm, Remove unused restart_mode fields from some arm subarchs
These restart_mode fields are not used at all. Remove them to make moving the reboot= cmdline options to the general kernel easier. Signed-off-by: Robin Holt To: Andrew Morton Cc: Russell King Cc: Russ Anderson Cc: Robin Holt Cc: H. Peter Anvin Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Acked-by: Russell King --- arch/arm/mach-ebsa110/core.c | 1 - arch/arm/mach-pxa/mioa701.c | 1 - arch/arm/mach-pxa/spitz.c| 3 --- arch/arm/mach-pxa/tosa.c | 1 - 4 files changed, 6 deletions(-) diff --git a/arch/arm/mach-ebsa110/core.c b/arch/arm/mach-ebsa110/core.c index b13cc74..69a9d5d 100644 --- a/arch/arm/mach-ebsa110/core.c +++ b/arch/arm/mach-ebsa110/core.c @@ -321,7 +321,6 @@ MACHINE_START(EBSA110, "EBSA110") .atag_offset= 0x400, .reserve_lp0= 1, .reserve_lp2= 1, - .restart_mode = 's', .map_io = ebsa110_map_io, .init_early = ebsa110_init_early, .init_irq = ebsa110_init_irq, diff --git a/arch/arm/mach-pxa/mioa701.c b/arch/arm/mach-pxa/mioa701.c index f8979b9..dbea67a 100644 --- a/arch/arm/mach-pxa/mioa701.c +++ b/arch/arm/mach-pxa/mioa701.c @@ -756,7 +756,6 @@ static void mioa701_machine_exit(void) MACHINE_START(MIOA701, "MIO A701") .atag_offset= 0x100, - .restart_mode = 's', .map_io = _map_io, .nr_irqs= PXA_NR_IRQS, .init_irq = _init_irq, diff --git a/arch/arm/mach-pxa/spitz.c b/arch/arm/mach-pxa/spitz.c index 362726c..c3c0042 100644 --- a/arch/arm/mach-pxa/spitz.c +++ b/arch/arm/mach-pxa/spitz.c @@ -979,7 +979,6 @@ static void __init spitz_fixup(struct tag *tags, char **cmdline, #ifdef CONFIG_MACH_SPITZ MACHINE_START(SPITZ, "SHARP Spitz") - .restart_mode = 'g', .fixup = spitz_fixup, .map_io = pxa27x_map_io, .nr_irqs= PXA_NR_IRQS, @@ -993,7 +992,6 @@ MACHINE_END #ifdef CONFIG_MACH_BORZOI MACHINE_START(BORZOI, "SHARP Borzoi") - .restart_mode = 'g', .fixup = spitz_fixup, .map_io = pxa27x_map_io, .nr_irqs= PXA_NR_IRQS, @@ -1007,7 +1005,6 @@ MACHINE_END #ifdef CONFIG_MACH_AKITA MACHINE_START(AKITA, "SHARP Akita") - .restart_mode 
= 'g', .fixup = spitz_fixup, .map_io = pxa27x_map_io, .nr_irqs= PXA_NR_IRQS, diff --git a/arch/arm/mach-pxa/tosa.c b/arch/arm/mach-pxa/tosa.c index 3d91d2e..a41992f 100644 --- a/arch/arm/mach-pxa/tosa.c +++ b/arch/arm/mach-pxa/tosa.c @@ -969,7 +969,6 @@ static void __init fixup_tosa(struct tag *tags, char **cmdline, } MACHINE_START(TOSA, "SHARP Tosa") - .restart_mode = 'g', .fixup = fixup_tosa, .map_io = pxa25x_map_io, .nr_irqs= TOSA_NR_IRQS, -- 1.8.2.1
[PATCH -v11 04/11] Move shutdown/reboot related functions to kernel/reboot.c
This patch is preparatory. It moves reboot related syscall, etc functions from kernel/sys.c to kernel/reboot.c. Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List --- Changes since -v6: - Add include of linux/uaccess.h to allow building on arm. --- kernel/Makefile | 2 +- kernel/reboot.c | 347 kernel/sys.c| 331 - 3 files changed, 348 insertions(+), 332 deletions(-) create mode 100644 kernel/reboot.c diff --git a/kernel/Makefile b/kernel/Makefile index 271fd31..470839d 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -9,7 +9,7 @@ obj-y = fork.o exec_domain.o panic.o printk.o \ rcupdate.o extable.o params.o posix-timers.o \ kthread.o wait.o sys_ni.o posix-cpu-timers.o mutex.o \ hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \ - notifier.o ksysfs.o cred.o \ + notifier.o ksysfs.o cred.o reboot.o \ async.o range.o groups.o lglock.o smpboot.o ifdef CONFIG_FUNCTION_TRACER diff --git a/kernel/reboot.c b/kernel/reboot.c new file mode 100644 index 000..0616483 --- /dev/null +++ b/kernel/reboot.c @@ -0,0 +1,347 @@ +/* + * linux/kernel/reboot.c + * + * Copyright (C) 2013 Linus Torvalds + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * this indicates whether you can reboot with ctrl-alt-del: the default is yes + */ + +int C_A_D = 1; +struct pid *cad_pid; +EXPORT_SYMBOL(cad_pid); + +/* + * If set, this is used for preparing the system to power off. + */ + +void (*pm_power_off_prepare)(void); + +/** + * emergency_restart - reboot the system + * + * Without shutting down any hardware or taking any locks + * reboot the system. This is called when we know we are in + * trouble so this is our best effort to reboot. This is + * safe to call in interrupt context. 
+ */ +void emergency_restart(void) +{ + kmsg_dump(KMSG_DUMP_EMERG); + machine_emergency_restart(); +} +EXPORT_SYMBOL_GPL(emergency_restart); + +void kernel_restart_prepare(char *cmd) +{ + blocking_notifier_call_chain(&reboot_notifier_list, SYS_RESTART, cmd); + system_state = SYSTEM_RESTART; + usermodehelper_disable(); + device_shutdown(); +} + +/** + * register_reboot_notifier - Register function to be called at reboot time + * @nb: Info about notifier function to be called + * + * Registers a function with the list of functions + * to be called at reboot time. + * + * Currently always returns zero, as blocking_notifier_chain_register() + * always returns zero. + */ +int register_reboot_notifier(struct notifier_block *nb) +{ + return blocking_notifier_chain_register(&reboot_notifier_list, nb); +} +EXPORT_SYMBOL(register_reboot_notifier); + +/** + * unregister_reboot_notifier - Unregister previously registered reboot notifier + * @nb: Hook to be unregistered + * + * Unregisters a previously registered reboot + * notifier function. + * + * Returns zero on success, or %-ENOENT on failure. + */ +int unregister_reboot_notifier(struct notifier_block *nb) +{ + return blocking_notifier_chain_unregister(&reboot_notifier_list, nb); +} +EXPORT_SYMBOL(unregister_reboot_notifier); + +static void migrate_to_reboot_cpu(void) +{ + /* The boot cpu is always logical cpu 0 */ + int cpu = 0; + + cpu_hotplug_disable(); + + /* Make certain the cpu I'm about to reboot on is online */ + if (!cpu_online(cpu)) + cpu = cpumask_first(cpu_online_mask); + + /* Prevent races with other tasks migrating this task */ + current->flags |= PF_NO_SETAFFINITY; + + /* Make certain I only run on the appropriate processor */ + set_cpus_allowed_ptr(current, cpumask_of(cpu)); +} + +/** + * kernel_restart - reboot the system + * @cmd: pointer to buffer containing command to execute for restart + * or %NULL + * + * Shutdown everything and perform a clean reboot. + * This is not safe to call in interrupt context.
+ */ +void kernel_restart(char *cmd) +{ + kernel_restart_prepare(cmd); + migrate_to_reboot_cpu(); + syscore_shutdown(); + if (!cmd) + printk(KERN_EMERG "Restarting system.\n"); + else + printk(KERN_EMERG "Restarting system with command '%s'.\n", cmd); + kmsg_dump(KMSG_DUMP_RESTART); + machine_restart(cmd); +} +EXPORT_SYMBOL_GPL(kernel_restart); + +static void kernel_shutdown_prepare(enum system_states state) +{ + blocking_notifier_call_chain(&reboot_notifier_list, + (state == SYSTEM_HALT)?SYS_HALT:SYS_POWER_OFF, NULL); + system_state = state; +
[PATCH -v11 09/11] arm, prepare reboot_mode for moving to generic kernel code.
This patch prepares for the moving the parsing of reboot= to the generic kernel code by making reboot_mode into a more generic form. Signed-off-by: Robin Holt To: Andrew Morton Cc: Russell King Cc: Russ Anderson Cc: Robin Holt Cc: H. Peter Anvin Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Acked-by: Russell King --- Changes since -v10 - Uncommented an accidentally commented out line. Changes since -v8 - Switched from using REBOOT_WARM/COLD to HARD/SOFT. --- arch/arm/include/asm/mach/arch.h | 3 ++- arch/arm/kernel/process.c | 8 arch/arm/kernel/setup.c| 6 +++--- arch/arm/mach-footbridge/cats-hw.c | 2 +- 4 files changed, 10 insertions(+), 9 deletions(-) diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h index 308ad7d..e2b551e 100644 --- a/arch/arm/include/asm/mach/arch.h +++ b/arch/arm/include/asm/mach/arch.h @@ -9,6 +9,7 @@ */ #ifndef __ASSEMBLY__ +#include struct tag; struct meminfo; @@ -39,7 +40,7 @@ struct machine_desc { unsigned char reserve_lp0 :1; /* never has lp0*/ unsigned char reserve_lp1 :1; /* never has lp1*/ unsigned char reserve_lp2 :1; /* never has lp2*/ - charrestart_mode; /* default restart mode */ + enum reboot_modereboot_mode;/* default restart mode */ struct smp_operations *smp; /* SMP operations */ void(*fixup)(struct tag *, char **, struct meminfo *); diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c index f219703..92b47df 100644 --- a/arch/arm/kernel/process.c +++ b/arch/arm/kernel/process.c @@ -174,14 +174,14 @@ void arch_cpu_idle(void) default_idle(); } -static char reboot_mode = 'h'; +enum reboot_mode reboot_mode = REBOOT_HARD; -int __init reboot_setup(char *str) +static int __init reboot_setup(char *str) { - reboot_mode = str[0]; + if ('s' == str[0]) + reboot_mode = REBOOT_SOFT; return 1; } - __setup("reboot=", reboot_setup); void machine_shutdown(void) diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c index 1522c7a..e05df42 100644 
--- a/arch/arm/kernel/setup.c +++ b/arch/arm/kernel/setup.c @@ -73,7 +73,7 @@ __setup("fpe=", fpe_setup); extern void paging_init(struct machine_desc *desc); extern void sanity_check_meminfo(void); -extern void reboot_setup(char *str); +extern enum reboot_mode reboot_mode; extern void setup_dma_zone(struct machine_desc *desc); unsigned int processor_id; @@ -769,8 +769,8 @@ void __init setup_arch(char **cmdline_p) setup_dma_zone(mdesc); - if (mdesc->restart_mode) - reboot_setup(&mdesc->restart_mode); + if (mdesc->reboot_mode != REBOOT_HARD) + reboot_mode = mdesc->reboot_mode; init_mm.start_code = (unsigned long) _text; init_mm.end_code = (unsigned long) _etext; diff --git a/arch/arm/mach-footbridge/cats-hw.c b/arch/arm/mach-footbridge/cats-hw.c index 6987a09..9669cc0 100644 --- a/arch/arm/mach-footbridge/cats-hw.c +++ b/arch/arm/mach-footbridge/cats-hw.c @@ -86,7 +86,7 @@ fixup_cats(struct tag *tags, char **cmdline, struct meminfo *mi) MACHINE_START(CATS, "Chalice-CATS") /* Maintainer: Philip Blundell */ .atag_offset= 0x100, - .restart_mode = 's', + .reboot_mode= REBOOT_SOFT, .fixup = fixup_cats, .map_io = footbridge_map_io, .init_irq = footbridge_init_irq, -- 1.8.2.1
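The setup_arch() hunk in the patch above lets a machine descriptor override the global default only when its field differs from REBOOT_HARD. A minimal userspace sketch of that rule follows; the function name resolve_machine_default() is hypothetical and not part of the patch, and the enum is copied from the state of include/linux/reboot.h at this point in the series.

```c
/* Copy of the enum as it stands after the unicore32 patch in this series. */
enum reboot_mode {
	REBOOT_COLD = 0,
	REBOOT_WARM,
	REBOOT_HARD,
	REBOOT_SOFT,
};

/*
 * Hypothetical helper modelling the setup_arch() hunk:
 *
 *	if (mdesc->reboot_mode != REBOOT_HARD)
 *		reboot_mode = mdesc->reboot_mode;
 *
 * 'current_mode' stands in for the global reboot_mode and
 * 'machine_default' for mdesc->reboot_mode.
 */
enum reboot_mode resolve_machine_default(enum reboot_mode current_mode,
					 enum reboot_mode machine_default)
{
	if (machine_default != REBOOT_HARD)
		return machine_default;
	return current_mode;
}
```

With this model, a descriptor that sets REBOOT_SOFT (as the CATS entry does) yields REBOOT_SOFT. One consequence worth noting: a descriptor that leaves the field zero-initialized carries REBOOT_COLD, which also differs from REBOOT_HARD, so the sketch returns REBOOT_COLD in that case.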
[PATCH -v11 11/11] Move arch/x86 reboot= handling to generic kernel.
Merge together the unicore32, arm, and x86 reboot= command line parameter handling. Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russell King Cc: Guan Xuetao Cc: Russ Anderson Cc: Robin Holt Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Acked-by: Ingo Molnar Acked-by: Guan Xuetao Acked-by: Russell King --- Changes since -v8 - Add missing break statements. - Change parsing so #ifdef's are no longer needed. - Switch to using simple_strtoul to make parsing cleaner. - Add handling of REBOOT_HARD/SOFT --- Documentation/kernel-parameters.txt | 14 +++- arch/arm/kernel/process.c| 10 --- arch/unicore32/kernel/process.c | 10 --- arch/x86/include/asm/emergency-restart.h | 12 arch/x86/kernel/apic/x2apic_uv_x.c | 2 +- arch/x86/kernel/reboot.c | 111 +-- include/linux/reboot.h | 17 + kernel/reboot.c | 76 - 8 files changed, 107 insertions(+), 145 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index c3bfacb..b2945ce 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2677,9 +2677,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted. Run specified binary instead of /init from the ramdisk, used for early userspace startup. See initrd. - reboot= [BUGS=X86-32,BUGS=ARM,BUGS=IA-64] Rebooting mode - Format: [,[,...]] - See arch/*/kernel/reboot.c or arch/*/kernel/process.c + reboot= [KNL] + Format (x86 or x86_64): + [w[arm] | c[old] | h[ard] | s[oft] | g[pio]] \ + [[,]s[mp] \ + [[,]b[ios] | a[cpi] | k[bd] | t[riple] | e[fi] | p[ci]] \ + [[,]f[orce] + Where reboot_mode is one of warm (soft) or cold (hard) or gpio, + reboot_type is one of bios, acpi, kbd, triple, efi, or pci, + reboot_force is either force or not specified, + reboot_cpu is s[mp] with being the processor + to be used for rebooting. relax_domain_level= [KNL, SMP] Set scheduler's default relax_domain_level. 
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c index 42856fc..304b102 100644 --- a/arch/arm/kernel/process.c +++ b/arch/arm/kernel/process.c @@ -175,16 +175,6 @@ void arch_cpu_idle(void) default_idle(); } -enum reboot_mode reboot_mode = REBOOT_HARD; - -static int __init reboot_setup(char *str) -{ - if ('s' == str[0]) - reboot_mode = REBOOT_SOFT; - return 1; -} -__setup("reboot=", reboot_setup); - void machine_shutdown(void) { #ifdef CONFIG_SMP diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c index 93dd035..778ebba 100644 --- a/arch/unicore32/kernel/process.c +++ b/arch/unicore32/kernel/process.c @@ -51,16 +51,6 @@ void arch_cpu_idle(void) local_irq_enable(); } -static enum reboot_mode reboot_mode = REBOOT_HARD; - -int __init reboot_setup(char *str) -{ - if ('s' == str[0]) - reboot_mode = REBOOT_SOFT; - return 1; -} -__setup("reboot=", reboot_setup); - void machine_halt(void) { gpio_set_value(GPO_SOFT_OFF, 0); diff --git a/arch/x86/include/asm/emergency-restart.h b/arch/x86/include/asm/emergency-restart.h index 75ce3f4..77a99ac 100644 --- a/arch/x86/include/asm/emergency-restart.h +++ b/arch/x86/include/asm/emergency-restart.h @@ -1,18 +1,6 @@ #ifndef _ASM_X86_EMERGENCY_RESTART_H #define _ASM_X86_EMERGENCY_RESTART_H -enum reboot_type { - BOOT_TRIPLE = 't', - BOOT_KBD = 'k', - BOOT_BIOS = 'b', - BOOT_ACPI = 'a', - BOOT_EFI = 'e', - BOOT_CF9 = 'p', - BOOT_CF9_COND = 'q', -}; - -extern enum reboot_type reboot_type; - extern void machine_emergency_restart(void); #endif /* _ASM_X86_EMERGENCY_RESTART_H */ diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c index 794f6eb..958e3e4 100644 --- a/arch/x86/kernel/apic/x2apic_uv_x.c +++ b/arch/x86/kernel/apic/x2apic_uv_x.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include @@ -36,7 +37,6 @@ #include #include #include -#include #include /* BMC sets a bit this MMR non-zero before sending an NMI */ diff --git 
a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c index f770340..563ed91 100644 --- a/arch/x86/kernel/reboot.c +++ b/arch/x86/kernel/reboot.c @@ -36,22 +36,6 @@ void (*pm_power_off)(void); EXPORT_SYMBOL(pm_power_o
[PATCH -v11 06/11] x86, prepare reboot_mode for moving to generic kernel code.
This patch prepares for the moving the parsing of reboot= to the generic kernel code by making reboot_mode into a more generic form. Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Miguel Boton Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Acked-by: Ingo Molnar --- arch/x86/kernel/reboot.c | 12 +++- include/linux/reboot.h | 5 + 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c index 76fa1e9..f770340 100644 --- a/arch/x86/kernel/reboot.c +++ b/arch/x86/kernel/reboot.c @@ -36,7 +36,7 @@ void (*pm_power_off)(void); EXPORT_SYMBOL(pm_power_off); static const struct desc_ptr no_idt = {}; -static int reboot_mode; +static enum reboot_mode reboot_mode; enum reboot_type reboot_type = BOOT_ACPI; int reboot_force; @@ -88,11 +88,11 @@ static int __init reboot_setup(char *str) switch (*str) { case 'w': - reboot_mode = 0x1234; + reboot_mode = REBOOT_WARM; break; case 'c': - reboot_mode = 0; + reboot_mode = REBOOT_COLD; break; #ifdef CONFIG_SMP @@ -536,6 +536,7 @@ static void native_machine_emergency_restart(void) int i; int attempt = 0; int orig_reboot_type = reboot_type; + unsigned short mode; if (reboot_emergency) emergency_vmx_disable_all(); @@ -543,7 +544,8 @@ static void native_machine_emergency_restart(void) tboot_shutdown(TB_SHUTDOWN_REBOOT); /* Tell the BIOS if we want cold or warm reboot */ - *((unsigned short *)__va(0x472)) = reboot_mode; + mode = reboot_mode == REBOOT_WARM ? 0x1234 : 0; + *((unsigned short *)__va(0x472)) = mode; for (;;) { /* Could also try the reset bit in the Hammer NB */ @@ -585,7 +587,7 @@ static void native_machine_emergency_restart(void) case BOOT_EFI: if (efi_enabled(EFI_RUNTIME_SERVICES)) - efi.reset_system(reboot_mode ? + efi.reset_system(reboot_mode == REBOOT_WARM ? 
EFI_RESET_WARM : EFI_RESET_COLD, EFI_SUCCESS, 0, NULL); diff --git a/include/linux/reboot.h b/include/linux/reboot.h index c6eba21..37d56c3 100644 --- a/include/linux/reboot.h +++ b/include/linux/reboot.h @@ -10,6 +10,11 @@ #define SYS_HALT 0x0002 /* Notify of system halt */ #define SYS_POWER_OFF 0x0003 /* Notify of system power off */ +enum reboot_mode { + REBOOT_COLD = 0, + REBOOT_WARM, +}; + extern int register_reboot_notifier(struct notifier_block *); extern int unregister_reboot_notifier(struct notifier_block *); -- 1.8.2.1
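Before this patch, x86 stored the raw BIOS flag value (0x1234 or 0) in reboot_mode; afterwards the variable holds an enum and the flag is derived at the point of use. The sketch below isolates that translation in userspace C. The helper name bios_reset_flag() is an assumption made for illustration; the patch itself computes the value inline in native_machine_emergency_restart().

```c
#include <stdint.h>

/* The enum as introduced by this patch in include/linux/reboot.h. */
enum reboot_mode {
	REBOOT_COLD = 0,
	REBOOT_WARM,
};

/*
 * Hypothetical helper: the 16-bit value the patch writes to the BIOS
 * data area word at 0x472. 0x1234 requests a warm reboot, 0 a cold one.
 */
uint16_t bios_reset_flag(enum reboot_mode mode)
{
	return mode == REBOOT_WARM ? 0x1234 : 0;
}
```

Keeping the magic number out of the variable is what allows the same enum to be shared with other architectures later in the series.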
[PATCH -v11 02/11] Migrate shutdown/reboot to boot cpu.
We recently noticed that reboot of a 1024 cpu machine takes approx 16 minutes of just stopping the cpus. The slowdown was tracked to commit f96972f. The current implementation does all the work of hot removing the cpus before halting the system. We are switching to just migrating to the boot cpu and then continuing with shutdown/reboot. This also has the effect of not breaking x86's command line parameter for specifying the reboot cpu. Note, this code was shamelessly copied from arch/x86/kernel/reboot.c with bits removed pertaining to the reboot_cpu command line parameter. Signed-off-by: Robin Holt Tested-by: Shawn Guo To: Andrew Morton Cc: H. Peter Anvin Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Cc: --- Changes since -v8 - Change stack parameter to make future patches cleaner. Changes since -v6: - Add #define for PF_THREAD_BOUND as compatibility to make stable easier. - Fixup s/reboot_cpu_id/reboot_cpu/ --- kernel/sys.c | 29 ++--- 1 file changed, 26 insertions(+), 3 deletions(-) diff --git a/kernel/sys.c b/kernel/sys.c index b95d3c7..2bbd9a7 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -362,6 +362,29 @@ int unregister_reboot_notifier(struct notifier_block *nb) } EXPORT_SYMBOL(unregister_reboot_notifier); +/* Add backwards compatibility for stable trees. 
*/ +#ifndef PF_NO_SETAFFINITY +#define PF_NO_SETAFFINITY PF_THREAD_BOUND +#endif + +static void migrate_to_reboot_cpu(void) +{ + /* The boot cpu is always logical cpu 0 */ + int cpu = 0; + + cpu_hotplug_disable(); + + /* Make certain the cpu I'm about to reboot on is online */ + if (!cpu_online(cpu)) + cpu = cpumask_first(cpu_online_mask); + + /* Prevent races with other tasks migrating this task */ + current->flags |= PF_NO_SETAFFINITY; + + /* Make certain I only run on the appropriate processor */ + set_cpus_allowed_ptr(current, cpumask_of(cpu)); +} + /** * kernel_restart - reboot the system * @cmd: pointer to buffer containing command to execute for restart @@ -373,7 +396,7 @@ EXPORT_SYMBOL(unregister_reboot_notifier); void kernel_restart(char *cmd) { kernel_restart_prepare(cmd); - disable_nonboot_cpus(); + migrate_to_reboot_cpu(); syscore_shutdown(); if (!cmd) printk(KERN_EMERG "Restarting system.\n"); @@ -400,7 +423,7 @@ static void kernel_shutdown_prepare(enum system_states state) void kernel_halt(void) { kernel_shutdown_prepare(SYSTEM_HALT); - disable_nonboot_cpus(); + migrate_to_reboot_cpu(); syscore_shutdown(); printk(KERN_EMERG "System halted.\n"); kmsg_dump(KMSG_DUMP_HALT); @@ -419,7 +442,7 @@ void kernel_power_off(void) kernel_shutdown_prepare(SYSTEM_POWER_OFF); if (pm_power_off_prepare) pm_power_off_prepare(); - disable_nonboot_cpus(); + migrate_to_reboot_cpu(); syscore_shutdown(); printk(KERN_EMERG "Power down.\n"); kmsg_dump(KMSG_DUMP_POWEROFF); -- 1.8.2.1
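The CPU selection in migrate_to_reboot_cpu() above is small enough to model outside the kernel: prefer logical cpu 0, and fall back to the first online cpu if cpu 0 is offline. The following is a userspace sketch only. The 64-bit mask type and all three function names are hypothetical stand-ins for cpu_online_mask, cpu_online() and cpumask_first(), and returning -1 for an empty mask is a choice of this sketch, not kernel behaviour.

```c
#include <stdint.h>

/* Hypothetical stand-in for cpu_online_mask: bit n set means cpu n is online. */
typedef uint64_t cpu_mask_sketch;

/* Stand-in for cpu_online(): nonzero if 'cpu' is set in the mask. */
int mask_cpu_online(cpu_mask_sketch online, int cpu)
{
	return cpu >= 0 && cpu < 64 && ((online >> cpu) & 1u);
}

/* Stand-in for cpumask_first(): lowest set bit, or -1 if the mask is empty. */
int mask_first_cpu(cpu_mask_sketch online)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if ((online >> cpu) & 1u)
			return cpu;
	return -1;
}

/* Same decision as the patch: cpu 0 if online, else the first online cpu. */
int pick_reboot_cpu(cpu_mask_sketch online)
{
	int cpu = 0;

	if (!mask_cpu_online(online, cpu))
		cpu = mask_first_cpu(online);
	return cpu;
}
```

The sketch covers only the selection step. The real function additionally disables CPU hotplug, sets PF_NO_SETAFFINITY on the current task, and pins it with set_cpus_allowed_ptr(), none of which is modelled here.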
Re: [PATCH -v10 00/11] Shutdown from reboot_cpuid without stopping other cpus.
I will resubmit a -v11 with Russell's comment about the wrongly added ^// in a bit. Robin On Sat, May 11, 2013 at 06:57:16AM -0500, Robin Holt wrote: > We recently noticed that reboot of a 1024 cpu machine takes approx 16 > minutes of just stopping the cpus. The slowdown was tracked to commit > f96972f. > > The current implementation does all the work of hot removing the cpus > before halting the system. We are switching to just migrating to the > reboot_cpu and then continuing with shutdown/reboot. > > The patch set is broken into eleven parts. The first two are planned for > the stable release. The others move the halt/shutdown/reboot related > functions to their own kernel/reboot.c file and then move the handling > of the kernel reboot= kernel parameter to generic kernel code. > > Changes since -v9 > - Added Ingo's Acked-by for x86. > > - Added Guan's Acked-by for unicore32. > > - Replaced first patch with updated patch from Srivatsa S. Bhat. >This compiles for alpha allmodconfig, all arm defconfigs, and a few >test x86_64 defconfigs. I have not tried more. > > Changes since -v8 > - Changes reboot_cpu on stack to cpu to fix bug noticed by Russell King. > > - Switched unicore32 and arm from using REBOOT_WARM/COLD to HARD/SOFT. > > - Fixed case statement bug. > > - Went to using simple_strtoul for parsing reboot_cpu=smp###. > > - Made parsing of reboot= not use any #ifdef'd code. > > Changes since -v7. > - Fixed authorship for first patch. > > - Rebased to Linus' current tree (51a26ae7a). > > Changes since -v6. > - Cross compiled all arm architectures (using v3.9 kernel. Fails with >current). > > - Added a #define for non-hotplug case. > > - Add #define for PF_THREAD_BOUND as compatibility to make stable easier. > > - Fixup s/reboot_cpu_id/reboot_cpu/ > > - Add include of linux/uaccess.h to allow building on arm. > > - Removed last remaining checkpatch.pl line length warning on >kernel/reboot.c. > > - Fixed the duplicate handling or the reboot= kernel parameter. 
> > Changes since -v5. > - Moved the arch/x86 reboot= up to the generic kernel code. > > Changes since -v4. > - Integrated Srivatsa S. Bhat creating cpu_hotplug_disable() >function > > - Integrated comments by Srivatsa S. Bhat. > > - Made one more comment consistent with others in function. > > Changes since -v3. > - Added a tested-by for the original reporter. > > - Fix compile failure found by Joe Perches. > > - Integrated comments by Joe Perches. > > > To: Andrew Morton > Cc: H. Peter Anvin > Cc: Russ Anderson > Cc: Robin Holt > Cc: Russell King > Cc: Guan Xuetao > Cc: Linux Kernel Mailing List > Cc: the arch/x86 maintainers > Cc: Arm Mailing List
[PATCH -v10 06/11] x86, prepare reboot_mode for moving to generic kernel code.
This patch prepares for the moving the parsing of reboot= to the generic kernel code by making reboot_mode into a more generic form. Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Miguel Boton Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Acked-by: Ingo Molnar --- arch/x86/kernel/reboot.c | 12 +++- include/linux/reboot.h | 5 + 2 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c index 76fa1e9..f770340 100644 --- a/arch/x86/kernel/reboot.c +++ b/arch/x86/kernel/reboot.c @@ -36,7 +36,7 @@ void (*pm_power_off)(void); EXPORT_SYMBOL(pm_power_off); static const struct desc_ptr no_idt = {}; -static int reboot_mode; +static enum reboot_mode reboot_mode; enum reboot_type reboot_type = BOOT_ACPI; int reboot_force; @@ -88,11 +88,11 @@ static int __init reboot_setup(char *str) switch (*str) { case 'w': - reboot_mode = 0x1234; + reboot_mode = REBOOT_WARM; break; case 'c': - reboot_mode = 0; + reboot_mode = REBOOT_COLD; break; #ifdef CONFIG_SMP @@ -536,6 +536,7 @@ static void native_machine_emergency_restart(void) int i; int attempt = 0; int orig_reboot_type = reboot_type; + unsigned short mode; if (reboot_emergency) emergency_vmx_disable_all(); @@ -543,7 +544,8 @@ static void native_machine_emergency_restart(void) tboot_shutdown(TB_SHUTDOWN_REBOOT); /* Tell the BIOS if we want cold or warm reboot */ - *((unsigned short *)__va(0x472)) = reboot_mode; + mode = reboot_mode == REBOOT_WARM ? 0x1234 : 0; + *((unsigned short *)__va(0x472)) = mode; for (;;) { /* Could also try the reset bit in the Hammer NB */ @@ -585,7 +587,7 @@ static void native_machine_emergency_restart(void) case BOOT_EFI: if (efi_enabled(EFI_RUNTIME_SERVICES)) - efi.reset_system(reboot_mode ? + efi.reset_system(reboot_mode == REBOOT_WARM ? 
EFI_RESET_WARM : EFI_RESET_COLD, EFI_SUCCESS, 0, NULL); diff --git a/include/linux/reboot.h b/include/linux/reboot.h index c6eba21..37d56c3 100644 --- a/include/linux/reboot.h +++ b/include/linux/reboot.h @@ -10,6 +10,11 @@ #define SYS_HALT 0x0002 /* Notify of system halt */ #define SYS_POWER_OFF 0x0003 /* Notify of system power off */ +enum reboot_mode { + REBOOT_COLD = 0, + REBOOT_WARM, +}; + extern int register_reboot_notifier(struct notifier_block *); extern int unregister_reboot_notifier(struct notifier_block *); -- 1.8.2.1
[PATCH -v10 08/11] arm, Remove unused restart_mode fields from some arm subarchs
These restart_mode fields are not used at all. Remove them to make moving the reboot= cmdline options to the general kernel easier. Signed-off-by: Robin Holt To: Andrew Morton Cc: Russell King Cc: Russ Anderson Cc: Robin Holt Cc: H. Peter Anvin Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List --- arch/arm/mach-ebsa110/core.c | 1 - arch/arm/mach-pxa/mioa701.c | 1 - arch/arm/mach-pxa/spitz.c| 3 --- arch/arm/mach-pxa/tosa.c | 1 - 4 files changed, 6 deletions(-) diff --git a/arch/arm/mach-ebsa110/core.c b/arch/arm/mach-ebsa110/core.c index b13cc74..69a9d5d 100644 --- a/arch/arm/mach-ebsa110/core.c +++ b/arch/arm/mach-ebsa110/core.c @@ -321,7 +321,6 @@ MACHINE_START(EBSA110, "EBSA110") .atag_offset= 0x400, .reserve_lp0= 1, .reserve_lp2= 1, - .restart_mode = 's', .map_io = ebsa110_map_io, .init_early = ebsa110_init_early, .init_irq = ebsa110_init_irq, diff --git a/arch/arm/mach-pxa/mioa701.c b/arch/arm/mach-pxa/mioa701.c index f8979b9..dbea67a 100644 --- a/arch/arm/mach-pxa/mioa701.c +++ b/arch/arm/mach-pxa/mioa701.c @@ -756,7 +756,6 @@ static void mioa701_machine_exit(void) MACHINE_START(MIOA701, "MIO A701") .atag_offset= 0x100, - .restart_mode = 's', .map_io = _map_io, .nr_irqs= PXA_NR_IRQS, .init_irq = _init_irq, diff --git a/arch/arm/mach-pxa/spitz.c b/arch/arm/mach-pxa/spitz.c index 362726c..c3c0042 100644 --- a/arch/arm/mach-pxa/spitz.c +++ b/arch/arm/mach-pxa/spitz.c @@ -979,7 +979,6 @@ static void __init spitz_fixup(struct tag *tags, char **cmdline, #ifdef CONFIG_MACH_SPITZ MACHINE_START(SPITZ, "SHARP Spitz") - .restart_mode = 'g', .fixup = spitz_fixup, .map_io = pxa27x_map_io, .nr_irqs= PXA_NR_IRQS, @@ -993,7 +992,6 @@ MACHINE_END #ifdef CONFIG_MACH_BORZOI MACHINE_START(BORZOI, "SHARP Borzoi") - .restart_mode = 'g', .fixup = spitz_fixup, .map_io = pxa27x_map_io, .nr_irqs= PXA_NR_IRQS, @@ -1007,7 +1005,6 @@ MACHINE_END #ifdef CONFIG_MACH_AKITA MACHINE_START(AKITA, "SHARP Akita") - .restart_mode = 'g', .fixup = 
spitz_fixup, .map_io = pxa27x_map_io, .nr_irqs= PXA_NR_IRQS, diff --git a/arch/arm/mach-pxa/tosa.c b/arch/arm/mach-pxa/tosa.c index 3d91d2e..a41992f 100644 --- a/arch/arm/mach-pxa/tosa.c +++ b/arch/arm/mach-pxa/tosa.c @@ -969,7 +969,6 @@ static void __init fixup_tosa(struct tag *tags, char **cmdline, } MACHINE_START(TOSA, "SHARP Tosa") - .restart_mode = 'g', .fixup = fixup_tosa, .map_io = pxa25x_map_io, .nr_irqs= TOSA_NR_IRQS, -- 1.8.2.1
[PATCH -v10 07/11] unicore32, prepare reboot_mode for moving to generic kernel code.
This patch prepares for the moving the parsing of reboot= to the generic kernel code by making reboot_mode into a more generic form. Signed-off-by: Robin Holt To: Andrew Morton Cc: Guan Xuetao Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: H. Peter Anvin Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Acked-by: Guan Xuetao --- Changes since -v8 - Switched from using REBOOT_WARM/COLD to HARD/SOFT. --- arch/unicore32/kernel/process.c | 10 +- arch/unicore32/kernel/setup.h | 2 +- arch/unicore32/mm/mmu.c | 2 +- include/linux/reboot.h | 2 ++ 4 files changed, 9 insertions(+), 7 deletions(-) diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c index c944769..93dd035 100644 --- a/arch/unicore32/kernel/process.c +++ b/arch/unicore32/kernel/process.c @@ -51,14 +51,14 @@ void arch_cpu_idle(void) local_irq_enable(); } -static char reboot_mode = 'h'; +static enum reboot_mode reboot_mode = REBOOT_HARD; int __init reboot_setup(char *str) { - reboot_mode = str[0]; + if ('s' == str[0]) + reboot_mode = REBOOT_SOFT; return 1; } - __setup("reboot=", reboot_setup); void machine_halt(void) @@ -88,7 +88,7 @@ void machine_restart(char *cmd) * we may need it to insert some 1:1 mappings so that * soft boot works. */ - setup_mm_for_reboot(reboot_mode); + setup_mm_for_reboot(); /* Clean and invalidate caches */ flush_cache_all(); @@ -102,7 +102,7 @@ void machine_restart(char *cmd) /* * Now handle reboot code. 
*/ - if (reboot_mode == 's') { + if (reboot_mode == REBOOT_SOFT) { /* Jump into ROM at address 0x */ cpu_reset(VECTORS_BASE); } else { diff --git a/arch/unicore32/kernel/setup.h b/arch/unicore32/kernel/setup.h index 30f749d..f5c51b8 100644 --- a/arch/unicore32/kernel/setup.h +++ b/arch/unicore32/kernel/setup.h @@ -22,7 +22,7 @@ extern void puv3_ps2_init(void); extern void pci_puv3_preinit(void); extern void __init puv3_init_gpio(void); -extern void setup_mm_for_reboot(char mode); +extern void setup_mm_for_reboot(void); extern char __stubs_start[], __stubs_end[]; extern char __vectors_start[], __vectors_end[]; diff --git a/arch/unicore32/mm/mmu.c b/arch/unicore32/mm/mmu.c index 43c20b4..4f5a532 100644 --- a/arch/unicore32/mm/mmu.c +++ b/arch/unicore32/mm/mmu.c @@ -445,7 +445,7 @@ void __init paging_init(void) * the user-mode pages. This will then ensure that we have predictable * results when turning the mmu off */ -void setup_mm_for_reboot(char mode) +void setup_mm_for_reboot(void) { unsigned long base_pmdval; pgd_t *pgd; diff --git a/include/linux/reboot.h b/include/linux/reboot.h index 37d56c3..ca29a6f 100644 --- a/include/linux/reboot.h +++ b/include/linux/reboot.h @@ -13,6 +13,8 @@ enum reboot_mode { REBOOT_COLD = 0, REBOOT_WARM, + REBOOT_HARD, + REBOOT_SOFT, }; extern int register_reboot_notifier(struct notifier_block *); -- 1.8.2.1
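The unicore32 patch above changes reboot_setup() from storing the first character of the reboot= argument to mapping it onto the shared enum, with only a leading 's' selecting the soft mode. A side-effect-free userspace sketch of that mapping follows. The function name parse_reboot_arg() and the NULL check are additions of this sketch; the kernel handler assigns to a global and does not guard against NULL.

```c
#include <stddef.h>

/* The enum as extended by this patch in include/linux/reboot.h. */
enum reboot_mode {
	REBOOT_COLD = 0,
	REBOOT_WARM,
	REBOOT_HARD,
	REBOOT_SOFT,
};

/*
 * Hypothetical pure-function version of reboot_setup(): returns the mode
 * selected by 'str', or 'current_mode' unchanged when the argument does
 * not start with 's'.
 */
enum reboot_mode parse_reboot_arg(enum reboot_mode current_mode,
				  const char *str)
{
	if (str != NULL && str[0] == 's')
		return REBOOT_SOFT;
	return current_mode;
}
```

Starting from the default REBOOT_HARD, reboot=soft (or any value beginning with 's') selects REBOOT_SOFT, and every other value leaves the default in place, which matches the old behaviour where only the comparison against 's' was ever acted on.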
[PATCH -v10 03/11] Remove -stable friendly PF_THREAD_BOUND define
Remove the #define that the prior patch added to make backporting to the stable releases easier. Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List --- kernel/sys.c | 5 - 1 file changed, 5 deletions(-) diff --git a/kernel/sys.c b/kernel/sys.c index 2bbd9a7..17bb8d3 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -362,11 +362,6 @@ int unregister_reboot_notifier(struct notifier_block *nb) } EXPORT_SYMBOL(unregister_reboot_notifier); -/* Add backwards compatibility for stable trees. */ -#ifndef PF_NO_SETAFFINITY -#define PF_NO_SETAFFINITY PF_THREAD_BOUND -#endif - static void migrate_to_reboot_cpu(void) { /* The boot cpu is always logical cpu 0 */ -- 1.8.2.1
[PATCH -v10 02/11] Migrate shutdown/reboot to boot cpu.
We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus. The slowdown was tracked to commit
f96972f.

The current implementation does all the work of hot removing the cpus
before halting the system. We are switching to just migrating to the
boot cpu and then continuing with shutdown/reboot.

This also has the effect of not breaking x86's command line parameter
for specifying the reboot cpu. Note, this code was shamelessly copied
from arch/x86/kernel/reboot.c with bits removed pertaining to the
reboot_cpu command line parameter.

Signed-off-by: Robin Holt
Tested-by: Shawn Guo
To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Cc:
---
Changes since -v8:
 - Change stack parameter to make future patches cleaner.

Changes since -v6:
 - Add #define for PF_THREAD_BOUND as compatibility to make stable easier.
 - Fixup s/reboot_cpu_id/reboot_cpu/
---
 kernel/sys.c | 29 ++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index b95d3c7..2bbd9a7 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -362,6 +362,29 @@ int unregister_reboot_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_reboot_notifier);
 
+/* Add backwards compatibility for stable trees. */
+#ifndef PF_NO_SETAFFINITY
+#define PF_NO_SETAFFINITY		PF_THREAD_BOUND
+#endif
+
+static void migrate_to_reboot_cpu(void)
+{
+	/* The boot cpu is always logical cpu 0 */
+	int cpu = 0;
+
+	cpu_hotplug_disable();
+
+	/* Make certain the cpu I'm about to reboot on is online */
+	if (!cpu_online(cpu))
+		cpu = cpumask_first(cpu_online_mask);
+
+	/* Prevent races with other tasks migrating this task */
+	current->flags |= PF_NO_SETAFFINITY;
+
+	/* Make certain I only run on the appropriate processor */
+	set_cpus_allowed_ptr(current, cpumask_of(cpu));
+}
+
 /**
  *	kernel_restart - reboot the system
  *	@cmd: pointer to buffer containing command to execute for restart
@@ -373,7 +396,7 @@ EXPORT_SYMBOL(unregister_reboot_notifier);
 void kernel_restart(char *cmd)
 {
 	kernel_restart_prepare(cmd);
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	if (!cmd)
 		printk(KERN_EMERG "Restarting system.\n");
@@ -400,7 +423,7 @@ static void kernel_shutdown_prepare(enum system_states state)
 void kernel_halt(void)
 {
 	kernel_shutdown_prepare(SYSTEM_HALT);
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	printk(KERN_EMERG "System halted.\n");
 	kmsg_dump(KMSG_DUMP_HALT);
@@ -419,7 +442,7 @@ void kernel_power_off(void)
 	kernel_shutdown_prepare(SYSTEM_POWER_OFF);
 	if (pm_power_off_prepare)
 		pm_power_off_prepare();
-	disable_nonboot_cpus();
+	migrate_to_reboot_cpu();
 	syscore_shutdown();
 	printk(KERN_EMERG "Power down.\n");
 	kmsg_dump(KMSG_DUMP_POWEROFF);
-- 
1.8.2.1
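The CPU-selection step of migrate_to_reboot_cpu() in the patch above can be
sketched in user space as follows. This is only an illustration, not kernel
code: cpu_mask_t, first_online_cpu() and pick_reboot_cpu() are hypothetical
names, and a 64-bit word stands in for cpu_online_mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for cpu_online_mask: bit n set means CPU n is online. */
typedef uint64_t cpu_mask_t;

/* Rough equivalent of cpumask_first(): lowest set bit, or -1 if none. */
static int first_online_cpu(cpu_mask_t online)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (online & ((cpu_mask_t)1 << cpu))
			return cpu;
	return -1;
}

/*
 * Mirrors the selection logic in the patch: prefer logical CPU 0 and
 * fall back to the first online CPU when CPU 0 is offline.
 */
int pick_reboot_cpu(cpu_mask_t online)
{
	int cpu = 0;

	assert(online != 0);	/* at least one CPU must be online */

	if (!(online & ((cpu_mask_t)1 << cpu)))
		cpu = first_online_cpu(online);

	return cpu;
}
```

With CPUs 1 and 2 online and CPU 0 offline (mask 0x6), pick_reboot_cpu()
selects CPU 1. The real function then pins the current task to that CPU,
which this sketch does not attempt to model.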
[PATCH -v10 01/11] CPU hotplug: Provide a generic helper to disable/enable CPU hotplug
From: "Srivatsa S. Bhat"

There are instances in the kernel where we would like to disable
CPU hotplug (from sysfs) during some important operation. Today
the freezer code depends on this and the code to do it was kinda
tailor-made for that.

Restructure the code and make it generic enough to be useful for
other usecases too.

Signed-off-by: Srivatsa S. Bhat
Signed-off-by: Robin Holt
To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Cc:
---
 include/linux/cpu.h |  4
 kernel/cpu.c        | 55 ++---
 2 files changed, 27 insertions(+), 32 deletions(-)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index c6f6e08..9f3c7e8 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -175,6 +175,8 @@ extern struct bus_type cpu_subsys;
 
 extern void get_online_cpus(void);
 extern void put_online_cpus(void);
+extern void cpu_hotplug_disable(void);
+extern void cpu_hotplug_enable(void);
 #define hotcpu_notifier(fn, pri)	cpu_notifier(fn, pri)
 #define register_hotcpu_notifier(nb)	register_cpu_notifier(nb)
 #define unregister_hotcpu_notifier(nb)	unregister_cpu_notifier(nb)
@@ -198,6 +200,8 @@ static inline void cpu_hotplug_driver_unlock(void)
 
 #define get_online_cpus()	do { } while (0)
 #define put_online_cpus()	do { } while (0)
+#define cpu_hotplug_disable()	do { } while (0)
+#define cpu_hotplug_enable()	do { } while (0)
 #define hotcpu_notifier(fn, pri)	do { (void)(fn); } while (0)
 /* These aren't inline functions due to a GCC bug. */
 #define register_hotcpu_notifier(nb)	({ (void)(nb); 0; })
diff --git a/kernel/cpu.c b/kernel/cpu.c
index b5e4ab2..198a388 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -133,6 +133,27 @@ static void cpu_hotplug_done(void)
 	mutex_unlock(&cpu_hotplug.lock);
 }
 
+/*
+ * Wait for currently running CPU hotplug operations to complete (if any) and
+ * disable future CPU hotplug (from sysfs). The 'cpu_add_remove_lock' protects
+ * the 'cpu_hotplug_disabled' flag. The same lock is also acquired by the
+ * hotplug path before performing hotplug operations. So acquiring that lock
+ * guarantees mutual exclusion from any currently running hotplug operations.
+ */
+void cpu_hotplug_disable(void)
+{
+	cpu_maps_update_begin();
+	cpu_hotplug_disabled = 1;
+	cpu_maps_update_done();
+}
+
+void cpu_hotplug_enable(void)
+{
+	cpu_maps_update_begin();
+	cpu_hotplug_disabled = 0;
+	cpu_maps_update_done();
+}
+
 #else /* #if CONFIG_HOTPLUG_CPU */
 static void cpu_hotplug_begin(void) {}
 static void cpu_hotplug_done(void) {}
@@ -541,36 +562,6 @@ static int __init alloc_frozen_cpus(void)
 core_initcall(alloc_frozen_cpus);
 
 /*
- * Prevent regular CPU hotplug from racing with the freezer, by disabling CPU
- * hotplug when tasks are about to be frozen. Also, don't allow the freezer
- * to continue until any currently running CPU hotplug operation gets
- * completed.
- * To modify the 'cpu_hotplug_disabled' flag, we need to acquire the
- * 'cpu_add_remove_lock'. And this same lock is also taken by the regular
- * CPU hotplug path and released only after it is complete. Thus, we
- * (and hence the freezer) will block here until any currently running CPU
- * hotplug operation gets completed.
- */
-void cpu_hotplug_disable_before_freeze(void)
-{
-	cpu_maps_update_begin();
-	cpu_hotplug_disabled = 1;
-	cpu_maps_update_done();
-}
-
-
-/*
- * When tasks have been thawed, re-enable regular CPU hotplug (which had been
- * disabled while beginning to freeze tasks).
- */
-void cpu_hotplug_enable_after_thaw(void)
-{
-	cpu_maps_update_begin();
-	cpu_hotplug_disabled = 0;
-	cpu_maps_update_done();
-}
-
-/*
  * When callbacks for CPU hotplug notifications are being executed, we must
  * ensure that the state of the system with respect to the tasks being frozen
  * or not, as reported by the notification, remains unchanged *throughout the
@@ -589,12 +580,12 @@ cpu_hotplug_pm_callback(struct notifier_block *nb,
 
 	case PM_SUSPEND_PREPARE:
 	case PM_HIBERNATION_PREPARE:
-		cpu_hotplug_disable_before_freeze();
+		cpu_hotplug_disable();
 		break;
 
 	case PM_POST_SUSPEND:
 	case PM_POST_HIBERNATION:
-		cpu_hotplug_enable_after_thaw();
+		cpu_hotplug_enable();
 		break;
 
 	default:
-- 
1.8.2.1
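The locking argument in the comment above (the flag and the hotplug path
share one lock, so setting the flag also waits out any running operation)
can be modelled in user space roughly like this. All names here are
hypothetical stand-ins: a pthread mutex plays the role of
cpu_add_remove_lock, and hotplug_request() is an assumed simplification of
a hotplug path that refuses with -EBUSY while the flag is set.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for cpu_add_remove_lock and cpu_hotplug_disabled. */
static pthread_mutex_t add_remove_lock = PTHREAD_MUTEX_INITIALIZER;
static int hotplug_disabled;

static void maps_update_begin(void)
{
	int rc = pthread_mutex_lock(&add_remove_lock);

	assert(rc == 0);
	(void)rc;
}

static void maps_update_done(void)
{
	int rc = pthread_mutex_unlock(&add_remove_lock);

	assert(rc == 0);
	(void)rc;
}

/* Blocks until any in-flight hotplug request has dropped the lock. */
void hotplug_disable(void)
{
	maps_update_begin();
	hotplug_disabled = 1;
	maps_update_done();
}

void hotplug_enable(void)
{
	maps_update_begin();
	hotplug_disabled = 0;
	maps_update_done();
}

/* Assumed model of a hotplug request: refused while the flag is set. */
int hotplug_request(void)
{
	int err;

	maps_update_begin();
	err = hotplug_disabled ? -EBUSY : 0;
	/* A real hotplug operation would run here, still under the lock. */
	maps_update_done();

	return err;
}
```

Because hotplug_disable() has to take the same lock as hotplug_request(),
it cannot return while a request is still in progress, which is the
property the patch relies on.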
[PATCH -v10 09/11] arm, prepare reboot_mode for moving to generic kernel code.
This patch prepares for moving the parsing of reboot= to the generic
kernel code by making reboot_mode into a more generic form.

Signed-off-by: Robin Holt
To: Andrew Morton
To: Russell King
Cc: Russ Anderson
Cc: Robin Holt
Cc: H. Peter Anvin
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
---
Changes since -v8:
 - Switched from using REBOOT_WARM/COLD to HARD/SOFT.
---
 arch/arm/include/asm/mach/arch.h   | 3 ++-
 arch/arm/kernel/process.c          | 8
 arch/arm/kernel/setup.c            | 6 +++---
 arch/arm/mach-footbridge/cats-hw.c | 2 +-
 4 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
index 308ad7d..e2b551e 100644
--- a/arch/arm/include/asm/mach/arch.h
+++ b/arch/arm/include/asm/mach/arch.h
@@ -9,6 +9,7 @@
  */
 
 #ifndef __ASSEMBLY__
+#include <linux/reboot.h>
 
 struct tag;
 struct meminfo;
@@ -39,7 +40,7 @@ struct machine_desc {
 	unsigned char		reserve_lp0 :1;	/* never has lp0	*/
 	unsigned char		reserve_lp1 :1;	/* never has lp1	*/
 	unsigned char		reserve_lp2 :1;	/* never has lp2	*/
-	char			restart_mode;	/* default restart mode	*/
+	enum reboot_mode	reboot_mode;	/* default restart mode	*/
 	struct smp_operations	*smp;		/* SMP operations	*/
 	void			(*fixup)(struct tag *, char **,
 					 struct meminfo *);
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index f219703..92b47df 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -174,14 +174,14 @@ void arch_cpu_idle(void)
 		default_idle();
 }
 
-static char reboot_mode = 'h';
+enum reboot_mode reboot_mode = REBOOT_HARD;
 
-int __init reboot_setup(char *str)
+static int __init reboot_setup(char *str)
 {
-	reboot_mode = str[0];
+	if ('s' == str[0])
+		reboot_mode = REBOOT_SOFT;
 	return 1;
 }
-
 __setup("reboot=", reboot_setup);
 
 void machine_shutdown(void)
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 1522c7a..e05df42 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -73,7 +73,7 @@ __setup("fpe=", fpe_setup);
 
 extern void paging_init(struct machine_desc *desc);
 extern void sanity_check_meminfo(void);
-extern void reboot_setup(char *str);
+extern enum reboot_mode reboot_mode;
 extern void setup_dma_zone(struct machine_desc *desc);
 
 unsigned int processor_id;
@@ -769,8 +769,8 @@ void __init setup_arch(char **cmdline_p)
 
 	setup_dma_zone(mdesc);
 
-	if (mdesc->restart_mode)
-		reboot_setup(&mdesc->restart_mode);
+	if (mdesc->reboot_mode != REBOOT_HARD)
+		reboot_mode = mdesc->reboot_mode;
 
 	init_mm.start_code = (unsigned long) _text;
 	init_mm.end_code   = (unsigned long) _etext;
diff --git a/arch/arm/mach-footbridge/cats-hw.c b/arch/arm/mach-footbridge/cats-hw.c
index 6987a09..9669cc0 100644
--- a/arch/arm/mach-footbridge/cats-hw.c
+++ b/arch/arm/mach-footbridge/cats-hw.c
@@ -86,7 +86,7 @@ fixup_cats(struct tag *tags, char **cmdline, struct meminfo *mi)
 MACHINE_START(CATS, "Chalice-CATS")
 	/* Maintainer: Philip Blundell */
 	.atag_offset	= 0x100,
-	.restart_mode	= 's',
+	.reboot_mode	= REBOOT_SOFT,
 	.fixup		= fixup_cats,
 	.map_io		= footbridge_map_io,
 	.init_irq	= footbridge_init_irq,
-- 
1.8.2.1
[PATCH -v10 11/11] Move arch/x86 reboot= handling to generic kernel.
Merge together the unicore32, arm, and x86 reboot= command line parameter handling. Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russell King Cc: Guan Xuetao Cc: Russ Anderson Cc: Robin Holt Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List Acked-by: Ingo Molnar Acked-by: Guan Xuetao --- Changes since -v8 - Add missing break statements. - Change parsing so #ifdef's are no longer needed. - Switch to using simple_strtoul to make parsing cleaner. - Add handling of REBOOT_HARD/SOFT --- Documentation/kernel-parameters.txt | 14 +++- arch/arm/kernel/process.c| 10 --- arch/unicore32/kernel/process.c | 10 --- arch/x86/include/asm/emergency-restart.h | 12 arch/x86/kernel/apic/x2apic_uv_x.c | 2 +- arch/x86/kernel/reboot.c | 111 +-- include/linux/reboot.h | 17 + kernel/reboot.c | 76 - 8 files changed, 107 insertions(+), 145 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index c3bfacb..b2945ce 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2677,9 +2677,17 @@ bytes respectively. Such letter suffixes can also be entirely omitted. Run specified binary instead of /init from the ramdisk, used for early userspace startup. See initrd. - reboot= [BUGS=X86-32,BUGS=ARM,BUGS=IA-64] Rebooting mode - Format: [,[,...]] - See arch/*/kernel/reboot.c or arch/*/kernel/process.c + reboot= [KNL] + Format (x86 or x86_64): + [w[arm] | c[old] | h[ard] | s[oft] | g[pio]] \ + [[,]s[mp] \ + [[,]b[ios] | a[cpi] | k[bd] | t[riple] | e[fi] | p[ci]] \ + [[,]f[orce] + Where reboot_mode is one of warm (soft) or cold (hard) or gpio, + reboot_type is one of bios, acpi, kbd, triple, efi, or pci, + reboot_force is either force or not specified, + reboot_cpu is s[mp] with being the processor + to be used for rebooting. relax_domain_level= [KNL, SMP] Set scheduler's default relax_domain_level. 
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c index 42856fc..304b102 100644 --- a/arch/arm/kernel/process.c +++ b/arch/arm/kernel/process.c @@ -175,16 +175,6 @@ void arch_cpu_idle(void) default_idle(); } -enum reboot_mode reboot_mode = REBOOT_HARD; - -static int __init reboot_setup(char *str) -{ - if ('s' == str[0]) - reboot_mode = REBOOT_SOFT; - return 1; -} -__setup("reboot=", reboot_setup); - void machine_shutdown(void) { #ifdef CONFIG_SMP diff --git a/arch/unicore32/kernel/process.c b/arch/unicore32/kernel/process.c index 93dd035..778ebba 100644 --- a/arch/unicore32/kernel/process.c +++ b/arch/unicore32/kernel/process.c @@ -51,16 +51,6 @@ void arch_cpu_idle(void) local_irq_enable(); } -static enum reboot_mode reboot_mode = REBOOT_HARD; - -int __init reboot_setup(char *str) -{ - if ('s' == str[0]) - reboot_mode = REBOOT_SOFT; - return 1; -} -__setup("reboot=", reboot_setup); - void machine_halt(void) { gpio_set_value(GPO_SOFT_OFF, 0); diff --git a/arch/x86/include/asm/emergency-restart.h b/arch/x86/include/asm/emergency-restart.h index 75ce3f4..77a99ac 100644 --- a/arch/x86/include/asm/emergency-restart.h +++ b/arch/x86/include/asm/emergency-restart.h @@ -1,18 +1,6 @@ #ifndef _ASM_X86_EMERGENCY_RESTART_H #define _ASM_X86_EMERGENCY_RESTART_H -enum reboot_type { - BOOT_TRIPLE = 't', - BOOT_KBD = 'k', - BOOT_BIOS = 'b', - BOOT_ACPI = 'a', - BOOT_EFI = 'e', - BOOT_CF9 = 'p', - BOOT_CF9_COND = 'q', -}; - -extern enum reboot_type reboot_type; - extern void machine_emergency_restart(void); #endif /* _ASM_X86_EMERGENCY_RESTART_H */ diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c index 794f6eb..958e3e4 100644 --- a/arch/x86/kernel/apic/x2apic_uv_x.c +++ b/arch/x86/kernel/apic/x2apic_uv_x.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include @@ -36,7 +37,6 @@ #include #include #include -#include #include /* BMC sets a bit this MMR non-zero before sending an NMI */ diff --git 
a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c index f770340..563ed91 100644 --- a/arch/x86/kernel/reboot.c +++ b/arch/x86/kernel/reboot.c @@ -36,22 +36,6 @@ void (*pm_power_off)(void); EXPORT_SYMBOL(pm_power_off); static const st
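The s[mp] form of the reboot= parameter documented in this patch doubles as
both "soft" and "reboot on CPU N", so a parser has to look for digits to
tell them apart. A minimal user-space sketch of that disambiguation follows;
parse_reboot_cpu() is a hypothetical helper, strtoul() with base 10 stands
in for the kernel's simple_strtoul(), and the exact rules in the patch may
differ.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Returns the CPU number encoded as "s<digits>" or "smp<digits>", or -1
 * when the option carries no CPU number (for example "soft" or "warm").
 */
int parse_reboot_cpu(const char *opt)
{
	const char *p;

	assert(opt != NULL);

	if (opt[0] != 's')
		return -1;

	p = opt + 1;
	if (p[0] == 'm' && p[1] == 'p')
		p += 2;

	if (!isdigit((unsigned char)*p))
		return -1;

	/* Parsing stops at the first non-digit, such as a ',' separator. */
	return (int)strtoul(p, NULL, 10);
}
```

Under these assumptions "smp4,force" yields CPU 4, while "soft" yields -1
and would be handled as a reboot mode instead.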
[PATCH -v10 04/11] Move shutdown/reboot related functions to kernel/reboot.c
This patch is preparatory. It moves the reboot-related syscall and
supporting functions from kernel/sys.c to kernel/reboot.c.

Signed-off-by: Robin Holt
To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
---
Changes since -v6:
 - Add include of linux/uaccess.h to allow building on arm.
---
 kernel/Makefile |   2 +-
 kernel/reboot.c | 347
 kernel/sys.c    | 331 -
 3 files changed, 348 insertions(+), 332 deletions(-)
 create mode 100644 kernel/reboot.c

diff --git a/kernel/Makefile b/kernel/Makefile
index 271fd31..470839d 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -9,7 +9,7 @@ obj-y = fork.o exec_domain.o panic.o printk.o \
 	    rcupdate.o extable.o params.o posix-timers.o \
 	    kthread.o wait.o sys_ni.o posix-cpu-timers.o mutex.o \
 	    hrtimer.o rwsem.o nsproxy.o srcu.o semaphore.o \
-	    notifier.o ksysfs.o cred.o \
+	    notifier.o ksysfs.o cred.o reboot.o \
 	    async.o range.o groups.o lglock.o smpboot.o
 
 ifdef CONFIG_FUNCTION_TRACER
diff --git a/kernel/reboot.c b/kernel/reboot.c
new file mode 100644
index 0000000..0616483
--- /dev/null
+++ b/kernel/reboot.c
@@ -0,0 +1,347 @@
+/*
+ *  linux/kernel/reboot.c
+ *
+ *  Copyright (C) 2013  Linus Torvalds
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+/*
+ * this indicates whether you can reboot with ctrl-alt-del: the default is yes
+ */
+
+int C_A_D = 1;
+struct pid *cad_pid;
+EXPORT_SYMBOL(cad_pid);
+
+/*
+ * If set, this is used for preparing the system to power off.
+ */
+
+void (*pm_power_off_prepare)(void);
+
+/**
+ *	emergency_restart - reboot the system
+ *
+ *	Without shutting down any hardware or taking any locks
+ *	reboot the system.  This is called when we know we are in
+ *	trouble so this is our best effort to reboot.  This is
+ *	safe to call in interrupt context.
+ */
+void emergency_restart(void)
+{
+	kmsg_dump(KMSG_DUMP_EMERG);
+	machine_emergency_restart();
+}
+EXPORT_SYMBOL_GPL(emergency_restart);
+
+void kernel_restart_prepare(char *cmd)
+{
+	blocking_notifier_call_chain(&reboot_notifier_list, SYS_RESTART, cmd);
+	system_state = SYSTEM_RESTART;
+	usermodehelper_disable();
+	device_shutdown();
+}
+
+/**
+ *	register_reboot_notifier - Register function to be called at reboot time
+ *	@nb: Info about notifier function to be called
+ *
+ *	Registers a function with the list of functions
+ *	to be called at reboot time.
+ *
+ *	Currently always returns zero, as blocking_notifier_chain_register()
+ *	always returns zero.
+ */
+int register_reboot_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_register(&reboot_notifier_list, nb);
+}
+EXPORT_SYMBOL(register_reboot_notifier);
+
+/**
+ *	unregister_reboot_notifier - Unregister previously registered reboot notifier
+ *	@nb: Hook to be unregistered
+ *
+ *	Unregisters a previously registered reboot
+ *	notifier function.
+ *
+ *	Returns zero on success, or %-ENOENT on failure.
+ */
+int unregister_reboot_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_unregister(&reboot_notifier_list, nb);
+}
+EXPORT_SYMBOL(unregister_reboot_notifier);
+
+static void migrate_to_reboot_cpu(void)
+{
+	/* The boot cpu is always logical cpu 0 */
+	int cpu = 0;
+
+	cpu_hotplug_disable();
+
+	/* Make certain the cpu I'm about to reboot on is online */
+	if (!cpu_online(cpu))
+		cpu = cpumask_first(cpu_online_mask);
+
+	/* Prevent races with other tasks migrating this task */
+	current->flags |= PF_NO_SETAFFINITY;
+
+	/* Make certain I only run on the appropriate processor */
+	set_cpus_allowed_ptr(current, cpumask_of(cpu));
+}
+
+/**
+ *	kernel_restart - reboot the system
+ *	@cmd: pointer to buffer containing command to execute for restart
+ *		or %NULL
+ *
+ *	Shutdown everything and perform a clean reboot.
+ *	This is not safe to call in interrupt context.
+ */
+void kernel_restart(char *cmd)
+{
+	kernel_restart_prepare(cmd);
+	migrate_to_reboot_cpu();
+	syscore_shutdown();
+	if (!cmd)
+		printk(KERN_EMERG "Restarting system.\n");
+	else
+		printk(KERN_EMERG "Restarting system with command '%s'.\n", cmd);
+	kmsg_dump(KMSG_DUMP_RESTART);
+	machine_restart(cmd);
+}
+EXPORT_SYMBOL_GPL(kernel_restart);
+
+static void kernel_shutdown_prepare(enum system_states state)
+{
+	blocking_notifier_call_chain(&reboot_notifier_list,
+		(state == SYSTEM_HALT)?SYS_HALT:SYS_POWER_OFF, NULL);
+	system_state = state;
[PATCH -v10 05/11] checkpatch.pl the new kernel/reboot.c file.
Get the new file to pass scripts/checkpatch.pl Signed-off-by: Robin Holt To: Andrew Morton Cc: H. Peter Anvin Cc: Russ Anderson Cc: Robin Holt Cc: Russell King Cc: Guan Xuetao Cc: Linux Kernel Mailing List Cc: the arch/x86 maintainers Cc: Arm Mailing List --- Changes since v6: - Removed last remaining line length warning. --- include/linux/reboot.h | 2 +- kernel/reboot.c| 28 +--- 2 files changed, 14 insertions(+), 16 deletions(-) diff --git a/include/linux/reboot.h b/include/linux/reboot.h index 23b3630..c6eba21 100644 --- a/include/linux/reboot.h +++ b/include/linux/reboot.h @@ -26,7 +26,7 @@ extern void machine_shutdown(void); struct pt_regs; extern void machine_crash_shutdown(struct pt_regs *); -/* +/* * Architecture independent implemenations of sys_reboot commands. */ diff --git a/kernel/reboot.c b/kernel/reboot.c index 0616483..abb6a04 100644 --- a/kernel/reboot.c +++ b/kernel/reboot.c @@ -4,6 +4,8 @@ * Copyright (C) 2013 Linus Torvalds */ +#define pr_fmt(fmt)"reboot: " fmt + #include #include #include @@ -114,9 +116,9 @@ void kernel_restart(char *cmd) migrate_to_reboot_cpu(); syscore_shutdown(); if (!cmd) - printk(KERN_EMERG "Restarting system.\n"); + pr_emerg("Restarting system\n"); else - printk(KERN_EMERG "Restarting system with command '%s'.\n", cmd); + pr_emerg("Restarting system with command '%s'\n", cmd); kmsg_dump(KMSG_DUMP_RESTART); machine_restart(cmd); } @@ -125,7 +127,7 @@ EXPORT_SYMBOL_GPL(kernel_restart); static void kernel_shutdown_prepare(enum system_states state) { blocking_notifier_call_chain(_notifier_list, - (state == SYSTEM_HALT)?SYS_HALT:SYS_POWER_OFF, NULL); + (state == SYSTEM_HALT) ? 
SYS_HALT : SYS_POWER_OFF, NULL); system_state = state; usermodehelper_disable(); device_shutdown(); @@ -140,11 +142,10 @@ void kernel_halt(void) kernel_shutdown_prepare(SYSTEM_HALT); migrate_to_reboot_cpu(); syscore_shutdown(); - printk(KERN_EMERG "System halted.\n"); + pr_emerg("System halted\n"); kmsg_dump(KMSG_DUMP_HALT); machine_halt(); } - EXPORT_SYMBOL_GPL(kernel_halt); /** @@ -159,7 +160,7 @@ void kernel_power_off(void) pm_power_off_prepare(); migrate_to_reboot_cpu(); syscore_shutdown(); - printk(KERN_EMERG "Power down.\n"); + pr_emerg("Power down\n"); kmsg_dump(KMSG_DUMP_POWEROFF); machine_power_off(); } @@ -188,10 +189,10 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, /* For safety, we require "magic" arguments. */ if (magic1 != LINUX_REBOOT_MAGIC1 || - (magic2 != LINUX_REBOOT_MAGIC2 && - magic2 != LINUX_REBOOT_MAGIC2A && + (magic2 != LINUX_REBOOT_MAGIC2 && + magic2 != LINUX_REBOOT_MAGIC2A && magic2 != LINUX_REBOOT_MAGIC2B && - magic2 != LINUX_REBOOT_MAGIC2C)) + magic2 != LINUX_REBOOT_MAGIC2C)) return -EINVAL; /* @@ -234,7 +235,8 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd, break; case LINUX_REBOOT_CMD_RESTART2: - if (strncpy_from_user([0], arg, sizeof(buffer) - 1) < 0) { + ret = strncpy_from_user([0], arg, sizeof(buffer) - 1); + if (ret < 0) { ret = -EFAULT; break; } @@ -282,7 +284,6 @@ void ctrl_alt_del(void) else kill_cad_pid(SIGINT, 1); } - char poweroff_cmd[POWEROFF_CMD_PATH_LEN] = "/sbin/poweroff"; @@ -301,14 +302,11 @@ static int __orderly_poweroff(bool force) ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC); argv_free(argv); } else { - printk(KERN_WARNING "%s failed to allocate memory for \"%s\"\n", -__func__, poweroff_cmd); ret = -ENOMEM; } if (ret && force) { - printk(KERN_WARNING "Failed to start orderly shutdown: " - "forcing the issue\n"); + pr_warn("Failed to start orderly shutdown: forcing the issue\n"); /* * I guess this should try to kick off some daemon to sync and * 
poweroff asap. Or not even bother syncing if we're doing an -- 1.8.2.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -v10 00/11] Shutdown from reboot_cpuid without stopping other cpus.
We recently noticed that reboot of a 1024 cpu machine takes approx 16
minutes of just stopping the cpus. The slowdown was tracked to commit
f96972f.

The current implementation does all the work of hot removing the cpus
before halting the system. We are switching to just migrating to the
reboot_cpu and then continuing with shutdown/reboot.

The patch set is broken into eleven parts. The first two are planned
for the stable release. The others move the halt/shutdown/reboot
related functions to their own kernel/reboot.c file and then move the
handling of the reboot= kernel parameter to generic kernel code.

Changes since -v9:
 - Added Ingo's Acked-by for x86.
 - Added Guan's Acked-by for unicore32.
 - Replaced first patch with updated patch from Srivatsa S. Bhat.
   This compiles for alpha allmodconfig, all arm defconfigs, and a few
   test x86_64 defconfigs. I have not tried more.

Changes since -v8:
 - Changed reboot_cpu on stack to cpu to fix bug noticed by Russell King.
 - Switched unicore32 and arm from using REBOOT_WARM/COLD to HARD/SOFT.
 - Fixed case statement bug.
 - Went to using simple_strtoul for parsing reboot_cpu=smp###.
 - Made parsing of reboot= not use any #ifdef'd code.

Changes since -v7:
 - Fixed authorship for first patch.
 - Rebased to Linus' current tree (51a26ae7a).

Changes since -v6:
 - Cross compiled all arm architectures (using v3.9 kernel. Fails with
   current).
 - Added a #define for non-hotplug case.
 - Add #define for PF_THREAD_BOUND as compatibility to make stable easier.
 - Fixup s/reboot_cpu_id/reboot_cpu/
 - Add include of linux/uaccess.h to allow building on arm.
 - Removed last remaining checkpatch.pl line length warning on
   kernel/reboot.c.
 - Fixed the duplicate handling of the reboot= kernel parameter.

Changes since -v5:
 - Moved the arch/x86 reboot= up to the generic kernel code.

Changes since -v4:
 - Integrated Srivatsa S. Bhat creating cpu_hotplug_disable() function.
 - Integrated comments by Srivatsa S. Bhat.
 - Made one more comment consistent with others in function.

Changes since -v3:
 - Added a tested-by for the original reporter.
 - Fix compile failure found by Joe Perches.
 - Integrated comments by Joe Perches.

To: Andrew Morton
Cc: H. Peter Anvin
Cc: Russ Anderson
Cc: Robin Holt
Cc: Russell King
Cc: Guan Xuetao
Cc: Linux Kernel Mailing List
Cc: the arch/x86 maintainers
Cc: Arm Mailing List
Re: [PATCH -v8 01/11] CPU hotplug: Provide a generic helper to disable/enable CPU hotplug
Thank you. That fixed the alpha allmodconfig case for me. I am currently rebuilding all the arm defconfig files and will resubmit -v10 when that is done. Robin On Sat, May 11, 2013 at 02:43:20PM +0530, Srivatsa S. Bhat wrote: > On 05/11/2013 09:46 AM, Robin Holt wrote: > > On Fri, May 10, 2013 at 10:03:24AM -0700, Andrew Morton wrote: > >> On Fri, 10 May 2013 13:11:51 +0200 "Rafael J. Wysocki" > >> wrote: > >> > >>> ... > >>> > >>>> cpu_hotplug_disable() doesn't get compiled unless we've defined > >>>> CONFIG_PM_SLEEP_SMP. I cannot begin to imagine what the logic is > >>>> behind that! > >>> > >>> I suppose it was only used by suspend/hibernate code paths when this was > >>> introduced. > >> > >> OK, well I suspect that what I have now is simply wrong for Robin's > >> application. Robin, can you please check this? We probably want to > >> make the does-something version of cpu_hotplug_disable/enable available > >> if CONFIG_HOTPLUG_CPU. > > > > This patch came from "Srivatsa S. Bhat" , > > I think I need to defer. > > Here is a revised patch, which should address all the problems, IMHO. > Let me know if you face any issues. (And thanks a lot for fixing up the > !SMP case in the previous version.) > > -> > > From: Srivatsa S. Bhat > Subject: [PATCH] CPU hotplug: Provide a generic helper to disable/enable CPU > hotplug > > There are instances in the kernel where we would like to disable > CPU hotplug (from sysfs) during some important operation. Today > the freezer code depends on this and the code to do it was kinda > tailor-made for that. > > Restructure the code and make it generic enough to be useful for > other usecases too. > > Signed-off-by: Srivatsa S. 
Bhat > --- > > include/linux/cpu.h |4 > kernel/cpu.c| 55 > +-- > 2 files changed, 27 insertions(+), 32 deletions(-) > > diff --git a/include/linux/cpu.h b/include/linux/cpu.h > index c6f6e08..9f3c7e8 100644 > --- a/include/linux/cpu.h > +++ b/include/linux/cpu.h > @@ -175,6 +175,8 @@ extern struct bus_type cpu_subsys; > > extern void get_online_cpus(void); > extern void put_online_cpus(void); > +extern void cpu_hotplug_disable(void); > +extern void cpu_hotplug_enable(void); > #define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri) > #define register_hotcpu_notifier(nb) register_cpu_notifier(nb) > #define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb) > @@ -198,6 +200,8 @@ static inline void cpu_hotplug_driver_unlock(void) > > #define get_online_cpus()do { } while (0) > #define put_online_cpus()do { } while (0) > +#define cpu_hotplug_disable()do { } while (0) > +#define cpu_hotplug_enable() do { } while (0) > #define hotcpu_notifier(fn, pri) do { (void)(fn); } while (0) > /* These aren't inline functions due to a GCC bug. */ > #define register_hotcpu_notifier(nb) ({ (void)(nb); 0; }) > diff --git a/kernel/cpu.c b/kernel/cpu.c > index b5e4ab2..198a388 100644 > --- a/kernel/cpu.c > +++ b/kernel/cpu.c > @@ -133,6 +133,27 @@ static void cpu_hotplug_done(void) > mutex_unlock(_hotplug.lock); > } > > +/* > + * Wait for currently running CPU hotplug operations to complete (if any) and > + * disable future CPU hotplug (from sysfs). The 'cpu_add_remove_lock' > protects > + * the 'cpu_hotplug_disabled' flag. The same lock is also acquired by the > + * hotplug path before performing hotplug operations. So acquiring that lock > + * guarantees mutual exclusion from any currently running hotplug operations. 
> + */ > +void cpu_hotplug_disable(void) > +{ > + cpu_maps_update_begin(); > + cpu_hotplug_disabled = 1; > + cpu_maps_update_done(); > +} > + > +void cpu_hotplug_enable(void) > +{ > + cpu_maps_update_begin(); > + cpu_hotplug_disabled = 0; > + cpu_maps_update_done(); > +} > + > #else /* #if CONFIG_HOTPLUG_CPU */ > static void cpu_hotplug_begin(void) {} > static void cpu_hotplug_done(void) {} > @@ -541,36 +562,6 @@ static int __init alloc_frozen_cpus(void) > core_initcall(alloc_frozen_cpus); > > /* > - * Prevent regular CPU hotplug from racing with the freezer, by disabling CPU > - * hotplug when tasks are about to be frozen. Also, don't allow the freezer > - * to continue until any currently running CPU hotplug operation gets > - * completed. > - * To modify the 'cpu_hotplug_disabled' flag, we need to acquire the > - *