Re: [V5 PATCH 00/26] mm, memory-hotplug: dynamic configure movable memory and introduce movable node
Hi Lai,

The patch-set is huge, so we hesitate to read it. I think it contains
several separate feature developments:

 - Development of online_movable [PATCH 1 - 3]
 - Cleanup of node_state_attr [PATCH 4]
 - Introduction of N_MEMORY [PATCH 5 - 18]
 - Development of kernelcore_max_addr [PATCH 19 - 25]
 - Bug fix [PATCH 26]

Why don't you split the patch-set by feature? If it were split, many
more people could easily participate in your development.

Thanks,
Yasuaki Ishimatsu

2012/10/30 0:07, Lai Jiangshan wrote:
> Movable memory is a very important concept in memory management; we
> need to consolidate it and make use of it on systems.
>
> Movable memory is needed for
>  o anti-fragmentation (hugepage, big-order allocation...)
>  o logical hot-remove (virtualization, Memory Capacity on Demand)
>  o physical hot-remove (power-saving, hardware partitioning,
>    hardware fault management)
>
> All of these require configuring memory dynamically and making better
> and safer use of memory.
>
> We also need physical hot-remove, so we need movable nodes too.
> (Although some systems support physical memory migration, we don't
> require that all memory on a physical node be movable, but a movable
> node is still needed here for the logical node if we want to make
> physical migration transparent.)
>
> We add the dynamic configuration commands online_movable and
> online_kernel. We also add the non-dynamic boot option
> kernelcore_max_addr. We may add more dynamic/non-dynamic configuration
> in the future.
>
> The patchset is based on 3.7-rc3 with these three patches already applied:
> https://lkml.org/lkml/2012/10/24/151
> https://lkml.org/lkml/2012/10/26/150
>
> You can also simply pull all the patches from:
> git pull https://github.com/laijs/linux.git hotplug-next
>
> Issues:
>
> mempolicy (MPOL_BIND) doesn't behave well when the nodemask has movable
> nodes only: kernel allocations fail and the task can't create new
> tasks or other kernel objects.
>
> So we change the policy when the bound nodemask has movable node(s)
> only: we apply the mempolicy only to userspace allocations and don't
> apply it to kernel allocations.
>
> CPUSET has the same problem, but the code is spread across
> page_alloc.c and we haven't fixed it yet. We can/will change the
> allocation strategy to one of these 3 strategies:
> 1) the same strategy as mempolicy
> 2) change cpuset so that the nodemask always has at least one normal node
> 3) split the nodemask: nodemask_user and nodemask_kernel
>
> Thoughts?
>
> Patches:
>
> Patch1-3:   add online_movable and online_kernel, but don't result in
>             a movable node
> Patch4:     cleanup for node_state_attr
> Patch5:     introduce N_MEMORY
> Patch6-17:  use N_MEMORY instead of N_HIGH_MEMORY.
>             The patches are separated by subsystem;
>             Patch18 also changes the node_states initialization
> Patch18-20: add MOVABLE-dedicated node
> Patch21-25: add kernelcore_max_addr
> Patch26:    mempolicy handles movable node
>
> Changes:
>
> change V5-V4:
>   consolidate online_movable/online_kernel nodemask management
>
> change V4-v3:
>   rebase.
>   online_movable/online_kernel can create a zone from empty or
>   empty a zone
>
> change V3-v2:
>   Proper nodemask management
>
> change V2-V1:
>
> The original V1 patchset of MOVABLE-dedicated node is here:
> http://comments.gmane.org/gmane.linux.kernel.mm/78122
>
> The new V2 adds N_MEMORY and a notion of "MOVABLE-dedicated node",
> and fixes some related problems.
>
> The original V1 patchset that adds online_movable is here:
> https://lkml.org/lkml/2012/7/4/145
>
> The new V2 discards the MIGRATE_HOTREMOVE approach and uses a more
> straightforward implementation (only 1 patch).
>
> Lai Jiangshan (22):
>   mm, memory-hotplug: dynamic configure movable memory and portion memory
>   memory_hotplug: handle empty zone when online_movable/online_kernel
>   memory_hotplug: ensure every online node has NORMAL memory
>   node: cleanup node_state_attr
>   node_states: introduce N_MEMORY
>   cpuset: use N_MEMORY instead N_HIGH_MEMORY
>   procfs: use N_MEMORY instead N_HIGH_MEMORY
>   memcontrol: use N_MEMORY instead N_HIGH_MEMORY
>   oom: use N_MEMORY instead N_HIGH_MEMORY
>   mm,migrate: use N_MEMORY instead N_HIGH_MEMORY
>   mempolicy: use N_MEMORY instead N_HIGH_MEMORY
>   hugetlb: use N_MEMORY instead N_HIGH_MEMORY
>   vmstat: use N_MEMORY instead N_HIGH_MEMORY
>   kthread: use N_MEMORY instead N_HIGH_MEMORY
>   init: use N_MEMORY instead N_HIGH_MEMORY
>   vmscan: use N_MEMORY instead N_HIGH_MEMORY
>   page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization
>   hotplug: update nodemasks management
>   numa: add CONFIG_MOVABLE_NODE for movable-dedicated node
>   memory_hotplug: allow online/offline memory to result movable node
>   page_alloc: add kernelcore_max_addr
>   mempolicy: fix is_valid_nodemask()
>
> Yasuaki Ishimatsu (4):
>   x86: get pg_data_t's memory from
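
The mempolicy strategy described under "Issues" in the cover letter can be
modelled in a few lines of userspace C. This is only a sketch under stated
assumptions, not code from the patch-set: the nodemask_t typedef, the
alloc_kind enum and effective_nodes() are hypothetical names, and nodes are
modelled as bits in a 64-bit word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per NUMA node, at most 64 nodes. */
typedef uint64_t nodemask_t;

enum alloc_kind { ALLOC_USER, ALLOC_KERNEL };

/*
 * Return the set of nodes an allocation may use.
 *   bound:  nodes the task is bound to by its memory policy
 *   normal: nodes that have non-movable (kernel-usable) memory
 *   online: all nodes that have memory
 */
static nodemask_t effective_nodes(nodemask_t bound, nodemask_t normal,
                                  nodemask_t online, enum alloc_kind kind)
{
    bool movable_only = (bound & normal) == 0;

    if (kind == ALLOC_KERNEL) {
        /* A policy that offers no normal memory is ignored for the kernel. */
        if (movable_only)
            return online & normal;
        return bound & normal;
    }
    /* Userspace allocations always honour the bound nodemask. */
    return bound & online;
}
```

With nodes 0 and 1 normal and node 2 movable-only, a task bound to node 2
still gets kernel objects from nodes 0 and 1 while its user pages stay on
node 2.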
Re: [PATCH v2] acpi : acpi_bus_trim() stops removing devices when failing to remove the device
Hi Greg,

2012/10/27 0:25, Greg Kroah-Hartman wrote:
> On Fri, Oct 26, 2012 at 04:33:49PM +0900, Yasuaki Ishimatsu wrote:
>> Hi Greg,
>>
>> Sorry for the late reply.
>>
>> 2012/10/20 2:59, Greg Kroah-Hartman wrote:
>>> On Fri, Oct 19, 2012 at 06:29:52AM +0200, Rafael J. Wysocki wrote:
>>>> On Thursday 11 of October 2012 19:12:28 Yasuaki Ishimatsu wrote:
>>>>> acpi_bus_trim() stops removing devices when acpi_bus_remove()
>>>>> returns an error number. But acpi_bus_remove() cannot return an
>>>>> error number correctly: it only returns -EINVAL when the dev
>>>>> argument is NULL. Thus even if a device cannot be removed
>>>>> correctly, acpi_bus_trim() ignores the failure and continues to
>>>>> remove devices. acpi_bus_hot_remove_device() uses acpi_bus_trim()
>>>>> for removing devices. Therefore acpi_bus_hot_remove_device() can
>>>>> send _EJ0 to firmware even if the device is still running on the
>>>>> system. In this case, the system cannot work well.
>>>>>
>>>>> Vasilis hit the bug at memory hotplug and reported it as follows:
>>>>>
>>>>> https://lkml.org/lkml/2012/9/26/318
>>>>>
>>>>> So acpi_bus_trim() should check whether the device was removed
>>>>> correctly or not. The patch adds error checks to some of the
>>>>> functions that remove the device.
>>>>>
>>>>> With the patch applied, acpi_bus_trim() stops removing devices
>>>>> when it fails to remove a device. But I think there is no impact
>>>>> except on the CPU and memory hotplug paths, because other devices
>>>>> fail only in irregular cases, such as the device being NULL.
>>>>>
>>>>> v1-v2
>>>>> - add a rollback for reinstalling a notify handler.
>>>>>
>>>>> Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>>>>
>>>> Greg, do you think there may be any problems with the changes in dd.c?
>>>
>>> Yes, I don't like it. remove should always work, just like the exit
>>> call in a module. It means that the core wants to remove the driver,
>>> so it is going to happen, a driver can't refuse it.
>>>
>>> Which brings me to the larger question, why would this solve anything?
>>
>> Now we are developing physical memory hot plug.
>>
>> https://lkml.org/lkml/2012/10/23/213
>>
>> So if we apply the patch-set, we can hot remove a physical memory
>> device in the following way:
>>
>> echo 1 > /sys/bus/acpi/devices/PNP/eject
>>
>> In this case, acpi_bus_hot_remove_device() tries to remove the memory
>> device with acpi_bus_trim(). But if the device has irremovable memory,
>> memory hot remove fails and the memory remains in the kernel.
>> However, acpi_bus_trim() cannot notice that memory hot remove failed
>> and returns 0. So acpi_bus_hot_remove_device() continues to remove
>> memory devices and sends the _EJ0 method to firmware. Thus the memory
>> device can no longer be used, but the memory still remains in the
>> kernel. So if someone accesses the memory, a kernel panic occurs.
>
> Why can't you check to find out if you can do the remove operation
> before you enter the driver core asking to actually remove the devices?
> That would allow you to know if you can do this before having to go
> through the whole operation.
>
> What happens if you can complete half of the removal, and do that, but
> not the whole thing? Don't you end up with half of the memory chunk
> gone from the system now?
>
> In other words, please solve this at a higher level than the driver
> core if at all possible.

O.K.
I'll check whether the problem can be solved at a higher level or not.

Thanks,
Yasuaki Ishimatsu

> greg k-h

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
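
Greg's suggestion above (check whether removal is possible before asking the
driver core to remove anything) can be illustrated with a small userspace
model. This is a hedged sketch, not kernel code: struct hp_device,
can_remove_all() and hot_remove() are hypothetical names, and the model
ignores the races a real implementation would have to handle between the
check and the removal.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a hot-removable device. */
struct hp_device {
    bool removable; /* false if it holds memory that cannot be offlined */
    bool present;   /* cleared once the device has been removed */
};

/* Phase 1: inspect only; nothing is changed. */
static bool can_remove_all(const struct hp_device *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!devs[i].removable)
            return false;
    return true;
}

/*
 * Phase 2 runs only if phase 1 passed, so a refusal leaves every device
 * untouched. Returns 0 on success, -1 if the removal was refused.
 */
static int hot_remove(struct hp_device *devs, size_t n)
{
    if (!can_remove_all(devs, n))
        return -1;
    for (size_t i = 0; i < n; i++)
        devs[i].present = false;
    return 0; /* only at this point would an eject request be sent */
}
```

Because the refusal happens before any device is touched, the "half of the
memory chunk gone" case Greg asks about cannot arise in this model.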
Re: [PATCH v3 0/3] ACPI: container hot remove support.
Hi Tang,

2012/10/31 16:27, Tang Chen wrote:
> Hi,
>
> The container hotplug handler container_notify_cb() didn't implement
> the hot-remove functionality. So, these 3 patches implement it in the
> following way:
>
> patch 1. Do not use kacpid_wq/kacpid_notify_wq to handle container
>          hotplug events; use kacpi_hotplug_wq instead to avoid
>          deadlock. This is done so that acpi_bus_hot_remove_device()
>          can be reused in container hot-remove handling.
>
> patch 2. Introduce a new function container_device_remove() to handle
>          the ACPI_NOTIFY_EJECT_REQUEST event for containers.

If the container device contains a memory device, the function is very
dangerous. As you know, we are developing memory hotplug. If the memory
contains kernel memory, the memory hot remove operation fails. But
container_device_remove() cannot detect that. So even if the memory hot
remove operation fails, container_device_remove() continues the hot
remove operation and finally sends _EJ0 to firmware. In this case, if
the memory is accessed, a kernel panic occurs.
An example is here:

https://lkml.org/lkml/2012/9/26/318

Thanks,
Yasuaki Ishimatsu

> change log v2 - v3:
>
> 1. Add 1 patch (patch1). As Toshi Kani mentioned,
>    acpi_os_hotplug_execute() is already in the kernel. So use it
>    instead of alloc_acpi_hp_work() to add hotplug jobs onto
>    kacpi_hotplug_wq.
> 2. In patch3: Print the caller's function name when
>    container_device_remove() fails, to help debugging.
> 3. In patch3: Add a commit message to describe why we need to call
>    acpi_bus_trim() twice when removing devices.
>
> change log v1 - v2:
>
> 1. In patch1: Based on the latest for-pci-split-pci-root-hp-2 branch
>    from Lu Yinghai, use alloc_acpi_hp_work() to add container hotplug
>    work into kacpi_hotplug_wq.
> 2. In patch2: Allocate ej_event after the container is stopped, so
>    that we don't need to kfree the ej_event if stopping the container
>    failed.
>
> This is based on Lu Yinghai's work.
> git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
> for-pci-split-pci-root-hp-2
>
> Tang Chen (3):
>   Use acpi_os_hotplug_execute() instead of alloc_acpi_hp_work().
>   Use kacpi_hotplug_wq to handle container hotplug event.
>   Improve container_notify_cb() to support container hot-remove.
>
>  drivers/acpi/container.c           | 95 +++-
>  drivers/acpi/osl.c                 | 28 +-
>  drivers/acpi/pci_root_hp.c         | 25 ++---
>  drivers/pci/hotplug/acpiphp_glue.c | 39 ---
>  include/acpi/acpiosxf.h            |  7 +--
>  5 files changed, 137 insertions(+), 57 deletions(-)
Re: [PATCH v4] create sun sysfs file
Hi Len,

Ping...
I want you to merge the patch into your tree for linux-3.7.

Thanks,
Yasuaki Ishimatsu

2012/08/30 10:34, Yasuaki Ishimatsu wrote:
> Hi Len,
>
> Three weeks have passed since I posted the patch. All comments have
> already been applied to it, and I think there are no further comments
> about it. So I want you to merge it into your tree.
>
> Thanks,
> Yasuaki Ishimatsu
>
> 2012/08/07 9:36, Yasuaki Ishimatsu wrote:
>> Even if a device has a _SUN method, there is no way to know the slot
>> unique-ID. Thus the patch creates a "sun" file in sysfs so that we can
>> recognize it.
>>
>> Reviewed-by: Toshi Kani toshi.k...@hp.com
>> Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>>
>> ---
>>  drivers/acpi/scan.c     |   24 ++++++++++++++++++++++++
>>  include/acpi/acpi_bus.h |    1 +
>>  2 files changed, 25 insertions(+)
>>
>> Index: linux-3.5/include/acpi/acpi_bus.h
>> ===================================================================
>> --- linux-3.5.orig/include/acpi/acpi_bus.h	2012-07-30 10:06:49.722171575 +0900
>> +++ linux-3.5/include/acpi/acpi_bus.h	2012-08-07 08:57:45.678204360 +0900
>> @@ -209,6 +209,7 @@ struct acpi_device_pnp {
>>  	struct list_head ids;		/* _HID and _CIDs */
>>  	acpi_device_name device_name;	/* Driver-determined */
>>  	acpi_device_class device_class;	/* */
>> +	unsigned long sun;		/* _SUN */
>>  };
>>
>>  #define acpi_device_bid(d)	((d)->pnp.bus_id)
>> Index: linux-3.5/drivers/acpi/scan.c
>> ===================================================================
>> --- linux-3.5.orig/drivers/acpi/scan.c	2012-07-30 10:06:49.713171688 +0900
>> +++ linux-3.5/drivers/acpi/scan.c	2012-08-07 09:01:38.196203659 +0900
>> @@ -192,10 +192,20 @@ end:
>>  }
>>  static DEVICE_ATTR(path, 0444, acpi_device_path_show, NULL);
>>
>> +static ssize_t
>> +acpi_device_sun_show(struct device *dev, struct device_attribute *attr,
>> +		     char *buf) {
>> +	struct acpi_device *acpi_dev = to_acpi_device(dev);
>> +
>> +	return sprintf(buf, "%lu\n", acpi_dev->pnp.sun);
>> +}
>> +static DEVICE_ATTR(sun, 0444, acpi_device_sun_show, NULL);
>> +
>>  static int acpi_device_setup_files(struct acpi_device *dev)
>>  {
>>  	acpi_status status;
>>  	acpi_handle temp;
>> +	unsigned long long sun;
>>  	int result = 0;
>>
>>  	/*
>> @@ -217,6 +227,16 @@ static int acpi_device_setup_files(struc
>>  		goto end;
>>  	}
>>
>> +	status = acpi_evaluate_integer(dev->handle, "_SUN", NULL, &sun);
>> +	if (ACPI_SUCCESS(status)) {
>> +		dev->pnp.sun = (unsigned long)sun;
>> +		result = device_create_file(&dev->dev, &dev_attr_sun);
>> +		if (result)
>> +			goto end;
>> +	} else {
>> +		dev->pnp.sun = (unsigned long)-1;
>> +	}
>> +
>>  	/*
>>  	 * If device has _EJ0, 'eject' file is created that is used to trigger
>>  	 * hot-removal function from userland.
>> @@ -241,6 +261,10 @@ static void acpi_device_remove_files(str
>>  	if (ACPI_SUCCESS(status))
>>  		device_remove_file(&dev->dev, &dev_attr_eject);
>>
>> +	status = acpi_get_handle(dev->handle, "_SUN", &temp);
>> +	if (ACPI_SUCCESS(status))
>> +		device_remove_file(&dev->dev, &dev_attr_sun);
>> +
>>  	device_remove_file(&dev->dev, &dev_attr_modalias);
>>  	device_remove_file(&dev->dev, &dev_attr_hid);
>>  	if (dev->handle)
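
The sysfs file added by the patch prints the slot unique-ID as an unsigned
decimal followed by a newline. The userspace sketch below mirrors that
format and shows how a reader might parse it back. It is illustrative only:
sun_show() and sun_parse() are hypothetical helpers, not functions from the
patch or from any library.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Format a slot unique-ID the way the patch's show function does. */
static int sun_show(char *buf, size_t len, unsigned long sun)
{
    return snprintf(buf, len, "%lu\n", sun);
}

/*
 * Parse the text read from the sysfs file back into a number.
 * Returns 0 on success, -1 if the text is not a plain decimal value.
 */
static int sun_parse(const char *text, unsigned long *out)
{
    char *end;
    unsigned long value;

    errno = 0;
    value = strtoul(text, &end, 10);
    if (errno != 0 || end == text || (*end != '\n' && *end != '\0'))
        return -1;
    *out = value;
    return 0;
}
```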
Re: hot-added cpu is not assigned to the correct node
Hi Dan,

First of all, thank you for your comment.

2012/09/24 18:33, Dan Carpenter wrote:
> On Wed, Sep 12, 2012 at 02:33:11PM +0900, Yasuaki Ishimatsu wrote:
>> When I hot-added CPUs and memories simultaneously using container
>> driver, all the hot-added CPUs were mistakenly assigned to node0.
>
> Is this something which used to work correctly? If so which was the
> most recent working kernel?

This is the first time I have tried CPU hot-add on my x86 box, so I
don't know whether older kernels work correctly. But it seems that x86
does not permit creating a memory-less node, so I guess the problem
also occurs on older kernels.

Thanks,
Yasuaki Ishimatsu

> regards,
> dan carpenter
[PATCH] x86: rename mp_register_lapic in a comment
Commit 31d2092eb0c23636b73d2c24c0c11b66470cef58 (x86: move
mp_register_lapic_address to boot.c) renamed mp_register_lapic
to acpi_register_lapic. But mp_register_lapic remains in a comment.
So the patch renames it.

CC: Len Brown l...@kernel.org
CC: Thomas Gleixner t...@linutronix.de
CC: Ingo Molnar mi...@kernel.org
CC: H. Peter Anvin h...@zytor.com
Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

---
 arch/x86/kernel/acpi/boot.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-3.6-rc5/arch/x86/kernel/acpi/boot.c
===================================================================
--- linux-3.6-rc5.orig/arch/x86/kernel/acpi/boot.c	2012-09-19 11:38:03.990715466 +0900
+++ linux-3.6-rc5/arch/x86/kernel/acpi/boot.c	2012-09-26 09:50:42.269534856 +0900
@@ -656,7 +656,7 @@ static int __cpuinit _acpi_map_lsapic(ac
 	acpi_register_lapic(physid, ACPI_MADT_ENABLED);

 	/*
-	 * If mp_register_lapic successfully generates a new logical cpu
+	 * If acpi_register_lapic successfully generates a new logical cpu
 	 * number, then the following will get us exactly what was mapped
 	 */
 	cpumask_andnot(new_map, cpu_present_mask, tmp_map);
[PATCH] x86: use the correct macros
This patch replaces hard-coded -1 values with the appropriate macros
(NUMA_NO_NODE and PXM_INVAL).

CC: Len Brown l...@kernel.org
CC: Thomas Gleixner t...@linutronix.de
CC: Ingo Molnar mi...@kernel.org
CC: H. Peter Anvin h...@zytor.com
Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

---
 arch/x86/kernel/acpi/boot.c |    2 +-
 drivers/acpi/numa.c         |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

Index: linux-3.6-rc5/arch/x86/kernel/acpi/boot.c
===================================================================
--- linux-3.6-rc5.orig/arch/x86/kernel/acpi/boot.c	2012-09-13 15:44:30.0 +0900
+++ linux-3.6-rc5/arch/x86/kernel/acpi/boot.c	2012-09-13 15:46:31.743850426 +0900
@@ -601,7 +601,7 @@ static void __cpuinit acpi_map_cpu2node(
 	int nid;

 	nid = acpi_get_node(handle);
-	if (nid == -1 || !node_online(nid))
+	if (nid == NUMA_NO_NODE || !node_online(nid))
 		return;
 	set_apicid_to_node(physid, nid);
 	numa_set_node(cpu, nid);
Index: linux-3.6-rc5/drivers/acpi/numa.c
===================================================================
--- linux-3.6-rc5.orig/drivers/acpi/numa.c	2012-09-13 15:44:59.0 +0900
+++ linux-3.6-rc5/drivers/acpi/numa.c	2012-09-13 15:46:03.079850552 +0900
@@ -327,12 +327,12 @@ int acpi_get_pxm(acpi_handle h)
 			return pxm;
 		status = acpi_get_parent(handle, &phandle);
 	} while (ACPI_SUCCESS(status));
-	return -1;
+	return PXM_INVAL;
 }

 int acpi_get_node(acpi_handle *handle)
 {
-	int pxm, node = -1;
+	int pxm, node = NUMA_NO_NODE;

 	pxm = acpi_get_pxm(handle);
 	if (pxm >= 0 && pxm < MAX_PXM_DOMAINS)
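
To show why the named sentinels read better than a bare -1, here is a small
userspace model of a proximity-domain to node lookup. It is a sketch under
assumptions: the three macro values are defined locally for illustration
(the table size in particular is assumed), and map_init() and
node_for_pxm() are hypothetical helpers, not the functions touched by the
patch.

```c
/* Local definitions for the sketch; the table size is an assumption. */
#define MAX_PXM_DOMAINS 256
#define PXM_INVAL       (-1)
#define NUMA_NO_NODE    (-1)

static int pxm_to_node_map[MAX_PXM_DOMAINS];

/* Mark every proximity domain as unmapped. */
static void map_init(void)
{
    for (int i = 0; i < MAX_PXM_DOMAINS; i++)
        pxm_to_node_map[i] = NUMA_NO_NODE;
}

/* Translate a proximity domain to a node, or NUMA_NO_NODE if unknown. */
static int node_for_pxm(int pxm)
{
    if (pxm >= 0 && pxm < MAX_PXM_DOMAINS)
        return pxm_to_node_map[pxm];
    return NUMA_NO_NODE;
}
```

A caller can then write `if (nid == NUMA_NO_NODE)`, which states the intent
directly, whereas a comparison against -1 leaves the reader to guess which
of the two sentinels is meant.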
Re: [PATCH 2/4] memory-hotplug: add node_device_release
Hi Kosaki-san,

2012/09/28 5:13, KOSAKI Motohiro wrote:
> On Thu, Sep 27, 2012 at 1:45 AM, we...@cn.fujitsu.com wrote:
>> From: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>>
>> When calling unregister_node(), the function shows following message at
>> device_release().
>
> This description doesn't have the following message.
>
>> Device 'node2' does not have a release() function, it is broken and must
>> be fixed.

This is the message. It is shown by kobject_cleanup() when calling
unregister_node().

Thanks,
Yasuaki Ishimatsu

>> So the patch implements node_device_release()
Re: [PATCH 1/4] memory-hotplug: add memory_block_release
Hi Chen,

2012/09/27 19:20, Ni zhan Chen wrote:
> Hi Congyang,
>
> 2012/9/27 we...@cn.fujitsu.com
>> From: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>>
>> When calling remove_memory_block(), the function shows following
>> message at device_release().
>>
>> Device 'memory528' does not have a release() function, it is broken
>> and must be fixed.
>
> What's the difference between the patch and original implementation?

The implementation is for removing a memory_block, so the purpose is
the same as the original one. But the original code is bad manners.
kobject_cleanup() is called by remove_memory_block() at the end, but no
release function for the memory_block is registered. As a result, the
kernel message is shown. IMHO, the memory_block should be released by
the release function.

Thanks,
Yasuaki Ishimatsu

>> remove_memory_block() calls kfree(mem). I think it should be called
>> from device_release(). So the patch implements memory_block_release()
>>
>> CC: David Rientjes rient...@google.com
>> CC: Jiang Liu liu...@gmail.com
>> CC: Len Brown len.br...@intel.com
>> CC: Benjamin Herrenschmidt b...@kernel.crashing.org
>> CC: Paul Mackerras pau...@samba.org
>> Cc: Minchan Kim minchan@gmail.com
>> CC: Andrew Morton a...@linux-foundation.org
>> CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
>> CC: Wen Congyang we...@cn.fujitsu.com
>> Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>> ---
>>  drivers/base/memory.c |    9 ++++++++-
>>  1 files changed, 8 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>> index 7dda4f7..da457e5 100644
>> --- a/drivers/base/memory.c
>> +++ b/drivers/base/memory.c
>> @@ -70,6 +70,13 @@ void unregister_memory_isolate_notifier(struct notifier_block *nb)
>>  }
>>  EXPORT_SYMBOL(unregister_memory_isolate_notifier);
>>
>> +static void release_memory_block(struct device *dev)
>> +{
>> +	struct memory_block *mem = container_of(dev, struct memory_block, dev);
>> +
>> +	kfree(mem);
>> +}
>> +
>>  /*
>>   * register_memory - Setup a sysfs device for a memory block
>>   */
>> @@ -80,6 +87,7 @@ int register_memory(struct memory_block *memory)
>>
>>  	memory->dev.bus = &memory_subsys;
>>  	memory->dev.id = memory->start_section_nr / sections_per_block;
>> +	memory->dev.release = release_memory_block;
>>
>>  	error = device_register(&memory->dev);
>>  	return error;
>> @@ -630,7 +638,6 @@ int remove_memory_block(unsigned long node_id, struct mem_section *section,
>>  		mem_remove_simple_file(mem, phys_device);
>>  		mem_remove_simple_file(mem, removable);
>>  		unregister_memory(mem);
>> -		kfree(mem);
>>  	} else
>>  		kobject_put(&mem->dev.kobj);
>>
>> --
>> 1.7.1
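
The release callback in the patch recovers the enclosing memory_block from
the embedded struct device with container_of(). The userspace sketch below
shows that pointer arithmetic in isolation. It is a simplified model, not
the kernel definitions: the two structs are cut down to the fields needed
here, container_of() is re-defined locally with offsetof(), and
alloc_block() and blocks_freed are hypothetical additions used only to make
the example observable.

```c
#include <stddef.h>
#include <stdlib.h>

/* Local re-definition for the sketch; the kernel has its own macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins for the kernel structures. */
struct device {
    void (*release)(struct device *dev);
};

struct memory_block {
    unsigned long start_section_nr;
    struct device dev; /* embedded, as in the patch */
};

static int blocks_freed; /* counts release calls, for illustration */

/* Free the enclosing memory_block when its embedded device is released. */
static void release_memory_block(struct device *dev)
{
    struct memory_block *mem = container_of(dev, struct memory_block, dev);

    free(mem);
    blocks_freed++;
}

/* Allocate a block and register its release callback. */
static struct memory_block *alloc_block(unsigned long section_nr)
{
    struct memory_block *mem = calloc(1, sizeof(*mem));

    if (!mem)
        return NULL;
    mem->start_section_nr = section_nr;
    mem->dev.release = release_memory_block;
    return mem;
}
```

Tying the free to the release callback means the owner of the last
reference, whoever that turns out to be, frees the object.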
Re: [PATCH 2/4] memory-hotplug: add node_device_release
Hi Kosaki-san,

2012/09/28 10:13, KOSAKI Motohiro wrote:
> On Thu, Sep 27, 2012 at 8:07 PM, Yasuaki Ishimatsu
> isimatu.yasu...@jp.fujitsu.com wrote:
>> Hi Kosaki-san,
>>
>> 2012/09/28 5:13, KOSAKI Motohiro wrote:
>>> On Thu, Sep 27, 2012 at 1:45 AM, we...@cn.fujitsu.com wrote:
>>>> From: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>>>>
>>>> When calling unregister_node(), the function shows following message at
>>>> device_release().
>>>
>>> This description doesn't have the following message.
>>>
>>>> Device 'node2' does not have a release() function, it is broken and must
>>>> be fixed.
>>
>> This is the message. It is shown by kobject_cleanup() when calling
>> unregister_node().
>
> If so, you should quote the message. and don't mix it with your subject.
> Moreover your patch title is too silly. add node_device_release() function
> is a way. you should describe the effect of the patch. e.g. suppress
> "Device 'nodeXX' does not have a release() function" warning.

What you say is correct. We should update the subject and changelog.

> Moreover, your explanation is still insufficient. Even if
> node_device_release() is empty function, we can get rid of the warning.

I don't understand it. How can we get rid of the warning?

> Why do we need this node_device_release() implementation?

I think that this is the proper manner of releasing an object related
to a kobject.

Thanks,
Yasuaki Ishimatsu
Re: [PATCH 1/4] memory-hotplug: add memory_block_release
Hi Kosaki-san,

2012/09/28 10:35, KOSAKI Motohiro wrote:
> On Thu, Sep 27, 2012 at 8:24 PM, Yasuaki Ishimatsu
> isimatu.yasu...@jp.fujitsu.com wrote:
>> Hi Chen,
>>
>> 2012/09/27 19:20, Ni zhan Chen wrote:
>>> Hi Congyang,
>>>
>>> 2012/9/27 we...@cn.fujitsu.com
>>>> From: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>>>>
>>>> When calling remove_memory_block(), the function shows following
>>>> message at device_release().
>>>>
>>>> Device 'memory528' does not have a release() function, it is broken
>>>> and must be fixed.
>>>
>>> What's the difference between the patch and original implementation?
>>
>> The implementation is for removing a memory_block, so the purpose is
>> the same as the original one. But the original code is bad manners.
>> kobject_cleanup() is called by remove_memory_block() at the end, but
>> no release function for the memory_block is registered. As a result,
>> the kernel message is shown. IMHO, the memory_block should be
>> released by the release function.
>
> but your patch introduced use after free bug, if i understand correctly.
> See unregister_memory() function. After your patch, kobject_put() call
> release_memory_block() and kfree(). and then device_unregister() will
> touch freed memory.

It is not correct. The kobject_put() pairs with find_memory_block() in
remove_memory_block(), which increments kobject->kref. So
release_memory_block() is called by device_unregister() correctly, as
follows:

[ 1014.589008] Pid: 126, comm: kworker/0:2 Not tainted 3.6.0-rc3-enable-memory-hotremove-and-root-bridge #3
[ 1014.702437] Call Trace:
[ 1014.731684] [8144d096] release_memory_block+0x16/0x30
[ 1014.803581] [81438587] device_release+0x27/0xa0
[ 1014.869312] [8133e962] kobject_cleanup+0x82/0x1b0
[ 1014.937062] [8133ea9d] kobject_release+0xd/0x10
[ 1015.002718] [8133e7ec] kobject_put+0x2c/0x60
[ 1015.065271] [81438107] put_device+0x17/0x20
[ 1015.126794] [8143918a] device_unregister+0x2a/0x60
[ 1015.195578] [8144d55b] remove_memory_block+0xbb/0xf0
[ 1015.266434] [8144d5af] unregister_memory_section+0x1f/0x30
[ 1015.343532] [811c0a58] __remove_section+0x68/0x110
[ 1015.412318] [811c0be7] __remove_pages+0xe7/0x120
[ 1015.479021] [81653d8c] arch_remove_memory+0x2c/0x80
[ 1015.548845] [8165497b] remove_memory+0x6b/0xd0
[ 1015.613474] [813d946c] acpi_memory_device_remove_memory+0x48/0x73
[ 1015.697834] [813d94c2] acpi_memory_device_remove+0x2b/0x44
[ 1015.774922] [813a61e4] acpi_device_remove+0x90/0xb2
[ 1015.844796] [8143c2fc] __device_release_driver+0x7c/0xf0
[ 1015.919814] [8143c47f] device_release_driver+0x2f/0x50
[ 1015.992753] [813a70dc] acpi_bus_remove+0x32/0x6d
[ 1016.059462] [813a71a8] acpi_bus_trim+0x91/0x102
[ 1016.125128] [813a72a1] acpi_bus_hot_remove_device+0x88/0x16b
[ 1016.204295] [813a2e57] acpi_os_execute_deferred+0x27/0x34
[ 1016.280350] [81090599] process_one_work+0x219/0x680
[ 1016.350173] [81090538] ? process_one_work+0x1b8/0x680
[ 1016.422072] [813a2e30] ? acpi_os_wait_events_complete+0x23/0x23
[ 1016.504357] [810923ce] worker_thread+0x12e/0x320
[ 1016.571064] [810922a0] ? manage_workers+0x110/0x110
[ 1016.640886] [810983a6] kthread+0xc6/0xd0
[ 1016.699290] [8167b144] kernel_thread_helper+0x4/0x10
[ 1016.770149] [81670bb0] ? retint_restore_args+0x13/0x13
[ 1016.843165] [810982e0] ? __init_kthread_worker+0x70/0x70
[ 1016.918200] [8167b140] ? gs_change+0x13/0x13

Thanks,
Yasuaki Ishimatsu

> static void
> unregister_memory(struct memory_block *memory)
> {
> 	BUG_ON(memory->dev.bus != &memory_subsys);
>
> 	/* drop the ref. we got in remove_memory_block() */
> 	kobject_put(&memory->dev.kobj);
> 	device_unregister(&memory->dev);
> }
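
The disagreement above comes down to how many references are held when
unregister_memory() runs. The userspace model below walks through the
sequence Ishimatsu describes: one reference from registration, one taken
by the lookup, and two puts. It is only a sketch: struct obj and the
obj_*() helpers are hypothetical stand-ins for the kobject reference
counting, with no locking or atomics.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an object with a kobject-style refcount. */
struct obj {
    int refcount;
};

static int released; /* number of times the release path ran */

/* Release path: runs exactly once, when the last reference is dropped. */
static void obj_release(struct obj *o)
{
    free(o);
    released++;
}

/* Create an object holding the initial (registration) reference. */
static struct obj *obj_create(void)
{
    struct obj *o = malloc(sizeof(*o));

    if (o)
        o->refcount = 1;
    return o;
}

/* Take an extra reference, as a lookup function would. */
static struct obj *obj_get(struct obj *o)
{
    o->refcount++;
    return o;
}

/* Drop a reference; returns 1 if this call released the object. */
static int obj_put(struct obj *o)
{
    if (--o->refcount == 0) {
        obj_release(o);
        return 1;
    }
    return 0;
}
```

In this model the first put (the one that undoes the lookup) leaves the
object alive, and only the second put, corresponding to the unregister
step, reaches the release path. That matches the call trace, where
release_memory_block() is reached from device_unregister().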
Re: [RFC v9 PATCH 01/21] memory-hotplug: rename remove_memory() to offline_memory()/offline_pages()
Hi Chen,

2012/09/28 11:22, Ni zhan Chen wrote:
> On 09/05/2012 05:25 PM, we...@cn.fujitsu.com wrote:
>> From: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>>
>> remove_memory() only tries to offline pages. It is called in two cases:
>> 1. hot remove a memory device
>> 2. echo offline > /sys/devices/system/memory/memoryXX/state
>>
>> In the 1st case, we should also change the memory block's state, and
>> notify userspace that the memory block's state has changed after
>> offlining pages.
>>
>> So rename remove_memory() to offline_memory()/offline_pages(). And in
>> the 1st case, offline_memory() will be used.
>
> The function offline_memory() is not implemented.
>
>> In the 2nd case, offline_pages() will be used.
>
> But this time there is not a function associated with add_memory.

To associate it with add_memory() later, we renamed it.

Thanks,
Yasuaki Ishimatsu

>> CC: David Rientjes rient...@google.com
>> CC: Jiang Liu liu...@gmail.com
>> CC: Len Brown len.br...@intel.com
>> CC: Benjamin Herrenschmidt b...@kernel.crashing.org
>> CC: Paul Mackerras pau...@samba.org
>> CC: Christoph Lameter c...@linux.com
>> Cc: Minchan Kim minchan@gmail.com
>> CC: Andrew Morton a...@linux-foundation.org
>> CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
>> Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>> Signed-off-by: Wen Congyang we...@cn.fujitsu.com
>> ---
>>  drivers/acpi/acpi_memhotplug.c |    2 +-
>>  drivers/base/memory.c          |    9 +++--
>>  include/linux/memory_hotplug.h |    3 ++-
>>  mm/memory_hotplug.c            |   22 ++
>>  4 files changed, 20 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
>> index 24c807f..2a7beac 100644
>> --- a/drivers/acpi/acpi_memhotplug.c
>> +++ b/drivers/acpi/acpi_memhotplug.c
>> @@ -318,7 +318,7 @@ static int acpi_memory_disable_device(struct acpi_memory_device *mem_device)
>>  	 */
>>  	list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
>>  		if (info->enabled) {
>> -			result = remove_memory(info->start_addr, info->length);
>> +			result = offline_memory(info->start_addr, info->length);
>>  			if (result)
>>  				return result;
>>  		}
>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>> index 7dda4f7..44e7de6 100644
>> --- a/drivers/base/memory.c
>> +++ b/drivers/base/memory.c
>> @@ -248,26 +248,23 @@ static bool pages_correctly_reserved(unsigned long start_pfn,
>>  static int
>>  memory_block_action(unsigned long phys_index, unsigned long action)
>>  {
>> -	unsigned long start_pfn, start_paddr;
>> +	unsigned long start_pfn;
>>  	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
>>  	struct page *first_page;
>>  	int ret;
>>
>>  	first_page = pfn_to_page(phys_index << PFN_SECTION_SHIFT);
>> +	start_pfn = page_to_pfn(first_page);
>>
>>  	switch (action) {
>>  		case MEM_ONLINE:
>> -			start_pfn = page_to_pfn(first_page);
>> -
>>  			if (!pages_correctly_reserved(start_pfn, nr_pages))
>>  				return -EBUSY;
>>
>>  			ret = online_pages(start_pfn, nr_pages);
>>  			break;
>>  		case MEM_OFFLINE:
>> -			start_paddr = page_to_pfn(first_page) << PAGE_SHIFT;
>> -			ret = remove_memory(start_paddr,
>> -					    nr_pages << PAGE_SHIFT);
>> +			ret = offline_pages(start_pfn, nr_pages);
>>  			break;
>>  		default:
>>  			WARN(1, KERN_WARNING "%s(%ld, %ld) unknown action: "
>> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
>> index 910550f..c183f39 100644
>> --- a/include/linux/memory_hotplug.h
>> +++ b/include/linux/memory_hotplug.h
>> @@ -233,7 +233,8 @@ static inline int is_mem_section_removable(unsigned long pfn,
>>  extern int mem_online_node(int nid);
>>  extern int add_memory(int nid, u64 start, u64 size);
>>  extern int arch_add_memory(int nid, u64 start, u64 size);
>> -extern int remove_memory(u64 start, u64 size);
>> +extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
>> +extern int offline_memory(u64 start, u64 size);
>>  extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
>>  								int nr_pages);
>>  extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms);
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 3ad25f9..bb42316 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -866,7 +866,7 @@ check_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
>>  	return offlined;
>>  }
>>
>> -static int __ref offline_pages(unsigned long start_pfn,
>> +static int __ref __offline_pages(unsigned long start_pfn,
>>  		  unsigned long end_pfn, unsigned long timeout)
>>  {
>>  	unsigned long pfn, nr_pages, expire;
>> @@ -994,18 +994,24 @@ out:
>>  	return ret;
>>  }
>>
>> -int remove_memory(u64 start, u64 size)
>> +int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>>  {
>> -	unsigned long start_pfn, end_pfn;
>> +	return __offline_pages(start_pfn, start_pfn + nr_pages, 120 * HZ);
>> +}
>>
>> -	start_pfn = PFN_DOWN(start
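
The patch moves the sysfs offline path from a physical address and byte
length to a page frame number and page count. The userspace sketch below
shows one plausible conversion between the two forms. It is an
illustration under assumptions, not the code elided from the truncated
hunk: PAGE_SHIFT is assumed to be 12 (4 KiB pages), the PFN_DOWN/PFN_UP
macros are re-defined locally, and struct pfn_range and
bytes_to_pfn_range() are hypothetical names.

```c
#include <stdint.h>

/* Local definitions for the sketch; 4 KiB pages are assumed. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)
#define PFN_DOWN(x) ((unsigned long)((uint64_t)(x) >> PAGE_SHIFT))
#define PFN_UP(x)   ((unsigned long)(((uint64_t)(x) + PAGE_SIZE - 1) >> PAGE_SHIFT))

struct pfn_range {
    unsigned long start_pfn;
    unsigned long nr_pages;
};

/*
 * Convert a byte range [start, start + size) into the page frames that
 * cover it: the start is rounded down and the end is rounded up.
 */
static struct pfn_range bytes_to_pfn_range(uint64_t start, uint64_t size)
{
    struct pfn_range r;

    r.start_pfn = PFN_DOWN(start);
    r.nr_pages = PFN_UP(start + size) - r.start_pfn;
    return r;
}
```

For a page-aligned range the result is simply start and size shifted right
by PAGE_SHIFT; the rounding matters only for unaligned input.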
Re: [PATCH 1/4] memory-hotplug: add memory_block_release
Hi Chen,

2012/09/28 15:04, Ni zhan Chen wrote:
On 09/28/2012 11:45 AM, Yasuaki Ishimatsu wrote:
Hi Kosaki-san,

2012/09/28 10:35, KOSAKI Motohiro wrote:
On Thu, Sep 27, 2012 at 8:24 PM, Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com> wrote:
Hi Chen,

2012/09/27 19:20, Ni zhan Chen wrote:
Hi Congyang,

2012/9/27 we...@cn.fujitsu.com
From: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>

When calling remove_memory_block(), the function shows the following message at device_release():

Device 'memory528' does not have a release() function, it is broken and must be fixed.

What's the difference between the patch and the original implementation?

The implementation is for removing a memory_block, so the purpose is the same as the original one. But the original code is bad manners: kobject_cleanup() is called by remove_memory_block() at the end, but no release function for releasing the memory_block is registered. As a result, the kernel message is shown. IMHO, the memory_block should be released by the release function.

But your patch introduces a use-after-free bug, if I understand correctly. See the unregister_memory() function. After your patch, kobject_put() calls release_memory_block() and kfree(), and then device_unregister() will touch freed memory.

This patch is similar to [RFC v9 PATCH 10/21] memory-hotplug: add memory_block_release. They handle the same issue, so can these two patches be folded into one?

You're right. The patch is the same as [RFC v9 PATCH 10/21]. The patch is a bug fix, so we separated it from the memory-hotplug patch-set.

Thanks,
Yasuaki Ishimatsu

It is not correct. The kobject_put() is paired with find_memory_block() in remove_memory_block(), since kobject->kref is incremented in it.
So release_memory_block() is called by device_unregister() correctly as follows:

[ 1014.589008] Pid: 126, comm: kworker/0:2 Not tainted 3.6.0-rc3-enable-memory-hotremove-and-root-bridge #3
[ 1014.702437] Call Trace:
[ 1014.731684] [8144d096] release_memory_block+0x16/0x30
[ 1014.803581] [81438587] device_release+0x27/0xa0
[ 1014.869312] [8133e962] kobject_cleanup+0x82/0x1b0
[ 1014.937062] [8133ea9d] kobject_release+0xd/0x10
[ 1015.002718] [8133e7ec] kobject_put+0x2c/0x60
[ 1015.065271] [81438107] put_device+0x17/0x20
[ 1015.126794] [8143918a] device_unregister+0x2a/0x60
[ 1015.195578] [8144d55b] remove_memory_block+0xbb/0xf0
[ 1015.266434] [8144d5af] unregister_memory_section+0x1f/0x30
[ 1015.343532] [811c0a58] __remove_section+0x68/0x110
[ 1015.412318] [811c0be7] __remove_pages+0xe7/0x120
[ 1015.479021] [81653d8c] arch_remove_memory+0x2c/0x80
[ 1015.548845] [8165497b] remove_memory+0x6b/0xd0
[ 1015.613474] [813d946c] acpi_memory_device_remove_memory+0x48/0x73
[ 1015.697834] [813d94c2] acpi_memory_device_remove+0x2b/0x44
[ 1015.774922] [813a61e4] acpi_device_remove+0x90/0xb2
[ 1015.844796] [8143c2fc] __device_release_driver+0x7c/0xf0
[ 1015.919814] [8143c47f] device_release_driver+0x2f/0x50
[ 1015.992753] [813a70dc] acpi_bus_remove+0x32/0x6d
[ 1016.059462] [813a71a8] acpi_bus_trim+0x91/0x102
[ 1016.125128] [813a72a1] acpi_bus_hot_remove_device+0x88/0x16b
[ 1016.204295] [813a2e57] acpi_os_execute_deferred+0x27/0x34
[ 1016.280350] [81090599] process_one_work+0x219/0x680
[ 1016.350173] [81090538] ? process_one_work+0x1b8/0x680
[ 1016.422072] [813a2e30] ? acpi_os_wait_events_complete+0x23/0x23
[ 1016.504357] [810923ce] worker_thread+0x12e/0x320
[ 1016.571064] [810922a0] ? manage_workers+0x110/0x110
[ 1016.640886] [810983a6] kthread+0xc6/0xd0
[ 1016.699290] [8167b144] kernel_thread_helper+0x4/0x10
[ 1016.770149] [81670bb0] ? retint_restore_args+0x13/0x13
[ 1016.843165] [810982e0] ? __init_kthread_worker+0x70/0x70
[ 1016.918200] [8167b140] ? gs_change+0x13/0x13

Thanks,
Yasuaki Ishimatsu

static void
unregister_memory(struct memory_block *memory)
{
	BUG_ON(memory->dev.bus != &memory_subsys);

	/* drop the ref. we got in remove_memory_block() */
	kobject_put(&memory->dev.kobj);
	device_unregister(&memory->dev);
}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
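The reference counting argued about in this thread can be illustrated with a small userspace sketch. This is not kernel code: struct obj, obj_get() and obj_put() are hypothetical stand-ins for the kobject helpers, and the sequence only assumes what the mail above states, namely that the lookup takes an extra reference, the first put drops it, and the final put (as done inside device_unregister()) triggers the release function.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kobject: a counter plus a release hook. */
struct obj {
	int refcount;
	void (*release)(struct obj *o);
};

static int released;	/* number of times the release hook has run */

static void obj_release(struct obj *o)
{
	released++;
	free(o);
}

/* Create an object holding one reference (the "registration" reference). */
static struct obj *obj_create(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		abort();
	o->refcount = 1;
	o->release = obj_release;
	return o;
}

/* Take an extra reference, as a lookup such as find_memory_block() does. */
static struct obj *obj_get(struct obj *o)
{
	o->refcount++;
	return o;
}

/* Drop one reference; the release hook runs only when the last one goes. */
static void obj_put(struct obj *o)
{
	if (--o->refcount == 0)
		o->release(o);
}

/*
 * Replay the order used by unregister_memory(): drop the lookup reference
 * first, then drop the final reference. The two out-parameters record how
 * often release() had run after each step.
 */
static void run_sequence(int *after_lookup_put, int *after_final_put)
{
	struct obj *o = obj_create();
	struct obj *found = obj_get(o);	/* refcount: 2 */

	released = 0;
	obj_put(found);			/* refcount: 1, object still alive */
	*after_lookup_put = released;
	obj_put(o);			/* refcount: 0, release() runs here */
	*after_final_put = released;
}
```

With this ordering the object is still valid between the two puts, which is the point made above: the first put only balances the lookup.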
[PATCH 0/4] acpi,memory-hotplug : implement framework for hot removing memory
We are trying to implement physical memory hot removal, as discussed in the following thread:

https://lkml.org/lkml/2012/9/5/201

But it has not received enough review to be merged into the Linux kernel. I think there are the following blockers:

1. no system that supports physical memory hot removal
2. huge patch-set

If you have a KVM system, the 1st blocker goes away, because by applying the following patch we can create a memory hot-removable system on a KVM guest:

http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg01389.html

The 2nd blocker is our own problem. So we are trying to divide the huge patch-set into smaller patch-sets, one per function, as follows:

- bug fix
- acpi framework
- kernel core

We have already sent the bug fix patches:

https://lkml.org/lkml/2012/9/27/39
https://lkml.org/lkml/2012/10/2/83

This patch-set implements a framework for hot removing memory.

The memory device can be removed in 2 ways:
1. send an eject request by SCI
2. echo 1 > /sys/bus/pci/devices/PNP0C80:XX/eject

In the 1st case, acpi_memory_disable_device() will be called.
In the 2nd case, acpi_memory_device_remove() will be called.
acpi_memory_device_remove() will also be called when we unbind the memory device from the driver acpi_memhotplug.

acpi_memory_disable_device() already implements the code which offlines memory and releases the acpi_memory_info struct. But acpi_memory_device_remove() does not implement it yet.

So the patch-set prepares the framework for hot removing memory and adds it to acpi_memory_device_remove(). It also prepares remove_memory(), but the function does nothing for now because we cannot support memory hot remove yet.
Re: [PATCH v4] create sun sysfs file
Hi Len,

Ping... Please merge the patch into your tree.

Thanks,
Yasuaki Ishimatsu

2012/09/24 11:31, Yasuaki Ishimatsu wrote:
Hi Len,

Ping... I want you to merge the patch into your tree for linux-3.7.

Thanks,
Yasuaki Ishimatsu

2012/08/30 10:34, Yasuaki Ishimatsu wrote:
Hi Len,

Three weeks have passed since I posted the patch. All comments have already been applied to it, and I think there are no further comments on it. So I want you to merge it into your tree.

Thanks,
Yasuaki Ishimatsu

2012/08/07 9:36, Yasuaki Ishimatsu wrote:
Even if a device has a _SUN method, there is no way to know the slot unique-ID. Thus the patch creates a "sun" file in sysfs so that we can recognize it.

Reviewed-by: Toshi Kani <toshi.k...@hp.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>

---
 drivers/acpi/scan.c     |   24
 include/acpi/acpi_bus.h |    1 +
 2 files changed, 25 insertions(+)

Index: linux-3.5/include/acpi/acpi_bus.h
===================================================================
--- linux-3.5.orig/include/acpi/acpi_bus.h	2012-07-30 10:06:49.722171575 +0900
+++ linux-3.5/include/acpi/acpi_bus.h	2012-08-07 08:57:45.678204360 +0900
@@ -209,6 +209,7 @@ struct acpi_device_pnp {
 	struct list_head ids;		/* _HID and _CIDs */
 	acpi_device_name device_name;	/* Driver-determined */
 	acpi_device_class device_class;	/* */
+	unsigned long sun;		/* _SUN */
 };

 #define acpi_device_bid(d)	((d)->pnp.bus_id)
Index: linux-3.5/drivers/acpi/scan.c
===================================================================
--- linux-3.5.orig/drivers/acpi/scan.c	2012-07-30 10:06:49.713171688 +0900
+++ linux-3.5/drivers/acpi/scan.c	2012-08-07 09:01:38.196203659 +0900
@@ -192,10 +192,20 @@ end:
 }
 static DEVICE_ATTR(path, 0444, acpi_device_path_show, NULL);

+static ssize_t
+acpi_device_sun_show(struct device *dev, struct device_attribute *attr,
+		     char *buf) {
+	struct acpi_device *acpi_dev = to_acpi_device(dev);
+
+	return sprintf(buf, "%lu\n", acpi_dev->pnp.sun);
+}
+static DEVICE_ATTR(sun, 0444, acpi_device_sun_show, NULL);
+
 static int acpi_device_setup_files(struct acpi_device *dev)
 {
 	acpi_status status;
 	acpi_handle temp;
+	unsigned long long sun;
 	int result = 0;

 	/*
@@ -217,6 +227,16 @@ static int acpi_device_setup_files(struc
 			goto end;
 	}

+	status = acpi_evaluate_integer(dev->handle, "_SUN", NULL, &sun);
+	if (ACPI_SUCCESS(status)) {
+		dev->pnp.sun = (unsigned long)sun;
+		result = device_create_file(&dev->dev, &dev_attr_sun);
+		if (result)
+			goto end;
+	} else {
+		dev->pnp.sun = (unsigned long)-1;
+	}
+
 	/*
 	 * If device has _EJ0, 'eject' file is created that is used to trigger
 	 * hot-removal function from userland.
@@ -241,6 +261,10 @@ static void acpi_device_remove_files(str
 	if (ACPI_SUCCESS(status))
 		device_remove_file(&dev->dev, &dev_attr_eject);

+	status = acpi_get_handle(dev->handle, "_SUN", &temp);
+	if (ACPI_SUCCESS(status))
+		device_remove_file(&dev->dev, &dev_attr_sun);
+
 	device_remove_file(&dev->dev, &dev_attr_modalias);
 	device_remove_file(&dev->dev, &dev_attr_hid);
 	if (dev->handle)

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/4] acpi,memory-hotplug : add memory offline code to acpi_memory_device_remove()
From: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>

The memory device can be removed in 2 ways:
1. send an eject request by SCI
2. echo 1 > /sys/bus/pci/devices/PNP0C80:XX/eject

In the 1st case, acpi_memory_disable_device() will be called.
In the 2nd case, acpi_memory_device_remove() will be called.
acpi_memory_device_remove() will also be called when we unbind the memory device from the driver acpi_memhotplug.

acpi_memory_disable_device() already implements the code which offlines memory and releases the acpi_memory_info struct. But acpi_memory_device_remove() does not implement it yet.

So the patch implements acpi_memory_remove_memory() for offlining memory and releasing the acpi_memory_info struct. It is used by both acpi_memory_device_remove() and acpi_memory_disable_device().

Additionally, if the type is ACPI_BUS_REMOVAL_EJECT in acpi_memory_device_remove(), it means that the user wants to eject the memory device. In this case, acpi_memory_device_remove() calls acpi_memory_remove_memory().

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
---
 drivers/acpi/acpi_memhotplug.c |   44 +++--
 1 file changed, 34 insertions(+), 10 deletions(-)

Index: linux-3.6/drivers/acpi/acpi_memhotplug.c
===================================================================
--- linux-3.6.orig/drivers/acpi/acpi_memhotplug.c	2012-10-03 18:55:33.386378909 +0900
+++ linux-3.6/drivers/acpi/acpi_memhotplug.c	2012-10-03 18:55:58.624380688 +0900
@@ -306,24 +306,37 @@ static int acpi_memory_powerdown_device(
 	return 0;
 }

-static int acpi_memory_disable_device(struct acpi_memory_device *mem_device)
+static int acpi_memory_remove_memory(struct acpi_memory_device *mem_device)
 {
 	int result;
 	struct acpi_memory_info *info, *n;

+	list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
+		if (!info->enabled)
+			return -EBUSY;
+
+		result = remove_memory(info->start_addr, info->length);
+		if (result)
+			return result;
+
+		list_del(&info->list);
+		kfree(info);
+	}
+
+	return 0;
+}
+
+static int acpi_memory_disable_device(struct acpi_memory_device *mem_device)
+{
+	int result;

 	/*
 	 * Ask the VM to offline this memory range.
 	 * Note: Assume that this function returns zero on success
 	 */
-	list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
-		if (info->enabled) {
-			result = remove_memory(info->start_addr, info->length);
-			if (result)
-				return result;
-		}
-		kfree(info);
-	}
+	result = acpi_memory_remove_memory(mem_device);
+	if (result)
+		return result;

 	/* Power-off and eject the device */
 	result = acpi_memory_powerdown_device(mem_device);
@@ -473,12 +486,23 @@ static int acpi_memory_device_add(struct
 static int acpi_memory_device_remove(struct acpi_device *device, int type)
 {
 	struct acpi_memory_device *mem_device = NULL;
-
+	int result;

 	if (!device || !acpi_driver_data(device))
 		return -EINVAL;

 	mem_device = acpi_driver_data(device);
+
+	if (type == ACPI_BUS_REMOVAL_EJECT) {
+		/*
+		 * offline and remove memory only when the memory device is
+		 * ejected.
+		 */
+		result = acpi_memory_remove_memory(mem_device);
+		if (result)
+			return result;
+	}
+
 	kfree(mem_device);

 	return 0;
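The loop in acpi_memory_remove_memory() relies on the "safe" iteration idiom: the successor is fetched before the current entry is unlinked and freed, and the loop stops at the first entry that cannot be removed, leaving that entry and everything after it on the list. Below is a minimal userspace sketch of the same idea, assuming a plain singly linked list instead of the kernel's list_head. struct mem_info, fake_remove_memory(), push() and list_len() are illustrative names, not kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Simplified stand-in for struct acpi_memory_info. */
struct mem_info {
	unsigned long start_addr;
	unsigned long length;
	int enabled;
	struct mem_info *next;
};

/* Hypothetical replacement for remove_memory(): fails for empty ranges. */
static int fake_remove_memory(unsigned long start, unsigned long length)
{
	(void)start;
	return length ? 0 : -EINVAL;
}

/* Prepend a new entry; aborts on allocation failure to keep the sketch short. */
static void push(struct mem_info **head, unsigned long start,
		 unsigned long length, int enabled)
{
	struct mem_info *info = malloc(sizeof(*info));

	if (!info)
		abort();
	info->start_addr = start;
	info->length = length;
	info->enabled = enabled;
	info->next = *head;
	*head = info;
}

/* Count the entries that are still on the list. */
static int list_len(const struct mem_info *head)
{
	int n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}

/*
 * Remove every range on the list. Returns 0 when the list was emptied, or a
 * negative errno at the first entry that could not be removed.
 */
static int remove_all(struct mem_info **head)
{
	struct mem_info *info = *head, *n;
	int result;

	while (info) {
		n = info->next;	/* fetch the successor before freeing info */

		if (!info->enabled)
			return -EBUSY;

		result = fake_remove_memory(info->start_addr, info->length);
		if (result)
			return result;

		*head = n;	/* unlink first, as list_del() does */
		free(info);
		info = n;
	}

	return 0;
}
```

One consequence of the early return is that a failed call leaves a partially processed list, so a caller can retry later without touching ranges that are already gone.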
[PATCH 2/4] acpi,memory-hotplug : rename remove_memory() to offline_memory()
From: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>

add_memory() hot adds physical memory. But remove_memory() does not hot remove physical memory; it only offlines memory. The name confuses us. So the patch renames remove_memory() to offline_memory(). We will use remove_memory() for hot removing memory.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
---
 drivers/acpi/acpi_memhotplug.c |    2 +-
 include/linux/memory_hotplug.h |    2 +-
 mm/memory_hotplug.c            |    6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

Index: linux-3.6/drivers/acpi/acpi_memhotplug.c
===================================================================
--- linux-3.6.orig/drivers/acpi/acpi_memhotplug.c	2012-10-03 18:17:29.291244669 +0900
+++ linux-3.6/drivers/acpi/acpi_memhotplug.c	2012-10-03 18:17:41.494247869 +0900
@@ -316,7 +316,7 @@ acpi_memory_remove_memory(struct acpi_me
 		if (!info->enabled)
 			return -EBUSY;

-		result = remove_memory(info->start_addr, info->length);
+		result = offline_memory(info->start_addr, info->length);
 		if (result)
 			return result;

Index: linux-3.6/include/linux/memory_hotplug.h
===================================================================
--- linux-3.6.orig/include/linux/memory_hotplug.h	2012-10-03 18:17:01.863247694 +0900
+++ linux-3.6/include/linux/memory_hotplug.h	2012-10-03 18:17:41.496247872 +0900
@@ -236,7 +236,7 @@ extern int add_memory(int nid, u64 start
 extern int arch_add_memory(int nid, u64 start, u64 size);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern int offline_memory_block(struct memory_block *mem);
-extern int remove_memory(u64 start, u64 size);
+extern int offline_memory(u64 start, u64 size);
 extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
 					int nr_pages);
 extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms);
Index: linux-3.6/mm/memory_hotplug.c
===================================================================
--- linux-3.6.orig/mm/memory_hotplug.c	2012-10-03 18:17:01.861247692 +0900
+++ linux-3.6/mm/memory_hotplug.c	2012-10-03 18:17:41.503247876 +0900
@@ -1003,7 +1003,7 @@ int offline_pages(unsigned long start_pf
 	return __offline_pages(start_pfn, start_pfn + nr_pages, 120 * HZ);
 }

-int remove_memory(u64 start, u64 size)
+int offline_memory(u64 start, u64 size)
 {
 	struct memory_block *mem = NULL;
 	struct mem_section *section;
@@ -1047,9 +1047,9 @@ int offline_pages(unsigned long start_pf
 {
 	return -EINVAL;
 }
-int remove_memory(u64 start, u64 size)
+int offline_memory(u64 start, u64 size)
 {
 	return -EINVAL;
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
-EXPORT_SYMBOL_GPL(remove_memory);
+EXPORT_SYMBOL_GPL(offline_memory);
[PATCH 3/6] acpi,memory-hotplug : add physical memory hotplug code to acpi_memhotplug.c
From: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>

For hot removing physical memory, the patch adds a call to remove_memory() in acpi_memory_remove_memory(). But we cannot support physical memory hot remove yet, so remove_memory() does nothing.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
---
 drivers/acpi/acpi_memhotplug.c |   10 ++
 include/linux/memory_hotplug.h |    5 +
 mm/memory_hotplug.c            |    7 +++
 3 files changed, 22 insertions(+)

Index: linux-3.6/drivers/acpi/acpi_memhotplug.c
===================================================================
--- linux-3.6.orig/drivers/acpi/acpi_memhotplug.c	2012-10-03 19:03:10.960400793 +0900
+++ linux-3.6/drivers/acpi/acpi_memhotplug.c	2012-10-03 19:03:26.818401966 +0900
@@ -310,6 +310,9 @@ static int acpi_memory_remove_memory(str
 {
 	int result;
 	struct acpi_memory_info *info, *n;
+	int node;
+
+	node = acpi_get_node(mem_device->device->handle);

 	list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
 		if (!info->enabled)
@@ -319,6 +322,13 @@ static int acpi_memory_remove_memory(str
 		if (result)
 			return result;

+		if (node < 0)
+			node = memory_add_physaddr_to_nid(info->start_addr);
+
+		result = remove_memory(node, info->start_addr, info->length);
+		if (result)
+			return result;
+
 		list_del(&info->list);
 		kfree(info);
 	}
Index: linux-3.6/include/linux/memory_hotplug.h
===================================================================
--- linux-3.6.orig/include/linux/memory_hotplug.h	2012-10-03 19:03:10.963400796 +0900
+++ linux-3.6/include/linux/memory_hotplug.h	2012-10-03 19:03:26.820401968 +0900
@@ -222,6 +222,7 @@ static inline void unlock_memory_hotplug
 #ifdef CONFIG_MEMORY_HOTREMOVE

 extern int is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
+extern int remove_memory(int nid, u64 start, u64 size);

 #else
 static inline int is_mem_section_removable(unsigned long pfn,
@@ -229,6 +230,10 @@ static inline int is_mem_section_removab
 {
 	return 0;
 }
+static inline int remove_memory(int nid, u64 start, u64 size)
+{
+	return -EBUSY;
+}
 #endif /* CONFIG_MEMORY_HOTREMOVE */

 extern int mem_online_node(int nid);
Index: linux-3.6/mm/memory_hotplug.c
===================================================================
--- linux-3.6.orig/mm/memory_hotplug.c	2012-10-03 19:03:10.962400795 +0900
+++ linux-3.6/mm/memory_hotplug.c	2012-10-03 19:04:15.493404911 +0900
@@ -1042,6 +1042,13 @@ int offline_memory(u64 start, u64 size)

 	return 0;
 }
+
+int remove_memory(int nid, u64 start, u64 size)
+{
+	/* It is not implemented yet*/
+	return 0;
+}
+EXPORT_SYMBOL_GPL(remove_memory);
 #else
 int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 {
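The header change in this patch follows a common pattern: when the feature is configured out, callers still compile against a static inline stub that reports an error. A small userspace sketch of that pattern is shown here, assuming a made-up switch SKETCH_HOTREMOVE in place of CONFIG_MEMORY_HOTREMOVE. The function names and the size check in the caller are illustrative only and not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * SKETCH_HOTREMOVE is a hypothetical compile-time switch standing in for
 * CONFIG_MEMORY_HOTREMOVE. It is left undefined here, so the stub is used.
 */
/* #define SKETCH_HOTREMOVE 1 */

#ifdef SKETCH_HOTREMOVE
/* Feature built in: placeholder that reports success, as in the patch. */
static int sketch_remove_memory(int nid, uint64_t start, uint64_t size)
{
	(void)nid;
	(void)start;
	(void)size;
	return 0;
}
#else
/* Feature configured out: callers still compile, but always get -EBUSY. */
static inline int sketch_remove_memory(int nid, uint64_t start, uint64_t size)
{
	(void)nid;
	(void)start;
	(void)size;
	return -EBUSY;
}
#endif

/* Caller that needs no #ifdef of its own; the size check is illustrative. */
static int sketch_remove_range(int nid, uint64_t start, uint64_t size)
{
	if (size == 0)
		return -EINVAL;
	return sketch_remove_memory(nid, start, size);
}
```

The benefit of the stub is that the caller in acpi_memhotplug.c does not need conditional compilation of its own.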
[PATCH 4/4] acpi,memory-hotplug : store the node id in acpi_memory_device
From: Wen Congyang <we...@cn.fujitsu.com>

The memory device has only one node id. Store the node id when enabling the memory device, so that we can reuse it when removing the memory device.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
CC: Paul Mackerras <pau...@samba.org>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
CC: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 drivers/acpi/acpi_memhotplug.c |   11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

Index: linux-3.6/drivers/acpi/acpi_memhotplug.c
===================================================================
--- linux-3.6.orig/drivers/acpi/acpi_memhotplug.c	2012-10-03 19:03:26.818401966 +0900
+++ linux-3.6/drivers/acpi/acpi_memhotplug.c	2012-10-03 19:08:38.804604700 +0900
@@ -83,6 +83,7 @@ struct acpi_memory_info {
 struct acpi_memory_device {
 	struct acpi_device * device;
 	unsigned int state;	/* State of the memory device */
+	int nid;
 	struct list_head res_list;
 };

@@ -256,6 +257,9 @@ static int acpi_memory_enable_device(str
 		info->enabled = 1;
 		num_enabled++;
 	}
+
+	mem_device->nid = node;
+
 	if (!num_enabled) {
 		printk(KERN_ERR PREFIX "add_memory failed\n");
 		mem_device->state = MEMORY_INVALID_STATE;
@@ -310,9 +314,7 @@ static int acpi_memory_remove_memory(str
 {
 	int result;
 	struct acpi_memory_info *info, *n;
-	int node;
-
-	node = acpi_get_node(mem_device->device->handle);
+	int node = mem_device->nid;

 	list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
 		if (!info->enabled)
@@ -322,9 +324,6 @@ static int acpi_memory_remove_memory(str
 		if (result)
 			return result;

-		if (node < 0)
-			node = memory_add_physaddr_to_nid(info->start_addr);
-
 		result = remove_memory(node, info->start_addr, info->length);
 		if (result)
 			return result;
[PATCH 0/2] acpi,memory-hotplug : remove memory device by acpi_bus_remove()
This patch-set was split off from the patch-set in the following thread:

https://lkml.org/lkml/2012/9/5/201

If you want to know the reason, please read the following thread:

https://lkml.org/lkml/2012/10/2/83

The patch-set exports acpi_bus_remove() so that an acpi device can be removed from the acpi bus at memory hot plug.
[PATCH 1/2] acpi,memory-hotplug : export the function acpi_bus_remove()
From: Wen Congyang <we...@cn.fujitsu.com>

The function acpi_bus_remove() can remove an acpi device from the acpi bus. When an acpi device is removed, we need to call this function to remove the acpi device from the acpi bus. So export this function.

CC: Len Brown <len.br...@intel.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
---
 drivers/acpi/scan.c     |    3 ++-
 include/acpi/acpi_bus.h |    1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

Index: linux-3.6/drivers/acpi/scan.c
===================================================================
--- linux-3.6.orig/drivers/acpi/scan.c	2012-10-03 18:16:57.206246798 +0900
+++ linux-3.6/drivers/acpi/scan.c	2012-10-03 18:17:49.974249714 +0900
@@ -1224,7 +1224,7 @@ static int acpi_device_set_context(struc
 	return -ENODEV;
 }

-static int acpi_bus_remove(struct acpi_device *dev, int rmdevice)
+int acpi_bus_remove(struct acpi_device *dev, int rmdevice)
 {
 	if (!dev)
 		return -EINVAL;
@@ -1246,6 +1246,7 @@ static int acpi_bus_remove(struct acpi_d

 	return 0;
 }
+EXPORT_SYMBOL(acpi_bus_remove);

 static int acpi_add_single_object(struct acpi_device **child,
 				  acpi_handle handle, int type,
Index: linux-3.6/include/acpi/acpi_bus.h
===================================================================
--- linux-3.6.orig/include/acpi/acpi_bus.h	2012-10-03 18:16:57.208246800 +0900
+++ linux-3.6/include/acpi/acpi_bus.h	2012-10-03 18:17:49.976249717 +0900
@@ -360,6 +360,7 @@ bool acpi_bus_power_manageable(acpi_hand
 bool acpi_bus_can_wakeup(acpi_handle handle);
 int acpi_power_resource_register_device(struct device *dev, acpi_handle handle);
 void acpi_power_resource_unregister_device(struct device *dev, acpi_handle handle);
+int acpi_bus_remove(struct acpi_device *dev, int rmdevice);
 #ifdef CONFIG_ACPI_PROC_EVENT
 int acpi_bus_generate_proc_event(struct acpi_device *device, u8 type, int data);
 int acpi_bus_generate_proc_event4(const char *class, const char *bid, u8 type, int data);
[PATCH 2/2] acpi,memory-hotplug : call acpi_bus_remove() to remove memory device
From: Wen Congyang <we...@cn.fujitsu.com>

The memory device has been ejected and powered off, so we can call acpi_bus_remove() to remove the memory device from the acpi bus.

CC: Len Brown <len.br...@intel.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
---
 drivers/acpi/acpi_memhotplug.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-3.6/drivers/acpi/acpi_memhotplug.c
===================================================================
--- linux-3.6.orig/drivers/acpi/acpi_memhotplug.c	2012-10-03 18:17:47.802249170 +0900
+++ linux-3.6/drivers/acpi/acpi_memhotplug.c	2012-10-03 18:17:52.471250299 +0900
@@ -424,8 +424,9 @@ static void acpi_memory_device_notify(ac
 		}

 		/*
-		 * TBD: Invoke acpi_bus_remove to cleanup data structures
+		 * Invoke acpi_bus_remove() to remove memory device
 		 */
+		acpi_bus_remove(device, 1);

 		/* _EJ0 succeeded; _OST is not necessary */
 		return;
Re: [RFC v9 PATCH 16/21] memory-hotplug: free memmap of sparse-vmemmap
Hi Chen,

Sorry for the late reply.

2012/10/02 13:21, Ni zhan Chen wrote:
On 09/05/2012 05:25 PM, we...@cn.fujitsu.com wrote:
From: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>

Not all pages of the virtual mapping in removed memory can be freed, since some pages used as PGD/PUD cover not only the removed memory but also other memory. So the patch checks whether a page can be freed or not.

How do we check whether a page can be freed or not?
1. When removing memory, the page structs of the removed memory are filled with 0xFD.
2. If all page structs on a PT/PMD are filled with 0xFD, the PT/PMD can be cleared. In this case, the page used as PT/PMD can be freed.

With this patch applied, the two versions of __remove_section() are integrated into one, so the __remove_section() of CONFIG_SPARSEMEM_VMEMMAP is deleted.

Note: vmemmap_kfree() and vmemmap_free_bootmem() are not implemented for ia64, ppc, s390, and sparc.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
CC: Paul Mackerras <pau...@samba.org>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
CC: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 arch/ia64/mm/discontig.c  |    8 +++
 arch/powerpc/mm/init_64.c |    8 +++
 arch/s390/mm/vmem.c       |    8 +++
 arch/sparc/mm/init_64.c   |    8 +++
 arch/x86/mm/init_64.c     |  119 +
 include/linux/mm.h        |    2 +
 mm/memory_hotplug.c       |   17 +--
 mm/sparse.c               |    5 +-
 8 files changed, 158 insertions(+), 17 deletions(-)

diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
index 33943db..0d23b69 100644
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -823,6 +823,14 @@ int __meminit vmemmap_populate(struct page *start_page,
 	return vmemmap_populate_basepages(start_page, size, node);
 }

+void vmemmap_kfree(struct page *memmap, unsigned long nr_pages)
+{
+}
+
+void vmemmap_free_bootmem(struct page *memmap, unsigned long nr_pages)
+{
+}
+
 void register_page_bootmem_memmap(unsigned long section_nr,
 				  struct page *start_page, unsigned long size)
 {
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 3690c44..835a2b3 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -299,6 +299,14 @@ int __meminit vmemmap_populate(struct page *start_page,
 	return 0;
 }

+void vmemmap_kfree(struct page *memmap, unsigned long nr_pages)
+{
+}
+
+void vmemmap_free_bootmem(struct page *memmap, unsigned long nr_pages)
+{
+}
+
 void register_page_bootmem_memmap(unsigned long section_nr,
 				  struct page *start_page, unsigned long size)
 {
diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
index eda55cd..4b42b0b 100644
--- a/arch/s390/mm/vmem.c
+++ b/arch/s390/mm/vmem.c
@@ -227,6 +227,14 @@ out:
 	return ret;
 }

+void vmemmap_kfree(struct page *memmap, unsigned long nr_pages)
+{
+}
+
+void vmemmap_free_bootmem(struct page *memmap, unsigned long nr_pages)
+{
+}
+
 void register_page_bootmem_memmap(unsigned long section_nr,
 				  struct page *start_page, unsigned long size)
 {
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index add1cc7..1384826 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2078,6 +2078,14 @@ void __meminit vmemmap_populate_print_last(void)
 	}
 }

+void vmemmap_kfree(struct page *memmap, unsigned long nr_pages)
+{
+}
+
+void vmemmap_free_bootmem(struct page *memmap, unsigned long nr_pages)
+{
+}
+
 void register_page_bootmem_memmap(unsigned long section_nr,
 				  struct page *start_page, unsigned long size)
 {
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 0075592..4e8f8a4 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1138,6 +1138,125 @@ vmemmap_populate(struct page *start_page, unsigned long size, int node)
 	return 0;
 }

+#define PAGE_INUSE 0xFD
+
+unsigned long find_and_clear_pte_page(unsigned long addr, unsigned long end,
+			    struct page **pp, int *page_size)
+{
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+	void *page_addr;
+	unsigned long next;
+
+	*pp = NULL;
+
+	pgd = pgd_offset_k(addr);
+	if (pgd_none(*pgd))
+		return pgd_addr_end(addr, end);
+
+	pud = pud_offset(pgd, addr);
+	if (pud_none(*pud))
+		return pud_addr_end(addr, end);
+
+	if (!cpu_has_pse) {
+		next = (addr + PAGE_SIZE) & PAGE_MASK;
+		pmd = pmd_offset(pud, addr);
+		if (pmd_none(*pmd))
+			return next;
+
+		pte = pte_offset_kernel(pmd, addr);
+		if (pte_none(*pte))
+			return next;
+
+		*page_size = PAGE_SIZE;
+		*pp = pte_page
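The 0xFD check described in the changelog above can be sketched in userspace as follows. This is only an illustration of the marking idea under stated assumptions, not the x86 implementation: SKETCH_PAGE_SIZE, mark_removed() and page_fully_removed() are made-up names, and a plain byte buffer stands in for the page that holds the page structs. Only PAGE_INUSE and its value come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_INUSE 0xFD		/* marker byte, named as in the patch */
#define SKETCH_PAGE_SIZE 4096	/* assumed page size for the sketch */

/* Fill [off, off + len) of a backing page with the marker byte. */
static void mark_removed(unsigned char *page, size_t off, size_t len)
{
	memset(page + off, PAGE_INUSE, len);
}

/* Return 1 only when every byte of the page carries the marker. */
static int page_fully_removed(const unsigned char *page)
{
	size_t i;

	for (i = 0; i < SKETCH_PAGE_SIZE; i++)
		if (page[i] != PAGE_INUSE)
			return 0;
	return 1;
}
```

A page that also backs page structs of memory that stays in the system is never fully marked, so it is kept. Only a page whose whole content has been marked is reported as freeable.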
Re: [PATCH 2/4] memory-hotplug: add node_device_release
Hi Kosaki-san, 2012/10/02 3:12, KOSAKI Motohiro wrote: On Mon, Oct 1, 2012 at 2:54 AM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: Hi Kosaki-san, 2012/09/29 7:19, KOSAKI Motohiro wrote: I don't understand it. How can we get rid of the warning? See cpu_device_release() for example. If we implement a function like cpu_device_release(), the warning disappears. But the comment says in the function Never copy this way So I think it is illegal way. What does illegal mean? The illegal means the code should not be mimicked. You still haven't explain any benefit of your code. If there is zero benefit, just kill it. I believe everybody think so. Again, Which benefit do you have? The patch has a benefit to delets a warning message. Why do we need this node_device_release() implementation? I think that this is a manner of releasing object related kobject. No. Usually we never call memset() from release callback. What we want to release is a part of array, not a pointer. Therefore, there is only this way instead of kfree(). Why? Before your patch, we don't have memset() and did work it. If we does not apply the patch, a warning message is shown. So I think it did not work well. I can't understand what mean only way. For deleting a warning message, I created a node_device_release(). In the manner of releasing kobject, the function frees a object related to the kobject. So most functions calls kfree() for releasing it. In node_device_release(), we need to free a node struct. If the node struct is pointer, I can free it by kfree. But the node struct is a part of node_devices[] array. I cannot free it. So I filled the node struct with 0. But you think it is not good. Do you have a good solution? Do nothing. just add empty release function and kill a warning. Obviously do nothing can't make any performance drop nor any side effect. meaningless memset() is just silly from point of cache pollution view. I have the reason to have to fill the node struct with 0 by memset. 
The node is a part of the node struct array (node_devices[]). If we add an empty release function to suppress the warning, some data remains in the node struct after hot-removing memory. So if we hot-add the memory again, the node struct is reused by register_one_node(). But the node struct still has some data, because it was not initialized with 0. As a result, more warnings are shown because of the remaining data when hot-adding memory, as follows:

[ 374.037710] kobject (82c15718): tried to init an initialized object, something is seriously wrong.
[ 374.153169] Pid: 4, comm: kworker/0:0 Tainted: G W 3.6.0 #5
[ 374.230279] Call Trace:
[ 374.259647] [8133cf39] kobject_init+0x89/0xa0
[ 374.323286] [8143632c] device_initialize+0x2c/0xc0
[ 374.392086] [814376a6] device_register+0x16/0x30
[ 374.458856] [81449b15] register_node+0x25/0xe0
[ 374.523434] [8144a057] register_one_node+0x67/0x140
[ 374.593306] [81652e40] add_memory+0x100/0x1f0
[ 374.656961] [a00a31c6] acpi_memory_enable_device+0x92/0xdf [acpi_memhotplug]
[ 374.752811] [a00a3753] acpi_memory_device_add+0x10d/0x116 [acpi_memhotplug]
[ 374.847622] [813a4376] acpi_device_probe+0x50/0x18a
[ 374.917504] [8124b053] ? sysfs_create_link+0x13/0x20
[ 374.988426] [81439d7c] really_probe+0x6c/0x320
[ 375.053061] [8143a077] driver_probe_device+0x47/0xa0
[ 375.123922] [8143a180] ? __driver_attach+0xb0/0xb0
[ 375.192709] [8143a180] ? __driver_attach+0xb0/0xb0
[ 375.261494] [8143a1d3] __device_attach+0x53/0x60
[ 375.328206] [81437dac] bus_for_each_drv+0x6c/0xa0
[ 375.395950] [81439cf8] device_attach+0xa8/0xc0
[ 375.460578] [814389d0] bus_probe_device+0xb0/0xe0
[ 375.528318] [81437421] device_add+0x301/0x570
[ 375.591883] [814376ae] device_register+0x1e/0x30
[ 375.658568] [813a56bf] acpi_device_register+0x1af/0x2bf
[ 375.732590] [813a59ae] acpi_add_single_object+0x1df/0x2b9
[ 375.808640] [813ce320] ? acpi_ut_release_mutex+0xac/0xb5
[ 375.883646] [813a5b93] acpi_bus_check_add+0x10b/0x166
[ 375.955529] [810db4ad] ? trace_hardirqs_on+0xd/0x10
[ 376.025327] [8109fa4f] ? up+0x2f/0x50
[ 376.080639] [813a0cc1] ? acpi_os_signal_semaphore+0x6b/0x74
[ 376.158792] [813c3f1d] acpi_ns_walk_namespace+0xbe/0x17d
[ 376.233854] [813a5a88] ? acpi_add_single_object+0x2b9/0x2b9
[ 376.312012] [813a5a88] ? acpi_add_single_object+0x2b9/0x2b9
[ 376.390162] [813c43b3] acpi_walk_namespace+0x8a/0xc4
[ 376.461051] [813a5c49] acpi_bus_scan+0x5b/0x7c
[ 376.525707] [813a5cd6] acpi_bus_add+0x2a/0x2c
[ 376.589344] [813d5dc6] container_notify_cb+0x103/0x18d
[ 376.662309] [813b3946] acpi_ev_notify_dispatch+0x41/0x5f
[ 376.737386
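The disagreement in this message comes down to what happens when a statically allocated object is registered a second time. The following is a minimal userspace sketch, not kernel code: `struct obj_sketch`, `obj_init()` and the two release variants are hypothetical stand-ins. It assumes only that initialization refuses an object whose "initialized" flag is still set, which is what the kobject warning in the log reports.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a device embedded in a static array slot. */
struct obj_sketch {
	int state_initialized;	/* set by obj_init(), like a kobject's flag */
	int data;
};

/* Returns 0 on success, -1 if the slot still looks initialized. */
static int obj_init(struct obj_sketch *obj)
{
	if (obj->state_initialized) {
		fprintf(stderr, "tried to init an initialized object\n");
		return -1;
	}
	obj->state_initialized = 1;
	return 0;
}

/* Variant 1: empty release; the slot keeps its stale contents. */
static void release_empty(struct obj_sketch *obj)
{
	(void)obj;
}

/* Variant 2: release clears the slot so that it can be reused. */
static void release_zeroing(struct obj_sketch *obj)
{
	memset(obj, 0, sizeof(*obj));
}
```

With the empty variant, a second `obj_init()` on the same slot fails; with the zeroing variant it succeeds. Whether the clearing belongs in the release callback or in the code that reuses the slot is the design question the thread is arguing about, and this sketch does not settle it.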
[PATCH 0/10] memory-hotplug: hot-remove physical memory
This patch-set was split out from the patch-set in the following thread:
https://lkml.org/lkml/2012/9/5/201

If you want to know the reason, please read the following thread:
https://lkml.org/lkml/2012/10/2/83

The patch-set contains only the kernel core side of physical memory hot remove. So if you use the patches, please also apply the following patches.

- bug fix for memory hot remove
  https://lkml.org/lkml/2012/9/27/39
  https://lkml.org/lkml/2012/10/2/83
  http://www.spinics.net/lists/linux-mm/msg42982.html
- acpi framework
  https://lkml.org/lkml/2012/10/3/126
  https://lkml.org/lkml/2012/10/3/641

The patches can free/remove the following things:

- /sys/firmware/memmap/X/{end, start, type} : [PATCH 2/10]
- mem_section and related sysfs files : [PATCH 3-4/10]
- memmap of sparse-vmemmap : [PATCH 5-7/10]
- page table of removed memory : [RFC PATCH 8/10]
- node and related sysfs files : [RFC PATCH 9-10/10]

* [PATCH 1/10] checks whether the memory can be removed or not.

If you find that some function needed for physical memory hot-remove is missing, please let me know.

How to test this patchset?
1. Apply this patchset and build the kernel. MEMORY_HOTPLUG, MEMORY_HOTREMOVE, ACPI_HOTPLUG_MEMORY must be selected.
2. Load the module acpi_memhotplug.
3. Hotplug the memory device (it depends on your hardware). You will see the memory device under the directory /sys/bus/acpi/devices/. Its name is PNP0C80:XX.
4. Online/offline pages provided by this memory device. You can write online/offline to /sys/devices/system/memory/memoryX/state to online/offline pages provided by this memory device.
5. Hotremove the memory device. You can hotremove the memory device by the hardware, or by writing 1 to /sys/bus/acpi/devices/PNP0C80:XX/eject.

Note: if the memory provided by the memory device is used by the kernel, it can't be offlined. It is not a bug.

Known problems:
1. Memory can't be offlined when CONFIG_MEMCG is selected.
   For example: there is a memory device on node 1. The address range is [1G, 1.5G). You will find 4 new directories memory8, memory9, memory10, and memory11 under the directory /sys/devices/system/memory/.
   If CONFIG_MEMCG is selected, we allocate memory to store the page cgroup when we online pages. When we online memory8, the memory that stores its page cgroup is not provided by this memory device. But when we online memory9, the memory that stores its page cgroup may be provided by memory8. So we can't offline memory8 now. We should offline the memory in the reverse order.
   When the memory device is hotremoved, we automatically offline the memory provided by this memory device. But we don't know which memory was onlined first, so offlining memory may fail. In such a case, you should offline the memory by hand before hotremoving the memory device.
2. Hotremoving a memory device may cause a kernel panic.
   This bug will be fixed by Liu Jiang's patch: https://lkml.org/lkml/2012/7/3/1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
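Test step 4 and known problem 1 in the cover letter above can be sketched as a small userspace helper. This is only an illustration: `set_block_state()` and `offline_blocks_reverse()` are hypothetical names, the directory is passed in as a parameter (on a real system it would be /sys/devices/system/memory), and the reverse order assumes the blocks were onlined in ascending order.

```c
#include <stdio.h>

/*
 * Write a state string ("online" or "offline") to one memory block's
 * state file. Returns 0 on success, -1 on failure.
 */
static int set_block_state(const char *sysfs_dir, int block, const char *state)
{
	char path[256];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/memory%d/state", sysfs_dir, block);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(state, f) == EOF) {
		fclose(f);
		return -1;
	}
	/* The write reaches the file when the stream is flushed on close. */
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Offline blocks first..last from the highest number down, stopping at
 * the first failure. Returns the number of blocks offlined.
 */
static int offline_blocks_reverse(const char *sysfs_dir, int first, int last)
{
	int done = 0;

	for (int b = last; b >= first; b--) {
		if (set_block_state(sysfs_dir, b, "offline") != 0)
			break;
		done++;
	}
	return done;
}
```

For the example device above, `offline_blocks_reverse("/sys/devices/system/memory", 8, 11)` would try memory11 first and memory8 last.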
[PATCH 1/10] memory-hotplug : check whether memory is offline or not when removing memory
When calling remove_memory(), the memory should be offline. If the function is used on online memory, a kernel panic may occur. So the patch checks whether the memory is offline or not.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 drivers/base/memory.c  |   39 +++
 include/linux/memory.h |    5 +
 mm/memory_hotplug.c    |   17 +++--
 3 files changed, 59 insertions(+), 2 deletions(-)

Index: linux-3.6/drivers/base/memory.c
===================================================================
--- linux-3.6.orig/drivers/base/memory.c	2012-10-04 14:22:57.0 +0900
+++ linux-3.6/drivers/base/memory.c	2012-10-04 14:45:46.653585860 +0900
@@ -70,6 +70,45 @@ void unregister_memory_isolate_notifier(
 }
 EXPORT_SYMBOL(unregister_memory_isolate_notifier);
 
+bool is_memblk_offline(unsigned long start, unsigned long size)
+{
+	struct memory_block *mem = NULL;
+	struct mem_section *section;
+	unsigned long start_pfn, end_pfn;
+	unsigned long pfn, section_nr;
+
+	start_pfn = PFN_DOWN(start);
+	end_pfn = PFN_UP(start + size);
+
+	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+		section_nr = pfn_to_section_nr(pfn);
+		if (!present_section_nr(section_nr))
+			continue;
+
+		section = __nr_to_section(section_nr);
+		/* same memblock? */
+		if (mem)
+			if ((section_nr >= mem->start_section_nr) &&
+			    (section_nr <= mem->end_section_nr))
+				continue;
+
+		mem = find_memory_block_hinted(section, mem);
+		if (!mem)
+			continue;
+		if (mem->state == MEM_OFFLINE)
+			continue;
+
+		kobject_put(&mem->dev.kobj);
+		return false;
+	}
+
+	if (mem)
+		kobject_put(&mem->dev.kobj);
+
+	return true;
+}
+EXPORT_SYMBOL(is_memblk_offline);
+
 /*
  * register_memory - Setup a sysfs device for a memory block
  */
Index: linux-3.6/include/linux/memory.h
===================================================================
--- linux-3.6.orig/include/linux/memory.h	2012-10-02 18:00:22.0 +0900
+++ linux-3.6/include/linux/memory.h	2012-10-04 14:44:40.902581028 +0900
@@ -106,6 +106,10 @@ static inline int memory_isolate_notify(
 {
 	return 0;
 }
+static inline bool is_memblk_offline(unsigned long start, unsigned long size)
+{
+	return false;
+}
 #else
 extern int register_memory_notifier(struct notifier_block *nb);
 extern void unregister_memory_notifier(struct notifier_block *nb);
@@ -120,6 +124,7 @@ extern int memory_isolate_notify(unsigne
 extern struct memory_block *find_memory_block_hinted(struct mem_section *,
 							struct memory_block *);
 extern struct memory_block *find_memory_block(struct mem_section *);
+extern bool is_memblk_offline(unsigned long start, unsigned long size);
 #define CONFIG_MEM_BLOCK_SIZE	(PAGES_PER_SECTION<<PAGE_SHIFT)
 enum mem_add_context { BOOT, HOTPLUG };
 #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
Index: linux-3.6/mm/memory_hotplug.c
===================================================================
--- linux-3.6.orig/mm/memory_hotplug.c	2012-10-04 14:31:08.0 +0900
+++ linux-3.6/mm/memory_hotplug.c	2012-10-04 14:58:22.449687986 +0900
@@ -1045,8 +1045,21 @@ int offline_memory(u64 start, u64 size)
 
 int remove_memory(int nid, u64 start, u64 size)
 {
-	/* It is not implemented yet*/
-	return 0;
+	int ret = 0;
+	lock_memory_hotplug();
+	/*
+	 * The memory might become online by other task, even if you offine it.
+	 * So we check whether the memory has been onlined or not.
+	 */
+	if (!is_memblk_offline(start, size)) {
+		pr_warn("memory removing [mem %#010llx-%#010llx] failed, "
+			"because the memmory range is online\n",
+			start, start + size);
+		ret = -EAGAIN;
+	}
+
+	unlock_memory_hotplug();
+	return ret;
 }
 EXPORT_SYMBOL_GPL(remove_memory);
 #else
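The range walk in is_memblk_offline() can be modelled in userspace to see how it treats blocks that span several sections. The sketch below is an illustration only: the `sk_` names, the 4 KiB page size and the 128 MiB section size are assumptions, and a plain array lookup replaces find_memory_block_hinted() and its reference counting.

```c
#include <stdbool.h>
#include <stddef.h>

#define SK_PAGE_SHIFT		12ULL		/* assumed 4 KiB pages */
#define SK_PAGES_PER_SECTION	32768ULL	/* assumed 128 MiB sections */

enum sk_state { SK_OFFLINE, SK_ONLINE };

/* Hypothetical stand-in for struct memory_block. */
struct sk_block {
	unsigned long long start_section_nr;
	unsigned long long end_section_nr;
	enum sk_state state;
};

/* Linear lookup of the block that contains section nr, or NULL. */
static const struct sk_block *
sk_find_block(const struct sk_block *blks, size_t n, unsigned long long nr)
{
	for (size_t i = 0; i < n; i++)
		if (nr >= blks[i].start_section_nr &&
		    nr <= blks[i].end_section_nr)
			return &blks[i];
	return NULL;
}

/* True if every known block overlapping [start, start + size) is offline. */
static bool sk_range_is_offline(const struct sk_block *blks, size_t n,
				unsigned long long start,
				unsigned long long size)
{
	unsigned long long start_pfn = start >> SK_PAGE_SHIFT;
	unsigned long long end_pfn =
		(start + size + (1ULL << SK_PAGE_SHIFT) - 1) >> SK_PAGE_SHIFT;
	const struct sk_block *mem = NULL;

	for (unsigned long long pfn = start_pfn; pfn < end_pfn;
	     pfn += SK_PAGES_PER_SECTION) {
		unsigned long long nr = pfn / SK_PAGES_PER_SECTION;

		/* Same block as the previous section: already checked. */
		if (mem && nr >= mem->start_section_nr &&
		    nr <= mem->end_section_nr)
			continue;

		mem = sk_find_block(blks, n, nr);
		if (!mem)
			continue;	/* no block covers this section */
		if (mem->state == SK_ONLINE)
			return false;
	}
	return true;
}
```

With these assumed sizes, the address 1G falls in section 8, so the [1G, 1.5G) device from the cover letter maps to sections 8 through 11; one online block anywhere in that range makes the whole check fail.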
[PATCH 2/10] memory-hotplug : remove /sys/firmware/memmap/X sysfs
When (hot)adding memory into the system, the /sys/firmware/memmap/X/{end, start, type} sysfs files are created. But there is no code to remove these files. The patch implements the function to remove them.

Note : The code does not free a firmware_map_entry that was allocated by bootmem, since there is no way to free memory which is allocated by bootmem.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 drivers/firmware/memmap.c    |   98 ++-
 include/linux/firmware-map.h |    6 ++
 mm/memory_hotplug.c          |    7 ++-
 3 files changed, 108 insertions(+), 3 deletions(-)

Index: linux-3.6/drivers/firmware/memmap.c
===================================================================
--- linux-3.6.orig/drivers/firmware/memmap.c	2012-10-04 18:27:05.195500420 +0900
+++ linux-3.6/drivers/firmware/memmap.c	2012-10-04 18:27:18.901514330 +0900
@@ -21,6 +21,7 @@
 #include <linux/types.h>
 #include <linux/bootmem.h>
 #include <linux/slab.h>
+#include <linux/mm.h>
 
 /*
  * Data types --
@@ -41,6 +42,7 @@ struct firmware_map_entry {
 	const char		*type;	/* type of the memory range */
 	struct list_head	list;	/* entry for the linked list */
 	struct kobject		kobj;   /* kobject for each entry */
+	unsigned int		bootmem:1; /* allocated from bootmem */
 };
 
 /*
@@ -79,7 +81,26 @@ static const struct sysfs_ops memmap_att
 	.show = memmap_attr_show,
 };
 
+
+static inline struct firmware_map_entry *
+to_memmap_entry(struct kobject *kobj)
+{
+	return container_of(kobj, struct firmware_map_entry, kobj);
+}
+
+static void release_firmware_map_entry(struct kobject *kobj)
+{
+	struct firmware_map_entry *entry = to_memmap_entry(kobj);
+
+	if (entry->bootmem)
+		/* There is no way to free memory allocated from bootmem */
+		return;
+
+	kfree(entry);
+}
+
 static struct kobj_type memmap_ktype = {
+	.release	= release_firmware_map_entry,
 	.sysfs_ops	= &memmap_attr_ops,
 	.default_attrs	= def_attrs,
 };
@@ -94,6 +115,7 @@ static struct kobj_type memmap_ktype = {
  * in firmware initialisation code in one single thread of execution.
  */
 static LIST_HEAD(map_entries);
+static DEFINE_SPINLOCK(map_entries_lock);
 
 /**
  * firmware_map_add_entry() - Does the real work to add a firmware memmap entry.
@@ -118,11 +140,25 @@ static int firmware_map_add_entry(u64 st
 	INIT_LIST_HEAD(&entry->list);
 	kobject_init(&entry->kobj, &memmap_ktype);
 
+	spin_lock(&map_entries_lock);
 	list_add_tail(&entry->list, &map_entries);
+	spin_unlock(&map_entries_lock);
 
 	return 0;
 }
 
+/**
+ * firmware_map_remove_entry() - Does the real work to remove a firmware
+ * memmap entry.
+ * @entry: removed entry.
+ **/
+static inline void firmware_map_remove_entry(struct firmware_map_entry *entry)
+{
+	spin_lock(&map_entries_lock);
+	list_del(&entry->list);
+	spin_unlock(&map_entries_lock);
+}
+
 /*
  * Add memmap entry on sysfs
  */
@@ -144,6 +180,35 @@ static int add_sysfs_fw_map_entry(struct
 	return 0;
 }
 
+/*
+ * Remove memmap entry on sysfs
+ */
+static inline void remove_sysfs_fw_map_entry(struct firmware_map_entry *entry)
+{
+	kobject_put(&entry->kobj);
+}
+
+/*
+ * Search memmap entry
+ */
+
+static struct firmware_map_entry * __meminit
+firmware_map_find_entry(u64 start, u64 end, const char *type)
+{
+	struct firmware_map_entry *entry;
+
+	spin_lock(&map_entries_lock);
+	list_for_each_entry(entry, &map_entries, list)
+		if ((entry->start == start) && (entry->end == end) &&
+		    (!strcmp(entry->type, type))) {
+			spin_unlock(&map_entries_lock);
+			return entry;
+		}
+
+	spin_unlock(&map_entries_lock);
+	return NULL;
+}
+
 /**
  * firmware_map_add_hotplug() - Adds a firmware mapping entry when we do
  * memory hotplug.
@@ -193,9 +258,36 @@ int __init firmware_map_add_early(u64 st
 	if (WARN_ON(!entry))
 		return -ENOMEM;
 
+	entry->bootmem = 1;
 	return firmware_map_add_entry(start, end, type, entry);
 }
 
+/**
+ * firmware_map_remove() - remove a firmware mapping entry
+ * @start: Start of the memory range.
+ * @end:   End of the memory range.
+ * @type:  Type of the memory range.
+ *
+ * removes a firmware mapping entry.
+ *
+ * Returns 0 on success, or -EINVAL if no entry.
+ **/
+int __meminit firmware_map_remove(u64 start, u64 end, const char *type)
+{
+	struct firmware_map_entry *entry;
+
+	entry
[PATCH 3/10] memory-hotplug : introduce new function arch_remove_memory() for removing page table depends on architecture
From: Wen Congyang <we...@cn.fujitsu.com>

To remove memory, we need to remove its page table, but how to do that depends on the architecture. So the patch introduces arch_remove_memory() for removing the page table. For now it only calls __remove_pages().

Note: __remove_pages() is not implemented for some architectures (I don't know how to implement it for s390).

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
CC: Paul Mackerras <pau...@samba.org>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
CC: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
---
 arch/ia64/mm/init.c            |   18 ++
 arch/powerpc/mm/mem.c          |   12
 arch/s390/mm/init.c            |   12
 arch/sh/mm/init.c              |   17 +
 arch/tile/mm/init.c            |    8
 arch/x86/mm/init_32.c          |   12
 arch/x86/mm/init_64.c          |   15 +++
 include/linux/memory_hotplug.h |    1 +
 mm/memory_hotplug.c            |    1 +
 9 files changed, 96 insertions(+)

Index: linux-3.6/arch/ia64/mm/init.c
===================================================================
--- linux-3.6.orig/arch/ia64/mm/init.c	2012-10-04 18:27:03.082498276 +0900
+++ linux-3.6/arch/ia64/mm/init.c	2012-10-04 18:28:50.087606867 +0900
@@ -688,6 +688,24 @@ int arch_add_memory(int nid, u64 start,
 
 	return ret;
 }
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+int arch_remove_memory(u64 start, u64 size)
+{
+	unsigned long start_pfn = start >> PAGE_SHIFT;
+	unsigned long nr_pages = size >> PAGE_SHIFT;
+	struct zone *zone;
+	int ret;
+
+	zone = page_zone(pfn_to_page(start_pfn));
+	ret = __remove_pages(zone, start_pfn, nr_pages);
+	if (ret)
+		pr_warn("%s: Problem encountered in __remove_pages() as"
+			" ret=%d\n", __func__,  ret);
+
+	return ret;
+}
+#endif
 #endif
 
 /*
Index: linux-3.6/arch/powerpc/mm/mem.c
===================================================================
--- linux-3.6.orig/arch/powerpc/mm/mem.c	2012-10-04 18:27:03.084498278 +0900
+++ linux-3.6/arch/powerpc/mm/mem.c	2012-10-04 18:28:50.094606874 +0900
@@ -133,6 +133,18 @@ int arch_add_memory(int nid, u64 start,
 
 	return __add_pages(nid, zone, start_pfn, nr_pages);
 }
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+int arch_remove_memory(u64 start, u64 size)
+{
+	unsigned long start_pfn = start >> PAGE_SHIFT;
+	unsigned long nr_pages = size >> PAGE_SHIFT;
+	struct zone *zone;
+
+	zone = page_zone(pfn_to_page(start_pfn));
+	return __remove_pages(zone, start_pfn, nr_pages);
+}
+#endif
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
 /*
Index: linux-3.6/arch/s390/mm/init.c
===================================================================
--- linux-3.6.orig/arch/s390/mm/init.c	2012-10-04 18:27:03.080498274 +0900
+++ linux-3.6/arch/s390/mm/init.c	2012-10-04 18:28:50.104606884 +0900
@@ -257,4 +257,16 @@ int arch_add_memory(int nid, u64 start,
 		vmem_remove_mapping(start, size);
 	return rc;
 }
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+int arch_remove_memory(u64 start, u64 size)
+{
+	/*
+	 * There is no hardware or firmware interface which could trigger a
+	 * hot memory remove on s390. So there is nothing that needs to be
+	 * implemented.
+	 */
+	return -EBUSY;
+}
+#endif
 #endif /* CONFIG_MEMORY_HOTPLUG */
Index: linux-3.6/arch/sh/mm/init.c
===================================================================
--- linux-3.6.orig/arch/sh/mm/init.c	2012-10-04 18:27:03.091498285 +0900
+++ linux-3.6/arch/sh/mm/init.c	2012-10-04 18:28:50.116606897 +0900
@@ -558,4 +558,21 @@ int memory_add_physaddr_to_nid(u64 addr)
 EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
 #endif
 
+#ifdef CONFIG_MEMORY_HOTREMOVE
+int arch_remove_memory(u64 start, u64 size)
+{
+	unsigned long start_pfn = start >> PAGE_SHIFT;
+	unsigned long nr_pages = size >> PAGE_SHIFT;
+	struct zone *zone;
+	int ret;
+
+	zone = page_zone(pfn_to_page(start_pfn));
+	ret = __remove_pages(zone, start_pfn, nr_pages);
+	if (unlikely(ret))
+		pr_warn("%s: Failed, __remove_pages() == %d\n", __func__,
+			ret);
+
+	return ret;
+}
+#endif
 #endif /* CONFIG_MEMORY_HOTPLUG */
Index: linux-3.6/arch/tile/mm/init.c
===================================================================
--- linux-3.6.orig/arch/tile/mm/init.c	2012-10-04 18:27:03.078498272 +0900
+++ linux-3.6/arch/tile/mm/init.c	2012-10-04 18:28:50.122606903 +0900
@@ -935,6 +935,14 @@ int remove_memory(u64 start, u64 size)
 {
 	return -EINVAL;
 }
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+int arch_remove_memory(u64
[PATCH 4/10] memory-hotplug : unregister memory section on SPARSEMEM_VMEMMAP
Currently __remove_section() for SPARSEMEM_VMEMMAP does nothing. But even if we use SPARSEMEM_VMEMMAP, we can unregister the memory_section. So the patch adds a call to unregister_memory_section() in __remove_section().

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
CC: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 mm/memory_hotplug.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

Index: linux-3.6/mm/memory_hotplug.c
===================================================================
--- linux-3.6.orig/mm/memory_hotplug.c	2012-10-04 18:29:50.577668254 +0900
+++ linux-3.6/mm/memory_hotplug.c	2012-10-04 18:29:58.284676075 +0900
@@ -279,11 +279,14 @@ static int __meminit __add_section(int n
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 static int __remove_section(struct zone *zone, struct mem_section *ms)
 {
-	/*
-	 * XXX: Freeing memmap with vmemmap is not implement yet.
-	 *      This should be removed later.
-	 */
-	return -EBUSY;
+	int ret = -EINVAL;
+
+	if (!valid_section(ms))
+		return ret;
+
+	ret = unregister_memory_section(ms);
+
+	return ret;
 }
 #else
 static int __remove_section(struct zone *zone, struct mem_section *ms)
[PATCH 5/10] memory-hotplug : memory-hotplug: check page type in get_page_bootmem
The function get_page_bootmem() may be called more than once for the same page. There is no need to set the page's type and private fields again if this is not the first call for the page.

Note: the patch is just an optimization and does not fix any problem.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
CC: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 mm/memory_hotplug.c |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

Index: linux-3.6/mm/memory_hotplug.c
===================================================================
--- linux-3.6.orig/mm/memory_hotplug.c	2012-10-04 18:29:58.284676075 +0900
+++ linux-3.6/mm/memory_hotplug.c	2012-10-04 18:30:03.454680542 +0900
@@ -95,10 +95,17 @@ static void release_memory_resource(stru
 static void get_page_bootmem(unsigned long info,  struct page *page,
 			     unsigned long type)
 {
-	page->lru.next = (struct list_head *) type;
-	SetPagePrivate(page);
-	set_page_private(page, info);
-	atomic_inc(&page->_count);
+	unsigned long page_type;
+
+	page_type = (unsigned long)page->lru.next;
+	if (page_type < MEMORY_HOTPLUG_MIN_BOOTMEM_TYPE ||
+	    page_type > MEMORY_HOTPLUG_MAX_BOOTMEM_TYPE){
+		page->lru.next = (struct list_head *)type;
+		SetPagePrivate(page);
+		set_page_private(page, info);
+		atomic_inc(&page->_count);
+	} else
+		atomic_inc(&page->_count);
 }
 
 /* reference to __meminit __free_pages_bootmem is valid
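The behaviour this patch describes (record the type and private data only on the first registration, but take a reference on every call) can be sketched in userspace as follows. This is an illustration under assumptions: `struct sk_page` and `sk_get_page_bootmem()` are hypothetical stand-ins, and the two bounds are example values rather than the kernel's definitions.

```c
#include <stdbool.h>

/* Assumed bounds of the valid bootmem type range (illustrative values). */
#define SK_MIN_BOOTMEM_TYPE 12UL
#define SK_MAX_BOOTMEM_TYPE 15UL

/* Hypothetical stand-in for the struct page fields the patch touches. */
struct sk_page {
	unsigned long type;	/* stands in for page->lru.next */
	unsigned long info;	/* stands in for page_private(page) */
	int count;		/* stands in for page->_count */
};

static void sk_get_page_bootmem(unsigned long info, struct sk_page *page,
				unsigned long type)
{
	/* A type outside the valid range means "not registered yet". */
	bool first = page->type < SK_MIN_BOOTMEM_TYPE ||
		     page->type > SK_MAX_BOOTMEM_TYPE;

	if (first) {
		page->type = type;
		page->info = info;
	}
	/* The reference count is raised on every call. */
	page->count++;
}
```

A second call with a different `info` or `type` leaves the recorded values untouched and only increments the count, which is the "optimization only" behaviour the note in the commit message refers to.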
[PATCH 6/10] memory-hotplug : implement register_page_bootmem_info_section of sparse-vmemmap
To remove a memmap region of sparse-vmemmap that was allocated from bootmem, the memmap region needs to be registered by get_page_bootmem(). So the patch searches the pages of the virtual mapping and registers them with get_page_bootmem().

Note: register_page_bootmem_memmap() is not implemented for ia64, ppc, s390, and sparc.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 arch/ia64/mm/discontig.c       |    6
 arch/powerpc/mm/init_64.c      |    6
 arch/s390/mm/vmem.c            |    6
 arch/sparc/mm/init_64.c        |    6
 arch/x86/mm/init_64.c          |   52 +
 include/linux/memory_hotplug.h |   11 +---
 include/linux/mm.h             |    3 +-
 mm/memory_hotplug.c            |   37 ++---
 8 files changed, 113 insertions(+), 14 deletions(-)

Index: linux-3.6/include/linux/memory_hotplug.h
===================================================================
--- linux-3.6.orig/include/linux/memory_hotplug.h	2012-10-04 17:15:03.029828127 +0900
+++ linux-3.6/include/linux/memory_hotplug.h	2012-10-04 17:15:59.010833688 +0900
@@ -163,17 +163,10 @@ static inline void arch_refresh_nodedata
 #endif /* CONFIG_NUMA */
 #endif /* CONFIG_HAVE_ARCH_NODEDATA_EXTENSION */
 
-#ifdef CONFIG_SPARSEMEM_VMEMMAP
-static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
-{
-}
-static inline void put_page_bootmem(struct page *page)
-{
-}
-#else
 extern void register_page_bootmem_info_node(struct pglist_data *pgdat);
 extern void put_page_bootmem(struct page *page);
-#endif
+extern void get_page_bootmem(unsigned long ingo, struct page *page,
+			     unsigned long type);
 
 /*
  * Lock for memory hotplug guarantees 1) all callbacks for memory hotplug
Index: linux-3.6/mm/memory_hotplug.c
===================================================================
--- linux-3.6.orig/mm/memory_hotplug.c	2012-10-04 17:15:27.213831361 +0900
+++ linux-3.6/mm/memory_hotplug.c	2012-10-04 17:37:00.176401540 +0900
@@ -91,9 +91,8 @@ static void release_memory_resource(stru
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG_SPARSE
-#ifndef CONFIG_SPARSEMEM_VMEMMAP
-static void get_page_bootmem(unsigned long info,  struct page *page,
-			     unsigned long type)
+void get_page_bootmem(unsigned long info,  struct page *page,
+		      unsigned long type)
 {
 	unsigned long page_type;
 
@@ -127,6 +126,7 @@ void __ref put_page_bootmem(struct page
 
 }
 
+#ifndef CONFIG_SPARSEMEM_VMEMMAP
 static void register_page_bootmem_info_section(unsigned long start_pfn)
 {
 	unsigned long *usemap, mapsize, section_nr, i;
@@ -160,6 +160,36 @@ static void register_page_bootmem_info_s
 		get_page_bootmem(section_nr, page, MIX_SECTION_INFO);
 
 }
+#else
+static void register_page_bootmem_info_section(unsigned long start_pfn)
+{
+	unsigned long *usemap, mapsize, section_nr, i;
+	struct mem_section *ms;
+	struct page *page, *memmap;
+
+	if (!pfn_valid(start_pfn))
+		return;
+
+	section_nr = pfn_to_section_nr(start_pfn);
+	ms = __nr_to_section(section_nr);
+
+	memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
+
+	page = virt_to_page(memmap);
+	mapsize = sizeof(struct page) * PAGES_PER_SECTION;
+	mapsize = PAGE_ALIGN(mapsize) >> PAGE_SHIFT;
+
+	register_page_bootmem_memmap(section_nr, memmap, PAGES_PER_SECTION);
+
+	usemap = __nr_to_section(section_nr)->pageblock_flags;
+	page = virt_to_page(usemap);
+
+	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
+
+	for (i = 0; i < mapsize; i++, page++)
+		get_page_bootmem(section_nr, page, MIX_SECTION_INFO);
+}
+#endif
 
 void register_page_bootmem_info_node(struct pglist_data *pgdat)
 {
@@ -202,7 +232,6 @@ void register_page_bootmem_info_node(str
 			register_page_bootmem_info_section(pfn);
 	}
 }
-#endif /* !CONFIG_SPARSEMEM_VMEMMAP */
 
 static void grow_zone_span(struct zone *zone, unsigned long start_pfn,
 			   unsigned long end_pfn)
Index: linux-3.6/arch/ia64/mm/discontig.c
===================================================================
--- linux-3.6.orig/arch/ia64/mm/discontig.c	2012-10-01 08:47:46.0 +0900
+++ linux-3.6/arch/ia64/mm/discontig.c	2012-10-04 17:15:59.209833459 +0900
@@ -822,4 +822,10 @@ int __meminit vmemmap_populate(struct pa
 {
 	return vmemmap_populate_basepages(start_page, size, node);
 }
+
+void register_page_bootmem_memmap(unsigned long section_nr
[PATCH 7/10] memory-hotplug : remove memmap of sparse-vmemmap
Not all pages of the virtual mapping for removed memory can be freed, since a page used as PGD/PUD may cover not only the removed memory but also other memory. So the patch checks whether a page can be freed or not.

How to check whether a page can be freed or not?
 1. When removing memory, the page structs of the removed memory are filled with 0xFD.
 2. If all page structs on a PT/PMD page are filled with 0xFD, the PT/PMD can be cleared. In this case, the page used as PT/PMD can be freed.

With this patch applied, the two versions of __remove_section() are integrated into one, so the CONFIG_SPARSEMEM_VMEMMAP version of __remove_section() is deleted.

Note: vmemmap_kfree() and vmemmap_free_bootmem() are not implemented for ia64, ppc, s390, and sparc.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
CC: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 arch/ia64/mm/discontig.c  |    8 +++
 arch/powerpc/mm/init_64.c |    8 +++
 arch/s390/mm/vmem.c       |    8 +++
 arch/sparc/mm/init_64.c   |    8 +++
 arch/x86/mm/init_64.c     |  119 ++
 include/linux/mm.h        |    2
 mm/memory_hotplug.c       |   17 --
 mm/sparse.c               |    5 +
 8 files changed, 158 insertions(+), 17 deletions(-)

Index: linux-3.6/arch/ia64/mm/discontig.c
===================================================================
--- linux-3.6.orig/arch/ia64/mm/discontig.c	2012-10-04 18:30:15.475692638 +0900
+++ linux-3.6/arch/ia64/mm/discontig.c	2012-10-04 18:30:21.145698389 +0900
@@ -823,6 +823,14 @@ int __meminit vmemmap_populate(struct pa
 	return vmemmap_populate_basepages(start_page, size, node);
 }
 
+void vmemmap_kfree(struct page *memmap, unsigned long nr_pages)
+{
+}
+
+void vmemmap_free_bootmem(struct page *memmap, unsigned long nr_pages)
+{
+}
+
 void register_page_bootmem_memmap(unsigned long section_nr,
 				  struct page *start_page, unsigned long size)
 {
Index: linux-3.6/arch/powerpc/mm/init_64.c
===================================================================
--- linux-3.6.orig/arch/powerpc/mm/init_64.c	2012-10-04 18:30:15.494692657 +0900
+++ linux-3.6/arch/powerpc/mm/init_64.c	2012-10-04 18:30:21.150698394 +0900
@@ -299,6 +299,14 @@ int __meminit vmemmap_populate(struct pa
 	return 0;
 }
 
+void vmemmap_kfree(struct page *memmap, unsigned long nr_pages)
+{
+}
+
+void vmemmap_free_bootmem(struct page *memmap, unsigned long nr_pages)
+{
+}
+
 void register_page_bootmem_memmap(unsigned long section_nr,
 				  struct page *start_page, unsigned long size)
 {
Index: linux-3.6/arch/s390/mm/vmem.c
===================================================================
--- linux-3.6.orig/arch/s390/mm/vmem.c	2012-10-04 18:30:15.506692670 +0900
+++ linux-3.6/arch/s390/mm/vmem.c	2012-10-04 18:30:21.157698401 +0900
@@ -227,6 +227,14 @@ out:
 	return ret;
 }
 
+void vmemmap_kfree(struct page *memmap, unsigned long nr_pages)
+{
+}
+
+void vmemmap_free_bootmem(struct page *memmap, unsigned long nr_pages)
+{
+}
+
 void register_page_bootmem_memmap(unsigned long section_nr,
 				  struct page *start_page, unsigned long size)
 {
Index: linux-3.6/arch/sparc/mm/init_64.c
===================================================================
--- linux-3.6.orig/arch/sparc/mm/init_64.c	2012-10-04 18:30:15.512692676 +0900
+++ linux-3.6/arch/sparc/mm/init_64.c	2012-10-04 18:30:21.163698408 +0900
@@ -2078,6 +2078,14 @@ void __meminit vmemmap_populate_print_la
 	}
 }
 
+void vmemmap_kfree(struct page *memmap, unsigned long nr_pages)
+{
+}
+
+void vmemmap_free_bootmem(struct page *memmap, unsigned long nr_pages)
+{
+}
+
 void register_page_bootmem_memmap(unsigned long section_nr,
 				  struct page *start_page, unsigned long size)
 {
Index: linux-3.6/arch/x86/mm/init_64.c
===================================================================
--- linux-3.6.orig/arch/x86/mm/init_64.c	2012-10-04 18:30:15.517692681 +0900
+++ linux-3.6/arch/x86/mm/init_64.c	2012-10-04 18:30:21.171698416 +0900
@@ -993,6 +993,125 @@ vmemmap_populate(struct page *start_page
 	return 0;
 }
 
+#define PAGE_INUSE 0xFD
+
+unsigned long find_and_clear_pte_page(unsigned long addr, unsigned long end,
+			    struct page **pp, int *page_size)
+{
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+	void *page_addr;
+	unsigned long next;
+
+	*pp = NULL;
+
+	pgd = pgd_offset_k(addr);
+	if (pgd_none(*pgd))
+		return pgd_addr_end(addr, end);
+
+	pud = pud_offset(pgd, addr);
+	if (pud_none(*pud))
+		return
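The two-step check described in this patch's commit message (fill the removed part with 0xFD, then free the backing page only when the whole page carries that marker) can be sketched in userspace like this. It is not the kernel implementation: the `sk_` helpers and the 4096-byte page size are assumptions made for illustration; only the marker value 0xFD comes from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SK_PAGE_INUSE	0xFD	/* marker value used by the patch */
#define SK_PAGE_SIZE	4096	/* assumed page size */

/* Step 1: mark the part of a backing page that described removed memory. */
static void sk_mark_removed(unsigned char *page, size_t off, size_t len)
{
	memset(page + off, SK_PAGE_INUSE, len);
}

/* Step 2: the backing page may be freed only if every byte is marked. */
static bool sk_page_fully_removed(const unsigned char *page)
{
	for (size_t i = 0; i < SK_PAGE_SIZE; i++)
		if (page[i] != SK_PAGE_INUSE)
			return false;
	return true;
}
```

A backing page shared by two memory ranges therefore stays allocated until both ranges have been removed and marked.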
[PATCH 8/10] memory-hotplug : remove page table of x86_64 architecture
From: Wen Congyang we...@cn.fujitsu.com

For hot removing memory, we should remove the page table entries that map
the memory. So the patch searches the page tables for the removed memory
and clears the matching entries.

CC: David Rientjes rient...@google.com
CC: Jiang Liu liu...@gmail.com
CC: Len Brown len.br...@intel.com
CC: Christoph Lameter c...@linux.com
Cc: Minchan Kim minchan@gmail.com
CC: Andrew Morton a...@linux-foundation.org
CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
CC: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
---
 arch/x86/include/asm/pgtable_types.h |    1
 arch/x86/mm/init_64.c                |  147 +++
 arch/x86/mm/pageattr.c               |   47 +--
 3 files changed, 173 insertions(+), 22 deletions(-)

Index: linux-3.6/arch/x86/mm/init_64.c
===================================================================
--- linux-3.6.orig/arch/x86/mm/init_64.c	2012-10-04 18:30:21.171698416 +0900
+++ linux-3.6/arch/x86/mm/init_64.c	2012-10-04 18:30:27.317704652 +0900
@@ -675,6 +675,151 @@ int arch_add_memory(int nid, u64 start,
 }
 EXPORT_SYMBOL_GPL(arch_add_memory);
 
+static void __meminit
+phys_pte_remove(pte_t *pte_page, unsigned long addr, unsigned long end)
+{
+	unsigned pages = 0;
+	int i = pte_index(addr);
+
+	pte_t *pte = pte_page + pte_index(addr);
+
+	for (; i < PTRS_PER_PTE; i++, addr += PAGE_SIZE, pte++) {
+
+		if (addr >= end)
+			break;
+
+		if (!pte_present(*pte))
+			continue;
+
+		pages++;
+		set_pte(pte, __pte(0));
+	}
+
+	update_page_count(PG_LEVEL_4K, -pages);
+}
+
+static void __meminit
+phys_pmd_remove(pmd_t *pmd_page, unsigned long addr, unsigned long end)
+{
+	unsigned long pages = 0, next;
+	int i = pmd_index(addr);
+
+	for (; i < PTRS_PER_PMD; i++, addr = next) {
+		unsigned long pte_phys;
+		pmd_t *pmd = pmd_page + pmd_index(addr);
+		pte_t *pte;
+
+		if (addr >= end)
+			break;
+
+		next = (addr & PMD_MASK) + PMD_SIZE;
+
+		if (!pmd_present(*pmd))
+			continue;
+
+		if (pmd_large(*pmd)) {
+			if ((addr & ~PMD_MASK) == 0 && next <= end) {
+				set_pmd(pmd, __pmd(0));
+				pages++;
+				continue;
+			}
+
+			/*
+			 * We use 2M page, but we need to remove part of them,
+			 * so split 2M page to 4K page.
+			 */
+			pte = alloc_low_page(&pte_phys);
+			__split_large_page((pte_t *)pmd, addr, pte);
+
+			spin_lock(&init_mm.page_table_lock);
+			pmd_populate_kernel(&init_mm, pmd, __va(pte_phys));
+			spin_unlock(&init_mm.page_table_lock);
+		}
+
+		spin_lock(&init_mm.page_table_lock);
+		pte = map_low_page((pte_t *)pmd_page_vaddr(*pmd));
+		phys_pte_remove(pte, addr, end);
+		unmap_low_page(pte);
+		spin_unlock(&init_mm.page_table_lock);
+	}
+	update_page_count(PG_LEVEL_2M, -pages);
+}
+
+static void __meminit
+phys_pud_remove(pud_t *pud_page, unsigned long addr, unsigned long end)
+{
+	unsigned long pages = 0, next;
+	int i = pud_index(addr);
+
+	for (; i < PTRS_PER_PUD; i++, addr = next) {
+		unsigned long pmd_phys;
+		pud_t *pud = pud_page + pud_index(addr);
+		pmd_t *pmd;
+
+		if (addr >= end)
+			break;
+
+		next = (addr & PUD_MASK) + PUD_SIZE;
+
+		if (!pud_present(*pud))
+			continue;
+
+		if (pud_large(*pud)) {
+			if ((addr & ~PUD_MASK) == 0 && next <= end) {
+				set_pud(pud, __pud(0));
+				pages++;
+				continue;
+			}
+
+			/*
+			 * We use 1G page, but we need to remove part of them,
+			 * so split 1G page to 2M page.
+			 */
+			pmd = alloc_low_page(&pmd_phys);
+			__split_large_page((pte_t *)pud, addr, (pte_t *)pmd);
+
+			spin_lock(&init_mm.page_table_lock);
+			pud_populate(&init_mm, pud, __va(pmd_phys));
+			spin_unlock(&init_mm.page_table_lock);
+		}
+
+		pmd = map_low_page(pmd_offset(pud, 0));
+		phys_pmd_remove(pmd, addr, end);
+		unmap_low_page(pmd);
+		__flush_tlb_all();
+	}
+	__flush_tlb_all
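The test that decides between clearing a whole 2M entry and splitting it can be tried in user space. The sketch below is only an illustration of that one condition from phys_pmd_remove(); the constants assume the usual x86_64 value of 2 MiB per PMD entry, and pmd_next() and covers_whole_pmd() are hypothetical helper names, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86_64 layout: one PMD entry maps 2 MiB. */
#define PMD_SHIFT 21
#define PMD_SIZE  (UINT64_C(1) << PMD_SHIFT)
#define PMD_MASK  (~(PMD_SIZE - 1))

/* Start of the 2 MiB region following the one that contains addr. */
static uint64_t pmd_next(uint64_t addr)
{
	return (addr & PMD_MASK) + PMD_SIZE;
}

/*
 * True when [addr, end) covers the whole 2 MiB page containing addr:
 * addr must be 2 MiB aligned and the page must end at or before end.
 * Otherwise only part of the large page is removed and it has to be split.
 */
static bool covers_whole_pmd(uint64_t addr, uint64_t end)
{
	return (addr & ~PMD_MASK) == 0 && pmd_next(addr) <= end;
}
```

With these definitions, covers_whole_pmd(0x200000, 0x400000) holds, while a start of 0x201000 or an end of 0x300000 does not, which is the case where the patch splits the large page first.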
[PATCH 9/10] memory-hotplug : memory_hotplug: clear zone when removing the memory
When a memory is added, we update zone's and pgdat's start_pfn and
spanned_pages in the function __add_zone(). So we should revert them
when the memory is removed.

The patch adds a new function __remove_zone() to do this.

CC: David Rientjes rient...@google.com
CC: Jiang Liu liu...@gmail.com
CC: Len Brown len.br...@intel.com
CC: Christoph Lameter c...@linux.com
Cc: Minchan Kim minchan@gmail.com
CC: Andrew Morton a...@linux-foundation.org
CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
---
 mm/memory_hotplug.c |  207
 1 file changed, 207 insertions(+)

Index: linux-3.6/mm/memory_hotplug.c
===================================================================
--- linux-3.6.orig/mm/memory_hotplug.c	2012-10-04 18:30:21.182698427 +0900
+++ linux-3.6/mm/memory_hotplug.c	2012-10-04 18:30:31.767709165 +0900
@@ -312,10 +312,213 @@ static int __meminit __add_section(int n
 	return register_new_memory(nid, __pfn_to_section(phys_start_pfn));
 }
 
+/* find the smallest valid pfn in the range [start_pfn, end_pfn) */
+static int find_smallest_section_pfn(int nid, struct zone *zone,
+				     unsigned long start_pfn,
+				     unsigned long end_pfn)
+{
+	struct mem_section *ms;
+
+	for (; start_pfn < end_pfn; start_pfn += PAGES_PER_SECTION) {
+		ms = __pfn_to_section(start_pfn);
+
+		if (unlikely(!valid_section(ms)))
+			continue;
+
+		if (unlikely(pfn_to_nid(start_pfn)) != nid)
+			continue;
+
+		if (zone && zone != page_zone(pfn_to_page(start_pfn)))
+			continue;
+
+		return start_pfn;
+	}
+
+	return 0;
+}
+
+/* find the biggest valid pfn in the range [start_pfn, end_pfn). */
+static int find_biggest_section_pfn(int nid, struct zone *zone,
+				    unsigned long start_pfn,
+				    unsigned long end_pfn)
+{
+	struct mem_section *ms;
+	unsigned long pfn;
+
+	/* pfn is the end pfn of a memory section. */
+	pfn = end_pfn - 1;
+	for (; pfn >= start_pfn; pfn -= PAGES_PER_SECTION) {
+		ms = __pfn_to_section(pfn);
+
+		if (unlikely(!valid_section(ms)))
+			continue;
+
+		if (unlikely(pfn_to_nid(pfn)) != nid)
+			continue;
+
+		if (zone && zone != page_zone(pfn_to_page(pfn)))
+			continue;
+
+		return pfn;
+	}
+
+	return 0;
+}
+
+static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
+			     unsigned long end_pfn)
+{
+	unsigned long zone_start_pfn = zone->zone_start_pfn;
+	unsigned long zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
+	unsigned long pfn;
+	struct mem_section *ms;
+	int nid = zone_to_nid(zone);
+
+	zone_span_writelock(zone);
+	if (zone_start_pfn == start_pfn) {
+		/*
+		 * If the section is smallest section in the zone, it need
+		 * shrink zone->zone_start_pfn and zone->zone_spanned_pages.
+		 * In this case, we find second smallest valid mem_section
+		 * for shrinking zone.
+		 */
+		pfn = find_smallest_section_pfn(nid, zone, end_pfn,
+						zone_end_pfn);
+		if (pfn) {
+			zone->zone_start_pfn = pfn;
+			zone->spanned_pages = zone_end_pfn - pfn;
+		}
+	} else if (zone_end_pfn == end_pfn) {
+		/*
+		 * If the section is biggest section in the zone, it need
+		 * shrink zone->spanned_pages.
+		 * In this case, we find second biggest valid mem_section for
+		 * shrinking zone.
+		 */
+		pfn = find_biggest_section_pfn(nid, zone, zone_start_pfn,
+					       start_pfn);
+		if (pfn)
+			zone->spanned_pages = pfn - zone_start_pfn + 1;
+	}
+
+	/*
+	 * The section is not biggest or smallest mem_section in the zone, it
+	 * only creates a hole in the zone. So in this case, we need not
+	 * change the zone. But perhaps, the zone has only hole data. Thus
+	 * it check the zone has only hole or not.
+	 */
+	pfn = zone_start_pfn;
+	for (; pfn < zone_end_pfn; pfn += PAGES_PER_SECTION) {
+		ms = __pfn_to_section(pfn);
+
+		if (unlikely(!valid_section(ms)))
+			continue;
+
+		if (page_zone(pfn_to_page(pfn)) != zone)
+			continue;
+
+		 /* If the section
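The three cases handled by shrink_zone_span() (removing the first section, removing the last section, or punching a hole in the middle) can be modelled on a plain array of validity flags. This is a simplified sketch, not the kernel code: it works in units of sections instead of pfns, `struct span` and all function names are hypothetical, and it assumes a zone with no valid section left collapses to an empty span. One deliberate difference: the finders in the patch return 0 for "nothing found", while the sketch returns the end of the searched range, because 0 is a legitimate index here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Half-open range [start, end), counted in sections (hypothetical type). */
struct span {
	size_t start;
	size_t end;
};

/* Index of the first valid section in [from, to), or to if there is none. */
static size_t first_valid(const bool *valid, size_t from, size_t to)
{
	for (size_t i = from; i < to; i++)
		if (valid[i])
			return i;
	return to;
}

/* One past the last valid section in [from, to), or from if there is none. */
static size_t last_valid_end(const bool *valid, size_t from, size_t to)
{
	for (size_t i = to; i > from; i--)
		if (valid[i - 1])
			return i;
	return from;
}

/* Recompute the span after section `removed` has been marked invalid. */
static struct span shrink_span(const bool *valid, struct span s, size_t removed)
{
	if (removed == s.start)
		s.start = first_valid(valid, removed + 1, s.end);
	else if (removed + 1 == s.end)
		s.end = last_valid_end(valid, s.start, removed);
	/* A removal in the middle only leaves a hole; the span is unchanged. */

	/* If only holes remain, the span becomes empty. */
	if (first_valid(valid, s.start, s.end) == s.end)
		s.start = s.end = 0;

	return s;
}
```

For example, with sections 2 and 3 invalid and 4 and 5 valid, removing section 2 from the span [2, 6) yields [4, 6).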
[PATCH 10/10] memory-hotplug : remove sysfs file of node
From: Wen Congyang we...@cn.fujitsu.com

This patch introduces a new function try_offline_node() to
remove sysfs file of node when all memory sections of this
node are removed. If some memory sections of this node are
not removed, this function does nothing.

CC: David Rientjes rient...@google.com
CC: Jiang Liu liu...@gmail.com
CC: Len Brown len.br...@intel.com
CC: Christoph Lameter c...@linux.com
Cc: Minchan Kim minchan@gmail.com
CC: Andrew Morton a...@linux-foundation.org
CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
CC: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
---
 mm/memory_hotplug.c |   54
 1 file changed, 54 insertions(+)

Index: linux-3.6/mm/memory_hotplug.c
===================================================================
--- linux-3.6.orig/mm/memory_hotplug.c	2012-10-04 18:30:31.767709165 +0900
+++ linux-3.6/mm/memory_hotplug.c	2012-10-04 18:32:46.907842637 +0900
@@ -29,6 +29,7 @@
 #include <linux/suspend.h>
 #include <linux/mm_inline.h>
 #include <linux/firmware-map.h>
+#include <linux/stop_machine.h>
 
 #include <asm/tlbflush.h>
 
@@ -1276,6 +1277,57 @@ int offline_memory(u64 start, u64 size)
 	return 0;
 }
 
+static int check_cpu_on_node(void *data)
+{
+	struct pglist_data *pgdat = data;
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		if (cpu_to_node(cpu) == pgdat->node_id)
+			/*
+			 * the cpu on this node is onlined, and we can't
+			 * offline this node.
+			 */
+			return -EBUSY;
+	}
+
+	return 0;
+}
+
+/* offline the node if all memory sections of this node are removed */
+static void try_offline_node(int nid)
+{
+	unsigned long start_pfn = NODE_DATA(nid)->node_start_pfn;
+	unsigned long end_pfn = start_pfn + NODE_DATA(nid)->node_spanned_pages;
+	unsigned long pfn;
+
+	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+		unsigned long section_nr = pfn_to_section_nr(pfn);
+
+		if (!present_section_nr(section_nr))
+			continue;
+
+		if (pfn_to_nid(pfn) != nid)
+			continue;
+
+		/*
+		 * some memory sections of this node are not removed, and we
+		 * can't offline node now.
+		 */
+		return;
+	}
+
+	if (stop_machine(check_cpu_on_node, NODE_DATA(nid), NULL))
+		return;
+
+	/*
+	 * all memory sections of this node are removed, we can offline this
+	 * node now.
+	 */
+	node_set_offline(nid);
+	unregister_one_node(nid);
+}
+
 int __ref remove_memory(int nid, u64 start, u64 size)
 {
 	int ret = 0;
@@ -1296,6 +1348,8 @@ int __ref remove_memory(int nid, u64 sta
 	firmware_map_remove(start, start + size, "System RAM");
 
 	arch_remove_memory(start, size);
+
+	try_offline_node(nid);
 out:
 	unlock_memory_hotplug();
 	return ret;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
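The two conditions that try_offline_node() checks before taking a node offline can be written as one predicate over plain arrays. The following is a user-space sketch under stated assumptions: `struct section_model`, `node_can_go_offline()` and the `cpu_node` array (the node of each online CPU) are hypothetical stand-ins for the section and CPU lookups in the patch, and the stop_machine() step is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory section: present or not, and its node. */
struct section_model {
	bool present;
	int nid;
};

/*
 * The node may go offline only if no present section and no online CPU
 * still belongs to it. cpu_node[i] is the node of online CPU i.
 */
static bool node_can_go_offline(int nid,
				const struct section_model *secs, size_t nsecs,
				const int *cpu_node, size_t ncpus)
{
	for (size_t i = 0; i < nsecs; i++)
		if (secs[i].present && secs[i].nid == nid)
			return false;	/* some memory of this node remains */

	for (size_t i = 0; i < ncpus; i++)
		if (cpu_node[i] == nid)
			return false;	/* an online CPU still sits on this node */

	return true;
}
```

In the patch the CPU check runs under stop_machine(), presumably so that the set of online CPUs cannot change while it is being inspected; the sketch simply assumes a stable snapshot.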
Re: [PATCH] CPU hotplug, debug: Detect imbalance between get_online_cpus() and put_online_cpus()
2012/10/04 15:16, Srivatsa S. Bhat wrote:
> On 10/04/2012 02:43 AM, Andrew Morton wrote:
>> On Wed, 03 Oct 2012 18:23:09 +0530
>> Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com wrote:
>>
>>> The synchronization between CPU hotplug readers and writers is achieved by
>>> means of refcounting, safe-guarded by the cpu_hotplug.lock.
>>>
>>> get_online_cpus() increments the refcount, whereas put_online_cpus() decrements
>>> it. If we ever hit an imbalance between the two, we end up compromising the
>>> guarantees of the hotplug synchronization i.e, for example, an extra call to
>>> put_online_cpus() can end up allowing a hotplug reader to execute concurrently with
>>> a hotplug writer. So, add a BUG_ON() in put_online_cpus() to detect such cases
>>> where the refcount can go negative.
>>>
>>> Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
>>> ---
>>>
>>>  kernel/cpu.c |    1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/kernel/cpu.c b/kernel/cpu.c
>>> index f560598..00d29bc 100644
>>> --- a/kernel/cpu.c
>>> +++ b/kernel/cpu.c
>>> @@ -80,6 +80,7 @@ void put_online_cpus(void)
>>>  	if (cpu_hotplug.active_writer == current)
>>>  		return;
>>>  	mutex_lock(&cpu_hotplug.lock);
>>> +	BUG_ON(cpu_hotplug.refcount == 0);
>>>  	if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
>>>  		wake_up_process(cpu_hotplug.active_writer);
>>>  	mutex_unlock(&cpu_hotplug.lock);
>>
>> I think calling BUG() here is a bit harsh. We should only do that if
>> there's a risk to proceeding: a risk of data loss, a reduced ability to
>> analyse the underlying bug, etc.
>>
>> But a cpu-hotplug locking imbalance is a really really really minor
>> problem! So how about we emit a warning then try to fix things up?
>
> That would be better indeed, thanks!
>
>> This should increase the chance that the machine will keep running and
>> so will increase the chance that a user will be able to report the bug
>> to us.
>
> Yep, sounds good.
>
>> --- a/kernel/cpu.c~cpu-hotplug-debug-detect-imbalance-between-get_online_cpus-and-put_online_cpus-fix
>> +++ a/kernel/cpu.c
>> @@ -80,9 +80,12 @@ void put_online_cpus(void)
>>  	if (cpu_hotplug.active_writer == current)
>>  		return;
>>  	mutex_lock(&cpu_hotplug.lock);
>> -	BUG_ON(cpu_hotplug.refcount == 0);
>> -	if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
>> -		wake_up_process(cpu_hotplug.active_writer);
>> +	if (!--cpu_hotplug.refcount) {
>
> This won't catch it. We'll enter this 'if' condition only when cpu_hotplug.refcount was
> decremented to zero. We'll miss out the case when it went negative (which we intended to detect).
>
>> +		if (WARN_ON(cpu_hotplug.refcount == -1))
>> +			cpu_hotplug.refcount++;	/* try to fix things up */
>> +		if (unlikely(cpu_hotplug.active_writer))
>> +			wake_up_process(cpu_hotplug.active_writer);
>> +	}
>>  	mutex_unlock(&cpu_hotplug.lock);
>>
>>  }
>
> So how about something like below:
>
> --
>
> From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
> Subject: [PATCH] CPU hotplug, debug: Detect imbalance between get_online_cpus() and put_online_cpus()
>
> The synchronization between CPU hotplug readers and writers is achieved by
> means of refcounting, safe-guarded by the cpu_hotplug.lock.
>
> get_online_cpus() increments the refcount, whereas put_online_cpus() decrements
> it. If we ever hit an imbalance between the two, we end up compromising the
> guarantees of the hotplug synchronization i.e, for example, an extra call to
> put_online_cpus() can end up allowing a hotplug reader to execute concurrently with
> a hotplug writer. So, add a WARN_ON() in put_online_cpus() to detect such cases
> where the refcount can go negative, and also attempt to fix it up, so that we can
> continue to run.
>
> Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
> ---

Looks good to me.

Reviewed-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

>  kernel/cpu.c |    4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index f560598..42bd331 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -80,6 +80,10 @@ void put_online_cpus(void)
>  	if (cpu_hotplug.active_writer == current)
>  		return;
>  	mutex_lock(&cpu_hotplug.lock);
> +
> +	if (WARN_ON(!cpu_hotplug.refcount))
> +		cpu_hotplug.refcount++;	/* try to fix things up */
> +
>  	if (!--cpu_hotplug.refcount && unlikely(cpu_hotplug.active_writer))
>  		wake_up_process(cpu_hotplug.active_writer);
>  	mutex_unlock(&cpu_hotplug.lock);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majord...@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"d...@kvack.org"> em...@kvack.org </a>
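The difference between the two fix-ups discussed in this thread is easy to see in a small user-space model. The sketch below is an illustration only: `struct hotplug_model` and both function names are hypothetical, the mutex is omitted, and the `warnings` and `wakeups` counters stand in for WARN_ON() firing and for wake_up_process() being called.

```c
#include <stdbool.h>

/* Hypothetical model of the cpu_hotplug state used by put_online_cpus(). */
struct hotplug_model {
	int refcount;		/* number of active readers */
	int warnings;		/* how often an imbalance was detected */
	int wakeups;		/* how often the writer would be woken */
	bool writer_waiting;
};

/* First fix-up: the check sits inside the "count reached zero" branch. */
static void put_check_after(struct hotplug_model *h)
{
	if (!--h->refcount) {
		if (h->refcount == -1) {	/* unreachable: refcount is 0 here */
			h->warnings++;
			h->refcount++;
		}
		if (h->writer_waiting)
			h->wakeups++;
	}
}

/* Final patch: check for zero before decrementing, then repair the count. */
static void put_check_before(struct hotplug_model *h)
{
	if (h->refcount == 0) {		/* stands in for WARN_ON(!refcount) */
		h->warnings++;
		h->refcount++;		/* try to fix things up */
	}
	if (!--h->refcount && h->writer_waiting)
		h->wakeups++;
}
```

Starting from a refcount of 0, put_check_after() leaves the count at -1 without any warning, which is the case Srivatsa points out, while put_check_before() records one warning and leaves the count at 0.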
Re: [PATCH 1/2] acpi : cpu hot-remove returns error when cpu_down() fails
Hi Rafael,

2012/10/19 10:06, Rafael J. Wysocki wrote:
> On Friday 28 of September 2012 19:36:02 Yasuaki Ishimatsu wrote:
>> Even if cpu_down() fails, acpi_processor_remove() continues to remove the cpu.
>> But in this case, it should return error number since some process may run on
>> the cpu. If the cpu has a running process and the cpu is turned the power off,
>> the system may not work well.
>>
>> Reviewed-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
>> Reviewed-by: Toshi Kani toshi.k...@hp.com
>> Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>>
>> ---
>>  drivers/acpi/processor_driver.c |   18 --
>>  1 file changed, 12 insertions(+), 6 deletions(-)
>>
>> Index: linux-3.6-rc7/drivers/acpi/processor_driver.c
>> ===================================================================
>> --- linux-3.6-rc7.orig/drivers/acpi/processor_driver.c	2012-09-24 10:10:57.0 +0900
>> +++ linux-3.6-rc7/drivers/acpi/processor_driver.c	2012-09-28 19:16:33.207858261 +0900
>> @@ -605,7 +605,7 @@ err_free_pr:
>>  static int acpi_processor_remove(struct acpi_device *device, int type)
>>  {
>>  	struct acpi_processor *pr = NULL;
>> -
>> +	int ret;
>>  
>>  	if (!device || !acpi_driver_data(device))
>>  		return -EINVAL;
>> @@ -616,8 +616,9 @@ static int acpi_processor_remove(struct
>>  		goto free;
>>  
>>  	if (type == ACPI_BUS_REMOVAL_EJECT) {
>> -		if (acpi_processor_handle_eject(pr))
>> -			return -EINVAL;
>> +		ret = acpi_processor_handle_eject(pr);
>> +		if (ret)
>> +			return ret;
>>  	}
>>  
>>  	acpi_processor_power_exit(pr, device);
>> @@ -848,12 +849,17 @@ static acpi_status acpi_processor_hotadd
>>  
>>  static int acpi_processor_handle_eject(struct acpi_processor *pr)
>>  {
>> -	if (cpu_online(pr->id))
>> -		cpu_down(pr->id);
>> +	int ret = 0;
>> +
>> +	if (cpu_online(pr->id)) {
>> +		ret = cpu_down(pr->id);
>
> If you defined ret here ...
>
>> +		if (ret)
>> +			return ret;
>> +	}
>>  
>>  	arch_unregister_cpu(pr->id);
>>  	acpi_unmap_lsapic(pr->id);
>> -	return (0);
>> +	return ret;
>
> ... this line wouldn't need to be changed.

Thank you for your review.
O.K. I'll put the return code back.

Thanks,
Yasuaki Ishimatsu

>>  }
>>  #else
>>  static acpi_status acpi_processor_hotadd_init(struct acpi_processor *pr)
>
> Thanks,
> Rafael
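The shape Rafael asks for (declare ret inside the branch that calls cpu_down(), and keep the final constant return) can be sketched in user space as follows. This is not the driver code: `struct cpu_model`, `model_cpu_down()` and `handle_eject()` are hypothetical stand-ins for the processor object, cpu_down() and acpi_processor_handle_eject().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical CPU state: down_result is what taking it offline returns. */
struct cpu_model {
	bool online;
	int down_result;
	bool unregistered;
};

/* Stand-in for cpu_down(): goes offline only when it reports success. */
static int model_cpu_down(struct cpu_model *c)
{
	if (c->down_result == 0)
		c->online = false;
	return c->down_result;
}

/* Propagate the failure and skip the teardown if the CPU stays online. */
static int handle_eject(struct cpu_model *c)
{
	if (c->online) {
		int ret = model_cpu_down(c);	/* ret is local to this branch */

		if (ret)
			return ret;
	}

	c->unregistered = true;		/* arch_unregister_cpu() and friends */
	return 0;			/* the final return stays unchanged */
}
```

A CPU whose offlining fails with -EBUSY is left online and registered, and the caller sees -EBUSY instead of the fixed -EINVAL that the old code returned.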
Re: [PATCH v3 0/9] bugfix for memory hotplug
Hi Wen,

Some bug fix patches have been merged into linux-next.
So the patches confuse me.

Why did you send same patches again?

Thanks,
Yasuaki Ishimatsu

2012/10/19 15:46, we...@cn.fujitsu.com wrote:
> From: Wen Congyang we...@cn.fujitsu.com
>
> Changes from v2 to v3:
>   Merge the bug fix from ishimatsu to this patchset(Patch 1-3)
>   Patch 3: split it from patch as it fixes another bug.
>   Patch 4: new patch, and fix bad-page state when hotadding a memory
>            device after hotremoving it. I forgot to post this patch in v2.
>   Patch 6: update it according to Dave Hansen's comment.
>
> Changes from v1 to v2:
>   Patch 1: updated according to kosaki's suggestion
>   Patch 2: new patch, and update mce_bad_pages when removing memory.
>   Patch 4: new patch, and fix a NR_FREE_PAGES mismatch, and this bug
>            cause oom in my test.
>   Patch 5: new patch, and fix a new bug. When repeating to online/offline
>            pages, the free pages will continue to decrease.
>
> Wen Congyang (6):
>   clear the memory to store struct page
>   memory-hotplug: skip HWPoisoned page when offlining pages
>   memory-hotplug: update mce_bad_pages when removing the memory
>   memory-hotplug: auto offline page_cgroup when onlining memory block
>     failed
>   memory-hotplug: fix NR_FREE_PAGES mismatch
>   memory-hotplug: allocate zone's pcp before onlining pages
>
> Yasuaki Ishimatsu (3):
>   suppress "Device memoryX does not have a release() function" warning
>   suppress "Device nodeX does not have a release() function" warning
>   memory-hotplug: flush the work for the node when the node is offlined
>
>  drivers/base/memory.c          |    9 -
>  drivers/base/node.c            |   11 +++
>  include/linux/page-isolation.h |   10 ++
>  mm/memory-failure.c            |    2 +-
>  mm/memory_hotplug.c            |   14 --
>  mm/page_alloc.c                |   37 -
>  mm/page_cgroup.c               |    3 +++
>  mm/page_isolation.c            |   27 ---
>  mm/sparse.c                    |   22 +-
>  9 files changed, 106 insertions(+), 29 deletions(-)
Re: [PATCH v3 0/9] bugfix for memory hotplug
2012/10/19 17:06, Yasuaki Ishimatsu wrote:
> Hi Wen,
>
> Some bug fix patches have been merged into linux-next.
> So the patches confuse me.

The following patches have been already merged into linux-next
and mm-tree as long as I know.

>> Wen Congyang (6):
>>   clear the memory to store struct page
>>   memory-hotplug: skip HWPoisoned page when offlining pages

mm-tree

>>   memory-hotplug: update mce_bad_pages when removing the memory
>>   memory-hotplug: auto offline page_cgroup when onlining memory block
>>     failed

mm-tree

>>   memory-hotplug: fix NR_FREE_PAGES mismatch

mm-tree

>>   memory-hotplug: allocate zone's pcp before onlining pages

mm-tree

>> Yasuaki Ishimatsu (3):
>>   suppress "Device memoryX does not have a release() function" warning

linux-next

>>   suppress "Device nodeX does not have a release() function" warning
>>   memory-hotplug: flush the work for the node when the node is offlined

linux-next

Thanks,
Yasuaki Ishimatsu

> Why did you send same patches again?
>
> Thanks,
> Yasuaki Ishimatsu
Re: [PATCH v3 0/9] bugfix for memory hotplug
2012/10/19 17:45, Wen Congyang wrote:
> At 10/19/2012 04:19 PM, Yasuaki Ishimatsu Wrote:
>> 2012/10/19 17:06, Yasuaki Ishimatsu wrote:
>>> Hi Wen,
>>>
>>> Some bug fix patches have been merged into linux-next.
>>> So the patches confuse me.
>
> Sorry, I don't check linux-next tree.
>
>> The following patches have been already merged into linux-next
>> and mm-tree as long as I know.
>>
>>>> Wen Congyang (6):
>>>>   clear the memory to store struct page
>>>>   memory-hotplug: skip HWPoisoned page when offlining pages
>>
>> mm-tree
>
> Hmm, I don't find this patch in this URL:
> http://www.ozlabs.org/~akpm/mmotm/broken-out/
>
> Do I miss something?

But Andrew announced that the patch was merged in mm-tree.
And you received the announcement.

>>>>   memory-hotplug: update mce_bad_pages when removing the memory
>>>>   memory-hotplug: auto offline page_cgroup when onlining memory block
>>>>     failed
>>
>> mm-tree
>>
>>>>   memory-hotplug: fix NR_FREE_PAGES mismatch
>>
>> mm-tree
>>
>>>>   memory-hotplug: allocate zone's pcp before onlining pages
>>
>> mm-tree
>>
>>>> Yasuaki Ishimatsu (3):
>>>>   suppress "Device memoryX does not have a release() function" warning
>>
>> linux-next
>>
>>>>   suppress "Device nodeX does not have a release() function" warning
>>>>   memory-hotplug: flush the work for the node when the node is offlined
>>
>> linux-next
>
> I split this patch to two patches according to kosaki's comment.

Yeah, I know. But is the patch really need now?

Thanks,
Yasuaki Ishimatsu

> Thanks
> Wen Congyang
>
>> Thanks,
>> Yasuaki Ishimatsu
>>
>>> Why did you send same patches again?
>>>
>>> Thanks,
>>> Yasuaki Ishimatsu
Re: [PATCH v2 0/3] acpi,memory-hotplug : implement framework for hot removing memory
CCing Rafael, because he has become the ACPI maintainer.

Hi Wen,

If you update the patch-set, please CC Rafael from the next time.

Thanks,
Yasuaki Ishimatsu

2012/10/19 19:03, we...@cn.fujitsu.com wrote:
> From: Wen Congyang we...@cn.fujitsu.com
>
> The patch-set implements a framework for hot removing memory.
>
> The memory device can be removed by 2 ways:
> 1. send eject request by SCI
> 2. echo 1 > /sys/bus/pci/devices/PNP0C80:XX/eject
>
> In the 1st case, acpi_memory_disable_device() will be called.
> In the 2nd case, acpi_memory_device_remove() will be called.
> acpi_memory_device_remove() will also be called when we unbind the
> memory device from the driver acpi_memhotplug or a driver initialization
> fails.
>
> acpi_memory_disable_device() has already implemented a code which
> offlines memory and releases acpi_memory_info struct. But
> acpi_memory_device_remove() has not implemented it yet.
>
> So the patch prepares the framework for hot removing memory and
> adds the framework into acpi_memory_device_remove().
>
> The last version of this patchset is here:
> https://lkml.org/lkml/2012/10/3/126
>
> Changelogs from v1 to v2:
>   Patch1: use acpi_bus_trim() instead of acpi_bus_remove()
>   Patch2: new patch, introduce a lock to protect the list
>   Patch3: remove memory too when type is ACPI_BUS_REMOVAL_NORMAL
>   Note: I don't send [Patch2-4 v1] in this series because they
>   are no logical changes in these 3 patches.
>
> Wen Congyang (2):
>   acpi,memory-hotplug: call acpi_bus_trim() to remove memory device
>   acpi,memory-hotplug: introduce a mutex lock to protect the list in
>     acpi_memory_device
>
> Yasuaki Ishimatsu (1):
>   acpi,memory-hotplug : add memory offline code to
>     acpi_memory_device_remove()
>
>  drivers/acpi/acpi_memhotplug.c |   51
>  1 files changed, 41 insertions(+), 10 deletions(-)
Re: [PATCH] mm: make zone_pcp_reset independ on MEMORY_HOTREMOVE
2012/10/23 18:37, Michal Hocko wrote:
> 340175b7 (mm/hotplug: free zone->pageset when a zone becomes empty)
> introduced zone_pcp_reset and hid it inside CONFIG_MEMORY_HOTREMOVE.
> Since 506e5fb7 (memory-hotplug: allocate zone's pcp before onlining
> pages) the function is also called from online_pages, which is built
> outside CONFIG_MEMORY_HOTREMOVE, and this causes a linkage error.
>
> The function, although not used outside of MEMORY_{HOTPLUG,HOTREMOVE},
> seems universal enough, so let's keep it at its current location
> and only remove the HOTREMOVE guard.
>
> Signed-off-by: Michal Hocko mho...@suse.cz
> Cc: David Rientjes rient...@google.com
> Cc: Jiang Liu liu...@gmail.com
> Cc: Len Brown len.br...@intel.com
> Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
> Cc: Paul Mackerras pau...@samba.org
> Cc: Christoph Lameter c...@linux.com
> Cc: Minchan Kim minchan@gmail.com
> Cc: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
> Cc: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

Looks good to me.

Reviewed-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

Thanks,
Yasuaki Ishimatsu

> Cc: Dave Hansen d...@linux.vnet.ibm.com
> Cc: Mel Gorman m...@csn.ul.ie
> ---
>  mm/page_alloc.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e29912e..30e359c 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5981,7 +5981,6 @@ void __meminit zone_pcp_update(struct zone *zone)
>  }
>  #endif
>  
> -#ifdef CONFIG_MEMORY_HOTREMOVE
>  void zone_pcp_reset(struct zone *zone)
>  {
>  	unsigned long flags;
> @@ -6001,6 +6000,7 @@ void zone_pcp_reset(struct zone *zone)
>  	local_irq_restore(flags);
>  }
>  
> +#ifdef CONFIG_MEMORY_HOTREMOVE
>  /*
>   * All pages in the range must be isolated before calling this.
>   */
Re: [PATCH v2 1/2] Use kacpi_hotplug_wq to handle container hotplug event.
Hi Tang,

2012/10/24 15:05, Tang Chen wrote:
> As the comments in __acpi_os_execute() said:
>
>   We can't run hotplug code in keventd_wq/kacpid_wq/kacpid_notify_wq
>   because the hotplug code may call driver .remove() functions,
>   which invoke flush_scheduled_work/acpi_os_wait_events_complete
>   to flush these workqueues.
>
> we should keep the hotplug code in kacpi_hotplug_wq.
>
> But we have the following call series in kernel now:
>   acpi_ev_queue_notify_request()
>   |--> acpi_os_execute()
>        |--> __acpi_os_execute(type, function, context, 0)
>
> The last parameter 0 makes the container_notify_cb() executed in
> kacpi_notify_wq or kacpid_wq. So, we need to put the real hotplug code
> into kacpi_hotplug_wq.

I cannot understand the purpose of the patch.
Is the patch a bug fix patch? If yes, what problem happens?

Thanks,
Yasuaki Ishimatsu

> Signed-off-by: Tang Chen tangc...@cn.fujitsu.com
> ---
>  drivers/acpi/container.c |   17 ++++++++++++++++-
>  1 files changed, 16 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/acpi/container.c b/drivers/acpi/container.c
> index 69e2d6b..d300e03 100644
> --- a/drivers/acpi/container.c
> +++ b/drivers/acpi/container.c
> @@ -35,6 +35,7 @@
>  #include <acpi/acpi_bus.h>
>  #include <acpi/acpi_drivers.h>
>  #include <acpi/container.h>
> +#include <acpi/acpiosxf.h>
>  
>  #define PREFIX "ACPI: "
>  
> @@ -165,14 +166,21 @@ static int container_device_add(struct acpi_device **device, acpi_handle handle)
>  	return result;
>  }
>  
> -static void container_notify_cb(acpi_handle handle, u32 type, void *context)
> +static void __container_notify_cb(struct work_struct *work)
>  {
>  	struct acpi_device *device = NULL;
>  	int result;
>  	int present;
>  	acpi_status status;
> +	struct acpi_hp_work *hp_work;
> +	acpi_handle handle;
> +	u32 type;
>  	u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; /* default */
>  
> +	hp_work = container_of(work, struct acpi_hp_work, work);
> +	handle = hp_work->handle;
> +	type = hp_work->type;
> +
>  	switch (type) {
>  	case ACPI_NOTIFY_BUS_CHECK:
>  		/* Fall through */
> @@ -224,6 +232,13 @@ static void container_notify_cb(acpi_handle handle, u32 type, void *context)
>  	return;
>  }
>  
> +static void container_notify_cb(acpi_handle handle, u32 type,
> +				void *context)
> +{
> +	alloc_acpi_hp_work(handle, type, context,
> +			   __container_notify_cb);
> +}
> +
>  static acpi_status
>  container_walk_namespace_cb(acpi_handle handle,
>  			    u32 lvl, void *context, void **rv)
Re: [PATCH v2 1/2] Use kacpi_hotplug_wq to handle container hotplug event.
Hi Tang,

2012/10/24 16:24, Tang Chen wrote:
> On 10/24/2012 02:54 PM, Yasuaki Ishimatsu wrote:
>> Hi Tang,
>>
>> 2012/10/24 15:05, Tang Chen wrote:
>>> As the comments in __acpi_os_execute() said:
>>>
>>>     We can't run hotplug code in keventd_wq/kacpid_wq/kacpid_notify_wq
>>>     because the hotplug code may call driver .remove() functions,
>>>     which invoke flush_scheduled_work/acpi_os_wait_events_complete
>>>     to flush these workqueues.
>>>
>>> we should keep the hotplug code in kacpi_hotplug_wq.
>>>
>>> But we have the following call series in kernel now:
>>>
>>>     acpi_ev_queue_notify_request()
>>>     |-- acpi_os_execute()
>>>         |-- __acpi_os_execute(type, function, context, 0)
>>>
>>> The last parameter 0 makes container_notify_cb() execute in
>>> kacpi_notify_wq or kacpid_wq. So, we need to put the real hotplug
>>> code into kacpi_hotplug_wq.
>>
>> I cannot understand the purpose of the patch.
>> Is the patch a bug fix patch? If yes, what problem happens?
>
> Hi Yasuaki-san,
>
> Actually, it is a problem. But container hot-remove was not implemented
> in container_notify_cb(), so this problem would never be triggered.
> So I cannot say it is a bug in kernel.
>
> The problem is here:
>
> acpi_pci_root_remove() will finally call acpi_os_wait_events_complete():
>
> void acpi_os_wait_events_complete(void)
> {
> 	flush_workqueue(kacpid_wq);
> 	flush_workqueue(kacpi_notify_wq);
> }
>
> which means it will flush kacpid_wq and kacpi_notify_wq. So the current
> work should not be in these two workqueues, otherwise it will cause a
> deadlock: the worker will wait for itself to complete.
>
> But unfortunately, in the beginning, we have:
>
>     acpi_ev_queue_notify_request()
>     |-- acpi_os_execute()
>         |-- __acpi_os_execute(type, function, context, 0)
>
> Please refer to the code, you will see the last parameter 0 makes the
> hotplug call run in kacpid_wq or kacpi_notify_wq. And it is hard coded
> in kernel. I don't know why and I don't know how to fix it. So I made
> this patch, and want to see what you guys think about it. :)
>
> The deadlock call trace is like below:
>
> [ 302.383606] =============================================
> [ 302.448094] [ INFO: possible recursive locking detected ]
> [ 302.512578] 3.6.0-rc5-luyh-hostbridge-hotplug+ #13 Not tainted
> [ 302.582252] ---------------------------------------------
> [ 302.646736] kworker/0:2/1412 is trying to acquire lock:
> [ 302.709143] (kacpi_notify){.+}, at: [81091300] flush_workqueue+0x0/0x5c0
> [ 302.805222]
> [ 302.805222] but task is already holding lock:
> [ 302.874898] (kacpi_notify){.+}, at: [81090528] process_one_work+0x1b8/0x680
> [ 302.974083]
> [ 302.974083] other info that might help us debug this:
> [ 303.052067] Possible unsafe locking scenario:
> [ 303.052067]
> [ 303.122785]        CPU0
> [ 303.151965]        ----
> [ 303.181150]   lock(kacpi_notify);
> [ 303.220935]   lock(kacpi_notify);
> [ 303.260721]
> [ 303.260721] *** DEADLOCK ***
> [ 303.260721]
> [ 303.331434] May be due to missing lock nesting notation
> [ 303.331434]
> [ 303.412529] 4 locks held by kworker/0:2/1412:
> [ 303.464553] #0: (kacpi_notify){.+}, at: [81090528] process_one_work+0x1b8/0x680
> [ 303.569042] #1: ((&dpc->work)#2){+.+.+.}, at: [81090528] process_one_work+0x1b8/0x680
> [ 303.675718] #2: (__lockdep_no_validate__){..}, at: [8143cca7] device_release_driver+0x27/0x50
> [ 303.795782] #3: (pci_acpi_pm_notify_mtx){+.+.+.}, at: [81385443] remove_pm_notifier+0x33/0x90
> [ 303.910662]
> [ 303.910662] stack backtrace:
> [ 303.962687] Pid: 1412, comm: kworker/0:2 Not tainted 3.6.0-rc5-luyh-hostbridge-hotplug+ #13
> [ 304.062470] Call Trace:
> [ 304.091666] [810da704] print_deadlock_bug+0xf4/0x100
> [ 304.162384] [810dc6a9] validate_chain+0x549/0x7e0
> [ 304.229987] [810dcc36] __lock_acquire+0x2f6/0x4f0
> [ 304.297587] [810dba65] ? debug_check_no_locks_freed+0xa5/0xf0
> [ 304.377650] [810dcecd] lock_acquire+0x9d/0x190
> [ 304.442141] [81091300] ? flush_workqueue_prep_cwqs+0x260/0x260
> [ 304.523242] [810d8759] ? lockdep_init_map+0x59/0x150
> [ 304.593963] [810914af] flush_workqueue+0x1af/0x5c0
> [ 304.662605] [81091300] ? flush_workqueue_prep_cwqs+0x260/0x260
> [ 304.743713] [810a6ab8] ? complete+0x28/0x60
> [ 304.805084] [810a6ab8] ? complete+0x28/0x60
> [ 304.866457] [810db925] ? trace_hardirqs_on_caller+0x105/0x190
> [ 304.946515] [810a6ab8] ? complete+0x28/0x60
> [ 305.007891] [81385443] ? remove_pm_notifier+0x33/0x90
> [ 305.079649] [813854e0] ? pci_acpi_remove_bus_pm_notifier+0x20/0x20
> [ 305.164905] [813a340e] acpi_os_wait_events_complete+0x21/0x23
> [ 305.244970] [813b7b3c] acpi_remove_notify_handler+0x47/0x183
> [ 305.323994] [813854e0] ? pci_acpi_remove_bus_pm_notifier+0x20/0x20
> [ 305.409251] [81385481] remove_pm_notifier+0x71
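
The self-flush described in this thread can be modelled in a few lines of userspace C. This is only a sketch, not kernel code: `struct wq`, `wq_flush()`, `wq_run()` and `hot_remove_work()` are hypothetical names made up for illustration, and the model reports -EDEADLK where the real flush_workqueue() would simply never return.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel workqueue; not the real API. */
struct wq {
	const char *name;
};

/* Queue whose work item is currently executing, or NULL if none. */
static const struct wq *current_wq;

/*
 * Model of flushing a queue: waiting for a queue from inside one of
 * its own work items can never finish, so report it instead of hanging.
 */
static int wq_flush(const struct wq *q)
{
	if (q == current_wq)
		return -EDEADLK;
	return 0;
}

typedef int (*work_fn)(void);

/* Run one work item "on" queue q and return what the item returned. */
static int wq_run(const struct wq *q, work_fn fn)
{
	const struct wq *prev = current_wq;
	int ret;

	current_wq = q;
	ret = fn();
	current_wq = prev;
	return ret;
}

static const struct wq kacpid_q = { "kacpid" };
static const struct wq notify_q = { "kacpi_notify" };
static const struct wq hotplug_q = { "kacpi_hotplug" };

/* Model of a hot-remove path that ends up flushing both ACPI queues. */
static int hot_remove_work(void)
{
	int ret = wq_flush(&kacpid_q);

	if (ret)
		return ret;
	return wq_flush(&notify_q);
}
```

In this model, `wq_run(&notify_q, hot_remove_work)` fails with -EDEADLK while `wq_run(&hotplug_q, hot_remove_work)` succeeds, which is the reason given above for moving the real hotplug work to kacpi_hotplug_wq.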
Re: [PATCH v4] create sun sysfs file
Hi Len, What should I do to put this patch in your tree? Thanks, Yasuaki Ishimatsu 2012/10/03 18:54, Yasuaki Ishimatsu wrote: Hi Len, Ping... Pleae merge the patch into your tree. Thanks, Yasuaki Ishimatsu 2012/09/24 11:31, Yasuaki Ishimatsu wrote: Hi Len, Ping... I want you to merge the patch into your tree for linux-3.7. Thanks, Yasuaki Ishimatsu 2012/08/30 10:34, Yasuaki Ishimatsu wrote: Hi Len, Three weeks passed after I post the patch. All comments have already been applied to it. And I think there is no comments about it. So I want you to merge it into your tree. Thanks, Yasuaki Ishimatsu 2012/08/07 9:36, Yasuaki Ishimatsu wrote: Even if a device has _SUN method, there is no way to know the slot unique-ID. Thus the patch creates sun file in sysfs so that we can recognize it. Reviewed-by: Toshi Kani toshi.k...@hp.com Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com --- drivers/acpi/scan.c | 24 include/acpi/acpi_bus.h |1 + 2 files changed, 25 insertions(+) Index: linux-3.5/include/acpi/acpi_bus.h === --- linux-3.5.orig/include/acpi/acpi_bus.h 2012-07-30 10:06:49.722171575 +0900 +++ linux-3.5/include/acpi/acpi_bus.h 2012-08-07 08:57:45.678204360 +0900 @@ -209,6 +209,7 @@ struct acpi_device_pnp { struct list_head ids; /* _HID and _CIDs */ acpi_device_name device_name; /* Driver-determined */ acpi_device_class device_class; /* */ + unsigned long sun; /* _SUN */ }; #define acpi_device_bid(d)((d)-pnp.bus_id) Index: linux-3.5/drivers/acpi/scan.c === --- linux-3.5.orig/drivers/acpi/scan.c 2012-07-30 10:06:49.713171688 +0900 +++ linux-3.5/drivers/acpi/scan.c 2012-08-07 09:01:38.196203659 +0900 @@ -192,10 +192,20 @@ end: } static DEVICE_ATTR(path, 0444, acpi_device_path_show, NULL); +static ssize_t +acpi_device_sun_show(struct device *dev, struct device_attribute *attr, + char *buf) { + struct acpi_device *acpi_dev = to_acpi_device(dev); + + return sprintf(buf, %lu\n, acpi_dev-pnp.sun); +} +static DEVICE_ATTR(sun, 0444, acpi_device_sun_show, NULL); + 
static int acpi_device_setup_files(struct acpi_device *dev) { acpi_status status; acpi_handle temp; + unsigned long long sun; int result = 0; /* @@ -217,6 +227,16 @@ static int acpi_device_setup_files(struc goto end; } + status = acpi_evaluate_integer(dev-handle, _SUN, NULL, sun); + if (ACPI_SUCCESS(status)) { + dev-pnp.sun = (unsigned long)sun; + result = device_create_file(dev-dev, dev_attr_sun); + if (result) + goto end; + } else { + dev-pnp.sun = (unsigned long)-1; + } + /* * If device has _EJ0, 'eject' file is created that is used to trigger * hot-removal function from userland. @@ -241,6 +261,10 @@ static void acpi_device_remove_files(str if (ACPI_SUCCESS(status)) device_remove_file(dev-dev, dev_attr_eject); + status = acpi_get_handle(dev-handle, _SUN, temp); + if (ACPI_SUCCESS(status)) + device_remove_file(dev-dev, dev_attr_sun); + device_remove_file(dev-dev, dev_attr_modalias); device_remove_file(dev-dev, dev_attr_hid); if (dev-handle) -- To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4] create sun sysfs file
Hi Len,

2012/10/09 14:05, Len Brown wrote:
> On 10/08/2012 07:57 PM, Yasuaki Ishimatsu wrote:
>> Hi Len,
>>
>> What should I do to put this patch in your tree?
>
> Please add a description of the attribute in Documentation/ABI/testing/
>
> A human needs to understand exactly what is in that file
> because you are proposing it as an ABI.

Thank you for your comment. I'll update soon.

Regards,
Yasuaki Ishimatsu

> thanks,
> Len Brown, Intel Open Source Technology Center
[PATCH v5] create sun sysfs file
The _SUN method provides the slot unique-ID in the ACPI namespace. The
Advanced Configuration and Power Interface Specification describes the
value as follows:

    "The _SUN value is required to be unique among the slots of
     the same type. It is also recommended that this number match
     the slot number printed on the physical slot whenever possible."

So if we know the value, we can identify the physical position of the
slot in the system.

The patch creates a sun file in sysfs for identifying the physical
position of the slot.

Reviewed-by: Toshi Kani toshi.k...@hp.com
Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

---
 Documentation/ABI/testing/sysfs-devices-sun |   14 ++++++++++++++
 drivers/acpi/scan.c                         |   24 ++++++++++++++++++++++++
 include/acpi/acpi_bus.h                     |    1 +
 3 files changed, 39 insertions(+)

Index: linux-3.6/include/acpi/acpi_bus.h
===================================================================
--- linux-3.6.orig/include/acpi/acpi_bus.h	2012-10-09 11:54:17.690072343 +0900
+++ linux-3.6/include/acpi/acpi_bus.h	2012-10-09 14:15:49.207794671 +0900
@@ -208,6 +208,7 @@ struct acpi_device_pnp {
 	struct list_head ids;		/* _HID and _CIDs */
 	acpi_device_name device_name;	/* Driver-determined */
 	acpi_device_class device_class;	/* */
+	unsigned long sun;		/* _SUN */
 };
 
 #define acpi_device_bid(d)	((d)->pnp.bus_id)
Index: linux-3.6/drivers/acpi/scan.c
===================================================================
--- linux-3.6.orig/drivers/acpi/scan.c	2012-10-09 11:54:17.688072343 +0900
+++ linux-3.6/drivers/acpi/scan.c	2012-10-09 14:15:49.211794675 +0900
@@ -232,10 +232,20 @@ end:
 }
 static DEVICE_ATTR(path, 0444, acpi_device_path_show, NULL);
 
+static ssize_t
+acpi_device_sun_show(struct device *dev, struct device_attribute *attr,
+		     char *buf) {
+	struct acpi_device *acpi_dev = to_acpi_device(dev);
+
+	return sprintf(buf, "%lu\n", acpi_dev->pnp.sun);
+}
+static DEVICE_ATTR(sun, 0444, acpi_device_sun_show, NULL);
+
 static int acpi_device_setup_files(struct acpi_device *dev)
 {
 	acpi_status status;
 	acpi_handle temp;
+	unsigned long long sun;
 	int result = 0;
 
 	/*
@@ -257,6 +267,16 @@ static int acpi_device_setup_files(struc
 			goto end;
 	}
 
+	status = acpi_evaluate_integer(dev->handle, "_SUN", NULL, &sun);
+	if (ACPI_SUCCESS(status)) {
+		dev->pnp.sun = (unsigned long)sun;
+		result = device_create_file(&dev->dev, &dev_attr_sun);
+		if (result)
+			goto end;
+	} else {
+		dev->pnp.sun = (unsigned long)-1;
+	}
+
 	/*
 	 * If device has _EJ0, 'eject' file is created that is used to trigger
 	 * hot-removal function from userland.
@@ -281,6 +301,10 @@ static void acpi_device_remove_files(str
 	if (ACPI_SUCCESS(status))
 		device_remove_file(&dev->dev, &dev_attr_eject);
 
+	status = acpi_get_handle(dev->handle, "_SUN", &temp);
+	if (ACPI_SUCCESS(status))
+		device_remove_file(&dev->dev, &dev_attr_sun);
+
 	device_remove_file(&dev->dev, &dev_attr_modalias);
 	device_remove_file(&dev->dev, &dev_attr_hid);
 	if (dev->handle)
Index: linux-3.6/Documentation/ABI/testing/sysfs-devices-sun
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-3.6/Documentation/ABI/testing/sysfs-devices-sun	2012-10-09 15:47:02.333245246 +0900
@@ -0,0 +1,14 @@
+What:		/sys/devices/.../sun
+Date:		October 2012
+Contact:	Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
+Description:
+		The file contains a Slot-unique ID which is provided by the _SUN
+		method in the ACPI namespace. The value is written in Advanced
+		Configuration and Power Interface Specification as follows:
+
+		"The _SUN value is required to be unique among the slots of
+		the same type. It is also recommended that this number match
+		the slot number printed on the physical slot whenever possible."
+
+		So reading the sysfs file, we can identify a physical position
+		of the slot in the system.
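
For userland, consuming the attribute would look roughly like the sketch below. It assumes only what the patch shows: the file prints one value with "%lu\n", and it is not created at all when _SUN cannot be evaluated. `read_sun()` is a hypothetical helper name, and the caller has to supply the real path because the ABI entry only gives it as /sys/devices/.../sun.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Read one unsigned decimal value, as printed by "%lu\n", from the
 * file at path. Returns 0 on success or a negative errno-style code.
 * A missing file is reported as an error; with the patch above that
 * is the case for devices without a usable _SUN.
 */
static int read_sun(const char *path, unsigned long *out)
{
	FILE *f = fopen(path, "r");
	int n;

	if (!f)
		return -errno;
	n = fscanf(f, "%lu", out);
	fclose(f);
	return n == 1 ? 0 : -EINVAL;
}
```

A tool that maps devices to physical slots would call this once per device directory and skip the devices for which it returns an error.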
Re: [PATCH v3 3/3] acpi : acpi_bus_trim() stops removing devices when failing to remove the device
Hi Wen, 2012/10/09 17:02, Wen Congyang wrote: Hi, ishimatsu: At 07/12/2012 07:28 PM, Yasuaki Ishimatsu Wrote: acpi_bus_trim() stops removing devices, when acpi_bus_remove() return error number. But acpi_bus_remove() cannot return error number correctly. acpi_bus_remove() only return -EINVAL, when dev argument is NULL. Thus even if device cannot be removed correctly, acpi_bus_trim() ignores and continues to remove devices. acpi_bus_hot_remove_device() uses acpi_bus_trim() for removing devices. Therefore acpi_bus_hot_remove_device() can send _EJ0 to firmware, even if the device is running on the system. In this case, the system cannot work well. So acpi_bus_trim() should check whether device was removed or not correctly. The patch adds error check into some functions to remove the device. What is the status about this patch? I need to update the description against Toshi's comment as follows: I agree with this change as driver's remove interface can fail. However, there are other callers to this function, which do not check the return value. I suppose there is no impact to the other paths since you only changed the CPU hotplug path to fail properly, but please confirm this is the case. I recommend documenting this change to the change log. I have already checked that the patch does not impact the other path with the exception of CPU and Memory hotplug path. So I will adds the result of investigation and following Vasislis's problem into the patch and resend to lklml. 
Vasilis Liaskovitis found a similar bug about the memory hotplug, and this patch can fix this problem: https://lkml.org/lkml/2012/9/26/318 Thanks, Yasuaki Ishimatsu Thanks Wen Congyang Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com --- drivers/acpi/scan.c| 15 --- drivers/base/dd.c | 22 +- include/linux/device.h |2 +- 3 files changed, 30 insertions(+), 9 deletions(-) Index: linux-3.5-rc6/drivers/acpi/scan.c === --- linux-3.5-rc6.orig/drivers/acpi/scan.c 2012-07-12 20:11:37.316443808 +0900 +++ linux-3.5-rc6/drivers/acpi/scan.c 2012-07-12 20:17:17.927185231 +0900 @@ -425,12 +425,17 @@ static int acpi_device_remove(struct dev { struct acpi_device *acpi_dev = to_acpi_device(dev); struct acpi_driver *acpi_drv = acpi_dev-driver; + int ret; if (acpi_drv) { if (acpi_drv-ops.notify) acpi_device_remove_notify_handler(acpi_dev); - if (acpi_drv-ops.remove) - acpi_drv-ops.remove(acpi_dev, acpi_dev-removal_type); + if (acpi_drv-ops.remove) { + ret = acpi_drv-ops.remove(acpi_dev, + acpi_dev-removal_type); + if (ret) + return ret; + } } acpi_dev-driver = NULL; acpi_dev-driver_data = NULL; @@ -1208,11 +1213,15 @@ static int acpi_device_set_context(struc static int acpi_bus_remove(struct acpi_device *dev, int rmdevice) { + int ret; + if (!dev) return -EINVAL; dev-removal_type = ACPI_BUS_REMOVAL_EJECT; - device_release_driver(dev-dev); + ret = device_release_driver(dev-dev); + if (ret) + return ret; if (!rmdevice) return 0; Index: linux-3.5-rc6/drivers/base/dd.c === --- linux-3.5-rc6.orig/drivers/base/dd.c2012-07-12 20:11:37.316443808 +0900 +++ linux-3.5-rc6/drivers/base/dd.c 2012-07-12 20:17:17.928185218 +0900 @@ -464,9 +464,10 @@ EXPORT_SYMBOL_GPL(driver_attach); * __device_release_driver() must be called with @dev lock held. * When called for a USB interface, @dev-parent lock must be held as well. 
*/ -static void __device_release_driver(struct device *dev) +static int __device_release_driver(struct device *dev) { struct device_driver *drv; + int ret; drv = dev-driver; if (drv) { @@ -482,9 +483,11 @@ static void __device_release_driver(stru pm_runtime_put_sync(dev); if (dev-bus dev-bus-remove) - dev-bus-remove(dev); + ret = dev-bus-remove(dev); else if (drv-remove) - drv-remove(dev); + ret = drv-remove(dev); + if (ret) + goto rollback; devres_release_all(dev); dev-driver = NULL; klist_remove(dev-p-knode_driver); @@ -494,6 +497,12 @@ static void __device_release_driver(stru dev); } + + return ret; + +rollback: + driver_sysfs_add(dev); + return ret; } /** @@ -503,16 +512,19 @@ static void
acpi : acpi_bus_trim() stops removing devices when failing to remove the device
acpi_bus_trim() stops removing devices when acpi_bus_remove() returns an
error. But acpi_bus_remove() cannot report errors correctly: it only
returns -EINVAL when the dev argument is NULL. Thus, even if a device
cannot be removed correctly, acpi_bus_trim() ignores the failure and
continues to remove devices. acpi_bus_hot_remove_device() uses
acpi_bus_trim() for removing devices. Therefore
acpi_bus_hot_remove_device() can send _EJ0 to firmware even if the device
is still running on the system. In this case, the system cannot work well.

Vasilis hit the bug with memory hotplug and reported it as follows:
https://lkml.org/lkml/2012/9/26/318

So acpi_bus_trim() should check whether the device was removed correctly.
The patch adds error checks to the functions that remove the device.

With the patch applied, acpi_bus_trim() stops removing devices when it
fails to remove a device. I think there is no impact outside the CPU and
memory hotplug paths, because for other devices removal fails only in
irregular cases, such as the device being NULL.

Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

---
 drivers/acpi/scan.c    |   15 ---
 drivers/base/dd.c      |   22 +-
 include/linux/device.h |    2 +-
 3 files changed, 30 insertions(+), 9 deletions(-)

Index: linux-3.6/drivers/acpi/scan.c
===================================================================
--- linux-3.6.orig/drivers/acpi/scan.c	2012-10-09 17:25:40.956496325 +0900
+++ linux-3.6/drivers/acpi/scan.c	2012-10-09 17:25:55.405497800 +0900
@@ -445,12 +445,17 @@ static int acpi_device_remove(struct dev
 {
 	struct acpi_device *acpi_dev = to_acpi_device(dev);
 	struct acpi_driver *acpi_drv = acpi_dev->driver;
+	int ret;
 
 	if (acpi_drv) {
 		if (acpi_drv->ops.notify)
 			acpi_device_remove_notify_handler(acpi_dev);
-		if (acpi_drv->ops.remove)
-			acpi_drv->ops.remove(acpi_dev, acpi_dev->removal_type);
+		if (acpi_drv->ops.remove) {
+			ret = acpi_drv->ops.remove(acpi_dev,
+						   acpi_dev->removal_type);
+			if (ret)
+				return ret;
+		}
 	}
 	acpi_dev->driver = NULL;
 	acpi_dev->driver_data = NULL;
@@ -1226,11 +1231,15 @@ static int acpi_device_set_context(struc
 
 static int acpi_bus_remove(struct acpi_device *dev, int rmdevice)
 {
+	int ret;
+
 	if (!dev)
 		return -EINVAL;
 
 	dev->removal_type = ACPI_BUS_REMOVAL_EJECT;
-	device_release_driver(&dev->dev);
+	ret = device_release_driver(&dev->dev);
+	if (ret)
+		return ret;
 
 	if (!rmdevice)
 		return 0;
Index: linux-3.6/drivers/base/dd.c
===================================================================
--- linux-3.6.orig/drivers/base/dd.c	2012-10-01 08:47:46.000000000 +0900
+++ linux-3.6/drivers/base/dd.c	2012-10-09 17:25:55.442497825 +0900
@@ -475,9 +475,10 @@ EXPORT_SYMBOL_GPL(driver_attach);
  * __device_release_driver() must be called with @dev lock held.
  * When called for a USB interface, @dev->parent lock must be held as well.
  */
-static void __device_release_driver(struct device *dev)
+static int __device_release_driver(struct device *dev)
 {
 	struct device_driver *drv;
+	int ret = 0;
 
 	drv = dev->driver;
 	if (drv) {
@@ -493,9 +494,11 @@ static void __device_release_driver(stru
 		pm_runtime_put_sync(dev);
 
 		if (dev->bus && dev->bus->remove)
-			dev->bus->remove(dev);
+			ret = dev->bus->remove(dev);
 		else if (drv->remove)
-			drv->remove(dev);
+			ret = drv->remove(dev);
+		if (ret)
+			goto rollback;
 		devres_release_all(dev);
 		dev->driver = NULL;
 		dev_set_drvdata(dev, NULL);
@@ -506,6 +509,12 @@ static void __device_release_driver(stru
 						     dev);
 
 	}
+
+	return ret;
+
+rollback:
+	driver_sysfs_add(dev);
+	return ret;
 }
 
 /**
@@ -515,16 +524,19 @@ static void __device_release_driver(stru
  * Manually detach device from driver.
  * When called for a USB interface, @dev->parent lock must be held.
  */
-void device_release_driver(struct device *dev)
+int device_release_driver(struct device *dev)
 {
+	int ret;
 	/*
 	 * If anyone calls device_release_driver() recursively from
 	 * within their ->remove callback for the same device, they
 	 * will deadlock right here.
 	 */
 	device_lock(dev);
-	__device_release_driver(dev);
+	ret = __device_release_driver(dev);
 	device_unlock(dev);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(device_release_driver);
 
Index: linux-3.6/include/linux
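
The intended effect of the patch, stripped of all kernel detail, is sketched below as a userspace model. Everything in it is a made-up stand-in rather than kernel API: `struct dev`, `dev_remove()`, `release_driver()`, `bus_trim()` and `hot_remove()` only mirror the roles of the driver .remove() callback, device_release_driver(), acpi_bus_trim() and acpi_bus_hot_remove_device().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical device: remove may refuse, e.g. memory still in use. */
struct dev {
	int busy;	/* non-zero: the driver cannot let go right now */
	int bound;	/* non-zero: the driver is still attached */
};

/* Model of a driver .remove() callback that can fail. */
static int dev_remove(struct dev *d)
{
	if (d->busy)
		return -EBUSY;
	d->bound = 0;
	return 0;
}

/* Model of releasing the driver after the patch: pass the error up. */
static int release_driver(struct dev *d)
{
	if (!d)
		return -EINVAL;
	return dev_remove(d);
}

/* Model of trimming a set of devices: stop at the first failure. */
static int bus_trim(struct dev *devs, size_t n)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = release_driver(&devs[i]);
		if (ret)
			return ret;
	}
	return 0;
}

/* Model of hot-remove: eject only if every removal succeeded. */
static int hot_remove(struct dev *devs, size_t n, int *ejected)
{
	int ret = bus_trim(devs, n);

	*ejected = (ret == 0);
	return ret;
}
```

With one busy device in the middle of the set, `hot_remove()` returns -EBUSY, the devices after it stay bound, and no eject happens. Before the patch the equivalent of `dev_remove()` had its return value dropped, so the eject was always reached.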
Re: linux-next: build failure after merge of the origin tree
Hi Stephen,

2012/10/10 8:45, Andrew Morton wrote:
> On Wed, 10 Oct 2012 10:21:50 +1100 Stephen Rothwell s...@canb.auug.org.au wrote:
>
>> Hi Linus,
>>
>> In Linus' tree, today's linux-next build (powerpc ppc64_defconfig) failed
>> like this:
>>
>> arch/powerpc/platforms/pseries/hotplug-memory.c: In function 'pseries_remove_memblock':
>> arch/powerpc/platforms/pseries/hotplug-memory.c:103:17: error: unused variable 'pfn' [-Werror=unused-variable]
>>
>> Caused by commit d760afd4d257 (memory-hotplug: suppress Trying to free
>> nonexistent resource - warning).
>>
>> I can't see what the point of the pfn variable is
>
> This:
>
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c~a
> +++ a/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -101,7 +101,7 @@ static int pseries_remove_memblock(unsig
>  	sections_to_remove = (memblock_size >> PAGE_SHIFT) / PAGES_PER_SECTION;
>  	for (i = 0; i < sections_to_remove; i++) {
>  		unsigned long pfn = start_pfn + i * PAGES_PER_SECTION;
> -		ret = __remove_pages(zone, start_pfn, PAGES_PER_SECTION);
> +		ret = __remove_pages(zone, pfn, PAGES_PER_SECTION);
>  		if (ret)
>  			return ret;
>  	}

I believe this patch fixes the error. Could you try it?

Thanks,
Yasuaki Ishimatsu

>> and this patch never appeared in linux-next before being merged. :-(
>> It was first sighted October 3.
>>
>> I have reverted that commit for today.
>>
>> If this patch truly was authored yesterday (according the Author Date in
>> git), why was it merged yesterday while still under discussion? And the
>> latest update to it still has this build problem ... did anyone even try
>> to build this for powerpc (since that architecture was obviously
>> affected)?
>
> Apparently not - the ppc bit was a best-effort fixup for a patch which
> addresses an x86 problem.
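
The one-line fix discussed here is easy to check in isolation. The sketch below is a userspace model under stated assumptions: `remove_pages_model()` and `remove_block_model()` are hypothetical stand-ins for __remove_pages() and pseries_remove_memblock(), and PAGES_PER_SECTION is given a small made-up value so the result is easy to inspect.

```c
#include <stddef.h>

#define PAGES_PER_SECTION 4UL	/* hypothetical small value for the model */
#define MAX_SECTIONS 8

/* Records which section start pfns were handed to the remove step. */
static unsigned long removed[MAX_SECTIONS];
static size_t nr_removed;

/* Stand-in for the page removal call: just log the starting pfn. */
static int remove_pages_model(unsigned long pfn, unsigned long nr_pages)
{
	(void)nr_pages;
	if (nr_removed == MAX_SECTIONS)
		return -1;
	removed[nr_removed++] = pfn;
	return 0;
}

/* Walk a block section by section, advancing pfn on every iteration. */
static int remove_block_model(unsigned long start_pfn, unsigned long nr_pages)
{
	unsigned long sections = nr_pages / PAGES_PER_SECTION;
	unsigned long i;
	int ret;

	for (i = 0; i < sections; i++) {
		unsigned long pfn = start_pfn + i * PAGES_PER_SECTION;

		/* Passing start_pfn here would hit the first section each time. */
		ret = remove_pages_model(pfn, PAGES_PER_SECTION);
		if (ret)
			return ret;
	}
	return 0;
}
```

Passing `pfn` makes each iteration touch a different section, and it also gives the variable a use, which is what the -Werror=unused-variable failure was about.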
Re: acpi : acpi_bus_trim() stops removing devices when failing to remove the device
Hi Toshi, 2012/10/10 1:36, Toshi Kani wrote: On Tue, 2012-10-09 at 17:48 +0900, Yasuaki Ishimatsu wrote: acpi_bus_trim() stops removing devices, when acpi_bus_remove() return error number. But acpi_bus_remove() cannot return error number correctly. acpi_bus_remove() only return -EINVAL, when dev argument is NULL. Thus even if device cannot be removed correctly, acpi_bus_trim() ignores and continues to remove devices. acpi_bus_hot_remove_device() uses acpi_bus_trim() for removing devices. Therefore acpi_bus_hot_remove_device() can send _EJ0 to firmware, even if the device is running on the system. In this case, the system cannot work well. Vasilis hit the bug at memory hotplug and reported it as follow: https://lkml.org/lkml/2012/9/26/318 So acpi_bus_trim() should check whether device was removed or not correctly. The patch adds error check into some functions to remove the device. Applying the patch, acpi_bus_trim() stops removing devices when failing to remove the device. But I think there is no impact with the exceptionof CPU and Memory hotplug path. Because other device also fails but the fail is an irregular case like device is NULL. 
Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com --- drivers/acpi/scan.c| 15 --- drivers/base/dd.c | 22 +- include/linux/device.h |2 +- 3 files changed, 30 insertions(+), 9 deletions(-) Index: linux-3.6/drivers/acpi/scan.c === --- linux-3.6.orig/drivers/acpi/scan.c 2012-10-09 17:25:40.956496325 +0900 +++ linux-3.6/drivers/acpi/scan.c 2012-10-09 17:25:55.405497800 +0900 @@ -445,12 +445,17 @@ static int acpi_device_remove(struct dev { struct acpi_device *acpi_dev = to_acpi_device(dev); struct acpi_driver *acpi_drv = acpi_dev-driver; + int ret; if (acpi_drv) { if (acpi_drv-ops.notify) acpi_device_remove_notify_handler(acpi_dev); - if (acpi_drv-ops.remove) - acpi_drv-ops.remove(acpi_dev, acpi_dev-removal_type); + if (acpi_drv-ops.remove) { + ret = acpi_drv-ops.remove(acpi_dev, + acpi_dev-removal_type); + if (ret) Hi Yasuaki, Shouldn't the notify handler be reinstalled here if it was removed by the acpi_device_remove_notify_handler() above? I do not reinstall the notify handler. The function has not been removed on linux-3.6. And the patch is created on linux-3.6. So the function remains in the patch. Thanks, Yasuaki Ishimatsu Thanks, -Toshi + return ret; + } } acpi_dev-driver = NULL; acpi_dev-driver_data = NULL; @@ -1226,11 +1231,15 @@ static int acpi_device_set_context(struc static int acpi_bus_remove(struct acpi_device *dev, int rmdevice) { + int ret; + if (!dev) return -EINVAL; dev-removal_type = ACPI_BUS_REMOVAL_EJECT; - device_release_driver(dev-dev); + ret = device_release_driver(dev-dev); + if (ret) + return ret; if (!rmdevice) return 0; Index: linux-3.6/drivers/base/dd.c === --- linux-3.6.orig/drivers/base/dd.c2012-10-01 08:47:46.0 +0900 +++ linux-3.6/drivers/base/dd.c 2012-10-09 17:25:55.442497825 +0900 @@ -475,9 +475,10 @@ EXPORT_SYMBOL_GPL(driver_attach); * __device_release_driver() must be called with @dev lock held. * When called for a USB interface, @dev-parent lock must be held as well. 
*/ -static void __device_release_driver(struct device *dev) +static int __device_release_driver(struct device *dev) { struct device_driver *drv; + int ret = 0; drv = dev-driver; if (drv) { @@ -493,9 +494,11 @@ static void __device_release_driver(stru pm_runtime_put_sync(dev); if (dev-bus dev-bus-remove) - dev-bus-remove(dev); + ret = dev-bus-remove(dev); else if (drv-remove) - drv-remove(dev); + ret = drv-remove(dev); + if (ret) + goto rollback; devres_release_all(dev); dev-driver = NULL; dev_set_drvdata(dev, NULL); @@ -506,6 +509,12 @@ static void __device_release_driver(stru dev); } + + return ret; + +rollback: + driver_sysfs_add(dev); + return ret; } /** @@ -515,16 +524,19 @@ static void __device_release_driver(stru * Manually detach device from driver. * When called for a USB interface, @dev-parent lock must be held. */ -void device_release_driver(struct device *dev) +int device_release_driver(struct device
Re: [PATCH] ACPI: dock: Remove redundant ACPI NS walk
Hi Toshi,

Sorry for late reply.

2012/09/13 5:30, Toshi Kani wrote:
> Combined two ACPI namespace walks, which look for dock stations
> and then bays separately, into a single walk.
>
> Signed-off-by: Toshi Kani toshi.k...@hp.com
> ---

I have not tested the patch. But it looks good to me.

Reviewed-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

Thanks,
Yasuaki Ishimatsu

>  drivers/acpi/dock.c |   26 +++---
>  1 files changed, 7 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/acpi/dock.c b/drivers/acpi/dock.c
> index 88eb143..ae4ebf2 100644
> --- a/drivers/acpi/dock.c
> +++ b/drivers/acpi/dock.c
> @@ -1016,44 +1016,32 @@ static int dock_remove(struct dock_station *ds)
>  }
>  
>  /**
> - * find_dock - look for a dock station
> + * find_dock_and_bay - look for dock stations and bays
>   * @handle: acpi handle of a device
>   * @lvl: unused
> - * @context: counter of dock stations found
> + * @context: unused
>   * @rv: unused
>   *
> - * This is called by acpi_walk_namespace to look for dock stations.
> + * This is called by acpi_walk_namespace to look for dock stations and bays.
>   */
>  static __init acpi_status
> -find_dock(acpi_handle handle, u32 lvl, void *context, void **rv)
> +find_dock_and_bay(acpi_handle handle, u32 lvl, void *context, void **rv)
>  {
> -	if (is_dock(handle))
> +	if (is_dock(handle) || is_ejectable_bay(handle))
>  		dock_add(handle);
>  
>  	return AE_OK;
>  }
>  
> -static __init acpi_status
> -find_bay(acpi_handle handle, u32 lvl, void *context, void **rv)
> -{
> -	/* If bay is a dock, it's already handled */
> -	if (is_ejectable_bay(handle) && !is_dock(handle))
> -		dock_add(handle);
> -	return AE_OK;
> -}
> -
>  static int __init dock_init(void)
>  {
>  	if (acpi_disabled)
>  		return 0;
>  
> -	/* look for a dock station */
> +	/* look for dock stations and bays */
>  	acpi_walk_namespace(ACPI_TYPE_DEVICE, ACPI_ROOT_OBJECT,
> -			    ACPI_UINT32_MAX, find_dock, NULL, NULL, NULL);
> +		ACPI_UINT32_MAX, find_dock_and_bay, NULL, NULL, NULL);
>  
> -	/* look for bay */
> -	acpi_walk_namespace(ACPI_TYPE_DEVICE, ACPI_ROOT_OBJECT,
> -			    ACPI_UINT32_MAX, find_bay, NULL, NULL, NULL);
>  	if (!dock_station_count) {
>  		printk(KERN_INFO PREFIX "No dock devices found.\n");
>  		return 0;
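
Why the single walk adds the same set of devices as the two walks can be checked with a small userspace model. `struct node`, `count_two_walks()` and `count_one_walk()` are hypothetical names for illustration; the two flags stand in for what is_dock() and is_ejectable_bay() would report for a handle.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical namespace node with the two properties the walk tests. */
struct node {
	bool dock;
	bool bay;
};

/* Old scheme: one pass for docks, a second for bays that are not docks. */
static size_t count_two_walks(const struct node *n, size_t len)
{
	size_t i, added = 0;

	for (i = 0; i < len; i++)
		if (n[i].dock)
			added++;
	for (i = 0; i < len; i++)
		if (n[i].bay && !n[i].dock)
			added++;
	return added;
}

/* New scheme: a single pass that adds a node if it is a dock or a bay. */
static size_t count_one_walk(const struct node *n, size_t len)
{
	size_t i, added = 0;

	for (i = 0; i < len; i++)
		if (n[i].dock || n[i].bay)
			added++;
	return added;
}
```

The model only compares how many nodes get added. One difference it does not capture is ordering: with two walks all docks are added before any bay, while the single walk adds them in namespace order.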
Re: [PATCH v5] create sun sysfs file
Hi Len, How about v5 patch? Thanks, Yasuaki Ishimatsu 2012/10/09 15:49, Yasuaki Ishimatsu wrote: _SUN method provides the slot unique-ID in the ACPI namespace. And The value is written in Advanced Configuration and Power Interface Specification as follows: The _SUN value is required to be unique among the slots ofthe same type. It is also recommended that this number match the slot number printed on the physical slot whenever possible. So if we can know the value, we can identify the physical position of the slot in the system. The patch creates sun file in sysfs for identifying physical position of the slot. Reviewed-by: Toshi Kani toshi.k...@hp.com Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com --- Documentation/ABI/testing/sysfs-devices-sun | 14 ++ drivers/acpi/scan.c | 24 include/acpi/acpi_bus.h |1 + 3 files changed, 39 insertions(+) Index: linux-3.6/include/acpi/acpi_bus.h === --- linux-3.6.orig/include/acpi/acpi_bus.h2012-10-09 11:54:17.690072343 +0900 +++ linux-3.6/include/acpi/acpi_bus.h 2012-10-09 14:15:49.207794671 +0900 @@ -208,6 +208,7 @@ struct acpi_device_pnp { struct list_head ids; /* _HID and _CIDs */ acpi_device_name device_name; /* Driver-determined */ acpi_device_class device_class; /* */ + unsigned long sun; /* _SUN */ }; #define acpi_device_bid(d) ((d)-pnp.bus_id) Index: linux-3.6/drivers/acpi/scan.c === --- linux-3.6.orig/drivers/acpi/scan.c2012-10-09 11:54:17.688072343 +0900 +++ linux-3.6/drivers/acpi/scan.c 2012-10-09 14:15:49.211794675 +0900 @@ -232,10 +232,20 @@ end: } static DEVICE_ATTR(path, 0444, acpi_device_path_show, NULL); +static ssize_t +acpi_device_sun_show(struct device *dev, struct device_attribute *attr, + char *buf) { + struct acpi_device *acpi_dev = to_acpi_device(dev); + + return sprintf(buf, %lu\n, acpi_dev-pnp.sun); +} +static DEVICE_ATTR(sun, 0444, acpi_device_sun_show, NULL); + static int acpi_device_setup_files(struct acpi_device *dev) { acpi_status status; acpi_handle temp; + unsigned long long sun; 
int result = 0; /* @@ -257,6 +267,16 @@ static int acpi_device_setup_files(struc goto end; } + status = acpi_evaluate_integer(dev-handle, _SUN, NULL, sun); + if (ACPI_SUCCESS(status)) { + dev-pnp.sun = (unsigned long)sun; + result = device_create_file(dev-dev, dev_attr_sun); + if (result) + goto end; + } else { + dev-pnp.sun = (unsigned long)-1; + } + /* * If device has _EJ0, 'eject' file is created that is used to trigger * hot-removal function from userland. @@ -281,6 +301,10 @@ static void acpi_device_remove_files(str if (ACPI_SUCCESS(status)) device_remove_file(dev-dev, dev_attr_eject); + status = acpi_get_handle(dev-handle, _SUN, temp); + if (ACPI_SUCCESS(status)) + device_remove_file(dev-dev, dev_attr_sun); + device_remove_file(dev-dev, dev_attr_modalias); device_remove_file(dev-dev, dev_attr_hid); if (dev-handle) Index: linux-3.6/Documentation/ABI/testing/sysfs-devices-sun === --- /dev/null 1970-01-01 00:00:00.0 + +++ linux-3.6/Documentation/ABI/testing/sysfs-devices-sun 2012-10-09 15:47:02.333245246 +0900 @@ -0,0 +1,14 @@ +Whatt: /sys/devices/.../sun +Date:October 2012 +Contact: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com +Description: + The file contains a Slot-unique ID which provided by the _SUN + method in the ACPI namespace. The value is written in Advanced + Configuration and Power Interface Specification as follows: + + The _SUN value is required to be unique among the slots of + the same type. It is also recommended that this number match + the slot number printed on the physical slot whenever possible. + + So reading the sysfs file, we can identify a physical position + of the slot in the system. 
-- To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http
[PATCH v2 0/2] Suppress Device device name does not have a release() function warning
This patch-set updates the following two patches

[1] memory-hotplug: add memory_block_release
[2] memory-hotplug: add node_device_release

from the following patch-set:

https://lkml.org/lkml/2012/9/27/39

So the patch-set version is v2.

v1 -> v2
[PATCH 1/2]
- change subject to Suppress "Device memoryX does not have a release()
  function" warning.
- Add detailed information to the description
- change function name from release_memory_block() to
  memory_block_release(), because other device release() functions are
  named device_name_release()

[PATCH 2/2]
- change subject to Suppress "Device nodeX does not have a release()
  function" warning.
- Add detailed information to the description
- Remove the memset() that initializes a node struct from
  node_device_release()
- Add a memset() that initializes a node struct to register_node()
[PATCH 1/2] Suppress "Device memoryX does not have a release() function" warning
When remove_memory_block() is called, device_release() prints the
following message:

    Device 'memory528' does not have a release() function, it is broken and must be fixed.

The reason is that memory_block's device struct does not have a release()
function.

So the patch registers memory_block_release() as the device's release()
function to suppress the warning. Additionally, the patch moves kfree(mem)
into the release function, since the release function exists as the means
to free a memory_block struct.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
CC: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 drivers/base/memory.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: linux-3.6/drivers/base/memory.c
===================================================================
--- linux-3.6.orig/drivers/base/memory.c	2012-10-11 11:37:33.404668048 +0900
+++ linux-3.6/drivers/base/memory.c	2012-10-11 11:38:27.865672989 +0900
@@ -70,6 +70,13 @@ void unregister_memory_isolate_notifier(
 }
 EXPORT_SYMBOL(unregister_memory_isolate_notifier);
 
+static void memory_block_release(struct device *dev)
+{
+	struct memory_block *mem = container_of(dev, struct memory_block, dev);
+
+	kfree(mem);
+}
+
 /*
  * register_memory - Setup a sysfs device for a memory block
  */
@@ -80,6 +87,7 @@ int register_memory(struct memory_block
 
 	memory->dev.bus = &memory_subsys;
 	memory->dev.id = memory->start_section_nr / sections_per_block;
+	memory->dev.release = memory_block_release;
 
 	error = device_register(&memory->dev);
 	return error;
@@ -630,7 +638,6 @@ int remove_memory_block(unsigned long no
 		mem_remove_simple_file(mem, phys_device);
 		mem_remove_simple_file(mem, removable);
 		unregister_memory(mem);
-		kfree(mem);
 	} else
 		kobject_put(&mem->dev.kobj);
 
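As an illustration of the pattern this patch relies on, here is a minimal user-space sketch, not kernel code. It assumes simplified stand-ins that do not appear in the patch: a toy `struct device` with a plain integer reference count, a local `container_of` macro, and the hypothetical helpers `device_put()` and `memory_block_alloc()`. It only shows why the free belongs in the release callback: the object goes away when the last reference is dropped, not at the moment the caller unregisters it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, stripped-down device with a plain reference count. */
struct device {
	int refcount;
	void (*release)(struct device *dev);
};

/* Hypothetical embedding object, loosely modelled on struct memory_block. */
struct memory_block {
	unsigned long start_section_nr;
	struct device dev;
};

static int released_blocks;	/* counts frees, used by the usage example */

/* Release callback: recover the outer object from the member and free it. */
static void memory_block_release(struct device *dev)
{
	struct memory_block *mem = container_of(dev, struct memory_block, dev);

	free(mem);
	released_blocks++;
}

/* Drop one reference; the last put invokes the release callback. */
static void device_put(struct device *dev)
{
	if (--dev->refcount == 0 && dev->release)
		dev->release(dev);
}

/* Allocate a block that holds one reference and has release registered. */
static struct memory_block *memory_block_alloc(unsigned long section_nr)
{
	struct memory_block *mem = calloc(1, sizeof(*mem));

	if (!mem)
		return NULL;
	mem->start_section_nr = section_nr;
	mem->dev.refcount = 1;
	mem->dev.release = memory_block_release;
	return mem;
}
```

In this sketch, a second holder that takes a reference before the first `device_put()` keeps the block alive until its own put, which is presumably the situation the direct kfree(mem) in remove_memory_block() could not handle.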
[PATCH 2/2] Suppress "Device nodeX does not have a release() function" warning
When unregister_node() is called, device_release() prints the following
message:

    Device 'node2' does not have a release() function, it is broken and must be fixed.

The reason is that the node's device struct does not have a release()
function.

So the patch registers node_device_release() as the device's release()
function to suppress the warning. Additionally, the patch adds a memset()
to register_node() to initialize the node struct. The node struct is part
of the node_devices[] array, so it cannot be freed by
node_device_release(). If the system reuses the node struct without
clearing it, the struct contains stale data.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
---
 drivers/base/node.c |   11 +++++++++++
 1 file changed, 11 insertions(+)

Index: linux-3.6/drivers/base/node.c
===================================================================
--- linux-3.6.orig/drivers/base/node.c	2012-10-11 10:04:02.149758748 +0900
+++ linux-3.6/drivers/base/node.c	2012-10-11 10:20:34.111806931 +0900
@@ -252,6 +252,14 @@ static inline void hugetlb_register_node
 static inline void hugetlb_unregister_node(struct node *node) {}
 
 #endif
 
+static void node_device_release(struct device *dev)
+{
+#if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_HUGETLBFS)
+	struct node *node_dev = to_node(dev);
+
+	flush_work(&node_dev->node_work);
+#endif
+}
 /*
  * register_node - Setup a sysfs device for a node.
@@ -263,8 +271,11 @@ int register_node(struct node *node, int
 {
 	int error;
 
+	memset(node, 0, sizeof(*node));
+
 	node->dev.id = num;
 	node->dev.bus = &node_subsys;
+	node->dev.release = node_device_release;
 	error = device_register(&node->dev);
 
 	if (!error){
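The reason for the memset() can be shown with a small user-space sketch. Everything here is a simplified assumption made for the example: the `struct node` fields, `MAX_NODES`, and the two-argument `register_node()` are stand-ins, not the kernel definitions. The point is only that a slot in a static array is never freed, so whatever the previous user left behind is still there unless registration clears it.

```c
#include <string.h>

#define MAX_NODES 4	/* hypothetical array size for the example */

/* Hypothetical, simplified node object; fields are illustrative only. */
struct node {
	int id;
	int registered;
	int stale_state;	/* stands in for any field left over from last use */
};

/* Statically allocated, like node_devices[]: slots are reused, never freed. */
static struct node node_devices[MAX_NODES];

/* Register a slot: clear leftovers first, then initialise the fields. */
static int register_node(struct node *node, int num)
{
	memset(node, 0, sizeof(*node));
	node->id = num;
	node->registered = 1;
	return 0;
}

/* Unregister a slot: nothing is freed, so old contents stay behind. */
static void unregister_node(struct node *node)
{
	node->registered = 0;
}
```

After an unregister, `stale_state` keeps its old value; only the next `register_node()` call resets it, which mirrors the hot-remove followed by hot-add sequence described above.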
Re: acpi : acpi_bus_trim() stops removing devices when failing to remove the device
Hi Toshi,

2012/10/10 22:01, Toshi Kani wrote:
> On Wed, 2012-10-10 at 10:07 +0900, Yasuaki Ishimatsu wrote:
>  :
>>>>  	if (acpi_drv) {
>>>>  		if (acpi_drv->ops.notify)
>>>>  			acpi_device_remove_notify_handler(acpi_dev);   THIS CALL
>>>> -		if (acpi_drv->ops.remove)
>>>> -			acpi_drv->ops.remove(acpi_dev, acpi_dev->removal_type);
>>>> +		if (acpi_drv->ops.remove) {
>>>> +			ret = acpi_drv->ops.remove(acpi_dev,
>>>> +						   acpi_dev->removal_type);
>>>> +			if (ret)
>>>
>>> Hi Yasuaki,
>>>
>>> Shouldn't the notify handler be reinstalled here if it was removed by
>>> the acpi_device_remove_notify_handler() above?
>>
>> I do not reinstall the notify handler. The function has not been removed
>> on linux-3.6. And the patch is created on linux-3.6. So the function
>> remains in the patch.
>
> Umm... I am not sure what you meant.  Let me clarify my comment.  When
> acpi_drv->ops.remove() failed, I thought we would need to roll-back the
> procedure done by the acpi_device_remove_notify_handler() call, which I
> indicated as THIS CALL above.  So, in this error path, don't we need
> something like below?
>
> 	if (acpi_drv->ops.notify)
> 		acpi_device_install_notify_handler(acpi_dev)

I understood what you said. I'll update it.

Thanks,
Yasuaki Ishimatsu

> Thanks,
> -Toshi
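The roll-back Toshi asks for can be sketched in user space as follows. This is an illustration under stated assumptions, not the ACPI code: `struct fake_device`, `struct driver_ops`, the `has_notify` flag and the helper names are hypothetical stand-ins. The sketch only shows the shape of the error path, where a failed remove callback undoes the handler removal before returning the error.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical driver ops; only the shape matters for this sketch. */
struct driver_ops {
	int (*remove)(void *dev);
	int has_notify;
};

/* Hypothetical device state tracked by the sketch. */
struct fake_device {
	const struct driver_ops *ops;
	int notify_installed;
	int bound;
};

static void remove_notify_handler(struct fake_device *dev)
{
	dev->notify_installed = 0;
}

static void install_notify_handler(struct fake_device *dev)
{
	dev->notify_installed = 1;
}

/* Detach the driver; on failure, restore the notify handler. */
static int device_remove(struct fake_device *dev)
{
	const struct driver_ops *ops = dev->ops;
	int ret;

	if (ops) {
		if (ops->has_notify)
			remove_notify_handler(dev);
		if (ops->remove) {
			ret = ops->remove(dev);
			if (ret)
				goto rollback;
		}
	}
	dev->ops = NULL;
	dev->bound = 0;
	return 0;

rollback:
	/* Only reached with ops != NULL, after remove() has failed. */
	if (ops->has_notify)
		install_notify_handler(dev);
	return ret;
}

/* Example callbacks: one that refuses removal, one that accepts it. */
static int remove_busy(void *dev) { (void)dev; return -EBUSY; }
static int remove_ok(void *dev) { (void)dev; return 0; }
```

When the callback fails, the device stays bound with its handler installed, so a later retry starts from the same state as the first attempt.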
Re: [PATCH 2/10] memory-hotplug : remove /sys/firmware/memmap/X sysfs
2012/10/06 4:36, KOSAKI Motohiro wrote:
> On Thu, Oct 4, 2012 at 10:26 PM, Yasuaki Ishimatsu
> <isimatu.yasu...@jp.fujitsu.com> wrote:
>> When (hot)adding memory into system, /sys/firmware/memmap/X/{end, start, type}
>> sysfs files are created. But there is no code to remove these files. The patch
>> implements the function to remove them.
>>
>> Note : The code does not free firmware_map_entry since there is no way to free
>>        memory which is allocated by bootmem.
>
> You have to explain why this is ok. I guess the unfreed
> firmware_map_entry is reused
> at next online memory and don't make memory leak, right?

Unfortunately, no. The patch leaks memory of firmware_map_entry size.
If we hot-add memory, the slab allocator prepares another piece of memory
for the firmware_map_entry.

In my understanding, memory allocated by the bootmem allocator is not
managed by the slab allocator, so we cannot use kfree() on it. On the
other hand, the page that holds the entry may also contain various other
data allocated by the bootmem allocator, so we cannot free the page
either.

So the patch leaks memory. But I think the leak is very small and does
not affect the system.

>> CC: David Rientjes <rient...@google.com>
>> CC: Jiang Liu <liu...@gmail.com>
>> CC: Len Brown <len.br...@intel.com>
>> CC: Christoph Lameter <c...@linux.com>
>> Cc: Minchan Kim <minchan@gmail.com>
>> CC: Andrew Morton <a...@linux-foundation.org>
>> CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
>> Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
>> ---
>>  drivers/firmware/memmap.c    |   98 ++-
>>  include/linux/firmware-map.h |    6 ++
>>  mm/memory_hotplug.c          |    7 ++-
>>  3 files changed, 108 insertions(+), 3 deletions(-)
>>
>> Index: linux-3.6/drivers/firmware/memmap.c
>> ===================================================================
>> --- linux-3.6.orig/drivers/firmware/memmap.c	2012-10-04 18:27:05.195500420 +0900
>> +++ linux-3.6/drivers/firmware/memmap.c	2012-10-04 18:27:18.901514330 +0900
>> @@ -21,6 +21,7 @@
>>  #include <linux/types.h>
>>  #include <linux/bootmem.h>
>>  #include <linux/slab.h>
>> +#include <linux/mm.h>
>>
>>  /*
>>   * Data types ------------------------------------------------------------------
>> @@ -41,6 +42,7 @@ struct firmware_map_entry {
>>  	const char		*type;	/* type of the memory range */
>>  	struct list_head	list;	/* entry for the linked list */
>>  	struct kobject		kobj;   /* kobject for each entry */
>> +	unsigned int		bootmem:1; /* allocated from bootmem */
>
> Use bool.

We'll update it.

>>  };
>>
>>  /*
>> @@ -79,7 +81,26 @@ static const struct sysfs_ops memmap_att
>>  	.show = memmap_attr_show,
>>  };
>>
>> +
>> +static inline struct firmware_map_entry *
>> +to_memmap_entry(struct kobject *kobj)
>> +{
>> +	return container_of(kobj, struct firmware_map_entry, kobj);
>> +}
>> +
>> +static void release_firmware_map_entry(struct kobject *kobj)
>> +{
>> +	struct firmware_map_entry *entry = to_memmap_entry(kobj);
>> +
>> +	if (entry->bootmem)
>> +		/* There is no way to free memory allocated from bootmem */
>> +		return;
>> +
>> +	kfree(entry);
>> +}
>> +
>>  static struct kobj_type memmap_ktype = {
>> +	.release	= release_firmware_map_entry,
>>  	.sysfs_ops	= &memmap_attr_ops,
>>  	.default_attrs	= def_attrs,
>>  };
>> @@ -94,6 +115,7 @@ static struct kobj_type memmap_ktype = {
>>   * in firmware initialisation code in one single thread of execution.
>>   */
>>  static LIST_HEAD(map_entries);
>> +static DEFINE_SPINLOCK(map_entries_lock);
>>
>>  /**
>>   * firmware_map_add_entry() - Does the real work to add a firmware memmap entry.
>> @@ -118,11 +140,25 @@ static int firmware_map_add_entry(u64 st
>>  	INIT_LIST_HEAD(&entry->list);
>>  	kobject_init(&entry->kobj, &memmap_ktype);
>>
>> +	spin_lock(&map_entries_lock);
>>  	list_add_tail(&entry->list, &map_entries);
>> +	spin_unlock(&map_entries_lock);
>>
>>  	return 0;
>>  }
>>
>> +/**
>> + * firmware_map_remove_entry() - Does the real work to remove a firmware
>> + * memmap entry.
>> + * @entry: removed entry.
>> + **/
>> +static inline void firmware_map_remove_entry(struct firmware_map_entry *entry)
>
> Don't use inline in *.c file. gcc is wise than you.

We'll update it.

>> +{
>> +	spin_lock(&map_entries_lock);
>> +	list_del(&entry->list);
>> +	spin_unlock(&map_entries_lock);
>> +}
>> +
>>  /*
>>   * Add memmap entry on sysfs
>>   */
>> @@ -144,6 +180,35 @@ static int add_sysfs_fw_map_entry(struct
>>  	return 0;
>>  }
>>
>> +/*
>> + * Remove memmap entry on sysfs
>> + */
>> +static inline void remove_sysfs_fw_map_entry(struct firmware_map_entry *entry)
>> +{
>> +	kobject_put(&entry->kobj);
>> +}
>> +
>> +/*
>> + * Search memmap entry
>> + */
>> +
>> +static struct firmware_map_entry * __meminit
>> +firmware_map_find_entry(u64 start, u64 end, const char *type)
>> +{
>> +	struct firmware_map_entry *entry;
>> +
>> +	spin_lock(&map_entries_lock);
>> +	list_for_each_entry(entry, &map_entries, list
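The release rule discussed in this thread (free entries that came from the normal allocator, deliberately skip entries that came from the boot-time allocator) can be sketched in user space like this. The `struct map_entry` type and `release_map_entry()` are hypothetical stand-ins for the example, and the flag is a `bool` as the review suggests. A statically allocated object plays the role of bootmem memory, since neither can be handed to the ordinary free routine.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, reduced version of a firmware map entry. */
struct map_entry {
	unsigned long long start;
	unsigned long long end;
	bool bootmem;	/* true: not owned by the normal allocator */
};

static int freed_entries;	/* counts frees, used by the usage example */

/* Release an entry; boot-time entries are intentionally left alone. */
static void release_map_entry(struct map_entry *entry)
{
	if (entry->bootmem)
		return;	/* cannot be freed: this is the small leak discussed above */

	free(entry);
	freed_entries++;
}
```

A hot-added entry would be allocated with `malloc()` and `bootmem = false`, so releasing it frees the memory, while releasing a boot-time entry is a no-op.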
[PATCH v2] acpi : acpi_bus_trim() stops removing devices when failing to remove the device
acpi_bus_trim() stops removing devices when acpi_bus_remove() returns an
error. But acpi_bus_remove() cannot report errors correctly: it only
returns -EINVAL when the dev argument is NULL. Thus, even if a device
cannot be removed, acpi_bus_trim() ignores the failure and continues to
remove devices.

acpi_bus_hot_remove_device() uses acpi_bus_trim() to remove devices.
Therefore acpi_bus_hot_remove_device() can send _EJ0 to firmware even if
the device is still in use on the system. In this case, the system cannot
work well.

Vasilis hit the bug in memory hotplug and reported it as follows:
https://lkml.org/lkml/2012/9/26/318

So acpi_bus_trim() should check whether the device was removed correctly.
The patch adds error checks to the functions that remove the device.

With the patch applied, acpi_bus_trim() stops removing devices when it
fails to remove a device. I think there is no impact outside the CPU and
memory hotplug paths, because other devices fail only in irregular cases
such as the device being NULL.

v1->v2
- Add a rollback that reinstalls the notify handler.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 drivers/acpi/scan.c    |   21 ++++++++++++++++++---
 drivers/base/dd.c      |   22 +++++++++++++++++-----
 include/linux/device.h |    2 +-
 3 files changed, 36 insertions(+), 9 deletions(-)

Index: linux-3.6/drivers/acpi/scan.c
===================================================================
--- linux-3.6.orig/drivers/acpi/scan.c	2012-10-11 18:31:40.189019503 +0900
+++ linux-3.6/drivers/acpi/scan.c	2012-10-11 18:42:35.669041641 +0900
@@ -445,18 +445,29 @@ static int acpi_device_remove(struct dev
 {
 	struct acpi_device *acpi_dev = to_acpi_device(dev);
 	struct acpi_driver *acpi_drv = acpi_dev->driver;
+	int ret;
 
 	if (acpi_drv) {
 		if (acpi_drv->ops.notify)
 			acpi_device_remove_notify_handler(acpi_dev);
-		if (acpi_drv->ops.remove)
-			acpi_drv->ops.remove(acpi_dev, acpi_dev->removal_type);
+		if (acpi_drv->ops.remove) {
+			ret = acpi_drv->ops.remove(acpi_dev,
+						   acpi_dev->removal_type);
+			if (ret)
+				goto rollback;
+		}
 	}
 	acpi_dev->driver = NULL;
 	acpi_dev->driver_data = NULL;
 
 	put_device(dev);
 	return 0;
+
+rollback:
+	if (acpi_drv->ops.notify)
+		acpi_device_install_notify_handler(acpi_dev);
+
+	return ret;
 }
 
 struct bus_type acpi_bus_type = {
@@ -1226,11 +1237,15 @@ static int acpi_device_set_context(struc
 
 static int acpi_bus_remove(struct acpi_device *dev, int rmdevice)
 {
+	int ret;
+
 	if (!dev)
 		return -EINVAL;
 
 	dev->removal_type = ACPI_BUS_REMOVAL_EJECT;
-	device_release_driver(&dev->dev);
+	ret = device_release_driver(&dev->dev);
+	if (ret)
+		return ret;
 
 	if (!rmdevice)
 		return 0;
Index: linux-3.6/drivers/base/dd.c
===================================================================
--- linux-3.6.orig/drivers/base/dd.c	2012-10-11 18:31:40.191019505 +0900
+++ linux-3.6/drivers/base/dd.c	2012-10-11 18:31:46.873020548 +0900
@@ -475,9 +475,10 @@ EXPORT_SYMBOL_GPL(driver_attach);
  * __device_release_driver() must be called with @dev lock held.
  * When called for a USB interface, @dev->parent lock must be held as well.
  */
-static void __device_release_driver(struct device *dev)
+static int __device_release_driver(struct device *dev)
 {
 	struct device_driver *drv;
+	int ret = 0;
 
 	drv = dev->driver;
 	if (drv) {
@@ -493,9 +494,11 @@ static void __device_release_driver(stru
 		pm_runtime_put_sync(dev);
 
 		if (dev->bus && dev->bus->remove)
-			dev->bus->remove(dev);
+			ret = dev->bus->remove(dev);
 		else if (drv->remove)
-			drv->remove(dev);
+			ret = drv->remove(dev);
+		if (ret)
+			goto rollback;
 		devres_release_all(dev);
 		dev->driver = NULL;
 		dev_set_drvdata(dev, NULL);
@@ -506,6 +509,12 @@ static void __device_release_driver(stru
 						     dev);
 
 	}
+
+	return ret;
+
+rollback:
+	driver_sysfs_add(dev);
+	return ret;
 }
 
 /**
@@ -515,16 +524,19 @@ static void __device_release_driver(stru
  * Manually detach device from driver.
  * When called for a USB interface, @dev->parent lock must be held.
  */
-void device_release_driver(struct device *dev)
+int device_release_driver(struct device *dev)
 {
+	int ret;
 	/*
 	 * If anyone calls device_release_driver() recursively from
 	 * within their ->remove callback for the same device
Re: [PATCH 2/2]suppress Device nodeX does not have a release() function warning
2012/10/12 5:31, David Rientjes wrote:
> On Thu, 11 Oct 2012, Yasuaki Ishimatsu wrote:
>
>> When calling unregister_node(), the function shows following message at
>> device_release().
>>
>> Device 'node2' does not have a release() function, it is broken and must
>> be fixed.
>>
>> The reason is node's device struct does not have a release() function.
>>
>> So the patch registers node_device_release() to the device's release()
>> function for suppressing the warning message. Additionally, the patch adds
>> memset() to initialize a node struct into register_node(). Because the node
>> struct is part of node_devices[] array and it cannot be freed by
>> node_device_release(). So if system reuses the node struct, it has a garbage.
>
> Nice catch on reuse of the statically allocated node_devices[] for node
> hotplug.
>
>> CC: David Rientjes <rient...@google.com>
>> CC: Jiang Liu <liu...@gmail.com>
>> Cc: Minchan Kim <minchan@gmail.com>
>> CC: Andrew Morton <a...@linux-foundation.org>
>> CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
>> Signed-off-by: Wen Congyang <we...@cn.fujitsu.com>
>
> Can register_node() be made static in drivers/base/node.c and its
> declaration removed from linux/node.h?

Yah. I'll fix it.

Thanks,
Yasuaki Ishimatsu

> Acked-by: David Rientjes <rient...@google.com>
Re: [PATCH v2] acpi : acpi_bus_trim() stops removing devices when failing to remove the device
Hi Toshi,

2012/10/11 22:58, Toshi Kani wrote:
> On Thu, 2012-10-11 at 19:12 +0900, Yasuaki Ishimatsu wrote:
>> acpi_bus_trim() stops removing devices, when acpi_bus_remove() return error
>> number. But acpi_bus_remove() cannot return error number correctly.
>> acpi_bus_remove() only return -EINVAL, when dev argument is NULL. Thus even if
>> device cannot be removed correctly, acpi_bus_trim() ignores and continues to
>> remove devices. acpi_bus_hot_remove_device() uses acpi_bus_trim() for removing
>> devices. Therefore acpi_bus_hot_remove_device() can send _EJ0 to firmware,
>> even if the device is running on the system. In this case, the system cannot
>> work well. Vasilis hit the bug at memory hotplug and reported it as follow:
>>
>> https://lkml.org/lkml/2012/9/26/318
>>
>> So acpi_bus_trim() should check whether device was removed or not correctly.
>> The patch adds error check into some functions to remove the device.
>>
>> Applying the patch, acpi_bus_trim() stops removing devices when failing
>> to remove the device. But I think there is no impact with the
>> exception of CPU and Memory hotplug path. Because other device also fails
>> but the fail is an irregular case like device is NULL.
>>
>> v1->v2
>> - add a rollback for reinstalling a notify handler.
>>
>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
>
> Thanks for the update.  Looks good.
>
> Reviewed-by: Toshi Kani <toshi.k...@hp.com>

Thank you for reviewing.

Thanks,
Yasuaki Ishimatsu

> -Toshi
>
>> ---
>>  drivers/acpi/scan.c    |   21 ++++++++++++++++++---
>>  drivers/base/dd.c      |   22 +++++++++++++++++-----
>>  include/linux/device.h |    2 +-
>>  3 files changed, 36 insertions(+), 9 deletions(-)
>>
>> Index: linux-3.6/drivers/acpi/scan.c
>> ===================================================================
>> --- linux-3.6.orig/drivers/acpi/scan.c	2012-10-11 18:31:40.189019503 +0900
>> +++ linux-3.6/drivers/acpi/scan.c	2012-10-11 18:42:35.669041641 +0900
>> @@ -445,18 +445,29 @@ static int acpi_device_remove(struct dev
>>  {
>>  	struct acpi_device *acpi_dev = to_acpi_device(dev);
>>  	struct acpi_driver *acpi_drv = acpi_dev->driver;
>> +	int ret;
>>
>>  	if (acpi_drv) {
>>  		if (acpi_drv->ops.notify)
>>  			acpi_device_remove_notify_handler(acpi_dev);
>> -		if (acpi_drv->ops.remove)
>> -			acpi_drv->ops.remove(acpi_dev, acpi_dev->removal_type);
>> +		if (acpi_drv->ops.remove) {
>> +			ret = acpi_drv->ops.remove(acpi_dev,
>> +						   acpi_dev->removal_type);
>> +			if (ret)
>> +				goto rollback;
>> +		}
>>  	}
>>  	acpi_dev->driver = NULL;
>>  	acpi_dev->driver_data = NULL;
>>
>>  	put_device(dev);
>>  	return 0;
>> +
>> +rollback:
>> +	if (acpi_drv->ops.notify)
>> +		acpi_device_install_notify_handler(acpi_dev);
>> +
>> +	return ret;
>>  }
>>
>>  struct bus_type acpi_bus_type = {
>> @@ -1226,11 +1237,15 @@ static int acpi_device_set_context(struc
>>
>>  static int acpi_bus_remove(struct acpi_device *dev, int rmdevice)
>>  {
>> +	int ret;
>> +
>>  	if (!dev)
>>  		return -EINVAL;
>>
>>  	dev->removal_type = ACPI_BUS_REMOVAL_EJECT;
>> -	device_release_driver(&dev->dev);
>> +	ret = device_release_driver(&dev->dev);
>> +	if (ret)
>> +		return ret;
>>
>>  	if (!rmdevice)
>>  		return 0;
>> Index: linux-3.6/drivers/base/dd.c
>> ===================================================================
>> --- linux-3.6.orig/drivers/base/dd.c	2012-10-11 18:31:40.191019505 +0900
>> +++ linux-3.6/drivers/base/dd.c	2012-10-11 18:31:46.873020548 +0900
>> @@ -475,9 +475,10 @@ EXPORT_SYMBOL_GPL(driver_attach);
>>   * __device_release_driver() must be called with @dev lock held.
>>   * When called for a USB interface, @dev->parent lock must be held as well.
>>   */
>> -static void __device_release_driver(struct device *dev)
>> +static int __device_release_driver(struct device *dev)
>>  {
>>  	struct device_driver *drv;
>> +	int ret = 0;
>>
>>  	drv = dev->driver;
>>  	if (drv) {
>> @@ -493,9 +494,11 @@ static void __device_release_driver(stru
>>  		pm_runtime_put_sync(dev);
>>
>>  		if (dev->bus && dev->bus->remove)
>> -			dev->bus->remove(dev);
>> +			ret = dev->bus->remove(dev);
>>  		else if (drv->remove)
>> -			drv->remove(dev);
>> +			ret = drv->remove(dev);
>> +		if (ret)
>> +			goto rollback;
>>  		devres_release_all(dev);
>>  		dev->driver = NULL;
>>  		dev_set_drvdata(dev, NULL);
>> @@ -506,6 +509,12 @@ static void __device_release_driver(stru
>>  						     dev);
>>
>>  	}
>> +
>> +	return ret;
>> +
>> +rollback:
>> +	driver_sysfs_add(dev);
>> +	return ret;
>>  }
>>
>>  /**
>> @@ -515,16 +524,19 @@ static void __device_release_driver(stru
>>   * Manually detach device from driver.
>>   * When called for a USB interface, @dev->parent lock must be held.
>>   */
>> -void
[PATCH] mm: cleanup register_node()
register_node() is declared extern in include/linux/node.h, but the
function is only called from register_one_node() in drivers/base/node.c.

So the patch makes register_node() static.

CC: David Rientjes <rient...@google.com>
CC: Andrew Morton <a...@linux-foundation.org>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 drivers/base/node.c  |    2 +-
 include/linux/node.h |    1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

Index: linux-3.6/drivers/base/node.c
===================================================================
--- linux-3.6.orig/drivers/base/node.c	2012-10-12 16:35:51.0 +0900
+++ linux-3.6/drivers/base/node.c	2012-10-12 16:52:25.294207322 +0900
@@ -259,7 +259,7 @@ static inline void hugetlb_unregister_no
  *
  * Initialize and register the node device.
  */
-int register_node(struct node *node, int num, struct node *parent)
+static int register_node(struct node *node, int num, struct node *parent)
 {
 	int error;
 
Index: linux-3.6/include/linux/node.h
===================================================================
--- linux-3.6.orig/include/linux/node.h	2012-10-01 08:47:46.0 +0900
+++ linux-3.6/include/linux/node.h	2012-10-12 16:52:55.215210433 +0900
@@ -30,7 +30,6 @@ struct memory_block;
 extern struct node node_devices[];
 typedef  void (*node_registration_func_t)(struct node *);
 
-extern int register_node(struct node *, int, struct node *);
 extern void unregister_node(struct node *node);
 #ifdef CONFIG_NUMA
 extern int register_one_node(int nid);
Re: [PATCH] Fix a hard coding style when determining if a device is a container.
Hi Tang,

2012/10/12 15:55, Tang Chen wrote:
> ACPI0004,PNP0A05 and PNP0A06 are all defined in array
> container_device_ids[], so use it, but not the hard coding style.

The idea is good.

> Signed-off-by: Tang Chen <tangc...@cn.fujitsu.com>
> ---
>  drivers/acpi/container.c |   10 +++++++---
>  1 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/acpi/container.c b/drivers/acpi/container.c
> index 1f9f7d7..448c0e2 100644
> --- a/drivers/acpi/container.c
> +++ b/drivers/acpi/container.c
> @@ -217,6 +217,7 @@ container_walk_namespace_cb(acpi_handle handle,
>  {
>  	char *hid = NULL;
>  	struct acpi_device_info *info;
> +	struct acpi_device_id *container_id;
>  	acpi_status status;
>  	int *action = context;
>
> @@ -232,10 +233,13 @@ container_walk_namespace_cb(acpi_handle handle,
>  		goto end;
>  	}
>
> -	if (strcmp(hid, "ACPI0004") && strcmp(hid, "PNP0A05") &&
> -			strcmp(hid, "PNP0A06")) {
> -		goto end;
> +	for (container_id = container_device_ids;
> +	     container_id->id[0]; container_id++) {
> +		if (!strcmp((char *)container_id->id, hid))
> +			break;
>  	}
> +	if (!container_id->id[0])
> +		goto end;

How about preparing an is_container_device() function and checking
whether the device is a container device, as below?

	if (!is_container_device())
		goto end;

Thanks,
Yasuaki Ishimatsu

>
>  	switch (*action) {
>  	case INSTALL_NOTIFY_HANDLER:
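One possible shape for the helper suggested above is sketched below as stand-alone C. The `struct device_id` type, its field size, and the `const char *hid` parameter are assumptions made for the example; the real table uses `struct acpi_device_id`, and the review does not say what arguments the helper should take. The table contents are the three IDs named in the patch description.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced ID table entry (the kernel uses struct acpi_device_id). */
struct device_id {
	char id[9];	/* room for an 8-character ID plus the terminator */
};

/* The three container IDs from the patch, ended by an empty entry. */
static const struct device_id container_device_ids[] = {
	{"ACPI0004"},
	{"PNP0A05"},
	{"PNP0A06"},
	{""},
};

/* Return true if hid matches any entry in the container ID table. */
static bool is_container_device(const char *hid)
{
	const struct device_id *id;

	for (id = container_device_ids; id->id[0]; id++) {
		if (!strcmp(id->id, hid))
			return true;
	}
	return false;
}
```

With a helper like this, the callback body reduces to a single negated check, and adding a new container ID only touches the table.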
Re: [PATCH v2] acpi : acpi_bus_trim() stops removing devices when failing to remove the device
Hi Greg,

Sorry for the late reply.

2012/10/20 2:59, Greg Kroah-Hartman wrote:
> On Fri, Oct 19, 2012 at 06:29:52AM +0200, Rafael J. Wysocki wrote:
>> On Thursday 11 of October 2012 19:12:28 Yasuaki Ishimatsu wrote:
>>> acpi_bus_trim() stops removing devices, when acpi_bus_remove() return error
>>> number. But acpi_bus_remove() cannot return error number correctly.
>>> acpi_bus_remove() only return -EINVAL, when dev argument is NULL. Thus even if
>>> device cannot be removed correctly, acpi_bus_trim() ignores and continues to
>>> remove devices. acpi_bus_hot_remove_device() uses acpi_bus_trim() for removing
>>> devices. Therefore acpi_bus_hot_remove_device() can send _EJ0 to firmware,
>>> even if the device is running on the system. In this case, the system cannot
>>> work well. Vasilis hit the bug at memory hotplug and reported it as follow:
>>>
>>> https://lkml.org/lkml/2012/9/26/318
>>>
>>> So acpi_bus_trim() should check whether device was removed or not correctly.
>>> The patch adds error check into some functions to remove the device.
>>>
>>> Applying the patch, acpi_bus_trim() stops removing devices when failing
>>> to remove the device. But I think there is no impact with the
>>> exception of CPU and Memory hotplug path. Because other device also fails
>>> but the fail is an irregular case like device is NULL.
>>>
>>> v1->v2
>>> - add a rollback for reinstalling a notify handler.
>>>
>>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
>>
>> Greg, do you think there may be any problems with the changes in dd.c?
>
> Yes, I don't like it.
>
> remove should always work, just like the exit call in a module.  It
> means that the core wants to remove the driver, so it is going to
> happen, a driver can't refuse it.
>
> Which brings me to the larger question, why would this solve anything?

We are now developing physical memory hot plug.
https://lkml.org/lkml/2012/10/23/213

If we apply that patch-set, we can hot-remove physical memory in the
following way:

echo 1 > /sys/bus/acpi/devices/PNP/eject

In this case, acpi_bus_hot_remove_device() tries to remove the memory
device with acpi_bus_trim(). But if the memory contains unremovable
memory, memory hot-remove fails and the memory remains in the kernel.
However, acpi_bus_trim() cannot notice that memory hot-remove failed, and
it returns 0. So acpi_bus_hot_remove_device() continues to remove memory
devices and sends the _EJ0 method to firmware. As a result the memory
device can no longer be used, yet the memory still remains in the kernel.
If someone then accesses the memory, a kernel panic occurs.

Thanks,
Yasuaki Ishimatsu

> If the kernel wants to unbind a device, why would we ever not want that
> to happen?
>
> So, NAK on this patch, sorry.  Fix up the ACPI core to handle this
> properly, don't mess with the driver core here.
>
> greg k-h
[PATCH v2 1/2] acpi : cpu hot-remove returns error when cpu_down() fails
v1 -> v2
- Address Rafael's comment.

>>  static int acpi_processor_handle_eject(struct acpi_processor *pr)
>>  {
>> -	if (cpu_online(pr->id))
>> -		cpu_down(pr->id);
>> +	int ret = 0;
>> +
>> +	if (cpu_online(pr->id)) {
>> +		ret = cpu_down(pr->id);
>
> If you defined ret here ...
>
>> +		if (ret)
>> +			return ret;
>> +	}
>>
>>  	arch_unregister_cpu(pr->id);
>>  	acpi_unmap_lsapic(pr->id);
>> -	return (0);
>> +	return ret;
>
> ... this line wouldn't need to be changed.

---
Even if cpu_down() fails, acpi_processor_remove() continues to remove the
cpu. But in this case it should return an error, since some process may
still be running on the cpu. If the cpu has a running process and is
powered off, the system may not work well.

Reviewed-by: Srivatsa S. Bhat <srivatsa.b...@linux.vnet.ibm.com>
Reviewed-by: Toshi Kani <toshi.k...@hp.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 drivers/acpi/processor_driver.c |   16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

Index: linux-3.7-rc2/drivers/acpi/processor_driver.c
===================================================================
--- linux-3.7-rc2.orig/drivers/acpi/processor_driver.c	2012-10-21 04:11:32.0 +0900
+++ linux-3.7-rc2/drivers/acpi/processor_driver.c	2012-10-26 18:15:43.721665836 +0900
@@ -605,7 +605,7 @@ err_free_pr:
 static int acpi_processor_remove(struct acpi_device *device, int type)
 {
 	struct acpi_processor *pr = NULL;
-
+	int ret;
 
 	if (!device || !acpi_driver_data(device))
 		return -EINVAL;
@@ -616,8 +616,9 @@ static int acpi_processor_remove(struct
 		goto free;
 
 	if (type == ACPI_BUS_REMOVAL_EJECT) {
-		if (acpi_processor_handle_eject(pr))
-			return -EINVAL;
+		ret = acpi_processor_handle_eject(pr);
+		if (ret)
+			return ret;
 	}
 
 	acpi_processor_power_exit(pr);
@@ -848,8 +849,13 @@ static acpi_status acpi_processor_hotadd
 
 static int acpi_processor_handle_eject(struct acpi_processor *pr)
 {
-	if (cpu_online(pr->id))
-		cpu_down(pr->id);
+	int ret = 0;
+
+	if (cpu_online(pr->id)) {
+		ret = cpu_down(pr->id);
+		if (ret)
+			return ret;
+	}
 
 	arch_unregister_cpu(pr->id);
 	acpi_unmap_lsapic(pr->id);
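Rafael's remark about where to define ret can be illustrated with a small user-space sketch. `struct fake_cpu`, `fake_cpu_down()` and `handle_eject()` are hypothetical stand-ins, not the ACPI processor driver. The sketch shows one reading of the suggestion: declare ret inside the only branch that uses it, so the final return of the function stays a plain success return.

```c
#include <errno.h>

/* Hypothetical CPU state for the sketch. */
struct fake_cpu {
	int online;
	int busy;	/* nonzero: taking the CPU down must fail */
	int registered;
};

/* Stand-in for cpu_down(): refuses while the CPU is busy. */
static int fake_cpu_down(struct fake_cpu *cpu)
{
	if (cpu->busy)
		return -EBUSY;
	cpu->online = 0;
	return 0;
}

/* Eject handling: stop and report the error if the CPU cannot go down. */
static int handle_eject(struct fake_cpu *cpu)
{
	if (cpu->online) {
		int ret = fake_cpu_down(cpu);	/* scoped to its only use */

		if (ret)
			return ret;
	}

	/* Reached only when the CPU is offline; safe to unregister. */
	cpu->registered = 0;
	return 0;
}
```

Either placement of ret gives the same behaviour; the narrower scope just keeps the success path unchanged in the diff.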
Re: [PATCH 02/46] x86, mm: Split out split_mem_range from init_memory_mapping
2012/11/13 6:17, Yinghai Lu wrote:
> So make init_memory_mapping smaller and readable.
>
> Suggested-by: Ingo Molnar <mi...@elte.hu>
> Signed-off-by: Yinghai Lu <ying...@kernel.org>
> Reviewed-by: Pekka Enberg <penb...@kernel.org>
> ---
>  arch/x86/mm/init.c |   42 ++++++++++++++++++++++++++----------------
>  1 files changed, 26 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index aa5b0da..6d8e102 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -146,25 +146,13 @@ static int __meminit save_mr(struct map_range *mr, int nr_range,
>  	return nr_range;
>  }
>
> -/*
> - * Setup the direct mapping of the physical memory at PAGE_OFFSET.
> - * This runs before bootmem is initialized and gets pages directly from
> - * the physical memory. To access them they are temporarily mapped.
> - */
> -unsigned long __init_refok init_memory_mapping(unsigned long start,
> -					       unsigned long end)
> +static int __meminit split_mem_range(struct map_range *mr, int nr_range,
> +				     unsigned long start,
> +				     unsigned long end)
>  {
>  	unsigned long start_pfn, end_pfn;
> -	unsigned long ret = 0;
>  	unsigned long pos;
> -	struct map_range mr[NR_RANGE_MR];
> -	int nr_range, i;
> -
> -	printk(KERN_INFO "init_memory_mapping: [mem %#010lx-%#010lx]\n",
> -	       start, end - 1);
> -
> -	memset(mr, 0, sizeof(mr));
> -	nr_range = 0;
> +	int i;
>
>  	/* head if not big page alignment ? */
>  	start_pfn = start >> PAGE_SHIFT;
> @@ -258,6 +246,28 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
>  			(mr[i].page_size_mask & (1<<PG_LEVEL_1G))?"1G":(
>  			 (mr[i].page_size_mask & (1<<PG_LEVEL_2M))?"2M":"4k"));
>
> +	return nr_range;
> +}
> +
> +/*
> + * Setup the direct mapping of the physical memory at PAGE_OFFSET.
> + * This runs before bootmem is initialized and gets pages directly from
> + * the physical memory. To access them they are temporarily mapped.
> + */
> +unsigned long __init_refok init_memory_mapping(unsigned long start,
> +					       unsigned long end)
> +{
> +	struct map_range mr[NR_RANGE_MR];
> +	unsigned long ret = 0;
> +	int nr_range, i;
> +
> +	pr_info("init_memory_mapping: [mem %#010lx-%#010lx]\n",
> +	       start, end - 1);
> +
> +	memset(mr, 0, sizeof(mr));
> +	nr_range = 0;

This is unnecessary since it is set in the below.

> +	nr_range = split_mem_range(mr, nr_range, start, end);

Thanks,
Yasuaki Ishimatsu

> +
>  	/*
>  	 * Find space for the kernel direct mapping tables.
>  	 *
[PATCH v6] create sun sysfs file
Hi Rafael,

The patch was rebased on linux-next. And it has been acked by Len:
https://lkml.org/lkml/2012/10/10/65

So please merge it into your tree.

---
The _SUN method provides the slot unique ID in the ACPI namespace. The
value is described in the Advanced Configuration and Power Interface
Specification as follows:

"The _SUN value is required to be unique among the slots of the same type.
It is also recommended that this number match the slot number printed on
the physical slot whenever possible."

So if we know the value, we can identify the physical position of the
slot in the system.

The patch creates a "sun" file in sysfs for identifying the physical
position of the slot.

Reviewed-by: Toshi Kani <toshi.k...@hp.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
---
 Documentation/ABI/testing/sysfs-devices-sun |   14 ++++++++++++++
 drivers/acpi/scan.c                         |   24 ++++++++++++++++++++++++
 include/acpi/acpi_bus.h                     |    1 +
 3 files changed, 39 insertions(+)

Index: linux-next/include/acpi/acpi_bus.h
===================================================================
--- linux-next.orig/include/acpi/acpi_bus.h	2012-11-12 17:58:08.0 +0900
+++ linux-next/include/acpi/acpi_bus.h	2012-11-12 19:07:33.577427071 +0900
@@ -179,6 +179,7 @@ struct acpi_device_pnp {
 	acpi_device_name device_name;	/* Driver-determined */
 	acpi_device_class device_class;	/*        "          */
 	union acpi_object *str_obj;	/* unicode string for _STR method */
+	unsigned long sun;		/* _SUN */
 };
 
 #define acpi_device_bid(d)	((d)->pnp.bus_id)
Index: linux-next/drivers/acpi/scan.c
===================================================================
--- linux-next.orig/drivers/acpi/scan.c	2012-11-12 17:57:44.0 +0900
+++ linux-next/drivers/acpi/scan.c	2012-11-12 19:08:53.387428254 +0900
@@ -292,11 +292,21 @@ static ssize_t description_show(struct d
 }
 static DEVICE_ATTR(description, 0444, description_show, NULL);
 
+static ssize_t
+acpi_device_sun_show(struct device *dev, struct device_attribute *attr,
+		     char *buf) {
+	struct acpi_device *acpi_dev = to_acpi_device(dev);
+
+	return sprintf(buf, "%lu\n", acpi_dev->pnp.sun);
+}
+static DEVICE_ATTR(sun, 0444, acpi_device_sun_show, NULL);
+
 static int acpi_device_setup_files(struct acpi_device *dev)
 {
 	struct acpi_buffer buffer = {ACPI_ALLOCATE_BUFFER, NULL};
 	acpi_status status;
 	acpi_handle temp;
+	unsigned long long sun;
 	int result = 0;
 
 	/*
@@ -338,6 +348,16 @@ static int acpi_device_setup_files(struc
 	if (dev->pnp.unique_id)
 		result = device_create_file(&dev->dev, &dev_attr_uid);
 
+	status = acpi_evaluate_integer(dev->handle, "_SUN", NULL, &sun);
+	if (ACPI_SUCCESS(status)) {
+		dev->pnp.sun = (unsigned long)sun;
+		result = device_create_file(&dev->dev, &dev_attr_sun);
+		if (result)
+			goto end;
+	} else {
+		dev->pnp.sun = (unsigned long)-1;
+	}
+
 	/*
 	 * If device has _EJ0, 'eject' file is created that is used to trigger
 	 * hot-removal function from userland.
@@ -369,6 +389,10 @@ static void acpi_device_remove_files(str
 	if (ACPI_SUCCESS(status))
 		device_remove_file(&dev->dev, &dev_attr_eject);
 
+	status = acpi_get_handle(dev->handle, "_SUN", &temp);
+	if (ACPI_SUCCESS(status))
+		device_remove_file(&dev->dev, &dev_attr_sun);
+
 	if (dev->pnp.unique_id)
 		device_remove_file(&dev->dev, &dev_attr_uid);
 	if (dev->flags.bus_address)
Index: linux-next/Documentation/ABI/testing/sysfs-devices-sun
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-next/Documentation/ABI/testing/sysfs-devices-sun	2012-11-12 19:09:26.854428750 +0900
@@ -0,0 +1,14 @@
+What:		/sys/devices/.../sun
+Date:		October 2012
+Contact:	Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
+Description:
+		The file contains a Slot-unique ID which provided by the _SUN
+		method in the ACPI namespace. The value is written in Advanced
+		Configuration and Power Interface Specification as follows:
+
+		"The _SUN value is required to be unique among the slots of
+		the same type. It is also recommended that this number match
+		the slot number printed on the physical slot whenever possible."
+
+		So reading the sysfs file, we can identify a physical position
+		of the slot in the system.
-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
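A minimal user-space sketch of how the text read from a sun file could be parsed, assuming the "%lu\n" format that acpi_device_sun_show() prints in the patch above. The helper name parse_sun_attr() is hypothetical and not part of the patch; since the sysfs path is elided as /sys/devices/.../sun, the sketch works on a buffer that the caller has already read.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the contents of a "sun" sysfs file.
 * Assumes the "%lu\n" format used by acpi_device_sun_show().
 * Returns 0 and stores the slot number in *sun, or -EINVAL on bad input.
 */
static int parse_sun_attr(const char *buf, unsigned long *sun)
{
	char *end;
	unsigned long val;

	assert(buf != NULL && sun != NULL);

	/* strtoul() accepts leading spaces and signs; the file has neither. */
	if (*buf < '0' || *buf > '9')
		return -EINVAL;

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno == ERANGE)
		return -EINVAL;

	/* Only a trailing newline (or nothing) may follow the digits. */
	if (*end != '\n' && *end != '\0')
		return -EINVAL;

	*sun = val;
	return 0;
}
```

Note that in the patch the file is only created when _SUN evaluates successfully, so a reader should treat a missing file as "no slot number" instead of expecting a sentinel value.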
Re: [PATCH] [resend] ACPI: Fix memory leak in acpi_bind_one() (fwd)
On Mon, 2012-10-15 at 20:51 +0200, Jesper Juhl wrote:
> Ok, so I had a little problem with my mail server's clock that caused the
> mail below to be timestamped a few years in the past, so I assume no one
> saw it - thus, resending.
>
> --
> Jesper Juhl j...@chaosbits.net http://www.chaosbits.net/
> Don't top-post http://www.catb.org/jargon/html/T/top-post.html
> Plain text mails only, please.
>
> ---------- Forwarded message ----------
> Date: Sun, 9 Nov 2008 14:38:30 +0100 (CET)
> From: Jesper Juhl j...@chaosbits.net
> To: linux-a...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org, Len Brown l...@kernel.org
> Subject: [PATCH] ACPI: Fix memory leak in acpi_bind_one()
>
> Memory is allocated with kzalloc() and assigned to 'physical_node'.
> Then 'physical_node->node_id' is initialized with a call to
> 'find_first_zero_bit()'. If that results in a value greater than or
> equal to ACPI_MAX_PHYSICAL_NODE, we end up jumping to the 'err:' label,
> leave the function, let 'physical_node' go out of scope and leak the
> memory we allocated.
> This patch fixes the leak by simply freeing the unused/unneeded memory
> pointed to by 'physical_node' just before we jump to 'err:'.
>
> Signed-off-by: Jesper Juhl j...@chaosbits.net

Looks good to me.
Reviewed-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

Thanks,
Yasuaki Ishimatsu

> ---
>  drivers/acpi/glue.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/acpi/glue.c b/drivers/acpi/glue.c
> index d1a2d74..0837308 100644
> --- a/drivers/acpi/glue.c
> +++ b/drivers/acpi/glue.c
> @@ -159,6 +159,7 @@ static int acpi_bind_one(struct device *dev, acpi_handle handle)
>  	if (physical_node->node_id >= ACPI_MAX_PHYSICAL_NODE) {
>  		retval = -ENOSPC;
>  		mutex_unlock(&acpi_dev->physical_node_lock);
> +		kfree(physical_node);
>  		goto err;
>  	}
>
> --
> 1.7.1
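The leak described in this patch is an instance of a general pattern: an object allocated early must be released on every error path that does not hand it over. A self-contained user-space sketch follows. All names in it (struct node, bind_one(), MAX_NODES, the live_allocs counter) are illustrative stand-ins and not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MAX_NODES 4	/* illustrative stand-in for ACPI_MAX_PHYSICAL_NODE */

struct node {
	int id;
};

static int live_allocs;	/* outstanding allocations, for demonstration only */

static struct node *node_alloc(void)
{
	struct node *n = calloc(1, sizeof(*n));

	if (n)
		live_allocs++;
	return n;
}

static void node_free(struct node *n)
{
	free(n);
	live_allocs--;
}

/* Hypothetical analogue of acpi_bind_one(): allocate, validate, hand over. */
static int bind_one(int id, struct node **out)
{
	int retval;
	struct node *n;

	assert(out != NULL);

	n = node_alloc();
	if (!n)
		return -ENOMEM;

	n->id = id;
	if (n->id >= MAX_NODES) {
		retval = -ENOSPC;
		node_free(n);	/* without this call, n leaks on the error path */
		goto err;
	}

	*out = n;		/* ownership passes to the caller */
	return 0;

err:
	return retval;
}
```

Counting allocations as above is only a demonstration aid; in the kernel, a tool such as kmemleak would be the usual way to catch this class of bug.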
Re: [PATCH 1/2] ACPI: Fix stale pointer access to flags.lockable
2012/10/16 1:34, Toshi Kani wrote:
> During hot-remove, acpi_bus_hot_remove_device() calls the ACPI _LCK method
> when device->flags.lockable is set. However, this device pointer is
> stale since the target acpi_device object has already been kfree'd
> by acpi_bus_trim().
>
> The flags.lockable indicates whether or not this ACPI object
> implements the _LCK method. Fix the stale pointer access by replacing
> it with acpi_get_handle() to check if _LCK is implemented.
>
> Signed-off-by: Toshi Kani toshi.k...@hp.com

Looks good to me.
Reviewed-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

> ---
>  drivers/acpi/scan.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 1fcb867..ed87f43 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -97,6 +97,7 @@ void acpi_bus_hot_remove_device(void *context)
>  	struct acpi_eject_event *ej_event = (struct acpi_eject_event *) context;
>  	struct acpi_device *device;
>  	acpi_handle handle = ej_event->handle;
> +	acpi_handle temp;
>  	struct acpi_object_list arg_list;
>  	union acpi_object arg;
>  	acpi_status status = AE_OK;
> @@ -117,13 +118,16 @@ void acpi_bus_hot_remove_device(void *context)
>  		goto err_out;
>  	}
>
> +	/* device has been freed */
> +	device = NULL;
> +
>  	/* power off device */
>  	status = acpi_evaluate_object(handle, "_PS3", NULL, NULL);
>  	if (ACPI_FAILURE(status) && status != AE_NOT_FOUND)
>  		printk(KERN_WARNING PREFIX
>  				"Power-off device failed\n");
>
> -	if (device->flags.lockable) {
> +	if (ACPI_SUCCESS(acpi_get_handle(handle, "_LCK", &temp))) {
>  		arg_list.count = 1;
>  		arg_list.pointer = &arg;
>  		arg.type = ACPI_TYPE_INTEGER;
Re: [PATCH 2/2] ACPI: Remove unused lockable in acpi_device_flags
2012/10/16 1:34, Toshi Kani wrote:
> Removed lockable in struct acpi_device_flags since it is no
> longer used by any code. acpi_bus_hot_remove_device() cannot
> use this flag because acpi_bus_trim() frees up its acpi_device
> object. Furthermore, the dock driver calls _LCK method without
> using this lockable flag.
>
> Signed-off-by: Toshi Kani toshi.k...@hp.com

Looks good to me.
Reviewed-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

> ---
>  drivers/acpi/scan.c     | 5 -----
>  include/acpi/acpi_bus.h | 3 +--
>  2 files changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index ed87f43..19d3d4a 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -1017,11 +1017,6 @@ static int acpi_bus_get_flags(struct acpi_device *device)
>  			device->flags.ejectable = 1;
>  	}
>
> -	/* Presence of _LCK indicates 'lockable' */
> -	status = acpi_get_handle(device->handle, "_LCK", &temp);
> -	if (ACPI_SUCCESS(status))
> -		device->flags.lockable = 1;
> -
>  	/* Power resources cannot be power manageable. */
>  	if (device->device_type == ACPI_BUS_TYPE_POWER)
>  		return 0;
> diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
> index 0daa0fb..e8b2877 100644
> --- a/include/acpi/acpi_bus.h
> +++ b/include/acpi/acpi_bus.h
> @@ -144,12 +144,11 @@ struct acpi_device_flags {
>  	u32 bus_address:1;
>  	u32 removable:1;
>  	u32 ejectable:1;
> -	u32 lockable:1;
>  	u32 suprise_removal_ok:1;
>  	u32 power_manageable:1;
>  	u32 performance_manageable:1;
>  	u32 eject_pending:1;
> -	u32 reserved:23;
> +	u32 reserved:24;
>  };
>
>  /* File System */
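The reserved:23 to reserved:24 change keeps the flag word at 32 bits after the one-bit lockable field is dropped. The sketch below checks that arithmetic at compile time. It is an illustration, not the kernel header: the hunk shows eight one-bit flags next to reserved:23, so one more bit must sit outside the hunk context for the total to reach 32, and it appears here under the made-up name other. The sizeof checks assume a compiler such as GCC or Clang that packs uint32_t bitfields into a single 32-bit unit.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;

/* Before the patch: 9 one-bit flags + 23 reserved bits = 32 bits. */
struct flags_before {
	u32 other:1;		/* assumed: one flag outside the hunk context */
	u32 bus_address:1;
	u32 removable:1;
	u32 ejectable:1;
	u32 lockable:1;
	u32 suprise_removal_ok:1;
	u32 power_manageable:1;
	u32 performance_manageable:1;
	u32 eject_pending:1;
	u32 reserved:23;
};

/* After the patch: lockable is gone, so reserved grows by one bit. */
struct flags_after {
	u32 other:1;		/* assumed, as above */
	u32 bus_address:1;
	u32 removable:1;
	u32 ejectable:1;
	u32 suprise_removal_ok:1;
	u32 power_manageable:1;
	u32 performance_manageable:1;
	u32 eject_pending:1;
	u32 reserved:24;
};

/* Both layouts should still occupy exactly one 32-bit word. */
_Static_assert(sizeof(struct flags_before) == sizeof(u32),
	       "flags_before must fit in one u32");
_Static_assert(sizeof(struct flags_after) == sizeof(u32),
	       "flags_after must fit in one u32");
```

If reserved had been left at 23, the struct would still be four bytes wide, but the documented bit budget would silently be off by one, which is why the patch adjusts it in the same change.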
Re: [PATCH 1/4] acpi,memory-hotplug : add memory offline code to acpi_memory_device_remove()
Hi Wen, 2012/10/17 18:52, Wen Congyang wrote: At 10/17/2012 05:18 PM, KOSAKI Motohiro Wrote: Hmm, it doesn't move the code. It just reuse the code in acpi_memory_powerdown_device(). Even if reuse or not reuse, you changed the behavior. If any changes has no good rational, you cannot get an ack. I don't understand this? IIRC, the behavior isn't changed. Heh, please explain why do you think so. We just introduce a function, and move codes from acpi_memory_disable_device() to the new function. We call the new function in acpi_memory_disable_device(), so the function acpi_memory_disable_device()'s behavior isn't changed. Maybe I don't understand what do you want to say. Ok, now you agreed you moved the code, yes? So then, you should explain why your code moving makes zero impact other acpi_memory_disable_device() caller. We just move the code, and don't change the acpi_memory_disable_device()'s behavior. I look it the change again, and found some diffs: 1. we treat !info-enabled as error, while it isn't a error without this patch 2. we remove memory info from the list, it is a bug fix because we free the memory that stores memory info.(I have sent a patch to fix this bug, and it is in akpm's tree now) I guess you mean 1 will change the behavior. In the last version, I don't do it. Ishimatsu changes this and I don't notify this. To Ishimatsu: Why do you change this? Oops. If so, it's my mistake. Could you update it in next version? 
Thanks, Yasuaki Ishimatsu Thanks Wen Congyang
Re: [PATCH 5/10] memory-hotplug : memory-hotplug: check page type in get_page_bootmem
Hi Kosaki,

Sorry for late reply.

2012/10/13 4:28, KOSAKI Motohiro wrote:
> On Thu, Oct 4, 2012 at 10:32 PM, Yasuaki Ishimatsu
> isimatu.yasu...@jp.fujitsu.com wrote:
>> The function get_page_bootmem() may be called more than one time to the same
>> page. There is no need to set page's type, private if the function is not
>> the first time called to the page.
>>
>> Note: the patch is just optimization and does not fix any problem.
>>
>> CC: David Rientjes rient...@google.com
>> CC: Jiang Liu liu...@gmail.com
>> CC: Len Brown len.br...@intel.com
>> CC: Christoph Lameter c...@linux.com
>> Cc: Minchan Kim minchan@gmail.com
>> CC: Andrew Morton a...@linux-foundation.org
>> CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
>> CC: Wen Congyang we...@cn.fujitsu.com
>> Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>> ---
>>  mm/memory_hotplug.c | 15 +++++++++++----
>>  1 file changed, 11 insertions(+), 4 deletions(-)
>>
>> Index: linux-3.6/mm/memory_hotplug.c
>> ===================================================================
>> --- linux-3.6.orig/mm/memory_hotplug.c	2012-10-04 18:29:58.284676075 +0900
>> +++ linux-3.6/mm/memory_hotplug.c	2012-10-04 18:30:03.454680542 +0900
>> @@ -95,10 +95,17 @@ static void release_memory_resource(stru
>>  static void get_page_bootmem(unsigned long info, struct page *page,
>>  			     unsigned long type)
>>  {
>> -	page->lru.next = (struct list_head *) type;
>> -	SetPagePrivate(page);
>> -	set_page_private(page, info);
>> -	atomic_inc(&page->_count);
>> +	unsigned long page_type;
>> +
>> +	page_type = (unsigned long)page->lru.next;
>
> If I understand correctly, page->lru.next might be uninitialized yet.

Ah yes. I was misunderstanding...

Hi Wen,

When you update the physical hot remove patch-set, please drop the patch.

Thanks,
Yasuaki Ishimatsu

> Moreover, I have no seen any good effect in this patch.
> I don't understand why we need to increase code complexity.
>
>> +	if (page_type < MEMORY_HOTPLUG_MIN_BOOTMEM_TYPE ||
>> +	    page_type > MEMORY_HOTPLUG_MAX_BOOTMEM_TYPE){
>> +		page->lru.next = (struct list_head *)type;
>> +		SetPagePrivate(page);
>> +		set_page_private(page, info);
>> +		atomic_inc(&page->_count);
>> +	} else
>> +		atomic_inc(&page->_count);
>>  }
>>
>>  /* reference to __meminit __free_pages_bootmem is valid
Re: [RFC PATCH v3 3/13] memory-hotplug : unify argument of firmware_map_add_early/hotplug
Hi Dave,

2012/07/12 22:40, Dave Hansen wrote:
> On 07/11/2012 09:52 PM, Yasuaki Ishimatsu wrote:
>> Does the following patch include your comment? If O.K., I will
>> separate the patch from the series and send it for bug fix.
>
> Looks sane to me. It does now mean that the calling conventions for
> some of the other firmware_map*() functions are different, but I think
> that's OK since they're only used internally to memmap.c.

Thank you for reviewing my patch. I'll send the patch.

Thanks,
Yasuaki Ishimatsu
Re: [RFC PATCH v3 3/13] memory-hotplug : unify argument of firmware_map_add_early/hotplug
Hi Dave,

2012/07/12 22:40, Dave Hansen wrote:
> On 07/11/2012 09:52 PM, Yasuaki Ishimatsu wrote:
>> Does the following patch include your comment? If O.K., I will
>> separate the patch from the series and send it for bug fix.
>
> Looks sane to me. It does now mean that the calling conventions for
> some of the other firmware_map*() functions are different, but I think
> that's OK since they're only used internally to memmap.c.

Can I add "Reviewed-by: Dave Hansen" to the patch?

Thanks,
Yasuaki Ishimatsu
Re: [PATCH v3 2/3 RESEND] acpi : prevent cpu from becoming online
2012/07/12 21:41, Srivatsa S. Bhat wrote:
> On 07/12/2012 05:10 PM, Yasuaki Ishimatsu wrote:
>> Even if acpi_processor_handle_eject() offlines the cpu, there is a chance
>> that the cpu is onlined again after that. So the patch closes the window
>> by using get/put_online_cpus().
>>
>> Why does the patch change the _cpu_up() logic?
>>
>> The patch addresses the race between cpu hot-remove and _cpu_up(). If the
>> patch does not change it, there is the following race.
>>
>> hot-remove cpu                         |  _cpu_up()
>> ---------------------------------------+---------------------------------------
>> call acpi_processor_handle_eject()     |
>>      call cpu_down()                   |
>>      call get_online_cpus()            |
>>                                        | call cpu_hotplug_begin() and stop here
>>      call arch_unregister_cpu()        |
>>      call acpi_unmap_lsapic()          |
>>      call put_online_cpus()            |
>>                                        | start and continue _cpu_up()
>>      return acpi_processor_remove()    |
>> continue hot-remove the cpu            |
>>
>> So _cpu_up() can continue by itself, and the cpu hot-remove can also
>> continue by itself. If the patch changes the _cpu_up() logic, the race
>> disappears as below:
>>
>> hot-remove cpu                         |  _cpu_up()
>> ---------------------------------------+---------------------------------------
>> call acpi_processor_handle_eject()     |
>>      call cpu_down()                   |
>>      call get_online_cpus()            |
>>                                        | call cpu_hotplug_begin() and stop here
>>      call arch_unregister_cpu()        |
>>      call acpi_unmap_lsapic()          |
>>           cpu's cpu_present is set     |
>>           to false by set_cpu_present()|
>>      call put_online_cpus()            |
>>                                        | start _cpu_up()
>>                                        | check cpu_present() and return -EINVAL
>>      return acpi_processor_remove()    |
>> continue hot-remove the cpu            |
>>
>> Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>
> Please consider fixing the grammar issue below (since it is a user-visible
> print statement). Other than that, everything looks fine.
>
> Reviewed-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
>
>> ---
>>  drivers/acpi/processor_driver.c |   14 ++++++++++++++
>>  kernel/cpu.c                    |    8 +++++---
>>  2 files changed, 19 insertions(+), 3 deletions(-)
>>
>> Index: linux-3.5-rc6/drivers/acpi/processor_driver.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/drivers/acpi/processor_driver.c	2012-07-12 20:34:29.438289841 +0900
>> +++ linux-3.5-rc6/drivers/acpi/processor_driver.c	2012-07-12 20:39:29.190542257 +0900
>> @@ -850,8 +850,22 @@ static int acpi_processor_handle_eject(s
>>  		return ret;
>>  	}
>>
>> +	get_online_cpus();
>> +	/*
>> +	 * The cpu might become online again at this point. So we check whether
>> +	 * the cpu has been onlined or not. If the cpu became online, it means
>> +	 * that someone wants to use the cpu. So acpi_processor_handle_eject()
>> +	 * returns -EAGAIN.
>> +	 */
>> +	if (unlikely(cpu_online(pr->id))) {
>> +		put_online_cpus();
>> +		printk(KERN_WARNING "Failed to remove CPU %d, "
>> +		       "since someone onlines the cpu\n" , pr->id);
>
> How about:
> "Failed to remove CPU %d, because some other task brought the CPU back online\n"

Looks good to me. I'll update it.

Thanks,
Yasuaki Ishimatsu

> Regards,
> Srivatsa S. Bhat
>
>> +		return -EAGAIN;
>> +	}
>>  	arch_unregister_cpu(pr->id);
>>  	acpi_unmap_lsapic(pr->id);
>> +	put_online_cpus();
>>  	return ret;
>>  }
>>  #else
>> Index: linux-3.5-rc6/kernel/cpu.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/kernel/cpu.c	2012-07-12 20:34:29.438289841 +0900
>> +++ linux-3.5-rc6/kernel/cpu.c	2012-07-12 20:34:35.040219535 +0900
>> @@ -343,11 +343,13 @@ static int __cpuinit _cpu_up(unsigned in
>>  	unsigned long mod = tasks_frozen ? CPU_TASKS_FROZEN : 0;
>>  	struct task_struct *idle;
>>
>> -	if (cpu_online(cpu) || !cpu_present(cpu))
>> -		return -EINVAL;
>> -
>>  	cpu_hotplug_begin();
>>
>> +	if (cpu_online(cpu) || !cpu_present(cpu)) {
>> +		ret = -EINVAL;
>> +		goto out;
>> +	}
>> +
>>  	idle = idle_thread_get(cpu);
>>  	if (IS_ERR(idle)) {
>>  		ret = PTR_ERR(idle);
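The _cpu_up() change in this thread moves the cpu_online()/cpu_present() test to after cpu_hotplug_begin(), so the state is validated only once the hotplug lock is held. A much simplified user-space sketch of that "check under the lock" idea follows. It deliberately swaps the kernel's get_online_cpus()/cpu_hotplug_begin() scheme for a single pthread mutex, and every name in it (hotplug_lock, the two flags, hot_remove(), cpu_up_checked()) is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* One mutex stands in for the kernel's hotplug locking (a simplification). */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static bool cpu_present_flag = true;	/* stand-in for cpu_present(cpu) */
static bool cpu_online_flag;		/* stand-in for cpu_online(cpu) */

/* Remover side: takes the cpu away while holding the lock. */
static void hot_remove(void)
{
	pthread_mutex_lock(&hotplug_lock);
	cpu_online_flag = false;
	cpu_present_flag = false;
	pthread_mutex_unlock(&hotplug_lock);
}

/*
 * Onliner side: the state is tested only after the lock is taken, so a
 * removal that completed while we were waiting is observed here.
 */
static int cpu_up_checked(void)
{
	int ret = 0;

	pthread_mutex_lock(&hotplug_lock);

	if (cpu_online_flag || !cpu_present_flag) {
		ret = -EINVAL;
		goto out;
	}

	assert(cpu_present_flag);	/* invariant once the check has passed */
	cpu_online_flag = true;

out:
	pthread_mutex_unlock(&hotplug_lock);
	return ret;
}
```

Had the test been done before pthread_mutex_lock(), as in the pre-patch _cpu_up(), a removal finishing between the test and the lock would go unnoticed, which is the window shown in the first race table above.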
Re: [PATCH v3 1/3] acpi : cpu hot-remove returns error when cpu_down() fails
Hi Toshi,

2012/07/13 1:48, Toshi Kani wrote:
> On Thu, 2012-07-12 at 20:22 +0900, Yasuaki Ishimatsu wrote:
>> Even if cpu_down() fails, acpi_processor_remove() continues to remove the cpu.
>> But in this case, it should return an error number since some process may
>> still run on the cpu. If the cpu has a running process and is powered off,
>> the system may not work well.
>>
>> Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>>
>> ---
>>  drivers/acpi/processor_driver.c |   18 ++++++++++++------
>>  1 file changed, 12 insertions(+), 6 deletions(-)
>>
>> Index: linux-3.5-rc4/drivers/acpi/processor_driver.c
>> ===================================================================
>> --- linux-3.5-rc4.orig/drivers/acpi/processor_driver.c	2012-06-25 04:53:04.0 +0900
>> +++ linux-3.5-rc4/drivers/acpi/processor_driver.c	2012-07-05 21:02:58.711285382 +0900
>> @@ -610,7 +610,7 @@ err_free_pr:
>>  static int acpi_processor_remove(struct acpi_device *device, int type)
>>  {
>>  	struct acpi_processor *pr = NULL;
>> -
>> +	int ret;
>>
>>  	if (!device || !acpi_driver_data(device))
>>  		return -EINVAL;
>> @@ -621,8 +621,9 @@ static int acpi_processor_remove(struct
>>  		goto free;
>>
>>  	if (type == ACPI_BUS_REMOVAL_EJECT) {
>> -		if (acpi_processor_handle_eject(pr))
>> -			return -EINVAL;
>> +		ret = acpi_processor_handle_eject(pr);
>> +		if (ret)
>> +			return ret;
>>  	}
>>
>>  	acpi_processor_power_exit(pr, device);
>> @@ -841,12 +842,17 @@ static acpi_status acpi_processor_hotadd
>>
>>  static int acpi_processor_handle_eject(struct acpi_processor *pr)
>>  {
>> -	if (cpu_online(pr->id))
>> -		cpu_down(pr->id);
>> +	int ret;
>> +
>> +	if (cpu_online(pr->id)) {
>> +		ret = cpu_down(pr->id);
>> +		if (ret)
>> +			return ret;
>> +	}
>>
>>  	arch_unregister_cpu(pr->id);
>>  	acpi_unmap_lsapic(pr->id);
>> -	return (0);
>> +	return ret;
>
> ret is uninitialized when !cpu_online().

Oops! I'll update it.

Thanks,
Yasuaki Ishimatsu

> Thanks,
> -Toshi
>
>>  }
>>  #else
>>  static acpi_status acpi_processor_hotadd_init(struct acpi_processor *pr)
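Toshi's remark is about a common C pitfall: a return variable that is assigned only inside a conditional branch is indeterminate when the branch is skipped. A small user-space sketch of the corrected shape follows, with ret initialized to 0. The fake_* names are hypothetical stand-ins for cpu_online() and cpu_down(), not kernel interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the cpu state and for cpu_down()'s result. */
static bool fake_online;
static int fake_down_result;

static int fake_cpu_down(void)
{
	if (fake_down_result == 0)
		fake_online = false;
	return fake_down_result;
}

/* Analogue of acpi_processor_handle_eject() with the review fix applied. */
static int handle_eject(void)
{
	int ret = 0;	/* must be initialized: the branch below may be skipped */

	if (fake_online) {
		ret = fake_cpu_down();
		if (ret)
			return ret;	/* offlining failed, stop the removal */
	}

	/* From here on the cpu is known to be offline. */
	assert(!fake_online);
	/* ... the unregister and unmap steps would follow here ... */

	return ret;
}
```

With `int ret;` left uninitialized, the path where the cpu is already offline would return whatever happened to be on the stack; compilers can often flag this with -Wall (via -Wmaybe-uninitialized in GCC), though not in every case.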
Re: [PATCH v3 2/3 RESEND] acpi : prevent cpu from becoming online
Hi Toshi, 2012/07/13 1:49, Toshi Kani wrote: On Thu, 2012-07-12 at 20:40 +0900, Yasuaki Ishimatsu wrote: Even if acpi_processor_handle_eject() offlines cpu, there is a chance to online the cpu after that. So the patch closes the window by using get/put_online_cpus(). Why does the patch change _cpu_up() logic? The patch cares the race of hot-remove cpu and _cpu_up(). If the patch does not change it, there is the following race. hot-remove cpu | _cpu_up() - call acpi_processor_handle_eject() | call cpu_down() | call get_online_cpus()| | call cpu_hotplug_begin() and stop here call arch_unregister_cpu()| call acpi_unmap_lsapic() | call put_online_cpus()| | start and continue _cpu_up() return acpi_processor_remove()| continue hot-remove the cpu| So _cpu_up() can continue to itself. And hot-remove cpu can also continue itself. If the patch changes _cpu_up() logic, the race disappears as below: hot-remove cpu | _cpu_up() --- call acpi_processor_handle_eject() | call cpu_down() | call get_online_cpus()| | call cpu_hotplug_begin() and stop here call arch_unregister_cpu()| call acpi_unmap_lsapic() | cpu's cpu_present is set | to false by set_cpu_present()| call put_online_cpus()| | start _cpu_up() | check cpu_present() and return -EINVAL return acpi_processor_remove()| continue hot-remove the cpu| Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com --- drivers/acpi/processor_driver.c | 14 ++ kernel/cpu.c|8 +--- 2 files changed, 19 insertions(+), 3 deletions(-) Index: linux-3.5-rc6/drivers/acpi/processor_driver.c === --- linux-3.5-rc6.orig/drivers/acpi/processor_driver.c 2012-07-12 20:34:29.438289841 +0900 +++ linux-3.5-rc6/drivers/acpi/processor_driver.c 2012-07-12 20:39:29.190542257 +0900 @@ -850,8 +850,22 @@ static int acpi_processor_handle_eject(s return ret; } + get_online_cpus(); + /* +* The cpu might become online again at this point. So we check whether +* the cpu has been onlined or not. 
If the cpu became online, it means +* that someone wants to use the cpu. So acpi_processor_handle_eject() +* returns -EAGAIN. +*/ + if (unlikely(cpu_online(pr-id))) { + put_online_cpus(); + printk(KERN_WARNING Failed to remove CPU %d, + since someone onlines the cpu\n , pr-id); pr_warn() should be used per the recent checkpatch change. O.K. I'll update it. Thanks, Yasuaki Ishimatsu Thanks, -Toshi + return -EAGAIN; + } arch_unregister_cpu(pr-id); acpi_unmap_lsapic(pr-id); + put_online_cpus(); return ret; } #else Index: linux-3.5-rc6/kernel/cpu.c === --- linux-3.5-rc6.orig/kernel/cpu.c 2012-07-12 20:34:29.438289841 +0900 +++ linux-3.5-rc6/kernel/cpu.c 2012-07-12 20:34:35.040219535 +0900 @@ -343,11 +343,13 @@ static int __cpuinit _cpu_up(unsigned in unsigned long mod = tasks_frozen ? CPU_TASKS_FROZEN : 0; struct task_struct *idle; - if (cpu_online(cpu) || !cpu_present(cpu)) - return -EINVAL; - cpu_hotplug_begin(); + if (cpu_online(cpu) || !cpu_present(cpu)) { + ret = -EINVAL; + goto out; + } + idle = idle_thread_get(cpu); if (IS_ERR(idle)) { ret = PTR_ERR(idle); -- To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 1/3] acpi : cpu hot-remove returns error when cpu_down() fails
Hi Srivatsa, 2012/07/12 21:32, Srivatsa S. Bhat wrote: On 07/12/2012 04:52 PM, Yasuaki Ishimatsu wrote: Even if cpu_down() fails, acpi_processor_remove() continues to remove the cpu. But in this case, it should return error number since some process may run on the cpu. If the cpu has a running process and the cpu is turned the power off, the system may not work well. Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com Reviewed-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com Thank you for reviewing. Thanks, Yasuaki Ishimatsu Regards, Srivatsa S. Bhat --- drivers/acpi/processor_driver.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) Index: linux-3.5-rc4/drivers/acpi/processor_driver.c === --- linux-3.5-rc4.orig/drivers/acpi/processor_driver.c 2012-06-25 04:53:04.0 +0900 +++ linux-3.5-rc4/drivers/acpi/processor_driver.c2012-07-05 21:02:58.711285382 +0900 @@ -610,7 +610,7 @@ err_free_pr: static int acpi_processor_remove(struct acpi_device *device, int type) { struct acpi_processor *pr = NULL; - +int ret; if (!device || !acpi_driver_data(device)) return -EINVAL; @@ -621,8 +621,9 @@ static int acpi_processor_remove(struct goto free; if (type == ACPI_BUS_REMOVAL_EJECT) { -if (acpi_processor_handle_eject(pr)) -return -EINVAL; +ret = acpi_processor_handle_eject(pr); +if (ret) +return ret; } acpi_processor_power_exit(pr, device); @@ -841,12 +842,17 @@ static acpi_status acpi_processor_hotadd static int acpi_processor_handle_eject(struct acpi_processor *pr) { -if (cpu_online(pr-id)) -cpu_down(pr-id); +int ret; + +if (cpu_online(pr-id)) { +ret = cpu_down(pr-id); +if (ret) +return ret; +} arch_unregister_cpu(pr-id); acpi_unmap_lsapic(pr-id); -return (0); +return ret; } #else static acpi_status acpi_processor_hotadd_init(struct acpi_processor *pr) -- To unsubscribe from this list: send the line unsubscribe linux-acpi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To 
unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 3/3] acpi : acpi_bus_trim() stops removing devices when failing to remove the device
2012/07/13 1:50, Toshi Kani wrote: On Thu, 2012-07-12 at 20:28 +0900, Yasuaki Ishimatsu wrote: acpi_bus_trim() stops removing devices, when acpi_bus_remove() return error number. But acpi_bus_remove() cannot return error number correctly. acpi_bus_remove() only return -EINVAL, when dev argument is NULL. Thus even if device cannot be removed correctly, acpi_bus_trim() ignores and continues to remove devices. acpi_bus_hot_remove_device() uses acpi_bus_trim() for removing devices. Therefore acpi_bus_hot_remove_device() can send _EJ0 to firmware, even if the device is running on the system. In this case, the system cannot work well. So acpi_bus_trim() should check whether device was removed or not correctly. The patch adds error check into some functions to remove the device. Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com --- drivers/acpi/scan.c| 15 --- drivers/base/dd.c | 22 +- include/linux/device.h |2 +- 3 files changed, 30 insertions(+), 9 deletions(-) Index: linux-3.5-rc6/drivers/acpi/scan.c === --- linux-3.5-rc6.orig/drivers/acpi/scan.c 2012-07-12 20:11:37.316443808 +0900 +++ linux-3.5-rc6/drivers/acpi/scan.c 2012-07-12 20:17:17.927185231 +0900 @@ -425,12 +425,17 @@ static int acpi_device_remove(struct dev { struct acpi_device *acpi_dev = to_acpi_device(dev); struct acpi_driver *acpi_drv = acpi_dev-driver; + int ret; if (acpi_drv) { if (acpi_drv-ops.notify) acpi_device_remove_notify_handler(acpi_dev); - if (acpi_drv-ops.remove) - acpi_drv-ops.remove(acpi_dev, acpi_dev-removal_type); + if (acpi_drv-ops.remove) { + ret = acpi_drv-ops.remove(acpi_dev, + acpi_dev-removal_type); + if (ret) + return ret; + } } acpi_dev-driver = NULL; acpi_dev-driver_data = NULL; @@ -1208,11 +1213,15 @@ static int acpi_device_set_context(struc static int acpi_bus_remove(struct acpi_device *dev, int rmdevice) { + int ret; + if (!dev) return -EINVAL; dev-removal_type = ACPI_BUS_REMOVAL_EJECT; - device_release_driver(dev-dev); + ret = device_release_driver(dev-dev); + if 
(ret) + return ret; if (!rmdevice) return 0; Index: linux-3.5-rc6/drivers/base/dd.c === --- linux-3.5-rc6.orig/drivers/base/dd.c2012-07-12 20:11:37.316443808 +0900 +++ linux-3.5-rc6/drivers/base/dd.c 2012-07-12 20:17:17.928185218 +0900 @@ -464,9 +464,10 @@ EXPORT_SYMBOL_GPL(driver_attach); * __device_release_driver() must be called with @dev lock held. * When called for a USB interface, @dev-parent lock must be held as well. */ -static void __device_release_driver(struct device *dev) +static int __device_release_driver(struct device *dev) { struct device_driver *drv; + int ret; drv = dev-driver; if (drv) { @@ -482,9 +483,11 @@ static void __device_release_driver(stru pm_runtime_put_sync(dev); if (dev-bus dev-bus-remove) - dev-bus-remove(dev); + ret = dev-bus-remove(dev); else if (drv-remove) - drv-remove(dev); + ret = drv-remove(dev); + if (ret) + goto rollback; devres_release_all(dev); dev-driver = NULL; klist_remove(dev-p-knode_driver); @@ -494,6 +497,12 @@ static void __device_release_driver(stru dev); } + + return ret; ret is uninitialized when !drv. Thanks! I'll update it. + +rollback: + driver_sysfs_add(dev); + return ret; } /** @@ -503,16 +512,19 @@ static void __device_release_driver(stru * Manually detach device from driver. * When called for a USB interface, @dev-parent lock must be held. */ -void device_release_driver(struct device *dev) +int device_release_driver(struct device *dev) I agree with this change as driver's remove interface can fail. However, there are other callers to this function, which do not check the return value. I suppose there is no impact to the other paths since you only changed the CPU hotplug path to fail properly, but please confirm this is the case. I recommend documenting this change to the change log. Thank you for your agreement. As you know, there are other callers. I believe the patch does not impact to them, since all of them does not check return value of device_release_driver(). I will write it to the patch. 
Thanks, Yasuaki Ishimatsu Thanks
[PATCH v4 1/3] acpi : cpu hot-remove returns error when cpu_down() fails
Even if cpu_down() fails, acpi_processor_remove() continues to remove the cpu.
But in this case, it should return an error number since some process may
still run on the cpu. If the cpu has a running process and is powered off,
the system may not work well.

Reviewed-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

---
 drivers/acpi/processor_driver.c |   18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

Index: linux-3.5-rc6/drivers/acpi/processor_driver.c
===================================================================
--- linux-3.5-rc6.orig/drivers/acpi/processor_driver.c	2012-07-08 09:23:56.0 +0900
+++ linux-3.5-rc6/drivers/acpi/processor_driver.c	2012-07-13 15:11:06.135541317 +0900
@@ -610,7 +610,7 @@ err_free_pr:
 static int acpi_processor_remove(struct acpi_device *device, int type)
 {
 	struct acpi_processor *pr = NULL;
-
+	int ret;
 
 	if (!device || !acpi_driver_data(device))
 		return -EINVAL;
@@ -621,8 +621,9 @@ static int acpi_processor_remove(struct
 		goto free;
 
 	if (type == ACPI_BUS_REMOVAL_EJECT) {
-		if (acpi_processor_handle_eject(pr))
-			return -EINVAL;
+		ret = acpi_processor_handle_eject(pr);
+		if (ret)
+			return ret;
 	}
 
 	acpi_processor_power_exit(pr, device);
@@ -841,12 +842,17 @@ static acpi_status acpi_processor_hotadd
 
 static int acpi_processor_handle_eject(struct acpi_processor *pr)
 {
-	if (cpu_online(pr->id))
-		cpu_down(pr->id);
+	int ret = 0;
+
+	if (cpu_online(pr->id)) {
+		ret = cpu_down(pr->id);
+		if (ret)
+			return ret;
+	}
 
 	arch_unregister_cpu(pr->id);
 	acpi_unmap_lsapic(pr->id);
-	return (0);
+	return ret;
 }
 #else
 static acpi_status acpi_processor_hotadd_init(struct acpi_processor *pr)
[PATCH v4 2/3] acpi : prevent cpu from becoming online
Even if acpi_processor_handle_eject() offlines the CPU, there is a chance
that the CPU is brought online again afterwards. So the patch closes the
window by using get/put_online_cpus().

Why does the patch change the _cpu_up() logic?

The patch addresses the race between hot-removing a CPU and _cpu_up(). If
the patch did not change _cpu_up(), the following race would remain:

hot-remove cpu                         |  _cpu_up()
---------------------------------------+-----------------------------------
call acpi_processor_handle_eject()     |
     call cpu_down()                   |
     call get_online_cpus()            |
                                       | call cpu_hotplug_begin() and stop here
     call arch_unregister_cpu()        |
     call acpi_unmap_lsapic()          |
     call put_online_cpus()            |
                                       | start and continue _cpu_up()
     return acpi_processor_remove()    |
continue hot-remove the cpu            |

So _cpu_up() can run to completion, and the hot-remove can also run to
completion. With the change to the _cpu_up() logic, the race disappears:

hot-remove cpu                         |  _cpu_up()
---------------------------------------+-----------------------------------
call acpi_processor_handle_eject()     |
     call cpu_down()                   |
     call get_online_cpus()            |
                                       | call cpu_hotplug_begin() and stop here
     call arch_unregister_cpu()        |
     call acpi_unmap_lsapic()          |
          cpu's cpu_present is set     |
          to false by set_cpu_present()|
     call put_online_cpus()            |
                                       | start _cpu_up()
                                       | check cpu_present() and return -EINVAL
     return acpi_processor_remove()    |
continue hot-remove the cpu            |

Reviewed-by: Srivatsa S. Bhat <srivatsa.b...@linux.vnet.ibm.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>

---
 drivers/acpi/processor_driver.c |   14 ++++++++++++++
 kernel/cpu.c                    |    8 +++++---
 2 files changed, 19 insertions(+), 3 deletions(-)

Index: linux-3.5-rc6/drivers/acpi/processor_driver.c
===================================================================
--- linux-3.5-rc6.orig/drivers/acpi/processor_driver.c	2012-07-13 17:31:37.799130100 +0900
+++ linux-3.5-rc6/drivers/acpi/processor_driver.c	2012-07-13 17:39:47.727006338 +0900
@@ -850,8 +850,22 @@ static int acpi_processor_handle_eject(s
 			return ret;
 	}
 
+	get_online_cpus();
+	/*
+	 * The cpu might become online again at this point. So we check whether
+	 * the cpu has been onlined or not. If the cpu became online, it means
+	 * that someone wants to use the cpu. So acpi_processor_handle_eject()
+	 * returns -EAGAIN.
+	 */
+	if (unlikely(cpu_online(pr->id))) {
+		put_online_cpus();
+		pr_warn("Failed to remove CPU %d, ", pr->id);
+		pr_warn("because other task brought the CPU back online\n");
+		return -EAGAIN;
+	}
 	arch_unregister_cpu(pr->id);
 	acpi_unmap_lsapic(pr->id);
+	put_online_cpus();
 	return ret;
 }
 #else
Index: linux-3.5-rc6/kernel/cpu.c
===================================================================
--- linux-3.5-rc6.orig/kernel/cpu.c	2012-07-13 17:31:37.800130087 +0900
+++ linux-3.5-rc6/kernel/cpu.c	2012-07-13 17:31:39.661106874 +0900
@@ -343,11 +343,13 @@ static int __cpuinit _cpu_up(unsigned in
 	unsigned long mod = tasks_frozen ? CPU_TASKS_FROZEN : 0;
 	struct task_struct *idle;
 
-	if (cpu_online(cpu) || !cpu_present(cpu))
-		return -EINVAL;
-
 	cpu_hotplug_begin();
 
+	if (cpu_online(cpu) || !cpu_present(cpu)) {
+		ret = -EINVAL;
+		goto out;
+	}
+
 	idle = idle_thread_get(cpu);
 	if (IS_ERR(idle)) {
 		ret = PTR_ERR(idle);
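The ordering argument above can be tried in user space with a minimal sketch. This is a simplification, not the kernel mechanism: a single pthread mutex stands in for both get/put_online_cpus() and cpu_hotplug_begin(), and `struct fake_cpu`, `fake_cpu_up()` and `fake_eject()` are hypothetical names. The point it illustrates is that both sides re-check the CPU state only after taking the lock, so whichever side loses the race sees the other's update.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; one mutex stands in for the hotplug locking. */
struct fake_cpu {
	pthread_mutex_t hotplug_lock;
	bool online;
	bool present;
};

/* Models the patched _cpu_up(): the state check happens under the lock. */
static int fake_cpu_up(struct fake_cpu *cpu)
{
	int ret = 0;

	assert(cpu != NULL);
	pthread_mutex_lock(&cpu->hotplug_lock);
	if (cpu->online || !cpu->present) {
		ret = -EINVAL;
		goto out;
	}
	cpu->online = true;
out:
	pthread_mutex_unlock(&cpu->hotplug_lock);
	return ret;
}

/* Models the eject path: re-check "online" under the lock, then unregister. */
static int fake_eject(struct fake_cpu *cpu)
{
	int ret = 0;

	assert(cpu != NULL);
	pthread_mutex_lock(&cpu->hotplug_lock);
	if (cpu->online) {
		ret = -EAGAIN;		/* someone brought the CPU back online */
		goto out;
	}
	cpu->present = false;		/* models arch_unregister_cpu() */
out:
	pthread_mutex_unlock(&cpu->hotplug_lock);
	return ret;
}
```

If the presence check in `fake_cpu_up()` were done before taking the lock, as in the unpatched `_cpu_up()`, an eject that completed in between would go unnoticed.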
[PATCH v4 3/3] acpi : acpi_bus_trim() stops removing devices when failing to remove the device
acpi_bus_trim() stops removing devices when acpi_bus_remove() returns an
error. But acpi_bus_remove() cannot report errors correctly: it only
returns -EINVAL when the dev argument is NULL. Thus, even if a device
cannot be removed correctly, acpi_bus_trim() ignores the failure and
continues to remove devices. acpi_bus_hot_remove_device() uses
acpi_bus_trim() to remove devices, so it can send _EJ0 to the firmware even
while the device is still in use on the system. In that case, the system
may not work correctly. So acpi_bus_trim() should check whether each device
was removed correctly.

The patch adds error checks to the functions involved in removing a device.

With this patch, device_release_driver() can return an error value. The
change does not affect callers other than acpi_bus_trim(), since none of
them checks the return value of device_release_driver().

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>

---
 drivers/acpi/scan.c    |   15 ++++++++++++---
 drivers/base/dd.c      |   22 +++++++++++++++++-----
 include/linux/device.h |    2 +-
 3 files changed, 30 insertions(+), 9 deletions(-)

Index: linux-3.5-rc6/drivers/acpi/scan.c
===================================================================
--- linux-3.5-rc6.orig/drivers/acpi/scan.c	2012-07-13 15:10:46.136790418 +0900
+++ linux-3.5-rc6/drivers/acpi/scan.c	2012-07-13 15:12:41.364349387 +0900
@@ -425,12 +425,17 @@ static int acpi_device_remove(struct dev
 {
 	struct acpi_device *acpi_dev = to_acpi_device(dev);
 	struct acpi_driver *acpi_drv = acpi_dev->driver;
+	int ret;
 
 	if (acpi_drv) {
 		if (acpi_drv->ops.notify)
 			acpi_device_remove_notify_handler(acpi_dev);
-		if (acpi_drv->ops.remove)
-			acpi_drv->ops.remove(acpi_dev, acpi_dev->removal_type);
+		if (acpi_drv->ops.remove) {
+			ret = acpi_drv->ops.remove(acpi_dev,
+						   acpi_dev->removal_type);
+			if (ret)
+				return ret;
+		}
 	}
 	acpi_dev->driver = NULL;
 	acpi_dev->driver_data = NULL;
@@ -1208,11 +1213,15 @@ static int acpi_device_set_context(struc
 
 static int acpi_bus_remove(struct acpi_device *dev, int rmdevice)
 {
+	int ret;
+
 	if (!dev)
 		return -EINVAL;
 
 	dev->removal_type = ACPI_BUS_REMOVAL_EJECT;
-	device_release_driver(&dev->dev);
+	ret = device_release_driver(&dev->dev);
+	if (ret)
+		return ret;
 
 	if (!rmdevice)
 		return 0;
Index: linux-3.5-rc6/drivers/base/dd.c
===================================================================
--- linux-3.5-rc6.orig/drivers/base/dd.c	2012-07-13 15:10:46.136790418 +0900
+++ linux-3.5-rc6/drivers/base/dd.c	2012-07-13 15:14:13.895193383 +0900
@@ -464,9 +464,10 @@ EXPORT_SYMBOL_GPL(driver_attach);
  * __device_release_driver() must be called with @dev lock held.
  * When called for a USB interface, @dev->parent lock must be held as well.
  */
-static void __device_release_driver(struct device *dev)
+static int __device_release_driver(struct device *dev)
 {
 	struct device_driver *drv;
+	int ret = 0;
 
 	drv = dev->driver;
 	if (drv) {
@@ -482,9 +483,11 @@ static void __device_release_driver(stru
 		pm_runtime_put_sync(dev);
 
 		if (dev->bus && dev->bus->remove)
-			dev->bus->remove(dev);
+			ret = dev->bus->remove(dev);
 		else if (drv->remove)
-			drv->remove(dev);
+			ret = drv->remove(dev);
+		if (ret)
+			goto rollback;
 		devres_release_all(dev);
 		dev->driver = NULL;
 		klist_remove(&dev->p->knode_driver);
@@ -494,6 +497,12 @@ static void __device_release_driver(stru
 						     dev);
 
 	}
+
+	return ret;
+
+rollback:
+	driver_sysfs_add(dev);
+	return ret;
 }
 
 /**
@@ -503,16 +512,19 @@ static void __device_release_driver(stru
  * Manually detach device from driver.
  * When called for a USB interface, @dev->parent lock must be held.
  */
-void device_release_driver(struct device *dev)
+int device_release_driver(struct device *dev)
 {
+	int ret;
 	/*
 	 * If anyone calls device_release_driver() recursively from
 	 * within their ->remove callback for the same device, they
 	 * will deadlock right here.
 	 */
 	device_lock(dev);
-	__device_release_driver(dev);
+	ret = __device_release_driver(dev);
 	device_unlock(dev);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(device_release_driver);
 
Index: linux-3.5-rc6/include/linux/device.h
===================================================================
--- linux-3.5-rc6.orig/include/linux/device.h
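As a rough illustration of the "stop on first failure and roll back" behaviour described in this patch, here is a small user-space sketch. All names (`struct fake_dev`, `release_driver()`, `trim()`, `remove_ok()`, `remove_busy()`) are hypothetical, and the `sysfs_linked` flag only models the idea that the sysfs link is taken down before the remove callback runs and restored by driver_sysfs_add() on failure, which is my reading of the rollback label rather than something the diff shows in full.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; not struct device. */
struct fake_dev {
	bool bound;         /* a driver is attached */
	bool sysfs_linked;  /* models the driver's sysfs link */
	int (*remove)(struct fake_dev *dev);
};

static int remove_ok(struct fake_dev *dev)   { (void)dev; return 0; }
static int remove_busy(struct fake_dev *dev) { (void)dev; return -EBUSY; }

/* Models the patched release path: report failure and undo partial work. */
static int release_driver(struct fake_dev *dev)
{
	int ret = 0;

	assert(dev != NULL);
	if (!dev->bound)
		return 0;

	dev->sysfs_linked = false;
	if (dev->remove)
		ret = dev->remove(dev);
	if (ret) {
		dev->sysfs_linked = true;	/* rollback: restore the link */
		return ret;
	}

	dev->bound = false;
	return 0;
}

/* Models a trim loop that stops at the first device that cannot go away. */
static int trim(struct fake_dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int ret = release_driver(&devs[i]);

		if (ret)
			return ret;
	}
	return 0;
}
```

With three devices where the second one is busy, `trim()` returns -EBUSY after releasing only the first, so a caller would not go on to eject hardware that is still in use.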
Re: [RFC PATCH v3 4/13] memory-hotplug : remove /sys/firmware/memmap/X sysfs
Hi Wen,

2012/07/13 18:10, Wen Congyang wrote:
> At 07/09/2012 06:26 PM, Yasuaki Ishimatsu Wrote:
>> When (hot)adding memory into system, /sys/firmware/memmap/X/{end, start, type}
>> sysfs files are created. But there is no code to remove these files. The patch
>> implements the function to remove them.
>>
>> Note : The code does not free firmware_map_entry since there is no way to free
>>        memory which is allocated by bootmem.
>>
>> CC: David Rientjes <rient...@google.com>
>> CC: Jiang Liu <liu...@gmail.com>
>> CC: Len Brown <len.br...@intel.com>
>> CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
>> CC: Paul Mackerras <pau...@samba.org>
>> CC: Christoph Lameter <c...@linux.com>
>> Cc: Minchan Kim <minchan@gmail.com>
>> CC: Andrew Morton <a...@linux-foundation.org>
>> CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
>> CC: Wen Congyang <we...@cn.fujitsu.com>
>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
>>
>> ---
>>  drivers/firmware/memmap.c    |   78 ++++++++++++++++++++++++++++++++++++++-
>>  include/linux/firmware-map.h |    6 +++
>>  mm/memory_hotplug.c          |    6 ++-
>>  3 files changed, 88 insertions(+), 2 deletions(-)
>>
>> Index: linux-3.5-rc6/mm/memory_hotplug.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/mm/memory_hotplug.c	2012-07-09 18:23:13.323844923 +0900
>> +++ linux-3.5-rc6/mm/memory_hotplug.c	2012-07-09 18:23:19.522767424 +0900
>> @@ -661,7 +661,11 @@ EXPORT_SYMBOL_GPL(add_memory);
>>
>>  int remove_memory(int nid, u64 start, u64 size)
>>  {
>> -	return -EBUSY;
>> +	lock_memory_hotplug();
>> +	/* remove memmap entry */
>> +	firmware_map_remove(start, start + size - 1, "System RAM");
>
> firmware_map_remove() is in meminit section, so remove_memory() should be in
> ref section.

I'll add it.

Thanks,
Yasuaki Ishimatsu

> Thanks
> Wen Congyang
>
>> +	unlock_memory_hotplug();
>> +	return 0;
>>
>>  }
>>  EXPORT_SYMBOL_GPL(remove_memory);
>> Index: linux-3.5-rc6/include/linux/firmware-map.h
>> ===================================================================
>> --- linux-3.5-rc6.orig/include/linux/firmware-map.h	2012-07-09 18:23:09.532892314 +0900
>> +++ linux-3.5-rc6/include/linux/firmware-map.h	2012-07-09 18:23:19.523767412 +0900
>> @@ -25,6 +25,7 @@
>>
>>  int firmware_map_add_early(u64 start, u64 end, const char *type);
>>  int firmware_map_add_hotplug(u64 start, u64 end, const char *type);
>> +int firmware_map_remove(u64 start, u64 end, const char *type);
>>
>>  #else /* CONFIG_FIRMWARE_MEMMAP */
>>
>> @@ -38,6 +39,11 @@ static inline int firmware_map_add_hotpl
>>  	return 0;
>>  }
>>
>> +static inline int firmware_map_remove(u64 start, u64 end, const char *type)
>> +{
>> +	return 0;
>> +}
>> +
>>  #endif /* CONFIG_FIRMWARE_MEMMAP */
>>
>>  #endif /* _LINUX_FIRMWARE_MAP_H */
>> Index: linux-3.5-rc6/drivers/firmware/memmap.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/drivers/firmware/memmap.c	2012-07-09 18:23:09.532892314 +0900
>> +++ linux-3.5-rc6/drivers/firmware/memmap.c	2012-07-09 18:25:46.371931554 +0900
>> @@ -21,6 +21,7 @@
>>  #include <linux/types.h>
>>  #include <linux/bootmem.h>
>>  #include <linux/slab.h>
>> +#include <linux/mm.h>
>>
>>  /*
>>   * Data types ------------------------------------------------------------------
>> @@ -79,7 +80,22 @@ static const struct sysfs_ops memmap_att
>>  	.show = memmap_attr_show,
>>  };
>>
>> +#define to_memmap_entry(obj) container_of(obj, struct firmware_map_entry, kobj)
>> +
>> +static void release_firmware_map_entry(struct kobject *kobj)
>> +{
>> +	struct firmware_map_entry *entry = to_memmap_entry(kobj);
>> +	struct page *head_page;
>> +
>> +	head_page = virt_to_head_page(entry);
>> +	if (PageSlab(head_page))
>> +		kfree(entry);
>> +
>> +	/* There is no way to free memory allocated from bootmem*/
>> +}
>> +
>>  static struct kobj_type memmap_ktype = {
>> +	.release	= release_firmware_map_entry,
>>  	.sysfs_ops	= &memmap_attr_ops,
>>  	.default_attrs	= def_attrs,
>>  };
>> @@ -123,6 +139,16 @@ static int firmware_map_add_entry(u64 st
>>  	return 0;
>>  }
>>
>> +/**
>> + * firmware_map_remove_entry() - Does the real work to remove a firmware
>> + * memmap entry.
>> + * @entry: removed entry.
>> + **/
>> +static inline void firmware_map_remove_entry(struct firmware_map_entry *entry)
>> +{
>> +	list_del(&entry->list);
>> +}
>> +
>>  /*
>>   * Add memmap entry on sysfs
>>   */
>> @@ -144,6 +170,31 @@ static int add_sysfs_fw_map_entry(struct
>>  	return 0;
>>  }
>>
>> +/*
>> + * Remove memmap entry on sysfs
>> + */
>> +static inline void remove_sysfs_fw_map_entry(struct firmware_map_entry *entry)
>> +{
>> +	kobject_put(&entry->kobj);
>> +}
>> +
>> +/*
>> + * Search memmap entry
>> + */
>> +
>> +struct firmware_map_entry * __meminit
>> +find_firmware_map_entry(u64 start, u64 end, const char *type)
>> +{
>> +	struct firmware_map_entry *entry;
>> +
>> +	list_for_each_entry(entry, &map_entries, list)
>> +		if ((entry->start == start) && (entry->end == end) &&
>> +		    (!strcmp(entry->type, type)))
>> +			return
Re: [RFC PATCH v3 4/13] memory-hotplug : remove /sys/firmware/memmap/X sysfs
Hi Wen,

2012/07/16 11:32, Wen Congyang wrote:
> At 07/09/2012 06:26 PM, Yasuaki Ishimatsu Wrote:
>> When (hot)adding memory into system, /sys/firmware/memmap/X/{end, start, type}
>> sysfs files are created. But there is no code to remove these files. The patch
>> implements the function to remove them.
>>
>> Note : The code does not free firmware_map_entry since there is no way to free
>>        memory which is allocated by bootmem.
>>
>> CC: David Rientjes <rient...@google.com>
>> CC: Jiang Liu <liu...@gmail.com>
>> CC: Len Brown <len.br...@intel.com>
>> CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
>> CC: Paul Mackerras <pau...@samba.org>
>> CC: Christoph Lameter <c...@linux.com>
>> Cc: Minchan Kim <minchan@gmail.com>
>> CC: Andrew Morton <a...@linux-foundation.org>
>> CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
>> CC: Wen Congyang <we...@cn.fujitsu.com>
>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
>>
>> ---
>>  drivers/firmware/memmap.c    |   78 ++++++++++++++++++++++++++++++++++++++-
>>  include/linux/firmware-map.h |    6 +++
>>  mm/memory_hotplug.c          |    6 ++-
>>  3 files changed, 88 insertions(+), 2 deletions(-)
>>
>> Index: linux-3.5-rc6/mm/memory_hotplug.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/mm/memory_hotplug.c	2012-07-09 18:23:13.323844923 +0900
>> +++ linux-3.5-rc6/mm/memory_hotplug.c	2012-07-09 18:23:19.522767424 +0900
>> @@ -661,7 +661,11 @@ EXPORT_SYMBOL_GPL(add_memory);
>>
>>  int remove_memory(int nid, u64 start, u64 size)
>>  {
>> -	return -EBUSY;
>> +	lock_memory_hotplug();
>> +	/* remove memmap entry */
>> +	firmware_map_remove(start, start + size - 1, "System RAM");
>> +	unlock_memory_hotplug();
>> +	return 0;
>>
>>  }
>>  EXPORT_SYMBOL_GPL(remove_memory);
>> Index: linux-3.5-rc6/include/linux/firmware-map.h
>> ===================================================================
>> --- linux-3.5-rc6.orig/include/linux/firmware-map.h	2012-07-09 18:23:09.532892314 +0900
>> +++ linux-3.5-rc6/include/linux/firmware-map.h	2012-07-09 18:23:19.523767412 +0900
>> @@ -25,6 +25,7 @@
>>
>>  int firmware_map_add_early(u64 start, u64 end, const char *type);
>>  int firmware_map_add_hotplug(u64 start, u64 end, const char *type);
>> +int firmware_map_remove(u64 start, u64 end, const char *type);
>>
>>  #else /* CONFIG_FIRMWARE_MEMMAP */
>>
>> @@ -38,6 +39,11 @@ static inline int firmware_map_add_hotpl
>>  	return 0;
>>  }
>>
>> +static inline int firmware_map_remove(u64 start, u64 end, const char *type)
>> +{
>> +	return 0;
>> +}
>> +
>>  #endif /* CONFIG_FIRMWARE_MEMMAP */
>>
>>  #endif /* _LINUX_FIRMWARE_MAP_H */
>> Index: linux-3.5-rc6/drivers/firmware/memmap.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/drivers/firmware/memmap.c	2012-07-09 18:23:09.532892314 +0900
>> +++ linux-3.5-rc6/drivers/firmware/memmap.c	2012-07-09 18:25:46.371931554 +0900
>> @@ -21,6 +21,7 @@
>>  #include <linux/types.h>
>>  #include <linux/bootmem.h>
>>  #include <linux/slab.h>
>> +#include <linux/mm.h>
>>
>>  /*
>>   * Data types ------------------------------------------------------------------
>> @@ -79,7 +80,22 @@ static const struct sysfs_ops memmap_att
>>  	.show = memmap_attr_show,
>>  };
>>
>> +#define to_memmap_entry(obj) container_of(obj, struct firmware_map_entry, kobj)
>> +
>> +static void release_firmware_map_entry(struct kobject *kobj)
>> +{
>> +	struct firmware_map_entry *entry = to_memmap_entry(kobj);
>> +	struct page *head_page;
>> +
>> +	head_page = virt_to_head_page(entry);
>> +	if (PageSlab(head_page))
>> +		kfree(entry);
>> +
>> +	/* There is no way to free memory allocated from bootmem*/
>> +}
>> +
>>  static struct kobj_type memmap_ktype = {
>> +	.release	= release_firmware_map_entry,
>>  	.sysfs_ops	= &memmap_attr_ops,
>>  	.default_attrs	= def_attrs,
>>  };
>> @@ -123,6 +139,16 @@ static int firmware_map_add_entry(u64 st
>>  	return 0;
>>  }
>>
>> +/**
>> + * firmware_map_remove_entry() - Does the real work to remove a firmware
>> + * memmap entry.
>> + * @entry: removed entry.
>> + **/
>> +static inline void firmware_map_remove_entry(struct firmware_map_entry *entry)
>> +{
>> +	list_del(&entry->list);
>> +}
>> +
>>  /*
>>   * Add memmap entry on sysfs
>>   */
>> @@ -144,6 +170,31 @@ static int add_sysfs_fw_map_entry(struct
>>  	return 0;
>>  }
>>
>> +/*
>> + * Remove memmap entry on sysfs
>> + */
>> +static inline void remove_sysfs_fw_map_entry(struct firmware_map_entry *entry)
>> +{
>> +	kobject_put(&entry->kobj);
>> +}
>> +
>> +/*
>> + * Search memmap entry
>> + */
>> +
>> +struct firmware_map_entry * __meminit
>> +find_firmware_map_entry(u64 start, u64 end, const char *type)
>> +{
>> +	struct firmware_map_entry *entry;
>> +
>> +	list_for_each_entry(entry, &map_entries, list)
>> +		if ((entry->start == start) && (entry->end == end) &&
>> +		    (!strcmp(entry->type, type)))
>> +			return entry;
>> +
>> +	return NULL;
>> +}
>> +
>>  /**
>>   * firmware_map_add_hotplug() - Adds a firmware mapping entry when we do
>>   * memory hotplug.
>> @@ -196,6 +247,32
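The release callback in the quoted patch frees an entry only when it came from the slab allocator and leaves bootmem-allocated entries alone. A rough user-space analogue of that idea is sketched below. Everything here is hypothetical: `struct fake_entry`, `entry_put()` and `entry_new_heap()` are made-up names, the `from_heap` flag stands in for the PageSlab() test, and the plain integer reference count is only a loose model of a kobject reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted entry; loosely models a kobject owner. */
struct fake_entry {
	int refcount;
	bool from_heap;	/* stands in for PageSlab(virt_to_head_page(entry)) */
	void (*release)(struct fake_entry *entry);
};

static int freed_count;	/* counts heap entries that were actually freed */

/* Release callback: only heap-allocated entries can be given back. */
static void entry_release(struct fake_entry *entry)
{
	if (entry->from_heap) {
		free(entry);
		freed_count++;
	}
	/* Boot-time entries are intentionally left in place. */
}

/* Drop one reference; run the release callback on the last one. */
static void entry_put(struct fake_entry *entry)
{
	assert(entry != NULL && entry->refcount > 0);
	if (--entry->refcount == 0 && entry->release)
		entry->release(entry);
}

/* Allocate a heap-backed entry holding one reference. */
static struct fake_entry *entry_new_heap(void)
{
	struct fake_entry *entry = malloc(sizeof(*entry));

	if (!entry)
		return NULL;
	entry->refcount = 1;
	entry->from_heap = true;
	entry->release = entry_release;
	return entry;
}
```

Dropping the last reference on a heap entry frees it, while doing the same on a statically allocated one leaves the storage untouched, which mirrors the "no way to free memory allocated from bootmem" note in the patch.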
Re: [RFC PATCH v3 2/13] memory-hotplug : add physical memory hotplug code to acpi_memory_device_remove
Hi Wen,

2012/07/13 12:26, Wen Congyang wrote:
> At 07/09/2012 06:24 PM, Yasuaki Ishimatsu Wrote:
>> acpi_memory_device_remove() has been prepared to remove physical memory.
>> But, the function only frees acpi_memory_device currently.
>>
>> The patch adds following functions into acpi_memory_device_remove():
>>   - offline memory
>>   - remove physical memory (only return -EBUSY)
>>   - free acpi_memory_device
>>
>> CC: David Rientjes <rient...@google.com>
>> CC: Jiang Liu <liu...@gmail.com>
>> CC: Len Brown <len.br...@intel.com>
>> CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
>> CC: Paul Mackerras <pau...@samba.org>
>> CC: Christoph Lameter <c...@linux.com>
>> Cc: Minchan Kim <minchan@gmail.com>
>> CC: Andrew Morton <a...@linux-foundation.org>
>> CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
>> CC: Wen Congyang <we...@cn.fujitsu.com>
>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
>>
>> ---
>>  drivers/acpi/acpi_memhotplug.c |   26 +++++++++++++++++++++++++-
>>  drivers/base/memory.c          |   39 +++++++++++++++++++++++++++++++++++++++
>>  include/linux/memory.h         |    5 +++++
>>  include/linux/memory_hotplug.h |    1 +
>>  mm/memory_hotplug.c            |    8 ++++++++
>>  5 files changed, 78 insertions(+), 1 deletion(-)
>>
>> Index: linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/drivers/acpi/acpi_memhotplug.c	2012-07-09 18:08:29.946888653 +0900
>> +++ linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c	2012-07-09 18:08:43.470719531 +0900
>> @@ -29,6 +29,7 @@
>>  #include <linux/module.h>
>>  #include <linux/init.h>
>>  #include <linux/types.h>
>> +#include <linux/memory.h>
>>  #include <linux/memory_hotplug.h>
>>  #include <linux/slab.h>
>>  #include <acpi/acpi_drivers.h>
>> @@ -452,12 +453,35 @@ static int acpi_memory_device_add(struct
>>  static int acpi_memory_device_remove(struct acpi_device *device, int type)
>>  {
>>  	struct acpi_memory_device *mem_device = NULL;
>> -
>> +	struct acpi_memory_info *info, *tmp;
>> +	int result;
>> +	int node;
>>
>>  	if (!device || !acpi_driver_data(device))
>>  		return -EINVAL;
>>
>>  	mem_device = acpi_driver_data(device);
>> +
>> +	node = acpi_get_node(mem_device->device->handle);
>> +
>> +	list_for_each_entry_safe(info, tmp, &mem_device->res_list, list) {
>> +		if (!info->enabled)
>> +			continue;
>> +
>> +		if (!is_memblk_offline(info->start_addr, info->length)) {
>> +			result = offline_memory(info->start_addr, info->length);
>> +			if (result)
>> +				return result;
>> +		}
>> +
>> +		result = remove_memory(node, info->start_addr, info->length);
>> +		if (result)
>> +			return result;
>> +
>> +		list_del(&info->list);
>> +		kfree(info);
>> +	}
>> +
>>  	kfree(mem_device);
>>
>>  	return 0;
>> Index: linux-3.5-rc6/include/linux/memory_hotplug.h
>> ===================================================================
>> --- linux-3.5-rc6.orig/include/linux/memory_hotplug.h	2012-07-09 18:08:29.955888542 +0900
>> +++ linux-3.5-rc6/include/linux/memory_hotplug.h	2012-07-09 18:08:43.471719518 +0900
>> @@ -233,6 +233,7 @@ static inline int is_mem_section_removab
>>  extern int mem_online_node(int nid);
>>  extern int add_memory(int nid, u64 start, u64 size);
>>  extern int arch_add_memory(int nid, u64 start, u64 size);
>> +extern int remove_memory(int nid, u64 start, u64 size);
>
> Here should be:
> #ifdef CONFIG_MEMORY_HOTREMOVE
> extern int remove_memory(int nid, u64 start, u64 size);
> #else
> static int inline remove_memory(int nid, u64 start, u64 size)
> {
> 	return -EBUSY;
> }
> #endif

O.K. I'll update it.

Thanks,
Yasuaki Ishimatsu

>>  extern int offline_memory(u64 start, u64 size);
>>  extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
>>  								int nr_pages);
>> Index: linux-3.5-rc6/mm/memory_hotplug.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/mm/memory_hotplug.c	2012-07-09 18:08:29.953888567 +0900
>> +++ linux-3.5-rc6/mm/memory_hotplug.c	2012-07-09 18:08:43.476719455 +0900
>> @@ -659,6 +659,14 @@ out:
>>  }
>>  EXPORT_SYMBOL_GPL(add_memory);
>>
>> +int remove_memory(int nid, u64 start, u64 size)
>> +{
>> +	return -EBUSY;
>> +
>> +}
>> +EXPORT_SYMBOL_GPL(remove_memory);
>
> We only need to implement this function when CONFIG_MEMORY_HOTREMOVE is
> defined here.
>
> Thanks
> Wen Congyang
>
>> +
>> +
>>  #ifdef CONFIG_MEMORY_HOTREMOVE
>>  /*
>>   * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
>> Index: linux-3.5-rc6/drivers/base/memory.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/drivers/base/memory.c	2012-07-09 18:08:29.947888640 +0900
>> +++ linux-3.5-rc6/drivers/base/memory.c	2012-07-09 18:10:54.880076739 +0900
>> @@ -70,6 +70,45 @@ void unregister_memory_isolate_notifier
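The header layout Wen suggests (real prototype when the option is enabled, inline stub otherwise) can be sketched as a stand-alone C file. This is only an illustration under assumptions: `uint64_t` replaces the kernel's `u64`, `try_hot_remove()` is a hypothetical caller added here, and the file is meant to be compiled without `CONFIG_MEMORY_HOTREMOVE` defined so that the stub is used.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#ifdef CONFIG_MEMORY_HOTREMOVE
/* With the option enabled, the real implementation lives elsewhere. */
int remove_memory(int nid, uint64_t start, uint64_t size);
#else
/* Without it, callers still compile and simply see "not possible". */
static inline int remove_memory(int nid, uint64_t start, uint64_t size)
{
	(void)nid;
	(void)start;
	(void)size;
	return -EBUSY;
}
#endif

/* Hypothetical caller: needs no #ifdef of its own. */
static int try_hot_remove(int nid, uint64_t start, uint64_t size)
{
	assert(size > 0);
	return remove_memory(nid, start, size);
}
```

The benefit of this layout is that callers such as acpi_memory_device_remove() need no conditional compilation; they just get -EBUSY when hot-remove support is not built in.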
Re: [RFC PATCH v3 2/13] memory-hotplug : add physical memory hotplug code to acpi_memory_device_remove
Hi Wen,

2012/07/13 19:40, Wen Congyang wrote:
> At 07/09/2012 06:24 PM, Yasuaki Ishimatsu Wrote:
>> acpi_memory_device_remove() has been prepared to remove physical memory.
>> But, the function only frees acpi_memory_device currently.
>>
>> The patch adds following functions into acpi_memory_device_remove():
>>   - offline memory
>>   - remove physical memory (only return -EBUSY)
>>   - free acpi_memory_device
>>
>> CC: David Rientjes <rient...@google.com>
>> CC: Jiang Liu <liu...@gmail.com>
>> CC: Len Brown <len.br...@intel.com>
>> CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
>> CC: Paul Mackerras <pau...@samba.org>
>> CC: Christoph Lameter <c...@linux.com>
>> Cc: Minchan Kim <minchan@gmail.com>
>> CC: Andrew Morton <a...@linux-foundation.org>
>> CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
>> CC: Wen Congyang <we...@cn.fujitsu.com>
>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
>>
>> ---
>>  drivers/acpi/acpi_memhotplug.c |   26 +++++++++++++++++++++++++-
>>  drivers/base/memory.c          |   39 +++++++++++++++++++++++++++++++++++++++
>>  include/linux/memory.h         |    5 +++++
>>  include/linux/memory_hotplug.h |    1 +
>>  mm/memory_hotplug.c            |    8 ++++++++
>>  5 files changed, 78 insertions(+), 1 deletion(-)
>>
>> Index: linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/drivers/acpi/acpi_memhotplug.c	2012-07-09 18:08:29.946888653 +0900
>> +++ linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c	2012-07-09 18:08:43.470719531 +0900
>> @@ -29,6 +29,7 @@
>>  #include <linux/module.h>
>>  #include <linux/init.h>
>>  #include <linux/types.h>
>> +#include <linux/memory.h>
>>  #include <linux/memory_hotplug.h>
>>  #include <linux/slab.h>
>>  #include <acpi/acpi_drivers.h>
>> @@ -452,12 +453,35 @@ static int acpi_memory_device_add(struct
>>  static int acpi_memory_device_remove(struct acpi_device *device, int type)
>>  {
>>  	struct acpi_memory_device *mem_device = NULL;
>> -
>> +	struct acpi_memory_info *info, *tmp;
>> +	int result;
>> +	int node;
>>
>>  	if (!device || !acpi_driver_data(device))
>>  		return -EINVAL;
>>
>>  	mem_device = acpi_driver_data(device);
>> +
>> +	node = acpi_get_node(mem_device->device->handle);
>
> acpi_get_node() may return -1, and you should call memory_add_physaddr_to_nid()
> to get the node id.

O.K. I'll update it.

Thanks,
Yasuaki Ishimatsu

> Thanks
> Wen Congyang
>
>> +
>> +	list_for_each_entry_safe(info, tmp, &mem_device->res_list, list) {
>> +		if (!info->enabled)
>> +			continue;
>> +
>> +		if (!is_memblk_offline(info->start_addr, info->length)) {
>> +			result = offline_memory(info->start_addr, info->length);
>> +			if (result)
>> +				return result;
>> +		}
>> +
>> +		result = remove_memory(node, info->start_addr, info->length);
>> +		if (result)
>> +			return result;
>> +
>> +		list_del(&info->list);
>> +		kfree(info);
>> +	}
>> +
>>  	kfree(mem_device);
>>
>>  	return 0;
>> Index: linux-3.5-rc6/include/linux/memory_hotplug.h
>> ===================================================================
>> --- linux-3.5-rc6.orig/include/linux/memory_hotplug.h	2012-07-09 18:08:29.955888542 +0900
>> +++ linux-3.5-rc6/include/linux/memory_hotplug.h	2012-07-09 18:08:43.471719518 +0900
>> @@ -233,6 +233,7 @@ static inline int is_mem_section_removab
>>  extern int mem_online_node(int nid);
>>  extern int add_memory(int nid, u64 start, u64 size);
>>  extern int arch_add_memory(int nid, u64 start, u64 size);
>> +extern int remove_memory(int nid, u64 start, u64 size);
>>  extern int offline_memory(u64 start, u64 size);
>>  extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
>>  								int nr_pages);
>> Index: linux-3.5-rc6/mm/memory_hotplug.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/mm/memory_hotplug.c	2012-07-09 18:08:29.953888567 +0900
>> +++ linux-3.5-rc6/mm/memory_hotplug.c	2012-07-09 18:08:43.476719455 +0900
>> @@ -659,6 +659,14 @@ out:
>>  }
>>  EXPORT_SYMBOL_GPL(add_memory);
>>
>> +int remove_memory(int nid, u64 start, u64 size)
>> +{
>> +	return -EBUSY;
>> +
>> +}
>> +EXPORT_SYMBOL_GPL(remove_memory);
>> +
>> +
>>  #ifdef CONFIG_MEMORY_HOTREMOVE
>>  /*
>>   * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
>> Index: linux-3.5-rc6/drivers/base/memory.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/drivers/base/memory.c	2012-07-09 18:08:29.947888640 +0900
>> +++ linux-3.5-rc6/drivers/base/memory.c	2012-07-09 18:10:54.880076739 +0900
>> @@ -70,6 +70,45 @@ void unregister_memory_isolate_notifier(
>>  }
>>  EXPORT_SYMBOL(unregister_memory_isolate_notifier);
>>
>> +bool is_memblk_offline(unsigned long start, unsigned long size)
>> +{
>> +	struct memory_block *mem = NULL;
>> +	struct mem_section *section;
>> +	unsigned
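Wen's point about acpi_get_node() returning -1 amounts to a simple fallback: if the firmware handle does not yield a node, derive one from the physical address. A hedged user-space sketch follows; `fake_acpi_get_node()` and `fake_physaddr_to_nid()` are hypothetical stand-ins for acpi_get_node() and memory_add_physaddr_to_nid(), and the address-to-node layout (everything at or above 4 GiB on node 1) is invented purely for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: models acpi_get_node(); a negative value means "unknown". */
static int fake_acpi_get_node(int handle_node)
{
	return handle_node;
}

/* Hypothetical: models memory_add_physaddr_to_nid() with a made-up layout. */
static int fake_physaddr_to_nid(uint64_t addr)
{
	return addr >= 0x100000000ULL ? 1 : 0;
}

/* Prefer the node reported for the handle; fall back to the address. */
static int resolve_node(int handle_node, uint64_t start_addr)
{
	int node = fake_acpi_get_node(handle_node);

	if (node < 0)
		node = fake_physaddr_to_nid(start_addr);

	assert(node >= 0);
	return node;
}
```

In the patch under review the fallback would presumably be applied before the loop that calls remove_memory(), so that a negative node id is never passed down.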
Re: [RFC PATCH v3 2/13] memory-hotplug : add physical memory hotplug code to acpi_memory_device_remove
Hi Wen,

2012/07/13 12:35, Wen Congyang wrote:
> At 07/09/2012 06:24 PM, Yasuaki Ishimatsu Wrote:
>> acpi_memory_device_remove() has been prepared to remove physical memory.
>> But, the function only frees acpi_memory_device currently.
>>
>> The patch adds following functions into acpi_memory_device_remove():
>>   - offline memory
>>   - remove physical memory (only return -EBUSY)
>>   - free acpi_memory_device
>>
>> CC: David Rientjes <rient...@google.com>
>> CC: Jiang Liu <liu...@gmail.com>
>> CC: Len Brown <len.br...@intel.com>
>> CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
>> CC: Paul Mackerras <pau...@samba.org>
>> CC: Christoph Lameter <c...@linux.com>
>> Cc: Minchan Kim <minchan@gmail.com>
>> CC: Andrew Morton <a...@linux-foundation.org>
>> CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
>> CC: Wen Congyang <we...@cn.fujitsu.com>
>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
>>
>> ---
>>  drivers/acpi/acpi_memhotplug.c |   26 +++++++++++++++++++++++++-
>>  drivers/base/memory.c          |   39 +++++++++++++++++++++++++++++++++++++++
>>  include/linux/memory.h         |    5 +++++
>>  include/linux/memory_hotplug.h |    1 +
>>  mm/memory_hotplug.c            |    8 ++++++++
>>  5 files changed, 78 insertions(+), 1 deletion(-)
>>
>> Index: linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/drivers/acpi/acpi_memhotplug.c	2012-07-09 18:08:29.946888653 +0900
>> +++ linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c	2012-07-09 18:08:43.470719531 +0900
>> @@ -29,6 +29,7 @@
>>  #include <linux/module.h>
>>  #include <linux/init.h>
>>  #include <linux/types.h>
>> +#include <linux/memory.h>
>>  #include <linux/memory_hotplug.h>
>>  #include <linux/slab.h>
>>  #include <acpi/acpi_drivers.h>
>> @@ -452,12 +453,35 @@ static int acpi_memory_device_add(struct
>>  static int acpi_memory_device_remove(struct acpi_device *device, int type)
>>  {
>>  	struct acpi_memory_device *mem_device = NULL;
>> -
>> +	struct acpi_memory_info *info, *tmp;
>> +	int result;
>> +	int node;
>>
>>  	if (!device || !acpi_driver_data(device))
>>  		return -EINVAL;
>>
>>  	mem_device = acpi_driver_data(device);
>> +
>> +	node = acpi_get_node(mem_device->device->handle);
>> +
>> +	list_for_each_entry_safe(info, tmp, &mem_device->res_list, list) {
>> +		if (!info->enabled)
>> +			continue;
>> +
>> +		if (!is_memblk_offline(info->start_addr, info->length)) {
>> +			result = offline_memory(info->start_addr, info->length);
>> +			if (result)
>> +				return result;
>> +		}
>> +
>> +		result = remove_memory(node, info->start_addr, info->length);
>
> The user may online the memory between offline_memory() and remove_memory().
> So I think we should lock memory hotplug before checking the memory's status
> and release it after remove_memory().

How about taking mem_block->state_mutex of the memory being removed? When
offlining memory, we need to change memory_block->state to MEM_OFFLINE, and
we take mem_block->state_mutex to do so. So I think using that mutex would
work well.

Thanks,
Yasuaki Ishimatsu

> Thanks
> Wen Congyang
>
>> +		if (result)
>> +			return result;
>> +
>> +		list_del(&info->list);
>> +		kfree(info);
>> +	}
>> +
>>  	kfree(mem_device);
>>
>>  	return 0;
>> Index: linux-3.5-rc6/include/linux/memory_hotplug.h
>> ===================================================================
>> --- linux-3.5-rc6.orig/include/linux/memory_hotplug.h	2012-07-09 18:08:29.955888542 +0900
>> +++ linux-3.5-rc6/include/linux/memory_hotplug.h	2012-07-09 18:08:43.471719518 +0900
>> @@ -233,6 +233,7 @@ static inline int is_mem_section_removab
>>  extern int mem_online_node(int nid);
>>  extern int add_memory(int nid, u64 start, u64 size);
>>  extern int arch_add_memory(int nid, u64 start, u64 size);
>> +extern int remove_memory(int nid, u64 start, u64 size);
>>  extern int offline_memory(u64 start, u64 size);
>>  extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
>>  								int nr_pages);
>> Index: linux-3.5-rc6/mm/memory_hotplug.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/mm/memory_hotplug.c	2012-07-09 18:08:29.953888567 +0900
>> +++ linux-3.5-rc6/mm/memory_hotplug.c	2012-07-09 18:08:43.476719455 +0900
>> @@ -659,6 +659,14 @@ out:
>>  }
>>  EXPORT_SYMBOL_GPL(add_memory);
>>
>> +int remove_memory(int nid, u64 start, u64 size)
>> +{
>> +	return -EBUSY;
>> +
>> +}
>> +EXPORT_SYMBOL_GPL(remove_memory);
>> +
>> +
>>  #ifdef CONFIG_MEMORY_HOTREMOVE
>>  /*
>>   * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
>> Index: linux-3.5-rc6/drivers/base/memory.c
>> ===================================================================
>> --- linux-3.5-rc6.orig/drivers/base/memory.c	2012-07-09 18:08:29.947888640 +0900
>> +++ linux-3.5-rc6/drivers/base/memory.c	2012-07-09 18:10:54.880076739
Re: [RFC PATCH v3 2/13] memory-hotplug : add physical memory hotplug code to acpi_memory_device_remove
Hi Wen,

2012/07/17 10:44, Yasuaki Ishimatsu wrote:
> Hi Wen,
>
> 2012/07/13 12:35, Wen Congyang wrote:
>> At 07/09/2012 06:24 PM, Yasuaki Ishimatsu Wrote:
>>> acpi_memory_device_remove() has been prepared to remove physical memory.
>>> But, the function only frees acpi_memory_device currently.
>>>
>>> The patch adds following functions into acpi_memory_device_remove():
>>>   - offline memory
>>>   - remove physical memory (only return -EBUSY)
>>>   - free acpi_memory_device
>>>
>>> CC: David Rientjes <rient...@google.com>
>>> CC: Jiang Liu <liu...@gmail.com>
>>> CC: Len Brown <len.br...@intel.com>
>>> CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
>>> CC: Paul Mackerras <pau...@samba.org>
>>> CC: Christoph Lameter <c...@linux.com>
>>> Cc: Minchan Kim <minchan@gmail.com>
>>> CC: Andrew Morton <a...@linux-foundation.org>
>>> CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
>>> CC: Wen Congyang <we...@cn.fujitsu.com>
>>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>
>>>
>>> ---
>>>  drivers/acpi/acpi_memhotplug.c |   26 +++++++++++++++++++++++++-
>>>  drivers/base/memory.c          |   39 +++++++++++++++++++++++++++++++++++++++
>>>  include/linux/memory.h         |    5 +++++
>>>  include/linux/memory_hotplug.h |    1 +
>>>  mm/memory_hotplug.c            |    8 ++++++++
>>>  5 files changed, 78 insertions(+), 1 deletion(-)
>>>
>>> Index: linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c
>>> ===================================================================
>>> --- linux-3.5-rc6.orig/drivers/acpi/acpi_memhotplug.c	2012-07-09 18:08:29.946888653 +0900
>>> +++ linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c	2012-07-09 18:08:43.470719531 +0900
>>> @@ -29,6 +29,7 @@
>>>  #include <linux/module.h>
>>>  #include <linux/init.h>
>>>  #include <linux/types.h>
>>> +#include <linux/memory.h>
>>>  #include <linux/memory_hotplug.h>
>>>  #include <linux/slab.h>
>>>  #include <acpi/acpi_drivers.h>
>>> @@ -452,12 +453,35 @@ static int acpi_memory_device_add(struct
>>>  static int acpi_memory_device_remove(struct acpi_device *device, int type)
>>>  {
>>>  	struct acpi_memory_device *mem_device = NULL;
>>> -
>>> +	struct acpi_memory_info *info, *tmp;
>>> +	int result;
>>> +	int node;
>>>
>>>  	if (!device || !acpi_driver_data(device))
>>>  		return -EINVAL;
>>>
>>>  	mem_device = acpi_driver_data(device);
>>> +
>>> +	node = acpi_get_node(mem_device->device->handle);
>>> +
>>> +	list_for_each_entry_safe(info, tmp, &mem_device->res_list, list) {
>>> +		if (!info->enabled)
>>> +			continue;
>>> +
>>> +		if (!is_memblk_offline(info->start_addr, info->length)) {
>>> +			result = offline_memory(info->start_addr, info->length);
>>> +			if (result)
>>> +				return result;
>>> +		}
>>> +
>>> +		result = remove_memory(node, info->start_addr, info->length);
>>
>> The user may online the memory between offline_memory() and remove_memory().
>> So I think we should lock memory hotplug before checking the memory's status
>> and release it after remove_memory().
>
> How about taking mem_block->state_mutex of the memory being removed? When
> offlining memory, we need to change memory_block->state to MEM_OFFLINE, and
> we take mem_block->state_mutex to do so. So I think using that mutex would
> work well.

On second thought, that is not a good idea, since remove_memory() frees the
mem_block structure...
Do you have any ideas?

Thanks,
Yasuaki Ishimatsu

> Thanks,
> Yasuaki Ishimatsu
>
>> Thanks
>> Wen Congyang
>>
>>> +		if (result)
>>> +			return result;
>>> +
>>> +		list_del(&info->list);
>>> +		kfree(info);
>>> +	}
>>> +
>>>  	kfree(mem_device);
>>>
>>>  	return 0;
>>> Index: linux-3.5-rc6/include/linux/memory_hotplug.h
>>> ===================================================================
>>> --- linux-3.5-rc6.orig/include/linux/memory_hotplug.h	2012-07-09 18:08:29.955888542 +0900
>>> +++ linux-3.5-rc6/include/linux/memory_hotplug.h	2012-07-09 18:08:43.471719518 +0900
>>> @@ -233,6 +233,7 @@ static inline int is_mem_section_removab
>>>  extern int mem_online_node(int nid);
>>>  extern int add_memory(int nid, u64 start, u64 size);
>>>  extern int arch_add_memory(int nid, u64 start, u64 size);
>>> +extern int remove_memory(int nid, u64 start, u64 size);
>>>  extern int offline_memory(u64 start, u64 size);
>>>  extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
>>>  								int nr_pages);
>>> Index: linux-3.5-rc6/mm/memory_hotplug.c
>>> ===================================================================
>>> --- linux-3.5-rc6.orig/mm/memory_hotplug.c	2012-07-09 18:08:29.953888567 +0900
>>> +++ linux-3.5-rc6/mm/memory_hotplug.c	2012-07-09 18:08:43.476719455 +0900
>>> @@ -659,6 +659,14 @@ out:
>>>  }
>>>  EXPORT_SYMBOL_GPL(add_memory);
>>>
>>> +int remove_memory(int nid, u64 start, u64 size)
>>> +{
>>> +	return -EBUSY;
>>> +
>>> +}
>>> +EXPORT_SYMBOL_GPL(remove_memory);
>>> +
>>> +
>>>  #ifdef CONFIG_MEMORY_HOTREMOVE
>>>  /*
>>>   * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
>>> Index: linux-3.5-rc6/drivers/base/memory.c
[PATCH] firmware_map : unify argument of firmware_map_add_early/hotplug
There are two ways to create the /sys/firmware/memmap/X sysfs entries:

  - firmware_map_add_early()
    Called from e820_reserve_resources() when the system starts.
  - firmware_map_add_hotplug()
    Called from add_memory() when memory is hot-plugged.

But the two functions are passed different values for the end argument:

  - end argument of firmware_map_add_early()   : start + size - 1
  - end argument of firmware_map_add_hotplug() : start + size

The patch unifies them to start + size. The content of the
/sys/firmware/memmap/X/end file does not change after applying the patch.

CC: Thomas Gleixner t...@linutronix.de
CC: Ingo Molnar mi...@kernel.org
CC: H. Peter Anvin h...@zytor.com
CC: Tejun Heo t...@kernel.org
CC: Andrew Morton a...@linux-foundation.org
Reviewed-by: Dave Hansen d...@linux.vnet.ibm.com
Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

---
 arch/x86/kernel/e820.c    |    2 +-
 drivers/firmware/memmap.c |    8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

Index: linux-next/arch/x86/kernel/e820.c
===================================================================
--- linux-next.orig/arch/x86/kernel/e820.c	2012-07-02 09:50:23.000000000 +0900
+++ linux-next/arch/x86/kernel/e820.c	2012-07-12 13:30:45.942318179 +0900
@@ -944,7 +944,7 @@
 	for (i = 0; i < e820_saved.nr_map; i++) {
 		struct e820entry *entry = &e820_saved.map[i];
 		firmware_map_add_early(entry->addr,
-			entry->addr + entry->size - 1,
+			entry->addr + entry->size,
 			e820_type_to_string(entry->type));
 	}
 }
Index: linux-next/drivers/firmware/memmap.c
===================================================================
--- linux-next.orig/drivers/firmware/memmap.c	2012-07-02 09:50:26.000000000 +0900
+++ linux-next/drivers/firmware/memmap.c	2012-07-12 13:40:53.823318481 +0900
@@ -98,7 +98,7 @@
 /**
  * firmware_map_add_entry() - Does the real work to add a firmware memmap entry.
  * @start: Start of the memory range.
- * @end:   End of the memory range (inclusive).
+ * @end:   End of the memory range.
  * @type:  Type of the memory range.
  * @entry: Pre-allocated (either kmalloc() or bootmem allocator), uninitialised
  *         entry.
@@ -113,7 +113,7 @@
 	BUG_ON(start > end);
 
 	entry->start = start;
-	entry->end = end;
+	entry->end = end - 1;
 	entry->type = type;
 	INIT_LIST_HEAD(&entry->list);
 	kobject_init(&entry->kobj, &memmap_ktype);
@@ -148,7 +148,7 @@
  * firmware_map_add_hotplug() - Adds a firmware mapping entry when we do
  * memory hotplug.
  * @start: Start of the memory range.
- * @end:   End of the memory range (inclusive).
+ * @end:   End of the memory range.
  * @type:  Type of the memory range.
  *
  * Adds a firmware mapping entry. This function is for memory hotplug, it is
@@ -175,7 +175,7 @@
 /**
  * firmware_map_add_early() - Adds a firmware mapping entry.
  * @start: Start of the memory range.
- * @end:   End of the memory range (inclusive).
+ * @end:   End of the memory range.
  * @type:  Type of the memory range.
  *
  * Adds a firmware mapping entry. This function uses the bootmem allocator
Re: [RFC PATCH v3 2/13] memory-hotplug : add physical memory hotplug code to acpi_memory_device_remove
Hi Wen, 2012/07/17 11:32, Wen Congyang wrote: At 07/17/2012 09:54 AM, Yasuaki Ishimatsu Wrote: Hi Wen, 2012/07/17 10:44, Yasuaki Ishimatsu wrote: Hi Wen, 2012/07/13 12:35, Wen Congyang wrote: At 07/09/2012 06:24 PM, Yasuaki Ishimatsu Wrote: acpi_memory_device_remove() has been prepared to remove physical memory. But, the function only frees acpi_memory_device currentlry. The patch adds following functions into acpi_memory_device_remove(): - offline memory - remove physical memory (only return -EBUSY) - free acpi_memory_device CC: David Rientjes rient...@google.com CC: Jiang Liu liu...@gmail.com CC: Len Brown len.br...@intel.com CC: Benjamin Herrenschmidt b...@kernel.crashing.org CC: Paul Mackerras pau...@samba.org CC: Christoph Lameter c...@linux.com Cc: Minchan Kim minchan@gmail.com CC: Andrew Morton a...@linux-foundation.org CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com CC: Wen Congyang we...@cn.fujitsu.com Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com --- drivers/acpi/acpi_memhotplug.c | 26 +- drivers/base/memory.c | 39 +++ include/linux/memory.h |5 + include/linux/memory_hotplug.h |1 + mm/memory_hotplug.c|8 5 files changed, 78 insertions(+), 1 deletion(-) Index: linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c === --- linux-3.5-rc6.orig/drivers/acpi/acpi_memhotplug.c 2012-07-09 18:08:29.946888653 +0900 +++ linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c 2012-07-09 18:08:43.470719531 +0900 @@ -29,6 +29,7 @@ #include linux/module.h #include linux/init.h #include linux/types.h +#include linux/memory.h #include linux/memory_hotplug.h #include linux/slab.h #include acpi/acpi_drivers.h @@ -452,12 +453,35 @@ static int acpi_memory_device_add(struct static int acpi_memory_device_remove(struct acpi_device *device, int type) { struct acpi_memory_device *mem_device = NULL; - + struct acpi_memory_info *info, *tmp; + int result; + int node; if (!device || !acpi_driver_data(device)) return -EINVAL; mem_device = acpi_driver_data(device); + + node = 
acpi_get_node(mem_device-device-handle); + + list_for_each_entry_safe(info, tmp, mem_device-res_list, list) { + if (!info-enabled) + continue; + + if (!is_memblk_offline(info-start_addr, info-length)) { + result = offline_memory(info-start_addr, info-length); + if (result) + return result; + } + + result = remove_memory(node, info-start_addr, info-length); The user may online the memory between offline_memory() and remove_memory(). So I think we should lock memory hotplug before check the memory's status and release it after remove_memory(). How about get mem_block-state_mutex of removed memory? When offlining memory, we need to change memory_block-state into MEM_OFFLINE. In this case, we get mem_block-state_mutex. So I think the mutex lock is beneficial. It is not good idea since remove_memory frees mem_block structure... Do you have any ideas? Hmm, split offline_memory() to 2 functions: offline_pages() and __offline_pages() offline_pages() lock_memory_hotplug(); __offline_pages(); unlock_memory_hotplug(); and implement remove_memory() like this: remove_memory() lock_memory_hotplug() if (!is_memblk_offline()) { __offline_pages(); } // cleanup unlock_memory_hotplug(); What about this? I also thought about it once. But a problem remains. Current offilne_pages() cannot realize the memory has been removed by remove_memory(). So even if protecting the race by lock_memory_hotplug(), offline_pages() can offline the removed memory. offline_pages() should have the means to know the memory was removed. But I don't have good idea. 
Thanks, Yasuaki Ishimatsu Thanks Wen Congyang Thanks, Yasuaki Ishimatsu Thanks, Yasuaki Ishimatsu Thanks Wen Congyang + if (result) + return result; + + list_del(info-list); + kfree(info); + } + kfree(mem_device); return 0; Index: linux-3.5-rc6/include/linux/memory_hotplug.h === --- linux-3.5-rc6.orig/include/linux/memory_hotplug.h 2012-07-09 18:08:29.955888542 +0900 +++ linux-3.5-rc6/include/linux/memory_hotplug.h 2012-07-09 18:08:43.471719518 +0900 @@ -233,6 +233,7 @@ static inline int is_mem_section_removab extern int mem_online_node(int nid); extern int add_memory(int nid, u64 start, u64 size); extern int arch_add_memory(int nid, u64 start, u64 size); +extern int
Re: [RFC PATCH v3 2/13] memory-hotplug : add physical memory hotplug code to acpi_memory_device_remove
Hi Wen, 2012/07/17 12:32, Wen Congyang wrote: At 07/17/2012 11:08 AM, Yasuaki Ishimatsu Wrote: Hi Wen, 2012/07/17 11:32, Wen Congyang wrote: At 07/17/2012 09:54 AM, Yasuaki Ishimatsu Wrote: Hi Wen, 2012/07/17 10:44, Yasuaki Ishimatsu wrote: Hi Wen, 2012/07/13 12:35, Wen Congyang wrote: At 07/09/2012 06:24 PM, Yasuaki Ishimatsu Wrote: acpi_memory_device_remove() has been prepared to remove physical memory. But, the function only frees acpi_memory_device currentlry. The patch adds following functions into acpi_memory_device_remove(): - offline memory - remove physical memory (only return -EBUSY) - free acpi_memory_device CC: David Rientjes rient...@google.com CC: Jiang Liu liu...@gmail.com CC: Len Brown len.br...@intel.com CC: Benjamin Herrenschmidt b...@kernel.crashing.org CC: Paul Mackerras pau...@samba.org CC: Christoph Lameter c...@linux.com Cc: Minchan Kim minchan@gmail.com CC: Andrew Morton a...@linux-foundation.org CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com CC: Wen Congyang we...@cn.fujitsu.com Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com --- drivers/acpi/acpi_memhotplug.c | 26 +- drivers/base/memory.c | 39 +++ include/linux/memory.h |5 + include/linux/memory_hotplug.h |1 + mm/memory_hotplug.c|8 5 files changed, 78 insertions(+), 1 deletion(-) Index: linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c === --- linux-3.5-rc6.orig/drivers/acpi/acpi_memhotplug.c 2012-07-09 18:08:29.946888653 +0900 +++ linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c2012-07-09 18:08:43.470719531 +0900 @@ -29,6 +29,7 @@ #include linux/module.h #include linux/init.h #include linux/types.h +#include linux/memory.h #include linux/memory_hotplug.h #include linux/slab.h #include acpi/acpi_drivers.h @@ -452,12 +453,35 @@ static int acpi_memory_device_add(struct static int acpi_memory_device_remove(struct acpi_device *device, int type) { struct acpi_memory_device *mem_device = NULL; - + struct acpi_memory_info *info, *tmp; + int result; + int node; if (!device || 
!acpi_driver_data(device)) return -EINVAL; mem_device = acpi_driver_data(device); + + node = acpi_get_node(mem_device-device-handle); + + list_for_each_entry_safe(info, tmp, mem_device-res_list, list) { + if (!info-enabled) + continue; + + if (!is_memblk_offline(info-start_addr, info-length)) { + result = offline_memory(info-start_addr, info-length); + if (result) + return result; + } + + result = remove_memory(node, info-start_addr, info-length); The user may online the memory between offline_memory() and remove_memory(). So I think we should lock memory hotplug before check the memory's status and release it after remove_memory(). How about get mem_block-state_mutex of removed memory? When offlining memory, we need to change memory_block-state into MEM_OFFLINE. In this case, we get mem_block-state_mutex. So I think the mutex lock is beneficial. It is not good idea since remove_memory frees mem_block structure... Do you have any ideas? Hmm, split offline_memory() to 2 functions: offline_pages() and __offline_pages() offline_pages() lock_memory_hotplug(); __offline_pages(); unlock_memory_hotplug(); and implement remove_memory() like this: remove_memory() lock_memory_hotplug() if (!is_memblk_offline()) { __offline_pages(); } // cleanup unlock_memory_hotplug(); What about this? I also thought about it once. But a problem remains. Current offilne_pages() cannot realize the memory has been removed by remove_memory(). So even if protecting the race by lock_memory_hotplug(), offline_pages() can offline the removed memory. offline_pages() should have the means to know the memory was removed. But I don't have good idea. We can not online/offline part of memory block, so what about this? It seems you do not understand my concern. When memory_remove() and offline_pages() run to same memory simultaneously, offline_pages runs to removed memory. 
memory_remove()          | offline_pages()
-------------------------+---------------------------------------------
lock_memory_hotplug()    |
                         | wait at lock_memory_hotplug()
remove memory            |
unlock_memory_hotplug()  |
                         | wake up and start offline_pages()
                         | offline page
                         |   <= but the memory has already been removed
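The interleaving in the diagram can be modelled in userspace. The following is only a sketch under stated assumptions, not kernel code and not a solution agreed on in this thread: `struct mem_range`, `model_remove_memory()` and `model_offline_pages()` are hypothetical names, a pthread mutex stands in for lock_memory_hotplug(), and a `present` flag stands in for whatever means offline_pages() would need to learn that the memory was removed.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one memory range; not a kernel structure. */
struct mem_range {
	pthread_mutex_t lock;	/* stands in for lock_memory_hotplug() */
	bool present;		/* false once the range has been removed */
	bool online;
};

/* Remove the range; refuse while it is still online. */
int model_remove_memory(struct mem_range *r)
{
	int ret = 0;

	assert(r != NULL);
	pthread_mutex_lock(&r->lock);
	if (r->online)
		ret = -EAGAIN;
	else
		r->present = false;
	pthread_mutex_unlock(&r->lock);
	return ret;
}

/*
 * Offline the range. The presence check is made after the lock is taken,
 * so a caller that waited on the lock while the range was being removed
 * sees present == false and backs out instead of touching removed memory.
 */
int model_offline_pages(struct mem_range *r)
{
	int ret = 0;

	assert(r != NULL);
	pthread_mutex_lock(&r->lock);
	if (!r->present)
		ret = -ENODEV;
	else
		r->online = false;
	pthread_mutex_unlock(&r->lock);
	return ret;
}
```

In this model the right-hand column of the diagram ends with an error return rather than with an offline of removed memory. The lock alone does not give that result; the re-check under the lock does.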
Re: [RFC PATCH v3 2/13] memory-hotplug : add physical memory hotplug code to acpi_memory_device_remove
Hi Wen, 2012/07/17 14:17, Wen Congyang wrote: At 07/17/2012 12:51 PM, Yasuaki Ishimatsu Wrote: Hi Wen, 2012/07/17 12:32, Wen Congyang wrote: At 07/17/2012 11:08 AM, Yasuaki Ishimatsu Wrote: Hi Wen, 2012/07/17 11:32, Wen Congyang wrote: At 07/17/2012 09:54 AM, Yasuaki Ishimatsu Wrote: Hi Wen, 2012/07/17 10:44, Yasuaki Ishimatsu wrote: Hi Wen, 2012/07/13 12:35, Wen Congyang wrote: At 07/09/2012 06:24 PM, Yasuaki Ishimatsu Wrote: acpi_memory_device_remove() has been prepared to remove physical memory. But, the function only frees acpi_memory_device currentlry. The patch adds following functions into acpi_memory_device_remove(): - offline memory - remove physical memory (only return -EBUSY) - free acpi_memory_device CC: David Rientjes rient...@google.com CC: Jiang Liu liu...@gmail.com CC: Len Brown len.br...@intel.com CC: Benjamin Herrenschmidt b...@kernel.crashing.org CC: Paul Mackerras pau...@samba.org CC: Christoph Lameter c...@linux.com Cc: Minchan Kim minchan@gmail.com CC: Andrew Morton a...@linux-foundation.org CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com CC: Wen Congyang we...@cn.fujitsu.com Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com --- drivers/acpi/acpi_memhotplug.c | 26 +- drivers/base/memory.c | 39 +++ include/linux/memory.h |5 + include/linux/memory_hotplug.h |1 + mm/memory_hotplug.c|8 5 files changed, 78 insertions(+), 1 deletion(-) Index: linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c === --- linux-3.5-rc6.orig/drivers/acpi/acpi_memhotplug.c 2012-07-09 18:08:29.946888653 +0900 +++ linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c 2012-07-09 18:08:43.470719531 +0900 @@ -29,6 +29,7 @@ #include linux/module.h #include linux/init.h #include linux/types.h +#include linux/memory.h #include linux/memory_hotplug.h #include linux/slab.h #include acpi/acpi_drivers.h @@ -452,12 +453,35 @@ static int acpi_memory_device_add(struct static int acpi_memory_device_remove(struct acpi_device *device, int type) { struct acpi_memory_device *mem_device = 
NULL; - + struct acpi_memory_info *info, *tmp; + int result; + int node; if (!device || !acpi_driver_data(device)) return -EINVAL; mem_device = acpi_driver_data(device); + + node = acpi_get_node(mem_device-device-handle); + + list_for_each_entry_safe(info, tmp, mem_device-res_list, list) { + if (!info-enabled) + continue; + + if (!is_memblk_offline(info-start_addr, info-length)) { + result = offline_memory(info-start_addr, info-length); + if (result) + return result; + } + + result = remove_memory(node, info-start_addr, info-length); The user may online the memory between offline_memory() and remove_memory(). So I think we should lock memory hotplug before check the memory's status and release it after remove_memory(). How about get mem_block-state_mutex of removed memory? When offlining memory, we need to change memory_block-state into MEM_OFFLINE. In this case, we get mem_block-state_mutex. So I think the mutex lock is beneficial. It is not good idea since remove_memory frees mem_block structure... Do you have any ideas? Hmm, split offline_memory() to 2 functions: offline_pages() and __offline_pages() offline_pages() lock_memory_hotplug(); __offline_pages(); unlock_memory_hotplug(); and implement remove_memory() like this: remove_memory() lock_memory_hotplug() if (!is_memblk_offline()) { __offline_pages(); } // cleanup unlock_memory_hotplug(); What about this? I also thought about it once. But a problem remains. Current offilne_pages() cannot realize the memory has been removed by remove_memory(). So even if protecting the race by lock_memory_hotplug(), offline_pages() can offline the removed memory. offline_pages() should have the means to know the memory was removed. But I don't have good idea. We can not online/offline part of memory block, so what about this? It seems you do not understand my concern. When memory_remove() and offline_pages() run to same memory simultaneously, offline_pages runs to removed memory. 
memory_remove()          | offline_pages()
-------------------------+---------------------------------------------
lock_memory_hotplug()    |
                         | wait at lock_memory_hotplug()
remove memory            |
unlock_memory_hotplug()  |
                         | wake up and start offline_pages
[RFC PATCH 0/13] firmware_map : unify argument of firmware_map_add_early/hotplug
There are two ways to create the /sys/firmware/memmap/X sysfs entries:

  - firmware_map_add_early()
    Called from e820_reserve_resources() when the system starts.
  - firmware_map_add_hotplug()
    Called from add_memory() when memory is hot-plugged.

But the two functions are passed different values for the end argument:

  - end argument of firmware_map_add_early()   : start + size - 1
  - end argument of firmware_map_add_hotplug() : start + size

The patch unifies them to start + size. The content of the
/sys/firmware/memmap/X/end file does not change after applying the patch.

CC: Thomas Gleixner t...@linutronix.de
CC: Ingo Molnar mi...@kernel.org
CC: H. Peter Anvin h...@zytor.com
CC: Tejun Heo t...@kernel.org
CC: Andrew Morton a...@linux-foundation.org
Reviewed-by: Dave Hansen d...@linux.vnet.ibm.com
Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

---
 arch/x86/kernel/e820.c    |    2 +-
 drivers/firmware/memmap.c |    8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

Index: linux-3.5-rc6/arch/x86/kernel/e820.c
===================================================================
--- linux-3.5-rc6.orig/arch/x86/kernel/e820.c	2012-07-18 17:19:38.391365260 +0900
+++ linux-3.5-rc6/arch/x86/kernel/e820.c	2012-07-18 17:19:43.616300222 +0900
@@ -944,7 +944,7 @@ void __init e820_reserve_resources(void)
 	for (i = 0; i < e820_saved.nr_map; i++) {
 		struct e820entry *entry = &e820_saved.map[i];
 		firmware_map_add_early(entry->addr,
-			entry->addr + entry->size - 1,
+			entry->addr + entry->size,
 			e820_type_to_string(entry->type));
 	}
 }
Index: linux-3.5-rc6/drivers/firmware/memmap.c
===================================================================
--- linux-3.5-rc6.orig/drivers/firmware/memmap.c	2012-07-18 17:19:38.388365299 +0900
+++ linux-3.5-rc6/drivers/firmware/memmap.c	2012-07-18 18:30:47.608390251 +0900
@@ -98,7 +98,7 @@ static LIST_HEAD(map_entries);
 /**
  * firmware_map_add_entry() - Does the real work to add a firmware memmap entry.
  * @start: Start of the memory range.
- * @end:   End of the memory range (inclusive).
+ * @end:   End of the memory range.
  * @type:  Type of the memory range.
  * @entry: Pre-allocated (either kmalloc() or bootmem allocator), uninitialised
  *         entry.
@@ -113,7 +113,7 @@ static int firmware_map_add_entry(u64 st
 	BUG_ON(start > end);
 
 	entry->start = start;
-	entry->end = end;
+	entry->end = end - 1;
 	entry->type = type;
 	INIT_LIST_HEAD(&entry->list);
 	kobject_init(&entry->kobj, &memmap_ktype);
@@ -148,7 +148,7 @@ static int add_sysfs_fw_map_entry(struct
  * firmware_map_add_hotplug() - Adds a firmware mapping entry when we do
  * memory hotplug.
  * @start: Start of the memory range.
- * @end:   End of the memory range (inclusive).
+ * @end:   End of the memory range.
  * @type:  Type of the memory range.
  *
  * Adds a firmware mapping entry. This function is for memory hotplug, it is
@@ -175,7 +175,7 @@ int __meminit firmware_map_add_hotplug(u
 /**
  * firmware_map_add_early() - Adds a firmware mapping entry.
  * @start: Start of the memory range.
- * @end:   End of the memory range (inclusive).
+ * @end:   End of the memory range.
  * @type:  Type of the memory range.
  *
  * Adds a firmware mapping entry. This function uses the bootmem allocator
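To make the end-argument convention concrete, here is a minimal userspace sketch, not the kernel code. `struct map_entry` and `map_entry_init()` are hypothetical stand-ins for `struct firmware_map_entry` and `firmware_map_add_entry()`. It assumes only what the patch description states: callers pass start + size, and the stored (sysfs-visible) end stays inclusive. The strict `start < end_exclusive` check is an assumption of this sketch; the patch keeps `BUG_ON(start > end)`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct firmware_map_entry. */
struct map_entry {
	uint64_t start;	/* first byte of the range */
	uint64_t end;	/* last byte of the range (inclusive) */
};

/*
 * Callers pass an exclusive end (start + size). The entry keeps the
 * inclusive end, so the value exported to userspace does not change.
 */
void map_entry_init(struct map_entry *entry, uint64_t start,
		    uint64_t end_exclusive)
{
	assert(start < end_exclusive);	/* sketch-only check, see lead-in */
	entry->start = start;
	entry->end = end_exclusive - 1;
}
```

With this convention a boot-time caller and a hotplug caller can both pass `start + size`, and a 4 KiB range starting at 0x100000 is still reported with an end of 0x100fff.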
[RFC PATCH v4 1/13] memory-hotplug : rename remove_memory to offline_memory
remove_memory() does not remove memory; it only offlines it. The patch
renames it to offline_memory().

CC: David Rientjes rient...@google.com
CC: Jiang Liu liu...@gmail.com
CC: Len Brown len.br...@intel.com
CC: Benjamin Herrenschmidt b...@kernel.crashing.org
CC: Paul Mackerras pau...@samba.org
CC: Christoph Lameter c...@linux.com
Cc: Minchan Kim minchan@gmail.com
CC: Andrew Morton a...@linux-foundation.org
CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
CC: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

---
 drivers/acpi/acpi_memhotplug.c |    2 +-
 drivers/base/memory.c          |    4 ++--
 include/linux/memory_hotplug.h |    2 +-
 mm/memory_hotplug.c            |    6 +++---
 4 files changed, 7 insertions(+), 7 deletions(-)

Index: linux-3.5-rc4/drivers/acpi/acpi_memhotplug.c
===================================================================
--- linux-3.5-rc4.orig/drivers/acpi/acpi_memhotplug.c	2012-07-03 14:21:46.102416917 +0900
+++ linux-3.5-rc4/drivers/acpi/acpi_memhotplug.c	2012-07-03 14:21:49.458374960 +0900
@@ -318,7 +318,7 @@ static int acpi_memory_disable_device(st
 	 */
 	list_for_each_entry_safe(info, n, &mem_device->res_list, list) {
 		if (info->enabled) {
-			result = remove_memory(info->start_addr, info->length);
+			result = offline_memory(info->start_addr, info->length);
 			if (result)
 				return result;
 		}
Index: linux-3.5-rc4/drivers/base/memory.c
===================================================================
--- linux-3.5-rc4.orig/drivers/base/memory.c	2012-07-03 14:21:46.095417003 +0900
+++ linux-3.5-rc4/drivers/base/memory.c	2012-07-03 14:21:49.459374948 +0900
@@ -266,8 +266,8 @@ memory_block_action(unsigned long phys_i
 			break;
 		case MEM_OFFLINE:
 			start_paddr = page_to_pfn(first_page) << PAGE_SHIFT;
-			ret = remove_memory(start_paddr,
-					    nr_pages << PAGE_SHIFT);
+			ret = offline_memory(start_paddr,
+					     nr_pages << PAGE_SHIFT);
 			break;
 		default:
 			WARN(1, KERN_WARNING "%s(%ld, %ld) unknown action: "
Index: linux-3.5-rc4/mm/memory_hotplug.c
===================================================================
--- linux-3.5-rc4.orig/mm/memory_hotplug.c	2012-07-03 14:21:46.102416917 +0900
+++ linux-3.5-rc4/mm/memory_hotplug.c	2012-07-03 14:21:49.466374860 +0900
@@ -990,7 +990,7 @@ out:
 	return ret;
 }
 
-int remove_memory(u64 start, u64 size)
+int offline_memory(u64 start, u64 size)
 {
 	unsigned long start_pfn, end_pfn;
 
@@ -999,9 +999,9 @@ int remove_memory(u64 start, u64 size)
 	return offline_pages(start_pfn, end_pfn, 120 * HZ);
 }
 #else
-int remove_memory(u64 start, u64 size)
+int offline_memory(u64 start, u64 size)
 {
 	return -EINVAL;
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
-EXPORT_SYMBOL_GPL(remove_memory);
+EXPORT_SYMBOL_GPL(offline_memory);
Index: linux-3.5-rc4/include/linux/memory_hotplug.h
===================================================================
--- linux-3.5-rc4.orig/include/linux/memory_hotplug.h	2012-07-03 14:21:46.102416917 +0900
+++ linux-3.5-rc4/include/linux/memory_hotplug.h	2012-07-03 14:21:49.471374796 +0900
@@ -233,7 +233,7 @@ static inline int is_mem_section_removab
 extern int mem_online_node(int nid);
 extern int add_memory(int nid, u64 start, u64 size);
 extern int arch_add_memory(int nid, u64 start, u64 size);
-extern int remove_memory(u64 start, u64 size);
+extern int offline_memory(u64 start, u64 size);
 extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
 								int nr_pages);
 extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms);
[RFC PATCH v4 2/13] memory-hotplug : add physical memory hotplug code to acpi_memory_device_remove
acpi_memory_device_remove() was prepared for removing physical memory, but
currently it only frees acpi_memory_device. The patch adds the following
steps to acpi_memory_device_remove():

  - offline the memory
  - remove the physical memory (for now remove_memory() only checks whether
    the memory is online)
  - free acpi_memory_device

CC: David Rientjes rient...@google.com
CC: Jiang Liu liu...@gmail.com
CC: Len Brown len.br...@intel.com
CC: Benjamin Herrenschmidt b...@kernel.crashing.org
CC: Paul Mackerras pau...@samba.org
CC: Christoph Lameter c...@linux.com
Cc: Minchan Kim minchan@gmail.com
CC: Andrew Morton a...@linux-foundation.org
CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
CC: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

---
 drivers/acpi/acpi_memhotplug.c |   27 ++++++++++++++++++++++++++-
 drivers/base/memory.c          |   39 +++++++++++++++++++++++++++++++++++++++
 include/linux/memory.h         |    5 +++++
 include/linux/memory_hotplug.h |    5 +++++
 mm/memory_hotplug.c            |   22 ++++++++++++++++++++++
 5 files changed, 97 insertions(+), 1 deletion(-)

Index: linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c
===================================================================
--- linux-3.5-rc6.orig/drivers/acpi/acpi_memhotplug.c	2012-07-17 11:20:15.117796971 +0900
+++ linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c	2012-07-17 13:36:30.325594022 +0900
@@ -29,6 +29,7 @@
 #include <linux/module.h>
 #include <linux/init.h>
 #include <linux/types.h>
+#include <linux/memory.h>
 #include <linux/memory_hotplug.h>
 #include <linux/slab.h>
 #include <acpi/acpi_drivers.h>
@@ -452,12 +453,36 @@ static int acpi_memory_device_add(struct
 static int acpi_memory_device_remove(struct acpi_device *device, int type)
 {
 	struct acpi_memory_device *mem_device = NULL;
-
+	struct acpi_memory_info *info, *tmp;
+	int result;
+	int node;
 
 	if (!device || !acpi_driver_data(device))
 		return -EINVAL;
 
 	mem_device = acpi_driver_data(device);
+
+	node = acpi_get_node(mem_device->device->handle);
+	list_for_each_entry_safe(info, tmp, &mem_device->res_list, list) {
+		if (!info->enabled)
+			continue;
+
+		if (!is_memblk_offline(info->start_addr, info->length)) {
+			result = offline_memory(info->start_addr, info->length);
+			if (result)
+				return result;
+		}
+		if (node < 0)
+			node = memory_add_physaddr_to_nid(info->start_addr);
+
+		result = remove_memory(node, info->start_addr, info->length);
+		if (result)
+			return result;
+
+		list_del(&info->list);
+		kfree(info);
+	}
+
 	kfree(mem_device);
 
 	return 0;
Index: linux-3.5-rc6/include/linux/memory_hotplug.h
===================================================================
--- linux-3.5-rc6.orig/include/linux/memory_hotplug.h	2012-07-17 11:20:15.133796772 +0900
+++ linux-3.5-rc6/include/linux/memory_hotplug.h	2012-07-17 11:29:41.490716352 +0900
@@ -221,6 +221,7 @@ static inline void unlock_memory_hotplug
 #ifdef CONFIG_MEMORY_HOTREMOVE
 
 extern int is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
+extern int remove_memory(int nid, u64 start, u64 size);
 
 #else
 static inline int is_mem_section_removable(unsigned long pfn,
@@ -228,6 +229,10 @@ static inline int is_mem_section_removab
 {
 	return 0;
 }
+static inline int remove_memory(int nid, u64 start, u64 size)
+{
+	return -EBUSY;
+}
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 extern int mem_online_node(int nid);
Index: linux-3.5-rc6/mm/memory_hotplug.c
===================================================================
--- linux-3.5-rc6.orig/mm/memory_hotplug.c	2012-07-17 11:20:15.129796821 +0900
+++ linux-3.5-rc6/mm/memory_hotplug.c	2012-07-17 13:25:18.952986069 +0900
@@ -998,6 +998,28 @@ int offline_memory(u64 start, u64 size)
 	end_pfn = start_pfn + PFN_DOWN(size);
 	return offline_pages(start_pfn, end_pfn, 120 * HZ);
 }
+
+int remove_memory(int nid, u64 start, u64 size)
+{
+	int ret = -EBUSY;
+	lock_memory_hotplug();
+	/*
+	 * The memory might become online by other task, even if you offine it.
+	 * So we check whether the cpu has been onlined or not.
+	 */
+	if (!is_memblk_offline(start, size)) {
+		pr_warn("memory removing [mem %#010llx-%#010llx] failed, "
+			"because the memmory range is online\n",
+			start, start + size);
+		ret = -EAGAIN;
+	}
+
+	unlock_memory_hotplug();
+	return ret;
+
+}
+EXPORT_SYMBOL_GPL(remove_memory);
+
 #else
 int offline_memory(u64 start, u64 size)
 {
Index: linux-3.5-rc6/drivers/base/memory.c
[PATCH v4 3/13] memory-hotplug : check whether memory is present or not
If the system supports memory hot-remove, online_pages() may try to online
pages that have already been removed. So online_pages() needs to check
whether the pages being onlined are present.

CC: David Rientjes rient...@google.com
CC: Jiang Liu liu...@gmail.com
CC: Len Brown len.br...@intel.com
CC: Benjamin Herrenschmidt b...@kernel.crashing.org
CC: Paul Mackerras pau...@samba.org
CC: Christoph Lameter c...@linux.com
Cc: Minchan Kim minchan@gmail.com
CC: Andrew Morton a...@linux-foundation.org
CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
CC: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com

---
 include/linux/mmzone.h |   21 +++++++++++++++++++++
 mm/memory_hotplug.c    |   13 +++++++++++++
 2 files changed, 34 insertions(+)

Index: linux-3.5-rc6/include/linux/mmzone.h
===================================================================
--- linux-3.5-rc6.orig/include/linux/mmzone.h	2012-07-08 09:23:56.000000000 +0900
+++ linux-3.5-rc6/include/linux/mmzone.h	2012-07-17 16:10:21.588186145 +0900
@@ -1168,6 +1168,27 @@ void sparse_init(void);
 #define sparse_index_init(_sec, _nid)  do {} while (0)
 #endif /* CONFIG_SPARSEMEM */
 
+#ifdef CONFIG_SPARSEMEM
+static inline int pfns_present(unsigned long pfn, unsigned long nr_pages)
+{
+	int i;
+	for (i = 0; i < nr_pages; i++) {
+		if (pfn_present(pfn + 1))
+			continue;
+		else {
+			unlock_memory_hotplug();
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+#else
+static inline int pfns_present(unsigned long pfn, unsigned long nr_pages)
+{
+	return 0;
+}
+#endif /* CONFIG_SPARSEMEM*/
+
 #ifdef CONFIG_NODES_SPAN_OTHER_NODES
 bool early_pfn_in_nid(unsigned long pfn, int nid);
 #else
Index: linux-3.5-rc6/mm/memory_hotplug.c
===================================================================
--- linux-3.5-rc6.orig/mm/memory_hotplug.c	2012-07-17 14:26:40.000000000 +0900
+++ linux-3.5-rc6/mm/memory_hotplug.c	2012-07-17 16:09:50.070580170 +0900
@@ -467,6 +467,19 @@ int __ref online_pages(unsigned long pfn
 	struct memory_notify arg;
 
 	lock_memory_hotplug();
+	/*
+	 * If system supports memory hot-remove, the memory may have been
+	 * removed. So we check whether the memory has been removed or not.
+	 *
+	 * Note: When CONFIG_SPARSEMEM is defined, pfns_present() become
+	 * effective. If CONFIG_SPARSEMEM is not defined, pfns_present()
+	 * always returns 0.
+	 */
+	ret = pfns_present(pfn, nr_pages);
+	if (ret) {
+		unlock_memory_hotplug();
+		return ret;
+	}
 	arg.start_pfn = pfn;
 	arg.nr_pages = nr_pages;
 	arg.status_change_nid = -1;
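Two details of pfns_present() as posted look unintended: the loop tests `pfn + 1` on every iteration instead of `pfn + i`, and the helper calls unlock_memory_hotplug() although online_pages() also unlocks on the error path. The following userspace sketch shows what a whole-range check might look like. It assumes the intent is to test every pfn in [pfn, pfn + nr_pages) and to leave locking to the caller. `range_present()` and `present_map` are hypothetical; a bool array stands in for pfn_present().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace model: present_map[pfn] plays the role of
 * pfn_present(pfn). Returns 0 if every pfn in [pfn, pfn + nr_pages) is
 * present, -EINVAL otherwise. No locking is done here; the caller holds
 * the lock and releases it on both the success and the error path.
 */
int range_present(const bool *present_map, size_t map_len,
		  unsigned long pfn, unsigned long nr_pages)
{
	unsigned long i;

	assert(present_map != NULL);
	for (i = 0; i < nr_pages; i++) {
		/* index with the loop variable: pfn + i, not pfn + 1 */
		if (pfn + i >= map_len || !present_map[pfn + i])
			return -EINVAL;
	}
	return 0;
}
```

Keeping the unlock in one place (the caller) avoids releasing the hotplug lock twice when a page is missing.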
[RFC PATCH v4 4/13] memory-hotplug : remove /sys/firmware/memmap/X sysfs
When memory is (hot-)added to the system, the sysfs files
/sys/firmware/memmap/X/{end, start, type} are created, but there is no code
to remove them. This patch implements a function to remove them.

Note: The code does not free a firmware_map_entry that was allocated from
bootmem, since there is no way to free such memory.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
CC: Paul Mackerras <pau...@samba.org>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
CC: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>

---
 drivers/firmware/memmap.c    |   78 ++-
 include/linux/firmware-map.h |    6 +++
 mm/memory_hotplug.c          |    9 +++-
 3 files changed, 90 insertions(+), 3 deletions(-)

Index: linux-3.5-rc6/mm/memory_hotplug.c
===================================================================
--- linux-3.5-rc6.orig/mm/memory_hotplug.c	2012-07-18 17:20:05.670024283 +0900
+++ linux-3.5-rc6/mm/memory_hotplug.c	2012-07-18 17:51:03.933189930 +0900
@@ -1012,9 +1012,9 @@ int offline_memory(u64 start, u64 size)
 	return offline_pages(start_pfn, end_pfn, 120 * HZ);
 }
 
-int remove_memory(int nid, u64 start, u64 size)
+int __ref remove_memory(int nid, u64 start, u64 size)
 {
-	int ret = -EBUSY;
+	int ret = 0;
 	lock_memory_hotplug();
 	/*
 	 * The memory might become online by other task, even if you offine it.
@@ -1025,8 +1025,13 @@ int remove_memory(int nid, u64 start, u6
 			"because the memmory range is online\n",
 			start, start + size);
 		ret = -EAGAIN;
+		goto out;
 	}
 
+	/* remove memmap entry */
+	firmware_map_remove(start, start + size, "System RAM");
+
+out:
 	unlock_memory_hotplug();
 	return ret;
Index: linux-3.5-rc6/include/linux/firmware-map.h
===================================================================
--- linux-3.5-rc6.orig/include/linux/firmware-map.h	2012-07-18 17:19:37.007382563 +0900
+++ linux-3.5-rc6/include/linux/firmware-map.h	2012-07-18 17:42:20.804730245 +0900
@@ -25,6 +25,7 @@
 
 int firmware_map_add_early(u64 start, u64 end, const char *type);
 int firmware_map_add_hotplug(u64 start, u64 end, const char *type);
+int firmware_map_remove(u64 start, u64 end, const char *type);
 
 #else /* CONFIG_FIRMWARE_MEMMAP */
 
@@ -38,6 +39,11 @@ static inline int firmware_map_add_hotpl
 	return 0;
 }
 
+static inline int firmware_map_remove(u64 start, u64 end, const char *type)
+{
+	return 0;
+}
+
 #endif /* CONFIG_FIRMWARE_MEMMAP */
 
 #endif /* _LINUX_FIRMWARE_MAP_H */
Index: linux-3.5-rc6/drivers/firmware/memmap.c
===================================================================
--- linux-3.5-rc6.orig/drivers/firmware/memmap.c	2012-07-18 17:19:43.618300182 +0900
+++ linux-3.5-rc6/drivers/firmware/memmap.c	2012-07-18 17:42:20.846729721 +0900
@@ -21,6 +21,7 @@
 #include <linux/types.h>
 #include <linux/bootmem.h>
 #include <linux/slab.h>
+#include <linux/mm.h>
 
 /*
  * Data types --
@@ -79,7 +80,22 @@ static const struct sysfs_ops memmap_att
 	.show = memmap_attr_show,
 };
 
+#define to_memmap_entry(obj) container_of(obj, struct firmware_map_entry, kobj)
+
+static void release_firmware_map_entry(struct kobject *kobj)
+{
+	struct firmware_map_entry *entry = to_memmap_entry(kobj);
+	struct page *page;
+
+	page = virt_to_page(entry);
+	if (PageSlab(page) || PageCompound(page))
+		kfree(entry);
+
+	/* There is no way to free memory allocated from bootmem*/
+}
+
 static struct kobj_type memmap_ktype = {
+	.release	= release_firmware_map_entry,
 	.sysfs_ops	= &memmap_attr_ops,
 	.default_attrs	= def_attrs,
 };
@@ -123,6 +139,16 @@ static int firmware_map_add_entry(u64 st
 	return 0;
 }
 
+/**
+ * firmware_map_remove_entry() - Does the real work to remove a firmware
+ * memmap entry.
+ * @entry: removed entry.
+ **/
+static inline void firmware_map_remove_entry(struct firmware_map_entry *entry)
+{
+	list_del(&entry->list);
+}
+
 /*
  * Add memmap entry on sysfs
  */
@@ -144,6 +170,31 @@ static int add_sysfs_fw_map_entry(struct
 	return 0;
 }
 
+/*
+ * Remove memmap entry on sysfs
+ */
+static inline void remove_sysfs_fw_map_entry(struct firmware_map_entry *entry)
+{
+	kobject_put(&entry->kobj);
+}
+
+/*
+ * Search memmap entry
+ */
+
+struct firmware_map_entry * __meminit
+find_firmware_map_entry(u64 start, u64 end, const char *type)
+{
+	struct firmware_map_entry *entry;
+
+	list_for_each_entry(entry, &map_entries, list)
+		if ((entry
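The release path in this patch frees an entry only when it came from the slab allocator and leaves bootmem-backed entries alone. The following is a minimal userspace sketch of that idea, not kernel code. The types kobj_model and map_entry_model, the from_heap flag (standing in for the PageSlab()/PageCompound() test), and the helper names are all hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of(); illustration only. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified kobject: a reference count plus a release hook. */
struct kobj_model {
	int refcount;
	void (*release)(struct kobj_model *kobj);
};

/* Hypothetical entry; from_heap stands in for the PageSlab() test. */
struct map_entry_model {
	unsigned long long start, end;
	int from_heap;
	struct kobj_model kobj;
};

static int freed_count;	/* number of entries handed back to the allocator */

static void release_entry(struct kobj_model *kobj)
{
	struct map_entry_model *entry =
		container_of(kobj, struct map_entry_model, kobj);

	/* Only heap-backed entries are freed; early-boot ones are left alone. */
	if (entry->from_heap) {
		free(entry);
		freed_count++;
	}
}

/* Drop one reference; the last put triggers the release hook. */
static void kobj_model_put(struct kobj_model *kobj)
{
	if (--kobj->refcount == 0 && kobj->release)
		kobj->release(kobj);
}

/* Releases one entry of each kind and returns how many were freed. */
static int demo_release(void)
{
	static struct map_entry_model boot_entry;	/* models a bootmem entry */
	struct map_entry_model *heap_entry = calloc(1, sizeof(*heap_entry));

	assert(heap_entry != NULL);
	freed_count = 0;

	heap_entry->from_heap = 1;
	heap_entry->kobj.refcount = 1;
	heap_entry->kobj.release = release_entry;

	boot_entry.from_heap = 0;
	boot_entry.kobj.refcount = 1;
	boot_entry.kobj.release = release_entry;

	kobj_model_put(&heap_entry->kobj);	/* heap entry is freed */
	kobj_model_put(&boot_entry.kobj);	/* boot entry is skipped */
	return freed_count;
}
```

In this model only the heap-backed entry is returned to the allocator; the statically allocated one survives its release callback, which mirrors the note in the changelog about bootmem memory.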
[RFC PATCH v4 7/13] memory-hotplug : remove_memory calls __remove_pages
This patch makes remove_memory() call __remove_pages(). The range described
by the phys_start_pfn and nr_pages arguments of __remove_pages() may then
span more than one zone, so the zone argument is removed from
__remove_pages(), and __remove_pages() calculates the zone for each section.

When CONFIG_SPARSEMEM_VMEMMAP is defined, there is no way to remove a memmap,
so __remove_section() only calls unregister_memory_section().

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
CC: Paul Mackerras <pau...@samba.org>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
CC: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>

---
 arch/powerpc/platforms/pseries/hotplug-memory.c |    5 +
 include/linux/memory_hotplug.h                  |    3 +--
 mm/memory_hotplug.c                             |   19 ---
 3 files changed, 14 insertions(+), 13 deletions(-)

Index: linux-3.5-rc6/mm/memory_hotplug.c
===================================================================
--- linux-3.5-rc6.orig/mm/memory_hotplug.c	2012-07-18 18:00:27.440145432 +0900
+++ linux-3.5-rc6/mm/memory_hotplug.c	2012-07-18 18:01:02.070712487 +0900
@@ -275,11 +275,14 @@ static int __meminit __add_section(int n
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
 static int __remove_section(struct zone *zone, struct mem_section *ms)
 {
-	/*
-	 * XXX: Freeing memmap with vmemmap is not implement yet.
-	 *      This should be removed later.
-	 */
-	return -EBUSY;
+	int ret = -EINVAL;
+
+	if (!valid_section(ms))
+		return ret;
+
+	ret = unregister_memory_section(ms);
+
+	return ret;
 }
 #else
 static int __remove_section(struct zone *zone, struct mem_section *ms)
@@ -346,11 +349,11 @@ EXPORT_SYMBOL_GPL(__add_pages);
  * sure that pages are marked reserved and zones are adjust properly by
  * calling offline_pages().
  */
-int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
-		 unsigned long nr_pages)
+int __remove_pages(unsigned long phys_start_pfn, unsigned long nr_pages)
 {
 	unsigned long i, ret = 0;
 	int sections_to_remove;
+	struct zone *zone;
 
 	/*
 	 * We can only remove entire sections
@@ -363,6 +366,7 @@ int __remove_pages(struct zone *zone, un
 	sections_to_remove = nr_pages / PAGES_PER_SECTION;
 	for (i = 0; i < sections_to_remove; i++) {
 		unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION;
+		zone = page_zone(pfn_to_page(pfn));
 		ret = __remove_section(zone, __pfn_to_section(pfn));
 		if (ret)
 			break;
@@ -1031,6 +1035,7 @@ int __ref remove_memory(int nid, u64 sta
 	/* remove memmap entry */
 	firmware_map_remove(start, start + size, "System RAM");
 
+	__remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT);
 out:
 	unlock_memory_hotplug();
 	return ret;
Index: linux-3.5-rc6/include/linux/memory_hotplug.h
===================================================================
--- linux-3.5-rc6.orig/include/linux/memory_hotplug.h	2012-07-18 18:00:27.445145371 +0900
+++ linux-3.5-rc6/include/linux/memory_hotplug.h	2012-07-18 18:00:40.461982690 +0900
@@ -89,8 +89,7 @@ extern bool is_pageblock_removable_noloc
 /* reasonably generic interface to expand the physical pages in a zone */
 extern int __add_pages(int nid, struct zone *zone, unsigned long start_pfn,
 	unsigned long nr_pages);
-extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
-	unsigned long nr_pages);
+extern int __remove_pages(unsigned long start_pfn, unsigned long nr_pages);
 
 #ifdef CONFIG_NUMA
 extern int memory_add_physaddr_to_nid(u64 start);
Index: linux-3.5-rc6/arch/powerpc/platforms/pseries/hotplug-memory.c
===================================================================
--- linux-3.5-rc6.orig/arch/powerpc/platforms/pseries/hotplug-memory.c	2012-07-18 18:00:27.442145407 +0900
+++ linux-3.5-rc6/arch/powerpc/platforms/pseries/hotplug-memory.c	2012-07-18 18:00:40.470982578 +0900
@@ -76,7 +76,6 @@ unsigned long memory_block_size_bytes(vo
 static int pseries_remove_memblock(unsigned long base, unsigned int memblock_size)
 {
 	unsigned long start, start_pfn;
-	struct zone *zone;
 	int i, ret;
 	int sections_to_remove;
 
@@ -87,8 +86,6 @@ static int pseries_remove_memblock(unsig
 		return 0;
 	}
 
-	zone = page_zone(pfn_to_page(start_pfn));
-
 	/*
 	 * Remove section mappings and sysfs entries for the
 	 * section of the memory we are removing.
@@ -101,7 +98,7 @@ static int pseries_remove_memblock(unsig
 	sections_to_remove = (memblock_size >> PAGE_SHIFT
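The reason for dropping the zone argument is that a section-aligned range can straddle a zone boundary, so the zone has to be looked up per section. A minimal userspace sketch of that lookup follows. The zone table, the tiny PAGES_PER_SECTION_MODEL value, and the function names are hypothetical and chosen only to keep the example small; the kernel uses page_zone(pfn_to_page(pfn)) instead.

```c
#include <assert.h>
#include <stddef.h>

#define PAGES_PER_SECTION_MODEL 8UL	/* hypothetical, tiny for illustration */

/* Hypothetical zone layout: each zone covers [start_pfn, end_pfn). */
struct zone_model {
	const char *name;
	unsigned long start_pfn, end_pfn;
};

static const struct zone_model zones[] = {
	{ "Normal",   0, 16 },
	{ "Movable", 16, 32 },
};

/* Stand-in for page_zone(pfn_to_page(pfn)): find the zone holding pfn. */
static const struct zone_model *zone_of_pfn(unsigned long pfn)
{
	for (size_t i = 0; i < sizeof(zones) / sizeof(zones[0]); i++)
		if (pfn >= zones[i].start_pfn && pfn < zones[i].end_pfn)
			return &zones[i];
	return NULL;
}

/*
 * Walk a section-aligned range and record the zone of each section.
 * Returns the number of sections visited, or -1 if the range is not
 * section-aligned or the output buffer is too small.
 */
static int zones_for_range(unsigned long start_pfn, unsigned long nr_pages,
			   const struct zone_model **out, size_t out_len)
{
	unsigned long sections;

	if (start_pfn % PAGES_PER_SECTION_MODEL ||
	    nr_pages % PAGES_PER_SECTION_MODEL)
		return -1;

	sections = nr_pages / PAGES_PER_SECTION_MODEL;
	if (sections > out_len)
		return -1;

	for (unsigned long i = 0; i < sections; i++)
		out[i] = zone_of_pfn(start_pfn + i * PAGES_PER_SECTION_MODEL);

	return (int)sections;
}

/* Returns 1 if a range crossing the boundary yields two different zones. */
static int demo_zone_lookup(void)
{
	const struct zone_model *seen[4];
	int n = zones_for_range(8, 16, seen, 4);	/* pfns 8..23 */

	assert(n == 2);
	return seen[0] == &zones[0] && seen[1] == &zones[1];
}
```

With a single zone passed in by the caller, the second section in this example would have been handled with the wrong zone; computing it inside the loop avoids that.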
[RFC PATCH v4 5/13] memory-hotplug : does not release memory region in PAGES_PER_SECTION chunks
Since commit de7f0cba96786c, release_mem_region() has been called in
PAGES_PER_SECTION chunks, because add_memory() calls
register_memory_resource() in PAGES_PER_SECTION chunks. But that granularity
appears to depend on the firmware. If CRS entries in the ACPI DSDT table
describe memory in PAGES_PER_SECTION chunks, register_memory_resource() is
called in PAGES_PER_SECTION chunks. But if CRS entries describe memory per
DIMM, register_memory_resource() is called once per DIMM. So
release_mem_region() should not be called in PAGES_PER_SECTION chunks. This
patch fixes that.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
CC: Paul Mackerras <pau...@samba.org>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
CC: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>

---
 arch/powerpc/platforms/pseries/hotplug-memory.c |   13 +
 mm/memory_hotplug.c                             |    4 ++--
 2 files changed, 11 insertions(+), 6 deletions(-)

Index: linux-3.5-rc6/mm/memory_hotplug.c
===================================================================
--- linux-3.5-rc6.orig/mm/memory_hotplug.c	2012-07-18 17:51:03.933189930 +0900
+++ linux-3.5-rc6/mm/memory_hotplug.c	2012-07-18 17:51:17.550020005 +0900
@@ -358,11 +358,11 @@ int __remove_pages(struct zone *zone, un
 	BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
 	BUG_ON(nr_pages % PAGES_PER_SECTION);
 
+	release_mem_region(phys_start_pfn << PAGE_SHIFT, nr_pages * PAGE_SIZE);
+
 	sections_to_remove = nr_pages / PAGES_PER_SECTION;
 	for (i = 0; i < sections_to_remove; i++) {
 		unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION;
-		release_mem_region(pfn << PAGE_SHIFT,
-				   PAGES_PER_SECTION << PAGE_SHIFT);
 		ret = __remove_section(zone, __pfn_to_section(pfn));
 		if (ret)
 			break;
Index: linux-3.5-rc6/arch/powerpc/platforms/pseries/hotplug-memory.c
===================================================================
--- linux-3.5-rc6.orig/arch/powerpc/platforms/pseries/hotplug-memory.c	2012-07-18 17:50:49.893365814 +0900
+++ linux-3.5-rc6/arch/powerpc/platforms/pseries/hotplug-memory.c	2012-07-18 17:51:17.553019968 +0900
@@ -77,7 +77,8 @@ static int pseries_remove_memblock(unsig
 {
 	unsigned long start, start_pfn;
 	struct zone *zone;
-	int ret;
+	int i, ret;
+	int sections_to_remove;
 
 	start_pfn = base >> PAGE_SHIFT;
 
@@ -97,9 +98,13 @@ static int pseries_remove_memblock(unsig
 	 * to sysfs state file and we can't remove sysfs entries
 	 * while writing to it. So we have to defer it to here.
 	 */
-	ret = __remove_pages(zone, start_pfn, memblock_size >> PAGE_SHIFT);
-	if (ret)
-		return ret;
+	sections_to_remove = (memblock_size >> PAGE_SHIFT) / PAGES_PER_SECTION;
+	for (i = 0; i < sections_to_remove; i++) {
+		unsigned long pfn = start_pfn + i * PAGES_PER_SECTION;
+		ret = __remove_pages(zone, start_pfn, PAGES_PER_SECTION);
+		if (ret)
+			return ret;
+	}
 
 	/*
 	 * Update memory regions for memory remove
-- 
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
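The granularity problem described in the 5/13 changelog can be illustrated with a small userspace model. The sketch below assumes that a release only succeeds when its start and size exactly match a registered region; that matching rule, the region table, the 128 MiB section size, and all function names are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REGIONS 8

/* Hypothetical resource record: one registered physical range. */
struct region_model {
	unsigned long long start, size;
	int used;
};

static struct region_model regions[MAX_REGIONS];

/* Register [start, start + size); returns 0 on success, -1 if full. */
static int register_region(unsigned long long start, unsigned long long size)
{
	for (size_t i = 0; i < MAX_REGIONS; i++) {
		if (!regions[i].used) {
			regions[i].start = start;
			regions[i].size = size;
			regions[i].used = 1;
			return 0;
		}
	}
	return -1;
}

/* Assumed rule: release succeeds only on an exact start/size match. */
static int release_region_exact(unsigned long long start,
				unsigned long long size)
{
	for (size_t i = 0; i < MAX_REGIONS; i++) {
		if (regions[i].used && regions[i].start == start &&
		    regions[i].size == size) {
			regions[i].used = 0;
			return 0;
		}
	}
	return -1;
}

/*
 * Register one DIMM-sized region, try to release it section by section,
 * then release it as a whole. Returns the result of the whole release.
 */
static int demo_release_granularity(void)
{
	const unsigned long long section = 128ULL << 20;	/* hypothetical */
	const unsigned long long dimm = 4 * section;
	int failures = 0;

	for (size_t i = 0; i < MAX_REGIONS; i++)
		regions[i].used = 0;

	assert(register_region(0, dimm) == 0);

	/* Per-section releases never match the DIMM-sized registration. */
	for (unsigned long long k = 0; k < 4; k++)
		if (release_region_exact(k * section, section) != 0)
			failures++;
	assert(failures == 4);

	/* Releasing the range with the registered granularity matches. */
	return release_region_exact(0, dimm);
}
```

Under that assumption, releasing in the same unit that was registered is the only call that finds the region, which is the mismatch the changelog attributes to firmware that describes memory per DIMM.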
[RFC PATCH v4 6/13] memory-hotplug : add memory_block_release
When remove_memory_block() is called, the following message is printed from
device_release():

  Device 'memory528' does not have a release() function, it is broken and
  must be fixed.

remove_memory_block() calls kfree(mem) directly. I think the free should
instead happen from device_release(), so this patch implements
release_memory_block() as the device's release callback.

CC: David Rientjes <rient...@google.com>
CC: Jiang Liu <liu...@gmail.com>
CC: Len Brown <len.br...@intel.com>
CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
CC: Paul Mackerras <pau...@samba.org>
CC: Christoph Lameter <c...@linux.com>
Cc: Minchan Kim <minchan@gmail.com>
CC: Andrew Morton <a...@linux-foundation.org>
CC: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
CC: Wen Congyang <we...@cn.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasu...@jp.fujitsu.com>

---
 drivers/base/memory.c |   11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Index: linux-3.5-rc6/drivers/base/memory.c
===================================================================
--- linux-3.5-rc6.orig/drivers/base/memory.c	2012-07-18 17:50:49.659368740 +0900
+++ linux-3.5-rc6/drivers/base/memory.c	2012-07-18 17:51:28.655881214 +0900
@@ -109,6 +109,15 @@ bool is_memblk_offline(unsigned long sta
 }
 EXPORT_SYMBOL(is_memblk_offline);
 
+#define to_memory_block(device) container_of(device, struct memory_block, dev)
+
+static void release_memory_block(struct device *dev)
+{
+	struct memory_block *mem = to_memory_block(dev);
+
+	kfree(mem);
+}
+
 /*
  * register_memory - Setup a sysfs device for a memory block
  */
@@ -119,6 +128,7 @@ int register_memory(struct memory_block
 
 	memory->dev.bus = &memory_subsys;
 	memory->dev.id = memory->start_section_nr / sections_per_block;
+	memory->dev.release = release_memory_block;
 
 	error = device_register(&memory->dev);
 	return error;
@@ -669,7 +679,6 @@ int remove_memory_block(unsigned long no
 		mem_remove_simple_file(mem, phys_device);
 		mem_remove_simple_file(mem, removable);
 		unregister_memory(mem);
-		kfree(mem);
 	} else
 		kobject_put(&mem->dev.kobj);
 
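Moving the kfree() into a release callback matters because the object may still be referenced after it has been unregistered. The following userspace sketch models that lifetime with a plain reference count. device_model, memory_block_model, and the get/put helpers are hypothetical stand-ins for the driver-core types and are not the kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of(); illustration only. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified device: a refcount and a release callback. */
struct device_model {
	int refcount;
	void (*release)(struct device_model *dev);
};

struct memory_block_model {
	unsigned long start_section_nr;
	struct device_model dev;
};

static int blocks_freed;	/* number of blocks freed by the callback */

/* Frees the enclosing block once the last reference is gone. */
static void release_block(struct device_model *dev)
{
	struct memory_block_model *mem =
		container_of(dev, struct memory_block_model, dev);

	free(mem);
	blocks_freed++;
}

static void device_model_get(struct device_model *dev)
{
	dev->refcount++;
}

static void device_model_put(struct device_model *dev)
{
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* Returns 1 if the block outlives unregistration and is freed exactly once. */
static int demo_block_lifetime(void)
{
	struct memory_block_model *mem = calloc(1, sizeof(*mem));
	int freed_after_unregister;

	assert(mem != NULL);
	blocks_freed = 0;
	mem->dev.refcount = 1;			/* reference held by the registry */
	mem->dev.release = release_block;

	device_model_get(&mem->dev);		/* a second holder, e.g. a reader */
	device_model_put(&mem->dev);		/* unregister drops one reference */
	freed_after_unregister = blocks_freed;	/* still alive here */
	mem->start_section_nr = 528;		/* safe: a reference remains */

	device_model_put(&mem->dev);		/* last put frees the block */
	return freed_after_unregister == 0 && blocks_freed == 1;
}
```

Had the remover freed the block itself right after unregistering, the write to start_section_nr in this model would have touched freed memory; tying the free to the last put avoids that.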