[PPC] relocation truncated to fit: R_PPC64_REL24 against symbol `.eeh_check_failure' defined in .text section in arch/powerpc/platforms/built-in.o
Hi,

I got such build errors in powerpc allyesconfig and other configs. How can they be eliminated? I'm running the cross compile tools from kernel.org.

drivers/built-in.o: In function `.yenta_interrupt':
yenta_socket.c:(.text+0x1ffba78): relocation truncated to fit: R_PPC64_REL24 against symbol `.eeh_check_failure' defined in .text section in arch/powerpc/platforms/built-in.o
yenta_socket.c:(.text+0x1ffbb40): relocation truncated to fit: R_PPC64_REL24 against symbol `.eeh_check_failure' defined in .text section in arch/powerpc/platforms/built-in.o
yenta_socket.c:(.text+0x1ffbcd0): relocation truncated to fit: R_PPC64_REL24 against symbol `.eeh_check_failure' defined in .text section in arch/powerpc/platforms/built-in.o
drivers/built-in.o: In function `.yenta_interrupt_wrapper':
yenta_socket.c:(.text+0x1ffbe3c): relocation truncated to fit: R_PPC64_REL24 against symbol `_savegpr0_29' defined in .text.save.restore section in arch/powerpc/lib/built-in.o
yenta_socket.c:(.text+0x1ffbea8): relocation truncated to fit: R_PPC64_REL24 against symbol `_restgpr0_29' defined in .text.save.restore section in arch/powerpc/lib/built-in.o
drivers/built-in.o: In function `.yenta_probe_irq.isra.1':
yenta_socket.c:(.text+0x1ffc044): relocation truncated to fit: R_PPC64_REL24 against symbol `.eeh_check_failure' defined in .text section in arch/powerpc/platforms/built-in.o
yenta_socket.c:(.text+0x1ffc1d0): relocation truncated to fit: R_PPC64_REL24 against symbol `.eeh_check_failure' defined in .text section in arch/powerpc/platforms/built-in.o
yenta_socket.c:(.text+0x1ffc298): relocation truncated to fit: R_PPC64_REL24 against symbol `.eeh_check_failure' defined in .text section in arch/powerpc/platforms/built-in.o
yenta_socket.c:(.text+0x1ffc478): relocation truncated to fit: R_PPC64_REL24 against symbol `.eeh_check_failure' defined in .text section in arch/powerpc/platforms/built-in.o
yenta_socket.c:(.text+0x1ffc608): relocation truncated to fit: R_PPC64_REL24 against symbol `.eeh_check_failure' defined in .text section in arch/powerpc/platforms/built-in.o
yenta_socket.c:(.text+0x1ffc7a0): additional relocation overflows omitted from the output

Thanks,
Fengguang
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
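
For readers unfamiliar with the error: R_PPC64_REL24 is the relocation used for a direct `bl`, whose 24-bit word offset gives a signed 26-bit byte displacement, i.e. a reach of roughly +/-32 MiB. The failing call sites in the log sit just under 0x2000000 (32 MiB) into the .text of drivers/built-in.o, which is consistent with the targets being out of range. A minimal sketch of the range check follows; the function and macro names are illustrative assumptions, not linker or kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* A PowerPC `bl` holds a signed 24-bit word offset, which is a signed
 * 26-bit byte displacement that must be a multiple of 4.
 * (Names below are hypothetical, chosen for this sketch.) */
#define REL24_MIN (-(INT64_C(1) << 25))     /* -32 MiB     */
#define REL24_MAX ((INT64_C(1) << 25) - 4)  /* +32 MiB - 4 */

/* Return true if a branch located at `site` can reach `target` directly. */
static bool rel24_reachable(uint64_t site, uint64_t target)
{
    int64_t disp = (int64_t)(target - site);

    return (disp & 3) == 0 && disp >= REL24_MIN && disp <= REL24_MAX;
}
```

With this check, a displacement of 0x1fffffc is still reachable, while 0x2000000 is not, which is the situation the linker reports as "relocation truncated to fit".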
Re: [PATCH 3/3] edac/85xx: Enable the EDAC PCI err driver by device_initcall
On Sep 27, 2012, at 4:51 PM, Scott Wood wrote:

> On 09/27/2012 04:45:08 PM, Gala Kumar-B11780 wrote:
>> On Sep 27, 2012, at 11:09 AM, Scott Wood wrote:
>>> On 09/27/2012 02:02:03 PM, Chunhe Lan wrote:
>>>> Original process of call:
>>>> The mpc85xx_pci_err_probe function completes to been registered and enabled of EDAC PCI err driver at the latter time stage of kernel boot in the mpc85xx_edac.c.
>>>>
>>>> Current process of call:
>>>> The mpc85xx_pci_err_probe function completes to been registered and enabled of EDAC PCI err driver at the first time stage of kernel boot in the fsl_pci.c.
>>>>
>>>> So in this case the following error messages appear in the boot log:
>>>>
>>>> PCI: Probing PCI hardware
>>>> pci :00:00.0: ignoring class b20 (doesn't match header type 01)
>>>> PCIE error(s) detected
>>>> PCIE ERR_DR register: 0x0002
>>>> PCIE ERR_CAP_STAT register: 0x8001
>>>> PCIE ERR_CAP_R0 register: 0x0800
>>>> PCIE ERR_CAP_R1 register: 0x
>>>> PCIE ERR_CAP_R2 register: 0x
>>>> PCIE ERR_CAP_R3 register: 0x
>>>>
>>>> Because the EDAC PCI err driver is registered and enabled earlier than original point of call. But at this point of time, PCI hardware is not probed and initialized, and it is in unknowable state.
>>>>
>>>> So, move enable function into mpc85xx_pci_err_en which is called at the middle time stage of kernel boot and after PCI hardware is probed and initialized by device_initcall in the fsl_pci.c.
>>>>
>>>> Signed-off-by: Chunhe Lan chunhe@freescale.com
>>>> ---
>>>>  arch/powerpc/sysdev/fsl_pci.c | 12 ++
>>>>  arch/powerpc/sysdev/fsl_pci.h | 5
>>>>  drivers/edac/mpc85xx_edac.c | 47
>>>>  3 files changed, 50 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
>>>> index 3d6f4d8..a591965 100644
>>>> --- a/arch/powerpc/sysdev/fsl_pci.c
>>>> +++ b/arch/powerpc/sysdev/fsl_pci.c
>>>> @@ -904,4 +904,16 @@ static int __init fsl_pci_init(void)
>>>>  	return platform_driver_register(&fsl_pci_driver);
>>>>  }
>>>>  arch_initcall(fsl_pci_init);
>>>> +
>>>> +static int __init fsl_pci_err_en(void)
>>>> +{
>>>> +	struct device_node *np;
>>>> +
>>>> +	for_each_node_by_type(np, "pci")
>>>> +		if (of_match_node(pci_ids, np))
>>>> +			mpc85xx_pci_err_en(np);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +device_initcall(fsl_pci_err_en);
>>>
>>> Why can't you call this from the normal PCIe controller init, instead of searching for the node independently?
>>
>> Don't we have this now with mpc85xx_pci_err_probe() ??
>
> What do you mean by this?

I'm saying don't we replace fsl_pci_err_en() with mpc85xx_pci_err_probe()...

I need to look at this more, but not clear why mpc85xx_pci_err_en() can just be part of mpc85xx_pci_err_probe()

- k
R: Re: PCI device not working
Hi Kumar,

> It was, can you figure out in u-boot what exact config read on the bus would return the correct thing. The fact that when we probe the device at 0001:03 we should get back something like cfg_data=0xabba1b65

here follow some details about what is going on inside u-boot; verbosity increases from [1] to [3]

[1] PCI printouts when the board come up
[2] output of pci [0-3] long u-boot command
[3] same as [1] but with debug print inside indirect_read_config_##size() [drivers/pci/pci_indirect.c]

if you were curious about our u-boot board settings, please refer to:
http://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg62007.html

thanx alot,
Davide

[1]

PCIE1 used as Root Complex (base addr ffe09000)
Scanning PCI bus 01
01 00 1b65 abba 0280 00
cfg_addr:ffe09000 cfg_data:ffe09004 indirect_type:0
PCIE1 on bus 00 - 01
PCIE2 used as Root Complex (base addr ffe0a000)
Scanning PCI bus 03
03 00 1b65 abba 0280 00
cfg_addr:ffe0a000 cfg_data:ffe0a004 indirect_type:0
PCIE2 on bus 02 - 03

[2]

=> pci 0 long
Scanning PCI devices on bus 0
Found PCI device 00.00.00:
  vendor ID =                   0x1957
  device ID =                   0x0100
  command register =            0x0006
  status register =             0x0010
  revision ID =                 0x11
  class code =                  0x0b (Processor)
  sub class code =              0x20
  programming interface =       0x00
  cache line =                  0x08
  latency time =                0x00
  header type =                 0x01
  BIST =                        0x00
  base address 0 =              0xfff0
  base address 1 =              0x
  primary bus number =          0x00
  secondary bus number =        0x01
  subordinate bus number =      0x01
  secondary latency timer =     0x00
  IO base =                     0x00
  IO limit =                    0x00
  secondary status =            0x
  memory base =                 0xa000
  memory limit =                0xa000
  prefetch memory base =        0x1001
  prefetch memory limit =       0x0001
  prefetch memory base upper =  0x
  prefetch memory limit upper = 0x
  IO base upper 16 bits =       0x
  IO limit upper 16 bits =      0x
  expansion ROM base address =  0x
  interrupt line =              0x00
  interrupt pin =               0x00
  bridge control =              0x

=> pci 1 long
Scanning PCI devices on bus 1
Found PCI device 01.00.00:kk
  vendor ID =                   0x1b65
  device ID =                   0xabba
  command register =            0x0006
  status register =             0x0010
  revision ID =                 0x01
  class code =                  0x02 (Network controller)
  sub class code =              0x80
  programming interface =       0x00
  cache line =                  0x08
  latency time =                0x00
  header type =                 0x00
  BIST =                        0x00
  base address 0 =              0xa000
  base address 1 =              0xa001
  base address 2 =              0x
  base address 3 =              0x
  base address 4 =              0x
  base address 5 =              0x
  cardBus CIS pointer =         0x
  sub system vendor ID =        0x
  sub system ID =               0x
  expansion ROM base address =  0x
  interrupt line =              0x00
  interrupt pin =               0x01
  min Grant =                   0x00
  max Latency =                 0x00

=> pci 2 long
Scanning PCI devices on bus 2
Found PCI device 02.00.00:
  vendor ID =                   0x1957
  device ID =                   0x0100
  command register =            0x0006
  status register =             0x0010
  revision ID =                 0x11
  class code =                  0x0b (Processor)
  sub class code =              0x20
  programming interface =       0x00
  cache line =                  0x08
  latency time =                0x00
  header type =                 0x01
  BIST =                        0x00
  base address 0 =              0xfff0
  base address 1 =              0x
  primary bus number =          0x00
  secondary bus number =        0x01
  subordinate bus number =      0x01
  secondary latency timer =     0x00
  IO base =                     0x00
  IO limit =                    0x00
  secondary status =            0x
  memory base =                 0xb000
  memory limit =                0xb000
  prefetch memory base =        0x1001
  prefetch memory limit =       0x0001
  prefetch memory base upper =  0x
  prefetch memory limit upper = 0x
  IO base upper 16 bits =       0x
  IO limit upper 16 bits =      0x
  expansion ROM base address =  0x
  interrupt line =
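
As a side note on the expected value cfg_data=0xabba1b65 quoted at the top of this message: the first 32-bit word of PCI configuration space carries the vendor ID in the low 16 bits and the device ID in the high 16 bits, which matches the vendor ID 0x1b65 and device ID 0xabba reported by `pci 1 long`. A small sketch of that decoding, with a hypothetical struct and function name chosen only for illustration:

```c
#include <stdint.h>

/* Hypothetical helper type for this sketch; not a u-boot or kernel type. */
struct pci_id {
    uint16_t vendor;
    uint16_t device;
};

/* Split the 32-bit word read at config offset 0 into vendor and device ID. */
static struct pci_id pci_decode_id(uint32_t cfg_data)
{
    struct pci_id id = {
        .vendor = (uint16_t)(cfg_data & 0xffff), /* bits 15..0  */
        .device = (uint16_t)(cfg_data >> 16),    /* bits 31..16 */
    };

    return id;
}
```

Calling pci_decode_id(0xabba1b65) yields vendor 0x1b65 and device 0xabba, so a config read that returns anything else at offset 0 was not answered by this card.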
Re: [REGRESSION] nfsd crashing with 3.6.0-rc7 on PowerPC
On Fri, Sep 28, 2012 at 04:19:55AM +0200, Alexander Graf wrote:
> On 28.09.2012, at 04:04, Linus Torvalds wrote:
>> On Thu, Sep 27, 2012 at 6:55 PM, Alexander Graf ag...@suse.de wrote:
>>> Below are OOPS excerpts from different rc's I tried. All of them crashed - all the way up to current Linus' master branch. I haven't cross-checked, but I don't remember any such behavior from pre-3.6 releases.
>>
>> Since you seem to be able to reproduce it easily (and apparently reliably), any chance you could just bisect it?
>>
>> Since I assume v3.5 is fine, and apparently -rc1 is already busted, a simple
>>
>>    git bisect start
>>    git bisect good v3.5
>>    git bisect bad v3.6-rc1
>>
>> will get you started on your adventure..
>
> Heh, will give it a try :). The thing really does look quite bisectable.
>
> It might take a few hours though - the machine isn't exactly fast by today's standards and it's getting late here. But I'll keep you updated.

I doubt it's anything special about that workload, but just for kicks I tried a git clone -ls (cloning my linux tree to another directory on the same nfs filesystem), with server on 3.6.0-rc7, and didn't see anything interesting (just an xfs lockdep warning that looks like this one jlayton already reported: http://oss.sgi.com/archives/xfs/2012-09/msg00088.html )

Any (even partial) bisection results would certainly be useful, thanks.

--b.
Re: [REGRESSION] nfsd crashing with 3.6.0-rc7 on PowerPC
On 28.09.2012, at 17:10, J. Bruce Fields wrote:

> On Fri, Sep 28, 2012 at 04:19:55AM +0200, Alexander Graf wrote:
>> On 28.09.2012, at 04:04, Linus Torvalds wrote:
>>> On Thu, Sep 27, 2012 at 6:55 PM, Alexander Graf ag...@suse.de wrote:
>>>> Below are OOPS excerpts from different rc's I tried. All of them crashed - all the way up to current Linus' master branch. I haven't cross-checked, but I don't remember any such behavior from pre-3.6 releases.
>>>
>>> Since you seem to be able to reproduce it easily (and apparently reliably), any chance you could just bisect it?
>>>
>>> Since I assume v3.5 is fine, and apparently -rc1 is already busted, a simple
>>>
>>>    git bisect start
>>>    git bisect good v3.5
>>>    git bisect bad v3.6-rc1
>>>
>>> will get you started on your adventure..
>>
>> Heh, will give it a try :). The thing really does look quite bisectable.
>>
>> It might take a few hours though - the machine isn't exactly fast by today's standards and it's getting late here. But I'll keep you updated.
>
> I doubt it's anything special about that workload, but just for kicks I tried a git clone -ls (cloning my linux tree to another directory on the same nfs filesystem), with server on 3.6.0-rc7, and didn't see anything interesting (just an xfs lockdep warning that looks like this one jlayton already reported: http://oss.sgi.com/archives/xfs/2012-09/msg00088.html )
>
> Any (even partial) bisection results would certainly be useful, thanks.

Yeah, still trying. Running the same workload in a PPC VM didn't show any badness. Then I tried again to bisect on the machine it broke on, and that commit failed even more badly on me than the previous ones, destroying my local git tree.

Trying to narrow down now in a slightly more contained environment :).

Alex
Re: [PATCH 3/3] edac/85xx: Enable the EDAC PCI err driver by device_initcall
On 09/27/2012 05:33:26 PM, Kumar Gala wrote:
> On Sep 27, 2012, at 4:51 PM, Scott Wood wrote:
>> On 09/27/2012 04:45:08 PM, Gala Kumar-B11780 wrote:
>>> On Sep 27, 2012, at 11:09 AM, Scott Wood wrote:
>>>> On 09/27/2012 02:02:03 PM, Chunhe Lan wrote:
>>>>> Original process of call:
>>>>> The mpc85xx_pci_err_probe function completes to been registered and enabled of EDAC PCI err driver at the latter time stage of kernel boot in the mpc85xx_edac.c.
>>>>>
>>>>> Current process of call:
>>>>> The mpc85xx_pci_err_probe function completes to been registered and enabled of EDAC PCI err driver at the first time stage of kernel boot in the fsl_pci.c.
>>>>>
>>>>> So in this case the following error messages appear in the boot log:
>>>>>
>>>>> PCI: Probing PCI hardware
>>>>> pci :00:00.0: ignoring class b20 (doesn't match header type 01)
>>>>> PCIE error(s) detected
>>>>> PCIE ERR_DR register: 0x0002
>>>>> PCIE ERR_CAP_STAT register: 0x8001
>>>>> PCIE ERR_CAP_R0 register: 0x0800
>>>>> PCIE ERR_CAP_R1 register: 0x
>>>>> PCIE ERR_CAP_R2 register: 0x
>>>>> PCIE ERR_CAP_R3 register: 0x
>>>>>
>>>>> Because the EDAC PCI err driver is registered and enabled earlier than original point of call. But at this point of time, PCI hardware is not probed and initialized, and it is in unknowable state.
>>>>>
>>>>> So, move enable function into mpc85xx_pci_err_en which is called at the middle time stage of kernel boot and after PCI hardware is probed and initialized by device_initcall in the fsl_pci.c.
>>>>>
>>>>> Signed-off-by: Chunhe Lan chunhe@freescale.com
>>>>> ---
>>>>>  arch/powerpc/sysdev/fsl_pci.c | 12 ++
>>>>>  arch/powerpc/sysdev/fsl_pci.h | 5
>>>>>  drivers/edac/mpc85xx_edac.c | 47
>>>>>  3 files changed, 50 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
>>>>> index 3d6f4d8..a591965 100644
>>>>> --- a/arch/powerpc/sysdev/fsl_pci.c
>>>>> +++ b/arch/powerpc/sysdev/fsl_pci.c
>>>>> @@ -904,4 +904,16 @@ static int __init fsl_pci_init(void)
>>>>>  	return platform_driver_register(&fsl_pci_driver);
>>>>>  }
>>>>>  arch_initcall(fsl_pci_init);
>>>>> +
>>>>> +static int __init fsl_pci_err_en(void)
>>>>> +{
>>>>> +	struct device_node *np;
>>>>> +
>>>>> +	for_each_node_by_type(np, "pci")
>>>>> +		if (of_match_node(pci_ids, np))
>>>>> +			mpc85xx_pci_err_en(np);
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +device_initcall(fsl_pci_err_en);
>>>>
>>>> Why can't you call this from the normal PCIe controller init, instead of searching for the node independently?
>>>
>>> Don't we have this now with mpc85xx_pci_err_probe() ??
>>
>> What do you mean by this?
>
> I'm saying don't we replace fsl_pci_err_en() with mpc85xx_pci_err_probe()...
>
> I need to look at this more, but not clear why mpc85xx_pci_err_en() can just be part of mpc85xx_pci_err_probe()

OK, I was confused -- I thought the point was to make it happen earlier, not later. The changelog is not clear at all.

Don't we want to be able to capture errors that happen during PCI driver initialization, though?

-Scott
Re: [PATCH 2/6] powerpc: Add enable_ppr kernel parameter to enable PPR save/restore
On Tue, 2012-09-11 at 15:55 +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2012-09-10 at 22:42 -0700, Haren Myneni wrote:
>> Thanks Michael. Yes, we noticed 6% overhead with null syscall test. Hence added cmdline option as suggested. I will add this comment in the changelog.
>>
>> Regarding the option name, I thought about various ones such as retain_process_ppr, retain_smt_priority, save_ppr and etc. Finally added 'enable_ppr' since it enables CPU_FTR (CPU_FTR_HAS_PPR) which allows to save/restore PPR value. Sure, I will change this option.
>
> No, that isn't a problem with the name. It's a problem with the polarity of the option. If you need a command line argument to enable the option, then nobody will enable it, it's pointless.

In GLIBC (ppc.h) we'll be providing a user space API to change the thread priority in user state. We're also interested in using this in some of the locking constructs if performance tests indicate it's beneficial.

I have concerns with being able to enable/disable this option at boot time. Usually, in GLIBC we'll just do a kernel version check and enable certain facilities if we're building against a particular kernel that supports them.

In this case, with a configurable option, GLIBC is going to need the kernel to export a hwcap bit that tells us whether we need to do the save/restore ourselves. Having to check the hwcap, and do the save/restore in user space will, of course, increase the overhead on our side.

If no hwcap bit is provided and this is disabled at kernel boot time, no check is done and the user process assumes it's running under a certain priority when it is, in fact, not. I don't care for this option. We'll be hitting code paths that are ineffective and unnecessary.

Ryan S. Arnold
Linux Technology Center
Re: [RFC v9 PATCH 01/21] memory-hotplug: rename remove_memory() to offline_memory()/offline_pages()
On Thu, Sep 27, 2012 at 11:50 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote:
> Hi Chen,
>
> 2012/09/28 11:22, Ni zhan Chen wrote:
>> On 09/05/2012 05:25 PM, we...@cn.fujitsu.com wrote:
>>> From: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>>>
>>> remove_memory() only try to offline pages. It is called in two cases:
>>> 1. hot remove a memory device
>>> 2. echo offline >/sys/devices/system/memory/memoryXX/state
>>>
>>> In the 1st case, we should also change memory block's state, and notify the userspace that the memory block's state is changed after offlining pages.
>>>
>>> So rename remove_memory() to offline_memory()/offline_pages(). And in the 1st case, offline_memory() will be used. The function offline_memory() is not implemented. In the 2nd case, offline_pages() will be used.
>>
>> But this time there is not a function associated with add_memory.
>
> To associate with add_memory() later, we renamed it.

Then, you introduced bisect breakage. It is definitely unacceptable.

NAK.
Re: [RFC v9 PATCH 13/21] memory-hotplug: check page type in get_page_bootmem
On 09/05/2012 05:25 PM, we...@cn.fujitsu.com wrote:
> From: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
>
> The function get_page_bootmem() may be called more than one time to the same page. There is no need to set page's type, private if the function is not the first time called to the page.
>
> Note: the patch is just optimization and does not fix any problem.

Hi Yasuaki,

this patch looks reasonable to me. I have another question related to get_page_bootmem(). It comes from the changelog of another Fujitsu developer's patch [commit : 04753278769f3], which says:

1) When the memmap of removing section is allocated on other section by bootmem, it should/can be free.
2) When the memmap of removing section is allocated on the same section, it shouldn't be freed. Because the section has to be logical memory offlined already and all pages must be isolated against page allocator. If it is freed, page allocator may use it which will be removed physically soon.

But I don't see how his patch guarantees 2), that is, how it ensures that a memmap allocated on the same section as the one being removed is not freed. I hope you can explain this in detail, thanks in advance. :-)

> CC: David Rientjes rient...@google.com
> CC: Jiang Liu liu...@gmail.com
> CC: Len Brown len.br...@intel.com
> CC: Benjamin Herrenschmidt b...@kernel.crashing.org
> CC: Paul Mackerras pau...@samba.org
> CC: Christoph Lameter c...@linux.com
> Cc: Minchan Kim minchan@gmail.com
> CC: Andrew Morton a...@linux-foundation.org
> CC: KOSAKI Motohiro kosaki.motoh...@jp.fujitsu.com
> CC: Wen Congyang we...@cn.fujitsu.com
> Signed-off-by: Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com
> ---
>  mm/memory_hotplug.c | 15 +++++++++++----
>  1 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index d736df3..26a5012 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -95,10 +95,17 @@ static void release_memory_resource(struct resource *res)
>  static void get_page_bootmem(unsigned long info, struct page *page,
>  			     unsigned long type)
>  {
> -	page->lru.next = (struct list_head *) type;
> -	SetPagePrivate(page);
> -	set_page_private(page, info);
> -	atomic_inc(&page->_count);
> +	unsigned long page_type;
> +
> +	page_type = (unsigned long)page->lru.next;
> +	if (page_type < MEMORY_HOTPLUG_MIN_BOOTMEM_TYPE ||
> +	    page_type > MEMORY_HOTPLUG_MAX_BOOTMEM_TYPE){
> +		page->lru.next = (struct list_head *)type;
> +		SetPagePrivate(page);
> +		set_page_private(page, info);
> +		atomic_inc(&page->_count);
> +	} else
> +		atomic_inc(&page->_count);
>  }
>
>  /* reference to __meminit __free_pages_bootmem is valid
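
To make the intent of the patch easier to follow outside the kernel tree, here is a userspace sketch of the same idea: tag a page only on the first call, but take a reference on every call. The struct, the function name and the numeric type range are stand-ins assumed for this illustration; the real code operates on struct page and the MEMORY_HOTPLUG_*_BOOTMEM_TYPE constants.

```c
/* Hypothetical stand-ins for the kernel's bootmem type range. */
enum { MOCK_BOOTMEM_TYPE_MIN = 12, MOCK_BOOTMEM_TYPE_MAX = 15 };

/* Mock of the three struct page fields the patch touches. */
struct mock_page {
    unsigned long type; /* stands in for page->lru.next        */
    unsigned long info; /* stands in for the page private data */
    int count;          /* stands in for page->_count          */
};

static void mock_get_page_bootmem(unsigned long info, struct mock_page *page,
                                  unsigned long type)
{
    /* Tag the page only if it does not already carry a bootmem type. */
    if (page->type < MOCK_BOOTMEM_TYPE_MIN ||
        page->type > MOCK_BOOTMEM_TYPE_MAX) {
        page->type = type;
        page->info = info;
    }

    /* The reference is taken on every call, tagged or not. */
    page->count++;
}
```

Calling the mock twice on a zeroed page with different arguments leaves the type and info from the first call in place and ends with a count of 2, which is the behaviour the patch description calls an optimization rather than a fix.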
Re: [PATCH 3/3] edac/85xx: Enable the EDAC PCI err driver by device_initcall
On 09/28/2012 01:35 PM, Scott Wood wrote:
> On 09/27/2012 05:33:26 PM, Kumar Gala wrote:
>> On Sep 27, 2012, at 4:51 PM, Scott Wood wrote:
>>> On 09/27/2012 04:45:08 PM, Gala Kumar-B11780 wrote:
>>>> On Sep 27, 2012, at 11:09 AM, Scott Wood wrote:
>>>>> On 09/27/2012 02:02:03 PM, Chunhe Lan wrote:
>>>>>> Original process of call:
>>>>>> The mpc85xx_pci_err_probe function completes to been registered and enabled of EDAC PCI err driver at the latter time stage of kernel boot in the mpc85xx_edac.c.
>>>>>>
>>>>>> Current process of call:
>>>>>> The mpc85xx_pci_err_probe function completes to been registered and enabled of EDAC PCI err driver at the first time stage of kernel boot in the fsl_pci.c.
>>>>>>
>>>>>> So in this case the following error messages appear in the boot log:
>>>>>>
>>>>>> PCI: Probing PCI hardware
>>>>>> pci :00:00.0: ignoring class b20 (doesn't match header type 01)
>>>>>> PCIE error(s) detected
>>>>>> PCIE ERR_DR register: 0x0002
>>>>>> PCIE ERR_CAP_STAT register: 0x8001
>>>>>> PCIE ERR_CAP_R0 register: 0x0800
>>>>>> PCIE ERR_CAP_R1 register: 0x
>>>>>> PCIE ERR_CAP_R2 register: 0x
>>>>>> PCIE ERR_CAP_R3 register: 0x
>>>>>>
>>>>>> Because the EDAC PCI err driver is registered and enabled earlier than original point of call. But at this point of time, PCI hardware is not probed and initialized, and it is in unknowable state.
>>>>>>
>>>>>> So, move enable function into mpc85xx_pci_err_en which is called at the middle time stage of kernel boot and after PCI hardware is probed and initialized by device_initcall in the fsl_pci.c.
>>>>>>
>>>>>> Signed-off-by: Chunhe Lan chunhe@freescale.com
>>>>>> ---
>>>>>>  arch/powerpc/sysdev/fsl_pci.c | 12 ++
>>>>>>  arch/powerpc/sysdev/fsl_pci.h | 5
>>>>>>  drivers/edac/mpc85xx_edac.c | 47
>>>>>>  3 files changed, 50 insertions(+), 14 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
>>>>>> index 3d6f4d8..a591965 100644
>>>>>> --- a/arch/powerpc/sysdev/fsl_pci.c
>>>>>> +++ b/arch/powerpc/sysdev/fsl_pci.c
>>>>>> @@ -904,4 +904,16 @@ static int __init fsl_pci_init(void)
>>>>>>  	return platform_driver_register(&fsl_pci_driver);
>>>>>>  }
>>>>>>  arch_initcall(fsl_pci_init);
>>>>>> +
>>>>>> +static int __init fsl_pci_err_en(void)
>>>>>> +{
>>>>>> +	struct device_node *np;
>>>>>> +
>>>>>> +	for_each_node_by_type(np, "pci")
>>>>>> +		if (of_match_node(pci_ids, np))
>>>>>> +			mpc85xx_pci_err_en(np);
>>>>>> +
>>>>>> +	return 0;
>>>>>> +}
>>>>>> +device_initcall(fsl_pci_err_en);
>>>>>
>>>>> Why can't you call this from the normal PCIe controller init, instead of searching for the node independently?
>>>>
>>>> Don't we have this now with mpc85xx_pci_err_probe() ??
>>>
>>> What do you mean by this?
>>
>> I'm saying don't we replace fsl_pci_err_en() with mpc85xx_pci_err_probe()...
>>
>> I need to look at this more, but not clear why mpc85xx_pci_err_en() can just be part of mpc85xx_pci_err_probe()
>
> OK, I was confused -- I thought the point was to make it happen earlier, not later. The changelog is not clear at all.
>
> Don't we want to be able to capture errors that happen during PCI driver initialization, though?

Yes. When the PCI controller probes a slot that has no device in it, invalid address errors occur. The EDAC driver then prints many error messages. That is normal behaviour, but it is ugly. So, move enabling of the EDAC driver to later, and only detect errors from the follow-up PCI operations.

Thanks,
Chunhe

> -Scott
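
For context on "earlier" versus "later" in this thread: the kernel runs initcalls level by level, and arch_initcall (level 3) runs before device_initcall (level 6), so fsl_pci_err_en() in the patch runs after fsl_pci_init(). The userspace sketch below only models that ordering with a stable sort by level; the struct, enum and function names are assumptions made for this illustration, and the real kernel orders initcalls through per-level linker sections rather than by sorting.

```c
#include <stddef.h>

/* Two of the kernel's initcall levels, using their numeric order. */
enum initcall_level { LEVEL_ARCH = 3, LEVEL_DEVICE = 6 };

/* Hypothetical record for this sketch; the kernel has no such struct. */
struct initcall {
    enum initcall_level level;
    const char *name;
};

/* Stable insertion sort by level: lower levels run first, and calls
 * registered at the same level keep their original relative order. */
static void order_initcalls(struct initcall *calls, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        struct initcall key = calls[i];
        size_t j = i;

        while (j > 0 && calls[j - 1].level > key.level) {
            calls[j] = calls[j - 1];
            j--;
        }
        calls[j] = key;
    }
}
```

Given the two entries from the patch in either order, the arch-level fsl_pci_init ends up ahead of the device-level fsl_pci_err_en.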
Re: [RFC v9 PATCH 00/21] memory-hotplug: hot-remove physical memory
On 09/05/2012 05:25 PM, we...@cn.fujitsu.com wrote:
> From: Wen Congyang we...@cn.fujitsu.com
>
> This patch series aims to support physical memory hot-remove.
>
> The patches can free/remove the following things:
>
>   - acpi_memory_info                          : [RFC PATCH 4/19]
>   - /sys/firmware/memmap/X/{end, start, type} : [RFC PATCH 8/19]
>   - iomem_resource                            : [RFC PATCH 9/19]
>   - mem_section and related sysfs files       : [RFC PATCH 10-11, 13-16/19]
>   - page table of removed memory              : [RFC PATCH 12/19]
>   - node and related sysfs files              : [RFC PATCH 18-19/19]
>
> If you find lack of function for physical memory hot-remove, please let me know.

Since the patchset is so big, could you add more to the patchset changelog describing how it works, so that it is easier to review?

> How to test this patchset?
> 1. apply this patchset and build the kernel. MEMORY_HOTPLUG, MEMORY_HOTREMOVE, ACPI_HOTPLUG_MEMORY must be selected.
> 2. load the module acpi_memhotplug
> 3. hotplug the memory device (it depends on your hardware)
>    You will see the memory device under the directory /sys/bus/acpi/devices/. Its name is PNP0C80:XX.
> 4. online/offline pages provided by this memory device
>    You can write online/offline to /sys/devices/system/memory/memoryX/state to online/offline pages provided by this memory device
> 5. hotremove the memory device
>    You can hotremove the memory device by the hardware, or writing 1 to /sys/bus/acpi/devices/PNP0C80:XX/eject.
>
> Note: if the memory provided by the memory device is used by the kernel, it can't be offlined. It is not a bug.
>
> Known problems:
> 1. memory can't be offlined when CONFIG_MEMCG is selected.
>    For example: there is a memory device on node 1. The address range is [1G, 1.5G). You will find 4 new directories memory8, memory9, memory10, and memory11 under the directory /sys/devices/system/memory/.
>    If CONFIG_MEMCG is selected, we will allocate memory to store page cgroup when we online pages. When we online memory8, the memory stored page cgroup is not provided by this memory device. But when we online memory9, the memory stored page cgroup may be provided by memory8. So we can't offline memory8 now. We should offline the memory in the reversed order.
>    When the memory device is hotremoved, we will auto offline memory provided by this memory device. But we don't know which memory is onlined first, so offlining memory may fail. In such case, you should offline the memory by hand before hotremoving the memory device.
> 2. hotremoving memory device may cause kernel panicked
>    This bug will be fixed by Liu Jiang's patch: https://lkml.org/lkml/2012/7/3/1
>
> change log of v9:
>  [RFC PATCH v9 8/21]
>    * add a lock to protect the list map_entries
>    * add an indicator to firmware_map_entry to remember whether the memory is allocated from bootmem
>  [RFC PATCH v9 10/21]
>    * change the macro to inline function
>  [RFC PATCH v9 19/21]
>    * don't offline the node if the cpu on the node is onlined
>  [RFC PATCH v9 21/21]
>    * create new patch: auto offline page_cgroup when onlining memory block failed
>
> change log of v8:
>  [RFC PATCH v8 17/20]
>    * Fix problems when one node's range include the other nodes
>  [RFC PATCH v8 18/20]
>    * fix building error when CONFIG_MEMORY_HOTPLUG_SPARSE or CONFIG_HUGETLBFS is not defined.
>  [RFC PATCH v8 19/20]
>    * don't offline node when some memory sections are not removed
>  [RFC PATCH v8 20/20]
>    * create new patch: clear hwpoisoned flag when onlining pages
>
> change log of v7:
>  [RFC PATCH v7 4/19]
>    * do not continue if acpi_memory_device_remove_memory() fails.
>  [RFC PATCH v7 15/19]
>    * handle usemap in register_page_bootmem_info_section() too.
>
> change log of v6:
>  [RFC PATCH v6 12/19]
>    * fix building error on other architectures than x86
>  [RFC PATCH v6 15-16/19]
>    * fix building error on other architectures than x86
>
> change log of v5:
>  * merge the patchset to clear page table and the patchset to hot remove memory (from ishimatsu) to one big patchset.
>  [RFC PATCH v5 1/19]
>    * rename remove_memory() to offline_memory()/offline_pages()
>  [RFC PATCH v5 2/19]
>    * new patch: implement offline_memory(). This function offlines pages, update memory block's state, and notify the userspace that the memory block's state is changed.
>  [RFC PATCH v5 4/19]
>    * offline and remove memory in acpi_memory_disable_device() too.
>  [RFC PATCH v5 17/19]
>    * new patch: add a new function __remove_zone() to revert the things done in the function __add_zone().
>  [RFC PATCH v5 18/19]
>    * flush work before resetting node device.
>
> change log of v4:
>  * remove memory-hotplug : unify argument of firmware_map_add_early/hotplug from the patch series, since the patch is a bugfix. It is being discussed on other thread. But for testing the patch series, the patch is needed. So I added