Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Yinghai Lu
On Fri, Mar 1, 2013 at 7:03 PM, chen tang wrote: > > Thank you for your suggestion and fix work. :) > I would prefer your Plan b. But one last thing I want to confirm: > > Will "allocating pgdat and zone on local node" prevent node hot-removing ? > Or is it safe to free all node data when

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Yinghai Lu
On Thu, Feb 28, 2013 at 11:43 PM, Yinghai Lu wrote: > [trim down CC list a bit] > > On Thu, Feb 28, 2013 at 9:00 PM, Yinghai Lu wrote: >> >> >> On Thursday, February 28, 2013, H. Peter Anvin wrote: >>> >>> On 02/28/2013 08:32 PM, Linus Torvalds wrote: >>> > Yingai, Andrew, >>> > is this ok with

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread H. Peter Anvin
On 02/28/2013 11:55 PM, Yinghai Lu wrote: > > Let me try again: > > movablemem_map is broken idea or poor design. > Very much so. I have said this before: this is potentially useful during development/testing, but anyone who expects to actually tell their customers to use it is abusive.

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread H. Peter Anvin
If NUMAQ is breaking real stuff we can kill it by marking it BROKEN. Rip-out is 3.10 at this stage. Ingo Molnar wrote: > >* Borislav Petkov wrote: > >> On Thu, Feb 28, 2013 at 10:37:10PM -0800, H. Peter Anvin wrote: >> > I'd be very happy to get the NUMAQ code ripped out. I am wondering >if

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Tang Chen
On 03/01/2013 03:43 PM, Yinghai Lu wrote: Please check attached patches. Plan A. revert all 8 patches: revert_movablemem_map.patch Plan B. fix movablemem_map: kill_max_low_pfn_mapped.patch and fix_movablemem_map.patch fix_movablemem_map.patch is too risky, and need more test. Hi

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Ingo Molnar
* Borislav Petkov wrote: > On Thu, Feb 28, 2013 at 10:37:10PM -0800, H. Peter Anvin wrote: > > I'd be very happy to get the NUMAQ code ripped out. I am wondering if > > there are any reasons to keep any 32-bit x86 NUMA code at all. > > How much would it hurt us if we said 3.8 is the last

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Borislav Petkov
On Thu, Feb 28, 2013 at 10:37:10PM -0800, H. Peter Anvin wrote: > I'd be very happy to get the NUMAQ code ripped out. I am wondering if > there are any reasons to keep any 32-bit x86 NUMA code at all. How much would it hurt us if we said 3.8 is the last kernel that supported NUMAQ? If anyone

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Ingo Molnar
* H. Peter Anvin wrote: > On 02/25/2013 08:51 PM, Martin Bligh wrote: > >> Do you mean we can remove numaq x86 32bit code now? > > > > Wouldn't bother me at all. The machine is from 1995, end of life c. 2000? > > Was > > useful in the early days of getting NUMA up and running on Linux, but

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Yasuaki Ishimatsu
2013/03/01 17:02, Yinghai Lu wrote: On Thu, Feb 28, 2013 at 10:18 PM, Tang Chen wrote: On 03/01/2013 01:00 PM, Yinghai Lu wrote: On Thursday, February 28, 2013, H. Peter Anvin wrote: On 02/28/2013 08:32 PM, Linus Torvalds wrote: Yingai, Andrew, is this ok with you two?

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Yinghai Lu
On Thu, Feb 28, 2013 at 10:37 PM, H. Peter Anvin wrote: > On 02/25/2013 08:51 PM, Martin Bligh wrote: >>> Do you mean we can remove numaq x86 32bit code now? >> >> Wouldn't bother me at all. The machine is from 1995, end of life c. 2000? >> Was useful in the early days of getting NUMA up and

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Yinghai Lu
On Thu, Feb 28, 2013 at 10:18 PM, Tang Chen wrote: > On 03/01/2013 01:00 PM, Yinghai Lu wrote: >> >> On Thursday, February 28, 2013, H. Peter Anvin wrote: >> >>> On 02/28/2013 08:32 PM, Linus Torvalds wrote: Yingai, Andrew, is this ok with you two? Linus >>>

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Yinghai Lu
On Thu, Feb 28, 2013 at 10:18 PM, Tang Chen tangc...@cn.fujitsu.com wrote: On 03/01/2013 01:00 PM, Yinghai Lu wrote: On Thursday, February 28, 2013, H. Peter Anvin wrote: On 02/28/2013 08:32 PM, Linus Torvalds wrote: Yingai, Andrew, is this ok with you two? Linus FWIW, it

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Yinghai Lu
On Thu, Feb 28, 2013 at 10:37 PM, H. Peter Anvin h...@zytor.com wrote: On 02/25/2013 08:51 PM, Martin Bligh wrote: Do you mean we can remove numaq x86 32bit code now? Wouldn't bother me at all. The machine is from 1995, end of life c. 2000? Was useful in the early days of getting NUMA up and

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Yasuaki Ishimatsu
2013/03/01 17:02, Yinghai Lu wrote: On Thu, Feb 28, 2013 at 10:18 PM, Tang Chen tangc...@cn.fujitsu.com wrote: On 03/01/2013 01:00 PM, Yinghai Lu wrote: On Thursday, February 28, 2013, H. Peter Anvin wrote: On 02/28/2013 08:32 PM, Linus Torvalds wrote: Yingai, Andrew, is this ok with

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Ingo Molnar
* H. Peter Anvin h...@zytor.com wrote: On 02/25/2013 08:51 PM, Martin Bligh wrote: Do you mean we can remove numaq x86 32bit code now? Wouldn't bother me at all. The machine is from 1995, end of life c. 2000? Was useful in the early days of getting NUMA up and running on Linux, but

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Borislav Petkov
On Thu, Feb 28, 2013 at 10:37:10PM -0800, H. Peter Anvin wrote: I'd be very happy to get the NUMAQ code ripped out. I am wondering if there are any reasons to keep any 32-bit x86 NUMA code at all. How much would it hurt us if we said 3.8 is the last kernel that supported NUMAQ? If anyone wants

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Ingo Molnar
* Borislav Petkov b...@alien8.de wrote: On Thu, Feb 28, 2013 at 10:37:10PM -0800, H. Peter Anvin wrote: I'd be very happy to get the NUMAQ code ripped out. I am wondering if there are any reasons to keep any 32-bit x86 NUMA code at all. How much would it hurt us if we said 3.8 is the

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread H. Peter Anvin
If NUMAQ is breaking real stuff we can kill it by marking it BROKEN. Rip-out is 3.10 at this stage. Ingo Molnar mi...@kernel.org wrote: * Borislav Petkov b...@alien8.de wrote: On Thu, Feb 28, 2013 at 10:37:10PM -0800, H. Peter Anvin wrote: I'd be very happy to get the NUMAQ code ripped

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread H. Peter Anvin
On 02/28/2013 11:55 PM, Yinghai Lu wrote: Let me try again: movablemem_map is broken idea or poor design. Very much so. I have said this before: this is potentially useful during development/testing, but anyone who expects to actually tell their customers to use it is abusive.

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Yinghai Lu
On Thu, Feb 28, 2013 at 11:43 PM, Yinghai Lu ying...@kernel.org wrote: [trim down CC list a bit] On Thu, Feb 28, 2013 at 9:00 PM, Yinghai Lu ying...@kernel.org wrote: On Thursday, February 28, 2013, H. Peter Anvin wrote: On 02/28/2013 08:32 PM, Linus Torvalds wrote: Yingai, Andrew, is

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-03-01 Thread Yinghai Lu
On Fri, Mar 1, 2013 at 7:03 PM, chen tang imtangc...@gmail.com wrote: Thank you for your suggestion and fix work. :) I would prefer your Plan b. But one last thing I want to confirm: Will allocating pgdat and zone on local node prevent node hot-removing ? Or is it safe to free all node data

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Yinghai Lu
On Thu, Feb 28, 2013 at 10:02 PM, Yasuaki Ishimatsu wrote: > 2013/03/01 14:00, Yinghai Lu wrote: > > Original issue occurs by two patches. And it is fixed by Tang's reverting > patch. So other patches are obviously unrelated to original problem. Thus > there is no reason to revert all patches

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread H. Peter Anvin
On 02/25/2013 08:51 PM, Martin Bligh wrote: >> Do you mean we can remove numaq x86 32bit code now? > > Wouldn't bother me at all. The machine is from 1995, end of life c. 2000? > Was useful in the early days of getting NUMA up and running on Linux, > but is now too old to be a museum piece,

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Tang Chen
On 03/01/2013 01:00 PM, Yinghai Lu wrote: On Thursday, February 28, 2013, H. Peter Anvin wrote: On 02/28/2013 08:32 PM, Linus Torvalds wrote: Yingai, Andrew, is this ok with you two? Linus FWIW, it makes sense to me iff it resolves the problems I prefer to reverting all 8

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Yasuaki Ishimatsu
2013/03/01 14:00, Yinghai Lu wrote: On Thursday, February 28, 2013, H. Peter Anvin wrote: On 02/28/2013 08:32 PM, Linus Torvalds wrote: Yingai, Andrew, is this ok with you two? Linus FWIW, it makes sense to me iff it resolves the problems I prefer to reverting all 8 patches.

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread H. Peter Anvin
On 02/28/2013 08:32 PM, Linus Torvalds wrote: > Yingai, Andrew, > is this ok with you two? > > Linus FWIW, it makes sense to me iff it resolves the problems. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Andrew Morton
On Thu, 28 Feb 2013 20:32:15 -0800 Linus Torvalds wrote: > Yingai, Andrew, > is this ok with you two? If it works. I haven't tested it yet! Ordinarily I'd give it a few days for -next testing and to let Fengguang's testbot chew on it.

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Linus Torvalds
Yingai, Andrew, is this ok with you two? Linus On Thu, Feb 28, 2013 at 7:46 PM, Tang Chen wrote: > Hi Linus, > > Please refer to the attached patch. > > This patch reverts only the following two patches. > > > commit 01a178a94e8eaec351b29ee49fbb3d1c124cb7fb >

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Tang Chen
Hi Linus, Please refer to the attached patch. This patch reverts only the following two patches. commit 01a178a94e8eaec351b29ee49fbb3d1c124cb7fb acpi, memory-hotplug: support getting hotplug info from SRAT commit e8d1955258091e4c92d5a975ebd7fd8a98f5d30f

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Linus Torvalds
On Wed, Feb 27, 2013 at 1:26 PM, Andrew Morton wrote: > > So I'm thinking that the best approach here is to revert everything and > then try again for 3.10-rc1. This gives people time to test the code > while it's only in linux-next. (Hint!) I'd prefer to revert too by now - the bug seems to

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Tang Chen
On 03/01/2013 12:07 AM, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 11:44 PM, Tang Chen wrote: Sorry, if you want to revert, you just need to revert: commit e8d1955258091e4c92d5a975ebd7fd8a98f5d30f acpi, memory-hotplug: parse SRAT before memblock is ready commit

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 11:44 PM, Tang Chen wrote: > > Sorry, if you want to revert, you just need to revert: > > commit e8d1955258091e4c92d5a975ebd7fd8a98f5d30f > acpi, memory-hotplug: parse SRAT before memblock is ready > commit 01a178a94e8eaec351b29ee49fbb3d1c124cb7fb > acpi,

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Tang Chen
Hi Andrew, On 02/28/2013 05:26 AM, Andrew Morton wrote: Thank you all for addressing the bug. we are on the way to fix it. How long do you think this will take? I think we need one week to solve these problems. I do hope we can catch up the merge window for 3.9. Thanks. :)

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 11:44 PM, Tang Chen tangc...@cn.fujitsu.com wrote: Sorry, if you want to revert, you just need to revert: commit e8d1955258091e4c92d5a975ebd7fd8a98f5d30f acpi, memory-hotplug: parse SRAT before memblock is ready commit 01a178a94e8eaec351b29ee49fbb3d1c124cb7fb

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Tang Chen
On 03/01/2013 12:07 AM, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 11:44 PM, Tang Chen tangc...@cn.fujitsu.com wrote: Sorry, if you want to revert, you just need to revert: commit e8d1955258091e4c92d5a975ebd7fd8a98f5d30f acpi, memory-hotplug: parse SRAT before memblock is ready

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Linus Torvalds
On Wed, Feb 27, 2013 at 1:26 PM, Andrew Morton a...@linux-foundation.org wrote: So I'm thinking that the best approach here is to revert everything and then try again for 3.10-rc1. This gives people time to test the code while it's only in linux-next. (Hint!) I'd prefer to revert too by now

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Linus Torvalds
Yingai, Andrew, is this ok with you two? Linus On Thu, Feb 28, 2013 at 7:46 PM, Tang Chen tangc...@cn.fujitsu.com wrote: Hi Linus, Please refer to the attached patch. This patch reverts only the following two patches. commit 01a178a94e8eaec351b29ee49fbb3d1c124cb7fb

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Andrew Morton
On Thu, 28 Feb 2013 20:32:15 -0800 Linus Torvalds torva...@linux-foundation.org wrote: Yingai, Andrew, is this ok with you two? If it works. I haven't tested it yet! Ordinarily I'd give it a few days for -next testing and to let Fengguang's testbot chew on it.

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread H. Peter Anvin
On 02/28/2013 08:32 PM, Linus Torvalds wrote: Yingai, Andrew, is this ok with you two? Linus FWIW, it makes sense to me iff it resolves the problems. -hpa

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread H. Peter Anvin
On 02/25/2013 08:51 PM, Martin Bligh wrote: Do you mean we can remove numaq x86 32bit code now? Wouldn't bother me at all. The machine is from 1995, end of life c. 2000? Was useful in the early days of getting NUMA up and running on Linux, but is now too old to be a museum piece, really.

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-28 Thread Yinghai Lu
On Thu, Feb 28, 2013 at 10:02 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/03/01 14:00, Yinghai Lu wrote: Original issue occurs by two patches. And it is fixed by Tang's reverting patch. So other patches are obviously unrelated to original problem. Thus there is no reason

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Andrew Morton
On Wed, 27 Feb 2013 16:00:36 +0800 Lai Jiangshan wrote: > In the mails and the changelog of the revert-patch, I think Yinghai > mainly worries about 3 problems. > > 1) the current implement has bug and bad code. > > Yes. Any bug should be fixed. we should fix it directly, or > we

RE: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Luck, Tony
> b. it will be freed to slub before run time. > like init code and initrd disk. If this is a problem - I'd be inclined to disable the code that frees it. It's only a few hundred KB of code, and possibly a few MB of initrd. Too small to worry about on a hot pluggable server. > In that

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Yinghai Lu
On Wed, Feb 27, 2013 at 8:28 AM, Luck, Tony wrote: >> assume first cpu only have 1G ram, and other 31 socket will have bunch of ram > > That doesn't seem to be a very realistic assumption. Can you even still buy 1G > DIMMs for servers? I'd think that a minimum would be to have each of four >

RE: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Luck, Tony
> assume first cpu only have 1G ram, and other 31 socket will have bunch of ram That doesn't seem to be a very realistic assumption. Can you even still buy 1G DIMMs for servers? I'd think that a minimum would be to have each of four channels populated with a 4G DIMM - so 16GB on first cpu. But

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Don Morris
On 02/27/2013 12:11 AM, Yinghai Lu wrote: > On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu > wrote: >> 2013/02/27 13:04, Yinghai Lu wrote: >>> >>> On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu >>> wrote: 2013/02/27 11:30, Yinghai Lu wrote: > > Do you mean you can not

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Don Morris
On 02/27/2013 12:11 AM, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 13:04, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 11:30, Yinghai Lu

RE: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Luck, Tony
assume first cpu only have 1G ram, and other 31 socket will have bunch of ram That doesn't seem to be a very realistic assumption. Can you even still buy 1G DIMMs for servers? I'd think that a minimum would be to have each of four channels populated with a 4G DIMM - so 16GB on first cpu. But

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Yinghai Lu
On Wed, Feb 27, 2013 at 8:28 AM, Luck, Tony tony.l...@intel.com wrote: assume first cpu only have 1G ram, and other 31 socket will have bunch of ram That doesn't seem to be a very realistic assumption. Can you even still buy 1G DIMMs for servers? I'd think that a minimum would be to have

RE: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Luck, Tony
b. it will be freed to slub before run time. like init code and initrd disk. If this is a problem - I'd be inclined to disable the code that frees it. It's only a few hundred KB of code, and possibly a few MB of initrd. Too small to worry about on a hot pluggable server. In that

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Andrew Morton
On Wed, 27 Feb 2013 16:00:36 +0800 Lai Jiangshan la...@cn.fujitsu.com wrote: In the mails and the changelog of the revert-patch, I think Yinghai mainly worries about 3 problems. 1) the current implement has bug and bad code. Yes. Any bug should be fixed. we should fix it directly, or

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Lai Jiangshan
On 02/27/2013 01:11 PM, Yinghai Lu wrote: > On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu > wrote: >> 2013/02/27 13:04, Yinghai Lu wrote: >>> >>> On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu >>> wrote: 2013/02/27 11:30, Yinghai Lu wrote: > > Do you mean you can not

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Tang Chen
On 02/27/2013 03:25 PM, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 11:11 PM, Tang Chen wrote: On 02/27/2013 02:54 PM, Yinghai Lu wrote: Those patches are tangled together. No, they are not. The following commits supports "movablemem_map=nn[KMG]@ss[KMG]". commit

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 11:11 PM, Tang Chen wrote: > On 02/27/2013 02:54 PM, Yinghai Lu wrote: >> >> Those patches are tangled together. > > > No, they are not. > > The following commits supports "movablemem_map=nn[KMG]@ss[KMG]". > > commit fb06bc8e5f42f38c011de0e59481f464a82380f6 >

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Tang Chen
On 02/27/2013 02:54 PM, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 9:49 PM, Yasuaki Ishimatsu wrote: 2013/02/27 14:11, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu wrote: 2013/02/27 13:04, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 9:49 PM, Yasuaki Ishimatsu wrote: > 2013/02/27 14:11, Yinghai Lu wrote: >> >> On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu >> wrote: >>> >>> 2013/02/27 13:04, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu wrote: >

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yasuaki Ishimatsu
2013/02/27 14:11, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu wrote: 2013/02/27 13:04, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu wrote: 2013/02/27 11:30, Yinghai Lu wrote: Do you mean you can not boot one socket system with 1G ram ?

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu wrote: > 2013/02/27 13:04, Yinghai Lu wrote: >> >> On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu >> wrote: >>> >>> 2013/02/27 11:30, Yinghai Lu wrote: Do you mean you can not boot one socket system with 1G ram ? Assume socket

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yasuaki Ishimatsu
2013/02/27 13:04, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu wrote: 2013/02/27 11:30, Yinghai Lu wrote: Do you mean you can not boot one socket system with 1G ram ? Assume socket 0 does not support hotplug, other 31 sockets support hot plug. So we could boot system

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Tang Chen
On 02/27/2013 10:24 AM, Yinghai Lu wrote: After looked at the code more, thought that theory that does not let kernel use ram on hotplug area is not right. after that commit, following range can not use movable ram: 1. real_mode code well..funny, legacy cpu0 [0,1M) could be hot-removed? 2.

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu wrote: > 2013/02/27 11:30, Yinghai Lu wrote: >> Do you mean you can not boot one socket system with 1G ram ? >> Assume socket 0 does not support hotplug, other 31 sockets support hot >> plug. >> >> So we could boot system only with socket0, and

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yasuaki Ishimatsu
2013/02/27 11:30, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 4:52 PM, Yasuaki Ishimatsu wrote: 2013/02/27 7:44, Yinghai Lu wrote: that commit is totally broken, and it should be reverted. 1. numa_init is called several times, NOT just for srat. so those nodes_clear(numa_nodes_parsed)

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 4:52 PM, Yasuaki Ishimatsu wrote: > 2013/02/27 7:44, Yinghai Lu wrote: that commit is totally broken, and it should be reverted. 1. numa_init is called several times, NOT just for srat. so those nodes_clear(numa_nodes_parsed)

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 6:14 PM, Tang Chen wrote: > Hi Yinghai, > > Please see below. :) > > > On 02/27/2013 06:44 AM, Yinghai Lu wrote: that commit is totally broken, and it should be reverted. 1. numa_init is called several times, NOT just for srat. so those

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Tang Chen
Hi Yinghai, Please see below. :) On 02/27/2013 06:44 AM, Yinghai Lu wrote: that commit is totally broken, and it should be reverted. 1. numa_init is called several times, NOT just for srat. so those nodes_clear(numa_nodes_parsed) memset(&numa_meminfo, 0, sizeof(numa_meminfo)) can not be

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yasuaki Ishimatsu
ame: S2600CP [ 0.170454] sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency. [0.156000] smpboot: Booting Node 1, Processors #1 [0.170455] Modules linked in: [0.170460] Pid: 0, comm: swapper/1 Not tainted 3.8.0+ #1 [0.170461] Call Trace: [

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
t they aren't.] cc'ing Tang Chen >>> in case this is obvious to him or he's already fixed it somewhere not >>> on Linus's tree yet. >>> >>> Don Morris >>> >>>> >>>> [0.170435] ------------[ cut here ]------------ >>>> [

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
Don Morris >> >>> >>> [0.170435] ------------[ cut here ]------------ >>> [0.170450] WARNING: at arch/x86/kernel/smpboot.c:324 >>> topology_sane.isra.2+0x71/0x84() >>> [0.170452] Hardware name: S2600CP >>> [0

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
already fixed it somewhere not on Linus's tree yet. Don Morris [0.170435] ------------[ cut here ]------------ [0.170450] WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x71/0x84() [0.170452] Hardware name: S2600CP [0.170454] sched: CPU #1's llc-sibling CPU #0

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
] Hardware name: S2600CP [0.170454] sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency. [0.156000] smpboot: Booting Node 1, Processors #1 [0.170455] Modules linked in: [0.170460] Pid: 0, comm: swapper/1 Not tainted 3.8.0+ #1

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yasuaki Ishimatsu
() [0.170452] Hardware name: S2600CP [0.170454] sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency. [0.156000] smpboot: Booting Node 1, Processors #1 [0.170455] Modules linked in: [0.170460] Pid: 0, comm: swapper/1 Not tainted 3.8.0

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Tang Chen
Hi Yinghai, Please see below. :) On 02/27/2013 06:44 AM, Yinghai Lu wrote: that commit is totally broken, and it should be reverted. 1. numa_init is called several times, NOT just for srat. so those nodes_clear(numa_nodes_parsed) memset(&numa_meminfo, 0, sizeof(numa_meminfo)) can not

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 6:14 PM, Tang Chen tangc...@cn.fujitsu.com wrote: Hi Yinghai, Please see below. :) On 02/27/2013 06:44 AM, Yinghai Lu wrote: that commit is totally broken, and it should be reverted. 1. numa_init is called several times, NOT just for srat. so those

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 4:52 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 7:44, Yinghai Lu wrote: that commit is totally broken, and it should be reverted. 1. numa_init is called several times, NOT just for srat. so those nodes_clear(numa_nodes_parsed)

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yasuaki Ishimatsu
2013/02/27 11:30, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 4:52 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 7:44, Yinghai Lu wrote: that commit is totally broken, and it should be reverted. 1. numa_init is called several times, NOT just for srat. so those

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 11:30, Yinghai Lu wrote: Do you mean you can not boot one socket system with 1G ram ? Assume socket 0 does not support hotplug, other 31 sockets support hot plug. So we could boot system only

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yasuaki Ishimatsu
2013/02/27 13:04, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 11:30, Yinghai Lu wrote: Do you mean you can not boot one socket system with 1G ram ? Assume socket 0 does not support hotplug, other 31 sockets support hot

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 13:04, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 11:30, Yinghai Lu wrote: Do you mean you can not boot one socket

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yasuaki Ishimatsu
2013/02/27 14:11, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 13:04, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 11:30, Yinghai Lu wrote: Do

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 9:49 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 14:11, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 13:04, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 7:38 PM,

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Tang Chen
On 02/27/2013 02:54 PM, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 9:49 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 14:11, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 13:04, Yinghai Lu

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Yinghai Lu
On Tue, Feb 26, 2013 at 11:11 PM, Tang Chen tangc...@cn.fujitsu.com wrote: On 02/27/2013 02:54 PM, Yinghai Lu wrote: Those patches are tangled together. No, they are not. The following commits support movablemem_map=nn[KMG]@ss[KMG]. commit fb06bc8e5f42f38c011de0e59481f464a82380f6

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Tang Chen
On 02/27/2013 03:25 PM, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 11:11 PM, Tang Chen tangc...@cn.fujitsu.com wrote: On 02/27/2013 02:54 PM, Yinghai Lu wrote: Those patches are tangled together. No, they are not. The following commits support movablemem_map=nn[KMG]@ss[KMG]. commit

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-26 Thread Lai Jiangshan
On 02/27/2013 01:11 PM, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 13:04, Yinghai Lu wrote: On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu isimatu.yasu...@jp.fujitsu.com wrote: 2013/02/27 11:30, Yinghai Lu

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-25 Thread Yasuaki Ishimatsu
Hi Yinghai, 2013/02/26 15:57, Yinghai Lu wrote: On Mon, Feb 25, 2013 at 10:09 PM, Tang Chen wrote: On 02/26/2013 12:51 PM, Martin Bligh wrote: Do you mean we can remove numaq x86 32bit code now? Wouldn't bother me at all. The machine is from 1995, end of life c. 2000? Was useful in the

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-25 Thread Tang Chen
On 02/26/2013 02:57 PM, Yinghai Lu wrote: That is temporary workaround and your patch and this workaround make x86 acpi numa init too messy. I don't see the point to hack SRAT to make memory hotplug working. Do you guys check and use PMTT in ACPI spec instead? Hi Yinghai, Thanks for the

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-25 Thread Yinghai Lu
On Mon, Feb 25, 2013 at 10:09 PM, Tang Chen wrote: > On 02/26/2013 12:51 PM, Martin Bligh wrote: >>> >>> Do you mean we can remove numaq x86 32bit code now? >> >> >> Wouldn't bother me at all. The machine is from 1995, end of life c. 2000? >> Was useful in the early days of getting NUMA up and

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-25 Thread Tang Chen
On 02/26/2013 12:51 PM, Martin Bligh wrote: Do you mean we can remove numaq x86 32bit code now? Wouldn't bother me at all. The machine is from 1995, end of life c. 2000? Was useful in the early days of getting NUMA up and running on Linux, but is now too old to be a museum piece, really. M.

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-25 Thread Martin Bligh
> Do you mean we can remove numaq x86 32bit code now? Wouldn't bother me at all. The machine is from 1995, end of life c. 2000? Was useful in the early days of getting NUMA up and running on Linux, but is now too old to be a museum piece, really. M.

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-25 Thread Yinghai Lu
On Mon, Feb 25, 2013 at 7:21 PM, Martin Bligh wrote: 4, it does not CC to TJ and other numa guys... >>> >>> attached workaround the problem for now. >>> but it will assume NUMAQ would not have SRAT table. >> >> Martin, can you confirm that numaq does not have srat? > > No, it's pre-SRAT. I

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-25 Thread Martin Bligh
>>> 4, it does not CC to TJ and other numa guys... >> >> attached workaround the problem for now. >> but it will assume NUMAQ would not have SRAT table. > > Martin, can you confirm that numaq does not have srat? No, it's pre-SRAT. I forget the exact name of the table, but no SRAT until x440.

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-25 Thread Yinghai Lu
igned to the correct node, but they aren't.] cc'ing Tang Chen >>> in case this is obvious to him or he's already fixed it somewhere not >>> on Linus's tree yet. >>> >>> Don Morris >>> >>>> >>>> [0.170435] [ cut here ]--

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-25 Thread Tang Chen
[0.170435] ------------[ cut here ]------------ [0.170450] WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x71/0x84() [0.170452] Hardware name: S2600CP [0.170454] sched: CPU #1's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency

Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-25 Thread Yinghai Lu
Don Morris >> >>> >>> [0.170435] ------------[ cut here ]------------ >>> [0.170450] WARNING: at arch/x86/kernel/smpboot.c:324 >>> topology_sane.isra.2+0x71/0x84() >>> [0.170452] Hardware name: S2600CP >>> [0
