Re: Linux 2.6.39-rc3

2011-05-06 Thread Linus Torvalds
On Wednesday, April 13, 2011, Linus Torvalds wrote: > On Wednesday, April 13, 2011, H. Peter Anvin wrote: >> >> Yes.  However, even if we *do* revert (and the time is running short on >> not reverting) I would like to understand this particular one, simply >> because I think it may very well be a

Linux 2.6.39-rc3

2011-05-06 Thread Linus Torvalds
On Wednesday, April 13, 2011, Linus Torvalds wrote: > On Wednesday, April 13, 2011, H. Peter Anvin wrote: >> >> Yes.  However, even if we *do* revert (and the time is running short on >> not reverting) I would like to understand this particular one, simply >> because I think it may very well be a

Linux 2.6.39-rc3

2011-04-18 Thread Alex Deucher
On Mon, Apr 18, 2011 at 11:59 AM, Jerome Glisse wrote: > On Mon, Apr 18, 2011 at 11:33 AM, Alex Deucher > wrote: >> On Mon, Apr 18, 2011 at 11:29 AM, Jerome Glisse >> wrote: >>> On Mon, Apr 18, 2011 at 11:23 AM, Alex Deucher >>> wrote: On Sun, Apr 17, 2011 at 10:09 AM, Joerg Roedel wro

Linux 2.6.39-rc3

2011-04-18 Thread Jerome Glisse
On Mon, Apr 18, 2011 at 11:33 AM, Alex Deucher wrote: > On Mon, Apr 18, 2011 at 11:29 AM, Jerome Glisse wrote: >> On Mon, Apr 18, 2011 at 11:23 AM, Alex Deucher >> wrote: >>> On Sun, Apr 17, 2011 at 10:09 AM, Joerg Roedel wrote: On Sat, Apr 16, 2011 at 02:54:04PM -0400, Jerome Glisse wrot

Linux 2.6.39-rc3

2011-04-18 Thread Alex Deucher
On Mon, Apr 18, 2011 at 11:29 AM, Jerome Glisse wrote: > On Mon, Apr 18, 2011 at 11:23 AM, Alex Deucher > wrote: >> On Sun, Apr 17, 2011 at 10:09 AM, Joerg Roedel wrote: >>> On Sat, Apr 16, 2011 at 02:54:04PM -0400, Jerome Glisse wrote: >>> If you want to go the printk way you can add prin

Linux 2.6.39-rc3

2011-04-18 Thread Jerome Glisse
On Mon, Apr 18, 2011 at 11:23 AM, Alex Deucher wrote: > On Sun, Apr 17, 2011 at 10:09 AM, Joerg Roedel wrote: >> On Sat, Apr 16, 2011 at 02:54:04PM -0400, Jerome Glisse wrote: >> >>> If you want to go the printk way you can add printk before each test >>> ring_test, ib_test in r600.c this 2 funct

Linux 2.6.39-rc3

2011-04-18 Thread Alex Deucher
On Sun, Apr 17, 2011 at 10:09 AM, Joerg Roedel wrote: > On Sat, Apr 16, 2011 at 02:54:04PM -0400, Jerome Glisse wrote: > >> If you want to go the printk way you can add printk before each test >> ring_test, ib_test in r600.c this 2 functions are the own that might >> trigger the first GPU gart act

Re: Linux 2.6.39-rc3

2011-04-18 Thread Alex Deucher
On Mon, Apr 18, 2011 at 11:59 AM, Jerome Glisse wrote: > On Mon, Apr 18, 2011 at 11:33 AM, Alex Deucher wrote: >> On Mon, Apr 18, 2011 at 11:29 AM, Jerome Glisse wrote: >>> On Mon, Apr 18, 2011 at 11:23 AM, Alex Deucher >>> wrote: On Sun, Apr 17, 2011 at 10:09 AM, Joerg Roedel wrote: >>>

Re: Linux 2.6.39-rc3

2011-04-18 Thread Jerome Glisse
On Mon, Apr 18, 2011 at 11:33 AM, Alex Deucher wrote: > On Mon, Apr 18, 2011 at 11:29 AM, Jerome Glisse wrote: >> On Mon, Apr 18, 2011 at 11:23 AM, Alex Deucher wrote: >>> On Sun, Apr 17, 2011 at 10:09 AM, Joerg Roedel wrote: On Sat, Apr 16, 2011 at 02:54:04PM -0400, Jerome Glisse wrote: >

Re: Linux 2.6.39-rc3

2011-04-18 Thread Alex Deucher
On Mon, Apr 18, 2011 at 11:29 AM, Jerome Glisse wrote: > On Mon, Apr 18, 2011 at 11:23 AM, Alex Deucher wrote: >> On Sun, Apr 17, 2011 at 10:09 AM, Joerg Roedel wrote: >>> On Sat, Apr 16, 2011 at 02:54:04PM -0400, Jerome Glisse wrote: >>> If you want to go the printk way you can add printk

Re: Linux 2.6.39-rc3

2011-04-18 Thread Jerome Glisse
On Mon, Apr 18, 2011 at 11:23 AM, Alex Deucher wrote: > On Sun, Apr 17, 2011 at 10:09 AM, Joerg Roedel wrote: >> On Sat, Apr 16, 2011 at 02:54:04PM -0400, Jerome Glisse wrote: >> >>> If you want to go the printk way you can add printk before each test >>> ring_test, ib_test in r600.c this 2 funct

Re: Linux 2.6.39-rc3

2011-04-18 Thread Alex Deucher
On Sun, Apr 17, 2011 at 10:09 AM, Joerg Roedel wrote: > On Sat, Apr 16, 2011 at 02:54:04PM -0400, Jerome Glisse wrote: > >> If you want to go the printk way you can add printk before each test >> ring_test, ib_test in r600.c this 2 functions are the own that might >> trigger the first GPU gart act

Linux 2.6.39-rc3

2011-04-17 Thread Jerome Glisse
On Sun, Apr 17, 2011 at 10:09 AM, Joerg Roedel wrote: > On Sat, Apr 16, 2011 at 02:54:04PM -0400, Jerome Glisse wrote: > >> If you want to go the printk way you can add printk before each test >> ring_test, ib_test in r600.c this 2 functions are the own that might >> trigger the first GPU gart act

Re: Linux 2.6.39-rc3

2011-04-17 Thread Jerome Glisse
On Sun, Apr 17, 2011 at 10:09 AM, Joerg Roedel wrote: > On Sat, Apr 16, 2011 at 02:54:04PM -0400, Jerome Glisse wrote: > >> If you want to go the printk way you can add printk before each test >> ring_test, ib_test in r600.c this 2 functions are the own that might >> trigger the first GPU gart act

Linux 2.6.39-rc3

2011-04-17 Thread Joerg Roedel
On Sat, Apr 16, 2011 at 02:54:04PM -0400, Jerome Glisse wrote: > If you want to go the printk way you can add printk before each test > ring_test, ib_test in r600.c this 2 functions are the own that might > trigger the first GPU gart activities. Okay, I found the place in source that triggers thi

Re: Linux 2.6.39-rc3

2011-04-17 Thread Joerg Roedel
On Sat, Apr 16, 2011 at 02:54:04PM -0400, Jerome Glisse wrote: > If you want to go the printk way you can add printk before each test > ring_test, ib_test in r600.c this 2 functions are the own that might > trigger the first GPU gart activities. Okay, I found the place in source that triggers thi

Linux 2.6.39-rc3

2011-04-16 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 12:11:28PM -0400, Jerome Glisse wrote: > Do you also got the write if you load radeon with radeon.no_wb=1 ? > I think at this address it's the wb page, or maybe the cp as wb likely > take only one page radeon.no_wb=1 makes no difference. The box still reboots. Joer

Linux 2.6.39-rc3

2011-04-16 Thread Jerome Glisse
On Sat, Apr 16, 2011 at 12:35 PM, Joerg Roedel wrote: > On Fri, Apr 15, 2011 at 12:11:28PM -0400, Jerome Glisse wrote: >> Do you also got the write if you load radeon with radeon.no_wb=1 ? >> I think at this address it's the wb page, or maybe the cp as wb likely >> take only one page > > radeon.no

Linux 2.6.39-rc3

2011-04-16 Thread Ingo Molnar
* Joerg Roedel wrote: > On Fri, Apr 15, 2011 at 09:06:41PM +0200, Ingo Molnar wrote: > > > > * Alexandre Demers wrote: > > > > > On 11-04-15 10:27 AM, Joerg Roedel wrote: > > > > On Fri, Apr 15, 2011 at 10:16:59AM -0400, Alexandre Demers wrote: > > > >> Ok, I'll test it today. Should I apply

Linux 2.6.39-rc3

2011-04-16 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 12:18:02PM -0700, Yinghai Lu wrote: > On 04/15/2011 12:06 PM, Ingo Molnar wrote: > > > > > Joerg, mind submitting it with a changelog that includes everything we > > learned > > about this bug and all the Tested-by's in place? > > > > Is anyone of the opinion that we sh

Linux 2.6.39-rc3

2011-04-16 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 09:06:41PM +0200, Ingo Molnar wrote: > > * Alexandre Demers wrote: > > > On 11-04-15 10:27 AM, Joerg Roedel wrote: > > > On Fri, Apr 15, 2011 at 10:16:59AM -0400, Alexandre Demers wrote: > > >> Ok, I'll test it today. Should I apply it on a clean rc3 without any of > > >>

Re: Linux 2.6.39-rc3

2011-04-16 Thread Jerome Glisse
On Sat, Apr 16, 2011 at 12:35 PM, Joerg Roedel wrote: > On Fri, Apr 15, 2011 at 12:11:28PM -0400, Jerome Glisse wrote: >> Do you also got the write if you load radeon with radeon.no_wb=1 ? >> I think at this address it's the wb page, or maybe the cp as wb likely >> take only one page > > radeon.no

Re: Linux 2.6.39-rc3

2011-04-16 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 12:11:28PM -0400, Jerome Glisse wrote: > Do you also got the write if you load radeon with radeon.no_wb=1 ? > I think at this address it's the wb page, or maybe the cp as wb likely > take only one page radeon.no_wb=1 makes no difference. The box still reboots. Joer

Re: Linux 2.6.39-rc3

2011-04-16 Thread Ingo Molnar
* Joerg Roedel wrote: > On Fri, Apr 15, 2011 at 09:06:41PM +0200, Ingo Molnar wrote: > > > > * Alexandre Demers wrote: > > > > > On 11-04-15 10:27 AM, Joerg Roedel wrote: > > > > On Fri, Apr 15, 2011 at 10:16:59AM -0400, Alexandre Demers wrote: > > > >> Ok, I'll test it today. Should I apply

Re: Linux 2.6.39-rc3

2011-04-16 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 12:18:02PM -0700, Yinghai Lu wrote: > On 04/15/2011 12:06 PM, Ingo Molnar wrote: > > > > > Joerg, mind submitting it with a changelog that includes everything we > > learned > > about this bug and all the Tested-by's in place? > > > > Is anyone of the opinion that we sh

Re: Linux 2.6.39-rc3

2011-04-16 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 09:06:41PM +0200, Ingo Molnar wrote: > > * Alexandre Demers wrote: > > > On 11-04-15 10:27 AM, Joerg Roedel wrote: > > > On Fri, Apr 15, 2011 at 10:16:59AM -0400, Alexandre Demers wrote: > > >> Ok, I'll test it today. Should I apply it on a clean rc3 without any of > > >>

Linux 2.6.39-rc3

2011-04-15 Thread Ingo Molnar
* Alexandre Demers wrote: > On 11-04-15 10:27 AM, Joerg Roedel wrote: > > On Fri, Apr 15, 2011 at 10:16:59AM -0400, Alexandre Demers wrote: > >> Ok, I'll test it today. Should I apply it on a clean rc3 without any of > >> the other patches? > > Yes, apply it just on -rc3 without any other patch.

Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 03:16:50PM +0200, Ingo Molnar wrote: > Ok, but how did the allocation changes start triggering this error in > v2.6.39-rc1? There must still be some layout specific thing here, right? > Do we understand the details of that as well? Well, thinking again about this, the GPU

Linux 2.6.39-rc3

2011-04-15 Thread Andreas Herrmann
On Thu, Apr 14, 2011 at 05:34:46PM -0400, Alex Deucher wrote: > On Thu, Apr 14, 2011 at 5:09 PM, Joerg Roedel wrote: > > On Thu, Apr 14, 2011 at 10:28:43AM -0400, Alex Deucher wrote: > >> On Thu, Apr 14, 2011 at 4:56 AM, Joerg Roedel wrote: > >> > And this makes a difference, with this change on-

Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 03:16:50PM +0200, Ingo Molnar wrote: > Ok, but how did the allocation changes start triggering this error in > v2.6.39-rc1? There must still be some layout specific thing here, right? > Do we understand the details of that as well? No, I must admit that I lack enough knowl

Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 04:04:45PM +0200, Andreas Herrmann wrote: > What about tagging this patch for stable/longterm releases? > > Potentially there are other cases where certain combinations of > hardware(GPUs)/drivers/whatsoever might trigger a GartTlbWlkErr. If > the BIOS doesn't follow the BK

Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 10:16:59AM -0400, Alexandre Demers wrote: > Ok, I'll test it today. Should I apply it on a clean rc3 without any of > the other patches? Yes, apply it just on -rc3 without any other patch. > > BTW, may I suggest adding the info under bug 33012 in kernel bugzilla? > This c

Linux 2.6.39-rc3

2011-04-15 Thread Andreas Herrmann
On Fri, Apr 15, 2011 at 03:11:52PM +0200, Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote: > > we definitely want to also understand the reason for things not > > working, even if we do revert.. > > Okay, here it is. > > After experimenting with different con

Linux 2.6.39-rc3

2011-04-15 Thread Ingo Molnar
* Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote: > > we definitely want to also understand the reason for things not > > working, even if we do revert.. > > Okay, here it is. > > After experimenting with different configurations for the north-bridge > it

Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote: > we definitely want to also understand the reason for things not > working, even if we do revert.. Okay, here it is. After experimenting with different configurations for the north-bridge it turned out that a GART related MCE fires

Linux 2.6.39-rc3

2011-04-15 Thread Alexandre Demers
On 11-04-15 10:27 AM, Joerg Roedel wrote: > On Fri, Apr 15, 2011 at 10:16:59AM -0400, Alexandre Demers wrote: >> Ok, I'll test it today. Should I apply it on a clean rc3 without any of >> the other patches? > Yes, apply it just on -rc3 without any other patch. > >> BTW, may I suggest adding the inf

Re: Linux 2.6.39-rc3

2011-04-15 Thread H. Peter Anvin
On 04/15/2011 12:18 PM, Yinghai Lu wrote: > On 04/15/2011 12:06 PM, Ingo Molnar wrote: > >> >> Joerg, mind submitting it with a changelog that includes everything we >> learned >> about this bug and all the Tested-by's in place? >> >> Is anyone of the opinion that we should try to revert the all

Linux 2.6.39-rc3

2011-04-15 Thread H. Peter Anvin
On 04/15/2011 12:18 PM, Yinghai Lu wrote: > On 04/15/2011 12:06 PM, Ingo Molnar wrote: > >> >> Joerg, mind submitting it with a changelog that includes everything we >> learned >> about this bug and all the Tested-by's in place? >> >> Is anyone of the opinion that we should try to revert the all

Re: Linux 2.6.39-rc3

2011-04-15 Thread Yinghai Lu
On 04/15/2011 12:06 PM, Ingo Molnar wrote: > > Joerg, mind submitting it with a changelog that includes everything we > learned > about this bug and all the Tested-by's in place? > > Is anyone of the opinion that we should try to revert the allocation > order/alignment changes in addition to

Linux 2.6.39-rc3

2011-04-15 Thread Yinghai Lu
On 04/15/2011 12:06 PM, Ingo Molnar wrote: > > Joerg, mind submitting it with a changelog that includes everything we > learned > about this bug and all the Tested-by's in place? > > Is anyone of the opinion that we should try to revert the allocation > order/alignment changes in addition to

Linux 2.6.39-rc3

2011-04-15 Thread Jerome Glisse
On Fri, Apr 15, 2011 at 11:46 AM, Joerg Roedel wrote: > On Fri, Apr 15, 2011 at 03:16:50PM +0200, Ingo Molnar wrote: >> Ok, but how did the allocation changes start triggering this error in >> v2.6.39-rc1? There must still be some layout specific thing here, right? >> Do we understand the details

Linux 2.6.39-rc3

2011-04-15 Thread Alex Deucher
On Fri, Apr 15, 2011 at 10:33 AM, Joerg Roedel wrote: > On Fri, Apr 15, 2011 at 03:16:50PM +0200, Ingo Molnar wrote: >> Ok, but how did the allocation changes start triggering this error in >> v2.6.39-rc1? There must still be some layout specific thing here, right? >> Do we understand the details

Re: Linux 2.6.39-rc3

2011-04-15 Thread Ingo Molnar
* Alexandre Demers wrote: > On 11-04-15 10:27 AM, Joerg Roedel wrote: > > On Fri, Apr 15, 2011 at 10:16:59AM -0400, Alexandre Demers wrote: > >> Ok, I'll test it today. Should I apply it on a clean rc3 without any of > >> the other patches? > > Yes, apply it just on -rc3 without any other patch.

Re: Linux 2.6.39-rc3

2011-04-15 Thread Alexandre Demers
On 11-04-15 10:27 AM, Joerg Roedel wrote: > On Fri, Apr 15, 2011 at 10:16:59AM -0400, Alexandre Demers wrote: >> Ok, I'll test it today. Should I apply it on a clean rc3 without any of >> the other patches? > Yes, apply it just on -rc3 without any other patch. > >> BTW, may I suggest adding the inf

Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 10:26:34AM +0200, Michel Dänzer wrote: > On Don, 2011-04-14 at 23:09 +0200, Joerg Roedel wrote: > > On Thu, Apr 14, 2011 at 10:28:43AM -0400, Alex Deucher wrote: > > > On Thu, Apr 14, 2011 at 4:56 AM, Joerg Roedel wrote: > > > > And this makes a difference, with this chang

Linux 2.6.39-rc3

2011-04-15 Thread Michel Dänzer
On Don, 2011-04-14 at 23:09 +0200, Joerg Roedel wrote: > On Thu, Apr 14, 2011 at 10:28:43AM -0400, Alex Deucher wrote: > > On Thu, Apr 14, 2011 at 4:56 AM, Joerg Roedel wrote: > > > And this makes a difference, with this change on-top of -rc3 the box boots > > > fine. So there seems to be some de

Linux 2.6.39-rc3

2011-04-15 Thread Alexandre Demers
On 11-04-15 09:11 AM, Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote: >> we definitely want to also understand the reason for things not >> working, even if we do revert.. > Okay, here it is. > > After experimenting with different configurations for the north-

Re: Linux 2.6.39-rc3

2011-04-15 Thread Andreas Herrmann
On Thu, Apr 14, 2011 at 05:34:46PM -0400, Alex Deucher wrote: > On Thu, Apr 14, 2011 at 5:09 PM, Joerg Roedel wrote: > > On Thu, Apr 14, 2011 at 10:28:43AM -0400, Alex Deucher wrote: > >> On Thu, Apr 14, 2011 at 4:56 AM, Joerg Roedel wrote: > >> > And this makes a difference, with this change on-

Re: Linux 2.6.39-rc3

2011-04-15 Thread Andreas Herrmann
On Fri, Apr 15, 2011 at 03:11:52PM +0200, Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote: > > we definitely want to also understand the reason for things not > > working, even if we do revert.. > > Okay, here it is. > > After experimenting with different con

Re: Linux 2.6.39-rc3

2011-04-15 Thread Jerome Glisse
On Fri, Apr 15, 2011 at 11:46 AM, Joerg Roedel wrote: > On Fri, Apr 15, 2011 at 03:16:50PM +0200, Ingo Molnar wrote: >> Ok, but how did the allocation changes start triggering this error in >> v2.6.39-rc1? There must still be some layout specific thing here, right? >> Do we understand the details

Re: Linux 2.6.39-rc3

2011-04-15 Thread Alex Deucher
On Fri, Apr 15, 2011 at 10:33 AM, Joerg Roedel wrote: > On Fri, Apr 15, 2011 at 03:16:50PM +0200, Ingo Molnar wrote: >> Ok, but how did the allocation changes start triggering this error in >> v2.6.39-rc1? There must still be some layout specific thing here, right? >> Do we understand the details

Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Thu, Apr 14, 2011 at 05:34:46PM -0400, Alex Deucher wrote: > On Thu, Apr 14, 2011 at 5:09 PM, Joerg Roedel wrote: > > Actually, the nb gart is part of the cpu. It is part of the cpu north > > bridge and can translate io and cpu accesses. In fact, it is a remapper > > of physical memory address

Re: Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 03:16:50PM +0200, Ingo Molnar wrote: > Ok, but how did the allocation changes start triggering this error in > v2.6.39-rc1? There must still be some layout specific thing here, right? > Do we understand the details of that as well? Well, thinking again about this, the GPU

Re: Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 03:16:50PM +0200, Ingo Molnar wrote: > Ok, but how did the allocation changes start triggering this error in > v2.6.39-rc1? There must still be some layout specific thing here, right? > Do we understand the details of that as well? No, I must admit that I lack enough knowl

Re: Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 04:04:45PM +0200, Andreas Herrmann wrote: > What about tagging this patch for stable/longterm releases? > > Potentially there are other cases where certain combinations of > hardware(GPUs)/drivers/whatsoever might trigger a GartTlbWlkErr. If > the BIOS doesn't follow the BK

Re: Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 10:16:59AM -0400, Alexandre Demers wrote: > Ok, I'll test it today. Should I apply it on a clean rc3 without any of > the other patches? Yes, apply it just on -rc3 without any other patch. > > BTW, may I suggest adding the info under bug 33012 in kernel bugzilla? > This c

Re: Linux 2.6.39-rc3

2011-04-15 Thread Alexandre Demers
On 11-04-15 09:11 AM, Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote: >> we definitely want to also understand the reason for things not >> working, even if we do revert.. > Okay, here it is. > > After experimenting with different configurations for the north-

Re: Linux 2.6.39-rc3

2011-04-15 Thread Ingo Molnar
* Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote: > > we definitely want to also understand the reason for things not > > working, even if we do revert.. > > Okay, here it is. > > After experimenting with different configurations for the north-bridge > it

Re: Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote: > we definitely want to also understand the reason for things not > working, even if we do revert.. Okay, here it is. After experimenting with different configurations for the north-bridge it turned out that a GART related MCE fires

Re: Linux 2.6.39-rc3

2011-04-15 Thread Joerg Roedel
On Fri, Apr 15, 2011 at 10:26:34AM +0200, Michel Dänzer wrote: > On Don, 2011-04-14 at 23:09 +0200, Joerg Roedel wrote: > > On Thu, Apr 14, 2011 at 10:28:43AM -0400, Alex Deucher wrote: > > > On Thu, Apr 14, 2011 at 4:56 AM, Joerg Roedel wrote: > > > > And this makes a difference, with this chang

Re: Linux 2.6.39-rc3

2011-04-15 Thread Michel Dänzer
On Don, 2011-04-14 at 23:09 +0200, Joerg Roedel wrote: > On Thu, Apr 14, 2011 at 10:28:43AM -0400, Alex Deucher wrote: > > On Thu, Apr 14, 2011 at 4:56 AM, Joerg Roedel wrote: > > > And this makes a difference, with this change on-top of -rc3 the box boots > > > fine. So there seems to be some de

Re: Linux 2.6.39-rc3

2011-04-14 Thread Joerg Roedel
On Thu, Apr 14, 2011 at 05:34:46PM -0400, Alex Deucher wrote: > On Thu, Apr 14, 2011 at 5:09 PM, Joerg Roedel wrote: > > Actually, the nb gart is part of the cpu. It is part of the cpu north > > bridge and can translate io and cpu accesses. In fact, it is a remapper > > of physical memory address

Linux 2.6.39-rc3

2011-04-14 Thread Joerg Roedel
On Thu, Apr 14, 2011 at 10:28:43AM -0400, Alex Deucher wrote: > On Thu, Apr 14, 2011 at 4:56 AM, Joerg Roedel wrote: > > And this makes a difference, with this change on-top of -rc3 the box boots > > fine. So there seems to be some dependency between the GART base and the GTT > > base even when th

Linux 2.6.39-rc3

2011-04-14 Thread Dave Airlie
On Thu, Apr 14, 2011 at 6:56 PM, Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 06:58:46PM -0700, H. Peter Anvin wrote: >> On 04/13/2011 12:14 PM, Yinghai Lu wrote: >> > >> > so looks bios program wrong address to the radon card? >> > >> >> Okay, staring at this, it definitely seems toxic to overla

Linux 2.6.39-rc3

2011-04-14 Thread Alex Deucher
On Thu, Apr 14, 2011 at 5:09 PM, Joerg Roedel wrote: > On Thu, Apr 14, 2011 at 10:28:43AM -0400, Alex Deucher wrote: >> On Thu, Apr 14, 2011 at 4:56 AM, Joerg Roedel wrote: >> > And this makes a difference, with this change on-top of -rc3 the box boots >> > fine. So there seems to be some depende

Re: Linux 2.6.39-rc3

2011-04-14 Thread Alex Deucher
On Thu, Apr 14, 2011 at 5:09 PM, Joerg Roedel wrote: > On Thu, Apr 14, 2011 at 10:28:43AM -0400, Alex Deucher wrote: >> On Thu, Apr 14, 2011 at 4:56 AM, Joerg Roedel wrote: >> > And this makes a difference, with this change on-top of -rc3 the box boots >> > fine. So there seems to be some depende

Re: Linux 2.6.39-rc3

2011-04-14 Thread Joerg Roedel
On Thu, Apr 14, 2011 at 10:28:43AM -0400, Alex Deucher wrote: > On Thu, Apr 14, 2011 at 4:56 AM, Joerg Roedel wrote: > > And this makes a difference, with this change on-top of -rc3 the box boots > > fine. So there seems to be some dependency between the GART base and the GTT > > base even when th

Linux 2.6.39-rc3

2011-04-14 Thread Tejun Heo
Hello, On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote: > On Wednesday, April 13, 2011, Linus Torvalds > wrote: > > On Wednesday, April 13, 2011, H. Peter Anvin wrote: > >> > >> Yes.  However, even if we *do* revert (and the time is running short on > >> not reverting) I would lik

Linux 2.6.39-rc3

2011-04-14 Thread Dave Airlie
On Wed, 2011-04-13 at 18:58 -0700, H. Peter Anvin wrote: > On 04/13/2011 12:14 PM, Yinghai Lu wrote: > > > > so those two patches uncover some problems. > > > > [0.00] Checking aperture... > > [0.00] No AGP bridge found > > [0.00] Node 0: aperture @ a000 size 32 MB > >

Linux 2.6.39-rc3

2011-04-14 Thread Joerg Roedel
On Thu, Apr 14, 2011 at 01:03:37PM +0900, Tejun Heo wrote: > Hello, > > On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote: > > On Wednesday, April 13, 2011, Linus Torvalds > > wrote: > > > On Wednesday, April 13, 2011, H. Peter Anvin wrote: > > >> > > >> Yes.  However, even if we *d

Linux 2.6.39-rc3

2011-04-14 Thread Ingo Molnar
* Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 06:58:46PM -0700, H. Peter Anvin wrote: > > On 04/13/2011 12:14 PM, Yinghai Lu wrote: > > > > > > so looks bios program wrong address to the radon card? > > > > > > > Okay, staring at this, it definitely seems toxic to overlay the GART > > over

Linux 2.6.39-rc3

2011-04-14 Thread Joerg Roedel
On Wed, Apr 13, 2011 at 03:31:09PM -0700, H. Peter Anvin wrote: > On 04/13/2011 03:22 PM, Joerg Roedel wrote: > > On Wed, Apr 13, 2011 at 03:01:10PM -0700, H. Peter Anvin wrote: > >> On 04/13/2011 02:50 PM, Joerg Roedel wrote: > >>> On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote: >

Linux 2.6.39-rc3

2011-04-14 Thread Joerg Roedel
On Wed, Apr 13, 2011 at 06:58:46PM -0700, H. Peter Anvin wrote: > On 04/13/2011 12:14 PM, Yinghai Lu wrote: > > > > so looks bios program wrong address to the radon card? > > > > Okay, staring at this, it definitely seems toxic to overlay the GART > over memory areas reserved by the BIOS. If I

Linux 2.6.39-rc3

2011-04-14 Thread Alex Deucher
On Thu, Apr 14, 2011 at 4:56 AM, Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 06:58:46PM -0700, H. Peter Anvin wrote: >> On 04/13/2011 12:14 PM, Yinghai Lu wrote: >> > >> > so looks bios program wrong address to the radon card? >> > >> >> Okay, staring at this, it definitely seems toxic to overla

Linux 2.6.39-rc3

2011-04-14 Thread Alan Cox
On Wed, 13 Apr 2011 19:33:40 -0700 Linus Torvalds wrote: > On Wednesday, April 13, 2011, Linus Torvalds > wrote: > > On Wednesday, April 13, 2011, H. Peter Anvin wrote: > >> > >> Yes.  However, even if we *do* revert (and the time is running short on > >> not reverting) I would like to understa

Re: Linux 2.6.39-rc3

2011-04-14 Thread H. Peter Anvin
On 04/14/2011 02:11 AM, Ingo Molnar wrote: > > I'd strongly suggest we revert back to the old and proven allocation order, > as > long as it results in valid layouts. Even if we figure out this particular > GART/GTT assumption there might be a dozen others in other types of hardware. > Yes, b

Linux 2.6.39-rc3

2011-04-14 Thread H. Peter Anvin
On 04/14/2011 02:11 AM, Ingo Molnar wrote: > > I'd strongly suggest we revert back to the old and proven allocation order, > as > long as it results in valid layouts. Even if we figure out this particular > GART/GTT assumption there might be a dozen others in other types of hardware. > Yes, b

Re: Linux 2.6.39-rc3

2011-04-14 Thread Alex Deucher
On Thu, Apr 14, 2011 at 4:56 AM, Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 06:58:46PM -0700, H. Peter Anvin wrote: >> On 04/13/2011 12:14 PM, Yinghai Lu wrote: >> > >> > so looks bios program wrong address to the radon card? >> > >> >> Okay, staring at this, it definitely seems toxic to overla

Re: Linux 2.6.39-rc3

2011-04-14 Thread Joerg Roedel
On Thu, Apr 14, 2011 at 01:03:37PM +0900, Tejun Heo wrote: > Hello, > > On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote: > > On Wednesday, April 13, 2011, Linus Torvalds > > wrote: > > > On Wednesday, April 13, 2011, H. Peter Anvin wrote: > > >> > > >> Yes.  However, even if we *d

Re: Linux 2.6.39-rc3

2011-04-14 Thread Ingo Molnar
* Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 06:58:46PM -0700, H. Peter Anvin wrote: > > On 04/13/2011 12:14 PM, Yinghai Lu wrote: > > > > > > so looks bios program wrong address to the radon card? > > > > > > > Okay, staring at this, it definitely seems toxic to overlay the GART > > over

Re: Linux 2.6.39-rc3

2011-04-14 Thread Dave Airlie
On Thu, Apr 14, 2011 at 6:56 PM, Joerg Roedel wrote: > On Wed, Apr 13, 2011 at 06:58:46PM -0700, H. Peter Anvin wrote: >> On 04/13/2011 12:14 PM, Yinghai Lu wrote: >> > >> > so looks bios program wrong address to the radon card? >> > >> >> Okay, staring at this, it definitely seems toxic to overla

Re: Linux 2.6.39-rc3

2011-04-14 Thread Joerg Roedel
On Wed, Apr 13, 2011 at 03:31:09PM -0700, H. Peter Anvin wrote: > On 04/13/2011 03:22 PM, Joerg Roedel wrote: > > On Wed, Apr 13, 2011 at 03:01:10PM -0700, H. Peter Anvin wrote: > >> On 04/13/2011 02:50 PM, Joerg Roedel wrote: > >>> On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote: >

Re: Linux 2.6.39-rc3

2011-04-14 Thread Joerg Roedel
On Wed, Apr 13, 2011 at 06:58:46PM -0700, H. Peter Anvin wrote: > On 04/13/2011 12:14 PM, Yinghai Lu wrote: > > > > so looks bios program wrong address to the radon card? > > > > Okay, staring at this, it definitely seems toxic to overlay the GART > over memory areas reserved by the BIOS. If I

Re: Linux 2.6.39-rc3

2011-04-14 Thread Alan Cox
On Wed, 13 Apr 2011 19:33:40 -0700 Linus Torvalds wrote: > On Wednesday, April 13, 2011, Linus Torvalds > wrote: > > On Wednesday, April 13, 2011, H. Peter Anvin wrote: > >> > >> Yes.  However, even if we *do* revert (and the time is running short on > >> not reverting) I would like to understa

Linux 2.6.39-rc3

2011-04-14 Thread Joerg Roedel
On Wed, Apr 13, 2011 at 03:01:10PM -0700, H. Peter Anvin wrote: > On 04/13/2011 02:50 PM, Joerg Roedel wrote: > > On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote: > >> - addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20); > >> + addr = memblock_find_in_range(0, 1ULL<<32,

Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel
On Wed, Apr 13, 2011 at 01:48:48PM -0700, Yinghai Lu wrote: > - addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20); > + addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<21); Btw, while looking at this code I wondered why the 512M goal is enforced by the alignmen

Re: Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin
On 04/13/2011 07:07 PM, Dave Airlie wrote: >> >> Okay, staring at this, it definitely seems toxic to overlay the GART >> over memory areas reserved by the BIOS. If I were to guess, I would say >> that the problem here seems to be that the kernel thinks it is >> overlaying 64 MiB of memory, but the

Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin
On 04/13/2011 07:07 PM, Dave Airlie wrote: >> >> Okay, staring at this, it definitely seems toxic to overlay the GART >> over memory areas reserved by the BIOS. If I were to guess, I would say >> that the problem here seems to be that the kernel thinks it is >> overlaying 64 MiB of memory, but the

Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel
On Wed, Apr 13, 2011 at 12:14:55PM -0700, Yinghai Lu wrote: > thanks for the bisecting... > > so those two patches uncover some problems. > > [0.00] Checking aperture... > [0.00] No AGP bridge found > [0.00] Node 0: aperture @ a000 size 32 MB > [0.00] Aperture

Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel
On Wed, Apr 13, 2011 at 11:39:29AM -0700, H. Peter Anvin wrote: > On 04/13/2011 10:21 AM, Joerg Roedel wrote: > > On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote: > >> Could you please send the before/after bootlog (in particular all memory > >> init > >> messages included) and your .

Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel
On Wed, Apr 13, 2011 at 11:51:39AM -0700, H. Peter Anvin wrote: > On 04/13/2011 10:21 AM, Joerg Roedel wrote: > > > > First of all, I bisected between v2.6.37-rc2..f005fe12b90c which where > > only a couple of patches and merged v2.6.38-rc4 in at every step. There > > was no failure found. > > The

Re: Linux 2.6.39-rc3

2011-04-13 Thread Tejun Heo
Hello, On Wed, Apr 13, 2011 at 07:33:40PM -0700, Linus Torvalds wrote: > On Wednesday, April 13, 2011, Linus Torvalds > wrote: > > On Wednesday, April 13, 2011, H. Peter Anvin wrote: > >> > >> Yes.  However, even if we *do* revert (and the time is running short on > >> not reverting) I would lik

Re: Linux 2.6.39-rc3

2011-04-13 Thread Linus Torvalds
On Wednesday, April 13, 2011, Linus Torvalds wrote: > On Wednesday, April 13, 2011, H. Peter Anvin wrote: >> >> Yes.  However, even if we *do* revert (and the time is running short on >> not reverting) I would like to understand this particular one, simply >> because I think it may very well be a

Linux 2.6.39-rc3

2011-04-13 Thread Linus Torvalds
On Wednesday, April 13, 2011, Linus Torvalds wrote: > On Wednesday, April 13, 2011, H. Peter Anvin wrote: >> >> Yes.  However, even if we *do* revert (and the time is running short on >> not reverting) I would like to understand this particular one, simply >> because I think it may very well be a

Re: Linux 2.6.39-rc3

2011-04-13 Thread Linus Torvalds
On Wednesday, April 13, 2011, H. Peter Anvin wrote: > > Yes.  However, even if we *do* revert (and the time is running short on > not reverting) I would like to understand this particular one, simply > because I think it may very well be a problem that is manifesting itself > in other ways on othe

Linux 2.6.39-rc3

2011-04-13 Thread Linus Torvalds
On Wednesday, April 13, 2011, H. Peter Anvin wrote: > > Yes.  However, even if we *do* revert (and the time is running short on > not reverting) I would like to understand this particular one, simply > because I think it may very well be a problem that is manifesting itself > in other ways on othe

Linux 2.6.39-rc3

2011-04-13 Thread Joerg Roedel
On Wed, Apr 13, 2011 at 08:46:09AM +0200, Ingo Molnar wrote: > Could you please send the before/after bootlog (in particular all memory init > messages included) and your .config? > > before: f005fe12b90c: x86-64: Move out cleanup higmap [_brk_end, _end) out > of init_memory_mapping() > afte

Re: Linux 2.6.39-rc3

2011-04-13 Thread Dave Airlie
On Wed, 2011-04-13 at 18:58 -0700, H. Peter Anvin wrote: > On 04/13/2011 12:14 PM, Yinghai Lu wrote: > > > > so those two patches uncover some problems. > > > > [0.00] Checking aperture... > > [0.00] No AGP bridge found > > [0.00] Node 0: aperture @ a000 size 32 MB > >

Re: Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin
On 04/13/2011 04:39 PM, Linus Torvalds wrote: > > - Choice #2: understand exactly _what_ goes wrong, and fix it > analytically (ie by _understanding_ the problem, and being able to > solve it exactly, and in a way you can argue about without having to > resort to "magic happens"). > > Now, the w

Linux 2.6.39-rc3

2011-04-13 Thread H. Peter Anvin
On 04/13/2011 04:39 PM, Linus Torvalds wrote: > > - Choice #2: understand exactly _what_ goes wrong, and fix it > analytically (ie by _understanding_ the problem, and being able to > solve it exactly, and in a way you can argue about without having to > resort to "magic happens"). > > Now, the w
