2.6.39-rc1 nouveau(?) regression (bisected)
On Tue, Apr 19, 2011 at 11:47:47PM +0200, Marcin Slusarz wrote:
> Thanks. It helped a bit.
> I'll send two patches in response to this message, one of which fixes this
> bug.

Those patches fixed my system. Thanks!
Re: 2.6.39-rc1 nouveau regression (bisected)
On Sat, Apr 16, 2011 at 07:50:28PM -0400, Kyle Spaans wrote:
> On Sun, Apr 17, 2011 at 08:12:35AM +1000, Nigel Cunningham wrote:
> > On 15/04/11 16:11, Dominik Brodowski wrote:
> > > On Thu, Apr 14, 2011 at 09:02:01PM +0200, Marcin Slusarz wrote:
> > >> On Thu, Apr 14, 2011 at 07:05:59PM +0200, Dominik Brodowski wrote:
> > >>> Thought about CCing Linus to show him that 2.6.39-rcX isn't as "calm"
> > >>> to everyone, but then chose to CC Maciej instead: Would you be so kind
> > >>> and add this to your regression list? Thanks!
> > >>>
> > >>> Since commit 38f1cff
> > >>>
> > >>> From: Dave Airlie airl...@redhat.com
> > >>> Date: Wed, 16 Mar 2011 11:34:41 +1000
> > >>> Subject: [PATCH] Merge commit '5359533801e3dd3abca5b7d3d985b0b33fd9fe8b' into dr
> > >>>
> > >>> This commit changed an internal radeon structure, that meant a new driver
> > >>> in -next had to be fixed up, merge in the commit and fix up the driver.
> > >>>
> > >>> Also fixes a trivial nouveau merge.
> > >>>
> > >>> Conflicts:
> > >>> drivers/gpu/drm/nouveau/nouveau_mem.c
> > >>>
> > >>> booting my atom/NM10/ION2 system crashes hard during boot, right after
> > >>> blanking the screen, and before the initramfs gets loaded. I just
> > >>> re-checked: both parent commits ( 5359533 and 4819d2e ) do indeed work
> > >>> just fine, but the merge commit ( 38f1cff ) fails, same as tip ( 85f2e68 ).
> > >> Can you activate netconsole and check whether kernel spits anything
> > >> interesting?
> > >> You might try to load nouveau module after boot - maybe something will
> > >> be saved to /var/log or you could even ssh into the box and check dmesg...
> > > Compiling it as a module seems to work fine. When I do so, no regression is
> > > obvious from what gets reported in "dmesg". However, somehow I now do get
> > > some output: The last message I see is
> > >
> > > [drm] nouveau :01:00.0: allocated 1680x1050, fb 0x40 b0 <some pointer value>
> > >
> > > Then, nothing more. However, it really is quite strange why this error only
> > > appears in the CONFIG_NOUVEAU=y case, not in the =m case...
> > Try disabling CONFIG_BOOT_LOGO. I reported on freedesktop.org that it is
> > causing me an oops at boot, but my bug has been ignored there so far -
> > perhaps I should have posted it here instead.
>
> I'm getting the exact same symptoms on my Atom + ION hardware. Crashes before it
> can write any logs if it's compiled in and the logo is selected, but boots fine
> if compiled as a module or the logo is removed.
>
> In my case I bisected and found 8969960 by Nick Piggin (change to mm/vmalloc.c)
> to be the first bad one in 2.6.38+. This makes me think that it's not a bug in
> nouveau, but maybe a bug in the order that things are initialized?

FWIW, reverting commit 89699605fe7cfd8611900346f61cb6cbf179b10a on 2.6.39-rc3+
makes my system boot just fine with the nouveau drivers compiled into the
kernel. I've seen some similar looking bugs on LKML that this regression may or
may not be related to? It works fine on 2.6.38.

https://bugzilla.kernel.org/show_bug.cgi?id=33272
http://lkml.org/lkml/2011/4/15/194

I'm still trying to figure out exactly where the kernel is crashing after
printing
[drm] nouveau :03:00.0: allocated 1280x1024 fb: 0x4000, b0 f4cf7600

Any thoughts on what else I should look for?
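For anyone who wants to check whether their own build matches the combination reported above (nouveau built in, boot logo enabled), here is a minimal sketch that scans a kernel .config. It is only an illustration, not something posted in the thread. It assumes the upstream Kconfig symbols CONFIG_DRM_NOUVEAU and CONFIG_LOGO are what the messages call "CONFIG_NOUVEAU" and "CONFIG_BOOT_LOGO"; the function names are made up for this example.

```python
import re


def parse_kconfig(text):
    """Map each CONFIG_* symbol in a .config to its value.

    'CONFIG_X=value' lines keep their value (surrounding quotes removed);
    '# CONFIG_X is not set' lines are recorded as 'n'.
    """
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if match:
            options[match.group(1)] = match.group(2).strip('"')
            continue
        match = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if match:
            options[match.group(1)] = "n"
    return options


def hits_reported_combination(options):
    """True when nouveau is built in (=y) and the boot logo is enabled.

    Assumption: CONFIG_DRM_NOUVEAU and CONFIG_LOGO are the relevant symbols.
    A modular nouveau (=m) or a disabled logo is the case reported to boot.
    """
    nouveau_builtin = options.get("CONFIG_DRM_NOUVEAU", "n") == "y"
    logo_enabled = options.get("CONFIG_LOGO", "n") == "y"
    return nouveau_builtin and logo_enabled
```

Usage would be along the lines of hits_reported_combination(parse_kconfig(open(".config").read())). A True result only means the build matches the configuration described in this thread, not that it will crash.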
Re: 2.6.39-rc1 nouveau regression (bisected)
On Sun, Apr 17, 2011 at 05:45:57PM +0200, Marcin Slusarz wrote:
> On Sun, Apr 17, 2011 at 11:12:04AM -0400, Kyle Spaans wrote:
> > On Sat, Apr 16, 2011 at 07:50:28PM -0400, Kyle Spaans wrote:
> > > On Sun, Apr 17, 2011 at 08:12:35AM +1000, Nigel Cunningham wrote:
> > > > On 15/04/11 16:11, Dominik Brodowski wrote:
> > > > > On Thu, Apr 14, 2011 at 09:02:01PM +0200, Marcin Slusarz wrote:
> > > > >> On Thu, Apr 14, 2011 at 07:05:59PM +0200, Dominik Brodowski wrote:
> > > > >>> Thought about CCing Linus to show him that 2.6.39-rcX isn't as "calm"
> > > > >>> to everyone, but then chose to CC Maciej instead: Would you be so kind
> > > > >>> and add this to your regression list? Thanks!
> > > > >>>
> > > > >>> Since commit 38f1cff
> > > > >>>
> > > > >>> From: Dave Airlie airl...@redhat.com
> > > > >>> Date: Wed, 16 Mar 2011 11:34:41 +1000
> > > > >>> Subject: [PATCH] Merge commit '5359533801e3dd3abca5b7d3d985b0b33fd9fe8b' into dr
> > > > >>>
> > > > >>> This commit changed an internal radeon structure, that meant a new driver
> > > > >>> in -next had to be fixed up, merge in the commit and fix up the driver.
> > > > >>>
> > > > >>> Also fixes a trivial nouveau merge.
> > > > >>>
> > > > >>> Conflicts:
> > > > >>> drivers/gpu/drm/nouveau/nouveau_mem.c
> > > > >>>
> > > > >>> booting my atom/NM10/ION2 system crashes hard during boot, right after
> > > > >>> blanking the screen, and before the initramfs gets loaded. I just
> > > > >>> re-checked: both parent commits ( 5359533 and 4819d2e ) do indeed work
> > > > >>> just fine, but the merge commit ( 38f1cff ) fails, same as tip ( 85f2e68 ).
> > > > >> Can you activate netconsole and check whether kernel spits anything
> > > > >> interesting?
> > > > >> You might try to load nouveau module after boot - maybe something will
> > > > >> be saved to /var/log or you could even ssh into the box and check dmesg...
> > > > > Compiling it as a module seems to work fine. When I do so, no regression is
> > > > > obvious from what gets reported in "dmesg". However, somehow I now do get
> > > > > some output: The last message I see is
> > > > >
> > > > > [drm] nouveau :01:00.0: allocated 1680x1050, fb 0x40 b0 <some pointer value>
> > > > >
> > > > > Then, nothing more. However, it really is quite strange why this error only
> > > > > appears in the CONFIG_NOUVEAU=y case, not in the =m case...
> > > > Try disabling CONFIG_BOOT_LOGO. I reported on freedesktop.org that it is
> > > > causing me an oops at boot, but my bug has been ignored there so far -
> > > > perhaps I should have posted it here instead.
> > >
> > > I'm getting the exact same symptoms on my Atom + ION hardware. Crashes before it
> > > can write any logs if it's compiled in and the logo is selected, but boots fine
> > > if compiled as a module or the logo is removed.
> > >
> > > In my case I bisected and found 8969960 by Nick Piggin (change to mm/vmalloc.c)
> > > to be the first bad one in 2.6.38+. This makes me think that it's not a bug in
> > > nouveau, but maybe a bug in the order that things are initialized?
> >
> > FWIW, reverting commit 89699605fe7cfd8611900346f61cb6cbf179b10a on 2.6.39-rc3+
> > makes my system boot just fine with the nouveau drivers compiled into the
> > kernel. I've seen some similar looking bugs on LKML that this regression may or
> > may not be related to? It works fine on 2.6.38.
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=33272
> > http://lkml.org/lkml/2011/4/15/194
> >
> > I'm still trying to figure out exactly where the kernel is crashing after
> > printing
> > [drm] nouveau :03:00.0: allocated 1280x1024 fb: 0x4000, b0 f4cf7600
> >
> > Any thoughts on what else I should look for?
>
> I reproduced this bug today, and reverting
> 89699605fe7cfd8611900346f61cb6cbf179b10a does not fix it for me.
>
> Here's the backtrace:
>
> Entering kdb (current=0x8801becb, pid 1) on processor 6 Oops: (null)
> due to oops @ 0x81255081
> CPU 6
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.39-rc2-nv+ #640 System manufacturer System Product Name/P6T SE
> RIP: 0010:[81255081] [81255081] iowrite32+0x12/0x34
> RSP: :8801becab4b0 EFLAGS: 00010296
> RAX: RBX: 8801bd334800 RCX: 16fc
> RDX: RSI: c900100bbf4c RDI: c900100bbf4c
> RBP: 8801becab4b0 R08: 0002 R09: 0001
> R10: 00bb R11: 8801becab540 R12: 8801bd336000
> R13: 8801bd334818 R14: 8801bd60 R15: 0020
> FS: () GS:8801bfd8() knlGS:
> CS: 0010 DS: ES: CR0: 8005003b
> CR2: c900100bbf4c CR3: 01a2b000 CR4: 06e0
> DR0: DR1: DR2:
> DR3: DR6: 0ff0 DR7: 0400
> Process swapper (pid: 1, threadinfo 8801becaa000, task 8801becb)
> Stack:
> 8801becab4c0 812f5bd5 8801becab4f0 8130f1f8 8801bd336000 c90012a0 8801becab620 8801becab590 8127b4c8
2.6.39-rc1 nouveau regression (bisected)
On Sun, Apr 17, 2011 at 08:12:35AM +1000, Nigel Cunningham wrote:
> On 15/04/11 16:11, Dominik Brodowski wrote:
> > On Thu, Apr 14, 2011 at 09:02:01PM +0200, Marcin Slusarz wrote:
> >> On Thu, Apr 14, 2011 at 07:05:59PM +0200, Dominik Brodowski wrote:
> >>> Thought about CCing Linus to show him that 2.6.39-rcX isn't as "calm"
> >>> to everyone, but then chose to CC Maciej instead: Would you be so kind and
> >>> add this to your regression list? Thanks!
> >>>
> >>> Since commit 38f1cff
> >>>
> >>> From: Dave Airlie
> >>> Date: Wed, 16 Mar 2011 11:34:41 +1000
> >>> Subject: [PATCH] Merge commit '5359533801e3dd3abca5b7d3d985b0b33fd9fe8b' into dr
> >>>
> >>> This commit changed an internal radeon structure, that meant a new driver
> >>> in -next had to be fixed up, merge in the commit and fix up the driver.
> >>>
> >>> Also fixes a trivial nouveau merge.
> >>>
> >>> Conflicts:
> >>> drivers/gpu/drm/nouveau/nouveau_mem.c
> >>>
> >>> booting my atom/NM10/ION2 system crashes hard during boot, right after
> >>> blanking the screen, and before the initramfs gets loaded. I just
> >>> re-checked: both parent commits ( 5359533 and 4819d2e ) do indeed work
> >>> just fine, but the merge commit ( 38f1cff ) fails, same as tip ( 85f2e68 ).
> >> Can you activate netconsole and check whether kernel spits anything
> >> interesting?
> >> You might try to load nouveau module after boot - maybe something will be
> >> saved to /var/log or you could even ssh into the box and check dmesg...
> > Compiling it as a module seems to work fine. When I do so, no regression is
> > obvious from what gets reported in "dmesg". However, somehow I now do get
> > some output: The last message I see is
> >
> > [drm] nouveau :01:00.0: allocated 1680x1050, fb 0x40 b0 <some pointer value>
> >
> > Then, nothing more. However, it really is quite strange why this error only
> > appears in the CONFIG_NOUVEAU=y case, not in the =m case...
> Try disabling CONFIG_BOOT_LOGO. I reported on freedesktop.org that it is
> causing me an oops at boot, but my bug has been ignored there so far -
> perhaps I should have posted it here instead.

I'm getting the exact same symptoms on my Atom + ION hardware. Crashes before it
can write any logs if it's compiled in and the logo is selected, but boots fine
if compiled as a module or the logo is removed.

In my case I bisected and found 8969960 by Nick Piggin (change to mm/vmalloc.c)
to be the first bad one in 2.6.38+. This makes me think that it's not a bug in
nouveau, but maybe a bug in the order that things are initialized?

Config below. (It boots as is, but if you enable CONFIG_LOGO it won't.)

#
# Automatically generated make config: don't edit
# Linux/x86 2.6.39-rc3 Kernel Configuration
# Sat Apr 16 19:39:23 2011
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
# CONFIG_X86_64 is not set
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf32-i386"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig"
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
# CONFIG_NEED_DMA_MAP_STATE is not set
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
# CONFIG_GENERIC_TIME_VSYSCALL is not set
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_DEFAULT_IDLE=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
# CONFIG_HAVE_CPUMASK_OF_CPU_MAP is not set
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_ZONE_DMA32 is not set
CONFIG_ARCH_POPULATES_NODE_MAP=y
# CONFIG_AUDIT_ARCH is not set
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_32_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_32_LAZY_GS=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-ecx -fcall-saved-edx"
CONFIG_KTIME_SCALAR=y
CONFIG_ARCH_CPU_PROBE_RELEASE=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_CONSTRUCTORS=y
CONFIG_HAVE_IRQ_WORK=y
CONFIG_IRQ_WORK=y

#
# General setup
#
# CONFIG_EXPERIMENTAL is not set
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3