[Nouveau] 3.8-rc2: EFI framebuffer lock inversion...
On 3.8-rc2 with lockdep enabled and a dual-GPU setup (Macbook Pro Retina), I see two related lock inversion issues with the EFI framebuffer, leading to possible deadlock: when X takes over from the EFI framebuffer [1] and when nouveau releases the framebuffer when being vgaswitcherood [2].

Let me know if you'd like any testing or analysis when I can get the time.

Many thanks,
  Daniel

--- [1]

init: lightdm main process (950) terminated with status 1

== [ INFO: possible circular locking dependency detected ]
3.8.0-rc2-expert #1 Not tainted
---
Xorg/1193 is trying to acquire lock:
 ((fb_notifier_list).rwsem){.+}, at: [810697c1] __blocking_notifier_call_chain+0x51/0xc0

but task is already holding lock:
 (console_lock){+.+.+.}, at: [81263f95] do_fb_ioctl+0x2e5/0x5f0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

- #1 (console_lock){+.+.+.}:
 [81090a61] __lock_acquire+0x3a1/0xb60
 [810916ea] lock_acquire+0x5a/0x70
 [810407a7] console_lock+0x77/0x80
 [812c6d84] register_con_driver+0x34/0x140
 [812c84e9] take_over_console+0x29/0x60
 [8126e76b] fbcon_takeover+0x5b/0xb0
 [81272bb5] fbcon_event_notify+0x715/0x820
 [810693a5] notifier_call_chain+0x55/0x110
 [810697d7] __blocking_notifier_call_chain+0x67/0xc0
 [81069841] blocking_notifier_call_chain+0x11/0x20
 [81262a16] fb_notifier_call_chain+0x16/0x20
 [81264c1d] register_framebuffer+0x1bd/0x2f0
 [81ac2bd4] efifb_probe+0x40f/0x496
 [81308dfe] platform_drv_probe+0x3e/0x70
 [81306dc6] driver_probe_device+0x76/0x240
 [81307033] __driver_attach+0xa3/0xb0
 [8130503d] bus_for_each_dev+0x4d/0x90
 [81306929] driver_attach+0x19/0x20
 [813064e0] bus_add_driver+0x1a0/0x270
 [813076c2] driver_register+0x72/0x170
 [81308671] platform_driver_register+0x41/0x50
 [81308696] platform_driver_probe+0x16/0xa0
 [81ac2ece] efifb_init+0x273/0x292
 [810002da] do_one_initcall+0x11a/0x170
 [8154187c] kernel_init+0x11c/0x290
 [8155acac] ret_from_fork+0x7c/0xb0

- #0 ((fb_notifier_list).rwsem){.+}:
 [8108ff10] validate_chain.isra.33+0x1000/0x10d0
 [81090a61] __lock_acquire+0x3a1/0xb60
 [810916ea] lock_acquire+0x5a/0x70
 [81557ad7] down_read+0x47/0x5c
 [810697c1] __blocking_notifier_call_chain+0x51/0xc0
 [81069841] blocking_notifier_call_chain+0x11/0x20
 [81262a16] fb_notifier_call_chain+0x16/0x20
 [81263196] fb_blank+0x36/0xc0
 [81263fa7] do_fb_ioctl+0x2f7/0x5f0
 [812646e1] fb_ioctl+0x41/0x50
 [811209d7] do_vfs_ioctl+0x97/0x580
 [81120f0b] sys_ioctl+0x4b/0x90
 [8155ad56] system_call_fastpath+0x1a/0x1f

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                            CPU1
  lock(console_lock);
                                       lock((fb_notifier_list).rwsem);
                                       lock(console_lock);
  lock((fb_notifier_list).rwsem);

 *** DEADLOCK ***

2 locks held by Xorg/1193:
 #0: (fb_info-lock){+.+.+.}, at: [81262ef1] lock_fb_info+0x21/0x60
 #1: (console_lock){+.+.+.}, at: [81263f95] do_fb_ioctl+0x2e5/0x5f0

stack backtrace:
Pid: 1193, comm: Xorg Not tainted 3.8.0-rc2-expert #1
Call Trace:
 [8154f6c6] print_circular_bug+0x28e/0x29f
 [8108ff10] validate_chain.isra.33+0x1000/0x10d0
 [81090a61] __lock_acquire+0x3a1/0xb60
 [8108d3a4] ? __lock_is_held+0x54/0x80
 [810916ea] lock_acquire+0x5a/0x70
 [810697c1] ? __blocking_notifier_call_chain+0x51/0xc0
 [81557ad7] down_read+0x47/0x5c
 [810697c1] ? __blocking_notifier_call_chain+0x51/0xc0
 [810697c1] __blocking_notifier_call_chain+0x51/0xc0
 [81069841] blocking_notifier_call_chain+0x11/0x20
 [81262a16] fb_notifier_call_chain+0x16/0x20
 [81263196] fb_blank+0x36/0xc0
 [81263fa7] do_fb_ioctl+0x2f7/0x5f0
 [810e8d1a] ? mmap_region+0x1aa/0x620
 [812646e1] fb_ioctl+0x41/0x50
 [811209d7] do_vfs_ioctl+0x97/0x580
 [8112c49a] ? fget_light+0x3da/0x4d0
 [8155ad7b] ? sysret_check+0x1b/0x56
 [81120f0b] sys_ioctl+0x4b/0x90
 [8122c03e] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [8155ad56] system_call_fastpath+0x1a/0x1f
[drm] Enabling RC6 states: RC6 on, RC6p on, RC6pp off

--- [2]

hda-intel :01:00.1: Disabling via VGA-switcheroo
hda-intel :01:00.1: Cannot lock devices!
VGA switcheroo: switched nouveau off
nouveau [ DRM] suspending fbcon...
== [ INFO: possible circular locking dependency detected
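For readers unfamiliar with this kind of report, the check behind it can be sketched roughly as follows. This is a toy model in Python, not the kernel's lockdep code: the class `LockOrderGraph` and its method names are made up for illustration, and the rwsem is treated as a plain lock. It records "lock A was held while taking lock B" edges and flags an acquisition that would close a cycle, which is the situation in trace [1].

```python
class LockOrderGraph:
    """Toy lock-order tracker (hypothetical, for illustration only)."""

    def __init__(self):
        # Maps a lock name to the set of locks taken while it was held.
        self.edges = {}

    def _reachable(self, src, dst):
        # Depth-first search: can dst be reached from src along recorded edges?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, ()))
        return False

    def acquire(self, held, new):
        """Record 'held -> new'; return True if this ordering closes a cycle."""
        inversion = self._reachable(new, held)
        self.edges.setdefault(held, set()).add(new)
        return inversion


graph = LockOrderGraph()

# Boot path from the trace: register_framebuffer() runs the notifier chain
# (rwsem held), and fbcon then takes console_lock inside it.
boot = graph.acquire("fb_notifier_list.rwsem", "console_lock")

# ioctl path from the trace: do_fb_ioctl() holds console_lock, and
# fb_blank() then runs the notifier chain, taking the rwsem.
ioctl = graph.acquire("console_lock", "fb_notifier_list.rwsem")

print(boot, ioctl)
```

The second call reports the inversion because the opposite ordering was already recorded at boot, even though no deadlock actually happened on this run.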
Re: [Nouveau] Resume regression with nouveau 3.8rc1 (bisected)
On Wed, Jan 02, 2013 at 04:19:35PM +0100, Pontus Fuchs wrote:
> Hi,
>
> Starting with 3.8rc1 I get a black screen when resuming after suspend.
> The kernel is alive because I can switch to VT1 and reboot with
> ctrl-alt-delete.
>
> I bisected the problem down to this commit:
>
> 186ecad21: drm/nv50/disp: move remaining interrupt handling into core
>
> Hardware is 8400M GS (10de:0427) in a Dell XPS M1330.

There's already an open bug report for that: https://bugs.freedesktop.org/show_bug.cgi?id=58729

And my nv92 does not resume either, with similar symptoms, since the nouveau display rework hit the tree.

Marcin
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau
Re: [Nouveau] 3.8-rc2: EFI framebuffer lock inversion...
On 3 January 2013 21:11, Alan Cox a...@lxorguk.ukuu.org.uk wrote:
> On Thu, 3 Jan 2013 20:56:30 +0800 Daniel J Blueman dan...@quora.org wrote:
>> On 3.8-rc2 with lockdep enabled and a dual-GPU setup (Macbook Pro Retina),
>> I see two related lock inversion issues with the EFI framebuffer,
>> leading to possible deadlock: when X takes over from the EFI
>> framebuffer [1] and when nouveau releases the framebuffer when being
>> vgaswitcherood [2].
>>
>> Let me know if you'd like any testing or analysis when I can get the time.
>
> The fb layer locking was broken. I posted patches early December which
> should have fixed the ones we know about. ('fb: Rework locking to fix
> lock ordering on takeover').

Superb work, Alan!

The only patch I could find [1] (mid Nov) looks like it needs other sites updating, since we now see an i915 vs efifb lock ordering issue [2].

I can get some time next week to take a look if it helps.

Thanks,
  Daniel

--- [1]

https://patchwork.kernel.org/patch/1757061/

--- [2]

[drm] Memory usable by graphics device = 2048M
checking generic (b000 144) vs hw (b000 1000)
fb: conflicting fb hw usage inteldrmfb vs EFI VGA - removing generic driver

== [ INFO: possible circular locking dependency detected ]
3.8.0-rc2-expert+ #2 Not tainted
---
modprobe/603 is trying to acquire lock:
 (console_lock){+.+.+.}, at: [812c869f] unbind_con_driver+0x3f/0x200

but task is already holding lock:
 ((fb_notifier_list).rwsem){.+}, at: [810697c1] __blocking_notifier_call_chain+0x51/0xc0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

- #1 ((fb_notifier_list).rwsem){.+}:
 [81090a61] __lock_acquire+0x3a1/0xb60
 [810916ea] lock_acquire+0x5a/0x70
 [81557c97] down_read+0x47/0x5c
 [810697c1] __blocking_notifier_call_chain+0x51/0xc0
 [81069841] blocking_notifier_call_chain+0x11/0x20
 [81262a16] fb_notifier_call_chain+0x16/0x20
 [81264c20] register_framebuffer+0x1c0/0x300
 [81ac2bd4] efifb_probe+0x40f/0x496
 [81308fbe] platform_drv_probe+0x3e/0x70
 [81306f86] driver_probe_device+0x76/0x240
 [813071f3] __driver_attach+0xa3/0xb0
 [813051fd] bus_for_each_dev+0x4d/0x90
 [81306ae9] driver_attach+0x19/0x20
 [813066a0] bus_add_driver+0x1a0/0x270
 [81307882] driver_register+0x72/0x170
 [81308831] platform_driver_register+0x41/0x50
 [81308856] platform_driver_probe+0x16/0xa0
 [81ac2ece] efifb_init+0x273/0x292
 [810002da] do_one_initcall+0x11a/0x170
 [81541a3c] kernel_init+0x11c/0x290
 [8155ae6c] ret_from_fork+0x7c/0xb0

- #0 (console_lock){+.+.+.}:
 [8108ff10] validate_chain.isra.33+0x1000/0x10d0
 [81090a61] __lock_acquire+0x3a1/0xb60
 [810916ea] lock_acquire+0x5a/0x70
 [810407a7] console_lock+0x77/0x80
 [812c869f] unbind_con_driver+0x3f/0x200
 [81272bc7] fbcon_event_notify+0x447/0x8b0
 [810693a5] notifier_call_chain+0x55/0x110
 [810697d7] __blocking_notifier_call_chain+0x67/0xc0
 [81069841] blocking_notifier_call_chain+0x11/0x20
 [81262a16] fb_notifier_call_chain+0x16/0x20
 [812647db] do_unregister_framebuffer+0x5b/0x110
 [81264a28] do_remove_conflicting_framebuffers+0x158/0x190
 [81264d9a] remove_conflicting_framebuffers+0x3a/0x60
 [a007dbe4] i915_driver_load+0x7d4/0xe70 [i915]
 [812ee1ee] drm_get_pci_dev+0x17e/0x2b0
 [a0079616] i915_pci_probe+0x36/0x90 [i915]
 [8124a146] local_pci_probe+0x46/0x80
 [8124a9d1] pci_device_probe+0x101/0x110
 [81306f86] driver_probe_device+0x76/0x240
 [813071f3] __driver_attach+0xa3/0xb0
 [813051fd] bus_for_each_dev+0x4d/0x90
 [81306ae9] driver_attach+0x19/0x20
 [813066a0] bus_add_driver+0x1a0/0x270
 [81307882] driver_register+0x72/0x170
 [8124aacf] __pci_register_driver+0x5f/0x70
 [812ee435] drm_pci_init+0x115/0x130
 [a00ff066] i915_init+0x66/0x68 [i915]
 [810002da] do_one_initcall+0x11a/0x170
 [8109cf84] load_module+0xfd4/0x13c0
 [8109d427] sys_init_module+0xb7/0xe0
 [8155af16] system_call_fastpath+0x1a/0x1f

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                            CPU1
  lock((fb_notifier_list).rwsem);
                                       lock(console_lock);
                                       lock((fb_notifier_list).rwsem);
  lock(console_lock);

 *** DEADLOCK ***

6 locks held by modprobe/603:
 #0: (__lockdep_no_validate__){..}, at: [813071a3] __driver_attach+0x53/0xb0
 #1: (__lockdep_no_validate__){..}, at:
[Nouveau] [Bug 58776] DRM NOUVEAU: cannot boot with kernel =3.7
https://bugs.freedesktop.org/show_bug.cgi?id=58776

--- Comment #7 from gabriele balducci baldu...@units.it ---
Thanks a lot.

1) I cannot find (or I am not able to recognize) any option related to AGP/IOMMU in my BIOS menu. Following the advice in http://www.kernel.org/doc/Documentation/x86/x86_64/boot-options.txt, I have booted with iommu=soft, which apparently makes the IOMMU complaint go away from dmesg, but I don't know if this is really a solution... In any case, booting with (only) iommu=soft does not make any difference, i.e. the machine freezes as above.

2) The nouveau.vram_pushbuf=1 option allows me to boot fine (with or without the iommu=soft option above)! However, now X11 has problems. After starting X11, text in menus (e.g. emacs, firefox) and in window borders is made of unreadable black rectangles; text in windows (e.g. xterm text lines, text in emacs buffers etc.) is fine, though. This problem goes away if I boot into X11 with Option Accel Off, i.e. if I switch acceleration off, everything works nicely (but, of course, I have no acceleration).

I have attached dmesg for kernel 3.6.10 (which works perfectly), the new dmesg for kernel 3.7.1 with iommu=soft and nouveau.vram_pushbuf=1, my xorg.conf and xorg.0.log for kernel 3.6.10 and 3.7.1 (with/without accel).

I'll be happy to send any other information which might help to clarify this problem.

Thank you very much again,
ciao
gabriele

--
You are receiving this mail because:
You are the assignee for the bug.
[Nouveau] [Bug 58776] DRM NOUVEAU: cannot boot with kernel =3.7
https://bugs.freedesktop.org/show_bug.cgi?id=58776

--- Comment #8 from gabriele balducci baldu...@units.it ---
Created attachment 72452
  --> https://bugs.freedesktop.org/attachment.cgi?id=72452&action=edit
dmesg for kernel-3.6.10
[Nouveau] [Bug 58776] DRM NOUVEAU: cannot boot with kernel =3.7
https://bugs.freedesktop.org/show_bug.cgi?id=58776

--- Comment #9 from gabriele balducci baldu...@units.it ---
Created attachment 72453
  --> https://bugs.freedesktop.org/attachment.cgi?id=72453&action=edit
dmesg for kernel-3.7.1 with iommu=soft nouveau.vram_pushbuf=1
[Nouveau] [Bug 58776] DRM NOUVEAU: cannot boot with kernel =3.7
https://bugs.freedesktop.org/show_bug.cgi?id=58776

--- Comment #10 from gabriele balducci baldu...@units.it ---
Created attachment 72454
  --> https://bugs.freedesktop.org/attachment.cgi?id=72454&action=edit
xorg.conf
[Nouveau] [Bug 58984] New: DRM NOUVEAU: probe of 0001:01:00.0 failed with error -12
https://bugs.freedesktop.org/show_bug.cgi?id=58984

Priority: medium
Bug ID: 58984
Assignee: nouveau@lists.freedesktop.org
Summary: DRM NOUVEAU: probe of 0001:01:00.0 failed with error -12
QA Contact: xorg-t...@lists.x.org
Severity: normal
Classification: Unclassified
OS: Linux (All)
Reporter: baggett.patr...@gmail.com
Hardware: SPARC
Status: NEW
Version: git
Component: Driver/nouveau
Product: xorg

When booting a Sun Blade 2500 with a GeForce 8400 GS PCI, everything appears pretty normal, but for no apparent reason the probe fails. It looks like error -12 is out of memory, but bug 56721, which looks similar, does not apply. My machine also has 6GB of memory, so I'm sure it isn't actually failing to allocate system memory.

Running Linux 3.8-rc1 git and nouveau git from last night (Jan 2 2013).
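As a quick way to confirm the reading of "error -12" above, the errno number can be decoded with the Python standard library. This is only a convenience sketch: the helper name `describe_kernel_error` is made up, and errno numbering is platform-specific, so it should be run on a Linux machine to match what the kernel reports.

```python
import errno
import os


def describe_kernel_error(code):
    """Map a negative kernel return value such as -12 to (errno name, message)."""
    number = abs(code)
    # errno.errorcode maps numbers to symbolic names on the current platform.
    name = errno.errorcode.get(number, "unknown")
    return name, os.strerror(number)


name, text = describe_kernel_error(-12)
print(name, "-", text)
```

On Linux this prints the ENOMEM name, which matches the "out of memory" interpretation. It says nothing about which allocation failed, so the probe may be failing on something other than system RAM.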
[Nouveau] [Bug 58776] DRM NOUVEAU: cannot boot with kernel =3.7
https://bugs.freedesktop.org/show_bug.cgi?id=58776

--- Comment #11 from gabriele balducci baldu...@units.it ---
Created attachment 72455
  --> https://bugs.freedesktop.org/attachment.cgi?id=72455&action=edit
Xorg.0.log for kernel-3.6.10
[Nouveau] [Bug 58776] DRM NOUVEAU: cannot boot with kernel =3.7
https://bugs.freedesktop.org/show_bug.cgi?id=58776

--- Comment #12 from gabriele balducci baldu...@units.it ---
Created attachment 72456
  --> https://bugs.freedesktop.org/attachment.cgi?id=72456&action=edit
Xorg.0.log for kernel-3.7.1 without acceleration
[Nouveau] [Bug 58776] DRM NOUVEAU: cannot boot with kernel =3.7
https://bugs.freedesktop.org/show_bug.cgi?id=58776

--- Comment #13 from gabriele balducci baldu...@units.it ---
Created attachment 72457
  --> https://bugs.freedesktop.org/attachment.cgi?id=72457&action=edit
Xorg.0.log with acceleration
[Nouveau] [Bug 58776] DRM NOUVEAU: cannot boot with kernel =3.7
https://bugs.freedesktop.org/show_bug.cgi?id=58776

--- Comment #14 from Marcin Slusarz marcin.slus...@gmail.com ---
(For some reason the GPU reads garbage from GART. With vram_pushbuf=1 we moved the main push buffer from GART to VRAM, so it at least starts. But we really need GART.)

Can you bisect it? To speed it up you can use drivers/gpu/drm/nouveau/ as the bisect path.
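Why a bisect path speeds things up can be shown with a small model. With git this is done by passing the path after `--` to `git bisect start`; the Python below is only a toy illustration of the effect, not git's algorithm. The function `first_bad_commit`, the commit ids, and the file names are all hypothetical, and the sketch assumes the regression was introduced by a commit that touches the given path and that the newest candidate is known to be bad.

```python
def first_bad_commit(commits, is_bad, path_prefix=None):
    """Binary-search an ordered history (oldest first) for the first bad commit.

    commits: list of (commit_id, [touched paths]).
    is_bad: callable that builds/tests one commit and returns True if it fails.
    path_prefix: if given, only commits touching that path are considered.
    Returns (commit_id, number_of_tests_run).
    """
    candidates = [
        cid for cid, paths in commits
        if path_prefix is None or any(p.startswith(path_prefix) for p in paths)
    ]
    tests = 0
    lo, hi = 0, len(candidates) - 1  # assumption: candidates[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(candidates[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return candidates[lo], tests


# Hypothetical history: 64 commits, every 8th one touches the nouveau driver.
history = [
    (f"c{i}", ["drivers/gpu/drm/nouveau/x.c"] if i % 8 == 0 else ["fs/y.c"])
    for i in range(64)
]
is_bad = lambda cid: int(cid[1:]) >= 40  # pretend c40 introduced the regression

full = first_bad_commit(history, is_bad)
limited = first_bad_commit(history, is_bad, "drivers/gpu/drm/nouveau/")
print(full, limited)
```

Both searches land on the same commit, but the path-limited one needs fewer build-and-boot cycles, which is the expensive step when each test means rebooting into a kernel that may freeze.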
[Nouveau] [Bug 58984] DRM NOUVEAU: probe of 0001:01:00.0 failed with error -12
https://bugs.freedesktop.org/show_bug.cgi?id=58984

--- Comment #1 from Patrick Baggett baggett.patr...@gmail.com ---
Doh, dmesg.log didn't attach. I'll get that later today.
Re: [Nouveau] 3.8-rc2: EFI framebuffer lock inversion...
On Thu, 3 Jan 2013 20:56:30 +0800 Daniel J Blueman dan...@quora.org wrote:
> On 3.8-rc2 with lockdep enabled and a dual-GPU setup (Macbook Pro Retina),
> I see two related lock inversion issues with the EFI framebuffer,
> leading to possible deadlock: when X takes over from the EFI
> framebuffer [1] and when nouveau releases the framebuffer when being
> vgaswitcherood [2].
>
> Let me know if you'd like any testing or analysis when I can get the time.

The fb layer locking was broken. I posted patches in early December which should have fixed the ones we know about ('fb: Rework locking to fix lock ordering on takeover').

Alan
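One generic way to remove a lock ordering problem of this kind is to make every code path take the two locks in the same fixed order. The Python sketch below illustrates that idea only; it is not a description of what the 'fb: Rework locking' patch does. The names `take`, `RANK`, `takeover_path` and `ioctl_path` are invented for the example, and the notifier rwsem is modelled as a plain mutex.

```python
import threading
from contextlib import ExitStack, contextmanager

console_lock = threading.Lock()
notifier_lock = threading.Lock()  # stands in for (fb_notifier_list).rwsem

# Hypothetical global ranking: a lower rank is always acquired first.
RANK = {id(console_lock): 0, id(notifier_lock): 1}


@contextmanager
def take(*locks):
    """Acquire the given locks in rank order, release them in reverse."""
    with ExitStack() as stack:
        for lock in sorted(locks, key=lambda l: RANK[id(l)]):
            stack.enter_context(lock)
        yield


events = []


def takeover_path():
    # Caller lists the locks in "notifier first" order, but take() reorders.
    with take(notifier_lock, console_lock):
        events.append("takeover")


def ioctl_path():
    with take(console_lock, notifier_lock):
        events.append("blank")


threads = [
    threading.Thread(target=fn)
    for fn in (takeover_path, ioctl_path)
    for _ in range(100)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(events))
```

Because both paths end up acquiring console_lock before the notifier lock, neither can hold one lock while waiting for a thread that holds the other, so all threads finish.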
Re: [Nouveau] 3.8-rc2: EFI framebuffer lock inversion...
> The only patch I could find [1] (mid Nov) looks like it needs other
> sites updating, since we now see an i915 vs efifb lock ordering issue [2].
>
> I can get some time next week to take a look if it helps.

That would be great. I've not got any EFI afflicted hardware and I'm doing my best to avoid it.

Alan
[Nouveau] [Bug 58729] [bisected] Display fails to turn on after suspend/resume (NV86)
https://bugs.freedesktop.org/show_bug.cgi?id=58729

--- Comment #2 from Marcin Slusarz marcin.slus...@gmail.com ---
Created attachment 72473
  --> https://bugs.freedesktop.org/attachment.cgi?id=72473&action=edit
fix
Re: [Nouveau] Resume regression with nouveau 3.8rc1 (bisected)
On Thu, Jan 03, 2013 at 01:58:10PM +0100, Marcin Slusarz wrote:
> On Wed, Jan 02, 2013 at 04:19:35PM +0100, Pontus Fuchs wrote:
>> Hi,
>>
>> Starting with 3.8rc1 I get a black screen when resuming after suspend.
>> The kernel is alive because I can switch to VT1 and reboot with
>> ctrl-alt-delete.
>>
>> I bisected the problem down to this commit:
>>
>> 186ecad21: drm/nv50/disp: move remaining interrupt handling into core
>>
>> Hardware is 8400M GS (10de:0427) in a Dell XPS M1330.
>
> There's already an open bug report for that: https://bugs.freedesktop.org/show_bug.cgi?id=58729

Yay, a bug fix was just posted there ;).

Marcin
Re: [Nouveau] 3.8-rc2: EFI framebuffer lock inversion...
On Thu, Jan 3, 2013 at 5:11 AM, Alan Cox a...@lxorguk.ukuu.org.uk wrote:
> The fb layer locking was broken. I posted patches early December which
> should have fixed the ones we know about. ('fb: Rework locking to fix
> lock ordering on takeover').

That patch causes compile errors with allmodconfig:

ERROR: do_take_over_console [drivers/video/console/fbcon.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
make: *** Waiting for unfinished jobs

Hmm?

  Linus
Re: [Nouveau] 3.8-rc2: EFI framebuffer lock inversion...
On Thu, 3 Jan 2013 11:40:47 -0800 Linus Torvalds torva...@linux-foundation.org wrote:
> On Thu, Jan 3, 2013 at 5:11 AM, Alan Cox a...@lxorguk.ukuu.org.uk wrote:
>> The fb layer locking was broken. I posted patches early December which
>> should have fixed the ones we know about. ('fb: Rework locking to fix
>> lock ordering on takeover').
>
> That patch causes compile errors with allmodconfig:
>
> ERROR: do_take_over_console [drivers/video/console/fbcon.ko] undefined!
> make[1]: *** [__modpost] Error 1
> make: *** [modules] Error 2
> make: *** Waiting for unfinished jobs
>
> Hmm?

I have a couple of fixes against fb-rework-locking-to-fix-lock-ordering-on-takeover.patch:

http://ozlabs.org/~akpm/mmots/broken-out/fb-rework-locking-to-fix-lock-ordering-on-takeover-fix.patch
http://ozlabs.org/~akpm/mmots/broken-out/fb-rework-locking-to-fix-lock-ordering-on-takeover-fix-2.patch

Florian has been busy for a month or two - I've been waiting for him to reappear to consider this patch.
[Nouveau] [Bug 58984] DRM NOUVEAU: probe of 0001:01:00.0 failed with error -12
https://bugs.freedesktop.org/show_bug.cgi?id=58984

--- Comment #2 from Patrick Baggett baggett.patr...@gmail.com ---
Created attachment 72491
  --> https://bugs.freedesktop.org/attachment.cgi?id=72491&action=edit
dmesg output
[Nouveau] [Bug 41114] nouveau module crashes on boot
https://bugs.freedesktop.org/show_bug.cgi?id=41114

--- Comment #16 from Robert Riches rm.ric...@jacob21819.net ---
With multiple duplicate reports and apparently multiple reporters, shouldn't this have a status of 'confirmed' rather than 'new'?