On Mon, Jun 6, 2016 at 6:25 PM, Robin Murphy <robin.mur...@arm.com> wrote:
> On 06/06/16 08:11, Alexandre Courbot wrote:
>>
>> From: Robin Murphy <robin.mur...@arm.com>
>>
>> This reverts commit 1733a2ad36741b1812cf8b3f3037c28d0af53f50.
>>
>> There is apparently something amiss with the way the TTM code handles
>> DMA buffers, which the above commit was attempting to work around for
>> arm64 systems with non-coherent PCI. Unfortunately, this completely
>> breaks systems *with* coherent PCI (which appear to be the majority).
>>
>> Booting a plain arm64 defconfig + CONFIG_DRM + CONFIG_DRM_NOUVEAU on
>> a machine with a PCI GPU having coherent dma_map_ops (in this case a
>> 7600GT card plugged into an ARM Juno board) results in a fatal crash:
>>
>> [ 2.803438] nouveau 0000:06:00.0: DRM: allocated 1024x768 fb: 0x9000, bo ffffffc976141c00
>> [ 2.897662] Unable to handle kernel NULL pointer dereference at virtual address 000001ac
>> [ 2.897666] pgd = ffffff8008e00000
>> [ 2.897675] [000001ac] *pgd=00000009ffffe003, *pud=00000009ffffe003, *pmd=0000000000000000
>> [ 2.897680] Internal error: Oops: 96000045 [#1] PREEMPT SMP
>> [ 2.897685] Modules linked in:
>> [ 2.897692] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc5+ #543
>> [ 2.897694] Hardware name: ARM Juno development board (r1) (DT)
>> [ 2.897699] task: ffffffc9768a0000 ti: ffffffc9768a8000 task.ti: ffffffc9768a8000
>> [ 2.897711] PC is at __memcpy+0x7c/0x180
>> [ 2.897719] LR is at OUT_RINGp+0x34/0x70
>> [ 2.897724] pc : [<ffffff80083465fc>] lr : [<ffffff800854248c>] pstate: 80000045
>> [ 2.897726] sp : ffffffc9768ab360
>> [ 2.897732] x29: ffffffc9768ab360 x28: 0000000000000001
>> [ 2.897738] x27: ffffffc97624c000 x26: 0000000000000000
>> [ 2.897744] x25: 0000000000000080 x24: 0000000000006c00
>> [ 2.897749] x23: 0000000000000005 x22: ffffffc97624c010
>> [ 2.897755] x21: 0000000000000004 x20: 0000000000000004
>> [ 2.897761] x19: ffffffc9763da000 x18: ffffffc976b2491c
>> [ 2.897766] x17: 0000000000000007 x16: 0000000000000006
>> [ 2.897771] x15: 0000000000000001 x14: 0000000000000001
>> [ 2.897777] x13: 0000000000e31b70 x12: ffffffc9768a0080
>> [ 2.897783] x11: 0000000000000000 x10: fffffffffffffb00
>> [ 2.897788] x9 : 0000000000000000 x8 : 0000000000000000
>> [ 2.897793] x7 : 0000000000000000 x6 : 00000000000001ac
>> [ 2.897799] x5 : 00000000ffffffff x4 : 0000000000000000
>> [ 2.897804] x3 : 0000000000000010 x2 : 0000000000000010
>> [ 2.897810] x1 : ffffffc97624c010 x0 : 00000000000001ac
>> ...
>> [ 2.898494] Call trace:
>> [ 2.898499] Exception stack(0xffffffc9768ab1a0 to 0xffffffc9768ab2c0)
>> [ 2.898506] b1a0: ffffffc9763da000 0000000000000004 ffffffc9768ab360 ffffff80083465fc
>> [ 2.898513] b1c0: ffffffc976801e00 ffffffc9762b8000 ffffffc9768ab1f0 ffffff80080ec158
>> [ 2.898520] b1e0: ffffffc9768ab230 ffffff8008496d04 ffffffc975ce6d80 ffffffc9768ab36e
>> [ 2.898527] b200: ffffffc9768ab36f ffffffc9768ab29d ffffffc9768ab29e ffffffc9768a0000
>> [ 2.898533] b220: ffffffc9768ab250 ffffff80080e70c0 ffffffc9768ab270 ffffff8008496e44
>> [ 2.898540] b240: 00000000000001ac ffffffc97624c010 0000000000000010 0000000000000010
>> [ 2.898546] b260: 0000000000000000 00000000ffffffff 00000000000001ac 0000000000000000
>> [ 2.898552] b280: 0000000000000000 0000000000000000 fffffffffffffb00 0000000000000000
>> [ 2.898558] b2a0: ffffffc9768a0080 0000000000e31b70 0000000000000001 0000000000000001
>> [ 2.898566] [<ffffff80083465fc>] __memcpy+0x7c/0x180
>> [ 2.898574] [<ffffff800853e164>] nv04_fbcon_imageblit+0x1d4/0x2e8
>> [ 2.898582] [<ffffff800853d6d0>] nouveau_fbcon_imageblit+0xd8/0xe0
>> [ 2.898591] [<ffffff80083c4db4>] soft_cursor+0x154/0x1d8
>> [ 2.898598] [<ffffff80083c47b4>] bit_cursor+0x4fc/0x538
>> [ 2.898605] [<ffffff80083c0cfc>] fbcon_cursor+0x134/0x1a8
>> [ 2.898613] [<ffffff800841c280>] hide_cursor+0x38/0xa0
>> [ 2.898620] [<ffffff800841d420>] redraw_screen+0x120/0x228
>> [ 2.898628] [<ffffff80083bf268>] fbcon_prepare_logo+0x370/0x3f8
>> [ 2.898635] [<ffffff80083bf640>] fbcon_init+0x350/0x560
>> [ 2.898641] [<ffffff800841c634>] visual_init+0xac/0x108
>> [ 2.898648] [<ffffff800841df14>] do_bind_con_driver+0x1c4/0x3a8
>> [ 2.898655] [<ffffff800841e4f4>] do_take_over_console+0x174/0x1e8
>> [ 2.898662] [<ffffff80083bf8c4>] do_fbcon_takeover+0x74/0x100
>> [ 2.898669] [<ffffff80083c3e44>] fbcon_event_notify+0x8cc/0x920
>> [ 2.898680] [<ffffff80080d7e38>] notifier_call_chain+0x50/0x90
>> [ 2.898685] [<ffffff80080d8214>] __blocking_notifier_call_chain+0x4c/0x90
>> [ 2.898691] [<ffffff80080d826c>] blocking_notifier_call_chain+0x14/0x20
>> [ 2.898696] [<ffffff80083c5e1c>] fb_notifier_call_chain+0x1c/0x28
>> [ 2.898703] [<ffffff80083c81ac>] register_framebuffer+0x1cc/0x2e0
>> [ 2.898712] [<ffffff800845da80>] drm_fb_helper_initial_config+0x288/0x3e8
>> [ 2.898719] [<ffffff800853da20>] nouveau_fbcon_init+0xe0/0x118
>> [ 2.898727] [<ffffff800852d2f8>] nouveau_drm_load+0x268/0x890
>> [ 2.898734] [<ffffff8008466e24>] drm_dev_register+0xbc/0xc8
>> [ 2.898740] [<ffffff8008468a88>] drm_get_pci_dev+0xa0/0x180
>> [ 2.898747] [<ffffff800852cb28>] nouveau_drm_probe+0x1a0/0x1e0
>> [ 2.898755] [<ffffff80083a32e0>] pci_device_probe+0x98/0x110
>> [ 2.898763] [<ffffff800858e434>] driver_probe_device+0x204/0x2b0
>> [ 2.898770] [<ffffff800858e58c>] __driver_attach+0xac/0xb0
>> [ 2.898777] [<ffffff800858c3e0>] bus_for_each_dev+0x60/0xa0
>> [ 2.898783] [<ffffff800858dbc0>] driver_attach+0x20/0x28
>> [ 2.898789] [<ffffff800858d7b0>] bus_add_driver+0x1d0/0x238
>> [ 2.898796] [<ffffff800858ed50>] driver_register+0x60/0xf8
>> [ 2.898802] [<ffffff80083a20dc>] __pci_register_driver+0x3c/0x48
>> [ 2.898809] [<ffffff8008468eb4>] drm_pci_init+0xf4/0x120
>> [ 2.898818] [<ffffff8008c56fc0>] nouveau_drm_init+0x21c/0x230
>> [ 2.898825] [<ffffff80080829d4>] do_one_initcall+0x8c/0x190
>> [ 2.898832] [<ffffff8008c31af4>] kernel_init_freeable+0x14c/0x1f0
>> [ 2.898839] [<ffffff80088a0c20>] kernel_init+0x10/0x100
>> [ 2.898845] [<ffffff8008085e10>] ret_from_fork+0x10/0x40
>> [ 2.898853] Code: a88120c7 a8c12027 a88120c7 a8c12027 (a88120c7)
>> [ 2.898871] ---[ end trace d5713dcad023ee04 ]---
>> [ 2.898888] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>>
>> In a toss-up between the GPU seeing stale data artefacts on some systems
>> vs. catastrophic kernel crashes on other systems, the latter would seem
>> to take precedence, so revert this change until the real underlying
>> problem can be fixed.
>>
>> Signed-off-by: Robin Murphy <robin.mur...@arm.com>
>> Acked-by: Alexandre Courbot <acour...@nvidia.com>
>> [acour...@nvidia.com: port to Nouveau tree, remove bits in lib/]
>> Signed-off-by: Alexandre Courbot <acour...@nvidia.com>
>> ---
>> Hi Ben,
>>
>> I have ported this patch to your tree - could you take it for 4.7? We definitely want
>> to avoid these crashes. I am working on a final solution for this that will allow us
>> to remove that cpu_coherent flag altogether.
>
>
> Cheers Alex! Should this also go to stable for 4.6?

That would be good, yes.
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau