On Mon, Aug 8, 2016 at 1:24 PM, Eric W. Biederman <ebied...@xmission.com> wrote:
>
> Booting up v4.8-rc1 in qemu today I ran I ran into this beautiful oops.
>
> I am just about to head out the door on vacation until the end of the
> month so hopefully I am tossing out enough information to the right
> people.
>
> I have confirmed that disabling CONFIG_DRM_CIRRUS_QEMU avoids this
> crash.
>
> [    0.720167] BUG: unable to handle kernel NULL pointer dereference at 
> 0000000000000018
> [    0.721348] IP: [<ffffffff8d5ef9ac>] drm_pick_crtcs+0xfc/0x270
> [    0.722249] PGD 0
> [    0.722659] Oops: 0000 [#1] SMP
> [    0.723141] Modules linked in:
> [    0.723643] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc1+ #116
> [    0.724602] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [    0.725450] task: ffff8800bbf28000 task.stack: ffff8800bbf30000
> [    0.726327] RIP: 0010:[<ffffffff8d5ef9ac>]  [<ffffffff8d5ef9ac>] 
> drm_pick_crtcs+0xfc/0x270
> [    0.727648] RSP: 0000:ffff8800bbf33978  EFLAGS: 00010217
> [    0.728444] RAX: ffffffff8d623a20 RBX: 0000000000000000 RCX: 
> ffff8800baf3f860
> [    0.729498] RDX: 0000000000000000 RSI: 0000000000001000 RDI: 
> ffff8800baf3f800
> [    0.730626] RBP: ffff8800bbf339f8 R08: 0000000000000000 R09: 
> 0000000000000001
> [    0.731704] R10: 0000000000000001 R11: 0000000000000000 R12: 
> ffff8800bbaf3af0
> [    0.732827] R13: ffff8800bbaf3af8 R14: 0000000000000000 R15: 
> ffff8800baf3fc00
> [    0.733863] FS:  0000000000000000(0000) GS:ffff8800bf800000(0000) 
> knlGS:0000000000000000
> [    0.735106] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.735964] CR2: 0000000000000018 CR3: 000000003b219000 CR4: 
> 00000000000006f0
> [    0.737021] Stack:
> [    0.737342]  ffff8800bbf33998 0000000000000246 ffff8800bbaf3b18 
> ffff8800bbaf3b00
> [    0.738602]  0000100000000001 0000000000000000 ffff8800bbaf3b00 
> ffff8800baf3f800
> [    0.739807]  0000000300001000 ffff8800bbaf3b18 ffff8800bbf339e8 
> ffff8800baf3fc00
> [    0.740993] Call Trace:
> [    0.741393]  [<ffffffff8d5efedd>] drm_setup_crtcs+0x3bd/0xb50
> [    0.742252]  [<ffffffff8d0edaa1>] ? mark_held_locks+0x71/0x90
> [    0.743176]  [<ffffffff8db0db6b>] ? __mutex_unlock_slowpath+0xeb/0x1c0
> [    0.744138]  [<ffffffff8d5f08c1>] drm_fb_helper_initial_config+0x81/0x3c0
> [    0.745166]  [<ffffffff8d6110b4>] ? drm_modeset_unlock_all+0x54/0x80
> [    0.746136]  [<ffffffff8d624fe0>] cirrus_fbdev_init+0xa0/0xb0
> [    0.747033]  [<ffffffff8d6245fb>] cirrus_modeset_init+0x18b/0x1e0
> .. snip snip ..
> [    0.771023] Code: e8 ba e1 ff ff 48 8b 55 b8 48 83 f8 01 83 5d c4 ff 48 8b 
> 7d b8 48 8b 82 98 02 00 00 49 8b 57 08 48 8b 40 10 48 8b 92 00 0a 00 00 <48> 
> 83 7a 18 00 74 09 48 85 c0 0f 84 53 01 00 00 ff d0 49 89 c3

This decodes to

   0: e8 ba e1 ff ff       callq  ...
   5: 48 8b 55 b8           mov    -0x48(%rbp),%rdx
   9: 48 83 f8 01           cmp    $0x1,%rax
   d: 83 5d c4 ff           sbbl   $0xffffffff,-0x3c(%rbp)
  11: 48 8b 7d b8           mov    -0x48(%rbp),%rdi
  15: 48 8b 82 98 02 00 00 mov    0x298(%rdx),%rax
  1c: 49 8b 57 08           mov    0x8(%r15),%rdx
  20: 48 8b 40 10           mov    0x10(%rax),%rax
  24: 48 8b 92 00 0a 00 00 mov    0xa00(%rdx),%rdx
  2b:* 48 83 7a 18 00       cmpq   $0x0,0x18(%rdx) <-- trapping instruction
  30: 74 09                 je     0x3b
  32: 48 85 c0             test   %rax,%rax
  35: 0f 84 53 01 00 00     je     0x18e
  3b: ff d0                 callq  *%rax
  3d: 49 89 c3             mov    %rax,%r11


and trying to find matching code I think the oops is from this:

        if (fb_helper->dev->mode_config.funcs->atomic_commit &&
            !connector_funcs->best_encoder)
                encoder = drm_atomic_helper_best_encoder(connector);
        else
                encoder = connector_funcs->best_encoder(connector);

in particular, the trapping instruction is the load of "atomic_commit".

(The "callq *%rax" seems to be the "connector_funcs->best_encoder()" call)

So I *think* we have "fb_helper->dev->mode_config.funcs" being NULL.

But I might have gotten the code matching wrong. I don't have the same
compiler as Eric, so even when using his config my code generation
doesn't really match. But there is only one indirect call in that
function, which is that ->best_encoder() call, so it does seem to
match up.

Adding Boris Brezillon and Daniel Vetter to the cc, since assuming I'm
right, the main suspect is commit c61b93fe51b1 ("drm/atomic: Fix
remaining places where !funcs->best_encoder is valid").

Boris? Daniel?

              Linus

Reply via email to