Bug#890866: mesa: regression vs mesa 17.3.3-1: crash on i915 triggered by running emacs

2018-02-26 Thread Lucas Bonnet

Hello,

  I can confirm it appears to fix things here too. With 17.3 I could
trigger the crash just by making emacs scroll a larger-than-screen
buffer or by making scid resize its internal windows; now it no longer
crashes.


Regards,
-- 
Lucas



Bug#890866: mesa: regression vs mesa 17.3.3-1: crash on i915 triggered by running emacs

2018-02-24 Thread Theodore Ts'o
On Tue, Feb 20, 2018 at 11:37:12AM +0100, Andreas Boll wrote:
> 
> Thanks for reporting to upstream!
> Could you test Mesa 18.0.0~rc4-1 from experimental? According to [1]
> some GPU hangs are supposed to be fixed.

Per [1], I've tried 18.0.0~rc4-1, and it appears to fix the issue.
Also, I've since noticed that similar GPU hangs happen with
17.3.3-1 --- it's just that they are much easier to reproduce with
17.3.4-1.

Cheers,

- Ted

[1] https://bugs.freedesktop.org/show_bug.cgi?id=105169#c2



Bug#890866: mesa: regression vs mesa 17.3.3-1: crash on i915 triggered by running emacs

2018-02-20 Thread Andreas Boll
Control: severity -1 serious
Control: tags -1 upstream
Control: forwarded -1 https://bugs.freedesktop.org/show_bug.cgi?id=105169

On Mon, Feb 19, 2018 at 08:06:49PM -0500, Theodore Y. Ts'o wrote:
> Source: mesa
> Version: 17.3.4-1
> Severity: important
> 
> Dear Maintainer,
> 
> After upgrading to Mesa 17.3.4-1, starting emacs (in X11) would reliably
> cause the X server to crash.   Reverting to mesa 17.3.3-1 with the
> following packages:
> 
> libegl1-mesa_17.3.3-1_amd64.deb
> libegl-mesa0_17.3.3-1_amd64.deb
> libgbm1_17.3.3-1_amd64.deb
> libgl1-mesa-dri_17.3.3-1_amd64.deb
> libgl1-mesa-glx_17.3.3-1_amd64.deb
> libglapi-mesa_17.3.3-1_amd64.deb
> libglx-mesa0_17.3.3-1_amd64.deb
> libwayland-egl1-mesa_17.3.3-1_amd64.deb
> mesa-va-drivers_17.3.3-1_amd64.deb
> mesa-vdpau-drivers_17.3.3-1_amd64.deb
> 
> Causes the problem to go away.  This was running on a 2018 Dell XPS 13
> (model number 9370) with a 4k display.  Found in dmesg was the
> following:
> 
> Feb 19 14:07:00 cwcc kernel: [ 1740.829003] [drm] GPU HANG: ecode 9:0:0x85df3cff, in Xorg [1098], reason: Hang on rcs0, action: reset
> Feb 19 14:07:00 cwcc kernel: [ 1740.829111] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
> Feb 19 14:07:00 cwcc kernel: [ 1740.829112] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
> Feb 19 14:07:00 cwcc kernel: [ 1740.829113] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
> Feb 19 14:07:00 cwcc kernel: [ 1740.829114] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
> Feb 19 14:07:00 cwcc kernel: [ 1740.829115] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> Feb 19 14:07:00 cwcc kernel: [ 1740.829123] i915 :00:02.0: Resetting rcs0 after gpu hang
> Feb 19 14:07:08 cwcc kernel: [ 1748.819899] i915 :00:02.0: Resetting rcs0 after gpu hang
> 

Hi,

Thanks for reporting to upstream!
Could you test Mesa 18.0.0~rc4-1 from experimental? According to [1]
some GPU hangs are supposed to be fixed.

Thanks,
Andreas

[1] https://bugs.freedesktop.org/show_bug.cgi?id=104578#c17



Bug#890866: mesa: regression vs mesa 17.3.3-1: crash on i915 triggered by running emacs

2018-02-19 Thread Theodore Y. Ts'o
Source: mesa
Version: 17.3.4-1
Severity: important

Dear Maintainer,

After upgrading to Mesa 17.3.4-1, starting emacs (in X11) would reliably
cause the X server to crash.   Reverting to mesa 17.3.3-1 with the
following packages:

libegl1-mesa_17.3.3-1_amd64.deb
libegl-mesa0_17.3.3-1_amd64.deb
libgbm1_17.3.3-1_amd64.deb
libgl1-mesa-dri_17.3.3-1_amd64.deb
libgl1-mesa-glx_17.3.3-1_amd64.deb
libglapi-mesa_17.3.3-1_amd64.deb
libglx-mesa0_17.3.3-1_amd64.deb
libwayland-egl1-mesa_17.3.3-1_amd64.deb
mesa-va-drivers_17.3.3-1_amd64.deb
mesa-vdpau-drivers_17.3.3-1_amd64.deb

causes the problem to go away.  This was running on a 2018 Dell XPS 13
(model number 9370) with a 4k display.  Found in dmesg was the
following:

Feb 19 14:07:00 cwcc kernel: [ 1740.829003] [drm] GPU HANG: ecode 9:0:0x85df3cff, in Xorg [1098], reason: Hang on rcs0, action: reset
Feb 19 14:07:00 cwcc kernel: [ 1740.829111] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Feb 19 14:07:00 cwcc kernel: [ 1740.829112] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Feb 19 14:07:00 cwcc kernel: [ 1740.829113] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Feb 19 14:07:00 cwcc kernel: [ 1740.829114] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Feb 19 14:07:00 cwcc kernel: [ 1740.829115] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Feb 19 14:07:00 cwcc kernel: [ 1740.829123] i915 :00:02.0: Resetting rcs0 after gpu hang
Feb 19 14:07:08 cwcc kernel: [ 1748.819899] i915 :00:02.0: Resetting rcs0 after gpu hang

And attached please find the contents of /sys/class/drm/card0/error
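
For anyone triaging similar reports, the "GPU HANG" line in the kernel log
above can be picked apart mechanically. What follows is only a minimal
sketch, assuming the message keeps the exact layout seen in this report;
the pattern, the function name parse_gpu_hang and the field names are
illustrative assumptions, not part of any i915 or Mesa tooling.

```python
import re

# Pattern for the i915 "GPU HANG" message as it appears in this report.
# The layout is an assumption taken from this one log; other kernel
# versions may word the message differently.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG message, or None if the line does not match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])  # the PID is the only numeric field
    return fields


# The first log line from this report, joined back into a single string.
sample = (
    "Feb 19 14:07:00 cwcc kernel: [ 1740.829003] [drm] GPU HANG: ecode "
    "9:0:0x85df3cff, in Xorg [1098], reason: Hang on rcs0, action: reset"
)

if __name__ == "__main__":
    print(parse_gpu_hang(sample))
```

The process name and the "reason" field are the parts worth comparing
between reports; here they point at Xorg and the render ring (rcs0).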

-- System Information:
Debian Release: buster/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'testing-debug'), (500, 'unstable'), (500, 'testing'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.14.0-3-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8), LANGUAGE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
GPU HANG: ecode 9:0:0x85df, in Xorg [1555], reason: Hang on rcs0, action: reset
Kernel: 4.14.0-3-amd64
Time: 1519086048 s 628402 us
Boottime: 128 s 796155 us
Uptime: 119 s 301934 us
Active process (on ring render): Xorg [1555], score 0
Reset count: 0
Suspend count: 0
Platform: KABYLAKE
PCI ID: 0x5917
PCI Revision: 0x07
PCI Subsystem: 1028:07e6
IOMMU enabled?: 0
DMC loaded: yes
DMC fw version: 1.1
GT awake: yes
RPM wakelock: yes
PM suspended: no
EIR: 0x
IER: 0x0800
GTIER[0]: 0x01010101
GTIER[1]: 0x01010101
GTIER[2]: 0x0070
GTIER[3]: 0x0101
PGTBL_ER: 0x
FORCEWAKE: 0x00010001
DERRMR: 0x2077efef
CCID: 0x
Missed interrupts: 0x
  fence[0] = 300030003
  fence[1] = 300403b0261
  fence[2] = 
  fence[3] = 
  fence[4] = 
  fence[5] = 20b900402088003
  fence[6] = 
  fence[7] = 
  fence[8] = 330032003
  fence[9] = 318a00903005003
  fence[10] = 1e001e003
  fence[11] = 
  fence[12] = 20872060003
  fence[13] = 
  fence[14] = 
  fence[15] = 340034003
  fence[16] = 360035003
  fence[17] = 
  fence[18] = 
  fence[19] = 202100202010003
  fence[20] = 
  fence[21] = 370037003
  fence[22] = 2d002d003
  fence[23] = 
  fence[24] = 1f001f003
  fence[25] = 
  fence[26] = 222b004021fa003
  fence[27] = 25dc00802499003
  fence[28] = 200e200e003
  fence[29] = 200f200f003
  fence[30] = 21f9004021f5003
  fence[31] = 
ERROR: 0x
FAULT_TLB_DATA: 0x0011 0xe4bf2b9a
DONE_REG: 0x
render command stream:
  START: 0x00011000
  HEAD:  0x02a03100 [0x30a8]
  TAIL:  0x3420 [0x3100, 0x3128]
  CTL:   0x3001
  MODE:  0x
  HWS:   0xfffe8000
  ACTHD: 0x 02a03100
  IPEIR: 0x
  IPEHR: 0x7a04
  INSTDONE: 0xffdb
  SC_INSTDONE: 0xd790
  SAMPLER_INSTDONE[0][0]: 0x
  SAMPLER_INSTDONE[0][1]: 0x
  SAMPLER_INSTDONE[0][2]: 0x
  ROW_INSTDONE[0][0]: 0xfa107ff5
  ROW_INSTDONE[0][1]: 0xfa107ff4
  ROW_INSTDONE[0][2]: 0xfa107ff4
  batch: [0x_09249000, 0x_0924e000]
  BBADDR: 0x_0924dbfc
  BB_STATE: 0x0020
  INSTPS: 0x8980
  INSTPM: 0x
  FADDR: 0x 00014280
  RC PSMI: 0x0010
  FAULT_REG: 0x
  SYNC_0: 0x
  SYNC_1: 0x
  SYNC_2: 0x
  GFX_MODE: 0x8000
  PDP0: 0x000499b5
  PDP1: 0x
  PDP2: 0x
  PDP3: 0x
  seqno: 0x0ada
  last_seqno: 0x0ae1
  waiting: yes
  ring->head: 0x3080
  ring->tail: 0x3420
  hangcheck stall: yes
  hangcheck action: dead
  hangcheck action timestamp: 4294923000, 663724 ms ago
  engine reset count: 0
  ELSP[0]:  pid 15