[Intel-gfx] Skylake 6700k Intel HD Graphics 530 Display Port Panic

2015-09-07 Thread Matthew Minter

Hello,

I am currently trying to set up a newly built system with a Skylake
6700k CPU, but I am hitting an extremely reproducible kernel panic every
time I connect a monitor to the DisplayPort connector of the Intel
integrated graphics chip.

This issue occurs either immediately upon connecting a DisplayPort
monitor while the machine is up, or late in the boot process if the
monitor is connected at boot time.

The monitor I am using is a Dell U3415W ultrawide and the motherboard
is an MSI Z170A Gaming M7.

I am not entirely surprised by the link training errors, as there appear
to be various posts from users having problems with this monitor and
DisplayPort link training. What surprises me most is that it causes a
kernel panic.

When the panic happens, the kernel prints the following dump to the
second, non-DP monitor. Note that this is hand-copied, as I have no way
to dump the messages anywhere but the display, so pardon any small typos.


[   22.318630]  [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.365449]  [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.420272]  [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.475105]  [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.529931]  [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.584759]  [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.639588]  [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.649935]  [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   22.650532]  [drm:intel_dp_start_link_train [i915]] *ERROR* too many full retries, give up
[   24.329955]  Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[   25.345911]  Shutting down cpus with NMI
[   25.356092]  Kernel offset: disabled
[   25.356101]  Rebooting in 30 seconds.
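
For anyone triaging a hand-copied dump like the one above, a minimal Python sketch (not part of the original report) can rejoin lines that a mail client wrapped and then summarise them. The function names and the sample text are assumptions made for illustration, and the matching is done on the message text only, so small typos in the function name do not matter.

```python
import re

# A dmesg record starts with a bracketed timestamp such as "[   22.318630]".
RECORD_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def join_wrapped(log_text):
    """Rejoin records that a mail client wrapped onto a second line."""
    records = []
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if RECORD_RE.match(line):
            records.append(line)
        elif records:
            # No timestamp: treat it as the tail of the previous record.
            records[-1] += " " + line
    return records


def summarize(log_text):
    """Return (number of link-training failures, panic timestamp or None)."""
    failures = 0
    panic_at = None
    for record in join_wrapped(log_text):
        stamp, message = RECORD_RE.match(record).groups()
        if "too many full retries" in message:
            failures += 1
        elif message.startswith("Kernel panic") and panic_at is None:
            panic_at = float(stamp)
    return failures, panic_at


# Hypothetical excerpt, wrapped the way the mail archive wraps it.
SAMPLE = """\
[   22.649935]  [drm:intel_dp_start_link_train [i915]] *ERROR* too many
full retries, give up
[   22.650532]  [drm:intel_dp_start_link_train [i915]] *ERROR* too many
full retries, give up
[   24.329955]  Kernel panic - not syncing: Timeout: Not all CPUs
entered broadcast exception handler
"""

print(summarize(SAMPLE))
```

The pair it returns makes it easy to see how many link-training failures preceded the panic and how long after the last one the panic was reported.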

When running kernel 4.2, these errors are occasionally followed by what
seems to be an MCE (machine check exception) mentioning a corrupt
processor context, which is very hard to note down as it is only on the
screen very briefly. However, when running the latest kernel from
https://github.com/torvalds/linux, only the above error occurs, not the
MCE. I am fairly confident the MCE is spurious because of this and
because the system otherwise tests out fine.


I apologise if this report is a little sparse on details. It is very
hard to debug the system post mortem because of the panic and because
I have no serial terminal or hardware debugger available.
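
One option sometimes used when no serial console is available is netconsole, which sends kernel messages as UDP packets to another machine. Whether it gets the panic text out before this particular hang is not guaranteed. Below is a minimal Python sketch of the receiving side only, assuming netconsole on the crashing machine has already been configured to target this host on UDP port 6666 (netconsole's default target port). The function names are illustrative.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket on which netconsole datagrams can be received."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_messages(sock, count):
    """Block until `count` datagrams have arrived and return them as text."""
    messages = []
    for _ in range(count):
        data, _sender = sock.recvfrom(65535)
        # Kernel messages are normally ASCII; replace anything undecodable.
        messages.append(data.decode("utf-8", errors="replace"))
    return messages
```

On the second machine, something like `for line in receive_messages(open_listener(), 200): print(line, end="")` would collect the first 200 datagrams. Writing them to a file instead keeps the dump for later comparison.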

Otherwise the system passes memtest86+ flawlessly and is completely
stable even under heavy load.
The issue seems to occur on every kernel I have tested so far, including
a stock Ubuntu 15.04 kernel, a vanilla 4.0.5 kernel, a vanilla 4.2.0
kernel, and the head of https://github.com/torvalds/linux as of a few
hours ago.


The kernel config used for the kernel built from git is available here:
http://paste2.org/MH9vV4Le
The 4.2 and 4.0.5 configs were extremely similar and differ only in the
new entries made by oldconfig.

If there is anything I can do to produce more information, I am more
than happy to do so. If this is not the right mailing list for this
issue, please let me know where would be better.


Many thanks,
Matthew
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Skylake 6700k Intel HD Graphics 530 Display Port Panic

2015-09-07 Thread Zhang, Xiong Y
I see a similar error message on one SKL machine, but DP works well there
and the system doesn't panic.
I resolved it with one of the following three methods; maybe you could try them:
1) boot with i915.disable_power_well=0
2) delete /lib/firmware/i915/skl_dmc_ver1.bin
3) apply the patch set in
http://lists.freedesktop.org/archives/intel-gfx/2015-August/072870.html
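
Before and after trying the first two methods, it can help to confirm what the running system actually has in effect. The following is a minimal Python sketch under two assumptions: that the i915 module exposes the parameter at the usual sysfs location `/sys/module/i915/parameters/disable_power_well` (reading it may require root), and that the firmware lives at the path given in method 2. The function name is hypothetical.

```python
import os


def workaround_status(
    param_path="/sys/module/i915/parameters/disable_power_well",
    firmware_path="/lib/firmware/i915/skl_dmc_ver1.bin",
):
    """Report the current power-well parameter and whether the DMC firmware exists."""
    status = {
        "disable_power_well": None,  # stays None if the file cannot be read
        "dmc_firmware_present": os.path.exists(firmware_path),
    }
    try:
        with open(param_path) as handle:
            status["disable_power_well"] = handle.read().strip()
    except OSError:
        # Module not loaded, parameter absent, or insufficient permissions.
        pass
    return status


if __name__ == "__main__":
    print(workaround_status())
```

A `disable_power_well` value of `0` together with `dmc_firmware_present` being `False` would indicate that both of the first two methods are in effect.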

thanks
> -----Original Message-----
> From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf Of
> Matthew Minter
> Sent: Sunday, September 6, 2015 12:34 PM
> To: intel-gfx@lists.freedesktop.org
> Subject: [Intel-gfx] Skylake 6700k Intel HD Graphics 530 Display Port Panic


Re: [Intel-gfx] Skylake 6700k Intel HD Graphics 530 Display Port Panic

2015-09-05 Thread Matthew Minter
Just to update: this also happens with the latest kernel from
git://anongit.freedesktop.org/drm-intel on the drm-intel-nightly
branch.


On 2015-09-06 06:05, Matthew Minter wrote:
