Hi,
On 06-04-18 18:06, Ville Syrjälä wrote:
On Thu, Apr 05, 2018 at 01:37:31PM +0200, Hans de Goede wrote:
Hi,
On 04-04-18 22:49, Ville Syrjälä wrote:
On Wed, Apr 04, 2018 at 10:06:29PM +0200, Hans de Goede wrote:
Hi,
On 04-04-18 17:50, Ville Syrjälä wrote:
On Fri, Mar 30, 2018 at 04:26:53PM +0200, Hans de Goede wrote:
Hi,
On 30-03-18 15:25, Hans de Goede wrote:
Hi,
On 30-03-18 14:44, Chris Wilson wrote:
Quoting Hans de Goede (2018-03-30 13:37:40)
Hi,
On 30-03-18 14:30, Chris Wilson wrote:
Quoting Hans de Goede (2018-03-30 13:27:15)
Before this commit the WaSkipStolenMemoryFirstPage workaround code was
skipping the first 4k by passing 4096 as start of the address range passed
to drm_mm_init(). This means that calling drm_mm_reserve_node() to try and
reserve the firmware framebuffer so that we can inherit it would always
fail, as the firmware framebuffer starts at address 0.
Commit d43537610470 ("drm/i915: skip the first 4k of stolen memory on
everything >= gen8") says in its commit message: "This is confirmed to fix
Skylake screen flickering issues (probably caused by the fact that we
initialized a ring in the first page of stolen, but I didn't 100% confirm
this theory)."
Which suggests that it is safe to use the first page for a linear
framebuffer as the firmware is doing.
This commit always passes 0 as start to drm_mm_init() and works around
WaSkipStolenMemoryFirstPage in i915_gem_stolen_insert_node_in_range()
by insuring the start address passed by to drm_mm_insert_node_in_range()
is always 4k or more. All entry points to i915_gem_stolen.c go through
i915_gem_stolen_insert_node_in_range(), so that any newly allocated
objects such as ring-buffers will not be allocated in the first 4k.
The one exception is i915_gem_object_create_stolen_for_preallocated()
which directly calls drm_mm_reserve_node() which now will be able to
use the first 4k.
This fixes the i915 driver no longer being able to inherit the firmware
framebuffer on gen8+, which fixes the video output changing from the
vendor logo to a black screen as soon as the i915 driver is loaded
(on systems without fbcon).
We've been told by the HW guys not to use the first page. (That's my
understanding from every time this gets questioned.)
Yet the GOP is happily using the first page. I think we may need to make
a difference here between the GPU not using the first page and the
display engine/pipeline not using the first page. Note that my patch
only influences the inheriting of the initial framebuffer as allocated
by the GOP. It does not influence any other allocations from the
reserved range, those will still all avoid the first page.
Without this patch fastboot / flickerfree support is essentially broken
on any gen8+ hardware given that one of the goals of atomic is to be
able to do flickerfree transitions I think that this warrants a closer
look then just simply saying never use the first page.
The concern is what else (i.e. nothing that we allocated ourselves) that
may be in the first page...
Given that the GOP has put its framebuffer there at least at boot there
is nothing there, otherwise it would show up on the display.
We have a whole bunch of code to inherit the BIOS fb in intel_display.c
and AFAIK that code is there because this inheriting the BIOS fb is
deemed an important feature. So I'm not happy at all with the handwavy
best to not touch it answer I'm getting to this patch.
Unless there are some clear answer from the hardware folks which specifically
say we cannot put a framebuffer there (and then why is the GOP doing it?)
then I believe that the best way forward here is to get various people to
test with this patch and the best way to do that is probably to put it
in next. Note I deliberately did not add a Cc stable.
To elaborate on this, the excluding of the first 4k of the stolen memory
region causes intel_alloc_initial_plane_obj() from intel_display.c to fail,
which in turn causes intel_find_initial_plane_obj() to call
intel_plane_disable_noatomic(intel_crtc, intel_plane); which temporarily
completely turns off the display which leads to a very ugly flickering
of the display at boot (as well as replacing the vendor logo with a
black screen).
I think we can all agree that this behavior is undesirable and even a
regression in behavior caused by the fix to exclude the first 4k.
Chris, if my patch is not an acceptable way to fix this, then how do you
suggest that we fix this?
Digging a bit deeper I found this:
https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-kbl-vol16-workarounds.pdf
Which says:
"WaSkipStolenMemoryFirstPage:
WA to skip the first page of stolen
memory due to sporadic HW write on *CS Idle"
And also about FBC:
"First line of FBC getting corrupted when FBC
compressed frame buffer offset is programmed to
zero. Command streamers are doing flush writes to
base of stolen.
WA: New restriction to program FBC compressed
frame buffer offset to at least 4KB."
So using the