From: Ben Widawsky <b...@bwidawsk.net>
This patch series ultimately adds support within the i965 driver for
Renderbuffer Decompression with GBM. In short, this feature reduces memory
bandwidth by allowing the GPU to work with losslessly compressed data and having
that compression scheme understood by the display engine for decompression. The
display engine will decompress on the fly and scan out the image.
Quoting from the final patch, the bandwidth savings on a SKL GT4 with a 19x10
display running kmscube:
Without compression:
Read bandwidth: 603.91 MiB/s
Write bandwidth: 615.28 MiB/s
With compression:
Read bandwidth: 259.34 MiB/s
Write bandwidth: 337.83 MiB/s
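To put the numbers in relative terms, here is a small sketch that computes the percentage saved. It assumes the first read/write pair is the uncompressed baseline and the second pair is the compressed run; the function names are made up for illustration and are not part of the series.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage of bandwidth saved when going from `before` to `after`.
 * Both values must use the same unit (MiB/s here). */
double savings_percent(double before, double after)
{
    assert(before > 0.0);
    return (1.0 - after / before) * 100.0;
}

/* Hypothetical helper: prints the savings for the kmscube figures
 * quoted above, assuming the first pair is the uncompressed baseline. */
void print_kmscube_savings(void)
{
    printf("read:  %.1f%%\n", savings_percent(603.91, 259.34));
    printf("write: %.1f%%\n", savings_percent(615.28, 337.83));
}
```

Under that assumption this works out to roughly 57% less read bandwidth and 45% less write bandwidth.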
The hardware achieves these savings by maintaining an auxiliary buffer
containing "opaque" compression information. It's opaque in the sense that
knowledge of the low-level compression scheme is not needed, but knowledge of
the overall layout of the compressed data is required. The auxiliary buffer is created by
the driver on behalf of the client when requested. That buffer needs to be
passed along wherever the main image's buffer goes.
The overall strategy is that the buffer/surface is created with a list of
modifiers. The list of modifiers the hardware is capable of using will come from
a new kernel API that is aware of the hardware and general constraints. A client
will request the list of modifiers and pass it directly back in during buffer
creation (potentially the client can prune the list, but as of now there is no
reason to.) This new API is being developed by Kristian. I did not get far
enough to play with that.
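As a rough illustration of the "pass in a list, let the driver choose" idea, here is a minimal sketch of how a driver might pick the best modifier from a client-supplied list. The SKETCH_MOD_* tokens, the preference order, and the function names are all assumptions made for this example; the real modifier values live in the kernel's drm_fourcc.h and the real selection logic is in the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for modifier tokens (not the real values). */
#define SKETCH_MOD_LINEAR      0ULL
#define SKETCH_MOD_X_TILED     1ULL
#define SKETCH_MOD_Y_TILED     2ULL
#define SKETCH_MOD_Y_TILED_CCS 3ULL
#define SKETCH_MOD_INVALID     UINT64_MAX

/* Assumed preference order: higher is better, 0 means unsupported. */
static int sketch_priority(uint64_t mod)
{
    switch (mod) {
    case SKETCH_MOD_Y_TILED_CCS: return 4;
    case SKETCH_MOD_Y_TILED:     return 3;
    case SKETCH_MOD_X_TILED:     return 2;
    case SKETCH_MOD_LINEAR:      return 1;
    default:                     return 0;
    }
}

/* Return the most preferred modifier from the client's list, or
 * SKETCH_MOD_INVALID if the list holds nothing the driver supports. */
uint64_t sketch_pick_modifier(const uint64_t *mods, size_t count)
{
    uint64_t best = SKETCH_MOD_INVALID;
    int best_prio = 0;

    assert(mods != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        int prio = sketch_priority(mods[i]);
        if (prio > best_prio) {
            best_prio = prio;
            best = mods[i];
        }
    }
    return best;
}
```

A client that prunes the compressed entry from its list would simply get the next best layout back, which is why passing the list through unmodified is the sensible default for now.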
For EGL, a similar mechanism would exist: when importing a buffer into EGL, one
would provide a modifier and probably a pointer to the auxiliary data. Import
therefore might require multiple dma-buf fds, but for i965 and Intel this
wouldn't be necessary.
Here is a brief description of the series:
1-6 Adds support in GBM for per plane functions where necessary. This is
required because the kernel expects the auxiliary buffer to be passed along as a
plane. It has its own offset, and stride, and the client shouldn't need to
7-9 Adds support in GBM to understand modifiers. When creating a buffer or
surface, the client is expected to pass in a list of modifiers that the driver
will optimally choose from. As a result of this, the GBM APIs need to support
10-12 Support Y-tiled modifier. Y-tiling was already a modifier exposed by the
kernel. With the previous patches in place, it's easy to support this too.
13-26 Plumbing to support sending CCS buffers to display. Leveraging much of the
existing code for MCS buffers, these patches create an MCS for the scanout
buffer. The trickery here is that a single BO contains both the main surface and
the auxiliary data. Previously, auxiliary data always lived in its own BO.
27 Support CCS-modifier. Finally, the code can parse the CCS fb modifier(s) and
realize the bandwidth savings that come with it.
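To make the "single BO, two planes" idea concrete, here is a minimal sketch of how the main surface and the auxiliary data could be described as two planes with their own offset and stride. The struct and function names, and the 4096-byte alignment of the auxiliary plane, are assumptions for illustration only; the actual layout rules come from the hardware and the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-plane description: where the plane starts in the BO
 * and how many bytes separate consecutive rows. */
struct sketch_plane {
    uint32_t offset;
    uint32_t stride;
};

/* Hypothetical image: plane 0 is the main surface, plane 1 the aux data. */
struct sketch_image {
    struct sketch_plane planes[2];
    int num_planes;
    uint32_t bo_size;
};

/* Round v up to a power-of-two alignment a. */
uint32_t sketch_align(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Place the aux plane right after the main surface in the same BO.
 * The 4096-byte alignment is an assumption, not a hardware rule. */
void sketch_layout(struct sketch_image *img,
                   uint32_t main_stride, uint32_t main_height,
                   uint32_t aux_stride, uint32_t aux_height)
{
    uint32_t main_size = main_stride * main_height;

    img->planes[0].offset = 0;
    img->planes[0].stride = main_stride;
    img->planes[1].offset = sketch_align(main_size, 4096);
    img->planes[1].stride = aux_stride;
    img->num_planes = 2;
    img->bo_size = img->planes[1].offset + aux_stride * aux_height;
}
```

With something like this in place, a client only has to ask for the offset and stride of each plane and hand them to the kernel; it never needs to know how the auxiliary data is sized.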
This was tested using kmscube
(https://github.com/bwidawsk/kmscube/tree/modifiers). The kmscube implementation
is missing support for GET_PLANE2 - which is currently being worked on by
1. All of the patches up through 26 should be mergeable today after review.
2. After 1-12 land, client support of Y-tiling should be achievable. The
modesetting driver can probably be updated, as can things like Weston. Clients
assuming a new enough kernel should be able to blindly set the Y-tiled modifier.
3. Once kernel and libdrm support for CCS modifiers lands, patch 27 can land.
However, CCS isn't yet usable; it is only available as a prototype.
4. Kristian's GET_PLANE2 interface needs to be solidified and land.
5. Clients will utilize #3 and #4 to use CCS.
6. Protocol work, EGL, Wayland, DRIX - etc
When Kristian's interface is ready, kmscube can be modified to make use of it.
Rob: are you interested in a PR for kmscube?
Definition of terms:
Renderbuffer Decompression - In the ARM world, this is AFBC. Having the graphics
driver utilize lossless surface compression for the scanout buffer and sending
those surfaces, compressed, to the kernel (via KMS) for the display engine to
Renderbuffer Compression - Utilizing compressed surfaces for many buffer types
(scanout, textures, whatever), and decompressing (i.e. resolving) those surfaces
before passing them along.
Ben Widawsky (27):
gbm: Move getters to match order in header file (trivial)
gbm: Fix width height getters return type (trivial)
gbm: Export a plane getter function
gbm: Create a gbm_device getter for stride
gbm: Export a per plane getter for stride
gbm: Export a per plane getter for offset
i965/dri: Store the screen associated with the image
dri: Add an image creation with modifiers
gbm: Introduce modifiers into surface/bo creation
i965: Handle Y-tile modifier
gbm: Get modifiers from DRI
i965: Bring back always Y-tiled on SKL+
i965: Separate image allocation with modifiers
i965: Allow aux buffers to have an offset
i965/miptree: Add a helper functions for image creation
i965/miptree: Allocate mcs_buf for an image's CCS_E
i965: Create correctly sized mcs for an image
i965/miptree: Add a return for updating of winsys
i965/miptree: Allocate mt earlier in update winsys
i965: Pretend that CCS modified images are two planes
i965: Make CCS stride match kernel's expectations
i965: Change resolve flags to enum
i965: Plumb resolve hints from miptrees to blorp
i965: Add new resolve hints full and partial
i965: Use partial resolves for CCS buffers being scanned out
i965: Remove scanout restriction from lossless compression
i965: Handle compression modifier
include/GL/internal/dri_interface.h | 28 ++-
src/egl/drivers/dri2/platform_drm.c | 7 +-
src/gallium/state_trackers/dri/dri2.c | 1 +
src/gbm/backends/dri/gbm_dri.c | 132 ++++++++++++++-
src/gbm/gbm-symbols-check | 6 +
src/gbm/main/gbm.c | 112 ++++++++++--
src/gbm/main/gbm.h | 28 ++-
src/gbm/main/gbmint.h | 16 +-
src/mesa/drivers/dri/i965/brw_blorp.c | 12 +-
src/mesa/drivers/dri/i965/brw_blorp.h | 3 +-
src/mesa/drivers/dri/i965/brw_context.c | 53 ++++--
src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 3 +-
src/mesa/drivers/dri/i965/intel_fbo.c | 17 +-
src/mesa/drivers/dri/i965/intel_image.h | 5 +
src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 139 +++++++++++----
src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 29 +++-
src/mesa/drivers/dri/i965/intel_screen.c | 207 +++++++++++++++++++++--
src/mesa/drivers/dri/i965/intel_tex_image.c | 17 +-
18 files changed, 688 insertions(+), 127 deletions(-)
Cc: Kristian H. Kristensen <hoegsb...@gmail.com>
Cc: Daniel Stone <dani...@collabora.com>
Cc: Rob Clark <robdcl...@gmail.com>
mesa-dev mailing list