From: Matthew Auld <matthew.a...@intel.com>

On discrete platforms like DG2, we need to support a minimum page size
of 64K when dealing with device local-memory. This is quite tricky for
various reasons, so try to document the new implicit uapi for this.

v2: Fixed suggestions on formatting [Daniel]

Signed-off-by: Matthew Auld <matthew.a...@intel.com>
Signed-off-by: Ramalingam C <ramalinga...@intel.com>
Cc: Simon Ser <cont...@emersion.fr>
Cc: Pekka Paalanen <ppaala...@gmail.com>
Cc: Jordan Justen <jordan.l.jus...@intel.com>
Cc: Kenneth Graunke <kenn...@whitecape.org>
Cc: mesa-dev@lists.freedesktop.org
Cc: Tony Ye <tony...@intel.com>
Cc: Slawomir Milczarek <slawomir.milcza...@intel.com>
---
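
As a rough illustration of the GTT rules documented in the
drm_i915_gem_create_ext comment below, here is a minimal userspace-side
sketch. It is not part of the patch and not taken from any driver; the
helper names (round_up_pow2, pad_to_pde, ranges_share_pde) and the SZ_*
macros are hypothetical and defined locally. It only assumes what the
comment states: 64K minimum GTT alignment for I915_MEMORY_CLASS_DEVICE
objects, and one PDE covering a 2M virtual address range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sizes taken from the documentation: 64K GTT pages, 2M per PDE. */
#define SZ_64K (UINT64_C(1) << 16)
#define SZ_2M  (UINT64_C(1) << 21)

/* Round v up to alignment a; a must be a non-zero power of two. */
static inline uint64_t round_up_pow2(uint64_t v, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (v + a - 1) & ~(a - 1);
}

/*
 * The "simplest scheme" from the comment: give a device-local object a
 * 2M-aligned offset and pad its size out to 2M, so it never shares a
 * PDE with anything else.
 */
static inline uint64_t pad_to_pde(uint64_t size)
{
	return round_up_pow2(size, SZ_2M);
}

/*
 * Return true if two non-empty GTT ranges touch at least one common
 * 2M PDE range. A userspace allocator could use a check like this to
 * avoid placing device and system objects in the same PDE.
 */
static inline bool ranges_share_pde(uint64_t a_off, uint64_t a_size,
				    uint64_t b_off, uint64_t b_size)
{
	uint64_t a_first = a_off / SZ_2M;
	uint64_t a_last = (a_off + a_size - 1) / SZ_2M;
	uint64_t b_first = b_off / SZ_2M;
	uint64_t b_last = (b_off + b_size - 1) / SZ_2M;

	return a_first <= b_last && b_first <= a_last;
}
```

With these helpers, a 64K device object at offset 0 and a 4K system
object at offset 64K would be flagged as sharing a PDE, whereas padding
the device object to 2M and placing the system object at 2M would not.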
 include/uapi/drm/i915_drm.h | 67 ++++++++++++++++++++++++++++++++++---
 1 file changed, 62 insertions(+), 5 deletions(-)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 914ebd9290e5..89bcf5a77958 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 {
        /**
         * When the EXEC_OBJECT_PINNED flag is specified this is populated by
         * the user with the GTT offset at which this object will be pinned.
+        *
         * When the I915_EXEC_NO_RELOC flag is specified this must contain the
         * presumed_offset of the object.
+        *
         * During execbuffer2 the kernel populates it with the value of the
         * current GTT offset of the object, for future presumed_offset writes.
+        *
+        * See struct drm_i915_gem_create_ext for the rules when dealing with
+        * alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with
+        * minimum page sizes, like DG2.
         */
        __u64 offset;
 
@@ -3144,11 +3150,62 @@ struct drm_i915_gem_create_ext {
         *
         * The (page-aligned) allocated size for the object will be returned.
         *
-        * Note that for some devices we have might have further minimum
-        * page-size restrictions(larger than 4K), like for device local-memory.
-        * However in general the final size here should always reflect any
-        * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
-        * extension to place the object in device local-memory.
+        *
+        * **DG2 64K min page size implications:**
+        *
+        * On discrete platforms, starting from DG2, we have to contend with GTT
+        * page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
+        * objects.  Specifically the hardware only supports 64K or larger GTT
+        * page sizes for such memory. The kernel will already ensure that all
+        * I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
+        * sizes underneath.
+        *
+        * Note that the returned size here will always reflect any required
+        * rounding up done by the kernel, i.e. 4K will now become 64K on devices
+        * such as DG2.
+        *
+        * **Special DG2 GTT address alignment requirement:**
+        *
+        * The GTT alignment will also need to be at least 64K for such objects.
+        *
+        * Note that due to how the hardware implements 64K GTT page support, we
+        * have some further complications:
+        *
+        *   1) The entire PDE (which covers a 2M virtual address range) must
+        *   contain only 64K PTEs, i.e. mixing 4K and 64K PTEs in the same
+        *   PDE is forbidden by the hardware.
+        *
+        *   2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
+        *   objects.
+        *
+        * To handle the above the kernel implements a memory coloring scheme to
+        * prevent userspace from mixing I915_MEMORY_CLASS_DEVICE and
+        * I915_MEMORY_CLASS_SYSTEM objects in the same PDE. If the kernel is
+        * ever unable to evict the required pages for the given PDE (different
+        * color) when inserting the object into the GTT then it will simply
+        * fail the request.
+        *
+        * Since userspace needs to manage the GTT address space itself,
+        * special care is needed to ensure this doesn't happen. The simplest
+        * scheme is to align and round up all I915_MEMORY_CLASS_DEVICE
+        * objects to 2M, which avoids any issues here. At the very least this
+        * is likely needed for objects that can be placed in both
+        * I915_MEMORY_CLASS_DEVICE and I915_MEMORY_CLASS_SYSTEM, to avoid
+        * potential issues when the kernel needs to migrate the object behind
+        * the scenes, since that might also involve evicting other objects.
+        *
+        * **To summarise the GTT rules, on platforms like DG2:**
+        *
+        *   1) All objects that can be placed in I915_MEMORY_CLASS_DEVICE must
+        *   have 64K alignment. The kernel will reject this otherwise.
+        *
+        *   2) I915_MEMORY_CLASS_DEVICE objects must never be placed in the
+        *   same PDE as I915_MEMORY_CLASS_SYSTEM objects. The kernel will
+        *   reject this otherwise.
+        *
+        *   3) Objects that can be placed in both I915_MEMORY_CLASS_DEVICE and
+        *   I915_MEMORY_CLASS_SYSTEM should probably be aligned and padded out
+        *   to 2M.
         */
        __u64 size;
        /**
-- 
2.20.1
