Re: Moving code around, post classic

2021-12-09 Thread Jason Ekstrand
On Thu, Dec 9, 2021 at 11:24 AM Timur Kristóf wrote:

> On Tue, 2021-12-07 at 08:19 -0500, Alyssa Rosenzweig wrote
> >
> > If it were just linked lists, I'd say someone should write the
> > Coccinelle to transform the tree to use the one in util and call it a
> > day. It's a bit more complicated though, NIR depends on GLSL types.
> > Though that could probably continue to live in its current location
> > even if glsl moves? Might breed confusion.
>
>
> These GLSL types seem to be used by a lot more than just GLSL. To avoid
> some of the confusion, why not rename them to something like NIR types,
> or something along these lines?
>

First off, they're already in src/compiler, not src/compiler/glsl.

Could/should we rename them?  I'm fine with that and, honestly, the only
reason I haven't yet is because it's a pile of work and because I've not
been able to come up with a better name than `glsl_type` that sounds
generic to all compiler things.  We could go `nir_type`, I guess but that's
already kind-of taken by NIR ALU types.  Then again, those could be renamed
too.  I'm kind-of meh on it.

--Jason


Re: Moving code around, post classic

2021-12-09 Thread Timur Kristóf
On Tue, 2021-12-07 at 08:19 -0500, Alyssa Rosenzweig wrote
> 
> If it were just linked lists, I'd say someone should write the
> Coccinelle to transform the tree to use the one in util and call it a
> day. It's a bit more complicated though, NIR depends on GLSL types.
> Though that could probably continue to live in its current location
> even if glsl moves? Might breed confusion.


These GLSL types seem to be used by a lot more than just GLSL. To avoid
some of the confusion, why not rename them to something like NIR types,
or something along these lines?


[PATCH v4 14/16] drm/i915/uapi: document behaviour for DG2 64K support

2021-12-09 Thread Ramalingam C
From: Matthew Auld 

On discrete platforms like DG2, we need to support a minimum page size
of 64K when dealing with device local-memory. This is quite tricky for
various reasons, so try to document the new implicit uapi for this.

v2: Fixed suggestions on formatting [Daniel]

Signed-off-by: Matthew Auld 
Signed-off-by: Ramalingam C 
cc: Simon Ser 
cc: Pekka Paalanen 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: mesa-dev@lists.freedesktop.org
Cc: Tony Ye 
Cc: Slawomir Milczarek 
---
 include/uapi/drm/i915_drm.h | 67 ++---
 1 file changed, 62 insertions(+), 5 deletions(-)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 5e678917da70..b7441593434c 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 {
/**
 * When the EXEC_OBJECT_PINNED flag is specified this is populated by
 * the user with the GTT offset at which this object will be pinned.
+*
 * When the I915_EXEC_NO_RELOC flag is specified this must contain the
 * presumed_offset of the object.
+*
 * During execbuffer2 the kernel populates it with the value of the
 * current GTT offset of the object, for future presumed_offset writes.
+*
+* See struct drm_i915_gem_create_ext for the rules when dealing with
+* alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with
+* minimum page sizes, like DG2.
 */
__u64 offset;
 
@@ -3145,11 +3151,62 @@ struct drm_i915_gem_create_ext {
 *
 * The (page-aligned) allocated size for the object will be returned.
 *
-* Note that for some devices we have might have further minimum
-* page-size restrictions(larger than 4K), like for device local-memory.
-* However in general the final size here should always reflect any
-* rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
-* extension to place the object in device local-memory.
+*
+* **DG2 64K min page size implications:**
+*
+* On discrete platforms, starting from DG2, we have to contend with GTT
+* page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
+* objects.  Specifically the hardware only supports 64K or larger GTT
+* page sizes for such memory. The kernel will already ensure that all
+* I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
+* sizes underneath.
+*
+* Note that the returned size here will always reflect any required
+* rounding up done by the kernel, i.e. 4K will now become 64K on devices
+* such as DG2.
+*
+* **Special DG2 GTT address alignment requirement:**
+*
+* The GTT alignment will also need to be at least 64K for such objects.
+*
+* Note that due to how the hardware implements 64K GTT page support, we
+* have some further complications:
+*
+*   1) The entire PDE (which covers a 2M virtual address range) must
+*   contain only 64K PTEs, i.e. mixing 4K and 64K PTEs in the same
+*   PDE is forbidden by the hardware.
+*
+*   2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
+*   objects.
+*
+* To handle the above, the kernel implements a memory coloring scheme to
+* prevent userspace from mixing I915_MEMORY_CLASS_DEVICE and
+* I915_MEMORY_CLASS_SYSTEM objects in the same PDE. If the kernel is
+* ever unable to evict the required pages for the given PDE (different
+* color) when inserting the object into the GTT then it will simply
+* fail the request.
+*
+* Since userspace needs to manage the GTT address space itself,
+* special care is needed to ensure this doesn't happen. The simplest
+* scheme is to align and round up all I915_MEMORY_CLASS_DEVICE
+* objects to 2M, which avoids any issues here. At the very least this
+* is likely needed for objects that can be placed in both
+* I915_MEMORY_CLASS_DEVICE and I915_MEMORY_CLASS_SYSTEM, to avoid
+* potential issues when the kernel needs to migrate the object behind
+* the scenes, since that might also involve evicting other objects.
+*
+* **To summarise the GTT rules, on platforms like DG2:**
+*
+*   1) All objects that can be placed in I915_MEMORY_CLASS_DEVICE must
+*   have 64K alignment. The kernel will reject this otherwise.
+*
+*   2) All I915_MEMORY_CLASS_DEVICE objects must never be placed in
+*   the same PDE with other I915_MEMORY_CLASS_SYSTEM objects. The
+*   kernel will reject this otherwise.
+*
+*   3) Objects that can be placed in both I915_MEMORY_CLASS_DEVICE 
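
The GTT rules in the comment above can be illustrated with a small userspace-side sketch. This is not code from the patch or from any driver: the `placement` struct and all helper names are hypothetical, and the only facts taken from the text are the 64K minimum alignment, the 2M range covered by one PDE, and the "align and round up to 2M" scheme.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sizes quoted in the uAPI comment; the macro names are hypothetical. */
#define SZ_64K (UINT64_C(64) << 10)
#define SZ_2M  (UINT64_C(2) << 20)

/* Round x up to the next multiple of a power-of-two alignment. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/* Hypothetical record of one object in a userspace-managed GTT. */
struct placement {
    uint64_t offset; /* GTT address chosen by userspace */
    uint64_t size;   /* size in bytes, must be non-zero */
    bool device;     /* object may live in I915_MEMORY_CLASS_DEVICE */
};

/* Rule 1: device-local objects need at least 64K GTT alignment. */
static bool offset_ok(const struct placement *p)
{
    return !p->device || (p->offset % SZ_64K) == 0;
}

/* True if the two address ranges touch at least one common 2M PDE range. */
static bool share_pde(const struct placement *a, const struct placement *b)
{
    uint64_t a_first = a->offset / SZ_2M;
    uint64_t a_last = (a->offset + a->size - 1) / SZ_2M;
    uint64_t b_first = b->offset / SZ_2M;
    uint64_t b_last = (b->offset + b->size - 1) / SZ_2M;

    return a_first <= b_last && b_first <= a_last;
}

/* Rule 2: a device object and a system object must not share a PDE. */
static bool pde_conflict(const struct placement *a, const struct placement *b)
{
    return a->device != b->device && share_pde(a, b);
}

/* Simplest scheme from the text: give each device object a 2M-aligned,
 * 2M-rounded slot so it can never share a PDE with anything else. */
static struct placement place_device_object(uint64_t next_free, uint64_t size)
{
    struct placement p = {
        .offset = round_up_pow2(next_free, SZ_2M),
        .size = round_up_pow2(size, SZ_2M),
        .device = true,
    };
    return p;
}
```

With these helpers, a 64K device object placed at 0x10000 passes `offset_ok()` but conflicts with a 4K system object at 0x20000, because both fall in the first 2M range. Placing the device object through `place_device_object()` moves it to its own 2M slot and removes the conflict, at the cost of padding small objects.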

[PATCH v4 15/16] drm/i915/Flat-CCS: Document on Flat-CCS memory compression

2021-12-09 Thread Ramalingam C
Documents the Flat-CCS feature and kernel handling required along with
modifiers used.

Signed-off-by: Ramalingam C 
cc: Simon Ser 
cc: Pekka Paalanen 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: mesa-dev@lists.freedesktop.org
Cc: Tony Ye 
Cc: Slawomir Milczarek 
---
 drivers/gpu/drm/i915/gt/intel_migrate.c | 47 +
 1 file changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
index 0fb83d0bec91..2d7ea9b6e8fb 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -595,6 +595,53 @@ intel_context_migrate_copy(struct intel_context *ce,
return err;
 }
 
+/**
+ * DOC: Flat-CCS - Memory compression for Local memory
+ *
+ * On Xe-HP and later devices, we use dedicated compression control state (CCS)
+ * stored in local memory for each surface, to support the 3D and media
+ * compression formats.
+ *
+ * The memory required for the CCS of the entire local memory is 1/256 of the
+ * local memory size. So before the kernel boots, the required memory is reserved
+ * for the CCS data and a secure register will be programmed with the CCS base
+ * address.
+ *
+ * Flat CCS data needs to be cleared when an lmem object is allocated.
+ * CCS data can be copied in and out of the CCS region through
+ * XY_CTRL_SURF_COPY_BLT. The CPU can't access the CCS data directly.
+ *
+ * When we exhaust the lmem, if the object's placements support smem, then we
+ * can directly decompress the compressed lmem object into smem and start using
+ * it from smem itself.
+ *
+ * But when we need to swap out a compressed lmem object into an smem region
+ * even though the object's placements don't support smem, we copy the lmem
+ * content as is into the smem region, along with the CCS data (using
+ * XY_CTRL_SURF_COPY_BLT). When the object is referenced again, the lmem content
+ * is swapped in and the CCS data is restored (using XY_CTRL_SURF_COPY_BLT) at
+ * the corresponding location.
+ *
+ *
+ * Flat-CCS Modifiers for different compression formats
+ * ----------------------------------------------------
+ *
+ * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS - used to indicate the buffers of Flat CCS
+ * render compression formats. Though the general layout is the same as
+ * I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS, a new hashing/compression algorithm is
+ * used. Render compression uses 128 byte compression blocks.
+ *
+ * I915_FORMAT_MOD_F_TILED_DG2_MC_CCS - used to indicate the buffers of Flat CCS
+ * media compression formats. Though the general layout is the same as
+ * I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS, a new hashing/compression algorithm is
+ * used. Media compression uses 256 byte compression blocks.
+ *
+ * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC - used to indicate the buffers of Flat
+ * CCS clear color render compression formats. Unified compression format for
+ * clear color render compression. The general layout is a tiled layout using
+ * 4KB tiles, i.e. the Tile4 layout.
+ */
+
 static inline u32 *i915_flush_dw(u32 *cmd, u64 dst, u32 flags)
 {
/* Mask the 3 LSB to use the PPGTT address space */
-- 
2.20.1
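
The 1/256 ratio stated in the Flat-CCS documentation above can be turned into a quick sizing calculation. This is only an illustrative sketch: the function and macro names are hypothetical, and rounding up for sizes that are not a multiple of 256 is an assumption made here, not something the patch specifies.

```c
#include <assert.h>
#include <stdint.h>

/* From the text: CCS needs 1/256 of the local memory size.
 * The macro name is hypothetical. */
#define FLAT_CCS_RATIO UINT64_C(256)

/* Bytes of CCS needed to cover lmem_bytes of local memory.
 * Rounding up is an assumption for sizes not divisible by 256. */
static uint64_t flat_ccs_bytes(uint64_t lmem_bytes)
{
    /* Guard against wrap-around in the round-up addition. */
    assert(lmem_bytes <= UINT64_MAX - (FLAT_CCS_RATIO - 1));
    return (lmem_bytes + FLAT_CCS_RATIO - 1) / FLAT_CCS_RATIO;
}
```

For example, 16 GiB of local memory would need 64 MiB reserved for CCS data under this ratio, and a single 64K page is covered by 256 bytes of CCS.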



[PATCH v4 16/16] Doc/gpu/rfc/i915: i915 DG2 uAPI

2021-12-09 Thread Ramalingam C
Details of the new features getting added as part of DG2 enabling and their
implicit impact on the uAPI.

v2: improved the Flat-CCS documentation [Danvet & CQ]

Signed-off-by: Ramalingam C 
cc: Daniel Vetter 
cc: Matthew Auld 
cc: Simon Ser 
cc: Pekka Paalanen 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: mesa-dev@lists.freedesktop.org
Cc: Tony Ye 
Cc: Slawomir Milczarek 
---
 Documentation/gpu/rfc/i915_dg2.rst | 32 ++
 Documentation/gpu/rfc/index.rst|  3 +++
 2 files changed, 35 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_dg2.rst

diff --git a/Documentation/gpu/rfc/i915_dg2.rst b/Documentation/gpu/rfc/i915_dg2.rst
new file mode 100644
index ..9d28b1812bc7
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_dg2.rst
@@ -0,0 +1,32 @@
+====================
+I915 DG2 RFC Section
+====================
+
+Upstream plan
+=============
+The plan to upstream the DG2 enabling is:
+
+* Merge basic HW enabling for DG2 (Still without pciid)
+* Merge the 64k support for lmem
+* Merge the flat CCS enabling patches
+* Add the pciid for DG2 and enable the DG2 in CI
+
+
+64K page support for lmem
+=========================
+On DG2 hardware, local memory supports a minimum GTT page size of 64k only.
+4k is no longer supported.
+
+DG2 hardware doesn't support 64k (lmem) and 4k (smem) pages in the same ppgtt
+page table. Refer to struct drm_i915_gem_create_ext for the implications of
+handling the 64k page size.
+
+.. kernel-doc:: include/uapi/drm/i915_drm.h
+        :functions: drm_i915_gem_create_ext
+
+
+Flat CCS support for lmem
+=========================
+
+.. kernel-doc:: drivers/gpu/drm/i915/gt/intel_migrate.c
+        :doc: Flat-CCS - Memory compression for Local memory
diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
index 91e93a705230..afb320ed4028 100644
--- a/Documentation/gpu/rfc/index.rst
+++ b/Documentation/gpu/rfc/index.rst
@@ -20,6 +20,9 @@ host such documentation:
 
 i915_gem_lmem.rst
 
+.. toctree::
+    i915_dg2.rst
+
 .. toctree::
 
 i915_scheduler.rst
-- 
2.20.1