Re: Moving code around, post classic
On Thu, Dec 9, 2021 at 11:24 AM Timur Kristóf wrote: > On Tue, 2021-12-07 at 08:19 -0500, Alyssa Rosenzweig wrote > > > > If it were just linked lists, I'd say someone should write the > > Coccinelle to transform the tree to use the one in util and call it a > > day. It's a bit more complicated though, NIR depends on GLSL types. > > Though that could probably continue to live in its current location > > even > > if glsl moves? Might breed confusion. > > > These GLSL types seem to be used by a lot more than just GLSL. To avoid > some of the confusion, why not rename them to something like NIR types, > or something along these lines? > First off, they're already in src/compiler, not src/compiler/glsl. Could/should we rename them? I'm fine with that and, honestly, the only reason I haven't yet is because it's a pile of work and because I've not been able to come up with a better name than `glsl_type` that sounds generic to all compiler things. We could go `nir_type`, I guess but that's already kind-of taken by NIR ALU types. Then again, those could be renamed too. I'm kind-of meh on it. --Jason
Re: Moving code around, post classic
On Tue, 2021-12-07 at 08:19 -0500, Alyssa Rosenzweig wrote > > If it were just linked lists, I'd say someone should write the > Coccinelle to transform the tree to use the one in util and call it a > day. It's a bit more complicated though, NIR depends on GLSL types. > Though that could probably continue to live in its current location > even > if glsl moves? Might breed confusion. These GLSL types seem to be used by a lot more than just GLSL. To avoid some of the confusion, why not rename them to something like NIR types, or something along these lines?
[PATCH v4 14/16] drm/i915/uapi: document behaviour for DG2 64K support
From: Matthew Auld On discrete platforms like DG2, we need to support a minimum page size of 64K when dealing with device local-memory. This is quite tricky for various reasons, so try to document the new implicit uapi for this. v2: Fixed suggestions on formatting [Daniel] Signed-off-by: Matthew Auld Signed-off-by: Ramalingam C cc: Simon Ser cc: Pekka Paalanen Cc: Jordan Justen Cc: Kenneth Graunke Cc: mesa-dev@lists.freedesktop.org Cc: Tony Ye Cc: Slawomir Milczarek --- include/uapi/drm/i915_drm.h | 67 ++--- 1 file changed, 62 insertions(+), 5 deletions(-) diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 5e678917da70..b7441593434c 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 { /** * When the EXEC_OBJECT_PINNED flag is specified this is populated by * the user with the GTT offset at which this object will be pinned. +* * When the I915_EXEC_NO_RELOC flag is specified this must contain the * presumed_offset of the object. +* * During execbuffer2 the kernel populates it with the value of the * current GTT offset of the object, for future presumed_offset writes. +* +* See struct drm_i915_gem_create_ext for the rules when dealing with +* alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with +* minimum page sizes, like DG2. */ __u64 offset; @@ -3145,11 +3151,62 @@ struct drm_i915_gem_create_ext { * * The (page-aligned) allocated size for the object will be returned. * -* Note that for some devices we have might have further minimum -* page-size restrictions(larger than 4K), like for device local-memory. -* However in general the final size here should always reflect any -* rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS -* extension to place the object in device local-memory. +* +* **DG2 64K min page size implications:** +* +* On discrete platforms, starting from DG2, we have to contend with GTT +* page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE +* objects. Specifically the hardware only supports 64K or larger GTT +* page sizes for such memory. The kernel will already ensure that all +* I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page +* sizes underneath. +* +* Note that the returned size here will always reflect any required +* rounding up done by the kernel, i.e 4K will now become 64K on devices +* such as DG2. +* +* **Special DG2 GTT address alignment requirement:** +* +* The GTT alignment will also need be at least 64K for such objects. +* +* Note that due to how the hardware implements 64K GTT page support, we +* have some further complications: +* +* 1) The entire PDE(which covers a 2M virtual address range), must +* contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same +* PDE is forbidden by the hardware. +* +* 2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM +* objects. +* +* To handle the above the kernel implements a memory coloring scheme to +* prevent userspace from mixing I915_MEMORY_CLASS_DEVICE and +* I915_MEMORY_CLASS_SYSTEM objects in the same PDE. If the kernel is +* ever unable to evict the required pages for the given PDE(different +* color) when inserting the object into the GTT then it will simply +* fail the request. +* +* Since userspace needs to manage the GTT address space themselves, +* special care is needed to ensure this doesn't happen. The simplest +* scheme is to simply align and round up all I915_MEMORY_CLASS_DEVICE +* objects to 2M, which avoids any issues here. At the very least this +* is likely needed for objects that can be placed in both +* I915_MEMORY_CLASS_DEVICE and I915_MEMORY_CLASS_SYSTEM, to avoid +* potential issues when the kernel needs to migrate the object behind +* the scenes, since that might also involve evicting other objects. +* +* **To summarise the GTT rules, on platforms like DG2:** +* +* 1) All objects that can be placed in I915_MEMORY_CLASS_DEVICE must +* have 64K alignment. The kernel will reject this otherwise. +* +* 2) All I915_MEMORY_CLASS_DEVICE objects must never be placed in +* the same PDE with other I915_MEMORY_CLASS_SYSTEM objects. The +* kernel will reject this otherwise. +* +* 3) Objects that can be placed in both I915_MEMORY_CLASS_DEVICE
[PATCH v4 15/16] drm/i915/Flat-CCS: Document on Flat-CCS memory compression
Documents the Flat-CCS feature and kernel handling required along with modifiers used. Signed-off-by: Ramalingam C cc: Simon Ser cc: Pekka Paalanen Cc: Jordan Justen Cc: Kenneth Graunke Cc: mesa-dev@lists.freedesktop.org Cc: Tony Ye Cc: Slawomir Milczarek --- drivers/gpu/drm/i915/gt/intel_migrate.c | 47 + 1 file changed, 47 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c index 0fb83d0bec91..2d7ea9b6e8fb 100644 --- a/drivers/gpu/drm/i915/gt/intel_migrate.c +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c @@ -595,6 +595,53 @@ intel_context_migrate_copy(struct intel_context *ce, return err; } +/** + * DOC: Flat-CCS - Memory compression for Local memory + * + * On Xe-HP and later devices, we use dedicated compression control state (CCS) + * stored in local memory for each surface, to support the 3D and media + * compression formats. + * + * The memory required for the CCS of the entire local memory is 1/256 of the + * local memory size. So before the kernel boot, the required memory is reserved + * for the CCS data and a secure register will be programmed with the CCS base + * address. + * + * Flat CCS data needs to be cleared when a lmem object is allocated. + * And CCS data can be copied in and out of CCS region through + * XY_CTRL_SURF_COPY_BLT. CPU can't access the CCS data directly. + * + * When we exaust the lmem, if the object's placements support smem, then we can + * directly decompress the compressed lmem object into smem and start using it + * from smem itself. + * + * But when we need to swapout the compressed lmem object into a smem region + * though objects' placement doesn't support smem, then we copy the lmem content + * as it is into smem region along with ccs data (using XY_CTRL_SURF_COPY_BLT). + * When the object is referred, lmem content will be swaped in along with + * restoration of the CCS data (using XY_CTRL_SURF_COPY_BLT) at corresponding + * location. + * + * + * Flat-CCS Modifiers for different compression formats + * + * + * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS - used to indicate the buffers of Flat CCS + * render compression formats. Though the general layout is same as + * I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS, new hashing/compression algorithm is + * used. Render compression uses 128 byte compression blocks + * + * I915_FORMAT_MOD_F_TILED_DG2_MC_CCS -used to indicate the buffers of Flat CCS + * media compression formats. Though the general layout is same as + * I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS, new hashing/compression algorithm is + * used. Media compression uses 256 byte compression blocks. + * + * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC - used to indicate the buffers of Flat + * CCS clear color render compression formats. Unified compression format for + * clear color render compression. The genral layout is a tiled layout using + * 4Kb tiles i.e Tile4 layout. + */ + static inline u32 *i915_flush_dw(u32 *cmd, u64 dst, u32 flags) { /* Mask the 3 LSB to use the PPGTT address space */ -- 2.20.1
[PATCH v4 16/16] Doc/gpu/rfc/i915: i915 DG2 uAPI
Details of the new features getting added as part of DG2 enabling and their implicit impact on the uAPI. v2: improvised the Flat-CCS documentation [Danvet & CQ] Signed-off-by: Ramalingam C cc: Daniel Vetter cc: Matthew Auld cc: Simon Ser cc: Pekka Paalanen Cc: Jordan Justen Cc: Kenneth Graunke Cc: mesa-dev@lists.freedesktop.org Cc: Tony Ye Cc: Slawomir Milczarek --- Documentation/gpu/rfc/i915_dg2.rst | 32 ++ Documentation/gpu/rfc/index.rst| 3 +++ 2 files changed, 35 insertions(+) create mode 100644 Documentation/gpu/rfc/i915_dg2.rst diff --git a/Documentation/gpu/rfc/i915_dg2.rst b/Documentation/gpu/rfc/i915_dg2.rst new file mode 100644 index ..9d28b1812bc7 --- /dev/null +++ b/Documentation/gpu/rfc/i915_dg2.rst @@ -0,0 +1,32 @@ + +I915 DG2 RFC Section + + +Upstream plan += +Plan to upstream the DG2 enabling is: + +* Merge basic HW enabling for DG2 (Still without pciid) +* Merge the 64k support for lmem +* Merge the flat CCS enabling patches +* Add the pciid for DG2 and enable the DG2 in CI + + +64K page support for lmem += +On DG2 hw, local-memory supports minimum GTT page size of 64k only. 4k is not +supported anymore. + +DG2 hw doesn't support the 64k (lmem) and 4k (smem) pages in the same ppgtt +Page table. Refer the struct drm_i915_gem_create_ext for the implication of +handling the 64k page size. + +.. kernel-doc:: include/uapi/drm/i915_drm.h +:functions: drm_i915_gem_create_ext + + +Flat CCS support for lmem += + +.. kernel-doc:: drivers/gpu/drm/i915/gt/intel_migrate.c +:doc: Flat-CCS - Memory compression for Local memory diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst index 91e93a705230..afb320ed4028 100644 --- a/Documentation/gpu/rfc/index.rst +++ b/Documentation/gpu/rfc/index.rst @@ -20,6 +20,9 @@ host such documentation: i915_gem_lmem.rst +.. toctree:: +i915_dg2.rst + .. toctree:: i915_scheduler.rst -- 2.20.1