[ANNOUNCE] mesa 22.1.0-rc3
Hi list,

I'd like to announce that Mesa 22.1.0-rc3 is now available for general
consumption. There's a lot here, stuff for dzn, util, vulkan, r300, nir,
intel, radv, anv, ac, crocus. The biggest change is lots of backports for
kopper and zink with their supporting changes, such as wgl, which Mike was
gracious enough to manually backport for me. Thanks, Mike.

Please enjoy, and as always, report any issues.

Cheers,
Dylan

shortlog

Alexey Bozhenko (1):
      spirv: fix OpBranchConditional when both branches are the same

Boris Brezillon (4):
      dzn: Add missing VKAPI_{ATTR,CALL} specifiers to BeginCommandBuffer()
      dzn: Pass the dzn_event pointer to _mesa_hash_table_insert()
      dzn: Fix the STATIC_ASSERT() in dzn_meta_blits_get_context()
      ci/windows: Add a variable to globally disable jobs using windows runners

Daniel Stone (2):
      CI: Disable Windows jobs
      ci: Also disable Windows container builds when down

Dave Airlie (2):
      u_blitter/stencil: take dstbox x/y into accounts for dst fb width
      util/stencil: fix stencil fallback blit shader texture types.

Dylan Baker (5):
      .pick_status.json: Update to 9f44a264623461c98368185b023d99446676e039
      .pick_status.json: Update to fbece25a451bb7915891851ee5c72724974ae5e2
      .pick_status.json: Update to a6a4bf0f1eae36cb68d5c67653ac013fe0fbde8a
      .pick_status.json: Update to f329f67243d671965d73bd2243cffc4e1e68c4a3
      VERSION: bump for 22.1.0-rc3

Filip Gawin (1):
      r300: Print warning when stubbing derivatives

Jason Ekstrand (3):
      util/set: Respect found in search_or_add_pre_hashed
      nir: Lower all bit sizes of usub_borrow
      vulkan: Set signals[i].stageMask = ALL_COMMANDS for QueueSubmit2 wrapping

Jordan Justen (1):
      intel/dev: Add device info for RPL-P

Konstantin Seurer (1):
      radv: Fix lowering ignore_ray_intersection

Lionel Landwerlin (4):
      nir/divergence: handle load_global_block_intel
      intel: fixup number of threads per EU on XeHP
      anv: fix acceleration structure descriptor template writes
      anv: skip acceleration structure in binding table emission

Marek Olšák (3):
      nir: fix an uninitialized variable valgrind warning in nir_group_loads
      ac/surface: fix an addrlib race condition on gfx9
      winsys/amdgpu: fix a mutex deadlock when we fail to create pipe_screen

Martin Roukala (né Peres) (1):
      ci/b2c: fix the generation of the IMAGE_UNDER_TEST variable

Michael Olbrich (1):
      crocus: export GEM handle with RDWR access rights

Mike Blumenkrantz (21):
      zink: handle device-local unsynchronized maps
      util/draw: fix map size of indirect buffer in util_draw_indirect_read
      util/draw: handle draw_count=0 when reading indirect parameters
      util/draw: fix indirect draw count readback
      zink: move the kopper present fence to the displaytarget object
      wgl: support GL 4.6
      zink: fix tcs control barriers for use without vk memory model
      zink: fix semantics mask for compute control barriers
      zink: add synchronization for buffer clears
      mesa/st: clamp GL_RENDERBUFFER to GL_TEXTURE_2D for sparse queries
      glsl/nir: set new_style_shadow for sparse tex ops as necessary
      zink: fix group memory barrier emission
      vulkan: bump layer api versions to current vk header version
      kopper: always fetch and store drawable info
      kopper: move drawable geometry updating up in function
      kopper: store whether screen has dmabuf support
      kopper: copy a bunch of code for texture_from_pixmap
      kopper: add DISPLAY_TARGET bind for depth buffer
      zink: fix/improve swapchain surface info updating
      zink: fix up swapchain depth buffer geometry during fb update
      zink: ci update

Paulo Zanoni (1):
      iris: fix race condition during busy tracking

Pavel Ondračka (1):
      r300: set PIPE_BIND_CONSTANT_BUFFER for const_uploader

Pierre-Eric Pelloux-Prayer (1):
      ac/surface: adjust gfx9.pitch[*] based on surf->blk_w

Rhys Perry (1):
      radv: fix clearing of TRUNC_COORD with tg4 and immutable samplers

Samuel Pitoiset (4):
      radv: only apply enable_mrt_output_nan_fixup for 32-bit float MRTs
      aco: fix load_barycentric_at_{sample,offset} on GFX6-7
      nir: fix marking XFB varyings as always active IO
      nir: mark XFB varyings as unmoveable to prevent them to be remapped

Sidney Just (6):
      wgl: add a flag to determine if running on zink
      wgl: add zink to the list of auto-loaded drivers
      zink: support VK_KHR_win32_surface
      kopper: add win32 loader interface
      zink: support win32 wsi
      wgl: support kopper

Sviatoslav Peleshko (1):
      anv: workaround apps that assume full subgroups without specifying it

Vadym Shovkoplias (1):
      anv: Fix geometry flickering issue when compute and 3D passes are combined

git tag: mesa-22.1.0-rc3

https://mesa.freedesktop.org/archive/mesa-22.1.0-rc3.tar.xz
SHA256:
Re: [Intel-gfx] [PATCH v2] drm/doc: add rfc section for small BAR uapi
On 27/04/2022 09:36, Tvrtko Ursulin wrote:
> On 20/04/2022 18:13, Matthew Auld wrote:
> > Add an entry for the new uapi needed for small BAR on DG2+.
> >
> > v2:
> >   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
> >   - Rework error capture interactions, including no longer needing
> >     NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
> >   - Add probed_cpu_visible_size. (Lionel)
> >
> > Signed-off-by: Matthew Auld
> > Cc: Thomas Hellström
> > Cc: Lionel Landwerlin
> > Cc: Jon Bloomfield
> > Cc: Daniel Vetter
> > Cc: Jordan Justen
> > Cc: Kenneth Graunke
> > Cc: Akeem G Abodunrin
> > Cc: mesa-dev@lists.freedesktop.org
> > ---
> >  Documentation/gpu/rfc/i915_small_bar.h | 190 +++
> >  Documentation/gpu/rfc/i915_small_bar.rst | 58 +++
> >  Documentation/gpu/rfc/index.rst | 4 +
> >  3 files changed, 252 insertions(+)
> >  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
> >  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst
> >
> > diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h
> > new file mode 100644
> > index ..7bfd0cf44d35
> > --- /dev/null
> > +++ b/Documentation/gpu/rfc/i915_small_bar.h
> > @@ -0,0 +1,190 @@
> > +/**
> > + * struct __drm_i915_memory_region_info - Describes one region as known to the
> > + * driver.
> > + *
> > + * Note this is using both struct drm_i915_query_item and struct drm_i915_query.
> > + * For this new query we are adding the new query id DRM_I915_QUERY_MEMORY_REGIONS
> > + * at &drm_i915_query_item.query_id.
> > + */
> > +struct __drm_i915_memory_region_info {
> > +	/** @region: The class:instance pair encoding */
> > +	struct drm_i915_gem_memory_class_instance region;
> > +
> > +	/** @rsvd0: MBZ */
> > +	__u32 rsvd0;
> > +
> > +	/** @probed_size: Memory probed by the driver (-1 = unknown) */
> > +	__u64 probed_size;
> > +
> > +	/** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
> > +	__u64 unallocated_size;
> > +
> > +	union {
> > +		/** @rsvd1: MBZ */
> > +		__u64 rsvd1[8];
> > +		struct {
> > +			/**
> > +			 * @probed_cpu_visible_size: Memory probed by the driver
> > +			 * that is CPU accessible. (-1 = unknown).
> > +			 *
> > +			 * This will be always be <= @probed_size, and the
> > +			 * remainder(if there is any) will not be CPU
> > +			 * accessible.
> > +			 */
> > +			__u64 probed_cpu_visible_size;
>
> Would unallocated_cpu_visible_size be useful, to follow the total
> unallocated_size?

Makes sense. But I don't think unallocated_size has actually been properly
wired up yet. It still just gives the same value as probed_size. IIRC for
unallocated_size we still need a real user/usecase/umd before wiring that
up for real with the existing avail tracking. Once we have that we can also
add unallocated_cpu_visible_size.

> Btw, have we ever considered whether unallocated_size should require
> CAP_SYS_ADMIN/PERFMON or something?

Not sure. But just in case we do add it for real at some point, why the
added restriction?

> > +		};
> > +	};
> > +};
> > +
> > +/**
> > + * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added
> > + * extension support using struct i915_user_extension.
> > + *
> > + * Note that new buffer flags should be added here, at least for the stuff that
> > + * is immutable. Previously we would have two ioctls, one to create the object
> > + * with gem_create, and another to apply various parameters, however this
> > + * creates some ambiguity for the params which are considered immutable. Also in
> > + * general we're phasing out the various SET/GET ioctls.
> > + */
> > +struct __drm_i915_gem_create_ext {
> > +	/**
> > +	 * @size: Requested size for the object.
> > +	 *
> > +	 * The (page-aligned) allocated size for the object will be returned.
> > +	 *
> > +	 * Note that for some devices we have might have further minimum
> > +	 * page-size restrictions(larger than 4K), like for device local-memory.
> > +	 * However in general the final size here should always reflect any
> > +	 * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
> > +	 * extension to place the object in device local-memory.
> > +	 */
> > +	__u64 size;
> > +	/**
> > +	 * @handle: Returned handle for the object.
> > +	 *
> > +	 * Object handles are nonzero.
> > +	 */
> > +	__u32 handle;
> > +	/**
> > +	 * @flags: Optional flags.
> > +	 *
> > +	 * Supported values:
> > +	 *
> > +	 * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the kernel that
> > +	 * the object will need to be accessed via the CPU.
> > +	 *
> > +	 * Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and
> > +	 * only strictly required on platforms where only some of the device
> > +	 * memory is directly visible or mappable through the CPU, like on DG2+.
> > +	 *
> > +	 * One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to
> > +	 * ensure we can always spill the allocation to system memory, if we
> > +	 * can't place the object in the mappable part of
> > +	 * I915_MEMORY_CLASS_DEVICE.
> > +	 *
> > +	 * Note that since the kernel only supports
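
As an illustration of how userspace might read the two size fields discussed in this message, here is a minimal sketch. It assumes only what the quoted kernel-doc states: that -1 means "unknown" and that probed_cpu_visible_size is never larger than probed_size. The struct region_sizes type and both helper names are hypothetical and exist only for this example; they are not the uapi struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The RFC documents -1 as "unknown"; for a __u64 that is all ones. */
#define REGION_SIZE_UNKNOWN UINT64_MAX

/*
 * Hypothetical, trimmed-down mirror of the two size fields discussed in
 * the thread. This is NOT the real uapi struct.
 */
struct region_sizes {
	uint64_t probed_size;
	uint64_t probed_cpu_visible_size;
};

/*
 * True when only part of the region is CPU visible (the small BAR case).
 * If either size is unknown we cannot tell, so report false.
 */
bool region_is_small_bar(const struct region_sizes *r)
{
	if (r->probed_size == REGION_SIZE_UNKNOWN ||
	    r->probed_cpu_visible_size == REGION_SIZE_UNKNOWN)
		return false;

	/* Documented invariant: cpu visible part <= probed size. */
	assert(r->probed_cpu_visible_size <= r->probed_size);

	return r->probed_cpu_visible_size < r->probed_size;
}

/* Size of the part that is not CPU accessible, or 0 if none or unknown. */
uint64_t region_cpu_invisible_size(const struct region_sizes *r)
{
	if (!region_is_small_bar(r))
		return 0;

	return r->probed_size - r->probed_cpu_visible_size;
}
```

An unallocated_cpu_visible_size field, if it is ever added as suggested above, could presumably be handled the same way, but nothing in the thread defines its semantics yet.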
Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi
On 27/04/2022 18:18, Matthew Auld wrote:
> On 27/04/2022 07:48, Lionel Landwerlin wrote:
> > One question though, how do we detect that this flag
> > (I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given
> > kernel?
> > I assume older kernels are going to reject object creation if we use
> > this flag?
>
> From some offline discussion with Lionel, the plan here is to just do a
> dummy gem_create_ext to check if the kernel throws an error with the new
> flag or not.
>
> > I didn't plan to use __drm_i915_query_vma_info, but isn't it
> > inconsistent to select the placement on the GEM object and then query
> > whether it's mappable by address?
> > You made a comment stating this is racy, wouldn't querying on the GEM
> > object prevent this?
>
> Since mesa at this time doesn't currently have a use for this one, then I
> guess we should maybe just drop this part of the uapi, in this version at
> least, if no objections.

Just repeating what we discussed (maybe I missed some other discussion and
that's why I was confused):

The way I was planning to use this is to have 3 heaps in Vulkan:

  - heap0: local only, not cpu visible
  - heap1: system, cpu visible
  - heap2: local & cpu visible

With heap2 having the reported probed_cpu_visible_size size.

It is an error for the application to map from heap0 [1].

With that said, it means if we created a GEM BO without
I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS, we'll never mmap it.
So why the query?

I guess it would be useful when we import a buffer from another
application. But in that case, why not have the query on the BO?

-Lionel

[1] : https://www.khronos.org/registry/vulkan/specs/1.3-extensions/man/html/vkMapMemory.html
(VUID-vkMapMemory-memory-00682)

> > Thanks,
> >
> > -Lionel
> >
> > On 27/04/2022 09:35, Lionel Landwerlin wrote:
> > > Hi Matt,
> > >
> > > The proposal looks good to me.
> > >
> > > Looking forward to trying it on drm-tip.
> > >
> > > -Lionel
> > >
> > > On 20/04/2022 20:13, Matthew Auld wrote:
> > > > Add an entry for the new uapi needed for small BAR on DG2+.
> > > >
> > > > v2:
> > > >   - Some spelling fixes and other small tweaks.
(Akeem & Thomas) - Rework error capture interactions, including no longer needing NEEDS_CPU_ACCESS for objects marked for capture. (Thomas) - Add probed_cpu_visible_size. (Lionel) Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Lionel Landwerlin Cc: Jon Bloomfield Cc: Daniel Vetter Cc: Jordan Justen Cc: Kenneth Graunke Cc: Akeem G Abodunrin Cc: mesa-dev@lists.freedesktop.org --- Documentation/gpu/rfc/i915_small_bar.h | 190 +++ Documentation/gpu/rfc/i915_small_bar.rst | 58 +++ Documentation/gpu/rfc/index.rst | 4 + 3 files changed, 252 insertions(+) create mode 100644 Documentation/gpu/rfc/i915_small_bar.h create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h new file mode 100644 index ..7bfd0cf44d35 --- /dev/null +++ b/Documentation/gpu/rfc/i915_small_bar.h @@ -0,0 +1,190 @@ +/** + * struct __drm_i915_memory_region_info - Describes one region as known to the + * driver. + * + * Note this is using both struct drm_i915_query_item and struct drm_i915_query. + * For this new query we are adding the new query id DRM_I915_QUERY_MEMORY_REGIONS + * at _i915_query_item.query_id. + */ +struct __drm_i915_memory_region_info { + /** @region: The class:instance pair encoding */ + struct drm_i915_gem_memory_class_instance region; + + /** @rsvd0: MBZ */ + __u32 rsvd0; + + /** @probed_size: Memory probed by the driver (-1 = unknown) */ + __u64 probed_size; + + /** @unallocated_size: Estimate of memory remaining (-1 = unknown) */ + __u64 unallocated_size; + + union { + /** @rsvd1: MBZ */ + __u64 rsvd1[8]; + struct { + /** + * @probed_cpu_visible_size: Memory probed by the driver + * that is CPU accessible. (-1 = unknown). + * + * This will be always be <= @probed_size, and the + * remainder(if there is any) will not be CPU + * accessible. 
+ */ + __u64 probed_cpu_visible_size; + }; + }; +}; + +/** + * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added + * extension support using struct i915_user_extension. + * + * Note that new buffer flags should be added here, at least for the stuff that + * is immutable. Previously we would have two ioctls, one to create the object + * with gem_create, and another to apply various parameters, however this + * creates some ambiguity for the params which are considered immutable. Also in + * general we're phasing out the various SET/GET ioctls. + */ +struct __drm_i915_gem_create_ext { + /** + * @size: Requested size for the object. + * + * The (page-aligned) allocated size for the object will be returned. + * + * Note that for some devices we have might have further minimum +
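
The three-heap layout Lionel describes in this message could be sketched roughly as follows. Only the heap roles and the size of heap2 (the reported probed_cpu_visible_size) come from the message. The size given to heap0 (probed size minus the CPU-visible part) and the rule that only heap2 allocations would set NEEDS_CPU_ACCESS are assumptions made for illustration, and every type and function name here is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum heap_index {
	HEAP_LOCAL_ONLY = 0,		/* heap0: local only, not cpu visible */
	HEAP_SYSTEM = 1,		/* heap1: system, cpu visible */
	HEAP_LOCAL_CPU_VISIBLE = 2,	/* heap2: local & cpu visible */
	HEAP_COUNT
};

/* Hypothetical description of one Vulkan memory heap. */
struct heap_desc {
	uint64_t size;
	bool device_local;
	bool cpu_visible;
};

void build_heaps(uint64_t local_probed, uint64_t local_cpu_visible,
		 uint64_t system_size, struct heap_desc heaps[HEAP_COUNT])
{
	assert(local_cpu_visible <= local_probed);

	heaps[HEAP_LOCAL_ONLY] = (struct heap_desc){
		/* Assumption: heap0 gets whatever is not CPU visible. */
		.size = local_probed - local_cpu_visible,
		.device_local = true,
		.cpu_visible = false,
	};
	heaps[HEAP_SYSTEM] = (struct heap_desc){
		.size = system_size,
		.device_local = false,
		.cpu_visible = true,
	};
	heaps[HEAP_LOCAL_CPU_VISIBLE] = (struct heap_desc){
		/* From the thread: heap2 has the probed_cpu_visible_size size. */
		.size = local_cpu_visible,
		.device_local = true,
		.cpu_visible = true,
	};
}

/*
 * Assumption: only device-local allocations that the application may map
 * would be created with NEEDS_CPU_ACCESS.
 */
bool heap_wants_needs_cpu_access(const struct heap_desc *h)
{
	return h->device_local && h->cpu_visible;
}
```

With this split, a buffer allocated from heap0 is never created with the flag and never mapped, which is the point Lionel makes when asking what the per-address query would be for.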
Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi
On 27/04/2022 07:48, Lionel Landwerlin wrote:
> One question though, how do we detect that this flag
> (I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given
> kernel?
> I assume older kernels are going to reject object creation if we use
> this flag?

From some offline discussion with Lionel, the plan here is to just do a
dummy gem_create_ext to check if the kernel throws an error with the new
flag or not.

> I didn't plan to use __drm_i915_query_vma_info, but isn't it
> inconsistent to select the placement on the GEM object and then query
> whether it's mappable by address?
> You made a comment stating this is racy, wouldn't querying on the GEM
> object prevent this?

Since mesa at this time doesn't currently have a use for this one, then I
guess we should maybe just drop this part of the uapi, in this version at
least, if no objections.

> Thanks,
>
> -Lionel
>
> On 27/04/2022 09:35, Lionel Landwerlin wrote:
> > Hi Matt,
> >
> > The proposal looks good to me.
> >
> > Looking forward to try it on drm-tip.
> >
> > -Lionel
> >
> > On 20/04/2022 20:13, Matthew Auld wrote:
> > > Add an entry for the new uapi needed for small BAR on DG2+.
> > >
> > > v2:
> > >   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
> > >   - Rework error capture interactions, including no longer needing
> > >     NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
> > >   - Add probed_cpu_visible_size.
(Lionel) Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Lionel Landwerlin Cc: Jon Bloomfield Cc: Daniel Vetter Cc: Jordan Justen Cc: Kenneth Graunke Cc: Akeem G Abodunrin Cc: mesa-dev@lists.freedesktop.org --- Documentation/gpu/rfc/i915_small_bar.h | 190 +++ Documentation/gpu/rfc/i915_small_bar.rst | 58 +++ Documentation/gpu/rfc/index.rst | 4 + 3 files changed, 252 insertions(+) create mode 100644 Documentation/gpu/rfc/i915_small_bar.h create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h new file mode 100644 index ..7bfd0cf44d35 --- /dev/null +++ b/Documentation/gpu/rfc/i915_small_bar.h @@ -0,0 +1,190 @@ +/** + * struct __drm_i915_memory_region_info - Describes one region as known to the + * driver. + * + * Note this is using both struct drm_i915_query_item and struct drm_i915_query. + * For this new query we are adding the new query id DRM_I915_QUERY_MEMORY_REGIONS + * at _i915_query_item.query_id. + */ +struct __drm_i915_memory_region_info { + /** @region: The class:instance pair encoding */ + struct drm_i915_gem_memory_class_instance region; + + /** @rsvd0: MBZ */ + __u32 rsvd0; + + /** @probed_size: Memory probed by the driver (-1 = unknown) */ + __u64 probed_size; + + /** @unallocated_size: Estimate of memory remaining (-1 = unknown) */ + __u64 unallocated_size; + + union { + /** @rsvd1: MBZ */ + __u64 rsvd1[8]; + struct { + /** + * @probed_cpu_visible_size: Memory probed by the driver + * that is CPU accessible. (-1 = unknown). + * + * This will be always be <= @probed_size, and the + * remainder(if there is any) will not be CPU + * accessible. + */ + __u64 probed_cpu_visible_size; + }; + }; +}; + +/** + * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added + * extension support using struct i915_user_extension. + * + * Note that new buffer flags should be added here, at least for the stuff that + * is immutable. 
Previously we would have two ioctls, one to create the object + * with gem_create, and another to apply various parameters, however this + * creates some ambiguity for the params which are considered immutable. Also in + * general we're phasing out the various SET/GET ioctls. + */ +struct __drm_i915_gem_create_ext { + /** + * @size: Requested size for the object. + * + * The (page-aligned) allocated size for the object will be returned. + * + * Note that for some devices we have might have further minimum + * page-size restrictions(larger than 4K), like for device local-memory. + * However in general the final size here should always reflect any + * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS + * extension to place the object in device local-memory. + */ + __u64 size; + /** + * @handle: Returned handle for the object. + * + * Object handles are nonzero. + */ + __u32 handle; + /** + * @flags: Optional flags. + * + * Supported values: + * + * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the kernel that + * the object will need to be accessed via the CPU. + * + * Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and + * only strictly required on platforms where only some of the device + * memory is directly visible or
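
The "dummy gem_create_ext" probe mentioned in this reply could look roughly like the sketch below. To keep it self-contained, the ioctl is replaced by a function pointer and two mock kernels. The struct create_args type, the helper names, the 4096-byte probe size, and the flag's bit value are assumptions for illustration only. A real probe would also need to pass the memory-region placements the flag requires (device plus system memory), which this sketch leaves out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, for illustration only; the real value belongs to the uapi header. */
#define NEEDS_CPU_ACCESS (1u << 0)

/* Hypothetical, trimmed-down stand-in for the gem_create_ext arguments. */
struct create_args {
	uint64_t size;
	uint32_t handle;
	uint32_t flags;
};

/* Stand-ins for the create and close ioctls: 0 on success, negative errno on failure. */
typedef int (*create_fn)(struct create_args *args);
typedef void (*close_fn)(uint32_t handle);

/* Try to create a throwaway object with the new flag and see if the kernel objects. */
bool kernel_accepts_needs_cpu_access(create_fn create, close_fn close_handle)
{
	struct create_args args = { .size = 4096, .flags = NEEDS_CPU_ACCESS };

	if (create(&args) != 0)
		return false;		/* flag rejected by this kernel */

	close_handle(args.handle);	/* the probe object is not needed */
	return true;
}

/* Mock of an older kernel: any flag bit is unknown and rejected. */
int old_kernel_create(struct create_args *args)
{
	if (args->flags != 0)
		return -EINVAL;
	args->handle = 1;
	return 0;
}

/* Mock of a kernel that knows the new flag and rejects only other bits. */
int new_kernel_create(struct create_args *args)
{
	if (args->flags & ~NEEDS_CPU_ACCESS)
		return -EINVAL;
	args->handle = 1;
	return 0;
}

/* Mock close: nothing to release in this sketch. */
void mock_close(uint32_t handle)
{
	(void)handle;
}
```

The result would typically be cached once per device at startup, so the probe allocation happens only once.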
Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi
On Wed, Apr 27, 2022 at 08:55:07AM +0200, Christian König wrote:
> Well usually we increment the drm minor version when adding some new flags
> on amdgpu.
>
> In addition to that, just one comment from our experience with that: You
> don't just need one flag, but two. The first one is a hint which says "CPU
> access needed" and the second is a promise which says "CPU access never
> needed".
>
> The background is that on a whole bunch of buffers you can say with 100%
> certainty that you will never ever need CPU access.
>
> Then at least we have a whole bunch of buffers where we might need CPU
> access, but can't tell for sure.
>
> And last we have stuff like transfer buffers, where you can be 100% sure
> that you need CPU access.
>
> Separating it like this helped a lot with performance on small BAR systems.

So my assumption was that for transfer buffers you'd fill them with the cpu
first anyway, so no need for the extra flag. I guess this is for transfer
buffers for gpu -> cpu transfers, where it would result in costly bo move
and stalls and it's better to make sure it's cpu accessible from the start?

At least on the current gpus we have, where there's no coherent
interconnect, those buffers have to be in system memory or your cpu access
will be a disaster, so again they're naturally cpu accessible.

What's the use-case for the "cpu access required" flag where "cpu access
before gpu access" isn't a good enough hint already to get the same perf
benefits?

Also for scanout my idea at least is that we just fail mmap when you
haven't set the flag and the scanout is pinned to unmappable, for two
reasons:
- 4k buffers are big, if we force them all into mappable things are
  non-pretty.
- You need mesa anyway to access tiled buffers, and mesa knows how to use a
  transfer buffer.

That should work even when you do desktop switching and fastboot; stuff
like that should all work with the getfb2 ioctl (and without getfb2 it's
doomed to garbage anyway).

So only dumb kms buffers (which are linear) would ever get the
NEEDS_CPU_ACCESS flag, and only those we'd ever pin into cpu accessible
range for scanout.

Is there a hole in that plan?

Cheers, Daniel

>
> Regards,
> Christian.
>
> On 27.04.22 at 08:48, Lionel Landwerlin wrote:
> > One question though, how do we detect that this flag
> > (I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given
> > kernel?
> > I assume older kernels are going to reject object creation if we use
> > this flag?
> >
> > I didn't plan to use __drm_i915_query_vma_info, but isn't it
> > inconsistent to select the placement on the GEM object and then query
> > whether it's mappable by address?
> > You made a comment stating this is racy, wouldn't querying on the GEM
> > object prevent this?
> >
> > Thanks,
> >
> > -Lionel
> >
> > On 27/04/2022 09:35, Lionel Landwerlin wrote:
> > > Hi Matt,
> > >
> > > The proposal looks good to me.
> > >
> > > Looking forward to try it on drm-tip.
> > >
> > > -Lionel
> > >
> > > On 20/04/2022 20:13, Matthew Auld wrote:
> > > > Add an entry for the new uapi needed for small BAR on DG2+.
> > > >
> > > > v2:
> > > >   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
> > > >   - Rework error capture interactions, including no longer needing
> > > >     NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
> > > >   - Add probed_cpu_visible_size. (Lionel)
> > > >
> > > > Signed-off-by: Matthew Auld
> > > > Cc: Thomas Hellström
> > > > Cc: Lionel Landwerlin
> > > > Cc: Jon Bloomfield
> > > > Cc: Daniel Vetter
> > > > Cc: Jordan Justen
> > > > Cc: Kenneth Graunke
> > > > Cc: Akeem G Abodunrin
> > > > Cc: mesa-dev@lists.freedesktop.org
> > > > ---
> > > >  Documentation/gpu/rfc/i915_small_bar.h | 190 +++
> > > >  Documentation/gpu/rfc/i915_small_bar.rst | 58 +++
> > > >  Documentation/gpu/rfc/index.rst | 4 +
> > > >  3 files changed, 252 insertions(+)
> > > >  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
> > > >  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst
> > > >
> > > > diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h
> > > > new file mode 100644
> > > > index ..7bfd0cf44d35
> > > > --- /dev/null
> > > > +++ b/Documentation/gpu/rfc/i915_small_bar.h
> > > > @@ -0,0 +1,190 @@
> > > > +/**
> > > > + * struct __drm_i915_memory_region_info - Describes one region as known to the
> > > > + * driver.
> > > > + *
> > > > + * Note this is using both struct drm_i915_query_item and struct drm_i915_query.
> > > > + * For this new query we are adding the new query id DRM_I915_QUERY_MEMORY_REGIONS
> > > > + * at &drm_i915_query_item.query_id.
> > > > + */
> > > > +struct __drm_i915_memory_region_info {
> > > > +	/** @region: The class:instance pair encoding */
> > > > +	struct drm_i915_gem_memory_class_instance region;
> > > > +
> > > > +	/** @rsvd0:
Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi
On 27.04.22 at 17:02, Matthew Auld wrote:
> On 27/04/2022 07:55, Christian König wrote:
> > Well usually we increment the drm minor version when adding some new
> > flags on amdgpu.
> >
> > In addition to that, just one comment from our experience with that: You
> > don't just need one flag, but two. The first one is a hint which says
> > "CPU access needed" and the second is a promise which says "CPU access
> > never needed".
> >
> > The background is that on a whole bunch of buffers you can say with 100%
> > certainty that you will never ever need CPU access.
> >
> > Then at least we have a whole bunch of buffers where we might need CPU
> > access, but can't tell for sure.
> >
> > And last we have stuff like transfer buffers, where you can be 100% sure
> > that you need CPU access.
> >
> > Separating it like this helped a lot with performance on small BAR
> > systems.
>
> Thanks for the comments. For the "CPU access never needed" flag, what
> extra stuff does that do on the kernel side vs not specifying any
> flag/hint? I assume it still prioritizes using the non-CPU visible portion
> first? What else does it do?

It's used as a hint when you need to pin BOs for scanout for example.

In general we try to allocate BOs which are marked "CPU access needed" in
the CPU visible window if possible, but fall back to any memory if that
won't fit.

Christian.

> > Regards,
> > Christian.
> >
> > On 27.04.22 at 08:48, Lionel Landwerlin wrote:
> > > One question though, how do we detect that this flag
> > > (I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given
> > > kernel?
> > > I assume older kernels are going to reject object creation if we use
> > > this flag?
> > >
> > > I didn't plan to use __drm_i915_query_vma_info, but isn't it
> > > inconsistent to select the placement on the GEM object and then query
> > > whether it's mappable by address?
> > > You made a comment stating this is racy, wouldn't querying on the GEM
> > > object prevent this?
> > >
> > > Thanks,
> > >
> > > -Lionel
> > >
> > > On 27/04/2022 09:35, Lionel Landwerlin wrote:
> > > > Hi Matt,
> > > >
> > > > The proposal looks good to me.
> > > >
> > > > Looking forward to try it on drm-tip.
-Lionel On 20/04/2022 20:13, Matthew Auld wrote: Add an entry for the new uapi needed for small BAR on DG2+. v2: - Some spelling fixes and other small tweaks. (Akeem & Thomas) - Rework error capture interactions, including no longer needing NEEDS_CPU_ACCESS for objects marked for capture. (Thomas) - Add probed_cpu_visible_size. (Lionel) Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Lionel Landwerlin Cc: Jon Bloomfield Cc: Daniel Vetter Cc: Jordan Justen Cc: Kenneth Graunke Cc: Akeem G Abodunrin Cc: mesa-dev@lists.freedesktop.org --- Documentation/gpu/rfc/i915_small_bar.h | 190 +++ Documentation/gpu/rfc/i915_small_bar.rst | 58 +++ Documentation/gpu/rfc/index.rst | 4 + 3 files changed, 252 insertions(+) create mode 100644 Documentation/gpu/rfc/i915_small_bar.h create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h new file mode 100644 index ..7bfd0cf44d35 --- /dev/null +++ b/Documentation/gpu/rfc/i915_small_bar.h @@ -0,0 +1,190 @@ +/** + * struct __drm_i915_memory_region_info - Describes one region as known to the + * driver. + * + * Note this is using both struct drm_i915_query_item and struct drm_i915_query. + * For this new query we are adding the new query id DRM_I915_QUERY_MEMORY_REGIONS + * at _i915_query_item.query_id. + */ +struct __drm_i915_memory_region_info { + /** @region: The class:instance pair encoding */ + struct drm_i915_gem_memory_class_instance region; + + /** @rsvd0: MBZ */ + __u32 rsvd0; + + /** @probed_size: Memory probed by the driver (-1 = unknown) */ + __u64 probed_size; + + /** @unallocated_size: Estimate of memory remaining (-1 = unknown) */ + __u64 unallocated_size; + + union { + /** @rsvd1: MBZ */ + __u64 rsvd1[8]; + struct { + /** + * @probed_cpu_visible_size: Memory probed by the driver + * that is CPU accessible. (-1 = unknown). 
+ * + * This will be always be <= @probed_size, and the + * remainder(if there is any) will not be CPU + * accessible. + */ + __u64 probed_cpu_visible_size; + }; + }; +}; + +/** + * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added + * extension support using struct i915_user_extension. + * + * Note that new buffer flags should be added here, at least for the stuff that + * is immutable. Previously we would have two ioctls, one to create the object + * with gem_create, and another to apply various parameters, however this + * creates some ambiguity for the params which are considered immutable. Also in + * general we're phasing out the various SET/GET ioctls. + */ +struct __drm_i915_gem_create_ext { + /** + * @size: Requested size for the object. + * + * The (page-aligned) allocated
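
The hint/promise split Christian describes can be pictured as a small placement policy. The sketch below is not amdgpu or i915 code. The enum names and the choose_placement() helper are hypothetical. The "needed" branch follows what Christian states in this message (prefer the CPU-visible window, fall back to any memory if it does not fit). The "never needed" and "unknown" branches are assumptions, since the thread only says that the promise is used as a hint when pinning BOs for scanout.

```c
#include <stdint.h>

/* What userspace told us about CPU access for a buffer. */
enum cpu_access {
	CPU_ACCESS_UNKNOWN,	/* no flag given: might need CPU access */
	CPU_ACCESS_NEEDED,	/* hint: "CPU access needed" */
	CPU_ACCESS_NEVER	/* promise: "CPU access never needed" */
};

/* Where the allocator should try to put the buffer. */
enum placement {
	PLACE_CPU_VISIBLE,	/* inside the CPU-visible window */
	PLACE_CPU_INVISIBLE,	/* outside the CPU-visible window */
	PLACE_ANY		/* wherever there is room */
};

enum placement choose_placement(enum cpu_access access, uint64_t size,
				uint64_t visible_free)
{
	switch (access) {
	case CPU_ACCESS_NEEDED:
		/* From the thread: try the visible window, else any memory. */
		return size <= visible_free ? PLACE_CPU_VISIBLE : PLACE_ANY;
	case CPU_ACCESS_NEVER:
		/* Assumption: keep these out of the scarce visible window. */
		return PLACE_CPU_INVISIBLE;
	case CPU_ACCESS_UNKNOWN:
	default:
		/* Assumption: no preference when nothing was specified. */
		return PLACE_ANY;
	}
}
```

The value of the separate promise is that the "unknown" case no longer has to be treated as "never": buffers that might be mapped later stay movable into the visible window, while promised buffers can be pinned anywhere.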
Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi
On 27/04/2022 07:55, Christian König wrote:
> Well usually we increment the drm minor version when adding some new flags
> on amdgpu.
>
> In addition to that, just one comment from our experience with that: You
> don't just need one flag, but two. The first one is a hint which says "CPU
> access needed" and the second is a promise which says "CPU access never
> needed".
>
> The background is that on a whole bunch of buffers you can say with 100%
> certainty that you will never ever need CPU access.
>
> Then at least we have a whole bunch of buffers where we might need CPU
> access, but can't tell for sure.
>
> And last we have stuff like transfer buffers, where you can be 100% sure
> that you need CPU access.
>
> Separating it like this helped a lot with performance on small BAR systems.

Thanks for the comments. For the "CPU access never needed" flag, what extra
stuff does that do on the kernel side vs not specifying any flag/hint? I
assume it still prioritizes using the non-CPU visible portion first? What
else does it do?

> Regards,
> Christian.
>
> On 27.04.22 at 08:48, Lionel Landwerlin wrote:
> > One question though, how do we detect that this flag
> > (I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given
> > kernel?
> > I assume older kernels are going to reject object creation if we use
> > this flag?
> >
> > I didn't plan to use __drm_i915_query_vma_info, but isn't it
> > inconsistent to select the placement on the GEM object and then query
> > whether it's mappable by address?
> > You made a comment stating this is racy, wouldn't querying on the GEM
> > object prevent this?
> >
> > Thanks,
> >
> > -Lionel
> >
> > On 27/04/2022 09:35, Lionel Landwerlin wrote:
> > > Hi Matt,
> > >
> > > The proposal looks good to me.
> > >
> > > Looking forward to try it on drm-tip.
> > >
> > > -Lionel
> > >
> > > On 20/04/2022 20:13, Matthew Auld wrote:
> > > > Add an entry for the new uapi needed for small BAR on DG2+.
> > > >
> > > > v2:
> > > >   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
> > > >   - Rework error capture interactions, including no longer needing
> > > >     NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
> > > >   - Add probed_cpu_visible_size.
Re: [Intel-gfx] [PATCH v2] drm/doc: add rfc section for small BAR uapi
On 20/04/2022 18:13, Matthew Auld wrote:
> Add an entry for the new uapi needed for small BAR on DG2+.
>
> v2:
> - Some spelling fixes and other small tweaks. (Akeem & Thomas)
> - Rework error capture interactions, including no longer needing
>   NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
> - Add probed_cpu_visible_size. (Lionel)
>
> Signed-off-by: Matthew Auld
> Cc: Thomas Hellström
> Cc: Lionel Landwerlin
> Cc: Jon Bloomfield
> Cc: Daniel Vetter
> Cc: Jordan Justen
> Cc: Kenneth Graunke
> Cc: Akeem G Abodunrin
> Cc: mesa-dev@lists.freedesktop.org
> ---
>  Documentation/gpu/rfc/i915_small_bar.h   | 190 +++
>  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
>  Documentation/gpu/rfc/index.rst          |   4 +
>  3 files changed, 252 insertions(+)
>  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
>  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst
>
> diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h
> new file mode 100644
> index ..7bfd0cf44d35
> --- /dev/null
> +++ b/Documentation/gpu/rfc/i915_small_bar.h
> @@ -0,0 +1,190 @@
> +/**
> + * struct __drm_i915_memory_region_info - Describes one region as known to the
> + * driver.
> + *
> + * Note this is using both struct drm_i915_query_item and struct drm_i915_query.
> + * For this new query we are adding the new query id DRM_I915_QUERY_MEMORY_REGIONS
> + * at drm_i915_query_item.query_id.
> + */
> +struct __drm_i915_memory_region_info {
> +	/** @region: The class:instance pair encoding */
> +	struct drm_i915_gem_memory_class_instance region;
> +
> +	/** @rsvd0: MBZ */
> +	__u32 rsvd0;
> +
> +	/** @probed_size: Memory probed by the driver (-1 = unknown) */
> +	__u64 probed_size;
> +
> +	/** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
> +	__u64 unallocated_size;
> +
> +	union {
> +		/** @rsvd1: MBZ */
> +		__u64 rsvd1[8];
> +		struct {
> +			/**
> +			 * @probed_cpu_visible_size: Memory probed by the driver
> +			 * that is CPU accessible. (-1 = unknown).
> +			 *
> +			 * This will always be <= @probed_size, and the
> +			 * remainder (if there is any) will not be CPU
> +			 * accessible.
> +			 */
> +			__u64 probed_cpu_visible_size;

Would unallocated_cpu_visible_size be useful, to follow the total unallocated_size?

Btw, have we ever considered whether unallocated_size should require CAP_SYS_ADMIN/PERFMON or something?

> +		};
> +	};
> +};
> +
> +/**
> + * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added
> + * extension support using struct i915_user_extension.
> + *
> + * Note that new buffer flags should be added here, at least for the stuff that
> + * is immutable. Previously we would have two ioctls, one to create the object
> + * with gem_create, and another to apply various parameters, however this
> + * creates some ambiguity for the params which are considered immutable. Also in
> + * general we're phasing out the various SET/GET ioctls.
> + */
> +struct __drm_i915_gem_create_ext {
> +	/**
> +	 * @size: Requested size for the object.
> +	 *
> +	 * The (page-aligned) allocated size for the object will be returned.
> +	 *
> +	 * Note that for some devices we might have further minimum
> +	 * page-size restrictions (larger than 4K), like for device local-memory.
> +	 * However in general the final size here should always reflect any
> +	 * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
> +	 * extension to place the object in device local-memory.
> +	 */
> +	__u64 size;
> +	/**
> +	 * @handle: Returned handle for the object.
> +	 *
> +	 * Object handles are nonzero.
> +	 */
> +	__u32 handle;
> +	/**
> +	 * @flags: Optional flags.
> +	 *
> +	 * Supported values:
> +	 *
> +	 * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the kernel that
> +	 * the object will need to be accessed via the CPU.
> +	 *
> +	 * Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and
> +	 * only strictly required on platforms where only some of the device
> +	 * memory is directly visible or mappable through the CPU, like on DG2+.
> +	 *
> +	 * One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to
> +	 * ensure we can always spill the allocation to system memory, if we
> +	 * can't place the object in the mappable part of
> +	 * I915_MEMORY_CLASS_DEVICE.
> +	 *
> +	 * Note that since the kernel only supports flat-CCS on objects that can
> +	 * *only* be placed in I915_MEMORY_CLASS_DEVICE, we therefore don't
> +	 * support I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS together with
> +	 * flat-CCS.
> +	 *
> +	 * Without this
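
For illustration, here is a minimal sketch of how userspace might interpret the proposed probed_cpu_visible_size field next to probed_size. The struct and helper names (region_sizes, region_is_small_bar, region_cpu_invisible_size) are hypothetical and only mirror the two fields quoted above; they are not part of the RFC. Treating an unknown size as "not small BAR" is also an assumption made here, not something the patch specifies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sentinel used by the RFC for "unknown" sizes: (__u64)-1. */
#define REGION_SIZE_UNKNOWN UINT64_MAX

/* Hypothetical local mirror of the two size fields; not the real uapi struct. */
struct region_sizes {
	uint64_t probed_size;
	uint64_t probed_cpu_visible_size;
};

/* True when only part of the region is CPU accessible (a small BAR setup). */
static bool region_is_small_bar(const struct region_sizes *r)
{
	/* Assumption: if either size is unknown, do not claim small BAR. */
	if (r->probed_size == REGION_SIZE_UNKNOWN ||
	    r->probed_cpu_visible_size == REGION_SIZE_UNKNOWN)
		return false;
	return r->probed_cpu_visible_size < r->probed_size;
}

/* Size of the part that is not CPU accessible, or 0 if none or unknown. */
static uint64_t region_cpu_invisible_size(const struct region_sizes *r)
{
	if (!region_is_small_bar(r))
		return 0;
	return r->probed_size - r->probed_cpu_visible_size;
}
```

An unallocated_cpu_visible_size field, as asked about above, would presumably be read the same way against unallocated_size, but the quoted patch does not define it.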
Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi
Well, usually we increment the drm minor version when adding some new flags on amdgpu.

In addition to that, just one comment from our experience with that: you don't just need one flag, but two.

The first one is a hint which says "CPU access needed" and the second is a promise which says "CPU access never needed".

The background is that on a whole bunch of buffers you can say with 100% certainty that you will never ever need CPU access.

Then we at least have a whole bunch of buffers where we might need CPU access, but can't tell for sure.

And last we have stuff like transfer buffers, where you can be 100% sure that you need CPU access.

Separating it like this helped a lot with performance on small BAR systems.

Regards,
Christian.

On 27.04.22 at 08:48, Lionel Landwerlin wrote:
> One question though, how do we detect that this flag
> (I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given
> kernel? I assume older kernels are going to reject object creation if
> we use this flag?
>
> I didn't plan to use __drm_i915_query_vma_info, but isn't it
> inconsistent to select the placement on the GEM object and then query
> whether it's mappable by address? You made a comment stating this is
> racy; wouldn't querying on the GEM object prevent this?
>
> Thanks,
> -Lionel
>
> On 27/04/2022 09:35, Lionel Landwerlin wrote:
>> Hi Matt,
>>
>> The proposal looks good to me. Looking forward to trying it on drm-tip.
>>
>> -Lionel
>>
>> On 20/04/2022 20:13, Matthew Auld wrote:
>>> Add an entry for the new uapi needed for small BAR on DG2+.
>>>
>>> v2:
>>> - Some spelling fixes and other small tweaks. (Akeem & Thomas)
>>> - Rework error capture interactions, including no longer needing
>>>   NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
>>> - Add probed_cpu_visible_size.
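
The three buffer categories Christian describes (never needs CPU access, might need it, definitely needs it) could be modelled in a userspace driver roughly as follows. This is only a sketch: the enum names and the usage classification are hypothetical, and the i915 RFC quoted in this thread proposes only the "needed" flag, so the "never needed" value stands for the second flag Christian suggests rather than anything in the patch.

```c
/* Hypothetical buffer usage classes a userspace driver might track. */
enum buf_usage {
	BUF_USAGE_GPU_ONLY,  /* e.g. render targets never mapped by the CPU */
	BUF_USAGE_UNKNOWN,   /* CPU access possible, but cannot tell for sure */
	BUF_USAGE_TRANSFER,  /* e.g. staging/transfer buffers, always mapped */
};

/* The three states from the mail: a promise, no statement, and a hint. */
enum cpu_access_hint {
	CPU_ACCESS_NEVER_NEEDED, /* promise: will never be CPU mapped */
	CPU_ACCESS_UNSPECIFIED,  /* no flag: kernel picks a default */
	CPU_ACCESS_NEEDED,       /* hint: will be CPU mapped */
};

/* Map a usage class to the hint passed at buffer creation time. */
static enum cpu_access_hint pick_cpu_access_hint(enum buf_usage usage)
{
	switch (usage) {
	case BUF_USAGE_GPU_ONLY:
		return CPU_ACCESS_NEVER_NEEDED;
	case BUF_USAGE_TRANSFER:
		return CPU_ACCESS_NEEDED;
	case BUF_USAGE_UNKNOWN:
	default:
		return CPU_ACCESS_UNSPECIFIED;
	}
}
```

Keeping "unspecified" as its own state is the point of the two-flag scheme: the kernel can keep the CPU-visible window for buffers that asked for it, and use the promise to place everything else outside it.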
Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi
One question though, how do we detect that this flag (I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given kernel? I assume older kernels are going to reject object creation if we use this flag?

I didn't plan to use __drm_i915_query_vma_info, but isn't it inconsistent to select the placement on the GEM object and then query whether it's mappable by address? You made a comment stating this is racy; wouldn't querying on the GEM object prevent this?

Thanks,
-Lionel

On 27/04/2022 09:35, Lionel Landwerlin wrote:
> Hi Matt,
>
> The proposal looks good to me. Looking forward to trying it on drm-tip.
>
> -Lionel
>
> On 20/04/2022 20:13, Matthew Auld wrote:
>> Add an entry for the new uapi needed for small BAR on DG2+.
>>
>> v2:
>> - Some spelling fixes and other small tweaks. (Akeem & Thomas)
>> - Rework error capture interactions, including no longer needing
>>   NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
>> - Add probed_cpu_visible_size. (Lionel)
>>
>> Signed-off-by: Matthew Auld
>> Cc: Thomas Hellström
>> Cc: Lionel Landwerlin
>> Cc: Jon Bloomfield
>> Cc: Daniel Vetter
>> Cc: Jordan Justen
>> Cc: Kenneth Graunke
>> Cc: Akeem G Abodunrin
>> Cc: mesa-dev@lists.freedesktop.org
>> ---
>>  Documentation/gpu/rfc/i915_small_bar.h   | 190 +++
>>  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
>>  Documentation/gpu/rfc/index.rst          |   4 +
>>  3 files changed, 252 insertions(+)
>>  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
>>  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst
>>
>> diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h
>> new file mode 100644
>> index ..7bfd0cf44d35
>> --- /dev/null
>> +++ b/Documentation/gpu/rfc/i915_small_bar.h
>> @@ -0,0 +1,190 @@
>> +/**
>> + * struct __drm_i915_memory_region_info - Describes one region as known to the
>> + * driver.
>> + *
>> + * Note this is using both struct drm_i915_query_item and struct drm_i915_query.
>> + * For this new query we are adding the new query id DRM_I915_QUERY_MEMORY_REGIONS
>> + * at drm_i915_query_item.query_id.
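
If, as Lionel assumes above, older kernels reject object creation when an unknown flag is set, one possible detection scheme is to create a tiny throwaway object with the flag and look at the error. The sketch below illustrates only that logic and is not a confirmed answer from the thread: create_fn and close_fn are hypothetical wrappers standing in for the real ioctl calls, the flag value is a placeholder, and the two fake_* functions simulate an old and a new kernel so the example is self-contained.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder bit for illustration; real code takes the flag from the uapi header. */
#define NEEDS_CPU_ACCESS_FLAG (1u << 0)

/* Hypothetical wrapper types around object creation and destruction:
 * create returns 0 and fills *handle on success, or a negative errno. */
typedef int (*create_fn)(uint64_t size, uint32_t flags, uint32_t *handle);
typedef void (*close_fn)(uint32_t handle);

/* Returns 1 if the flag is accepted, 0 if it is rejected with -EINVAL,
 * or another negative errno if the probe itself failed. */
static int probe_needs_cpu_access(create_fn create, close_fn close_handle)
{
	uint32_t handle = 0;
	int ret = create(4096, NEEDS_CPU_ACCESS_FLAG, &handle);

	if (ret == 0) {
		close_handle(handle); /* do not leak the probe object */
		return 1;
	}
	return ret == -EINVAL ? 0 : ret;
}

/* Stand-in for a kernel that predates the flag: any flag is rejected. */
static int fake_old_create(uint64_t size, uint32_t flags, uint32_t *handle)
{
	(void)size;
	if (flags)
		return -EINVAL;
	*handle = 1;
	return 0;
}

/* Stand-in for a kernel that knows the flag: only other bits are rejected. */
static int fake_new_create(uint64_t size, uint32_t flags, uint32_t *handle)
{
	(void)size;
	if (flags & ~NEEDS_CPU_ACCESS_FLAG)
		return -EINVAL;
	*handle = 1;
	return 0;
}

static void fake_close(uint32_t handle)
{
	(void)handle;
}
```

A real probe would also have to supply placements that satisfy the rules in the patch (device plus system memory), since a rejection for that reason would look the same as an unknown flag.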
Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi
Hi Matt,

The proposal looks good to me. Looking forward to trying it on drm-tip.

-Lionel

On 20/04/2022 20:13, Matthew Auld wrote:
> Add an entry for the new uapi needed for small BAR on DG2+.
>
> v2:
> - Some spelling fixes and other small tweaks. (Akeem & Thomas)
> - Rework error capture interactions, including no longer needing
>   NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
> - Add probed_cpu_visible_size. (Lionel)
>
> Signed-off-by: Matthew Auld
> Cc: Thomas Hellström
> Cc: Lionel Landwerlin
> Cc: Jon Bloomfield
> Cc: Daniel Vetter
> Cc: Jordan Justen
> Cc: Kenneth Graunke
> Cc: Akeem G Abodunrin
> Cc: mesa-dev@lists.freedesktop.org
> ---
>  Documentation/gpu/rfc/i915_small_bar.h   | 190 +++
>  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
>  Documentation/gpu/rfc/index.rst          |   4 +
>  3 files changed, 252 insertions(+)
>  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
>  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst
>
> diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h
> new file mode 100644
> index ..7bfd0cf44d35
> --- /dev/null
> +++ b/Documentation/gpu/rfc/i915_small_bar.h
> @@ -0,0 +1,190 @@
> +/**
> + * struct __drm_i915_memory_region_info - Describes one region as known to the
> + * driver.
> + *
> + * Note this is using both struct drm_i915_query_item and struct drm_i915_query.
> + * For this new query we are adding the new query id DRM_I915_QUERY_MEMORY_REGIONS
> + * at drm_i915_query_item.query_id.
> + */
> +struct __drm_i915_memory_region_info {
> +	/** @region: The class:instance pair encoding */
> +	struct drm_i915_gem_memory_class_instance region;
> +
> +	/** @rsvd0: MBZ */
> +	__u32 rsvd0;
> +
> +	/** @probed_size: Memory probed by the driver (-1 = unknown) */
> +	__u64 probed_size;
> +
> +	/** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
> +	__u64 unallocated_size;
> +
> +	union {
> +		/** @rsvd1: MBZ */
> +		__u64 rsvd1[8];
> +		struct {
> +			/**
> +			 * @probed_cpu_visible_size: Memory probed by the driver
> +			 * that is CPU accessible. (-1 = unknown).
> +			 *
> +			 * This will always be <= @probed_size, and the
> +			 * remainder (if there is any) will not be CPU
> +			 * accessible.
> +			 */
> +			__u64 probed_cpu_visible_size;
> +		};
> +	};
> +};
> +
> +/**
> + * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added
> + * extension support using struct i915_user_extension.
> + *
> + * Note that new buffer flags should be added here, at least for the stuff that
> + * is immutable. Previously we would have two ioctls, one to create the object
> + * with gem_create, and another to apply various parameters, however this
> + * creates some ambiguity for the params which are considered immutable. Also in
> + * general we're phasing out the various SET/GET ioctls.
> + */
> +struct __drm_i915_gem_create_ext {
> +	/**
> +	 * @size: Requested size for the object.
> +	 *
> +	 * The (page-aligned) allocated size for the object will be returned.
> +	 *
> +	 * Note that for some devices we might have further minimum
> +	 * page-size restrictions (larger than 4K), like for device local-memory.
> +	 * However in general the final size here should always reflect any
> +	 * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
> +	 * extension to place the object in device local-memory.
> +	 */
> +	__u64 size;
> +	/**
> +	 * @handle: Returned handle for the object.
> +	 *
> +	 * Object handles are nonzero.
> +	 */
> +	__u32 handle;
> +	/**
> +	 * @flags: Optional flags.
> +	 *
> +	 * Supported values:
> +	 *
> +	 * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the kernel that
> +	 * the object will need to be accessed via the CPU.
> +	 *
> +	 * Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and
> +	 * only strictly required on platforms where only some of the device
> +	 * memory is directly visible or mappable through the CPU, like on DG2+.
> +	 *
> +	 * One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to
> +	 * ensure we can always spill the allocation to system memory, if we
> +	 * can't place the object in the mappable part of
> +	 * I915_MEMORY_CLASS_DEVICE.
> +	 *
> +	 * Note that since the kernel only supports flat-CCS on objects that can
> +	 * *only* be placed in I915_MEMORY_CLASS_DEVICE, we therefore don't
> +	 * support I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS together with
> +	 * flat-CCS.
> +	 *
> +	 * Without this hint, the kernel will assume that non-mappable
> +	 * I915_MEMORY_CLASS_DEVICE is preferred for this
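
The placement rules stated in the @flags comment above (the flag is only valid with a device-memory placement, and a system-memory placement must also be present as a spill target) can be expressed as a small check. This is a sketch of the documented rule only, not the kernel's actual validation code: mem_class, FLAG_NEEDS_CPU_ACCESS and create_flags_valid are hypothetical stand-ins for the I915_MEMORY_CLASS_* values, the real flag, and whatever the driver implements.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for I915_MEMORY_CLASS_SYSTEM / I915_MEMORY_CLASS_DEVICE. */
enum mem_class { MEM_CLASS_SYSTEM, MEM_CLASS_DEVICE };

/* Placeholder bit for illustration; not taken from the uapi header. */
#define FLAG_NEEDS_CPU_ACCESS (1u << 0)

/* Check the documented rule: NEEDS_CPU_ACCESS requires both a device
 * placement and a system placement in the object's placement list. */
static bool create_flags_valid(uint32_t flags,
			       const enum mem_class *placements, size_t count)
{
	bool has_system = false, has_device = false;

	for (size_t i = 0; i < count; i++) {
		if (placements[i] == MEM_CLASS_SYSTEM)
			has_system = true;
		else if (placements[i] == MEM_CLASS_DEVICE)
			has_device = true;
	}

	/* Without the flag this particular rule does not apply. */
	if (!(flags & FLAG_NEEDS_CPU_ACCESS))
		return true;

	return has_device && has_system;
}
```

Because flat-CCS objects can only be placed in device memory, they can never satisfy the system-memory requirement, which matches the comment's statement that the flag is not supported together with flat-CCS.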