Re: [Intel-gfx] [PATCH 2/2] drm/doc: add rfc section for small BAR uapi
On 18/03/2022 09:38, Lionel Landwerlin wrote:

Hey Matthew, all,

This sounds like a good thing to have.
There are a number of DG2 machines where we have a small BAR and this is causing more apps to fail.

Anv currently reports 3 memory heaps to the app :
   - local device only (not host visible) -> mapped to lmem
   - device/cpu -> mapped to smem
   - local device but also host visible -> mapped to lmem

So we could use this straight away, by just not putting the I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS flag on the allocation of the first heap.

One thing I don't see in this proposal is how can we get the size of the 2 lmem heap : cpu visible, cpu not visible
We could use that to report the appropriate size to the app.
We probably want to report a new drm_i915_memory_region_info and either :
   - put one of the reserve field to use to indicate : cpu visible
   - or define a new enum value in drm_i915_gem_memory_class

Thanks for taking a look at this. Returning the probed CPU visible size as part of the region query seems reasonable. Something like:

@@ -3074,8 +3074,18 @@ struct drm_i915_memory_region_info {
 	/** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
 	__u64 unallocated_size;

-	/** @rsvd1: MBZ */
-	__u64 rsvd1[8];
+	union {
+		/** @rsvd1: MBZ */
+		__u64 rsvd1[8];
+
+		struct {
+			/**
+			 * @probed_cpu_visible_size: Memory probed by the driver
+			 * that is CPU accessible. (-1 = unknown)
+			 */
+			__u64 probed_cpu_visible_size;
+		};
+	};

I will add this in the next version, if no objections.

Cheers,

-Lionel

On 18/02/2022 13:22, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.
Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Jon Bloomfield Cc: Daniel Vetter Cc: Jordan Justen Cc: Kenneth Graunke Cc: mesa-...@lists.freedesktop.org --- Documentation/gpu/rfc/i915_small_bar.h | 153 +++ Documentation/gpu/rfc/i915_small_bar.rst | 40 ++ Documentation/gpu/rfc/index.rst | 4 + 3 files changed, 197 insertions(+) create mode 100644 Documentation/gpu/rfc/i915_small_bar.h create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h new file mode 100644 index ..fa65835fd608 --- /dev/null +++ b/Documentation/gpu/rfc/i915_small_bar.h @@ -0,0 +1,153 @@ +/** + * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added + * extension support using struct i915_user_extension. + * + * Note that in the future we want to have our buffer flags here, at least for + * the stuff that is immutable. Previously we would have two ioctls, one to + * create the object with gem_create, and another to apply various parameters, + * however this creates some ambiguity for the params which are considered + * immutable. Also in general we're phasing out the various SET/GET ioctls. + */ +struct __drm_i915_gem_create_ext { + /** + * @size: Requested size for the object. + * + * The (page-aligned) allocated size for the object will be returned. + * + * Note that for some devices we have might have further minimum + * page-size restrictions(larger than 4K), like for device local-memory. + * However in general the final size here should always reflect any + * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS + * extension to place the object in device local-memory. + */ + __u64 size; + /** + * @handle: Returned handle for the object. + * + * Object handles are nonzero. + */ + __u32 handle; + /** + * @flags: Optional flags. 
+ * + * Supported values: + * + * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the kernel that + * the object will need to be accessed via the CPU. + * + * Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and + * only strictly required on platforms where only some of the device + * memory is directly visible or mappable through the CPU, like on DG2+. + * + * One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to + * ensure we can always spill the allocation to system memory, if we + * can't place the object in the mappable part of + * I915_MEMORY_CLASS_DEVICE. + * + * Note that buffers that need to be captured with EXEC_OBJECT_CAPTURE, + * will need to enable this hint, if the object can also be placed in + * I915_MEMORY_CLASS_DEVICE, starting from DG2+. The execbuf call will + * throw an error otherwise. This also means that such objects will need + * I915_MEMORY_CLASS_SYSTEM set as a possible placement. + * + * Without this
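As a rough illustration of how userspace might consume the probed_cpu_visible_size field proposed in the reply above, the following sketch splits a local-memory region into the two heap sizes Lionel asks about. It is only a sketch under stated assumptions: the struct and function names (sketch_region_info, sketch_lmem_heaps, sketch_split_lmem) are hypothetical local stand-ins and not the real uapi layout, and treating an unknown (-1) value as "everything is CPU visible" is an assumption of this example, not something the thread specifies.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical local mirror of the two sizes discussed in the thread.
 * This is NOT the real struct drm_i915_memory_region_info layout.
 */
struct sketch_region_info {
	uint64_t probed_size;             /* total memory probed for the region */
	uint64_t probed_cpu_visible_size; /* CPU accessible part, UINT64_MAX (-1) = unknown */
};

/* Hypothetical result type: sizes of the two lmem heaps reported to the app. */
struct sketch_lmem_heaps {
	uint64_t cpu_visible;     /* local memory that is also host visible */
	uint64_t non_cpu_visible; /* local memory that is device only */
};

static struct sketch_lmem_heaps
sketch_split_lmem(const struct sketch_region_info *info)
{
	struct sketch_lmem_heaps heaps;
	uint64_t visible = info->probed_cpu_visible_size;

	assert(info != 0);

	/*
	 * Assumption of this sketch: an unknown (-1) or out-of-range value is
	 * treated as "the whole region is CPU visible".
	 */
	if (visible == UINT64_MAX || visible > info->probed_size)
		visible = info->probed_size;

	heaps.cpu_visible = visible;
	heaps.non_cpu_visible = info->probed_size - visible;
	return heaps;
}
```

With something along these lines, a driver could report one heap sized by cpu_visible and another sized by non_cpu_visible, and leave I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS off for allocations from the latter.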
Re: [Intel-gfx] [PATCH 2/2] drm/doc: add rfc section for small BAR uapi
Hey Matthew, all,

This sounds like a good thing to have.
There are a number of DG2 machines where we have a small BAR and this is causing more apps to fail.

Anv currently reports 3 memory heaps to the app :
   - local device only (not host visible) -> mapped to lmem
   - device/cpu -> mapped to smem
   - local device but also host visible -> mapped to lmem

So we could use this straight away, by just not putting the I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS flag on the allocation of the first heap.

One thing I don't see in this proposal is how can we get the size of the 2 lmem heap : cpu visible, cpu not visible
We could use that to report the appropriate size to the app.
We probably want to report a new drm_i915_memory_region_info and either :
   - put one of the reserve field to use to indicate : cpu visible
   - or define a new enum value in drm_i915_gem_memory_class

Cheers,

-Lionel

On 18/02/2022 13:22, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

Signed-off-by: Matthew Auld
Cc: Thomas Hellström
Cc: Jon Bloomfield
Cc: Daniel Vetter
Cc: Jordan Justen
Cc: Kenneth Graunke
Cc: mesa-...@lists.freedesktop.org
---
 Documentation/gpu/rfc/i915_small_bar.h   | 153 +++
 Documentation/gpu/rfc/i915_small_bar.rst |  40 ++
 Documentation/gpu/rfc/index.rst          |   4 +
 3 files changed, 197 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
 create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h
new file mode 100644
index ..fa65835fd608
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,153 @@
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added
+ * extension support using struct i915_user_extension.
+ *
+ * Note that in the future we want to have our buffer flags here, at least for
+ * the stuff that is immutable.
Previously we would have two ioctls, one to + * create the object with gem_create, and another to apply various parameters, + * however this creates some ambiguity for the params which are considered + * immutable. Also in general we're phasing out the various SET/GET ioctls. + */ +struct __drm_i915_gem_create_ext { + /** +* @size: Requested size for the object. +* +* The (page-aligned) allocated size for the object will be returned. +* +* Note that for some devices we have might have further minimum +* page-size restrictions(larger than 4K), like for device local-memory. +* However in general the final size here should always reflect any +* rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS +* extension to place the object in device local-memory. +*/ + __u64 size; + /** +* @handle: Returned handle for the object. +* +* Object handles are nonzero. +*/ + __u32 handle; + /** +* @flags: Optional flags. +* +* Supported values: +* +* I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the kernel that +* the object will need to be accessed via the CPU. +* +* Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and +* only strictly required on platforms where only some of the device +* memory is directly visible or mappable through the CPU, like on DG2+. +* +* One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to +* ensure we can always spill the allocation to system memory, if we +* can't place the object in the mappable part of +* I915_MEMORY_CLASS_DEVICE. +* +* Note that buffers that need to be captured with EXEC_OBJECT_CAPTURE, +* will need to enable this hint, if the object can also be placed in +* I915_MEMORY_CLASS_DEVICE, starting from DG2+. The execbuf call will +* throw an error otherwise. This also means that such objects will need +* I915_MEMORY_CLASS_SYSTEM set as a possible placement. +* +* Without this hint, the kernel will assume that non-mappable +* I915_MEMORY_CLASS_DEVICE is preferred for this object. 
Note that the +* kernel can still migrate the object to the mappable part, as a last +* resort, if userspace ever CPU faults this object, but this might be +* expensive, and so ideally should be avoided. +*/ +#define I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS (1 << 0) + __u32 flags; + /** +* @extensions: The chain of extensions to apply to this object. +* +* This will be useful in the future when we need to support several +* different extensions, and we need to apply more than one when +* creating the object. See struct i915_user_extension. +* +* If we
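The placement rules quoted above for I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS (only valid with a device-memory placement, and system memory must also be a possible placement) can be sketched as a small check. This is a minimal illustration of the documented rules only, not the kernel's actual validation code; the enum, macro, and function names below are hypothetical stand-ins for I915_MEMORY_CLASS_SYSTEM, I915_MEMORY_CLASS_DEVICE, and the flag itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the uapi memory classes named in the patch. */
enum sketch_memory_class {
	SKETCH_MEMORY_CLASS_SYSTEM,
	SKETCH_MEMORY_CLASS_DEVICE,
};

/* Hypothetical stand-in for I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS. */
#define SKETCH_FLAG_NEEDS_CPU_ACCESS (1u << 0)

/*
 * Returns true if the flags/placements combination follows the rules in the
 * RFC text: the flag requires a device placement, plus a system placement so
 * the allocation can spill when the mappable part of device memory is full.
 */
static bool sketch_create_flags_valid(uint32_t flags,
				      const enum sketch_memory_class *placements,
				      size_t count)
{
	bool has_system = false;
	bool has_device = false;
	size_t i;

	assert(placements != NULL || count == 0);

	/* Reject any flag bits this sketch does not know about. */
	if (flags & ~SKETCH_FLAG_NEEDS_CPU_ACCESS)
		return false;

	for (i = 0; i < count; i++) {
		if (placements[i] == SKETCH_MEMORY_CLASS_SYSTEM)
			has_system = true;
		else if (placements[i] == SKETCH_MEMORY_CLASS_DEVICE)
			has_device = true;
	}

	/* Without the hint there is nothing further to check here. */
	if (!(flags & SKETCH_FLAG_NEEDS_CPU_ACCESS))
		return true;

	return has_device && has_system;
}
```

For Anv's three heaps as described in this message, the device-only heap would pass no flag, while the host-visible lmem heap would pass the flag together with both a device and a system placement.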
Re: [Intel-gfx] [PATCH 2/2] drm/doc: add rfc section for small BAR uapi
> -----Original Message-----
> From: dri-devel On Behalf Of
> Thomas Hellström
> Sent: Tuesday, February 22, 2022 2:36 AM
> To: Auld, Matthew ; intel-gfx@lists.freedesktop.org
> Cc: Daniel Vetter ; dri-de...@lists.freedesktop.org;
> Kenneth Graunke ; Bloomfield, Jon
> ; Justen, Jordan L ;
> mesa-...@lists.freedesktop.org
> Subject: Re: [PATCH 2/2] drm/doc: add rfc section for small BAR uapi
>
>
> On 2/18/22 12:22, Matthew Auld wrote:
> > Add an entry for the new uapi needed for small BAR on DG2+.
> >
> > Signed-off-by: Matthew Auld
> > Cc: Thomas Hellström
> > Cc: Jon Bloomfield
> > Cc: Daniel Vetter
> > Cc: Jordan Justen
> > Cc: Kenneth Graunke
> > Cc: mesa-...@lists.freedesktop.org
> > ---
> > Documentation/gpu/rfc/i915_small_bar.h | 153
> +++
> > Documentation/gpu/rfc/i915_small_bar.rst | 40 ++
> > Documentation/gpu/rfc/index.rst | 4 +
> > 3 files changed, 197 insertions(+)
> > create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
> > create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst
> >
> > diff --git a/Documentation/gpu/rfc/i915_small_bar.h
> > b/Documentation/gpu/rfc/i915_small_bar.h
> > new file mode 100644
> > index ..fa65835fd608
> > --- /dev/null
> > +++ b/Documentation/gpu/rfc/i915_small_bar.h
> > @@ -0,0 +1,153 @@
> > +/**
> > + * struct __drm_i915_gem_create_ext - Existing gem_create behaviour,
> > +with added
> > + * extension support using struct i915_user_extension.
> > + *
> > + * Note that in the future we want to have our buffer flags here,
>
> Does this sentence need updating, with the flags member?
>
>
> > at least for
> > + * the stuff that is immutable. Previously we would have two ioctls,
> > +one to
> > + * create the object with gem_create, and another to apply various
> > +parameters,
> > + * however this creates some ambiguity for the params which are
> > +considered
> > + * immutable. Also in general we're phasing out the various SET/GET ioctls.
> > + */ > > +struct __drm_i915_gem_create_ext { > > + /** > > +* @size: Requested size for the object. > > +* > > +* The (page-aligned) allocated size for the object will be returned. > > +* > > +* Note that for some devices we have might have further minimum > > +* page-size restrictions(larger than 4K), like for device local-memory. > > +* However in general the final size here should always reflect any > > +* rounding up, if for example using the > I915_GEM_CREATE_EXT_MEMORY_REGIONS > > +* extension to place the object in device local-memory. > > +*/ > > + __u64 size; > > + /** > > +* @handle: Returned handle for the object. > > +* > > +* Object handles are nonzero. > > +*/ > > + __u32 handle; > > + /** > > +* @flags: Optional flags. > > +* > > +* Supported values: > > +* > > +* I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the > kernel that > > +* the object will need to be accessed via the CPU. > > +* > > +* Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, > and > > +* only strictly required on platforms where only some of the device > > +* memory is directly visible or mappable through the CPU, like on DG2+. > > +* > > +* One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, > to > > +* ensure we can always spill the allocation to system memory, if we > > +* can't place the object in the mappable part of > > +* I915_MEMORY_CLASS_DEVICE. > > +* > > +* Note that buffers that need to be captured with > EXEC_OBJECT_CAPTURE, > > +* will need to enable this hint, if the object can also be placed in > > +* I915_MEMORY_CLASS_DEVICE, starting from DG2+. The execbuf call > will > > +* throw an error otherwise. This also means that such objects will need > > +* I915_MEMORY_CLASS_SYSTEM set as a possible placement. > > +* > > +* Without this hint, the kernel will assume that non-mappable > > +* I915_MEMORY_CLASS_DEVICE is preferred for this object. 
Note that > the > > +* kernel can still migrate the object to the mappable part, as a last > > +* resort, if userspace ever CPU faults this object, but this might be > > +* expensive, and so ideally should be avoided. > > +*/ > > +#define I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS (1 << 0) > > + __u32 flags; > > + /** > > +* @extensions: The chain of extensions to apply to this object. > > +* > > +* This will be useful in the future when we need to support several > > +* different extensions, and we need to apply more than one when > > +* creating the object. See struct i915_user_extension. > > +* > > +* If we don't supply any extensions then we get the same old > gem_create > > +* behaviour. > > +* > > +* For I915_GEM_CREATE_EXT_MEMORY_REGIONS usage see > > +* struct drm_i915_gem_create_ext_memory_regions. > > +* > > +* For
Re: [Intel-gfx] [PATCH 2/2] drm/doc: add rfc section for small BAR uapi
On 2/18/22 12:22, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

Signed-off-by: Matthew Auld
Cc: Thomas Hellström
Cc: Jon Bloomfield
Cc: Daniel Vetter
Cc: Jordan Justen
Cc: Kenneth Graunke
Cc: mesa-...@lists.freedesktop.org
---
 Documentation/gpu/rfc/i915_small_bar.h   | 153 +++
 Documentation/gpu/rfc/i915_small_bar.rst |  40 ++
 Documentation/gpu/rfc/index.rst          |   4 +
 3 files changed, 197 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
 create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h
new file mode 100644
index ..fa65835fd608
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,153 @@
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added
+ * extension support using struct i915_user_extension.
+ *
+ * Note that in the future we want to have our buffer flags here,

Does this sentence need updating, with the flags member?

at least for
+ * the stuff that is immutable. Previously we would have two ioctls, one to
+ * create the object with gem_create, and another to apply various parameters,
+ * however this creates some ambiguity for the params which are considered
+ * immutable. Also in general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+	/**
+	 * @size: Requested size for the object.
+	 *
+	 * The (page-aligned) allocated size for the object will be returned.
+	 *
+	 * Note that for some devices we have might have further minimum
+	 * page-size restrictions(larger than 4K), like for device local-memory.
+	 * However in general the final size here should always reflect any
+	 * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
+	 * extension to place the object in device local-memory.
+	 */
+	__u64 size;
+	/**
+	 * @handle: Returned handle for the object.
+	 *
+	 * Object handles are nonzero.
+*/ + __u32 handle; + /** +* @flags: Optional flags. +* +* Supported values: +* +* I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the kernel that +* the object will need to be accessed via the CPU. +* +* Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and +* only strictly required on platforms where only some of the device +* memory is directly visible or mappable through the CPU, like on DG2+. +* +* One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to +* ensure we can always spill the allocation to system memory, if we +* can't place the object in the mappable part of +* I915_MEMORY_CLASS_DEVICE. +* +* Note that buffers that need to be captured with EXEC_OBJECT_CAPTURE, +* will need to enable this hint, if the object can also be placed in +* I915_MEMORY_CLASS_DEVICE, starting from DG2+. The execbuf call will +* throw an error otherwise. This also means that such objects will need +* I915_MEMORY_CLASS_SYSTEM set as a possible placement. +* +* Without this hint, the kernel will assume that non-mappable +* I915_MEMORY_CLASS_DEVICE is preferred for this object. Note that the +* kernel can still migrate the object to the mappable part, as a last +* resort, if userspace ever CPU faults this object, but this might be +* expensive, and so ideally should be avoided. +*/ +#define I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS (1 << 0) + __u32 flags; + /** +* @extensions: The chain of extensions to apply to this object. +* +* This will be useful in the future when we need to support several +* different extensions, and we need to apply more than one when +* creating the object. See struct i915_user_extension. +* +* If we don't supply any extensions then we get the same old gem_create +* behaviour. +* +* For I915_GEM_CREATE_EXT_MEMORY_REGIONS usage see +* struct drm_i915_gem_create_ext_memory_regions. +* +* For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see +* struct drm_i915_gem_create_ext_protected_content. 
+*/ +#define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0 +#define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1 + __u64 extensions; +}; + +#define DRM_I915_QUERY_VMA_INFO5 + +/** + * struct __drm_i915_query_vma_info + * + * Given a vm and GTT address, lookup the corresponding vma, returning its set + * of attributes. + * + * .. code-block:: C + * + * struct drm_i915_query_vma_info info = {}; + * struct drm_i915_query_item item = { + * .data_ptr = (uintptr_t), + *
[Intel-gfx] [PATCH 2/2] drm/doc: add rfc section for small BAR uapi
Add an entry for the new uapi needed for small BAR on DG2+.

Signed-off-by: Matthew Auld
Cc: Thomas Hellström
Cc: Jon Bloomfield
Cc: Daniel Vetter
Cc: Jordan Justen
Cc: Kenneth Graunke
Cc: mesa-...@lists.freedesktop.org
---
 Documentation/gpu/rfc/i915_small_bar.h   | 153 +++
 Documentation/gpu/rfc/i915_small_bar.rst |  40 ++
 Documentation/gpu/rfc/index.rst          |   4 +
 3 files changed, 197 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
 create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h b/Documentation/gpu/rfc/i915_small_bar.h
new file mode 100644
index ..fa65835fd608
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,153 @@
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added
+ * extension support using struct i915_user_extension.
+ *
+ * Note that in the future we want to have our buffer flags here, at least for
+ * the stuff that is immutable. Previously we would have two ioctls, one to
+ * create the object with gem_create, and another to apply various parameters,
+ * however this creates some ambiguity for the params which are considered
+ * immutable. Also in general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+	/**
+	 * @size: Requested size for the object.
+	 *
+	 * The (page-aligned) allocated size for the object will be returned.
+	 *
+	 * Note that for some devices we have might have further minimum
+	 * page-size restrictions(larger than 4K), like for device local-memory.
+	 * However in general the final size here should always reflect any
+	 * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS
+	 * extension to place the object in device local-memory.
+	 */
+	__u64 size;
+	/**
+	 * @handle: Returned handle for the object.
+	 *
+	 * Object handles are nonzero.
+	 */
+	__u32 handle;
+	/**
+	 * @flags: Optional flags.
+	 *
+	 * Supported values:
+	 *
+	 * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the kernel that
+	 * the object will need to be accessed via the CPU.
+	 *
+	 * Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and
+	 * only strictly required on platforms where only some of the device
+	 * memory is directly visible or mappable through the CPU, like on DG2+.
+	 *
+	 * One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to
+	 * ensure we can always spill the allocation to system memory, if we
+	 * can't place the object in the mappable part of
+	 * I915_MEMORY_CLASS_DEVICE.
+	 *
+	 * Note that buffers that need to be captured with EXEC_OBJECT_CAPTURE,
+	 * will need to enable this hint, if the object can also be placed in
+	 * I915_MEMORY_CLASS_DEVICE, starting from DG2+. The execbuf call will
+	 * throw an error otherwise. This also means that such objects will need
+	 * I915_MEMORY_CLASS_SYSTEM set as a possible placement.
+	 *
+	 * Without this hint, the kernel will assume that non-mappable
+	 * I915_MEMORY_CLASS_DEVICE is preferred for this object. Note that the
+	 * kernel can still migrate the object to the mappable part, as a last
+	 * resort, if userspace ever CPU faults this object, but this might be
+	 * expensive, and so ideally should be avoided.
+	 */
+#define I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS (1 << 0)
+	__u32 flags;
+	/**
+	 * @extensions: The chain of extensions to apply to this object.
+	 *
+	 * This will be useful in the future when we need to support several
+	 * different extensions, and we need to apply more than one when
+	 * creating the object. See struct i915_user_extension.
+	 *
+	 * If we don't supply any extensions then we get the same old gem_create
+	 * behaviour.
+	 *
+	 * For I915_GEM_CREATE_EXT_MEMORY_REGIONS usage see
+	 * struct drm_i915_gem_create_ext_memory_regions.
+	 *
+	 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
+	 * struct drm_i915_gem_create_ext_protected_content.
+	 */
+#define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
+#define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
+	__u64 extensions;
+};
+
+#define DRM_I915_QUERY_VMA_INFO 5
+
+/**
+ * struct __drm_i915_query_vma_info
+ *
+ * Given a vm and GTT address, lookup the corresponding vma, returning its set
+ * of attributes.
+ *
+ * .. code-block:: C
+ *
+ *	struct drm_i915_query_vma_info info = {};
+ *	struct drm_i915_query_item item = {
+ *		.data_ptr = (uintptr_t),
+ *		.query_id = DRM_I915_QUERY_VMA_INFO,
+ *	};
+ *	struct drm_i915_query query = {
+ *