Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-05-03 Thread Lionel Landwerlin

On 03/05/2022 17:27, Matthew Auld wrote:

On 03/05/2022 11:39, Lionel Landwerlin wrote:

On 03/05/2022 13:22, Matthew Auld wrote:

On 02/05/2022 09:53, Lionel Landwerlin wrote:

On 02/05/2022 10:54, Lionel Landwerlin wrote:

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 
+++

  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region 
as known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the 
driver

+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };



Trying to implement userspace support in Vulkan for this, I have 
an additional question about the value of probed_cpu_visible_size.


When is it set to -1?

I'm guessing before there is support for this value it'll be 0 (MBZ).

After after it should either be the entire lmem or something smaller.


-Lionel



Other pain point of this new uAPI, previously we could query the 
unallocated size for each heap.


unallocated_size should always give the same value as probed_size. 
We have the avail tracking, but we don't currently expose that 
through unallocated_size, due to lack of real userspace/user etc.




Now lmem is effectively divided into 2 heaps, but unallocated_size 
is tracking allocation from both parts of lmem.


Yeah, if we ever properly expose the unallocated_size, then we could 
also just add unallocated_cpu_visible_size.




Is adding new I915_MEMORY_CLASS_DEVICE_NON_MAPPABLE out of question?


I don't think it's out of the question...

I guess user-space should be able to get the current flag behaviour 
just by specifying: device, system. And it does give more flexibly 
to allow something like: device, device-nm, smem.


We can also drop the probed_cpu_visible_size, which would now just 
be the probed_size with device/device-nm. And if we lack device-nm, 
then the entire thing must be CPU mappable.


One of the downsides though, is that we can no longer easily mix 
object pages from both device + device-nm, which we could previously 
do when we didn't specify the flag. At least according to the 
current design/behaviour for @regions that would not be allowed. I 
guess some kind of new flag like ALLOC_MIXED or so? Although 
currently that is only possible with device + device-nm in ttm/i915.



Thanks, I wasn't aware of the restrictions.

Adding unallocated_cpu_visible_size would be great.


So do we want this in the next version? i.e we already have a current 
real use case in mind for unallocated_size where probed_size is not 
good enough?



Yeah in the  next iteration.

We're using unallocated_size to implement VK_EXT_memory_budget and since 
I'm going to expose lmem mappable/unmappable as 2 different heaps on 
Vulkan, I would use that there too.



-Lionel







-Lionel







-Lionel






+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create 
behaviour, with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-05-03 Thread Matthew Auld

On 03/05/2022 11:39, Lionel Landwerlin wrote:

On 03/05/2022 13:22, Matthew Auld wrote:

On 02/05/2022 09:53, Lionel Landwerlin wrote:

On 02/05/2022 10:54, Lionel Landwerlin wrote:

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 
+++

  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };



Trying to implement userspace support in Vulkan for this, I have an 
additional question about the value of probed_cpu_visible_size.


When is it set to -1?

I'm guessing before there is support for this value it'll be 0 (MBZ).

After after it should either be the entire lmem or something smaller.


-Lionel



Other pain point of this new uAPI, previously we could query the 
unallocated size for each heap.


unallocated_size should always give the same value as probed_size. We 
have the avail tracking, but we don't currently expose that through 
unallocated_size, due to lack of real userspace/user etc.




Now lmem is effectively divided into 2 heaps, but unallocated_size is 
tracking allocation from both parts of lmem.


Yeah, if we ever properly expose the unallocated_size, then we could 
also just add unallocated_cpu_visible_size.




Is adding new I915_MEMORY_CLASS_DEVICE_NON_MAPPABLE out of question?


I don't think it's out of the question...

I guess user-space should be able to get the current flag behaviour 
just by specifying: device, system. And it does give more flexibly to 
allow something like: device, device-nm, smem.


We can also drop the probed_cpu_visible_size, which would now just be 
the probed_size with device/device-nm. And if we lack device-nm, then 
the entire thing must be CPU mappable.


One of the downsides though, is that we can no longer easily mix 
object pages from both device + device-nm, which we could previously 
do when we didn't specify the flag. At least according to the current 
design/behaviour for @regions that would not be allowed. I guess some 
kind of new flag like ALLOC_MIXED or so? Although currently that is 
only possible with device + device-nm in ttm/i915.



Thanks, I wasn't aware of the restrictions.

Adding unallocated_cpu_visible_size would be great.


So do we want this in the next version? i.e we already have a current 
real use case in mind for unallocated_size where probed_size is not good 
enough?





-Lionel







-Lionel






+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create 
behaviour, with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for 
the stuff that
+ * is immutable. Previously we would have two ioctls, one to 
create the object
+ * with gem_create, and another to apply various parameters, 
however this
+ * creates some ambiguity for the params which are considered 
immutable. Also 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-05-03 Thread Tvrtko Ursulin



On 03/05/2022 11:22, Matthew Auld wrote:

On 02/05/2022 09:53, Lionel Landwerlin wrote:

On 02/05/2022 10:54, Lionel Landwerlin wrote:

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 
+++

  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };



Trying to implement userspace support in Vulkan for this, I have an 
additional question about the value of probed_cpu_visible_size.


When is it set to -1?

I'm guessing before there is support for this value it'll be 0 (MBZ).

After after it should either be the entire lmem or something smaller.


-Lionel



Other pain point of this new uAPI, previously we could query the 
unallocated size for each heap.


unallocated_size should always give the same value as probed_size. We 
have the avail tracking, but we don't currently expose that through 
unallocated_size, due to lack of real userspace/user etc.




Now lmem is effectively divided into 2 heaps, but unallocated_size is 
tracking allocation from both parts of lmem.


Yeah, if we ever properly expose the unallocated_size, then we could 
also just add unallocated_cpu_visible_size.




Is adding new I915_MEMORY_CLASS_DEVICE_NON_MAPPABLE out of question?


I don't think it's out of the question...

I guess user-space should be able to get the current flag behaviour just 
by specifying: device, system. And it does give more flexibly to allow 
something like: device, device-nm, smem.


I was also thinking about that option, albeit both regions under the 
existing class just separate instances with "capability" flags differing.


Downsides I thought were a) it does not really match the underlying 
resource, which is one and not two from the backing storage POV, and b) 
it allows userspace to do potentially do too much of restrictive 
regions=device-mappable,system  (even if only an innocent mistake); 
disallowing i915 to manage the space better in cases where multiple 
clients happen to fight over it.


The last part is going back to what I was commenting earlier, where I 
though migrating objects which had cpu-access set to any device should 
be allowed. (When mappable region is over subscribed.)


Regards,

Tvrtko

We can also drop the probed_cpu_visible_size, which would now just be 
the probed_size with device/device-nm. And if we lack device-nm, then 
the entire thing must be CPU mappable.


One of the downsides though, is that we can no longer easily mix object 
pages from both device + device-nm, which we could previously do when we 
didn't specify the flag. At least according to the current 
design/behaviour for @regions that would not be allowed. I guess some 
kind of new flag like ALLOC_MIXED or so? Although currently that is only 
possible with device + device-nm in ttm/i915.





-Lionel






+    };
+};
+
+/**
+ * struct 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-05-03 Thread Lionel Landwerlin

On 03/05/2022 13:22, Matthew Auld wrote:

On 02/05/2022 09:53, Lionel Landwerlin wrote:

On 02/05/2022 10:54, Lionel Landwerlin wrote:

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 
+++

  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };



Trying to implement userspace support in Vulkan for this, I have an 
additional question about the value of probed_cpu_visible_size.


When is it set to -1?

I'm guessing before there is support for this value it'll be 0 (MBZ).

After after it should either be the entire lmem or something smaller.


-Lionel



Other pain point of this new uAPI, previously we could query the 
unallocated size for each heap.


unallocated_size should always give the same value as probed_size. We 
have the avail tracking, but we don't currently expose that through 
unallocated_size, due to lack of real userspace/user etc.




Now lmem is effectively divided into 2 heaps, but unallocated_size is 
tracking allocation from both parts of lmem.


Yeah, if we ever properly expose the unallocated_size, then we could 
also just add unallocated_cpu_visible_size.




Is adding new I915_MEMORY_CLASS_DEVICE_NON_MAPPABLE out of question?


I don't think it's out of the question...

I guess user-space should be able to get the current flag behaviour 
just by specifying: device, system. And it does give more flexibly to 
allow something like: device, device-nm, smem.


We can also drop the probed_cpu_visible_size, which would now just be 
the probed_size with device/device-nm. And if we lack device-nm, then 
the entire thing must be CPU mappable.


One of the downsides though, is that we can no longer easily mix 
object pages from both device + device-nm, which we could previously 
do when we didn't specify the flag. At least according to the current 
design/behaviour for @regions that would not be allowed. I guess some 
kind of new flag like ALLOC_MIXED or so? Although currently that is 
only possible with device + device-nm in ttm/i915.



Thanks, I wasn't aware of the restrictions.

Adding unallocated_cpu_visible_size would be great.


-Lionel







-Lionel






+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create 
behaviour, with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for 
the stuff that
+ * is immutable. Previously we would have two ioctls, one to 
create the object
+ * with gem_create, and another to apply various parameters, 
however this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-05-03 Thread Matthew Auld

On 02/05/2022 09:53, Lionel Landwerlin wrote:

On 02/05/2022 10:54, Lionel Landwerlin wrote:

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 +++
  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };



Trying to implement userspace support in Vulkan for this, I have an 
additional question about the value of probed_cpu_visible_size.


When is it set to -1?

I'm guessing before there is support for this value it'll be 0 (MBZ).

After after it should either be the entire lmem or something smaller.


-Lionel



Other pain point of this new uAPI, previously we could query the 
unallocated size for each heap.


unallocated_size should always give the same value as probed_size. We 
have the avail tracking, but we don't currently expose that through 
unallocated_size, due to lack of real userspace/user etc.




Now lmem is effectively divided into 2 heaps, but unallocated_size is 
tracking allocation from both parts of lmem.


Yeah, if we ever properly expose the unallocated_size, then we could 
also just add unallocated_cpu_visible_size.




Is adding new I915_MEMORY_CLASS_DEVICE_NON_MAPPABLE out of question?


I don't think it's out of the question...

I guess user-space should be able to get the current flag behaviour just 
by specifying: device, system. And it does give more flexibly to allow 
something like: device, device-nm, smem.


We can also drop the probed_cpu_visible_size, which would now just be 
the probed_size with device/device-nm. And if we lack device-nm, then 
the entire thing must be CPU mappable.


One of the downsides though, is that we can no longer easily mix object 
pages from both device + device-nm, which we could previously do when we 
didn't specify the flag. At least according to the current 
design/behaviour for @regions that would not be allowed. I guess some 
kind of new flag like ALLOC_MIXED or so? Although currently that is only 
possible with device + device-nm in ttm/i915.





-Lionel






+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, 
with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for the 
stuff that
+ * is immutable. Previously we would have two ioctls, one to create 
the object
+ * with gem_create, and another to apply various parameters, however 
this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated size for the object will be 
returned.

+ *
+ * Note that for some devices we have might have further minimum
+ * page-size restrictions(larger 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-05-03 Thread Lionel Landwerlin

On 03/05/2022 12:07, Matthew Auld wrote:

On 02/05/2022 19:03, Lionel Landwerlin wrote:

On 02/05/2022 20:58, Abodunrin, Akeem G wrote:



-Original Message-
From: Landwerlin, Lionel G 
Sent: Monday, May 2, 2022 12:55 AM
To: Auld, Matthew ; 
intel-...@lists.freedesktop.org

Cc: dri-de...@lists.freedesktop.org; Thomas Hellström
; Bloomfield, Jon
; Daniel Vetter ; 
Justen,

Jordan L ; Kenneth Graunke
; Abodunrin, Akeem G
; mesa-dev@lists.freedesktop.org
Subject: Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
    - Some spelling fixes and other small tweaks. (Akeem & Thomas)
    - Rework error capture interactions, including no longer needing
  NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
    - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
   Documentation/gpu/rfc/i915_small_bar.h   | 190

+++

Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
   Documentation/gpu/rfc/index.rst  |   4 +
   3 files changed, 252 insertions(+)
   create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
   create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h
b/Documentation/gpu/rfc/i915_small_bar.h
new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as
+known to the
+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct

drm_i915_query.

+ * For this new query we are adding the new query id
+DRM_I915_QUERY_MEMORY_REGIONS
+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+   /** @region: The class:instance pair encoding */
+   struct drm_i915_gem_memory_class_instance region;
+
+   /** @rsvd0: MBZ */
+   __u32 rsvd0;
+
+   /** @probed_size: Memory probed by the driver (-1 = unknown) */
+   __u64 probed_size;
+
+   /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown)

*/

+   __u64 unallocated_size;
+
+   union {
+   /** @rsvd1: MBZ */
+   __u64 rsvd1[8];
+   struct {
+   /**
+    * @probed_cpu_visible_size: Memory probed by the

driver

+    * that is CPU accessible. (-1 = unknown).
+    *
+    * This will be always be <= @probed_size, and 
the

+    * remainder(if there is any) will not be CPU
+    * accessible.
+    */
+   __u64 probed_cpu_visible_size;
+   };


Trying to implement userspace support in Vulkan for this, I have an 
additional

question about the value of probed_cpu_visible_size.

When is it set to -1?
I believe it is set to -1 if it is unknown, and/or not cpu 
accessible...


Cheers!
~Akeem



So what should I expect on system memory?


I guess just probed_cpu_visible_size == probed_size. Or maybe we can 
just use -1 here?




What value is returned when all of probed_size is CPU visible on 
local memory?


probed_size == probed_cpu_visible_size.



Thanks, looks good to me.

Then maybe we should update the comment to say that.

Looks like there are no cases where we'll get -1.


-Lionel







Thanks,


-Lionel



I'm guessing before there is support for this value it'll be 0 (MBZ).

After after it should either be the entire lmem or something smaller.


-Lionel



+   };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour,
+with added
+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for the
+stuff that
+ * is immutable. Previously we would have two ioctls, one to create
+the object
+ * with gem_create, and another to apply various parameters, however
+this
+ * creates some ambiguity for the params which are considered
+immutable. Also in
+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+   /**
+    * @size: Requested size for the object.
+    *
+    * The (page-aligned) allocated size for the object will be 
returned.

+    *
+    * Note that for some devices we have might have further minimum
+    * page-size restrictions(larger than 4K), like for device 
local-memory.
+    * However in general the final size here should always 
reflect any

+    * rounding up, if for example using the

I915_GEM_CREATE_EXT_MEMORY_REGIONS

+    * extension to place the object in device local-memory.
+    */
+   __u64 size;
+   /**
+    * @handle: Returned handle for the object.
+    *
+    * Object handles are nonzero.
+    */
+   __u32

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-05-03 Thread Matthew Auld

On 02/05/2022 19:03, Lionel Landwerlin wrote:

On 02/05/2022 20:58, Abodunrin, Akeem G wrote:



-Original Message-
From: Landwerlin, Lionel G 
Sent: Monday, May 2, 2022 12:55 AM
To: Auld, Matthew ; 
intel-...@lists.freedesktop.org

Cc: dri-de...@lists.freedesktop.org; Thomas Hellström
; Bloomfield, Jon
; Daniel Vetter ; 
Justen,

Jordan L ; Kenneth Graunke
; Abodunrin, Akeem G
; mesa-dev@lists.freedesktop.org
Subject: Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
    - Some spelling fixes and other small tweaks. (Akeem & Thomas)
    - Rework error capture interactions, including no longer needing
  NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
    - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
   Documentation/gpu/rfc/i915_small_bar.h   | 190

+++

   Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
   Documentation/gpu/rfc/index.rst  |   4 +
   3 files changed, 252 insertions(+)
   create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
   create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h
b/Documentation/gpu/rfc/i915_small_bar.h
new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as
+known to the
+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct

drm_i915_query.

+ * For this new query we are adding the new query id
+DRM_I915_QUERY_MEMORY_REGIONS
+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+   /** @region: The class:instance pair encoding */
+   struct drm_i915_gem_memory_class_instance region;
+
+   /** @rsvd0: MBZ */
+   __u32 rsvd0;
+
+   /** @probed_size: Memory probed by the driver (-1 = unknown) */
+   __u64 probed_size;
+
+   /** @unallocated_size: Estimate of memory remaining (-1 = unknown)

*/

+   __u64 unallocated_size;
+
+   union {
+   /** @rsvd1: MBZ */
+   __u64 rsvd1[8];
+   struct {
+   /**
+    * @probed_cpu_visible_size: Memory probed by the

driver

+    * that is CPU accessible. (-1 = unknown).
+    *
+    * This will be always be <= @probed_size, and the
+    * remainder(if there is any) will not be CPU
+    * accessible.
+    */
+   __u64 probed_cpu_visible_size;
+   };


Trying to implement userspace support in Vulkan for this, I have an 
additional

question about the value of probed_cpu_visible_size.

When is it set to -1?

I believe it is set to -1 if it is unknown, and/or not cpu accessible...

Cheers!
~Akeem



So what should I expect on system memory?


I guess just probed_cpu_visible_size == probed_size. Or maybe we can 
just use -1 here?




What value is returned when all of probed_size is CPU visible on local 
memory?


probed_size == probed_cpu_visible_size.




Thanks,


-Lionel



I'm guessing before there is support for this value it'll be 0 (MBZ).

After after it should either be the entire lmem or something smaller.


-Lionel



+   };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour,
+with added
+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for the
+stuff that
+ * is immutable. Previously we would have two ioctls, one to create
+the object
+ * with gem_create, and another to apply various parameters, however
+this
+ * creates some ambiguity for the params which are considered
+immutable. Also in
+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+   /**
+    * @size: Requested size for the object.
+    *
+    * The (page-aligned) allocated size for the object will be 
returned.

+    *
+    * Note that for some devices we have might have further minimum
+    * page-size restrictions(larger than 4K), like for device 
local-memory.

+    * However in general the final size here should always reflect any
+    * rounding up, if for example using the

I915_GEM_CREATE_EXT_MEMORY_REGIONS

+    * extension to place the object in device local-memory.
+    */
+   __u64 size;
+   /**
+    * @handle: Returned handle for the object.
+    *
+    * Object handles are nonzero.
+    */
+   __u32 handle;
+   /**
+    * @flags: Optional flags.
+    *
+    * Supported values:
+    *
+    * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the

kernel that

+    * the object will need to be a

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-05-03 Thread Matthew Auld

On 02/05/2022 08:54, Lionel Landwerlin wrote:

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 +++
  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };



Trying to implement userspace support in Vulkan for this, I have an 
additional question about the value of probed_cpu_visible_size.


When is it set to -1?


I don't anything is currently using -1, for any of these fields.



I'm guessing before there is support for this value it'll be 0 (MBZ).

After after it should either be the entire lmem or something smaller.


Yup.




-Lionel



+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, 
with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for the 
stuff that
+ * is immutable. Previously we would have two ioctls, one to create 
the object
+ * with gem_create, and another to apply various parameters, however 
this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated size for the object will be 
returned.

+ *
+ * Note that for some devices we have might have further minimum
+ * page-size restrictions(larger than 4K), like for device 
local-memory.

+ * However in general the final size here should always reflect any
+ * rounding up, if for example using the 
I915_GEM_CREATE_EXT_MEMORY_REGIONS

+ * extension to place the object in device local-memory.
+ */
+    __u64 size;
+    /**
+ * @handle: Returned handle for the object.
+ *
+ * Object handles are nonzero.
+ */
+    __u32 handle;
+    /**
+ * @flags: Optional flags.
+ *
+ * Supported values:
+ *
+ * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the 
kernel that

+ * the object will need to be accessed via the CPU.
+ *
+ * Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and
+ * only strictly required on platforms where only some of the device
+ * memory is directly visible or mappable through the CPU, like 
on DG2+.

+ *
+ * One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to
+ * ensure we can always spill the allocation to system memory, if we
+ * can't place the object in the mappable part of
+ * I915_MEMORY_CLASS_DEVICE.
+ *
+ * Note that since the kernel only supports flat-CCS on objects 
that can

+ * *only* be placed in I915_MEMORY_CLASS_DEVICE, we therefore don't
+ * support I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS together with
+ * flat-CCS.
+ *
+ * Without this hint, the kernel 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-05-02 Thread Lionel Landwerlin

On 02/05/2022 20:58, Abodunrin, Akeem G wrote:



-Original Message-
From: Landwerlin, Lionel G 
Sent: Monday, May 2, 2022 12:55 AM
To: Auld, Matthew ; intel-...@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org; Thomas Hellström
; Bloomfield, Jon
; Daniel Vetter ; Justen,
Jordan L ; Kenneth Graunke
; Abodunrin, Akeem G
; mesa-dev@lists.freedesktop.org
Subject: Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
- Some spelling fixes and other small tweaks. (Akeem & Thomas)
- Rework error capture interactions, including no longer needing
  NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
- Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
   Documentation/gpu/rfc/i915_small_bar.h   | 190

+++

   Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
   Documentation/gpu/rfc/index.rst  |   4 +
   3 files changed, 252 insertions(+)
   create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
   create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h
b/Documentation/gpu/rfc/i915_small_bar.h
new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as
+known to the
+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct

drm_i915_query.

+ * For this new query we are adding the new query id
+DRM_I915_QUERY_MEMORY_REGIONS
+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+   /** @region: The class:instance pair encoding */
+   struct drm_i915_gem_memory_class_instance region;
+
+   /** @rsvd0: MBZ */
+   __u32 rsvd0;
+
+   /** @probed_size: Memory probed by the driver (-1 = unknown) */
+   __u64 probed_size;
+
+   /** @unallocated_size: Estimate of memory remaining (-1 = unknown)

*/

+   __u64 unallocated_size;
+
+   union {
+   /** @rsvd1: MBZ */
+   __u64 rsvd1[8];
+   struct {
+   /**
+* @probed_cpu_visible_size: Memory probed by the

driver

+* that is CPU accessible. (-1 = unknown).
+*
+* This will be always be <= @probed_size, and the
+* remainder(if there is any) will not be CPU
+* accessible.
+*/
+   __u64 probed_cpu_visible_size;
+   };


Trying to implement userspace support in Vulkan for this, I have an additional
question about the value of probed_cpu_visible_size.

When is it set to -1?

I believe it is set to -1 if it is unknown, and/or not cpu accessible...

Cheers!
~Akeem



So what should I expect on system memory?

What value is returned when all of probed_size is CPU visible on local 
memory?



Thanks,


-Lionel



I'm guessing before there is support for this value it'll be 0 (MBZ).

After after it should either be the entire lmem or something smaller.


-Lionel



+   };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour,
+with added
+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for the
+stuff that
+ * is immutable. Previously we would have two ioctls, one to create
+the object
+ * with gem_create, and another to apply various parameters, however
+this
+ * creates some ambiguity for the params which are considered
+immutable. Also in
+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+   /**
+* @size: Requested size for the object.
+*
+* The (page-aligned) allocated size for the object will be returned.
+*
+* Note that for some devices we have might have further minimum
+* page-size restrictions(larger than 4K), like for device local-memory.
+* However in general the final size here should always reflect any
+* rounding up, if for example using the

I915_GEM_CREATE_EXT_MEMORY_REGIONS

+* extension to place the object in device local-memory.
+*/
+   __u64 size;
+   /**
+* @handle: Returned handle for the object.
+*
+* Object handles are nonzero.
+*/
+   __u32 handle;
+   /**
+* @flags: Optional flags.
+*
+* Supported values:
+*
+* I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the

kernel that

+* the object will need to be accessed via the CPU.
+*
+* Only valid when placing objects in I915_MEMORY_CLASS_DEVICE,

and

+* only strictly required on platforms where only some of the device
+* memory is d

RE: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-05-02 Thread Abodunrin, Akeem G


> -Original Message-
> From: Landwerlin, Lionel G 
> Sent: Monday, May 2, 2022 12:55 AM
> To: Auld, Matthew ; intel-...@lists.freedesktop.org
> Cc: dri-de...@lists.freedesktop.org; Thomas Hellström
> ; Bloomfield, Jon
> ; Daniel Vetter ; Justen,
> Jordan L ; Kenneth Graunke
> ; Abodunrin, Akeem G
> ; mesa-dev@lists.freedesktop.org
> Subject: Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi
> 
> On 20/04/2022 20:13, Matthew Auld wrote:
> > Add an entry for the new uapi needed for small BAR on DG2+.
> >
> > v2:
> >- Some spelling fixes and other small tweaks. (Akeem & Thomas)
> >- Rework error capture interactions, including no longer needing
> >  NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
> >- Add probed_cpu_visible_size. (Lionel)
> >
> > Signed-off-by: Matthew Auld 
> > Cc: Thomas Hellström 
> > Cc: Lionel Landwerlin 
> > Cc: Jon Bloomfield 
> > Cc: Daniel Vetter 
> > Cc: Jordan Justen 
> > Cc: Kenneth Graunke 
> > Cc: Akeem G Abodunrin 
> > Cc: mesa-dev@lists.freedesktop.org
> > ---
> >   Documentation/gpu/rfc/i915_small_bar.h   | 190
> +++
> >   Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
> >   Documentation/gpu/rfc/index.rst  |   4 +
> >   3 files changed, 252 insertions(+)
> >   create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
> >   create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst
> >
> > diff --git a/Documentation/gpu/rfc/i915_small_bar.h
> > b/Documentation/gpu/rfc/i915_small_bar.h
> > new file mode 100644
> > index ..7bfd0cf44d35
> > --- /dev/null
> > +++ b/Documentation/gpu/rfc/i915_small_bar.h
> > @@ -0,0 +1,190 @@
> > +/**
> > + * struct __drm_i915_memory_region_info - Describes one region as
> > +known to the
> > + * driver.
> > + *
> > + * Note this is using both struct drm_i915_query_item and struct
> drm_i915_query.
> > + * For this new query we are adding the new query id
> > +DRM_I915_QUERY_MEMORY_REGIONS
> > + * at _i915_query_item.query_id.
> > + */
> > +struct __drm_i915_memory_region_info {
> > +   /** @region: The class:instance pair encoding */
> > +   struct drm_i915_gem_memory_class_instance region;
> > +
> > +   /** @rsvd0: MBZ */
> > +   __u32 rsvd0;
> > +
> > +   /** @probed_size: Memory probed by the driver (-1 = unknown) */
> > +   __u64 probed_size;
> > +
> > +   /** @unallocated_size: Estimate of memory remaining (-1 = unknown)
> */
> > +   __u64 unallocated_size;
> > +
> > +   union {
> > +   /** @rsvd1: MBZ */
> > +   __u64 rsvd1[8];
> > +   struct {
> > +   /**
> > +* @probed_cpu_visible_size: Memory probed by the
> driver
> > +* that is CPU accessible. (-1 = unknown).
> > +*
> > +* This will be always be <= @probed_size, and the
> > +* remainder(if there is any) will not be CPU
> > +* accessible.
> > +*/
> > +   __u64 probed_cpu_visible_size;
> > +   };
> 
> 
> Trying to implement userspace support in Vulkan for this, I have an additional
> question about the value of probed_cpu_visible_size.
> 
> When is it set to -1?
I believe it is set to -1 if it is unknown, and/or not cpu accessible... 

Cheers!
~Akeem
> 
> I'm guessing before there is support for this value it'll be 0 (MBZ).
> 
> After after it should either be the entire lmem or something smaller.
> 
> 
> -Lionel
> 
> 
> > +   };
> > +};
> > +
> > +/**
> > + * struct __drm_i915_gem_create_ext - Existing gem_create behaviour,
> > +with added
> > + * extension support using struct i915_user_extension.
> > + *
> > + * Note that new buffer flags should be added here, at least for the
> > +stuff that
> > + * is immutable. Previously we would have two ioctls, one to create
> > +the object
> > + * with gem_create, and another to apply various parameters, however
> > +this
> > + * creates some ambiguity for the params which are considered
> > +immutable. Also in
> > + * general we're phasing out the various SET/GET ioctls.
> > + */
> > +struct __drm_i915_gem_create_ext {
> > +   /**
> > +* @size: Requested size for the object.
> > +*
> > +* The (page-aligned) allocated size for the object will be returned.
&g

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-05-02 Thread Lionel Landwerlin

On 02/05/2022 10:54, Lionel Landwerlin wrote:

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 +++
  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };



Trying to implement userspace support in Vulkan for this, I have an 
additional question about the value of probed_cpu_visible_size.


When is it set to -1?

I'm guessing before there is support for this value it'll be 0 (MBZ).

After after it should either be the entire lmem or something smaller.


-Lionel



Other pain point of this new uAPI, previously we could query the 
unallocated size for each heap.


Now lmem is effectively divided into 2 heaps, but unallocated_size is 
tracking allocation from both parts of lmem.


Is adding new I915_MEMORY_CLASS_DEVICE_NON_MAPPABLE out of question?


-Lionel






+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, 
with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for the 
stuff that
+ * is immutable. Previously we would have two ioctls, one to create 
the object
+ * with gem_create, and another to apply various parameters, however 
this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated size for the object will be 
returned.

+ *
+ * Note that for some devices we have might have further minimum
+ * page-size restrictions(larger than 4K), like for device 
local-memory.

+ * However in general the final size here should always reflect any
+ * rounding up, if for example using the 
I915_GEM_CREATE_EXT_MEMORY_REGIONS

+ * extension to place the object in device local-memory.
+ */
+    __u64 size;
+    /**
+ * @handle: Returned handle for the object.
+ *
+ * Object handles are nonzero.
+ */
+    __u32 handle;
+    /**
+ * @flags: Optional flags.
+ *
+ * Supported values:
+ *
+ * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the 
kernel that

+ * the object will need to be accessed via the CPU.
+ *
+ * Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and
+ * only strictly required on platforms where only some of the 
device
+ * memory is directly visible or mappable through the CPU, like 
on DG2+.

+ *
+ * One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to
+ * ensure we can always spill the allocation to system memory, 
if we

+ * can't place the object in the mappable part of
+ * I915_MEMORY_CLASS_DEVICE.
+ *
+ * Note that since the kernel only supports flat-CCS on 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-05-02 Thread Lionel Landwerlin

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 +++
  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h
new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as known to the
+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS
+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+   /** @region: The class:instance pair encoding */
+   struct drm_i915_gem_memory_class_instance region;
+
+   /** @rsvd0: MBZ */
+   __u32 rsvd0;
+
+   /** @probed_size: Memory probed by the driver (-1 = unknown) */
+   __u64 probed_size;
+
+   /** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
+   __u64 unallocated_size;
+
+   union {
+   /** @rsvd1: MBZ */
+   __u64 rsvd1[8];
+   struct {
+   /**
+* @probed_cpu_visible_size: Memory probed by the driver
+* that is CPU accessible. (-1 = unknown).
+*
+* This will be always be <= @probed_size, and the
+* remainder(if there is any) will not be CPU
+* accessible.
+*/
+   __u64 probed_cpu_visible_size;
+   };



Trying to implement userspace support in Vulkan for this, I have an 
additional question about the value of probed_cpu_visible_size.


When is it set to -1?

I'm guessing before there is support for this value it'll be 0 (MBZ).

After after it should either be the entire lmem or something smaller.


-Lionel



+   };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added
+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for the stuff that
+ * is immutable. Previously we would have two ioctls, one to create the object
+ * with gem_create, and another to apply various parameters, however this
+ * creates some ambiguity for the params which are considered immutable. Also 
in
+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+   /**
+* @size: Requested size for the object.
+*
+* The (page-aligned) allocated size for the object will be returned.
+*
+* Note that for some devices we have might have further minimum
+* page-size restrictions(larger than 4K), like for device local-memory.
+* However in general the final size here should always reflect any
+* rounding up, if for example using the 
I915_GEM_CREATE_EXT_MEMORY_REGIONS
+* extension to place the object in device local-memory.
+*/
+   __u64 size;
+   /**
+* @handle: Returned handle for the object.
+*
+* Object handles are nonzero.
+*/
+   __u32 handle;
+   /**
+* @flags: Optional flags.
+*
+* Supported values:
+*
+* I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the kernel that
+* the object will need to be accessed via the CPU.
+*
+* Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and
+* only strictly required on platforms where only some of the device
+* memory is directly visible or mappable through the CPU, like on DG2+.
+*
+* One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to
+* ensure we can always spill the allocation to system memory, if we
+* can't place the object in the mappable part of
+* I915_MEMORY_CLASS_DEVICE.
+*
+* Note that since the kernel only supports flat-CCS on objects that can
+* *only* be placed in I915_MEMORY_CLASS_DEVICE, we therefore don't
+* 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-04-27 Thread Lionel Landwerlin

On 27/04/2022 18:18, Matthew Auld wrote:

On 27/04/2022 07:48, Lionel Landwerlin wrote:
One question though, how do we detect that this flag 
(I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given 
kernel?
I assume older kernels are going to reject object creation if we use 
this flag?


From some offline discussion with Lionel, the plan here is to just do 
a dummy gem_create_ext to check if the kernel throws an error with the 
new flag or not.




I didn't plan to use __drm_i915_query_vma_info, but isn't it 
inconsistent to select the placement on the GEM object and then query 
whether it's mappable by address?
You made a comment stating this is racy, wouldn't querying on the GEM 
object prevent this?


Since mesa at this time doesn't currently have a use for this one, 
then I guess we should maybe just drop this part of the uapi, in this 
version at least, if no objections.



Just repeating what we discussed (maybe I missed some other discussion 
and that's why I was confused) :



The way I was planning to use this is to have 3 heaps in Vulkan :

    - heap0: local only, no cpu visible

    - heap1: system, cpu visible

    - heap2: local & cpu visible


With heap2 having the reported probed_cpu_visible_size size.

It is an error for the application to map from heap0 [1].


With that said, it means if we created a GEM BO without 
I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS, we'll never mmap it.


So why the query?

I guess it would be useful when we import a buffer from another 
application. But in that case, why not have the query on the BO?



-Lionel


[1] : 
https://www.khronos.org/registry/vulkan/specs/1.3-extensions/man/html/vkMapMemory.html 
(VUID-vkMapMemory-memory-00682)






Thanks,

-Lionel

On 27/04/2022 09:35, Lionel Landwerlin wrote:

Hi Matt,


The proposal looks good to me.

Looking forward to try it on drm-tip.


-Lionel

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 
+++

  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };
+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create 
behaviour, with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for 
the stuff that
+ * is immutable. Previously we would have two ioctls, one to 
create the object
+ * with gem_create, and another to apply various parameters, 
however this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated size for the object will be 
returned.

+ *
+ * Note that for some devices we have might have further minimum
+ 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-04-27 Thread Matthew Auld

On 27/04/2022 07:48, Lionel Landwerlin wrote:
One question though, how do we detect that this flag 
(I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given kernel?
I assume older kernels are going to reject object creation if we use 
this flag?


From some offline discussion with Lionel, the plan here is to just do a 
dummy gem_create_ext to check if the kernel throws an error with the new 
flag or not.




I didn't plan to use __drm_i915_query_vma_info, but isn't it 
inconsistent to select the placement on the GEM object and then query 
whether it's mappable by address?
You made a comment stating this is racy, wouldn't querying on the GEM 
object prevent this?


Since mesa at this time doesn't currently have a use for this one, then 
I guess we should maybe just drop this part of the uapi, in this version 
at least, if no objections.




Thanks,

-Lionel

On 27/04/2022 09:35, Lionel Landwerlin wrote:

Hi Matt,


The proposal looks good to me.

Looking forward to try it on drm-tip.


-Lionel

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 +++
  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };
+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, 
with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for the 
stuff that
+ * is immutable. Previously we would have two ioctls, one to create 
the object
+ * with gem_create, and another to apply various parameters, however 
this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated size for the object will be 
returned.

+ *
+ * Note that for some devices we have might have further minimum
+ * page-size restrictions(larger than 4K), like for device 
local-memory.

+ * However in general the final size here should always reflect any
+ * rounding up, if for example using the 
I915_GEM_CREATE_EXT_MEMORY_REGIONS

+ * extension to place the object in device local-memory.
+ */
+    __u64 size;
+    /**
+ * @handle: Returned handle for the object.
+ *
+ * Object handles are nonzero.
+ */
+    __u32 handle;
+    /**
+ * @flags: Optional flags.
+ *
+ * Supported values:
+ *
+ * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the 
kernel that

+ * the object will need to be accessed via the CPU.
+ *
+ * Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and
+ * only strictly required on platforms where only some of the 
device
+ * memory is directly visible or 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-04-27 Thread Daniel Vetter
On Wed, Apr 27, 2022 at 08:55:07AM +0200, Christian König wrote:
> Well usually we increment the drm minor version when adding some new flags
> on amdgpu.
> 
> Additional to that just one comment from our experience with that: You don't
> just need one flag, but two. The first one is a hint which says "CPU access
> needed" and the second is a promise which says "CPU access never needed".
> 
> The background is that on a whole bunch of buffers you can 100% certain say
> that you will never ever need CPU access.
> 
> Then at least we have a whole bunch of buffers where we might need CPU
> access, but can't tell for sure.
> 
> And last we have stuff like transfer buffers you can be 100% sure that you
> need CPU access.
> 
> Separating it like this helped a lot with performance on small BAR systems.

So my assumption was that for transfer buffers you'd fill them with the
cpu first anyway, so no need for the extra flag.

I guess this if for transfer buffers for gpu -> cpu transfers, where it
would result in costly bo move and stalls and it's better to make sure
it's cpu accessible from the start? At least on current gpu we have where
there's no coherent interconnect, those buffers have to be in system
memory or your cpu access will be a disaster, so again they're naturally
cpu accessible.

What's the use-case for the "cpu access required" flag where "cpu access
before gpu access" isn't a good enough hint already to get the same perf
benefits?

Also for scanout my idea at least is that we just fail mmap when you
haven't set the flag and the scanout is pinned to unmappable, for two
reasons:
- 4k buffers are big, if we force them all into mappable things are
  non-pretty.
- You need mesa anyway to access tiled buffers, and mesa knows how to use
  a transfer buffer. That should work even when you do desktop switching
  and fastboot and stuff like that with the getfb2 ioctl should all work
  (and without getfb2 it's doomed to garbage anyway).

So only dumb kms buffers (which are linear) would ever get the
NEEDS_CPU_ACCESS flag, and only those we'd ever pin into cpu accessible
range for scanout. Is there a hole in that plan?

Cheers, Daniel

> 
> Regards,
> Christian.
> 
> Am 27.04.22 um 08:48 schrieb Lionel Landwerlin:
> > One question though, how do we detect that this flag
> > (I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given
> > kernel?
> > I assume older kernels are going to reject object creation if we use
> > this flag?
> > 
> > I didn't plan to use __drm_i915_query_vma_info, but isn't it
> > inconsistent to select the placement on the GEM object and then query
> > whether it's mappable by address?
> > You made a comment stating this is racy, wouldn't querying on the GEM
> > object prevent this?
> > 
> > Thanks,
> > 
> > -Lionel
> > 
> > On 27/04/2022 09:35, Lionel Landwerlin wrote:
> > > Hi Matt,
> > > 
> > > 
> > > The proposal looks good to me.
> > > 
> > > Looking forward to try it on drm-tip.
> > > 
> > > 
> > > -Lionel
> > > 
> > > On 20/04/2022 20:13, Matthew Auld wrote:
> > > > Add an entry for the new uapi needed for small BAR on DG2+.
> > > > 
> > > > v2:
> > > >    - Some spelling fixes and other small tweaks. (Akeem & Thomas)
> > > >    - Rework error capture interactions, including no longer needing
> > > >  NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
> > > >    - Add probed_cpu_visible_size. (Lionel)
> > > > 
> > > > Signed-off-by: Matthew Auld 
> > > > Cc: Thomas Hellström 
> > > > Cc: Lionel Landwerlin 
> > > > Cc: Jon Bloomfield 
> > > > Cc: Daniel Vetter 
> > > > Cc: Jordan Justen 
> > > > Cc: Kenneth Graunke 
> > > > Cc: Akeem G Abodunrin 
> > > > Cc: mesa-dev@lists.freedesktop.org
> > > > ---
> > > >   Documentation/gpu/rfc/i915_small_bar.h   | 190
> > > > +++
> > > >   Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
> > > >   Documentation/gpu/rfc/index.rst  |   4 +
> > > >   3 files changed, 252 insertions(+)
> > > >   create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
> > > >   create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst
> > > > 
> > > > diff --git a/Documentation/gpu/rfc/i915_small_bar.h
> > > > b/Documentation/gpu/rfc/i915_small_bar.h
> > > > new file mode 100644
> > > > index ..7bfd0cf44d35
> > > > --- /dev/null
> > > > +++ b/Documentation/gpu/rfc/i915_small_bar.h
> > > > @@ -0,0 +1,190 @@
> > > > +/**
> > > > + * struct __drm_i915_memory_region_info - Describes one region
> > > > as known to the
> > > > + * driver.
> > > > + *
> > > > + * Note this is using both struct drm_i915_query_item and
> > > > struct drm_i915_query.
> > > > + * For this new query we are adding the new query id
> > > > DRM_I915_QUERY_MEMORY_REGIONS
> > > > + * at _i915_query_item.query_id.
> > > > + */
> > > > +struct __drm_i915_memory_region_info {
> > > > +    /** @region: The class:instance pair encoding */
> > > > +    struct drm_i915_gem_memory_class_instance region;
> > > > +
> > > > +    /** @rsvd0: 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-04-27 Thread Christian König

Am 27.04.22 um 17:02 schrieb Matthew Auld:

On 27/04/2022 07:55, Christian König wrote:
Well usually we increment the drm minor version when adding some new 
flags on amdgpu.


Additional to that just one comment from our experience with that: 
You don't just need one flag, but two. The first one is a hint which 
says "CPU access needed" and the second is a promise which says "CPU 
access never needed".


The background is that on a whole bunch of buffers you can 100% 
certain say that you will never ever need CPU access.


Then at least we have a whole bunch of buffers where we might need 
CPU access, but can't tell for sure.


And last we have stuff like transfer buffers you can be 100% sure 
that you need CPU access.


Separating it like this helped a lot with performance on small BAR 
systems.


Thanks for the comments. For the "CPU access never needed" flag, what 
extra stuff does that do on the kernel side vs not specifying any 
flag/hint? I assume it still prioritizes using the non-CPU visible 
portion first? What else does it do?


It's used as a hint when you need to pin BOs for scanout for example.

In general we try to allocate BOs which are marked "CPU access needed" 
in the CPU visible window if possible, but fallback to any memory if 
that won't fit.


Christian.





Regards,
Christian.

Am 27.04.22 um 08:48 schrieb Lionel Landwerlin:
One question though, how do we detect that this flag 
(I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given 
kernel?
I assume older kernels are going to reject object creation if we use 
this flag?


I didn't plan to use __drm_i915_query_vma_info, but isn't it 
inconsistent to select the placement on the GEM object and then 
query whether it's mappable by address?
You made a comment stating this is racy, wouldn't querying on the 
GEM object prevent this?


Thanks,

-Lionel

On 27/04/2022 09:35, Lionel Landwerlin wrote:

Hi Matt,


The proposal looks good to me.

Looking forward to try it on drm-tip.


-Lionel

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 
+++

  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };
+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create 
behaviour, with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for 
the stuff that
+ * is immutable. Previously we would have two ioctls, one to 
create the object
+ * with gem_create, and another to apply various parameters, 
however this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-04-27 Thread Matthew Auld

On 27/04/2022 07:55, Christian König wrote:
Well usually we increment the drm minor version when adding some new 
flags on amdgpu.


Additional to that just one comment from our experience with that: You 
don't just need one flag, but two. The first one is a hint which says 
"CPU access needed" and the second is a promise which says "CPU access 
never needed".


The background is that on a whole bunch of buffers you can 100% certain 
say that you will never ever need CPU access.


Then at least we have a whole bunch of buffers where we might need CPU 
access, but can't tell for sure.


And last we have stuff like transfer buffers you can be 100% sure that 
you need CPU access.


Separating it like this helped a lot with performance on small BAR systems.


Thanks for the comments. For the "CPU access never needed" flag, what 
extra stuff does that do on the kernel side vs not specifying any 
flag/hint? I assume it still prioritizes using the non-CPU visible 
portion first? What else does it do?




Regards,
Christian.

Am 27.04.22 um 08:48 schrieb Lionel Landwerlin:
One question though, how do we detect that this flag 
(I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given 
kernel?
I assume older kernels are going to reject object creation if we use 
this flag?


I didn't plan to use __drm_i915_query_vma_info, but isn't it 
inconsistent to select the placement on the GEM object and then query 
whether it's mappable by address?
You made a comment stating this is racy, wouldn't querying on the GEM 
object prevent this?


Thanks,

-Lionel

On 27/04/2022 09:35, Lionel Landwerlin wrote:

Hi Matt,


The proposal looks good to me.

Looking forward to try it on drm-tip.


-Lionel

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 
+++

  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };
+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create 
behaviour, with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for 
the stuff that
+ * is immutable. Previously we would have two ioctls, one to create 
the object
+ * with gem_create, and another to apply various parameters, 
however this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated size for the object will be 
returned.

+ *
+ * Note that for some devices we have might have further minimum
+ * page-size restrictions(larger than 4K), like for device 
local-memory.
+ * However in general the final size here should always reflect 
any
+ * rounding up, 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-04-27 Thread Christian König
Well usually we increment the drm minor version when adding some new 
flags on amdgpu.


Additional to that just one comment from our experience with that: You 
don't just need one flag, but two. The first one is a hint which says 
"CPU access needed" and the second is a promise which says "CPU access 
never needed".


The background is that on a whole bunch of buffers you can 100% certain 
say that you will never ever need CPU access.


Then at least we have a whole bunch of buffers where we might need CPU 
access, but can't tell for sure.


And last we have stuff like transfer buffers you can be 100% sure that 
you need CPU access.


Separating it like this helped a lot with performance on small BAR systems.

Regards,
Christian.

Am 27.04.22 um 08:48 schrieb Lionel Landwerlin:
One question though, how do we detect that this flag 
(I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given 
kernel?
I assume older kernels are going to reject object creation if we use 
this flag?


I didn't plan to use __drm_i915_query_vma_info, but isn't it 
inconsistent to select the placement on the GEM object and then query 
whether it's mappable by address?
You made a comment stating this is racy, wouldn't querying on the GEM 
object prevent this?


Thanks,

-Lionel

On 27/04/2022 09:35, Lionel Landwerlin wrote:

Hi Matt,


The proposal looks good to me.

Looking forward to try it on drm-tip.


-Lionel

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 
+++

  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };
+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create 
behaviour, with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for 
the stuff that
+ * is immutable. Previously we would have two ioctls, one to create 
the object
+ * with gem_create, and another to apply various parameters, 
however this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated size for the object will be 
returned.

+ *
+ * Note that for some devices we have might have further minimum
+ * page-size restrictions(larger than 4K), like for device 
local-memory.
+ * However in general the final size here should always reflect 
any
+ * rounding up, if for example using the 
I915_GEM_CREATE_EXT_MEMORY_REGIONS

+ * extension to place the object in device local-memory.
+ */
+    __u64 size;
+    /**
+ * @handle: Returned handle for the object.
+ *
+ * Object handles are nonzero.
+ */
+    __u32 handle;
+    /**
+

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-04-27 Thread Lionel Landwerlin
One question though, how do we detect that this flag 
(I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS) is accepted on a given kernel?
I assume older kernels are going to reject object creation if we use 
this flag?


I didn't plan to use __drm_i915_query_vma_info, but isn't it 
inconsistent to select the placement on the GEM object and then query 
whether it's mappable by address?
You made a comment stating this is racy, wouldn't querying on the GEM 
object prevent this?


Thanks,

-Lionel

On 27/04/2022 09:35, Lionel Landwerlin wrote:

Hi Matt,


The proposal looks good to me.

Looking forward to try it on drm-tip.


-Lionel

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 +++
  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;
+    };
+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, 
with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for the 
stuff that
+ * is immutable. Previously we would have two ioctls, one to create 
the object
+ * with gem_create, and another to apply various parameters, however 
this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated size for the object will be 
returned.

+ *
+ * Note that for some devices we have might have further minimum
+ * page-size restrictions(larger than 4K), like for device 
local-memory.

+ * However in general the final size here should always reflect any
+ * rounding up, if for example using the 
I915_GEM_CREATE_EXT_MEMORY_REGIONS

+ * extension to place the object in device local-memory.
+ */
+    __u64 size;
+    /**
+ * @handle: Returned handle for the object.
+ *
+ * Object handles are nonzero.
+ */
+    __u32 handle;
+    /**
+ * @flags: Optional flags.
+ *
+ * Supported values:
+ *
+ * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the 
kernel that

+ * the object will need to be accessed via the CPU.
+ *
+ * Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and
+ * only strictly required on platforms where only some of the 
device
+ * memory is directly visible or mappable through the CPU, like 
on DG2+.

+ *
+ * One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to
+ * ensure we can always spill the allocation to system memory, 
if we

+ * can't place the object in the mappable part of
+ * I915_MEMORY_CLASS_DEVICE.
+ *
+ * Note that since the kernel only supports flat-CCS on objects 
that can

+ * *only* be 

Re: [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-04-27 Thread Lionel Landwerlin

Hi Matt,


The proposal looks good to me.

Looking forward to try it on drm-tip.


-Lionel

On 20/04/2022 20:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 +++
  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h
new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as known to the
+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS
+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+   /** @region: The class:instance pair encoding */
+   struct drm_i915_gem_memory_class_instance region;
+
+   /** @rsvd0: MBZ */
+   __u32 rsvd0;
+
+   /** @probed_size: Memory probed by the driver (-1 = unknown) */
+   __u64 probed_size;
+
+   /** @unallocated_size: Estimate of memory remaining (-1 = unknown) */
+   __u64 unallocated_size;
+
+   union {
+   /** @rsvd1: MBZ */
+   __u64 rsvd1[8];
+   struct {
+   /**
+* @probed_cpu_visible_size: Memory probed by the driver
+* that is CPU accessible. (-1 = unknown).
+*
+* This will be always be <= @probed_size, and the
+* remainder(if there is any) will not be CPU
+* accessible.
+*/
+   __u64 probed_cpu_visible_size;
+   };
+   };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, with added
+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for the stuff that
+ * is immutable. Previously we would have two ioctls, one to create the object
+ * with gem_create, and another to apply various parameters, however this
+ * creates some ambiguity for the params which are considered immutable. Also 
in
+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+   /**
+* @size: Requested size for the object.
+*
+* The (page-aligned) allocated size for the object will be returned.
+*
+* Note that for some devices we have might have further minimum
+* page-size restrictions(larger than 4K), like for device local-memory.
+* However in general the final size here should always reflect any
+* rounding up, if for example using the 
I915_GEM_CREATE_EXT_MEMORY_REGIONS
+* extension to place the object in device local-memory.
+*/
+   __u64 size;
+   /**
+* @handle: Returned handle for the object.
+*
+* Object handles are nonzero.
+*/
+   __u32 handle;
+   /**
+* @flags: Optional flags.
+*
+* Supported values:
+*
+* I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the kernel that
+* the object will need to be accessed via the CPU.
+*
+* Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and
+* only strictly required on platforms where only some of the device
+* memory is directly visible or mappable through the CPU, like on DG2+.
+*
+* One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to
+* ensure we can always spill the allocation to system memory, if we
+* can't place the object in the mappable part of
+* I915_MEMORY_CLASS_DEVICE.
+*
+* Note that since the kernel only supports flat-CCS on objects that can
+* *only* be placed in I915_MEMORY_CLASS_DEVICE, we therefore don't
+* support I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS together with
+* flat-CCS.
+*
+* Without this hint, the kernel will assume that non-mappable
+* I915_MEMORY_CLASS_DEVICE is preferred for this