Re: [Intel-gfx] [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-04-28 Thread Tvrtko Ursulin



On 28/04/2022 11:25, Matthew Auld wrote:

On 28/04/2022 09:55, Tvrtko Ursulin wrote:


On 27/04/2022 18:36, Matthew Auld wrote:

On 27/04/2022 09:36, Tvrtko Ursulin wrote:


On 20/04/2022 18:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 
+++

  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;


Would unallocated_cpu_visible_size be useful, to follow the total 
unallocated_size?


Make sense. But I don't think unallocated_size has actually been 
properly wired up yet. It still just gives the same value as 
probed_size. IIRC for unallocated_size we still need a real 
user/usecase/umd, before wiring that up for real with the existing 
avail tracking. Once we have that we can also add 
unallocated_cpu_visible_size.


So this does nothing at the moment:

  info.unallocated_size = mr->avail;

Right, it is set to "mem->avail = mem->total;" at region init time and 
I indeed can't find it ever getting modified. Okay.


Btw, have we ever considered whether unallocated_size should require 
CAP_SYS_ADMIN/PERFMON or something?


Note sure. But just in case we do add it for real at some point, why 
the added restriction?


To avoid a side channel, albeit perhaps a very weak one. For engine 
utilization we require CAP_SYS_PERFMON, but that is implied by the 
perf core API. It's open for discussion. I guess it may make sense to 
limit it also because it is questionable the field(s) are even useful.







+    };
+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create 
behaviour, with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for 
the stuff that
+ * is immutable. Previously we would have two ioctls, one to 
create the object
+ * with gem_create, and another to apply various parameters, 
however this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated size for the object will be 
returned.

+ *
+ * Note that for some devices we have might have further minimum
+ * page-size restrictions(larger than 4K), likefor device 
local-memory.
+ * However in general the final size here should always 
reflect any
+ * rounding up, if for example using the 
I915_GEM_CREATE_EXT_MEMORY_REGIONS

+ * extension to place the object in device local-memory.
+ */
+    __u64 size;
+    /**
+ * @handle: Returned handle for the object.
+ *
+ * Object handles are nonzero.
+ */
+    __u32 handle;
+    /**
+ * @flags: Optional flags.
+ *
+ * Supported values:
+ *
+ * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - 

Re: [Intel-gfx] [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-04-28 Thread Matthew Auld

On 28/04/2022 09:55, Tvrtko Ursulin wrote:


On 27/04/2022 18:36, Matthew Auld wrote:

On 27/04/2022 09:36, Tvrtko Ursulin wrote:


On 20/04/2022 18:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 
+++

  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;


Would unallocated_cpu_visible_size be useful, to follow the total 
unallocated_size?


Make sense. But I don't think unallocated_size has actually been 
properly wired up yet. It still just gives the same value as 
probed_size. IIRC for unallocated_size we still need a real 
user/usecase/umd, before wiring that up for real with the existing 
avail tracking. Once we have that we can also add 
unallocated_cpu_visible_size.


So this does nothing at the moment:

  info.unallocated_size = mr->avail;

Right, it is set to "mem->avail = mem->total;" at region init time and I 
indeed can't find it ever getting modified. Okay.


Btw, have we ever considered whether unallocated_size should require 
CAP_SYS_ADMIN/PERFMON or something?


Note sure. But just in case we do add it for real at some point, why 
the added restriction?


To avoid a side channel, albeit perhaps a very weak one. For engine 
utilization we require CAP_SYS_PERFMON, but that is implied by the perf 
core API. It's open for discussion. I guess it may make sense to limit 
it also because it is questionable the field(s) are even useful.







+    };
+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create 
behaviour, with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for 
the stuff that
+ * is immutable. Previously we would have two ioctls, one to create 
the object
+ * with gem_create, and another to apply various parameters, 
however this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated size for the object will be 
returned.

+ *
+ * Note that for some devices we have might have further minimum
+ * page-size restrictions(larger than 4K), likefor device 
local-memory.
+ * However in general the final size here should always reflect 
any
+ * rounding up, if for example using the 
I915_GEM_CREATE_EXT_MEMORY_REGIONS

+ * extension to place the object in device local-memory.
+ */
+    __u64 size;
+    /**
+ * @handle: Returned handle for the object.
+ *
+ * Object handles are nonzero.
+ */
+    __u32 handle;
+    /**
+ * @flags: Optional flags.
+ *
+ * Supported values:
+ *
+ * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the 
kernel that

+ * the object 

Re: [Intel-gfx] [PATCH v2] drm/doc: add rfc section for small BAR uapi

2022-04-28 Thread Tvrtko Ursulin



On 27/04/2022 18:36, Matthew Auld wrote:

On 27/04/2022 09:36, Tvrtko Ursulin wrote:


On 20/04/2022 18:13, Matthew Auld wrote:

Add an entry for the new uapi needed for small BAR on DG2+.

v2:
   - Some spelling fixes and other small tweaks. (Akeem & Thomas)
   - Rework error capture interactions, including no longer needing
 NEEDS_CPU_ACCESS for objects marked for capture. (Thomas)
   - Add probed_cpu_visible_size. (Lionel)

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Lionel Landwerlin 
Cc: Jon Bloomfield 
Cc: Daniel Vetter 
Cc: Jordan Justen 
Cc: Kenneth Graunke 
Cc: Akeem G Abodunrin 
Cc: mesa-dev@lists.freedesktop.org
---
  Documentation/gpu/rfc/i915_small_bar.h   | 190 +++
  Documentation/gpu/rfc/i915_small_bar.rst |  58 +++
  Documentation/gpu/rfc/index.rst  |   4 +
  3 files changed, 252 insertions(+)
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.h
  create mode 100644 Documentation/gpu/rfc/i915_small_bar.rst

diff --git a/Documentation/gpu/rfc/i915_small_bar.h 
b/Documentation/gpu/rfc/i915_small_bar.h

new file mode 100644
index ..7bfd0cf44d35
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_small_bar.h
@@ -0,0 +1,190 @@
+/**
+ * struct __drm_i915_memory_region_info - Describes one region as 
known to the

+ * driver.
+ *
+ * Note this is using both struct drm_i915_query_item and struct 
drm_i915_query.
+ * For this new query we are adding the new query id 
DRM_I915_QUERY_MEMORY_REGIONS

+ * at _i915_query_item.query_id.
+ */
+struct __drm_i915_memory_region_info {
+    /** @region: The class:instance pair encoding */
+    struct drm_i915_gem_memory_class_instance region;
+
+    /** @rsvd0: MBZ */
+    __u32 rsvd0;
+
+    /** @probed_size: Memory probed by the driver (-1 = unknown) */
+    __u64 probed_size;
+
+    /** @unallocated_size: Estimate of memory remaining (-1 = 
unknown) */

+    __u64 unallocated_size;
+
+    union {
+    /** @rsvd1: MBZ */
+    __u64 rsvd1[8];
+    struct {
+    /**
+ * @probed_cpu_visible_size: Memory probed by the driver
+ * that is CPU accessible. (-1 = unknown).
+ *
+ * This will be always be <= @probed_size, and the
+ * remainder(if there is any) will not be CPU
+ * accessible.
+ */
+    __u64 probed_cpu_visible_size;


Would unallocated_cpu_visible_size be useful, to follow the total 
unallocated_size?


Make sense. But I don't think unallocated_size has actually been 
properly wired up yet. It still just gives the same value as 
probed_size. IIRC for unallocated_size we still need a real 
user/usecase/umd, before wiring that up for real with the existing avail 
tracking. Once we have that we can also add unallocated_cpu_visible_size.


So this does nothing at the moment:

 info.unallocated_size = mr->avail;

Right, it is set to "mem->avail = mem->total;" at region init time and I 
indeed can't find it ever getting modified. Okay.


Btw, have we ever considered whether unallocated_size should require 
CAP_SYS_ADMIN/PERFMON or something?


Note sure. But just in case we do add it for real at some point, why the 
added restriction?


To avoid a side channel, albeit perhaps a very weak one. For engine 
utilization we require CAP_SYS_PERFMON, but that is implied by the perf 
core API. It's open for discussion. I guess it may make sense to limit 
it also because it is questionable the field(s) are even useful.







+    };
+    };
+};
+
+/**
+ * struct __drm_i915_gem_create_ext - Existing gem_create behaviour, 
with added

+ * extension support using struct i915_user_extension.
+ *
+ * Note that new buffer flags should be added here, at least for the 
stuff that
+ * is immutable. Previously we would have two ioctls, one to create 
the object
+ * with gem_create, and another to apply various parameters, however 
this
+ * creates some ambiguity for the params which are considered 
immutable. Also in

+ * general we're phasing out the various SET/GET ioctls.
+ */
+struct __drm_i915_gem_create_ext {
+    /**
+ * @size: Requested size for the object.
+ *
+ * The (page-aligned) allocated size for the object will be 
returned.

+ *
+ * Note that for some devices we have might have further minimum
+ * page-size restrictions(larger than 4K), likefor device 
local-memory.

+ * However in general the final size here should always reflect any
+ * rounding up, if for example using the 
I915_GEM_CREATE_EXT_MEMORY_REGIONS

+ * extension to place the object in device local-memory.
+ */
+    __u64 size;
+    /**
+ * @handle: Returned handle for the object.
+ *
+ * Object handles are nonzero.
+ */
+    __u32 handle;
+    /**
+ * @flags: Optional flags.
+ *
+ * Supported values:
+ *
+ * I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS - Signal to the 
kernel that

+ * the object will need to be accessed via the CPU.
+ *
+  

Re: Mesa 20.0 backlog

2022-04-28 Thread Lionel Landwerlin

On 22/04/2022 07:56, Dylan Baker wrote:

Hi all,

I've spent a good deal of time this week crushing the backlog of
patches on the mesa 20.0 series before making the release today. As such
there are not only a dozen outstanding patches, mostly for zink, which I
couldn't figure out how to correctly backport.

I've touched base with Lionel about the anv patches, but the remaning
patches I'd appreciate guidance on what you'd like to do with them.

2022-03-14 FIXES  d5870c45ae panfrost: Optimise recalculation of max sampler 
view
2022-03-24 FIXES  f348103fce anv: fix dynamic state emission
2022-03-24 FIXES  a4f502de32 anv: fix VK_DYNAMIC_STATE_COLOR_WRITE_ENABLE_EXT 
state
2022-03-24 FIXES  1d250b7b95 anv: fix color write enable interaction with color 
mask



Those 3 fixes are unfortunately bringing perf regressions. So I think we 
should drop them from the list.


Thanks,

-Lionel



2022-03-30 CC 4eca6e3e5d lavapipe: fix xfb availability query copying
2022-04-05 CC 3dcb80da9d zink: fix barrier generation for ssbo descriptors
2022-04-11 FIXES  dd078d13cb zink: fix tessellation shader key matching.
2022-04-13 FIXES  bbdf22ce13 radv: Fix barriers with cp dma
2022-04-18 CC 8806f444a5 zink: fix extended restart prim types without 
dynamic state2
2022-04-21 CC 373c8001d6 zink: set VK_QUERY_RESULT_WAIT_BIT when copying to 
qbo
2022-04-21 CC a056cbc691 zink: fix synchronization when drawing from 
streamout
2022-04-21 CC fc5edf9b68 zink: fix xfb counter buffer barriers
2022-04-21 CC e509598470 zink: remove xfb_barrier flag

Cheers,
Dylan