Re: [Intel-gfx] [PATCH v14 6/6] drm/i915: expose rcs topology through query uAPI

2018-03-05 Thread Joonas Lahtinen
Quoting Lionel Landwerlin (2018-02-22 19:53:59)
> With the introduction of asymmetric slices in CNL, we cannot rely on
> the previous SUBSLICE_MASK getparam to tell userspace what subslices
> are available. Here we introduce a more detailed way of querying the
> Gen's GPU topology that doesn't aggregate numbers.
> 
> This is essential for monitoring parts of the GPU with the OA unit,
> because counters need to be normalized to the number of
> EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
> not gives us sufficient information.
> 
> As a bonus we can draw representations of the GPU :
> 
> https://imgur.com/a/vuqpa
> 
> v2: Rename uapi struct s/_mask/_info/ (Tvrtko)
> Report max_slice/subslice/eus_per_subslice rather than strides (Tvrtko)
> Add uapi macros to read data from *_info structs (Tvrtko)
> 
> v3: Use !!(v & DRM_I915_BIT()) for uapi macros instead of custom shifts 
> (Tvrtko)
> 
> v4: factorize query item writting (Tvrtko)
> tweak uapi struct/define names (Tvrtko)
> 
> v5: Replace ALIGN() macro (Chris)
> 
> v6: Updated uapi comments (Tvrtko)
> Moved flags != 0 checks into vfuncs (Tvrtko)
> 
> v7: Use access_ok() before copying anything, to avoid overflows (Chris)
> Switch BUG_ON() to GEM_WARN_ON() (Tvrtko)
> 
> v8: Tweak uapi comments style to match the coding style (Lionel)
> 
> v9: Fix error in comment about computation of enabled subslice (Tvrtko)
> 
> v10: Fix/update comments in uAPI (Sagar)
> 
> v11: Drop drm_i915_query_(slice|subslice|eu)_info in favor of a single
>  drm_i915_query_topology_info (Joonas)
> 
> v12: Add subslice_stride/eu_stride in drm_i915_query_topology_info (Joonas)
> 
> Signed-off-by: Lionel Landwerlin 
> Acked-by: Chris Wilson 



> +++ b/drivers/gpu/drm/i915/i915_query.c
> @@ -7,8 +7,90 @@
>  #include "i915_drv.h"
>  #include 
>  
> +static int query_topology_info(struct drm_i915_private *dev_priv,
> +  struct drm_i915_query_item *query_item)
> +{
> +   const struct sseu_dev_info *sseu = _INFO(dev_priv)->sseu;
> +   struct drm_i915_query_topology_info topo_info;

Just like sseu is just sseu, topo_info could be topo...

Same goes for query, there's really no confusion within a function.

> +   u32 slice_length, subslice_length, eu_length, total_length;
> +
> +   if (query_item->flags != 0)
> +   return -EINVAL;
> +
> +   if (sseu->max_slices == 0)
> +   return -ENODEV;
> +
> +   /*
> +* If we ever change the internal slice mask data type, we'll need to
> +* update this function.
> +*/

That's pretty self evident from next line, so comment can be thrown away.

> +   BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));



> +/*
> + * Data written by the kernel with query DRM_I915_QUERY_TOPOLOGY_INFO :
> + *
> + * data: contains the 3 pieces of information :
> + *
> + * - the slice mask with one bit per slice telling whether a slice is
> + *   available. The availability of slice X can be queried with the following
> + *   formula :
> + *
> + *   (data[X / 8] >> (X % 8)) & 1
> + *
> + * - the subslice mask for each slice with one bit per subslice telling
> + *   whether a subslice is available. The availability of subslice Y in slice
> + *   X can be queried with the following formula :
> + *
> + *   (data[subslice_offset +
> + * X * subslice_stride +
> + * Y / 8] >> (Y % 8)) & 1
> + *
> + * - the EU mask for each subslice in each slice with one bit per EU telling
> + *   whether an EU is available. The availability of EU Z in subslice Y in
> + *   slice X can be queried with the following formula :
> + *
> + *   (data[eu_offset +
> + * (X * max_subslices Y) * eu_stride +

s/Y/+ Y/

> + * Z / 8] >> (Z % 8)) & 1
> + */
> +struct drm_i915_query_topology_info {
> +   /*
> +* Unused for now.

+ "Must be cleared to zero."

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v14 6/6] drm/i915: expose rcs topology through query uAPI

2018-02-22 Thread Lionel Landwerlin
With the introduction of asymmetric slices in CNL, we cannot rely on
the previous SUBSLICE_MASK getparam to tell userspace what subslices
are available. Here we introduce a more detailed way of querying the
Gen's GPU topology that doesn't aggregate numbers.

This is essential for monitoring parts of the GPU with the OA unit,
because counters need to be normalized to the number of
EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
not gives us sufficient information.

As a bonus we can draw representations of the GPU :

https://imgur.com/a/vuqpa

v2: Rename uapi struct s/_mask/_info/ (Tvrtko)
Report max_slice/subslice/eus_per_subslice rather than strides (Tvrtko)
Add uapi macros to read data from *_info structs (Tvrtko)

v3: Use !!(v & DRM_I915_BIT()) for uapi macros instead of custom shifts (Tvrtko)

v4: factorize query item writting (Tvrtko)
tweak uapi struct/define names (Tvrtko)

v5: Replace ALIGN() macro (Chris)

v6: Updated uapi comments (Tvrtko)
Moved flags != 0 checks into vfuncs (Tvrtko)

v7: Use access_ok() before copying anything, to avoid overflows (Chris)
Switch BUG_ON() to GEM_WARN_ON() (Tvrtko)

v8: Tweak uapi comments style to match the coding style (Lionel)

v9: Fix error in comment about computation of enabled subslice (Tvrtko)

v10: Fix/update comments in uAPI (Sagar)

v11: Drop drm_i915_query_(slice|subslice|eu)_info in favor of a single
 drm_i915_query_topology_info (Joonas)

v12: Add subslice_stride/eu_stride in drm_i915_query_topology_info (Joonas)

Signed-off-by: Lionel Landwerlin 
Acked-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_query.c | 82 +++
 include/uapi/drm/i915_drm.h   | 62 +
 2 files changed, 144 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_query.c 
b/drivers/gpu/drm/i915/i915_query.c
index c23bfc9be617..2f11fabb6aff 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -7,8 +7,90 @@
 #include "i915_drv.h"
 #include 
 
+static int query_topology_info(struct drm_i915_private *dev_priv,
+  struct drm_i915_query_item *query_item)
+{
+   const struct sseu_dev_info *sseu = _INFO(dev_priv)->sseu;
+   struct drm_i915_query_topology_info topo_info;
+   u32 slice_length, subslice_length, eu_length, total_length;
+
+   if (query_item->flags != 0)
+   return -EINVAL;
+
+   if (sseu->max_slices == 0)
+   return -ENODEV;
+
+   /*
+* If we ever change the internal slice mask data type, we'll need to
+* update this function.
+*/
+   BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask));
+
+   slice_length = sizeof(sseu->slice_mask);
+   subslice_length = sseu->max_slices *
+   DIV_ROUND_UP(sseu->max_subslices,
+sizeof(sseu->subslice_mask[0]) * BITS_PER_BYTE);
+   eu_length = sseu->max_slices * sseu->max_subslices *
+   DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE);
+
+   total_length = sizeof(topo_info) + slice_length + subslice_length + 
eu_length;
+
+   if (query_item->length == 0)
+   return total_length;
+
+   if (query_item->length < total_length)
+   return -EINVAL;
+
+   if (copy_from_user(_info, u64_to_user_ptr(query_item->data_ptr),
+  sizeof(topo_info)))
+   return -EFAULT;
+
+   if (topo_info.flags != 0)
+   return -EINVAL;
+
+   if (!access_ok(VERIFY_WRITE, u64_to_user_ptr(query_item->data_ptr),
+  total_length))
+   return -EFAULT;
+
+   memset(_info, 0, sizeof(topo_info));
+   topo_info.max_slices = sseu->max_slices;
+   topo_info.max_subslices = sseu->max_subslices;
+   topo_info.max_eus_per_subslice = sseu->max_eus_per_subslice;
+
+   topo_info.subslice_offset = slice_length;
+   topo_info.subslice_stride = DIV_ROUND_UP(sseu->max_subslices,
+BITS_PER_BYTE);
+   topo_info.eu_offset = slice_length + subslice_length;
+   topo_info.eu_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice,
+  BITS_PER_BYTE);
+
+   if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr),
+  _info, sizeof(topo_info)))
+   return -EFAULT;
+
+   if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr +
+  sizeof(topo_info)),
+  >slice_mask, slice_length))
+   return -EFAULT;
+
+   if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr +
+  sizeof(topo_info) +
+  slice_length),
+  sseu->subslice_mask, subslice_length))
+