Re: [Intel-gfx] [PATCH i-g-t 02/11] tests/perf: add per context filtering test for gen8+

2017-08-22 Thread Chris Wilson
Quoting Lionel Landwerlin (2017-08-22 14:48:12)
> On 22/08/17 14:28, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2017-08-22 14:11:07)
> >> On 22/08/17 12:59, Chris Wilson wrote:
> >>> Quoting Lionel Landwerlin (2017-08-22 12:45:47)
>  On 08/08/17 15:21, Matthew Auld wrote:
> > On 4 August 2017 at 12:20, Lionel Landwerlin
> >  wrote:
> >> static void
> >> @@ -1336,6 +1504,66 @@ print_reports(uint32_t *oa_report0, uint32_t 
> >> *oa_report1, int fmt)
> >> }
> >>
> >> static void
> >> +print_report(uint32_t *report, int fmt)
> >> +{
> > I get an unused warning for this...
>  Useful for really precise debugging. Putting under ifdef
> >>> Does it interfere that much with normal testing, or you could dump extra
> >>> details on unexpected events? If it is useful at some point, you will be
> >>> wishing you had the details from CI. The beauty of igt_debug() (at least
> >>> when --debug is not used by defaul!) is that it does give us the
> >>> portmortem output of the last N lines (where N is ~256?) without
> >>> flooding ourselves with irrelevant messages.
> >>> -Chris
> >> The problem is that we get loads of reports.
> >> Most of the time you want to look at the deltas between them (which is
> >> what most igt_debug() are about), only occasionally the actual values
> >> (which is what this function dumps).
> > If you can't identify the right one to dump when you find the error, how
> > much is the cost of recording all the traces for a subtest and dumping
> > them compressed+encoded to the output? If we are only talking a few megs
> > of raw data and hope it compresses down to a few 100k, that's not too
> > unwieldy to include on stderr/whatever. All depends on how difficult it
> > will be to reproduce the eventual bug. Just my crystal ball is saying
> > that if you find print_report() useful, you will find it useful again in
> > the future where you may not even have access to the system.
> > -Chris
> >
> 
> My debugging experience has been the following :
> 
> - I have no idea what's wrong, put traces in different places and hope 
> to notice something fishy
> - There are now too many traces and taking too long to figure anything 
> from the logs
> 
> print_report is just a remaining utility function. I'm happy to drop it 
> from the patch if that's too contentious.

Oh definitely keep it, just thinking aloud about how hard it is to get
the right information from the CI farm, and that if you have already
found one particular bit of info useful my experience says to wire that
up and have it available for when the igt_assert fires.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH i-g-t 02/11] tests/perf: add per context filtering test for gen8+

2017-08-22 Thread Lionel Landwerlin

On 22/08/17 14:28, Chris Wilson wrote:

Quoting Lionel Landwerlin (2017-08-22 14:11:07)

On 22/08/17 12:59, Chris Wilson wrote:

Quoting Lionel Landwerlin (2017-08-22 12:45:47)

On 08/08/17 15:21, Matthew Auld wrote:

On 4 August 2017 at 12:20, Lionel Landwerlin
 wrote:

static void
@@ -1336,6 +1504,66 @@ print_reports(uint32_t *oa_report0, uint32_t 
*oa_report1, int fmt)
}

static void
+print_report(uint32_t *report, int fmt)
+{

I get an unused warning for this...

Useful for really precise debugging. Putting under ifdef

Does it interfere that much with normal testing, or you could dump extra
details on unexpected events? If it is useful at some point, you will be
wishing you had the details from CI. The beauty of igt_debug() (at least
when --debug is not used by defaul!) is that it does give us the
portmortem output of the last N lines (where N is ~256?) without
flooding ourselves with irrelevant messages.
-Chris

The problem is that we get loads of reports.
Most of the time you want to look at the deltas between them (which is
what most igt_debug() are about), only occasionally the actual values
(which is what this function dumps).

If you can't identify the right one to dump when you find the error, how
much is the cost of recording all the traces for a subtest and dumping
them compressed+encoded to the output? If we are only talking a few megs
of raw data and hope it compresses down to a few 100k, that's not too
unwieldy to include on stderr/whatever. All depends on how difficult it
will be to reproduce the eventual bug. Just my crystal ball is saying
that if you find print_report() useful, you will find it useful again in
the future where you may not even have access to the system.
-Chris



My debugging experience has been the following :

- I have no idea what's wrong, put traces in different places and hope 
to notice something fishy
- There are now too many traces and taking too long to figure anything 
from the logs


print_report is just a remaining utility function. I'm happy to drop it 
from the patch if that's too contentious.


-
Lionel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH i-g-t 02/11] tests/perf: add per context filtering test for gen8+

2017-08-22 Thread Chris Wilson
Quoting Lionel Landwerlin (2017-08-22 14:11:07)
> On 22/08/17 12:59, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2017-08-22 12:45:47)
> >> On 08/08/17 15:21, Matthew Auld wrote:
> >>> On 4 August 2017 at 12:20, Lionel Landwerlin
> >>>  wrote:
> static void
>  @@ -1336,6 +1504,66 @@ print_reports(uint32_t *oa_report0, uint32_t 
>  *oa_report1, int fmt)
> }
> 
> static void
>  +print_report(uint32_t *report, int fmt)
>  +{
> >>> I get an unused warning for this...
> >> Useful for really precise debugging. Putting under ifdef
> > Does it interfere that much with normal testing, or you could dump extra
> > details on unexpected events? If it is useful at some point, you will be
> > wishing you had the details from CI. The beauty of igt_debug() (at least
> > when --debug is not used by defaul!) is that it does give us the
> > portmortem output of the last N lines (where N is ~256?) without
> > flooding ourselves with irrelevant messages.
> > -Chris
> 
> The problem is that we get loads of reports.
> Most of the time you want to look at the deltas between them (which is 
> what most igt_debug() are about), only occasionally the actual values 
> (which is what this function dumps).

If you can't identify the right one to dump when you find the error, how
much is the cost of recording all the traces for a subtest and dumping
them compressed+encoded to the output? If we are only talking a few megs
of raw data and hope it compresses down to a few 100k, that's not too
unwieldy to include on stderr/whatever. All depends on how difficult it
will be to reproduce the eventual bug. Just my crystal ball is saying
that if you find print_report() useful, you will find it useful again in
the future where you may not even have access to the system.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH i-g-t 02/11] tests/perf: add per context filtering test for gen8+

2017-08-22 Thread Lionel Landwerlin

On 22/08/17 12:59, Chris Wilson wrote:

Quoting Lionel Landwerlin (2017-08-22 12:45:47)

On 08/08/17 15:21, Matthew Auld wrote:

On 4 August 2017 at 12:20, Lionel Landwerlin
 wrote:

   static void
@@ -1336,6 +1504,66 @@ print_reports(uint32_t *oa_report0, uint32_t 
*oa_report1, int fmt)
   }

   static void
+print_report(uint32_t *report, int fmt)
+{

I get an unused warning for this...

Useful for really precise debugging. Putting under ifdef

Does it interfere that much with normal testing, or you could dump extra
details on unexpected events? If it is useful at some point, you will be
wishing you had the details from CI. The beauty of igt_debug() (at least
when --debug is not used by defaul!) is that it does give us the
portmortem output of the last N lines (where N is ~256?) without
flooding ourselves with irrelevant messages.
-Chris


The problem is that we get loads of reports.
Most of the time you want to look at the deltas between them (which is 
what most igt_debug() are about), only occasionally the actual values 
(which is what this function dumps).


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH i-g-t 02/11] tests/perf: add per context filtering test for gen8+

2017-08-22 Thread Chris Wilson
Quoting Lionel Landwerlin (2017-08-22 12:45:47)
> On 08/08/17 15:21, Matthew Auld wrote:
> > On 4 August 2017 at 12:20, Lionel Landwerlin
> >  wrote:
> >>   static void
> >> @@ -1336,6 +1504,66 @@ print_reports(uint32_t *oa_report0, uint32_t 
> >> *oa_report1, int fmt)
> >>   }
> >>
> >>   static void
> >> +print_report(uint32_t *report, int fmt)
> >> +{
> > I get an unused warning for this...
> 
> Useful for really precise debugging. Putting under ifdef

Does it interfere that much with normal testing, or you could dump extra
details on unexpected events? If it is useful at some point, you will be
wishing you had the details from CI. The beauty of igt_debug() (at least
when --debug is not used by defaul!) is that it does give us the
portmortem output of the last N lines (where N is ~256?) without
flooding ourselves with irrelevant messages.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH i-g-t 02/11] tests/perf: add per context filtering test for gen8+

2017-08-22 Thread Lionel Landwerlin

On 08/08/17 15:21, Matthew Auld wrote:

On 4 August 2017 at 12:20, Lionel Landwerlin
 wrote:

From: Robert Bragg 

Signed-off-by: Robert Bragg 
Signed-off-by: Lionel Landwerlin 
---
  tests/perf.c | 806 ---
  1 file changed, 768 insertions(+), 38 deletions(-)

diff --git a/tests/perf.c b/tests/perf.c
index 26581ce4..279ff0c6 100644
--- a/tests/perf.c
+++ b/tests/perf.c
@@ -48,7 +48,9 @@ IGT_TEST_DESCRIPTION("Test the i915 perf metrics streaming 
interface");
  #define OAREPORT_REASON_MASK   0x3f
  #define OAREPORT_REASON_SHIFT  19
  #define OAREPORT_REASON_TIMER  (1<<0)
+#define OAREPORT_REASON_INTERNAL   (3<<1)
  #define OAREPORT_REASON_CTX_SWITCH (1<<3)
+#define OAREPORT_REASON_GO (1<<4)
  #define OAREPORT_REASON_CLK_RATIO  (1<<5)

  #define GFX_OP_PIPE_CONTROL ((3 << 29) | (3 << 27) | (2 << 24))
@@ -565,6 +567,22 @@ oa_exponent_to_ns(int exponent)
 return 10ULL * (2ULL << exponent) / timestamp_frequency;
  }

+static bool
+oa_report_ctx_is_valid(uint32_t *report)
+{
+   if (IS_HASWELL(devid)) {
+   return false; /* TODO */
+   } else if (IS_GEN8(devid)) {
+   return report[0] & (1ul << 25);
+   } else if (IS_GEN9(devid)) {
+   return report[0] & (1ul << 16);
+   }
+
+   /* Need to update this function for newer Gen. */
+   igt_assert(!"reached");
+}
+
+
  static void
  hsw_sanity_check_render_basic_reports(uint32_t *oa_report0, uint32_t 
*oa_report1,
   enum drm_i915_oa_format fmt)
@@ -669,6 +687,100 @@ gen8_40bit_a_delta(uint64_t value0, uint64_t value1)
 return value1 - value0;
  }

+static void
+accumulate_uint32(size_t offset,
+ uint32_t *report0,
+  uint32_t *report1,
+  uint64_t *delta)
+{
+   uint32_t value0 = *(uint32_t *)(((uint8_t *)report0) + offset);
+   uint32_t value1 = *(uint32_t *)(((uint8_t *)report1) + offset);
+
+   *delta += (uint32_t)(value1 - value0);
+}
+
+static void
+accumulate_uint40(int a_index,
+  uint32_t *report0,
+  uint32_t *report1,
+ enum drm_i915_oa_format format,
+  uint64_t *delta)
+{
+   uint64_t value0 = gen8_read_40bit_a_counter(report0, format, a_index),
+value1 = gen8_read_40bit_a_counter(report1, format, a_index);
+
+   *delta += gen8_40bit_a_delta(value0, value1);
+}
+
+static void
+accumulate_reports(struct accumulator *accumulator,
+  uint32_t *start,
+  uint32_t *end)
+{
+   enum drm_i915_oa_format format = accumulator->format;
+   uint64_t *deltas = accumulator->deltas;
+   int idx = 0;
+
+   if (intel_gen(devid) >= 8) {
+   /* timestamp */
+   accumulate_uint32(4, start, end, deltas + idx++);
+
+   /* clock cycles */
+   accumulate_uint32(12, start, end, deltas + idx++);
+   } else {
+   /* timestamp */
+   accumulate_uint32(4, start, end, deltas + idx++);
+   }
+
+   for (int i = 0; i < oa_formats[format].n_a40; i++)
+   accumulate_uint40(i, start, end, format, deltas + idx++);
+
+   for (int i = 0; i < oa_formats[format].n_a; i++) {
+   accumulate_uint32(oa_formats[format].a_off + 4 * i,
+ start, end, deltas + idx++);
+   }
+
+   for (int i = 0; i < oa_formats[format].n_b; i++) {
+   accumulate_uint32(oa_formats[format].b_off + 4 * i,
+ start, end, deltas + idx++);
+   }
+
+   for (int i = 0; i < oa_formats[format].n_c; i++) {
+   accumulate_uint32(oa_formats[format].c_off + 4 * i,
+ start, end, deltas + idx++);
+   }
+}
+
+static void
+accumulator_print(struct accumulator *accumulator, const char *title)
+{
+   enum drm_i915_oa_format format = accumulator->format;
+   uint64_t *deltas = accumulator->deltas;
+   int idx = 0;
+
+   igt_debug("%s:\n", title);
+   if (intel_gen(devid) >= 8) {
+   igt_debug("\ttime delta = %lu\n", deltas[idx++]);
+   igt_debug("\tclock cycle delta = %lu\n", deltas[idx++]);
+
+   for (int i = 0; i < oa_formats[format].n_a40; i++)
+   igt_debug("\tA%u = %lu\n", i, deltas[idx++]);
+   } else {
+   igt_debug("\ttime delta = %lu\n", deltas[idx++]);
+   }
+
+   for (int i = 0; i < oa_formats[format].n_a; i++) {
+   int a_id = oa_formats[format].first_a + i;
+   igt_debug("\tA%u = %lu\n", a_id, deltas[idx++]);
+   }
+
+   for (int i = 0; i < oa_formats[format].n_a; i++)
+   igt_debug("\tB%u = %lu\n", i, 

Re: [Intel-gfx] [PATCH i-g-t 02/11] tests/perf: add per context filtering test for gen8+

2017-08-08 Thread Matthew Auld
On 4 August 2017 at 12:20, Lionel Landwerlin
 wrote:
> From: Robert Bragg 
>
> Signed-off-by: Robert Bragg 
> Signed-off-by: Lionel Landwerlin 
> ---
>  tests/perf.c | 806 
> ---
>  1 file changed, 768 insertions(+), 38 deletions(-)
>
> diff --git a/tests/perf.c b/tests/perf.c
> index 26581ce4..279ff0c6 100644
> --- a/tests/perf.c
> +++ b/tests/perf.c
> @@ -48,7 +48,9 @@ IGT_TEST_DESCRIPTION("Test the i915 perf metrics streaming 
> interface");
>  #define OAREPORT_REASON_MASK   0x3f
>  #define OAREPORT_REASON_SHIFT  19
>  #define OAREPORT_REASON_TIMER  (1<<0)
> +#define OAREPORT_REASON_INTERNAL   (3<<1)
>  #define OAREPORT_REASON_CTX_SWITCH (1<<3)
> +#define OAREPORT_REASON_GO (1<<4)
>  #define OAREPORT_REASON_CLK_RATIO  (1<<5)
>
>  #define GFX_OP_PIPE_CONTROL ((3 << 29) | (3 << 27) | (2 << 24))
> @@ -565,6 +567,22 @@ oa_exponent_to_ns(int exponent)
> return 10ULL * (2ULL << exponent) / timestamp_frequency;
>  }
>
> +static bool
> +oa_report_ctx_is_valid(uint32_t *report)
> +{
> +   if (IS_HASWELL(devid)) {
> +   return false; /* TODO */
> +   } else if (IS_GEN8(devid)) {
> +   return report[0] & (1ul << 25);
> +   } else if (IS_GEN9(devid)) {
> +   return report[0] & (1ul << 16);
> +   }
> +
> +   /* Need to update this function for newer Gen. */
> +   igt_assert(!"reached");
> +}
> +
> +
>  static void
>  hsw_sanity_check_render_basic_reports(uint32_t *oa_report0, uint32_t 
> *oa_report1,
>   enum drm_i915_oa_format fmt)
> @@ -669,6 +687,100 @@ gen8_40bit_a_delta(uint64_t value0, uint64_t value1)
> return value1 - value0;
>  }
>
> +static void
> +accumulate_uint32(size_t offset,
> + uint32_t *report0,
> +  uint32_t *report1,
> +  uint64_t *delta)
> +{
> +   uint32_t value0 = *(uint32_t *)(((uint8_t *)report0) + offset);
> +   uint32_t value1 = *(uint32_t *)(((uint8_t *)report1) + offset);
> +
> +   *delta += (uint32_t)(value1 - value0);
> +}
> +
> +static void
> +accumulate_uint40(int a_index,
> +  uint32_t *report0,
> +  uint32_t *report1,
> + enum drm_i915_oa_format format,
> +  uint64_t *delta)
> +{
> +   uint64_t value0 = gen8_read_40bit_a_counter(report0, format, a_index),
> +value1 = gen8_read_40bit_a_counter(report1, format, a_index);
> +
> +   *delta += gen8_40bit_a_delta(value0, value1);
> +}
> +
> +static void
> +accumulate_reports(struct accumulator *accumulator,
> +  uint32_t *start,
> +  uint32_t *end)
> +{
> +   enum drm_i915_oa_format format = accumulator->format;
> +   uint64_t *deltas = accumulator->deltas;
> +   int idx = 0;
> +
> +   if (intel_gen(devid) >= 8) {
> +   /* timestamp */
> +   accumulate_uint32(4, start, end, deltas + idx++);
> +
> +   /* clock cycles */
> +   accumulate_uint32(12, start, end, deltas + idx++);
> +   } else {
> +   /* timestamp */
> +   accumulate_uint32(4, start, end, deltas + idx++);
> +   }
> +
> +   for (int i = 0; i < oa_formats[format].n_a40; i++)
> +   accumulate_uint40(i, start, end, format, deltas + idx++);
> +
> +   for (int i = 0; i < oa_formats[format].n_a; i++) {
> +   accumulate_uint32(oa_formats[format].a_off + 4 * i,
> + start, end, deltas + idx++);
> +   }
> +
> +   for (int i = 0; i < oa_formats[format].n_b; i++) {
> +   accumulate_uint32(oa_formats[format].b_off + 4 * i,
> + start, end, deltas + idx++);
> +   }
> +
> +   for (int i = 0; i < oa_formats[format].n_c; i++) {
> +   accumulate_uint32(oa_formats[format].c_off + 4 * i,
> + start, end, deltas + idx++);
> +   }
> +}
> +
> +static void
> +accumulator_print(struct accumulator *accumulator, const char *title)
> +{
> +   enum drm_i915_oa_format format = accumulator->format;
> +   uint64_t *deltas = accumulator->deltas;
> +   int idx = 0;
> +
> +   igt_debug("%s:\n", title);
> +   if (intel_gen(devid) >= 8) {
> +   igt_debug("\ttime delta = %lu\n", deltas[idx++]);
> +   igt_debug("\tclock cycle delta = %lu\n", deltas[idx++]);
> +
> +   for (int i = 0; i < oa_formats[format].n_a40; i++)
> +   igt_debug("\tA%u = %lu\n", i, deltas[idx++]);
> +   } else {
> +   igt_debug("\ttime delta = %lu\n", deltas[idx++]);
> +   }
> +
> +   for (int i = 0; i < oa_formats[format].n_a; i++) {
> +   int a_id =