Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2018-02-14 Thread Kenneth Graunke
On Saturday, February 3, 2018 1:58:58 PM PST Jordan Justen wrote:
> Signed-off-by: Jordan Justen 
> Reviewed-by: Timothy Arceri 
> ---
>  docs/relnotes/18.1.0.html  | 1 +
>  src/mesa/drivers/dri/i965/brw_disk_cache.c | 3 ---
>  2 files changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/docs/relnotes/18.1.0.html b/docs/relnotes/18.1.0.html
> index b8a0cd0d02c..0a5878ea41f 100644
> --- a/docs/relnotes/18.1.0.html
> +++ b/docs/relnotes/18.1.0.html
> @@ -46,6 +46,7 @@ Note: some of the new features are only available with 
> certain drivers.
>  
>  GL_EXT_semaphore on radeonsi
>  GL_EXT_semaphore_fd on radeonsi
> +Disk shader cache support for i965 enabled by default
>  
>  
>  Bug fixes
> diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
> b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> index f989456bcde..41f742e858f 100644
> --- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
> +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> @@ -407,9 +407,6 @@ void
>  brw_disk_cache_init(struct intel_screen *screen)
>  {
>  #ifdef ENABLE_SHADER_CACHE
> -   if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", true))
> -  return;
> -
> char renderer[10];
> MAYBE_UNUSED int len = snprintf(renderer, sizeof(renderer), "i965_%04x",
> screen->deviceID);
> 

Sounds like it's good to go...so...

Acked-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2018-02-05 Thread Jason Ekstrand
On Sat, Feb 3, 2018 at 2:58 PM, Jordan Justen 
wrote:

> On 2018-02-03 14:24:06, Jason Ekstrand wrote:
> > On February 3, 2018 13:59:40 Jordan Justen 
> wrote:
> >
> > > Signed-off-by: Jordan Justen 
> > > Reviewed-by: Timothy Arceri 
> > > ---
> > >  docs/relnotes/18.1.0.html  | 1 +
> > >  src/mesa/drivers/dri/i965/brw_disk_cache.c | 3 ---
> > >  2 files changed, 1 insertion(+), 3 deletions(-)
> > >
> > > diff --git a/docs/relnotes/18.1.0.html b/docs/relnotes/18.1.0.html
> > > index b8a0cd0d02c..0a5878ea41f 100644
> > > --- a/docs/relnotes/18.1.0.html
> > > +++ b/docs/relnotes/18.1.0.html
> > > @@ -46,6 +46,7 @@ Note: some of the new features are only available
> with
> > > certain drivers.
> > >  
> > >  GL_EXT_semaphore on radeonsi
> > >  GL_EXT_semaphore_fd on radeonsi
> > > +Disk shader cache support for i965 enabled by default
> > >  
> > >
> > >  Bug fixes
> > > diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > > b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > > index f989456bcde..41f742e858f 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > > @@ -407,9 +407,6 @@ void
> > >  brw_disk_cache_init(struct intel_screen *screen)
> > >  {
> > >  #ifdef ENABLE_SHADER_CACHE
> > > -   if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", true))
> > > -  return;
> >
> > Should we just flip the default so we still have the environment variable
> > to shut it off if we have problems?
>
> The disk_cache_create function (called later) also looks at the same
> variable, and it defaults to enabling the shader cache.
>
> That's the reason I chose to use this variable name, even though it
> has meant that we had to use a double negative
> (MESA_GLSL_CACHE_DISABLE=0) to allow the i965 disk shader cache to be
> enabled.
>

Fair enough.  I figured it was something like that.  I just wanted to
double-check. :-)

--Jason



> -Jordan
>
> >
> > > -
> > > char renderer[10];
> > > MAYBE_UNUSED int len = snprintf(renderer, sizeof(renderer),
> "i965_%04x",
> > > screen->deviceID);
> > > --
> > > 2.15.1
> > >
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
> >
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2018-02-04 Thread Tapani Pälli

Reviewed-by: Tapani Pälli 

On 02/03/2018 11:58 PM, Jordan Justen wrote:

Signed-off-by: Jordan Justen 
Reviewed-by: Timothy Arceri 
---
  docs/relnotes/18.1.0.html  | 1 +
  src/mesa/drivers/dri/i965/brw_disk_cache.c | 3 ---
  2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/docs/relnotes/18.1.0.html b/docs/relnotes/18.1.0.html
index b8a0cd0d02c..0a5878ea41f 100644
--- a/docs/relnotes/18.1.0.html
+++ b/docs/relnotes/18.1.0.html
@@ -46,6 +46,7 @@ Note: some of the new features are only available with 
certain drivers.
  
  GL_EXT_semaphore on radeonsi
  GL_EXT_semaphore_fd on radeonsi
+Disk shader cache support for i965 enabled by default
  
  
  Bug fixes

diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
b/src/mesa/drivers/dri/i965/brw_disk_cache.c
index f989456bcde..41f742e858f 100644
--- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
+++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
@@ -407,9 +407,6 @@ void
  brw_disk_cache_init(struct intel_screen *screen)
  {
  #ifdef ENABLE_SHADER_CACHE
-   if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", true))
-  return;
-
 char renderer[10];
 MAYBE_UNUSED int len = snprintf(renderer, sizeof(renderer), "i965_%04x",
 screen->deviceID);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2018-02-03 Thread Jordan Justen
On 2018-02-03 14:24:06, Jason Ekstrand wrote:
> On February 3, 2018 13:59:40 Jordan Justen  wrote:
> 
> > Signed-off-by: Jordan Justen 
> > Reviewed-by: Timothy Arceri 
> > ---
> >  docs/relnotes/18.1.0.html  | 1 +
> >  src/mesa/drivers/dri/i965/brw_disk_cache.c | 3 ---
> >  2 files changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/docs/relnotes/18.1.0.html b/docs/relnotes/18.1.0.html
> > index b8a0cd0d02c..0a5878ea41f 100644
> > --- a/docs/relnotes/18.1.0.html
> > +++ b/docs/relnotes/18.1.0.html
> > @@ -46,6 +46,7 @@ Note: some of the new features are only available with 
> > certain drivers.
> >  
> >  GL_EXT_semaphore on radeonsi
> >  GL_EXT_semaphore_fd on radeonsi
> > +Disk shader cache support for i965 enabled by default
> >  
> >
> >  Bug fixes
> > diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
> > b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > index f989456bcde..41f742e858f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > @@ -407,9 +407,6 @@ void
> >  brw_disk_cache_init(struct intel_screen *screen)
> >  {
> >  #ifdef ENABLE_SHADER_CACHE
> > -   if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", true))
> > -  return;
> 
> Should we just flip the default so we still have the environment variable 
> to shut it off if we have problems?

The disk_cache_create function (called later) also looks at the same
variable, and it defaults to enabling the shader cache.

That's the reason I chose to use this variable name, even though it
has meant that we had to use a double negative
(MESA_GLSL_CACHE_DISABLE=0) to allow the i965 disk shader cache to be
enabled.

-Jordan

> 
> > -
> > char renderer[10];
> > MAYBE_UNUSED int len = snprintf(renderer, sizeof(renderer), "i965_%04x",
> > screen->deviceID);
> > --
> > 2.15.1
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2018-02-03 Thread Jason Ekstrand

On February 3, 2018 13:59:40 Jordan Justen  wrote:


Signed-off-by: Jordan Justen 
Reviewed-by: Timothy Arceri 
---
 docs/relnotes/18.1.0.html  | 1 +
 src/mesa/drivers/dri/i965/brw_disk_cache.c | 3 ---
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/docs/relnotes/18.1.0.html b/docs/relnotes/18.1.0.html
index b8a0cd0d02c..0a5878ea41f 100644
--- a/docs/relnotes/18.1.0.html
+++ b/docs/relnotes/18.1.0.html
@@ -46,6 +46,7 @@ Note: some of the new features are only available with 
certain drivers.

 
 GL_EXT_semaphore on radeonsi
 GL_EXT_semaphore_fd on radeonsi
+Disk shader cache support for i965 enabled by default
 

 Bug fixes
diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
b/src/mesa/drivers/dri/i965/brw_disk_cache.c

index f989456bcde..41f742e858f 100644
--- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
+++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
@@ -407,9 +407,6 @@ void
 brw_disk_cache_init(struct intel_screen *screen)
 {
 #ifdef ENABLE_SHADER_CACHE
-   if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", true))
-  return;


Should we just flip the default so we still have the environment variable 
to shut it off if we have problems?



-
char renderer[10];
MAYBE_UNUSED int len = snprintf(renderer, sizeof(renderer), "i965_%04x",
screen->deviceID);
--
2.15.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-07 Thread Kenneth Graunke
On Thursday, December 7, 2017 9:57:48 AM PST Matt Turner wrote:
[snip]
> But the entire API boils down to a comparatively small set of
> non-orthogonal state. The configuration of those nobs seems to me like
> the place things are most likely to break.

I do like Matt's idea of adding Piglit tests specifically for NOS items.
Even if we don't add tests for every item in a key, adding tests that
hit two variants of an item in the key would ensure that we're using the
right key, and not pulling the wrong one from the cache.

Yes, it's a problem with the existing in-memory cache, but it would also
be nice to have additional coverage of that.

Another idea I had was to extend the ARB_program_interface_query tasks
to have a mode (command line argument) that causes them to compile the
shader twice.  For example:

1. Compile the shader
2. Check program interface query properties
3. Delete the shader
4. Compile the shader again - which should be a cache hit
5. Check program interface query properties again

Most of these properties come from the IR and associated metadata.
That would be a way to verify that we get that right.  I think it would
have caught the VUE map bug relating to unify_interfaces() not getting
called when restoring from the cache.

It also sounds relatively easy on paper.

If we validated NOS, and validated PIQ, then I'd feel pretty good about
the cache working with only a single run.  The run-the-whole-suite-twice
idea is nice, but most people aren't going to do that on every single
patch (maybe daily?).  I'd like to have better coverage within a single
run.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-07 Thread Jordan Justen
On 2017-12-07 09:57:48, Matt Turner wrote:
> On Tue, Dec 5, 2017 at 3:40 PM, Jordan Justen  
> wrote:
> > On 2017-12-05 14:49:28, Mark Janes wrote:
> >> Jordan Justen  writes:
> >> > On 2017-12-05 09:13:11, Mark Janes wrote:
> >> >> Adding a dimension to the test matrix has high cost, especially when
> >> >> combined with other dimensions of the test matrix (does shader cache
> >> >> need to be tested for 32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).
> >> >
> >> > Are you saying this is too high cost to run per check-in? Do you need
> >> > to disable it for the health of CI? I think I proposed that daily, or
> >> > perhaps even weekly would be adequate.
> >>
> >> Certainly, the test time per line of shader cache code is massively
> >> higher than any other feature, even if you run it just once a month.
> >> Other features have tests that run in milliseconds, not 30min * 20
> >> machines.
> >
> > The scope of this feature is nearly the entire API. It is justified to
> > throw the entire GL suite of tests at it on a regular basis. The cost
> > of running this once per week ought to be reasonable.
> 
> But the entire API boils down to a comparatively small set of
> non-orthogonal state. The configuration of those nobs seems to me like
> the place things are most likely to break.
> 
> I think there's value in testing that we're hitting the cache, but if
> we're not it's not a functional regression. I'm more concerned about
> ensuring we don't have bugs that affect functionality and cause things
> to break.
> 
> The program key for fragment shaders looks like:
> 
> /** The program key for Fragment/Pixel Shaders. */
> struct brw_wm_prog_key {
>/* Some collection of BRW_WM_IZ_* */
>uint8_t iz_lookup;
>bool stats_wm:1;
>bool flat_shade:1;
>unsigned nr_color_regions:5;
>bool replicate_alpha:1;
>bool clamp_fragment_color:1;
>bool persample_interp:1;
>bool multisample_fbo:1;
>bool frag_coord_adds_sample_pos:1;
>enum brw_wm_aa_enable line_aa:2;
>bool high_quality_derivatives:1;
>bool force_dual_color_blend:1;
>bool coherent_fb_fetch:1;
> 
>uint64_t input_slots_valid;
>unsigned program_string_id;
>GLenum alpha_test_func;  /* < For Gen4/5 MRT alpha test */
>float alpha_test_ref;
> 
>struct brw_sampler_prog_key_data tex;
> };
> 
> and it's the most complex of the shader stages. Wouldn't you feel a
> lot safer if we just had a piglit test for each of those nobs that
> compiled a shader, then changed some non-orthogonal state, and then
> rendered with it, thus confirming that it didn't get the wrong shader
> program out of the cache?

What would that be testing? That the disk_cache returns the same
program if we give it the same hash? Shouldn't disk_cache unit tests
cover this?

The scope is also more than just covering the various GL states that
might change the i965 program keys. We also need one or more programs
that are actually affected by that key change. We also need to test
that glsl program serialization works. We also need to test that nir
serialization works.

How about maintainability? Once we develop these 50~100 tests, how do
we make sure we update them if we change the i965 program keys?

> I know I've run into cases numerous times where piglit doesn't
> actually test something, or only one seemingly unrelated test in all
> 50 thousand tickles a real bug in my code. I feel uncomfortable
> assuming that piglit's existing coverage is good enough.

This is a reasonable point. Piglit has coverage holes. The CTS has
coverage holes. dEQP has coverage holes. I think the union of all 3
probably still has coverage holes. (But hopefully a fairly small set.)

> In general, I'm still bothered by the idea of just running piglit
> twice. That's a very untargeted approach that doesn't fit well with
> our existing testing model, as Mark has described.

Are you saying that running all of our CI tests twice is a waste of
time? I think that the 'dumb run everything twice' plan actually gets
us much more coverage than we'll likely get if we try to write 50~100
tests focusing on the i965 program keys. So, I don't agree that it is
a waste of time.

I think that the 'dumb run everything twice' plan gets us almost
everything we need, and is more maintainable. If it misses something,
then I think means we don't have good test coverage of the GL feature,
and we should add that test to one of our 3 possible GL test suites.

If you are arguing that the 'dumb run everything twice' plan is
valuable, but inadequate. Fine. I concede that maybe we could try to
add more specific focused tests. I'm not sure that these tests will
actually be added anytime soon, which means that we don't have a plan
for getting the i965 shader cache enabled.

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-07 Thread Timothy Arceri

On 08/12/17 04:57, Matt Turner wrote:

On Tue, Dec 5, 2017 at 3:40 PM, Jordan Justen  wrote:

On 2017-12-05 14:49:28, Mark Janes wrote:

Jordan Justen  writes:


On 2017-12-05 09:13:11, Mark Janes wrote:

Jordan Justen  writes:


On 2017-11-08 17:26:47, Timothy Arceri wrote:

Reviewed-by: Timothy Arceri 

Mark may want to consider adding some of the once a day type CI runs for
this. For example running the test suite for two consecutive runs on the
same build so that the second run uses the shader cache and also a
second run the uses MESA_GLSL=cache_fb to force testing of the cache
fallback path.


Yeah. We discussed this previously, but I don't think it's been
implemented yet. My opinion is that it could perhaps be a weekly test.


This automation is implemented now. It found the issues reported in
https://bugs.freedesktop.org/show_bug.cgi?id=103988

My opinion is that this test strategy is inadequate.


Meaning you think we cannot enable i965 shader cache for the 18.0
release?


I haven't heard anyone express confidence that this feature is bug-free,
and I don't know on what basis that claim could be made.  I appreciate
that a lot of have manual testing has been done.  Do you feel confident
that the cache can be enabled for all mesa customers without disruption?


If we had enabled it two months ago, then yes, we would have worked
through the niche issues, or discovered the so-far-hidden major land
mine. At this point, it's getting risky. In a month, I will say no.


We are already ~1.5 months away from the next stable branch point. If
we want to enable this in 18.0, we should be using this time to see if
enabling the cache by default has some large unexpected side effect in
our graphics stack, or breaks real-world apps.


I agree.  We should encourage as many users as possible to enable the
shader cache in their environments.  At least one stable release should
be out with a working cache, where the feature can be enabled by those
who are eager to benefit from it.  I assume there will be a lot of them,
and they could flush out issues for everyone who has no idea what a
shader cache is.


Adding a dimension to the test matrix has high cost, especially when
combined with other dimensions of the test matrix (does shader cache
need to be tested for 32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).


Are you saying this is too high cost to run per check-in? Do you need
to disable it for the health of CI? I think I proposed that daily, or
perhaps even weekly would be adequate.


Certainly, the test time per line of shader cache code is massively
higher than any other feature, even if you run it just once a month.
Other features have tests that run in milliseconds, not 30min * 20
machines.


The scope of this feature is nearly the entire API. It is justified to
throw the entire GL suite of tests at it on a regular basis. The cost
of running this once per week ought to be reasonable.


But the entire API boils down to a comparatively small set of
non-orthogonal state. The configuration of those nobs seems to me like
the place things are most likely to break.

I think there's value in testing that we're hitting the cache, but if
we're not it's not a functional regression. I'm more concerned about
ensuring we don't have bugs that affect functionality and cause things
to break.

The program key for fragment shaders looks like:

/** The program key for Fragment/Pixel Shaders. */
struct brw_wm_prog_key {
/* Some collection of BRW_WM_IZ_* */
uint8_t iz_lookup;
bool stats_wm:1;
bool flat_shade:1;
unsigned nr_color_regions:5;
bool replicate_alpha:1;
bool clamp_fragment_color:1;
bool persample_interp:1;
bool multisample_fbo:1;
bool frag_coord_adds_sample_pos:1;
enum brw_wm_aa_enable line_aa:2;
bool high_quality_derivatives:1;
bool force_dual_color_blend:1;
bool coherent_fb_fetch:1;

uint64_t input_slots_valid;
unsigned program_string_id;
GLenum alpha_test_func;  /* < For Gen4/5 MRT alpha test */
float alpha_test_ref;

struct brw_sampler_prog_key_data tex;
};

and it's the most complex of the shader stages. Wouldn't you feel a
lot safer if we just had a piglit test for each of those nobs that
compiled a shader, then changed some non-orthogonal state, and then
rendered with it, thus confirming that it didn't get the wrong shader
program out of the cache?


Those are just the keys for i965s gen program. You also should be 
testing the serialization and deserialization of the GL state and the IR 
(in i965s case NIR). For NIR you could turn on an ENV var to do a 
serialization/deserialization for every shader, I think something like 
that is done for nir clone.




I know I've run into cases numerous times where piglit doesn't
actually test something, or only one seemingly unrelated test in all
50 thousand tickles a real bug in my code. I 

Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-07 Thread Matt Turner
On Tue, Dec 5, 2017 at 3:40 PM, Jordan Justen  wrote:
> On 2017-12-05 14:49:28, Mark Janes wrote:
>> Jordan Justen  writes:
>>
>> > On 2017-12-05 09:13:11, Mark Janes wrote:
>> >> Jordan Justen  writes:
>> >>
>> >> > On 2017-11-08 17:26:47, Timothy Arceri wrote:
>> >> >> Reviewed-by: Timothy Arceri 
>> >> >>
>> >> >> Mark may want to consider adding some of the once a day type CI runs 
>> >> >> for
>> >> >> this. For example running the test suite for two consecutive runs on 
>> >> >> the
>> >> >> same build so that the second run uses the shader cache and also a
>> >> >> second run the uses MESA_GLSL=cache_fb to force testing of the cache
>> >> >> fallback path.
>> >> >
>> >> > Yeah. We discussed this previously, but I don't think it's been
>> >> > implemented yet. My opinion is that it could perhaps be a weekly test.
>> >>
>> >> This automation is implemented now. It found the issues reported in
>> >> https://bugs.freedesktop.org/show_bug.cgi?id=103988
>> >>
>> >> My opinion is that this test strategy is inadequate.
>> >
>> > Meaning you think we cannot enable i965 shader cache for the 18.0
>> > release?
>>
>> I haven't heard anyone express confidence that this feature is bug-free,
>> and I don't know on what basis that claim could be made.  I appreciate
>> that a lot of have manual testing has been done.  Do you feel confident
>> that the cache can be enabled for all mesa customers without disruption?
>
> If we had enabled it two months ago, then yes, we would have worked
> through the niche issues, or discovered the so-far-hidden major land
> mine. At this point, it's getting risky. In a month, I will say no.
>
>> > We are already ~1.5 months away from the next stable branch point. If
>> > we want to enable this in 18.0, we should be using this time to see if
>> > enabling the cache by default has some large unexpected side effect in
>> > our graphics stack, or breaks real-world apps.
>>
>> I agree.  We should encourage as many users as possible to enable the
>> shader cache in their environments.  At least one stable release should
>> be out with a working cache, where the feature can be enabled by those
>> who are eager to benefit from it.  I assume there will be a lot of them,
>> and they could flush out issues for everyone who has no idea what a
>> shader cache is.
>>
>> >> Adding a dimension to the test matrix has high cost, especially when
>> >> combined with other dimensions of the test matrix (does shader cache
>> >> need to be tested for 32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).
>> >
>> > Are you saying this is too high cost to run per check-in? Do you need
>> > to disable it for the health of CI? I think I proposed that daily, or
>> > perhaps even weekly would be adequate.
>>
>> Certainly, the test time per line of shader cache code is massively
>> higher than any other feature, even if you run it just once a month.
>> Other features have tests that run in milliseconds, not 30min * 20
>> machines.
>
> The scope of this feature is nearly the entire API. It is justified to
> throw the entire GL suite of tests at it on a regular basis. The cost
> of running this once per week ought to be reasonable.

But the entire API boils down to a comparatively small set of
non-orthogonal state. The configuration of those nobs seems to me like
the place things are most likely to break.

I think there's value in testing that we're hitting the cache, but if
we're not it's not a functional regression. I'm more concerned about
ensuring we don't have bugs that affect functionality and cause things
to break.

The program key for fragment shaders looks like:

/** The program key for Fragment/Pixel Shaders. */
struct brw_wm_prog_key {
   /* Some collection of BRW_WM_IZ_* */
   uint8_t iz_lookup;
   bool stats_wm:1;
   bool flat_shade:1;
   unsigned nr_color_regions:5;
   bool replicate_alpha:1;
   bool clamp_fragment_color:1;
   bool persample_interp:1;
   bool multisample_fbo:1;
   bool frag_coord_adds_sample_pos:1;
   enum brw_wm_aa_enable line_aa:2;
   bool high_quality_derivatives:1;
   bool force_dual_color_blend:1;
   bool coherent_fb_fetch:1;

   uint64_t input_slots_valid;
   unsigned program_string_id;
   GLenum alpha_test_func;  /* < For Gen4/5 MRT alpha test */
   float alpha_test_ref;

   struct brw_sampler_prog_key_data tex;
};

and it's the most complex of the shader stages. Wouldn't you feel a
lot safer if we just had a piglit test for each of those nobs that
compiled a shader, then changed some non-orthogonal state, and then
rendered with it, thus confirming that it didn't get the wrong shader
program out of the cache?

I know I've run into cases numerous times where piglit doesn't
actually test something, or only one seemingly unrelated test in all
50 thousand tickles a real bug in my code. I feel uncomfortable
assuming that piglit's existing coverage is good enough.

In general, I'm 

Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-05 Thread Jordan Justen
On 2017-12-05 18:30:30, Mark Janes wrote:
> Timothy Arceri  writes:
> 
> > On 06/12/17 12:04, Mark Janes wrote:
> >> Jordan Justen  writes:
> >> 
> >>> On 2017-12-05 14:49:28, Mark Janes wrote:
>  Jordan Justen  writes:
> > It could be interesting to define a MESA extension to control or query
> > the shader cache. Today, the shader cache is a nearly 'invisible'
> > feature. There are a few env-vars, but I wouldn't call a public
> > interface.
> >
> > The downside of defining an extension is that it might constrain
> > changes to the shader cache implementation. Or, if the interface is
> > too complex, then it may be tough for some drivers to implement.
> 
>  Why is an extension necessary?  By comparison, GCC has no interface to
>  query the statistics for ccache.  A utility that can read the cache
>  files and report hits/misses would at least inform us that the cache is
>  functioning.  The utility would be useful in writing real unit tests for
>  the feature.
> >>>
> >>> If we don't have an extension, then how would this work? i965 unit
> >>> tests during the build? If we don't actually test the feature against
> >>> real hardware, I think we'll miss a whole class of potential issues.
> >> 
> >> Unit tests are good, and I know we already have some of those.  We'll
> >> have better data on the classes of issues are once we find more of
> >> them.  So far, it's just a few hardware-independent piglit tests.
> >> 
> >>> If we don't have an extension, then how can an external (piglit) test
> >>> hope to interact or probe the cache?
> >> 
> >> Piglit queries a utility or lib to make assertions about the cache?  For
> >> that matter, this whole cache verification could be implemented within
> >> piglit as a profile that executes tests twice with appropriate
> >> verification.
> >
> > That would mean manual work to add tests to a profile. The manual part 
> > makes it unlikely to ever happen, also trying to manually update a list 
> > to test every variation of shader is doomed to fail. Alternatively 
> > adding only tests that have found a problem in the past makes it not 
> > very useful.
> 
> Are there not some cpu tests that can be easily dropped from the
> profile?

Looking in tests/cpu.py, I guess the asmparsertests are probably not
affected. (At least today, and that's unlikely to change.)

For the GLSLParserTest tests, they do interact with the shader cache,
so they should still be run.

Uh, not sure if I mentioned it previously, but the shader cache has no
need to run any vulkan tests.

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-05 Thread Mark Janes
Timothy Arceri  writes:

> On 06/12/17 12:04, Mark Janes wrote:
>> Jordan Justen  writes:
>> 
>>> On 2017-12-05 14:49:28, Mark Janes wrote:
 Jordan Justen  writes:

> On 2017-12-05 09:13:11, Mark Janes wrote:
>> Jordan Justen  writes:
>>
>>> On 2017-11-08 17:26:47, Timothy Arceri wrote:
 Reviewed-by: Timothy Arceri 

 Mark may want to consider adding some of the once a day type CI runs 
 for
 this. For example running the test suite for two consecutive runs on 
 the
 same build so that the second run uses the shader cache and also a
 second run the uses MESA_GLSL=cache_fb to force testing of the cache
 fallback path.
>>>
>>> Yeah. We discussed this previously, but I don't think it's been
>>> implemented yet. My opinion is that it could perhaps be a weekly test.
>>
>> This automation is implemented now. It found the issues reported in
>> https://bugs.freedesktop.org/show_bug.cgi?id=103988
>>
>> My opinion is that this test strategy is inadequate.
>
> Meaning you think we cannot enable i965 shader cache for the 18.0
> release?

 I haven't heard anyone express confidence that this feature is bug-free,
 and I don't know on what basis that claim could be made.  I appreciate
 that a lot of have manual testing has been done.  Do you feel confident
 that the cache can be enabled for all mesa customers without disruption?
>>>
>>> If we had enabled it two months ago, then yes, we would have worked
>>> through the niche issues, or discovered the so-far-hidden major land
>>> mine. At this point, it's getting risky. In a month, I will say no.
>>>
> We are already ~1.5 months away from the next stable branch point. If
> we want to enable this in 18.0, we should be using this time to see if
> enabling the cache by default has some large unexpected side effect in
> our graphics stack, or breaks real-world apps.

 I agree.  We should encourage as many users as possible to enable the
 shader cache in their environments.  At least one stable release should
 be out with a working cache, where the feature can be enabled by those
 who are eager to benefit from it.  I assume there will be a lot of them,
 and they could flush out issues for everyone who has no idea what a
 shader cache is.

>> Adding a dimension to the test matrix has high cost, especially when
>> combined with other dimensions of the test matrix (does shader cache
>> need to be tested for 32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).
>
> Are you saying this is too high cost to run per check-in? Do you need
> to disable it for the health of CI? I think I proposed that daily, or
> perhaps even weekly would be adequate.

 Certainly, the test time per line of shader cache code is massively
 higher than any other feature, even if you run it just once a month.
 Other features have tests that run in milliseconds, not 30min * 20
 machines.
>>>
>>> The scope of this feature is nearly the entire API. It is justified to
>>> throw the entire GL suite of tests at it on a regular basis. The cost
>>> of running this once per week ought to be reasonable.
>>>
>>> Is the cost of running this per check-in too high for the system
>>> today? Or, is it that you think it is too high for the feature? I'm
>>> sympathetic to the former, and not so much to the later. :)
>>>
>>> By the way, enabling these in CI has been helpful in highlighting a
>>> few corner case issues. So, even if you don't like it, *Thank You* for
>>> enabling it. :)
>>>
 Beyond poor return on execution time, there is the simple fact that
 whoever is running the CI needs to manually look at shader cache results
 separately from the rest of the tests.  Unit testing is effective
 because coverage can be added at approximately zero marginal cost.

 3 years from now, will we still be looking at separate weekly shader
 cache test runs to make sure it's working?
>>>
>>> I actually think that long term this should become part of the main
>>> daily and weekly tests. (An idea you've already rejected.) In the long
>>> term it should be stable enough for that, and if it does ever regress,
>>> we'd what to see something sooner rather than later.
>>>
> These tests are already identifying some corner case issues. I'm not
> sure these would have impacted real-world apps yet, but I do think it
> is a good idea to run these tests regularly.
>
> You say this test strategy is inadequate. Perhaps. I do think it needs
> to be part of our test strategy. There is no way we are going to hit
> all the corners of the API better than running all of our tests with
> the cache enabled. Do you agree?

 

Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-05 Thread Timothy Arceri

On 06/12/17 12:04, Mark Janes wrote:

Jordan Justen  writes:


On 2017-12-05 14:49:28, Mark Janes wrote:

Jordan Justen  writes:


On 2017-12-05 09:13:11, Mark Janes wrote:

Jordan Justen  writes:


On 2017-11-08 17:26:47, Timothy Arceri wrote:

Reviewed-by: Timothy Arceri 

Mark may want to consider adding some of the once a day type CI runs for
this. For example running the test suite for two consecutive runs on the
same build so that the second run uses the shader cache and also a
second run the uses MESA_GLSL=cache_fb to force testing of the cache
fallback path.


Yeah. We discussed this previously, but I don't think it's been
implemented yet. My opinion is that it could perhaps be a weekly test.


This automation is implemented now. It found the issues reported in
https://bugs.freedesktop.org/show_bug.cgi?id=103988

My opinion is that this test strategy is inadequate.


Meaning you think we cannot enable i965 shader cache for the 18.0
release?


I haven't heard anyone express confidence that this feature is bug-free,
and I don't know on what basis that claim could be made.  I appreciate
that a lot of have manual testing has been done.  Do you feel confident
that the cache can be enabled for all mesa customers without disruption?


If we had enabled it two months ago, then yes, we would have worked
through the niche issues, or discovered the so-far-hidden major land
mine. At this point, it's getting risky. In a month, I will say no.


We are already ~1.5 months away from the next stable branch point. If
we want to enable this in 18.0, we should be using this time to see if
enabling the cache by default has some large unexpected side effect in
our graphics stack, or breaks real-world apps.


I agree.  We should encourage as many users as possible to enable the
shader cache in their environments.  At least one stable release should
be out with a working cache, where the feature can be enabled by those
who are eager to benefit from it.  I assume there will be a lot of them,
and they could flush out issues for everyone who has no idea what a
shader cache is.


Adding a dimension to the test matrix has high cost, especially when
combined with other dimensions of the test matrix (does shader cache
need to be tested for 32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).


Are you saying this is too high cost to run per check-in? Do you need
to disable it for the health of CI? I think I proposed that daily, or
perhaps even weekly would be adequate.


Certainly, the test time per line of shader cache code is massively
higher than any other feature, even if you run it just once a month.
Other features have tests that run in milliseconds, not 30min * 20
machines.


The scope of this feature is nearly the entire API. It is justified to
throw the entire GL suite of tests at it on a regular basis. The cost
of running this once per week ought to be reasonable.

Is the cost of running this per check-in too high for the system
today? Or, is it that you think it is too high for the feature? I'm
sympathetic to the former, and not so much to the later. :)

By the way, enabling these in CI has been helpful in highlighting a
few corner case issues. So, even if you don't like it, *Thank You* for
enabling it. :)


Beyond poor return on execution time, there is the simple fact that
whoever is running the CI needs to manually look at shader cache results
separately from the rest of the tests.  Unit testing is effective
because coverage can be added at approximately zero marginal cost.

3 years from now, will we still be looking at separate weekly shader
cache test runs to make sure it's working?


I actually think that long term this should become part of the main
daily and weekly tests. (An idea you've already rejected.) In the long
term it should be stable enough for that, and if it does ever regress,
we'd what to see something sooner rather than later.


These tests are already identifying some corner case issues. I'm not
sure these would have impacted real-world apps yet, but I do think it
is a good idea to run these tests regularly.

You say this test strategy is inadequate. Perhaps. I do think it needs
to be part of our test strategy. There is no way we are going to hit
all the corners of the API better than running all of our tests with
the cache enabled. Do you agree?


Tests should strive to cover the implementation, not the API.


I don't think any of our test suites operate this way. Very little of
piglit or deqp focus on i965.


Shader
cache is a unique feature, because it affects a large part of the API.
It doesn't necessarily follow that covering the API will cover the
feature, or that that is an effective test strategy.


As you say, the scope of the feature is nearly the entire API. You
want us to start building a test suite that assumes the current
implementation details of the i965 shader cache, and tests it?


Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-05 Thread Mark Janes
Jordan Justen  writes:

> On 2017-12-05 14:49:28, Mark Janes wrote:
>> Jordan Justen  writes:
>> 
>> > On 2017-12-05 09:13:11, Mark Janes wrote:
>> >> Jordan Justen  writes:
>> >> 
>> >> > On 2017-11-08 17:26:47, Timothy Arceri wrote:
>> >> >> Reviewed-by: Timothy Arceri 
>> >> >> 
>> >> >> Mark may want to consider adding some of the once a day type CI runs 
>> >> >> for 
>> >> >> this. For example running the test suite for two consecutive runs on 
>> >> >> the 
>> >> >> same build so that the second run uses the shader cache and also a 
>> >> >> second run the uses MESA_GLSL=cache_fb to force testing of the cache 
>> >> >> fallback path.
>> >> >
>> >> > Yeah. We discussed this previously, but I don't think it's been
>> >> > implemented yet. My opinion is that it could perhaps be a weekly test.
>> >> 
>> >> This automation is implemented now. It found the issues reported in
>> >> https://bugs.freedesktop.org/show_bug.cgi?id=103988
>> >> 
>> >> My opinion is that this test strategy is inadequate.
>> >
>> > Meaning you think we cannot enable i965 shader cache for the 18.0
>> > release?
>> 
>> I haven't heard anyone express confidence that this feature is bug-free,
>> and I don't know on what basis that claim could be made.  I appreciate
>> that a lot of have manual testing has been done.  Do you feel confident
>> that the cache can be enabled for all mesa customers without disruption?
>
> If we had enabled it two months ago, then yes, we would have worked
> through the niche issues, or discovered the so-far-hidden major land
> mine. At this point, it's getting risky. In a month, I will say no.
>
>> > We are already ~1.5 months away from the next stable branch point. If
>> > we want to enable this in 18.0, we should be using this time to see if
>> > enabling the cache by default has some large unexpected side effect in
>> > our graphics stack, or breaks real-world apps.
>> 
>> I agree.  We should encourage as many users as possible to enable the
>> shader cache in their environments.  At least one stable release should
>> be out with a working cache, where the feature can be enabled by those
>> who are eager to benefit from it.  I assume there will be a lot of them,
>> and they could flush out issues for everyone who has no idea what a
>> shader cache is.
>> 
>> >> Adding a dimension to the test matrix has high cost, especially when
>> >> combined with other dimensions of the test matrix (does shader cache
>> >> need to be tested for 32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).
>> >
>> > Are you saying this is too high cost to run per check-in? Do you need
>> > to disable it for the health of CI? I think I proposed that daily, or
>> > perhaps even weekly would be adequate.
>> 
>> Certainly, the test time per line of shader cache code is massively
>> higher than any other feature, even if you run it just once a month.
>> Other features have tests that run in milliseconds, not 30min * 20
>> machines.
>
> The scope of this feature is nearly the entire API. It is justified to
> throw the entire GL suite of tests at it on a regular basis. The cost
> of running this once per week ought to be reasonable.
>
> Is the cost of running this per check-in too high for the system
> today? Or, is it that you think it is too high for the feature? I'm
> sympathetic to the former, and not so much to the later. :)
>
> By the way, enabling these in CI has been helpful in highlighting a
> few corner case issues. So, even if you don't like it, *Thank You* for
> enabling it. :)
>
>> Beyond poor return on execution time, there is the simple fact that
>> whoever is running the CI needs to manually look at shader cache results
>> separately from the rest of the tests.  Unit testing is effective
>> because coverage can be added at approximately zero marginal cost.
>> 
>> 3 years from now, will we still be looking at separate weekly shader
>> cache test runs to make sure it's working?
>
> I actually think that long term this should become part of the main
> daily and weekly tests. (An idea you've already rejected.) In the long
> term it should be stable enough for that, and if it does ever regress,
> we'd what to see something sooner rather than later.
>
>> > These tests are already identifying some corner case issues. I'm not
>> > sure these would have impacted real-world apps yet, but I do think it
>> > is a good idea to run these tests regularly.
>> >
>> > You say this test strategy is inadequate. Perhaps. I do think it needs
>> > to be part of our test strategy. There is no way we are going to hit
>> > all the corners of the API better than running all of our tests with
>> > the cache enabled. Do you agree?
>> 
>> Tests should strive to cover the implementation, not the API.
>
> I don't think any of our test suites operate this way. Very little of
> piglit or deqp focus on i965.
>
>> Shader
>> cache is a unique feature, 

Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-05 Thread Jordan Justen
On 2017-12-05 14:49:28, Mark Janes wrote:
> Jordan Justen  writes:
> 
> > On 2017-12-05 09:13:11, Mark Janes wrote:
> >> Jordan Justen  writes:
> >> 
> >> > On 2017-11-08 17:26:47, Timothy Arceri wrote:
> >> >> Reviewed-by: Timothy Arceri 
> >> >> 
> >> >> Mark may want to consider adding some of the once a day type CI runs 
> >> >> for 
> >> >> this. For example running the test suite for two consecutive runs on 
> >> >> the 
> >> >> same build so that the second run uses the shader cache and also a 
> >> >> second run the uses MESA_GLSL=cache_fb to force testing of the cache 
> >> >> fallback path.
> >> >
> >> > Yeah. We discussed this previously, but I don't think it's been
> >> > implemented yet. My opinion is that it could perhaps be a weekly test.
> >> 
> >> This automation is implemented now. It found the issues reported in
> >> https://bugs.freedesktop.org/show_bug.cgi?id=103988
> >> 
> >> My opinion is that this test strategy is inadequate.
> >
> > Meaning you think we cannot enable i965 shader cache for the 18.0
> > release?
> 
> I haven't heard anyone express confidence that this feature is bug-free,
> and I don't know on what basis that claim could be made.  I appreciate
> that a lot of have manual testing has been done.  Do you feel confident
> that the cache can be enabled for all mesa customers without disruption?

If we had enabled it two months ago, then yes, we would have worked
through the niche issues, or discovered the so-far-hidden major land
mine. At this point, it's getting risky. In a month, I will say no.

> > We are already ~1.5 months away from the next stable branch point. If
> > we want to enable this in 18.0, we should be using this time to see if
> > enabling the cache by default has some large unexpected side effect in
> > our graphics stack, or breaks real-world apps.
> 
> I agree.  We should encourage as many users as possible to enable the
> shader cache in their environments.  At least one stable release should
> be out with a working cache, where the feature can be enabled by those
> who are eager to benefit from it.  I assume there will be a lot of them,
> and they could flush out issues for everyone who has no idea what a
> shader cache is.
> 
> >> Adding a dimension to the test matrix has high cost, especially when
> >> combined with other dimensions of the test matrix (does shader cache
> >> need to be tested for 32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).
> >
> > Are you saying this is too high cost to run per check-in? Do you need
> > to disable it for the health of CI? I think I proposed that daily, or
> > perhaps even weekly would be adequate.
> 
> Certainly, the test time per line of shader cache code is massively
> higher than any other feature, even if you run it just once a month.
> Other features have tests that run in milliseconds, not 30min * 20
> machines.

The scope of this feature is nearly the entire API. It is justified to
throw the entire GL suite of tests at it on a regular basis. The cost
of running this once per week ought to be reasonable.

Is the cost of running this per check-in too high for the system
today? Or, is it that you think it is too high for the feature? I'm
sympathetic to the former, and not so much to the later. :)

By the way, enabling these in CI has been helpful in highlighting a
few corner case issues. So, even if you don't like it, *Thank You* for
enabling it. :)

> Beyond poor return on execution time, there is the simple fact that
> whoever is running the CI needs to manually look at shader cache results
> separately from the rest of the tests.  Unit testing is effective
> because coverage can be added at approximately zero marginal cost.
> 
> 3 years from now, will we still be looking at separate weekly shader
> cache test runs to make sure it's working?

I actually think that long term this should become part of the main
daily and weekly tests. (An idea you've already rejected.) In the long
term it should be stable enough for that, and if it does ever regress,
we'd what to see something sooner rather than later.

> > These tests are already identifying some corner case issues. I'm not
> > sure these would have impacted real-world apps yet, but I do think it
> > is a good idea to run these tests regularly.
> >
> > You say this test strategy is inadequate. Perhaps. I do think it needs
> > to be part of our test strategy. There is no way we are going to hit
> > all the corners of the API better than running all of our tests with
> > the cache enabled. Do you agree?
> 
> Tests should strive to cover the implementation, not the API.

I don't think any of our test suites operate this way. Very little of
piglit or deqp focus on i965.

> Shader
> cache is a unique feature, because it affects a large part of the API.
> It doesn't necessarily follow that covering the API will cover the
> feature, or that that is an effective test strategy.

As you 

Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-05 Thread Timothy Arceri

On 06/12/17 09:49, Mark Janes wrote:

Jordan Justen  writes:


On 2017-12-05 09:13:11, Mark Janes wrote:

Jordan Justen  writes:


On 2017-11-08 17:26:47, Timothy Arceri wrote:

Reviewed-by: Timothy Arceri 

Mark may want to consider adding some of the once a day type CI runs for
this. For example running the test suite for two consecutive runs on the
same build so that the second run uses the shader cache and also a
second run the uses MESA_GLSL=cache_fb to force testing of the cache
fallback path.


Yeah. We discussed this previously, but I don't think it's been
implemented yet. My opinion is that it could perhaps be a weekly test.


This automation is implemented now. It found the issues reported in
https://bugs.freedesktop.org/show_bug.cgi?id=103988

My opinion is that this test strategy is inadequate.


Meaning you think we cannot enable i965 shader cache for the 18.0
release?


I haven't heard anyone express confidence that this feature is bug-free,
and I don't know on what basis that claim could be made.  I appreciate
that a lot of have manual testing has been done.  Do you feel confident
that the cache can be enabled for all mesa customers without disruption?



I would suggest enabling it now in master it can be switched off again 
before release. For what its worth besides 1 or 2 dev branch related 
issues we still haven't had any shader cache related regressions 
reported for radeonsi (or any of the other gallium drivers with shader 
cache) and its been enabled for a couple of releases now.




We are already ~1.5 months away from the next stable branch point. If
we want to enable this in 18.0, we should be using this time to see if
enabling the cache by default has some large unexpected side effect in
our graphics stack, or breaks real-world apps.


I agree.  We should encourage as many users as possible to enable the
shader cache in their environments.  At least one stable release should
be out with a working cache, where the feature can be enabled by those
who are eager to benefit from it.  I assume there will be a lot of them,
and they could flush out issues for everyone who has no idea what a
shader cache is.


Adding a dimension to the test matrix has high cost, especially when
combined with other dimensions of the test matrix (does shader cache
need to be tested for 32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).


Are you saying this is too high cost to run per check-in? Do you need
to disable it for the health of CI? I think I proposed that daily, or
perhaps even weekly would be adequate.


Certainly, the test time per line of shader cache code is massively
higher than any other feature, even if you run it just once a month.
Other features have tests that run in milliseconds, not 30min * 20
machines.

Beyond poor return on execution time, there is the simple fact that
whoever is running the CI needs to manually look at shader cache results
separately from the rest of the tests.  Unit testing is effective
because coverage can be added at approximately zero marginal cost.

3 years from now, will we still be looking at separate weekly shader
cache test runs to make sure it's working?


These tests are already identifying some corner case issues. I'm not
sure these would have impacted real-world apps yet, but I do think it
is a good idea to run these tests regularly.

You say this test strategy is inadequate. Perhaps. I do think it needs
to be part of our test strategy. There is no way we are going to hit
all the corners of the API better than running all of our tests with
the cache enabled. Do you agree?


Tests should strive to cover the implementation, not the API.  Shader
cache is a unique feature, because it affects a large part of the API.
It doesn't necessarily follow that covering the API will cover the
feature, or that that is an effective test strategy.

Knowing nothing of the implementation, I would look to similar projects
(eg: ccache) to see what test strategies they employ.  There are
probably issues they've encountered that we have not thought of.

Another approach would be to write real unit tests for bugs found during
the course of the shader cache implementation.  Those test cases would
help developers know that they broke a cache feature without having to
rely on a weekly CI run.


The problem is these tests would just be.

1. Compile shaders
2. Run program
3. Recompile all the same shaders from Step 1
4. Re-run the program in the same was as Step 2

Shader-cache is not like other features in that you need to exercise 
pretty much ever other feature (hence re-running piglit) in order to be 
confident you have full coverage.





Since 103988 is not hardware specific, and is not a regression, it's not
something that could only have been caught by CI.  I'm surprised to find
it at this point, considering piglit was used to validate the feature.

I'd be happy if there was at least a smoke test where 

Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-05 Thread Mark Janes
Jordan Justen  writes:

> On 2017-12-05 09:13:11, Mark Janes wrote:
>> Jordan Justen  writes:
>> 
>> > On 2017-11-08 17:26:47, Timothy Arceri wrote:
>> >> Reviewed-by: Timothy Arceri 
>> >> 
>> >> Mark may want to consider adding some of the once a day type CI runs for 
>> >> this. For example running the test suite for two consecutive runs on the 
>> >> same build so that the second run uses the shader cache and also a 
>> >> second run the uses MESA_GLSL=cache_fb to force testing of the cache 
>> >> fallback path.
>> >
>> > Yeah. We discussed this previously, but I don't think it's been
>> > implemented yet. My opinion is that it could perhaps be a weekly test.
>> 
>> This automation is implemented now. It found the issues reported in
>> https://bugs.freedesktop.org/show_bug.cgi?id=103988
>> 
>> My opinion is that this test strategy is inadequate.
>
> Meaning you think we cannot enable i965 shader cache for the 18.0
> release?

I haven't heard anyone express confidence that this feature is bug-free,
and I don't know on what basis that claim could be made.  I appreciate
that a lot of have manual testing has been done.  Do you feel confident
that the cache can be enabled for all mesa customers without disruption?

> We are already ~1.5 months away from the next stable branch point. If
> we want to enable this in 18.0, we should be using this time to see if
> enabling the cache by default has some large unexpected side effect in
> our graphics stack, or breaks real-world apps.

I agree.  We should encourage as many users as possible to enable the
shader cache in their environments.  At least one stable release should
be out with a working cache, where the feature can be enabled by those
who are eager to benefit from it.  I assume there will be a lot of them,
and they could flush out issues for everyone who has no idea what a
shader cache is.

>> Adding a dimension to the test matrix has high cost, especially when
>> combined with other dimensions of the test matrix (does shader cache
>> need to be tested for 32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).
>
> Are you saying this is too high cost to run per check-in? Do you need
> to disable it for the health of CI? I think I proposed that daily, or
> perhaps even weekly would be adequate.

Certainly, the test time per line of shader cache code is massively
higher than any other feature, even if you run it just once a month.
Other features have tests that run in milliseconds, not 30min * 20
machines.

Beyond poor return on execution time, there is the simple fact that
whoever is running the CI needs to manually look at shader cache results
separately from the rest of the tests.  Unit testing is effective
because coverage can be added at approximately zero marginal cost.

3 years from now, will we still be looking at separate weekly shader
cache test runs to make sure it's working?

> These tests are already identifying some corner case issues. I'm not
> sure these would have impacted real-world apps yet, but I do think it
> is a good idea to run these tests regularly.
>
> You say this test strategy is inadequate. Perhaps. I do think it needs
> to be part of our test strategy. There is no way we are going to hit
> all the corners of the API better than running all of our tests with
> the cache enabled. Do you agree?

Tests should strive to cover the implementation, not the API.  Shader
cache is a unique feature, because it affects a large part of the API.
It doesn't necessarily follow that covering the API will cover the
feature, or that that is an effective test strategy.

Knowing nothing of the implementation, I would look to similar projects
(eg: ccache) to see what test strategies they employ.  There are
probably issues they've encountered that we have not thought of.

Another approach would be to write real unit tests for bugs found during
the course of the shader cache implementation.  Those test cases would
help developers know that they broke a cache feature without having to
rely on a weekly CI run.

>> Since 103988 is not hardware specific, and is not a regression, it's not
>> something that could only have been caught by CI.  I'm surprised to find
>> it at this point, considering piglit was used to validate the feature.
>> 
>> I'd be happy if there was at least a smoke test where complex programs
>> are cached, with the intent of exercising the shader cache mechanism
>> directly.  It doesn't have to be exhaustive to be effective.  Seems like
>> a good idea to at least directly test the issue that was fixed in
>> 103988 tests.
>
> It could be interesting to define a MESA extension to control or query
> the shader cache. Today, the shader cache is a nearly 'invisible'
> feature. There are a few env-vars, but I wouldn't call a public
> interface.
>
> The downside of defining an extension is that it might constrain
> changes to the shader cache implementation. Or, if the 

Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-05 Thread Jordan Justen
On 2017-12-05 09:13:11, Mark Janes wrote:
> Jordan Justen  writes:
> 
> > On 2017-11-08 17:26:47, Timothy Arceri wrote:
> >> Reviewed-by: Timothy Arceri 
> >> 
> >> Mark may want to consider adding some of the once a day type CI runs for 
> >> this. For example running the test suite for two consecutive runs on the 
> >> same build so that the second run uses the shader cache and also a 
> >> second run the uses MESA_GLSL=cache_fb to force testing of the cache 
> >> fallback path.
> >
> > Yeah. We discussed this previously, but I don't think it's been
> > implemented yet. My opinion is that it could perhaps be a weekly test.
> 
> This automation is implemented now. It found the issues reported in
> https://bugs.freedesktop.org/show_bug.cgi?id=103988
> 
> My opinion is that this test strategy is inadequate.

Meaning you think we cannot enable i965 shader cache for the 18.0
release?

We are already ~1.5 months away from the next stable branch point. If
we want to enable this in 18.0, we should be using this time to see if
enabling the cache by default has some large unexpected side effect in
our graphics stack, or breaks real-world apps.

> Adding a dimension
> to the test matrix has high cost, especially when combined with other
> dimensions of the test matrix (does shader cache need to be tested for
> 32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).

Are you saying this is too high cost to run per check-in? Do you need
to disable it for the health of CI? I think I proposed that daily, or
perhaps even weekly would be adequate.

These tests are already identifying some corner case issues. I'm not
sure these would have impacted real-world apps yet, but I do think it
is a good idea to run these tests regularly.

You say this test strategy is inadequate. Perhaps. I do think it needs
to be part of our test strategy. There is no way we are going to hit
all the corners of the API better than running all of our tests with
the cache enabled. Do you agree?

> Since 103988 is not hardware specific, and is not a regression, it's not
> something that could only have been caught by CI.  I'm surprised to find
> it at this point, considering piglit was used to validate the feature.
> 
> I'd be happy if there was at least a smoke test where complex programs
> are cached, with the intent of exercising the shader cache mechanism
> directly.  It doesn't have to be exhaustive to be effective.  Seems like
> a good idea to at least directly test the issue that was fixed in
> 103988 tests.

It could be interesting to define a MESA extension to control or query
the shader cache. Today, the shader cache is a nearly 'invisible'
feature. There are a few env-vars, but I wouldn't call a public
interface.

The downside of defining an extension is that it might constrain
changes to the shader cache implementation. Or, if the interface is
too complex, then it may be tough for some drivers to implement.

But, what if we get around to defining and implementing this extenion,
(and a few piglit tests that makes use of it), by mid-January? It'll
still be pretty difficult to argue we should enable the shader cache
on i965 for 18.0, since we burned all the time to let it be tested on
master against real-world apps.

Also, while you say our current test strategy is inadequate, I do
think it is still far more comprehensive than anything we will
accomplish by defining a new extension and writing a few, or even a
few hundred tests against the new extension.

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-12-05 Thread Mark Janes
Jordan Justen  writes:

> On 2017-11-08 17:26:47, Timothy Arceri wrote:
>> Reviewed-by: Timothy Arceri 
>> 
>> Mark may want to consider adding some of the once a day type CI runs for 
>> this. For example running the test suite for two consecutive runs on the 
>> same build so that the second run uses the shader cache and also a 
>> second run the uses MESA_GLSL=cache_fb to force testing of the cache 
>> fallback path.
>
> Yeah. We discussed this previously, but I don't think it's been
> implemented yet. My opinion is that it could perhaps be a weekly test.

This automation is implemented now. It found the issues reported in
https://bugs.freedesktop.org/show_bug.cgi?id=103988

My opinion is that this test strategy is inadequate.  Adding a dimension
to the test matrix has high cost, especially when combined with other
dimensions of the test matrix (does shader cache need to be tested for
32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).

Since 103988 is not hardware specific, and is not a regression, it's not
something that could only have been caught by CI.  I'm surprised to find
it at this point, considering piglit was used to validate the feature.

I'd be happy if there was at least a smoke test where complex programs
are cached, with the intent of exercising the shader cache mechanism
directly.  It doesn't have to be exhaustive to be effective.  Seems like
a good idea to at least directly test the issue that was fixed in
103988 tests.

> We also discussed a nir serialization test, similar to our current nir
> clone daily test. I don't think this is implemented yet either.
>
> -Jordan
>
>> 
>> On 09/11/17 11:58, Jordan Justen wrote:
>> > f9d5a7add42af5a2e4410526d1480a08f41317ae along with
>> > a16dc04ad51c32e5c7d136e4dd6273d983385d3f appears to have fixed the one
>> > known regression with shader cache. (Deus Ex instability.)
>> > 
>> > We should enable the shader cache by default to stabilize it before
>> > the next major Mesa release.
>> > 
>> > Signed-off-by: Jordan Justen 
>> > ---
>> >   docs/relnotes/17.4.0.html  | 2 +-
>> >   src/mesa/drivers/dri/i965/brw_disk_cache.c | 3 ---
>> >   2 files changed, 1 insertion(+), 4 deletions(-)
>> > 
>> > diff --git a/docs/relnotes/17.4.0.html b/docs/relnotes/17.4.0.html
>> > index f81b5bd62d3..48dcd5cce38 100644
>> > --- a/docs/relnotes/17.4.0.html
>> > +++ b/docs/relnotes/17.4.0.html
>> > @@ -44,7 +44,7 @@ Note: some of the new features are only available with 
>> > certain drivers.
>> >   
>> >   
>> >   
>> > -Disk shader cache support for i965 when MESA_GLSL_CACHE_DISABLE 
>> > environment variable is set to "0" or "false"
>> > +Disk shader cache support for i965
>> >   
>> >   
>> >   Bug fixes
>> > diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
>> > b/src/mesa/drivers/dri/i965/brw_disk_cache.c
>> > index 853ea98af03..cd0524c5cbf 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
>> > @@ -420,9 +420,6 @@ void
>> >   brw_disk_cache_init(struct brw_context *brw)
>> >   {
>> >   #ifdef ENABLE_SHADER_CACHE
>> > -   if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", true))
>> > -  return;
>> > -
>> >  char renderer[10];
>> >  MAYBE_UNUSED int len = snprintf(renderer, sizeof(renderer), 
>> > "i965_%04x",
>> >  brw->screen->deviceID);
>> > 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-11-08 Thread Jordan Justen
On 2017-11-08 17:26:47, Timothy Arceri wrote:
> Reviewed-by: Timothy Arceri 
> 
> Mark may want to consider adding some of the once a day type CI runs for 
> this. For example running the test suite for two consecutive runs on the 
> same build so that the second run uses the shader cache and also a 
> second run the uses MESA_GLSL=cache_fb to force testing of the cache 
> fallback path.

Yeah. We discussed this previously, but I don't think it's been
implemented yet. My opinion is that it could perhaps be a weekly test.

We also discussed a nir serialization test, similar to our current nir
clone daily test. I don't think this is implemented yet either.

-Jordan

> 
> On 09/11/17 11:58, Jordan Justen wrote:
> > f9d5a7add42af5a2e4410526d1480a08f41317ae along with
> > a16dc04ad51c32e5c7d136e4dd6273d983385d3f appears to have fixed the one
> > known regression with shader cache. (Deus Ex instability.)
> > 
> > We should enable the shader cache by default to stabilize it before
> > the next major Mesa release.
> > 
> > Signed-off-by: Jordan Justen 
> > ---
> >   docs/relnotes/17.4.0.html  | 2 +-
> >   src/mesa/drivers/dri/i965/brw_disk_cache.c | 3 ---
> >   2 files changed, 1 insertion(+), 4 deletions(-)
> > 
> > diff --git a/docs/relnotes/17.4.0.html b/docs/relnotes/17.4.0.html
> > index f81b5bd62d3..48dcd5cce38 100644
> > --- a/docs/relnotes/17.4.0.html
> > +++ b/docs/relnotes/17.4.0.html
> > @@ -44,7 +44,7 @@ Note: some of the new features are only available with 
> > certain drivers.
> >   
> >   
> >   
> > -Disk shader cache support for i965 when MESA_GLSL_CACHE_DISABLE 
> > environment variable is set to "0" or "false"
> > +Disk shader cache support for i965
> >   
> >   
> >   Bug fixes
> > diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
> > b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > index 853ea98af03..cd0524c5cbf 100644
> > --- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > @@ -420,9 +420,6 @@ void
> >   brw_disk_cache_init(struct brw_context *brw)
> >   {
> >   #ifdef ENABLE_SHADER_CACHE
> > -   if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", true))
> > -  return;
> > -
> >  char renderer[10];
> >  MAYBE_UNUSED int len = snprintf(renderer, sizeof(renderer), 
> > "i965_%04x",
> >  brw->screen->deviceID);
> > 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Enable disk shader cache by default

2017-11-08 Thread Timothy Arceri

Reviewed-by: Timothy Arceri 

Mark may want to consider adding some of the once a day type CI runs for 
this. For example running the test suite for two consecutive runs on the 
same build so that the second run uses the shader cache and also a 
second run the uses MESA_GLSL=cache_fb to force testing of the cache 
fallback path.


On 09/11/17 11:58, Jordan Justen wrote:

f9d5a7add42af5a2e4410526d1480a08f41317ae along with
a16dc04ad51c32e5c7d136e4dd6273d983385d3f appears to have fixed the one
known regression with shader cache. (Deus Ex instability.)

We should enable the shader cache by default to stabilize it before
the next major Mesa release.

Signed-off-by: Jordan Justen 
---
  docs/relnotes/17.4.0.html  | 2 +-
  src/mesa/drivers/dri/i965/brw_disk_cache.c | 3 ---
  2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/docs/relnotes/17.4.0.html b/docs/relnotes/17.4.0.html
index f81b5bd62d3..48dcd5cce38 100644
--- a/docs/relnotes/17.4.0.html
+++ b/docs/relnotes/17.4.0.html
@@ -44,7 +44,7 @@ Note: some of the new features are only available with 
certain drivers.
  
  
  

-Disk shader cache support for i965 when MESA_GLSL_CACHE_DISABLE environment variable is set to 
"0" or "false"
+Disk shader cache support for i965
  
  
  Bug fixes

diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
b/src/mesa/drivers/dri/i965/brw_disk_cache.c
index 853ea98af03..cd0524c5cbf 100644
--- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
+++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
@@ -420,9 +420,6 @@ void
  brw_disk_cache_init(struct brw_context *brw)
  {
  #ifdef ENABLE_SHADER_CACHE
-   if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", true))
-  return;
-
 char renderer[10];
 MAYBE_UNUSED int len = snprintf(renderer, sizeof(renderer), "i965_%04x",
 brw->screen->deviceID);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev