On 06/12/17 12:04, Mark Janes wrote:
Jordan Justen <jordan.l.jus...@intel.com> writes:

On 2017-12-05 14:49:28, Mark Janes wrote:
Jordan Justen <jordan.l.jus...@intel.com> writes:

On 2017-12-05 09:13:11, Mark Janes wrote:
Jordan Justen <jordan.l.jus...@intel.com> writes:

On 2017-11-08 17:26:47, Timothy Arceri wrote:
Reviewed-by: Timothy Arceri <tarc...@itsqueeze.com>

Mark may want to consider adding some of the once a day type CI runs for
this. For example running the test suite for two consecutive runs on the
same build so that the second run uses the shader cache and also a
second run the uses MESA_GLSL=cache_fb to force testing of the cache
fallback path.

Yeah. We discussed this previously, but I don't think it's been
implemented yet. My opinion is that it could perhaps be a weekly test.

This automation is implemented now. It found the issues reported in
https://bugs.freedesktop.org/show_bug.cgi?id=103988

My opinion is that this test strategy is inadequate.

Meaning you think we cannot enable i965 shader cache for the 18.0
release?

I haven't heard anyone express confidence that this feature is bug-free,
and I don't know on what basis that claim could be made.  I appreciate
that a lot of have manual testing has been done.  Do you feel confident
that the cache can be enabled for all mesa customers without disruption?

If we had enabled it two months ago, then yes, we would have worked
through the niche issues, or discovered the so-far-hidden major land
mine. At this point, it's getting risky. In a month, I will say no.

We are already ~1.5 months away from the next stable branch point. If
we want to enable this in 18.0, we should be using this time to see if
enabling the cache by default has some large unexpected side effect in
our graphics stack, or breaks real-world apps.

I agree.  We should encourage as many users as possible to enable the
shader cache in their environments.  At least one stable release should
be out with a working cache, where the feature can be enabled by those
who are eager to benefit from it.  I assume there will be a lot of them,
and they could flush out issues for everyone who has no idea what a
shader cache is.

Adding a dimension to the test matrix has high cost, especially when
combined with other dimensions of the test matrix (does shader cache
need to be tested for 32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).

Are you saying this is too high cost to run per check-in? Do you need
to disable it for the health of CI? I think I proposed that daily, or
perhaps even weekly would be adequate.

Certainly, the test time per line of shader cache code is massively
higher than any other feature, even if you run it just once a month.
Other features have tests that run in milliseconds, not 30min * 20
machines.

The scope of this feature is nearly the entire API. It is justified to
throw the entire GL suite of tests at it on a regular basis. The cost
of running this once per week ought to be reasonable.

Is the cost of running this per check-in too high for the system
today? Or, is it that you think it is too high for the feature? I'm
sympathetic to the former, and not so much to the later. :)

By the way, enabling these in CI has been helpful in highlighting a
few corner case issues. So, even if you don't like it, *Thank You* for
enabling it. :)

Beyond poor return on execution time, there is the simple fact that
whoever is running the CI needs to manually look at shader cache results
separately from the rest of the tests.  Unit testing is effective
because coverage can be added at approximately zero marginal cost.

3 years from now, will we still be looking at separate weekly shader
cache test runs to make sure it's working?

I actually think that long term this should become part of the main
daily and weekly tests. (An idea you've already rejected.) In the long
term it should be stable enough for that, and if it does ever regress,
we'd what to see something sooner rather than later.

These tests are already identifying some corner case issues. I'm not
sure these would have impacted real-world apps yet, but I do think it
is a good idea to run these tests regularly.

You say this test strategy is inadequate. Perhaps. I do think it needs
to be part of our test strategy. There is no way we are going to hit
all the corners of the API better than running all of our tests with
the cache enabled. Do you agree?

Tests should strive to cover the implementation, not the API.

I don't think any of our test suites operate this way. Very little of
piglit or deqp focus on i965.

Shader
cache is a unique feature, because it affects a large part of the API.
It doesn't necessarily follow that covering the API will cover the
feature, or that that is an effective test strategy.

As you say, the scope of the feature is nearly the entire API. You
want us to start building a test suite that assumes the current
implementation details of the i965 shader cache, and tests it?

Regardless of the choice to focus only on i965, the scope will still
be enormous. We will never feasibly reach the coverage level that we
get from enabling the shader cache paths, and running our current
hundreds of thousands of test cases on it.

It could be interesting to define a MESA extension to control or query
the shader cache. Today, the shader cache is a nearly 'invisible'
feature. There are a few env-vars, but I wouldn't call a public
interface.

The downside of defining an extension is that it might constrain
changes to the shader cache implementation. Or, if the interface is
too complex, then it may be tough for some drivers to implement.

Why is an extension necessary?  By comparison, GCC has no interface to
query the statistics for ccache.  A utility that can read the cache
files and report hits/misses would at least inform us that the cache is
functioning.  The utility would be useful in writing real unit tests for
the feature.

If we don't have an extension, then how would this work? i965 unit
tests during the build? If we don't actually test the feature against
real hardware, I think we'll miss a whole class of potential issues.

Unit tests are good, and I know we already have some of those.  We'll
have better data on the classes of issues are once we find more of
them.  So far, it's just a few hardware-independent piglit tests.

If we don't have an extension, then how can an external (piglit) test
hope to interact or probe the cache?

Piglit queries a utility or lib to make assertions about the cache?  For
that matter, this whole cache verification could be implemented within
piglit as a profile that executes tests twice with appropriate
verification.

That would mean manual work to add tests to a profile. The manual part makes it unlikely to ever happen, also trying to manually update a list to test every variation of shader is doomed to fail. Alternatively adding only tests that have found a problem in the past makes it not very useful.


Then, at least, developers could test their patches without a CI.
Piglit could even skip a bunch of tests which don't impact the cache.

They already can, they just need to run piglit twice. If thats not easy enough we can add a profile that causes everything to be run twice.



I'm not actually in favor of defining and implementing an extension,
because I'm not sure about the ROI.

But, what if we get around to defining and implementing this extenion,
(and a few piglit tests that makes use of it), by mid-January? It'll
still be pretty difficult to argue we should enable the shader cache
on i965 for 18.0, since we burned all the time to let it be tested on
master against real-world apps.

Personally, I think it is already difficult to argue that the cache
should be on by default in 18.0.

Ok. I guess I'll check back with you sometime in 2018...

Also, while you say our current test strategy is inadequate, I do
think it is still far more comprehensive than anything we will
accomplish by defining a new extension and writing a few, or even a
few hundred tests against the new extension.

In the short term, we should do whatever is expedient to try to test the
feature as much as we can.

In the longer term, a weekly test is going to be costly and ineffective
at preventing regressions.  Tests need to be runnable for developers to
use them as part of their process.  If a test can only be run at great
expense in a weekly CI, then any fragility in the feature will be
discovered a week after a bug has been pushed to master.

Given the scope of the feature, I just don't think it is going to be
possible to get reasonable coverage on a developer's machine.

Once an issue has been identified, it has generally been easy to
reproduce locally.

-Jordan
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to