Thanks for the reviews.

My setup is not as complicated as yours, mostly because my main focus has been llvmpipe, and we're not actively adding support for new OpenGL extensions to it at the moment, so most new piglit tests either skip or pass. New failures are relatively rare.

Before this, I used https://wiki.jenkins-ci.org/display/JENKINS/Email-ext+plugin -- it allows great flexibility in controlling if/when emails are sent. In particular, it can send emails for new regressions but not for tests that were already failing.

Still, nothing beats being able to look at a bunch of test jobs and immediately tell that all blue == all good. Although I have been using Jenkins for many years now, this is a lesson I only learned recently -- it's better to mask out expected failures somehow and get a boolean "all pass" or "fail" for the whole test suite than to try to track pass/fail for individual tests. The latter just doesn't scale.
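
For what it's worth, the core of that masking is tiny. A minimal sketch, assuming a plain-text expectations file (one test name per line) and a dict of test name -> result status -- both the file name and the stand-in data below are hypothetical, not piglit's actual config format:

    import sys

    def load_expected_failures(path):
        # One test name per line; lines starting with '#' are comments.
        try:
            with open(path) as f:
                return {line.strip() for line in f
                        if line.strip() and not line.startswith('#')}
        except IOError:
            return set()

    def all_good(results, expected_failures):
        # True only if every non-pass/skip result was already expected to fail.
        unexpected = [name for name, status in results.items()
                      if status not in ('pass', 'skip')
                      and name not in expected_failures]
        for name in sorted(unexpected):
            print('unexpected failure: %s' % name)
        return not unexpected

    if __name__ == '__main__':
        results = {'spec/foo/bar': 'pass', 'spec/foo/baz': 'fail'}  # stand-in data
        ok = all_good(results, load_expected_failures('expected-failures.txt'))
        sys.exit(0 if ok else 1)

The point being that only the boolean exit status feeds back into Jenkins, while the list of known failures stays in a file under version control.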



We do have internal branches where we run piglit. I confess I don't have a good solution for them yet. Your trick of maintaining a database of which git commit each test was added in is quite neat. Another thing worth considering would be to branch or tag piglit whenever Mesa is branched, and keep using a matching (and unchanging) piglit commit.
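
For illustration, here is roughly how that "which commit added this test" lookup could be scripted with plain git commands -- the repository paths and the branch-point ref below are placeholders, not a description of your actual setup:

    import subprocess

    def commit_that_added(repo, path):
        # SHA of the commit that first added `path` to the repository.
        out = subprocess.check_output(
            ['git', '-C', repo, 'log', '--diff-filter=A',
             '--format=%H', '-1', '--', path])
        return out.strip().decode() or None

    def added_after(repo, sha, branch_point):
        # True if `sha` is not reachable from the branch point,
        # i.e. the test did not exist when the release branch was cut.
        rc = subprocess.call(['git', '-C', repo, 'merge-base',
                              '--is-ancestor', sha, branch_point])
        return rc != 0

    # Hypothetical usage:
    #   sha = commit_that_added('/path/to/piglit', 'tests/spec/foo/bar.c')
    #   if sha and added_after('/path/to/piglit', sha, 'release-branchpoint'):
    #       pass  # not a regression on the release branch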



We also run test suites through different APIs (namely D3D9/10). Those test suites rarely get updated, and llvmpipe conformance is actually quite good to start with, so it's easy to get "all pass" there.


Piglit, by being continuously updated and extended, is indeed more of a challenge than other test suites.


We also use piglit for testing our OpenGL guest driver, but we use an internal testing infrastructure to drive it, not Jenkins, so our experiences there don't apply.


I also have a few benchmarks on Jenkins. Again, I only keep track of performance metrics via the Jenkins Plots and Measurements plugins, and I don't produce a pass/fail based on those metrics. I am however considering doing something of the sort -- e.g., getting the history of the metrics via the Jenkins JSON API, fitting it to a probability distribution, and failing when performance drops below a given percentile.
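
Something like the following is what I have in mind -- the job URL is made up, and fetch_metric() is a placeholder for however the measurement ends up being archived per build; only the builds listing uses the standard Jenkins JSON API:

    import json
    import sys
    import urllib.request  # urllib2 on Python 2

    JOB_URL = 'https://jenkins.example.com/job/llvmpipe-benchmarks'

    def recent_build_numbers(limit=30):
        # Standard Jenkins JSON API: list the recorded builds for a job.
        url = JOB_URL + '/api/json?tree=builds[number]'
        data = json.loads(urllib.request.urlopen(url).read().decode())
        return [b['number'] for b in data['builds']][:limit]

    def fetch_metric(build_number):
        # Placeholder: return the measurement (e.g. frames/sec) recorded
        # for the given build, however it is actually stored.
        raise NotImplementedError

    def regressed(current, history, percentile=0.05):
        # Fail if `current` falls below the empirical percentile of past runs.
        threshold = sorted(history)[int(len(history) * percentile)]
        return current < threshold

    # Usage sketch:
    #   history = [fetch_metric(n) for n in recent_build_numbers()]
    #   if regressed(fetch_metric(this_build), history):
    #       sys.exit(1)  # mark the Jenkins build as failed

A proper fit to a probability distribution could replace the crude empirical percentile, but even this would turn the plots into something actionable.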


Jose


On 03/03/15 18:34, Mark Janes wrote:
Thanks Jose! This is an improvement.

In my experience, broken tests are introduced and fixed in mesa on a
daily basis.  This has a few consequences:

  - On a daily basis, I look at failures and update the expected
    pass/fails depending on whether it is a new test or a regression.
    Much of this process is automated.

  - Branches quickly diverge on the basis of passing/failing tests.
    Having separate pass/fail configs on release branches is
    unmanageable.  To account for this, my automation records the
    relevant commit sha as the value in the config file (the key is the
    test name).  I post-process the junit xml to filter out test failures
    with commits that occurred after the branch point (a rough sketch of
    this step follows the list).

  - For platforms that are too slow to build each checkin, I run an
    automated bisect which builds/tests in jenkins, then updates config
    files.

  - Our platform matrix generates over 350k unskipped tests for each
    build.  We filter out skipped tests because of the memory Jenkins
    consumes when displaying this many tests.
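
A simplified sketch of that filtering step might look like the following -- the JSON config format and the branch-point check are illustrative assumptions, not the actual implementation:

    import json
    import xml.etree.ElementTree as ET

    def newer_than_branch_point(sha):
        # Placeholder: in practice, ask git whether `sha` is reachable from
        # the branch point (e.g. `git merge-base --is-ancestor`).
        return False

    def filter_junit(xml_path, config_path, out_path):
        # The config maps test name -> sha of the commit that introduced it.
        with open(config_path) as f:
            test_to_sha = json.load(f)
        tree = ET.parse(xml_path)
        for case in tree.iter('testcase'):
            failure = case.find('failure')
            sha = test_to_sha.get(case.get('name'))
            if failure is not None and sha and newer_than_branch_point(sha):
                # Downgrade the failure to a skip so the release branch stays green.
                case.remove(failure)
                ET.SubElement(case, 'skipped',
                              message='test introduced after the branch point')
        tree.write(out_path)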

I am interested in learning more about your test system, and sharing
lessons learned / techniques.

-Mark

Reviewed-by: Mark Janes <[email protected]>

Jose Fonseca <[email protected]> writes:

I recently tried the junit backend's ability to ignore expected
failures/crashes and found it a godsend -- instead of having to look at
test graph results periodically, I can just tell Jenkins to email me
when things go south.

The only drawback is that by reporting the expected issues as passing it
makes it too easy to forget about them and misinterpret the pass-rates.
So this change modifies the junit backend to report the expected issues
as skipped, making it more obvious when looking at the test graphs that
these tests are not really passing, and that whatever functionality they
target is not being fully covered.

This change also makes use of the junit `message` attribute to explain
the reason for the skip.  (In fact, we could consider using the `message`
attribute on other kinds of failures to convey the piglit result, instead
of using the non-standard `type`.)
---
  framework/backends/junit.py | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/framework/backends/junit.py b/framework/backends/junit.py
index 82f9c29..53b6086 100644
--- a/framework/backends/junit.py
+++ b/framework/backends/junit.py
@@ -129,17 +129,19 @@ class JUnitBackend(FileBackend):
             # Add relevant result value, if the result is pass then it doesn't
             # need one of these statuses
             if data['result'] == 'skip':
-                etree.SubElement(element, 'skipped')
+                res = etree.SubElement(element, 'skipped')
 
             elif data['result'] in ['warn', 'fail', 'dmesg-warn', 'dmesg-fail']:
                 if expected_result == "failure":
                     err.text += "\n\nWARN: passing test as an expected failure"
+                    res = etree.SubElement(element, 'skipped', message='expected failure')
                 else:
                     res = etree.SubElement(element, 'failure')
 
             elif data['result'] == 'crash':
                 if expected_result == "error":
                     err.text += "\n\nWARN: passing test as an expected crash"
+                    res = etree.SubElement(element, 'skipped', message='expected crash')
                 else:
                     res = etree.SubElement(element, 'error')

--
2.1.0

_______________________________________________
Piglit mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/piglit
