Re: [Mesa-dev] Mesa CI with trace regression testing

2019-09-26 Thread Eric Anholt
Alexandros Frantzis writes:

> Hi all,
>
> Over the last couple of months, we (at Collabora) have been working on a
> prototype for a Mesa testing system based on trace replays that supports
> correctness regression testing and, in the future, performance
> regression testing.
>
> We are aware that large-scale CI systems that perform extensive checks
> on Mesa already exist. However, our goal is not to reach that kind of
> scale or exhaustiveness, but to produce a system that will be simple and
> robust enough to be maintained by the community, while being useful
> enough so that the community will want to use and maintain it. We also
> want to make it fast enough that it will eventually be run on a regular
> basis, ideally in pre-commit fashion.
>
> The current prototype focuses on the correctness aspect, replaying
> traces and comparing images against a set of reference images on
> multiple devices. At the moment, we run on softpipe and
> intel/chromebook, but it's straightforward to add other devices through
> gitlab runners.
>
> For the prototype we have used a simple approach for image comparison,
> storing a separate set of reference images per device and using exact
> image comparison, but we are also investigating alternative ways to deal
> with this. First results indicate that the frequency of reference image
> mismatches due to non-bug changes in Mesa is acceptable, but we will get
> a more complete picture once we have a richer set of traces and a longer
> CI run history. 

Some missing context: I was told that over 2400 commits, with glmark2 plus
a couple of other open source traces on Intel, there was one spurious
failure due to this diff method.  This is lower than it felt when I did
this in piglit on vc4, but at that time I was very actively changing
optimizations in the compiler while using that tool.

> The current design is based on an out-of-tree approach, where the tracie
> CI works independently from Mesa CI, fetching and building the latest
> Mesa on its own. We did this for maximum flexibility in the prototyping
> phase, but this has a complexity cost, and although we could continue to
> work this way, we would like to hear people's thoughts about eventually
> integrating with Mesa more closely, by becoming part of the upstream
> Mesa testing pipelines.
>
> It's worth noting that over the last few months other people, most notably
> Eric Anholt, have made proposals to extend the scope of testing in CI.
> We believe there is much common ground here (multiple devices,
> deployment with gitlab runners) and room for cooperation and eventual
> integration into upstream Mesa. In the end, the main difference between
> all these efforts is the kind of tests (deqp, traces, performance) that
> are being run, which all have their place and offer different
> trade-offs.
>
> We have also implemented a prototype dashboard to display the results,
> which we have deployed at:
>
> https://tracie.freedesktop.org
>
> We are working to improve the dashboard and provide more value by
> extracting and displaying additional information, e.g., "softpipe broken
> since commit NNN".
>
> The dashboard is currently specific to the trace playback results, but
> it would be nice to eventually converge to a single MesaCI dashboard
> covering all kinds of Mesa CI test results. We would be happy to help
> develop in this direction if there is interest.
>
> You can find the CI scripts for tracie at:
>
> https://gitlab.freedesktop.org/gfx-ci/tracie/tracie
>
> Code for the dashboard is at:
>
> https://gitlab.freedesktop.org/gfx-ci/tracie/tracie_dashboard
>
> Here is an example of a failed CI job (for a purposefully broken Mesa
> commit) and the report of the failed trace (click on the red X to
> see the image diffs):
>
> https://tracie.freedesktop.org/dashboard/job/642369/
>
> Looking forward to your thoughts and comments.

A couple of thoughts on this:

A separate dashboard is useful if we have traces that are too slow to
run pre-merge or are not redistributable.  For traces that are
redistributable and cheap to run, we should run them in our CI and block
the merge, rather than requiring someone to watch an external dashboard
and report regressions to be patched up after they have already landed.

I'm reluctant to add "maintain a web service codebase" as one of the
things that the Mesa project does, if there are alternatives that don't
involve that.  I've been thinking about a perf dashboard, and for that
I'd like to reuse existing open source projects like grafana.  If we
start our own dashboard project, are we going to end up reimplementing
that one?


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] Mesa CI with trace regression testing

2019-09-26 Thread Alexandros Frantzis
Hi all,

Over the last couple of months, we (at Collabora) have been working on a
prototype for a Mesa testing system based on trace replays that supports
correctness regression testing and, in the future, performance
regression testing.

We are aware that large-scale CI systems that perform extensive checks
on Mesa already exist. However, our goal is not to reach that kind of
scale or exhaustiveness, but to produce a system that will be simple and
robust enough to be maintained by the community, while being useful
enough so that the community will want to use and maintain it. We also
want to make it fast enough that it will eventually be run on a regular
basis, ideally in pre-commit fashion.

The current prototype focuses on the correctness aspect, replaying
traces and comparing images against a set of reference images on
multiple devices. At the moment, we run on softpipe and
intel/chromebook, but it's straightforward to add other devices through
gitlab runners.

For the prototype we have used a simple approach for image comparison,
storing a separate set of reference images per device and using exact
image comparison, but we are also investigating alternative ways to deal
with this. First results indicate that the frequency of reference image
mismatches due to non-bug changes in Mesa is acceptable, but we will get
a more complete picture once we have a richer set of traces and a longer
CI run history. 
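
The per-device exact comparison described above could be sketched roughly
as follows. This is only an illustration and not tracie's actual code: the
directory layout (<reference_root>/<device>/<name>.png), the function
names, and the choice to compare file checksums instead of decoded pixels
are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_to_references(output_dir, reference_root, device):
    """Compare each replayed image with the same-named reference for `device`.

    Hypothetical layout: references live in reference_root/device/.
    Returns a dict mapping image name to "pass", "fail" or
    "missing-reference".
    """
    reference_dir = Path(reference_root) / device
    results = {}
    for image in sorted(Path(output_dir).glob("*.png")):
        reference = reference_dir / image.name
        if not reference.exists():
            results[image.name] = "missing-reference"
        elif file_digest(image) == file_digest(reference):
            # Exact match: the files are byte-for-byte identical.
            results[image.name] = "pass"
        else:
            results[image.name] = "fail"
    return results
```

Comparing checksums is stricter than comparing pixels, since two files
with identical pixels but different metadata would count as a mismatch.
A real implementation would probably decode the images first.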

The current design is based on an out-of-tree approach, where the tracie
CI works independently from Mesa CI, fetching and building the latest
Mesa on its own. We did this for maximum flexibility in the prototyping
phase, but this has a complexity cost, and although we could continue to
work this way, we would like to hear people's thoughts about eventually
integrating with Mesa more closely, by becoming part of the upstream
Mesa testing pipelines.

It's worth noting that over the last few months other people, most notably
Eric Anholt, have made proposals to extend the scope of testing in CI.
We believe there is much common ground here (multiple devices,
deployment with gitlab runners) and room for cooperation and eventual
integration into upstream Mesa. In the end, the main difference between
all these efforts is the kind of tests (deqp, traces, performance) that
are being run, which all have their place and offer different
trade-offs.

We have also implemented a prototype dashboard to display the results,
which we have deployed at:

https://tracie.freedesktop.org

We are working to improve the dashboard and provide more value by
extracting and displaying additional information, e.g., "softpipe broken
since commit NNN".
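
As an illustration of the "broken since commit NNN" idea, here is a minimal
sketch. It assumes the results for one device are available as a list of
(commit, passed) pairs ordered from oldest to newest. That data shape and
the function name are hypothetical and not taken from the dashboard code.

```python
def broken_since(history):
    """Return the commit where the current failing streak started.

    `history` is a list of (commit_id, passed) tuples, oldest first.
    Returns None if the history is empty or the newest result passed.
    """
    first_bad = None
    # Walk backwards from the newest result until a passing run is found.
    for commit, passed in reversed(history):
        if passed:
            break
        first_bad = commit
    return first_bad


history = [("a1", True), ("b2", False), ("c3", True), ("d4", False), ("e5", False)]
print(broken_since(history))
```

Only the most recent streak is reported, so an older failure that was
fixed in between (b2 in the example) does not affect the answer.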

The dashboard is currently specific to the trace playback results, but
it would be nice to eventually converge to a single MesaCI dashboard
covering all kinds of Mesa CI test results. We would be happy to help
develop in this direction if there is interest.

You can find the CI scripts for tracie at:

https://gitlab.freedesktop.org/gfx-ci/tracie/tracie

Code for the dashboard is at:

https://gitlab.freedesktop.org/gfx-ci/tracie/tracie_dashboard

Here is an example of a failed CI job (for a purposefully broken Mesa
commit) and the report of the failed trace (click on the red X to
see the image diffs):

https://tracie.freedesktop.org/dashboard/job/642369/

Looking forward to your thoughts and comments.

Thanks,
Alexandros

Re: [Mesa-dev] [clover/spirv] radeonsi/NIR (with Nine) - final linking failed on libOpenCL.so.1.0.0

2019-09-26 Thread Karol Herbst
I think you only need to recompile the translator with -fPIC enabled.
At least that's what the error is saying.
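
To see which static archive is affected, the linker output can be
summarized with a small script. The sketch below is only an illustration:
the function name and regular expression are my own, and it assumes the
message wording shown in the quoted log ("relocation ... can not be used
when making a shared object").

```python
import re

# Matches one ld complaint after whitespace has been collapsed, e.g.
#   libfoo.a(bar.o): relocation R_X86_64_32 against ... can not be used
#   when making a shared object
NON_PIC = re.compile(
    r"(?P<archive>\S+\.a)\((?P<obj>[^)]+)\): relocation (?P<reloc>\S+) against "
    r".*? can not be used when making a shared object"
)


def non_pic_objects(ld_output):
    """Group the object files ld rejected as non-PIC by static archive."""
    # Drop email quote prefixes ("> > ") so pasted logs also work.
    lines = [re.sub(r"^(?:>\s?)+", "", line) for line in ld_output.splitlines()]
    # Collapse all whitespace to undo line wrapping.
    text = " ".join(" ".join(lines).split())
    grouped = {}
    for match in NON_PIC.finditer(text):
        objects = grouped.setdefault(match["archive"], [])
        if match["obj"] not in objects:
            objects.append(match["obj"])
    return grouped
```

If every reported object comes from the same archive, as appears to be the
case in the log below, rebuilding that one library as position-independent
code should be enough. For a CMake-based build, the standard
CMAKE_POSITION_INDEPENDENT_CODE variable is the usual way to request that,
though I have not verified it for this particular setup.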

On Thu, Sep 26, 2019 at 6:53 AM Aaron Watry wrote:
>
> Pretty sure I'm running into the same thing trying to build clover
> with llvm-spirv enabled.  If it's a known solution, I wouldn't mind
> having some time saved :)
>
> --Aaron
>
> On Wed, Sep 25, 2019 at 10:30 AM Dieter Nützel wrote:
> >
> > Hello Karol and Pierre,
> >
> > I tried it on radeonsi/NIR with Nine and OpenCL enabled
> > (-Dgallium-nine=true -Dopencl-spirv=true -Dgallium-opencl=standalone).
> >
> > I think I have all the SPIRV-LLVM-Translator stuff in place
> > (/opt/llvm/projects/SPIRV-LLVM-Translator/). The resulting lib is
> > installed at /usr/local/lib/libLLVMSPIRVLib.a.
> >
> > Do I need a shared version (*.so) of it? The 'ld' output points to this
> > (relocation R_X86_64_32 against symbol `_ZTVN4SPIR13PrimitiveTypeE' can
> > not be used when making a shared object; recompile with -fPIC).
> >
> > Thanks,
> > Dieter
> >
> > [1384/1384] Linking target
> > src/gallium/targets/opencl/libOpenCL.so.1.0.0.
> > FAILED: src/gallium/targets/opencl/libOpenCL.so.1.0.0
> > ccache c++  -o src/gallium/targets/opencl/libOpenCL.so.1.0.0
> > -Wl,--no-undefined -Wl,--as-needed -Wl,-O1 -shared -fPIC
> > -Wl,--start-group -Wl,-soname,libOpenCL.so.1 -Wl,--whole-archive
> > src/gallium/state_trackers/clover/libclover.a -Wl,--no-whole-archive
> > src/gallium/auxiliary/pipe-loader/libpipe_loader_dynamic.a
> > src/loader/libloader.a src/util/libxmlconfig.a src/util/libmesa_util.a
> > src/gallium/auxiliary/libgallium.a src/compiler/nir/libnir.a
> > src/compiler/libcompiler.a src/gallium/state_trackers/clover/libclllvm.a
> > src/gallium/state_trackers/clover/libclspirv.a
> > src/gallium/state_trackers/clover/libclnir.a -Wl,--gc-sections
> > -Wl,--version-script /opt/mesa/src/gallium/targets/opencl/opencl.sym
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libz.so -pthread
> > -lm -ldl /usr/lib64/libunwind.so /usr/lib64/libelf.so
> > /usr/local/lib/libclangCodeGen.a /usr/local/lib/libclangFrontendTool.a
> > /usr/local/lib/libclangFrontend.a /usr/local/lib/libclangDriver.a
> > /usr/local/lib/libclangSerialization.a /usr/local/lib/libclangParse.a
> > /usr/local/lib/libclangSema.a /usr/local/lib/libclangAnalysis.a
> > /usr/local/lib/libclangAST.a /usr/local/lib/libclangASTMatchers.a
> > /usr/local/lib/libclangEdit.a /usr/local/lib/libclangLex.a
> > /usr/local/lib/libclangBasic.a /usr/lib64/libdrm.so
> > /usr/lib64/libexpat.so -L/usr/local/lib -lLLVM-10svn -lsensors
> > -L/usr/local/lib -lLLVM-10svn /usr/local/lib/libLLVMSPIRVLib.a
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libSPIRV-Tools.so
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libSPIRV-Tools-link.so
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../lib64/libSPIRV-Tools-opt.so
> > -Wl,--end-group
> > '-Wl,-rpath,$ORIGIN/../../auxiliary/pipe-loader:$ORIGIN/../../../loader:$ORIGIN/../../../util:$ORIGIN/../../auxiliary:$ORIGIN/../../../compiler/nir:$ORIGIN/../../../compiler'
> > -Wl,-rpath-link,/opt/mesa/build/src/gallium/auxiliary/pipe-loader
> > -Wl,-rpath-link,/opt/mesa/build/src/loader
> > -Wl,-rpath-link,/opt/mesa/build/src/util
> > -Wl,-rpath-link,/opt/mesa/build/src/gallium/auxiliary
> > -Wl,-rpath-link,/opt/mesa/build/src/compiler/nir
> > -Wl,-rpath-link,/opt/mesa/build/src/compiler
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> > /usr/local/lib/libLLVMSPIRVLib.a(SPIRVWriter.cpp.o): relocation
> > R_X86_64_32 against symbol `__pthread_key_create@@GLIBC_2.2.5' can not
> > be used when making a shared object; recompile with -fPIC
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> > /usr/local/lib/libLLVMSPIRVLib.a(PreprocessMetadata.cpp.o): relocation
> > R_X86_64_32 against `.rodata' can not be used when making a shared
> > object; recompile with -fPIC
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> > /usr/local/lib/libLLVMSPIRVLib.a(SPIRVDebug.cpp.o): relocation
> > R_X86_64_32 against `.bss' can not be used when making a shared object;
> > recompile with -fPIC
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> > /usr/local/lib/libLLVMSPIRVLib.a(SPIRVDecorate.cpp.o): relocation
> > R_X86_64_32 against symbol `_ZTVN5SPIRV20SPIRVDecorateGenericE' can not
> > be used when making a shared object; recompile with -fPIC
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> > /usr/local/lib/libLLVMSPIRVLib.a(SPIRVEntry.cpp.o): relocation
> > R_X86_64_32 against symbol `__pthread_key_create@@GLIBC_2.2.5' can not
> > be used when making a shared object; recompile with -fPIC
> > /usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld:
> > /usr/local/lib/libLLVMSPIRVLib.a(SPIRVFunction.cpp.o): relocation
> > R_X86_64_32 against symbol `_ZTVN5SPIRV22SPIRVFunctionParameterE' can
> > not be used when making a shared object;