On 1 December 2016 at 11:46, Daniel Stone <dan...@fooishbar.org> wrote:
> Hi,
> On 30 November 2016 at 10:00, Daniel Stone <dan...@fooishbar.org> wrote:
>> On 30 November 2016 at 01:02, Emil Velikov <emil.l.veli...@gmail.com> wrote:
>>> NB: Everything but the Install test, varies ±0.2s across 3 consecutive
>>> runs, thus it's been rounded to the closest 0.5s on average.
>>>          |          Laptop |       Laptop 2 |
>>> ---------+-----------------+----------------+
>>> Full     | 27.78s / 11.24s |   ~21s /  ~10s |
>>> Rebuild  | 13.80s /  9.63s | ~12.5s / ~8.5s |
>>> Check    | 13.22s /  8.80s |  ~7.5s / ~5.5s |
>>> Install  |  0.47s /  0.14s |  0.47s / 0.14s |
>>> ---------+-----------------+----------------+
>> Ha, this isn't Skylake laptop vs. Broadwell laptop, this is my 14-core
>> Xeon vs. your Broadwell laptop! That they're so different suggests
>> that our configuration is pretty wildly different - are you perhaps
>> not building a load of stuff which configure autodetects? That's the
>> only way I can imagine how your laptop would outpace a £4000
>> workstation. :\
> Having talked it through later, the conclusion was that the Xeon's
> single-core performance isn't nearly as good as the Skylake-U's.
> Various articles seemed to confirm that.
Ack. Just to confirm: the discrepancy above is between XPS 13 + F25
and X1C3 + Arch.
After double-checking - there's nothing that gets disabled during
configure here. Thus the ~15 vs ~10s runtime between our setups is
very disturbing.

>>> So if we are to ignore configure time, numbers are comparable.
>>> And yes, that's a very big if to ask for.
>> Interesting indeed. I do wonder what is going on there. Either way,
>> it's still a 1.5-2x speed win for you ...
>> As another data point, when I'm rebasing, I make very heavy use of git
>> rebase -i --exec right down to the parent tree, to make sure it
>> compiles/works the whole way through. Across my compositor-drm tree
>> (now up to 60 patches; most only touching compositor-drm.c, but some
>> touching wider parts of the source), this was 260.0s for make -j8, and
>> 19.3sec for ninja. There were three points at which autotools re-run
>> and Meson didn't; if we assume we have to take the full penalty for a
>> rebuild, then we end up on ~50s vs. ~260s.
> So, on this note, you guilted me into doing some more serious
> benchmarking of the results, which can be found:
> https://people.collabora.com/~daniels/wayland-meson-20161201/benchmark.py
> (modified to exclude simple-dmabuf-intel on ARM systems)
> https://people.collabora.com/~daniels/wayland-meson-20161201/weston.rpi2
> https://people.collabora.com/~daniels/wayland-meson-20161201/weston.rock2
> https://people.collabora.com/~daniels/wayland-meson-20161201/weston.laptop
> https://people.collabora.com/~daniels/wayland-meson-20161201/weston.xeon
> All the files are the result of running benchmark.py in the Weston
> tree, with ministat installed from FreeBSD (cf.
> http://anholt.net/compare-perf/). These do 10 runs and take a
> statistically-sound average. I'd also earlier tried to split the runs
> into common tasks people might perform, but it's probably easier for
> clarity to show the individual steps.
> On Intel, the configure stage (autogen / meson -Dfoo) is 14-15x
> faster. A complete build is 3x faster, and rebuilding component parts
> of the tree is 1.5-4x faster. The 'rebuild weston from built tree'
> sees one huge spike on autotools, which makes me think there's some
> relinking going on, so the stats are potentially far worse for
> autotools than Meson. Running tests is marginally faster, and
> installation is 2-4x faster. On ARM, configure is ~20x faster,
> complete build and rebuilds 3-4x faster, tests not noticeably
> different, and install 2-4x faster (less I/O load?).
> I also updated the Wayland tree to support cross-compilation by
> building a native scanner as part of the build process, which was
> seriously trivial.
Thanks for the extensive tools - I merely had for i in `seq 1 3`; do
...; done :-)
I'll give them a try in the next few days.
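
For anyone wanting something in between a three-iteration shell loop and
the full benchmark.py + ministat setup, a minimal sketch of the idea
follows. It is not the contents of benchmark.py; the helper names
(time_command, summarize) are made up for illustration, and the default
command is only a placeholder to be replaced with the real build step.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True aborts the benchmark if the build step itself fails.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean, sample standard deviation) of the durations."""
    mean = statistics.mean(durations)
    # The sample standard deviation needs at least two data points.
    stdev = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return mean, stdev


if __name__ == "__main__":
    # Placeholder command; pass the real one on the command line,
    # e.g. "ninja" or "make -j8", from inside the build directory.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    mean, stdev = summarize(time_command(cmd, runs=3))
    print(f"{' '.join(cmd)}: {mean:.2f}s +/- {stdev:.2f}s over 3 runs")
```

Three runs is too few to say much about the spread, so treat the output
as a sanity check rather than a replacement for the 10-run ministat
numbers above.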

As hinted before - in theory [at least] one should be able to get a
noticeable improvement by polishing the old tools - convert from
mtime to the method used by ninja, strip down unneeded tests from
configure, etc.

This does not mitigate the fact that meson/ninja _does_ have some cool
features and is noticeably faster _by default_.
Yet there's also the fact that distributions and/or builders simply
cannot use python/similar tools when using distribution tarballs.

Latter feature seems to be missing from meson afaict, alongside
cscope, ctags and a few others that some people still use (I don't
Thus do check with distro maintainers/others who ship wayland/weston,
_if_ in the future one decides to drop the autohell setup ;-)

Please consider using a low-traffic medium (hint: not wayland-devel),
since the question _will_ get lost.

P.S. Speaking of which, do you have some time to set this [1] up?