On 30 November 2016 at 10:00, Daniel Stone <dan...@fooishbar.org> wrote:
> On 30 November 2016 at 01:02, Emil Velikov <emil.l.veli...@gmail.com> wrote:
>> NB: Everything but the Install test, varies ±0.2s across 3 consecutive
>> runs, thus it's been rounded to the closest 0.5s on average.
>>          |          Laptop |       Laptop 2 |
>> ---------+-----------------+----------------+
>> Full     | 27.78s / 11.24s |   ~21s /  ~10s |
>> Rebuild  | 13.80s /  9.63s | ~12.5s / ~8.5s |
>> Check    | 13.22s /  8.80s |  ~7.5s / ~5.5s |
>> Install  |  0.47s /  0.14s |  0.47s / 0.14s |
>> ---------+-----------------+----------------+
> Ha, this isn't Skylake laptop vs. Broadwell laptop, this is my 14-core
> Xeon vs. your Broadwell laptop! That they're so different suggests
> that our configuration is pretty wildly different - are you perhaps
> not building a load of stuff which configure autodetects? That's the
> only way I can imagine how your laptop would outpace a £4000
> workstation. :\

Having talked it through later, we concluded that the Xeon's
single-core performance isn't nearly as good as the Skylake-U's.
Various articles seem to confirm that.

>> So if we are to ignore configure time, numbers are comparable.
>> And yes, that's a very big if to ask for.
> Interesting indeed. I do wonder what is going on there. Either way,
> it's still a 1.5-2x speed win for you ...
> As another data point, when I'm rebasing, I make very heavy use of git
> rebase -i --exec right down to the parent tree, to make sure it
> compiles/works the whole way through. Across my compositor-drm tree
> (now up to 60 patches; most only touching compositor-drm.c, but some
> touching wider parts of the source), this was 260.0s for make -j8, and
> 19.3sec for ninja. There were three points at which autotools re-run
> and Meson didn't; if we assume we have to take the full penalty for a
> rebuild, then we end up on ~50s vs. ~260s.
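
The ~50s figure above can be reproduced with a few lines of arithmetic.
This is only a sketch: the 260.0s, 19.3s and "three points" numbers come
from the quoted text, but FULL_REBUILD_S (about 10s per forced full
rebuild) is an assumed round figure chosen to match the quoted ~50s, not
a measurement.

```python
# Figures taken from the quoted message.
AUTOTOOLS_TOTAL_S = 260.0    # make -j8 across the whole rebase
NINJA_TOTAL_S = 19.3         # ninja across the same rebase
SKIPPED_RERUNS = 3           # points where autotools re-ran and Meson didn't

# Assumption (not measured): each skipped re-run would have cost
# roughly one full rebuild, taken here as about 10 seconds.
FULL_REBUILD_S = 10.0


def penalised_total(base_s, events, penalty_s):
    """Add a fixed penalty per event to a measured total."""
    return base_s + events * penalty_s


meson_worst_case = penalised_total(NINJA_TOTAL_S, SKIPPED_RERUNS, FULL_REBUILD_S)
speedup = AUTOTOOLS_TOTAL_S / meson_worst_case

print(f"{meson_worst_case:.1f}s vs {AUTOTOOLS_TOTAL_S:.1f}s ({speedup:.1f}x)")
```

Even with the full penalty charged to Meson three times, the rebase
still comes out roughly five times faster under that assumption.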

So, on this note, you guilted me into doing some more serious
benchmarking of the results, which can be found:
(modified to exclude simple-dmabuf-intel on ARM systems)

All the files are the result of running benchmark.py in the Weston
tree, with ministat installed from FreeBSD (cf.
http://anholt.net/compare-perf/). Each benchmark does 10 runs and
takes a statistically sound average. I'd also earlier tried to split
the runs into common tasks people might perform, but for clarity it's
probably easier to show the individual steps.
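
For anyone wanting to try something similar by hand, the general shape
of such a harness might look like the sketch below. It is not the
actual benchmark.py: time_command and summarise are hypothetical names,
and the command being timed is a placeholder. It only collects
wall-clock samples and prints the sort of summary ministat reports
(N, min, max, median, mean, stddev); ministat additionally compares two
data sets with a Student's t test, which this does not attempt.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=10):
    """Run argv `runs` times; return the wall-clock durations in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def summarise(samples):
    """Summary statistics for one set of timings (needs >= 2 samples)."""
    return {
        "n": len(samples),
        "min": min(samples),
        "max": max(samples),
        "median": statistics.median(samples),
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),
    }


if __name__ == "__main__":
    # Placeholder command: replace with the step to be measured,
    # e.g. ["ninja", "-C", "build"] for a hypothetical build directory.
    timings = time_command([sys.executable, "-c", "pass"])
    for key, value in summarise(timings).items():
        print(f"{key:>6}: {value:.4f}" if key != "n" else f"{key:>6}: {value}")
```

Writing one duration per line to a file per build system would give
input in the form ministat expects for a proper comparison.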

On Intel, the configure stage (autogen / meson -Dfoo) is 14-15x
faster. A complete build is 3x faster, and rebuilding component parts
of the tree is 1.5-4x faster. The 'rebuild weston from built tree'
sees one huge spike on autotools, which makes me think there's some
relinking going on, so the stats are potentially far worse for
autotools than Meson. Running tests is marginally faster, and
installation is 2-4x faster. On ARM, configure is ~20x faster,
complete build and rebuilds 3-4x faster, tests not noticeably
different, and install 2-4x faster (less I/O load?).

I also updated the Wayland tree to support cross-compilation by
building a native scanner as part of the build process, which was
seriously trivial.
