Re: Summary of e10s performance (Talos + Telemetry + crash-stats)

2015-07-15 Thread Benoit Girard
For the e10s talos regressions see
https://bugzilla.mozilla.org/show_bug.cgi?id=1174776 and
https://bugzilla.mozilla.org/show_bug.cgi?id=1184277. We've already
diagnosed one source of the regression as a difference in GC/CC
behavior when running e10s Talos.

On Fri, Jul 10, 2015 at 5:44 PM, Vladan Djeric vdje...@mozilla.com wrote:

 Yup, the median shutdown duration for Release 39 users on Windows with
 Telemetry is 2.3 seconds for example: http://mzl.la/1HSHiD8
 Those are also the kinds of shutdown times I see on my Windows machines
 when I have 3-5 windows open with 5-10 tabs each.

 What is your experience?
 Btw, you can go to about:telemetry and look through your archived Telemetry
 pings to see a history of your own shutdownDurations. Open about:telemetry,
 select Archived ping data, open the Simple Measurements section, and
 use the next-previous arrows to look through your Telemetry submissions.
 Focus on the saved-session pings.

 On Fri, Jul 10, 2015 at 5:33 PM, Mike Hommey m...@glandium.org wrote:

  On Fri, Jul 10, 2015 at 03:59:43PM -0400, Vladan Djeric wrote:
   * Median shutdown duration improved by 120ms or 10%
   (simpleMeasurements/shutdownDuration)
 
  Wait. What? Median shutdown duration is 1.2s ?!?
 
 ___
 dev-platform mailing list
 dev-platform@lists.mozilla.org
 https://lists.mozilla.org/listinfo/dev-platform



Summary of e10s performance (Talos + Telemetry + crash-stats)

2015-07-10 Thread Vladan Djeric
A few of us on the perf team (+ Joel Maher) looked at e10s performance &
stability using Talos, Telemetry, and crash-stats. I wrote up the
conclusions below.

Notable improvements in Talos tests [1]:

* Hot startup time in Talos improved by about 50% across all platforms
(ts_paint [2]). This test measures time from Firefox launch until a Firefox
window is first painted (ts_paint); I/O read costs are not accounted for,
as data is already cached in the OS disk buffer before the test.
* The tsvgr_opacity test improved 50-80% across all platforms. This is a
sign of a reduction in the overhead of loading a page, instead of an
improvement in actual SVG performance.
* Linux scrolling performance improved 5-15%
* The long-standing e10s WebGL performance regression has been fixed
* SVG rendering performance (tsvgx) is ~25% better on Windows 7 & 8, but it
is 10% worse on Windows XP and 25% worse on Linux

Notable regressions in Talos tests [1]:

* There are several large regressions unique to Windows XP. Scrolling
smoothness regressed significantly (5-6 times worse on tp5o_scroll and
tscrollx [2]), resizing of Firefox windows is 150% worse (tresize), SVG
rendering performance is 25% worse (tsvgx)
* Page loading time regressed across all platforms (tp5o). Linux regressed
~30%, OS X 10.10 regressed 20%, WinXP/Win8/Win7 all regressed ~10%.
Page-loading with accessibility enabled (a11yr) saw similar regressions.
* Time to open a new Firefox window (tpaint) regressed 30% on Linux, and
across different versions of Windows (10%)
* Resizing of Firefox windows (tresize) is ~15% worse on Linux
* Note: not all tests are compatible with e10s yet (e.g. session-restore
performance test) so this list isn't complete

Notable improvements from Telemetry data [3]:

* Overall tab animation smoothness improved significantly: 50% vs 30% of
tab animation frames are hitting the target 16ms inter-frame interval. See
FX_TAB_ANIM_* graphs in [3] to see the distribution of frame intervals.
Note that not all tab animations benefited equally.
* e10s significantly decreased jank caused by GC & CC, both in parent &
content processes (GC_MAX_PAUSE_MS, GC_SCC_SWEEP_MAX_PAUSE_MS,
CYCLE_COLLECTOR_MAX_PAUSE, etc [3])
* Unlike Talos, Telemetry suggests that the time to open a new Firefox
window improved with e10s (FX_NEW_WINDOW_MS)
* Median time to restore a saved session improved by 40ms or 20%
(simpleMeasurements/sessionRestored)
* Median shutdown duration improved by 120ms or 10%
(simpleMeasurements/shutdownDuration)
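
As a rough cross-check of the last two bullets: when a change is quoted both
as an absolute delta and as a percentage, the implied baseline is simply
delta / fraction. A minimal sketch, assuming the percentages are relative to
the non-e10s median (the helper name implied_baseline_ms is hypothetical and
not part of any Telemetry tooling):

```python
def implied_baseline_ms(delta_ms: float, percent: float) -> float:
    """Return the baseline implied by an absolute delta and a relative change.

    Example: a 120 ms improvement described as "10%" implies a baseline
    of 120 * 100 / 10 = 1200 ms.
    """
    if percent <= 0:
        raise ValueError("percent must be positive")
    return delta_ms * 100.0 / percent


# Figures quoted in the two bullets above (assumed relative to non-e10s).
session_restore_baseline_ms = implied_baseline_ms(40, 20)
shutdown_baseline_ms = implied_baseline_ms(120, 10)

print(f"implied session-restore baseline: {session_restore_baseline_ms:.0f} ms")
print(f"implied shutdown baseline: {shutdown_baseline_ms:.0f} ms")
```

Under that assumption the shutdown bullet implies a median of roughly 1.2
seconds, and the session-restore bullet a median of roughly 200ms.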

Notable regressions from Telemetry data [3]:

* Unlike Talos, Telemetry numbers imply that the median real-world startup
time, measured as time to first-paint, regressed by 550ms or 20% with e10s
(simpleMeasurements/firstPaint)
* The frequency of jank events lasting more than 100ms increased from ~19
events/minute to ~21 events/minute with e10s. This was derived from the
main thread's event processing times and session uptime
(gecko_hangs_per_minute)
* Similarly the frequency of the slow-script dialog appearing seems to have
roughly doubled with e10s (histograms/SLOW_SCRIPT_NOTICE_COUNT)
* A side-note: interpreting Telemetry data is trickier than Talos data,
because Telemetry measurements aren't gathered from a controlled
environment, there are confounding variables, opt-in bias, etc. An
additional challenge with e10s Telemetry is that many measurements haven't
yet been re-validated to confirm that they measure the same things in e10s
and non-e10s.
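
To make the jank-frequency bullet concrete: a hangs-per-minute figure of this
kind can be derived by counting main-thread events longer than 100ms and
dividing by session uptime. The sketch below is only an illustration under
that assumption. The function name, the input shape (a flat list of event
durations) and the sample numbers are hypothetical; the actual analysis in
[3] works from Telemetry histograms and may differ in detail.

```python
from typing import Iterable

# Threshold taken from the bullet above: "jank events lasting more than 100ms".
HANG_THRESHOLD_MS = 100.0


def hangs_per_minute(event_durations_ms: Iterable[float],
                     uptime_minutes: float) -> float:
    """Count events longer than the threshold, normalized by session uptime."""
    if uptime_minutes <= 0:
        raise ValueError("uptime_minutes must be positive")
    hangs = sum(1 for d in event_durations_ms if d > HANG_THRESHOLD_MS)
    return hangs / uptime_minutes


# Made-up sample session: six main-thread events over half a minute.
sample_durations_ms = [12.0, 250.0, 99.0, 101.0, 16.0, 480.0]
rate = hangs_per_minute(sample_durations_ms, uptime_minutes=0.5)
print(f"{rate:.1f} hangs/minute")
```

In the made-up sample, three events exceed the threshold in half a minute of
uptime, so the rate comes out at 6 hangs/minute.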

Notable stability improvements [4]:

* E10S Firefox can survive crashes in content-process code, so it's no
surprise that the E10S parent process crash rate is a quarter of the
single-process crash rate (based on crash-stats from Nightly 42 on Windows
[4])
* The total number of E10S crashes of any type (content crash or parent
crash) is roughly the same as without E10S
* Oddly enough, E10S seems to win on plugin crash rates as well! [4]
* There seem to be no regressions in crash rate compared to single-process

References:

1. Joel Maher used compare-talos to compare the Talos scores of an m-c
revision (Nightly 42) in e10s & non-e10s configurations:
https://bugzilla.mozilla.org/show_bug.cgi?id=1144120#c5
Data in friendlier chart form:
https://drive.google.com/open?id=1qfkcoE5_25GtZDa-pIlqFMw6pLueplOsP1YhQb8UcC8
Talos data was gathered from a Firefox 42 m-c build aad95360a002 from June
29th
2. Talos test descriptions https://wiki.mozilla.org/Buildbot/Talos/Tests
3. Roberto Vitillo compared Telemetry from 150,000 Nightly sessions
submitted on June 15th with buildIDs in the range [20150601, 20150616]:
http://nbviewer.ipython.org/urls/gist.githubusercontent.com/vitillo/cb6f1304316c1c1a2cbc/raw/e10s%20analysis.ipynb
~90% of the sessions were from e10s clients, so interpret the e10s &
non-e10s populations with a grain of salt. The numbers were not broken down
by OS. I did not comment on any findings where the delta had more than 0.10
probability of being caused by chance.
4. Crash-stats comparisons from 

Re: Summary of e10s performance (Talos + Telemetry + crash-stats)

2015-07-10 Thread Mike Hommey
On Fri, Jul 10, 2015 at 03:59:43PM -0400, Vladan Djeric wrote:
 * Median shutdown duration improved by 120ms or 10%
 (simpleMeasurements/shutdownDuration)

Wait. What? Median shutdown duration is 1.2s ?!?


Re: Summary of e10s performance (Talos + Telemetry + crash-stats)

2015-07-10 Thread Vladan Djeric
Yup, the median shutdown duration for Release 39 users on Windows with
Telemetry is 2.3 seconds for example: http://mzl.la/1HSHiD8
Those are also the kinds of shutdown times I see on my Windows machines
when I have 3-5 windows open with 5-10 tabs each.

What is your experience?
Btw, you can go to about:telemetry and look through your archived Telemetry
pings to see a history of your own shutdownDurations. Open about:telemetry,
select Archived ping data, open the Simple Measurements section, and
use the next-previous arrows to look through your Telemetry submissions.
Focus on the saved-session pings.

On Fri, Jul 10, 2015 at 5:33 PM, Mike Hommey m...@glandium.org wrote:

 On Fri, Jul 10, 2015 at 03:59:43PM -0400, Vladan Djeric wrote:
  * Median shutdown duration improved by 120ms or 10%
  (simpleMeasurements/shutdownDuration)

 Wait. What? Median shutdown duration is 1.2s ?!?
