Problem:
Various measures have been tried to capture user frustration with having to wait to interact with a site they're loading (or to see the site data). These include:

FID - First Input Delay -- https://developers.google.com/web/updates/2018/05/first-input-delay
TTI - Time To Interactive -- https://developers.google.com/web/fundamentals/performance/user-centric-performance-metrics#time_to_interactive
related to: FCP - First Contentful Paint and FMP - First Meaningful Paint -- https://developers.google.com/web/fundamentals/performance/user-centric-performance-metrics#first_paint_and_first_contentful_paint
TTVC (Time To Visually Complete), etc.

None of these do a great job capturing the reality around pageload and interactivity. FID is the latest suggestion, but it's very much based on watching user actions and reporting on them, and thus depends on when users think the page is ready to interact with, and dozens of other things. It's only good for bulk field measurements of a specific site, by the site author. In particular, FID cannot reasonably be used in automation (or before wide deployment).

Proposal:
We should define a new measure based on FID, named MID, for Median Input Delay, which is measurable in automation and captures the expected delay a user experiences during a load. We can run this in automation against a set of captured pages, while also measuring related values like FCP and TTI, and dump this into a set of per-page graphs (perhaps on "areweinteractiveyet.com" :-) ).

While FID depends on measuring the delay when the user *happens* to click, MID would measure the median (etc.) delay that would be experienced at any point between (suggestion) FCP and TTI. I.e. it would be based on "if a user input event were generated this millisecond, how long would it be before it ran?" This would measure the delay in the input event queue (probably 0 for this case) plus the time remaining until the currently running event on the main thread finishes.

This inherently assumes we measure TTI and FCP (or something approximating them). This is somewhat problematic, as TTI is very noisy.
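
To make the "if a user input event were generated this millisecond" definition concrete, here is a minimal sketch of how the value could be computed offline from a recorded list of main-thread events. Everything in it is an assumption for illustration: the names input_delays and summarize, the step_ms parameter, and the (start_ms, end_ms) task format are hypothetical, the input event queue delay is taken to be 0 as suggested above, and this is not the measurement code from any existing bug.

```python
from bisect import bisect_right
from statistics import median


def input_delays(tasks, fcp_ms, tti_ms, step_ms=1):
    """Delay a hypothetical input event would see at each sampled instant.

    tasks: sorted, non-overlapping (start_ms, end_ms) main-thread events
           (assumed format). Samples cover the window [fcp_ms, tti_ms).
    Queue delay is assumed to be 0, so the delay is just the time left
    in whatever event is running at the sampled instant.
    """
    starts = [start for start, _ in tasks]
    delays = []
    for t in range(fcp_ms, tti_ms, step_ms):
        i = bisect_right(starts, t) - 1  # last task starting at or before t
        if i >= 0 and t < tasks[i][1]:
            delays.append(tasks[i][1] - t)  # time left in the running task
        else:
            delays.append(0)  # main thread idle: input would run immediately
    return delays


def summarize(delays):
    """Median, 95% point, and max of the sampled delays."""
    if not delays:
        raise ValueError("empty measurement window")
    ordered = sorted(delays)
    p95_index = min(len(ordered) - 1, (95 * len(ordered)) // 100)
    return {"median": median(ordered), "p95": ordered[p95_index], "max": ordered[-1]}


# One 100 ms task right at FCP, then a 10 ms task, with TTI at 200 ms.
example = summarize(input_delays([(0, 100), (150, 160)], fcp_ms=0, tti_ms=200))
print(example)  # {'median': 5.5, 'p95': 91, 'max': 100}
```

In this made-up example the median is small even though the worst case is 100 ms, which is the kind of gap that reporting the 95% point and max alongside the median would expose.
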
I have a first cut at TTI measurement (fed into profiler markers) in bug 1299118 (without the "no more than 2 connections in flight" part).

Value calculation:
Median seems to be the best measure, but once we have data we can look at the distributions on real sites and our test harness and decide what has the most correlation to user experience. We could also measure the 95% point, for example. In automation, there might be some advantage to recording/reporting more data, like median and 95%, or median, average, 95%, and max. Another issue with the calculation is that a single summary value won't capture burstiness in the results well (a distribution would).

Range measured over:
We could modify the starting point to be when the first object that could be interacted with is rendered (input object, link, adding a key event handler, etc). This would be a more accurate measure for web developers, and would matter only a little for our use. Note that getting content on the screen earlier might in some cases hurt you by starting the measurement "early", when the main thread is presumably busy.

Likewise, there might very well be alternatives to TTI for the end-point (and on some pages, you never get to TTI, or it takes a Long Time). Using TTI does imply we must collect data until 5 seconds after the last "Long Task", and since some sites will never go 5 seconds without a long task, we'll need to upper-bound it (or progressively reduce the 5 seconds over time, which may help). Alternatively, we could use a shorter window, or put an arbitrary limit on it (5 seconds past 'loaded', or just to 'loaded'), etc.

Issues:
Defining the start and stop points, and the details around the exact way we calculate the result (I hand-waved about it above). Note that "longer" endpoints will generally result in better scores, since the measure would then cover a longer tail where (presumably) less is happening.
OTOH, if it ends at TTI on a "Long Task" (50+ms event), that rather implies that it was at least intermittently busy until then. If we want to start when something interact-able is rendered, there may be some work to figure that out. Note that this inherently is measuring the delay until the input event *starts* processing, not how long it takes to process (since there is no actual input event here).

Once we have some experience with this, we could propose it for the Performance API WG.

-- Randell Jesup, Mozilla Corp
remove "news" for personal email
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform