[cc+ Heather for perceived performance]

Hi Randell,
I think I like this and I'm glad you are thinking about it. Measuring time
between TTI and FCP seems useful to me. You touched on a bunch of my
questions in the Issues section and I don't have answers but others might? :)

Cheers,
David

On Wed, Sep 19, 2018 at 2:45 PM Randell Jesup <rjesup.n...@jesup.org> wrote:

> Problem:
> Various measures have been tried to capture user frustration with having
> to wait to interact with a site they're loading (or to see the site
> data). These include:
>
> FID - First Input Delay --
> https://developers.google.com/web/updates/2018/05/first-input-delay
> TTI - Time To Interactive --
> https://developers.google.com/web/fundamentals/performance/user-centric-performance-metrics#time_to_interactive
> related to: FCP - First Contentful Paint and FMP - First Meaningful Paint --
> https://developers.google.com/web/fundamentals/performance/user-centric-performance-metrics#first_paint_and_first_contentful_paint
> TTVC (Time To Visually Complete), etc.
>
> None of these do a great job capturing the reality around pageload and
> interactivity. FID is the latest suggestion, but it's very much based
> on watching user actions and reporting on them, and thus depends on how
> much they think the page is ready to interact with, and dozens of other
> things. It's only good for field measurements in bulk of a specific
> site, by the site author. In particular, FID cannot reasonably be used
> in automation (or before wide deployment).
>
> Proposal:
>
> We should define a new measure based on FID, named MID, for Median Input
> Delay, which is measurable in automation and captures the expected delay
> a user experiences during a load. We can run this in automation against
> a set of captured pages, while also measuring related values like FCP
> and TTI, and dump this into a set of per-page graphs (perhaps on
> "areweinteractiveyet.com" :-) ).
>
> While FID depends on measuring the delay when the user *happens* to
> click, MID would measure the median (etc) delay that would be
> experienced at any point between (suggestion) FCP and TTI. I.e. it
> would be based on "if a user input event were generated this
> millisecond, how long would it be before it ran?" This would measure
> delay in the input event queue (probably 0 for this case) plus the time
> remaining until the currently running event for the main thread finishes.
>
> This inherently assumes we measure TTI and FCP (or something
> approximating them). This is somewhat problematic, as TTI is very noisy.
> I have a first cut at TTI measurement (fed into profiler markers) in
> bug 1299118 (without the "no more than 2 connections in flight" part).
>
> Value calculation:
> Median seems to be the best measure, but once we have data we can look
> at the distributions on real sites and our test harness and decide what
> has the most correlation to user experience. We could also measure the
> 95% point, for example. In automation, there might be some advantage to
> recording/reporting more data, like median and 95%, or median, average,
> and 95%, and max.
>
> Another issue with the calculation is that it won't capture burstiness
> in the results well (a distribution would).
>
> Range measured over:
> We could modify the starting point to be when the first object that
> could be interacted with is rendered (input object, link, adding a key
> event handler, etc). This would be a more-accurate measure for web
> developers, and would matter only a little for our use. Note that
> getting content on the screen earlier might in some cases hurt you by
> starting the measurement "early" when the MainThread is presumably busy.
>
> Likewise, there might very well be alternatives to TTI for the end-point
> (and on some pages, you never get to TTI, or it's a Long Time).
> Using TTI does imply we must collect data until 5 seconds after the last
> "Long Task", and since some sites will never go 5 seconds without a long
> task, we'll need to upper-bound it (or progressively reduce the 5
> seconds over time, which may help). Alternatively, we could use a
> shorter window, or put an arbitrary limit on it (5 seconds past
> 'loaded', or just to 'loaded'), etc.
>
> Issues:
>
> Defining the start and stop point, and the details around the exact way
> we calculate the result (I hand-wove about it above). Note that
> "longer" endpoints will result generally in better scores, since it
> would average over probably a longer tail where less is happening
> (presumably). OTOH if it ends at TTI on a "Long Task" (50+ms event),
> that rather implies that it was at least intermittently busy until then.
>
> If we want to start when something interact-able is rendered, there may
> be some work to figure that out.
>
> Note that this inherently is measuring the delay until the input event
> *starts* processing, not how long it takes to process (since there is no
> actual input event here).
>
> Once we have some experience with this, we could propose it for the
> Performance API WG.
>
> --
> Randell Jesup, Mozilla Corp
> remove "news" for personal email
> _______________________________________________
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
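
For anyone who wants to play with the numbers, here is a rough sketch of the "if a user input event were generated this millisecond, how long would it be before it ran?" calculation from the proposal. It assumes the main-thread busy intervals are already available as non-overlapping (start, end) pairs in milliseconds, that the input event queue is empty (so queue delay is 0, as the proposal suggests), and that an input event runs as soon as the currently running task finishes. The function names and the nearest-rank 95% point are illustrative choices only, not anything taken from bug 1299118 or an existing harness. The start and end of the window are plain parameters so that alternatives to FCP and TTI can be tried.

```python
from statistics import mean, median


def input_delays(tasks, start_ms, end_ms):
    """Delay seen by a hypothetical input event generated at each whole
    millisecond t in [start_ms, end_ms).

    tasks: non-overlapping (task_start_ms, task_end_ms) main-thread intervals.
    Assumption: the delay is the time remaining on whatever task is running
    at t, or 0 if the main thread is idle at t.
    """
    tasks = sorted(tasks)
    delays = []
    i = 0
    for t in range(start_ms, end_ms):
        # Skip tasks that have already finished by time t.
        while i < len(tasks) and tasks[i][1] <= t:
            i += 1
        if i < len(tasks) and tasks[i][0] <= t:
            delays.append(tasks[i][1] - t)  # time left on the running task
        else:
            delays.append(0)  # main thread idle, event runs immediately
    return delays


def percentile(sorted_values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    n = len(sorted_values)
    rank = max(1, (pct * n + 99) // 100)  # integer ceil(pct * n / 100)
    return sorted_values[rank - 1]


def summarize_mid(tasks, start_ms, end_ms):
    """Median, mean, 95% point and max of the per-millisecond delays."""
    delays = sorted(input_delays(tasks, start_ms, end_ms))
    if not delays:
        raise ValueError("empty measurement window")
    return {
        "median": median(delays),
        "mean": mean(delays),
        "p95": percentile(delays, 95),
        "max": delays[-1],
    }


if __name__ == "__main__":
    # Made-up example: two busy periods inside a 1-second window.
    example_tasks = [(100, 300), (400, 450)]
    print(summarize_mid(example_tasks, 0, 1000))
```

With a long idle tail in the window the median drops to 0 even though the 95% point and max are large, which is the "longer endpoints give better scores" effect mentioned under Issues, and a reason to report more than the median.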