Re: Input Delay Metric proposal

David Bolter Thu, 20 Sep 2018 07:14:17 -0700

[cc+ Heather for perceived performance]

Hi Randell,


I think I like this and I'm glad you are thinking about it. Measuring time
between TTI and FCP seems useful to me. You touched on a bunch of my
questions in the Issues section and I don't have answers but others might?
:)

Cheers,
David



On Wed, Sep 19, 2018 at 2:45 PM Randell Jesup <rjesup.n...@jesup.org> wrote:

> Problem:
> Various measures have been tried to capture user frustration with having
> to wait to interact with a site they're loading (or to see the site
> data).  This includes:
>
> FID - First Input Delay --
> https://developers.google.com/web/updates/2018/05/first-input-delay
> TTI - Time To Interactive --
> https://developers.google.com/web/fundamentals/performance/user-centric-performance-metrics#time_to_interactive
> related to: FCP - First Contentful Paint and FMP - First Meaningful Paint --
> https://developers.google.com/web/fundamentals/performance/user-centric-performance-metrics#first_paint_and_first_contentful_paint
> TTVC (Time To Visually Complete), etc.
>
> None of these do a great job capturing the reality around pageload and
> interactivity.  FID is the latest suggestion, but it's very much based
> on watching user actions and reporting on them, and thus depends on how
> much they think the page is ready to interact with, and dozens of other
> things. It's only good for field measurements in bulk of a specific
> site, by the site author.  In particular, FID cannot reasonably be used
> in automation (or before wide deployment).
>
> Proposal:
>
> We should define a new measure based on FID, named MID, for Median Input
> Delay, which is measurable in automation and captures the expected delay
> a user experiences during a load.  We can run this in automation against
> a set of captured pages, while also measuring related values like FCP
> and TTI, and dump this into a set of per-page graphs (perhaps on
> "areweinteractiveyet.com" :-) ).
>
> While FID depends on measuring the delay when the user *happens* to
> click, MID would measure the median (etc) delay that would be
> experienced at any point between (suggestion) FCP and TTI.  I.e. it
> would be based on "if a user input event were generated this
> millisecond, how long would it be before it ran?"  This would measure
> delay in the input event queue (probably 0 for this case) plus the time
> remaining until the currently running event on the main thread finishes.
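
A minimal sketch of the calculation described above, not an implementation
from the proposal. It assumes main-thread activity is available as
non-overlapping (start_ms, end_ms) busy intervals, that the input event
queue is empty (the "probably 0" case), and that one hypothetical input is
sampled per millisecond between FCP and TTI. The names input_delay_at and
sample_delays and the example trace are made up for illustration.

```python
from statistics import median


def input_delay_at(t, tasks):
    """Delay (ms) before a hypothetical input event arriving at time t could run.

    tasks: non-overlapping (start_ms, end_ms) main-thread busy intervals.
    Assumes an empty input queue, so the delay is only the time left in
    whatever task is running at t (0 if the main thread is idle).
    """
    for start, end in tasks:
        if start <= t < end:
            return end - t
    return 0


def sample_delays(tasks, fcp, tti, step=1):
    """Sample the hypothetical input delay every `step` ms in [fcp, tti)."""
    return [input_delay_at(t, tasks) for t in range(fcp, tti, step)]


# Hypothetical trace: a 60 ms task starting at FCP, then a 10 ms task.
tasks = [(100, 160), (200, 210)]
delays = sample_delays(tasks, fcp=100, tti=220)
mid = median(delays)
print(mid)  # 5.5
```

In this made-up trace, 50 of the 120 sampled milliseconds are idle, so the
median lands low (5.5 ms) even though an input at FCP would wait 60 ms.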
>
> This inherently assumes we measure TTI and FCP (or something
> approximating it).  This is somewhat problematic, as TTI is very noisy.
> I have a first cut at TTI measurement (fed into profiler markers) in
> bug 1299118 (without the "no more than 2 connections in flight" part).
>
> Value calculation:
> Median seems to be the best measure, but once we have data we can look
> at the distributions on real sites and our test harness and decide what
> has the most correlation to user experience.  We could also measure the
> 95% point, for example.  In automation, there might be some advantage to
> recording/reporting more data, like median and 95%, or median, average,
> 95%, and max.
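
As a rough illustration of reporting several values at once, the sketch
below summarizes a list of sampled delays. The nearest-rank percentile
definition, the helper names, and the sample data are assumptions chosen
for the example; the proposal does not specify any of them.

```python
import math
from statistics import mean, median


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(delays):
    """Report median, average, 95% point, and max of the sampled delays (ms)."""
    return {
        "median": median(delays),
        "mean": mean(delays),
        "p95": percentile(delays, 95),
        "max": max(delays),
    }


# Hypothetical bursty load: mostly idle, with a few short and a few long stalls.
delays = [0] * 90 + [40] * 5 + [200] * 5
summary = summarize(delays)
print(summary)
```

For this made-up data the median is 0 while the 95% point is 40 ms and the
max is 200 ms, which is the kind of burstiness a single median hides.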
>
> Another issue with the calculation is that it won't capture burstiness
> in the results well (a distribution would).
>
> Range measured over:
> We could modify the starting point to be when the first object that
> could be interacted with is rendered (input object, link, adding a key
> event handler, etc).  This would be a more-accurate measure for web
> developers, and would matter only a little for our use.  Note that
> getting content on the screen earlier might in some cases hurt you by
> starting the measurement "early" when the MainThread is presumably busy.
>
> Likewise, there might very well be alternatives to TTI for the end-point
> (and on some pages, you never get to TTI, or it's a Long Time).  Using
> TTI does imply we must collect data until 5 seconds after the last "Long
> Task", and since some sites will never go 5 seconds without a long
> task, we'll need to upper-bound it (or progressively reduce the 5
> seconds over time, which may help).   Alternatively, we could use a
> shorter window, or put an arbitrary limit on it (5 seconds past
> 'loaded', or just to 'loaded'), etc.
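
One possible reading of the end-point rule, sketched under assumptions: the
window ends at the end of the last Long Task (50+ ms) that is followed by a
quiet period of quiet_ms without another Long Task, and is cut off at cap_ms
if that never happens. The 30-second cap, the function name window_end, and
the traces are hypothetical, and the network-quiet part of TTI is left out.

```python
LONG_TASK_MS = 50  # "Long Task" threshold from the TTI definition


def window_end(tasks, fcp, quiet_ms=5000, cap_ms=30000):
    """Return the end of the measurement window in ms.

    tasks: (start_ms, end_ms) main-thread tasks from a trace that is assumed
    to extend at least quiet_ms past the last long task.
    cap_ms is an arbitrary upper bound for pages that never go quiet.
    """
    candidate = fcp
    for start, end in sorted(tasks):
        if end - start < LONG_TASK_MS or end <= fcp:
            continue  # ignore short tasks and tasks that finished before FCP
        if start - candidate >= quiet_ms:
            break  # found a quiet period before this long task
        candidate = end
        if candidate >= cap_ms:
            return cap_ms
    return min(candidate, cap_ms)


# Hypothetical page that goes quiet after a long task ending at 3060 ms.
quiet_page = [(0, 80), (1000, 1100), (1200, 1220), (3000, 3060), (9000, 9100)]
print(window_end(quiet_page, fcp=500))  # 3060

# Hypothetical page with a 100 ms task every second: it hits the cap.
busy_page = [(i * 1000, i * 1000 + 100) for i in range(60)]
print(window_end(busy_page, fcp=0))  # 30000
```

Progressively shrinking quiet_ms over time, as suggested above, would be a
variation on the same loop.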
>
> Issues:
>
> Defining the start and stop point, and the details around the exact way
> we calculate the result (I hand-waved over that above).  Note that
> "longer" endpoints will generally result in better scores, since the
> measure would probably cover a longer tail where less is happening.
> OTOH, if it ends at TTI on a "Long Task" (50+ms event),
> that rather implies that it was at least intermittently busy until then.
>
> If we want to start when something interact-able is rendered, there may
> be some work to figure that out.
>
> Note that this inherently is measuring the delay until the input event
> *starts* processing, not how long it takes to process (since there is no
> actual input event here).
>
> Once we have some experience with this, we could propose it for the
> Performance API WG.
>
> --
> Randell Jesup, Mozilla Corp
> remove "news" for personal email
> _______________________________________________
> dev-platform mailing list
> dev-platform@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>