Re: add "networkDuration" to Resource Timing

Steve Souders Tue, 23 Dec 2014 12:27:23 -0800

> I think the current definition of "duration" is correct

I've never claimed that the definition of "duration" is incorrect. Instead, I'm suggesting that we add a new metric called something like "networkDuration".

> If you want to exclusively measure the "network transfer time" and exclude cache and blocking overhead, then you should do that as a separate metric


Yes, that's it exactly.

-Steve

On 12/23/14 12:16 PM, Ilya Grigorik wrote:

On Mon, Dec 22, 2014 at 3:22 PM, Steve Souders <st...@souders.org<mailto:st...@souders.org>> wrote:


    > Sure, but that's mostly an educational and easily fixable
    problem on their end... Short of (2b) case.

    It's more than an educational problem. Developers typically look
    at code and object properties before documentation and tutorials.
    "Duration" is short and encompassing. It'll be the first choice.
    The people I've seen who have already made this mistake come from
    smart, webperf cutting edge organizations, as evidenced by the
    fact that they're using Resource Timing in production systems. If
    the cutting edge gurus make the mistake it's likely that we need
    more than education.

I disagree with this. I think the current definition of "duration" is correct and, in fact, is exactly what applications should be measuring: time from the moment you requested the resource to when it is available. This includes time to check the appropriate caches, which is non-zero and can be in tens and hundreds of milliseconds, connection setup time, blocking time due to head-of-line blocking (http/1 artifact), and the actual transfer times.

If you want to exclusively measure the "network transfer time" and exclude cache and blocking overhead, then you should do that as a separate metric... I think what you're pointing out here is that most people assume that cache lookups are effectively free, and http/1 HoL is not a problem... and that, to me, is an education problem, not a metric problem.
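
To make the distinction concrete, here is a minimal sketch of how "duration" splits into network phases and everything else for an entry with full timing data. The helper name `splitDuration` and the sample numbers are made up for illustration; only the attribute names come from PerformanceResourceTiming. It assumes no redirects and full access to the timing attributes (same origin or TAO).

```javascript
// Hypothetical helper: split an entry's duration into the network phases and
// whatever is left over (cache lookup, queueing, head-of-line blocking).
// Assumes full timing data and no redirects.
function splitDuration(r) {
  const dns = r.domainLookupEnd - r.domainLookupStart;
  const connect = r.connectEnd - r.connectStart; // includes TLS negotiation
  const waiting = r.responseStart - r.requestStart; // time to first byte
  const content = r.responseEnd - r.responseStart;
  const duration = r.responseEnd - r.startTime;
  const network = dns + connect + waiting + content;
  return { duration, network, unaccounted: duration - network };
}

// Made-up sample: 120 ms pass between the request being issued and DNS starting.
const sample = {
  startTime: 100,
  domainLookupStart: 220, domainLookupEnd: 230,
  connectStart: 230, connectEnd: 260,
  requestStart: 260, responseStart: 310, responseEnd: 340,
};
const parts = splitDuration(sample);
console.log(parts);
```

In this sample half of "duration" is not network time at all, which is the gap the two positions in this thread disagree about how to expose.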


    > Could we instrument HTTP Archive to log blocking time for each
    resource?
    I accept pull requests. ;-) But given that the average website has
    50+ resources on a single hostname
    <http://httparchive.org/trends.php#numDomains&maxDomainReqs>
    that's 44 requests that have blocking time.

But not all of them are dispatched simultaneously either: some are delayed because they're declared later in the document, some have to wait for layout (e.g. CSS spec'ed resources), and others may be scheduled via JS, etc. It'd be good to understand how this looks in the real world.

Good news is, looks like HAR already captures this in "timings: {blocked: ...}": http://www.softwareishard.com/blog/har-12-spec/#timings. I verified that both WPT and Chrome HAR export report the metric. So, we already have the data in the raw WPT results... "just" need to pull it out ;)
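
As a rough sketch of what "pulling it out" could look like, the snippet below collects the "blocked" timing from a parsed HAR object. The function name `blockedTimes` and the sample entries are hypothetical; the `log.entries[].timings.blocked` layout and the use of -1 for "does not apply" follow the HAR 1.2 spec linked above. How HTTP Archive would actually store or aggregate this is not covered here.

```javascript
// Collect the "blocked" timing (milliseconds) for each HAR entry that reports it.
// Per HAR 1.2, timings.blocked is optional and is -1 when it does not apply.
function blockedTimes(har) {
  return har.log.entries
    .filter((e) => e.timings && e.timings.blocked >= 0)
    .map((e) => ({ url: e.request.url, blocked: e.timings.blocked }));
}

// Tiny hand-written HAR fragment with made-up values, for illustration only.
const har = {
  log: {
    entries: [
      { request: { url: "http://example.com/a.js" }, timings: { blocked: 35, wait: 80, receive: 20 } },
      { request: { url: "http://example.com/b.css" }, timings: { blocked: -1, wait: 60, receive: 10 } },
      { request: { url: "http://example.com/c.png" }, timings: { blocked: 210, wait: 40, receive: 15 } },
    ],
  },
};

const blocked = blockedTimes(har);
const totalBlocked = blocked.reduce((sum, e) => sum + e.blocked, 0);
console.log(blocked.length, totalBlocked);
```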


    > But isn't this the same problem in a different disguise?
    Yes, but not as significant.

We have research showing that even flash I/O can be very expensive [1], and it mirrors some of the metrics we've gathered in the past in Chrome. Plus, in addition to slow I/O we also have thread hopping, etc., all of which add non-trivial overhead. I'm not convinced we can just sweep this under the rug.


ig

[1] http://dl.acm.org/citation.cfm?id=2385607

    On 12/22/14 9:25 AM, Ilya Grigorik wrote:

    On Wed, Dec 17, 2014 at 9:53 PM, Steve Souders <st...@souders.org
    <mailto:st...@souders.org>> wrote:

        The use cases were CDNs, RUM providers, and website owners
        using Resource Timing's duration to measure (what they
        thought was) download time of resources. In fact, one of the
        RUM providers (Buddy from SOASTA) did a preso at WebPerfdays
        showing code to track "duration" and captured it in a
        property called "downloadtime" - so everyone in that audience
        now thinks "duration" means "download time". Bummer!


    Sure, but that's mostly an educational and easily fixable problem
    on their end... Short of (2b) case.

        For the (2b) case (different origin & you don't control it so
        can't add TAO header), you're right that sometimes there's no
        action the website owner can take. For example, if the
        Twitter widget loads other scripts & images dynamically,
        there's not much the website owner can do. But there are
        *numerous* situations where the timing of (2b) content IS
        actionable. If the website owner was able to distinguish
        blocking time from download time they'd be able to make the
        right decision and take action. For example:
            - fonts - These are blocking the page from rendering. If
        it's because the fonts are slow to download, then I might
        want to switch font providers. If it's because of blocking,
        then I might want to preload or prefetch the fonts.
            - ads - I moved the ad in my page and clickthroughs
        dropped off significantly. Is that because the ad content is
        blocked or slow? Or something else?
            - JS libs - I might want to find out if
        https://code.jquery.com/jquery-2.1.2.min.js is loading slow
        on my site because it's blocked or just slow to download.
        Again, there are many actions the website owner can take -
        load it async, prefetch it, host it locally, get it from
        Google CDN.


    As an aside... I'm wondering if we can gather some data on how
    often this is actually a problem? Could we instrument HTTP
    Archive to log blocking time for each resource?

        Choosing a name is hard because I assume we do NOT want to
        reveal whether the object was read from cache for
        cross-origin resources. Thus, "networkDuration" could
        actually not involve any network requests at all. I thought
        about calling it "loadtime" since that covers loading it over
        the network or from cache. Again, I'm not insistent on
        "networkDuration" and would love better name brainstorming.


    But isn't this the same problem in a different disguise? I
    thought I was measuring the latency of my CDN, but I'm actually
    measuring latency of my cache lookup plus the CDN fetch, where
    the former can easily take tens if not hundreds of milliseconds..
    and crazily enough, be higher than the actual network fetch.

    ig

        On 12/4/14 9:13 AM, Ilya Grigorik wrote:

        On Mon, Nov 24, 2014 at 4:34 PM, Steve Souders
        <st...@souders.org <mailto:st...@souders.org>> wrote:

            LONG: A few weeks ago I discovered that "duration"
            includes blocking time, so "duration" is greater than
            the actual network time needed to download the resource.
            Since then I've been at Velocity and WebPerfDays where
            many people have shown their Resource Timing code.
            Everyone I spoke to (~5 different teams) assumed that
            "duration" was just the network time. When I explained
            that it also includes blocking they were surprised,
            admitted they hadn't known that, and agreed it is NOT
            the metric they were trying to capture.


        Steve, can you elaborate on the use case a bit more? Who's
        measuring what here, and for what purpose? Are we
        benchmarking CDN performance?

        In terms of getting access to the data, we have the
        following cases:
        1) same origin resources: full access to timing data.
        2) different origin:
          a) if you control it, add TAO header for full access to
        timing data.
          b) if you don't control it, you only have "duration"

        For (1) and (2a), I can see why you may want or need to get
        low-level "network duration" data: you want to track your
        provider's DNS performance, latency to your CDN, TTFB, total
        response time, and so on. You care about this because this
        is something *you can affect*. However, for (2b)... this
        same data falls into the "interesting but not actionable" bucket?
        Further, it seems like if you are actually interested in
        benchmarking your CDN, then you really should be looking
        deeper than just total time: you want to decompose DNS, TCP,
        TLS, HTTP req>resp cycles. At which point.. you need the
        full timing object anyway.
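
The three access cases above can be sketched as a small classifier. This is only an illustration: `classifyEntry` and the sample entries are hypothetical, and it assumes that a cross-origin entry without a TAO header reports `requestStart` as 0, which is how the Resource Timing drafts describe restricted attributes.

```javascript
// Hypothetical classifier for cases (1), (2a) and (2b).
// Assumption: restricted cross-origin entries report requestStart as 0.
function classifyEntry(entry, pageOrigin) {
  const sameOrigin = new URL(entry.name).origin === pageOrigin;
  if (sameOrigin) return "1"; // full access to timing data
  return entry.requestStart > 0
    ? "2a" // cross origin, TAO header present: full access
    : "2b"; // cross origin, no TAO: only "duration" is meaningful
}

// Made-up entries; "name" is the resource URL, as in PerformanceEntry.
const pageOrigin = "https://example.com";
const entries = [
  { name: "https://example.com/app.js", requestStart: 40 },
  { name: "https://cdn.example.net/lib.js", requestStart: 55 },
  { name: "https://widgets.example.org/w.js", requestStart: 0 },
];
const cases = entries.map((e) => classifyEntry(e, pageOrigin));
console.log(cases);
```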

            I propose we add a new property to Resource Timing that
            reflects the time to actually load the resource
            excluding blocking time. I'm flexible about the name but
            for purposes of this discussion let's call it
            "networkDuration". The important piece of this proposal
            is that "networkDuration" should be available for all
            resources, similar to "duration". In other words, it
            should be available for same origin as well as cross
            origin resources as part of the PerformanceEntry
            <http://www.w3.org/TR/performance-timeline/#performanceentry>
            interface.


        Note that "blocking time" is a thing of the past for SPDY
        and HTTP/2, as this demo demonstrates really well:
        http://www.httpvshttps.com/

        I'm skeptical of the above definition: if you want "network
        duration", you should also exclude cache time; it's a
        computed metric that you can access today with TAO and a
        redundant one with http/2; if you really care about "network
        duration" you should probably decompose it further, but at
        that point it becomes a conversation about removing the TAO
        restriction.

        ig

        P.S. "networkDuration = dns + tcp + waiting + content" ...
        don't forget the https handshake!

        On Wed, Nov 26, 2014 at 9:01 AM, Patrick Meenan
        <pmee...@webpagetest.org <mailto:pmee...@webpagetest.org>>
        wrote:

            Would be great to see it either as a high-level duration
            or as an unblocking of the redirectStart time for
            cross-origin (though it may still not be clear to people
            that that is the time they really care about).

            I expect the current logic was the easiest and didn't
            require any privacy reviews because it's quite literally
            the exact same detail that you get if you do it manually
            in javascript by creating an element and listening to
            the onload.  Even if the more-granular detail doesn't
            really expose anything you couldn't figure out before it
            does provide additional detail that wouldn't otherwise
            be measurable and is probably going to require reviews
            by privacy and security teams.

            On Wed, Nov 26, 2014 at 9:36 AM, Peter Lepeska
            <bizzbys...@gmail.com <mailto:bizzbys...@gmail.com>> wrote:

                +1

                On Tue, Nov 25, 2014 at 12:31 PM, Nic Jansma
                <n...@nicj.net <mailto:n...@nicj.net>> wrote:

                    Good point!  Hadn't considered that, so yes I
                    would agree it's a very valuable addition to
                    consider.

                    As far as what interface to put it on, I'm not
                    sure networkDuration would make sense for
                    UserTiming, for example. While it could sit on
                    PerformanceEntry and just be "0" for interfaces
                    that aren't applicable, we could also create a
                    PerformanceNetworkEntry interface (with
                    networkDuration) that PerformanceResourceTiming
                    inherits from, while PerformanceUserTiming only
                    inherits from PerformanceEntry.

                    That's all minor details though. Really depends
                    on the browser privacy teams OK'ing the addition.

                    - Nic
                    http://nicj.net/
                    @NicJ

                    On 11/25/2014 12:16 PM, Steve Souders wrote:

                    Nic -

                    You can *not* calculate networkDuration from
                    other attributes for *cross-origin* resources.
                    That's why I'm suggesting adding this to
                    PerformanceEntry (rather than
                    PerformanceResourceTiming).

                    And as mentioned, about 50% of resources are
                    cross-origin so it's important to provide a
                    means for *accurate* download time measurements.

                    -Steve


                    On 11/25/14, 8:02 AM, Nic Jansma wrote:

                    Steve,

                    The only downside I see is that we're adding a
                    new attribute that can be entirely calculated
                    via other attributes.

                    One alternate (or additional thing) would be
                    to highlight this point in the description for
                    "duration" in the spec.
                    - Nic
                    http://nicj.net/
                    @NicJ
                    On 11/25/2014 3:04 AM, Yoav Weiss wrote:


                    On Tue, Nov 25, 2014 at 1:34 AM, Steve
                    Souders <st...@souders.org
                    <mailto:st...@souders.org>> wrote:

                        SHORT: I propose we add the
                        "networkDuration" property to
                        PerformanceEntry
                        <http://www.w3.org/TR/performance-timeline/#performanceentry>
                        objects.

                        LONG: A few weeks ago I discovered that
                        "duration" includes blocking time, so
                        "duration" is greater than the actual
                        network time needed to download the
                        resource. Since then I've been at
                        Velocity and WebPerfDays where many
                        people have shown their Resource Timing
                        code. Everyone I spoke to (~5 different
                        teams) assumed that "duration" was just
                        the network time. When I explained that it
                        also includes blocking they were
                        surprised, admitted they hadn't known
                        that, and agreed it is NOT the metric
                        they were trying to capture.

                        I propose we add a new property to
                        Resource Timing that reflects the time to
                        actually load the resource excluding
                        blocking time. I'm flexible about the
                        name but for purposes of this discussion
                        let's call it "networkDuration". The
                        important piece of this proposal is that
                        "networkDuration" should be available for
                        all resources, similar to "duration". In
                        other words, it should be available for
                        same origin as well as cross origin
                        resources as part of the PerformanceEntry
                        <http://www.w3.org/TR/performance-timeline/#performanceentry>
                        interface.

                        Same origin resources can calculate
                        "networkDuration" as follows (assume "r"
                        is a PerformanceResourceTiming
                        object):

                            dns = r.domainLookupEnd - r.domainLookupStart;
                            tcp = r.connectEnd - r.connectStart; // includes ssl negotiation
                            waiting = r.responseStart - r.requestStart; // aka "TTFB"
                            content = r.responseEnd - r.responseStart;
                            networkDuration = dns + tcp + waiting + content;
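
The same-origin calculation above can be wrapped in a function, shown here as a minimal sketch. Returning `null` when `requestStart` is 0 is an assumption about how restricted cross-origin entries look today, not part of the proposal, and the sample values are made up.

```javascript
// Sketch of the same-origin "networkDuration" calculation as a function.
// Returns null when the detailed attributes are zeroed out, which is assumed
// here to signal a cross-origin entry without a Timing-Allow-Origin header.
function networkDuration(r) {
  if (!r.requestStart) return null;
  const dns = r.domainLookupEnd - r.domainLookupStart;
  const tcp = r.connectEnd - r.connectStart; // includes ssl negotiation
  const waiting = r.responseStart - r.requestStart; // aka "TTFB"
  const content = r.responseEnd - r.responseStart;
  return dns + tcp + waiting + content;
}

// Made-up entries: one with full timing data, one restricted.
const full = {
  domainLookupStart: 12, domainLookupEnd: 20,
  connectStart: 20, connectEnd: 50,
  requestStart: 51, responseStart: 120, responseEnd: 150,
};
const restricted = {
  domainLookupStart: 0, domainLookupEnd: 0,
  connectStart: 0, connectEnd: 0,
  requestStart: 0, responseStart: 0, responseEnd: 310,
};
const fullResult = networkDuration(full);
const restrictedResult = networkDuration(restricted);
console.log(fullResult, restrictedResult);
```

The restricted case is exactly where a page has no way to compute the value itself, which is why the proposal asks for it as a property available on every entry.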

                        I've discussed this with a few people and
                        the only concern I've heard is with
                        regard to privacy along the lines of "if
                        we exclude blocking we've added the
                        ability to distinguish cache reads from
                        network fetches". This isn't an issue for
                        two reasons:

                         1. Even with the exclusion of blocking
                            time, it's still possible for
                            "networkDuration" to have a non-zero
                            value for resources read from cache
                            due to disk access time, etc.
                            Therefore, excluding blocking time
                            does not necessarily provide a clear
                            means of determining resources read
                            from cache.
                         2. This concern assumes that adding
                            "networkDuration" lessens privacy
                            because removing blocking time
                            provides additional information that
                            is not available today. However, it's
                            possible to exclude blocking time
                            today by loading a cross-origin
                            resource after window.onload, when
                            there is no blocking contention.

                        Therefore, individuals who have
                        JavaScript access to a page and can
                        measure duration also have enough access
                        to load resources after window.onload and
                        can thus determine the duration excluding
                        blocking time. Adding "networkDuration"
                        does not give these individuals
                        additional information beyond what is
                        measurable today.

                        What "networkDuration" provides is
                        additional information for the normal
                        case of resources that are loaded as part
                        of the main page when blocking contention
                        may occur. This will give current web
                        developers the metric they want for
                        cross-origin resources, and will provide
                        it more simply for same origin resources.


                    Assuming that the privacy concerns are in
                    fact non-existent, a big +1.
