On Thu, 08 Oct 2009 12:10:01 +0200, Robert O'Callahan <[email protected]> wrote:

http://www.whatwg.org/specs/web-apps/current-work/#loading-the-media-resource

In the resource fetch algorithm, after we reach the NETWORK_LOADED state in step 3 which indicates that all the data we need to play the resource is now available locally, we end the resource fetch algorithm. However, in Gecko we
have a media cache which might discard blocks of media data after we've
reached the NETWORK_LOADED state (to make room for data for other loading
resources). This means we might have to start fetching the resource again
later. The spec does not seem to allow for this. Do we need to change our
behavior, or does the spec need to change to accommodate our behavior? I'd prefer not to change our behavior since I think to follow the spec we'd need
to pin the entire resource permanently in the cache after we reached
NETWORK_LOADED, which could be highly suboptimal in some situations.

The spec notes that "Some resources, e.g. streaming Web radio, can never reach the NETWORK_LOADED state." In my understanding, you mustn't go to NETWORK_LOADED if you can't guarantee that the resource will remain in cache. Browsers with clever caching or small caches simply won't send a load event most of the time.
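To make that reading concrete, here is a minimal sketch of the conservative rule. The `CacheState` shape and the `mayEnterNetworkLoaded` function are hypothetical names made up for illustration; they come from neither the spec nor any browser's media cache.

```typescript
// Hypothetical model of a media cache; these names are assumptions,
// not part of the spec or of any real implementation.
interface CacheState {
  resourceBytes: number; // total size of the media resource
  cachedBytes: number;   // bytes of the resource currently held locally
  pinnableBytes: number; // bytes the cache can promise never to evict
}

// Conservative rule: only report NETWORK_LOADED when every byte is local
// AND the cache can guarantee that none of it will be discarded later.
function mayEnterNetworkLoaded(c: CacheState): boolean {
  const fullyCached = c.cachedBytes >= c.resourceBytes;
  const canStayPinned = c.resourceBytes <= c.pinnableBytes;
  return fullyCached && canStayPinned;
}
```

Under this rule a cache that evicts blocks freely would set `pinnableBytes` to 0 and never report NETWORK_LOADED, which matches the "won't send a load event most of the time" outcome.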

Another issue is that it's not completely clear to me what is meant by
"While the user agent might still need network access to obtain parts of the media resource<http://www.whatwg.org/specs/web-apps/current-work/#media-resource>..."
What if there is data in the resource that we don't need in order to
play through normally, but which might be needed in some special situations
(e.g., enabling subtitles, or seeking using an index), and we optimize to
not load that data unless/until we need it? In that case would we never
reach NETWORK_LOADED?

As I understand it, NETWORK_LOADED means that all bytes of the resource have been loaded, regardless of whether they will be used or not. Are there any formats that would actually allow not downloading parts of the resource in a meaningful way? Subtitles and indexes are too small to bother, and multiplexed audio/video tracks can hardly be skipped without zillions of HTTP Range requests. It seems to me that kind of thing would have to be done either with a server side media fragment request (using the 'track' dimension) or with an external audio/video track somehow synced to the master track (much like external subtitles).

In general NETWORK_LOADED and the "load" event seem rather useless and
dangerous IMHO. If you're playing a resource that doesn't fit in your cache
then you'll certainly never reach NETWORK_LOADED, and since authors can't
know the cache size they can never rely on "load" firing. And if you allow the cache discarding behavior I described above, authors can't rely on data actually being present locally even after "load" has fired. I suspect many
authors will make invalid assumptions about "load" being sure to fire and
about what "load" means if it does fire. Does anyone have any use cases that
"load" actually solves?

I agree, sites that depend on the load event will likely break randomly for file sizes that usually barely fit into the cache of the browser they were tested with. If browsers are conservative with bandwidth and only send the load event when it's true, I think we will have less of a problem, however. Note that the load event isn't strictly needed: waiting for a progress event with loaded==total would achieve the same thing. Aesthetically, however, I think it would be strange to not have the load event.
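As a rough sketch of the progress-event alternative, assuming the event exposes `lengthComputable`, `loaded` and `total` the way the generic ProgressEvent interface does (whether a given browser fills these in for media elements is an assumption here), the check could look like this. `ProgressLike` and `isFullyLoaded` are illustrative names only.

```typescript
// Only the fields this check reads; modelled on ProgressEvent.
interface ProgressLike {
  lengthComputable: boolean; // false when the total size is unknown
  loaded: number;            // bytes received so far
  total: number;             // total bytes expected
}

// True only when the total is known and every byte has arrived.
// A stream with no known length (e.g. Web radio) never satisfies this.
function isFullyLoaded(e: ProgressLike): boolean {
  return e.lengthComputable && e.total > 0 && e.loaded === e.total;
}

// Possible wiring in a page (not executed here):
//   video.addEventListener("progress", (e) => {
//     if (isFullyLoaded(e as unknown as ProgressLike)) { /* treat as loaded */ }
//   });
```

Note that this has the same weakness as the load event: it says the bytes arrived, not that they are still in the cache.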

--
Philip Jägenstedt
Core Developer
Opera Software
