On Thu, 08 Oct 2009 12:10:01 +0200, Robert O'Callahan
<[email protected]> wrote:
http://www.whatwg.org/specs/web-apps/current-work/#loading-the-media-resource
In the resource fetch algorithm, after we reach the NETWORK_LOADED state in
step 3 which indicates that all the data we need to play the resource is now
available locally, we end the resource fetch algorithm. However, in Gecko we
have a media cache which might discard blocks of media data after we've
reached the NETWORK_LOADED state (to make room for data for other loading
resources). This means we might have to start fetching the resource again
later. The spec does not seem to allow for this. Do we need to change our
behavior, or does the spec need to change to accommodate our behavior?
I'd prefer not to change our behavior since I think to follow the spec we'd
need to pin the entire resource permanently in the cache after we reached
NETWORK_LOADED, which could be highly suboptimal in some situations.
The spec notes that "Some resources, e.g. streaming Web radio, can never
reach the NETWORK_LOADED state." In my understanding, you mustn't go to
NETWORK_LOADED if you can't guarantee that the resource will remain in
cache. Browsers with clever caching or small caches simply won't send a
load event most of the time.
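To make that reading concrete, here is a minimal sketch of the rule, assuming a block-based cache. The class MediaCacheModel, its methods, and the canPin flag are all hypothetical names made up for illustration; this is neither Gecko's media cache nor the spec's resource fetch algorithm.

```typescript
// Hypothetical model for illustration only; not Gecko's media cache and not
// the spec's resource fetch algorithm.
type NetworkState = "NETWORK_LOADING" | "NETWORK_LOADED";

class MediaCacheModel {
  private readonly cached = new Set<number>();
  private readonly totalBlocks: number;
  private readonly canPin: boolean;

  // canPin is an assumed flag: true if the cache can promise to keep the
  // whole resource once it has been fully downloaded.
  constructor(totalBlocks: number, canPin: boolean) {
    this.totalBlocks = totalBlocks;
    this.canPin = canPin;
  }

  // Record that a block of media data has arrived from the network.
  store(block: number): void {
    if (!Number.isInteger(block) || block < 0 || block >= this.totalBlocks) {
      throw new RangeError(`block ${block} is outside 0..${this.totalBlocks - 1}`);
    }
    this.cached.add(block);
  }

  // Try to discard a block to make room for another resource.
  // Returns true if the block was actually discarded.
  evict(block: number): boolean {
    // A pinning cache refuses to drop data once the resource is complete.
    if (this.canPin && this.isComplete()) return false;
    return this.cached.delete(block);
  }

  isComplete(): boolean {
    return this.cached.size === this.totalBlocks;
  }

  // Only report NETWORK_LOADED when the data is complete AND guaranteed to
  // stay; otherwise stay in NETWORK_LOADING so that no load event is sent.
  networkState(): NetworkState {
    return this.canPin && this.isComplete() ? "NETWORK_LOADED" : "NETWORK_LOADING";
  }
}
```

Under this model, a cache that may discard blocks later never reports NETWORK_LOADED, even at the moment when every block happens to be present, so a later refetch does not contradict anything the page was told.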
Another issue is that it's not completely clear to me what is meant by
"While the user agent might still need network access to obtain parts of
the media resource <http://www.whatwg.org/specs/web-apps/current-work/#media-resource>..."
What if there is data in the resource that we don't need in order to
play through normally, but which might be needed in some special situations
(e.g., enabling subtitles, or seeking using an index), and we optimize to
not load that data unless/until we need it? In that case would we never
reach NETWORK_LOADED?
As I understand it, NETWORK_LOADED means that all bytes of the resource
have been loaded, regardless of whether they will be used or not. Are
there any formats that would actually allow not downloading parts of the
resource in a meaningful way? Subtitles and indexes are too small to
bother, and multiplexed audio/video tracks can hardly be skipped without
zillions of HTTP Range requests. It seems to me that kind of thing would
have to be done either with a server side media fragment request (using
the 'track' dimension) or with an external audio/video track somehow
synced to the master track (much like external subtitles).
In general NETWORK_LOADED and the "load" event seem rather useless and
dangerous IMHO. If you're playing a resource that doesn't fit in your cache
then you'll certainly never reach NETWORK_LOADED, and since authors can't
know the cache size they can never rely on "load" firing. And if you allow
the cache discarding behavior I described above, authors can't rely on data
actually being present locally even after "load" has fired. I suspect many
authors will make invalid assumptions about "load" being sure to fire and
about what "load" means if it does fire. Does anyone have any use cases that
"load" actually solves?
I agree, sites that depend on the load event will likely break
randomly for file sizes that usually barely fit into the cache of the
browser they were tested with. However, if browsers are conservative with
bandwidth and only send the load event when the resource really is fully
loaded, I think we will have less of a problem. Note that the load event
isn't strictly needed; waiting for a progress event with loaded==total
would achieve the same thing.
Aesthetically, however, I think it would be strange to not have the load
event.
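As a rough sketch of the loaded==total check: the ProgressLike shape below is an assumption modeled on the loaded, total and lengthComputable fields of a progress event, and isFullyLoaded is a hypothetical helper name, not anything defined by the spec.

```typescript
// Assumed shape: the fields of a progress event that the check needs.
interface ProgressLike {
  lengthComputable: boolean;
  loaded: number;
  total: number;
}

// Hypothetical helper: true only when the total size is known, non-zero,
// and every byte has been reported as loaded.
function isFullyLoaded(e: ProgressLike): boolean {
  return e.lengthComputable && e.total > 0 && e.loaded === e.total;
}
```

A page could call such a helper from its progress listener instead of waiting for load, but it would inherit the same caveats: the check says nothing about whether the data stays in the cache afterwards.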
--
Philip Jägenstedt
Core Developer
Opera Software