CC Hixie, question below.

On Tue, 10 Aug 2010 18:39:04 +0200, Boris Zbarsky <[email protected]> wrote:

On 8/10/10 4:40 AM, Philip Jägenstedt wrote:
Because the parser can't create a state which the algorithm doesn't
handle. It always first inserts the video element, then the source
elements in the order they should be evaluated. The algorithm is written
in such a way that the overall result is the same regardless of whether
it is invoked/continued on each inserted source element or after the
video element is closed.

Ah, the waiting state, etc?

Yes, in the case of the parser inserting source elements that fail one of the tests (no src, wrong type, wrong media) the algorithm will end up at step 6.21 waiting. It doesn't matter if all sources are available when the algorithm is first invoked or if they "trickle in", be that from the parser or from scripts.

Why does the algorithm not just reevaluate any sources after the newly-inserted source instead?

Because if a source failed after network access (404, wrong MIME, etc) then we'd have to perform that network access again and again for each modification. More on that below.

However, scripts can see the state at any point, which is why it needs to be the same in all browsers.

I'm not sure which "the state" you mean here.

For example, networkState can be NETWORK_NO_SOURCE, NETWORK_EMPTY or NETWORK_LOADING depending on which steps you've run. Silvia Pfeiffer found inconsistencies between browsers because of this; see <http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-July/027284.html>

It's quite serious because NETWORK_EMPTY is used as a condition in many places in the spec, so this absolutely must be consistent between browsers.

Because changes to the set of <source> elements do not restart the
resource selection algorithm, right? Why don't they, exactly? That
seems broken to me, from the POV of how the rest of the DOM generally
works (except as required by backward compatibility considerations)...

The resource selection is only started once, typically when the src
attribute is set (by parser or script) or when the first source element
is inserted. If it ends up in step 21 waiting, inserting another source
element may cause it to continue at step 22.

Right, ok.

Restarting the algorithm on any modification of source elements would
mean retrying sources that have previously failed due to network errors
or incorrect MIME type again and again, wasting network resources.
Instead, the algorithm just keeps its state and waits for more source
elements to try.

Well, the problem is that it introduces hysteresis into the DOM. Why is this a smaller consideration than the other, in the edge case when someone inserts sources in reverse order and "slowly" (off the event loop)?

The algorithm has been very stateful since I first implemented it and I always considered the sync/async split to be precisely for that reason, to be more tolerant of the order of DOM modification. I'll have to let Hixie answer why this specific trade-off was made.

That is, why do we only consider sources inserted after the |pointer| instead of all newly inserted sources?

Otherwise the pointer could potentially reach the same source element twice, with the aforementioned problems with failing after network access.
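To make the forward-only pointer concrete, here's a toy model (plain JS, no DOM; all names are made up and this is not the spec's exact algorithm): each candidate is "fetched" at most once, even as sources trickle in, and once a source is chosen later insertions do nothing.

```javascript
// Illustrative sketch of a forward-only source pointer. canPlay stands in
// for the type/media checks plus the network fetch; it is called at most
// once per candidate, which is the whole point being discussed.
function createSelector(canPlay) {
  const sources = [];
  let pointer = 0;     // index of the next candidate to try; never moves back
  const attempts = []; // record of "network accesses", for illustration
  let chosen = null;

  function run() {
    while (chosen === null && pointer < sources.length) {
      const src = sources[pointer++]; // this candidate is consumed forever
      attempts.push(src);             // the one-time "network access"
      if (canPlay(src)) chosen = src;
    }
    // otherwise: "wait" (step 6.21) until another source is appended
  }

  return {
    append(src) { sources.push(src); run(); }, // continue at step 6.22
    get attempts() { return attempts.slice(); },
    get chosen() { return chosen; },
  };
}
```

Appending a failing source, then a working one, then another working one results in exactly two attempts: the failure is tried once, and nothing after the chosen source is touched.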

I'm not sure what you mean by hysteresis

http://en.wikipedia.org/wiki/Hysteresis

Specifically, that the state of the page depends not only on the current state of the DOM but also on the path in state space that the page took to get there.

Or in other words, that inserting two <source> elements does different things depending on whether you do "appendChild(a); appendChild(b)" or "appendChild(b); insertBefore(a, b)", even though the resulting DOM is exactly the same.

Or in your case, the fact that the ordering of the setAttribute and insertChild calls matters, say.

Such situations, which introduce order-dependency on DOM operations, are wonderful sources of frustration for web developers, especially if libraries that abstract away the DOM manipulation are involved (so the web developer can't even change the operation order).
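A toy model of the order-dependency in question (plain JS, no DOM, invented names; the pointer-between-nodes behaviour is modelled after the spec but simplified): two operation sequences that produce the identical final list can end with different selections, because a node inserted before the pointer is never evaluated.

```javascript
// Sketch: replay a sequence of append/insertBefore operations against a
// forward-only pointer and report what (if anything) got selected.
function runOps(ops, canPlay) {
  const list = [];
  let pointer = 0;  // sits "between" elements, as in the spec
  let chosen = null;

  const evaluate = () => {
    while (chosen === null && pointer < list.length) {
      const src = list[pointer++];  // candidate is consumed forever
      if (canPlay(src)) chosen = src;
    }
  };

  for (const op of ops) {
    if (op.insertBefore !== undefined) {
      const i = list.indexOf(op.insertBefore);
      list.splice(i, 0, op.src);
      if (i < pointer) pointer++;   // pointer stays after the new node
    } else {
      list.push(op.src);
    }
    evaluate();
  }
  return { list, chosen };
}
```

With a canPlay that accepts only "a": "appendChild(a); appendChild(b)" selects a, while "appendChild(b); insertBefore(a, b)" selects nothing at all, yet both leave the list as [a, b]. That path-dependence is the hysteresis.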

OK, perhaps I should take this more seriously. Making the whole algorithm synchronous probably isn't a brilliant idea unless we can also do away with all of the state it keeps (i.e. hysteresis).

One way would be to introduce a magic flag on all source elements to indicate that they have already failed. This would be cleared whenever src, type or media is modified. Another is to cache 404 responses and the MIME types of rejected resources, but I think that's a bit overkill. Do you have any specific ideas?
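The magic-flag idea might look something like this (a hedged sketch, not a proposal text; class and function names are invented): each source remembers that it failed, modifying src/type/media clears the flag, and a hypothetically restartable selection skips flagged sources so known failures are never re-fetched.

```javascript
// Sketch of the "already failed" flag floated above. The flag prevents
// re-fetching a known failure; editing src, type or media clears it.
class SourceCandidate {
  constructor(attrs) {
    this.attrs = { ...attrs };
    this.failed = false;  // the magic flag
  }
  setAttribute(name, value) {
    if (['src', 'type', 'media'].includes(name)) this.failed = false;
    this.attrs[name] = value;
  }
}

// canPlay stands in for the type/media checks plus the network fetch.
function selectSource(candidates, canPlay) {
  for (const c of candidates) {
    if (c.failed) continue;            // never re-fetch a known failure
    if (canPlay(c.attrs)) return c;
    c.failed = true;                   // remember the failure
  }
  return null;                         // wait for more candidates / edits
}
```

Re-running the selection after a failure costs nothing (the flagged source is skipped), but fixing the source's type attribute makes it eligible again.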

I have a really hard time believing that you trigger resource
selection when the <video> is inserted into the document and don't
retrigger it afterward, given that... do you?

To the best of my knowledge we do exactly what the spec says, apart from the uncertainty regarding "await a stable state".

Resource selection is triggered by setting/modifying the src attribute or inserting a source element when networkState is NETWORK_EMPTY. Here's an annotated guide of exactly what happens in two cases:

<video src="video.webm">
<!-- resource selection triggered as src attribute was set by parser -->
</video>

<video>
<!-- resource selection not triggered yet -->
<source>
<!-- resource selection triggered, ends up waiting in step 6.21 due to missing src -->
<source src="video.mp4" type="video/mp4">
<!-- resource selection continues at step 6.22, but ends up waiting again in 6.21 as we don't support video/mp4 -->
<source src="video.webm" type="video/webm">
<!-- resource selection continues at step 6.22, calling resource fetch in step 6.9, potentially never returning -->
</video>

2. Instead of calling the resource fetch algorithm in step 5/9

There doesn't seem to be such a step...

3. In step 21, instead of waiting forever, just return and let inserting
a source element cause it to continue at step 22.

Again, the numbering seems to be off.

These are steps in the resource selection algorithm, not in the resource
fetch algorithm.

Yes.  Step 5 in the resource selection algorithm I see is:

   5. Queue a task to fire a simple event named loadstart at the media
      element.

It has no substeps.

Oops, steps 5/9/21 are substeps of step 6.

Mozilla is implementing this now. How are you interpreting "await a
stable state" when the resource selection algorithm is triggered by the
parser?

At the moment, given that we don't differentiate between "pause" and "spin the event loop" internally, it sounds like we plan to treat this as "wait until the next event runs from the event loop". This means we will treat an alert being up as being in a stable state; same for sync XHR, showModalDialog, etc. From the parser we will basically treat it as "run asynchronously".

Will the result be 100% predictable or depend on "random" things
like how much data the parser already has available from the network?

I don't know about "result". When the algorithm runs, exactly, will depend on the amount of data the parser parses before returning to the event loop. Does that affect "result"?

Yes, it sounds like it very much does, and would result in disasters like this:

<!doctype html>
<video src="video.webm"></video>
<!-- network packet boundary or lag? -->
<script>alert(document.querySelector('video').networkState)</script>

The result will be 0 (NETWORK_EMPTY) or 2 (NETWORK_LOADING) depending on whether or not the parser happened to return to the event loop before the script. The only way this would not be the case is if the event loop is spun before executing scripts, but I haven't found anything to that effect in the spec. I hope I'm wrong, of course.

--
Philip Jägenstedt
Core Developer
Opera Software
