Re: [whatwg] about:blank synchronicity

2010-01-27 Thread Ian Hickson
On Wed, 13 Jan 2010, Henri Sivonen wrote:

 This has turned out to be a test suite compatibility problem with 
 about:blank. Mozilla's Mochitest test suite has tests that depend on 
 about:blank in an iframe having a document.body immediately upon iframe 
 insertion into the document, without a trip through the event loop.
 
 At first look, this seems like a clear case: the spec says that 
 about:blank is navigated to synchronously. However, this is not what 
 Gecko does (with the old parser).

The parser isn't even invoked in this case in the spec, actually. Just 
creating an iframe and inserting it into the document synchronously 
creates a Document object with the right elements, without parsing anything.


 Gecko (with the old parser) has these two characteristics:
  1) If a browsing context that has no document object is asked to return 
 its document object, an about:blank-like DOM is generated into the 
 browsing context synchronously.

This is compatible with what the spec requires.


  2) When a browsing context is navigated to about:blank, a task is 
 posted to the task queue. When that task is run, about:blank is parsed 
 to completion during the single task queue task.

I've changed the spec to make actual navigation to about:blank async.


On Wed, 13 Jan 2010, Maciej Stachowiak wrote:

 I am not sure what the exact constraints are, but I believe the 
 following are required:
 
 - Accessing the document of a frame with missing, empty or about:blank 
 src has to always give you an HTML document with a body, even if there 
 hasn't been a chance for the event loop to run.
 - A newly created iframe with missing, empty or about:blank src has to have an
 accessible document right away, without even cycling the event loop.

I believe this is guaranteed in the spec, at least for newly created 
browsing contexts.


 There are at least three particular scenarios that are relevant here:
 
 1) Some sites document.write or otherwise poke at the DOM of their 
 about:blank frames or iframes in inline script, without waiting for the 
 load event or anything.
 
 2) Some sites load multiple frames, yet one expects to poke at the 
 other's DOM during its load. Since load order is not guaranteed, this 
 would be a race condition, if the not-yet-loaded frame had no DOM, but 
 synchronous about:blank lets such sites muddle on through. Before we had 
 sufficiently synchronous loading of the initial empty frame document, we 
 actually encountered sites like this that broke in Safari but not IE or 
 Firefox.
 
 3) Some sites make a new iframe element using DOM calls in an event 
 handler, and expect it to have an empty document that's immediately 
 ready for DOM manipulation, without any intervening returns to the event 
 loop.

Those should all work, since they all can access the initially created 
document (the one that doesn't involve a parser).
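
As a rough illustration of scenario 3, the check below creates an iframe with DOM calls and looks for a document body before control returns to the event loop. The function name hasImmediateBody is made up for this sketch. It relies only on createElement, appendChild, removeChild and contentDocument, and makes no claim about what any particular browser reports.

```javascript
// Sketch only: hasImmediateBody is a hypothetical helper, not a spec or browser API.
// It takes the host document as a parameter so it can be tried against any
// document-like object.
function hasImmediateBody(hostDocument) {
  const iframe = hostDocument.createElement("iframe"); // no src attribute
  hostDocument.body.appendChild(iframe);               // creates the browsing context

  // No return to the event loop has happened between insertion and this read.
  const doc = iframe.contentDocument;
  const ready = doc != null && doc.body != null;

  hostDocument.body.removeChild(iframe);               // clean up the probe frame
  return ready;
}

// Only runs where a real DOM is available (for example, in a browser page).
if (typeof document !== "undefined" && document.body) {
  console.log("initial document has a body:", hasImmediateBody(document));
}
```

Passing the document in, rather than reading the global, keeps the helper usable from a parent page that wants to probe a different window's document.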


On Wed, 13 Jan 2010, Boris Zbarsky wrote:
 On 1/13/10 11:52 AM, Maciej Stachowiak wrote:
  Question: if you generate a document on the fly via early access, does
  it get replaced when the about:blank task actually completes?
 
 Yes.  More precisely, the document is replaced, but the inner window is 
 not (the latter required for pages that set variables on the window 
 before the load is complete).

This is in fact required by the spec, too:

   
http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#create-a-document-object


 I believe Gecko treats a document.write the same way it treats a 
 location set in terms of network traffic: any loads that are happening 
 in that navigation context are canceled.  This is not specific to 
 pending about:blank loads.  For example, if you insert an iframe with 
 some http URI as @src into the DOM and then document.write into it 
 immediately, then the http load will be canceled.  Nothing special about 
 about:blank here.

This is currently step 8 of the document.open() algorithm.


 <!DOCTYPE html>
 <html>
   <head>
     <script>
       function doTheTest() {
         alert(window.frames[0].document.documentElement.textContent);
       }
     </script>
   </head>
   <body onload="doTheTest()">
     <iframe src=""></iframe>
     <script>
       var doc = window.frames[0].document;
       doc.documentElement.appendChild(doc.createTextNode("foopy"));
     </script>
   </body>
 </html>
 
 This alerts empty string in Gecko (and doesn't show the string "foopy" 
 in the iframe).

The "" URL resolves to the same as "./", which in IE (though no other 
browsers) means loading up an actual page. Currently the spec agrees with 
IE on this, though there is an open issue about whether to change this 
that I haven't looked at yet.

If we have src="about:blank", though, the spec says that the iframe gets 
a Document object with its body created synchronously, and the 
navigation is never done; the DOM manipulation is thus persistent (and the 
test alerts "foopy").


On Thu, 14 Jan 2010, Henri Sivonen wrote:
 
 Which leads to the question: Are there known Web compat constraints on 
 

Re: [whatwg] about:blank synchronicity

2010-01-27 Thread Boris Zbarsky

On 1/27/10 3:53 AM, Ian Hickson wrote:

Would it have other problems? Are there cases other than navigation
where about:blank being synchronous is detectable? (I couldn't find
any.)


I'm not sure what you're asking here...


I mean, like, does it matter if about:blank is synchronous in <img
src="">, or in CSS in a url(), or something like that?


Oh.  I don't think about:blank should be special in any of those 
contexts, no.


-Boris



Re: [whatwg] about:blank synchronicity

2010-01-20 Thread Henri Sivonen
On Jan 15, 2010, at 12:05, Henri Sivonen wrote:

 I've located a Mozilla test case that seems to depend on the event loop task 
 mapping of data: URL loads 
 (http://mxr.mozilla.org/mozilla-central/source/layout/base/tests/chrome/test_bug533845.xul).

This was most likely a misdiagnosis.

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/




Re: [whatwg] about:blank synchronicity

2010-01-18 Thread Boris Zbarsky

On 1/15/10 5:05 AM, Henri Sivonen wrote:

I've located a Mozilla test case that seems to depend on the event loop task 
mapping of data: URL loads 
(http://mxr.mozilla.org/mozilla-central/source/layout/base/tests/chrome/test_bug533845.xul).


Er... it does?  Where?


Does anyone happen to have data on whether the Web already depends on data: 
URLs that don't block the parser loading as a single event loop task?


I don't think the web depends on data: URLs at all, really, so I would 
guess no.


-Boris


Re: [whatwg] about:blank synchronicity

2010-01-18 Thread Boris Zbarsky

On 1/13/10 4:56 PM, Ian Hickson wrote:

The spec currently distinguishes between the initial about:blank load
(creation of a new browsing context), which actually doesn't involve
navigation, and navigating to about:blank.

It seems like simply making the first one synchronous, but making the
latter asynchronous, would satisfy your use case. Would other vendors be
ok with this?


In case it wasn't clear from the relevant Gecko thread, I would 
personally be fine with this.  That said, would "initial about:blank 
load" only include <iframe/> (no src at all), or also <iframe src=""/> 
or also <iframe src="about:blank"/>?  I suspect it doesn't matter that 
much, actually, but would welcome confirmation.



Would it have other problems? Are there cases other than navigation where
about:blank being synchronous is detectable? (I couldn't find any.)


I'm not sure what you're asking here...

-Boris


Re: [whatwg] about:blank synchronicity

2010-01-18 Thread Ian Hickson
On Mon, 18 Jan 2010, Boris Zbarsky wrote:
 On 1/13/10 4:56 PM, Ian Hickson wrote:
  The spec currently distinguishes between the initial about:blank load
  (creation of a new browsing context), which actually doesn't involve
  navigation, and navigating to about:blank.
  
  It seems like simply making the first one synchronous, but making the
  latter asynchronous, would satisfy your use case. Would other vendors be
  ok with this?
 
 In case it wasn't clear from the relevant Gecko thread, I would personally be
 fine with this.  That said, would "initial about:blank load" only include
 <iframe/> (no src at all), or also <iframe src=""/> or also <iframe
 src="about:blank"/>?  I suspect it doesn't matter that much, actually, but
 would welcome confirmation.

It would include any browsing context creation, including, e.g. 
window.open(), <object> pointing to an HTML file before the HTML file is 
loaded, etc.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] about:blank synchronicity

2010-01-18 Thread Boris Zbarsky

On 1/18/10 6:02 PM, Ian Hickson wrote:

In case it wasn't clear from the relevant Gecko thread, I would personally be
fine with this.  That said, would "initial about:blank load" only include
<iframe/> (no src at all), or also <iframe src=""/> or also <iframe
src="about:blank"/>?  I suspect it doesn't matter that much, actually, but
would welcome confirmation.


It would include any browsing context creation, including, e.g.
window.open(), <object> pointing to an HTML file before the HTML file is
loaded, etc.


That wasn't quite my question.

If I have an <iframe src="about:blank"/> in my source, would there be a 
sync about:blank document creation followed by an about:blank load?  Or 
would the @src value just get ignored if it's about:blank?


-Boris


Re: [whatwg] about:blank synchronicity

2010-01-15 Thread Henri Sivonen
I've located a Mozilla test case that seems to depend on the event loop task 
mapping of data: URL loads 
(http://mxr.mozilla.org/mozilla-central/source/layout/base/tests/chrome/test_bug533845.xul).

Does anyone happen to have data on whether the Web already depends on data: 
URLs that don't block the parser loading as a single event loop task?

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/




Re: [whatwg] about:blank synchronicity

2010-01-14 Thread Henri Sivonen
On Jan 13, 2010, at 19:08, Boris Zbarsky wrote:

 On 1/13/10 11:52 AM, Maciej Stachowiak wrote:
 It seems like if Gecko truly wanted to make about:blank synchronous, it
 should be possible simply by special-casing its load and performing a
 series of DOM calls right then and there, without ever involving the
 parser.
 
 Turns out this actually breaks at least some things that expect 
 (asynchronous) onload events and the like for the about:blank load, at least 
 when Henri tried doing exactly that.  I _think_ this was for cases where an 
 explicit about:blank was listed as the src.

I did it after absent or empty src had been defaulted to about:blank: so empty, 
absent and explicit about:blank were all covered. Also, I did it for all 
browsing contexts--not just iframes.

The most obvious test case that broke was testing history navigation in a 
top-level browsing context (i.e. created in XUL--not as an iframe).

It is plausible that my attempt to fix was too naïve and additional tweaking of 
the events could work. (It is indeed very likely that my attempted fix was too 
naïve.) Also, making the change only for frames but not for top-level browsing 
contexts might be worth considering if changing this for top-level browsing 
contexts is too disruptive.

Which leads to the question: Are there known Web compat constraints on 
navigating a non-framed browsing context to about:blank via window.open() or 
window.location.href in a previously open()ed window?

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/




[whatwg] about:blank synchronicity

2010-01-13 Thread Henri Sivonen
The HTML5 parser in Gecko loads all streams very asynchronously. That is, 
loading a stream never finishes from the same event queue task that starts the 
load. This is fine for loading HTTP streams, since the general expectation is 
that the process of loading something from the network makes multiple trips 
through the event loop.

This has turned out to be a test suite compatibility problem with about:blank. 
Mozilla's Mochitest test suite has tests that depend on about:blank in an iframe 
having a document.body immediately upon iframe insertion into the document, 
without a trip through the event loop.

At first look, this seems like a clear case: the spec says that about:blank is 
navigated to synchronously. However, this is not what Gecko does (with the old 
parser).

Gecko (with the old parser) has these two characteristics:
 1) If a browsing context that has no document object is asked to return its 
document object, an about:blank-like DOM is generated into the browsing context 
synchronously.
 2) When a browsing context is navigated to about:blank, a task is posted to 
the task queue. When that task is run, about:blank is parsed to completion 
during the single task queue task.

As a result, in Gecko (with the old parser enabled), asking for document.body 
of an iframe never returns null even if navigation to about:blank isn't 
complete. If the navigation hasn't completed yet, a body element generated by 
#1 above is returned. If navigation has completed, a body element generated by 
#2 above is returned. Since #2 happens as a single task, it's never possible to 
see a browsing context that is being navigated to about:blank in an 
intermediate state of the parse. (The HTML5 parser breaks this by making the 
state where the document object has been created but nothing has been tokenized 
yet observable.)
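
The two characteristics above can be mimicked with a toy model. This is only a sketch of the described behavior, not Gecko code: taskQueue, runEventLoop, makeBlankDocument and BrowsingContext are names invented here, and a plain object stands in for the about:blank DOM.

```javascript
// Toy model only: none of these names are Gecko or spec APIs.
const taskQueue = [];

function runEventLoop() {
  // Run queued tasks one at a time until the queue is empty.
  while (taskQueue.length > 0) {
    const task = taskQueue.shift();
    task();
  }
}

function makeBlankDocument(origin) {
  // Stand-in for an about:blank DOM that already has a body element.
  return { origin, body: { tagName: "BODY" } };
}

class BrowsingContext {
  constructor() {
    this.doc = null; // no document object yet
  }

  // Characteristic 1: asking for the document synthesizes one synchronously.
  get document() {
    if (this.doc === null) {
      this.doc = makeBlankDocument("synthesized-on-access");
    }
    return this.doc;
  }

  // Characteristic 2: navigation posts a task, and the whole parse happens
  // inside that single task, so no half-parsed state is observable.
  navigateToAboutBlank() {
    taskQueue.push(() => {
      this.doc = makeBlankDocument("parsed-in-one-task");
    });
  }
}

const frame = new BrowsingContext();
frame.navigateToAboutBlank();

const before = frame.document; // the navigation task has not run yet
runEventLoop();
const after = frame.document;  // the navigation task has completed

console.log(before.origin, before.body !== null); // synthesized-on-access true
console.log(after.origin, after.body !== null);   // parsed-in-one-task true
```

In this model document.body is never null, whichever document is current, which is the property the Mochitest tests rely on.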

Now, consider the following demo:
http://hsivonen.iki.fi/test/bz-about-blank-data.html

This makes it look like Opera and Safari were doing what the spec says and 
navigating the iframe synchronously to about:blank. (The use of the data: URL 
scheme makes the demo not work in IE.)

However, if the data: URL is changed to an http: URL, Safari no longer appears 
to navigate to about:blank synchronously:
http://hsivonen.iki.fi/test/bz-about-blank.html

Let's take a more careful look:
http://hsivonen.iki.fi/test/bz-about-blank-check-body.html

Opera indeed navigates to about:blank synchronously.

IE doesn't support window.stop, so let's try testing without it:
http://hsivonen.iki.fi/test/bz-about-blank-check-body-no-stop.html

Neither IE 7 nor IE 8 appears to actually navigate synchronously.

So it appears that only Opera is doing what the spec requires. Since IE, 
Firefox and Safari aren't doing what the spec requires, exactly what the spec 
requires can't be necessary for Web compat.

What's the actual Web compat constraint when it comes to navigating to 
about:blank (including loading about:blank as the initial page into a 
newly-inserted iframe)?

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/




Re: [whatwg] about:blank synchronicity

2010-01-13 Thread Maciej Stachowiak


On Jan 13, 2010, at 5:22 AM, Henri Sivonen wrote:

The HTML5 parser in Gecko loads all streams very asynchronously.
That is, loading a stream never finishes from the same event
queue task that starts the load. This is fine for loading HTTP
streams, since the general expectation is that the process of
loading something from the network makes multiple trips through the
event loop.


This has turned out to be a test suite compatibility problem with
about:blank. Mozilla's Mochitest test suite has tests that depend on
about:blank in an iframe having a document.body immediately upon iframe
insertion into the document, without a trip through the event loop.


At first look, this seems like a clear case: the spec says that
about:blank is navigated to synchronously. However, this is not what
Gecko does (with the old parser).


Gecko (with the old parser) has these two characteristics:
1) If a browsing context that has no document object is asked to  
return its document object, an about:blank-like DOM is generated  
into the browsing context synchronously.
2) When a browsing context is navigated to about:blank, a task is  
posted to the task queue. When that task is run, about:blank is  
parsed to completion during the single task queue task.


As a result, in Gecko (with the old parser enabled), asking for
document.body of an iframe never returns null even if navigation to
about:blank isn't complete. If the navigation hasn't completed yet, a
body element generated by #1 above is returned. If navigation has
completed, a body element generated by #2 above is returned. Since
#2 happens as a single task, it's never possible to see a browsing
context that is being navigated to about:blank in an intermediate
state of the parse. (The HTML5 parser breaks this by making the
state where the document object has been created but nothing has been
tokenized yet observable.)


Question: if you generate a document on the fly via early access, does  
it get replaced when the about:blank task actually completes?


It seems like if Gecko truly wanted to make about:blank synchronous,  
it should be possible simply by special-casing its load and performing  
a series of DOM calls right then and there, without ever involving the  
parser.




Now, consider the following demo:
http://hsivonen.iki.fi/test/bz-about-blank-data.html

This makes it look like Opera and Safari were doing what the spec  
says and navigating the iframe synchronously to about:blank. (The  
use of the data: URL scheme makes the demo not work in IE.)


However, if the data: URL is changed to an http: URL, Safari no  
longer appears to navigate to about:blank synchronously:

http://hsivonen.iki.fi/test/bz-about-blank.html


I think your test case demonstrates something that we would consider a  
bug. Though I am not sure what exactly is happening internally that  
causes it. We certainly make our best effort to load about:blank  
synchronously, though there may be unusual circumstances where that  
doesn't happen.




Let's take a more careful look:
http://hsivonen.iki.fi/test/bz-about-blank-check-body.html

Opera indeed navigates to about:blank synchronously.

IE doesn't support window.stop, so let's try testing without it:
http://hsivonen.iki.fi/test/bz-about-blank-check-body-no-stop.html

Neither IE 7 nor IE 8 appears to actually navigate synchronously.


So it appears that only Opera is doing what the spec requires. Since
IE, Firefox and Safari aren't doing what the spec requires, exactly what
the spec requires can't be necessary for Web compat.


What's the actual Web compat constraint when it comes to navigating  
to about:blank (including loading about:blank as the initial page  
into a newly-inserted iframe)?


I am not sure what the exact constraints are, but I believe the  
following are required:


- Accessing the document of a frame with missing, empty or about:blank  
src has to always give you an HTML document with a body, even if there  
hasn't been a chance for the event loop to run.
- A newly created iframe with missing, empty or about:blank src has to  
have an accessible document right away, without even cycling the event  
loop.


There are at least three particular scenarios that are relevant here:

1) Some sites document.write or otherwise poke at the DOM of their
about:blank frames or iframes in inline script, without waiting for the
load event or anything.


2) Some sites load multiple frames, yet one expects to poke at the  
other's DOM during its load. Since load order is not guaranteed, this  
would be a race condition, if the not-yet-loaded frame had no DOM, but  
synchronous about:blank lets such sites muddle on through. Before we  
had sufficiently synchronous loading of the initial empty frame  
document, we actually encountered sites like this that broke in Safari  
but not IE or Firefox.


3) Some sites make a new iframe element using DOM calls in an event  
handler, and expect it to have 

Re: [whatwg] about:blank synchronicity

2010-01-13 Thread Boris Zbarsky

On 1/13/10 11:52 AM, Maciej Stachowiak wrote:

Question: if you generate a document on the fly via early access, does
it get replaced when the about:blank task actually completes?


Yes.  More precisely, the document is replaced, but the inner window is 
not (the latter required for pages that set variables on the window 
before the load is complete).
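
A toy sketch of that replacement, under the assumption that plain objects can stand in for the inner window and its documents (createFrameWindow and completeAboutBlankLoad are invented names, not Gecko internals):

```javascript
// Toy model only: plain objects stand in for the inner window and documents.
function createFrameWindow() {
  // The early-access document is attached to a long-lived inner window.
  return { document: { kind: "early-access", body: {} } };
}

function completeAboutBlankLoad(innerWindow) {
  // The load swaps in a fresh document but keeps the same window object.
  innerWindow.document = { kind: "loaded", body: {} };
}

const innerWindow = createFrameWindow();

// A page pokes at the frame before the about:blank load completes.
innerWindow.pageVariable = 42;             // set on the window
innerWindow.document.marker = "foopy";     // set on the early document

completeAboutBlankLoad(innerWindow);

console.log(innerWindow.pageVariable);     // 42: the window, and its variables, survive
console.log(innerWindow.document.marker);  // undefined: the early document was replaced
```

In this model, state stored on the window is kept while changes made to the early document are lost.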



It seems like if Gecko truly wanted to make about:blank synchronous, it
should be possible simply by special-casing its load and performing a
series of DOM calls right then and there, without ever involving the
parser.


Turns out this actually breaks at least some things that expect 
(asynchronous) onload events and the like for the about:blank load, at 
least when Henri tried doing exactly that.  I _think_ this was for cases 
where an explicit about:blank was listed as the src.  There is probably 
also various non-web code that expects various network events (start of 
network load, end of network load, etc) and the like in this 
circumstance...  We could try to hunt it all down and change it, but 
it's not exactly a trivial endeavor.  If it's necessary, it's necessary, 
of course.



- Accessing the document of a frame with missing, empty or about:blank
src has to always give you an HTML document with a body, even if there
hasn't been a chance for the event loop to run.


Agreed.  This is an absolute requirement.


- A newly created iframe with missing, empty or about:blank src has to
have an accessible document right away, without even cycling the event
loop.


Yes.


1) Some sites document.write or otherwise poke at the DOM of their
about:blank frames or iframes in inline script, without waiting for the
load event or anything.


Yep.  I believe Gecko treats a document.write the same way it treats a 
location set in terms of network traffic: any loads that are happening 
in that navigation context are canceled.  This is not specific to 
pending about:blank loads.  For example, if you insert an iframe with 
some http URI as @src into the DOM and then document.write into it 
immediately, then the http load will be canceled.  Nothing special about 
about:blank here.
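
The cancellation behavior described here can be sketched as a toy model. ToyFrame, its methods, and the example.com URL are hypothetical and exist only for this illustration. The model captures just the ordering: a write that happens before the load's task runs cancels that load.

```javascript
// Toy model only: ToyFrame and its methods are hypothetical, not browser APIs.
const tasks = [];

class ToyFrame {
  constructor() {
    this.pendingLoad = null;
    this.content = "";
  }

  // Start an asynchronous load; its result is applied in a later task.
  navigate(url) {
    this.cancelPendingLoad();
    const load = { url, cancelled: false };
    this.pendingLoad = load;
    tasks.push(() => {
      if (load.cancelled) return; // a cancelled load never replaces the content
      this.content = "loaded:" + url;
      this.pendingLoad = null;
    });
  }

  cancelPendingLoad() {
    if (this.pendingLoad !== null) {
      this.pendingLoad.cancelled = true;
      this.pendingLoad = null;
    }
  }

  // Like a location set, writing into the frame cancels whatever was loading.
  documentWrite(text) {
    this.cancelPendingLoad();
    this.content += text;
  }
}

function drainTasks() {
  while (tasks.length > 0) tasks.shift()();
}

const written = new ToyFrame();
written.navigate("http://example.com/");
written.documentWrite("hello"); // happens before the load task runs
drainTasks();

const untouched = new ToyFrame();
untouched.navigate("http://example.com/");
drainTasks();

console.log(written.content);   // hello
console.log(untouched.content); // loaded:http://example.com/
```

Nothing in the model depends on the URL scheme, which matches the point that about:blank gets no special treatment here.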



2) Some sites load multiple frames, yet one expects to poke at the
other's DOM during its load. Since load order is not guaranteed, this
would be a race condition, if the not-yet-loaded frame had no DOM, but
synchronous about:blank lets such sites muddle on through. Before we had
sufficiently synchronous loading of the initial empty frame document, we
actually encountered sites like this that broke in Safari but not IE or
Firefox.

3) Some sites make a new iframe element using DOM calls in an event
handler, and expect it to have an empty document that's immediately
ready for DOM manipulation, without any intervening returns to the event
loop.


Fully agreed on these use cases.

One question is whether in case 3 the sites expect the same DOM to be 
available later on.  It seems to be in Safari and Opera.  It's not in 
Gecko at the moment, as expected based on code inspection.   Testcase:


<!DOCTYPE html>
<html>
  <head>
    <script>
      function doTheTest() {
        alert(window.frames[0].document.documentElement.textContent);
      }
    </script>
  </head>
  <body onload="doTheTest()">
    <iframe src=""></iframe>
    <script>
      var doc = window.frames[0].document;
      doc.documentElement.appendChild(doc.createTextNode("foopy"));
    </script>
  </body>
</html>

This alerts empty string in Gecko (and doesn't show the string "foopy" 
in the iframe).


-Boris


Re: [whatwg] about:blank synchronicity

2010-01-13 Thread Ian Hickson
On Wed, 13 Jan 2010, Henri Sivonen wrote:
 
 Gecko (with the old parser) has these two characteristics:
  1) If a browsing context that has no document object is asked to return 
 its document object, an about:blank-like DOM is generated into the 
 browsing context synchronously.
  2) When a browsing context is navigated to about:blank, a task is 
 posted to the task queue. When that task is run, about:blank is parsed 
 to completion during the single task queue task.

The spec currently distinguishes between the initial about:blank load 
(creation of a new browsing context), which actually doesn't involve 
navigation, and navigating to about:blank.

It seems like simply making the first one synchronous, but making the 
latter asynchronous, would satisfy your use case. Would other vendors be 
ok with this?

Would it have other problems? Are there cases other than navigation where 
about:blank being synchronous is detectable? (I couldn't find any.)

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'