Re: [whatwg] Navigation and history traversal issues

2012-09-18 Thread Justin Lebar
, certainly before the parser stops, the user agent must update
 the session history with the new page.  That invokes [2] update the
 session history with the new page, which invokes [3] Traverse the
 history to the new entry, which fires popstate in step 14.

 However, "After creating the Document object, but before any script
 execution" seems like it could happen before or after the body element
 has been parsed, so the alert may or may not happen.

 Yeah, this is an oversight as specced. Fixed.


 On Sun, 16 Sep 2012, Justin Lebar wrote:

 Suppose an attack page evil.html controls a separate frame F (e.g.
 evil.html frames F, evil.html opened F as a popup window, or vice
 versa).

 We discovered that if evil.html causes F to

   1. load a.html
   2. start loading b.html
   3. load a.html#h

 then step (3) cannot cancel the load of b.html.  That is, the final
 session history from this sequence must be either

   a.html  -- oldest
   a.html#h
   b.html  -- current

 or

   a.html -- oldest
   b.html -- current.

 All browsers I tested gave one of the above two results.

 Doing anything else breaks the web (we shipped this in Firefox Nightly
 and people were unable to log into ingdirect.com, for example).  I
 didn't investigate too thoroughly, but I believe what happens is, some
 sites use a link with href # and then navigate themselves in the
 link's onclick handler, without cancelling the click event.  In that
 case, we do precisely steps 1-3 above.
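
A toy model of steps 1-3 may make the expected outcome concrete. The ToyFrame class and its method names are made up for illustration and are not a browser API. The model assumes only that a fragment navigation commits synchronously and leaves the in-flight load alone, which yields the first of the two session histories listed above.

```javascript
// Toy model of a frame's session history. Hypothetical names, not a browser API.
class ToyFrame {
  constructor() {
    this.entries = [];   // session history, oldest first
    this.pending = null; // URL of an in-flight load that has not committed yet
  }
  // Step 1: a load that has already completed.
  load(url) {
    this.entries.push(url);
  }
  // Step 2: a network load that commits later.
  startLoading(url) {
    this.pending = url;
  }
  // Step 3: same-document navigation. It commits at once and, by assumption,
  // does not cancel the pending load.
  navigateToFragment(url) {
    this.entries.push(url);
  }
  // The network load eventually finishes and commits its entry.
  finishPendingLoad() {
    if (this.pending !== null) {
      this.entries.push(this.pending);
      this.pending = null;
    }
  }
}

function runSequence() {
  const frame = new ToyFrame();
  frame.load("a.html");
  frame.startLoading("b.html");
  frame.navigateToFragment("a.html#h");
  frame.finishPendingLoad();
  return frame.entries;
}
```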

 As I read the spec, browsers are supposed to cancel the load of b.html
 in step 3 above.  In the navigation algorithm [1], step 6 explicitly
 cancels the load of b.html, because the load of b.html has not matured.
 So if I understand correctly, the spec is dictating behavior that we
 know won't work and that no browser implements.

 The presence of steps 6 and 8 in the algorithm suggests that the spec is
 already trying to walk this line, so maybe I misunderstand what's going
 on, either in my tests or in the spec.

 The existing text in the spec step 4 is attempting to prevent a page from
 having you click on a link to <a href="http://paypal.com/"> and in the
 unload handler change that to a location.href = "http://paypa1.com/"
 navigation, or something similar but with the user typing in the location
 bar and the page hijacking that navigation.

 If it turns out that you can't ever block a cross-origin navigation,
 though, that's a lot easier to fix. :-)

 It's not that simple though. Browsers agree on this page that we should go
 to the second of the two cross-origin navigations (replace false with
 1 in the script to run the test):

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1778

 This one too (frame nav):

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1780

 So this is presumably specific to fragment identifiers. And sure enough,
 when we change the latter one above to changing to a fragment identifier,
 it works as you describe:

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1782

 (Things aren't so simple in this example (same-page nav):

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1784

 ...where Firefox no longer exhibits the restraint we're looking for here,
 but Chrome and Opera still do.)

 Anyway, yeah, looks like step 6 is just bogus. I've removed it. This now
 means that fragment identifier navigations just happen without screwing
 around with ongoing loads.


 == Issue #2 ==

 Suppose again that evil.com controls a frame F, and evil.com causes F to

   1. load a.html
   2. load a.html#h
   3. start loading b.html
   4. go back

 When we go back, we traverse the history [2] from a.html#h to a.html.
 Per the spec, this doesn't cancel the load of b.html.

 This caused a problem for us in Firefox because we create a session
 history entry for b.html at the beginning of step 3 and insert it after
 the current one.  Then, when the load of b.html completes, we use
 whichever session history entry happens to be after the current one,
 assuming that it was the session history entry we created earlier. [...]

 The fix for this bug is not as simple as merely ensuring that the
 session history entry's URL matches the document's URL.  Due to hash
 navigations and pushstate, these URLs may not match even when we're
 behaving correctly.

 We fixed this bug by cancelling the load of b.html when you go back.
 This matches Chrome's behavior in my tests [3].
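
The mismatch described above, and the effect of cancelling the load, can be sketched with a small simulation. This is a toy model with invented names, not Firefox's actual implementation. It assumes the entry for b.html is inserted right after the current entry when the load starts, and that the fix also drops that entry.

```javascript
// Toy model of Issue #2. Illustrative only; not Firefox internals.
function simulateIssue2(cancelPendingOnBack) {
  const entries = ["a.html", "a.html#h"]; // state after steps 1 and 2
  let current = 1;                         // index of a.html#h

  // Step 3: start loading b.html and insert its entry after the current one.
  let pending = "b.html";
  entries.splice(current + 1, 0, pending);

  // Step 4: go back from a.html#h to a.html (a same-document traversal).
  current -= 1;
  if (cancelPendingOnBack) {
    // Assumed shape of the fix: drop the in-flight load and its entry.
    entries.splice(entries.indexOf(pending), 1);
    pending = null;
  }

  if (pending === null) {
    return { loadCompleted: false, entryUsed: null, entries };
  }

  // The load completes and picks "whichever entry is after the current one",
  // which is no longer the entry that was created for b.html.
  const entryUsed = entries[current + 1];
  return { loadCompleted: true, entryUsed, entries };
}
```

Without cancellation the completed load of b.html is paired with the entry for a.html#h. With cancellation the load never completes, so the wrong pairing cannot happen.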

 Notice that this means we're cancelling an outstanding network load due
 to a synchronous same-document load, which I said in part 1 breaks the
 web.  But based on the (lack of) feedback we've received from our test
 audience, it seems that cancelling the load of b.html does /not/ break
 the web if the navigation from a.html to a.html#h is a history
 navigation.

 The right thing to do is probably to load b.html after a.html, so the
 final session history is

   a.html -- oldest
   b.html -- current.

 I /think/ this is what the spec says should happen

Re: [whatwg] Navigation and history traversal issues

2012-09-18 Thread Justin Lebar
  I've also made back()/forward()/go() not work during the document's
  unload handler, since that could be used for griefing. I'm tempted to
  disable it entirely for all docs a la alert(), but I've no idea if
  that's Web-compatible and I suspect not.

 I don't know what you mean by the last sentence here.  In my tests, IE
 and Opera do not support cross-origin back/forward/go, if that's what
 you mean.  I don't see any good reason for us to support that in
 Firefox, either, if we could get away with removing it.

 I meant blocking all scripted back/forward session history traversal while
 any page is running the unload algorithms.

Ah, I see.  I don't have any idea if that's a good idea or not, so, okay.  :)

 As far as cross-origin back/forward, there are 404 pages on the Web that
 have javascript:history.back() links; these would break for cross-origin
 links if we blocked cross-origin history traversal. I don't really see
 much point. What's the security risk?

The issue isn't a history.back() which crosses origins -- that seems
fine -- but rather calling history.back() on a cross-origin window.
(Sorry that wasn't clear.)

It's not clear that this poses a security risk (otherwise, I'm sure
we'd have removed it by now), aside from making it easier to tickle
Firefox into buggy states like this bug [1].  But it's also not clear
to me what benefit there is to being able to call back() on an
arbitrary window.

I guess I can navigate a window, so I might as well be able to make it
go back?  But those aren't quite the same thing.

-Justin

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=737307


Re: [whatwg] Document's base URI should use the document's *current* address

2012-02-22 Thread Justin Lebar
 From an author's point of view, there's no such thing as the
 document's original URI and, unless you're a nerd, you've never heard
 of the base URI.  There's just the document's URI, modified by
 pushState.

 From this point of view, I'd say it's less surprising that relative
 URIs would break when you change directories (hey, you *asked* for it)
 than that anchor refs would update the browser's address bar and
 document.location relative to the old URI.

In my tests, Chrome and Firefox both immediately change
document.baseURI when you call pushState.  Images (and I presume other
resources) are resolved relative to the new base.
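
A rough sketch of what that means for a relative image URL, using the standard URL constructor. The host example.com, the file names, and the 0.5 directory (standing in for Math.random()) are assumptions made for illustration.

```javascript
// Resolve a relative URL against a base, as URL resolution normally works.
function resolve(relative, base) {
  return new URL(relative, base).href;
}

// Base URI before pushState (assumed example address).
const before = "http://example.com/dir/page.html";

// Base URI after history.pushState('', '', '/0.5/file'), assuming the base
// follows the document's current address as observed in the tests.
const after = resolve("/0.5/file", before);

// The same relative image URL now points into a different directory.
const imgBefore = resolve("logo.png", before); // http://example.com/dir/logo.png
const imgAfter = resolve("logo.png", after);   // http://example.com/0.5/logo.png
```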

I'm not sure why your earlier test with seemed to work in Chrome, Hixie.  :-/

I think this ship may have sailed.

-Justin

<button onclick='push()'>Click me</button>
<script>
function push() {
  // Push a URL in a random directory, then show the resulting base URI.
  history.pushState('', '', '/' + Math.random() + '/file');
  alert(document.baseURI);
}
</script>


Re: [whatwg] Document's base URI should use the document's *current* address

2012-02-15 Thread Justin Lebar
 http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1342

 It doesn't make sense that the second image is broken.

 (For some reason in Firefox I get an exception. Not sure if I'm misusing
 the API or if it's a bug in Firefox.)

Not sure what's going on with that Firefox exception.  But I'm not
terribly surprised that the second image shouldn't work...  :)

 Similarly, if for some bizarre reason the page pushState's to a new
 directory, shouldn't all the links point relative to that new directory?

 That would break all existing images, stylesheets, scripts, etc, if their
 URLs are reused somehow.

Hm...maybe you're right.  But then, how do we jibe this with #foo
and ?foo links, both of which resolve relative to the current URI in
both Firefox and WebKit?

  - Start the Follow a hyperlink algorithm.
  - [snip]
  - It sets the document's current address to .../page.html#foo.

Well, this is pretty bad.  document.location is the document's current
address [1].  So clicking #foo changed document.location from
page2.html to page.html#foo, which I certainly wouldn't expect (and
does not match implementations).

-Justin

[1] The href attribute [of document.location] must return the current
address of the associated Document object, as an absolute URL.

On Wed, Feb 15, 2012 at 3:50 PM, Ian Hickson i...@hixie.ch wrote:
 On Wed, 20 Jul 2011, Justin Lebar wrote:
 
  The spec as written decides whether a link is a same-resource
  reference or not based on comparing the URLs to what you're calling
  the original address, not comparing it to the current address. See the
  navigation algorithm, step 7 /Fragment identifiers/.

 Maybe I'm misunderstanding, but this might not be the case in the
 history traversal algorithm.

 In history traversal, the URLs compared are those of the entries involved.
 However, clicking a link is primarily navigation, not session history
 traversal (though it can involve the latter).


  Step 6: If the specified entry has a URL whose fragment identifier
  differs from that of the current entry's when compared in a
  case-sensitive manner, and the two share the same Document object,
  then let hash changed be true.

 It's not clear to me what the current/specified entry's URL is, or where
 this is properly defined, but earlier, we say:

 Hm, yes, the spec doesn't quite clearly define the URL in all cases.
 Fixed.


  The current entry is usually an entry for the location of the
  Document.

 That's a non-normative statement. I've made it more explicitly so.


 and the document's location changes when we call push/replaceState.

 The current entry is whatever the algorithms last set the current entry
 to. I've made that clearer in the spec.


  As currently specified, we'll resolve #foo relative to the document's
  original URL; that is, clicking the link will take the user to
  page.html#foo, not page2.html#foo.  But the intent of a link with
  href #foo is clearly to navigate within the current page, not to go
  somewhere else.

 Were you saying that this isn't the right interpretation of the spec?
 Because #foo is resolved relative to the document's base URI, which is
 the same as the document's original URI, so we decide that #foo is a
 same-document link?  That's comforting, if it's true.  :)

 When you click a link to #foo on a document whose current address is
 page2.html but whose document's address is page.html, then you go
 through these steps:

  - Start the Follow a hyperlink algorithm.
  - Resolve href relative to the a element.
  - This uses XML Base, with the fallback base url being the document's
   address, which is what you were calling the original URL.
  - This results in .../page.html#foo.
  - Navigate to that URL.
  - Step Fragment identifiers then compares this URL to the document's
   address (page.html, not page2.html), and finds a match.
  - Navigating to a fragment identifier is invoked and creates a new
   session history entry with the URL page.html#foo.
  - Traverse the history is then invoked.
  - It sets the document's current address to .../page.html#foo.
  - Scrolling happens.
  - The current entry's URL is ../page2.html and the specified entry's
   URL is .../page.html#foo so the fragids differ and hashchange fires.
  - The current entry becomes the new specified entry.


  Note that there are problems with what you describe: what if the new
  URL has a different path, and there are img elements whose URLs are
  relative, and after pushState() you clone one? Or what about relative
  links in the original markup? I don't think we can change the base URL
  on the fly, all kinds of problems could result.

 I agree there are problems with changing the base URI.  But it seems
 much less intuitive for common use-cases not to change it.  We can
 change my example above to use ?foo instead of #foo, and I think the
 same argument applies.  Should a link with href ?foo always resolve
 relative to the document's original URI (unless the base is explicitly

Re: [whatwg] Document's base URI should use the document's *current* address

2012-02-15 Thread Justin Lebar
On Wed, Feb 15, 2012 at 5:31 PM, Ian Hickson i...@hixie.ch wrote:
 On Wed, 15 Feb 2012, Justin Lebar wrote:
   - It sets the document's current address to .../page.html#foo.

 Well, this is pretty bad.  document.location is the document's current
 address [1].  So clicking #foo changed document.location from page2.html
 to page.html#foo, which I certainly wouldn't expect (and does not match
 implementations).

 Seems to me we should change the implementations then. There isn't any
 fundamental difference between linking to #foo and linking to
 page.html#foo if the base URL is page.html, as far as I can tell.

 If the implementations can't change, then I'll change the spec, but it
 really seems bad to me that relative URLs will break depending on when
 they are resolved relative to pushState() changes.

When I implemented pushState, I explicitly didn't want authors to have
to rewrite all their anchor links after they changed the document's
current URI.

From an author's point of view, there's no such thing as the
document's original URI and, unless you're a nerd, you've never heard
of the base URI.  There's just the document's URI, modified by
pushState.

From this point of view, I'd say it's less surprising that relative
URIs would break when you change directories (hey, you *asked* for it)
than that anchor refs would update the browser's address bar and
document.location relative to the old URI.

If we did make the change you're suggesting, we'd have to check that
it doesn't break at least the major sites which use pushstate
(Facebook, anyone?).  And I'd want to try to coordinate the change
with WebKit so we quickly move away from the old behavior.  But I'm
not convinced it's worthwhile, given that there's at least an argument
for the current behavior.

-Justin

 --
 Ian Hickson               U+1047E                )\._.,--,'``.    fL
 http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
 Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Document's base URI should use the document's *current* address

2011-07-20 Thread Justin Lebar
 The spec as written decides whether a link is a same-resource reference or
 not based on comparing the URLs to what you're calling the original
 address, not comparing it to the current address. See the navigation
 algorithm, step 7 /Fragment identifiers/.

Maybe I'm misunderstanding, but this might not be the case in the
history traversal algorithm.

 Step 6: If the specified entry has a URL whose fragment identifier
 differs from that of the current entry's when compared in a
 case-sensitive manner, and the two share the same Document object,
 then let hash changed be true.

It's not clear to me what the current/specified entry's URL is, or
where this is properly defined, but earlier, we say:

 The current entry is usually an entry for the location of the Document.

and the document's location changes when we call push/replaceState.

In any case, the navigation algorithm is clear as written.

 As currently specified, we'll resolve #foo relative to the document's
 original URL; that is, clicking the link will take the user to
 page.html#foo, not page2.html#foo.  But the intent of a link with href
 #foo is clearly to navigate within the current page, not to go somewhere
 else.

Were you saying that this isn't the right interpretation of the spec?
Because #foo is resolved relative to the document's base URI, which is
the same as the document's original URI, so we decide that #foo is a
same-document link?  That's comforting, if it's true.  :)

 Note that there are problems with what you describe: what if the new URL
 has a different path, and there are img elements whose URLs are
 relative, and after pushState() you clone one? Or what about relative
 links in the original markup? I don't think we can change the base URL on
 the fly, all kinds of problems could result.

I agree there are problems with changing the base URI.  But it seems
much less intuitive for common use-cases not to change it.  We can
change my example above to use ?foo instead of #foo, and I think the
same argument applies.  Should a link with href ?foo always resolve
relative to the document's original URI (unless the base is explicitly
changed)?  Similarly, if for some bizarre reason the page pushState's
to a new directory, shouldn't all the links point relative to that new
directory?

I kind of think this ship has sailed wrt implementations.  Chrome and
Firefox both have the same behavior in this respect.  See
http://people.mozilla.org/~jlebar/whatwg/test_pushstate_resolve.html
(source included below, since I have a bad habit of deleting these
test files right before someone else wants to look at them).

Ian, how hard do you think it would be to spec changing the base and
resolve the issues with that?

-Justin

<html>
<body>
<a href='#foo'>#foo</a><br>
<a href='?foo'>?foo</a><br>
<a href='foo'>foo</a><br>
<button onclick='history.pushState("", "", Math.random())'>pushState
to new file</button><br>
<button onclick='history.pushState("", "", Math.random() +
"/file")'>pushState to new directory</button>
</body>
</html>

On Tue, Jul 19, 2011 at 5:35 PM, Ian Hickson i...@hixie.ch wrote:
 On Wed, 27 Apr 2011, Justin Lebar wrote:

 The document base URL is used when fetching resources.

 Right now, if a page doesn't have a base element, the document base
 URL is set to the document's address.  (I'm going to call this the
 document's original address.)  The document's original address does
 not change when you call pushState; only the document's current address
 does.

 I think the base URI should use the document's current address, not the
 original address.

 To see why this makes sense, consider the following scenario:

 * User loads page.html
 * Page calls pushState and changes its url to page2.html
 * User clicks on a link with href #foo.

 As currently specified, we'll resolve #foo relative to the document's
 original URL; that is, clicking the link will take the user to
 page.html#foo, not page2.html#foo.  But the intent of a link with href
 #foo is clearly to navigate within the current page, not to go somewhere
 else.

 Firefox 4 already implements pushState as I'm suggesting here.

 The spec as written decides whether a link is a same-resource reference or
 not based on comparing the URLs to what you're calling the original
 address, not comparing it to the current address. See the navigation
 algorithm, step 7 /Fragment identifiers/.

 Note that there are problems with what you describe: what if the new URL
 has a different path, and there are img elements whose URLs are
 relative, and after pushState() you clone one? Or what about relative
 links in the original markup? I don't think we can change the base URL on
 the fly, all kinds of problems could result.

 --
 Ian Hickson               U+1047E                )\._.,--,'``.    fL
 http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
 Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


[whatwg] Document's base URI should use the document's *current* address

2011-04-27 Thread Justin Lebar
The document base URL [1] is used when fetching resources.

Right now, if a page doesn't have a base element, the document base
URL is set to the document's address.  (I'm going to call this the
document's original address.)  The document's original address does
not change when you call pushState; only the document's current
address [2] does.

I think the base URI should use the document's current address, not
the original address.

To see why this makes sense, consider the following scenario:

* User loads page.html
* Page calls pushState and changes its url to page2.html
* User clicks on a link with href #foo.

As currently specified, we'll resolve #foo relative to the document's
original URL; that is, clicking the link will take the user to
page.html#foo, not page2.html#foo.  But the intent of a link with href
#foo is clearly to navigate within the current page, not to go
somewhere else.
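
To make the two outcomes concrete, here is a small sketch using the standard URL constructor. The example.com host is an assumption; only page.html, page2.html and #foo come from the scenario above.

```javascript
// The document's original address and its current address after
// history.pushState(..., ..., "page2.html").
const original = "http://example.com/page.html";
const current = new URL("page2.html", original).href;

// Resolve the link's href against a chosen base URL.
function followFragmentLink(href, base) {
  return new URL(href, base).href;
}

// As currently specified: base is the original address.
const asSpecified = followFragmentLink("#foo", original);
// As proposed here: base is the current address.
const asProposed = followFragmentLink("#foo", current);
```

asSpecified ends in page.html#foo, while asProposed ends in page2.html#foo and stays within the current page.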

Firefox 4 already implements pushState as I'm suggesting here.

-Justin

[1] 
http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#document-base-url
[2] 
http://www.whatwg.org/specs/web-apps/current-work/multipage/dom.html#the-document%27s-current-address


Re: [whatwg] Onpopstate is Flawed

2011-02-11 Thread Justin Lebar
 I'm not sure I follow you here. My idea for option A is that you never
 get a popstate when doing the initial parsing of a page.

Okay, I may still have misunderstood, despite my best efforts!  :)

 Option B:
 Fire popstates as we currently do, with the caveat that you never
 fire a stale popstate -- that is, if any navigations or
 push/replaceStates have occurred since you queued the task to fire the
 popstate, don't fire it.

Is my option B clear?  It's also what the patch I have [1] does.

We might want to make popstate sync again, since otherwise you have
to schedule a task which synchronously checks if no state changes have
occurred, and dispatches popstate only if appropriate.
I know Olli has some thoughts on making popstate sync, and fwiw, FF
currently dispatches it synchronously.
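
One way to picture the stale check in option B is a change counter that each queued popstate task compares against when it finally runs. This is only a sketch with invented names (ToyHistory, traverse, runTasks), not the patch itself, and the task array is a stand-in for the real event loop.

```javascript
// Toy model of "never fire a stale popstate". Hypothetical names throughout.
class ToyHistory {
  constructor() {
    this.changes = 0;    // bumped on every navigation or push/replaceState
    this.tasks = [];     // stand-in for the event loop's task queue
    this.delivered = []; // states for which a popstate actually fired
  }
  // A history traversal: counts as a state change and queues a popstate task.
  traverse(state) {
    this.changes += 1;
    const queuedAt = this.changes;
    this.tasks.push(() => {
      // Fire only if nothing has changed since the task was queued.
      if (this.changes === queuedAt) {
        this.delivered.push(state);
      }
    });
  }
  // pushState/replaceState: a state change that queues no popstate.
  pushState(_state) {
    this.changes += 1;
  }
  // Run all queued tasks in order.
  runTasks() {
    while (this.tasks.length > 0) {
      this.tasks.shift()();
    }
  }
}
```

With this model, a traversal followed by a pushState before the task runs delivers nothing, and two quick traversals deliver only the second state.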

 The main problem with this proposal is that it's a big change from
 what the API is today. However it's only a change in the situation
 when the spec today calls for firing popstate during the initial page
 load. Something that it seems like pages don't deal with properly
 today anyway, at least in the case of facebook.

Given the adoption the feature has seen, I guess I'd favor a smaller
change.  In particular, the option B above makes it possible to write
correct pages without ever reading the DOM current state property --
it's there only as an optimization to allow pages to set their state
faster, so no rush to put it in Right Away.  In contrast, a correct
page with option A would have to check its state at some point as it
loads.

I guess I don't see why it's better to make a big change than a small
one, if they both work equally well.

-Justin

[1] Patch v4: https://bugzilla.mozilla.org/show_bug.cgi?id=615501

On Mon, Feb 7, 2011 at 5:07 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Sun, Feb 6, 2011 at 10:18 AM, Justin Lebar justin.le...@gmail.com wrote:
 1) Fire popstates as we currently do, with the caveat that you never
 fire a stale popstate -- that is, if any navigations or
 push/replaceStates have occurred since you queued the task to fire the
 popstate, don't fire it.

 Proposal B has the advantage of requiring fewer changes.

 The more I think about this, the more I like this option.  It's a
 smaller change than option A (though again, we certainly could expose
 the state object through a DOM property separately from this
 proposal), and I think it would be sufficient to fix some sites which
 are currently broken.  (For instance, I've gotten Facebook to receive
 stale popstates and show me the wrong page just by clicking around
 quickly.)

 Furthermore, this avoids the edge case in option B of you don't get a
 popstate on the initial initial load, but you do get a popstate if
 you're reloading from far enough back in the session history, or after
 a session restore.

 I'm not sure I follow you here. My idea for option A is that you never
 get a popstate when doing the initial parsing of a page. So if you're
 reloading from session restore or if you're going far back enough in
 history that you end up parsing a Document, you never get a popstate.

 You get a popstate when and only when you transition between two
 history entries while remaining on the same Document.

 So the basic code flow would be:

 Whenever creating a part of the UI (for example during page load or if
 called upon to render a new AJAX page), use document.currentState to
 decide what state to render.
 Whenever you receive a popstate, rerender UI as described by the popstate.
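
As a sketch, that flow might look like the following. document.currentState is the property name proposed in this thread, not a shipped API. The doc and target parameters and the render callback are assumptions, passed in so the sketch does not depend on a real browser.

```javascript
// Render from the proposed state property once, then again on every popstate.
// doc:    object exposing the proposed currentState property
// target: object with addEventListener (window, in a browser)
// render: page-supplied function that draws the UI for a given state
function wireUpState(doc, target, render) {
  // Initial render during page load: read the state directly, no popstate needed.
  render(doc.currentState);
  // Later transitions within the same Document arrive as popstate events.
  target.addEventListener("popstate", (event) => {
    render(event.state);
  });
}

// In a browser this would be called roughly as:
//   wireUpState(document, window, renderPage);
// where renderPage is the page's own (hypothetical) rendering function.
```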

 So no edge cases that I can think of?

 The main problem with this proposal is that it's a big change from
 what the API is today. However it's only a change in the situation
 when the spec today calls for firing popstate during the initial page
 load. Something that it seems like pages don't deal with properly
 today anyway, at least in the case of facebook.

 I was concerned that pages might become confused when they don't get a
 popstate they were expecting -- for instance, if you pushState before
 the initial popstate, a page may never see a popstate event -- but I
 think this might not be such a big deal.  A call to push/replaceState
 would almost certainly be accompanied by code updating the DOM to the
 new state.  Popstate's main purpose is to tell me to update the DOM,
 so I don't think I'd be missing much by not getting it in that case.

 That was my thinking too FWIW.

 / Jonas



Re: [whatwg] Onpopstate is Flawed

2011-02-11 Thread Justin Lebar
 The problem with option B is that pages can't display correctly until
 the load event fires, which can be quite late in the game what with
 slow loading images and ads. It means that if you're on a page which
 uses state, and reload the page, you'll first see the page in a
 state-less mode while it's loading, and at some point later (generally
 when the last image finishes loading) it'll snap to be in the state
 it was when you pressed reload.

 You'll get the same behavior going back to a state-using page which
 has been kicked out of the fast-cache.

But isn't this problem orthogonal to option B?  That is, we could
still add the DOM property to address this concern, right?

But at least with option B, one can write a correct page without
reading that property -- that is, pages won't have to change in order
to be as fast and correct as they currently are.

-Justin


Re: [whatwg] Onpopstate is Flawed

2011-02-06 Thread Justin Lebar
 1) Fire popstates as we currently do, with the caveat that you never
 fire a stale popstate -- that is, if any navigations or
 push/replaceStates have occurred since you queued the task to fire the
 popstate, don't fire it.

 Proposal B has the advantage of requiring fewer changes.

The more I think about this, the more I like this option.  It's a
smaller change than option A (though again, we certainly could expose
the state object through a DOM property separately from this
proposal), and I think it would be sufficient to fix some sites which
are currently broken.  (For instance, I've gotten Facebook to receive
stale popstates and show me the wrong page just by clicking around
quickly.)

Furthermore, this avoids the edge case in option B of you don't get a
popstate on the initial initial load, but you do get a popstate if
you're reloading from far enough back in the session history, or after
a session restore.

I was concerned that pages might become confused when they don't get a
popstate they were expecting -- for instance, if you pushState before
the initial popstate, a page may never see a popstate event -- but I
think this might not be such a big deal.  A call to push/replaceState
would almost certainly be accompanied by code updating the DOM to the
new state.  Popstate's main purpose is to tell me to update the DOM,
so I don't think I'd be missing much by not getting it in that case.

I don't know if this is something we can get done in time for FF4, but
I can see.

-Justin

On Wed, Feb 2, 2011 at 3:37 PM, Justin Lebar justin.le...@gmail.com wrote:
 Oh, I think I now understand what Jonas meant.

 Proposal A, as I understand it:

 1) Don't fire an initial popstate, because this causes stale popstates
 when pushState is called before the popstate.

 2) Expose the state object to the DOM so pages can find out what the
 initial state is when they load.  (The initial state might not be null
 if we're restoring after a crash, or if we're going back in history
 after we unloaded the document.)

 3) Otherwise, fire popstate like normal, once for each navigation.
 (With the caveat that you never want to fire a stale popstate -- that
 is, if any navigations or push/replaceStates have occurred since you
 queued the task to fire the popstate, don't fire it.)

 I think we need the caveat in step 3 because firing popstate isn't
 synchronous (step 11 at [1]).

 But if we need that caveat, maybe it's better to do what Jonas
 originally proposed.  Proposal B:

 1) Fire popstates as we currently do, with the caveat that you never
 fire a stale popstate -- that is, if any navigations or
 push/replaceStates have occurred since you queued the task to fire the
 popstate, don't fire it.

 Proposal B has the advantage of requiring fewer changes.  (We could,
 of course, add the DOM property later -- it's orthogonal to proposal
 B, but required by proposal A.)

 [1] 
 http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#traverse-the-history

 On Wed, Feb 2, 2011 at 2:48 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Wed, Feb 2, 2011 at 2:34 PM, Justin Lebar justin.le...@gmail.com wrote:
 So during loading, any script that wants to know what the initial (or
 current) state is does not need to wait for the first popstate, but
 can simply grab the state and go.

 Yeah, I think it's too late to move to this approach.

 Even if we also include the new state in the popstate events? Such a
 change seems mostly additive to the current spec.

 My thinking was that if someone calls replaceState, then probably that
 means that they're currently changing the page to represent that new
 state. If they do that then I don't see that the initial popstate
 would help them in any way?

 I agree it's potentially misinformative to give the page a popstate in
 this case.  But it's possible that a page might be built so that it
 doesn't begin to function properly until it receives the initial
 popstate.  If a user clicks on a link and causes a replaceState call
 before the initial popstate, then such a page could break.

 But with my suggested change, pages have no reason to wait until the
 initial popstate fires. And in fact they can't since we don't fire it
 at all :) But yes, I agree that it could break already existing pages
 that have the above behavior.

 So the question is if webkit would be ok with such a change.

 So during loading, any script that wants to know what the initial (or
 current) state is does not need to wait for the first popstate, but
 can simply grab the state and go.

 Oh, is this why we needed the initial popstate?  For instance, we
 persist state objects across session restore, so when the user
 restarts, a page could get an onload followed by a popstate with a
 non-null state object.

 [Aside: What we currently have doesn't work well for this case, since
 the page really needs the state object at the moment it starts to run
 script so it can decide what content to load, but it doesn't get

Re: [whatwg] Onpopstate is Flawed

2011-02-02 Thread Justin Lebar
I'm a bit uncomfortable with this behavior, since it seems that having
replaceState cancel the initial popstate is at least somewhat
surprising.

How is this better than never firing an initial popstate?

-Justin

On Mon, Jan 31, 2011 at 6:32 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Thu, Dec 23, 2010 at 6:18 PM, Henry Chan
 henry.fai.hang.c...@gmail.com wrote:
 It fixes the bit where back/forward before onload doesn't fire onpopstate.
 But no, it still doesn't let us detect initial onpopstate.  And back/forward
 buttons don't work properly until onload.  A workaround would be to assign
 the handlers to the a tags at onload but again that's not feasible for my
 site.  I need it domready.
 Please make onpopstate fire as early as possible in the navigation sequence.
  And drop the pending state object.  I need exactly each firing.  Not just
 the last one.

 Would the following behavior solve your issue:

 If pushState or replaceState is called before the initial popstate,
 simply don't fire the after-onload-popstate.

 If the back button is pressed (or history.back() is called) after a
 pushState/replaceState, but before onload, fire a popstate for the
 newly transitioned to state. Still leave the after-onload-popstate
 canceled.

 I.e. if the webpage calls pushState or replaceState before onload
 fires, then it is deemed that the page has transitioned to the new
 state and no after-onload-popstate is needed.

 This behavior makes the most sense to me and allows the page to start
 handling state transitions before the page finishes loading.
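
To make the proposed rule concrete, here is a rough sketch of it as a toy model. This is not a browser API: the class `PageLoad` and its method names are made up for illustration, and firing a popstate on every traversal (even when no pushState preceded it) is an assumption the proposal does not spell out.

```python
class PageLoad:
    """Toy model of one document load under the proposed rule."""

    def __init__(self, initial_state=None):
        self.state = initial_state
        self.loaded = False
        self.initial_popstate_cancelled = False
        self.fired = []  # states delivered via popstate, in order

    def push_or_replace_state(self, state):
        # A pushState/replaceState before onload cancels the
        # after-onload popstate: the page has already moved on.
        if not self.loaded:
            self.initial_popstate_cancelled = True
        self.state = state

    def traverse(self, state):
        # Back/forward fires a popstate for the entry we land on.
        # It does not un-cancel the after-onload popstate.
        self.state = state
        self.fired.append(state)

    def finish_load(self):
        # onload: fire the initial popstate unless it was cancelled.
        self.loaded = True
        if not self.initial_popstate_cancelled:
            self.fired.append(self.state)


page = PageLoad("s0")
page.push_or_replace_state("s1")  # before onload
page.traverse("s0")               # back button, still before onload
page.finish_load()
print(page.fired)
```

In this run the only popstate delivered is the one for the back traversal; the after-onload popstate stays cancelled.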

 / Jonas



Re: [whatwg] Onpopstate is Flawed

2011-02-02 Thread Justin Lebar
 So during loading, any script that wants to know what the initial (or
 current) state is does not need to wait for the first popstate, but
 can simply grab the state and go.

Yeah, I think it's too late to move to this approach.

 My thinking was that if someone calls replaceState, then probably that
 means that they're currently changing the page to represent that new
 state. If they do that then I don't see that the initial popstate
 would help them in any way?

I agree it's potentially misinformative to give the page a popstate in
this case.  But it's possible that a page might be built so that it
doesn't begin to function properly until it receives the initial
popstate.  If a user clicks on a link and causes a replaceState call
before the initial popstate, then such a page could break.

It's an edge case, but that's exactly why it concerns me -- nobody's
going to test to make sure that their page works properly if the
initial popstate is canceled by a push/replaceState.

 So during loading, any script that wants to know what the initial (or
 current) state is does not need to wait for the first popstate, but
 can simply grab the state and go.

Oh, is this why we needed the initial popstate?  For instance, we
persist state objects across session restore, so when the user
restarts, a page could get an onload followed by a popstate with a
non-null state object.

[Aside: What we currently have doesn't work well for this case, since
the page really needs the state object at the moment it starts to run
script so it can decide what content to load, but it doesn't get the
state object until after onload.]

If we can't get rid of the initial popstate because of the above, then
I think what Jonas proposed is reasonable.  I just wish we had
something with fewer gotchas.

-Justin

On Wed, Feb 2, 2011 at 2:15 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Wed, Feb 2, 2011 at 2:05 PM, Justin Lebar justin.le...@gmail.com wrote:
 I'm a bit uncomfortable with this behavior, since it seems that having
 replaceState cancel the initial popstate is at least somewhat
 surprising.

 How is this better than never firing an initial popstate?

 My thinking was that if someone calls replaceState, then probably that
 means that they're currently changing the page to represent that new
 state. If they do that then I don't see that the initial popstate
 would help them in any way?

 Yet another solution would be to always expose the current state
 through a member on the window or the document. Then popstate would
 represent any transition in the current state and wouldn't be needed
 for the initial page load.

 So during loading, any script that wants to know what the initial (or
 current) state is does not need to wait for the first popstate, but
 can simply grab the state and go.

 The main problem I can think of with this design, apart from it being
 a bigger change from what we've got, is what happens if someone
 modifies the current-state member on the window/document. While we can
 make the member read-only, that doesn't help if the state is a deep
 object hierarchy. In IndexedDB we decided to not attempt to solve the
 problem and instead rely on authors not to trigger the footgun.
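
The deep-object problem can be shown with a small analogy. This is a Python sketch, not DOM code: the `Window` class and its `state` property are stand-ins for whatever member would expose the current state.

```python
class Window:
    """Stand-in for an object exposing the current state read-only."""

    def __init__(self, state):
        self._state = state

    @property
    def state(self):
        # Read-only: no setter is defined for this property.
        return self._state


w = Window({"view": {"page": 1}})

try:
    w.state = {}          # rejected, the member itself is read-only
    reassigned = True
except AttributeError:
    reassigned = False

# The nested object is still mutable through the read-only member.
w.state["view"]["page"] = 99
print(reassigned, w.state)
```

Making the member read-only blocks reassignment but does nothing for the objects hanging off it, which is the footgun described above.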

 / Jonas



Re: [whatwg] Onpopstate is Flawed

2011-02-02 Thread Justin Lebar
Oh, I think I now understand what Jonas meant.

Proposal A, as I understand it:

1) Don't fire an initial popstate, because this causes stale popstates
when pushState is called before the popstate.

2) Expose the state object to the DOM so pages can find out what the
initial state is when they load.  (The initial state might not be null
if we're restoring after a crash, or if we're going back in history
after we unloaded the document.)

3) Otherwise, fire popstate like normal, once for each navigation.
(With the caveat that you never want to fire a stale popstate -- that
is, if any navigations or push/replaceStates have occurred since you
queued the task to fire the popstate, don't fire it.)

I think we need the caveat in step 3 because firing popstate isn't
synchronous (step 11 at [1]).

But if we need that caveat, maybe it's better to do what Jonas
originally proposed.  Proposal B:

1) Fire popstates as we currently do, with the caveat that you never
fire a stale popstate -- that is, if any navigations or
push/replaceStates have occurred since you queued the task to fire the
popstate, don't fire it.

Proposal B has the advantage of requiring fewer changes.  (We could,
of course, add the DOM property later -- it's orthogonal to proposal
B, but required by proposal A.)
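
The "never fire a stale popstate" caveat shared by both proposals could be sketched roughly as below. This is a toy model, not how any browser implements it: the `History` class, the generation counter, and the task queue are all assumptions made for illustration.

```python
from collections import deque


class History:
    """Toy session history with asynchronously fired popstates."""

    def __init__(self):
        self.generation = 0   # bumped on each navigation or push/replaceState
        self.state = None
        self.tasks = deque()  # queued "fire popstate" tasks
        self.fired = []       # states actually delivered

    def navigate(self, state):
        # A traversal changes the entry and queues a popstate task.
        self.generation += 1
        self.state = state
        self.tasks.append((self.generation, state))

    def push_state(self, state):
        # push/replaceState changes the entry but queues no popstate.
        self.generation += 1
        self.state = state

    def run_tasks(self):
        # Fire a queued popstate only if nothing happened since it
        # was queued; otherwise it is stale and gets dropped.
        while self.tasks:
            generation, state = self.tasks.popleft()
            if generation == self.generation:
                self.fired.append(state)


h = History()
h.navigate("a")
h.run_tasks()        # fires for "a"
h.navigate("b")
h.push_state("c")    # makes the queued popstate for "b" stale
h.run_tasks()        # nothing fires
print(h.fired)
```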

[1] 
http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#traverse-the-history

On Wed, Feb 2, 2011 at 2:48 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Wed, Feb 2, 2011 at 2:34 PM, Justin Lebar justin.le...@gmail.com wrote:
 So during loading, any script that wants to know what the initial (or
 current) state is does not need to wait for the first popstate, but
 can simply grab the state and go.

 Yeah, I think it's too late to move to this approach.

 Even if we also include the new state in the popstate events? Such a
 change seems mostly additive to the current spec.

 My thinking was that if someone calls replaceState, then probably that
 means that they're currently changing the page to represent that new
 state. If they do that then I don't see that the initial popstate
 would help them in any way?

 I agree it's potentially misinformative to give the page a popstate in
 this case.  But it's possible that a page might be built so that it
 doesn't begin to function properly until it receives the initial
 popstate.  If a user clicks on a link and causes a replaceState call
 before the initial popstate, then such a page could break.

 But with my suggested change, pages have no reason to wait until the
 initial popstate fires. And in fact they can't since we don't fire it
 at all :) But yes, I agree that it could break already existing pages
 that have the above behavior.

 So the question is if webkit would be ok with such a change.

 So during loading, any script that wants to know what the initial (or
 current) state is does not need to wait for the first popstate, but
 can simply grab the state and go.

 Oh, is this why we needed the initial popstate?  For instance, we
 persist state objects across session restore, so when the user
 restarts, a page could get an onload followed by a popstate with a
 non-null state object.

 [Aside: What we currently have doesn't work well for this case, since
 the page really needs the state object at the moment it starts to run
 script so it can decide what content to load, but it doesn't get the
 state object until after onload.]

 If we can't get rid of the initial popstate because of the above, then
 I think what Jonas proposed is reasonable.  I just wish we had
 something with fewer gotchas.

 I think my latest proposed change makes this a whole lot better since
 the state is immediately available to scripts. The problem with only
 sticking the state in an event is that there is really no good point
 to fire the event. The later you fire it the longer it takes before
 the page works properly. The sooner you fire it the bigger risk you
 run that some script runs too late to be able to catch the event.

 / Jonas



Re: [whatwg] Firing popstate for all history entry changes

2010-08-25 Thread Justin Lebar
 It might also help if the event wasn't called popstate, since that
 implies a 1:1 relationship with pushState calls, but you can already
 get popstate events without corresponding pushState calls.
 historytraversal perhaps?

I think we've decided here that the time for major changes to this API
has passed -- it's already in use in the wild.  If we *do* want to
change the API, I'd like to get in line.  :)

 However, it seems like the (web) developer's mental model for popstate
 would be much simpler if it fired whenever the current session history
 entry changed, regardless of whether it has a state object or was the
 first entry.

This is the model Firefox uses, and we're prepared to ship it in the
upcoming release of version 4.  It's divergent from WebKit, which has
already shipped, but WebKit is going to have to change anyway.
(http://webkit.org/b/41372)

-Justin

On Wed, Aug 25, 2010 at 2:55 PM, Mihai Parparita mih...@chromium.org wrote:
 There's been some discussion on http://webkit.org/b/41372 about
 Gecko's vs. WebKit's implementation of the popstate event. It turns
 out that a careful reading of
 http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#history-traversal,
 specifically of item 10, indicates that if you have this sequence of
 steps:

 1. Go to a page
 2. Change the location's fragment to #step1
 3. Change the location's fragment to #step2
 4. Go back
 5. Go back

 Then popstate should be fired after every step, except for step 4
 (test case at https://bugs.webkit.org/attachment.cgi?id=65467). That's
 because in step 4 we're going back from one history entry without a
 state object to another without a state object, and the target entry
 is not the first one for the document either.

 However, it seems like the (web) developer's mental model for popstate
 would be much simpler if it fired whenever the current session history
 entry changed, regardless of whether it has a state object or was the
 first entry. Then if someone wished to listen to all history events,
 they would just have to use onpopstate, instead of a combination of
 onpopstate and onhashchange.

 It might also help if the event wasn't called popstate, since that
 implies a 1:1 relationship with pushState calls, but you can already
 get popstate events without corresponding pushState calls.
 historytraversal perhaps?

 Mihai



Re: [whatwg] HTML resource packages

2010-08-09 Thread Justin Lebar
On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com wrote:
 If UAs can assume that files with the same path
 are the same regardless of whether they came from a resource package
 or which, and they have all but a couple of the files cached, they
 could request those directly instead of from the resource package,
 even if a resource package is specified.

These kinds of heuristics are far beyond the scope of resource
packages as we're planning to implement them.  Again, I think this
type of behavior is the domain of a large change to the networking
stack, such as SPDY, not a small hack like resource packages.

-Justin

On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com wrote:
 On Fri, Aug 6, 2010 at 7:40 PM, Justin Lebar justin.le...@gmail.com wrote:
 I think this is a fair point.  But I'd suggest we consider the following:

 * It might be confusing for resources from a resource package to show
 up on a page which doesn't opt-in to resource packages in general or
 to that specific resource package.

 Only if the resource package contains a different file from the real
 one.  I suggest we treat this as a pathological case and accept that
 it will be broken and confusing -- or at least we consider how many
 extra optimizations we could make if we did accept that, before
 deciding whether the extra performance is worth the confusion.

 * There's no easy way to opt out of this behavior.  That is, if I
 explicitly *don't* want to load content cached from a resource
 package, I have to name that content differently.

 Why would you want that, if the files are the same anyway?

 * The avatars-on-a-forum use case is less convincing the more I think
 about it.  Certainly you'd want each page which displays many avatars
 to package up all the avatars into a single package.  So you wouldn't
 benefit from the suggested caching changes on those pages.

 I don't see why not.  If UAs can assume that files with the same path
 are the same regardless of whether they came from a resource package
 or which, and they have all but a couple of the files cached, they
 could request those directly instead of from the resource package,
 even if a resource package is specified.  So if twenty different
 people post on the page, and you've been browsing for a while and have
 eighteen of their avatars (this will be common, a handful of people
 tend to account for most posts in a given forum):

 1) With no resource packages, you fetch two separate avatars (but on
 earlier page views you suffered).

 2) With resource packages as you suggest, you fetch a whole resource
 package, 90% of which you don't need.  In fact, you have to fetch a
 resource package even if you have 100% of the avatars on the page!  No
 two pages will be likely to have the same resource package, so you
 can't share cache at all.

 3) With resource packages as I suggest, you fetch only two separate
 avatars, *and* you got the benefits of resource packages on earlier
 pages.  The UA gets to guess whether using resource packages would be
 a win on a case-by-case basis, so in particular, it should be able to
 perform strictly better than either (1) or (2), given decent
 heuristics.  E.g., the heuristic "fetch the resource package if I need
 at least two files, fetch the file if I only need one" will perform
 better than either (1) or (2) in any reasonable circumstance.
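
The example heuristic in (3) can be written down as a small function. This is only a sketch: the function name `plan_fetch`, its return shape, and the avatar paths are made up, and the threshold is left as a parameter since a UA would presumably tune it.

```python
def plan_fetch(needed, cached, min_for_package=2):
    """Decide how to fetch the files a page needs.

    Returns ("package", missing) when at least `min_for_package`
    files are not cached, otherwise ("files", missing), meaning the
    missing files are requested individually.
    """
    missing = sorted(set(needed) - set(cached))
    if len(missing) >= min_for_package:
        return ("package", missing)
    return ("files", missing)


avatars = [f"/avatars/{n}.png" for n in range(20)]

# One avatar missing: fetch just that file.
print(plan_fetch(avatars, avatars[:19]))

# Nothing missing: no package fetch and no file fetch.
print(plan_fetch(avatars, avatars))
```

With the threshold exactly as quoted, a page with two missing avatars falls on the "fetch the package" side, so the twenty-avatar example above sits right on the boundary.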

 I think this sort of situation will be fairly common.  Has anyone
 looked at a bunch of different types of web pages and done a breakdown
 of how many assets they have, and how they're reused across pages?  If
 we're talking about assets that are used only on one page (image
 search) or all pages (logos, shared scripts), your approach works
 fine, but not if they're used on a random mix of pages.  I think a lot
 of files will wind up being used on only particular subsets of pages.

 In general, I think we need something like SPDY to really address the
 problem of duplicated downloads.  I don't think resource packages can
 fix it with any caching policy.

 Certainly there are limits to what resource packages can do, but we
 can wind up closer to the limits or farther from them depending on the
 implementation details.



Re: [whatwg] HTML resource packages

2010-08-09 Thread Justin Lebar
 Can you provide the content of the page which you used in your whitepaper?
 (https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820)

I'll post this to the bug when I get home tonight.  But your comments
are astute -- the page I used is a pretty bad benchmark for a variety
of reasons.  It sounds like you probably could hack up a much better
one.

a) Looks like pages were loaded exactly once, as per your notes?  How
 hard is it to run the tests long enough to get to a 95% confidence interval?

Since I was running on a simulated network with no random parameters
(e.g. no packet loss), there was very little variance in load time
across runs.
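
For reference, a 95% interval over repeated runs could be computed along these lines. This is a sketch only: the load times below are made-up numbers, not measurements from the whitepaper, and it uses a normal approximation rather than a t-distribution, which understates the interval for so few runs.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(samples, level=0.95):
    """Normal-approximation confidence interval for the mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = mean(samples)
    half_width = z * stdev(samples) / sqrt(len(samples))
    return centre - half_width, centre + half_width


# Hypothetical page-load times in seconds, one per run.
runs = [4.02, 3.98, 4.01, 3.99, 4.00]
low, high = confidence_interval(runs)
print(f"mean load time is between {low:.3f}s and {high:.3f}s")
```

When the variance across runs is this small, a handful of runs already gives a tight interval.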

d) What did you do about subdomains in the test?  I assume your test
 loaded from one subdomain?

That's correct.

 I'm betting time-to-paint goes through the roof with resource bundles:-)

It does right now because we don't support incremental extraction,
which is why I didn't bother measuring time-to-paint.  The hope is
that with incremental extraction, we won't take too much of a hit.

-Justin

On Mon, Aug 9, 2010 at 1:30 PM, Mike Belshe m...@belshe.com wrote:
 Justin -
 Can you provide the content of the page which you used in your whitepaper?
 (https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820)
 I have a few concerns about the benchmark:
    a) Looks like pages were loaded exactly once, as per your notes?  How
 hard is it to run the tests long enough to get to a 95% confidence interval?
    b) As you note in the report, slow start will kill you.  I've verified
 this so many times it makes me sick.  If you try more combinations, I
 believe you'll see this.
    c) The 1.3MB of subresources in a single bundle seems unrealistic to me.
  On one hand you say that it's similar to CNN, but note that CNN has
 JS/CSS/images, not just thumbnails like your test.  Further, note that CNN
 pulls these resources from multiple domains; combining them into one domain
 may work, but certainly makes the test content very different from CNN.  So
 the claim that it is somehow representative seems incorrect.   For more
 accurate data on what websites look like,
 see http://code.google.com/speed/articles/web-metrics.html
    d) What did you do about subdomains in the test?  I assume your test
 loaded from one subdomain?
    e) There is more to a browser than page-load-time.  Time-to-first-paint
 is critical as well.  For instance, in WebKit and Chrome, we have specific
 heuristics which optimize for time-to-render instead of total page load.
  CNN is always cited as a bad page, but it's really not - it just has a
 lot of content, both below and above the fold.  When the user can interact
 with the page successfully, the user is happy.  In other words, I know I can
 make webkit's PLT much faster by removing a couple of throttles.  But I also
 know that doing so worsens the user experience by delaying the time to first
 paint.  So - is it possible to measure both times?  I'm betting
 time-to-paint goes through the roof with resource bundles:-)
 If you provide the content, I'll try to run some tests.  It will take a few
 days.
 Mike

 On Mon, Aug 9, 2010 at 9:52 AM, Justin Lebar justin.le...@gmail.com wrote:

 On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com
 wrote:
  If UAs can assume that files with the same path
  are the same regardless of whether they came from a resource package
  or which, and they have all but a couple of the files cached, they
  could request those directly instead of from the resource package,
  even if a resource package is specified.

 These kinds of heuristics are far beyond the scope of resource
 packages as we're planning to implement them.  Again, I think this
 type of behavior is the domain of a large change to the networking
 stack, such as SPDY, not a small hack like resource packages.

 -Justin

 On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com
 wrote:
  On Fri, Aug 6, 2010 at 7:40 PM, Justin Lebar justin.le...@gmail.com
  wrote:
  I think this is a fair point.  But I'd suggest we consider the
  following:
 
  * It might be confusing for resources from a resource package to show
  up on a page which doesn't opt-in to resource packages in general or
  to that specific resource package.
 
  Only if the resource package contains a different file from the real
  one.  I suggest we treat this as a pathological case and accept that
  it will be broken and confusing -- or at least we consider how many
  extra optimizations we could make if we did accept that, before
  deciding whether the extra performance is worth the confusion.
 
  * There's no easy way to opt out of this behavior.  That is, if I
  explicitly *don't* want to load content cached from a resource
  package, I have to name that content differently.
 
  Why would you want that, if the files are the same anyway?
 
  * The avatars-on-a-forum use case is less convincing the more I think
  about it.  Certainly you'd want each page

Re: [whatwg] HTML resource packages

2010-08-09 Thread Justin Lebar
The files I used for the rough benchmarks are available in a tarball
at [1].  Live pages are at [2] and [3].

[1] http://people.mozilla.org/~jlebar/respkg/test/benchmark_files.tgz
[2] http://people.mozilla.org/~jlebar/respkg/test/test-pkg.html
[3] http://people.mozilla.org/~jlebar/respkg/test/test-nopkg.html

-Justin

On Mon, Aug 9, 2010 at 1:40 PM, Justin Lebar justin.le...@gmail.com wrote:
 Can you provide the content of the page which you used in your whitepaper?
 (https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820)

 I'll post this to the bug when I get home tonight.  But your comments
 are astute -- the page I used is a pretty bad benchmark for a variety
 of reasons.  It sounds like you probably could hack up a much better
 one.

    a) Looks like pages were loaded exactly once, as per your notes?  How
 hard is it to run the tests long enough to get to a 95% confidence interval?

 Since I was running on a simulated network with no random parameters
 (e.g. no packet loss), there was very little variance in load time
 across runs.

    d) What did you do about subdomains in the test?  I assume your test
 loaded from one subdomain?

 That's correct.

 I'm betting time-to-paint goes through the roof with resource bundles:-)

 It does right now because we don't support incremental extraction,
 which is why I didn't bother measuring time-to-paint.  The hope is
 that with incremental extraction, we won't take too much of a hit.

 -Justin

 On Mon, Aug 9, 2010 at 1:30 PM, Mike Belshe m...@belshe.com wrote:
 Justin -
 Can you provide the content of the page which you used in your whitepaper?
 (https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820)
 I have a few concerns about the benchmark:
    a) Looks like pages were loaded exactly once, as per your notes?  How
 hard is it to run the tests long enough to get to a 95% confidence interval?
    b) As you note in the report, slow start will kill you.  I've verified
 this so many times it makes me sick.  If you try more combinations, I
 believe you'll see this.
    c) The 1.3MB of subresources in a single bundle seems unrealistic to me.
  On one hand you say that it's similar to CNN, but note that CNN has
 JS/CSS/images, not just thumbnails like your test.  Further, note that CNN
 pulls these resources from multiple domains; combining them into one domain
 may work, but certainly makes the test content very different from CNN.  So
 the claim that it is somehow representative seems incorrect.   For more
 accurate data on what websites look like,
 see http://code.google.com/speed/articles/web-metrics.html
    d) What did you do about subdomains in the test?  I assume your test
 loaded from one subdomain?
    e) There is more to a browser than page-load-time.  Time-to-first-paint
 is critical as well.  For instance, in WebKit and Chrome, we have specific
 heuristics which optimize for time-to-render instead of total page load.
  CNN is always cited as a bad page, but it's really not - it just has a
 lot of content, both below and above the fold.  When the user can interact
 with the page successfully, the user is happy.  In other words, I know I can
 make webkit's PLT much faster by removing a couple of throttles.  But I also
 know that doing so worsens the user experience by delaying the time to first
 paint.  So - is it possible to measure both times?  I'm betting
 time-to-paint goes through the roof with resource bundles:-)
 If you provide the content, I'll try to run some tests.  It will take a few
 days.
 Mike

 On Mon, Aug 9, 2010 at 9:52 AM, Justin Lebar justin.le...@gmail.com wrote:

 On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com
 wrote:
  If UAs can assume that files with the same path
  are the same regardless of whether they came from a resource package
  or which, and they have all but a couple of the files cached, they
  could request those directly instead of from the resource package,
  even if a resource package is specified.

 These kinds of heuristics are far beyond the scope of resource
 packages as we're planning to implement them.  Again, I think this
 type of behavior is the domain of a large change to the networking
 stack, such as SPDY, not a small hack like resource packages.

 -Justin

 On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor simetrical+...@gmail.com
 wrote:
  On Fri, Aug 6, 2010 at 7:40 PM, Justin Lebar justin.le...@gmail.com
  wrote:
  I think this is a fair point.  But I'd suggest we consider the
  following:
 
  * It might be confusing for resources from a resource package to show
  up on a page which doesn't opt-in to resource packages in general or
  to that specific resource package.
 
  Only if the resource package contains a different file from the real
  one.  I suggest we treat this as a pathological case and accept that
  it will be broken and confusing -- or at least we consider how many
  extra optimizations we could make if we did accept that, before
  deciding whether

Re: [whatwg] HTML resource packages

2010-08-06 Thread Justin Lebar
On Fri, Aug 6, 2010 at 12:46 AM, Christoph Päper
christoph.pae...@crissov.de wrote:
 Justin Lebar:
 Christoph Päper christoph.pae...@crissov.de wrote:

 Why do you want to put this on the HTML level (exclusively), not the HTTP 
 level?

 If you reference an image from a CSS file and include that CSS file in an 
 HTML file which uses resource packages, the image can be loaded from the 
 resource package.

 Yeah, it’s still wrong.

 Resource packages in HTML seem okay for the image gallery use case (and then 
 could be done with ‘link’), but they’re commonly inappropriate for anything 
 referenced from ‘link’, ‘script’ and ‘style’ elements. Your remark on loading 
 order just proves this point: you want resource packages referenced before 
 ‘head’. You should move one step further than the root element, i.e. to the 
 transport layer.

We want resource packages to work for people who don't have the
ability to set custom headers for their pages or who don't even know
what an HTTP header is.  I agree that it's a hack, but I don't
understand how putting the packages information in the html element
makes it inappropriate to load from a resource package resources
referenced in link, script, and style elements.

Is the issue just that the HTML file's |packages| attribute affects
what we load when we see @import url() in a separate CSS file?  This
seems like a feature, not a bug, to me.

SPDY will do this the Right Way, if we're patient.

-Justin


Re: [whatwg] HTML resource packages

2010-08-06 Thread Justin Lebar
 So if resource packages don't share caches, you need to either give up
 on caching, [or] put a given file only in one resource package on your
 whole site.  The latter is not practical if pages use small, fairly
 random subsets of your assets and it's not feasible to package them
 all on every page view.  Think avatars on a web forum

I think this is a fair point.  But I'd suggest we consider the following:

* It might be confusing for resources from a resource package to show
up on a page which doesn't opt-in to resource packages in general or
to that specific resource package.

* There's no easy way to opt out of this behavior.  That is, if I
explicitly *don't* want to load content cached from a resource
package, I have to name that content differently.

* The avatars-on-a-forum use case is less convincing the more I think
about it.  Certainly you'd want each page which displays many avatars
to package up all the avatars into a single package.  So you wouldn't
benefit from the suggested caching changes on those pages.

You might benefit on a user profile page which just displays one
avatar.  You might try and be clever and leave the avatar out of the
profile page's resource package on the assumption that the UA already
has that avatar in its cache.  But then your page would load slower
for users who visited the profile page without first getting the
avatar from another resource package.

Maybe you'd benefit from the suggested changes if you'd half-deployed
resource packages on your site, so some pages had packages and others
didn't.  But I don't think that's a use case we should design for.

In general, I think we need something like SPDY to really address the
problem of duplicated downloads.  I don't think resource packages can
fix it with any caching policy.

-Justin

On Fri, Aug 6, 2010 at 2:17 PM, Aryeh Gregor simetrical+...@gmail.com wrote:
 On Tue, Aug 3, 2010 at 8:31 PM, Justin Lebar justin.le...@gmail.com wrote:
 We at Mozilla are hoping to ship HTML resource packages in Firefox 4,
 and we wanted to get the WhatWG's feedback on the feature.

 For the impatient, the spec is here:

    http://people.mozilla.org/~jlebar/respkg/

 I have some concerns about caching behavior here, which I've mentioned
 before.  Consider a site that has a landing page that has lots of
 first-time viewers.  To accelerate that page view, you might want to
 add a resource package containing all the assets on the page, to speed
 up views in the cold cache case.  Some of those assets will be reused
 on other pages, and some will not.

 When the user navigates to another page, what's supposed to happen?
 If you hadn't used resource packages at all, they would have a hot
 cache, so they'd get all the shared assets on every subsequent page
 view for free.  But now they don't -- instead of the first view being
 slow, it's the second view, when they leave the landing page.  This
 isn't a big improvement.

 So if resource packages don't share caches, you need to either give up
 on caching, or put a given file only in one resource package on your
 whole site.  The latter is not practical if pages use small, fairly
 random subsets of your assets and it's not feasible to package them
 all on every page view.  Think avatars on a web forum: you might have
 20 different avatars displayed per page, from a pool of tens of
 thousands or more.  Do you have to decide between not using resource
 packages and not getting any caching?

 You've said before that your goal in this requirement is
 predictability -- if there's an inconsistency between different
 resource packages or between a resource package and the real file, you
 don't want users to get different results depending on what order they
 visit the pages in.  This is fair enough, but I'm worried that the
 caching problems this approach causes will make it more of a hindrance
 than a benefit for a wide class of use-cases.  There's some possible
 inconsistency anyway whenever caching is permitted at all, because if
 the page provides incorrect caching headers, the UA might have an
 out-of-date copy.  Also, different browsers will be inconsistent too,
 until all UAs in common use have implemented resource packages -- some
 will use the packaged file and some the real file.  Is the extra
 inconsistency from letting the caches mix really too much to ask for
 the cacheability benefits?  I don't think so.



Re: [whatwg] HTML resource packages

2010-08-04 Thread Justin Lebar
 Brett Zamir bret...@yahoo.com wrote:
 1) I think it would be nice to see explicit confirmation in the spec that 
 this works with offline caching.

Yes.  I'll do that.

 2) Could data files such as .txt, .json, or .xml files be used as part of
 such a package as well?

 3) Can XMLHttpRequest be made to reference such files and get them from the
 cache, and if so, when referencing only a zip in the packages attribute, can
 XMLHttpRequest access files in the zip not spelled out by a tag like link/?
 I think this would be quite powerful/avoid duplication, even if it adds
 functionality (like other HTML5 features) which would not be available to
 older browsers.

This is tricky.  The problem is: If you have an img on a page which might be
able to be served from a resource package, we'll block the download of the
image until we can either serve the request from a resource package or be sure
that no package contains the image.

I can imagine this behavior being confusing with XMLHttpRequests.  On the other
hand, it could certainly be powerful when used correctly.

I think the natural thing is to go ahead and treat things requested by an
XMLHttpRequest the same as anything else on a page and retrieve them from
packages where possible.  If you really don't want your XMLHttpRequest to block on
a resource package, you can always use a POST.  But I need to investigate more
to determine whether this makes sense.
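
A minimal sketch of the blocking decision described above, assuming a package
is modelled as a set of known entries plus a "complete" flag.  The function
name, the three result strings, and the package shape are all hypothetical;
they are not taken from the draft spec.

```javascript
// Hypothetical model: each package lists the entries seen so far and whether
// the whole package has finished downloading.
function decide(request, packages) {
  // Only GET requests are candidates, so a POST never waits on a package.
  if (request.method !== "GET") return "network";
  // Serve from any package already known to contain the URL.
  if (packages.some((p) => p.entries.has(request.url))) return "package";
  // A package that is still downloading might contain it: block for now.
  if (packages.some((p) => !p.complete)) return "wait";
  // Every package is complete and none contains the URL.
  return "network";
}

const loading = { complete: false, entries: new Set(["a.png"]) };
const done = { complete: true, entries: new Set(["b.json"]) };

console.log(decide({ method: "GET", url: "c.png" }, [loading]));
console.log(decide({ method: "GET", url: "c.png" }, [done]));
console.log(decide({ method: "POST", url: "a.png" }, [loading]));
```

The "wait" result is the case that could surprise XMLHttpRequest users: the
request is held until the package either yields the file or finishes without it.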

 4) Could such a protocol also be made to accommodate profiles of packages,
 e.g., by a namespace being allowable somewhere for each package?

This sounds way outside the scope of what we're trying to do with resource
packages.  I'm all for designing for the future, but I don't think we want to
introduce the complexity even of these namespaces unless we intend to use them
immediately.

 Maciej Stachowiak m...@apple.com wrote:

 Have you done any performance testing of this feature, and if so can you 
 share any of that data?

There's a document (PDF) with some rough performance numbers in the bug:

https://bugzilla.mozilla.org/attachment.cgi?id=455820

Although the results are preliminary, I think doing much more than this on a
simulated network for a test page might be going a bit overboard.  Results from
real pages over real networks would be much more meaningful at this point.

 Separately, I am curious to hear how http headers are handled; it's a TODO in
 the spec, and what the TODO says seems poor for the Content-Type header in
 particular. It would make it hard to use package resources in any context
 that looks at the MIME type rather than always sniffing. Any thoughts on
 this?

The intent is for UAs to sniff the content-type of anything coming from a
resource package, so I think that TODO needs to be turned on its head: The UA
shouldn't apply any of the response headers from the resource package to its
elements.

 Christoph Päper christoph.pae...@crissov.de wrote:
 A page indicates in its html element that it uses one or more resource 
 packages (…).

 Why do you want to put this on the HTML level (exclusively), not the HTTP 
 level?
 ...
 Images might be referenced from within HTML or CSS files.

If you reference an image from a CSS file and include that CSS file in an HTML
file which uses resource packages, the image can be loaded from the resource
package.

 Why did you decide against link rel=resource-package
 href=pkg1.zip#files='img1.png,…'/ or something like that? (The hash part
 is just guesswork.)

We actually originally spec'ed resource packages with the link tag, but we
encountered some difficulties with this.  For example, it led to confusing
behavior when a resource package was defined after a link rel='javascript'.
Do we load the script from the network, or do we wait until we've received the
whole head before loading any scripts?

Resource packages as a link also interacted poorly with Mozilla's speculative
parsing algorithm, which tries to download resources before we run the page's
scripts.  We probably could have come up with semantics which didn't run into
problems with our own speculative parsing implementation, but we realized it
would be difficult to spec it in such a way that we didn't make things very
difficult for *someone*.

 * Argument: What about incremental rendering?

The spec (and our implementation in Firefox) cares deeply about incremental
rendering.  Although the zip format isn't strictly suitable for incremental
extraction, I defined alternate semantics in the spec which should work.

Zip is better than tar-gz for this kind of thing for two reasons:

 * Zip file headers are uncompressed, so you don't have to extract the whole
   file in order to tell what's inside.

 * Entries in a zip file are individually compressed.  Although this might
   cause you to compress less effectively, you can compress all your files
   ahead of time and construct a zip file on the fly pretty cheaply.
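
To illustrate the first point, here is a rough sketch that writes and then
reads a zip local file header without touching any compressed data.  The
30-byte little-endian layout is the standard zip one; the helper names are
made up for this example, and the CRC and timestamp fields are left at zero,
so this is not a valid archive on its own.

```javascript
// Build a zip local file header followed by the file name.
function makeLocalHeader(name, method, compressedSize, uncompressedSize) {
  const nameBytes = new TextEncoder().encode(name);
  const buf = new Uint8Array(30 + nameBytes.length);
  const view = new DataView(buf.buffer);
  view.setUint32(0, 0x04034b50, true);        // signature "PK\x03\x04"
  view.setUint16(4, 20, true);                // version needed to extract
  view.setUint16(6, 0, true);                 // general purpose flags
  view.setUint16(8, method, true);            // 0 = stored, 8 = deflate
  // Offsets 10..17 (time, date, crc32) stay zero in this sketch.
  view.setUint32(18, compressedSize, true);
  view.setUint32(22, uncompressedSize, true);
  view.setUint16(26, nameBytes.length, true); // file name length
  view.setUint16(28, 0, true);                // extra field length
  buf.set(nameBytes, 30);
  return buf;
}

// Read the name and sizes back; no decompression is involved.
function readLocalHeader(buf) {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  if (view.getUint32(0, true) !== 0x04034b50) {
    throw new Error("not a local file header");
  }
  const nameLength = view.getUint16(26, true);
  return {
    name: new TextDecoder().decode(buf.subarray(30, 30 + nameLength)),
    method: view.getUint16(8, true),
    compressedSize: view.getUint32(18, true),
    uncompressedSize: view.getUint32(22, true),
  };
}

const header = makeLocalHeader("img/logo.png", 8, 1200, 4096);
console.log(readLocalHeader(header));
```

Because each entry carries its own header and its own compressed bytes, a
server could in principle concatenate precompressed entries in whatever order
it wants the UA to receive them.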

 Philip Taylor excors+wha...@gmail.com wrote:
 It seems a bit surprising that 

Re: [whatwg] HTML resource packages

2010-08-04 Thread Justin Lebar
 If you do want it to work the same then you'll need to hook into the
 parser and ignore dynamic updates.

Indeed.  And since I explicitly *do* want dynamic updates, it'll need to change.

Thanks.

On Wed, Aug 4, 2010 at 1:32 PM, Philip Taylor excors+wha...@gmail.com wrote:
 On Wed, Aug 4, 2010 at 9:01 PM, Justin Lebar justin.le...@gmail.com wrote:
 What happens if the document contains multiple html elements (not
 all the root element)? (e.g. if it's XHTML, or the elements are added
 by scripts). The packages spec seems to assume there is only ever one.

 The packages attribute should work like the manifest attribute currently
 works.  I don't see language in the cache manifest section of HTML5 (6.6)
 specifying what happens when there are multiple html elements, so I hope I
 don't need to specify this either.  :)

 http://whatwg.org/html#attr-html-manifest says:

  The manifest attribute only has an effect during the early stages
 of document load. Changing the attribute dynamically thus has no
 effect (and thus, no DOM API is provided for this attribute).

 Its effect is triggered from http://whatwg.org/html#parser-appcache
 (html token in the before html insertion mode) or from
 http://whatwg.org/html#read-xml , so it will only ever run for the
 root html element of the document.

 The packages attribute is defined as running Whenever the packages
 attribute is changed (including when the document is first loaded, if
 its html element has a packages attribute), so it's not the same.
 If you do want it to work the same then you'll need to hook into the
 parser and ignore dynamic updates.

 --
 Philip Taylor
 exc...@gmail.com



[whatwg] HTML resource packages

2010-08-03 Thread Justin Lebar
We at Mozilla are hoping to ship HTML resource packages in Firefox 4,
and we wanted to get the WhatWG's feedback on the feature.

For the impatient, the spec is here:

http://people.mozilla.org/~jlebar/respkg/

and the bug (complete with builds you can try and some preliminary
performance numbers) is here:

https://bugzilla.mozilla.org/show_bug.cgi?id=529208


You can think of resource packages as image spriting 2.0.  A page
indicates in its html element that it uses one or more resource
packages (which are just zip files).  Then when that page requests a
resource (be it an image, a css file, a script, or whatever), the
browser first checks whether one of the packages contains the
requested resource.  If so, the browser uses the resource out of the
package instead of making a separate HTTP request for the resource.
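
As a rough illustration of that lookup, the sketch below models each package
as a Map from resource path to contents and counts how many separate HTTP
requests are still needed.  The function and variable names are hypothetical,
and the real spec has more detail than this.

```javascript
// Hypothetical model: a package is a Map from resource path to contents.
function loadResource(path, packages, fetchFromNetwork) {
  // Check the page's packages first, in the order they were declared.
  for (const pkg of packages) {
    if (pkg.has(path)) {
      return { source: "package", body: pkg.get(path) };
    }
  }
  // No package has it, so fall back to a separate HTTP request.
  return { source: "network", body: fetchFromNetwork(path) };
}

let httpRequests = 0;
const fakeFetch = (path) => {
  httpRequests += 1;
  return "network copy of " + path;
};

const pkg1 = new Map([
  ["logo.png", "packaged logo"],
  ["site.css", "packaged css"],
]);

const fromPackage = loadResource("logo.png", [pkg1], fakeFetch);
const fromNetwork = loadResource("avatar42.png", [pkg1], fakeFetch);
console.log(fromPackage.source, fromNetwork.source, httpRequests);
```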

There's more detail than that, of course.  Hopefully it's
(mostly) clear in the spec.

I envision two classes of users of resource packages.  I'll call the
first resource-constrained developers.  These developers care about
how fast their page is (who doesn't?), but can't spend weeks speeding
up their page.  For these developers, resource packages are an easy
way to make their pages faster without going through the pain of
spriting their images and packaging their js/css.

The other class of users are the resource-unconstrained developers;
think Google or Facebook.  These developers have already put a huge
amount of effort into making their pages fast, and a naive application
of resource packages is unlikely to make them any faster.  But these
developers may be able to use resource packages cleverly to gain
speedups.  In particular, nobody (to my knowledge) currently sprites
content images, such as the results of an image search.  A determined
set of developers should be able to construct resource packages for
image search results on the fly and save some HTTP requests.


So that we don't have to rehash the common objections to resource
packages, here's a brief overview of the arguments I've heard against
the feature and my responses.

* Argument: Packaging isn't the way forward.  When you change one
resource in a package you have to change the whole package and so the
user has to re-download all the bits when most of what was in their
cache would have been fine.

This is of course correct, but we don't think it eliminates the
utility of resource packages.  The resource-constrained developer is
probably happy with anything which speeds up page loads, even if it's
not optimal when one part of the page changes.  And the
resource-unconstrained developer probably won't find resource packages
too useful for non-dynamic content, so caching isn't an issue in that
case.

* Argument: We can already package things pretty well.  Mozilla should
instead be focusing on improving caching (or something else).

I'd contend that we don't package particularly well in general.  The
Facebook homepage loads 100 separate resources on a cold cache, and
they certainly care about speed.  But anyway, this is just one
project.  We're also looking at caching.  :)

* Argument: Isn't this subsumed by HTTP pipelining?

Mostly.  But we can't turn on HTTP pipelining because transparent
proxies break it.

Resource packages have the further benefit that they allow page
authors to explicitly set the order in which the UA will download the
resources -- with pipelining, an important resource might get stuck
behind a large, unimportant resource, while with resource packages,
the UA always downloads resources in the order they appear in the zip
file.

Last, my understanding is that the HTTP pipeline isn't particularly
deep, so perhaps resource packages fill the TCP pipe better on
high-latency connections.  I haven't looked into this, though.

* Argument: What about SPDY?

I think SPDY should subsume resource packages.  But its deployment
will require changes to both web clients and servers, so it will
probably take a while after it's released before it's available on all
web servers.  And we have no idea when to expect SPDY to be ready for
production.  Resource packages, in contrast, are something we can have
Right Now.

Additionally, since resource packages are backwards-compatible -- a
page which specifies resource packages should display just fine in a
browser which doesn't support them -- we should be able to turn off
resource packages in the future if we decide we don't want them
anymore.


We'd love to hear what you think of the specification and our implementation.

-Justin


Re: [whatwg] history.pushState() and replaceState()'s title parameter

2010-07-22 Thread Justin Lebar
Just to follow up on this: We just pushed a change to Firefox to
completely ignore the title parameter, as WebKit does.

We're getting close to locking down Firefox for the next release.  If
we want to do something more creative with the title parameter, now is
the time for action.

-Justin

On Wed, Jun 23, 2010 at 11:15 AM, Justin Lebar justin.le...@gmail.com wrote:
 Safari 5 and Chrome 5 recently shipped the history.pushState and
 replaceState methods.  Firefox 4 will also include those methods when
 it ships.

 pushState and replaceState take three arguments: An opaque data
 object, a title, and an optional URL.  Currently, Safari and Chrome
 both ignore the title parameter.

 Jonas Sicking jo...@sicking.cc and I have been talking with Brady
 Eidson beid...@apple.com and Darin Fisher da...@chromium.org,
 about what we can do to clean up this API, since having an unused
 parameter in our brand-new functions is unfortunate.

 Ideally, we might change the pushState and replaceState methods
 themselves, perhaps changing them so they only take a URL and an
 optional data object.  But since Chrome and Safari have already
 shipped the method, and since we hear that the functions are already
 being used on the web, it's probably too late to add or remove
 arguments from the functions.

 It seems that the intent of the spec as it stands is that the title
 parameter should show up in the session history list (shown e.g. when
 you click the down arrow next to the forward button), but not in the
 application's title bar.  We think this is confusing (as evidence,
 observe that two browsers skipped this step!) and adds a lot of
 complexity for a small amount of gain, so we're not in favor of this
 approach.  If modifying the document's title in the session history
 list is a desirable feature, then we could expose that property to the
 DOM just as we expose document.title.

 Seeing as we're stuck with the title argument in pushState and
 replaceState, we propose that it modify document.title in an intuitive way:

 * Before we unload a history entry, we save document.title into the
 history entry.
 * When we activate a history entry, we set document.title to the value
 stored in the history entry.
 * When we pushState, we set document.title to the title parameter
 after activating the new history entry.
 * When we replaceState, we set document.title to the title parameter.

 In the last two cases, if the title parameter is empty, we leave
 document.title unchanged.

 We think this is a good compromise between complexity and functionality.

 -Justin



[whatwg] push/replaceState interacting with POSTs

2010-07-16 Thread Justin Lebar
We have a minor issue using replaceState in Bugzilla that we may or
may not want to fix up in the spec.

When you make a change to a bug, Bugzilla POSTs you from a nice-looking URL, say

https://bugzilla.mozilla.org/show_bug.cgi?id=577720 ,

to

https://bugzilla.mozilla.org/process_bug.cgi

This is annoying because it breaks refresh and bookmarking, even
though process_bug.cgi is logically displaying the same page that
show_bug.cgi was previously displaying.

Apparently fixing this the Right Way is difficult in Bugzilla, so the
developers are considering using history.replaceState() to change the
URL of process_bug.cgi back to show_bug.cgi?id=xxx.

This works well, but it has the small problem that when you refresh
the page after processing a bug, Firefox shows you the warning it
shows when you refresh a page which was POST'ed to.

I wonder if calling push/replaceState should cause the browser to
consider the affected history entry as the result of a GET, even if it
was the result of a POST.  Bugzilla may be abusing the API here a bit,
but it's still not clear that we're doing the right thing when we
prompt the user on a refresh (or if we were to refuse to load the page
on a session restore since the load isn't idempotent).
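
The idea can be sketched with a toy model of a history entry that remembers
which method produced it.  Everything here is hypothetical (browsers do not
expose such an object to pages); it only shows the proposed rule that
replaceState resets the entry to "result of a GET".

```javascript
// Toy history entry: the URL plus the HTTP method that produced the page.
function makeEntry(url, method) {
  return { url, method };
}

// Proposed behaviour: after replaceState, treat the entry as a GET result.
function replaceState(entry, url) {
  entry.url = url;
  entry.method = "GET";
}

// The refresh warning only applies to entries that came from a POST.
function refreshNeedsWarning(entry) {
  return entry.method === "POST";
}

const entry = makeEntry("https://bugzilla.mozilla.org/process_bug.cgi", "POST");
const warnedBefore = refreshNeedsWarning(entry);
replaceState(entry, "https://bugzilla.mozilla.org/show_bug.cgi?id=577720");
const warnedAfter = refreshNeedsWarning(entry);
console.log(warnedBefore, warnedAfter);
```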

I'm curious what the WhatWG thinks of this.

-Justin


Re: [whatwg] push/replaceState interacting with POSTs

2010-07-16 Thread Justin Lebar
On Fri, Jul 16, 2010 at 3:11 PM, Aryeh Gregor simetrical+...@gmail.com wrote:
 What do other browsers do?

Chrome 6.0.458.1 dev on Linux warns on refresh after a pushState or a
replaceState.

Firefox trunk (Mozilla/5.0 (X11; Linux x86_64; en-US; rv:2.0b2pre)
Gecko/20100716 Minefield/4.0b2pre) warns on a refresh only after a
replaceState.

http://people.mozilla.org/~jlebar/test/general/pushstate-post.html

-Justin

On Fri, Jul 16, 2010 at 3:11 PM, Aryeh Gregor simetrical+...@gmail.com wrote:
 On Fri, Jul 16, 2010 at 1:13 PM, Justin Lebar justin.le...@gmail.com wrote:
 We have a minor issue using replaceState in Bugzilla that we may or
 may not want to fix up in the spec.

 When you make a change to a bug, Bugzilla POSTs you from a nice-looking URL, say

    https://bugzilla.mozilla.org/show_bug.cgi?id=577720 ,

 to

    https://bugzilla.mozilla.org/process_bug.cgi

 This is annoying because it breaks refresh and bookmarking, even
 though process_bug.cgi is logically displaying the same page that
 show_bug.cgi was previously displaying.

 Apparently fixing this the Right Way is difficult in Bugzilla, so the
 developers are considering using history.replaceState() to change the
 URL of process_bug.cgi back to show_bug.cgi?id=xxx.

 This is a standard nuisance: you want to display a success/failure
 message.  You don't want to just display it in the POST result,
 because then you get browser warnings, the URL can't be copy-pasted,
 etc.  You don't want to tack it on as a URL parameter because then the
 success/failure messages can be forged.  There's no good answer I'm
 aware of barring tedious server-side trickery (like queuing up a
 message for display on the next view of certain types of pages).

 replaceState() sounds like it should be a decent solution if
 implemented as you'd like, although it only works if JavaScript is
 enabled, so it's not ideal.

 This works well, but it has the small problem that when you refresh
 the page after processing a bug, Firefox shows you the warning it
 shows when you refresh a page which was POST'ed to.

 I wonder if calling push/replaceState should cause the browser to
 consider the affected history entry as the result of a GET, even if it
 was the result of a POST.  Bugzilla may be abusing the API here a bit,
 but it's still not clear that we're doing the right thing when we
 prompt the user on a refresh (or if we were to refuse to load the page
 on a session restore since the load isn't idempotent).

 I'm curious what the WhatWG thinks of this.

 I'd think that hitting refresh when the URL has been changed by
 JavaScript should load the current URL displayed in the location bar.
 If this is different from the actual URL that the page was originally
 served from, then submitting POST data that was submitted for the
 current page probably makes no sense, so treating the new request in
 all ways as a GET seems like the only sensible thing.  So I'd say this
 is a Firefox bug, if Firefox does this.  (What do other browsers do?
 WebKit implements replaceState, right?)



[whatwg] Ambiguity re firing the popstate event

2010-06-30 Thread Justin Lebar
Section 6.5.9.1 [1] says:

 The popstate event is fired when navigating to a session history entry that 
 represents a state object.

In contrast, section 6.5.9 [2] indicates in step 10 that a popstate
event is fired if the history entry represents a state object or the
first entry for a document.

Unfortunately this ambiguity has caused WebKit and Mozilla to
implement popstate in two different ways [3].

I think we can resolve this in the spec by changing the line from 6.5.9.1 to:

 The popstate event is fired when navigating to a session history entry.

-Justin

[1] http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#event-definitions
[2] http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#history-traversal
[3] https://bugs.webkit.org/show_bug.cgi?id=41372


[whatwg] history.pushState() and replaceState()'s title parameter

2010-06-23 Thread Justin Lebar
Safari 5 and Chrome 5 recently shipped the history.pushState and
replaceState methods.  Firefox 4 will also include those methods when
it ships.

pushState and replaceState take three arguments: An opaque data
object, a title, and an optional URL.  Currently, Safari and Chrome
both ignore the title parameter.

Jonas Sicking jo...@sicking.cc and I have been talking with Brady
Eidson beid...@apple.com and Darin Fisher da...@chromium.org,
about what we can do to clean up this API, since having an unused
parameter in our brand-new functions is unfortunate.

Ideally, we might change the pushState and replaceState methods
themselves, perhaps changing them so they only take a URL and an
optional data object.  But since Chrome and Safari have already
shipped the method, and since we hear that the functions are already
being used on the web, it's probably too late to add or remove
arguments from the functions.

It seems that the intent of the spec as it stands is that the title
parameter should show up in the session history list (shown e.g. when
you click the down arrow next to the forward button), but not in the
application's title bar.  We think this is confusing (as evidence,
observe that two browsers skipped this step!) and adds a lot of
complexity for a small amount of gain, so we're not in favor of this
approach.  If modifying the document's title in the session history
list is a desirable feature, then we could expose that property to the
DOM just as we expose document.title.

Seeing as we're stuck with the title argument in pushState and
replaceState, we propose that it modify document.title in an intuitive way:

* Before we unload a history entry, we save document.title into the
history entry.
* When we activate a history entry, we set document.title to the value
stored in the history entry.
* When we pushState, we set document.title to the title parameter
after activating the new history entry.
* When we replaceState, we set document.title to the title parameter.

In the last two cases, if the title parameter is empty, we leave
document.title unchanged.
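
For concreteness, the four rules can be sketched as a small model of a
session with one document.  The class and its field names are invented for
this example, and two details are assumptions on my part: a freshly pushed
entry starts out with the current title, and pushing drops any forward
entries.

```javascript
// Toy model of the proposed title handling; not an implementation of the spec.
class Session {
  constructor(title) {
    this.documentTitle = title;
    this.entries = [{ savedTitle: title }];
    this.index = 0;
  }
  // Rule 1: before unloading an entry, save document.title into it.
  unloadCurrent() {
    this.entries[this.index].savedTitle = this.documentTitle;
  }
  // Rule 2: on activation, restore document.title from the entry.
  activate(i) {
    this.index = i;
    this.documentTitle = this.entries[i].savedTitle;
  }
  // Rule 3: pushState activates the new entry, then applies the title.
  pushState(title) {
    this.unloadCurrent();
    this.entries = this.entries.slice(0, this.index + 1); // assumption: prune
    this.entries.push({ savedTitle: this.documentTitle });
    this.activate(this.entries.length - 1);
    if (title !== "") this.documentTitle = title;
  }
  // Rule 4: replaceState applies the title unless it is empty.
  replaceState(title) {
    if (title !== "") this.documentTitle = title;
  }
  go(delta) {
    const target = this.index + delta;
    if (target < 0 || target >= this.entries.length) return;
    this.unloadCurrent();
    this.activate(target);
  }
}

const s = new Session("Inbox");
s.pushState("Message 1");
s.go(-1);
const titleAfterBack = s.documentTitle;
s.go(1);
s.replaceState("");
const titleAfterForward = s.documentTitle;
console.log(titleAfterBack, titleAfterForward);
```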

We think this is a good compromise between complexity and functionality.

-Justin


Re: [whatwg] History API, pushState(), and related feedback

2010-02-10 Thread Justin Lebar
 On Thu, Jan 14, 2010, Hixie...oh dear.

 On Tue, 18 Aug 2009, Justin Lebar wrote:
 (An attempt at describing how pushstate is supposed to be used.)

 That's not quite how I would describe it. It's more that each entry in the
 session history has a URL and optionally some data. The data can be used
 for two main purposes: first, storing a preparsed description of the state
 in the URL so that in the simple case you don't have to do the parsing
 (though you still need the parsing for handling URLs passed around by
 users, so it's only a minor optimisation), and second, so that you can
 store state that you wouldn't store in the URL, because it only applies to
 the current Document instance and you would have to reconstruct it if a
 new Document were opened.

 An example of the latter would be something like keeping track of the
 precise coordinate from which a popup div was made to animate, so that
 if the user goes back, it can be made to animate to the same location. Or
 alternatively, it could be used to keep a pointer into a cache of data
 that would be fetched from the server based on the information in the URL,
 so that when going back and forward, the information doesn't have to be
 fetched again.

 Basically any information that you would not include in a URL describing
 the page, but which could be useful when going backwards and forwards in
 the history.

Can we publish this somewhere?  This is crucial and not obvious.

 If the Document is not recoverable, then recovering the state object makes
 little sense, IMHO. We should not be encouraging a world in which the
 meaningful state of a page is described by more than its URL. However,
 it's a UA decision whether to enable this or not.

Yes, but we want to make sure we're making the right UA decision. :)

I approached this from a different angle: Does it make sense to persist the
fact that two history entries with (potentially) different URLs correspond to
the same document across session history?  If pushState is supposed to replace
using the hash to store data, then we should persist this fact across session
restores, right?  But then we have to also persist the state data; otherwise,
if the page used pushState with no URL argument, it wouldn't be able to
distinguish between the two states.

I think you have a strong argument above.  On the other hand, the fact that
history entries X and Y are in fact the same Document is itself page state
which isn't stored in the URL.

 On Tue, 5 Jan 2010, Justin Lebar wrote:

 I think this is correct.  A popstate event is always dispatched whenever
 a new session history entry is activated (6.10.3).

 Actually if multiple popstates are fired before 'load' fires, all but the
 last are discarded, and the last waits until after 'load' fires to be
 fired. But otherwise yes.

Oh, interesting.  I didn't even notice that popstate is async again.  Good to
know.
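
A small sketch of that coalescing rule as I read it here: popstates requested
before 'load' overwrite one another, and only the last is dispatched once
'load' has fired.  The class and method names are hypothetical and only model
the behaviour described in this thread.

```javascript
// Toy dispatcher that holds back popstate until 'load' has fired.
class PopstateQueue {
  constructor(dispatch) {
    this.dispatch = dispatch;
    this.loaded = false;
    this.pending = null;
  }
  request(state) {
    if (this.loaded) {
      this.dispatch(state);
      return;
    }
    // Before 'load': keep only the most recent request.
    this.pending = { state };
  }
  loadFired() {
    this.loaded = true;
    if (this.pending !== null) {
      const { state } = this.pending;
      this.pending = null;
      this.dispatch(state);
    }
  }
}

const fired = [];
const queue = new PopstateQueue((state) => fired.push(state));
queue.request("first");  // discarded
queue.request("second"); // held until load
queue.loadFired();
queue.request("third");  // dispatched immediately
console.log(fired);
```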

-Justin


Re: [whatwg] question about the popstate event

2010-01-12 Thread Justin Lebar
If I'm understanding the bug correctly, Brady is suggesting not that a
popstate event isn't fired when we navigate back to a document which
has been unloaded from memory, but that the state object in that
popstate event is null.

As I understand it, the crux of his argument relates to the algorithm
to update the session history with the new page [1]:

2) If the navigation was initiated for entry update of an entry

   1) Replace the entry being updated with a new entry representing
  the new resource and its Document object and related state.

I think he's arguing that the set of related state that is copied to
the new entry does not contain the state object.  His evidence for
this is mostly textual: This state is referenced in other parts of the
spec, and in those places, it's made clear that the state consists of
scroll position and form fields:

(From comment #4 at https://bugs.webkit.org/show_bug.cgi?id=33224)
 I believe state in this context is not referring to state objects, but
 rather persisted user state as set forth in 5.11.9 step 3:
 For example, some user agents might want to persist the scroll position, or
 the values of form controls.

I think this is a good point from a textual perspective.

But I think it's clear that we actually want to persist state objects
across Document unloads.  If we didn't care about this use case, we
could do away with state objects altogether.  A document could just
call pushstate with no state variable and store its state objects in a
global variable indexed by an identifier in the URL.  When the page
receives a popstate, it checks its URL and grabs the relevant state
object.  Simple.  (This doesn't handle multiple entries with the same
URL, but hash navigation doesn't handle that either, so that's not a
big problem.)
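
Here is roughly what that workaround would look like, and why it stops
working once the document is unloaded.  The "view" query parameter, the
example URL, and the helper names are all made up for illustration; in a real
page the push would also call history.pushState with a null state object.

```javascript
// Page-side store of state objects, keyed by an identifier in the URL.
const stateById = new Map();

function idFromUrl(url) {
  // The base URL is a placeholder so that relative URLs can be parsed.
  return new URL(url, "http://example.invalid/").searchParams.get("view");
}

// Stands in for: history.pushState(null, "", url) plus local bookkeeping.
function pushWithLocalState(url, state) {
  stateById.set(idFromUrl(url), state);
}

// Stands in for a popstate handler that looks at the current URL.
function stateForPopstate(currentUrl) {
  return stateById.get(idFromUrl(currentUrl));
}

pushWithLocalState("/forum?view=2", { scrollY: 300 });
const whileLoaded = stateForPopstate("/forum?view=2");

stateById.clear(); // stands in for the document being unloaded
const afterUnload = stateForPopstate("/forum?view=2");
console.log(whileLoaded, afterUnload);
```

The lookup succeeds only while the global survives, which is the case where
browser-managed state objects add nothing over this approach.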

My point is that state objects are pretty much useless unless you
persist them after the document has been unloaded.

I also think the fact that we take the structured clone of a state
object before saving it (and that structured clone forbids pointers to
DOM objects and whatnot) indicates that the spec intended for state
objects to stick around after document unload.  Otherwise, why bother
making a restrictive copy?
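
To show what "restrictive copy" means in practice, the sketch below uses the
structuredClone() global found in current browsers and recent Node.js
releases.  That function postdates this thread, and a function value stands
in for a DOM pointer here since there is no DOM outside a browser, so treat
this as an illustration of the algorithm rather than of pushState itself.

```javascript
// Plain data survives cloning and comes back as an independent deep copy.
const state = { scroll: { x: 0, y: 120 }, opened: new Date(0) };
const copy = structuredClone(state);
const isDeepCopy = copy.scroll !== state.scroll && copy.scroll.y === 120;

// Values that only make sense inside the live document are rejected.
let rejected = false;
try {
  structuredClone({ onclick: () => {} });
} catch (e) {
  rejected = true;
}

console.log(isDeepCopy, rejected);
```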

(It should go without saying that if you're saving state objects
across document unloads, you should also be saving the "has same
document" relationships between history entries.  That is, suppose
history entry A calls pushstate and creates history entry B.  Some
time later, the document for A and B is unloaded, then the user goes
back to B, which is re-fetched into a fresh Document.  Then the user
clicks back, activating A.  We should treat the activation of A from B
as an activation between two entries with the same document, and not
re-fetch A.)

Where the spec needs to be clarified to support this, I think it
should be.  But let's first agree that this is the right thing to do.

-Justin

[1] http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#update-the-session-history-with-the-new-page

On Tue, Jan 12, 2010 at 3:54 PM, Darin Fisher da...@chromium.org wrote:
 Hi,
 I've been discussing this issue with Brady Eidson over at
 https://bugs.webkit.org/show_bug.cgi?id=33224, and his interpretation
 appears to be different.  (I think he may have convinced me too.)
 I'd really like some help understanding how pushState is intended to work
 and to see how that lines up with the spec.
 Also, assuming Brady is correct, then I wonder why pushState was designed
 this way.  It seems strange to me that entries in session history would
 disappear when you navigate away from a document that used pushState.
 -Darin

 On Tue, Jan 5, 2010 at 6:55 PM, Justin Lebar justin.le...@gmail.com wrote:

  From my reading of the spec, I would expect the following steps:
  5. Page A is loaded.
  6. The load event for Page A is dispatched.
  7. The popstate event for Page A is dispatched.

 I think this is correct.  A popstate event is always dispatched
 whenever a new session history entry is activated (6.10.3).

 -Justin

 On Tue, Jan 5, 2010 at 4:53 PM, Darin Fisher da...@chromium.org wrote:
  I'd like to make sure that I'm understanding the spec for pushState and
  the popstate event properly.
  Suppose, I have the following sequence of events:
  1. Page A is loaded.
  2. Page A calls pushState(foo, null).
  3. The user navigates to Page B.
  4. The user navigates back to Page A (clicks the back button once).
  Assuming the document of Page A was disposed upon navigation to Page B
  (i.e., that it was not preserved in a page cache), should a popstate
  event be generated as a result of step 4?
  From my reading of the spec, I would expect the following steps:
  5. Page A is loaded.
  6. The load event for Page A is dispatched.
  7. The popstate event for Page A is dispatched.
  Do I understand correctly?
  Thanks,
  -Darin




Re: [whatwg] question about the popstate event

2010-01-05 Thread Justin Lebar
 From my reading of the spec, I would expect the following steps:
 5. Page A is loaded.
 6. The load event for Page A is dispatched.
 7. The popstate event for Page A is dispatched.

I think this is correct.  A popstate event is always dispatched
whenever a new session history entry is activated (6.10.3).

-Justin

On Tue, Jan 5, 2010 at 4:53 PM, Darin Fisher da...@chromium.org wrote:
 I'd like to make sure that I'm understanding the spec for pushState and the
 popstate event properly.
 Suppose, I have the following sequence of events:
 1. Page A is loaded.
 2. Page A calls pushState(foo, null).
 3. The user navigates to Page B.
 4. The user navigates back to Page A (clicks the back button once).
 Assuming the document of Page A was disposed upon navigation to Page B
 (i.e., that it was not preserved in a page cache), should a popstate event
 be generated as a result of step 4?
 From my reading of the spec, I would expect the following steps:
 5. Page A is loaded.
 6. The load event for Page A is dispatched.
 7. The popstate event for Page A is dispatched.
 Do I understand correctly?
 Thanks,
 -Darin


Re: [whatwg] Question about pushState

2009-12-16 Thread Justin Lebar
On Wed, Dec 16, 2009 at 3:06 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Wed, Dec 16, 2009 at 11:51 AM, Darin Fisher da...@chromium.org wrote:
 I would have expected it to behave like a reference
 fragment navigation, which prunes *all* forward session history entries.

 I agree. I *think* what you are suggesting is what the implementation
 that Justin Lebar has written for Firefox does.

Yes, with my patch, the forward button is never active after a
pushState.  It wasn't an intentional deviation from the spec, but I
agree with Darin's reasoning: If pushState is a replacement for the
hash-navigation hack, then it should behave like a hash navigation.
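
In model form, the behaviour looks something like the sketch below, where a
session history is just a list of URLs plus a current index.  The function
names and the data shape are hypothetical and are not taken from the Firefox
patch.

```javascript
// Push a new entry and drop everything after the current one, the same way
// a fragment navigation prunes forward history.
function pushEntry(history, url) {
  const entries = history.entries.slice(0, history.index + 1);
  entries.push(url);
  return { entries, index: entries.length - 1 };
}

function canGoForward(history) {
  return history.index < history.entries.length - 1;
}

// The user is on a.html with two forward entries available.
const before = { entries: ["a.html", "b.html", "c.html"], index: 0 };
const after = pushEntry(before, "a.html?page=2");
console.log(canGoForward(before), canGoForward(after), after.entries);
```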

-Justin


[whatwg] push/replaceState title parameter (was AJAX History Concerns)

2009-11-23 Thread Justin Lebar
On Mon, Nov 23, 2009 at 5:01 PM, Ian Hickson i...@hixie.ch wrote:
 On Fri, 13 Nov 2009, Justin Lebar wrote:
 On Thu, Nov 12, 2009 at 5:43 PM, Ian Hickson i...@hixie.ch wrote:
  The idea is that the string you would put into the back button or
  history menu is not the same as the string you would put into the
  title bar or bookmarks (i.e. not the same as title).

 That doesn't seem too unreasonable, but I think it's strange to set that
 title through push/replaceState, since an alternate page title is
 orthogonal to the idea of an AJAX page with state objects.

 No more so than an alternative URL, surely?

I'm not sure I agree.  It seems to me that if you set the page's URL,
it's likely that you'll want to change the state object (if you're not
storing all your data in the URL).  On the other hand, one might want
to change the history entry title without ever changing the URL or the
state object.  In the simple case, consider a page which uses no AJAX
at all, but just wants to display a shorter title in the history than
in the titlebar of the browser.  Does it make sense for this page to
call history.replaceState(null, 'new title');?

 It might be confusing to expose this alternate title in the document
 object, but perhaps we could expose it as a property or setter function
 somewhere else.  Then we could persist it properly across forward /
 backs within the same document.

 It seems like that would just cause everyone to call pushState() and
 updateTitle() instead of just calling pushState(), except that then people
 would forget to update the title and your history would have a bunch of
 silly-looking titles like Inbox (3), Inbox (20), Inbox (4).

Well, people are already going to have to call pushState() and then
set document.title if they want to update the title at the top of the
browser, even if they specify a title in pushState().

I imagine that most pages aren't going to try to maintain two parallel
sets of titles.  For these cases, I think a pushState() function
without a title and propagating document.title changes into the
history entry makes sense, because this is what those pages already
were doing without pushstate.  For those pages which really want to
have two titles, it doesn't seem unreasonable to me that they should
have to write an extra line of code to explicitly set the history
entry's title.

Without this extra setHistoryEntryTitle() function, I think the API
for updating the history entry title becomes unnecessarily
complicated.  If you haven't used pushState() or replaceState(), then
the history entry's title gets updated when you modify document.title.
 But as soon as you call one of those functions, the two titles become
permanently unlinked, and further updates to the history entry's title
have to go through replaceState.  And if you want to change the
history entry's title, you now have to save or reconstruct a copy of
your state object just so you can pass it back to replaceState().

In addition to avoiding this complexity, the updateTitle() function
has the advantage that it allows us to call |updateTitle(undefined)|
(or something) to re-link the two titles.

I guess the essential question is whether we see the history entry
title as being a separate feature from pushState.  If most or all
pages will update the history entry title only in response to a
pushState or a replaceState that they'd have made anyway, then maybe
it makes sense to keep the history entry title there.  But I don't see
why the features should be coupled like that.  By analogy, none of us
would argue that we should couple setting document.title with clicking
links and setting document.location.

-Justin


Re: [whatwg] push/replaceState title parameter (was AJAX History Concerns)

2009-11-23 Thread Justin Lebar
On Mon, Nov 23, 2009 at 6:46 PM, Ian Hickson i...@hixie.ch wrote:
 On Mon, 23 Nov 2009, Justin Lebar wrote:
 I'm not sure I agree.  It seems to me that if you set the page's URL,
 it's likely that you'll want to change the state object (if you're not
 storing all your data in the URL).  On the other hand, one might want to
 change the history entry title without ever changing the URL or the
 state object.  In the simple case, consider a page which uses no AJAX at
 all, but just wants to display a shorter title in the history than in
 the titlebar of the browser.  Does it make sense for this page to call
 history.replaceState(null, 'new title');?

 I've never heard anyone asking for this; do you have a concrete example?

In the absence of push/replaceState, changes to document.title
propagate to the history entry title -- they're linked together.
Calling pushState unlinks them in the sense that after the call,
changes to document.title no longer affect the history entry's title.
To modify the history entry's title when residing at a history entry
which was pushState'd to, you have to call replaceState.

Thus you'd need to call history.replaceState(currentStateObject,
newTitle) when you changed document.title on a page which was
pushState'd to and wanted to reflect that change in the history entry.
 Suppose Gmail wanted to update the unread messages count in both the
history and in document.title.
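
Here is a rough model of that linking and unlinking, as I understand the
behavior being discussed.  The Page class, its fields, and the "linked"
flag are all invented for the sketch; it is not how any browser stores
titles.

```javascript
// Toy model of the title behaviour described in this thread; the class
// and its fields are made up for illustration.
class Page {
  constructor(title) {
    this.docTitle = title;
    this.entries = [{ title, state: null, linked: true }];
    this.index = 0;
  }
  get current() { return this.entries[this.index]; }
  setDocumentTitle(title) {
    this.docTitle = title;
    // Only an entry never touched by push/replaceState follows along.
    if (this.current.linked) this.current.title = title;
  }
  pushState(state, title) {
    this.entries.length = this.index + 1;
    this.entries.push({ title, state, linked: false });
    this.index += 1;
  }
  replaceState(state, title) {
    Object.assign(this.current, { title, state, linked: false });
  }
}

const page = new Page("Inbox (3)");
page.setDocumentTitle("Inbox (4)");
const linkedTitle = page.current.title;   // follows document.title
page.pushState({ folder: "inbox" }, "Inbox");
page.setDocumentTitle("Inbox (20)");
const unlinkedTitle = page.current.title; // still "Inbox"
// To reflect the change, the state object has to be passed back in.
page.replaceState(page.current.state, page.docTitle);
const updatedTitle = page.current.title;
console.log(linkedTitle, unlinkedTitle, updatedTitle);
```

The last step is the awkward part: the page has to hand its own state
object back just to change a title.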

Honestly, I don't think adding an extra set of titles will be
particularly useful, and I imagine that most websites will use just
one title for both the history entry and the browser title.  But
that's exactly the problem: As soon as you call pushState, you now
have to be aware that changes to document.title now no longer affect
the history title.

To be clear, my contention is that pushState shouldn't have a title
parameter, not that we should have an updateHistoryEntryTitle()
function.  I'm fine with the idea of the history entry title
reflecting the state of document.title immediately before the most
recent time we navigated away from that entry, as it does now.  But if
we want to allow the titles to be set independently, I don't think
pushState is the right mechanism.

 By analogy, none of us would argue that we should couple setting
 document.title with clicking links and setting document.location.

 Actually, I would; that's exactly what I'm arguing in fact. With normal
 navigation, the coupling is done by the UA (first setting the title to the
 URL, and then updating it when a title element is found during parsing).
 With pushState(), the navigation is implicit (scripted) and so the URL
 and title changes have to be done explicitly.

This doesn't suggest that we shouldn't have an
updateHistoryEntryTitle() function, just as the existence of <title>
doesn't suggest that code for modifying the document's title should be

document.navigateTo(document.location, newTitle)

Adding an updateHistoryEntryTitle() function while leaving the title
parameter in pushState might be better than things are now.  But since
we have to explicitly set document.title after a pushState anyway,
removing the title from pushState doesn't create any more work for the
vast majority of use cases.  I don't see why we need to add all this
complexity to support the edge use case where the history title and
document title are different.

-Justin


Re: [whatwg] Do we really need history.clearState()?

2009-11-14 Thread Justin Lebar
On Sat, Nov 14, 2009 at 5:23 PM, timeless timel...@gmail.com wrote:
 what if pushState returned a value which could be passed to clearState?

I'm not sure how this would work.  What would clearState do with that value?

 (i can't find clearState in
 http://www.whatwg.org/specs/web-apps/current-work/#dom-history-pushstate

Hixie removed it a few days ago.

-Justin


Re: [whatwg] AJAX History Concerns

2009-11-12 Thread Justin Lebar
 The title [argument to pushState] is purely advisory. User agents might use 
 the title in the user interface.

 But unlike the URL which actually changes in the Document object and is 
 therefore exposed to the DOM, this purely advisory title change is hidden 
 from the DOM.  I'm questioning the reasoning behind this distinction and am 
 curious if it was intentional or not.

What I did in my Firefox patch (which should be checked into trunk
within a few weeks, I hope) is use that title only to identify the
history entry in the pull-down back-forward menu (what's shown when
you click the down arrow next to the forward button in Firefox).

If you want the rest of the UI (e.g. browser title bar) to match up
with this title, you have to set document.title.  In fact, if you
pushState with title 'Foo', then navigate back and then forward, the
history entry's title will be reset to the document's title.  (I
intend to write something detailing tricks like these once we land the
pushState patch.)

On the one hand, the implementation as it is allows developers some
control over the history entry title independent of the document
title, and perhaps that's useful.  On the other hand, most use cases I
can imagine for setting the history entry title are only useful if it
persists between back/forwards.

It appears in my testing that if you do pushState(title1);
document.title=title2 Firefox shows title2 in both the local and
global history, so setting document.title appears to subsume most of
the functionality of pushState's title argument.

We could make the API change document.title and remember that change
between back/forwards, but I think that would be unnecessarily
complicated.  After a pushState, you'd get a new document which shares
all mutable state *except* its title with its sibling.

Unless there's a compelling use for it, perhaps we should simplify the
API by getting rid of the title parameter altogether.  One can pretty
easily update document.title on popStates manually.  But perhaps I'm
missing something; I recall at one time being convinced that the title
parameter was important.  :)

 [Given] A1 - A2 - B1 - *B2* - B3 - C1 - C2

 When this method is invoked, the user agent must remove from the session 
 history all the entries from the first state object entry for that Document 
 object up to the last entry that references that same Document object, if 
 any.

 In my original message I liberally interpreted this to mean the new current 
 entry should be a copy of B3 but without the state object because, clearly, 
 we just removeState()ed.

I don't think removing the entry from the history implies that we
clear its state object.

Certainly the spec could be clarified, however.  I don't think that
Marius's reading here, that B1, B2, and B3 would all be removed, is
completely unsupported by the text.  But I also don't think that's
what we want.

If I understand things correctly, we always remove the current entry
after a clearState.  So perhaps the language could be

When this method is invoked, the user agent must remove from the
session history all the entries from the first state object entry for
that Document up to the second-to-last entry that references the same
Document.  The current entry is then set to the one remaining entry
for the Document.
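
Applied to the A1 .. C2 example, that wording might work out as in the
sketch below.  The clearState function here is a stand-in operating on a
plain array, and it assumes (for simplicity) that every entry for the
Document is eligible for removal, which glosses over whether B1 is a
"state object entry" at all.

```javascript
// Entries are labelled by Document (A, B, C); "current" is an index.
// Assumption: every entry for the current Document counts as removable.
function clearState(entries, current) {
  const doc = entries[current].doc;
  // Find the contiguous run of entries that share the current Document.
  let first = current;
  while (first > 0 && entries[first - 1].doc === doc) first -= 1;
  let last = current;
  while (last + 1 < entries.length && entries[last + 1].doc === doc) last += 1;
  // Remove first..second-to-last; the last entry for the Document stays
  // and becomes the current entry.
  const kept = [...entries.slice(0, first), ...entries.slice(last)];
  return { entries: kept, current: first };
}

const names = ["A1", "A2", "B1", "B2", "B3", "C1", "C2"];
const entries = names.map((name) => ({ name, doc: name[0] }));
const result = clearState(entries, 3); // current entry is B2
const remaining = result.entries.map((e) => e.name);
const currentName = result.entries[result.current].name;
console.log(remaining, currentName);
```

Under those assumptions the history ends up as A1, A2, B3, C1, C2 with
B3 current.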

That said, we didn't implement clearState when we did
push/replaceState because it's hard to get right and we don't
currently have a compelling use case.  There are probably lots of
things we'd change if we were going to implement it -- for instance,
why go back to the last entry instead of staying at the current one?
But that's probably a conversation for another thread.

-Justin


[whatwg] Do we really need history.clearState()?

2009-11-12 Thread Justin Lebar
As I alluded to in the thread AJAX History Concerns, I'm not
convinced that we need the history.clearState() function.

I haven't been able to come up with a compelling case where a page
would use this.  I guess the idea is that I'm on Google Maps, which is
using pushState to make a history entry every time I scroll the map.
If I scroll around a lot, it might clobber my history and make it hard
to go back to the page I was at before I began looking at the map.
But it could be nice and at some point (possibly triggered by some
user action) call clearState.  Then I'd be able to click back and
actually go back to the Document I was previously viewing.

clearState as it exists doesn't match this use case particularly well.
 If we were concerned about clobbering history, we'd probably want to
keep the two or three newest history entries and throw out all the
rest of them.  If you were really clever, you might be able to
accomplish this by calling clearState and then using pushState to
reconstruct the part of the history you want to keep.  But getting the
URLs right would be pretty tricky, especially if clearState took you
to the last entry for the document, as currently specified.

clearState is also useless if you don't use this single-document
pushState model for your site.  If we think clearing the history is
useful for AJAX pages, I'm not sure why it wouldn't be useful for a
web application which loads multiple documents.

I think the use case I proposed is much better served by something
like history.truncate(numBefore, numAfter), which would remove all but
the numBefore entries before the current entry and the numAfter
entries after the current entry.  We'd subject this to the same-origin
policy, of course, and stop removing entries in a direction as soon as
we encountered an entry from another origin.

I'm not sure if history.truncate() is a good idea -- do we really want
to give pages that kind of control over the history? -- but at least I
can actually imagine a page using it.
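
To make the idea a bit more concrete, here is one way the semantics could
be modelled.  history.truncate() does not exist anywhere; the function
below works on a plain array of entries, and the exact stopping rule at
origin boundaries is my guess at what we'd want.

```javascript
// Hypothetical history.truncate(numBefore, numAfter), modelled on a plain
// array of { url, origin } entries. Returns the new list and current index.
function truncate(entries, current, numBefore, numAfter) {
  const origin = entries[current].origin;
  const keep = entries.map(() => true);
  // Walk away from the current entry in one direction. Stop at the first
  // entry from another origin; past the keep limit, mark for removal.
  const sweep = (step, limit) => {
    for (let i = current + step, seen = 1;
         i >= 0 && i < entries.length;
         i += step, seen += 1) {
      if (entries[i].origin !== origin) break;
      if (seen > limit) keep[i] = false;
    }
  };
  sweep(-1, numBefore);
  sweep(+1, numAfter);
  const kept = entries.filter((_, i) => keep[i]);
  return { entries: kept, current: kept.indexOf(entries[current]) };
}

const list = [
  { url: "http://other.example/x", origin: "http://other.example" },
  { url: "http://maps.example/1", origin: "http://maps.example" },
  { url: "http://maps.example/2", origin: "http://maps.example" },
  { url: "http://maps.example/3", origin: "http://maps.example" },
  { url: "http://maps.example/4", origin: "http://maps.example" },
  { url: "http://maps.example/5", origin: "http://maps.example" },
];
const truncated = truncate(list, 4, 2, 0); // current entry is /4
const keptUrls = truncated.entries.map((e) => e.url);
console.log(keptUrls, truncated.current);
```

Here /1 and /5 are dropped, the other-origin entry is left alone, and
the current entry is still /4.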

Perhaps a better idea is leaving this whole issue to the UA, which
could collapse all the entries from a single origin in the UI.  Then
we wouldn't need either function.

-Justin


Re: [whatwg] Do we really need history.clearState()?

2009-11-12 Thread Justin Lebar
On Thu, Nov 12, 2009 at 1:08 PM, Olli Pettay olli.pet...@helsinki.fi wrote:
 On 11/12/09 10:00 PM, Justin Lebar wrote:

 Perhaps a better idea is leaving this whole issue to the UA, which
 could collapse all the entries from a single origin in the UI.  Then
 we wouldn't need either function.

 How would UA collapse entries from a single origin?

Right now, the back button means "take me to the previous history
entry".  The UA could add a "take me back to the previous
document/origin" button.

Similarly, the browser could collect together all the entries from a
document or origin in the drop-down menu of history entries (the down
arrow next to the forward button in Firefox).  When you click the down
arrow, it could show a list of documents/origins, and when you hovered
over an entry in the list, it could expand out and show all the
entries associated with that document/origin.
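
As a sketch of the grouping only (not of any actual browser UI), the
function below collapses consecutive entries that share an origin.
collapseByOrigin and the example URLs are invented for illustration.

```javascript
// Group consecutive session history entries that share an origin, the way
// a UA might collapse them in a back/forward menu. Pure data model.
function collapseByOrigin(urls) {
  const groups = [];
  for (const url of urls) {
    const origin = new URL(url).origin;
    const last = groups[groups.length - 1];
    if (last && last.origin === origin) last.urls.push(url);
    else groups.push({ origin, urls: [url] });
  }
  return groups;
}

const groups = collapseByOrigin([
  "http://news.example/story",
  "http://maps.example/?ll=1,1",
  "http://maps.example/?ll=1,2",
  "http://maps.example/?ll=2,2",
  "http://shop.example/cart",
]);
console.log(groups.map((g) => `${g.origin} (${g.urls.length})`));
```

The menu would then show three rows, with the middle one expanding into
the three map entries.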

 Brady Eidson beid...@apple.com wrote:
 Imagine the use case of the checkout procedure at an online merchant. [...]

I think this is a pretty good example of where clearState actually
helps.  I'm not sure how general it is, though.  A designer who wants
to use clearState in this way is forced to begin the checkout wizard
in a new Document.  Maybe that's OK, but it seems like an arbitrary
limitation to me.

-Justin


[whatwg] pushState / replaceState nits

2009-11-01 Thread Justin Lebar
In section 6.10.2:

"The pushState(data, title, url) method adds a state object to the
history." perhaps should be "... adds a state object *entry* to the
history."

"The replaceState(data, title, url) method updates the current entry
in the history to have a state object." perhaps should be
"The replaceState(data, title, url) method adds a stateObject to the
current history entry or modifies (updates?) the entry's stateObject."

"When either of these methods are invoked" should be "When either of
these methods is invoked".

-Justin


Re: [whatwg] Criticism of pushState (was Global Script proposal)

2009-09-08 Thread Justin Lebar
To be clear, I'm not suggesting that pushState obviates the need for
global script.  My point is that pushState is useful in its own right,
with or without global script.

Without pushstate, you can't make a non-hash navigation without
hitting the network.  Even if you're clever and store all of JQuery
and your whole DOM in global storage, if you want to change the
pre-hash part of the URI, you need to load a new page.

Imagine Google Maps trying to update the URI to match your current
location as you pan around the map.  Right now, they could update the
hash as you panned.  With pushstate, they could update the URI in an
arbitrary way.  With global state, they'd have to load a new page
every time you panned.  That's obviously worse, and probably not even
an option.

 Then why the heck would we want to come up with a fancier way to
 provide hash-navigation?

Perhaps the point is to do something which works like hash-navigation,
but to the user, looks like real navigation.  Imagine Bugzilla using
pushstate to navigate between bugs, but keeping the familiar
show_bug.cgi?id=1234 URI.  I don't pretend that the code necessary to
make this work would be easy to write, but it's certainly no more
difficult than changing the hash, and the resulting URLs are much
nicer.

 Once you introduce pushState, you deviate from the normalcy -- now you
 can have a URL in the address bar that the user agent hasn't requested
 from the server.

Again, this is just what happens when you're at your Gmail Inbox and
click a link to http://mail.google.com/mail/#Drafts.  You now have a
URL in the address bar that the UA hasn't requested from the server.
pushState improves this -- at least now the URL you didn't request
from the server looks like one which you plausibly might have
requested from the server.

 I really don't care about how the URLs
 look. I just want the Web development to be easier. And in my humble
 opinion, building a request controller in JS and essentially a whole
 alternative reality navigation system using hashes is not.

If you don't care how URLs look or if you don't mind making a network
request when you navigate a page, then don't use the feature!  A lot
of people do care about one or both of those things, though, and
they're willing to go through the pain of developing these
alternative-reality navigation systems.

PushState does not subsume global script.  For many applications,
storing the whole DOM in global script would get you sufficiently fast
navigations -- I agree.

But global script does not subsume pushState, either.  Even with
global script, you can't change the URI arbitrarily without navigating
the page.  Panning on Google Maps and changing the referer sent to a
page are two instances where extra page navigations might be
unacceptable.

I understand that pushState doesn't alleviate much of the pain of
developing no-navigation web apps.  But I don't think that's a reason
to get rid of it.

-Justin


[whatwg] Criticism of pushState (was Global Script proposal)

2009-09-07 Thread Justin Lebar
 Dimitri Glazkov wrote:
 But more to the point, I think globalScript is a good replacement for
 the pushState additions to the History spec.

I'm not sure I agree.  pushState lets you change the URI very quickly,
without doing any kind of navigation at all.  To emulate a pushState
with globalScript, you'd have to save and restore the whole document,
and the browser would still have to do at least one network request,
unless you were only changing the hash of the URI.

 I am becoming
 somewhat convinced that pushState is confusing, hard to get right, and
 full of fail. You should simply look at the motivation behind building
 JS-based history state managers -- it all becomes fairly clear.

Could you elaborate on these points?  It seems to me that pushState
attacks a specific problem and delivers a simple solution which is
much better than the current workarounds (using the URL's hash to
identify a page and store state).  Yes, it's nontrivial to develop an
AJAX app which uses pushState and works correctly with bookmarking and
page refreshes.  On the other hand, pushState makes this a lot easier
than it would be otherwise.

 My big issue with pushHistory is that it messes with the nature of the
 Web: a URL is a resource you request from the server. Not something
 you arrive to via clever sleight of hand in a user agent.

Like it or not, this ship has already sailed.  When I load Gmail, I'm
taken to https://mail.google.com/mail/#inbox, but my browser never
sends #inbox to the server as part of the HTTP request.  Pandora and
Facebook do something like this too.  Perhaps the new intuition is
that a URL tells you how to get back to where you were.

 So, you've managed to pushState your way to
 a.com/some/path/10/clicks/from/the/home/page. Now the user bookmarks
 it. What are you going to do now?

When reading this message in Gmail, my browser shows that I'm at
https://mail.google.com/mail/#label/WhatWG/{guid} .  If I bookmark
this page and go back to it, Gmail takes me back to this exact
message.  There's no actual resource named #label/WhatWG/{guid} on
Google's servers, but the URL I bookmarked is sufficient to identify
where I was, and Gmail's servers were intelligent enough to take me
there.

Maybe you think that Gmail's URLs should name real resources; maybe
they should look like
https://mail.google.com/mail.cgi?label=WhatWG&message={guid} or
something.  I'm not convinced this is better, but even if it suits
you, pushState still helps you navigate between mail.cgi?label=WhatWG
and mail.cgi?label=Drafts without a page refresh.

I think pushState API is really useful, but what do I know?  We're
going to land it in Firefox trunk Real Soon Now, so developers and
members of this list will be able to play with it and decide for
themselves whether it's the right API to solve the problem at hand.

-Justin


Re: [whatwg] first script and impersonating other pages - pushState(url)

2009-09-03 Thread Justin Lebar
Mike Wilson wrote:
 The result is that the address bar URL can't be trusted, as
 any page on the site can impersonate any other without
 consent from that page or part of the site?

Someone will correct me if I'm wrong, but I think this is already
pretty much the case with today's same-origin policy, albeit with a
bit more work.  My understanding is that if A and B have the same
origin, they can do whatever they want to each other's documents,
including modifying content.  So if you can control script at
http://google.com/~mwilson , and a user has both your site and
http://google.com/securesite open, then your malicious page can do
whatever it wants to the secure page.

That's why it's important that you trust all the javascript which runs
on your origin.

-Justin


Re: [whatwg] Proposed changes to the History API

2009-08-21 Thread Justin Lebar
 Sorry, it seems we are not talking about the same application.
 Jonas referred to attachment pages in your bug database, which
 I assumed would f ex be a page like this one:
 https://bugzilla.mozilla.org/attachment.cgi?id=386244&action=edit
 (The textarea in this app is not created onload, it is delivered
 in the server-generated HTML and thus is subject to form field
 value persistence.)

STR:
  * Open https://bugzilla.mozilla.org/attachment.cgi?id=386244&action=edit
  * Click Edit as comment
  * Change the text in the textarea
  * Close and re-open your browser

Actual behavior: The textarea is back to its original state, read-only
and without your edits.  Even after you press edit as comment, the
state still doesn't reflect the changes you made before you closed the
browser.

Behavior with History API: When you click edit as comment and as you
type your comments, the page periodically saves the data to pageState.
 When the page receives a popstate, it restores the state of the
textarea.

I imagine that one could rework the Bugzilla page to function better
on browser restart using existing web technologies.  But as the page
is designed right now, some kind of pageStorage would be helpful.

-Justin


Re: [whatwg] Proposed changes to the History API

2009-08-21 Thread Justin Lebar
Mike Wilson wrote:
 What you're essentially saying here is that when restarting
 the browser, you will also restore history data, correct?

 For tabs that were open when the browser was closed, this
 will mean that these will reappear after restart with full
 history, being able to go Back and restore state on
 previous pages?

Right.  We already do this, sans popping a state object.

 But for pages that were explicitly closed, and then
 navigated to in a new tab, will you restore the full
 history in these as well?

No.  The state object is attached to the session history entry, not to
the page's URI.  If you close a tab, all its session history entries
go away.  If you navigate to a page which was open in the tab you just
closed, that new instance of the page won't be aware of the old page's
state object(s).
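
A small model of that distinction, with invented names (Tab,
replaceState on a tab, mail.example): state hangs off a tab's session
history entries and is never looked up by URL.

```javascript
// Toy model: state objects live on a tab's session history entries and
// are never looked up by URL. All names here are illustrative.
class Tab {
  constructor() { this.entries = []; this.index = -1; }
  navigate(url) {
    this.entries.length = this.index + 1;
    this.entries.push({ url, state: null });
    this.index += 1;
  }
  replaceState(state) { this.entries[this.index].state = state; }
  get currentState() { return this.entries[this.index].state; }
}

let tab1 = new Tab();
tab1.navigate("http://mail.example/inbox");
tab1.replaceState({ scroll: 120 });
const savedInTab1 = tab1.currentState;
tab1 = null; // closing the tab discards its entries

const tab2 = new Tab();
tab2.navigate("http://mail.example/inbox"); // same URL, new entry
const seenInTab2 = tab2.currentState;       // null
console.log(savedInTab1, seenInTab2);
```

The second tab loads the same URL but starts with no state object.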

 And if there has been several sessions in parallel on that
 URL space, which one do you respawn for a navigation to a
 related page in a new tab?

A navigation on a new tab would get an entirely new environment.
Otherwise, like you suggested, this would be very confusing.

-Justin


Re: [whatwg] Proposed changes to the History API

2009-08-20 Thread Justin Lebar
On Wed, Aug 19, 2009 at 5:31 PM, Jeremy Orlow jor...@chromium.org wrote:
 but here it seems like everything can just stay in memory...right?

My thought was that if you had a tab open and restarted the browser,
the state objects would be there after the restart, so we'd have
to serialize to disk.  I also thought that we'd persist this state
data even after we take a Document out of memory.

It might be possible to store some subset of DOM objects while still
meeting those requirements, but that seems like it might be a serious
can of worms.  Do you have a use case which would be facilitated by
being able to store some DOM objects in this way?

-Justin


Re: [whatwg] Proposed changes to the History API

2009-08-20 Thread Justin Lebar
 I guess this is just a vision about what the developer really
 wants to do, or are you thinking of any solutions that would
 actually allow changing path (or query string) without loading
 a new Document?

The pushState function as currently specified allows you to do
precisely this.  History.pushState(obj, title, url) creates a new
history entry with the given URL, but doesn't load a new document.

  It would further be nice if your comments weren't lost even if you
 navigate away from the page.

 This is the way it works in most browsers, as the browser persists
 form field values when you navigate back and forth in history.

Right.  But the difficulty with this page in particular is that it's
structured such that it's difficult/impossible for the browser to
properly restore its form state after a crash.  Onload, the page
creates a textarea and populates it with the text of the patch.  So
if we crash then restore, the page won't have created the textarea by
the time the browser looks to restore the text.

One can imagine reworking this page to make it play nicely with
session restore as it currently exists, but what we really want is a
way to programmatically do the restore.

  click link to navigate from page1.html#a to page1.html#b:

 [snip]

I think this is pretty much what we want to do, except that we'd like
to let authors use arbitrary URIs instead of constraining them to
using URIs which differ only in their hashes, so we still want
PopState to fire on all loads, not like hashchange.

The idea of having an unload event similar to PopState is intriguing, however.

-Justin


Re: [whatwg] Proposed changes to the History API

2009-08-20 Thread Justin Lebar
On Thu, Aug 20, 2009 at 11:20 AM, Jeremy Orlow jor...@chromium.org wrote:
 I see.  It makes more sense why you mentioned the session storage element
 then.  Note that there has been some discussion about whether session
 storage should survive crashes, but I know Safari and Chrome are currently
 planning to _not_ serialize it to disk.

I just did a quick test, and it appears that Firefox does save
sessionStorage across browser sessions, but IE8 does not.

Leaving aside the question of what the right thing to do is with
sessionStorage, I think there are some serious benefits to saving the
pushState'ed state across sessions.  Suppose I'm using a webmail
client which uses this new API.  I click around to a few of my folders
and messages and then close the browser.

If the page wants the back/forward buttons to work when the browser
re-opens, it needs to store all of the state for those history entries
in the URI.  At the point that pages have to do that, we might as well
not store a per-page state object.

  I still think we shouldn't force app developers to serialize everything to
 strings.  Maybe we can just raise an exception if they try to set the
 history state to something unserializable?  (I guess that's what you're
 already doing?)

Right now, I just serialize to JSON and throw an exception if that
fails.  I don't have a problem continuing to do that, at least until
we get the structured clone thing sorted out.
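
Roughly, that amounts to something like the sketch below.  storeState is
a made-up helper for illustration, not the code in my patch; the only
real API it relies on is JSON.stringify.

```javascript
// Sketch of "serialize to JSON and throw if that fails". storeState is a
// made-up helper, not the Firefox implementation.
function storeState(state) {
  let json;
  try {
    json = JSON.stringify(state); // throws on circular references
  } catch (err) {
    throw new Error("state object is not serializable: " + err.message);
  }
  // JSON.stringify returns undefined (without throwing) for functions
  // and for undefined itself, so treat that as a failure too.
  if (json === undefined) throw new Error("state object is not serializable");
  return json;
}

const ok = storeState({ folder: "inbox", message: 42 });
const cyclic = {};
cyclic.self = cyclic;
let failed = false;
try {
  storeState(cyclic);
} catch (err) {
  failed = true;
}
console.log(ok, failed);
```

Anything that survives the round trip through JSON can also be written
to disk for session restore.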

-Justin

 On Thu, Aug 20, 2009 at 11:05 AM, Justin Lebar justin.le...@gmail.com
 wrote:

 On Wed, Aug 19, 2009 at 5:31 PM, Jeremy Orlow jor...@chromium.org wrote:
  but here it seems like everything can just stay in memory...right?

 My thought was that if you had a tab open and restarted the browser,
 that the state objects would be there after the restart, so we'd have
 to serialize to disk.  I also thought that we'd persist this state
 data even after we take a Document out of memory.

 It might be possible to store some subset of DOM objects while still
 meeting those requirements, but that seems like it might be a serious
 can of worms.  Do you have a use case which would be facilitated by
 being able to store some DOM objects in this way?

 -Justin




Re: [whatwg] Proposed changes to the History API

2009-08-20 Thread Justin Lebar
 Overall, I think preserving history API information when restoring sessions
 is a good thing.  My only concern is whether web developers will program in
 such a way that this works.  Unless ALL state will need to be either saved
 in the history API or reconstructible from that information, bad things will
 happen.  (Note that this was difficult if not impossible with the original
 API, but your new proposal makes this quite practical.)

Maybe the right solution is to have a pageStorage object, which works
just like sessionStorage but is local to a session history entry and
perhaps carries some weak promise of persistence.  It might be a
little confusing that in the following code

  var len1 = pageStorage.length
  history.pushState(...)
  var len2 = pageStorage.length

len1 != len2, but that doesn't seem too complicated.
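
To spell out why the lengths differ, here is a model of the idea.
pageStorage is only a proposal, so FakeHistory and its members are
hypothetical, and a Map (with .size rather than .length) stands in for
the storage area.

```javascript
// Model of the pageStorage idea: one storage area per session history
// entry. FakeHistory and its members are hypothetical.
class FakeHistory {
  constructor() { this.storages = [new Map()]; this.index = 0; }
  get pageStorage() { return this.storages[this.index]; }
  pushState() {
    this.storages.length = this.index + 1;
    this.storages.push(new Map()); // the new entry starts out empty
    this.index += 1;
  }
  back() { if (this.index > 0) this.index -= 1; }
}

const fake = new FakeHistory();
fake.pageStorage.set("draft", "hello");
const len1 = fake.pageStorage.size;
fake.pushState();
const len2 = fake.pageStorage.size; // a different entry's storage
fake.back();
const len3 = fake.pageStorage.size; // the original entry again
console.log(len1, len2, len3);
```

The same expression names a different storage area once the current
entry changes, which is the whole (slightly confusing) point.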

 Do most web apps that use iframe hacks (for tracking history) come back
 cleanly from a session restore?

I don't know, but I presume it would be possible so long as form data
is saved across session restore.

-Justin


[whatwg] Proposed changes to the History API

2009-08-18 Thread Justin Lebar
I'm in the process of implementing the HTML5 History API
(History.pushState(), History.clearState(), and the PopState event) in
Firefox.  I'd like to discuss whether the API might benefit from some
changes.  To my knowledge, no other browser implements this API, so
I'm assuming we have freedom to make large alterations to it.

My basic proposal is that History.pushState() be split into a function
for creating new history entries and functions or a property for
getting/setting an object associated with that entry.

In its current form, the History API allows us to identify session
history entries by way of an arbitrary object, which we pass as the
first argument to pushState() and which we receive as part of the
PopState event when that history entry is activated.  If the page gets
a null popstate, it's supposed to use the URL to decide what state to
display.
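
In page code that fallback might look roughly like this.  The viewFor
helper, the "view" query parameter, and mail.example are invented for
the sketch; a real page would call something like it from its popstate
handler.

```javascript
// Sketch of a popstate-style handler: prefer the state object, fall back
// to the URL when the state is null. The URL scheme is invented.
function viewFor(state, url) {
  if (state !== null) return state.view;
  const params = new URL(url).searchParams;
  return params.get("view") || "home";
}

const fromState = viewFor({ view: "drafts" }, "http://mail.example/?view=inbox");
const fromUrl = viewFor(null, "http://mail.example/?view=inbox");
const fallback = viewFor(null, "http://mail.example/");
console.log(fromState, fromUrl, fallback);
```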

Notably absent from this API is support for pages altering their
saved state.  For instance, a page might want to save a text box's
edit history to implement a fancy undo.  It could store the edit
history in a cookie or in the session storage, but then if we loaded
the page twice in the same tab, those two instances would step on each
other when we went back and forth between them.

The page could just store its state in variables in the document, but
then it would lose that state when the browser crashed or was closed,
or when the browser decided to kick the document out of the history.

I think this page would be better served by a History.setStateObject()
function, which does exactly what the page wants in a simple fashion.

We'd still keep the history-entry-creating functionality of
History.pushState() in a new History function (I'll call it
createNewEntry(), but it probably needs a better name), which takes a
title and URL, as pushState() does now.
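Here's a toy in-memory model of how the split might behave. Nothing here is a real browser API: ToyHistory and its go() callback are invented for illustration, createNewEntry() and setStateObject() are just the placeholder names from this mail, and I'm assuming the state is a plain string:

```javascript
// Toy model of the proposed split API. ToyHistory is made up for illustration;
// createNewEntry() and setStateObject() are the placeholder names used above.
class ToyHistory {
  constructor(url, onActivate) {
    this.entries = [{ title: "", url, state: null }];
    this.index = 0;
    this.onActivate = onActivate; // called when an entry becomes current
  }

  // Create a new entry after the current one, dropping any forward entries.
  createNewEntry(title, url) {
    this.entries.length = this.index + 1;
    this.entries.push({ title, url, state: null });
    this.index = this.entries.length - 1;
  }

  // Replace the state saved with the current entry; creates no new entry.
  setStateObject(state) {
    this.entries[this.index].state = state;
  }

  // Move by delta entries and report the activated entry's state and URL.
  go(delta) {
    const target = this.index + delta;
    if (target < 0 || target >= this.entries.length) return;
    this.index = target;
    const entry = this.entries[target];
    this.onActivate(entry.state, entry.url);
  }
}

const activations = [];
const toyHistory = new ToyHistory("/editor", (state, url) => activations.push([state, url]));
toyHistory.setStateObject("draft 1");         // save state on the first entry
toyHistory.createNewEntry("Step 2", "/editor#2");
toyHistory.setStateObject("draft 2");
toyHistory.setStateObject("draft 2, edited"); // alter it later, same entry
toyHistory.go(-1);                            // activates the first entry
toyHistory.go(1);                             // activates the second entry
```

The point is that saving state and creating an entry are independent: the page can update its saved state as often as it likes without growing the session history.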

The API might be more intuitive if we had a History.stateObject
property, but I'm concerned that then we'd be promising the page that
we'll keep around literally any objects it wants, including DOM
objects.  In fact, I'd be happy restricting the state object to being
a string.  If a page wants to store an object, it can convert it to
JSON, or it can store a GUID as its state string and index into the
session storage.
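Both options are easy for a page to do itself. A minimal sketch, where fakeSessionStorage is a Map standing in for session storage, and nextKey(), saveIndirect() and loadIndirect() are hypothetical helpers (a real page would want a proper GUID rather than a counter):

```javascript
// Option 1: serialize the page's state to a JSON string.
const editState = { text: "hello", undoStack: ["h", "he"] };
const stateString = JSON.stringify(editState);
const restored = JSON.parse(stateString);

// Option 2: keep only a key in the history entry and look the data up elsewhere.
// fakeSessionStorage is a Map standing in for session storage, and nextKey()
// is a stand-in for a real GUID generator.
const fakeSessionStorage = new Map();
let keyCounter = 0;
function nextKey() {
  keyCounter += 1;
  return "state-" + keyCounter;
}

function saveIndirect(state) {
  const key = nextKey();
  fakeSessionStorage.set(key, JSON.stringify(state));
  return key; // this string is what would be stored as the state string
}

function loadIndirect(key) {
  const raw = fakeSessionStorage.get(key);
  return raw === undefined ? null : JSON.parse(raw);
}

const stateKey = saveIndirect(editState);
const viaKey = loadIndirect(stateKey);
```

Either way the browser only ever has to hold on to a string.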

Pages could retrieve the state object just as they do now, in a
PopState event, although we'd probably want to change the name of the
event.  We'd probably want to fire PopState on all loads and history
navigations, since any document might have a state to pop, and even
those documents which didn't call setStateObject() might store state
in their URI which they need to restore when their history entry is
activated.
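A page's handler would then look roughly like the following. chooseView() is a hypothetical function, the example.com URLs are placeholders, and I'm assuming the state arrives as a JSON string or null:

```javascript
// chooseView() is a hypothetical handler body: given the state string delivered
// with the event (or null) and the entry's URL, decide what to display.
function chooseView(stateString, url) {
  if (stateString !== null) {
    return { source: "state", view: JSON.parse(stateString).view };
  }
  // No saved state: fall back to whatever the URL fragment encodes.
  const hashIndex = url.indexOf("#");
  const fragment = hashIndex === -1 ? "" : url.slice(hashIndex + 1);
  return { source: "url", view: fragment || "default" };
}

const fromState = chooseView(JSON.stringify({ view: "inbox" }), "http://example.com/mail");
const fromUrl = chooseView(null, "http://example.com/mail#drafts");
const fromPlainLoad = chooseView(null, "http://example.com/mail");
```

If the event fires on every load and history navigation, this one function covers the initial load, entries with saved state, and entries that only have a URL.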

Last, I'm not sure that we need the History.clearState() function.
It's confusing (why do we end up at the last entry for the current
document instead of staying at the current entry?) and I haven't been
able to come up with a compelling use case.

I think the main benefit of these changes is added simplicity.
There's a right and wrong way to use pushState, and
setStateObject/createNewEntry doesn't require such rules.  But additionally,
these changes allow pages flexibility to do things we haven't yet
thought of.  I don't know what those things might be, but I suspect
they may be pretty cool.  :)

-Justin


Re: [whatwg] Reading spec without boxes

2009-08-06 Thread Justin Lebar
Unbeknownst to me, I had a minimum font size of 12pt set.  FWIW, I
don't remember setting this, so it may have been a default.

-Justin

On Thu, Aug 6, 2009 at 2:09 PM, Ian Hickson <i...@hixie.ch> wrote:
 On Thu, 6 Aug 2009, Elliotte Rusty Harold wrote:

 Same issue on Firefox 3.5.1 Mac at various font sizes. :-(

 On Thu, 6 Aug 2009, Justin Lebar wrote:

 Happens to me on Ubuntu 9.04 with FF 3.5.2.

 Screenshot at [1] http://stanford.edu/~jlebar/moz/screen1.png

 Do either of you have a minimum font size preference set?

 --
 Ian Hickson               U+1047E                )\._.,--,'``.    fL
 http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
 Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'



Re: [whatwg] A New Way Forward for HTML5

2009-07-23 Thread Justin Lebar
 That being said, inline spec comments sound interesting.

 I'm not quite sure what the
 UI would look like, but if anyone has any ideas, feel free to e-mail me
 directly and we can figure something out. (This would be exceedingly
 useful once we're in last call in a few months.)

Ian,

Other people have probably pointed this out, but the hg book has
inline comments.  http://hgbook.red-bean.com/read/preface.html

Regards,
-Justin


Re: [whatwg] Plus Signs in Signed Integers

2009-07-15 Thread Justin Lebar
 What does IE do in these two examples?

It appears that IE8 has the following behavior:

  <ol start=+4>

start = 4

  <ol start=H2SO4>

start = 1

Test at http://stanford.edu/~jlebar/moz/list.html
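
For reference, a parser that gives these two results might look like the sketch below. parseStart() is a hypothetical helper, not IE's or any other browser's actual code, and I've only checked it against the two inputs above:

```javascript
// parseStart() is a hypothetical helper. It skips leading HTML whitespace,
// accepts an optional sign, then reads digits. If the value does not start
// that way, it gives up and returns the fallback (1 for <ol start>).
function parseStart(value, fallback = 1) {
  const match = /^[ \t\n\f\r]*([+-]?)([0-9]+)/.exec(value);
  if (match === null) return fallback;
  const magnitude = parseInt(match[2], 10);
  return match[1] === "-" ? -magnitude : magnitude;
}

const plusFour = parseStart("+4");    // leading plus sign is accepted
const acidTest = parseStart("H2SO4"); // no leading digits, so fallback
```

Note that this aborts on the leading "H" rather than hunting for the first digit later in the string.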

-Justin

On Tue, Jul 14, 2009 at 12:43 AM, Jonas Sicking <jo...@sicking.cc> wrote:
 On Thu, Jun 18, 2009 at 9:33 AM, Smylers <smyl...@stripey.com> wrote:
 It also doesn't seem to match browser behaviour: the ol element's
 start attribute is an integer, so I tried this out in various browsers:

  <ol start=+4>
    <li>Plus four
  </ol>

 All the ones I had to hand (Firefox, Opera, Konqueror, Dillo, Lynx,
 Links, and W3M) numbered the element with 4.

 [snip]

 To check that it is specifically the plus sign they are ignoring and not
 any non-digit character I also tried:

  <ol start=H2SO4>
    <li>Acid test
  </ol>

 That should cause parsing an integer to abort and so the default of
 start=1 to be used.  Opera, Links, and W3M get that right.  Konqueror,
 Dillo, and Lynx all also seem to manage the aborting, but use a default
 of zero instead.  Firefox parses the 2 out of H2SO4, seemingly using
 the first integer it can find in the attribute, so possibly isn't
 special-casing +.

 What does IE do in these two examples? It appears WebKit treats the
 first one as start=4 and the second as start=0.

 / Jonas