Re: [whatwg] html5 state handling: overview and extensions

2009-06-18 Thread Mike Wilson
Michael Nordman wrote:
 This breakdown is a useful way to think about application state
 in the browser.

Thanks, it has been very useful to me as a working model for
turning a vague sense that something is missing into something that
can be measured and compared. Ideally, a similar kind of
overview could be maintained in the HTML5 spec to keep both spec
authors and readers sane.

 Another axis that could be incorporated into the 
 model is lifetime.

You really hit the nail on the head there. Actually, my initial
attempt at a comparison table included a few additional columns:

General:
- state inherited by cloned browsing context
- persistent storage supported
- state protected/scoped by origin 
- state protected/scoped by origin+path
- lifetime control

Server-control:
- state travels to browser in response headers
- state travels to browser in response body

but I decided to start out with a cut-down version, as otherwise
no one would have read my post (it was long enough even after
being cut down ;-).

 There is some overlap between Server-
 Controlled and Script-Controlled realms in cookies, applications
 definitely have a dependency on that overlap being there.

Right, I missed including that in the post. Actually, in my 
initial table (mentioned above) I didn't split between client- 
and server-control, but had that as columns instead, so one
feature (cookies) could cover both script- and server-needs.
I should note that in this initial table of mine, normal
cookies serve both purposes but http-only cookies only do
server-controlled state. Here's an updated table with cookies
added:

SCRIPT-CONTROLLED STATE

Scope             Visibility : State construct
----------------- ---------- : ------------------------------
user agent        invisible  : cookies, WS localStorage
browsing context  invisible  : window.name, WS sessionStorage
document          invisible  : -
history entry     invisible  : history state
history entry     url        : url hash

 There
 are two constructs not represented in your writeup, Database and
 ApplicationCache.

Yes, they felt like somewhat different animals so I didn't
include them, maybe that was wrong. All the other ones are
some kind of simple data or name/value-paired data so I
focused on them.
My goal was also to make a matrix that would help us find
the missing pieces, by identifying the empty cells. When
adding too many different animals to the same matrix there
will be many empty cells, and it gets harder to see patterns.
Do you have ideas on how to incorporate Database and
ApplicationCache in the comparisons?
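
One rough way to make the "find the empty cells" idea concrete is to hold
the matrix as data and list the scope/visibility pairs that have no
construct. This is only a sketch, in Python purely for illustration: the
names STATE_MATRIX and empty_cells are made up here, and the entries are
just the rows of the script-controlled table above.

```python
from typing import Dict, List, Tuple

# (scope, visibility) -> state constructs, copied from the
# script-controlled table in this message. All names are illustrative.
STATE_MATRIX: Dict[Tuple[str, str], List[str]] = {
    ("user agent", "invisible"): ["cookies", "WS localStorage"],
    ("browsing context", "invisible"): ["window.name", "WS sessionStorage"],
    ("document", "invisible"): [],
    ("history entry", "invisible"): ["history state"],
    ("history entry", "url"): ["url hash"],
}


def empty_cells(matrix: Dict[Tuple[str, str], List[str]]) -> List[Tuple[str, str]]:
    """Return the (scope, visibility) pairs that have no construct."""
    return [cell for cell, constructs in matrix.items() if not constructs]


print(empty_cells(STATE_MATRIX))  # [('document', 'invisible')]
```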

  Possible solutions would be to add a new documentStorage to
  WebStorage, or to offer a History.setDocumentState method.
 
 I see other possibilities with WebStorage too.
   documentStorage has persistent lifetime + document scope
 others...
   transient lifetime + document scope  (transient does not 
   survive a browser restart)
   transient lifetime + user agent scope      temporaryStorage
   transient lifetime + history entry scope   privateStorage

Interesting ideas! My initial thought was that lifetime could be 
controlled in a way similar to cookies, i.e. no lifetime indicates
transient lifetime, and for persistent lifetime you set an 
expiration date/time. Maybe the legacy of cookie expiration 
doesn't have to be the model here, but it would seem good if 
some of the suggested extensions are indeed implemented as 
something cookie-like.
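
To illustrate what such cookie-like lifetime control might look like, here
is a minimal sketch, in Python purely for illustration. ExpiringStore and
its methods are hypothetical names, not part of WebStorage or of any
proposal; the only thing taken from the discussion is the rule that an
item without an expiry is transient and an item with one is persistent.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ExpiringStore:
    """Hypothetical key/value store with cookie-like lifetime control."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        # key -> (value, expiry in seconds since the epoch, or None if transient)
        self._items: Dict[str, Tuple[str, Optional[float]]] = {}

    def set_item(self, key: str, value: str, expires: Optional[float] = None) -> None:
        """Store a value; omitting `expires` makes the item transient."""
        self._items[key] = (value, expires)

    def get_item(self, key: str) -> Optional[str]:
        """Return the value, or None if it is missing or has expired."""
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and self._clock() >= expires:
            del self._items[key]
            return None
        return value

    def restart(self) -> None:
        """Simulate a browser restart: only items with an expiry survive."""
        self._items = {k: v for k, v in self._items.items() if v[1] is not None}


store = ExpiringStore()
store.set_item("draft", "hello")                     # transient
store.set_item("theme", "dark", time.time() + 3600)  # persistent for an hour
store.restart()
print(store.get_item("draft"), store.get_item("theme"))  # None dark
```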

mvh Mike



[whatwg] editorial ambiguity in definition of nav?

2009-06-18 Thread Bruce Lawson

Spec says

The nav element represents a section of a page that links to other pages  
or to parts within the page: a section with navigation links. Not all  
groups of links on a page need to be in a nav element — only sections that  
consist of primary navigation blocks are appropriate for the nav element.  
http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html#the-nav-element


"Primary navigation blocks" is ambiguous, imo. A page may have two nav
blocks: the first is site-wide navigation ("primary navigation") and the
second is within-page links, e.g. a table of contents, which many would
term "secondary nav".


Because of the use of the phrase "primary navigation block" in the spec, a
developer may think that her secondary nav should not use a nav element.


Suggest rewording along the lines of "only sections that consist of blocks
whose primary purpose is navigation around the page or within the site are
appropriate for the nav element", so - for example - lists of links to
sponsors/advertisers would not be marked up as nav elements.


--
Hang loose and stay groovy,

Bruce Lawson
Web Evangelist
www.opera.com (work)
www.brucelawson.co.uk (personal)


[whatwg] Using em for Meta-Content

2009-06-18 Thread Smylers
HTML 5 currently defines em as being for stress emphasis of its
contents, noting that:

  The placement of emphasis changes the meaning of the sentence.  The
  element thus forms an integral part of the content.

-- http://www.whatwg.org/html5#the-em-element

I'm not sure this definition is wide enough to encompass the use that
HTML 5 itself puts em to, using it for the "This section is
non-normative" bits at the start of sections, such as:

  http://www.whatwg.org/html5#introduction

The italics there don't seem to be indicating stress (and the sentence
doesn't warrant an exclamation mark at the end), more that it's
meta-content -- information about the section.

Of the current HTML 5 definitions, that seems closest to one of the
purposes of i: "an alternate voice or mood, or otherwise offset from the
normal prose":

  http://www.whatwg.org/html5#the-i-element

I suggest that either the definition of em is broadened to include
this sense, or these normativity designators are instead marked up with
something like <i class=normativity> or <i class=other>.

This meta-content use seems similar to an article by a guest author
being prefaced by an italicized paragraph from a regular author
introducing the guest.  Or editorial comments inserted into somebody
else's work, which are often in square brackets and italics as well as
having "- Ed" at the end.  Mainly it's just indicating some kind of
separation from the main text.

(strong isn't quite right for these uses either: while the sentence is
important, it's hardly the key information in that section.  If reading
the spec out loud to somebody, "This section is non-normative" is the
kind of thing I'd say very quickly, as boilerplate to be got out of the
way of the interesting content to follow (almost like legalese on radio
adverts).  That suggests the small element, but that isn't quite right
either: whether a section is normative is materially relevant to the
content, not just a legal technicality.)

Smylers


[whatwg] b Lede Example

2009-06-18 Thread Smylers
One of the examples of b is marking up a lede paragraph:

  http://www.whatwg.org/html5#the-b-element

Is a lede semantically relevant to the document such that it needs to be
in the mark-up?

Emboldening the first paragraph of an article seems like a matter of
style to me, similar to using a drop cap for the first letter.  For
example, if an article were syndicated to multiple news sites it's
conceivable that some would style the lede differently from other
paragraphs and some wouldn't -- and doing so or not wouldn't affect the
meaning of the article or its interpretation by readers.

So using CSS (I think article > h2 + p would do it) would seem more
appropriate than any mark-up here.  Especially since b is labelled as
a last resort.

Highlighting specifically just the first sentence (even if the first
paragraph has multiple sentences) is more awkward, in that I don't think
it's currently possible with CSS.  But that it exists as a plausible
choice for presenting an article demonstrates how much a matter of
styling, rather than content, this area is.  And a limit of CSS should
be fixed in CSS, not HTML.  (<span class=lede> can always be used as a
work-around.)

Smylers


[whatwg] Tool Implementor Audience

2009-06-18 Thread Smylers
One of the audiences for HTML is stated as "implementors of tools that
are intended to conform to this specification":

  http://www.whatwg.org/html5#audience

That seems circular, verging on tautologous: a tool author wondering
whether this spec is relevant to her (and therefore whether her tool
should aim to conform with it) isn't any better informed having read the
above.

And conversely, an author of a lousy tool (which attempts to parse
webpages but does so in a way not compatible with how browsers do) might
have this spec pointed out to him.  But he can claim it doesn't apply to
him, since he's never intended his tool to conform to it.

Could we make it something like "implementors of tools that emit HTML or
parse Web content"?

Smylers


[whatwg] Dom as Audience Prereq

2009-06-18 Thread Smylers
The audience section states familiarity with DOM Core and DOM Events as
prereqs for reading the HTML 5 spec:

  http://www.whatwg.org/html5#audience

As somebody without this DOM background there are certainly many parts
of the spec which I've found both understandable and useful (to a web
author).

There may be parts which do require DOM knowledge, but as written it
sounds like a prereq for understanding any part of the spec, and as such
may unnecessarily put people off.

Smylers


[whatwg] HTML 5 and HTML4

2009-06-18 Thread Smylers
The 'History' section starts:

  Work on HTML 5 originally started in late 2003, as a proof of concept
  to show that it was possible to extend HTML4's forms ...

-- http://www.whatwg.org/html5#history-0

Having "HTML 5" (with a space) and "HTML4" (no space) seems oddly
inconsistent.  Could we have them matching?

(I haven't searched to see if this also occurs elsewhere in the
document.)

Smylers


[whatwg] workers Highlighted

2009-06-18 Thread Smylers
In the 'Design Notes' section the word "workers" has a thick pale green
underline.  It isn't apparent what this is signifying.  Moving the mouse
over it reveals it isn't a link, but a tool-tip appears -- with the text
"Worker".  That didn't really elucidate matters:

  http://www.whatwg.org/html5#serializability-of-script-execution

What's this about?

Smylers


Re: [whatwg] workers Highlighted

2009-06-18 Thread Anne van Kesteren
On Thu, 18 Jun 2009 14:50:41 +0200, Smylers smyl...@stripey.com wrote:
 In the 'Design Notes' section the word workers has a thick pale green
 underline.  It isn't apparent what this is signifying.  Moving the mouse
 over it reveals it isn't a link, but a tool-tip appears -- with the text
 Worker.  That didn't really elucidate matters:

   http://www.whatwg.org/html5#serializability-of-script-execution

 What's this about?

It's an unimplemented feature. In this case a cross-specification reference to 
Web Workers.


-- 
Anne van Kesteren
http://annevankesteren.nl/


[whatwg] Plus Signs in Signed Integers

2009-06-18 Thread Smylers
The algorithm for parsing signed integers does not allow an optional
plus sign before positive integers; that is, parsing +4 will return an
error at step 8 of this algorithm:

  http://www.whatwg.org/html5#rules-for-parsing-integers

That is inconsistent with the algorithm for non-negative integers, which
tolerates (and ignores) a leading plus sign (step 6):

  http://www.whatwg.org/html5#rules-for-parsing-non-negative-integers

It also doesn't seem to match browser behaviour: the ol element's
start attribute is an integer, so I tried this out in various browsers:

  <ol start="+4">
    <li>Plus four
  </ol>

All the ones I had to hand (Firefox, Opera, Konqueror, Dillo, Lynx,
Links, and W3M) numbered the element with 4.

I've no idea if any web content is relying on this, but there doesn't
seem to be any harm in being consistent with both current browser
behaviour and non-negative integers.

To check that it is specifically the plus sign they are ignoring and not
any non-digit character I also tried:

  <ol start="H2SO4">
    <li>Acid test
  </ol>

That should cause parsing an integer to abort and so the default of
start=1 to be used.  Opera, Links, and W3M get that right.  Konqueror,
Dillo, and Lynx all also seem to manage the aborting, but use a default
of zero instead.  Firefox parses the 2 out of H2SO4, seemingly using
the first integer it can find in the attribute, so possibly isn't
special-casing +.
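
For anyone wanting to try the two cases without a browser, here is a rough
sketch of the behaviour being suggested. It is a loose paraphrase written
for this message, not the spec's algorithm text: the function name
parse_signed_integer and the allow_plus flag are invented for illustration,
and errors are reported by returning None. It is in Python simply because
that is easy to run.

```python
from typing import Optional

WHITESPACE = " \t\n\f\r"
DIGITS = "0123456789"


def parse_signed_integer(text: str, allow_plus: bool = True) -> Optional[int]:
    """Parse a leading signed integer; return None where parsing would abort.

    With allow_plus=False a leading "+" is an error, as described for the
    signed-integer rules; with allow_plus=True it is skipped, as described
    for the non-negative rules.
    """
    pos = 0
    # Skip leading whitespace.
    while pos < len(text) and text[pos] in WHITESPACE:
        pos += 1
    sign = 1
    if pos < len(text) and text[pos] == "-":
        sign = -1
        pos += 1
    elif allow_plus and pos < len(text) and text[pos] == "+":
        pos += 1
    # Collect a run of ASCII digits; anything after the run is ignored.
    start = pos
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1
    if pos == start:
        return None  # no digit where one was required
    return sign * int(text[start:pos])


print(parse_signed_integer("+4"))                    # 4
print(parse_signed_integer("+4", allow_plus=False))  # None
print(parse_signed_integer("H2SO4"))                 # None
```

A caller handling the start attribute would fall back to its default (1)
whenever None comes back.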

Smylers


Re: [whatwg] Plus Signs in Signed Integers

2009-06-18 Thread Boris Zbarsky

Smylers wrote:

   <ol start="H2SO4">
     <li>Acid test
   </ol>

 That should cause parsing an integer to abort and so the default of
 start=1 to be used.  Opera, Links, and W3M get that right.  Konqueror,
 Dillo, and Lynx all also seem to manage the aborting, but use a default
 of zero instead.  Firefox parses the 2 out of H2SO4

In Firefox, if the string doesn't look like an integer we end up
calling some code that does crazy-permissive string-to-integer parsing
(which in particular skips over leading garbage).  We plan to stop doing
that, for what it's worth.

 seemingly using
 the first integer it can find in the attribute, so possibly isn't
 special-casing +.


There is no special-casing of '+' in the non-crazy-permissive code, 
correct.  That can be fixed, though.


-Boris


[whatwg] Charset override table should match case of IANA registry

2009-06-18 Thread Geoffrey Sneddon
Although charsets are case insensitive, it'd probably be best to be
consistent with the IANA registry. The only change this makes is
changing Windows-* to windows-*.


Re: [whatwg] Using em for Meta-Content

2009-06-18 Thread Nils Dagsson Moskopp
On Thursday, 18.06.2009, at 12:52 +0100, Smylers wrote:
 The italics there don't seem to be indicating stress (and the sentence
 doesn't warrant an exclamation mark at the end), more that it's
 meta-content -- information about the section.

I currently use small for that in my blog posts.

 I suggest that either the definition of em is broadened to include
 this sense, or these normativity designators are instead marked up with
 something like <i class=normativity> or <i class=other>.

I suggest broadening the small element, mainly because it is already
spec'd to contain some kind of meta-information (legal text).

 This meta-content use seems similar to an article by a guest author
 being prefaced by an italicized paragraph from a regular author
 introducing the guest.  Or editorial comments inserted into somebody
 else's work, which are often in square brackets and italics as well as
 having "- Ed" at the end.  Mainly it's just indicating some kind of
 separation from the main text.

Editorial comments can be marked up using the ins element, as I
understand it. Also, in your example, you could separate content through
having an actual article element being preceded by some other block
element.

 […] That suggests the small element, but that isn't quite right
 either: whether a section is normative is materially relevant to the
 content, not just a legal technicality.)

As I said, small appears to have the most appeal to me.


Cheers
-- 
Nils Dagsson Moskopp
http://dieweltistgarnichtso.net



Re: [whatwg] External document subset support

2009-06-18 Thread Brett Zamir

Ian Hickson wrote:
 On Mon, 18 May 2009, Brett Zamir wrote:
  Section 10.1, "Writing XHTML documents", observes: "According to the XML
  specification, XML processors are not guaranteed to process the external
  DTD subset referenced in the DOCTYPE."

  While this is true, since no doubt the majority of web browsers are
  already able to process external stylesheets or scripts, might the very
  useful feature of external entity files be employed by XHTML 5 as a
  stricter subset of XML (similar to how XML Namespaces re-annexed the
  colon character) in order to allow this useful feature to work for XHTML
  (to have access to HTML entities or other useful entities for one, as
  well as enable a poor man's localization, etc.)?

 While there are arguments on both sides of whether this is a good idea or
 not, I think the more important concern in this case is whether we can
 extend XML in this way. I think in practice we should leave this up to the
 XML specs and their successors. I don't think it would be appropriate for
 us to profile the XML spec in this way.

While it is not my purpose to extend the debate on external DTDs, I
wanted to bring up the following points (brought to light after a recent
re-review of the spec) because they raise a few serious issues which I
believe current browsers are failing at. If the browsers do not
address these issues, claims of real XHTML 5 support (as with XHTML 1.*
and plain XML support) would be unworkable. While I agree that
any changes to XML itself should be up to the XML specs, from what I can
now tell, a closer adherence to the existing spec would solve most of
the existing problems, if the browsers would make the required changes.


I was pleasantly surprised to find that the spec seems to recommend 
solutions which I believe avoid the more serious issue of single point 
of failure problems.


(The other complaints about DTDs, such as the wish to avoid cross-domain
DTDs for the sake of security or to avoid DoS attacks, could be treated as
an optional issue if doing so, in combination with adhering to existing
recommendations, would satisfy concerns, though I personally do not think
such a risk is similar to that of including cross-domain scripts.)


So what follows is what I have gleaned from these various statements as 
applied to current browsers. I can provide specific citations, but I did 
not wish to expand this post unnecessarily (though I list references at 
the end).


The major issues which I think certain browsers ought to resolve, since
their behaviour does not seem to be in accord with the XML spec and as a
result creates interoperability problems, are:


1) Firefox and WebKit should not treat a missing entity as a single point
of failure, as they do now (unless they switch to a validating
parser which finds no declaration in the external file and the user is
in validation mode), since such failures in a document with an external
DTD are NOT well-formedness errors unless the document deliberately
declares standalone="yes".
2) Explorer, which no longer seems to require in IE8 that the document 
be completely described by the DTD as I believe it had earlier (though 
it will report errors if the document violates rules which are 
specified), should, per the spec, really only report validation errors 
upon user option (ideally, I would say, off by default, and activatable 
on a case-by-case as well as preference-based basis). This will possibly 
speed things up if the option could be disabled as well as let their 
browser work with documents which violate validation. But this issue is 
not as serious as #1, since #1 prevents even valid documents from being 
interoperably viewed on the web.


If these issues are addressed by those aiming for compliance, the only 
disadvantages which will remain (and which are inherent in XML by 
allowing the co-existence of validating and non-validating parsers) are 
those issues described in http://www.w3.org/TR/REC-xml/#safe-behavior 
and http://www.w3.org/TR/REC-xml/#proc-types , namely that:


1) some (entity-related) /well-formedness/ errors (e.g., if an entity is 
not defined but is used) will go hidden to a non-validating parser as 
these will not need to load an entity replacement (which is not a big 
problem, since a document author should presumably have checked (with an 
application which does external entity substitution) that their entities 
integrate properly with the text--it is not as important, however, that 
they check for /validation/ errors, since as mentioned above, these need 
only be reported optionally).
2) The application may possibly not be notified by its processor of, 
e.g., entity replacement values, if it is a non-validating processor 
(though non-validating processors can also make such replacements). But 
since these are, as mentioned above, not to produce well-formedness 
errors, there is no single point of