Re: [whatwg] Citing multiple elements in HTML5

Calogero Alex Baldacchino Tue, 02 Dec 2008 21:48:49 -0800

Benjamin Hawkes-Lewis wrote:

Calogero Alex Baldacchino wrote:
[...]
I think you're confusing parsing rules that conforming user agents must follow to associate identifiers with elements (even when ids are duplicated) with the authoring rules that conforming documents must follow (ids must be unique).


Ok, so what's what?

When you read "The value must not contain any space characters.", is it an authoring rule for conforming documents, for you? Ok.

When you read "*If the value is not the empty string, user agents must associate the element with the given value (exactly, including any space characters)* for the purposes of ID matching within the subtree the element finds itself (e.g. for selectors in CSS or for the |getElementById()| method in the DOM).", is it a parsing rule for conforming user agents, for you? Ok. But isn't it worth spending a word, everywhere in the spec, to say when a rule is a quirk kept for backward compatibility, which might go away in the future, and when it is not, because that isn't needed? And when it's a legacy of the past, shouldn't it be considered in every aspect? After all, wasn't one of the main goals of HTML 5 to turn unwritten and browser-specific rules into written and standard behaviours?

I mean, if you allow space characters inside an id value as a parsing rule, you can face something like '<div id="foo bar" >', that is, an id consisting of more than one token. Is it good to leave it untouched? Yes? Ok, but what does it mean for CSS, since CSS is referenced as one reason to allow space characters? That is, can a browser handle an id selector that starts with the '#' character and is broken by a blank space? Or better, is it legal in CSS? Honestly, again, I don't remember well; I've never tried something like that (since it makes no sense to me), and I think it's illegal. But let's say it's illegal for conforming style sheets, while existing user agents may or may not allow it, each one with its own behaviour. If we "close one eye" on '<div id="foo bar" >' in a piece of HTML 5 code but leave its CSS counterpart to free implementation, we'll solve only half of the problem (where the problem is turning unwritten rules into written, and possibly improved, standards), won't we? But any kind of "CSS quirk" would be out of scope for an HTML specification, and I believe '<div id="foo bar" >' is a problem. (If instead "foo bar" is not a valid id selector for CSS in any browser, then we're allowing user agents to parse as valid an id which is inconsistent with CSS, and so CSS selectors cannot be a reason to allow space characters inside an id string, at least with respect to any direct reference to the identifier value.) It might also be a problem per se, even just for HTML conformance by user agents: a URL fragment might contain escaped space characters, but an escaped space isn't the same thing as the space character itself, so the rule of exact matching, applied to space characters inside an id, may cause trouble unless the '<div id="foo bar" >' case is considered extensively.

Now, let's say, instead, that a user agent conforming to the HTML 5 specification must cut off any token after the first one (I know "foo bar" is currently taken as is), so that <div id="foo bar"> becomes <div id="foo "> and <div id=" foo "> is valid too. In such a case, skipping any spaces too, and stating the same behaviour for strings passed to .getElementById(), could be a nice graceful degradation for documents not conforming to the rule "the value [of an id attribute] must not contain any space characters", but it might fail with CSS selectors such as 'div[id="foo bar"]'.
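To make the cut-off idea concrete, here is a minimal Python sketch. It is only an illustration, not what any specification or browser does: the function names are hypothetical, and it assumes the HTML5 set of space characters (space, tab, line feed, form feed, carriage return).

```python
# HTML5 "space characters": space, tab, line feed, form feed, carriage return.
HTML_SPACES = " \t\n\f\r"


def first_token(value: str) -> str:
    """Return the first run of non-space characters in value ("" if none)."""
    stripped = value.strip(HTML_SPACES)
    for index, char in enumerate(stripped):
        if char in HTML_SPACES:
            return stripped[:index]
    return stripped


def lenient_id_match(id_attribute: str, lookup: str) -> bool:
    """Hypothetical lenient comparison: normalize both sides the same way."""
    token = first_token(id_attribute)
    return token != "" and token == first_token(lookup)


def attribute_selector_match(id_attribute: str, selector_value: str) -> bool:
    """An attribute selector such as div[id="foo bar"] compares the raw value."""
    return id_attribute == selector_value
```

Under these assumptions both "foo bar" and " foo " resolve to "foo" for ID lookups. If the parser actually rewrote the attribute to its first token, a selector written against the raw value "foo bar" would stop matching, which is the failure mentioned above.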


Perhaps a compromise, if acceptable for backward compatibility, might be:

- when the id value must be compared to a fragment identifier, strip any trailing space characters; if the match fails, escape any other space characters both in the id value and in the fragid and try again;
- when an attribute is defined to hold a URL and its value has spaces in its path/query/fragment, escape them before resolving the URL (not sure if needed);
- for the purpose of ID matching through the DOM 'getElementById' method, leave the id value untouched;
- for the purpose of ID matching through CSS selectors accessing it as an attribute, leave the id value untouched;
- for the purpose of ID matching through CSS selectors directly accessing it (e.g. '#foo'), either choose the first sequence of non-space characters or let the match fail (I can't decide which is better, but perhaps the former would fail as well, since I guess anyone coding <div id="foo bar"> not only as a fragment identifier but also for styling might have the nice idea to write "#foo bar { font-weight : bold; }" as well).
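The first rule of the list above (fragment identifier matching) could be sketched in Python as follows. This is a rough illustration under assumptions of my own: the function names are hypothetical, "escape" is taken to mean percent-encoding, and the fragment is compared as it appears in the URL, without decoding.

```python
# HTML5 "space characters": space, tab, line feed, form feed, carriage return.
HTML_SPACES = " \t\n\f\r"


def escape_spaces(text: str) -> str:
    """Percent-encode space characters only (assumed meaning of "escape")."""
    return "".join(
        f"%{ord(char):02X}" if char in HTML_SPACES else char for char in text
    )


def id_matches_fragment(id_value: str, fragment: str) -> bool:
    """Two-step comparison of an id value against a fragment identifier."""
    # Step 1: strip trailing space characters from the id and compare exactly.
    trimmed = id_value.rstrip(HTML_SPACES)
    if trimmed == fragment:
        return True
    # Step 2: escape the remaining space characters on both sides and retry.
    return escape_spaces(trimmed) == escape_spaces(fragment)
```

With this sketch, id="foo " matches the fragment "foo", and id="foo bar" matches both "foo bar" and "foo%20bar", but not "foo" alone.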

Anyway, if the id value is also a fragment identifier, which might have space characters (since the parsing rules prescribe adding such characters to the unreserved production), does the (authoring) rule "the value must not contain any space characters" make sense?

Now let's come to the duplicated ids issue. Again, what's what? When it says, "The id attribute represents its element's unique identifier. *The value must be unique in the subtree within which the element finds itself and must contain at least one character.*", I think that's what you call an authoring rule. So I don't think it was so bad to ask for a clarification on the nature of the subtree. And if a subtree happened to match, eventually, an element subtree inside a document, was the suggestion of a getElementById method on the HTMLElement interface so awful? Otherwise, let's consider (again) the second paragraph:

"If the value is not the empty string, user agents must associate the element with the given value (exactly, including any space characters) *for the purposes of ID matching within the subtree the element finds itself (e.g. for selectors in CSS or for the |getElementById()| method in the DOM).*"

It's a parsing rule, isn't it? But it also says the id must be unique in the whole document for the purpose of ID matching through the getElementById() method in the DOM, because the only object capable of getting an element by its id is an instance of the Document interface. So a choice has to be made on what to do with duplicated ids. Solving the question at the parser level (i.e. defaulting any duplicated id to the empty string) would be consistent with both the fragment identifier behaviour (only the first occurrence is valid) and the uniqueness rule, but might break some semantics (e.g. a hyperlink used to create an instance of a <dfn>, or a <blockquote> with a cite attribute referencing a <cite> element, both with a duplicated id that is not the first occurrence). On the other hand, leaving the duplicated id in the document requires some changes to the Document's getElementById() method, since the W3C DOM Core does not define a unique behaviour in such a case, and I've expressed a few doubts about solving this by adding an equivalent method on the HTMLDocument interface. Anyway, the getElementById() behaviour must be defined for such situations, and having it pick the first match may be a solution. It might cause unwanted side effects if misused in actual documents, and it leaves no way to directly access any element with a duplicated id, but if I'm not careful when choosing an ID, I can only blame myself. Fulfilling uniqueness might also become problematic when dynamically putting together pieces of code, perhaps from different sources, e.g. using XMLHttpRequest, or because of externally syndicated content, but this is within the scope of careful programming.
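The "pick the first match" option can be illustrated with a toy tree in Python. This is only a sketch under stated assumptions: the Element class and both functions are hypothetical stand-ins, not the DOM API, and "first" is taken to mean first in a pre-order (tree order) walk.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Element:
    """Toy stand-in for a DOM element (hypothetical, not the real interface)."""
    tag: str
    id: str = ""
    children: List["Element"] = field(default_factory=list)


def walk(root: Element) -> Iterator[Element]:
    """Yield root and all its descendants in pre-order (tree order)."""
    yield root
    for child in root.children:
        yield from walk(child)


def get_element_by_id(root: Element, wanted: str) -> Optional[Element]:
    """Return the first element in tree order whose id equals wanted."""
    if wanted == "":
        return None  # the empty string never identifies an element
    for element in walk(root):
        if element.id == wanted:
            return element
    return None


def get_all_by_id(root: Element, wanted: str) -> List[Element]:
    """Return every element with that id, as an attribute selector could."""
    return [el for el in walk(root) if wanted != "" and el.id == wanted]


# A document with a duplicated id "foo" in two different subtrees.
doc = Element("body", children=[
    Element("div", id="foo", children=[Element("span", id="bar")]),
    Element("p", id="foo"),
])
```

Here get_element_by_id(doc, "foo") returns the div and never the p, while get_all_by_id(doc, "foo") reaches both. Because the lookup takes any subtree root, calling it on doc.children[1] finds the p, which is roughly what an element-level getElementById would offer.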

From the point of view of CSS, both choices may be consistent with coupled rules such as "#foo { font-size : 13; }" and "#foo { font-size : 14; }", since both would refer to the same element because of the cascading rules. On the other hand, something like 'div[id="foo"] {/*something here*/}', or a direct reference to an ID selector as a descendant of different elements, might perhaps isolate different elements in the document (whether to allow this or not is outside the scope of HTML, but are there such cases in the wild?), and for the purpose of compatibility with documents styled that way, leaving duplicated ids in the document would be the better choice. But in such cases, shouldn't DOM element selection be consistent with CSS element selection (i.e. to avoid side effects when CSS rules manipulate the DOM itself)? That is, if through CSS it were possible to reach elements with duplicated ids in different subtrees of a document tree (according to the definition of all nodes descending from a non-leaf node as being part of its subtree) and to manipulate their content, shouldn't it be possible through the DOM too?


Anyway, I'm not that confused, no more than usual :-P

BR, Alex.


