Re: [whatwg] HTML spec incorrectly suggests that br can have its rendering changed with CSS

2014-01-23 Thread Stewart Brodie
Daniel Holbert dholb...@mozilla.com wrote:

 So: to reflect reality, it might be better to specify <br> in a way that
 doesn't suggest it's as customizable with CSS. (for the white-space
 property in particular, but probably others as well)
 
 For reference, here's a page with a few testcases:
   http://people.mozilla.org/~dholbert/tests/br-tests.html
 The browsers that I tested[1] all agree on the rendering (basically, not
 honoring any of the <br> styling), with one minor exception[2].
 
 Thanks,
 ~Daniel
 
 [1] I tested the following browsers:
  Firefox 26
  Opera 12.16
  Chrome 34.0.1788.0 dev
  IE 11

 [2] I only noticed one rendering difference -- IE11 honors border on
 <br>, unlike the other browsers that I tested. (It still doesn't honor
 e.g. display/width/height, though.)


I get different results on your test case for the bottom two tests.  In
Chrome 33 and Opera 12.16 (Linux), there is a line break; in Firefox 26
there isn't.

This matches a fault report that we had from a customer a few years ago about
a page that didn't lay out properly in our browser (but did in Opera) that I
tracked down to being that we permitted <br> elements to be styled, just like
Firefox (26.0) does.  I've put a suitably anonymised version of the test
case on my own website:

 http://www.metahusky.net/~stewart/css/br/br-rendering.html

And yes, the real page really did have the first line of its stylesheet as:

 * { position: absolute; margin: 0px; float: left }


-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


Re: [whatwg] [Notifications] Constructor should not have side effects

2013-01-29 Thread Stewart Brodie
Glenn Maynard gl...@zewt.org wrote:

 On Tue, Jan 29, 2013 at 7:36 AM, Charles McCathie Nevile 
 cha...@yandex-team.ru wrote:
 
  Really? This doesn't seem like a good idea, so I'd be interested to know
  why. Is there an explanation laid out somewhere?
 
 Just to ask from another perspective: why doesn't it seem like a good
 idea?

All the information required to activate the object may not be available
at the point at which the object is constructed.

For example, how can you add event listeners to something that doesn't exist
yet?  Particularly if you want to use a closure with the new object.
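A minimal sketch of that concern in JavaScript.  The Notifier class, its
addListener() and show() methods are hypothetical names made up for this
illustration; this is not the Notifications API.  It only shows that when
activation is a separate step, a closure can be registered on the object
before anything is dispatched.

```javascript
// Hypothetical class, for illustration only; not the Notifications API.
class Notifier {
  constructor(title) {
    this.title = title;
    this.listeners = [];
    this.shown = false;            // construction has no side effects
  }
  addListener(fn) {
    this.listeners.push(fn);
  }
  show() {                         // activation is a separate, explicit step
    this.shown = true;
    for (const fn of this.listeners) fn(this);
  }
}

const log = [];
const n = new Notifier("hello");
// The closure can refer to `n` because the object already exists, and it is
// registered before any event could have been delivered.
n.addListener(() => log.push("shown: " + n.title));
n.show();
console.log(log);
```

If the constructor itself performed the work of show(), the listener above
could only be attached after activation had already begun.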


 Having objects that begin their job when constructed simply avoids an
 extra step for the user (telling it to start), and reduces the number of
 possible states (eg. eliminating the UNSENT state), which generally
 simplifies things. Supporting reuse of objects is generally not a useful
 optimization (in my experience), so not supporting it also simplifies
 things a bit. Reducing the number of different-but-equivalent ways of
 doing the same thing is also generally good API design.

I agree - but I don't see what this has to do with separating construction
from activation.


-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


Re: [whatwg] A plea to Hixie to adopt main

2012-12-14 Thread Stewart Brodie
Steve Faulkner faulkner.st...@gmail.com wrote:

 Hi Cory,
 
 
 
  I don't know if this is relevant at all, but according to the spec
  (section 4.4.1), "The body element represents the main content of the
  document." What would you say is the relation between this use of the
  term "main" and your use of the term here? Might it perhaps be more
  accurate to state, "The body element represents the *entire* content
  of the document" (or something like that). I don't really know -- just
  asking.
 
 I filed a bug about this recently:
 https://www.w3.org/Bugs/Public/show_bug.cgi?id=19967


It doesn't necessarily.  I've come across pages that expect the head to be
displayed too.  e.g. tests at http://meyerweb.com/eric/css/tests/css3/ like
http://meyerweb.com/eric/css/tests/css3/show.php?p=caption-side


-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


Re: [whatwg] A plea to Hixie to adopt main

2012-12-14 Thread Stewart Brodie
Steve Faulkner faulkner.st...@gmail.com wrote:

 Stewart wrote:
 
 It doesn't necessarily.  I've come across pages that expect the head to be
 displayed too.  e.g. tests at http://meyerweb.com/eric/css/tests/css3/ like
 http://meyerweb.com/eric/css/tests/css3/show.php?p=caption-side

 Is this a common mark up pattern?

I've not gone looking for any other real-world examples - that's the only
one I've seen.  However, I can't think of any reason why it shouldn't work,
as it's just a block box like the body element (usually) is.


-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


Re: [whatwg] Proposal for window.DocumentType.prototype.toString

2012-10-30 Thread Stewart Brodie
Johan Sundström oyas...@gmail.com wrote:

 Hi everybody!
 
 Serializing a complete HTML document DOM to a string is surprisingly
 hard in javascript.

Does new XMLSerializer().serializeToString(document) not meet your requirement?


-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


Re: [whatwg] [html5] Question on the structured cloning algorithm

2011-05-26 Thread Stewart Brodie
Ian Hickson i...@hixie.ch wrote:

 On Tue, 24 May 2011, Stewart Brodie wrote:
 
  Do getters need to be called to obtain a value which can be stored
  (after being cloned itself) in the result?
 
 I'm not sure I follow the question. Can you elaborate?

Are getters called during cloning?  i.e. what do I get if I clone this:

{ get a() { return 1 } }

Do I get { a: 1 } or do I get {}  ?
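To make the two outcomes concrete, here is a small sketch.  Both functions
are hypothetical helpers written for this illustration (they are shallow and
are not the structured cloning algorithm itself); they only show what "call
the getter" versus "skip the getter" would produce for the object above.

```javascript
// Strategy A: read each own enumerable property, which invokes getters.
function cloneReadingValues(src) {
  const out = {};
  for (const key of Object.keys(src)) {
    out[key] = src[key];             // runs the getter, stores its result
  }
  return out;
}

// Strategy B: copy plain data properties only and skip accessors entirely.
function cloneSkippingAccessors(src) {
  const out = {};
  for (const key of Object.keys(src)) {
    const desc = Object.getOwnPropertyDescriptor(src, key);
    if ("value" in desc) out[key] = desc.value;
  }
  return out;
}

const source = { get a() { return 1; } };
console.log(cloneReadingValues(source));      // { a: 1 }
console.log(cloneSkippingAccessors(source));  // {}
```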


-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


[whatwg] [html5] Question on the structured cloning algorithm

2011-05-24 Thread Stewart Brodie

The section on the structured cloning algorithm has a Note that says

 Property descriptors, setters, getters, and analogous features are not
 copied in this process.

Is this note part of the normative definition of the algorithm, or just a
non-normative helpful explanatory note?  The typographic convention
description set out in section 1.8.2 doesn't say either way.

Do getters need to be called to obtain a value which can be stored (after
being cloned itself) in the result?


-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


Re: [whatwg] Media elements statistics

2011-01-28 Thread Stewart Brodie
Steve Lacey s...@chromium.org wrote:

[Media elements]

 Another open question: what are sensible values if the information is
 not available. Zero seems wrong.

This is a question that I have considered for some time for all the
properties in the HTMLMediaElement interface, not just for your newly
proposed statistics.

I have not come to any particular conclusions as yet.  One option is to try
to invent plausible values for the property values.  e.g. you can seek
anywhere.

For us, with a third-party black box media player sitting at several levels
of abstraction away (and sometimes on a separate processor), we do not have
access to most of the information that is necessary to drive the algorithms.
For example, network usage (if indeed there is any at all), which frames you
have, seekable ranges, buffering statistics - all of these are unavailable.


-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


Re: [whatwg] [html5] Attaching option elements to select elements in different documents

2010-03-04 Thread Stewart Brodie
Boris Zbarsky bzbar...@mit.edu wrote:

 On 3/3/10 12:11 PM, Stewart Brodie wrote:
  As far as I can tell, this affects: HTMLSelectElement.add(),
  HTMLOptionsCollection.add(), Node.appendChild(), Node.replaceChild(),
  Node.insertBefore().
 
 Is it option-specific, though?  Last I checked, various browsers
 implicitly adopted on append/insert/replace, period.

Since when?  I was sure that they didn't used to do this.  DOM Core is
extremely clear on this issue (both in level 2 and level 3).  You appear to
be correct: Firefox and Opera both just ignore the standard and get this
wrong.  Chrome just seems to get confused.


-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


Re: [whatwg] [html5] Attaching option elements to select elements in different documents

2010-03-04 Thread Stewart Brodie
Anne van Kesteren ann...@opera.com wrote:

 On Thu, 04 Mar 2010 11:27:23 +0100, Stewart Brodie  
 stewart.bro...@antplc.com wrote:
  Boris Zbarsky bzbar...@mit.edu wrote:
  On 3/3/10 12:11 PM, Stewart Brodie wrote:
   As far as I can tell, this affects: HTMLSelectElement.add(),
   HTMLOptionsCollection.add(), Node.appendChild(), Node.replaceChild(),
   Node.insertBefore().
 
  Is it option-specific, though?  Last I checked, various browsers
  implicitly adopted on append/insert/replace, period.
 
  Since when?  I was sure that they didn't used to do this.  DOM Core is
  extremely clear on this issue (both in level 2 and level 3).  You appear
  to be correct: Firefox and Opera both just ignore the standard and get
  this wrong.  Chrome just seems to get confused.
 
 This changed a while ago due to compatibility problems. Consensus at the  
 time was to change DOM Core.

Is this documented anywhere?  By compatibility problems, presumably you
mean bugs in Firefox that were then exploited by content authors who didn't
know better?   From Maciej's description of WebKit's behaviour, it looks
like either they didn't know about this consensus or they didn't implement
it compatibly.

This definitely needs to be documented in HTML5.

Are there any more retrospective changes to fundamental behaviour specified
in DOM Core in the pipeline that I need to know about?  I already know about
the one in DOM Event about capturing listeners being called in the target
phase.

That leaves the issue of how adoptNode() affects the [[Prototype]] of the
node objects, which is currently inconsistent between desktop browsers.
Opera and Chrome agree with each other (that the [[Prototype]] is unchanged);
Firefox disagrees (it changes the [[Prototype]] to be what it would have
been if the node had been created anew in the destination document).


-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


[whatwg] [html5] Attaching option elements to select elements in different documents

2010-03-03 Thread Stewart Brodie

The algorithm in the HTML5 specification for attaching an option element to
a select element is incomplete, because it doesn't describe how to handle
the case where the option element does not belong to the same document as
the select element.

It seems that HTMLOptionElement objects are immune to WRONG_DOCUMENT_ERR
exceptions on any tree modifications.  Thus the HTML5 specification also
needs to note that it is overriding the rules from DOM Core about what may
be attached to what.  I've written some proposed changes further below.

As far as I can tell, this affects: HTMLSelectElement.add(),
HTMLOptionsCollection.add(), Node.appendChild(), Node.replaceChild(),
Node.insertBefore().

My tests show that this isn't even confined to the cases where the new
parent node is an HTML select element - any cross-document attachment of
option elements operates as though the same-document check has been
bypassed.  In fact, the behaviour I'm seeing looks very much like an
implicit adoptNode() call has occurred.  I'm basing that suspicion on a
comparison of the (equally inconsistent) behaviour of adoptNode() in
different browsers[*]

I'm testing this from ECMAScript in my test page which is at:
http://www.metahusky.net/~stewart/css/html-options/

In all browsers, if the insertion of the option succeeds, then the inserted
option element compares strictly equal to the option in the receiving select
element.  i.e. the option tree has not been cloned.

In some browsers, the [[Prototype]] of the HTMLOptionElement is reset to be
HTMLOptionElement.prototype of the receiving document's script context; in
others, it does not get changed.  However, in all browsers, all the nodes in
the option's subtree are affected similarly (i.e. if the option's
[[Prototype]] changes, so does the text node's)

In some browsers, you can only insert the option element if the option
element is not currently attached to anything else.

In some browsers, the option isn't inserted at the right index into the
receiving select, but I think that must just be a different bug.


I propose the following changes to the specification:


Change 1: Renumber existing step 7 to step 8 and insert a new step 7 in
http://www.whatwg.org/specs/web-apps/current-work/multipage/urls.html#htmloptionscollection

7.  If _element_ does not belong to the same document as _parent_, then act
as if the DOM Core adoptNode() method was invoked on the _parent_ node's
ownerDocument with _element_ as the parameter.

[Aside: whilst in the vicinity, shouldn't step 3 be using "node" rather than
"element" i.e. "If _before_ is a *node*, but that *node* ..."?   Otherwise, I
could legitimately insert it before any text node anywhere in the document.
Should it require that _before_ has to be an option or optgroup?]



Change 2: Append some text to section 2.2.1 (Conformance Requirements -
Dependencies) to indicate the change to DOM Core, and include a link to the
text added by change 3:

Some requirements in this specification are a wilful violation of
constraints imposed by the DOM Core specification [DOMCORE]:

* attaching _option_ elements to different documents is permitted



Change 3: append explanatory text, linked from change 2's text to:
http://www.whatwg.org/specs/web-apps/current-work/multipage/the-button-element.html#the-option-element

If any attempt is made to attach an _option_ element to a node in a
Document other than the Document of the _option_ element, then the user
agent must not throw a _WRONG_DOCUMENT_ERR_ exception.  If the tree change
would otherwise succeed, then the user agent must behave as if a call to the
DOM Core adoptNode() method has been made so that the Document of the
_option_ element is updated.  This affects the DOM Core appendChild(),
insertBefore() and replaceChild() methods.


Actually, all of these changes might have to say _option_ or _optgroup_.
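The proposed behaviour can be sketched with a toy model.  ToyDocument,
ToyNode and the implicitAdopt flag are hypothetical stand-ins invented for
this illustration, not the DOM; the sketch only contrasts the DOM Core rule
(throw WRONG_DOCUMENT_ERR) with the proposed implicit adoptNode().

```javascript
// Toy model only: these classes are hypothetical stand-ins, not the DOM.
class ToyDocument {
  constructor(name) { this.name = name; }
  createNode(tag) { return new ToyNode(tag, this); }
  adoptNode(node) {
    if (node.parent) node.parent.removeChild(node);
    const stack = [node];
    while (stack.length) {             // re-home the whole subtree
      const current = stack.pop();
      current.ownerDocument = this;
      stack.push(...current.children);
    }
    return node;
  }
}

class ToyNode {
  constructor(tag, doc) {
    this.tag = tag;
    this.ownerDocument = doc;
    this.parent = null;
    this.children = [];
  }
  removeChild(child) {
    this.children = this.children.filter((c) => c !== child);
    child.parent = null;
  }
  appendChild(child, { implicitAdopt }) {
    if (child.ownerDocument !== this.ownerDocument) {
      if (!implicitAdopt) throw new Error("WRONG_DOCUMENT_ERR");
      this.ownerDocument.adoptNode(child);   // the proposed behaviour
    }
    if (child.parent) child.parent.removeChild(child);
    child.parent = this;
    this.children.push(child);
    return child;
  }
}

const docA = new ToyDocument("A");
const docB = new ToyDocument("B");
const select = docA.createNode("select");
const option = docB.createNode("option");
option.appendChild(docB.createNode("#text"), { implicitAdopt: false });

let strictError = null;
try {
  select.appendChild(option, { implicitAdopt: false });
} catch (e) {
  strictError = e.message;
}

const inserted = select.appendChild(option, { implicitAdopt: true });
console.log(strictError);                             // WRONG_DOCUMENT_ERR
console.log(inserted === option);                     // true (not a clone)
console.log(option.children[0].ownerDocument.name);   // A
```

The last line mirrors the observation that every node in the option's
subtree is affected in the same way as the option itself.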


[*] Opera 10.10, Chrome 5.0.307.11 beta, Firefox 3.5.8, and our own ANT
Galio 3.1.0

-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


Re: [whatwg] [html5] Attaching option elements to select elements in different documents

2010-03-03 Thread Stewart Brodie
Darin Adler da...@apple.com wrote:

 Was your testing done with option elements created with
 document.createElement("option") or new Option? I ask because I seem to
 recall the behavior being different for at least some types of elements.

That's a good idea - I forgot to test that.  I've updated my test so that it
tries both.

The behaviour seems to be the same, regardless of how the option is created.


-- 
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited


[whatwg] [EventSource] Garbage collection rules

2009-07-10 Thread Stewart Brodie

I've been reviewing the new EventSource draft.  I'm very pleased to see it
converted into a separate object, rather than being tacked onto everything
that implements EventTarget.  This is a huge improvement.  However, there
are some issues that I think need to be addressed, specifically in the area
of lifetime management.

The GC rules in section 9 seem overly permissive - if there is a listener
for message events but the script forgets to call close() when the user
navigates away, then the resources it is consuming cannot be reclaimed.
There is a small chance that it may be reclaimed if the server terminates
the connection and a GC occurs before the UA is able to re-establish the
connection (i.e. during the reconnection delay or the reconnection), but I
don't think it's wise to rely on this as it would allow malicious scripts to
consume resource with no way for the user agent to recover.

The simplest way to prevent this would be to modify the condition in section
9 slightly to insist that the event listener is callable, drawing on the
text from HTML5's Calling scripts section 6.5.3.2#1.  i.e. modify the text
to say:

An EventSource object with an open connection must not be garbage collected
if there are any event listeners registered for message events and at least
one of those listeners' global object is a Window object whose Document
object is fully active.

In other words, the automatic marking of the EventSource now requires that
at least one of the event listeners must be callable.  The only difference
that this makes, I *think*, is that pages in the history lose unreferenced
EventSource objects.  Is this true and would it actually be a problem?
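The modified condition can be read as a predicate.  The sketch below uses
hypothetical plain-object shapes (connectionOpen, messageListeners, isWindow,
documentFullyActive) invented for illustration; a real user agent would
consult its own internal state rather than anything script-visible.

```javascript
// Returns true when the proposed rule says the object must NOT be collected.
// All property names here are hypothetical, chosen for this sketch.
function mustKeepAlive(source) {
  if (!source.connectionOpen) return false;
  return source.messageListeners.some(
    (listener) =>
      listener.global.isWindow && listener.global.documentFullyActive
  );
}

const activePage = { isWindow: true, documentFullyActive: true };
const pageInHistory = { isWindow: true, documentFullyActive: false };

console.log(mustKeepAlive({
  connectionOpen: true,
  messageListeners: [{ global: activePage }],
}));  // true

console.log(mustKeepAlive({
  connectionOpen: true,
  messageListeners: [{ global: pageInHistory }],
}));  // false
```

The second case is the one that differs from the current text: the listener
exists but could not be called, so the object is no longer kept alive.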


-- 
Stewart Brodie
Software Engineer
ANT Software Limited


Re: [whatwg] Exposing EventTarget to JavaScript

2009-04-30 Thread Stewart Brodie
Alex Russell slightly...@google.com wrote:


  But if you addEventListener, you can have multiple listeners for a given
  event.  The only caveat is that dispatch order is undefined.
 
 Also a bug. It's not *actually* undefined, it's triangulated by
 libraries.

Actually, it is defined.  They are called in registration order, from oldest
to newest.  This is stated in both the latest D3E working draft, and the
older versions dating back to 2003 (at least - I didn't go back any further)
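A quick way to observe registration order, assuming an environment in which
EventTarget and Event can be constructed directly from script (true of
current browsers and recent Node.js, but not of the browsers being discussed
in this thread).  The event name "ping" is arbitrary.

```javascript
const target = new EventTarget();
const order = [];

// Three listeners for the same event, registered oldest to newest.
target.addEventListener("ping", () => order.push("first"));
target.addEventListener("ping", () => order.push("second"));
target.addEventListener("ping", () => order.push("third"));

target.dispatchEvent(new Event("ping"));
console.log(order);  // [ 'first', 'second', 'third' ]
```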


-- 
Stewart Brodie
Software Engineer
ANT Software Limited


Re: [whatwg] getElementsByClassName case sensitivity

2009-01-14 Thread Stewart Brodie
Anne van Kesteren ann...@opera.com wrote:

 On Tue, 13 Jan 2009 11:17:08 +0100, Anne van Kesteren ann...@opera.com
 wrote:
  Since my initial e-mail did not seem to have done it, could you please
  take a look at the source code of the respective test and tell me if you
  see a problem there?
 
 http://tc.labs.opera.com/apis/getElementsByClassName/014.htm
 
  To be perfectly clear, there is no discrepancy between CSS handling and
  the getElementsByClassName method and the test is testing that there is
  not.
 
 Wow, epic fail. I missed it should match two elements. The test is indeed
 out of date.
 
 * updates the test now.

Excellent - thanks!


-- 
Stewart Brodie
Software Engineer
ANT Software Limited


Re: [whatwg] getElementsByClassName case sensitivity

2009-01-13 Thread Stewart Brodie
Anne van Kesteren ann...@opera.com wrote:

 On Mon, 12 Jan 2009 15:25:33 +0100, Stewart Brodie
 stewart.bro...@antplc.com wrote:
  Ian Hickson i...@hixie.ch wrote (on 25 July 2008):
  I've made [getElementsByClassName] consistent with how classes work in
  CSS
  (case-insensitive for quirks and case-sensitive otherwise).
 
  I was looking for some tests for this API and found some from Opera
  (found
  at http://tc.labs.opera.com/apis/getElementsByClassName/) but given the
  dates on them predate the latest spec changes (which causes some to fail
  now), I was wondering if up to date versions are now kept somewhere else
  instead?
 
 The tests already take this change into account. It was agreed upon way
 earlier prolly over IRC or so, but the specification hadn't caught up
 with reality yet. I'm not sure what other tests you might believe to be
 out of date (and why) and would be interested in knowing being the author
 and all :-)

Specifically: test 14 - tests for case-sensitivity in a document that is in
quirks mode.

Are you saying that this change has now been reversed and the comparisons
are always case-sensitive, thus reintroducing the discrepancy between CSS's
handling of classes and this new method?
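For reference, the rule being discussed can be sketched as below.  hasClass
and its quirksMode parameter are hypothetical, written for this illustration;
it simplifies by splitting on any whitespace and by using toLowerCase(),
which is broader than a strictly ASCII case-insensitive comparison.

```javascript
// Hypothetical helper; `quirksMode` stands in for the document's mode.
function hasClass(classAttr, wanted, quirksMode) {
  const tokens = classAttr.split(/\s+/).filter(Boolean);
  if (quirksMode) {
    const lowered = wanted.toLowerCase();
    return tokens.some((token) => token.toLowerCase() === lowered);
  }
  return tokens.includes(wanted);    // case-sensitive outside quirks mode
}

console.log(hasClass("Foo bar", "foo", true));   // true
console.log(hasClass("Foo bar", "foo", false));  // false
console.log(hasClass("Foo bar", "Foo", false));  // true
```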


-- 
Stewart Brodie
Software Engineer
ANT Software Limited


Re: [whatwg] getElementsByClassName case sensitivity

2009-01-12 Thread Stewart Brodie
Ian Hickson i...@hixie.ch wrote (on 25 July 2008):

 I've made [getElementsByClassName] consistent with how classes work in CSS
 (case-insensitive for quirks and case-sensitive otherwise).

I was looking for some tests for this API and found some from Opera (found
at http://tc.labs.opera.com/apis/getElementsByClassName/) but given the
dates on them predate the latest spec changes (which causes some to fail
now), I was wondering if up to date versions are now kept somewhere else
instead?

-- 
Stewart Brodie
Software Engineer
ANT Software Limited


Re: [whatwg] Revised Plan for Server-sent DOM events

2008-01-07 Thread Stewart Brodie
Henry Mason [EMAIL PROTECTED] wrote:

 There's recently been some talk about completely removing HTML 5  
 section 6.2, Server-sent DOM events. I propose that rather than  
 remove, we revise.
 
 The major concerns I've heard about section 6.2 include:
 
 - Unnecessary dependency on DOM Events

Why is this a problem?  It's a facility to cause DOM events to be
dispatched.


 - Redundancy with already existing techniques, especially XMLHttpRequest

There are quite a lot of additional behavioural requirements for server-sent
DOM events that do not apply to XMLHttpRequest, specifically the automatic
binding to event-source elements, plus the automatic reconnection
algorithms.


 - Complicated parsing of event fields

The major problem is determining the type information for the fields of
arbitrary events.  In the end, I gave up on this and simply stuffed the data
into the JS Event objects as strings and allowed the interpreter to worry
about the numeric conversions, provided that the field name was validated.  
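Roughly, that approach looks like the sketch below.  The allow-list, the
"name: value" input shape and the function name are assumptions made for
illustration, not the actual implementation or the draft's wire format.

```javascript
// Hypothetical allow-list of field names that may be copied onto an event.
const ALLOWED_FIELDS = new Set(["detail", "screenX", "screenY"]);

function fieldsToEventInit(lines) {
  const init = {};
  for (const line of lines) {
    const idx = line.indexOf(":");
    if (idx < 0) continue;                   // not a "name: value" line
    const name = line.slice(0, idx).trim();
    const value = line.slice(idx + 1).trim();
    if (ALLOWED_FIELDS.has(name)) init[name] = value;  // kept as a string
  }
  return init;
}

const init = fieldsToEventInit(["screenX: 40", "screenY: 2", "bogus: 1"]);
console.log(init);                         // { screenX: '40', screenY: '2' }
console.log(init.screenX - init.screenY);  // 38 (strings coerced to numbers)
console.log(init.screenX + init.screenY);  // 402 (string concatenation)
```

The last line shows the cost of leaving conversion to the interpreter:
arithmetic operators coerce, but + concatenates.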


 - Inability to support cross-domain events (without the as-of-yet  
 unimplemented and untested Access-Control HTTP header mechanism)

I don't see this as a particular problem - other facilities are waiting for
that to be done too.  I'd rather use the same mechanism everywhere.


 - Continued problems of the 2 connection limit on HTTP server  
 scalability

This might be alleviated somewhat, but not resolved by moving the
connections to other servers.  Does anybody implement the 2 connection limit
in desktop browsers anyway?  Last time I actually tested, most of them
appeared to be using at least 4.



 I propose that we remove support for non-message events; that is,  
 allow only events with MessageEvent interface. This will make  
 implementations easier, as UAs will only need to parse the Bubbles,  
 Cancelable, and data fields. The only existing implementation  

... that you know of ...

 (Opera) seems to only use the message event part of the interface anyway.
 In the few rare instances where general DOM Event bindings are needed,
 JavaScript parsing of the data field of the message events could be used.

I have an implementation - it does precisely that, as I mention above.


 The critically cool part, however, is that since MessageEvents store
 their domain and URI origin, it will be safe to allow for cross-domain
 messaging through server-sent events. Section 6.1 already
 uses this system for this very purpose. Opera has already implemented
 it and it has been in WebKit's trunk for about a week. The removal of
 the same-origin restriction actually makes this interface
 dramatically more useful for developers. It provides a capability
 (messaging with a foreign host) which is not already available to
 XMLHttpRequest-using applications. It also makes it easier for web
 developers to offload the long-running HTTP connections
 needed for event streams to separate servers. This aids
 application scalability and circumvents potential problems with the 2
 HTTP connection limit.

Not really - it's still possible for applications to cause problems by
trying to create 3 event streams.  My implementation permits 2 event streams
to any given host in addition to any used for normal accesses.
Additionally, we have a class of privileged applications for which all the
usual restrictions (cross-domain scripting, same origin checks, connection
limit, etc.) are relaxed, precisely because we sometimes require things
like cross-domain XHRs in our embedded environments.


 This change would make server-sent events easier to implement for both UA
 implementers and web application developers and would make the developers
 more likely to use it.

Removing the requirement to support anything other than MessageEvent class
of events would certainly be a tremendous simplification.  I'm not sure
whether or not it is a good idea - it would leave us needing to perform all
sorts of string parsing in our JS if we wanted to issue other types of
event.  In fact, if this simplification were to be made, I'd probably have
to retain this ability for compatibility with our existing applications.


-- 
Stewart Brodie
Software Engineer
ANT Software Limited


Re: [whatwg] more discussion regarding codecs (Was: whatwg Digest, Vol 45, Issue 16)

2007-12-12 Thread Stewart Brodie
Ian Hickson [EMAIL PROTECTED] wrote:

 There is no way we can ever guarantee that there are no covering patents. 
 Whether a patent covers a technology or not really has more to do with 
 what the courts say than with what the patents say. If Apple say they 
 don't want to implement Ogg, then we have to find another solution.
 
 (Similarly -- Opera, Mozilla, et al, don't want to implement H.264. So we 
 have to find a solution other than H.264.)

Is there any codec that would satisfy everybody?  I doubt it, to be honest.


-- 
Stewart Brodie
Software Engineer
ANT Software Limited


Re: [whatwg] Parsing: Greater-than characters in doctype

2007-06-29 Thread Stewart Brodie
Simon Pieters [EMAIL PROTECTED] wrote:

 All browsers terminate the doctype at the first > character, even if
 it's inside the public identifier or system identifier.

I see this sort of comment a lot - I think it would be really helpful if
people could state which browsers they have actually tested, because you
clearly cannot have tested all browsers.  IE, Firefox, Safari and Opera
aren't all browsers (especially if you only test one specific version)


-- 
Stewart Brodie
Software Engineer
ANT Software Limited


Re: [whatwg] Allow trailing slash in always-empty HTML5 elements?

2006-11-29 Thread Stewart Brodie
Robert Sayre [EMAIL PROTECTED] wrote:

 On 11/29/06, Lachlan Hunt [EMAIL PROTECTED] wrote:
 
  I do not think it's a good idea to make the trailing slash conforming.
  Although it is harmless, it provides no additional benefit at all and it
  creates the false impression that the syntax actually does something.
 
 It does do something, in systems that think they are using XML
 (whether they actually are is another matter). It's possible it will
 prevent many information-free validation errors, and give HTML5
 more credibility as a result. Warning people about <img /> in the
 validator is a waste of their time.
 
  It's not a
  good idea to confuse them any more by giving the impression that it
  works for some elements but not others.  It's better to just say it
  doesn't work at all and forbid it in all cases.
 
 
 Better? This is an opinion, and it's not backed up by data. So far, it
 looks like Sam has the data on his side. People do it, and it tends to
 work interoperably.

Except when it doesn't.

For example, here's a fragment of hotmail.com's signup page, served as
text/html.  It's the only example I've come across to date:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0
  Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr">
...
<select id="iRegion" name="pff010004" />
  <script>...</script>
</select>
...


The script just document.write's loads of option tags (it's the country
menu).  It's hard to know what the author thought was going on.  Did they
think it was XHTML and just got stymied by the server configuration?

I'm still in favour of permitting the trailing slash, personally.


-- 
Stewart Brodie
Software Engineer
ANT Software Limited


[whatwg] [WA1] Missing step in formatting element algorithm

2006-07-21 Thread Stewart Brodie

I've hit another issue in trying to get the Adoption Agency Algorithm
working.  The issue arises with the following fragments:

<body><b> ab <em><div> cd </b>
<body><b> ab <u> xy <em><div> cd </b>

The algorithm attempts to insert nodes that already have a parent node into
another node.  Does the instruction to insert a node carry an implication
that you must first detach the node from its parent node, if it has one? If
so, that should either be documented at the start of the whole section, or
additional steps put into the algorithm to make it so.  Steps 5 and 10 are
much more explicit about detaching nodes from their parents.
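One reading of "insert" is that it always detaches first.  The toy tree
below (TreeNode, detach() and append() are hypothetical names, not the
parser's real data structures) sketches that reading, under which a node can
never end up with two parents.

```javascript
// Toy tree, for illustration only.
class TreeNode {
  constructor(name) {
    this.name = name;
    this.parent = null;
    this.children = [];
  }
  detach() {
    if (!this.parent) return;
    const index = this.parent.children.indexOf(this);
    this.parent.children.splice(index, 1);
    this.parent = null;
  }
  // "Insert" with an implied detach from any existing parent.
  append(child) {
    child.detach();
    child.parent = this;
    this.children.push(child);
  }
}

const body = new TreeNode("body");
const b = new TreeNode("b");
const em = new TreeNode("em");
body.append(b);
b.append(em);

body.append(em);   // em is still a child of b when this is called
console.log(b.children.length, body.children.map((node) => node.name));
// 0 [ 'b', 'em' ]
```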

The problem is triggered by the EM having a single child - that being the
DIV (the furthest block) that gets detached by step 5.  Consequently, in
step 7.4, 'node' (EM) does not have any children.  Thus in step 7.5, the DIV
is re-inserted into the original EM and we carry on around the steps in step
7 until we hit 7.2 where the two examples' behaviours diverge.

In the first example, 7.2 terminates the loop and we hit step 8, where: last
node is the EM; node is the B; and furthest block is the DIV. The condition
holds, so we are instructed to insert the EM into the BODY.  However, the EM
is still a child node of the B at this point.

In the second example, 7.4 clones the U and then (in 7.5) tries to insert
the EM into that clone whilst the EM is still a child of the original U.


Either the EM has to be cloned, or it has to be detached first.  Detaching
first works better, I think, otherwise you end up with a useless EM leaf
node, which I guess is what 7.4's condition is trying to avoid in the first
place.


Also, what is "nearest block" for?  It's not used anywhere.  It looks like
it could be a remnant of an earlier version of the algorithm that didn't
have a step 14 that looped the whole algorithm, perhaps?  I think you can
strike the first paragraph of step 3, and move the second paragraph of step
3 into step 4, and remove the sentence "There will always be one ..." from
step 4, then delete step 3 completely.


-- 
Stewart Brodie
Software Engineer
ANT Software Limited


Re: [whatwg] [WA1] Formatting elements

2006-07-19 Thread Stewart Brodie
Ian Hickson [EMAIL PROTECTED] wrote:

 On Mon, 17 Jul 2006, Stewart Brodie wrote:
  
  I tried dry-running the algorithm for handling mis-nested formatting 
  elements, but I ended up with a tree that looked very odd.  I can't 
  believe that the output I ended up with is what the desired result of 
  the algorithm is, so there is a mistake somewhere: either in my 
  execution of the algorithm or in the algorithm itself.  I took the 
  following fragment of HTML:
  
  <DIV> abc <B> def <I> ghi <P> jkl </B> mno </I> pqr </P> stu
 
  the result I ended up with was equivalent to:
  
  <DIV> abc <B> def <I> ghi </I> </B> <I> </I> <P> <I> <B> jkl </B> mno
  </I> pqr </P> stu </DIV>
 
 Looks right.  With that as input, my implementation outputs:
 
    5: Parse error: missing document type declaration.
    38: Parse error: mismatched b element end tag (misnested tags).
    47: Parse error: mismatched i element end tag (misnested tags).
    57: Parse error: mismatched body element end tag (premature end of file?).
    <html><head></head><body><div> abc <b> def <i> ghi </i></b><i></i><p><i><b> jkl </b> mno </i> pqr </p> stu</div></body></html>

Good - we do end up with exactly the same thing.


  I know it's hard to see when written out textually, but note that for 
  the text node 'jkl', the I and B elements are the wrong way around!
 
 Wrong way with respect to what? They're the right way if you look at the
 end tags: </b> closes first, so it must be innermost! ;-)

I disagree because the 'jkl' is the bit I'm interested in here.  Are you
saying that the desirable tree order in defined in terms only of the closing
tags rather than the open tags?  In the original source, there haven't been
any close tags at all at the time the 'jkl' is parsed, ignoring the other
text nodes, the tree is:

DIV B I P jkl

(I don't really like the P being there, though, to be honest).  At this
point, jkl has a logical element hierarchy above it in the DOM tree that
matches what was in the original HTML source.  In CSS selector terms, DIV 
B  I.  The subsequent processing of the /B token causes such a selector
to no longer match (it has now changed to DIV  I  B):

<DIV> <B> <I> </I> </B> <P> <I> <B> jkl

Surely it is reasonable to expect the jkl to retain its ancestry - i.e. be a
child of the cloned I, which is a child of the cloned B, regardless of the
tag closure (of the B) that's about to occur, which would convert it to ...

<DIV> <B> <I> </I> </B> <P> <B> <I> jkl </I> </B> <I> (mno...)

I suppose the root of my concern is how to apply CSS selector matching in a
reasonable looking manner to the DOM tree if the parser has reversed the
parentage of the formatting elements.
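
To make the ancestry difference concrete, here is a minimal Python sketch.
It is not taken from the spec or from any browser: the tuple representation
and the helper names path_to and has_child_chain are invented for
illustration, and both trees are transcribed by hand from the results quoted
in this thread.

```python
# Each element is (name, [children]); text nodes are plain strings.
# Both trees are hand-transcribed from the results quoted in this thread.
SPEC_TREE = ("div", ["abc",
    ("b", ["def", ("i", ["ghi"])]),
    ("i", []),
    ("p", [("i", [("b", ["jkl"]), "mno"]), "pqr"]),
    "stu"])

FIREFOX_TREE = ("div", ["abc",
    ("b", ["def", ("i", ["ghi"])]),
    ("p", [("b", [("i", ["jkl"])]), ("i", ["mno"]), "pqr"]),
    "stu"])


def path_to(node, text, trail=()):
    """Return the tuple of element names from the root down to the parent
    of the first text node equal to `text`, or None if it is not found."""
    name, children = node
    trail = trail + (name,)
    for child in children:
        if isinstance(child, str):
            if child == text:
                return trail
        else:
            found = path_to(child, text, trail)
            if found is not None:
                return found
    return None


def has_child_chain(path, chain):
    """True if `chain` occurs as a run of direct parent/child steps in `path`,
    which is what a selector such as B > I requires."""
    n = len(chain)
    return any(tuple(path[i:i + n]) == tuple(chain)
               for i in range(len(path) - n + 1))


if __name__ == "__main__":
    for label, tree in (("spec", SPEC_TREE), ("firefox", FIREFOX_TREE)):
        path = path_to(tree, "jkl")
        print(label, " > ".join(path), has_child_chain(path, ("b", "i")))
```

With these two trees, a selector containing B > I matches the parent of
'jkl' only in the Firefox result; in the spec result the chain is I > B.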


 The point is this is error-correction logic, there is no right way 
 (well, until the spec is a standard, I guess).

Indeed I suspect that it may not be possible to define the one true way in
such a way that satisfies all content.


  It all seems to start going wrong for me in step 7 of the algorithm.
  During the handling of the </B> tag, the clone of I gets created and
  that's the node that ends up being the childless I node that has the DIV
  as its parent (during step 5 of handling the </I> tag when the I is
  cloned for a second time to be the child of the P and adopt the original
  children of the P).  Firefox generates what I think I would expect and
  prefer:
  
  <DIV> abc <B> def <I> ghi </I> </B> <P> <B> <I> jkl </I> </B> <I> mno
  </I> pqr </P> stu </DIV>
 
 It's the same number of tags, in this case.
 
 It gets more obviously bad to do what Mozilla does when you consider a 
 case like:
 
<b><p>...<p>...<p>...<p>...<p>...<p>...
 
 ...which is very common. With that exact markup, Safari, IE7, and the spec
 all end up with the exact same DOM tree (from the body down, at least),
 and with the same number of element nodes (from body down, 8).
 
 Mozilla ends up with 13 nodes (from the body down). That doesn't scale -- 
 there are pages with hundreds of nodes like this.

And it gets much worse if it was all wrapped in a <u> and <em> too. The key
is, as you mention in one of the blog entries linked below, that the
behaviour differs depending on whether or not the content is well-formed in
terms of the matching order of start and end tags.


  For comparison, Internet Explorer 6 on the other hand treats the P no
  differently to the B or I and ends up with:  <DIV> abc <B> def <I> ghi
  <P> jkl </P> </I> </B> <I> <P> mno </P> </I> <P> pqr </P> stu </DIV>
 
 Actually IE has only one P element (and only one B and only one I). Look 
 closer and you'll find that the P element isn't closed -- it's just that 
 the mno and pqr text nodes' parentNodes point to the P, while the DIV 
 element's childNodes array actually also mentions those text nodes. Yes, 
 IE generates DOM trees that aren't trees. See also:
 
http://ln.hixie.ch/?start=1037910467count=1
http://ln.hixie.ch/?start=1138169545count=1
http://ln.hixie.ch/?start=1137740632count=1
http://ln.hixie.ch/?start=1026485588count=1
http://ln.hixie.ch/?start=1137799947count=1

Yes, I have already read many of your blog entries on this topic.  I got

[whatwg] [WA1] Formatting elements

2006-07-17 Thread Stewart Brodie

I tried dry-running the algorithm for handling mis-nested formatting
elements, but I ended up with a tree that looked very odd.  I can't believe
that the output I ended up with is what the desired result of the algorithm
is, so there is a mistake somewhere: either in my execution of the algorithm
or in the algorithm itself.  I took the following fragment of HTML:

<DIV> abc <B> def <I> ghi <P> jkl </B> mno </I> pqr </P> stu

The DIV is chosen to provide a suitable context for testing everything else.
B and I were chosen as formatting elements with short names; P was chosen as
it has no special behaviour as an open tag when in the "in body" state
(possibly a mistake?  I'm not certain).  One filled whiteboard later, the
result I ended up with was equivalent to:

<DIV> abc <B> def <I> ghi </I> </B> <I> </I> <P> <I> <B> jkl </B> mno </I>
pqr </P> stu </DIV>

I know it's hard to see when written out textually, but note that for the
text node 'jkl', the I and B elements are the wrong way around!  It all
seems to start going wrong for me in step 7 of the algorithm.  During the
handling of the </B> tag, the clone of I gets created and that's the node
that ends up being the childless I node that has the DIV as its parent
(during step 5 of handling the </I> tag when the I is cloned for a second
time to be the child of the P and adopt the original children of the P).
Firefox generates what I think I would expect and prefer:

<DIV> abc <B> def <I> ghi </I> </B> <P> <B> <I> jkl </I> </B> <I> mno </I>
pqr </P> stu </DIV>

This behaviour would be consistent with disallowing non-phrasing and
non-formatting elements on the stack of open elements when there are
phrasing/formatting elements on the bottom of the stack.  IOW, the P
implicitly closes the B and I elements, leaving them in the list of active
formatting elements, and then does NOT execute "reconstruct the active
formatting elements" before appending the new P element, leaving that for
when the 'jkl' text node is encountered.

For comparison, Internet Explorer 6 on the other hand treats the P no
differently to the B or I and ends up with:  <DIV> abc <B> def <I> ghi <P>
jkl </P> </I> </B> <I> <P> mno </P> </I> <P> pqr </P> stu </DIV>

The problem here may simply be that appending any node due to opening any
non-formatting/non-phrasing open tag when in "in body" should cause any
formatting/phrasing elements to be popped off the stack of open elements,
and should then NOT execute "reconstruct the active formatting elements"
(because it'll be executed automatically when opening the next
formatting/phrasing element or text node anyway).
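
A rough Python sketch of that suggestion, run on the fragment above.  This
is a toy, not the spec's algorithm and not any browser's code: it only knows
B and I as formatting elements, tracks active formatting elements by name,
has no markers or scoping, and the names build, serialize and EXAMPLE are
invented for illustration.

```python
FORMATTING = {"b", "i"}  # assumption: only the formatting elements in the example


def build(tokens):
    """Toy tree builder. Nodes are (name, children); text nodes are strings."""
    root = ("#root", [])
    stack = [root]  # stack of open elements
    active = []     # names in the list of active formatting elements

    def open_element(name):
        node = (name, [])
        stack[-1][1].append(node)
        stack.append(node)

    def reconstruct():
        # Re-open, oldest first, each active formatting element that is not open.
        open_names = {name for name, _ in stack}
        for name in active:
            if name not in open_names:
                open_element(name)

    def close(name):
        # Pop up to and including the named element, if it is open at all.
        if any(n == name for n, _ in stack):
            while stack.pop()[0] != name:
                pass

    for kind, value in tokens:
        if kind == "text":
            reconstruct()
            stack[-1][1].append(value)
        elif kind == "start" and value in FORMATTING:
            reconstruct()
            open_element(value)
            active.append(value)
        elif kind == "start":
            # The suggestion: pop formatting elements off the stack, leave them
            # in the active list, and do NOT reconstruct before appending.
            while stack[-1][0] in FORMATTING:
                stack.pop()
            open_element(value)
        elif kind == "end":
            if value in FORMATTING and value in active:
                active.remove(value)
            close(value)
    return root[1]


def serialize(node):
    if isinstance(node, str):
        return node
    name, children = node
    return "<%s>%s</%s>" % (name, "".join(serialize(c) for c in children), name)


# <DIV> abc <B> def <I> ghi <P> jkl </B> mno </I> pqr </P> stu
EXAMPLE = [("start", "div"), ("text", "abc"), ("start", "b"), ("text", "def"),
           ("start", "i"), ("text", "ghi"), ("start", "p"), ("text", "jkl"),
           ("end", "b"), ("text", "mno"), ("end", "i"), ("text", "pqr"),
           ("end", "p"), ("text", "stu")]

if __name__ == "__main__":
    print("".join(serialize(n) for n in build(EXAMPLE)))
```

On this one input the toy builder produces the same shape as the Firefox
tree above, with 'jkl' inside B then I; it says nothing about how the
approach behaves on other content.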


-- 
Stewart Brodie
Software Engineer
ANT Software Limited


Re: [whatwg] [wa1] Status of tree construction section

2006-07-10 Thread Stewart Brodie
Ian Hickson [EMAIL PROTECTED] wrote:

 On 7/7/06, Stewart Brodie [EMAIL PROTECTED] wrote:
 
  I thought I'd have a go at implementing the parsing algorithms,
  specifically the tree construction algorithms, to see what effect it had
  on the DOM trees that our parser creates.  Has anybody else here
  actually implemented this tree construction algorithm?  I'm finding one
  or two issues that I think may be (minor) mistakes, and I'd like to
  compare notes to see whether I've just misunderstood it or whether it is
  a mistake.
 
 I've been implementing it (to test the spec); I'd be quite happy to
 compare notes (either on this list or off-list, as you wish).
 Note that I'd definitely not consider that part of the spec done yet.

I'm happy to post to the list.  The first few issues are quite trivial, I
think:

In the main phase, section 'If the insertion mode is in row', the last
option for 'anything else' says "process ... as if ... in table".  I think
that should say "as if ... in table body" instead.  That case will re-throw
the token out to "in table" in any case if it doesn't handle it.

The case immediately above that is "An end tag whose tag name is one of:
body, caption, col, colgroup, html, td, th, tr".  The </tr> case is already
handled by the second case.  Remove 'tr' from the list here.

In 'If the insertion mode is in cell', the absence of a case for an end
tag for CAPTION looks odd.  All the other table-related tags are handled
here explicitly, so why is CAPTION so different (that it should be handled
in the 'treat it as in body' way)?

I've come to the conclusion that you need pictures to accompany the
adoption agency algorithm.  However, I'm not an artist.  Indeed, I'm so
bad at drawing pictures, that in the past, users often sent me replacement
bitmap graphics for my programs because they found my attempts so
distressing :-)

With reference to that algorithm, I think that the text in point 1 should be
re-organised somewhat after the second paragraph to make it a little
clearer.  I've re-organised it and I think it says exactly the same now, but
simpler and with less potential for misunderstanding:

  If there is a _formatting element_; proceed immediately to step 2

  Otherwise, there is no _formatting element_.  If there is an element in
  the _list of active formatting elements_ that:

  o  [same three steps, but with , and appended to the top one]

  then remove the last such element from the _list of active formatting
  elements_.

  In any case, abort these steps.


In the various places where a given operation has to be described multiple
times, you've macroed it (e.g. insert an HTML element, clear the list of
active formatting elements up to the last marker).  I suggest adding
another one that can be used during the Adoption Agency algorithm (I'm
sure that I found I needed to perform this search in other places too -
hence defining it separately - although I can't quite recall exactly where
for the time being, ho hum):

  The _list of active formatting elements_ is said to *have an element in
   active formatting scope* when the following algorithm terminates in a
   match state:

  1. If the _list of active formatting elements_ is empty, terminate in a
 failure state.

  2. Initialise _entry_ to be the last (most recently added) entry in the
 _list of active formatting elements_.

  3. If _entry_ is a marker, terminate in a failure state.

  4. If _entry_ is an element with a tag name matching the target element
 name, terminate in a match state.

  5. If there are further elements in the _list of active formatting
 elements_, set _entry_ to the previous entry and return to step 3.

  6. Terminate in a failure state (there are no more entries)
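
The six steps above amount to a backwards scan that stops at the first
marker.  A minimal Python sketch, assuming the list holds tag-name strings
plus a sentinel for markers (MARKER and the function name are hypothetical,
not spec terminology):

```python
MARKER = object()  # hypothetical sentinel standing in for a marker entry


def has_element_in_active_formatting_scope(active_formatting, target_name):
    """Scan from the most recently added entry backwards.

    Returns True (match state) if an entry named target_name is found
    before any marker, and False (failure state) otherwise, including
    when the list is empty or runs out of entries.
    """
    for entry in reversed(active_formatting):
        if entry is MARKER:
            return False  # step 3: a marker ends the search
        if entry == target_name:
            return True   # step 4: matching tag name
    return False          # steps 1 and 6: empty list or no more entries
```

For example, with the list ["b", MARKER, "i"] the search finds "i" but not
"b", because the marker hides everything added before it.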


Step 6 in the original 14-step algorithm: relative position of the
formatting element.  Relative to what?

The parsing quirks box lists several issues that I think are important.
The script one in particular is so very common.  Unfortunately, I had to
cave in eventually and support that because it broke some customers' own
sites.  I have come across never-opened </br> and </p> too.  I've never
heard of <% ... %> before.  Sometimes, it's really quite depressing the
rubbish that people (and programs!) write out.

I spent a long time trying to work out what I needed to store for each entry
on both the stack of open elements and the list of active formatting
elements.  I think it should be stated up front because this is often an
area of confusion, in my experience.  I frequently get upset with co-workers
over misuse of the terms element, tag and node, for example :-)

Finally (for now ;-), right at the beginning of the tree construction
section, it says that DOM Mutation events must not fire for changes caused
by the UA parsing the document.  I cannot decide whether or not I agree with
that statement.  My experimentation appears to show that this is indeed what
happens in Firefox, at least. I put a script in the head of my document

[whatwg] [wa1] Status of tree construction section and SVN interface

2006-07-07 Thread Stewart Brodie

I thought I'd have a go at implementing the parsing algorithms, specifically
the tree construction algorithms, to see what effect it had on the DOM trees
that our parser creates.  Has anybody else here actually implemented this
tree construction algorithm?  I'm finding one or two issues that I think may
be (minor) mistakes, and I'd like to compare notes to see whether I've just
misunderstood it or whether it is a mistake.

With the spec changing so frequently, I wanted to make sure I'm catching any
updates to the relevant parts of the document, so I followed the link in WA1
to the http://svn.whatwg.org but I just get web pages with the text
(literally) of the current revision of the specs, rather than access to the
history logs that WA1 implies I should find there. Neither web browsers nor
svn itself can talk to that URI.  Am I doing something wrong or is it
broken?


-- 
Stewart Brodie
Software Engineer
ANT Software Limited