Re: [whatwg] innerHTML in HTML documents with multiple namespaces
On Tue, 27 Mar 2007, Thomas Broyer wrote:
> I'm actually wondering what is supposed to be tag name for an element which is not in the HTML namespace (e.g. created with document.createElementNS). Is it the localName or the tagName (qualified name, i.e. with prefix)?

The tag name is fully qualified (as in tagName).

> In other words, what should document.body.innerHTML end with after this script:
>
> var svg_svg = document.createElementNS("http://www.w3.org/2000/svg", "svg:svg");
> document.body.appendChild(svg_svg);
>
> Should it end with <svg></svg> or <svg:svg></svg:svg>? (Firefox would have <svg></svg>)

The spec requires the latter at the moment.

> Also, should the tag name be lowercased before inclusion in the output or the algorithm is just assuming the tag name of HTML elements have already been lowercased elsewhere? (Firefox keeps uppercase letters; elements created with document.createElement are HTMLElements and have their names lowercased at creation time; as described in the spec) Same questions with attribute names ;-)

I think the spec has been clarified regarding this, let me know if it is not clear still.

Cheers,
-- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
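To make the two candidate outputs concrete, here is a minimal sketch in plain JavaScript that serialises a childless element either by its qualified name or by its local name. The record shape ({ prefix, localName }) and the function names are hypothetical and chosen only for illustration; this is not the spec's serialisation algorithm.

```javascript
// Hypothetical element record: { prefix: string or null, localName: string }.
function qualifiedName(el) {
  return el.prefix ? el.prefix + ":" + el.localName : el.localName;
}

// Serialise an element with no children, using either the qualified name
// (what the reply says the spec requires) or the local name only.
function serializeEmptyElement(el, useQualifiedName) {
  var name = useQualifiedName ? qualifiedName(el) : el.localName;
  return "<" + name + "></" + name + ">";
}

var svgSvg = { prefix: "svg", localName: "svg" };
console.log(serializeEmptyElement(svgSvg, true));  // qualified-name form
console.log(serializeEmptyElement(svgSvg, false)); // local-name form
```

The only difference between the two behaviours discussed in the thread is which name is read off the element before it is written out.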
Re: [whatwg] several messages about discouraged things
Once upon a time Ian Hickson shaped the electrons to say...
> Frames are out (except iframe, which I don't really see as being a problem, though let me know if I'm wrong on this). Tables for layout are

I think iframe is required simply by weight of use. It seems like most web-based advertising uses iframe now - huge examples are Google AdSense and Amazon.

Back when HTML 4 was being put together, I suggested doing without iframe and having object handle text/html. It always seemed to me that iframe was really a specialized 'object'. But now iframe is more widespread than frames ever were.

-MZ
-- megazone-at-megazone.org http://www.MegaZone.org/ Gweep, Geek, Human, me. http://www.TiVoLovers.com/ http://www.Eyrie-Productions.com/ -- Hail Eris A little nonsense now and then, is relished by the wisest men 508-852-2171
Re: [whatwg] in caption insertion mode
On Mon, 18 Jun 2007 22:25:46 +0200, Ian Hickson [EMAIL PROTECTED] wrote: On Sun, 10 Dec 2006, Anne van Kesteren wrote: The Anything else case should probably trigger a parse error before reprocessing the current token. Why? Could you show a sample of markup that would go through this path and should trigger an error that isn't flagged? I figured that out later. All good :-) -- Anne van Kesteren http://annevankesteren.nl/ http://www.opera.com/
Re: [whatwg] Parsing: comment tokenization
On Sat, 7 Apr 2007, Anne van Kesteren wrote: The tokenization section should also handle: !-- !--- as correct comments for compat with the web. This means that ! shows -- and that !- shows --. These comments are not handled (though not conformant). On Sat, 7 Apr 2007, Nicholas Shanks wrote: Why on earth is this a good idea? IE7 does it. The assumption is that content therefore depends on it. AFAIK browsers and other HTML clients don't currently treat these as comments This seems to disagree with my research. [...] compelling them to do so will cause several problems: 1) Web developers currently expect things like !--5?-- to result in the comment greater than five?. Changing such expectations on a whim is harmful. It is not clear to me that this is indeed true. 2) A double HYPHEN-MINUS delimits comments within tags, this provides compatibility with XML and SGML and changing this needlessly in HTML5 will just complicate conversion. This, unfortunately, is impractical. (I say this despite having personally pushed for this for years.) 3) You claim compat with the web but don't provide any evidence to support that. Are there huge numbers of sites expecting !-- to represent a comment without content? Can such sites not be fixed instead of polluting HTML with additional rules? I'd rather have a handful of broken sites that their authors will fix than saying to the other 99% of authors hey, you can now do this and ending up with millions of broken sites. (I say broken, because they will not be backwards compatible with current or previous UAs) It seems that they will in fact be compatible; but I agree, we shouldn't encourage it. The spec makes them non-conforming. On Sat, 7 Apr 2007, Nicholas Shanks wrote: Even you must (begrudgingly?) admit that comments formatted as in your original post are not backwards compatible, even if they do reflect the state of modern UAs as you say. How can both those statements be true? I don't believe I am 'pretending' anything. 
Just stating that diverging further from SGML for No Good Reason is pointless. (And yes, supporting a few odd websites that do this already counts as not a Good Reason, websites can always be fixed!) Sadly, Web sites can't always be fixed. Many sites have been long abandoned and are no longer updated. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
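The proposal can be sketched as a small scanner. This assumes the two forms under discussion are <!--> and <!---> (the angle brackets appear to have been lost in this archive copy). The name scanComment and its lenient flag are hypothetical; this is an illustration of the two readings, not the spec's tokenizer.

```javascript
// Scan a comment that starts at input[0].
// Returns { data, length } for a complete comment, or null.
// lenient = true:  "<!-->" and "<!--->" count as complete, empty comments.
// lenient = false: a comment only ends at the first "-->" after "<!--".
function scanComment(input, lenient) {
  if (input.slice(0, 4) !== "<!--") return null;
  if (lenient) {
    if (input.charAt(4) === ">") return { data: "", length: 5 };
    if (input.slice(4, 6) === "->") return { data: "", length: 6 };
  }
  var end = input.indexOf("-->", 4);
  if (end === -1) return null; // unterminated comments are out of scope here
  return { data: input.slice(4, end), length: end + 3 };
}

console.log(scanComment("<!-->5?-->", false)); // strict reading
console.log(scanComment("<!-->5?-->", true));  // lenient reading
```

With the input <!-->5?-->, the strict reading gives one comment whose data is ">5?", while the lenient reading gives an empty comment followed by the text "5?-->", which is the change in author expectations the thread argues about.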
Re: [whatwg] several messages about discouraged things
Ian Hickson wrote: On Sat, 24 Feb 2007, Keryx Web wrote: - A table within a table cell (Has this ever been used for anything but layout?) There are valid uses of that, though they are rare. Really? What are they? -- Lachlan Hunt http://lachy.id.au/
[whatwg] Clarity of the tag open state
> If the next input character is not a U+002F SOLIDUS (/) character, emit a U+003C LESS-THAN SIGN character token and switch to the data state to process the next input character.

This would be clearer if it used the usual wording "consume the next input character" and "reconsume the current input character in the data state". (I just found out I wrote a bug due to my careless reading of the unusual wording.)

-- Henri Sivonen [EMAIL PROTECTED] http://hsivonen.iki.fi/
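The distinction between "process the next input character" and "reconsume" is easiest to see when the cursor position is explicit. Below is a minimal sketch of one step of such a state, assuming the solidus case switches to a state named "close tag open"; that state name, the function name tagOpenStep and the returned record shape are assumptions for illustration, not quotations from the spec.

```javascript
// One step of a hypothetical "tag open" state.
// pos points at the character immediately after "<".
// Nothing is consumed unless the returned pos is larger than the one passed in.
function tagOpenStep(input, pos) {
  if (input.charAt(pos) === "/") {
    // Consume the solidus and continue in the (assumed) close tag open state.
    return { emit: null, state: "close tag open", pos: pos + 1 };
  }
  // Emit "<" as a character token and let the data state look at the
  // same character again: the cursor does not move.
  return { emit: "<", state: "data", pos: pos };
}

console.log(tagOpenStep("</title>", 1));
console.log(tagOpenStep("<b>", 1));
```

Returning the position alongside the next state makes it hard to advance the cursor by accident, which is the kind of bug the message describes.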
[whatwg] Canvas line styles comments
Lines are great fun. See http://canvex.lazyilluminati.com/misc/lines.html for a random collection of demonstrations relating to the stuff below. For lineJoin, the term joins is used but not properly defined (except indirectly as where two lines meet). Given the implementations, this should be something like: For each subpath, a join exists at the point shared by each consecutive pair of lines. If the subpath is closed, then a join also exists at its first point (equivalent to its last point) connecting the first and last lines in the subpath. There are no conformance criteria for rendering lineCap. The definition of 'miter' is incorrect: it seems to say the miter gets truncated into a more-sided polygon if it would exceed miterLimit, but the behaviour of implementations is to revert to 'bevel' rendering instead. The definition of 'round' for lineJoin is slightly incorrect, since it talks about adding a filled arc when it needs to be a filled circle sector (or an arc plus a triangle). The definition for 'stroke' says The stroke() method must stroke each subpath of the current path in turn, using the strokeStyle, lineWidth, lineJoin, and (if appropriate) miterLimit attributes. That list should include lineCap. The lineWidth attribute gives the default width of lines, in coordinate space units. - why default? The expression the point where the inside edges of the lines touch doesn't make sense to me. (Actually, it did make sense for a while, but then I realised it was an incorrect sense). I think the problem is in being ambiguous about the distinction between geometric lines (which are infinitely thin and just a description of a path through space) and graphical lines (which are a thick filled shape, defined by their edges (which are geometric lines)) - the rendering details are describing how to convert the first sort of line into the second sort of line, but that seems to be made unclear. 
I believe it would be clearer to use the term line only in the first sense (so ctx.lineTo adds a line to the subpath, and ctx.fill fills the area enclosed by the path's lines, etc), and the term stroke [or a better name, since I don't really like this one, but I can't think of anything else] for the second sense (so ctx.stroke calculates and renders strokes, which are shapes that are based on the path's lines and widths and caps and joins). There also seems to be a danger of confusion between lines (like a single straight/arc/Bézier line segment) and subpaths, like in the definition of what lineCap applies to. So perhaps it could say something like: The lineWidth attribute gives the width used for rendering lines, in coordinate space units. The outline of a rendered stroke must pass through the points at a distance lineWidth/2 perpendicular to each point in the line being stroked, and must be closed at each end by a straight line. [[...because it's good to define what the width actually means, though I'm not sure if this definition is sufficiently clear/correct.]] ... The lineCap attribute defines the type of endings that UAs shall place on the end of lines. The three valid values are butt, round, and square. The butt value means that no cap shape will be added to the lines. [[...since you don't have to do anything extra at this stage - the earlier paragraph already said how to close the lines at the ends in a butt-like way.]] The round value means that a semi-circle with the diameter equal to the line width must be added on to the first and last points of each unclosed subpath. [[It needs to ignore closed subpaths - those get joined instead of capped.]] The square value means that a rectangle with the length of the line width and the width of half the line width must be placed flat against the edge perpendicular to the direction of the line, on the first and last points of each unclosed subpath. ... ... 
At each join, if the two lines connected to the join have the same direction at that point, no line join is rendered. If the two lines have exactly opposite directions, and lineJoin is round, then a filled semi-circle must be added with its diameter equal to the line width, its origin at the join, and its flat edge touching the edges of the strokes; otherwise, when lineJoin is not round, no line is rendered. [[It won't make sense to talk about the outside edges at a join if all the edges are parallel, so these cases need to be handled specially. It also avoids issues like the miter trying to find an intersection point between parallel lines.]] Otherwise, if the two lines do not have equal or opposite directions, the following rendering steps are performed for the join: * A filled triangle must be added between the position of the join and the two corners of the strokes on the outside of the join. [[That triangular region is shared for all the following variations, so it seems easier to describe it as separate step.]] [[Things like outside of the join are not defined but seem clear enough to me.]] *
Re: [whatwg] Parsing: </li> should be ignored
On Sat, 14 Apr 2007, Simon Pieters wrote:
> For compatibility with IE the parsing algorithm should probably ignore </li> tags. Test case for the above proposal:
>
> <!doctype html>
> <style> * { margin:0; padding:0; } ul { background:red; } li { background:lime; } </style>
> <ul><li></li>This line should be green.</ul>

I've thought this over and as much as I'd like to be compatible with IE on this, there are a number of issues with it.

There's the way that every other browser doesn't do this, which makes it a very risky change. It also means there may not be an immediate need to do this, since browsers only tend to disagree with IE when doing so doesn't break much content.

There's the problem that it makes it difficult to know how to handle things like:

<ul><li>test</li><!--test--></ul>

It would also make future expansion difficult, too. This would have to be applied to dt and dd, and would make constructions like:

<x> <dt> xx </dt> xx <dd> xx <li> xx </li> xx </dd> xx </x>

...have very different results than it appears.

So, unless there's a strong reason to, I suggest we don't change this.

-- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] web-apps/current-work/#datetime-parser
On Tue, 17 Apr 2007, Sam Ruby wrote:
> Step 25: If sign is negative, then shouldn't timezoneminutes also be negated?

Fixed.

> Step 27: Shouldn't that be SUBTRACTING timezonehours hours and timezoneminutes minutes? My current time is 2007-04-17T05:28:33-04:00. The timezone is -4 hours from UTC. To convert to UTC I need to add 4 hours.

Fixed.

-- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
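Both corrections can be checked with a short worked example. The sketch below is not the spec's parsing algorithm; toUtc and its parameter names are hypothetical, and it only shows the arithmetic: the sign applies to both offset fields, and the signed offset is subtracted from the local time to reach UTC.

```javascript
// Convert a local wall-clock time plus a timezone offset to a UTC Date.
// sign is "+" or "-"; tzHours and tzMinutes are the unsigned offset fields.
function toUtc(year, month, day, hour, minute, second, sign, tzHours, tzMinutes) {
  // The sign applies to the hours AND the minutes of the offset (Step 25).
  var factor = sign === "-" ? -1 : 1;
  var offsetMinutes = factor * (tzHours * 60 + tzMinutes);
  // Treat the wall-clock fields as if they were UTC, then SUBTRACT the
  // signed offset (Step 27). Subtracting a negative offset adds time.
  var wallClockMs = Date.UTC(year, month - 1, day, hour, minute, second);
  return new Date(wallClockMs - offsetMinutes * 60000);
}

// The example from the message: 2007-04-17T05:28:33-04:00
console.log(toUtc(2007, 4, 17, 5, 28, 33, "-", 4, 0).toISOString());
// prints 2007-04-17T09:28:33.000Z
```

An offset such as -03:30 shows why the minutes must be negated too: without that, the offset would be computed as -3h +30min instead of -3h -30min.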
Re: [whatwg] void elements vs. content model = empty
On Wed, 18 Apr 2007, ryan wrote: So, I was just trying to check my blog for HTML5 conformance [1] and ran into a conformance problem that I had trouble sorting out. The conformance checker said: 1. Fatal Error: End tag param seen even though the element is an empty element. Line 121, column 73 in resource http://theryanking.com/blog/ so, I went to http://www.whatwg.org/specs/web-apps/current-work/#param to see what the restrictions on param are. In that section it says: Content model: Empty. which brought up the question what's 'empty' mean?. In my mind, it could either be no content allowed or must be a void element (ie, no end tag). The content model only describes what conformance means at the DOM level, it doesn't affect the syntax. To tell if you're allowed to have a closing tag, you have to see the syntax section, where it says: # Void elements only have a start tag; end tags must not be specified for # void elements. ...and: # Void elements #base, link, meta, hr, br, img, embed, param, area, col, input -- http://www.whatwg.org/specs/web-apps/current-work/#elements0 It'd be nice if we could make this clearer in the spec– even though the language and html serialization are two different things, for the sake of authors it'd be nice to have pointers between the two. Yeah... I'm not really sure how to do that yet. I think on the long term I may add an informative block to the element definitions (the green boxes) that says something like: Syntax in text/html: Start tag may be omitted End tag must be omitted ...or whatever. Also, if there's a difference between content=empty and 'void elements' it deserves an explanation. One is just about the content model, the other is just about the syntax. They're not really related, though it happens to be the case that all elements that have an empty content model are void elements in HTML. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. 
`._.-(,_..'--(,_..'`-.;.'
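The split between content model and syntax can be sketched in a few lines. The list below is the void element list quoted in the message; the helper names isVoidElement and serializeChildless are hypothetical, and the serialiser only covers an element with no children and no attributes.

```javascript
// Void elements, as listed in the message above.
var VOID_ELEMENTS = ["base", "link", "meta", "hr", "br", "img",
                     "embed", "param", "area", "col", "input"];

// Syntax question: may this element have an end tag in text/html?
function isVoidElement(name) {
  return VOID_ELEMENTS.indexOf(name.toLowerCase()) !== -1;
}

// Serialise an element that has no children and no attributes.
// Void elements get a start tag only; everything else gets both tags,
// even though both kinds of element are "empty" at the DOM level here.
function serializeChildless(name) {
  return isVoidElement(name)
    ? "<" + name + ">"
    : "<" + name + "></" + name + ">";
}

console.log(serializeChildless("param"));
console.log(serializeChildless("p"));
```

An empty p and a param both have no content in the DOM, but only param is forbidden from having an end tag, which is the error the conformance checker reported.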
Re: [whatwg] void elements vs. content model = empty
A void element cannot have any content because there is no way to specify it in the source. Such a relation is called entailment. Chris -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ian Hickson Sent: Wednesday, June 20, 2007 12:29 AM To: ryan Cc: [EMAIL PROTECTED] Subject: Re: [whatwg] void elements vs. content model = empty On Wed, 18 Apr 2007, ryan wrote: Also, if there's a difference between content=empty and 'void elements' it deserves an explanation. One is just about the content model, the other is just about the syntax. They're not really related, though it happens to be the case that all elements that have an empty content model are void elements in HTML.
Re: [whatwg] Parsing: should <foo><dd></foo> close the DD?
On Fri, 20 Apr 2007, Simon Pieters wrote:
> I sent a bug report to Opera saying that given the markup <foo><dd></foo>X, X should be a sibling to FOO instead of a child of DD. According to Anne the bug report was invalid per the current spec:
>
> On Fri, 20 Apr 2007 09:03:29 +0200, [EMAIL PROTECTED] wrote:
> > I think this bug report is invalid. When you hit </foo> dd is the bottommost node of the stack. dd is in neither the formatting nor phrasing category (it's in special) and therefore the </foo> end tag is ignored.
>
> However, in IE, Firefox and Safari, the DD does get closed at </foo>, so perhaps this is a bug in the spec?

I could only get </foo> to close the dd in Firefox. In IE, the foo is treated as a void element. Opera and Safari seem to follow the spec.

Without further evidence that this breaks things, I'd rather just leave the spec as is.

-- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
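The rule Anne describes can be sketched as a walk up the stack of open elements. This is a simplified illustration, not the spec text: the SPECIAL list is deliberately abbreviated (the message only confirms that dd is in it), and handleOtherEndTag is a hypothetical name.

```javascript
// Abbreviated, illustrative subset of the "special" category.
var SPECIAL = ["dd", "dt", "li"];

// stack: array of element names; the last entry is the bottommost node.
// Returns a new stack. A stack of unchanged length means the end tag
// was ignored.
function handleOtherEndTag(stack, name) {
  for (var i = stack.length - 1; i >= 0; i -= 1) {
    if (stack[i] === name) {
      return stack.slice(0, i); // pop everything up to and including the match
    }
    if (SPECIAL.indexOf(stack[i]) !== -1) {
      return stack.slice(); // a special element blocks the search: ignore the tag
    }
  }
  return stack.slice(); // no matching element: ignore the tag
}

// <foo><dd></foo>: dd is special, so </foo> is ignored and X lands inside dd.
console.log(handleOtherEndTag(["html", "body", "foo", "dd"], "foo"));
```

Under this reading the dd stays open after the end tag for foo, which matches the Opera and Safari behaviour reported in the reply.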
Re: [whatwg] Incorrect character codes
On Fri, 20 Apr 2007, Philip Taylor wrote:
> Section 8.2.3.1:
>
> U+0061 LATIN SMALL LETTER A through to U+0078 LATIN SMALL LETTER F, and U+0041 LATIN CAPITAL LETTER A, through to U+0058 LATIN CAPITAL LETTER F
>
> Should say:
>
> U+0061 LATIN SMALL LETTER A through to U+0066 LATIN SMALL LETTER F, and U+0041 LATIN CAPITAL LETTER A through to U+0046 LATIN CAPITAL LETTER F

It seems this is fixed now.

-- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
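The effect of the typo is easy to demonstrate: 0x78 and 0x58 are "x" and "X", not "f" and "F". The sketch below uses the corrected ranges; the function name isHexDigit is hypothetical, and the 0-9 range is an assumption added for completeness since the message only quotes the letter ranges.

```javascript
// True if the code point is an ASCII hexadecimal digit.
function isHexDigit(cp) {
  return (cp >= 0x30 && cp <= 0x39) || // 0-9 (assumed; not quoted in the message)
         (cp >= 0x61 && cp <= 0x66) || // a-f, U+0061 to U+0066
         (cp >= 0x41 && cp <= 0x46);   // A-F, U+0041 to U+0046
}

console.log(isHexDigit("f".charCodeAt(0))); // true
console.log(isHexDigit("x".charCodeAt(0))); // false
```

With the uncorrected upper bounds (U+0078 and U+0058), letters such as g, q and x would have been accepted as hexadecimal digits.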
[whatwg] html5 parsing/tokenizing
I have a friend who has implemented a fast tokenizer in C. I asked him to send me any feedback he might have, and so what follows are his words. This is from about a month ago, so I apologize if any of this is old ground. -Ben

-

When the tokenization state machine is defined, every state first consumes and then potentially emits. Some of the states transfer to another state with an order to re-consume the character in the next state. This means that what you do in the new state is dependent on what you did in the last state and that the consume is necessarily an inconsistent operation. A much better wording would be "look at the next character" and on state transition "consume and emit" or "just emit without consumption", making it clear when the input cursor moves.

It would be nice if all <!...> tags (except comments) were considered declarations instead of bogus comments. Then DOCTYPE wouldn't need special handling by the tokenizer, just special handling by the parser. (Too much of the parser seems to have gotten into the tokenizer; with CDATA and RCDATA, this is a necessary evil. With <!DOCTYPE ...> it isn't.)

Other than that, the definition is pretty solid and I've come to terms with the xml-interoperability issues I formerly expressed. I've added a switch to my parser that tells it whether or not to honor RCDATA sections and I've purposed never to feed it CDATA. (I know it's not supposed to be an xml parser.)

~D
Re: [whatwg] Parsing: in unquoted attribute values
On Wed, 25 Apr 2007, Simon Pieters wrote: The parsing section says that in an unquoted attribute value terminates the tag. However, according to my testing[1], IE7, Gecko, Opera and Webkit don't do this -- they append the to the attribute value. So I think the parsing section is wrong here. This was fixed recently. Additionally, the syntax section says that authors are not allowed to use in unquoted attribute values, which should probably be changed if the parsing section is changed. Oops, forgot to fix that last time. Fixed now. On Wed, 25 Apr 2007, Anne van Kesteren wrote: IE also lets be an attribute. It can also be part of an attribute or element name. This means that: p/ptest will become a 'p' element with a 'p' attribute which has 'test' as textContent. This basically means less exceptions in the tokenizer for the '' character which would be fine with me. HTML5 requires this now. On Wed, 25 Apr 2007, Anne van Kesteren wrote: As I just mentioned on IRC, this essentially means removing the SHORTTAG TAGC OMISSION feature of SGML which appears not be supported by Internet Explorer, Opera and maybe Safari. Indeed. On Wed, 25 Apr 2007, Jonas Sicking wrote: p/ptest will become a 'p' element with a 'p' attribute which has 'test' as textContent. This basically means less exceptions in the tokenizer for the '' character which would be fine with me. We do no longer support this in mozilla (if we ever did). A reason we now explicitly forbid this is we don't want it to ever be possible to create elements with 'illegal' names. Same thing goes for attribute names. This is partially for security reasons since some elements and attributes carry very important security information. On Thu, 26 Apr 2007, Anne van Kesteren wrote: Could you elaborate on the security issues? Could you also give a definition of illegal names as it's not really clear to me what that means for HTML. 
On Fri, 27 Apr 2007, Jonas Sicking wrote: Basically, for input type=file value=/etc/passwd, if part of the code thinks that that is an input element, where as other parts thinks that is and input element, you might end up in a situation where the browser sends the /etc/passwd file to the server without user interaction. That seems a bit specious given that for type=file you'd have to ignore value= anyway. Furthermore, making the be _not_ part of the tag name is what causes the security issue, as it's only when you _don't_ put it in the tag name that you end up with an input element. Anyway, that's the advantage of having a single, well-defined tokeniser, you don't have to worry about differences in opinion. :-) It also seems like a bad idea to allow a document to be parsed such as there is no way to serialize it without creating an invalid html5 serialization. We are well past that point. Example: p bogus= ...can be parsed but can't be serialised legally. As far as element names go, i don't really see a reason to allow more, or less, characters than the XML spec lets you use. The main reason is that you have to define what happens to the characters you don't allow. We don't have the option of fatal failure. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Script, style and backwards compatibility
(Thanks for forwarding forum feedback to the list. Feel free to forward my reply back to the forums, and please do continue to forward feedback from the forums, or blogs, or anywhere else, to the list!)

On Mon, 30 Apr 2007, Simon Pieters wrote:
> From http://forums.whatwg.org/viewtopic.php?t=38
>
> Make noscript allowed in XHTML5

Unfortunately the way noscript works makes it impractical in XHTML. You can have similar effects, however, by just using script to remove the section:

 <div class="noscript">...</div>
 <script>
  // getElementsByClassName returns a live list, so remove from the end
  // to avoid skipping entries as the list shrinks.
  var n = document.getElementsByClassName('noscript');
  for (var i = n.length - 1; i >= 0; i -= 1)
    n[i].parentNode.removeChild(n[i]);
 </script>

...or some such. (Untested.)

> and generally remove differences between HTML5 and XHTML5 where possible.

Indeed, removing unnecessary differences is a goal (though it is not the most important goal, and so can be trumped; for example backwards compatibility would override it, as it does with noscript).

> This could thus also imply:
> * Don't disallow lang= in XHTML5

Having both xml:lang= and lang= would actually cause more round-tripping problems (if they were both allowed), since xml:lang can't be used in HTML. We can't drop xml:lang, though, since XML defines it.

> * Don't disallow base href in XHTML5.

This is mostly disallowed because generic XML processing wouldn't know about it, and so URIs in unrelated languages like SVG would change meaning based on whether the UA knew XHTML or not.

> * Don't disallow meta charset in XHTML5 (it doesn't do any good, but doesn't harm either).

We could allow it if we required that there be an XML declaration that had the same encoding specified, but then that wouldn't be the same as HTML5, so we wouldn't have won anything.

On Mon, 30 Apr 2007, Simon Pieters wrote:
> Anne wrote:
> > xml:lang should be treated the same as xml:id imo (except that for now I suppose they have different handling if both the xml: and normal attribute specified).
>
> Agreed.

We can't treat xml:lang like xml:id. An element can have multiple IDs, it can't have multiple languages.

In conclusion, while I agree with the principle of keeping XHTML and HTML as close to each other as possible, I don't think they're further apart than is actually necessary.

-- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Parsing: don't move meta and link to head
On Mon, 21 May 2007, Anne van Kesteren wrote: Internet Explorer 7 and Opera 9 don't move meta and link to the head element during parsing (much like they don't do that for style). I think that's a good enough reason to change the parsing specification to match that behavior. Besides the fact that it is more sensible as the DOM and the original input stream are closer to each other. Done. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Parsing: ignore </head>?
On Mon, 21 May 2007, Anne van Kesteren wrote:
> If we simply ignore </head> there's no longer a need to append elements to the head element pointer. In fact, we can remove it. I'm not sure how much this would complicate conformance checking, but it would certainly be very nice not to have such strange appending rules for the limited set of elements that have that now (link, meta, style, base).

This would screw up the placement of comments between head and body.

-- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] HTMLFormElement reset method
Bumping this in hopes of a response. Thanks.

On 3/29/07, Brad Fults [EMAIL PROTECTED] wrote:
> In section 7.1 of the WA 1.0 draft [1], there is the following text:
>
> "The reset() method resets the form, then fires a formchange event on all the form controls of the form."
>
> In the case of a form seeded with initial values [2], it is not clear to me whether the intention is for the form's reset() method to reset the form to some sort of blank state or to the state immediately following the seeding of initial values.
>
> Clarity on this point would be appreciated. Thanks.
>
> [1] - http://www.whatwg.org/specs/web-forms/current-work/#form.checkvalidity
> [2] - http://www.whatwg.org/specs/web-forms/current-work/#seeding
>
> -- Brad Fults

-- Brad Fults
Re: [whatwg] setting .src of a SCRIPT element
On Mon, 21 May 2007, Hallvord R M Steen wrote: if you set the src property of a SCRIPT element in the DOM, IE will load the new script and run it. Firefox doesn't seem to do anything (perhaps a more seasoned bugzilla searcher can tell me if it is considered a known bug?). I think Opera 8 does what IE does, Opera 9 is buggy. I think IE's behaviour is pretty useful and I'd like the spec to make this standards-compliant. It is a common technique to create SCRIPT elements dynamically to load data (particularly because this gets around cross-domain limitations). Firefox's implementation means one has to create a new SCRIPT element each time, keep track of them, and remove them from the document again, whereas with IE's implementation you can have one data loader SCRIPT element and set its .src repeatedly. On Mon, 21 May 2007, Darin Adler wrote: Is this technique easy to use correctly? What if you set the src before a previous script has finished loading? I've heard from several implementors that this would be undesirable. The spec goes to some lengths to stop it from working, in fact. With the definition of XMLHttpRequest, the coming cross-domain nature of that element, the ability to use cross-frame communication, and the simple workaround of creating a new script for each communication, it seems there are enough ways to get around the problem that we don't have to allow it. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
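The workaround mentioned in the reply, creating a new script element for each request, can be sketched as follows. The function name loadData is hypothetical, and the document and parent node are passed in as parameters so the sketch makes no assumption about the surrounding page; in a browser they would typically be document and the head element.

```javascript
// Load a script by creating a fresh element for every request, and remove
// the element again once it has loaded or failed.
// doc must provide createElement(); parent must provide appendChild().
function loadData(doc, parent, url) {
  var script = doc.createElement("script");
  script.onload = script.onerror = function () {
    // Clean up so that repeated loads do not accumulate elements.
    if (script.parentNode) script.parentNode.removeChild(script);
  };
  script.src = url;
  parent.appendChild(script);
  return script;
}
```

Because each call owns its own element, a second request started before the first has finished does not interfere with it, which sidesteps the question Darin Adler raises about resetting src on a script that is still loading.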