[whatwg] postMessage: max length / size
Is there any limit to the length of message you can send with postMessage (HTML5 Cross-document messaging)? I didn't see anything in the spec about this. I thought this might be one area where implementations might end up differing. Thanks, Brian
Re: [whatwg] postMessage: max length / size
On Thu, 22 Oct 2009, Brian Kuhn wrote: Is there any limit to the length of message you can send with postMessage (HTML5 Cross-document messaging)? I didn't see anything in the spec about this. I thought this might be one area where implementations might end up differing. There are probably implementation-specific limits, but HTML tries to not say what the limits should be, since it's hard to know what they should be. It might vary from platform to platform and device to device, and will almost certainly vary over time. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Superset encodings [Re: ISO-8859-* and the C1 control range]
Ian Hickson wrote: Authors should not use JIS-X-0208 (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), encodings based on ISO-2022, and encodings based on EBCDIC. It is not clear what this means (e.g., the character set JIS_C6226-1983 in any encoding, or only when encoded alone according to RFC1345 as described above); This is talking about character encodings, not character sets. JIS_C6226-1983 is a registered character encoding in the IANA registry. Yes, I can understand this, but...

On Fri, 23 Oct 2009, NARUSE, Yui wrote: Authors should not use JIS-X-0208 (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), encodings based on ISO-2022, and encodings based on EBCDIC. First, JIS-X-0208 and JIS-X-0212 are not in IANA Charsets, moreover those correct names as spec are JIS X 0208 and JIS X 0212.

On Thu, 22 Oct 2009, Øistein E. Andersen wrote: I am not sure what you mean; they are both listed at http://www.iana.org/assignments/character-sets:

  Name: JIS_C6226-1983 [RFC1345,KXS2]
  MIBenum: 63
  Source: ECMA registry
  Alias: iso-ir-87
  Alias: x0208
  Alias: JIS_X0208-1983
  Alias: csISO87JISX0208

  Name: JIS_X0212-1990 [RFC1345,KXS2]
  MIBenum: 98
  Source: ECMA registry
  Alias: x0212
  Alias: iso-ir-159
  Alias: csISO159JISX02121990

On Fri, 23 Oct 2009, NARUSE, Yui wrote: Where is the word JIS-X-0208 ? Where is the word JIS-X-0212 ? The exact string isn't there, that's why I included the preferred MIME names in brackets in the spec. if it is talking about character encodings, why it uses the name of character sets mainly? Following seems better. Authors should not use JIS_C6226-1983, JIS_X0212-1990, encodings based on ISO-2022, and encodings based

On Fri, 23 Oct 2009, NARUSE, Yui wrote: Second, JIS_C6226-1983, JIS_X0212-1990, and EBCDICs are not ASCII compatible. So they are out of discouraged; mustn't use. You can use non-ASCII-compatible encodings (e.g. UTF-16). I see.

-- NARUSE, Yui nar...@airemix.jp
Re: [whatwg] Superset encodings [Re: ISO-8859-* and the C1 control range]
On Fri, 23 Oct 2009, NARUSE, Yui wrote: The exact string isn't there, that's why I included the preferred MIME names in brackets in the spec. if it is talking about character encodings, why it uses the name of character sets mainly? Following seems better. Authors should not use JIS_C6226-1983, JIS_X0212-1990, encodings based on ISO-2022, and encodings based Ok, done. -- Ian Hickson
[whatwg] HTMLElement.onload
In 6.5.6.2 of the spec I found that the onload event handler is now available for every HTML element in HTML5, which I think is a great improvement. But there is something about the load event that I think would be worth a few words of clarification. According to 6.11.2 the load event is fired when the whole document is loaded; I did not find anything about element-specific load events. So I assume that element1.onload is triggered by the same event as element2.onload, and that the following two bodies would be equivalent:

  <body>
    <p onload="dosomething(this)">Text</p>
    <p onload="dosomethingelse(this)">Text</p>
  </body>

  <body onload="dosomething(document.getElementById('foo')); dosomethingelse(document.getElementById('bar'))">
    <p id="foo">Text</p>
    <p id="bar">Text</p>
  </body>

Is this assumption correct? More generally, the list of events that must be supported by all HTML elements looks somewhat confusing to me, as some of these events only apply to particular types of elements, such as media players, forms, or form elements. How are, e.g., onpause or oninput supposed to work if applied to span or p elements?
[whatwg] Typo in Annotations for assistive technology products (ARIA) section
"th elemen that is neither a column header nor a row header" should read "th element that is neither a column header nor a row header" -Mark
[whatwg] Another typo in Annotations for assistive technology products (ARIA) section
"Either a button element or an input element is required to when using the button role" should read "Either a button element or an input element is required when using the button role" -Mark
Re: [whatwg] postMessage: max length / size
As a data point, the WebKit implementation (used by Safari and Chrome) doesn't currently enforce any limits (other than those imposed by running out of memory). -atw On Fri, Oct 23, 2009 at 12:02 AM, Ian Hickson i...@hixie.ch wrote: On Thu, 22 Oct 2009, Brian Kuhn wrote: Is there any limit to the length of message you can send with postMessage (HTML5 Cross-document messaging)? I didn't see anything in the spec about this. I thought this might be one area where implementations might end up differing. There are probably implementation-specific limits, but HTML tries to not say what the limits should be, since it's hard to know what they should be. It might vary from platform to platform and device to device, and will almost certainly vary over time.
Re: [whatwg] Superset encodings [Re: ISO-8859-* and the C1 control range]
On 23 Oct 2009, at 04:20, Ian Hickson wrote: On Wed, 21 Oct 2009, Øistein E. Andersen wrote: ASCII-compatibility: The note in 2.1.5 Character encodings seems to say that [...] ISO-2022[-*] are ASCII-compatible, whereas HZ-GB-2312 is not, and I cannot find anything in Section 2.1.5 that would explain this difference. HZ-GB-2312 uses the byte ASCII uses for ~ as the escape character. ISO-2022-* uses the control codes. That's the difference. '~'/0x7E is not (and should not be, as far as I can tell) relevant for HTML5's concept of ASCII compatibility. Discouraged encodings: [...] Authors should not use JIS-X-0208 (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), [...] It is not clear what this means [...] This is talking about character encodings, not character sets. JIS_C6226-1983 is a registered character encoding in the IANA registry. (This is less confusing now since HTML5 only deals with character encodings and the strings match those in the IANA registry as suggested by Yui Naruse.) the list of discouraged encodings seems conspicuously short if it is supposed to be complete; and the lack of rationale makes it difficult to understand why these encodings are considered particularly harmful (JIS_C6226-1983 v. JIS_C6226-1978 or ISO-2022 v. HZ, to mention but two at least initially puzzling cases). The reason for including these is to discourage encodings known to have security issues. I've added HZ-GB-2312, which can be used in a similarly dangerous fashion. (Basically the danger for user agents is in an attacker using an encoding that a user agent could autodetect, while a site interprets the bytes safely; that would allow those encodings to be used to smuggle script elements in a way that a naive whitelisting filter would think is safe.) It might be better to say *why* particular encodings are better avoided, whether or not the list of discouraged encodings be presented as definitive. I've added a note. [...] 
On Thu, 22 Oct 2009, Philip Taylor wrote: The string [숍訊昱穿] encoded as ISO-2022-KR is the bytes 0e 3c 73 63 72 69 70 74 3e. A UA that doesn't support ISO-2022-KR (e.g. Chrome, when I last checked) will decode it as Windows-1252 and get the string <script>, which is bad. So a site that uses ISO-2022-KR is very likely to expose some users to XSS attacks, which seems like a good reason to discourage that encoding. The same applies to other ISO-2022 encodings. [...] On Thu, 22 Oct 2009, Øistein E. Andersen wrote: If that is the reason, at least HZ encoding would seem to be affected as well. Explicitly discouraging a more or less random subset of the problematic encodings without providing rationale makes it difficult to assess whether or not other, somewhat similar, encodings should be avoided as well, which was the main issue I wanted to raise. Hopefully this is somewhat addressed now. The added note certainly helps, but it is vague (does "[m]ost of these encodings" mean all the encodings mentioned above apart from UTF-32?) and inaccurate (Philip Taylor's example does not rely on bugs). Given that the set of encodings is open-ended, I still think it would be preferable to make the rationale (a definition of what makes an encoding problematic) primary and mention actual encodings as examples. This could give something like the following: Encodings in which a series of bytes in the range 0x20..0x7E may encode characters other than the corresponding characters in the range U+20..U+7E represent a potential security vulnerability since a browser that does not support the encoding (or does not support the label used to declare the encoding, or does not use the same mechanism to detect the encoding of unlabelled content) might end up interpreting technically benign plain text content as HTML tags and JavaScript. In particular, this applies to encodings in which the bytes corresponding to '<script>' in ASCII may encode a different string. 
Authors should not use such encodings, which are known to include In addition, authors should not use UTF-32 Alternatively, fixing the current note would help and might be sufficient, albeit not ideal. I think one has to realise that a comprehensive list of problematic encodings is an elusive goal and act accordingly. -- Øistein E. Andersen PS: The following sentence makes little sense without (curly) quotes and apostrophes. In case they disappeared before you read it, please find it repeated below with (ASCII) quotes and apostrophes: It should probably be 'advise against authors' using legacy encodings' or better 'advise authors against using legacy encodings'. (The current text in the spec is fine.)
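The byte-level point in Philip Taylor's example can be checked with a short sketch. It assumes Python and its built-in "cp1252" codec as a stand-in for a user agent that falls back to Windows-1252; the helper names printable_ascii_view and looks_like_markup are hypothetical and are not taken from the spec or from any browser. It only illustrates the rationale discussed above and is not a complete filter.

```python
# Bytes quoted in the thread for the ISO-2022-KR example.
PAYLOAD = bytes.fromhex("0e 3c 73 63 72 69 70 74 3e")


def printable_ascii_view(data: bytes) -> str:
    """Return the characters a decoder would see if it treated every byte
    in 0x20..0x7E as the matching ASCII character.

    Bytes outside that range are dropped. This is a simplification for
    illustration, not a model of any particular browser.
    """
    return "".join(chr(b) for b in data if 0x20 <= b <= 0x7E)


def looks_like_markup(data: bytes) -> bool:
    """Hypothetical check: does the ASCII view contain a script tag opener?"""
    return "<script" in printable_ascii_view(data).lower()


# Windows-1252 maps bytes below 0x80 to the same code points as ASCII,
# so the shift-out byte 0x0E becomes U+000E and the rest reads as markup.
as_cp1252 = PAYLOAD.decode("cp1252")
print(repr(as_cp1252))
print(looks_like_markup(PAYLOAD))
```

A check of this kind only covers the single '<script' pattern, which is why the thread argues for stating the general rationale rather than listing individual encodings.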
Re: [whatwg] Typo in Annotations for assistive technology products (ARIA) section
On Fri, 23 Oct 2009, Mark Pilgrim wrote: "th elemen that is neither a column header nor a row header" should read "th element that is neither a column header nor a row header" Fixed. -- Ian Hickson
Re: [whatwg] Another typo in Annotations for assistive technology products (ARIA) section
On Fri, 23 Oct 2009, Mark Pilgrim wrote: "Either a button element or an input element is required to when using the button role" should read "Either a button element or an input element is required when using the button role" Fixed. -- Ian Hickson
Re: [whatwg] Superset encodings [Re: ISO-8859-* and the C1 control range]
On Fri, 23 Oct 2009, Øistein E. Andersen wrote: On 23 Oct 2009, at 04:20, Ian Hickson wrote: On Wed, 21 Oct 2009, Øistein E. Andersen wrote: ASCII-compatibility: The note in 2.1.5 Character encodings seems to say that [...] ISO-2022[-*] are ASCII-compatible, whereas HZ-GB-2312 is not, and I cannot find anything in Section 2.1.5 that would explain this difference. HZ-GB-2312 uses the byte ASCII uses for ~ as the escape character. ISO-2022-* uses the control codes. That's the difference. '~'/0x7E is not (and should not be, as far as I can tell) relevant for HTML5's concept of ASCII compatibility. Good point. Moved the encoding over to the other side. The added note certainly helps, but it is vague (does "[m]ost of these encodings" mean all the encodings mentioned above apart from UTF-32?) and inaccurate (Philip Taylor's example does not rely on bugs). Given that the set of encodings is open-ended, I still think it would be preferable to make the rationale (a definition of what makes an encoding problematic) primary and mention actual encodings as examples. This could give something like the following: Encodings in which a series of bytes in the range 0x20..0x7E may encode characters other than the corresponding characters in the range U+20..U+7E represent a potential security vulnerability since a browser that does not support the encoding (or does not support the label used to declare the encoding, or does not use the same mechanism to detect the encoding of unlabelled content) might end up interpreting technically benign plain text content as HTML tags and JavaScript. In particular, this applies to encodings in which the bytes corresponding to '<script>' in ASCII may encode a different string. Authors should not use such encodings, which are known to include In addition, authors should not use UTF-32 Alternatively, fixing the current note would help and might be sufficient, albeit not ideal. I've reworded the spec based on your suggestion. Thanks! 
-- Ian Hickson