Re: [whatwg] postMessage() issues
Hi! I agree with everything Maciej said, though I'm rather impartial. The word "post" implies posting something to a queue of messages, as we've seen in other messaging APIs. There are use cases for both a synchronous and an asynchronous API, so we should support both. We could either add a parameter to postMessage, as we have for XHR, that selects the behavior (async or not), or we could have two functions: postMessage (asynchronous) and sendMessage (synchronous). Then everyone would be happy. Thank you.

2008/4/16, Maciej Stachowiak [EMAIL PROTECTED]:

> On Apr 15, 2008, at 5:10 PM, Ian Hickson wrote:
>
>> At the moment people have proposed that the API be asynchronous, and
>> some people are ok with that, but other people are strongly opposed
>> to it. I am not sure where to go with this. Input from other browser
>> vendors -- yourself and WebKit in particular -- would be very useful.
>> Right now the API is synchronous, and Mozilla reps have indicated
>> they strongly prefer that, Opera reps have indicated they don't mind,
>> and Gears reps have indicated they'd rather it be async.
>
> I think async is better, for the following reasons:
>
> - PostMessage seems to imply a message queue model.
>
> - Processing a reply synchronously is awkward in any case, since you
>   need a callback.
>
> - This is different from event dispatch because replies are expected
>   to be common; two-way communication channels like postMessage make
>   more sense as asynchronous, while event dispatch is typically
>   one-way.
>
> - Saying that runaway two-way messaging should be handled by a slow
>   script dialog seems weak to me, compared to making the mechanism
>   intrinsically resistant to runaway behavior.
>
> - Making new communication APIs async makes it more practical to
>   partition browsing contexts into separate threads, processes,
>   operation queues, or other concurrency mechanisms (within the
>   limitations of what kinds of things must be serialized).
>
> - We can foresee that workers in the style of Gears will be a future
>   use case for postMessage; in that case, it clearly must be async.
>
> However, I don't feel very strongly about this and I would consider
> synchronous postMessage acceptable. (Note also that Eric Seidel, who
> commented on this issue earlier, was also giving his feedback as a
> WebKit developer, though in both cases we speak mainly for ourselves
> and not as an official position of the whole project.)
>
> Regards,
> Maciej
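[Editorial sketch] The distinction the thread keeps returning to (immediate dispatch versus a queued message) can be modelled in a few lines of plain JavaScript. `syncPost` and `asyncPost` are invented names for illustration, not proposed API:

```javascript
// Toy model (not the DOM API) contrasting synchronous dispatch with
// queued, asynchronous dispatch. All names are illustrative only.
const log = [];
const queue = [];   // stands in for the event loop's task queue

function syncPost(handler, data) {
  handler(data);                      // handler runs before we return
}

function asyncPost(handler, data) {
  queue.push(() => handler(data));    // handler runs on a later "turn"
}

syncPost(d => log.push("handled " + d), "A");
log.push("after syncPost");

asyncPost(d => log.push("handled " + d), "B");
log.push("after asyncPost");

queue.forEach(task => task());        // drain: the later turn
// log is now:
// ["handled A", "after syncPost", "after asyncPost", "handled B"]
```

With the queued variant, the sender always finishes its current run before any handler fires, which is what makes the mechanism "intrinsically resistant to runaway behavior" as Maciej puts it.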
Re: [whatwg] Supporting MathML and SVG in text/html, and related topics
On Apr 16, 2008, at 10:47, Paul Libbrecht wrote:

> I would like to put a grain of salt here and would love HTML5
> enthusiasts to answer: why is the whole HTML5 effort not a movement
> towards a really enhanced parser instead of trying to fully redefine
> HTML successors?

text/html has immense network effects, both from the deployed base of text/html content and from the deployed base of software that deals with text/html. Failing to plug into this existing network would be extremely bad strategy. In fact, the reason why the proportion of Web pages that get parsed as XML is negligible is that the XML approach totally failed to plug into the existing text/html network effects (except for Appendix C, which lacks a migration strategy to actual XML and amounts to the emperor's new clothes).

> Being an enhanced parser (that would use a lot of context info to be
> really hand-author supportive) it would define how to better parse an
> XHTML 3 page, but also MathML and SVG as it does currently... It has
> the ability to specify very readable encodings of these pages. It
> could serve as a model for many other situations where XML parsing is
> useful but its strictness bites some.

Anne has been working on XML5, but being able to parse any well-formed stream to the same infoset as an XML 1.0 parser and being able to parse existing text/html content in a backwards-compatible way are mutually conflicting requirements. Hence, XML5 parsing won't be suitable for text/html.

> Currently HTML5 defines at the same time parsing and the model, and
> this is what can cause us to expect that XML is getting weaker. I
> believe that the whole model-definition work of XML is rich, has many
> libraries, has empowered a lot of great developments, and it is a bad
> idea to drop it instead of enriching it.

The dominant design of non-browser HTML5 parsing libraries is exposing the document tree using an XML parser API. The non-browser HTML5 libraries, therefore, plug into the network of XML libraries.

For example, Validator.nu's internals operate on SAX events that look like SAX events for an XHTML5 document. This allows Validator.nu to use libraries written for XML, such as oNVDL and Saxon.

-- 
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/
Re: [whatwg] Supporting MathML and SVG in text/html, and related topics
On Apr 16, 2008, at 12:58, Paul Libbrecht wrote:

>> In fact, the reason why the proportion of Web pages that get parsed
>> as XML is negligible is that the XML approach totally failed to plug
>> into the existing text/html network effects[...]
>
> My hypothesis here is that this problem is mostly a parsing problem
> and not a model problem. HTML5 mixes the two.

For backwards compatibility in scripted browser environments, the HTML DOM can't behave exactly like the XHTML5 DOM. For non-scripted non-browser environments, using an XML data model (XML DOM, XOM, JDOM, dom4j, SAX, ElementTree, lxml, etc.) works fine.

> There are tools that convert quite a lot of text/html pages (whose
> compliance is user-defined to be "it works in my browser") to an XML
> stream today; NekoHTML is one of them. The goal would be to formalize
> this parsing, and just this parsing.

Like NekoHTML and TagSoup, the Validator.nu HTML parser turns text/html input into Java XML models. The difference is that the Validator.nu HTML parser implements the HTML5 algorithm instead of something the authors of NekoHTML and TagSoup figured out on their own. So if you are asking for a NekoHTML-like product for HTML5, it already exists and supports three popular Java XML APIs (SAX, DOM and XOM). Not XNI, though, at the moment. (It doesn't support the recent MathML addition *yet*, though.)

http://about.validator.nu/htmlparser/

> Currently HTML5 defines at the same time parsing and the model, and
> this is what can cause us to expect that XML is getting weaker. I
> believe that the whole model-definition work of XML is rich, has many
> libraries, has empowered a lot of great developments, and it is a bad
> idea to drop it instead of enriching it.

The dominant design of non-browser HTML5 parsing libraries is exposing the document tree using an XML parser API. The non-browser HTML5 libraries, therefore, plug into the network of XML libraries.

For example, Validator.nu's internals operate on SAX events that look like SAX events for an XHTML5 document. This allows Validator.nu to use libraries written for XML, such as oNVDL and Saxon.

> So, except for needing yet another XHTML version to accommodate all
> wishes, I think it would be much saner that browsers' implementations
> and related specifications rely on an XML-based model of HTML (as the
> DOM is) instead of a coupled parsing-and-modelling specification which
> has different interpretations at different places.

HTML5 already specifies parsing in terms of DOM output. However, when the DOM is in the HTML mode, it has to be slightly different.

-- 
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/
Re: [whatwg] Supporting MathML and SVG in text/html, and related topics
On Wed, 16 Apr 2008 18:36:49 +0200, William F Hammond [EMAIL PROTECTED] wrote:

> About 7 years ago there was argument in these circles about whether
> correct xhtml+mathml could be served as text/html. As we all know, a
> clear boundary was drawn, presumably because it was onerous for
> browsers to sniff incoming content and then decide how to parse.

Actually, it was not the browsers:

http://lists.w3.org/Archives/Public/www-html/2000Sep/0024.html

> As things have evolved, we now know that browsers do, in fact, perform
> a lot of triage. See, for example, Mozilla's DOCTYPE sniffing,
> http://developer.mozilla.org/en/docs/Mozilla's_DOCTYPE_sniffing

That's a very limited set of differences, which mostly affect page layout.

> Especially since we are speaking about dual serialization of the same
> DOM, and since there is relatively little use of application/xhtml+xml
> (and some significant user agents do not support it), might it not be
> worthwhile to re-examine the question of serving standards-compliant
> xhtml or xhtml+(mathml|svg) serialized document instances as either
> text/html or application/xhtml+xml? In other words, why not be able to
> serve both serializations as text/html? What obstacles to this exist?

The Web.

-- 
Anne van Kesteren
http://annevankesteren.nl/
http://www.opera.com/
Re: [whatwg] Supporting MathML and SVG in text/html, and related topics
On Apr 16, 2008, at 9:36 AM, William F Hammond wrote:

> About 7 years ago there was argument in these circles about whether
> correct xhtml+mathml could be served as text/html. As we all know, a
> clear boundary was drawn, presumably because it was onerous for
> browsers to sniff incoming content and then decide how to parse. As
> things have evolved, we now know that browsers do, in fact, perform a
> lot of triage. See, for example, Mozilla's DOCTYPE sniffing,
> http://developer.mozilla.org/en/docs/Mozilla's_DOCTYPE_sniffing
>
> Especially since we are speaking about dual serialization of the same
> DOM, and since there is relatively little use of application/xhtml+xml
> (and some significant user agents do not support it), might it not be
> worthwhile to re-examine the question of serving standards-compliant
> xhtml or xhtml+(mathml|svg) serialized document instances as either
> text/html or application/xhtml+xml? In other words, why not be able to
> serve both serializations as text/html? What obstacles to this exist?

It's not entirely clear what your proposal is, but I assume you are suggesting that content served as text/html with an XHTML doctype declaration should be parsed as XML. The obstacle to this is that much text/html content has an XHTML doctype declaration but depends on being parsed and otherwise processed as HTML, not XML, as current user agents do it. Such content is fairly widespread due to the legacy of Appendix C. It is preferable to let the MIME type continue to be the switch rather than making the doctype serve this role.

An additional obstacle in the case of HTML5 is that the XML serialization does not have a distinct doctype (XHTML5 documents may use the common HTML5 doctype or no doctype at all, which when parsed as text/html would be treated as an HTML document in quirks mode).

Regards,
Maciej
Re: [whatwg] Supporting MathML and SVG in text/html, and related topics
On Thursday, 10-04-2008, at 09:51, Ian Hickson wrote:

> On Sat, 4 Nov 2006, Paul Topping wrote:
>
>> Elements whose namespaces aren't known should be handled like any
>> other unknown HTML element. I believe the common way for user agents
>> to handle an unknown element is basically to ignore the tag and its
>> attributes and treat any text between start and end tags as if the
>> tags weren't there. Namespaces do not present any new challenge in
>> this area. Bogus namespaces are no more of a security risk than bogus
>> HTML tags. It is only the ones that ARE processed by the user agent
>> that represent potential security risks.
>
> The problem is legacy content like:
>
>   <html foo xmlns="bogus namespace">
>   ...rest of HTML document...
>
> We don't want to make the whole document get ignored.

An example of such a tag is the Microsoft HTML application indicator, which is empty by design. But how does Paul's recipe amount to ignoring the whole document?

> If anyone is actually reading this 3363 line e-mail, I'm impressed.
> Please do let me know that you read this.

I do not do bungee jumping though.
Re: [whatwg] Question about the PICS label in HTML5
On 16/04/2008, Marco [EMAIL PROTECTED] wrote:

> I've been looking through the HTML5 working draft and I've been trying
> to find a reference for the use of the current PICS labels.

I may have missed it, but does anyone, anywhere, actually use PICS? I don't think I've even heard the name uttered in a few years - I assumed it had died of neglect and lack of interest.

- d.
Re: [whatwg] Supporting MathML and SVG in text/html, and related topics
On Wed, 16 Apr 2008 22:01:49 +0200, William F Hammond [EMAIL PROTECTED] wrote:

> Anne van Kesteren [EMAIL PROTECTED] writes:
>
>> The Web.
>
> Really!?!

Yes, see for instance:

http://lists.w3.org/Archives/Public/public-html/2007Aug/1248.html

> It's time for user agents to stop supporting bogus document preambles.

Please keep the discussion realistic.

-- 
Anne van Kesteren
http://annevankesteren.nl/
http://www.opera.com/
Re: [whatwg] Question about the PICS label in HTML5
On 16/04/2008, David Gerard [EMAIL PROTECTED] wrote:

> I may have missed it, but does anyone, anywhere, actually use PICS? I
> don't think I've even heard the name uttered in a few years - I
> assumed it had died of neglect and lack of interest.

About 1% of the pages listed on dmoz.org attempt to use it -- see http://philip.html5.org/data/pics-label.html

(I have no idea how many of those uses are syntactically valid (maybe someone could test that if they're quite bored), or are appropriate for the page's content.)

-- 
Philip Taylor
[EMAIL PROTECTED]
Re: [whatwg] postMessage() issues
So one thing I should note first of all is that the implementation currently in the Firefox 3 betas is synchronous. It is unlikely that we can get this changed before final shipping, since we are more or less in code freeze already. Of course, we implemented this knowing that it's part of HTML5, which is nowhere near complete, so obviously we were aware that it might change. However, it might mean that developers will have to put in workarounds in order to support the FF3 release :(

Maciej Stachowiak wrote:

> On Apr 15, 2008, at 5:10 PM, Ian Hickson wrote:
>
>> At the moment people have proposed that the API be asynchronous, and
>> some people are ok with that, but other people are strongly opposed
>> to it. I am not sure where to go with this. Input from other browser
>> vendors -- yourself and WebKit in particular -- would be very useful.
>> Right now the API is synchronous, and Mozilla reps have indicated
>> they strongly prefer that, Opera reps have indicated they don't mind,
>> and Gears reps have indicated they'd rather it be async.
>
> I think async is better, for the following reasons:
>
> - PostMessage seems to imply a message queue model.

I think this is a pretty weak argument one way or another. It's IMHO much more important that we create an API that is the most usable we can make it.

> - Processing a reply synchronously is awkward in any case, since you
>   need a callback.

I'm not sure I follow this argument; I actually come to the opposite conclusion. Say that a page is communicating with multiple iframes using postMessage, and expects replies from all of them. If postMessage is synchronous, it is easy to associate a given reply with a given postMessage call: it's simply the reply you get between the time you make the postMessage call and when it returns. So you could install a generic listener for the message event and let the listener set a global variable. Then you just do a postMessage and pick up the reply from the global variable.

If postMessage is asynchronous, you need to agree on using some identifier in the messages, or you have to use the pipes mechanism for all communication. Granted, with JavaScript generators you can almost get the same behavior as for synchronous calling, but that is non-trivial.

> - This is different from event dispatch because replies are expected
>   to be common; two-way communication channels like postMessage make
>   more sense as asynchronous, while event dispatch is typically
>   one-way.

Why does two-way communication make more sense asynchronous? See above for why responses are more complicated with async communication.

> - Saying that runaway two-way messaging should be handled by a slow
>   script dialog seems weak to me, compared to making the mechanism
>   intrinsically resistant to runaway behavior.

I'm not sure why we think that runaway two-way messaging is going to be a likely problem. Do we have runaway two-way function calling as a big problem now? I would in fact say that async calling would be more likely to cause runaway cross-messaging. If you have sync calling you'll quickly recurse to death and notice an exception being thrown. With async calling you simply hog the CPU.

> - Making new communication APIs async makes it more practical to
>   partition browsing contexts into separate threads, processes,
>   operation queues, or other concurrency mechanisms (within the
>   limitations of what kinds of things must be serialized).
>
> - We can foresee that workers in the style of Gears will be a future
>   use case for postMessage; in that case, it clearly must be async.

These both are good points.

/ Jonas
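[Editorial sketch] The global-variable pattern Jonas describes relies on the reply arriving before postMessage returns. A sketch of that flow, with the synchronous target stood in by a plain function (a real page would use frame.contentWindow.postMessage and a "message" event listener; all names and the reply protocol are invented for illustration):

```javascript
// Sketch of the reply pattern Jonas describes, assuming a synchronous
// postMessage. The target window is modelled by a plain function.
let lastReply = null;

// Generic listener: stash whatever reply comes in.
const onMessage = (e) => { lastReply = e.data; };

// Stand-in for synchronous dispatch: the target handles the message
// and posts its reply back before this call returns.
function syncPostMessage(data) {
  onMessage({ data: "reply to " + data });  // target replies in-line
}

syncPostMessage("get-state");
// Because dispatch was synchronous, the reply is already here:
console.log(lastReply);  // prints "reply to get-state"
```

The appeal is exactly that no correlation bookkeeping is needed: whatever arrived between the call and its return is the answer.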
Re: [whatwg] postMessage() issues
On Wed, Apr 16, 2008 at 3:17 PM, Jonas Sicking [EMAIL PROTECTED] wrote:

>> - Processing a reply synchronously is awkward in any case, since you
>>   need a callback.
>
> I'm not sure I follow this argument; I actually come to the opposite
> conclusion. Say that a page is communicating with multiple iframes
> using postMessage, and expects replies from all of them.

I think the argument assumed you were communicating with a single frame in the common case, in which case the current API is more awkward than one in which the postMessage() call itself returns the response, requiring no listener at all.

>> - This is different from event dispatch because replies are expected
>>   to be common; two-way communication channels like postMessage make
>>   more sense as asynchronous, while event dispatch is typically
>>   one-way.
>
> Why does two-way communication make more sense asynchronous? See above
> for why responses are more complicated with async communication.

From one of Aaron Boodman's mails: if you're doing a postMessage() response back to a frame when it calls you, then the original frame will get called with your response before its original postMessage() actually returns. This nesting feels bizarre compared to a more linear "I send a message, then I get a response" flow.

PK
Re: [whatwg] postMessage() issues
Peter Kasting wrote:

> I think the argument assumed you were communicating with a single
> frame in the common case, in which case the current API is more
> awkward than one in which the postMessage() call itself returns the
> response, requiring no listener at all.

No one is proposing an API where postMessage returns an actual result, though, right? And that would definitely require synchronous dispatch.

> From one of Aaron Boodman's mails: if you're doing a postMessage()
> response back to a frame when it calls you, then the original frame
> will get called with your response before its original postMessage()
> actually returns. This nesting feels bizarre compared to a more linear
> "I send a message, then I get a response" flow.

Yes, the nesting does feel a bit unusual. But it still seems easier to me to use, since you'll get access to a result right after the call to postMessage, similar to a normal function call. No need to stow away any state you are currently carrying and then bring that back once you get a message back.

/ Jonas
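[Editorial sketch] With asynchronous delivery, the correlation Jonas mentions has to be done by hand, typically with an identifier in each message. A minimal sketch of that bookkeeping (the {id, body} envelope and every name here are invented conventions for illustration, not anything in the spec):

```javascript
// Sketch of id-based reply correlation, which async messaging forces
// on the sender.
let nextId = 0;
const pending = {};  // id -> callback waiting for that reply

function request(send, body, callback) {
  const id = nextId++;
  pending[id] = callback;
  send({ id: id, body: body });  // e.g. otherWindow.postMessage(...)
}

function onReply(msg) {          // e.g. a "message" event handler
  const callback = pending[msg.id];
  delete pending[msg.id];
  callback(msg.body);
}

// Demo: a fake peer that queues its reply for a later "turn".
const queue = [];
const replies = [];
request(m => queue.push({ id: m.id, body: "echo:" + m.body }),
        "hello",
        r => replies.push(r));
queue.forEach(onReply);          // the later event-loop turn
console.log(replies);            // ["echo:hello"]
```

This is the extra state-stowing Jonas objects to; the synchronous version needs none of it.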
Re: [whatwg] text/html for html and xhtml (Was: Supporting MathML and SVG in text/html, and related topics)
William F Hammond wrote:

> The experiment begun around 2001 of punishing bad documents in
> application/xhtml+xml seems to have led to that MIME type not being
> much used.

That has more to do with the fact that it wasn't supported in browsers used by 90+% of users for a number of years.

> So user agents need to learn how to recognize the good and the bad in
> both MIME types.

Recognize and do what with it?

> Otherwise you have Gresham's Law: the bad documents will drive out the
> good.

Perhaps you should clearly state your definitions of "bad" and "good" in this case? I'd also like to know, given those definitions, why it's bad for the bad documents to drive out the good, and how you think your proposal will prevent that from happening.

> If it has a preamble beginning with <?xml or a sensible xhtml DOCTYPE
> declaration or a first element <html xmlns="...">, then handle it as
> xhtml unless and until it proves to be non-compliant xhtml (e.g., not
> well-formed xml, unquoted attributes, munged handling of xml
> namespaces, ...). At the point it proves to be bad xhtml, reload it
> and treat it as regular html.

What's the benefit? This seems to give the worst of both worlds, as well as a poor user experience.

> So most bogus xhtml will then be 1 or 2 seconds slower than good
> xhtml. Astute content providers will notice that and then do something
> about it. It provides a feedback mechanism for making the web become
> better.

In the meantime, it punishes the users for things outside their control by degrading their user experience. It also provides a competitive advantage to UAs who ignore your proposal. Sounds like an unstable equilibrium to me, even if attainable.

-Boris
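[Editorial sketch] Hammond's sniff-then-fall-back scheme, and the double-parse cost Boris objects to, can be modelled in toy form. The detection and well-formedness checks below are deliberately naive stand-ins, not anything a real browser does; the point is only the extra, user-visible re-parse on bad input:

```javascript
// Toy model of the "sniff, try XML, reload as HTML on failure" scheme.
function looksLikeXhtml(src) {
  return src.startsWith("<?xml") ||
         src.includes('xmlns="http://www.w3.org/1999/xhtml"');
}

function toyWellFormed(src) {
  // Hopelessly naive stand-in: just require attribute values quoted.
  return !/=\s*[^"'\s>][^\s>]*/.test(src);
}

function load(src, log) {
  if (looksLikeXhtml(src)) {
    if (toyWellFormed(src)) {
      log.push("parsed as XML");
    } else {
      log.push("XML parse failed");   // time already spent...
      log.push("re-parsed as HTML");  // ...the "reload" penalty
    }
  } else {
    log.push("parsed as HTML");
  }
}

const log = [];
load('<?xml version="1.0"?><html lang=en></html>', log);
console.log(log);  // ["XML parse failed", "re-parsed as HTML"]
```

Every document that takes the failure path pays for two parses, which is the slowdown Hammond proposes as a feedback mechanism and Boris calls a degraded user experience.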