Re: [whatwg] URL: file: URLs
On Sun, Oct 28, 2012 at 6:51 PM, Boris Zbarsky bzbar...@mit.edu wrote:
> Same as the comment I quoted? Or the same as something else?

Same as you quoted.

> Well, the Gecko parser preserves the host at this stage assuming the URI was correctly formatted with a host. Again:
>
>   blah://foo/bar = blah://foo/bar
>
> The interesting things happen when you have 0, 1, or 3 slashes between ':' and foo. The handling of foo after this point is a separate issue.

Those are handled the same as in Gecko (this also matches Safari, I think; Chrome strips all starting slashes, e.g. if you have four, but I did not copy that).

> In Gecko, it's part of URL parsing. More precisely, it's part of the normalization performed as part of constructing a URL object from a string. Since this is also how we parse URLs, it's effectively all part of the package.
>
> But note that it would be a bit odd if file://c:/ claimed to have a host of c with a default port or some such...

Maybe I should introduce a file host state that supports colons in the host name (or special case the host state further, but the former seems cleaner). Most browsers seem to fail currently on input such as file://c:/ but this is on a Mac so maybe that's the difference. I would prefer having the parsing be consistent though.

> 7 and 8 are not, though at some point we'll need to define equality comparisons anyway.

Yeah, I guess at some point someone would need to write a "processing file: URLs" specification (for post-parsing operations). On the other hand, it's not entirely clear to me that this needs to be interoperable.

--
http://annevankesteren.nl/
Re: [whatwg] URL: file: URLs
On 10/29/12 5:00 AM, Anne van Kesteren wrote:
>> But note that it would be a bit odd if file://c:/ claimed to have a host of c with a default port or some such...
>
> Maybe I should introduce a file host state that supports colons in the host name (or special case the host state further, but the former seems cleaner).

I don't think that's particularly desirable. The c: is totally part of the path; treating it otherwise would just be confusing. Imo.

> Most browsers seem to fail currently on input such as file://c:/ but this is on a Mac

Yes, doing that on a Mac would just be wrong.

> I would prefer having the parsing be consistent though.

You mean across Windows and non-Windows? I'm not sure that's viable.

-Boris
Re: [whatwg] URL: file: URLs
On Mon, Oct 29, 2012 at 3:13 PM, Boris Zbarsky bzbar...@mit.edu wrote:
> On 10/29/12 5:00 AM, Anne van Kesteren wrote:
>> Maybe I should introduce a file host state that supports colons in the host name (or special case the host state further, but the former seems cleaner).
>
> I don't think that's particularly desirable. The c: is totally part of the path; treating it otherwise would just be confusing. Imo.

But at that point in a URL you cannot have a path. A path starts with a slash after the host. Especially if you want file://test/ to parse with test being the host.

>> Most browsers seem to fail currently on input such as file://c:/ but this is on a Mac
>
> Yes, doing that on a Mac would just be wrong

I suppose, though I would hate for new URL(...) to depend on the platform.

--
http://annevankesteren.nl/
Re: [whatwg] URL: file: URLs
On 10/29/12 10:53 AM, Anne van Kesteren wrote:
> But at that point in a URL you cannot have a path. A path starts with a slash after the host.

The point is that on Windows, Gecko parses

  file://c:/something

as

  file:///c:/something

As in, it's an exception to the general "if there are two slashes after the file: then the next thing is a host" rule.

> I suppose, I would hate it though for new URL(...) to depend on the platform.

I'm not sure there are great solutions here. :(

-Boris
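The Windows exception described in this message can be sketched roughly as follows. This is only an illustration, not Gecko's code: the function name normalizeFileUrl and the isWindows flag are assumptions made for the example, and a real parser deals with many more cases (backslashes, other slash counts, percent-encoding).

```javascript
// Hypothetical helper, not taken from any browser: moves a drive letter that
// sits in the host position of a file: URL into the path.
function normalizeFileUrl(input, isWindows) {
  // "file://", then one ASCII letter and a colon, then a slash or end of input.
  const driveInHostPosition = /^file:\/\/[A-Za-z]:(\/|$)/i;
  if (isWindows && driveInHostPosition.test(input)) {
    // Add the third slash: the host becomes empty and "c:" joins the path.
    return input.replace(/^file:\/\//i, "file:///");
  }
  // Otherwise the general rule applies: two slashes, then a host.
  return input;
}

const onWindows = normalizeFileUrl("file://c:/something", true);
const elsewhere = normalizeFileUrl("file://c:/something", false);
const withHost = normalizeFileUrl("file://test/", true);
```

The explicit isWindows parameter makes the platform dependence visible, which is exactly the property objected to earlier in the thread for new URL(...).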
[whatwg] Proposal for window.DocumentType.prototype.toString
Hi everybody!

Serializing a complete HTML document DOM to a string is surprisingly hard in javascript. As a fairly seasoned javascript hacker I figured this might do it:

  document.doctype + document.documentElement.outerHTML

It doesn't. No browser has a useful window.DocumentType.prototype.toString that returns either the original document's <!DOCTYPE ...> before parsing – or a semantically equivalent post-parsing one. Google Chrome shows one in its devtools, but does not seem to expose a way of getting at it to programmers.

My proposal is we specify this more useful behaviour for javascript-running browsers, so it does become as simple as above.

A rough sketch of how a polyfill might implement the latter window.DocumentType.prototype.toString: https://gist.github.com/3977584

Even as a polyfill, the above is rather limited, though: I believe only Firefox implements internalSubset today, and probably only in XML contexts.

The most useful implementation would IMO be a native one that reproduces the doctype, as it was formatted in the source document.

Thoughts?

--
/ Johan Sundström, http://ecmanaut.blogspot.com/
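For illustration, a toString of the "semantically equivalent" kind could look roughly like the sketch below. It is not the linked gist: doctypeToString and FakeDocumentType are hypothetical names, the stand-in class exists only so the example runs outside a browser, and internalSubset is ignored. The name, publicId and systemId properties are the ones DOM DocumentType nodes carry.

```javascript
// Builds a doctype string from a node's name, publicId and systemId.
function doctypeToString(doctype) {
  let out = "<!DOCTYPE " + doctype.name;
  if (doctype.publicId) {
    out += ' PUBLIC "' + doctype.publicId + '"';
    if (doctype.systemId) out += ' "' + doctype.systemId + '"';
  } else if (doctype.systemId) {
    out += ' SYSTEM "' + doctype.systemId + '"';
  }
  return out + ">";
}

// Hypothetical stand-in for a DocumentType node. In a browser the same effect
// would come from assigning a function to DocumentType.prototype.toString.
class FakeDocumentType {
  constructor(name, publicId, systemId) {
    this.name = name;
    this.publicId = publicId;
    this.systemId = systemId;
  }
  toString() {
    return doctypeToString(this);
  }
}

// String concatenation calls toString(), so the one-liner now works.
const html5 = new FakeDocumentType("html", "", "");
const page = html5 + "<html><head></head><body></body></html>";
```

Such a string reflects the parsed node only; the original formatting of the doctype in the source is not recoverable this way.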
Re: [whatwg] Proposal for window.DocumentType.prototype.toString
On 10/29/12 8:58 PM, Johan Sundström wrote:
> Serializing a complete HTML document DOM to a string is surprisingly hard in javascript.

I thought there were plans to put innerHTML on Document. Did that go nowhere?

> As a fairly seasoned javascript hacker I figured this might do it:
>
>   document.doctype + document.documentElement.outerHTML

This seems lossy in many cases (most obviously: when the HTML uses conditional comments, though there are also various XHTML-specific issues).

> The most useful implementation would IMO be a native one that reproduces the doctype, as it was formatted in the source document.

That might be worth doing independent of the serialization issue.

-Boris
Re: [whatwg] Proposal for window.DocumentType.prototype.toString
On Mon, Oct 29, 2012 at 6:17 PM, Boris Zbarsky bzbar...@mit.edu wrote:
> On 10/29/12 8:58 PM, Johan Sundström wrote:
>> Serializing a complete HTML document DOM to a string is surprisingly hard in javascript.
>
> I thought there were plans to put innerHTML on Document. Did that go nowhere?

There were plans to put it on DocumentFragment. But IIRC no other browser vendors voiced an interest, and Hixie was opposed because he thought it would encourage people to do more string-based DOM building. The WebKit patch for this floundered as a result. I still think it's a good idea.
Re: [whatwg] Proposal for window.DocumentType.prototype.toString
On Mon, 29 Oct 2012, Johan Sundström wrote:
> Serializing a complete HTML document DOM to a string is surprisingly hard in javascript. As a fairly seasoned javascript hacker I figured this might do it:
>
>   document.doctype + document.documentElement.outerHTML
>
> It doesn't. No browser has a useful window.DocumentType.prototype.toString that returns either the original document's <!DOCTYPE ...> before parsing – or a semantically equivalent post-parsing one.

If you know the document is always going to be in the no-quirks mode, then you can just stick <!DOCTYPE HTML> at the start. If you need to be able to tell what the mode is but are ok with ignoring the limited quirks mode, then you can use document.compatMode to pick whether to use that string or none, as in:

  (document.compatMode == 'CSS1Compat' ? '<!DOCTYPE HTML>' : '') + document.documentElement.outerHTML

That will drop any comment nodes around the root element, in case that matters.

If you want to get the actual DOCTYPE strings, you can make a simple serialisation function for doctype nodes that uses the three attributes on that object to string together the full thing (much as you do in the polyfill you mentioned).

> I believe only Firefox implements internalSubset today

Since the internal subset has no meaning in text/html, that's ok if your goal is just to be semantically equivalent.

> The most useful implementation would IMO be a native one that reproduces the doctype, as it was formatted in the source document.

What's your use case, exactly?

On Mon, 29 Oct 2012, Boris Zbarsky wrote:
> I thought there were plans to put innerHTML on Document. Did that go nowhere?

Lack of implementor interest killed it a while ago.

On Mon, 29 Oct 2012, Ojan Vafai wrote:
> On Mon, Oct 29, 2012 at 6:17 PM, Boris Zbarsky bzbar...@mit.edu wrote:
>> I thought there were plans to put innerHTML on Document. Did that go nowhere?
>
> There were plans to put it on DocumentFragment.

That was a different plan, but yes, there have also been proposals to do that. This was in the context of templates; a better solution to which has since been worked on in public-webapps.

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
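The compatMode approach from this message can be wrapped in a small function, sketched below. The name serializeByCompatMode is an assumption for the example, and the function takes the document as a parameter only so that it can be tried with a plain object outside a browser; in a page it would be called as serializeByCompatMode(document).

```javascript
// Prefixes <!DOCTYPE HTML> when the document is not in quirks mode.
// compatMode is 'CSS1Compat' in no-quirks and limited-quirks mode and
// 'BackCompat' in quirks mode, so limited quirks is not distinguished.
function serializeByCompatMode(doc) {
  const prefix = doc.compatMode === "CSS1Compat" ? "<!DOCTYPE HTML>" : "";
  // Comment nodes outside the root element are not included.
  return prefix + doc.documentElement.outerHTML;
}

// Plain objects standing in for documents (hypothetical test data).
const standardsDoc = {
  compatMode: "CSS1Compat",
  documentElement: { outerHTML: "<html><head></head><body></body></html>" },
};
const quirksDoc = {
  compatMode: "BackCompat",
  documentElement: { outerHTML: "<html><head></head><body></body></html>" },
};

const standardsOut = serializeByCompatMode(standardsDoc);
const quirksOut = serializeByCompatMode(quirksDoc);
```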