Re: [whatwg] Proposal for window.DocumentType.prototype.toString
On Tue, Oct 30, 2012 at 3:20 AM, Stewart Brodie wrote:
>> Hi everybody!
>>
>> Serializing a complete HTML document DOM to a string is surprisingly hard in javascript.
>
> Does XMLSerializer().serializeToString(document) not meet your requirement?

Ah – good thinking. (new XMLSerializer).serializeToString(document) does indeed do a pretty excellent job of it, including the crazy hacks people do with conditional comments outside of the root node, which I hadn't figured I would be able to piece back together from an already parsed page.

While I hate to admit it, maybe on some level there is a benefit to so many of the DOM APIs being javascript-hostile: it forces you towards the occasional really well-paved paths like the above, when you can find them.

My use case was taking as good a snapshot as possible of an already live web page's structure from a non-privileged bookmarklet, for archival purposes (i.e. essentially what a curl of the page would do). For my purposes, it is a bonus that I actually get the current state of the page with whatever DOM mods have transpired since it loaded rather than what curl would produce, so I think XMLSerializer is a good friend.

That said, I would still much enjoy a future where javascript:alert(document.doctype) would tell you something rich about the page, something that today takes deep knowledge of document.compatMode and/or combinations of XMLSerializer and parsers, or deep study of DocumentType refdocs, to tease out. Is there a case against it, in that people would use it where they ought to pick other solutions?

--
/ Johan Sundström, http://ecmanaut.blogspot.com/

> --
> Stewart Brodie
> Team Leader - ANT Galio Browser
> ANT Software Limited
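For readers following along, here is a minimal sketch of what a "rich" string form of a doctype could look like. It only assumes the standard DocumentType properties name, publicId and systemId; the helper name doctypeToString is made up for illustration, it is not the polyfill from the gist, and it does not attempt to reproduce an internal subset or the original source formatting.

```javascript
// Hypothetical helper (not a browser API): builds a <!DOCTYPE ...> string
// from the standard DocumentType properties name, publicId and systemId.
function doctypeToString(doctype) {
  if (!doctype) return ""; // documents without a doctype have doctype === null
  let out = "<!DOCTYPE " + doctype.name;
  if (doctype.publicId) {
    out += ' PUBLIC "' + doctype.publicId + '"';
  } else if (doctype.systemId) {
    out += " SYSTEM";
  }
  if (doctype.systemId) {
    out += ' "' + doctype.systemId + '"';
  }
  return out + ">";
}

// A plain object stands in for a DocumentType node so the sketch runs
// outside a browser; in a page you would pass document.doctype instead.
const html5Doctype = { name: "html", publicId: "", systemId: "" };
console.log(doctypeToString(html5Doctype));
```

In a browser, doctypeToString(document.doctype) + document.documentElement.outerHTML would give the concatenation from the original proposal, though it would still drop comments outside the root element, which is the part XMLSerializer handles.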
Re: [whatwg] URL: file: URLs
On Tue, 30 Oct 2012 16:25:30 -, Anne van Kesteren wrote:
> On Mon, Oct 29, 2012 at 4:24 PM, Boris Zbarsky wrote:
>> On 10/29/12 10:53 AM, Anne van Kesteren wrote:
>>> But at that point in a URL you cannot have a path. A path starts with a slash after the host.
>>
>> The point is that on Windows, Gecko parses file://c:/something as file:///c:/something
>>
>> As in, it's an exception to the general "if there are two slashes after the "file:" then the next thing is a host" rule.
>
> Thanks, I missed that. It seems however we could have that parsing rule for all platforms without issue, no? After all, file://c:/ does not parse currently on non-Windows platforms.
>
>>> I suppose, I would hate it though for new URL(...) to depend on the platform.
>>
>> I'm not sure there are great solutions here. :(
>
> Yeah, I'm willing to suck it up, but I would like to explore our options before we go that route.

In both Firefox and Chrome, if you type file://aaa/some/path or file://localhost/some/path, the aaa and localhost parts are ignored, and the rest of the path is interpreted as a local file path. In Opera, anything that is not localhost gives an error.

I currently do not have Windows to test, but I think I recall IE (or Opera?) opening file://server/share if there was a network share at \\server\share.

In a previous job I had, where the environment was a bit Windows-centric, there was a wiki with documentation that linked to files on network shares. I recall the URLs looked something like "file:\\server\some\path" in the HTML. IE opened the files (hence people continued to write them). The other browsers didn't.

The point is that the file URI can and should have the authority part, or host, and that can be the local machine or a network share.
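The drive-letter exception discussed above can be sketched as a plain string rewrite. This is only an illustration of the rule as described in the thread, not Gecko's parser or any specified algorithm; the function name normalizeFileUrl and the exact pattern are assumptions.

```javascript
// Hypothetical normalizer: if "file://" is followed directly by a single
// drive letter and a colon, treat that as the start of the path rather than
// as a host. Anything else (including file://server/share) is left alone.
function normalizeFileUrl(input) {
  const match = /^file:\/\/([A-Za-z]):(\/.*)?$/.exec(input);
  if (match) {
    return "file:///" + match[1] + ":" + (match[2] || "/");
  }
  return input;
}

console.log(normalizeFileUrl("file://c:/something"));
console.log(normalizeFileUrl("file://server/share"));
```

Applying the same rewrite on every platform is what would keep new URL(...) from depending on the operating system, while a real host such as a network share name still ends up in the authority part.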
Re: [whatwg] URL: file: URLs
On Tue, 30 Oct 2012 18:38:46 +0200, Boris Zbarsky wrote:
> On 10/30/12 12:25 PM, Anne van Kesteren wrote:
>> Thanks, I missed that. It seems however we could have that parsing rule for all platforms without issue, no?
>
> Hmm. Possibly, yes. I'd love feedback from other UAs here!

My knee-jerk reaction is the same as Anne's; why not do this for all platforms?

--
Simon Pieters
Opera Software
Re: [whatwg] URL: query encoding
On Tue, 30 Oct 2012, Simon Pieters wrote:
> On Fri, 26 Oct 2012 17:23:53 +0300, Anne van Kesteren wrote:
> >
> > Currently encoding the query component of a URL using the document's encoding affects all URLs with a "relative scheme" (http/ws/file/...). Should we restrict this to http/https/file so new schemes such as ws/wss and others will not be affected by this weird legacy quirk?
>
> So in Opera this quirk does not apply to ws: or wss:.

Even in ? Or just in |new WebSocket()|?

(There's no way to test non-ws:/wss: in the WebSocket constructor, so asking if it's scheme-specific in the constructor can't be answered.)

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] URL: query encoding
On Tue, Oct 30, 2012 at 5:51 PM, Boris Zbarsky wrote:
> I would not be opposed to us explicitly specifying things this way. That would incidentally require specs to say exactly when some non-UTF8 encoding is supposed to be used for their URIs and what that encoding should be, which seems like a good thing to me.

The URL Standard defines it in exactly this way and once the other specifications are updated to use it, it should all fall into place.

The question here is primarily whether we want to introduce the additional axis of the scheme to constrain this kind of ugly behavior even more. But maybe we should just tell people to use utf-8.

--
http://annevankesteren.nl/
Re: [whatwg] URL: query encoding
On 10/30/12 11:43 AM, Simon Pieters wrote:
> The above applies to what gets sent over the wire when using the WebSocket(...) constructor. For , the results are different: http://simon.html5.org/test/url/url-encoding.html I don't have an opinion at this point about what to do here.

In Gecko, at least, when a URL object is constructed from a string the caller can specify an encoding to use for the URL. The URL code then does things that depend on what that encoding was. Apart from the hierarchical vs not distinction, I believe the handling of the encoding does not depend on scheme in Gecko.

If no encoding is specified, UTF-8 is assumed.

passes in the document encoding as the encoding to use when constructing the URL object.

The WebSocket constructor does not pass in an encoding when constructing the URL object, so UTF-8 is used.

I would not be opposed to us explicitly specifying things this way. That would incidentally require specs to say exactly when some non-UTF8 encoding is supposed to be used for their URIs and what that encoding should be, which seems like a good thing to me.

-Boris
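To make the "caller specifies an encoding, UTF-8 otherwise" model concrete, here is a rough sketch. It is not Gecko's code and not the URL Standard's algorithm: the function names are invented, the set of characters left unescaped is a simplification, and the only legacy encoding handled is ISO-8859-1 (which agrees with windows-1252 for U+00A0 to U+00FF).

```javascript
// Percent-encodes every byte except a small, simplified safe set.
function percentEncodeBytes(bytes) {
  let out = "";
  for (const b of bytes) {
    const ch = String.fromCharCode(b);
    if (b < 0x80 && /[A-Za-z0-9\-._~=&]/.test(ch)) {
      out += ch;
    } else {
      out += "%" + b.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

// Illustrative single-byte encoder; throws for anything outside ISO-8859-1
// instead of emulating what browsers do for unencodable characters.
function latin1Bytes(str) {
  return Array.from(str, (c) => {
    const cp = c.codePointAt(0);
    if (cp > 0xff) throw new RangeError("not representable in ISO-8859-1");
    return cp;
  });
}

// Hypothetical entry point: the caller may pass an encoding; UTF-8 is the default.
function encodeQuery(query, encoding = "utf-8") {
  const bytes = encoding === "utf-8"
    ? new TextEncoder().encode(query)
    : latin1Bytes(query);
  return percentEncodeBytes(bytes);
}

console.log(encodeQuery("q=\u00E9"));           // UTF-8 bytes
console.log(encodeQuery("q=\u00E9", "latin1")); // document-encoding bytes
```

In this model the WebSocket constructor case corresponds to calling encodeQuery without an encoding, and the quirk under discussion corresponds to a caller passing the document's encoding.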
Re: [whatwg] URL: query encoding
On Tue, 30 Oct 2012 17:20:33 +0200, Simon Pieters wrote:
> On Fri, 26 Oct 2012 17:23:53 +0300, Anne van Kesteren wrote:
>> Currently encoding the query component of a URL using the document's encoding affects all URLs with a "relative scheme" (http/ws/file/...). Should we restrict this to http/https/file so new schemes such as ws/wss and others will not be affected by this weird legacy quirk?
>
> So in Opera this quirk does not apply to ws: or wss:. We have a test case for this. I believe the spec required this at the time we implemented it. Firefox passes the test as well, but Chrome fails it (Chrome has the quirk). I don't know what IE does. I tentatively suggest we go with Opera/Firefox here and limit this quirk so it does not apply to ws: or wss:.

The above applies to what gets sent over the wire when using the WebSocket(...) constructor. For , the results are different:

http://simon.html5.org/test/url/url-encoding.html

I don't have an opinion at this point about what to do here.

> Is that something implementors would consider following? The parsing section of http://url.spec.whatwg.org/ has this as an open issue for now.

cheers

--
Simon Pieters
Opera Software
Re: [whatwg] URL: file: URLs
On 10/30/12 12:25 PM, Anne van Kesteren wrote:
> Thanks, I missed that. It seems however we could have that parsing rule for all platforms without issue, no?

Hmm. Possibly, yes. I'd love feedback from other UAs here!

-Boris
Re: [whatwg] URL: file: URLs
On Mon, Oct 29, 2012 at 4:24 PM, Boris Zbarsky wrote:
> On 10/29/12 10:53 AM, Anne van Kesteren wrote:
>> But at that point in a URL you cannot have a path. A path starts with a slash after the host.
>
> The point is that on Windows, Gecko parses file://c:/something as file:///c:/something
>
> As in, it's an exception to the general "if there are two slashes after the "file:" then the next thing is a host" rule.

Thanks, I missed that. It seems however we could have that parsing rule for all platforms without issue, no? After all, file://c:/ does not parse currently on non-Windows platforms.

>> I suppose, I would hate it though for new URL(...) to depend on the platform.
>
> I'm not sure there are great solutions here. :(

Yeah, I'm willing to suck it up, but I would like to explore our options before we go that route.

--
http://annevankesteren.nl/
Re: [whatwg] URL: query encoding
On Fri, 26 Oct 2012 17:23:53 +0300, Anne van Kesteren wrote:
> Currently encoding the query component of a URL using the document's encoding affects all URLs with a "relative scheme" (http/ws/file/...). Should we restrict this to http/https/file so new schemes such as ws/wss and others will not be affected by this weird legacy quirk?

So in Opera this quirk does not apply to ws: or wss:. We have a test case for this. I believe the spec required this at the time we implemented it. Firefox passes the test as well, but Chrome fails it (Chrome has the quirk). I don't know what IE does.

I tentatively suggest we go with Opera/Firefox here and limit this quirk so it does not apply to ws: or wss:. Is that something implementors would consider following?

The parsing section of http://url.spec.whatwg.org/ has this as an open issue for now.

cheers

--
Simon Pieters
Opera Software
Re: [whatwg] Proposal for window.DocumentType.prototype.toString
On 30 Oct 2012 at 10:20, Stewart Brodie wrote:
> Johan Sundström wrote:
>> Serializing a complete HTML document DOM to a string is surprisingly hard in javascript.
>
> Does XMLSerializer().serializeToString(document) not meet your requirement?

I was wondering that too. I use it to get the content of an iframe into a string, so I can send that off to a database. Seems to work without problems (Safari Mac 6.0.1). But I too had to ask how to do that; it wasn't particularly obvious that that was what I should have been using (to me at any rate).

--
Cheers -- Tim
Re: [whatwg] whatwg Digest, Vol 103, Issue 51
SZs

--
Sent using BlackBerry

- Original Message -
From: whatwg-requ...@lists.whatwg.org [mailto:whatwg-requ...@lists.whatwg.org]
Sent: Tuesday, October 30, 2012 06:39 AM
To: whatwg@lists.whatwg.org
Subject: whatwg Digest, Vol 103, Issue 51

Today's Topics:

1. Proposal for window.DocumentType.prototype.toString (Johan Sundström)
2. Re: Proposal for window.DocumentType.prototype.toString (Boris Zbarsky)
3. Re: Proposal for window.DocumentType.prototype.toString (Ojan Vafai)
4. Re: Proposal for window.DocumentType.prototype.toString (Ian Hickson)
5. Re: Real-time thread support for workers (Mikko Rantalainen)
6. Re: Proposal for window.DocumentType.prototype.toString (Stewart Brodie)
7. Re: Real-time thread support for workers (Jussi Kalliokoski)

--
Message: 1
Date: Mon, 29 Oct 2012 17:58:45 -0700
From: Johan Sundström
To: WHAT-WG list
Subject: [whatwg] Proposal for window.DocumentType.prototype.toString
Message-ID:
Content-Type: text/plain; charset=windows-1252

Hi everybody!

Serializing a complete HTML document DOM to a string is surprisingly hard in javascript. As a fairly seasoned javascript hacker I figured this might do it:

document.doctype + document.documentElement.outerHTML

It doesn't. No browser has a useful window.DocumentType.prototype that returns either the original document's before parsing – or a semantically equivalent post-parsing one.
Google Chrome shows one in its devtools, but seems not to export some way of getting at it to programmers.

My proposal is that we specify this more useful behaviour for javascript-running browsers, so it does become as simple as above.

A rough sketch of how a polyfill might implement the latter window.DocumentType.prototype.toString:

https://gist.github.com/3977584

Even as a polyfill, the above is rather limited, though: I believe only Firefox implements "internalSubset" today, and probably only in XML contexts. The most useful implementation would IMO be a native one that reproduces the doctype, as it was formatted in the source document.

Thoughts?

--
/ Johan Sundström, http://ecmanaut.blogspot.com/

--
Message: 2
Date: Mon, 29 Oct 2012 21:17:45 -0400
From: Boris Zbarsky
To: whatwg@lists.whatwg.org
Subject: Re: [whatwg] Proposal for window.DocumentType.prototype.toString
Message-ID: <508f2ab9.6000...@mit.edu>
Content-Type: text/plain; charset=windows-1252; format=flowed

On 10/29/12 8:58 PM, Johan Sundström wrote:
> Serializing a complete HTML document DOM to a string is surprisingly hard in javascript.

I thought there were plans to put innerHTML on Document. Did that go nowhere?

> As a fairly seasoned javascript hacker I figured this might do it:
>
> document.doctype + document.documentElement.outerHTML

This seems lossy in many cases (most obviously: when the HTML uses conditional comments, though there are also various XHTML-specific issues).

> The most useful implementation would IMO be a native one that reproduces the doctype, as it was formatted in the source document.

That might be worth doing independent of the serialization issue.
-Boris

--
Message: 3
Date: Mon, 29 Oct 2012 18:34:24 -0700
From: Ojan Vafai
To: Boris Zbarsky
Cc: whatwg
Subject: Re: [whatwg] Proposal for window.DocumentType.prototype.toString
Message-ID:
Content-Type: text/plain; charset=ISO-8859-1

On Mon, Oct 29, 2012 at 6:17 PM, Boris Zbarsky wrote:
> On 10/29/12 8:58 PM, Johan Sundström wrote:
>
>> Serializing a complete HTML document DOM to a string is surprisingly hard in javascript.
>
> I thought there were plans to put innerHTML on Document. Did that go nowhere?

There were plans to put it on DocumentFragment. But IIRC no other browser vendors voiced an interest and Hixie was opposed because he thought it would encourage people to do more string-based DOM building. The WebKit patch for this floundered as a result. I still think it's a good idea.

--
Message: 4
Date: Tue, 30 Oct 2012 02:10:58 + (UTC)
From: Ian Hickson
To: WHAT-WG list
Subject: Re: [whatwg] Proposal for window.DocumentType.prototype.toString
Message-ID:
Content-Type: text/plain;
Re: [whatwg] Real-time thread support for workers
On Sat, Oct 27, 2012 at 3:14 AM, Ian Hickson wrote:
> On Thu, 9 Aug 2012, Jussi Kalliokoski wrote:
> >
> > On W3C AudioWG we're currently discussing the possibility of having web workers that run in a priority/RT thread. This would be highly useful for example to keep audio from glitching even under high CPU stress.
> >
> > Thoughts? Is there a big blocker for this that I'm not thinking about or has it just not been discussed yet? (I tried to search for it, but didn't find anything)
>
> I think it's impractical to give Web authors this kind of control. User agents should be able to increase the priority of threads, or notice a thread is being used for audio and start limiting its per-slice CPU but increasing the frequency of its slices, but that should be up to the UA, we can't possibly let Web authors control this, IMHO.

You're right, I agree. I think the user agent should stay on top of the situation and monitor the CPU usage and adjust the priority accordingly. However, I think the feasible options for getting the benefits of high priority when needed are either a) that we treat the priority the developer asks for as a request rather than a command, or b) that the user agent detects the intent (in the case of audio I think it'd be fairly simple right now) and decides on a suitable priority, adjusting it if necessary. To me, b) seems like the best approach to take, although both approaches have the advantage that they don't guarantee anything and are thus more amendable.

> On Thu, 9 Aug 2012, Jussi Kalliokoski wrote:
> >
> > Yes, this is something I'm worried about as well. But prior work in native applications suggests that high priority threads are hardly ever abused like that.
>
> Native apps and Web apps aren't comparable. Native apps that the user has decided to install also don't arbitrarily reformat the user's disk or install key loggers, but I hope you agree that we couldn't let Web authors do those things.
>
> The difference between native apps and Web apps is that users implicitly trust native app authors, and therefore are (supposed to be) careful about what software they install. However, on the Web, users do not have to be (anywhere near as) careful, and so they follow arbitrary links. Trusted sites get hijacked by hostile code, users get phished to hostile sites, trolls point users on social networks at hostile sites. Yet, when all is working as intended (i.e. modulo security bugs), the user is not at risk of their machine being taken down.
>
> If we allow sites to use 100% CPU on a realtime thread, then this changes, because untrusted hostile sites actually _can_ cause harm.

Very true, it can indeed be used to cause harm, and we should not allow that. I ignored this because I was thinking about attack vectors as a bidirectional thing (someone loses, someone gains), and I couldn't think of a way the attacker would benefit from freezing random users' computing devices. But this approach admittedly doesn't work that well on the web.

> The way the Web platform normally gets around this is by having the Web author describe to the UA what the author wants, declaratively, and then having the UA take care of it without running author code. This allows the UA to make sure it can't be abused, while still having good performance or security or whatnot. In the case of Web audio, the way to get sub-80ms latency would be to say "when this happens (a click, a collision), do this (a change in the music, a sound effect)". This is non-trivial to specify, but wouldn't run the risk of hostile sites harming the user.

Indeed it is non-trivial to specify, and while the Web Audio API attempts to do this, it can't possibly cover all use cases without custom processing in place; the spec is already huge and only addresses a limited set of use cases efficiently.
When the use case requires the developer to do custom processing, it shouldn't cause the developer to lose all the advantage from having the rest of the audio graph in a real time thread. Currently it does, because the JS processing runs in a non-RT thread, and the results would be too unpredictable if the RT audio thread waited for the non-RT JS thread. The current approach is instead to buffer up the input for JS, send it in for processing and apply the result in the next round, which means that adding a custom processing node to the graph introduces a latency of at least the buffer size of that custom processing node.

Cheers,
Jussi
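To put a number on "at least the buffer size", the added delay is the buffer length divided by the sample rate. The helper below is only a back-of-the-envelope sketch; the function name and the example buffer size and sample rate are assumptions chosen for illustration, not values from the Web Audio API spec.

```javascript
// Minimum extra latency, in milliseconds, introduced by buffering one block
// of `frames` sample frames at `sampleRate` frames per second.
function bufferLatencyMs(frames, sampleRate) {
  return (frames * 1000) / sampleRate;
}

// Example values only: a 1024-frame buffer at 44100 Hz is roughly 23 ms.
console.log(bufferLatencyMs(1024, 44100).toFixed(1) + " ms");
```

That delay is paid once per custom processing node that has to hand its input to the non-RT JS thread and pick up the result a round later.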
Re: [whatwg] Proposal for window.DocumentType.prototype.toString
Johan Sundström wrote:
> Hi everybody!
>
> Serializing a complete HTML document DOM to a string is surprisingly hard in javascript.

Does XMLSerializer().serializeToString(document) not meet your requirement?

--
Stewart Brodie
Team Leader - ANT Galio Browser
ANT Software Limited
Re: [whatwg] Real-time thread support for workers
Ian Hickson, 2012-10-27 03:14 (Europe/Helsinki):
> On Thu, 9 Aug 2012, Jussi Kalliokoski wrote:
>>
>> On W3C AudioWG we're currently discussing the possibility of having web workers that run in a priority/RT thread. This would be highly useful for example to keep audio from glitching even under high CPU stress.
>>
>> Thoughts? Is there a big blocker for this that I'm not thinking about or has it just not been discussed yet? (I tried to search for it, but didn't find anything)
>
> I think it's impractical to give Web authors this kind of control. User agents should be able to increase the priority of threads, or notice a thread is being used for audio and start limiting its per-slice CPU but increasing the frequency of its slices, but that should be up to the UA, we can't possibly let Web authors control this, IMHO.

Would it be possible to allow a web site to request high priority / RT at the expense of getting an explicitly limited time slice?

For example, the API could be something like setMaxLatency(latency), where latency is the desired maximum latency in ns. The return value could be the maximum time slice in ns. If the worker (repeatedly) went over its maximum time slice, the UA should then revoke the high priority / RT scheduling from said worker and post some kind of event to the worker to let it know about the issue.

This would prevent any RT worker from hogging the CPU 100%, but any well-written worker code could be run with very low latency.

Notice that the worker can only request a desired latency, and the UA will then tell it how much CPU time the worker is allowed to use in each slice. The UA should simply return zero if the requested latency is too low to implement. (In this case, the worker would logically always overrun its time slice and would be re-scheduled back to normal latency.)

--
Mikko
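The bookkeeping a UA would need for this can be modelled in a few lines. Everything below is hypothetical: no browser exposes setMaxLatency, and the class name, the 25% CPU share, the 1 ms minimum latency and the "three consecutive overruns" threshold are invented purely to make the idea runnable.

```javascript
// Hypothetical model of the proposed contract between a worker and the UA.
class RealtimeBudget {
  constructor(options = {}) {
    this.minLatencyNs = options.minLatencyNs ?? 1e6; // assumed UA limit: 1 ms
    this.cpuShare = options.cpuShare ?? 0.25;        // assumed share per slice
    this.maxOverruns = options.maxOverruns ?? 3;     // assumed tolerance
    this.sliceNs = 0;
    this.overruns = 0;
    this.realtime = false;
    this.onrevoked = null; // stands in for "some kind of event" to the worker
  }

  // The worker requests a latency; the UA answers with the time slice it
  // grants, or 0 when the requested latency is too low to implement.
  setMaxLatency(latencyNs) {
    this.realtime = true;
    this.overruns = 0;
    this.sliceNs = latencyNs < this.minLatencyNs
      ? 0
      : Math.floor(latencyNs * this.cpuShare);
    return this.sliceNs;
  }

  // Called with the CPU time the worker actually used in one slice.
  // Returns true while the worker keeps its high-priority scheduling.
  recordSlice(usedNs) {
    if (!this.realtime) return false;
    if (usedNs > this.sliceNs) {
      this.overruns += 1;
      if (this.overruns >= this.maxOverruns) {
        this.realtime = false; // revoke RT scheduling
        if (this.onrevoked) this.onrevoked();
      }
    } else {
      this.overruns = 0; // assumption: only consecutive overruns count
    }
    return this.realtime;
  }
}

const budget = new RealtimeBudget();
console.log(budget.setMaxLatency(4e6)); // granted slice in ns
```

With a granted slice of zero, every slice counts as an overrun, so the model falls back to normal scheduling after a few rounds, which mirrors the parenthetical at the end of the message.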