Re: [whatwg] Parsing of meta refresh needs tweaking
On 2014-12-11 09:09, Simon Pieters wrote: The spec's parsing rules of meta refresh causes infinite reloading on some pages. In particular, the spec requires the url= to be present, but there are pages that omit it. IE9 also requires url= apparently. Gecko/Blink/WebKit allow url= to be omitted. For example, there is http://www.only-for-winners.com/ which has meta http-equiv=refresh content=0;http://www.aldanitinetwork.com; / Clearly this is intended to redirect, not reload the current page after 0 seconds. SELECT page, COUNT(*) AS num FROM [httparchive:runs.2014_08_15_requests_body] WHERE page = url AND mimeType CONTAINS html AND REGEXP_MATCH(LOWER(body), rmeta\s+[^]*http-equiv\s*=\s*[\']?refresh) AND REGEXP_MATCH(LOWER(body), rmeta\s+[^]*content\s*=\s*[\']?\s*\d+\s*;\s*[^\']) AND NOT REGEXP_MATCH(LOWER(body), rmeta\s+[^]*content\s*=\s*[\']?\s*\d+\s*;\s*url=) GROUP BY page 23 rows. I also noticed that Gecko allows the number to be omitted. I only found one page doing that and it was using meta http-equiv=refresh content=;URL= so it seems we can fail parsing for that case. I hear (a) these pages have been broken in IE for a long time, and (b) only 23 (?) pages in your DB are found. So why not just leave them broken? Best regards, Julian
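To make the two behaviours under discussion concrete, here is a minimal Python sketch of a refresh-value parser. It is not the spec's algorithm and not any browser's code; the function name parse_refresh and the require_prefix flag are made up for illustration. With require_prefix=True a target without url= is ignored, which yields the "reload the current page" result described above; with the default, the prefix is optional, as reported for Gecko/Blink/WebKit.

```python
import re

# Simplified grammar (an assumption for illustration only):
#   <digits> [ (";" | ",") [ "url=" ] <target> ]
_REFRESH = re.compile(
    r"""
    ^\s*(?P<time>\d+)            # delay in seconds; required in this sketch
    (?:\s*[;,]\s*                # separator between delay and target
       (?P<prefix>url\s*=\s*)?   # the optional "url=" parameter name
       (?P<url>.*?)              # the redirect target, possibly empty
    )?\s*$
    """,
    re.IGNORECASE | re.VERBOSE | re.DOTALL,
)


def parse_refresh(content, require_prefix=False):
    """Return (seconds, target_or_None), or None if parsing fails.

    A target of None means "reload the current page".
    """
    m = _REFRESH.match(content)
    if m is None:
        return None  # e.g. the number is missing, as in ";URL="
    url = m.group("url") or None
    if url is not None and require_prefix and m.group("prefix") is None:
        url = None  # strict mode: a target without "url=" is ignored
    return int(m.group("time")), url


if __name__ == "__main__":
    print(parse_refresh("0;http://example.com/"))
    print(parse_refresh("0;http://example.com/", require_prefix=True))
    print(parse_refresh(";URL="))
```

The value with the number omitted fails to parse in both modes, matching the suggestion above that this case can simply fail.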
Re: [whatwg] Parsing of meta refresh needs tweaking
On 2015-01-07 08:52, Simon Pieters wrote: ... I hear (a) these pages have been broken in IE for a long time, and (b) only 23 (?) pages in your DB are found. Right. So why not just leave them broken? It's a worse user experience and it's a shorter path to interop to change IE. ... User experience for invalid content is one aspect; sane parsing rules are another one. Not requiring the parameter name will make it harder to introduce new parameters in the future. YMMV. Best regards, Julian
[whatwg] alternate ids for elements
Hi there, I have a use case where a certain location in a document can have two anchors (or even more). For instance, in a spec, the author may have specified an anchor, but a section-number based anchor is required as well. Right now I can address this by inserting an additional div element, but this is kind of ugly, and doesn't scale well. How about a new attribute alt-ids which would take a space-separated list of additional anchors? Best regards, Julian
Re: [whatwg] alternate ids for elements
On 2014-12-03 15:02, Jukka K. Korpela wrote: 2014-12-03, 15:49, Julian Reschke wrote: I have a use case where a certain location in a document can have two anchors (or even more). For instance, in a spec, the author may have specified an anchor, but a section-number based anchor is required as well. Can you elaborate on that? Why cannot you use the same id attribute value in all references to an element? 1.) An author-supplied anchor may change, but you want to preserve existing deep links from other documents. 2.) You may want to support anchors based on section numbers which will allow other parties to link to a specific section of the document while only knowing the section number and a template (think references to sections numbers in RFCs over on tools.ietf.org). How about a new attribute alt-ids which would take a space-separated list of additional anchors? What would be the use of such additional identifiers? See above. Essentially aliases for anchors. The only thing I can imagine right now is a situation where you have an existing id attribute and references to it all around but now need to refer from a context that imposes its own restrictions on the syntax. Say, you have id=παράδειγμα and you need to refer to the element using a URL like http://example.com/foo.html#παράδειγμα; but cannot because the URL needs to be used in an environment where Greek letters cannot be used. But this sounds like a rather rare occasion. It's yet another use case that could be addressed that way. Best regards, Julian
Re: [whatwg] Proposal to add website-* meta extensions
On 2014-07-16 11:31, Arpita Bahuguna wrote: Hi Julian, Thank-you for your views. Are you suggesting that we instead introduce a new link relation (perhaps contact) with tel:/mailto: types for specifying these parameters? This would involve spec modification. Not sure how developers or browser vendors take to it. Why would it involve a spec modification? Was looking for a simpler solution that's quickly implementable. Why is meta any simpler than link???
Re: [whatwg] Proposal to add website-* meta extensions
On 2014-07-16 12:01, Arpita Bahuguna wrote: Hi Julian, Please find my comments inline: -Original Message- From: whatwg [mailto:whatwg-boun...@lists.whatwg.org] On Behalf Of Julian Reschke Sent: Wednesday, July 16, 2014 3:10 PM To: Arpita Bahuguna Cc: wha...@whatwg.org; Arpita Bahuguna Subject: Re: [whatwg] Proposal to add website-* meta extensions On 2014-07-16 11:31, Arpita Bahuguna wrote: Hi Julian, Thank-you for your views. Are you suggesting that we instead introduce a new link relation (perhaps contact) with tel:/mailto: types for specifying these parameters? This would involve spec modification. Not sure how developers or browser vendors take to it. Why would it involve a spec modification? Currently the link types defined by the specification are: http://www.whatwg.org/specs/web-apps/current-work/#linkTypes Please correct me if I am wrong but I suspect introducing a new rel type would involve modifying the spec as well. No. There's a Wiki for it. Adding new meta extensions however would not have any such overhead as far as I know. Was looking for a simpler solution that's quickly implementable. Why is meta any simpler than link???
Re: [whatwg] Proposal to add website-* meta extensions
On 2014-07-15 12:11, Arpita Bahuguna wrote: Hi, I would like to propose addition of the following three meta extensions: website-mail, website-number and website-address. Please find below a detailed description for each. --- x - x --- x -- Overview: The website-mail meta extension defines a suggested e-mail ID, such as the customer support mail ID, specified by the vendor. The website-number meta extension defines a proposed phone number, such as the customer support number, specified by the vendor. The website-address meta extension defines a given address (or geolocation tag), such as the vendor's office address or billing address. UA's displaying a page containing any or all of these meta extensions could then make this information directly available for the user's perusal. Oft times visitors have to hunt through a vendor's site for obtaining the customer support mail ID, phone number etc. which is mostly hidden behind a not so prominently displayed Help or Contact Us link. Vendor's specifying their registered mail ID, phone number and/or their address via these meta extensions can thus expect supporting UA's to present this information to the user in an easily accessible format, either by way of a browser menu option (such as mail, call, map) or via the URL bar scheme handler or in another similar format. Selecting these menu options, if available, should launch the default mail application with the specified mail ID, the dialer application with the given contact number or, the default maps application loaded with the specified address/location tag respectively. No known existing meta extensions with a similar name/intention exist. Syntax: meta name=website-mail content=a...@xyz.com This should be a link relation, using by default a mailto:; URI. The content attribute for the website-mail meta extension can take any valid email ID. meta name=website-number content=+1-555-555- This should have a better name, and also be a link relation, using a tel: URI. 
The content attribute for the website-number meta extension can take any valid phone number. meta name=website-address content=Jane Doe, 5844 South Oak Street, Chicago, Illinois 60667 or meta name=website-address content=20.593684;78.96288 The content attribute for the website-address meta extension can be any string or latitude and longitude separated by a semi-colon. Note: . In case multiple instances are found of the same meta extension, the last specified one should take precedence. ... In general, a single link to a URI having contact information seems to be much simpler to me... Best regards, Julian
Re: [whatwg] HTTP status code from JavaScript
On 2014-05-23 06:53, Michael Heuberger wrote: Hi James Single page apps! These become more and more popular with frameworks like RactiveJS or AngularJS. There the first request is a HTTP request, for any subsequent requests an AJAX one is generated. The problem is the first HTTP request. AJAX requests are HTTP requests. I assume you mean the distinction between page navigation and using XMLHttpRequest? The framework is unable to detect 404s with the first request because the status code cannot be obtained via JavaScript, hence a second request is made. If the initial page load yields a 404 will there be any scripts to execute at all? In my eyes, a waste of bandwidth. Cheers Michael Best regards, Julian
Re: [whatwg] Zip archives as first-class citizens
On 2013-09-13 12:32, Robin Berjon wrote: On 29/08/2013 15:58 , Simon Pieters wrote: On Thu, 29 Aug 2013 15:02:48 +0200, Anne van Kesteren ann...@annevk.nl wrote: On Thu, Aug 29, 2013 at 1:19 PM, Jake Archibald jaffathec...@gmail.com wrote: Causing a network error in existing browsers is a shame. It seems to fail to resolve in IE10. It works in Gecko/WebKit/Blink/Presto: the %! is requested literally. However, both Apache and IIS seems to return 400 Bad Request. That's not exactly promising. ... Because it's an invalid URI. % needs to be percent-escaped. Best regards, Julian
Re: [whatwg] Request: Implementing a Geo Location URI Scheme
On 2013-06-05 00:25, Rodrigo Polo wrote: I really don't want to fight over any issue, I, as a user, want to share with you the current state on this topic and (as I said on the letter) with a friendly open letter in the pursuit to make a polite request to make the life of millions easier give you some of the reasons why I think it should be implemented ASAP. I already checked this proposed specs: http://tools.ietf.org/html/rfc5870 http://www.iana.org/assignments/uri-schemes/uri-schemes.xml http://www.w3.org/wiki/UriSchemes But my experience waiting for many browsers implementations tell me the process is slow and it looks like it works by the interest of each brand, I really feel sorry for the Web SQL Database spec that was later removed, I really hope the geo URI scheme could be implemented. I know the registerProtocolHandler but it doesn't work exactly as proposed, geo protocol isn't accepted on Chrome, only protocols with the web- prefix and the URL parameter have to match the webpage that make the request, it is designed for websites, not for local apps, all this conclusions made by the tests I have done with the latest beta of Chrome and FireFox: https://developer.mozilla.org/en-US/docs/Web/API/navigator.registerProtocolHandler http://updates.html5rocks.com/2012/02/Getting-Gmail-to-handle-all-mailto-links-with-registerProtocolHandler Again, thanks for your attention and help. ... Not sure what kind of browser support you are looking for. If you want geo URIs to invoke a local mapping application, all you need is to install a URI handler for that scheme and that application in the *operating system*. This is how things like mailto: have been working for two decades now. Best regards, Julian
Re: [whatwg] Request: Implementing a Geo Location URI Scheme
On 2013-06-05 13:27, Rodrigo Polo wrote: Hi, well, the kind of support I think should be implemented is actually something that should be a standard, any anchor that have a mailto:; inside is supported out of the box in any web browser and the first time it is clicked the web browser asks for the default app to open that link. At least on Windows, mailto:; is supported by an URI handler in the operating system, and the browser just delegates to it. You can install new URI handlers (think skype: and callto:). The browser doesn't need to have any special support (except for asking the OS for advice). YMMV. The geo URI handler is not supported by default out of the box and it should, for the sake of the user experience, to make it work it is required for everyone in the web browser development community to join forces with the maps application developers, the It's the mapping application that needs to support it, not the browser. The browser will support whatever the system supports it is running on. ... Best regards, Julian
Re: [whatwg] Request: Implementing a Geo Location URI Scheme
On 2013-06-05 14:00, Rodrigo Polo wrote: You are completely right, but in the tests I made on Chrome the geo URI handler can't be used with the registerProtocolHandler call, it throws a security error and the use of geo location URI it is not included as a recommendation or good practice when we talk about the markup, so it is not a technical thing, it is more an idea that could be included in further discussions between web browsers developers, map app developers and the users so everyone adopt the idea of having the geo URI scheme adopted as an standard, I'm quite sure this idea can help a lot of users and web developers to give a better user experience and it is more important that many other things, it will make the life of users a lot easier. ... You don't *need* registerProtocolHandler to support geo:. Just install an OS-level application that handles geo: and you are done. That being said: I agree that geo: should be added to the white list so that browser-based handlers for geo: become possible. Best regards, Julian
Re: [whatwg] Priority between a download and content-disposition
On 2013-03-17 02:49, Jonas Sicking wrote: It's currently unclear what to do if a page contains markup like a href=page.txt download=A.txt if the resource at audio.wav responds with either 1) Content-Disposition: inline 2) Content-Disposition: inline; filename=B.txt 3) Content-Disposition: attachment; filename=B.txt People generally seem to have a harder time with getting header data right, than getting markup right, and so I think that in all cases we should display the save as dialog (or display equivalent download UI) and suggest the filename A.txt. I agree that people have problems getting headers right, but in all the cases above, it seems they have set the header on purpose, no? My recollection was that a/@download was mainly added for cases where the header field couldn't be set at all... ... Best regards, Julian
Re: [whatwg] URL standard: Query string parsing; host parsing
On 2013-03-13 21:24, Boris Zbarsky wrote: On 3/13/13 4:23 PM, Julian Reschke wrote: Under RFC 3986, it would resolve to jar:http://example.com/Bar.class If you assume that this is a hierarchical scheme and that the hierarchy is in some particular place, no? Why is that assumption being made? No such assumption was made. Just following the algorithm in the spec. Looks like a broken scheme to me. I'm not going to try to claim jar: is a wonderful thing. It is what it is. It needs to not break. Is it used outside Java applet scenarios? BTW: this shows why formal registration and review of URI schemes is a *feature*. Best regards, Julian
Re: [whatwg] URL standard: Query string parsing; host parsing
On 2013-03-13 18:38, pocci...@gmail.com wrote: (This was originally a bug report, but I was told to e-mail instead. Another issue is also added.) -- Non-relative URLs in the query string -- Earlier I posted an issue with serializing the query in non-relative URLs. But after I read more about URIs, I am not sure whether the scheme data and query string should be kept separate. There is a distinction between how the URL specification categorizes URLs and how the URI standards (RFC3986 and RFC3987) classify URIs. Both standards allow fragments to appear in all URLs/URIs, but they differ on whether a query string is parsed. In the URL standard, query strings can occur in all URLs, but in the URI standards, a query string is not parsed if the URI contains a scheme but the scheme data doesn't begin with a slash (that is, if the URI is an opaque URI). Take the following as an example: mailto:m...@example.com?subject=Hi In the URL standard, the URL is parsed as: scheme - mailto scheme data - m...@example.com query - subject=Hi but in the URI standards, the URI is parsed as: scheme - mailto scheme-specific part - m...@example.com?subject=Hi Here, in the mailto scheme, separating the scheme data and the query may be a useful distinction. As another example, the string jar:http://example.com/jar?x=1!/com/example/Foo.class is parsed in the URI standards as: scheme - jar scheme-specific part - http://example.com/jar?x=1!/com/example/Foo.class I have no idea what you're talking about, see http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.3. This will parse into: scheme: jar hier-part: http://example.com/jar query: x=1!/com/example/Foo.class but in the URL standard as: scheme - jar scheme data - http://example.com/jar query - x=1!/com/example/Foo.class ... Best regards, Julian
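For reference, the generic parse can be reproduced with the regular expression given in RFC 3986, Appendix B. The Python sketch below only wraps that expression; the function name split_uri and the placeholder address user@example.com are illustrative assumptions, not part of any spec.

```python
import re

# Regular expression from RFC 3986, Appendix B.
_RFC3986 = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Split a URI reference into its five generic components.

    Components that are absent come back as None; the path is always a
    string (possibly empty).
    """
    m = _RFC3986.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


if __name__ == "__main__":
    for uri in (
        "jar:http://example.com/jar?x=1!/com/example/Foo.class",
        "mailto:user@example.com?subject=Hi",  # placeholder address
    ):
        print(uri)
        for name, value in split_uri(uri).items():
            print("  %-9s %r" % (name, value))
```

For the jar: example this gives scheme jar, path http://example.com/jar and query x=1!/com/example/Foo.class, i.e. the generic syntax does split off a query component even though no authority is present.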
Re: [whatwg] URL standard: Query string parsing; host parsing
On 2013-03-13 21:14, Boris Zbarsky wrote: On 3/13/13 4:02 PM, Julian Reschke wrote: On 2013-03-13 18:38, pocci...@gmail.com wrote: jar:http://example.com/jar?x=1!/com/example/Foo.class is parsed in the URI standards as: scheme - jar scheme-specific part - http://example.com/jar?x=1!/com/example/Foo.class I have no idea what you're talking about, see http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.3. This will parse into: scheme: jar hier-part: http://example.com/jar query: x=1!/com/example/Foo.class I should note that jar: URIs are ... special. For example, given a base of jar:http://example.com/jar?x=1!/com/example/Foo.class the relative URI Bar.class should, as far as I know, resolve to: jar:http://example.com/jar?x=1!/com/example/Bar.class What that means for parsing them, I cannot say... Under RFC 3986, it would resolve to jar:http://example.com/Bar.class Looks like a broken scheme to me. Best regards, Julian
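The resolution result quoted above can be checked by following RFC 3986, Section 5.2 step by step. Below is a compact Python sketch of that algorithm (strict mode); the helper names are my own, and it is meant as an illustration rather than a reviewed implementation.

```python
import re

# Component-splitting regular expression from RFC 3986, Appendix B.
_RFC3986 = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def _split(uri):
    m = _RFC3986.match(uri)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def remove_dot_segments(path):
    """RFC 3986, Section 5.2.4."""
    out = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../") or path == "/..":
            path = "/" + path[4:]
            if out:
                out.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            end = path.find("/", 1)
            if end == -1:
                end = len(path)
            out.append(path[:end])
            path = path[end:]
    return "".join(out)


def resolve(base, ref):
    """RFC 3986, Section 5.2.2 (strict) followed by 5.3 recomposition."""
    b_scheme, b_auth, b_path, b_query, _ = _split(base)
    r_scheme, r_auth, r_path, r_query, r_frag = _split(ref)
    if r_scheme is not None:
        scheme, auth, path, query = r_scheme, r_auth, remove_dot_segments(r_path), r_query
    elif r_auth is not None:
        scheme, auth, path, query = b_scheme, r_auth, remove_dot_segments(r_path), r_query
    elif r_path == "":
        scheme, auth, path = b_scheme, b_auth, b_path
        query = r_query if r_query is not None else b_query
    else:
        if r_path.startswith("/"):
            path = remove_dot_segments(r_path)
        elif b_auth is not None and b_path == "":
            path = remove_dot_segments("/" + r_path)
        else:
            # Merge: base path up to and including its last "/", then the reference.
            path = remove_dot_segments(b_path[: b_path.rfind("/") + 1] + r_path)
        scheme, auth, query = b_scheme, b_auth, r_query
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if auth is not None:
        result += "//" + auth
    result += path
    if query is not None:
        result += "?" + query
    if r_frag is not None:
        result += "#" + r_frag
    return result


if __name__ == "__main__":
    base = "jar:http://example.com/jar?x=1!/com/example/Foo.class"
    print(resolve(base, "Bar.class"))
```

Because the generic parser sees everything after the "?" as the query, the "!/com/example/" part is not in the base path at all, so merging replaces only the last path segment, "jar".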
[whatwg] TZ database
On 2013-01-08 01:47, Ian Hickson wrote: The next best choice would be to have datetime-with-timezone but unfortunately (1) Official database for all timezones does not exist (2) Official timezone names (or labels) do not exist (3) Timezones are subject to future political decisions The problems (1) and (2) make transferring the timezone information from the end user to the server very problematic and the problem (3) makes any work to fix (1) and (2) a bit pointless. This is because even if UA could successfully inform the server about the correct timezone, the server could be using a week old timezone data that is not up to the latest political events. Or the server might be using latest timezone data but the UA could be using three year old data. In either case, the absolute time in UTC could be different for the server and UA. Indeed. Sorry? http://www.iana.org/time-zones addresses (1) and possibly (2), no? Best regards, Julian
Re: [whatwg] [mimesniff] Sniffing archives
On 2012-11-29 20:25, Adam Barth wrote: These are supported in Chrome. That's what causes the download. From Can you elaborate about what you mean by supported? Chrome sniffs for the type, and then offers to download as a result of that sniffing? How is that different from not sniffing in the first place? ...your comment, it's not clear to me if you are correctly reverse engineering existing user agents. The techniques we used to create this list originally are quite sophisticated and involved a massive amount of data [1]. It would be a shame if you destroyed that work because you didn't understand it. Adam [1] http://www.adambarth.com/papers/2009/barth-caballero-song.pdf ... Understood; but on the other hand if there's a chance to simplify things then it makes sense to discuss this, even if that would involve changing some of the implementations. Best regards, Julian
Re: [whatwg] [mimesniff] Sniffing archives
On 2012-12-04 08:40, Adam Barth wrote: On Mon, Dec 3, 2012 at 12:39 PM, Julian Reschke julian.resc...@gmx.de wrote: On 2012-11-29 20:25, Adam Barth wrote: These are supported in Chrome. That's what causes the download. From Can you elaborate about what you mean by supported? Chrome sniffs for the type, and then offers to download as a result of that sniffing? How is that different from not sniffing in the first place? They might otherwise be treated as a type that can be displayed (rather than downloaded). Also, some user agents treat downloads of Do you have an example for that case? ZIP archives differently than other sorts of download (e.g., they might offer to unzip them). Out of curiosity: which? Best regards, Julian
Re: [whatwg] [mimesniff] The X-Content-Type-Options header
On 2012-11-17 19:17, Adam Barth wrote: ... I would prefer if the spec described what implementations actually do rather than your opinion about what they should do. To answer your specific questions: ... That works well if something is widely supported already. It works less well if you have one initial and one incomplete implementation only. 1) Don't bother dropping the X-. Everyone who implements this feature uses the X- and dropping it is just going to cause unnecessary interoperability problems. There's no *need* to drop it, but if research on this topic leads to the conclusion that the functionality is needed, but the current X- prototype isn't sufficient anyway it might be worth considering. ... Best regards, Julian
Re: [whatwg] [mimesniff] The X-Content-Type-Options header
On 2012-11-19 19:27, Adam Barth wrote: On Mon, Nov 19, 2012 at 10:17 AM, Julian Reschke julian.resc...@gmx.de wrote: On 2012-11-17 19:17, Adam Barth wrote: ... I would prefer if the spec described what implementations actually do rather than your opinion about what they should do. To answer your specific questions: ... That works well if something is widely supported already. It works less well if you have one initial and one incomplete implementation only. Which implementation is initial and which is incomplete? AFAIK, both IE and Chromium consider their implementation of this feature done. initial - the one done first, and by the vendor that invented the functionality; incomplete - the one that copies one part and not the other part of the behavior of the initial implementation ... 1) Don't bother dropping the X-. Everyone who implements this feature uses the X- and dropping it is just going to cause unnecessary interoperability problems. There's no *need* to drop it, but if research on this topic leads to the conclusion that the functionality is needed, but the current X- prototype isn't sufficient anyway it might be worth considering. Currently, I don't see a use case for dropping the X- prefix. Perhaps there's one I don't understand? A use case for *renaming* (which might be more than dropping the prefix) actually would be saving bytes on the wire. Another one would be to make it possible to make incompatible changes to the field value syntax, when needed. Best regards, Julian
Re: [whatwg] Meta bugreport proposal
On 2012-10-31 10:21, Nicolas Froidure wrote: Hi, I think we need a specification to allow users to report websites bugs from their browser. That's why i think it could be usefull to add a meta markup like this : meta name=bugreport content=(uri) / link, not meta. The uri could be : - mailto: to send a report by mail (ex: mailto:webmas...@example.org) - http: to send the bug report a a simple HTTP POST request (ex: http://example.org/bugreport). - bug: something more customizable to allow webmasters to fit bug reports with their systems (ex: bug:http?uri=/bug.datmethod=POSTcaptcha=/captcha.jpg ) What's the use case for this? Do you want to automate bug submission? What for? ... Best regards, Julian
Re: [whatwg] checksum attribute in a href tag
On 2012-10-19 14:01, Nils Dagsson Moskopp wrote: A. Rauschenbach rauschenb...@annuo.de schrieb am Fri, 19 Oct 2012 13:50:04 +0200: I'm sick of copying the checksum of important files by hand or QR-code to the download manager or console. To solve the problem I suggest a checksum attribute in the a href tag. It seems that problem is solved at the HTTP level with RFC 1864: http://tools.ietf.org/html/rfc1864 The latest spec defining Content-MD5 was RFC 2616. It will not be included in the revision of HTTP/1.1 because of broken interop for Range requests, and because of the weakness of MD5 (see http://trac.tools.ietf.org/wg/httpbis/trac/ticket/178 for context). That being said, a new response header field that is well-defined with respect to partial responses and more flexible with respect to digest algorithms would be interesting. ... Best regards, Julian
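Whichever layer ends up carrying the digest (a markup attribute as proposed, or a new header field), the check on the receiving side is the same. A minimal Python sketch follows; the function name, the hex encoding of the expected value, and the idea of passing the algorithm name alongside the digest are assumptions made for illustration, since neither proposal defines a syntax.

```python
import hashlib
import hmac


def matches_checksum(data, expected_hex, algorithm="sha256"):
    """Return True if the digest of `data` equals `expected_hex`.

    `algorithm` is any name accepted by hashlib.new(); carrying it next to
    the digest is what would give the flexibility with respect to digest
    algorithms that a fixed-MD5 field lacks.
    """
    actual = hashlib.new(algorithm, data).hexdigest()
    # Constant-time comparison; not strictly needed here, but harmless.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


if __name__ == "__main__":
    payload = b"example file contents"
    published = hashlib.sha256(payload).hexdigest()  # stands in for the published value
    print(matches_checksum(payload, published))
    print(matches_checksum(payload + b"!", published))
```

Note that this only works on the complete representation; a digest over the full file says nothing about a partial (Range) response, which is the interop problem mentioned above.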
Re: [whatwg] Proposal for improved handling of '#' inside of data URIs
On 2012-10-09 13:51, Anne van Kesteren wrote: On Tue, Oct 9, 2012 at 1:50 AM, Ian Hickson i...@hixie.ch wrote: On Sat, 10 Sep 2011, Daniel Holbert wrote: I'm writing with a proposal to improve the handling of # in data URIs. I'm particularly looking for feedback from other browser vendors, but of course feedback from others is welcome as well. [...] Anne has since tried to respec URL parsing in detail, with the work in progress being here: http://url.spec.whatwg.org/ I recommend checking that spec to see if it does what you want, and if not, working with Anne to see if it can be adjusted accordingly or if something else needs to happen. This is not written down explicitly just yet, but for data URLs I think we want the fragment to *not* be part of the actual resource, but rather as an input to the resource so things like data:text/html,<style>:target{background:lime}</style><p id=x>test#x work. (Fails in Chrome, but works fine in Opera and Firefox already.) Clarifying: that sounds like making it parse just like in any other URI (with which I would agree). The test case at http://greenbytes.de/tech/tc/datauri/#svg seems to imply that Opera doesn't do this right yet, though. (tested with 12.02) Best regards, Julian
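"Parse just like in any other URI" means the first "#" ends the resource part and starts the fragment, regardless of the scheme. A tiny Python sketch of that rule (the function name and the sample markup are illustrative assumptions):

```python
def split_fragment(uri):
    """Split a URI at its first '#', as the generic URI syntax does.

    Returns (resource_part, fragment); fragment is None when there is no '#'.
    """
    resource, sep, fragment = uri.partition("#")
    return resource, (fragment if sep else None)


if __name__ == "__main__":
    resource, fragment = split_fragment("data:text/html,<p id=x>test#x")
    print(resource)   # what the data: payload is built from
    print(fragment)   # handed to the resulting document as the target id
```

Under this rule a literal "#" that is meant to be part of the data has to be written as %23.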
Re: [whatwg] Proposal for improved handling of '#' inside of data URIs
On 2012-10-09 17:33, Anne van Kesteren wrote: On Tue, Oct 9, 2012 at 4:59 PM, Julian Reschke julian.resc...@gmx.de wrote: The test case at http://greenbytes.de/tech/tc/datauri/#svg seems to imply that Opera doesn't do this right yet, though. (tested with 12.02) Yeah, for some reason Opera has different behavior when entering in the address bar. Indeed. Will update test comment. Best regards, Julian
Re: [whatwg] New URL Standard
On 2012-09-21 17:16, Anne van Kesteren wrote: I took a crack at defining URLs: http://url.spec.whatwg.org/ At the moment it defines parsing (minus domain names / IP addresses) and the JavaScript API (minus the query manipulation methods proposed by Adam Barth). It defines things like setting .pathname to hello world (notice the space), it defines what happens if you resolve http:test against a data URL (you get http://test/;) or As per RFC 3986, Section 5.2 (Relative Resolution), the answer IMHO is http:test. Fetching from that URI indeed used http://test/ (just checked in Mozilla), so it appears we have a terminology problem. It would be good if we could avoid confusing relative reference resolution with what you try to define here. Note that the term resolve is widely used for what RFC 3986 Section 5.2 defines; see, for instance, http://docs.oracle.com/javase/1.4.2/docs/api/java/net/URI.html#resolve%28java.lang.String%29. ... http://teehee (you get http://teehee/test;). It is based on the various URL code paths found in WebKit and Gecko and supports the \ as / in various places because it seemed better for compatibility. I'm looking for some feedback/ideas on how to handle various aspects, e.g.: * data URLs; in Gecko these appear to be parsed as part of the URL layer, because they can turn a URL invalid. Other browsers do not do this. Opinions? Should data URLs support .search? ... I believe the behavior should be predictable and consistent no matter what the URI scheme is. Best regards, Julian PS: and no, I don't think URL Standard is a good name for this document.
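As one data point for what existing libraries mean by "resolve": Python's urllib.parse.urljoin, used here only as an example of an existing resolver and not as a statement about the draft or about browsers, hands back a reference unchanged when it carries a scheme that differs from the base's scheme. The data: base below is a made-up placeholder.

```python
from urllib.parse import urljoin

# Hypothetical base; any data: URL would do.
BASE = "data:text/plain,hello"


def resolve_reference(reference, base=BASE):
    """Resolve `reference` against `base` using the standard library."""
    return urljoin(base, reference)


if __name__ == "__main__":
    # The reference has its own scheme, so resolution leaves it untouched.
    print(resolve_reference("http:test"))
```

Turning http:test into something that can actually be fetched is then a separate, later step, which is the terminology distinction raised above.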
Re: [whatwg] multipart/form-data filename encoding: unicode and special characters
On 2012-07-09 23:01, Ian Hickson wrote: On Thu, 3 May 2012, Evan Jones wrote: On May 3, 2012, at 17:09 , Anne van Kesteren wrote: Yes. I think we should define multipart/form-data directly in HTML and thereby obsolete http://tools.ietf.org/html/rfc2388 as it is outdated and not maintained. Right; that would be ideal. Despite the fact that HTML5 references that RFC, browsers don't really follow it. I would be interested in trying to help with this, but again I would certainly need some guidance from people who know more about the vagaries of how the various browsers encode their form parameters / uploaded file names, and why things got that way. It probably would not be helpful for me to try to draft an update to the spec without getting the right implementers on board. If this is still something for which you have some time available, then the starting point for anything like this would be test cases, lots and lots of test cases. In this case, it would have to be something like a server that echoes the precise bytes sent by the client, for a huge variety of different setups: - various submission encodings - various form field names and types - various file submission filenames ...etc. I'd be happy to advise if this is something that still interests you. I agree with the methodology. However I would suggest to simply revise RFC 2388. Best regards, Julian
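The kind of test server described above (one that echoes the precise bytes sent by the client) can be sketched with the Python standard library alone. The class and function names are my own, and a real test harness would also want to record the request headers, in particular Content-Type with its boundary parameter.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Answer every POST with exactly the bytes of the request body."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        # application/octet-stream so that no client re-interprets the bytes.
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test runs quiet


def make_server(port=0):
    """Create the server on localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), EchoHandler)


if __name__ == "__main__":
    server = make_server(8000)
    print("echoing POST bodies on http://127.0.0.1:8000/")
    server.serve_forever()
```

Pointing a form with enctype=multipart/form-data at such a server and saving the responses per browser, submission encoding and filename would give the raw material for the test cases.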
Re: [whatwg] The pic element
On 2012-06-04 09:32, Kornel Lesiński wrote: On Mon, 04 Jun 2012 01:05:23 -0500, Anselm Hannemann Web Development i...@anselm-hannemann.com wrote: An alternative is to pick different delimiters. See, for instance, http://tools.ietf.org/html/rfc2295#section-8.3. I also would like to see another delimiting syntax which is clearer. What about JSON-syntax or just | ? I mean a backslash is not that common in a URL but commas are more and more and you all know that escaping is no fun. So we should really try to avoid this. Another character could work in theory, but I wonder whether it would work in practice. For example meta name=viewport was documented to support only comma, but thanks to silent error recovery authors ended up using and relying on semicolon: http://lists.w3.org/Archives/Public/www-style/2011Oct/0652.html I wonder whether reverse of it could happen with list of sources, e.g. unexpected comma parsed as invalid media query could end up delimiting sources in some implementations, and then we'll end up with worst of both worlds (both ambiguous comma and other unintuitive delimiter needed for web-compat). 1) Use a format where the delimiter is always the same, (2) where escaping never is needed, and (3) specify the parser to ignore all malformed attribute values. Best regards, Julian
Re: [whatwg] The pic element
On 2012-06-01 20:24, Kornel Lesiński wrote: ... If there are commas or backslashes in the URL they must be escaped with `\`. This is another problem why I would separate the diff. srces. Escaping an URL is not something that should be necessary in HTML I think. I agree, it's ugly, but otherwise you get ambiguous syntax for entries without descriptor or media query. I thought about specifying some magic, like ignoring trailing comma in URL, but all such magical solutions have surprising edge cases. Explicit escaping is at least easy to comprehend. ... An alternative is to pick different delimiters. See, for instance, http://tools.ietf.org/html/rfc2295#section-8.3. Best regards, Julian
Re: [whatwg] responsive images
On 2012-05-22 17:02, Glenn Maynard wrote: (I wish people would stop starting new threads about the same topic.) On Tue, May 22, 2012 at 5:53 AM, Paul Courtp...@pmcnetworks.co.uk wrote: As a HTML author and programmer, I just cannot see myself implementing the current srcset proposal on sites. As a programmer, it has very much got what we would call a bad code smell. img src=face-600-...@1.jpeg alt= srcset=face-600-...@1.jpeg 600w 200h 1x, face-600-...@2.jpeg 600w 200h 2x, face-icon.png 200w 200h Actually, it's pretty clean; you've just made it ugly by sticking it all on one line. img src=face-600-...@1.jpeg alt= srcset=face-600-...@1.jpeg 600w 200h 1x, face-600-...@2.jpeg 600w 200h 2x, face-icon.png 200w 200h It's no uglier than CSS syntaxes like background. It may not be uglier but it's much more fragile as the examples and the prose in the spec give the impression that you can use the , to tokenize, which would be incorrect. ... Best regards, Julian
Re: [whatwg] srcset javascript implementation (Respondu)
On 2012-05-21 04:21, David Clements wrote: Hi guys, Just to let you all know, I've written a javascript implementation of srcset using a framework for responsive images (which I also wrote) called Respondu (I'm open to new name suggestions), I'd love it if someone could check that I've implemented srcset right. Respondu manages to process the DOM without allowing any assets (contained in the body) to load, it also gracefully degrades for non-js browsers and is fairly unintrusive (it simply wraps the contents of the body tags). Check out the github page (feedback, pull requests, lunch money etc. welcome) https://github.com/davidmarkclements/Respondu ... https://github.com/davidmarkclements/Respondu/blob/master/R.js#L243 This looks like you are splitting the attribute value by ,? Best regards, Julian
Re: [whatwg] srcset javascript implementation (Respondu)
On 2012-05-21 09:36, huperekch...@googlemail.com wrote: Hey Julian I believe the attribute sets are delimited by comma, whereas each attribute itself is separated by space? No. The URIs can contain a comma, so you can't use that delimiter. See the parsing definition in the spec. ... (Please don't take this as a complaint about your code, but about the syntax of the attribute). Best regards, Julian
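To illustrate the point about the delimiter, here is a minimal sketch of what goes wrong when the attribute value is split on commas. The attribute value and the function name `naive_split` are made up for this illustration; they are not taken from Respondu or from the spec.

```python
def naive_split(srcset: str) -> list[str]:
    """Split on commas, as the examples in the thread might tempt a reader to do."""
    return [part.strip() for part in srcset.split(",")]


# Hypothetical attribute value: two image candidates whose URLs each contain a comma.
value = "photo,small.jpg 1x, photo,large.jpg 2x"

pieces = naive_split(value)
print(pieces)
```

The author meant two candidates, but the naive split reports four pieces, none of which is a complete URL plus descriptor.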
Re: [whatwg] Features for responsive Web design
On 2012-05-18 12:30, Maciej Stachowiak wrote: On May 18, 2012, at 3:16 AM, Markus Ernst derer...@gmx.ch wrote: Am 15.05.2012 09:28 schrieb Ian Hickson: img src=face-600-...@1.jpeg alt= srcset=face-600-...@1.jpeg 600w 200h 1x, face-600-...@2.jpeg 600w 200h 2x, face-icon.png 200w 200h Re-reading most parts of the last day's discussions, 2 questions come to my mind that I have the impression have not been pointed out very clearly so far: 1. Are there other cases in HTML where an attribute value contains more than one URI? 2. Have there been thoughts on the scriptability of @srcset? While sources can be added to resp. removed from <picture> easily with standard DOM methods, it looks to me like this would require complex string operations for @srcset. If dynamically manipulating the items in srcset is useful, we can add a DOM API (similar to classList or style for manipulating the lists of items found in class and style attributes respectively). ...which of course means that it stops being simpler. I think it would be worthwhile to combine elements from both proposals; in particular to avoid the microsyntax and use proper markup instead. Best regards, Julian
Re: [whatwg] Defaulting new image solution to 192dpi
On 2012-05-17 13:30, Kornel Lesiński wrote: My suggestion is that the srcset (or picture) should assume that images are 2x scale by default. My reasoning behind is: - we have img for easy embedding of 1x images today, but we don't have 2x img for the future. Having to specify width/height in img all the time is annoying. - highdpi displays will become dominant at some point, it's only a matter of time (they pretty much are already in high-end smartphones, and are going to appear in laptops next). Bandwidth is also going to be less of a concern, so it'll be rational and desirable to serve images for the 2x resolution only (and just rely on 96dpi displays scaling them down). Necessity to specify 2x scaling all the time will become a bad default and a historical quirk (like the DOCTYPE), and a source of annoyance where accidentally omitted 2x syntax makes images large and pixelated. So to future-proof the solution I think: img src=1x.jpg srcset=2x.jpg should be equivalent to: img src=1x.jpg srcset=2x.jpg 2x ... As far as I can tell, making descriptors optional breaks the syntax (it allows comma both in the URI and as a separator between image candidates). (Please read this as argument for making the syntax less brittle) Best regards, Julian
Re: [whatwg] img srcset for responsive bitmapped content images
On 2012-05-10 09:58, Edward O'Connor wrote: Hi, When authors adapt their sites for high-resolution displays such as the iPhone's Retina display, they often need to be able to use different assets representing the same image. Doing this for content images in HTML is currently much more of a pain than it is in CSS (and it can be a pain in CSS). I think we can best address this problem for bitmap[1] content image by the addition of a srcset= attribute to the existing img element. The srcset= attribute takes as its argument a simplified variant of the image-set() microsyntax[2]. It would look something like this: img src=foo-lores.jpg srcset=foo-hires.jpg 2x, foo-superduperhires.jpg 6.5x alt=decent alt text for foo. ... Inventing a new microsyntax is tricky. - comma separated implies you'll need to escape a comma when it appears in a URI; this may be a problem when the URI scheme assigns a special meaning to the comma (so it doesn't affect HTTP but still...) - separating URIs from parameters with whitespace implies that the URIs are valid (in that they do not contain whitespace themselves); I personally have no problem with that, but it should be kept in mind Best regards, Julian
Re: [whatwg] img srcset for responsive bitmapped content images
On 2012-05-16 11:51, Odin Hørthe Omdal wrote: On Wed, 16 May 2012 11:22:07 +0200, Julian Reschke julian.resc...@gmx.de wrote: Inventing a new microsyntax is tricky. - comma separated implies you'll need to escape a comma when it appears in a URI; this may be a problem when the URI scheme assigns a special meaning to the comma (so it doesn't affect HTTP but still...) Indeed. Edward did not write it all as a spec, though, so cases like that might be a bit detailed for a first proposal. Hixies extension of srcset does however have some spec text, and that does in fact handle your first case: http://www.whatwg.org/specs/web-apps/current-work/multipage/embedded-content-1.html#processing-the-image-candidates ... It looks like that, but it's non-trivial to check (that's why I prefer declarative definitions). Best regards, Julian
Re: [whatwg] Throwing in my support for picture into the mix
On 2012-05-16 15:46, Glenn Maynard wrote: On Wed, May 16, 2012 at 4:28 AM, Paul Court p...@pmcnetworks.co.uk wrote: First, I would like to suggest throwing <img srcset> out the window and into a landfill somewhere (It's not even fit for recycling!). This reminds me of the recent semi-colon in JavaScript debate that erupted as a result of @fat's code in the Twitter Bootstrap project - To one or two people who are very specialised in their particular area, it seems like a non issue - and I think that is the case with the <img srcset> syntax. From a browser developer point of view it might be easier to implement, but from an "I'm just learning to code" point of view, that syntax is bat-shit crazy! It's a simple, unambiguous, extensible format. If you don't like this ... It is? Quick check, do srcset=a,b and srcset=a, b mean the same thing? And what about srcset=a ,b ? Best regards, Julian
Re: [whatwg] Throwing in my support for picture into the mix
On 2012-05-16 16:07, Glenn Maynard wrote: On Wed, May 16, 2012 at 8:57 AM, Julian Reschke julian.resc...@gmx.de wrote: It is? Quick check, do srcset=a,b and srcset=a, b mean the same thing? And what about srcset=a ,b ? Yes, they all mean the same thing: a url a with no descriptors, and a url b with no descriptors. What makes you think they wouldn't? , is a legal URI character. (Collect a sequence of characters that are not space characters, and let that be url.) Best regards, Julian
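The quoted parsing step can be tried out directly. The sketch below implements only that one step (collecting the first url) and nothing else from the spec's algorithm. Skipping leading space characters first is an assumption added here, and `collect_url` is a name invented for the illustration.

```python
SPACE_CHARS = " \t\n\f\r"  # HTML's "space characters"


def collect_url(value: str) -> str:
    """Collect a sequence of characters that are not space characters.

    Leading space characters are skipped first (an assumption of this sketch).
    """
    i = 0
    while i < len(value) and value[i] in SPACE_CHARS:
        i += 1
    start = i
    while i < len(value) and value[i] not in SPACE_CHARS:
        i += 1
    return value[start:i]


# The three attribute values from the "quick check" in the thread.
for value in ("a,b", "a, b", "a ,b"):
    print(repr(value), "->", repr(collect_url(value)))
```

Under this step alone, the first url collected differs for each of the three values, because the comma is swallowed into the url whenever no space character precedes it.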
Re: [whatwg] Throwing in my support for picture into the mix
On 2012-05-16 16:36, Glenn Maynard wrote: On Wed, May 16, 2012 at 9:16 AM, Julian Reschke julian.resc...@gmx.de wrote: , is a legal URI character. (Collect a sequence of characters that are not space characters, and let that be url.) Actually, the key point is that this is non-conforming to start with: image candidate strings must have at least one descriptor (http://www.whatwg.org/specs/web-apps/current-work/#image-candidate-string). ... My point being that the syntax is fragile unless implementations follow the spec word by word. I know they are supposed to, but the way it's introduced *will* make people split the attribute value by ,. Best regards, Julian
Re: [whatwg] multipart/form-data filename encoding: unicode and special characters
On 2012-05-02 13:05, Evan Jones wrote: On May 1, 2012, at 22:38 , Ashley Sheridan wrote: The Webkit method looks the better of the two with regards to how server-side languages might interpret it, but it would need work to ensure everything that should be escaped is, and that everything that is unescaped on the server should be and is done so correctly. The problem is that currently I am unable to correctly round trip an uploaded file name. I would like users to upload a file, and be able to later download the file with the *exact same* file name. If you follow the specifications, this is not possible. Firefox is closer to the MIME RFCs (which specifies backslash quoting in quoted-strings), but apparently that will break IE6, 7, and 8: https://bugs.webkit.org/show_bug.cgi?id=62107 http://java.net/jira/browse/JERSEY-759 Webkit's %-escaping behaviour is *not* part of the referenced MIME RFCs (which specifies either backslash quoting in quoted-strings, base64 encoding, or %-escaping in special filename*= arguments). Thus, if this is the right answer, it should be specified somewhere. I'm assuming that this needs to be in the HTML5 spec, since HTTP calls this the body of the POST and declares that it is outside the HTTP specification. Webkit's escaping is also flawed (see bug 62107 above). Files that contain %-escapes (eg. my%22file.txt, admittedly very rare) will get mangled, because there is no difference between my%22file.txt and my"file.txt. Currently, I need to detect the browser in order to figure out what kind of unescaping to apply to the file name, and even then in some cases I can't figure out what the right file name is. Webkit claims this is a specification bug, so I'm hoping someone here might tell me if this is the case, and if so where can I file bugs, create test cases, etc? 
Evan -- http://evanjones.ca/ I did spend a considerable amount of time with Content-Disposition, the *response* header field (resulting in RFC 6266 and http://greenbytes.de/tech/tc2231/). However, this has little to do with the representation in form uploads. If browser implementers want to try something new that will not affect the old code paths, supporting the encoding defined in RFC 5987 might be the right thing to do (yes, it's ugly, but it's unambiguous). Best regards, Julian
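For readers unfamiliar with the RFC 5987 form mentioned above, the following is a minimal sketch of how a `filename*` value could be produced. It assumes UTF-8 and an empty language tag. `encode_ext_value` is a name made up for this example, and whether any browser or server would accept this in multipart/form-data is exactly the open question in this thread.

```python
from urllib.parse import quote

# attr-char per RFC 5987: ALPHA / DIGIT plus these punctuation characters.
# Everything else, including "%" and the double quote, gets percent-encoded.
ATTR_CHAR_PUNCTUATION = "!#$&+-.^_`|~"


def encode_ext_value(filename: str) -> str:
    """Return an RFC 5987 ext-value: charset, empty language, percent-encoded bytes."""
    return "UTF-8''" + quote(filename, safe=ATTR_CHAR_PUNCTUATION, encoding="utf-8")


for name in ('my"file.txt', "my%22file.txt", "файл.png"):
    print("filename*=" + encode_ext_value(name))
```

Because a literal "%" is itself escaped, the two problem names from the bug report encode to different strings, which is what makes the format unambiguous.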
Re: [whatwg] multipart/form-data filename encoding: unicode and special characters
On 2012-05-02 19:26, Evan Jones wrote: On May 2, 2012, at 7:43 , Julian Reschke wrote: If browser implementers want to try something new that will not affect the old code paths, supporting the encoding defined in RFC 5987 might be the right thing to do (yes, it's ugly, but it's unambiguous). It seems to me like that is a potential solution that could be evaluated. It would be nice to have both the HTTP response header and the POST form encoding be the same. However, a critical question is if the server software that parses the form headers would do the right thing if it sees both an ASCII fallback filename= and an escaped filename*= parameter in the Content-Disposition header. Without looking at any code, I suspect some will and some won't. I'm pretty sure everybody will ignore filename* for now. Which means servers need to upgrade, but at least it would be an upgrade that doesn't break any existing behavior. My conclusion: I would be willing to help with bugs, testing, test cases, looking at server code, etc related to this issue. However, I believe it needs someone who is experienced with the technology and politics of web standards to really champion any change because I don't fully understand the processes or the issues. If I don't hear anything in a few days, I'll try filing some additional bugs with Webkit, Firefox, and the HTML5 spec and otherwise give up. ... Sounds like a plan. Best regards, Julian
Re: [whatwg] Encoding Sniffing
On 2012-04-23 10:19, Henri Sivonen wrote: ... * The Universal detector is used regardless of UI setting or locale when using the FileReader to read a local file as text. (I'm personally very unhappy about this sort of use of heuristics in a new feature.) ... +1 ... WebVTT is a new format with no legacy. Instead of letting it become infected with heuristic detection, we should go the other direction and hardwire it as UTF-8 like we did with app cache manifests and JSON-in-XHR. No one should be creating new content in encodings other than UTF-8. Those who can't be bothered to use The Encoding deserve REPLACEMENT CHARACTERs. Heuristic detection is for unlabeled legacy content. ... +1
Re: [whatwg] URL query component
On 2012-04-20 14:37, And Clover wrote: On 2012-04-20 09:15, Anne van Kesteren wrote: Currently browsers differ for what happens when the code point cannot be encoded. What Gecko does [?%C2%A3] makes the resulting data impossible to interpret. What WebKit does [?%26%23163%3B] is consistent with form submission. I like it. I do not! It makes the data impossible to recover just as Gecko does... in fact worse, because at least Gecko preserves ASCII. With the WebKit behaviour it becomes impossible to determine from a pure ASCII string '&#163;' whether the user really typed '£' or '&#163;' into the input field. It has the advantage of consistency with the POST behaviour, but that behaviour is an unpleasant legacy hack which encourages a misunderstanding of HTML-escaping that promotes XSS vulns. I would not like to see it spread any further than it already has. +1 Indeed. I think this is a case where you want to fail early (for some value of fail); so maybe substituting with ? makes most sense. Do any servers *expect* the Webkit behavior? If they do so, why don't they just fix the pages they serve to use UTF-8 to get consistent behavior throughout? Best regards, Julian
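The ambiguity described above can be reproduced with a short sketch. It is not browser code: Python's `xmlcharrefreplace` error handler stands in for the character-reference fallback, and `"ascii"` stands in for any page encoding that cannot represent the pound sign. Both function names are invented for the illustration.

```python
from urllib.parse import quote


def charref_fallback_query(text: str, page_encoding: str) -> str:
    """Encode text for a query string, replacing unencodable characters
    with numeric character references before percent-encoding."""
    raw = text.encode(page_encoding, errors="xmlcharrefreplace")
    return quote(raw, safe="")


def utf8_fallback_query(text: str) -> str:
    """Percent-encode the UTF-8 bytes regardless of the page encoding."""
    return quote(text.encode("utf-8"), safe="")


typed_pound = charref_fallback_query("£", "ascii")
typed_reference = charref_fallback_query("&#163;", "ascii")
print(typed_pound, typed_reference, utf8_fallback_query("£"))
```

Both inputs produce the same query string under the character-reference fallback, so the server cannot tell which one the user typed.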
Re: [whatwg] Encoding Standard (mostly complete)
On 2012-04-17 11:30, Anne van Kesteren wrote: Hi, Apart from big5 (which requires some more research) all encoders and decoders are now defined: http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html ... As a nit, I believe that Character Encoding would make a better title than just Encoding. Best regards, Julian
Re: [whatwg] some thoughts on bring HTTP upon UDP: iWebPP - instant web p2p technology
On 2012-03-14 13:10, tom wrote: Hi, AFAIK, WebRTC intends to setup P2P communication between browsers, then carry video/audio/text media, etc. Why we need WebRTC? Firstly, Web is the most popular network app, secondly, video/voice brings the best user experience. But, the problem is that HTTP runs on TCP by now, while P2P runs on UDP normally. Suppose both web browser and server can run HTTP upon UDP(the protocol schema as HTTPP), what happens? Firstly, Web app developers can program HTTPP like HTTP, secondly, P2P traffic can be carried on HTTPP easily. Basically iWebPP consists of two parts: HTTPP-enabled web browser and web server. Any thoughts? thanks. ... Well, declaring that it should use UDP alone won't make it happen. It obviously will work nicely for small messages that are idempotent (so they can be retransmitted safely), but things get complicated beyond that. There's also previous work to study; for instance Microsoft has used HTTP over UDP for notifications in the past. A good place to bring this up might be the HTTPbis Working Group, which will be looking at what HTTP/2.0 might be very soon. Best regards, Julian
Re: [whatwg] Specify href target with HTTP headers
On 2012-03-08 20:25, Christian Schmidt wrote: ... Separating the network protocol from the user interface seems highly desirable. Window-Target sacrifices that. I get your point. But it seems that Content-Disposition already suffers from this. RFC 2183 describes the Content-Disposition like this: A mechanism is needed to allow the sender to transmit this sort of presentational information to the recipient; the Content-Disposition header provides this mechanism, allowing each component of a message to be tagged with an indication of its desired presentation semantics. I know that RFC 2183 deals with e-mail and is not part of HTTP/1.1, but it is mentioned in the HTTP specification and is supported by several browsers. ... Content-Disposition for HTTP is defined in RFC 6266. ... Best regards, Julian
Re: [whatwg] Caching of identical files from different URLs using checksums
On 2012-02-18 14:45, Sven Neuhaus wrote: ... Stop here. That's not what the fragment identifier is for. Instead, you could specify the hash as a separate attribute on the containing element. The relevant section from RFC 3986 reads: The fragment identifier component of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional identifying information. The identified secondary resource may be some portion or subset of the primary resource, some view on representations of the primary resource, or some other resource defined or described by those representations. ..but it goes on saying: The semantics of a fragment identifier are defined by the set of representations that might result from a retrieval action on the primary resource. The fragment's format and resolution is therefore dependent on the media type [RFC2046] of a potentially retrieved representation, even though such a retrieval is only performed if the URI is dereferenced. If no such representation exists, then the semantics of the fragment are considered unknown and are effectively unconstrained. Fragment identifier semantics are independent of the URI scheme and thus cannot be redefined by scheme specifications. This description is not contradicting the use of checksum as fragment identifiers. They are additional identifying information. It is contradicting the concept of being defined by the media type. However, if there is a consensus that checksums shouldn't be stored in the fragment part of the URL, a new attribute would be a good alternative. Regards, -Sven Neuhaus Best regards, Julian
Re: [whatwg] Caching of identical files from different URLs using checksums
On 2012-02-17 09:42, Sven Neuhaus wrote: Hello, as of 2012, some websites are including popular javascript libraries from CDNs, like Google's. The benefits are: * Traffic savings for the site operator because the javascript libraries are downloaded from the CDN and not from the site that uses them * If enough sites refer to the same external file, the browser will cache the file and even if it's a first visit, the (potentially large) javascript file will not have to be downloaded. There are however some drawbacks to this approach: * Security: The site operator is trusting an external site. If the CDN serves a malicious file it will directly lead to code execution in browsers under the domain settings of the site including it (a form of cross site scripting). * Availability: The site depends on the CDN to be available. If the CDN is down the site may not be available at all. * Privacy: The CDN will see requests for the file with HTTP referer headers for every visitor of the site. * Extra DNS lookup if file is not already cached * Extra HTTP connection (can't use persistent connection because it's a different site) if file is not cached I am proposing a solution that will solve all these problems, keep the benefits and offers some extra advantages: 1. The site stores a copy of the library file(s) on its own site. 2. The web page includes the library from the site itself instead of from the CDN 3. The script tag specifies a checksum calculated using a cryptographic hash function. With this solution, whenever a browser downloads a file and stores it in the local cache, it calculates its checksum. The browser can check its cache for an (identical) file with the same checksum (no matter what URL it was retrieved from) and use it instead of downloading the file again. 
This suggestion has previously been discussed here ( http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2006-November/thread.html#7825 ), however for a different purpose (file integrity instead of caching identical files from different sites) and I don't feel the points raised back then apply. If a library is popular, chances are that many sites are including the identical file and it will already be in the browser's cache. No network access is necessary to use it, improving the users' privacy. It doesn't matter if the sites store the library file at a different URL. It will always be identified by its checksum. The cached file can be used more often. The syntax used to specify the checksum is using the fragment identifier component of a URI (RFC 3986 section 3.5). ... Stop here. That's not what the fragment identifier is for. Instead, you could specify the hash as a separate attribute on the containing element. Best regards, Julian
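The cache lookup in the proposal can be sketched as a content-addressed store. This is only an illustration of the idea, not how any browser cache works: SHA-256 is an assumed choice of hash function, and `ChecksumCache` and the URLs are hypothetical.

```python
import hashlib
from typing import Optional


class ChecksumCache:
    """Toy cache keyed by the checksum of the body rather than by URL."""

    def __init__(self) -> None:
        self._by_digest: dict[str, bytes] = {}

    def store(self, body: bytes) -> str:
        """Store a downloaded body and return its hex digest."""
        digest = hashlib.sha256(body).hexdigest()
        self._by_digest[digest] = body
        return digest

    def lookup(self, digest: str) -> Optional[bytes]:
        """Return the cached body for a declared checksum, or None on a miss."""
        return self._by_digest.get(digest)


cache = ChecksumCache()
library = b"/* some popular library */"

# Site A downloads the file from http://a.example/js/lib.js and caches it.
declared = cache.store(library)

# Site B references http://b.example/static/lib.min.js but declares the same
# checksum (e.g. in a separate attribute), so no network fetch is needed.
hit = cache.lookup(declared)
miss = cache.lookup(hashlib.sha256(b"something else").hexdigest())
print(hit is not None, miss is None)
```

Carrying the declared checksum in its own attribute, as suggested in the reply, leaves the URL and its fragment identifier untouched.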
Re: [whatwg] Augmenting HTML parser to recognize new elements
On 2012-01-18 22:55, Dimitri Glazkov wrote: On Wed, Jan 18, 2012 at 1:47 PM, Adam Barth w...@adambarth.com wrote: On Wed, Jan 18, 2012 at 1:29 PM, Dimitri Glazkov dglaz...@chromium.org wrote: On Wed, Jan 18, 2012 at 1:14 PM, Dimitri Glazkov dglaz...@chromium.org wrote: Ah, that's a good question. This also must be specified. It should depend on the parent of the <content> element. If the parent is shadow root or <table>, then it should make <tr> the child of <content>. Otherwise, it should use foster parenting as usual. Oops, not foster parenting, but ignore as you mentioned. Still getting through the details of the parsing spec. There's also some subtlety w.r.t. the pending character tokens. More generally, I think we'd all be much more sane if the HTML parsing algorithm was specified in the HTML living standard rather than modified ad-hoc in a number of different documents. That makes sense, but how will we handle the fact that the elements in the algorithm aren't part of the HTML specification? ... The algorithm should be specified so that all future elements follow the same parsing rules, thus no further changes are required. Best regards, Julian
Re: [whatwg] Proposal: intent tag for Web Intents API
On 2011-12-08 18:54, Anne van Kesteren wrote: On Wed, 07 Dec 2011 18:59:43 +0100, Paul Kinlan paulkin...@google.com wrote: Cons: * ordering of data in the content element - if the ordering of data in the content value is mandatory and the developer mixes up the ordering, does the action then become image/png (which is still technically valid) and the data type become the uri string specified? * we have other optional attributes, such as title, disposition and icon so a scheme needs to be defined inside the content, if we define a scheme it looks similar to the intent tag but harder to prepare (from a normal developer's perspective) * some attributes can have spaces so we would need to define encoding mechanisms inside the content attribute to handle quotes, and double quotes. * we can't provide a visual fallback if intents aren't supported - see discussion about self closing tag in body. * harder to validate (due to all of the above) We can just add additional attributes to meta you know. We have done the same for link. E.g. for link rel=icon you can specify a sizes attribute. Hmmm. That makes it sound a lot easier than it is. After all, there's no extension point here. Adding attributes to meta (or link) requires a change to HTML5, or a delta spec adding these as conforming attributes. Best regards, Julian
Re: [whatwg] Proposal for improved handling of '#' inside of data URIs
On 2011-09-14 10:16, Robert O'Callahan wrote: On Sat, Sep 10, 2011 at 11:01 PM, Ryosuke Niwarn...@webkit.org wrote: Have implementors actively opposed to this idea? It seems like sticking to RFC is a cleaner option if possible. Yeah. Will you fix it in Webkit? :-) :-) Maybe we should start with opening a ticket, so this is properly tracked?
Re: [whatwg] Proposal for improved handling of '#' inside of data URIs
On 2011-09-12 21:47, Michal Zalewski wrote: What about javascript: URLs? Right now, every browser seems to treat javascript:alert('#') in an intuitive manner. This likely goes beyond data: and javascript:, so I think it would be useful to look at it more holistically. Maybe. Or it makes sense to do it one at a time :-). Observation: javascript: IMHO isn't a URI scheme (it just occupies a place in the same lexical space), so maybe the right thing to do is to document it as historic exception that only exists in browsers. Best regards, Julian
Re: [whatwg] Proposal for improved handling of '#' inside of data URIs
On 2011-09-11 04:51, Boris Zbarsky wrote: ... I think you misunderstand my position. I'm weakly against the proposal in question; the strongest argument in favor of the proposal is that there is either a current or future deployed base of data: URIs that won't work without it but do work in either past browsers or some subset of future ones. Of course the simplest way to prevent the future URIs thing being a problem is for UAs that don't follow the URI spec here right now to fix that, but I haven't sensed much willingness to do that in the past, or earlier in this discussion. :( ... +1 for trying to sanitize the parsing in Firefox. Given the fact that this change made it into the release without any major uproar there might be a chance that other UAs might simply adopt it. Given the choice between converging on this proposal and the status quo in which UAs just do wildly different totally wacky things, I'd pick the proposal, I think If we can't get the perfect fix (UAs consistently doing what the spec says), then of course converging on something that is less broken than before may be good. Best regards, Julian
Re: [whatwg] Proposal for improved handling of '#' inside of data URIs
On 2011-09-11 18:56, Daniel Holbert wrote: On 09/11/2011 02:09 AM, Julian Reschke wrote: Given the fact that this change made it into the release without any major uproar there might be a chance that other UAs might simply adopt it. (To be clear -- the proposal hasn't made it into any releases yet. Right now it's just an idea.) ... Understood. I was referring to the changed behavior as of Firefox 6. Best regards, Julian
Re: [whatwg] Proposal for improved handling of '#' inside of data URIs
On 2011-09-11 17:30, Daniel Holbert wrote: On 09/11/2011 07:21 AM, Michael A. Puls II wrote: Not only must # be %23 if you don't want it as a frag id, but > and < should be %3E and %3C. [...] Of course, if you can percent-encode everything needed as you type, you can hand-author the URI data. But, who wants to do that, As I noted in a response to Nils earlier in this thread, Firefox/Webkit/Opera don't actually require authors to percent-encode brackets and spaces in data URIs. (not sure whether that's correct per spec or not). ... It's not correct per RFC 2397 (data) and RFCs 3986 (URI) and 3987 (IRI), but the HTML spec certainly could *make* it correct by introducing an additional layer (if there was consensus to do so). Right now HTML5 conformance requires valid IRIs, so unescaped whitespace or angle brackets in @hrefs make the document non-conforming. ... Best regards, Julian
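A small sketch of the escaping discussed above, using Python's generic URL splitter as a stand-in for an RFC 3986 style parser. It says nothing about what any particular browser does, and the SVG payload is made up for the example.

```python
from urllib.parse import quote, urlsplit

# Hypothetical payload containing "#", "<", ">" and a space.
payload = "<svg fill='#fff'/>"

raw_uri = "data:image/svg+xml," + payload
escaped_uri = "data:image/svg+xml," + quote(payload, safe="")

# In the unescaped form a generic parser treats everything after "#"
# as the fragment identifier, truncating the data.
print(repr(urlsplit(raw_uri).fragment))

# In the escaped form "#" has become %23, so no fragment is split off.
print(repr(urlsplit(escaped_uri).fragment))
print(escaped_uri)
```

Encoding every reserved character is more than the data URI syntax strictly requires, but it sidesteps the question of which unescaped characters a given parser tolerates.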
Re: [whatwg] 1.1.1 How do the WHATWG and W3C specifications differ?
On 2011-09-08 08:26, Jens O. Meiert wrote: Please clarify -- (a) the decisions do not make sense or (b) not applying them doesn't make sense? My main concern are the number of differences between the WHATWG and the W3C version, hence the question whether we’re on it at all to improve this. I'm all for improving it :-). The good question is how (other than the obvious way to just apply the W3C HTML WG decisions). Best regards, Julian
Re: [whatwg] add html-attribute for responsive images
On 2011-08-30 16:51, Anne van Kesteren wrote: On Tue, 30 Aug 2011 16:31:59 +0200, Karl Dubost ka...@opera.com wrote: * It is in fact an issue for being able to make the website responsive on Mobile devices in low bandwidth. The mobile devices are the ones with the high-resolution displays. Speak for your own device :-)
Re: [whatwg] a rel=attachment
On 2011-07-22 09:00, Ian Hickson wrote: (These e-mails were sent after I started working on the previous one.) On Wed, 20 Jul 2011, Chris Bentzel wrote: Who should be trusted for filename if the one specified by <a> on the referring page differs from the one specified by Content-Disposition on the to-be-downloaded resource? I've specified that the header wins. On Wed, 20 Jul 2011, Julian Reschke wrote: That being said, if you want to go down the road, make it clear how the file name actually is extracted from the header field in an interoperable way. That isn't really in scope for the HTML spec, it's something either for the HTTP spec or the Content-Disposition spec (if HTTP doesn't define the header itself) to define. Is there a specific reason why the new text doesn't mention Content-Disposition anymore? Best regards, Julian
Re: [whatwg] a rel=attachment
On 2011-07-22 05:03, Hironori Bono (坊野 博典) wrote: Greetings all, This is just out of curiosity. Would it be possible to give me the encoding used for this download attribute? I think we have several options when we use non-ASCII characters (this example uses Cyrillic characters) as the value of this attribute as listed below. 1. Use the same encoding as the one used for the HTML content. <a href=... download=файл.png>сохранить файл</a> (If we allow using '&#x...' format of HTML, it becomes: <a href=... download=&#x444;&#x430;&#x439;&#x43B;.png>сохранить файл</a>) 2. Use the URL encoding (same as the href attribute). <a href=... download=%D1%84%D0%B0%D0%B9%D0%BB.png>сохранить файл</a> 3. Use RFC 2231 (same as the content-disposition header) <a href=... download=UTF-8''%D1%84%D0%B0%D0%B9%D0%BB.png>сохранить файл</a> Thank you for your help in advance. It's the same as with any other HTML attribute. The thing you mention in 3) is a special mechanism only needed in HTTP header fields (btw updated by RFC 5987), and doesn't apply here. Best regards, Julian
Re: [whatwg] a rel=attachment
On 2011-07-22 09:24, Ian Hickson wrote: On Fri, 22 Jul 2011, Julian Reschke wrote: That isn't really in scope for the HTML spec, it's something either for the HTTP spec or the Content-Disposition spec (if HTTP doesn't define the header itself) to define. Is there a specific reason why the new text doesn't mention Content-Disposition anymore? Not only does the new text mention Content-Disposition, it actually refers to its specification multiple times. Are you looking at the right diff? I was looking at the diff r6318. Maybe there were more changes? Best regards, Julian
Re: [whatwg] a rel=attachment
On 2011-07-20 13:33, Chris Bentzel wrote: Who should be trusted for filename if the one specified by <a> on the referring page differs from the one specified by Content-Disposition on the to-be-downloaded resource? ... I think the header field needs to be authoritative. That being said, if you want to go down the road, make it clear how the file name actually is extracted from the header field in an interoperable way. Best regards, Julian
Re: [whatwg] date meta-tag invalid
On 2011-07-18 14:54, aykut.sen...@bild.de wrote: According to the w3c Validator the <meta name=date content=# /> tag is invalid. In the WHATWG MetaExtensions List there is no registered extension, no specification and no proposal for the date meta-tag. The only alternative for date is a proposal called created, which however doesn't meet the requirements for registration. For our SEO team the date meta-tag contains some of the most important information about a webpage. What would be a w3c-valid way to implement a creation date meta-tag in html5? Out of curiosity: who is processing the tag? And what does this have to do with SEO? Do search engines do anything with it? From HTML5's point of view the suggested replacement is probably <time pubdate>... Did you look at that already? Also: there seems to be overlap with Dublin Core's dc:created? Best regards, Julian
Re: [whatwg] date meta-tag invalid
On 2011-07-18 15:59, aykut.sen...@bild.de wrote: hi julian, i have asked one from the seo team and he says for example the freshness factor is important for google. is it possible to use the time-tag in the head instead (i mean invisible)? dc:created is also not in the Meta Extensions List, see: http://wiki.whatwg.org/wiki/MetaExtensions I *believe* the SEO team is misguided when it thinks that meta/@name=date affects Google. But only Google can tell us. I mentioned dc:created not because it's valid, but because it's at least *specified* and in wider use. Best regards, Julian
Re: [whatwg] Iframe Sandbox Attribute - allow-plugins?
On 2011-07-14 17:01, Jonas Sicking wrote: ... True. I would be fine with removing the plugin requirement. Or changing it such that it states that plugins can only be loaded if it's done in a manner that ensures that all other requirements are still fulfilled. Or just dealing with this once there actually are plugins and plugin APIs which could be loaded while still fulfilling the other requirements. ... Well, the spec is in W3C LC. So if we think this requirement needs to be rephrased then it should be brought up as a problem. Best regards, Julian
Re: [whatwg] a rel=attachment
On 2011-07-15 19:05, Ian Fette (イアンフェッティ) wrote: .. It also doesn't naturally help understanding that it's just poor man's Content-Disposition:attachment. From this point of view, I like Ian's original proposal (rel=attachment) more. Yes and no - both are sort of a poor man's Content-Disposition :) The question is whether we need to handle filename, and the proposal of download=filename at least maps content-disposition fully and compactly. ... Well, one difference is that C-D is under the control of the owner of the resource being linked to (ideally), while attributes set somewhere else might not. So there is a security-related aspect to this. Best regards, Julian
Re: [whatwg] Iframe Sandbox Attribute - allow-plugins?
On 2011-07-14 08:22, Jonas Sicking wrote: On Wed, Jul 13, 2011 at 9:49 PM, Anne van Kesterenann...@opera.com wrote: On Wed, 13 Jul 2011 23:13:05 +0200, Julian Reschkejulian.resc...@gmx.de wrote: Yes, but we can *define* the flag in HTML and write down what it means with respect to plugin APIs. It seems much better to wait until it can actually be implemented. Especially since it's not at all clear to me that a specific opt-in mechanism is at all needed once we have the appropriate plugin APIs implemented. And those APIs are needed anyway if we want to allow plugins in any form in the sandbox. When the attribute is set, the content is treated as being from a unique origin, forms and scripts are disabled, links are prevented from targeting other browsing contexts, and plugins are disabled. A browser negotiating something with plugins using that API and enabling them despite @sandbox would violate the above requirement, no?
Re: [whatwg] Iframe Sandbox Attribute - allow-plugins?
On 2011-07-13 22:31, Adam Barth wrote: Adding allow-plugins today would defeat the prevention of parent redirection. The short answer is we need an API for informing plugins of the sandbox flags and a way of confirming that the plugins understand those bits before we can allow plugins inside sandboxed frames. ...but that API is outside the scope of what the W3C and the WhatWG currently do, so I think it would be great if defining this flag could be decoupled from progress on the plugin API layers. Best regards, Julian
Re: [whatwg] Iframe Sandbox Attribute - allow-plugins?
On 2011-07-13 22:58, Adam Barth wrote: On Wed, Jul 13, 2011 at 1:55 PM, Julian Reschkejulian.resc...@gmx.de wrote: On 2011-07-13 22:31, Adam Barth wrote: Adding allow-plugins today would defeat the prevention of parent redirection. The short answer is we need an API for informing plugins of the sandbox flags and a way of confirming that the plugins understand those bits before we can allow plugins inside sandboxed frames. ...but that API is outside the scope of what the W3C and the WhatWG currently do, so I think it would be great if defining this flag could be decoupled from progress on the plugin API layers. It is coupled in the sense that we can't implement the flag unless and until such a plug-in API exists. Yes, but we can *define* the flag in HTML and write down what it means with respect to plugin APIs. Best regards, Julian
Re: [whatwg] EventSource - Handling a charset in the content-type header
On 2011-07-04 16:13, Anne van Kesteren wrote: ... Are we sure we want this strict checking of media type parameters? I always thought the media type itself was what strict checking should be done upon, but that its parameters were extension points, not points of failure. ... The right thing for consistency with other uses of media types is to ignore unknown parameters. Best regards, Julian
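As an illustration of "check the media type, ignore unknown parameters", here is a small Python sketch. It assumes the type being checked is text/event-stream (the EventSource media type) and uses the standard library's header parser; the helper names are made up for this example.

```python
from email.message import Message


def media_type(content_type: str) -> str:
    """Return the lowercased type/subtype, dropping all parameters."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_type()


def is_event_stream(content_type: str) -> bool:
    # Only the media type itself is compared; parameters such as
    # charset (or anything unknown) do not cause a failure.
    return media_type(content_type) == "text/event-stream"


print(is_event_stream("text/event-stream; charset=utf-8"))
print(is_event_stream("text/plain"))
```

The design choice is that parameters are treated as extension points: a server adding a parameter the client does not know about does not break the stream.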
Re: [whatwg] Content-Disposition property for a tags
On 2011-06-03 17:46, Bjartur Thorlacius wrote: ... I strongly disagree. I think browsers that use the Content-Disposition filename for attachment but not inline are just buggy and should be fixed. FWIW MSIE9 seems to honor the filename hint with inline (contrary to the test results mentioned earlier in the thread). ... Hint: the test page has a feedback link. That being said: I just tried http://greenbytes.de/tech/tc2231/inlwithasciifilename.asis and IE9 seems to ignore the filename information. Best regards, Julian
Re: [whatwg] Content-Disposition property for a tags
On 2011-06-03 14:23, Dennis Joachimsthaler wrote: On 03.06.2011, at 10:23, Eduard Pascual herenva...@gmail.com wrote: On Thu, Jun 2, 2011 at 10:09 PM, Dennis Joachimsthaler den...@efjot.de wrote: By the way, another point that we have to discuss: Which tag should a browser favor. The one in HTTP or the other one in HTML? Is that really worth discussing? HTTP > HTML: whoever provides the file should have the last say about how the file needs to be served, regardless of what a site referencing to it may suggest. Furthermore, when links point to URIs with any scheme other than http:, whatever the scheme defines about how to deliver the file takes precedence. Thus, only in the lack of an actual Content-Disposition header, or its equivalent on some other scheme, would the attribute given by the link be used, just like an additional fallback step before whatever the UA's default behaviour would be. I agree that I shouldn't even have asked since this is actually a no-brainer. I can't think of any good reason to overwrite the http header with the html attribute. Alright, so, moving on... This grants the ability for any content provider to use an explicit Content-Disposition: inline HTTP header to effectively block download links from arbitrary sources. True. Is it still so that some browsers ignore the filename part of a content-disposition if an inline disposition is used? Yes, see http://greenbytes.de/tech/tc2231/#inlwithasciifilename. Apparently only Firefox gets this right. ... Best regards, Julian
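The precedence described above (the header wins, the link attribute is only a fallback) could be sketched roughly as follows. This is a simplified Python illustration: the function name, the fallback name "download" and the decision to consider only the file name (not the inline/attachment disposition itself) are assumptions of this example, not something the thread specifies.

```python
from email.message import Message


def pick_filename(content_disposition, link_hint, fallback="download"):
    """Prefer the file name from the HTTP header; use the link's hint
    only when the header supplies none (hypothetical helper)."""
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            return name
    return link_hint or fallback


print(pick_filename('attachment; filename="report.pdf"', "other.pdf"))
print(pick_filename(None, "other.pdf"))
```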
Re: [whatwg] Content-Disposition property for a tags
On 2011-05-26 22:54, Dennis Joachimsthaler wrote: On 26.05.2011, at 22:53, Boris Zbarsky bzbar...@mit.edu wrote: Probably no one, to a first approximation, but we were specifically talking about non-Windows systems. On Windows, as I said, Gecko forces extensions to match content types, to avoid this sort of issue in general. Yep, yep... If browsers implement the filename (+ extension) name changing we should make it a MUST to implement security... ... Like http://greenbytes.de/tech/webdav/draft-ietf-httpbis-content-disp-latest.html#rfc.section.4.3? Best regards, Julian
Re: [whatwg] element img with HTTP POST method
On 10.12.2010 01:46, Tab Atkins Jr. wrote: ... Indeed. You shouldn't be able to trigger POSTs from involuntary actions. They should always require some sort of user input, because there is simply *far* too much naive code out there that is vulnerable to CSRF. ... Thanks, Tab. It's sad that the discussion even got that far. If the URI length is a problem because of browsers, fix the browsers to extend the limits, instead of adding a completely new feature. Best regards, Julian
Re: [whatwg] Content-Disposition property for a tags
On 02.08.2010 18:56, Tab Atkins Jr. wrote: 2010/8/2 Kornel Lesińskikor...@geekhood.net: Downloads can be forced already with Content-Disposition: attachment. It's just harder to do, and unfortunately that doesn't stop webmasters from trying. Popular PHP snippets for forcing download are among the most disgusting cargo-cult code I've ever seen — they're collection of self-contradictory and nonsensical HTTP headers, break caching and resuming, and often have security vulnerabilities. It would be great if we could obsolete those scripts. It would be great if those scripts could just get fixed. Indeed; I've used those code samples, and since the entire area is basically voodoo to me, I still have no idea which headers I sent did anything and which are useless or even harmful cruft. In general, even well-educated authors have no clue what they're doing here. I believe the spec for C-D is sufficiently clear. But you still need to read it :-). Best regards, Julian
Re: [whatwg] Content-Disposition property for a tags
On 06.08.2010 05:49, Bjartur Thorlacius wrote: ... IMO there should be a standard metadata wrapper that should be around virtually all files being passed around the Internet. Downloaders should register the metadata to xattrs or somesuch and uploaders should collect said metadata and rewrap it. Technically application/http could be used. ... There is a widely deployed metadata wrapper; it's the HTTP message headers. Best regards, Julian
Re: [whatwg] Content-Disposition property for a tags
On 07.12.2010 18:51, Dennis Joachimsthaler wrote: On 07.12.2010, at 10:13, Julian Reschke julian.resc...@gmx.de wrote: It would be great if those scripts could just get fixed. Do you actually think that would HAPPEN? I think not. Better have people get rid of them entirely. Though that wouldn't happen either. I'm still all for such a property in a hrefs. I personally hate writing scripts to do something so simple. I think we could name it declaration of content. Why should HTTP, the protocol underlying the HTML language, have to take care of declaration of content? Shouldn't the HTML file itself have the power over that? We have a lot of that already, like content types, etc. But we cannot yet declare content which is MEANT for downloading to your hard drive. This is a big hole in my opinion. ... I'm not against adding this in principle; but it shouldn't keep us from improving the situation for what's already there. Having multiple ways to do the same thing causes real cost; you need to explain when to use what, and define which information takes priority. Also; be sure to replicate what's needed from C-D, namely the filename information. Has it ever been considered to use target=_download (just made up) for this? Best regards, Julian
Re: [whatwg] Reserving XRI and URN in registerProtocolHandler
On 26.11.2010 05:20, Brett Zamir wrote: I'd like to propose reserving two protocols for use with navigator.registerProtocolHandler: urn and xri (or possibly xriNN where NN is a version number). See http://en.wikipedia.org/wiki/Extensible_Resource_Identifier for info on XRI (basically allows the equivalents of URN but with a user-defined namespace and without needing ICANN/IANA approval). Although it was You don't need ICANN/IANA approval. You can use informal URN namespaces, use a URN scheme that allows just grabbing a name (such as URN:UUID) *or* write a small spec; for the latter, the approval is *IETF* consensus (write an Internet Draft, then ask the IESG for publication as RFC). Best regards, Julian
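As a small illustration of "a URN scheme that allows just grabbing a name", the urn:uuid namespace needs no registration step per name. A minimal Python sketch using the standard library:

```python
import uuid

# A random (version 4) UUID, rendered in its URN form.
name = uuid.uuid4().urn

# The value is different on every run, but always starts with "urn:uuid:".
print(name)
```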
Re: [whatwg] Reserving XRI and URN in registerProtocolHandler
On 26.11.2010 11:54, Brett Zamir wrote: ... My apologies for the lack of clarity on the approval process. I see all the protocols listed with them, so I wasn't clear. In any case, I still see the need for both types being reserved (and for their subnamespaces targeted by the protocol handler), in that namespacing is built into the XRI unlike for informal URNs which could potentially conflict. ... I'm still not sure what you mean by reserve and what that would mean for the spec and for implementations. I do agree that the current definition doesn't work well for the urn URI scheme, as, as you observed, semantics depend on the first component (the URN namespace). Do you have an example for an URN namespace you actually want a protocol handler for? Finally, I'd recommend not to open the XRI can-of-worms (see http://en.wikipedia.org/wiki/Talk:Extensible_Resource_Identifier). Best regards, Julian
Re: [whatwg] Reserving XRI and URN in registerProtocolHandler
On 26.11.2010 16:55, Brett Zamir wrote: On 11/26/2010 7:13 PM, Julian Reschke wrote: On 26.11.2010 11:54, Brett Zamir wrote: ... My apologies for the lack of clarity on the approval process. I see all the protocols listed with them, so I wasn't clear. In any case, I still see the need for both types being reserved (and for their subnamespaces targeted by the protocol handler), in that namespacing is built into the XRI unlike for informal URNs which could potentially conflict. ... I'm still not sure what you mean by reserve and what that would mean for the spec and for implementations. I just mean that authors should not use already registered protocols except as intended, thinking that they can use any which protocol name they like (e.g., the Urn Manufacturers Company using urn for its categorization scheme). I do agree that the current definition doesn't work well for the urn URI scheme, as, as you observed, semantics depend on the first component (the URN namespace). Do you have an example for an URN namespace you actually want a protocol handler for? ISBNs. Oh, that's a good point. In particular, if the URN WG at some day makes progress with respect to retrieval. So, would it be possible to write a generic protocolHandler for URN which itself delegates to more specific ones? ... BR, Julian
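The idea of a generic URN handler that delegates to namespace-specific ones could look roughly like the Python sketch below. Everything in it is hypothetical: the HANDLERS table, the example.org lookup URL and the function names are invented for illustration and are not part of registerProtocolHandler or any spec.

```python
def split_urn(urn: str):
    """Split 'urn:<nid>:<nss>' into (lowercased nid, nss)."""
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn" or not nid or not nss:
        raise ValueError("not a URN: " + urn)
    return nid.lower(), nss


# Hypothetical table of namespace-specific handlers; the ISBN
# lookup URL is a placeholder, not a real service.
HANDLERS = {
    "isbn": lambda nss: "https://example.org/isbn/" + nss,
}


def resolve(urn: str):
    """Delegate to the handler for the URN's namespace, if any."""
    nid, nss = split_urn(urn)
    handler = HANDLERS.get(nid)
    return handler(nss) if handler else None


print(resolve("urn:isbn:0451450523"))
print(resolve("urn:uuid:123e4567-e89b-12d3-a456-426614174000"))
```

The dispatch key is the first component after "urn:", which matches the observation that the semantics depend on the URN namespace.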
Re: [whatwg] Content-Disposition property for a tags
On 26.09.2010 12:39, Dennis Joachimsthaler wrote: Hello, I'd like to bring this back to attention. I don't want this to be forgotten before anybody who is official has said their definitive yes or no about it. Or how else do new additions find their way into the draft? Many were positive about this feature, so I don't want to let this sink into oblivion. If you want this to be tracked, you should open a ticket in the W3C bug tracker. Best regards, Julian
Re: [whatwg] Proposal: add attributes etags last-modified to link element.
On 19.09.2010 22:33, Robert O'Callahan wrote: ... So for example, page A links to resource B. The browser does a GET on A, and receives a document containing a link to B, and the link element has etags or last-modified attributes. The browser has a cached resource for B, whose etags/last-modified matches the link attribute, so the browser knows its cached B is valid and no further network transactions are required. The linked resource B having the right caching information in the first place (when the browser first fetched it) isn't enough to eliminate the need for an HTTP transaction to validate B later. ... Well, it would if the caching information specifies an expiry time sufficiently in the future. Best regards, Julian
Re: [whatwg] Proposal: add attributes etags last-modified to link element.
On 20.09.2010 02:37, Aryeh Gregor wrote: ... Sure it would. You can currently only save an HTTP request if a future Expires header (or equivalent) can be sent. A lot of the time, the resource might change at any moment, so you can't send such a header. The client has to check every time, and get a 304, even if the resource changes very rarely. If you could indicate in the HTML source that you know the resource hasn't changed, you could save a lot of round-trips on a page that links to many resources. ... Resources that should be cached (stylesheets, images) but change at unexpected times are indeed a problem. A well understood approach is to push some kind of version indicator into the URI (such as query parameter). Best regards, Julian
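A sketch of the "version indicator in the URI" approach, in Python. It assumes the indicator is a short hash of the resource's content placed in a query parameter named v; both choices are illustrative, and a timestamp or revision number would work the same way.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def versioned_url(url: str, content: bytes, param: str = "v") -> str:
    """Return url with a query parameter derived from the content,
    so the URL changes whenever the resource changes."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    parts = urlsplit(url)
    # Drop any previous version parameter, keep everything else.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, digest))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(versioned_url("/style.css", b"body { color: black }"))
```

Because each version gets its own URL, every response can carry a far-future expiry time, and no validation round-trip is needed until the page starts linking to a new URL.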
Re: [whatwg] Proposal: add attributes etags last-modified to link element.
On 20.09.2010 17:26, Mike Belshe wrote: ... LINK, in general, allows a server to indicate to a client that it will need a particular resource earlier than the client otherwise would have discovered it. Today, the LINK header doesn't assist with understanding ... Sorry? That may be a use case that *could* be implemented using LINK, but it's certainly *not* the general use case. For instance, it doesn't seem to be true for any of the currently used link relations in wide use, such as icon or stylesheet (there's no later discovery at all). Or are you referring to using the Link *header* in addition to an equivalent HTML LINK? existing cache control mechanics, so if the browser does have the resource in cache but it needs validation, you didn't accomplish what you had hoped with the LINK header - the client is still going to make a costly round-trip. For savvy content authors, they could, as you suggest, simply modify the content to work with this case. This effectively restricts the full benefit of LINK to the subset of resources which are static and have long-lived expiry. That would leave LINK less useful to large swaths of the internet where they do leverage if-modified-since and etags. Link relations cover many other use cases than those that you seem to be considering. For resources that change infrequently but at unexpected times, it's already possible to get what you want by varying the URI when the resource changes (such as putting a timestamp or a revision number into a query parameter). Rather than ask this question about the LINK header attributes, you could instead aim your question at HTTP - why does HTTP bother with if-modified-since? But the answer is moot - that decision was made long ago. Not sure what you're referring to. If-Modified-Since predates ETags (as far as I recall). Given that the web *does* use these basic cache control mechanisms, why *wouldn't* you want the LINK header to be capable of using them too?
:-) This proposal is actually just making LINK more like the rest of HTTP. My main concern is that if we put etags into *HTML* links, we're leaking protocol-level information into markup. I think it would be good if we could avoid that, and so far I haven't seen any use case that doesn't work without. Best regards, Julian
Re: [whatwg] Proposal: add attributes etags last-modified to link element.
On 20.09.2010 18:17, Gavin Peters (蓋文彼德斯) wrote: I think Mike was referring to the Link header. This header is defined in RFC 2068 (but not RFC 2616) in section 19.6.2.4 http://tools.ietf.org/html/rfc2068#section-19.6.2.4 , the most important part of that text is probably that The Link field is semantically equivalent to the LINK element in the HTML. There's also a pending internet draft which expands more fully on this header: https://datatracker.ietf.org/doc/draft-nottingham-http-link-header/ , and that draft in the HTTP case maintains the HTML equivalence (see section 5 of the internet draft). I happen to be aware of the Link header, and the draft (which, by the way, was approved a few months ago and already is in the RFC Editor's publication queue). I think the HTML link element is unusual because it does exist both in markup, and at the protocol level. My experimentation with these attributes has been entirely at the protocol, and not the markup level. The standard for the element is in HTML, and so that's why I made my proposal here in whatwg. If we're talking about the link header primarily, I'd suggest you move over to the IETF HTTPbis Working Group's mailing list (http://lists.w3.org/Archives/Public/ietf-http-wg/). ... Those approaches work; but require modifying the HTML. So if a server is attempting to have good protocol-level support for the Link header, and to help a client avoid redundant fetches, we're now requiring information leak from the protocol level down to the markup level. I think this is problematic, too. If the link element is going to work as both a header and an element, it should have sufficient flexibility to be useful and fully embedded in each application. ... I think Mike was speaking about conditional gets generally, which can of course be conditioned on ETag or Last-Modified. Most web browsers, when they have expired cache data, will make a conditional get based on their existing cache entry.
If these attributes give a way to avoid this extra request, and if these attributes enhance the protocol-level context, why not support them? ... The main reason would be additional complexity (IMHO). But if we're talking about HTTP this mailing list most certainly is not the right place to discuss this. Later on Mike writes: Yeah, I'm thinking of servers that can learn and auto-generate these headers. I think you're thinking of content authors plunking this into their HTML. So, clarifying: you would send an *additional* Link header for the stylesheet relation, and augment it with the current etag? I'd be perfectly happy to split these out of the HTML-link to the HTTP-link. Maybe it's time they be split up. I think both should be consistent (like relation type names mean the same thing); but that doesn't necessarily mean that their feature sets need to be identical. Best regards, Julian
Re: [whatwg] Proposal: add attributes etags last-modified to link element.
On 15.09.2010 19:45, Gavin Peters (蓋文彼德斯) wrote: Hi, I'm working on link tags inside of chrome. We're now experimenting with an optimization that uses link tags and headers to avoid round trips for cache validation in many cases. ... Clarifying: essentially that's a workaround for resources for which the actual cache information returned by HTTP GET isn't accurate, right? Which of course leads to the question: if the maintainers of a site can't get their cache information right, what makes you think they can get their HTML right instead? Best regards, Julian
Re: [whatwg] Proposal: add attributes etags last-modified to link element.
On 19.09.2010 20:47, Robert O'Callahan wrote: 2010/9/19 Julian Reschke julian.resc...@gmx.de mailto:julian.resc...@gmx.de On 15.09.2010 19:45, Gavin Peters (蓋文彼德斯) wrote: Hi, I'm working on link tags inside of chrome. We're now experimenting with an optimization that uses link tags and headers to avoid round trips for cache validation in many cases. ... Clarifying: essentially that's a workaround for resources for which the actual cache information returned by HTTP GET isn't accurate, right? Which of course leads to the question: if the maintainers of a site can't get their cache information right, what makes you think they can get their HTML right instead? No, it's a performance optimization. I presume that if the link attributes indicate that the browser's cached resource is valid, the browser does not issue a network request to validate the resource. :-) So it's a workaround that causes a performance optimization. It wouldn't be necessary if the linked resource would have the right caching information in the first place. So again: what makes you think they can get their HTML right instead? Best regards, Julian
Re: [whatwg] Video with MIME type application/octet-stream
On 13.09.2010 23:51, Aryeh Gregor wrote: ... And for heavens sake, do not specify any sniffing as official. Instead, explicitly specify all sniffing as UA specific and possibly suggest that UAs should inform the user that content is broken and the current rendering is best effort if any sniffing is required. This is totally incompatible with the compelling interoperability and security benefits of all browsers using the exact same sniffing algorithm. ... Again, there's more than browsers. And even for video in browsers, the actual component playing the video may not be part of the browser at all. So there's *much* more that would need to implement the exact same sniffing. Has anybody talked to the people responsible for VLC, Windows Media Player, and Quicktime? Best regards, Julian
Re: [whatwg] Video with MIME type application/octet-stream
On 07.09.2010 22:00, Boris Zbarsky wrote: ... * If a file in a top-level browsing context is sniffed as video but then some kind of error is returned before the video plays the first frame, fall back to allowing the user to download it, or whatever the usual action would be if no sniffing had occurred. This might be pretty difficult to implement, since the video decoder might consume arbitrary amounts of data before saying that there was an error. ... It's not that hard if it's acceptable to restart the network request (just do it again, with a flag not-to-sniff). Best regards, Julian
Re: [whatwg] Video with MIME type application/octet-stream
On 07.09.2010 11:51, And Clover wrote: On 09/07/2010 03:56 AM, Boris Zbarsky wrote: P.S. Sniffing is harder than you seem to think. It really is... Quite. It surprises and saddens me that anyone wants to argue for *more* sniffing, and even enshrining it in a web standard. +1 Sniffing is a perpetual disaster that, after several security-sensitive problems, web browsers have been moving to deprecate/mitigate. If browsers want to guess types when no Content-Type is specified(*) then fine, but there is no good reason to ignore an explicitly-set type. I don't want my `application/octet-stream` file download service to be repurposeable as a video player for some other party! Hmm, that's what Content-Disposition: attachment is for... ... Best regards, Julian
Re: [whatwg] Video with MIME type application/octet-stream
On 07.09.2010 12:52, Philip Jägenstedt wrote: ... IE9, Safari and Chrome ignore Content-Type in a video context and rely on sniffing. If you want Content-Type to be respected, convince the developers of those 3 browsers to change. If not, it's quite inevitable that Opera and Firefox will eventually have to follow. ... We have heard that Safari sniffs for compatibility with content previously consumed by Quicktime, and that IE9 may sniff because they (currently) can't pass the content-type to the decoding machinery (or something like that). So you really would have to standardize sniffing in the browsers, but also in the components they delegate video display to. Good luck with that. Best regards, Julian
Re: [whatwg] Video with MIME type application/octet-stream
On 01.09.2010 10:12, Philip Jägenstedt wrote: ... If we start ignoring the Content-Type I expect we would also add sniffing so that opening a video served with the wrong (or missing) Content-Type still works in a top-level browsing context, as it does for images (I think). ... Sniffing in the *absence* of a content type is fine. The interesting question is what to do when it's present, but wrong. Best regards, Julian
Re: [whatwg] Video with MIME type application/octet-stream
On 01.09.2010 16:23, Philip Jägenstedt wrote: ... Huh, I guessed incorrectly, neither serving a PNG as text/plain or text/html makes it be sniffed and rendered in a top-level browsing context in Opera. However, both work in IE8. ... Please don't say work when talking about something that's not supposed to happen...
Re: [whatwg] Video with MIME type application/octet-stream
On 01.09.2010 15:13, Brian Campbell wrote: On Aug 31, 2010, at 9:40 AM, Boris Zbarsky wrote: On 8/31/10 3:36 AM, Ian Hickson wrote: You might say Hey, but aren't you content sniffing then to find the codecs and you'd be right. But in this case we're respecting the MIME type sent by the server - it tells the browser to whatever level of detail it wants (including codecs if needed) what type it is sending. If the server sends 'text/plain' or 'video/x-matroska' I wouldn't expect a browsers to sniff it for Ogg content. The Microsoft guys responded to my suggestion that they might want to implement something like this with what's the benefit of doing that?. One obvious benefit is that videos with the wrong type will not work, and hence videos will be sent with the right type. What makes you say this? Even if they are sent with the right type initially, the correct types are at high risk of bitrotting. The big problem with MIME types is that they don't stick to files very well. So, while someone might get them working when they initially use video, if they move to a different web server, or upgrade their server, or someone mirrors their video, or any of a number of other things, they might lose the proper association of files and MIME types. ... That's true, and the reason why people still use file extensions. That's not super elegant, but it works. Best regards, Julian
Re: [whatwg] Video with MIME type application/octet-stream
On 31.08.2010 09:36, Ian Hickson wrote: From http://greenbytes.de/tech/webdav/rfc2046.html#rfc.section.1: Parameters are modifiers of the media subtype, and as such do not fundamentally affect the nature of the content. The set of meaningful parameters depends on the media type and subtype. Most parameters are associated with a single specific subtype. However, a given top-level media type may define parameters which are applicable to any subtype of that type. Parameters may be required by their defining media type or subtype or they may be optional. MIME implementations must also ignore any parameters whose names they do not recognize. So, as codecs is not defined on application/octet-stream, the parameter simply should be ignored, thus the advice [...]: The MIME type application/octet-stream with no parameters is never a type that the user agent knows it cannot render. User agents must treat that type as equivalent to the lack of any explicit Content-Type metadata when it is used to label a potential media resource. Note: In the absence of a specification to the contrary, the MIME type application/octet-stream when used with parameters, e.g. application/octet-stream;codecs=theora, is a type that the user agent knows it cannot render. is incorrect, because it requires handling application/octet-stream and application/octet-stream;codecs=theora differently. That's not incorrect. The type with no parameters is a special case that corresponds to a common configuration default. The case with parameters is not that case, and represents likely intentional configuration and thus clearly not a video format the UA supports. My point is that it's incorrect to make this distinction, and that it's furthermore misleading to mention the codecs parameter in the context of a type that doesn't define it. It's also not clear whether the note applies to all parameters or just codecs. The normative text you quote doesn't mention any specific parameters.
In which case it would be a *bit* clearer if the note used a parameter that doesn't suggest that codecs has any meaning on a/o. Regarding codecs= in particular, it's an implementation reality that user agents that support it are likely to support it regardless of the type, so there's really no point trying to maintain an artificial boundary of which types it has semantics for and which it doesn't. David Singer pointed out in http://www.w3.org/Bugs/Public/show_bug.cgi?id=10202#c11 that this is the wrong thing to do. Do you have any evidence that UAs already use codecs on types on which they aren't defined, *and*, if this is the case, they can't be changed anymore? Best regards, Julian
Re: [whatwg] Video with MIME type application/octet-stream
On 31.08.2010 15:57, Anne van Kesteren wrote: ... Another is that when you save the video to disk the browser will fix up the extension correctly, if needed. If you sniff you can fix it up correctly too. ... Then let's hope that sniffing doesn't recognize Windows binaries. Best regards, Julian
Re: [whatwg] HTML6 Doctype
On 29.08.2010 05:15, David John Burrowes wrote: Hello all, I wanted to chime in on this discussion. Let me say up front that clearly the w3c and the browser vendors all are on the same page as you, Ian. I'm not in the position to be challenging your collective wisdom! ... With respect to the W3C, that's far from clear. Best regards, Julian
Re: [whatwg] base64 entities
On 27.08.2010 00:45, Adam Barth wrote: ... Escaping just those characters is insufficient. The appeal of this approach is that authors don't need the right blacklist of dangerous characters. By the way, there are already folks doing something similar manually now. They send the untrusted bytes as base64 and decode them using JavaScript. That sounds like a good idea which doesn't have the deployment problem. ... On Thu, Aug 26, 2010 at 1:30 PM, Julian Reschke julian.resc...@gmx.de wrote: I now get the point about the additional problems in script, but I fail to see how the proposal addresses this, unless expanding these entities is supposed to happen *after* parsing the script. Yes. That's precisely what happens. Ok. To be clear: the same applies to HTML entities in text/html, but not for XML entities in application/xhtml+xml (because of the different handling of script content). So, what's the implication for XHTML? Best regards, Julian
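The manual technique mentioned above (send the untrusted bytes as base64, decode them in JavaScript) could look roughly like this on the server side. This is a minimal Python sketch: embed_untrusted and the payload variable are hypothetical names, and atob is the browser function assumed to do the decoding.

```python
import base64


def embed_untrusted(data: bytes) -> str:
    """Return a script statement carrying `data` as a base64 string literal."""
    encoded = base64.b64encode(data).decode("ascii")
    # The standard base64 alphabet is A-Z, a-z, 0-9, '+', '/' and '=',
    # so the literal cannot contain '<', '&', quotes or backslashes.
    return 'var payload = atob("' + encoded + '");'
```

Because nothing in the encoded form is special to the HTML or script parser, no blacklist of dangerous characters is needed; the cost is that the script has to decode the value, and atob yields a byte string that still needs character decoding.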
Re: [whatwg] Validator.nu Bug: Error: XHTML element noscript not allowed as child of XHTML element head in this context.
On 27.08.2010 12:32, Hugh Guiney wrote: Ah, thanks. I guess the error is just confusing then in that it calls it XHTML element noscript, which led me to think that it was indeed part of XHTML. I think some indication otherwise might prove beneficial to users. But, I thought XHTML5 was just an XML serialization of HTML5, so why is this the case? I just read the rationale behind it, but despite not being best practice shouldn't it be at the very least allowed? ... The HTML WG is currently discussing whether it should be deprecated (in HTML), see http://www.w3.org/Bugs/Public/show_bug.cgi?id=10068. If the outcome of this is that there are good use cases for noscript, I'd expect that it will also be allowed in XHTML. Best regards, Julian
Re: [whatwg] base64 entities
On 25.08.2010 22:50, Adam Barth wrote: == Summary == ... Not convinced. There's already one way to escape these things, and this is supported in all UAs. I don't see how adding another mechanism will help those who can't use the first one properly. For instance, people unable to escape <, > and & are likely also unable to get the UTF-8 conversion right. Best regards, Julian
Re: [whatwg] base64 entities
On 26.08.2010 22:10, Aryeh Gregor wrote: On Thu, Aug 26, 2010 at 5:58 AM, Julian Reschke julian.resc...@gmx.de wrote: Not convinced. There's already one way to escape these things, and this is supported in all UAs. Adam gave two examples of cases where htmlspecialchars() is insufficient, even if authors do use it. This proposal is completely general and will work anywhere, even in script. Is automated general escaping even possible right now in script for text/html? I have to admit that I'm not sure what's special about script here. Are you saying that it's insufficient to escape all characters that have a special meaning there? Server-wise, how is introducing a new escape mechanism any better than fixing the support code for the existing mechanism? Best regards, Julian
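One difference that can be shown in a few lines: in text/html, character references are not expanded inside a script element, so entity-escaped data reaches the script verbatim. The Python sketch below uses html.escape as a stand-in for htmlspecialchars() and the standard library's html.parser to expose the text each element receives; TextCollector and texts are hypothetical helpers, and the sketch models only this one aspect of the problem.

```python
from html import escape
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the text content seen inside each element, keyed by tag name."""

    def __init__(self):
        super().__init__()  # convert_charrefs defaults to True
        self.current = None
        self.text = {}

    def handle_starttag(self, tag, attrs):
        self.current = tag

    def handle_endtag(self, tag):
        self.current = None

    def handle_data(self, data):
        if self.current:
            self.text[self.current] = self.text.get(self.current, "") + data


def texts(markup):
    """Parse markup and return {tag: text} for the elements it contains."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return collector.text


untrusted = "a & b"
markup = (
    "<p>" + escape(untrusted) + "</p>"
    '<script>var s = "' + escape(untrusted) + '";</script>'
)
result = texts(markup)
```

Here result["p"] is the original string again, while result["script"] still contains the literal &amp;, so the same escaping that round-trips in ordinary content changes the data when it is placed in script.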