Re: [whatwg] Parsing of meta refresh needs tweaking

2015-01-06 Thread Julian Reschke

On 2014-12-11 09:09, Simon Pieters wrote:

The spec's parsing rules for meta refresh cause infinite reloading on
some pages. In particular, the spec requires the url= to be present,
but there are pages that omit it. IE9 also requires url= apparently.
Gecko/Blink/WebKit allow url= to be omitted.

For example, there is http://www.only-for-winners.com/ which has

<meta http-equiv="refresh"
content="0;http://www.aldanitinetwork.com" />

Clearly this is intended to redirect, not reload the current page after
0 seconds.


SELECT page, COUNT(*) AS num
FROM [httparchive:runs.2014_08_15_requests_body]
WHERE page = url
AND mimeType CONTAINS "html"
AND REGEXP_MATCH(LOWER(body),
r"<meta\s+[^>]*http-equiv\s*=\s*[\"']?refresh")
AND REGEXP_MATCH(LOWER(body),
r"<meta\s+[^>]*content\s*=\s*[\"']?\s*\d+\s*;\s*[^\"']")
AND NOT REGEXP_MATCH(LOWER(body),
r"<meta\s+[^>]*content\s*=\s*[\"']?\s*\d+\s*;\s*url=")
GROUP BY page

23 rows.

I also noticed that Gecko allows the number to be omitted. I only found
one page doing that and it was using meta http-equiv=refresh
content=;URL= so it seems we can fail parsing for that case.
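
For illustration, the lenient behaviour described above (a URL accepted with or without the url= prefix, and parsing failing when the number is missing) might look roughly like the Python sketch below. It is not the algorithm from the spec or from any browser; parse_refresh is a hypothetical helper name, and only ";" is treated as a separator here.

```python
import re

# Hypothetical, simplified parser for the content attribute of
# <meta http-equiv=refresh>. Not the algorithm from the HTML spec.
_REFRESH = re.compile(
    r"""^\s*(\d+)\s*        # delay in seconds (required)
        (?:;\s*             # separator
           (?:url\s*=\s*)?  # optional url= prefix
           (.*?)            # target URL, possibly empty
        )?\s*$""",
    re.IGNORECASE | re.VERBOSE,
)

def parse_refresh(content):
    """Return (delay, url_or_None), or None if the value cannot be parsed."""
    m = _REFRESH.match(content)
    if m is None:
        return None  # e.g. the number is missing, as in ";URL="
    delay = int(m.group(1))
    url = (m.group(2) or "").strip("'\" ")
    return delay, (url or None)
```

With this sketch, a value without url= redirects to the given address, while a value with no URL at all just reloads after the delay.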



I hear (a) these pages have been broken in IE for a long time, and (b) 
only 23 (?) pages in your DB are found.


So why not just leave them broken?

Best regards, Julian



Re: [whatwg] Parsing of meta refresh needs tweaking

2015-01-06 Thread Julian Reschke

On 2015-01-07 08:52, Simon Pieters wrote:

 ...

I hear (a) these pages have been broken in IE for a long time, and (b)
only 23 (?) pages in your DB are found.


Right.


So why not just leave them broken?


It's a worse user experience and it's a shorter path to interop to
change IE.
...


User experience for invalid content is one aspect; sane parsing rules 
are another one. Not requiring the parameter name will make it harder to 
introduce new parameters in the future.


YMMV.

Best regards, Julian


[whatwg] alternate ids for elements

2014-12-03 Thread Julian Reschke

Hi there,

I have a use case where a certain location in a document can have two 
anchors (or even more). For instance, in a spec, the author may have 
specified an anchor, but a section-number based anchor is required as well.


Right now I can address this by inserting an additional div element, 
but this is kind of ugly, and doesn't scale well.


How about a new attribute alt-ids which would take a space-separated 
list of additional anchors?


Best regards, Julian


Re: [whatwg] alternate ids for elements

2014-12-03 Thread Julian Reschke

On 2014-12-03 15:02, Jukka K. Korpela wrote:

2014-12-03, 15:49, Julian Reschke wrote:


I have a use case where a certain location in a document can have two
anchors (or even more). For instance, in a spec, the author may have
specified an anchor, but a section-number based anchor is required as
well.


Can you elaborate on that? Why cannot you use the same id attribute
value in all references to an element?


1.) An author-supplied anchor may change, but you want to preserve 
existing deep links from other documents.


2.) You may want to support anchors based on section numbers which will 
allow other parties to link to a specific section of the document while 
only knowing the section number and a template (think references to 
section numbers in RFCs over on tools.ietf.org).



How about a new attribute alt-ids which would take a space-separated
list of additional anchors?


What would be the use of such additional identifiers?


See above. Essentially aliases for anchors.


The only thing I can imagine right now is a situation where you have an
existing id attribute and references to it all around but now need to
refer from a context that imposes its own restrictions on the syntax.
Say, you have id="παράδειγμα" and you need to refer to the element using
a URL like http://example.com/foo.html#παράδειγμα but cannot because
the URL needs to be used in an environment where Greek letters cannot be
used. But this sounds like a rather rare occasion.


It's yet another use case that could be addressed that way.

Best regards, Julian



Re: [whatwg] Proposal to add website-* meta extensions

2014-07-16 Thread Julian Reschke

On 2014-07-16 11:31, Arpita Bahuguna wrote:

Hi Julian,

Thank-you for your views.

Are you suggesting that we instead introduce a new link relation
(perhaps contact) with tel:/mailto: types for specifying these parameters?

This would involve spec modification. Not sure how developers or browser
vendors take to it.


Why would it involve a spec modification?


Was looking for a simpler solution that's quickly implementable.


Why is meta any simpler than link???



Re: [whatwg] Proposal to add website-* meta extensions

2014-07-16 Thread Julian Reschke

On 2014-07-16 12:01, Arpita Bahuguna wrote:

Hi Julian,

Please find my comments inline:

-----Original Message-----
From: whatwg [mailto:whatwg-boun...@lists.whatwg.org] On Behalf Of Julian Reschke
Sent: Wednesday, July 16, 2014 3:10 PM
To: Arpita Bahuguna
Cc: wha...@whatwg.org; Arpita Bahuguna
Subject: Re: [whatwg] Proposal to add website-* meta extensions

On 2014-07-16 11:31, Arpita Bahuguna wrote:

Hi Julian,

Thank-you for your views.

Are you suggesting that we instead introduce a new link relation
(perhaps contact) with tel:/mailto: types for specifying these parameters?

This would involve spec modification. Not sure how developers or
browser vendors take to it.


Why would it involve a spec modification?

Currently the link types defined by the specification are: 
http://www.whatwg.org/specs/web-apps/current-work/#linkTypes

Please correct me if I am wrong but I suspect introducing a new rel type would 
involve modifying the spec as well.


No. There's a Wiki for it.


Adding new meta extensions however would not have any such overhead as far as I 
know.



Was looking for a simpler solution that's quickly implementable.


Why is meta any simpler than link???






Re: [whatwg] Proposal to add website-* meta extensions

2014-07-15 Thread Julian Reschke

On 2014-07-15 12:11, Arpita Bahuguna wrote:

Hi,

I would like to propose addition of the following three meta extensions:
website-mail, website-number and website-address.

Please find below a detailed description for each.


Overview:

The website-mail meta extension defines a suggested e-mail ID, such as the
customer support mail ID, specified by the vendor.

The website-number meta extension defines a proposed phone number, such as
the customer support number, specified by the vendor.

The website-address meta extension defines a given address (or geolocation
tag), such as the vendor's office address or billing address.



UAs displaying a page containing any or all of these meta extensions could
then make this information directly available for the user's perusal.



Oftentimes visitors have to hunt through a vendor's site to obtain the
customer support mail ID, phone number etc., which is mostly hidden behind a
not so prominently displayed Help or Contact Us link.

Vendors specifying their registered mail ID, phone number and/or their
address via these meta extensions can thus expect supporting UAs to present
this information to the user in an easily accessible format, either by way
of a browser menu option (such as mail, call, map) or via the URL bar
scheme handler or in another similar format.

Selecting these menu options, if available, should launch the default mail
application with the specified mail ID, the dialer application with the
given contact number or, the default maps application loaded with the
specified address/location tag respectively.

No known existing meta extensions with a similar name/intention exist.

Syntax:

<meta name="website-mail" content="a...@xyz.com">


This should be a link relation, using by default a mailto: URI.


The content attribute for the website-mail meta extension can take any valid
email ID.

<meta name="website-number" content="+1-555-555-">


This should have a better name, and also be a link relation, using a 
tel: URI.



The content attribute for the website-number meta extension can take any
valid phone number.

<meta name="website-address" content="Jane Doe, 5844 South Oak Street,
Chicago, Illinois 60667"> or

<meta name="website-address" content="20.593684;78.96288">

The content attribute for the website-address meta extension can be any
string or latitude and longitude separated by a semi-colon.

Note:

- In case multiple instances are found of the same meta extension,
the last specified one should take precedence.
...


In general, a single link to a URI having contact information seems to 
be much simpler to me...


Best regards, Julian


Re: [whatwg] HTTP status code from JavaScript

2014-05-23 Thread Julian Reschke

On 2014-05-23 06:53, Michael Heuberger wrote:

Hi James

Single page apps!

These become more and more popular with frameworks like RactiveJS or
AngularJS. There the first request is a HTTP request, for any subsequent
requests an AJAX one is generated. The problem is the first HTTP


AJAX requests are HTTP requests.

I assume you mean the distinction between page navigation and using 
XMLHttpRequest?



request. The framework is unable to detect 404s with the first request
because the status code cannot be obtained via JavaScript, hence a
second request is made.


If the initial page load yields a 404 will there be any scripts to 
execute at all?



In my eyes, a waste of bandwidth.

Cheers
Michael


Best regards, Julian



Re: [whatwg] Zip archives as first-class citizens

2013-09-16 Thread Julian Reschke

On 2013-09-13 12:32, Robin Berjon wrote:

On 29/08/2013 15:58 , Simon Pieters wrote:

On Thu, 29 Aug 2013 15:02:48 +0200, Anne van Kesteren ann...@annevk.nl
wrote:

On Thu, Aug 29, 2013 at 1:19 PM, Jake Archibald
jaffathec...@gmail.com wrote:

Causing a network error in existing browsers is a shame.


It seems to fail to resolve in IE10. It works in
Gecko/WebKit/Blink/Presto: the %! is requested literally. However, both
Apache and IIS seem to return 400 Bad Request.


That's not exactly promising.
...


Because it's an invalid URI. % needs to be percent-escaped.
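
A small Python illustration of that point, using the standard urllib.parse module: a literal "%" has to be written as "%25" to be valid in a URI. Keeping "!" unescaped via the safe parameter is a choice made for this sketch only.

```python
from urllib.parse import quote, unquote

raw = "%!"                       # the literal characters discussed above
escaped = quote(raw, safe="!")   # "%" is escaped, "!" is kept as-is

print(escaped)                   # the percent-escaped form
print(unquote(escaped) == raw)   # decoding restores the original characters
```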

Best regards, Julian



Re: [whatwg] Request: Implementing a Geo Location URI Scheme

2013-06-05 Thread Julian Reschke

On 2013-06-05 00:25, Rodrigo Polo wrote:

I really don't want to fight over any issue, I, as a user, want to share
with you the current state on this topic and (as I said on the letter) with
a friendly open letter in the pursuit to make a polite request to make the
life of millions easier give you some of the reasons why I think it should
be implemented ASAP.

I already checked these proposed specs:
http://tools.ietf.org/html/rfc5870
http://www.iana.org/assignments/uri-schemes/uri-schemes.xml
http://www.w3.org/wiki/UriSchemes

But my experience waiting for many browser implementations tells me the
process is slow and it looks like it works by the interest of each brand. I
really feel sorry for the Web SQL Database spec that was later removed, and I
really hope the geo URI scheme can be implemented.

I know about registerProtocolHandler, but it doesn't work exactly as
proposed: the geo protocol isn't accepted in Chrome, only protocols with
the web+ prefix, and the URL parameter has to match the web page that makes
the request. It is designed for websites, not for local apps. All these
conclusions come from the tests I have done with the latest betas of Chrome
and Firefox:

https://developer.mozilla.org/en-US/docs/Web/API/navigator.registerProtocolHandler
http://updates.html5rocks.com/2012/02/Getting-Gmail-to-handle-all-mailto-links-with-registerProtocolHandler

Again, thanks for your attention and help.
...


Not sure what kind of browser support you are looking for.

If you want geo URIs to invoke a local mapping application, all you 
need is to install a URI handler for that scheme and that application 
in the *operating system*. This is how things like mailto: have been 
working for two decades now.
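
As a rough sketch of what such a handler application would do once the operating system hands it a geo: URI, the following Python function extracts the coordinates. It assumes the basic "geo:lat,lon[,alt][;params]" shape from RFC 5870, ignores all parameters, and parse_geo_uri is a hypothetical name rather than part of any existing library.

```python
def parse_geo_uri(uri):
    """Return (lat, lon, alt_or_None) from a geo: URI; parameters are ignored."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "geo":
        raise ValueError("not a geo: URI")
    coords = rest.split(";", 1)[0]  # drop parameters such as ;u= or ;crs=
    parts = [float(p) for p in coords.split(",")]
    if len(parts) not in (2, 3):
        raise ValueError("expected lat,lon or lat,lon,alt")
    alt = parts[2] if len(parts) == 3 else None
    return parts[0], parts[1], alt
```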


Best regards, Julian


Re: [whatwg] Request: Implementing a Geo Location URI Scheme

2013-06-05 Thread Julian Reschke

On 2013-06-05 13:27, Rodrigo Polo wrote:

Hi, well, the kind of support I think should be implemented is
actually something that should be a standard: any anchor that has a
mailto: inside is supported out of the box in any web browser, and the
first time it is clicked the web browser asks for the default app to
open that link.


At least on Windows, mailto: is supported by a URI handler in the 
operating system, and the browser just delegates to it. You can install 
new URI handlers (think skype: and callto:). The browser doesn't need to 
have any special support (except for asking the OS for advice).


YMMV.


The geo URI handler is not supported by default out of the box and it
should, for the sake of the user experience, to make it work it is
required for everyone in the web browser development community to join
forces with the maps application developers, the


It's the mapping application that needs to support it, not the browser. 
The browser will support whatever the system it is running on supports.



...


Best regards, Julian



Re: [whatwg] Request: Implementing a Geo Location URI Scheme

2013-06-05 Thread Julian Reschke

On 2013-06-05 14:00, Rodrigo Polo wrote:

You are completely right, but in the tests I made on Chrome the geo
URI handler can't be used with the registerProtocolHandler call,
it throws a security error and the use of geo location URI it is not
included as a recommendation or good practice when we talk about the
markup, so it is not a technical thing, it is more an idea that could be
included in further discussions between web browsers developers, map app
developers and the users so everyone adopts the idea of having the geo
URI scheme adopted as a standard. I'm quite sure this idea can help a
lot of users and web developers to give a better user experience, and it
is more important than many other things; it will make the life of users
a lot easier.
...


You don't *need* registerProtocolHandler to support geo:. Just install 
an OS-level application that handles geo: and you are done.


That being said: I agree that geo: should be added to the white list 
so that browser-based handlers for geo: become possible.


Best regards, Julian


Re: [whatwg] Priority between a download and content-disposition

2013-03-17 Thread Julian Reschke

On 2013-03-17 02:49, Jonas Sicking wrote:

It's currently unclear what to do if a page contains markup like <a
href="page.txt" download="A.txt"> if the resource at page.txt
responds with either

1) Content-Disposition: inline
2) Content-Disposition: inline; filename=B.txt
3) Content-Disposition: attachment; filename=B.txt

People generally seem to have a harder time with getting header data
right, than getting markup right, and so I think that in all cases we
should display the save as dialog (or display equivalent download
UI) and suggest the filename A.txt.


I agree that people have problems getting headers right, but in all the 
cases above, it seems they have set the header on purpose, no?


My recollection was that a/@download was mainly added for cases where 
the header field couldn't be set at all...


 ...

Best regards, Julian


Re: [whatwg] URL standard: Query string parsing; host parsing

2013-03-14 Thread Julian Reschke

On 2013-03-13 21:24, Boris Zbarsky wrote:

On 3/13/13 4:23 PM, Julian Reschke wrote:

Under RFC 3986, it would resolve to

   jar:http://example.com/Bar.class


If you assume that this is a hierarchical scheme and that the hierarchy
is in some particular place, no?  Why is that assumption being made?


No such assumption was made. Just following the algorithm in the spec.


Looks like a broken scheme to me.


I'm not going to try to claim jar: is a wonderful thing.  It is what it
is.  It needs to not break.


Is it used outside Java applet scenarios?

BTW: this shows why formal registration and review of URI schemes is a 
*feature*.


Best regards, Julian



Re: [whatwg] URL standard: Query string parsing; host parsing

2013-03-13 Thread Julian Reschke

On 2013-03-13 18:38, pocci...@gmail.com wrote:

(This was originally a bug report, but I was told to e-mail instead.  Another
issue is also added.)

-- Non-relative URLs in the query string --

Earlier I posted an issue with serializing the query in non-relative URLs. But
after I read more about URIs, I am not sure whether the scheme data and query
string should be kept separate.  There is a distinction between how the URL
specification categorizes URLs and how the URI standards (RFC3986 and RFC3987)
classify URIs.

Both standards allow fragments to appear in all URLs/URIs, but they differ on
whether a query string is parsed.  In the URL standard, query strings can occur
in all URLs, but in the URI standards, a query string is not parsed if the URI
contains a scheme but the scheme data doesn't begin with a slash (that is, if
the URI is an opaque URI).

Take the following as an example:

mailto:m...@example.com?subject=Hi

In the URL standard, the URL is parsed as:

scheme - mailto
scheme data - m...@example.com
query - subject=Hi

but in the URI standards, the URI is parsed as:

scheme - mailto
scheme-specific part - m...@example.com?subject=Hi

Here, in the mailto scheme, separating the scheme data and the query may be a 
useful distinction.

As another example, the string

jar:http://example.com/jar?x=1!/com/example/Foo.class

is parsed in the URI standards as:

scheme - jar
scheme-specific part - http://example.com/jar?x=1!/com/example/Foo.class


I have no idea what you're talking about, see 
http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.3.


This will parse into:

scheme: jar
hier-part: http://example.com/jar
query: x=1!/com/example/Foo.class
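
The parse above can be reproduced with the regular expression given in RFC 3986, Appendix B. The Python wrapper below is only a sketch; split_uri is a hypothetical helper name, and the regex splits components without validating them.

```python
import re

# Regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def split_uri(uri):
    """Split a URI reference into its five generic components."""
    m = URI_RE.match(uri)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),  # None when there is no "//" after the scheme
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

parts = split_uri("jar:http://example.com/jar?x=1!/com/example/Foo.class")
print(parts["scheme"], parts["path"], parts["query"])
```

Note that the "http://example.com/jar" part ends up in the path component, because the "//" does not directly follow the "jar:" scheme.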


but in the URL standard as:

scheme - jar
scheme data - http://example.com/jar
query - x=1!/com/example/Foo.class
...


Best regards, Julian


Re: [whatwg] URL standard: Query string parsing; host parsing

2013-03-13 Thread Julian Reschke

On 2013-03-13 21:14, Boris Zbarsky wrote:

On 3/13/13 4:02 PM, Julian Reschke wrote:

On 2013-03-13 18:38, pocci...@gmail.com wrote:

jar:http://example.com/jar?x=1!/com/example/Foo.class

is parsed in the URI standards as:

scheme - jar
scheme-specific part - http://example.com/jar?x=1!/com/example/Foo.class


I have no idea what you're talking about, see
http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.3.

This will parse into:

scheme: jar
hier-part: http://example.com/jar
query: x=1!/com/example/Foo.class


I should note that jar: URIs are ... special.

For example, given a base of

   jar:http://example.com/jar?x=1!/com/example/Foo.class

the relative URI Bar.class should, as far as I know, resolve to:

   jar:http://example.com/jar?x=1!/com/example/Bar.class

What that means for parsing them, I cannot say...


Under RFC 3986, it would resolve to

  jar:http://example.com/Bar.class

Looks like a broken scheme to me.
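
To see where that result comes from, here is a compact Python sketch of the reference resolution steps in RFC 3986, Sections 5.2.2 to 5.3 (strict mode). The function names are made up for this example, and it is meant as a reading aid rather than a complete, validated implementation.

```python
import re

# Regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def remove_dot_segments(path):
    # RFC 3986, Section 5.2.4
    out = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]
            if out:
                out.pop()
        elif path == "/..":
            path = "/"
            if out:
                out.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            idx = path.find("/", 1)
            if idx == -1:
                idx = len(path)
            out.append(path[:idx])
            path = path[idx:]
    return "".join(out)

def resolve(base, ref):
    # RFC 3986, Section 5.2.2 (strict) followed by recomposition (5.3).
    _, b_scheme, _, b_auth, b_path, _, b_query, _, _ = URI_RE.match(base).groups()
    _, r_scheme, _, r_auth, r_path, _, r_query, _, r_frag = URI_RE.match(ref).groups()
    if r_scheme is not None:
        scheme, auth, path, query = r_scheme, r_auth, remove_dot_segments(r_path), r_query
    elif r_auth is not None:
        scheme, auth, path, query = b_scheme, r_auth, remove_dot_segments(r_path), r_query
    elif r_path == "":
        scheme, auth, path = b_scheme, b_auth, b_path
        query = r_query if r_query is not None else b_query
    else:
        if r_path.startswith("/"):
            path = remove_dot_segments(r_path)
        elif b_auth is not None and b_path == "":
            path = remove_dot_segments("/" + r_path)
        else:
            # Merge: keep the base path up to and including its last "/".
            path = remove_dot_segments(b_path[: b_path.rfind("/") + 1] + r_path)
        scheme, auth, query = b_scheme, b_auth, r_query
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if auth is not None:
        result += "//" + auth
    result += path
    if query is not None:
        result += "?" + query
    if r_frag is not None:
        result += "#" + r_frag
    return result

print(resolve("jar:http://example.com/jar?x=1!/com/example/Foo.class", "Bar.class"))
```

The base path here is "http://example.com/jar", so the merge step replaces only the trailing "jar", and the base query (which holds everything after the "?") is dropped.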

Best regards, Julian



[whatwg] TZ database

2013-01-16 Thread Julian Reschke

On 2013-01-08 01:47, Ian Hickson wrote:

The next best choice would be to have datetime-with-timezone but
unfortunately

(1) Official database for all timezones does not exist
(2) Official timezone names (or labels) do not exist
(3) Timezones are subject to future political decisions

The problems (1) and (2) make transferring the timezone information from
the end user to the server very problematic and the problem (3) makes
any work to fix (1) and (2) a bit pointless. This is because even if UA
could successfully inform the server about the correct timezone, the
server could be using a week old timezone data that is not up to the
latest political events. Or the server might be using latest timezone
data but the UA could be using three year old data. In either case, the
absolute time in UTC could be different for the server and UA.


Indeed.


Sorry?

http://www.iana.org/time-zones addresses (1) and possibly (2), no?

Best regards, Julian





Re: [whatwg] [mimesniff] Sniffing archives

2012-12-03 Thread Julian Reschke

On 2012-11-29 20:25, Adam Barth wrote:

These are supported in Chrome.  That's what causes the download.  From


Can you elaborate about what you mean by supported? Chrome sniffs for 
the type, and then offers to download as a result of that sniffing? How 
is that different from not sniffing in the first place?



...your comment, it's not clear to me if you are correctly reverse
engineering existing user agents.  The techniques we used to create
this list originally are quite sophisticated and involved a massive
amount of data [1].  It would be a shame if you destroyed that work
because you didn't understand it.

Adam

[1] http://www.adambarth.com/papers/2009/barth-caballero-song.pdf
...


Understood; but on the other hand if there's a chance to simplify things 
then it makes sense to discuss this, even if that would involve changing 
some of the implementations.


Best regards, Julian


Re: [whatwg] [mimesniff] Sniffing archives

2012-12-03 Thread Julian Reschke

On 2012-12-04 08:40, Adam Barth wrote:

On Mon, Dec 3, 2012 at 12:39 PM, Julian Reschke julian.resc...@gmx.de wrote:

On 2012-11-29 20:25, Adam Barth wrote:

These are supported in Chrome.  That's what causes the download.  From


Can you elaborate about what you mean by supported? Chrome sniffs for the
type, and then offers to download as a result of that sniffing? How is that
different from not sniffing in the first place?


They might otherwise be treated as a type that can be displayed
(rather than downloaded).  Also, some user agents treat downloads of


Do you have an example for that case?


ZIP archives differently than other sorts of download (e.g., they
might offer to unzip them).


Out of curiosity: which?

Best regards, Julian



Re: [whatwg] [mimesniff] The X-Content-Type-Options header

2012-11-19 Thread Julian Reschke

On 2012-11-17 19:17, Adam Barth wrote:

...
I would prefer if the spec described what implementations actually do
rather than your opinion about what they should do.  To answer your
specific questions:
...


That works well if something is widely supported already. It works less 
well if you have one initial and one incomplete implementation only.



1) Don't bother dropping the X-.  Everyone who implements this
feature uses the X- and dropping it is just going to cause unnecessary
interoperability problems.


There's no *need* to drop it, but if research on this topic leads to the 
conclusion that the functionality is needed, but the current X- 
prototype isn't sufficient anyway it might be worth considering.



...


Best regards, Julian



Re: [whatwg] [mimesniff] The X-Content-Type-Options header

2012-11-19 Thread Julian Reschke

On 2012-11-19 19:27, Adam Barth wrote:

On Mon, Nov 19, 2012 at 10:17 AM, Julian Reschke julian.resc...@gmx.de wrote:

On 2012-11-17 19:17, Adam Barth wrote:

...

I would prefer if the spec described what implementations actually do
rather than your opinion about what they should do.  To answer your
specific questions:
...


That works well if something is widely supported already. It works less well
if you have one initial and one incomplete implementation only.


Which implementation is initial and which is incomplete?  AFAIK, both
IE and Chromium consider their implementation of this feature done.


initial - the one done first, and by the vendor that invented the 
functionality


incomplete - the one that copies one part and not the other part of 
the behavior of the initial implementation



...



1) Don't bother dropping the X-.  Everyone who implements this

feature uses the X- and dropping it is just going to cause unnecessary
interoperability problems.


There's no *need* to drop it, but if research on this topic leads to the
conclusion that the functionality is needed, but the current X- prototype
isn't sufficient anyway it might be worth considering.


Currently, I don't see a use case for dropping the X- prefix.  Perhaps
there's one I don't understand?


A use case for *renaming* (which might be more than dropping the prefix) 
actually would be saving bytes on the wire. Another one would be to make 
it possible to make incompatible changes to the field value syntax, when 
needed.


Best regards, Julian


Re: [whatwg] Meta bugreport proposal

2012-10-31 Thread Julian Reschke

On 2012-10-31 10:21, Nicolas Froidure wrote:

 Hi,

 I think we need a specification to allow users to report website
bugs from their browser. That's why I think it could be useful to add a
meta markup like this:
<meta name="bugreport" content="(uri)" />


link, not meta.


The URI could be:
- mailto: to send a report by mail (ex: mailto:webmas...@example.org)
- http: to send the bug report as a simple HTTP POST request (ex:
http://example.org/bugreport).
- bug: something more customizable to allow webmasters to fit bug
reports with their systems (ex:
bug:http?uri=/bug.dat&method=POST&captcha=/captcha.jpg )


What's the use case for this? Do you want to automate bug submission? 
What for?



...


Best regards, Julian



Re: [whatwg] checksum attribute in a href tag

2012-10-25 Thread Julian Reschke

On 2012-10-19 14:01, Nils Dagsson Moskopp wrote:

A. Rauschenbach rauschenb...@annuo.de schrieb am Fri, 19 Oct 2012
13:50:04 +0200:


I'm sick of copying the checksum of important files by hand or QR-code
to the download manager or console.

To solve the problem I suggest a checksum attribute in the <a href>
tag.


It seems that problem is solved at the HTTP level with RFC 1864:
http://tools.ietf.org/html/rfc1864


The latest spec defining Content-MD5 was RFC 2616. It will not be 
included in the revision of HTTP/1.1 because of broken interop for Range 
requests, and because of the weakness of MD5 (see 
http://trac.tools.ietf.org/wg/httpbis/trac/ticket/178 for context).


That being said, a new response header field that is well-defined with respect to 
partial responses and more flexible with respect to digest algorithms would be 
interesting.
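
Purely as an illustration of "more flexible with respect to digest algorithms": the check itself is easy to make algorithm-agnostic, as in the Python sketch below. Where the algorithm name and the expected digest would come from (a header field, an attribute) is left open here, and verify_digest is a hypothetical helper name.

```python
import hashlib

def verify_digest(data, algorithm, expected_hex):
    """Check a complete (not partial) body against a hex digest."""
    h = hashlib.new(algorithm)  # e.g. "sha256"; raises ValueError if unsupported
    h.update(data)
    return h.hexdigest() == expected_hex.lower()
```

This only covers a full response body; it says nothing about how a digest should relate to a Range response, which is the interop problem mentioned above.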


 ...

Best regards, Julian


Re: [whatwg] Proposal for improved handling of '#' inside of data URIs

2012-10-09 Thread Julian Reschke

On 2012-10-09 13:51, Anne van Kesteren wrote:

On Tue, Oct 9, 2012 at 1:50 AM, Ian Hickson i...@hixie.ch wrote:

On Sat, 10 Sep 2011, Daniel Holbert wrote:

I'm writing with a proposal to improve the handling of # in data URIs.
I'm particularly looking for feedback from other browser vendors, but of
course feedback from others is welcome as well. [...]


Anne has since tried to respec URL parsing in detail, with the work in
progress being here:

http://url.spec.whatwg.org/

I recommend checking that spec to see if it does what you want, and if
not, working with Anne to see if it can be adjusted accordingly or if
something else needs to happen.


This is not written down explicitly just yet, but for data URLs I
think we want the fragment to *not* be part of the actual resource,
but rather as an input to the resource so things like

data:text/html,<style>:target{background:lime}</style><p id=x>test#x

work. (Fails in Chrome, but works fine in Opera and Firefox already.)


Clarifying: that sounds like making it parse just like in any other URI 
(with which I would agree).
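
In other words, the fragment is split off at the first "#" before the rest is interpreted as the data URI's content. A minimal Python sketch of that reading follows; split_fragment is a hypothetical helper, and the sample URI is a shortened stand-in for the one quoted above.

```python
def split_fragment(uri):
    """Split a URI at the first '#', as for any other scheme."""
    base, sep, fragment = uri.partition("#")
    return base, (fragment if sep else None)

resource, fragment = split_fragment("data:text/html,<p id=x>test#x")
print(resource)   # the part that identifies the resource
print(fragment)   # the fragment, applied to the resource afterwards
```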


The test case at

  http://greenbytes.de/tech/tc/datauri/#svg

seems to imply that Opera doesn't do this right yet, though. (tested 
with 12.02)


Best regards, Julian



Re: [whatwg] Proposal for improved handling of '#' inside of data URIs

2012-10-09 Thread Julian Reschke

On 2012-10-09 17:33, Anne van Kesteren wrote:

On Tue, Oct 9, 2012 at 4:59 PM, Julian Reschke julian.resc...@gmx.de wrote:

The test case at

   http://greenbytes.de/tech/tc/datauri/#svg

seems to imply that Opera doesn't do this right yet, though. (tested with
12.02)


Yeah, for some reason Opera has different behavior when entering in
the address bar.


Indeed. Will update test comment.

Best regards, Julian



Re: [whatwg] New URL Standard

2012-09-21 Thread Julian Reschke

On 2012-09-21 17:16, Anne van Kesteren wrote:

I took a crack at defining URLs: http://url.spec.whatwg.org/

At the moment it defines parsing (minus domain names / IP addresses)
and the JavaScript API (minus the query manipulation methods proposed
by Adam Barth). It defines things like setting .pathname to hello
world (notice the space), it defines what happens if you resolve
http:test against a data URL (you get http://test/) or


As per RFC 3986, Section 5.2 (Relative Resolution), the answer IMHO is 
http:test.


Fetching from that URI indeed used http://test/ (just checked in 
Mozilla), so it appears we have a terminology problem. It would be good 
if we could avoid confusing relative reference resolution with what 
you try to define here.


Note that the term resolve is widely used for what RFC 3986 Section 
5.2 defines; see, for instance, 
http://docs.oracle.com/javase/1.4.2/docs/api/java/net/URI.html#resolve%28java.lang.String%29.


 ...


http://teehee (you get http://teehee/test). It is based on the
various URL code paths found in WebKit and Gecko and supports the \ as
/ in various places because it seemed better for compatibility.

I'm looking for some feedback/ideas on how to handle various aspects, e.g.:

* data URLs; in Gecko these appear to be parsed as part of the URL
layer, because they can turn a URL invalid. Other browsers do not do
this. Opinions? Should data URLs support .search?

 ...

I believe the behavior should be predictable and consistent no matter 
what the URI scheme is.


Best regards, Julian

PS: and no, I don't think URL Standard is a good name for this document.



Re: [whatwg] multipart/form-data filename encoding: unicode and special characters

2012-07-09 Thread Julian Reschke

On 2012-07-09 23:01, Ian Hickson wrote:

On Thu, 3 May 2012, Evan Jones wrote:

On May 3, 2012, at 17:09 , Anne van Kesteren wrote:


Yes. I think we should define multipart/form-data directly in HTML and
thereby obsolete http://tools.ietf.org/html/rfc2388 as it is outdated
and not maintained.


Right; that would be ideal. Despite the fact that HTML5 references that
RFC, browsers don't really follow it.

I would be interested in trying to help with this, but again I would
certainly need some guidance from people who know more about the
vagaries of how the various browsers encode their form parameters /
uploaded file names, and why things got that way. It probably would not
be helpful for me to try to draft an update to the spec without getting
the right implementers on board.


If this is still something for which you have some time available, then
the starting point for anything like this would be test cases, lots and
lots of test cases. In this case, it would have to be something like a
server that echoes the precise bytes sent by the client, for a huge
variety of different setups:

  - various submission encodings
  - various form field names and types
  - various file submission filenames

...etc.

I'd be happy to advise if this is something that still interests you.
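
A server of the kind described could be as small as the following Python sketch, built on the standard http.server module. It returns the request's Content-Type value and the exact body bytes so that browser submissions can be compared. The names echo_payload, EchoHandler and make_server are made up for this example, and the response layout is an arbitrary choice.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

def echo_payload(content_type, body):
    """Return the Content-Type header value plus the raw body, byte for byte."""
    return content_type.encode("latin-1") + b"\r\n\r\n" + body

class EchoHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the bytes the client sent; no decoding, no parsing.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        payload = echo_payload(self.headers.get("Content-Type", ""), body)
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

def make_server(port=0):
    # Port 0 lets the OS pick a free port; call serve_forever() to run it.
    return HTTPServer(("127.0.0.1", port), EchoHandler)
```

Pointing a form with enctype multipart/form-data at such a server and saving the responses per browser would give the raw material for the test cases.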


I agree with the methodology. However, I would suggest simply revising 
RFC 2388.


Best regards, Julian




Re: [whatwg] The pic element

2012-06-04 Thread Julian Reschke

On 2012-06-04 09:32, Kornel Lesiński wrote:

On Mon, 04 Jun 2012 01:05:23 -0500, Anselm Hannemann Web Development
i...@anselm-hannemann.com wrote:


An alternative is to pick different delimiters. See, for instance,
http://tools.ietf.org/html/rfc2295#section-8.3.


I also would like to see another delimiting syntax which is clearer.
What about JSON-syntax or just  | ?
I mean a backslash is not that common in a URL but commas are more and
more and you all know that escaping is no fun.
So we should really try to avoid this.


Another character could work in theory, but I wonder whether it would
work in practice.

For example meta name=viewport was documented to support only comma,
but thanks to silent error recovery authors ended up using and relying
on semicolon:
http://lists.w3.org/Archives/Public/www-style/2011Oct/0652.html

I wonder whether the reverse of it could happen with the list of sources, e.g.
an unexpected comma parsed as an invalid media query could end up delimiting
sources in some implementations, and then we'll end up with the worst of
both worlds (both an ambiguous comma and another unintuitive delimiter needed
for web-compat).


(1) Use a format where the delimiter is always the same, (2) where 
escaping is never needed, and (3) specify the parser to ignore all 
malformed attribute values.


Best regards, Julian



Re: [whatwg] The pic element

2012-06-01 Thread Julian Reschke

On 2012-06-01 20:24, Kornel Lesiński wrote:

...

If there are commas or backslashes in the URL they must be escaped with `\`.

This is another problem why I would separate the diff. srces.
Escaping an URL is not something that should be necessary in HTML I think.


I agree, it's ugly, but otherwise you get ambiguous syntax for entries without 
descriptor or media query.

I thought about specifying some magic, like ignoring trailing comma in URL, but 
all such magical solutions have surprising edge cases. Explicit escaping is at 
least easy to comprehend.
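
A rough sketch of what the explicit-escaping rule implies for a consumer, assuming only what is stated above (commas and backslashes inside a URL are written as `\,` and `\\`). The function name split_escaped and the sample value are hypothetical, and this is not the parser from any proposal text.

```python
def split_escaped(value):
    """Split on commas, treating backslash as an escape for the next character."""
    items, current, i = [], [], 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # Escaped character: keep it literally and skip the backslash.
            current.append(value[i + 1])
            i += 2
            continue
        if ch == ",":
            # Unescaped comma ends the current entry.
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    items.append("".join(current).strip())
    return items
```

With this, `split_escaped(r"a\,b.jpg 2x, c.jpg")` yields two entries, the first of which carries a literal comma in its URL.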
...


An alternative is to pick different delimiters. See, for instance, 
http://tools.ietf.org/html/rfc2295#section-8.3.


Best regards, Julian


Re: [whatwg] responsive images

2012-05-22 Thread Julian Reschke

On 2012-05-22 17:02, Glenn Maynard wrote:

(I wish people would stop starting new threads about the same topic.)

On Tue, May 22, 2012 at 5:53 AM, Paul Court p...@pmcnetworks.co.uk wrote:


As a HTML author and programmer, I just cannot see myself implementing the
current srcset proposal on sites. As a programmer, it has very much got
what we would call a bad code smell.

img src=face-600-...@1.jpeg alt= srcset=face-600-...@1.jpeg 600w
200h 1x, face-600-...@2.jpeg 600w 200h 2x, face-icon.png 200w 200h



Actually, it's pretty clean; you've just made it ugly by sticking it all on
one line.

img src=face-600-...@1.jpeg alt=
  srcset=face-600-...@1.jpeg 600w 200h 1x,
  face-600-...@2.jpeg 600w 200h 2x,
  face-icon.png   200w 200h

It's no uglier than CSS syntaxes like background.


It may not be uglier but it's much more fragile as the examples and the 
prose in the spec give the impression that you can use the "," to 
tokenize, which would be incorrect.
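
To make the failure mode concrete, here is a small sketch of the naive approach. The URLs are invented for illustration (they contain a comma, which is legal in a URI); naive_candidates is a hypothetical name.

```python
def naive_candidates(srcset_value):
    """Tokenize the way the examples suggest: split on every comma."""
    return [part.strip() for part in srcset_value.split(",")]


# Two intended candidates, each with a comma inside its URL.
value = "img,v1.jpeg 1x, img,v2.jpeg 2x"
pieces = naive_candidates(value)
```

The author meant two image candidates, but the comma split produces four pieces, none of which is a complete candidate.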



...


Best regards, Julian


Re: [whatwg] srcset javascript implementation (Respondu)

2012-05-21 Thread Julian Reschke

On 2012-05-21 04:21, David Clements wrote:

Hi guys,

Just to let you all know, I've written a javascript implementation of
srcset using a framework for responsive images (which I also wrote)
called Respondu (I'm open to new name suggestions), I'd love it if someone
could check that I've implemented srcset right.

Respondu manages to process the DOM without allowing any assets (contained
in the body) to load, it also gracefully degrades for non-js
browsers and is fairly unintrusive (it simply wraps the contents of the
body tags).

Check out the github page (feedback, pull requests, lunch money etc.
welcome)

https://github.com/davidmarkclements/Respondu
...


https://github.com/davidmarkclements/Respondu/blob/master/R.js#L243

This looks like you are splitting the attribute value by ","?

Best regards, Julian


Re: [whatwg] srcset javascript implementation (Respondu)

2012-05-21 Thread Julian Reschke

On 2012-05-21 09:36, huperekch...@googlemail.com wrote:

Hey Julian

I believe the attribute sets are delimited by comma, whereas each attribute 
itself is separated by space?


No. The URIs can contain a comma, so you can't use that delimiter. See 
the parsing definition in the spec.



...


(Please don't take this as a complaint about your code, but about the 
syntax of the attribute).



Best regards, Julian


Re: [whatwg] Features for responsive Web design

2012-05-18 Thread Julian Reschke

On 2012-05-18 12:30, Maciej Stachowiak wrote:


On May 18, 2012, at 3:16 AM, Markus Ernst derer...@gmx.ch wrote:


Am 15.05.2012 09:28 schrieb Ian Hickson:

img src=face-600-...@1.jpeg alt=
 srcset=face-600-...@1.jpeg 600w 200h 1x,
 face-600-...@2.jpeg 600w 200h 2x,
 face-icon.png   200w 200h


Re-reading most parts of the last day's discussions, 2 questions come to my 
mind that I have the impression have not been pointed out very clearly so far:

1. Are there other cases in HTML where an attribute value contains more than 
one URI?

2. Have there been thoughts on the scriptability of @srcset? While sources can be 
added to or removed from picture easily with standard DOM methods, it 
looks to me like this would require complex string operations for @srcset.


If dynamically manipulating the items in srcset is useful, we can add a DOM API 
(similar to classList or style for manipulating the lists of items found in 
class and style attributes respectively).


...which of course means that it stops being simpler.

I think it would be worthwhile to combine elements from both proposals; 
in particular to avoid the microsyntax and use proper markup instead.


Best regards, Julian



Re: [whatwg] Defaulting new image solution to 192dpi

2012-05-17 Thread Julian Reschke

On 2012-05-17 13:30, Kornel Lesiński wrote:


My suggestion is that the srcset (or picture) should assume that
images are 2x scale by default.


My reasoning behind is:

- we have img for easy embedding of 1x images today, but we don't have
2x img for the future. Having to specify width/height in img all the
time is annoying.

- highdpi displays will become dominant at some point, it's only a
matter of time (they pretty much are already in high-end smartphones,
and are going to appear in laptops next). Bandwidth is also going to be
less of a concern, so it'll be rational and desirable to serve images
for the 2x resolution only (and just rely on 96dpi displays scaling them
down).

Necessity to specify 2x scaling all the time will become a bad default
and a historical quirk (like the DOCTYPE), and a source of annoyance
where accidentally omitted 2x syntax makes images large and pixelated.


So to future-proof the solution I think:

img src=1x.jpg srcset=2x.jpg

should be equivalent to:

img src=1x.jpg srcset=2x.jpg 2x
...


As far as I can tell, making descriptors optional breaks the syntax (it 
allows comma both in the URI and as a separator between image candidates).


(Please read this as argument for making the syntax less brittle)

Best regards, Julian


Re: [whatwg] img srcset for responsive bitmapped content images

2012-05-16 Thread Julian Reschke

On 2012-05-10 09:58, Edward O'Connor wrote:

Hi,

When authors adapt their sites for high-resolution displays such as the
iPhone's Retina display, they often need to be able to use different
assets representing the same image. Doing this for content images in
HTML is currently much more of a pain than it is in CSS (and it can be a
pain in CSS). I think we can best address this problem for bitmap[1]
content image by the addition of a srcset= attribute to the existing
img  element.

The srcset= attribute takes as its argument a simplified variant of
the image-set() microsyntax[2]. It would look something like this:

img src=foo-lores.jpg
  srcset=foo-hires.jpg 2x, foo-superduperhires.jpg 6.5x
  alt=decent alt text for foo.
...


Inventing a new microsyntax is tricky.

- comma separated implies you'll need to escape a comma when it 
appears in a URI; this may be a problem when the URI scheme assigns a 
special meaning to the comma (so it doesn't affect HTTP but still...)


- separating URIs from parameters with whitespace implies that the URIs 
are valid (in that they do not contain whitespace themselves); I 
personally have no problem with that, but it should be kept in mind


Best regards, Julian


Re: [whatwg] img srcset for responsive bitmapped content images

2012-05-16 Thread Julian Reschke

On 2012-05-16 11:51, Odin Hørthe Omdal wrote:

On Wed, 16 May 2012 11:22:07 +0200, Julian Reschke
julian.resc...@gmx.de wrote:


Inventing a new microsyntax is tricky.
- comma separated implies you'll need to escape a comma when it
appears in a URI; this may be a problem when the URI scheme assigns a
special meaning to the comma (so it doesn't affect HTTP but still...)


Indeed.

Edward did not write it all as a spec, though, so cases like that might
be a bit detailed for a first proposal. Hixie's extension of srcset does
however have some spec text, and that does in fact handle your first case:

http://www.whatwg.org/specs/web-apps/current-work/multipage/embedded-content-1.html#processing-the-image-candidates
...


It looks like that, but it's non-trivial to check (that's why I prefer 
declarative definitions).


Best regards, Julian


Re: [whatwg] Throwing in my support for picture into the mix

2012-05-16 Thread Julian Reschke

On 2012-05-16 15:46, Glenn Maynard wrote:

On Wed, May 16, 2012 at 4:28 AM, Paul Court p...@pmcnetworks.co.uk wrote:


First, I would like to suggest throwing img srcset out the window and
into a landfill somewhere (It's not even fit for recycling!). This reminds
me of the recent semi-colon in JavaScript debate that erupted as a result
of @fat's code in the Twitter Bootstrap project - To one or two people who
are very specialised in their particular area, it seems like a non issue -
and I think that is the case with the img srcset syntax. From a browser
developer point of view it might be easier to implement, but from an "I'm
just learning to code" point of view, that syntax is bat-shit crazy!



It's a simple, unambiguous, extensible format.  If you don't like this
...


It is?

Quick check, do

  srcset=a,b

and

  srcset=a, b

mean the same thing?

And what about

  srcset=a ,b

?

Best regards, Julian


Re: [whatwg] Throwing in my support for picture into the mix

2012-05-16 Thread Julian Reschke

On 2012-05-16 16:07, Glenn Maynard wrote:

On Wed, May 16, 2012 at 8:57 AM, Julian Reschke julian.resc...@gmx.de wrote:


It is?

Quick check, do

  srcset=a,b

and

  srcset=a, b

mean the same thing?

And what about

  srcset=a ,b



Yes, they all mean the same thing: a url a with no descriptors, and a url
b with no descriptors.  What makes you think they wouldn't?



"," is a legal URI character. ("Collect a sequence of characters that 
are not space characters, and let that be url.")
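
A small sketch of that one quoted step taken in isolation, assuming the HTML definition of space characters (space, tab, LF, FF, CR). The function name collect_url is hypothetical, and the remaining steps of the spec's algorithm are deliberately not reproduced here, so later processing may treat these values further.

```python
import re

# HTML "space characters": space, tab, line feed, form feed, carriage return.
_STEP = re.compile(r"[ \t\n\f\r]*([^ \t\n\f\r]*)")


def collect_url(value):
    """Skip leading space characters, then collect non-space characters."""
    return _STEP.match(value).group(1)


urls = [collect_url(v) for v in ("a,b", "a, b", "a ,b")]
```

Under this step alone the three inputs give three different first URLs, which is why the comma cannot be treated as a plain separator.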


Best regards, Julian


Re: [whatwg] Throwing in my support for picture into the mix

2012-05-16 Thread Julian Reschke

On 2012-05-16 16:36, Glenn Maynard wrote:

On Wed, May 16, 2012 at 9:16 AM, Julian Reschke julian.resc...@gmx.de
wrote:

, is a legal URI character. (Collect a sequence of characters
that are not space characters, and let that be url.)


Actually, the key point is that this is non-conforming to start
with: image candidate strings must have at least one descriptor
(http://www.whatwg.org/specs/web-apps/current-work/#image-candidate-string).
...


My point being that the syntax is fragile unless implementations follow 
the spec word by word. I know they are supposed to, but the way it's 
introduced *will* make people split the attribute value by ",".


Best regards, Julian


Re: [whatwg] multipart/form-data filename encoding: unicode and special characters

2012-05-02 Thread Julian Reschke

On 2012-05-02 13:05, Evan Jones wrote:

On May 1, 2012, at 22:38 , Ashley Sheridan wrote:

The Webkit method looks the better of the two with regards to how
server-side languages might interpret it, but it would need work to
ensure everything that should be escaped is, and that everything that is
unescaped on the server should be and is done so correctly.


The problem is that currently I am unable to correctly round trip an uploaded 
file name. I would like users to upload a file, and be able to later download the file 
with the *exact same* file name. If you follow the specifications, this is not possible. 
Firefox is closer to the MIME RFCs (which specify backslash quoting in quoted-strings), 
but apparently that will break IE6, 7, and 8:

https://bugs.webkit.org/show_bug.cgi?id=62107
http://java.net/jira/browse/JERSEY-759

Webkit's %-escaping behaviour is *not* part of the referenced MIME RFCs (which specify either backslash 
quoting in quoted-strings, base64 encoding, or %-escaping in special filename*= arguments). Thus, 
if this is the right answer, it should be specified somewhere. I'm assuming that this needs to be 
in the HTML5 spec, since HTTP calls this the body of the POST and declares that it is outside 
the HTTP specification.

Webkit's escaping is also flawed (see bug 62107 above). Files that contain 
%-escapes (eg. my%22file.txt, admittedly very rare) will get mangled, because there 
is no difference between my%22file.txt and my"file.txt.
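
The collision can be sketched in a few lines. This assumes, purely for illustration, that the sender %-escapes only the double quote and leaves a literal percent sign alone; webkit_style_escape is a made-up name and is not WebKit's code.

```python
from urllib.parse import unquote


def webkit_style_escape(name):
    # Assumption for illustration: only '"' is escaped, '%' is left as-is.
    return name.replace('"', "%22")


sent_a = webkit_style_escape('my"file.txt')    # name containing a quote
sent_b = webkit_style_escape("my%22file.txt")  # name containing a literal %22
decoded = unquote(sent_a)
```

Both names go over the wire as the same string, so a server that unescapes cannot tell which one the user actually had on disk.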

Currently, I need to detect the browser in order to figure out what kind of 
unescaping to apply to the file name, and even then in some cases I can't 
figure out what the right file name is. Webkit claims this is a specification 
bug, so I'm hoping someone here might tell me if this is the case, and if so 
where can I file bugs, create test cases, etc?

Evan

--
http://evanjones.ca/


I did spend a considerable amount of time with Content-Disposition, the 
*response* header field (resulting in RFC 6266 and 
http://greenbytes.de/tech/tc2231/).


However, this has little to do with the representation in form uploads. 
If browser implementers want to try something new that will not affect 
the old code paths, supporting the encoding defined in RFC 5987 might be 
the right thing to do (yes, it's ugly, but it's unambiguous).
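
For reference, a minimal sketch of producing an RFC 5987 ext-value for a file name, using Python's urllib.parse.quote. The helper name rfc5987_ext_value is hypothetical; the safe set is the attr-char punctuation from RFC 5987 (letters and digits are always left unescaped by quote).

```python
from urllib.parse import quote

# attr-char punctuation from RFC 5987; everything else gets %-encoded.
ATTR_CHAR_PUNCT = "!#$&+-.^_`|~"


def rfc5987_ext_value(filename):
    """Encode filename as UTF-8''<pct-encoded> for use in filename*=..."""
    return "UTF-8''" + quote(filename, safe=ATTR_CHAR_PUNCT, encoding="utf-8")


cyrillic = rfc5987_ext_value("файл.png")
tricky = rfc5987_ext_value('my%22"file.txt')
```

Because the percent sign itself is always encoded as %25, a literal %22 in a name and an encoded double quote stay distinguishable, which is the unambiguity mentioned above.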


Best regards, Julian


Re: [whatwg] multipart/form-data filename encoding: unicode and special characters

2012-05-02 Thread Julian Reschke

On 2012-05-02 19:26, Evan Jones wrote:

On May 2, 2012, at 7:43 , Julian Reschke wrote:

If browser implementers want to try something new that will not affect the old 
code paths, supporting the encoding defined in RFC 5987 might be the right 
thing to do (yes, it's ugly, but it's unambiguous).


It seems to me like that is a potential solution that could be evaluated. It would be 
nice to have both the HTTP response header and the POST form encoding be the same. 
However, a critical question is if the server software that parses the form headers would 
do the right thing if it sees both an ASCII fallback filename= and an escaped 
filename*= parameter in the Content-Disposition header. Without looking at any code, I 
suspect some will and some won't.


I'm pretty sure everybody will ignore filename* for now. Which means 
servers need to upgrade, but at least it would be an upgrade that 
doesn't break any existing behavior.



My conclusion: I would be willing to help with bugs, testing, test cases, 
looking at server code, etc related to this issue. However, I believe someone 
who is experienced with the technology and politics of web standards needs to really 
champion any change because I don't fully understand the processes or the 
issues. If I don't hear anything in a few days, I'll try filing some additional 
bugs with Webkit, Firefox, and the HTML5 spec and otherwise give up.
...


Sounds like a plan.

Best regards, Julian


Re: [whatwg] Encoding Sniffing

2012-04-23 Thread Julian Reschke

On 2012-04-23 10:19, Henri Sivonen wrote:

...
  * The Universal detector is used regardless of UI setting or locale
when using the FileReader to read a local file as text. (I'm
personally very unhappy about this sort of use of heuristics in a new
feature.)

 ...

+1


...
WebVTT is a new format with no legacy. Instead of letting it become
infected with heuristic detection, we should go the other direction
and hardwire it as UTF-8 like we did with app cache manifests and
JSON-in-XHR.  No one should be creating new content in encodings other
than UTF-8. Those who can't be bothered to use The Encoding deserve
REPLACEMENT CHARACTERs. Heuristic detection is for unlabeled legacy
content.
...


+1




Re: [whatwg] URL query component

2012-04-20 Thread Julian Reschke

On 2012-04-20 14:37, And Clover wrote:

On 2012-04-20 09:15, Anne van Kesteren wrote:

Currently browsers differ for what happens when the code point cannot
be encoded.
What Gecko does [?%C2%A3] makes the resulting data impossible to
interpret.
What WebKit does [?%26%23163%3B] is consistent with form submission. I
like it.


I do not! It makes the data impossible to recover just as Gecko does...
in fact worse, because at least Gecko preserves ASCII. With the WebKit
behaviour it becomes impossible to determine from a pure ASCII string
'&#163;' whether the user really typed '£' or '&#163;' into the input
field.

It has the advantage of consistency with the POST behaviour, but that
behaviour is an unpleasant legacy hack which encourages a
misunderstanding of HTML-escaping that promotes XSS vulns. I would not
like to see it spread any further than it already has.
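
The two outputs quoted above can be reproduced with a short sketch. It assumes, for illustration only, a page encoding (ASCII) that cannot represent the pound sign; the function names are hypothetical and model the observed results, not engine internals.

```python
from urllib.parse import quote


def utf8_percent_encode(text):
    """Gecko-like result from the thread: fall back to UTF-8 bytes."""
    return quote(text.encode("utf-8"), safe="")


def charref_percent_encode(text, page_encoding):
    """WebKit-like result from the thread: unencodable code points become
    numeric character references, which are then percent-encoded."""
    raw = text.encode(page_encoding, errors="xmlcharrefreplace")
    return quote(raw, safe="")


gecko_like = utf8_percent_encode("£")
webkit_like = charref_percent_encode("£", "ascii")
typed_literally = charref_percent_encode("&#163;", "ascii")
```

The last two values are identical, which is the ambiguity being objected to: the server cannot tell a typed pound sign from a typed character reference.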


+1

Indeed.

I think this is a case where you want to fail early (for some value of 
fail); so maybe substituting with ? makes most sense.


Do any servers *expect* the Webkit behavior? If they do so, why don't 
they just fix the pages they serve to use UTF-8 to get consistent 
behavior throughout?


Best regards, Julian


Re: [whatwg] Encoding Standard (mostly complete)

2012-04-17 Thread Julian Reschke

On 2012-04-17 11:30, Anne van Kesteren wrote:

Hi,

Apart from big5 (which requires some more research) all encoders and
decoders are now defined:

http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html
...


As a nit, I believe that Character Encoding would make a better title 
than just Encoding.


Best regards, Julian


Re: [whatwg] some thoughts on bring HTTP upon UDP: iWebPP - instant web p2p technology

2012-03-14 Thread Julian Reschke

On 2012-03-14 13:10, tom wrote:

Hi,

AFAIK, WebRTC intends to setup P2P communication between browsers, then
carry video/audio/text media, etc.

Why do we need WebRTC? Firstly, the Web is the most popular network app; secondly,
video/voice brings the best user experience.
But the problem is that HTTP currently runs on TCP, while P2P normally runs
on UDP.

Suppose both web browser and server can run HTTP over UDP (call the protocol
scheme HTTPP), what happens?
Firstly, Web app developers can program HTTPP like HTTP; secondly, P2P
traffic can be carried on HTTPP easily.

Basically iWebPP consists of two parts: HTTPP-enabled web browser and web
server.

Any thoughts? thanks.
...


Well, declaring that it should use UDP alone won't make it happen. It 
obviously will work nicely for small messages that are idempotent (so 
they can be retransmitted safely), but things get complicated beyond that.


There's also previous work to study; for instance Microsoft has used 
HTTP over UDP for notifications in the past.


A good place to bring this up might be the HTTPbis Working Group, which 
will be looking at what HTTP/2.0 might be very soon.


Best regards, Julian


Re: [whatwg] Specify href target with HTTP headers

2012-03-09 Thread Julian Reschke

On 2012-03-08 20:25, Christian Schmidt wrote:

...

Separating the network protocol from the user interface seems highly
desirable. Window-Target sacrifices that.

I get your point. But it seems that Content-Disposition already suffers
from this.

RFC 2183 describes the Content-Disposition like this:

A mechanism is needed to allow the sender to transmit this sort of
presentational information to the recipient; the Content-Disposition
header provides this mechanism, allowing each component of a message
to be tagged with an indication of its desired presentation semantics.

I know that RFC 2183 deals with e-mail and is not part of HTTP/1.1, but
it is mentioned in the HTTP specification and is supported by several
browsers.
...


Content-Disposition for HTTP is defined in RFC 6266.


...


Best regards, Julian


Re: [whatwg] Caching of identical files from different URLs using checksums

2012-02-19 Thread Julian Reschke

On 2012-02-18 14:45, Sven Neuhaus wrote:

...


Stop here. That's not what the fragment identifier is for.

Instead, you could specify the hash as a separate attribute on the
containing element.


The relevant section from RFC 3986 reads:

   The fragment identifier component of a URI allows indirect
identification of a secondary resource by reference to a primary
resource and additional identifying information.  The identified
secondary resource may be some portion or subset of the primary
resource, some view on representations of the primary resource, or
some other resource defined or described by those representations.


...but it goes on to say:

The semantics of a fragment identifier are defined by the set of 
representations that might result from a retrieval action on the primary 
resource. The fragment's format and resolution is therefore dependent on 
the media type [RFC2046] of a potentially retrieved representation, even 
though such a retrieval is only performed if the URI is dereferenced. If 
no such representation exists, then the semantics of the fragment are 
considered unknown and are effectively unconstrained. Fragment 
identifier semantics are independent of the URI scheme and thus cannot 
be redefined by scheme specifications.



This description does not contradict the use of checksums as fragment
identifiers. They are additional identifying information.


It contradicts the concept of being defined by the media type.
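
As a small illustration of where the fragment sits, here is a sketch using Python's urllib.parse.urldefrag. The URL and the "sha256=..." fragment are invented for the example; nothing here is a proposed syntax.

```python
from urllib.parse import urldefrag


def split_fragment(url):
    """Return (URL of the primary resource, fragment identifier)."""
    base, fragment = urldefrag(url)
    return base, fragment


base, fragment = split_fragment("http://example.com/lib.js#sha256=abc123")
```

Only the first part is used to retrieve the resource over HTTP; what the second part means is then up to the media type of whatever comes back, which is why a checksum does not fit there.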


However, if there is a consensus that checksums shouldn't be stored in
the fragment part of the URL, a new attribute would be a good alternative.

Regards,
-Sven Neuhaus



Best regards, Julian


Re: [whatwg] Caching of identical files from different URLs using checksums

2012-02-17 Thread Julian Reschke

On 2012-02-17 09:42, Sven Neuhaus wrote:

Hello,

as of 2012, some websites are including popular javascript libraries from CDNs,
like Google's. The benefits are:

* Traffic savings for the site operator because the javascript libraries are
   downloaded from the CDN and not from the site that uses them
* If enough sites refer to the same external file, the browser will cache the
   file and even if it's a first visit, the (potentially large) javascript file
   will not have to be downloaded.

There are however some drawbacks to this approach:

* Security: The site operator is trusting an external site.  If the CDN serves
   a malicious file it will directly lead to code execution in browsers under
   the domain settings of the site including it (a form of cross site scripting).
* Availability: The site depends on the CDN to be available. If the CDN is down
   the site may not be available at all.
* Privacy: The CDN will see requests for the file with HTTP referer headers for
   every visitor of the site.
* Extra DNS lookup if file is not already cached
* Extra HTTP connection (can't use persistent connection because it's a
   different site) if file is not cached

I am proposing a solution that will solve all these problems, keep the benefits
and offer some extra advantages:

1. The site stores a copy of the library file(s) on its own site.
2. The web page includes the library from the site itself instead of from the
   CDN
3. The script tag specifies a checksum calculated using a cryptographic hash
   function.

With this solution, whenever a browser downloads a file and stores it in the
local cache, it calculates its checksum. The browser can check its cache for an
(identical) file with the same checksum (no matter what URL it was retrieved
from) and use it instead of downloading the file again.
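
The lookup described here amounts to a content-addressed cache. A minimal sketch, assuming SHA-256 as the hash function (the proposal only says "a cryptographic hash function") and with ChecksumCache as a made-up name; this is not how any browser cache is implemented.

```python
import hashlib


class ChecksumCache:
    """Cache keyed by the digest of the body rather than by URL."""

    def __init__(self):
        self._by_digest = {}

    def store(self, body):
        """Store a downloaded body and return its hex digest."""
        digest = hashlib.sha256(body).hexdigest()
        self._by_digest[digest] = body
        return digest

    def lookup(self, digest):
        """Return the cached body for this digest, or None on a miss."""
        return self._by_digest.get(digest)


cache = ChecksumCache()
library = b"/* some popular library */"
digest = cache.store(library)    # fetched once, from any URL
hit = cache.lookup(digest)       # another site references the same checksum
miss = cache.lookup("0" * 64)
```

The URL never enters the key, so two sites hosting the same bytes at different addresses share one cache entry.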

This suggestion has previously been discussed here (
http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2006-November/thread.html#7825
), however for a different purpose (file integrity instead of caching
identical files from different sites) and I don't feel the points raised back
then apply.

If a library is popular, chances are that many sites are including the
identical file and it will already be in the browser's cache. No network access
is necessary to use it, improving the users' privacy. It doesn't matter if the
sites store the library file at a different URL. It will always be identified
by its checksum. The cached file can be used more often.

The syntax used to specify the checksum is using the fragment identifier
component of a URI (RFC 3986 section 3.5).
...


Stop here. That's not what the fragment identifier is for.

Instead, you could specify the hash as a separate attribute on the 
containing element.


Best regards, Julian


Re: [whatwg] Augmenting HTML parser to recognize new elements

2012-01-18 Thread Julian Reschke

On 2012-01-18 22:55, Dimitri Glazkov wrote:

On Wed, Jan 18, 2012 at 1:47 PM, Adam Barth w...@adambarth.com wrote:

On Wed, Jan 18, 2012 at 1:29 PM, Dimitri Glazkov dglaz...@chromium.org wrote:

On Wed, Jan 18, 2012 at 1:14 PM, Dimitri Glazkov dglaz...@chromium.org wrote:

Ah, that's a good question. This also must be specified. It should
depend on the parent of the content element. If the parent is shadow
root or table, then it should make tr the child of content.
Otherwise, it should use foster parenting as usual.


Oops, not foster parenting, but ignore as you mentioned. Still
getting through the details of the parsing spec.


There's also some subtlety w.r.t. the pending character tokens.

More generally, I think we'd all be much more sane if the HTML parsing
algorithm was specified in the HTML living standard rather than
modified ad-hoc in a number of different documents.


That makes sense, but how will we handle the fact that the elements in
the algorithm aren't part of the HTML specification?
...


The algorithm should be specified so that all future elements follow the 
same parsing rules, thus no further changes are required.


Best regards, Julian


Re: [whatwg] Proposal: intent tag for Web Intents API

2011-12-16 Thread Julian Reschke

On 2011-12-08 18:54, Anne van Kesteren wrote:

On Wed, 07 Dec 2011 18:59:43 +0100, Paul Kinlan paulkin...@google.com
wrote:

Cons:
* ordering of data in the content element - if the ordering of data in
the content value is mandatory and the developer mixes up the
ordering, does the action then become image/png (which is still
technically valid) and the data type become the uri string specified?
* we have other optional attributes, such as title, disposition and
icon so a scheme needs to be defined inside the content, if we define
a scheme it looks similar to the intent tag but harder to prepare
(from a normal developers perspective)
* some attributes can have spaces so we would need to define encoding
mechanisms inside the content attribute to handle quotes, and double
quotes.
* we can't provide a visual fallback if intents aren't supported - see
discussion about self closing tag in body.
* harder to validate (due to all of the above)


We can just add additional attributes to meta you know. We have done
the same for link. E.g. for link rel=icon you can specify a sizes
attribute.


Hmmm.

That makes it sound a lot easier than it is. After all, there's no 
extension point here. Adding attributes to meta (or link) requires a 
change to HTML5, or a delta spec adding these as conforming attributes.


Best regards, Julian


Re: [whatwg] Proposal for improved handling of '#' inside of data URIs

2011-09-14 Thread Julian Reschke

On 2011-09-14 10:16, Robert O'Callahan wrote:

On Sat, Sep 10, 2011 at 11:01 PM, Ryosuke Niwa rn...@webkit.org wrote:


Have implementors actively opposed this idea?  It seems like sticking to
the RFC is a cleaner option if possible.



Yeah. Will you fix it in Webkit? :-)


:-)

Maybe we should start with opening a ticket, so this is properly tracked?



Re: [whatwg] Proposal for improved handling of '#' inside of data URIs

2011-09-12 Thread Julian Reschke

On 2011-09-12 21:47, Michal Zalewski wrote:

What about javascript: URLs?

Right now, every browser seems to treat javascript:alert('#') in an
intuitive manner.

This likely goes beyond data: and javascript:, so I think it would be
useful to look at it more holistically.


Maybe. Or it makes sense to do it one at a time :-).

Observation: javascript: IMHO isn't a URI scheme (it just occupies a 
place in the same lexical space), so maybe the right thing to do is to 
document it as historic exception that only exists in browsers.


Best regards, Julian



Re: [whatwg] Proposal for improved handling of '#' inside of data URIs

2011-09-11 Thread Julian Reschke

On 2011-09-11 04:51, Boris Zbarsky wrote:

...
I think you misunderstand my position. I'm weakly against the proposal
in question; the strongest argument in favor of the proposal is that
there is either a current or future deployed base of data: URIs that
won't work without it but do work in either past browsers or some subset
of future ones.

Of course the simplest way to prevent the future URIs thing being a
problem is for UAs that don't follow the URI spec here right now to fix
that, but I haven't sensed much willingness to do that in the past, or
earlier in this discussion. :(
...


+1 for trying to sanitize the parsing in Firefox.

Given the fact that this change made it into the release without any 
major uproar there might be a chance that other UAs might simply adopt it.



Given the choice between converging on this proposal and the status quo
in which UAs just do wildly different totally wacky things, I'd pick the
proposal, I think


If we can't get the perfect fix (UAs consistently doing what the spec 
says), then of course converging on something that is less broken than 
before may be good.


Best regards, Julian


Re: [whatwg] Proposal for improved handling of '#' inside of data URIs

2011-09-11 Thread Julian Reschke

On 2011-09-11 18:56, Daniel Holbert wrote:

On 09/11/2011 02:09 AM, Julian Reschke wrote:

Given the fact that this change made it into the release without any
major uproar there might be a chance that other UAs might simply adopt
it.


(To be clear -- the proposal hasn't made it into any releases yet. Right
now it's just an idea.)
...


Understood. I was referring to the changed behavior as of Firefox 6.

Best regards, Julian


Re: [whatwg] Proposal for improved handling of '#' inside of data URIs

2011-09-11 Thread Julian Reschke

On 2011-09-11 17:30, Daniel Holbert wrote:

On 09/11/2011 07:21 AM, Michael A. Puls II wrote:

Not only must # be %23 if you don't want it as a frag id, but >
and < should be %3E and %3C.

[...]
  Of course, if you can percent-encode everything needed as you type, you
  can hand-author the URI data. But, who wants to do that,

As I noted in a response to Nils earlier in this thread,
Firefox/Webkit/Opera don't actually require authors to percent-encode
brackets and spaces in data URIs. (not sure whether that's correct per
spec or not).
...


It's not correct per RFC 2397 (data) and RFCs 3986 (URI) and 3987 (IRI), 
but the HTML spec certainly could *make* it correct by introducing an 
additional layer (if there was consensus to do so). Right now HTML5 
conformance requires valid IRIs, so unescaped whitespace or angle 
brackets in @hrefs make the document non-conforming.
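
For illustration, a sketch of what percent-encoding buys for a data URI, using Python's urllib.parse. The helper to_data_uri and the markup are hypothetical, and Python's urlsplit stands in for a generic RFC 3986 style parser; it is not a model of any browser.

```python
from urllib.parse import quote, urlsplit


def to_data_uri(markup, media_type="text/html"):
    """Build a data URI with every reserved or unsafe character escaped."""
    return "data:" + media_type + "," + quote(markup, safe="")


markup = '<a href="#x">'
escaped = to_data_uri(markup)
raw = "data:text/html," + markup

escaped_fragment = urlsplit(escaped).fragment
raw_fragment = urlsplit(raw).fragment
```

In the unescaped form a generic parser cuts the payload at the first "#", so part of the markup ends up in the fragment; in the escaped form the fragment is empty and the payload survives whole.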



...


Best regards, Julian


Re: [whatwg] 1.1.1 How do the WHATWG and W3C specifications differ?

2011-09-08 Thread Julian Reschke

On 2011-09-08 08:26, Jens O. Meiert wrote:

Please clarify -- (a) the decisions do not make sense or (b) not applying
them doesn't make sense?


My main concern is the number of differences between the WHATWG and
the W3C version, hence the question whether we’re working on
improving this at all.


I'm all for improving it :-). The good question is how (other than the 
obvious way to just apply the W3C HTML WG decisions).


Best regards, Julian



Re: [whatwg] add html-attribute for responsive images

2011-08-30 Thread Julian Reschke

On 2011-08-30 16:51, Anne van Kesteren wrote:

On Tue, 30 Aug 2011 16:31:59 +0200, Karl Dubost ka...@opera.com wrote:

* It is in fact an issue for being able to make the website responsive
on Mobile devices in low bandwidth.


The mobile devices are the ones with the high-resolution displays.


Speak for your own device :-)




Re: [whatwg] a rel=attachment

2011-07-22 Thread Julian Reschke

On 2011-07-22 09:00, Ian Hickson wrote:


(These e-mails were sent after I started working on the previous one.)

On Wed, 20 Jul 2011, Chris Bentzel wrote:


Who should be trusted for filename if the one specified by <a> on the
referring page differs from the one specified by Content-Disposition on
the to-be-downloaded resource?


I've specified that the header wins.


On Wed, 20 Jul 2011, Julian Reschke wrote:


That being said, if you want to go down the road, make it clear how the
file name actually is extracted from the header field in an
interoperable way.


That isn't really in scope for the HTML spec, it's something either for
the HTTP spec or the Content-Disposition spec (if HTTP doesn't define the
header itself) to define.


Is there a specific reason why the new text doesn't mention 
Content-Disposition anymore?


Best regards, Julian



Re: [whatwg] a rel=attachment

2011-07-22 Thread Julian Reschke
On 2011-07-22 05:03, Hironori Bono (坊野 博典) wrote:
 Greetings all,
 
 This is just out of curiosity.
 Would it be possible to give me the encoding used for this download
 attribute? I think we have several options when we use non-ASCII
 characters (this example uses Cyrillic characters) as the value of
 this attribute as listed below.
 
 1. Use the same encoding as the one used for the HTML content.
    <a href="..." download="файл.png">сохранить файл</a>
 (If we allow using the '&#x...;' format of HTML, it becomes:
    <a href="..." download="&#x444;&#x430;&#x439;&#x43B;.png">сохранить файл</a>)
 
 2. Use the URL encoding (same as the href attribute).
    <a href="..." download="%D1%84%D0%B0%D0%B9%D0%BB.png">сохранить файл</a>
 
 3. Use RFC 2231 (same as the content-disposition header)
    <a href="..." download="UTF-8''%D1%84%D0%B0%D0%B9%D0%BB.png">сохранить файл</a>
 
 Thank you for your help in advance.

It's the same as with any other HTML attribute.

The thing you mention in 3) is a special mechanism only needed in HTTP
header fields (btw updated by RFC 5987), and doesn't apply here.
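
To illustrate the difference, here is a rough Python sketch of the header-only encoding (the ext-value form of RFC 5987). The function names are invented for this example, and it always uses UTF-8 with an empty language tag. In an HTML attribute none of this is needed; the attribute value is simply the string itself.

```python
from urllib.parse import quote, unquote


def encode_ext_value(filename: str) -> str:
    """Encode a file name as an RFC 5987 ext-value using UTF-8 (sketch)."""
    # Percent-encode the UTF-8 bytes; the language tag between the two
    # single quotes is left empty.
    return "UTF-8''" + quote(filename, safe="")


def decode_ext_value(ext_value: str) -> str:
    """Decode charset'language'value back into a plain string."""
    charset, _language, value = ext_value.split("'", 2)
    return unquote(value, encoding=charset)


header_form = encode_ext_value("файл.png")
print(header_form)
```

The result is what would follow filename*= in a Content-Disposition header field, not something to put into markup.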

Best regards, Julian


Re: [whatwg] a rel=attachment

2011-07-22 Thread Julian Reschke

On 2011-07-22 09:24, Ian Hickson wrote:

On Fri, 22 Jul 2011, Julian Reschke wrote:


That isn't really in scope for the HTML spec, it's something either
for the HTTP spec or the Content-Disposition spec (if HTTP doesn't
define the header itself) to define.


Is there a specific reason why the new text doesn't mention
Content-Disposition anymore?


Not only does the new text mention Content-Disposition, it actually refers
to its specification multiple times. Are you looking at the right diff?


I was looking at the diff r6318. Maybe there were more changes?

Best regards, Julian



Re: [whatwg] a rel=attachment

2011-07-20 Thread Julian Reschke

On 2011-07-20 13:33, Chris Bentzel wrote:

Who should be trusted for filename if the one specified by <a> on the
referring page differs from the one specified by Content-Disposition
on the to-be-downloaded resource?
...


I think the header field needs to be authoritative.

That being said, if you want to go down the road, make it clear how the 
file name actually is extracted from the header field in an 
interoperable way.
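
As one possible reading of "extracting the file name", here is a small Python sketch that leans on the standard library's MIME parameter parser. That is an assumption on my side: the email package was written for MIME messages, not for HTTP, so it is only an approximation of what a browser would have to do, and the function name is made up.

```python
from email.message import Message
from typing import Optional


def filename_from_content_disposition(value: str) -> Optional[str]:
    """Return the filename parameter of a Content-Disposition value, if any."""
    msg = Message()
    msg["Content-Disposition"] = value
    # get_filename() looks up the 'filename' parameter and strips the quotes;
    # it returns None when the parameter is absent.
    return msg.get_filename()


print(filename_from_content_disposition('attachment; filename="example.html"'))
print(filename_from_content_disposition("inline"))
```

Different parsers disagree on edge cases (escapes, missing quotes, repeated parameters), which is exactly why the extraction rules need to be written down.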


Best regards, Julian


Re: [whatwg] date meta-tag invalid

2011-07-18 Thread Julian Reschke

On 2011-07-18 14:54, aykut.sen...@bild.de wrote:

According to the w3c Validator the <meta name="date" content="#" /> tag is 
invalid. In the WHATWG MetaExtensions List there is no registered extension, no specification and no 
proposal for the date meta-tag.
The only alternative for date is a proposal called created, which however 
doesn't meet the requirements for registration. For our SEO team the date meta-tag contains some 
of the most important information about a webpage.
What would be a w3c-valid way to implement a creation date meta-tag in html5?


Out of curiosity: who is processing the tag? And what does this have to 
do with SEO? Do search engines do anything with it?


From HTML5's point of view the suggested replacement is probably <time 
pubdate>... Did you look at that already?


Also: there seems to be overlap with Dublin Core's dc:created?

Best regards, Julian


Re: [whatwg] date meta-tag invalid

2011-07-18 Thread Julian Reschke

On 2011-07-18 15:59, aykut.sen...@bild.de wrote:

hi julian,
i have asked one from the seo team and he says for example the freshness
factor is important for google.
is it possible to use the time-tag in the head instead (i mean invisible)?
dc:created is also not in the Meta Extensions List, see:
http://wiki.whatwg.org/wiki/MetaExtensions


I *believe* the SEO team is misguided when it thinks that 
meta/@name=date affects Google. But only Google can tell us.


I mentioned dc:created not because it's valid, but because it's at least 
*specified* and in wider use.


Best regards, Julian



Re: [whatwg] Iframe Sandbox Attribute - allow-plugins?

2011-07-15 Thread Julian Reschke

On 2011-07-14 17:01, Jonas Sicking wrote:

...
True. I would be fine with removing the plugin requirement. Or
changing it such that it states that plugins can only be loaded if
it's done in a manner that ensures that all other requirements are
still fulfilled. Or just dealing with this once there actually are
plugins and plugin APIs which could be loaded while still fulfilling
the other requirements.
...


Well, the spec is in W3C LC. So if we think this requirement needs to be 
rephrased then it should be brought up as a problem.


Best regards, Julian


Re: [whatwg] a rel=attachment

2011-07-15 Thread Julian Reschke

On 2011-07-15 19:05, Ian Fette (イアンフェッティ) wrote:

..

It also doesn't naturally help understanding that it's just poor man's
Content-Disposition:attachment. From this point of view, I like Ian's
original proposal (rel=attachment) more.



Yes and no - both are sort of a poor man's Content-Disposition :) The
question is whether we need to handle filename, and the proposal of
download=filename at least maps content-disposition fully and compactly.
...


Well, one difference is that C-D is under the control of the owner of 
the resource being linked to (ideally), while attributes set somewhere 
else might not.


So there is a security-related aspect to this.

Best regards, Julian


Re: [whatwg] Iframe Sandbox Attribute - allow-plugins?

2011-07-14 Thread Julian Reschke

On 2011-07-14 08:22, Jonas Sicking wrote:

On Wed, Jul 13, 2011 at 9:49 PM, Anne van Kesteren ann...@opera.com wrote:


On Wed, 13 Jul 2011 23:13:05 +0200, Julian Reschke julian.resc...@gmx.de
wrote:


Yes, but we can *define* the flag in HTML and write down what it means with 
respect to plugin APIs.


It seems much better to wait until it can actually be implemented.


Especially since it's not at all clear to me that a specific opt-in
mechanism is at all needed once we have the appropriate plugin APIs
implemented. And those APIs are needed anyway if we want to allow
plugins in any form in the sandbox.


When the attribute is set, the content is treated as being from a 
unique origin, forms and scripts are disabled, links are prevented from 
targeting other browsing contexts, and plugins are disabled.


A browser negotiating something with plugins using that API and enabling 
them despite @sandbox would violate the above requirement, no?


Re: [whatwg] Iframe Sandbox Attribute - allow-plugins?

2011-07-13 Thread Julian Reschke

On 2011-07-13 22:31, Adam Barth wrote:

Adding allow-plugins today would defeat the prevention of parent redirection.

The short answer is we need an API for informing plugins of the
sandbox flags and a way of confirming that the plugins understand
those bits before we can allow plugins inside sandboxed frames.


...but that API is outside the scope of what the W3C and the WhatWG 
currently do, so I think it would be great if defining this flag could 
be decoupled from progress on the plugin API layers.


Best regards, Julian


Re: [whatwg] Iframe Sandbox Attribute - allow-plugins?

2011-07-13 Thread Julian Reschke

On 2011-07-13 22:58, Adam Barth wrote:

On Wed, Jul 13, 2011 at 1:55 PM, Julian Reschke julian.resc...@gmx.de wrote:

On 2011-07-13 22:31, Adam Barth wrote:

Adding allow-plugins today would defeat the prevention of parent
redirection.

The short answer is we need an API for informing plugins of the
sandbox flags and a way of confirming that the plugins understand
those bits before we can allow plugins inside sandboxed frames.


...but that API is outside the scope of what the W3C and the WhatWG
currently do, so I think it would be great if defining this flag could be
decoupled from progress on the plugin API layers.


It is coupled in the sense that we can't implement the flag unless and
until such a plug-in API exists.


Yes, but we can *define* the flag in HTML and write down what it means 
with respect to plugin APIs.


Best regards, Julian


Re: [whatwg] EventSource - Handling a charset in the content-type header

2011-07-04 Thread Julian Reschke

On 2011-07-04 16:13, Anne van Kesteren wrote:

...
Are we sure we want this strict checking of media type parameters? I
always thought the media type itself was what strict checking should be
done upon, but that its parameters were extension points, not points of
failure.
...


The right thing for consistency with other uses of media types is to 
ignore unknown parameters.
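
In code, "ignore unknown parameters" comes down to comparing only the type/subtype part. A minimal Python sketch, with a made-up function name and text/event-stream used because that is the type EventSource expects:

```python
def is_event_stream(content_type: str) -> bool:
    """Check the media type itself and ignore all of its parameters."""
    # Everything after the first ';' is parameters (charset, or anything
    # unknown); only the type/subtype is compared, case-insensitively.
    essence = content_type.split(";", 1)[0].strip().lower()
    return essence == "text/event-stream"


print(is_event_stream("text/event-stream; charset=utf-8; foo=bar"))
print(is_event_stream("text/plain; charset=utf-8"))
```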


Best regards, Julian


Re: [whatwg] Content-Disposition property for a tags

2011-06-06 Thread Julian Reschke

On 2011-06-03 17:46, Bjartur Thorlacius wrote:

...

I strongly disagree.  I think browsers that use the Content-Disposition
filename for attachment but not inline are just buggy and should be
fixed.


FWIW MSIE9 seems to honor the filename hint with inline (contrary to
the test results mentioned earlier in the thread).
...


Hint: the test page has a feedback link.

That being said: I just tried 
http://greenbytes.de/tech/tc2231/inlwithasciifilename.asis and IE9 
seems to ignore the filename information.


Best regards, Julian


Re: [whatwg] Content-Disposition property for a tags

2011-06-03 Thread Julian Reschke

On 2011-06-03 14:23, Dennis Joachimsthaler wrote:

Am 03.06.2011, 10:23 Uhr, schrieb Eduard Pascual herenva...@gmail.com:


On Thu, Jun 2, 2011 at 10:09 PM, Dennis Joachimsthaler
den...@efjot.de wrote:

By the way, another point that we have to discuss:

Which tag should a browser favor. The one in HTTP or the other one in
HTML?


Is that really worth discussing? HTTP > HTML: whomever provides the
file should have the last say about how the file needs to be served,
regardless of what a site referencing to it may suggest.

Furthermore, when links point to URIs with any scheme other than
http:, whatever the scheme defines about how to deliver the file
takes precedence.

Thus, only in the lack of an actual Content-Disposition header, or its
equivalent on some other scheme, would the attribute given by the link
be used, just like an additional fallback step before whatever the
UA's default behaviour would be.
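
The fallback order described here can be sketched roughly as follows in Python. This is only an illustration of the precedence; the function name, the argument names and the final "download" default are assumptions, not anything a spec says.

```python
import posixpath
from typing import Optional
from urllib.parse import unquote, urlsplit


def pick_filename(header_filename: Optional[str],
                  attribute_filename: Optional[str],
                  url: str) -> str:
    """Choose a file name: header first, then link attribute, then the URL."""
    if header_filename:
        # The resource owner's Content-Disposition always wins.
        return header_filename
    if attribute_filename:
        # The hint given by the referring link is only a fallback.
        return attribute_filename
    # Last resort: the final path segment of the URL, or a fixed default.
    return posixpath.basename(unquote(urlsplit(url).path)) or "download"


print(pick_filename("report.pdf", "other.pdf", "http://example.com/x"))
```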


I agree that I shouldn't even have asked since this is actually a no-
brainer. I can't think of any good reason to overwrite the http header
with the html attribute.

Alright, so, moving on...


This grants the ability for any content provider to use an explicit
Content-Disposition: inline HTTP header to effectively block
download links from arbitrary sources.


True. Is it still so that some browsers ignore the filename part
of a content-disposition if an inline disposition is used?


Yes, see http://greenbytes.de/tech/tc2231/#inlwithasciifilename. 
Apparently only Firefox gets this right.



 ...


Best regards, Julian


Re: [whatwg] Content-Disposition property for a tags

2011-05-26 Thread Julian Reschke

On 2011-05-26 22:54, Dennis Joachimsthaler wrote:

Am 26.05.2011, 22:53 Uhr, schrieb Boris Zbarsky bzbar...@mit.edu:


Probably no one, to a first approximation, but we were specifically
talking about non-Windows systems. On Windows, as I said, Gecko forces
extensions to match content types, to avoid this sort of issue in
general.


Yep, yep... If browsers implement the filename (+ extension) name changing
we should make it a MUST to implement security...
...


Like 
http://greenbytes.de/tech/webdav/draft-ietf-httpbis-content-disp-latest.html#rfc.section.4.3?


Best regards, Julian


Re: [whatwg] element img with HTTP POST method

2010-12-10 Thread Julian Reschke

On 10.12.2010 01:46, Tab Atkins Jr. wrote:

...
Indeed.  You shouldn't be able to trigger POSTs from involuntary
actions.  They should always require some sort of user input, because
there is simply *far* too much naive code out there that is vulnerable
to CSRF.
...


Thanks, Tab.

It's sad that the discussion even got that far.

If the URI length is a problem because of browsers, fix the browsers to 
extend the limits, instead of adding a completely new feature.


Best regards, Julian


Re: [whatwg] Content-Disposition property for a tags

2010-12-07 Thread Julian Reschke

On 02.08.2010 18:56, Tab Atkins Jr. wrote:

2010/8/2 Kornel Lesiński kor...@geekhood.net:

Downloads can be forced already with Content-Disposition: attachment. It's 
just harder to do, and unfortunately that doesn't stop webmasters from trying. Popular 
PHP snippets for forcing download are among the most disgusting cargo-cult code I've ever 
seen: they're a collection of self-contradictory and nonsensical HTTP headers, break 
caching and resuming, and often have security vulnerabilities.

It would be great if we could obsolete those scripts.


It would be great if those scripts could just get fixed.


Indeed; I've used those code samples, and since the entire area is
basically voodoo to me, I still have no idea which headers I sent did
anything and which are useless or even harmful cruft.  In general,
even well-educated authors have no clue what they're doing here.


I believe the spec for C-D is sufficiently clear. But you still need to 
read it :-).


Best regards, Julian


Re: [whatwg] Content-Disposition property for a tags

2010-12-07 Thread Julian Reschke

On 06.08.2010 05:49, Bjartur Thorlacius wrote:

...
IMO there should be a standard metadata wrapper that should be around
virtually all files being passed around the Internet. Downloaders should
register the metadata to xattrs or somesuch and uploaders should collect
said metadata and rewrap it. Technically application/http could be used.
...


There is a widely deployed metadata wrapper; it's the HTTP message headers.

Best regards, Julian


Re: [whatwg] Content-Disposition property for a tags

2010-12-07 Thread Julian Reschke

On 07.12.2010 18:51, Dennis Joachimsthaler wrote:

Am 07.12.2010, 10:13 Uhr, schrieb Julian Reschke julian.resc...@gmx.de:



It would be great if those scripts could just get fixed.


Do you actually think that would HAPPEN? I think not. Better have people
get rid of them entirely. Though that wouldn't happen either.

I'm still all for such a property in <a href>s. I personally hate writing
scripts to do something so simple.

I think we could name it declaration of content. Why should HTTP, the
protocol underlying the HTML language, have to take care of declaration of
content? Shouldn't the HTML file itself have the power over that? We have a
lot of that already, like content types, etc. But we can not yet declare
content which is MEANT for downloading to your hard drive.

This is a big hole in my opinion.

 ...

I'm not against adding this in principle; but it shouldn't keep us from 
improving the situation for what's already there.


Having multiple ways to do the same thing causes real cost; you need to 
explain when to use what, and define which information takes priority.


Also; be sure to replicate what's needed from C-D, namely the filename 
information.


Has it ever been considered to use target=_download (just made up) for 
this?


Best regards, Julian


Re: [whatwg] Reserving XRI and URN in registerProtocolHandler

2010-11-26 Thread Julian Reschke

On 26.11.2010 05:20, Brett Zamir wrote:

I'd like to propose reserving two protocols for use with
navigator.registerProtocolHandler: urn and xri (or possibly xriNN
where NN is a version number).

See http://en.wikipedia.org/wiki/Extensible_Resource_Identifier for info
on XRI (basically allows the equivalents of URN but with a user-defined
namespace and without needing ICANN/IANA approval). Although it was


You don't need ICANN/IANA approval.

You can use informal URN namespaces, use a URN scheme that allows just 
grabbing a name (such as URN:UUID) *or* write a small spec; for the 
latter, the approval is *IETF* consensus (write an Internet Draft, then 
ask the IESG for publication as RFC).


Best regards, Julian


Re: [whatwg] Reserving XRI and URN in registerProtocolHandler

2010-11-26 Thread Julian Reschke

On 26.11.2010 11:54, Brett Zamir wrote:

...
My apologies for the lack of clarity on the approval process. I see all
the protocols listed with them, so I wasn't clear.

In any case, I still see the need for both types being reserved (and for
their subnamespaces targeted by the protocol handler), in that
namespacing is built into the XRI unlike for informal URNs which could
potentially conflict.
...


I'm still not sure what you mean by reserve and what that would mean 
for the spec and for implementations.


I do agree that the current definition doesn't work well for the urn 
URI scheme, as, as you observed, semantics depend on the first component 
(the URN namespace). Do you have an example for an URN namespace you 
actually want a protocol handler for?


Finally, I'd recommend not to open the XRI can-of-worms (see 
http://en.wikipedia.org/wiki/Talk:Extensible_Resource_Identifier).


Best regards, Julian


Re: [whatwg] Reserving XRI and URN in registerProtocolHandler

2010-11-26 Thread Julian Reschke

On 26.11.2010 16:55, Brett Zamir wrote:

On 11/26/2010 7:13 PM, Julian Reschke wrote:

On 26.11.2010 11:54, Brett Zamir wrote:

...
My apologies for the lack of clarity on the approval process. I see all
the protocols listed with them, so I wasn't clear.

In any case, I still see the need for both types being reserved (and for
their subnamespaces targeted by the protocol handler), in that
namespacing is built into the XRI unlike for informal URNs which could
potentially conflict.
...


I'm still not sure what you mean by reserve and what that would mean
for the spec and for implementations.


I just mean that authors should not use already registered protocols
except as intended, thinking that they can use any which protocol name
they like (e.g., the Urn Manufacturers Company using urn for its
categorization scheme).

I do agree that the current definition doesn't work well for the urn
URI scheme, as, as you observed, semantics depend on the first
component (the URN namespace). Do you have an example for an URN
namespace you actually want a protocol handler for?


ISBNs.


Oh, that's a good point. In particular, if the URN WG at some day makes 
progress with respect to retrieval.


So, would it be possible to write a generic protocolHandler for URN 
which itself delegates to more specific ones?
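
What I have in mind is roughly the following dispatch, sketched in Python rather than as a real registerProtocolHandler implementation. The handler table, the function name and the books.example URL are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical per-namespace handlers, keyed by URN namespace identifier.
NAMESPACE_HANDLERS: Dict[str, Callable[[str], str]] = {
    "isbn": lambda nss: f"https://books.example/isbn/{nss}",
}


def dispatch_urn(urn: str) -> str:
    """Split urn:<nid>:<nss> and delegate to the handler for <nid>."""
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError(f"not a URN: {urn}")
    # Scheme and namespace identifier are case-insensitive.
    handler = NAMESPACE_HANDLERS.get(nid.lower())
    if handler is None:
        raise LookupError(f"no handler registered for namespace {nid}")
    return handler(nss)


print(dispatch_urn("urn:isbn:0451450523"))
```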



...


BR, Julian


Re: [whatwg] Content-Disposition property for a tags

2010-09-26 Thread Julian Reschke

On 26.09.2010 12:39, Dennis Joachimsthaler wrote:

Hello,

I'd like to bring this back to attention.

I don't want this to be forgotten before anybody who is official
has said their definitive yes or no about it.

Or how else do new additions find their way into the draft?

Many were positive about this feature, so I don't want to let this sink
into oblivion.


If you want this to be tracked, you should open a ticket in the W3C bug 
tracker.


Best regards, Julian





Re: [whatwg] Proposal: add attributes etags last-modified to link element.

2010-09-20 Thread Julian Reschke

On 19.09.2010 22:33, Robert O'Callahan wrote:

...
So for example, page A links to resource B. The browser does a GET on A,
and receives a document containing a link to B, and the link element
has etags or last-modified attributes. The browser has a cached resource
for B, whose etags/last-modified matches the link attribute, so the
browser knows its cached B is valid and no further network transactions
are required.

The linked resource B having the right caching information in the first
place (when the browser first fetched it) isn't enough to eliminate the
need for an HTTP transaction to validate B later.
...


Well, it would if the caching information specifies an expiry time 
sufficiently in the future.


Best regards, Julian


Re: [whatwg] Proposal: add attributes etags last-modified to link element.

2010-09-20 Thread Julian Reschke

On 20.09.2010 02:37, Aryeh Gregor wrote:

...
Sure it would.  You can currently only save an HTTP request if a
future Expires header (or equivalent) can be sent.  A lot of the time,
the resource might change at any moment, so you can't send such a
header.  The client has to check every time, and get a 304, even if
the resource changes very rarely.  If you could indicate in the HTML
source that you know the resource hasn't changed, you could save a lot
of round-trips on a page that links to many resources.
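
For reference, the round-trip in question is a conditional GET, which a server answers with 304 (Not Modified) when the validator still matches. A simplified Python sketch of the server-side decision follows; the function names are made up, and splitting the header on commas is a shortcut that would break for entity tags containing commas.

```python
from typing import Optional


def _opaque(tag: str) -> str:
    """Drop the weak prefix so that W/"x" and "x" compare equal."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def status_for(current_etag: str, if_none_match: Optional[str]) -> int:
    """Return 304 if the client's cached entity tag is still current, else 200."""
    if if_none_match is None:
        return 200  # unconditional request, send the full body
    candidates = [_opaque(t) for t in if_none_match.split(",")]
    if "*" in candidates or _opaque(current_etag) in candidates:
        return 304  # cached copy is valid, no body needed
    return 200


print(status_for('"v1"', '"v1"'))
print(status_for('"v2"', '"v1"'))
```

Even the 304 case costs a full network round-trip, which is the cost this proposal tries to avoid.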

 ...

Resources that should be cached (stylesheets, images) but change at 
unexpected times are indeed a problem.


A well understood approach is to push some kind of version indicator 
into the URI (such as query parameter).
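
A minimal Python sketch of that approach, assuming a content hash is used as the version indicator. The helper name, the parameter name "v" and the 12-character digest length are arbitrary choices for this example.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def versioned_url(url: str, content: bytes, param: str = "v") -> str:
    """Append a content-derived version indicator as a query parameter."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    parts = urlsplit(url)
    # Drop any previous version parameter, keep everything else.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, digest))
    return urlunsplit(parts._replace(query=urlencode(query)))


old = versioned_url("/css/site.css", b"body { color: black }")
new = versioned_url("/css/site.css", b"body { color: navy }")
print(old)
print(new)
```

Because the URI changes whenever the content does, each version can be served with an expiry time far in the future and no revalidation is needed.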


Best regards, Julian




Re: [whatwg] Proposal: add attributes etags last-modified to link element.

2010-09-20 Thread Julian Reschke

On 20.09.2010 17:26, Mike Belshe wrote:

...
LINK, in general, allows a server to indicate to a client that it will
need a particular resource earlier than the client otherwise would have
discovered it.  Today, the LINK header doesn't assist with understanding

 ...

Sorry?

That may be a use case that *could* be implemented using LINK, but it's 
certainly *not* the general use case.


For instance, it doesn't seem to be true for any of the currently used 
link relations in wide use, such as icon or stylesheet (there's no 
later discovery at all).


Or are you referring to using the Link *header* in addition to an 
equivalent HTML LINK?



existing cache control mechanics, so if the browser does have the
resource in cache but it needs validation, you didn't accomplish what
you had hoped with the LINK header - the client is still going to make a
costly round-trip.  For savvy content authors, they could, as you
suggest, simply modify the content to work with this case.  This
effectively restricts the full benefit of LINK to the subset of
resources which are static and have long-lived expiry.  That would leave
LINK less useful to large swaths of the internet where they do leverage
if-modified-since and etags.


Link relations cover many other use cases than those that you seem to be 
considering.


For resources that change infrequently but at unexpected times, it's 
already possible to get what you want by varying the URI when the 
resource changes (such as putting a timestamp or a revision number into 
a query parameter).



Rather than ask this question about the LINK header attributes, you
could instead aim your question at HTTP - why does HTTP bother with
if-modified-since?  But the answer is moot - that decision was made
long ago.


Not sure what you're referring to. If-Modified-Since predates ETags (as 
far as I recall).



Given that the web *does* use these basic cache control mechanisms, why
*wouldn't* you want the LINK header to be capable of using them too?
  :-)  This proposal is actually just making LINK more like the rest of
HTTP.


My main concern is that if we put etags into *HTML* links, we're leaking 
protocol-level information into markup. I think it would be good if we 
could avoid that, and so far I haven't seen any use case that doesn't 
work without.


Best regards, Julian


Re: [whatwg] Proposal: add attributes etags last-modified to link element.

2010-09-20 Thread Julian Reschke

On 20.09.2010 18:17, Gavin Peters (蓋文彼德斯) wrote:

I think Mike was referring to the Link header.  This header is defined
in RFC 2068 (but not RFC 2616) in section 19.6.2.4
http://tools.ietf.org/html/rfc2068#section-19.6.2.4 , the most important
part of that text is probably that The Link field is semantically
equivalent to the LINK element in the HTML.  There's also a pending
internet draft which expands more fully on this header:
https://datatracker.ietf.org/doc/draft-nottingham-http-link-header/ ,
and that draft in the HTTP case maintains the HTML equivalence (see
section 5 of the internet draft).


I happen to be aware of the Link header, and the draft (which, by the 
way, was approved a few months ago and already is in the RFC Editor's 
publication queue).



I think the HTML link element is unusual because it does exist both in
markup, and at the protocol level.  My experimentation with these
attributes has been entirely at the protocol, and not the markup level.
  The standard for the element is in HTML, and so that's why I made my
proposal here in whatwg.


If we're talking about the link header primarily, I'd suggest you move 
over to the IETF HTTPbis Working Group's mailing list 
(http://lists.w3.org/Archives/Public/ietf-http-wg/).



...
Those approaches work; but require modifying the HTML.  So if a server
is attempting to have good protocol-level support for the Link header,
and to help a client avoid redundant fetches, we're now requiring
information leak from the protocol level down to the markup level.  I
think this problematic, too.  If the link element is going to work as
both a header and an element, it should have sufficient flexibility to
be useful and fully embedded in each application.
...
I think Mike was speaking about conditional gets generally, which can of
course be conditioned on ETag or Last-Modified.  Most web browsers, when
they have expired cache data, will make a conditional get based on their
existing cache entry.  If these attributes give a way to avoid this
extra request, and if these attributes enhance the protocol-level
context, why not support them?
...


The main reason would be additional complexity (IMHO). But if we're 
talking about HTTP this mailing list most certainly is not the right 
place to discuss this.


Later on Mike writes:


Yeah, I'm thinking of servers that can learn and auto-generate these headers.  
I think you're thinking of content authors plunking this into their HTML.


So, clarifying: you would send an *additional* Link header for the 
stylesheet relation, and augment it with the current etag?



I'd be perfectly happy to split these out of the HTML-link to the HTTP-link.  
Maybe it's time they be split up.


I think both should be consistent (like relation type names mean the 
same thing); but that doesn't necessarily mean that their feature sets 
need to be identical.


Best regards, Julian



Re: [whatwg] Proposal: add attributes etags last-modified to link element.

2010-09-19 Thread Julian Reschke

On 15.09.2010 19:45, Gavin Peters (蓋文彼德斯) wrote:

Hi, I'm working on link tags inside of chrome.  We're now experimenting
with an optimization that uses link tags and headers to avoid round
trips for cache validation in many cases.
...


Clarifying: essentially that's a workaround for resources for which the 
actual cache information returned by HTTP GET isn't accurate, right? 
Which of course leads to the question: if the maintainers of a site 
can't get their cache information right, what makes you think they can 
get their HTML right instead?


Best regards, Julian


Re: [whatwg] Proposal: add attributes etags last-modified to link element.

2010-09-19 Thread Julian Reschke

On 19.09.2010 20:47, Robert O'Callahan wrote:

2010/9/19 Julian Reschke julian.resc...@gmx.de:

On 15.09.2010 19:45, Gavin Peters (蓋文彼德斯) wrote:

Hi, I'm working on link tags inside of chrome.  We're now
experimenting
with an optimization that uses link tags and headers to avoid round
trips for cache validation in many cases.
...


Clarifying: essentially that's a workaround for resources for which
the actual cache information returned by HTTP GET isn't accurate,
right? Which of course leads to the question: if the maintainers of
a site can't get their cache information right, what makes you think
they can get their HTML right instead?

No, it's a performance optimization. I presume that if the link
attributes indicate that the browser's cached resource is valid, the
browser does not issue a network request to validate the resource.


:-)

So it's a workaround that causes a performance optimization. It wouldn't 
be necessary if the linked resource would have the right caching 
information in the first place.


So again: what makes you think they can get their HTML right instead?

Best regards, Julian


Re: [whatwg] Video with MIME type application/octet-stream

2010-09-14 Thread Julian Reschke

On 13.09.2010 23:51, Aryeh Gregor wrote:

...

And for heavens sake, do not specify any sniffing as official.
Instead, explicitly specify all sniffing as UA specific and possibly
suggest that UAs should inform the user that content is broken and the
current rendering is best effort if any sniffing is required.


This is totally incompatible with the compelling interoperability and
security benefits of all browsers using the exact same sniffing
algorithm.
...


Again, there's more than browsers. And even for video in browsers, the 
actual component playing the video may not be part of the browser at all.


So there's *much* more that would need to implement the exact same 
sniffing.


Has anybody talked to the people responsible for VLC, Windows Media 
Player, and Quicktime?


Best regards, Julian



Re: [whatwg] Video with MIME type application/octet-stream

2010-09-08 Thread Julian Reschke

On 07.09.2010 22:00, Boris Zbarsky wrote:

...

* If a file in a top-level browsing context is sniffed as video but
then some kind of error is returned before the video plays the first
frame, fall back to allowing the user to download it, or whatever the
usual action would be if no sniffing had occurred.


This might be pretty difficult to implement, since the video decoder
might consume arbitrary amounts of data before saying that there was an
error.
...


It's not that hard if it's acceptable to restart the network request 
(just do it again, with a flag not-to-sniff).
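
Roughly like this, as a Python sketch with invented names. The three callables stand in for the network layer, the video decoder and the usual non-sniffing action, and the sketch simply tries the decoder whenever sniffing is enabled.

```python
from typing import Callable


def open_resource(fetch: Callable[[], bytes],
                  play_video: Callable[[bytes], str],
                  default_action: Callable[[bytes], str],
                  sniff: bool = True) -> str:
    """Try the sniffed handling first; on decoder failure, refetch without sniffing."""
    body = fetch()
    if sniff:
        try:
            return play_video(body)
        except ValueError:
            # The decoder gave up, however much data it consumed: issue the
            # request again, this time with the not-to-sniff flag set.
            return open_resource(fetch, play_video, default_action, sniff=False)
    return default_action(body)
```

The price is a second request in the failure case, but no decoder state has to be unwound.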


Best regards, Julian


Re: [whatwg] Video with MIME type application/octet-stream

2010-09-07 Thread Julian Reschke

On 07.09.2010 11:51, And Clover wrote:

On 09/07/2010 03:56 AM, Boris Zbarsky wrote:


P.S. Sniffing is harder than you seem to think. It really is...


Quite. It surprises and saddens me that anyone wants to argue for *more*
sniffing, and even enshrining it in a web standard.


+1


Sniffing is a perpetual disaster that, after several security-sensitive
problems, web browsers have been moving to deprecate/mitigate. If
browsers want to guess types when no Content-Type is specified(*) then
fine, but there is no good reason to ignore an explicitly-set type. I
don't want my `application/octet-stream` file download service to be
repurposeable as a video player for some other party!


Hmm, that's what Content-Disposition: attachment is for...


...


Best regards, Julian


Re: [whatwg] Video with MIME type application/octet-stream

2010-09-07 Thread Julian Reschke

On 07.09.2010 12:52, Philip Jägenstedt wrote:

...
IE9, Safari and Chrome ignore Content-Type in a video context and rely
on sniffing. If you want Content-Type to be respected, convince the
developers of those 3 browsers to change. If not, it's quite inevitable
that Opera and Firefox will eventually have to follow.
...


We have heard that Safari sniffs for compatibility with content 
previously consumed by Quicktime, and that IE9 may sniff because they 
(currently) can't pass the content-type to the decoding machinery (or 
something like that).


So you really would have to standardize sniffing in the browsers, but 
also in the components they delegate video display to. Good luck with that.


Best regards, Julian


Re: [whatwg] Video with MIME type application/octet-stream

2010-09-01 Thread Julian Reschke

On 01.09.2010 10:12, Philip Jägenstedt wrote:

...
If we start ignoring the Content-Type I expect we would also add
sniffing so that opening a video served with the wrong (or missing)
Content-Type still works in a top-level browsing context, as it does for
images (I think).
...


Sniffing in the *absence* of a content type is fine. The interesting 
question is what to do when it's present, but wrong.
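To make the uncontroversial half concrete, here is a minimal sketch, assuming a tiny signature table and a fallback label that are purely illustrative. It sniffs only when no type was declared, and it takes one possible position on the "present, but wrong" case (the declared type wins) without claiming that this is what any browser does.

```python
from typing import Optional

# Illustrative magic numbers; the labels chosen for them are assumptions.
SIGNATURES = [
    (b"OggS", "application/ogg"),
    (b"\x1a\x45\xdf\xa3", "video/webm"),  # EBML header (Matroska/WebM family)
    (b"\x89PNG\r\n\x1a\n", "image/png"),
]


def sniff(body: bytes) -> Optional[str]:
    """Return a guessed type from the leading bytes, or None if nothing matches."""
    for magic, media_type in SIGNATURES:
        if body.startswith(magic):
            return media_type
    return None


def effective_type(content_type: Optional[str], body: bytes) -> str:
    """Respect a declared type; sniff only when the header is absent or empty."""
    if content_type:
        return content_type.split(";", 1)[0].strip().lower()
    return sniff(body) or "application/octet-stream"
```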


Best regards, Julian



Re: [whatwg] Video with MIME type application/octet-stream

2010-09-01 Thread Julian Reschke

On 01.09.2010 16:23, Philip Jägenstedt wrote:

...
Huh, I guessed incorrectly, neither serving a PNG as text/plain nor
text/html makes it be sniffed and rendered in a top-level browsing
context in Opera. However, both work in IE8.
...


Please don't say "work" when talking about something that's not supposed 
to happen...


Re: [whatwg] Video with MIME type application/octet-stream

2010-09-01 Thread Julian Reschke

On 01.09.2010 15:13, Brian Campbell wrote:

On Aug 31, 2010, at 9:40 AM, Boris Zbarsky wrote:


On 8/31/10 3:36 AM, Ian Hickson wrote:

You might say "Hey, but aren't you content sniffing then to find the
codecs?" and you'd be right. But in this case we're respecting the MIME
type sent by the server - it tells the browser to whatever level of
detail it wants (including codecs if needed) what type it is sending. If
the server sends 'text/plain' or 'video/x-matroska' I wouldn't expect a
browser to sniff it for Ogg content.


The Microsoft guys responded to my suggestion that they might want to
implement something like this with "what's the benefit of doing that?".


One obvious benefit is that videos with the wrong type will not work, and hence 
videos will be sent with the right type.


What makes you say this? Even if they are sent with the right type initially, 
the correct types are at high risk of bitrotting.

The big problem with MIME types is that they don't stick to files very well. 
So, while someone might get them working when they initially use video, if they 
move to a different web server, or upgrade their server, or someone mirrors 
their video, or any of a number of other things, they might lose the proper 
association of files and MIME types.
...


That's true, and the reason why people still use file extensions.

That's not super elegant, but it works.

Best regards, Julian


Re: [whatwg] Video with MIME type application/octet-stream

2010-08-31 Thread Julian Reschke

On 31.08.2010 09:36, Ian Hickson wrote:

From http://greenbytes.de/tech/webdav/rfc2046.html#rfc.section.1:

Parameters are modifiers of the media subtype, and as such do not
fundamentally affect the nature of the content. The set of meaningful
parameters depends on the media type and subtype. Most parameters are
associated with a single specific subtype. However, a given top-level
media type may define parameters which are applicable to any subtype of
that type. Parameters may be required by their defining media type or
subtype or they may be optional. MIME implementations must also ignore
any parameters whose names they do not recognize.

So, as "codecs" is not defined on application/octet-stream, the
parameter simply should be ignored, thus the advice [...]:

The MIME type application/octet-stream with no parameters is never a
type that the user agent knows it cannot render. User agents must treat
that type as equivalent to the lack of any explicit Content-Type
metadata when it is used to label a potential media resource.

Note: In the absence of a specification to the contrary, the MIME type
application/octet-stream when used with parameters, e.g.
application/octet-stream;codecs=theora, is a type that the user agent
knows it cannot render.

is incorrect, because it requires handling application/octet-stream
and application/octet-stream;codecs=theora differently.


That's not incorrect. The type with no parameters is a special case that
corresponds to a common configuration default. The case with parameters is
not that case, and represents likely intentional configuration and thus
clearly not a video format the UA supports.


My point is that it's incorrect to make this distinction, and that it's 
furthermore misleading to mention the "codecs" parameter in the context 
of a type that doesn't define it.
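The reading of RFC 2046 argued for here can be sketched as follows. The table of defined parameters is deliberately tiny and only illustrative, the function names are made up, and the parser is naive (it does not cope with semicolons inside quoted strings).

```python
# Parameters defined per media type (illustrative subset, not a registry).
DEFINED_PARAMETERS = {
    "application/octet-stream": {"type", "padding"},  # RFC 2046, section 4.5.1
    "video/ogg": {"codecs"},                          # RFC 5334
    "text/plain": {"charset"},                        # RFC 2046, section 4.1.2
}


def parse_media_type(value):
    """Split 'type/subtype; name=value' into (type, {name: value})."""
    parts = [p.strip() for p in value.split(";")]
    media_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        name, sep, val = part.partition("=")
        if sep:
            params[name.strip().lower()] = val.strip().strip('"')
    return media_type, params


def recognized(value):
    """Drop every parameter the media type does not define."""
    media_type, params = parse_media_type(value)
    known = DEFINED_PARAMETERS.get(media_type, set())
    return media_type, {k: v for k, v in params.items() if k in known}
```

Under this reading, `recognized("application/octet-stream;codecs=theora")` and `recognized("application/octet-stream")` give the same result, so the two labels cannot be treated differently.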



It's also not clear whether the note applies to all parameters or just
"codecs".


The normative text you quote doesn't mention any specific parameters.


In which case it would be a *bit* clearer if the note used a parameter 
that doesn't suggest that "codecs" has any meaning on a/o.



Regarding codecs= in particular, it's an implementation reality that
user agents that support it are likely to support it regardless of the
type, so there's really no point trying to maintain an artificial boundary
of which types it has semantics for and which it doesn't.


David Singer pointed out in 
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10202#c11 that this is 
the wrong thing to do.


Do you have any evidence that UAs already use "codecs" on types on which 
it isn't defined, *and*, if this is the case, that they can't be changed 
anymore?


Best regards, Julian


Re: [whatwg] Video with MIME type application/octet-stream

2010-08-31 Thread Julian Reschke

On 31.08.2010 15:57, Anne van Kesteren wrote:

...

Another is that when you save the video to disk the browser will fix
up the extension correctly, if needed.


If you sniff you can fix it up correctly too.
...


Then let's hope that sniffing doesn't recognize Windows binaries.

Best regards, Julian


Re: [whatwg] HTML6 Doctype

2010-08-29 Thread Julian Reschke

On 29.08.2010 05:15, David John Burrowes wrote:

Hello all,

I wanted to chime in on this discussion.  Let me say up front that clearly the 
w3c and the browser vendors all are on the same page as you, Ian.  I'm not in 
the position to be challenging your collective wisdom!
...


With respect to the W3C, that's far from clear.

Best regards, Julian


Re: [whatwg] base64 entities

2010-08-27 Thread Julian Reschke

On 27.08.2010 00:45, Adam Barth wrote:

...
Escaping just those characters is insufficient.  The appeal of this
approach is that authors don't need the right blacklist of dangerous
characters.  By the way, there are already folks doing something
similar manually now.  They send the untrusted bytes as base64 and
decode them using JavaScript.


That sounds like a good idea which doesn't have the deployment problem.
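For illustration, the server side of that manual approach might look like the sketch below (Python here; the helper names are made up). The point is that the base64 alphabet contains none of the characters that are special in HTML, so no blacklist is needed. In a page the decoding would be done by script in the browser; `decode_untrusted` only demonstrates the round trip.

```python
import base64

HTML_SPECIAL = set("<>&\"'")


def encode_untrusted(data: bytes) -> str:
    """Encode untrusted bytes so the result is inert when embedded in markup."""
    return base64.b64encode(data).decode("ascii")


def decode_untrusted(token: str) -> bytes:
    """Inverse of encode_untrusted; rejects characters outside the alphabet."""
    return base64.b64decode(token, validate=True)


payload = "</script><script>alert(1)</script>".encode("utf-8")
token = encode_untrusted(payload)
```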

 ...

On Thu, Aug 26, 2010 at 1:30 PM, Julian Reschke <julian.resc...@gmx.de> wrote:

I now get the point about the additional problems in script, but I fail to
see how the proposal addresses this, unless expanding these entities is
supposed to happen *after* parsing the script.


Yes.  That's precisely what happens.


Ok. To be clear: the same applies to HTML entities in text/html, but not 
for XML entities in application/xhtml+xml (because of the different 
handling of script content).


So, what's the implication for XHTML?

Best regards, Julian


Re: [whatwg] Validator.nu Bug: Error: XHTML element noscript not allowed as child of XHTML element head in this context.

2010-08-27 Thread Julian Reschke

On 27.08.2010 12:32, Hugh Guiney wrote:

Ah, thanks. I guess the error is just confusing then in that it calls
it XHTML element noscript, which led me to think that it was indeed
part of XHTML. I think some indication otherwise might prove
beneficial to users.

But, I thought XHTML5 was just an XML serialization of HTML5, so why
is this the case? I just read the rationale behind it, but despite not
being best practice shouldn't it be at the very least allowed?
...


The HTML WG is currently discussing whether it should be deprecated (in 
HTML), see http://www.w3.org/Bugs/Public/show_bug.cgi?id=10068.


If the outcome of this is that there are good use cases for noscript, 
I'd expect that it will also be allowed in XHTML.


Best regards, Julian


Re: [whatwg] base64 entities

2010-08-26 Thread Julian Reschke

On 25.08.2010 22:50, Adam Barth wrote:

== Summary ==
...


Not convinced. There's already one way to escape these things, and this 
is supported in all UAs.


I don't see how adding another mechanism will help those who can't use 
the first one properly. For instance, people unable to escape "<", ">" 
and "&" are likely also unable to get the UTF-8 conversion right.


Best regards, Julian


Re: [whatwg] base64 entities

2010-08-26 Thread Julian Reschke

On 26.08.2010 22:10, Aryeh Gregor wrote:

On Thu, Aug 26, 2010 at 5:58 AM, Julian Reschke <julian.resc...@gmx.de> wrote:

Not convinced. There's already one way to escape these things, and this is
supported in all UAs.


Adam gave two examples of cases where htmlspecialchars() is
insufficient, even if authors do use it.  This proposal is completely
general and will work anywhere, even in <script>.  Is automated
general escaping even possible right now in <script> for text/html?


I have to admit that I'm not sure what's special about script here. 
Are you saying that it's insufficient to escape all characters that have 
a special meaning there?


Server-wise, how is introducing a new escape mechanism any better than 
fixing the support code for the existing mechanism?


Best regards, Julian


