Re: [whatwg] questions on URL spec based on reviewing galimatias test results

2014-10-30 Thread Anne van Kesteren
On Wed, Oct 29, 2014 at 11:24 PM, Sam Ruby ru...@intertwingly.net wrote:
 http://intertwingly.net/projects/pegurl/urltest-results/d674c14cbe

 I'll note that galimatias doesn't produce a parse error in this case (and,
 in fact, the state machine specified by the current URL Standard goes down a
 completely different path for this case).

 The question is: should this be a parse error?

Yeah. The results also seem strange. I thought at least Chrome had
this behavior. Perhaps because Chrome was not running on Windows?


-- 
https://annevankesteren.nl/


Re: [whatwg] questions on URL spec based on reviewing galimatias test results

2014-10-30 Thread Sam Ruby

On 10/30/14 2:09 AM, Anne van Kesteren wrote:

On Wed, Oct 29, 2014 at 11:24 PM, Sam Ruby ru...@intertwingly.net wrote:

http://intertwingly.net/projects/pegurl/urltest-results/d674c14cbe

I'll note that galimatias doesn't produce a parse error in this case (and,
in fact, the state machine specified by the current URL Standard goes down a
completely different path for this case).

The question is: should this be a parse error?


Yeah. The results also seem strange. I thought at least Chrome had
this behavior. Perhaps because Chrome was not running on Windows?


Here is a screen capture of the live DOM URL viewer:

http://i.imgur.com/kbsTDQ7.png

Here are the test results for Chrome on Windows:

http://intertwingly.net/tmp/81cd494abd36509f0d46010b0c4d4ff9

It appears that Chrome implements this, but (a) only on Windows, and (b) 
only if the base scheme is file.


- Sam Ruby




Re: [whatwg] questions on URL spec based on reviewing galimatias test results

2014-10-30 Thread Anne van Kesteren
On Thu, Oct 30, 2014 at 12:06 PM, Sam Ruby ru...@intertwingly.net wrote:
 Here is a screen capture of the live DOM URL viewer:

 http://i.imgur.com/kbsTDQ7.png

 Here are the test results for Chrome on Windows:

 http://intertwingly.net/tmp/81cd494abd36509f0d46010b0c4d4ff9

 It appears that Chrome implements this, but (a) only on Windows, and (b)
 only if the base scheme is file.

Yeah, file URL parsing in browsers is OS-dependent. A goal of the URL
standard is to at least get parsing aligned so new URL() becomes
platform-independent. Implementers thus far have not been overtly
enthusiastic about that, but hopefully that will improve over time.
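
One way to probe that platform dependence is to resolve the same reference against a file base and a non-file base, then compare the output across browsers and operating systems. This is only a sketch: the reference, the two base URLs, and the helper name resolveAll are illustrative assumptions, not the inputs behind the links above.

```typescript
// Illustrative inputs (assumptions): a drive-letter-like reference and two bases.
const reference = "C|/foo/bar";
const bases = ["file:///tmp/mock/path", "http://example.org/foo/bar"];

// Hypothetical helper: resolve one reference against each base and
// return the serialized results in the same order as the bases.
function resolveAll(ref: string, baseList: string[]): string[] {
  return baseList.map((base) => new URL(ref, base).href);
}

const resolved = resolveAll(reference, bases);
for (const href of resolved) {
  // Compare these lines across platforms; they should match if parsing is aligned.
  console.log(href);
}
```

If parsing were fully aligned, the printed lines would be identical on every platform.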


-- 
https://annevankesteren.nl/


[whatwg] questions on URL spec based on reviewing galimatias test results

2014-10-29 Thread Sam Ruby

1) Is the following expected to produce a parse error:

http://intertwingly.net/projects/pegurl/urltest-results/4b60e32190 ?

My reading of https://url.spec.whatwg.org/#relative-path-state is that 
step 3.1 indicates a parse error even though later step 1.5.1 replaces 
the non URL code point with a colon.


My proposed reference implementation does not indicate a parse error 
with these inputs, but I could easily add it.


2) Is the following expected to produce a parse error:

http://intertwingly.net/projects/pegurl/urltest-results/bc6ea8bdf8 ?

I ask this because the error isn't defined here:
  https://url.spec.whatwg.org/#host-state

And the following only defines fatal errors (e.g. step 5):
  https://url.spec.whatwg.org/#concept-host-parser

My proposed reference implementation does indicate a parse error with 
these inputs, but this could easily be removed.


- Sam Ruby


Re: [whatwg] questions on URL spec based on reviewing galimatias test results

2014-10-29 Thread Anne van Kesteren
On Wed, Oct 29, 2014 at 12:12 PM, Sam Ruby ru...@intertwingly.net wrote:
 1) Is the following expected to produce a parse error:

 http://intertwingly.net/projects/pegurl/urltest-results/4b60e32190 ?

 My reading of https://url.spec.whatwg.org/#relative-path-state is that step
 3.1 indicates a parse error even though later step 1.5.1 replaces the non
 URL code point with a colon.

 My proposed reference implementation does not indicate a parse error with
 these inputs, but I could easily add it.

Given the legacy aspect, probably should be an error.


 2) Is the following expected to produce a parse error:

 http://intertwingly.net/projects/pegurl/urltest-results/bc6ea8bdf8 ?

What is the DNS violation supposed to mean?

I would expect this to change if we decide to parse any numeric host
name into IPv4. Then it would certainly be an error.


 And the following only defines fatal errors (e.g. step 5):
   https://url.spec.whatwg.org/#concept-host-parser

 My proposed reference implementation does indicate a parse error with these
 inputs, but this could easily be removed.

Fatal errors are just worse parse errors. The difference is that a
fatal error can be observed through an API.
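
A minimal sketch of that distinction, assuming an environment with the standard URL constructor (a browser or Node.js). The helper name failsToParse and the sample inputs are illustrative assumptions. A recoverable parse error leaves no trace in the API, while a fatal one surfaces as an exception.

```typescript
// Hypothetical helper: true when parsing fails fatally (the constructor throws).
function failsToParse(input: string, base?: string): boolean {
  try {
    new URL(input, base);
    return false;
  } catch (e) {
    return true;
  }
}

// Recoverable error: a backslash in an http URL is tolerated and normalized.
// Nothing in the API reports that an error occurred.
const recovered = new URL("http://example.com\\path");
console.log(recovered.href); // http://example.com/path

// Fatal error: a relative reference with no base cannot be parsed at all.
console.log(failsToParse("/just/a/path")); // true
```

Code that needs to see the recoverable errors has to rely on a parser that reports them separately, as the test results linked in this thread do.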


-- 
https://annevankesteren.nl/


Re: [whatwg] questions on URL spec based on reviewing galimatias test results

2014-10-29 Thread Sam Ruby

On 10/29/14 4:47 AM, Anne van Kesteren wrote:

On Wed, Oct 29, 2014 at 12:12 PM, Sam Ruby ru...@intertwingly.net wrote:


2) Is the following expected to produce a parse error:

http://intertwingly.net/projects/pegurl/urltest-results/bc6ea8bdf8 ?


What is the DNS violation supposed to mean?

I would expect this to change if we decide to parse any numeric host
name into IPv4. Then it would certainly be an error.


Here is another example (though it contains multiple parse errors):

http://intertwingly.net/projects/pegurl/urltest-results/f3382f1412

The error being reported is that the host contains consecutive dot 
characters (i.e., the 'label' between these characters is empty).


- Sam Ruby


Re: [whatwg] questions on URL spec based on reviewing galimatias test results

2014-10-29 Thread Anne van Kesteren
On Wed, Oct 29, 2014 at 1:26 PM, Sam Ruby ru...@intertwingly.net wrote:
 Here is another example (though it contains multiple parse errors):

 http://intertwingly.net/projects/pegurl/urltest-results/f3382f1412

 The error being reported is that the host contains consecutive dot
 characters (i.e., the 'label' between these characters is empty).

Yeah, that is clearly an error. Only the last label can be empty. I
suspect that part would fail somewhere in the IDNA code. The %3g bit
will fail too because % is not allowed.
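
The two checks described here can be sketched as a standalone function. This is an assumption-laden illustration rather than the IDNA code or the host parser itself: hostLabelErrors is a hypothetical helper, and its input is assumed to be a host string that has already been percent-decoded.

```typescript
// Hypothetical helper, not part of any URL library.
// Input is assumed to be a host that has already been percent-decoded.
function hostLabelErrors(host: string): string[] {
  const errors: string[] = [];
  const labels = host.split(".");
  labels.forEach((label, i) => {
    const isLast = i === labels.length - 1;
    // Only the last label may be empty (a trailing dot).
    if (label === "" && !isLast) {
      errors.push(`empty label at position ${i}`);
    }
  });
  // A "%" that survives percent-decoding (as in "%3g") is not allowed.
  if (host.includes("%")) {
    errors.push("'%' is not allowed in a host");
  }
  return errors;
}

console.log(hostLabelErrors("a..b"));         // one error: empty label at position 1
console.log(hostLabelErrors("example.com.")); // no errors: trailing dot is fine
console.log(hostLabelErrors("a%3gb"));        // one error: stray '%'
```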


-- 
https://annevankesteren.nl/


Re: [whatwg] questions on URL spec based on reviewing galimatias test results

2014-10-29 Thread Sam Ruby

On 10/29/14 4:47 AM, Anne van Kesteren wrote:

On Wed, Oct 29, 2014 at 12:12 PM, Sam Ruby ru...@intertwingly.net wrote:

1) Is the following expected to produce a parse error:

http://intertwingly.net/projects/pegurl/urltest-results/4b60e32190 ?

My reading of https://url.spec.whatwg.org/#relative-path-state is that step
3.1 indicates a parse error even though later step 1.5.1 replaces the non
URL code point with a colon.

My proposed reference implementation does not indicate a parse error with
these inputs, but I could easily add it.


Given the legacy aspect, probably should be an error.


Fixed:

https://github.com/rubys/url/commit/6789a5307ebd0e4aa05161c93038f2fc50011955

But it turns out that addressing that question opens up another 
question.  In my implementation that fix caused a (recoverable) parse 
error to be produced for another test case:


http://intertwingly.net/projects/pegurl/urltest-results/d674c14cbe

I'll note that galimatias doesn't produce a parse error in this case 
(and, in fact, the state machine specified by the current URL Standard 
goes down a completely different path for this case).


The question is: should this be a parse error?

- Sam Ruby