RE: Microsoft pre-LCWD feedback on WebSocket API

2010-01-30 Thread Ian Hickson
On Wed, 16 Dec 2009, Adrian Bateman wrote:
 On Thursday, December 03, 2009 5:20 PM, Ian Hickson wrote:
  On Thu, 19 Nov 2009, Adrian Bateman wrote:
  
   1) In the WebSocket constructor, we think it would make sense to 
   limit the optional protocol string to a set of characters that is 
   easier to secure. The UI never surfaces this string and having 
   something that doesn't require processing the full set of strings 
   into UTF-8 shouldn't be a significant restriction. We also think 
   that specifying length constraints would be useful. For example, 
   stipulating the minimum length that conforming implementations must 
   be able to handle. One suggestion was to re-use the same criteria as 
   a valid scheme name as defined in section 3.1 of RFC 3986.
  
  I don't see why we'd make it _that_ restricted, but I do agree that it 
  needs to be restricted to at least disallow non-ASCII characters and 
  CRLF, since that would break the handshake. I've updated the spec 
  accordingly.
 
 Our general feeling was that having it be fairly restrictive was 
 unlikely to be problematic and it is easier to relax the constraints if 
 it becomes apparent that it is necessary than to try to constrain 
 further in future.

I don't think that's really a good reason to restrict things. XML tried 
that approach (over-restrict and relax later) and it caused all kinds of 
problems.

Generally speaking, it's when we start having extra conditions that 
implementation mistakes creep in, with the associated lack of 
interoperability, and potentially even security bugs. I'd really much 
rather keep everything really simple than have arbitrary restrictions.


   3) The spec uses many terms that the HTML5 spec defines. As far as I 
   can tell, there isn't a definitive list of these. The 2.1 
   dependencies section notes that many concepts come from HTML5 but 
   not saying which ones seems insufficient for a spec moving to Last 
   Call. Most of the people who looked at this spec had never looked at 
   HTML5 before and their feedback was simply that many terms were 
   undefined.
  
  I recommend using the WHATWG complete.html version of the spec, 
  which integrates all of HTML5 and both the Web Sockets API and Web 
  Socket protocol specs (and a few others) into a single document:
  
  http://www.whatwg.org/specs/web-apps/current-work/complete.html#network
  
  The cross-references in that document mean that all the terms defined 
  in HTML5 are clearly referenced.
  
  I am hoping that we will be able to make the postprocessor generate 
  appropriate cross-references even in the case of the split specs.
 
 This seems like something that should be done before the spec proceeds.

gsnedders has said he's working on this, so I'll wait until he's done and 
then try to apply his work to this spec.

Failing that, would the approach I recently used in the Microdata draft be 
acceptable?

   http://dev.w3.org/html5/md/#terminology

It makes the cross-references in the spec all go to a single terminology 
section, which thus disambiguates which terms are defined in HTML5 and 
which are not.

Notwithstanding the way the spec is split at the W3C and the IETF, this 
really is all just part of the complete.html spec at the WHATWG. I don't 
want to duplicate the definitions everywhere, since then they'd start 
getting out of sync, which is a recipe for disaster.


   4) In step 2 of the UA steps for the WebSocket() constructor, the 
   spec calls for an exception to be thrown if the user agent chooses 
   to block a well-known port. We think that web developers will often 
   miss this case because it will be hard to test the error case and 
   may be an unusual failure. We propose that the UA signal this 
   condition in the same way as failing to connect to a service which 
   will be much more common and more likely to be handled by web 
   developers.
  
  Wouldn't this error be caught very early in development, when the 
  connection just wasn't established?
  
  It seems unlikely that non-hostile authors will ever run into this 
  anyway, since they shouldn't be using these ports.
 
 It's not clear that all user agents would impose the same rules so 
 there's no guarantee this would be caught. It's entirely possible 
 someone might legitimately host a WebSocket service on a well-known port 
 believing this to be an appropriate strategy without realising that some 
 user agents might block that.
 
 Our feedback is that it is a best practice in library design to 
 encourage developers to handle unusual cases without the need to write 
 extra code. It is common for developers to measure the success of their 
 testing with code coverage - we prefer to avoid trying to get our test 
 teams to have to set up weird test beds to construct obscure test cases 
 to test unusual scenarios.

I don't really see how firing onclose is going to help in this case. If 
the author isn't going to catch this before shipping, why is it going to 
make 

Re: Microsoft pre-LCWD feedback on WebSocket API

2009-12-03 Thread Ian Hickson
On Thu, 19 Nov 2009, Adrian Bateman wrote:
 
 1) In the WebSocket constructor, we think it would make sense to limit 
 the optional protocol string to a set of characters that is easier to 
 secure. The UI never surfaces this string and having something that 
 doesn't require processing the full set of strings into UTF-8 shouldn't 
 be a significant restriction. We also think that specifying length 
 constraints would be useful. For example, stipulating the minimum length 
 that conforming implementations must be able to handle. One suggestion 
 was to re-use the same criteria as a valid scheme name as defined in 
 section 3.1 of RFC 3986.

I don't see why we'd make it _that_ restricted, but I do agree that it 
needs to be restricted to at least disallow non-ASCII characters and CRLF, 
since that would break the handshake. I've updated the spec accordingly.
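
To make the difference between the two candidate rules concrete, here is a minimal sketch. It is not taken from the spec text; the function names are made up for illustration, and whether ASCII control characters other than CR and LF are allowed is an assumption (this sketch permits them). The scheme grammar is the one from section 3.1 of RFC 3986.

```typescript
// RFC 3986 section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
const SCHEME_LIKE = /^[A-Za-z][A-Za-z0-9+.-]*$/;

// Stricter proposal: the protocol string must look like a URI scheme name.
// (Hypothetical helper, not a spec-defined function.)
function isSchemeLike(protocol: string): boolean {
  return SCHEME_LIKE.test(protocol);
}

// Looser rule: ASCII only, and no CR or LF, since either would break the
// line-based handshake. Assumption: other control characters are permitted.
function isAsciiWithoutCrLf(protocol: string): boolean {
  for (let i = 0; i < protocol.length; i++) {
    const code = protocol.charCodeAt(i);
    if (code > 0x7f || code === 0x0d || code === 0x0a) {
      return false;
    }
  }
  return true;
}
```

A string such as "chat v1" passes the looser check but not the scheme-like one, which is the kind of value the stricter rule would exclude.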


 2) The second comment about the protocol string is editorial. There was 
 quite a lot of confusion about what the protocol string is used for and 
 whether a central registry would be needed for well-known protocol 
 strings. I don't believe this is intended or necessary but this suggests 
 that the language could be clearer. Perhaps an informative section 
 describing the expected use of the protocol string might be included.

Done (as an intro section in the protocol spec).


 3) The spec uses many terms that the HTML5 spec defines. As far as I can 
 tell, there isn't a definitive list of these. The 2.1 dependencies 
 section notes that many concepts come from HTML5 but not saying which 
 ones seems insufficient for a spec moving to Last Call. Most of the people 
 who looked at this spec had never looked at HTML5 before and their 
 feedback was simply that many terms were undefined.

I recommend using the WHATWG complete.html version of the spec, which 
integrates all of HTML5 and both the Web Sockets API and Web Socket
protocol specs (and a few others) into a single document:

   http://www.whatwg.org/specs/web-apps/current-work/complete.html#network

The cross-references in that document mean that all the terms defined in 
HTML5 are clearly referenced.

I am hoping that we will be able to make the postprocessor generate 
appropriate cross-references even in the case of the split specs.


 4) In step 2 of the UA steps for the WebSocket() constructor, the spec 
 calls for an exception to be thrown if the user agent chooses to block a 
 well-known port. We think that web developers will often miss this case 
 because it will be hard to test the error case and may be an unusual 
 failure. We propose that the UA signal this condition in the same way as 
 failing to connect to a service which will be much more common and more 
 likely to be handled by web developers.

Wouldn't this error be caught very early in development, when the 
connection just wasn't established?

It seems unlikely that non-hostile authors will ever run into this anyway, 
since they shouldn't be using these ports.
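
For reference, the two signalling paths being compared look roughly like this from the author's side. This is only a sketch under stated assumptions: SocketLike, SocketFactory and openWithSingleFailurePath are hypothetical names, and the factory stands in for the real constructor so that the blocked-port case can be simulated.

```typescript
// Hypothetical minimal shape of a socket object, for illustration only.
interface SocketLike {
  onclose: (() => void) | null;
}

// Stand-in for `new WebSocket(url)` so that a test can simulate failures.
type SocketFactory = (url: string) => SocketLike;

// Funnels both failure paths into one callback: an exception thrown at
// construction time (e.g. a blocked port) and a later close event.
function openWithSingleFailurePath(
  create: SocketFactory,
  url: string,
  onFailure: (reason: string) => void
): SocketLike | null {
  try {
    const socket = create(url);
    socket.onclose = () => onFailure("closed");
    return socket;
  } catch {
    onFailure("constructor threw");
    return null;
  }
}
```

An author who only wires up onclose never sees the first path; the try/catch is the extra code the exception-based design asks for.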


 5) It is not clear precisely where the 'fail the Web Socket connection 
 algorithm' is defined.

Section 4.3, "Closing the connection", of the Web Socket protocol spec.


 6) The send() method can both throw an exception in the CONNECTING state 
 or return an 'error' flag if in the CLOSED state. APIs that both have 
 return values and also throw exceptions commonly cause coding errors by 
 developers using them. For example, here web developers may fail to deal 
 with the CONNECTING state because on their test service they get an 
 almost immediate connection but once they deploy hitting this case 
 becomes much more common. We recommend choosing between exceptions or 
 return values but not both.

The exceptions are thrown for cases where there is a logic error, and the 
return value (not an error code, just the connection status) is used to 
handle expected events such as network errors.

Using exceptions for network errors is a bad idea because it would mean 
any use of the API would have to use exception handling.

Using a more elaborate return value to report logic errors also would IMHO 
not really lead to a clearer programming model in this case, since authors 
wouldn't be looking for those errors either, and would likely just treat 
it as a connection failure, leading to trying to reconnect, which would 
then cause increased load on the server -- an especially bad result, since 
the slow connection is likely to be caused by an overloaded server!

Using exceptions here sidesteps this since it is not expected that authors 
will catch the exception, and thus it will just report an error on the 
error console (useful for development) and abort the script, without 
preventing the connection from being established.
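
As a rough model of the split described above (exception for a logic error, return value for connection status), consider the sketch below. DraftSocketModel is a hypothetical class written for this illustration, not the real interface; the boolean polarity (false once the connection is closed) and the error message are assumptions.

```typescript
type ReadyState = "CONNECTING" | "OPEN" | "CLOSED";

// Hypothetical stand-in for the draft's send() behaviour as discussed here.
class DraftSocketModel {
  readyState: ReadyState = "CONNECTING";
  sent: string[] = [];

  send(data: string): boolean {
    if (this.readyState === "CONNECTING") {
      // Logic error: the script called send() before the connection opened.
      // Uncaught, this aborts the script and shows up in the error console.
      throw new Error("INVALID_STATE_ERR");
    }
    if (this.readyState === "CLOSED") {
      // Expected event (e.g. a network error): report status, do not throw.
      return false;
    }
    this.sent.push(data);
    return true;
  }
}
```

With this shape, ordinary callers check the return value and never need a try/catch; the exception only fires when the calling code itself is wrong.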


 7) It is not clear exactly how to implement the bufferedAmount property 
 and be interoperable. Where is the queue of bytes not yet sent? Is this 
 at the application layer, in the networking stack, on the network card, 
 or somewhere else?

Any of the 

Microsoft pre-LCWD feedback on WebSocket API

2009-11-19 Thread Adrian Bateman
Apologies for only sending this at the deadline. I have been collecting 
feedback from a number of different groups at Microsoft who have been reviewing 
the WebSockets API spec and only had a chance to collate it today.

Feedback on Web Sockets API (draft dated 29 October 2009)

1) In the WebSocket constructor, we think it would make sense to limit the 
optional protocol string to a set of characters that is easier to secure. The 
UI never surfaces this string and having something that doesn't require 
processing the full set of strings into UTF-8 shouldn't be a significant 
restriction. We also think that specifying length constraints would be useful. 
For example, stipulating the minimum length that conforming implementations 
must be able to handle. One suggestion was to re-use the same criteria as a 
valid scheme name as defined in section 3.1 of RFC 3986.

2) The second comment about the protocol string is editorial. There was quite a 
lot of confusion about what the protocol string is used for and whether a 
central registry would be needed for well-known protocol strings. I don't 
believe this is intended or necessary but this suggests that the language could 
be clearer. Perhaps an informative section describing the expected use of the 
protocol string might be included.

3) The spec uses many terms that the HTML5 spec defines. As far as I can tell, 
there isn't a definitive list of these. The 2.1 dependencies section notes that 
many concepts come from HTML5 but not saying which ones seems insufficient for 
a spec moving to Last Call. Most of the people who looked at this spec had never 
looked at HTML5 before and their feedback was simply that many terms were 
undefined.

4) In step 2 of the UA steps for the WebSocket() constructor, the spec calls 
for an exception to be thrown if the user agent chooses to block a well-known 
port. We think that web developers will often miss this case because it will be 
hard to test the error case and may be an unusual failure. We propose that the 
UA signal this condition in the same way as failing to connect to a service 
which will be much more common and more likely to be handled by web developers.

5) It is not clear precisely where the 'fail the Web Socket connection 
algorithm' is defined.

6) The send() method can both throw an exception in the CONNECTING state or 
return an 'error' flag if in the CLOSED state. APIs that both have return 
values and also throw exceptions commonly cause coding errors by developers 
using them. For example, here web developers may fail to deal with the 
CONNECTING state because on their test service they get an almost immediate 
connection, but once they deploy, hitting this case becomes much more common. We 
recommend choosing between exceptions or return values but not both.

7) It is not clear exactly how to implement the bufferedAmount property and be 
interoperable. Where is the queue of bytes not yet sent? Is this at the 
application layer, in the networking stack, on the network card, or somewhere 
else? We propose removing the bufferedAmount property.

I think we will have some other feedback more related to the wire protocol than 
the API although changes to the protocol could potentially impact the API. I'm 
not sure how the working group plans to handle this interaction between the API 
draft and discussions elsewhere about the protocol (I understand there is a 
proposal to deal with the protocol in an IETF working group?).

Cheers,

Adrian.

-----Original Message-----
From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On 
Behalf Of Arthur Barstow
Sent: Wednesday, November 04, 2009 5:46 AM
To: public-webapps
Subject: Seeking pre-LCWD comments for: Server-sent Events, Web {Database, 
Sockets, Storage, Workers}; deadline 19 November

As noted on 23 October [1], the following HTML5 APIs are ready or  
very close to being ready for Last Call Working Draft (LC):

1. Server-Sent Events
   http://dev.w3.org/html5/eventsource/

2. Web Database
   http://dev.w3.org/html5/webdatabase/

3. Web Sockets API
   http://dev.w3.org/html5/websockets/

4. Web Storage
   http://dev.w3.org/html5/webstorage/

5. Web Workers
   http://dev.w3.org/html5/workers/

Please submit comments about these specs by 19 November.

Note the Process Document states the following regarding the  
significance/meaning of LCWD:

[[
http://www.w3.org/2005/10/Process-20051014/tr.html#last-call
Purpose: A Working Group's Last Call announcement is a signal that:

* the Working Group believes that it has satisfied its relevant  
technical requirements (e.g., of the charter or requirements  
document) in the Working Draft;

* the Working Group believes that it has satisfied significant  
dependencies with other groups;

* other groups SHOULD review the document to confirm that these  
dependencies have been satisfied.
In general, a Last Call announcement is also a signal that the  
Working Group is planning to advance the technical report