Re: [whatwg] Web Sockets

Shannon Tue, 22 Jul 2008 23:18:15 -0700

Philipp Serafin wrote:

Asynchronous: Requests and responses can be pipelined, meaning requests and
responses can be transmitted simultaneously and are queued.



I think the problem is that this definition of "asynchronous" is very
narrow. Yes, you don't need to wait for a request to finish before you
issue a new one. But you'd still be bound to HTTP's request/response
scheme in general.

WebSockets uses HTTP so it is hardly immune to the request/response
behaviour of its underlying protocol (including the stream nature of TCP).

Besides, this statement appears to be based on the assumption that the
server MUST wait for additional client requests to send each "message".
However, the specification allows the server to send "chunked" or
"multipart" data in a variety of ways, so full asynchronous communication
is achievable by making the response chunks part of one long HTTP
multipart response and allowing the JavaScript API to access the
incoming data while the response is incomplete. It can be simplified for
the end-user by allowing the raw server response to be read by bytes,
lines or "parts". If you need more channels (i.e., to pass messages while
sending a large file) then you simply open more WebSockets and let the
server handle multiplexing via cookies or message ids.
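
As a rough illustration of reading a response while it is still
incomplete, here is a minimal Python sketch that yields each chunk of a
"Transfer-Encoding: chunked" body as soon as it has arrived. The name
iter_chunks is made up for this example, and `stream` is assumed to be
any binary file-like object (for instance the result of
socket.makefile("rb")). Chunk extensions and trailers are skipped rather
than interpreted.

```python
import io


def iter_chunks(stream):
    """Yield the payload of each chunk in a chunked HTTP body, in order."""
    while True:
        size_line = stream.readline()
        if not size_line:
            raise EOFError("stream ended before the terminating chunk")
        # The chunk size is hexadecimal; anything after ';' is a chunk
        # extension, which this sketch ignores.
        size = int(size_line.split(b";", 1)[0].strip().decode("ascii"), 16)
        if size == 0:
            # Terminating chunk: skip optional trailer lines up to the
            # blank line (or end of stream).
            while stream.readline() not in (b"\r\n", b"\n", b""):
                pass
            return
        data = stream.read(size)
        if len(data) < size:
            raise EOFError("stream ended in the middle of a chunk")
        stream.read(2)  # consume the CRLF that follows the chunk data
        yield data


if __name__ == "__main__":
    body = b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n"
    for chunk in iter_chunks(io.BytesIO(body)):
        print(chunk)
```

Because this is a generator, the caller can act on each "message" as it
arrives instead of waiting for the response to finish.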

However, web authors might want to employ other schemes as well, for
example server-sided asynchronous notifications ("pushing"),

Client opens a connection, handshakes, then leaves the connection open
listening for "pushed" parts in the response stream.

client-sided notifications that don't need to be replied


HTTP has a status code specifically for this purpose.

 or requests that can be answered out-of-order.

The order of data is controlled by the order it is sent by the
application. There is no requirement for the requests and
responses/parts to be synchronised, especially if the server responses
consist of a single multi-part "response".

I'm not advocating against WebSockets, just its current definition. In
particular it tries to solve things that HTTP/1.1 already handles. I
believe we should be thinking of WebSockets as a JavaScript API, not a
new communications protocol, for the simple reason that HTTP is already a
very suitable and widely deployed protocol. What authors (especially
AJAX authors) are missing is a reliable way to use HTTP's existing
asynchronous connection support.


Here are my issues with WebSockets as currently defined:

1.) Request must have a <scheme> component whose value is either "ws" or
"wss"


The "scheme" should be HTTP(S). WebSockets should be the API.

2.) The message event is fired when data is received for a
connection.

What "data"? A byte, a line, a chunk, the whole response? The spec isn't
clear. I'd also recommend adding a connection.read( max_bytes ) method
as used by Python and most languages to let the author receive bytes at
a frequency appropriate to the application (e.g., a game might want to
frequently poll for small updates).
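
For comparison, this is roughly what such a read looks like with
Python's standard socket module. It is only a sketch of the polling
style described above: read_available is a hypothetical helper, not a
proposed API, and socket.socketpair() stands in for a real connection.

```python
import socket


def read_available(conn, max_bytes, timeout=0.0):
    """Return up to max_bytes from conn, or None if nothing arrived in time.

    An empty bytes object means the peer closed the connection. Note that
    this changes the socket's timeout setting as a side effect.
    """
    conn.settimeout(timeout)  # 0.0 makes the socket non-blocking
    try:
        return conn.recv(max_bytes)
    except (BlockingIOError, socket.timeout):
        return None


if __name__ == "__main__":
    a, b = socket.socketpair()
    b.sendall(b"abcdefgh")
    print(read_available(a, 3))    # at most 3 bytes
    print(read_available(a, 100))  # whatever else is buffered
    print(read_available(a, 100))  # nothing pending
    a.close()
    b.close()
```

A game loop could call such a function once per frame with a small
max_bytes and a zero timeout, while a file transfer might block with a
larger buffer.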

3.) If the resulting absolute URL has a <port> component, then let port
be that component's value; otherwise, if secure is false, let port be
81, otherwise let port be 815.

No, no, no! Don't let paranoia override common sense. Not all websocket
applications will have the luxury to run on these ports (multiple web
servers, shared host, tunnelled connections, 2 websocket apps on one
host, etc...).


4.) The whole handshake is too complex.

There are many firewalls, proxies and servers that legitimately insert,
change, split, or remove HTTP headers or modify their order. This is
also likely if the service being provided sits on top of a
framework/server (such as Coldfusion/IIS). Also what happens if HTTP/1.2
is sent? These will break the WebSocket handshake as currently defined.

RFC 2616 section 3 says: The version of an HTTP message is indicated by
an HTTP-Version field in the first line of the message.

HTTP-Version   = "HTTP" "/" 1*DIGIT "." 1*DIGIT

How many non-HTTP servers send this? Probably none. I recommend the
handshake simply read about 256 bytes of the server response and check
that it contains a valid HTTP version field and one or more valid HTTP
headers and optional end-of-headers marker. If ALL headers semantically
validate (i.e., with the regex [A-Za-z-]: \s(.*?)\r\n) then it is
reasonably safe to assume it is a real HTTP or WebSockets service. If
the headers validate then the client repeats the process until the
message body is reached. At this point we check our collected headers
for at least one "Accept" header containing "WebSocket" (assuming that
being a "WebSocket" rather than a basic pipelined HTTP connection is
even useful).
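
The check described above could be sketched in Python along these lines.
The function names are invented for the example, and the header pattern
is my own slightly stricter variant of the regex quoted above (it
requires at least one name character), so treat it as an assumption
rather than a proposed rule.

```python
import re

# HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT (RFC 2616), so HTTP/1.2
# is accepted as well as HTTP/1.1.
HTTP_VERSION = re.compile(rb"HTTP/\d+\.\d+")
# Assumed header shape: a name of letters and hyphens, a colon, a value.
HEADER_LINE = re.compile(rb"([A-Za-z-]+):[ \t]*(.*)")


def parse_response_head(data):
    """Return (headers, complete) if data looks like HTTP, else None.

    `complete` is True once the end-of-headers marker has been seen.
    """
    head, sep, _body = data.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    if not sep:
        # No end-of-headers marker yet: the final piece may be a header
        # cut off mid-line, so it is not validated.
        lines = lines[:-1]
    if not lines or not HTTP_VERSION.match(lines[0]):
        return None
    headers = []
    for line in lines[1:]:
        m = HEADER_LINE.fullmatch(line)
        if m is None:
            return None  # ALL headers must validate
        name = m.group(1).decode("ascii").lower()
        headers.append((name, m.group(2).decode("latin-1")))
    return headers, bool(sep)


def accepts_websocket(headers):
    """True if at least one Accept header mentions WebSocket."""
    return any(n == "accept" and "websocket" in v.lower() for n, v in headers)


if __name__ == "__main__":
    first_bytes = (b"HTTP/1.2 200 OK\r\nServer: demo\r\n"
                   b"Accept: WebSocket\r\n\r\n")[:256]
    print(parse_response_head(first_bytes))
    print(parse_response_head(b"SSH-2.0-OpenSSH_4.7\r\n"))
```

If `complete` comes back False, the client would read another block and
repeat the check, as described above. Header order and any extra headers
inserted by proxies do not matter to this test.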


5.) URI parsing specification

The current proposal spells out the URI/path parsing scheme. However
this should be treated EXACTLY like HTTP so the need to define it in the
spec is redundant. It is enough to say that the resource may be
requested using a GET or POST request. Same with cookie handling,
authorization and other HTTP headers. These should be handled by the
webserver and/or application exactly as normal; there is no need to
rewrite the rules simply because the information flow is asynchronous.


6.) Data framing specification

Redundant because HTTP already provides multiple methods of data segment
encapsulation including "Content-Length", "Transfer-Encoding" and
"Content-Type". Each of these has sub-types suitable for a range of
possible WebSocket applications. Naturally it is not necessary for the
client or server to support them all since there are HTTP headers
explicitly designed for this kind of negotiation. The WebSocket should
however define at least one fallback method that can be relied on (I
recommend "Content-Length", "Transfer-Encoding: chunked" and
"Content-Type: multipart/form-data" as MUST requirements).
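
To show that multipart framing needs no new machinery, here is a small
Python sketch that splits a multipart body into its parts using the
standard library's email parser. split_parts is a hypothetical helper,
and the boundary and part contents are invented for the demonstration.
It assumes the whole body is already in memory, so it illustrates the
framing only, not incremental delivery.

```python
from email import message_from_bytes


def split_parts(content_type, body):
    """Split a body into parts according to its Content-Type header value.

    Non-multipart bodies come back as a single-element list.
    """
    # Rebuild a minimal MIME message so the stdlib parser can use the
    # boundary parameter carried in the Content-Type value.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    msg = message_from_bytes(raw)
    if not msg.is_multipart():
        return [body]
    # get_payload(decode=True) returns each part's content as bytes
    # (it would return None for a part that is itself multipart).
    return [part.get_payload(decode=True) for part in msg.get_payload()]


if __name__ == "__main__":
    body = (b"--sep\r\nContent-Type: text/plain\r\n\r\nfirst\r\n"
            b"--sep\r\nContent-Type: text/plain\r\n\r\nsecond\r\n"
            b"--sep--\r\n")
    for part in split_parts("multipart/mixed; boundary=sep", body):
        print(part)
```

The same negotiation headers that select this framing could equally
select "Content-Length" or chunked transfer for simpler applications.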


7.) WebSockets needs a low-level interface as well

By "dumbing down" the data transfer into fired events and wrapping the
data segments internally the websocket hides the true communication
behind an abstract object. This is a good thing for simplicity but
extremely limiting for authors wanting to fine-tune an application or
adapt to future protocols. I strongly recommend that rawwrite() and
rawread() methods be made available to an OPEN (i.e.,
authenticated/handshaked) websocket to allow direct handling of the
stream. It would be understood that authors using these methods must
understand the nature of both HTTP and websockets. In the same way a
settimeout() method should be provided to control blocking/non-blocking
behaviour. I can't stress enough how important these interfaces are, as
they may one day be required to implement WebSockets 2.0 on "legacy" or
broken HTML5 browsers.


8.) Origin: / WebSocket-Origin:

Specifying clients allowed to originate a connection is a disaster
waiting to happen for the simple reason that sending your origin is a
privacy violation in the same vein as the referrer field. Any
open-source browser or privacy plugin will simply disable or spoof this
since it would allow advertising networks to track people by ad-serving
via websockets. Such tracking undermines the security of anonymising
proxies (as the "origin" may be a private site or contain a client id).
Using origin as a required field essentially makes the use of "referrer"
mandatory. If a websocket wants to restrict access then it will have to
use credentials or IP ranges like everything else.


9.) WebSocket-Location

The scenario this is supposed to solve (that an application makes a
mistake about what host it's on and somehow sends the wrong data) is
contrived. What's more likely to happen is that a server application has
trouble actually knowing its (virtual) hostname (due to a proxy,
mod_rewrite, URL masking or other legitimate redirect) and therefore NO
clients can connect. It isn't uncommon for the host value passed to a
CGI script and the hostname returned by the environment (i.e., via uname
or OS library) to conflict. Then there is the matter of an SSL
connection (no host header available). I'm having trouble determining
why this should even matter. I suspect most simple applications/wrappers
will just echo back the host header sent by the client so if a mistake
is made it's likely to go unnoticed anyway.

10.) To close the Web Socket connection, either the user agent or the
server closes the TCP/IP connection. There is no closing handshake.

HTTP provides a reliable way of closing a connection so that all parties
(client, server and proxies) know why the connection ended. There is no
reason for websockets to not follow this protocol and close the
connection properly.

In conclusion, the current specification of WebSockets re-invents
several wheels and does so in ways that are overly complex, error-prone
and yet seriously limited in functionality. The whole concept needs to
be approached from the position of making HTTP's features (which are
already implemented in most UAs) available to JavaScript (while
preventing the exploitation of non-HTTP services). I do not believe this
is difficult if my recommendations above are followed. I do not wish to
be overly critical without contributing a solution, so if there are no
serious objections to the points I've made I will put time into
reframing my objections as a complete specification proposal.



Shannon
