Re: WebSockets negotiation over HTTP

2009-10-21 Thread Mark Nottingham


On 22/10/2009, at 10:52 AM, Ian Hickson wrote:


Until the upgrade is complete, you're speaking HTTP and working with
HTTP implementations.


How so? A WebSocket client is always talking Web Socket, even if it might
also sound like HTTP.


Yes, but to someone examining the traffic -- whether in a debugger or  
an intermediary -- it looks, smells and quacks like HTTP. It declares  
itself to be HTTP/1.1 by using the HTTP/1.1 protocol identifier in the  
request-line. What's the point of doing that -- i.e., why not use  
WebSockets/1.0?




Have you verified that implementations (e.g., Apache module API) will
give you byte-level access to what's on the wire in the request, and
byte-level control over what goes out in the response?


On the server side, you don't need wire-level control over what's coming
in, only over what's going out.


Yes, you do, because section 5.2 specifies headers as
whitespace-sensitive and header-names as case-sensitive.
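
To make the difference concrete, here is a rough Python sketch. Both helper names are made up for this illustration; neither is taken from the draft or from any server API. It contrasts an HTTP-style lookup, which ignores field-name case and surrounding whitespace, with a byte-exact match that needs the raw lines as they arrived on the wire.

```python
def http_style_lookup(headers, name):
    """Case-insensitive lookup over already-parsed (name, value) pairs."""
    wanted = name.lower()
    for field_name, value in headers:
        if field_name.lower() == wanted:
            return value.strip()
    return None


def byte_exact_lookup(raw_lines, name):
    """Match only a raw line that starts with exactly `name`, ':' and one space."""
    prefix = name.encode("ascii") + b": "
    for line in raw_lines:
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


# Hypothetical request lines after an intermediary has lower-cased one
# field-name and added an extra space after the colon.
raw = [b"upgrade:  WebSocket", b"Connection: Upgrade"]
parsed = [tuple(part.decode("ascii") for part in line.split(b":", 1)) for line in raw]

print(http_style_lookup(parsed, "Upgrade"))   # found despite the rewrite
print(byte_exact_lookup(raw, "Upgrade"))      # not found: bytes differ
```

A module API that only hands over the parsed pairs cannot tell the two requests apart, which is why byte-level access to the incoming request matters here.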



There's already a WebSocket module for Apache, by the way:

  http://code.google.com/p/pywebsocket/


Cool.



Despite all of this, you say:

  The simplest method is to use port 80 to get a direct connection to a
  Web Socket server.  Port 80 traffic, however, will often be
  intercepted by HTTP proxies, which can lead to the connection failing
  to be established.


which I think is misleading; this is far from the simplest way to use
WebSockets, from a deployment perspective.


True. I've tried to reword this to avoid this possible ambiguity.


I see those changes in -50; looks good (and a very elegant change).  
Thanks.




This looks an awful lot like a redirect.


There's no redirection involved here. It's just confirming the opened URL,
as part of the handshake. The TCP connection is not closed (unless the
handshake fails, and then it's not reopened).

I see now that you have the client-side fail a connection where the URL
doesn't match, but that's really not obvious in 5.1. Please put some
context in there and reinforce that the URL has to be the URL of the
current script, not just any script.


Ok, I've added a note at the end of that section explaining that the user
agent will fail the connection if the strings don't match what the UA
sent. Please let me know if you'd like anything else clarified; I don't
really know exactly what should be made clearer.


Did this get into -50? Don't see anything in the diff...

The most effective way of doing this would be to actually define the  
new headers' semantics in your draft; WebSocket-Location, for example,  
is only defined as client-side and server-side behaviours. I know this  
is probably intentional, but people will read all sorts of things into  
this header (especially since its name is so similar to other HTTP  
headers) unless you give some indication of what it means.



--
Mark Nottingham   m...@yahoo-inc.com




Re: WebSockets negotiation over HTTP

2009-10-21 Thread Ian Hickson
On Mon, 19 Oct 2009, Amos Jeffries wrote:
> Ian Hickson wrote:
> > On Wed, 14 Oct 2009, Amos Jeffries wrote:
> > > 4.1.13 still has a fragility issue in that it assumes the Upgrade: 
> > > and Connection: headers will retain both their specific sending 
> > > order and be the very first headers in the reply. It will work in 
> > > most situations, but proxies which 'correct' the headers order to 
> > > have Date: first will kill WebSockets.
> > 
> > That's intentional; such proxies don't know about Web Sockets (if they 
> > did, they wouldn't be modifying the headers!) and thus clearly can't 
> > really be trusted to route the traffic unmodified.
> 
> At this point of the handshake the client is the only software which 
> knows it's using WebSockets.
>
> The server may validate-parse the headers' MIME syntax before sub-parsing 
> the request line. At this point all it's seen is the GET and HTTP/1.1.
> 
> So... the server and any middleware will be in a state right now 
> thinking that HTTP/1.1 is in use and will do appropriate HTTP/1.1 header 
> alterations.
> 
> It is not until the server reply accepting the Upgrade: request is 
> received by middleware that WebSockets protocol actions can start 
> happening.

Agreed. I don't see how that affects my point though.


> > > 4.1.14 through 4.1.23 appear to be a very conflated description of 
> > > parsing the headers.
> > > 
> > > It seems to me that referencing rfc2616 section 4.2 should be 
> > > sufficient for the parse
> > 
> > Unfortunately, HTTP doesn't define how to parse headers. It defines 
> > the semantics of valid headers, but doesn't say, e.g., what headers 
> > are present in the following:
> > 
> >HTTP/1.1 200 OK
> >: Bar
> >Foo
> >Quux
> 
> Section 4.2 is clear:
>  "Each header field consists of a name followed by a colon (":") and the field
> value. Field names are case-insensitive."

So what are the headers in the (invalid) HTTP response above?


> NP: WebSockets as of draft-49 requires (1.2) "The first three lines in 
> each case are hard-coded (the exact case and order matters)" which is a 
> breach of the final statement above. That final statement permits 
> middleware to uppercase or CamelCase the headers on a whim without 
> altering their meaning.

The entire point of the handshake is to detect such middleware and fail 
the connection when it is detected.
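
A minimal sketch of that detection step, in Python. It assumes the hard-coded prefix is the first three lines of the 101 response quoted later in this message; the function name and the sample responses are invented for illustration and are not taken from the draft.

```python
# Assumed hard-coded start of the server's handshake response.
EXPECTED_PREFIX = (
    b"HTTP/1.1 101 Web Socket Protocol Handshake\r\n"
    b"Upgrade: WebSocket\r\n"
    b"Connection: Upgrade\r\n"
)


def handshake_prefix_intact(response: bytes) -> bool:
    """Return True only if the response starts with the exact expected bytes."""
    return response.startswith(EXPECTED_PREFIX)


untouched = EXPECTED_PREFIX + b"WebSocket-Origin: http://example.com\r\n\r\n"

# Hypothetical proxy that moves a Date: header to the front.
reordered = (
    b"HTTP/1.1 101 Web Socket Protocol Handshake\r\n"
    b"Date: Wed, 21 Oct 2009 00:00:00 GMT\r\n"
    b"Upgrade: WebSocket\r\n"
    b"Connection: Upgrade\r\n\r\n"
)

# Hypothetical proxy that lower-cases field-names.
recased = untouched.replace(b"Upgrade: WebSocket", b"upgrade: WebSocket")

for name, data in [("untouched", untouched), ("reordered", reordered), ("recased", recased)]:
    print(name, handshake_prefix_intact(data))
```

Both rewrites are legal for an HTTP/1.1 intermediary, and both make the check fail, which is the intended effect.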


> References RFC822 section 3.1 for the BNF. Which states:
>  " B.1.  SYNTAX
> 
>  message     =  *field *(CRLF *text)
> 
>  field       =  field-name ":" [field-body] CRLF
> 
>  field-name  =  1*<any CHAR, excluding CTLs, SPACE, and ":">
> 
>  field-body  =  *text [CRLF LWSP-char field-body]
> "
> ...
> "
>   C.1.1.  FIELD NAMES
> 
> These now must be a sequence of  printable  characters.   They
> may not contain any LWSP-chars.
> "
> 
> ... which requires header names of at least one ASCII byte, which may not
> include ':', whitespace or non-printables.
> 
> NP: WebSockets draft-49 changes the bytes to UNICODE format and permits
> non-printables which are not LF or CR.

Right.


> Your demo response above is invalid HTTP/1.1:
>  * first header line has no token in the field-name portion,
>  * second line has CRLF in the name portion,
>  * third line has zero-byte name portion.

I am aware that it is invalid. My point is that HTTP doesn't define how it 
is to be parsed (it leaves it undefined), which is IMHO unacceptable for a 
protocol specification, and that is why WebSocket doesn't defer to HTTP.


> Since you have spec'd that only valid HTTP/1.1 is acceptable this will 
> be dropped by any WebSockets-aware software even if it's accepted by 
> WebSockets.

Sure, but the following:

HTTP/1.1 101 Web Socket Protocol Handshake
Upgrade: WebSocket
Connection: Upgrade
WebSocket-Origin: http://example.com
WebSocket-Location: ws://example.com/demo
WebSocket-Protocol: sample
:

...has defined processing on the client-side when it is sent back as a 
WebSocket handshake, while if I deferred to HTTP, its handling would be 
undefined (and indeed a range of behaviours from allowing the connection 
to failing the connection altogether would be allowed, which is far too 
vague to lead to good interoperability!).
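
To show what "defined processing" can look like, here is a small Python sketch of a parser that reaches a decision for every input: either a list of fields or a failed connection. The specific rules chosen (a line without a colon fails the handshake, an empty name is kept, one leading space is dropped from the value) are assumptions for illustration only and are not the draft's algorithm; `HandshakeError` and `parse_header_block` are made-up names.

```python
class HandshakeError(Exception):
    """Raised when the header block is rejected under the rules below."""


def parse_header_block(block: bytes):
    """Parse CRLF-separated header lines into (name, value) byte pairs.

    Every input leads to exactly one outcome: a list of pairs, or
    HandshakeError. Nothing is left to the implementation's discretion.
    """
    fields = []
    for line in block.split(b"\r\n"):
        if line == b"":
            break  # blank line ends the header block
        name, sep, value = line.partition(b":")
        if not sep:
            # Illustrative rule: a line with no colon fails the handshake.
            raise HandshakeError(f"no colon in line {line!r}")
        # Illustrative rule: drop a single leading space from the value.
        if value.startswith(b" "):
            value = value[1:]
        fields.append((name, value))
    return fields


# A lone ":" line yields a field with an empty name and empty value.
print(parse_header_block(b"Upgrade: WebSocket\r\n:\r\n\r\n"))

# The invalid example from earlier in the thread is rejected at "Foo".
try:
    parse_header_block(b": Bar\r\nFoo\r\nQuux\r\n\r\n")
except HandshakeError as exc:
    print("failed:", exc)
```

Whether these particular rules are the right ones is a separate question; the point is only that two implementations following them cannot disagree about the outcome.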


> For completeness the rest of rfc822sect3.1 used by rfc2616 specs:
> "
>  B.2.  SEMANTICS
> 
>   Headers occur before the message body and are terminated  by
>  a null line (i.e., two contiguous CRLFs).
> 
>   A line which continues a header field begins with a SPACE or
>  HTAB  character,  while  a  line  beginning a field starts with a
>  printable character which is not a colon.
> 
>   A field-name consists of one or  more  printable  characters
>  (excluding  colon,  space, and control-characters).  A field-name
>  MUST be contained on one line.  Upper and lower case are not dis-
>  tinguished when comparing field-names.
> "
> 
> .. the th

Re: WebSockets negotiation over HTTP

2009-10-21 Thread Ian Hickson
On Sat, 17 Oct 2009, Mark Nottingham wrote:
> On 17/10/2009, at 9:09 AM, Ian Hickson wrote:
> > On Wed, 14 Oct 2009, Mark Nottingham wrote:
> > > 
> > > Section 5.2 does constrain the bytes the server accepts from the 
> > > client, thereby conflicting with HTTP, but only in some small 
> > > details. In particular, it makes HTTP header field-names 
> > > case-sensitive, and requires certain arrangements of whitespace in 
> > > them.
> > > 
> > > Ian, if you can address these small things in section 5.2 it would 
> > > help.
> > 
> > If a WebSocket client is connecting to a WebSocket server, then this 
> > isn't HTTP, it's just the WebSocket protocol. So whether the fields 
> > are parsed like HTTP is presumably not a problem.
> > 
> > If an HTTP client is connecting to a WebSocket server, then the 
> > server's response is going to be garbage (from the HTTP client's 
> > perspective) anyway, much like if an HTTP client were to connect to an 
> > SMTP server. So how the server parses the fields doesn't really 
> > matter.
> > 
> > If a WebSocket client is connecting to an HTTP server, then the 
> > requirements in this section don't apply to the server.
> > 
> > If an HTTP client is connecting to an HTTP server, then the whole spec 
> > doesn't apply.
> > 
> > Which case is the one you are concerned about? Are my conclusions 
> > above incorrect?
> 
> Until the upgrade is complete, you're speaking HTTP and working with 
> HTTP implementations.

How so? A WebSocket client is always talking Web Socket, even if it might 
also sound like HTTP.


> Have you verified that implementations (e.g., Apache module API) will 
> give you byte-level access to what's on the wire in the request, and 
> byte-level control over what goes out in the response?

On the server side, you don't need wire-level control over what's coming 
in, only over what's going out.

There's already a WebSocket module for Apache, by the way:

   http://code.google.com/p/pywebsocket/


> Overall, I guess I'm just not seeing how running WebSockets on port 80 
> (i.e., coexistent with an HTTP server) is ever a good idea.

I wouldn't recommend co-existing with a port 80 HTTP server. The 
co-existing support is really for port 443.


> Since a sizeable portion of the Internet is accessed through proxies 
> (e.g., hotels, universities, corporations, mobile phones, and some 
> ISPs), and none of the deployed infrastructure will support WebSockets, 
> deploying in this fashion alone won't be workable on the open Internet; 
> people using this technique will have to also deploy a fallback server 
> on a different port. So, why bother, and why force people to write the 
> code for fallback? What value is there in doing it this way?

By and large, you can connect over port 443 without the proxy getting in 
the way. That's the model that I would expect most Web Socket deployments 
to use.


> Despite all of this, you say:
> 
> >The simplest method is to use port 80 to get a direct connection to a
> >Web Socket server.  Port 80 traffic, however, will often be
> >intercepted by HTTP proxies, which can lead to the connection failing
> >to be established.
> 
> which I think is misleading; this is far from the simplest way to use
> WebSockets, from a deployment perspective.

True. I've tried to reword this to avoid this possible ambiguity.


> > > The other aspect here is that you're really not using Upgrade in an 
> > > appropriate fashion; as mentioned before, its intended use is to 
> > > upgrade *this* TCP connection, not redirect to another one.
> > 
> > There's only one TCP connection established. As far as I can tell, 
> > WebSocket never does a redirect of any kind.
> 
> -48 5.1 says:
> 
> >Send the string "WebSocket-Location" followed by a U+003A COLON (:)
> >and a U+0020 SPACE, followed by the URL of the Web Socket script,
> >followed by a CRLF pair (0x0D 0x0A).
> > 
> >   For instance:
> > 
> >WebSocket-Location: ws://example.com/demo
> > 
> >NOTE: Do not include the port if it is the default port for Web
> >Socket protocol connections of the type in question (80 for
> >unencrypted connections and 443 for encrypted connections).
> 
> This looks an awful lot like a redirect.

There's no redirection involved here. It's just confirming the opened URL, 
as part of the handshake. The TCP connection is not closed (unless the 
handshake fails, and then it's not reopened).


> I see now that you have the client-side fail a connection where the URL 
> doesn't match, but that's really not obvious in 5.1. Please put some 
> context in there and reinforce that the URL has to be the URL of the 
> current script, not just any script.

Ok, I've added a note at the end of that section explaining that the user 
agent will fail the connection if the strings don't match what the UA 
sent. Please let me know if you'd like anything else clarified; I don't 
really know exactly what should be made clearer.
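
As a sketch of that check (Python; the function names are invented for illustration, and the "wss" scheme for encrypted connections is an assumption, since only ws:// appears in this thread), the client can rebuild the URL it opened, omitting the port when it is the default, and compare it with the WebSocket-Location value as a plain string:

```python
# Default ports: 80 for unencrypted and 443 for encrypted connections.
# The "wss" scheme name is assumed for this sketch.
DEFAULT_PORTS = {"ws": 80, "wss": 443}


def expected_location(secure: bool, host: str, port: int, resource: str) -> str:
    """Rebuild the URL the client opened, leaving out a default port."""
    scheme = "wss" if secure else "ws"
    authority = host if port == DEFAULT_PORTS[scheme] else f"{host}:{port}"
    return f"{scheme}://{authority}{resource}"


def location_matches(header_value: str, secure: bool, host: str,
                     port: int, resource: str) -> bool:
    """Exact string comparison; any mismatch means the connection is failed."""
    return header_value == expected_location(secure, host, port, resource)


print(expected_location(False, "example.com", 80, "/demo"))
print(expected_location(False, "example.com", 8080, "/demo"))
print(location_matches("ws://example.com/other", False, "example.com", 80, "/demo"))
```

Because the comparison never causes a new connection to a different URL, the header behaves as a confirmation rather than a redirect.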

-- 
Ia

Re: Introduction

2009-10-21 Thread Perry Smith

On Oct 21, 2009, at 5:24 AM, Kinkie wrote:

On Tue, Oct 20, 2009 at 5:11 PM, Perry Smith wrote:

Hi,


Hello Perry!

My interest is AIX.  I recently sent a note to the general list and was
prompted to at least CC to this list.  I plan to use squid in a pretty low
usage place to get around a firewall issue I have.  But I also maintain
http://aix-consulting.net which is a site where I put precompiled
executables of open source packages.  I plan to put versions of squid 2.7
and 3.0 on that site.


Fantastic :)
I think Amos already asked that, but are you willing to take part in
our continuous integration platform? See the BuildFarm wiki page and
http://build.squid-cache.org/


I would like to.  I currently have a lot of obstacles.  I currently
don't have an AIX machine that can be directly reached from the net.
They are all behind firewalls of one sort or another.

The best I could do right now would be to give you a sign-on to my server;
you could then telnet (or whatever) to the AIX machine.

I might be able to talk a friend into putting one of my AIX machines
on his net but I haven't mentioned it to him yet.



I will likely debug my current issues with the help of this list and then
drop off but you are always welcome to contact me for AIX questions (or
point people in my direction).


Thanks! Any help is appreciated.


As an update on my progress: I got 2.7 compiled and working.  It
exposed a bug in AIX which left a ./conftest program running after the
end of the configure script.  It was the test that checks for large
unix domain datagram packets.  It should have died.  It's not a bad
test, but AIX does not come out of the select when the parent dies.

I have not gotten back to the 3.0 mystery but I hope to.  This project
is a side line off the side road of the side business.  So it doesn't
get much time.

Take care,
Perry
Ease Software, Inc. ( http://www.easesoftware.com )

Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems



Re: Introduction

2009-10-21 Thread Kinkie
On Tue, Oct 20, 2009 at 5:11 PM, Perry Smith  wrote:
> Hi,

Hello Perry!

> My interest is AIX.  I recently sent a note to the general list and was
> prompted to at least CC to this list.  I plan to use squid in a pretty low
> usage place to get around a firewall issue I have.  But I also maintain
> http://aix-consulting.net which is a site where I put precompiled
> executables of open source packages.  I plan to put versions of squid 2.7
> and 3.0 on that site.

Fantastic :)
I think Amos already asked that, but are you willing to take part in
our continuous integration platform? See the BuildFarm wiki page and
http://build.squid-cache.org/

> I will likely debug my current issues with the help of this list and then
> drop off but you are always welcome to contact me for AIX questions (or
> point people in my direction).

Thanks! Any help is appreciated.

-- 
/kinkie