Mark, thanks for the response, it's very well thought out. Let me state two things first to explain some of my design decisions.
Firstly, I'm shooting for lowest-common-denominator here. Right now, I see that as the intersection between the CGI backend and a standalone server backend; I think anything contained in both of those will be contained in all other backends. If anyone has a contrary example, I'd be happy to see it.

Secondly, the WAI is *not* designed to be "user friendly." It's designed to be efficient and portable. People looking for a user-friendly way to write applications should be using some kind of frontend, either a framework, or something like hack-frontend-monadcgi.

That said, let's address your specific comments.

On Mon, Jan 18, 2010 at 8:54 AM, Mark Lentczner <ma...@glyphic.com> wrote:

> I like this project! Thanks for resurrecting it!
>
> Some thoughts:
>
> Methods in HTTP are extensible. The type RequestMethod should probably have
> a "catchall" constructor
>     | Method B.ByteString
>

Seems logical to me.

> Other systems (the WAI proposal on the Wiki, Hack, etc...) have broken the
> path into two parts: scriptName and pathInfo. While I'm not particularly
> fond of those names, they do break the path into "traversed" and
> "non-traversed" portions of the URL. This is very useful for achieving
> "location independence" of one's code. While this API is trying to stay
> agnostic to the web framework, some degree of traversal is pretty universal,
> and I think it would benefit being in here.
>

Going to the standalone vs CGI example: in a CGI script, scriptName is a well defined variable. However, it has absolutely no meaning to a standalone handler. I think we're just feeding rubbish into the system. I'm also not certain how one could *use* scriptName in any meaningful manner, outside of trying to reconstruct a URL (more on this topic below).

> The fields serverPort, serverName, and urlScheme are typically only used by
> an application to "reconstruct" URLs for inclusion in the response. This is
> a constant source of bugs in many web sites.
> It is also a problem in
> creating modular web frameworks, since the application can't be unaware of
> its context (unless the server interprets and re-writes HTML and other
> content on the fly - which isn't realistic.) Perhaps a better solution would
> be to pass a "URL generating" function in the Request and hide all this. Of
> course, web frameworks *could* use these data to dispatch on "virtual host"
> like configurations. Though, perhaps that is the provenance of the server
> side of the this API? I don't have a concrete proposal here, just a gut that
> the inclusion of these breaks some amount of encapsulation we'd like to
> achieve for the Applications.
>

I think it's impossible to ever reconstruct a URL for a CGI application. I've tried it; once you start dealing with mod_rewrite, anything could happen. Given that I think we should encourage users to make pretty URLs via mod_rewrite, I oppose inserting such a function. When I need this kind of information (many of my web apps do), I've put it in a configuration file.

However, I don't think it's a good idea to hide information that is universal to all webapps. urlScheme in particular seems very important to me; for example, maybe when serving an app over HTTPS you want to use a secure static-file server as well. Frankly, I don't have a use case for serverName and serverPort that doesn't involve reconstructing URLs, but my gut feeling is that it's better to leave them in the protocol in case they do have a use case.

> The HTTP version information seems to have been dropped from Request. Alas,
> this is often needed when deciding what response headers to generate. I'm in
> favor of a simple data type for this:
>     data HttpVersion = Http09 | Http10 | Http11
>

I had not thought of that at all, and I like it. However, do we want to hard-code in all possible HTTP versions? In theory, there could be more standards in the future. Plus, isn't Google currently working on a more efficient approach to HTTP that would affect this?
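
To make the hard-coding concern concrete, here is a rough sketch of one alternative: store the version as a major/minor pair instead of one constructor per version. This is only an illustration under my own assumptions. The names HttpVersion, httpMajor, httpMinor, http09, http10, http11 and parseHttpVersion are hypothetical and not part of the current WAI proposal.

```haskell
import Control.Monad (guard)
import qualified Data.ByteString.Char8 as B

-- Hypothetical representation: a major/minor pair, so a future version
-- needs no new constructor. The derived Ord compares major, then minor.
data HttpVersion = HttpVersion
    { httpMajor :: Int
    , httpMinor :: Int
    } deriving (Eq, Ord, Show)

-- Convenience names for the versions mentioned in the thread.
http09, http10, http11 :: HttpVersion
http09 = HttpVersion 0 9
http10 = HttpVersion 1 0
http11 = HttpVersion 1 1

-- Parse the version token of a request line, e.g. "HTTP/1.1".
-- Returns Nothing for anything that is not exactly "HTTP/<int>.<int>".
parseHttpVersion :: B.ByteString -> Maybe HttpVersion
parseHttpVersion bs = do
    let prefix = B.pack "HTTP/"
    guard (prefix `B.isPrefixOf` bs)
    (major, afterMajor) <- B.readInt (B.drop (B.length prefix) bs)
    (dot, afterDot) <- B.uncons afterMajor
    guard (dot == '.')
    (minor, leftover) <- B.readInt afterDot
    -- readInt accepts a leading sign, so reject negatives explicitly.
    guard (major >= 0 && minor >= 0 && B.null leftover)
    return (HttpVersion major minor)
```

With this shape an application could still write checks such as `version >= http11` before emitting 1.1-only response headers, while a backend could hand through a version the library has never heard of.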
> Using ByteString for all the non-body values I find awkward. Take headers,
> for example. The header names are going to come from a list of about 50 well
> known ones. It seems a shame that applications will be littered with
> expressions like:
>
>     [(B.pack "Content-Type", B.pack "text/html;charset=UTF-8")]
>
> Seems to me that it would be highly beneficial to include a module, say
> Network.WAI.Header, that defined these things:
>
>     [(Hdr.contentType, Hdr.mimeTextHtmlUtf8)]
>

This approach would make WAI much more top-heavy and prone to becoming out-of-date. I don't oppose having this module in a separate package, but I want to keep WAI itself as light as possible.

> Further, since non-fixed headers will be built up out of many little String
> bits, I'd just as soon have the packing and unpacking be done by the server
> side of this API, and let the applications deal with Strings for these
> little snippets both in the Request and the Response.
>

As I stated at the beginning of this response, there should be a framework or frontend sitting between WAI and the application. And given that the actual data on the wire will be represented as a stream of bytes, I'd rather stick with that.

> For header names, in particular, it might be beneficial (and faster) to
> treat them like RequestMethod and make them a data type with nullary
> constructors for all 47 defined headers, and one ExtensionHeader String
> constructor.
>

Same comment about top-heaviness.

> Finally, note that HTTP/1.1 actually does well define the character
> encoding of these parts of the protocol. It is a bit hard to find in the
> spec, but the request line, status line and headers are all transmitted in
> ISO-8859-1, (with some restrictions), with characters outside the set
> encoded as per RFC 2047 (MIME Message Header extensions).
> Mind you, I
> believe that most web servers *don't* do the 2047 decoding, and only either
> a) pass the strings as ISO-8859-1 strings, or decode that to native Unicode
> strings.
>

Thanks for that information, I was unaware. However, I think it still makes sense to keep WAI as low-level as possible, which would mean a sequence of bytes.

Michael