On 12-Jan-25 16:27, Gavin D. Howard via curl-library wrote:
The problem with "limiting [an http server] to what you want" is that over time, the requirements will expand. Even if you think you know the limits won't change. They will, and the maintenance task that you - or your successor - will inherit from the technical debt will prevent you from doing more productive things with your (their) time. Or worse, you'll deal with avoidable CVEs.

Seen this many times. Your "requests generated by browsers and libcurl" isn't particularly limiting. Which versions of which browsers and other clients? You'd be amazed at how many http clients there are - and what they uniquely expect/rely on.

You're much better off using an existing server. You don't need to use a heavyweight server like Apache httpd or nginx; there are lightweight servers to choose from, and when there's an issue there are support resources. For example, thttpd: https://www.acme.com/software/thttpd/ or lighttpd: https://www.lighttpd.net/

You can also find basic frameworks such as Python's http.server or Perl's HTTP::Server::Simple or HTTP::Daemon. But I'd stick with a real server and put effort into a CGI. Reinventing the wheel is a bad investment.

I already have a FastCGI implementation that I was going to use behind a real server, as you say. But as far as I understand it, I still need to parse HTTP stuff with that. Thus, I need to implement HTTP/1.1.
It is true that you will have to parse some of HTTP 1.1 in a [Fast]CGI process. But that's a far cry from "implement HTTP 1.1".
A real server handles logging, process management, persistence, connection management, and so on. It also interprets much of (and some of the trickiest parts of) HTTP - authentication, access controls, rate limits, usually some protection from malformed requests (e.g. mod_security), and more.
A process running under it gets a lot of information extracted from the HTTP headers, is shielded from some sensitive information (e.g. Authorization headers processed by the server), and doesn't have to deal with header folding.
The CGI gateway spec (RFC 3875) breaks this down, though most servers provide more information.
The URL is broken down, and the remote user, authentication type, request method, etc. are provided.
CGI library interfaces parse out query parameters.
What's left is dispatching on the request method and handling any unique headers and those that the server can't handle for CGIs (such as range and caching).
But a real server has mechanisms for caching (both the back end and headers) for static content.
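
To give a rough idea of how little HTTP is left for the CGI process, here is a minimal sketch in Python. It is only an illustration, not a recommended implementation: the function name handle_request, the "name" query parameter and the response text are made up for the example. The meta-variable names (REQUEST_METHOD, QUERY_STRING, REMOTE_USER, CONTENT_LENGTH) are the ones defined in RFC 3875.

```python
import os
import sys
from urllib.parse import parse_qs


def handle_request(environ, stdin=None):
    """Build a CGI response (bytes) from RFC 3875 meta-variables.

    The server has already parsed the request line and headers; this
    code only reads the environment the server prepared.
    """
    method = environ.get("REQUEST_METHOD", "GET")
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # REMOTE_USER is only set when the server authenticated the client.
    user = environ.get("REMOTE_USER") or "anonymous"

    # Dispatch on the request method; this is most of what is left to do.
    if method in ("GET", "HEAD"):
        name = query.get("name", ["world"])[0]  # hypothetical parameter
        status = "200 OK"
        body = f"hello {name} (user={user})\n".encode()
    elif method == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        source = stdin if stdin is not None else sys.stdin.buffer
        data = source.read(length)  # request body arrives on stdin
        status = "200 OK"
        body = f"received {len(data)} bytes\n".encode()
    else:
        status = "405 Method Not Allowed"
        body = b"method not allowed\n"

    # A CGI response starts with header fields, then a blank line.
    headers = f"Status: {status}\r\nContent-Type: text/plain\r\n\r\n".encode()
    return headers + (b"" if method == "HEAD" else body)


if __name__ == "__main__":
    sys.stdout.buffer.write(handle_request(os.environ))
```

Request headers that the server does not consume are normally passed along as HTTP_* variables (for example HTTP_RANGE), so handling something like Range would mean reading one more environment entry rather than parsing the raw request.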
Rather than re-invent these, it can be much easier (and more performant) to create a loadable module for the real server. This provides access to all the services of the server.
For Apache, things to look at would be mod_ftp and mod_dav. mod_ftp implements a non-http protocol; mod_dav implements additional request methods for authoring and versioning. mod_md locates and interacts with mod_ssl - the same methods can be used to access cache modules. (mod_md also uses libcurl...)
Note that FTP assumes a persistent TCP connection, like WS. HTTP does not - although its connections can persist, they don't have to. I'm not a WS expert, but for this reason it's not clear that you can implement WS over HTTP in the general case. This leads toward a server module rather than any sort of CGI.
Never re-invent the wheel; use it.

Happy coding.

Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views, if any, on the matters discussed.
To answer Daniel:

Exactly how would this new proposed look and work? I can't even fathom how an API for this would look like to be usable in a server setting.

That is a good question, and I don't expect you to answer it. I will assume that asking the question is permission to at least try it. I will come back ASAP with a rough API and a rough implementation. I don't have much free time, though, so it may take some time.

Thank you.
Gavin Howard