On 30 May 2010 02:59, Richard Hipp <d...@sqlite.org> wrote:
> "CGI ... is highly inefficient".  The http://www.sqlite.org/ and
> http://www.fossil-scm.org/ websites are both run off of the same server
> (check the IP addresses on the domains).  The HTTP server there is a simple
> home-brew job implemented as a single file of C code.
>     http://www.sqlite.org/docsrc/artifact/d53e8146bf7977
> It is run off of inetd.  For each inbound HTTP request, a new process is
> created which runs the program implemented by the C file shown above.  That
> simple little program either delivers static content or, if the file
> requested has the execute permission bit turned on, runs the file as CGI.
> Very simple.  This server takes over a quarter million requests per day,
> 10GB of traffic/day, and it does so using less than 3% of the CPU on a
> virtual machine that is a 1/20th slice of a real server.
> How much more efficient does that need to be?  Sure, it won't scale up to
> Google or Facebook loads, but it doesn't need to.

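The dispatch rule described above (serve the file as static content unless
its execute bit is set, in which case run it as CGI) can be sketched in a few
lines. This is only an illustration in Python, not the C server linked above;
the function name classify_request and its return values are made up for the
example.

```python
import os
import stat


def classify_request(path):
    """Return 'cgi', 'static', or 'missing' for a requested file.

    Hypothetical helper: the name and return values are assumptions
    made for this sketch, not taken from the real server.
    """
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return "missing"
    # Only regular files are served; directories and the like are refused.
    if not stat.S_ISREG(mode):
        return "missing"
    # Any execute bit set means the file is run as a CGI program.
    if mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        return "cgi"
    return "static"
```

Under inetd, a process running this kind of check is started once per
request and exits afterwards, so there is nothing left running between
requests.
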
Most web applications are written in scripting languages; Fossil is a bit of
an unusual case. In the general case of a scripting language, spawning a new
interpreter for every request is rather disastrous performance-wise. It also
raises the issue of rate limiting: you can easily end up with hundreds of
CGI handlers running at once.

> Why don't I use nginx or apache with SCGI or FastCGI and be even more
> efficient, you ask?  One word:  Simplicity.
> With my setup, there are no servers (other than inetd).  Everything runs
> on-demand.  With no servers running, that means there are no servers to
> crash and require restarting, no servers to configure, no servers to pick up
> performance problems after running a few days due to memory fragmentation or
> resource leaks, and no servers accidentally leaving open TCP ports that can
> be attacked by miscreants.  Oh, and did I mention that my setup runs in a
> chroot jail for additional security.  I'm guessing nginx doesn't do that....
> When I design software, I really try hard to make it simple.  Take for
> example, Fossil.  There are (currently) three ways to set up a Fossil
> server.  (1) You can type "fossil server REPOSITORY".  (2) You can do a
> simple 1-line edit to your /etc/inetd.conf file.  (3) You can create a
> 2-line CGI script and drop it in any cgi-bin.  Each of these techniques can
> be (and is) described using 20 or 30 lines of text and one code example.  None
> of them involve editing more than 2 lines in a single file.  Now consider a
> hypothetical SCGI solution.  To get SCGI going, you first have to arrange
> to start the Fossil SCGI server (perhaps with the "fossil scgi REPOSITORY"
> command) and have it restart automatically when your machine reboots.  You
> have to choose a communications port.  Then you have to edit configuration
> files on your web server to get it to talk to the fossil SCGI server.  So,
> to implement an SCGI solution, you'll need to edit a minimum of two
> configuration files (and probably more if my guess about the complexity of
> nginx is correct).  So the setup for SCGI is at least twice as complex as
> CGI.
> But SCGI will be faster, right?  Well, no.  SCGI will be about the same
> speed, or maybe just a little slower, because the way the "fossil scgi"
> command will work (assuming I implement it) will be that the Fossil server
> will accept the incoming SCGI request from the web server.  The fossil
> server will then fork a copy of itself to handle the request, set
> environment variables, then call the existing CGI processing logic to do the
> work.  So SCGI and CGI are going to do the same amount of work and run at
> about the same speed.  The difference is that SCGI will use more resources
> when it is idle (because there is a server hanging around waiting for
> incoming requests, rather than being demand-launched) and SCGI will be at
> least twice as hard to set up and configure.

SCGI should still be slightly more efficient than traditional CGI because it
avoids exec, which tends to be the expensive system call. Of course, we aren't
expecting Fossil to be a heavily accessed server.

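For reference, the SCGI side of such a bridge is small. An SCGI request is a
netstring ("<length>:<headers>,") whose payload is a run of NUL-terminated
name/value pairs, followed by the request body. Below is a rough Python
sketch of decoding one request and handing it to a CGI program. The function
names are made up for the example, and the command passed to run_cgi (for
instance a "fossil cgi" invocation) is an assumption that has not been tested
against Fossil.

```python
import subprocess


def parse_scgi_request(data):
    """Split a raw SCGI request into (environ dict, body bytes)."""
    length_str, sep, rest = data.partition(b":")
    if not sep or not length_str.isdigit():
        raise ValueError("missing netstring length")
    n = int(length_str)
    # The header block must be followed by a comma.
    if len(rest) < n + 1 or rest[n:n + 1] != b",":
        raise ValueError("malformed netstring")
    fields = rest[:n].split(b"\x00")
    # Every name and value ends with NUL, so the final piece is empty
    # and the remaining pieces come in pairs.
    if fields[-1] != b"" or len(fields) % 2 != 1:
        raise ValueError("headers must be NUL-terminated name/value pairs")
    fields = fields[:-1]
    environ = {
        fields[i].decode("latin-1"): fields[i + 1].decode("latin-1")
        for i in range(0, len(fields), 2)
    }
    if environ.get("SCGI") != "1":
        raise ValueError("not an SCGI request")
    body = rest[n + 1:n + 1 + int(environ.get("CONTENT_LENGTH", "0"))]
    return environ, body


def run_cgi(command, environ, body):
    """Run a CGI program once for this request and return its raw output.

    `command` is an argument list chosen by the caller (an assumption of
    this sketch). The child sees only the SCGI-supplied variables, so an
    absolute path to the program is the safest choice.
    """
    env = dict(environ)
    env.setdefault("GATEWAY_INTERFACE", "CGI/1.1")
    result = subprocess.run(command, input=body, env=env,
                            stdout=subprocess.PIPE, check=False)
    return result.stdout
```

A wrapper built this way still pays for an exec on every request, so it buys
compatibility with an SCGI-only web server rather than speed.
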
> None of the above really solves your problem.  But perhaps it will help you
> to understand why there is not already a "fossil scgi" command, and why
> statements to the effect that "CGI is highly inefficient" are not really
> meaningful.
> If I had a web server at hand that would do SCGI, I might consider adding
> the "fossil scgi" command for you.  But as I don't, I have no way to test
> the "fossil scgi" command.  But I did outline above (vaguely) the solution
> for you:  Using code very much like the existing HTTP server in fossil,
> implement a command that listens for SCGI requests, then forks a copy of
> itself to handle each request, each request being handled using the existing
> CGI processing logic.  How hard can that be, really?  As an alternative,
> I'll bet you can easily come up with a perl/python/ruby/tcl script that
> implements an SCGI server that execs "fossil cgi" to handle each request for
> you.  There are existing packages in all those script languages that
> implement SCGI.  All you need to do is add a few lines to invoke "fossil
> cgi" and you are up and running.
>> In the meantime, therefore, we are setting up Fossil behind a proxy. This
>> works mostly, but does raise an issue: Fossil issues all cookies to
>> This works, but is rather insecure. It would be best if Fossil
>> could be instructed to listen to the X-Forwarded-For header when started via
>> "fossil server" (It would be inadvisable to listen to it if started as a CGI
>> because the web server should be doing the transformation then).
>> The ideal solution would be to move to the aforementioned SCGI, but I am
>> not quite sure at present the way I would go about implementing this in the
>> Fossil source.
> The ideal solution, I think, would be to move to a web server that isn't so
> hung up on false notions of "efficiency" that it won't do CGI.  But that is
> just my opinion, and probably not helpful to you....
It's not so much that nginx doesn't do CGI for efficiency's sake as that it
doesn't do it because none of the developers need it (and most of the
developers don't need it because they work on high-traffic sites).

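On the X-Forwarded-For point raised earlier in the thread, the rule being
asked for is roughly: believe the header only when the TCP connection itself
comes from the proxy. A minimal Python sketch of that rule follows; the
function name, its parameters, and the idea of a configured set of trusted
proxy addresses are assumptions made for illustration, not anything Fossil
provides.

```python
import ipaddress


def client_address(peer_ip, forwarded_for, trusted_proxies):
    """Pick the address a request should be attributed to.

    peer_ip         -- address of the TCP peer, as a string
    forwarded_for   -- value of the X-Forwarded-For header, or None
    trusted_proxies -- set of proxy addresses we are willing to believe
    """
    # A direct connection, or a missing header: use the socket address.
    if peer_ip not in trusted_proxies or not forwarded_for:
        return peer_ip
    # The header is a comma-separated list and each proxy appends the
    # address it saw, so the last entry is the one our own proxy added.
    candidate = forwarded_for.split(",")[-1].strip()
    try:
        ipaddress.ip_address(candidate)
    except ValueError:
        # Not a valid address: fall back to the socket address.
        return peer_ip
    return candidate
```

Earlier entries in the list are supplied by the client and can be forged,
which is why only the last one is used here.
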
In any case, thanks for the information; it should be possible to tease the
rest of the required information out of the source.
fossil-users mailing list
