Re: [fossil-users] Fossil behind proxy
On Sat, 2010-05-29 at 21:59 -0400, Richard Hipp wrote:
> The http://www.sqlite.org/ and http://www.fossil-scm.org/ websites are both run off of the same server ...
> This server takes over a quarter million requests per day, 10GB of traffic/day, and it does so using less than 3% of the CPU on a virtual machine that is a 1/20th slice of a real server. ...
> How much more efficient does that need to be?

Lots ... if it's CGI under Windows.

Thank You,
Paul Serice
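For a sense of scale, the daily figures quoted above can be turned into per-second averages. This is only a rough sketch: it assumes "10GB" means 10 * 10^9 bytes and that load is spread evenly over the day, which real traffic is not, so peaks will be higher. The function name `daily_load` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_load(requests_per_day: int, bytes_per_day: int) -> dict:
    """Convert daily totals into average per-second and per-request figures."""
    return {
        "requests_per_second": requests_per_day / SECONDS_PER_DAY,
        "bytes_per_second": bytes_per_day / SECONDS_PER_DAY,
        "bytes_per_request": bytes_per_day / requests_per_day,
    }


# Figures from the quoted message; decimal gigabytes are an assumption.
load = daily_load(requests_per_day=250_000, bytes_per_day=10 * 10**9)

for name, value in load.items():
    print(f"{name}: {value:,.2f}")
```

On these assumptions the average works out to a little under 3 requests per second and about 40 kB per request, which is the kind of load where one process per request is plausible on a Unix-like system.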
Re: [fossil-users] Fossil behind proxy
On 30 May 2010 02:59, Richard Hipp d...@sqlite.org wrote:
>> CGI ... is highly inefficient.
>
> The http://www.sqlite.org/ and http://www.fossil-scm.org/ websites are both run off of the same server (check the IP addresses on the domains). The HTTP server there is a simple home-brew job implemented as a single file of C code.
>
> http://www.sqlite.org/docsrc/artifact/d53e8146bf7977
>
> It is run off of inetd. For each inbound HTTP request, a new process is created which runs the program implemented by the C file shown above. That simple little program either delivers static content or, if the file requested has the execute permission bit turned on, runs the file as CGI. Very simple.
>
> This server takes over a quarter million requests per day, 10GB of traffic/day, and it does so using less than 3% of the CPU on a virtual machine that is a 1/20th slice of a real server.
>
> How much more efficient does that need to be? Sure, it won't scale up to Google or Facebook loads, but it doesn't need to.

Most web applications are written in scripting languages; Fossil is a bit of an unusual case. In the general case of a scripting language, spawning a new interpreter for every request is rather disastrous performance-wise. It also brings the issue of rate limiting: you can easily end up with hundreds of CGI handlers running.

> Why don't I use nginx or apache with SCGI or FastCGI and be even more efficient, you ask? One word: Simplicity. With my setup, there are no servers (other than inetd). Everything runs on-demand. With no servers running, that means there are no servers to crash and require restarting, no servers to configure, no servers to pick up performance problems after running a few days due to memory fragmentation or resource leaks, and no servers accidentally leaving open TCP ports that can be attacked by miscreants.
>
> Oh, and did I mention that my setup runs in a chroot jail for additional security?
> I'm guessing nginx doesn't do that.
>
> When I design software, I really try hard to make it simple. Take, for example, Fossil. There are (currently) three ways to set up a Fossil server.
>
> (1) You can type fossil server REPOSITORY.
> (2) You can do a simple 1-line edit to your /etc/inetd.conf file.
> (3) You can create a 2-line CGI script and drop it in any cgi-bin.
>
> Each of these techniques can be (and is) described using 20 or 30 lines of text and one code example. None of them involves editing more than 2 lines in a single file.
>
> Now consider a hypothetical SCGI solution. To get SCGI going, you first have to arrange to start the Fossil SCGI server (perhaps with the fossil scgi REPOSITORY command) and have it restart automatically when your machine reboots. You have to choose a communications port. Then you have to edit configuration files on your web server to get it to talk to the fossil SCGI server. So, to implement an SCGI solution, you'll need to edit a minimum of two configuration files (and probably more if my guess about the complexity of nginx is correct). So the setup for SCGI is at least twice as complex as CGI.
>
> But SCGI will be faster, right? Well, no. SCGI will be about the same speed, or maybe just a little slower, because the way the fossil scgi command will work (assuming I implement it) will be that the Fossil server will accept the incoming SCGI request from the web server. The fossil server will then fork a copy of itself to handle the request, set environment variables, then call the existing CGI processing logic to do the work. So SCGI and CGI are going to do the same amount of work and run at about the same speed. The difference is that SCGI will use more resources when it is idle (because there is a server hanging around waiting for incoming requests, rather than being demand-launched) and SCGI will be at least twice as hard to set up and configure.

SCGI should still be slightly more efficient than traditional CGI, because it is exec that tends to be the expensive system call. Of course, we aren't expecting Fossil to be a heavily accessed server.

> None of the above really solves your problem. But perhaps it will help you to understand why there is not already a fossil scgi command, and why statements to the effect that CGI is highly inefficient are not really meaningful.
>
> If I had a web server at hand that would do SCGI, I might consider adding the fossil scgi command for you. But as I don't, I have no way to test the fossil scgi command. But I did outline above (vaguely) the solution for you: Using code very much like the existing HTTP server in fossil, implement a command that listens for SCGI requests, then forks a copy of itself to handle each request, each request being handled using the existing CGI processing logic. How hard can that be, really?
>
> As an alternative, I'll bet you can easily come up with a perl/python/ruby/tcl script that implements an SCGI server that execs fossil cgi to handle each
Re: [fossil-users] Fossil behind proxy
On 30 May 2010 00:53, Michael McDaniel fos...@autosys.us wrote:
> I wound up running lighttpd for the sole purpose of serving fossil via cgi scripts. lighttpd is pretty lightweight on resources.
>
> ~Michael

The idea has crossed my mind, but the idea of having to maintain another set of configuration files frankly horrifies me ;-)

I've been snooping at the Fossil source, and it looks like setting things up to support SCGI shouldn't be *too* hard, but I'm still not really sure where to begin.
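As a possible starting point, here is a minimal sketch of the request-parsing half of SCGI, written from the published SCGI protocol description and not taken from the Fossil source. An SCGI request is a netstring (`<length>:<headers>,`) whose payload is a run of NUL-terminated name/value pairs, followed by the request body; the names are the usual CGI environment variables. The function name `parse_scgi_request` is hypothetical, and the sketch assumes the whole request is already in memory.

```python
def parse_scgi_request(data: bytes):
    """Split a complete raw SCGI request into (headers, body).

    Raises ValueError if the request is malformed.
    """
    # The header block is a netstring: b"<decimal length>:<payload>,"
    length_text, sep, rest = data.partition(b":")
    if not sep or not length_text.isdigit():
        raise ValueError("missing netstring length prefix")
    header_len = int(length_text)
    if rest[header_len:header_len + 1] != b",":
        raise ValueError("netstring is truncated or lacks its trailing comma")

    # The payload is name NUL value NUL name NUL value NUL ...
    fields = rest[:header_len].split(b"\x00")
    if fields and fields[-1] == b"":
        fields.pop()  # drop the empty piece after the final NUL
    if len(fields) % 2 != 0:
        raise ValueError("header name without a value")
    headers = {
        fields[i].decode("latin-1"): fields[i + 1].decode("latin-1")
        for i in range(0, len(fields), 2)
    }

    # The protocol requires SCGI=1 and a CONTENT_LENGTH header.
    if headers.get("SCGI") != "1":
        raise ValueError("not an SCGI version 1 request")
    if "CONTENT_LENGTH" not in headers:
        raise ValueError("CONTENT_LENGTH header is required")
    content_length = int(headers["CONTENT_LENGTH"])

    body_start = header_len + 1
    body = rest[body_start:body_start + content_length]
    if len(body) != content_length:
        raise ValueError("request body is incomplete")
    return headers, body
```

Because the parsed headers are already CGI variable names, a server built on this could in principle copy them into the environment of a child process and feed it the body on standard input. Whether that is the right hook into Fossil is an open question here, not something this sketch settles.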
Re: [fossil-users] Fossil behind proxy
On Sat, May 29, 2010 at 7:21 PM, Owen Shepherd owen.sheph...@e43.eu wrote:
> We are currently experimenting with setting up a Fossil server, but have encountered a bit of an issue: Fossil doesn't seem to support being operated behind a proxy. As we wish to run Fossil on port 80, and to do so it must sit behind our primary web server, this is a bit of an issue.
>
> The ideal solution for us would be to run Fossil as an SCGI or FastCGI service (I would lean towards SCGI as it is a much simpler protocol) and have our web server dispatch requests to that, but this is at present not possible. We cannot run Fossil as a CGI because we use Nginx, which does not support it (with the valid reason that very little uses CGI these days and that it is highly inefficient).

> CGI ... is highly inefficient.

The http://www.sqlite.org/ and http://www.fossil-scm.org/ websites are both run off of the same server (check the IP addresses on the domains). The HTTP server there is a simple home-brew job implemented as a single file of C code.

http://www.sqlite.org/docsrc/artifact/d53e8146bf7977

It is run off of inetd. For each inbound HTTP request, a new process is created which runs the program implemented by the C file shown above. That simple little program either delivers static content or, if the file requested has the execute permission bit turned on, runs the file as CGI. Very simple.

This server takes over a quarter million requests per day, 10GB of traffic/day, and it does so using less than 3% of the CPU on a virtual machine that is a 1/20th slice of a real server.

How much more efficient does that need to be? Sure, it won't scale up to Google or Facebook loads, but it doesn't need to.

Why don't I use nginx or apache with SCGI or FastCGI and be even more efficient, you ask? One word: Simplicity. With my setup, there are no servers (other than inetd). Everything runs on-demand.
With no servers running, that means there are no servers to crash and require restarting, no servers to configure, no servers to pick up performance problems after running a few days due to memory fragmentation or resource leaks, and no servers accidentally leaving open TCP ports that can be attacked by miscreants.

Oh, and did I mention that my setup runs in a chroot jail for additional security? I'm guessing nginx doesn't do that.

When I design software, I really try hard to make it simple. Take, for example, Fossil. There are (currently) three ways to set up a Fossil server.

(1) You can type fossil server REPOSITORY.
(2) You can do a simple 1-line edit to your /etc/inetd.conf file.
(3) You can create a 2-line CGI script and drop it in any cgi-bin.

Each of these techniques can be (and is) described using 20 or 30 lines of text and one code example. None of them involves editing more than 2 lines in a single file.

Now consider a hypothetical SCGI solution. To get SCGI going, you first have to arrange to start the Fossil SCGI server (perhaps with the fossil scgi REPOSITORY command) and have it restart automatically when your machine reboots. You have to choose a communications port. Then you have to edit configuration files on your web server to get it to talk to the fossil SCGI server. So, to implement an SCGI solution, you'll need to edit a minimum of two configuration files (and probably more if my guess about the complexity of nginx is correct). So the setup for SCGI is at least twice as complex as CGI.

But SCGI will be faster, right? Well, no. SCGI will be about the same speed, or maybe just a little slower, because the way the fossil scgi command will work (assuming I implement it) will be that the Fossil server will accept the incoming SCGI request from the web server. The fossil server will then fork a copy of itself to handle the request, set environment variables, then call the existing CGI processing logic to do the work.
So SCGI and CGI are going to do the same amount of work and run at about the same speed. The difference is that SCGI will use more resources when it is idle (because there is a server hanging around waiting for incoming requests, rather than being demand-launched) and SCGI will be at least twice as hard to set up and configure.

None of the above really solves your problem. But perhaps it will help you to understand why there is not already a fossil scgi command, and why statements to the effect that CGI is highly inefficient are not really meaningful.

If I had a web server at hand that would do SCGI, I might consider adding the fossil scgi command for you. But as I don't, I have no way to test the fossil scgi command. But I did outline above (vaguely) the solution for you: Using code very much like the existing HTTP server in fossil, implement a command that listens for SCGI requests, then forks a copy of itself to handle each request, each request being handled using the existing CGI processing logic. How hard can that be, really?

As an alternative, I'll bet you