Re: [fossil-users] Fossil behind reverse proxy

2010-01-30 Thread Paul Ruizendaal
Hi Kyle,

Thanks for your extensive reply. I was going through the code and had
stumbled upon the SCRIPT_NAME trick when your mail came in and confirmed
that it was indeed possible. In the default admin setup, the logo path
needed to be fixed from '/logo' to '$baseurl/logo', but then it works
fully. I can confirm that it also works on Linux and Windows, not just
Darwin. For folks not using Apache, it would be good if your 'how to'
below mentioned that the reverse proxy needs to strip the baseurl from
the URI it forwards to the Fossil server (i.e. '/fossil/index' must be
forwarded as '/index').
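
To make the prefix stripping concrete, here is a minimal Python sketch of
what the proxy would do to the request path before forwarding. The function
name strip_base and its arguments are made up for illustration; this is not
code from Fossil or from any particular proxy.

```python
def strip_base(path: str, base: str) -> str:
    """Remove the proxy's base prefix from a request path before forwarding."""
    base = base.rstrip("/")
    if base and (path == base or path.startswith(base + "/")):
        # Keep at least "/" so the backend always receives an absolute path.
        return path[len(base):] or "/"
    # Paths outside the prefix are returned unchanged.
    return path


print(strip_base("/fossil/index", "/fossil"))  # /index
```

Matching on base + "/" rather than on the bare prefix avoids stripping
unrelated paths such as '/fossilize'.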

However, this is a hack that works by accident. It works because 'server'
and 'cgi' share code paths and the 'server' code flow reads part of the CGI
environment even though it shouldn't. Can you imagine the configuration
headache if one had an unrelated SCRIPT_NAME environment variable and
wasn't aware of this feature... Also, the hack fixes the baseUrl to one
defined prefix. A Fossil server set up using this hack also becomes
unusable from the web when accessed directly, and multiple baseurls
cannot be mapped to a single Fossil server instance. Whilst I'm quite
happy that the hack fixes my immediate problem, I think a better engineered
solution is preferable.

First of all, I note that baseUrl relocation in server mode works without
problem as you have established and I can confirm from short experience; so
we are not entering a mine field, it seems. First I thought along the lines
that you point out in your how-to. Later I thought that we should not
overload Fossil with options, especially as everybody will want something a
little different. Instead, it should be the reverse proxy that does all the
work IMHO.

How about using request headers for this? The reverse proxy could add two
custom headers to the forwarded request (similar to X-Forwarded-For):
- X-Fossil-Baseurl
- X-Fossil-Repository
Fossil would only look at these when in server mode. The first would
specify the baseurl that is used to relocate all references in html/css
output, and in redirect responses. If no such header exists, it is root
('/'). The first is sufficient for reverse proxying. The second would
specify the repository to use for that request only. If no such header
exists, it is the repository specified on the command line. This would take
care of your generalised repository access.
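
A rough sketch of how the two proposed headers could be resolved per
request, with the fallbacks described above. This is Python for
illustration only; Fossil itself is written in C, and the function name
and the dictionary-based header representation are assumptions.

```python
def resolve_request_context(headers: dict, default_repository: str) -> tuple:
    """Return (baseurl, repository) for one request under the proposed scheme."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # No X-Fossil-Baseurl header: everything stays based at root.
    baseurl = lowered.get("x-fossil-baseurl", "/")
    # No X-Fossil-Repository header: use the repository from the command line.
    repository = lowered.get("x-fossil-repository", default_repository)
    return baseurl, repository


print(resolve_request_context({"X-Fossil-Baseurl": "/fossil"}, "/srv/main.fsl"))
# ('/fossil', '/srv/main.fsl')
```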

This would work very well with my own (soon to be published, GPL'ed)
reverse proxy. It would also work very well with Lighttpd, using its
mod_magnet module. Would it be workable with Apache too? (I'm not familiar
with Apache configuration).

Paul


Re: [fossil-users] Fossil behind reverse proxy

2010-01-30 Thread Kyle McKay
On Jan 30, 2010, at 04:00, Paul Ruizendaal wrote:
 Hi Kyle,

 In the default admin setup, the logo path
 needed to be fixed from '/logo' to '$baseurl/logo', but then it works
 fully.

I didn't need to do that.  Must have been fixed in later versions of  
fossil.  That would be a bug for the cgi command as well.  My new  
repositories already had that correct without needing to edit anything.

 I can confirm that it also works on Linux and Windows, not just
 Darwin. For folks not using Apache it would be good if your below
 'how to' could mention that the reverse proxy needs to strip the
 baseurl of the uri it forwards to the Fossil server (i.e.:
 '/fossil/index' must be forwarded as '/index').

That is standard behavior for a reverse proxy, as the proxied machine the
requests are being sent to has absolutely no knowledge of where it's
being mapped into the other machine's web space, or even that it's
being used behind a proxy in the first place (unless it starts inspecting
X-Forwarded-... and/or Via headers).

Normally in this situation, however, you would expect that content
coming from the proxied machine would have to be inspected and have any
contained links rewritten to match the other machine's web space
(mod_proxy_html can do this: http://apache.webthing.com/mod_proxy_html/),
and indeed I had it working using mod_proxy_html when I realized it
wasn't necessary.  I prefer to avoid the extra overhead of inspecting
the content since it's not necessary if you set SCRIPT_NAME (with the
proviso that you mention below that you can no longer access it
directly if you do this).
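
For a sense of what that content inspection involves, here is a
deliberately naive Python sketch that prefixes root-relative links in HTML
attributes. It is not how mod_proxy_html works internally (that module
parses the markup properly); the regular expression, the attribute list and
the function name are assumptions chosen for the example.

```python
import re

# Matches href="/...", src='/...' and action="/..." but not "//host/..." links.
_ROOT_RELATIVE = re.compile(r'''\b(href|src|action)=(["'])/(?!/)''')


def prefix_links(html: str, prefix: str) -> str:
    """Prepend `prefix` to every root-relative link found in `html`."""
    prefix = prefix.rstrip("/")
    return _ROOT_RELATIVE.sub(
        lambda m: f"{m.group(1)}={m.group(2)}{prefix}/", html
    )


print(prefix_links('<a href="/timeline">Timeline</a>', "/fossil"))
# <a href="/fossil/timeline">Timeline</a>
```

Every response body has to pass through something like this, which is the
overhead that setting SCRIPT_NAME avoids.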

 However, this is a hack that works by accident. It works because
 'server' and 'cgi' share code paths and the 'server' code flow reads
 part of the CGI environment even though it shouldn't.

Yes but unless the current fossil architecture is changed it will keep  
working.  It's also fortunate that SCRIPT_NAME is used when  
constructing the login cookie -- but again, for the cgi command to  
keep working it needs to.

 Can you imagine the configuration
 headache if one had an unrelated SCRIPT_NAME environment variable and
 wasn't aware of this feature...

I was a bit surprised that SCRIPT_NAME was used even when the  
GATEWAY_INTERFACE environment variable is not set.  Probably  
SCRIPT_NAME should only be used if GATEWAY_INTERFACE is CGI/1.0 or  
later.  But even then that would just mean you needed to set that  
variable together with SCRIPT_NAME to use the hack.
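
A small Python sketch of the check suggested above: honour SCRIPT_NAME only
when GATEWAY_INTERFACE announces CGI/1.0 or later. The function name and
the dict standing in for the process environment are illustrative; the
actual change would have to be made in Fossil's C code.

```python
def effective_script_name(env: dict) -> str:
    """Return SCRIPT_NAME only if the environment looks like real CGI."""
    name, _, version = env.get("GATEWAY_INTERFACE", "").partition("/")
    try:
        is_cgi = name == "CGI" and float(version) >= 1.0
    except ValueError:
        # Missing or malformed version: treat as "not CGI".
        is_cgi = False
    return env.get("SCRIPT_NAME", "") if is_cgi else ""


# A stray SCRIPT_NAME without GATEWAY_INTERFACE is ignored.
print(repr(effective_script_name({"SCRIPT_NAME": "/fossil"})))  # ''
```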

 Also, the hack fixes the baseUrl to one defined prefix. Access to a
 fossil server setup using this hack becomes unusable from the web if
 accessed directly as well, nor can multiple baseurl's be mapped to a
 single fossil server instance. Whilst I'm quite happy that the hack
 fixes my immediate problem,

Yes, me too.

 I think a better engineered
 solution is preferable.

Undocumented behavior has a bad habit of breaking or going away -- a  
documented solution is preferred.

 How about using request headers for this? The reverse proxy could add
 two custom headers to the forwarded request (similar to X-Forwarded-For):
 - X-Fossil-Baseurl
 - X-Fossil-Repository
 Fossil would only look at these when in server mode.

Or in http mode.

 The first would specify the baseurl that is used to relocate all
 references in html/css output, and in redirect responses.

Using the SCRIPT_NAME hack and running in server mode, you do have  
to make sure that Location: redirects get corrected -- as you say, a  
reverse proxy can be expected to do this -- this is never necessary  
when running in cgi mode.
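
A minimal Python sketch of the Location rewriting a reverse proxy would
perform, in the spirit of Apache's ProxyPassReverse. It assumes the backend
emits absolute redirect URLs under its own address; the function name and
the example URLs are hypothetical.

```python
def rewrite_location(location: str, backend: str, public: str) -> str:
    """Map a backend redirect target onto the public URL space."""
    backend = backend.rstrip("/")
    if location == backend or location.startswith(backend + "/"):
        # Swap the backend prefix for the public one, keeping the rest.
        return public.rstrip("/") + location[len(backend):]
    # Redirects to other sites pass through untouched.
    return location


print(rewrite_location(
    "http://127.0.0.1:8080/timeline",
    "http://127.0.0.1:8080/",
    "http://example.org/fos/",
))
# http://example.org/fos/timeline
```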

It might be a bit of a challenge to catch all the redirects unless you  
hack it by prepending SCRIPT_NAME to the value stuffed into  
REQUEST_URI by the source in server and http modes.

This line in cgi.c (in the cgi_handle_http_request function):

   cgi_setenv("REQUEST_URI", zToken);

would need to change to set REQUEST_URI to the contents of SCRIPT_NAME  
concatenated to zToken instead of what it does now.  That would make  
the SCRIPT_NAME hack produce correct redirects and I believe make a  
SCRIPT_NAME hack running server be directly accessible.  Something  
like this but without the memory leak or double getenv call:

   cgi_setenv("REQUEST_URI", mprintf("%s%s",
              (getenv("SCRIPT_NAME") ? getenv("SCRIPT_NAME") : ""),
              zToken));

 This would work very well with my own (soon to be published, GPL'ed)
 reverse proxy. It would also work very well with Lighttpd, using its
 mod_magnet module. Would it be workable with Apache too? (I'm not
 familiar with Apache configuration).

The updated Apache 2 configuration to reverse proxy a fossil server  
running like this:

export SCRIPT_NAME=/fos
fossil server -P 8080 /path/to/some/fossil/repository

is this:

RewriteEngine On
RewriteRule ^/fos$ /fos/ [PT]
ProxyPreserveHost On
ProxyPass /fos/ http://machine_running_fossil_server:8080/
Location /fos/
ProxyPassReverse /
RequestHeader set 

Re: [fossil-users] Fossil behind reverse proxy

2010-01-28 Thread Kyle McKay
Paul,

I'm running a fossil server behind an Apache reverse proxy quite  
happily.  I've been meaning to add something to the wiki cookbook  
about this but just haven't got around to it yet.

I'm doing this because:

1. I want a fossil UI to be always on and available via my web server
2. I want the fossil server to run as a different user account than  
the web server processes
3. I don't want to use any suid programs (i.e. suExec)

My Apache web server is set up so that:

   http://my_server_name/fossil

Is reverse proxied to the fossil server process that is running as a  
daemon on a separate port

   http://my_server_name/anything-other-than-fossil-here

Serves up whatever else would normally be served on my server.

To make this work (I'm running on Darwin which is very Unix like) you  
need to do these two things (the examples assume you have a bash shell):

1. Start your fossil server daemon running with a shell script like this

#!/bin/sh
export SCRIPT_NAME=/fossil
fossil server -P 8000 full_path_to_fossil_repository_here &

If you want to start the fossil server in its own process group, add  
this line:

set -m

at the beginning of the script and add this line:

disown

at the end and you probably want to redirect fossil input, output and  
error to /dev/null as well so the final script to do all of this would  
look like (adding nohup also to make it immune to SIGHUP):

#!/bin/bash
set -m
export SCRIPT_NAME=/fossil
nohup fossil server -P 8000 full_path_to_fossil_repository_here \
</dev/null >/dev/null 2>&1 &
disown # this is a bashism

2. Add this configuration section to your Apache configuration

ProxyPass /fossil http://machine_your_fossil_server_is_running_on:8000
ProxyPreserveHost On
# ProxyPreserveHost is required since fossil inspects the Host value
# and without it fossil-generated links will point directly to fossil
# instead of the Apache server

3. Access your fossil server like this:

http://machine_apache_is_running_on/fossil

4. Optionally add a firewall rule to limit connections to the fossil  
server to only those coming from the Apache server machine (be nice if  
fossil had a loopback-only setting similar to postfix's to bind its  
socket listener to only localhost IPv4/IPv6 interfaces).

If you want your fossil URL to look like http://some_machine/foo/bar/scm
you would need to change the above example lines for starting your
fossil server and setting your Apache configuration as follows:

SCRIPT_NAME=/foo/bar/scm
ProxyPass /foo/bar/scm http://machine_your_fossil_server_is_running_on:8000

Similarly you can change the port the fossil server runs on just as  
easily.

It turns out that since fossil already handles running from an  
arbitrary web location as a cgi script, it quite happily will still  
use that arbitrary location when running as a server if you provide it  
via SCRIPT_NAME.

I wish there was functionality something like this though:

fossil server -P 8000 --ext .fsl path_to_directory_containing_.fsl_repositories

Where a single fossil server could serve up multiple fossil  
repositories.  You would just point it to the parent directory and  
tell it what repository extension to look for and then it would insert  
an additional element into the URL using the base name of the fossil  
repository minus the extension.  So if you had these repositories on  
your system:

   /some/directory/repository1.fsl
   /some/directory/repository2.fsl

And started the fossil server like this:

fossil server -P 8080 --ext .fsl /some/directory

Then you could access repository1.fsl like this:

http://localhost:8080/repository1

and repository2.fsl like this:

http://localhost:8080/repository2

and as a bonus you could get a list of available repositories with this:

http://localhost:8080/

(And, of course, still use the SCRIPT_NAME trick to change the URL  
location if you like.)
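
Here is a rough Python sketch of the URL-to-repository mapping described
above. It only covers listing the repositories and splitting the request
path; the function names are hypothetical, --ext is only the wished-for
option from above, and the extension is assumed to be non-empty.

```python
import os


def list_repositories(directory: str, ext: str = ".fsl") -> list:
    """Return the sorted base names of repository files in `directory`."""
    return sorted(
        name[: -len(ext)]
        for name in os.listdir(directory)
        if name.endswith(ext) and len(name) > len(ext)
    )


def route(path: str, directory: str, ext: str = ".fsl"):
    """Split '/repository1/timeline' into (repository file, '/timeline')."""
    first, _, rest = path.lstrip("/").partition("/")
    if first not in list_repositories(directory, ext):
        # Unknown name or bare "/": the caller could show the listing here.
        return None
    return os.path.join(directory, first + ext), "/" + rest
```

With /some/directory holding repository1.fsl and repository2.fsl, a request
for /repository1/timeline would map to the file
/some/directory/repository1.fsl and the remaining path /timeline, while a
request for / would fall through to the listing.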

I believe a fairly simple Perl or Python server script could use the
fossil http command to implement the multiple repository server, since
the SCRIPT_NAME technique also works with the fossil http command.
Hmmm, I might just have to write that script later today.

Kyle


[fossil-users] Fossil behind reverse proxy

2010-01-27 Thread Paul Ruizendaal
I just tried to put Fossil (running as server) behind a reverse proxy
(home grown, but similar to Pound).

That doesn't work very well, because Fossil prefixes all paths in its
output with a full baseURL (as seen by Fossil). The client can't use that
as the reverse proxy maps an entirely different prefix to the Fossil server
instance. I think the html/css output by Fossil should use relative paths,
not absolute paths.

In addition to the above, the 301 Redirect responses have the wrong URL, but
that is as per the HTTP RFC: it is a reasonable job for a reverse proxy to
rewrite the Location: header of a 301 response.

Before I attempt this rather massive patch: Richard, any remarks?

Paul

___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users


[fossil-users] Fossil behind reverse proxy

2010-01-27 Thread Paul Ruizendaal
It may be subtler and easier than I first thought:

Fossil already uses the host information from the Host: header, not from
its own IP. When in CGI mode, it already relocates all its absolute
references to include the prefix of the cgi script location.

When running as server Fossil does not do the above relocation but keeps
everything based at root ('/'), regardless of the path in the request uri.
Is there a reason that makes fossil CGI style relocation a bad idea for a
fossil running in server mode?
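
To illustrate the CGI-style relocation, a minimal Python sketch that builds
the base URL from the Host: header plus the script prefix and then derives
page links from it. The function names and the http default are
assumptions, not Fossil's actual implementation.

```python
def base_url(host: str, script_name: str = "", scheme: str = "http") -> str:
    """Combine the Host: header value with an optional script prefix."""
    prefix = script_name.strip("/")
    root = f"{scheme}://{host}"
    return f"{root}/{prefix}" if prefix else root


def link(base: str, page: str) -> str:
    """Build an absolute link to `page` underneath `base`."""
    return base + "/" + page.lstrip("/")


# CGI style: links carry the script prefix.
print(link(base_url("example.org", "/fossil"), "/timeline"))
# http://example.org/fossil/timeline

# Server mode as described above: everything is based at root.
print(link(base_url("localhost:8080"), "/timeline"))
# http://localhost:8080/timeline
```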

Paul
