Re: [fossil-users] Fossil server load control

2014-03-12 Thread Lluís Batlle i Rossell
On Wed, Mar 12, 2014 at 06:31:29PM +0100, Ramon Ribó wrote:
> > The current Fossil implementation runs a separate process for each HTTP
> > request.  So an in-memory cache wouldn't be helpful.  It has to be
> > disk-based.
>
> Does not FastCGI do exactly the opposite?

The current implementation simply calls fork() for each request. It does not
use fork+exec, which is the common case (CGI) that FastCGI is meant to improve.

AFAIU, using a separate process eases the heap memory handling.
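
To illustrate why fork-per-request rules out a shared in-memory cache,
here is a minimal Python sketch (Fossil itself is written in C, and this
is not its code). The names `cache` and `handle_request_in_child` are
made up for the example. Each child works on its own copy of the parent's
memory, so whatever it caches is gone when it exits. That is also why heap
handling is easy: the operating system reclaims everything at exit.

```python
import os

cache = {}  # lives in the parent (server) process


def handle_request_in_child(key):
    """Fork a child for one request; return how many cache entries it saw."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: works on a copy-on-write snapshot of the parent's memory.
        os.close(read_fd)
        cache[key] = "expensive result for " + key  # changes the child's copy only
        os.write(write_fd, str(len(cache)).encode())
        os.close(write_fd)
        os._exit(0)  # all memory the child allocated is released here
    # Parent: collect the child's report and reap it.
    os.close(write_fd)
    seen_by_child = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return seen_by_child
```

Calling the function twice with different keys reports one entry each
time, and the parent's `cache` stays empty, so the second request never
benefits from the first.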
___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users


Re: [fossil-users] Fossil server load control

2014-03-12 Thread Stephan Beal
On Wed, Mar 12, 2014 at 6:31 PM, Ramon Ribó wrote:

> > The current Fossil implementation runs a separate process for each HTTP
> > request.  So an in-memory cache wouldn't be helpful.  It has to be
> > disk-based.
>
> Does not FastCGI do exactly the opposite?

FastCGI requires that there be some sort of state object which it can
re-set between calls, and feed that state into each child. Fossil doesn't
have such a state object (it has one, but not one which can simply be
re-set/re-used), so FastCGI can't really do its magic with fossil.
libfossil (currently under construction and moving along nicely) provides
such a construct.
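
As a rough illustration of what a re-settable state object means for a
long-lived worker, here is a Python sketch. `RequestState` and `serve`
are hypothetical names and have nothing to do with the actual fossil or
libfossil APIs. The point is only that one process handles many requests,
so everything tied to a request must be cleared before the next one.

```python
from dataclasses import dataclass, field


@dataclass
class RequestState:
    """Per-request state that a long-lived worker must be able to clear."""
    user: str = "nobody"
    headers: dict = field(default_factory=dict)
    output: list = field(default_factory=list)

    def reset(self):
        # Return to a pristine state so nothing leaks into the next request.
        self.user = "nobody"
        self.headers.clear()
        self.output.clear()


def serve(requests):
    """Handle (user, path) pairs in a single process, reusing one state object."""
    state = RequestState()
    responses = []
    for user, path in requests:
        state.reset()
        state.user = user
        state.output.append(f"{path} for {state.user}")
        responses.append("".join(state.output))
    return responses
```

Without the `reset()` call, the second response would still carry the
output of the first, which is the kind of leak a fork-per-request design
never has to worry about.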

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do." -- Bigby Wolf


Re: [fossil-users] Fossil server load control

2014-03-12 Thread Andreas Kupries
On Wed, Mar 12, 2014 at 10:25 AM, Richard Hipp  wrote:
> On Wed, Mar 12, 2014 at 1:13 PM, Andreas Kupries 
> wrote:
>>
>> On Wed, Mar 12, 2014 at 6:40 AM, Richard Hipp  wrote:
>>
>> > And if you have alternative suggestions about how to keep a light-weight
>> > host running smoothly under a massive Fossil request load, please post
>> > follow-up comments.
>>
>> How sensible do you think would it be to have a (limited-size)
>> (in-memory|disk) cache to hold the most recently requested tarballs ?
>> That way a high-demand tarball, etc. would be computed only once and
>> then served statically from the cache.
>
>
> It's on my to-do list, actually.  The idea is to have a separate database
> that holds the cache.

Single-file strikes again ... I was thinking of a regular
directory and files, but that is an implementation detail. A database
might be a bit easier to manage (i.e. set up/remove).

The teapot server [1] has a disk cache, though not as a database; it is a
plain directory with files.
[1] http://docs.activestate.com/activetcl/8.5/tpm/tpm/files/CTP_teapot.html


> And yes it is complementary to the load management
> feature.

>> Side note: While the same benefits could be had by putting a regular
>> web cache in front of the fossil server, 
>
>
> No they can't actually, at least not by any technology I'm aware of.  The
> problem is that these requests must be authenticated.

Ack. Forgot the permission issue. Yes, we do not want to have
authenticated downloads in a public area.

>> I mentioned in-memory and disk ... I can see that a two-level scheme

> The current Fossil implementation runs a separate process for each HTTP
> request.  So an in-memory cache wouldn't be helpful.  It has to be
> disk-based.

Right. Getting an in-memory cache would require a redesign of the web
server part itself, to threaded or some such. ... Could be done, but
more work. ... Maybe Stephan can prototype that design in his
libfossil ;)

-- 
Andreas Kupries
Senior Tcl Developer
Code to Cloud: Smarter, Safer, Faster(tm)
F: 778.786.1133
andre...@activestate.com
http://www.activestate.com
Learn about Stackato for Private PaaS: http://www.activestate.com/stackato

EuroTcl'2014, July 12-13, Munich, GER


Re: [fossil-users] Fossil server load control

2014-03-12 Thread Richard Hipp
On Wed, Mar 12, 2014 at 1:31 PM, Ramon Ribó wrote:

> > The current Fossil implementation runs a separate process for each HTTP
> > request.  So an in-memory cache wouldn't be helpful.  It has to be
> > disk-based.
>
> Does not FastCGI do exactly the opposite?

Fossil doesn't support FastCGI, only SCGI.  And even with SCGI, Fossil
forks a new process to handle each request.

-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] Fossil server load control

2014-03-12 Thread Richard Hipp
On Wed, Mar 12, 2014 at 1:26 PM, Stephan Beal  wrote:

>
> In my experience, most proxies won't cache for requests which have URL
> parameters. Whether or not that's generally true, i can't say. For static
> content (lots of what fossil serves is static), the URLs can/should be
> written as /path/arg1/arg2, rather than /path?arg1=...&arg2=..., to make
> them "potentially more cacheable".
>
>
With a few carefully chosen exceptions, Fossil always sets "Cache-control:
no-cache" in the header of its replies, due in large part to those pesky
authentication cookies.
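
A sketch of that policy in Python, for illustration only. The thread does
not say which replies are the exceptions, so the prefixes below are
invented placeholders, and `cache_control` is a hypothetical helper, not
Fossil's code.

```python
# Hypothetical paths whose replies never depend on who is asking.
CACHEABLE_PREFIXES = ("/style.css", "/logo")


def cache_control(path):
    """Return the Cache-Control value for a reply to `path`."""
    if path.startswith(CACHEABLE_PREFIXES):
        return "max-age=86400"  # the carefully chosen exceptions
    return "no-cache"           # default: caches must revalidate with the server
```

Defaulting to "no-cache" means a shared cache in front of the server
cannot hand one user's authenticated reply to somebody else without
asking the server first.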



Re: [fossil-users] Fossil server load control

2014-03-12 Thread Ramon Ribó
> The current Fossil implementation runs a separate process for each HTTP
> request.  So an in-memory cache wouldn't be helpful.  It has to be
> disk-based.

Does not FastCGI do exactly the opposite?

RR


2014-03-12 18:25 GMT+01:00 Richard Hipp :

> On Wed, Mar 12, 2014 at 1:13 PM, Andreas Kupries wrote:
>
>> On Wed, Mar 12, 2014 at 6:40 AM, Richard Hipp  wrote:
>>
>> > And if you have alternative suggestions about how to keep a light-weight
>> > host running smoothly under a massive Fossil request load, please post
>> > follow-up comments.
>>
>> How sensible do you think would it be to have a (limited-size)
>> (in-memory|disk) cache to hold the most recently requested tarballs ?
>> That way a high-demand tarball, etc. would be computed only once and
>> then served statically from the cache.
>>
>
> It's on my to-do list, actually.  The idea is to have a separate database
> that holds the cache.  And yes it is complementary to the load management
> feature.
>
>
>>
>> Side note: While the same benefits could be had by putting a regular
>> web cache in front of the fossil server, 
>
>
> No they can't actually, at least not by any technology I'm aware of.  The
> problem is that these requests must be authenticated.  Downloads might be
> only authorized for certain users.  If an authorized user does a download,
> and squid caches it, some other unauthorized user might be able to obtain
> the download from cache.
>
> Even if downloads are currently authorized for anybody (which is the
> common case, at least on public repos), I don't think you want them being
> cached, since to do so would mean that turning off public downloads would
> be ineffective until the caches all expired.
>
>> I mentioned in-memory and disk ... I can see that a two-level scheme
>> here ... A smaller in-memory cache for the really high-demand pieces
>> with LRU, and a larger disk cache for the things not so much in-demand
>> at the moment, but possibly in the future. The disk cache could
>> actually be much larger (disks are large and cheap these days), this
>> would help with random access attacks (as they would become
>> asymptotically more difficult as the disk cache over time extends its
>> net of quickly served assets).
>>
>>
>
> The current Fossil implementation runs a separate process for each HTTP
> request.  So an in-memory cache wouldn't be helpful.  It has to be
> disk-based.
>


Re: [fossil-users] Fossil server load control

2014-03-12 Thread Stephan Beal
On Wed, Mar 12, 2014 at 6:13 PM, Andreas Kupries
wrote:

> How sensible do you think would it be to have a (limited-size)
> (in-memory|disk) cache to hold the most recently requested tarballs ?
> That way a high-demand tarball, etc. would be computed only once and
> then served statically from the cache.
>

FWIW: i was scratching down ideas for exactly this today for the
libfossil CGI demos because i don't like the memory cost of generating ZIP
files from script code. Caching the (say) 10 most recent ZIPs could
alleviate some of my load concerns. It need not be a synchable table, nor
one which survives a rebuild.
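
One way such a "10 most recent ZIPs" table could look, sketched in Python
with the standard sqlite3 module. The table name `zipcache`, its columns
and the helper functions are assumptions made up for this example; they
are not taken from fossil or libfossil. A logical counter stands in for a
timestamp so that "most recently used" is unambiguous.

```python
import sqlite3

MAX_ENTRIES = 10  # "the (say) 10 most recent ZIPs"


def open_cache(path=":memory:"):
    """Open (or create) the cache database; pass a file name to share it."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS zipcache("
        " key TEXT PRIMARY KEY, body BLOB NOT NULL, last_used INTEGER NOT NULL)"
    )
    return db


def _next_tick(db):
    # Logical clock: one more than the largest value stored so far.
    return db.execute(
        "SELECT COALESCE(MAX(last_used), 0) + 1 FROM zipcache"
    ).fetchone()[0]


def cache_get(db, key):
    """Return the cached bytes for `key`, or None on a miss."""
    row = db.execute("SELECT body FROM zipcache WHERE key = ?", (key,)).fetchone()
    if row is None:
        return None
    db.execute("UPDATE zipcache SET last_used = ? WHERE key = ?",
               (_next_tick(db), key))
    db.commit()
    return row[0]


def cache_put(db, key, body):
    """Store `body` and evict all but the MAX_ENTRIES most recently used rows."""
    db.execute(
        "INSERT OR REPLACE INTO zipcache(key, body, last_used) VALUES (?, ?, ?)",
        (key, body, _next_tick(db)),
    )
    db.execute(
        "DELETE FROM zipcache WHERE key NOT IN ("
        " SELECT key FROM zipcache ORDER BY last_used DESC LIMIT ?)",
        (MAX_ENTRIES,),
    )
    db.commit()
```

With a real file name instead of ":memory:", every forked request handler
could open the same cache, and deleting the file simply empties it, which
fits the "need not survive a rebuild" idea.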

> Note that I actually see this as a possible complement to the load mgmt
> feature.
> The cache would help if demand is high for a small number of
> revisions, whereas load mgmt would kick in and restrict load if the
> access pattern of revisions is sufficiently random/spread out to
> negate the cache (i.e. cause it to thrash).
>

+1


> would require more work to set up and admin. And might be a problem
> for the truly dynamic parts of the fossil web ui. An integrated cache
> just for the assets which are expensive to compute and yet
> (essentially) static does not have these issues.
>

In my experience, most proxies won't cache for requests which have URL
parameters. Whether or not that's generally true, i can't say. For static
content (lots of what fossil serves is static), the URLs can/should be
written as /path/arg1/arg2, rather than /path?arg1=...&arg2=..., to make
them "potentially more cacheable".
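
A small Python sketch of that rewrite, using only urllib from the
standard library. The function name `to_path_style` and the example URL
in the note below are hypothetical. The parameter names are dropped, so
the server has to know the positional order of the arguments.

```python
from urllib.parse import parse_qsl, quote, urlsplit


def to_path_style(url):
    """Rewrite /path?arg1=a&arg2=b as /path/a/b (parameter names are dropped)."""
    parts = urlsplit(url)
    # Percent-encode each value so it is safe as a single path segment.
    segments = [quote(value, safe="") for _, value in parse_qsl(parts.query)]
    if not segments:
        return parts.path
    return "/".join([parts.path.rstrip("/")] + segments)
```

For example, "/tarball?name=proj&uuid=abc123" becomes
"/tarball/proj/abc123", which a proxy is more likely to treat as a plain
static resource.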




Re: [fossil-users] Fossil server load control

2014-03-12 Thread Richard Hipp
On Wed, Mar 12, 2014 at 1:13 PM, Andreas Kupries
wrote:

> On Wed, Mar 12, 2014 at 6:40 AM, Richard Hipp  wrote:
>
> > And if you have alternative suggestions about how to keep a light-weight
> > host running smoothly under a massive Fossil request load, please post
> > follow-up comments.
>
> How sensible do you think would it be to have a (limited-size)
> (in-memory|disk) cache to hold the most recently requested tarballs ?
> That way a high-demand tarball, etc. would be computed only once and
> then served statically from the cache.
>

It's on my to-do list, actually.  The idea is to have a separate database
that holds the cache.  And yes it is complementary to the load management
feature.


>
> Side note: While the same benefits could be had by putting a regular
> web cache in front of the fossil server, 


No they can't actually, at least not by any technology I'm aware of.  The
problem is that these requests must be authenticated.  Downloads might be
only authorized for certain users.  If an authorized user does a download,
and squid caches it, some other unauthorized user might be able to obtain
the download from cache.

Even if downloads are currently authorized for anybody (which is the common
case, at least on public repos), I don't think you want them being cached,
since to do so would mean that turning off public downloads would be
ineffective until the caches all expired.
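
The difference with a cache that sits behind the permission check can be
sketched as follows (Python, hypothetical names throughout; a dict stands
in for what would have to be a disk-based store). Because authorization is
evaluated on every request before the cache is consulted, revoking
download permission takes effect immediately even though the tarball is
still cached.

```python
class TarballCache:
    """Cache consulted only after the per-request permission check passes."""

    def __init__(self, build):
        self._build = build   # expensive function: key -> bytes
        self._store = {}      # stand-in for an on-disk store
        self.builds = 0       # how many times the expensive path ran

    def fetch(self, user, key, may_download):
        # Authorization first, on every request, cached or not.
        if not may_download(user):
            raise PermissionError(f"{user} may not download {key}")
        if key not in self._store:
            self._store[key] = self._build(key)
            self.builds += 1
        return self._store[key]
```

An external proxy cannot do this, since it never sees the server's
current permission settings.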

> I mentioned in-memory and disk ... I can see that a two-level scheme
> here ... A smaller in-memory cache for the really high-demand pieces
> with LRU, and a larger disk cache for the things not so much in-demand
> at the moment, but possibly in the future. The disk cache could
> actually be much larger (disks are large and cheap these days), this
> would help with random access attacks (as they would become
> asymptotically more difficult as the disk cache over time extends its
> net of quickly served assets).
>
>
The current Fossil implementation runs a separate process for each HTTP
request.  So an in-memory cache wouldn't be helpful.  It has to be
disk-based.



Re: [fossil-users] Fossil server load control

2014-03-12 Thread Andreas Kupries
On Wed, Mar 12, 2014 at 6:40 AM, Richard Hipp  wrote:
> A new feature was recently added to Fossil that allows it to deny expensive
> requests (such as "blame" or "tarball" on a large repository) if the server
> load average is too high.  See
> http://www.fossil-scm.org/fossil/doc/tip/www/server.wiki#loadmgmt for
> further information.

Interesting.
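
For readers who have not followed the link, the general idea can be
sketched like this in Python. This is not Fossil's implementation: the
page names come from the announcement, but `EXPENSIVE_PAGES`,
`should_deny` and the threshold parameter are names assumed for the
example. `os.getloadavg()` is the standard Unix load-average call.

```python
import os

EXPENSIVE_PAGES = {"blame", "tarball"}  # examples named in the announcement


def should_deny(page, max_loadavg, loadavg=None):
    """Return True if an expensive page should be refused right now.

    A max_loadavg of 0 or less disables the check. `loadavg` can be
    passed in explicitly, which is handy for testing.
    """
    if page not in EXPENSIVE_PAGES or max_loadavg <= 0:
        return False
    if loadavg is None:
        loadavg = os.getloadavg()[0]  # 1-minute load average (Unix only)
    return loadavg > max_loadavg
```

Cheap pages are always served; only the expensive ones are turned away
while the host is busy.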

> I am pleased to announce that this new feature has passed its first test.
>
> About three hours ago, a single user in Beijing began downloading multiple
> copies of the same System.Data.SQLite tarball.  As of this writing, he has
> so far attempted to download that one tarball 11,784 times (at last count -

> a rate of about one per second, and each request takes about 3.1 seconds of
> CPU time in order to compute the 80MB tarball.

> And if you have alternative suggestions about how to keep a light-weight
> host running smoothly under a massive Fossil request load, please post
> follow-up comments.
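
A quick back-of-the-envelope calculation with the figures quoted above
(11,784 requests, about one per second, about 3.1 CPU seconds each),
written as a few lines of Python. It assumes every request would actually
have been computed, which is exactly what the load management prevented.

```python
REQUESTS = 11_784               # download attempts counted so far
CPU_SECONDS_PER_REQUEST = 3.1   # CPU time to build the 80MB tarball once
REQUESTS_PER_SECOND = 1.0       # observed arrival rate


def uncached_cost(requests=REQUESTS):
    """CPU hours needed, and cores kept busy, if every request is computed."""
    cpu_hours = requests * CPU_SECONDS_PER_REQUEST / 3600
    cores_busy = REQUESTS_PER_SECOND * CPU_SECONDS_PER_REQUEST
    return cpu_hours, cores_busy


def cached_cost():
    """CPU seconds needed if the tarball is built once and then served as-is."""
    return CPU_SECONDS_PER_REQUEST
```

That is roughly ten CPU hours and more than three cores kept busy without
a cache, against about three CPU seconds with one (ignoring the cost of
sending the bytes).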

How sensible do you think it would be to have a (limited-size)
(in-memory|disk) cache to hold the most recently requested tarballs?
That way a high-demand tarball, etc. would be computed only once and
then served statically from the cache.

Note that I actually see this as a possible complement to the load mgmt feature.
The cache would help if demand is high for a small number of
revisions, whereas load mgmt would kick in and restrict load if the
access pattern of revisions is sufficiently random/spread out to
negate the cache (i.e. cause it to thrash).
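
To make the thrashing point concrete, here is a tiny LRU simulation in
Python. `hit_rate` is a made-up helper and the request patterns in the
note below are synthetic, not measurements from any server.

```python
from collections import OrderedDict


def hit_rate(requests, capacity):
    """Fraction of requests served from an LRU cache holding `capacity` items."""
    cache = OrderedDict()
    hits = 0
    for key in requests:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / len(requests)
```

A hundred requests for one revision miss only the first time. Cycling
through 20 revisions with room for 10 never hits at all, because each
entry is evicted just before it is wanted again. That second pattern is
where load management has to take over.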

Side note: While the same benefits could be had by putting a regular
web cache in front of the fossil server, i.e. a squid or the like, this
would require more work to set up and administer. It might also be a
problem for the truly dynamic parts of the fossil web ui. An integrated
cache just for the assets which are expensive to compute and yet
(essentially) static does not have these issues.

I mentioned in-memory and disk ... I can see a two-level scheme
here ... A smaller in-memory cache with LRU for the really high-demand
pieces, and a larger disk cache for the things not so much in demand
at the moment, but possibly in the future. The disk cache could
actually be much larger (disks are large and cheap these days). This
would help with random access attacks (as they would become
asymptotically more difficult as the disk cache over time extends its
net of quickly served assets).
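
A minimal sketch of such a two-level scheme in Python, assuming a
long-lived server process (the memory level is pointless if every request
gets a fresh process). `TwoLevelCache`, its parameters and the file
naming are invented for the example. The disk level here is a plain
directory of files and has no size limit, which a real one would need.

```python
import hashlib
import os
from collections import OrderedDict


class TwoLevelCache:
    """Small in-memory LRU in front of a larger directory-of-files cache."""

    def __init__(self, directory, build, memory_slots=4):
        self._dir = directory
        self._build = build            # expensive function: key -> bytes
        self._slots = memory_slots
        self._memory = OrderedDict()
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        # Hash the key so any string maps to a safe file name.
        return os.path.join(self._dir, hashlib.sha256(key.encode()).hexdigest())

    def get(self, key):
        """Return (data, source); source is 'memory', 'disk' or 'built'."""
        if key in self._memory:
            self._memory.move_to_end(key)
            return self._memory[key], "memory"
        path = self._path(key)
        if os.path.exists(path):
            with open(path, "rb") as fh:
                data = fh.read()
            source = "disk"
        else:
            data = self._build(key)
            tmp = f"{path}.{os.getpid()}.tmp"
            with open(tmp, "wb") as fh:
                fh.write(data)
            os.replace(tmp, path)      # atomic rename: no half-written files
            source = "built"
        self._memory[key] = data
        if len(self._memory) > self._slots:
            self._memory.popitem(last=False)  # drop least recently used
        return data, source
```

An asset that falls out of the memory level is still served from disk
without being recomputed, so the expensive build runs once per asset for
as long as the disk copy is kept.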


