Re: 1.3 Wishlist: (Was: Re: consider reopening 1.3)

2003-11-17 Thread Igor Sysoev
On Sun, 16 Nov 2003, Rasmus Lerdorf wrote:

 On Sun, 16 Nov 2003, Jim Jagielski wrote:
  So a useful topic is: What is *missing* in 1.3 that needs to be 
  addressed.
  What are the features/additions that the disenfranchised 1.3 developers
  want to add to 1.3...
 
 How about support for chunked compressed responses right in 
 src/main/buff.c where it belongs (given we don't have filters in Apache1)

There is a mod_deflate module for Apache 1.3.x that patches buff.c and
compresses responses on the fly. It's available at http://sysoev.ru/en/


Igor Sysoev
http://sysoev.ru/en/



Re: mod_proxy distinguish cookies?

2004-04-25 Thread Igor Sysoev
On Sat, 24 Apr 2004, Neil Gunton wrote:

 Neil Gunton wrote:
 
  Hi all,
 
  I apologise in advance if this is obvious or otherwise been answered
  elsewhere, but I can't seem to find any reference to it.
 
  I am using Apache 1.3.29 with mod_perl, on Linux 2.4. I am running
  mod_proxy as a caching reverse proxy front end, and mod_perl on the
  backend. This works really well, but I have noticed that mod_proxy does
  not seem to be able to distinguish requests as being different if the
  URLs are the same, but they contain different cookies. I would like to
  be able to enable more personalization on my site, which would best be
  done using cookies. The problem is that when a page has an expiration
  greater than 'now', then any request to the same URL will get the cached
  version, even if the requests have different cookies. Currently I have
  to pass options around as part of the URL in order to make the requests
  look different to mod_proxy.
 
  Am I missing something here? Or, will this be included in either future
  versions of mod_proxy or the equivalent module in Apache 2.x? Any
  insights greatly appreciated.

 I should perhaps make clear that I do have cookies working through the
 proxy just fine, for pages that are set to be 'no-cache'. So this isn't
 an issue with the proxy being able to pass cookies to/from the backend
 and browser (which I think I have seen mentioned before as a bugfix),
 but rather with mod_proxy simply being able to distinguish otherwise
 identical URL requests that have different cookies, and cache those as
 different requests.

 So for example, the request GET /somedir/somepage.html?xxx=yyy passed
 with a cookie that value 'pics=small' should be seen as different from
 another identical request, but with cookie value 'pics=large'. Currently
 my tests indicate that mod_proxy returns the same cached page for each
 request.

 I assume that mod_proxy only checks the actual request string, and not
 the HTTP header which contains the cookie.

 Obviously, under this scheme, if you were using cookies to track
 sessions then all requests would get passed to the backend server - so,
 perhaps it would be a nice additional feature to be able to configure,
 through httpd.conf, how mod_proxy (or its successor) pays attention to
 cookies. For example, you might say something to the effect of 'ignore
 this cookie' or 'differentiate requests using this cookie'. Then we
 could have sitewide options like e.g. 'pics' (to set what size pictures
 are shown), and this could be used to distinguish cached pages, but
 other cookies might be ignored on some pages. This would allow for more
 flexibility, with some cached pages being sensitive to cookies, while
 others are not. An obvious way this would be useful is in the use of
 login cookies. These will be passed in by the browser for every page on
 the site, but this doesn't mean we want to distinguish cached pages
 based on it for every page. Some user-specific pages would have
 'no-cache' set, while other pages could be set to ignore this login
 cookie, thus gaining the benefits of the proxy caching. This would be
 useful for pages that have no user-specific or personalizable aspects -
 they could be cached regardless of who is logged in.

 Sorry if this wasn't clear from the original post, just wanted to
 clarify and expand... any advice on this would be VERY welcomed, since
 my options with personalization are currently rather limited.

 Also, if this is actually addressed to the wrong list for some reason
 then a pointer would be much appreciated...

mod_accel ( http://sysoev.ru/en/ ) allows you to take cookies into account
when caching:

AccelCacheCookie  some_cookie_name another_cookie_name

You can set it on a per-location basis.

Besides, my upcoming lightweight HTTP and reverse-proxy server, nginx,
will support this as well.


Igor Sysoev
http://sysoev.ru/en/


Re: mod_proxy distinguish cookies?

2004-05-05 Thread Igor Sysoev
On Mon, 3 May 2004, Neil Gunton wrote:

 Well, that truly sucks. If you pass options around in params then
 whenever someone follows a link posted by someone else, they will
 inherit that person's options. The only alternative might be to make
 pages 'No-Cache' and then set the 'AccelIgnoreNoCache' mod_accel
 directive (which I haven't tried, but I assume that's what it does)...
 so even though my server will get hit a lot more, at least it might be
 stopped by the proxy rather than hitting the mod_perl.

The AccelIgnoreNoCache directive makes mod_accel ignore a client's
Pragma: no-cache, Cache-Control: no-cache and Cache-Control: max-age=number
headers.

The AccelIgnoreExpires directive makes it ignore a backend's Expires,
Cache-Control: no-cache and Cache-Control: max-age=number headers.


Igor Sysoev
http://sysoev.ru/en/


Re: mod_deflate Vary header

2005-11-07 Thread Igor Sysoev

On Fri, 4 Nov 2005 [EMAIL PROTECTED] wrote:


This has been discussed many times before and no one
seems to understand what the fundamental problem is.

It is not with the servers at all, it is with the CLIENTS.

What both of you are saying is true... whether you Vary:
on Content-encoding and/or User-agent or not... there
is a risk of getting the wrong content ( compressed versus
uncompressed ) stuck in a downstream cache.

It is less and less likely these days that the cache will receive
a request from a client that CANNOT decompress, but it is still
possible. Handling requests from clients that cannot decompress
has become ( at long last ) a fringe case, but it is
no less important than ever.


Actually, with the appearance of MSIE 5.5+, the chances that a client
cannot decompress a response from a downstream cache have increased.
If MSIE 5.5 is configured to talk to a proxy using HTTP/1.0, it never
sends an Accept-Encoding header, and it refuses compressed content.

This is why my mod_deflate for Apache 1.3 does not compress responses
for proxied requests by default.


Microsoft Internet Explorer ( all versions ) will REFUSE to cache
anything locally if it shows up with any Vary: headers on it.
Period. End of sentence.



So you might think you are doing your downstream clients a
favor by tacking on a Vary: header to the compressed data
so it gets 'stored' somewhere close to them... but you would
be wrong.

If you don't put a Vary: header on it... MSIE will, in fact,
cache the compressed response locally and life is good.
The user won't even come back for it until it's expired on
their own hard drive or they clear their browser cache.

However... if you simply add a Vary: header to the same
compressed response... MSIE now refuses to cache that
response at all and now you create a thundering herd
scenario whereby the page is never local to the user for
any length of time and each forward or back button
hit causes the client to go upstream for the page each
and every time. Even if there is a cache nearby you would
discover that the clients are nailing it each and every time
for the same page just because it has a Vary: header on it.

I believe Netscape has the same problem(s).
I don't use Netscape anymore.
Anyone know for sure if Netscape actually stores variants
correctly in local browser cache?


Actually, MSIE 5.5+ will cache a response carrying any Vary header
if it also has a Content-Encoding: gzip or Content-Encoding: deflate
header. But it refuses to cache a response that carries a Vary header
without such a Content-Encoding header.

My mod_deflate can optionally add the Vary header to compressed
responses only.
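The caching matrix described above can be captured in a small sketch (hypothetical helper name and dict-shaped headers; mod_deflate itself is C, this just encodes the observed MSIE behaviour as a predicate):

```python
def msie55_caches_locally(headers: dict) -> bool:
    """Observed MSIE 5.5+ behaviour, as described above: a Vary
    header blocks local caching unless the response also carries
    Content-Encoding: gzip or Content-Encoding: deflate."""
    if "Vary" not in headers:
        return True
    return headers.get("Content-Encoding") in ("gzip", "deflate")

# Compressed response with Vary: cached locally.
assert msie55_caches_locally({"Vary": "Accept-Encoding",
                              "Content-Encoding": "gzip"})
# Uncompressed response with Vary: NOT cached (the thundering-herd case).
assert not msie55_caches_locally({"Vary": "Accept-Encoding"})
# No Vary at all: cached either way.
assert msie55_caches_locally({})
```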


Igor Sysoev
http://sysoev.ru/en/


Re: mod_deflate Vary header

2005-11-08 Thread Igor Sysoev

On Tue, 8 Nov 2005 [EMAIL PROTECTED] wrote:


Igor Sysoev wrote

Actually, with MSIE 5.5+ appearance the chances that client can not
decompress the response from downstream cache have increased.
If MSIE 5.5 is configured to work via proxy with HTTP/1.0, then
MSIE will never send Accept-Encoding header, and it would refuse
the compressed content.


You are right on the first part. If you don't have the Use HTTP/1.1
for Proxies checkbox on then no Accept-Encoding: header
would be sent... ( principle of least astonishment applies here ).

...however... I think you will discover on closer inspection that
the second part is not true. Even if MSIE 5.x doesn't send an
Accept-Encoding: gzip header... it KNOWS it can decompress
and will do so for any response that shows up with Content-encoding: gzip
regardless of whether it sent an Accept-Encoding: gzip header.


I've just checked MSIE 6.0 configured to work through a proxy with HTTP/1.0
(the default settings). I preloaded the compressed pages into the proxy using
Firefox, then requested the same pages from MSIE. MSIE showed
1) the compressed pages for the /page.html requests,
2) and a 'Save file' dialog for the /dir/ requests.


My mod_deflate allows optionally to add Vary header to compressed
response only.


So you still run the risk of only getting one variant 'stuck' in a
downstream cache. If the uncompressed version goes out at
'start of day' with no Vary: then the uncompressed version
can get 'stuck' where you don't want it until it actually expires.


I prefer to make compression as safe as possible, so the default settings
of my mod_deflate and of my new lightweight server, nginx, are:

1) compress text/html only;
2) compress HTTP/1.1 requests only;
3) do not compress responses to proxied requests at all.

And of course, mod_deflate and nginx have many directives to change this
behavior: for example, they may compress responses for proxied requests
if the responses carry a Cache-Control header with the private, no-store
or no-cache values.
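The defaults and the escape hatch above can be sketched as a predicate (hypothetical names and argument shapes, not the actual mod_deflate/nginx code):

```python
def should_compress(content_type: str, http_version: str, proxied: bool,
                    cache_control: frozenset = frozenset()) -> bool:
    """Conservative compression policy described above."""
    if content_type != "text/html":        # rule 1: text/html only
        return False
    if http_version != "HTTP/1.1":         # rule 2: HTTP/1.1 only
        return False
    if proxied:
        # rule 3, with the documented exception: a response marked
        # private/no-store/no-cache will not be stored by a shared
        # cache, so compressing it for a proxied request is safe.
        return bool(cache_control & {"private", "no-store", "no-cache"})
    return True

assert should_compress("text/html", "HTTP/1.1", proxied=False)
assert not should_compress("text/css", "HTTP/1.1", proxied=False)
assert not should_compress("text/html", "HTTP/1.0", proxied=False)
assert not should_compress("text/html", "HTTP/1.1", proxied=True)
assert should_compress("text/html", "HTTP/1.1", proxied=True,
                       cache_control=frozenset({"private"}))
```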


Igor Sysoev
http://sysoev.ru/en/


Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Igor Sysoev
On Thu, 21 Oct 2004, Roman Gavrilov wrote:

 so what would you suggest I should do ?
 implement it by myself ?

No, just look at http://sysoev.ru/mod_accel/
It's an Apache 1.3 module that does what you need.

Igor Sysoev
http://sysoev.ru/en/


 Bill Stoddard wrote:

  Graham Leggett wrote:
 
  Roman Gavrilov wrote:
 
  In my opinion it would be more efficient to let one process complete
  the request (using maximum line throughput) and return some busy
  code to other identical, simultaneous requests  until the file is
  cached locally.
  As anyone run into a similar situation? What solution did you find?
 
  In the original design for mod_cache, the second and subsequent
  connections to a file that was still in the process of being
  downloaded into the cache would shadow the cached file - in other
  words it would serve content from the cached file as and when it was
  received by the original request.
 
  The file in the cache was to be marked as still busy downloading,
  which meant threads/processes serving from the cached file would know
  to keep trying to serve the cached file until the still busy
  downloading status was cleared by the initial request. Timeouts
  would sanity check the process.
 
  This prevents the load spike that occurs just after a file is
  downloaded anew, but before that download is done.
 
  Whether this was implemented fully I am not sure - anyone?
 
 
  It was never implemented.


Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Igor Sysoev
On Thu, 21 Oct 2004, Roman Gavrilov wrote:

 after checking the mod_accel I found out that it works only with http;
 we need the cache & proxy to work with both http and https.
 What was the reason for disabling https proxying & caching?

How do you propose to do https reverse proxying?


Igor Sysoev
http://sysoev.ru/en/


Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Igor Sysoev
On Thu, 21 Oct 2004, Roman Gavrilov wrote:

 I don't see any problem using it, actually I am doing it. I am not
 talking about proxying between http and https.
 Mostly its used for mirroring (both frontend and backend use https only)
 no redirections on backend though :)


 ProxyPass /foo/bar https://mydomain/foobar/
 ProxyPassReverse https://mydomain/foobar/ /foo/bar

 I'll be more then glad to discuss it with you.

So the proxy should decrypt the stream, find the URI, then re-encrypt it
and pass it encrypted to the backend?


Igor Sysoev
http://sysoev.ru/en/


Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Igor Sysoev
On Thu, 21 Oct 2004, Roman Gavrilov wrote:

 No, when an https request gets to the server (Apache), it is decrypted
 first and then passed through Apache's routines; by the time it gets
 to the proxy part, the URI is already decrypted. The proxy in its turn
 issues a request to the backend https server and returns the answer to
 the client, of course after caching it.

Well, it's the same as I described.
And no, mod_accel cannot connect to a backend using https.

 Roman

 Igor Sysoev wrote:

 On Thu, 21 Oct 2004, Roman Gavrilov wrote:
 
 
 
 I don't see any problem using it, actually I am doing it. I am not
 talking about proxying between http and https.
 Mostly its used for mirroring (both frontend and backend use https only)
 no redirections on backend though :)
 
 
 ProxyPass /foo/bar https:/mydomain/foobar/
 ProxyPassReverse https:/mydomain/foobar/ /foo/bar
 
 I'll be more then glad to discuss it with you.
 
 
 
 So proxy should decrypt the stream, find URI, then encrypt it, and
 pass it encrypted to backend ?


Igor Sysoev
http://sysoev.ru/en/


Re: deflate in mod_deflate

2002-12-25 Thread Igor Sysoev
On Tue, 24 Dec 2002, Justin Erenkrantz wrote:

 As of right now, we have no plans to add 'deflate' support.  I'm
 not aware of any browsers/client that support 'deflate' rather
 than 'gzip.'
 
 My guess is that the encapsulation format is slightly different
 than with gzip, but I really haven't sat down to look at the
 relevant RFCs.  It might be easy, but it might be hard.

MSIE, Mozilla and Opera do not understand the RFC 1950 'deflate' format,
i.e. they want a raw RFC 1951 deflated stream only, without the zlib
header and the Adler-32 checksum trailer.
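The difference is easy to see with zlib itself (a sketch: a negative wbits value yields a raw RFC 1951 stream, while the default produces the RFC 1950 wrapper):

```python
import zlib

data = b"<html>hello deflate</html>" * 8

# RFC 1950 "zlib" stream: 2-byte header plus a 4-byte Adler-32 trailer
# around the deflate data. This is what the HTTP spec literally means
# by "deflate".
wrapped = zlib.compress(data)

# RFC 1951 raw deflate stream: no header, no checksum trailer.
# This is what MSIE, Mozilla and Opera actually expect to receive
# for Content-Encoding: deflate.
c = zlib.compressobj(wbits=-zlib.MAX_WBITS)
raw = c.compress(data) + c.flush()

assert wrapped[0] == 0x78      # zlib header CMF byte (32K window, method 8)
assert zlib.decompress(wrapped) == data
assert zlib.decompress(raw, -zlib.MAX_WBITS) == data   # raw stream round-trips
```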


Igor Sysoev
http://sysoev.ru/en/




Re: mod_deflate -- File size lower bound needed?

2003-04-02 Thread Igor Sysoev
On Mon, 31 Mar 2003 [EMAIL PROTECTED] wrote:

  we should also put in a directive to only compress when system load is 
  below a certain level. (but we would need a apr_get_system_load() 
  function first .. any volunteers? )
 
 If you go down this route watch out for what's called 'back-flash'.
 
 You can easily get into a catch-22 at the 'threshhold' rate
 where you are ping-ponging over/under the threshhold because
 currently executing ZLIB compressions will always be included in
 the 'system load' stat you are computing.
 
 In other words... if you don't want to compress because you
 think the machine is too busy then it might only be too
 busy because it's already compressing. The minute you
 turn off compression you drop under-threshhold and now
 you are 'thrashing' and 'ping-ponging' over/under the
 threshhold.
 
 You might want to always compare system load against
 transaction/compression task load to see if something other 
 than normal compression activity is eating the CPU.
 
 Low transaction count + high CPU load = Something other than
 compression is eating the CPU and stopping compressions
 won't really make much difference.
 
 High transaction count + high CPU load + high number
 of compressions in progress = Might be best to back
 off on the compressions for a moment.

In my mod_deflate for Apache 1.3.x I take CPU idle time into account
on FreeBSD 3.x+. To reduce ping-ponging I do not disable compression
entirely when idle time drops below the specified threshold; instead I
limit the number of processes that can concurrently compress output.
This gracefully throttles compression when idle time is low.

But I found this limitation to be almost useless on modern CPUs with
small enough responses (about 100K). A much more limiting factor is the
memory overhead that zlib requires for compression: 256K-384K.
So I have a directive that disables compression when the number of
Apache processes exceeds a specified limit, to avoid intensive swapping.
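The 256K-384K figure matches zlib's documented per-stream deflate footprint of (1 &lt;&lt; (windowBits + 2)) + (1 &lt;&lt; (memLevel + 9)) bytes; a quick check, assuming the stock windowBits=15 and memLevel=8 or 9:

```python
# zlib's documented deflate memory usage (see zconf.h):
#   (1 << (windowBits + 2)) + (1 << (memLevel + 9)) bytes per stream,
# not counting a few KB for the z_stream state itself.
def deflate_memory(window_bits: int = 15, mem_level: int = 8) -> int:
    return (1 << (window_bits + 2)) + (1 << (mem_level + 9))

assert deflate_memory(15, 8) == 256 * 1024   # default settings: 256K
assert deflate_memory(15, 9) == 384 * 1024   # max memLevel: 384K
# Shrinking the window reduces the history buffer's share:
assert deflate_memory(13, 8) == 160 * 1024
```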


Igor Sysoev
http://sysoev.ru/en/



mod_deflate

2001-12-20 Thread Igor Sysoev


Sorry, I had overlooked the discussion about renaming mod_gz to mod_deflate,
but a mod_deflate module already exists:
ftp://ftp.lexa.ru/pub/apache-rus/contrib/

It has been publicly available since April 2001 and is already installed
on many Russian sites and several non-Russian ones.
The documentation is in Russian only. Sorry.

Some features:
It patches Apache 1.3.x so that it can compress content without the
temporary files that mod_gzip uses. It supports two encodings, gzip
and deflate. It has some workarounds for buggy browsers.
On FreeBSD it can check CPU idle time and disable compression.

Igor Sysoev




Re: [PATCH] mod_deflate

2002-02-16 Thread Igor Sysoev

On Sat, 16 Feb 2002, Zvi Har'El wrote:

 On Fri, 15 Feb 2002 09:44:19 -0800, Ian Holsman wrote about Re: [PATCH] 
mod_deflate:
  
  
  I'm still not very happy about compressing EVERYTHING and excluding
  certain browsers
  as you would have to exclude IE & Netscape.
  
  so
  this is a
  -1 for this patch.
  in order to change this checks need to be there with a directive to
  ignore them (default:off)
  
 
 IMHO, deflating everything is a waste of the computer resources. HTML files
 really compress well. But most of the image files currently supported, e.g.,
 PNG, GIF, JPEG are already compressed, and deflating them will really do
 nothing -- just spend your CPU. I think that compressing text/html for browsers
 who send Accept-Encoding: gzip is the right approach. A possible enhancement
 is to have a directive (DeflateAddMimeType) which will enable deflating more
 mime types, e.g., text/plain, but these are really rare! Another type which is
 worth compressing is application/postscript, since many viewers (I am not an
 expert which - at least those decendents of GhostScript) are capable of
 viewing gzipped postscript files. The problem with that is that this is not a
 function of the browser, which cannot handle such files, but a function of the
 viewer, so the required Accept-Encoding: gzip doesn't prove anything about
 the ability of the external viewer!
 
 To summarize, I suggest deflating only types which can be handled by the
 browser itself, and which are not already compressed, which amounts to
 text/html or more generally text/* (text/css for instance).

In my mod_deflate module (for Apache 1.3.x) only text/html is enabled
by default. You can add or remove other types with the DeflateTypes
directive. Here are some recommendations:

application/x-javascript   NN4 does not understand it compressed.
text/css   the same.

text/plain   Macromedia FlashPlayer 4.x-5.x does not understand it
 compressed when it fetches it with the loadVariables() function via a browser.
text/xml   Macromedia FlashPlayer 5.x does not understand it
 compressed when it fetches it with the XML.load() function via a browser.

application/x-shockwave-flash   the FlashPlayer plugin for NN4 on Windows
 does not understand it compressed, although the plugin for Linux &
 NN4 works correctly.

text/rtf   MSIE 4.x-6.x understand these correctly
application/msword   when compressed. NN and Opera do not.
application/vnd.ms-excel
application/vnd.ms-powerpoint

Igor Sysoev




Re: [PATCH] mod_deflate

2002-02-16 Thread Igor Sysoev

On Sat, 16 Feb 2002, Eli Marmor wrote:

 Igor Sysoev wrote:
  
  On Sat, 16 Feb 2002, Zvi Har'El wrote:
  
  ...
  
  In my mod_deflate module (for Apache 1.3.x) I'd enabled by default
  text/html only. You can add or remove another type with DeflateTypes
  directive. Here are some recomendations:
  
  application/x-javascript   NN4 does not understand it compressed.
  text/css   the same.
  
  text/plain   Macromedia FlashPlayer 4.x-5.x does not understand it
   compressed when get it with loadVariables() function via browser.
  text/xml Macromedia FlashPlayer 5.x does not understand it
   compressed when get it with XML.load() function via browser.
  
  application/x-shockwave-flash   FlashPlayer plugin for NN4 for Windows
   does not understand it compressed. Although plugin for Linux
   NN4 work correctly.
  
  text/rtf   MSIE 4.x-6.x understand correctly them
  application/msword when compressed. NN and Opera does not.
  application/vnd.ms-excel
  application/vnd.ms-powerpoint
 
 I want to add that these issues (what to compress and what to leave as-
 is), were discussed very deeply and heavilly in the mod_gzip list.
 
 If we don't adopt mod_gzip but develop our own mod_deflate (both are
 good, by the way), we should at least use the long experience that
 mod_gzip has had.
 
 After being used in so many installations, and even being included in
 leading Linux distros, there is almost no combination of format/browser
 that has not been tested yet.
 
 Your research, Igor, is very helpful (and Zvi's as well), but we can
 base more default definitions on the defaults (or conclusions) of
 mod_gzip.

By default my mod_deflate compresses content if:
1. the MIME type is text/html;
2. the request is not proxied;
3. the request is HTTP/1.1.

As far as I know, mod_gzip uses only the first rule by default.
My rules are safer.
Also, mod_gzip has no workarounds for broken browsers.

 The list of default definitions may become quite long, but putting it
 inside an IfModule section, which separates it from the other parts of
 httpd.conf, may help. I believe that the improvement in bandwidth,
 deserves the price in size of httpd.conf.

Igor Sysoev




Re: [PATCH] mod_deflate

2002-02-16 Thread Igor Sysoev

On Sat, 16 Feb 2002, Ian Holsman wrote:

 Justin Erenkrantz wrote:
  On Sat, Feb 16, 2002 at 06:59:40PM +0100, Sander Striker wrote:
  
 Wow!  Obviously the code/default config need to be extremely
 conservative!
 
 Yes.  But browsers change (evolve to better things we hope), so config has
 my preference.  Hardcoding in default rules is badness IMHO.  But maybe that's
 just me.
 
  
  -1 if these restrictions are hardcoded in the module like it was
  before Sander's patch.  These problems should be handled by the
  configuration of mod_deflate not by hardcoding rules.
 
 this is BULLSHIT justin.
 you can't veto a change to make it behave like the old (more 
 conservative) behavior.
 GZIP encoding is VERY badly broken in all of the common browsers
 and saying 'well fix the browsers' isn't going to cut it. for a couple 
 of reasons
 1. apache 2 has 0% market share
 2. browsers aren't going to get fixed just because we want them to
 3. people are still using netscape 3.x out there, people will be using
 these broken browsers for a VERY long time.

The main problem is not old browsers: they do not send the
Accept-Encoding: gzip header, so they do not receive compressed content.
I know of two browsers that send Accept-Encoding yet can handle a
compressed response incorrectly - MSIE 4.x and Mozilla 0.9.1.

The main problem is proxies, especially Squid (~70% of all proxies).
A proxy can store a compressed response and return it to a browser
that does not understand gzipped content.

So you should by default disable encoding for requests with a Via header
and for HTTP/1.0 requests (an HTTP/1.1-compatible proxy must set the Via
header; an HTTP/1.0-compatible one should, but may not).
 
Igor Sysoev




Re: [PATCH] mod_deflate

2002-02-18 Thread Igor Sysoev

On Sun, 17 Feb 2002, Graham Leggett wrote:

 Igor Sysoev wrote:
 
  The main problem is proxies, especially Squid (~70% of all proxies)
  Proxies can store compressed response and return it to browser
  that does not understand gzipped content.
 
 Is this verified behavior? If a proxy returns compressed content to a
 browser that cannot handle it, then the content negotiation function
 inside the proxy is broken - and as squid has an active developer
 community, I seriously doubt that a bug this serious would go unfixed.

Yes, it is verified behavior. Squid can return cached compressed content
to a client that does not send Accept-Encoding: gzip. As far as I
know, MSProxy 2.0 does the same.

 RFC2616 describes the Vary header, which helps determine on what basis
 a document was negotiated. mod_deflate should use content negotiation
 and the presence of the Vary header to determine what to do, as is laid
 down in the HTTP spec.

  So you should by default disable encoding for requests
  with Via header and HTTP/1.0 requests (HTTP/1.1-compatible
  proxy must set Via header, HTTP/1.0-compatible should but not have).
 
 I disagree. Virtually all content is going to go through a proxy of some
 kind before reaching a browser. Doing this will effective render
 mod_deflate useless.

 mod_deflate should behave according to RFC2616 - and you won't have
 problems.

Using Vary does not resolve the problem completely.
Vary is defined for HTTP/1.1, but we need to work with HTTP/1.0
requests as well. Yes, Squid understands Vary, but it then does not cache
such responses at all (at least Squid 1.2.x-2.4.x). And MSIE 4.x does not
cache documents with Vary either. I don't know about later MSIE versions -
I will investigate.

Yes, compressing only HTTP/1.1, non-proxied requests is not as
effective as compressing any request with Accept-Encoding: gzip.
But every webmaster can choose whether to care about old clients or not.

As for efficiency: my mod_deflate module with conservative settings
(HTTP/1.1, non-proxied requests) saves up to 8M/s of bandwidth for
www.rambler.ru and search.rambler.ru.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache=directive support Apache1.3

2002-02-19 Thread Igor Sysoev

On Tue, 19 Feb 2002, Fowler, Brian wrote:

 Due to a requirement on a project we are currently working on involving
 using Apache as a caching reverse proxy server to WebLogic. 
  
 We are considering implementing the
  
 Cache-Control: no-cache=directive
  
 for the Apache 1.3 mod_proxy module so allow us to prevent the caching of
 certain headers served by WebLogic. (In particular the session cookie.)

I developed the mod_accel module:
ftp://ftp.lexa.ru/pub/apache-rus/contrib/mod_accel-1.0.13.tar.gz
The documentation is in Russian only, but an English translation has been
started: http://dapi.chaz.ru:8100/articles/mod_accel.xml

Features:
It allows reverse proxying only.
It frees the backend as soon as possible; mod_proxy can keep a backend
busy with a slow client, i.e. using mod_proxy to accelerate a backend
does not work well with slow clients.
It can use busy locks and limit the number of connections to a backend.
It implements primitive fault tolerance via DNS-balanced backends.
It can cache content taking some cookies into account while ignoring others.
It can ignore Pragma: no-cache and Authorization.
You can specify various buffer sizes.
It buffers the POST body.
It logs its state.

Drawbacks:
I don't think it works on Win32 - probably only under Cygwin.

 Has/is anyone working in this area? Is there any specific reason why this
 has deliberately not been implemented already? (E.g. performance hit?) Any
 views on this directive?

mod_proxy is a very ancient module and it is hard to maintain.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache=directive support Apache1.3

2002-02-19 Thread Igor Sysoev

On Tue, 19 Feb 2002, Graham Leggett wrote:

 Igor Sysoev wrote:
 
  mod_proxy is very ancient module and it's hard to maintain it.
 
 Er, mod_proxy _was_ a very ancient module, but has been largely
 overhauled in v1.3 and completely rewritten in v2.0 in both cases having
 full support of HTTP/1.1.

The main problem with mod_proxy is that it reads the backend response
into an 8K buffer and then sends it to the client; only after it has sent
that chunk to the client does it read from the backend again. After it has
sent the whole content to the client, it flushes the buffer, and only then
does it close the backend socket. Even if the backend has sent everything
into its kernel buffer and the response has been received into the
frontend's kernel buffer, the backend still has to wait for the frontend
in lingering_close. So we lose at least 2 seconds with a slow client and
a big response.

 Once mod_cache is finished in v2.0, (in theory) the capability will be
 there to disengage expensive backends and slow frontends from each other
 - so all your problems will be solved. :)

We will see with 2.0, but I suppose that a multi-threaded mod_perl backend
with 10 threads will occupy almost the same memory as 10 single-threaded
mod_perl processes.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache=directive support Apache1.3

2002-02-19 Thread Igor Sysoev

On Tue, 19 Feb 2002, Graham Leggett wrote:

 Igor Sysoev wrote:
 
  The main problem with mod_proxy is that it reads backend response
  to 8K buffer and than sends it to client. When it have sent it
  to client it reads again from backend. After it have sent whole
  content to client it flushes buffer and only after this it closes
  backend socket. Even backend send all to its kernel buffer and
  response is recevied in frontend kernel buffer nevertheless backend
  need to wait frontend in lingering_close. So we lose at least 2 seconds
  with small client and big response.
 
 Will making the 8k buffer bigger solve this problem?

No. It does not resolve the 2-second lingering close on the backend.

 I will check that once the end of a request has been detected from the
 backend, this backend connection is closed before attempting to send the
 last chunk to the frontend. This should have the effect that with a
 large enough buffer, the backend will not have to wait around while a
 slow frontend munches the bytes.

The 1.3.23 mod_proxy calls ap_proxy_send_fb() and then closes the backend
connection. But ap_proxy_send_fb() flushes the output to the client, so it
can hang for a long time.

   Once mod_cache is finished in v2.0, (in theory) the capability will be
   there to disengage expensive backends and slow frontends from each other
   - so all your problems will be solved. :)
  
  Will see 2.0 but I suppose that multi-threaded mod_perl backend with 10
  threads will occupy almost the same memory as 10 mod_perl single thread
  processes.
 
 But a single thread of a mod_perl backend will use less resources if it
 need only stick around for 100ms, than it will if it has to stick around
 for a minute.

Why would it stick around for only 100ms with a slow client? Will Apache 2.0
use separate threads for lingering_close?

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache=directive support Apache1.3

2002-02-21 Thread Igor Sysoev

On Thu, 21 Feb 2002, Joseph Wayne Norton wrote:

 After I read your posting, I downloaded but haven't tried to install
 the mod_accel.  From you description, it looks like a very, powerful
 module with pretty much the features that I have been looking for.
 Can mod_accel work with the mod_rewrite module (in a fashion similar?

mod_accel can work with mod_rewrite just as mod_proxy does (the [P] flag),
but mod_proxy loses this functionality if mod_accel is installed.
In all other respects mod_proxy can work alongside mod_accel in one Apache.

 In conjunction with mod_rewrite as url filter, I would like to be able
 to use mod_accel as a proxy for only the http request portion of a
 client request and allow for the http response portion to be served
 directly from the backend to the client.  This would be useful in
 situations where the response does not (or should not) have to be
 cached by the mod_accel cache.  However, I think this type of
 tcp-handoff cannot be performed soley by an application process such
 as apache.  Have you a similiar requirement or experience?

No.

But mod_accel can also simply proxy requests without caching.
You can set 'AccelNoCache on' on a per-server, per-Location and per-Files
basis. You can send an 'X-Accel-Expires: 0' header from the backend.
You can use the usual 'Cache-Control: no-cache' or 'Expires' headers.

With mod_accel, your mod_rewrite usage can be eliminated with the
AccelNoPass directive:

AccelPass     /  http://backend/
AccelNoPass   /images  /download  ~*\.jpg$

 Is it possible to integrate apache 2.0's mod_cache with mod_accel
 and/or add mod_accel's features to mod_proxy?

I have plans to make mod_accel Apache 2.0 compatible, but not right now;
I am waiting for Apache 2.0 to stabilize.
As to mod_proxy, I wrote a replacement for it because it's too
difficult to hack. It was much simpler to write the module from scratch.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache=directive support Apache1.3

2002-02-21 Thread Igor Sysoev

On Thu, 21 Feb 2002, Igor Sysoev wrote:

 On Thu, 21 Feb 2002, Joseph Wayne Norton wrote:
 
  After I read your posting, I downloaded but haven't tried to install
  mod_accel.  From your description, it looks like a very powerful
  module with pretty much the features that I have been looking for.
  Can mod_accel work with the mod_rewrite module (in a fashion similar?
 
 mod_accel can work with mod_rewrite just as mod_proxy does (the [P] flag),
 but mod_proxy would lose this functionality if mod_accel is installed.
 In all other cases mod_proxy can work alongside mod_accel in one Apache.
 
  In conjunction with mod_rewrite as url filter, I would like to be able
  to use mod_accel as a proxy for only the http request portion of a
  client request and allow for the http response portion to be served
  directly from the backend to the client.  This would be useful in
  situations where the response does not (or should not) have to be
  cached by the mod_accel cache.  However, I think this type of
  tcp-handoff cannot be performed solely by an application process such
  as apache.  Have you a similar requirement or experience?
 
 No.
 
 But mod_accel can also simply proxy requests without caching.
 You can set 'AccelNoCache on' on a per-server, per-Location and per-Files
 basis. You can send an 'X-Accel-Expires: 0' header from the backend.
 You can use the usual 'Cache-Control: no-cache' or 'Expires' headers.

Even more: by default, mod_accel does not cache a response if it has
no positive Expires or Cache-Control headers.
But you can cache these responses using the AccelDefaultExpires or
AccelLastModifiedFactor directives.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache=directive support Apache1.3

2002-02-23 Thread Igor Sysoev

On Wed, 20 Feb 2002, Graham Leggett wrote:

 Igor Sysoev wrote:
 
  1.3.23 mod_proxy calls ap_proxy_send_fb() and then closes the backend.
  But ap_proxy_send_fb() flushes output to the client, so it can hang
  for a long time.
 
 I have come up with a patch to solve this problem - in theory anyway.
 
 Can you test it and get back to me with whether it makes a difference or
 not...?
 
 The patch is being posted separately.

+/* allocate a buffer to store the bytes in */
+/* make sure it is at least IOBUFSIZE, as recv_buffer_size may be zero
+   for system default */
+buf_size = MAX(recv_buffer_size, IOBUFSIZE);
+buf = ap_palloc(r->pool, buf_size);

There is one drawback in this code. ap_palloc() is not good for
big allocations (I think > 16K) because it stores data and meta-data
together. I found this when trying to allocate memory from a pool
for zlib in mod_deflate. zlib needs about 390K: 2*128K + 2*64K + 6K.
After this change Apache had grown by about 2M after about an hour
at 50 requests/s. I'm not sure whether this growth would continue, but
I did not want an additional 2M in each Apache process.

I use malloc() for big allocations, store the addresses in an array
allocated from the pool, and set a cleanup on this array.
In the cleanup I free the addresses if they have not been freed already.
 
Igor Sysoev




Re: mod_proxy Cache-Control: no-cache=directive support Apache1.3

2002-02-27 Thread Igor Sysoev

On Wed, 27 Feb 2002, Joseph Wayne Norton wrote:

 For dynamic content that has been cached or can be cached, the
 Distributor component would simply send the response back to the
 client (as mod_proxy does now after talking with the backend).  For
 dynamic content that cannot be cached or doesn't need to be cached,
 the Distributor would implement a form of TCP handoff that would
 allow the backend to serve the response directly to the client.  This
 latter step probably cannot be done without some additional
 kernel-level module.

I do not understand why you want the backend to serve the
response directly to the client. If the client is slow then it will
keep the backend busy.

   Is it possible to integrate apache 2.0's mod_cache with mod_accel
   and/or add mod_accel's features to mod_proxy?
  
  Mod_proxy is no longer ancient nor hard to maintain, and as far as I am
  aware the new mod_proxy does almost everything mod_accel does - if it
  doesn't, tell me what's broken and I'll try to fix it.
  
 
 I haven't spent any time examining the source (or trying to extend) of
 mod_proxy or mod_accel so I am not able to judge either module.  
 
 The 2 main points that I picked up from Igor's mail that I'm not sure
 if mod_proxy supports or not:
 
  a. It frees the backend as soon as possible. mod_proxy can keep the
     backend busy with a slow client, i.e., using mod_proxy to accelerate
     a backend does not work with slow clients.

The last patch allows mod_proxy to be given a big buffer for the
backend response. But if the response is bigger than this buffer,
then a slow client can still stall the backend.

  b. It can use busy locks and limit the number of connections to the
     backend.

Yes, mod_proxy can not do this.

 One additional feature that I would like to have with mod_proxy is to
 have a way to install error_handler documents for all or individual
 backends.  This would allow apache to return a customized error page
 for individual backends for cases when the backend is not reachable,
 etc.

mod_accel allows it. It seems that mod_proxy in 1.3.23 allows it too but
I'm not sure.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache=directive support Apache1.3

2002-03-02 Thread Igor Sysoev

On Fri, 1 Mar 2002, Graham Leggett wrote:

 Igor Sysoev wrote:
 
  mod_proxy can not do many things that mod_accel can. Some of
  them can be easily implemented, some not.
 
 Keep in mind that mod_proxy is exactly that - a proxy. It does not try
 to duplicate functionality that is performed by other parts of Apache.
 (This is the main reason mod_proxy and mod_cache were separated from
 each other in v2.0.)

mod_accel is not a proxy. It's an accelerator. It can not work as a usual
proxy. I did not even try to implement it - Apache 1.3 is a poor proxy.
Squid or Oops are much better.

  mod_accel can:
  
  *) ignore headers like 'Pragma: no-cache' and 'Authorization'
 
 This is the job of mod_headers, not mod_proxy.
 
 However: ignoring headers violates the HTTP protocol and is not
 something that should be included in a product that claims to be as HTTP
 compliant as possible. If you want to cache heavy data sources, use the
 Cache-Control header correctly, or correct the design of the application
 so as to be less inefficient.

mod_accel can ignore a client's 'Pragma: no-cache' and
'Cache-Control: no-cache'. These headers are sent if you press the Reload
button in Netscape or Mozilla. By default, if mod_accel gets these headers
it does not look in the cache but sends the request to the backend.
A webmaster can set 'AccelIgnoreNoCache on' if he is sure that the
backend would not give fresher data and such requests only overload the
backend.

As to 'Authorization', mod_accel by default sends this header
to the backend and never caches such answers. A webmaster can set
'AccelIgnoreAuth on' if the backend never asks for authorization but
clients send 'Authorization' anyway - in this case 'Authorization'
is simply a very powerful 'no-cache' header.
I know at least one download utility, FlashGet, that sends the name and
password for anonymous FTP access in the 'Authorization' header.
It's probably a bug in FlashGet, but this bug effectively trashes the
cache and the backend.

Yes, of course, all these directives work at the per-Location and per-Files
level.

  *) log its results
 
 In theory mod_proxy (and mod_cache) should allow results to be logged
 via the normal logging modules. If this is not yet possible, it should
 be fixed.

In theory, but not in practice.

  *) pass cookies to backend even response can be cached
 
 Again RFC2616 dictates how this should be done - the proxy should support
 the specification.

As I said, mod_accel is not a proxy.
By default mod_accel does not send cookies to the backend if the response
can be cached. But a webmaster can set 'AccelPassCookie on'
and all cookies go to the backend. The backend is then responsible for
controlling which answers should be cached and which are not.
Anyway, 'Set-Cookie' headers never go to the cache.
This directive works at the per-Location and per-Files level.

  *) taking cookies into account while caching responses
  
  *) mod_accel has AccelNoPass directive
 
 What does this do?
 
 If it allows certain parts of a proxied URL space to be not-proxied,
 then the following will achieve this effect:
 
 ProxyPass /blah http://somewhere/blah
 ProxyPass /blah/somewhere/else !
 
 Everything under /blah is proxied, except for everything under
 /blah/somewhere/else.

Yes. But is '!' already implemented?
I use another syntax:

AccelPass     /  http://backend/
AccelNoPass   /images  /download  ~*\.jpg$

  *) proxy mass name-based virtual hosts with one directive on the frontend:
     AccelPass   /  http://192.168.1.1/  [PH]
     [PH] means preserve hostname, i.e. the request to the backend goes
     with the original 'Host' header.
 
 mod_accel does this in one directive, mod_proxy does it in two - but the
 effect is the same. Should we consider adding a combined directive to
 mod_proxy the same way mod_accel works?

What are the two mod_proxy directives?
As far as I know mod_proxy always changes the 'Host' header.

  *) resolve backend on startup
 
 This is a good idea.

mod_accel does it by default. You can disable it with the [NR] flag
in the AccelPass directive.

  *) make simple fault-tolerance with dns-balanced backends
 
 mod_proxy does this already.

No. mod_proxy tries it, but the code is broken. If a connection fails, it
tries to connect again with the same socket; it should make a new socket.
Anyway, mod_accel tries another backend if the connection failed, if the
backend has not sent a header, or if the backend has sent a 5xx response.

  *) use timeout when it connects to backend
 
 mod_proxy should do this - if it doesn't, it is a bug.

mod_proxy does not.

  *) use temporary file for buffering client request body (there is patch
 for mod_proxy)
 
 What advantage does this give?

Suppose a slow client (3K/s) POSTs a 10K form: the backend is busy
for 3 seconds. Now suppose a client uploads a 100K file.

  *) get the backend response as soon as possible even if it's very big.
     mod_accel uses a temporary file for buffering the backend response if
     the response can not fit in mod_accel's configurable buffer.
 
 This kind of thing is fixed in v2.0 in mod_cache. It is too big an
 architecture change for the v1.3 proxy.

mod_accel can send part of the answer to the client even if the backend
has not sent the whole answer. But even in this case a slow client never
blocks the backend - I use nonblocking operations and select().
Would it be possible with mod_cache?

RE: Allocating a buffer efficiently...?

2002-03-02 Thread Igor Sysoev

On Sat, 2 Mar 2002, Sander Striker wrote:

  In a recent patch to mod_proxy, a static buffer used to store data read
  from backend before it was given to frontend was changed to be allocated
  dynamically from a pool like so:
  
   +/* allocate a buffer to store the bytes in */
   +/* make sure it is at least IOBUFSIZE, as recv_buffer_size may be zero
   +   for system default */
   +buf_size = MAX(recv_buffer_size, IOBUFSIZE);
   +buf = ap_palloc(r->pool, buf_size);
  
   This change allows for a dynamically configurable buffer size, and fixes
   the code to be thread safe.
  
  However: it has been pointed out that this new code makes the Apache
  footprint significantly larger like so:
  
    There is one drawback in this code. ap_palloc() is not good for
    big allocations (I think > 16K) because it stores data and meta-data
    together. I found this when trying to allocate memory from a pool
    for zlib in mod_deflate. zlib needs about 390K: 2*128K + 2*64K + 6K.
    After this change Apache had grown by about 2M after about an hour
    at 50 requests/s. I'm not sure whether this growth would continue, but
    I did not want an additional 2M in each Apache process.
 
 Can you point me to the original post?  I'd like to see the context.
 Specifically, which pool is being used.

You see all the context - Graham has quoted almost my whole email.
As to the pool, I had tried to make the big allocation from
r->connection->client->pool. Keep-alives were off.

   I use malloc() for big allocations, store the addresses in an array
   allocated from the pool, and set a cleanup on this array.
   In the cleanup I free the addresses if they have not been freed already.
  
  Comments?

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache=directive support Apache1.3

2002-03-07 Thread Igor Sysoev

On Wed, 6 Mar 2002, Graham Leggett wrote:

  mod_accel is not a proxy. It's an accelerator. It can not work as a usual
  proxy. I did not even try to implement it - Apache 1.3 is a poor proxy.
  Squid or Oops are much better.
 
 Until recently you were not aware that the proxy had been updated - I
 would look at the code again before passing this judgement ;)

The main reason why Squid is better than Apache is its much smaller
memory overhead per connection. And of course, Squid has many other
proxying features - it's a proxy, not a webserver.

 For example, you pointed out some problems with Squid and content
 negotiation - mod_proxy doesn't have these problems.

Do you mean that Squid returns cached gzipped content to a client
that does not send 'Accept-Encoding'? mod_proxy 1.3.23 does the same.
Will it be changed in 1.3.24?

  mod_accel can ignore a client's 'Pragma: no-cache' and
  'Cache-Control: no-cache'. These headers are sent if you press the Reload
  button in Netscape or Mozilla. By default, if mod_accel gets these headers
  it does not look in the cache but sends the request to the backend.
  A webmaster can set 'AccelIgnoreNoCache on' if he is sure that the
  backend would not give fresher data and such requests only overload the
  backend.
 
 This design is broken.
 
 If the client sent a cache-control or pragma header it was because the
 client specifically wanted that behaviour. If this causes grief on the
 backend, then your backend needs to be redesigned so that it does not
 have such a performance hit.

I live in the real world, and so do many webmasters. It's not always
possible to redesign a backend. Unfortunately, during the Internet boom
too many brain-damaged solutions were born.

 Breaking the HTTP protocol isn't the fix to a broken backend.

I consider mod_accel and the backend as a single entity. It does not
matter to me which protocol I use for communication between them.
Clients see the nice HTTP protocol.

   Everything under /blah is proxied, except for everything under
   /blah/somewhere/else.
  
  Yes. But is '!' already implemented?
 
 Yes it is.

I suppose in 1.3.24? By the way, mod_accel's syntax is more flexible -
mod_accel can use regexps.

*) proxy mass name-based virtual hosts with one directive on frontend:
   AccelPass   /  http://192.168.1.1/[PH]
   [PH] means preserve hostname, i.e. request to backend would go with
   original 'Host' header.
  
   mod_accel does this in one directive, mod_proxy does it in two - but the
   effect is the same. Should we consider adding a combined directive to
   mod_proxy the same way mod_accel works...?
  
  What are the two mod_proxy directives?
  As far as I know mod_proxy always changes the 'Host' header.
 
 Use the ProxyPreserveHost option.

I suppose in 1.3.24 ?

  mod_accel can send part of the answer to the client even if the backend
  has not sent the whole answer. But even in this case a slow client never
  blocks the backend - I use nonblocking operations and select().
  Would it be possible with mod_cache?
 
 The idea behind mod_cache was to separate the send threads from the
 receive thread. This means that if a response is half-way downloaded,
 and a new request comes in, the new request will be served from the
 half-cached half-downloaded file, and not from a new request. When the
 original request is finished, the backend is released, and the receive
 threads carry on regardless.

Would it work in the prefork MPM?

   Both busy locks and limiting concurrent connections can be useful in a
   normal Apache server using mod_cgi, or one of the Java servlet
   connectors. Adding this to proxy means it can only be used in proxy -
   which is a bad idea.
  
  Probably, but Apache 1.3.x has no such module and I needed it too much
  in mod_accel.
 
 You should have created a separate module for this, and run it alongside
 mod_accel. This can still be done though.

I do not use mod_cgi or Java.

   This is the job of mod_rewrite.
  
  mod_rewrite can not do it.
 
 Then rewrite should be patched to do it.

Your phrase is like saying 'mod_rewrite should be patched to do some SSI
job'. mod_rewrite works with URLs and filenames only. It can not change
content. mod_randban changes content on the fly.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache=directive support Apache1.3

2002-03-09 Thread Igor Sysoev

On Fri, 8 Mar 2002, Graham Leggett wrote:

 Igor Sysoev wrote:
 
*) make simple fault-tolerance with dns-balanced backends.
  
   mod_proxy does this already.
  
  No. mod_proxy tries it, but the code is broken. If a connection fails, it
  tries to connect again with the same socket; it should make a new socket.
  Anyway, mod_accel tries another backend if the connection failed, if the
  backend has not sent a header, or if the backend has sent a 5xx response.
 
 I just checked this code - when a connection fails a new socket is
 created. Are you sure this has not been fixed since you last checked?

I had looked at 1.3.23.

Igor Sysoev




FreeBSD sendfile

2002-03-28 Thread Igor Sysoev

Hi,

apr_sendfile() for FreeBSD has a workaround for the nbytes != 0 bug,
but this bug has been fixed in CURRENT:

http://www.FreeBSD.org/cgi/cvsweb.cgi/src/sys/kern/uipc_syscalls.c#rev1.103

So I think the code should be the following:

#if __FreeBSD_version < 500029
    for (i = 0; i < hdtr->numheaders; i++) {
        bytes_to_send += hdtr->headers[i].iov_len;
    }
#endif

But this corrects the problem at build time only.
Suppose that someone has built Apache 2 on FreeBSD 4.x. Then he
upgrades FreeBSD to 5.1 or higher. Sometimes it's not possible
to rebuild Apache, so he would encounter the problem.

So I think that the better way is not to use the FreeBSD 4.x sendfile()
capability to send headers, but to use an emulation of header
transmission instead.


Igor Sysoev




Re: Does Solaris qsort suck

2002-04-06 Thread Igor Sysoev

On Sat, 6 Apr 2002, Yusuf Goolamabbas wrote:

 Well, That seems to be the view if one reads the following threads at
 the postgres mailing list and Sun's developer connection
 
 http://archives.postgresql.org/pgsql-hackers/2002-04/msg00103.php
 http://forum.sun.com/thread.jsp?forum=4&thread=7231
 
 Don't know if those cases would be seen by Solaris users of Apache
 2.0.x, but it might be useful to snarf FreeBSD's qsort.c and link Apache
 to it if a Solaris platform is detected

I think Apache does not sort more than ten items, so this bug
does not affect Apache.

Igor Sysoev




Re: Does Solaris qsort suck

2002-04-08 Thread Igor Sysoev

On Sun, 7 Apr 2002, Dale Ghent wrote:

 On Sat, 6 Apr 2002, Yusuf Goolamabbas wrote:
 
 | Well, That seems to be the view if one reads the following threads at
 | the postgres mailing list and Sun's developer connection
 |
 | http://archives.postgresql.org/pgsql-hackers/2002-04/msg00103.php
 | http://forum.sun.com/thread.jsp?forum=4&thread=7231
 |
 | Don't know if those cases would be seen by Solaris users of Apache
 | 2.0.x, but it might be useful to snarf FreeBSD's qsort.c and link Apache
 | to it if a Solaris platform is detected
 
 Solaris 8 included a huge increase in qsort performance. What version are
 you using?

I see this bug on

SunOS X 5.8 Generic_108529-07 i86pc i386 i86pc

But this bug does not affect Apache.

Igor Sysoev




Re: [SECURITY] Remote exploit for 32-bit Apache HTTP Server known

2002-06-21 Thread Igor Sysoev

On Fri, 21 Jun 2002 [EMAIL PROTECTED] wrote:

 Concerning this vulnerability: is it safe to assume that a patched
 reverse proxy will protect a vulnerable back end server from such
 malicious requests?

I think that even an unpatched Apache will protect the backend - like all
modules that deal with the client's body, mod_proxy does not support
chunked client requests. Of course, an unpatched frontend is still
vulnerable.

Igor Sysoev
http://sysoev.ru