Re: Is Apache Proxy Half-Duplex?

2002-05-22 Thread Igor Sysoev

On Wed, 22 May 2002, Zvi Har'El wrote:

> Experimenting with an Apache Proxy,  I noticed that in version 1.3 (the latest
> cvs snapshot) it behaves in a half-duplex fashion. That is, it doesn't read the
> backend server response until it has finished transmitting the client's
> request body. This is pretty annoying, mainly if the request involves a very
> large post (file upload), and the backend server response, after the headers,
> says "Please wait patiently...". I wonder: are there any intentions to change
> this? It seems that full-duplex operation requires two threads per proxy, which
> is not how the Apache proxy server works. Is the situation different, or going
> to be different, in Apache 2? Just for reference, the Squid proxy doesn't
> suffer from this deficiency.

If you use Apache 1.3, you can try mod_accel.
mod_accel uses temporary files if the backend response or the client POST
body is bigger than its memory buffer.

Igor Sysoev
http://sysoev.ru




Re: [SECURITY] Remote exploit for 32-bit Apache HTTP Server known

2002-06-21 Thread Igor Sysoev

On Fri, 21 Jun 2002 [EMAIL PROTECTED] wrote:

> Concerning this vulnerability: is safe to assume that a patched
> reverse proxy will protect a vulnerable back end server from such
> malicious requests?

I think that even an unpatched Apache will protect the backend: like all
modules that deal with the client's body, mod_proxy does not support
chunked client requests. Of course, an unpatched frontend is still vulnerable.

Igor Sysoev
http://sysoev.ru




mod_deflate

2001-12-20 Thread Igor Sysoev


Sorry, I had overlooked the discussion about renaming mod_gz to mod_deflate,
but a mod_deflate module already exists:
ftp://ftp.lexa.ru/pub/apache-rus/contrib/

It has been publicly available since April 2001 and is already installed
on many Russian sites and several non-Russian ones.
The documentation is in Russian only, sorry.

Some features:
It patches Apache 1.3.x so that it can compress content without
the temporary files that mod_gzip uses. It supports two encodings - gzip
and deflate. It has some workarounds for buggy browsers.
On FreeBSD it can check CPU idle time and disable compression under load.

Igor Sysoev




Re: [PATCH] mod_deflate

2002-02-16 Thread Igor Sysoev

On Sat, 16 Feb 2002, Zvi Har'El wrote:

> On Fri, 15 Feb 2002 09:44:19 -0800, Ian Holsman wrote about "Re: [PATCH] 
>mod_deflate":
> > 
> > 
> > I'm still not very happy about compressing EVERYTHING and excluding
> > certain browsers
> > as you would have to exclude IE & Netscape.
> > 
> > so
> > this is a
> > -1 for this patch.
> > in order to change this checks need to be there with a directive to
> > ignore them (default:off)
> > 
> 
> IMHO, deflating everything is a waste of the computer resources. HTML files
> really compress well. But most of the image files currently supported, e.g.,
> PNG, GIF, JPEG are already compressed, and deflating them will really do
> nothing -- just spend your CPU. I think that compressing text/html for browsers
> who send "Accept-Encoding: gzip" is the right approach. A possible enhancement
> is to have a directive (DeflateAddMimeType) which will enable deflating more
> mime types, e.g., text/plain, but these are really rare! Another type which is
> worth compressing is application/postscript, since many viewers (I am not an
> expert which - at least those decendents of GhostScript) are capable of
> viewing gzipped postscript files. The problem with that is that this is not a
> function of the browser, which cannot handle such files, but a function of the
> viewer, so the required "Accept-Encoding: gzip" doesn't prove anything about
> the ability of the external viewer!
> 
> To summerize, I suggest to deflate only types which can be handled by the
> browser itself, and which are not already compressed, which amounts to
> text/html or more generally text/* (text/css for instance).

In my mod_deflate module (for Apache 1.3.x) only "text/html" is enabled
by default. You can add or remove other types with the DeflateTypes
directive (a sample configuration follows the list below). Here are some
recommendations:

application/x-javascript   NN4 does not understand it compressed.
text/css                   the same.

text/plain   Macromedia Flash Player 4.x-5.x does not understand it
             compressed when it fetches it with loadVariables() via the browser.
text/xml     Macromedia Flash Player 5.x does not understand it
             compressed when it fetches it with XML.load() via the browser.

application/x-shockwave-flash   The Flash Player plugin for NN4 on Windows
             does not understand it compressed, although the plugin for
             NN4 on Linux works correctly.

text/rtf                        MSIE 4.x-6.x handle these correctly
application/msword              when compressed; NN and Opera do not.
application/vnd.ms-excel
application/vnd.ms-powerpoint
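For illustration only, a hypothetical httpd.conf fragment - I am assuming
here that DeflateTypes simply takes a list of MIME types to add, so check
the module's documentation for the exact syntax, and add the types above
only if their caveats do not matter for your audience:

DeflateTypes  text/plain  text/css  application/x-javascript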

Igor Sysoev




Re: [PATCH] mod_deflate

2002-02-16 Thread Igor Sysoev

On Sat, 16 Feb 2002, Eli Marmor wrote:

> Igor Sysoev wrote:
> > 
> > On Sat, 16 Feb 2002, Zvi Har'El wrote:
> > 
> > ...
> > 
> > In my mod_deflate module (for Apache 1.3.x) I'd enabled by default
> > "text/html" only. You can add or remove another type with DeflateTypes
> > directive. Here are some recomendations:
> > 
> > application/x-javascript   NN4 does not understand it compressed.
> > text/css   the same.
> > 
> > text/plain   Macromedia FlashPlayer 4.x-5.x does not understand it
> >  compressed when get it with loadVariables() function via browser.
> > text/xml Macromedia FlashPlayer 5.x does not understand it
> >  compressed when get it with XML.load() function via browser.
> > 
> > application/x-shockwave-flash   FlashPlayer plugin for NN4 for Windows
> >  does not understand it compressed. Although plugin for Linux
> >  NN4 work correctly.
> > 
> > text/rtf   MSIE 4.x-6.x understand correctly them
> > application/msword when compressed. NN and Opera does not.
> > application/vnd.ms-excel
> > application/vnd.ms-powerpoint
> 
> I want to add that these issues (what to compress and what to leave as-
> is), were discussed very deeply and heavilly in the mod_gzip list.
> 
> If we don't adopt mod_gzip but develop our own mod_deflate (both are
> good, by the way), we should at least use the long experience that
> mod_gzip has had.
> 
> After being used in so many installations, and even being included in
> leading Linux distros, there is almost no combination of format/browser
> that has not been tested yet.
> 
> Your research, Igor, is very helpful (and Zvi's as well), but we can
> base more default definitions on the defaults (or conclusions) of
> mod_gzip.

By default my mod_deflate compresses content only if:
1. the MIME type is text/html;
2. the request is not proxied;
3. the request is HTTP/1.1.

As far as I know mod_gzip uses only the first rule by default,
so my defaults are safer.
Also, mod_gzip has no workarounds for broken browsers.

> The list of default definitions may become quite long, but putting it
> inside an IfModule section, which separates it from the other parts of
> httpd.conf, may help. I believe that the improvement in bandwidth,
> deserves the price in size of httpd.conf.

Igor Sysoev




Re: [PATCH] mod_deflate

2002-02-16 Thread Igor Sysoev

On Sat, 16 Feb 2002, Ian Holsman wrote:

> Justin Erenkrantz wrote:
> > On Sat, Feb 16, 2002 at 06:59:40PM +0100, Sander Striker wrote:
> > 
> >>>Wow!  Obviously the code/default config need to be extremely
> >>>conservative!
> >>>
> >>Yes.  But browsers change (evolve to better things we hope), so config has
> >>my preference.  Hardcoding in default rules is badness IMHO.  But maybe that's
> >>just me.
> >>
> > 
> > -1 if these restrictions are hardcoded in the module like it was
> > before Sander's patch.  These problems should be handled by the
> > configuration of mod_deflate not by hardcoding rules.
> 
> this is BULLSHIT justin.
> you can't veto a change to make it behave like the old (more 
> conservative) behavior.
> GZIP encoding it VERY badly broken in all of the common browsers
> and saying 'well fix the browsers' isn't going to cut it. for a couple 
> of reasons
> 1. apache 2 has 0% market share
> 2. browsers arent going to get fixed just because we want them to
> 3. people are still using netscape 3.x out there, people will be using
> these broken browsers for a VERY long time.

The main problem is not old browsers.
They do not send the "Accept-Encoding: gzip" header, so
they do not receive compressed content.
I know of two browsers that send "Accept-Encoding" and
can still handle a compressed response incorrectly - MSIE 4.x
and Mozilla 0.9.1.

The main problem is proxies, especially Squid (~70% of all proxies).
Proxies can store a compressed response and return it to a browser
that does not understand gzipped content.

So by default you should disable encoding for requests that carry
a "Via" header and for HTTP/1.0 requests (an HTTP/1.1-compliant
proxy must set the "Via" header; an HTTP/1.0 proxy should, but often does not).
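For illustration, such a default could be implemented in an Apache 1.3
module with a check roughly like this sketch (not the actual mod_deflate
code; r->proto_num, HTTP_VERSION() and ap_table_get() are the stock
Apache 1.3 API):

    /* skip compression for HTTP/1.0 requests and for requests
       that came through a proxy (they carry a "Via" header) */
    if (r->proto_num < HTTP_VERSION(1, 1)
        || ap_table_get(r->headers_in, "Via") != NULL) {
        return DECLINED;
    }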
 
Igor Sysoev




Re: [PATCH] mod_deflate

2002-02-18 Thread Igor Sysoev

On Sun, 17 Feb 2002, Graham Leggett wrote:

> Igor Sysoev wrote:
> 
> > The main problem is proxies, especially Squid (~70% of all proxies)
> > Proxies can store compressed response and return it to browser
> > that does not understand gzipped content.
> 
> Is this verified behavior? If a proxy returns compressed content to a
> browser that cannot handle it, then the content negotiation function
> inside the proxy is broken - and as squid has an active developer
> community, I seriously doubt that a bug this serious would go unfixed.

Yes, it is verified behavior. Squid can return cached compressed content
to a client that does not send "Accept-Encoding: gzip". As far as I
know MSProxy 2.0 does the same.

> RFC2616 describes the "Vary" header, which helps determine on what basis
> a document was negotiated. mod_deflate should use content negotiation
> and the presence of the Vary header to determine what to do, as is laid
> down in the HTTP spec.
>
> > So you should by default disable encoding for requests
> > with "Via" header and HTTP/1.0 requests (HTTP/1.1-compatible
> > proxy must set "Via" header, HTTP/1.0-compatible should but not have).
> 
> I disagree. Virtually all content is going to go through a proxy of some
> kind before reaching a browser. Doing this will effective render
> mod_deflate useless.
>
> mod_deflate should behave according to RFC2616 - and you won't have
> problems.

Using "Vary" does not resolve the problem completetly.
"Vary" defined for HTTP/1.1 but we need to work with HTTP/1.0
requests also. Yes, Squid understand "Vary" and does not cache such
response at all (at least Squid 1.2.x-2.4.x). But MSIE 4.x does not cache
documents with "Vary" too. I don't know about later MSIE version -
I will investigate it.

Yes, compressing HTTP/1.1 and non-proxied requests only is not such
effective as compressing any request with "Accept-Encoding: gzip".
But any web master can choose is he intrested in old clients or not.

About efficiency - my mod_deflate module with conservative settings
(HTTP/1.1, non-proxied requests) save up to 8M/s bandwidth for
www.rambler.ru and search.rambler.ru.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-02-19 Thread Igor Sysoev

On Tue, 19 Feb 2002, Fowler, Brian wrote:

> Due to a requirement on a project we are currently working on involving
> using Apache as a caching reverse proxy server to WebLogic. 
>  
> We are considering implementing the
>  
> Cache-Control: no-cache=
>  
> for the Apache 1.3 mod_proxy module so allow us to prevent the caching of
> certain headers served by WebLogic. (In particular the session cookie.)

I have developed the mod_accel module:
ftp://ftp.lexa.ru/pub/apache-rus/contrib/mod_accel-1.0.13.tar.gz
The documentation is in Russian only, but an English translation has been
started: http://dapi.chaz.ru:8100/articles/mod_accel.xml

Features:
It allows reverse proxying only.
It frees the backend as soon as possible; mod_proxy can keep the backend
busy with a slow client, i.e. using mod_proxy to accelerate a backend
does not work well with slow clients.
It can use busy locks and limit the number of connections to the backend.
It implements primitive fault tolerance via DNS-balanced backends.
It can take some cookies into account while caching and ignore others.
It can ignore Pragma: no-cache and Authorization.
You can specify various buffer sizes.
It buffers the POST body.
It logs its state.

Drawbacks:
I think it cannot work on Win32 - probably only under Cygwin.

> Has/is anyone working in this area? Is there any specific reason why this
> has deliberately not been implemented already? (E.g. performance hit?) Any
> views on this directive?

mod_proxy is a very ancient module and it is hard to maintain.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-02-19 Thread Igor Sysoev

On Tue, 19 Feb 2002, Graham Leggett wrote:

> Igor Sysoev wrote:
> 
> > mod_proxy is very ancient module and it's hard to maintain it.
> 
> Er, mod_proxy _was_ a very ancient module, but has been largely
> overhauled in v1.3 and completely rewritten in v2.0 in both cases having
> full support of HTTP/1.1.

The main problem with mod_proxy is that it reads the backend response
into an 8K buffer and then sends it to the client. Only when it has sent
the data to the client does it read from the backend again. After it has
sent the whole content to the client it flushes the buffer, and only
after that does it close the backend socket. Even if the backend has
pushed everything into its kernel buffer and the response has already been
received in the frontend's kernel buffer, the backend still has to wait
for the frontend in lingering_close. So we lose at least 2 seconds
with a slow client and a big response.
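Roughly, the copying described above amounts to the following sketch
(illustrative only, written against the Apache 1.3 BUFF API; it is not the
literal ap_proxy_send_fb() source and the variable names are mine):

    char buf[IOBUFSIZE];
    int  n;

    while ((n = ap_bread(backend, buf, IOBUFSIZE)) > 0) {
        ap_bwrite(r->connection->client, buf, n); /* may block on a slow client */
    }
    ap_bflush(r->connection->client);             /* blocks again */
    ap_bclose(backend);  /* backend released only here, then lingering_close */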

> Once mod_cache is finished in v2.0, (in theory) the capability will be
> there to disengage expensive backends and slow frontends from each other
> - so all your problems will be solved. :)

We will see about 2.0, but I suppose that a multi-threaded mod_perl backend
with 10 threads will occupy almost the same memory as 10 single-threaded
mod_perl processes.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-02-19 Thread Igor Sysoev

On Tue, 19 Feb 2002, Graham Leggett wrote:

> Igor Sysoev wrote:
> 
> > The main problem with mod_proxy is that it reads backend response
> > to 8K buffer and than sends it to client. When it have sent it
> > to client it reads again from backend. After it have sent whole
> > content to client it flushes buffer and only after this it closes
> > backend socket. Even backend send all to its kernel buffer and
> > response is recevied in frontend kernel buffer nevertheless backend
> > need to wait frontend in lingering_close. So we lose at least 2 seconds
> > with small client and big response.
> 
> Will making the 8k buffer bigger solve this problem?

No. It does not remove the 2-second lingering close on the backend.

> I will check that once the end of a request has been detected from the
> backend, this backend connection is closed before attempting to send the
> last chunk to the frontend. This should have the effect that with a
> large enough buffer, the backend will not have to wait around while a
> slow frontend munches the bytes.

In 1.3.23 mod_proxy calls ap_proxy_send_fb() and then closes the backend.
But ap_proxy_send_fb() flushes the output to the client, so it can hang
for a long time.

> > > Once mod_cache is finished in v2.0, (in theory) the capability will be
> > > there to disengage expensive backends and slow frontends from each other
> > > - so all your problems will be solved. :)
> > 
> > Will see 2.0 but I suppose that multi-threaded mod_perl backend with 10
> > threads will occupy almost the same memory as 10 mod_perl single thread
> > processes.
> 
> But a single thread of a mod_perl backend will use less resources if it
> need only stick around for 100ms, than it will if it has to stick around
> for a minute.

Why would it stick around for only 100ms with a slow client? Will Apache 2.0
use separate threads for lingering_close?

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-02-21 Thread Igor Sysoev

On Wed, 20 Feb 2002, Graham Leggett wrote:

> Igor Sysoev wrote:
> 
> > 1.3.23 mod_proxy calls ap_proxy_send_fb() and than closes backend.
> > But ap_proxy_send_fb() flushes output to client so it can hang
> > for a long time.
> 
> I have come up with a patch to solve this problem - in theory anyway.
> 
> Can you test it and get back to me with whether it makes a difference or
> not...?

Unfortunately no - all my frontend servers have been running mod_accel for
almost a year and it is not possible to convert any of them back to mod_proxy.

The main lingering_close symptom on the backend is sockets in the FIN_WAIT2
state. Apache 1.3 does not show lingering_close in %T or in the scoreboard.

> The patch is being posted separately.

It seems that the backend socket is now closed in the right place.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-02-21 Thread Igor Sysoev

On Thu, 21 Feb 2002, Joseph Wayne Norton wrote:

> After I read your posting, I downloaded but haven't tried to install
> the mod_accel.  From you description, it looks like a very, powerful
> module with pretty much the features that I have been looking for.
> Can mod_accel work with the mod_rewrite module (in a fashion similar?

mod_accel can work with mod_rewrite just as mod_proxy does (the [P] flag),
but mod_proxy loses this functionality if mod_accel is installed.
In all other respects mod_proxy can coexist with mod_accel in one Apache.

> In conjunction with mod_rewrite as url filter, I would like to be able
> to use mod_accel as a proxy for only the http request portion of a
> client request and allow for the http response portion to be served
> directly from the backend to the client.  This would be useful in
> situations where the response does not (or should not) have to be
> cached by the mod_accel cache.  However, I think this type of
> tcp-handoff cannot be performed soley by an application process such
> as apache.  Have you a similiar requirement or experience?

No.

But mod_accel can simply proxy a request without caching it.
You can set 'AccelNoCache on' on a per-server, per-Location and per-Files
basis. You can send an 'X-Accel-Expires: 0' header from the backend.
You can also use the usual 'Cache-Control: no-cache' or 'Expires' headers.

With mod_accel your use of mod_rewrite can be eliminated with the
AccelNoPass directive:

AccelPass     /         http://backend/
AccelNoPass   /images   /download   ~*\.jpg$

> Is it possible to integrate apache 2.0's mod_cache with mod_accel
> and/or add mod_accel's features to mod_proxy?

I have plans to make mod_accel Apache 2.0 compatible, but not right now -
I am waiting for Apache 2.0 to stabilize.
As to mod_proxy, I wrote a replacement for it because it is too difficult
to hack; it was much simpler to write a module from scratch.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-02-21 Thread Igor Sysoev

On Thu, 21 Feb 2002, Igor Sysoev wrote:

> On Thu, 21 Feb 2002, Joseph Wayne Norton wrote:
> 
> > After I read your posting, I downloaded but haven't tried to install
> > the mod_accel.  From you description, it looks like a very, powerful
> > module with pretty much the features that I have been looking for.
> > Can mod_accel work with the mod_rewrite module (in a fashion similar?
> 
> mod_accel can work with mod_rewrite as mod_proxy does ([P] flag)
> but mod_proxy would loose this functionality if mod_accel installed.
> In all other cases mod_proxy can work with mod_accel in one Apache.
> 
> > In conjunction with mod_rewrite as url filter, I would like to be able
> > to use mod_accel as a proxy for only the http request portion of a
> > client request and allow for the http response portion to be served
> > directly from the backend to the client.  This would be useful in
> > situations where the response does not (or should not) have to be
> > cached by the mod_accel cache.  However, I think this type of
> > tcp-handoff cannot be performed soley by an application process such
> > as apache.  Have you a similiar requirement or experience?
> 
> No.
> 
> But mod_accel can simply proxies request without caching.
> You can set 'AccelNoCache on' on per-server, per-Location and per-Files
> basis. You can send 'X-Accel-Expires: 0' header from backend.
> You can use usual 'Cache-Control: no-cache" or Expires headers.

Even more: by default mod_accel does not cache a response unless it
has a positive "Expires" or "Cache-Control" header.
But you can cache such responses using the AccelDefaultExpires or
AccelLastModifiedFactor directives.
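For example (the argument formats below are my assumption only - consult
the mod_accel documentation for the exact syntax):

AccelDefaultExpires      1h
AccelLastModifiedFactor  0.1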

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-02-23 Thread Igor Sysoev

On Wed, 20 Feb 2002, Graham Leggett wrote:

> Igor Sysoev wrote:
> 
> > 1.3.23 mod_proxy calls ap_proxy_send_fb() and than closes backend.
> > But ap_proxy_send_fb() flushes output to client so it can hang
> > for a long time.
> 
> I have come up with a patch to solve this problem - in theory anyway.
> 
> Can you test it and get back to me with whether it makes a difference or
> not...?
> 
> The patch is being posted separately.

+/* allocate a buffer to store the bytes in */
+/* make sure it is at least IOBUFSIZE, as recv_buffer_size may be zero
+   for system default */
+buf_size = MAX(recv_buffer_size, IOBUFSIZE);
+buf = ap_palloc(r->pool, buf_size);

There is one drawback in this code: ap_palloc() is not good for
big allocations (I think > 16K) because it stores data and metadata
together. I found this when I tried to allocate memory from a pool
for zlib in mod_deflate; zlib needs about 390K - 2*128K + 2*64K + 6K.
After that change each Apache process had grown by about 2M after about
an hour at 50 requests/s. I'm not sure whether the growth would have
continued, but I did not want an additional 2M in each Apache process.

Instead, I use malloc() for big allocations, store the addresses in an
array allocated from the pool, and register a cleanup for that array.
In the cleanup I free the addresses that have not been freed already.
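A minimal sketch of that approach against the Apache 1.3 pool API - only
ap_pcalloc(), ap_register_cleanup() and ap_null_cleanup() are the real API
here; the helper names and the fixed-size array are mine:

#include <stdlib.h>

typedef struct {
    void *ptr[8];               /* big buffers allocated with malloc() */
    int   n;
} big_allocs;

static void big_allocs_cleanup(void *data)
{
    big_allocs *ba = data;
    int i;

    for (i = 0; i < ba->n; i++) {
        if (ba->ptr[i]) {
            free(ba->ptr[i]);
        }
    }
}

static void *big_alloc(big_allocs *ba, size_t size)
{
    void *v = malloc(size);

    if (v) {
        ba->ptr[ba->n++] = v;
    }
    return v;
}

/* usage in a handler:
 *
 *     big_allocs *ba = ap_pcalloc(r->pool, sizeof(big_allocs));
 *     ap_register_cleanup(r->pool, ba, big_allocs_cleanup, ap_null_cleanup);
 *     buf = big_alloc(ba, buf_size);
 *
 * if a buffer is freed by hand earlier, set the corresponding ba->ptr[i]
 * to NULL so the cleanup does not free it twice.
 */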
 
Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-02-26 Thread Igor Sysoev

On Wed, 27 Feb 2002, Graham Leggett wrote:

> > Is it possible to integrate apache 2.0's mod_cache with mod_accel
> > and/or add mod_accel's features to mod_proxy?
> 
> Mod_proxy is no longer ancient nor hard to maintain, and as far as I am
> aware the new mod_proxy does almost everything mod_accel does - if it
> doesn't, tell me what's broken and I'll try to fix it.

mod_proxy cannot do many things that mod_accel can. Some of
them could be easily implemented, some not.

mod_accel can:

*) ignore headers like 'Pragma: no-cache' and 'Authorization'.

*) log its results.

*) pass cookies to the backend even when the response can be cached.

*) take cookies into account while caching responses.

*) mod_accel has the AccelNoPass directive.

*) proxy mass name-based virtual hosts with one directive on the frontend:
   AccelPass   /  http://192.168.1.1/[PH]
   [PH] means "preserve hostname", i.e. the request to the backend goes out
   with the original 'Host' header.

*) resolve the backend at startup.

*) provide simple fault tolerance with DNS-balanced backends.

*) use a timeout when connecting to the backend.

*) use a temporary file for buffering the client request body (there is a
   patch for mod_proxy).

*) serve byte-range requests.

*) get the backend response as soon as possible, even if it is very big:
   mod_accel uses a temporary file to buffer the backend response if the
   response does not fit into mod_accel's configurable buffer.

*) use busy locks: if there are several identical requests to the backend,
   only one of them goes to the backend during the specified time.

*) limit concurrent connections and waiting processes on a per-backend
   or per-Location basis.

*) mod_accel ships with the mod_randban module, which can randomize some
   part of the content - for example, it can replace the '11111' number in
   a link such as http://host/path1?place=1&key=1234&rand=11111
   with a random value.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-02-27 Thread Igor Sysoev

On Wed, 27 Feb 2002, Joseph Wayne Norton wrote:

> For dynamic content that has been cached or can be cached, the
> "Distributor" component would simply send the response back to the
> client (as mod_proxy does now after talking with the backend).  For
> dynamic content that cannot be cached or doesn't need to be cached,
> the "Distributor" would implement a form of TCP handoff that would
> allow the backend to serve the response directly to the client.  This
> later step probably cannot be done without some additional
> kernel-level module.

I do not understand why you want the backend to serve the response
directly to the client. If the client is slow, it will keep the backend
busy.

> > > Is it possible to integrate apache 2.0's mod_cache with mod_accel
> > > and/or add mod_accel's features to mod_proxy?
> > 
> > Mod_proxy is no longer ancient nor hard to maintain, and as far as I am
> > aware the new mod_proxy does almost everything mod_accel does - if it
> > doesn't, tell me what's broken and I'll try to fix it.
> > 
> 
> I haven't spent any time examining the source (or trying to extend) of
> mod_proxy or mod_accel so I am not able to judge either module.  
> 
> The 2 main points that I picked up from Igor's mail that I'm not sure
> if mod_proxy supports or not:
> 
>  a. It frees backend as soon as possible. mod_proxy can keep busy
>backend with slow client, i.e, using mod_proxy to accelerate
>backend is not worked with slow clients .

The last patch allows a big buffer to be specified for mod_proxy to receive
the backend response. But if the response is bigger than this buffer,
a slow client can still stall the backend.

>  b. It can use busy locks and limit number of connection to
>backend.

Yes, mod_proxy cannot do that.

> One additional feature that I would like to have with mod_proxy is to
> have a way to install error_handler documents for all or individual
> backends.  This would allow apache to return a customized error page
> for individual backends for cases when the backend is not reachable,
> etc.

mod_accel allows it. It seems that mod_proxy in 1.3.23 allows it too but
I'm not sure.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-02-27 Thread Igor Sysoev

On Wed, 27 Feb 2002, Joseph Wayne Norton wrote:

> At Wed, 27 Feb 2002 11:57:46 +0300 (MSK),
> Igor Sysoev wrote:
> > 
> > I do not understand why do you want that the backend will serve
> > response directly to the client ? If ithe client is slow then it will
> > keep busy the backend.
> > 
> 
> I can imagine that if the file is larger than some limit, it would
> probably be more efficient to feed it directly to the client over the
> network than having to feed it to apache (which will then most likely
> buffer it and then feed it to the client).  However, I haven't really
> explored this in too much depth, so I don't have a good answer for
> you.

I assume that the backend is heavy and the proxy (Apache) is much lighter.
Otherwise, why do you need a proxy in the middle at all? You could serve all
requests from the backend.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-03-02 Thread Igor Sysoev

On Fri, 1 Mar 2002, Graham Leggett wrote:

> Igor Sysoev wrote:
> 
> > mod_proxy can not do many things that mod_accel can. Some of
> > them can be easy implemented, some not.
> 
> Keep in mind that mod_proxy is exactly that - a proxy. It does not try
> to duplicate functionality that is performed by other parts of Apache.
> (This is the main reason mod_proxy and mod_cache were separated from
> each other in v2.0)

mod_accel is not a proxy, it's an accelerator; it cannot work as a usual
proxy, and I did not even try to implement that - Apache 1.3 is a poor proxy.
Squid or Oops are much better.

> > mod_accel can:
> > 
> > *) ignore headers like 'Pragma: no-cache' and 'Authorization'.
> 
> This is the job of mod_headers, not mod_proxy.
> 
> However: ignoring headers violates the HTTP protocol and is not
> something that should be included in a product that claims to be as HTTP
> compliant as possible. If you want to cache heavy data sources, use the
> Cache-Control header correctly, or correct the design of the application
> so as to be less inefficient.

mod_accel can ignore a client's 'Pragma: no-cache' and
'Cache-Control: no-cache'. These headers are sent if you press the Reload
button in Netscape or Mozilla. By default, if mod_accel gets these headers
it does not look in the cache but sends the request to the backend.
A webmaster can set 'AccelIgnoreNoCache on' if he is sure that the
backend would not give fresher data and such requests only overload the backend.

As to 'Authorization', by default mod_accel sends this header
to the backend and never caches such answers. A webmaster can set
'AccelIgnoreAuth on' if the backend never asks for authorization but
clients send 'Authorization' anyway - in that case 'Authorization'
is simply a very powerful 'no-cache' header.
I know at least one download utility, FlashGet, that sends the name and
password for anonymous FTP access in the 'Authorization' header.
It's probably a bug in FlashGet, but this bug effectively trashes the cache
and the backend.

And yes, of course all these directives work at the per-Location and per-Files level.

> > *) log its results.
> 
> In theory mod_proxy (and mod_cache) should allow results to be logged
> via the normal logging modules. If this is not yet possible, it should
> be fixed.

In theory but not in practice.

> > *) pass cookies to backend even response can be cached.
> 
> Again RFC2616 dictates how this should be done - proxy should support
> the specification.

As I said, mod_accel is not a proxy.
By default mod_accel does not send cookies to the backend if the response
can be cached. But a webmaster can set 'AccelPassCookie on'
and then all cookies go to the backend; the backend is then responsible for
controlling which answers should be cached and which should not.
In any case 'Set-Cookie' headers never go into the cache.
This directive works at the per-Location and per-Files level.

> > *) taking cookies into account while caching responses.
> > 
> > *) mod_accel has AccelNoPass directive.
> 
> What does this do?
> 
> If it allows certain parts of a proxied URL space to be "not-proxied",
> then the following will achieve this effect:
> 
> ProxyPass /blah http://somewhere/blah
> ProxyPass /blah/somewhere/else !
> 
> Everything under /blah is proxied, except for everything under
> /blah/somewhere/else.

Yes. But is '!' already implemented?
I use another syntax:

AccelPass     /         http://backend/
AccelNoPass   /images   /download   ~*\.jpg$

> > *) proxy mass name-based virtual hosts with one directive on frontend:
> >AccelPass   /  http://192.168.1.1/[PH]
> >[PH] means preserve hostname, i.e. request to backend would go with
> >original 'Host' header.
> 
> mod_accel does this in one directive, mod_proxy does it in two - but the
> effect is the same. Should we consider adding a combined directive to
> mod_proxy the same way mod_accel works...?

What are the two mod_proxy directives?
As far as I know mod_proxy always changes the 'Host' header.

> > *) resolve backend on startup.
> 
> This is a good idea.

mod_accel does it by default. You can disable it with the [NR] flag
in the AccelPass directive.

> > *) make simple fault-tolerance with dns-balanced backends.
> 
> mod_proxy does this already.

No. mod_proxy tries to, but the code is broken: if a connection fails it
tries to connect again with the same socket, while it should create a new
socket. In any case mod_accel tries another backend if the connection fails,
if the backend has not sent a header, or if the backend has sent a 5xx
response.

> > *) use timeout when it connects to backend.
> 
> mod_proxy should do this - if it doesn't, it is a bug.

mod_proxy does not.

> > *) use temporary file for buffering client request body (

RE: Allocating a buffer efficiently...?

2002-03-02 Thread Igor Sysoev

On Sat, 2 Mar 2002, Sander Striker wrote:

> > In a recent patch to mod_proxy, a static buffer used to store data read
> > from backend before it was given to frontend was changed to be allocated
> > dynamically from a pool like so:
> > 
> > +/* allocate a buffer to store the bytes in */
> > +/* make sure it is at least IOBUFSIZE, as recv_buffer_size may be
> > zero for
> > system default */
> > +buf_size = MAX(recv_buffer_size, IOBUFSIZE);
> > +buf = ap_palloc(r->pool, buf_size);
> > 
> > This change allows for a dynamically configurable buffer size, and fixes
> > the code to be thread safe.
> > 
> > However: it has been pointed out that this new code makes the Apache
> > footprint significantly larger like so:
> > 
> > > There is one drawback in this code. ap_palloc() is not good for
> > > big allocations (I think > 16K) because it stores data and meta-data
> > > together. I had found this when try to allocate memory from pool
> > > for zlib in mod_deflate. zlib needs about 390K - 2*128K + 2*64K + 6K.
> > > After this change Apache had grown up about 2M after about hour
> > > with 50 requests/s. I'm not sure that this growing could continue but
> > > I did not want additional 2M on each Apache.
> 
> Can you point me to the original post?  I'd like to see the context.
> Specifically which pool is being used.

You can see the whole context - Graham quoted almost my whole email.
As to the pool, I tried to make the big allocation from
r->connection->client->pool. Keep-alives were off.

> > > I use malloc for big allocations, store addresses in array
> > > allocated from pool and set cleanup for this array.
> > > In cleanup I free addresses if they is not free already.
> > 
> > Comments...?

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-03-07 Thread Igor Sysoev

On Wed, 6 Mar 2002, Graham Leggett wrote:

> > mod_accel is not proxy. It's accelarator. It can not work as usual proxy.
> > I did not even try to implement it - Apache 1.3 is poor proxy. Squid or
> > Oops are much better.
> 
> Until recently you were not aware that the proxy had been updated - I
> would look at the code again before passing this judgement ;)

The main reason why Squid is better than Apache is its much lower
memory overhead per connection. And of course Squid has many other
proxying features - it is a proxy, not a web server.

> For example, you pointed out some problems with Squid and content
> negotiation - mod_proxy doesn't have these problems.

Do you mean that Squid returns cached gzipped content to a client
that does not send 'Accept-Encoding'? mod_proxy 1.3.23 does the same.
Will that be changed in 1.3.24?

> > mod_accel can ignore client's 'Pragma: no-cache' and
> > 'Cache-Control: no-cache'. These headers are sent if you press Reload
> > button in Netscape or Mozilla. By default if mod_accel gets these headers
> > then it does not look cache but send request to backend.
> > Webmaster can set 'AccelIgnoreNoCache on' if he sure that
> > backend did not give fresh data and such requests only overload backend.
> 
> This design is broken.
> 
> If the client sent a cache-control or pragma header it was because the
> client specifically wanted that behaviour. If this causes grief on the
> backend, then your backend needs to be redesigned so that it does not
> have such a performance hit.

I live in the real world, and so do many webmasters. It's not always possible
to redesign a backend. Unfortunately, during the Internet boom too many
brain-damaged solutions were born.

> Breaking the HTTP protocol isn't the fix to a broken backend.

I consider mod_accel and the backend as a single entity. It does not
matter to me which protocol I use for communication between them.
Clients see a nice, compliant HTTP protocol.

> > > Everything under /blah is proxied, except for everything under
> > > /blah/somewhere/else.
> > 
> > Yes. But '!' is already implemented ?
> 
> Yes it is.

In 1.3.24, I suppose? By the way, mod_accel's syntax is more flexible -
mod_accel can use regexps.

> > > > *) proxy mass name-based virtual hosts with one directive on frontend:
> > > >AccelPass   /  http://192.168.1.1/[PH]
> > > >[PH] means preserve hostname, i.e. request to backend would go with
> > > >original 'Host' header.
> > >
> > > mod_accel does this in one directive, mod_proxy does it in two - but the
> > > effect is the same. Should we consider adding a combined directive to
> > > mod_proxy the same way mod_accel works...?
> > 
> > What are two mod_proxy's directives ?
> > As far as I know mod_proxy always change 'Host' header.
> 
> Use the ProxyPreserveHost option.

In 1.3.24, I suppose?

> > mod_accel can send part of answer to client even backend has not sent
> > whole answer. But even in this case slow client never block backend -
> > I use nonblocking operations and select().
> > Would it be possible with mod_cache ?
> 
> The idea behind mod_cache was to separate the "send" threads from the
> "receive" thread. This means that if a response is half-way downloaded,
> and a new request comes in, the new request will be served from the
> half-cached half-downloaded file, and not from a new request. When the
> original request is finished, the backend is released, and the "receive"
> threads carry on regardless.

Would that work in the prefork MPM?

> > > Both busy locks and limiting concurrent connections can be useful in a
> > > normal Apache server using mod_cgi, or one of the Java servlet
> > > connectors. Adding this to proxy means it can only be used in proxy -
> > > which is a bad idea.
> > 
> > Probably but Apache 1.3.x has not such module and I needed it too much
> > in mod_accel.
> 
> You should have created a separate module for this, and run it alongside
> mod_accel. This can still be done though.

I did not use mod_cgi and Java.

> > > This is the job of mod_rewrite.
> > 
> > mod_rewrite can not do it.
> 
> Then rewrite should be patched to do it.

That is like saying 'mod_rewrite should be patched to do some SSI job'.
mod_rewrite works with URLs and filenames only; it cannot change content.
mod_randban changes content on the fly.

Igor Sysoev




Re: mod_proxy Cache-Control: no-cache= support Apache1.3

2002-03-09 Thread Igor Sysoev

On Fri, 8 Mar 2002, Graham Leggett wrote:

> Igor Sysoev wrote:
> 
> > > > *) make simple fault-tolerance with dns-balanced backends.
> > >
> > > mod_proxy does this already.
> > 
> > No. mod_proxy tries it but code is broken. If connection failed it try
> > to connect with the same socket. It should make new socket.
> > Anyway mod_accel tries another backend if connection failed, backend
> > has not sent header, and backend has send 5xx response.
> 
> I just checked this code - when a connection fails a new socket is
> created. Are you sure this has not been fixed since you last checked?

I looked at 1.3.23.

Igor Sysoev




FreeBSD sendfile

2002-03-28 Thread Igor Sysoev

Hi,

apr_sendfile() for FreeBSD has a workaround for the "nbytes != 0" bug,
but this bug has been fixed in CURRENT:

http://www.FreeBSD.org/cgi/cvsweb.cgi/src/sys/kern/uipc_syscalls.c#rev1.103

So I think the code should be the following:

#if __FreeBSD_version < 500029
    for (i = 0; i < hdtr->numheaders; i++) {
        bytes_to_send += hdtr->headers[i].iov_len;
    }
#endif

But this corrects the problem at build time only.
Suppose that someone has built Apache 2 on FreeBSD 4.x and then
upgrades FreeBSD to 5.1 or higher. Sometimes it's not possible
to rebuild Apache, so he would run into the problem again.

So I think the better way is not to use the FreeBSD 4.x sendfile()
capability to send the headers, but to emulate the header
transmission instead.
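A rough sketch of that emulation on FreeBSD follows (simplified: it assumes
a blocking socket and ignores partial writes, EINTR and EAGAIN handling;
the function name is mine):

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* send the headers with writev(), then let sendfile() transfer only the
   file body, so the 4.x/5.x difference in header accounting never matters */
static ssize_t send_headers_and_file(int s, struct iovec *hdr, int hdr_cnt,
                                     int fd, off_t offset, size_t len)
{
    off_t sent = 0;

    if (writev(s, hdr, hdr_cnt) == -1) {
        return -1;
    }

    if (sendfile(fd, s, offset, len, NULL, &sent, 0) == -1) {
        return -1;
    }

    return (ssize_t) sent;
}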


Igor Sysoev




Re: Does Solaris qsort suck

2002-04-06 Thread Igor Sysoev

On Sat, 6 Apr 2002, Yusuf Goolamabbas wrote:

> Well, That seems to be the view if one reads the following threads at
> the postgres mailing list and Sun's developer connection
> 
> http://archives.postgresql.org/pgsql-hackers/2002-04/msg00103.php
> http://forum.sun.com/thread.jsp?forum=4&thread=7231
> 
> Don't know if those cases would be seen by Solaris users of Apache
> 2.0.x, but it might be useful to snarf FreeBSD's qsort.c and link Apache
> to it if a Solaris platform is detected

I think Apache never sorts more than about ten items, so this bug
does not affect Apache.

Igor Sysoev




Re: Does Solaris qsort suck

2002-04-08 Thread Igor Sysoev

On Sun, 7 Apr 2002, Dale Ghent wrote:

> On Sat, 6 Apr 2002, Yusuf Goolamabbas wrote:
> 
> | Well, That seems to be the view if one reads the following threads at
> | the postgres mailing list and Sun's developer connection
> |
> | http://archives.postgresql.org/pgsql-hackers/2002-04/msg00103.php
> | http://forum.sun.com/thread.jsp?forum=4&thread=7231
> |
> | Don't know if those cases would be seen by Solaris users of Apache
> | 2.0.x, but it might be useful to snarf FreeBSD's qsort.c and link Apache
> | to it if a Solaris platform is detected
> 
> Solaris 8 included a huge increase in qsort performance. What version are
> you using?

I see this bug on

SunOS X 5.8 Generic_108529-07 i86pc i386 i86pc

But this bug does not affect Apache.

Igor Sysoev




Re: Does Solaris qsort suck

2002-04-08 Thread Igor Sysoev

On Mon, 8 Apr 2002, Justin Erenkrantz wrote:

> On Mon, Apr 08, 2002 at 01:53:13PM +0400, Igor Sysoev wrote:
> > I see this bug on
> > 
> > SunOS X 5.8 Generic_108529-07 i86pc i386 i86pc
> 
> You are sorely out-of-date on your kernel version.  Download
> the latest recommended patch cluster and I think the problem
> might go away.  (Sun is up to 108529-14 as of now.)  -- "sleeep"

Probably. It was simply a note that Solaris 8 without patches has a slow qsort().

Igor Sysoev




Re: mod_proxy distinguish cookies?

2004-04-25 Thread Igor Sysoev
On Sat, 24 Apr 2004, Neil Gunton wrote:

> Neil Gunton wrote:
> >
> > Hi all,
> >
> > I apologise in advance if this is obvious or otherwise been answered
> > elsewhere, but I can't seem to find any reference to it.
> >
> > I am using Apache 1.3.29 with mod_perl, on Linux 2.4. I am running
> > mod_proxy as a caching reverse proxy front end, and mod_perl on the
> > backend. This works really well, but I have noticed that mod_proxy does
> > not seem to be able to distinguish requests as being different if the
> > URLs are the same, but they contain different cookies. I would like to
> > be able to enable more personalization on my site, which would best be
> > done using cookies. The problem is that when a page has an expiration
> > greater than 'now', then any request to the same URL will get the cache
> > version, even if the requests have different cookies. Currently I have
> > to pass options around as part of the URL in order to make the requests
> > look different to mod_proxy.
> >
> > Am I missing something here? Or, will this be included in either future
> > versions of mod_proxy or the equivalent module in Apache 2.x? Any
> > insights greatly appreciated.
>
> I should perhaps make clear that I do have cookies working through the
> proxy just fine, for pages that are set to be 'no-cache'. So this isn't
> an issue with the proxy being able to pass cookies to/from the backend
> and browser (which I think I have seen mentioned before as a bugfix),
> but rather with mod_proxy simply being able to distinguish otherwise
> identical URL requests that have different cookies, and cache those as
> different requests.
>
> So for example, the request "GET /somedir/somepage.html?xxx=yyy" passed
> with a cookie that value 'pics=small' should be seen as different from
> another identical request, but with cookie value 'pics=large'. Currently
> my tests indicate that mod_proxy returns the same cached page for each
> request.
>
> I assume that mod_proxy only checks the actual request string, and not
> the HTTP header which contains the cookie.
>
> Obviously, under this scheme, if you were using cookies to track
> sessions then all requests would get passed to the backend server - so,
> perhaps it would be a nice additional feature to be able to configure,
> through httpd.conf, how mod_proxy (or its successor) pays attention to
> cookies. For example, you might say something to the effect of "ignore
> this cookie" or "differentiate requests using this cookie". Then we
> could have sitewide options like e.g. 'pics' (to set what size pictures
> are shown), and this could be used to distinguish cached pages, but
> other cookies might be ignored on some pages. This would allow for more
> flexibility, with some cached pages being "sensitive" to cookies, while
> others are not. An obvious way this would be useful is in the use of
> login cookies. These will be passed in by the browser for every page on
> the site, but this doesn't mean we want to distinguish cached pages
> based on it for every page. Some user-specific pages would have
> 'no-cache' set, while other pages could be set to ignore this login
> cookie, thus gaining the benefits of the proxy caching. This would be
> useful for pages that have no user-specific or personalizable aspects -
> they could be cached regardless of who is logged in.
>
> Sorry if this wasn't clear from the original post, just wanted to
> clarify and expand... any advice on this would be VERY welcomed, since
> my options with personalization are currently rather limited.
>
> Also, if this is actually addressed to the wrong list for some reason
> then a pointer would be much appreciated...

mod_accel ( http://sysoev.ru/en/ ) can take cookies into account while
caching:

AccelCacheCookie  some_cookie_name another_cookie_name

You can set it on a per-location basis.

Besides, my upcoming lightweight HTTP and reverse proxy server, nginx,
will allow this too.


Igor Sysoev
http://sysoev.ru/en/


Re: mod_proxy distinguish cookies?

2004-04-26 Thread Igor Sysoev
On Mon, 26 Apr 2004, Graham Leggett wrote:

> Igor Sysoev wrote:
>
> > mod_accel ( http://sysoev.ru/en/ ) allows to take cookies into account while
> > caching:
> >
> > AccelCacheCookie  some_cookie_name another_cookie_name
> >
> > You can set it on per-location basis.
> >
> > Besides, my upcoming light-weight http and reverse proxy server nginx
> > will allow to do it too.
>
> Double check first whether this is allowed by RFC2616 - remember that
> the Apache mod_proxy is very unlikely to be the only proxy in the chain,
> so even if mod_proxy takes cookies into account, other caches in the
> chain might not.

mod_accel is a reverse proxy only (unlike mod_proxy, which can also be used
in the usual proxy mode), so I consider mod_accel+backend (or nginx+backend)
as a single entity, and I think that any RFCs can be violated inside this
entity, between the frontend and the backend.


Igor Sysoev
http://sysoev.ru/en/


Re: mod_proxy distinguish cookies?

2004-05-05 Thread Igor Sysoev
On Wed, 5 May 2004 [EMAIL PROTECTED] wrote:

> Few people know this but Microsoft Internet Explorer and other
> major browsers only PRETEND to support "Vary:".
>
> In MSIE's case... there is only 1 value that you can use with
> "Vary:" that will cause MSIE to make any attempt at all to
> cache the response and/or deal with a refresh later.
>
> That value is "User-Agent".
>
> MSIE treats all other "Vary:" header values as if it
> received "Vary: *" and will REFUSE to cache that
> response at all.

I did not check the "Vary: User-Agent" header value, but as far as I know
MSIE 5.5 and 6.x can cache a response with ANY "Vary" header value
if that response contains a "Content-Encoding: gzip" or
"Content-Encoding: deflate" header. MSIE 4.x and 5.0x can cache such
responses only if the URL has a ".html" or ".htm" extension. There are
probably other cacheable extensions. MSIE 4.x always sends
"If-Modified-Since" for these cached responses.


Igor Sysoev
http://sysoev.ru/en/


Re: mod_proxy distinguish cookies?

2004-05-05 Thread Igor Sysoev
On Mon, 3 May 2004, Neil Gunton wrote:

> Well, that truly sucks. If you pass options around in params then
> whenever someone follows a link posted by someone else, they will
> inherit that person's options. The only alternative might be to make
> pages 'No-Cache' and then set the 'AccelIgnoreNoCache' mod_accel
> directive (which I haven't tried, but I assume that's what it does)...
> so even though my server will get hit a lot more, at least it might be
> stopped by the proxy rather than hitting the mod_perl.

The "AccelIgnoreNoCache" disables a client's "Pragma: no-cache",
"Cache-Control: no-cache" and "Cache-Control: max-age=" headers.

The "AccelIgnoreExpires" disables a backend's "Expires",
"Cache-Control: no-cache" and "Cache-Control: max-age=" headers.


Igor Sysoev
http://sysoev.ru/en/


Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Igor Sysoev
On Thu, 21 Oct 2004, Roman Gavrilov wrote:

> so what would you suggest I should do ?
> implement it by myself ?

No, just look at http://sysoev.ru/mod_accel/
It's an Apache 1.3 module of the kind you need.

Igor Sysoev
http://sysoev.ru/en/


> Bill Stoddard wrote:
>
> > Graham Leggett wrote:
> >
> >> Roman Gavrilov wrote:
> >>
> >>> In my opinion it would be more efficient to let one process complete
> >>> the request (using maximum line throughput) and return some busy
> >>> code to other identical, simultaneous requests  until the file is
> >>> cached locally.
> >>> As anyone run into a similar situation? What solution did you find?
> >>
> >> In the original design for mod_cache, the second and subsequent
> >> connections to a file that was still in the process of being
> >> downloaded into the cache would shadow the cached file - in other
> >> words it would serve content from the cached file as and when it was
> >> received by the original request.
> >>
> >> The file in the cache was to be marked as "still busy downloading",
> >> which meant threads/processes serving from the cached file would know
> >> to keep trying to serve the cached file until the "still busy
> >> downloading" status was cleared by the initial request. Timeouts
> >> would sanity check the process.
> >>
> >> This prevents the "load spike" that occurs just after a file is
> >> downloaded anew, but before that download is done.
> >>
> >> Whether this was implemented fully I am not sure - anyone?
> >
> >
> > It was never implemented.


Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Igor Sysoev
On Thu, 21 Oct 2004, Roman Gavrilov wrote:

> after checking the mod_accel I found out that it works only with http,
> we need the cache & proxy  to work both with http and https.
> What was the reason for disabling https proxying & caching ?

How do you think HTTPS reverse proxying should work?


Igor Sysoev
http://sysoev.ru/en/


Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Igor Sysoev
On Thu, 21 Oct 2004, Roman Gavrilov wrote:

> I don't see any problem using it, actually I am doing it. I am not
> talking about proxying between http and https.
> Mostly its used for mirroring (both frontend and backend use https only)
> no redirections on backend though :)
>
>
> ProxyPass /foo/bar https:/mydomain/foobar/
> ProxyPassReverse https:/mydomain/foobar/ /foo/bar
>
> I'll be more then glad to discuss it with you.

So the proxy should decrypt the stream, find the URI, then encrypt the
request again and pass it encrypted to the backend?


Igor Sysoev
http://sysoev.ru/en/


Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Igor Sysoev
On Thu, 21 Oct 2004, Roman Gavrilov wrote:

> No,  when https request gets to the server(apache), its being decrypted
> first then passed through apache routines, when it gets
> to the proxy part the URI already decrypted. proxy in its turn issues a
> request to the backend https server and returns the answer to the client
> of course after caching it.

Well, that's the same as I described.
No, mod_accel cannot connect to a backend using HTTPS.

> Roman
>
> Igor Sysoev wrote:
>
> >On Thu, 21 Oct 2004, Roman Gavrilov wrote:
> >
> >
> >
> >>I don't see any problem using it, actually I am doing it. I am not
> >>talking about proxying between http and https.
> >>Mostly its used for mirroring (both frontend and backend use https only)
> >>no redirections on backend though :)
> >>
> >>
> >>ProxyPass /foo/bar https:/mydomain/foobar/
> >>ProxyPassReverse https:/mydomain/foobar/ /foo/bar
> >>
> >>I'll be more then glad to discuss it with you.
> >>
> >>
> >
> >So proxy should decrypt the stream, find URI, then encrypt it, and
> >pass it encrypted to backend ?


Igor Sysoev
http://sysoev.ru/en/


Re: deflate in mod_deflate

2002-12-25 Thread Igor Sysoev
On Tue, 24 Dec 2002, Justin Erenkrantz wrote:

> As of right now, we have no plans to add 'deflate' support.  I'm
> not aware of any browsers/client that support 'deflate' rather
> than 'gzip.'
> 
> My guess is that the encapsulation format is slightly different
> than with gzip, but I really haven't sat down to look at the
> relevant RFCs.  It might be easy, but it might be hard.

MSIE, Mozilla and Opera do not understand the RFC 1950 'deflate' format,
i.e. they want a raw RFC 1951 deflated stream without the zlib header
and the Adler-32 checksum trailer.


Igor Sysoev
http://sysoev.ru/en/




Re: mod_deflate -- File size lower bound needed?

2003-04-02 Thread Igor Sysoev
On Mon, 31 Mar 2003 [EMAIL PROTECTED] wrote:

> > we should also put in a directive to only compress when system load is 
> > below a certain level. (but we would need a apr_get_system_load() 
> > function first .. any volunteers? )
> 
> If you go down this route watch out for what's called 'back-flash'.
> 
> You can easily get into a catch-22 at the 'threshhold' rate
> where you are ping-ponging over/under the threshhold because
> currently executing ZLIB compressions will always be included in
> the 'system load' stat you are computing.
> 
> In other words... if you don't want to compress because you
> think the machine is too busy then it might only be too
> busy because it's already compressing. The minute you
> turn off compression you drop under-threshhold and now
> you are 'thrashing' and 'ping-ponging' over/under the
> threshhold.
> 
> You might want to always compare system load against
> transaction/compression task load to see if something other 
> than normal compression activity is eating the CPU.
> 
> Low transaction count + high CPU load = Something other than
> compression is eating the CPU and stopping compressions
> won't really make much difference.
> 
> High transaction count + high CPU load + high number
> of compressions in progress = Might be best to back
> off on the compressions for a moment.

In my mod_deflate for Apache 1.3.x I take idle time into account
on FreeBSD 3.x+. To reduce ping-ponging I do not disable compression
entirely when the idle time drops below the specified level; instead I
limit the number of processes that may compress output concurrently.
This should gracefully throttle compression when idle time is low.

But I found this limitation to be almost useless on modern CPUs with small
enough responses (roughly < 100K). A much bigger limiting factor is the memory
overhead that zlib requires for compression - 256K-384K.
So I have a directive that disables compression, to avoid intensive swapping,
when the number of Apache processes grows beyond a specified limit.


Igor Sysoev
http://sysoev.ru/en/



Re: 1.3 Wishlist: (Was: Re: consider reopening 1.3)

2003-11-17 Thread Igor Sysoev
On Sun, 16 Nov 2003, Rasmus Lerdorf wrote:

> On Sun, 16 Nov 2003, Jim Jagielski wrote:
> > So a useful topic is: What is *missing* in 1.3 that needs to be 
> > addressed.
> > What are the features/additions that the disenfranchised 1.3 developers
> > want to add to 1.3...
> 
> How about support for chunked compressed responses right in 
> src/main/buff.c where it belongs (given we don't have filters in Apache1)

There is a mod_deflate module for Apache 1.3.x that patches buff.c and
compresses a response on the fly. It's available at http://sysoev.ru/en/


Igor Sysoev
http://sysoev.ru/en/



Re: mod_deflate Vary header

2005-11-07 Thread Igor Sysoev

On Fri, 4 Nov 2005 [EMAIL PROTECTED] wrote:


> This has been discussed many times before and no one
> seems to understand what the fundamental problem is.
>
> It is not with the servers at all, it is with the CLIENTS.
>
> What both of you are saying is true... whether you "Vary:"
> on "Content-encoding" and/or "User-agent" or not... there
> is a risk of getting the wrong content ( compressed versus
> uncompressed ) "stuck" in a downstream cache.
>
> It is less and less likely these days that the cache will receive
> a request from a client that CANNOT "decompress", but still
> possible. Handling requests from clients that cannot decompress
> have become ( at long last ) the "fringe" cases but are
> no less important than ever.


Actually, with the appearance of MSIE 5.5+ the chances that a client cannot
decompress a response served from a downstream cache have increased.
If MSIE 5.5+ is configured to work via a proxy with HTTP/1.0, then
MSIE never sends the "Accept-Encoding" header, and it refuses
the compressed content.

This is why my mod_deflate for Apache 1.3 by default does not compress
responses for proxied requests.


> Microsoft Internet Explorer ( all versions ) will REFUSE to cache
> anything locally if it shows up with any "Vary:" headers on it.
> Period. End of sentence.
>
> So you might think you are doing your downstream clients a
> favor by tacking on a "Vary:" header to the compressed data
> so it gets 'stored' somewhere close to them... but you would
> be wrong.
>
> If you don't put a "Vary:" header on it... MSIE will, in fact,
> cache the compressed response locally and life is good.
> The user won't even come back for it until it's expired on
> their own hard drive or they clear their browser cache.
>
> However... if you simply add a "Vary:" header to the same
> compressed response... MSIE now refuses to cache that
> response at all and now you create a "thundering herd"
> scenario whereby the page is never local to the user for
> any length of time and each "forward" or "back" button
> hit causes the client to go upstream for the page each
> and every time. Even if there is a cache nearby you would
> discover that the clients are nailing it each and every time
> for the same page just because it has a "Vary:" header on it.
>
> I believe Netscape has the same problem(s).
> I don't use Netscape anymore.
> Anyone know for sure if Netscape actually stores "variants"
> correctly in local browser cache?


Actually, MSIE 5.5+ will cache a response with any "Vary" header
if it also has a "Content-Encoding: gzip" or "Content-Encoding: deflate"
header. But it refuses to cache it if the response has no
"Content-Encoding: gzip" header.

My mod_deflate can optionally add the "Vary" header to compressed
responses only.


Igor Sysoev
http://sysoev.ru/en/


Re: mod_deflate Vary header

2005-11-08 Thread Igor Sysoev

On Tue, 8 Nov 2005 [EMAIL PROTECTED] wrote:


> Igor Sysoev wrote:
>
> > Actually, with MSIE 5.5+ appearance the chances that client can not
> > decompress the response from downstream cache have increased.
> > If MSIE 5.5 is configured to work via proxy with HTTP/1.0, then
> > MSIE will never send "Accept-Encoding" header, and it would refuse
> > the compressed content.


> You are right on the first part. If you don't have the "Use HTTP/1.1
> for Proxies" checkbox on then no "Accept-Encoding:" header
> would be sent... ( principle of least astonishment applies here ).
>
> ...however... I think you will discover on closer inspection that
> the second part is not true. Even if MSIE 5.x doesn't send an
> "Accept-Encoding: gzip" header... it KNOWS it can decompress
> and will do so for any response that shows up with "Content-encoding: gzip"
> regardless of whether it sent an "Accept-Encoding: gzip" header.


I've just checked MSIE 6.0 configured to work via a proxy with HTTP/1.0
(the default settings). I preloaded the compressed pages into the proxy using
Firefox, then requested the same pages from MSIE. MSIE showed
1) the compressed pages for the "/page.html" requests, and
2) the "Save file" dialog box for the "/dir/" requests.


> > My mod_deflate allows optionally to add "Vary" header to compressed
> > response only.


> So you still run the risk of only getting one variant 'stuck' in a
> downstream cache. If the uncompressed version goes out at
> 'start of day' with no "Vary:" then the uncompressed version
> can get 'stuck' where you don't want it until it actually expires.


I prefer to use compression as safely as possible.
So the default settings of my mod_deflate and of my new lightweight server,
nginx, are:

1) compress "text/html" only;
2) compress HTTP/1.1 requests only;
3) do not compress responses to proxied requests at all.

And of course, mod_deflate and nginx have many directives to change
this behavior: for example, they can compress responses to proxied
requests if the responses carry a "Cache-Control" header with the "private",
"no-store" or "no-cache" value.


Igor Sysoev
http://sysoev.ru/en/


Re: Question about memory in httpd

2005-12-04 Thread Igor Sysoev

On Sat, 3 Dec 2005, Paul Querna wrote:


> Christophe Jaillet wrote:
>
> > When going thrue the code, looking at apr_palloc and friends, one can see
> > that :
> > * in some places (few of them) , the returned pointer is checked against
> > NULL
> > * in other places (most of them), it is not.
> >
> > I've always been told that checking return value is a good idea, (especially
> > with memory allocation in order to avoid disasters) so should all the
> > apr_palloc (and friends) calls be checked or are they special reasons in
> > httpd not to care about short in memory situation ?


> Actually, on most operating systems, including Linux and FreeBSD, you
> will NEVER get returned NULL.


Actually, Linux and FreeBSD will return NULL and set errno to ENOMEM
if you try to allocate, in total, more than the RLIMIT_DATA limit.


> Instead when your operating system is truly out of memory, it will kill
> your process, and you won't have any chance of handling it.
>
> Read a whole blog post about it:
> http://baus.net/memory-management



Igor Sysoev
http://sysoev.ru/en/