Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Roman Gavrilov
So what would you suggest I do?
Implement it myself?

Bill Stoddard wrote:
> Graham Leggett wrote:
>> Roman Gavrilov wrote:
>>> In my opinion it would be more efficient to let one process complete
>>> the request (using maximum line throughput) and return some busy
>>> code to other identical, simultaneous requests until the file is
>>> cached locally.
>>> Has anyone run into a similar situation? What solution did you find?
>>
>> In the original design for mod_cache, the second and subsequent
>> connections to a file that was still in the process of being
>> downloaded into the cache would shadow the cached file - in other
>> words it would serve content from the cached file as and when it was
>> received by the original request.
>>
>> The file in the cache was to be marked as "still busy downloading",
>> which meant threads/processes serving from the cached file would know
>> to keep trying to serve the cached file until the "still busy
>> downloading" status was cleared by the initial request. Timeouts
>> would sanity check the process.
>>
>> This prevents the "load spike" that occurs just after a file is
>> downloaded anew, but before that download is done.
>>
>> Whether this was implemented fully I am not sure - anyone?
>
> It was never implemented.
> Bill

--
-
I am root. If you see me laughing... You better have a backup!




Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Igor Sysoev
On Thu, 21 Oct 2004, Roman Gavrilov wrote:

> So what would you suggest I do?
> Implement it myself?

No, just look at http://sysoev.ru/mod_accel/
It's an Apache 1.3 module that does what you need.

Igor Sysoev
http://sysoev.ru/en/




Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Roman Gavrilov





After checking mod_accel I found out that it works only with http;
we need the caching proxy to work with both http and https.
What was the reason for disabling https proxying and caching?

Regards,
Roman


-- 
-
I am root. If you see me laughing... You better have a backup!







Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Graham Leggett
Roman Gavrilov wrote:
> So what would you suggest I do?
> Implement it myself?

At the moment that's probably your best option.

Is this for Apache v1.3 or v2.0?

Regards,
Graham


Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Igor Sysoev
On Thu, 21 Oct 2004, Roman Gavrilov wrote:

> After checking mod_accel I found out that it works only with http;
> we need the caching proxy to work with both http and https.
> What was the reason for disabling https proxying and caching?

How do you plan to do https reverse proxying?


Igor Sysoev
http://sysoev.ru/en/


Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Roman Gavrilov




I don't see any problem with it; actually, I am already doing it. I am
not talking about proxying between http and https.
Mostly it's used for mirroring (both frontend and backend use https
only); no redirections on the backend though :)


ProxyPass /foo/bar https://mydomain/foobar/
ProxyPassReverse /foo/bar https://mydomain/foobar/

I'll be more than glad to discuss it with you.

Regards
Roman




-- 
-
I am root. If you see me laughing... You better have a backup!







Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Igor Sysoev
On Thu, 21 Oct 2004, Roman Gavrilov wrote:

> I don't see any problem with it; actually, I am already doing it. I am
> not talking about proxying between http and https.
> Mostly it's used for mirroring (both frontend and backend use https
> only); no redirections on the backend though :)
>
> ProxyPass /foo/bar https://mydomain/foobar/
> ProxyPassReverse /foo/bar https://mydomain/foobar/
>
> I'll be more than glad to discuss it with you.

So the proxy should decrypt the stream, find the URI, then re-encrypt
the request and pass it encrypted to the backend?


Igor Sysoev
http://sysoev.ru/en/


Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Roman Gavrilov




No. When an https request reaches the server (Apache), it is decrypted
first and then passed through the Apache routines, so by the time it
gets to the proxy part the URI is already decrypted. The proxy in turn
issues a request to the backend https server and returns the answer to
the client, after caching it of course.

Roman



-- 
-
I am root. If you see me laughing... You better have a backup!







Re: mod_proxy reverse proxy optimization/performance question

2004-10-21 Thread Igor Sysoev
On Thu, 21 Oct 2004, Roman Gavrilov wrote:

> No. When an https request reaches the server (Apache), it is decrypted
> first and then passed through the Apache routines, so by the time it
> gets to the proxy part the URI is already decrypted. The proxy in turn
> issues a request to the backend https server and returns the answer to
> the client, after caching it of course.

Well, that's the same as what I described.
No, mod_accel cannot connect to a backend using https.


Igor Sysoev
http://sysoev.ru/en/


mod_proxy reverse proxy optimization/performance question

2004-10-20 Thread Roman Gavrilov
I am using a reverse proxy to cache a remote site. The files are mostly 
RPMs of varying sizes: 3-30 MB or more.
If several requests arrive for the same file while it is not yet 
cached locally, every one of them downloads the file from the remote 
site. Each download slows down, because the throughput of the line is 
split among all the processes.
So if many processes are downloading the same RPM from a remote 
site, each request can take a long time to complete.
This can bring Apache to a state where it cannot serve other requests, 
because all available processes are already busy.

In my opinion it would be more efficient to let one process complete the 
request (using maximum line throughput) and return some busy code to 
other identical, simultaneous requests until the file is cached locally.
Has anyone run into a similar situation? What solution did you find?

I have created a solution, as I did not find an existing one. I would 
like to discuss it here and get your opinions.
1. When the proxy accepts a request for a file that is not yet in the 
local cache, it creates a temporary lock file (named after the proxy's 
pathname for the file, with directory slashes changed to underscores).
2. Other processes requesting the same file first check for the 
lock file. If it is found, they return a busy code (e.g. 408 Request 
Timeout), and the client should repeat the request until it succeeds.
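
The two steps above can be sketched outside Apache as follows. This is only an illustration in Python, not the actual module code: `lock_path_for`, `try_acquire`, `handle_request` and the `fetch` callable are hypothetical names, and the cache layout (one flat directory) is an assumption. The one detail worth noting is that the lock file is created with `O_CREAT | O_EXCL`, so "check for the lock" and "create the lock" happen as a single atomic step and two processes cannot both win.

```python
import os

def lock_path_for(cache_root, proxy_path):
    """Map a proxied pathname to a lock-file name (slashes -> underscores)."""
    return os.path.join(cache_root, proxy_path.strip("/").replace("/", "_") + ".lock")

def try_acquire(lock_path):
    """Atomically create the lock file; False means another process holds it."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        return False
    os.close(fd)
    return True

def handle_request(cache_root, proxy_path, fetch):
    """Return (status, body). `fetch` is a hypothetical download callable."""
    cached = os.path.join(cache_root, proxy_path.strip("/").replace("/", "_"))
    if os.path.exists(cached):
        with open(cached, "rb") as f:
            return 200, f.read()          # already cached locally
    lock = lock_path_for(cache_root, proxy_path)
    if not try_acquire(lock):
        return 408, b""                   # busy: the client should retry
    try:
        body = fetch(proxy_path)          # only one process downloads
        tmp = cached + ".tmp"
        with open(tmp, "wb") as f:
            f.write(body)
        os.replace(tmp, cached)           # publish the complete file atomically
        return 200, body
    finally:
        os.remove(lock)                   # always release the lock
```

Two caveats with this sketch: a process that dies while holding the lock leaves a stale lock file behind, so some timeout on the lock's age would be needed, and the slash-to-underscore mapping can make two different paths share one name.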

Please let me know what you think of this approach, especially if you 
have done or seen something similar.
Apache version 1.3.x

Thank you
Roman
--
-
I am root. If you see me laughing... You better have a backup!




Re: mod_proxy reverse proxy optimization/performance question

2004-10-20 Thread Graham Leggett
Roman Gavrilov wrote:
> In my opinion it would be more efficient to let one process complete the 
> request (using maximum line throughput) and return some busy code to 
> other identical, simultaneous requests until the file is cached locally.
> Has anyone run into a similar situation? What solution did you find?

In the original design for mod_cache, the second and subsequent 
connections to a file that was still in the process of being downloaded 
into the cache would shadow the cached file - in other words it would 
serve content from the cached file as and when it was received by the 
original request.

The file in the cache was to be marked as "still busy downloading", 
which meant threads/processes serving from the cached file would know to 
keep trying to serve the cached file until the "still busy downloading" 
status was cleared by the initial request. Timeouts would sanity check 
the process.

This prevents the "load spike" that occurs just after a file is 
downloaded anew, but before that download is done.
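
To make the shadowing idea concrete, here is a minimal Python sketch of what a second request might do. It is not mod_cache code (the thread notes the design was not implemented); the function name `shadow_read`, the use of a separate marker file for the "still busy downloading" status, and the polling interval are all assumptions made for illustration.

```python
import os
import time

def shadow_read(cache_file, busy_marker, chunk_size=65536, poll=0.05, timeout=30.0):
    """Yield chunks of cache_file as they arrive, until busy_marker is cleared.

    Raises TimeoutError if no new data appears for `timeout` seconds
    while the marker is still present (the sanity-check timeout).
    """
    last_progress = time.monotonic()
    with open(cache_file, "rb") as f:
        while True:
            # Check the marker before reading, so data written just before
            # the marker was cleared is still picked up by this read.
            still_busy = os.path.exists(busy_marker)
            chunk = f.read(chunk_size)
            if chunk:
                last_progress = time.monotonic()
                yield chunk
                continue
            if not still_busy:
                return  # download finished and everything has been served
            if time.monotonic() - last_progress > timeout:
                raise TimeoutError("original download stalled")
            time.sleep(poll)  # wait for the original request to write more
```

The sketch assumes the original request appends to the cache file, flushes its writes, and removes the marker only after the last byte has been written.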

Whether this was implemented fully I am not sure - anyone?
Regards,
Graham


Re: mod_proxy reverse proxy optimization/performance question

2004-10-20 Thread Bill Stoddard
Graham Leggett wrote:
> Roman Gavrilov wrote:
>> In my opinion it would be more efficient to let one process complete 
>> the request (using maximum line throughput) and return some busy code 
>> to other identical, simultaneous requests until the file is cached 
>> locally.
>> Has anyone run into a similar situation? What solution did you find?
>
> In the original design for mod_cache, the second and subsequent 
> connections to a file that was still in the process of being downloaded 
> into the cache would shadow the cached file - in other words it would 
> serve content from the cached file as and when it was received by the 
> original request.
>
> The file in the cache was to be marked as "still busy downloading", 
> which meant threads/processes serving from the cached file would know to 
> keep trying to serve the cached file until the "still busy downloading" 
> status was cleared by the initial request. Timeouts would sanity check 
> the process.
>
> This prevents the "load spike" that occurs just after a file is 
> downloaded anew, but before that download is done.
>
> Whether this was implemented fully I am not sure - anyone?

It was never implemented.
Bill