Re: mod_proxy reverse proxy optimization/performance question
So what would you suggest I do? Implement it myself?

Bill Stoddard wrote:
> Graham Leggett wrote:
>> Roman Gavrilov wrote:
>>> In my opinion it would be more efficient to let one process complete
>>> the request (using maximum line throughput) and return some busy code
>>> to other identical, simultaneous requests until the file is cached
>>> locally. Has anyone run into a similar situation? What solution did
>>> you find?
>>
>> In the original design for mod_cache, the second and subsequent
>> connections to a file that was still in the process of being
>> downloaded into the cache would shadow the cached file - in other
>> words, they would serve content from the cached file as and when it
>> was received by the original request. The file in the cache was to be
>> marked as "still busy downloading", which meant threads/processes
>> serving from the cached file would know to keep trying to serve the
>> cached file until the "still busy downloading" status was cleared by
>> the initial request. Timeouts would sanity-check the process. This
>> prevents the load spike that occurs just after a download of a file
>> has started anew, but before that download is done. Whether this was
>> implemented fully I am not sure - anyone?
>
> It was never implemented.
>
> Bill

--
- I am root. If you see me laughing... You better have a backup!
Re: mod_proxy reverse proxy optimization/performance question
On Thu, 21 Oct 2004, Roman Gavrilov wrote:
> So what would you suggest I do? Implement it myself?

No, just look at http://sysoev.ru/mod_accel/
It's an Apache 1.3 module, as you need.

Igor Sysoev
http://sysoev.ru/en/
Re: mod_proxy reverse proxy optimization/performance question
After checking mod_accel I found out that it works only with http; we
need the caching proxy to work with both http and https. What was the
reason for not supporting caching when proxying over https?

Regards,
Roman

Igor Sysoev wrote:
> On Thu, 21 Oct 2004, Roman Gavrilov wrote:
>> So what would you suggest I do? Implement it myself?
>
> No, just look at http://sysoev.ru/mod_accel/
> It's an Apache 1.3 module, as you need.

--
- I am root. If you see me laughing... You better have a backup!
Re: mod_proxy reverse proxy optimization/performance question
Roman Gavrilov wrote:
> So what would you suggest I do? Implement it myself?

At the moment that's probably your best option. Is this for Apache
v1.3 or v2.0?

Regards,
Graham
Re: mod_proxy reverse proxy optimization/performance question
On Thu, 21 Oct 2004, Roman Gavrilov wrote:
> After checking mod_accel I found out that it works only with http; we
> need the caching proxy to work with both http and https. What was the
> reason for not supporting caching when proxying over https?

How do you intend to do https reverse proxying?

Igor Sysoev
http://sysoev.ru/en/
Re: mod_proxy reverse proxy optimization/performance question
I don't see any problem using it; actually, I am doing it. I am not
talking about proxying between http and https. Mostly it's used for
mirroring (both frontend and backend use https only), with no
redirections on the backend though :)

ProxyPass /foo/bar https://mydomain/foobar/
ProxyPassReverse /foo/bar https://mydomain/foobar/

I'll be more than glad to discuss it with you.

Regards,
Roman

Igor Sysoev wrote:
> On Thu, 21 Oct 2004, Roman Gavrilov wrote:
>> After checking mod_accel I found out that it works only with http; we
>> need the caching proxy to work with both http and https. What was the
>> reason for not supporting caching when proxying over https?
>
> How do you intend to do https reverse proxying?

--
- I am root. If you see me laughing... You better have a backup!
Re: mod_proxy reverse proxy optimization/performance question
On Thu, 21 Oct 2004, Roman Gavrilov wrote:
> I don't see any problem using it; actually, I am doing it. I am not
> talking about proxying between http and https. Mostly it's used for
> mirroring (both frontend and backend use https only), with no
> redirections on the backend though :)
>
> ProxyPass /foo/bar https://mydomain/foobar/
> ProxyPassReverse /foo/bar https://mydomain/foobar/
>
> I'll be more than glad to discuss it with you.

So the proxy should decrypt the stream, find the URI, then encrypt it
and pass it encrypted to the backend?

Igor Sysoev
http://sysoev.ru/en/
Re: mod_proxy reverse proxy optimization/performance question
No. When an https request gets to the server (Apache), it is decrypted
first and then passed through the Apache routines; by the time it
reaches the proxy part, the URI is already decrypted. The proxy in its
turn issues a request to the backend https server and returns the
answer to the client - after caching it, of course.

Roman

Igor Sysoev wrote:
> So the proxy should decrypt the stream, find the URI, then encrypt it
> and pass it encrypted to the backend?

--
- I am root. If you see me laughing... You better have a backup!
Re: mod_proxy reverse proxy optimization/performance question
On Thu, 21 Oct 2004, Roman Gavrilov wrote:
> No. When an https request gets to the server (Apache), it is decrypted
> first and then passed through the Apache routines; by the time it
> reaches the proxy part, the URI is already decrypted. The proxy in its
> turn issues a request to the backend https server and returns the
> answer to the client - after caching it, of course.

Well, it's the same as I described. No, mod_accel can not connect to a
backend using https.

Igor Sysoev
http://sysoev.ru/en/
mod_proxy reverse proxy optimization/performance question
I am using a reverse proxy to cache a remote site. The files are mostly
rpms of varying sizes: 3-30M or more. If there are a number of requests
for the same file which is not yet cached locally, every one of these
requests will download the requested file from the remote site. This
slows down each download, as the throughput of the line is split among
all the processes. So if there are lots of processes downloading the
same rpm from the remote site, a request can take a long time to
complete. This can bring Apache to a state where it cannot serve other
requests, as all available processes are already busy.

In my opinion it would be more efficient to let one process complete
the request (using maximum line throughput) and return some busy code
to other identical, simultaneous requests until the file is cached
locally. Has anyone run into a similar situation? What solution did
you find?

I have created a solution, as I did not find anything else already
existing. I would like to discuss it here and get your opinions.

1. When a request for a file that is not yet in the local cache is
   accepted by the proxy, a temporary lock file is created (named after
   the proxy's pathname of the file, with the directory slashes changed
   to underscores).

2. Other processes requesting the same file check for the lock file
   first. If it is found, they return a busy code (e.g. 408 Request
   Timeout), and the client is expected to resend the request until it
   succeeds.

Please let me know what you think of this approach, especially if you
have done or seen something similar.

Apache version 1.3.x

Thank you,
Roman

--
- I am root. If you see me laughing... You better have a backup!
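The lock-file scheme described in steps 1 and 2 can be sketched in a few
lines. This is a minimal illustration, not the actual module code; the
function names and the ".lock" suffix are my own invention, and a real
Apache 1.3 implementation would do the equivalent in C inside the proxy
handler. The key point is to create the lock file atomically, so that
exactly one process becomes the downloader and the others can answer
with a busy code:

```python
import os


def try_acquire_download_lock(cache_dir, proxy_path):
    # Hypothetical helper: derive the lock filename from the proxied
    # pathname by replacing directory slashes with underscores, as the
    # post describes.
    lock_name = proxy_path.strip("/").replace("/", "_") + ".lock"
    lock_path = os.path.join(cache_dir, lock_name)
    try:
        # O_CREAT | O_EXCL makes creation atomic: exactly one process
        # wins the race and becomes the downloader.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True          # we download the file
    except FileExistsError:
        return False         # someone else is downloading: return 408


def release_download_lock(cache_dir, proxy_path):
    # Called by the downloader once the file is fully cached.
    lock_name = proxy_path.strip("/").replace("/", "_") + ".lock"
    os.remove(os.path.join(cache_dir, lock_name))
```

One caveat with this approach: if the downloading process dies, the lock
file is left behind and all later requests get the busy code forever, so
a real implementation would also want a timeout or staleness check on
the lock file.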
Re: mod_proxy reverse proxy optimization/performance question
Roman Gavrilov wrote:
> In my opinion it would be more efficient to let one process complete
> the request (using maximum line throughput) and return some busy code
> to other identical, simultaneous requests until the file is cached
> locally. Has anyone run into a similar situation? What solution did
> you find?

In the original design for mod_cache, the second and subsequent
connections to a file that was still in the process of being downloaded
into the cache would shadow the cached file - in other words, they
would serve content from the cached file as and when it was received by
the original request. The file in the cache was to be marked as "still
busy downloading", which meant threads/processes serving from the
cached file would know to keep trying to serve the cached file until
the "still busy downloading" status was cleared by the initial request.
Timeouts would sanity-check the process. This prevents the load spike
that occurs just after a download of a file has started anew, but
before that download is done. Whether this was implemented fully I am
not sure - anyone?

Regards,
Graham
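The "shadowing" idea above - subsequent requests streaming from the
partially written cache file while a busy mark is set - can be sketched
roughly as follows. This is only an illustration of the design as
described, not mod_cache code; `is_busy` and `send` are assumed
callbacks standing in for the "still busy downloading" check and the
client write path:

```python
import time


def shadow_serve(cache_file, is_busy, send, poll_interval=0.05, timeout=30):
    # Stream bytes from a cache file that another process may still be
    # writing. is_busy() reports whether the "still busy downloading"
    # mark is set; send() delivers bytes to the client. Both names are
    # illustrative, not real mod_cache APIs.
    deadline = time.time() + timeout
    with open(cache_file, "rb") as f:
        while True:
            chunk = f.read(8192)
            if chunk:
                send(chunk)                 # serve what has arrived so far
            elif is_busy():
                # At EOF but the download is still in progress: wait for
                # the original request to append more data. The deadline
                # is the "timeouts would sanity check" part of the design.
                if time.time() > deadline:
                    raise TimeoutError("download never completed")
                time.sleep(poll_interval)
            else:
                return                      # download done, file fully served
```

Compared with the lock-file/408 approach, this serves every client on
the first request at the cost of the readers having to poll the growing
file and trust the busy mark to be cleared (or time out) reliably.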
Re: mod_proxy reverse proxy optimization/performance question
Graham Leggett wrote:
> In the original design for mod_cache, the second and subsequent
> connections to a file that was still in the process of being
> downloaded into the cache would shadow the cached file - in other
> words, they would serve content from the cached file as and when it
> was received by the original request. Whether this was implemented
> fully I am not sure - anyone?

It was never implemented.

Bill