I am using a reverse proxy to cache a remote site. The files are mostly RPMs of varying sizes: 3-30M or more.
Now if a number of requests arrive for the same file that is not yet cached locally, every one of them downloads the file from the remote site. Each download is slowed, since the line's throughput is split among all the processes.
So when many processes are downloading the same RPM from the remote site, a single request can take a long time to complete.
This can bring Apache to a state where it cannot serve other requests, as all available processes are already busy.



In my opinion it would be more efficient to let one process complete the request (using the full line throughput) and return some busy code to the other identical, simultaneous requests until the file is cached locally.
Has anyone run into a similar situation? What solution did you find?


I have created a solution, as I did not find anything else already existing. I would like to discuss it here and get your opinions.
1. When a request for a file that is not yet in the local cache is accepted by the proxy, a temporary lock file is created (named after the proxy's pathname of the file, with directory slashes changed to underscores).
2. Other processes requesting the same file check for the lock file first. If it is found, they return a busy code (i.e. 408 Request Timeout), and the client is expected to retry the request until it succeeds.
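To make the two steps concrete, here is a minimal sketch of the lock-file protocol in Python. The directory and file names are made up for illustration, and I use a temporary directory so the sketch runs anywhere; the key point is that os.O_CREAT | os.O_EXCL makes the "create the lock file" step atomic, so exactly one process wins the race.

```python
import errno
import os
import tempfile

# Stand-in for a real lock directory (e.g. somewhere under the cache root).
LOCK_DIR = tempfile.mkdtemp()

def lock_name(path):
    """Map the proxy's pathname of the file to a flat lock-file name
    (directory slashes become underscores, as in step 1)."""
    return os.path.join(LOCK_DIR, path.strip("/").replace("/", "_") + ".lock")

def try_acquire(path):
    """Atomically create the lock file. Returns True if this process is
    the one that should fetch the file from the remote site; False if
    another process already holds the lock (step 2: answer 408)."""
    try:
        fd = os.open(lock_name(path), os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True
    except OSError as e:
        if e.errno == errno.EEXIST:
            return False  # someone else is already downloading this file
        raise

def release(path):
    """Remove the lock file once the download has completed (or failed)."""
    os.remove(lock_name(path))
```

A request handler would then serve the file if cached, fetch-then-release if try_acquire() returns True, and answer 408 otherwise. One thing worth adding in practice is a staleness check: if the downloading process dies, the lock file would block that URL forever, so old lock files should be removed after some timeout.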


Please let me know what you think of this approach, especially if you have done or seen something similar.

--
-------------------------------------------------------------
I am root. If you see me laughing... You better have a backup!





