You may find that the method Squid uses to manage its cache is worth emulating.

Using TransmitFile on Windows or sendfile on Unix to dispatch the file to the network is by far the most efficient way to serve files from a cache. The kernel moves the data from the file to the socket directly, which avoids several levels of buffer copying.
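
As a rough illustration of the sendfile approach, here is a minimal Python sketch. The function name send_cached_file is made up for this example, and it assumes conn is an already connected, blocking stream socket. Python's socket.sendfile() uses os.sendfile() where the platform provides it and falls back to ordinary send() calls otherwise, so the zero-copy path is not guaranteed on every system.

```python
import socket


def send_cached_file(conn: socket.socket, path: str) -> int:
    """Send the file at `path` over a connected stream socket.

    Returns the total number of bytes sent. `send_cached_file` is a
    hypothetical helper name, not part of any library.
    """
    with open(path, "rb") as f:
        # socket.sendfile() delegates to os.sendfile() when available,
        # letting the kernel copy file data straight to the socket.
        # On platforms without it, Python falls back to plain send().
        return conn.sendfile(f)
```

A server would call this once per cache hit, after looking up the pathname in the database.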

Andreas Volz wrote:
On Tue, 13 Nov 2007 07:18:19 -0600, John Stanton wrote:

In a cache situation I would expect that keeping the binary data in files would be preferable, because you can use far more efficient mechanisms for loading them into your cache and, in particular, for transmitting them downstream. Your DB only needs to store a pathname.

Just be wary of directory size, and do not put them all in one directory.

I noticed that problem in my current situation. I don't know the limits
on the number and size of files per directory in Linux or Windows, but
I'm sure there are limits.

My main problem is to find a good algorithm to name the cached files
and split them into directories. My current idea is:

1) Put the URL into the DB
2) Use a hash function to create a unique name for the cache file
3) Insert the hash name into the same row as the URL

To deal with the problem of many files in a directory:

4) Use e.g. 'modulo 10' on the URL hash value to get one of ten
directory names in which to find a file.

But this has the drawback of a fixed number of cache directories.
The algorithm doesn't scale as the number of files grows.
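
One common variation on steps 2 to 4, sketched here in Python as an assumption rather than as anyone's actual implementation: instead of a modulo, use the leading hex digits of the hash as nested directory names. The helper name cache_path, the choice of SHA-256, and the two-level default are all illustrative. Two levels of two hex digits give 256 * 256 directories, and adding a level increases the fan-out without changing the file names. Note that a hash is not strictly unique, although SHA-256 collisions are negligible in practice.

```python
import hashlib
from pathlib import Path


def cache_path(root: str, url: str, levels: int = 2) -> Path:
    """Map a URL to a cache file path under `root`.

    Hypothetical helper: the file name is the SHA-256 hex digest of the
    URL, and each directory level is named after two hex digits taken
    from the start of that digest.
    """
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    # First level uses digits 0-1, second level digits 2-3, and so on.
    parts = [digest[2 * i:2 * i + 2] for i in range(levels)]
    return Path(root, *parts, digest)
```

The resulting path (or just the digest) can be stored in the same row as the URL, as in step 3.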

Do you think this is a good approach? Or do you have another idea?

regards
Andreas

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------


