I think your blob file performance may depend heavily on the file system
being used and on the workload.

I found this article:

http://oss.sgi.com/projects/xfs/papers/filesystem-perf-tm.pdf


Andreas Volz <[EMAIL PROTECTED]> wrote: On Tue, 13 Nov 2007 07:18:19 -0600,
John Stanton wrote:

> In a cache situation I would expect that keeping the binary data in 
> files would be preferable, because you can use far more efficient 
> mechanisms for loading them into your cache and, in particular, for 
> transmitting them downstream.  Your DB only needs to store a pathname.
> 
> Just be wary of directory size, and do not put them all in the one 
> directory.

I noticed that problem in my current situation. I don't know the exact
limits on the number and size of files per directory on Linux or
Windows, but I'm sure there are limits.

My main problem is finding a good algorithm to name the cached files
and split them across directories. My current idea is:

1) Put the URL into DB
2) Use a hash function to create a unique name for the cache file
3) Insert the hash name into the same row as the URL

And to avoid the problem of too many files in one directory:

4) Take the URL hash value modulo 10 to pick one of ten directory
names under which to store the file.
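
Put together, steps 1-4 might look roughly like the Python sketch
below. The hash function (SHA-1), the table layout, and the names
CACHE_ROOT, cache_path and register are illustrative assumptions; the
steps above don't fix any of them.

import hashlib
import os
import sqlite3

CACHE_ROOT = "cache"   # assumed root of the cache tree
N_DIRS = 10            # step 4: fixed number of shard directories

def cache_path(url):
    """Map a URL to a unique file name in one of N_DIRS directories."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()  # step 2
    shard = int(digest, 16) % N_DIRS                        # step 4
    return os.path.join(CACHE_ROOT, str(shard), digest)

con = sqlite3.connect("cache.db")
con.execute("CREATE TABLE IF NOT EXISTS cache (url TEXT PRIMARY KEY, path TEXT)")

def register(url):
    """Steps 1 and 3: store the URL and its cache path in the same row."""
    path = cache_path(url)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    con.execute("INSERT OR REPLACE INTO cache (url, path) VALUES (?, ?)",
                (url, path))
    con.commit()
    return path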

But this has the drawback of a fixed number of cache directories; the
scheme doesn't scale as the number of cached files grows.
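
For comparison, one common way around the hand-picked modulus is to
shard on a prefix of the hash itself, as git's object store does. This
is only a sketch under the same SHA-1 assumption, not part of the
scheme above:

import hashlib
import os

def prefix_path(url, cache_root="cache"):
    """Shard by the first two hex digits of the digest (256 buckets)."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    # Deeper nesting, e.g. digest[:2]/digest[2:4]/..., widens the
    # tree further if 256 directories are not enough.
    return os.path.join(cache_root, digest[:2], digest[2:])

The fan-out is still fixed (256 top-level directories), but the buckets
fill evenly and the tree can be deepened with further prefix levels.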

Do you think this is a good way? Or do you have another idea?

regards
Andreas
