if you store things on the filesystem, you'll need to use a hashing
algorithm to bucket them effectively -- filesystems don't like too many
files in a single directory.

you don't strictly need to rename the file to its hash -- you could
just store the hash in the db -- but i'd advise renaming it, because
then a regex rule or something similarly simple is enough to serve it.

ie: a file named aabbccddeeff.jpg would end up as
aa/bb/cc/dd/ee/ff/aabbccddeeff.jpg ; a rewrite rule could easily map
requests for "http://img.site.com/aabbccddeeff.jpg" to that folder, so
you wouldn't have to include the actual filepath in the url.
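
a minimal python sketch of that mapping, in case it helps. the function names (hashed_name, bucket_path, resolve) and the REQUEST_RE pattern are made up for illustration, and i'm assuming lowercase hex names, 2 chars per level and 6 levels as in the example above. a real rewrite rule would live in your web server config; the regex here only shows the shape of it.

```python
import hashlib
import posixpath
import re
from typing import Optional


def hashed_name(data: bytes, ext: str) -> str:
    """Name a file after the md5 hex digest of its contents (hypothetical helper)."""
    return hashlib.md5(data).hexdigest() + ext


def bucket_path(filename: str, width: int = 2, depth: int = 6) -> str:
    """Map 'aabbccddeeff.jpg' to 'aa/bb/cc/dd/ee/ff/aabbccddeeff.jpg'."""
    stem = filename.split(".", 1)[0]
    if len(stem) < width * depth:
        raise ValueError("name too short to bucket: %r" % filename)
    # take `depth` slices of `width` chars from the front of the name
    parts = [stem[i * width:(i + 1) * width] for i in range(depth)]
    return posixpath.join(*parts, filename)


# assumption: requests look like /<12 or more lowercase hex chars>.<ext>
REQUEST_RE = re.compile(r"^/((?:[0-9a-f]{2}){6}[0-9a-f]*\.\w+)$")


def resolve(url_path: str) -> Optional[str]:
    """Turn a request path into the bucketed path, or None if it doesn't match."""
    m = REQUEST_RE.match(url_path)
    return bucket_path(m.group(1)) if m else None
```

since the url only has to match the hex pattern, anything that isn't a hashed filename (path traversal attempts included) just fails to resolve.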

in my years i've learned a few things:
1- hashing to md5 base 16 (hex) is fine.  if you do 3 chars per
directory, you have 4096 buckets in each level; going 3 deep gives you
4096^3 (about 68.7 billion) leaf directories.  if you do base32, you
can do 2 chars per directory, and have 1024 in each level.  a couple
of filesystems start to degrade at around 1k or 10k entries per
directory, so the smaller fan-out could improve performance.
2- if you bucket items by numeric id, bucket backwards.  ie: 123456789
-> 9/8/7/6/5/4/3/2/1/123456789.  there's a math law about this (i
can't remember the name), but basically sequential ids round-robin
evenly across buckets if you bucket from the back, whereas you only
start a new bucket at every power of 10 if you bucket forwards.
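
to make both points concrete, here's a rough python sketch. the helper names (buckets_per_level, bucket_backwards, top_level_counts) are hypothetical, and zero-padding short ids so that every path has the same depth is my own assumption, not something the scheme requires.

```python
from collections import Counter


def buckets_per_level(alphabet_size: int, chars: int) -> int:
    """Number of possible directory names per level."""
    return alphabet_size ** chars


def bucket_backwards(item_id: int, depth: int = 9) -> str:
    """123456789 -> '9/8/7/6/5/4/3/2/1/123456789'."""
    # assumption: pad short ids with zeros so every path has `depth` levels
    digits = str(item_id).zfill(depth)
    reversed_digits = digits[::-1][:depth]
    return "/".join(reversed_digits) + "/" + str(item_id)


def top_level_counts(ids, backwards: bool) -> Counter:
    """Count how many ids land in each top-level bucket."""
    if backwards:
        return Counter(str(i)[-1] for i in ids)  # last digit
    return Counter(str(i)[0] for i in ids)       # first digit


hex_fanout = buckets_per_level(16, 3)     # 3 hex chars per directory
base32_fanout = buckets_per_level(32, 2)  # 2 base32 chars per directory

ids = range(1, 1501)
back = top_level_counts(ids, backwards=True)
fwd = top_level_counts(ids, backwards=False)
```

with ids 1 through 1500, bucketing on the last digit puts exactly 150 ids in each of the 10 buckets, while bucketing on the first digit puts 612 of them under "1" and only 111 under each other digit.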


On Feb 24, 12:59 pm, Matt Feifarek wrote:

> If you're worried about filename integrity (including full path) you can
> hash the files and store the fingerprints. I've done that before; works
> nice.
>
> While not "secure" (whatever) md5 is so fast that even a several meg file
> can be hashed in so little time you kinda don't notice it unless you're
> doing something really big.
>
> But if you know that nobody will ever mess with the actual filesystem
> storage, you can probably rely on the filenames.
