Does anyone have a good guesstimate (and perhaps some reference material) as
to where the sweet spot is for access efficiency when trading folder depth
against the number of files per folder?


I've got a file database where all files are stored by their md5 sum, e.g.
./a1/d5/83/d2/7e38/f96d/1641dead/f4d89ee2/a1d583d27e38f96d1641deadf4d89ee2.m4a
However, iterating over the files with this much folder depth is
significantly slower than with a depth of 3.
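For reference, here's a minimal Python sketch of how the current layout appears to split the hex digest (the 2/2/2/2/4/4/8/8 segment widths are read straight off the example path above; the function name and everything else is just illustrative):

import hashlib
import os

def deep_path(digest, ext=".m4a"):
    # Current layout: eight directory levels of 2, 2, 2, 2, 4, 4, 8, 8 hex
    # chars, then the full digest as the filename.
    widths = (2, 2, 2, 2, 4, 4, 8, 8)
    parts, pos = [], 0
    for w in widths:
        parts.append(digest[pos:pos + w])
        pos += w
    return os.path.join(".", *parts, digest + ext)

print(deep_path(hashlib.md5(b"example").hexdigest()))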


I was thinking that perhaps I could use a scheme more like this:
./a1d/583/a1d583d27e38f96d1641deadf4d89ee2.m4a
in which case each folder would have up to 4096 (16^3) sub-nodes.
(Presently I can't foresee having more than 1 million files in the db.)
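The flatter layout is simple to sketch under the same assumptions (the helper name is mine):

def shallow_path(digest, ext=".m4a"):
    # Proposed layout: two directory levels of 3 hex chars each,
    # so each level fans out to at most 16**3 = 4096 entries.
    return os.path.join(".", digest[:3], digest[3:6], digest + ext)

At 1 million files, a single 4096-way level already averages only about 244 files per directory, and the two levels together give roughly 16.7 million possible leaf directories, so most leaves would hold zero or one file.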

Or I could convert the md5sum to base36 and use a similar scheme with 1296
(36^2) sub-nodes per folder.
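The base36 variant just needs a small conversion from the hex digest first; a sketch, assuming the filename itself is also stored in base36 (that detail is my assumption, not stated above):

import string

def to_base36(n):
    # Convert a non-negative integer to a lowercase base36 string.
    digits = string.digits + string.ascii_lowercase
    out = []
    while n:
        n, r = divmod(n, 36)
        out.append(digits[r])
    return "".join(reversed(out)) or "0"

def base36_path(digest, ext=".m4a"):
    # Two directory levels of 2 base36 chars each, 36**2 = 1296 entries per level.
    b36 = to_base36(int(digest, 16))
    return os.path.join(".", b36[:2], b36[2:4], b36 + ext)

A 128-bit md5 comes out to at most 25 base36 characters, so values with leading zeros could be padded to a fixed width if consistent name lengths matter.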

AJ ONeal

