Does anyone have a good guesstimate (and perhaps some reference material) for where the sweet spot is in access efficiency: folder depth vs. number of files per folder?
I've got a file database where all files are stored by their md5, such as:

./a1/d5/83/d2/7e38/f96d/1641dead/f4d89ee2/a1d583d27e38f96d1641deadf4d89ee2.m4a

However, iterating over the files with this much folder depth is significantly slower than a depth of 3. I was thinking that perhaps I could use a scheme more like this:

./a1d/583/a1d583d27e38f96d1641deadf4d89ee2.m4a

in which case each folder would have up to 4096 (16^3) sub-nodes. Presently I can't foresee having more than 1 million files in the db. Or I could convert the md5sum to base36 and use a similar scheme with 1296 (36^2) sub-nodes per folder.

AJ ONeal
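P.S. For the second scheme, this is roughly what I'm picturing. It's just an untested Python sketch; the shard width, level count, and the store() helper are placeholders I made up for illustration, not code I'm actually running:

    import hashlib
    import os

    # Shard files into two directory levels using the leading chunks of the
    # md5 hex digest, e.g.
    #   a1d583d27e38f96d1641deadf4d89ee2.m4a -> ./a1d/583/a1d583...ee2.m4a
    # With 3 hex chars per level, each folder holds at most 16**3 = 4096 entries.

    SHARD_WIDTH = 3   # hex chars per directory level (assumption, tune as needed)
    SHARD_LEVELS = 2  # number of directory levels

    def shard_path(root, md5_hex, ext=".m4a"):
        """Build the sharded path for a file identified by its md5 hex digest."""
        parts = [md5_hex[i * SHARD_WIDTH:(i + 1) * SHARD_WIDTH]
                 for i in range(SHARD_LEVELS)]
        return os.path.join(root, *parts, md5_hex + ext)

    def store(root, data, ext=".m4a"):
        """Hash the data, create the shard directories, and write the file."""
        md5_hex = hashlib.md5(data).hexdigest()
        path = shard_path(root, md5_hex, ext)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)
        return path

    if __name__ == "__main__":
        print(shard_path(".", "a1d583d27e38f96d1641deadf4d89ee2"))
        # -> ./a1d/583/a1d583d27e38f96d1641deadf4d89ee2.m4a

The base36 variant would be the same idea, just re-encoding the digest and using 2 characters per level (36^2 = 1296 entries per folder).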
