Nathan Schmidt wrote:
We've got about 50M dkeys tracked on a generic Core 2 Duo
master-master setup, and that is starting to get a little
claustrophobic -- the indices for file and file_on are well in excess
of the RAM of the boxes. MySQL does a pretty good job, all things
considered, but we are definitely working on a new arrangement that
will use multiple independent MogileFS systems to spread the file
count around. The main limiting factors with this kind of dataset are
the list_keys and rebalance operations. We cache at the application
level, so path lookups are rare. We're certainly at one edge of the
continuum, with many millions of small files rather than hundreds of
thousands of large files, and I get the feeling we're operating
outside of the MogileFS sweet spot, but it's still quite usable.
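
(For reference, a quick way to see where those indices stand relative
to RAM -- this assumes InnoDB and the stock table names, with the
schema called 'mogilefs', so adjust to taste:)

    -- index and data footprint of the two big tables, in MB
    SELECT table_name,
           ROUND(data_length  / 1024 / 1024) AS data_mb,
           ROUND(index_length / 1024 / 1024) AS index_mb
      FROM information_schema.TABLES
     WHERE table_schema = 'mogilefs'
       AND table_name IN ('file', 'file_on');

    -- and the cache you have to play with (key_buffer_size for MyISAM)
    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';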
-n
I've had many more small files than that and the DB still held up. I
had to add more RAM at some point and run OPTIMIZE TABLE every few
months, but it was okay.
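
(Nothing fancy -- roughly the following, run off-peak since it can
lock the tables while they rebuild:)

    -- rebuilds tables and indexes, reclaiming space after heavy churn
    OPTIMIZE TABLE file, file_on;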
However, we didn't use list_keys or rebalance... just drain/dead
operations.
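
(The drain/dead dance, from memory -- host and devid below are made
up, so double-check the syntax against your mogadm version:)

    # take the device out of rotation for new writes
    mogadm device mark sto1 7 drain

    # once it's empty (or the disk is truly gone), mark it dead so the
    # replicator re-creates the missing copies elsewhere
    mogadm device mark sto1 7 dead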
operations. Any chance you could describe the pain there in a little
more detail?
-Dormando