Andrew Cantino wrote:
I'm interested in using mogilefs to store very large numbers of files
(tens to hundreds of millions). Will this be a problem?

Depends on how often you add, delete, re-replicate, read, etc. mogilefs' choke point is currently the database. That said, a database on a nice dual-CPU quad-core machine with 32G+ of RAM will likely hold hundreds of millions of files okay.
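
For a rough feel of whether the database fits in that much RAM, here is a back-of-envelope sketch. It assumes the database keeps roughly one row per file plus one row per stored copy. The function name, the per-row byte counts, and the copy count are hypothetical placeholders, not measured mogilefs figures, so substitute numbers taken from your own database.

```python
GIB = 1024 ** 3


def estimate_db_bytes(num_files, copies_per_file, file_row_bytes, copy_row_bytes):
    """Rough table size: one row per file plus one row per stored copy.

    All per-row sizes are caller-supplied guesses (data plus index overhead);
    nothing here is read from a real mogilefs install.
    """
    return num_files * (file_row_bytes + copies_per_file * copy_row_bytes)


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    size = estimate_db_bytes(
        num_files=100_000_000,
        copies_per_file=2,
        file_row_bytes=150,
        copy_row_bytes=50,
    )
    ram = 32 * GIB
    print(f"estimated tables: {size / GIB:.1f} GiB, fits in 32 GiB RAM: {size < ram}")
```

If the estimate lands well above available RAM, that is a hint the database will be hitting disk for lookups, which is where the choke point starts to matter.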

It also wouldn't be too huge of an investment to add database partitioning support.

I guess it's worth noting that the more actively your dataset is changing, the more load the system presents to itself in general. If you load up a hundred million files and then mostly read them, most of mogilefs will be pretty bored. Doing more than that adds some extra DB load.

Could anyone
point me to benchmark documentation about how mogilefs scales as the
number of stored files grows?

Wish there was some :) There's the one database; the rest depends on how many spindles you add to the cluster. Disks slow? Add more disks, rebalance, or drain overloaded devices, and move on. Trackers overloaded? Add more trackers. Database slow? Possibly a bigger issue.
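
The triage above can be written down as a small lookup, which might be handy in a runbook. This is only a sketch: the `REMEDIES` table and `remedies_for` helper are hypothetical names, not part of mogilefs, and the remedies are just the ones listed in this message.

```python
# Hypothetical runbook table: bottleneck -> remedies mentioned in this thread.
REMEDIES = {
    "disks": ["add more disks", "rebalance", "drain overloaded devices"],
    "trackers": ["add more trackers"],
    "database": ["possibly a bigger issue (bigger hardware, or partitioning support)"],
}


def remedies_for(bottleneck):
    """Return the suggested remedies for a named bottleneck, or an empty list."""
    return REMEDIES.get(bottleneck.lower(), [])


if __name__ == "__main__":
    for name in ("disks", "trackers", "database"):
        print(f"{name} slow? -> " + "; ".join(remedies_for(name)))
```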

-Dormando
