On Thu, Dec 20, 2007 at 05:26:53PM -0800, Ask Bjørn Hansen wrote:
> On Dec 20, 2007, at 16:08, Mark Smith wrote:
>> Anyway, I can't see this being a particularly perfect fit for MogileFS
>> though. I can't imagine it not working, but really small files like
>> that aren't really the use case it's been built for.
> I think it should work okay - it really depends on the use (access
> patterns, load, ...)
>
> There'll be a good deal of wasted space and resources (block sizes, space
> used in the databases etc).
>
> One idea would be to split the cluster in 2 or 3 chunks to more easily get
> the tracker and database load scaled horizontally.
>
> Another is to make the storage nodes "smarter" about how they store their
> files (use something else than the file system).

In our production MogileFS, we deal with small files by using reiserfs3
with tail packing, as it avoids the block-size issue. Make sure you
really, really trust your hardware before going this route: it's not
pretty if you have bad RAM, PSU, mobo, or unreliable power.

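To put rough numbers on the block-size issue, here is a minimal sketch in
Python. It assumes a filesystem that rounds every file up to whole blocks
and ignores inode and directory overhead; the function names and the
4096-byte default are illustrative assumptions, not anything taken from
MogileFS or reiserfs.

```python
def allocated_bytes(file_size: int, block_size: int = 4096) -> int:
    """Bytes consumed when a file is rounded up to whole blocks."""
    # Integer ceiling division: number of blocks needed for the file.
    blocks = -(-file_size // block_size)
    return blocks * block_size


def slack_fraction(file_sizes, block_size: int = 4096) -> float:
    """Fraction of allocated space that holds no file data."""
    used = sum(file_sizes)
    allocated = sum(allocated_bytes(size, block_size) for size in file_sizes)
    return (allocated - used) / allocated if allocated else 0.0


if __name__ == "__main__":
    # Hypothetical workload: four files of 1 KiB each.
    print(slack_fraction([1024] * 4))
```

Under these assumptions, the smaller the files are relative to the block
size, the larger the wasted share. Tail packing avoids this by storing
small files and file tails together instead of giving each its own block.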
I need to get my fsck-with-checksums and mog-tar-backup tools cleaned up
and worked into the mainline at some point soon.

Doing the above reduces the core problem to that of having billions of
rows in your database - further research there might put you further
ahead.

As for your original question about the number of trackers, nodes, etc.:
your access patterns will dictate that far more than the number of files
alone.

-- 
Robin Hugh Johnson
Gentoo Linux Developer & Infra Guy
E-Mail     : [EMAIL PROTECTED]
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
