fyi dear experts while I respect datamodel Trie before b-tree

    [l...@localhost ~]$ df -h
    Filesystem            Size  Used Avail Use% Mounted on
    none                  1.7G   26M  1.6G   2% /
    [l...@localhost ~]$ ssh rem...@remoteniklascomputer
    Warning: Permanently added 'remoteniklascomputer,remoteniklascomputer' (RSA) to the list of known hosts.
    [email protected]'s password:
    ]$ df -h
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/vzfs              10G  5.4G  4.7G  54% /
    [tech...@ip-68-178-227-19 ~]$
On Sun, Nov 15, 2009 at 6:19 PM, Matthew Dillon <[email protected]> wrote:
>
> :Matt used to use hardlinks for some sort of historical arrangement; after
> :a certain point, the total number of hardlinks was too much to handle. He
> :might have mentioned this somewhere in the archives. I don't know if this
> :would bite you the same way with gmirror.
> :
>
>     Here's a quick summary:
>
>     * First, a filesystem like UFS (and I think UFS2, but I'm not sure)
>       is limited to 65536 hardlinks per inode. This limit is quickly
>       reached when something like a CVS archive (which itself uses
>       hardlinks in the CVS/ subdirectories) is backed up using the
>       hardlink model. This results in a lot of data duplication and
>       wasted storage.
>
>     * Since directories cannot be hardlinked, directories are always
>       duplicated for each backup. For UFS this is a disaster because
>       fsck's memory use is partially based on the number of directories.
>
>     * UFS's fsck can't handle large numbers of inodes. Once you get past
>       a few tens of millions of inodes, fsck explodes, not to mention it
>       can take 9+ hours to run even when it does not explode. This
>       happened to me several times back in the days when I used UFS to
>       hold archival data and for backups. Everything worked dandy until
>       I actually had to fsck.
>
>       Even though things like background fsck exist, it has never been
>       stable enough to be practical in a production environment, and
>       even if it were, it eats disk bandwidth, potentially for days
>       after a crash. I don't know if that has changed recently or not.
>
>       The only workaround is to not store tens of millions of inodes on
>       a UFS filesystem.
>
>     * I believe FreeBSD was talking about adopting some of the LFS work,
>       or otherwise implementing log space for UFS. I don't know what the
>       state of this is, but I will say that it's tough to get something
>       like this to work right without a lot of actual plug-pulling tests.
>
>       Either OpenBSD or NetBSD, I believe, has a log-structured extension
>       to UFS which works. Not sure which, sorry.
>
>     With something like ZFS one would use ZFS's snapshots (though they
>     aren't as fine-grained as HAMMER snapshots). ZFS's snapshots work
>     fairly well but have higher maintenance overheads than HAMMER
>     snapshots when one is trying to delete a snapshot. HAMMER can delete
>     several snapshots in a single pass, so the aggregate maintenance
>     overhead is lower.
>
>     With Linux... well, I don't know which filesystem you'd use. ext4
>     maybe, if they've fixed the bugs. I've used reiser in the past (but
>     obviously that isn't desirable now).
>
>     --
>
>     For HAMMER, both Justin and I have been able to fill up
>     multi-terabyte filesystems running bulk pkgsrc builds with default
>     setups. It's fairly easy to fix by adjusting up the HAMMER config
>     (aka hammer viconfig <filesystem>) run times for pruning and
>     reblocking.
>
>     Bulk builds are a bit of a special case. Due to the way they work, a
>     bulk build rm -rf's /usr/pkg for EACH package it builds, then
>     reconstructs it by installing the necessary previously built
>     dependencies before building the next package. This eats disk space
>     like crazy on a normal HAMMER mount. It's more manageable with a
>     'nohistory' HAMMER mount, but my preference, in general, is to use a
>     normal mount.
>
>     HAMMER does not implement redundancy like ZFS, so if redundancy is
>     needed you'd need to use a RAID card. For backup systems I typically
>     don't bother with per-filesystem redundancy since I have several
>     copies on different machines already.
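Matt's "adjusting up the HAMMER config (aka hammer viconfig <filesystem>) run times for pruning and reblocking" refers to the small per-filesystem config that hammer viconfig drops you into. The sketch below is not from his mail; the directive names and the <period> <retention-or-run-time> format are recalled from the DragonFly hammer(8) man page, and the values are only illustrative, so verify against your own system:

    # per-filesystem HAMMER config, edited with `hammer viconfig <filesystem>`
    # (directives/format assumed from hammer(8); values illustrative)
    snapshots 1d 60d     # take one snapshot per day, keep 60 days of them
    prune     1d 5m      # prune fine-grained history daily, up to 5 minutes per run
    reblock   1d 5m      # reblock/defragment daily, up to 5 minutes per run
    recopy    30d 10m    # full recopy monthly, up to 10 minutes per run

For a bulk-build box that churns through space the way Matt describes, the per-run time budgets (the 5m/10m fields) are what you would raise, or you would simply run the prune and reblock passes by hand more often.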
>     Not only do the (HAMMER) production machines have somewhere around
>     60 days worth of snapshots on them, but my on-site backup box has
>     100 days of daily snapshots and my off-site backup box has almost
>     2 years of weekly snapshots.
>
>     So if the backups fit on one or two drives, additional redundancy
>     isn't really beneficial. More than that and you'd definitely want
>     RAID.
>
>                                       -Matt
>                                       Matthew Dillon
>                                       <[email protected]>
>
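For contrast with the snapshot retention above, the "hardlink model" from the first bullet of Matt's summary is usually built on something like rsync's --link-dest (or cp -al followed by rsync), where each day's backup tree hardlinks unchanged files against the previous day's tree. A rough sketch; the paths, host and dates are made up:

    # hypothetical layout: one tree per day under /backup/YYYY-MM-DD;
    # unchanged files are hardlinked against yesterday's tree, changed
    # files are copied fresh
    rsync -a --delete \
        --link-dest=/backup/2009-11-14 \
        remotehost:/home/ /backup/2009-11-15/

Every retained day adds one more link to each unchanged file (a file that already carries several links inside a single tree, like parts of a CVS archive, gets there even faster) and a fresh copy of every directory, which is exactly the link-count, inode-count and directory-count growth that breaks UFS in the ways described above.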

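On the snapshot-deletion comparison earlier in the quote: with ZFS each snapshot is destroyed by name, one dataset@snapshot at a time, whereas with the softlink-style HAMMER snapshots of this era (as I understand them) you remove however many snapshot softlinks you like and then reclaim all of them in a single prune pass. A hedged sketch; the pool, snapshot directory and snapshot names are made up:

    # ZFS: one destroy per snapshot
    zfs destroy tank/backup@2009-09-01
    zfs destroy tank/backup@2009-09-08

    # HAMMER: drop any number of snapshot softlinks, then a single prune
    # pass over the softlink directory (see hammer(8) for prune)
    rm /backup/snapshots/snap-20090901 /backup/snapshots/snap-20090908
    hammer prune /backup/snapshots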