On 17-4-2012 22:48, Lars Ellenberg wrote:
> On Fri, Apr 13, 2012 at 08:09:06AM +0200, Dirk Bonenkamp - ProActive wrote:
>> Hi All,
>>
>> I'm still having issues with backing up this file system. I've followed the advice and used LVM under my DRBD device (Disk -> LVM -> DRBD -> OCFS2). I can create a snapshot (I have to run fsck.ocfs2 on this snapshot after cloning the volume every time). I mount the snapshot read-only with local file locks. The performance of this snapshot volume is even worse than that of the original volume...
>>
>> Performance (on the original volume and the snapshot volume) seems to degrade when the number of files in a directory rises. When I say performance, I mean 'operations where every file needs to be checked', like rsync or 'find . -mtime -1 -print'. Performance for my application is great (writing a couple of thousand files a day and reading a couple of hundred thousand a day). dd tests give me 200 MB/s writes and 600 MB/s reads.
>>
>> Am I missing something here, or will this setup just never work for my backups...?
>
> stat() can be a costly syscall, even more so on cluster file systems. Hope you have already mounted -o noatime?

Yes, I have, from the start.

> Even readdir, or rather getdents, is typically more expensive on cluster file systems. Keeping the number of files per directory small-ish (whatever that may be for your context) may help; introducing hierarchical "hashing" subdirectories can help with doing so.

Keeping the number of files small-ish is not really an option here... There are few directories with tons of files, due to the nature of the application that runs on it.

> And I'm not even speaking of stat()ing cache-cold, while some other random IO plus streaming IO happens... (adding more RAM helps for this one, as does tuning vm.vfs_cache_pressure and maybe swappiness).
>
> None of this has anything to do with DRBD, or with streaming IO "performance". All of this should have been among the first hits when searching for "OCFS2 slow"...

Well, my application runs (very) fine on OCFS2. Backups do not (for my situation), as I found out. Searching for rsync + OCFS2 gives you a lot of hits on OCFS2 as a replacement for rsync, but surprisingly few about 'my' issue.

> Why do you think you need/want a cluster file system again?

To run 2 active nodes, which works fine. But since backups are essential, I went back to the drawing board. I now have a test setup with active/passive DRBD with ext4. The master node is the NFS server, and the slave mounts the NFS share with the ext4 filesystem. This works just fine; I did quite some testing over the past few days and haven't been able to 'wreck' it. And backups are a breeze again. So far, so good...

Thanks for your input.

Dirk
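
For anyone reading along who can change the on-disk layout, the "hashing" subdirectories mentioned above could look roughly like the following. This is only a minimal Python sketch: the function names hashed_path and store, the use of SHA-1, and the two-level, two-character layout are all assumptions made for illustration, not something taken from this thread or prescribed by OCFS2.

```python
import hashlib
from pathlib import Path


def hashed_path(root, filename, levels=2, width=2):
    """Map a file name to root/<h1>/<h2>/filename using a hash of the name.

    Hypothetical helper: 'levels' directory levels, each named after
    'width' hex characters of the SHA-1 digest of the file name.
    """
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root).joinpath(*parts, filename)


def store(root, filename, data):
    """Write data under the hashed location, creating directories as needed."""
    target = hashed_path(root, filename)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```

With two levels of two hex characters, the files are spread over at most 65,536 leaf directories, so each readdir touches far fewer entries. The application has to compute the same path on every read, which is why this only helps when the application's layout can be changed.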
_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user

