split RAID1 during backups?
Hi all,

I have a two-drive RAID1 serving data for a busy website. The partition is
500GB and contains millions of 10KB files. For reference, here's
/proc/mdstat:

Personalities : [raid1]
md0 : active raid1 sdc1[0] sdd1[1]
      488383936 blocks [2/2] [UU]

For backups, I set the md0 partition to read-only and then use dd_rescue +
netcat to copy the partition over a gigabit network. Unfortunately, this
process takes almost 10 hours. I'm only able to copy about 18MB/s from md0
due to disk contention with the webserver. If I had the full attention of a
single disk, I could read at nearly 60MB/s.

So I'm thinking of the following backup scenario. First, remount /dev/md0
read-only just to be safe. Then mount the two component partitions (sdc1,
sdd1) read-only. Tell the webserver to work from one component partition,
and tell the backup process to work from the other. Once the backup is
complete, point the webserver back at /dev/md0, unmount the component
partitions, then switch read-write mode back on. Am I insane?

Everything on this system seems bottlenecked by disk I/O. That includes the
rate web pages are served as well as the backup process described above.
While I'm always hungry for performance tips, faster backups are the current
focus.

For those interested in gory details such as drive types, NCQ settings,
kernel version and whatnot, I dumped a copy of dmesg output here:
http://www.jab.org/dmesg

Cheers,
Jeff
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
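[Editor's note: the quoted rates can be turned into expected wall-clock
times with a quick back-of-envelope, assuming the 488383936-block figure
from /proc/mdstat is in 1KiB blocks (as md reports):]

```shell
# Rough streaming-time estimate for the 488383936 KiB partition.
# hours_at RATE_MB prints the hours needed to read the whole device
# at that sustained rate in MB/s.
hours_at() {
  awk -v kib=488383936 -v mb="$1" 'BEGIN { printf "%.1f", kib/1024/mb/3600 }'
}
echo "contended  (18 MB/s): $(hours_at 18) h"
echo "dedicated  (60 MB/s): $(hours_at 60) h"
```

At the contended 18MB/s this is about 7.4 hours of pure streaming, which is
consistent with the "almost 10 hours" observed once overhead is added; a
dedicated spindle at 60MB/s would cut it to roughly 2.2 hours.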
Re: split RAID1 during backups?
> For backups, I set the md0 partition to read-only and then use dd_rescue +
> netcat to copy the partition over a gigabit network. Unfortunately, this
> process takes almost 10 hours. I'm only able to copy about 18MB/s from md0
> due to disk contention with the webserver. If I had the full attention of
> a single disk, I could read at nearly 60MB/s.

First of all, if the data is mostly static, rsync might work faster. Don't
feed rsync millions of files in one go - try to split it into separate
processes, say one for all files starting with a, one for all files starting
with b, etc.

> So I'm thinking of the following backup scenario. First, remount /dev/md0
> read-only just to be safe. Then mount the two component partitions (sdc1,
> sdd1) read-only. Tell the webserver to work from one component partition,
> and tell the backup process to work from the other. Once the backup is
> complete, point the webserver back at /dev/md0, unmount the component
> partitions, then switch read-write mode back on. Am I insane?

It doesn't sound insane. Whether it's actually fast is something only you
can test on your hardware.

By the way, it used to be with regular IDE disks that using hdc and hdd
together on a single cable was a sure way to get a slow system. I take it
sdc and sdd, being SATA, don't influence each other?

Good luck,
Jurriaan
Re: split RAID1 during backups?
Jeff Breidenbach wrote:
> So I'm thinking of the following backup scenario. First, remount /dev/md0
> read-only just to be safe. Then mount the two component partitions (sdc1,
> sdd1) read-only. Tell the webserver to work from one component partition,
> and tell the backup process to work from the other. Once the backup is
> complete, point the webserver back at /dev/md0, unmount the component
> partitions, then switch read-write mode back on.

Why not do something like this?

mount -o remount,ro /dev/md0 /web
mdadm --fail /dev/md0 /dev/sdd1
mdadm --remove /dev/md0 /dev/sdd1
mount -o ro /dev/sdd1 /target
(do backup here)
umount /target
mdadm --add /dev/md0 /dev/sdd1
mount -o remount,rw /dev/md0 /web

That way the web server continues to run from the md. However, you will
endure a rebuild on md0 when you re-add the disk; but given everything is
mounted read-only, you should practically not be writing anything, and if
you fail a disk during the rebuild, the other disk will still be intact.

I second Jurriaan's vote for rsync also, but I would be inclined just to let
it loose on the whole disk rather than break it up into parts... but then I
have heaps of RAM too.

Regards,
Brad
--
Human beings, who are almost unique in having the ability to learn from the
experience of others, are also remarkable for their apparent disinclination
to do so.
                -- Douglas Adams
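[Editor's note: Brad's sequence, wrapped as a script for reference. This is
a dry-run sketch only: it echoes each command instead of executing it (clear
DRYRUN to actually run), and the device and mount-point names are just the
examples from his mail.]

```shell
# Dry-run sketch of the fail/backup/re-add cycle. With DRYRUN=echo each
# step is printed, not executed; set DRYRUN= only on a system where
# /dev/md0, /dev/sdd1, /web and /target are actually correct.
DRYRUN=echo
run() { $DRYRUN "$@"; }

run mount -o remount,ro /dev/md0
run mdadm --fail /dev/md0 /dev/sdd1
run mdadm --remove /dev/md0 /dev/sdd1
run mount -o ro /dev/sdd1 /target
# ... back up /target here (dd_rescue, rsync, tar | netcat, ...) ...
run umount /target
run mdadm --add /dev/md0 /dev/sdd1
run mount -o remount,rw /dev/md0
```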
Re: split RAID1 during backups?
> First of all, if the data is mostly static, rsync might work faster.

Any operation that stats the individual files - even just to look at
timestamps - takes about two weeks. Therefore it is hard for me to see rsync
as a viable solution, even though the data is mostly static. About 400,000
files change between weekly backups.

> I take it sdc and sdd, being SATA, don't influence each other?

Correct.

> However, you will endure a rebuild on md0 when you re-add the disk; but
> given everything is mounted read-only, you should practically not be
> writing anything

If the rebuild operation is a no-op, then that sounds like a great idea. If
the rebuild operation requires scanning over all the data on both drives, I
think that's going to be at least as expensive as the current 10-hour
process.

Thanks for the suggestions so far.

Cheers,
Jeff
Re: split RAID1 during backups?
On Mon, 24 Oct 2005, Jeff Breidenbach wrote:
> Any operation that stats the individual files - even just to look at
> timestamps - takes about two weeks. Therefore it is hard for me to see
> rsync as a viable solution, even though the data is mostly static. About
> 400,000 files change between weekly backups.

taking a long time to stat individual files makes me wonder if you're
suffering from atime updates and O(n) directory lookups... have you tried
this:

- mount -o noatime,nodiratime
- tune2fs -O dir_index (and e2fsck -D)
  (you need recentish e2fsprogs for this, and i'm pretty sure you want a
  2.6.x kernel)

a big hint you're suffering from atime updates is write traffic when your fs
is mounted rw and your static webserver is the only thing running (and your
logs go elsewhere)... atime updates are probably the only writes then. try
iostat -x 5.

a big hint you're suffering from O(n) directory lookups is heaps of system
time... (vmstat or top).

On Mon, 24 Oct 2005, Brad Campbell wrote:
> mount -o remount,ro /dev/md0 /web
> mdadm --fail /dev/md0 /dev/sdd1
> mdadm --remove /dev/md0 /dev/sdd1
> mount -o ro /dev/sdd1 /target
> (do backup here)
> umount /target
> mdadm --add /dev/md0 /dev/sdd1
> mount -o remount,rw /dev/md0 /web

the md event counts would be out of sync, and unless you're using bitmapped
intent logging this would cause a full resync. if the raid wasn't online you
could probably use one of the mdadm options to force the two devices to be a
synced raid1... but i'm guessing you wouldn't be able to do it online.

other 2.6.x bleeding-edge options are to mark one drive as write-mostly so
that you have no read traffic competition while doing a backup... or just
use the bitmap intent logging and an nbd to add a third, networked copy of
the drive on another machine.
-dean
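[Editor's note: the bitmap intent logging dean mentions can be enabled on a
live array on sufficiently recent 2.6.x kernels. Hedged, dry-run sketch -
whether --grow accepts --bitmap=internal depends on the mdadm (2.x) and
kernel (2.6.13+) versions in play, so the command is printed, not executed.]

```shell
# With a write-intent bitmap, a member that is failed out and later
# re-added resyncs only the blocks dirtied while it was absent - not the
# whole 500GB. Dry-run: print the command instead of running it.
run() { echo "$@"; }

run mdadm --grow --bitmap=internal /dev/md0
# afterwards /proc/mdstat should show a "bitmap:" line for md0
```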
Re: split RAID1 during backups?
>>>>> "Jeff" == Jeff Breidenbach <jeff@jab.org> writes:

Jeff> # mount | grep md0
Jeff> /dev/md0 on /data1 type reiserfs (rw,noatime,nodiratime)

Ah, you're using reiserfs on here. It may or may not be having problems with
all those files per directory that you have. Is there any way you can split
them up more into sub-directories? Old news servers used to run into this
exact same problem, and what they did was move all files starting with 'a'
into the 'a/' directory, all files starting with 'b' into 'b/'... etc. You
can go down as many levels as you want.

Jeff> Individual directories contain up to about 150,000 files. If I run
Jeff> ls -U on all directories, it completes in a reasonable amount of time
Jeff> (I forget how much, but I think it is well under an hour). Reiserfs
Jeff> is supposed to be good at this sort of thing. If I were to stat each
Jeff> file, then it's a different story.

Do you stat the files in inode order (not sure how reiserfs stores files)
when you're doing a readdir() on the directory contents? You don't want to
bother sorting at all; you just want to pull them off the disk as
efficiently as possible.

I think you'll get a lot more performance out of your system if you can just
re-do how the application writes/reads the files you're using. It almost
sounds like some sort of cache system...

The other idea would be to use 'inotify' and just copy those files which
change to the cloned box.

Another idea, which would require more hardware, would be to make some
read-only copies of the system and have all reads go there, with only writes
going to the master system. If the master dies, you just promote a slave
into that role. If a slave dies, you have extras running around. Then you
could do your backups against the read-only systems, in parallel, to get the
most performance out of your backups.

But knowing more about the application would help. Millions of tiny files
aren't optimal these days.

Oh yeah, what kind of block size are you using on the filesystem? And how
many disks?
Splitting the load across more, smaller disks will probably help as well,
since I suspect that your times are dominated by seek and directory
overhead, not the actual reading of all these tiny files.

John
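[Editor's note: John's split-into-subdirectories scheme, sketched as a tiny
helper. The function name is hypothetical; a real migration would also need
a mkdir -p and mv loop over the existing files.]

```shell
# Map a flat filename to a one-level a/..z/ fan-out path - the scheme old
# news servers used to keep per-directory entry counts bounded.
fanout_path() {
  local f=$1
  printf '%s/%s\n' "${f:0:1}" "$f"
}

fanout_path article12345    # first character picks the subdirectory
```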
Re: split RAID1 during backups?
On Mon, 24 Oct 2005, Jeff Breidenbach wrote:
> Dean, the comment about write-mostly is confusing to me. Let's say I
> somehow marked one of the component drives write-mostly to quiet it down.
> How do I get at it? Linux will not let me mount the component partition if
> md0 is also mounted. Do you think write-mostly or write-behind are likely
> enough to be magic bullets that I should learn all about them?

if one drive is write-mostly, and you remount the filesystem read-only...
then no writes should be occurring... and you could dd from the component
drive directly and get a consistent fs image. (i'm assuming you can remount
the filesystem read-only for the duration of the backup, because it sounds
like that's how you do it now; and i'm assuming you're happy enough with
your dd_rescue image...)

myself, i've been considering a related problem... i don't trust LVM/DM
snapshots in 2.6.x yet, and i've been holding back a 2.4.x box waiting for
them to stabilise... but that seems to be taking a long time. the box
happens to have a 3-way raid1 anyhow, and 2.6.x bitmapped intent logging
would give me a great snapshot backup option: just break off one disk during
the backup and put it back in the mirror when done.

there's probably one problem with this 3-way approach... i'll need some way
to get the fs (ext3) to reach a safe point where no log recovery would be
required on the disk i break out of the mirror... because under no
circumstances do you want to write on the disk while it's outside the
mirror. (LVM snapshotting in 2.4.x requires a VFS lock patch which does
exactly this when you create a snapshot.)

> John, I'm using 4KB blocks in reiserfs with tail packing.

i didn't realise you were using reiserfs... i'd suggest disabling tail
packing... but then i've never used reiser, and i've only ever seen reports
of tail packing having serious performance impact. you're really only saving
yourself an average of half a block per inode...
maybe try a smaller block size if disk space is an issue due to the sheer
number of inodes.

-dean
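[Editor's note: for reference, the write-mostly flag dean refers to is
normally applied when a member is (re-)added. Dry-run sketch with echoed
commands - the device names are the thread's examples, and the exact
argument order is an assumption based on mdadm 2.x manage-mode syntax.]

```shell
# Dry-run sketch: re-add one mirror half with --write-mostly so md serves
# reads from the other half, leaving this drive quiet for a raw dd backup.
run() { echo "$@"; }   # print instead of execute

run mdadm --fail /dev/md0 /dev/sdd1
run mdadm --remove /dev/md0 /dev/sdd1
run mdadm /dev/md0 --add --write-mostly /dev/sdd1
```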
Re: split RAID1 during backups?
Should there be any consideration for the utilization of the gigabit
interface that is passing all of this backup traffic, as well as the speed
of the drive that is doing all of the writing during this transaction? Is
the 18MB/s how fast the data is being copied over the network, or is it some
metric within the host system?

Thomas

Jeff Breidenbach wrote:
> Hi all, I have a two-drive RAID1 serving data for a busy website. The
> partition is 500GB and contains millions of 10KB files.
> [...]
> For backups, I set the md0 partition to read-only and then use dd_rescue +
> netcat to copy the partition over a gigabit network. Unfortunately, this
> process takes almost 10 hours. I'm only able to copy about 18MB/s from md0
> due to disk contention with the webserver. If I had the full attention of
> a single disk, I could read at nearly 60MB/s.
> [...]
Re: split RAID1 during backups?
On 10/24/05, Thomas Garner [EMAIL PROTECTED] wrote:
> Should there be any consideration for the utilization of the gigabit
> interface that is passing all of this backup traffic, as well as the speed
> of the drive that is doing all of the writing during this transaction? Is
> the 18MB/s how fast the data is being copied over the network, or is it
> some metric within the host system?

The switched gigabit network is plenty fast. The bottleneck is reading from
the RAID1 while it is under contention. Here are measurements from
transferring a chunk of data from /dev/zero, a single unmounted drive, and
the RAID1. Measurements are reported by dd_rescue and reflect how fast data
is moving over the network. I was careful to use smart command-line options
with dd_rescue, avoid contaminating Linux's disk cache, and make sure the
results were repeatable.

MB/s   Operation
72.0   dd-rescue /dev/zero - | netcat
61.8   dd-rescue [unmounted single drive] - | netcat
18.8   dd-rescue md0 - | netcat

dd_rescue v1.11 options: -B 4096 -q -l -d -s 11G -m 200M -S 0