Re: remote mirroring in the works?
On 20100906 14:50, David Nicol wrote:
> Only off-topic if BTRFS isn't ever going to ooze into the space currently occupied by the likes of http://en.wikipedia.org/wiki/Global_File_System that is, file systems that have multiple nodes simultaneously accessing block devices and tolerating faults.

There seem to be a number of other systems looking at building fault tolerance and distribution over btrfs: crfs, ceph, lustre. I'm convinced that will happen even if btrfs doesn't do it natively.

Btrfs could probably be built to stripe and/or mirror over several nbd devices now, although I haven't tried it.

--rich
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: remote mirroring in the works?
On Mon, Aug 30, 2010 at 11:14:51AM -0700, K. Richard Pixley wrote:
> On 20100830 10:59, Roy Sigurd Karlsbakk wrote:
>>> I think drbd does precisely what you want. It's not useful for fault tolerance, nor for load balancing, but it will produce a remote block copy that can be used as a sort of hot backup.
>> drbd with heartbeat/pacemaker can provide fault tolerance...
> I think that's a matter of semantics. Once you've failed over from the primary system to the secondary, changes to your block device are terminal. It's not easy to produce a system which can manage those changes and heal in the sense of allowing the primary system to return to service. In effect, returning the primary system to service requires taking both systems down and copying the block device from the secondary back to the first.

This is totally incorrect. DRBD replicates in both directions quite well, in fact. I've been using it on about 60 machines for many years, and I have never had to do what you mention.

What it does not help with is avoiding corruption that occurs above the block layer; e.g., if your file system or your database on top of it barfs, there is no other good copy. fsck or repair is still required in these cases. It is just like local RAID 1 in this respect -- you still need a backup and/or copy at the file level, which is closer to what is needed here.

Simon-
Re: remote mirroring in the works?
On Tue, Aug 31, 2010 at 07:07:29AM +0200, Fred van Zwieten wrote:
> Hmmm, maybe, but rsync would take a lot of time to find the changes. The actual blocks of a snap _are_ the changes; that's why SnapMirror is very efficient. And I don't see how rsync will retain the snaps between both sites.
>
> It would be great if a tool like rsync could have access to the changed blocks in a snap. Don't know if btrfs exposes these somehow.

rsync doesn't have the hinting required to do this efficiently. It has to scan the whole thing every time it is run, and isn't anything like a continuous replication in this respect.

Also, we've had problems in the past with very large file systems causing rsync to run out of memory, because it builds a file list in memory. This led us to build a cpbk tool that basically did the same thing without file lists, which turned out to be a piece of crap, so some other guy kindly rewrote it, but he unfortunately missed the original point entirely and rewrote it using file lists. Sigh.

Anyway, there _is_ this interface:

    btrfs subvolume find-new <path> <last_gen>
        List the recently modified files in a filesystem.

E.g.:

    btrfs sub find-new /mnt 0

This should print all files on the file system, and the last transaction ID marker. That marker can be passed as last_gen on a later call, which then lists only what has changed since that ID.

So, it might be pretty easy to glue these tools together, for now, until something does this automatically and/or in some more efficient or low-level way.

Simon-
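As a rough illustration of the "glue" Simon describes, here is a minimal Python sketch that parses find-new output into a list of changed paths plus the transaction ID marker for the next run. The output layout matched by the two regular expressions is an assumption (the thread does not show real output), and the names parse_find_new and find_new are hypothetical helpers, not part of btrfs-progs.

```python
import re
import subprocess

# ASSUMPTION: these line layouts are how find-new prints its results.
# The thread shows no real output, so check against your btrfs-progs.
EXTENT_LINE = re.compile(
    r"^inode \d+ file offset \d+ len \d+ disk start \d+ "
    r"offset \d+ gen \d+ flags \S+ (?P<path>.+)$"
)
MARKER_LINE = re.compile(r"^transid marker was (?P<gen>\d+)$")


def parse_find_new(output):
    """Return (sorted unique paths, transid marker or None) from find-new text."""
    paths = set()
    marker = None
    for line in output.splitlines():
        m = MARKER_LINE.match(line)
        if m:
            marker = int(m.group("gen"))
            continue
        m = EXTENT_LINE.match(line)
        if m:
            # A file with several changed extents appears once per extent.
            paths.add(m.group("path"))
    return sorted(paths), marker


def find_new(subvol, last_gen):
    """Hypothetical helper: run the command from the message and parse it."""
    result = subprocess.run(
        ["btrfs", "subvolume", "find-new", subvol, str(last_gen)],
        capture_output=True, text=True, check=True,
    )
    return parse_find_new(result.stdout)
```

The returned paths could then be written to a file and handed to rsync via its --files-from option, with the marker stored somewhere and passed as last_gen on the next run.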
Re: remote mirroring in the works?
On Tuesday, 31 August, 2010, Simon Kirby wrote:
[...]
> Anyway, there _is_ this interface:
>
>     btrfs subvolume find-new <path> <last_gen>
>         List the recently modified files in a filesystem.
>
> E.g.:
>
>     btrfs sub find-new /mnt 0
>
> This should print all files on the file system, and the last transaction ID marker. This can be used to call the interface again, which lists only new changed things since that ID.

This is not fully correct. In fact, Chris Mason says (from http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg04620.html):

Chris> When we find an inode in the output, it doesn't mean that inode has
Chris> changed. It just means the btree block holding that inode has changed.
Chris> So we'll want to add limiting based on the ctime/mtime of the inode as
Chris> well.

So even though this command definitely helps, false positives may happen. Moreover, an empty file is not detected (I think because the file doesn't have associated data). But I think that this may be easily corrected.

Regards
G.Baroncelli

--
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E C054 BF04 F161 3DC5 0512
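The ctime/mtime limiting that Chris mentions could be approximated in user space along these lines. This is only a sketch under assumptions: filter_by_timestamp is a hypothetical name, the candidate list is assumed to be paths relative to the subvolume root, and `since` is assumed to be a Unix timestamp recorded at the previous run. It is not how btrfs itself would implement the limiting.

```python
import os


def filter_by_timestamp(root, candidates, since):
    """Keep candidates whose mtime or ctime is at or after `since`.

    root       -- directory the candidate paths are relative to
    candidates -- iterable of relative paths (e.g. from find-new)
    since      -- Unix timestamp of the previous replication run
    """
    kept = []
    for rel in candidates:
        full = os.path.join(root, rel)
        try:
            st = os.lstat(full)
        except FileNotFoundError:
            # Listed earlier but removed in the meantime: nothing to copy.
            continue
        if max(st.st_mtime, st.st_ctime) >= since:
            kept.append(rel)
    return kept
```

This only removes false positives. Files that find-new never lists, such as the empty files mentioned above, would still be missed.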
Re: remote mirroring in the works?
Thinking about this a bit more, would a setup with btrfs on top of DRBD come in the neighborhood of what SnapMirror provides?

DRBD does replication at the block level, without any notion of a filesystem on top of it (as I understand this). So, if I make a snapshot on a DRBD'ed btrfs filesystem, this snapshot would also get replicated at the DRBD level. Provided I put the DB in a consistent state before making the snap, I have a remote consistent copy of this DB. This copy can be used as a failover target or as a basis for restore.

Am I correct?

On Tue, Aug 31, 2010 at 8:30 AM, Simon Kirby s...@hostway.ca wrote:
[...]
Re: remote mirroring in the works?
Hi there,

I would like to know if there is something functionally equivalent to NetApp's SnapMirror in the works or planned? It would require block-level access to a snap and the ability to rebuild subvolumes (including their snaps) on another machine.

If not, what would be the best way to build something more or less equivalent using existing tools? rsync-ing a snap seems the same, but it isn't. First of all, it's file-based, which is not very nice for DBs, and you don't get the same snaps on the other side.

Fred
Re: remote mirroring in the works?
LVM Snapshot.

    lvcreate -s -n SnapShotName /dev/VolumeGroup/SourceLogicalVolumeName

You may need to pass -l or -L to give an initial size for the COW area.

(As for rebuilding on another machine, that would require shared storage or additional LVM tricks to export/import, or good old-fashioned dd.)

That said, a more appropriate list for this question is linux-...@redhat.com

On Mon, Aug 30, 2010 at 10:07 AM, Fred van Zwieten fvzwie...@gmail.com wrote:
[...]
Re: remote mirroring in the works?
- Original Message -
> Hi there,
>
> I would like to know if there is something functionally equivalent to NetApp's SnapMirror in the works or planned? It would require block-level access to a snap and the ability to rebuild subvolumes (including their snaps) on another machine.
>
> If not, what would be the best way to build something more or less equivalent using existing tools? rsync-ing a snap seems the same, but it isn't. First of all, it's file-based, which is not very nice for DBs, and you don't get the same snaps on the other side.

Perhaps DRBD - see http://www.drbd.org/ - that'll mirror the block device(s) on which btrfs resides.

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Re: remote mirroring in the works?
On 20100830 10:07, Fred van Zwieten wrote:
> Hi there,
>
> I would like to know if there is something functionally equivalent to NetApp's SnapMirror in the works or planned? It would require block-level access to a snap and the ability to rebuild subvolumes (including their snaps) on another machine.
>
> If not, what would be the best way to build something more or less equivalent using existing tools? rsync-ing a snap seems the same, but it isn't. First of all, it's file-based, which is not very nice for DBs, and you don't get the same snaps on the other side.
>
> Fred

I think drbd does precisely what you want. It's not useful for fault tolerance, nor for load balancing, but it will produce a remote block copy that can be used as a sort of hot backup.

You can also do something very similar by combining LVM (the logical volume manager) with LVM snapshots and NBD (the network block device), by mirroring to an NBD device.

Neither of these approaches can tolerate the remote file system being live until and unless it takes over for the primary. But either can maintain a dynamic remote block device.

--rich
Re: remote mirroring in the works?
> I think drbd does precisely what you want. It's not useful for fault tolerance, nor for load balancing, but it will produce a remote block copy that can be used as a sort of hot backup.

drbd with heartbeat/pacemaker can provide fault tolerance...

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
Re: remote mirroring in the works?
On 20100830 10:59, Roy Sigurd Karlsbakk wrote:
>> I think drbd does precisely what you want. It's not useful for fault tolerance, nor for load balancing, but it will produce a remote block copy that can be used as a sort of hot backup.
> drbd with heartbeat/pacemaker can provide fault tolerance...

I think that's a matter of semantics. Once you've failed over from the primary system to the secondary, changes to your block device are terminal. It's not easy to produce a system which can manage those changes and heal in the sense of allowing the primary system to return to service. In effect, returning the primary system to service requires taking both systems down and copying the block device from the secondary back to the first.

In terms of fault tolerance, I'd call this a tolerance of about half a fault, since the system cannot return to its initial configuration without breaking continuity of service. And there really isn't any way to extend this.

It's not fault tolerance in the virtual synchrony sense, where there can be a pool of N machines, all symmetric, which can tolerate N - 1 failures and produce continuing service throughout.

It's also not load balanced in the virtual synchrony sense, where N machines can all be in service concurrently and the service can tolerate N - 1 failures, albeit at degraded performance. Or in the sense where failed servers can return to the group dynamically.

It's not sufficient for any application in which I've ever sought fault tolerance. If it's sufficient for you, that's great. But my definition of fault tolerance requires that the system be capable of returning to its initial state without loss of service. The heartbeat approach with single failover can't do that.

--rich - who is likely now off topic.
Re: remote mirroring in the works?
I just glanced over the DRBD/LVM combi, but I don't see it being functionally equal to SnapMirror. Let me (try to) explain how SnapMirror works:

On system A there is a volume (vol1). We let this vol1(A) replicate through SnapMirror to vol1(B). This is done by creating a snap vol1sx(A) and replicating all changed blocks between this snapshot (x) and the previous snapshot (x-1). The first time, there is no x-1 and the whole volume will be replicated, but after this initial full copy, only the changed blocks between the two snapshots are replicated to system B. This is also called snap-based replication.

Why do we want this? Easy. To support consistent DB snaps. The process works by first putting the DB in a consistent mode (depends on the DB implementation), creating a snapshot, letting the DB continue, and replicating the changes. This way a DB-consistent state will be replicated.

The cool thing about the NetApp implementation is that on system B the snaps (x, x-1, x-2, etc.) are also available. When there is trouble, you can choose to bring the DB online on system B on any of the snaps, or, even cooler, to replicate one of those snaps back to system A, doing a block-based rollback at the filesystem level.

Fred

On Mon, Aug 30, 2010 at 7:55 PM, K. Richard Pixley r...@noir.com wrote:
[...]
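The snap-based replication Fred describes can be modelled with a small toy in Python. This is a sketch of the idea only, not NetApp's or btrfs's implementation: volumes and snapshots are assumed to be plain dicts mapping block number to bytes, the names changed_blocks, Replica and replicate are hypothetical, and the changed set is found by comparing two snapshots, whereas a copy-on-write filesystem would already know which blocks differ.

```python
def changed_blocks(prev_snap, cur_snap):
    """Return (writes, frees) needed to turn prev_snap into cur_snap."""
    writes = {n: d for n, d in cur_snap.items() if prev_snap.get(n) != d}
    frees = [n for n in prev_snap if n not in cur_snap]
    return writes, frees


class Replica:
    """System B: holds the replicated volume and every received snap."""

    def __init__(self):
        self.volume = {}
        self.snaps = {}  # snap name -> frozen copy of the volume

    def receive(self, snap_name, writes, frees):
        self.volume.update(writes)
        for n in frees:
            self.volume.pop(n, None)
        # Keep snap x, x-1, x-2, ... available on the remote side too.
        self.snaps[snap_name] = dict(self.volume)


def replicate(replica, snaps, name, prev_name=None):
    """Send snap `name`; with no prev_name this is the initial full copy."""
    prev = snaps[prev_name] if prev_name else {}
    writes, frees = changed_blocks(prev, snaps[name])
    replica.receive(name, writes, frees)
    return len(writes)  # number of blocks that crossed the wire
```

In this model the first call transfers every block, and later calls transfer only the blocks that differ between snapshot x-1 and snapshot x, while system B ends up holding each snap by name.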
Re: remote mirroring in the works?
On Mon, Aug 30, 2010 at 2:15 PM, Fred van Zwieten fvzwie...@gmail.com wrote:
> [...]
> The cool thing about the NetApp implementation is that on system B the snaps (x, x-1, x-2, etc.) are also available. When there is trouble, you can choose to bring the DB online on system B on any of the snaps, or, even cooler, to replicate one of those snaps back to system A, doing a block-based rollback at the filesystem level.

In the ZFS world, this would be the zfs send and zfs recv functionality, in case anyone wants to read up on how it works over there for ideas on how it could be implemented for btrfs in the future.

--
Freddie Cash
fjwc...@gmail.com
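For readers who have not used it, the zfs send / zfs recv pairing Freddie mentions is usually run as a pipe from one host to another. The Python sketch below only builds the two command lines; the function name zfs_replication_cmds and the dataset, snapshot and host names in the usage note are made up for illustration, and using ssh as the transport is an assumption.

```python
def zfs_replication_cmds(dataset, cur_snap, remote_host, remote_dataset,
                         prev_snap=None):
    """Build (send_argv, recv_argv) for a full or incremental transfer.

    With prev_snap=None the whole snapshot is sent (the initial copy);
    otherwise only the difference between prev_snap and cur_snap is sent.
    """
    if prev_snap is None:
        send = ["zfs", "send", f"{dataset}@{cur_snap}"]
    else:
        send = ["zfs", "send", "-i",
                f"{dataset}@{prev_snap}", f"{dataset}@{cur_snap}"]
    # ASSUMPTION: the stream is carried over ssh to the receiving host.
    recv = ["ssh", remote_host, "zfs", "recv", remote_dataset]
    return send, recv
```

For example, zfs_replication_cmds("tank/db", "x", "hostB", "backup/db", prev_snap="x-1") yields the two halves of a pipeline that sends only what changed between the two snaps, which matches the snapshot (x) versus snapshot (x-1) scheme quoted above.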
Re: remote mirroring in the works?
If you can put the db into a consistent state, then rsync will do this. Rsync does changed block transfers.

--rich

On 8/30/10 14:15 , Fred van Zwieten wrote:
[...]
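To make "changed block transfers" concrete, here is a deliberately simplified Python sketch. It compares fixed-size blocks at fixed offsets by checksum and ships only the blocks that differ. This is not rsync's actual algorithm, which uses a rolling checksum so it can also match data that has shifted position; the block size and the names block_digests, delta and patch are assumptions for illustration.

```python
import hashlib

BLOCK = 4096  # assumed block size for the sketch


def block_digests(data, block=BLOCK):
    """Receiver side: one digest per fixed-size block of the old copy."""
    return [hashlib.sha256(data[i:i + block]).digest()
            for i in range(0, len(data), block)]


def delta(new_data, old_digests, block=BLOCK):
    """Sender side: list of (block index, bytes) that differ from the old copy."""
    changes = []
    for idx, start in enumerate(range(0, len(new_data), block)):
        chunk = new_data[start:start + block]
        if (idx >= len(old_digests)
                or hashlib.sha256(chunk).digest() != old_digests[idx]):
            changes.append((idx, chunk))
    return changes


def patch(old_data, changes, new_len, block=BLOCK):
    """Receiver side: apply the changed blocks to the old copy."""
    buf = bytearray(old_data[:new_len].ljust(new_len, b"\0"))
    for idx, chunk in changes:
        buf[idx * block: idx * block + len(chunk)] = chunk
    return bytes(buf)
```

Note that both sides still have to read and checksum the whole file to find the differences, so only the network transfer is reduced, not the scanning work.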
Re: remote mirroring in the works?
Hmmm, maybe, but rsync would take a lot of time to find the changes. The actual blocks of a snap _are_ the changes; that's why SnapMirror is very efficient. And I don't see how rsync will retain the snaps between both sites.

It would be great if a tool like rsync could have access to the changed blocks in a snap. Don't know if btrfs exposes these somehow.

Fred

On Tue, Aug 31, 2010 at 12:56 AM, K. Richard Pixley r...@noir.com wrote:
[...]