Re: [ceph-users] Merging two active ceph clusters: suggestions needed
On Wed, Sep 24, 2014 at 2:12 PM, Robin H. Johnson wrote:
> On Wed, Sep 24, 2014 at 11:31:29AM -0700, Yehuda Sadeh wrote:
>> On Wed, Sep 24, 2014 at 11:17 AM, Craig Lewis wrote:
>> > Yehuda, are there any potential problems there? I'm wondering if
>> > duplicate bucket names that don't have the same contents might
>> > cause problems? Would the second cluster be read-only while
>> > replication is running?
>> I might have missed part of the original requirements. This sync
>> assumes that B starts as a clean slate. No writes are allowed to it
>> while data is being copied into it. Once ready, all writes to A
>> should be quiesced. Once sync completes, they would then need to
>> reconfigure their system to make B the primary zone.
> If my B side was empty, I would simply add all the OSDs in as a
> single cluster.
>
> It's that there are S3 buckets & RBD images on both sides; none of
> the bucket names or RBD images conflict, so I was hoping there was a
> way to merge them.

The sync agent might still work if there are no overlapping buckets.
You'll need to add both clusters as zones in the same region, create a
system user that exists on both, and set the agent to do a full sync
from A to B. The only thing I'm not sure about is whether the sync
agent will try to remove objects from B that don't exist on A. I don't
think it will, but better safe than sorry. Josh, do you see an issue
there?

Yehuda

> --
> Robin Hugh Johnson
> Gentoo Linux: Developer, Infrastructure Lead
> E-Mail : robb...@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
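A rough sketch of the setup Yehuda describes (one region, both clusters
as zones, a shared system user). All names, JSON file contents, and
keys below are placeholders, not from the thread; the exact federation
steps for your Ceph release should come from the federated gateway
docs:

```shell
# On cluster A (the master zone), with region.json naming both zones:
radosgw-admin region set --infile region.json
radosgw-admin zone set --rgw-zone=zone-a --infile zone-a.json
radosgw-admin regionmap update

# On cluster B, the same region map plus its own zone definition:
radosgw-admin region set --infile region.json
radosgw-admin zone set --rgw-zone=zone-b --infile zone-b.json
radosgw-admin regionmap update

# Create the system user with identical keys on BOTH clusters, so the
# sync agent can authenticate against each gateway:
radosgw-admin user create --uid=sync-user --display-name="Sync Agent" \
    --access-key=SYNC_ACCESS --secret=SYNC_SECRET --system
```

These are cluster-administration commands and only make sense run
against live gateways; treat them as a checklist, not a script.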
Re: [ceph-users] Merging two active ceph clusters: suggestions needed
On Wed, Sep 24, 2014 at 11:31:29AM -0700, Yehuda Sadeh wrote:
> On Wed, Sep 24, 2014 at 11:17 AM, Craig Lewis wrote:
> > Yehuda, are there any potential problems there? I'm wondering if
> > duplicate bucket names that don't have the same contents might
> > cause problems? Would the second cluster be read-only while
> > replication is running?
> I might have missed part of the original requirements. This sync
> assumes that B starts as a clean slate. No writes are allowed to it
> while data is being copied into it. Once ready, all writes to A
> should be quiesced. Once sync completes, they would then need to
> reconfigure their system to make B the primary zone.

If my B side was empty, I would simply add all the OSDs in as a single
cluster.

It's that there are S3 buckets & RBD images on both sides; none of the
bucket names or RBD images conflict, so I was hoping there was a way to
merge them.

--
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
Re: [ceph-users] Merging two active ceph clusters: suggestions needed
On Wed, Sep 24, 2014 at 11:17 AM, Craig Lewis wrote:
> Yehuda, are there any potential problems there? I'm wondering if
> duplicate bucket names that don't have the same contents might cause
> problems? Would the second cluster be read-only while replication is
> running?

I might have missed part of the original requirements. This sync
assumes that B starts as a clean slate. No writes are allowed to it
while data is being copied into it. Once ready, all writes to A should
be quiesced. Once sync completes, they would then need to reconfigure
their system to make B the primary zone.

Yehuda

> Robin, are the mtimes in Cluster B's S3 data important? Just
> wondering if it would be easier to move the data from B to A, and
> move nodes from B to A as B shrinks. Then remove the old A nodes when
> it's all done.
>
> On Tue, Sep 23, 2014 at 10:04 PM, Yehuda Sadeh wrote:
>> On Tue, Sep 23, 2014 at 7:23 PM, Robin H. Johnson wrote:
>> > On Tue, Sep 23, 2014 at 03:12:53PM -0600, John Nielsen wrote:
>> >> Keep Cluster A intact and migrate it to your new hardware. You
>> >> can do this with no downtime, assuming you have enough IOPS to
>> >> support data migration and normal usage simultaneously. Bring up
>> >> the new OSDs and let everything rebalance, then remove the old
>> >> OSDs one at a time. Replace the MONs one at a time. Since you
>> >> will have the same data on the same cluster (but different
>> >> hardware), you don't need to worry about mtimes or handling RBD
>> >> or S3 data at all.
>> > The B side already has data however, and that's one of the merge
>> > problems (see below re S3).
>> >
>> >> Make sure you have top-level ceph credentials on the new cluster
>> >> that will work for current users of Cluster B.
>> >>
>> >> Use a librbd-aware tool to migrate the RBD volumes from Cluster B
>> >> onto the new Cluster A. qemu-img comes to mind. This would
>> >> require downtime for each volume, but not necessarily all at the
>> >> same time.
>> > Thanks, qemu-img didn't come to mind as an RBD migration tool.
>> >
>> >> Migrate your S3 user accounts from Cluster B to the new Cluster A
>> >> (should be easily scriptable with e.g. JSON output from
>> >> radosgw-admin).
>> > It's fixed now, but it didn't use to be possible to create all the
>> > various keys.
>> >
>> >> Check for and resolve S3 bucket name conflicts between Cluster A
>> >> and Cluster B.
>> > None.
>> >
>> >> Migrate your S3 data from Cluster B to the new Cluster A using an
>> >> S3-level tool. s3cmd comes to mind.
>> > s3cmd does not preserve mtimes, ACLs or CORS data; that's the
>> > largest part of the concern.
>>
>> You need to set up a second rgw zone, and use the radosgw sync agent
>> to sync data to the secondary zone. That will preserve mtimes and
>> ACLs. Once that's complete you could then turn the secondary zone
>> into your primary.
>>
>> Yehuda
Re: [ceph-users] Merging two active ceph clusters: suggestions needed
Yehuda, are there any potential problems there? I'm wondering if
duplicate bucket names that don't have the same contents might cause
problems? Would the second cluster be read-only while replication is
running?

Robin, are the mtimes in Cluster B's S3 data important? Just wondering
if it would be easier to move the data from B to A, and move nodes from
B to A as B shrinks. Then remove the old A nodes when it's all done.

On Tue, Sep 23, 2014 at 10:04 PM, Yehuda Sadeh wrote:
> On Tue, Sep 23, 2014 at 7:23 PM, Robin H. Johnson wrote:
> > On Tue, Sep 23, 2014 at 03:12:53PM -0600, John Nielsen wrote:
> >> Keep Cluster A intact and migrate it to your new hardware. You can
> >> do this with no downtime, assuming you have enough IOPS to support
> >> data migration and normal usage simultaneously. Bring up the new
> >> OSDs and let everything rebalance, then remove the old OSDs one at
> >> a time. Replace the MONs one at a time. Since you will have the
> >> same data on the same cluster (but different hardware), you don't
> >> need to worry about mtimes or handling RBD or S3 data at all.
> > The B side already has data however, and that's one of the merge
> > problems (see below re S3).
> >
> >> Make sure you have top-level ceph credentials on the new cluster
> >> that will work for current users of Cluster B.
> >>
> >> Use a librbd-aware tool to migrate the RBD volumes from Cluster B
> >> onto the new Cluster A. qemu-img comes to mind. This would require
> >> downtime for each volume, but not necessarily all at the same time.
> > Thanks, qemu-img didn't come to mind as an RBD migration tool.
> >
> >> Migrate your S3 user accounts from Cluster B to the new Cluster A
> >> (should be easily scriptable with e.g. JSON output from
> >> radosgw-admin).
> > It's fixed now, but it didn't use to be possible to create all the
> > various keys.
> >
> >> Check for and resolve S3 bucket name conflicts between Cluster A
> >> and Cluster B.
> > None.
> >
> >> Migrate your S3 data from Cluster B to the new Cluster A using an
> >> S3-level tool. s3cmd comes to mind.
> > s3cmd does not preserve mtimes, ACLs or CORS data; that's the
> > largest part of the concern.
>
> You need to set up a second rgw zone, and use the radosgw sync agent
> to sync data to the secondary zone. That will preserve mtimes and
> ACLs. Once that's complete you could then turn the secondary zone
> into your primary.
>
> Yehuda
Re: [ceph-users] Merging two active ceph clusters: suggestions needed
On Tue, Sep 23, 2014 at 7:23 PM, Robin H. Johnson wrote:
> On Tue, Sep 23, 2014 at 03:12:53PM -0600, John Nielsen wrote:
>> Keep Cluster A intact and migrate it to your new hardware. You can
>> do this with no downtime, assuming you have enough IOPS to support
>> data migration and normal usage simultaneously. Bring up the new
>> OSDs and let everything rebalance, then remove the old OSDs one at a
>> time. Replace the MONs one at a time. Since you will have the same
>> data on the same cluster (but different hardware), you don't need to
>> worry about mtimes or handling RBD or S3 data at all.
> The B side already has data however, and that's one of the merge
> problems (see below re S3).
>
>> Make sure you have top-level ceph credentials on the new cluster
>> that will work for current users of Cluster B.
>>
>> Use a librbd-aware tool to migrate the RBD volumes from Cluster B
>> onto the new Cluster A. qemu-img comes to mind. This would require
>> downtime for each volume, but not necessarily all at the same time.
> Thanks, qemu-img didn't come to mind as an RBD migration tool.
>
>> Migrate your S3 user accounts from Cluster B to the new Cluster A
>> (should be easily scriptable with e.g. JSON output from
>> radosgw-admin).
> It's fixed now, but it didn't use to be possible to create all the
> various keys.
>
>> Check for and resolve S3 bucket name conflicts between Cluster A and
>> Cluster B.
> None.
>
>> Migrate your S3 data from Cluster B to the new Cluster A using an
>> S3-level tool. s3cmd comes to mind.
> s3cmd does not preserve mtimes, ACLs or CORS data; that's the largest
> part of the concern.

You need to set up a second rgw zone, and use the radosgw sync agent to
sync data to the secondary zone. That will preserve mtimes and ACLs.
Once that's complete you could then turn the secondary zone into your
primary.

Yehuda
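Once the zones and the system user are in place, the sync agent run
Yehuda describes looks roughly like this. Hedge heavily: the endpoint
URL, zone name, and keys are placeholders, and the exact flag names
come from the Firefly-era radosgw-agent; check `radosgw-agent --help`
on your version before relying on this:

```shell
# Full sync from the master zone (cluster A) into the secondary zone
# (cluster B). --sync-scope=full copies existing objects, not just new
# changes; the positional argument is the destination gateway.
radosgw-agent \
    --src-access-key=SYNC_ACCESS --src-secret-key=SYNC_SECRET \
    --dest-access-key=SYNC_ACCESS --dest-secret-key=SYNC_SECRET \
    --src-zone=zone-a \
    --sync-scope=full \
    http://rgw-b.example.com:80
```

This only makes sense against live gateways; it is a command fragment,
not a runnable script.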
Re: [ceph-users] Merging two active ceph clusters: suggestions needed
On Tue, Sep 23, 2014 at 03:12:53PM -0600, John Nielsen wrote:
> Keep Cluster A intact and migrate it to your new hardware. You can do
> this with no downtime, assuming you have enough IOPS to support data
> migration and normal usage simultaneously. Bring up the new OSDs and
> let everything rebalance, then remove the old OSDs one at a time.
> Replace the MONs one at a time. Since you will have the same data on
> the same cluster (but different hardware), you don't need to worry
> about mtimes or handling RBD or S3 data at all.

The B side already has data however, and that's one of the merge
problems (see below re S3).

> Make sure you have top-level ceph credentials on the new cluster that
> will work for current users of Cluster B.
>
> Use a librbd-aware tool to migrate the RBD volumes from Cluster B
> onto the new Cluster A. qemu-img comes to mind. This would require
> downtime for each volume, but not necessarily all at the same time.

Thanks, qemu-img didn't come to mind as an RBD migration tool.

> Migrate your S3 user accounts from Cluster B to the new Cluster A
> (should be easily scriptable with e.g. JSON output from
> radosgw-admin).

It's fixed now, but it didn't use to be possible to create all the
various keys.

> Check for and resolve S3 bucket name conflicts between Cluster A and
> Cluster B.

None.

> Migrate your S3 data from Cluster B to the new Cluster A using an
> S3-level tool. s3cmd comes to mind.

s3cmd does not preserve mtimes, ACLs or CORS data; that's the largest
part of the concern.

--
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
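The user-account migration John mentions can be scripted via
radosgw-admin's metadata commands, which round-trip a user (including
its keys) as JSON. A sketch only: the ceph.conf paths are placeholders
for however you point at each cluster, and you should verify the
metadata commands exist on your release:

```shell
# Copy every rgw user from cluster B to cluster A as JSON.
# metadata list emits a JSON array of uids; metadata get/put round-trip
# the full user record, keys included.
for uid in $(radosgw-admin -c /etc/ceph/b.conf metadata list user |
             python -c 'import json,sys; print("\n".join(json.load(sys.stdin)))'); do
    radosgw-admin -c /etc/ceph/b.conf metadata get "user:$uid" > "/tmp/$uid.json"
    radosgw-admin -c /etc/ceph/a.conf metadata put "user:$uid" < "/tmp/$uid.json"
done
```

Run it read-only first (the `metadata get` half) and inspect the JSON
before putting anything into the destination cluster.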
Re: [ceph-users] Merging two active ceph clusters: suggestions needed
I would:

Keep Cluster A intact and migrate it to your new hardware. You can do
this with no downtime, assuming you have enough IOPS to support data
migration and normal usage simultaneously. Bring up the new OSDs and
let everything rebalance, then remove the old OSDs one at a time.
Replace the MONs one at a time. Since you will have the same data on
the same cluster (but different hardware), you don't need to worry
about mtimes or handling RBD or S3 data at all.

Make sure you have top-level ceph credentials on the new cluster that
will work for current users of Cluster B.

Use a librbd-aware tool to migrate the RBD volumes from Cluster B onto
the new Cluster A. qemu-img comes to mind. This would require downtime
for each volume, but not necessarily all at the same time.

Migrate your S3 user accounts from Cluster B to the new Cluster A
(should be easily scriptable with e.g. JSON output from radosgw-admin).

Check for and resolve S3 bucket name conflicts between Cluster A and
Cluster B.

Migrate your S3 data from Cluster B to the new Cluster A using an
S3-level tool. s3cmd comes to mind.

Fine-tuning and automating the above is left as an exercise for the
reader, but it should all be possible with built-in and/or commodity
tools.

On Sep 20, 2014, at 11:15 PM, Robin H. Johnson wrote:
> For a variety of reasons, none good anymore, we have two separate
> Ceph clusters.
>
> I would like to merge them onto the newer hardware, with as little
> downtime and data loss as possible; then discard the old hardware.
>
> Cluster A (2 hosts):
> - 3TB of S3 content, >100k files, file mtimes important
> - <500GB of RBD volumes, exported via iscsi
>
> Cluster B (4 hosts):
> - <50GiB of S3 content
> - 7TB of RBD volumes, exported via iscsi
>
> Short of finding somewhere to dump all of the data from one side, and
> re-importing it after merging with that cluster as empty; are there
> any other alternatives available to me?
>
> --
> Robin Hugh Johnson
> Gentoo Linux: Developer, Infrastructure Lead
> E-Mail : robb...@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
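The librbd-aware qemu-img step above can be sketched like this. Pool,
image, and ceph.conf paths are placeholders; qemu-img's rbd URIs accept
a `:conf=` option to pick which cluster to talk to:

```shell
# Copy one RBD volume from cluster B into cluster A while nothing is
# using it. -p shows progress; each side reads its own ceph.conf.
qemu-img convert -p \
    rbd:rbd/vol1:conf=/etc/ceph/b.conf \
    rbd:rbd/vol1:conf=/etc/ceph/a.conf
```

Quiesce or unmap the volume on the iscsi side first, since qemu-img
sees only a point-in-time copy of a live image.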
Re: [ceph-users] Merging two active ceph clusters: suggestions needed
On 09/22/2014 05:17 AM, Robin H. Johnson wrote:
> Can somebody else make comments about migrating S3 buckets with
> preserved mtime data (and all of the ACLs & CORS) then?

I don't know how radosgw objects are stored, but have you considered a
lower level rados export/import?

IMPORT AND EXPORT
   import [options] <local-directory> <rados-pool>
      Upload <local-directory> to <rados-pool>
   export [options] <rados-pool> <local-directory>
      Download <rados-pool> to <local-directory>
   options:
      -f / --force          Copy everything, even if it hasn't changed.
      -d / --delete-after   After synchronizing, delete unreferenced
                            files or objects from the target bucket or
                            directory.
      --workers             Number of worker threads to spawn
                            (default 5)
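Following the help text quoted above, a pool-level copy would look
roughly like this. Pool and directory names are placeholders, and the
open question stands: this copies raw RADOS objects, so whether
radosgw's bucket indexes and metadata remain consistent after a
cross-cluster copy is exactly what would need testing first:

```shell
# Dump a pool from cluster B to a local directory, then load it into
# the same-named pool on cluster A (syntax per the rados help above).
rados -c /etc/ceph/b.conf export .rgw.buckets /tmp/rgw-dump
rados -c /etc/ceph/a.conf import /tmp/rgw-dump .rgw.buckets
```

Note the dump directory must hold the full pool contents (3TB+ in this
thread), which may itself rule this approach out.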
Re: [ceph-users] Merging two active ceph clusters: suggestions needed
On Sun, Sep 21, 2014 at 02:33:09PM +0900, Christian Balzer wrote:
> > For a variety of reasons, none good anymore, we have two separate
> > Ceph clusters.
> >
> > I would like to merge them onto the newer hardware, with as little
> > downtime and data loss as possible; then discard the old hardware.
> >
> > Cluster A (2 hosts):
> > - 3TB of S3 content, >100k files, file mtimes important
> > - <500GB of RBD volumes, exported via iscsi
> >
> > Cluster B (4 hosts):
> > - <50GiB of S3 content
> > - 7TB of RBD volumes, exported via iscsi
> >
> > Short of finding somewhere to dump all of the data from one side,
> > and re-importing it after merging with that cluster as empty; are
> > there any other alternatives available to me?
>
> Having recently seen a similar question and the answer by the Ceph
> developers, no. As in there is no way (and no plans) for merging
> clusters.
>
> There are export functions for RBD volumes, not sure about S3 and the
> mtimes as I don't use that functionality.

Can somebody else make comments about migrating S3 buckets with
preserved mtime data (and all of the ACLs & CORS) then?

--
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
Re: [ceph-users] Merging two active ceph clusters: suggestions needed
On Sun, 21 Sep 2014 05:15:32 + Robin H. Johnson wrote:
> For a variety of reasons, none good anymore, we have two separate
> Ceph clusters.
>
> I would like to merge them onto the newer hardware, with as little
> downtime and data loss as possible; then discard the old hardware.
>
> Cluster A (2 hosts):
> - 3TB of S3 content, >100k files, file mtimes important
> - <500GB of RBD volumes, exported via iscsi
>
> Cluster B (4 hosts):
> - <50GiB of S3 content
> - 7TB of RBD volumes, exported via iscsi
>
> Short of finding somewhere to dump all of the data from one side, and
> re-importing it after merging with that cluster as empty; are there
> any other alternatives available to me?

Having recently seen a similar question and the answer by the Ceph
developers, no. As in, there is no way (and no plans) for merging
clusters.

There are export functions for RBD volumes; I'm not sure about S3 and
the mtimes, as I don't use that functionality.

Christian

--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/
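The RBD export functions Christian mentions can stream directly between
clusters without an intermediate file, since `rbd export` writes to
stdout and `rbd import` reads from stdin when given `-`. Image names
and ceph.conf paths are placeholders:

```shell
# Stream one image from cluster B straight into cluster A.
rbd -c /etc/ceph/b.conf export rbd/vol1 - |
    rbd -c /etc/ceph/a.conf import - rbd/vol1
```

As with qemu-img, this is a point-in-time copy, so the image should be
idle (unmapped from the iscsi gateways) while it runs.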
[ceph-users] Merging two active ceph clusters: suggestions needed
For a variety of reasons, none good anymore, we have two separate Ceph
clusters.

I would like to merge them onto the newer hardware, with as little
downtime and data loss as possible; then discard the old hardware.

Cluster A (2 hosts):
- 3TB of S3 content, >100k files, file mtimes important
- <500GB of RBD volumes, exported via iscsi

Cluster B (4 hosts):
- <50GiB of S3 content
- 7TB of RBD volumes, exported via iscsi

Short of finding somewhere to dump all of the data from one side, and
re-importing it after merging with that cluster as empty; are there any
other alternatives available to me?

--
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85