Re: [ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-24 Thread Yehuda Sadeh
On Wed, Sep 24, 2014 at 2:12 PM, Robin H. Johnson  wrote:
> On Wed, Sep 24, 2014 at 11:31:29AM -0700, Yehuda Sadeh wrote:
>> On Wed, Sep 24, 2014 at 11:17 AM, Craig Lewis  
>> wrote:
>> > Yehuda, are there any potential problems there?  I'm wondering whether duplicate
>> > bucket names that don't have the same contents might cause problems.  Would
>> > the second cluster be read-only while replication is running?
>> I might have missed part of the original requirements. This sync
>> assumes that B starts as a clean slate. No writes are allowed to it
>> while data is being copied into it. Once ready, all writes to A should
>> be quiesced. Once sync completes, they would then need to reconfigure
>> their system to make B the primary zone.
> If my B side were empty, I would simply add all the OSDs in as a single
> cluster.
>
> The problem is that there are S3 buckets & RBD images on both sides; none of the
> bucket names or RBD images conflict, so I was hoping there was a way to
> merge them.

The sync agent might still work if there are no overlapping buckets.
You'll need to add both clusters as zones in the same region, create a
system user that will exist on both, and set the agent to do a full
sync from A to B. The only thing I'm not sure about is whether the sync
agent will try to remove objects from B because they don't exist on A. I
don't think it will, but it's better to be safe than sorry. Josh,
do you see an issue there?
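
For reference, a rough sketch of the agent run (untested; endpoints, zone
names and keys are placeholders, and it assumes the federated region/zone
setup and a shared system user already exist on both sides):

    # cluster-sync.conf for radosgw-agent
    src_access_key: SYNC_ACCESS_KEY
    src_secret_key: SYNC_SECRET_KEY
    destination: http://rgw-b.example.com:80
    dest_access_key: SYNC_ACCESS_KEY
    dest_secret_key: SYNC_SECRET_KEY
    log_file: /var/log/radosgw/sync-a-to-b.log

    # one-off full sync of metadata and data from A (master) to B
    radosgw-agent -c cluster-sync.conf --sync-scope=full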

Yehuda

>
> --
> Robin Hugh Johnson
> Gentoo Linux: Developer, Infrastructure Lead
> E-Mail : robb...@gentoo.org
> GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-24 Thread Robin H. Johnson
On Wed, Sep 24, 2014 at 11:31:29AM -0700, Yehuda Sadeh wrote:
> On Wed, Sep 24, 2014 at 11:17 AM, Craig Lewis  
> wrote:
> > Yehuda, are there any potential problems there?  I'm wondering whether duplicate
> > bucket names that don't have the same contents might cause problems.  Would
> > the second cluster be read-only while replication is running?
> I might have missed part of the original requirements. This sync
> assumes that B starts as a clean slate. No writes are allowed to it
> while data is being copied into it. Once ready, all writes to A should
> be quiesced. Once sync completes, they would then need to reconfigure
> their system to make B the primary zone.
If my B side were empty, I would simply add all the OSDs in as a single
cluster.

The problem is that there are S3 buckets & RBD images on both sides; none of the
bucket names or RBD images conflict, so I was hoping there was a way to
merge them.

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-24 Thread Yehuda Sadeh
On Wed, Sep 24, 2014 at 11:17 AM, Craig Lewis  wrote:
> Yehuda, are there any potential problems there?  I'm wondering whether duplicate
> bucket names that don't have the same contents might cause problems.  Would
> the second cluster be read-only while replication is running?

I might have missed part of the original requirements. This sync
assumes that B starts as a clean slate. No writes are allowed to it
while data is being copied into it. Once ready, all writes to A should
be quiesced. Once sync completes, they would then need to reconfigure
their system to make B the primary zone.

Yehuda

>
> Robin, are the mtimes in Cluster B's S3 data important?  Just wondering if
> it would be easier to move the data from B to A, and move nodes from B to A
> as B shrinks.  Then remove the old A nodes when it's all done.
>
>
> On Tue, Sep 23, 2014 at 10:04 PM, Yehuda Sadeh  wrote:
>>
>> On Tue, Sep 23, 2014 at 7:23 PM, Robin H. Johnson 
>> wrote:
>> > On Tue, Sep 23, 2014 at 03:12:53PM -0600, John Nielsen wrote:
>> >> Keep Cluster A intact and migrate it to your new hardware. You can do
>> >> this with no downtime, assuming you have enough IOPS to support data
>> >> migration and normal usage simultaneously. Bring up the new OSDs and
>> >> let everything rebalance, then remove the old OSDs one at a time.
>> >> Replace the MONs one at a time. Since you will have the same data on
>> >> the same cluster (but different hardware), you don't need to worry
>> >> about mtimes or handling RBD or S3 data at all.
>> > The B side already has data however, and that's one of the merge
>> > problems (see below re S3).
>> >
>> >> Make sure you have top-level ceph credentials on the new cluster that
>> >> will work for current users of Cluster B.
>> >>
>> >> Use a librbd-aware tool to migrate the RBD volumes from Cluster B onto
>> >> the new Cluster A. qemu-img comes to mind. This would require downtime
>> >> for each volume, but not necessarily all at the same time.
>> > Thanks, qemu-img didn't come to mind as an RBD migration tool.
>> >
>> >> Migrate your S3 user accounts from Cluster B to the new Cluster A
>> >> (should be easily scriptable with e.g. JSON output from
>> >> radosgw-admin).
>> > It's fixed now, but it didn't use to be possible to create all the various
>> > keys.
>> >
>> >> Check for and resolve S3 bucket name conflicts between Cluster A and
>> >> Cluster B.
>> > None.
>> >
>> >> Migrate your S3 data from Cluster B to the new Cluster A using an
>> >> S3-level tool. s3cmd comes to mind.
>> > s3cmd does not preserve mtimes, ACLs or CORS data; that's the largest
>> > part of the concern.
>>
>> You need to set up a second rgw zone, and use the radosgw sync agent to
>> sync data to the secondary zone. That will preserve mtimes and ACLs.
>> Once that's complete you could then turn the secondary zone into your
>> primary.
>>
>> Yehuda
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-24 Thread Craig Lewis
Yehuda, are there any potential problems there?  I'm wondering whether duplicate
bucket names that don't have the same contents might cause problems.  Would
the second cluster be read-only while replication is running?

Robin, are the mtimes in Cluster B's S3 data important?  Just wondering if
it would be easier to move the data from B to A, and move nodes from B to A
as B shrinks.  Then remove the old A nodes when it's all done.


On Tue, Sep 23, 2014 at 10:04 PM, Yehuda Sadeh  wrote:

> On Tue, Sep 23, 2014 at 7:23 PM, Robin H. Johnson 
> wrote:
> > On Tue, Sep 23, 2014 at 03:12:53PM -0600, John Nielsen wrote:
> >> Keep Cluster A intact and migrate it to your new hardware. You can do
> >> this with no downtime, assuming you have enough IOPS to support data
> >> migration and normal usage simultaneously. Bring up the new OSDs and
> >> let everything rebalance, then remove the old OSDs one at a time.
> >> Replace the MONs one at a time. Since you will have the same data on
> >> the same cluster (but different hardware), you don't need to worry
> >> about mtimes or handling RBD or S3 data at all.
> > The B side already has data however, and that's one of the merge
> > problems (see below re S3).
> >
> >> Make sure you have top-level ceph credentials on the new cluster that
> >> will work for current users of Cluster B.
> >>
> >> Use a librbd-aware tool to migrate the RBD volumes from Cluster B onto
> >> the new Cluster A. qemu-img comes to mind. This would require downtime
> >> for each volume, but not necessarily all at the same time.
> > Thanks, qemu-img didn't come to mind as an RBD migration tool.
> >
> >> Migrate your S3 user accounts from Cluster B to the new Cluster A
> >> (should be easily scriptable with e.g. JSON output from
> >> radosgw-admin).
> > It's fixed now, but it didn't use to be possible to create all the various
> > keys.
> >
> >> Check for and resolve S3 bucket name conflicts between Cluster A and
> >> Cluster B.
> > None.
> >
> >> Migrate your S3 data from Cluster B to the new Cluster A using an
> >> S3-level tool. s3cmd comes to mind.
> > s3cmd does not preserve mtimes, ACLs or CORS data; that's the largest
> > part of the concern.
>
> You need to set up a second rgw zone, and use the radosgw sync agent to
> sync data to the secondary zone. That will preserve mtimes and ACLs.
> Once that's complete you could then turn the secondary zone into your
> primary.
>
> Yehuda
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-23 Thread Yehuda Sadeh
On Tue, Sep 23, 2014 at 7:23 PM, Robin H. Johnson  wrote:
> On Tue, Sep 23, 2014 at 03:12:53PM -0600, John Nielsen wrote:
>> Keep Cluster A intact and migrate it to your new hardware. You can do
>> this with no downtime, assuming you have enough IOPS to support data
>> migration and normal usage simultaneously. Bring up the new OSDs and
>> let everything rebalance, then remove the old OSDs one at a time.
>> Replace the MONs one at a time. Since you will have the same data on
>> the same cluster (but different hardware), you don't need to worry
>> about mtimes or handling RBD or S3 data at all.
> The B side already has data however, and that's one of the merge
> problems (see below re S3).
>
>> Make sure you have top-level ceph credentials on the new cluster that
>> will work for current users of Cluster B.
>>
>> Use a librbd-aware tool to migrate the RBD volumes from Cluster B onto
>> the new Cluster A. qemu-img comes to mind. This would require downtime
>> for each volume, but not necessarily all at the same time.
> Thanks, qemu-img didn't come to mind as an RBD migration tool.
>
>> Migrate your S3 user accounts from Cluster B to the new Cluster A
>> (should be easily scriptable with e.g. JSON output from
>> radosgw-admin).
> It's fixed now, but it didn't use to be possible to create all the various
> keys.
>
>> Check for and resolve S3 bucket name conflicts between Cluster A and
>> Cluster B.
> None.
>
>> Migrate your S3 data from Cluster B to the new Cluster A using an
>> S3-level tool. s3cmd comes to mind.
> s3cmd does not preserve mtimes, ACLs or CORS data; that's the largest
> part of the concern.

You need to set up a second rgw zone, and use the radosgw sync agent to
sync data to the secondary zone. That will preserve mtimes and ACLs.
Once that's complete you could then turn the secondary zone into your
primary.
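
For reference, the federated setup is roughly along these lines (untested
sketch; zone names, file names and the client.radosgw.* instances are
placeholders; see the federated configuration docs for the full procedure):

    # define a region listing both zones and their endpoints, with zone-a
    # (the existing cluster) as master; region.json is hand-edited
    radosgw-admin region set --infile region.json --name client.radosgw.zone-a
    radosgw-admin regionmap update --name client.radosgw.zone-a

    # per-zone pool and key configuration, from hand-edited JSON files
    radosgw-admin zone set --rgw-zone=zone-a --infile zone-a.json --name client.radosgw.zone-a
    radosgw-admin zone set --rgw-zone=zone-b --infile zone-b.json --name client.radosgw.zone-b

    # a system user with identical keys must exist in both zones for the agent
    radosgw-admin user create --uid=sync-user --display-name="Sync User" \
        --access-key=SYNC_ACCESS_KEY --secret=SYNC_SECRET_KEY --system \
        --name client.radosgw.zone-a
    radosgw-admin user create --uid=sync-user --display-name="Sync User" \
        --access-key=SYNC_ACCESS_KEY --secret=SYNC_SECRET_KEY --system \
        --name client.radosgw.zone-b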

Yehuda
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-23 Thread Robin H. Johnson
On Tue, Sep 23, 2014 at 03:12:53PM -0600, John Nielsen wrote:
> Keep Cluster A intact and migrate it to your new hardware. You can do
> this with no downtime, assuming you have enough IOPS to support data
> migration and normal usage simultaneously. Bring up the new OSDs and
> let everything rebalance, then remove the old OSDs one at a time.
> Replace the MONs one at a time. Since you will have the same data on
> the same cluster (but different hardware), you don't need to worry
> about mtimes or handling RBD or S3 data at all.
The B side already has data however, and that's one of the merge
problems (see below re S3).

> Make sure you have top-level ceph credentials on the new cluster that
> will work for current users of Cluster B.
> 
> Use a librbd-aware tool to migrate the RBD volumes from Cluster B onto
> the new Cluster A. qemu-img comes to mind. This would require downtime
> for each volume, but not necessarily all at the same time.
Thanks, qemu-img didn't come to mind as an RBD migration tool.

> Migrate your S3 user accounts from Cluster B to the new Cluster A
> (should be easily scriptable with e.g. JSON output from
> radosgw-admin).
It's fixed now, but it didn't use to be possible to create all the various
keys.

> Check for and resolve S3 bucket name conflicts between Cluster A and
> Cluster B.
None.

> Migrate your S3 data from Cluster B to the new Cluster A using an
> S3-level tool. s3cmd comes to mind.
s3cmd does not preserve mtimes, ACLs or CORS data; that's the largest
part of the concern.

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-23 Thread John Nielsen
I would:

Keep Cluster A intact and migrate it to your new hardware. You can do this with 
no downtime, assuming you have enough IOPS to support data migration and normal 
usage simultaneously. Bring up the new OSDs and let everything rebalance, then 
remove the old OSDs one at a time. Replace the MONs one at a time. Since you 
will have the same data on the same cluster (but different hardware), you don't 
need to worry about mtimes or handling RBD or S3 data at all.
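
For the OSD side, the drain-and-remove cycle for each old OSD is roughly
(untested sketch; osd IDs are placeholders, and you'd wait for the cluster to
return to active+clean between OSDs):

    ceph osd out 12                # let data drain off the old OSD
    # wait until "ceph -s" shows all PGs active+clean again
    ceph osd crush remove osd.12   # drop it from the CRUSH map
    ceph auth del osd.12           # remove its authentication key
    ceph osd rm 12                 # delete the OSD entry itself

Monitors can be swapped the same way, one at a time with "ceph mon add" /
"ceph mon remove", so quorum is never lost.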

Make sure you have top-level ceph credentials on the new cluster that will work 
for current users of Cluster B.

Use a librbd-aware tool to migrate the RBD volumes from Cluster B onto the new 
Cluster A. qemu-img comes to mind. This would require downtime for each volume, 
but not necessarily all at the same time.
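
For example (untested; pool, image and conf paths are placeholders), a
per-volume copy while the VM is stopped could look like:

    qemu-img convert -p -f raw -O raw \
        rbd:volumes/vm01:conf=/etc/ceph/cluster-b.conf \
        rbd:volumes/vm01:conf=/etc/ceph/ceph.conf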

Migrate your S3 user accounts from Cluster B to the new Cluster A (should be 
easily scriptable with e.g. JSON output from radosgw-admin).
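
A rough sketch of that (untested; assumes both clusters' conf files are
reachable from one admin host and that jq is available; verify on a test user
first, since the metadata round-trip should carry the access keys across):

    for uid in $(radosgw-admin -c /etc/ceph/cluster-b.conf metadata list user | jq -r '.[]'); do
        radosgw-admin -c /etc/ceph/cluster-b.conf metadata get user:"$uid" > "user-$uid.json"
        radosgw-admin -c /etc/ceph/ceph.conf metadata put user:"$uid" < "user-$uid.json"
    done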

Check for and resolve S3 bucket name conflicts between Cluster A and Cluster B.

Migrate your S3 data from Cluster B to the new Cluster A using an S3-level 
tool. s3cmd comes to mind.
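
s3cmd can't copy directly between two different endpoints, so the transfer
would stage through local disk; a rough per-bucket sketch (untested; bucket
names and config files are placeholders):

    s3cmd -c ~/.s3cfg-cluster-b sync s3://somebucket /tmp/somebucket/
    s3cmd -c ~/.s3cfg-cluster-a mb s3://somebucket
    s3cmd -c ~/.s3cfg-cluster-a sync /tmp/somebucket/ s3://somebucket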

Fine-tuning and automating the above is left as an exercise for the reader, but 
it should all be possible with built-in and/or commodity tools.

On Sep 20, 2014, at 11:15 PM, Robin H. Johnson  wrote:

> For a variety of reasons, none good anymore, we have two separate Ceph
> clusters.
> 
> I would like to merge them onto the newer hardware, with as little
> downtime and data loss as possible; then discard the old hardware.
> 
> Cluster A (2 hosts):
> - 3TB of S3 content, >100k files, file mtimes important
> - <500GB of RBD volumes, exported via iscsi
> 
> Cluster B (4 hosts):
> - <50GiB of S3 content
> - 7TB of RBD volumes, exported via iscsi
> 
> Short of finding somewhere to dump all of the data from one side, and
> re-importing it after merging with that cluster as empty; are there any
> other alternatives available to me?
> 
> -- 
> Robin Hugh Johnson
> Gentoo Linux: Developer, Infrastructure Lead
> E-Mail : robb...@gentoo.org
> GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-23 Thread Mikaël Cluseau

On 09/22/2014 05:17 AM, Robin H. Johnson wrote:

Can somebody else make comments about migrating S3 buckets with
preserved mtime data (and all of the ACLs & CORS) then?


I don't know how radosgw objects are stored, but have you considered a
lower-level rados export/import?


IMPORT AND EXPORT
   import [options] <local-directory> <rados-pool>
       Upload <local-directory> to <rados-pool>
   export [options] <rados-pool> <local-directory>
       Download <rados-pool> to <local-directory>
   options:
       -f / --force         Copy everything, even if it hasn't changed.
       -d / --delete-after  After synchronizing, delete unreferenced
                            files or objects from the target bucket
                            or directory.
       --workers            Number of worker threads to spawn
                            (default 5)
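
For example (untested; pool name and staging path are placeholders, and note
that this copies raw objects only, so radosgw bucket metadata and indexes
would not be merged by it):

    rados -c /etc/ceph/cluster-b.conf export --workers 5 .rgw.buckets /mnt/staging/rgw-buckets
    rados -c /etc/ceph/ceph.conf import --workers 5 /mnt/staging/rgw-buckets .rgw.buckets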

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-21 Thread Robin H. Johnson
On Sun, Sep 21, 2014 at 02:33:09PM +0900, Christian Balzer wrote:
> > For a variety of reasons, none good anymore, we have two separate Ceph
> > clusters.
> > 
> > I would like to merge them onto the newer hardware, with as little
> > downtime and data loss as possible; then discard the old hardware.
> > 
> > Cluster A (2 hosts):
> > - 3TB of S3 content, >100k files, file mtimes important
> > - <500GB of RBD volumes, exported via iscsi
> > 
> > Cluster B (4 hosts):
> > - <50GiB of S3 content
> > - 7TB of RBD volumes, exported via iscsi
> > 
> > Short of finding somewhere to dump all of the data from one side, and
> > re-importing it after merging with that cluster as empty; are there any
> > other alternatives available to me?
> > 
> 
> Having recently seen a similar question and the answer by the Ceph
> developers, no. 
> As in there is no way (and no plans) for merging clusters.
> 
> There are export functions for RBD volumes, not sure about S3 and the
> mtimes as I don't use that functionality. 
Can somebody else make comments about migrating S3 buckets with
preserved mtime data (and all of the ACLs & CORS) then?

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-20 Thread Christian Balzer
On Sun, 21 Sep 2014 05:15:32 + Robin H. Johnson wrote:

> For a variety of reasons, none good anymore, we have two separate Ceph
> clusters.
> 
> I would like to merge them onto the newer hardware, with as little
> downtime and data loss as possible; then discard the old hardware.
> 
> Cluster A (2 hosts):
> - 3TB of S3 content, >100k files, file mtimes important
> - <500GB of RBD volumes, exported via iscsi
> 
> Cluster B (4 hosts):
> - <50GiB of S3 content
> - 7TB of RBD volumes, exported via iscsi
> 
> Short of finding somewhere to dump all of the data from one side, and
> re-importing it after merging with that cluster as empty; are there any
> other alternatives available to me?
> 

Having recently seen a similar question and the answer by the Ceph
developers, no. 
As in there is no way (and no plans) for merging clusters.

There are export functions for RBD volumes, not sure about S3 and the
mtimes as I don't use that functionality. 
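
For the RBD side, something along these lines should work (untested; image
names and conf paths are placeholders), piping the image straight from one
cluster into the other without a staging file:

    rbd -c /etc/ceph/cluster-b.conf export volumes/vm01 - \
        | rbd -c /etc/ceph/ceph.conf import - volumes/vm01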

Christian
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-20 Thread Robin H. Johnson
For a variety of reasons, none good anymore, we have two separate Ceph
clusters.

I would like to merge them onto the newer hardware, with as little
downtime and data loss as possible; then discard the old hardware.

Cluster A (2 hosts):
- 3TB of S3 content, >100k files, file mtimes important
- <500GB of RBD volumes, exported via iscsi

Cluster B (4 hosts):
- <50GiB of S3 content
- 7TB of RBD volumes, exported via iscsi

Short of finding somewhere to dump all of the data from one side, and
re-importing it after merging with that cluster as empty; are there any
other alternatives available to me?

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com