Re: [ceph-users] How to restore a Ceph cluster from its cluster map?

2014-10-08 Thread Marco Garcês
I'm in on this thread.


Marco Garcês
#sysadmin
Maputo - Mozambique
[Phone] +258 84 4105579
[Skype] marcogarces

On Wed, Oct 8, 2014 at 11:00 AM, Aegeaner xih...@gmail.com wrote:


 Hi all!

 For production use, I want to run two Ceph clusters at the same time.
 One is the master cluster, and the other is the replication cluster,
 which syncs RBD snapshots from the master cluster at a fixed interval
 (e.g. every day), in the way this article describes:
 http://ceph.com/dev-notes/incremental-snapshots-with-rbd/ . In case the
 master cluster goes down (I mean, some problem with Ceph takes the
 whole cluster down), I can switch from the master cluster to the
 slave cluster.

 Now the question is: if the master cluster is down, and I have backed
 up all the metadata beforehand (the monitor map, the OSD map, the PG
 map, the CRUSH map), how can I restore the master Ceph cluster from
 these cluster maps? Is there a tool or an established way to do it?

 Thanks!

 ===
 Aegeaner







Re: [ceph-users] How to restore a Ceph cluster from its cluster map?

2014-10-08 Thread Wido den Hollander
On 10/08/2014 11:00 AM, Aegeaner wrote:
 
 Hi all!
 
 For production use, I want to run two Ceph clusters at the same time.
 One is the master cluster, and the other is the replication cluster,
 which syncs RBD snapshots from the master cluster at a fixed interval
 (e.g. every day), in the way this article describes:
 http://ceph.com/dev-notes/incremental-snapshots-with-rbd/ . In case the
 master cluster goes down (I mean, some problem with Ceph takes the
 whole cluster down), I can switch from the master cluster to the
 slave cluster.
 

Ok, but there will be a sync gap between the master and slave cluster,
since the RBD replication does not happen in real time; thus you will
lose some data if the master cluster 'burns down'.
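
For reference, the incremental scheme from that article boils down to
rbd export-diff / import-diff. A rough Python sketch of the daily
cycle; the cluster names "master"/"backup", the image "rbd/vm1" and
the /tmp path are placeholders, and it assumes /etc/ceph/master.conf
and /etc/ceph/backup.conf exist:

#!/usr/bin/env python
# Sketch of the daily export-diff / import-diff cycle. Assumes the
# image and yesterday's snapshot already exist on both clusters.
import subprocess
from datetime import date, timedelta

IMAGE = "rbd/vm1"                                  # placeholder
today = date.today().isoformat()
yesterday = (date.today() - timedelta(days=1)).isoformat()

def run(*cmd):
    subprocess.check_call(list(cmd))

# 1. snapshot today's state on the master cluster
run("rbd", "--cluster", "master", "snap", "create",
    "%s@%s" % (IMAGE, today))

# 2. export only the blocks that changed since yesterday's snapshot
run("rbd", "--cluster", "master", "export-diff",
    "--from-snap", yesterday, "%s@%s" % (IMAGE, today), "/tmp/vm1.diff")

# 3. replay the diff against the backup cluster's copy of the image
run("rbd", "--cluster", "backup", "import-diff", "/tmp/vm1.diff", IMAGE)

Anything written after the day's snapshot only reaches the backup with
the next cycle, which is exactly that gap.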

 Now the question is: if the master cluster is down, and I have backed
 up all the metadata beforehand (the monitor map, the OSD map, the PG
 map, the CRUSH map), how can I restore the master Ceph cluster from
 these cluster maps? Is there a tool or an established way to do it?
 

So explain 'down'. Due to what?

In theory it is probably possible to bring a cluster back to life if it
has become corrupted, but on a large deployment there will be a lot of
PGMap and OSDMap changes in a very short period of time.

You will *never* get a consistent snapshot of the whole cluster at a
specific point in time.
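
You can watch this yourself: the osdmap epoch on a busy cluster moves
constantly. A quick sketch, assuming the ceph CLI and an admin keyring
on the node:

# Sample the osdmap epoch twice, one minute apart, to see how fast
# the map churns. "ceph osd dump -f json" reports the current epoch.
import json, subprocess, time

def osdmap_epoch():
    out = subprocess.check_output(["ceph", "osd", "dump", "-f", "json"])
    return json.loads(out)["epoch"]

first = osdmap_epoch()
time.sleep(60)
print("osdmap advanced %d epochs in one minute" % (osdmap_epoch() - first))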

But the question still stands: explain 'down'. What does it mean in your
case?

You could lose all your monitors at the same time. They can probably be
fixed with a backup of those maps, but I think it comes down to calling
Sage and pulling out your credit card.
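
To be clear about what such a recovery would even involve: the
mechanical part might look roughly like the untested outline below
(the mon id "a", the /backup paths and the saved keyring are all
assumptions). The real difficulty is that a rebuilt store has none of
the accumulated OSDMap history, which is where the phone call comes in.

# Untested outline of recreating a monitor from a saved monmap.
import subprocess

# sanity-check the backup first
subprocess.check_call(["monmaptool", "--print", "/backup/monmap.bin"])

# build a fresh mon store from the saved monmap and keyring
subprocess.check_call(["ceph-mon", "-i", "a", "--mkfs",
                       "--monmap", "/backup/monmap.bin",
                       "--keyring", "/backup/mon.keyring"])
# then start the mon; the hard part -- reconciling OSDs that are
# ahead of the rebuilt maps -- is not covered by any of this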

 Thanks!
 
 ===
 Aegeaner
 
 
 
 
 


-- 
Wido den Hollander
Ceph consultant and trainer
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on


Re: [ceph-users] How to restore a Ceph cluster from its cluster map?

2014-10-08 Thread Aegeaner

Thanks Wido,
  When I say a Ceph cluster is down, I mean something is wrong with 
the Ceph software: someone mistakenly changed the configuration file, 
making the conf inconsistent across nodes, e.g. a wrong fsid, an 
inconsistent OSD/host mapping, etc. I'm not talking about OSD 
failures, because I know Ceph can recover from those, nor about data 
errors, since Ceph does data scrubbing. I just mean failures caused by 
wrong manual configuration or by bugs in the Ceph software. I'm not 
sure how often this happens, but it did happen several times. This is 
why I use a replication cluster to back up the master cluster's data 
in production. Is there a solution or a better way to handle this?
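
For instance, the kind of check I would like to have in place compares
the fsid in each node's ceph.conf with what the monitors report; a
small sketch, assuming a working ceph CLI and a readable
/etc/ceph/ceph.conf on every node:

# Run on every node: flag a ceph.conf whose fsid disagrees with the
# cluster. Catches the "wrong fsid" kind of misconfiguration early.
import subprocess
from configparser import ConfigParser

cluster_fsid = subprocess.check_output(["ceph", "fsid"]).decode().strip()

conf = ConfigParser(strict=False)
conf.read("/etc/ceph/ceph.conf")
local_fsid = conf.get("global", "fsid", fallback="<missing>")

if local_fsid != cluster_fsid:
    print("MISMATCH: ceph.conf says %s, cluster says %s"
          % (local_fsid, cluster_fsid))
else:
    print("fsid consistent: %s" % cluster_fsid)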


==
 Aegeaner



Re: [ceph-users] How to restore a Ceph cluster from its cluster map?

2014-10-08 Thread Craig Lewis
I asked a similar question before, about backing up maps:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-August/022798.html

The short answer is you can't.  There are maps that you can't dump, so you
don't have the ability to make a complete snapshot of the cluster.
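
For completeness, the parts you can grab are the mon, OSD and CRUSH
maps plus a plain-text pg dump; a cron-style sketch (the backup
directory is a placeholder):

# Dump the retrievable maps. This is exactly the partial picture
# described above, not a consistent cluster snapshot.
import os
import subprocess

BACKUPS = "/var/backups/ceph-maps"   # placeholder
os.makedirs(BACKUPS, exist_ok=True)

for cmd, name in [
    (["ceph", "mon", "getmap"],      "monmap.bin"),
    (["ceph", "osd", "getmap"],      "osdmap.bin"),
    (["ceph", "osd", "getcrushmap"], "crushmap.bin"),
]:
    subprocess.check_call(cmd + ["-o", os.path.join(BACKUPS, name)])

# the pg map has no binary "get"; keep the dump for reference
with open(os.path.join(BACKUPS, "pgmap.txt"), "wb") as f:
    f.write(subprocess.check_output(["ceph", "pg", "dump"]))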

The long answer is that you don't want to.  If you have something so
catastrophic that you need to roll back your cluster state, the data very
likely isn't where the rollback thinks it is.  And the information the
cluster needs to find it is in the state that you just rolled back.


On the plus side, Ceph is remarkably good about dealing with changes.  If
you push a bad configuration option that causes data to start moving all
over the cluster, just push the old config again.  Ceph will stop trying to
move data that hasn't moved yet, and will reconcile the small bit of data
that did move.
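
A concrete example: if the bad change was a CRUSH map edit, saving the
map before you touch it makes the rollback a one-liner (the /tmp path
is made up):

# Save the CRUSH map before experimenting...
import subprocess
subprocess.check_call(["ceph", "osd", "getcrushmap",
                       "-o", "/tmp/crush.before"])

# ...and if the edit sets data migrating everywhere, put it back.
# Ceph stops moving what hasn't moved yet and reconciles the rest.
subprocess.check_call(["ceph", "osd", "setcrushmap",
                       "-i", "/tmp/crush.before"])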

You don't have to worry too much about ceph.conf being out of sync across
the nodes.  The really important bits are controlled by the primary
monitor, and the other monitors use those values once they're in quorum.


Not much will protect you from Ceph bugs, though.  The best bet here is
multiple clusters.  While I thought they would experience the same bugs at
the same time, my experience is that replication changes the usage patterns
enough that they don't.  My primary cluster has always been healthy, but my
secondary cluster has had several extended outages.  Anecdotal, but it
works for me.  :-)