[ceph-users] Replacing a failed OSD disk drive (or replace XFS with BTRFS)

2015-03-21 Thread Datatone Lists
I have been experimenting with Ceph, and have some OSDs with drives
containing XFS filesystems which I want to change to BTRFS.
(I started with BTRFS, then started again from scratch with XFS
[currently recommended] in order to eliminate it as a potential cause
of some issues; now, with further experience, I want to go back to
BTRFS, but I have data in my cluster and I don't want to scrap it.)

This is exactly equivalent to the case in which I have an OSD with a
drive that I can see is starting to fail. In that case I would need to
replace the drive and recreate the Ceph structures on it.

So, I mark the OSD out, and the cluster automatically eliminates its
notion of data stored on the OSD and creates copies of the affected PGs
elsewhere to make the cluster healthy again.
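
Concretely, all I am doing at this stage is the single command below,
then watching until recovery completes ({osd-num} being the id of the
affected OSD):

  ceph osd out {osd-num}
  ceph -w     # watch until the cluster reports HEALTH_OK again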

All of the disk replacement instructions that I have seen then tell me
to follow an OSD removal process:

This procedure removes an OSD from a cluster map, removes its
authentication key, removes the OSD from the OSD map, and removes the
OSD from the ceph.conf file.
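
As I understand the docs, that procedure amounts to roughly this
sequence (with {osd-num} being the id in question), plus deleting the
osd.{osd-num} entry from ceph.conf:

  ceph osd crush remove osd.{osd-num}
  ceph auth del osd.{osd-num}
  ceph osd rm {osd-num}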

This seems too heavy-handed to me. I'm worried about doing this and
then effectively adding a new OSD that just happens to have the same
id number as the OSD I apparently removed unnecessarily.

I don't actually want to remove the OSD. The OSD is fine, I just want
to replace the disk drive that it uses.

This suggests that what I really want is to take the OSD out, allow
the cluster to become healthy again, replace the disk (if this is due
to a failure), create a new BTRFS/XFS filesystem, remount the drive,
and then recreate the Ceph structures on the new disk so that they are
compatible with the original OSD it was attached to.
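
In other words, something along these lines is what I have in mind. It
is entirely untested on my part; /dev/sdX and {osd-num} are
placeholders, and I am assuming the existing key can simply be exported
back into the new data directory rather than generating a new one:

  # after marking the OSD out and stopping the ceph-osd daemon
  mkfs.btrfs /dev/sdX
  mount -o noatime /dev/sdX /var/lib/ceph/osd/ceph-{osd-num}
  ceph-osd -i {osd-num} --mkfs      # rebuild the OSD's on-disk structures
  ceph auth get osd.{osd-num} -o /var/lib/ceph/osd/ceph-{osd-num}/keyring
  ceph-osd -i {osd-num}             # start the daemon again
  ceph osd in {osd-num}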

The OSD then gets marked back in, and the cluster says hello again, we
missed you, but it's good to see you back, here are some PGs ...

What I'm saying is that I really don't want to destroy the OSD, I want
to refresh it with a new disk/filesystem and put it back to work.

Is there some fundamental reason why this can't be done? If not, how
should I do it?

Best regards,
David

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph User Teething Problems

2015-03-05 Thread Datatone Lists
 that I can increase the size of
pools in line with increasing osd numbers. I felt that this had to be
the case, otherwise the 'scalable' claim becomes a bit limited.
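
For anyone else reading the archive: the commands I understand will do
this, though I have not yet needed them myself, are of the form

  ceph osd pool set {pool-name} pg_num {new-pg-num}
  ceph osd pool set {pool-name} pgp_num {new-pg-num}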

Returning from these digressions to my own experience: I set up my
cephfs file system as illuminated by John Spray. I mounted it and
started to rsync a multi-terabyte filesystem to it. This is my test: if
cephfs handles this without grinding to a snail's pace or failing, I
will be ready to start committing my data to it. My osd disk lights
started to flash and flicker, and a comforting sound of drive activity
issued forth. I checked the osd logs and, to my dismay, there were
crash reports in them all. However, a closer look revealed that I am
getting the 'too many open files' messages that precede the failures.

I can see that this is not an osd failure, but a resource limit issue.

I completely acknowledge that I must now RTFM, but I will ask whether
anybody can give any guidance, based on experience, with respect to
this issue.
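
For reference, the knobs I am planning to look at first are the
file-descriptor limits, along these lines (the value is only a guess on
my part):

  # ceph.conf, in [global] or [osd]
  max open files = 131072

  # and/or the limit in the shell/init script that starts the daemons
  ulimit -n 131072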

Thank you again to all for the previous prompt and invaluable advice
and information.

David



[ceph-users] Ceph User Teething Problems

2015-03-04 Thread Datatone Lists
I have been following ceph for a long time. I have yet to put it into
service, and I keep coming back as btrfs improves and ceph reaches
higher version numbers.

I am now trying ceph 0.93 and kernel 4.0-rc1.

Q1) Is it still considered that btrfs is not robust enough, and that
xfs should be used instead? [I am trying with btrfs].

I followed the manual deployment instructions on the web site 
(http://ceph.com/docs/master/install/manual-deployment/) and I managed
to get a monitor and several osds running and apparently working. The
instructions fizzle out without explaining how to set up mds. I went
back to mkcephfs and got things set up that way. The mds starts.
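
For reference, the manual mds setup I had been trying to piece together
before falling back to mkcephfs looked roughly like this; the id 'a'
and the capability strings are my guesses from the keyring examples
elsewhere in the docs:

  mkdir -p /var/lib/ceph/mds/ceph-a
  ceph auth get-or-create mds.a mon 'allow profile mds' osd 'allow rwx' \
      mds 'allow' -o /var/lib/ceph/mds/ceph-a/keyring
  ceph-mds -i a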

[Please don't mention ceph-deploy]

The first thing that I noticed is that, whether I set up the mon and
osds by following the manual deployment or by using mkcephfs, the
expected default pools were not created.

bash-4.3# ceph osd lspools
0 rbd,
bash-4.3# 

I get only 'rbd' created automatically. I deleted this pool and
re-created data, metadata and rbd manually. When doing this, I had to
juggle with the pg-num in order to avoid the 'too many PGs per OSD'
warning. I have three osds running at the moment, but intend to add to
these when I have some experience of things working reliably. I am
puzzled, because I seem to have to set the pg-num for each pool to a
number that makes (N-pools x pg-num)/N-osds come to the right kind of
number. So this implies that I can't really expand a set of pools by
adding osds at a later date.
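
For reference, the sizing rule I have been applying, and the way I
re-created the pools, was roughly as follows; the target of about 100
PGs per OSD across all pools is my reading of the guidance, and the
counts here are illustrative rather than exactly what I used:

  # total PGs ~= (number of osds x 100) / replica count, shared across pools
  # 3 osds, 3 replicas, 3 pools  =>  roughly 32 PGs per pool
  ceph osd pool create data 32 32
  ceph osd pool create metadata 32 32
  ceph osd pool create rbd 32 32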

Q2) Is there any obvious reason why my default pools are not getting
created automatically as expected?

Q3) Can pg-num be modified for a pool later? (If the number of osds is 
increased dramatically).

Finally, when I try to mount cephfs, I get a mount 5 error.

A mount 5 error typically occurs if an MDS server is laggy or if it
crashed. Ensure that at least one MDS is up and running, and that the
cluster is active + healthy.
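
(The mount I am attempting is the usual form from the docs, roughly

  mount -t ceph {mon-ip}:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret

with my monitor address and keyring path substituted.)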

My mds is running, but its log is not terribly active:

2015-03-04 17:47:43.177349 7f42da2c47c0  0 ceph version 0.93 
(bebf8e9a830d998eeaab55f86bb256d4360dd3c4), process ceph-mds, pid 4110
2015-03-04 17:47:43.182716 7f42da2c47c0 -1 mds.-1.0 log_to_monitors 
{default=true}

(This is all there is in the log).

I think that a key indicator of the problem must be this from the
monitor log:

2015-03-04 16:53:20.715132 7f3cd0014700  1
mon.ceph-mon-00@0(leader).mds e1 warning, MDS mds.?
[2001:8b0::5fb3::1fff::9054]:6800/4036 up but filesystem
disabled

(I have obscured parts of my IP address in the line above.)

Q4) Can you give me an idea of what is wrong that causes the mds to not
play properly?
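
My own guess, which I have not yet been able to test, is that the mds
is idle because no filesystem has been defined on top of the pools. If
that is right, the missing step may be something like

  ceph fs new cephfs metadata data

using the metadata and data pools I created above.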

I think that there are some typos on the manual deployment pages, for
example:

ceph-osd id={osd-num}

This is not right. As far as I am aware it should be:

ceph-osd -i {osd-num}

An observation: in principle, setting things up manually is not all
that complicated, provided that clear and unambiguous instructions are
provided. This simple piece of documentation is very important. My view
is that the existing manual deployment instructions get a bit confused
and confusing when they reach the osd setup, and the mds setup is
completely absent.

For someone who knows, reviewing and revising this part of the
documentation would be a fairly simple and quick operation. I suspect
that this part suffers from being really obvious stuff to the well
initiated. For those of us closer to the start, these are the ends of
the threads that have to be picked up before the journey can be made.

Very best regards,
David