Hi all,
I'd like to report some strange behavior we ran into.
Context: lab platform
Ceph Emperor
ceph-deploy 1.3.4
Ubuntu 12.04
Issue:
We have 3 OSDs up and running; we encountered no difficulties creating them.
We then tried to create osd.3 using ceph-deploy on a storage node (r-cephosd301)
from an admin server (r-cephrgw01).
We had to use an external 3 TB SATA disk; the journal is placed on the first
sectors of the disk.
We ran into a lot of problems but eventually succeeded.
Since we hit the same difficulties when creating osd.4 (r-cephosd302), I
decided to trace the process.
We had the following lines in ceph.conf (the journal size is set in the [osd]
section because it is not taken into account in the [osd.4] section):
[osd.4]
host = r-cephosd302
public_addr = 10.194.182.52
cluster_addr = 192.168.182.52
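In other words, the journal size only took effect when placed in the global [osd] section, along the lines of the fragment below (the value shown is a placeholder for illustration, not our actual setting):

```ini
# Sketch of the workaround: "osd journal size" was honored only here,
# not inside [osd.4]. The value is a placeholder, not our real one.
[osd]
osd journal size = 10000
```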
root@r-cephrgw01:/etc/ceph# ceph-deploy --overwrite-conf osd --zap-disk create
r-cephosd302:/dev/sdc
[ceph_deploy.cli][INFO ] Invoked (1.3.4): /usr/bin/ceph-deploy
--overwrite-conf osd --zap-disk create r-cephosd302:/dev/sdc
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks r-cephosd302:/dev/sdc:
[r-cephosd302][DEBUG ] connected to host: r-cephosd302
[r-cephosd302][DEBUG ] detect platform information from remote host
[r-cephosd302][DEBUG ] detect machine type
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 12.04 precise
[ceph_deploy.osd][DEBUG ] Deploying osd to r-cephosd302
[r-cephosd302][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[r-cephosd302][INFO ] Running command: udevadm trigger --subsystem-match=block
--action=add
[ceph_deploy.osd][DEBUG ] Preparing host r-cephosd302 disk /dev/sdc journal
None activate True
[r-cephosd302][INFO ] Running command: ceph-disk-prepare --zap-disk --fs-type
xfs --cluster ceph -- /dev/sdc
[r-cephosd302][WARNIN] Caution: invalid backup GPT header, but valid main header; regenerating backup header from main header.
[r-cephosd302][WARNIN]
[r-cephosd302][WARNIN] Warning! Main and backup partition tables differ! Use
the 'c' and 'e' options
[r-cephosd302][WARNIN] on the recovery & transformation menu to examine the two
tables.
[r-cephosd302][WARNIN]
[r-cephosd302][WARNIN] Warning! One or more CRCs don't match. You should repair
the disk!
[r-cephosd302][WARNIN]
[r-cephosd302][WARNIN] INFO:ceph-disk:Will colocate journal with data on
/dev/sdc
[r-cephosd302][DEBUG ]
****************************************************************************
[r-cephosd302][DEBUG ] Caution: Found protective or hybrid MBR and corrupt GPT.
Using GPT, but disk
[r-cephosd302][DEBUG ] verification and recovery are STRONGLY recommended.
[r-cephosd302][DEBUG ]
****************************************************************************
[r-cephosd302][DEBUG ] GPT data structures destroyed! You may now partition the
disk using fdisk or
[r-cephosd302][DEBUG ] other utilities.
[r-cephosd302][DEBUG ] The operation has completed successfully.
[r-cephosd302][DEBUG ] Information: Moved requested sector from 34 to 2048 in
[r-cephosd302][DEBUG ] order to align on 2048-sector boundaries.
[r-cephosd302][DEBUG ] The operation has completed successfully.
[r-cephosd302][DEBUG ] Information: Moved requested sector from 38912001 to
38914048 in
[r-cephosd302][DEBUG ] order to align on 2048-sector boundaries.
[r-cephosd302][DEBUG ] The operation has completed successfully.
[r-cephosd302][DEBUG ] meta-data=/dev/sdc1              isize=2048   agcount=4, agsize=181925597 blks
[r-cephosd302][DEBUG ]          =                       sectsz=512   attr=2, projid32bit=0
[r-cephosd302][DEBUG ] data     =                       bsize=4096   blocks=727702385, imaxpct=5
[r-cephosd302][DEBUG ]          =                       sunit=0      swidth=0 blks
[r-cephosd302][DEBUG ] naming   =version 2              bsize=4096   ascii-ci=0
[r-cephosd302][DEBUG ] log      =internal log           bsize=4096   blocks=355323, version=2
[r-cephosd302][DEBUG ]          =                       sectsz=512   sunit=0 blks, lazy-count=1
[r-cephosd302][DEBUG ] realtime =none                   extsz=4096   blocks=0, rtextents=0
[r-cephosd302][DEBUG ] The operation has completed successfully.
[r-cephosd302][INFO ] Running command: udevadm trigger --subsystem-match=block
--action=add
[ceph_deploy.osd][DEBUG ] Host r-cephosd302 is now ready for osd use.
The process seems to finish normally, but...
root@r-cephrgw01:/etc/ceph# ceph osd tree
# id weight type name up/down reweight
-1 4.06 root default
-2 0.45 host r-cephosd101
0 0.45 osd.0 up 1
-3 0.45 host r-cephosd102
1 0.45 osd.1 up 1
-4 0.45 host r-cephosd103
2 0.45 osd.2 up 1
-5 2.71 host r-cephosd301
3 2.71 osd.3 up 1
The OSD is not in the cluster, and according to the log files found on the
remote server, it seems that Ceph tried to create a new osd.0.
root@r-cephosd302:/var/lib/ceph/osd/ceph-4# ll /var/log/ceph
total 12
drwxr-xr-x 2 root root 4096 Jan 24 14:46 ./
drwxr-xr-x 11 root root 4096 Jan 24 13:27 ../
-rw-r--r-- 1 root root 2634 Jan 24 14:47 ceph-osd.0.log
-rw-r--r-- 1 root root 0 Jan 24 14:46 ceph-osd..log
So we took the following actions:
root@r-cephosd302:/var/lib/ceph/osd# mkdir ceph-4
root@r-cephosd302:/var/lib/ceph/osd# mount /dev/sdc1 ceph-4/
root@r-cephosd302:/var/lib/ceph/osd# cd ceph-4
root@r-cephosd302:/var/lib/ceph/osd/ceph-4# ll
total 20
drwxr-xr-x 2 root root 78 Jan 24 14:47 ./
drwxr-xr-x 3 root root 4096 Jan 24 14:49 ../
-rw-r--r-- 1 root root 37 Jan 24 14:47 ceph_fsid
-rw-r--r-- 1 root root 37 Jan 24 14:47 fsid
lrwxrwxrwx 1 root root 58 Jan 24 14:47 journal -> /dev/disk/by-partuuid/7a692463-9837-4297-a5e3-98dac12aaf70
-rw-r--r-- 1 root root 37 Jan 24 14:47 journal_uuid
-rw-r--r-- 1 root root 21 Jan 24 14:47 magic
Some files are missing, so we tried preparing the mounted directory:
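As a side note, a quick way to spot the prepared-but-not-activated state is to check the data dir for files that ceph-osd only writes at activation time (the keyring, for instance, is only created during the activate step per the log further down). This is just a sketch; 'whoami' and 'superblock' are assumptions about what a fully activated OSD directory contains:

```shell
# Sketch: list activation-time files that are absent from the mounted
# OSD data dir. The path is the one from our setup; override via $1.
osd_dir=${1:-/var/lib/ceph/osd/ceph-4}
for f in whoami keyring superblock; do
    [ -e "$osd_dir/$f" ] || echo "missing: $f"
done
```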
root@r-cephrgw01:/etc/ceph# ceph-deploy --overwrite-conf osd prepare
r-cephosd302:/var/lib/ceph/osd/ceph-4
[ceph_deploy.cli][INFO ] Invoked (1.3.4): /usr/bin/ceph-deploy
--overwrite-conf osd prepare r-cephosd302:/var/lib/ceph/osd/ceph-4
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks
r-cephosd302:/var/lib/ceph/osd/ceph-4:
[r-cephosd302][DEBUG ] connected to host: r-cephosd302
[r-cephosd302][DEBUG ] detect platform information from remote host
[r-cephosd302][DEBUG ] detect machine type
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 12.04 precise
[ceph_deploy.osd][DEBUG ] Deploying osd to r-cephosd302
[r-cephosd302][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[r-cephosd302][INFO ] Running command: udevadm trigger --subsystem-match=block
--action=add
[ceph_deploy.osd][DEBUG ] Preparing host r-cephosd302 disk
/var/lib/ceph/osd/ceph-4 journal None activate False
[r-cephosd302][INFO ] Running command: ceph-disk-prepare --fs-type xfs
--cluster ceph -- /var/lib/ceph/osd/ceph-4
[ceph_deploy.osd][DEBUG ] Host r-cephosd302 is now ready for osd use.
The new OSD is prepared, but when trying to activate it...
root@r-cephrgw01:/etc/ceph# ceph-deploy --overwrite-conf osd activate
r-cephosd302:/var/lib/ceph/osd/ceph-4
[ceph_deploy.cli][INFO ] Invoked (1.3.4): /usr/bin/ceph-deploy
--overwrite-conf osd activate r-cephosd302:/var/lib/ceph/osd/ceph-4
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks
r-cephosd302:/var/lib/ceph/osd/ceph-4:
[r-cephosd302][DEBUG ] connected to host: r-cephosd302
[r-cephosd302][DEBUG ] detect platform information from remote host
[r-cephosd302][DEBUG ] detect machine type
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 12.04 precise
[ceph_deploy.osd][DEBUG ] activating host r-cephosd302 disk
/var/lib/ceph/osd/ceph-4
[ceph_deploy.osd][DEBUG ] will use init type: upstart
[r-cephosd302][INFO ] Running command: ceph-disk-activate --mark-init upstart
--mount /var/lib/ceph/osd/ceph-4
[r-cephosd302][WARNIN] 2014-01-24 14:54:01.890234 7fe795693700 0 librados:
client.bootstrap-osd authentication error (1) Operation not permitted
[r-cephosd302][WARNIN] Error connecting to cluster: PermissionError
The bootstrap-osd/ceph.keyring is not correct...
So I updated it with the key created earlier.
root@r-cephosd302:/var/lib/ceph/osd/ceph-4# more
../../bootstrap-osd/ceph.keyring
[client.bootstrap-osd]
key = AQB0gN5SMIojBBAAGQwbLM1a+5ZdzfuYu91ZDg==
root@r-cephosd302:/var/lib/ceph/osd/ceph-4# vi
../../bootstrap-osd/ceph.keyring
[client.bootstrap-osd]
key = AQCrid5S6BSwORAAO4ch+GGGKhXW1BEVBHA2Bw==
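Rather than pasting the key by hand, the on-disk key can be compared against what the monitors actually hold (`ceph auth get-key client.bootstrap-osd`, run on a node with a working client.admin). A minimal sketch of extracting the key field from a keyring file; a temporary sample file stands in for the real path on the storage node:

```shell
# Sketch: pull the "key = ..." value out of a keyring file so it can be
# compared with `ceph auth get-key client.bootstrap-osd`. On the storage
# node the real path would be /var/lib/ceph/bootstrap-osd/ceph.keyring;
# a sample mirroring the format from the log is used here.
keyring=$(mktemp)
cat > "$keyring" <<'EOF'
[client.bootstrap-osd]
	key = AQB0gN5SMIojBBAAGQwbLM1a+5ZdzfuYu91ZDg==
EOF
ondisk_key=$(awk -F' = ' '/key/ {print $2}' "$keyring")
echo "$ondisk_key"
rm -f "$keyring"
```

If the keys differ, `ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring` should regenerate the file directly from the cluster instead of editing it with vi.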
root@r-cephrgw01:/etc/ceph# ceph-deploy --overwrite-conf osd activate
r-cephosd302:/var/lib/ceph/osd/ceph-4
[ceph_deploy.cli][INFO ] Invoked (1.3.4): /usr/bin/ceph-deploy
--overwrite-conf osd activate r-cephosd302:/var/lib/ceph/osd/ceph-4
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks
r-cephosd302:/var/lib/ceph/osd/ceph-4:
[r-cephosd302][DEBUG ] connected to host: r-cephosd302
[r-cephosd302][DEBUG ] detect platform information from remote host
[r-cephosd302][DEBUG ] detect machine type
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 12.04 precise
[ceph_deploy.osd][DEBUG ] activating host r-cephosd302 disk
/var/lib/ceph/osd/ceph-4
[ceph_deploy.osd][DEBUG ] will use init type: upstart
[r-cephosd302][INFO ] Running command: ceph-disk-activate --mark-init upstart
--mount /var/lib/ceph/osd/ceph-4
[r-cephosd302][WARNIN] got latest monmap
[r-cephosd302][WARNIN] 2014-01-24 14:59:12.889327 7f4f47f49780 -1 journal
read_header error decoding journal header
[r-cephosd302][WARNIN] 2014-01-24 14:59:13.051076 7f4f47f49780 -1
filestore(/var/lib/ceph/osd/ceph-4) could not find
23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
[r-cephosd302][WARNIN] 2014-01-24 14:59:13.220053 7f4f47f49780 -1 created
object store /var/lib/ceph/osd/ceph-4 journal /var/lib/ceph/osd/ceph-4/journal
for osd.4 fsid 632d789a-8560-469b-bf6a-8478e12d2cb6
[r-cephosd302][WARNIN] 2014-01-24 14:59:13.220135 7f4f47f49780 -1 auth: error
reading file: /var/lib/ceph/osd/ceph-4/keyring: can't open
/var/lib/ceph/osd/ceph-4/keyring: (2) No such file or directory
[r-cephosd302][WARNIN] 2014-01-24 14:59:13.220572 7f4f47f49780 -1 created new
key in keyring /var/lib/ceph/osd/ceph-4/keyring
[r-cephosd302][WARNIN] added key for osd.4
root@r-cephrgw01:/etc/ceph# ceph -s
cluster 632d789a-8560-469b-bf6a-8478e12d2cb6
health HEALTH_OK
monmap e3: 3 mons at
{r-cephosd101=10.194.182.41:6789/0,r-cephosd102=10.194.182.42:6789/0,r-cephosd103=10.194.182.43:6789/0},
election epoch 6, quorum 0,1,2 r-cephosd101,r-cephosd102,r-cephosd103
osdmap e37: 5 osds: 5 up, 5 in
pgmap v240: 192 pgs, 3 pools, 0 bytes data, 0 objects
139 MB used, 4146 GB / 4146 GB avail
192 active+clean
root@r-cephrgw01:/etc/ceph# ceph osd tree
# id weight type name up/down reweight
-1 6.77 root default
-2 0.45 host r-cephosd101
0 0.45 osd.0 up 1
-3 0.45 host r-cephosd102
1 0.45 osd.1 up 1
-4 0.45 host r-cephosd103
2 0.45 osd.2 up 1
-5 2.71 host r-cephosd301
3 2.71 osd.3 up 1
-6 2.71 host r-cephosd302
4 2.71 osd.4 up 1
Now the new OSD is up...
I don't understand where the problem lies.
Why isn't the "osd journal size" in the osd.# section taken into account?
Why does ceph try to recreate osd.0?
Why does ceph-deploy indicate that the osd is ready for use?
Why doesn't ceph-deploy create all the files?
Why is the bootstrap-osd keyring not correct?
Thanks
- - - - - - - - - - - - - - - - -
Ghislain Chevalier
FT/OLNC/OLPS/ASE/DAPI/CSE
Storage Service Architect
+33299124432
[email protected]<mailto:[email protected]>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com