Hello everyone, I am testing ceph-deploy on CentOS 6.3 and I am getting errors.
I have a simple one-node setup as follows:
OS: CentOS 6.3
kernel 3.5 and also kernel 2.6.32-279.el6.x86_64
Journal partition size=2GB
/dev/sdb label=gpt
selinux=OFF
iptables=OFF
NUMBER OF OSD=2
Test 1:
ceph-deploy new gclient158
ceph-deploy mon create gclient158
ceph-deploy disk zap gclient158:/dev/sdc
ceph-deploy disk zap gclient158:/dev/sdd
ceph-deploy gatherkeys gclient158
ceph-deploy mds create gclient158
ceph-deploy osd prepare gclient158:sdc:/dev/sdb1
ceph-deploy osd prepare gclient158:sdd:/dev/sdb2
ceph-deploy osd activate gclient158:/dev/sdc:/dev/sdb1
ceph-deploy osd activate gclient158:/dev/sdd:/dev/sdb2
The resulting state after the above ceph-deploy commands is shown below. The
two OSDs are running, but "ceph health" never shows HEALTH_OK; it stays in
HEALTH_WARN forever and is degraded. By the way, /var/log/ceph/ceph-osd.0.log
and /var/log/ceph/ceph-osd.1.log contain no real errors. This behavior is the
same for kernel 3.5 and 2.6.32-279.el6.x86_64. What am I missing?
[root@gclient158 ~]# ps -elf|grep ceph
5 S root 3124 1 0 80 0 - 40727 futex_ 10:49 ? 00:00:00
/usr/bin/ceph-mon -i gclient158 --pid-file /var/run/ceph/mon.gclient158.pid -c
/etc/ceph/ceph.conf
5 S root 3472 1 0 80 0 - 41194 futex_ 10:49 ? 00:00:00
/usr/bin/ceph-mds -i gclient158 --pid-file /var/run/ceph/mds.gclient158.pid -c
/etc/ceph/ceph.conf
5 S root 4035 1 1 78 -2 - 115119 futex_ 10:50 ? 00:00:00
/usr/bin/ceph-osd -i 0 --pid-file /var/run/ceph/osd.0.pid -c /etc/ceph/ceph.conf
5 S root 4769 1 0 78 -2 - 112304 futex_ 10:50 ? 00:00:00
/usr/bin/ceph-osd -i 1 --pid-file /var/run/ceph/osd.1.pid -c /etc/ceph/ceph.conf
0 S root 5025 2710 0 80 0 - 25811 pipe_w 10:50 pts/0 00:00:00 grep
ceph
[root@gclient158 ~]# ceph osd tree
# id weight type name up/down reweight
-1 0.14 root default
-2 0.14 host gclient158
0 0.06999 osd.0 up 1
1 0.06999 osd.1 up 1
[root@gclient158 ~]# ceph health
HEALTH_WARN 91 pgs degraded; 192 pgs stuck unclean; recovery 9/42 degraded
(21.429%); recovering 2 o/s, 1492B/s
[root@gclient158 ~]# ceph health
HEALTH_WARN 91 pgs degraded; 192 pgs stuck unclean; recovery 9/42 degraded
(21.429%); recovering 2 o/s, 1492B/s
[root@gclient158 ~]# ceph health
HEALTH_WARN 91 pgs degraded; 192 pgs stuck unclean; recovery 9/42 degraded
(21.429%); recovering 2 o/s, 1492B/s
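One thing I am wondering about, but have not verified: since both OSDs are on a
single host, the default CRUSH rule (which places replicas on separate hosts)
might be what keeps the PGs degraded. If that is the cause, I assume I would
need something like this in /etc/ceph/ceph.conf before creating the OSDs:

[global]
    osd crush chooseleaf type = 0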
As mentioned, /var/log/ceph/ceph-osd.0.log and /var/log/ceph/ceph-osd.1.log
contain no real errors, but one thing that does happen after the osd prepare
and activate commands is the traceback below:
Traceback (most recent call last):
File "/usr/sbin/ceph-deploy", line 8, in <module>
load_entry_point('ceph-deploy==0.1', 'console_scripts', 'ceph-deploy')()
File "/root/ceph-deploy/ceph_deploy/cli.py", line 112, in main
return args.func(args)
File "/root/ceph-deploy/ceph_deploy/osd.py", line 426, in osd
prepare(args, cfg, activate_prepared_disk=False)
File "/root/ceph-deploy/ceph_deploy/osd.py", line 273, in prepare
s = '{} returned {}\n{}\n{}'.format(cmd, ret, out, err)
ValueError: zero length field name in format
The above error probably has to do with the journal device; I got the same
error with the journal device labeled gpt and also labeled msdos.
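As an aside on the traceback itself: I believe the ValueError is Python 2.6
(the stock python on CentOS 6.3) refusing the empty {} field names that osd.py
uses in str.format(), which would mean the real error from the prepare command
is being masked rather than reported. A quick way to see it with the system
python (my own check, not taken from the logs):

python -c "print '{} returned {}'.format('cmd', 0)"    # ValueError on 2.6
python -c "print '{0} returned {1}'.format('cmd', 0)"  # prints fine on 2.6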
Please, what am I missing here, and why does the cluster never reach HEALTH_OK?
Test 2: the setup for this test is the same as above, except I used the same
disk for both the ceph data and the journal, as follows:
ceph-deploy osd prepare gclient158:/dev/sdc
ceph-deploy osd prepare gclient158:/dev/sdd
ceph-deploy osd activate gclient158:/dev/sdc
ceph-deploy osd activate gclient158:/dev/sdd
For test 2, I do not get the error from test 1, but the OSDs fail to start and
both OSD log files contain this error:
2013-05-21 11:54:24.806747 7f26cfa26780 -1 journal check: ondisk fsid
00000000-0000-0000-0000-000000000000 doesn't match expected
942af534-ccc0-4843-8598-79420592317a, invalid (someone else's?) journal
2013-05-21 11:54:24.806784 7f26cfa26780 -1
filestore(/var/lib/ceph/tmp/mnt.3YsEmH) mkjournal error creating journal on
/var/lib/ceph/tmp/mnt.3YsEmH/journal: (22) Invalid argument
2013-05-21 11:54:24.806802 7f26cfa26780 -1 OSD::mkfs: FileStore::mkfs failed
with error -22
2013-05-21 11:54:24.806838 7f26cfa26780 -1 ** ERROR: error creating
empty object store in /var/lib/ceph/tmp/mnt.3YsEmH: (22) Invalid argument
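For what it's worth, I assume the first thing to check is what ceph-disk
actually created on the data disks; a non-destructive look at the partition
tables (commands from my own notes, not from the logs):

sgdisk --print /dev/sdc
sgdisk --print /dev/sdd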
What am I missing? Any suggestions on either test case would be appreciated.
Thank you.
Isaac