Hi all,
hope you are all doing well. Maybe some of you can help me with a problem
I have been working on recently.
I started evaluating Ceph a couple of months ago and now have a very strange
problem while formatting RBD images.
The problem only occurs when using RBD images directly with the kernel rbd
module loaded.
If I export the RBD image as an iSCSI device via one of our iSCSI gateways
(tgt), formatting works fine and I can afterwards use the image on any host
without problems.
That is a workaround, but I would like to find out why it does not work with
rbd directly...
Problem explained:
I create a pool and an image:
ceph osd pool create pool-C 250 250
rbd create test --size 3548290 --pool pool-C --image-feature layering
I map the rbd image on a client (it does not matter which client):
rbd map pool-C/test --id admin --keyring /etc/ceph/ceph.client.admin.keyring
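For reference, this is how I verify the mapping afterwards (nothing unusual shows up here; the image name and pool are from my setup):

```shell
# List kernel-mapped RBD images and the /dev/rbdX device each one got
rbd showmapped

# Double-check that only the 'layering' feature is enabled,
# since the kernel client rejects images with newer features
rbd info pool-C/test --id admin
```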
As soon as I start to format the image (xfs or ext4), some of my OSDs start to
fail:
mkfs.xfs /dev/rbd/pool-C/test-part1
I see the following entries as soon as I start formatting. The OSD IDs are
different each time; I suspect it is actually a problem with the journals.
Example:
2017-04-20 13:43:24.529953 osd.1 [WRN] slow request 30.439722 seconds old,
received at 2017-04-20 13:42:54.090001: osd_op(client.344964.1:8170 9.cbb68aa5
rbd_data.540dc238e1f29.0000000000001e7c [delete] snapc 0=[] ondisk+write e2002)
currently started
2017-04-20 13:43:24.529984 osd.1 [WRN] slow request 30.431389 seconds old,
received at 2017-04-20 13:42:54.098334: osd_op(client.344964.1:8489 9.f414989
rbd_data.540dc238e1f29.0000000000001fbb [delete] snapc 0=[] ondisk+write e2002)
currently started
2017-04-20 13:42:50.870651 mon.0 [INF] osd.10 172.10.10.2:6804/15031 failed
(forced)
2017-04-20 13:42:51.989500 mon.0 [INF] osd.11 172.10.10.2:6806/15690 failed
(forced)
I found out that some of the journal SSDs always disappear when formatting
starts, and therefore the OSDs using those journals fail too.
That is really strange to me.
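When it happens, I check the kernel log and the SSDs on the affected OSD host; this is roughly what I run (sda-sdd are the journal SSDs on my hosts, adjust device names as needed):

```shell
# Look for link resets or I/O errors on the journal SSDs around the failure time
dmesg -T | grep -iE 'sd[abcd]|reset|i/o error'

# Check SMART health and attributes of a journal SSD (requires smartmontools)
smartctl -H -A /dev/sda
```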
Any benchmark works fine, and if the RBD image is formatted via iSCSI I can
use it without any problems.
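Since the slow requests in the log are all [delete] ops, one thing I still want to rule out is the discard pass that mkfs issues at the start of formatting; assuming that is the trigger, formatting with discard disabled should behave differently:

```shell
# XFS: -K skips the initial block discard
mkfs.xfs -K /dev/rbd/pool-C/test-part1

# ext4: the nodiscard extended option has the same effect
mkfs.ext4 -E nodiscard /dev/rbd/pool-C/test-part1
```

If the OSDs stay up with these options, that would point at the mass of object deletes generated by discard rather than at the formatting writes themselves.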
My environment:
ceph-11.2.0-0.el7.x86_64 on CentOS 7.3
3 Monitor Hosts
2 OSD Hosts:
2x Intel(R) Xeon(R) CPU E5-2630L v4 (HT on, C1-state)
60 GB Memory
1 GE Ethernet (internal)
1 GE Ethernet (external)
4 x SSD (Intel SSDSC2BB480G401)
8 x 1.1 TB SAS3 (XFS)
All disks connected to a HBA, no RAID arrays: PMC Adaptec HBA 1000-8i8e
[agpceph02][DEBUG ] /dev/sda :
[agpceph02][DEBUG ] /dev/sda1 ceph journal, for /dev/sde1
[agpceph02][DEBUG ] /dev/sda2 ceph journal, for /dev/sdf1
[agpceph02][DEBUG ] /dev/sdb :
[agpceph02][DEBUG ] /dev/sdb1 ceph journal, for /dev/sdg1
[agpceph02][DEBUG ] /dev/sdb2 ceph journal, for /dev/sdh1
[agpceph02][DEBUG ] /dev/sdc :
[agpceph02][DEBUG ] /dev/sdc1 ceph journal, for /dev/sdi1
[agpceph02][DEBUG ] /dev/sdc2 ceph journal, for /dev/sdj1
[agpceph02][DEBUG ] /dev/sdd :
[agpceph02][DEBUG ] /dev/sdd1 ceph journal, for /dev/sdk1
[agpceph02][DEBUG ] /dev/sdd2 ceph journal, for /dev/sdl1
[agpceph02][DEBUG ] /dev/sde :
[agpceph02][DEBUG ] /dev/sde1 ceph data, active, cluster ceph, osd.8, journal
/dev/sda1
[agpceph02][DEBUG ] /dev/sdf :
[agpceph02][DEBUG ] /dev/sdf1 ceph data, active, cluster ceph, osd.9, journal
/dev/sda2
[agpceph02][DEBUG ] /dev/sdg :
[agpceph02][DEBUG ] /dev/sdg1 ceph data, active, cluster ceph, osd.10, journal
/dev/sdb1
[agpceph02][DEBUG ] /dev/sdh :
[agpceph02][DEBUG ] /dev/sdh1 ceph data, active, cluster ceph, osd.11, journal
/dev/sdb2
[agpceph02][DEBUG ] /dev/sdi :
[agpceph02][DEBUG ] /dev/sdi1 ceph data, active, cluster ceph, osd.12, journal
/dev/sdc1
[agpceph02][DEBUG ] /dev/sdj :
[agpceph02][DEBUG ] /dev/sdj1 ceph data, active, cluster ceph, osd.13, journal
/dev/sdc2
[agpceph02][DEBUG ] /dev/sdk :
[agpceph02][DEBUG ] /dev/sdk1 ceph data, active, cluster ceph, osd.14, journal
/dev/sdd1
[agpceph02][DEBUG ] /dev/sdl :
[agpceph02][DEBUG ] /dev/sdl1 ceph data, active, cluster ceph, osd.15, journal
/dev/sdd2
[root@agpceph-admin ceph]# ceph -s
cluster 8edd3cdc-02c3-4b60-a150-897aeb0dda14
health HEALTH_OK
monmap e3: 3 mons at
{agpceph-mon01=172.10.10.50:6789/0,agpceph01=172.10.10.1:6789/0,agpceph02=172.10.10.2:6789/0}
election epoch 110, quorum 0,1,2 agpceph01,agpceph02,agpceph-mon01
mgr active: agpceph-mon01 standbys: agpceph01, agpceph02
osdmap e2132: 16 osds: 16 up, 16 in
flags sortbitwise,require_jewel_osds,require_kraken_osds
pgmap v96164: 650 pgs, 3 pools, 15611 MB data, 3979 objects
31517 MB used, 17845 GB / 17876 GB avail
650 active+clean
[root@agpceph-admin ceph]# ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 17.45752 root default
-5 8.72876 rack Rack391
-2 8.72876 host agpceph01
0 1.09109 osd.0 up 1.00000 1.00000
1 1.09109 osd.1 up 1.00000 1.00000
2 1.09109 osd.2 up 1.00000 1.00000
3 1.09109 osd.3 up 1.00000 1.00000
4 1.09109 osd.4 up 1.00000 1.00000
5 1.09109 osd.5 up 1.00000 1.00000
6 1.09109 osd.6 up 1.00000 1.00000
7 1.09109 osd.7 up 1.00000 1.00000
-4 8.72876 rack Rack320
-3 8.72876 host agpceph02
8 1.09109 osd.8 up 1.00000 1.00000
9 1.09109 osd.9 up 1.00000 1.00000
10 1.09109 osd.10 up 1.00000 1.00000
11 1.09109 osd.11 up 1.00000 1.00000
12 1.09109 osd.12 up 1.00000 1.00000
13 1.09109 osd.13 up 1.00000 1.00000
14 1.09109 osd.14 up 1.00000 1.00000
15 1.09109 osd.15 up 1.00000 1.00000
[root@agpceph-admin ceph]# cat /etc/ceph/ceph.conf
[global]
fsid = 8edd3cdc-02c3-4b60-a150-897aeb0dda14
mon_initial_members = agpceph01, agpceph02, agpceph-mon01
mon_host = 172.10.10.1,172.10.10.2,172.10.10.50
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd journal size = 81920
public network = 172.10.10.0/24
cluster network = 172.10.11.0/24
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 35
osd pool default pgp num = 35
osd crush chooseleaf type = 3
log file = /var/log/ceph/cluster.log
log to syslog = true
mon_allow_pool_delete = true
mon osd allow primary affinity = true
[client]
rbd_cache = false
Maybe someone else has had this problem and could give me some advice?
Many thanks in advance and kind regards,
Sven
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com