I recently ran into a problem with OSDs backed by SSD disks used in a cache tier for an EC pool:
superuser@node02:~$ df -i
Filesystem  Inodes   IUsed    IFree  IUse%  Mounted on
<...>
/dev/sdb1   3335808  3335808      0   100%  /var/lib/ceph/osd/ceph-45
/dev/sda1   3335808  3335808      0   100%  /var/lib/ceph/osd/ceph-46
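For anyone who wants to catch this earlier, a quick check of inode usage on every OSD mount of every node (placeholder node names, assuming passwordless SSH and the default mount points) would be something like:

# report inode usage of each mounted OSD on each node
for node in node01 node02 node03; do
    echo "== $node =="
    ssh "$node" 'df -i /var/lib/ceph/osd/*'
done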
Now those OSDs are down on each ceph node and cache tiering is not working.
superuser@node01:~$ sudo tail /var/log/ceph/ceph-osd.45.log
2015-03-23 10:04:23.631137 7fb105345840  0 ceph version 0.87.1 (283c2e7cfa2457799f534744d7d549f83ea1335e), process ceph-osd, pid 1453465
2015-03-23 10:04:23.640676 7fb105345840  0 filestore(/var/lib/ceph/osd/ceph-45) backend generic (magic 0xef53)
2015-03-23 10:04:23.640735 7fb105345840 -1 genericfilestorebackend(/var/lib/ceph/osd/ceph-45) detect_features: unable to create /var/lib/ceph/osd/ceph-45/fiemap_test: (28) No space left on device
2015-03-23 10:04:23.640763 7fb105345840 -1 filestore(/var/lib/ceph/osd/ceph-45) _detect_fs: detect_features error: (28) No space left on device
2015-03-23 10:04:23.640772 7fb105345840 -1 filestore(/var/lib/ceph/osd/ceph-45) FileStore::mount : error in _detect_fs: (28) No space left on device
2015-03-23 10:04:23.640783 7fb105345840 -1 ** ERROR: error converting store /var/lib/ceph/osd/ceph-45: (28) No space left on device
At the same time, df -h is confusing:
superuser@node01:~$ df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
<...>
/dev/sda1    50G   29G    20G   60%  /var/lib/ceph/osd/ceph-45
/dev/sdb1    50G   27G    21G   56%  /var/lib/ceph/osd/ceph-46
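The catch is that ext4 allocates a fixed number of inodes when the filesystem is created, and a cache-tier OSD stores a huge number of small objects, so the inode table fills up long before the data blocks do. The fixed totals can be checked with tune2fs (device name taken from my df output above):

superuser@node01:~$ sudo tune2fs -l /dev/sda1 | grep -i inode

Comparing "Inode count" with "Free inodes" there tells the same story as df -i: every inode is used while plenty of blocks remain free.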
The filesystem on the affected OSDs is ext4. All OSDs were deployed with ceph-deploy:
$ ceph-deploy osd create --zap-disk --fs-type ext4 <node-name>:<device>
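If ext4 really has to be used, the only workaround I can think of is to format with a much higher inode density up front, e.g. through the per-fs-type mkfs options in ceph.conf before deploying. A sketch I have not tested (the -i value is only an example):

[osd]
# ask mkfs.ext4 for one inode per 4 KiB instead of the default ~16 KiB,
# i.e. roughly 4x the inode count on the same partition size
osd mkfs options ext4 = -i 4096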
Luckily it was only a test deployment, but all EC-pool data was lost: I couldn't start the OSDs, and the cluster stayed degraded until I removed all the affected tiered pools (cache & EC).
So this is just my observation of the kind of problems you can face if you choose the wrong filesystem for the OSD backend.
And now I *strongly* recommend choosing XFS or Btrfs: both support dynamic inode allocation, so this problem can't arise with them.
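For comparison, redeploying on XFS is just the same ceph-deploy call with a different --fs-type:

$ ceph-deploy osd create --zap-disk --fs-type xfs <node-name>:<device>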