You probably just want to increase the ulimit settings. You can change the OSD setting, but that only covers file descriptors against the backing store, not sockets for network communication -- the latter is more often the one that runs out.
-Greg
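To see which kind of descriptor is actually running out, it can help to look at a process's open-file limit next to a breakdown of what it has open. The following is only a rough sketch, assuming a Linux host with /proc mounted; the helper names `nofile_limits` and `fd_usage` are made up for illustration and are not part of Ceph. Note that `resource.getrlimit` reports the limit of the calling process only, so for a running OSD you would pass its pid to `fd_usage` and read the "Max open files" row of /proc/<pid>/limits instead.

```python
import os
import resource


def nofile_limits():
    """Return (soft, hard) RLIMIT_NOFILE for the *current* process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fd_usage(pid="self"):
    """Count open descriptors of a process via /proc/<pid>/fd (Linux only).

    Returns (total, sockets). Each entry in that directory is a symlink;
    sockets show up with a target of the form "socket:[inode]".
    """
    fd_dir = f"/proc/{pid}/fd"
    total = 0
    sockets = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        total += 1
        if target.startswith("socket:"):
            sockets += 1
    return total, sockets


if __name__ == "__main__":
    soft, hard = nofile_limits()
    total, sockets = fd_usage()
    print(f"RLIMIT_NOFILE soft={soft} hard={hard}")
    print(f"open descriptors: {total} total, {sockets} sockets, "
          f"{total - sockets} files/pipes/other")
```

Reading another user's /proc/<pid>/fd generally requires root or the same uid as the target process. If the socket count is what approaches the soft limit, that matches the point above that raising the ulimit is the relevant fix.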

On Thursday, June 12, 2014, Christian Kauhaus <k...@gocept.com> wrote:

> Hi,
>
> we have a Ceph cluster with 32 OSDs running on 4 servers (8 OSDs per server,
> one for each disk).
>
> From time to time, I see Ceph servers running out of file descriptors. It logs
> lines like:
>
> > 2014-06-08 22:15:35.154759 7f850ac25700 0 filestore(/srv/ceph/osd/ceph-20) write couldn't open 86.37_head/a63e7df7/rbd_data.1933fe2ae8944a.000000000000042c/head//86: (24) Too many open files
> > 2014-06-08 22:15:35.255955 7f850ac25700 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f850ac25700 time
> > 2014-06-08 22:15:35.191181 os/FileStore.cc: 2448: FAILED assert(0 == "unexpected error")
>
> but apparently everything proceeds normally after that.
>
> Is the error considered critical? Should I lower "max open files" in
> ceph.conf? Or should I increase the value in /proc/sys/fs/file-max? Has anyone
> a good recommendation?
>
> TIA
>
> Christian
>
>
> Reference:
>
> * we are running Ceph Emperor 0.72.2 on Linux 3.10.7.
>
> * full log follows:
>
> 2014-06-08 22:15:34.928660 7f84e6770700 0 <cls> cls/lock/cls_lock.cc:89: error reading xattr lock.rbd_lock: -24
> 2014-06-08 22:15:34.934733 7f84e6770700 0 <cls> cls/lock/cls_lock.cc:384: Could not read lock info: Unknown error -24
> 2014-06-08 22:15:35.085361 7f84ecf7d700 0 accepter.accepter no incoming connection? sd = -1 errno 24 Too many open files
> 2014-06-08 22:15:35.125393 7f84ecf7d700 0 accepter.accepter no incoming connection? sd = -1 errno 24 Too many open files
> 2014-06-08 22:15:35.125403 7f84ecf7d700 0 accepter.accepter no incoming connection? sd = -1 errno 24 Too many open files
> 2014-06-08 22:15:35.125407 7f84ecf7d700 0 accepter.accepter no incoming connection? sd = -1 errno 24 Too many open files
> 2014-06-08 22:15:35.125410 7f84ecf7d700 0 accepter.accepter no incoming connection? sd = -1 errno 24 Too many open files
> 2014-06-08 22:15:35.154759 7f850ac25700 0 filestore(/srv/ceph/osd/ceph-20) write couldn't open 86.37_head/a63e7df7/rbd_data.1933fe2ae8944a.000000000000042c/head//86: (24) Too many open files
> 2014-06-08 22:15:35.159074 7f850ac25700 0 filestore(/srv/ceph/osd/ceph-20) error (24) Too many open files not handled on operation 10 (488954466.1.0, or op 0, counting from 0)
> 2014-06-08 22:15:35.159095 7f850ac25700 0 filestore(/srv/ceph/osd/ceph-20) unexpected error code
> 2014-06-08 22:15:35.159098 7f850ac25700 0 filestore(/srv/ceph/osd/ceph-20) transaction dump:
> { "ops": [
>         { "op_num": 0,
>           "op_name": "write",
>           "collection": "86.37_head",
>           "oid": "a63e7df7\/rbd_data.1933fe2ae8944a.000000000000042c\/head\/\/86",
>           "length": 4096,
>           "offset": 3104768,
>           "bufferlist length": 4096},
>         { "op_num": 1,
>           "op_name": "setattr",
>           "collection": "86.37_head",
>           "oid": "a63e7df7\/rbd_data.1933fe2ae8944a.000000000000042c\/head\/\/86",
>           "name": "_",
>           "length": 251},
>         { "op_num": 2,
>           "op_name": "setattr",
>           "collection": "86.37_head",
>           "oid": "a63e7df7\/rbd_data.1933fe2ae8944a.000000000000042c\/head\/\/86",
>           "name": "snapset",
>           "length": 31}]}
> 2014-06-08 22:15:35.255955 7f850ac25700 -1 os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadPool::TPHandle*)' thread 7f850ac25700 time
> 2014-06-08 22:15:35.191181 os/FileStore.cc: 2448: FAILED assert(0 == "unexpected error")
>
> --
> Dipl.-Inf. Christian Kauhaus <>< · k...@gocept.com · systems administration
> gocept gmbh & co. kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany
> http://gocept.com · tel +49 345 219401-11
> Python, Pyramid, Plone, Zope · consulting, development, hosting, operations
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Software Engineer #42 @ http://inktank.com | http://ceph.com