Hi,
we have a Ceph cluster with 32 OSDs running on 4 servers (8 OSDs per server,
one for each disk).
From time to time, I see Ceph servers running out of file descriptors. They log
lines like:
2014-06-08 22:15:35.154759 7f850ac25700 0 filestore(/srv/ceph/osd/ceph-20)
write couldn't open
You probably just want to increase the ulimit settings. You can change the
OSD setting, but that only covers file descriptors against the backing
store, not sockets for network communication -- the latter is more often
the one that runs out.
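
As a rough sketch of what raising the limit looks like in a shell (assuming a
Linux host; this only affects the current shell and the processes started from
it, so a daemon launched by the init system needs the same limit applied in
whatever starts it):

```shell
# Show the current soft and hard limits on open file descriptors.
ulimit -S -n
ulimit -H -n

# Raise the soft limit up to the hard limit for this shell and its children.
# Going beyond the hard limit requires root.
ulimit -S -n "$(ulimit -H -n)"
```

Sockets count against this same per-process limit, which is why it covers the
network side as well.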
-Greg
On Thursday, June 12, 2014, Christian Kauhaus
ulimit -Sa
ulimit -Ha
Which will show you your limits.
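
Note that ulimit reports the limits of your current shell, which may differ
from what a running daemon actually got. A minimal sketch for checking a live
process instead, assuming Linux /proc and that the OSD processes are named
ceph-osd (fd_usage is just a helper name made up for this example):

```shell
# fd_usage PID: print "open_fds soft_limit" for the given process.
fd_usage() {
    pid=$1
    # Each entry in /proc/PID/fd is one open descriptor (files and sockets alike).
    open=$(ls "/proc/$pid/fd" | wc -l)
    # In the "Max open files" row, the 4th field is the soft limit.
    soft=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "$open $soft"
}

# Report usage for every running ceph-osd process.
# Reading another user's /proc/PID/fd needs root.
for pid in $(pidof ceph-osd); do
    echo "ceph-osd pid $pid: $(fd_usage "$pid")"
done
```

If the first number is close to the second, that process is about to run out.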
If you are hitting this limit and it's 16k, I would say that the server
is not tuned for your needs and you should raise it.
If it's more than that but not reaching 1 million or any other very high
number, I would say use lsof -n|wc -l to get some