Hi all,

I'm using CephFS on Hammer with about 1.5 million files, 2 metadata servers in an active/standby configuration with 8 GB of RAM each, 20 clients with 2 GB of RAM each, and 2 OSD nodes with 4 x 80 GB OSDs and 4 GB of RAM each. I've noticed that if I kill the active metadata server, the standby takes about 10 to 30 minutes to go from the rejoin state to active. While the standby is in rejoin I can see ceph steadily allocating RAM on it.
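For reference, this is roughly how I've been watching the failover; the daemon name mds.cephmds02 is just an example from my setup, adjust it to the standby node:

    # watch the MDS map as the standby moves through replay/reconnect/rejoin/active
    ceph -w
    ceph mds stat

    # on the standby node itself, dump the MDS perf counters to see the cache/memory growth
    ceph daemon mds.cephmds02 perf dump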
Here is my configuration:

[global]
fsid = 2de7b17f-0a3e-4109-b878-c035dd2f7735
mon_initial_members = cephmds01
mon_host = 10.29.81.161
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network = 10.29.81.0/24
tcp nodelay = true
tcp rcvbuf = 0
ms tcp read timeout = 600
# Capacity
mon osd full ratio = .95
mon osd nearfull ratio = .85

[osd]
osd journal size = 1024
journal dio = true
journal aio = true
osd op threads = 2
osd op thread timeout = 60
osd disk threads = 2
osd recovery threads = 1
osd recovery max active = 1
osd max backfills = 2
# Pool
osd pool default size = 2
# XFS
osd mkfs type = xfs
osd mkfs options xfs = "-f -i size=2048"
osd mount options xfs = "rw,noatime,inode64,logbsize=256k,delaylog"
# FileStore Settings
filestore xattr use omap = false
filestore max inline xattr size = 512
filestore max sync interval = 10
filestore merge threshold = 40
filestore split multiple = 8
filestore flusher = false
filestore queue max ops = 2000
filestore queue max bytes = 536870912
filestore queue committing max ops = 500
filestore queue committing max bytes = 268435456
filestore op threads = 2

[mds]
max mds = 1
mds cache size = 250000
client cache size = 1024
mds dir commit ratio = 0.5

Best regards,
Matteo