Re: [ceph-users] Ceph newbie(?) issues
I had a similar problem with some relatively underpowered servers (2x E5-2603, 6 cores, 1.7 GHz, no HT, 12-14 2 TB OSDs per server, 32 GB RAM). There was a process on a couple of the servers that would hang and chew up all available CPU. When that happened, I started getting scrub errors on those servers.

On Mon, Mar 5, 2018 at 8:45 AM, Jan Marquardt wrote:
> Am 05.03.18 um 13:13 schrieb Ronny Aasen:
> > I had some similar issues when I started my proof of concept; especially
> > the snapshot deletion I remember well.
> >
> > The rule of thumb for filestore, which I assume you are running, is 1 GB
> > of RAM per TB of OSD. So with 8 x 4 TB OSDs you are looking at 32 GB of
> > RAM for the OSDs, plus some GB for the mon service, plus some GB for the
> > OS itself.
> >
> > I suspect that if you inspect your dmesg log and memory graphs you will
> > find that the out-of-memory killer ends your OSDs when the snap deletion
> > (or any other high-load task) runs.
> >
> > I ended up reducing the number of OSDs per node, since the old mainboard
> > I used was maxed out for memory.
>
> Well, thanks for the broad hint. Somehow I assumed we fulfilled the
> recommendations, but of course you are right. We'll check if our boards
> support 48 GB RAM. Unfortunately, there are currently no corresponding
> messages, but I can't rule out that there haven't been any.
>
> > Corruptions occurred for me as well, and they were normally associated
> > with disks dying or giving read errors. Ceph often managed to fix them,
> > but sometimes I had to just remove the hurting OSD disk.
> >
> > Have some graphs to look at. Personally I used munin/munin-node, since
> > it was just an apt-get away from functioning graphs.
> >
> > Also, I used smartmontools to send me emails about hurting disks, and
> > smartctl to check all disks for errors.
>
> I'll check the S.M.A.R.T. stuff. I am wondering if scrubbing errors are
> always caused by disk problems, or if they could also be triggered by
> flapping OSDs or other circumstances.
>
> > Good luck with Ceph!
>
> Thank you!
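[Editor's note: for anyone chasing the same symptom, a minimal sketch for spotting such a runaway process non-interactively; the flags assume procps-ng ps as shipped on most current Linux distributions:

  # list the top CPU consumers, sorted, without an interactive session
  ps -eo pid,user,%cpu,%mem,comm --sort=-%cpu | head -15

Running this from cron and logging the output makes it easy to correlate the hung process with the times the scrub errors appear.]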
Re: [ceph-users] Ceph newbie(?) issues
On 05. mars 2018 14:45, Jan Marquardt wrote:
> Am 05.03.18 um 13:13 schrieb Ronny Aasen:
> > I had some similar issues when I started my proof of concept; especially
> > the snapshot deletion I remember well.
> >
> > The rule of thumb for filestore, which I assume you are running, is 1 GB
> > of RAM per TB of OSD. So with 8 x 4 TB OSDs you are looking at 32 GB of
> > RAM for the OSDs, plus some GB for the mon service, plus some GB for the
> > OS itself.
> >
> > I suspect that if you inspect your dmesg log and memory graphs you will
> > find that the out-of-memory killer ends your OSDs when the snap deletion
> > (or any other high-load task) runs.
> >
> > I ended up reducing the number of OSDs per node, since the old mainboard
> > I used was maxed out for memory.
>
> Well, thanks for the broad hint. Somehow I assumed we fulfilled the
> recommendations, but of course you are right. We'll check if our boards
> support 48 GB RAM. Unfortunately, there are currently no corresponding
> messages, but I can't rule out that there haven't been any.
>
> > Corruptions occurred for me as well, and they were normally associated
> > with disks dying or giving read errors. Ceph often managed to fix them,
> > but sometimes I had to just remove the hurting OSD disk.
> >
> > Have some graphs to look at. Personally I used munin/munin-node, since
> > it was just an apt-get away from functioning graphs.
> >
> > Also, I used smartmontools to send me emails about hurting disks, and
> > smartctl to check all disks for errors.
>
> I'll check the S.M.A.R.T. stuff. I am wondering if scrubbing errors are
> always caused by disk problems, or if they could also be triggered by
> flapping OSDs or other circumstances.
>
> > Good luck with Ceph!
>
> Thank you!

In my not-that-extensive experience, scrub errors come mainly from two issues: either disks giving read errors (this should be visible both in the log and in dmesg), or having pools with size=2/min_size=1 instead of the default and recommended size=3/min_size=2. I cannot say that they never come from crashing OSDs, but in my case the OSDs kept crashing due to bad disks and/or low memory.

If you have scrub errors you cannot get rid of on filestore (not bluestore!), you can read the two following URLs:

http://ceph.com/geen-categorie/ceph-manually-repair-object/
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/

Basically the steps are (see the command sketch after this message):

- find the pg :: rados list-inconsistent-pg [pool]
- find the problem :: rados list-inconsistent-obj 0.6 --format=json-pretty ; this gives you the object name; look for hints as to which copy is the bad one
- find the object on disk :: manually check the objects on each OSD for the given pg, check the object metadata (size/date/etc), run md5sum on them all and compare. Check objects on the non-running OSDs and compare there as well; anything to help determine which object is OK and which is bad.
- fix the problem :: assuming you find the bad object, stop the OSD holding the bad copy, remove the object manually, restart the OSD, and issue the repair command.

Once I fixed my min_size=1 misconfiguration, pulled the dying (but still functional) disks from my cluster, and reduced the OSD count to stop OSDs from dying, all of those scrub errors went away. I have not seen one in six months now.

Kind regards
Ronny Aasen
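[Editor's note: a concrete sketch of the workflow Ronny describes, using the pool name 'rbd' from the original post, an example pg 0.6 on osd.17, and the default filestore on-disk layout; all of these are placeholders to adapt to your cluster:

  # check the replication settings first; size=2/min_size=1 is a common culprit
  ceph osd pool get rbd size
  ceph osd pool get rbd min_size

  # find the inconsistent pg, then inspect its objects
  rados list-inconsistent-pg rbd
  rados list-inconsistent-obj 0.6 --format=json-pretty

  # on each OSD host, locate the copies of the reported object and compare
  # checksums (default filestore path; run on every replica)
  find /var/lib/ceph/osd/ceph-17/current/0.6_head/ -type f -exec md5sum {} +

  # once the bad copy is identified: stop that OSD, move the bad file away,
  # restart the OSD, then tell Ceph to repair the pg
  systemctl stop ceph-osd@17
  mv /path/to/bad/object /root/bad-object.bak   # placeholder path
  systemctl start ceph-osd@17
  ceph pg repair 0.6
]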
Re: [ceph-users] Ceph newbie(?) issues
Am 05.03.18 um 13:13 schrieb Ronny Aasen:
> I had some similar issues when I started my proof of concept; especially
> the snapshot deletion I remember well.
>
> The rule of thumb for filestore, which I assume you are running, is 1 GB
> of RAM per TB of OSD. So with 8 x 4 TB OSDs you are looking at 32 GB of
> RAM for the OSDs, plus some GB for the mon service, plus some GB for the
> OS itself.
>
> I suspect that if you inspect your dmesg log and memory graphs you will
> find that the out-of-memory killer ends your OSDs when the snap deletion
> (or any other high-load task) runs.
>
> I ended up reducing the number of OSDs per node, since the old mainboard
> I used was maxed out for memory.

Well, thanks for the broad hint. Somehow I assumed we fulfilled the recommendations, but of course you are right. We'll check if our boards support 48 GB RAM. Unfortunately, there are currently no corresponding messages, but I can't rule out that there haven't been any.

> Corruptions occurred for me as well, and they were normally associated
> with disks dying or giving read errors. Ceph often managed to fix them,
> but sometimes I had to just remove the hurting OSD disk.
>
> Have some graphs to look at. Personally I used munin/munin-node, since
> it was just an apt-get away from functioning graphs.
>
> Also, I used smartmontools to send me emails about hurting disks, and
> smartctl to check all disks for errors.

I'll check the S.M.A.R.T. stuff. I am wondering if scrubbing errors are always caused by disk problems, or if they could also be triggered by flapping OSDs or other circumstances.

> Good luck with Ceph!

Thank you!
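[Editor's note: to settle the "can't rule out" part: OOM kills leave distinctive kernel log lines, so a quick grep is usually enough. A minimal sketch; searching earlier boots assumes a persistent systemd journal:

  # current boot's kernel ring buffer, with human-readable timestamps
  dmesg -T | grep -i -E 'out of memory|oom-killer|killed process'

  # with a persistent journal, earlier boots can be searched too
  journalctl -k | grep -i -E 'oom-killer|out of memory'
]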
Re: [ceph-users] Ceph newbie(?) issues
On 05. mars 2018 11:21, Jan Marquardt wrote:
> Hi,
>
> we are relatively new to Ceph and are observing some issues, where I'd
> like to know how likely they are to happen when operating a Ceph cluster.
>
> Currently our setup consists of three servers which are acting as OSDs
> and MONs. Each server has two Intel Xeon L5420 CPUs (yes, I know, it's
> not state of the art, but we thought it would be sufficient for a proof
> of concept; maybe we were wrong?) and 24 GB RAM, and is running 8 OSDs
> with 4 TB hard disks. Four OSDs are sharing one SSD for journaling. We
> started on Kraken and lately upgraded to Luminous. The next two OSD
> servers and three separate MONs are ready for deployment. Please find
> attached our ceph.conf. Current usage looks like this:
>
>   data:
>     pools:   1 pools, 768 pgs
>     objects: 5240k objects, 18357 GB
>     usage:   59825 GB used, 29538 GB / 89364 GB avail
>
> We have only one pool, which is used exclusively for rbd. We started
> filling it with data and creating snapshots in January, until the middle
> of February. Everything was working like a charm until we then started
> removing old snapshots. While we were removing snapshots for the first
> time, OSDs started flapping, although there was no other load on the
> cluster. For idle times we solved it by adding
>
>   osd snap trim priority = 1
>   osd snap trim sleep = 0.1
>
> to ceph.conf. When there is load from other operations and we remove big
> snapshots, OSD flapping still occurs.
>
> Last week our first scrub errors appeared. Repairing the first one was
> no big deal. The second one, however, was, because the OSD instructed to
> repair started crashing: first osd.17 on Friday, and today osd.11.
>
>   ceph1:~# ceph pg repair 0.1b2
>   instructing pg 0.1b2 on osd.17 to repair
>
>   ceph1:~# ceph pg repair 0.1b2
>   instructing pg 0.1b2 on osd.11 to repair
>
> I am still researching the crashes, but would already be thankful for
> any input. Any opinions, hints and advice would really be appreciated.

I had some similar issues when I started my proof of concept; especially the snapshot deletion I remember well.

The rule of thumb for filestore, which I assume you are running, is 1 GB of RAM per TB of OSD. So with 8 x 4 TB OSDs you are looking at 32 GB of RAM for the OSDs, plus some GB for the mon service, plus some GB for the OS itself.

I suspect that if you inspect your dmesg log and memory graphs you will find that the out-of-memory killer ends your OSDs when the snap deletion (or any other high-load task) runs.

I ended up reducing the number of OSDs per node, since the old mainboard I used was maxed out for memory.

Corruptions occurred for me as well, and they were normally associated with disks dying or giving read errors. Ceph often managed to fix them, but sometimes I had to just remove the hurting OSD disk.

Have some graphs to look at. Personally I used munin/munin-node, since it was just an apt-get away from functioning graphs.

Also, I used smartmontools to send me emails about hurting disks, and smartctl to check all disks for errors.

Good luck with Ceph!

Kind regards
Ronny Aasen
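[Editor's note: for the smartmontools part, a minimal sketch; the device name and mail address are placeholders:

  # one-off health check and full attribute dump for a disk
  smartctl -H /dev/sda
  smartctl -a /dev/sda   # watch Reallocated_Sector_Ct and Current_Pending_Sector

  # /etc/smartd.conf: scan all disks, monitor all attributes, mail on trouble
  DEVICESCAN -a -m ceph-admins@example.com

After editing smartd.conf, restart the smartd service so the directive takes effect.]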
[ceph-users] Ceph newbie(?) issues
Hi,

we are relatively new to Ceph and are observing some issues, where I'd like to know how likely they are to happen when operating a Ceph cluster.

Currently our setup consists of three servers which are acting as OSDs and MONs. Each server has two Intel Xeon L5420 CPUs (yes, I know, it's not state of the art, but we thought it would be sufficient for a proof of concept; maybe we were wrong?) and 24 GB RAM, and is running 8 OSDs with 4 TB hard disks. Four OSDs are sharing one SSD for journaling. We started on Kraken and lately upgraded to Luminous. The next two OSD servers and three separate MONs are ready for deployment. Please find attached our ceph.conf. Current usage looks like this:

  data:
    pools:   1 pools, 768 pgs
    objects: 5240k objects, 18357 GB
    usage:   59825 GB used, 29538 GB / 89364 GB avail

We have only one pool, which is used exclusively for rbd. We started filling it with data and creating snapshots in January, until the middle of February. Everything was working like a charm until we then started removing old snapshots. While we were removing snapshots for the first time, OSDs started flapping, although there was no other load on the cluster. For idle times we solved it by adding

  osd snap trim priority = 1
  osd snap trim sleep = 0.1

to ceph.conf. When there is load from other operations and we remove big snapshots, OSD flapping still occurs.

Last week our first scrub errors appeared. Repairing the first one was no big deal. The second one, however, was, because the OSD instructed to repair started crashing: first osd.17 on Friday, and today osd.11.

  ceph1:~# ceph pg repair 0.1b2
  instructing pg 0.1b2 on osd.17 to repair

  ceph1:~# ceph pg repair 0.1b2
  instructing pg 0.1b2 on osd.11 to repair

I am still researching the crashes, but would already be thankful for any input. Any opinions, hints and advice would really be appreciated.

Best Regards

Jan

[global]
fsid = c59e56df-2043-4c92-9492-25f05f268d9f
mon_initial_members = ceph1, ceph2, ceph3
mon_host = 10.10.100.21,10.10.100.22,10.10.100.23
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network = 10.10.100.0/24

[osd]
osd journal size = 0
osd snap trim priority = 1
osd snap trim sleep = 0.1

[client]
rbd default features = 3
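[Editor's note: a side note on those snap trim settings: assuming Luminous honors them at runtime, they can also be injected into running OSDs without editing ceph.conf and restarting; the values mirror the ones from the post:

  # apply to all OSDs at once; reverts to ceph.conf values on OSD restart
  ceph tell osd.* injectargs '--osd_snap_trim_priority 1 --osd_snap_trim_sleep 0.1'

  # confirm what a given OSD is actually running with
  # (uses the admin socket, so run this on the host where osd.0 lives)
  ceph daemon osd.0 config get osd_snap_trim_sleep
]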