Re: [ceph-users] Updating the pg and pgp values
On Mon, Sep 8, 2014 at 10:08 AM, JIten Shah jshah2...@me.com wrote:
> While checking the health of the cluster, I ran into the following warning:
>
> health HEALTH_WARN too few pgs per osd (1 < min 20)
>
> When I checked the pg and pgp numbers, I saw they were still at the default
> value of 64:
>
> ceph osd pool get data pg_num
> pg_num: 64
> ceph osd pool get data pgp_num
> pgp_num: 64
>
> Following the Ceph documentation, I raised the numbers to 2000 with:
>
> ceph osd pool set data pg_num 2000
> ceph osd pool set data pgp_num 2000
>
> It started resizing the data, and I saw health warnings again:
>
> health HEALTH_WARN 1 requests are blocked > 32 sec; pool data pg_num 2000 > pgp_num 64
>
> and then:
>
> ceph health detail
> HEALTH_WARN 6 requests are blocked > 32 sec; 3 osds have slow requests
> 5 ops are blocked > 65.536 sec
> 1 ops are blocked > 32.768 sec
> 1 ops are blocked > 32.768 sec on osd.16
> 1 ops are blocked > 65.536 sec on osd.77
> 4 ops are blocked > 65.536 sec on osd.98
> 3 osds have slow requests
>
> This warning also went away after a day:
>
> ceph health detail
> HEALTH_OK
>
> Now, the question I have is: will this pg number remain effective on the
> cluster even if we restart the MONs or the OSDs on the individual disks? I
> haven’t changed the values in /etc/ceph/ceph.conf. Do I need to make a
> change to ceph.conf and push that change to all the MONs, MDSs and OSDs?

It's durable once the commands are successful on the monitors. You're all done.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
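The arithmetic behind the "too few pgs per osd" warning can be sketched as a quick calculation (illustrative only; the 100-OSD, replica-size-2 figures are taken from Christian's reply later in this thread):

```python
# Sketch: reproduce Ceph's "too few pgs per osd" health check.
# Assumes 100 OSDs and a replica size of 2, per later in this thread.
def pgs_per_osd(pg_num, replicas, num_osds):
    """Average number of PG copies landing on each OSD."""
    return pg_num * replicas / num_osds

# Default pg_num of 64: about 1.3 PG copies per OSD, far below the
# warning floor of 20 -- hence "too few pgs per osd (1 < min 20)".
print(pgs_per_osd(64, 2, 100))    # 1.28
# After raising pg_num to 2000: 40 per OSD, comfortably above the floor.
print(pgs_per_osd(2000, 2, 100))  # 40.0
```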
Re: [ceph-users] Updating the pg and pgp values
Thanks Greg.
—Jiten

On Sep 8, 2014, at 10:31 AM, Gregory Farnum g...@inktank.com wrote:
> It's durable once the commands are successful on the monitors. You're all done.
> -Greg
Re: [ceph-users] Updating the pg and pgp values
So, if it doesn’t refer to the entry in ceph.conf, where does it actually store the new value?
—Jiten

On Sep 8, 2014, at 10:31 AM, Gregory Farnum g...@inktank.com wrote:
> It's durable once the commands are successful on the monitors. You're all done.
> -Greg
Re: [ceph-users] Updating the pg and pgp values
It's stored in the OSDMap on the monitors.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Mon, Sep 8, 2014 at 10:50 AM, JIten Shah jshah2...@me.com wrote:
> So, if it doesn’t refer to the entry in ceph.conf, where does it actually
> store the new value?
Re: [ceph-users] Updating the pg and pgp values
Thanks. How do I query the OSDMap on the monitors? Using "ceph osd pool get data pg_num"? Or is there a way to get the full list of settings?
—Jiten

On Sep 8, 2014, at 10:52 AM, Gregory Farnum g...@inktank.com wrote:
> It's stored in the OSDMap on the monitors.
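For reference, an illustrative sketch (not from the thread itself): the pool values stored in the OSDMap can be read back with the standard CLI. These commands assume admin access to a running cluster, and the `all` form is only available in newer Ceph releases.

```shell
# Dump the cluster's OSDMap; the "pool" lines include pg_num and pgp_num.
ceph osd dump | grep '^pool'

# Query one setting at a time:
ceph osd pool get data pg_num
ceph osd pool get data pgp_num

# Newer Ceph releases can also list every per-pool setting at once:
ceph osd pool get data all
```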
Re: [ceph-users] Updating the pg and pgp values
Hello,

On Mon, 08 Sep 2014 10:08:27 -0700 JIten Shah wrote:
> While checking the health of the cluster, I ran into the following warning:
>
> health HEALTH_WARN too few pgs per osd (1 < min 20)
>
> When I checked the pg and pgp numbers, I saw they were still at the default
> value of 64 [...]

If that is the same cluster as in the other thread, you have 100 OSDs and a replica size of 2. That gives a PG target of 5000, rounded up to 8192!

> ceph osd pool set data pg_num 2000
> ceph osd pool set data pgp_num 2000

At the very least increase this to 2048 for a better chance at even data distribution, but 4096 would definitely be better and 8192 is the recommended target.

> It started resizing the data, and I saw health warnings again:
> [slow-request output trimmed]
> This warning also went away after a day.

That's caused by the data movement; given that you're on 100 hosts with a single disk each, I would have expected it to be faster and to have less impact. It could of course also be related to other things; network congestion comes to mind.

Increase PGs and PGPs in small steps; Firefly won't let you add more than 256 at a time in my case. You can also limit the impact of this to a point with the appropriate settings; see the documentation.

Christian

--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/
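Christian's sizing rule above (100 OSDs × 100 PGs per OSD ÷ 2 replicas = 5000, rounded up to 8192) can be sketched as a small helper. This is illustrative only; the target of roughly 100 PGs per OSD is the common rule of thumb he is applying, not a value from the thread itself.

```python
# Sketch of the PG sizing rule of thumb applied above:
# total PGs ~= (num_osds * 100) / replica_count, rounded up to the
# next power of two for more even data distribution.
def recommended_pg_num(num_osds, replicas, pgs_per_osd_target=100):
    raw = num_osds * pgs_per_osd_target / replicas
    pg_num = 1
    while pg_num < raw:
        pg_num *= 2  # round up to the next power of two
    return pg_num

# 100 OSDs, replica size 2: 5000 raw, rounded up to 8192.
print(recommended_pg_num(100, 2))  # 8192
```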