Re: [ceph-users] inconsistent number of pools
On Tue, May 28, 2019 at 11:50:01AM -0700, Gregory Farnum wrote:
> You're the second report I've seen of this, and while it's confusing,
> you should be able to resolve it by restarting your active manager
> daemon.

Maybe this is related? http://tracker.ceph.com/issues/40011

> On Sun, May 26, 2019 at 11:52 PM Lars Täuber <taeu...@bbaw.de> wrote:
>> Fri, 24 May 2019 21:41:33 +0200 Michel Raabe <rmic...@devnu11.net>
>> ==> Lars Täuber <taeu...@bbaw.de>, ceph-users@lists.ceph.com:
>>> You can also try
>>>
>>>   $ rados lspools
>>>   $ ceph osd pool ls
>>>
>>> and verify that with the pgs
>>>
>>>   $ ceph pg ls --format=json-pretty | jq -r '.pg_stats[].pgid' | cut -d. -f1 | uniq
>>
>> Yes, now I know, but I still get this:
>>
>>   $ sudo ceph -s
>>   […]
>>   data:
>>     pools: 5 pools, 1153 pgs
>>   […]
>>
>> and with all other means I get:
>>
>>   $ sudo ceph osd lspools | wc -l
>>   3
>>
>> which is what I expect, because all other pools are removed. But since
>> this has no bad side effects, I can live with it.
>>
>> Cheers,
>> Lars

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
Jan Fajerski
Engineer Enterprise Storage
SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)
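As an aside, the jq/cut pipeline above is easy to reproduce programmatically. A self-contained Python sketch with made-up pg_stats data (a pgid has the form "<pool_id>.<pg_seq>", so the part before the dot identifies the pool):

```python
import json

# Sample shaped like `ceph pg ls --format=json-pretty`; the pgids here
# are invented for illustration, not taken from a real cluster.
sample = json.loads("""
{"pg_stats": [
    {"pgid": "1.0"}, {"pgid": "1.1"},
    {"pgid": "2.0"},
    {"pgid": "5.3f"}
]}
""")

# Equivalent of: jq -r '.pg_stats[].pgid' | cut -d. -f1 | uniq
pool_ids = sorted({pg["pgid"].split(".")[0] for pg in sample["pg_stats"]}, key=int)
print(pool_ids)       # pools that actually carry PGs
print(len(pool_ids))  # compare against `ceph osd lspools | wc -l`
```

If the pool count derived this way disagrees with what "ceph -s" reports, the stale value is in the mgr's cached status, which matches the restart-the-active-mgr advice above.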
Re: [ceph-users] Erasure code profiles and crush rules. Missing link...?
On Wed, May 22, 2019 at 03:38:27PM +0200, Rainer Krienke wrote:
> Am 22.05.19 um 15:16 schrieb Dan van der Ster:
>> Is this what you're looking for?
>>
>>   # ceph osd pool ls detail -f json | jq .[0].erasure_code_profile
>>   "jera_4plus2"
>>
>> -- Dan
>
> Yes, this is basically what I was looking for. However, I had expected
> it to be a bit more visible in the output...
>
> -- 
> Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse 1
> 56070 Koblenz, Tel: +49261287 1312 Fax +49261287 100 1312
> Web: http://userpages.uni-koblenz.de/~krienke
> PGP: http://userpages.uni-koblenz.de/~krienke/mypgp.html

Mind opening a tracker ticket on http://tracker.ceph.com/ so we can have
this added to the non-JSON output of "ceph osd pool ls detail"?
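The same extraction works without jq. A self-contained Python sketch over a made-up "ceph osd pool ls detail -f json" shape (the pool names and profile are invented; replicated pools carry no erasure_code_profile, hence the guard):

```python
import json

# Trimmed-down sample of `ceph osd pool ls detail -f json` output.
pools_json = """
[
  {"pool_name": "ecpool", "erasure_code_profile": "jera_4plus2"},
  {"pool_name": "rbd"}
]
"""

pools = json.loads(pools_json)

# Equivalent of: jq .[0].erasure_code_profile
profile = pools[0].get("erasure_code_profile")
print(profile)

# Replicated pools have no profile, so use .get() with a default:
for p in pools:
    print(p["pool_name"], p.get("erasure_code_profile", "-"))
```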
Re: [ceph-users] ceph-volume lvm batch OSD replacement
> Dan
>
> P.S.:
>
> ====== osd.240 ======
>
>   [db]  /dev/ceph-094c06db-98dc-47f6-a7e5-1092b099b372/osd-block-db-fa0e7927-dc3e-44d0-a8ce-1d8202fa75dd
>
>       type                  db
>       osd id                240
>       cluster fsid          b4f463a0-c671-43a8-bd36-e40ab8d233d2
>       cluster name          ceph
>       osd fsid              d4d1fb15-a30a-4325-8628-706772ee4294
>       db device             /dev/ceph-094c06db-98dc-47f6-a7e5-1092b099b372/osd-block-db-fa0e7927-dc3e-44d0-a8ce-1d8202fa75dd
>       encrypted             0
>       db uuid               iWWdyU-UhNu-b58z-ThSp-Bi3B-19iA-06iJIc
>       cephx lockbox secret
>       block uuid            u4326A-Q8bH-afPb-y7Y6-ftNf-TE1X-vjunBd
>       block device          /dev/ceph-f78ff8a3-803d-4b6d-823b-260b301109ac/osd-data-9e4bf34d-1aa3-4c0a-9655-5dba52dcfcd7
>       vdo                   0
>       crush device class    None
>       devices               /dev/sdac
Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt
On Wed, Jan 23, 2019 at 10:01:05AM +0100, Manuel Lausch wrote:
> Hi,
>
> that's bad news. Roughly 5000 OSDs are affected by this issue. It's
> not really a solution to redeploy these OSDs.
>
> Is it possible to migrate the local keys to the monitors? I see that
> the OSDs with the "lockbox feature" have only one key for the data and
> journal partition, while the older OSDs have individual keys for
> journal and data. Might this be a problem?
>
> And another question: is it a good idea to mix ceph-disk and
> ceph-volume managed OSDs on one host? Then I could migrate only the
> newer OSDs to ceph-volume and deploy new ones (after disk
> replacements) with ceph-volume, until hopefully there is a solution.

I might be wrong on this, since it's been a while since I played with
that, but iirc you can't migrate a subset of ceph-disk OSDs to
ceph-volume on one host. Once you run "ceph-volume simple activate",
the ceph-disk systemd units and udev profiles will be disabled. While
the remaining ceph-disk OSDs will continue to run, they won't come up
after a reboot. I'm sure there's a way to get them running again, but I
imagine you'd rather not deal with that manually.

> Regards
> Manuel
>
> On Tue, 22 Jan 2019 07:44:02 -0500 Alfredo Deza wrote:
>> This is one case we didn't anticipate :/ We supported the wonky
>> lockbox setup and thought we wouldn't need to go further back,
>> although we did add support for both plain and luks keys.
>> Looking through the code, it is very tightly coupled to
>> storing/retrieving keys from the monitors, and I don't know what
>> workarounds might be possible here other than throwing away the OSD
>> and deploying a new one (I take it this is not an option for you at
>> all).
>
> Manuel Lausch
> Systemadministrator Storage Services
> 1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
> 76135 Karlsruhe | Germany
> Phone: +49 721 91374-1847
> E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de
> Hauptsitz Montabaur, Amtsgericht Montabaur, HRB 5452
> Geschäftsführer: Thomas Ludwig, Jan Oetjen, Sascha Vollmer
> Member of United Internet
>
> This e-mail may contain confidential and/or privileged information.
> If you are not the intended recipient of this e-mail, you are hereby
> notified that saving, distribution or use of the content of this
> e-mail in any way is prohibited. If you have received this e-mail in
> error, please notify the sender and delete the e-mail.
Re: [ceph-users] rbd IO monitoring
On Thu, Nov 29, 2018 at 11:48:35PM -0500, Michael Green wrote:
> Hello collective wisdom,
>
> Ceph neophyte here, running v13.2.2 (mimic).
>
> Question: what tools are available to monitor IO stats on the RBD
> level? That is, IOPS, throughput, IOs in flight and so on? I'm testing
> with fio and want to verify independently the IO load on each RBD
> image.

There is some brand new code for rbd IO monitoring. This PR
(https://github.com/ceph/ceph/pull/25114) added rbd client side perf
counters, and this PR (https://github.com/ceph/ceph/pull/25358) will
add those counters as prometheus metrics. There is also room for an
"rbd top" tool, though I haven't seen any code for this. I'm sure
Mykola (the author of both PRs) could go into more detail if needed. I
expect this functionality to land in Nautilus.

> --
> Michael Green
> Customer Support & Integration
> gr...@e8storage.com
Re: [ceph-users] cephfs quota limit
On Tue, Nov 06, 2018 at 08:57:48PM +0800, Zhenshi Zhou wrote:
> Hi,
>
> I'm wondering whether CephFS has quota limit options. I use the kernel
> client and ceph version 12.2.8.
>
> Thanks

CephFS has quota support, see
http://docs.ceph.com/docs/luminous/cephfs/quota/. The kernel has
recently gained CephFS quota support too (before, only the fuse client
supported it), so it depends on your distro and kernel version.
[ceph-users] CfP FOSDEM'19 Software Defined Storage devroom
CfP for the Software Defined Storage devroom at FOSDEM 2019 (Brussels,
Belgium, February 3rd).

FOSDEM is a free software event that offers open source communities a
place to meet, share ideas and collaborate. It is renowned for being
highly developer-oriented and brings together 8000+ participants from
all over the world. It is held in the city of Brussels (Belgium).

FOSDEM 2019 will take place during the weekend of February 2nd-3rd
2019. More details about the event can be found at http://fosdem.org/

** Call For Participation

The Software Defined Storage devroom will go into its third round for
talks around open source software defined storage projects, management
tools and real world deployments.

Presentation topics could include, but are not limited to:

- Your work on a SDS project like Ceph, Gluster, OpenEBS or LizardFS
- Your work on or with SDS related projects like SWIFT or the Container
  Storage Interface
- Management tools for SDS deployments
- Monitoring tools for SDS clusters

** Important dates:

- Nov 25th 2018: submission deadline for talk proposals
- Dec 17th 2018: announcement of the final schedule
- Feb 3rd 2019: Software Defined Storage devroom

Talk proposals will be reviewed by a steering committee:
- Niels de Vos (Gluster Developer - RedHat)
- Jan Fajerski (Ceph Developer - SUSE)
- other volunteers TBA

Use the FOSDEM 'pentabarf' tool to submit your proposal:
https://penta.fosdem.org/submission/FOSDEM19

- If necessary, create a Pentabarf account and activate it. Please
  reuse your account from previous years if you have already created
  one.
- In the "Person" section, provide First name, Last name (in the
  "General" tab), Email (in the "Contact" tab) and Bio ("Abstract"
  field in the "Description" tab).
- Submit a proposal by clicking on "Create event".
- Important! Select the "Software Defined Storage devroom" track (on
  the "General" tab).
- Provide the title of your talk ("Event title" in the "General" tab).
- Provide a description of the subject of the talk and the intended
  audience (in the "Abstract" field of the "Description" tab).
- Provide a rough outline of the talk or goals of the session (a short
  list of bullet points covering topics that will be discussed) in the
  "Full description" field in the "Description" tab.
- Provide an expected length of your talk in the "Duration" field.
  Please count at least 10 minutes of discussion into your proposal and
  allow 5 minutes for the handover to the next presenter. Suggested
  talk lengths are 20+10 and 45+15 minutes.

** Recording of talks

The FOSDEM organizers plan to have live streaming and recording fully
working, both for remote/later viewing of talks, and so that people can
watch streams in the hallways when rooms are full. This requires
speakers to consent to being recorded and streamed. If you plan to be a
speaker, please understand that by doing so you implicitly give consent
for your talk to be recorded and streamed. The recordings will be
published under the same license as all FOSDEM content (CC-BY).

Hope to hear from you soon! And please forward this announcement.

If you have any further questions, please write to the mailing list at
storage-devr...@lists.fosdem.org and we will try to answer as soon as
possible.

Thanks!
Re: [ceph-users] Cluster Security
Hi,
if you want to isolate your HVs from ceph's public network, a gateway
would do that (like the iSCSI gateway). Note however that this will
also add an extra network hop and a potential bottleneck, since all
client traffic has to pass through the gateway node(s).

HTH,
Jan

On Wed, Sep 19, 2018 at 01:05:06PM +0200, Florian Florensa wrote:
> Hello everyone,
>
> I am currently working on the design of a ceph cluster, and I was
> asking myself some questions regarding the security of the cluster.
> (The cluster should be deployed using Luminous on Ubuntu 16.04.)
>
> Technically, we would have HVs exploiting the block storage, but we
> are in a position where we can't trust the VMs that are running. Thus
> the HV can eventually get compromised, so what can we do to keep a
> compromised hypervisor from compromising the safety of the data on the
> ceph cluster? Using iSCSI? Using one keyring per hypervisor? Anything
> else?
>
> Regards,
> Florian
Re: [ceph-users] mimic + cephmetrics + prometheus - working ?
I'm not the expert when it comes to cephmetrics, but I think (at least
until very recently) cephmetrics relies on other exporters besides the
mgr module and the node_exporter.

On Mon, Aug 27, 2018 at 01:11:29PM -0400, Steven Vacaroaia wrote:
> Hi,
>
> Has anyone been able to use Mimic + cephmetrics + prometheus?
> I am struggling to make it fully functional, as it appears data
> provided by node_exporter has a different name than the one grafana
> expects.
>
> As a result of the above, only certain dashboards are being populated
> (the ceph specific ones) while others show "no data points" (the
> server specific ones).
>
> Any advice/suggestions/troubleshooting tips will be greatly
> appreciated.
>
> Example: the grafana "latency by server" panel uses
> node_disk_read_time_ms, but node_exporter does not provide it:
>
>   curl http://osd01:9100/metrics | grep node_disk_read_time
>   # HELP node_disk_read_time_seconds_total The total number of milliseconds spent by all reads.
>   # TYPE node_disk_read_time_seconds_total counter
>   node_disk_read_time_seconds_total{device="dm-0"} 8910.801
>   node_disk_read_time_seconds_total{device="sda"} 0.525
>   node_disk_read_time_seconds_total{device="sdb"} 14221.732
>   node_disk_read_time_seconds_total{device="sdc"} 0.465
>   node_disk_read_time_seconds_total{device="sdd"} 0.46
>   node_disk_read_time_seconds_total{device="sde"} 0.017
>   node_disk_read_time_seconds_total{device="sdf"} 455.064
>   node_disk_read_time_seconds_total{device="sr0"} 0
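For reference, newer node_exporter releases renamed the disk timing counters and switched them to base units of seconds (node_disk_read_time_ms became node_disk_read_time_seconds_total, iirc with the 0.16 series), which is why dashboards built against the old names show no data. A self-contained Python sketch of the unit mismatch, using lines copied from the output above:

```python
# Parse the text exposition format shown above and rescale the new
# seconds-based counter to the milliseconds old dashboards expected.
sample = """\
# HELP node_disk_read_time_seconds_total The total number of milliseconds spent by all reads.
# TYPE node_disk_read_time_seconds_total counter
node_disk_read_time_seconds_total{device="sdb"} 14221.732
node_disk_read_time_seconds_total{device="sdf"} 455.064
"""

read_time_ms = {}
for line in sample.splitlines():
    if line.startswith("#"):
        continue  # skip HELP/TYPE comment lines
    name_labels, value = line.rsplit(" ", 1)
    device = name_labels.split('device="')[1].rstrip('"}')
    read_time_ms[device] = float(value) * 1000.0  # seconds -> ms

print(read_time_ms["sdb"])
```

On the dashboard side, the usual fix is to update the PromQL instead, e.g. query rate(node_disk_read_time_seconds_total[5m]) * 1000 wherever a panel expects milliseconds.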
Re: [ceph-users] mimic - troubleshooting prometheus
The prometheus plugin currently skips histogram perf counters. Their
representation in ceph is not compatible with prometheus' approach
(iirc). However, I believe most, if not all, of the perf counters
should be exported as long running averages. Look for metric pairs that
are named some_metric_name_sum and some_metric_name_count.

HTH,
Jan

On Fri, Aug 24, 2018 at 01:47:40PM -0400, Steven Vacaroaia wrote:
> Hi,
>
> Any ideas/suggestions for troubleshooting prometheus?
>
> What logs/commands are available to find out why OSD server specific
> data (IOPS, disk and network data) is not scraped, while cluster
> specific data (pools, capacity, etc.) is?
>
> Increasing the log level for MGR showed only the following:
>
>   2018-08-24 13:46:23.395 7f73d54ce700 20 mgr[prometheus] ignoring osd.op_r_latency_out_bytes_histogram, type
>   2018-08-24 13:46:23.395 7f73d54ce700 20 mgr[prometheus] ignoring osd.op_rw_latency_out_bytes_histogram, type
>   2018-08-24 13:46:23.395 7f73d54ce700 20 mgr[prometheus] ignoring osd.op_rw_latency_in_bytes_histogram, type
>   2018-08-24 13:46:23.395 7f73d54ce700 20 mgr[prometheus] ignoring osd.op_w_latency_in_bytes_histogram, type
>   2018-08-24 13:46:23.395 7f73d54ce700 20 mgr[prometheus] ignoring osd.op_r_latency_out_bytes_histogram, type
>   2018-08-24 13:46:23.396 7f73d54ce700 20 mgr[prometheus] ignoring osd.op_rw_latency_out_bytes_histogram, type
>   2018-08-24 13:46:23.396 7f73d54ce700 20 mgr[prometheus] ignoring osd.op_rw_latency_in_bytes_histogram, type
>   2018-08-24 13:46:23.396 7f73d54ce700 20 mgr[prometheus] ignoring osd.op_w_latency_in_bytes_histogram, type
>   2018-08-24 13:46:23.396 7f73d54ce700 20 mgr[prometheus] ignoring osd.op_r_latency_out_bytes_histogram, type
>   2018-08-24 13:46:23.397 7f73d54ce700 20 mgr[prometheus] ignoring osd.op_rw_latency_out_bytes_histogram, type
>   2018-08-24 13:46:23.397 7f73d54ce700 20 mgr[prometheus] ignoring osd.op_rw_latency_in_bytes_histogram, type
>   2018-08-24 13:46:23.397 7f73d54ce700 20 mgr[prometheus] ignoring osd.op_w_latency_in_bytes_histogram, type
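The _sum/_count pairs are enough to recover averages without histograms: sample the pair twice and divide the deltas. A self-contained sketch with made-up numbers (in PromQL the same computation is rate(some_metric_sum[5m]) / rate(some_metric_count[5m])):

```python
# Average latency over an interval from two scrapes of a
# some_metric_sum / some_metric_count counter pair.
def avg_between(prev, cur):
    dcount = cur["count"] - prev["count"]
    if dcount == 0:
        return 0.0  # no operations in the interval
    return (cur["sum"] - prev["sum"]) / dcount

# Two scrapes of a hypothetical latency counter pair for one OSD
# (sum in seconds, count in operations; numbers invented).
t0 = {"sum": 120.0, "count": 1000}
t1 = {"sum": 126.0, "count": 1500}

print(avg_between(t0, t1))  # average latency in seconds over the interval
```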
Re: [ceph-users] How to secure Prometheus endpoints (mgr plugin and node_exporter)
Hi Martin,
hope this is still useful, despite the lag.

On Fri, Jun 29, 2018 at 01:04:09PM +0200, Martin Palma wrote:
> Since Prometheus uses a pull model over HTTP for collecting metrics,
> what are the best practices to secure these HTTP endpoints?
>
> - With a reverse proxy with authentication?

This is currently the recommended way to secure prometheus traffic with
TLS or authentication. See
https://prometheus.io/docs/introduction/faq/#why-don-t-the-prometheus-server-components-support-tls-or-authentication-can-i-add-those
for more info. However, native support for TLS and authentication has
just been put on the roadmap in August.

> - Export the node_exporter only on the cluster network? (not usable
>   for the mgr plugin and for nodes like mons, mdss, ...)
> - No security at all?
>
> Best,
> Martin
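For the reverse-proxy option, a minimal nginx sketch (the hostname, certificate paths and htpasswd file are placeholders; 9283 is the mgr prometheus module's default port):

```nginx
# TLS + basic-auth proxy in front of the mgr prometheus endpoint.
# All names and paths below are examples, not requirements.
server {
    listen 443 ssl;
    server_name ceph-metrics.example.com;

    ssl_certificate     /etc/nginx/ssl/metrics.crt;
    ssl_certificate_key /etc/nginx/ssl/metrics.key;

    location /metrics {
        auth_basic           "metrics";
        auth_basic_user_file /etc/nginx/htpasswd;
        proxy_pass           http://127.0.0.1:9283/metrics;
    }
}
```

The Prometheus scrape config then points at the proxy with scheme: https and basic_auth credentials; the same pattern works in front of node_exporter (default port 9100).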
Re: [ceph-users] alert conditions
Fwiw, I added a few things to https://pad.ceph.com/p/alert-conditions
and will circulate this mail a bit wider. Or maybe there is not all
that much interest in alerting...

On Mon, Jul 23, 2018 at 06:10:04PM +0200, Jan Fajerski wrote:
> Hi community,
> the topic of alerting conditions for a ceph cluster comes up in
> various contexts. [...]
> If you have any conditions fitting that description, feel free to add
> them to https://pad.ceph.com/p/alert-conditions. Otherwise looking
> forward to feedback.
>
> jan
[ceph-users] alert conditions
Hi community,
the topic of alerting conditions for a ceph cluster comes up in various
contexts. Some folks use prometheus or grafana, (I believe) some people
would like snmp traps from ceph, the mgr dashboard could provide basic
alerting capabilities, and there is of course ceph -s. Also see
"Improving alerting/health checks" on ceph-devel.

Working on some prometheus stuff, I think it would be nice to have some
basic alerting rules in the ceph repo. This could serve as an
out-of-the-box default as well as an example or best practice for which
conditions should be watched. So I'm wondering: what does the community
think? What do operators use as alert conditions or find alert-worthy?

I'm aware that this is very open-ended, highly dependent on the cluster
and its workload, and can range from the obvious (health_err anyone?)
to intricate conditions that are designed for a certain cluster. I'm
wondering if we can distill some non-trivial alert conditions that ceph
itself does not (yet) provide.

If you have any conditions fitting that description, feel free to add
them to https://pad.ceph.com/p/alert-conditions. Otherwise looking
forward to feedback.

jan
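As a concrete starting point, a small Prometheus alerting-rules sketch along these lines. The metric names (ceph_health_status, ceph_osd_up) and the health value encoding (0=HEALTH_OK, 1=HEALTH_WARN, 2=HEALTH_ERR) are as exported by the mgr prometheus module in recent releases; verify them against your cluster before relying on this:

```yaml
# Example alerting rules; thresholds and "for" durations are
# illustrative defaults, tune them per cluster.
groups:
- name: ceph.rules
  rules:
  - alert: CephHealthError
    expr: ceph_health_status == 2
    for: 5m
    labels:
      severity: critical
    annotations:
      description: "Ceph has been in HEALTH_ERR for more than 5 minutes"
  - alert: CephOsdDown
    expr: count(ceph_osd_up == 0) > 0
    for: 15m
    labels:
      severity: warning
    annotations:
      description: "One or more OSDs have been down for 15 minutes"
```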
Re: [ceph-users] Show and Tell: Grafana cluster dashboard
On Mon, May 07, 2018 at 02:45:14PM +0200, Kurt Bauer wrote:
> Jan Fajerski wrote, 7. May 2018 at 14:21:
>> On Mon, May 07, 2018 at 02:05:59PM +0200, Kurt Bauer wrote:
>>> Hi Jan,
>>> first of all thanks for this dashboard. A few comments:
>>> -) 'vonage-status-panel' is needed, which isn't mentioned in the
>>>    ReadMe
>>
>> Yes, my bad. Will update the README.
>>
>>> -) Using ceph 12.2.4, the mon metric for me is apparently called
>>>    'ceph_mon_quorum_count', not 'ceph_mon_quorum_status'
>>
>> I'll also add to the readme: the dashboard is based on Ceph Mimic.
>>
>>> And a question: Is there a way to get the Cluster IOPS with
>>> prometheus metrics? I did this with collectd, but can't find a
>>> suitable metric from ceph-mgr.
>>
>> Yes...at least in Mimic the metrics are called ceph_osd_op[_r,_w,_rw]
>
> Thanks, these metrics are in Luminous too. I seem unable to find some
> sort of register to see which metrics mean what. Some are quite
> obvious, but others are a mystery. Does something like that exist
> somewhere?

Not yet. Most daemon specific metric names (like ceph_osd_op[_r,_w,_rw])
are derived directly from the respective perf counter names. The plugin
exports all perf counters with PRIO_INTERESTING or higher (iirc). An
automatically created index would certainly be feasible.

Thanks.

> Best regards,
> Kurt
>
> -- 
> Kurt Bauer <kurt.ba...@univie.ac.at>
> Vienna University Computer Center - ACOnet - VIX
> Universitaetsstrasse 7, A-1010 Vienna, Austria, Europe
> Tel: ++431 4277 - 14070 (Fax: - 814070)  KB1970-RIPE
Re: [ceph-users] Show and Tell: Grafana cluster dashboard
On Mon, May 07, 2018 at 02:05:59PM +0200, Kurt Bauer wrote:
> Hi Jan,
> first of all thanks for this dashboard. A few comments:
> -) 'vonage-status-panel' is needed, which isn't mentioned in the
>    ReadMe

Yes, my bad. Will update the README.

> -) Using ceph 12.2.4, the mon metric for me is apparently called
>    'ceph_mon_quorum_count', not 'ceph_mon_quorum_status'

I'll also add to the readme: the dashboard is based on Ceph Mimic.

> And a question: Is there a way to get the Cluster IOPS with prometheus
> metrics? I did this with collectd, but can't find a suitable metric
> from ceph-mgr.

Yes...at least in Mimic the metrics are called ceph_osd_op[_r,_w,_rw]

> Best regards,
> Kurt
>
> Jan Fajerski wrote, 7. May 2018 at 12:32:
>> Hi all,
>> I'd like to request comments and feedback about a Grafana dashboard
>> for Ceph cluster monitoring.
>>
>> https://youtu.be/HJquM127wMY
>> https://github.com/ceph/ceph/pull/21850
>>
>> The goal is to eventually have a set of default dashboards in the
>> Ceph repository that offer decent monitoring for clusters of various
>> (maybe even all) sizes and applications, or at least serve as a
>> starting point for customizations.
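For the cluster IOPS question, the per-OSD op counters can be aggregated in PromQL. A sketch using the metric names from this thread (adjust to what your release actually exports):

```promql
# Cluster-wide IOPS from the per-OSD op counters; since these are
# monotonically increasing counters, take a rate before summing.
sum(rate(ceph_osd_op[5m]))      # total ops/s across all OSDs
sum(rate(ceph_osd_op_r[5m]))    # read ops/s
sum(rate(ceph_osd_op_w[5m]))    # write ops/s
sum(rate(ceph_osd_op_rw[5m]))   # mixed read-write ops/s
```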
[ceph-users] Show and Tell: Grafana cluster dashboard
Hi all,
I'd like to request comments and feedback about a Grafana dashboard for
Ceph cluster monitoring.

https://youtu.be/HJquM127wMY
https://github.com/ceph/ceph/pull/21850

The goal is to eventually have a set of default dashboards in the Ceph
repository that offer decent monitoring for clusters of various (maybe
even all) sizes and applications, or at least serve as a starting point
for customizations.
Re: [ceph-users] Ceph-mgr Python error with prometheus plugin
On Fri, Feb 16, 2018 at 09:27:08AM +0100, Ansgar Jazdzewski wrote:
> Hi folks,
>
> I just tried to get the prometheus plugin up and running, but as soon
> as I browse /metrics I get:
>
>   500 Internal Server Error
>   The server encountered an unexpected condition which prevented it
>   from fulfilling the request.
>
>   Traceback (most recent call last):
>     File "/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 670, in respond
>       response.body = self.handler()
>     File "/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 217, in __call__
>       self.body = self.oldhandler(*args, **kwargs)
>     File "/usr/lib/python2.7/dist-packages/cherrypy/_cpdispatch.py", line 61, in __call__
>       return self.callable(*self.args, **self.kwargs)
>     File "/usr/lib/ceph/mgr/prometheus/module.py", line 386, in metrics
>       metrics = global_instance().collect()
>     File "/usr/lib/ceph/mgr/prometheus/module.py", line 323, in collect
>       self.get_metadata_and_osd_status()
>     File "/usr/lib/ceph/mgr/prometheus/module.py", line 283, in get_metadata_and_osd_status
>       dev_class['class'],
>   KeyError: 'class'

This error is part of the osd metadata metric. Which version of Ceph
are you running this with? Specifically, the crush map of this cluster
seems to not have the device class for each OSD yet.

> I assume that I have to change the mgr cephx key? But I am not 100%
> sure.
>
>   mgr.mgr01
>     key: AQAqLIRasocnChAAbOIEMKVEWWHCbgVeEctwng==
>     caps: [mds] allow *
>     caps: [mon] allow profile mgr
>     caps: [osd] allow *
>
> Thanks for your help,
> Ansgar
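For illustration, the traceback comes down to indexing a per-OSD metadata dict that has no 'class' key when the crush map predates device classes. A self-contained sketch of that shape and the defensive pattern that avoids the 500 (the dicts are invented and only mimic what the traceback implies, not the module's actual data):

```python
# Invented per-OSD metadata resembling the shape behind the traceback.
osd_metadata = [
    {"id": 0, "class": "hdd"},
    {"id": 1},  # OSD without a device class in the crush map
]

# dev_class["class"] raises KeyError for osd.1; dict.get() does not:
classes = [osd.get("class", "unknown") for osd in osd_metadata]
print(classes)
```

The cephx caps shown are not the problem here; a 500 from the plugin is an exception inside the module, not a permission error.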
Re: [ceph-users] formatting bytes and object counts in ceph status ouput
On Tue, Jan 02, 2018 at 04:54:55PM +, John Spray wrote: On Tue, Jan 2, 2018 at 10:43 AM, Jan Fajerski <jfajer...@suse.com> wrote: Hi lists, Currently the ceph status output formats all numbers with binary unit prefixes, i.e. 1MB equals 1048576 bytes and an object count of 1M equals 1048576 objects. I received a bug report from a user that printing object counts with a base 2 multiplier is confusing (I agree) so I opened a bug and https://github.com/ceph/ceph/pull/19117. In the PR discussion a couple of questions arose that I'd like to get some opinions on: - Should we print binary unit prefixes (MiB, GiB, ...) since that would be technically correct? I'm not a fan of the technically correct base 2 units -- they're still relatively rarely used, and I've spent most of my life using kB to mean 1024, not 1000. We could start changing the "rarely used" part ;) But I can certainly live with keeping the old units. - Should counters (like object counts) be formatted with a base 10 multiplier or a multiplier woth base 2? I prefer base 2 for any dimensionless quantities (or rates thereof) in computing. Metres and kilograms go in base 10, bytes go in base 2. It's all very subjective and a matter of opinion of course, and my feelings aren't particularly strong :-) As far as I understand the standards regarding this (IEC 60027, ISO/IEC 8, probably more) are talking about base 2 units for digital data related units only. I might of course misunderstand. What is problematic I find is that other tools will (mostly?) use base 10 units for everything not data related. Say I plot the object count of ceph in Grafana. It'll use base 10 multipliers for a dimensionless number. Since Grafana (and I imagine other toolsllike this) consume raw numbers we'll end up with Grafana displaying a different object count then "ceph -s". Say 1.04M vs 1M. Now this is not terrible but it'll get worse with higher counts quickly. 
In the original tracker issue it's noted that this was reported on a cluster containing 7150896726 objects. The difference between Grafana and "ceph -s" was 7150M vs 6835M.

>John
>
>>My proposal would be to use both binary unit prefixes and base 10
>>multipliers for counters. I think this aligns with user expectations as
>>well as the relevant standard(s?).
>>Best,
>>Jan

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
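The divergence between the two displays can be reproduced with a small formatter sketch. This assumes simple truncating division; the exact rounding inside "ceph -s" and Grafana may differ slightly, which would account for small deviations from the figures quoted above.

```python
# Format a dimensionless count with an SI (base 10) and a binary (base 2)
# "M" multiplier, to show how the same raw number diverges between tools.
# Illustrative helpers only, not the actual Ceph or Grafana code.

def count_si(n):
    """Base 10: 1M = 1,000,000 -- what Grafana-style tools display."""
    return "%dM" % (n // 10**6)

def count_binary(n):
    """Base 2: 1M = 1,048,576 -- the old 'ceph -s' convention."""
    return "%dM" % (n // 2**20)

n = 7150896726  # object count from the tracker issue
print(count_si(n), "vs", count_binary(n))  # 7150M vs 6819M
```

The gap grows with the count: at ~7 billion objects the two conventions already disagree by several hundred "M".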
[ceph-users] formatting bytes and object counts in ceph status output
Hi lists,
Currently the ceph status output formats all numbers with binary unit prefixes, i.e. 1MB equals 1048576 bytes and an object count of 1M equals 1048576 objects. I received a bug report from a user that printing object counts with a base 2 multiplier is confusing (I agree), so I opened a bug and https://github.com/ceph/ceph/pull/19117.
In the PR discussion a couple of questions arose that I'd like to get some opinions on:
- Should we print binary unit prefixes (MiB, GiB, ...), since that would be technically correct?
- Should counters (like object counts) be formatted with a base 10 multiplier or a multiplier with base 2?
My proposal would be to use both binary unit prefixes and base 10 multipliers for counters. I think this aligns with user expectations as well as the relevant standard(s?).
Best,
Jan
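The proposal (IEC binary prefixes for byte quantities, SI multipliers for dimensionless counts) can be sketched as a pair of formatters. These are hypothetical helpers for illustration, not the actual Ceph implementation:

```python
# Sketch of the proposed output conventions: bytes get unambiguous IEC
# binary prefixes (KiB, MiB, ...), while dimensionless object counts get
# SI multipliers (k, M, ...). Illustrative only.

def format_bytes(n):
    """1 MiB = 1048576 bytes, printed with an explicit IEC suffix."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return "%.1f %s" % (n, unit)
        n /= 1024.0

def format_count(n):
    """1M = 1,000,000 objects, matching what base-10 tools display."""
    for unit in ("", "k", "M", "G"):
        if n < 1000 or unit == "G":
            return "%.1f%s" % (n, unit)
        n /= 1000.0

print(format_bytes(1048576))  # 1.0 MiB
print(format_count(1048576))  # 1.0M
```

Under this convention the same raw number is rendered differently depending on whether it is a byte quantity or a count, which is exactly what keeps both displays consistent with external tools.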
[ceph-users] FOSDEM Call for Participation: Software Defined Storage devroom
CfP for the Software Defined Storage devroom at FOSDEM 2018 (Brussels, Belgium, February 4th).

FOSDEM is a free software event that offers open source communities a place to meet, share ideas and collaborate. It is renowned for being highly developer-oriented and brings together 8000+ participants from all over the world. It is held in the city of Brussels (Belgium). FOSDEM 2018 will take place during the weekend of February 3rd-4th 2018. More details about the event can be found at http://fosdem.org/

** Call For Participation

The Software Defined Storage devroom will go into its second round of talks around open source software defined storage projects, management tools and real world deployments.

Presentation topics could include, but are not limited to:

- Your work on an SDS project like Ceph, GlusterFS or LizardFS
- Your work on or with SDS-related projects like SWIFT or Container Storage Interface
- Management tools for SDS deployments
- Monitoring tools for SDS clusters

** Important dates:

- 26 Nov 2017: submission deadline for talk proposals
- 15 Dec 2017: announcement of the final schedule
-  4 Feb 2018: Software Defined Storage devroom

Talk proposals will be reviewed by a steering committee:

- Leonardo Vaz (Ceph Community Manager - Red Hat Inc.)
- Joao Luis (Core Ceph contributor - SUSE)
- Jan Fajerski (Ceph Developer - SUSE)

Use the FOSDEM 'pentabarf' tool to submit your proposal:
https://penta.fosdem.org/submission/FOSDEM18

- If necessary, create a Pentabarf account and activate it. Please reuse your account from previous years if you have already created one.
- In the "Person" section, provide First name, Last name (in the "General" tab), Email (in the "Contact" tab) and Bio ("Abstract" field in the "Description" tab).
- Submit a proposal by clicking on "Create event".
- Important! Select the "Software Defined Storage devroom" track (on the "General" tab).
- Provide the title of your talk ("Event title" in the "General" tab).
- Provide a description of the subject of the talk and the intended audience (in the "Abstract" field of the "Description" tab).
- Provide a rough outline of the talk or goals of the session (a short list of bullet points covering topics that will be discussed) in the "Full description" field in the "Description" tab.
- Provide the expected length of your talk in the "Duration" field. Please allow at least 10 minutes for discussion in your proposal. Suggested talk lengths are 15, 20+10, 30+15, and 45+15 minutes.

** Recording of talks

The FOSDEM organizers plan to have live streaming and recording fully working, both for remote/later viewing of talks, and so that people can watch streams in the hallways when rooms are full. This requires speakers to consent to being recorded and streamed. If you plan to be a speaker, please understand that by doing so you implicitly give consent for your talk to be recorded and streamed. The recordings will be published under the same license as all FOSDEM content (CC-BY).

Hope to hear from you soon! And please forward this announcement.