Re: [ceph-users] Show and Tell: Grafana cluster dashboard

2018-06-04 Thread Lenz Grimmer
On 05/08/2018 07:21 AM, Kai Wagner wrote:

> Looks very good. Is it somehow possible to display the reason why a
> cluster is in an error or warning state? I'm thinking of the output of
> ceph -s, which could be shown in case of a failure. I assume this
> will not be provided by default, but I wonder whether it could be added.

Sorry for the late reply. We actually discussed this aspect during one
of the calls we had about integrating the Grafana dashboards into the
Ceph Manager Dashboard. This kind of state information is somewhat
difficult to track and visualize using Prometheus/Grafana (or any other
TSDB, FWIW), as you can't store the actual reasons why the cluster is
in HEALTH_WARN or HEALTH_ERR, for example.

We are therefore considering displaying this information in the form of
"native" widgets on the Manager Dashboard, and using the Grafana
dashboards for visualizing the other more suitable performance metrics.
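
To illustrate the limitation: a Prometheus sample is a set of labels plus
one floating-point value, so cluster health can only be exported as a
number. The sketch below assumes the ceph-mgr prometheus module exposes a
ceph_health_status gauge where 0, 1 and 2 stand for HEALTH_OK, HEALTH_WARN
and HEALTH_ERR. That mapping and the helper name are assumptions made for
illustration; they are not taken from this thread.

```python
# Assumed mapping of the numeric health gauge to Ceph health states.
HEALTH_STATES = {0: "HEALTH_OK", 1: "HEALTH_WARN", 2: "HEALTH_ERR"}


def describe_health(sample_value):
    """Translate a numeric health sample into a state name.

    The sample carries no text, so the reason for the state cannot be
    recovered from the time series alone.
    """
    state = HEALTH_STATES.get(int(sample_value), "UNKNOWN")
    return {"state": state, "reason": None}


if __name__ == "__main__":
    print(describe_health(1.0))
```

The "reason" field stays empty on purpose: it is exactly the part that
would have to come from a native widget rather than from the TSDB.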

Lenz

-- 
SUSE Linux GmbH - Maxfeldstr. 5 - 90409 Nuernberg (Germany)
GF:Felix Imendörffer,Jane Smithard,Graham Norton,HRB 21284 (AG Nürnberg)



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Show and Tell: Grafana cluster dashboard

2018-05-07 Thread Kai Wagner
Looks very good. Is it somehow possible to display the reason why a
cluster is in an error or warning state? I'm thinking of the output of
ceph -s, which could be shown in case of a failure. I assume this
will not be provided by default, but I wonder whether it could be added.

Kai

On 05/07/2018 04:53 PM, Reed Dier wrote:
> I think supporting both paths would be the best choice.
That's the way we should go. Supporting both or in general as much as
possible (try to be generic)

-- 
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)






Re: [ceph-users] Show and Tell: Grafana cluster dashboard

2018-05-07 Thread Konstantin Shalygin

> And a question:
> Is there a way to get the Cluster IOPS with prometheus metrics? I did
> this with collectd, but can't find a suitable metric from ceph-mgr.



sum(irate(ceph_pool_rd[30s]))

sum(irate(ceph_pool_wr[30s]))
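
As a rough illustration of what these two queries compute: irate takes the
last two samples of each series inside the range window and divides the
counter increase by the elapsed time, and sum adds up the per-pool results.
With a [30s] window this only yields a value if at least two scrapes fall
within 30 seconds. The sketch below mimics that in plain Python; the sample
data and function names are made up, and a counter reset is handled by
taking the latest value as the increase.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp, value) samples."""
    if len(samples) < 2:
        return None  # the window holds fewer than two scrapes
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    if t1 <= t0:
        return None
    # A counter that went down is assumed to have been reset to zero.
    increase = v1 if v1 < v0 else v1 - v0
    return increase / (t1 - t0)


def cluster_rate(series_by_pool):
    """Sum the per-pool instantaneous rates, skipping pools without data."""
    rates = [irate(samples) for samples in series_by_pool.values()]
    return sum(rate for rate in rates if rate is not None)


# Hypothetical read-op counters for two pools, scraped 10 s apart.
EXAMPLE_RD = {
    "pool-a": [(0, 1000), (10, 1500), (20, 2100)],
    "pool-b": [(0, 200), (10, 250), (20, 350)],
}

if __name__ == "__main__":
    print(cluster_rate(EXAMPLE_RD))
```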




k



Re: [ceph-users] Show and Tell: Grafana cluster dashboard

2018-05-07 Thread Dietmar Rieder
+1 for supporting both!

Disclosure: Prometheus user

Dietmar

On 05/07/2018 04:53 PM, Reed Dier wrote:
> I’ll +1 on InfluxDB rather than Prometheus, though I think having a version 
> for each infrastructure path would be best.
> I’m sure plenty here have existing InfluxDB infrastructure as their TSDB of 
> choice, and moving to Prometheus would be less advantageous.
> 
> Conversely, I’m sure all of the Prometheus folks would be less inclined to 
> move to InfluxDB for TSDB, so I think supporting both paths would be the best 
> choice.
> 
> Reed


Re: [ceph-users] Show and Tell: Grafana cluster dashboard

2018-05-07 Thread Wido den Hollander


On 05/07/2018 04:53 PM, Reed Dier wrote:
> I’ll +1 on InfluxDB rather than Prometheus, though I think having a version 
> for each infrastructure path would be best.
> I’m sure plenty here have existing InfluxDB infrastructure as their TSDB of 
> choice, and moving to Prometheus would be less advantageous.
> 

To add, I have a PR open for a Telegraf Mgr module in addition to
InfluxDB: https://github.com/ceph/ceph/pull/21782

Wido

> Conversely, I’m sure all of the Prometheus folks would be less inclined to 
> move to InfluxDB for TSDB, so I think supporting both paths would be the best 
> choice.
> 
> Reed


Re: [ceph-users] Show and Tell: Grafana cluster dashboard

2018-05-07 Thread Reed Dier
I’ll +1 on InfluxDB rather than Prometheus, though I think having a version for 
each infrastructure path would be best.
I’m sure plenty here have existing InfluxDB infrastructure as their TSDB of 
choice, and moving to Prometheus would be less advantageous.

Conversely, I’m sure all of the Prometheus folks would be less inclined to move 
to InfluxDB for TSDB, so I think supporting both paths would be the best choice.

Reed



Re: [ceph-users] Show and Tell: Grafana cluster dashboard

2018-05-07 Thread Marc Roos

Looks nice 

- I'd rather have some dashboards with collectd/influxdb.
- Take into account bigger TVs/screens, e.g. 65" UHD. I am putting more 
stats on them than I would when viewing them locally in a web browser.
- What is to be considered most important to have on your ceph 
dashboard? As a newbie I find it difficult to determine what is 
important to monitor.
- Maybe also some docs on which metrics you have used and the reasoning 
behind how you used them (could be useful if one wants to modify the 
dashboard for some other backend)

Ceph performance counters description.
https://access.redhat.com/documentation/en/red-hat-ceph-storage/1.3/paged/administration-guide/chapter-9-performance-counters


-----Original Message-----
From: Jan Fajerski [mailto:jfajer...@suse.com] 
Sent: Monday, 7 May 2018 12:32
To: ceph-devel
Cc: ceph-users
Subject: [ceph-users] Show and Tell: Grafana cluster dashboard

Hi all,
I'd like to request comments and feedback about a Grafana dashboard for 
Ceph cluster monitoring.

https://youtu.be/HJquM127wMY

https://github.com/ceph/ceph/pull/21850

The goal is to eventually have a set of default dashboards in the Ceph 
repository that offer decent monitoring for clusters of various (maybe 
even all) sizes and applications, or at least serve as a starting point 
for customizations.


Re: [ceph-users] Show and Tell: Grafana cluster dashboard

2018-05-07 Thread Jan Fajerski

On Mon, May 07, 2018 at 02:45:14PM +0200, Kurt Bauer wrote:
> Jan Fajerski wrote on 7. May 2018 at 14:21:
>> On Mon, May 07, 2018 at 02:05:59PM +0200, Kurt Bauer wrote:
>>
>>> Hi Jan,
>>> first of all thanks for this dashboard.
>>> A few comments:
>>> -) 'vonage-status-panel' is needed, which isn't mentioned in the ReadMe
>>
>> Yes, my bad. Will update the README
>>
>>> -) Using ceph 12.2.4 the mon metric for me is apparently called
>>> 'ceph_mon_quorum_count' not 'ceph_mon_quorum_status'
>>
>> I'll also add to the readme: The dashboard is based on Ceph Mimic.
>>
>>> And a question:
>>> Is there a way to get the Cluster IOPS with prometheus metrics? I did
>>> this with collectd, but can't find a suitable metric from ceph-mgr.
>>
>> Yes...at least in Mimic the metrics are called ceph_osd_op[_r,_w,_rw]
>
> Thanks, these metrics are in Luminous too. I seem unable to find some
> sort of register, to see which metrics mean what. Some are quite
> obvious, but others are a mystery. Does something like that exist
> somewhere?

Not yet.
Most daemon-specific metric names (like ceph_osd_op[_r,_w,_rw]) are derived 
directly from the respective perf counter names. The plugin exports all perf 
counters with PRIO_INTERESTING or higher (iirc).

An automatically created index would certainly be feasible.
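
A minimal sketch of what such an index could look like, under several
assumptions: the input is a nested dict of daemon type, counter name and
counter info (loosely modelled on perf schema output, but the exact shape
here is hypothetical), the threshold value for PRIO_INTERESTING is assumed
to be 8, and the naming rule (prefix with ceph_, replace other characters
with underscores) is a guess at the general pattern rather than the
plugin's actual code.

```python
import re

# Assumed numeric value; the thread only says "PRIO_INTERESTING or higher".
PRIO_INTERESTING = 8


def metric_name(daemon_type, counter):
    """Hypothetical naming rule: ceph_<daemon>_<counter>."""
    raw = "ceph_{}_{}".format(daemon_type, counter)
    return re.sub(r"[^A-Za-z0-9_]", "_", raw)


def build_index(schema, min_priority=PRIO_INTERESTING):
    """Map exported metric names to the counter descriptions."""
    index = {}
    for daemon_type, counters in schema.items():
        for counter, info in counters.items():
            if info.get("priority", 0) >= min_priority:
                name = metric_name(daemon_type, counter)
                index[name] = info.get("description", "")
    return dict(sorted(index.items()))


# Made-up schema excerpt, for illustration only.
SAMPLE_SCHEMA = {
    "osd": {
        "op_r": {"priority": 8, "description": "Client read operations"},
        "op_w": {"priority": 8, "description": "Client write operations"},
        "debug_only": {"priority": 0, "description": "Low-priority counter"},
    },
}

if __name__ == "__main__":
    for name, description in build_index(SAMPLE_SCHEMA).items():
        print("{}: {}".format(name, description))
```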





--
Jan Fajerski
Engineer Enterprise Storage
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)


Re: [ceph-users] Show and Tell: Grafana cluster dashboard

2018-05-07 Thread Kurt Bauer




Jan Fajerski wrote on 7. May 2018 at 14:21:
> On Mon, May 07, 2018 at 02:05:59PM +0200, Kurt Bauer wrote:
>
>> Hi Jan,
>> first of all thanks for this dashboard.
>> A few comments:
>> -) 'vonage-status-panel' is needed, which isn't mentioned in the ReadMe
>
> Yes, my bad. Will update the README
>
>> -) Using ceph 12.2.4 the mon metric for me is apparently called
>> 'ceph_mon_quorum_count' not 'ceph_mon_quorum_status'
>
> I'll also add to the readme: The dashboard is based on Ceph Mimic.
>
>> And a question:
>> Is there a way to get the Cluster IOPS with prometheus metrics? I did
>> this with collectd, but can't find a suitable metric from ceph-mgr.
>
> Yes...at least in Mimic the metrics are called ceph_osd_op[_r,_w,_rw]

Thanks, these metrics are in Luminous too. I can't seem to find any 
sort of register that explains which metrics mean what. Some are quite 
obvious, but others are a mystery. Does something like that exist somewhere?


Thanks.


  Best regards,
  Kurt


--

Kurt Bauer
Vienna University Computer Center - ACOnet - VIX
Universitaetsstrasse 7, A-1010 Vienna, Austria, Europe
Tel: ++431 4277  - 14070 (Fax: - 814070)  KB1970-RIPE



Re: [ceph-users] Show and Tell: Grafana cluster dashboard

2018-05-07 Thread Jan Fajerski

On Mon, May 07, 2018 at 02:05:59PM +0200, Kurt Bauer wrote:

  Hi Jan,
  first of all thanks for this dashboard.
  A few comments:
  -) 'vonage-status-panel' is needed, which isn't mentioned in the ReadMe

Yes, my bad. Will update the README

  -) Using ceph 12.2.4 the mon metric for me is apparently called
  'ceph_mon_quorum_count' not 'ceph_mon_quorum_status'

I'll also add to the readme: The dashboard is based on Ceph Mimic.

  And a question:
  Is there a way to get the Cluster IOPS with prometheus metrics? I did
  this with collectd, but can't find a suitable metric from ceph-mgr.

Yes...at least in Mimic the metrics are called ceph_osd_op[_r,_w,_rw]
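
Spelled out, and reading the bracket notation as a set of optional
suffixes (an assumption), that gives four metric names. The sketch below
expands them and wraps each in a rate query to get operations per second;
the 1m window and the helper names are illustrative choices, not something
the dashboard prescribes.

```python
def expand_metric_names(base="ceph_osd_op", suffixes=("", "_r", "_w", "_rw")):
    """Expand the ceph_osd_op[_r,_w,_rw] shorthand into full metric names."""
    return [base + suffix for suffix in suffixes]


def iops_queries(window="1m"):
    """Build one cluster-wide PromQL rate query per metric name."""
    return {
        name: "sum(rate({}[{}]))".format(name, window)
        for name in expand_metric_names()
    }


if __name__ == "__main__":
    for name, query in iops_queries().items():
        print("{:<16} {}".format(name, query))
```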




--
Jan Fajerski
Engineer Enterprise Storage
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)


Re: [ceph-users] Show and Tell: Grafana cluster dashboard

2018-05-07 Thread Kurt Bauer

Hi Jan,

first of all thanks for this dashboard.
A few comments:
-) 'vonage-status-panel' is needed, which isn't mentioned in the ReadMe
-) Using ceph 12.2.4 the mon metric for me is apparently called 
'ceph_mon_quorum_count' not 'ceph_mon_quorum_status'


And a question:
Is there a way to get the Cluster IOPS with prometheus metrics? I did 
this with collectd, but can't find a suitable metric from ceph-mgr.


Best regards,
Kurt




[ceph-users] Show and Tell: Grafana cluster dashboard

2018-05-07 Thread Jan Fajerski

Hi all,
I'd like to request comments and feedback about a Grafana dashboard for Ceph 
cluster monitoring.


https://youtu.be/HJquM127wMY

https://github.com/ceph/ceph/pull/21850

The goal is to eventually have a set of default dashboards in the Ceph 
repository that offer decent monitoring for clusters of various (maybe even all) 
sizes and applications, or at least serve as a starting point for 
customizations.
