https://github.com/drakonstein/cephbot
This is something that I've been using and working on for a while. My
Python abilities are subpar at best, but this has been very useful for me
in my environments. I use it for my home cluster and for multiple clusters
at work. The biggest gain from this is being able to check the status of a
cluster when a page goes out saying something is wrong. You can simply ask
cephbot without needing to pull out a laptop, connect to a VPN, etc.
This is my first git project and the README's getting-started instructions
are pretty lacking. If anyone is interested in trying out the instructions
and letting me know what's missing, or in submitting a PR with better
instructions after getting it working, that would be greatly
appreciated. :)
Here are a few examples of cephbot in action. Note that I have multiple
instances running with the same Slack bot token and ID, and that I'm using
CLUSTER_GROUP groupings to make checking multiple clusters at once easy.
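For anyone wondering how the multi-instance setup hangs together, here is a
minimal sketch of per-instance settings and how an instance could decide
whether a message is addressed to it. The variable names and the matching
logic are assumptions for illustration, not necessarily what cephbot
actually reads:

import os

# Shared by every instance so they all listen as the same Slack bot user
# (names are hypothetical, adjust to whatever your deployment uses).
SLACK_BOT_TOKEN = os.environ["SLACK_BOT_TOKEN"]
SLACK_BOT_ID = os.environ["SLACK_BOT_ID"]

# Unique per instance: the cluster it talks to and the group it belongs to,
# e.g. CEPH_CLUSTER="c3", CLUSTER_GROUP="prod".
CEPH_CLUSTER = os.environ.get("CEPH_CLUSTER", "ceph")
CLUSTER_GROUP = os.environ.get("CLUSTER_GROUP", "")

def message_is_for_me(text):
    """Answer if the query names this instance's cluster or its group."""
    target = text.strip().split()[0].lower()
    return target in (CEPH_CLUSTER.lower(), CLUSTER_GROUP.lower())

With something like that, "@cephbot prod health" gets answered by every
instance whose group is "prod", which is why several clusters reply to one
message in the examples below.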
David Turner [12:20 PM]
@cephbot help
cephbot APP [12:20 PM]
c3: status, health, io, osd stat, mon stat, pg stat, down osds, blocked
requests, df, osd df, fs dump, pool io
f1: status, health, io, osd stat, mon stat, pg stat, down osds, blocked
requests, df, osd df, fs dump, pool io
c1: status, health, io, osd stat, mon stat, pg stat, down osds, blocked
requests, df, osd df, pool io
c5: status, health, io, osd stat, mon stat, pg stat, down osds, blocked
requests, df, osd df, pool io
f3: status, health, io, osd stat, mon stat, pg stat, down osds, blocked
requests, df, osd df, fs dump, pool io
f2: status, health, io, osd stat, mon stat, pg stat, down osds, blocked
requests, df, osd df, fs dump, pool io
David Turner [12:21 PM]
@cephbot prod health
cephbot APP [12:21 PM]
f3: HEALTH_OK
f2: HEALTH_OK
c3: HEALTH_WARN noout flag(s) set; 16 osds down; 2 hosts (16 osds) down;
Degraded data redundancy: 163014044/3748375787 objects degraded (4.349%),
2062 pgs unclean, 2062 pgs degraded, 2062 pgs undersized
c5: HEALTH_OK
David Turner [12:21 PM]
@cephbot c3 down osds
cephbot APP [12:21 PM]
ssd
sto5-ssd
osd.215
default
sto5
osd.60, osd.61, osd.62, osd.63, osd.64, osd.65, osd.66, osd.67,
osd.68, osd.69, osd.70, osd.71, osd.72, osd.73, osd.74
David Turner [12:24 PM]
@cephbot prod io
cephbot APP [12:24 PM]
c3: client: 49005 kB/s rd, 59702 kB/s wr, 337 op/s rd, 248 op/s wr
f2: client: 85 B/s rd, 8610 B/s wr, 0 op/s rd, 3 op/s wr
f3: client: 58876 B/s rd, 151 kB/s wr, 3 op/s rd, 18 op/s wr
c5: nothing is going on
David Turner [12:28 PM]
@cephbot stage osd stat
cephbot APP [12:28 PM]
c1: 25 osds: 25 up, 25 in
f1: 12 osds: 12 up, 12 in
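If it helps anyone reading the examples above, here is a rough sketch of one
way an instance could turn a query like "health" or "osd stat" into a Ceph
CLI call and build a reply. The command table and the run_ceph helper are
illustrative only, not cephbot's actual implementation:

import subprocess

# Map of chat queries to ceph CLI arguments (illustrative subset).
COMMANDS = {
    "health": ["health"],
    "status": ["status"],
    "osd stat": ["osd", "stat"],
    "down osds": ["osd", "tree", "down"],
}

def run_ceph(cluster, query):
    """Run the ceph CLI against the named cluster and return its output."""
    args = COMMANDS.get(query)
    if args is None:
        return "unknown command"
    result = subprocess.run(
        ["ceph", "--cluster", cluster] + args,
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout.strip() or result.stderr.strip()

# Example: produce a reply of the form "c3: HEALTH_OK"
print("c3: " + run_ceph("c3", "health"))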