It seems that some bottleneck is blocking the I/O: when the bottleneck is 
reached, I/O is blocked and the curve goes down; when it is released, I/O 
resumes and the curve goes up.

-----Original Message-----
From: ceph-users [mailto:[email protected]] On Behalf Of Patrick Dinnen
Sent: May 12, 2017 3:47
To: [email protected]
Subject: [ceph-users] Odd cyclical cluster performance

Seeing some odd behaviour while testing using rados bench. This is on a 
pre-split pool, two node cluster with 12 OSDs total.

ceph osd pool create newerpoolofhopes 2048 2048 replicated "" replicated_ruleset 500000000

rados -p newerpoolofhopes bench -t 32 -b 20000 30000000 write --no-cleanup

Using Prometheus/Grafana to watch what's going on, we see oddly regular peaks 
and dips in write performance. The period changes gradually, but it is on the 
order of hours (not the seconds that would be easier to explain by system 
phenomena). It starts off at roughly one cycle per hour, and we've seen it over 
multiple days of constant bench running with nothing else happening on the 
cluster.
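
To put a number on the cycle length rather than reading it off the graphs, one 
option is to look for the first strong peak in the autocorrelation of the 
throughput series. The following is only a rough sketch: the function name 
estimate_period is hypothetical, the data is synthetic (one sample per minute 
with a 60-minute cycle), and it assumes the real series has been exported from 
Prometheus as a plain list of evenly spaced samples.

```python
import math


def estimate_period(samples):
    """Estimate the dominant cycle length of an evenly sampled series.

    Returns the lag (in samples) of the highest autocorrelation peak found
    after the autocorrelation first drops below zero, or None if the series
    is constant or never decorrelates.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    variance = sum(c * c for c in centered)
    if variance == 0:
        return None

    # acf[k] holds the (biased) autocorrelation at lag k + 1.
    acf = [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / variance
        for lag in range(1, n // 2 + 1)
    ]

    # Skip the initial positive lobe around lag 0, which would otherwise win.
    first_negative = next((k for k, v in enumerate(acf) if v < 0), None)
    if first_negative is None:
        return None

    best = max(range(first_negative, len(acf)), key=lambda k: acf[k])
    return best + 1


# Synthetic stand-in for the bench throughput: one sample per minute,
# oscillating with a 60-minute cycle (an assumption, not measured data).
throughput = [100 + 50 * math.sin(2 * math.pi * t / 60) for t in range(600)]
period = estimate_period(throughput)
print("estimated period in samples:", period)
```

Running the same function over successive windows of the real data would show 
how the cycle length drifts over the course of a multi-day run.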

A bunch of graphs showing the pattern:

https://ibb.co/djXUVk
https://ibb.co/gMNk35
https://ibb.co/iKViqk
https://ibb.co/jOXJO5
https://ibb.co/isUMbQ

sdg and sdi are SSD journal disks. The activity on the OSDs and SSDs seems 
anti-correlated. SSDs peak in activity as OSDs reach the bottom of the trough. 
Then the reverse. Repeat.
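
The anti-correlation could also be checked numerically instead of by eye, by 
computing the Pearson correlation between an OSD disk series and an SSD journal 
series. This is a minimal sketch under assumptions: the pearson helper and both 
series below are illustrative, with synthetic values standing in for the 
utilisation exported from Prometheus at matching timestamps.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / (spread_x * spread_y)


# Synthetic stand-ins (assumptions, not measured data): OSD utilisation
# rises while SSD journal utilisation falls, on the same 60-sample cycle.
osd_util = [50 + 30 * math.sin(2 * math.pi * t / 60) for t in range(600)]
ssd_util = [40 - 20 * math.sin(2 * math.pi * t / 60) for t in range(600)]

r = pearson(osd_util, ssd_util)
print("correlation between OSD and SSD activity:", r)
```

A value close to -1 would support the anti-correlation seen in the graphs, 
while a value near 0 would suggest the two series are shifted in phase rather 
than strictly opposed.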

Does anyone have any suggestions as to what could possibly be causing a regular 
pattern like this at such a low frequency?

Thanks, Patrick Dinnen
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com