Hi Greg,

It's definitely not scrub or deep-scrub, as those are switched off for
testing. Anything else you'd look at as a possible culprit here?
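For reference, scrubbing was disabled cluster-wide before the test runs. A quick sketch of the commands involved, assuming the standard `noscrub`/`nodeep-scrub` cluster flags were used (rather than per-OSD options in ceph.conf):

```shell
# Disable both scrub types cluster-wide before benchmarking
ceph osd set noscrub
ceph osd set nodeep-scrub

# Verify the flags are active; both should appear in the flags line
ceph osd dump | grep flags

# "ceph -s" also reports noscrub,nodeep-scrub in the cluster status
ceph -s
```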

Thanks, Patrick

On Mon, May 15, 2017 at 5:51 PM, Gregory Farnum <[email protected]> wrote:
> Did you try correlating it with PG scrubbing or other maintenance behaviors?
> -Greg
>
> On Thu, May 11, 2017 at 12:47 PM, Patrick Dinnen <[email protected]> wrote:
>> Seeing some odd behaviour while testing using rados bench. This is on
>> a pre-split pool, two node cluster with 12 OSDs total.
>>
>> ceph osd pool create newerpoolofhopes 2048 2048 replicated ""
>> replicated_ruleset 500000000
>>
>> rados -p newerpoolofhopes bench -t 32 -b 20000 30000000 write --no-cleanup
>>
>> Using Prometheus/Grafana to watch what's going on, we see oddly
>> regular peaks and dips in write performance. The frequency changes
>> gradually, but it's on the order of hours (not the seconds that might
>> be easier to explain by system phenomena). It starts off at roughly
>> one cycle per hour, and we've seen it persist over multiple days of
>> constant bench running with nothing else happening on the cluster.
>>
>> A bunch of graphs showing the pattern:
>>
>> https://ibb.co/djXUVk
>> https://ibb.co/gMNk35
>> https://ibb.co/iKViqk
>> https://ibb.co/jOXJO5
>> https://ibb.co/isUMbQ
>>
>> sdg and sdi are SSD journal disks. The activity on the OSDs and SSDs
>> seems anti-correlated. SSDs peak in activity as OSDs reach the bottom
>> of the trough. Then the reverse. Repeat.
>>
>> Does anyone have any suggestions as to what could possibly be causing
>> a regular pattern like this at such a low frequency?
>>
>> Thanks, Patrick Dinnen
>> _______________________________________________
>> ceph-users mailing list
>> [email protected]
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com