Re: [ceph-users] ceph osd commit latency increase over time, until restart

Igor Fedotov Fri, 01 Mar 2019 02:25:14 -0800

Hi Chen,

thanks for the update. I will prepare a patch to periodically reset StupidAllocator today.

And just to let you know, below is an e-mail from AdamK from RH which might explain the issue with the allocator.

Also please note that StupidAllocator might not perform full defragmentation at run time. That's why we observed (mentioned somewhere in the thread) fragmentation growth while the OSD is running and its drop on restart. Such a restart rebuilds the internal tree and eliminates defragmentation flaws. Maybe that's the case.
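
As an illustration of how a rebuild could lower fragmentation, here is a minimal C++ sketch. It is not Ceph code: it assumes free space is modelled as an offset-to-length map and that a rebuild merges extents that touch each other. The function name coalesce() is hypothetical, and the real StupidAllocator structures differ.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical model (assumption, not StupidAllocator code):
// free space is a map of offset -> length, sorted by offset.
// coalesce() merges extents whose ranges touch, as a full rebuild of the
// in-memory structure might do.
std::map<uint64_t, uint64_t> coalesce(const std::map<uint64_t, uint64_t>& extents) {
  std::map<uint64_t, uint64_t> merged;
  for (const auto& e : extents) {
    if (!merged.empty()) {
      auto last = std::prev(merged.end());
      // Precondition of this sketch: free extents never overlap.
      assert(last->first + last->second <= e.first);
      if (last->first + last->second == e.first) {
        last->second += e.second;  // adjacent: grow the previous extent
        continue;
      }
    }
    merged.emplace(e.first, e.second);
  }
  return merged;
}
```

In this model, three 4 KiB extents of which two are adjacent collapse into two extents after the rebuild, so one larger chunk becomes available without any space having been freed.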



Thanks,

Igor

-------- Forwarded Message --------

Subject:        High CPU in StupidAllocator
Date:   Tue, 12 Feb 2019 10:24:37 +0100
From:   Adam Kupczyk <akupc...@redhat.com>
To:     IGOR FEDOTOV <ife...@gmail.com>



Hi Igor,

I have observed that StupidAllocator can burn a lot of CPU in StupidAllocator::allocate_int().

This comes from loops:
while (p != free[bin].end()) {
    if (_aligned_len(p, alloc_unit) >= want_size) {
      goto found;
    }
    ++p;
}

It happens when want_size is close to the upper size limit of the bin.
For example, free[5] contains sizes 8192..16383.

When requesting a size like 16000, it is quite likely that multiple chunks must be checked.
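
The cost described above can be sketched in a few lines of C++. This is a simplified model, not the Ceph implementation: the bin mapping in bin_for_len() is an assumption chosen only to match the example (bin 5 holds 8192..16383), and chunks_inspected() is a hypothetical helper that counts how many chunks a first-fit scan over one bin touches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed bin mapping, picked only to reproduce the example in the mail:
// bin 5 holds lengths 8192..16383, i.e. bin = floor(log2(len)) - 8.
inline int bin_for_len(uint64_t len) {
  int lg = 0;
  while (len >>= 1) ++lg;  // lg = floor(log2(original len)), 0 for len <= 1
  return lg < 8 ? 0 : lg - 8;
}

// Hypothetical helper: walks the free chunk lengths of one bin in order and
// returns how many were inspected before one fit want_size
// (all of them if none fits). Mirrors the shape of the loop quoted above.
inline std::size_t chunks_inspected(const std::vector<uint64_t>& bin_lens,
                                    uint64_t want_size) {
  assert(want_size > 0);
  std::size_t inspected = 0;
  for (uint64_t len : bin_lens) {
    ++inspected;
    if (len >= want_size) break;
  }
  return inspected;
}
```

With a bin holding chunks of 8192, 9000, 12000, 15000 and 16100 bytes, a request for 8192 is satisfied by the first chunk, while a request for 16000 lands in the same bin but has to walk all five.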


I have made an attempt to improve it by increasing the number of buckets.
It is done in aclamk/wip-bs-stupid-allocator-2.
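
To show why more buckets can shorten the scan, here is a hedged C++ sketch. It does not claim to reproduce what the branch above does: fine_bin_index() and too_small_in_same_bin() are hypothetical names, and splitting each power-of-two range into 2^sub_bits equal sub-ranges is just one possible scheme.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical finer-grained bin index (assumption, not the branch's code):
// each power-of-two range is split into 2^sub_bits equal sub-ranges.
// sub_bits == 0 degenerates to plain floor(log2(len)) bins.
inline unsigned fine_bin_index(uint64_t len, unsigned sub_bits) {
  assert(len > 0 && sub_bits < 8);
  unsigned lg = 0;
  for (uint64_t v = len; v >>= 1;) ++lg;  // lg = floor(log2(len))
  const uint64_t mask = (uint64_t{1} << sub_bits) - 1;
  const uint64_t sub = lg >= sub_bits ? (len >> (lg - sub_bits)) & mask
                                      : (len << (sub_bits - lg)) & mask;
  return (lg << sub_bits) + static_cast<unsigned>(sub);
}

// Counts free chunks that share a bin with want_size but are too small,
// i.e. chunks a linear scan of that bin could inspect in vain.
inline std::size_t too_small_in_same_bin(const std::vector<uint64_t>& free_lens,
                                         uint64_t want_size, unsigned sub_bits) {
  const unsigned want_bin = fine_bin_index(want_size, sub_bits);
  std::size_t n = 0;
  for (uint64_t len : free_lens) {
    if (fine_bin_index(len, sub_bits) == want_bin && len < want_size) ++n;
  }
  return n;
}
```

For free chunks of 8192, 9000, 12000, 15000 and 16100 bytes and a request of 16000, plain power-of-two bins leave four unusable chunks in the request's bin. With sub_bits = 2 the bin covers only 14336..16383, so a single unusable chunk remains.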

Best regards,

Adam Kupczyk



On 3/1/2019 11:46 AM, Xiaoxi Chen wrote:

Igor,
   I can test the patch if we have a package.

My environment and workload can consistently reproduce the latency 2-3 days after restarting. Sage told me to try the bitmap allocator to make sure StupidAllocator is the bad guy. I have some OSDs on Luminous + bitmap and some OSDs on 14.1.0 + bitmap. Both look positive so far, but I need more time to be sure.

The perf, log and admin socket analysis lead to the theory that in alloc_int the loop sometimes takes a long time with the allocator locks held. This blocks the release part called from _txc_finish in kv_finalize_thread; this thread is also the one that calculates state_kv_committing_lat and the overall commit_lat. You can see from the admin socket that state_done_latency has a similar trend to commit_latency.

But we cannot find a theory to explain why a reboot helps: the allocator btree will be rebuilt from the freelist manager, and it should be exactly the same as it was prior to the reboot. Anything related to PG recovery?

Anyway, as I have a live env and workload, I am more than willing to work with you on further investigation.
-Xiaoxi
Igor Fedotov <ifedo...@suse.de <mailto:ifedo...@suse.de>> wrote on Fri, Mar 1, 2019 at 6:21 AM:
    Also I think it makes sense to create a ticket at this point. Any
    volunteers?

    On 3/1/2019 1:00 AM, Igor Fedotov wrote:
    > Wondering if somebody would be able to apply simple patch that
    > periodically resets StupidAllocator?
    >
    > Just to verify/disprove the hypothesis that it's allocator related
    >
    > On 2/28/2019 11:57 PM, Stefan Kooman wrote:
    >> Quoting Wido den Hollander (w...@42on.com <mailto:w...@42on.com>):
    >>> Just wanted to chime in, I've seen this with Luminous+BlueStore+NVMe
    >>> OSDs as well. Over time their latency increased until we started to
    >>> notice I/O-wait inside VMs.
    >> On a Luminous 12.2.8 cluster with only SSDs we also hit this issue I
    >> guess. After restarting the OSD servers the latency would drop to normal
    >> values again. See https://owncloud.kooman.org/s/BpkUc7YM79vhcDj
    >>
    >> Reboots were finished at ~ 19:00.
    >>
    >> Gr. Stefan
    >>

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
