Hi Max,

I don't think this is an allocator-related issue. The symptoms that prompted us to switch from the stupid allocator to the bitmap allocator were:

- write op latency gradually increasing over time (over days, not hours)

- perf showing a significant amount of time spent in allocator-related functions

- an OSD reboot being the only remedy.

None of this was related to network activity or client restarts.


Thanks,

Igor


On 6/7/2019 11:05 AM, Max Vernimmen wrote:
Thank you for the suggestion to use the bitmap allocator. I looked at the Ceph documentation and could find no mention of this setting, which makes me wonder how safe and production-ready it really is. I'm hesitant to apply it to our production environment. If the allocator setting helps to resolve the problem, then it looks to me like there is a bug in the 'stupid' allocator that is causing this behavior. Would this qualify for a bug report, or is more debugging needed before I can file one?

On Thu, Jun 6, 2019 at 11:18 AM Stefan Kooman <[email protected]> wrote:

    Quoting Max Vernimmen ([email protected]):
    >
    > This is happening several times per day after we made several changes at
    > the same time:
    >
    >    - add physical ram to the ceph nodes
    >    - move from fixed 'bluestore cache size hdd|ssd' and 'bluestore cache kv
    >    max' to 'bluestore cache autotune = 1' and 'osd memory target =
    >    20401094656'.
    >    - update ceph from 12.2.8 to 12.2.11
    >    - update clients from 12.2.8 to 12.2.11
    >
    > We have since upgraded the ceph nodes to 12.2.12 but it did not help to fix
    > this problem.

    Have you tried the new bitmap allocator for the OSDs already (available
    since 12.2.12):

    [osd]

    # MEMORY ALLOCATOR
    bluestore_allocator = bitmap
    bluefs_allocator = bitmap

    The issues you are reporting sound like an issue that many of us have seen
    on luminous and mimic clusters and that has been identified as being caused
    by the "stupid allocator" memory allocator.

    Gr. Stefan


    --
    | BIT BV http://www.bit.nl/       Kamer van Koophandel 09090351
    | GPG: 0xD14839C6                   +31 318 648 688 / [email protected]



--
Max Vernimmen
Senior DevOps Engineer
Textkernel

------------------------------------------------------------------------------
Textkernel BV, Nieuwendammerkade 26/a5, 1022 AB, Amsterdam, NL
-----------------------------------------------------------------------------

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com