Hi Max,
I don't think this is an allocator-related issue. The symptoms that
prompted us to switch from the stupid allocator to the bitmap one were:
- write op latency gradually increasing over time (days not hours)
- perf showing a significant amount of time spent in allocator-related
functions
- an OSD reboot being the only remedy.
None of it had anything to do with network activity and/or client restarts.
Thanks,
Igor
On 6/7/2019 11:05 AM, Max Vernimmen wrote:
Thank you for the suggestion to use the bitmap allocator. I looked at
the ceph documentation and could find no mention of this setting. This
makes me wonder how safe and production-ready this setting really is.
I'm hesitant to apply that to our production environment.
If the allocator setting helps to resolve the problem then it looks to
me like there is a bug in the 'stupid' allocator that is causing this
behavior. Would this qualify for creating a bug report or is some more
debugging needed before I can do that?
On Thu, Jun 6, 2019 at 11:18 AM Stefan Kooman <[email protected]> wrote:
Quoting Max Vernimmen ([email protected]):
>
> This is happening several times per day after we made several changes at
> the same time:
>
> - add physical ram to the ceph nodes
> - move from fixed 'bluestore cache size hdd|ssd' and 'bluestore cache kv
> max' to 'bluestore cache autotune = 1' and 'osd memory target =
> 20401094656'.
> - update ceph from 12.2.8 to 12.2.11
> - update clients from 12.2.8 to 12.2.11
>
> We have since upgraded the ceph nodes to 12.2.12 but it did not help to fix
> this problem.
Have you tried the new bitmap allocator for the OSDs already (available
since 12.2.12):
[osd]
# MEMORY ALLOCATOR
bluestore_allocator = bitmap
bluefs_allocator = bitmap
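To sanity-check that both options really ended up in the [osd] section, something like the following could be used. This is only a minimal sketch: it assumes the config text is plain INI that Python's configparser can read, the helper name allocator_mismatches is hypothetical (it is not part of Ceph), and it only inspects the file contents, not what a running OSD actually uses.

```python
import configparser

# Hypothetical helper (not part of Ceph): return the allocator options in the
# [osd] section whose value differs from the expected one.
def allocator_mismatches(conf_text, expected="bitmap"):
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    # Assumption: option names may be written with spaces or underscores,
    # so normalise both spellings to the underscore form.
    parser.optionxform = lambda key: key.strip().lower().replace(" ", "_")
    parser.read_string(conf_text)

    mismatches = {}
    for key in ("bluestore_allocator", "bluefs_allocator"):
        value = parser.get("osd", key, fallback=None)  # None if unset
        if value != expected:
            mismatches[key] = value
    return mismatches


sample = """
[osd]
# MEMORY ALLOCATOR
bluestore_allocator = bitmap
bluefs_allocator = bitmap
"""

# An empty dict means both options are set to the expected allocator.
print(allocator_mismatches(sample))
```

An option that is missing from the section shows up with the value None, so a half-applied change is easy to spot.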
The issues you are reporting sound like an issue many of us have seen on
luminous and mimic clusters, which has been identified as being caused by the
"stupid allocator" memory allocator.
Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / [email protected]
--
Max Vernimmen
Senior DevOps Engineer
Textkernel
------------------------------------------------------------------------------
Textkernel BV, Nieuwendammerkade 26/a5, 1022 AB, Amsterdam, NL
-----------------------------------------------------------------------------
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com