[ceph-users] Re: Debugging OSD cache thrashing

2025-08-11 Thread Hector Martin
On 2025/08/12 1:00, Mark Nelson wrote: > Congrats on figuring this out Hector! This is a huge find! Comments below. > On 8/11/25 4:31 AM, Hector Martin wrote: >> For those who have been following along, I figured it out. I left all the details with Mark on Slack, but TL;DR: The fix is *ei…

[ceph-users] Re: Debugging OSD cache thrashing

2025-08-11 Thread Mark Nelson
Congrats on figuring this out Hector! This is a huge find! Comments below. On 8/11/25 4:31 AM, Hector Martin wrote: For those who have been following along, I figured it out. I left all the details with Mark on Slack, but TL;DR: The fix is *either one* (or both works too) of these: ceph confi…

[ceph-users] Re: Debugging OSD cache thrashing

2025-08-11 Thread Hector Martin
For those who have been following along, I figured it out. I left all the details with Mark on Slack, but TL;DR: The fix is *either one* (or both works too) of these: ceph config set osd rocksdb_cache_index_and_filter_blocks false (Ceph default: true, RocksDB default: false) ceph config set osd…
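
For reference, only the first of the two commands survives the preview; the second is cut off and is left as-is here. A minimal sketch of applying and verifying the visible one (the config get check is an illustration, not from the thread):

    # Keep RocksDB index/filter blocks out of the block cache, so they stop
    # competing with data blocks (Ceph default: true, RocksDB default: false)
    ceph config set osd rocksdb_cache_index_and_filter_blocks false

    # Confirm the value the OSDs will pick up
    ceph config get osd rocksdb_cache_index_and_filter_blocks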

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-25 Thread Mark Nelson
Hi Hector, Responses inline below. On 6/24/25 10:11 PM, Hector Martin wrote: Hi Mark, Thanks a lot for the pointers and info, it's really helpful. Glad to help, and thanks for looking into it. If we can figure out how to disable bluefs_buffered_io without repercussions, I think it would be…
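
For readers skimming the archive: bluefs_buffered_io is the toggle under discussion here. A hedged sketch of what disabling it would look like, assuming the repercussions mentioned above were resolved (an OSD restart may be needed for it to apply):

    # Switch BlueFS from buffered (page-cache) I/O to direct I/O
    ceph config set osd bluefs_buffered_io false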

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-25 Thread Stefan Kooman
On 6/22/25 18:25, Hector Martin wrote: On 2025/06/23 0:21, Anthony D'Atri wrote: DIMMs are cheap. No DIMMs on Apple Macs. You’re running virtualized in VMs or containers, with OSDs, mons, mgr, and the constellation of other daemons with resources dramatically below recommendations. I’…

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-24 Thread Hector Martin
Hi Mark, Thanks a lot for the pointers and info, it's really helpful. Since the issue is happening in a live cluster (which is a homelab I can screw around with to an extent, but not take down for very long periods of time or lose data in), and since I don't have a lot of spare hours in the comin…

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-24 Thread Mark Nelson
Hi Hector, Just as a follow-up, here are the comments I mentioned in the ceph slack channel from the PR where we re-enabled bluefs_buffered_io. I recorded these notes back when aclamk and I were digging into the RocksDB code to see if there was anything we could do to improve the situatio…

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-24 Thread Mark Nelson
Hi Hector, Sorry I'm a bit late to the party on this one. I wrote the OSD memory autotuning code and am probably one of the most recent people to really dig in and refactor bluestore's caches. I'll respond inline below. On 6/22/25 05:51, Hector Martin wrote: Hi all, I have a small 3-node…

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-23 Thread Tyler Stachecki
On Sun, Jun 22, 2025, 8:52 AM Hector Martin wrote: > I believe that something is wrong with the OSD bluestore cache allocation/flush policy, and when the cache becomes full it starts thrashing reads instead of evicting colder cached data (or perhaps some cache bucket is starving another c…

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-23 Thread Janne Johansson
> If anything strictly below 4GB is completely unsupported and expected to go into a thrashing tailspin, perhaps that doc should be updated to state that. >> Angrily writing that a complex, mature, FREE system is “broken” because it doesn’t perform miracles when abused is folly, like exp…

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-22 Thread Eugen Block
The default OSD memory target (osd_memory_target) is 4 GB; it’s not recommended to reduce it to such low values, especially if there’s real load on the cluster. I am not a developer, so I can’t really comment on the code. Quoting Hector Martin: Hi all, I have a small 3-node cluster (4 HDD + 1 SSD OSD p…
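
For context, a minimal sketch of checking that target, and of the kind of reduction being warned against (the 2 GiB value is purely illustrative):

    # Default is 4294967296 bytes (4 GiB) per OSD daemon
    ceph config get osd osd_memory_target

    # The sort of reduction being cautioned against (illustrative value)
    ceph config set osd osd_memory_target 2147483648   # 2 GiB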

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-22 Thread Hector Martin
On 2025/06/22 22:37, Eugen Block wrote: > The default OSD memory target (osd_memory_target) is 4 GB; it’s not recommended to reduce it to such low values, especially if there’s real load on the cluster. I am not a developer, so I can’t really comment on the code. I don't have enough RAM for 4GB per OSD. …

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-22 Thread Eugen Block
Maybe you should also ask this on the dev mailing list. Quoting Hector Martin: On 2025/06/23 0:21, Anthony D'Atri wrote: DIMMs are cheap. No DIMMs on Apple Macs. You’re running virtualized in VMs or containers, with OSDs, mons, mgr, and the constellation of other daemons…

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-22 Thread Hector Martin
On 2025/06/23 0:21, Anthony D'Atri wrote: >>> DIMMs are cheap. >> No DIMMs on Apple Macs. > You’re running virtualized in VMs or containers, with OSDs, mons, mgr, and the constellation of other daemons with resources dramatically below recommendations. I’ll speculate that a…

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-22 Thread Anthony D'Atri
>> DIMMs are cheap. > No DIMMs on Apple Macs. You’re running virtualized in VMs or containers, with OSDs, mons, mgr, and the constellation of other daemons with resources dramatically below recommendations. I’ll speculate that at least the HDDs are USB-attached, or perhaps you’re on…

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-22 Thread Hector Martin
On 2025/06/22 23:19, Anthony D'Atri wrote: >> I don't have enough RAM for 4GB per OSD. > DIMMs are cheap. No DIMMs on Apple Macs. > You might experiment with the values described here: > https://docs.ceph.com/en/latest/rados/configuration/bluestore-config-ref/ …
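
Two of the knobs that page documents, as a hedged sketch (values are placeholders, not recommendations from this thread):

    # Floor below which the autotuner will not shrink the caches
    # (default: 805306368 bytes = 768 MiB)
    ceph config set osd osd_memory_cache_min 805306368

    # Let the OSD size BlueStore's caches within osd_memory_target (default: true)
    ceph config set osd bluestore_cache_autotune true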

[ceph-users] Re: Debugging OSD cache thrashing

2025-06-22 Thread Anthony D'Atri