Abe Ratnofsky created CASSANDRA-18904:
-----------------------------------------
Summary: Repair vtable caches
Key: CASSANDRA-18904
URL: https://issues.apache.org/jira/browse/CASSANDRA-18904
Project: Cassandra
Issue Type: Bug
Components: Consistency/Repair, Local/Caching
Reporter: Abe Ratnofsky
Assignee: Abe Ratnofsky

Currently, the repair vtables
(system_views.{repairs,repair_sessions,repair_jobs,repair_participates,repair_validations})
are backed by caches in ActiveRepairService. These caches are bounded only by
element count and entry expiry, controlled by Config.repair_state_size and
Config.repair_state_expires respectively.

The individual cached elements are mutable, and can grow to retain a
significant amount of heap as instance uptime increases and more repairs are
run. In a heap dump from a real cluster, I found these caches occupying ~1GB of
heap in total across ActiveRepairService.repairs and
ActiveRepairService.participates. Individual cached elements were reaching
100KB in size, so bounding the caches by element count leaves their actual heap
usage highly variable.

We should size these caches by the heap they retain, not by the number of
elements. Users should not be expected to check heap dumps to work out how many
elements the caches can safely hold; specifying a memory total is much more
user-friendly.
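
To illustrate the proposal, here is a minimal sketch of a cache bounded by
total estimated weight rather than by element count. It is not the
ActiveRepairService code: the class WeightBoundedCache, its methods, and the
caller-supplied weigher function are all hypothetical, and the weigher stands
in for whatever retained-heap estimate the real cached elements could provide.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

/** Hypothetical sketch: an LRU cache bounded by total estimated bytes, not entry count. */
final class WeightBoundedCache<K, V> {
    /** A cached value together with the weight recorded when it was stored. */
    private static final class Weighed<V> {
        final V value;
        final long weight;
        Weighed(V value, long weight) { this.value = value; this.weight = weight; }
    }

    private final long maxWeightBytes;
    private final ToLongFunction<V> weigher; // assumption: caller estimates retained heap
    // accessOrder = true, so iteration runs from least to most recently used.
    private final LinkedHashMap<K, Weighed<V>> entries = new LinkedHashMap<>(16, 0.75f, true);
    private long totalWeight = 0;

    WeightBoundedCache(long maxWeightBytes, ToLongFunction<V> weigher) {
        this.maxWeightBytes = maxWeightBytes;
        this.weigher = weigher;
    }

    /** Stores (or re-stores) a value, re-weighs it, and evicts until within budget. */
    synchronized void put(K key, V value) {
        long weight = weigher.applyAsLong(value);
        Weighed<V> previous = entries.put(key, new Weighed<>(value, weight));
        if (previous != null) totalWeight -= previous.weight;
        totalWeight += weight;
        evictUntilWithinBudget();
    }

    synchronized V get(K key) {
        Weighed<V> weighed = entries.get(key);
        return weighed == null ? null : weighed.value;
    }

    synchronized long totalWeight() { return totalWeight; }

    synchronized int size() { return entries.size(); }

    /** Removes least recently used entries while the recorded weight exceeds the budget. */
    private void evictUntilWithinBudget() {
        Iterator<Map.Entry<K, Weighed<V>>> it = entries.entrySet().iterator();
        while (totalWeight > maxWeightBytes && it.hasNext()) {
            totalWeight -= it.next().getValue().weight;
            it.remove();
        }
    }
}
```

Because the cached elements are mutable, a weight recorded at insertion time
goes stale as an element grows. In this sketch the weight is only refreshed
when put() is called again for the same key, so a real change would also need
some way to re-weigh entries after they are mutated.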