Abe Ratnofsky created CASSANDRA-18904:
-----------------------------------------

             Summary: Repair vtable caches 
                 Key: CASSANDRA-18904
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18904
             Project: Cassandra
          Issue Type: Bug
          Components: Consistency/Repair, Local/Caching
            Reporter: Abe Ratnofsky
            Assignee: Abe Ratnofsky


Currently, the repair vtables 
(system_views.{repairs,repair_sessions,repair_jobs,repair_participates,repair_validations})
 are backed by caches in ActiveRepairService. These caches are bounded by 
element count and entry expiry, controlled by Config.repair_state_size and 
Config.repair_state_expires respectively. Neither setting limits the heap the 
caches retain.

The individual cached elements are mutable and can grow, so the caches retain 
a significant amount of heap as instance uptime increases and more repairs are 
run. In a heap dump from a real cluster, I found these caches occupying ~1GB of 
heap in total across ActiveRepairService.repairs and 
ActiveRepairService.participates. Individual cached elements were reaching 
100KB in size, so bounding the caches by number of elements leaves their actual 
heap usage highly variable.

We should bound these caches by the heap they retain, not by the number of 
elements. Users should not be expected to inspect heap dumps to work out how 
many elements to allow; specifying a total memory budget is much more 
user-friendly.
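
As a rough illustration of bounding by retained weight instead of element 
count, here is a minimal JDK-only sketch. The class WeightBoundedCache, its 
weigher function, and the byte budget are all hypothetical; none of this is 
taken from ActiveRepairService, and a real change would need an actual 
heap-size estimate for each element. Because the cached elements are mutable, 
the sketch re-weighs every entry on each eviction pass rather than trusting a 
weight recorded at insertion time.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

/** Hypothetical sketch: an LRU cache bounded by total estimated bytes, not entry count. */
final class WeightBoundedCache<K, V> {
    private final long maxWeightBytes;          // assumed memory budget for the whole cache
    private final ToLongFunction<V> weigher;    // assumed estimator of one element's retained size
    // accessOrder=true makes iteration order least-recently-used first
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);

    WeightBoundedCache(long maxWeightBytes, ToLongFunction<V> weigher) {
        this.maxWeightBytes = maxWeightBytes;
        this.weigher = weigher;
    }

    synchronized V get(K key) {
        return entries.get(key);
    }

    synchronized void put(K key, V value) {
        entries.put(key, value);
        evictIfNeeded();
    }

    synchronized int size() {
        return entries.size();
    }

    /** Re-weighs every element, since elements may have grown after insertion. O(n). */
    synchronized long totalWeight() {
        long total = 0;
        for (V value : entries.values()) {
            total += weigher.applyAsLong(value);
        }
        return total;
    }

    /** Drops least-recently-used entries until the total weight fits the budget. */
    synchronized void evictIfNeeded() {
        long total = totalWeight();
        Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
        while (total > maxWeightBytes && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();
            total -= weigher.applyAsLong(eldest.getValue());
            it.remove();
        }
    }
}
```

With a budget of this kind, an element that grows after insertion causes older 
entries to be evicted on the next pass, so total usage tracks the configured 
limit regardless of how large individual elements become. The full re-weigh is 
linear in the number of entries; a production version would more likely track 
weights incrementally or rely on a caching library's weight-based eviction.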



