[ 
https://issues.apache.org/jira/browse/CASSANDRA-15399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-15399:
--------------------------------------
    Description: 
To enhance the visibility in repair, we should expose internal state via 
virtual tables; the state should include coordinator as well as participant 
state (validation, sync, etc.)

I propose the following tables:

repairs - high level summary of the global state of repair; this should be 
called on the coordinator.
{code:sql}
CREATE TABLE repairs (
  id uuid,
  keyspace_name text,
  table_names frozen<list<text>>,
  ranges frozen<list<text>>,
  coordinator text,
  participants frozen<list<text>>,

  state text,
  progress_percentage float,
  last_updated_at_millis bigint,
  duration_micro bigint,
  failure_cause text,

  PRIMARY KEY ( (id) )
);
{code}

repair_tasks - represents RepairJob and participants state.  This will show if 
validations are running on participants and the progress they are making; this 
should be called on the coordinator.
{code:sql}
CREATE TABLE repair_tasks (
  id uuid,
  session_id uuid,
  keyspace_name text,
  table_name text,
  ranges frozen<list<text>>,
  coordinator text,
  participant text,

  state text,
  state_description text,
  progress_percentage float, -- between 0.0 and 100.0
  last_updated_at_millis bigint,
  duration_micro bigint,
  failure_cause text,

  PRIMARY KEY ( (id), session_id, table_name, participant )
);
{code}


repair_validations - shows the state of the validation task and is updated 
periodically while validation is running; this should be called on the 
participants.
{code:sql}
CREATE TABLE repair_validations (
  id uuid,
  session_id uuid,
  ranges frozen<list<text>>,
  keyspace_name text,
  table_name text,
  initiator text,
  state text,
  progress_percentage float,
  queue_duration_ms bigint,
  runtime_duration_ms bigint,
  total_duration_ms bigint,
  estimated_partitions bigint,
  partitions_processed bigint,
  estimated_total_bytes bigint,
  failure_cause text,

  PRIMARY KEY ( (id), session_id, table_name )
);
{code}
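
As a rough illustration of how a participant might derive one 
repair_validations row from its in-memory state, here is a minimal Python 
sketch.  The ValidationState class and the formulas (progress as 
partitions_processed / estimated_partitions, total duration as queue time plus 
runtime) are assumptions made for illustration, not the actual Cassandra 
implementation; only the column names come from the proposed schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValidationState:
    """Hypothetical in-memory bookkeeping for one validation task."""
    estimated_partitions: int
    partitions_processed: int = 0
    queue_duration_ms: int = 0
    runtime_duration_ms: int = 0
    state: str = "queued"  # state names are assumed, not from the proposal
    failure_cause: Optional[str] = None

    def progress_percentage(self) -> float:
        # Assumed formula: share of the estimated partitions processed so far,
        # clamped to [0.0, 100.0] because the estimate can undershoot.
        if self.estimated_partitions <= 0:
            return 0.0
        ratio = self.partitions_processed / self.estimated_partitions
        return min(100.0, max(0.0, ratio * 100.0))

    def to_row(self) -> dict:
        # Keys follow the column names of the proposed repair_validations table.
        return {
            "state": self.state,
            "progress_percentage": self.progress_percentage(),
            "queue_duration_ms": self.queue_duration_ms,
            "runtime_duration_ms": self.runtime_duration_ms,
            "total_duration_ms": self.queue_duration_ms + self.runtime_duration_ms,
            "estimated_partitions": self.estimated_partitions,
            "partitions_processed": self.partitions_processed,
            "failure_cause": self.failure_cause,
        }
```

Because the row is computed on read, a periodic update only needs to bump 
partitions_processed; the percentage and total duration follow from it.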

The main reason for exposing virtual tables rather than exposing through 
durable tables is to make sure what is exposed is accurate.  In cases of write 
failures or node failures, the durable tables could become inaccurate and 
could add edge cases where the repair is not running but the tables say it is; 
by relying on repair's internal in-memory bookkeeping, these problems go away.
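
The idea can be sketched in Python as follows.  RepairRegistry, its methods, 
and the state names ("running", "success", "failure") are hypothetical 
stand-ins for repair's in-memory bookkeeping, not the real Cassandra classes.  
The point of the sketch is that rows are built from live state at read time, 
so there is no separate "mark as finished" write that could be lost.

```python
import threading
import uuid


class RepairRegistry:
    """Hypothetical stand-in for repair's in-memory bookkeeping."""

    def __init__(self):
        self._lock = threading.Lock()
        self._repairs = {}  # repair id -> mutable state dict

    def start(self, keyspace_name):
        # Register a new repair and return its id.
        repair_id = uuid.uuid4()
        with self._lock:
            self._repairs[repair_id] = {
                "keyspace_name": keyspace_name,
                "state": "running",
                "failure_cause": None,
            }
        return repair_id

    def finish(self, repair_id, failure_cause=None):
        # Update the same object that readers see; nothing is persisted.
        with self._lock:
            entry = self._repairs[repair_id]
            entry["state"] = "failure" if failure_cause else "success"
            entry["failure_cause"] = failure_cause

    def rows(self):
        # What a virtual table read would return: a snapshot of live state.
        with self._lock:
            return [{"id": rid, **entry} for rid, entry in self._repairs.items()]
```

In this sketch a restarted process simply begins with an empty registry, so it 
cannot report a repair as running when none is.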

This jira does not try to solve the following:
1) repair resiliency - there are edge cases where repair hits an error and runs 
forever (at least from nodetool's perspective).
2) repair stream tracking - I have not learned the streaming side yet, and from 
what I see multiple implementations exist, so this seems like a large scope.  
My hope is to leave it out of this jira and tackle it separately.


  was:
To enhance the visibility in repair, we should add in-memory objects that can 
be exposed via JMX and virtual tables to show the state of the coordinator, and 
validations (leaving sync out for now).

These objects should expose the timing (create, start, complete), current state 
(enum specific to the entity), and progress estimate (% complete); along with 
any entity specific information useful.

To help with growth, ActiveRepairService should periodically clean up completed 
state after a configurable interval.


> Add ability to track state in repair
> ------------------------------------
>
>                 Key: CASSANDRA-15399
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15399
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Repair
>            Reporter: David Capwell
>            Assignee: David Capwell
>            Priority: Normal
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
