[ 
https://issues.apache.org/jira/browse/CASSANDRA-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844661#comment-13844661
 ] 

Yuki Morishita commented on CASSANDRA-5839:
-------------------------------------------

[~yarin] Some suggestions for the schema.

Your primary key is (keyspace_name, columnfamily_name, id, range_begin), 
but a repair id is assigned per keyspace/range pair, so the trailing 
'range_begin' seems redundant: there is only ever one range_begin for a 
given id. Also, using 'timeuuid' as the type of the id field lets us query by 
timestamp, e.g. id > mintimeuuid('2013-12-01').
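
To illustrate why a time-based id supports this kind of query, here is a 
minimal Python sketch. It is not Cassandra code: the helper names 
(timeuuid_for, timeuuid_to_datetime, repairs_since) are hypothetical, and it 
only relies on the fact that a version-1 UUID embeds a 60-bit timestamp, which 
is what makes filtering ids by time possible.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Number of 100 ns ticks between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timeuuid_for(dt, clock_seq=0, node=0):
    """Build a version-1 UUID carrying dt's timestamp (hypothetical helper).

    dt must be timezone-aware.
    """
    ticks = (dt - UNIX_EPOCH) // timedelta(microseconds=1) * 10 + UUID_EPOCH_OFFSET
    return uuid.UUID(fields=(
        ticks & 0xFFFFFFFF,                  # time_low
        (ticks >> 32) & 0xFFFF,              # time_mid
        ((ticks >> 48) & 0x0FFF) | 0x1000,   # time_hi, tagged as version 1
        0x80 | ((clock_seq >> 8) & 0x3F),    # clock_seq_hi, RFC 4122 variant
        clock_seq & 0xFF,                    # clock_seq_low
        node,
    ))


def timeuuid_to_datetime(u):
    """Recover the timestamp embedded in a version-1 UUID."""
    if u.version != 1:
        raise ValueError("not a version-1 (time-based) UUID")
    return UNIX_EPOCH + timedelta(microseconds=(u.time - UUID_EPOCH_OFFSET) // 10)


def repairs_since(repair_ids, since):
    """Keep only the ids whose embedded timestamp is at or after `since`."""
    return [u for u in repair_ids if timeuuid_to_datetime(u) >= since]


# Example: two made-up repair ids, filtered the way the CQL predicate would.
ids = [
    timeuuid_for(datetime(2013, 11, 30, tzinfo=timezone.utc)),
    timeuuid_for(datetime(2013, 12, 5, tzinfo=timezone.utc)),
]
recent = repairs_since(ids, datetime(2013, 12, 1, tzinfo=timezone.utc))
```

In this sketch the filtering happens on the client; with a timeuuid 
clustering column the same restriction could be pushed into the query itself, 
which is the point of the suggestion.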

bq. I also wanted to store the total number of ranges that were out of sync. 
This could be useful to know. I renamed the table to repair_jobs since each 
entry corresponds to a RepairJob rather than a whole session.

I think it is better to split the schema into two tables, one for stats and 
one for status, because that way we can add other stats later, such as bytes 
transferred.

bq. The SyncComplete message could contain some more information that could be 
stored, such as amount of data that were actually streamed.

Since the system_global keyspace is replicated, I prefer storing stats on the 
node where the event happens rather than sending them back to the coordinator 
and letting it do the job.

bq. Actually looking up data in the repair table is a bit tricky. There are 
many possible use cases. Perhaps we should allow the end user to create the 
indexes he wants?

I think it is enough to just provide a way to query by timestamp. It's not 
going to be a lot of data, so users can construct their desired view 
themselves.

bq. I'm thinking of putting the TTL value of all entries (currently hard coded 
to 365 days) in the cassandra.yaml file.

We do need a TTL. Just an idea: since repair is run based on gc_grace, why 
don't we set it to something like 10 * gc_grace?
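
As a rough worked example of that idea, the sketch below derives the TTL from 
gc_grace_seconds instead of hard-coding it. The function name and the 
multiplier parameter are assumptions made for illustration; the only fact 
taken from Cassandra is that the default gc_grace_seconds for a table is 
864000 (10 days).

```python
# Cassandra's default gc_grace_seconds for a table: 10 days.
DEFAULT_GC_GRACE_SECONDS = 864000


def repair_history_ttl(gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS, multiplier=10):
    """Return a TTL in seconds for repair history rows (hypothetical helper).

    The TTL is a multiple of gc_grace_seconds, so a table that must be
    repaired more often keeps proportionally shorter history.
    """
    if gc_grace_seconds < 0:
        raise ValueError("gc_grace_seconds must not be negative")
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return gc_grace_seconds * multiplier


ttl_seconds = repair_history_ttl()
ttl_days = ttl_seconds // 86400
```

With the default gc_grace this gives 8,640,000 seconds, i.e. 100 days, 
compared with the 365 days currently hard-coded in the draft.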


> Save repair data to system table
> --------------------------------
>
>                 Key: CASSANDRA-5839
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5839
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core, Tools
>            Reporter: Jonathan Ellis
>            Assignee: Jimmy Mårdell
>            Priority: Minor
>             Fix For: 2.0.4
>
>         Attachments: 2.0.4-5839-draft.patch
>
>
> As noted in CASSANDRA-2405, it would be useful to store repair results, 
> particularly with sub-range repair available (CASSANDRA-5280).



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
