Hi Paulo,
Following up on the number-of-entries question:
Why would each repair execution have two entries? I thought it would be one
entry, beginning with the started_at column filled in, and with the
finished_at column filled in once the repair completes.
Also, if my cluster has more than one keyspace, then given the way this table
is structured, it will have multiple entries, one for each keyspace_name
value, no? Thanks



> On Feb 25, 2016, at 5:48 AM, Paulo Motta <pauloricard...@gmail.com> wrote:
> 
> Hello Jimmy,
> 
> The parent_repair_history table keeps track of the start and finish
> information of a repair session. The other table, repair_history, keeps track
> of repair status as it progresses. So you should first query the
> parent_repair_history table to check whether a repair started and finished,
> as well as its duration, and then inspect the repair_history table to
> troubleshoot more specific details of a given repair session.
> 
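As a rough illustration of the "started, finished, and duration" check, here is a minimal Python sketch. It assumes a parent_repair_history row has already been fetched and is available as a dict keyed by column name; the function name `repair_duration` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Any, Mapping, Optional


def repair_duration(row: Mapping[str, Any]) -> Optional[timedelta]:
    """Return finished_at - started_at, or None if either timestamp is missing."""
    started = row.get("started_at")
    finished = row.get("finished_at")
    if started is None or finished is None:
        return None  # repair has not started or has not (yet) recorded a finish
    return finished - started


# Hypothetical rows for illustration only.
done = {"started_at": datetime(2016, 2, 25, 3, 0), "finished_at": datetime(2016, 2, 25, 3, 45)}
running = {"started_at": datetime(2016, 2, 25, 3, 0), "finished_at": None}
```

A `None` result only says that no finish was recorded; it does not distinguish a repair that is still running from one that died without writing a finish.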
> Answering your questions below:
> 
> > Will every invocation of nodetool repair be recorded as one entry in the
> > parent_repair_history CF, regardless of whether it is a cross-DC repair, a
> > local node repair, or uses other options?
> 
> Actually two entries, one for start and one for finish.
> 
> > Is a repair job done only if the finished_at column contains a value? And
> > is a repair job successful only if there is no value in exception_message
> > or exception_stacktrace?
> 
> correct
> 
> > What is the purpose of the successful_ranges column? Do I have to check
> > that they all match requested_ranges to ensure a successful run?
> 
> correct
> 
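The three conditions confirmed above (finish recorded, no exception, successful ranges matching requested ranges) can be combined into one check. The following is a sketch only, assuming each row is a dict keyed by the column names from the schema in this thread; the function name `repair_succeeded` and the range strings are hypothetical.

```python
from typing import Any, Mapping


def repair_succeeded(row: Mapping[str, Any]) -> bool:
    """Return True if a parent_repair_history row looks like a clean, finished run."""
    if row.get("finished_at") is None:
        return False  # no finish recorded
    if row.get("exception_message") or row.get("exception_stacktrace"):
        return False  # the repair reported an error
    # Null collections are treated as empty sets here (an assumption of this sketch).
    requested = set(row.get("requested_ranges") or ())
    successful = set(row.get("successful_ranges") or ())
    # Every requested range must also appear among the successful ranges.
    return requested <= successful


# Hypothetical rows for illustration only.
good = {
    "finished_at": "2016-02-25 03:45:00",
    "exception_message": None,
    "exception_stacktrace": None,
    "requested_ranges": {"(0,100]", "(100,200]"},
    "successful_ranges": {"(0,100]", "(100,200]"},
}
partial = dict(good, successful_ranges={"(0,100]"})
failed = dict(good, exception_message="Repair failed")
```

The subset comparison is a design choice of this sketch: it tolerates extra entries in successful_ranges but fails if any requested range is missing.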
> > Ultimately, how do I find out the overall repair health/status of a given
> > cluster?
> 
> Check if repair is being executed on all nodes within gc_grace_seconds, and 
> tune that value or troubleshoot problems otherwise.
> 
> > By scanning through parent_repair_history and making sure all the known
> > keyspaces have had a good repair run in recent days?
> 
> Sounds good.
> 
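One way to sketch the scan described above in Python: for each known keyspace, find the most recent clean repair finish and flag keyspaces whose last good run is older than gc_grace_seconds. This assumes the rows have already been read from parent_repair_history into dicts keyed by column name. The function name `stale_keyspaces` and the sample data are hypothetical, and 864000 is used because it is the default gc_grace_seconds (10 days); the tables in a real cluster may be configured differently.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, Iterable, List, Mapping


def _clean_finish(row: Mapping[str, Any]) -> bool:
    """A run counts as good if it finished, reported no exception,
    and every requested range appears in successful_ranges."""
    if row.get("finished_at") is None:
        return False
    if row.get("exception_message") or row.get("exception_stacktrace"):
        return False
    requested = set(row.get("requested_ranges") or ())
    successful = set(row.get("successful_ranges") or ())
    return requested <= successful


def stale_keyspaces(
    rows: Iterable[Mapping[str, Any]],
    keyspaces: Iterable[str],
    now: datetime,
    gc_grace_seconds: int = 864000,
) -> List[str]:
    """Return the keyspaces with no good repair run within gc_grace_seconds."""
    latest: Dict[str, datetime] = {}
    for row in rows:
        if not _clean_finish(row):
            continue
        ks = row["keyspace_name"]
        finished = row["finished_at"]
        if ks not in latest or finished > latest[ks]:
            latest[ks] = finished
    cutoff = now - timedelta(seconds=gc_grace_seconds)
    return sorted(ks for ks in keyspaces if ks not in latest or latest[ks] < cutoff)


# Hypothetical data for illustration only.
now = datetime(2016, 2, 25)
rows = [
    {"keyspace_name": "ks1", "finished_at": now - timedelta(days=2)},
    {"keyspace_name": "ks2", "finished_at": now - timedelta(days=20)},
    {"keyspace_name": "ks4", "finished_at": now - timedelta(days=1),
     "exception_message": "Repair failed"},
]
print(stale_keyspaces(rows, ["ks1", "ks2", "ks3", "ks4"], now))
```

This sketch looks at keyspaces only. It does not verify that every node or every token range was covered within the window, which the advice above also calls for.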
> You can check https://issues.apache.org/jira/browse/CASSANDRA-5839 for more 
> information.
> 
> 
> 2016-02-25 3:13 GMT-03:00 Jimmy Lin <y2klyf+w...@gmail.com>:
>> 
>> Hi all,
>> I have a few questions about how to read the
>> system_distributed.parent_repair_history CF, which I am very interested in
>> using to find out our repair status.
>>  
>> -
>> Will every invocation of nodetool repair be recorded as one entry in the
>> parent_repair_history CF, regardless of whether it is a cross-DC repair, a
>> local node repair, or uses other options?
>> 
>> -
>> Is a repair job done only if the finished_at column contains a value? And is
>> a repair job successful only if there is no value in exception_message or
>> exception_stacktrace?
>> What is the purpose of the successful_ranges column? Do I have to check that
>> they all match requested_ranges to ensure a successful run?
>> 
>> -
>> Ultimately, how do I find out the overall repair health/status of a given
>> cluster?
>> By scanning through parent_repair_history and making sure all the known
>> keyspaces have had a good repair run in recent days?
>> 
>> ---------------
>> CREATE TABLE system_distributed.parent_repair_history (
>>     parent_id timeuuid PRIMARY KEY,
>>     columnfamily_names set<text>,
>>     exception_message text,
>>     exception_stacktrace text,
>>     finished_at timestamp,
>>     keyspace_name text,
>>     requested_ranges set<text>,
>>     started_at timestamp,
>>     successful_ranges set<text>
>> )
> 
