Re: Repair Management

2017-05-19 Thread Blake Eggleston
Cool. Just want to point out that if you're going to expose a command to 
terminate a repair, it should also stop any related validations and sync tasks 
that are in progress.
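
To make that requirement concrete, here is a minimal Java sketch. It assumes a hypothetical RepairTaskRegistry (not an existing Cassandra class) that remembers the validation and sync tasks started for each repair command, so that terminating the command can cancel whatever is still running. It only covers tasks on the local node; cancelling work on the other replicas would need additional messaging and is not shown.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Future;

// Hypothetical helper, not part of Cassandra: tracks the in-flight
// validation and sync tasks that belong to each repair command.
final class RepairTaskRegistry {
    private final Map<Integer, List<Future<?>>> tasksByCommand = new ConcurrentHashMap<>();

    // Record a validation or sync task started on behalf of a repair command.
    void register(int commandId, Future<?> task) {
        tasksByCommand
            .computeIfAbsent(commandId, id -> new CopyOnWriteArrayList<>())
            .add(task);
    }

    // Terminate a repair command: cancel every task registered for it and
    // return how many tasks were actually cancelled. Tasks that had already
    // completed are not counted.
    int terminate(int commandId) {
        List<Future<?>> tasks = tasksByCommand.remove(commandId);
        if (tasks == null) {
            return 0; // unknown or already terminated command
        }
        int cancelled = 0;
        for (Future<?> task : tasks) {
            if (task.cancel(true)) {
                cancelled++;
            }
        }
        return cancelled;
    }
}
```

Keying the tasks by command id keeps the terminate path to a single lookup, and removing the entry first means a second terminate call for the same command is a harmless no-op.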


On May 18, 2017 at 6:02:16 PM, Cameron Zemek (came...@instaclustr.com) wrote:

Here is what I have done so far: 
https://github.com/apache/cassandra/compare/trunk...instaclustr:repair_management
 

> I'm not sure what you mean by "coordinator repair commands". Do you mean 
> full repairs? 

By coordinator repair I meant the repair command from the coordinator node, 
that is, the repair command from StorageService::repairAsync. Hopefully the 
branch above shows what I mean. 





On 19 May 2017 at 03:16, Blake Eggleston  wrote: 

> > I am looking to improve monitoring and management of repairs (so far I
> > have a patch for adding ActiveRepairs to table/keyspace metrics) and came
> > across ActiveRepairServiceMBean but this appears to be limited to
> > incremental repairs. Is there a reason for this?
>
> The incremental repair stuff was just the first set of JMX controls added
> to ActiveRepairService. ActiveRepairService is involved in all repairs
> though.
>
> > I was looking to add something very similar to this nodetool repair_admin
> > but it would work on co-ordinator repair commands.
>
> I'm not sure what you mean by "coordinator repair commands". Do you mean
> full repairs?
>
> > What is the purpose of the current repair_admin? If I wish to add the
> > above should I rename the MBean to say
> > org.apache.cassandra.db:type=IncrementalRepairService and the nodetool
> > command to inc_repair_admin ?
>
> nodetool help repair_admin says its purpose is to "list and fail
> incremental repair sessions". However, failing an incremental repair
> session doesn't cancel the validation/sync; it just releases the sstables
> that were involved in the repair back into the unrepaired data set. I don't
> see any reason why you couldn't add this functionality to the existing
> RepairService mbean. That said, before getting into mbean names, it's
> probably best to come up with a plan for cancelling validation and sync on
> each of the replicas involved in a given repair. As far as I know (though I
> may be wrong), that's not currently supported.
> 
> On May 17, 2017 at 7:36:51 PM, Cameron Zemek (came...@instaclustr.com) 
> wrote: 
> 
> I am looking to improve monitoring and management of repairs (so far I 
> have a patch for adding ActiveRepairs to table/keyspace metrics) and came 
> across ActiveRepairServiceMBean but this appears to be limited to 
> incremental repairs. Is there a reason for this? 
> 
> I was looking to add something very similar to this nodetool repair_admin 
> but it would work on co-ordinator repair commands. 
> 
> For example: 
> $ nodetool repair_admin --list 
> Repair#1 mykeyspace columnFamilies=colfamilya,colfamilyb; 
> incremental=True; 
> parallelism=parallel progress=5% 
> 
> $ nodetool repair_admin --terminate 1 
> Terminating repair command #1 (19f00c30-1390-11e7-bb50-ffb920a6d70f) 
> 
> $ nodetool repair_admin --terminate-all # calls ssProxy.forceTerminateAllRepairSessions() 
> Terminating all repair sessions 
> Terminated repair command #2 (64c44230-21aa-11e7-9ede-cd6eb64e3786) 
> 
> What is the purpose of the current repair_admin? If I wish to add the 
> above should I rename the MBean to say 
> org.apache.cassandra.db:type=IncrementalRepairService and the nodetool 
> command to inc_repair_admin ? 
> 
> 


Re: Repair Management

2017-05-18 Thread Cameron Zemek
Here is what I have done so far:
https://github.com/apache/cassandra/compare/trunk...instaclustr:repair_management

> I'm not sure what you mean by "coordinator repair commands". Do you mean
> full repairs?

By coordinator repair I meant the repair command from the coordinator node,
that is, the repair command from StorageService::repairAsync. Hopefully the
branch above shows what I mean.





On 19 May 2017 at 03:16, Blake Eggleston  wrote:

> > I am looking to improve monitoring and management of repairs (so far I
> > have a patch for adding ActiveRepairs to table/keyspace metrics) and came
> > across ActiveRepairServiceMBean but this appears to be limited to
> > incremental repairs. Is there a reason for this?
>
> The incremental repair stuff was just the first set of JMX controls added
> to ActiveRepairService. ActiveRepairService is involved in all repairs
> though.
>
> > I was looking to add something very similar to this nodetool repair_admin
> > but it would work on co-ordinator repair commands.
>
> I'm not sure what you mean by "coordinator repair commands". Do you mean
> full repairs?
>
> > What is the purpose of the current repair_admin? If I wish to add the
> > above should I rename the MBean to say
> > org.apache.cassandra.db:type=IncrementalRepairService and the nodetool
> > command to inc_repair_admin ?
>
> nodetool help repair_admin says its purpose is to "list and fail
> incremental repair sessions". However, failing an incremental repair
> session doesn't cancel the validation/sync; it just releases the sstables
> that were involved in the repair back into the unrepaired data set. I don't
> see any reason why you couldn't add this functionality to the existing
> RepairService mbean. That said, before getting into mbean names, it's
> probably best to come up with a plan for cancelling validation and sync on
> each of the replicas involved in a given repair. As far as I know (though I
> may be wrong), that's not currently supported.
>
> On May 17, 2017 at 7:36:51 PM, Cameron Zemek (came...@instaclustr.com)
> wrote:
>
> I am looking to improve monitoring and management of repairs (so far I
> have a patch for adding ActiveRepairs to table/keyspace metrics) and came
> across ActiveRepairServiceMBean but this appears to be limited to
> incremental repairs. Is there a reason for this?
>
> I was looking to add something very similar to this nodetool repair_admin
> but it would work on co-ordinator repair commands.
>
> For example:
> $ nodetool repair_admin --list
> Repair#1 mykeyspace columnFamilies=colfamilya,colfamilyb;
> incremental=True;
> parallelism=parallel progress=5%
>
> $ nodetool repair_admin --terminate 1
> Terminating repair command #1 (19f00c30-1390-11e7-bb50-ffb920a6d70f)
>
> $ nodetool repair_admin --terminate-all # calls ssProxy.forceTerminateAllRepairSessions()
> Terminating all repair sessions
> Terminated repair command #2 (64c44230-21aa-11e7-9ede-cd6eb64e3786)
>
> What is the purpose of the current repair_admin? If I wish to add the
> above should I rename the MBean to say
> org.apache.cassandra.db:type=IncrementalRepairService and the nodetool
> command to inc_repair_admin ?
>
>


Re: Repair Management

2017-05-18 Thread Blake Eggleston
> I am looking to improve monitoring and management of repairs (so far I have
> a patch for adding ActiveRepairs to table/keyspace metrics) and came across
> ActiveRepairServiceMBean but this appears to be limited to incremental
> repairs. Is there a reason for this?

The incremental repair stuff was just the first set of JMX controls added to 
ActiveRepairService. ActiveRepairService is involved in all repairs though.

> I was looking to add something very similar to this nodetool repair_admin
> but it would work on co-ordinator repair commands.

I'm not sure what you mean by "coordinator repair commands". Do you mean full 
repairs?

> What is the purpose of the current repair_admin? If I wish to add the above
> should I rename the MBean to say
> org.apache.cassandra.db:type=IncrementalRepairService and the nodetool
> command to inc_repair_admin ?

nodetool help repair_admin says its purpose is to "list and fail incremental 
repair sessions". However, failing an incremental repair session doesn't 
cancel the validation/sync; it just releases the sstables that were involved 
in the repair back into the unrepaired data set. I don't see any reason why 
you couldn't add this functionality to the existing RepairService mbean. That 
said, before getting into mbean names, it's probably best to come up with a 
plan for cancelling validation and sync on each of the replicas involved in a 
given repair. As far as I know (though I may be wrong), that's not currently 
supported.
On May 17, 2017 at 7:36:51 PM, Cameron Zemek (came...@instaclustr.com) wrote:

I am looking to improve monitoring and management of repairs (so far I have a 
patch for adding ActiveRepairs to table/keyspace metrics) and came across 
ActiveRepairServiceMBean but this appears to be limited to incremental 
repairs. Is there a reason for this? 

I was looking to add something very similar to this nodetool repair_admin  
but it would work on co-ordinator repair commands.  

For example:  
$ nodetool repair_admin --list  
Repair#1 mykeyspace columnFamilies=colfamilya,colfamilyb; incremental=True;  
parallelism=parallel progress=5%  

$ nodetool repair_admin --terminate 1  
Terminating repair command #1 (19f00c30-1390-11e7-bb50-ffb920a6d70f)  

$ nodetool repair_admin --terminate-all # calls ssProxy.forceTerminateAllRepairSessions()
Terminating all repair sessions  
Terminated repair command #2 (64c44230-21aa-11e7-9ede-cd6eb64e3786)  

What is the purpose of the current repair_admin? If I wish to add the above  
should I rename the MBean to say  
org.apache.cassandra.db:type=IncrementalRepairService and the nodetool  
command to inc_repair_admin ?  


Repair Management

2017-05-17 Thread Cameron Zemek
I am looking to improve monitoring and management of repairs (so far I have a
patch for adding ActiveRepairs to table/keyspace metrics) and came across
ActiveRepairServiceMBean, but this appears to be limited to incremental
repairs. Is there a reason for this?

I was looking to add something very similar to this nodetool repair_admin
but it would work on co-ordinator repair commands.

For example:
$ nodetool repair_admin --list
Repair#1 mykeyspace columnFamilies=colfamilya,colfamilyb; incremental=True;
parallelism=parallel progress=5%

$ nodetool repair_admin --terminate 1
Terminating repair command #1 (19f00c30-1390-11e7-bb50-ffb920a6d70f)

$ nodetool repair_admin --terminate-all  # calls ssProxy.forceTerminateAllRepairSessions()
Terminating all repair sessions
Terminated repair command #2 (64c44230-21aa-11e7-9ede-cd6eb64e3786)

What is the purpose of the current repair_admin? If I wish to add the above
should I rename the MBean to say
org.apache.cassandra.db:type=IncrementalRepairService and the nodetool
command to inc_repair_admin ?
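
The listing proposed above could be backed by a small value object per coordinator repair command. The following is only a sketch under assumptions: RepairCommandInfo and its fields are hypothetical names chosen for illustration, not classes from Cassandra or from the linked branch. It renders one entry in the style of the example `--list` output.

```java
import java.util.List;

// Hypothetical value class, not Cassandra code: describes one repair
// command as seen from the coordinator node.
final class RepairCommandInfo {
    final int commandId;
    final String keyspace;
    final List<String> columnFamilies;
    final boolean incremental;
    final String parallelism;
    final int progressPercent;

    RepairCommandInfo(int commandId, String keyspace, List<String> columnFamilies,
                      boolean incremental, String parallelism, int progressPercent) {
        this.commandId = commandId;
        this.keyspace = keyspace;
        this.columnFamilies = columnFamilies;
        this.incremental = incremental;
        this.parallelism = parallelism;
        this.progressPercent = progressPercent;
    }

    // Render a single line modelled on the example listing in this message.
    String describe() {
        return String.format(
            "Repair#%d %s columnFamilies=%s; incremental=%s; parallelism=%s progress=%d%%",
            commandId,
            keyspace,
            String.join(",", columnFamilies),
            incremental ? "True" : "False",
            parallelism,
            progressPercent);
    }
}
```

Keeping the formatting in one method means the nodetool side would only need to print the strings it receives, whatever transport (for example JMX) ends up carrying them.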