[
https://issues.apache.org/jira/browse/CASSANDRA-4767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534184#comment-13534184
]
Yuki Morishita edited comment on CASSANDRA-4767 at 12/17/12 7:17 PM:
---------------------------------------------------------------------
Patch attached to track repair progress using JMX notifications.
I added a new JMX method, forceTableRepairAsync, to invoke repair
asynchronously, because I don't think we want to break the existing API in the
middle of the 1.1 release.
StorageServiceMBean now has notification support, so in the future we can use it
to turn other long-running JMX calls into async calls and track their progress
as well.
The repair command subscribes to JMX notifications and invokes the async repair.
It receives STARTED and FINISHED events to determine when the command starts and
stops, and SUCCESS and FAILED events to track the status of each repair session.
Sample output of 'nodetool repair' is as follows:
{code}
[2012-12-17 13:13:40,754] Starting repair command #1, repairing 3 ranges for keyspace Keyspace1
[2012-12-17 13:14:11,178] Repair session de064180-487d-11e2-0000-de5e2f7aa3ff for range (0,56713727820156410577229101238628035242] failed with error java.io.IOException: Endpoint /127.0.0.1 died
[2012-12-17 13:14:11,178] Repair session f026e950-487d-11e2-0000-de5e2f7aa3ff for range (56713727820156410577229101238628035242,113427455640312821154458202477256070484] finished
[2012-12-17 13:14:11,178] Repair session f0275e80-487d-11e2-0000-de5e2f7aa3ff for range (113427455640312821154458202477256070484,0] finished
[2012-12-17 13:14:11,178] Repair command #1 finished
{code}
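For readers who want to see the subscribe-then-invoke flow in isolation, here is a minimal in-process sketch using only the standard javax.management classes. It does not connect to a Cassandra node and is not the code from the patch: FakeRepairService, RepairWatcher, the Status enum, and the choice to carry the status in the notification's user data are all assumptions made for illustration.

```java
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;
import javax.management.NotificationListener;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class RepairProgressSketch {

    // Event names come from the comment; the enum itself is an assumption.
    enum Status { STARTED, SUCCESS, FAILED, FINISHED }

    // Hypothetical stand-in for the MBean: starts the work on a background
    // thread and reports progress through JMX notifications.
    static class FakeRepairService extends NotificationBroadcasterSupport {
        private final AtomicInteger sequence = new AtomicInteger();

        // Returns immediately; one boolean per simulated repair session.
        void repairAsync(String keyspace, boolean... sessionOk) {
            Thread worker = new Thread(() -> {
                emit(Status.STARTED, "Starting repair for keyspace " + keyspace);
                for (int i = 0; i < sessionOk.length; i++) {
                    Status s = sessionOk[i] ? Status.SUCCESS : Status.FAILED;
                    emit(s, "Repair session " + i + (sessionOk[i] ? " finished" : " failed"));
                }
                emit(Status.FINISHED, "Repair finished");
            });
            worker.start();
        }

        private void emit(Status status, String message) {
            Notification n = new Notification("repair", this,
                    sequence.incrementAndGet(), System.currentTimeMillis(), message);
            n.setUserData(status.name()); // assumed encoding of the status
            sendNotification(n);
        }
    }

    // Client side: counts per-session results and releases the latch on FINISHED.
    static class RepairWatcher implements NotificationListener {
        final AtomicInteger succeeded = new AtomicInteger();
        final AtomicInteger failed = new AtomicInteger();
        final CountDownLatch done = new CountDownLatch(1);

        @Override
        public void handleNotification(Notification n, Object handback) {
            Status status = Status.valueOf((String) n.getUserData());
            System.out.println("[" + status + "] " + n.getMessage());
            switch (status) {
                case SUCCESS: succeeded.incrementAndGet(); break;
                case FAILED: failed.incrementAndGet(); break;
                case FINISHED: done.countDown(); break;
                default: break;
            }
        }
    }

    // Subscribes first, then invokes the async call, then blocks until FINISHED.
    static RepairWatcher runRepair(FakeRepairService service, String keyspace,
                                   boolean... sessionOk) throws InterruptedException {
        RepairWatcher watcher = new RepairWatcher();
        service.addNotificationListener(watcher, null, null);
        service.repairAsync(keyspace, sessionOk);
        if (!watcher.done.await(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("timed out waiting for FINISHED");
        }
        return watcher;
    }

    public static void main(String[] args) throws InterruptedException {
        RepairWatcher w = runRepair(new FakeRepairService(), "Keyspace1", false, true, true);
        System.out.println(w.succeeded.get() + " succeeded, " + w.failed.get() + " failed");
    }
}
```

Registering the listener before invoking the async call matters here: otherwise the STARTED event could be emitted before anyone is listening.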
> Need some indication of node repair success or failure
> ------------------------------------------------------
>
> Key: CASSANDRA-4767
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4767
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Ahmed Bashir
> Assignee: Yuki Morishita
> Priority: Minor
> Labels: jmx
> Fix For: 1.1.9
>
> Attachments: 4767-1.1.txt
>
>
> We are currently verifying node repair status via basic log analysis. In
> order to automatically track the status of periodic node repair jobs, it
> would be better to have an indicator (through JMX perhaps).