Service Historian
-----------------

                 Key: HBASE-773
                 URL: https://issues.apache.org/jira/browse/HBASE-773
             Project: Hadoop HBase
          Issue Type: New Feature
          Components: master
    Affects Versions: 0.3.0
            Reporter: Andrew Purtell
            Priority: Minor


The Region Historian (see HBASE-533) is very useful for debugging issues on the 
cluster involving region splitting, assignment, etc. It would also be useful 
if the master kept a separate history of regionservers, recording when they:

* start up and report in

* quiesce/exit when the master tells them to

* fail (and report an error?) and exit

* are declared dead after their lease expires

* are assigned a region (some overlap with the Region Historian, but a 
different view)

* are asked to close a region (some overlap with the Region Historian, but a 
different view)

Maybe call it a Service Historian?

There should be event logs per regionserver identity, available even if a 
regionserver is offline. The logs can have a simple structure: Timestamp, 
Event, Description, like the Region Historian tables. 
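
To make the proposed structure concrete, here is a minimal in-memory sketch in 
Java. Every name in it (ServiceHistorian, EventType, Event, record, eventsFor, 
count) is hypothetical and does not exist in HBase; the event types are simply 
the cases listed above. The history is keyed by regionserver identity and held 
outside the regionserver, so it stays readable after a server goes offline. A 
real implementation would presumably persist rows the way the Region Historian 
does rather than keep them in memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch only; none of these names are part of HBase. */
final class ServiceHistorian {

  /** Event kinds, taken from the list of cases in this issue. */
  enum EventType {
    STARTUP,
    QUIESCE_EXIT,
    FAILURE_EXIT,
    LEASE_EXPIRED,
    REGION_ASSIGNED,
    REGION_CLOSE_REQUESTED
  }

  /** One history row: Timestamp, Event, Description. */
  static final class Event {
    final long timestamp;
    final EventType type;
    final String description;

    Event(long timestamp, EventType type, String description) {
      this.timestamp = timestamp;
      this.type = type;
      this.description = description;
    }
  }

  // Keyed by regionserver identity, so entries outlive the server process.
  private final Map<String, List<Event>> history = new ConcurrentHashMap<>();

  /** Appends one event to the history of the given regionserver. */
  void record(String serverId, long timestamp, EventType type, String description) {
    history
        .computeIfAbsent(serverId, k -> Collections.synchronizedList(new ArrayList<>()))
        .add(new Event(timestamp, type, description));
  }

  /** Returns a snapshot of the events for a regionserver, in insertion order. */
  List<Event> eventsFor(String serverId) {
    List<Event> events = history.get(serverId);
    if (events == null) {
      return Collections.emptyList();
    }
    synchronized (events) {
      return new ArrayList<>(events);
    }
  }

  /** Counts events of one type with a timestamp in [from, to], inclusive. */
  long count(String serverId, EventType type, long from, long to) {
    long n = 0;
    for (Event e : eventsFor(serverId)) {
      if (e.type == type && e.timestamp >= from && e.timestamp <= to) {
        n++;
      }
    }
    return n;
  }
}
```

With something like this, a call such as count("rs1", EventType.FAILURE_EXIT, 
from, to) would show how often a server failed in a given window, and the 
description field is a natural place for an error string reported on abort.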

Without such a history, it is still necessary to comb through logs to 
determine whether a regionserver was flaky during a given period of time. 

Additionally, it would be really helpful if regionservers could send an error 
string when they abort and restart, so that the errors can be viewed in a 
service history table.

Hyperlinks in the service history table would make it easy to follow a table 
and its regions over the lifetime of the system: essentially a reconstruction 
of the client view of the cluster over time. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
