Provide an admin page displaying events in the cluster along with cluster
status/health
---------------------------------------------------------------------------------------
Key: HADOOP-5526
URL: https://issues.apache.org/jira/browse/HADOOP-5526
Project: Hadoop Core
Issue Type: New Feature
Components: mapred
Reporter: Amar Kamat
Here are a few things that will help admins understand what's happening in the
cluster:
# Event updates
## recently added trackers
## lost trackers
## recently submitted jobs
## user updates
## killed/failed attempts/tasks
## killed jobs and the reason
## recent exceptions (e.g. OOM)
## expired tasks
## recovery manager updates
## memory/cpu usage
## blacklisting of trackers
## killing of maps based on fetch failures
## info about why some jobs were rejected (ACLs, max tasks), failed (failures),
or killed (user)
## etc
# Status:
## tracker health and status
## User status
### num jobs submitted
### total time the cluster was used
### success/failed/killed history
## job status
### task completion events
### recently scheduled tasks
### progress
### killed/failed/success history
## space on the box where the JobTracker (JT) is running
## etc
# Config:
## slot info
## acl info
## etc
----
Graphical views and automatic updates would be cool. Raising alarms upon certain
events would be super cool.
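
To make the "recent events" and "alarms" ideas above concrete, here is a minimal sketch of a bounded in-memory event log that an admin page could read from. Everything in it is hypothetical and for illustration only: ClusterEventLog, its Type values, and the onEvent/record/recent methods are not existing Hadoop classes or APIs, and the event types are just a small sample of the list above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: a bounded in-memory log of recent cluster events. */
class ClusterEventLog {

    /** Illustrative event categories sampled from the list above; not a Hadoop enum. */
    enum Type { TRACKER_ADDED, TRACKER_LOST, TRACKER_BLACKLISTED, JOB_SUBMITTED, JOB_KILLED, TASK_FAILED }

    /** One event: what happened, when, and a free-form detail or reason. */
    static final class Event {
        final Type type;
        final long timestampMillis;
        final String detail;

        Event(Type type, long timestampMillis, String detail) {
            this.type = type;
            this.timestampMillis = timestampMillis;
            this.detail = detail;
        }
    }

    private final int capacity;
    private final Deque<Event> events = new ArrayDeque<>();
    private final List<Consumer<Event>> alarms = new ArrayList<>();

    ClusterEventLog(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Registers an alarm callback that fires for every recorded event of the given type. */
    synchronized void onEvent(Type type, Consumer<Event> alarm) {
        alarms.add(e -> {
            if (e.type == type) {
                alarm.accept(e);
            }
        });
    }

    /** Records an event, evicting the oldest one when the buffer is full. */
    synchronized void record(Event e) {
        if (events.size() == capacity) {
            events.removeFirst();
        }
        events.addLast(e);
        for (Consumer<Event> alarm : alarms) {
            alarm.accept(e);
        }
    }

    /** Returns a snapshot of the retained events, oldest first. */
    synchronized List<Event> recent() {
        return new ArrayList<>(events);
    }
}
```

The fixed capacity keeps memory use bounded on the JT, and returning a snapshot from recent() lets a page render the list without holding the lock. Whether the real feature should keep events in memory at all is an open design question that this sketch does not settle.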