> The initial step could be something simple
> like request processing metrics and exposing the numbers via JMX
>
> 1. What metrics are we interested in?
> 2. Who are the potential consumers of this data? Dashboards?
> 3. How do we want to expose the metrics?
> 4. Do we want to capture metrics at a service level (e.g. All requests
> made for WebHDFS)?
>
I would appreciate some JMX 'counters'. Consumers of other services (NameNode, ResourceManager and NodeManager) already ingest JMX into DBs/dashboards like Elasticsearch/Kibana or Solr/Banana, so JMX from Knox can slide into those existing patterns and avoid having to Flume the audit log. That said, the audit log provides more information about TPS per user and the endpoints they are hitting.

I consider Knox to be a scale-up and scale-down service. If the TPS can be associated with Knox load, then decisions can be made to spin up new Knox VMs, behind a balancer, to meet that demand dynamically. As with TPS, byte transfer counts, per service and in aggregate, would provide better facts for a dynamic scaling decision.

In the way I use Knox now, it is co-located with other services, which blurs the packet transfers that Knox is conducting versus those of the other services. Per-topology and per-service metrics would be best.

Other metrics, as counts:

- Unsuccessful logins
- Successful logins where the overall return was HTTP 500, which indicates a failure on the cluster side. An example would be users connecting to Knox with a valid AD user/pass who are not authorized in the cluster. This can happen when the cluster is in secure mode but a service like Centrify has not allowed the user into the cluster's zone.
- Unsuccessful AD lookups by Knox (the user doesn't exist)
- Connection counts, split into those that used an auth cookie and those that didn't and resulted in an AD lookup
- Current open connections

*These metrics wouldn't provide actionable intelligence on their own, but they would build a pattern showing that something is wrong and that the administrators should investigate.

I would also like the capability to reset the aggregate counters while Knox is running.

Kris
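
To make the request above more concrete, here is a minimal sketch of per-topology, per-service counters exposed as a standard JMX MBean with a reset operation. Everything in it is hypothetical: the class names, the `org.example.gateway` domain, and the attribute names are illustrative assumptions, not anything Knox provides today.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch only; none of these names exist in Knox.
final class GatewayMetricsSketch {

    // Standard MBean convention: interface name = implementation name + "MBean".
    public interface ServiceCountersMBean {
        long getRequestCount();
        long getBytesTransferred();
        long getFailedLoginCount();
        long getBackendErrorCount();
        int getOpenConnections();
        void reset();
    }

    public static final class ServiceCounters implements ServiceCountersMBean {
        private final AtomicLong requests = new AtomicLong();
        private final AtomicLong bytes = new AtomicLong();
        private final AtomicLong failedLogins = new AtomicLong();
        private final AtomicLong backendErrors = new AtomicLong();
        private final AtomicInteger open = new AtomicInteger();

        // Methods the gateway would call as events happen.
        public void connectionOpened() { open.incrementAndGet(); }
        public void connectionClosed() { open.decrementAndGet(); }
        public void requestCompleted(long byteCount) {
            requests.incrementAndGet();
            bytes.addAndGet(byteCount);
        }
        public void loginFailed() { failedLogins.incrementAndGet(); }
        public void backendError() { backendErrors.incrementAndGet(); } // e.g. HTTP 500 after a good login

        @Override public long getRequestCount() { return requests.get(); }
        @Override public long getBytesTransferred() { return bytes.get(); }
        @Override public long getFailedLoginCount() { return failedLogins.get(); }
        @Override public long getBackendErrorCount() { return backendErrors.get(); }
        @Override public int getOpenConnections() { return open.get(); }

        // Resets the aggregate counters at runtime; the open-connection
        // gauge is left alone because it reflects current state.
        @Override public void reset() {
            requests.set(0);
            bytes.set(0);
            failedLogins.set(0);
            backendErrors.set(0);
        }
    }

    // One MBean per topology/service pair, distinguished by ObjectName keys.
    // Assumes both values are plain identifiers that need no quoting.
    public static ObjectName register(String topology, String service,
                                      ServiceCounters counters) throws Exception {
        ObjectName name = new ObjectName(
            "org.example.gateway:type=ServiceCounters,topology=" + topology
            + ",service=" + service);
        ManagementFactory.getPlatformMBeanServer().registerMBean(counters, name);
        return name;
    }

    public static void main(String[] args) throws Exception {
        ServiceCounters webhdfs = new ServiceCounters();
        ObjectName name = register("sandbox", "WEBHDFS", webhdfs);

        webhdfs.connectionOpened();
        webhdfs.requestCompleted(4096);
        webhdfs.loginFailed();

        // Read and reset through JMX, as a remote console or collector would.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        System.out.println("requests=" + server.getAttribute(name, "RequestCount"));
        server.invoke(name, "reset", new Object[0], new String[0]);
        System.out.println("requests after reset=" + server.getAttribute(name, "RequestCount"));
    }
}
```

Keying the `ObjectName` on `topology` and `service` would let a collector query one service, one topology, or everything with a pattern such as `org.example.gateway:type=ServiceCounters,*`, and the `reset` operation can be invoked from any JMX console while the process keeps running.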
