-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47961/
-----------------------------------------------------------

(Updated May 27, 2016, 5:43 p.m.)


Review request for Ambari, Alejandro Fernandez, Nate Cole, Robert Levas, Robert 
Nettleton, and Srimanth Gunturi.


Bugs: AMBARI-16913
    https://issues.apache.org/jira/browse/AMBARI-16913


Repository: ambari


Description
-------

Incoming requests from the web client (or from any REST API) will eventually be 
routed to the property provider / subresource framework. It is here were any 
JMX data is queried for within the context of the REST request. In large 
clusters, these requests can backup quite easily (even with a massive 
threadpool), causing UX degradations in the web client:

```
Thread [qtp-ambari-client-38]
        
JMXPropertyProvider(ThreadPoolEnabledPropertyProvider).populateResources(Set<Resource>,
 Request, Predicate) line: 168   
        JMXPropertyProvider.populateResources(Set<Resource>, Request, 
Predicate) line: 156      
        StackDefinedPropertyProvider.populateResources(Set<Resource>, Request, 
Predicate) line: 200     
        ClusterControllerImpl.populateResources(Type, Set<Resource>, Request, 
Predicate) line: 155      
        QueryImpl.queryForResources() line: 407 
        QueryImpl.execute() line: 217   
        ReadHandler.handleRequest(Request) line: 69     
        GetRequest(BaseRequest).process() line: 145     
```

Consider one of the calls made by the web client:
```
GET api/v1/clusters/c1/components/?
ServiceComponentInfo/category=MASTER&
fields=
ServiceComponentInfo/service_name,
host_components/HostRoles/display_name,
host_components/HostRoles/host_name,
host_components/HostRoles/state,
host_components/HostRoles/maintenance_state,
host_components/HostRoles/stale_configs,
host_components/HostRoles/ha_state,
host_components/HostRoles/desired_admin_state,
host_components/metrics/jvm/memHeapUsedM,
host_components/metrics/jvm/HeapMemoryMax,
host_components/metrics/jvm/HeapMemoryUsed,
host_components/metrics/jvm/memHeapCommittedM,
host_components/metrics/mapred/jobtracker/trackers_decommissioned,
host_components/metrics/cpu/cpu_wio,
host_components/metrics/rpc/client/RpcQueueTime_avg_time,
host_components/metrics/dfs/FSNamesystem/*,
host_components/metrics/dfs/namenode/Version,
host_components/metrics/dfs/namenode/LiveNodes,
host_components/metrics/dfs/namenode/DeadNodes,
host_components/metrics/dfs/namenode/DecomNodes,
host_components/metrics/dfs/namenode/TotalFiles,
host_components/metrics/dfs/namenode/UpgradeFinalized,
host_components/metrics/dfs/namenode/Safemode,
host_components/metrics/runtime/StartTime
```

This query is essentially saying that for every {{MASTER}}, get metrics from 
them. The problem is that in a large cluster, there could be 100 masters, yet 
the metrics being asked for are only for NameNode. As a result, the JMX 
endpoints for all 100 masters are queried - *live* - as part of the request.

There are two inherent flaws with this approach:

- Even with millisecond JMX response times, multiplying this by 100's and then 
adding parsing overhead causes a noticeable delay in the web client as the 
federated requests are blocking the main UX request

- Although there is a threadpool which scales up to service these requests - 
that only really works for 1 user. With multiple users logged in, you'd need 
100's upon 100's of threads pulling in the same JMX data

This data should never be queried for directly as part of the incoming REST 
requests. Instead, an autonomous pool of threads should be constantly 
retrieving these point-in-time metrics and updating a cache. The cache is then 
used to service all live REST requests. 
- On the first request to a resource, a cache miss occurs and no data is 
returned. I think this is acceptable since metrics take a few moments to 
populate anyway right now. As the web client polls, the next request should 
pickup the newly cached metrics.
- Only URLs which are being asked for by incoming REST requests should be 
considered for retrieval. After sometime, if they haven't been requested, then 
the headless threadpool can stop trying to update their data
- All JMX data will be parsed and stored in-memory, in an expiring cache


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/AmbariService.java 
186e272 
  
ambari-server/src/main/java/org/apache/ambari/server/configuration/Configuration.java
 7cfaf61 
  
ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementController.java
 d6b9d0e 
  
ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java
 f4a615c 
  
ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariServer.java
 99a6cab 
  
ambari-server/src/main/java/org/apache/ambari/server/controller/ControllerModule.java
 617553b 
  
ambari-server/src/main/java/org/apache/ambari/server/controller/internal/AbstractProviderModule.java
 04a8f0a 
  
ambari-server/src/main/java/org/apache/ambari/server/controller/internal/StackDefinedPropertyProvider.java
 6c40d14 
  
ambari-server/src/main/java/org/apache/ambari/server/controller/jmx/JMXPropertyProvider.java
 1ccc5df 
  
ambari-server/src/main/java/org/apache/ambari/server/controller/metrics/MetricPropertyProviderFactory.java
 PRE-CREATION 
  
ambari-server/src/main/java/org/apache/ambari/server/controller/metrics/RestMetricsPropertyProvider.java
 6f2a134 
  
ambari-server/src/main/java/org/apache/ambari/server/controller/metrics/ThreadPoolEnabledPropertyProvider.java
 6f4a6ea 
  
ambari-server/src/main/java/org/apache/ambari/server/controller/utilities/BufferedThreadPoolExecutorCompletionService.java
 4d6daa6 
  
ambari-server/src/main/java/org/apache/ambari/server/controller/utilities/ScalingThreadPoolExecutor.java
 7a5e479 
  
ambari-server/src/main/java/org/apache/ambari/server/state/services/MetricsRetrievalService.java
 PRE-CREATION 
  
ambari-server/src/test/java/org/apache/ambari/server/configuration/ConfigurationTest.java
 5d65ea7 
  
ambari-server/src/test/java/org/apache/ambari/server/controller/internal/StackDefinedPropertyProviderTest.java
 32e84cb 
  
ambari-server/src/test/java/org/apache/ambari/server/controller/metrics/JMXPropertyProviderTest.java
 1628f1f 
  
ambari-server/src/test/java/org/apache/ambari/server/controller/metrics/RestMetricsPropertyProviderTest.java
 f78024f 
  
ambari-server/src/test/java/org/apache/ambari/server/controller/test/BufferedThreadPoolExecutorCompletionServiceTest.java
 f47068c 
  
ambari-server/src/test/java/org/apache/ambari/server/utils/SynchronousThreadPoolExecutor.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/47961/diff/


Testing
-------

PENDING


Thanks,

Jonathan Hurley

Reply via email to