-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47961/
-----------------------------------------------------------

(Updated May 27, 2016, 1:27 p.m.)


Review request for Ambari, Alejandro Fernandez, Nate Cole, Robert Levas, Robert Nettleton, and Srimanth Gunturi.


Changes
-------

Something is wrong with ReviewBoard and the patch; trying it again. Some files are not showing up.


Bugs: AMBARI-16913
    https://issues.apache.org/jira/browse/AMBARI-16913


Repository: ambari


Description
-------

Incoming requests from the web client (or from any REST API) will eventually be routed to the property provider / subresource framework. It is here where any JMX data is queried for within the context of the REST request. In large clusters, these requests can back up quite easily (even with a massive threadpool), causing UX degradations in the web client:

```
Thread [qtp-ambari-client-38]
  JMXPropertyProvider(ThreadPoolEnabledPropertyProvider).populateResources(Set<Resource>, Request, Predicate) line: 168
  JMXPropertyProvider.populateResources(Set<Resource>, Request, Predicate) line: 156
  StackDefinedPropertyProvider.populateResources(Set<Resource>, Request, Predicate) line: 200
  ClusterControllerImpl.populateResources(Type, Set<Resource>, Request, Predicate) line: 155
  QueryImpl.queryForResources() line: 407
  QueryImpl.execute() line: 217
  ReadHandler.handleRequest(Request) line: 69
  GetRequest(BaseRequest).process() line: 145
```

Consider one of the calls made by the web client:

```
GET api/v1/clusters/c1/components/?
ServiceComponentInfo/category=MASTER&
fields=
ServiceComponentInfo/service_name,
host_components/HostRoles/display_name,
host_components/HostRoles/host_name,
host_components/HostRoles/state,
host_components/HostRoles/maintenance_state,
host_components/HostRoles/stale_configs,
host_components/HostRoles/ha_state,
host_components/HostRoles/desired_admin_state,
host_components/metrics/jvm/memHeapUsedM,
host_components/metrics/jvm/HeapMemoryMax,
host_components/metrics/jvm/HeapMemoryUsed,
host_components/metrics/jvm/memHeapCommittedM,
host_components/metrics/mapred/jobtracker/trackers_decommissioned,
host_components/metrics/cpu/cpu_wio,
host_components/metrics/rpc/client/RpcQueueTime_avg_time,
host_components/metrics/dfs/FSNamesystem/*,
host_components/metrics/dfs/namenode/Version,
host_components/metrics/dfs/namenode/LiveNodes,
host_components/metrics/dfs/namenode/DeadNodes,
host_components/metrics/dfs/namenode/DecomNodes,
host_components/metrics/dfs/namenode/TotalFiles,
host_components/metrics/dfs/namenode/UpgradeFinalized,
host_components/metrics/dfs/namenode/Safemode,
host_components/metrics/runtime/StartTime
```

This query is essentially saying: for every {{MASTER}} component, get these metrics from it. The problem is that in a large cluster there could be 100 masters, yet the metrics being asked for are only for NameNode. As a result, the JMX endpoints for all 100 masters are queried - *live* - as part of the request.

There are two inherent flaws with this approach:
- Even with millisecond JMX response times, multiplying this by hundreds of endpoints and then adding parsing overhead causes a noticeable delay in the web client, because the federated requests block the main UX request.
- Although there is a threadpool which scales up to service these requests, that only really works for one user. With multiple users logged in, you'd need hundreds upon hundreds of threads pulling in the same JMX data.

This data should never be queried for directly as part of the incoming REST requests.
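
To put rough numbers on the two flaws above, here is a back-of-the-envelope sketch in Java. Only the count of 100 masters comes from the description; the 5 ms per-call time, the 20-thread pool, and the 10 concurrent users are assumptions picked for illustration, not measurements. The class and method names are hypothetical and are not part of Ambari.

```java
// Hypothetical helper, not part of Ambari: estimates the cost of live JMX fan-out.
class FanOutCost {

    /** Rounds needed when `calls` blocking calls share `poolThreads` workers. */
    static long rounds(long calls, int poolThreads) {
        return (calls + poolThreads - 1) / poolThreads; // ceiling division
    }

    /** Lower bound, in ms, on the time until all `calls` queries have finished. */
    static long blockedMillis(long calls, int poolThreads, long perCallMillis) {
        return rounds(calls, poolThreads) * perCallMillis;
    }

    public static void main(String[] args) {
        int masters = 100;       // from the description above
        long perCallMillis = 5;  // ASSUMPTION: JMX fetch plus parsing per endpoint
        int poolThreads = 20;    // ASSUMPTION: size of the shared threadpool
        int users = 10;          // ASSUMPTION: users polling at the same time

        // One request, no pool: every endpoint is queried back to back.
        long sequential = blockedMillis(masters, 1, perCallMillis);
        // One request with the pool to itself.
        long oneUser = blockedMillis(masters, poolThreads, perCallMillis);
        // All users' requests competing for the same pool.
        long allUsers = blockedMillis((long) masters * users, poolThreads, perCallMillis);
        // Threads needed for every user to query every endpoint in parallel.
        long threadsToKeepUp = (long) masters * users;

        System.out.println("sequential ms:         " + sequential);
        System.out.println("one user, pooled ms:   " + oneUser);
        System.out.println("all users, pooled ms:  " + allUsers);
        System.out.println("threads to keep up:    " + threadsToKeepUp);
    }
}
```

Under these assumed numbers, the pool helps a single user a great deal, but the benefit shrinks roughly in proportion to the number of concurrent users, and restoring it takes one thread per user per endpoint, all fetching the same JMX data.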

Instead of querying live, an autonomous pool of threads should constantly retrieve these point-in-time metrics and update a cache. The cache is then used to service all live REST requests.
- On the first request to a resource, a cache miss occurs and no data is returned. I think this is acceptable since metrics take a few moments to populate anyway right now. As the web client polls, the next request should pick up the newly cached metrics.
- Only URLs which are being asked for by incoming REST requests should be considered for retrieval. After some time, if they haven't been requested, the headless threadpool can stop trying to update their data.
- All JMX data will be parsed and stored in-memory, in an expiring cache.


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/AmbariService.java 186e272
  ambari-server/src/main/java/org/apache/ambari/server/configuration/Configuration.java 7cfaf61
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementController.java d6b9d0e
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java f4a615c
  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariServer.java 99a6cab
  ambari-server/src/main/java/org/apache/ambari/server/controller/ControllerModule.java 617553b
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/AbstractProviderModule.java 04a8f0a
  ambari-server/src/main/java/org/apache/ambari/server/controller/internal/StackDefinedPropertyProvider.java 6c40d14
  ambari-server/src/main/java/org/apache/ambari/server/controller/jmx/JMXPropertyProvider.java 1ccc5df
  ambari-server/src/main/java/org/apache/ambari/server/controller/metrics/MetricPropertyProviderFactory.java PRE-CREATION
  ambari-server/src/main/java/org/apache/ambari/server/controller/metrics/RestMetricsPropertyProvider.java 6f2a134
  ambari-server/src/main/java/org/apache/ambari/server/controller/metrics/ThreadPoolEnabledPropertyProvider.java 6f4a6ea
  ambari-server/src/main/java/org/apache/ambari/server/controller/utilities/BufferedThreadPoolExecutorCompletionService.java 4d6daa6
  ambari-server/src/main/java/org/apache/ambari/server/controller/utilities/ScalingThreadPoolExecutor.java 7a5e479
  ambari-server/src/main/java/org/apache/ambari/server/state/services/MetricsRetrievalService.java PRE-CREATION
  ambari-server/src/test/java/org/apache/ambari/server/configuration/ConfigurationTest.java 5d65ea7
  ambari-server/src/test/java/org/apache/ambari/server/controller/metrics/RestMetricsPropertyProviderTest.java f78024f
  ambari-server/src/test/java/org/apache/ambari/server/controller/test/BufferedThreadPoolExecutorCompletionServiceTest.java f47068c

Diff: https://reviews.apache.org/r/47961/diff/


Testing
-------

PENDING


Thanks,

Jonathan Hurley