-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47961/#review135301
-----------------------------------------------------------


ambari-server/src/main/java/org/apache/ambari/server/controller/utilities/BufferedThreadPoolExecutorCompletionService.java (line 107)
<https://reviews.apache.org/r/47961/#comment200325>

Small optimization: returning here means that none of the overflowed
requests are submitted until ALL of the already-submitted ones have
completed. So if a pool of 10 threads with a 10-slot queue receives 22
requests, the 2 overflowed requests will not execute until the other 20
have finished. The optimization is to submit the overflowed requests
immediately, as soon as we know that submitted requests have completed.


- Srimanth Gunturi


On May 27, 2016, 7:18 p.m., Jonathan Hurley wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47961/
> -----------------------------------------------------------
>
> (Updated May 27, 2016, 7:18 p.m.)
>
>
> Review request for Ambari, Alejandro Fernandez, Nate Cole, Robert Levas,
> Robert Nettleton, and Srimanth Gunturi.
>
>
> Bugs: AMBARI-16913
> https://issues.apache.org/jira/browse/AMBARI-16913
>
>
> Repository: ambari
>
>
> Description
> -------
>
> Incoming requests from the web client (or from any REST API) will eventually
> be routed to the property provider / subresource framework. It is here where
> any JMX data is queried for within the context of the REST request.
> In large clusters, these requests can back up quite easily (even with a
> massive threadpool), causing UX degradations in the web client:
>
> ```
> Thread [qtp-ambari-client-38]
> JMXPropertyProvider(ThreadPoolEnabledPropertyProvider).populateResources(Set<Resource>, Request, Predicate) line: 168
> JMXPropertyProvider.populateResources(Set<Resource>, Request, Predicate) line: 156
> StackDefinedPropertyProvider.populateResources(Set<Resource>, Request, Predicate) line: 200
> ClusterControllerImpl.populateResources(Type, Set<Resource>, Request, Predicate) line: 155
> QueryImpl.queryForResources() line: 407
> QueryImpl.execute() line: 217
> ReadHandler.handleRequest(Request) line: 69
> GetRequest(BaseRequest).process() line: 145
> ```
>
> Consider one of the calls made by the web client:
> ```
> GET api/v1/clusters/c1/components/?
> ServiceComponentInfo/category=MASTER&
> fields=
> ServiceComponentInfo/service_name,
> host_components/HostRoles/display_name,
> host_components/HostRoles/host_name,
> host_components/HostRoles/state,
> host_components/HostRoles/maintenance_state,
> host_components/HostRoles/stale_configs,
> host_components/HostRoles/ha_state,
> host_components/HostRoles/desired_admin_state,
> host_components/metrics/jvm/memHeapUsedM,
> host_components/metrics/jvm/HeapMemoryMax,
> host_components/metrics/jvm/HeapMemoryUsed,
> host_components/metrics/jvm/memHeapCommittedM,
> host_components/metrics/mapred/jobtracker/trackers_decommissioned,
> host_components/metrics/cpu/cpu_wio,
> host_components/metrics/rpc/client/RpcQueueTime_avg_time,
> host_components/metrics/dfs/FSNamesystem/*,
> host_components/metrics/dfs/namenode/Version,
> host_components/metrics/dfs/namenode/LiveNodes,
> host_components/metrics/dfs/namenode/DeadNodes,
> host_components/metrics/dfs/namenode/DecomNodes,
> host_components/metrics/dfs/namenode/TotalFiles,
> host_components/metrics/dfs/namenode/UpgradeFinalized,
> host_components/metrics/dfs/namenode/Safemode,
> host_components/metrics/runtime/StartTime
> ```
>
> This query is essentially saying that for every {{MASTER}}, get metrics from
> them. The problem is that in a large cluster, there could be 100 masters, yet
> the metrics being asked for are only for NameNode. As a result, the JMX
> endpoints for all 100 masters are queried - *live* - as part of the request.
>
> There are two inherent flaws with this approach:
>
> - Even with millisecond JMX response times, multiplying this by hundreds and
> then adding parsing overhead causes a noticeable delay in the web client as
> the federated requests are blocking the main UX request
>
> - Although there is a threadpool which scales up to service these requests,
> that only really works for 1 user. With multiple users logged in, you'd need
> hundreds upon hundreds of threads pulling in the same JMX data
>
> This data should never be queried for directly as part of the incoming REST
> requests. Instead, an autonomous pool of threads should be constantly
> retrieving these point-in-time metrics and updating a cache. The cache is
> then used to service all live REST requests.
> - On the first request to a resource, a cache miss occurs and no data is
> returned. I think this is acceptable since metrics take a few moments to
> populate anyway right now. As the web client polls, the next request should
> pick up the newly cached metrics.
> - Only URLs which are being asked for by incoming REST requests should be
> considered for retrieval.
> After some time, if they haven't been requested,
> then the headless threadpool can stop trying to update their data
> - All JMX data will be parsed and stored in-memory, in an expiring cache
>
>
> Diffs
> -----
>
> ambari-server/src/main/java/org/apache/ambari/server/AmbariService.java
> 186e272
>
> ambari-server/src/main/java/org/apache/ambari/server/configuration/Configuration.java
> 7cfaf61
>
> ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementController.java
> d6b9d0e
>
> ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java
> f4a615c
>
> ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariServer.java
> 99a6cab
>
> ambari-server/src/main/java/org/apache/ambari/server/controller/ControllerModule.java
> 617553b
>
> ambari-server/src/main/java/org/apache/ambari/server/controller/internal/AbstractProviderModule.java
> 04a8f0a
>
> ambari-server/src/main/java/org/apache/ambari/server/controller/internal/StackDefinedPropertyProvider.java
> 6c40d14
>
> ambari-server/src/main/java/org/apache/ambari/server/controller/jmx/JMXPropertyProvider.java
> 1ccc5df
>
> ambari-server/src/main/java/org/apache/ambari/server/controller/metrics/MetricPropertyProviderFactory.java
> PRE-CREATION
>
> ambari-server/src/main/java/org/apache/ambari/server/controller/metrics/RestMetricsPropertyProvider.java
> 6f2a134
>
> ambari-server/src/main/java/org/apache/ambari/server/controller/metrics/ThreadPoolEnabledPropertyProvider.java
> 6f4a6ea
>
> ambari-server/src/main/java/org/apache/ambari/server/controller/utilities/BufferedThreadPoolExecutorCompletionService.java
> 4d6daa6
>
> ambari-server/src/main/java/org/apache/ambari/server/controller/utilities/ScalingThreadPoolExecutor.java
> 7a5e479
>
> ambari-server/src/main/java/org/apache/ambari/server/state/services/MetricsRetrievalService.java
> PRE-CREATION
>
> ambari-server/src/test/java/org/apache/ambari/server/configuration/ConfigurationTest.java
> 5d65ea7
>
> ambari-server/src/test/java/org/apache/ambari/server/controller/internal/StackDefinedPropertyProviderTest.java
> 32e84cb
>
> ambari-server/src/test/java/org/apache/ambari/server/controller/metrics/RestMetricsPropertyProviderTest.java
> f78024f
>
> ambari-server/src/test/java/org/apache/ambari/server/controller/test/BufferedThreadPoolExecutorCompletionServiceTest.java
> f47068c
>
> Diff: https://reviews.apache.org/r/47961/diff/
>
>
> Testing
> -------
>
> PENDING
>
>
> Thanks,
>
> Jonathan Hurley
>
>
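
The optimization suggested in the review comment at the top can be sketched
roughly as follows. This is a minimal, hypothetical illustration and not the
code under review: the class `OverflowResubmitSketch`, its methods, and the
`runAll` helper are made-up names, and the sketch assumes a single caller
thread that both submits tasks and collects results. Tasks rejected by the
bounded pool are buffered, and each call to `take()` tries to resubmit them
as soon as one completed result is available, rather than waiting for every
submitted task to finish.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch, not the class under review. Single caller thread only. */
final class OverflowResubmitSketch<V> {
  private final ThreadPoolExecutor executor;
  private final CompletionService<V> completionService;
  // Tasks the bounded executor rejected, waiting for free capacity.
  private final Queue<Callable<V>> overflow = new ArrayDeque<>();
  // Tasks accepted by the executor whose results have not been taken yet.
  private int inFlight = 0;

  OverflowResubmitSketch(int threads, int queueSize) {
    // Fixed-size pool with a bounded queue; the default handler rejects
    // with RejectedExecutionException once both are full.
    executor = new ThreadPoolExecutor(threads, threads, 30L, TimeUnit.SECONDS,
        new ArrayBlockingQueue<Runnable>(queueSize));
    completionService = new ExecutorCompletionService<>(executor);
  }

  void submit(Callable<V> task) {
    try {
      completionService.submit(task);
      inFlight++;
    } catch (RejectedExecutionException e) {
      overflow.add(task); // pool and queue are full: buffer instead of failing
    }
  }

  boolean hasPending() {
    return inFlight > 0 || !overflow.isEmpty();
  }

  /** Returns the next completed task, resubmitting overflow before and after waiting. */
  Future<V> take() throws InterruptedException {
    drainOverflow();
    if (inFlight == 0) {
      throw new IllegalStateException("no submitted tasks to wait for");
    }
    Future<V> done = completionService.take();
    inFlight--;
    drainOverflow(); // capacity may have freed up: do not wait for the rest
    return done;
  }

  private void drainOverflow() {
    while (!overflow.isEmpty()) {
      try {
        completionService.submit(overflow.peek());
      } catch (RejectedExecutionException e) {
        return; // still full; retry on the next take()
      }
      overflow.remove();
      inFlight++;
    }
  }

  void shutdown() {
    executor.shutdown();
  }

  /** Demo helper: runs taskCount short tasks and returns their ids in completion order. */
  static List<Integer> runAll(int threads, int queueSize, int taskCount) {
    OverflowResubmitSketch<Integer> service =
        new OverflowResubmitSketch<>(threads, queueSize);
    List<Integer> results = new ArrayList<>();
    try {
      for (int i = 0; i < taskCount; i++) {
        final int id = i;
        service.submit(() -> {
          Thread.sleep(20); // stand-in for a slow JMX call
          return id;
        });
      }
      while (service.hasPending()) {
        results.add(service.take().get());
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException(e);
    } finally {
      service.shutdown();
    }
    return results;
  }
}
```

With the numbers from the comment, `OverflowResubmitSketch.runAll(10, 10, 22)`
would typically buffer the last 2 tasks at submit time and hand them to the
pool while earlier results are still being collected. A resubmission can
still be rejected if a worker has not yet pulled its next task off the queue,
which is why the sketch simply retries on the following `take()`.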