[ https://issues.apache.org/jira/browse/AMBARI-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Sposetti updated AMBARI-4133:
----------------------------------

    Description: 
On a 600-node cluster, the Hosts page hangs for about 5 seconds and then 
unblocks for about 10 seconds, then freezes for 5 seconds, etc.
Chrome profiler shows that App.Host's *criticalAlertsCount* is eating up the 
CPU.  This is called by App.MainHostView's *hostCounts*, which is called by 
App.MainHostView's *label*.  This seems to be the cause for this 
freeze/unfreeze behavior. 

{code}
    criticalAlertsCount: function () {
      return App.router.get('clusterController.alerts')
        .filterProperty('hostName', this.get('hostName'))
        .filterProperty('isOk', false)
        .filterProperty('ignoredForHosts', false)
        .length;
    }.property('App.router.clusterController.alerts.length'),
{code}

This piece of code gets called for every single host in the cluster every time 
we reload the alerts from the server.
There are several approaches to fix this problem:
1. The server should have alert info as part of the Host resource.  This way, 
we can simply map it and the client does not have to do much.  This will be 
done in 1.5.0 with changes to Nagios alerting.
2. Since 1 won't be done until 1.5.0, we are left with improving the 
efficiency of the front-end code.  Upon loading alerts, we can load them into 
a map so that lookup by host (and service) is fast.  Ember's filterProperty is 
a linear search, so it is very inefficient on a large array such as the list 
of all alerts in the cluster, especially when repeated for every host in the 
cluster.  Also, we can sum up and store aggregate counts (like the total 
number of hosts with critical alerts) as we map alerts.  I'm speculating that 
we can get a big perf boost just by doing these things.
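
As a rough illustration of approach 2, the sketch below builds a per-host index in a single pass over the alerts, so each host reads its count with a constant-time lookup instead of re-filtering the full list.  It is not the attached patch: the helper names (indexCriticalAlerts, criticalAlertsCount), the use of plain objects for alerts, and the use of an ES6 Map are all assumptions made for the example.  Only the field names hostName, isOk and ignoredForHosts come from the code above.

```javascript
// Hypothetical helper (not from the patch): index alerts by host in one pass.
// Alerts are assumed to be plain objects exposing the fields read by the
// original filterProperty chain: hostName, isOk, ignoredForHosts.
function indexCriticalAlerts(alerts) {
  const countsByHost = new Map();
  for (const alert of alerts) {
    // Mirrors the original chain: count alerts that are not OK and not ignored.
    if (alert.isOk === false && alert.ignoredForHosts === false) {
      const current = countsByHost.get(alert.hostName) || 0;
      countsByHost.set(alert.hostName, current + 1);
    }
  }
  return {
    countsByHost: countsByHost,
    // Aggregate stored while mapping: hosts with at least one critical alert.
    hostsWithCriticalAlerts: countsByHost.size
  };
}

// Hypothetical per-host lookup: one Map read instead of three linear filters.
function criticalAlertsCount(index, hostName) {
  return index.countsByHost.get(hostName) || 0;
}

// Usage with made-up sample data.
const sampleAlerts = [
  { hostName: 'host1', isOk: false, ignoredForHosts: false },
  { hostName: 'host1', isOk: true, ignoredForHosts: false },
  { hostName: 'host2', isOk: false, ignoredForHosts: true }
];
const sampleIndex = indexCriticalAlerts(sampleAlerts);
console.log(criticalAlertsCount(sampleIndex, 'host1'));
console.log(sampleIndex.hostsWithCriticalAlerts);
```

With H hosts and A alerts, rebuilding the index costs one pass over A per reload, and the H lookups are each constant time, versus roughly H passes over A with the filterProperty chain.  How the index would be wired into App.Host's computed property is left open here.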


  was:
On a 600-node cluster, the Hosts page hangs for about 5 seconds and then 
unblocks for about 10 seconds, then freezes for 5 seconds, etc.
Chrome profiler shows that App.Host's *criticalAlertsCount* is eating up the 
CPU.  This is called by App.MainHostView's *hostCounts*, which is called by 
App.MainHostView's *label*.  This seems to be the cause for this 
freeze/unfreeze behavior. 

{code}
    criticalAlertsCount: function () {
      return App.router.get('clusterController.alerts')
        .filterProperty('hostName', this.get('hostName'))
        .filterProperty('isOk', false)
        .filterProperty('ignoredForHosts', false)
        .length;
    }.property('App.router.clusterController.alerts.length'),
{code}

This piece of code gets called for every single host in the cluster every time 
we reload the alerts from the server.
There are several approaches to fix this problem:
1. The server should have alert info as part of the Host resource.  This way, 
we can simply map it and the client does not have to do much.  This will be 
done in Baikal via BUG-11704.
2. Since 1 won't be done until Baikal, we are left with improving the 
efficiency of the front-end code.  Upon loading alerts, we can load them into 
a map so that lookup by host (and service) is fast.  Ember's filterProperty is 
a linear search, so it is very inefficient on a large array such as the list 
of all alerts in the cluster, especially when repeated for every host in the 
cluster.  Also, we can sum up and store aggregate counts (like the total 
number of hosts with critical alerts) as we map alerts.  I'm speculating that 
we can get a big perf boost just by doing these things.



> Perf issues on Hosts page - freezes for several seconds and then unfreezes 
> repeatedly on a large cluster
> --------------------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-4133
>                 URL: https://issues.apache.org/jira/browse/AMBARI-4133
>             Project: Ambari
>          Issue Type: Task
>    Affects Versions: 1.4.3
>            Reporter: Andrii Tkach
>            Assignee: Andrii Tkach
>            Priority: Critical
>             Fix For: 1.4.3
>
>         Attachments: AMBARI-4133.patch
>
>
> On a 600-node cluster, the Hosts page hangs for about 5 seconds and then 
> unblocks for about 10 seconds, then freezes for 5 seconds, etc.
> Chrome profiler shows that App.Host's *criticalAlertsCount* is eating up the 
> CPU.  This is called by App.MainHostView's *hostCounts*, which is called by 
> App.MainHostView's *label*.  This seems to be the cause for this 
> freeze/unfreeze behavior. 
> {code}
>     criticalAlertsCount: function () {
>       return App.router.get('clusterController.alerts')
>         .filterProperty('hostName', this.get('hostName'))
>         .filterProperty('isOk', false)
>         .filterProperty('ignoredForHosts', false)
>         .length;
>     }.property('App.router.clusterController.alerts.length'),
> {code}
> This piece of code gets called for every single host in the cluster every 
> time we reload the alerts from the server.
> There are several approaches to fix this problem:
> 1. The server should have alert info as part of the Host resource.  This way, 
> we can simply map it and the client does not have to do much.  This will be 
> done in 1.5.0 with changes to Nagios alerting.
> 2. Since 1 won't be done until 1.5.0, we are left with improving the 
> efficiency of the front-end code.  Upon loading alerts, we can load them 
> into a map so that lookup by host (and service) is fast.  Ember's 
> filterProperty is a linear search, so it is very inefficient on a large 
> array such as the list of all alerts in the cluster, especially when 
> repeated for every host in the cluster.  Also, we can sum up and store 
> aggregate counts (like the total number of hosts with critical alerts) as 
> we map alerts.  I'm speculating that we can get a big perf boost just by 
> doing these things.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
