[ 
https://issues.apache.org/jira/browse/AMBARI-9334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305374#comment-14305374
 ] 

Greg Senia commented on AMBARI-9334:
------------------------------------

My concern is the fact that the original query which came from the postgres 
logging showed the query with the nested select... And as you can see by 
multiple outputs the query returns the same result. so what you are saying is 
the original query must be broken also because the distinct is clearly working 
correctly and returning the same result as the nested select...

ambari=> \d+ stage
                              Table "ambari.stage"
      Column       |          Type          | Modifiers | Storage  | 
Description 
-------------------+------------------------+-----------+----------+-------------
 stage_id          | bigint                 | not null  | plain    | 
 request_id        | bigint                 | not null  | plain    | 
 cluster_id        | bigint                 | not null  | plain    | 
 log_info          | character varying(255) | not null  | extended | 
 request_context   | character varying(255) |           | extended | 
 cluster_host_info | bytea                  | not null  | extended | 
Indexes:
    "stage_pkey" PRIMARY KEY, btree (stage_id, request_id)
Foreign-key constraints:
    "fk_stage_request_id" FOREIGN KEY (request_id) REFERENCES 
request(request_id)
Referenced by:
    TABLE "host_role_command" CONSTRAINT "fk_host_role_command_stage_id" 
FOREIGN KEY (stage_id, request_id) REFERENCES stage(stage_id, request_id)
    TABLE "role_success_criteria" CONSTRAINT "role_success_criteria_stage_id" 
FOREIGN KEY (stage_id, request_id) REFERENCES stage(stage_id, request_id)
Has OIDs: no



ambari=> SELECT distinct t0.stage_id, t0.cluster_host_info, t0.cluster_id, 
t0.log_info, t0.request_context, t0.request_id FROM stage t0, host_role_command 
t1 WHERE ((t0.request_id = t1.request_id) AND t0.stage_id =t1.stage_id and 
(t1.status IN ('QUEUED','IN_PROGRESS','PENDING'))) ORDER BY 
t0.stage_id,t0.request_id;
 stage_id |                                                                     
                                                    
                                                                                
                                                    
                                                                                
                                                    
                                                cluster_host_info               
                                                    
                                                                                
                                                    
                                                                                
                                                    
                                                                                
                      | cluster_id |    log_info    
|      request_context      | request_id 
----------+-------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------+------------+----------------
+---------------------------+------------
        1 | 
{"hs_host":["3"],"namenode_host":["5"],"snamenode_host":["3"],"zookeeper_hosts":["5-7"],"hbase_rs_hosts":["0-2","4","8"]
,"drpc_server_hosts":["7"],"supervisor_hosts":["0-2","4","8"],"ganglia_monitor_hosts":["0-8"],"rm_host":["3"],"slave_hosts":["0-2","
4","8"],"ambari_server_host":["xlab370.example.com"],"all_ping_ports":["8670:0-8"],"hive_metastore_hosts":["6"],"storm_rest_api_hosts
":["7"],"falcon_server_hosts":["5"],"hbase_master_hosts":["7"],"ganglia_server_host":["3"],"nm_hosts":["0-2","4","8"],"hive_server_h
ost":["6"],"hive_mysql_host":["6"],"all_hosts":["xlab362.example.com","xlab365.example.com","xlab366.example.com","xlab370.example.com",
"xlab364.example.com","xlab369.example.com","xlab368.example.com","xlab367.example.com","xlab363.example.com"],"oozie_server":["3"],"stor
m_ui_server_hosts":["7"],"webhcat_server_host":["6"],"nagios_server_host":["3"],"nimbus_hosts":["7"]}
 |          2 | /tmp/ambari:59 
| _PARSE_.STOP.ALL_SERVICES |         59
        2 | 
{"hs_host":["3"],"namenode_host":["5"],"snamenode_host":["3"],"zookeeper_hosts":["5-7"],"hbase_rs_hosts":["0-2","4","8"]
,"drpc_server_hosts":["7"],"supervisor_hosts":["0-2","4","8"],"ganglia_monitor_hosts":["0-8"],"rm_host":["3"],"slave_hosts":["0-2","
4","8"],"ambari_server_host":["xlab370.example.com"],"all_ping_ports":["8670:0-8"],"hive_metastore_hosts":["6"],"storm_rest_api_hosts
":["7"],"falcon_server_hosts":["5"],"hbase_master_hosts":["7"],"ganglia_server_host":["3"],"nm_hosts":["0-2","4","8"],"hive_server_h
ost":["6"],"hive_mysql_host":["6"],"all_hosts":["xlab362.example.com","xlab365.example.com","xlab366.example.com","xlab370.example.com",
"xlab364.example.com","xlab369.example.com","xlab368.example.com","xlab367.example.com","xlab363.example.com"],"oozie_server":["3"],"stor
m_ui_server_hosts":["7"],"webhcat_server_host":["6"],"nagios_server_host":["3"],"nimbus_hosts":["7"]}
 |          2 | /tmp/ambari:59 
| _PARSE_.STOP.ALL_SERVICES |         59
        3 | 
{"hs_host":["3"],"namenode_host":["5"],"snamenode_host":["3"],"zookeeper_hosts":["5-7"],"hbase_rs_hosts":["0-2","4","8"]
,"drpc_server_hosts":["7"],"supervisor_hosts":["0-2","4","8"],"ganglia_monitor_hosts":["0-8"],"rm_host":["3"],"slave_hosts":["0-2","
4","8"],"ambari_server_host":["xlab370.example.com"],"all_ping_ports":["8670:0-8"],"hive_metastore_hosts":["6"],"storm_rest_api_hosts
":["7"],"falcon_server_hosts":["5"],"hbase_master_hosts":["7"],"ganglia_server_host":["3"],"nm_hosts":["0-2","4","8"],"hive_server_h
ost":["6"],"hive_mysql_host":["6"],"all_hosts":["xlab362.example.com","xlab365.example.com","xlab366.example.com","xlab370.example.com",
"xlab364.example.com","xlab369.example.com","xlab368.example.com","xlab367.example.com","xlab363.example.com"],"oozie_server":["3"],"stor
m_ui_server_hosts":["7"],"webhcat_server_host":["6"],"nagios_server_host":["3"],"nimbus_hosts":["7"]}
 |          2 | /tmp/ambari:59 
| _PARSE_.STOP.ALL_SERVICES |         59
(3 rows)

ambari=> SELECT t0.stage_id, t0.cluster_host_info, t0.cluster_id, t0.log_info, 
t0.request_context, t0.request_id FROM stage t0 WHERE t0.stage_id IN (SELECT 
t1.stage_id FROM host_role_command t1 WHERE ((t0.request_id = t1.request_id) 
AND (t1.status IN ('QUEUED','IN_PROGRESS','PENDING')))) ORDER BY t0.request_id, 
t0.stage_id;  
 stage_id |                                                                     
                                                    
                                                                                
                                                    
                                                                                
                                                    
                                                cluster_host_info               
                                                    
                                                                                
                                                    
                                                                                
                                                    
                                                                                
                      | cluster_id |    log_info    
|      request_context      | request_id 
----------+-------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------+------------+----------------
+---------------------------+------------
        1 | 
{"hs_host":["3"],"namenode_host":["5"],"snamenode_host":["3"],"zookeeper_hosts":["5-7"],"hbase_rs_hosts":["0-2","4","8"]
,"drpc_server_hosts":["7"],"supervisor_hosts":["0-2","4","8"],"ganglia_monitor_hosts":["0-8"],"rm_host":["3"],"slave_hosts":["0-2","
4","8"],"ambari_server_host":["xlab370.example.com"],"all_ping_ports":["8670:0-8"],"hive_metastore_hosts":["6"],"storm_rest_api_hosts
":["7"],"falcon_server_hosts":["5"],"hbase_master_hosts":["7"],"ganglia_server_host":["3"],"nm_hosts":["0-2","4","8"],"hive_server_h
ost":["6"],"hive_mysql_host":["6"],"all_hosts":["xlab362.example.com","xlab365.example.com","xlab366.example.com","xlab370.example.com",
"xlab364.example.com","xlab369.example.com","xlab368.example.com","xlab367.example.com","xlab363.example.com"],"oozie_server":["3"],"stor
m_ui_server_hosts":["7"],"webhcat_server_host":["6"],"nagios_server_host":["3"],"nimbus_hosts":["7"]}
 |          2 | /tmp/ambari:59 
| _PARSE_.STOP.ALL_SERVICES |         59
        2 | 
{"hs_host":["3"],"namenode_host":["5"],"snamenode_host":["3"],"zookeeper_hosts":["5-7"],"hbase_rs_hosts":["0-2","4","8"]
,"drpc_server_hosts":["7"],"supervisor_hosts":["0-2","4","8"],"ganglia_monitor_hosts":["0-8"],"rm_host":["3"],"slave_hosts":["0-2","
4","8"],"ambari_server_host":["xlab370.example.com"],"all_ping_ports":["8670:0-8"],"hive_metastore_hosts":["6"],"storm_rest_api_hosts
":["7"],"falcon_server_hosts":["5"],"hbase_master_hosts":["7"],"ganglia_server_host":["3"],"nm_hosts":["0-2","4","8"],"hive_server_h
ost":["6"],"hive_mysql_host":["6"],"all_hosts":["xlab362.example.com","xlab365.example.com","xlab366.example.com","xlab370.example.com",
"xlab364.example.com","xlab369.example.com","xlab368.example.com","xlab367.example.com","xlab363.example.com"],"oozie_server":["3"],"stor
m_ui_server_hosts":["7"],"webhcat_server_host":["6"],"nagios_server_host":["3"],"nimbus_hosts":["7"]}
 |          2 | /tmp/ambari:59 
| _PARSE_.STOP.ALL_SERVICES |         59
        3 | 
{"hs_host":["3"],"namenode_host":["5"],"snamenode_host":["3"],"zookeeper_hosts":["5-7"],"hbase_rs_hosts":["0-2","4","8"]
,"drpc_server_hosts":["7"],"supervisor_hosts":["0-2","4","8"],"ganglia_monitor_hosts":["0-8"],"rm_host":["3"],"slave_hosts":["0-2","
4","8"],"ambari_server_host":["xlab370.example.com"],"all_ping_ports":["8670:0-8"],"hive_metastore_hosts":["6"],"storm_rest_api_hosts
":["7"],"falcon_server_hosts":["5"],"hbase_master_hosts":["7"],"ganglia_server_host":["3"],"nm_hosts":["0-2","4","8"],"hive_server_h
ost":["6"],"hive_mysql_host":["6"],"all_hosts":["xlab362.example.com","xlab365.example.com","xlab366.example.com","xlab370.example.com",
"xlab364.example.com","xlab369.example.com","xlab368.example.com","xlab367.example.com","xlab363.example.com"],"oozie_server":["3"],"stor
m_ui_server_hosts":["7"],"webhcat_server_host":["6"],"nagios_server_host":["3"],"nimbus_hosts":["7"]}
 |          2 | /tmp/ambari:59 
| _PARSE_.STOP.ALL_SERVICES |         59
(3 rows)

> Ambari StageDAO.findByCommandStatuses causes Postgress HIGH CPU
> ---------------------------------------------------------------
>
>                 Key: AMBARI-9334
>                 URL: https://issues.apache.org/jira/browse/AMBARI-9334
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 1.6.0, 1.6.1, 1.7.0
>         Environment: RHEL 6.4/6.5
> postgresql-server-8.4.13-1.el6_3.x86_64
> postgresql-8.4.13-1.el6_3.x86_64
> postgresql-libs-8.4.13-1.el6_3.x86_64
> ambari-agent-1.6.1-98.x86_64
> ambari-log4j-1.6.1.98-1.noarch
> ambari-server-1.6.1-98.noarch
>            Reporter: Greg Senia
>            Assignee: Jonathan Hurley
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: AMBARI-9334.patch, AMBARI-9334.patch.2, Screen Shot 
> 2015-01-26 at 12.18.56 PM.png, Screen Shot 2015-01-26 at 12.19.26 PM.png, 
> StageDAO-1.6.1.patch, StageDAO.java
>
>
> The following code that generates a query causes postgres to use lots of CPU 
> espcially if the Ambari DB grows over time. We reduced CPU by 30-40% by 
> fixing the code below.
> Before:
> public List<StageEntity> findByCommandStatuses(Collection<HostRoleStatus> 
> statuses) {
> TypedQuery<StageEntity> query = 
> entityManagerProvider.get().createQuery("SELECT stage " +
> "FROM StageEntity stage WHERE stage.stageId IN (SELECT hrce.stageId FROM " +
> "HostRoleCommandEntity hrce WHERE stage.requestId = hrce.requestId and 
> hrce.status IN ?1 ) " +
> "ORDER BY stage.requestId, stage.stageId", StageEntity.class);
> return daoUtils.selectList(query, statuses);
> }
> After:
> @RequiresSession
> public List<StageEntity> findByCommandStatuses(Collection<HostRoleStatus> 
> statuses) {
> TypedQuery<StageEntity> query = 
> entityManagerProvider.get().createQuery("SELECT stage "+
> "FROM StageEntity stage, HostRoleCommandEntity hrce " +
> "WHERE stage.requestId = hrce.requestId AND stage.stageId = hrce.stageId and 
> hrce.status IN ?1 " +
> "ORDER BY stage.requestId, stage.stageId", StageEntity.class);
> return daoUtils.selectList(query, statuses);
> }
> Before EXPLAIN ANALYZE:
> ambari=> explain analyze SELECT t0.stage_id, t0.cluster_host_info, 
> t0.cluster_id, t0.log_info, t0.request_context, t0.request_id FROM stage t0 
> WHERE t0.stage_id IN (SELECT t1.stage_id FROM host_role_command t1 WHERE 
> ((t0.request_id = t1.request_id) AND (t1.status IN 
> ('QUEUED','IN_PROGRESS','PENDING')))) ORDER BY t0.request_id, t0.stage_id
> ;
>                                                              QUERY PLAN       
>                                                      
> ------------------------------------------------------------------------------------------------------------------------------------
> Sort  (cost=7407488.50..7407492.69 rows=1676 width=894) (actual 
> time=55418.086..55418.086 rows=0 loops=1)
>    Sort Key: t0.request_id, t0.stage_id
>    Sort Method:  quicksort  Memory: 25kB
>    ->  Seq Scan on stage t0  (cost=0.00..7407398.75 rows=1676 width=894) 
> (actual time=55418.081..55418.081 rows=0 loops=1)
>          Filter: (SubPlan 1)
>          SubPlan 1
>            ->  Seq Scan on host_role_command t1  (cost=0.00..4418.07 rows=1 
> width=8) (actual time=16.514..16.514 rows=0 loops=3353)
>                  Filter: (($0 = request_id) AND ((status)::text = ANY 
> ('{QUEUED,IN_PROGRESS,PENDING}'::text[])))
> Total runtime: 55418.123 ms
> (9 rows)
> After: Explain Analyze:
> ambari=> explain analyze SELECT t0.stage_id, t0.cluster_host_info, 
> t0.cluster_id, t0.log_info, t0.request_context, t0.request_id FROM stage t0, 
> host_role_command t1 WHERE ((t0.request_id = t1.request_id) AND t0.stage_id 
> =t1.stage_id and (t1.status IN ('QUEUED','IN_PROGRESS','PENDING'))) ORDER BY 
> t0.stage_id,t0.request_id
> ;
>                                                            QUERY PLAN         
>                                                  
> --------------------------------------------------------------------------------------------------------------------------------
> Sort  (cost=4346.51..4346.52 rows=2 width=894) (actual time=53.605..53.605 
> rows=0 loops=1)
>    Sort Key: t0.stage_id, t0.request_id
>    Sort Method:  quicksort  Memory: 25kB
>    ->  Nested Loop  (cost=0.00..4346.50 rows=2 width=894) (actual 
> time=53.596..53.596 rows=0 loops=1)
>          ->  Seq Scan on host_role_command t1  (cost=0.00..4338.22 rows=1 
> width=16) (actual time=53.595..53.595 rows=0 loops=1)
>                Filter: ((status)::text = ANY 
> ('{QUEUED,IN_PROGRESS,PENDING}'::text[]))
>          ->  Index Scan using stage_pkey on stage t0  (cost=0.00..8.27 rows=1 
> width=894) (never executed)
>                Index Cond: ((t0.stage_id = t1.stage_id) AND (t0.request_id = 
> t1.request_id))
> Total runtime: 53.654 ms
> (9 rows)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to