[ https://issues.apache.org/jira/browse/AMBARI-12178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604030#comment-14604030 ]

Hudson commented on AMBARI-12178:
---------------------------------

FAILURE: Integrated in Ambari-branch-2.1 #128 (See [https://builds.apache.org/job/Ambari-branch-2.1/128/])
AMBARI-12178 - Memory Exhausted During Upgrade Of Large Cluster (jonathanhurley) (jhurley: http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=77316937c92ba3465255bac5acd335317f58bdd7)
* ambari-server/src/main/java/org/apache/ambari/server/controller/internal/StageResourceProvider.java
* ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostEntity.java
* ambari-server/src/main/java/org/apache/ambari/server/topology/HostRequest.java
* ambari-server/src/main/java/org/apache/ambari/server/orm/entities/StageEntity.java
* ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleCommand.java
* ambari-server/src/main/java/org/apache/ambari/server/topology/TopologyManager.java
* ambari-server/src/main/java/org/apache/ambari/server/orm/dao/StageDAO.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeGroupResourceProvider.java


> Memory Exhausted During Upgrade Of Large Cluster
> ------------------------------------------------
>
>                 Key: AMBARI-12178
>                 URL: https://issues.apache.org/jira/browse/AMBARI-12178
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.1.0
>            Reporter: Jonathan Hurley
>            Assignee: Jonathan Hurley
>            Priority: Blocker
>             Fix For: 2.1.0
>
>
> During an upgrade of a large cluster, the memory used by Ambari grows until 
> it is fully consumed. This, however, only happens when the Upgrade Dialog 
> page is open. If that popup is closed, the memory usage stays relatively 
> constant.
> The offending call is:
> {code}
> api/v1/clusters/perf400/upgrades/31?upgrade_groups/UpgradeGroup/status!=PENDING&fields=Upgrade/progress_percent,Upgrade/request_context,Upgrade/request_status,Upgrade/direction,upgrade_groups/UpgradeGroup,upgrade_groups/upgrade_items/UpgradeItem/status,upgrade_groups/upgrade_items/UpgradeItem/context,upgrade_groups/upgrade_items/UpgradeItem/group_id,upgrade_groups/upgrade_items/UpgradeItem/progress_percent,upgrade_groups/upgrade_items/UpgradeItem/request_id,upgrade_groups/upgrade_items/UpgradeItem/skippable,upgrade_groups/upgrade_items/UpgradeItem/stage_id,upgrade_groups/upgrade_items/UpgradeItem/status,upgrade_groups/upgrade_items/UpgradeItem/text&minimal_response=true
> {code}
> Based on heap dumps, the largest offenders are {{StageEntity}} and, as a 
> result, {{byte[]}}:
> {noformat}
> Class Name| Objects |  Shallow Heap | Retained Heap
> ----------------------------------------------------
> byte[]    | 351,907 | 3,147,710,224 |              
> ----------------------------------------------------
> Class Name                                         | Objects | Shallow Heap | Retained Heap
> --------------------------------------------------------------------------------------------
> org.apache.ambari.server.orm.entities.StageEntity  | 192,356 |   18,466,176 | 3,075,693,136
> org.apache.ambari.server.orm.entities.StageEntity_ |       0 |            0 |              
> org.apache.ambari.server.orm.entities.StageEntityPK|       0 |            0 |              
> --------------------------------------------------------------------------------------------
> {noformat}
> Each {{StageEntity}} is holding about 30 KB:
> {noformat}
> Shallow Heap | Retained Heap | Class Name
> --------------------------------------------------------------------------------------------
>           96 |        28,576 | org.apache.ambari.server.orm.entities.StageEntity @ 0x738e03260
>            8 |             8 | |- <class> class org.apache.ambari.server.orm.entities.StageEntity @ 0x64058d268
>           16 |            16 | |- skippable java.lang.Integer @ 0x6401e9738  0
>           24 |            24 | |- clusterId java.lang.Long @ 0x64026c908  2
>           24 |            24 | |- requestId java.lang.Long @ 0x64026d840  31
>           24 |            48 | |- _persistence_primaryKey org.eclipse.persistence.internal.identitymaps.CacheId @ 0x642ce20e0
>          104 |           136 | |- _persistence_cacheKey org.eclipse.persistence.internal.identitymaps.HardCacheWeakIdentityMap$ReferenceCacheKey @ 0x6469cf328
>          112 |           432 | |- request org.apache.ambari.server.orm.entities.RequestEntity @ 0x728d046e8
>           32 |            32 | |- _persistence_listener org.eclipse.persistence.internal.descriptors.changetracking.AttributeChangeListener @ 0x72f073f20
>           24 |            24 | |- stageId java.lang.Long @ 0x7350c8b08  1199
>           24 |            64 | |- logInfo java.lang.String @ 0x7350c8b20  /tmp/ambari
>           24 |           168 | |- requestContext java.lang.String @ 0x7350c8b38  Restarting DataNode on perf400-c-371.c.pramod-thangali.internal
>           64 |           184 | |- hostRoleCommands org.eclipse.persistence.indirection.IndirectList @ 0x738a0ceb0
>           64 |           184 | |- roleSuccessCriterias org.eclipse.persistence.indirection.IndirectList @ 0x738a0cef0
>          160 |           160 | |- commandParamsStage byte[141] @ 0x738c46cc8  {"restart_type":"rolling_upgrade","upgrade_direction":"upgrade","version":"2.2.6.0-2799","target_stack":"HDP-2.2","original_stack":"HDP-2.2"}
>          792 |           792 | |- hostParamsStage byte[776] @ 0x738dc16b0  {"ambari_db_rca_driver":"org.postgresql.Driver","ambari_db_rca_password":"mapred","ambari_db_rca_url":"jdbc:postgresql://perf400-a-1.c.pramod-thangali.internal/ambarirca","ambari_db_rca_username":"mapred","current_version":"2.2.0.0-2041","db_driver_filenam...
>       26,792 |        26,792 | |- clusterHostInfo byte[26774] @ 0x739006378  {"nimbus_hosts":["278"],"all_racks":["/default-rack:0-405"],"ambari_server_host":["perf400-a-1.c.pramod-thangali.internal"],"app_timeline_server_hosts":["138"],"hive_mysql_host":["247"],"falcon_server_hosts":["2"],"hbase_master_hosts":["2"],"accumulo_maste...
> --------------------------------------------------------------------------------------------
> {noformat}
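
A quick arithmetic check on the figures quoted above may help put them in proportion. The sketch below only divides numbers copied from the heap tables; the class and method names are made up for illustration, and the averages are rough because retained heap can be shared between objects.

```java
class HeapDumpArithmetic {

    /** Integer average, truncated toward zero. */
    static long averageBytes(long totalBytes, long objectCount) {
        return totalBytes / objectCount;
    }

    /** Integer percentage of part relative to whole, truncated toward zero. */
    static long percentOf(long part, long whole) {
        return part * 100 / whole;
    }

    public static void main(String[] args) {
        // Retained heap of all StageEntity instances / number of instances.
        System.out.println(averageBytes(3_075_693_136L, 192_356L)); // 15989
        // Shallow heap of all byte[] instances / number of instances.
        System.out.println(averageBytes(3_147_710_224L, 351_907L)); // 8944
        // Share of the sampled StageEntity (28,576 bytes) taken by clusterHostInfo.
        System.out.println(percentOf(26_792L, 28_576L)); // 93
    }
}
```

So the average StageEntity retains roughly 16 KB, somewhat less than the 28,576-byte instance sampled above, and in that sample the clusterHostInfo byte[] alone accounts for about 93% of the retained size.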
> It appears as though a local {{Cache}} in 
> [ActionDBAccessorImpl|https://github.com/apache/ambari/blob/94c091e280a99e07db5f3910873e70aa3c18394f/ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionDBAccessorImpl.java#L104]
>  is holding on to these objects:
> {noformat:title=Shows the cache holding onto a HostEntity which holds onto a UnitOfWork map with lots of stale entities}
> Ref. Objects | Shallow Heap | Ref. Shallow Heap | Retained Heap | Class Name
> ----------------------------------------------------------------------------------------------------------------
>           76 |          120 |             7,296 |     4,960,776 | java.lang.Thread @ 0x641af65b8  ambari-action-scheduler Native Stack, Thread
>           75 |          248 |             7,200 |   640,497,232 | |- <Java Local> org.apache.ambari.server.actionmanager.ActionDBAccessorImpl$$EnhancerByGuice$$dcf333e8 @ 0x640538f40
>           75 |           16 |             7,200 |   640,496,984 | |  '- hostRoleCommandCache com.google.common.cache.LocalCache$LocalManualCache @ 0x640474b58
>           75 |          128 |             7,200 |   640,496,968 | |     '- localCache com.google.common.cache.LocalCache @ 0x640da1650
>           75 |           32 |             7,200 |   640,496,840 | |        '- segments com.google.common.cache.LocalCache$Segment[4] @ 0x640f27e88
>           22 |           80 |             2,112 |   151,456,800 | |           |- [1] com.google.common.cache.LocalCache$Segment @ 0x6410ee3c8
>           21 |           16 |             2,016 |         2,080 | |           |  |- table java.util.concurrent.atomic.AtomicReferenceArray @ 0x6470826f8
>           21 |        2,064 |             2,016 |         2,064 | |           |  |  '- array java.lang.Object[512] @ 0x65dd9e088
>            1 |           48 |                96 |     2,854,000 | |           |  |     |- [346] com.google.common.cache.LocalCache$StrongAccessEntry @ 0x670caa3d0
>            1 |           16 |                96 |     2,853,928 | |           |  |     |  '- valueReference com.google.common.cache.LocalCache$StrongValueReference @ 0x670caa418
>            1 |          128 |                96 |     2,853,912 | |           |  |     |     '- referent org.apache.ambari.server.actionmanager.HostRoleCommand @ 0x670caa430
>            1 |          136 |                96 |     2,827,496 | |           |  |     |        '- hostEntity org.apache.ambari.server.orm.entities.HostEntity @ 0x66f876d18
>            1 |           32 |                96 |            32 | |           |  |     |           '- _persistence_listener org.eclipse.persistence.internal.descriptors.changetracking.AttributeChangeListener @ 0x66f89f530
>            1 |          360 |                96 |     2,826,496 | |           |  |     |              '- uow org.eclipse.persistence.internal.sessions.RepeatableWriteUnitOfWork @ 0x670ca0b30
>            1 |           24 |                96 |     2,825,688 | |           |  |     |                 '- identityMapAccessor org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor @ 0x66f7fbf38
>            1 |           48 |                96 |     2,825,664 | |           |  |     |                    '- identityMapManager org.eclipse.persistence.internal.identitymaps.IdentityMapManager @ 0x670c2b320
>            1 |           48 |                96 |     2,824,208 | |           |  |     |                       '- identityMaps java.util.HashMap @ 0x670c2b350
>            1 |          144 |                96 |     2,824,160 | |           |  |     |                          '- table java.util.HashMap$Node[32] @ 0x670cb1608
>            1 |           32 |                96 |     1,201,192 | |           |  |     |                             '- [5] java.util.HashMap$Node @ 0x670b71bd8
>            1 |           32 |                96 |     1,201,160 | |           |  |     |                                '- value org.eclipse.persistence.internal.identitymaps.UnitOfWorkIdentityMap @ 0x670c5a390
>            1 |           48 |                96 |     1,201,128 | |           |  |     |                                   '- cacheKeys java.util.HashMap @ 0x670c2b4d0
>            1 |       16,400 |                96 |     1,201,080 | |           |  |     |                                      '- table java.util.HashMap$Node[4096] @ 0x66f7c83c8
>            1 |           32 |                96 |           200 | |           |  |     |                                         '- [3271] java.util.HashMap$Node @ 0x670c772e8
>            1 |           96 |                96 |            96 | |           |  |     |                                            '- value org.eclipse.persistence.internal.identitymaps.CacheKey @ 0x66f756e30
>            1 |           96 |                96 |           568 | |           |  |     |                                               '- object org.apache.ambari.server.orm.entities.StageEntity @ 0x66f4f6f98
> ----------------------------------------------------------------------------------------------------------------
> {noformat}
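
The reference chain above (cache entry -> HostRoleCommand -> HostEntity -> change listener -> unit of work -> stale entities) can be illustrated with a small stand-alone sketch. Everything in it is hypothetical: a plain ConcurrentHashMap stands in for the Guava cache, and UnitOfWork, ManagedHost, EntityBackedCommand and DetachedCommand are made-up stand-ins, not Ambari or EclipseLink classes. It is not meant to describe what the commit actually changed, only why a long-lived cache that stores a managed entity can pin everything the entity's persistence context tracked, and how caching copied scalar values avoids that.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CacheRetentionSketch {

    /** Stand-in for a persistence unit of work: tracks every payload it has loaded. */
    static final class UnitOfWork {
        final List<byte[]> trackedPayloads = new ArrayList<>();
    }

    /** Stand-in for a managed entity that keeps a back-reference to its unit of work. */
    static final class ManagedHost {
        final String hostName;
        final UnitOfWork uow;
        ManagedHost(String hostName, UnitOfWork uow) {
            this.hostName = hostName;
            this.uow = uow;
        }
    }

    /** Cached value that references the managed entity (the shape seen in the dump). */
    static final class EntityBackedCommand {
        final long taskId;
        final ManagedHost host;
        EntityBackedCommand(long taskId, ManagedHost host) {
            this.taskId = taskId;
            this.host = host;
        }
    }

    /** Cached value that copies only the scalar it needs from the entity. */
    static final class DetachedCommand {
        final long taskId;
        final String hostName;
        DetachedCommand(long taskId, String hostName) {
            this.taskId = taskId;
            this.hostName = hostName;
        }
    }

    /** Builds a unit of work holding the given number of 1 KB payloads. */
    private static ManagedHost loadHost(long taskId, int payloads) {
        UnitOfWork uow = new UnitOfWork();
        for (int i = 0; i < payloads; i++) {
            uow.trackedPayloads.add(new byte[1024]);
        }
        return new ManagedHost("host-" + taskId, uow);
    }

    /** Caches a command that keeps the entity; returns a weak handle on its unit of work. */
    static WeakReference<UnitOfWork> cacheEntityBacked(Map<Long, Object> cache, long taskId, int payloads) {
        ManagedHost host = loadHost(taskId, payloads);
        cache.put(taskId, new EntityBackedCommand(taskId, host));
        return new WeakReference<>(host.uow);
    }

    /** Caches a command that keeps only the host name; returns a weak handle on the unit of work. */
    static WeakReference<UnitOfWork> cacheDetached(Map<Long, Object> cache, long taskId, int payloads) {
        ManagedHost host = loadHost(taskId, payloads);
        cache.put(taskId, new DetachedCommand(taskId, host.hostName));
        return new WeakReference<>(host.uow);
    }

    public static void main(String[] args) {
        Map<Long, Object> cache = new ConcurrentHashMap<>();
        WeakReference<UnitOfWork> pinned = cacheEntityBacked(cache, 1L, 1000);
        WeakReference<UnitOfWork> released = cacheDetached(cache, 2L, 1000);
        System.gc();
        // Still strongly reachable through the cache, so this can never be cleared.
        System.out.println("entity-backed unit of work reachable: " + (pinned.get() != null));
        // No longer strongly reachable; whether it has been collected yet is up to the JVM.
        System.out.println("detached unit of work reachable: " + (released.get() != null));
    }
}
```

The first line always reports true, because the cache entry keeps the whole chain alive. The second usually reports false after a collection, but the JVM gives no guarantee about when a weak reference is cleared.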



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
