[jira] [Commented] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-23 Thread Hsin-Liang Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488263#comment-16488263
 ] 

Hsin-Liang Huang commented on YARN-8326:


Here is more detailed information from the node manager logs comparing Hadoop 3.0 
and 2.6.  Both are running on a 4-node cluster with 3 data nodes, with the same 
machine power/CPU/memory and the same type of job.  I picked only one node to 
compare the container cycle.

*1. On 3.0.*  When I requested 8 containers to run on 3 data nodes, I picked the 
second node to examine its log.

This job used 2 containers on this node:

 

Container *container_e04_1527109836290_0004_01_02* on application 
application_1527109836290_0004: going from "container succeeded" to "Stopping 
container" (from the blue line to the red line below) took about *4 seconds*.

 

152231 2018-05-23 15:04:45,541 INFO  containermanager.ContainerManagerImpl 
(ContainerManagerImpl.java:startContainerInternal(1059)) - Start request for 
container_e04_1527109836290_0004_01_02 by user hlhuang

152232 2018-05-23 15:04:45,657 INFO  containermanager.ContainerManagerImpl 
(ContainerManagerImpl.java:startContainerInternal(1127)) - Creating a new 
application reference for app application_1527109836290_0004

152233 2018-05-23 15:04:45,658 INFO  application.ApplicationImpl 
(ApplicationImpl.java:handle(632)) - Application application_1527109836290_0004 
transitioned from NEW to INITING

152234 2018-05-23 15:04:45,658 INFO  application.ApplicationImpl 
(ApplicationImpl.java:transition(446)) - Adding 
container_e04_1527109836290_0004_01_02 to application 
application_1527109836290_0004

152235 2018-05-23 15:04:45,658 INFO  application.ApplicationImpl 
(ApplicationImpl.java:handle(632)) - Application application_1527109836290_0004 
transitioned from INITING to RUNNING

152236 2018-05-23 15:04:45,659 INFO  container.ContainerImpl 
(ContainerImpl.java:handle(2108)) - Container 
container_e04_1527109836290_0004_01_02 transitioned from NEW to SCHEDULED

152237 2018-05-23 15:04:45,659 INFO  containermanager.AuxServices 
(AuxServices.java:handle(220)) - Got event CONTAINER_INIT for appId 
application_1527109836290_0004

152238 2018-05-23 15:04:45,659 INFO  yarn.YarnShuffleService 
(YarnShuffleService.java:initializeContainer(289)) - Initializing container 
container_e04_1527109836290_0004_01_02

152239 2018-05-23 15:04:45,660 INFO  scheduler.ContainerScheduler 
(ContainerScheduler.java:startContainer(503)) - Starting container 
[container_e04_1527109836290_0004_01_02]

152246 2018-05-23 15:04:45,965 INFO  container.ContainerImpl 
(ContainerImpl.java:handle(2108)) - Container 
container_e04_1527109836290_0004_01_02 transitioned from SCHEDULED to 
RUNNING

152247 2018-05-23 15:04:45,965 INFO  monitor.ContainersMonitorImpl 
(ContainersMonitorImpl.java:onStartMonitoringContainer(941)) - Starting 
resource-monitoring for container_e04_1527109836290_0004_01_02

{color:#205081}152250 2018-05-23 15:04:46,002 INFO  launcher.ContainerLaunch 
(ContainerLaunch.java:handleContainerExitCode(512)) - Container 
container_e04_1527109836290_0004_01_02 succeeded{color}

 

152251 2018-05-23 15:04:46,003 INFO  container.ContainerImpl 
(ContainerImpl.java:handle(2108)) - Container 
container_e04_1527109836290_0004_01_02 transitioned from RUNNING to 
EXITED_WITH_SUCCESS

152252 2018-05-23 15:04:46,003 INFO  launcher.ContainerLaunch 
(ContainerLaunch.java:cleanupContainer(668)) - Cleaning up container 
container_e04_1527109836290_0004_01_02

152254 2018-05-23 15:04:48,132 INFO  nodemanager.LinuxContainerExecutor 
(LinuxContainerExecutor.java:deleteAsUser(794)) - Deleting absolute path : 
/hadoop/yarn/local/usercache/hlhuang/appcache/application_1527109836290_0004/container_e04_1527109836290_0004_01_02

152256 2018-05-23 15:04:48,133 INFO  container.ContainerImpl 
(ContainerImpl.java:handle(2108)) - Container 
container_e04_1527109836290_0004_01_02 transitioned from 
EXITED_WITH_SUCCESS to DONE

152258 2018-05-23 15:04:49,171 INFO  nodemanager.NodeStatusUpdaterImpl 
(NodeStatusUpdaterImpl.java:removeOrTrackCompletedContainersFromContext(682)) - 
Removed completed containers from NM context: 
[container_e04_1527109836290_0004_01_02]

152260 2018-05-23 15:04:50,289 INFO  application.ApplicationImpl 
(ApplicationImpl.java:transition(489)) - Removing 
container_e04_1527109836290_0004_01_02 from application 
application_1527109836290_0004

{color:#d04437}152261 2018-05-23 15:04:50,290 INFO  
monitor.ContainersMonitorImpl 
(ContainersMonitorImpl.java:onStopMonitoringContainer(932)) - Stopping 
resource-monitoring for container_e04_1527109836290_0004_01_02{color}

152263 2018-05-23 15:04:50,290 INFO  yarn.YarnShuffleService 
(YarnShuffleService.java:stopContainer(295)) - Stopping container 
container_e04_1527109836290_0004_01_02

152262 2018-05-23 15:04:50,290 INFO  containermanager.AuxServices 
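
To make the ~4-second figure concrete, below is a minimal, illustrative sketch (the 
class name and approach are mine, not part of the NodeManager) that parses the two 
highlighted timestamps above - the blue "succeeded" line and the red "Stopping 
resource-monitoring" line - and prints the elapsed time:

{code:java}
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Illustrative only: measures the gap between two NodeManager log timestamps,
// here the "Container ... succeeded" and "Stopping resource-monitoring" lines.
public class ContainerExitGap {
    // NodeManager log lines use this timestamp layout (comma before milliseconds).
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    public static void main(String[] args) {
        LocalDateTime succeeded = LocalDateTime.parse("2018-05-23 15:04:46,002", FMT);
        LocalDateTime stopping  = LocalDateTime.parse("2018-05-23 15:04:50,290", FMT);
        Duration gap = Duration.between(succeeded, stopping);
        // Prints 4288 ms, i.e. the roughly 4 seconds described above.
        System.out.println("container exit gap: " + gap.toMillis() + " ms");
    }
}
{code}

Running the same calculation over the 2.6 node manager log for the matching container 
would show whether this exit/cleanup gap is where the extra time goes.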

[jira] [Comment Edited] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-23 Thread Hsin-Liang Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488013#comment-16488013
 ] 

Hsin-Liang Huang edited comment on YARN-8326 at 5/23/18 9:19 PM:
-

Hi  [~eyang]

   I ran the sample job,

{color:#14892c}time hadoop jar 
/usr/hdp/3.0.0.0-829/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.0.0.3.0.0.0-829.jar
 Client -classpath simple-yarn-app-1.1.0.jar -cmd "java 
com.hortonworks.simpleyarnapp.ApplicationMaster /bin/date 8"{color}

with the changed settings, and it still took 15 seconds compared to 6 or 7 seconds 
in the 2.6 environment.  So I am not sure that these two monitoring settings play a 
significant role in the performance here.  The major issue could still be container 
exit, which is much slower in the 3.0 environment than in 2.6.  Can someone from the 
YARN team look into this?  This is a general YARN application performance issue in 3.0.

 


was (Author: hlhu...@us.ibm.com):
Hi  [~eyang]

   I ran the sample job with the changed settings, and it still took 15 seconds 
compared to 6 or 7 seconds in the 2.6 environment.  So I am not sure that these two 
monitoring settings play a significant role in the performance here.  The major issue 
could still be container exit, which is much slower in the 3.0 environment than in 
2.6.  Can someone from the YARN team look into this?  This is a general YARN 
application performance issue in 3.0.

 


[jira] [Issue Comment Deleted] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-23 Thread Hsin-Liang Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hsin-Liang Huang updated YARN-8326:
---
Comment: was deleted

(was: Hi Eric,

   I tried the suggestion and changed the setting.  The result of running

{color:#14892c}time hadoop jar 
/usr/hdp/3.0.0.0-829/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.0.0.3.0.0.0-829.jar
 Client -classpath simple-yarn-app-1.1.0.jar -cmd "java 
com.hortonworks.simpleyarnapp.ApplicationMaster /bin/date 8"{color}

 is 20s, 15s, and 15s (I ran it 3 times).  It didn't get better, if not worse.  (It 
was 14 or 15 seconds before.) )


[jira] [Comment Edited] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-23 Thread Hsin-Liang Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488013#comment-16488013
 ] 

Hsin-Liang Huang edited comment on YARN-8326 at 5/23/18 9:15 PM:
-

Hi  [~eyang]

   I ran the sample job with the changed settings, and it still took 15 seconds 
compared to 6 or 7 seconds in the 2.6 environment.  So I am not sure that these two 
monitoring settings play a significant role in the performance here.  The major issue 
could still be container exit, which is much slower in the 3.0 environment than in 
2.6.  Can someone from the YARN team look into this?  This is a general YARN 
application performance issue in 3.0.

 


was (Author: hlhu...@us.ibm.com):
Hi  [~eyang]

   Here is another update.  Even though the performance of the simple job I ran 
improved with the suggested setting changes, our unit testcases still took 14 hours 
compared to 7 hours in the 2.6 environment.  I also ran another sample job with the 
changed settings, and it still took 15 seconds compared to 6 or 7 seconds in the 2.6 
environment.  So I think that even though the monitoring settings might affect 
performance, they only play a small part; the major issue could still be container 
exit, which is much slower in the 3.0 environment than in 2.6.  Is anyone looking 
into this area?   Thanks!

 


[jira] [Issue Comment Deleted] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-23 Thread Hsin-Liang Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hsin-Liang Huang updated YARN-8326:
---
Comment: was deleted

(was: [~eyang]   This afternoon I tried the command and the performance was 
dramatically improved.  It used to take 8 seconds, and now it runs in 3 seconds 
consistently.  I then compared with the other 3.0 cluster where I did not make the 
property changes you suggested, and it still ran in 8 seconds consistently.  I am 
going to run our testcases to see if the performance is also improved there. )


[jira] [Commented] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-23 Thread Hsin-Liang Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488013#comment-16488013
 ] 

Hsin-Liang Huang commented on YARN-8326:


Hi  [~eyang]

   Here is another update.  Even though the performance of the simple job I ran 
improved with the suggested setting changes, our unit testcases still took 14 hours 
compared to 7 hours in the 2.6 environment.  I also ran another sample job with the 
changed settings, and it still took 15 seconds compared to 6 or 7 seconds in the 2.6 
environment.  So I think that even though the monitoring settings might affect 
performance, they only play a small part; the major issue could still be container 
exit, which is much slower in the 3.0 environment than in 2.6.  Is anyone looking 
into this area?   Thanks!

 


[jira] [Comment Edited] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-22 Thread Hsin-Liang Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484721#comment-16484721
 ] 

Hsin-Liang Huang edited comment on YARN-8326 at 5/23/18 12:18 AM:
--

[~eyang]   This afternoon I tried the command and the performance was dramatically 
improved.  It used to take 8 seconds, and now it runs in 3 seconds consistently.  I 
then compared with the other 3.0 cluster where I did not make the property changes 
you suggested, and it still ran in 8 seconds consistently.  I am going to run our 
testcases to see if the performance is also improved there. 


was (Author: hlhu...@us.ibm.com):
[~eyang]   This afternoon I tried the command and the performance was dramatically 
improved.  It used to take 8 seconds, and now it runs in 3 seconds consistently.  I 
then compared with the other HDP 3.0 cluster where I did not make the property 
changes you suggested, and it still ran in 8 seconds consistently.  I am going to run 
our testcases to see if the performance is also improved there. 


[jira] [Commented] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-22 Thread Hsin-Liang Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484721#comment-16484721
 ] 

Hsin-Liang Huang commented on YARN-8326:


[~eyang]   This afternoon I tried the command and the performance was dramatically 
improved.  It used to take 8 seconds, and now it runs in 3 seconds consistently.  I 
then compared with the other HDP 3.0 cluster where I did not make the property 
changes you suggested, and it still ran in 8 seconds consistently.  I am going to run 
our testcases to see if the performance is also improved there. 


[jira] [Commented] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-21 Thread Hsin-Liang Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483027#comment-16483027
 ] 

Hsin-Liang Huang commented on YARN-8326:


Hi Eric, 

   I tried the suggestion and changed the setting.  The result of running

{color:#14892c}time hadoop jar 
/usr/hdp/3.0.0.0-829/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.0.0.3.0.0.0-829.jar
 Client -classpath simple-yarn-app-1.1.0.jar -cmd "java 
com.hortonworks.simpleyarnapp.ApplicationMaster /bin/date 8"{color}

 is 20s, 15s, and 15s (I ran it 3 times).  It didn't get better, if not worse.  (It 
was 14 or 15 seconds before.) 


[jira] [Created] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-18 Thread Hsin-Liang Huang (JIRA)
Hsin-Liang Huang created YARN-8326:
--

 Summary: Yarn 3.0 seems runs slower than Yarn 2.6
 Key: YARN-8326
 URL: https://issues.apache.org/jira/browse/YARN-8326
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 3.0.0
 Environment: This is the yarn-site.xml for 3.0. 

 


 
 
 hadoop.registry.dns.bind-port
 5353
 
 
 
 hadoop.registry.dns.domain-name
 hwx.site
 
 
 
 hadoop.registry.dns.enabled
 true
 
 
 
 hadoop.registry.dns.zone-mask
 255.255.255.0
 
 
 
 hadoop.registry.dns.zone-subnet
 172.17.0.0
 
 
 
 hadoop.registry.zk.quorum
 
whiny1.fyre.ibm.com:2181,whiny2.fyre.ibm.com:2181,whiny3.fyre.ibm.com:2181
 
 
 
 manage.include.files
 false
 
 
 
 yarn.acl.enable
 false
 
 
 
 yarn.admin.acl
 yarn
 
 
 
 yarn.application.classpath
 
$HADOOP_CONF_DIR,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*
 
 
 
 yarn.client.nodemanager-connect.max-wait-ms
 6
 
 
 
 yarn.client.nodemanager-connect.retry-interval-ms
 1
 
 
 
 yarn.http.policy
 HTTP_ONLY
 
 
 
 yarn.log-aggregation-enable
 false
 
 
 
 yarn.log-aggregation.retain-seconds
 2592000
 
 
 
 yarn.log.server.url
 http://whiny2.fyre.ibm.com:19888/jobhistory/logs
 
 
 
 yarn.log.server.web-service.url
 http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory
 
 
 
 yarn.node-labels.enabled
 false
 
 
 
 yarn.node-labels.fs-store.retry-policy-spec
 2000, 500
 
 
 
 yarn.node-labels.fs-store.root-dir
 /system/yarn/node-labels
 
 
 
 yarn.nodemanager.address
 0.0.0.0:45454
 
 
 
 yarn.nodemanager.admin-env
 MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
 
 
 
 yarn.nodemanager.aux-services
 mapreduce_shuffle,spark2_shuffle,timeline_collector
 
 
 
 yarn.nodemanager.aux-services.mapreduce_shuffle.class
 org.apache.hadoop.mapred.ShuffleHandler
 
 
 
 yarn.nodemanager.aux-services.spark2_shuffle.class
 org.apache.spark.network.yarn.YarnShuffleService
 
 
 
 yarn.nodemanager.aux-services.spark2_shuffle.classpath
 /usr/hdp/${hdp.version}/spark2/aux/*
 
 
 
 yarn.nodemanager.aux-services.spark_shuffle.class
 org.apache.spark.network.yarn.YarnShuffleService
 
 
 
 yarn.nodemanager.aux-services.spark_shuffle.classpath
 /usr/hdp/${hdp.version}/spark/aux/*
 
 
 
 yarn.nodemanager.aux-services.timeline_collector.class
 
org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
 
 
 
 yarn.nodemanager.bind-host
 0.0.0.0
 
 
 
 yarn.nodemanager.container-executor.class
 org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
 
 
 
 yarn.nodemanager.container-metrics.unregister-delay-ms
 6
 
 
 
 yarn.nodemanager.container-monitor.interval-ms
 3000
 
 
 
 yarn.nodemanager.delete.debug-delay-sec
 0
 
 
 
 
yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage
 90
 
 
 
 yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb
 1000
 
 
 
 yarn.nodemanager.disk-health-checker.min-healthy-disks
 0.25
 
 
 
 yarn.nodemanager.health-checker.interval-ms
 135000
 
 
 
 yarn.nodemanager.health-checker.script.timeout-ms
 6
 
 
 
 
yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
 false
 
 
 
 yarn.nodemanager.linux-container-executor.group
 hadoop
 
 
 
 
yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users
 false
 
 
 
 yarn.nodemanager.local-dirs
 /hadoop/yarn/local
 
 
 
 yarn.nodemanager.log-aggregation.compression-type
 gz
 
 
 
 yarn.nodemanager.log-aggregation.debug-enabled
 false
 
 
 
 yarn.nodemanager.log-aggregation.num-log-files-per-app
 30
 
 
 
 yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds
 3600
 
 
 
 yarn.nodemanager.log-dirs
 /hadoop/yarn/log
 
 
 
 yarn.nodemanager.log.retain-seconds
 604800
 
 
 
 yarn.nodemanager.pmem-check-enabled
 false
 
 
 
 yarn.nodemanager.recovery.dir
 /var/log/hadoop-yarn/nodemanager/recovery-state
 
 
 
 yarn.nodemanager.recovery.enabled
 true
 
 
 
 yarn.nodemanager.recovery.supervised
 true
 
 
 
 yarn.nodemanager.remote-app-log-dir
 /app-logs
 
 
 
 yarn.nodemanager.remote-app-log-dir-suffix
 logs
 
 
 
 yarn.nodemanager.resource-plugins
 
 
 
 
 yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices
 auto
 
 
 
 yarn.nodemanager.resource-plugins.gpu.docker-plugin
 nvidia-docker-v1
 
 
 
 yarn.nodemanager.resource-plugins.gpu.docker-plugin.nvidia-docker-v1.endpoint
 http://localhost:3476/v1.0/docker/cli
 
 
 
 
yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables
 
 
 
 
 yarn.nodemanager.resource.cpu-vcores
 6
 
 
 
 yarn.nodemanager.resource.memory-mb
 12288
 
 
 
 yarn.nodemanager.resource.percentage-physical-cpu-limit
 80
 
 
 
 yarn.nodemanager.runtime.linux.allowed-runtimes
 default,docker
 
 
 
 

[jira] [Commented] (YARN-8315) HDP 3.0.0 perfromance is slower than HDP 2.6.4

2018-05-17 Thread Hsin-Liang Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479768#comment-16479768
 ] 

Hsin-Liang Huang commented on YARN-8315:


[https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/bk_ambari-installation/content/hdp_30_repositories.html]
     This is where I got the hdp.repo and ambari.repo to install HDP 3.0.0.  It's 
still in beta. 

> HDP 3.0.0 perfromance is slower than HDP 2.6.4
> --
>
> Key: YARN-8315
> URL: https://issues.apache.org/jira/browse/YARN-8315
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
> Environment: I have an HDP 2.6.4 cluster and an HDP 3.0.0 cluster, set 
> up with the same settings for the two clusters, such as Java heap size, 
> container size, etc.  They are both 4-node clusters with 3 data nodes.  I took 
> almost all the default settings on HDP 3.0.0 except that I changed the minimum 
> container size to 64 MB instead of 1024 MB in both clusters.  
>  
>Reporter: Hsin-Liang Huang
>Priority: Major
>
> I can't find the button to delete this, so I just removed the text to avoid 
> sensitive information in the previous posting. 






[jira] [Updated] (YARN-8315) HDP 3.0.0 perfromance is slower than HDP 2.6.4

2018-05-17 Thread Hsin-Liang Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hsin-Liang Huang updated YARN-8315:
---
Description: I can't find the button to delete this, so I just removed the text to 
avoid sensitive information in the previous posting.  (was: Hi, I am comparing the 
performance between HDP 3.0.0 and HDP 2.6.4, and I discovered that HDP 3.0.0 is much 
slower than HDP 2.6.4 when the job acquires more YARN containers.  We also pinpointed 
the problem: after the job is done, when it tries to clean up all the containers to 
exit the application, that is where it consumes more time than HDP 2.6.4.  I used the 
simple YARN app that Hortonworks put out on GitHub 
[https://github.com/hortonworks/simple-yarn-app] to do the testing.  Below is my 
testing result from acquiring 8 containers in both the HDP 3.0.0 and HDP 2.6.4 
cluster environments. 

=

HDP 3.0.0: 

command:  time hadoop jar 
/usr/hdp/3.0.0.0-829/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.0.0.3.0.0.0-829.jar
 Client -classpath simple-yarn-app-1.1.0.jar -cmd "java 
com.hortonworks.simpleyarnapp.ApplicationMaster /bin/date 8"

18/05/17 11:06:42 INFO unmanagedamlauncher.UnmanagedAMLauncher: Initializing 
Client
18/05/17 11:06:42 INFO unmanagedamlauncher.UnmanagedAMLauncher: Starting Client
18/05/17 11:06:43 INFO client.RMProxy: Connecting to ResourceManager at 
whiny1.fyre.ibm.com/172.16.165.211:8050
18/05/17 11:06:43 INFO client.AHSProxy: Connecting to Application History 
server at whiny2.fyre.ibm.com/172.16.200.160:10200
18/05/17 11:06:43 INFO unmanagedamlauncher.UnmanagedAMLauncher: Setting up 
application submission context for ASM
18/05/17 11:06:43 INFO unmanagedamlauncher.UnmanagedAMLauncher: Setting 
unmanaged AM
18/05/17 11:06:43 INFO unmanagedamlauncher.UnmanagedAMLauncher: Submitting 
application to ASM
18/05/17 11:06:43 INFO impl.YarnClientImpl: Submitted application 
application_1526572577866_0011
18/05/17 11:06:44 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got application 
report from ASM for, appId=11, 
appAttemptId=appattempt_1526572577866_0011_01, clientToAMToken=null, 
appDiagnostics=AM container is launched, waiting for AM container to Register 
with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, 
appStartTime=1526584003704, yarnAppState=ACCEPTED, 
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=hlhuang
18/05/17 11:06:44 INFO unmanagedamlauncher.UnmanagedAMLauncher: Launching AM 
with application attempt id appattempt_1526572577866_0011_01
18/05/17 11:06:46 INFO client.RMProxy: Connecting to ResourceManager at 
whiny1.fyre.ibm.com/172.16.165.211:8030
registerApplicationMaster 0
registerApplicationMaster 1
18/05/17 11:06:47 INFO conf.Configuration: found resource resource-types.xml at 
file:/etc/hadoop/3.0.0.0-829/0/resource-types.xml
Making res-req 0
Making res-req 1
Making res-req 2
Making res-req 3
Making res-req 4
Making res-req 5
Making res-req 6
Making res-req 7
Launching container container_e08_1526572577866_0011_01_01
Launching container container_e08_1526572577866_0011_01_02
Launching container container_e08_1526572577866_0011_01_03
Launching container container_e08_1526572577866_0011_01_04
Launching container container_e08_1526572577866_0011_01_05
Launching container container_e08_1526572577866_0011_01_06
Launching container container_e08_1526572577866_0011_01_07
Launching container container_e08_1526572577866_0011_01_08
Completed container container_e08_1526572577866_0011_01_01
Completed container container_e08_1526572577866_0011_01_02
Completed container container_e08_1526572577866_0011_01_03
Completed container container_e08_1526572577866_0011_01_04
Completed container container_e08_1526572577866_0011_01_08
Completed container container_e08_1526572577866_0011_01_05
Completed container container_e08_1526572577866_0011_01_06
Completed container container_e08_1526572577866_0011_01_07
18/05/17 11:06:54 INFO unmanagedamlauncher.UnmanagedAMLauncher: AM process 
exited with value: 0
18/05/17 11:06:55 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got application 
report from ASM for, appId=11, 
appAttemptId=appattempt_1526572577866_0011_01, clientToAMToken=null, 
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, 
appStartTime=1526584003704, yarnAppState=FINISHED, 
distributedFinalState=SUCCEEDED, appTrackingUrl=N/A, appUser=hlhuang
18/05/17 11:06:55 INFO unmanagedamlauncher.UnmanagedAMLauncher: App ended with 
state: FINISHED and status: SUCCEEDED
18/05/17 11:06:55 INFO unmanagedamlauncher.UnmanagedAMLauncher: Application has 
completed successfully.

{color:#FF}real 0m14.716s{color}
{color:#FF}user 0m11.642s{color}
{color:#FF}sys 0m0.616s{color}
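
For context, the "Making res-req" / "Launching container" / "Completed container" 
lines above come from the simple-yarn-app ApplicationMaster loop.  The following is 
only a condensed sketch of that request/launch/complete pattern using the stock 
AMRMClient/NMClient APIs; the class name, the 256 MB / 1 vcore request size, and the 
bare /bin/date command are illustrative, not the exact simple-yarn-app source:

{code:java}
import java.util.Collections;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Condensed sketch of an AM that asks for n containers, launches a command in each,
// and waits for all of them to complete.
public class SimpleAmSketch {
    public static void main(String[] args) throws Exception {
        final int n = 8;  // same container count as the test runs above
        YarnConfiguration conf = new YarnConfiguration();

        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(conf);
        rmClient.start();
        NMClient nmClient = NMClient.createNMClient();
        nmClient.init(conf);
        nmClient.start();

        rmClient.registerApplicationMaster("", 0, "");

        // Ask the RM for n small containers ("Making res-req i").
        Resource capability = Resource.newInstance(256, 1);
        Priority priority = Priority.newInstance(0);
        for (int i = 0; i < n; i++) {
            rmClient.addContainerRequest(
                    new ContainerRequest(capability, null, null, priority));
        }

        // Launch each allocated container ("Launching container ...") and count
        // completions ("Completed container ...") until all n are done.
        int completed = 0;
        while (completed < n) {
            AllocateResponse response = rmClient.allocate((float) completed / n);
            for (Container c : response.getAllocatedContainers()) {
                ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
                        null, null, Collections.singletonList("/bin/date"),
                        null, null, null);
                nmClient.startContainer(c, ctx);
            }
            completed += response.getCompletedContainersStatuses().size();
            Thread.sleep(100);
        }

        rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
    }
}
{code}

This loop finishes once all "Completed container" lines appear; per the description 
above, the extra time in 3.0 is spent after that point, while the containers are 
cleaned up and the application exits.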

 

 

HDP 2.6.4 

command: 

time hadoop jar 

[jira] [Created] (YARN-8315) HDP 3.0.0 perfromance is slower than HDP 2.6.4

2018-05-17 Thread Hsin-Liang Huang (JIRA)
Hsin-Liang Huang created YARN-8315:
--

 Summary: HDP 3.0.0 perfromance is slower than HDP 2.6.4
 Key: YARN-8315
 URL: https://issues.apache.org/jira/browse/YARN-8315
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 3.0.0
 Environment: I have an HDP 2.6.4 cluster and an HDP 3.0.0 cluster, set up 
with the same settings for the two clusters, such as Java heap size, container size, 
etc.  They are both 4-node clusters with 3 data nodes.  I took almost all the default 
settings on HDP 3.0.0 except that I changed the minimum container size to 64 MB 
instead of 1024 MB in both clusters.  

 
Reporter: Hsin-Liang Huang


Hi, I am comparing the performance between HDP 3.0.0 and HDP 2.6.4, and I discovered 
that HDP 3.0.0 is much slower than HDP 2.6.4 when the job acquires more YARN 
containers.  We also pinpointed the problem: after the job is done, when it tries to 
clean up all the containers to exit the application, that is where it consumes more 
time than HDP 2.6.4.  I used the simple YARN app that Hortonworks put out on GitHub 
[https://github.com/hortonworks/simple-yarn-app] to do the testing.  Below is my 
testing result from acquiring 8 containers in both the HDP 3.0.0 and HDP 2.6.4 
cluster environments. 

=

HDP 3.0.0: 

command:  time hadoop jar 
/usr/hdp/3.0.0.0-829/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.0.0.3.0.0.0-829.jar
 Client -classpath simple-yarn-app-1.1.0.jar -cmd "java 
com.hortonworks.simpleyarnapp.ApplicationMaster /bin/date 8"

18/05/17 11:06:42 INFO unmanagedamlauncher.UnmanagedAMLauncher: Initializing 
Client
18/05/17 11:06:42 INFO unmanagedamlauncher.UnmanagedAMLauncher: Starting Client
18/05/17 11:06:43 INFO client.RMProxy: Connecting to ResourceManager at 
whiny1.fyre.ibm.com/172.16.165.211:8050
18/05/17 11:06:43 INFO client.AHSProxy: Connecting to Application History 
server at whiny2.fyre.ibm.com/172.16.200.160:10200
18/05/17 11:06:43 INFO unmanagedamlauncher.UnmanagedAMLauncher: Setting up 
application submission context for ASM
18/05/17 11:06:43 INFO unmanagedamlauncher.UnmanagedAMLauncher: Setting 
unmanaged AM
18/05/17 11:06:43 INFO unmanagedamlauncher.UnmanagedAMLauncher: Submitting 
application to ASM
18/05/17 11:06:43 INFO impl.YarnClientImpl: Submitted application 
application_1526572577866_0011
18/05/17 11:06:44 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got application 
report from ASM for, appId=11, 
appAttemptId=appattempt_1526572577866_0011_01, clientToAMToken=null, 
appDiagnostics=AM container is launched, waiting for AM container to Register 
with RM, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, 
appStartTime=1526584003704, yarnAppState=ACCEPTED, 
distributedFinalState=UNDEFINED, appTrackingUrl=N/A, appUser=hlhuang
18/05/17 11:06:44 INFO unmanagedamlauncher.UnmanagedAMLauncher: Launching AM 
with application attempt id appattempt_1526572577866_0011_01
18/05/17 11:06:46 INFO client.RMProxy: Connecting to ResourceManager at 
whiny1.fyre.ibm.com/172.16.165.211:8030
registerApplicationMaster 0
registerApplicationMaster 1
18/05/17 11:06:47 INFO conf.Configuration: found resource resource-types.xml at 
file:/etc/hadoop/3.0.0.0-829/0/resource-types.xml
Making res-req 0
Making res-req 1
Making res-req 2
Making res-req 3
Making res-req 4
Making res-req 5
Making res-req 6
Making res-req 7
Launching container container_e08_1526572577866_0011_01_01
Launching container container_e08_1526572577866_0011_01_02
Launching container container_e08_1526572577866_0011_01_03
Launching container container_e08_1526572577866_0011_01_04
Launching container container_e08_1526572577866_0011_01_05
Launching container container_e08_1526572577866_0011_01_06
Launching container container_e08_1526572577866_0011_01_07
Launching container container_e08_1526572577866_0011_01_08
Completed container container_e08_1526572577866_0011_01_01
Completed container container_e08_1526572577866_0011_01_02
Completed container container_e08_1526572577866_0011_01_03
Completed container container_e08_1526572577866_0011_01_04
Completed container container_e08_1526572577866_0011_01_08
Completed container container_e08_1526572577866_0011_01_05
Completed container container_e08_1526572577866_0011_01_06
Completed container container_e08_1526572577866_0011_01_07
18/05/17 11:06:54 INFO unmanagedamlauncher.UnmanagedAMLauncher: AM process 
exited with value: 0
18/05/17 11:06:55 INFO unmanagedamlauncher.UnmanagedAMLauncher: Got application 
report from ASM for, appId=11, 
appAttemptId=appattempt_1526572577866_0011_01, clientToAMToken=null, 
appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, 
appStartTime=1526584003704, yarnAppState=FINISHED, 
distributedFinalState=SUCCEEDED,