github-actions[bot] commented on issue #6605:
URL: https://github.com/apache/dolphinscheduler/issues/6605#issuecomment-951671505


   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### What happened
   
   ### Operation
   When I run a shell task to test MapReduce in DS (see image below), the DS web log shows: "yarn status get failed".
   Shell content:
   `hadoop jar 
/opt/app/hadoop-2.9.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar
 pi 10 10`
   
![360截图17520712172535](https://user-images.githubusercontent.com/20452924/138829616-ce15ed7b-6ad1-46b3-be4d-c45025294689.png)
   ### DS web log
   [INFO] 2021-10-26 10:34:28.745  - [taskAppId=TASK-1-6-89]:[115] - create dir 
success /exec/process/1/1/6/89
   [INFO] 2021-10-26 10:34:28.754  - [taskAppId=TASK-1-6-89]:[88] - shell task 
params {"rawScript":"hadoop jar 
/opt/app/hadoop-2.9.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar
 pi 10 10","localParams":[],"resourceList":[]}
   [INFO] 2021-10-26 10:34:28.758  - [taskAppId=TASK-1-6-89]:[154] - raw script 
: hadoop jar 
/opt/app/hadoop-2.9.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar
 pi 10 10
   [INFO] 2021-10-26 10:34:28.759  - [taskAppId=TASK-1-6-89]:[155] - task 
execute path : /exec/process/1/1/6/89
   [INFO] 2021-10-26 10:34:28.760  - [taskAppId=TASK-1-6-89]:[87] - tenantCode 
user:root, task dir:1_6_89
   [INFO] 2021-10-26 10:34:28.760  - [taskAppId=TASK-1-6-89]:[92] - create 
command file:/exec/process/1/1/6/89/1_6_89.command
   [INFO] 2021-10-26 10:34:28.760  - [taskAppId=TASK-1-6-89]:[111] - command : 
#!/bin/sh
   BASEDIR=$(cd `dirname $0`; pwd)
   cd $BASEDIR
   source /opt/app/dolphinscheduler/conf/env/dolphinscheduler_env.sh
   /exec/process/1/1/6/89/1_6_89_node.sh
   [INFO] 2021-10-26 10:34:28.764  - [taskAppId=TASK-1-6-89]:[330] - task run 
command:
   sudo -u root sh /exec/process/1/1/6/89/1_6_89.command
   [INFO] 2021-10-26 10:34:28.773  - [taskAppId=TASK-1-6-89]:[211] - process 
start, process id is: 19627
   [INFO] 2021-10-26 10:34:29.774  - [taskAppId=TASK-1-6-89]:[138] -  -> SLF4J: 
Class path contains multiple SLF4J bindings.
        SLF4J: Found binding in 
[jar:file:/opt/app/hadoop-2.9.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
        SLF4J: Found binding in 
[jar:file:/opt/app/tez/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
        SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
explanation.
        SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
        Number of Maps  = 10
        Samples per Map = 10
   [INFO] 2021-10-26 10:34:31.775  - [taskAppId=TASK-1-6-89]:[138] -  -> Wrote 
input for Map #0
        Wrote input for Map #1
        Wrote input for Map #2
        Wrote input for Map #3
        Wrote input for Map #4
        Wrote input for Map #5
        Wrote input for Map #6
        Wrote input for Map #7
        Wrote input for Map #8
        Wrote input for Map #9
        Starting Job
        21/10/26 10:34:31 INFO client.RMProxy: Connecting to ResourceManager at 
hadoop47/192.168.80.47:8032
   [INFO] 2021-10-26 10:34:32.776  - [taskAppId=TASK-1-6-89]:[138] -  -> 
21/10/26 10:34:32 INFO input.FileInputFormat: Total input files to process : 10
        21/10/26 10:34:32 INFO mapreduce.JobSubmitter: number of splits:10
        21/10/26 10:34:32 INFO Configuration.deprecation: 
yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, 
use yarn.system-metrics-publisher.enabled
        21/10/26 10:34:32 INFO mapreduce.JobSubmitter: Submitting tokens for 
job: job_1634958933716_0113
        21/10/26 10:34:32 INFO impl.YarnClientImpl: Submitted application 
application_1634958933716_0113
        21/10/26 10:34:32 INFO mapreduce.Job: The url to track the job: 
http://hadoop47:8088/proxy/application_1634958933716_0113/
        21/10/26 10:34:32 INFO mapreduce.Job: Running job: 
job_1634958933716_0113
   [INFO] 2021-10-26 10:34:40.785  - [taskAppId=TASK-1-6-89]:[138] -  -> 
21/10/26 10:34:39 INFO mapreduce.Job: Job job_1634958933716_0113 running in 
uber mode : false
        21/10/26 10:34:39 INFO mapreduce.Job:  map 0% reduce 0%
   [INFO] 2021-10-26 10:34:56.789  - [taskAppId=TASK-1-6-89]:[138] -  -> 
21/10/26 10:34:56 INFO mapreduce.Job:  map 30% reduce 0%
   [INFO] 2021-10-26 10:34:57.790  - [taskAppId=TASK-1-6-89]:[138] -  -> 
21/10/26 10:34:57 INFO mapreduce.Job:  map 100% reduce 0%
   [INFO] 2021-10-26 10:35:02.715  - [taskAppId=TASK-1-6-89]:[445] - find app 
id: application_1634958933716_0113
   [INFO] 2021-10-26 10:35:02.715  - [taskAppId=TASK-1-6-89]:[402] - check yarn 
application status, appId:application_1634958933716_0113
   [ERROR] 2021-10-26 10:35:02.720  - [taskAppId=TASK-1-6-89]:[418] - yarn 
applications: application_1634958933716_0113 , query status failed, exception:{}
   java.lang.NullPointerException: null
        at 
org.apache.dolphinscheduler.common.utils.HadoopUtils.getApplicationStatus(HadoopUtils.java:423)
        at 
org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.isSuccessOfYarnState(AbstractCommandExecutor.java:404)
        at 
org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.run(AbstractCommandExecutor.java:230)
        at 
org.apache.dolphinscheduler.server.worker.task.shell.ShellTask.handle(ShellTask.java:101)
        at 
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:139)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   [INFO] 2021-10-26 10:35:02.720  - [taskAppId=TASK-1-6-89]:[238] - process 
has exited, execute path:/exec/process/1/1/6/89, processId:19627 
,exitStatusCode:-1 ,processWaitForStatus:true ,processExitValue:0
   [INFO] 2021-10-26 10:35:02.791  - [taskAppId=TASK-1-6-89]:[138] -  -> 
21/10/26 10:35:02 INFO mapreduce.Job:  map 100% reduce 100%
        21/10/26 10:35:02 INFO mapreduce.Job: Job job_1634958933716_0113 
completed successfully
        21/10/26 10:35:02 INFO mapreduce.Job: Counters: 49
                File System Counters
                        FILE: Number of bytes read=226
                        FILE: Number of bytes written=2205654
                        FILE: Number of read operations=0
                        FILE: Number of large read operations=0
                        FILE: Number of write operations=0
                        HDFS: Number of bytes read=2630
                        HDFS: Number of bytes written=215
                        HDFS: Number of read operations=43
                        HDFS: Number of large read operations=0
                        HDFS: Number of write operations=3
                Job Counters 
                        Launched map tasks=10
                        Launched reduce tasks=1
                        Data-local map tasks=10
                        Total time spent by all maps in occupied slots 
(ms)=149819
                        Total time spent by all reduces in occupied slots 
(ms)=3113
                        Total time spent by all map tasks (ms)=149819
                        Total time spent by all reduce tasks (ms)=3113
                        Total vcore-milliseconds taken by all map tasks=149819
                        Total vcore-milliseconds taken by all reduce tasks=3113
                        Total megabyte-milliseconds taken by all map 
tasks=153414656
                        Total megabyte-milliseconds taken by all reduce 
tasks=3187712
                Map-Reduce Framework
                        Map input records=10
                        Map output records=20
                        Map output bytes=180
                        Map output materialized bytes=280
                        Input split bytes=1450
                        Combine input records=0
                        Combine output records=0
                        Reduce input groups=2
                        Reduce shuffle bytes=280
                        Reduce input records=20
                        Reduce output records=0
                        Spilled Records=40
                        Shuffled Maps =10
                        Failed Shuffles=0
                        Merged Map outputs=10
                        GC time elapsed (ms)=6825
                        CPU time spent (ms)=4980
                        Physical memory (bytes) snapshot=3529900032
                        Virtual memory (bytes) snapshot=22377988096
                        Total committed heap usage (bytes)=2413297664
                Shuffle Errors
                        BAD_ID=0
                        CONNECTION=0
                        IO_ERROR=0
                        WRONG_LENGTH=0
                        WRONG_MAP=0
                        WRONG_REDUCE=0
                File Input Format Counters 
                        Bytes Read=1180
                File Output Format Counters 
                        Bytes Written=97
        Job Finished in 30.695 seconds
        Estimated value of Pi is 3.20000000000000000000
   
   ### Worker debug log
   
   [DEBUG] 2021-10-26 10:34:56.708 org.apache.zookeeper.ClientCnxn:[846] - 
Reading reply sessionid:0x20015bfe8a400c9, packet:: 
clientPath:/dolphinscheduler/nodes/worker/default/192.168.80.49:1234 
serverPath:/dolphinscheduler/nodes/worker/default/192.168.80.49:1234 
finished:false header:: 2933,4  replyHeader:: 2933,17180717039,0  request:: 
'/dolphinscheduler/nodes/worker/default/192.168.80.49:1234,T  response:: 
#302e332c302e39312c302e35392c312e33372c382e302c302e332c323032312d31302d32362030393a32373a30362c323032312d31302d32362031303a33343a35362c302c34303937,s{17180707701,17180717039,1635211626683,1635215696700,407,0,0,144139102061854920,73,0,17180707701}
 
   [DEBUG] 2021-10-26 10:34:56.708 
org.apache.dolphinscheduler.service.zk.ZookeeperCachedOperator:[62] - 
zookeeperListener:org.apache.dolphinscheduler.server.master.registry.ServerNodeManager$WorkerGroupNodeListener
 triggered
   [DEBUG] 2021-10-26 10:34:56.709 
org.apache.curator.framework.recipes.cache.TreeCache:[396] - processResult: 
CuratorEventImpl{type=GET_DATA, resultCode=0, 
path='/dolphinscheduler/nodes/worker/default/192.168.80.49:1234', name='null', 
children=null, context=null, 
stat=17180707701,17180717039,1635211626683,1635215696700,407,0,0,144139102061854920,73,0,17180707701
   , data=[48, 46, 51, 44, 48, 46, 57, 49, 44, 48, 46, 53, 57, 44, 49, 46, 51, 
55, 44, 56, 46, 48, 44, 48, 46, 51, 44, 50, 48, 50, 49, 45, 49, 48, 45, 50, 54, 
32, 48, 57, 58, 50, 55, 58, 48, 54, 44, 50, 48, 50, 49, 45, 49, 48, 45, 50, 54, 
32, 49, 48, 58, 51, 52, 58, 53, 54, 44, 48, 44, 52, 48, 57, 55], 
watchedEvent=null, aclList=null, opResults=null}
   [DEBUG] 2021-10-26 10:34:56.709 
org.apache.curator.framework.recipes.cache.TreeCache:[857] - publishEvent: 
TreeCacheEvent{type=NODE_UPDATED, 
data=ChildData{path='/dolphinscheduler/nodes/worker/default/192.168.80.49:1234',
 
stat=17180707701,17180717039,1635211626683,1635215696700,407,0,0,144139102061854920,73,0,17180707701
   , data=[48, 46, 51, 44, 48, 46, 57, 49, 44, 48, 46, 53, 57, 44, 49, 46, 51, 
55, 44, 56, 46, 48, 44, 48, 46, 51, 44, 50, 48, 50, 49, 45, 49, 48, 45, 50, 54, 
32, 48, 57, 58, 50, 55, 58, 48, 54, 44, 50, 48, 50, 49, 45, 49, 48, 45, 50, 54, 
32, 49, 48, 58, 51, 52, 58, 53, 54, 44, 48, 44, 52, 48, 57, 55]}}
   [INFO] 2021-10-26 10:34:56.789  - [taskAppId=TASK-1-6-89]:[138] -  -> 
21/10/26 10:34:56 INFO mapreduce.Job:  map 30% reduce 0%
   [INFO] 2021-10-26 10:34:57.790  - [taskAppId=TASK-1-6-89]:[138] -  -> 
21/10/26 10:34:57 INFO mapreduce.Job:  map 100% reduce 0%
   [DEBUG] 2021-10-26 10:34:58.313 org.apache.zookeeper.ClientCnxn:[745] - Got 
ping response for sessionid: 0x30015c0a38d009d after 0ms
   [INFO] 2021-10-26 10:35:02.715  - [taskAppId=TASK-1-6-89]:[445] - find app 
id: application_1634958933716_0113
   [INFO] 2021-10-26 10:35:02.715  - [taskAppId=TASK-1-6-89]:[402] - check yarn 
application status, appId:application_1634958933716_0113
   [DEBUG] 2021-10-26 10:35:02.715 
org.apache.dolphinscheduler.common.utils.HadoopUtils:[211] - yarn application 
url:http://hadoop47:%s/ws/v1/cluster/apps/%s, 
applicationId:application_1634958933716_0113
   [ERROR] 2021-10-26 10:35:02.720 
org.apache.dolphinscheduler.common.utils.HttpUtils:[73] - Connect to 
hadoop47:80 [hadoop47/192.168.80.47] failed: Connection refused (Connection 
refused)
   org.apache.http.conn.HttpHostConnectException: Connect to hadoop47:80 
[hadoop47/192.168.80.47] failed: Connection refused (Connection refused)
        at 
org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
        at 
org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
        at 
org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380)
        at 
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
        at 
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
        at 
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
        at 
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
        at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
        at 
org.apache.dolphinscheduler.common.utils.HttpUtils.get(HttpUtils.java:60)
        at 
org.apache.dolphinscheduler.common.utils.HadoopUtils.getApplicationStatus(HadoopUtils.java:420)
        at 
org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.isSuccessOfYarnState(AbstractCommandExecutor.java:404)
        at 
org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.run(AbstractCommandExecutor.java:230)
        at 
org.apache.dolphinscheduler.server.worker.task.shell.ShellTask.handle(ShellTask.java:101)
        at 
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:139)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:476)
        at 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:218)
        at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:200)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:394)
        at java.net.Socket.connect(Socket.java:606)
        at 
org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:74)
        at 
org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:134)
        ... 20 common frames omitted
   [ERROR] 2021-10-26 10:35:02.720  - [taskAppId=TASK-1-6-89]:[418] - yarn 
applications: application_1634958933716_0113 , query status failed, exception:{}
   java.lang.NullPointerException: null
        at 
org.apache.dolphinscheduler.common.utils.HadoopUtils.getApplicationStatus(HadoopUtils.java:423)
        at 
org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.isSuccessOfYarnState(AbstractCommandExecutor.java:404)
        at 
org.apache.dolphinscheduler.server.worker.task.AbstractCommandExecutor.run(AbstractCommandExecutor.java:230)
        at 
org.apache.dolphinscheduler.server.worker.task.shell.ShellTask.handle(ShellTask.java:101)
        at 
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:139)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   [INFO] 2021-10-26 10:35:02.720  - [taskAppId=TASK-1-6-89]:[238] - process 
has exited, execute path:/exec/process/1/1/6/89, processId:19627 
,exitStatusCode:-1 ,processWaitForStatus:true ,processExitValue:0
   [INFO] 2021-10-26 10:35:02.720 
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[147] - task 
instance id : 89,task final status : FAILURE
   [INFO] 2021-10-26 10:35:02.721 
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[185] - 
develop mode is: false
   [INFO] 2021-10-26 10:35:02.721 
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[203] - exec 
local path: /exec/process/1/1/6/89 cleared.
   [INFO] 2021-10-26 10:35:02.791  - [taskAppId=TASK-1-6-89]:[138] -  -> 
21/10/26 10:35:02 INFO mapreduce.Job:  map 100% reduce 100%
        21/10/26 10:35:02 INFO mapreduce.Job: Job job_1634958933716_0113 
completed successfully
        21/10/26 10:35:02 INFO mapreduce.Job: Counters: 49
                File System Counters
                        FILE: Number of bytes read=226
                        FILE: Number of bytes written=2205654
                        FILE: Number of read operations=0
                        FILE: Number of large read operations=0
                        FILE: Number of write operations=0
                        HDFS: Number of bytes read=2630
                        HDFS: Number of bytes written=215
                        HDFS: Number of read operations=43
                        HDFS: Number of large read operations=0
                        HDFS: Number of write operations=3
                Job Counters 
                        Launched map tasks=10
                        Launched reduce tasks=1
                        Data-local map tasks=10
                        Total time spent by all maps in occupied slots 
(ms)=149819
                        Total time spent by all reduces in occupied slots 
(ms)=3113
                        Total time spent by all map tasks (ms)=149819
                        Total time spent by all reduce tasks (ms)=3113
                        Total vcore-milliseconds taken by all map tasks=149819
                        Total vcore-milliseconds taken by all reduce tasks=3113
                        Total megabyte-milliseconds taken by all map 
tasks=153414656
                        Total megabyte-milliseconds taken by all reduce 
tasks=3187712
                Map-Reduce Framework
                        Map input records=10
                        Map output records=20
                        Map output bytes=180
                        Map output materialized bytes=280
                        Input split bytes=1450
                        Combine input records=0
                        Combine output records=0
                        Reduce input groups=2
                        Reduce shuffle bytes=280
                        Reduce input records=20
                        Reduce output records=0
                        Spilled Records=40
                        Shuffled Maps =10
                        Failed Shuffles=0
                        Merged Map outputs=10
                        GC time elapsed (ms)=6825
                        CPU time spent (ms)=4980
                        Physical memory (bytes) snapshot=3529900032
                        Virtual memory (bytes) snapshot=22377988096
                        Total committed heap usage (bytes)=2413297664
                Shuffle Errors
                        BAD_ID=0
                        CONNECTION=0
                        IO_ERROR=0
                        WRONG_LENGTH=0
                        WRONG_MAP=0
                        WRONG_REDUCE=0
                File Input Format Counters 
                        Bytes Read=1180
                File Output Format Counters 
                        Bytes Written=97
        Job Finished in 30.695 seconds
        Estimated value of Pi is 3.20000000000000000000
   
   
   ### What you expected to happen
   
   The status of YARN application_1634958933716_0113 should always be retrievable.
   
![360截图16380508205068](https://user-images.githubusercontent.com/20452924/138833771-460534f6-a830-44d1-86aa-5970217eb712.png)
   
   
   
   ### How to reproduce
   
   - Server: Kunpeng
   - OS: CentOS 7
   - DS release: 1.3.9
   - Hadoop version: 2.9.2
   - YARN HA: false

   conf/common.properties:

   ```properties
   # resourcemanager port, the default value is 8088 if not specified
   resource.manager.httpaddress.port=

   # if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty
   yarn.resourcemanager.ha.rm.ids=

   # if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname
   yarn.application.status.address=http://hadoop47:%s/ws/v1/cluster/apps/%s
   ```
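   The worker debug log shows the status URL still containing the literal `%s` placeholders (`yarn application url:http://hadoop47:%s/ws/v1/cluster/apps/%s`) and the connection attempt going to `hadoop47:80` rather than 8088. A minimal sketch of the suspected mechanism (hypothetical illustration, not DolphinScheduler's actual code): if `resource.manager.httpaddress.port` is left empty and substituted into the template, the resulting URL has no usable port, so an HTTP client falls back to the default port 80, where the ResourceManager is not listening.

   ```python
   # Hypothetical illustration (NOT DolphinScheduler's actual implementation):
   # how an empty resource.manager.httpaddress.port could produce the port-80
   # connection attempt seen in the worker log.
   template = "http://hadoop47:%s/ws/v1/cluster/apps/%s"  # from common.properties
   port = ""  # resource.manager.httpaddress.port left empty
   app_id = "application_1634958933716_0113"

   url = template % (port, app_id)
   print(url)  # http://hadoop47:/ws/v1/cluster/apps/application_1634958933716_0113
   # With no explicit port in the URL, an HTTP client defaults to port 80,
   # matching the "Connect to hadoop47:80 ... Connection refused" error above.
   ```

   Setting `resource.manager.httpaddress.port=8088` explicitly (the port shown in the job tracking URL in the log) would be one thing to try.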
   
   
   ### Anything else
   
   It sometimes fails; this error occurs with high probability.
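   Since the failure is intermittent, a caller-side mitigation would be to retry the transient status query a few times before marking the task failed, instead of failing on the first refused connection. A hedged sketch (not DolphinScheduler code; `query` stands for any callable that raises `ConnectionRefusedError` on failure):

   ```python
   import time

   def query_with_retry(query, attempts=3, delay=1.0):
       """Call query() until it succeeds or attempts run out; re-raise the last error."""
       last_exc = None
       for _ in range(attempts):
           try:
               return query()
           except OSError as exc:  # ConnectionRefusedError is a subclass of OSError
               last_exc = exc
               time.sleep(delay)
       raise last_exc
   ```

   Wrapping the YARN status HTTP call this way would keep a single refused connection from immediately producing the NullPointerException / FAILURE outcome seen in the log.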
   
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)

