jiangjialiang918 opened a new issue #8436: URL: https://github.com/apache/dolphinscheduler/issues/8436
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

After I installed the distributed cluster there were no errors, and all nodes and processes start normally. However, when I create a shell script as a test, the task always fails. I have confirmed that this is not a permissions problem: the tenant is the deploy user (dolphinscheduler), which has full permissions under the dolphinscheduler path, and the task execute path ./jobs/exec/process/4577755613088/4577887155744_2/2/28 has already been created. The command file ./jobs/exec/process/4577755613088/4577887155744_2/2/28/2_28.command, however, is nowhere to be seen. See the reproduction steps for the log (P.S. file upload always fails).

The installation configuration is as follows:

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# ---------------------------------------------------------
# INSTALL MACHINE
# ---------------------------------------------------------
# A comma separated list of machine hostname or IP would be installed DolphinScheduler,
# including master, worker, api, alert. If you want to deploy in pseudo-distributed
# mode, just write a pseudo-distributed hostname
# Example for hostnames: ips="ds1,ds2,ds3,ds4,ds5", Example for IPs: ips="192.168.8.1,192.168.8.2,192.168.8.3,192.168.8.4,192.168.8.5"
ips="data01,data02,data03"

# Port of SSH protocol, default value is 22. For now we only support same port in all `ips` machine
# modify it if you use different ssh port
sshPort="22"

# A comma separated list of machine hostname or IP would be installed Master server, it
# must be a subset of configuration `ips`.
# Example for hostnames: masters="ds1,ds2", Example for IPs: masters="192.168.8.1,192.168.8.2"
masters="data01"

# A comma separated list of machine <hostname>:<workerGroup> or <IP>:<workerGroup>.All hostname or IP must be a
# subset of configuration `ips`, And workerGroup have default value as `default`, but we recommend you declare behind the hosts
# Example for hostnames: workers="ds1:default,ds2:default,ds3:default", Example for IPs: workers="192.168.8.1:default,192.168.8.2:default,192.168.8.3:default"
workers="data01:default,data02:default,data03:default"

# A comma separated list of machine hostname or IP would be installed Alert server, it
# must be a subset of configuration `ips`.
# Example for hostname: alertServer="ds3", Example for IP: alertServer="192.168.8.3"
alertServer="data03"

# A comma separated list of machine hostname or IP would be installed API server, it
# must be a subset of configuration `ips`.
# Example for hostname: apiServers="ds1", Example for IP: apiServers="192.168.8.1"
apiServers="data01"

# A comma separated list of machine hostname or IP would be installed Python gateway server, it
# must be a subset of configuration `ips`.
# Example for hostname: pythonGatewayServers="ds1", Example for IP: pythonGatewayServers="192.168.8.1"
pythonGatewayServers="data01"

# The directory to install DolphinScheduler for all machine we config above. It will automatically be created by `install.sh` script if not exists.
# Do not set this configuration same as the current path (pwd)
installPath="/data1_1T/dolphinscheduler"

# The user to deploy DolphinScheduler for all machine we config above. For now user must create by yourself before running `install.sh`
# script. The user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled than the root directory needs
# to be created by this user
deployUser="dolphinscheduler"

# The directory to store local data for all machine we config above. Make sure user `deployUser` have permissions to read and write this directory.
dataBasedirPath="~/dolphinscheduler"

# ---------------------------------------------------------
# DolphinScheduler ENV
# ---------------------------------------------------------
# JAVA_HOME, we recommend use same JAVA_HOME in all machine you going to install DolphinScheduler
# and this configuration only support one parameter so far.
javaHome="/opt/dev/jdk1.8.0_212"

# DolphinScheduler API service port, also this is your DolphinScheduler UI component's URL port, default value is 12345
apiServerPort="12345"

# ---------------------------------------------------------
# Database
# NOTICE: If database value has special characters, such as `.*[]^${}\+?|()@#&`, Please add prefix `\` for escaping.
# ---------------------------------------------------------
# The type for the metadata database
# Supported values: ``postgresql``, ``mysql`, `h2``.
# DATABASE_TYPE=${DATABASE_TYPE:-"h2"}
DATABASE_TYPE="mysql"
SPRING_DATASOURCE_URL="jdbc:mysql://os7:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8"

# Spring datasource url, following <HOST>:<PORT>/<database>?<parameter> format, If you using mysql, you could use jdbc
# string jdbc:mysql://127.0.0.1:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8 as example
SPRING_DATASOURCE_URL=${SPRING_DATASOURCE_URL:-"jdbc:h2:mem:dolphinscheduler;MODE=MySQL;DB_CLOSE_DELAY=-1;DATABASE_TO_LOWER=true"}

# Spring datasource username
SPRING_DATASOURCE_USERNAME="op"

# Spring datasource password
SPRING_DATASOURCE_PASSWORD="123456"

# ---------------------------------------------------------
# Registry Server
# ---------------------------------------------------------
# Registry Server plugin name, should be a substring of `registryPluginDir`, DolphinScheduler use this for verifying configuration consistency
registryPluginName="zookeeper"

# Registry Server address.
registryServers="data01:2181,data02:2181,data03:2181"

# The root of zookeeper, for now DolphinScheduler default registry server is zookeeper.
zkRoot="/dolphinscheduler"

# ---------------------------------------------------------
# Worker Task Server
# ---------------------------------------------------------
# Worker Task Server plugin dir. DolphinScheduler will find and load the worker task plugin jar package from this dir.
taskPluginDir="lib/plugin/task"

# resource storage type: HDFS, S3, NONE
resourceStorageType="NONE"

# resource store on HDFS/S3 path, resource file will store to this hdfs path, self configuration, please make sure the directory exists on hdfs and has read write permissions. "/dolphinscheduler" is recommended
resourceUploadPath="/dolphinscheduler"

# if resourceStorageType is HDFS,defaultFS write namenode address,HA, you need to put core-site.xml and hdfs-site.xml in the conf directory.
# if S3,write S3 address,HA,for example :s3a://dolphinscheduler,
# Note,S3 be sure to create the root directory /dolphinscheduler
defaultFS="hdfs://mycluster:8020"

# if resourceStorageType is S3, the following three configuration is required, otherwise please ignore
s3Endpoint="http://192.168.xx.xx:9010"
s3AccessKey="xxxxxxxxxx"
s3SecretKey="xxxxxxxxxx"

# resourcemanager port, the default value is 8088 if not specified
resourceManagerHttpAddressPort="8088"

# if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single node, keep this value empty
yarnHaIps="192.168.xx.xx,192.168.xx.xx"

# if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; If resourcemanager is single node, you only need to replace 'yarnIp1' to actual resourcemanager hostname
singleYarnIp="yarnIp1"

# who has permission to create directory under HDFS/S3 root path
# Note: if kerberos is enabled, please config hdfsRootUser=
hdfsRootUser="hdfs"

# kerberos config
# whether kerberos starts, if kerberos starts, following four items need to config, otherwise please ignore
kerberosStartUp="false"
# kdc krb5 config file path
krb5ConfPath="$installPath/conf/krb5.conf"
# keytab username,watch out the @ sign should followd by \\
keytabUserName="hdfs-mycluster\\@ESZ.COM"
# username keytab path
keytabPath="$installPath/conf/hdfs.headless.keytab"
# kerberos expire time, the unit is hour
kerberosExpireTime="2"

# use sudo or not
sudoEnable="true"

# worker tenant auto create
workerTenantAutoCreate="false"

### What you expected to happen

Shell tasks and other tasks should run normally.

### How to reproduce

1. After deploying the cluster, confirm that all processes start normally.
2. Log in to the DS web management page.
3. Create an ordinary user named test, using the dolphinscheduler (deploy user) that is added by default.
4. Create a token for test, submit the token, and confirm its validity period.
5. Log in to DS web as the test user.
6. Create a shell workflow with a single node whose script content is: echo 'hello dolphin'
7. Save the workflow, bring it online, and run it manually.

The run produces the following log:

[LOG-PATH]: /data1_1T/dolphinscheduler/logs/4577887155744_2/2/28.log, [HOST]: 192.168.100.102
[INFO] 2022-02-19 00:29:00.378 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[83] - shell task params {"resourceList":[],"localParams":[],"rawScript":"echo 'hello dolphin'","dependence":{},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}
[INFO] 2022-02-19 00:29:00.424 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[137] - raw script : echo 'hello dolphin'
[INFO] 2022-02-19 00:29:00.425 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[138] - task execute path : ./jobs/exec/process/4577755613088/4577887155744_2/2/28
[INFO] 2022-02-19 00:29:00.596 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[86] - tenantCode user:dolphinscheduler, task dir:2_28
[INFO] 2022-02-19 00:29:00.596 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[91] - create command file:./jobs/exec/process/4577755613088/4577887155744_2/2/28/2_28.command
[INFO] 2022-02-19 00:29:00.597 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[117] - command : #!/bin/sh
BASEDIR=$(cd `dirname $0`; pwd)
cd $BASEDIR
source /data1_1T/dolphinscheduler/conf/env/dolphinscheduler_env.sh
./jobs/exec/process/4577755613088/4577887155744_2/2/28/2_28_node.sh
[INFO] 2022-02-19 00:29:00.622 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[285] - task run command: sudo -u dolphinscheduler sh ./jobs/exec/process/4577755613088/4577887155744_2/2/28/2_28.command
[INFO] 2022-02-19 00:29:00.625 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[176] - process start, process id is: 101187
[INFO] 2022-02-19 00:29:00.647 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[200] - process has exited, execute path:./jobs/exec/process/4577755613088/4577887155744_2/2/28, processId:101187 ,exitStatusCode:127 ,processWaitForStatus:true ,processExitValue:127
[INFO] 2022-02-19 00:29:01.627 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[66] - -> welcome to use bigdata scheduling system...
sh: ./jobs/exec/process/4577755613088/4577887155744_2/2/28/2_28.command: No such file or directory
[INFO] 2022-02-19 00:29:01.628 TaskLogLogger-class org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[60] - FINALIZE_SESSION

### Anything else

It reproduces every time; restarting all processes makes no difference.

### Version

2.0.2

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
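
The log shows the worker creating the command file under a relative path (`./jobs/exec/process/...`) and then `sh` failing to open that same relative path, with the process exiting with status 127. One possible explanation, which the report does not confirm, is that the file was written relative to one working directory and looked up relative to another. The sketch below only illustrates that general shell behaviour. The directory and file names (`jobs/exec/demo.command`) are made up for the example and are not DolphinScheduler paths.

```shell
#!/bin/sh
# Hypothetical reproduction, unrelated to the DolphinScheduler code base:
# run the same relative script path from two different working directories.

workdir=$(mktemp -d)    # stands in for the directory the file was created under
otherdir=$(mktemp -d)   # stands in for a different current working directory

mkdir -p "$workdir/jobs/exec"
printf '%s\n' "echo 'hello dolphin'" > "$workdir/jobs/exec/demo.command"

# Case 1: the current directory is the one the file was created under,
# so the relative path resolves and the script runs.
status_ok=0
(cd "$workdir" && sh ./jobs/exec/demo.command) || status_ok=$?

# Case 2: the current directory is elsewhere, so the same relative path
# points at a file that does not exist and sh cannot open it.
status_missing=0
(cd "$otherdir" && sh ./jobs/exec/demo.command 2>/dev/null) || status_missing=$?

echo "resolved: $status_ok, missing: $status_missing"
rm -rf "$workdir" "$otherdir"
```

When `sh` is bash, the second status is 127, the same value as `exitStatusCode` in the log; other shells may report a different non-zero value. If this is what is happening here, checking which directory the worker process was started from, and whether the execution directory is configured as a relative path, would be reasonable next steps.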
