[jira] [Comment Edited] (HADOOP-13238) pid handling is failing on secure datanode
[ https://issues.apache.org/jira/browse/HADOOP-13238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193575#comment-17193575 ]

zhuqi edited comment on HADOOP-13238 at 9/10/20, 12:11 PM:
-----------------------------------------------------------

cc [~boky01] [~aw]

May I take this issue? I have recently been building our production clusters on the new Hadoop 3.2.1 and ran into the same problem.
I have updated the latest patch and fixed one more problem: when the service is not running and we call stop, {{cat}} prints an error because there is no pid file.
Thanks.

The patch is attached to: https://issues.apache.org/jira/browse/HADOOP-17257

> pid handling is failing on secure datanode
> ------------------------------------------
>
>                 Key: HADOOP-13238
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13238
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: scripts, security
>            Reporter: Allen Wittenauer
>            Assignee: Andras Bokor
>            Priority: Major
>         Attachments: HADOOP-13238.01.patch, HADOOP-13238.02.patch
>
>
> {code}
> hdfs --daemon stop datanode
> cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or directory
> WARNING: pid has changed for datanode, skip deleting pid file
> cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or directory
> WARNING: daemon pid has changed for datanode, skip deleting daemon pid file
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-13238) pid handling is failing on secure datanode
[ https://issues.apache.org/jira/browse/HADOOP-13238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978504#comment-15978504 ]

Andras Bokor edited comment on HADOOP-13238 at 4/21/17 11:56 AM:
-----------------------------------------------------------------

[~aw]

The root cause here is that JSVC deletes its own pid file, the one passed to it with the {{-pidfile}} option. So after stop, {{cat}} fails.

Honestly, I feel HADOOP-12364 works around a bug in an external monitoring tool rather than in Hadoop. That is a pretty rare case (I cannot even imagine how it can happen), so I think it is enough here to check whether the pid file exists. If it does not, JSVC has already deleted the file and we do not need to do the check and delete.

In addition, the error message shows up twice because both {{hadoop_stop_daemon}} and {{hadoop_stop_secure_daemon}} do the same check and delete the same pid file. The second one can be removed from the code.

After my patch the test still passes. {{hadoop_stop_daemon.bats}} and {{hadoop_stop_secure_daemon.bats}} do the same test, so the first one seems unnecessary. Also, I added a new test to prove that the pid file is deleted when everything goes well.
{code}abokor$ bats hadoop_stop_secure_daemon.bats
 ✓ hadoop_stop_secure_daemon_when_pid_file_changes
 ✓ hadoop_stop_secure_daemon_deletes_pid_file

2 tests, 0 failures{code}

Output after patch:
{code}root@abokor-practice-5:/grid/0# hadoop-3.0.0-alpha2/sbin/start-dfs.sh
Starting namenodes on [abokor-practice-2.openstacklocal]
Starting datanodes
Starting secondary namenodes [abokor-practice-5]
root@abokor-practice-5:/grid/0# hadoop-3.0.0-alpha2/sbin/stop-dfs.sh
Stopping namenodes on [abokor-practice-2.openstacklocal]
Stopping datanodes
Stopping secondary namenodes [abokor-practice-5]{code}
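
The existence check proposed in the comment above could look roughly like the following. This is only a sketch under stated assumptions, not the code from the attached patches: {{cleanup_pidfile}} is a hypothetical helper name, its two arguments (pid file path and the pid that was stopped) are illustrative, and the warning text is modelled on the output quoted in the issue. The idea is to return quietly when the launcher has already removed the pid file, so {{cat}} is never run against a missing path.

```shell
# Hypothetical helper, NOT a function from hadoop-functions.sh.
# Usage: cleanup_pidfile <pidfile> <pid-that-was-stopped>
cleanup_pidfile() {
  local pidfile="$1"
  local expected_pid="$2"
  local cur_pid

  # If the file is already gone (for example because jsvc removed the file
  # it was given via -pidfile), there is nothing to compare or delete.
  if [ ! -f "${pidfile}" ]; then
    return 0
  fi

  cur_pid=$(cat "${pidfile}")
  if [ "${cur_pid}" = "${expected_pid}" ]; then
    # The file still names the process we stopped, so it is safe to remove.
    rm -f "${pidfile}"
  else
    # Another process has rewritten the file; leave it alone and warn.
    echo "WARNING: pid has changed, skip deleting pid file" >&2
  fi
}
```

With a guard like this, calling stop when the service is not running (the case mentioned in the 2020 comment) would also produce no {{cat}} error, since the missing file is detected before it is read.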