[jira] [Comment Edited] (HADOOP-13238) pid handling is failing on secure datanode

2020-09-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-13238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193575#comment-17193575
 ] 

zhuqi edited comment on HADOOP-13238 at 9/10/20, 12:11 PM:
---

cc [~boky01]  [~aw]

If I may take this over: I have recently been using Hadoop 3.2.1 to build our 
production clusters and ran into the same problem.

I have now updated the latest patch to fix another problem: when the service is 
not running and stop is called, cat reports an error because there is no 
pid file.

Thanks.

Attached at: https://issues.apache.org/jira/browse/HADOOP-17257


was (Author: zhuqi):
cc [~boky01]  [~aw]

If I may take this over: I have recently been using Hadoop 3.2.1 to build our 
production clusters and ran into the same problem.

I have now updated the latest patch to fix another problem: when the service is 
not running and stop is called, cat reports an error because there is no 
pid file.

Thanks.

> pid handling is failing on secure datanode
> --
>
> Key: HADOOP-13238
> URL: https://issues.apache.org/jira/browse/HADOOP-13238
> Project: Hadoop Common
> Issue Type: Bug
> Components: scripts, security
> Reporter: Allen Wittenauer
> Assignee: Andras Bokor
> Priority: Major
> Attachments: HADOOP-13238.01.patch, HADOOP-13238.02.patch
>
>
> {code}
> hdfs --daemon stop datanode
> cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or directory
> WARNING: pid has changed for datanode, skip deleting pid file
> cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or directory
> WARNING: daemon pid has changed for datanode, skip deleting daemon pid file
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-13238) pid handling is failing on secure datanode

2017-04-21 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978504#comment-15978504
 ] 

Andras Bokor edited comment on HADOOP-13238 at 4/21/17 11:56 AM:
-

[~aw]

The root cause here is that JSVC deletes its own pid file, which was passed 
with the {{-pidfile}} option, so {{cat}} fails after stop.
Honestly, I feel HADOOP-12364 solves a bug in an external monitoring tool 
rather than in Hadoop. That is a pretty rare case (I cannot even imagine how 
it can happen), so I think it is enough here to check whether the pid file 
exists. If it does not, JSVC has already deleted the file, so we do not need to 
check and delete it.
In addition, the error message shows up twice because both 
{{hadoop_stop_daemon}} and {{hadoop_stop_secure_daemon}} do the same check and 
delete the same pid file. The second one can be removed from the code.
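
A minimal sketch of the existence check described above, for illustration only. It is not the attached patch: the function name {{remove_pidfile_if_unchanged}} and its arguments are hypothetical, and the real logic lives inside the Hadoop shell functions. The idea is to return quietly when the pid file is already gone, and to compare and delete only when it still exists.

```shell
# Hypothetical helper (not a real Hadoop function).
#   $1 = path to the pid file
#   $2 = pid of the process that was just stopped
remove_pidfile_if_unchanged() {
  rpiu_pidfile=$1
  rpiu_expected=$2

  # jsvc may already have removed its own pid file on shutdown.
  # In that case there is nothing to compare and nothing to delete,
  # so do not call cat on a missing file.
  if [ ! -f "${rpiu_pidfile}" ]; then
    return 0
  fi

  rpiu_current=$(cat "${rpiu_pidfile}")
  if [ "${rpiu_current}" = "${rpiu_expected}" ]; then
    # Same pid as the one we stopped: safe to clean up.
    rm -f "${rpiu_pidfile}"
  else
    # Another process has rewritten the file: leave it alone.
    echo "WARNING: pid has changed, skip deleting pid file" >&2
  fi
}
```

With this guard, a stop after JSVC has cleaned up produces neither the {{cat}} error nor the misleading "pid has changed" warning, while the case where the file holds a different pid is still reported.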

After my patch the test still passes. {{hadoop_stop_daemon.bats}} and 
{{hadoop_stop_secure_daemon.bats}} do the same test, so the first one seems 
unnecessary.
Also, I added a new test to prove that the pid file is deleted when everything 
goes well.
{code}abokor$ bats hadoop_stop_secure_daemon.bats
 ✓ hadoop_stop_secure_daemon_when_pid_file_changes
 ✓ hadoop_stop_secure_daemon_deletes_pid_file

2 tests, 0 failures{code}

Output after patch:
{code}root@abokor-practice-5:/grid/0# hadoop-3.0.0-alpha2/sbin/start-dfs.sh
Starting namenodes on [abokor-practice-2.openstacklocal]
Starting datanodes
Starting secondary namenodes [abokor-practice-5]
root@abokor-practice-5:/grid/0# hadoop-3.0.0-alpha2/sbin/stop-dfs.sh
Stopping namenodes on [abokor-practice-2.openstacklocal]
Stopping datanodes
Stopping secondary namenodes [abokor-practice-5]{code}


was (Author: boky01):
[~aw]

The root cause here is that JSVC will delete the pid file which was passed to 
it with {{-pidfile}} option. So after stop {{cat}} will fail.
Honestly, I feel HADOOP-12364 solves a bug in an external monitoring tool 
rather than in Hadoop. That is a pretty rare case (I cannot even imagine how 
can it happen) so I think here it is enough to check that whether the pid file 
exists or not. If not that means JSVC deleted the file so we do not need to do 
check and delete.
In addition the error message shows up twice because either 
{{hadoop_stop_daemon.bats}} or {{hadoop_stop_secure_daemon.bats}} do the same 
check and deletes the same pid file. The second one can be removed from the 
code.

After my patch the test still passes. {{adoop_stop_daemon.bats}} and 
{{adoop_stop_secure_daemon.bats}} do the same test so the first seems 
unnecessary.
{code}abokor$ bats hadoop_stop_secure_daemon.bats
 ✓ hadoop_stop_secure_daemon

1 test, 0 failures{code}

> pid handling is failing on secure datanode
> --
>
> Key: HADOOP-13238
> URL: https://issues.apache.org/jira/browse/HADOOP-13238
> Project: Hadoop Common
> Issue Type: Bug
> Components: scripts, security
> Reporter: Allen Wittenauer
> Assignee: Andras Bokor
>
> {code}
> hdfs --daemon stop datanode
> cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or directory
> WARNING: pid has changed for datanode, skip deleting pid file
> cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or directory
> WARNING: daemon pid has changed for datanode, skip deleting daemon pid file
> {code}


