[ 
https://issues.apache.org/jira/browse/HDFS-12910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16284630#comment-16284630
 ] 

Xiao Chen commented on HDFS-12910:
----------------------------------

Thanks for working on this [~sodonnell]! I have added you to the contributors 
list (so you can assign jiras to yourself now) and assigned this to you.

Agree this would be very useful for supportability. Some little comments:
- We usually try to log the messages to the DN log. I guess this one is special 
because it happens during DN start. In this case, can we still log to the DN 
log additionally? I think having the ports in the log would be helpful too, in 
case the stderr isn't directly on someone's console, and they looked the log 
first. (Could construct a string, and use that for print and log)
- We can just {{throw e}}, instead of {{throw (e)}}.

Unit test is not required for logging and supportability changes. The output 
you pasted looks good to me.
If you're interested, I think a unit test can technically be done in 
{{TestStartSecureDataNode}}: create a similar test like {{testSecureNameNode}}, 
manually bind to the port. Then start the minicluster, and try-catch IOE. 
Verify the IOE's message contains the port you want to see using 
{{GenericTestUtils#assertExceptionContains}}.

> Secure Datanode Starter should log the port when it 
> ----------------------------------------------------
>
>                 Key: HDFS-12910
>                 URL: https://issues.apache.org/jira/browse/HDFS-12910
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 3.1.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Minor
>         Attachments: HDFS-12910.001.patch
>
>
> When running a secure data node, the default ports it uses are 1004 and 1006. 
> Sometimes other OS services can start on these ports causing the DN to fail 
> to start (eg the nfs service can use random ports under 1024).
> When this happens an error is logged by jsvc, but it is confusing as it does 
> not tell you which port it is having issues binding to, for example, when 
> port 1004 is used by another process:
> {code}
> Initializing secure datanode resources
> java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind0(Native Method)
>         at sun.nio.ch.Net.bind(Net.java:433)
>         at sun.nio.ch.Net.bind(Net.java:425)
>         at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>         at 
> org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.getSecureResources(SecureDataNodeStarter.java:105)
>         at 
> org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.init(SecureDataNodeStarter.java:71)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:207)
> Cannot load daemon
> Service exit with a return value of 3
> {code}
> And when port 1006 is used:
> {code}
> Opened streaming server at /0.0.0.0:1004
> java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind0(Native Method)
>         at sun.nio.ch.Net.bind(Net.java:433)
>         at sun.nio.ch.Net.bind(Net.java:425)
>         at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
>         at 
> org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.getSecureResources(SecureDataNodeStarter.java:129)
>         at 
> org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.init(SecureDataNodeStarter.java:71)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:207)
> Cannot load daemon
> Service exit with a return value of 3
> {code}
> We should catch the BindException exception and log out the problem 
> address:port and then re-throw the exception to make the problem more clear.
> I will upload a patch for this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to