[ https://issues.apache.org/jira/browse/HDFS-12910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16287231#comment-16287231 ]

Xiao Chen commented on HDFS-12910:
----------------------------------

Thanks for posting a new patch [~nandakumar131], and the discussions to move 
this forward. It's coming up nicely.

Good point on the log file ownership Stephen! That's almost certainly the 
reason for the existing {{System.err}} usages. I don't think we can get around 
that easily. Rethrowing the exception feels like the way to go.

A few comments on the patch:
- {code:title=SecureDataNodeStarter#bind0}
      if (backlog < 1) {
        // This default value is picked from java.net.ServerSocket
        backlog = 50;
      }
{code}
I don't think this is correct here for two reasons. First, we should not 
explicitly override a value based on the current behavior of {{ServerSocket}}. 
Second, if someone intentionally sets {{ipc.server.listen.queue.size}} to a 
non-positive number, this patch would silently change it. In other words, this 
could introduce an incompatible behavior.
Currently the IPC and HTTP server bindings call two different APIs, and I find 
it difficult to unify them in a shared method at the caller. Maybe a cleaner 
approach is to catch {{BindException}} at the end of {{getSecureResources}}?

- in the new method, we catch several exceptions when constructing the new BE 
and rethrow the original BE. We should log these failures so that if they 
happen, there is a way to debug them.
- in the test: I think the canonical way is to call {{bind}} inside the try 
block; {{close}} will be a no-op if the socket was not created. See the 
constructor of {{ServerSocket}} for reference.
- nit, I'd rename {{testInfoSocAddrBindException}} to something like 
{{testWebServerAddrBindException}}
- there are some extra line breaks in both classes, please fix them.
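
The suggestion above, catching {{BindException}} near the bind call and rethrowing it with the offending address attached, could look roughly like the sketch below. This is a minimal standalone illustration, not the actual {{SecureDataNodeStarter}} code; the {{BindHelper}} class and its {{bind}} method are hypothetical names.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

public class BindHelper {
    // Bind a server channel; on BindException, rethrow with the offending
    // address:port included in the message so the log tells you which port
    // failed, keeping the original exception as the cause for debugging.
    static ServerSocketChannel bind(InetSocketAddress addr, int backlog)
            throws IOException {
        ServerSocketChannel ch = ServerSocketChannel.open();
        try {
            ch.socket().bind(addr, backlog);
            return ch;
        } catch (BindException e) {
            ch.close();
            BindException wrapped = new BindException(
                "Failed to bind " + addr + ": " + e.getMessage());
            wrapped.initCause(e); // preserve the original stack trace
            throw wrapped;
        }
    }

    public static void main(String[] args) throws IOException {
        // Bind once on an ephemeral port, then bind the same port again to
        // provoke a BindException that carries the address in its message.
        ServerSocketChannel first =
            bind(new InetSocketAddress("127.0.0.1", 0), 0);
        int port = first.socket().getLocalPort();
        try {
            bind(new InetSocketAddress("127.0.0.1", port), 0);
        } catch (BindException e) {
            System.out.println(e.getMessage());
        } finally {
            first.close();
        }
    }
}
```

The same wrapping would apply to both the streaming-server and web-server binds, which is why catching once at the end of {{getSecureResources}} may be the cleanest place for it.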

Also a quick note to [~nandakumar131]: thanks for contributing to Hadoop; we 
appreciate your contributions! When reviewing, it would be great to articulate 
ideas and encourage the assignee to work on the issue. It is also fine to check 
with the assignee about their availability and offer to take over a jira if 
they agree and you're interested. We usually do not post a patch directly to a 
jira that someone else is actively working on. I understand posting a patch can 
sometimes be the easiest way to express an idea, and I think we're fine here. 
But please try to collaborate via discussions first in the future. :)

> Secure Datanode Starter should log the port when it 
> ----------------------------------------------------
>
>                 Key: HDFS-12910
>                 URL: https://issues.apache.org/jira/browse/HDFS-12910
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 3.1.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Minor
>         Attachments: HDFS-12910.001.patch, HDFS-12910.002.patch
>
>
> When running a secure data node, the default ports it uses are 1004 and 1006. 
> Sometimes other OS services can start on these ports causing the DN to fail 
> to start (eg the nfs service can use random ports under 1024).
> When this happens an error is logged by jsvc, but it is confusing as it does 
> not tell you which port it is having issues binding to, for example, when 
> port 1004 is used by another process:
> {code}
> Initializing secure datanode resources
> java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind0(Native Method)
>         at sun.nio.ch.Net.bind(Net.java:433)
>         at sun.nio.ch.Net.bind(Net.java:425)
>         at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>         at 
> org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.getSecureResources(SecureDataNodeStarter.java:105)
>         at 
> org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.init(SecureDataNodeStarter.java:71)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:207)
> Cannot load daemon
> Service exit with a return value of 3
> {code}
> And when port 1006 is used:
> {code}
> Opened streaming server at /0.0.0.0:1004
> java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind0(Native Method)
>         at sun.nio.ch.Net.bind(Net.java:433)
>         at sun.nio.ch.Net.bind(Net.java:425)
>         at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
>         at 
> org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.getSecureResources(SecureDataNodeStarter.java:129)
>         at 
> org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.init(SecureDataNodeStarter.java:71)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:207)
> Cannot load daemon
> Service exit with a return value of 3
> {code}
> We should catch the {{BindException}}, log the problem address:port, and 
> then re-throw the exception to make the problem more clear.
> I will upload a patch for this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)