[ https://issues.apache.org/jira/browse/HDFS-12910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16287325#comment-16287325 ]
Nanda kumar commented on HDFS-12910: ------------------------------------ Thanks [~xiaochen] for the review and suggestions. bq. We usually do not post a patch directly to a jira that someone else is actively working on. I understand posting a patch may be easier to express the idea sometimes, and I think we're fine here. But please try to collaborate via discussions first in the future. I understand your concern, my intention was not to take over the jira but to give an idea on the test-case part of it :) I should have added a comment instead of posting a patch, sorry for that. bq. I don't think this is correct here for 2 reasons. \[…\] Maybe a cleaner way to do is to catch {{BindException}} at the end of {{getSecureResources}}? I agree. Catching {{BindException}} at the end of {{getSecureResources}} will also not help us, we might not be able to tell from which call the exception has occurred, we need that information to add a proper message to the exception. Instead of one catch block at the end, we will be needing two {{try … catch}} block, one for each bind call. bq. in the new method, we catch several exceptions when constructing the new BE, and rethrows the original BE. We should log these failures so in case they happen, there is a way to debug. If we are going to go with retaining the new method, we can replace {{BindException newBe = be.getClass().getConstructor(String.class).newInstance(be.getMessage() + " " + endpoint)}} to {{BindException newBe = new BindException(be.getMessage() + " " + endpoint)}} which doesn’t need additional exception handling. Constructing the exception object through reflection will make sense if we are expecting different types of exception which extends BindException, as far as I see there are not Exceptions expected here which will extend BindException. So we can safely remove the object construction using reflection and make it using {{new}}. bq.I think the canonical way is to call bind inside the try block. True, thanks for the catch. Even though having two {{try … catch}} block, one for each bind call doesn’t look cleaner, it’s still better than having a separate {{bind0}} method to handle two different types of bind calls which will bring in the issues mentioned by [~xiaochen]. bq. we should not explicitly override a value based on the current behavior of {{ServerSocket}} bq. if someone intentionally sets ipc.server.listen.queue.size to a non-positive number, it will be changed by this patch. > Secure Datanode Starter should log the port when it > ---------------------------------------------------- > > Key: HDFS-12910 > URL: https://issues.apache.org/jira/browse/HDFS-12910 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode > Affects Versions: 3.1.0 > Reporter: Stephen O'Donnell > Assignee: Stephen O'Donnell > Priority: Minor > Attachments: HDFS-12910.001.patch, HDFS-12910.002.patch > > > When running a secure data node, the default ports it uses are 1004 and 1006. > Sometimes other OS services can start on these ports causing the DN to fail > to start (eg the nfs service can use random ports under 1024). > When this happens an error is logged by jsvc, but it is confusing as it does > not tell you which port it is having issues binding to, for example, when > port 1004 is used by another process: > {code} > Initializing secure datanode resources > java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:433) > at sun.nio.ch.Net.bind(Net.java:425) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at > org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.getSecureResources(SecureDataNodeStarter.java:105) > at > org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.init(SecureDataNodeStarter.java:71) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:207) > Cannot load daemon > Service exit with a return value of 3 > {code} > And when port 1006 is used: > {code} > Opened streaming server at /0.0.0.0:1004 > java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:433) > at sun.nio.ch.Net.bind(Net.java:425) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) > at > org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.getSecureResources(SecureDataNodeStarter.java:129) > at > org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.init(SecureDataNodeStarter.java:71) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:207) > Cannot load daemon > Service exit with a return value of 3 > {code} > We should catch the BindException exception and log out the problem > address:port and then re-throw the exception to make the problem more clear. > I will upload a patch for this. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org