[ https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830818#comment-16830818 ]
Erik Krogen commented on HDFS-14245:
------------------------------------

Thanks for taking a look [~shv]!
{quote}It would be better if getProxyAsClientProtocol() was throwing IOException rather than RuntimeException.
{quote}
I'm not sure I agree with this. If the proxy is, in fact, not a {{ClientProtocol}}, no amount of retrying will fix it. An {{IOException}} may trigger failover or retry logic, which will just continue to fail. Also, it indicates a bug, so it seems to me that it would be better to surface it rather than hide it under an {{IOException}}, which is more likely to get ignored (since {{IOException}}s are common). I can probably be convinced if you have some better reasoning than me or if there is precedent for your approach.
{quote}It looks that getHAServiceState() in current revision assumes STANDBY state no matter what error. I think it should only assume STANDBY state when it gets StandbyException, and re-throw if anything else.
{quote}
I don't agree with this. Throwing an exception from here will actually trigger failover of the active proxy, which is definitely not what we want. Assuming {{STANDBY}} state will achieve the desired effect of no longer contacting this node. Though something like {{UNAVAILABLE}} or {{UNREACHABLE}} may be more accurate, I don't think adding a new {{HAServiceState}} makes sense for this use case, and I think {{STANDBY}} is more applicable than any of the other states:
{code:java}
INITIALIZING("initializing"),
ACTIVE("active"),
STANDBY("standby"),
OBSERVER("observer"),
STOPPING("stopping");
{code}
{quote}Also LOG.error() rather than info().
{quote}
I think a WARN may be reasonable, but I really don't think it's an ERROR. It doesn't indicate anything fatal or broken; e.g. if one of the NameNodes is down temporarily for maintenance, you will get an {{IOException}} here. This is expected, and the client will just move on to the next NameNode.
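To make the fallback described above concrete, here is a minimal, self-contained sketch. It is not the code from the patch: {{StandbyFallbackSketch}}, {{State}}, {{StateSource}} and {{probeState}} are hypothetical names made up for illustration, and {{java.util.logging}} stands in for whatever logger the real class uses. It only shows the shape of the idea: any failure to fetch the state is logged at WARN and reported as {{STANDBY}}, and nothing is re-thrown.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch; these are stand-ins, not the real Hadoop classes. */
final class StandbyFallbackSketch {
  private static final Logger LOG = Logger.getLogger("StandbyFallbackSketch");

  /** Mirrors the HAServiceState values quoted in the comment. */
  enum State { INITIALIZING, ACTIVE, STANDBY, OBSERVER, STOPPING }

  /** Hypothetical stand-in for a NameNode proxy that can report its state. */
  interface StateSource {
    State fetchState() throws IOException;
  }

  /**
   * Returns the reported state, or STANDBY if the probe fails.
   * Nothing is re-thrown, so a caller's failover logic is not triggered.
   */
  static State probeState(String address, StateSource source) {
    try {
      return source.fetchState();
    } catch (IOException e) {
      // WARN, not ERROR: a node that is temporarily down is an expected condition.
      LOG.log(Level.WARNING,
          "Failed to get state of " + address + "; assuming STANDBY", e);
      return State.STANDBY;
    }
  }

  public static void main(String[] args) {
    // A healthy node reports its own state.
    System.out.println(probeState("nn1", () -> State.OBSERVER));
    // An unreachable node is treated as STANDBY instead of failing the call.
    System.out.println(probeState("nn2", () -> {
      throw new IOException("connection refused");
    }));
  }
}
```

With this shape, the caller simply skips a node reported as {{STANDBY}} and tries the next one, which is the behaviour argued for above.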
I think that the explanations for when to use different log levels provided in the answers [here|https://stackoverflow.com/questions/2031163/when-to-use-the-different-log-levels] are pretty good, and I think this solidly does not fit into the category of an ERROR.

I'm attaching a v003 patch which changes the log level to a WARN.

> Class cast error in GetGroups with ObserverReadProxyProvider
> ------------------------------------------------------------
>
>                 Key: HDFS-14245
>                 URL: https://issues.apache.org/jira/browse/HDFS-14245
>             Project: Hadoop HDFS
>          Issue Type: Bug
>   Affects Versions: HDFS-12943
>            Reporter: Shen Yinjie
>            Assignee: Erik Krogen
>            Priority: Major
>         Attachments: HDFS-14245.000.patch, HDFS-14245.001.patch, HDFS-14245.002.patch, HDFS-14245.patch
>
>
> Running "hdfs groups" with ObserverReadProxyProvider throws the following exception:
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> Similar to HDFS-14116, we did a simple fix.
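For readers of the trace in the issue description: the {{ClassCastException}} is raised inside a constructor that is invoked reflectively, which is why it shows up as the cause of an {{InvocationTargetException}} rather than at the top level. The sketch below reproduces that shape in isolation. Every name in it ({{ConstructorCastSketch}}, {{ProxyFactory}}, {{ClientFactory}}, {{ServerFactory}}, {{Provider}}) is a hypothetical stand-in chosen for illustration; it is not the Hadoop code and not the fix in the patch.

```java
import java.lang.reflect.InvocationTargetException;

/** Hypothetical sketch; the types below only imitate the shape of the failure. */
final class ConstructorCastSketch {
  interface ProxyFactory {}
  static final class ClientFactory implements ProxyFactory {}
  static final class ServerFactory implements ProxyFactory {}

  /** A provider whose constructor assumes it always receives a ClientFactory. */
  static final class Provider {
    final ClientFactory factory;

    public Provider(ProxyFactory factory) {
      // Unchecked downcast: throws ClassCastException for any other factory type.
      this.factory = (ClientFactory) factory;
    }
  }

  /**
   * Builds a Provider reflectively and returns the underlying failure,
   * or null if construction succeeded.
   */
  static Throwable constructionFailure(ProxyFactory factory) {
    try {
      Provider.class.getConstructor(ProxyFactory.class).newInstance(factory);
      return null;
    } catch (InvocationTargetException e) {
      // Constructor.newInstance wraps whatever the constructor itself threw.
      return e.getCause();
    } catch (ReflectiveOperationException e) {
      return e;
    }
  }

  public static void main(String[] args) {
    System.out.println(constructionFailure(new ClientFactory()));
    System.out.println(constructionFailure(new ServerFactory()));
  }
}
```

Passing the expected factory type constructs the provider normally; passing the other type fails inside the constructor, and the cast error is only visible by unwrapping the reflective exception, as in the "Caused by" chain above.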