[
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830818#comment-16830818
]
Erik Krogen commented on HDFS-14245:
------------------------------------
Thanks for taking a look [~shv]!
{quote}It would be better if getProxyAsClientProtocol() was throwing
IOException rather than RuntimeException.
{quote}
I'm not sure I agree with this. If the proxy is, in fact, not a
{{ClientProtocol}}, no level of retry will fix it. An {{IOException}} may
trigger failover or retry logic, which will just continue to fail. Also, it
indicates a bug, so it seems to me that it would be better to surface it rather
than hiding it under an {{IOException}} which is more likely to get ignored
(since {{IOExceptions}} are common). I can probably be convinced if you have
some better reasoning than mine or if there is precedent for your approach.
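To illustrate the concern, here is a minimal sketch rather than the actual patch. {{ClientProtocol}} is reduced to an empty stand-in interface, and {{Call}} and {{runWithRetries}} are hypothetical names that only mimic retry logic reacting to {{IOException}}; none of this is the real Hadoop retry code.

```java
import java.io.IOException;

public class ProxyCastSketch {
    // Hypothetical stand-in for the real ClientProtocol interface.
    interface ClientProtocol { }

    // Hypothetical unit of work that may fail with an IOException.
    interface Call { void run() throws IOException; }

    // Fails fast with an unchecked exception when the proxy has the wrong type.
    static ClientProtocol getProxyAsClientProtocol(Object proxy) {
        if (!(proxy instanceof ClientProtocol)) {
            throw new RuntimeException("Proxy of type "
                + proxy.getClass().getName() + " is not a ClientProtocol");
        }
        return (ClientProtocol) proxy;
    }

    // Simplified retry loop (an assumption, not Hadoop's): retries only on
    // IOException and returns the number of attempts that were made.
    static int runWithRetries(Call call, int maxAttempts) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            try {
                call.run();
                return attempts;
            } catch (IOException e) {
                // Treated as transient, so the loop tries again.
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        Object notAClient = "not a ClientProtocol";

        // Reported as an IOException, the type mismatch is retried until the
        // attempts run out, even though no retry can ever succeed.
        int attempts = runWithRetries(() -> {
            throw new IOException("wrong proxy type");
        }, 3);
        System.out.println("IOException attempts: " + attempts);

        // Reported as a RuntimeException, it escapes the retry loop on the
        // first attempt and the bug is visible to the caller.
        try {
            runWithRetries(() -> getProxyAsClientProtocol(notAClient), 3);
        } catch (RuntimeException e) {
            System.out.println("Surfaced immediately: " + e.getMessage());
        }
    }
}
```

In this sketch the {{IOException}} variant burns through all three attempts, while the {{RuntimeException}} variant propagates on the first call.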
{quote}It looks that getHAServiceState() in current revision assumes STANDBY
state no matter what error. I think it should only assume STANDBY state when it
gets StandbyException, and re-throw if anything else.
{quote}
I don't agree with this. Throwing an exception from here will actually trigger
failover of the active proxy, which is definitely not what we want. Assuming
{{STANDBY}} state will achieve the desired effect of no longer contacting this
node. Though something like {{UNAVAILABLE}} or {{UNREACHABLE}} may be more
accurate, I don't think adding a new {{HAServiceState}} makes sense for this
use case, and I think {{STANDBY}} is more applicable than any of the other
states:
{code:java}
INITIALIZING("initializing"),
ACTIVE("active"),
STANDBY("standby"),
OBSERVER("observer"),
STOPPING("stopping");
{code}
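As a rough sketch of the behavior I'm describing (not the code in the patch), the idea is that any failure to fetch the state gets mapped to {{STANDBY}} instead of propagating. The {{StateProbe}} interface and the class name here are hypothetical, and the enum is a local copy of the values listed above.

```java
import java.io.IOException;

public class HaStateSketch {
    // Local copy of the states listed above, for illustration only.
    enum HAServiceState { INITIALIZING, ACTIVE, STANDBY, OBSERVER, STOPPING }

    // Hypothetical stand-in for a proxy that can report its NameNode's state.
    interface StateProbe {
        HAServiceState getHAServiceState() throws IOException;
    }

    // Any IOException is mapped to STANDBY rather than rethrown, so the
    // caller skips this node instead of failing over the active proxy.
    static HAServiceState getHAServiceState(StateProbe proxy) {
        try {
            return proxy.getHAServiceState();
        } catch (IOException e) {
            return HAServiceState.STANDBY;
        }
    }

    public static void main(String[] args) {
        StateProbe down = () -> {
            throw new IOException("connection refused");
        };
        StateProbe observer = () -> HAServiceState.OBSERVER;

        System.out.println(getHAServiceState(down));     // prints STANDBY
        System.out.println(getHAServiceState(observer)); // prints OBSERVER
    }
}
```

A node that cannot be reached looks like a standby to the caller, so it is skipped for reads without any exception reaching the failover path.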
{quote}Also LOG.error() rather than info().
{quote}
I think a WARN may be reasonable, but I really don't think it's an ERROR. It
doesn't indicate anything fatal or broken; e.g. if one of the NameNodes is down
temporarily for maintenance you will get an {{IOException}} here. This is
expected and the client will just continue to move on to the next NameNode. I
think that the explanations for when to use different log levels provided in
the answers
[here|https://stackoverflow.com/questions/2031163/when-to-use-the-different-log-levels]
are pretty good, and I think this solidly does not fit into the category of an
ERROR.
I'm attaching a v003 patch which changes the log level to a WARN.
> Class cast error in GetGroups with ObserverReadProxyProvider
> ------------------------------------------------------------
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: HDFS-12943
> Reporter: Shen Yinjie
> Assignee: Erik Krogen
> Priority: Major
> Attachments: HDFS-14245.000.patch, HDFS-14245.001.patch,
> HDFS-14245.002.patch, HDFS-14245.patch
>
>
> Running "hdfs groups" with ObserverReadProxyProvider throws the following exception:
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
> at org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
> at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
> at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
> at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
> at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
> ... 7 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
> at org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
> at org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
> ... 12 more
> {code}
> This is similar to HDFS-14116; we did a simple fix.