[jira] [Commented] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

Erik Krogen (JIRA) Tue, 30 Apr 2019 19:09:27 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830818#comment-16830818
 ]


Erik Krogen commented on HDFS-14245:
------------------------------------

Thanks for taking a look [~shv]!
{quote}It would be better if getProxyAsClientProtocol() was throwing 
IOException rather than RuntimeException.
{quote}
I'm not sure I agree with this. If the proxy is, in fact, not a 
{{ClientProtocol}}, no level of retry will fix it. An {{IOException}} may 
trigger failover or retry logic, which will just continue to fail. Also, it 
indicates a bug, so it seems to me that it would be better to surface it rather 
than hiding it under an {{IOException}} which is more likely to get ignored 
(since {{IOExceptions}} are common). I can probably be convinced if you have 
better reasoning than mine or if there is precedent for your approach.
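To illustrate the fail-fast argument, here is a minimal sketch. It is not the code in the patch: the {{ClientProtocol}} interface below is a hypothetical one-method stand-in for the real Hadoop interface, and {{IllegalStateException}} is just one possible choice of unchecked exception.

```java
import java.io.IOException;

final class ProxyCastSketch {
  /** Hypothetical stand-in for Hadoop's ClientProtocol interface. */
  interface ClientProtocol {
    String describe() throws IOException;
  }

  /**
   * Returns the proxy viewed as a ClientProtocol. A proxy of any other type
   * is a programming error, so this throws an unchecked exception instead of
   * an IOException that retry or failover logic might catch and act on.
   */
  static ClientProtocol getProxyAsClientProtocol(Object proxy) {
    if (!(proxy instanceof ClientProtocol)) {
      String type = (proxy == null) ? "null" : proxy.getClass().getName();
      throw new IllegalStateException(
          "BUG: proxy of type " + type + " used as a ClientProtocol");
    }
    return (ClientProtocol) proxy;
  }

  public static void main(String[] args) throws IOException {
    ClientProtocol good = () -> "client protocol proxy";
    System.out.println(getProxyAsClientProtocol(good).describe());

    try {
      getProxyAsClientProtocol("not a proxy");
    } catch (IllegalStateException e) {
      // Surfaces immediately; callers cannot mistake it for a transient I/O failure.
      System.out.println("Failed fast: " + e.getMessage());
    }
  }
}
```

Because the exception is unchecked, code that catches {{IOException}} to drive retries will not swallow it, which is the behavior argued for above.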
{quote}It looks that getHAServiceState() in current revision assumes STANDBY 
state no matter what error. I think it should only assume STANDBY state when it 
gets StandbyException, and re-throw if anything else.
{quote}
I don't agree with this. Throwing an exception from here will actually trigger 
failover of the active proxy, which is definitely not what we want. Assuming 
{{STANDBY}} state will achieve the desired effect of no longer contacting this 
node. Though something like {{UNAVAILABLE}} or {{UNREACHABLE}} may be more 
accurate, I don't think adding a new {{HAServiceState}} makes sense for this 
use case, and I think {{STANDBY}} is more applicable than any of the other 
states:
{code:java}
    INITIALIZING("initializing"),
    ACTIVE("active"),
    STANDBY("standby"),
    OBSERVER("observer"),
    STOPPING("stopping");
{code}
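The "assume {{STANDBY}} on any error" behavior can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: {{StateProbe}} is a hypothetical stand-in for the RPC proxy to one NameNode, the enum is a simplified copy of the five states listed above, and the use of {{java.util.logging}} at WARNING level is a choice made only for this sketch.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

final class ServiceStateSketch {
  private static final Logger LOG =
      Logger.getLogger(ServiceStateSketch.class.getName());

  /** Simplified copy of the five states listed above. */
  enum HAServiceState { INITIALIZING, ACTIVE, STANDBY, OBSERVER, STOPPING }

  /** Hypothetical stand-in for an RPC proxy to a single NameNode. */
  interface StateProbe {
    HAServiceState getHAServiceState() throws IOException;
  }

  /**
   * Asks one node for its state. Any IOException is mapped to STANDBY rather
   * than rethrown, so the caller skips this node without treating the
   * failure as a reason to fail over.
   */
  static HAServiceState getHAServiceState(StateProbe proxy) {
    try {
      return proxy.getHAServiceState();
    } catch (IOException e) {
      LOG.log(Level.WARNING, "Failed to get service state; assuming STANDBY", e);
      return HAServiceState.STANDBY;
    }
  }

  public static void main(String[] args) {
    StateProbe observer = () -> HAServiceState.OBSERVER;
    StateProbe down = () -> { throw new IOException("connection refused"); };

    System.out.println(getHAServiceState(observer));
    System.out.println(getHAServiceState(down));
  }
}
```

A node that is unreachable is reported the same way as one that is genuinely in standby, so the client simply moves on to the next NameNode.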
{quote}Also LOG.error() rather than info().
{quote}
I think a WARN may be reasonable, but I really don't think it's an ERROR. It 
doesn't indicate anything fatal or broken; e.g. if one of the NameNodes is down 
temporarily for maintenance you will get an {{IOException}} here. This is 
expected and the client will just continue to move on to the next NameNode. I 
think that the explanations for when to use different log levels provided in 
the answers 
[here|https://stackoverflow.com/questions/2031163/when-to-use-the-different-log-levels]
 are pretty good, and I think this solidly does not fit into the category of an 
ERROR.

 

I'm attaching a v003 patch which changes the log level to a WARN.

> Class cast error in GetGroups with ObserverReadProxyProvider
> ------------------------------------------------------------
>
>                 Key: HDFS-14245
>                 URL: https://issues.apache.org/jira/browse/HDFS-14245
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: HDFS-12943
>            Reporter: Shen Yinjie
>            Assignee: Erik Krogen
>            Priority: Major
>         Attachments: HDFS-14245.000.patch, HDFS-14245.001.patch, 
> HDFS-14245.002.patch, HDFS-14245.patch
>
>
> Running "hdfs groups" with ObserverReadProxyProvider throws the following exception:
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> This is similar to HDFS-14116; we applied a simple fix.


