[ 
https://issues.apache.org/jira/browse/HDFS-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099984#comment-14099984
 ] 

Rafal Wojdyla commented on HDFS-6648:
-------------------------------------

Hi [~qwertymaniac], good to know that it wasn't a design goal. By the way, 
what is the best/easiest way to check what the design goals were for a given 
class/component? Is Jira the only good place for that?

The Javadoc for ConfiguredFailoverProxyProvider says:
{code}
/**
 * A FailoverProxyProvider implementation which allows one to configure two URIs
 * to connect to during fail-over. The first configured address is tried first,
 * and on a fail-over event the other address is tried.
 */
public class ConfiguredFailoverProxyProvider<T> extends
{code}
It says "The first configured address is tried first", which is not true: the 
addresses come from a HashMap, so their order is not tied to the configuration.

This was a major issue for us due to other bugs, including but not limited to:
 
* HDFS-5064
* HDFS-4858

So at the end of the day, some clients were trying to connect to the Standby 
Namenode, which was sometimes very unresponsive, and that was killing 
performance big time.

Taking the order from the configuration file is more intuitive for 
administrators, and it lets them mitigate bugs like the ones above by 
explicitly defining the order of namenodes.
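
To illustrate the idea, here is a minimal sketch (not the actual Hadoop code) of building the per-nameservice address map so that it follows the order in which the namenode IDs are listed. The class name, the helper `addressesInConfiguredOrder`, its parameters `nnIds` and `rpcAddressById`, and the host names are all hypothetical; the only point is that a LinkedHashMap iterates in insertion order, while a plain HashMap makes no such promise.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NamenodeOrderSketch {

    // Hypothetical helper: copies the addresses into a LinkedHashMap,
    // inserting them in the order the namenode IDs were listed.
    static Map<String, String> addressesInConfiguredOrder(
            List<String> nnIds, Map<String, String> rpcAddressById) {
        Map<String, String> ordered = new LinkedHashMap<>();
        for (String id : nnIds) {
            String address = rpcAddressById.get(id);
            if (address != null) {
                ordered.put(id, address);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Assumed example data; a HashMap gives no ordering guarantee.
        Map<String, String> unordered = new HashMap<>();
        unordered.put("nn1", "nn1.example.com:8020");
        unordered.put("nn2", "nn2.example.com:8020");

        // The administrator lists nn2 first, so nn2 should be tried first.
        List<String> nnIds = Arrays.asList("nn2", "nn1");
        Map<String, String> ordered = addressesInConfiguredOrder(nnIds, unordered);

        // values() of a LinkedHashMap follows insertion order.
        System.out.println(ordered.values());
    }
}
```

With something along these lines, the list of proxies would be built from `ordered.values()`, so the first address tried is the first one the administrator listed.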

> Order of namenodes in ConfiguredFailoverProxyProvider is not defined by order 
> in hdfs-site.xml
> ----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6648
>                 URL: https://issues.apache.org/jira/browse/HDFS-6648
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, hdfs-client
>    Affects Versions: 2.2.0
>            Reporter: Rafal Wojdyla
>
> In org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider, 
> in the constructor, there's a map <nameservice : < service-id : 
> service-rpc-address > > (DFSUtil.getHaNnRpcAddresses). It's a LinkedHashMap 
> of HashMaps. The order is kept for _nameservices_. Then, to find the active 
> namenode for a nameservice, we get the HashMap of <service-id : 
> service-rpc-address > for the requested nameservice (taken from the request 
> URI), and for this HashMap we get the values. The order of this collection 
> is not strictly defined! In the code: 
> {code}
> Collection<InetSocketAddress> addressesOfNns = addressesInNN.values(); 
> {code}
> We then put these values (in undefined order) into an ArrayList of proxies, 
> and in getProxy we start from the first proxy in the list and fail over to 
> the next one if needed. 
> It would make sense for ConfiguredFailoverProxyProvider to keep the order of 
> proxies/namenodes defined in hdfs-site.xml.
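
The behaviour described in the issue can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the real ConfiguredFailoverProxyProvider: the class `FailoverOrderSketch`, its fields, and the use of plain address strings instead of proxies are hypothetical. It only shows why the order of the list decides which namenode a client contacts first.

```java
import java.util.ArrayList;
import java.util.List;

class FailoverOrderSketch {
    // Addresses in whatever order they were handed to the constructor.
    private final List<String> addresses;
    // Index of the address currently in use; starts at the first entry.
    private int current = 0;

    FailoverOrderSketch(List<String> addresses) {
        if (addresses.isEmpty()) {
            throw new IllegalArgumentException("at least one address is required");
        }
        this.addresses = new ArrayList<>(addresses);
    }

    // Returns the address that would be tried right now.
    String getProxy() {
        return addresses.get(current);
    }

    // Moves on to the next address, wrapping around at the end of the list.
    void performFailover() {
        current = (current + 1) % addresses.size();
    }
}
```

If the list happens to start with the standby namenode, every new client first talks to the standby and only reaches the active namenode after a failover, which is why an undefined order in the list matters.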



--
This message was sent by Atlassian JIRA
(v6.2#6252)
