[ 
https://issues.apache.org/jira/browse/HBASE-12554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221528#comment-14221528
 ] 

stack commented on HBASE-12554:
-------------------------------

+1

On commit, put public first in the declaration below, and add a class comment 
explaining why we are putting this mock class in place:

static public class...

Leave out the other changes, the ones to RackManager and to BaseLoadBalancer, 
since you don't know whether they have an effect (logging that we spent 60 
seconds in lookup could be good, but you aren't sure the interrupt works). Do 
such changes in another JIRA where you can try the code against bad DNS to see 
that it is doing the right thing.
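
To illustrate the two requests above (modifier order and a class comment on the mock), here is a minimal sketch. The names MockRackExample, RackResolver, MockRackResolver and the "/default-rack" value are hypothetical stand-ins chosen for this example; they are not taken from the patch, whose class name is elided above.

```java
import java.util.ArrayList;
import java.util.List;

class MockRackExample {
  /** Hypothetical stand-in for the real rack-resolution interface. */
  interface RackResolver {
    List<String> resolve(List<String> hostNames);
  }

  /**
   * Test-only resolver that answers every host with one fixed rack, so that
   * tests never reach a real DNS lookup, which can block long enough to trip
   * the test timeout.
   */
  public static class MockRackResolver implements RackResolver {  // "public" before "static"
    @Override
    public List<String> resolve(List<String> hostNames) {
      List<String> racks = new ArrayList<>(hostNames.size());
      for (int i = 0; i < hostNames.size(); i++) {
        racks.add("/default-rack");  // assumed placeholder rack name
      }
      return racks;
    }
  }
}
```

The class comment states the reason the mock exists, which is the part a later reader cannot recover from the code alone.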




> TestBaseLoadBalancer may timeout due to lengthy rack lookup
> -----------------------------------------------------------
>
>                 Key: HBASE-12554
>                 URL: https://issues.apache.org/jira/browse/HBASE-12554
>             Project: HBase
>          Issue Type: Test
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: 12554-v1.txt, 12554-v2.txt, 12554-v3.txt, 12554-v4.txt
>
>
> Here is one of the recent occurrences 
> (https://builds.apache.org/job/PreCommit-HBASE-Build/11778/console):
> {code}
> testImmediateAssignment(org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer)  Time elapsed: 30.019 sec  <<< ERROR!
> java.lang.Exception: test timed out after 30000 milliseconds
>       at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
>       at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
>       at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
>       at java.net.InetAddress.getAllByName0(InetAddress.java:1246)
>       at java.net.InetAddress.getAllByName(InetAddress.java:1162)
>       at java.net.InetAddress.getAllByName(InetAddress.java:1098)
>       at java.net.InetAddress.getByName(InetAddress.java:1048)
>       at org.apache.hadoop.net.NetUtils.normalizeHostName(NetUtils.java:561)
>       at org.apache.hadoop.net.NetUtils.normalizeHostNames(NetUtils.java:578)
>       at org.apache.hadoop.net.CachedDNSToSwitchMapping.resolve(CachedDNSToSwitchMapping.java:109)
>       at org.apache.hadoop.hbase.master.RackManager.getRack(RackManager.java:66)
>       at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.<init>(BaseLoadBalancer.java:273)
>       at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:1113)
>       at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.randomAssignment(BaseLoadBalancer.java:1175)
>       at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.immediateAssignment(BaseLoadBalancer.java:1145)
>       at org.apache.hadoop.hbase.master.balancer.TestBaseLoadBalancer.testImmediateAssignment(TestBaseLoadBalancer.java:136)
> {code}
> One possible fix is to submit CachedDNSToSwitchMapping.resolve() to an 
> executor pool for execution. RackManager.getRack() can then set a timeout 
> beyond which UNKNOWN_RACK is returned.
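
The fix proposed in the description could look roughly like the sketch below. It is not the patch attached to this issue: TimedRackLookup, its getRack(Callable, long) signature and the UNKNOWN_RACK value are assumptions made for illustration, and the slow resolver is passed in as a Callable so the example runs without Hadoop on the classpath.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimedRackLookup {
  /** Placeholder returned when no rack is known in time (assumed value). */
  static final String UNKNOWN_RACK = "Unknown Rack";

  // Daemon threads, so a lookup stuck in native code cannot keep the JVM alive.
  private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
    Thread t = new Thread(r, "rack-lookup");
    t.setDaemon(true);
    return t;
  });

  /** Runs the lookup on the pool and gives up after timeoutMs milliseconds. */
  String getRack(Callable<String> lookup, long timeoutMs) {
    Future<String> future = pool.submit(lookup);
    try {
      String rack = future.get(timeoutMs, TimeUnit.MILLISECONDS);
      return rack != null ? rack : UNKNOWN_RACK;
    } catch (TimeoutException e) {
      // Requests an interrupt; a thread blocked in a native DNS call may ignore it.
      future.cancel(true);
      return UNKNOWN_RACK;
    } catch (ExecutionException e) {
      // The lookup itself failed; treat the rack as unknown.
      return UNKNOWN_RACK;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // preserve the caller's interrupt status
      return UNKNOWN_RACK;
    }
  }
}
```

The caller stops waiting once the timeout expires, but the worker thread may stay blocked until the underlying lookup returns. That is why the effect of the interrupt would need to be checked against a genuinely bad DNS setup, as the comment above asks.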



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
