[ https://issues.apache.org/jira/browse/HBASE-29180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Viraj Jasani resolved HBASE-29180.
----------------------------------
    Fix Version/s: 2.5.12
     Hadoop Flags: Reviewed
       Resolution: Fixed

> Apply fail-fast retry limit for UnknownHostException
> ----------------------------------------------------
>
>                 Key: HBASE-29180
>                 URL: https://issues.apache.org/jira/browse/HBASE-29180
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.5.11
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.7.0, 3.0.0-beta-2, 2.6.3, 2.5.12
>
>
> As part of HBASE-28638, a fail-fast retry limit was introduced for errors
> such as CallQueueTooBigException, SaslException, and
> ConnectionClosedException. This limits the number of retries that
> RSProcedureDispatcher has to perform while executing remote procedures.
> Because the region open/close fails on the remote server, we also trigger a
> ServerCrashProcedure (SCP) for the target server.
> We recently came across UnknownHostException as another case where remote
> calls can get stuck forever:
> {code:java}
> WARN [RSProcedureDispatcher-pool-98034] procedure.RSProcedureDispatcher -
> request to rs1.xyz,60020,1739254267238 failed due to
> java.net.UnknownHostException: Call to address=rs1.xyz:60020 failed on local
> exception: java.net.UnknownHostException: rs1.xyz:60020 could not be
> resolved, try=2867, retrying... , request params: open_region {
>   open_info {
>     region {
> ...
> ...
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
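
The fail-fast idea described in the issue can be illustrated with a minimal, standalone sketch. This is not the HBase implementation: the class name `FailFastRetrySketch`, the `Decision` enum, the `decide` method, and the limit of 10 are all hypothetical, chosen only to show how a dispatcher might stop retrying once an UnknownHostException (possibly wrapped in another exception, as in the log above) has been seen a bounded number of times.

```java
import java.io.IOException;
import java.net.UnknownHostException;

/** Illustrative sketch only; names and the limit are assumptions, not HBase code. */
class FailFastRetrySketch {

    /** Hypothetical limit; the real configuration key and default are not stated in the issue. */
    static final int FAIL_FAST_RETRY_LIMIT = 10;

    enum Decision { RETRY, GIVE_UP }

    /** Returns true if the error, or any cause in its chain, is an UnknownHostException. */
    static boolean isUnknownHost(Throwable t) {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++, t = t.getCause()) {
            if (t instanceof UnknownHostException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Decides whether a failed remote call should be retried.
     * Only UnknownHostException is treated as fail-fast here; every other
     * error keeps retrying, which stands in for the normal retry path.
     */
    static Decision decide(Throwable error, int attempts) {
        if (isUnknownHost(error) && attempts >= FAIL_FAST_RETRY_LIMIT) {
            return Decision.GIVE_UP;
        }
        return Decision.RETRY;
    }

    public static void main(String[] args) {
        IOException wrapped = new IOException(
            "Call to address=rs1.xyz:60020 failed on local exception",
            new UnknownHostException("rs1.xyz:60020 could not be resolved"));

        System.out.println(decide(wrapped, 3));                        // RETRY
        System.out.println(decide(wrapped, 10));                       // GIVE_UP
        System.out.println(decide(new IOException("other"), 2867));    // RETRY
    }
}
```

In this sketch, a `GIVE_UP` result is the point at which a caller would stop resubmitting the request and take the fail-fast path the issue describes, such as triggering SCP for the target server, instead of reaching counts like `try=2867`.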