[ https://issues.apache.org/jira/browse/HADOOP-19447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17938397#comment-17938397 ]
ASF GitHub Bot commented on HADOOP-19447:
-----------------------------------------

TaoYang526 commented on code in PR #7527:
URL: https://github.com/apache/hadoop/pull/7527#discussion_r2013251034


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SecurityUtil.java:
##########
@@ -586,17 +592,62 @@ InetAddress getByName(String hostname) throws UnknownHostException {
       return hostResolver.getByName(hostname);
     }
   }
-
+
   interface HostResolver {
-    InetAddress getByName(String host) throws UnknownHostException;
+    InetAddress getByName(String host) throws UnknownHostException;
+  }
+
+  static abstract class CacheableHostResolver implements HostResolver {
+    private volatile LoadingCache<String, InetAddress> cache;
+
+    CacheableHostResolver(int expiryIntervalSecs) {
+      if (expiryIntervalSecs > 0) {
+        cache = CacheBuilder.newBuilder()
+            .expireAfterWrite(expiryIntervalSecs, TimeUnit.SECONDS)
+            .build(new CacheLoader<String, InetAddress>() {
+              @Override
+              public InetAddress load(String key) throws Exception {
+                return resolve(key);
+              }
+            });
+      }
+    }
+    protected abstract InetAddress resolve(String host) throws UnknownHostException;
+
+    @Override
+    public InetAddress getByName(String host) throws UnknownHostException {
+      if (cache != null) {
+        try {
+          return cache.get(host);
+        } catch (Exception e) {
+          Throwable cause = e.getCause();
+          if (cause instanceof UnknownHostException) {
+            throw (UnknownHostException) cause;
+          }
+          throw new UnknownHostException("Error resolving host " + host +
+              ": " + cause.getMessage());

Review Comment:
   Although the probability is very low, it's still possible to encounter an NPE when calling cause.getMessage(). I think we should guarantee that won't happen.

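One possible way to address the review comment, offered only as a sketch: exceptions thrown by a Guava LoadingCache normally wrap the loader's failure as their cause, so a null cause should be rare, but the catch block in the diff accepts any Exception and therefore cannot rule it out. The helper below is hypothetical (the class name `ResolveFailures` and the method `toUnknownHost` are not part of SecurityUtil or the PR). It falls back to the caught exception itself when there is no cause, and it builds the message from `toString()` so a null message cannot trigger an NPE either.

```java
import java.net.UnknownHostException;

/**
 * Hypothetical helper (not part of SecurityUtil or PR #7527): converts a
 * cache-load failure into an UnknownHostException without ever
 * dereferencing a null cause.
 */
final class ResolveFailures {
  private ResolveFailures() {
  }

  static UnknownHostException toUnknownHost(String host, Exception e) {
    Throwable cause = e.getCause();
    if (cause instanceof UnknownHostException) {
      // The loader failed with the expected exception type: return it as-is.
      return (UnknownHostException) cause;
    }
    // Fall back to the caught exception when it carries no cause.
    Throwable root = (cause != null) ? cause : e;
    // String concatenation uses root.toString(), which is safe even when
    // root.getMessage() is null.
    UnknownHostException uhe =
        new UnknownHostException("Error resolving host " + host + ": " + root);
    uhe.initCause(root);
    return uhe;
  }
}
```

Under this assumption, the catch block in getByName would reduce to `throw ResolveFailures.toUnknownHost(host, e);`. Attaching the root exception with initCause also keeps the original stack trace available to callers.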
> Add Caching Mechanism to HostResolver to Avoid Redundant Hostname Resolutions
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-19447
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19447
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: common, yarn
>            Reporter: Jiandan Yang
>            Priority: Major
>              Labels: pull-request-available
>
> *Background:*
>
> Currently, the two implementations of
> org.apache.hadoop.security.SecurityUtil.HostResolver, *StandardHostResolver
> and QualifiedHostResolver*, perform hostname resolution each time they are
> called. *Each heartbeat between the AM and RM causes the RM to invoke the*
> HostResolver#getByName *method once*. In large-scale clusters running
> numerous applications, this results in *a high frequency of redundant
> hostname resolutions.*
>
> *Proposal:*
>
> Introduce a caching mechanism in HostResolver to store resolved hostnames for
> a configurable duration. This would:
> • Reduce redundant DNS queries.
> • Improve performance for frequently used hostnames.
> • Allow configuration options for cache size and TTL (Time-to-Live).
>
> *Suggested Implementation:*
> 1. *Leverage Existing CachedResolver*:
> The NodesListManager.CachedResolver class in Hadoop already implements a
> caching mechanism for hostname resolution. Instead of introducing an entirely
> new solution, we propose *extracting the caching logic from*
> NodesListManager.CachedResolver *into a separate reusable utility class*.
> 2. *Create a Shared Caching Utility*:
> • Extract the caching logic from NodesListManager.CachedResolver.
> • Implement a new class, e.g., HostnameCache, and place it in the Hadoop
> Common module to ensure it can be used across different components.
> 3. *Integrate* HostnameCache with *HostResolver & QualifiedHostResolver*:
> • Modify HostResolver to use HostnameCache for hostname lookups.
> • Update NodesListManager.CachedResolver to use HostnameCache instead of its
> own internal cache.
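
The proposal in the issue description (cache resolved hostnames for a configurable TTL) can be illustrated with a minimal, standard-library-only sketch. This is not the HostnameCache from the issue, nor the Guava-based CacheableHostResolver in the PR: the class name `TtlHostCache`, its nested `Resolver` interface, and the injectable nanosecond clock are all assumptions made for illustration. It also omits the cache-size limit mentioned in the proposal and does not cache failed lookups.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical TTL cache for hostname lookups (illustration only). */
final class TtlHostCache {
  /** Stand-in for the real resolution step, e.g. a HostResolver. */
  interface Resolver {
    InetAddress resolve(String host) throws UnknownHostException;
  }

  private static final class Entry {
    final InetAddress address;
    final long expiresAtNanos;

    Entry(InetAddress address, long expiresAtNanos) {
      this.address = address;
      this.expiresAtNanos = expiresAtNanos;
    }
  }

  private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
  private final Resolver resolver;
  private final long ttlNanos;
  private final LongSupplier nanoClock;

  /** Pass System::nanoTime as the clock in production; tests can inject a fake. */
  TtlHostCache(Resolver resolver, long ttlNanos, LongSupplier nanoClock) {
    this.resolver = resolver;
    this.ttlNanos = ttlNanos;
    this.nanoClock = nanoClock;
  }

  InetAddress getByName(String host) throws UnknownHostException {
    long now = nanoClock.getAsLong();
    Entry cached = entries.get(host);
    // Compare via subtraction so the check stays correct if nanoTime wraps.
    if (cached != null && now - cached.expiresAtNanos < 0) {
      return cached.address;
    }
    // Miss or expired: resolve again. Concurrent callers may resolve the
    // same host more than once; the last write wins. A failed resolution
    // propagates to the caller and is not cached.
    InetAddress fresh = resolver.resolve(host);
    entries.put(host, new Entry(fresh, now + ttlNanos));
    return fresh;
  }
}
```

A caller might construct it as `new TtlHostCache(InetAddress::getByName, TimeUnit.SECONDS.toNanos(60), System::nanoTime)`. With repeated heartbeats from the same host, only the first lookup in each TTL window would reach the resolver.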