[ 
https://issues.apache.org/jira/browse/HADOOP-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Praveen Krishna updated HADOOP-16114:
-------------------------------------
    Description: 
NetUtils#canonicalizeHost uses ConcurrentHashMap#putIfAbsent to add an entry 
to the cache:
{code:java}
  private static String canonicalizeHost(String host) {
    // check if the host has already been canonicalized
    String fqHost = canonicalizedHostCache.get(host);
    if (fqHost == null) {
      try {
        fqHost = SecurityUtil.getByName(host).getHostName();
        // slight race condition, but won't hurt
        canonicalizedHostCache.putIfAbsent(host, fqHost);
      } catch (UnknownHostException e) {
        fqHost = host;
      }
    }
    return fqHost;
  }
{code}
 

If two threads invoke this method for the same host for the first time (so the 
cache has no entry for it yet) and SecurityUtil#getByName()#getHostName returns 
two different values for that host, only one fqHost is added to the cache, and 
the other thread is handed an fqHost that does not match the cached one. This 
can cause some APIs, such as `FileSystem#checkPath`, to fail on the first call 
even if the path is in the given file system. It might be better to modify the 
method as follows:

 
{code:java}
  private static String canonicalizeHost(String host) {
    // check if the host has already been canonicalized
    String fqHost = canonicalizedHostCache.get(host);
    if (fqHost == null) {
      try {
        fqHost = SecurityUtil.getByName(host).getHostName();
        // another thread may already have cached a different name
        canonicalizedHostCache.putIfAbsent(host, fqHost);
        // re-read so that every caller returns the cached value
        fqHost = canonicalizedHostCache.get(host);
      } catch (UnknownHostException e) {
        fqHost = host;
      }
    }
    return fqHost;
  }
{code}
 

So even if another thread resolves a different host name, every caller gets 
the cached value.
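
The same guarantee can also be had without the second lookup, because 
putIfAbsent returns the value already in the map (or null if there was none). 
The following is only a minimal sketch of that idea, not the Hadoop code: the 
class name CanonicalHostCache and the pluggable resolver function are 
hypothetical, with the resolver standing in for 
SecurityUtil.getByName(host).getHostName().

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical stand-in for the cache inside NetUtils; not part of Hadoop.
final class CanonicalHostCache {
  private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
  // Assumption: wraps whatever performs the actual name resolution.
  private final Function<String, String> resolver;

  CanonicalHostCache(Function<String, String> resolver) {
    this.resolver = resolver;
  }

  String canonicalize(String host) {
    // check if the host has already been canonicalized
    String fqHost = cache.get(host);
    if (fqHost == null) {
      String resolved = resolver.apply(host);
      // putIfAbsent returns the existing mapping, or null if ours was stored
      String previous = cache.putIfAbsent(host, resolved);
      // if another caller won the race, return its value instead of ours
      fqHost = (previous != null) ? previous : resolved;
    }
    return fqHost;
  }
}
```

With this shape, a thread that loses the race returns the winner's value, so 
all callers agree on one name per host from the first call on.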


> NetUtils#canonicalizeHost gives different value for same host
> -------------------------------------------------------------
>
>                 Key: HADOOP-16114
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16114
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: net
>    Affects Versions: 2.7.6, 3.1.2
>            Reporter: Praveen Krishna
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

