Chris Nauroth commented on HADOOP-13726:

The file system cache employs an optimistic locking algorithm. Multiple 
concurrent threads might create and initialize multiple instances without lock 
coordination. Then, while holding the lock, the thread checks if another thread 
won the race and put an instance into the cache while this thread was busy 
initializing the file system. If so, it uses the cached instance and discards 
the one it just initialized itself.

    private FileSystem getInternal(URI uri, Configuration conf, Key key) throws 
      FileSystem fs;
      synchronized (this) {
        fs = map.get(key);
      if (fs != null) {
        return fs;

      fs = createFileSystem(uri, conf);
      synchronized (this) { // refetch the lock again
        FileSystem oldfs = map.get(key);
        if (oldfs != null) { // a file system is created while lock is releasing
          fs.close(); // close the new file system
          return oldfs;  // return the old file system

An important consequence of this algorithm is that even though all threads 
ultimately get the same shared instance, it's still possible that multiple 
concurrent threads are attempting the {{getInternal}} operation, so they could 
all be calling {{FileSystem#initialize}}.  Depending on the file system 
implementation, this can be an expensive operation.

We can eliminate this race condition by using techniques similar to the 
NameNode RPC {{RetryCache}}.  If multiple threads simultaneously try to get a 
{{FileSystem}} with the same cache key, then the first thread proceeds into 
{{FileSystem#initialize}}.  All other threads enter a wait set, blocked on 
completion of the first thread.  After the first thread completes 
initialization, it notifies all members of the wait set.  All threads are 
returned the same initialized instance.

The current logic traces back to HADOOP-6640.  Before that change, the locking 
was coarser, so something like a slow NameNode RPC connection with a lot of 
retries could block initialization of all file systems.  The change I'm 
proposing here would not cause a regression of HADOOP-6640.  It is only 
intended to prevent redundant initialization.

> Enforce that FileSystem initializes only a single instance of the requested 
> FileSystem.
> ---------------------------------------------------------------------------------------
>                 Key: HADOOP-13726
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13726
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Chris Nauroth
> The {{FileSystem}} cache is intended to guarantee reuse of instances by 
> multiple call sites or multiple threads.  The current implementation does 
> provide this guarantee, but there is a brief race condition window during 
> which multiple threads could perform redundant initialization.  If the file 
> system implementation has expensive initialization logic, then this is 
> wasteful.  This issue proposes to eliminate that race condition and guarantee 
> initialization of only a single instance.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to