Vihang Karajgaonkar created HIVE-15879:
------------------------------------------

             Summary: Fix HiveMetaStoreChecker.checkPartitionDirs method
                 Key: HIVE-15879
                 URL: https://issues.apache.org/jira/browse/HIVE-15879
             Project: Hive
          Issue Type: Bug
            Reporter: Vihang Karajgaonkar
            Assignee: Vihang Karajgaonkar


HIVE-15803 fixes the msck hang issue in the 
HiveMetaStoreChecker.checkPartitionDirs method by checking whether the thread 
pool has any spare threads. If it does not, the method falls back to 
single-threaded listing of the files.

{noformat}
    if (pool != null) {
      synchronized (pool) {
        // In case of recursive calls, it is possible to deadlock with TP.
        // Check TP usage here.
        if (pool.getActiveCount() < pool.getMaximumPoolSize()) {
          useThreadPool = true;
        }

        if (!useThreadPool) {
          if (LOG.isDebugEnabled()) {
            LOG.debug("Not using threadPool as active count:" + pool.getActiveCount()
                + ", max:" + pool.getMaximumPoolSize());
          }
        }
      }
    }
{noformat}

The Javadoc of getActiveCount() states:
bq. Returns the approximate number of threads that are actively executing tasks.

The returned value is only approximate, so there is no guarantee that it 
matches the exact number of active threads. The method implementation is 
therefore still exposed to the msck hang bug in rare corner cases.

We could either:
1. Use an atomic counter to track exactly how many threads are actively running.
2. Revisit the method itself to make it much simpler, for example by changing 
the recursive implementation to an iterative one in which worker threads pick 
tasks from a queue until the queue is empty.
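
As a rough illustration of the second option, the sketch below lists 
directories iteratively: a fixed set of workers pull directories from a shared 
queue, and an exact AtomicInteger count of outstanding work items (rather than 
getActiveCount()) decides when to stop. This is not the Hive implementation. 
The class IterativeDirLister, the method listLeafDirs, and the use of 
java.io.File in place of the Hadoop FileSystem API are all assumptions made to 
keep the example self-contained.

```java
import java.io.File;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the partition directory walk; not Hive code.
final class IterativeDirLister {

  /** A directory waiting to be listed, with its depth below the root. */
  private static final class Node {
    final File dir;
    final int depth;
    Node(File dir, int depth) { this.dir = dir; this.depth = depth; }
  }

  /** Returns the paths of all directories exactly maxDepth levels below root. */
  static Set<String> listLeafDirs(File root, int maxDepth, int numThreads) {
    final BlockingQueue<Node> queue = new LinkedBlockingQueue<>();
    final Set<String> result = ConcurrentHashMap.newKeySet();
    // Exact number of directories that are queued or currently being listed.
    final AtomicInteger pending = new AtomicInteger(1);
    queue.add(new Node(root, 0));

    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    for (int i = 0; i < numThreads; i++) {
      // Workers never submit tasks back to the pool, so they cannot
      // starve each other the way nested recursive submissions can.
      pool.execute(() -> {
        try {
          while (pending.get() > 0) {
            Node node = queue.poll(50, TimeUnit.MILLISECONDS);
            if (node == null) {
              continue; // queue momentarily empty; re-check pending
            }
            try {
              process(node, maxDepth, queue, result, pending);
            } finally {
              // Decrement only after children were counted and enqueued.
              pending.decrementAndGet();
            }
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    pool.shutdown();
    try {
      pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while listing " + root, e);
    }
    return result;
  }

  private static void process(Node node, int maxDepth, BlockingQueue<Node> queue,
      Set<String> result, AtomicInteger pending) {
    if (node.depth == maxDepth) {
      result.add(node.dir.getPath());
      return;
    }
    File[] children = node.dir.listFiles(File::isDirectory);
    if (children == null) {
      return; // not a directory, or an I/O error occurred
    }
    for (File child : children) {
      pending.incrementAndGet(); // count the child before it becomes visible
      queue.add(new Node(child, node.depth + 1));
    }
  }
}
```

Because each child is counted before its parent is marked as done, the counter 
can only reach zero once every directory has been processed, which gives the 
workers a reliable termination condition without consulting the pool's 
approximate statistics.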



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
