[ 
https://issues.apache.org/jira/browse/HIVE-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008328#comment-13008328
 ] 

MIS commented on HIVE-2051:
---------------------------

Yes, the executor must be shut down once jobs have been submitted to it, 
even if all of the submitted jobs have already completed. 

However, after shutting the executor down we do not need to wait for 
termination, since that wait is redundant: every job submitted to the 
executor has already completed by the time we shut it down. This is 
guaranteed by the calls to result.get().
In other words, the following piece of code is not required:
+      do {
+        try {
+          executor.awaitTermination(Integer.MAX_VALUE, TimeUnit.SECONDS);
+          executorDone = true;
+        } catch (InterruptedException e) {
+        }
+      } while (!executorDone);
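
To illustrate the point, here is a minimal sketch (not the code from the 
patch) of the pattern I have in mind. The class name, sumInParallel() and 
fakeSummary() are hypothetical; fakeSummary() only stands in for the real 
FileSystem.getContentSummary() call. Since get() is called on every Future 
before the finally block runs, all tasks have finished by then, so a plain 
shutdown() should be enough to release the worker threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSummarySketch {

  // Hypothetical stand-in for the per-path summary lookup; it just
  // returns the length of the path string.
  static long fakeSummary(String path) {
    return path.length();
  }

  // Submits one task per path, waits for every result with get(),
  // and then shuts the executor down without awaitTermination().
  static long sumInParallel(List<String> paths, int threads) {
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Long>> results = new ArrayList<Future<Long>>();
      for (final String path : paths) {
        Callable<Long> task = () -> fakeSummary(path);
        results.add(executor.submit(task));
      }
      long total = 0;
      for (Future<Long> result : results) {
        // get() blocks until this task has completed.
        total += result.get();
      }
      return total;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    } catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    } finally {
      // Every submitted task is done (or we are bailing out), so no
      // awaitTermination() loop is needed here.
      executor.shutdown();
    }
  }
}
```

Note that shutdown() is still needed; without it the pool's idle worker 
threads would stay alive.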

> getInputSummary() to call FileSystem.getContentSummary() in parallel
> --------------------------------------------------------------------
>
>                 Key: HIVE-2051
>                 URL: https://issues.apache.org/jira/browse/HIVE-2051
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>            Priority: Minor
>         Attachments: HIVE-2051.1.patch, HIVE-2051.2.patch, HIVE-2051.3.patch, 
> HIVE-2051.4.patch
>
>
> getInputSummary() now calls FileSystem.getContentSummary() one by one, which 
> can be extremely slow when the number of input paths is huge. By calling 
> those functions in parallel, we can cut latency in most cases.

