Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14846 )

Change subject: IMPALA-9195: Using multithreaded execution to accelerate 'show 
tables/databases'
......................................................................


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/14846/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14846/1//COMMIT_MSG@14
PS1, Line 14: This patch enables MT_DOP query option to accelerate 'show 
tables/databases'
> Let's not overload mt_dop. mt_dop should only be about query execution, not
+1 to this comment above. Lets not use mt_dop for this.


http://gerrit.cloudera.org:8080/#/c/14846/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/14846/1/fe/src/main/java/org/apache/impala/service/Frontend.java@a778
PS1, Line 778:
             :
             :
Looks like this check is being removed in the code. Is that intentional?


http://gerrit.cloudera.org:8080/#/c/14846/1/fe/src/main/java/org/apache/impala/service/Frontend.java@799
PS1, Line 799:  poolSize = mtDop > 0 ? mtDop
Instead of using mt_dop to get the poolsize, I think it would be good to have a 
separate config to determine the pool size. Also, since getDbs and 
getTableNames calls are so frequent (I have seen systems where you can have 
10-20 of such calls in parallel at a time), may be it would be good to use a 
single static thread pool which can accept such tasks. Otherwise, in the the 
example above, we may have 10*10 threads doing the checkAccess calls if the 
pool size is 10.


http://gerrit.cloudera.org:8080/#/c/14846/1/fe/src/main/java/org/apache/impala/service/Frontend.java@977
PS1, Line 977:
             :         // Wait for the db checking tasks to be finished.
             :         int failedCheckTasks = 0;
             :         int index = 0;
             :         iter = dbs.iterator();
             :         while (iter.hasNext()) {
             :           iter.next();
             :           try {
             :             if (!pendingCheckingTasks.get(index).get())
             :               iter.remove();
             :             index ++;
             :           } catch (ExecutionException | InterruptedException e) {
             :             failedCheckTasks ++;
             :             LOG.error("Encountered an error checking access for 
dbs", e);
             :           }
             :         }
It looks like a lot of this code is duplicated. Can we refactor this into a 
method which can be called from both getTableNames and getDbs?


http://gerrit.cloudera.org:8080/#/c/14846/1/fe/src/main/java/org/apache/impala/service/Frontend.java@1032
PS1, Line 1032: hrow new IllegalArgumentException("Params dbName should not be 
null.")
Instead of this, may be its better to have a Preconditions.checkNotNull(dbName) 
at line 1010.



--
To view, visit http://gerrit.cloudera.org:8080/14846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96
Gerrit-Change-Number: 14846
Gerrit-PatchSet: 1
Gerrit-Owner: Zhou Xu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Comment-Date: Wed, 11 Dec 2019 00:50:11 +0000
Gerrit-HasComments: Yes

Reply via email to