Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23436 )

Change subject: IMPALA-14447: Parallelize table loading in getMissingTables()
......................................................................


Patch Set 16:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/23436/16/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/23436/16/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@368
PS16, Line 368:     if (LOG.isTraceEnabled()) {
              :       LOG.debug("Parallel load {}", tbls);
              :     }
nit: do you want it to be in TRACE level or DEBUG level? We check 
isTraceEnabled() but use LOG.debug() here. On the other hand, we don't need the 
check if constructing the parameters is not expensive. Here we pass 'tbls' 
directly so it's not expensive.

nit: can we also log 'maxThreads'?


http://gerrit.cloudera.org:8080/#/c/23436/16/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@409
PS16, Line 409: ForkJoinPool
nit: "ForkJoinPool" is stale


http://gerrit.cloudera.org:8080/#/c/23436/16/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@436
PS16, Line 436:         Math.min(tbls.size(), 
BackendConfig.INSTANCE.getMaxStmtMetadataLoaderThreads());
We don't need multi-threads here in the legacy catalog mode where the local 
catalog operations are pretty fast. Creating threads has some overhead so it'd 
be nice to set maxThreads to 1 in legacy catalog mode.


http://gerrit.cloudera.org:8080/#/c/23436/16/fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
File 
fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java:

http://gerrit.cloudera.org:8080/#/c/23436/16/fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java@400
PS16, Line 400:     try (FrontendProfile.Scope scope = 
FrontendProfile.createNewWithScope()) {
              :       profile = FrontendProfile.getCurrent();
              :       assertNotNull(profile);
              :       tables.parallelStream().forEach(t -> {
nit: It's low priority but can we use the same code in StmtMetadataLoader? Or 
do we have other tests that check the profile for a given list of tables?



--
To view, visit http://gerrit.cloudera.org:8080/23436
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I97a5165844ae846b28338d62e93a20121488d79f
Gerrit-Change-Number: 23436
Gerrit-PatchSet: 16
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Steve Carlin <[email protected]>
Gerrit-Comment-Date: Thu, 09 Oct 2025 14:10:55 +0000
Gerrit-HasComments: Yes

Reply via email to