Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11182 )

Change subject: IMPALA_7127 (continued): initial fetch-from-catalogd 
implementation
......................................................................


Patch Set 1:

(13 comments)

I've spent some time on the patch. I think I understand the interfaces/thrift 
changes now but I still need to dig deeper. Publishing some high-level comments 
(and some nits). Will spend more time on this.

http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/catalog/catalog.h
File be/src/catalog/catalog.h:

http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/catalog/catalog.h@79
PS1, Line 79:
nit: formatting.


http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/catalog/catalog.cc@95
PS1, Line 95:
            :
nit: can be formatted in 2 lines.


http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/exec/catalog-op-executor.h
File be/src/exec/catalog-op-executor.h:

http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/exec/catalog-op-executor.h@55
PS1, Line 55:
nit: formatting.


http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/exec/catalog-op-executor.cc
File be/src/exec/catalog-op-executor.cc:

http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/exec/catalog-op-executor.cc@289
PS1, Line 289:   const TNetworkAddress& address =
DCHECK(use_local_catalog) ? Also, generally I think we should have asserts that 
the new 'partial' API is not called from the clients using the current 
implementation of the CatalogService.


http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/exec/catalog-op-executor.cc@295
PS1, Line 295:       
client.DoRpc(&CatalogServiceClientWrapper::GetPartialCatalogObject, req, resp));
For future: We should probably add some instrumentation around this RPC for 
supportability. Count by table/object/total, RPC times etc.


http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/service/fe-support.cc@539
PS1, Line 539: if (!status.ok()) {
Is this redundant? Looks like THROW_IF_ERROR_RET checks the same. or may be 
just return nullptr?


http://gerrit.cloudera.org:8080/#/c/11182/1/common/thrift/CatalogService.thrift
File common/thrift/CatalogService.thrift:

http://gerrit.cloudera.org:8080/#/c/11182/1/common/thrift/CatalogService.thrift@239
PS1, Line 239:   1: bool want_db_names
Would this ever be false? Tracking the callers, it does not appear so to me.


http://gerrit.cloudera.org:8080/#/c/11182/1/common/thrift/CatalogService.thrift@246
PS1, Line 246: Selector
Just curious if we should use 'request'/'response' terminology to make it more 
clear. Something like TPartitionTableInfoRequest and TPartialTableInfoResp or 
something like that.


http://gerrit.cloudera.org:8080/#/c/11182/1/common/thrift/CatalogService.thrift@313
PS1, Line 313: optional Status.TStatus status
             :
             :   2: optional i64 object_version_number
Should these be required always?


http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1828
PS1, Line 1828:   public TGetPartialCatalogObjectResponse 
getPartialCatalogObject(
Commented elsewhere too, wondering if these should go into a V2 implementation 
of the CatalogServiceCatalog that supports partial requests/responses? (which 
only gets instantiated with use_local_catalog=true..)


http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1879
PS1, Line 1879: dbCache_
Don't we require a versionLock_.readLock() for this?


http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1705
PS1, Line 1705: partId
It looks like we are relying on the premise that the partId of partition does 
not change over its lifetime.  I think there are cases when that is not true. 
Forex: reloadPartition() [1] does a drop() and add(). createPartition() creates 
the newId for that partition.


[1] 
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2066


http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@101
PS1, Line 101:           "Catalog object %s changed version from %s to %s while 
fetching metadata",
nit: This should probably also include something that makes sense to the user. 
Something like the partition (a=1) changed while the query is in progress...



--
To view, visit http://gerrit.cloudera.org:8080/11182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If49207fc592b1cc552fbcc7199568b6833f86901
Gerrit-Change-Number: 11182
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>
Gerrit-Comment-Date: Mon, 13 Aug 2018 02:19:39 +0000
Gerrit-HasComments: Yes

Reply via email to