Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11182 )

Change subject: IMPALA_7127 (continued): initial fetch-from-catalogd 
implementation
......................................................................


Patch Set 1:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/exec/catalog-op-executor.cc
File be/src/exec/catalog-op-executor.cc:

http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/exec/catalog-op-executor.cc@289
PS1, Line 289:   const TNetworkAddress& address =
> DCHECK(use_local_catalog) ? Also, generally I think we should have asserts
Sure, will do. I don't want to add such assertions on the server side, however, 
since it would be nice to be able to isolate the configuration to a minimum 
number of nodes


http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/exec/catalog-op-executor.cc@295
PS1, Line 295:       
client.DoRpc(&CatalogServiceClientWrapper::GetPartialCatalogObject, req, resp));
> For future: We should probably add some instrumentation around this RPC for
agreed. Something like OpenCensus's statistics and tags would be very useful 
here, but may be too much effort to try to incorporate that. Will certainly 
plan to do something simple at least for v1. I'll add a TODO.


http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

http://gerrit.cloudera.org:8080/#/c/11182/1/be/src/service/fe-support.cc@539
PS1, Line 539: if (!status.ok()) {
> Is this redundant? Looks like THROW_IF_ERROR_RET checks the same. or may be
oh, yea... I was debugging for a while with some LOG statements in here and the 
if() got left over.


http://gerrit.cloudera.org:8080/#/c/11182/1/common/thrift/CatalogService.thrift
File common/thrift/CatalogService.thrift:

http://gerrit.cloudera.org:8080/#/c/11182/1/common/thrift/CatalogService.thrift@239
PS1, Line 239:   1: bool want_db_names
> Would this ever be false? Tracking the callers, it does not appear so to me
In the current patch, it's always true, but this is probably where things like 
listing DataSources, roles, etc, would slot in. I'll add a TODO to make that 
clear


http://gerrit.cloudera.org:8080/#/c/11182/1/common/thrift/CatalogService.thrift@246
PS1, Line 246: Selector
> Just curious if we should use 'request'/'response' terminology to make it m
I think that could be confusing since we usually use Request/Response 
specifically as the argument and return type of an RPC, whereas these are 
fields _within_ said RPC. If you feel strongly I'll change it though.


http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1828
PS1, Line 1828:   public TGetPartialCatalogObjectResponse 
getPartialCatalogObject(
> Commented elsewhere too, wondering if these should go into a V2 implementat
I'd rather not complicate the inheritance hierarchy here -- I think the code 
would get hard to follow, plus that woudl mean we'd need to configure this on 
both the impalad and the catalogd instead of just the impalad.


http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1879
PS1, Line 1879: dbCache_
> Don't we require a versionLock_.readLock() for this?
Not according to getAllTableNames() elsewhere in this class -- I think the 
dbCache is an atomic reference to a concurrent map, so the keyset is safe to 
iterate on with concurrent mutation.


http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1705
PS1, Line 1705: partId
> It looks like we are relying on the premise that the partId of partition do
I think it's the opposite -- we are _hoping_ the partID changes if the contents 
of the partition object change. This is usually the case, though it looks like 
there are a few cases where partitions are mutated without changing their ID 
(eg ALTER PARTITION SET FILEFORMAT).

The idea here is that, if you reload a partition, we'll assign a new partID, 
and also bump the version number of the table. That means that, when we add 
caching, the new table version number will ensure that we call 'listPartitions' 
again. If we previously had partIds [1,2,3,4,5], now we'll have [1,2,3,4,6]. 
That will force us to get a cache miss on partId 6 and load the new partition, 
while we can still get cache hits on all the previously-loaded partitions.


http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11182/1/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@101
PS1, Line 101:           "Catalog object %s changed version from %s to %s while 
fetching metadata",
> nit: This should probably also include something that makes sense to the us
Per the commit message and also I think in the javadoc of this exception type, 
I was planning to make this exception get caught by the frontend and trigger a 
re-plan, so this should never bubble up to the user level.



--
To view, visit http://gerrit.cloudera.org:8080/11182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If49207fc592b1cc552fbcc7199568b6833f86901
Gerrit-Change-Number: 11182
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>
Gerrit-Comment-Date: Mon, 13 Aug 2018 16:45:52 +0000
Gerrit-HasComments: Yes

Reply via email to