Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8752 )

Change subject: IMPALA-5058: Improve the concurrency of DDL/DML operations
......................................................................


Patch Set 1:

(25 comments)

Here's a first wave of comments. Still thinking about some places in more 
detail.

http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h@328
PS1, Line 328:   /// waits all other catalog topic subscribers to process this 
update by checking the
waits for


http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h@339
PS1, Line 339:   void WaitForMinCatalogUpdate(const int64_t 
min_catalog_update_version,
Add comment


http://gerrit.cloudera.org:8080/#/c/8752/1/be/src/service/impala-server.h@344
PS1, Line 344:   void WaitForCatalogUpdateTopicPropagation(const TUniqueId& 
catalog_service_id);
Let's pass in the version to keep these Wait() APIs consistent.


http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/CatalogInternalService.thrift
File common/thrift/CatalogInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/CatalogInternalService.thrift@39
PS1, Line 39:   // List of updated (new and modified) catalog objects for which 
the catalog verion is
catalog objects whose version is larger than ...


http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/CatalogInternalService.thrift@43
PS1, Line 43:   // List of deleted catalog objects for which the catalog 
version is larger than
catalog objects whose version is larger than ...


http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/StatestoreService.thrift
File common/thrift/StatestoreService.thrift:

http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/StatestoreService.thrift@83
PS1, Line 83:   // subscribers need additional information in order to process 
the deleted topics that
This is needed when subscribers require additional information not captured in 
the item key to process the deleted item (e.g., catalog version of deleted 
catalog object).


http://gerrit.cloudera.org:8080/#/c/8752/1/common/thrift/StatestoreService.thrift@88
PS1, Line 88:   // non-delta TTopicDelta's (since the latest version of every 
still-present topic will
* Remove parenthesis since the explanation is important
* Do you mean "still-present item" instead of "still-present topic"?

The sentence is somewhat confusing to me, maybe I'm misunderstanding. A 
non-delta update must include all items so it should also include this one, 
right?


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java
File fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java:

http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java@91
PS1, Line 91:       for (RolePrivilege priv: existingRole.getPrivileges()) {
might simplify some places to have bulk versions of the add/remove calls like 
addAll and removeAll that take a collection of CatalogObjects


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@544
PS1, Line 544:       throws IllegalStateException {
remove throws since it's a RuntimeException


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@577
PS1, Line 577:    * TODO: Use global object IDs everywhere instead of tracking 
catalog objects by
Seems like a pretty big TODO unlikely to get addressed. I suggest removing it.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@580
PS1, Line 580:   public static boolean objectNamesMatch(TCatalogObject first, 
TCatalogObject second) {
keyEquals() or similar since we are actually checking for key equality and not 
"object name" equality


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java:

http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@37
PS1, Line 37:  *   direct ("fast") update that applies the result of a catalog 
operation to the cache
remove ("fast")


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@53
PS1, Line 53:  *   The catalogd uses this log to identify deleted catalog 
objects. Deleted
... to identify objects that have been deleted since the last catalog topic 
update.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@55
PS1, Line 55:  *   them (e.g. dropTable()). While constructing a catalog update 
topic, we use the log to
Sentence in the middle with "While" can be removed imo. The first sentence 
already gives this info.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java@57
PS1, Line 57:  *   Once the catalog topic update is constructed, the old 
deleted catalog objects are
the old deleted catalog objects -> the old entries in this log


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java
File fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java:

http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java@28
PS1, Line 28:   private AtomicLong catalogVersion_ = new 
AtomicLong(Catalog.INITIAL_CATALOG_VERSION);
Not clear why this has to be atomic, but let's think about that outside of this 
patch. Commenting here so we don't forget.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java
File fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java:

http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java@39
PS1, Line 39:   public void updateCatalogObjectVersions(long oldVersion, long 
newVersion) {
Method names seem verbose, how about just add/remove/update


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java@53
PS1, Line 53:   public long getMinimumObjectVersion() {
getMinimumVersion()


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@95
PS1, Line 95:  * CatalogServiceCatalog guarantees that very catalog object that 
has a version in the
typo: that every

Sentence does not seem accurate. I think we guarantee that all objects included 
in the update are within that version range - but we don't necessarily include 
*all* such objects. Some objects are skipped to allow the topic update thread 
to make progress.

We should also mention that the topic update itself is assigned a version (the 
"to" version).


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@97
PS1, Line 97:  * does not prevent DDL requests from making progress while the 
catalog topic update is
Let's phrase this positively: Concurrent DDL requests are allowed while a topic 
update is in progress.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@98
PS1, Line 98:  * created. Hence, there is a non-zero probability that 
frequently modified catalog
Concept of skipping has not been introduced, need to discuss that above when 
talking about the nuances of version range, etc.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@103
PS1, Line 103:  * higher than the 'to' version of the topic update.
As a result, the same version of an object might be sent in two subsequent 
topic updates.


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@127
PS1, Line 127:  *   The time-based cleanup process of the topic update log 
entries may cause metadata
Give this section a title like "Known Anomalies with SYNC_DDL"


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@138
PS1, Line 138:  *   waitFroSyncDdlVersion() for more details.
typo: waitForSyncDdlVersion()


http://gerrit.cloudera.org:8080/#/c/8752/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@224
PS1, Line 224:   private class TopicUpdateLog {
Can you move this into a separate .java file. This one is already getting huge.



--
To view, visit http://gerrit.cloudera.org:8080/8752
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If12467a83acaeca6a127491d89291dedba91a35a
Gerrit-Change-Number: 8752
Gerrit-PatchSet: 1
Gerrit-Owner: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Wed, 06 Dec 2017 00:28:49 +0000
Gerrit-HasComments: Yes

Reply via email to