Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8545 )

Change subject: [PREVIEW] IMPALA-5058: Improve concurrency of DDL/DML operations
......................................................................


Patch Set 2:

(22 comments)

The renames make the code a lot easier to read, thank you!

http://gerrit.cloudera.org:8080/#/c/8545/2/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/8545/2/be/src/service/client-request-state.cc@187
PS2, Line 187:       reset_req.reset_metadata_params.__set_sync_ddl(
Why do we need to set sync_ddl twice?


http://gerrit.cloudera.org:8080/#/c/8545/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8545/2/be/src/service/impala-server.cc@1430
PS2, Line 1430:         LOG(INFO) << "Update applied with version: " << 
new_catalog_version
Catalog update applied...


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionManager.java
File 
fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionManager.java:

http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionManager.java@29
PS2, Line 29: public class CatalogObjectVersionManager {
Name is not very telling. Some alternatives:

CatalogObjectVersionQueue
CatalogObjectVersions
ActiveCatalogObjectVersions
CurrentCatalogObjectVersions


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionManager.java@37
PS2, Line 37:   public void updateCatalogObjectVersions(long oldVersion, long 
newVersion) {
Do you think it's worth checking the return codes of the calls to 
objectVersions_ to make sure we are in a sane state?

For example, I assume that in this call we expect 'oldVersion' to exist. Is it 
ok to add newVersion if the old version does not exist?

Do we expect add/remove to actually add/remove something? Does it matter?


http://gerrit.cloudera.org:8080/#/c/8545/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/8545/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@606
PS1, Line 606:    * Get a snapshot view of all the functions in a database.
> There is one writer (thread doing getCatalogDelta) and multiple readers (th
Ah right, got it.


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@86
PS2, Line 86:  * and processing of DDL requests. For each DDL request, the 
CatalogServiceCatalog
This description of the versioning scheme is not accurate anymore.

I think the versioning, topic update, and skipping mechanics deserve to be 
described in more detail here.


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@96
PS2, Line 96:  * - Delete log. The delete log records catalog objects that have 
been removed from the
Let's give an intuition for the purpose of this log, e.g.:

Since deleted objects are removed from the cache, the cache itself is not 
useful for tracking deletions. This log is used for populating the list of 
deleted objects during a topic update.


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@97
PS2, Line 97:  *   catalog. An entry is added to this log every time an object 
is removed (e.g.
An entry with a new version is added ...


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@99
PS2, Line 99:  *   be perfomed atomically. An entry is removed from this log 
from the topic update
removed from this log by the topic update thread


(also typo: performed)


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@102
PS2, Line 102:  *   been processed by the topic update thread. The topic update 
thread is the only one
that have been included in a topic update.

(The current sentence makes it sound like there is one single topic update 
thread, but really it's probably a different thread every time).


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@105
PS2, Line 105:  *   Information per catalog object includes the number of times 
a catalog object has
Each entry includes the number of times a catalog object has been omitted in a 
topic update ...


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@111
PS2, Line 111:  *   version that the issueing impalad must wait for in order to 
ensure that the effects
typo: issuing


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@112
PS2, Line 112:  *   of this operation have been broadcast to all the 
coordinators. The time-based cleanup
Why did we chose a time-based cleanup then? I think the time-based cleanup 
needs to be motivated if this serious-sounding issue is a consequence of that 
choice.

Would also be good to mention that the value of TOPIC_UPDATE_LOG_GC_FREQUENC 
was chosen to make the hang/timeout case unlikely.


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@129
PS2, Line 129:  * that should be employed if both the catalog lock and table 
locks need to be held at
catalog lock -> version lock?


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@183
PS2, Line 183:   // Version of the last topic update sent to the statestore.
sent -> returned (to make the control flow clearer)


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@192
PS2, Line 192:   private final static int TOPIC_UPDATE_LOG_GC_FREQUENCY = 50;
Mention how this affects the lifetime of an individual entry. My understanding 
is that an entry can be live for at most (2*TOPIC_UPDATE_LOG_GC_FREQUENCY)-1 
topic updates assuming it is not being updated


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@195
PS2, Line 195:  p
Let's create a TopicUpdateLog class that encapsulates these counters and the GC 
logic.


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@258
PS2, Line 258:   private final Object topicUpdateEventNotifier_ = new Object();
Maybe we should use the topicUpdateLog_ for this purpose once we have a 
separate TopicUpdateLog class. Seems a little more direct to wait/notify on the 
log object since changes in the log that trigger a notify. Alternatively, we 
can add wait/notify methods to the TopicUpdateLog itself to simplify the code 
in this file.


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1228
PS2, Line 1228:   private void removeDbHelper(Db db) {
updateDeleteLog(Db db)?


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1938
PS2, Line 1938:           
topicUpdateLog_.get(CatalogDeltaLog.toCatalogObjectKey(tCatalogObject));
Move toCatalogObjectKey() to Catalog, referring to the delta log here is a 
little confusing


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1976
PS2, Line 1976:           if (topicUpdateLog_.remove(entry.getKey(), 
entry.getValue())) {
Are uou sure it's ok to remove() while iterating over the map? Might be clearer 
to use an iterator.


http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java:

http://gerrit.cloudera.org:8080/#/c/8545/2/fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java@179
PS2, Line 179:     // functions and priviles) and then the top-level objects 
(databases and roles).
typo: privlieges



--
To view, visit http://gerrit.cloudera.org:8080/8545
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3032437f83d39bcc8cff14d897c7c106a4ab62d3
Gerrit-Change-Number: 8545
Gerrit-PatchSet: 2
Gerrit-Owner: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Comment-Date: Tue, 28 Nov 2017 23:39:20 +0000
Gerrit-HasComments: Yes

Reply via email to