Alex Behm has posted comments on this change. Change subject: IMPALA-5538: Use explicit catalog versions for deleted objects ......................................................................
Patch Set 2: (12 comments) Change is looking pretty good to me, only minor comments left. http://gerrit.cloudera.org:8080/#/c/7731/2/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: Line 305: BuildTopicUpdatesHelper(catalog_objects.updated_objects, false); I like the previous version more where this is inlined into L292. A quick comment about the requirements/structure of a topic update would be nice, e.g. stating that the set of deletions and updates are disjoint and therefore the order in which updates/deletions are added is irrelevant. Line 317: if (catalog_object.catalog_version <= last_sent_catalog_version_) continue; Move this above L312? http://gerrit.cloudera.org:8080/#/c/7731/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: Line 1340: LOG(INFO) << "Received large catalog update(>100mb): " Received a large catalog item? http://gerrit.cloudera.org:8080/#/c/7731/2/common/thrift/StatestoreService.thrift File common/thrift/StatestoreService.thrift: Line 86: 3: required bool deleted = false; Isn't it better to not have a default value to make sure this is always set? http://gerrit.cloudera.org:8080/#/c/7731/2/fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java File fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java: Line 56: * determine which catalog objects were deleted since the last catalog update topic. catalog topic update Line 57: * Once the catalog update topic is constructed, the old deleted catalog objects are catalog topic update Line 74: // Retrieve all the removed catalog objects with version > 'fromVersion'. /** style comment http://gerrit.cloudera.org:8080/#/c/7731/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: Line 347: // Identify the catalog objects that were updated (added/dropped) in the catalog with /** style comment Line 351: Set<String> addedCatalogObjectKeys = Sets.newHashSet(); updatedCatalogObjects? Line 363: TCatalogObject catalogTbl = new TCatalogObject(TCatalogObjectType.TABLE, Why not just iterate over db.getTables()? Line 441: addedCatalogObjectKeys.add(CatalogDeltaLog.toCatalogObjectKey(privilege)); Instead of having a duplicate add() in all places, how about creating the addedCatalogObjets set right before L448 by looping over the addedCatalogObjects? Seems only marginally less efficient, but less prone to forgetting an add() somewhere. Line 448: for (TCatalogObject removedObject: deltaLog_.retrieveObjects(fromVersion)) { Nice! Much clearer. -- To view, visit http://gerrit.cloudera.org:8080/7731 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I93cb7a033dc8f0d3e0339394b36affe14523274c Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-HasComments: Yes
