Henry Robinson has posted comments on this change.

Change subject: IMPALA-3499: Batch update catalog cache update.
......................................................................

Patch Set 5:

(9 comments)

Most concerning part of this is incrementally updating the cache - do your changes to ImpaladCatalog prevent publishing incremental updates? Please spell that out somewhere, along with the role of the new lock in impala-server.h.

http://gerrit.cloudera.org:8080/#/c/3067/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 1253: if (delta.topic_entries.size() != 0 || delta.topic_deletions.size() != 0) {
I think it would be clearer now to have two blocks, one handling topic_entries and one handling topic_deletions, since they don't seem to share much logic.


Line 1281: len = item.value.size();
Try to avoid reusing variables for different purposes like this. In this case, I think you can declare len to be local to this scope (and similarly with the variable on line 1262 - move it into the if () block), and then put that declaration on line 1298.


Line 1299: update_size + len
What if one object is larger than the max? How about testing update_size > 0 here as well?


Line 1307: UpdateCatalogCache
Is it possible that the state exposed by the catalog can now be inconsistent, if an entire topic update does not get applied in one batch?


Line 1312: }
What happens if the total update size (across all entries) is smaller than the max? What triggers a call to UpdateCatalogCache() in that instance?


Line 1354: if (update_status.ok()) {
         :   update_status = exec_env_->frontend()->UpdateCatalogCache(update_req, &resp);
         : }
catalog_update_lock isn't guaranteed to be held here if topic_entries.size() == 0. Is defer_lock actually helping us much here?


Line 1357: if (catalog_update_lock.owns_lock()) catalog_update_lock.unlock();
Is this necessary, or will catalog_update_lock be unlocked automatically when it goes out of scope?


http://gerrit.cloudera.org:8080/#/c/3067/5/be/src/service/impala-server.h
File be/src/service/impala-server.h:

Line 744: ut is necessary to split update in CatalogUpdateCallback.
Explain what this is protecting against - simultaneous updates?


http://gerrit.cloudera.org:8080/#/c/3067/5/fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java:

Line 114: last_batch_update
It's not clear from the name what condition this represents.


-- 
To view, visit http://gerrit.cloudera.org:8080/3067
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I176db25124a32944f2396ce8aafbed49cac95928
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Huaisi Xu <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Huaisi Xu <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
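
On the Line 1299 and Line 1312 questions in impala-server.cc, here is a rough sketch of the batching behaviour being asked about. It is not the patch's code: Entry, SplitIntoBatches and max_batch_bytes are hypothetical names, and the real callback would call UpdateCatalogCache() per batch rather than collect vectors. It only illustrates the update_size > 0 guard (an oversized object goes out in a batch of its own) and the final flush (a total below the max still produces one batch).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one topic entry; only the payload size matters here.
struct Entry {
  std::string key;
  std::string value;
};

// Groups entries into batches whose payload stays at or below max_batch_bytes
// where possible. An entry larger than the limit ends up alone in its own batch.
std::vector<std::vector<Entry>> SplitIntoBatches(
    const std::vector<Entry>& entries, std::size_t max_batch_bytes) {
  assert(max_batch_bytes > 0);
  std::vector<std::vector<Entry>> batches;
  std::vector<Entry> current;
  std::size_t update_size = 0;
  for (const Entry& item : entries) {
    const std::size_t len = item.value.size();  // local to this iteration
    // Flush only a non-empty batch: this is the update_size > 0 guard.
    if (update_size > 0 && update_size + len > max_batch_bytes) {
      batches.push_back(std::move(current));
      current.clear();  // moved-from vector is reset before reuse
      update_size = 0;
    }
    current.push_back(item);
    update_size += len;
  }
  // Final flush: covers a total update size that never reaches the limit.
  if (!current.empty()) batches.push_back(std::move(current));
  return batches;
}
```

Without the final flush, an update whose total size is under the limit would never be applied, and without the update_size > 0 guard an oversized first entry would trigger an empty batch.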

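
On the Line 1354 and Line 1357 questions, a small sketch of how a deferred lock behaves, assuming the patch uses something with std::unique_lock semantics (the actual lock type is not shown in this review). catalog_update_mutex and ApplyUpdate are hypothetical names used only for illustration.

```cpp
#include <mutex>

// Hypothetical stand-in for the new lock declared in impala-server.h.
std::mutex catalog_update_mutex;

// Returns true if the mutex was held at the point where the update would run.
bool ApplyUpdate(bool has_entries) {
  // defer_lock: the guard is constructed without acquiring the mutex.
  std::unique_lock<std::mutex> guard(catalog_update_mutex, std::defer_lock);
  if (has_entries) guard.lock();  // acquired on this path only
  const bool held = guard.owns_lock();
  // ... the call to UpdateCatalogCache() would go here ...
  // No explicit unlock is needed: the destructor of unique_lock releases
  // the mutex if, and only if, this guard owns it.
  return held;
}
```

With this shape the update runs unprotected whenever has_entries is false, which is the concern raised for Line 1354, and the explicit owns_lock()/unlock() pair on Line 1357 is redundant under these semantics.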