Huaisi Xu has posted comments on this change. Change subject: IMPALA-3499: Batch update catalog cache update. ......................................................................
Patch Set 5: (8 comments) http://gerrit.cloudera.org:8080/#/c/3067/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: Line 1270: LOG(ERROR) << "Error deserializing item: " << status.GetDetail(); > don't you need a continue after this? This is not in for loop. This is to get the catalog object with catalog_service_id(default 0) only.. we need this in split update otherwise frontend detects catalog service id changes and requests a full update. it will fail eventually. Line 1278: TUpdateCatalogCacheResponse resp; > Similarly, consider moving this inside loop on line 1280, and redeclaring t ok Line 1281: len = item.value.size(); > My point is that there's no relationship between the len used in line 1266 ok Line 1299: update_size + len > Ah, I understand the flow of the code a bit better now. I think a comment t Do you mean that we take the lock no matter how large is the update? if update is small(usually the case), we do not need this lock I think. Front end synchronize that for us? Line 1354: if (update_status.ok()) { : update_status = exec_env_->frontend()->UpdateCatalogCache(update_req, &resp); : } > That's subtle, and hard for the reader to understand. Please consider just I am not sure how... because when at line 1354, the code does not know if any split update happens. either yes or no, we do not re-acquire locks here. not sure what do you mean by "just holding the lock for the duration of this method." Line 1357: if (catalog_update_lock.owns_lock()) catalog_update_lock.unlock(); > Has it been faster in any of your tests? not tried to test this... just thought to be more explicit. you think I'd better remove this? http://gerrit.cloudera.org:8080/#/c/3067/5/fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java: Line 114: last_batch_update > not clear from the name what condition this represents. update_with_CATALOG_TCatalogObjectType? Line 129: if (catalogObject.getType() == TCatalogObjectType.CATALOG) { : newCatalogVersion = catalogObject.getCatalog_version(); : last_batch_update = true; : } > Not clear how this works - will one of the updates *always* contain a CATAL there is only one catalog object(has service id) at the end for every statestore update. if we split that, intermediate update wont have this object.(we still have the service id though) -- To view, visit http://gerrit.cloudera.org:8080/3067 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I176db25124a32944f2396ce8aafbed49cac95928 Gerrit-PatchSet: 5 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Huaisi Xu <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Huaisi Xu <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
