Huaisi Xu has posted comments on this change.

Change subject: IMPALA-3499: Batch update catalog cache update.
......................................................................


Patch Set 5:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/3067/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 1270:         LOG(ERROR) << "Error deserializing item: " << 
status.GetDetail();
> don't you need a continue after this?
This is not in for loop. This is to get the catalog object with 
catalog_service_id(default 0) only.. we need this in split update otherwise 
frontend detects catalog service id changes and requests a full update. it will 
fail eventually.


Line 1278: TUpdateCatalogCacheResponse resp;
> Similarly, consider moving this inside loop on line 1280, and redeclaring t
ok


Line 1281: len = item.value.size();
> My point is that there's no relationship between the len used in line 1266 
ok


Line 1299: update_size + len
> Ah, I understand the flow of the code a bit better now. I think a comment t
Do you mean that we take the lock no matter how large is the update? if update 
is small(usually the case), we do not need this lock I think. Front end 
synchronize that for us?


Line 1354: if (update_status.ok()) {
         :       update_status = 
exec_env_->frontend()->UpdateCatalogCache(update_req, &resp);
         :     }
> That's subtle, and hard for the reader to understand. Please consider just 
I am not sure how... because when at line 1354, the code does not know if any 
split update happens. either yes or no, we do not re-acquire locks here. not 
sure what do you mean by "just holding the lock for the duration of this 
method."


Line 1357: if (catalog_update_lock.owns_lock()) catalog_update_lock.unlock();
> Has it been faster in any of your tests?
not tried to test this... just thought to be more explicit. you think I'd 
better remove this?


http://gerrit.cloudera.org:8080/#/c/3067/5/fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java
File fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java:

Line 114: last_batch_update
> not clear from the name what condition this represents.
update_with_CATALOG_TCatalogObjectType?


Line 129: if (catalogObject.getType() == TCatalogObjectType.CATALOG) {
        :         newCatalogVersion = catalogObject.getCatalog_version();
        :         last_batch_update = true;
        :       }
> Not clear how this works - will one of the updates *always* contain a CATAL
there is only one catalog object(has service id) at the end for every 
statestore update. if we split that, intermediate update wont have this 
object.(we still have the service id though)


-- 
To view, visit http://gerrit.cloudera.org:8080/3067
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I176db25124a32944f2396ce8aafbed49cac95928
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Huaisi Xu <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Huaisi Xu <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to