Henry Robinson has posted comments on this change.

Change subject: IMPALA-3499: Split catalog update
......................................................................


Patch Set 17:

(7 comments)

Have you tested with very large topics yet? I would like to confirm that a 
topic with total size > 2GB works correctly.

http://gerrit.cloudera.org:8080/#/c/3067/17//COMMIT_MSG
Commit Message:

PS17, Line 10: 500MB
The limit is 2GB, because it's a JVM limitation.


http://gerrit.cloudera.org:8080/#/c/3067/17/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

PS17, Line 1252: current_topic_update_size_bytes
batch_size_bytes is better, I think


PS17, Line 1255: if (len > 100 * 1024 * 1024 /* 100MB */) {
               :         LOG(WARNING) << "Received large catalog 
update(>100mb): "
               :                      << item.key << " is "
               :                      << PrettyPrinter::Print(len, 
TUnit::BYTES);
               :       }
Why print a warning at 100MB? What action would the user expect to take on 
seeing this message?

Also it seems better to check this after DeserializeThriftMsg(), since that 
sets len correctly.


PS17, Line 1273: // Refresh the lib cache entries of any added functions and 
data sources
               :       if (catalog_object.type == TCatalogObjectType::FUNCTION) 
{
               :         DCHECK(catalog_object.__isset.fn);
               :         
LibCache::instance()->SetNeedsRefresh(catalog_object.fn.hdfs_location);
               :       }
               :       if (catalog_object.type == 
TCatalogObjectType::DATA_SOURCE) {
               :         DCHECK(catalog_object.__isset.data_source);
               :         
LibCache::instance()->SetNeedsRefresh(catalog_object.data_source.hdfs_location);
               :       }
Can you leave a TODO saying that the frontend should return this information, 
then we wouldn't need to deserialize every catalog object, we could just wrap 
the bytes and pass them in.


PS17, Line 1292: incremental_request
I think a new variable is warranted here, e.g.:

  TUpdateCatalogCacheRequest* deletion_request = &update_reqs.back();


PS17, Line 1429: vector<TUpdateCatalogCacheRequest>(1, update_req)
nit: I think you can just write {update_req} as a list initializer.


http://gerrit.cloudera.org:8080/#/c/3067/17/fe/src/main/java/com/cloudera/impala/service/JniFrontend.java
File fe/src/main/java/com/cloudera/impala/service/JniFrontend.java:

PS17, Line 164:  
nit: no space before :


-- 
To view, visit http://gerrit.cloudera.org:8080/3067
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I176db25124a32944f2396ce8aafbed49cac95928
Gerrit-PatchSet: 17
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Huaisi Xu <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Huaisi Xu <[email protected]>
Gerrit-HasComments: Yes

Reply via email to