Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5881: Use native allocation while building catalog 
updates
......................................................................


Patch Set 1:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/7955/1/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

PS1, Line 106: RETURN_IF_ERROR(DeserializeThriftMsg(jni_env, result_bytes, 
&nbuffer));
> We should check that the nbuffer ptr and size are not 0.
I think the problem is when nbuffer_ptr is non zero but DeSerialize call fails 
(free(0) works ok). Per my understanding there is no proper way to confirm that 
it points to a correct address. Do you suggest that we should free it in any 
case and return the status?


http://gerrit.cloudera.org:8080/#/c/7955/4/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

Line 597: // Currently used by the Catalog JVM to serialize and track the 
output of
> I don't think we need the last part, comments about current usage tend to g
Done


http://gerrit.cloudera.org:8080/#/c/7955/4/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

Line 112:     GlogAppender.Install(TLogLevel.values()[cfg.impala_log_lvl],
> seems clear it cannot be null here
Done


Line 153:   /**
> What's the purpose of this warning? I'm thinking we should address limits/w
I included this for supportability purposes. With this we log the untracked 
allocation sizes and also gives a rough estimate of the topic size. Do you 
think it should be removed?

For ex: On a cluster where I deployed this page, this line got printed for a a 
serialized size of 2.5GB. That way I figured that a single call got a huge 
update from the Catalog, something along these lines.


http://gerrit.cloudera.org:8080/#/c/7955/4/fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java
File fe/src/main/java/org/apache/impala/thrift/NativeByteArrayOutputStream.java:

Line 28: 
> ... single write() method because that is used by the thrift-generated seri
Done


Line 43:   protected long bytesWritten_;
> We should add unit tests for this class and the TNativeSerializer.
Added a single unit test class that tests both. Running into some issues while 
its execution because of huge allocations. Commented the relevant portions 
while I'm figuring out an optimal way to do it.


Line 89:       throw new IndexOutOfBoundsException(
> only if bufferPtr_ != 0
I checked that freeMemory(0) is a valid call, wrote a test program to confirm 
that.


Line 94:       if (bufferLen_ >= BUFFER_DOUBLING_RESIZE_LIMIT) {
> My understanding is that if reallocate() fails (return 0), then the origina
Yea. I added a freeMemory() if the return is 0.


Line 109: 
> If this function throws we'll have a memory leak. We could either internall
The only method that can throw here is tryAllocate() and I added necessary 
cleanups there. 

(I checked that the copyMemory doesn't throw).


http://gerrit.cloudera.org:8080/#/c/7955/4/fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java
File fe/src/main/java/org/apache/impala/thrift/TNativeSerializer.java:

Line 31:  * native by array using NativeByteArrayOutputStream class. Not 
thread-safe.
> Mention that it is valid to call serialize() exactly once (maybe we can enf
Mentioned it in the serialize() method comment.


-- 
To view, visit http://gerrit.cloudera.org:8080/7955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I383684effa9524734ce3c6c0fb7ed37de0e15782
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to