Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21754 )

Change subject: IMPALA-12908: Add correctness check for tuple cache
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21754/1/be/src/exec/tuple-cache-node.cc
File be/src/exec/tuple-cache-node.cc:

http://gerrit.cloudera.org:8080/#/c/21754/1/be/src/exec/tuple-cache-node.cc@200
PS1, Line 200:       cache[str].setCounts(1, 0);
Couldn't this just be

  cache[str].reference_rows_count = 1;

The default constructor has already been called. The setCounts method seems 
unnecessary.


http://gerrit.cloudera.org:8080/#/c/21754/1/be/src/exec/tuple-cache-node.cc@216
PS1, Line 216:     if (it.second.reference_rows_count <= 0) {
This should be impossible. Also both counts are uint64_t, so <0 is meaningless. 
Maybe just DCHECK that reference_rows_count > 0?


http://gerrit.cloudera.org:8080/#/c/21754/1/be/src/exec/tuple-cache-node.cc@260
PS1, Line 260:     DCHECK(result == 0);
It's possible for external interaction to cause an error (another process 
deleting the file or changing permissions). So on error I'd argue this should 
be LOG(ERROR). Same with the else path.


http://gerrit.cloudera.org:8080/#/c/21754/1/be/src/exec/tuple-cache-node.cc@351
PS1, Line 351:       if (*eos) *eos = debug_eos;
Can you add a comment about when this would produce different behavior? I think 
the answer is if the cache entry has no more rows but the child has additional 
rows, we need to keep reading from the child. I'd just like to document that.

Would the next iteration work correctly? Presumably trying to read from the 
reader_ again would produce an error?


http://gerrit.cloudera.org:8080/#/c/21754/1/be/src/exec/tuple-cache-node.cc@448
PS1, Line 448:   // TODO: The name of the subdirectory could be simplified to 
hash key only if the key
I think this should be true now, based on scan ranges.


http://gerrit.cloudera.org:8080/#/c/21754/1/be/src/exec/tuple-text-file-reader.h
File be/src/exec/tuple-text-file-reader.h:

http://gerrit.cloudera.org:8080/#/c/21754/1/be/src/exec/tuple-text-file-reader.h@44
PS1, Line 44:   // Returns the size of the file in bytes.
Should mention that this leaves the reader at the end of the file. Or update 
GetFileSize to reset reader_ to its state prior to the call.



--
To view, visit http://gerrit.cloudera.org:8080/21754
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied074e274ebf99fb57e3ee41a13148725775b77c
Gerrit-Change-Number: 21754
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Comment-Date: Tue, 10 Sep 2024 21:33:19 +0000
Gerrit-HasComments: Yes

Reply via email to