Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/21754 )
Change subject: IMPALA-12908: Add correctness check for tuple cache
......................................................................


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-cache-node.cc
File be/src/exec/tuple-cache-node.cc:

http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-cache-node.cc@89
PS4, Line 89: // was evicted, so we skip the correctness check in this case.
> It is possible that after IsAvailableForRead(), the cache metadata is evict
When we hold the handle_, the cache entry shouldn't be evicted until we release the handle_. Are we seeing entries get evicted and getting an empty string here?


http://gerrit.cloudera.org:8080/#/c/21754/7/be/src/exec/tuple-cache-node.cc
File be/src/exec/tuple-cache-node.cc:

http://gerrit.cloudera.org:8080/#/c/21754/7/be/src/exec/tuple-cache-node.cc@194
PS7, Line 194: RuntimeState* state
It looks like we may not need RuntimeState anymore?


http://gerrit.cloudera.org:8080/#/c/21754/7/be/src/exec/tuple-cache-node.cc@322
PS7, Line 322: // as the cache may have been evicted.
The handle should keep the cache entry alive. Have we seen cases where it doesn't? If we haven't seen a case like that, I think it would be good to DCHECK that the cache_status is ok.


http://gerrit.cloudera.org:8080/#/c/21754/7/be/src/exec/tuple-text-file-util-test.cc
File be/src/exec/tuple-text-file-util-test.cc:

http://gerrit.cloudera.org:8080/#/c/21754/7/be/src/exec/tuple-text-file-util-test.cc@108
PS7, Line 108: vector<string> ref_lines = {"Row 1", "Row 2", "Row 3"};
             : vector<string> cmp_lines = {"Row 1", "Row 2"};
             : string ref_file = CreateTestFile("ref.txt", ref_lines);
             : string cmp_file = CreateTestFile("cmp.txt", cmp_lines);
             :
             : Status status = TupleTextFileUtil::VerifyRows(ref_file, cmp_file);
             : EXPECT_FALSE(status.ok());
             : EXPECT_EQ(status.code(), TErrorCode::TUPLE_CACHE_INCONSISTENCY);
Nit: Could we repeat VerifyRows() in the other direction (i.e. TupleTextFileUtil::VerifyRows(cmp_file, ref_file))?


http://gerrit.cloudera.org:8080/#/c/21754/7/be/src/runtime/tuple-cache-mgr.h
File be/src/runtime/tuple-cache-mgr.h:

http://gerrit.cloudera.org:8080/#/c/21754/7/be/src/runtime/tuple-cache-mgr.h@151
PS7, Line 151: const string& fragment_id, const string& cache_key
Nit: Do you mind if we flip the order of the arguments? Having key, value is more idiomatic than value, key.



--
To view, visit http://gerrit.cloudera.org:8080/21754
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied074e274ebf99fb57e3ee41a13148725775b77c
Gerrit-Change-Number: 21754
Gerrit-PatchSet: 7
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Wed, 25 Sep 2024 20:28:41 +0000
Gerrit-HasComments: Yes
