Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/21754 )
Change subject: IMPALA-12908: Add correctness check for tuple cache ...................................................................... Patch Set 5: (10 comments) http://gerrit.cloudera.org:8080/#/c/21754/2/be/src/exec/tuple-cache-node.cc File be/src/exec/tuple-cache-node.cc: http://gerrit.cloudera.org:8080/#/c/21754/2/be/src/exec/tuple-cache-node.cc@468 PS2, Line 468: : : : > After the eviction function removes the metadata entry, it's possible that Changed this part of logic in patch 5. The GetDebugDumpPath() only returns the strings, and the caller does the job to get the fragment id. http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-cache-node.cc File be/src/exec/tuple-cache-node.cc: http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-cache-node.cc@89 PS4, Line 89: // was evicted, so we skip the correctness check in this case. > Why would these be empty? Should we log something? It is possible that after IsAvailableForRead(), the cache metadata is evicted, and we can't find the original fragment id to construct the path, so in case I am using an empty string here to skip the correctness check. I modified this logic to allow the caller to do the job to get the original fragment id, so that GetDebugDumpPath won't return empty strings anymore. http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-cache-node.cc@92 PS4, Line 92: string ref_sub_dir; > How would this happen? We just initialized a TupleTextFileWriter, nullptr s Done http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-cache-node.cc@97 PS4, Line 97: DCHECK(!ref_sub_dir.empty()); > How would this be nullptr? Done http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-cache-node.cc@159 PS4, Line 159: } > This should use GetStrErrMsg(), and should fetch it before calling operator Done http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-text-file-reader.h File be/src/exec/tuple-text-file-reader.h: http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-text-file-reader.h@61 PS4, Line 61: std::ifstream reader_; > TupleFireReader was switched to use kudu::RWFile. Should this do the same? It seems easier for ifstream to read line from a text file, and for debugging, it doesn't require high performance, seems ifstream is a good fit for this case. http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-text-file-util.cc File be/src/exec/tuple-text-file-util.cc: http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-text-file-util.cc@85 PS4, Line 85: }; > nit: I think this whole clause could just be Done http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-text-file-writer.h File be/src/exec/tuple-text-file-writer.h: http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-text-file-writer.h@70 PS4, Line 70: std::ofstream writer_; > TupleFileWriter uses kudu::WritableFile. Should this do the same? If the reader is using ifstream, I might as well use ofstream here. Seems no huge difference for the correctness check http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-text-file-writer.cc File be/src/exec/tuple-text-file-writer.cc: http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/exec/tuple-text-file-writer.cc@37 PS4, Line 37: writer_.open(path_, std::ios::out | std::ios::binary | std::ios::trunc); > This can fail, which sets failbit. Either check that or is_open to determin Added checks on is_open and fail. http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/runtime/tuple-cache-mgr.cc File be/src/runtime/tuple-cache-mgr.cc: http://gerrit.cloudera.org:8080/#/c/21754/4/be/src/runtime/tuple-cache-mgr.cc@376 PS4, Line 376: Status TupleCacheMgr::CreateDebugDumpSubdir(const string& sub_dir) { > This should probably return a Status object so we can correctly log errors. Changed a way to implement, basically move the dir creation to the caller side, and this function only returns strings. -- To view, visit http://gerrit.cloudera.org:8080/21754 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ied074e274ebf99fb57e3ee41a13148725775b77c Gerrit-Change-Number: 21754 Gerrit-PatchSet: 5 Gerrit-Owner: Yida Wu <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Tue, 24 Sep 2024 03:22:30 +0000 Gerrit-HasComments: Yes
