Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15259 )
Change subject: IMPALA-4080: Moved codegen code from scan exec nodes to their plan nodes ...................................................................... Patch Set 1: (11 comments) Thanks for making these patches so easy to review. I have minor comments only. http://gerrit.cloudera.org:8080/#/c/15259/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15259/1//COMMIT_MSG@7 PS1, Line 7: IMPALA-4080: Moved codegen code from scan exec nodes to their weird wrapping http://gerrit.cloudera.org:8080/#/c/15259/1//COMMIT_MSG@10 PS1, Line 10: This patch moves the code responsible for codegening code for Is this sentence right? It seems like the code hasn't moved, but the plan node is invoking codegen instead of the exec node. http://gerrit.cloudera.org:8080/#/c/15259/1/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: http://gerrit.cloudera.org:8080/#/c/15259/1/be/src/exec/hdfs-scan-node-base.h@107 PS1, Line 107: /// Returns the tuple idx into the row for this scan node to output to. Consider removing this or making it a constant. I don't think the generality is likely to be ever needed. http://gerrit.cloudera.org:8080/#/c/15259/1/be/src/exec/hdfs-scan-node-base.h@133 PS1, Line 133: int tuple_id_; Init to -1 just so any bugs are more obvious? http://gerrit.cloudera.org:8080/#/c/15259/1/be/src/exec/hdfs-scan-node-base.h@139 PS1, Line 139: typedef boost::unordered_map<std::vector<int>, int> PathToSlotIdxMap; FWIW I personally find this pattern of typedefing containers to make the code less readable in most cases because the actual type of the container is hidden. It was useful pre-c++11 to make iterator types less verbose. You don't need to do anything but I wouldn't object if you removed the typedefs :) http://gerrit.cloudera.org:8080/#/c/15259/1/be/src/exec/hdfs-scan-node-base.h@165 PS1, Line 165: file_formats_ Maybe rename to something like 'scanned_file_formats_' to be more explicit. http://gerrit.cloudera.org:8080/#/c/15259/1/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/15259/1/be/src/exec/hdfs-scan-node-base.cc@a390 PS1, Line 390: That was a weird place to put this... http://gerrit.cloudera.org:8080/#/c/15259/1/be/src/exec/hdfs-scan-node-base.cc@229 PS1, Line 229: auto ranges = ctx->per_node_scan_ranges.find(tnode.node_id); I guess this is not exactly equivalent to the previous logic if different instances have different sets of file types. I don't think this matters at all once we have shared codegen, but just wanted to mention. http://gerrit.cloudera.org:8080/#/c/15259/1/be/src/exec/hdfs-scan-node-base.cc@291 PS1, Line 291: /// TODO: Break up this very long function. Maybe remove this comment? Not sure it's useful. http://gerrit.cloudera.org:8080/#/c/15259/1/be/src/exec/hdfs-scan-node-base.cc@948 PS1, Line 948: std::v nit: no std:: needed http://gerrit.cloudera.org:8080/#/c/15259/1/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/15259/1/be/src/runtime/query-state.h@249 PS1, Line 249: vector<const TPlanFragmentInstanceCtx*> GetTPlanFragmentInstanceCtxs( Nit: return const & since they shouldn't be null -- To view, visit http://gerrit.cloudera.org:8080/15259 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02c6183271b3731aa0d3cae2426be7269e462967 Gerrit-Change-Number: 15259 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Fri, 21 Feb 2020 00:40:48 +0000 Gerrit-HasComments: Yes
