Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8025 )
Change subject: IMPALA-5844: use a MemPool for expr result allocations ...................................................................... Patch Set 11: (17 comments) http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/exec-node.h File be/src/exec/exec-node.h: http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/exec-node.h@379 PS11, Line 379: Status QueryMaintenance(RuntimeState* state) WARN_UNUSED_RESULT; > I still think we should get rid of this but okay to leave in this change. Yeah I think this is another step towards IMPALA-2399 since there's less encapsulated in this function. I think it's a separate logical change that probably needs some thought. http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/hash-table.h@396 PS11, Line 396: Cloes > Close Done http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/hash-table.h@525 PS11, Line 525: /// results of expr evaluation. > Not owned? Which also means the exec node will do the maintenance on them, Done http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h File be/src/exec/partitioned-aggregation-node.h: http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h@214 PS11, Line 214: if this is a non-grouping aggregation used : /// for the non-grouping cases > did that get garbled? Done http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h@219 PS11, Line 219: /// are allocated from 'expr_perm_pool_' and 'expr_results_pool_' respectively. > and what about for grouping case? Reworked this whole comment to be clearer. http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h@419 PS11, Line 419: expr_managed > should this be talking about permanent? Done http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h@420 PS11, Line 420: per partition > I don't get that comment, given this is a single pool. It belongs to the partition though. This comment really needs some fixing though - rewrote it. http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h@421 PS11, Line 421: Local allocations > results Done http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h@423 PS11, Line 423: agg_fn_pool Renamed this since it's only permanent allocatoins. http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.cc@600 PS11, Line 600: if (++i % 1024 == 0) > is that even worth it? Probably not http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/scanner-context.h File be/src/exec/scanner-context.h: http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/scanner-context.h@337 PS11, Line 337: /// evaluation to prevent memory from accumulating. > who owns this? Mentioned the two cases, since the only reason this is this complicated is the need to support the multi-threads scan node. The filter_ctxs_ are created in HdfsScanNode::ScannerThread() in the multi-threaded case and they allocate memory from this pool. So this pool needs to exists at that point in time, but the scanner implementation also needs access to it so that it can clear it. I don't think this *needs* to be that complicated but it seems like it would require a non-trivial restructure of the logic around filter_ctxs_ to get it simpler. http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/runtime/sorter.h File be/src/runtime/sorter.h: http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/runtime/sorter.h@97 PS11, Line 97: (returns true if lhs < rhs) > not sure that still makes sense. what returns true? That's true, it doesn't make sense now. http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/runtime/sorter.h@202 PS11, Line 202: intermediate > not just the intermediate results, but also the final results of expression You're right, I was thinking it was only used for the comparator but it's of course also used for &sort_tuple_expr_evals_ http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/runtime/sorter.h@199 PS11, Line 199: /// MemPool for allocating data structures used by expression evaluators in the sorter. : MemPool expr_perm_pool_; : : /// MemPool for allocations that hold intermediate results of expression evaluation in : /// the sorter. Cleared periodically during sorting to prevent memory accumulating. : MemPool expr_results_pool_; : : /// In memory sorter and less-than comparator. : TupleRowComparator compare_less_than_; > in other places, we have indirection (i.e. scoped_ptr) for these fields. is In TopNNode and ExchangeNode the comparator can't be instantiated until Prepare() because its constructor arguments aren't available. http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/udf/udf.h File be/src/udf/udf.h: http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/udf/udf.h@350 PS11, Line 350: For allocations that back the intermediate results of aggregation : /// returned by Init(), Update() and Merge() should be allocated by > a bit garbled Tried to improve it. http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/udf/udf.h@616 PS11, Line 616: temporary > this is kind of new terminology. above i think you said "temporary results" Reworded this to avoid introducing vague terminology. I think this comment was a bit unclear on other points so clarified it and fixed the behaviour when new_len < len (which was actually incorrect). http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/util/tuple-row-compare.h File be/src/util/tuple-row-compare.h: http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/util/tuple-row-compare.h@77 PS11, Line 77: Any : /// the evaluators with use > garbled Done -- To view, visit http://gerrit.cloudera.org:8080/8025 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4ba5a7542ed90a49a4b5586c040b5985a7d45b61 Gerrit-Change-Number: 8025 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Wed, 04 Oct 2017 00:41:04 +0000 Gerrit-HasComments: Yes
