[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4740 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function
Michael Ho has posted comments on this change. Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function .. Patch Set 4: Code-Review+2 Carry Dan's +2 forward. -- To view, visit http://gerrit.cloudera.org:8080/4740 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function
Hello Matthew Jacobs, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4740 to look at the new patch set (#4). Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function .. IMPALA-4120: Incorrect results with LEAD() analytic function This change fixes a memory management problem with LEAD()/LAG() analytic functions which led to incorrect result. In particular, the update functions specified for these analytic functions only make a shallow copy of StringVal (i.e. copying only the pointer and the length of the string) without copying the string itself. This may lead to problem if the string is created from some UDFs which do local allocations whose buffer may be freed and reused before the result tuple is copied out. This change fixes the problem above by allocating a buffer at the Init() functions of these analytic functions to track the intermediate value. In addition, when the value is copied out in GetValue(), it will be copied into the MemPool belonging to the AnalyticEvalNode and attached to the outgoing row batches. This change also fixes a missing free of local allocations in QueryMaintenance(). Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f --- M be/src/exec/analytic-eval-node.cc M be/src/exec/analytic-eval-node.h M be/src/exprs/agg-fn-evaluator.h M be/src/exprs/aggregate-functions-ir.cc M be/src/udf/udf.cc M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test 7 files changed, 120 insertions(+), 55 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/4740/4 -- To view, visit http://gerrit.cloudera.org:8080/4740 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho
[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function
Michael Ho has posted comments on this change. Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/4740/3/be/src/exec/analytic-eval-node.cc File be/src/exec/analytic-eval-node.cc: Line 403: } > so this bug was similar to IMPALA-3311 (see PartitionedAggregationNode::Han Yes. We need some refactoring to share the code. Probably not worth it. Line 420: (next_partition || (fn_scope_ == RANGE && window_.__isset.window_end && > this indentation seems off. what does clang-format say? Tried with clang-format but still looks a bit odd. Changed the indentation to look closer to the existing code. http://gerrit.cloudera.org:8080/#/c/4740/3/be/src/exprs/agg-fn-evaluator.h File be/src/exprs/agg-fn-evaluator.h: PS3, Line 153: beyond the scope of the call site > Maybe say "The memory is a local allocation and so needs to be copied out b Incorporated comments from Dan and Matt. PS3, Line 153: beyond the scope of the call site > I'm still not sure this is precise enough... this sounds like it's scoped a Done -- To view, visit http://gerrit.cloudera.org:8080/4740 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function
Dan Hecht has posted comments on this change. Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function .. Patch Set 3: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/4740/3/be/src/exec/analytic-eval-node.cc File be/src/exec/analytic-eval-node.cc: Line 403: } so this bug was similar to IMPALA-3311 (see PartitionedAggregationNode::HandleOutputStrings()), right? but i guess we can't share the solution since in this case we're not directly populating the result row batch. Line 420: (next_partition || (fn_scope_ == RANGE && window_.__isset.window_end && this indentation seems off. what does clang-format say? http://gerrit.cloudera.org:8080/#/c/4740/3/be/src/exprs/agg-fn-evaluator.h File be/src/exprs/agg-fn-evaluator.h: PS3, Line 153: beyond the scope of the call site > I'm still not sure this is precise enough... this sounds like it's scoped a Maybe say "The memory is a local allocation and so needs to be copied out before local allocations are freed" or something? (As much as I dislike the terminology "local allocations" it is the terminology we currently have). -- To view, visit http://gerrit.cloudera.org:8080/4740 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function .. Patch Set 3: Code-Review+1 (1 comment) Thanks! It'll be good to run this through the query generator. http://gerrit.cloudera.org:8080/#/c/4740/3/be/src/exprs/agg-fn-evaluator.h File be/src/exprs/agg-fn-evaluator.h: PS3, Line 153: beyond the scope of the call site I'm still not sure this is precise enough... this sounds like it's scoped at the call site. It's memory that lives until QueryMaintenance frees the function's memory, right? -- To view, visit http://gerrit.cloudera.org:8080/4740 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function .. Patch Set 2: (8 comments) http://gerrit.cloudera.org:8080/#/c/4740/2/be/src/exec/analytic-eval-node.cc File be/src/exec/analytic-eval-node.cc: PS2, Line 390: string string data to be more specific PS2, Line 399: copy result string for AnalyticEvalNode Since this could show up in error messages, let's say something more friendly, e.g. "Failed to allocate memory for analytic function result." PS2, Line 419: if (fn_scope_ != ROWS) { Any reason not to return early (w/ OK now) if this == ROWS? It's nice to simplify the logic after. If we keep it, let's combine the if conditions in a single if stmt. http://gerrit.cloudera.org:8080/#/c/4740/2/be/src/exec/analytic-eval-node.h File be/src/exec/analytic-eval-node.h: PS2, Line 138: Return error status on failure. This can be more specific, right? Returns an error when out of memory. And below too http://gerrit.cloudera.org:8080/#/c/4740/2/be/src/exprs/agg-fn-evaluator.h File be/src/exprs/agg-fn-evaluator.h: PS2, Line 152: Note that the buffer for StringVal result is allocated from : /// memory managed by Impala so it must be copied out by caller. Do we need to call this out specifically here? It might just be better to add a few sentences in the class header about mem usage in general when using AggFnEvaluator. PS2, Line 153: r. ... if it needs to be used beyond the lifetime of ... ? PS2, Line 153: memory managed by Impala this isn't specific, most memory in this process is managed by Impala :) PS2, Line 237: memory managed by Impala this isn't specific, most memory in this process is managed by Impala :) same for the statement above -- To view, visit http://gerrit.cloudera.org:8080/4740 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function
Michael Ho has posted comments on this change. Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4740/1/be/src/exprs/agg-fn-evaluator.h File be/src/exprs/agg-fn-evaluator.h: PS1, Line 149: /// Puts the finalized value from Tuple* src in Tuple* dst just as Finalize() does. : /// 'pool' is the MemPool used for allocating memory for the finalized value (e.g. : /// StringVal). However, unlike Finalize(), GetValue() does not clean up state in src. : /// GetValue() can be called repeatedly with the same src. Only used internally for : /// analytic fn builtins. : void GetValue(FunctionContext* agg_fn_ctx, Tuple* src, Tuple* dst, MemPool* pool); > That makes sense. What you are suggesting is similar to what the PAGG node Actually, let's stick with copying out the string from the result_tuple for now. We probably should update the interface of Serialize()/Finalize() down the road to directly copy the string into a MemPool. -- To view, visit http://gerrit.cloudera.org:8080/4740 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/4740/1//COMMIT_MSG Commit Message: PS1, Line 18: AggFnEvaluatorthese ? http://gerrit.cloudera.org:8080/#/c/4740/1/be/src/exec/analytic-eval-node.cc File be/src/exec/analytic-eval-node.cc: PS1, Line 389: AggFnEvaluator::GetValue(evaluators_, fn_ctxs_, curr_tuple_, result_tuple, : cur_tuple_pool); wrt my comment in AggFnEvaluator, what if we didn't change GetValue() and instead we called a helper fn here that moved any backing memory from result_tuple into cur_tuple_pool, e.g. // Copies memory backing Strings (in analytic result slots) from result_tuple into memory allocated from cur_tuple_pool. RelocateStringMemory(result_tuple, cur_tuple_pool); (or something like that) http://gerrit.cloudera.org:8080/#/c/4740/1/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS1, Line 499: v = StringVal::null(); this means the allocation failed, right? should we be propagating errors? seems like that'd lead to incorrect results under low-memory conditions. http://gerrit.cloudera.org:8080/#/c/4740/1/be/src/exprs/agg-fn-evaluator.h File be/src/exprs/agg-fn-evaluator.h: PS1, Line 149: /// Puts the finalized value from Tuple* src in Tuple* dst just as Finalize() does. : /// 'pool' is the MemPool used for allocating memory for the finalized value (e.g. : /// StringVal). However, unlike Finalize(), GetValue() does not clean up state in src. : /// GetValue() can be called repeatedly with the same src. Only used internally for : /// analytic fn builtins. : void GetValue(FunctionContext* agg_fn_ctx, Tuple* src, Tuple* dst, MemPool* pool); I see how this and the below changes to the GetValue() path (all the way through SerializeOrFinalize) fix the issue, but I'm worried it's going to be more confusing when the interface isn't symmetrical, e.g. other Serialize and Finalize can't allocate mem out of pool even though it wouldn't be crazy for that to be allowed. What if the caller handled copying GetValue memory instead of having GetValue() handle the string copy? We could have a helper fn somewhere (maybe just in AggFnEvaluator, or maybe there's a better place) which moved backing memory into a specified pool. Does that make sense? -- To view, visit http://gerrit.cloudera.org:8080/4740 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael HoGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes