Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-3210: last/first_value() support for IGNORE NULLS ......................................................................
Patch Set 2: (13 comments) http://gerrit.cloudera.org:8080/#/c/3328/2/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: PS2, Line 1418: LastValIgnoreNullsState > An overview of how this all works would be helpful to other readers. A comm Done PS2, Line 1429: state > need to initialize num_nulls to 0 as well Done PS2, Line 1445: state->last_val > I think we need to handle strings specially. If this wasn't null before and Done PS2, Line 1460: state->last_val = T::null() > same deal about releasing memory. You could call LVU(ctx, T::null(), &state Done PS2, Line 1466: if (UNLIKELY(src.is_null)) return T::null(); > Hm, why is this necessary? I can't think of a case where the intermediate s We're inconsistent about this - we (generally) have either a DCHECK or no check at all in the GetValue functions in this file but an 'if' check like this in the Finalize functions. I agree though that it seems like this should never happen, and the tests pass with a DCHECK, so of course a DCHECK is better. PS2, Line 1476: if (UNLIKELY(src.is_null)) return StringVal::null(); > same here and in the Finalize methods Done PS2, Line 1491: if (UNLIKELY(src.is_null)) return T::null(); : DCHECK_EQ(sizeof(LastValIgnoreNullsState<T>), src.len); : LastValIgnoreNullsState<T>* state = : reinterpret_cast<LastValIgnoreNullsState<T>*>(src.ptr); : T result = state->last_val; > I think you can replace this with: Done PS2, Line 1508: StringVal result; : if (state->last_val.is_null) { : result = StringVal::null(); : } else { : result = StringVal::CopyFrom(ctx, state->last_val.ptr, state->last_val.len); : } > You can replace this with a call to LastValIgnoreNullsGetValue(ctx, src); l Done PS2, Line 1515: ctx->Free > I think you also need to free state->last_val.ptr Done http://gerrit.cloudera.org:8080/#/c/3328/2/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: PS2, Line 2423: function_name:fn_name LPAREN function_params:params KW_IGNORE KW_NULLS RPAREN > Let's see whatAlex and/or Marcel thinks, but perhaps we remove this branch Done http://gerrit.cloudera.org:8080/#/c/3328/2/fe/src/main/java/com/cloudera/impala/analysis/AnalyticExpr.java File fe/src/main/java/com/cloudera/impala/analysis/AnalyticExpr.java: PS2, Line 579: * 3. FIRST_VALUE without UNBOUNDED PRECEDING gets rewritten to use a different window : * and change the function to return the last value. We either set the fn to be : * 'last_value' or 'first_value_rewrite', which simply wraps the 'last_value' : * implementation but allows us to handle the first rows in a partition in a special : * way in the backend. There are a few cases: : * a) IGNORE NULLS: : * We cannot change the values that appear in the window since there may be any : * number of NULLs that must be ignored, so we can't take advantage of the : * performance optimizations in the other cases. Just reverse the window and : * ordering and use 'last_value'. : * b) Start bound is X FOLLOWING or CURRENT ROW (X=0): : * Use 'last_value' with a window where both bounds are X FOLLOWING (or : * CURRENT ROW). Setting the start bound to X following is necessary because the : * X rows at the end of a partition have no rows in their window. Note that X : * FOLLOWING could be rewritten as lead(X) but that would not work for CURRENT : * ROW. : * c) Start bound is X PRECEDING and end bound is CURRENT ROW or FOLLOWING: : * Use 'first_value_rewrite' and a window with an end bound X PRECEDING. An : * extra parameter '-1' is added to indicate to the backend that NULLs should : * not be added for the first X rows. : * d) Start bound is X PRECEDING and end bound is Y PRECEDING: : * Use 'first_value_rewrite' and a window with an end bound X PRECEDING. The : * first Y rows in a partition have empty windows and should be NULL. An extra : * parameter with the integer constant Y is added to indicate to the backend : * that NULLs should be added for the first Y rows. > This is too complicated (and was to begin with, so not your fault). However Done http://gerrit.cloudera.org:8080/#/c/3328/2/fe/src/main/java/com/cloudera/impala/analysis/IgnoreNullsExpr.java File fe/src/main/java/com/cloudera/impala/analysis/IgnoreNullsExpr.java: > I know I led you down this path, and it seems like the fastest way to get t Done PS2, Line 30: IgnoreNullsExpr > Hm the name should probably indicate it's a kind of FunctionCall, e.g. Func Done -- To view, visit http://gerrit.cloudera.org:8080/3328 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic27525e2237fb54318549d2674f1610884208e9b Gerrit-PatchSet: 2 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-HasComments: Yes
