Dan Hecht has posted comments on this change. Change subject: IMPALA-3973: add position and occurrence to instr() ......................................................................
Patch Set 13: (5 comments) http://gerrit.cloudera.org:8080/#/c/4094/13/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 2064: TestValue("instr('', '')", TYPE_INT, 0); would probably be good to add this test case for backwards search. http://gerrit.cloudera.org:8080/#/c/4094/13/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: Line 292: if (occurrence.val <= 0 || start_position.val == 0) return IntVal(0); is this what other systems do? i.e. do they silently return 0 or do they raise a warning or give an error in this case? (which we would do using context->SetError()). PS13, Line 302: DCHECK DCHECK_LT. But why is this true? during the previous iteration, couldn't match_pos = haystack.len - 1, and so during this iteration search_start_pos could be equal to haystack.len? I think the condition should be search_start_pos <= haystack.len, where the case of equality is handled because the new haystack len will be 0 and never matched. I think this would be the test case: INSTR("xyza", "a", 1, 2) or INSTR("a", "a", 1, 2); PS13, Line 317: DCHECK DCHECK_GT. but what is the meaning of this DCHECK? search_start_pos is the position into the haystack, right? so shouldn't it always be non-negative? Also, it seems this loop suffers from the similar problem as commented at line 302. (test case INSTR("axyz", "a", -1, 2) or INSTR("a", "a", -1 , 2). PS13, Line 319: std::min This is pretty confusing. I think this would be clearer if we did the min() outside the loop when choosing the first search_start_pos. i.e. the first search_start_pos can always be chosen to be <= haystack.len - needle.len, since that's the first possible match position. I think that makes the intent of what's going on here clearer. -- To view, visit http://gerrit.cloudera.org:8080/4094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Zoltan Ivanfi <[email protected]> Gerrit-HasComments: Yes
