Dan Hecht has posted comments on this change.

Change subject: IMPALA-3973: add position and occurrence to instr()
......................................................................


Patch Set 13:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4094/13/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 2064:   TestValue("instr('', '')", TYPE_INT, 0);
would probably be good to add this test case for backwards search.


http://gerrit.cloudera.org:8080/#/c/4094/13/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

Line 292:   if (occurrence.val <= 0 || start_position.val == 0) return 
IntVal(0);
is this what other systems do? i.e. do they silently return 0 or do they raise 
a warning or give an error in this case?  (which we would do using 
context->SetError()).


PS13, Line 302: DCHECK
DCHECK_LT.
But why is this true? during the previous iteration, couldn't match_pos = 
haystack.len - 1, and so during this iteration search_start_pos could be equal 
to haystack.len?  I think the condition should be search_start_pos <= 
haystack.len, where the case of equality is handled because the new haystack 
len will be 0 and never matched.

I think this would be the test case: INSTR("xyza", "a", 1, 2) or INSTR("a", 
"a", 1, 2);


PS13, Line 317: DCHECK
DCHECK_GT. but what is the meaning of this DCHECK?  search_start_pos is the 
position into the haystack, right? so shouldn't it always be non-negative?

Also, it seems this loop suffers from the similar problem as commented at line 
302. (test case INSTR("axyz", "a", -1, 2)  or INSTR("a", "a", -1 , 2).


PS13, Line 319: std::min
This is pretty confusing. I think this would be clearer if we did the min() 
outside the loop when choosing the first search_start_pos. i.e. the first 
search_start_pos can always be chosen to be <= haystack.len - needle.len, since 
that's the first possible match position. I think that makes the intent of 
what's going on here clearer.


-- 
To view, visit http://gerrit.cloudera.org:8080/4094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zoltan Ivanfi <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Zoltan Ivanfi <[email protected]>
Gerrit-HasComments: Yes

Reply via email to