Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3973: add position and occurrence to instr() ......................................................................
Patch Set 6: (9 comments) http://gerrit.cloudera.org:8080/#/c/4094/6/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: PS6, Line 311: search_start_pos < 0 remove- this is not possible since we know it's not 0 from l302 and we know it's not <0 from l308. it might be worthwhile to put a DCHECK above l316 to make sure it's always >= 0 though. PS6, Line 311: { : return IntVal(0); this will be short enough to return on the same line PS6, Line 318: if (match_pos_in_substring == -1) { : return IntVal(0); : } 1 line PS6, Line 322: search_start_pos I think you also need to make sure search_start_pos is < haystack.len before the next call to Substring (l316), otherwise that will return invalid memory (Substring doesn't check the length). In fact, since you'll have to do the check before calling Substring(search_start_pos) on l316, I think you can remove the entire block on l311 that checks search_start_pos. PS6, Line 330: search_start_pos >= str.len This isn't possible, str.len = haystack.len, and we know we added something less than 0 to it to get search_start_pos PS6, Line 330: if (search_start_pos < 0 || search_start_pos >= str.len) { : return IntVal(0); : } remove PS6, Line 336: search_start_pos + needle.len why do you add the needle len? I don't understand why it's not just search_start_pos. PS6, Line 338: if (match_pos == -1) { : return IntVal(0); : } 1 line http://gerrit.cloudera.org:8080/#/c/4094/6/be/src/runtime/string-search.h File be/src/runtime/string-search.h: PS6, Line 102: for (int i = 0; i < mlast; ++i) { : BloomAdd(pattern_->ptr[i]); : if (pattern_->ptr[i] == pattern_->ptr[mlast]) : skip_ = mlast - i - 1; : } : BloomAdd(pattern_->ptr[mlast]); I don't think this will always work for RSearch, see https://hg.python.org/cpython/file/6b6c79eba944/Objects/stringlib/fastsearch.h l195 I think this would need to be computed in reverse. -- To view, visit http://gerrit.cloudera.org:8080/4094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi <[email protected]> Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Zoltan Ivanfi <[email protected]> Gerrit-HasComments: Yes
