Lars Volker has posted comments on this change. Change subject: IMPALA-3973: add position and occurrence to instr() ......................................................................
Patch Set 5: (12 comments) http://gerrit.cloudera.org:8080/#/c/4094/5/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 2063: // Hive returns 0 if substr was not found in str (or on other error coditions). What happens if you pass other types as the needle or haystack, e.g. int or double or bool? Do we cast them? Or throw an error? http://gerrit.cloudera.org:8080/#/c/4094/5/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: Line 22: #include <stdint.h> are the includes still needed? PS5, Line 289: if (str.is_null || substr.is_null) return IntVal::null(); : StringValue str_sv = StringValue::FromStringVal(str); : StringValue substr_sv = StringValue::FromStringVal(substr); : StringSearch search(&substr_sv); : // Hive returns positions starting from 1. : return IntVal(search.Search(&str_sv) + 1); Will replacing this with a call to the more general version of Instr() come with a performance impact? If not I think we should forward here, similar to the other Instr() overload below. PS5, Line 299: occurance spelling PS5, Line 316: no nit: use 'num' to abbreviate number PS5, Line 329: allowed match what does "allowed match" mean? http://gerrit.cloudera.org:8080/#/c/4094/5/be/src/runtime/string-search-test.cc File be/src/runtime/string-search-test.cc: Line 28: if (len == -1) { single line PS5, Line 69: Haystack is not full string This comment could be clearer, Haystack is a full string. Maybe rephrase "Test searching through part of haystack" or similar. PS5, Line 70: 5 What is the difference between this and just specifying "abcab" as the haystack? It looks like the tests with varying len parameters mostly test that the c'tor of StringValue works correctly. Is there a downside to it, if we rely on that (or test it elsewhere) and then just specify the strings we want to test here without len parameters. Line 74: // Haystack and needle not full len Same here Line 81: // Test last bit, this is the interesting edge case What does "last bit" mean? Can you add a comment why this is an interesting case? http://gerrit.cloudera.org:8080/#/c/4094/5/be/src/runtime/string-search.h File be/src/runtime/string-search.h: Line 204: } Can we advance i by skip_ here, too? Is this what the comment in line 139 refers to? -- To view, visit http://gerrit.cloudera.org:8080/4094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9648de458d243306fa14adc5e7f7002bf6f67fd Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zoltan Ivanfi <[email protected]> Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Zoltan Ivanfi <[email protected]> Gerrit-HasComments: Yes
