julianhyde commented on code in PR #3184:
URL: https://github.com/apache/calcite/pull/3184#discussion_r1182009899
##########
site/_docs/reference.md:
##########
@@ -2693,6 +2693,8 @@ BigQuery's type system uses confusingly different names
for types and functions:
| b o | GREATEST(expr [, expr ]*) | Returns the greatest of
the expressions
| b h s | IF(condition, value1, value2) | Returns *value1* if
*condition* is TRUE, *value2* otherwise
| b | IFNULL(value1, value2) | Equivalent to
`NVL(value1, value2)`
+| b o | INSTR(string, substring[, from, occurrence]) | Returns the position of
substring in string, searching starting at from, and until locating the nth
occurrence of substring.
Review Comment:
s/substring/\*substring\*/ etc.
s/from/from (default 1)/
s/occurrence/occurrence (default 1)/
remove '.' at end of line
##########
core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java:
##########
@@ -3104,23 +3104,106 @@ public static int position(ByteString seek, ByteString
s) {
/** SQL {@code POSITION(seek IN string FROM integer)} function. */
public static int position(String seek, String s, int from) {
- final int from0 = from - 1; // 0-based
- if (from0 > s.length() || from0 < 0) {
+ if (from == 0) {
+ throw RESOURCE.fromNotZero().ex();
+ }
+ // Case when from is positive
+ if (from > 0) {
+ final int from0 = from - 1; // 0-based
+ if (from0 >= s.length() || from0 < 0) {
+ return 0;
+ } else {
+ return s.indexOf(seek, from0) + 1;
+ }
+ }
+ // Case when from is negative
+ final int rightIndex = from + s.length(); // negative position to positive
index
+ if (rightIndex <= 0) {
return 0;
}
-
- return s.indexOf(seek, from0) + 1;
+ return s.substring(0, rightIndex + 1).lastIndexOf(seek) + 1;
}
/** SQL {@code POSITION(seek IN string FROM integer)} function for byte
* strings. */
public static int position(ByteString seek, ByteString s, int from) {
- final int from0 = from - 1;
- if (from0 > s.length() || from0 < 0) {
+ if (from == 0) {
+ throw RESOURCE.fromNotZero().ex();
+ }
+ // Case when from is positive
+ if (from > 0) {
+ final int from0 = from - 1; // 0-based
+ if (from0 >= s.length() || from0 < 0) {
+ return 0;
+ } else {
+ return s.indexOf(seek, from0) + 1;
+ }
+ }
+ // Case when from is negative
+ final int rightIndex = from + s.length();
+ if (rightIndex <= 0) {
return 0;
}
+ int lastIndex = 0;
+ while (lastIndex < rightIndex) {
Review Comment:
Is there a method we could add to `ByteString` to allow you to write the
same code in this method as for `position(String, String, int)`? (`ByteString`
is in Avatica, so we would have to wait a release.)
##########
core/src/main/resources/org/apache/calcite/runtime/CalciteResource.properties:
##########
@@ -336,4 +336,6 @@ InvalidPartitionKeys=Only tables with set semantics may be
partitioned. Invalid
InvalidOrderBy=Only tables with set semantics may be ordered. Invalid ORDER BY
clause in the {0,number,#}-th operand of table function ''{1}''
MultipleRowSemanticsTables=A table function at most has one input table with
row semantics. Table function ''{0}'' has multiple input tables with row
semantics
NoOperator=No operator for ''{0}'' with kind: ''{1}'', syntax: ''{2}'' during
JSON deserialization
+FromNotZero=Invalid input for position function: from operand value must not
be zero
Review Comment:
s/position/POSITION/
Other error messages have the function in upper case. It reinforces that
it's a function name.
##########
core/src/test/java/org/apache/calcite/test/SqlFunctionsTest.java:
##########
@@ -1054,6 +1055,54 @@ private void thereAndBack(byte[] bytes) {
// ok
}
}
+ @Test void testPosition() {
+ assertThat(3, is(position("c", "abcdec")));
+ assertThat(3, is(position("c", "abcdec", 2)));
Review Comment:
These tests are all backwards! For example, rather than
```
assertThat(3, is(position("c", "abcdec", 2)));
```
you should write
```
assertThat(position("c", "abcdec", 2), is(3)));
```
Here, `position("c", "abcdec", 2)` is the thing you are testing, and `3` is
the expected result. In fact `is(3)` is the assertion about the expected result.
Why does it matter? If the test fails - say the position call returns 57 -
then the framework should print 'got 57, expected 3'. Also, the test reads from
left-to-right like an English sentence.
Maybe you're familiar with `assertEquals`, and earlier method where you were
SUPPOSED to write the arguments backwards. `assertEquals(3, position("c",
"abcdec", 2));` would be the correct way to write this test using
`assertEquals`.
##########
core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java:
##########
@@ -3104,23 +3104,105 @@ public static int position(ByteString seek, ByteString
s) {
/** SQL {@code POSITION(seek IN string FROM integer)} function. */
public static int position(String seek, String s, int from) {
- final int from0 = from - 1; // 0-based
- if (from0 > s.length() || from0 < 0) {
+ if (from == 0) {
+ throw RESOURCE.fromNotZero().ex();
+ }
+ // Case when from is positive
+ if (from > 0) {
+ final int from0 = from - 1; // 0-based
+ if (from0 >= s.length() || from0 < 0) {
+ return 0;
+ }
+
+ return s.indexOf(seek, from0) + 1;
+ }
+ // Case when from is negative
+ final int rightIndex = from + s.length(); // negative position to positive
index
+ if (rightIndex <= 0) {
return 0;
}
-
- return s.indexOf(seek, from0) + 1;
+ return s.substring(0, rightIndex + 1).lastIndexOf(seek) + 1;
}
/** SQL {@code POSITION(seek IN string FROM integer)} function for byte
* strings. */
public static int position(ByteString seek, ByteString s, int from) {
- final int from0 = from - 1;
- if (from0 > s.length() || from0 < 0) {
+ if (from == 0) {
+ throw RESOURCE.fromNotZero().ex();
+ }
+ // Case when from is positive
+ if (from > 0) {
+ final int from0 = from - 1; // 0-based
+ if (from0 >= s.length() || from0 < 0) {
+ return 0;
+ }
+ return s.indexOf(seek, from0) + 1;
+ }
+ // Case when from is negative
+ final int rightIndex = from + s.length();
+ if (rightIndex <= 0) {
return 0;
}
+ int lastIndex = 0;
+ while (lastIndex < rightIndex) {
+ int indexOf = s.substring(lastIndex, rightIndex + 1).indexOf(seek) + 1;
+ if (indexOf == 0) {
+ break;
+ }
+ lastIndex += indexOf;
+ }
+ return lastIndex;
+ }
+
+ /** SQL {@code POSITION(seek, string, from, occurrence)} function. */
+ public static int position(String seek, String s, int from, int occurrence) {
+ if (occurrence == 0) {
+ throw RESOURCE.occurrenceNotZero().ex();
+ }
+ for (int i = 0; i < occurrence; i++) {
+ if (from > 0) {
+ from = position(seek, s, from + (i == 0 ? 0 : 1));
+ if (from == 0) {
+ return 0;
+ }
+ } else {
+ from = position(seek, s, from);
+ if (from == 0) {
+ return 0;
+ }
+ from -= s.length() + 2;
+ }
+ }
+ if (from < 0) {
+ from += s.length() + 2;
+ }
+ return from;
+ }
- return s.indexOf(seek, from0) + 1;
+ /** SQL {@code POSITION(seek, string, from, occurrence)} function for byte
+ * strings. */
+ public static int position(ByteString seek, ByteString s, int from, int
occurrence) {
+ if (occurrence == 0) {
+ throw RESOURCE.occurrenceNotZero().ex();
+ }
+ for (int i = 0; i < occurrence; i++) {
Review Comment:
The call to `position` isn't recursive, thank goodness.
I'd be inclined to inline the `position` function, and move the `if` outside
the `for`. I think you can skip a lot of checks.
I was wondering how to implement this without calling `String.substring`,
but I can't find an easy solution. So let's leave it as is.
##########
core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java:
##########
@@ -3104,23 +3104,106 @@ public static int position(ByteString seek, ByteString
s) {
/** SQL {@code POSITION(seek IN string FROM integer)} function. */
public static int position(String seek, String s, int from) {
- final int from0 = from - 1; // 0-based
- if (from0 > s.length() || from0 < 0) {
+ if (from == 0) {
+ throw RESOURCE.fromNotZero().ex();
+ }
+ // Case when from is positive
+ if (from > 0) {
+ final int from0 = from - 1; // 0-based
+ if (from0 >= s.length() || from0 < 0) {
+ return 0;
+ } else {
+ return s.indexOf(seek, from0) + 1;
+ }
+ }
+ // Case when from is negative
+ final int rightIndex = from + s.length(); // negative position to positive
index
+ if (rightIndex <= 0) {
return 0;
}
-
- return s.indexOf(seek, from0) + 1;
+ return s.substring(0, rightIndex + 1).lastIndexOf(seek) + 1;
}
/** SQL {@code POSITION(seek IN string FROM integer)} function for byte
* strings. */
public static int position(ByteString seek, ByteString s, int from) {
- final int from0 = from - 1;
- if (from0 > s.length() || from0 < 0) {
+ if (from == 0) {
+ throw RESOURCE.fromNotZero().ex();
+ }
+ // Case when from is positive
+ if (from > 0) {
+ final int from0 = from - 1; // 0-based
+ if (from0 >= s.length() || from0 < 0) {
+ return 0;
+ } else {
+ return s.indexOf(seek, from0) + 1;
+ }
+ }
+ // Case when from is negative
+ final int rightIndex = from + s.length();
+ if (rightIndex <= 0) {
return 0;
}
+ int lastIndex = 0;
+ while (lastIndex < rightIndex) {
+ int indexOf = s.substring(lastIndex, rightIndex + 1).indexOf(seek) + 1;
+ if (indexOf == 0) {
+ break;
+ }
+ lastIndex += indexOf;
+ }
+ return lastIndex;
+ }
- return s.indexOf(seek, from0) + 1;
+ /** SQL {@code POSITION(seek, string, from, occurrence)} function. */
+ public static int position(String seek, String s, int from, int occurrence) {
+ if (occurrence == 0) {
+ throw RESOURCE.occurrenceNotZero().ex();
+ }
+ for (int i = 0; i < occurrence; i++) {
Review Comment:
`from` is constant and `i` is not. So would it make sense to move the `if`
outside the `for`, so that it's only checked once?
##########
core/src/test/java/org/apache/calcite/test/SqlFunctionsTest.java:
##########
@@ -1054,6 +1055,54 @@ private void thereAndBack(byte[] bytes) {
// ok
}
}
+ @Test void testPosition() {
Review Comment:
add blankline before
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]