[GitHub] [calcite] macroguo-ghy commented on a diff in pull request #3389: [CALCITE-5935] Add CODE_POINTS_TO_BYTES function (enabled in BigQuery…

2023-09-02 Thread via GitHub


macroguo-ghy commented on code in PR #3389:
URL: https://github.com/apache/calcite/pull/3389#discussion_r1313783485


##
site/_docs/reference.md:
##
@@ -2685,6 +2685,7 @@ BigQuery's type system uses confusingly different names 
for types and functions:
 | b | CEIL(value)| Similar to standard 
`CEIL(value)` except if *value* is an integer type, the return type is a double
 | m s | CHAR(integer)| Returns the character 
whose ASCII code is *integer* % 256, or null if *integer*  0
 | b o p | CHR(integer)   | Returns the character 
whose UTF-8 code is *integer*
+| b | CODE_POINTS_TO_BYTES(integers) | Converts *integers*, an 
array of integers between 0 and 255 inclusive, throws error if any element is 
out of range

Review Comment:
   Hi @herunkang2018, I have made this change, you refered to the outdated code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [calcite] macroguo-ghy commented on a diff in pull request #3389: [CALCITE-5935] Add CODE_POINTS_TO_BYTES function (enabled in BigQuery…

2023-08-29 Thread via GitHub


macroguo-ghy commented on code in PR #3389:
URL: https://github.com/apache/calcite/pull/3389#discussion_r1308734521


##
core/src/main/java/org/apache/calcite/sql/type/OperandTypes.java:
##
@@ -510,6 +510,38 @@ public static SqlOperandTypeChecker variadic(
 }
   };
 
+  public static final SqlSingleOperandTypeChecker ARRAY_OF_INTEGER =

Review Comment:
   IMO, if the argument types of the function is incorrect (such as `array 
['abc ']`), Calcite should throw an exception as early as possible. Using 
`ARRAY_OF_INTEGER` can throw exceptions during validation rather than at 
runtime, avoiding unnecessary conversions and executions. 
   And during function execution, all elements are of type Number, so I change 
`if (codePoint instanceof Number)` to `assert codePoint instanceof Number`.



##
site/_docs/reference.md:
##
@@ -2685,6 +2685,7 @@ BigQuery's type system uses confusingly different names 
for types and functions:
 | b | CEIL(value)| Similar to standard 
`CEIL(value)` except if *value* is an integer type, the return type is a double
 | m s | CHAR(integer)| Returns the character 
whose ASCII code is *integer* % 256, or null if *integer*  0
 | b o p | CHR(integer)   | Returns the character 
whose UTF-8 code is *integer*
+| b | CODE_POINTS_TO_BYTES(integers) | Converts *integers*, an 
array of integers between 0 and 255 inclusive, throws error if any element is 
out of range

Review Comment:
   fixed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [calcite] macroguo-ghy commented on a diff in pull request #3389: [CALCITE-5935] Add CODE_POINTS_TO_BYTES function (enabled in BigQuery…

2023-08-26 Thread via GitHub


macroguo-ghy commented on code in PR #3389:
URL: https://github.com/apache/calcite/pull/3389#discussion_r1306369153


##
core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java:
##
@@ -929,6 +929,29 @@ public static String charFromUtf8(int n) {
 return String.valueOf(Character.toChars(n));
   }
 
+  /**
+   * SQL CODE_POINTS_TO_BYTES function.
+   */
+  public static @Nullable ByteString codePointsToBytes(List codePoints) {

Review Comment:
   @herunkang2018 Thanks for your advice. Calcite often use 
`SqlTypeFamily.INTEGER`(which is collection of `INT_TYPES` in `SqlTypeName`) as 
a integer operand checker instead of `SqlTypeName.INTEGER`. This can easily be 
confusing. So `ARRAY_OF_INTEGER` means an array whose component could be 
`TINYINT, SMALLINT, INTEGER or BIGINT`.
   Back to your comment, array component could be any of the `INT_TYPES`, 
including `BIGINT`(Correspond Long in java). The reason I allow the `BIGINT` 
component to pass this checker is because if `BIGINT` component is invaild 
input type, `code_points_to_byte(array[cast(1 as bigint)])` will fail during 
operand type checking, this does not meet expectations. And  if I execute 
`select code_points_to_bytes(array[2147483648])` in BigQuery, BigQuery will 
throw a exception :`Input arguments of CODE_POINTS_TO_BYTES out of range: 
2147483648` rather than input type error(Actully BigQuery see all integers as 
int64).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [calcite] macroguo-ghy commented on a diff in pull request #3389: [CALCITE-5935] Add CODE_POINTS_TO_BYTES function (enabled in BigQuery…

2023-08-23 Thread via GitHub


macroguo-ghy commented on code in PR #3389:
URL: https://github.com/apache/calcite/pull/3389#discussion_r1302967994


##
testkit/src/main/java/org/apache/calcite/test/SqlOperatorTest.java:
##
@@ -1804,6 +1804,35 @@ void testCastToBoolean(CastType castType, 
SqlOperatorFixture f) {
 consumer);
   }
 
+  @Test void testCodePointsToBytes() {
+final SqlOperatorFixture f = fixture()
+.setFor(SqlLibraryOperators.CODE_POINTS_TO_BYTES, VM_FENNEL, VM_JAVA)
+.withLibrary(SqlLibrary.BIG_QUERY);
+f.checkFails("^code_points_to_bytes('abc')^",
+"Cannot apply 'CODE_POINTS_TO_BYTES' to arguments of type "
++ "'CODE_POINTS_TO_BYTES\\(\\)'\\. "
++ "Supported form\\(s\\): CODE_POINTS_TO_BYTES\\(\\)",
+false);
+f.checkFails("^code_points_to_bytes(array['abc'])^",
+"Cannot apply 'CODE_POINTS_TO_BYTES' to arguments of type "
++ "'CODE_POINTS_TO_BYTES\\(\\)'\\. "
++ "Supported form\\(s\\): CODE_POINTS_TO_BYTES\\(\\)",
+false);
+
+f.checkFails("code_points_to_bytes(array[-1])",
+"Input arguments of CODE_POINTS_TO_BYTES out of range: -1", true);
+f.checkFails("code_points_to_bytes(array[2147483648, 1])",
+"Input arguments of CODE_POINTS_TO_BYTES out of range: 2147483648", 
true);
+
+f.checkString("code_points_to_bytes(array[65, 66, 67, 68])", "41424344", 
"VARBINARY NOT NULL");
+f.checkString("code_points_to_bytes(array[255, 254, 65, 64])", "fffe4140",

Review Comment:
   I add a test for expressions inputs: `code_points_to_bytes(array[1+2, 3, 
4])`. 
   But for the empty array, I try to use `array[]` and `array()` construct a 
emtpy array, but both of them fail. `array[]` requires more then 1 argument, 
`array()` only enabled for SPARKSQL.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [calcite] macroguo-ghy commented on a diff in pull request #3389: [CALCITE-5935] Add CODE_POINTS_TO_BYTES function (enabled in BigQuery…

2023-08-23 Thread via GitHub


macroguo-ghy commented on code in PR #3389:
URL: https://github.com/apache/calcite/pull/3389#discussion_r1302960496


##
core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java:
##
@@ -929,6 +929,29 @@ public static String charFromUtf8(int n) {
 return String.valueOf(Character.toChars(n));
   }
 
+  /**
+   * SQL CODE_POINTS_TO_BYTES function.
+   */
+  public static @Nullable ByteString codePointsToBytes(List codePoints) {

Review Comment:
   Thanks for you review:) @tanclary.
   Actually, the elements of codePoints can be of type Long or Integer. 
Additionally, let's consider an input of `17179869185` (which is equal to `(1L 
<< 34) + 1L`). Although this is an illegal argument, when it is cast to an 
integer (using the `intValue()` method), it will become 1! This will result in 
an incorrect output. So, I believe using this logic is reasonable. WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [calcite] macroguo-ghy commented on a diff in pull request #3389: [CALCITE-5935] Add CODE_POINTS_TO_BYTES function (enabled in BigQuery…

2023-08-23 Thread via GitHub


macroguo-ghy commented on code in PR #3389:
URL: https://github.com/apache/calcite/pull/3389#discussion_r1302943102


##
testkit/src/main/java/org/apache/calcite/test/SqlOperatorTest.java:
##
@@ -1804,6 +1804,35 @@ void testCastToBoolean(CastType castType, 
SqlOperatorFixture f) {
 consumer);
   }
 
+  @Test void testCodePointsToBytes() {
+final SqlOperatorFixture f = fixture()
+.setFor(SqlLibraryOperators.CODE_POINTS_TO_BYTES, VM_FENNEL, VM_JAVA)
+.withLibrary(SqlLibrary.BIG_QUERY);
+f.checkFails("^code_points_to_bytes('abc')^",
+"Cannot apply 'CODE_POINTS_TO_BYTES' to arguments of type "
++ "'CODE_POINTS_TO_BYTES\\(\\)'\\. "
++ "Supported form\\(s\\): CODE_POINTS_TO_BYTES\\(\\)",
+false);
+f.checkFails("^code_points_to_bytes(array['abc'])^",
+"Cannot apply 'CODE_POINTS_TO_BYTES' to arguments of type "
++ "'CODE_POINTS_TO_BYTES\\(\\)'\\. "
++ "Supported form\\(s\\): CODE_POINTS_TO_BYTES\\(\\)",
+false);
+
+f.checkFails("code_points_to_bytes(array[-1])",
+"Input arguments of CODE_POINTS_TO_BYTES out of range: -1", true);
+f.checkFails("code_points_to_bytes(array[2147483648, 1])",
+"Input arguments of CODE_POINTS_TO_BYTES out of range: 2147483648", 
true);
+
+f.checkString("code_points_to_bytes(array[65, 66, 67, 68])", "41424344", 
"VARBINARY NOT NULL");
+f.checkString("code_points_to_bytes(array[255, 254, 65, 64])", "fffe4140",

Review Comment:
   I add a test for expressions inputs: `code_points_to_bytes(array[1+2, 3, 
4])`. 
   For empty array, I try to use `array[]` and `array()` construct a emtpy 
array, but both of them fail. `array[]` requires more then 1 argument, 
`array()` only enabled for SPARKSQL.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [calcite] macroguo-ghy commented on a diff in pull request #3389: [CALCITE-5935] Add CODE_POINTS_TO_BYTES function (enabled in BigQuery…

2023-08-23 Thread via GitHub


macroguo-ghy commented on code in PR #3389:
URL: https://github.com/apache/calcite/pull/3389#discussion_r1302943102


##
testkit/src/main/java/org/apache/calcite/test/SqlOperatorTest.java:
##
@@ -1804,6 +1804,35 @@ void testCastToBoolean(CastType castType, 
SqlOperatorFixture f) {
 consumer);
   }
 
+  @Test void testCodePointsToBytes() {
+final SqlOperatorFixture f = fixture()
+.setFor(SqlLibraryOperators.CODE_POINTS_TO_BYTES, VM_FENNEL, VM_JAVA)
+.withLibrary(SqlLibrary.BIG_QUERY);
+f.checkFails("^code_points_to_bytes('abc')^",
+"Cannot apply 'CODE_POINTS_TO_BYTES' to arguments of type "
++ "'CODE_POINTS_TO_BYTES\\(\\)'\\. "
++ "Supported form\\(s\\): CODE_POINTS_TO_BYTES\\(\\)",
+false);
+f.checkFails("^code_points_to_bytes(array['abc'])^",
+"Cannot apply 'CODE_POINTS_TO_BYTES' to arguments of type "
++ "'CODE_POINTS_TO_BYTES\\(\\)'\\. "
++ "Supported form\\(s\\): CODE_POINTS_TO_BYTES\\(\\)",
+false);
+
+f.checkFails("code_points_to_bytes(array[-1])",
+"Input arguments of CODE_POINTS_TO_BYTES out of range: -1", true);
+f.checkFails("code_points_to_bytes(array[2147483648, 1])",
+"Input arguments of CODE_POINTS_TO_BYTES out of range: 2147483648", 
true);
+
+f.checkString("code_points_to_bytes(array[65, 66, 67, 68])", "41424344", 
"VARBINARY NOT NULL");
+f.checkString("code_points_to_bytes(array[255, 254, 65, 64])", "fffe4140",

Review Comment:
   I add a test for expressions inputs: `code_points_to_bytes(array[1+2, 3, 
4])`. 
   For empty array, I try to use `array[]` and `array()` construct a emtpy 
array, but both of them fail. `array[]` requires more then 1 argument, 
`array()` only enabled for SPARKSQL.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org