MaxGekk commented on a change in pull request #32266:
URL: https://github.com/apache/spark/pull/32266#discussion_r617316490
##########
File path:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
##########
@@ -1774,6 +1774,19 @@ class CastSuite extends CastSuiteBase {
assert(e3.contains("Casting 2147483648 to int causes overflow"))
}
}
+
+ test("SPARK-35111: Cast string to year-month interval") {
Review comment:
Could you add round trip tests: string -> year-month interval -> string,
and year-month interval -> string -> year-month interval
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
##########
@@ -93,6 +93,24 @@ object IntervalUtils {
private val yearMonthPattern = "^([+|-])?(\\d+)-(\\d+)$".r
+ def safeFromYearMonthString(input: UTF8String): Option[Int] = {
+ try {
+ if (input == null || input.toString == null) {
+ throw new IllegalArgumentException("Interval year-month string must be
not null")
+ } else {
+ val regex = "INTERVAL '([-|+]?[0-9]+-[-|+]?[0-9]+)' YEAR TO MONTH".r
Review comment:
Also could you move the regexp out of the method. So, we can avoid its
"compilation" per every call.
##########
File path:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala
##########
@@ -1774,6 +1774,19 @@ class CastSuite extends CastSuiteBase {
assert(e3.contains("Casting 2147483648 to int causes overflow"))
}
}
+
+ test("SPARK-35111: Cast string to year-month interval") {
Review comment:
Could you test corner cases when arithmetic overflow happens. Also test
upper and lower cases.
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
##########
@@ -93,6 +93,24 @@ object IntervalUtils {
private val yearMonthPattern = "^([+|-])?(\\d+)-(\\d+)$".r
+ def safeFromYearMonthString(input: UTF8String): Option[Int] = {
+ try {
+ if (input == null || input.toString == null) {
+ throw new IllegalArgumentException("Interval year-month string must be
not null")
+ } else {
+ val regex = "INTERVAL '([-|+]?[0-9]+-[-|+]?[0-9]+)' YEAR TO MONTH".r
Review comment:
The keywords `INTERVAL` and `YEAR TO MONTH` are not mandatory by SQL
standard. We should be able to parse only payload too.
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
##########
@@ -93,6 +93,24 @@ object IntervalUtils {
private val yearMonthPattern = "^([+|-])?(\\d+)-(\\d+)$".r
+ def safeFromYearMonthString(input: UTF8String): Option[Int] = {
+ try {
+ if (input == null || input.toString == null) {
+ throw new IllegalArgumentException("Interval year-month string must be
not null")
+ } else {
+ val regex = "INTERVAL '([-|+]?[0-9]+-[-|+]?[0-9]+)' YEAR TO MONTH".r
Review comment:
Regarding to the pattern `'([-|+]?[0-9]+-[-|+]?[0-9]+)'`, should we
handle the second sign before month? Take a look at the val above
`yearMonthPattern`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]