AngersZhuuuu commented on a change in pull request #32266:
URL: https://github.com/apache/spark/pull/32266#discussion_r619736559
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
##########
@@ -93,6 +93,23 @@ object IntervalUtils {
private val yearMonthPattern = "^([+|-])?(\\d+)-(\\d+)$".r
+ private val yearMonthFuzzyPattern = "[^-|+]*?([+|-]?[\\d]+-[\\d]+).*".r
Review comment:
I am discussing this with @MaxGekk at
https://github.com/apache/spark/pull/32266#discussion_r619231878.
Do you have any other suggestions?
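For reference, a minimal sketch of what the fuzzy pattern in this hunk captures. The pattern string is copied from the diff; the object and the `extract` helper are hypothetical names used only for illustration and are not part of the PR.

```scala
object FuzzyPatternSketch {
  // Pattern copied verbatim from the diff above.
  private val yearMonthFuzzyPattern = "[^-|+]*?([+|-]?[\\d]+-[\\d]+).*".r

  // Hypothetical helper: returns the captured 'y-m' body when the whole
  // input matches the pattern, otherwise None.
  def extract(input: String): Option[String] = input match {
    case yearMonthFuzzyPattern(body) => Some(body)
    case _ => None
  }
}
```

With this sketch, `FuzzyPatternSketch.extract("INTERVAL '-1-2' YEAR TO MONTH")` yields `Some("-1-2")`: the lazy prefix skips everything up to the first place where a signed `y-m` body can start.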
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
##########
@@ -93,6 +93,24 @@ object IntervalUtils {
private val yearMonthPattern = "^([+|-])?(\\d+)-(\\d+)$".r
+ def safeFromYearMonthString(input: UTF8String): Option[Int] = {
+ try {
+ if (input == null || input.toString == null) {
+ throw new IllegalArgumentException("Interval year-month string must be not null")
+ } else {
+ val regex = "INTERVAL '([-|+]?[0-9]+-[-|+]?[0-9]+)' YEAR TO MONTH".r
Review comment:
> but ideally we should respect the SQL config `spark.sql.caseSensitive`):

The Spark Catalyst parser does not respect `spark.sql.caseSensitive` either.
Should we respect it here?
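To make the question concrete, here is a sketch of a case-insensitive variant of the regex in this hunk. It assumes the goal is simply to accept any letter case regardless of `spark.sql.caseSensitive`; the object and method names are hypothetical and not part of the PR.

```scala
object CaseInsensitiveIntervalSketch {
  // Same pattern as in the diff, prefixed with (?i), the java.util.regex
  // embedded flag that turns on case-insensitive matching.
  private val regex =
    "(?i)INTERVAL '([-|+]?[0-9]+-[-|+]?[0-9]+)' YEAR TO MONTH".r

  // Hypothetical helper: returns the quoted body when the trimmed input
  // matches the whole pattern, otherwise None.
  def extractBody(input: String): Option[String] = input.trim match {
    case regex(body) => Some(body)
    case _ => None
  }
}
```

Under this assumption both `INTERVAL '1-2' YEAR TO MONTH` and `interval '1-2' year to month` are accepted, while a bare `1-2` is left to the plain `yearMonthPattern` branch.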
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
##########
@@ -108,17 +119,32 @@ object IntervalUtils {
} catch {
case NonFatal(e) =>
throw new IllegalArgumentException(
- s"Error parsing interval year-month string: ${e.getMessage}", e)
+ s"$errorPrefix: ${e.getMessage}", e)
}
}
+
input.trim match {
case yearMonthPattern("-", yearStr, monthStr) =>
toInterval(yearStr, monthStr, -1)
case yearMonthPattern(_, yearStr, monthStr) =>
toInterval(yearStr, monthStr, 1)
case _ =>
- throw new IllegalArgumentException(
- s"Interval string does not match year-month format of 'y-m': $input")
+ try {
+ CatalystSqlParser.parseExpression(input) match {
+ case Literal(value: Int, _: YearMonthIntervalType) => new CalendarInterval(value, 0, 0)
+ case Literal(value: CalendarInterval, _: CalendarIntervalType) => value
+ case _ => throw new IllegalArgumentException(
+ s"Interval string does not match year-month format of 'y-m': $input")
+ }
+ } catch {
+ case NonFatal(e) =>
+ if (e.getMessage.contains(errorPrefix)) {
+ throw new IllegalArgumentException(e.getMessage)
Review comment:
One weird thing here is that the real message is:
```
Error parsing interval year-month string: integer overflow(line 1, pos 9)
== SQL ==
INTERVAL '-178956970-9' YEAR TO MONTH
---------^^^
```
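As a side note on why this input overflows, here is a worked sketch of the arithmetic. It assumes the total is computed as `sign * (years * 12 + months)` and narrowed to `Int` with an overflow check; the helper below is hypothetical and may not match the actual `toInterval` implementation.

```scala
object OverflowSketch {
  // Hypothetical helper: total months for a signed 'y-m' value, computed
  // in Long and narrowed with Math.toIntExact, which throws
  // ArithmeticException("integer overflow") when the value does not fit.
  def totalMonths(sign: Int, years: Long, months: Long): Int =
    Math.toIntExact(sign * (years * 12L + months))
}
```

For `'-178956970-9'` the total is -(178956970 * 12 + 9) = -2147483649, one below `Int.MinValue` (-2147483648), so the narrowing fails. `'-178956970-8'` is the smallest value that still fits.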