tobegit3hub commented on issue #25022: [SPARK-24695][SQL] Move `CalendarInterval` to org.apache.spark.sql.types package

URL: https://github.com/apache/spark/pull/25022#issuecomment-545844418

If we want to use `RANGE BETWEEN` with an `interval`, the `ORDER BY` column is required to have the type `calendarinterval`. I think we should either expose the `calendarinterval` type or accept other types such as `date` in the SQL support.

```sql
SELECT MAX(age) OVER w1 AS maxAge,
       MIN(age) OVER w1 AS minAge
FROM t1
WINDOW w1 AS (
  PARTITION BY gender
  ORDER BY createTime
  RANGE BETWEEN INTERVAL 0 DAYS PRECEDING AND CURRENT ROW)
```

Here is the error message:

```
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve
'(PARTITION BY t1.`gender` ORDER BY t1.`createTime` ASC NULLS FIRST RANGE BETWEEN interval PRECEDING AND CURRENT ROW)'
due to data type mismatch: The data type 'date' used in the order specification does not match
the data type 'calendarinterval' which is used in the range frame.; line 1 pos 83;
'Project [max(age#10) windowspecdefinition(gender#9, createTime#11 ASC NULLS FIRST, specifiedwindowframe(RangeFrame, -interval, currentrow$())) AS maxAge#18, min(age#10) windowspecdefinition(gender#9, createTime#11 ASC NULLS FIRST, specifiedwindowframe(RangeFrame, -interval, currentrow$())) AS minAge#19]
+- SubqueryAlias t1
+- LogicalRDD [graduated#6, s_id#7, name#8, gender#9, age#10, createTime#11], false
```
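One possible workaround, assuming the goal here is only a day-granularity frame over a `date` column, is to sidestep the `calendarinterval` frame entirely: order the window by a numeric day count so a plain integer `RANGE` bound can be used. The `DATE '1970-01-01'` epoch anchor below is an arbitrary choice for illustration; a sketch, not a fix for the underlying type issue:

```sql
-- Order by days-since-epoch instead of the raw date column, so the range
-- frame bound is a plain integer rather than an interval literal.
SELECT MAX(age) OVER w1 AS maxAge,
       MIN(age) OVER w1 AS minAge
FROM t1
WINDOW w1 AS (
  PARTITION BY gender
  ORDER BY DATEDIFF(createTime, DATE '1970-01-01')
  RANGE BETWEEN 0 PRECEDING AND CURRENT ROW)
```

`DATEDIFF(end, start)` returns the number of days between two dates in Spark SQL, so the frame semantics stay at day granularity; finer granularities would still need interval support on the order column.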
