The code freeze for the upcoming Spark 2.4 release is very close. How about revisiting this and explicitly defining the support scope of CalendarIntervalType in the next release (Spark 3.0)?
Thanks,

Xiao

2018-07-27 10:45 GMT-07:00 Reynold Xin <r...@databricks.com>:

> CalendarInterval is definitely externally visible.
>
> E.g. sql("select interval 1 day").dtypes would return "Array[(String,
> String)] = Array((interval 1 days,CalendarIntervalType))"
>
> However, I'm not sure what it means to support casting. What are the
> semantics for casting from any other data type to calendar interval? I can
> see string casting and casting from itself, but not any other data types.
>
> On Fri, Jul 27, 2018 at 10:34 AM Daniel Mateus Pires <dmate...@gmail.com>
> wrote:
>
>> Hi Sparkers! (maybe Sparkles?)
>>
>> I just wanted to bring up the apparently "controversial" Calendar
>> Interval topic.
>>
>> I worked on https://issues.apache.org/jira/browse/SPARK-24702 and
>> https://github.com/apache/spark/pull/21706
>>
>> The user was reporting unexpected behaviour: he/she wasn't able to cast
>> to a Calendar Interval type.
>>
>> In the current version of Spark the following code works:
>>
>> scala> spark.sql("SELECT 'interval 1 hour' as a").select(col("a").cast("calendarinterval")).show()
>> +----------------+
>> |               a|
>> +----------------+
>> |interval 1 hours|
>> +----------------+
>>
>> While the following doesn't:
>>
>> spark.sql("SELECT CALENDARINTERVAL('interval 1 hour') as a").show()
>>
>> Since the DataFrame API equivalent of the SQL worked, I thought adding
>> the SQL function would be an easy decision (to make the two consistent).
>>
>> However, I got push-back on the PR on the basis that "we do not plan to
>> expose Calendar Interval as a public type".
>>
>> Should there be a consensus on either removing CalendarIntervalType from
>> the public DataFrame API OR making it consistent with the SQL?
>>
>> --
>> Best regards,
>> Daniel Mateus Pires
>> Data Engineer @ Hudson's Bay Company
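
For reference, the behaviours discussed in this thread collected into one
minimal spark-shell sketch (a sketch only, assuming a Spark 2.x session named
spark, as in spark-shell; the commented outputs are the ones quoted above, and
the exact error raised by the failing SQL call is not quoted in the thread):

    import org.apache.spark.sql.functions.col

    // The type is externally visible through dtypes (Reynold's example):
    spark.sql("SELECT interval 1 day").dtypes
    // Array[(String, String)] = Array((interval 1 days,CalendarIntervalType))

    // Casting a string column to the interval type works through the
    // DataFrame API (Daniel's working example):
    spark.sql("SELECT 'interval 1 hour' AS a")
      .select(col("a").cast("calendarinterval"))
      .show()
    // +----------------+
    // |               a|
    // +----------------+
    // |interval 1 hours|
    // +----------------+

    // ...while the SQL-side equivalent from the thread fails
    // (the thread reports only that it does not work):
    spark.sql("SELECT CALENDARINTERVAL('interval 1 hour') AS a").show()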