Sounds good! @Xiao

@Reynold AFAIK the only data type that is valid to cast to Calendar
Interval is VARCHAR

Here is Postgres:

postgres=# select CAST(CAST(interval '1 hour' AS varchar) AS interval);
 interval
----------
 01:00:00
(1 row)

(snippet comes from the JIRA)

Thanks,

Daniel


On 27 July 2018 at 20:38, Xiao Li <gatorsm...@gmail.com> wrote:

> The code freeze of the upcoming release Spark 2.4 is very close. How about
> revisiting this and explicitly defining the support scope
> of CalendarIntervalType in the next release (Spark 3.0)?
>
> Thanks,
>
> Xiao
>
>
> 2018-07-27 10:45 GMT-07:00 Reynold Xin <r...@databricks.com>:
>
>> CalendarInterval is definitely externally visible.
>>
>> E.g. sql("select interval 1 day").dtypes would return "Array[(String,
>> String)] = Array((interval 1 days,CalendarIntervalType))"
>>
>> However, I'm not sure what it means to support casting. What are the
>> semantics for casting from any other data type to calendar interval? I can
>> see string casting and casting from itself, but not any other data types.
>>
>>
>>
>>
>> On Fri, Jul 27, 2018 at 10:34 AM Daniel Mateus Pires <dmate...@gmail.com>
>> wrote:
>>
>>> Hi Sparkers! (maybe Sparkles ?)
>>>
>>> I just wanted to bring up the apparently "controversial" Calendar
>>> Interval topic.
>>>
>>> I worked on: https://issues.apache.org/jira/browse/SPARK-24702,
>>> https://github.com/apache/spark/pull/21706
>>>
>>> The user was reporting an unexpected behaviour where he/she wasn’t able
>>> to cast to a Calendar Interval type.
>>>
>>> In the current version of Spark the following code works:
>>>
>>> scala> spark.sql("SELECT 'interval 1 hour' as a").select(col("a").cast("calendarinterval")).show()
>>> +----------------+
>>> |               a|
>>> +----------------+
>>> |interval 1 hours|
>>> +----------------+
>>>
>>>
>>> While the following doesn’t:
>>> spark.sql("SELECT CALENDARINTERVAL('interval 1 hour') as a").show()
>>>
>>>
>>> Since the DataFrame API equivalent of the SQL worked, I thought adding
>>> it would be an easy decision to make (to make it consistent)
>>>
>>> However, I got push-back on the PR on the basis that “*we do not plan
>>> to expose Calendar Interval as a public type*”.
>>> Should there be a consensus on either removing CalendarIntervalType from
>>> the public DataFrame API OR making it consistent with the SQL?
>>>
>>> --
>>> Best regards,
>>> Daniel Mateus Pires
>>> Data Engineer @ Hudson's Bay Company
>>>
>>
>
