Hello Mich,

> However, if you only need to work with time, you can do something like the following

1. Let's say a Spark SQL user would like to load TIME values stored in
files in the Parquet format, which supports the TIME logical type:
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#time.
None of your proposed workarounds allows loading such data.
2. One of the goals of introducing the new type is to simplify migrations
to Spark SQL from systems that support the TIME type, such as PostgreSQL,
Snowflake, Google SQL, Amazon Redshift, Teradata, and DB2. Rewriting their
existing SQL code to store TIME values as DATE or INTERVAL DAY TO SECOND
looks ugly, IMHO, and can lead to error-prone code (see the sketch below).
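
For illustration, a rough migration sketch (the table and column names are
made up, and the Spark SQL TIME column type is the proposed one from the
SPIP, not something Spark supports today):

    -- PostgreSQL source schema that a user wants to migrate:
    CREATE TABLE shifts (shift_start TIME, shift_end TIME);

    -- Without a TIME type, the migrated Spark SQL schema has to fake it,
    -- e.g. by anchoring every value to a dummy date:
    CREATE TABLE shifts (shift_start TIMESTAMP_NTZ, shift_end TIMESTAMP_NTZ);
    INSERT INTO shifts VALUES
      (TIMESTAMP_NTZ'1970-01-01 09:00:00', TIMESTAMP_NTZ'1970-01-01 17:30:00');

    -- With the proposed type, the source DDL could stay as-is:
    CREATE TABLE shifts (shift_start TIME, shift_end TIME);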

Yours faithfully,
Max Gekk

On Wed, Feb 12, 2025 at 8:13 PM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Not entirely convinced we need it!
>
> For example, Oracle does not have it. Oracle treats date and time as a
> single entity, as they are often used together in real-world applications.
> This approach simplifies many operations, such as sorting, filtering, and
> calculations involving both date and time. However, if you only need to
> work with time, you can do something like the following:
>
>    1. Use DATE or TIMESTAMP to store time, and extract the time portion
>    using TO_CHAR.
>    2. Use INTERVAL DAY TO SECOND for durations or time intervals.
>    3. If you only care about time, you can ignore the date portion or set
>    it to a default value.
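>
> As a rough sketch of options 1 and 2 (table and column names are
> illustrative only, assuming an existing table events(event_ts)):
>
>    -- 1. keep a DATE or TIMESTAMP and render only the time portion
>    SELECT TO_CHAR(event_ts, 'HH24:MI:SS') AS event_time FROM events;
>
>    -- 2. keep a duration since midnight as INTERVAL DAY TO SECOND
>    CREATE TABLE meetings (start_offset INTERVAL DAY TO SECOND);
>    INSERT INTO meetings VALUES (INTERVAL '0 09:30:00' DAY TO SECOND);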
>
>  HTH
>
> Dr Mich Talebzadeh,
> Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> On Wed, 12 Feb 2025 at 18:56, Sakthi <sak...@apache.org> wrote:
>
>> Thanks for the proposal, Max. This looks very promising. I'd also be
>> happy to contribute if it helps with task completion!
>>
>> Regards,
>> Sakthi
>>
>> On Wed, Feb 12, 2025 at 10:36 AM Max Gekk <max.g...@gmail.com> wrote:
>>
>>> Hi Dongjoon,
>>>
>>> > According to SPIP, is this targeting Apache Spark 4.2.0?
>>>
>>> Some tasks could be done in parallel, but if only one person works on
>>> them sequentially, in the worst case it might not be finished until close to 4.2.0.
>>>
>>> Best regards,
>>> Max Gekk
>>>
>>> On Wed, Feb 12, 2025 at 5:48 PM Dongjoon Hyun <dongj...@apache.org>
>>> wrote:
>>>
>>>> According to SPIP, is this targeting Apache Spark 4.2.0?
>>>>
>>>> > Q7. How long will it take?
>>>> > In total it might take around 9 months.
>>>>
>>>> Dongjoon.
>>>>
>>>> On 2025/02/12 09:38:56 Max Gekk wrote:
>>>> > Hi All,
>>>> >
>>>> > I would like to propose a new data type TIME which represents only
>>>> > time values without the date part, as compared to TIMESTAMP_NTZ. The
>>>> > new type should:
>>>> > - simplify migrations of SQL code from other DBMSs where such a type is supported
>>>> > - allow reading/writing it from/to data sources such as Parquet
>>>> > - improve conformance to the SQL standard
>>>> >
>>>> > SPIP: https://issues.apache.org/jira/browse/SPARK-51162
>>>> >
>>>> > Your comments and feedback would be greatly appreciated.
>>>> >
>>>> > Yours faithfully,
>>>> > Max Gekk
>>>> >
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>
>>>>
