Hi Okumin,

thank you very much to checking them I appreciate it!
Now we have information on 5 different DB (MySQL, Oracle, Postgre, Spark,
Trino) and  3 different versions of Hive about this topic and as I see only
Hive 4.0 works like the No. 1 way.


Regards,

Zoltan


On Wed, Sep 18, 2024 at 3:53 PM Okumin <m...@okumin.com> wrote:

> Hi,
>
> I put supplementary information in the ticket, such as examples of
> Hive 2, Hive 3, Spark, Trino, or my research of the ANSI standard.
>
> Regards,
> Okumin
>
> On Wed, Sep 18, 2024 at 7:17 PM Attila Turoczy
> <aturo...@cloudera.com.invalid> wrote:
> >
> > Hi Folks,
> >
> > Let me share my opinion about this change and request. To have an option
> for parse data from string is a very valuable feature especially in OLAP,
> because many of the input data many times come from different formats. It
> is easy and makes sense that the date follows the ISO format. But real life
> is different. Especially on data lakes the sources could come from many
> formats and the ETL job parses excel, json, csv etc to create tables,
> insert rows. For those countries like Hungary where the ISO date format is
> default everywhere the HIVE-27586 would not cause any issue. However, in
> regions such as the US and UK, where date formats like mm/dd/yyyy and
> dd/mm/yyyy are common (e.g., in Excel files), this could present
> challenges. In these cases, users may encounter problems. In my mind
> especially for these formats it has to be null or as an oracle throws an
> error. If users fail to validate the output, this could lead to data
> correctness issues or, worse, potential data loss if source files are
> deleted. Yes, Hive users do not like the behavior change, but one type (one
> of the most popular formats in the US) could lead to more problems.
> >
> > -Attila
> >
> > On Wed, Sep 18, 2024 at 9:43 AM Ayush Saxena <ayush...@gmail.com> wrote:
> >>
> >> Since this is landing in Spam for some reason for most people, I am
> >> explicitly forwarding it rather than replying, so as people can get
> >> it.
> >>
> >> I will vote for (2) considering it was returning `null` earlier as
> >> well & the new behaviour came up recently, it would be a behaviour
> >> change for folks migrating from earlier versions of hive to 4.x & the
> >> new behaviour ain't that fancy either...
> >>
> >> -Ayush
> >>
> >> ---------- Forwarded message ---------
> >> From: Zoltán Rátkai <zrat...@cloudera.com>
> >> Date: Wed, 18 Sept 2024 at 12:55
> >> Subject: Re: IMPORTANT: Hive date parsing issue, input needed
> >> To: <user@hive.apache.org>
> >> Cc: dev <d...@hive.apache.org>
> >>
> >>
> >> Hi Ayush,
> >>
> >> thanks for your thoughts!
> >>
> >> My answers:
> >>
> >> - On the official Hive page about types this is not documented at all:
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=82706456#LanguageManualTypes-date
> >>
> >> "Dates:  DATE values describe a particular year/month/day, in the form
> >> YYYY-­MM-­DD. For example, DATE '2013-­01-­01'. Date types do not have
> >> a time of day component. The range of values supported for the Date
> >> type is 0000-­01-­01 to 9999-­12-­31, dependent on support by the
> >> primitive Java Date type."
> >>
> >> - Since the beginning of Hive it worked like returning null, this was
> >> introduced one year ago as a side effect of this: HIVE-27586 "Parse
> >> dates from strings ignoring trailing (potentialy) invalid chars", so
> >> the goal of this ticket was to be able to handle string to date
> >> parsing like : "2024-09-18 GARBAGE" and parse it to 2024-09-18 as
> >> date.
> >> So it is not a long standing behaviour.
> >>
> >> I made a comparison of other DB (Postre, MySQL, ORACLE) and none of
> >> them handle it like Hive now. Please check it here:
> >> https://issues.apache.org/jira/browse/HIVE-28483
> >> I agree with you to align with those. This is why I asked this question.
> >>
> >> Thanks and regards,
> >>
> >> Zoltan Ratkai
> >>
> >>
> >> On Wed, Sep 18, 2024 at 8:58 AM Ayush Saxena <ayush...@gmail.com>
> wrote:
> >> >
> >> > ++ Adding the dev mailing list, as this might be relevant to them as
> well.
> >> >
> >> > I have a few additional questions:
> >> >
> >> > * Has this behavior been present since the inception of Hive, or did
> >> > it start occurring more recently? If this has been the behavior for
> >> > over a decade, it might be best to leave it unchanged. However, if
> >> > this is a recent development, we should aim to restore the original
> >> > behavior. In my opinion, any change that alters long-standing behavior
> >> > could be considered incompatible unless it was introduced to prevent
> >> > data loss or address a security issue.
> >> > * How do other engines (e.g., Impala, Spark) handle similar scenarios,
> >> > and how do other databases like MySQL and PostgreSQL behave in this
> >> > regard? In the past, we've typically aimed to align with these
> >> > systems, not necessarily exactly, but closely enough—especially in
> >> > cases where there's ambiguity or conflict.
> >> >
> >> >
> >> > -Ayush
> >> >
> >> >
> >> > On Wed, 18 Sept 2024 at 12:00, Zoltán Rátkai <zrat...@cloudera.com>
> wrote:
> >> > >
> >> > > Dear Hive User!
> >> > >
> >> > > In Hive the standard date format is YYYY-MM-DD, like “2024-09-17”.
> >> > > So when casting a string which format is different, like DD-MM-YYYY
> e.g. “17-09-2024" to date, as a Hive user what would you expect as a result?
> >> > >
> >> > >
> >> > > 1.  “0017-09-20”  => Year of 17 September 20
> >> > > or
> >> > > 2. null
> >> > >
> >> > > It is possible to give a date pattern CAST(<string-value> AS DATE
> FORMAT <date-pattern>)and cast with it like "DD-MM-YYYY".
> >> > > Earlier it gave back null, but currently Hive gives back the first,
> which I think unexpected as a user and customers complain about that. There
> is a debate if it is a bug or feature, so I need your help!
> >> > >
> >> > > I would like to ask you to reply with a vote on 1 or 2 what you
> want as a user to happen!
> >> > >
> >> > > I really appreciate your input!
> >> > >
> >> > > Thank you,
> >> > >
> >> > > Zoltan Ratkai
>

Reply via email to