[
https://issues.apache.org/jira/browse/HIVE-27772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Simhadri Govindappa updated HIVE-27772:
---------------------------------------
Description:
For invalid dates such as 2001-02-31, 2023-04-31 etc, UNIX_TIMESTAMP() is
giving out the timestamp value as the last valid date, rather than NULL. (e.g.
UNIX_TIMESTAMP('2001-02-31', 'yyyy-MM-dd') gives 983354400, which converts to
'2001-02-28'. However, for calendar days larger than 31, e.g. 2001-02-32, or
2023-04-32, UNIX_TIMESTAMP() would give NULL as a result.
In Spark and mysql, UNIX_TIMESTMAP for these invalid dates are all NULL (or 0).
{noformat}
6: jdbc:hive2://localhost:10001/> select month, datetimestamp,
unix_timestamp(datetimestamp, 'yyyy-MM-dd') as timestampCol from datetimetable;
INFO : Compiling
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62):
select month, datetimestamp, unix_timestamp(datetimestamp, 'yyyy-MM-dd') as
timestampCol from datetimetable
INFO : No Stats for default@datetimetable, Columns: month, datetimestamp
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:month,
type:string, comment:null), FieldSchema(name:datetimestamp, type:string,
comment:null), FieldSchema(name:timestampcol, type:bigint, comment:null)],
properties:null)
INFO : Completed compiling
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62); Time
taken: 0.102 seconds
INFO : Operation QUERY obtained 0 locks
INFO : Executing
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62):
select month, datetimestamp, unix_timestamp(datetimestamp, 'yyyy-MM-dd') as
timestampCol from datetimetable
INFO : Completed executing
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62); Time
taken: 0.0 seconds
+--------+----------------+---------------+
| month | datetimestamp | timestampcol |
+--------+----------------+---------------+
| Feb | 2001-02-28 | 983318400 |
| Feb | 2001-02-29 | 983318400 |
| Feb | 2001-02-30 | 983318400 |
| Feb | 2001-02-31 | 983318400 |
| Feb | 2001-02-32 | NULL |
+--------+----------------+---------------+
5 rows selected (0.131 seconds){noformat}
was:
{noformat}
6: jdbc:hive2://localhost:10001/> select month, datetimestamp,
unix_timestamp(datetimestamp, 'yyyy-MM-dd') as timestampCol from datetimetable;
INFO : Compiling
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62):
select month, datetimestamp, unix_timestamp(datetimestamp, 'yyyy-MM-dd') as
timestampCol from datetimetable
INFO : No Stats for default@datetimetable, Columns: month, datetimestamp
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:month,
type:string, comment:null), FieldSchema(name:datetimestamp, type:string,
comment:null), FieldSchema(name:timestampcol, type:bigint, comment:null)],
properties:null)
INFO : Completed compiling
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62); Time
taken: 0.102 seconds
INFO : Operation QUERY obtained 0 locks
INFO : Executing
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62):
select month, datetimestamp, unix_timestamp(datetimestamp, 'yyyy-MM-dd') as
timestampCol from datetimetable
INFO : Completed executing
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62); Time
taken: 0.0 seconds
+--------+----------------+---------------+
| month | datetimestamp | timestampcol |
+--------+----------------+---------------+
| Feb | 2001-02-28 | 983318400 |
| Feb | 2001-02-29 | 983318400 |
| Feb | 2001-02-30 | 983318400 |
| Feb | 2001-02-31 | 983318400 |
| Feb | 2001-02-32 | NULL |
+--------+----------------+---------------+
5 rows selected (0.131 seconds){noformat}
> Hive UNIX_TIMESTAMP() not returning null for invalid dates
> ----------------------------------------------------------
>
> Key: HIVE-27772
> URL: https://issues.apache.org/jira/browse/HIVE-27772
> Project: Hive
> Issue Type: Bug
> Reporter: Simhadri Govindappa
> Assignee: Simhadri Govindappa
> Priority: Major
>
> For invalid dates such as 2001-02-31, 2023-04-31 etc, UNIX_TIMESTAMP() is
> giving out the timestamp value as the last valid date, rather than NULL.
> (e.g. UNIX_TIMESTAMP('2001-02-31', 'yyyy-MM-dd') gives 983354400, which
> converts to '2001-02-28'. However, for calendar days larger than 31, e.g.
> 2001-02-32, or 2023-04-32, UNIX_TIMESTAMP() would give NULL as a result.
> In Spark and mysql, UNIX_TIMESTMAP for these invalid dates are all NULL (or
> 0).
>
>
>
> {noformat}
> 6: jdbc:hive2://localhost:10001/> select month, datetimestamp,
> unix_timestamp(datetimestamp, 'yyyy-MM-dd') as timestampCol from
> datetimetable;
> INFO : Compiling
> command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62):
> select month, datetimestamp, unix_timestamp(datetimestamp, 'yyyy-MM-dd') as
> timestampCol from datetimetable
> INFO : No Stats for default@datetimetable, Columns: month, datetimestamp
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:month,
> type:string, comment:null), FieldSchema(name:datetimestamp, type:string,
> comment:null), FieldSchema(name:timestampcol, type:bigint, comment:null)],
> properties:null)
> INFO : Completed compiling
> command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62);
> Time taken: 0.102 seconds
> INFO : Operation QUERY obtained 0 locks
> INFO : Executing
> command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62):
> select month, datetimestamp, unix_timestamp(datetimestamp, 'yyyy-MM-dd') as
> timestampCol from datetimetable
> INFO : Completed executing
> command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62);
> Time taken: 0.0 seconds
> +--------+----------------+---------------+
> | month | datetimestamp | timestampcol |
> +--------+----------------+---------------+
> | Feb | 2001-02-28 | 983318400 |
> | Feb | 2001-02-29 | 983318400 |
> | Feb | 2001-02-30 | 983318400 |
> | Feb | 2001-02-31 | 983318400 |
> | Feb | 2001-02-32 | NULL |
> +--------+----------------+---------------+
> 5 rows selected (0.131 seconds){noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)