[ 
https://issues.apache.org/jira/browse/HIVE-27772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri Govindappa updated HIVE-27772:
---------------------------------------
    Description: 
For invalid dates such as 2001-02-31, 2023-04-31 etc, UNIX_TIMESTAMP() is 
giving out the timestamp value as the last valid date, rather than NULL. (e.g. 
UNIX_TIMESTAMP('2001-02-31', 'yyyy-MM-dd') gives 983354400, which converts to 
'2001-02-28'. However, for calendar days larger than 31, e.g. 2001-02-32, or 
2023-04-32, UNIX_TIMESTAMP() would give NULL as a result.

In Spark and mysql, UNIX_TIMESTMAP for these invalid dates are all NULL (or 0).

 

 

 
{noformat}
6: jdbc:hive2://localhost:10001/> select month, datetimestamp, 
unix_timestamp(datetimestamp, 'yyyy-MM-dd') as timestampCol from datetimetable;
INFO  : Compiling 
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62): 
select month, datetimestamp, unix_timestamp(datetimestamp, 'yyyy-MM-dd') as 
timestampCol from datetimetable
INFO  : No Stats for default@datetimetable, Columns: month, datetimestamp
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:month, 
type:string, comment:null), FieldSchema(name:datetimestamp, type:string, 
comment:null), FieldSchema(name:timestampcol, type:bigint, comment:null)], 
properties:null)
INFO  : Completed compiling 
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62); Time 
taken: 0.102 seconds
INFO  : Operation QUERY obtained 0 locks
INFO  : Executing 
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62): 
select month, datetimestamp, unix_timestamp(datetimestamp, 'yyyy-MM-dd') as 
timestampCol from datetimetable
INFO  : Completed executing 
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62); Time 
taken: 0.0 seconds
+--------+----------------+---------------+
| month  | datetimestamp  | timestampcol  |
+--------+----------------+---------------+
| Feb    | 2001-02-28     | 983318400     |
| Feb    | 2001-02-29     | 983318400     |
| Feb    | 2001-02-30     | 983318400     |
| Feb    | 2001-02-31     | 983318400     |
| Feb    | 2001-02-32     | NULL          |
+--------+----------------+---------------+
5 rows selected (0.131 seconds){noformat}

  was:
{noformat}
6: jdbc:hive2://localhost:10001/> select month, datetimestamp, 
unix_timestamp(datetimestamp, 'yyyy-MM-dd') as timestampCol from datetimetable;
INFO  : Compiling 
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62): 
select month, datetimestamp, unix_timestamp(datetimestamp, 'yyyy-MM-dd') as 
timestampCol from datetimetable
INFO  : No Stats for default@datetimetable, Columns: month, datetimestamp
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:month, 
type:string, comment:null), FieldSchema(name:datetimestamp, type:string, 
comment:null), FieldSchema(name:timestampcol, type:bigint, comment:null)], 
properties:null)
INFO  : Completed compiling 
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62); Time 
taken: 0.102 seconds
INFO  : Operation QUERY obtained 0 locks
INFO  : Executing 
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62): 
select month, datetimestamp, unix_timestamp(datetimestamp, 'yyyy-MM-dd') as 
timestampCol from datetimetable
INFO  : Completed executing 
command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62); Time 
taken: 0.0 seconds
+--------+----------------+---------------+
| month  | datetimestamp  | timestampcol  |
+--------+----------------+---------------+
| Feb    | 2001-02-28     | 983318400     |
| Feb    | 2001-02-29     | 983318400     |
| Feb    | 2001-02-30     | 983318400     |
| Feb    | 2001-02-31     | 983318400     |
| Feb    | 2001-02-32     | NULL          |
+--------+----------------+---------------+
5 rows selected (0.131 seconds){noformat}


> Hive UNIX_TIMESTAMP() not returning null for invalid dates
> ----------------------------------------------------------
>
>                 Key: HIVE-27772
>                 URL: https://issues.apache.org/jira/browse/HIVE-27772
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Simhadri Govindappa
>            Assignee: Simhadri Govindappa
>            Priority: Major
>
> For invalid dates such as 2001-02-31, 2023-04-31 etc, UNIX_TIMESTAMP() is 
> giving out the timestamp value as the last valid date, rather than NULL. 
> (e.g. UNIX_TIMESTAMP('2001-02-31', 'yyyy-MM-dd') gives 983354400, which 
> converts to '2001-02-28'. However, for calendar days larger than 31, e.g. 
> 2001-02-32, or 2023-04-32, UNIX_TIMESTAMP() would give NULL as a result.
> In Spark and mysql, UNIX_TIMESTMAP for these invalid dates are all NULL (or 
> 0).
>  
>  
>  
> {noformat}
> 6: jdbc:hive2://localhost:10001/> select month, datetimestamp, 
> unix_timestamp(datetimestamp, 'yyyy-MM-dd') as timestampCol from 
> datetimetable;
> INFO  : Compiling 
> command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62): 
> select month, datetimestamp, unix_timestamp(datetimestamp, 'yyyy-MM-dd') as 
> timestampCol from datetimetable
> INFO  : No Stats for default@datetimetable, Columns: month, datetimestamp
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:month, 
> type:string, comment:null), FieldSchema(name:datetimestamp, type:string, 
> comment:null), FieldSchema(name:timestampcol, type:bigint, comment:null)], 
> properties:null)
> INFO  : Completed compiling 
> command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62); 
> Time taken: 0.102 seconds
> INFO  : Operation QUERY obtained 0 locks
> INFO  : Executing 
> command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62): 
> select month, datetimestamp, unix_timestamp(datetimestamp, 'yyyy-MM-dd') as 
> timestampCol from datetimetable
> INFO  : Completed executing 
> command(queryId=root_20231005104216_8520e3e9-d03b-4e34-93ec-ddc280845c62); 
> Time taken: 0.0 seconds
> +--------+----------------+---------------+
> | month  | datetimestamp  | timestampcol  |
> +--------+----------------+---------------+
> | Feb    | 2001-02-28     | 983318400     |
> | Feb    | 2001-02-29     | 983318400     |
> | Feb    | 2001-02-30     | 983318400     |
> | Feb    | 2001-02-31     | 983318400     |
> | Feb    | 2001-02-32     | NULL          |
> +--------+----------------+---------------+
> 5 rows selected (0.131 seconds){noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to