[ https://issues.apache.org/jira/browse/HIVE-26658?focusedWorklogId=836676&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-836676 ]
ASF GitHub Bot logged work on HIVE-26658:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 03/Jan/23 17:18
Start Date: 03/Jan/23 17:18
Worklog Time Spent: 10m
Work Description: zabetak commented on PR #3698:
URL: https://github.com/apache/hive/pull/3698#issuecomment-1370028275
Hey @scarlin-cloudera, since this is a follow-up to a ticket that you
worked on, can you please have a look as well?
Issue Time Tracking
-------------------
Worklog Id: (was: 836676)
Time Spent: 1h 10m (was: 1h)
> INT64 Parquet timestamps cannot be mapped to most Hive numeric types
> --------------------------------------------------------------------
>
> Key: HIVE-26658
> URL: https://issues.apache.org/jira/browse/HIVE-26658
> Project: Hive
> Issue Type: Bug
> Components: Parquet, Serializers/Deserializers
> Affects Versions: 4.0.0-alpha-1
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Minor
> Labels: backwards-compatibility, pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> When attempting to read a Parquet file with a column of primitive type INT64
> and logical type
> [TIMESTAMP|https://github.com/apache/parquet-format/blob/54e53e5d7794d383529dd30746378f19a12afd58/LogicalTypes.md?plain=1#L337]
> an error is raised when the Hive type is neither TIMESTAMP nor BIGINT.
> Consider a Parquet file (e.g., ts_file.parquet) with the following schema:
> {code:json}
> {
>   "name": "eventtime",
>   "type": ["null", {
>     "type": "long",
>     "logicalType": "timestamp-millis"
>   }],
>   "default": null
> }
> {code}
>
> Mapping the column to any of the Hive numeric types TINYINT, SMALLINT, INT,
> FLOAT, DOUBLE, or DECIMAL and then running a SELECT raises an error.
> The following snippet can be used to reproduce the problem.
> {code:sql}
> CREATE TABLE ts_table (eventtime INT) STORED AS PARQUET;
> LOAD DATA LOCAL INPATH 'ts_file.parquet' INTO TABLE ts_table;
> SELECT * FROM ts_table;
> {code}
> This is a regression caused by HIVE-21215. Although HIVE-21215 made it
> possible to read INT64 timestamps as Hive TIMESTAMP, which was not possible
> before, it also broke the mapping to every other Hive numeric type. The
> problem was recently fixed for the BIGINT type only (HIVE-26612).
> The primary goal of this ticket is to restore backward compatibility, since
> these use cases worked before HIVE-21215.
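>
> Until the mapping is restored, one possible workaround is sketched below. It
> assumes a build that already includes HIVE-26612, so that the column can be
> declared as BIGINT, and it converts the value in the query rather than in the
> table schema. The table name ts_table_bigint is hypothetical, and the CAST
> follows Hive's regular conversion rules, so the result may differ from what
> the direct mapping returned before HIVE-21215.
> {code:sql}
> -- Declare the column as BIGINT, the only numeric mapping fixed so far (HIVE-26612).
> CREATE TABLE ts_table_bigint (eventtime BIGINT) STORED AS PARQUET;
> LOAD DATA LOCAL INPATH 'ts_file.parquet' INTO TABLE ts_table_bigint;
> -- Convert to the desired numeric type at query time instead.
> SELECT CAST(eventtime AS DOUBLE) FROM ts_table_bigint;
> {code}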
--
This message was sent by Atlassian Jira
(v8.20.10#820010)