[
https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15583114#comment-15583114
]
ASF GitHub Bot commented on DRILL-4373:
---------------------------------------
Github user vdiravka commented on a diff in the pull request:
https://github.com/apache/drill/pull/600#discussion_r83710133
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -754,15 +764,45 @@ public void testImpalaParquetVarBinary_DictChange() throws Exception {
     compareParquetReadersColumnar("field_impala_ts", "cp.`parquet/int96_dict_change.parquet`");
   }
+  @Test
+  public void testImpalaParquetBinaryTimeStamp_DictChange() throws Exception {
+    try {
+      test("alter session set %s = true", ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP);
+      compareParquetReadersColumnar("field_impala_ts", "cp.`parquet/int96_dict_change.parquet`");
--- End diff --
1. Is it better to compare the result against baseline columns and values taken
from the file, or is it OK to compare against a `sqlBaselineQuery` run with the
new `PARQUET_READER_INT96_AS_TIMESTAMP` option disabled?
2. While investigating this test I found that the primitive data type of the
column in the file `int96_dict_change.parquet` is BINARY, not INT96.
I am a little confused by this. Do we need to convert this BINARY to
TIMESTAMP as well?
The CONVERT_FROM function with the IMPALA_TIMESTAMP argument works properly for
this field.
I will investigate further whether Impala and Hive can store timestamps as
Parquet BINARY.
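For reference while discussing point 2, here is a minimal sketch of how a
12-byte Impala/Hive-style timestamp value is commonly decoded. It assumes the
usual layout (8 bytes of nanoseconds within the day followed by a 4-byte Julian
day number, both little-endian) and would apply to the raw bytes whether the
column's physical type is INT96 or BINARY. The class and method names are
hypothetical and this is not Drill's reader code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

// Hypothetical helper for illustration only; not part of Drill.
class Int96TimestampSketch {
    // Julian day number of 1970-01-01, the Unix epoch.
    static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;
    static final long SECONDS_PER_DAY = 86_400L;

    // Decodes 12 bytes: nanos-of-day (8 bytes, LE) then Julian day (4 bytes, LE).
    // Assumption: the value carries no time zone and is treated as UTC here.
    static Instant decode(byte[] value) {
        if (value.length != 12) {
            throw new IllegalArgumentException("expected 12 bytes, got " + value.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(value).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong();
        long julianDay = buf.getInt();
        long epochSeconds = (julianDay - JULIAN_DAY_OF_EPOCH) * SECONDS_PER_DAY;
        // ofEpochSecond normalizes a nanosecond adjustment larger than one second.
        return Instant.ofEpochSecond(epochSeconds, nanosOfDay);
    }

    // Builds a 12-byte value in the same layout, used here only to produce sample input.
    static byte[] encode(long julianDay, long nanosOfDay) {
        return ByteBuffer.allocate(12)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(nanosOfDay)
                .putInt((int) julianDay)
                .array();
    }

    public static void main(String[] args) {
        // One day after the epoch, plus one hour expressed in nanoseconds.
        byte[] raw = encode(2_440_589L, 3_600_000_000_000L);
        System.out.println(decode(raw)); // prints 1970-01-02T01:00:00Z
    }
}
```

If the BINARY column in `int96_dict_change.parquet` holds values in this layout,
that would be consistent with CONVERT_FROM and IMPALA_TIMESTAMP working on it,
but this still needs to be confirmed against the file.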
> Drill and Hive have incompatible timestamp representations in parquet
> ---------------------------------------------------------------------
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Hive, Storage - Parquet
> Affects Versions: 1.8.0
> Reporter: Rahul Challapalli
> Assignee: Karthikeyan Manivannan
> Labels: doc-impacting
> Fix For: 1.9.0
>
>
> git.commit.id.abbrev=83d460c
> I created a Parquet file with a timestamp type using Drill. Now if I define a
> Hive table on top of the Parquet file and use "timestamp" as the column type,
> Drill fails to read the Hive table through the Hive storage plugin.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)