Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/600#discussion_r83908798

--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -739,30 +739,54 @@ public void runTestAndValidate(String selection, String validationSelection, Str
   }

   /*
-  Test the reading of an int96 field. Impala encodes timestamps as int96 fields
+    Impala encodes timestamp values as int96 fields. Test the reading of an int96 field with two converters:
+    the first one converts parquet INT96 into drill VARBINARY and the second one (works while
+    store.parquet.reader.int96_as_timestamp option is enabled) converts parquet INT96 into drill TIMESTAMP.
    */
   @Test
   public void testImpalaParquetInt96() throws Exception {
     compareParquetReadersColumnar("field_impala_ts", "cp.`parquet/int96_impala_1.parquet`");
+    try {
+      test("alter session set %s = true", ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP);
+      compareParquetReadersColumnar("field_impala_ts", "cp.`parquet/int96_impala_1.parquet`");
--- End diff --

Github seems to have swallowed the previous comments, so I am including @vdiravka's questions here:

> 1) Is it better to compare result with baseline columns and values from the file or it is ok to compare with sqlBaselineQuery and disabled new PARQUET_READER_INT96_AS_TIMESTAMP option?
> In the process of investigating this test I found that the primitive data type of the column in the file int96_dict_change.parquet is BINARY, not INT96.
> 2) I am a little bit confused with this. Do we need convert this BINARY to TIMESTAMP as well? CONVERT_FROM function with IMPALA_TIMESTAMP argument works properly for this field. I will investigate a little more about does impala and hive can store timestamps into parquet BINARY.

For 1) I think it is better to compare against the values from the file, as opposed to running with PARQUET_READER_INT96_AS_TIMESTAMP disabled.

For 2) Can you correct the int96 data in the file?
AFAIK, the data should be int96 for the test.
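
For reference when building baseline values from the file, here is a minimal sketch of how an Impala-style INT96 timestamp is commonly laid out: 8 bytes of nanoseconds within the day followed by 4 bytes of Julian day number, both little-endian. This is not Drill's converter; the class and method names (Int96TimestampSketch, int96ToEpochMillis, encode) are hypothetical, and time zone adjustment is deliberately left out.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Drill: decodes a 12-byte INT96 timestamp.
public class Int96TimestampSketch {
  // Julian day number of 1970-01-01.
  static final long JULIAN_DAY_OF_EPOCH = 2440588L;
  static final long MILLIS_PER_DAY = 86_400_000L;
  static final long NANOS_PER_MILLI = 1_000_000L;

  // Assumed layout: bytes 0-7 = nanoseconds of day, bytes 8-11 = Julian day, little-endian.
  static long int96ToEpochMillis(byte[] int96) {
    if (int96.length != 12) {
      throw new IllegalArgumentException("INT96 value must be 12 bytes, got " + int96.length);
    }
    ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
    long nanosOfDay = buf.getLong();
    long julianDay = buf.getInt();
    // Sub-millisecond precision is truncated here.
    return (julianDay - JULIAN_DAY_OF_EPOCH) * MILLIS_PER_DAY + nanosOfDay / NANOS_PER_MILLI;
  }

  // Builds a 12-byte value in the same assumed layout, for illustration only.
  static byte[] encode(int julianDay, long nanosOfDay) {
    return ByteBuffer.allocate(12)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putLong(nanosOfDay)
        .putInt(julianDay)
        .array();
  }

  public static void main(String[] args) {
    // 1970-01-02 01:00:00 expressed as Julian day 2440589 plus one hour of nanoseconds.
    byte[] value = encode(2440589, 3_600_000_000_000L);
    System.out.println(int96ToEpochMillis(value)); // prints 90000000
  }
}
```

A column whose primitive type is BINARY rather than INT96 would not necessarily follow this 12-byte layout, which is one reason the test file should carry real int96 data.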