[ https://issues.apache.org/jira/browse/IMPALA-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715792#comment-16715792 ]
ASF subversion and git services commented on IMPALA-7853:
---------------------------------------------------------

Commit 56dd5767b87d13e467e88aa20fe33149681afc1e in impala's branch refs/heads/master from [~csringhofer]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=56dd576 ]

IMPALA-7853: Add support to read int64 NANO timestamps from Parquet

PARQUET-1387 added int64 timestamps with nanosecond precision, which store timestamps as nanoseconds since the Unix epoch. As 64 bits are not enough to represent the whole 1400..9999 range of Impala timestamps, this new type works with a limited range:
1677-09-21 00:12:43.145224192 .. 2262-04-11 23:47:16.854775807 UTC

The benefit of the reduced range is that no validation is necessary during scanning, as every possible 64 bit value represents a valid timestamp in Impala. This means it has the potential to be the fastest way to store timestamps in Impala + Parquet.

Another way NANO differs from MICRO and MILLI is that NANO can only be described with the new logical types in Parquet; it has no converted type equivalent. This made implementing CREATE TABLE LIKE PARQUET less trivial than it was for MICRO/MILLI: the type conversion logic in ParquetHelper.java had to be rewritten to use LogicalTypeAnnotation instead of ConvertedType.

The changes on the Java side also made bumping CDH_BUILD_NUMBER necessary.

Testing:
- added a new test file with int64 nano timestamps
- ran core tests

Change-Id: I932396d8646f43c0b9ca4a6359f164c4d8349d8f
Reviewed-on: http://gerrit.cloudera.org:8080/11984
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>

> Add support to read int64 NANO timestamps to the parquet scanner
> ----------------------------------------------------------------
>
> Key: IMPALA-7853
> URL: https://issues.apache.org/jira/browse/IMPALA-7853
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Csaba Ringhofer
> Assignee: Csaba Ringhofer
> Priority: Major
> Labels: parquet
>
> PARQUET-1387 added int64 timestamps with nanosecond precision.
> As 64 bits are not enough to represent the whole 1400..9999 range of Impala
> timestamps, this new type works with a limited range:
> 1677-09-21 00:12:43.145224192 .. 2262-04-11 23:47:16.854775807 UTC
> The benefit of the reduced range is that no validation is necessary during
> scanning, as every possible 64 bit value represents a valid timestamp in
> Impala. This means it has the potential to be the fastest way to store
> timestamps in Impala + Parquet.
> Another way NANO differs from MICRO and MILLI is that NANO can only be
> described with the new logical types in Parquet; it has no converted type
> equivalent.
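
The limited range quoted in the description follows from reading the smallest and largest signed 64-bit integers as nanoseconds since the Unix epoch. The following is a minimal Python sketch of that arithmetic, not Impala's scanner code; the function name format_int64_nanos is hypothetical. Python's datetime only has microsecond resolution, so the sketch converts whole seconds and formats the nanosecond remainder separately.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # Unix epoch, treated as UTC
NANOS_PER_SECOND = 1_000_000_000
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def format_int64_nanos(nanos: int) -> str:
    """Render an int64 nanoseconds-since-epoch value as a UTC timestamp string."""
    if not INT64_MIN <= nanos <= INT64_MAX:
        raise ValueError("value does not fit in a signed 64-bit integer")
    # divmod floors toward negative infinity, so the fractional part stays
    # in [0, 1e9) even for values before 1970.
    seconds, frac = divmod(nanos, NANOS_PER_SECOND)
    moment = EPOCH + timedelta(seconds=seconds)
    # With whole seconds only, isoformat() emits no fractional part,
    # so the 9-digit nanosecond field is appended by hand.
    return f"{moment.isoformat(sep=' ')}.{frac:09d}"


if __name__ == "__main__":
    print(format_int64_nanos(INT64_MIN))  # lower bound of the NANO range
    print(format_int64_nanos(INT64_MAX))  # upper bound of the NANO range
```

Both bounds fall inside Impala's 1400..9999 timestamp range, which is why any int64 value can be accepted without a range check.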