----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/56334/#review164481 -----------------------------------------------------------
Well, that is a huge patch :D To be honest I run through it, but it was only the first impressions. It might need another run later. Mostly nit's and questions. Check the line lenghts please. Thanks for the patch, Peter ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java (line 155) <https://reviews.apache.org/r/56334/#comment236239> Shouldn't we at least log the exception? It might be useful for transient errors ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java (lines 168 - 172) <https://reviews.apache.org/r/56334/#comment236242> Isn't it very simmilar to getParquetWriterTimeZone? Wouldn't it be usefull to refactor this to the ParquetUtil class? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java (line 175) <https://reviews.apache.org/r/56334/#comment236243> It might be usefull to at least log this settings problem. What do you think? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java (line 205) <https://reviews.apache.org/r/56334/#comment236244> Isn't there a good way to cache this Calendar? Wil we create one for every conversion? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java (line 19) <https://reviews.apache.org/r/56334/#comment236245> nit: * import, please remove ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java (line 5) <https://reviews.apache.org/r/56334/#comment236246> nit: Why is this changed? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java (line 7) <https://reviews.apache.org/r/56334/#comment236247> nit: Why is this changed? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java (line 110) <https://reviews.apache.org/r/56334/#comment236248> nit: Extra line? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java (line 122) <https://reviews.apache.org/r/56334/#comment236249> nit: missing line? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java (line 33) <https://reviews.apache.org/r/56334/#comment236261> nit: Unused? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java (line 49) <https://reviews.apache.org/r/56334/#comment236250> nit: Import with astreix ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedPrimitiveColumnReader.java (lines 418 - 423) <https://reviews.apache.org/r/56334/#comment236251> Is there a way to reuse the calendar instance here? ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java (lines 236 - 240) <https://reviews.apache.org/r/56334/#comment236252> nit: indentation? ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java (lines 247 - 250) <https://reviews.apache.org/r/56334/#comment236253> nit: indentation? ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetTimestampUtils.java (line 226) <https://reviews.apache.org/r/56334/#comment236256> Would not be this flaky? At least dependent on the tester local settings? What happens when running it on Special times when changing between summer/winter time? , - Peter Vary On Feb. 6, 2017, 6:03 p.m., Barna Zsombor Klara wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/56334/ > ----------------------------------------------------------- > > (Updated Feb. 6, 2017, 6:03 p.m.) > > > Review request for hive, Ryan Blue and Sergio Pena. > > > Bugs: HIVE-12767 > https://issues.apache.org/jira/browse/HIVE-12767 > > > Repository: hive-git > > > Description > ------- > > This is a followup on this review request: https://reviews.apache.org/r/41821 > The following exit criteria is addressed in this patch: > > - Hive will read Parquet MR int96 timestamp data and adjust values using a > time zone from a table property, if set, or using the local time zone if it > is absent. No adjustment will be applied to data written by Impala. > - Hive will write Parquet int96 timestamps using a time zone adjustment from > the same table property, if set, or using the local time zone if it is > absent. This keeps the data in the table consistent. > - New tables created by Hive will set the table property to UTC if the global > option to set the property for new tables is enabled. > - Tables created using CREATE TABLE and CREATE TABLE LIKE FILE will not set > the property unless the global setting to do so is enabled. > - Tables created using CREATE TABLE LIKE <OTHER TABLE> will copy the property > of the table that is copied. > > To set the timezone table property, use this: > create table tbl1 (ts timestamp) stored as parquet tblproperties > ('parquet.mr.int96.write.zone'='PST'); > > To set UTC as default timezone table property on new tables created, use > this: > set parquet.mr.int96.enable.utc.write.zone=true; > create table tbl2 (ts timestamp) stored as parquet; > > > Diffs > ----- > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java > d62e527f2fe24760929312897072deba63b77361 > data/files/impala_int96_timestamp.parq PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java > adabe70fa8f0fe1b990c6ac578a14ff5af06fc93 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java > 379a9135d9c631b2f473976b00f3dc87f9fec0c4 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java > 167f9b6516ac093fa30091daf6965de25e3eccb3 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java > 76d93b8e02a98c95da8a534f2820cd3e77b4bb43 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java > 604cbbcc2a9daa8594397e315cc4fd8064cc5005 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java > ac430a67682d3dcbddee89ce132fc0c1b421e368 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetTableUtils.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java > 3fd75d24f3fda36967e4957e650aec19050b22f8 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java > 699de59b58ecbfa0705f042d22e697f1573ea496 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedPrimitiveColumnReader.java > 3d5c6e6a092dd6a0303fadc6a244dad2e31cd853 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java > f4621e5dbb81e8d58c4572c901ec9d1a7ca8c012 > > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java > 6b7b50a25e553629f0f492e964cc4913417cb500 > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java > 934ae9f255d0c4ccaa422054fcc9e725873810d4 > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/VectorizedColumnReaderTestBase.java > 833cfdbaa2ac6a3653a8f3232344f5fb70c80686 > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java > PRE-CREATION > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetTimestampUtils.java > ec6def5b9ac5f12e6a7cb24c4f4998a6ca6b4a8e > > ql/src/test/org/apache/hadoop/hive/ql/io/parquet/timestamp/TestNanoTimeUtils.java > PRE-CREATION > ql/src/test/queries/clientpositive/parquet_int96_timestamp.q PRE-CREATION > ql/src/test/results/clientpositive/parquet_int96_timestamp.q.out > PRE-CREATION > > Diff: https://reviews.apache.org/r/56334/diff/ > > > Testing > ------- > > qtest and unit tests added. > > > Thanks, > > Barna Zsombor Klara > >