Attila Jeges has posted comments on this change.

Change subject: IMPALA-2716: Hive/Impala incompatibility for timestamp data in 
Parquet
......................................................................


Patch Set 5:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

Line 304:   // Time zone for adjusting timestamp values read from Parquet files 
written by
> single line
That would make the line 92 characters long.


http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 246:       DCHECK((is_timestamp_dependent_timezone_ && timezone_ == NULL) 
||
> Even better, we can check this at the scan node level, e.g., in HdfsScanNod
Simplified the code. The timezone check is done in the FE, the BE is guaranteed 
to have a valid timezone string.


Line 653:   }
> As mentioned above, we should terminate the query early when in such a bad 
Added DCHECK(false)


http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

Line 194:   for (const string& tz_region: tz_region_list_) {
> No need to use auto here. const string& is clearer
Done


http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/exprs/timezone_db.h
File be/src/exprs/timezone_db.h:

Line 40:   /// If 'tz' is not found in the database, nullptr is returned.
> null pointer -> nullptr
Done


Line 46:   /// If 'tz' is not found in the database, nullptr is returned.
> null pointer -> nullptr
Done


http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

Line 199:   /// conversion was successfull and false otherwise. If conversion 
failed *this is set to
> Comment on what the state of *this is when false is returned.
Done


Line 202: 
> Comment on what the state of *this is when false is returned.
Done


http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

Line 357:   const char *tz = env->GetStringUTFChars(timezone, NULL);
> const char*
Done


http://gerrit.cloudera.org:8080/#/c/5939/4/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 122:     "the parquet.mr.int96.write.zone table property to UTC for new 
tables. This changes "
> consistent placing of space (at line end or at line beginning)
Done


http://gerrit.cloudera.org:8080/#/c/5939/3/fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
File fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java:

Line 67:     analyzeJoin(analyzer);
> If we do the validation here, then there is no need to check in each HdfsSc
Done


http://gerrit.cloudera.org:8080/#/c/5939/3/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

Line 184:       if (getFileFormat() == THdfsFileFormat.KUDU) {
> You can check getFileFormat() here. The only non-HDFS formats we have are H
Done


Line 189:       String timezone = 
getTblProperties().get(HdfsTable.TBL_PROP_PARQUET_MR_WRITE_ZONE);
> We should only do this for HDFS tables right?
Done


http://gerrit.cloudera.org:8080/#/c/5939/4/fe/src/main/java/org/apache/impala/service/FeSupport.java
File fe/src/main/java/org/apache/impala/service/FeSupport.java:

Line 87:   // Returns true if timezone String is valid according to the BE 
timezone database, false
> valid according to the BE timezone database
Done


http://gerrit.cloudera.org:8080/#/c/5939/4/tests/custom_cluster/test_parquet_timestamp_compatibility.py
File tests/custom_cluster/test_parquet_timestamp_compatibility.py:

Line 1: # Licensed to the Apache Software Foundation (ASF) under one
> Use new Apache license header. Do we still have the old header in some exis
My bad, I created this test from an old test script.


-- 
To view, visit http://gerrit.cloudera.org:8080/5939
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3f24525ef45a2814f476bdee76655b30081079d6
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Zoltan Ivanfi <[email protected]>
Gerrit-HasComments: Yes

Reply via email to