[Impala-ASF-CR] IMPALA-8428: Add support for caching file handles on s3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13221 ) Change subject: IMPALA-8428: Add support for caching file handles on s3 .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4161/ -- To view, visit http://gerrit.cloudera.org:8080/13221 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b304d37bc724377fbe7955441cce0cec6fb7f19 Gerrit-Change-Number: 13221 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 07 May 2019 02:42:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8369 : Fix for tests failing with incompatible column changes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13254 ) Change subject: IMPALA-8369 : Fix for tests failing with incompatible column changes .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3095/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13254 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I488121f21d9b35d33dd003b2670bc0bbe1fee4b6 Gerrit-Change-Number: 13254 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 07 May 2019 01:37:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8369 : Fix for tests failing with incompatible column changes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13254 ) Change subject: IMPALA-8369 : Fix for tests failing with incompatible column changes .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/13254/2/fe/src/test/resources/hive-site.xml.py File fe/src/test/resources/hive-site.xml.py: http://gerrit.cloudera.org:8080/#/c/13254/2/fe/src/test/resources/hive-site.xml.py@85 PS2, Line 85: p flake8: E501 line too long (92 > 90 characters) -- To view, visit http://gerrit.cloudera.org:8080/13254 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I488121f21d9b35d33dd003b2670bc0bbe1fee4b6 Gerrit-Change-Number: 13254 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 07 May 2019 00:52:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8369 : Fix for tests failing with incompatible column changes
Vihang Karajgaonkar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13254 ) Change subject: IMPALA-8369 : Fix for tests failing with incompatible column changes .. IMPALA-8369 : Fix for tests failing with incompatible column changes In Hive-3 the configuration for allowing users to make incompatible column type changes was disabled by default. In Hive-2 this was allowed. Some of the tests, like data_errors/test_data_errors.py and metadata/test_compute_stats.py, make changes to column types which are disallowed by HMS-3 by default. This change adds a configuration option in hive-site.xml to allow making incompatible changes to column types so that we can run the existing tests with HMS-3. Also, in HMS-3 there are certain new event types (OPEN_TXN, COMMIT_TXN, etc.) which may not have dbname set. This breaks the assumption in the code in EventProcessor, which expects dbName_ to be non-null at all times. This patch also makes changes in the EventProcessor so that such Ignored events do not fail precondition checks during event processing. Change-Id: I488121f21d9b35d33dd003b2670bc0bbe1fee4b6 --- M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java M fe/src/test/resources/hive-site.xml.py 2 files changed, 8 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/13254/2 -- To view, visit http://gerrit.cloudera.org:8080/13254 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I488121f21d9b35d33dd003b2670bc0bbe1fee4b6 Gerrit-Change-Number: 13254 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar
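For readers unfamiliar with the setup, the hive-site.xml part of the change above might look roughly like the sketch below. It assumes the test configuration is a Python mapping of property names to values that gets rendered to XML (suggested by the file name hive-site.xml.py), and that the relevant metastore property is `hive.metastore.disallow.incompatible.col.type.changes`. The message does not name the property, and `render_hive_site` is a hypothetical helper, not code from the patch.

```python
from xml.sax.saxutils import escape

# Assumption: the property name is not stated in the change description.
# Setting it to "false" permits incompatible column type changes in HMS.
CONFIG = {
    "hive.metastore.disallow.incompatible.col.type.changes": "false",
}


def render_hive_site(config):
    """Render a {name: value} mapping as a hive-site.xml document (hypothetical helper)."""
    lines = ["<configuration>"]
    for name, value in sorted(config.items()):
        lines.append("  <property>")
        lines.append("    <name>%s</name>" % escape(name))
        lines.append("    <value>%s</value>" % escape(value))
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_hive_site(CONFIG))
```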
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. IMPALA-7370: DATE: Read/Write to parquet. This change is a follow-up to IMPALA-7368 and adds support for DATE type to the parquet scanner/writer. CREATE TABLE LIKE PARQUET statements associated with data files that contain dates are also supported. Parquet uses DATE logical type for dates. DATE logical type annotates an INT32 that stores the number of days from the Unix epoch, 1 January 1970. This representation introduces a parquet interoperability issue between Impala and older versions of Hive: - Before version 3.1, Hive used Julian calendar to represent dates up to 1582-10-05 and Gregorian calendar for dates starting with 1582-10-15. Dates between 1582-10-05 and 1582-10-15 were lost. - Impala uses proleptic Gregorian calendar, extending the Gregorian calendar backward to dates preceding its official introduction in 1582-10-15. This means that pre-1582-10-15 dates written to a parquet table by Hive will be read back incorrectly by Impala and vice versa. Note that Hive 3.1 switched to proleptic Gregorian calendar too, so for Hive 3.1+ this is no longer an issue. 
Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Reviewed-on: http://gerrit.cloudera.org:8080/13189 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/exec/hdfs-table-sink.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-stats.cc M be/src/exec/parquet/parquet-column-stats.h M be/src/exec/parquet/parquet-column-stats.inline.h M be/src/exec/parquet/parquet-common.h M be/src/exec/parquet/parquet-metadata-utils.cc M be/src/util/bit-packing.cc M common/thrift/generate_error_codes.py M fe/src/main/java/org/apache/impala/analysis/ParquetHelper.java M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M testdata/data/README A testdata/data/hive2_pre_gregorian.parquet A testdata/data/out_of_range_date.parquet M testdata/datasets/functional/schema_constraints.csv A testdata/workloads/functional-query/queries/QueryTest/date-fileformat-support.test D testdata/workloads/functional-query/queries/QueryTest/date-text-only-support.test A testdata/workloads/functional-query/queries/QueryTest/hive2-pre-gregorian-date.test A testdata/workloads/functional-query/queries/QueryTest/out-of-range-date.test M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test M tests/common/impala_connection.py M tests/custom_cluster/test_parquet_page_index.py M tests/query_test/test_date_queries.py M tests/query_test/test_insert_parquet.py M tests/query_test/test_scanners.py 29 files changed, 465 insertions(+), 148 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master 
Gerrit-MessageType: merged Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 8 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
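The DATE representation described in the IMPALA-7370 commit message above (an INT32 counting days from 1 January 1970 in the proleptic Gregorian calendar) can be illustrated with a small sketch. Python's `datetime.date` also uses the proleptic Gregorian calendar, so it stands in for the Impala behaviour here; the helper names are hypothetical and are not taken from the patch.

```python
from datetime import date, timedelta

UNIX_EPOCH = date(1970, 1, 1)


def date_to_parquet_days(d):
    """Days since the Unix epoch, as stored in the INT32 behind a DATE annotation."""
    return (d - UNIX_EPOCH).days


def parquet_days_to_date(days):
    """Interpret a stored day count using the proleptic Gregorian calendar."""
    return UNIX_EPOCH + timedelta(days=days)


if __name__ == "__main__":
    cutover = date_to_parquet_days(date(1582, 10, 15))
    print(cutover)                            # day count of the Gregorian introduction
    print(parquet_days_to_date(cutover - 1))  # the day before, read proleptically
```

The day immediately before 1582-10-15 is read back as 1582-10-14 in the proleptic Gregorian calendar, whereas a hybrid Julian/Gregorian calendar labels that same day 1582-10-04. That ten-day difference is the kind of mismatch the commit message warns about for pre-1582-10-15 dates.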
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 7 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 May 2019 00:36:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8369 (part 4): Hive 3: fixes for functional dataset loading
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13251 ) Change subject: IMPALA-8369 (part 4): Hive 3: fixes for functional dataset loading .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/13251/1/testdata/bin/load-dependent-tables.sql File testdata/bin/load-dependent-tables.sql: http://gerrit.cloudera.org:8080/#/c/13251/1/testdata/bin/load-dependent-tables.sql@a115 PS1, Line 115: Some of the tests rely on the fact that this table exists. Perhaps we should also ignore/modify such tests if we are running against hive-3. Running git grep "hive_index_tbl" shows that this is used in CatalogObjectToFromThriftTest, CatalogTest and FrontendTest -- To view, visit http://gerrit.cloudera.org:8080/13251 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic34930dc064da3136dde4e01a011d14db6a74ecd Gerrit-Change-Number: 13251 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 07 May 2019 00:34:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-966: Type errors are attributed to wrong expression with insert
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/13050 ) Change subject: IMPALA-966: Type errors are attributed to wrong expression with insert .. Patch Set 5: (11 comments) http://gerrit.cloudera.org:8080/#/c/13050/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13050/5//COMMIT_MSG@7 PS5, Line 7: IMPALA-966: Type errors are attributed to wrong expression with insert : : When insert multiple incompatible type values into a table, : error message should blame on the correct expression. If there : are multiple incompatible type values for a single target : column, error should blame on the first widest incompatible type : expression. how about : IMPALA-966: Attribute type errors to the right expression in an insert statement Currently if an insert statement contains multiple expressions that are incompatible with the column type, the error message returned attributes the error to the wrong expression. This patch makes sure the right expression is blamed. If there are multiple incompatible type values for the target column, then the error is attributed to the first widest (highest precision) incompatible type expression. http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java: http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@692 PS5, Line 692: // If the queryStmt_ is a unionStmt, it will return a WidestExprs list : // when do castToUnionCompatibleTypes(). : // widestTypeExpr will be null if the queryStmt_ is a SelectStmt nit: superfluous comment http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java@695 PS5, Line 695: UnionStmt unionStmt = : (queryStmt_ instanceof UnionStmt) ? 
(UnionStmt) queryStmt_ : null; : if (unionStmt != null && unionStmt.getWidestExprs() != null : && unionStmt.getWidestExprs().size() > 0) { : widestTypeExpr = unionStmt.getWidestExprs().get(i); : } nit: instead of doing this in every loop maybe just get the widestExprList before the loop and use it if not null http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java File fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java: http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java@292 PS5, Line 292: null nit: remove the comment above and add an inline comment here like .., analyzer.isDecimalV2(), null /*widestTypeSrcExpr*/); http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/StatementBase.java File fe/src/main/java/org/apache/impala/analysis/StatementBase.java: http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/StatementBase.java@196 PS5, Line 196: widestTypeSrcExpr nit: add quotes since this refers to an input param http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/StatementBase.java@196 PS5, Line 196: for nit: among http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/StatementBase.java@197 PS5, Line 197: Error message should blame on the widestTypeSrcExpr instead of the first :* compatible source expression. 
nit: is only used when constructing an AnalysisException message to make sure the right expression is blamed in the error message http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java File fe/src/main/java/org/apache/impala/analysis/UnionStmt.java: http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/main/java/org/apache/impala/analysis/UnionStmt.java@56 PS5, Line 56: // widestExprs_ is a list of the first widest compatible expression for each column nit: you can remove the first line and write "widest (highest precision)" here. Also, can you mention what order they are stored in and add a full stop at the end http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@3404 PS5, Line 3404: on nit: the http://gerrit.cloudera.org:8080/#/c/13050/5/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@3404 PS5, Line 3404: // Error should blame on correct expression. : // The widest (highest precision) expression and type should appear in error. nit: these two are a bit
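The "first widest (highest precision) incompatible type expression" rule proposed in the IMPALA-966 review above can be sketched as follows. This is only an illustration of the selection rule: expressions are modelled as hypothetical `(sql_text, precision)` pairs, with an integer precision standing in for Impala's type-width comparison, and none of this is the patch's actual Java code.

```python
def first_widest(exprs):
    """Return the first expression with the highest precision.

    `exprs` is a list of (sql_text, precision) pairs; this shape is an
    assumption made for illustration only.
    """
    widest = None
    for expr in exprs:
        # Strict '>' keeps the earliest expression when precisions tie.
        if widest is None or expr[1] > widest[1]:
            widest = expr
    return widest


if __name__ == "__main__":
    candidates = [("1", 3), ("1.5", 10), ("2.5", 10)]
    print(first_widest(candidates))
```

Computing this once per column before the loop, rather than on every iteration, is what the reviewer's nit about fetching the widest-expression list ahead of the loop suggests.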
[Impala-ASF-CR] [IMPALA-8435] Prohibit operations on full transactional table.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13253 ) Change subject: [IMPALA-8435] Prohibit operations on full transactional table. .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3094/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I542570e30afdd8351250236d1be0077a170dd4ab Gerrit-Change-Number: 13253 Gerrit-PatchSet: 2 Gerrit-Owner: Sudhanshu Arora Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 06 May 2019 23:57:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for Hive 3 support
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13236 ) Change subject: IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for Hive 3 support .. Patch Set 3: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/13236/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13236/2//COMMIT_MSG@16 PS2, Line 16: This function also : exists in Hive 2, so while it isn't necessary, I didn't bother to make : it conditional on version > Done The comment should be also updated. -- To view, visit http://gerrit.cloudera.org:8080/13236 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169 Gerrit-Change-Number: 13236 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 07 May 2019 00:01:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for Hive 3 support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13236 ) Change subject: IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for Hive 3 support .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3091/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13236 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169 Gerrit-Change-Number: 13236 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 06 May 2019 23:52:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8438: Store WriteId and ValidWriteId list for table and partition
Sudhanshu Arora has posted comments on this change. ( http://gerrit.cloudera.org:8080/13215 ) Change subject: IMPALA-8438: Store WriteId and ValidWriteId list for table and partition .. Patch Set 6: (6 comments) http://gerrit.cloudera.org:8080/#/c/13215/6/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/13215/6/common/thrift/CatalogObjects.thrift@482 PS6, Line 482: // Are we guaranteed that Hive team will keep this string backward compatible? http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@231 PS6, Line 231: return null; throw UnsupportedOperationException. http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@374 PS6, Line 374: * @return the list of valid write IDs for the table in a string Nit: or null if there are no validWriteIds http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@236 PS6, Line 236: StringBuilder validIdsBuf = new StringBuilder("Loaded ValidWriteIdLists: "); For my understanding, how do we use timeline? 
http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@649 PS6, Line 649: writeId_ = msPartition != null ? Nit: Handle null case in shim so that every call does not have to handle it. http://gerrit.cloudera.org:8080/#/c/13215/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@1026 PS6, Line 1026: } Nit: Use ternary or put else in the above line -- To view, visit http://gerrit.cloudera.org:8080/13215 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54 Gerrit-Change-Number: 13215 Gerrit-PatchSet: 6 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 06 May 2019 23:48:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13224 ) Change subject: IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3090/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If96064f271582b2790a3cfb3d135f3834d46c41d Gerrit-Change-Number: 13224 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 06 May 2019 23:36:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8369 (part 4): Hive 3: fixes for functional dataset loading
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13251 ) Change subject: IMPALA-8369 (part 4): Hive 3: fixes for functional dataset loading .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3092/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13251 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic34930dc064da3136dde4e01a011d14db6a74ecd Gerrit-Change-Number: 13251 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 06 May 2019 23:32:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8509. Lazily evaluate LOAD sections during data load
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13252 ) Change subject: IMPALA-8509. Lazily evaluate LOAD sections during data load .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc64bb5cac4fda675607672329c04c5caf810d99 Gerrit-Change-Number: 13252 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 06 May 2019 23:35:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8509. Lazily evaluate LOAD sections during data load
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13252 ) Change subject: IMPALA-8509. Lazily evaluate LOAD sections during data load .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3093/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc64bb5cac4fda675607672329c04c5caf810d99 Gerrit-Change-Number: 13252 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 06 May 2019 23:36:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Alex Rodoni has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. IMPALA-8364: [DOCS] Remove refereces to authz policy files Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Reviewed-on: http://gerrit.cloudera.org:8080/13235 Tested-by: Impala Public Jenkins Reviewed-by: Fredy Wijaya --- M docs/topics/impala_authorization.xml M docs/topics/impala_grant.xml M docs/topics/impala_revoke.xml M docs/topics/impala_show.xml 4 files changed, 54 insertions(+), 311 deletions(-) Approvals: Impala Public Jenkins: Verified Fredy Wijaya: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 5 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8503: add option to start Kudu cluster with HMS integration
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13248 ) Change subject: IMPALA-8503: add option to start Kudu cluster with HMS integration .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3089/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13248 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I734d14ede6a03ad52e820e38a1fbcbac0a40ede2 Gerrit-Change-Number: 13248 Gerrit-PatchSet: 1 Gerrit-Owner: Hao Hao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 06 May 2019 23:01:47 + Gerrit-HasComments: No
[Impala-ASF-CR] [IMPALA-8435] Prohibit operations on full transactional table.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13253 ) Change subject: [IMPALA-8435] Prohibit operations on full transactional table. .. Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/13253/2/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java: http://gerrit.cloudera.org:8080/#/c/13253/2/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java@535 PS2, Line 535: AnalysisError("create table test as select * from functional.full_transactional_table", line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/13253/2/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java@537 PS2, Line 537: AnalyzesOk("create table test as select * from functional.insert_only_transactional_table"); line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/13253/2/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java@557 PS2, Line 557: AnalysisError("alter table functional.full_transactional_table add columns (col2 string)", line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/13253/2/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java@559 PS2, Line 559: AnalyzesOk("alter table functional.insert_only_transactional_table add columns (col2 string)"); line too long (99 > 90) -- To view, visit http://gerrit.cloudera.org:8080/13253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I542570e30afdd8351250236d1be0077a170dd4ab Gerrit-Change-Number: 13253 Gerrit-PatchSet: 2 Gerrit-Owner: Sudhanshu Arora Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 06 May 2019 22:57:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [IMPALA-8435] Prohibit operations on full transactional table.
Sudhanshu Arora has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13253 ) Change subject: [IMPALA-8435] Prohibit operations on full transactional table. .. [IMPALA-8435] Prohibit operations on full transactional table. Copied some code from Hive to identify whether the table is a transactional, insert-only table. Testing Done: - Added a new unit test in Analyzer. Change-Id: I542570e30afdd8351250236d1be0077a170dd4ab --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java M fe/src/main/java/org/apache/impala/analysis/TableDef.java M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java M testdata/datasets/functional/functional_schema_template.sql 10 files changed, 117 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/13253/2 -- To view, visit http://gerrit.cloudera.org:8080/13253 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I542570e30afdd8351250236d1be0077a170dd4ab Gerrit-Change-Number: 13253 Gerrit-PatchSet: 2 Gerrit-Owner: Sudhanshu Arora Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
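The distinction the IMPALA-8435 change above relies on (full transactional versus insert-only transactional tables) can be sketched from table parameters. The sketch assumes the Hive convention that a table is transactional when its `transactional` parameter is `true`, and insert-only when `transactional_properties` is additionally `insert_only`. The change description does not spell these names out, and the function names below are hypothetical, not the patch's Java methods.

```python
def is_transactional(params):
    """True if the table parameters mark the table as transactional (assumed key name)."""
    return params.get("transactional", "").lower() == "true"


def is_insert_only(params):
    """True for a transactional table restricted to inserts (assumed key and value)."""
    props = params.get("transactional_properties", "").lower()
    return is_transactional(params) and props == "insert_only"


def is_full_transactional(params):
    """True for a transactional table that is not insert-only."""
    return is_transactional(params) and not is_insert_only(params)


if __name__ == "__main__":
    full = {"transactional": "true"}
    insert_only = {"transactional": "true",
                   "transactional_properties": "insert_only"}
    print(is_full_transactional(full), is_full_transactional(insert_only))
```

Under this reading, an analyzer would reject statements that target tables for which `is_full_transactional` holds while still accepting insert-only tables, which matches the AnalysisError/AnalyzesOk pairs quoted from AnalyzerTest in the review comments.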
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 4: Code-Review+2 (2 comments) I found a few typos in the authorization doc. But let's not mix that in this CR. We can have a typo fix in a different CR. http://gerrit.cloudera.org:8080/#/c/13235/4/docs/topics/impala_show.xml File docs/topics/impala_show.xml: http://gerrit.cloudera.org:8080/#/c/13235/4/docs/topics/impala_show.xml@388 PS4, Line 388: ROLE not related to this CR, but this is a typo: it should be SHOW CURRENT ROLES http://gerrit.cloudera.org:8080/#/c/13235/4/docs/topics/impala_show.xml@440 PS4, Line 440: SHOW ROLE GRANT this should be SHOW GRANT ROLE. -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 22:50:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13224 ) Change subject: IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/13224/5/bin/jenkins/critique-gerrit-review.py File bin/jenkins/critique-gerrit-review.py: http://gerrit.cloudera.org:8080/#/c/13224/5/bin/jenkins/critique-gerrit-review.py@72 PS5, Line 72: flake8: E261 at least two spaces before inline comment http://gerrit.cloudera.org:8080/#/c/13224/5/testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py File testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py: http://gerrit.cloudera.org:8080/#/c/13224/5/testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py@33 PS5, Line 33: O flake8: E501 line too long (96 > 90 characters) http://gerrit.cloudera.org:8080/#/c/13224/5/testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py@37 PS5, Line 37: i flake8: E501 line too long (94 > 90 characters) http://gerrit.cloudera.org:8080/#/c/13224/5/testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py@42 PS5, Line 42: l flake8: E501 line too long (101 > 90 characters) -- To view, visit http://gerrit.cloudera.org:8080/13224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If96064f271582b2790a3cfb3d135f3834d46c41d Gerrit-Change-Number: 13224 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 06 May 2019 22:39:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 4: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/317/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 22:32:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8369 (part 4): Hive 3: fixes for functional dataset loading
Hello Vihang Karajgaonkar, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/13251 to review the following change. Change subject: IMPALA-8369 (part 4): Hive 3: fixes for functional dataset loading .. IMPALA-8369 (part 4): Hive 3: fixes for functional dataset loading This fixes three issues for functional dataset loading: - works around HIVE-21675, a bug in which 'CREATE VIEW IF NOT EXISTS' does not function correctly in our current Hive build. This has already been fixed, but the workaround is pretty simple, and the 'drop and recreate' pattern is actually used more widely for data loading than the 'create if not exists' one. - adds the ability to specify version restrictions for tables to load. The restrictions use the Python "requirements.txt" syntax. This new functionality is used to skip creating a Hive "INDEX" table on Hive 3, where index support has been removed. - Moving from MR to Tez execution changed the behavior of data loading by disabling the auto-merging of small files. With Hive-on-MR, this behavior defaulted to true, but with Hive-on-Tez it defaults to false. The change is likely motivated by the fact that Tez automatically groups small splits on the _input_ side and thus is less likely to produce lots of small files. However, that grouping functionality doesn't work properly in localhost clusters (TEZ-3310), so we aren't seeing the benefit. This patch therefore enables the post-process merging of small files. Prior to this change, the 'alltypesaggmultifilesnopart' test table was getting 40+ files inside it, which broke various planner tests. With the change, it gets the expected 4 files.
Change-Id: Ic34930dc064da3136dde4e01a011d14db6a74ecd --- M fe/src/test/resources/hive-site.xml.py M testdata/bin/generate-schema-statements.py M testdata/bin/load-dependent-tables.sql M testdata/datasets/README M testdata/datasets/functional/functional_schema_template.sql 5 files changed, 119 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/13251/1 -- To view, visit http://gerrit.cloudera.org:8080/13251 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ic34930dc064da3136dde4e01a011d14db6a74ecd Gerrit-Change-Number: 13251 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar
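The version-restriction idea in the commit message above can be illustrated with a minimal sketch. It is not the code from the patch: the function names, the `installed` mapping, and the exact set of accepted operators are assumptions made for illustration, and only simple `name<op>version` restrictions in the "requirements.txt" style are handled.

```python
import operator
import re

# Comparison operators in the requirements.txt style (hypothetical subset).
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}

# Two-character operators are listed first so ">=" is not read as ">".
_SPEC_RE = re.compile(
    r"^\s*([A-Za-z_][\w.-]*)\s*(>=|<=|==|!=|>|<)\s*(\d+(?:\.\d+)*)\s*$")


def parse_version(text):
    """Turn '3.1.0' into (3, 1, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def restriction_allows(spec, installed):
    """Return True if `installed` (component name -> version) satisfies `spec`.

    `spec` is a string such as 'hive<3'. Both arguments are hypothetical
    shapes chosen for this sketch.
    """
    match = _SPEC_RE.match(spec)
    if match is None:
        raise ValueError("unrecognized restriction: %r" % spec)
    name, op, wanted = match.groups()
    return _OPS[op](parse_version(installed[name]), parse_version(wanted))
```

With a restriction such as `hive<3` attached to a table, a loader could call `restriction_allows("hive<3", {"hive": "3.1.0"})` and skip the table when the result is false. A real implementation would more likely reuse an existing requirement parser than a hand-written regular expression.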
[Impala-ASF-CR] IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution
Hello Yongzhi Chen, Vihang Karajgaonkar, Sudhanshu Arora, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13224 to look at the new patch set (#5). Change subject: IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution .. IMPALA-8369 (part 2): Hive 3: switch to Tez-on-YARN execution This switches away from Tez local mode to Tez-on-YARN. After spending a couple of days trying to debug issues with Tez local mode, it seemed like it was just going to be too much of a lift. This patch starts a YARN RM and NM when USE_CDP_HIVE is enabled. It also switches to a new yarn-site.xml with a minimized set of configurations, generated by the new Python templating. In order for everything to work properly, I also had to update the Hadoop dependency to come from CDP instead of CDH when using CDP Hive. Otherwise, the classpath of the launched Tez containers had conflicting versions of various Hadoop classes, which caused tasks to fail. I verified that this fixes concurrent query execution by running queries in parallel in two beeline sessions. With local mode, these queries would periodically fail due to various races (HIVE-21682). I'm also able to get farther along in data loading.
Change-Id: If96064f271582b2790a3cfb3d135f3834d46c41d --- M bin/bootstrap_toolchain.py M bin/create-test-configuration.sh M bin/generate_xml_config.py M bin/impala-config.sh M bin/jenkins/critique-gerrit-review.py M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/CopyTestCaseStmt.java M fe/src/test/resources/hive-site.xml.py M shaded-deps/pom.xml M testdata/cluster/admin A testdata/cluster/node_templates/common/etc/hadoop/conf/capacity-scheduler.xml A testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.py D testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.tmpl 13 files changed, 365 insertions(+), 173 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/13224/5 -- To view, visit http://gerrit.cloudera.org:8080/13224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If96064f271582b2790a3cfb3d135f3834d46c41d Gerrit-Change-Number: 13224 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for Hive 3 support
Hello Yongzhi Chen, Vihang Karajgaonkar, Sudhanshu Arora, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13236 to look at the new patch set (#3). Change subject: IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for Hive 3 support .. IMPALA-8369 (part 3): Hive 3: fix test_permanent_udfs.py for Hive 3 support This fixes two issues in test_permanent_udfs.py: - two of Hive's built-ins were ported to the new GenericUDF interface which Impala can't execute. These UDFs are now excluded from the test when running with Hive 3. - Hive 3 now caches UDFs more aggressively, so we have to run 'RELOAD FUNCTION' in Hive after changing UDFs in Impala. This function also exists in Hive 2, so while it isn't necessary, I didn't bother to make it conditional on version. Change-Id: I7f50845c7d4769d8843cad87988498e165902169 --- M tests/common/impala_test_suite.py M tests/custom_cluster/test_permanent_udfs.py 2 files changed, 30 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/13236/3 -- To view, visit http://gerrit.cloudera.org:8080/13236 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169 Gerrit-Change-Number: 13236 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-8509. Lazily evaluate LOAD sections during data load
Hello Vihang Karajgaonkar, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/13252 to review the following change. Change subject: IMPALA-8509. Lazily evaluate LOAD sections during data load .. IMPALA-8509. Lazily evaluate LOAD sections during data load The LOAD sections for the 'testescape' tables were evaluated too eagerly, before determining whether these tables should be skipped. Moving to lazy evaluation makes incremental load-data.py calls take about 30 seconds instead of several minutes. Change-Id: Ifc64bb5cac4fda675607672329c04c5caf810d99 --- M testdata/bin/generate-schema-statements.py 1 file changed, 12 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/13252/1 -- To view, visit http://gerrit.cloudera.org:8080/13252 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ifc64bb5cac4fda675607672329c04c5caf810d99 Gerrit-Change-Number: 13252 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar
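The lazy-evaluation change described above can be sketched roughly as follows. This is an illustration of the general technique, not the code in generate-schema-statements.py: `LazySection`, `build_statements`, and the `should_skip` predicate are hypothetical names introduced here.

```python
class LazySection(object):
    """Defers an expensive computation until its value is first requested."""

    def __init__(self, compute):
        self._compute = compute  # zero-argument callable producing the text
        self._evaluated = False
        self._value = None

    def get(self):
        # Run the computation at most once and cache the result.
        if not self._evaluated:
            self._value = self._compute()
            self._evaluated = True
        return self._value


def build_statements(tables, should_skip):
    """Return LOAD text only for tables that are not skipped.

    `tables` is a list of (table_name, LazySection) pairs and `should_skip`
    is a predicate on the table name; both are assumptions of this sketch.
    """
    statements = []
    for name, load_section in tables:
        if should_skip(name):
            continue  # the deferred LOAD section is never evaluated
        statements.append(load_section.get())
    return statements
```

The point of the pattern is that the skip decision is made before any expensive work runs, so an incremental load pays only for the tables it actually processes.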
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 3: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/316/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 22:31:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8503: add option to start Kudu cluster with HMS integration
Thomas Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/13248 ) Change subject: IMPALA-8503: add option to start Kudu cluster with HMS integration .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/node_templates/common/etc/init.d/kudu-master File testdata/cluster/node_templates/common/etc/init.d/kudu-master: http://gerrit.cloudera.org:8080/#/c/13248/1/testdata/cluster/node_templates/common/etc/init.d/kudu-master@31 PS1, Line 31: KUDU_COMMON_ARGS+=("-hive_metastore_uris=thrift://${INTERNAL_LISTEN_HOST}:9083") Instead of doing all of the work of piping an argument all the way through testdata/cluster/admin and having this logic here, I wonder if it would be easier to just add an environment variable such as EXTRA_KUDU_STARTUP_ARGS that, if set, is always appended to KUDU_COMMON_ARGS here. It also doesn't look like this patch actually uses the functionality that you've added here, unless I'm missing something. It might be easier for reviewers to understand how you intend for this to work if you include it in a patch along with a test that actually exercises this functionality. -- To view, visit http://gerrit.cloudera.org:8080/13248 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I734d14ede6a03ad52e820e38a1fbcbac0a40ede2 Gerrit-Change-Number: 13248 Gerrit-PatchSet: 1 Gerrit-Owner: Hao Hao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 06 May 2019 22:20:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Hello Austin Nobis, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13235 to look at the new patch set (#4). Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. IMPALA-8364: [DOCS] Remove refereces to authz policy files Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 --- M docs/topics/impala_authorization.xml M docs/topics/impala_grant.xml M docs/topics/impala_revoke.xml M docs/topics/impala_show.xml 4 files changed, 54 insertions(+), 311 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/13235/4 -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 4: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/317/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 22:20:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7665: Fix unwarranted query cancellation on statestore restart
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/13061 ) Change subject: IMPALA-7665: Fix unwarranted query cancellation on statestore restart .. Patch Set 4: (3 comments) Thanks for fixing it. Glad that this is also pretty straightforward. http://gerrit.cloudera.org:8080/#/c/13061/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13061/4//COMMIT_MSG@23 PS4, Line 23: fo typo http://gerrit.cloudera.org:8080/#/c/13061/4/be/src/statestore/statestore-subscriber.h File be/src/statestore/statestore-subscriber.h: http://gerrit.cloudera.org:8080/#/c/13061/4/be/src/statestore/statestore-subscriber.h@214 PS4, Line 214: // nit: /// http://gerrit.cloudera.org:8080/#/c/13061/4/be/src/statestore/statestore-subscriber.cc File be/src/statestore/statestore-subscriber.cc: http://gerrit.cloudera.org:8080/#/c/13061/4/be/src/statestore/statestore-subscriber.cc@196 PS4, Line 196: last_registration_ms_.Store(MonotonicMillis()); Should this be set iff status.ok() ? -- To view, visit http://gerrit.cloudera.org:8080/13061 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30b68bd8bde4bf589d58d42d6f683afb166de959 Gerrit-Change-Number: 13061 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 May 2019 22:13:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 3: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/316/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 22:14:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13234 ) Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3 .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/13234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14 Gerrit-Change-Number: 13234 Gerrit-PatchSet: 3 Gerrit-Owner: Austin Nobis Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Mon, 06 May 2019 22:03:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Hello Austin Nobis, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13235 to look at the new patch set (#3). Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. IMPALA-8364: [DOCS] Remove refereces to authz policy files Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 --- M docs/topics/impala_authorization.xml M docs/topics/impala_revoke.xml M docs/topics/impala_show.xml 3 files changed, 51 insertions(+), 305 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/13235/3 -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13234 ) Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3 .. IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3 A hardcoded path in test_ranger.py for URI testing was updated to support S3, local, and HDFS as opposed to just HDFS. Testing: - Ran authorization E2E tests - Ran all FE tests - Ran test_ranger.py with S3 Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14 Reviewed-on: http://gerrit.cloudera.org:8080/13234 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M tests/authorization/test_ranger.py 1 file changed, 2 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/13234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14 Gerrit-Change-Number: 13234 Gerrit-PatchSet: 4 Gerrit-Owner: Austin Nobis Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal
[Impala-ASF-CR] IMPALA-8503: add option to start Kudu cluster with HMS integration
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13248 ) Change subject: IMPALA-8503: add option to start Kudu cluster with HMS integration .. Patch Set 1: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/3087/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/13248 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I734d14ede6a03ad52e820e38a1fbcbac0a40ede2 Gerrit-Change-Number: 13248 Gerrit-PatchSet: 1 Gerrit-Owner: Hao Hao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 06 May 2019 21:59:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8428: Add support for caching file handles on s3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13221 ) Change subject: IMPALA-8428: Add support for caching file handles on s3 .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13221 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b304d37bc724377fbe7955441cce0cec6fb7f19 Gerrit-Change-Number: 13221 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 06 May 2019 21:42:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8499: avoid datetime.total_seconds() in test_insert_events
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13245 ) Change subject: IMPALA-8499: avoid datetime.total_seconds() in test_insert_events .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/13245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52 Gerrit-Change-Number: 13245 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 06 May 2019 21:46:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8428: Add support for caching file handles on s3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13221 ) Change subject: IMPALA-8428: Add support for caching file handles on s3 .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4161/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13221 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b304d37bc724377fbe7955441cce0cec6fb7f19 Gerrit-Change-Number: 13221 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 06 May 2019 21:42:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8499: avoid datetime.total_seconds() in test_insert_events
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13245 ) Change subject: IMPALA-8499: avoid datetime.total_seconds() in test_insert_events .. IMPALA-8499: avoid datetime.total_seconds() in test_insert_events This function was only added in Python 2.7. Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52 Reviewed-on: http://gerrit.cloudera.org:8080/13245 Reviewed-by: Todd Lipcon Tested-by: Impala Public Jenkins --- M tests/custom_cluster/test_event_processing.py 1 file changed, 2 insertions(+), 3 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/13245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52 Gerrit-Change-Number: 13245 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon
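For readers wondering what avoiding `total_seconds()` can look like, the sketch below computes the same value from the `days`, `seconds`, and `microseconds` attributes that `timedelta` has always exposed. The helper name is hypothetical, and this is not necessarily what the merged patch does; the formula is the equivalence given in the Python documentation for `timedelta.total_seconds()`.

```python
from datetime import timedelta


def total_seconds(delta):
    """Return the duration of `delta` in seconds as a float.

    Uses only the days/seconds/microseconds attributes, so it does not rely
    on timedelta.total_seconds(), which was added in Python 2.7.
    """
    whole_seconds = delta.seconds + delta.days * 24 * 3600
    # Dividing by a float keeps this true division on Python 2 as well.
    return (delta.microseconds + whole_seconds * 10**6) / 10.0**6
```

For example, `total_seconds(timedelta(minutes=1, microseconds=250000))` yields `60.25`.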
[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13207 ) Change subject: IMPALA-8460: Simplify cluster membership management .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3088/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3 Gerrit-Change-Number: 13207 Gerrit-PatchSet: 6 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 May 2019 21:41:53 + Gerrit-HasComments: No
[Impala-ASF-CR] Hive 3: fix test_permanent_udfs.py for Hive 3 support
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/13236 ) Change subject: Hive 3: fix test_permanent_udfs.py for Hive 3 support .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/13236/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13236/2//COMMIT_MSG@16 PS2, Line 16: This function also : exists in Hive 2, so while it isn't necessary, I didn't bother to make : it conditional on version > Maybe create a function like describe_fn__in_hive(self, db, fn)? It could a Done http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py File tests/custom_cluster/test_permanent_udfs.py: http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py@507 PS2, Line 507: implemened > typo: implemented Done http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py@507 PS2, Line 507: now > not ? 'now' is correct -- they used to be 'UDF' but now they are 'GenericUDF' -- To view, visit http://gerrit.cloudera.org:8080/13236 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169 Gerrit-Change-Number: 13236 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 06 May 2019 21:25:26 + Gerrit-HasComments: Yes
[native-toolchain-CR] Fix issues with toolchain Python and bump version
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13249 Change subject: Fix issues with toolchain Python and bump version .. Fix issues with toolchain Python and bump version * Bumps the version to 2.7.16, the latest Python 2 release. * Fixes issues where paths like /tmp/tmp.mEyNqPNTxH-impala-toolchain/gcc got baked into the Python metadata, which caused problems when later compiling C/C++ extensions from source. * Remove hardcoded version in Kudu build scripts This is motivated by IMPALA-8508, where we want to consume the toolchain python outside of the toolchain. Change-Id: I7e6c9c4371d3d6c1193c2cc02d45c22b04137672 --- M buildall.sh M source/kudu/build.sh M source/python/build.sh 3 files changed, 18 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/49/13249/1 -- To view, visit http://gerrit.cloudera.org:8080/13249 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7e6c9c4371d3d6c1193c2cc02d45c22b04137672 Gerrit-Change-Number: 13249 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/13207 ) Change subject: IMPALA-8460: Simplify cluster membership management .. Patch Set 5: (20 comments) Thanks for the review. Please see my inline comments and PS6. http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/runtime/exec-env.cc File be/src/runtime/exec-env.cc: http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/runtime/exec-env.cc@454 PS5, Line 454: // Register the ImpalaServer with the cluster membership manager : cluster_membership_mgr_->SetLocalBeDescFn([server]() { : return server->GetLocalBackendDescriptor(); : }); : cluster_membership_mgr_->SetUpdateLocalServerFn( : [server](const ClusterMembershipMgr::BackendAddressSet& current_backends) { : server->CancelQueriesOnFailedBackends(current_backends); : }); > just thinking out aloud: should we reset the callback functions during tear These rely only on the server still being alive, not the ExecEnv, so the right thing to do would be to delete them when we unregister the ImpalaServer from the ExecEnv (which we currently don't). I added a DCHECK to make sure that we don't use this method to reset the ImpalaServer in the future. Since the ExecEnv d'tor will destroy the ClusterMembershipManager I think we should be good without resetting its state explicitly. http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h File be/src/scheduling/cluster-membership-mgr.h: http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@54 PS5, Line 54: /// Clients can also register callbacks to receive notifications of changes to the cluster : /// membership. > this makes it sound like there is a generic way of registering callbacks li Done http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@68 PS5, Line 68: pool > nit: executor pool. Is there a description of "executor pools" anywhere? 
Replaced it with executor groups and added a brief description which we can expand in the future. http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@102 PS5, Line 102: then > nit: when Done http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@112 PS5, Line 112: subscription > nit: subscription. Done http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@117 PS5, Line 117: /// Registers a callback to provide the local backend descriptor. : void SetLocalBeDescFn(BackendDescriptorPtrFn fn); : : /// Registers a callback to notify the local ImpalaServer of changes in the cluster : /// membership. This callback will only be called when backends are deleted from the : /// membership. : void SetUpdateLocalServerFn(UpdateLocalServerFn fn); : : /// Registers a callback to notify the local Frontend of changes in the cluster : /// membership. : void SetUpdateFrontendFn(UpdateFrontendFn fn); > I know we discussed this offline, but it might be worth documenting in this Both clients have different needs: The ImpalaServer only needs to learn about deleted backends, whereas the Frontend needs the full list. In the future, the Frontend will also need the executor group sizes (but not the memberships because groups will likely only be useful for remote read scenarios). Additionally, the callbacks have different signatures. Having a generic callback interface would require some sort of filtering of events (added, updated, deleted) and more complexity on the client side without a clear benefit. If the list of client classes expands we can revisit this decision. http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@165 PS5, Line 165: in > nit: extra word Done http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@167 PS5, Line 167: / occur in 'current_backends' > nit: outdated comment? 
Done http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.h@184 PS5, Line 184: May be NULL if the set of : /// backends is fixed. > maybe mention that this is only true in tests Done http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.cc File be/src/scheduling/cluster-membership-mgr.cc: http://gerrit.cloudera.org:8080/#/c/13207/5/be/src/scheduling/cluster-membership-mgr.cc@88 PS5, Line 88: update.is_delta && update.topic_entries.empty() > just curious, when can we receive an empty delta It's the way that the statestore pulls for updates from the clients, e.g. every time interval it will send an empty delta and the
[Impala-ASF-CR] IMPALA-8460: Simplify cluster membership management
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13207 to look at the new patch set (#6). Change subject: IMPALA-8460: Simplify cluster membership management .. IMPALA-8460: Simplify cluster membership management This change adds a class called ClusterMembershipMgr to track cluster membership. It replaces the logic that was partially duplicated between the ImpalaServer and the Coordinator and makes sure that the local backend descriptor is consistent (IMPALA-8469). The ClusterMembershipMgr maintains a view of the cluster membership and incorporates incoming updates from the statestore. It also registers the local backend with the statestore after startup. Clients can obtain a consistent, immutable snapshot of the current cluster membership from the ClusterMembershipMgr. Additionally, callbacks can be registered to receive notifications of cluster membership changes. The ImpalaServer and Frontend use this mechanism. This change also unifies the naming of executor-related classes; in particular, it renames "BackendConfig" to "ExecutorGroup". In anticipation of a subsequent change, it adds maps to store multiple executor groups. Testing: This change does not introduce new functionality and the new class is covered by the existing scheduler unit test and the end-to-end tests. 
Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3 --- M be/src/benchmarks/scheduler-benchmark.cc M be/src/common/logging.h M be/src/gutil/strings/split.cc M be/src/gutil/strings/split.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/CMakeLists.txt D be/src/scheduling/backend-config-test.cc D be/src/scheduling/backend-config.cc D be/src/scheduling/backend-config.h A be/src/scheduling/cluster-membership-mgr.cc A be/src/scheduling/cluster-membership-mgr.h A be/src/scheduling/executor-group-test.cc A be/src/scheduling/executor-group.cc A be/src/scheduling/executor-group.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/testutil/in-process-servers.cc M common/thrift/StatestoreService.thrift M tests/custom_cluster/test_coordinators.py 26 files changed, 1,162 insertions(+), 923 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/13207/6 -- To view, visit http://gerrit.cloudera.org:8080/13207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3 Gerrit-Change-Number: 13207 Gerrit-PatchSet: 6 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
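For readers skimming the thread, the snapshot-plus-callbacks design described in the commit message above can be sketched roughly as follows. This is a toy Python illustration, not the C++ code under review: the class name MembershipTracker, its method names, and the string backend ids are all hypothetical. It assumes, as the review discussion states, that one client only needs to hear about deletions while the other needs the full backend list.

```python
from types import MappingProxyType
from typing import Callable, Dict, FrozenSet, Mapping, Optional, Set

# Hypothetical callback type: receives the ids of the backends currently known.
BackendSetFn = Callable[[FrozenSet[str]], None]


class MembershipTracker:
    """Toy stand-in for a cluster membership manager (all names are assumptions)."""

    def __init__(self) -> None:
        # The snapshot is replaced wholesale on every change and never mutated,
        # so a reader holding an old snapshot keeps a consistent view.
        self._snapshot: Mapping[str, str] = MappingProxyType({})
        self._on_removed: Optional[BackendSetFn] = None
        self._on_change: Optional[BackendSetFn] = None

    def set_removed_callback(self, fn: BackendSetFn) -> None:
        """Callback invoked only when at least one backend was deleted."""
        self._on_removed = fn

    def set_full_list_callback(self, fn: BackendSetFn) -> None:
        """Callback invoked on every membership change."""
        self._on_change = fn

    def snapshot(self) -> Mapping[str, str]:
        """Return the current read-only view of backend id -> address."""
        return self._snapshot

    def apply_update(self, added: Dict[str, str], removed: Set[str]) -> None:
        if not added and not removed:
            return  # empty delta: nothing changed, notify nobody
        new = dict(self._snapshot)  # copy, so older snapshots stay intact
        new.update(added)
        actually_removed = [b for b in removed if b in new]
        for backend_id in actually_removed:
            del new[backend_id]
        self._snapshot = MappingProxyType(new)
        current = frozenset(new)
        if actually_removed and self._on_removed is not None:
            self._on_removed(current)
        if self._on_change is not None:
            self._on_change(current)
```

Keeping two narrowly typed setters instead of one generic event interface mirrors the trade-off argued in the review: neither client has to filter events it does not care about.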
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml File docs/topics/impala_revoke.xml: http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml@110 PS2, Line 110: this statement > It looks like we already have that information in L114. I think we can safe Done -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 20:34:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml File docs/topics/impala_revoke.xml: http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml@110 PS2, Line 110: this statement > This statement refers to REVOKE. It looks like we already have that information in L114. I think we can safely remove L108-L111. Maybe we can expand what Sentry administrative users mean in L114 --> "Users that belong to the groups defined in "sentry.service.admin.group" of the Sentry configuration". -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 20:25:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml File docs/topics/impala_revoke.xml: http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml@110 PS2, Line 110: this statement > Does "this statement" refer to GRANT/REVOKE statement? If yes, this stateme This statement refers to REVOKE. How should I correct the incorrect statement? -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 20:14:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml File docs/topics/impala_revoke.xml: http://gerrit.cloudera.org:8080/#/c/13235/2/docs/topics/impala_revoke.xml@110 PS2, Line 110: this statement Does "this statement" refer to the GRANT/REVOKE statement? If yes, the statement "Only administrative users (those with ALL privileges on the server)" is incorrect. Sentry administrative users are those users that belong to the groups defined in "sentry.service.admin.group" of the Sentry configuration. -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 20:05:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Hive 3: fix test_permanent_udfs.py for Hive 3 support
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13236 ) Change subject: Hive 3: fix test_permanent_udfs.py for Hive 3 support .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/13236 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169 Gerrit-Change-Number: 13236 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 06 May 2019 19:55:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3086/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 6 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 20:02:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8503: add option to start Kudu cluster with HMS integration
Hao Hao has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13248 ) Change subject: IMPALA-8503: add option to start Kudu cluster with HMS integration .. IMPALA-8503: add option to start Kudu cluster with HMS integration Currently, static template configuration under testdata/cluster/ controls the Kudu gflags used when starting a Kudu cluster. An option for custom configuration, such as enabling HMS integration, is needed so that tests can run against Kudu clusters with different sets of configurations. This commit updates the 'cluster/admin' script so that it can start a cluster with an argument, and adds an option to the 'kudu-master' script to start the Kudu master with HMS integration using the command `admin start_with_arg kudu hms`. Change-Id: I734d14ede6a03ad52e820e38a1fbcbac0a40ede2 --- M testdata/cluster/admin M testdata/cluster/node_templates/common/etc/init.d/kudu-master 2 files changed, 33 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/13248/1 -- To view, visit http://gerrit.cloudera.org:8080/13248 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I734d14ede6a03ad52e820e38a1fbcbac0a40ede2 Gerrit-Change-Number: 13248 Gerrit-PatchSet: 1 Gerrit-Owner: Hao Hao
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. Patch Set 6: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 6 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 19:17:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8499: avoid datetime.total_seconds() in test_insert_events
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13245 ) Change subject: IMPALA-8499: avoid datetime.total_seconds() in test_insert_events .. Patch Set 1: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/3084/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/13245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52 Gerrit-Change-Number: 13245 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 06 May 2019 19:19:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4157/ -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 5 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 19:12:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 7 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 19:17:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4160/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 7 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 19:17:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. Patch Set 6: Added the missing hive2-pre-gregorian-date.test file. -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 6 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 19:16:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Attila Jeges has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. IMPALA-7370: DATE: Read/Write to parquet. This change is a follow-up to IMPALA-7368 and adds support for DATE type to the parquet scanner/writer. CREATE TABLE LIKE PARQUET statements associated with data files that contain dates are also supported. Parquet uses DATE logical type for dates. DATE logical type annotates an INT32 that stores the number of days from the Unix epoch, 1 January 1970. This representation introduces a parquet interoperability issue between Impala and older versions of Hive: - Before version 3.1, Hive used Julian calendar to represent dates up to 1582-10-05 and Gregorian calendar for dates starting with 1582-10-15. Dates between 1582-10-05 and 1582-10-15 were lost. - Impala uses proleptic Gregorian calendar, extending the Gregorian calendar backward to dates preceding its official introduction in 1582-10-15. This means that pre-1582-10-15 dates written to a parquet table by Hive will be read back incorrectly by Impala and vice versa. Note that Hive 3.1 switched to proleptic Gregorian calendar too, so for Hive 3.1+ this is no longer an issue. 
Change-Id: I67da03754531660bc8de3b6935580d46deae1814 --- M be/src/exec/hdfs-table-sink.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-stats.cc M be/src/exec/parquet/parquet-column-stats.h M be/src/exec/parquet/parquet-column-stats.inline.h M be/src/exec/parquet/parquet-common.h M be/src/exec/parquet/parquet-metadata-utils.cc M be/src/util/bit-packing.cc M common/thrift/generate_error_codes.py M fe/src/main/java/org/apache/impala/analysis/ParquetHelper.java M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M testdata/data/README A testdata/data/hive2_pre_gregorian.parquet A testdata/data/out_of_range_date.parquet M testdata/datasets/functional/schema_constraints.csv A testdata/workloads/functional-query/queries/QueryTest/date-fileformat-support.test D testdata/workloads/functional-query/queries/QueryTest/date-text-only-support.test A testdata/workloads/functional-query/queries/QueryTest/hive2-pre-gregorian-date.test A testdata/workloads/functional-query/queries/QueryTest/out-of-range-date.test M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test M tests/common/impala_connection.py M tests/custom_cluster/test_parquet_page_index.py M tests/query_test/test_date_queries.py M tests/query_test/test_insert_parquet.py M tests/query_test/test_scanners.py 29 files changed, 465 insertions(+), 148 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/13189/6 -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 6 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
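To make the interoperability issue in the commit message above concrete, here is a small Python sketch of the day-count encoding. It is illustrative only and is not Impala or Hive code; the function names are made up for this example. Python's datetime.date happens to use the proleptic Gregorian calendar, so it plays the role of the reader, while a standard Julian-calendar day-number formula (valid for positive years) plays the role of a writer that labels old dates with the Julian calendar.

```python
from datetime import date, timedelta

UNIX_EPOCH = date(1970, 1, 1)
UNIX_EPOCH_JDN = 2440588  # Julian Day Number of 1970-01-01


def encode_date(d: date) -> int:
    """Days since the Unix epoch, the value a DATE-annotated INT32 stores."""
    return (d - UNIX_EPOCH).days


def decode_date(days: int) -> date:
    """Interpret a stored day count in the proleptic Gregorian calendar."""
    return UNIX_EPOCH + timedelta(days=days)


def julian_label_to_days(year: int, month: int, day: int) -> int:
    """Day count a writer would store if it read (year, month, day) as a
    Julian-calendar date. Hypothetical helper, positive years only."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    return jdn - UNIX_EPOCH_JDN


# The first Gregorian day, 1582-10-15, is 141427 days before the epoch.
print(encode_date(date(1582, 10, 15)))  # -141427

# A value written under the Julian label 1500-01-01 comes back shifted
# when the reader applies the proleptic Gregorian calendar.
print(decode_date(julian_label_to_days(1500, 1, 1)))  # 1500-01-10
```

The stored integer is unambiguous; only the calendar used to turn it into a year-month-day label differs, which is why the mismatch shows up only for dates before 1582-10-15.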
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 2: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/315/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 18:41:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Hello Austin Nobis, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13235 to look at the new patch set (#2). Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. IMPALA-8364: [DOCS] Remove refereces to authz policy files Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 --- M docs/topics/impala_authorization.xml M docs/topics/impala_revoke.xml M docs/topics/impala_show.xml 3 files changed, 50 insertions(+), 301 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/13235/2 -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 2: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/315/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 18:29:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 1: > (4 comments) > > I think it's better if we don't mix the Ranger doc in this CR. I removed the references to Ranger in this patch. -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 18:27:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml File docs/topics/impala_authorization.xml: http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml@106 PS1, Line 106: metastore database > This is incorrect. Replace with "Stored inside the Sentry/Ranger database" Done http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml@110 PS1, Line 110: If you change privileges in Sentry or Ranger, e.g. adding a user, removing a user, : modifying privileges, you must clear the Impala Catalog server cache by running the > This is a bit confusing. Maybe reword to something like if we you change pr Done http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml@112 PS1, Line 112: NVALIDATE METADATA statement. INVALIDATE METADATA is : not required if you make the changes to privileges within Impala. > Replace INVALIDATE METADATA with REFRESH AUTHORIZATION instead since it's a Done http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_revoke.xml File docs/topics/impala_revoke.xml: http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_revoke.xml@114 PS1, Line 114: Ranger > Ranger doesn't support roles. Removed Ranger -- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 18:19:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13234 ) Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3 .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3085/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14 Gerrit-Change-Number: 13234 Gerrit-PatchSet: 2 Gerrit-Owner: Austin Nobis Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Mon, 06 May 2019 17:32:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8364: [DOCS] Remove refereces to authz policy files
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13235 ) Change subject: IMPALA-8364: [DOCS] Remove refereces to authz policy files .. Patch Set 1: (4 comments) I think it's better if we don't mix the Ranger doc in this CR. http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml File docs/topics/impala_authorization.xml: http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml@106 PS1, Line 106: metastore database This is incorrect. Replace with "Stored inside the Sentry/Ranger database" http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml@110 PS1, Line 110: If you change privileges in Sentry or Ranger, e.g. adding a user, removing a user, : modifying privileges, you must clear the Impala Catalog server cache by running the This is a bit confusing. Maybe reword to something like "if you change privileges outside Impala, ..." http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_authorization.xml@112 PS1, Line 112: NVALIDATE METADATA statement. INVALIDATE METADATA is : not required if you make the changes to privileges within Impala. Replace INVALIDATE METADATA with REFRESH AUTHORIZATION instead since it's a more lightweight operation. http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_revoke.xml File docs/topics/impala_revoke.xml: http://gerrit.cloudera.org:8080/#/c/13235/1/docs/topics/impala_revoke.xml@114 PS1, Line 114: Ranger Ranger doesn't support roles. 
-- To view, visit http://gerrit.cloudera.org:8080/13235 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic85a74d81142803894d30c99cea0ec8a516bf756 Gerrit-Change-Number: 13235 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 17:14:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13234 ) Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3 .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4159/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14 Gerrit-Change-Number: 13234 Gerrit-PatchSet: 3 Gerrit-Owner: Austin Nobis Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Mon, 06 May 2019 16:56:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13234 ) Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3 .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14 Gerrit-Change-Number: 13234 Gerrit-PatchSet: 2 Gerrit-Owner: Austin Nobis Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Mon, 06 May 2019 16:56:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13234 ) Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3 .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14 Gerrit-Change-Number: 13234 Gerrit-PatchSet: 3 Gerrit-Owner: Austin Nobis Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Mon, 06 May 2019 16:56:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
Hello Laszlo Gaal, Fredy Wijaya, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13234 to look at the new patch set (#2). Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3 .. IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3 A hardcoded path in test_ranger.py for URI testing was updated to support S3, local, and HDFS as opposed to just HDFS. Testing: - Ran authorization E2E tests - Ran all FE tests - Ran test_ranger.py with S3 Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14 --- M tests/authorization/test_ranger.py 1 file changed, 2 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/13234/2 -- To view, visit http://gerrit.cloudera.org:8080/13234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14 Gerrit-Change-Number: 13234 Gerrit-PatchSet: 2 Gerrit-Owner: Austin Nobis Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal
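The idea behind this patch set can be sketched as follows. This is only an illustration, not the actual change to test_ranger.py: the helper name `build_test_uri` and the example prefixes are hypothetical, and the format string mirrors the "{0}{1}".format(...) style mentioned in the review comments.

```python
# Hypothetical sketch: build a test URI from a configurable filesystem
# prefix instead of hardcoding an HDFS address in the test.

def build_test_uri(filesystem_prefix, path):
    """Prepend a filesystem prefix (HDFS, S3, or empty for local) to an absolute path."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute: %r" % path)
    return "{0}{1}".format(filesystem_prefix, path)


# Example prefixes are assumptions chosen for illustration only.
hdfs_uri = build_test_uri("hdfs://localhost:20500", "/tmp/ranger_test")
s3_uri = build_test_uri("s3a://example-bucket", "/tmp/ranger_test")
local_uri = build_test_uri("", "/tmp/ranger_test")
```

With the prefix passed in, the same test body can run against any of the three storage targets named in the commit message.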
[Impala-ASF-CR] IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3
Austin Nobis has posted comments on this change. ( http://gerrit.cloudera.org:8080/13234 ) Change subject: IMPALA-8488: Fix hardcoded path in Ranger E2E test on S3 .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/13234/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13234/1//COMMIT_MSG@7 PS1, Line 7: Fix hardcoded path in Ranger E2E test on S3 > nit: usually we try to say something like "Fix hardcoded path in Ranger E2E Done http://gerrit.cloudera.org:8080/#/c/13234/1/tests/authorization/test_ranger.py File tests/authorization/test_ranger.py: http://gerrit.cloudera.org:8080/#/c/13234/1/tests/authorization/test_ranger.py@262 PS1, Line 262: "{0}{1}".forma > nit: "{0}{1}".format(NAMENODE, uri) Done -- To view, visit http://gerrit.cloudera.org:8080/13234 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie2c021ce212f483a644fdab4e77ab95031066b14 Gerrit-Change-Number: 13234 Gerrit-PatchSet: 2 Gerrit-Owner: Austin Nobis Gerrit-Reviewer: Austin Nobis Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Mon, 06 May 2019 16:33:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5843: Use page index in Parquet files to skip pages
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12065 ) Change subject: IMPALA-5843: Use page index in Parquet files to skip pages .. Patch Set 19: Code-Review+1 (1 comment) I am ok with giving +2, but I give Lars a chance to look at the modifications he asked for. http://gerrit.cloudera.org:8080/#/c/12065/19/be/src/exec/parquet/parquet-column-stats.h File be/src/exec/parquet/parquet-column-stats.h: http://gerrit.cloudera.org:8080/#/c/12065/19/be/src/exec/parquet/parquet-column-stats.h@278 PS19, Line 278: /// Returns the required stats field for the given function. 'fn_name' can be 'le', : /// 'lt', 'ge', and 'gt' (i.e. binary operators <=, <, >=, >). If we want to check that : /// whether a column contains a value less than a constant, we need the minimum value of : /// the column to answer that question. And, to answer the opposite question we need the : /// maximum value. The required stats field (min/max) will be stored in 'stats_field'. : /// The function returns true on success, false otherwise. : static bool GetRequiredStatsField(const std::string& fn_name, StatsField* stats_field); optional: I think that this long comment makes this simple function look complex - I would prefer to move the implementation to the header and add only a minimal comment. -- To view, visit http://gerrit.cloudera.org:8080/12065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a Gerrit-Change-Number: 12065 Gerrit-PatchSet: 19 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 06 May 2019 16:28:03 + Gerrit-HasComments: Yes
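The mapping described in the quoted comment is small enough to show directly. The following is a Python rendering for illustration only; the real function is C++ and returns a bool with an output parameter, whereas this sketch returns the field or None. The constant names are assumptions.

```python
# Illustrative Python rendering of the min/max selection described above.
MIN = "min"
MAX = "max"


def get_required_stats_field(fn_name):
    """Return the stats field needed to evaluate a binary predicate.

    'lt' and 'le' ask whether the column holds a value below a constant,
    which is answered by the column minimum. 'gt' and 'ge' ask the opposite
    question and need the maximum. Any other name is unsupported.
    """
    if fn_name in ("lt", "le"):
        return MIN
    if fn_name in ("gt", "ge"):
        return MAX
    return None
```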
[Impala-ASF-CR] IMPALA-8499: avoid datetime.total_seconds() in test_insert_events
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/13245 ) Change subject: IMPALA-8499: avoid datetime.total_seconds() in test_insert_events .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52 Gerrit-Change-Number: 13245 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 06 May 2019 16:13:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8499: avoid datetime.total_seconds() in test_insert_events
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13245 ) Change subject: IMPALA-8499: avoid datetime.total_seconds() in test_insert_events .. IMPALA-8499: avoid datetime.total_seconds() in test_insert_events This function was only added in Python 2.7. Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52 --- M tests/custom_cluster/test_event_processing.py 1 file changed, 2 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/13245/1 -- To view, visit http://gerrit.cloudera.org:8080/13245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52 Gerrit-Change-Number: 13245 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
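For readers who need the same workaround: total_seconds() is a method of timedelta objects, and its result can be computed from the days, seconds, and microseconds attributes, which exist in older Python versions too. The helper below is a sketch of that approach and is not taken from the patch; the name `total_seconds` is chosen here for illustration.

```python
from datetime import timedelta


def total_seconds(delta):
    """Compute a timedelta's length in seconds without timedelta.total_seconds()."""
    # A timedelta stores its value as days, seconds and microseconds.
    return delta.days * 86400 + delta.seconds + delta.microseconds / 1e6


elapsed = total_seconds(timedelta(days=1, seconds=2, microseconds=500000))
```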
[Impala-ASF-CR] IMPALA-8499: avoid datetime.total_seconds() in test_insert_events
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13245 ) Change subject: IMPALA-8499: avoid datetime.total_seconds() in test_insert_events .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4158/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8e6e556d99d07c1f559a2097fbd634bfc5eaaa52 Gerrit-Change-Number: 13245 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 16:04:30 + Gerrit-HasComments: No
[Impala-ASF-CR] Hive 3: fix test_permanent_udfs.py for Hive 3 support
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13236 ) Change subject: Hive 3: fix test_permanent_udfs.py for Hive 3 support .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py File tests/custom_cluster/test_permanent_udfs.py: http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py@507 PS2, Line 507: now not ? -- To view, visit http://gerrit.cloudera.org:8080/13236 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169 Gerrit-Change-Number: 13236 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 06 May 2019 15:20:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4157/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 5 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 14:31:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 5 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 14:31:42 + Gerrit-HasComments: No
[Impala-ASF-CR] Hive 3: fix test_permanent_udfs.py for Hive 3 support
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13236 ) Change subject: Hive 3: fix test_permanent_udfs.py for Hive 3 support .. Patch Set 2: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/13236/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13236/2//COMMIT_MSG@16 PS2, Line 16: This function also : exists in Hive 2, so while it isn't necessary, I didn't bother to make : it conditional on version Maybe create a function like describe_fn__in_hive(self, db, fn)? It could also branch on IMPALA_HIVE_MAJOR_VERSION and add "RELOAD FUNCTION" only if needed. http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py File tests/custom_cluster/test_permanent_udfs.py: http://gerrit.cloudera.org:8080/#/c/13236/2/tests/custom_cluster/test_permanent_udfs.py@507 PS2, Line 507: implemened typo: implemented -- To view, visit http://gerrit.cloudera.org:8080/13236 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f50845c7d4769d8843cad87988498e165902169 Gerrit-Change-Number: 13236 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sudhanshu Arora Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 06 May 2019 14:11:30 + Gerrit-HasComments: Yes
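The helper suggested in the first comment might look roughly like the sketch below. Everything here is an assumption made for illustration: the function name, the idea of returning a list of statements, and the rule that the reload is only added for Hive 3 and later. The actual test may run the statements differently.

```python
# Hypothetical sketch of a version-aware helper for describing a function in Hive.

def describe_fn_statements(db, fn, hive_major_version):
    """Return the Hive statements to run, in order, to describe db.fn."""
    statements = []
    if hive_major_version >= 3:
        # Assumption: only Hive 3+ needs its function registry reloaded first.
        statements.append("RELOAD FUNCTION")
    statements.append("DESCRIBE FUNCTION {0}.{1}".format(db, fn))
    return statements
```

A caller would pass the value of IMPALA_HIVE_MAJOR_VERSION, converted to an integer, as `hive_major_version`.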
[Impala-ASF-CR] IMPALA-5843: Use page index in Parquet files to skip pages
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/12065 ) Change subject: IMPALA-5843: Use page index in Parquet files to skip pages .. Patch Set 17: (8 comments) http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@464 PS17, Line 464: bool GetRequiredStatsField(const string& fn_name, > I think this could go into parquet-column-stats.{h,cc}. Done http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@546 PS17, Line 546: ColumnStatsReader::StatsField stats_field = ColumnStatsReader::StatsField::MIN; > I think we usually don't initialize output parameters to make it clear that Hmm, I thought clang-tidy didn't like that but it must have been something else because it doesn't complain now. http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/hdfs-parquet-scanner.cc@643 PS17, Line 643: // We don't need the raw page index buffers anymore. : page_index_.Release(); > Can this go to ProcessPageIndex? It already touches a bunch of other state. ProcessPageIndex() has some RETURN_IF_ERROR macros and I wanted to be sure about calling Release(). However, I realised that I can just use a scope exit trigger in ProcessPageIndex(). http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-column-readers.h File be/src/exec/parquet/parquet-column-readers.h: http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-column-readers.h@400 PS17, Line 400: /// True, if we are using NextLevels() to readahead the next def and rep levels. In this > I feel that this field needs more explanation. From just looking at the com Elaborated the comment. 
http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-column-readers.h@403 PS17, Line 403: bool levels_readahead_ = false; > Would it simplify the code to make this levels_read_ahead_offset_ (being -1 Since currently there are only two possibilities (-1 or 0), having a simple flag is more exact I think. Also, we'd still need an if stmt when we have processed all the rows, because in this case we don't need to adjust the value of 'current_row_'. http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-page-index-test.cc File be/src/exec/parquet/parquet-page-index-test.cc: http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-page-index-test.cc@58 PS17, Line 58: void ValidatePageIndexRange(const RowGroupRanges& row_group_ranges, > Add a comment for this one, too? Done http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-page-index.h File be/src/exec/parquet/parquet-page-index.h: http://gerrit.cloudera.org:8080/#/c/12065/17/be/src/exec/parquet/parquet-page-index.h@56 PS17, Line 56: if there's at least parts of the page index are present > nit: grammar Rephrased the whole comment. http://gerrit.cloudera.org:8080/#/c/12065/17/tests/query_test/test_parquet_stats.py File tests/query_test/test_parquet_stats.py: http://gerrit.cloudera.org:8080/#/c/12065/17/tests/query_test/test_parquet_stats.py@87 PS17, Line 87: for batch_size in [0, 1]: > Should we use a proper test dimension for the batch size, e.g. like in test Yeah, I looked at that earlier, but I didn't want to run the other tests with those batch sizes, nor did I want to add if statements for each. And I use different batch sizes here and at L97 which would be a bit more complicated with the other approach. Also, it has the advantage of loading the data only once. On the other hand, I agree that this is not the cleanest solution, so I can change it if you feel strongly about it.
-- To view, visit http://gerrit.cloudera.org:8080/12065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a Gerrit-Change-Number: 12065 Gerrit-PatchSet: 17 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 06 May 2019 14:05:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5843: Use page index in Parquet files to skip pages
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12065 ) Change subject: IMPALA-5843: Use page index in Parquet files to skip pages .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3083/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a Gerrit-Change-Number: 12065 Gerrit-PatchSet: 19 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 06 May 2019 14:01:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4658: Potential race if compiler reorders ReachedLimit() usage.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13178 ) Change subject: IMPALA-4658: Potential race if compiler reorders ReachedLimit() usage. .. Patch Set 6: Code-Review+1 (4 comments) http://gerrit.cloudera.org:8080/#/c/13178/6/be/src/exec/exec-node.h File be/src/exec/exec-node.h: http://gerrit.cloudera.org:8080/#/c/13178/6/be/src/exec/exec-node.h@209 PS6, Line 209: virtual bool LimitCheckedFromMultipleThreads() const { return false; } : virtual bool IsTaskBasedMultiThreadingSupport() const { return false; } optional: maybe creating an enum like ThreadingModel would be better to express this? e.g. SINGLE_THREADED, NON_TASK_BASED_SCANNER, TASK_BASED_SCANNER. http://gerrit.cloudera.org:8080/#/c/13178/6/be/src/exec/exec-node.h@277 PS6, Line 277: /// Caps the input row batch to ensure that the limit is not exceeded. : /// Sets the eos and returns true, if the limit is reached. : bool CheckLimitAndTruncateRowBatchIfNeeded(RowBatch* row_batch, bool* eos); : : /// Caps the input row batch to ensure that the limit is not exceeded. : /// Sets the eos and returns true, if the limit is reached. : /// Uses thread safe functions. : bool CheckLimitAndTruncateRowBatchIfNeededShared(RowBatch* row_batch, bool* eos); These could be moved to "protected". Can you check other functions too and make them protected, unless other classes use them? http://gerrit.cloudera.org:8080/#/c/13178/6/be/src/exec/exec-node.cc File be/src/exec/exec-node.cc: http://gerrit.cloudera.org:8080/#/c/13178/6/be/src/exec/exec-node.cc@418 PS6, Line 418: (limit_ == -1 || (rows_returned() + row_batch_size) < limit_) Same as line 436. http://gerrit.cloudera.org:8080/#/c/13178/6/be/src/exec/exec-node.cc@436 PS6, Line 436: (limit_ == -1 || (rows_returned_shared() + row_batch_size) < limit_) 'reached_limit' could be set to this value from the start. 
It could be also reused instead of calling ReachedLimitShared() - we already assume that no other thread changes num_rows_returned_. -- To view, visit http://gerrit.cloudera.org:8080/13178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4cbbfad80f7ab87dd6f192a24e2c68f7c66b047e Gerrit-Change-Number: 13178 Gerrit-PatchSet: 6 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 06 May 2019 13:51:07 + Gerrit-HasComments: Yes
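The limit check under discussion can be sketched in a few lines. This is a single-threaded Python illustration of the quoted condition, not the C++ implementation and not its thread-safe variant; the function name and the tuple return value are assumptions.

```python
def check_limit_and_truncate(rows, rows_returned, limit):
    """Cap a batch so the running total never exceeds the limit.

    Returns (rows_to_emit, new_rows_returned, reached_limit).
    A limit of -1 means "no limit", as in the quoted condition.
    """
    if limit == -1 or rows_returned + len(rows) < limit:
        # The whole batch fits and the limit is still not reached.
        return rows, rows_returned + len(rows), False
    # Keep only as many rows as are still allowed; never a negative count.
    keep = max(limit - rows_returned, 0)
    return rows[:keep], rows_returned + keep, True
```

Computing the condition once and reusing it for the return value is in the spirit of the last comment above, which suggests setting 'reached_limit' from the same expression instead of calling a second check.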
[Impala-ASF-CR] IMPALA-5843: Use page index in Parquet files to skip pages
Hello Michael Ho, Lars Volker, Pooja Nilangekar, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12065 to look at the new patch set (#19). Change subject: IMPALA-5843: Use page index in Parquet files to skip pages .. IMPALA-5843: Use page index in Parquet files to skip pages This commit implements page filtering based on the Parquet page index. The read and evaluation of the page index is done by the HdfsParquetScanner. At first, we determine the row ranges we are interested in, and based on the row ranges we determine the candidate pages for each column that we are reading. We still issue one ScanRange per column chunk, but we specify sub-ranges that store the candidate pages, i.e. we don't read the whole column chunk, but only fractions of it. Pages are not aligned across column chunks, i.e. page #2 of column A might store completely different rows than page #2 of column B. It means we need to implement some kind of row-skipping logic when we read the data pages. This logic is implemented in BaseScalarColumnReader and ScalarColumnReader. Collection column readers know nothing about page filtering. Page filtering can be turned off by setting the query option 'read_parquet_page_index' to false. Testing: * added some unit tests for the row range and page selection logic * generated various Parquet files with Parquet-MR * enabled Page index writing and wrote selective queries against tables written by Impala. Current tests are likely to use page filtering transparently. Performance: * Measured locally, observed 3x to 20x speedup for selective queries. The speedup was proportional to the IO operations that needed to be done. * The TPCH benchmark didn't show a significant performance change. It is not a surprise since the data is not being sorted in any useful way. So the main goal was to not introduce perf regression.
TODO: * measure performance for remote reads Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a --- M be/src/common/global-flags.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/parquet/CMakeLists.txt M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h M be/src/exec/parquet/parquet-column-stats.cc M be/src/exec/parquet/parquet-column-stats.h A be/src/exec/parquet/parquet-common-test.cc M be/src/exec/parquet/parquet-common.cc M be/src/exec/parquet/parquet-common.h M be/src/exec/parquet/parquet-level-decoder.h A be/src/exec/parquet/parquet-page-index-test.cc A be/src/exec/parquet/parquet-page-index.cc A be/src/exec/parquet/parquet-page-index.h M be/src/exprs/literal.cc M be/src/runtime/scoped-buffer.h M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M testdata/data/README A testdata/data/alltypes_tiny_pages.parquet A testdata/data/alltypes_tiny_pages_plain.parquet A testdata/data/decimals_1_10.parquet A testdata/data/double_nested_decimals.parquet A testdata/data/nested_decimals.parquet A testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-page-index.test A testdata/workloads/functional-query/queries/QueryTest/parquet-page-index-alltypes-tiny-pages-plain.test A testdata/workloads/functional-query/queries/QueryTest/parquet-page-index-alltypes-tiny-pages.test A testdata/workloads/functional-query/queries/QueryTest/parquet-page-index-large.test A testdata/workloads/functional-query/queries/QueryTest/parquet-page-index.test M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test M tests/query_test/test_parquet_stats.py 36 files changed, 3,396 insertions(+), 95 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/12065/19 -- 
To view, visit http://gerrit.cloudera.org:8080/12065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a Gerrit-Change-Number: 12065 Gerrit-PatchSet: 19 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
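The page selection step described in the commit message can be illustrated with a small sketch. It assumes each page is identified by the index of its first row and that row ranges are inclusive (first, last) pairs; the function name and data layout are hypothetical and do not reflect the scanner's actual data structures.

```python
def candidate_pages(page_first_rows, num_rows, row_ranges):
    """Return the indexes of pages that overlap at least one row range.

    page_first_rows: sorted first-row index of each page in a column chunk.
    num_rows: total number of rows in the row group.
    row_ranges: list of inclusive (first, last) row ranges to read.
    """
    selected = []
    for i, first in enumerate(page_first_rows):
        # A page ends right before the next page starts, or at the last row.
        if i + 1 < len(page_first_rows):
            last = page_first_rows[i + 1] - 1
        else:
            last = num_rows - 1
        if any(first <= r_last and r_first <= last for r_first, r_last in row_ranges):
            selected.append(i)
    return selected


# Pages are not aligned across columns, so the same row range selects
# different page numbers in each column chunk.
pages_a = candidate_pages([0, 100, 200], 300, [(120, 160)])
pages_b = candidate_pages([0, 150], 300, [(120, 160)])
```

Here column A needs only its second page, while column B needs both of its pages and must skip the rows outside the range when decoding them, which is the row-skipping logic the commit message refers to.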
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/13189/3/tests/query_test/test_scanners.py File tests/query_test/test_scanners.py: http://gerrit.cloudera.org:8080/#/c/13189/3/tests/query_test/test_scanners.py@359 PS3, Line 359: """ > nit: close quote at the end of the previous line. 'test_timestamp_out_of_range' in L330-333 also uses this style for multiline comments. -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 12:44:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. Patch Set 4: Code-Review+2 (1 comment) I'm fine with the changes. Normally I would give only a +1 on this as I don't have deep knowledge around this code, but since Csaba already gave a +1 I think this is free to go. http://gerrit.cloudera.org:8080/#/c/13189/3/tests/query_test/test_scanners.py File tests/query_test/test_scanners.py: http://gerrit.cloudera.org:8080/#/c/13189/3/tests/query_test/test_scanners.py@359 PS3, Line 359: """ nit: close quote at the end of the previous line. -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 4 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 12:33:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3082/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 4 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 May 2019 10:16:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7370: DATE: Read/Write to parquet.
Attila Jeges has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/13189 ) Change subject: IMPALA-7370: DATE: Read/Write to parquet. .. IMPALA-7370: DATE: Read/Write to parquet. This change is a follow-up to IMPALA-7368 and adds support for DATE type to the parquet scanner/writer. CREATE TABLE LIKE PARQUET statements associated with data files that contain dates are also supported. Parquet uses DATE logical type for dates. DATE logical type annotates an INT32 that stores the number of days from the Unix epoch, 1 January 1970. This representation introduces a parquet interoperability issue between Impala and older versions of Hive: - Before version 3.1, Hive used Julian calendar to represent dates up to 1582-10-05 and Gregorian calendar for dates starting with 1582-10-15. Dates between 1582-10-05 and 1582-10-15 were lost. - Impala uses proleptic Gregorian calendar, extending the Gregorian calendar backward to dates preceding its official introduction in 1582-10-15. This means that pre-1582-10-15 dates written to a parquet table by Hive will be read back incorrectly by Impala and vice versa. Note that Hive 3.1 switched to proleptic Gregorian calendar too, so for Hive 3.1+ this is no longer an issue. 
Change-Id: I67da03754531660bc8de3b6935580d46deae1814 --- M be/src/exec/hdfs-table-sink.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-stats.cc M be/src/exec/parquet/parquet-column-stats.h M be/src/exec/parquet/parquet-column-stats.inline.h M be/src/exec/parquet/parquet-common.h M be/src/exec/parquet/parquet-metadata-utils.cc M be/src/util/bit-packing.cc M common/thrift/generate_error_codes.py M fe/src/main/java/org/apache/impala/analysis/ParquetHelper.java M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M testdata/data/README A testdata/data/hive2_pre_gregorian.parquet A testdata/data/out_of_range_date.parquet M testdata/datasets/functional/schema_constraints.csv A testdata/workloads/functional-query/queries/QueryTest/date-fileformat-support.test D testdata/workloads/functional-query/queries/QueryTest/date-text-only-support.test A testdata/workloads/functional-query/queries/QueryTest/out-of-range-date.test M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test M testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test M tests/common/impala_connection.py M tests/custom_cluster/test_parquet_page_index.py M tests/query_test/test_date_queries.py M tests/query_test/test_insert_parquet.py M tests/query_test/test_scanners.py 28 files changed, 435 insertions(+), 148 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/13189/4 -- To view, visit http://gerrit.cloudera.org:8080/13189 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I67da03754531660bc8de3b6935580d46deae1814 Gerrit-Change-Number: 13189 Gerrit-PatchSet: 4 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
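The DATE representation described in the commit message (an INT32 counting days from 1 January 1970) can be illustrated with Python's datetime.date, which also uses the proleptic Gregorian calendar. This is a sketch for illustration and is unrelated to Impala's own DATE implementation; the function names are chosen here.

```python
from datetime import date, timedelta

UNIX_EPOCH = date(1970, 1, 1)


def date_to_days(d):
    """Number of days from the Unix epoch to d (negative before 1970)."""
    return (d - UNIX_EPOCH).days


def days_to_date(days):
    """Inverse of date_to_days, using the proleptic Gregorian calendar."""
    return UNIX_EPOCH + timedelta(days=days)


# In the proleptic Gregorian calendar there is no gap in October 1582,
# so these two dates are exactly ten days apart.
gap = date_to_days(date(1582, 10, 15)) - date_to_days(date(1582, 10, 5))
```

A writer using the hybrid Julian/Gregorian calendar would assign different day counts to dates before 1582-10-15, which is the interoperability issue with older Hive versions noted above.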