[Impala-ASF-CR] IMPALA-6204: Remove external DataSource
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/9192 ) Change subject: IMPALA-6204: Remove external DataSource .. Patch Set 5: (2 comments) Addressed the comments. Removed the remaining enums and renumbered the Thrifts. There are some conflicts with Tianyi's metadata compression patch, so I'll post another patch when I work through the rebase. http://gerrit.cloudera.org:8080/#/c/9192/5/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/9192/5/be/src/service/client-request-state.cc@297 PS5, Line 297: return Status("Data Source not supported."); > why can we even get here? doesn't the frontend reject this statement? Done http://gerrit.cloudera.org:8080/#/c/9192/5/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/9192/5/common/thrift/CatalogObjects.thrift@401 PS5, Line 401: // 12: optional TDataSourceTable data_source_table > I'd also prefer removing everything. I see no point in keeping stuff like t In a previous life, this was a big no-no. The reasons are for maintaining compatibility between different versions of servers and servers and servers and clients (such as what would happen in rolling upgrade). We don't really have that problem. A second reason is that if I recorded the network, say, and wanted to make sense of things, fields coming and going would drive me mad. In practice, we don't do that either. Anyway--if our practice is to renumber, I've gone ahead and done that. -- To view, visit http://gerrit.cloudera.org:8080/9192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6 Gerrit-Change-Number: 9192 Gerrit-PatchSet: 5 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Wed, 07 Feb 2018 04:31:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6204: Remove external DataSource
Hello Jim Apple, Dimitris Tsirogiannis, Alex Behm, Zach Amsden, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9192 to look at the new patch set (#6). Change subject: IMPALA-6204: Remove external DataSource .. IMPALA-6204: Remove external DataSource Removes DataSourceScanNode, external data sources, and all affiliated code, tests, and documentation. When a data source table is encountered, we now throw an exception. To the user, this looks like: [pannier.ca.cloudera.com:21000] > create table t (x int) stored as textfile tblproperties('__IMPALA_DATA_SOURCE_NAME'='V1'); Query: create table t (x int) stored as textfile tblproperties('__IMPALA_DATA_SOURCE_NAME'='V1') Fetched 0 row(s) in 0.11s [pannier.ca.cloudera.com:21000] > select * from t; Query: select * from t Query submitted at: 2018-02-01 17:16:26 (Coordinator: http://pannier.ca.cloudera.com:25000) ERROR: AnalysisException: Failed to load metadata for table: 't' CAUSED BY: TableLoadingException: Failed to load metadata for table: default.t. Running 'invalidate metadata default.t' may resolve this problem. CAUSED BY: UnsupportedOperationException: Eternal Data source table not supported. A test has been added to capture this behavior. For the most part, I deleted the unused code. In a few places, a renamed the Thrift enums and threw errors if they're encountered. For Thrift structs, I left a comment about the now-skipped id that used to represent a data-source related entry. Cherry-picks: not for 2.x Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6 --- M CMakeLists.txt M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/catalog/catalog-util.cc M be/src/exec/CMakeLists.txt M be/src/exec/catalog-op-executor.cc M be/src/exec/catalog-op-executor.h D be/src/exec/data-source-scan-node.cc D be/src/exec/data-source-scan-node.h M be/src/exec/exec-node.cc D be/src/exec/external-data-source-executor.cc D be/src/exec/external-data-source-executor.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M be/src/scheduling/scheduler.cc M be/src/service/client-request-state.cc M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M bin/clean-cmake.sh M bin/clean.sh M buildall.sh M common/thrift/CMakeLists.txt M common/thrift/CatalogObjects.thrift M common/thrift/CatalogService.thrift M common/thrift/Data.thrift M common/thrift/Descriptors.thrift D common/thrift/ExternalDataSource.thrift M common/thrift/Frontend.thrift M common/thrift/JniCatalog.thrift M common/thrift/PlanNodes.thrift M docs/impala.ditamap M docs/impala_keydefs.ditamap D docs/topics/impala_create_data_source.xml D docs/topics/impala_data_sources.xml D docs/topics/impala_drop_data_source.xml M docs/topics/impala_new_features.xml M docs/topics/impala_show.xml D ext-data-source/.gitignore D ext-data-source/CMakeLists.txt D ext-data-source/api/pom.xml D ext-data-source/api/src/main/java/org/apache/impala/extdatasource/util/SerializationUtils.java D ext-data-source/api/src/main/java/org/apache/impala/extdatasource/v1/ExternalDataSource.java D ext-data-source/pom.xml D ext-data-source/sample/pom.xml D ext-data-source/sample/src/main/java/org/apache/impala/extdatasource/sample/EchoDataSource.java D ext-data-source/test/pom.xml D ext-data-source/test/src/main/java/org/apache/impala/extdatasource/AllTypesDataSource.java M fe/CMakeLists.txt M fe/pom.xml M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java D fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java D fe/src/main/java/org/apache/impala/analysis/CreateTableDataSrcStmt.java D fe/src/main/java/org/apache/impala/analysis/DropDataSrcStmt.java M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java D fe/src/main/java/org/apache/impala/analysis/ShowDataSrcsStmt.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java D fe/src/main/java/org/apache/impala/catalog/DataSource.java D fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java D fe/src/main/java/org/apache/impala/extdatasource/ApiVersion.java D fe/src/main/java/org/apache/impala/extdatasource/ExternalDataSourceExecutor.java D fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M
[Impala-ASF-CR] IMPALA-6478: Remove garbage NativeAddPendingTopicItem log from catalog
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9228 ) Change subject: IMPALA-6478: Remove garbage NativeAddPendingTopicItem log from catalog .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1896/ -- To view, visit http://gerrit.cloudera.org:8080/9228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ieb5847a698768c704346371288544a54055bdf1d Gerrit-Change-Number: 9228 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 07 Feb 2018 04:04:29 + Gerrit-HasComments: No
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 07 Feb 2018 03:49:39 + Gerrit-HasComments: No
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Force inlining of BloomFilter::MakeMask I noticed that this function was showing up in perf top for TPC-H Q8 running locally. It wasn't inlined into BloomFilter::BucketFindAVX2. Inlining made the query ~5% faster for me locally. Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Reviewed-on: http://gerrit.cloudera.org:8080/9214 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M be/src/benchmarks/bloom-filter-benchmark.cc M be/src/util/bloom-filter.h 2 files changed, 86 insertions(+), 84 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9186 ) Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation thread count in KRPC .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5 Gerrit-Change-Number: 9186 Gerrit-PatchSet: 5 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 07 Feb 2018 03:24:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9186 ) Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation thread count in KRPC .. IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation thread count in KRPC With the fix for KUDU-2228, the FLAGS_rpc_negotiation_timeout_ms was retired in KRPC. This patch introduces a flag to be able to configure that from the Impala side (FLAGS_rpc_negotiation_timeout_ms). It also introduces a flag to configure the negotiation thread count (FLAGS_rpc_negotiation_thread_count). Added a test to verify that setting FLAGS_rpc_negotiation_timeout_ms to 0 causes negotiation failures. We unfortunately can't write a test to check the same for FLAGS_rpc_negotiation_thread_count due to DCHECKS present in the code. Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5 Reviewed-on: http://gerrit.cloudera.org:8080/9186 Reviewed-by: Sailesh MukilTested-by: Impala Public Jenkins --- M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.cc 2 files changed, 28 insertions(+), 0 deletions(-) Approvals: Sailesh Mukil: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5 Gerrit-Change-Number: 9186 Gerrit-PatchSet: 6 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics
Fredy Wijaya has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Consistent explain format for parquet predicate statistics .. IMPALA-6392: Consistent explain format for parquet predicate statistics In EXPLAIN_LEVEL=2+, change the explain format for parquet predicate statistics to output each tuple descriptor per line. This change is to make it consistent with the output of other predicates. Before: parquet statistics predicates: c_custkey < 10, o_orderkey < 5, l_linenumber < 3 After: parquet statistics predicates: c_custkey < 10 parquet statistics predicates on o: o_orderkey < 5 parquet statistics predicates on o_lineitems: l_linenumber < 3 Testing: - Ran existing planner tests and updated the ones that are affected by this change. - Ran end-to-end tests in query_test Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e --- M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test 4 files changed, 51 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/9223/11 -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 11 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Consistent explain format for parquet predicate statistics .. Patch Set 11: Code-Review+1 Carrying Vuk's +1 -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 11 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 07 Feb 2018 02:55:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Consistent explain format for parquet predicate statistics .. Patch Set 10: Code-Review+1 Carrying Vuk's +1 -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 10 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 07 Feb 2018 02:44:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Consistent explain format for parquet predicate statistics .. Patch Set 9: Code-Review+1 (2 comments) Carrying Vuk's +1 http://gerrit.cloudera.org:8080/#/c/9223/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/9223/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@523 PS8, Line 523: List exprs = minMaxOriginalConjuncts_.get(tupleDesc); > Use this pattern for fewer lookups: Done http://gerrit.cloudera.org:8080/#/c/9223/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1104 PS8, Line 1104: minMaxOriginalConjuncts_.entrySet()) { > style: we use these standard abbreviations in many places: Done -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 9 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 07 Feb 2018 02:41:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics
Fredy Wijaya has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Consistent explain format for parquet predicate statistics .. IMPALA-6392: Consistent explain format for parquet predicate statistics In EXPLAIN_LEVEL=2+, change the explain format for parquet predicate statistics to output each tuple descriptor per line. This change is to make it consistent with the output of other predicates. Before: parquet statistics predicates: c_custkey < 10, o_orderkey < 5, l_linenumber < 3 After: parquet statistics predicates: c_custkey < 10 parquet statistics predicates on o: o_orderkey < 5 parquet statistics predicates on o_lineitems: l_linenumber < 3 Testing: - Ran existing planner tests and updated the ones that are affected by this change. - Ran end-to-end tests in query_test Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e --- M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test 4 files changed, 50 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/9223/9 -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 9 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR](2.x) IMPALA-6228: Control stats extrapolation via tbl prop.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9231 ) Change subject: IMPALA-6228: Control stats extrapolation via tbl prop. .. Patch Set 1: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1892/ -- To view, visit http://gerrit.cloudera.org:8080/9231 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: Ie49597bf1b93b7572106abc620d91f199cba0cfd Gerrit-Change-Number: 9231 Gerrit-PatchSet: 1 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Wed, 07 Feb 2018 02:40:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8966 ) Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool .. Patch Set 16: Rebased and fixed some minor merge conflicts. -- To view, visit http://gerrit.cloudera.org:8080/8966 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786 Gerrit-Change-Number: 8966 Gerrit-PatchSet: 16 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 07 Feb 2018 01:39:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool
Hello Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8966 to look at the new patch set (#16). Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool .. IMPALA-4835: Part 3: switch I/O buffers to buffer pool This is the final patch to switch the Disk I/O manager to allocate all buffer from the buffer pool and to reserve the buffers required for a query upfront. * The planner reserves enough memory to run a single scanner per scan node. * The multi-threaded scan node must increase reservation before spinning up more threads. * The scanner implementations must be careful to stay within their assigned reservation. The row-oriented scanners were most straightforward, since they only have a single scan range active at a time. A single I/O buffer is sufficient to scan the whole file but more I/O buffers can improve I/O throughput. Parquet is more complex because it issues a scan range per column and the sizes of the columns on disk are not known during planning. To deal with this, the reservation in the frontend is based on a heuristic involving the file size and # columns. The Parquet scanner can then divvy up reservation to columns based on the size of column data on disk. I adjusted how the 'mem_limit' is divided between buffer pool and non buffer pool memory for low mem_limits to account for the increase in buffer pool memory. Testing: * Added more planner tests to cover reservation calcs for scan node. * Test scanners for all file formats with the reservation denial debug action, to test behaviour when the scanners hit reservation limits. * Updated memory and buffer pool limits for tests. Perf: I ran TPC-H and targeted perf locally comparing with master. Both showed small improvements of a few percent and no regressions of note. I still need to run cluster perf tests. Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/bufferpool/reservation-tracker-test.cc M be/src/runtime/bufferpool/reservation-util.cc M be/src/runtime/io/disk-io-mgr-stress-test.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-stress.h M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/tmp-file-mgr.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/util/BitUtil.java M fe/src/test/java/org/apache/impala/util/BitUtilTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test M testdata/workloads/functional-planner/queries/PlannerTest/union.test M testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test M testdata/workloads/functional-query/queries/QueryTest/disk-spill-encryption.test M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test M
[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8707 ) Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront .. Patch Set 24: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8707 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f Gerrit-Change-Number: 8707 Gerrit-PatchSet: 24 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 07 Feb 2018 01:38:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
Hello Tianyi Wang, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8414 to look at the new patch set (#22). Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation .. IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation In preparation for switching the I/O mgr to the buffer pool, this removes and cleans up a lot of code so that the switchover patch starts from a cleaner slate. * Remove the free buffer cache (which will be replaced by buffer pool's own caching). * Make memory limit exceeded error checking synchronous (in anticipation of having to propagate buffer pool errors synchronously). * Simplify error propagation - remove the (ineffectual) code that enqueued BufferDescriptors containing error statuses. * Document locking scheme better in a few places, make it part of the function signature when it seemed reasonable. * Move ReturnBuffer() to ScanRange, because it is intrinsically connected with the lifecycle of a scan range. * Separate external ReturnBuffer() and internal CleanUpBuffer() interfaces - previously callers of ReturnBuffer() were fudging the num_buffers_in_reader accounting to make the external interface work. * Eliminate redundant state in ScanRange: 'eosr_returned_' and 'is_cancelled_'. * Clarify the logic around calling Close() for the last BufferDescriptor. -> There appeared to be an implicit assumption that buffers would be freed in the order they were returned from the scan range, so that the "eos" buffer was returned last. Instead just count the number of outstanding buffers to detect the last one. -> Touching the is_cancelled_ field without holding a lock was hard to reason about - violated locking rules and it was unclear that it was race-free. * Remove DiskIoMgr::Read() to simplify the interface. It is trivial to inline at the callsites. This will probably regress performance somewhat because of the cache removal, so my plan is to merge it around the same time as switching the I/O mgr to allocate from the buffer pool. I'm keeping the patches separate to make reviewing easier. Testing: * Ran exhaustive tests * Ran the disk-io-mgr-stress-test overnight Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/exec-env.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/mem-tracker.h M be/src/runtime/test-env.cc M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h 19 files changed, 577 insertions(+), 909 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/22 -- To view, visit http://gerrit.cloudera.org:8080/8414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1 Gerrit-Change-Number: 8414 Gerrit-PatchSet: 22 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8966 ) Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool .. Patch Set 13: (4 comments) http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-parquet-scanner.cc@1707 PS12, Line 1707: D > would it make sense to replace this with a named constant? something that a Yeah, not using magic numbers was like week 2 of undergrad :). There's an explanation for the constant in the DiskIoMgr header so I also added a definition of the constant there. http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-parquet-scanner.cc@1724 PS12, Line 1724: // We need to rea > would it be useful to squeeze max amount of buffers if we do this here: I think it's better to have fewer, larger, buffers instead of adding a smaller buffer into the circulation so that we're consistently doing larger I/Os, instead of a mix of small and large I/Os. It may be faster for this query in a lot of cases, but it's also globally more efficient - if this query is somewhat memory constrained, at least it's less inclined to clog up the I/O queues for everyone else. I added a comment because it's very subtle (I thought about this at the time I wrote the code and neglected to write it down). http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-scan-node-base.cc@347 PS12, Line 347: // We got the minimum reservation. Now try to get ideal reservation. : if (resource_profile_.min_reservation != ideal_scan_range_reservation_) { : ignore_result(buffer_pool_client_.IncreaseReservation( : ideal_scan_range_reservation_ - resource_profile_.min_reservation)); : } > should we briefly document how ideal_scan_range_reservation_ is used as co Done http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/scanner-context.cc File be/src/exec/scanner-context.cc: http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/scanner-context.cc@161 PS12, Line 161: parent_->bp_client_, > is this the same clientHandle object used across different threads? Yeah it is. Added comments to the HdfsScanNode class header and to the bp_client_ declaration to clarify this. -- To view, visit http://gerrit.cloudera.org:8080/8966 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786 Gerrit-Change-Number: 8966 Gerrit-PatchSet: 13 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 07 Feb 2018 01:24:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1895/ -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 07 Feb 2018 00:08:39 + Gerrit-HasComments: No
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 07 Feb 2018 00:08:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Consistent explain format for parquet predicate statistics .. Patch Set 8: (2 comments) Nice! http://gerrit.cloudera.org:8080/#/c/9223/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/9223/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@523 PS8, Line 523: if (!minMaxOriginalConjuncts_.containsKey(tupleDescriptor)) { Use this pattern for fewer lookups: List l = map.get(key); if (l == null) { l = new List(); map.put(key, l); } l.add(value); http://gerrit.cloudera.org:8080/#/c/9223/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1104 PS8, Line 1104: TupleDescriptor tupleDescriptor = entry.getKey(); style: we use these standard abbreviations in many places: tupleDescriptor -> tupleDesc or td expressions -> exprs -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 07 Feb 2018 00:06:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool
Hello Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8966 to look at the new patch set (#13). Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool .. IMPALA-4835: Part 3: switch I/O buffers to buffer pool This is the final patch to switch the Disk I/O manager to allocate all buffer from the buffer pool and to reserve the buffers required for a query upfront. * The planner reserves enough memory to run a single scanner per scan node. * The multi-threaded scan node must increase reservation before spinning up more threads. * The scanner implementations must be careful to stay within their assigned reservation. The row-oriented scanners were most straightforward, since they only have a single scan range active at a time. A single I/O buffer is sufficient to scan the whole file but more I/O buffers can improve I/O throughput. Parquet is more complex because it issues a scan range per column and the sizes of the columns on disk are not known during planning. To deal with this, the reservation in the frontend is based on a heuristic involving the file size and # columns. The Parquet scanner can then divvy up reservation to columns based on the size of column data on disk. I adjusted how the 'mem_limit' is divided between buffer pool and non buffer pool memory for low mem_limits to account for the increase in buffer pool memory. Testing: * Added more planner tests to cover reservation calcs for scan node. * Test scanners for all file formats with the reservation denial debug action, to test behaviour when the scanners hit reservation limits. * Updated memory and buffer pool limits for tests. Perf: I ran TPC-H and targeted perf locally comparing with master. Both showed small improvements of a few percent and no regressions of note. I still need to run cluster perf tests. Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/bufferpool/reservation-tracker-test.cc M be/src/runtime/bufferpool/reservation-util.cc M be/src/runtime/io/disk-io-mgr-stress-test.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-stress.h M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/tmp-file-mgr.cc M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/util/BitUtil.java M fe/src/test/java/org/apache/impala/util/BitUtilTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test M testdata/workloads/functional-planner/queries/PlannerTest/union.test M testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test M testdata/workloads/functional-query/queries/QueryTest/disk-spill-encryption.test M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test M
[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/9186 ) Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation thread count in KRPC .. Patch Set 5: Code-Review+2 Rebase, carry +2. -- To view, visit http://gerrit.cloudera.org:8080/9186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5 Gerrit-Change-Number: 9186 Gerrit-PatchSet: 5 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 06 Feb 2018 23:43:27 + Gerrit-HasComments: No
[Impala-ASF-CR] Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9226 ) Change subject: Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption" .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb Gerrit-Change-Number: 9226 Gerrit-PatchSet: 1 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 06 Feb 2018 23:40:42 + Gerrit-HasComments: No
[Impala-ASF-CR] Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9226 ) Change subject: Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption" .. Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption" This reverts commit 9b68645f9eb9e08899fda860e0946cc05f205479. Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb Reviewed-on: http://gerrit.cloudera.org:8080/9226 Reviewed-by: Sailesh MukilTested-by: Impala Public Jenkins --- M be/src/runtime/tmp-file-mgr.cc M be/src/util/cpu-info.cc M be/src/util/cpu-info.h M be/src/util/openssl-util-test.cc M be/src/util/openssl-util.cc M be/src/util/openssl-util.h 6 files changed, 73 insertions(+), 227 deletions(-) Approvals: Sailesh Mukil: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb Gerrit-Change-Number: 9226 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6424: Avoid loading metadata twice when refreshing unloaded tables
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9224 ) Change subject: IMPALA-6424: Avoid loading metadata twice when refreshing unloaded tables .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9224/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/9224/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3105 PS1, Line 3105: tbl = getExistingTable(tblName.getDb(), tblName.getTbl()); Change seems fine. Have you considered changing getExistingTable() to also return whether the table was cached or loaded? Seems potentially cleaner, but more invasive. -- To view, visit http://gerrit.cloudera.org:8080/9224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie41a734493dcea0e36d6b051966f1d0302907dee Gerrit-Change-Number: 9224 Gerrit-PatchSet: 1 Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Alex Behm Gerrit-Comment-Date: Tue, 06 Feb 2018 23:22:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4795: Allow fetching function obj from catalog using signature
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9225 ) Change subject: IMPALA-4795: Allow fetching function obj from catalog using signature .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1893/ -- To view, visit http://gerrit.cloudera.org:8080/9225 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3c19edd56e645b8bee918a111c76fe3f260142fe Gerrit-Change-Number: 9225 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 23:10:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4795: Allow fetching function obj from catalog using signature
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/9225 ) Change subject: IMPALA-4795: Allow fetching function obj from catalog using signature .. Patch Set 1: > It looks like the old patch applied cleanly right? I did another > pass over the patch to sanity-check it and confirm that the > approach was the same as the previous one. yup it applied cleanly, i also ran test_udfs a bunch of times to make sure it doesn't fail anymore -- To view, visit http://gerrit.cloudera.org:8080/9225 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3c19edd56e645b8bee918a111c76fe3f260142fe Gerrit-Change-Number: 9225 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 23:09:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5519: Allocate fragment's runtime filter memory from Buffer pool
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/8971 ) Change subject: IMPALA-5519: Allocate fragment's runtime filter memory from Buffer pool .. Patch Set 9: (8 comments) http://gerrit.cloudera.org:8080/#/c/8971/9/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/8971/9/be/src/service/query-options.cc@380 PS9, Line 380:"buffer size that can be allocated by the buffer pool", > nit: indentation Done http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@75 PS9, Line 75: max > Shouldn't this be the "min buffer size"? Done http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@108 PS9, Line 108: private class FilterSizeLimits { > Add a small class comment Done http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@112 PS9, Line 112: BitUtil.roundUpToPowerOf2( > Why do you need to do it here and not when the final 'max' value is compute I did that just make the final assignments look cleaner in code, otherwise it would look something like this: max = BitUtil.roundUpToPowerOf2(Math.max((tQueryOptions.getRuntime_filter_max_size(), BackendConfig.INSTANCE.getMinBufferSize())); or I could bring it down to : max = BitUtil.roundUpToPowerOf2(Math.max(maxLimit, minBufferSize)); http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@116 PS9, Line 116: // TODO: what happens if minbuffersize is greater than the MAX_BLOOM_FILTER_SIZE? > What do we know about this TODO? Can you check with Tim? oops forgot to remove that one. already talked to tim about that http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@123 PS9, Line 123: Preconditions.checkArgument(minLimit >= MIN_BLOOM_FILTER_SIZE); : Preconditions.checkArgument(minLimit <= MAX_BLOOM_FILTER_SIZE); > Hm, I know I asked you to add these but they do seem kind of redundant if w Done http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@134 PS9, Line 134: > nit: extra white space Done http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@135 PS9, Line 135: // Maximum filter size, in bytes, rounded up to a power of two. : public final long max; : : // Minimum filter size, in bytes, rounded up to a power of two. : public final long min; : : // Pre-computed default filter size, in bytes, rounded up to a power of two. : public final long defaultVal; > nit: move them to the beginning of the class (Java style :)). Also, you cou I was thinking the same (to rename to maxSize). Lemme go ahead and do that -- To view, visit http://gerrit.cloudera.org:8080/8971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iea2759665fb2e8bef9433014a8d42a7ebf99ce1f Gerrit-Change-Number: 8971 Gerrit-PatchSet: 9 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 23:05:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5037: Default PARQUET ARRAY RESOLUTION=THREE LEVEL
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9210 ) Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9210 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1 Gerrit-Change-Number: 9210 Gerrit-PatchSet: 3 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 22:58:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5037: Default PARQUET ARRAY RESOLUTION=THREE LEVEL
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9210 ) Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL .. IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL Changes the default value for the PARQUET_ARRAY_RESOLUTION query option to conform to the Parquet standard. Before: TWO_LEVEL_THEN_THREE_LEVEL After: THREE_LEVEL For more information see: * IMPALA-4725 * https://github.com/apache/parquet-format/blob/master/LogicalTypes.md Testing: - expands and cleans up the existing tests for more coverage over the different resolution policies - private core/hdfs run passed Cherry-picks: not for 2.x. Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1 Reviewed-on: http://gerrit.cloudera.org:8080/9210 Reviewed-by: Alex BehmTested-by: Impala Public Jenkins --- M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_nested_types.py 3 files changed, 153 insertions(+), 193 deletions(-) Approvals: Alex Behm: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9210 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1 Gerrit-Change-Number: 9210 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR](2.x) IMPALA-6228: Control stats extrapolation via tbl prop.
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9231 ) Change subject: IMPALA-6228: Control stats extrapolation via tbl prop. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9231/1/testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test File testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test: http://gerrit.cloudera.org:8080/#/c/9231/1/testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test@3 PS1, Line 3: # This test relies on a deterministic row order so we use "sort by (id)". Only conflict was here. It clashed with Lars change to do SORT BY default. -- To view, visit http://gerrit.cloudera.org:8080/9231 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: comment Gerrit-Change-Id: Ie49597bf1b93b7572106abc620d91f199cba0cfd Gerrit-Change-Number: 9231 Gerrit-PatchSet: 1 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 06 Feb 2018 22:45:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6075: Add Impala daemon metric for catalog version.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8949 ) Change subject: IMPALA-6075: Add Impala daemon metric for catalog version. .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8949 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id2307eb434561ed74ff058106541c0ebda017d97 Gerrit-Change-Number: 8949 Gerrit-PatchSet: 9 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Dimitris TsirogiannisGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 22:44:33 + Gerrit-HasComments: No
[Impala-ASF-CR](2.x) IMPALA-6228: Control stats extrapolation via tbl prop.
Hello Impala Public Jenkins, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/9231 to review the following change. Change subject: IMPALA-6228: Control stats extrapolation via tbl prop. .. IMPALA-6228: Control stats extrapolation via tbl prop. Introduces a new TBLPROPERTY for controlling stats extrapolation on a per-table basis: impala.enable.stats.extrapolation=true/false The property key was chosen to be consistent with the impalad startup flag --enable_stats_extrapolation and to indicate that the property was set and is used by Impala. Behavior: - If the property is not set, then the extrapolation behavior is determined by the impalad startup flag. - If the property is set, it overrides the impalad startup flag, i.e., extrapolation can be explicitly enabled or disabled regardless of the startup flag. Testing: - added new unit tests - code/hdfs run passed Change-Id: Ie49597bf1b93b7572106abc620d91f199cba0cfd Reviewed-on: http://gerrit.cloudera.org:8080/9139 Reviewed-by: Alex BehmTested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/planner/StatsExtrapolationTest.java M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test M tests/custom_cluster/test_stats_extrapolation.py A tests/metadata/test_stats_extrapolation.py 9 files changed, 359 insertions(+), 174 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/9231/1 -- To view, visit http://gerrit.cloudera.org:8080/9231 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: 2.x Gerrit-MessageType: newchange Gerrit-Change-Id: Ie49597bf1b93b7572106abc620d91f199cba0cfd Gerrit-Change-Number: 9231 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Behm Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/9186 ) Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation thread count in KRPC .. Patch Set 4: Code-Review+2 (2 comments) Thanks for the review. Holding off on GVO until IMPALA-6485 unblocks GVO. http://gerrit.cloudera.org:8080/#/c/9186/3/be/src/rpc/rpc-mgr-test.cc File be/src/rpc/rpc-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/9186/3/be/src/rpc/rpc-mgr-test.cc@235 PS3, Line 235: // establishment fails. > May help to clarify: Done http://gerrit.cloudera.org:8080/#/c/9186/3/be/src/rpc/rpc-mgr-test.cc@252 PS3, Line 252: secondary_rpc_mgr.Shutdown(); > nit: extra blank line. Done -- To view, visit http://gerrit.cloudera.org:8080/9186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5 Gerrit-Change-Number: 9186 Gerrit-PatchSet: 4 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 06 Feb 2018 22:31:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC
Hello Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9186 to look at the new patch set (#4). Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation thread count in KRPC .. IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation thread count in KRPC With the fix for KUDU-2228, the FLAGS_rpc_negotiation_timeout_ms was retired in KRPC. This patch introduces a flag to be able to configure that from the Impala side (FLAGS_rpc_negotiation_timeout_ms). It also introduces a flag to configure the negotiation thread count (FLAGS_rpc_negotiation_thread_count). Added a test to verify that setting FLAGS_rpc_negotiation_timeout_ms to 0 causes negotiation failures. We unfortunately can't write a test to check the same for FLAGS_rpc_negotiation_thread_count due to DCHECKS present in the code. Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5 --- M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.cc 2 files changed, 28 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/9186/4 -- To view, visit http://gerrit.cloudera.org:8080/9186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5 Gerrit-Change-Number: 9186 Gerrit-PatchSet: 4 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-5440 Add planner tests with extreme statistics values
xyutin...@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440 Add planner tests with extreme statistics values .. Patch Set 10: Patch 10 stressed on the comments on Patch 9 and does not include the test for ExchangeNode.java, HdfsScanNode.java, HdfsTableSink.java and PlanFragment.java -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 10 Gerrit-Owner: xyutin...@cloudera.com Gerrit-Reviewer: Alex BehmGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: xyutin...@cloudera.com Gerrit-Comment-Date: Tue, 06 Feb 2018 22:25:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5440 Add planner tests with extreme statistics values
xyutin...@cloudera.com has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440 Add planner tests with extreme statistics values .. IMPALA-5440 Add planner tests with extreme statistics values This commit address some of the issues in JIRA: tests against the cardinality overflowing from JOIN, UNION, CROSS JOIN, FULL OUTER JOIN, 0 row number and negative row number, as well as cardinality on Subplan node. Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb --- M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 2 files changed, 156 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/9065/10 -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 10 Gerrit-Owner: xyutin...@cloudera.com Gerrit-Reviewer: Alex BehmGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: xyutin...@cloudera.com
[Impala-ASF-CR] IMPALA-6478: Remove garbage NativeAddPendingTopicItem log from catalog
Tianyi Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9228 Change subject: IMPALA-6478: Remove garbage NativeAddPendingTopicItem log from catalog .. IMPALA-6478: Remove garbage NativeAddPendingTopicItem log from catalog After IMPALA-5990, catalog keeps printing "NativeAddPendingTopicItem failed" into the log because of the wrongly implemented error handling. This patch fixes this problem. Change-Id: Ieb5847a698768c704346371288544a54055bdf1d --- M be/src/service/fe-support.cc 1 file changed, 8 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/9228/1 -- To view, visit http://gerrit.cloudera.org:8080/9228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ieb5847a698768c704346371288544a54055bdf1d Gerrit-Change-Number: 9228 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang
[Impala-ASF-CR] IMPALA-5269: Fix issue with final line of query followed by a comment
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/9191 ) Change subject: IMPALA-5269: Fix issue with final line of query followed by a comment .. Patch Set 5: Code-Review+1 > Patch Set 4: Code-Review+1 Carrying Thomas' +1 -- To view, visit http://gerrit.cloudera.org:8080/9191 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I54f9a8f65214023520eaa010fc462a663d02d258 Gerrit-Change-Number: 9191 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 06 Feb 2018 21:41:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Consistent explain format for parquet predicate statistics .. Patch Set 7: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/9223/7/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/9223/7/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@230 PS7, Line 230: max nit: lets keep this as minMax (as before), and consistent with the rest of the naming in this change. -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 06 Feb 2018 21:23:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5269: Fix issue with final line of query followed by a comment
Fredy Wijaya has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/9191 ) Change subject: IMPALA-5269: Fix issue with final line of query followed by a comment .. IMPALA-5269: Fix issue with final line of query followed by a comment The patch is to remove any comments in a statement when checking if a statement ends with a semicolon delimiter. For example: Before (semicolon delimiter is needed at the end): select 1 + 1; -- comment\n; After (semicolon delimiter is no longer needed): select 1 + 1; -- comment Testing: - Ran end-to-end tests in shell Change-Id: I54f9a8f65214023520eaa010fc462a663d02d258 --- M shell/impala_shell.py M tests/shell/test_shell_interactive.py 2 files changed, 29 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/9191/5 -- To view, visit http://gerrit.cloudera.org:8080/9191 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I54f9a8f65214023520eaa010fc462a663d02d258 Gerrit-Change-Number: 9191 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics
Fredy Wijaya has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Consistent explain format for parquet predicate statistics .. IMPALA-6392: Consistent explain format for parquet predicate statistics In EXPLAIN_LEVEL=2+, change the explain format for parquet predicate statistics to output each tuple descriptor per line. This change is to make it consistent with the output of other predicates. Before: parquet statistics predicates: c_custkey < 10, o_orderkey < 5, l_linenumber < 3 After: parquet statistics predicates: c_custkey < 10 parquet statistics predicates on o: o_orderkey < 5 parquet statistics predicates on o_lineitems: l_linenumber < 3 Testing: - Ran existing planner tests and updated the ones that are affected by this change. - Ran end-to-end tests in query_test Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e --- M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test 4 files changed, 49 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/9223/7 -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Consistent explain format for parquet predicate statistics .. Patch Set 7: (3 comments) http://gerrit.cloudera.org:8080/#/c/9223/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9223/5//COMMIT_MSG@23 PS5, Line 23: . > run end-to-end tests as well. see testdata/workloads/functional-query/queri Done http://gerrit.cloudera.org:8080/#/c/9223/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/9223/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@230 PS5, Line 230: maxOrigina > what's the "dictionary" prefix for? seems this is still just for min/max co I was referring to a Map as a dictionary, but I'll change it to minMaxOriginalConjuncts instead. http://gerrit.cloudera.org:8080/#/c/9223/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1100 PS5, Line 1100: MinMaxOrig > make this name consistent with member name if updated. Done -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 7 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 06 Feb 2018 20:34:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 20:19:27 + Gerrit-HasComments: No
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1891/ -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 20:19:33 + Gerrit-HasComments: No
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 3: I did set them - the benchmark complains otherwise. It is a bit strange, I agree. tarmstrong@tarmstrong-box2:~/Impala/incubator-impala$ cat /sys/devices/system/cpu/intel_pstate/no_turbo 1 tarmstrong@tarmstrong-box2:~/Impala/incubator-impala$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor performance -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 3 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 20:17:50 + Gerrit-HasComments: No
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 3 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 20:15:24 + Gerrit-HasComments: No
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 3: > On my system after: > > With AVX2: > > insert:Function iters/ms 10%ile 50%ile > 90%ile 10%ile 50%ile 90%ile > (relative) (relative) (relative) > - > ndv 10k fpp 10.0%2.1e+05 2.11e+05 2.13e+05 > 1X 1X 1X > ndv 10k fpp1.0% 2.16e+05 2.18e+05 2.19e+05 > 1.03X 1.03X 1.03X > ndv 10k fpp0.1% 2.12e+05 2.14e+05 2.16e+05 > 1.01X 1.01X 1.01X > ndv1000k fpp 10.0% 1.98e+05 1.99e+05 2.01e+05 > 0.943X 0.942X 0.945X > ndv1000k fpp1.0% 1.96e+05 1.98e+05 1.99e+05 > 0.935X 0.936X 0.937X > ndv1000k fpp0.1% 1.96e+05 1.97e+05 1.99e+05 > 0.935X 0.934X 0.936X > ndv 10k fpp 10.0% 5.63e+04 5.8e+04 6.18e+04 > 0.269X 0.274X 0.291X > ndv 10k fpp1.0% 5.64e+04 5.84e+04 6.24e+04 > 0.269X 0.276X 0.293X > ndv 10k fpp0.1% 5.56e+04 5.75e+04 5.86e+04 > 0.265X 0.272X 0.275X > > find: Function iters/ms 10%ile 50%ile > 90%ile 10%ile 50%ile 90%ile > (relative) (relative) (relative) > - > present ndv 10k fpp 10.0% 1.97e+05 1.98e+05 > 1.99e+05 1X 1X 1X > absent ndv 10k fpp 10.0% 1.99e+05 2.01e+05 > 2.03e+05 1.01X 1.01X 1.02X > present ndv 10k fpp1.0% 1.97e+05 1.98e+05 > 2e+05 1X 1X 1X > absent ndv 10k fpp1.0% 2e+05 2.01e+05 > 2.03e+05 1.02X 1.02X 1.02X > present ndv 10k fpp0.1% 1.97e+05 1.99e+05 > 2e+05 1X 1X 1X > absent ndv 10k fpp0.1% 2e+05 2.02e+05 > 2.03e+05 1.02X 1.02X 1.02X > present ndv1000k fpp 10.0% 1.75e+05 1.77e+05 > 1.78e+05 0.891X 0.893X 0.893X > absent ndv1000k fpp 10.0% 1.78e+05 1.8e+05 > 1.81e+05 0.907X 0.907X 0.907X > present ndv1000k fpp1.0%1.8e+05 1.82e+05 > 1.83e+05 0.917X 0.917X 0.919X > absent ndv1000k fpp1.0% 1.84e+05 1.86e+05 > 1.88e+05 0.937X 0.939X 0.941X > present ndv1000k fpp0.1% 1.69e+05 1.7e+05 > 1.71e+05 0.857X 0.859X 0.858X > absent ndv1000k fpp0.1%1.7e+05 1.72e+05 > 1.74e+05 0.866X 0.87X 0.871X > present ndv 10k fpp 10.0% 5.34e+04 5.53e+04 > 7.21e+04 0.271X 0.279X 0.362X > absent ndv 10k fpp 10.0% 5.05e+04 5.28e+04 > 5.52e+04 0.257X 0.267X 0.277X > present ndv 10k fpp1.0% 5.43e+04 5.74e+04 > 8.65e+04 0.276X 0.29X 0.434X > absent ndv 10k fpp1.0% 5.09e+04 5.42e+04 > 5.73e+04 0.259X 0.274X 0.288X > present ndv 10k fpp0.1% 5.11e+04 5.24e+04 > 6.69e+04 0.26X 0.265X 0.336X > absent ndv 10k fpp0.1% 4.93e+04 5.02e+04 > 5.1e+04 0.251X 0.254X 0.256X > > union: Function iters/ms 10%ile 50%ile > 90%ile 10%ile 50%ile 90%ile > (relative) (relative) (relative) > - > ndv 10k fpp 10.0% 6.76e+05 6.8e+05 6.88e+05 > 1X 1X 1X > ndv 10k fpp1.0% 6.77e+05 6.81e+05 6.87e+05 > 1X 1X 0.998X > ndv 10k fpp0.1% 6.78e+05 6.82e+05 6.86e+05 > 1X 1X 0.996X > ndv1000k fpp 10.0% 6.78e+05 6.82e+05 6.88e+05 > 1X 1X 1X > ndv1000k fpp1.0% 6.78e+05 6.83e+05 6.89e+05 > 1X 1X 1X > ndv1000k fpp0.1% 6.77e+05 6.8e+05 6.89e+05 > 1X 1X 1X > ndv 10k fpp 10.0% 6.77e+05 6.81e+05 6.88e+05 > 1X 1X 0.999X > ndv 10k fpp1.0% 6.77e+05 6.85e+05 6.89e+05 > 1X 1.01X 1X > ndv 10k fpp0.1% 6.76e+05 6.8e+05 6.88e+05 > 1X 1X 1X Why did union speed up? In any case, +2, but I'd be curious about cat /sys/devices/system/cpu/intel_pstate/no_turbo cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8363 ) Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/8363/5/be/src/util/sharded-query-map-util.h File be/src/util/sharded-query-map-util.h: http://gerrit.cloudera.org:8080/#/c/8363/5/be/src/util/sharded-query-map-util.h@61 PS5, Line 61: map_container > how about 'container_array' to make it clear this must be an array? This comment would become moot once you encode the number of shards. then this parameter is just 'sharded_map' or similar (and contains all shards). -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 5 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 06 Feb 2018 20:14:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8363 ) Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/8363/5/be/src/util/sharded-query-map-util.h File be/src/util/sharded-query-map-util.h: http://gerrit.cloudera.org:8080/#/c/8363/5/be/src/util/sharded-query-map-util.h@36 PS5, Line 36: 4 doesn't that have to be NUM_QUERY_BUCKETS? otherwise, QueryIdToBucket() doesn't work. But, I think it'd be better to have everything encapsulated in these classes, i.e. the number of shards should be either encoded in here. http://gerrit.cloudera.org:8080/#/c/8363/5/be/src/util/sharded-query-map-util.h@61 PS5, Line 61: map_container how about 'container_array' to make it clear this must be an array? http://gerrit.cloudera.org:8080/#/c/8363/5/be/src/util/sharded-query-map-util.h@63 PS5, Line 63: QueryIdToBucket like above, given this is a function of the number of shards, it should be part of this class. -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 5 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 06 Feb 2018 20:12:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Hello Jim Apple, Mostafa Mokhtar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9214 to look at the new patch set (#3). Change subject: Force inlining of BloomFilter::MakeMask .. Force inlining of BloomFilter::MakeMask I noticed that this function was showing up in perf top for TPC-H Q8 running locally. It wasn't inlined into BloomFilter::BucketFindAVX2. Inlining made the query ~5% faster for me locally. Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 --- M be/src/benchmarks/bloom-filter-benchmark.cc M be/src/util/bloom-filter.h 2 files changed, 86 insertions(+), 84 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/9214/3 -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 3 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 2: On my system after: With AVX2: insert:Function iters/ms 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile (relative) (relative) (relative) - ndv 10k fpp 10.0%2.1e+05 2.11e+05 2.13e+05 1X 1X 1X ndv 10k fpp1.0% 2.16e+05 2.18e+05 2.19e+05 1.03X 1.03X 1.03X ndv 10k fpp0.1% 2.12e+05 2.14e+05 2.16e+05 1.01X 1.01X 1.01X ndv1000k fpp 10.0% 1.98e+05 1.99e+05 2.01e+05 0.943X 0.942X 0.945X ndv1000k fpp1.0% 1.96e+05 1.98e+05 1.99e+05 0.935X 0.936X 0.937X ndv1000k fpp0.1% 1.96e+05 1.97e+05 1.99e+05 0.935X 0.934X 0.936X ndv 10k fpp 10.0% 5.63e+04 5.8e+04 6.18e+04 0.269X 0.274X 0.291X ndv 10k fpp1.0% 5.64e+04 5.84e+04 6.24e+04 0.269X 0.276X 0.293X ndv 10k fpp0.1% 5.56e+04 5.75e+04 5.86e+04 0.265X 0.272X 0.275X find: Function iters/ms 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile (relative) (relative) (relative) - present ndv 10k fpp 10.0% 1.97e+05 1.98e+05 1.99e+05 1X 1X 1X absent ndv 10k fpp 10.0% 1.99e+05 2.01e+05 2.03e+05 1.01X 1.01X 1.02X present ndv 10k fpp1.0% 1.97e+05 1.98e+052e+05 1X 1X 1X absent ndv 10k fpp1.0% 2e+05 2.01e+05 2.03e+05 1.02X 1.02X 1.02X present ndv 10k fpp0.1% 1.97e+05 1.99e+052e+05 1X 1X 1X absent ndv 10k fpp0.1% 2e+05 2.02e+05 2.03e+05 1.02X 1.02X 1.02X present ndv1000k fpp 10.0% 1.75e+05 1.77e+05 1.78e+05 0.891X 0.893X 0.893X absent ndv1000k fpp 10.0% 1.78e+05 1.8e+05 1.81e+05 0.907X 0.907X 0.907X present ndv1000k fpp1.0%1.8e+05 1.82e+05 1.83e+05 0.917X 0.917X 0.919X absent ndv1000k fpp1.0% 1.84e+05 1.86e+05 1.88e+05 0.937X 0.939X 0.941X present ndv1000k fpp0.1% 1.69e+05 1.7e+05 1.71e+05 0.857X 0.859X 0.858X absent ndv1000k fpp0.1%1.7e+05 1.72e+05 1.74e+05 0.866X 0.87X 0.871X present ndv 10k fpp 10.0% 5.34e+04 5.53e+04 7.21e+04 0.271X 0.279X 0.362X absent ndv 10k fpp 10.0% 5.05e+04 5.28e+04 5.52e+04 0.257X 0.267X 0.277X present ndv 10k fpp1.0% 5.43e+04 5.74e+04 8.65e+04 0.276X 0.29X 0.434X absent ndv 10k fpp1.0% 5.09e+04 5.42e+04 5.73e+04 0.259X 0.274X 0.288X present ndv 10k fpp0.1% 5.11e+04 5.24e+04 6.69e+04 0.26X 0.265X 0.336X absent ndv 10k fpp0.1% 4.93e+04 5.02e+04 5.1e+04 0.251X 0.254X 0.256X union: Function iters/ms 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile (relative) (relative) (relative) - ndv 10k fpp 10.0% 6.76e+05 6.8e+05 6.88e+05 1X 1X 1X ndv 10k fpp1.0% 6.77e+05 6.81e+05 6.87e+05 1X 1X 0.998X ndv 10k fpp0.1% 6.78e+05 6.82e+05 6.86e+05 1X 1X 0.996X ndv1000k fpp 10.0% 6.78e+05 6.82e+05 6.88e+05 1X 1X 1X ndv1000k fpp1.0% 6.78e+05 6.83e+05 6.89e+05 1X 1X 1X ndv1000k fpp0.1% 6.77e+05 6.8e+05 6.89e+05 1X 1X 1X ndv 10k fpp 10.0% 6.77e+05 6.81e+05 6.88e+05 1X 1X 0.999X ndv 10k fpp1.0% 6.77e+05 6.85e+05 6.89e+05 1X 1.01X 1X ndv
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 2: The speedup was pretty significant on the benchmark. On my system before: With AVX2: insert:Function iters/ms 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile (relative) (relative) (relative) - ndv 10k fpp 10.0% 1.06e+05 1.06e+05 1.07e+05 1X 1X 1X ndv 10k fpp1.0% 1.05e+05 1.06e+05 1.07e+05 0.998X 0.999X 0.998X ndv 10k fpp0.1% 1.06e+05 1.06e+05 1.07e+05 0.999X 1X 1X ndv1000k fpp 10.0% 1.05e+05 1.06e+05 1.07e+05 0.993X 0.992X 0.994X ndv1000k fpp1.0% 1.04e+05 1.05e+05 1.06e+05 0.986X 0.988X 0.986X ndv1000k fpp0.1% 1.04e+05 1.05e+05 1.06e+05 0.986X 0.987X 0.986X ndv 10k fpp 10.0% 3.84e+04 4.04e+04 4.4e+04 0.364X 0.38X 0.41X ndv 10k fpp1.0%3.9e+04 4.11e+04 4.46e+04 0.369X 0.386X 0.416X ndv 10k fpp0.1% 3.77e+04 3.85e+04 4.58e+04 0.357X 0.362X 0.426X find: Function iters/ms 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile (relative) (relative) (relative) - present ndv 10k fpp 10.0% 1.12e+05 1.12e+05 1.13e+05 1X 1X 1X absent ndv 10k fpp 10.0% 1.11e+05 1.12e+05 1.13e+05 0.998X 0.998X 1X present ndv 10k fpp1.0% 1.12e+05 1.12e+05 1.13e+05 1X 0.998X 0.999X absent ndv 10k fpp1.0% 1.12e+05 1.12e+05 1.13e+05 0.999X 1X 0.998X present ndv 10k fpp0.1% 1.12e+05 1.13e+05 1.13e+05 1X 1X 1X absent ndv 10k fpp0.1% 1.12e+05 1.12e+05 1.13e+05 0.999X 0.999X 0.998X present ndv1000k fpp 10.0% 1.08e+05 1.09e+05 1.1e+05 0.967X 0.969X 0.967X absent ndv1000k fpp 10.0% 1.09e+05 1.09e+05 1.1e+05 0.973X 0.97X 0.973X present ndv1000k fpp1.0% 1.07e+05 1.08e+05 1.09e+05 0.96X 0.96X 0.961X absent ndv1000k fpp1.0% 1.08e+05 1.09e+05 1.1e+05 0.971X 0.971X 0.969X present ndv1000k fpp0.1% 1.05e+05 1.05e+05 1.06e+05 0.937X 0.938X 0.937X absent ndv1000k fpp0.1% 1.06e+05 1.07e+05 1.08e+05 0.951X 0.949X 0.95X present ndv 10k fpp 10.0% 3.97e+04 4.08e+04 4.63e+04 0.356X 0.363X 0.408X absent ndv 10k fpp 10.0% 3.94e+04 4.08e+04 4.23e+04 0.353X 0.363X 0.374X present ndv 10k fpp1.0% 4.19e+04 5.11e+04 7.02e+04 0.375X 0.454X 0.62X absent ndv 10k fpp1.0%3.9e+04 4.09e+04 4.28e+04 0.35X 0.364X 0.377X present ndv 10k fpp0.1% 3.94e+04 4.26e+04 6.5e+04 0.353X 0.379X 0.573X absent ndv 10k fpp0.1% 3.67e+04 3.71e+04 3.82e+04 0.329X 0.33X 0.337X union: Function iters/ms 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile (relative) (relative) (relative) - ndv 10k fpp 10.0% 5.46e+05 5.51e+05 5.56e+05 1X 1X 1X ndv 10k fpp1.0% 5.46e+05 5.49e+05 5.54e+05 1X 0.996X 0.996X ndv 10k fpp0.1% 5.46e+05 5.51e+05 5.56e+05 1X 1X 1X ndv1000k fpp 10.0% 5.47e+05 5.51e+05 5.55e+05 1X 1X 0.998X ndv1000k fpp1.0% 5.48e+05 5.54e+05 5.65e+05 1X 1.01X 1.02X ndv1000k fpp0.1% 5.48e+05 5.51e+05 5.55e+05 1X 1X 0.998X ndv 10k fpp 10.0% 5.48e+05 5.54e+05 5.65e+05 1X 1.01X 1.02X ndv 10k fpp1.0% 5.43e+05 5.51e+05 5.57e+05
[Impala-ASF-CR] Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9226 ) Change subject: Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption" .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1890/ -- To view, visit http://gerrit.cloudera.org:8080/9226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb Gerrit-Change-Number: 9226 Gerrit-PatchSet: 1 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 06 Feb 2018 20:02:30 + Gerrit-HasComments: No
[Impala-ASF-CR] Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/9226 ) Change subject: Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption" .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb Gerrit-Change-Number: 9226 Gerrit-PatchSet: 1 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 06 Feb 2018 20:01:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's
Michael Ho has removed Lars Volker from this change. ( http://gerrit.cloudera.org:8080/9202 ) Change subject: IMPALA-6396: Exchange node's memory usage should include its receiver's .. Removed reviewer Lars Volker. -- To view, visit http://gerrit.cloudera.org:8080/9202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45 Gerrit-Change-Number: 9202 Gerrit-PatchSet: 2 Gerrit-Owner: Michael HoGerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Hello Philip Zeyliger, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8363 to look at the new patch set (#5). Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ The following 2 locks have shown to be frequent points of contention on recent perf runs: - qs_map_lock_ - client_request_state_map_lock_ Since these are process wide locks, any threads waiting on these locks potentially slow down the runtime of a query. I tried to address this previously by converting the client_request_state_map_lock_ to a reader-writer lock. This showed great perf improvements in the general case, however, there were edge cases with big regressions as well. In the general case, strict readers of the map got through so quickly that we were able to see a reduction in the number of client connections created, since this lock was contended for in the context of Thrift threads too. The bad case is when writers were starved trying to register a new query since there were so many readers. Changing the starve option resulted in worse read performance all round. Another approach which is relatively simpler is to shard the locks, which proves to be very effective with no regressions. The maps and locks are sharded to a default of 4 buckets initally. Query IDs are created by using boost::uuids::random_generator. We use the high bits of a query ID to assign queries to buckets. I verified that the distribution of the high bits of a query ID are even across buckets on my local machine: For 10,000 queries sharded across 4 buckets, the distribution was: bucket[0]: 2500 bucket[1]: 2489 bucket[2]: 2566 bucket[3]: 2445 A micro-benchmark is added to measure the improvement in performance. This benchmark creates multiple threads each of which creates a QueryState and accesses it multiple times. We can see improvements in the range 2x - 3.5x. BEFORE: --Benchmark 1: Create and access Query States. Total Time (#Queries: 5 #Accesses: 100) : 1ms Total Time (#Queries: 50 #Accesses: 100) : 8ms Total Time (#Queries: 50 #Accesses: 1000) : 54ms Total Time (#Queries: 500 #Accesses: 100) : 76ms Total Time (#Queries: 500 #Accesses: 1000) : 543ms AFTER: --Benchmark 1: Create and access Query States. Total Time (#Queries: 5 #Accesses: 100) : 2173.59K clock cycles Total Time (#Queries: 50 #Accesses: 100) : 4ms Total Time (#Queries: 50 #Accesses: 1000) : 15ms Total Time (#Queries: 500 #Accesses: 100) : 46ms Total Time (#Queries: 500 #Accesses: 1000) : 151ms This change introduces a ShardedQueryMapTemplate, which is used to replace the QueryExecMgr::qs_map_ and the ImpalaServer::client_request_state_map_, and their corresponding locks. NOTE: This microbenchmark has shown that SpinLock has better performance than boost::mutex for the qs_map_lock_'s, so that change has been made too. TODO: Add benchmark for client_request_state_map_lock_ too. The APIs around that are more complicated, so this patch only includes the benchmarking of qs_map_lock_. TODO 2: Consider adopting the ShardedQueryMapTemplate for the SessionStateMap. Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/process-wide-locks-benchmark.cc M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-exec-mgr.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h A be/src/util/sharded-query-map-util.h M be/src/util/uid-util.h 9 files changed, 353 insertions(+), 53 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/8363/5 -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 5 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9226 ) Change subject: Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption" .. Patch Set 1: I believe this is breaking all builds because the BE fails to compile. See: https://issues.apache.org/jira/browse/IMPALA-6485 -- To view, visit http://gerrit.cloudera.org:8080/9226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb Gerrit-Change-Number: 9226 Gerrit-PatchSet: 1 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Comment-Date: Tue, 06 Feb 2018 20:00:15 + Gerrit-HasComments: No
[Impala-ASF-CR] Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"
Alex Behm has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9226 Change subject: Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption" .. Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption" This reverts commit 9b68645f9eb9e08899fda860e0946cc05f205479. Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb --- M be/src/runtime/tmp-file-mgr.cc M be/src/util/cpu-info.cc M be/src/util/cpu-info.h M be/src/util/openssl-util-test.cc M be/src/util/openssl-util.cc M be/src/util/openssl-util.h 6 files changed, 73 insertions(+), 227 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/9226/1 -- To view, visit http://gerrit.cloudera.org:8080/9226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb Gerrit-Change-Number: 9226 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Behm
[Impala-ASF-CR] IMPALA-4795: Allow fetching function obj from catalog using signature
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9225 ) Change subject: IMPALA-4795: Allow fetching function obj from catalog using signature .. Patch Set 1: Code-Review+2 It looks like the old patch applied cleanly right? I did another pass over the patch to sanity-check it and confirm that the approach was the same as the previous one. -- To view, visit http://gerrit.cloudera.org:8080/9225 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3c19edd56e645b8bee918a111c76fe3f260142fe Gerrit-Change-Number: 9225 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 19:50:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Consistent explain format for parquet predicate statistics .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/9223/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9223/5//COMMIT_MSG@23 PS5, Line 23: . run end-to-end tests as well. see testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test. this is used by tests/query_test/test_mt_dop.py http://gerrit.cloudera.org:8080/#/c/9223/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/9223/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@230 PS5, Line 230: dictionary what's the "dictionary" prefix for? seems this is still just for min/max conjuncts (minMaxConjuncts_) and not for dictionaryFilterConjuncts_. http://gerrit.cloudera.org:8080/#/c/9223/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1100 PS5, Line 1100: Dictionary make this name consistent with member name if updated. -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 06 Feb 2018 19:27:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6204: Remove external DataSource
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/9192 ) Change subject: IMPALA-6204: Remove external DataSource .. Patch Set 5: > > Was there ever an announcement that this was deprecated? > > Is https://issues.apache.org/jira/browse/IMPALA-6204 sufficient? TBH, I'm not thrilled about that. For instance, there is no justification on that ticket about why, and no discussion of pros and cons. While this is in review, Maybe Philip could start a discussion on user@ explaining why and asking for feedback. Maybe it will come back unanimous after a short amount of time, but I think it's worth a discussion, given how badly this could break workloads of existing users. (Obviously they could ETL into Parquet or stay on the 2.x branch.) -- To view, visit http://gerrit.cloudera.org:8080/9192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6 Gerrit-Change-Number: 9192 Gerrit-PatchSet: 5 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 06 Feb 2018 19:19:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5037: Default PARQUET ARRAY RESOLUTION=THREE LEVEL
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9210 ) Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1889/ -- To view, visit http://gerrit.cloudera.org:8080/9210 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1 Gerrit-Change-Number: 9210 Gerrit-PatchSet: 3 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 19:18:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5037: Default PARQUET ARRAY RESOLUTION=THREE LEVEL
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9210 ) Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9210 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1 Gerrit-Change-Number: 9210 Gerrit-PatchSet: 3 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 19:17:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5037: Default PARQUET ARRAY RESOLUTION=THREE LEVEL
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9210 ) Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9210/2/tests/query_test/test_nested_types.py File tests/query_test/test_nested_types.py: http://gerrit.cloudera.org:8080/#/c/9210/2/tests/query_test/test_nested_types.py@355 PS2, Line 355: arr_res = vector.get_value('parquet_array_resolution') > Seems ok, but we could probably factor these three lines out into a functio Done -- To view, visit http://gerrit.cloudera.org:8080/9210 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1 Gerrit-Change-Number: 9210 Gerrit-PatchSet: 2 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 19:16:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4795: Allow fetching function obj from catalog using signature
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9225 Change subject: IMPALA-4795: Allow fetching function obj from catalog using signature .. IMPALA-4795: Allow fetching function obj from catalog using signature Fixed a bug where the catalog throws a NullPointerException if trying to fetch a function object using function signature. This exception prevented the code paths in CatalogOpExecutor::HandleDropFunction and ImpalaServer::CatalogUpdateCallback to be exercised which prevented removal of a recreated function object necessary for maintaining metadata consistency. This patch was initially reverted because it increased the probability of tests in test_udfs failing. This was due to a race in LibCache that has now been fixed in a separate patch. Change-Id: I3c19edd56e645b8bee918a111c76fe3f260142fe --- M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/Function.java M tests/common/impala_service.py M tests/query_test/test_udfs.py M tests/webserver/test_web_pages.py 5 files changed, 48 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/9225/1 -- To view, visit http://gerrit.cloudera.org:8080/9225 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I3c19edd56e645b8bee918a111c76fe3f260142fe Gerrit-Change-Number: 9225 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig
[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/9186 ) Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation thread count in KRPC .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/9186/2/be/src/rpc/rpc-mgr-test.cc File be/src/rpc/rpc-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/9186/2/be/src/rpc/rpc-mgr-test.cc@537 PS2, Line 537: > Should this use something like ScopedFlagSetter() in case the ASSERT below Yes, good idea. Done. http://gerrit.cloudera.org:8080/#/c/9186/2/be/src/rpc/rpc-mgr-test.cc@541 PS2, Line 541: > Is TLS always enabled ? Seem a bit misleading to have the "tls_" prefix her Sorry that was a mistake. I forgot to change the name of the variable. Done. http://gerrit.cloudera.org:8080/#/c/9186/2/be/src/rpc/rpc-mgr-test.cc@551 PS2, Line 551: > Just wondering if it'll be a problem if ASSERT() above gets triggered. Will If ASSERT() is triggered, we fail the BE test anyway right? So, even though the RpcMgr isn't shutdown, the process will kill it. -- To view, visit http://gerrit.cloudera.org:8080/9186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5 Gerrit-Change-Number: 9186 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 06 Feb 2018 19:14:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC
Hello Michael Ho, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9186 to look at the new patch set (#3). Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation thread count in KRPC .. IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation thread count in KRPC With the fix for KUDU-2228, the FLAGS_rpc_negotiation_timeout_ms was retired in KRPC. This patch introduces a flag to be able to configure that from the Impala side (FLAGS_rpc_negotiation_timeout_ms). It also introduces a flag to configure the negotiation thread count (FLAGS_rpc_negotiation_thread_count). Added a test to verify that setting FLAGS_rpc_negotiation_timeout_ms to 0 causes negotiation failures. We unfortunately can't write a test to check the same for FLAGS_rpc_negotiation_thread_count due to DCHECKS present in the code. Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5 --- M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.cc 2 files changed, 28 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/9186/3 -- To view, visit http://gerrit.cloudera.org:8080/9186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5 Gerrit-Change-Number: 9186 Gerrit-PatchSet: 3 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Consistent explain format for parquet predicate statistics .. Patch Set 5: (9 comments) http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG@7 PS2, Line 7: IMPALA-6392: Consistent explain format for parquet predicate statistics : > shorten this to fit on one line Done. http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG@10 PS2, Line 10: to output each > mention that this only affects EXPLAIN_LEVEL=2+ Done. http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG@13 PS2, Line 13: Before: > Could you include a couple lines showing what the new format looks like? Done. http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@230 PS2, Line 230: M > extra whitespace Done. http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@232 PS2, Line 232: > extra whitespace Done. http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1099 PS2, Line 1099: // Helper method that prints dictionary min max original conjucts by tuple descriptor. > Brief comment here, like the one for getDictionaryConjunctsExplainString() Done. http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1107 PS2, Line 1107: ut.app > This can fit on the previous line (we wrap at 90) Done. http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1109 PS2, Line 1109: else { > Since this is only used once, no need to make a variable, just include it i Done. http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@ PS2, Line : prefix > fits on previous line The previous line will be 91-width long if we move prefix to the previous line. -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 06 Feb 2018 18:59:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics
Fredy Wijaya has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Consistent explain format for parquet predicate statistics .. IMPALA-6392: Consistent explain format for parquet predicate statistics In EXPLAIN_LEVEL=2+, change the explain format for parquet predicate statistics to output each tuple descriptor per line. This change is to make it consistent with the output of other predicates. Before: parquet statistics predicates: c_custkey < 10, o_orderkey < 5, l_linenumber < 3 After: parquet statistics predicates: c_custkey < 10 parquet statistics predicates on o: o_orderkey < 5 parquet statistics predicates on o_lineitems: l_linenumber < 3 Testing: - Ran existing planner tests and updated the ones that are affected by this change. Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e --- M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test 4 files changed, 49 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/9223/5 -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 5 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-6437: separate AC/scheduler from catalog topic updates
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9123 ) Change subject: IMPALA-6437: separate AC/scheduler from catalog topic updates .. Patch Set 14: Code-Review+1 This looks okay to me. I focused mostly on headers and only took a quick looks at implementation. Up to you whether you want to have another reviewer in addition to Tianyi take a deeper look at the code. -- To view, visit http://gerrit.cloudera.org:8080/9123 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210 Gerrit-Change-Number: 9123 Gerrit-PatchSet: 14 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 18:54:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6204: Remove external DataSource
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9192 ) Change subject: IMPALA-6204: Remove external DataSource .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/9192/5/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/9192/5/common/thrift/CatalogObjects.thrift@401 PS5, Line 401: // 12: optional TDataSourceTable data_source_table > what's the reasoning for not renumbering for these internal datastructures? I'd also prefer removing everything. I see no point in keeping stuff like this around. -- To view, visit http://gerrit.cloudera.org:8080/9192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6 Gerrit-Change-Number: 9192 Gerrit-PatchSet: 5 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 06 Feb 2018 18:53:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6392: Explain format for parquet predicate statistics should be consistent with predicates
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Explain format for parquet predicate statistics should be consistent with predicates .. Patch Set 2: My username is fredyw -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 06 Feb 2018 18:44:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6392: Explain format for parquet predicate statistics should be consistent with predicates
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/9223 ) Change subject: IMPALA-6392: Explain format for parquet predicate statistics should be consistent with predicates .. Patch Set 2: (9 comments) Thanks for working on this! Do you have a username on JIRA so that I can assign this (and the other ones you're working on) to you? http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG@7 PS2, Line 7: IMPALA-6392: Explain format for parquet predicate statistics should be : consistent with predicates shorten this to fit on one line http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG@10 PS2, Line 10: explain format mention that this only affects EXPLAIN_LEVEL=2+ http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG@13 PS2, Line 13: Could you include a couple lines showing what the new format looks like? Also, we like to include a "Testing" section. You can look through other reviews on gerrit to see what this normally looks like, but it would be something like: Testing: - Ran existing planner tests and updated the ones that are affected by this change. http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@230 PS2, Line 230: extra whitespace http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@232 PS2, Line 232: extra whitespace http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1099 PS2, Line 1099: private String getDictionaryMinMaxOriginalConjunctsExplainString(String prefix) { Brief comment here, like the one for getDictionaryConjunctsExplainString() below http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1107 PS2, Line 1107: prefix This can fit on the previous line (we wrap at 90) You might consider looking into clang-format-diff to find these little issues. We include a .clang-format file http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1109 PS2, Line 1109: String alias Since this is only used once, no need to make a variable, just include it inline in the append() http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@ PS2, Line : prefix fits on previous line -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 06 Feb 2018 18:26:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6424: Avoid loading metadata twice when refreshing unloaded tables
Dimitris Tsirogiannis has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9224 Change subject: IMPALA-6424: Avoid loading metadata twice when refreshing unloaded tables .. IMPALA-6424: Avoid loading metadata twice when refreshing unloaded tables This commit fixes an issue where table metadata are loaded twice if a REFRESH statement is executed on an unloaded table. With this fix, we initially check the state of the table (loaded, unloaded) and we only refresh metadata if the table is in a loaded state. If table is unloaded, we load the metadata from scratch. Testing: Run exhaustive tests Change-Id: Ie41a734493dcea0e36d6b051966f1d0302907dee --- M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java 1 file changed, 23 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/9224/1 -- To view, visit http://gerrit.cloudera.org:8080/9224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie41a734493dcea0e36d6b051966f1d0302907dee Gerrit-Change-Number: 9224 Gerrit-PatchSet: 1 Gerrit-Owner: Dimitris Tsirogiannis
[Impala-ASF-CR] IMPALA-6204: Remove external DataSource
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9192 ) Change subject: IMPALA-6204: Remove external DataSource .. Patch Set 5: (2 comments) Maybe also manually test that the various SQL related to datasource gives a sensible parsing error with this change? http://gerrit.cloudera.org:8080/#/c/9192/5/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/9192/5/be/src/service/client-request-state.cc@297 PS5, Line 297: return Status("Data Source not supported."); why can we even get here? doesn't the frontend reject this statement? http://gerrit.cloudera.org:8080/#/c/9192/5/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/9192/5/common/thrift/CatalogObjects.thrift@401 PS5, Line 401: // 12: optional TDataSourceTable data_source_table what's the reasoning for not renumbering for these internal datastructures? I guess I don't feel too strongly, but seems like we can accumulate cruft. (Once we implement rolling upgrade, then we'll probably need to be stricter.) -- To view, visit http://gerrit.cloudera.org:8080/9192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6 Gerrit-Change-Number: 9192 Gerrit-PatchSet: 5 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 06 Feb 2018 17:45:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 2: Code-Review+1 Just for sanity checking, did you run bloom-filter-benchmark? -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 17:13:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6204: Remove external DataSource
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/9192 ) Change subject: IMPALA-6204: Remove external DataSource .. Patch Set 5: Was there ever an announcement that this was deprecated? -- To view, visit http://gerrit.cloudera.org:8080/9192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6 Gerrit-Change-Number: 9192 Gerrit-PatchSet: 5 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 06 Feb 2018 17:07:46 + Gerrit-HasComments: No
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9214/2/be/src/util/bloom-filter.h File be/src/util/bloom-filter.h: http://gerrit.cloudera.org:8080/#/c/9214/2/be/src/util/bloom-filter.h@162 PS2, Line 162: __attribute__((__target__("avx2"))); > Yes, I see it too in BucketFindAvx2 The __target__ is needed for GCC to emit AVX2 instructions at all, the multi-versioning thing is really an overloading of the __target__ attribute -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 16:52:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9214/2/be/src/util/bloom-filter.h File be/src/util/bloom-filter.h: http://gerrit.cloudera.org:8080/#/c/9214/2/be/src/util/bloom-filter.h@162 PS2, Line 162: __attribute__((__target__("avx2"))); Yes, I see it too in BucketFindAvx2 0.67 ? mov%rdi,%r12 ? 1.07 ? mov%esi,%ebx ? 0.56 ? mov%edx,%edi ? 1.32 ? shl$0x5,%rbx ? 0.72 ? sub$0x18,%rsp ? 1.73 ?? callq impala::BloomFilter::MakeMask(unsigned int) Wondering if the multi-versioning attribute is still needed since MakeMask only called from BucketFindAVX2 and BucketInsertAVX2 which also have (__target__("avx2")). -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Mostafa Mokhtar Gerrit-Comment-Date: Tue, 06 Feb 2018 16:50:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6392: Explain format for parquet predicate statistics should be consistent with predicates
Fredy Wijaya has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9223 Change subject: IMPALA-6392: Explain format for parquet predicate statistics should be consistent with predicates .. IMPALA-6392: Explain format for parquet predicate statistics should be consistent with predicates Change the explain format for parquet predicate statistics to output each tuple descriptor per line. This change is to make it consistent with the output of other predicates. Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e --- M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test 4 files changed, 50 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/9223/2 -- To view, visit http://gerrit.cloudera.org:8080/9223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e Gerrit-Change-Number: 9223 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy Wijaya
[Impala-ASF-CR] IMPALA-5440 Add planner tests with extreme statistics values
xyutin...@cloudera.com has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440 Add planner tests with extreme statistics values .. IMPALA-5440 Add planner tests with extreme statistics values This commit address some of the issues in JIRA: tests against the cardinality overflowing from JOIN, UNION, CROSS JOIN, FULL OUTER JOIN, 0 row number and negative row number, as well as cardinality on Subplan node. Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb --- M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 2 files changed, 150 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/9065/9 -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 9 Gerrit-Owner: xyutin...@cloudera.com Gerrit-Reviewer: Alex BehmGerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: xyutin...@cloudera.com
[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/8966 ) Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool .. Patch Set 12: (4 comments) some early comments: overall looks pretty solid. will do another pass http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-parquet-scanner.cc@1707 PS12, Line 1707: 3 would it make sense to replace this with a named constant? something that appropriately ties together the use of 3 buffers and optimum reservation size. http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-parquet-scanner.cc@1724 PS12, Line 1724: bytes_to_add = 0; would it be useful to squeeze max amount of buffers if we do this here: if(reservation_to_distribute >= min_buffer_size) bytes_to_add = BitUtil::RoundDownToPowerOfTwo(reservation_to_distribute) http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-scan-node-base.cc@347 PS12, Line 347: // We got the minimum reservation. Now try to get ideal reservation. : if (resource_profile_.min_reservation != ideal_scan_range_reservation_) { : ignore_result(buffer_pool_client_.IncreaseReservation( : ideal_scan_range_reservation_ - resource_profile_.min_reservation)); : } should we briefly document how ideal_scan_range_reservation_ is used as compared to min_scan_range_reservation_ in HdfsScanNode.java's class level comments? http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/scanner-context.cc File be/src/exec/scanner-context.cc: http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/scanner-context.cc@161 PS12, Line 161: parent_->bp_client_, is this the same clientHandle object used across different threads? -- To view, visit http://gerrit.cloudera.org:8080/8966 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786 Gerrit-Change-Number: 8966 Gerrit-PatchSet: 12 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 06 Feb 2018 08:40:51 + Gerrit-HasComments: Yes