[Impala-ASF-CR] IMPALA-6204: Remove external DataSource

2018-02-06 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9192 )

Change subject: IMPALA-6204: Remove external DataSource
..


Patch Set 5:

(2 comments)

Addressed the comments. Removed the remaining enums and renumbered the Thrifts.

There are some conflicts with Tianyi's metadata compression patch, so I'll post 
another patch when I work through the rebase.

http://gerrit.cloudera.org:8080/#/c/9192/5/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/9192/5/be/src/service/client-request-state.cc@297
PS5, Line 297:   return Status("Data Source not supported.");
> why can we even get here? doesn't the frontend reject this statement?
Done


http://gerrit.cloudera.org:8080/#/c/9192/5/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/9192/5/common/thrift/CatalogObjects.thrift@401
PS5, Line 401:   //  12: optional TDataSourceTable data_source_table
> I'd also prefer removing everything. I see no point in keeping stuff like t
In a previous life, this was a big no-no. The reasons are for maintaining 
compatibility between different versions of servers and servers and servers and 
clients (such as what would happen in rolling upgrade). We don't really have 
that problem.

A second reason is that if I recorded the network, say, and wanted to make 
sense of things, fields coming and going would drive me mad. In practice, we 
don't do that either.

Anyway--if our practice is to renumber, I've gone ahead and done that.



--
To view, visit http://gerrit.cloudera.org:8080/9192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6
Gerrit-Change-Number: 9192
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 07 Feb 2018 04:31:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6204: Remove external DataSource

2018-02-06 Thread Philip Zeyliger (Code Review)
Hello Jim Apple, Dimitris Tsirogiannis, Alex Behm, Zach Amsden, Dan Hecht, 

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9192

to look at the new patch set (#6).

Change subject: IMPALA-6204: Remove external DataSource
..

IMPALA-6204: Remove external DataSource

Removes DataSourceScanNode, external data sources, and all affiliated
code, tests, and documentation.

When a data source table is encountered, we now throw an exception. To
the user, this looks like:

  [pannier.ca.cloudera.com:21000] > create table t (x int) stored as textfile 
tblproperties('__IMPALA_DATA_SOURCE_NAME'='V1');
  Query: create table t (x int) stored as textfile 
tblproperties('__IMPALA_DATA_SOURCE_NAME'='V1')
  Fetched 0 row(s) in 0.11s
  [pannier.ca.cloudera.com:21000] > select * from t;
  Query: select * from t
  Query submitted at: 2018-02-01 17:16:26 (Coordinator: 
http://pannier.ca.cloudera.com:25000)
  ERROR: AnalysisException: Failed to load metadata for table: 't'
  CAUSED BY: TableLoadingException: Failed to load metadata for table: 
default.t. Running 'invalidate metadata default.t' may resolve this problem.
  CAUSED BY: UnsupportedOperationException: Eternal Data source table not 
supported.

A test has been added to capture this behavior.

For the most part, I deleted the unused code. In a few places, a renamed
the Thrift enums and threw errors if they're encountered. For Thrift
structs, I left a comment about the now-skipped id that used to
represent a data-source related entry.

Cherry-picks: not for 2.x

Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6
---
M CMakeLists.txt
M be/generated-sources/gen-cpp/CMakeLists.txt
M be/src/catalog/catalog-util.cc
M be/src/exec/CMakeLists.txt
M be/src/exec/catalog-op-executor.cc
M be/src/exec/catalog-op-executor.h
D be/src/exec/data-source-scan-node.cc
D be/src/exec/data-source-scan-node.h
M be/src/exec/exec-node.cc
D be/src/exec/external-data-source-executor.cc
D be/src/exec/external-data-source-executor.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/scheduling/scheduler.cc
M be/src/service/client-request-state.cc
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M bin/clean-cmake.sh
M bin/clean.sh
M buildall.sh
M common/thrift/CMakeLists.txt
M common/thrift/CatalogObjects.thrift
M common/thrift/CatalogService.thrift
M common/thrift/Data.thrift
M common/thrift/Descriptors.thrift
D common/thrift/ExternalDataSource.thrift
M common/thrift/Frontend.thrift
M common/thrift/JniCatalog.thrift
M common/thrift/PlanNodes.thrift
M docs/impala.ditamap
M docs/impala_keydefs.ditamap
D docs/topics/impala_create_data_source.xml
D docs/topics/impala_data_sources.xml
D docs/topics/impala_drop_data_source.xml
M docs/topics/impala_new_features.xml
M docs/topics/impala_show.xml
D ext-data-source/.gitignore
D ext-data-source/CMakeLists.txt
D ext-data-source/api/pom.xml
D 
ext-data-source/api/src/main/java/org/apache/impala/extdatasource/util/SerializationUtils.java
D 
ext-data-source/api/src/main/java/org/apache/impala/extdatasource/v1/ExternalDataSource.java
D ext-data-source/pom.xml
D ext-data-source/sample/pom.xml
D 
ext-data-source/sample/src/main/java/org/apache/impala/extdatasource/sample/EchoDataSource.java
D ext-data-source/test/pom.xml
D 
ext-data-source/test/src/main/java/org/apache/impala/extdatasource/AllTypesDataSource.java
M fe/CMakeLists.txt
M fe/pom.xml
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
D fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java
D fe/src/main/java/org/apache/impala/analysis/CreateTableDataSrcStmt.java
D fe/src/main/java/org/apache/impala/analysis/DropDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java
D fe/src/main/java/org/apache/impala/analysis/ShowDataSrcsStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
D fe/src/main/java/org/apache/impala/catalog/DataSource.java
D fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
D fe/src/main/java/org/apache/impala/extdatasource/ApiVersion.java
D 
fe/src/main/java/org/apache/impala/extdatasource/ExternalDataSourceExecutor.java
D fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M 

[Impala-ASF-CR] IMPALA-6478: Remove garbage NativeAddPendingTopicItem log from catalog

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9228 )

Change subject: IMPALA-6478: Remove garbage NativeAddPendingTopicItem log from 
catalog
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1896/


--
To view, visit http://gerrit.cloudera.org:8080/9228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieb5847a698768c704346371288544a54055bdf1d
Gerrit-Change-Number: 9228
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Wed, 07 Feb 2018 04:04:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Gerrit-Change-Number: 9214
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 07 Feb 2018 03:49:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..

Force inlining of BloomFilter::MakeMask

I noticed that this function was showing up in perf top for TPC-H Q8
running locally. It wasn't inlined into BloomFilter::BucketFindAVX2.
Inlining made the query ~5% faster for me locally.

Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Reviewed-on: http://gerrit.cloudera.org:8080/9214
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/util/bloom-filter.h
2 files changed, 86 insertions(+), 84 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Gerrit-Change-Number: 9214
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9186 )

Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms 
and negotiation thread count in KRPC
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5
Gerrit-Change-Number: 9186
Gerrit-PatchSet: 5
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 07 Feb 2018 03:24:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9186 )

Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms 
and negotiation thread count in KRPC
..

IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation 
thread count in KRPC

With the fix for KUDU-2228, the FLAGS_rpc_negotiation_timeout_ms was
retired in KRPC.

This patch introduces a flag to be able to configure that from the
Impala side (FLAGS_rpc_negotiation_timeout_ms).

It also introduces a flag to configure the negotiation
thread count (FLAGS_rpc_negotiation_thread_count).

Added a test to verify that setting FLAGS_rpc_negotiation_timeout_ms
to 0 causes negotiation failures. We unfortunately can't write a test
to check the same for FLAGS_rpc_negotiation_thread_count due to
DCHECKS present in the code.

Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5
Reviewed-on: http://gerrit.cloudera.org:8080/9186
Reviewed-by: Sailesh Mukil 
Tested-by: Impala Public Jenkins
---
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr.cc
2 files changed, 28 insertions(+), 0 deletions(-)

Approvals:
  Sailesh Mukil: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5
Gerrit-Change-Number: 9186
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
..

IMPALA-6392: Consistent explain format for parquet predicate statistics

In EXPLAIN_LEVEL=2+, change the explain format for parquet predicate
statistics to output each tuple descriptor per line. This change is to
make it consistent with the output of other predicates.

Before:
parquet statistics predicates: c_custkey < 10, o_orderkey < 5, l_linenumber < 3

After:
parquet statistics predicates: c_custkey < 10
parquet statistics predicates on o: o_orderkey < 5
parquet statistics predicates on o_lineitems: l_linenumber < 3

Testing:
- Ran existing planner tests and updated the ones that are affected by
  this change.
- Ran end-to-end tests in query_test

Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
---
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
4 files changed, 51 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/9223/11
--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 11
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
..


Patch Set 11: Code-Review+1

Carrying Vuk's +1


--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 11
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 07 Feb 2018 02:55:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
..


Patch Set 10: Code-Review+1

Carrying Vuk's +1


--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 10
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 07 Feb 2018 02:44:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
..


Patch Set 9: Code-Review+1

(2 comments)

Carrying Vuk's +1

http://gerrit.cloudera.org:8080/#/c/9223/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/9223/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@523
PS8, Line 523: List exprs = minMaxOriginalConjuncts_.get(tupleDesc);
> Use this pattern for fewer lookups:
Done


http://gerrit.cloudera.org:8080/#/c/9223/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1104
PS8, Line 1104: minMaxOriginalConjuncts_.entrySet()) {
> style: we use these standard abbreviations in many places:
Done



--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 9
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 07 Feb 2018 02:41:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
..

IMPALA-6392: Consistent explain format for parquet predicate statistics

In EXPLAIN_LEVEL=2+, change the explain format for parquet predicate
statistics to output each tuple descriptor per line. This change is to
make it consistent with the output of other predicates.

Before:
parquet statistics predicates: c_custkey < 10, o_orderkey < 5, l_linenumber < 3

After:
parquet statistics predicates: c_custkey < 10
parquet statistics predicates on o: o_orderkey < 5
parquet statistics predicates on o_lineitems: l_linenumber < 3

Testing:
- Ran existing planner tests and updated the ones that are affected by
  this change.
- Ran end-to-end tests in query_test

Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
---
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
4 files changed, 50 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/9223/9
--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 9
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR](2.x) IMPALA-6228: Control stats extrapolation via tbl prop.

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9231 )

Change subject: IMPALA-6228: Control stats extrapolation via tbl prop.
..


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1892/


--
To view, visit http://gerrit.cloudera.org:8080/9231
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie49597bf1b93b7572106abc620d91f199cba0cfd
Gerrit-Change-Number: 9231
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 07 Feb 2018 02:40:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8966 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..


Patch Set 16:

Rebased and fixed some minor merge conflicts.


--
To view, visit http://gerrit.cloudera.org:8080/8966
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 8966
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 07 Feb 2018 01:39:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-06 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8966

to look at the new patch set (#16).

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..

IMPALA-4835: Part 3: switch I/O buffers to buffer pool

This is the final patch to switch the Disk I/O manager to allocate all
buffer from the buffer pool and to reserve the buffers required for
a query upfront.

* The planner reserves enough memory to run a single scanner per
  scan node.
* The multi-threaded scan node must increase reservation before
  spinning up more threads.
* The scanner implementations must be careful to stay within their
  assigned reservation.

The row-oriented scanners were most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.

Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.

I adjusted how the 'mem_limit' is divided between buffer pool and non
buffer pool memory for low mem_limits to account for the increase in
buffer pool memory.

Testing:
* Added more planner tests to cover reservation calcs for scan node.
* Test scanners for all file formats with the reservation denial debug
  action, to test behaviour when the scanners hit reservation limits.
* Updated memory and buffer pool limits for tests.

Perf:
I ran TPC-H and targeted perf locally comparing with master. Both
showed small improvements of a few percent and no regressions of
note. I still need to run cluster perf tests.

Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/bufferpool/reservation-tracker-test.cc
M be/src/runtime/bufferpool/reservation-util.cc
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-stress.h
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
M fe/src/test/java/org/apache/impala/util/BitUtilTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M 
testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test
M 
testdata/workloads/functional-query/queries/QueryTest/disk-spill-encryption.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M 

[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront

2018-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
..


Patch Set 24: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 24
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 07 Feb 2018 01:38:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2018-02-06 Thread Tim Armstrong (Code Review)
Hello Tianyi Wang, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8414

to look at the new patch set (#22).

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..

IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

In preparation for switching the I/O mgr to the buffer pool, this
removes and cleans up a lot of code so that the switchover patch starts
from a cleaner slate.

* Remove the free buffer cache (which will be replaced by buffer pool's
  own caching).
* Make memory limit exceeded error checking synchronous (in anticipation
  of having to propagate buffer pool errors synchronously).
* Simplify error propagation - remove the (ineffectual) code that
  enqueued BufferDescriptors containing error statuses.
* Document locking scheme better in a few places, make it part of the
  function signature when it seemed reasonable.
* Move ReturnBuffer() to ScanRange, because it is intrinsically
  connected with the lifecycle of a scan range.
* Separate external ReturnBuffer() and internal CleanUpBuffer()
  interfaces - previously callers of ReturnBuffer() were fudging
  the num_buffers_in_reader accounting to make the external interface work.
* Eliminate redundant state in ScanRange: 'eosr_returned_' and
  'is_cancelled_'.
* Clarify the logic around calling Close() for the last
  BufferDescriptor.
  -> There appeared to be an implicit assumption that buffers would be
 freed in the order they were returned from the scan range, so that
 the "eos" buffer was returned last. Instead just count the number
 of outstanding buffers to detect the last one.
  -> Touching the is_cancelled_ field without holding a lock was hard to
 reason about - violated locking rules and it was unclear that it
 was race-free.
* Remove DiskIoMgr::Read() to simplify the interface. It is trivial to
  inline at the callsites.

This will probably regress performance somewhat because of the cache
removal, so my plan is to merge it around the same time as switching
the I/O mgr to allocate from the buffer pool. I'm keeping the patches
separate to make reviewing easier.

Testing:
* Ran exhaustive tests
* Ran the disk-io-mgr-stress-test overnight

Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/test-env.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
19 files changed, 577 insertions(+), 909 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/22
--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 22
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8966 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..


Patch Set 13:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-parquet-scanner.cc@1707
PS12, Line 1707: D
> would it make sense to replace this with a named constant? something that a
Yeah, not using magic numbers was like week 2 of undergrad :). There's an 
explanation for the constant in the DiskIoMgr header so I also added a 
definition of the constant there.


http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-parquet-scanner.cc@1724
PS12, Line 1724: // We need to rea
> would it be useful to squeeze max amount of buffers if we do this here:
I think it's better to have fewer, larger, buffers instead of adding a smaller 
buffer into the circulation so that we're consistently doing larger I/Os, 
instead of a mix of small and large I/Os. It may be faster for this query in a 
lot of cases, but it's also globally more efficient - if this query is somewhat 
memory constrained, at least it's less inclined to clog up the I/O queues for 
everyone else.

I added a comment because it's very subtle (I thought about this at the time I 
wrote the code and neglected to write it down).


http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-scan-node-base.cc@347
PS12, Line 347:   // We got the minimum reservation. Now try to get ideal 
reservation.
  :   if (resource_profile_.min_reservation != 
ideal_scan_range_reservation_) {
  : ignore_result(buffer_pool_client_.IncreaseReservation(
  : ideal_scan_range_reservation_ - 
resource_profile_.min_reservation));
  :   }
> should we briefly document how ideal_scan_range_reservation_ is used  as co
Done


http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/scanner-context.cc@161
PS12, Line 161: parent_->bp_client_,
> is this the same clientHandle object used across different threads?
Yeah it is. Added comments to the HdfsScanNode class header and to the 
bp_client_ declaration to clarify this.



-- 
To view, visit http://gerrit.cloudera.org:8080/8966
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 8966
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 07 Feb 2018 01:24:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1895/


--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Gerrit-Change-Number: 9214
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 07 Feb 2018 00:08:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Gerrit-Change-Number: 9214
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 07 Feb 2018 00:08:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics

2018-02-06 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
..


Patch Set 8:

(2 comments)

Nice!

http://gerrit.cloudera.org:8080/#/c/9223/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/9223/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@523
PS8, Line 523: if (!minMaxOriginalConjuncts_.containsKey(tupleDescriptor)) {
Use this pattern for fewer lookups:

List l = map.get(key);
if (l == null) {
  l = new List();
  map.put(key, l);
}
l.add(value);


http://gerrit.cloudera.org:8080/#/c/9223/8/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1104
PS8, Line 1104:   TupleDescriptor tupleDescriptor = entry.getKey();
style: we use these standard abbreviations in many places:

tupleDescriptor -> tupleDesc or td
expressions -> exprs



--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 8
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 07 Feb 2018 00:06:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-06 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8966

to look at the new patch set (#13).

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..

IMPALA-4835: Part 3: switch I/O buffers to buffer pool

This is the final patch to switch the Disk I/O manager to allocate all
buffer from the buffer pool and to reserve the buffers required for
a query upfront.

* The planner reserves enough memory to run a single scanner per
  scan node.
* The multi-threaded scan node must increase reservation before
  spinning up more threads.
* The scanner implementations must be careful to stay within their
  assigned reservation.

The row-oriented scanners were most straightforward, since they only
have a single scan range active at a time. A single I/O buffer is
sufficient to scan the whole file but more I/O buffers can improve I/O
throughput.

Parquet is more complex because it issues a scan range per column and
the sizes of the columns on disk are not known during planning. To
deal with this, the reservation in the frontend is based on a
heuristic involving the file size and # columns. The Parquet scanner
can then divvy up reservation to columns based on the size of column
data on disk.

I adjusted how the 'mem_limit' is divided between buffer pool and non
buffer pool memory for low mem_limits to account for the increase in
buffer pool memory.

Testing:
* Added more planner tests to cover reservation calcs for scan node.
* Test scanners for all file formats with the reservation denial debug
  action, to test behaviour when the scanners hit reservation limits.
* Updated memory and buffer pool limits for tests.

Perf:
I ran TPC-H and targeted perf locally comparing with master. Both
showed small improvements of a few percent and no regressions of
note. I still need to run cluster perf tests.

Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/bufferpool/reservation-tracker-test.cc
M be/src/runtime/bufferpool/reservation-util.cc
M be/src/runtime/io/disk-io-mgr-stress-test.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-stress.h
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/util/BitUtil.java
M fe/src/test/java/org/apache/impala/util/BitUtilTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M 
testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
M testdata/workloads/functional-query/queries/QueryTest/codegen-mem-limit.test
M 
testdata/workloads/functional-query/queries/QueryTest/disk-spill-encryption.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-tpch.test
M 

[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC

2018-02-06 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9186 )

Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms 
and negotiation thread count in KRPC
..


Patch Set 5: Code-Review+2

Rebase, carry +2.


--
To view, visit http://gerrit.cloudera.org:8080/9186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5
Gerrit-Change-Number: 9186
Gerrit-PatchSet: 5
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 06 Feb 2018 23:43:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9226 )

Change subject: Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb
Gerrit-Change-Number: 9226
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 06 Feb 2018 23:40:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9226 )

Change subject: Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"
..

Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"

This reverts commit 9b68645f9eb9e08899fda860e0946cc05f205479.

Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb
Reviewed-on: http://gerrit.cloudera.org:8080/9226
Reviewed-by: Sailesh Mukil 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/tmp-file-mgr.cc
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
M be/src/util/openssl-util-test.cc
M be/src/util/openssl-util.cc
M be/src/util/openssl-util.h
6 files changed, 73 insertions(+), 227 deletions(-)

Approvals:
  Sailesh Mukil: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb
Gerrit-Change-Number: 9226
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6424: Avoid loading metadata twice when refreshing unloaded tables

2018-02-06 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9224 )

Change subject: IMPALA-6424: Avoid loading metadata twice when refreshing 
unloaded tables
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9224/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/9224/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3105
PS1, Line 3105:   tbl = getExistingTable(tblName.getDb(), 
tblName.getTbl());
Change seems fine.

Have you considered changing getExistingTable() to also return whether the 
table was cached or loaded? Seems potentially cleaner, but more invasive.



--
To view, visit http://gerrit.cloudera.org:8080/9224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie41a734493dcea0e36d6b051966f1d0302907dee
Gerrit-Change-Number: 9224
Gerrit-PatchSet: 1
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Comment-Date: Tue, 06 Feb 2018 23:22:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4795: Allow fetching function obj from catalog using signature

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9225 )

Change subject: IMPALA-4795: Allow fetching function obj from catalog using 
signature
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1893/


--
To view, visit http://gerrit.cloudera.org:8080/9225
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3c19edd56e645b8bee918a111c76fe3f260142fe
Gerrit-Change-Number: 9225
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 23:10:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4795: Allow fetching function obj from catalog using signature

2018-02-06 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9225 )

Change subject: IMPALA-4795: Allow fetching function obj from catalog using 
signature
..


Patch Set 1:

> It looks like the old patch applied cleanly right? I did another
 > pass over the patch to sanity-check it and confirm that the
 > approach was the same as the previous one.

yup it applied cleanly, i also ran test_udfs a bunch of times to make sure it 
doesn't fail anymore


--
To view, visit http://gerrit.cloudera.org:8080/9225
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3c19edd56e645b8bee918a111c76fe3f260142fe
Gerrit-Change-Number: 9225
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 23:09:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5519: Allocate fragment's runtime filter memory from Buffer pool

2018-02-06 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8971 )

Change subject: IMPALA-5519: Allocate fragment's runtime filter memory from 
Buffer pool
..


Patch Set 9:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8971/9/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/8971/9/be/src/service/query-options.cc@380
PS9, Line 380:"buffer size that can be 
allocated by the buffer pool",
> nit: indentation
Done


http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@75
PS9, Line 75: max
> Shouldn't this be the "min buffer size"?
Done


http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@108
PS9, Line 108:   private class FilterSizeLimits {
> Add a small class comment
Done


http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@112
PS9, Line 112: BitUtil.roundUpToPowerOf2(
> Why do you need to do it here and not when the final 'max' value is compute
I did that just make the final assignments look cleaner in code, otherwise it 
would look something like this:

 max = 
BitUtil.roundUpToPowerOf2(Math.max((tQueryOptions.getRuntime_filter_max_size(), 
BackendConfig.INSTANCE.getMinBufferSize()));

or I could bring it down to :

max = BitUtil.roundUpToPowerOf2(Math.max(maxLimit, minBufferSize));


http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@116
PS9, Line 116:   // TODO: what happens if minbuffersize is greater than the 
MAX_BLOOM_FILTER_SIZE?
> What do we know about this TODO? Can you check with Tim?
oops forgot to remove that one. already talked to tim about that


http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@123
PS9, Line 123: Preconditions.checkArgument(minLimit >= MIN_BLOOM_FILTER_SIZE);
 :   Preconditions.checkArgument(minLimit <= 
MAX_BLOOM_FILTER_SIZE);
> Hm, I know I asked you to add these but they do seem kind of redundant if w
Done


http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@134
PS9, Line 134:
> nit: extra white space
Done


http://gerrit.cloudera.org:8080/#/c/8971/9/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@135
PS9, Line 135: // Maximum filter size, in bytes, rounded up to a power of two.
 : public final long max;
 :
 : // Minimum filter size, in bytes, rounded up to a power of 
two.
 : public final long min;
 :
 : // Pre-computed default filter size, in bytes, rounded up to 
a power of two.
 : public final long defaultVal;
> nit: move them to the beginning of the class (Java style :)). Also, you cou
I was thinking the same (to rename to maxSize). Lemme go ahead and do that



--
To view, visit http://gerrit.cloudera.org:8080/8971
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea2759665fb2e8bef9433014a8d42a7ebf99ce1f
Gerrit-Change-Number: 8971
Gerrit-PatchSet: 9
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 23:05:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5037: Default PARQUET ARRAY RESOLUTION=THREE LEVEL

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9210 )

Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9210
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 22:58:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5037: Default PARQUET ARRAY RESOLUTION=THREE LEVEL

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9210 )

Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
..

IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL

Changes the default value for the PARQUET_ARRAY_RESOLUTION
query option to conform to the Parquet standard.
Before: TWO_LEVEL_THEN_THREE_LEVEL
After:  THREE_LEVEL

For more information see:
* IMPALA-4725
* https://github.com/apache/parquet-format/blob/master/LogicalTypes.md

Testing:
- expands and cleans up the existing tests for more coverage
  over the different resolution policies
- private core/hdfs run passed

Cherry-picks: not for 2.x.

Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Reviewed-on: http://gerrit.cloudera.org:8080/9210
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins
---
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M tests/query_test/test_nested_types.py
3 files changed, 153 insertions(+), 193 deletions(-)

Approvals:
  Alex Behm: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9210
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR](2.x) IMPALA-6228: Control stats extrapolation via tbl prop.

2018-02-06 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9231 )

Change subject: IMPALA-6228: Control stats extrapolation via tbl prop.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9231/1/testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
File 
testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test:

http://gerrit.cloudera.org:8080/#/c/9231/1/testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test@3
PS1, Line 3: # This test relies on a deterministic row order so we use "sort by 
(id)".
Only conflict was here. It clashed with Lars change to do SORT BY default.



--
To view, visit http://gerrit.cloudera.org:8080/9231
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie49597bf1b93b7572106abc620d91f199cba0cfd
Gerrit-Change-Number: 9231
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Comment-Date: Tue, 06 Feb 2018 22:45:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6075: Add Impala daemon metric for catalog version.

2018-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8949 )

Change subject: IMPALA-6075: Add Impala daemon metric for catalog version.
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8949
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2307eb434561ed74ff058106541c0ebda017d97
Gerrit-Change-Number: 8949
Gerrit-PatchSet: 9
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 22:44:33 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-6228: Control stats extrapolation via tbl prop.

2018-02-06 Thread Alex Behm (Code Review)
Hello Impala Public Jenkins,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9231

to review the following change.


Change subject: IMPALA-6228: Control stats extrapolation via tbl prop.
..

IMPALA-6228: Control stats extrapolation via tbl prop.

Introduces a new TBLPROPERTY for controlling stats
extrapolation on a per-table basis:

impala.enable.stats.extrapolation=true/false

The property key was chosen to be consistent with
the impalad startup flag --enable_stats_extrapolation
and to indicate that the property was set and is used
by Impala.

Behavior:
- If the property is not set, then the extrapolation
  behavior is determined by the impalad startup flag.
- If the property is set, it overrides the impalad
  startup flag, i.e., extrapolation can be explicitly
  enabled or disabled regardless of the startup flag.

Testing:
- added new unit tests
- code/hdfs run passed

Change-Id: Ie49597bf1b93b7572106abc620d91f199cba0cfd
Reviewed-on: http://gerrit.cloudera.org:8080/9139
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/planner/StatsExtrapolationTest.java
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M tests/custom_cluster/test_stats_extrapolation.py
A tests/metadata/test_stats_extrapolation.py
9 files changed, 359 insertions(+), 174 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/9231/1
--
To view, visit http://gerrit.cloudera.org:8080/9231
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie49597bf1b93b7572106abc620d91f199cba0cfd
Gerrit-Change-Number: 9231
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins


[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC

2018-02-06 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9186 )

Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms 
and negotiation thread count in KRPC
..


Patch Set 4: Code-Review+2

(2 comments)

Thanks for the review.

Holding off on GVO until IMPALA-6485 unblocks GVO.

http://gerrit.cloudera.org:8080/#/c/9186/3/be/src/rpc/rpc-mgr-test.cc
File be/src/rpc/rpc-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/9186/3/be/src/rpc/rpc-mgr-test.cc@235
PS3, Line 235: // establishment fails.
> May help to clarify:
Done


http://gerrit.cloudera.org:8080/#/c/9186/3/be/src/rpc/rpc-mgr-test.cc@252
PS3, Line 252:   secondary_rpc_mgr.Shutdown();
> nit: extra blank line.
Done



--
To view, visit http://gerrit.cloudera.org:8080/9186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5
Gerrit-Change-Number: 9186
Gerrit-PatchSet: 4
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 06 Feb 2018 22:31:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC

2018-02-06 Thread Sailesh Mukil (Code Review)
Hello Michael Ho,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9186

to look at the new patch set (#4).

Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms 
and negotiation thread count in KRPC
..

IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation 
thread count in KRPC

With the fix for KUDU-2228, the FLAGS_rpc_negotiation_timeout_ms was
retired in KRPC.

This patch introduces a flag to be able to configure that from the
Impala side (FLAGS_rpc_negotiation_timeout_ms).

It also introduces a flag to configure the negotiation
thread count (FLAGS_rpc_negotiation_thread_count).

Added a test to verify that setting FLAGS_rpc_negotiation_timeout_ms
to 0 causes negotiation failures. We unfortunately can't write a test
to check the same for FLAGS_rpc_negotiation_thread_count due to
DCHECKS present in the code.

Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5
---
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr.cc
2 files changed, 28 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/9186/4
--
To view, visit http://gerrit.cloudera.org:8080/9186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5
Gerrit-Change-Number: 9186
Gerrit-PatchSet: 4
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5440 Add planner tests with extreme statistics values

2018-02-06 Thread Anonymous Coward (Code Review)
xyutin...@cloudera.com has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9065 )

Change subject: IMPALA-5440 Add planner tests with extreme statistics values
..


Patch Set 10:

Patch 10 stressed on the comments on Patch 9 and does not include the test for 
ExchangeNode.java, HdfsScanNode.java, HdfsTableSink.java and PlanFragment.java


--
To view, visit http://gerrit.cloudera.org:8080/9065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb
Gerrit-Change-Number: 9065
Gerrit-PatchSet: 10
Gerrit-Owner: xyutin...@cloudera.com
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: xyutin...@cloudera.com
Gerrit-Comment-Date: Tue, 06 Feb 2018 22:25:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5440 Add planner tests with extreme statistics values

2018-02-06 Thread Anonymous Coward (Code Review)
xyutin...@cloudera.com has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/9065 )

Change subject: IMPALA-5440 Add planner tests with extreme statistics values
..

IMPALA-5440 Add planner tests with extreme statistics values

This commit address some of the issues in JIRA: tests against the cardinality
overflowing from JOIN, UNION, CROSS JOIN, FULL OUTER JOIN,
0 row number and negative row number, as well as cardinality on Subplan node.

Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb
---
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
2 files changed, 156 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/9065/10
--
To view, visit http://gerrit.cloudera.org:8080/9065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb
Gerrit-Change-Number: 9065
Gerrit-PatchSet: 10
Gerrit-Owner: xyutin...@cloudera.com
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: xyutin...@cloudera.com


[Impala-ASF-CR] IMPALA-6478: Remove garbage NativeAddPendingTopicItem log from catalog

2018-02-06 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9228


Change subject: IMPALA-6478: Remove garbage NativeAddPendingTopicItem log from 
catalog
..

IMPALA-6478: Remove garbage NativeAddPendingTopicItem log from catalog

After IMPALA-5990, catalog keeps printing "NativeAddPendingTopicItem
failed" into the log because of the wrongly implemented error handling.
This patch fixes this problem.

Change-Id: Ieb5847a698768c704346371288544a54055bdf1d
---
M be/src/service/fe-support.cc
1 file changed, 8 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/9228/1
--
To view, visit http://gerrit.cloudera.org:8080/9228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ieb5847a698768c704346371288544a54055bdf1d
Gerrit-Change-Number: 9228
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 


[Impala-ASF-CR] IMPALA-5269: Fix issue with final line of query followed by a comment

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9191 )

Change subject: IMPALA-5269: Fix issue with final line of query followed by a 
comment
..


Patch Set 5: Code-Review+1

> Patch Set 4: Code-Review+1

Carrying Thomas' +1


--
To view, visit http://gerrit.cloudera.org:8080/9191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I54f9a8f65214023520eaa010fc462a663d02d258
Gerrit-Change-Number: 9191
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 06 Feb 2018 21:41:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics

2018-02-06 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
..


Patch Set 7: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9223/7/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/9223/7/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@230
PS7, Line 230: max
nit: lets keep this as minMax (as before), and consistent with the rest of the 
naming in this change.



--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 06 Feb 2018 21:23:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5269: Fix issue with final line of query followed by a comment

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/9191 )

Change subject: IMPALA-5269: Fix issue with final line of query followed by a 
comment
..

IMPALA-5269: Fix issue with final line of query followed by a comment

The patch is to remove any comments in a statement when checking if a
statement ends with a semicolon delimiter.

For example:

Before (semicolon delimiter is needed at the end):
select 1 + 1; -- comment\n;

After (semicolon delimiter is no longer needed):
select 1 + 1; -- comment

Testing:
- Ran end-to-end tests in shell

Change-Id: I54f9a8f65214023520eaa010fc462a663d02d258
---
M shell/impala_shell.py
M tests/shell/test_shell_interactive.py
2 files changed, 29 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/9191/5
--
To view, visit http://gerrit.cloudera.org:8080/9191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I54f9a8f65214023520eaa010fc462a663d02d258
Gerrit-Change-Number: 9191
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
..

IMPALA-6392: Consistent explain format for parquet predicate statistics

In EXPLAIN_LEVEL=2+, change the explain format for parquet predicate
statistics to output each tuple descriptor per line. This change is to
make it consistent with the output of other predicates.

Before:
parquet statistics predicates: c_custkey < 10, o_orderkey < 5, l_linenumber < 3

After:
parquet statistics predicates: c_custkey < 10
parquet statistics predicates on o: o_orderkey < 5
parquet statistics predicates on o_lineitems: l_linenumber < 3

Testing:
- Ran existing planner tests and updated the ones that are affected by
  this change.
- Ran end-to-end tests in query_test

Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
---
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
4 files changed, 49 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/9223/7
--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
..


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9223/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9223/5//COMMIT_MSG@23
PS5, Line 23: .
> run end-to-end tests as well. see testdata/workloads/functional-query/queri
Done


http://gerrit.cloudera.org:8080/#/c/9223/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/9223/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@230
PS5, Line 230: maxOrigina
> what's the "dictionary" prefix for? seems this is still just for min/max co
I was referring to a Map as a dictionary, but I'll change it to 
minMaxOriginalConjuncts instead.


http://gerrit.cloudera.org:8080/#/c/9223/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1100
PS5, Line 1100: MinMaxOrig
> make this name consistent with member name if updated.
Done



--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 06 Feb 2018 20:34:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Gerrit-Change-Number: 9214
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 20:19:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1891/


--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Gerrit-Change-Number: 9214
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 20:19:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 3:

I did set them - the benchmark complains otherwise. It is a bit strange, I 
agree.


tarmstrong@tarmstrong-box2:~/Impala/incubator-impala$ cat 
/sys/devices/system/cpu/intel_pstate/no_turbo
1
tarmstrong@tarmstrong-box2:~/Impala/incubator-impala$ cat 
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance


--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Gerrit-Change-Number: 9214
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 20:17:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Gerrit-Change-Number: 9214
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 20:15:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 3:

> On my system after:
 >
 > With AVX2:
 >
 > insert:Function  iters/ms   10%ile   50%ile
 > 90%ile 10%ile 50%ile 90%ile
 > (relative) (relative) (relative)
 > -
 > ndv  10k fpp   10.0%2.1e+05 2.11e+05 2.13e+05
 >   1X 1X 1X
 > ndv  10k fpp1.0%   2.16e+05 2.18e+05 2.19e+05
 > 1.03X  1.03X  1.03X
 > ndv  10k fpp0.1%   2.12e+05 2.14e+05 2.16e+05
 > 1.01X  1.01X  1.01X
 > ndv1000k fpp   10.0%   1.98e+05 1.99e+05 2.01e+05
 > 0.943X 0.942X 0.945X
 > ndv1000k fpp1.0%   1.96e+05 1.98e+05 1.99e+05
 > 0.935X 0.936X 0.937X
 > ndv1000k fpp0.1%   1.96e+05 1.97e+05 1.99e+05
 > 0.935X 0.934X 0.936X
 > ndv  10k fpp   10.0%   5.63e+04  5.8e+04 6.18e+04
 > 0.269X 0.274X 0.291X
 > ndv  10k fpp1.0%   5.64e+04 5.84e+04 6.24e+04
 > 0.269X 0.276X 0.293X
 > ndv  10k fpp0.1%   5.56e+04 5.75e+04 5.86e+04
 > 0.265X 0.272X 0.275X
 >
 > find:  Function  iters/ms   10%ile   50%ile
 > 90%ile 10%ile 50%ile 90%ile
 > (relative) (relative) (relative)
 > -
 > present ndv  10k fpp   10.0%   1.97e+05 1.98e+05
 > 1.99e+05 1X 1X 1X
 > absent  ndv  10k fpp   10.0%   1.99e+05 2.01e+05
 > 2.03e+05  1.01X  1.01X  1.02X
 > present ndv  10k fpp1.0%   1.97e+05 1.98e+05
 > 2e+05 1X 1X 1X
 > absent  ndv  10k fpp1.0%  2e+05 2.01e+05
 > 2.03e+05  1.02X  1.02X  1.02X
 > present ndv  10k fpp0.1%   1.97e+05 1.99e+05
 > 2e+05 1X 1X 1X
 > absent  ndv  10k fpp0.1%  2e+05 2.02e+05
 > 2.03e+05  1.02X  1.02X  1.02X
 > present ndv1000k fpp   10.0%   1.75e+05 1.77e+05
 > 1.78e+05 0.891X 0.893X 0.893X
 > absent  ndv1000k fpp   10.0%   1.78e+05  1.8e+05
 > 1.81e+05 0.907X 0.907X 0.907X
 > present ndv1000k fpp1.0%1.8e+05 1.82e+05
 > 1.83e+05 0.917X 0.917X 0.919X
 > absent  ndv1000k fpp1.0%   1.84e+05 1.86e+05
 > 1.88e+05 0.937X 0.939X 0.941X
 > present ndv1000k fpp0.1%   1.69e+05  1.7e+05
 > 1.71e+05 0.857X 0.859X 0.858X
 > absent  ndv1000k fpp0.1%1.7e+05 1.72e+05
 > 1.74e+05 0.866X  0.87X 0.871X
 > present ndv  10k fpp   10.0%   5.34e+04 5.53e+04
 > 7.21e+04 0.271X 0.279X 0.362X
 > absent  ndv  10k fpp   10.0%   5.05e+04 5.28e+04
 > 5.52e+04 0.257X 0.267X 0.277X
 > present ndv  10k fpp1.0%   5.43e+04 5.74e+04
 > 8.65e+04 0.276X  0.29X 0.434X
 > absent  ndv  10k fpp1.0%   5.09e+04 5.42e+04
 > 5.73e+04 0.259X 0.274X 0.288X
 > present ndv  10k fpp0.1%   5.11e+04 5.24e+04
 > 6.69e+04  0.26X 0.265X 0.336X
 > absent  ndv  10k fpp0.1%   4.93e+04 5.02e+04
 > 5.1e+04 0.251X 0.254X 0.256X
 >
 > union: Function  iters/ms   10%ile   50%ile
 > 90%ile 10%ile 50%ile 90%ile
 > (relative) (relative) (relative)
 > -
 > ndv  10k fpp   10.0%   6.76e+05  6.8e+05 6.88e+05
 >   1X 1X 1X
 > ndv  10k fpp1.0%   6.77e+05 6.81e+05 6.87e+05
 >   1X 1X 0.998X
 > ndv  10k fpp0.1%   6.78e+05 6.82e+05 6.86e+05   
 >   1X 1X 0.996X
 > ndv1000k fpp   10.0%   6.78e+05 6.82e+05 6.88e+05
 >   1X 1X 1X
 > ndv1000k fpp1.0%   6.78e+05 6.83e+05 6.89e+05
 >   1X 1X 1X
 > ndv1000k fpp0.1%   6.77e+05  6.8e+05 6.89e+05
 >   1X 1X 1X
 > ndv  10k fpp   10.0%   6.77e+05 6.81e+05 6.88e+05
 >   1X 1X 0.999X
 > ndv  10k fpp1.0%   6.77e+05 6.85e+05 6.89e+05
 >   1X  1.01X 1X
 > ndv  10k fpp0.1%   6.76e+05  6.8e+05 6.88e+05
 >   1X 1X 1X

Why did union speed up? In any case, +2, but I'd be curious about

cat /sys/devices/system/cpu/intel_pstate/no_turbo

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor


--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit 

[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock

2018-02-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8363 )

Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and 
client_request_state_map_lock_
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8363/5/be/src/util/sharded-query-map-util.h
File be/src/util/sharded-query-map-util.h:

http://gerrit.cloudera.org:8080/#/c/8363/5/be/src/util/sharded-query-map-util.h@61
PS5, Line 61: map_container
> how about 'container_array' to make it clear this must be an array?
This comment would become moot once you encode the number of shards. then this 
parameter is just 'sharded_map' or similar (and contains all shards).



--
To view, visit http://gerrit.cloudera.org:8080/8363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1
Gerrit-Change-Number: 8363
Gerrit-PatchSet: 5
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 06 Feb 2018 20:14:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock

2018-02-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8363 )

Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and 
client_request_state_map_lock_
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8363/5/be/src/util/sharded-query-map-util.h
File be/src/util/sharded-query-map-util.h:

http://gerrit.cloudera.org:8080/#/c/8363/5/be/src/util/sharded-query-map-util.h@36
PS5, Line 36: 4
doesn't that have to be NUM_QUERY_BUCKETS? otherwise, QueryIdToBucket() doesn't 
work.

But, I think it'd be better to have everything encapsulated in these classes, 
i.e. the number of shards should be either encoded in here.


http://gerrit.cloudera.org:8080/#/c/8363/5/be/src/util/sharded-query-map-util.h@61
PS5, Line 61: map_container
how about 'container_array' to make it clear this must be an array?


http://gerrit.cloudera.org:8080/#/c/8363/5/be/src/util/sharded-query-map-util.h@63
PS5, Line 63: QueryIdToBucket
like above, given this is a function of the number of shards, it should be part 
of this class.



--
To view, visit http://gerrit.cloudera.org:8080/8363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1
Gerrit-Change-Number: 8363
Gerrit-PatchSet: 5
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 06 Feb 2018 20:12:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Tim Armstrong (Code Review)
Hello Jim Apple, Mostafa Mokhtar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9214

to look at the new patch set (#3).

Change subject: Force inlining of BloomFilter::MakeMask
..

Force inlining of BloomFilter::MakeMask

I noticed that this function was showing up in perf top for TPC-H Q8
running locally. It wasn't inlined into BloomFilter::BucketFindAVX2.
Inlining made the query ~5% faster for me locally.

Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
---
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/util/bloom-filter.h
2 files changed, 86 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/9214/3
--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Gerrit-Change-Number: 9214
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 2:

On my system after:

With AVX2:

insert:Function  iters/ms   10%ile   50%ile   90%ile 
10%ile 50%ile 90%ile
 
(relative) (relative) (relative)
-
   ndv  10k fpp   10.0%2.1e+05 2.11e+05 2.13e+05
 1X 1X 1X
   ndv  10k fpp1.0%   2.16e+05 2.18e+05 2.19e+05  
1.03X  1.03X  1.03X
   ndv  10k fpp0.1%   2.12e+05 2.14e+05 2.16e+05  
1.01X  1.01X  1.01X
   ndv1000k fpp   10.0%   1.98e+05 1.99e+05 2.01e+05 
0.943X 0.942X 0.945X
   ndv1000k fpp1.0%   1.96e+05 1.98e+05 1.99e+05 
0.935X 0.936X 0.937X
   ndv1000k fpp0.1%   1.96e+05 1.97e+05 1.99e+05 
0.935X 0.934X 0.936X
   ndv  10k fpp   10.0%   5.63e+04  5.8e+04 6.18e+04 
0.269X 0.274X 0.291X
   ndv  10k fpp1.0%   5.64e+04 5.84e+04 6.24e+04 
0.269X 0.276X 0.293X
   ndv  10k fpp0.1%   5.56e+04 5.75e+04 5.86e+04 
0.265X 0.272X 0.275X

find:  Function  iters/ms   10%ile   50%ile   90%ile 
10%ile 50%ile 90%ile
 
(relative) (relative) (relative)
-
   present ndv  10k fpp   10.0%   1.97e+05 1.98e+05 1.99e+05
 1X 1X 1X
   absent  ndv  10k fpp   10.0%   1.99e+05 2.01e+05 2.03e+05  
1.01X  1.01X  1.02X
   present ndv  10k fpp1.0%   1.97e+05 1.98e+052e+05
 1X 1X 1X
   absent  ndv  10k fpp1.0%  2e+05 2.01e+05 2.03e+05  
1.02X  1.02X  1.02X
   present ndv  10k fpp0.1%   1.97e+05 1.99e+052e+05
 1X 1X 1X
   absent  ndv  10k fpp0.1%  2e+05 2.02e+05 2.03e+05  
1.02X  1.02X  1.02X
   present ndv1000k fpp   10.0%   1.75e+05 1.77e+05 1.78e+05 
0.891X 0.893X 0.893X
   absent  ndv1000k fpp   10.0%   1.78e+05  1.8e+05 1.81e+05 
0.907X 0.907X 0.907X
   present ndv1000k fpp1.0%1.8e+05 1.82e+05 1.83e+05 
0.917X 0.917X 0.919X
   absent  ndv1000k fpp1.0%   1.84e+05 1.86e+05 1.88e+05 
0.937X 0.939X 0.941X
   present ndv1000k fpp0.1%   1.69e+05  1.7e+05 1.71e+05 
0.857X 0.859X 0.858X
   absent  ndv1000k fpp0.1%1.7e+05 1.72e+05 1.74e+05 
0.866X  0.87X 0.871X
   present ndv  10k fpp   10.0%   5.34e+04 5.53e+04 7.21e+04 
0.271X 0.279X 0.362X
   absent  ndv  10k fpp   10.0%   5.05e+04 5.28e+04 5.52e+04 
0.257X 0.267X 0.277X
   present ndv  10k fpp1.0%   5.43e+04 5.74e+04 8.65e+04 
0.276X  0.29X 0.434X
   absent  ndv  10k fpp1.0%   5.09e+04 5.42e+04 5.73e+04 
0.259X 0.274X 0.288X
   present ndv  10k fpp0.1%   5.11e+04 5.24e+04 6.69e+04  
0.26X 0.265X 0.336X
   absent  ndv  10k fpp0.1%   4.93e+04 5.02e+04  5.1e+04 
0.251X 0.254X 0.256X

union: Function  iters/ms   10%ile   50%ile   90%ile 
10%ile 50%ile 90%ile
 
(relative) (relative) (relative)
-
   ndv  10k fpp   10.0%   6.76e+05  6.8e+05 6.88e+05
 1X 1X 1X
   ndv  10k fpp1.0%   6.77e+05 6.81e+05 6.87e+05
 1X 1X 0.998X
   ndv  10k fpp0.1%   6.78e+05 6.82e+05 6.86e+05
 1X 1X 0.996X
   ndv1000k fpp   10.0%   6.78e+05 6.82e+05 6.88e+05
 1X 1X 1X
   ndv1000k fpp1.0%   6.78e+05 6.83e+05 6.89e+05
 1X 1X 1X
   ndv1000k fpp0.1%   6.77e+05  6.8e+05 6.89e+05
 1X 1X 1X
   ndv  10k fpp   10.0%   6.77e+05 6.81e+05 6.88e+05
 1X 1X 0.999X
   ndv  10k fpp1.0%   6.77e+05 6.85e+05 6.89e+05
 1X  1.01X 1X
   ndv  

[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 2:

The speedup was pretty significant on the benchmark.

On my system before:

With AVX2:

insert:Function  iters/ms   10%ile   50%ile   90%ile 
10%ile 50%ile 90%ile
 
(relative) (relative) (relative)
-
   ndv  10k fpp   10.0%   1.06e+05 1.06e+05 1.07e+05
 1X 1X 1X
   ndv  10k fpp1.0%   1.05e+05 1.06e+05 1.07e+05 
0.998X 0.999X 0.998X
   ndv  10k fpp0.1%   1.06e+05 1.06e+05 1.07e+05 
0.999X 1X 1X
   ndv1000k fpp   10.0%   1.05e+05 1.06e+05 1.07e+05 
0.993X 0.992X 0.994X
   ndv1000k fpp1.0%   1.04e+05 1.05e+05 1.06e+05 
0.986X 0.988X 0.986X
   ndv1000k fpp0.1%   1.04e+05 1.05e+05 1.06e+05 
0.986X 0.987X 0.986X
   ndv  10k fpp   10.0%   3.84e+04 4.04e+04  4.4e+04 
0.364X  0.38X  0.41X
   ndv  10k fpp1.0%3.9e+04 4.11e+04 4.46e+04 
0.369X 0.386X 0.416X
   ndv  10k fpp0.1%   3.77e+04 3.85e+04 4.58e+04 
0.357X 0.362X 0.426X

find:  Function  iters/ms   10%ile   50%ile   90%ile 
10%ile 50%ile 90%ile
 
(relative) (relative) (relative)
-
   present ndv  10k fpp   10.0%   1.12e+05 1.12e+05 1.13e+05
 1X 1X 1X
   absent  ndv  10k fpp   10.0%   1.11e+05 1.12e+05 1.13e+05 
0.998X 0.998X 1X
   present ndv  10k fpp1.0%   1.12e+05 1.12e+05 1.13e+05
 1X 0.998X 0.999X
   absent  ndv  10k fpp1.0%   1.12e+05 1.12e+05 1.13e+05 
0.999X 1X 0.998X
   present ndv  10k fpp0.1%   1.12e+05 1.13e+05 1.13e+05
 1X 1X 1X
   absent  ndv  10k fpp0.1%   1.12e+05 1.12e+05 1.13e+05 
0.999X 0.999X 0.998X
   present ndv1000k fpp   10.0%   1.08e+05 1.09e+05  1.1e+05 
0.967X 0.969X 0.967X
   absent  ndv1000k fpp   10.0%   1.09e+05 1.09e+05  1.1e+05 
0.973X  0.97X 0.973X
   present ndv1000k fpp1.0%   1.07e+05 1.08e+05 1.09e+05  
0.96X  0.96X 0.961X
   absent  ndv1000k fpp1.0%   1.08e+05 1.09e+05  1.1e+05 
0.971X 0.971X 0.969X
   present ndv1000k fpp0.1%   1.05e+05 1.05e+05 1.06e+05 
0.937X 0.938X 0.937X
   absent  ndv1000k fpp0.1%   1.06e+05 1.07e+05 1.08e+05 
0.951X 0.949X  0.95X
   present ndv  10k fpp   10.0%   3.97e+04 4.08e+04 4.63e+04 
0.356X 0.363X 0.408X
   absent  ndv  10k fpp   10.0%   3.94e+04 4.08e+04 4.23e+04 
0.353X 0.363X 0.374X
   present ndv  10k fpp1.0%   4.19e+04 5.11e+04 7.02e+04 
0.375X 0.454X  0.62X
   absent  ndv  10k fpp1.0%3.9e+04 4.09e+04 4.28e+04  
0.35X 0.364X 0.377X
   present ndv  10k fpp0.1%   3.94e+04 4.26e+04  6.5e+04 
0.353X 0.379X 0.573X
   absent  ndv  10k fpp0.1%   3.67e+04 3.71e+04 3.82e+04 
0.329X  0.33X 0.337X

union: Function  iters/ms   10%ile   50%ile   90%ile 
10%ile 50%ile 90%ile
 
(relative) (relative) (relative)
-
   ndv  10k fpp   10.0%   5.46e+05 5.51e+05 5.56e+05
 1X 1X 1X
   ndv  10k fpp1.0%   5.46e+05 5.49e+05 5.54e+05
 1X 0.996X 0.996X
   ndv  10k fpp0.1%   5.46e+05 5.51e+05 5.56e+05
 1X 1X 1X
   ndv1000k fpp   10.0%   5.47e+05 5.51e+05 5.55e+05
 1X 1X 0.998X
   ndv1000k fpp1.0%   5.48e+05 5.54e+05 5.65e+05
 1X  1.01X  1.02X
   ndv1000k fpp0.1%   5.48e+05 5.51e+05 5.55e+05
 1X 1X 0.998X
   ndv  10k fpp   10.0%   5.48e+05 5.54e+05 5.65e+05
 1X  1.01X  1.02X
   ndv  10k fpp1.0%   5.43e+05 5.51e+05 5.57e+05   

[Impala-ASF-CR] Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9226 )

Change subject: Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1890/


--
To view, visit http://gerrit.cloudera.org:8080/9226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb
Gerrit-Change-Number: 9226
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 06 Feb 2018 20:02:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"

2018-02-06 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9226 )

Change subject: Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb
Gerrit-Change-Number: 9226
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 06 Feb 2018 20:01:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6396: Exchange node's memory usage should include its receiver's

2018-02-06 Thread Michael Ho (Code Review)
Michael Ho has removed Lars Volker from this change.  ( 
http://gerrit.cloudera.org:8080/9202 )

Change subject: IMPALA-6396: Exchange node's memory usage should include its 
receiver's
..


Removed reviewer Lars Volker.
--
To view, visit http://gerrit.cloudera.org:8080/9202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I8ca3c47d87bfcd221d34565eda1878f3c15d5c45
Gerrit-Change-Number: 9202
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock

2018-02-06 Thread Sailesh Mukil (Code Review)
Hello Philip Zeyliger, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8363

to look at the new patch set (#5).

Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and 
client_request_state_map_lock_
..

IMPALA-4456: Address scalability issues of qs_map_lock_ and 
client_request_state_map_lock_

The following 2 locks have shown to be frequent points of contention
on recent perf runs:

- qs_map_lock_
- client_request_state_map_lock_

Since these are process wide locks, any threads waiting on these locks
potentially slow down the runtime of a query.

I tried to address this previously by converting the 
client_request_state_map_lock_
to a reader-writer lock. This showed great perf improvements in the general
case, however, there were edge cases with big regressions as well.
In the general case, strict readers of the map got through so quickly
that we were able to see a reduction in the number of client connections
created, since this lock was contended for in the context of Thrift threads too.
The bad case is when writers were starved trying to register a new query
since there were so many readers. Changing the starve option resulted in
worse read performance all round.

Another approach which is relatively simpler is to shard the locks, which
proves to be very effective with no regressions. The maps and locks are
sharded to a default of 4 buckets initally.

Query IDs are created by using boost::uuids::random_generator. We use the
high bits of a query ID to assign queries to buckets. I verified that the
distribution of the high bits of a query ID are even across buckets on
my local machine:

For 10,000 queries sharded across 4 buckets, the distribution was:
bucket[0]: 2500
bucket[1]: 2489
bucket[2]: 2566
bucket[3]: 2445

A micro-benchmark is added to measure the improvement in performance. This
benchmark creates multiple threads each of which creates a QueryState and
accesses it multiple times. We can see improvements in the range 2x - 3.5x.

BEFORE:
--Benchmark 1: Create and access Query States.
Total Time (#Queries: 5 #Accesses: 100) : 1ms
Total Time (#Queries: 50 #Accesses: 100) : 8ms
Total Time (#Queries: 50 #Accesses: 1000) : 54ms
Total Time (#Queries: 500 #Accesses: 100) : 76ms
Total Time (#Queries: 500 #Accesses: 1000) : 543ms

AFTER:
--Benchmark 1: Create and access Query States.
Total Time (#Queries: 5 #Accesses: 100) : 2173.59K clock cycles
Total Time (#Queries: 50 #Accesses: 100) : 4ms
Total Time (#Queries: 50 #Accesses: 1000) : 15ms
Total Time (#Queries: 500 #Accesses: 100) : 46ms
Total Time (#Queries: 500 #Accesses: 1000) : 151ms

This change introduces a ShardedQueryMapTemplate, which is used to replace
the QueryExecMgr::qs_map_ and the ImpalaServer::client_request_state_map_,
and their corresponding locks.

NOTE: This microbenchmark has shown that SpinLock has better performance
than boost::mutex for the qs_map_lock_'s, so that change has been made
too.

TODO: Add benchmark for client_request_state_map_lock_ too. The APIs
around that are more complicated, so this patch only includes
the benchmarking of qs_map_lock_.

TODO 2: Consider adopting the ShardedQueryMapTemplate for the SessionStateMap.

Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/process-wide-locks-benchmark.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
A be/src/util/sharded-query-map-util.h
M be/src/util/uid-util.h
9 files changed, 353 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/8363/5
--
To view, visit http://gerrit.cloudera.org:8080/8363
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1
Gerrit-Change-Number: 8363
Gerrit-PatchSet: 5
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"

2018-02-06 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9226 )

Change subject: Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"
..


Patch Set 1:

I believe this is breaking all builds because the BE fails to compile. See:
https://issues.apache.org/jira/browse/IMPALA-6485


--
To view, visit http://gerrit.cloudera.org:8080/9226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb
Gerrit-Change-Number: 9226
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Comment-Date: Tue, 06 Feb 2018 20:00:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"

2018-02-06 Thread Alex Behm (Code Review)
Alex Behm has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9226


Change subject: Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"
..

Revert "IMPALA-6219: Use AES-GCM for spill-to-disk encryption"

This reverts commit 9b68645f9eb9e08899fda860e0946cc05f205479.

Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb
---
M be/src/runtime/tmp-file-mgr.cc
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
M be/src/util/openssl-util-test.cc
M be/src/util/openssl-util.cc
M be/src/util/openssl-util.h
6 files changed, 73 insertions(+), 227 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/9226/1
--
To view, visit http://gerrit.cloudera.org:8080/9226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia06f061a4ecedd1df0d359fe06fe84618b5e07fb
Gerrit-Change-Number: 9226
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-4795: Allow fetching function obj from catalog using signature

2018-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9225 )

Change subject: IMPALA-4795: Allow fetching function obj from catalog using 
signature
..


Patch Set 1: Code-Review+2

It looks like the old patch applied cleanly right? I did another pass over the 
patch to sanity-check it and confirm that the approach was the same as the 
previous one.


--
To view, visit http://gerrit.cloudera.org:8080/9225
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3c19edd56e645b8bee918a111c76fe3f260142fe
Gerrit-Change-Number: 9225
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 19:50:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics

2018-02-06 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9223/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9223/5//COMMIT_MSG@23
PS5, Line 23: .
run end-to-end tests as well. see 
testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test. 
this is used by tests/query_test/test_mt_dop.py


http://gerrit.cloudera.org:8080/#/c/9223/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/9223/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@230
PS5, Line 230: dictionary
what's the "dictionary" prefix for? seems this is still just for min/max 
conjuncts (minMaxConjuncts_) and not for dictionaryFilterConjuncts_.


http://gerrit.cloudera.org:8080/#/c/9223/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1100
PS5, Line 1100: Dictionary
make this name consistent with member name if updated.



--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 06 Feb 2018 19:27:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6204: Remove external DataSource

2018-02-06 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9192 )

Change subject: IMPALA-6204: Remove external DataSource
..


Patch Set 5:

> > Was there ever an announcement that this was deprecated?
 >
 > Is https://issues.apache.org/jira/browse/IMPALA-6204 sufficient?

TBH, I'm not thrilled about that. For instance, there is no justification on 
that ticket about why, and no discussion of pros and cons.

While this is in review, Maybe Philip could start a discussion on user@ 
explaining why and asking for feedback. Maybe it will come back unanimous after 
a short amount of time, but I think it's worth a discussion, given how badly 
this could break workloads of existing users. (Obviously they could ETL into 
Parquet or stay on the 2.x branch.)


--
To view, visit http://gerrit.cloudera.org:8080/9192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6
Gerrit-Change-Number: 9192
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 06 Feb 2018 19:19:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5037: Default PARQUET ARRAY RESOLUTION=THREE LEVEL

2018-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9210 )

Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1889/


--
To view, visit http://gerrit.cloudera.org:8080/9210
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 19:18:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5037: Default PARQUET ARRAY RESOLUTION=THREE LEVEL

2018-02-06 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9210 )

Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9210
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 19:17:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5037: Default PARQUET ARRAY RESOLUTION=THREE LEVEL

2018-02-06 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9210 )

Change subject: IMPALA-5037: Default PARQUET_ARRAY_RESOLUTION=THREE_LEVEL
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9210/2/tests/query_test/test_nested_types.py
File tests/query_test/test_nested_types.py:

http://gerrit.cloudera.org:8080/#/c/9210/2/tests/query_test/test_nested_types.py@355
PS2, Line 355: arr_res = vector.get_value('parquet_array_resolution')
> Seems ok, but we could probably factor these three lines out into a functio
Done



--
To view, visit http://gerrit.cloudera.org:8080/9210
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8f7e9010c4354d667305d9df7b78862efb23fe1
Gerrit-Change-Number: 9210
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 19:16:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4795: Allow fetching function obj from catalog using signature

2018-02-06 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9225


Change subject: IMPALA-4795: Allow fetching function obj from catalog using 
signature
..

IMPALA-4795: Allow fetching function obj from catalog using signature

Fixed a bug where the catalog throws a NullPointerException if trying
to fetch a function object using function signature. This exception
prevented the code paths in CatalogOpExecutor::HandleDropFunction and
ImpalaServer::CatalogUpdateCallback to be exercised which prevented
removal of a recreated function object necessary for maintaining
metadata consistency.

This patch was initially reverted because it increased the
probability of tests in test_udfs failing. This was due to a race
in LibCache that has now been fixed in a separate patch.

Change-Id: I3c19edd56e645b8bee918a111c76fe3f260142fe
---
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M tests/common/impala_service.py
M tests/query_test/test_udfs.py
M tests/webserver/test_web_pages.py
5 files changed, 48 insertions(+), 16 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/9225/1
--
To view, visit http://gerrit.cloudera.org:8080/9225
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3c19edd56e645b8bee918a111c76fe3f260142fe
Gerrit-Change-Number: 9225
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 


[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC

2018-02-06 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9186 )

Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms 
and negotiation thread count in KRPC
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9186/2/be/src/rpc/rpc-mgr-test.cc
File be/src/rpc/rpc-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/9186/2/be/src/rpc/rpc-mgr-test.cc@537
PS2, Line 537:
> Should this use something like ScopedFlagSetter() in case the ASSERT below
Yes, good idea. Done.


http://gerrit.cloudera.org:8080/#/c/9186/2/be/src/rpc/rpc-mgr-test.cc@541
PS2, Line 541:
> Is TLS always enabled ? Seem a bit misleading to have the "tls_" prefix her
Sorry that was a mistake. I forgot to change the name of the variable. Done.


http://gerrit.cloudera.org:8080/#/c/9186/2/be/src/rpc/rpc-mgr-test.cc@551
PS2, Line 551: 
> Just wondering if it'll be a problem if ASSERT() above gets triggered. Will
If ASSERT() is triggered, we fail the BE test anyway right? So, even though the 
RpcMgr isn't shutdown, the process will kill it.



--
To view, visit http://gerrit.cloudera.org:8080/9186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5
Gerrit-Change-Number: 9186
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 06 Feb 2018 19:14:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6456: Add flags to configure rpc negotiation timeout ms and negotiation thread count in KRPC

2018-02-06 Thread Sailesh Mukil (Code Review)
Hello Michael Ho,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9186

to look at the new patch set (#3).

Change subject: IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms 
and negotiation thread count in KRPC
..

IMPALA-6456: Add flags to configure rpc_negotiation_timeout_ms and negotiation 
thread count in KRPC

With the fix for KUDU-2228, the FLAGS_rpc_negotiation_timeout_ms was
retired in KRPC.

This patch introduces a flag to be able to configure that from the
Impala side (FLAGS_rpc_negotiation_timeout_ms).

It also introduces a flag to configure the negotiation
thread count (FLAGS_rpc_negotiation_thread_count).

Added a test to verify that setting FLAGS_rpc_negotiation_timeout_ms
to 0 causes negotiation failures. We unfortunately can't write a test
to check the same for FLAGS_rpc_negotiation_thread_count due to
DCHECKS present in the code.

Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5
---
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr.cc
2 files changed, 28 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/9186/3
--
To view, visit http://gerrit.cloudera.org:8080/9186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I108d700e7eac04b678e21a3a920aac81ba8eede5
Gerrit-Change-Number: 9186
Gerrit-PatchSet: 3
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
..


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
   :
> shorten this to fit on one line
Done.


http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG@10
PS2, Line 10: to output each
> mention that this only affects EXPLAIN_LEVEL=2+
Done.


http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG@13
PS2, Line 13: Before:
> Could you include a couple lines showing what the new format looks like?
Done.


http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@230
PS2, Line 230: M
> extra whitespace
Done.


http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@232
PS2, Line 232:
> extra whitespace
Done.


http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1099
PS2, Line 1099:   // Helper method that prints dictionary min max original 
conjucts by tuple descriptor.
> Brief comment here, like the one for getDictionaryConjunctsExplainString()
Done.


http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1107
PS2, Line 1107: ut.app
> This can fit on the previous line (we wrap at 90)
Done.


http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1109
PS2, Line 1109: else {
> Since this is only used once, no need to make a variable, just include it i
Done.


http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@
PS2, Line : prefix
> fits on previous line
The previous line will be 91-width long if we move prefix to the previous line.



--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 06 Feb 2018 18:59:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6392: Consistent explain format for parquet predicate statistics

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Consistent explain format for parquet predicate 
statistics
..

IMPALA-6392: Consistent explain format for parquet predicate statistics

In EXPLAIN_LEVEL=2+, change the explain format for parquet predicate
statistics to output each tuple descriptor per line. This change is to
make it consistent with the output of other predicates.

Before:
parquet statistics predicates: c_custkey < 10, o_orderkey < 5, l_linenumber < 3

After:
parquet statistics predicates: c_custkey < 10
parquet statistics predicates on o: o_orderkey < 5
parquet statistics predicates on o_lineitems: l_linenumber < 3

Testing:
- Ran existing planner tests and updated the ones that are affected by
  this change.

Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
---
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
4 files changed, 49 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/9223/5
--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-6437: separate AC/scheduler from catalog topic updates

2018-02-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9123 )

Change subject: IMPALA-6437: separate AC/scheduler from catalog topic updates
..


Patch Set 14: Code-Review+1

This looks okay to me. I focused mostly on headers and only took a quick looks 
at implementation. Up to you whether you want to have another reviewer in 
addition to Tianyi take a deeper look at the code.


--
To view, visit http://gerrit.cloudera.org:8080/9123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc49c2d0f2a5bfad822545616b8c62b4b95dc210
Gerrit-Change-Number: 9123
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 18:54:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6204: Remove external DataSource

2018-02-06 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9192 )

Change subject: IMPALA-6204: Remove external DataSource
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9192/5/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/9192/5/common/thrift/CatalogObjects.thrift@401
PS5, Line 401:   //  12: optional TDataSourceTable data_source_table
> what's the reasoning for not renumbering for these internal datastructures?
I'd also prefer removing everything. I see no point in keeping stuff like this 
around.



--
To view, visit http://gerrit.cloudera.org:8080/9192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6
Gerrit-Change-Number: 9192
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 06 Feb 2018 18:53:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6392: Explain format for parquet predicate statistics should be consistent with predicates

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Explain format for parquet predicate statistics 
should be consistent with predicates
..


Patch Set 2:

My username is fredyw


--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 06 Feb 2018 18:44:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6392: Explain format for parquet predicate statistics should be consistent with predicates

2018-02-06 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9223 )

Change subject: IMPALA-6392: Explain format for parquet predicate statistics 
should be consistent with predicates
..


Patch Set 2:

(9 comments)

Thanks for working on this!

Do you have a username on JIRA so that I can assign this (and the other ones 
you're working on) to you?

http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-6392: Explain format for parquet predicate statistics 
should be
   : consistent with predicates
shorten this to fit on one line


http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG@10
PS2, Line 10: explain format
mention that this only affects EXPLAIN_LEVEL=2+


http://gerrit.cloudera.org:8080/#/c/9223/2//COMMIT_MSG@13
PS2, Line 13:
Could you include a couple lines showing what the new format looks like?

Also, we like to include a "Testing" section. You can look through other 
reviews on gerrit to see what this normally looks like, but it would be 
something like:

Testing:
- Ran existing planner tests and updated the ones that are affected by this 
change.


http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@230
PS2, Line 230:
extra whitespace


http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@232
PS2, Line 232:
extra whitespace


http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1099
PS2, Line 1099:   private String 
getDictionaryMinMaxOriginalConjunctsExplainString(String prefix) {
Brief comment here, like the one for getDictionaryConjunctsExplainString() below


http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1107
PS2, Line 1107: prefix
This can fit on the previous line (we wrap at 90)

You might consider looking into clang-format-diff to find these little issues. 
We include a .clang-format file


http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1109
PS2, Line 1109: String alias
Since this is only used once, no need to make a variable, just include it 
inline in the append()


http://gerrit.cloudera.org:8080/#/c/9223/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@
PS2, Line : prefix
fits on previous line



--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 06 Feb 2018 18:26:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6424: Avoid loading metadata twice when refreshing unloaded tables

2018-02-06 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9224


Change subject: IMPALA-6424: Avoid loading metadata twice when refreshing 
unloaded tables
..

IMPALA-6424: Avoid loading metadata twice when refreshing unloaded tables

This commit fixes an issue where table metadata are loaded twice if a
REFRESH statement is executed on an unloaded table. With this fix, we
initially check the state of the table (loaded, unloaded) and we only
refresh metadata if the table is in a loaded state. If table is
unloaded, we load the metadata from scratch.

Testing:
Run exhaustive tests

Change-Id: Ie41a734493dcea0e36d6b051966f1d0302907dee
---
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
1 file changed, 23 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/9224/1
--
To view, visit http://gerrit.cloudera.org:8080/9224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie41a734493dcea0e36d6b051966f1d0302907dee
Gerrit-Change-Number: 9224
Gerrit-PatchSet: 1
Gerrit-Owner: Dimitris Tsirogiannis 


[Impala-ASF-CR] IMPALA-6204: Remove external DataSource

2018-02-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9192 )

Change subject: IMPALA-6204: Remove external DataSource
..


Patch Set 5:

(2 comments)

Maybe also manually test that the various SQL related to datasource gives a 
sensible parsing error with this change?

http://gerrit.cloudera.org:8080/#/c/9192/5/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/9192/5/be/src/service/client-request-state.cc@297
PS5, Line 297:   return Status("Data Source not supported.");
why can we even get here? doesn't the frontend reject this statement?


http://gerrit.cloudera.org:8080/#/c/9192/5/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/9192/5/common/thrift/CatalogObjects.thrift@401
PS5, Line 401:   //  12: optional TDataSourceTable data_source_table
what's the reasoning for not renumbering for these internal datastructures? I 
guess I don't feel too strongly, but seems like we can accumulate cruft. (Once 
we implement rolling upgrade, then we'll probably need to be stricter.)



--
To view, visit http://gerrit.cloudera.org:8080/9192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6
Gerrit-Change-Number: 9192
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 06 Feb 2018 17:45:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 2: Code-Review+1

Just for sanity checking, did you run bloom-filter-benchmark?


--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Gerrit-Change-Number: 9214
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 17:13:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6204: Remove external DataSource

2018-02-06 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9192 )

Change subject: IMPALA-6204: Remove external DataSource
..


Patch Set 5:

Was there ever an announcement that this was deprecated?


--
To view, visit http://gerrit.cloudera.org:8080/9192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6
Gerrit-Change-Number: 9192
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 06 Feb 2018 17:07:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9214/2/be/src/util/bloom-filter.h
File be/src/util/bloom-filter.h:

http://gerrit.cloudera.org:8080/#/c/9214/2/be/src/util/bloom-filter.h@162
PS2, Line 162:   __attribute__((__target__("avx2")));
> Yes, I see it too in BucketFindAvx2
The __target__ is needed for GCC to emit AVX2 instructions at all, the 
multi-versioning thing is really an overloading of the __target__ attribute



--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Gerrit-Change-Number: 9214
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 16:52:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask

2018-02-06 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9214 )

Change subject: Force inlining of BloomFilter::MakeMask
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9214/2/be/src/util/bloom-filter.h
File be/src/util/bloom-filter.h:

http://gerrit.cloudera.org:8080/#/c/9214/2/be/src/util/bloom-filter.h@162
PS2, Line 162:   __attribute__((__target__("avx2")));
Yes, I see it too in BucketFindAvx2
 0.67 ?  mov%rdi,%r12   
   ?
  1.07 ?  mov%esi,%ebx  
?
  0.56 ?  mov%edx,%edi  
?
  1.32 ?  shl$0x5,%rbx  
?
  0.72 ?  sub$0x18,%rsp 
?
  1.73 ?? callq  impala::BloomFilter::MakeMask(unsigned int)

Wondering if the multi-versioning attribute is still needed since MakeMask  
only called from BucketFindAVX2 and BucketInsertAVX2 which also have 
(__target__("avx2")).



--
To view, visit http://gerrit.cloudera.org:8080/9214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923
Gerrit-Change-Number: 9214
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Comment-Date: Tue, 06 Feb 2018 16:50:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6392: Explain format for parquet predicate statistics should be consistent with predicates

2018-02-06 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9223


Change subject: IMPALA-6392: Explain format for parquet predicate statistics 
should be consistent with predicates
..

IMPALA-6392: Explain format for parquet predicate statistics should be
consistent with predicates

Change the explain format for parquet predicate statistics to output
each tuple descriptor per line. This change is to make it consistent
with the output of other predicates.

Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
---
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
4 files changed, 50 insertions(+), 19 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/9223/2
--
To view, visit http://gerrit.cloudera.org:8080/9223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia3d55ab6a1ae551867a9f68b3622844102cc854e
Gerrit-Change-Number: 9223
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 


[Impala-ASF-CR] IMPALA-5440 Add planner tests with extreme statistics values

2018-02-06 Thread Anonymous Coward (Code Review)
xyutin...@cloudera.com has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/9065 )

Change subject: IMPALA-5440 Add planner tests with extreme statistics values
..

IMPALA-5440 Add planner tests with extreme statistics values

This commit address some of the issues in JIRA: tests against the cardinality
overflowing from JOIN, UNION, CROSS JOIN, FULL OUTER JOIN,
0 row number and negative row number, as well as cardinality on Subplan node.

Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb
---
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
2 files changed, 150 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/9065/9
--
To view, visit http://gerrit.cloudera.org:8080/9065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb
Gerrit-Change-Number: 9065
Gerrit-PatchSet: 9
Gerrit-Owner: xyutin...@cloudera.com
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Reviewer: xyutin...@cloudera.com


[Impala-ASF-CR] IMPALA-4835: Part 3: switch I/O buffers to buffer pool

2018-02-06 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8966 )

Change subject: IMPALA-4835: Part 3: switch I/O buffers to buffer pool
..


Patch Set 12:

(4 comments)

some early comments: overall looks pretty solid. will do another pass

http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-parquet-scanner.cc@1707
PS12, Line 1707: 3
would it make sense to replace this with a named constant? something that 
appropriately ties together the use of 3 buffers and optimum reservation size.


http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-parquet-scanner.cc@1724
PS12, Line 1724: bytes_to_add = 0;
would it be useful to squeeze max amount of buffers if we do this here:

if(reservation_to_distribute >= min_buffer_size)
bytes_to_add = BitUtil::RoundDownToPowerOfTwo(reservation_to_distribute)


http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/hdfs-scan-node-base.cc@347
PS12, Line 347:   // We got the minimum reservation. Now try to get ideal 
reservation.
  :   if (resource_profile_.min_reservation != 
ideal_scan_range_reservation_) {
  : ignore_result(buffer_pool_client_.IncreaseReservation(
  : ideal_scan_range_reservation_ - 
resource_profile_.min_reservation));
  :   }
should we briefly document how ideal_scan_range_reservation_ is used  as 
compared to min_scan_range_reservation_ in HdfsScanNode.java's class level 
comments?


http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

http://gerrit.cloudera.org:8080/#/c/8966/12/be/src/exec/scanner-context.cc@161
PS12, Line 161: parent_->bp_client_,
is this the same clientHandle object used across different threads?



--
To view, visit http://gerrit.cloudera.org:8080/8966
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic09c6196b31e55b301df45cc56d0b72cfece6786
Gerrit-Change-Number: 8966
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Feb 2018 08:40:51 +
Gerrit-HasComments: Yes