[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14026 )

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..


Patch Set 8: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 8
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 14 Aug 2019 04:39:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4551: Limit the size of SQL statements

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14012 )

Change subject: IMPALA-4551: Limit the size of SQL statements
..

IMPALA-4551: Limit the size of SQL statements

Various BI tools generate and run SQL. When used incorrectly or
misconfigured, the tools can generate extremely large SQLs.
Some of these SQL statements reach 10s of megabytes. Large SQL
statements impose costs throughout execution, including
statement rewrite logic in the frontend and codegen in the
backend. The resource usage of these statements can impact
the stability of the system or the ability to run other SQL
statements.

This implements two new query options that provide controls
to reject large SQL statements.
 - The first, MAX_STATEMENT_LENGTH_BYTES is a cap on the
   total size of the SQL statement (in bytes). It is
   applied before any parsing or analysis. It uses a
   default value of 16MB.
 - The second, STATEMENT_EXPRESSION_LIMIT, is a limit on
   the total number of expressions in a statement or any
   views that it references. The limit is applied upon the
   first round of analysis, but it is not reapplied when
   statement rewrite rules are applied. Certain expressions
   such as literals in IN lists or VALUES clauses are not
   analyzed and do not count towards the limit. It uses
   a default value of 250,000.
The two are complementary. Since enforcing the statement
expression limit requires parsing and analyzing the
statement, the MAX_STATEMENT_LENGTH_BYTES sets an upper
bound on the size of statement that needs to be parsed
and analyzed. Testing confirms that even statements
approaching 16MB get through the first round of analysis
within a few seconds and then are rejected.

This also changes the logging in tests/common/impala_connection.py
to limit the total SQL size that it will print to 128KB. This is
prevents the JUnitXML (which includes this logging) from being too
large. Existing tests do not run SQL larger than about 80KB, so
this only applies to tests added in this change that run multi-MB
SQLs to verify limits.

Testing:
 - This adds frontend tests that verify the low level
   semantics about how expressions are counted and verifies
   that the expression limits are enforced.
 - This adds end-to-end tests that verify both the
   MAX_STATEMENT_LENGTH_BYTES and STATEMENT_EXPRESSION_LIMIT
   at their defaults values.
 - There is also an end-to-end test that runs in exhaustive
   mode that runs a SQL with close to 250,000 expressions.

Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Reviewed-on: http://gerrit.cloudera.org:8080/14012
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
---
M be/src/service/impala-server.cc
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/generate_error_codes.py
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M tests/common/impala_connection.py
M tests/query_test/test_exprs.py
13 files changed, 381 insertions(+), 11 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/14012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Gerrit-Change-Number: 14012
Gerrit-PatchSet: 7
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-4551: Limit the size of SQL statements

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14012 )

Change subject: IMPALA-4551: Limit the size of SQL statements
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Gerrit-Change-Number: 14012
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 14 Aug 2019 04:36:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4551: Limit the size of SQL statements

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14012 )

Change subject: IMPALA-4551: Limit the size of SQL statements
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Gerrit-Change-Number: 14012
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 14 Aug 2019 03:05:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-8821: Use RECOVER PARTITIONS in dataload to get partition metadata

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14060 )

Change subject: [WIP] IMPALA-8821: Use RECOVER PARTITIONS in dataload to get 
partition metadata
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/4245/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/14060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2622cd3655cf4521d5ac945759fd35c9abe670ef
Gerrit-Change-Number: 14060
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 14 Aug 2019 02:15:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8854: skip test acid insert

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14056 )

Change subject: IMPALA-8854: skip test_acid_insert
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3bc5cce8b9c3843b5bb8ac4d29b2219411f671b4
Gerrit-Change-Number: 14056
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 14 Aug 2019 01:14:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4551: Limit the size of SQL statements

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14012 )

Change subject: IMPALA-4551: Limit the size of SQL statements
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4244/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Gerrit-Change-Number: 14012
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 14 Aug 2019 01:08:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14026 )

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4243/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 8
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:57:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8791: Handle the case where there is no fragment scheduled on the coordinator

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14058 )

Change subject: IMPALA-8791: Handle the case where there is no fragment 
scheduled on the coordinator
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic4fba02bb7b20553a20634f8c5989d43ba2c0721
Gerrit-Change-Number: 14058
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:53:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8791: Handle the case where there is no fragment scheduled on the coordinator

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14058 )

Change subject: IMPALA-8791: Handle the case where there is no fragment 
scheduled on the coordinator
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4787/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic4fba02bb7b20553a20634f8c5989d43ba2c0721
Gerrit-Change-Number: 14058
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:53:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] [WIP] IMPALA-8821: Use RECOVER PARTITIONS in dataload to get partition metadata

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14060 )

Change subject: [WIP] IMPALA-8821: Use RECOVER PARTITIONS in dataload to get 
partition metadata
..


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/14060/1/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

http://gerrit.cloudera.org:8080/#/c/14060/1/testdata/bin/generate-schema-statements.py@518
PS1, Line 518: def get_hdfs_subdirs_with_data_indb(path):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/14060/1/testdata/bin/generate-schema-statements.py@521
PS1, Line 521:
flake8: E251 unexpected spaces around keyword / parameter equals


http://gerrit.cloudera.org:8080/#/c/14060/1/testdata/bin/generate-schema-statements.py@521
PS1, Line 521:
flake8: E251 unexpected spaces around keyword / parameter equals


http://gerrit.cloudera.org:8080/#/c/14060/1/testdata/bin/generate-schema-statements.py@521
PS1, Line 521:
flake8: E251 unexpected spaces around keyword / parameter equals


http://gerrit.cloudera.org:8080/#/c/14060/1/testdata/bin/generate-schema-statements.py@521
PS1, Line 521:
flake8: E251 unexpected spaces around keyword / parameter equals


http://gerrit.cloudera.org:8080/#/c/14060/1/testdata/bin/generate-schema-statements.py@521
PS1, Line 521:
flake8: E251 unexpected spaces around keyword / parameter equals


http://gerrit.cloudera.org:8080/#/c/14060/1/testdata/bin/generate-schema-statements.py@521
PS1, Line 521:
flake8: E251 unexpected spaces around keyword / parameter equals



--
To view, visit http://gerrit.cloudera.org:8080/14060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2622cd3655cf4521d5ac945759fd35c9abe670ef
Gerrit-Change-Number: 14060
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:47:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [WIP] IMPALA-8821: Use RECOVER PARTITIONS in dataload to get partition metadata

2019-08-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14060


Change subject: [WIP] IMPALA-8821: Use RECOVER PARTITIONS in dataload to get 
partition metadata
..

[WIP] IMPALA-8821: Use RECOVER PARTITIONS in dataload to get partition metadata

When using a data snapshot without a metadata snapshot (e.g. when loading to
a remote cluster), the data is already in place, and dataload needs to perform
all the appropriate DDLs to create the metadata. Currently, for dynamically
partitioned tables, dataload gives up and reloads those tables from scratch
in this circumstance. However, there is no need to do this, as ALTER TABLE..
RECOVER PARTITIONS can get the partition metadata by looking at the filesystem.

This changes dataload to use RECOVER PARTITIONS for dynamically partitioned
tables rather than forcing a reload of the table. Dataload from scratch is
not impacted, because there is no existing data and everything needs to be
inserted anyway. Dataload with both a data snapshot and a metadata snapshot
also is not impacted, because testdata/bin/create-load-data.sh skips most
of the bin/load-data.py calls for that codepath. So, this is limited to
dataload with a data snapshot and without a metadata snapshot. The biggest
impact of this is the TPC-DS store_sales does not have to be reloaded from
scratch in this case.

Impala dataload overrides the default location for table directories to
its own weird nonstandard location. These locations reside outside the
database *.db directories. The current table existence check is tuned to
handle tables that reside in directories with this naming system. It does
not handle tables that use the default location (i.e. the location if
LOCATION is not specified). This detects tables using the standard directory
naming and uses a different table existence check for those tables. This
eliminates the need to reload these tables.

Callers of bin/load-data.py always have the option of forcing a reload
via the --force_reload flag.

Testing:
 - Ran normal dataload (no snapshots)
 - Ran dataload with just a data snapshot (no metadata snapshot)
 - Ran dataload with a data snapshot and a metadata snapshot

Change-Id: I2622cd3655cf4521d5ac945759fd35c9abe670ef
---
M testdata/bin/generate-schema-statements.py
1 file changed, 47 insertions(+), 16 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/14060/1
--
To view, visit http://gerrit.cloudera.org:8080/14060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2622cd3655cf4521d5ac945759fd35c9abe670ef
Gerrit-Change-Number: 14060
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-4551: Limit the size of SQL statements

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14012 )

Change subject: IMPALA-4551: Limit the size of SQL statements
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4786/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Gerrit-Change-Number: 14012
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:33:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14026 )

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4785/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 8
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:32:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14026 )

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 8
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:30:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4551: Limit the size of SQL statements

2019-08-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14012 )

Change subject: IMPALA-4551: Limit the size of SQL statements
..


Patch Set 6: Code-Review+2

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/14012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Gerrit-Change-Number: 14012
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:28:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4551: Limit the size of SQL statements

2019-08-13 Thread Joe McDonnell (Code Review)
Hello Bharath Vissapragada, Michael Ho, Quanlong Huang, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14012

to look at the new patch set (#6).

Change subject: IMPALA-4551: Limit the size of SQL statements
..

IMPALA-4551: Limit the size of SQL statements

Various BI tools generate and run SQL. When used incorrectly or
misconfigured, the tools can generate extremely large SQLs.
Some of these SQL statements reach 10s of megabytes. Large SQL
statements impose costs throughout execution, including
statement rewrite logic in the frontend and codegen in the
backend. The resource usage of these statements can impact
the stability of the system or the ability to run other SQL
statements.

This implements two new query options that provide controls
to reject large SQL statements.
 - The first, MAX_STATEMENT_LENGTH_BYTES is a cap on the
   total size of the SQL statement (in bytes). It is
   applied before any parsing or analysis. It uses a
   default value of 16MB.
 - The second, STATEMENT_EXPRESSION_LIMIT, is a limit on
   the total number of expressions in a statement or any
   views that it references. The limit is applied upon the
   first round of analysis, but it is not reapplied when
   statement rewrite rules are applied. Certain expressions
   such as literals in IN lists or VALUES clauses are not
   analyzed and do not count towards the limit. It uses
   a default value of 250,000.
The two are complementary. Since enforcing the statement
expression limit requires parsing and analyzing the
statement, the MAX_STATEMENT_LENGTH_BYTES sets an upper
bound on the size of statement that needs to be parsed
and analyzed. Testing confirms that even statements
approaching 16MB get through the first round of analysis
within a few seconds and then are rejected.

This also changes the logging in tests/common/impala_connection.py
to limit the total SQL size that it will print to 128KB. This is
prevents the JUnitXML (which includes this logging) from being too
large. Existing tests do not run SQL larger than about 80KB, so
this only applies to tests added in this change that run multi-MB
SQLs to verify limits.

Testing:
 - This adds frontend tests that verify the low level
   semantics about how expressions are counted and verifies
   that the expression limits are enforced.
 - This adds end-to-end tests that verify both the
   MAX_STATEMENT_LENGTH_BYTES and STATEMENT_EXPRESSION_LIMIT
   at their defaults values.
 - There is also an end-to-end test that runs in exhaustive
   mode that runs a SQL with close to 250,000 expressions.

Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
---
M be/src/service/impala-server.cc
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/generate_error_codes.py
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M tests/common/impala_connection.py
M tests/query_test/test_exprs.py
13 files changed, 381 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/14012/6
--
To view, visit http://gerrit.cloudera.org:8080/14012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Gerrit-Change-Number: 14012
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-4551: Limit the size of SQL statements

2019-08-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14012 )

Change subject: IMPALA-4551: Limit the size of SQL statements
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14012/5/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/14012/5/be/src/service/query-options.h@185
PS5, Line 185:   static const int64_t SPILLABLE_BUFFER_LIMIT = 1LL << 40; // 1 
TB
> nit: not related to your change but indents off?
Went ahead and fixed this



--
To view, visit http://gerrit.cloudera.org:8080/14012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Gerrit-Change-Number: 14012
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:27:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14026 )

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4242/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:22:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Joe McDonnell (Code Review)
Hello Michael Ho, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14026

to look at the new patch set (#8).

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..

IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

In the original change for consistent scheduling, if a cluster has
fewer nodes than the number of remote executor candidates, then
the scheduler falls back to using the old SelectRemoteExecutor().
SelectRemoteExecutor() considers all backends and picks the backend
with the least assigned bytes; to break ties, it uses randomness.
This means that clusters with fewer backends than
num_remote_executor_candidates do not have consistent placement.

For the file handle cache (the original user of consistent
placement), this is not a major problem. However, for data caching,
it can result in slower warm up of the data cache and greater
duplication of the same data across different nodes.

This changes the algorithm to use consistent placement even for
small clusters (num nodes <= num_remote_executor_candidates).
To make this more predictable, it increases the maximum number
of iterations.

This also changes GetRemoteExecutorCandidates() to return the
candidates in the order that they were selected. While still
using a set for detecting duplicate backends, the vector of
distinct backends is constructed directly rather than by
iterating over the set.

Testing:
 - Modify the scheduler-test backend test to verify that small
   clusters use consistent scheduling.

Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
---
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
2 files changed, 85 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/14026/8
--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 8
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14026 )

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14026/7/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/14026/7/be/src/scheduling/scheduler.cc@823
PS7, Line 823: if (distinct_backends.size() != num_distinct_before) {
> std::unordered_set actually returns whether an insertion took place: https:
Good point, that's a much better way to do this. Switched this over to use the 
return value from insert.



--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 7
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:16:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] PREVIEW: Add support to run tests over HTTP/HS2

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14059 )

Change subject: PREVIEW: Add support to run tests over HTTP/HS2
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4241/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14059
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7156558071781378fcb9c8941c0f4dd82eb0d018
Gerrit-Change-Number: 14059
Gerrit-PatchSet: 1
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:04:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4551: Limit the size of SQL statements

2019-08-13 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14012 )

Change subject: IMPALA-4551: Limit the size of SQL statements
..


Patch Set 5: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14012/5/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/14012/5/be/src/service/query-options.h@185
PS5, Line 185:   static const int64_t SPILLABLE_BUFFER_LIMIT = 1LL << 40; // 1 
TB
nit: not related to your change but indents off?



--
To view, visit http://gerrit.cloudera.org:8080/14012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Gerrit-Change-Number: 14012
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Wed, 14 Aug 2019 00:02:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14026 )

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..


Patch Set 7:

(1 comment)

Had one comment, otherwise LGTM

http://gerrit.cloudera.org:8080/#/c/14026/7/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/14026/7/be/src/scheduling/scheduler.cc@823
PS7, Line 823: if (distinct_backends.size() != num_distinct_before) {
std::unordered_set actually returns whether an insertion took place: 
https://en.cppreference.com/w/cpp/container/unordered_set/insert



--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 7
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Tue, 13 Aug 2019 23:58:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8791: Handle the case where there is no fragment scheduled on the coordinator

2019-08-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14058 )

Change subject: IMPALA-8791: Handle the case where there is no fragment 
scheduled on the coordinator
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic4fba02bb7b20553a20634f8c5989d43ba2c0721
Gerrit-Change-Number: 14058
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 13 Aug 2019 23:58:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14026 )

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..


Patch Set 7: Code-Review+1

Carry +1


--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 7
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Tue, 13 Aug 2019 23:48:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Joe McDonnell (Code Review)
Hello Michael Ho, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14026

to look at the new patch set (#6).

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..

IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

In the original change for consistent scheduling, if a cluster has
fewer nodes than the number of remote executor candidates, then
the scheduler falls back to using the old SelectRemoteExecutor().
SelectRemoteExecutor() considers all backends and picks the backend
with the least assigned bytes; to break ties, it uses randomness.
This means that clusters with fewer backends than
num_remote_executor_candidates do not have consistent placement.

For the file handle cache (the original user of consistent
placement), this is not a major problem. However, for data caching,
it can result in slower warm up of the data cache and greater
duplication of the same data across different nodes.

This changes the algorithm to use consistent placement even for
small clusters (num nodes <= num_remote_executor_candidates).
To make this more predictable, it increases the maximum number
of iterations.

This also changes GetRemoteExecutorCandidates() to return the
candidates in the order that they were selected. While still
using a set for detecting duplicate backends, the vector of
distinct backends is constructed directly rather than by
iterating over the set.

Testing:
 - Modify the scheduler-test backend test to verify that small
   clusters use consistent scheduling.

Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
---
M be/src/scheduling/scheduler-test.cc
M be/src/scheduling/scheduler.cc
2 files changed, 84 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/14026/6
--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14026 )

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..


Patch Set 6: Code-Review+1

Carry +1


--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Tue, 13 Aug 2019 23:41:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14026 )

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14026/5/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/14026/5/be/src/scheduling/scheduler.cc@781
PS5, Line 781: set
> Can this be an unordered_set ? It doesn't look like we rely on the order an
Yes, switched this over to an unordered_set and changed it to call reserve() 
with num_candidates.



--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Tue, 13 Aug 2019 23:41:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4551: Limit the size of SQL statements

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14012 )

Change subject: IMPALA-4551: Limit the size of SQL statements
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4240/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Gerrit-Change-Number: 14012
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 13 Aug 2019 23:34:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] PREVIEW: Add support to run tests over HTTP/HS2

2019-08-13 Thread Lars Volker (Code Review)
Lars Volker has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14059


Change subject: PREVIEW: Add support to run tests over HTTP/HS2
..

PREVIEW: Add support to run tests over HTTP/HS2

This change adds support to run backend tests over HTTP using a new
version of Impyla. Currently it only works with a private Impyla branch
and can be used to further test that change.

https://github.com/lekv/impyla/tree/http_transport

Once Impyla releases a new version with the change, we can polish this
change and submit it.

Change-Id: I7156558071781378fcb9c8941c0f4dd82eb0d018
---
M tests/common/impala_connection.py
M tests/common/impala_test_suite.py
M tests/common/test_dimensions.py
M tests/custom_cluster/test_client_ssl.py
M tests/custom_cluster/test_shell_interactive.py
M tests/custom_cluster/test_shell_interactive_reconnect.py
M tests/shell/test_shell_commandline.py
M tests/shell/test_shell_interactive.py
8 files changed, 32 insertions(+), 24 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/14059/1
--
To view, visit http://gerrit.cloudera.org:8080/14059
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7156558071781378fcb9c8941c0f4dd82eb0d018
Gerrit-Change-Number: 14059
Gerrit-PatchSet: 1
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14026 )

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..


Patch Set 5: Code-Review+1

(1 comment)

I will let Lars finish off the review.

http://gerrit.cloudera.org:8080/#/c/14026/5/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/14026/5/be/src/scheduling/scheduler.cc@781
PS5, Line 781: set
Can this be an unordered_set ? It doesn't look like we rely on the order 
anymore, right ?



--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Tue, 13 Aug 2019 23:19:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4551: Limit the size of SQL statements

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14012 )

Change subject: IMPALA-4551: Limit the size of SQL statements
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4784/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/14012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Gerrit-Change-Number: 14012
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 13 Aug 2019 22:59:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4551: Limit the size of SQL statements

2019-08-13 Thread Joe McDonnell (Code Review)
Hello Bharath Vissapragada, Michael Ho, Quanlong Huang, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14012

to look at the new patch set (#4).

Change subject: IMPALA-4551: Limit the size of SQL statements
..

IMPALA-4551: Limit the size of SQL statements

Various BI tools generate and run SQL. When used incorrectly or
misconfigured, the tools can generate extremely large SQLs.
Some of these SQL statements reach 10s of megabytes. Large SQL
statements impose costs throughout execution, including
statement rewrite logic in the frontend and codegen in the
backend. The resource usage of these statements can impact
the stability of the system or the ability to run other SQL
statements.

This implements two new query options that provide controls
to reject large SQL statements.
 - The first, MAX_STATEMENT_LENGTH_BYTES is a cap on the
   total size of the SQL statement (in bytes). It is
   applied before any parsing or analysis. It uses a
   default value of 16MB.
 - The second, STATEMENT_EXPRESSION_LIMIT, is a limit on
   the total number of expressions in a statement or any
   views that it references. The limit is applied upon the
   first round of analysis, but it is not reapplied when
   statement rewrite rules are applied. Certain expressions
   such as literals in IN lists or VALUES clauses are not
   analyzed and do not count towards the limit. It uses
   a default value of 250,000.
The two are complementary. Since enforcing the statement
expression limit requires parsing and analyzing the
statement, the MAX_STATEMENT_LENGTH_BYTES sets an upper
bound on the size of statement that needs to be parsed
and analyzed. Testing confirms that even statements
approaching 16MB get through the first round of analysis
within a few seconds and then are rejected.

This also changes the logging in tests/common/impala_connection.py
to limit the total SQL size that it will print to 128KB. This is
prevents the JUnitXML (which includes this logging) from being too
large. Existing tests do not run SQL larger than about 80KB, so
this only applies to tests added in this change that run multi-MB
SQLs to verify limits.

Testing:
 - This adds frontend tests that verify the low level
   semantics about how expressions are counted and verifies
   that the expression limits are enforced.
 - This adds end-to-end tests that verify both the
   MAX_STATEMENT_LENGTH_BYTES and STATEMENT_EXPRESSION_LIMIT
   at their defaults values.
 - There is also an end-to-end test that runs in exhaustive
   mode that runs a SQL with close to 250,000 expressions.

Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
---
M be/src/service/impala-server.cc
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/generate_error_codes.py
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M tests/common/impala_connection.py
M tests/query_test/test_exprs.py
13 files changed, 379 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/14012/4
--
To view, visit http://gerrit.cloudera.org:8080/14012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5675fb4a08c1dc51ae5bcf467cbb969cc064602c
Gerrit-Change-Number: 14012
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-8791: Handle the case where there is no fragment scheduled on the coordinator

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14058 )

Change subject: IMPALA-8791: Handle the case where there is no fragment 
scheduled on the coordinator
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4239/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic4fba02bb7b20553a20634f8c5989d43ba2c0721
Gerrit-Change-Number: 14058
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 13 Aug 2019 22:39:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8791: Handle the case where there is no fragment scheduled on the coordinator

2019-08-13 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14058 )

Change subject: IMPALA-8791: Handle the case where there is no fragment 
scheduled on the coordinator
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14058/1/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/14058/1/tests/custom_cluster/test_admission_controller.py@551
PS1, Line 551: # Make sure query execution works perfectly for a query that 
does not have any
 : # fragments schdeuled on the coordinator, but has 
runtime-filters that need to be
 : # aggregated at the coordinator.
removed the DCHECK in Coordinator::BackendState::PublishFilter and added this 
test that have triggered that dcheck.



--
To view, visit http://gerrit.cloudera.org:8080/14058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic4fba02bb7b20553a20634f8c5989d43ba2c0721
Gerrit-Change-Number: 14058
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 13 Aug 2019 22:01:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8791: Handle the case where there is no fragment scheduled on the coordinator

2019-08-13 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14058


Change subject: IMPALA-8791: Handle the case where there is no fragment 
scheduled on the coordinator
..

IMPALA-8791: Handle the case where there is no fragment scheduled on
the coordinator

This patch fixes a bug where if an insert or CTAS query has no
fragments scheduled on the coordinator and a mem limit is to be
enforced on the query (either through query option or automatically
through estimates) then the same limit is also applied to the
coordinator backend even though it does not execute anything.

Highlights:
- coord_backend_mem_to_admit_/mem_limit will always refer to the memory
to admit/limit for the coordinator regardless of which fragments are
scheduled on it.

- There will always be a BackendExecParams added for the coordinator
because coordinator always spawns a QueryState object with a mem_tracker
for tracking runtime filter mem and the result set cache. For the case
where this BackendExecParams is empty (no instances scheduled) it would
ensure that some minimal amount of memory is accounted for by the
admission controller and the right mem limit is applied to the
QueryState spawned by the coordinator

- added changes to Coordinator and Coordinator::BackendState classes
to handle an empty BackendExecParams object

Testing:
The following cases need to be tested where the kind of fragments
schduled on the coordinator backend are:
1. Coordinator fragment + other exec fragments
2. Coordinator fragment only
3. other exec fragments only (eg. insert into values OR insert
   into select 1)
4. No fragments, but coordinator still creates a QueryState

Case 1 is covered by tests working with non-dedicated coordinators.
Rest are covered by test_dedicated_coordinator_planner_estimates and
test_sanity_checks_dedicated_coordinator in
test_admission_controller.py

Change-Id: Ic4fba02bb7b20553a20634f8c5989d43ba2c0721
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M fe/src/main/java/org/apache/impala/planner/Planner.java
M 
testdata/workloads/functional-query/queries/QueryTest/dedicated-coord-mem-estimates.test
M tests/custom_cluster/test_admission_controller.py
13 files changed, 167 insertions(+), 125 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/14058/1
--
To view, visit http://gerrit.cloudera.org:8080/14058
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic4fba02bb7b20553a20634f8c5989d43ba2c0721
Gerrit-Change-Number: 14058
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 


[Impala-ASF-CR] IMPALA-8854: skip test acid insert

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14056 )

Change subject: IMPALA-8854: skip test_acid_insert
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4238/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3bc5cce8b9c3843b5bb8ac4d29b2219411f671b4
Gerrit-Change-Number: 14056
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 21:49:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8474: [DOCS] Impala 3.3 Release Notes

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14054 )

Change subject: IMPALA-8474: [DOCS] Impala 3.3 Release Notes
..


Patch Set 2: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/452/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/14054
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bb1944c57866b797b1856af139800445e4ac18
Gerrit-Change-Number: 14054
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 21:42:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8474: [DOCS] Impala 3.3 Release Notes

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14054 )

Change subject: IMPALA-8474: [DOCS] Impala 3.3 Release Notes
..


Patch Set 2:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/452/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/14054
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bb1944c57866b797b1856af139800445e4ac18
Gerrit-Change-Number: 14054
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 21:20:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8474: [DOCS] Impala 3.3 Release Notes

2019-08-13 Thread Alex Rodoni (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14054

to look at the new patch set (#2).

Change subject: IMPALA-8474: [DOCS] Impala 3.3 Release Notes
..

IMPALA-8474: [DOCS] Impala 3.3 Release Notes

Change-Id: I02bb1944c57866b797b1856af139800445e4ac18
---
M docs/impala_keydefs.ditamap
M docs/topics/impala_incompatible_changes.xml
M docs/topics/impala_new_features.xml
3 files changed, 799 insertions(+), 951 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/14054/2
--
To view, visit http://gerrit.cloudera.org:8080/14054
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I02bb1944c57866b797b1856af139800445e4ac18
Gerrit-Change-Number: 14054
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8854: skip test acid insert

2019-08-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14056


Change subject: IMPALA-8854: skip test_acid_insert
..

IMPALA-8854: skip test_acid_insert

The test is failing because of a Hive version change in some
configurations. Disabling for now until it can be fixed.

Change-Id: I3bc5cce8b9c3843b5bb8ac4d29b2219411f671b4
---
M tests/query_test/test_insert.py
1 file changed, 1 insertion(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/14056/1
--
To view, visit http://gerrit.cloudera.org:8080/14056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3bc5cce8b9c3843b5bb8ac4d29b2219411f671b4
Gerrit-Change-Number: 14056
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8854: skip test acid insert

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14056 )

Change subject: IMPALA-8854: skip test_acid_insert
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4783/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3bc5cce8b9c3843b5bb8ac4d29b2219411f671b4
Gerrit-Change-Number: 14056
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 21:10:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8829: [DOCS] Document limitation of parsing "TB" string

2019-08-13 Thread Alex Rodoni (Code Review)
Alex Rodoni has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14055 )

Change subject: IMPALA-8829: [DOCS] Document limitation of parsing "TB" string
..

IMPALA-8829: [DOCS] Document limitation of parsing "TB" string

Change-Id: I44a8d789d8dc456a57b759c697eb2e4227fbefee
Reviewed-on: http://gerrit.cloudera.org:8080/14055
Reviewed-by: Alex Rodoni 
Tested-by: Impala Public Jenkins 
Reviewed-by: Michael Ho 
---
M docs/topics/impala_data_cache.xml
M docs/topics/impala_known_issues.xml
2 files changed, 18 insertions(+), 3 deletions(-)

Approvals:
  Alex Rodoni: Looks good to me, approved
  Impala Public Jenkins: Verified
  Michael Ho: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/14055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I44a8d789d8dc456a57b759c697eb2e4227fbefee
Gerrit-Change-Number: 14055
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-8829: [DOCS] Document limitation of parsing "TB" string

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14055 )

Change subject: IMPALA-8829: [DOCS] Document limitation of parsing "TB" string
..


Patch Set 1: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/451/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/14055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I44a8d789d8dc456a57b759c697eb2e4227fbefee
Gerrit-Change-Number: 14055
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 19:59:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8829: [DOCS] Document limitation of parsing "TB" string

2019-08-13 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14055 )

Change subject: IMPALA-8829: [DOCS] Document limitation of parsing "TB" string
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I44a8d789d8dc456a57b759c697eb2e4227fbefee
Gerrit-Change-Number: 14055
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 19:39:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8829: [DOCS] Document limitation of parsing "TB" string

2019-08-13 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14055


Change subject: IMPALA-8829: [DOCS] Document limitation of parsing "TB" string
..

IMPALA-8829: [DOCS] Document limitation of parsing "TB" string

Change-Id: I44a8d789d8dc456a57b759c697eb2e4227fbefee
---
M docs/topics/impala_data_cache.xml
M docs/topics/impala_known_issues.xml
2 files changed, 18 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/14055/1
--
To view, visit http://gerrit.cloudera.org:8080/14055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I44a8d789d8dc456a57b759c697eb2e4227fbefee
Gerrit-Change-Number: 14055
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-8829: [DOCS] Document limitation of parsing "TB" string

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14055 )

Change subject: IMPALA-8829: [DOCS] Document limitation of parsing "TB" string
..


Patch Set 1:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/451/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/14055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I44a8d789d8dc456a57b759c697eb2e4227fbefee
Gerrit-Change-Number: 14055
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 19:38:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8160: [DOCS] Document CAST (...FORMAT..) function

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14044 )

Change subject: IMPALA-8160: [DOCS] Document CAST (...FORMAT..) function
..


Patch Set 3: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/450/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/14044
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6235ffd03ac56e648552058ff02491a55289c092
Gerrit-Change-Number: 14044
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 19:34:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8160: [DOCS] Document CAST (...FORMAT..) function

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14044 )

Change subject: IMPALA-8160: [DOCS] Document CAST (...FORMAT..) function
..


Patch Set 3:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/450/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/14044
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6235ffd03ac56e648552058ff02491a55289c092
Gerrit-Change-Number: 14044
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 19:13:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8160: [DOCS] Document CAST (...FORMAT..) function

2019-08-13 Thread Alex Rodoni (Code Review)
Hello Gabor Kaszab, Attila Jeges, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14044

to look at the new patch set (#3).

Change subject: IMPALA-8160: [DOCS] Document CAST (...FORMAT..) function
..

IMPALA-8160: [DOCS] Document CAST (...FORMAT..) function

- Added the Date and Timestamp patterns supported for the new CAST
  signature.

Change-Id: I6235ffd03ac56e648552058ff02491a55289c092
---
M docs/topics/impala_conversion_functions.xml
1 file changed, 716 insertions(+), 65 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/14044/3
--
To view, visit http://gerrit.cloudera.org:8080/14044
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6235ffd03ac56e648552058ff02491a55289c092
Gerrit-Change-Number: 14044
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8160: [DOCS] Document CAST (...FORMAT..) function

2019-08-13 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14044 )

Change subject: IMPALA-8160: [DOCS] Document CAST (...FORMAT..) function
..


Patch Set 2:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml
File docs/topics/impala_conversion_functions.xml:

http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@166
PS2, Line 166: The patterns supported in the FORMAT clause are 
not the same format
 : patterns used with the other Impala conversion 
functions, e.g.
> I would mention here that this format is introduced to support ISO:SQL:2016
Done


http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@192
PS2, Line 192: If a fewer number of digits in
> nit: "If fewer number of digits  in"
Done


http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@193
PS2, Line 193:  the current date is used to complete
> current date is only used for completing the year pattern.
Done


http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@303
PS2, Line 303: The year is adjusted to be nearest to the nearest century to the 
current
 : date:
> Here I'd list all the possible use-cases. For that there is a link to an Or
Done


http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@307
PS2, Line 307: For string to datetime semantics, round year when last 2 digits 
of
 : current year is greater than 49.
> Not sure what the meaning of this sentence is.
Removed and replaced with the possible cases above.


http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@538
PS2, Line 538: Must be specified at the end of the 
pattern string
> TZH and TZM can be anywhere in the pattern. TZM don't even has to be right
Done


http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@545
PS2, Line 545: -15 and 15
> The design doc incorrectly mentions this limit. In fact Impala parses the o
Done


http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@572
PS2, Line 572: Unsigned numbers between 0 and 59 are allowed for the source
 : expression.
> There is no range check for TZM. Similarly to TZH after parsin Impala ignor
Done


http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@628
PS2, Line 628:  t
> typo
Done


http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@635
PS2, Line 635: CAST(“2019-.
 : ;101” AS DATE FORMAT “-MM-DD”)
> This was a typo in the design doc. Actually this is the query that would su
Done


http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@642
PS2, Line 642: ph>T Please mention that this is for accepting ISO8601 datetime formats.
Done


http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@655
PS2, Line 655: >Z
> Same comment as for "T"
Done


http://gerrit.cloudera.org:8080/#/c/14044/2/docs/topics/impala_conversion_functions.xml@668
PS2, Line 668: Examples:
> It's great to have a few examples here. I'd add one more to have some cover
Done



--
To view, visit http://gerrit.cloudera.org:8080/14044
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6235ffd03ac56e648552058ff02491a55289c092
Gerrit-Change-Number: 14044
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 19:12:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN STATS ACCURATE from properties

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..

IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties

Hive depends on property COLUMN_STATS_ACCURATE to tell if the
stored statistics accurate. After Impala inserts data, it does
not set statistics values up-to-date(for example numRows).
Impala should unset COLUMN_STATS_ACCURATE to tell Hive the
stored stats are no longer accurate.
The patch impletes:
After Impala insert data,
Remove COLUMN_STATS_ACCURATE from table properties if it exists
Remove COLUMN_STATS_ACCURATE from partition params if it exists
Add helper methods to handle alter table/partition for acid
tables.

Implements the stats changes above for both acid/non-acid tables.

Tests:
Manual tests.
Run core tests.
Add ee tests to test interop with Hive for acid/external tables.

Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Reviewed-on: http://gerrit.cloudera.org:8080/14037
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A 
testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test
A testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
M tests/query_test/test_acid.py
6 files changed, 340 insertions(+), 5 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/14037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 8
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN STATS ACCURATE from properties

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 7
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 18:55:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7375: [DOCS] Added DATE functions

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13996 )

Change subject: IMPALA-7375: [DOCS] Added DATE functions
..


Patch Set 3: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/449/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/13996
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1463731f9ee7fb9ec80d6be2458cffe5c42464b6
Gerrit-Change-Number: 13996
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 18:37:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8752: Added Jaro-Winkler edit distance and similarity built-in function

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13870 )

Change subject: IMPALA-8752: Added Jaro-Winkler edit distance and similarity 
built-in function
..

IMPALA-8752: Added Jaro-Winkler edit distance and similarity built-in function

The added functions return the Jaro/Jaro-Winkler similarity/distance
of two strings. The algorithm calcuates the Jaro-Similarity of the
strings, then adds more weight to the result if there are
common prefixes. (Jaro-Winkler)
For more detail, see:
https://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance

Extended the algorithm with another optional parameter: boost threshold
The prefix weight will only be applied if the Jaro-similarity
exceeds the given threshold. By default, its value is 0.7.

The new built-in functions are:
 * jaro_distance, jaro_dst
 * jaro_similarity, jaro_sim
 * jaro_winkler_distance, jw_dst
 * jaro_winkler_similarity, jw_sim

Testing:
 * Added unit tests to expr-test.cc
 * Manual testing over 1400 word pairs from
   http://marvin.cs.uidaho.edu/misspell.html
   Results match Apache commons

Change-Id: I64d7f461516c5e66cc27d62612bc8cc0e8f0178c
Reviewed-on: http://gerrit.cloudera.org:8080/13870
Reviewed-by: Zoltan Borok-Nagy 
Tested-by: Impala Public Jenkins 
---
M be/src/exprs/expr-test.cc
M be/src/exprs/string-functions-ir.cc
M be/src/exprs/string-functions.h
M common/function-registry/impala_functions.py
4 files changed, 323 insertions(+), 0 deletions(-)

Approvals:
  Zoltan Borok-Nagy: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/13870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I64d7f461516c5e66cc27d62612bc8cc0e8f0178c
Gerrit-Change-Number: 13870
Gerrit-PatchSet: 11
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8752: Added Jaro-Winkler edit distance and similarity built-in function

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13870 )

Change subject: IMPALA-8752: Added Jaro-Winkler edit distance and similarity 
built-in function
..


Patch Set 10: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64d7f461516c5e66cc27d62612bc8cc0e8f0178c
Gerrit-Change-Number: 13870
Gerrit-PatchSet: 10
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 18:25:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8571[WIP]: improve QueryEventHook execution

2019-08-13 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13748 )

Change subject: IMPALA-8571[WIP]: improve QueryEventHook execution
..


Patch Set 18:

(11 comments)

I think this going in the right direction

http://gerrit.cloudera.org:8080/#/c/13748/18//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13748/18//COMMIT_MSG@31
PS18, Line 31:
Do you want to discuss --query_event_hook_use_daemon_threads here? How would a 
user determine what value to use for that?


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java
File 
fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@51
PS18, Line 51:  * Rejections
This javadoc is great, and just the sort of thing I was hoping to see.
Are you expecting to publish the html that is generated from
 this javadoc? If not then I think Impala code generally does not include the 
formatting stuff like . I think we generally emphasize readability in 
the editor.


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@102
PS18, Line 102:  * ${hookClass}.${method}.execution.exceptions
I think the metrics now have the "query-event-hook" prefix.
I'm also unclear about the exact meanings of hookClass and method. Detail in 
metrics is good, but so is predictability.
If I can't predict the metrics name then it is harder to write tooling that 
uses it. I can see this both ways, what do you think?


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@176
PS18, Line 176: // ArrayBlockingQueue contructor performs bounds-check on 
queue size for us
typo: constructor


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@184
PS18, Line 184: // this executor cancels any hook tasks that
This may seem picky but in Imapla we like comments with Capitals at the 
beginning and periods at the end.


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@255
PS18, Line 255:* {@link ExecutionException}.  This behavior is consistent 
with how futures normally
We could be more concise by not saying this about how Futures normally behave, 
as that's what we expect, right?


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/main/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutor.java@286
PS18, Line 286:   LOG.warn("QueryEventHook {}.{} execution rejected because 
the " +
I  sometimes weaselly write "probably because" as you never know :-)
I know executor,getActiveCount() is only a snapshot but it might show something 
interesting.


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java
File 
fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java:

http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java@1
PS18, Line 1: /**
Use // style comments for the apache license please


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java@49
PS18, Line 49:   new QueryCompleteContext("whatever");
lol


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java@80
PS18, Line 80:   // make this longer to reasonably guarantee a timeout
Is this a FIXME note, not sure I understand what this keans


http://gerrit.cloudera.org:8080/#/c/13748/18/fe/src/test/java/org/apache/impala/hooks/FixedCapacityQueryHookExecutorTest.java@115
PS18, Line 115:   if (expected.getCause() instanceof TimeoutException) {
if (!(expected.getCause() instanceof TimeoutException)) {
fail("TimeoutException expected but not thrown");
}



--
To view, visit http://gerrit.cloudera.org:8080/13748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb88422f7cfe86947d11ce57d2b4c63e57d1b643
Gerrit-Change-Number: 13748
Gerrit-PatchSet: 18
Gerrit-Owner: radford nguyen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: radford nguyen 
Gerrit-Comment-Date: Tue, 13 Aug 2019 18:16:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8474: [DOCS] Impala 3.3 Release Notes

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14054 )

Change subject: IMPALA-8474: [DOCS] Impala 3.3 Release Notes
..


Patch Set 1: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/448/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/14054
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bb1944c57866b797b1856af139800445e4ac18
Gerrit-Change-Number: 14054
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 18:14:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7375: [DOCS] Added DATE functions

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13996 )

Change subject: IMPALA-7375: [DOCS] Added DATE functions
..


Patch Set 3:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/449/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/13996
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1463731f9ee7fb9ec80d6be2458cffe5c42464b6
Gerrit-Change-Number: 13996
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 18:14:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7375: [DOCS] Added DATE functions

2019-08-13 Thread Alex Rodoni (Code Review)
Hello Zoltan Borok-Nagy, Attila Jeges, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13996

to look at the new patch set (#3).

Change subject: IMPALA-7375: [DOCS] Added DATE functions
..

IMPALA-7375: [DOCS] Added DATE functions

Change-Id: I1463731f9ee7fb9ec80d6be2458cffe5c42464b6
---
M docs/topics/impala_datetime_functions.xml
1 file changed, 1,057 insertions(+), 1,502 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/13996/3
--
To view, visit http://gerrit.cloudera.org:8080/13996
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1463731f9ee7fb9ec80d6be2458cffe5c42464b6
Gerrit-Change-Number: 13996
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-7375: [DOCS] Added DATE functions

2019-08-13 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13996 )

Change subject: IMPALA-7375: [DOCS] Added DATE functions
..


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml
File docs/topics/impala_datetime_functions.xml:

http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@405
PS2, Line 405: Purpose: Alias for the NOW() function.
 :   
 : Return type: DATE
> CURRENT_DATE() is not an alias for NOW(), the former returns a DATE, while
Done


http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@814
PS2, Line 814: TIMESTAMP
> TIMESTAMP or DATE
Done


http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@848
PS2, Line 848: DATE'2019-08-02
> Closing apostrophe is missing.
Done


http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@849
PS2, Line 849: 2000-01-01 00:00:00
> This should be 2001-01-01.
Done


http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@1002
PS2, Line 1002: startdate
> Parameter is called 'date' in L1007
Done


http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@1960
PS2, Line 1960: MONTHS_SUB(DATE'2019-03-01',
  : 1) returns 
DATE'2019-02-28'.
> A better example would be MONTHS_SUB(DATE'2019-02-28', 1) returning 2019-01
Done


http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@3296
PS2, Line 3296: If date is not valid, e.g. 
DATE'2019-03-32',
  : returns NULL.
> Again, the invalid date literal ill return an analysis error:
Done



--
To view, visit http://gerrit.cloudera.org:8080/13996
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1463731f9ee7fb9ec80d6be2458cffe5c42464b6
Gerrit-Change-Number: 13996
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 18:12:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14038 )

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..


Patch Set 12: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 12
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 18:07:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14038 )

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..

IMPALA-8823: DROP TABLE support for insert-only ACID tables

Enhances Impala to be able to drop insert-only transactional tables.
In order to do this Impala acquires an exclusive table lock in HMS
before performing the drop operation and releases the lock once
dropping the table finished.
INSERT statement does the locking and heartbeating on coordinator
side but for DROP TABLE all of these are done from Catalog side. This
means that alongside Impala coordinators now Catalog also does
heartbeating towards HMS.

Testing:
 - E2E test: Dropped a table, re-created it and dropped again to
   check if no locks remained in HMS.
 - E2E test: After dropping a table from Impala checked if Hive also
   sees it being dropped.
 - Manual test: With a hacked Impala that runs a drop table long
   enough I checked that there is a table lock entry in HMS during the
   execution and disappears once the query finishes.

Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Reviewed-on: http://gerrit.cloudera.org:8080/14038
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
R fe/src/main/java/org/apache/impala/common/TransactionKeepalive.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M testdata/workloads/functional-query/queries/QueryTest/acid.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
11 files changed, 205 insertions(+), 40 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 13
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-7984: Port runtime filter from Thrift RPC to KRPC

2019-08-13 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13882 )

Change subject: IMPALA-7984: Port runtime filter from Thrift RPC to KRPC
..


Patch Set 14:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/13882/14//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13882/14//COMMIT_MSG@14
PS14, Line 14: respctively
typo


http://gerrit.cloudera.org:8080/#/c/13882/14/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/13882/14/be/src/runtime/coordinator.cc@1089
PS14, Line 1089:   if (params->bloom_filter().has_directory_sidecar_idx()) {
Is it actually possible that this won't be set or can we DCHECK this? (I think 
its possible with the way you have the code structured right now, but we might 
want to change that, see my other comments)

If it is possible, do we have the right behavior here? I guess what will happen 
is we'll proceed to updating the bloom filter with an empty directory, which 
could for example hit a DCHECK in BloomFilter::Or which expects the two 
directories to be the same size.


http://gerrit.cloudera.org:8080/#/c/13882/14/be/src/runtime/coordinator.cc@1099
PS14, Line 1099: .ToString()
This ToString() performs a copy of the data. Just call 'size()' on the slice 
directly.


http://gerrit.cloudera.org:8080/#/c/13882/14/be/src/runtime/coordinator.cc@1110
PS14, Line 1110:   bloom_filter_directory_ = sidecar_slice.ToString();
I think this will perform two copies - one to do ToString() and one to do the 
assignment.

I think we should be able to do this without any copies by using 'std::swap'. 
You may need to make bloom_filter_directory_ a char* to make it work


http://gerrit.cloudera.org:8080/#/c/13882/14/be/src/runtime/runtime-filter-bank.cc
File be/src/runtime/runtime-filter-bank.cc:

http://gerrit.cloudera.org:8080/#/c/13882/14/be/src/runtime/runtime-filter-bank.cc@a103
PS14, Line 103:
keep this comment


http://gerrit.cloudera.org:8080/#/c/13882/14/be/src/runtime/runtime-filter-bank.cc@208
PS14, Line 208: if (params->has_bloom_filter())
I think this is already guaranteed by the DCHECK above


http://gerrit.cloudera.org:8080/#/c/13882/14/be/src/util/bloom-filter.h
File be/src/util/bloom-filter.h:

http://gerrit.cloudera.org:8080/#/c/13882/14/be/src/util/bloom-filter.h@46
PS14, Line 46: using kudu::Slice;
please don't use 'using' in header files


http://gerrit.cloudera.org:8080/#/c/13882/14/be/src/util/bloom-filter.h@90
PS14, Line 90:   Bucket* GetDirectory() const { return directory_; }
Hmm, definitely preferable to keep all of this stuff private.

I think the only reason you exposed this is for 
bloom-filter-test.cc/bloom-filter-benchmark.cc? If so, you should be able to 
just make the testing classes friends of this class

For bloom-filter-test.cc that should be easy, just 'friend class 
BloomFilterTest'. Its a little more complicated for bloom-filter-benchmark.cc 
since there's multiple classes there, but maybe you can add a single super 
class for them? Or else just list all three with a comment about what they are.

If you run into issues with that, another option is to have ToProtobuf take a 
'Slice*' or whatever that it sets to the directory instead of take the 
RpcController.


http://gerrit.cloudera.org:8080/#/c/13882/14/be/src/util/bloom-filter.cc
File be/src/util/bloom-filter.cc:

http://gerrit.cloudera.org:8080/#/c/13882/14/be/src/util/bloom-filter.cc@99
PS14, Line 99: LOG(ERROR) << "Cannot add outbound sidecar: " << 
sidecar_status.message().ToString();
What should we do in this case? Right now, callers of this function won't know 
that the error happened, so they'll go ahead and send the BloomFilterPB even 
though its not really valid, since it doesn't have a directory or 
always_true/false set.

How about in this case we 'disable' the bloom filter, i.e. set always_true = 
true. Then over in the coordinator we'll know that if always_false = 
always_true = false it must be the case that sidecar_idx is set.



--
To view, visit http://gerrit.cloudera.org:8080/13882
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b394796d250286510e157ae326882bfc01d387a
Gerrit-Change-Number: 13882
Gerrit-PatchSet: 14
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 13 Aug 2019 18:05:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8474: [DOCS] Impala 3.3 Release Notes

2019-08-13 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14054


Change subject: IMPALA-8474: [DOCS] Impala 3.3 Release Notes
..

IMPALA-8474: [DOCS] Impala 3.3 Release Notes

Change-Id: I02bb1944c57866b797b1856af139800445e4ac18
---
M docs/impala_keydefs.ditamap
M docs/topics/impala_new_features.xml
2 files changed, 137 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/14054/1
--
To view, visit http://gerrit.cloudera.org:8080/14054
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I02bb1944c57866b797b1856af139800445e4ac18
Gerrit-Change-Number: 14054
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-8474: [DOCS] Impala 3.3 Release Notes

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14054 )

Change subject: IMPALA-8474: [DOCS] Impala 3.3 Release Notes
..


Patch Set 1:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/448/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/14054
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bb1944c57866b797b1856af139800445e4ac18
Gerrit-Change-Number: 14054
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 17:53:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] Clean up stress tests in core

2019-08-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14002 )

Change subject: Clean up stress tests in core
..

Clean up stress tests in core

All stress tests in core were skipped or xfailed. Delete
the tests and avoid pytest startup overhead (starting up
the 64 workers took a significant amount of time)

Change-Id: Icc8d948a3a95bd964a7acbe5722f01891a248f11
Reviewed-on: http://gerrit.cloudera.org:8080/14002
Tested-by: Impala Public Jenkins 
Reviewed-by: David Knupp 
---
M bin/run-all-tests.sh
D tests/stress/test_mini_stress.py
2 files changed, 5 insertions(+), 104 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  David Knupp: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/14002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icc8d948a3a95bd964a7acbe5722f01891a248f11
Gerrit-Change-Number: 14002
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6663 Expose current DDL metrics on WebUI

2019-08-13 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13806 )

Change subject: IMPALA-6663 Expose current DDL metrics on WebUI
..


Patch Set 7:

(4 comments)

I think this in general is a very useful addition. Can you also specify the 
motivation of why we need to track the total number of DDLs instead of tracking 
them on a per table level. Certain metadata operations like invalidate metadata 
as indeed global and cannot be tracked at table. However, I think it useful to 
track how many refresh or invalidates were called on a given table and finding 
top-n such tables. Similarly for DMLs, instead of a getting a total count of 
INSERTS, would it be more useful to get top-N tables which are inserted into?

http://gerrit.cloudera.org:8080/#/c/13806/7/fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOpRequestCounter.java
File 
fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOpRequestCounter.java:

http://gerrit.cloudera.org:8080/#/c/13806/7/fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOpRequestCounter.java@41
PS7, Line 41:   private AtomicLong resetMetadataCounter_;
:
:   private AtomicLong dmlOperationCounter_;
:
:   private AtomicLong syncDdlOperationCounter_;
Can you add a one line comment on each of these counters as to what do they 
represent? Also, would be great if you could specify what operations constitute 
a DML in this context.


http://gerrit.cloudera.org:8080/#/c/13806/7/fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOpRequestCounter.java@70
PS7, Line 70: decrementDdlTypeCounter
Is decrement ever needed? same for the other counters?


http://gerrit.cloudera.org:8080/#/c/13806/7/fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOpRequestCounter.java@109
PS7, Line 109: getResetMetadataCounter
By returning the AtomicLong object this class is leaking the private field and 
it is subject to modification by other classes. A more common pattern is to 
return its value (return type long).


http://gerrit.cloudera.org:8080/#/c/13806/7/www/catalog.tmpl
File www/catalog.tmpl:

http://gerrit.cloudera.org:8080/#/c/13806/7/www/catalog.tmpl@202
PS7, Line 202: Running
Is this user-visible text? Running seems to suggest these operations are 
currently being run.



--
To view, visit http://gerrit.cloudera.org:8080/13806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a
Gerrit-Change-Number: 13806
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 13 Aug 2019 17:24:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6663 Expose current DDL metrics on WebUI

2019-08-13 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13806 )

Change subject: IMPALA-6663 Expose current DDL metrics on WebUI
..


Patch Set 7:

(2 comments)

> (4 comments)
 >
 > I think this in general is a very useful addition. Can you also
 > specify the motivation of why we need to track the total number of
 > DDLs instead of tracking them on a per table level. Certain
 > metadata operations like invalidate metadata as indeed global and
 > cannot be tracked at table. However, I think it useful to track how
 > many refresh or invalidates were called on a given table and
 > finding top-n such tables. Similarly for DMLs, instead of a getting
 > a total count of INSERTS, would it be more useful to get top-N
 > tables which are inserted into?

Took a second pass and realized that you are indeed tracking currently running 
DDL ops. The question in such a case is how do we determine if seeing 3 
resetMetadata operations on 3 small tables is better or worse than 1 
resetMetadata operation on a large table. Can you please comment on how should 
the end user interpret these values to determine the load on the catalog?

http://gerrit.cloudera.org:8080/#/c/13806/7/fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOpRequestCounter.java
File 
fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOpRequestCounter.java:

http://gerrit.cloudera.org:8080/#/c/13806/7/fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOpRequestCounter.java@70
PS7, Line 70: decrementDdlTypeCounter
> Is decrement ever needed? same for the other counters?
you may disregard this comment since I thought you were only tracking the 
global count of operations instead of running ones.


http://gerrit.cloudera.org:8080/#/c/13806/7/www/catalog.tmpl
File www/catalog.tmpl:

http://gerrit.cloudera.org:8080/#/c/13806/7/www/catalog.tmpl@202
PS7, Line 202: Running
> Is this user-visible text? Running seems to suggest these operations are cu
I think I misunderstood the patch a little bit. Looks like we are indeed 
tracking the running operations. You may discard this comment.



--
To view, visit http://gerrit.cloudera.org:8080/13806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ed76f134bad6d3b3d4dce132365a53a01e9512a
Gerrit-Change-Number: 13806
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 13 Aug 2019 17:32:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Clean up stress tests in core

2019-08-13 Thread David Knupp (Code Review)
David Knupp has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14002 )

Change subject: Clean up stress tests in core
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14002/1/tests/stress/test_mini_stress.py
File tests/stress/test_mini_stress.py:

http://gerrit.cloudera.org:8080/#/c/14002/1/tests/stress/test_mini_stress.py@a1
PS1, Line 1:
> All of the individual tests in this file either have an xfail or a strip ma
OK, just wanted to double check. I wasn't initially sure what run=False meant 
in the xfail cases -- didn't know if that was our code, or pytest code.



--
To view, visit http://gerrit.cloudera.org:8080/14002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icc8d948a3a95bd964a7acbe5722f01891a248f11
Gerrit-Change-Number: 14002
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 13 Aug 2019 17:26:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8755: Frontend support for Z-ordering

2019-08-13 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13955 )

Change subject: IMPALA-8755: Frontend support for Z-ordering
..


Patch Set 8:

(13 comments)

Had a couple of minor comments, nothing serious. Otherwise looks pretty good.

http://gerrit.cloudera.org:8080/#/c/13955/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13955/8//COMMIT_MSG@12
PS8, Line 12: algorithm
nit: order


http://gerrit.cloudera.org:8080/#/c/13955/8//COMMIT_MSG@21
PS8, Line 21: queries
statements


http://gerrit.cloudera.org:8080/#/c/13955/8//COMMIT_MSG@25
PS8, Line 25: For strings, varchars, floats and doubles Z-ordering is not 
supported.
You could mention that zorder is not suitable for string and varchars, but 
support for floats and doubles can be added later.


http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java:

http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java@221
PS8, Line 221: AlterTableSortByStmt.TBL_PROP_SORT_ORDER));
nit: missing spaces from the beginning


http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/AlterTableSortByStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableSortByStmt.java:

http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/AlterTableSortByStmt.java@37
PS8, Line 37: ZORDER
nit: ORDER, or LEXICAL|ZORDER


http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/AlterTableSortByStmt.java@65
PS8, Line 65:
nit: extra space


http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java:

http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java@51
PS8, Line 51:   private final TSortingOrder sortingOrder_;
nit: you could place it after 'sortColumns_'


http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java@64
PS8, Line 64: List of
nit: Pair of ...


http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@113
PS8, Line 113:   public TSortingOrder getSortingOrder() {
 : return tableDef_.getSortingOrder();
 :   }
nit: fits single line


http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/SortInfo.java
File fe/src/main/java/org/apache/impala/analysis/SortInfo.java:

http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/SortInfo.java@62
PS8, Line 62: // Sort using Z-ordering order
nit: comment is obsolete


http://gerrit.cloudera.org:8080/#/c/13955/8/fe/src/main/java/org/apache/impala/analysis/SortInfo.java@103
PS8, Line 103: //if (sortingOrder_ == null) return TSortingOrder.LEXICAL;
nit: some old code has been left in the comment, please remove


http://gerrit.cloudera.org:8080/#/c/13955/8/testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by-zorder.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by-zorder.test:

http://gerrit.cloudera.org:8080/#/c/13955/8/testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by-zorder.test@424
PS8, Line 424: order by: ZORDER: id
are these tests still succeed? I think they should because we have a constant 
'false' for 'bool_col' and it seems like a reasonable optimization.


http://gerrit.cloudera.org:8080/#/c/13955/8/testdata/workloads/functional-query/queries/QueryTest/create-table-zorder.test
File 
testdata/workloads/functional-query/queries/QueryTest/create-table-zorder.test:

http://gerrit.cloudera.org:8080/#/c/13955/8/testdata/workloads/functional-query/queries/QueryTest/create-table-zorder.test@3
PS8, Line 3: # Make sure that specifying sort by zorder columns sets the 
'sort.columns'
nit: whitespace at eol



--
To view, visit http://gerrit.cloudera.org:8080/13955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie122002ca8f52ca2c1e1ec8ff1d476ae1f4f875d
Gerrit-Change-Number: 13955
Gerrit-PatchSet: 8
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 17:16:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7770: SPLIT PART to support negative indexes

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13880 )

Change subject: IMPALA-7770: SPLIT_PART to support negative indexes
..


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4779/


--
To view, visit http://gerrit.cloudera.org:8080/13880
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2db762989a90bd95661a59eb9c11a29eb2edfafb
Gerrit-Change-Number: 13880
Gerrit-PatchSet: 6
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 15:12:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN STATS ACCURATE from properties

2019-08-13 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 6: Code-Review+2

Okay, thanks for the explanation!


--
To view, visit http://gerrit.cloudera.org:8080/14037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:41:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14038 )

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4237/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 11
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:35:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN STATS ACCURATE from properties

2019-08-13 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 6:

(1 comment)

Because partition's column_stats_accurate property is never shown in "show 
create table" or show partitions, I added Hive query select count(*) in the 
tests to show the patch works by showing hive does not return wrong values for 
count after impala insert.

http://gerrit.cloudera.org:8080/#/c/14037/6/testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
File 
testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test:

http://gerrit.cloudera.org:8080/#/c/14037/6/testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test@45
PS6, Line 45: 
> Again, no checks about the stats for the partitioned table.
the same reason



--
To view, visit http://gerrit.cloudera.org:8080/14037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:19:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14038 )

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4236/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 10
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:18:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN STATS ACCURATE from properties

2019-08-13 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 6:

Because partition's column_stats_accurate property is never shown in "show 
create table" or show partitions, I added Hive query select count(*) in the 
tests to show the patch works by showing hive does not return wrong values for 
count after impala insert.


--
To view, visit http://gerrit.cloudera.org:8080/14037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:17:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8752: Added Jaro-Winkler edit distance and similarity built-in function

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13870 )

Change subject: IMPALA-8752: Added Jaro-Winkler edit distance and similarity 
built-in function
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4781/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64d7f461516c5e66cc27d62612bc8cc0e8f0178c
Gerrit-Change-Number: 13870
Gerrit-PatchSet: 10
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:15:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8752: Added Jaro-Winkler edit distance and similarity built-in function

2019-08-13 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13870 )

Change subject: IMPALA-8752: Added Jaro-Winkler edit distance and similarity 
built-in function
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64d7f461516c5e66cc27d62612bc8cc0e8f0178c
Gerrit-Change-Number: 13870
Gerrit-PatchSet: 10
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:15:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8752: Added Jaro-Winkler edit distance and similarity built-in function

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13870 )

Change subject: IMPALA-8752: Added Jaro-Winkler edit distance and similarity 
built-in function
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4235/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64d7f461516c5e66cc27d62612bc8cc0e8f0178c
Gerrit-Change-Number: 13870
Gerrit-PatchSet: 10
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:00:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8839: Remove COLUMN STATS ACCURATE from properties

2019-08-13 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14037 )

Change subject: IMPALA-8839: Remove COLUMN_STATS_ACCURATE from properties
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14037/6/testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test
File 
testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test:

http://gerrit.cloudera.org:8080/#/c/14037/6/testdata/workloads/functional-query/queries/QueryTest/acid-clear-statsaccurate.test@52
PS6, Line 52:
Didn't you want to add a 'show create table' statement for the partitioned 
table to check the removal of COLUMN_STATS_ACCURATE?


http://gerrit.cloudera.org:8080/#/c/14037/6/testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test
File 
testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test:

http://gerrit.cloudera.org:8080/#/c/14037/6/testdata/workloads/functional-query/queries/QueryTest/clear-statsaccurate.test@45
PS6, Line 45: 
Again, no checks about the stats for the partitioned table.



--
To view, visit http://gerrit.cloudera.org:8080/14037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13f4a77022a7112e10a07314359f927eae083deb
Gerrit-Change-Number: 14037
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 14:00:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14038 )

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4780/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 12
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 13:58:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14038 )

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 12
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 13:58:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14038 )

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..


Patch Set 11: Code-Review+2

carry +2 from Csaba. Thanks all for the quick turnarounds!


--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 11
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 13:58:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14038 )

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14038/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14038/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1577
PS10, Line 1577: Table tbl = 
catalog_.getTableIfCachedNoThrow(tableName.getDb(), tableName.getTbl());
   : long lockId = -1;
> nit: fits to one line
Done



--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 11
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 13:57:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Gabor Kaszab (Code Review)
Hello Vihang Karajgaonkar, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14038

to look at the new patch set (#11).

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..

IMPALA-8823: DROP TABLE support for insert-only ACID tables

Enhances Impala to be able to drop insert-only transactional tables.
In order to do this Impala acquires an exclusive table lock in HMS
before performing the drop operation and releases the lock once
dropping the table finished.
INSERT statement does the locking and heartbeating on coordinator
side but for DROP TABLE all of these are done from Catalog side. This
means that alongside Impala coordinators now Catalog also does
heartbeating towards HMS.

Testing:
 - E2E test: Dropped a table, re-created it and dropped again to
   check if no locks remained in HMS.
 - E2E test: After dropping a table from Impala checked if Hive also
   sees it being dropped.
 - Manual test: With a hacked Impala that runs a drop table long
   enough I checked that there is a table lock entry in HMS during the
   execution and disappears once the query finishes.

Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
R fe/src/main/java/org/apache/impala/common/TransactionKeepalive.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M testdata/workloads/functional-query/queries/QueryTest/acid.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
11 files changed, 205 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/14038/11
--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 11
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14038 )

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..


Patch Set 10: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14038/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14038/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1577
PS10, Line 1577: Table tbl = 
catalog_.getTableIfCachedNoThrow(tableName.getDb(),
   : tableName.getTbl());
nit: fits to one line



--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 10
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 13:54:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8752: Added Jaro-Winkler edit distance and similarity built-in function

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13870 )

Change subject: IMPALA-8752: Added Jaro-Winkler edit distance and similarity 
built-in function
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4234/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64d7f461516c5e66cc27d62612bc8cc0e8f0178c
Gerrit-Change-Number: 13870
Gerrit-PatchSet: 9
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 13:51:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14038 )

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..


Patch Set 10:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/14038/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14038/9//COMMIT_MSG@13
PS9, Line 13: INSERT sta
> I think that it would be better to be more specific and write that INSERT d
Done


http://gerrit.cloudera.org:8080/#/c/14038/9//COMMIT_MSG@17
PS9, Line 17:
> nit: could you add a paragraph about testing?
Done


http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@585
PS9, Line 585:   LOG.error("Failed to release lock as a cleanup step 
after acquiring a lock " +
 :   "has failed: " + lockId +
> It would be better to log the exception's message too.
See my comment in CatalogOpExecutor about returning bool instead of throwing.
I can add the exception text to the log msg here.


http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@604
PS9, Line 604: try {
> nit: I think we should log both acquire and release, or don't log any of th
Dropped logging in releaseLock() as I'm a bit afraid that we might flood logs 
if we log both in releaseLock() and acquireLock().


http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/main/java/org/apache/impala/common/TransactionKeepalive.java
File fe/src/main/java/org/apache/impala/common/TransactionKeepalive.java:

http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/main/java/org/apache/impala/common/TransactionKeepalive.java@57
PS9, Line 57: Stores either a TQueryCtx or a cause
:   // string. toString() returns the stored TQueryCtx if it is set 
or the string cause
:   // otherwise.
> nit: I'd rather explain what it does than how it is currently used.
Done


http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1559
PS9, Line 1559: s, ensure that it is lo
> According to line 1588-1590 it is possible that the table is not loaded yet
Done


http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1573
PS9, Line 1573: he above was jus
> This can throw and exception which would lead to an error returned to the c
I think we should raise an error if releasing a lock failed even after 
successfully dropping the table. Otherwise we would swallow it and receive 
error the next time we recreate the table and try to touch it. E.g.:
1) drop table if exists tbl_name;
2) create table tbl_name ...
3) Insert into tbl_name
Here 3 would complain about not being able to acquire a lock even though 2) 
should have already told us that it couldn't release it's lock.


http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1580
PS9, Line 1580: null && !(tbl instanceo
> +1 for creating this function
Thx :)


http://gerrit.cloudera.org:8080/#/c/14038/5/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/14038/5/fe/src/main/java/org/apache/impala/service/Frontend.java@510
PS5, Line 510: setTr
> I expected something like "FeIncompleteTable means that the table could not
That is already commented in DropTableOrViewStmt as far as I know why we allow 
the drop table not to exit when table loading fails. It's unrelated to locking 
so nothing to comment here in my opinion.



--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 10
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 13:38:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Gabor Kaszab (Code Review)
Hello Vihang Karajgaonkar, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14038

to look at the new patch set (#10).

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..

IMPALA-8823: DROP TABLE support for insert-only ACID tables

Enhances Impala to be able to drop insert-only transactional tables.
In order to do this Impala acquires an exclusive table lock in HMS
before performing the drop operation and releases the lock once
dropping the table finished.
INSERT statement does the locking and heartbeating on coordinator
side but for DROP TABLE all of these are done from Catalog side. This
means that alongside Impala coordinators now Catalog also does
heartbeating towards HMS.

Testing:
 - E2E test: Dropped a table, re-created it and dropped again to
   check if no locks remained in HMS.
 - E2E test: After dropping a table from Impala checked if Hive also
   sees it being dropped.
 - Manual test: With a hacked Impala that runs a drop table long
   enough I checked that there is a table lock entry in HMS during the
   execution and disappears once the query finishes.

Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
---
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
R fe/src/main/java/org/apache/impala/common/TransactionKeepalive.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M testdata/workloads/functional-query/queries/QueryTest/acid.test
M tests/metadata/test_hms_integration.py
M tests/query_test/test_acid.py
11 files changed, 206 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/14038/10
--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 10
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8752: Added Jaro-Winkler edit distance and similarity built-in function

2019-08-13 Thread Norbert Luksa (Code Review)
Norbert Luksa has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/13870 )

Change subject: IMPALA-8752: Added Jaro-Winkler edit distance and similarity 
built-in function
..

IMPALA-8752: Added Jaro-Winkler edit distance and similarity built-in function

The added functions return the Jaro/Jaro-Winkler similarity/distance
of two strings. The algorithm calcuates the Jaro-Similarity of the
strings, then adds more weight to the result if there are
common prefixes. (Jaro-Winkler)
For more detail, see:
https://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance

Extended the algorithm with another optional parameter: boost threshold
The prefix weight will only be applied if the Jaro-similarity
exceeds the given threshold. By default, its value is 0.7.

The new built-in functions are:
 * jaro_distance, jaro_dst
 * jaro_similarity, jaro_sim
 * jaro_winkler_distance, jw_dst
 * jaro_winkler_similarity, jw_sim

Testing:
 * Added unit tests to expr-test.cc
 * Manual testing over 1400 word pairs from
   http://marvin.cs.uidaho.edu/misspell.html
   Results match Apache commons

Change-Id: I64d7f461516c5e66cc27d62612bc8cc0e8f0178c
---
M be/src/exprs/expr-test.cc
M be/src/exprs/string-functions-ir.cc
M be/src/exprs/string-functions.h
M common/function-registry/impala_functions.py
4 files changed, 323 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/13870/10
--
To view, visit http://gerrit.cloudera.org:8080/13870
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I64d7f461516c5e66cc27d62612bc8cc0e8f0178c
Gerrit-Change-Number: 13870
Gerrit-PatchSet: 10
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14038 )

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..


Patch Set 5:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/14038/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14038/9//COMMIT_MSG@13
PS9, Line 13:
I think that it would be better to be more specific and write that INSERT did 
it on the coordinator side.

The reason for this is that INSERT is a DML statement that starts work on the 
coordinator + executors and touches the catalogd only after the executer 
finished. Creating the transaction in the catalogd at the start would have made 
an extra coordinator->catalogd RPC necessary.

For DDL statements the logic above does not apply. because DDL logic is mainly 
done in the catalogd, the coordinator acts more or less as a proxy towards the 
client.


http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@585
PS9, Line 585:* @param lockId is the lock ID to be released.
 :* @throws TransactionException
It would be better to log the exception's message too.
Moving logging to releaseLock() and change it to return bool instead of 
exceptions could make the caller logic simpler.


http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1559
PS9, Line 1559: anular checks to provid
According to line 1588-1590 it is possible that the table is not loaded yet, 
mainly in case of LocalCatalog configuration.

I think that the logic in line 1582 - 1598 could be moved above this statement.


http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1573
PS9, Line 1573:
This can throw and exception which would lead to an error returned to the 
client while the table was actually successfully dropped.
See my comment at MetastoreShim.java:585 for a possible solution.


http://gerrit.cloudera.org:8080/#/c/14038/9/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1580
PS9, Line 1580: mmary(resp, dbNotExist)
+1 for creating this function


http://gerrit.cloudera.org:8080/#/c/14038/5/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/14038/5/fe/src/main/java/org/apache/impala/service/Frontend.java@510
PS5, Line 510: table
> I'm not sure what comment you expect here.
I expected something like "FeIncompleteTable means that the table could not be 
loaded successfully. Still try do delete the table to be able delete tables 
with corrupted metadata."

In CatalogOpExecuter there is some explanation about this, so I am ok with 
adding comments at this logic's new place.



--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 12:56:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7375: [DOCS] Added DATE functions

2019-08-13 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13996 )

Change subject: IMPALA-7375: [DOCS] Added DATE functions
..


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml
File docs/topics/impala_datetime_functions.xml:

http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@405
PS2, Line 405: Purpose: Alias for the NOW() function.
 :   
 : Return type: DATE
CURRENT_DATE() is not an alias for NOW(), the former returns a DATE, while the 
latter returns a TIMESTAMP.

CURRENT_DATE() is equivalent to CAST(now() as DATE);


http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@814
PS2, Line 814: TIMESTAMP
TIMESTAMP or DATE


http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@848
PS2, Line 848: DATE'2019-08-02
Closing apostrophe is missing.


http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@849
PS2, Line 849: 2000-01-01 00:00:00
This should be 2001-01-01.


http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@1002
PS2, Line 1002: startdate
Parameter is called 'date' in L1007


http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@1960
PS2, Line 1960: MONTHS_SUB(DATE'2019-03-01',
  : 1) returns 
DATE'2019-02-28'.
A better example would be MONTHS_SUB(DATE'2019-02-28', 1) returning 2019-01-31.


http://gerrit.cloudera.org:8080/#/c/13996/2/docs/topics/impala_datetime_functions.xml@3296
PS2, Line 3296: If date is not valid, e.g. 
DATE'2019-03-32',
  : returns NULL.
Again, the invalid date literal ill return an analysis error:

ERROR: AnalysisException: Invalid date literal: '2019-03-32'



--
To view, visit http://gerrit.cloudera.org:8080/13996
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1463731f9ee7fb9ec80d6be2458cffe5c42464b6
Gerrit-Change-Number: 13996
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 12:00:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7770: SPLIT PART to support negative indexes

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13880 )

Change subject: IMPALA-7770: SPLIT_PART to support negative indexes
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13880
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2db762989a90bd95661a59eb9c11a29eb2edfafb
Gerrit-Change-Number: 13880
Gerrit-PatchSet: 6
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 11:01:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7770: SPLIT PART to support negative indexes

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13880 )

Change subject: IMPALA-7770: SPLIT_PART to support negative indexes
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4779/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13880
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2db762989a90bd95661a59eb9c11a29eb2edfafb
Gerrit-Change-Number: 13880
Gerrit-PatchSet: 6
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 11:01:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7770: SPLIT PART to support negative indexes

2019-08-13 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13880 )

Change subject: IMPALA-7770: SPLIT_PART to support negative indexes
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13880
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2db762989a90bd95661a59eb9c11a29eb2edfafb
Gerrit-Change-Number: 13880
Gerrit-PatchSet: 5
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 11:00:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8859: fix test global config file for remote clusters

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14053 )

Change subject: IMPALA-8859: fix test_global_config_file for remote clusters
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I448e5a7dfc0ab6fd53182a593e2fff1a12a10fd7
Gerrit-Change-Number: 14053
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 13 Aug 2019 10:52:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8685,IMPALA-8677: Use consistent scheduling for small clusters

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14026 )

Change subject: IMPALA-8685,IMPALA-8677: Use consistent scheduling for small 
clusters
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfdb2cc53d7206e316ea8a1cc28ad443f246f741
Gerrit-Change-Number: 14026
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Tue, 13 Aug 2019 10:22:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8823: DROP TABLE support for insert-only ACID tables

2019-08-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14038 )

Change subject: IMPALA-8823: DROP TABLE support for insert-only ACID tables
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4233/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14038
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic41ca73268c4b75af5a08fe3dd1ada1df3f6fd34
Gerrit-Change-Number: 14038
Gerrit-PatchSet: 9
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 13 Aug 2019 10:10:33 +
Gerrit-HasComments: No


  1   2   >