[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13857 )

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..


Patch Set 10: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4719/


--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 10
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Sat, 03 Aug 2019 03:25:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7375: [DOCS] Added DATE functions

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13996 )

Change subject: IMPALA-7375: [DOCS] Added DATE functions
..


Patch Set 1: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/431/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/13996
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1463731f9ee7fb9ec80d6be2458cffe5c42464b6
Gerrit-Change-Number: 13996
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Sat, 03 Aug 2019 03:09:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7375: [DOCS] Added DATE functions

2019-08-02 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13996 )

Change subject: IMPALA-7375: [DOCS] Added DATE functions
..


Patch Set 1:

Sharing a generated google doc in a separate thread since too many changes.


--
To view, visit http://gerrit.cloudera.org:8080/13996
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1463731f9ee7fb9ec80d6be2458cffe5c42464b6
Gerrit-Change-Number: 13996
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Sat, 03 Aug 2019 02:47:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7375: [DOCS] Added DATE functions

2019-08-02 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13996


Change subject: IMPALA-7375: [DOCS] Added DATE functions
..

IMPALA-7375: [DOCS] Added DATE functions

Change-Id: I1463731f9ee7fb9ec80d6be2458cffe5c42464b6
---
M docs/topics/impala_datetime_functions.xml
1 file changed, 570 insertions(+), 1,097 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/13996/1
--
To view, visit http://gerrit.cloudera.org:8080/13996
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1463731f9ee7fb9ec80d6be2458cffe5c42464b6
Gerrit-Change-Number: 13996
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-7375: [DOCS] Added DATE functions

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13996 )

Change subject: IMPALA-7375: [DOCS] Added DATE functions
..


Patch Set 1:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/431/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/13996
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1463731f9ee7fb9ec80d6be2458cffe5c42464b6
Gerrit-Change-Number: 13996
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sat, 03 Aug 2019 02:46:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8627: Enable catalog-v2 in tests

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13933 )

Change subject: IMPALA-8627: Enable catalog-v2 in tests
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4130/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbde666de2b780c0e40df716a9dfe54524e092d
Gerrit-Change-Number: 13933
Gerrit-PatchSet: 9
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 03 Aug 2019 00:05:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8376: directory limits for scratch usage

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13986 )

Change subject: IMPALA-8376: directory limits for scratch usage
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4721/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/13986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I696146a65dbb97f1ba200ae472358ae2db6eb441
Gerrit-Change-Number: 13986
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Aug 2019 23:50:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8627: Enable catalog-v2 in tests

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13933 )

Change subject: IMPALA-8627: Enable catalog-v2 in tests
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4720/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/13933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbde666de2b780c0e40df716a9dfe54524e092d
Gerrit-Change-Number: 13933
Gerrit-PatchSet: 9
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 02 Aug 2019 23:47:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8376: directory limits for scratch usage

2019-08-02 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13986 )

Change subject: IMPALA-8376: directory limits for scratch usage
..


Patch Set 4:

(4 comments)

The scope of this expanded a bit since I realised that we didn't support 
parsing terabyte specifiers yet (I had a test gap there).

http://gerrit.cloudera.org:8080/#/c/13986/3/be/src/runtime/tmp-file-mgr-test.cc
File be/src/runtime/tmp-file-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/13986/3/be/src/runtime/tmp-file-mgr-test.cc@785
PS3, Line 785: GetValue());
> nit: file_group_2?
Done


http://gerrit.cloudera.org:8080/#/c/13986/3/be/src/runtime/tmp-file-mgr-test.cc@805
PS3, Line 805:   // Configure various directories with valid formats.
I missed verifying the actual values of the limits. In particular, 5tb is not 
valid - the utility didn't support parsing TB. This seems generally useful so I 
added it.


http://gerrit.cloudera.org:8080/#/c/13986/3/be/src/runtime/tmp-file-mgr-test.cc@834
PS3, Line 834: EXPECT_EQ(4, dirs3.size());
> so this means I can specify the same dir twice, potentially with different
That is true, there isn't really anything smart to prevent that.

I'm not sure if it's worth trying to do anything special to handle it, we don't 
want to try to be clever for sure. Maybe I could add a test to verify that 
nothing crazy happens.


http://gerrit.cloudera.org:8080/#/c/13986/3/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/13986/3/be/src/runtime/tmp-file-mgr.cc@131
PS3, Line 131: (ERROR) << "
> I am wondering if we should also default to ignoring the dir in case of lim
Yeah that seems reasonable. I don't think the stakes are too high - either are 
OK behaviours in the face of a misconfiguration, but skipping the directory 
does contain the potential damage.



--
To view, visit http://gerrit.cloudera.org:8080/13986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I696146a65dbb97f1ba200ae472358ae2db6eb441
Gerrit-Change-Number: 13986
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Aug 2019 23:43:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8376: directory limits for scratch usage

2019-08-02 Thread Tim Armstrong (Code Review)
Hello Abhishek Rawat, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13986

to look at the new patch set (#4).

Change subject: IMPALA-8376: directory limits for scratch usage
..

IMPALA-8376: directory limits for scratch usage

This extends the --scratch_dirs syntax to support specifying a max
capacity per directory, similarly to the --data_cache confirmation.
The capacity is delimited from the directory name with ":" and
uses the usual syntax for specifying memory. The following are
valid arguments:
* --scratch_dirs=/dir1,/dir2 (no limits)
* --scratch_dirs=/dir1,/dir2:25G (only a limit on /dir2)
* --scratch_dirs=/dir1:5MB,/dir2 (only a limit on /dir)
* --scratch_dirs=/dir1:-1,/dir2:0 (alternative ways of
  expressing no limit)

The usage is tracked with a metric per directory. Allocations
from that directory start to fail when the limit is exceeded.
These metrics are exposed as
tmp-file-mgr.scratch-space-bytes-used.dir-0,
tmp-file-mgr.scratch-space-bytes-used.dir-1, etc.

Also add support for parsing terabyte specifiers to a utility
function that is used for parsing many configurations.

Testing:
Added a unit test to exercise TmpFileMgr.

Manually ran a spilling query on an impalad with multiple scratch dirs
configured with different limits. Confirmed via metrics that the
capacities were enforced.

Change-Id: I696146a65dbb97f1ba200ae472358ae2db6eb441
---
M be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/service/query-options-test.cc
M be/src/util/parse-util-test.cc
M be/src/util/parse-util.cc
M common/thrift/generate_error_codes.py
M common/thrift/metrics.json
M docs/topics/impala_mem_limit.xml
10 files changed, 382 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13986/4
--
To view, visit http://gerrit.cloudera.org:8080/13986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I696146a65dbb97f1ba200ae472358ae2db6eb441
Gerrit-Change-Number: 13986
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8781: Result spooling tests to cover edge cases and cancellation

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13907 )

Change subject: IMPALA-8781: Result spooling tests to cover edge cases and 
cancellation
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4128/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3b3a1539c4a5fa9b43c8ca315cea16c9701e283
Gerrit-Change-Number: 13907
Gerrit-PatchSet: 10
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 02 Aug 2019 23:39:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8828: Support impersonation via http paths

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13994 )

Change subject: IMPALA-8828: Support impersonation via http paths
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4129/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I20b9c2e2d106530732f1c52f8d3d1ecc24ae4bd6
Gerrit-Change-Number: 13994
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 02 Aug 2019 23:39:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8627: Enable catalog-v2 in tests

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13933 )

Change subject: IMPALA-8627: Enable catalog-v2 in tests
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13933/9/tests/metadata/test_metadata_query_statements.py
File tests/metadata/test_metadata_query_statements.py:

http://gerrit.cloudera.org:8080/#/c/13933/9/tests/metadata/test_metadata_query_statements.py@180
PS9, Line 180: 0
flake8: E501 line too long (92 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/13933/9/tests/metadata/test_metadata_query_statements.py@181
PS9, Line 181: t
flake8: E501 line too long (91 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/13933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbde666de2b780c0e40df716a9dfe54524e092d
Gerrit-Change-Number: 13933
Gerrit-PatchSet: 9
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 02 Aug 2019 23:23:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8627: Enable catalog-v2 in tests

2019-08-02 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/13933 )

Change subject: IMPALA-8627: Enable catalog-v2 in tests
..

IMPALA-8627: Enable catalog-v2 in tests

This patch enables catalog-v2 by default in all the tests.

Test fixes:
1. Modified test_observability which fails on catalog-v2 since
the profile emits different metadata load events. The test now looks for
the right events on the profile depending on whether catalogv2 is
enabled or not.
2. Fixes a bug in TableName.java constructor which allows non-lowercased
table and database names. This causes problems at the local catalog
cache which expects the tablenames to be always in lowercase. More
details on this failure are available in IMPALA-8627.
3. Fixes the JdbcTest which checks for existence of table comment in the
getTables metadata jdbc call. In catalog-v2 since the columns are not
requested, LocalTable is not loaded and hence the test needs to be
modified to check if catalog-v2 is enabled.

Change-Id: Iddbde666de2b780c0e40df716a9dfe54524e092d
---
M docker/catalogd/Dockerfile
M docker/impalad_coord_exec/Dockerfile
M docker/impalad_coordinator/Dockerfile
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/test/java/org/apache/impala/service/JdbcTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
M testdata/workloads/functional-query/queries/QueryTest/describe-db.test
M tests/common/environ.py
M tests/hs2/hs2_test_suite.py
M tests/hs2/test_hs2.py
M tests/metadata/test_hms_integration.py
M tests/metadata/test_metadata_query_statements.py
M tests/metadata/test_refresh_partition.py
M tests/query_test/test_observability.py
14 files changed, 136 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/13933/9
--
To view, visit http://gerrit.cloudera.org:8080/13933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iddbde666de2b780c0e40df716a9dfe54524e092d
Gerrit-Change-Number: 13933
Gerrit-PatchSet: 9
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-8781: Result spooling tests to cover edge cases and cancellation

2019-08-02 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13907 )

Change subject: IMPALA-8781: Result spooling tests to cover edge cases and 
cancellation
..


Patch Set 10:

Not entirely sure why, but when test_result_spooling.py is run on Jenkins, the 
TPCH queries need to be prefixed with the database name (the confusing part is 
that test_cancellation.py doesn't need to do this). Regardless, it doesn't seem 
like a big deal to prefix them with 'tpch'.


--
To view, visit http://gerrit.cloudera.org:8080/13907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3b3a1539c4a5fa9b43c8ca315cea16c9701e283
Gerrit-Change-Number: 13907
Gerrit-PatchSet: 10
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 02 Aug 2019 23:00:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8828: Support impersonation via http paths

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13994 )

Change subject: IMPALA-8828: Support impersonation via http paths
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13994/1/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

http://gerrit.cloudera.org:8080/#/c/13994/1/be/src/service/impala-hs2-server.cc@312
PS1, Line 312:   const ThriftServer::ConnectionContext* connection_context = 
ThriftServer::GetThreadConnectionContext();
line too long (105 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/13994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I20b9c2e2d106530732f1c52f8d3d1ecc24ae4bd6
Gerrit-Change-Number: 13994
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 02 Aug 2019 22:59:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8828: Support impersonation via http paths

2019-08-02 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13994


Change subject: IMPALA-8828: Support impersonation via http paths
..

IMPALA-8828: Support impersonation via http paths

This patch allows clients that connect over the HTTP server to specify
the 'doAs' parameter in the provided path in order to perform
impersonation.

The existing rules for impersonation are applied, i.e.
authorized_proxy_user_config or authorized_proxy_group_config must be
set with the appropriate values for impersonation to be successful.

Testing:
- Added a FE test that verifies impersonation works as expected with
  impala-shell and ldap.
- Manually tested with Apache Knox.

Change-Id: I20b9c2e2d106530732f1c52f8d3d1ecc24ae4bd6
---
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server.h
M be/src/service/impala-hs2-server.cc
M be/src/transport/THttpServer.cpp
M be/src/transport/THttpServer.h
M fe/src/test/java/org/apache/impala/customcluster/LdapImpalaShellTest.java
7 files changed, 157 insertions(+), 58 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/13994/1
--
To view, visit http://gerrit.cloudera.org:8080/13994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I20b9c2e2d106530732f1c52f8d3d1ecc24ae4bd6
Gerrit-Change-Number: 13994
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-8781: Result spooling tests to cover edge cases and cancellation

2019-08-02 Thread Sahil Takiar (Code Review)
Hello Michael Ho, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13907

to look at the new patch set (#10).

Change subject: IMPALA-8781: Result spooling tests to cover edge cases and 
cancellation
..

IMPALA-8781: Result spooling tests to cover edge cases and cancellation

Adds additional tests to test_result_spooling.py to cover various edge
cases when fetching query results (ensure all Impala types are returned
properly, UDFs are evaluated correctly, etc.). A new QueryTest file
result-spooling.test is added to encapsulate all these tests. Tests with
a decreased ROW_BATCH_SIZE are added as well to validate that
BufferedPlanRootSink buffers row batches correctly.

BufferedPlanRootSink requires careful synchronization of the producer
and consumer threads, especially when queries are cancelled. The
TestResultSpoolingCancellation class is dedicated to running
cancellation tests with SPOOL_QUERY_RESULTS = true. The implementation
is heavily borrowed from test_cancellation.py and some of the logic is
re-factored into a new utility class called cancel_utils.py to avoid
code duplication between test_cancellation.py and
test_result_spooling.py.

Testing:
* Looped test_result_spooling.py overnight with no failures
* Core tests passed

Change-Id: Ib3b3a1539c4a5fa9b43c8ca315cea16c9701e283
---
A testdata/workloads/functional-query/queries/QueryTest/result-spooling.test
M tests/hs2/test_fetch_first.py
M tests/query_test/test_cancellation.py
M tests/query_test/test_result_spooling.py
A tests/util/cancel_util.py
5 files changed, 314 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/13907/10
--
To view, visit http://gerrit.cloudera.org:8080/13907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib3b3a1539c4a5fa9b43c8ca315cea16c9701e283
Gerrit-Change-Number: 13907
Gerrit-PatchSet: 10
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-8376: directory limits for scratch usage

2019-08-02 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13986 )

Change subject: IMPALA-8376: directory limits for scratch usage
..


Patch Set 3: Code-Review+1

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13986/3/be/src/runtime/tmp-file-mgr-test.cc
File be/src/runtime/tmp-file-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/13986/3/be/src/runtime/tmp-file-mgr-test.cc@785
PS3, Line 785: file_group_1
nit: file_group_2?


http://gerrit.cloudera.org:8080/#/c/13986/3/be/src/runtime/tmp-file-mgr-test.cc@834
PS3, Line 834: auto& empty_paths = GetTmpDirs(CreateTmpFileMgr(","));
so this means I can specify the same dir twice, potentially with different 
limits?


http://gerrit.cloudera.org:8080/#/c/13986/3/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/13986/3/be/src/runtime/tmp-file-mgr.cc@131
PS3, Line 131: Interpret -1
I am wondering if we should also default to ignoring the dir in case of limit 
parsing failure, just like we do for path parsing failure. Like if the user 
wanted to put a limit on a dir, then they might prefer it failing to create 
that dir rather than having an infinite limit on it.
Dont have a strong preference either way so I'll leave it upto you


http://gerrit.cloudera.org:8080/#/c/13986/3/be/src/runtime/tmp-file-mgr.cc@178
PS3, Line 178: TmpDir(
nit: dont need to call constructor explicitly for emplace_back



--
To view, visit http://gerrit.cloudera.org:8080/13986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I696146a65dbb97f1ba200ae472358ae2db6eb441
Gerrit-Change-Number: 13986
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Aug 2019 22:55:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8781: Result spooling tests to cover edge cases and cancellation

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13907 )

Change subject: IMPALA-8781: Result spooling tests to cover edge cases and 
cancellation
..


Patch Set 9: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4718/


--
To view, visit http://gerrit.cloudera.org:8080/13907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3b3a1539c4a5fa9b43c8ca315cea16c9701e283
Gerrit-Change-Number: 13907
Gerrit-PatchSet: 9
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 02 Aug 2019 21:55:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8814: [DOCS] Document the SPNEGO support for Impala Web UIs

2019-08-02 Thread Alex Rodoni (Code Review)
Alex Rodoni has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13958 )

Change subject: IMPALA-8814: [DOCS] Document the SPNEGO support for Impala Web 
UIs
..

IMPALA-8814: [DOCS] Document the SPNEGO support for Impala Web UIs

Change-Id: I3cbf1265bf6b897d2728ecb9446a07b1c34a576e
Reviewed-on: http://gerrit.cloudera.org:8080/13958
Tested-by: Impala Public Jenkins 
Reviewed-by: Thomas Tauber-Marshall 
---
M docs/topics/impala_webui.xml
1 file changed, 62 insertions(+), 75 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Thomas Tauber-Marshall: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/13958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3cbf1265bf6b897d2728ecb9446a07b1c34a576e
Gerrit-Change-Number: 13958
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-8814: [DOCS] Document the SPNEGO support for Impala Web UIs

2019-08-02 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13958 )

Change subject: IMPALA-8814: [DOCS] Document the SPNEGO support for Impala Web 
UIs
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbf1265bf6b897d2728ecb9446a07b1c34a576e
Gerrit-Change-Number: 13958
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 02 Aug 2019 21:13:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13857 )

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 10
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 20:48:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13857 )

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4719/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 10
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 20:48:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13857 )

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..


Patch Set 9: Code-Review+2

Carrying forward Zoltan's +2


--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 9
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 20:45:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8668: [DOCS] HS2 Support for Impala-Shell connection

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13961 )

Change subject: IMPALA-8668: [DOCS] HS2 Support for Impala-Shell connection
..


Patch Set 4: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/430/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/13961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie4e207d247409ac3069ce8a235be3f63e616007e
Gerrit-Change-Number: 13961
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Aug 2019 20:38:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8456: [DOCS] New HTTP protocol for Impala clients

2019-08-02 Thread Alex Rodoni (Code Review)
Alex Rodoni has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13960 )

Change subject: IMPALA-8456: [DOCS] New HTTP protocol for Impala clients
..

IMPALA-8456: [DOCS] New HTTP protocol for Impala clients

Change-Id: I3101f8babc77a5a872778499a54ac479a66ad996
Reviewed-on: http://gerrit.cloudera.org:8080/13960
Tested-by: Impala Public Jenkins 
Reviewed-by: Bharath Vissapragada 
---
M docs/topics/impala_client.xml
M docs/topics/impala_jdbc.xml
M docs/topics/impala_ports.xml
3 files changed, 185 insertions(+), 39 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Bharath Vissapragada: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/13960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3101f8babc77a5a872778499a54ac479a66ad996
Gerrit-Change-Number: 13960
Gerrit-PatchSet: 6
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8668: [DOCS] HS2 Support for Impala-Shell connection

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13961 )

Change subject: IMPALA-8668: [DOCS] HS2 Support for Impala-Shell connection
..


Patch Set 4:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/430/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/13961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie4e207d247409ac3069ce8a235be3f63e616007e
Gerrit-Change-Number: 13961
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Aug 2019 20:14:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8668: [DOCS] HS2 Support for Impala-Shell connection

2019-08-02 Thread Alex Rodoni (Code Review)
Hello Bharath Vissapragada, Thomas Tauber-Marshall, Tim Armstrong, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13961

to look at the new patch set (#4).

Change subject: IMPALA-8668: [DOCS] HS2 Support for Impala-Shell connection
..

IMPALA-8668: [DOCS] HS2 Support for Impala-Shell connection

Change-Id: Ie4e207d247409ac3069ce8a235be3f63e616007e
---
M docs/topics/impala_connecting.xml
M docs/topics/impala_shell_options.xml
2 files changed, 81 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/13961/4
--
To view, visit http://gerrit.cloudera.org:8080/13961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie4e207d247409ac3069ce8a235be3f63e616007e
Gerrit-Change-Number: 13961
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8456: [DOCS] New HTTP protocol for Impala clients

2019-08-02 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13960 )

Change subject: IMPALA-8456: [DOCS] New HTTP protocol for Impala clients
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3101f8babc77a5a872778499a54ac479a66ad996
Gerrit-Change-Number: 13960
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Aug 2019 20:14:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8668: [DOCS] HS2 Support for Impala-Shell connection

2019-08-02 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13961 )

Change subject: IMPALA-8668: [DOCS] HS2 Support for Impala-Shell connection
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13961/3/docs/topics/impala_connecting.xml
File docs/topics/impala_connecting.xml:

http://gerrit.cloudera.org:8080/#/c/13961/3/docs/topics/impala_connecting.xml@81
PS3, Line 81: When you are
> Remove
Done


http://gerrit.cloudera.org:8080/#/c/13961/3/docs/topics/impala_connecting.xml@87
PS3, Line 87:
> remove trailing space
Done



--
To view, visit http://gerrit.cloudera.org:8080/13961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie4e207d247409ac3069ce8a235be3f63e616007e
Gerrit-Change-Number: 13961
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Aug 2019 20:13:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8668: [DOCS] HS2 Support for Impala-Shell connection

2019-08-02 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13961 )

Change subject: IMPALA-8668: [DOCS] HS2 Support for Impala-Shell connection
..


Patch Set 3: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13961/3/docs/topics/impala_connecting.xml
File docs/topics/impala_connecting.xml:

http://gerrit.cloudera.org:8080/#/c/13961/3/docs/topics/impala_connecting.xml@81
PS3, Line 81: When you are
Remove


http://gerrit.cloudera.org:8080/#/c/13961/3/docs/topics/impala_connecting.xml@87
PS3, Line 87:
remove trailing space



--
To view, visit http://gerrit.cloudera.org:8080/13961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie4e207d247409ac3069ce8a235be3f63e616007e
Gerrit-Change-Number: 13961
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Aug 2019 20:11:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13857 )

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4127/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 8
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 18:34:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13857 )

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4126/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 7
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 18:25:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8637: Implement transaction handling and locking for ACID queries

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13968 )

Change subject: IMPALA-8637: Implement transaction handling and locking for 
ACID queries
..


Patch Set 3:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/4125/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/13968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa37899b24aa114be642bf8772b4e0f882865cfa
Gerrit-Change-Number: 13968
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 18:25:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Ethan Xue (Code Review)
Ethan Xue has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13857 )

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..


Patch Set 9:

(5 comments)

> Patch Set 8: Code-Review+2
>
> (1 comment)

http://gerrit.cloudera.org:8080/#/c/13857/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13857/8//COMMIT_MSG@10
PS8, Line 10: tables stored as text. To avoid confusion, it should be noted that
> nit: some lines are longer than 72 chars.
Done


http://gerrit.cloudera.org:8080/#/c/13857/6/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/13857/6/be/src/exec/hdfs-text-scanner.cc@95
PS6, Line 95: gzip-, snappy-, bzip2- a
> nit: maybe add deflate to this list
Done


http://gerrit.cloudera.org:8080/#/c/13857/6/be/src/exec/hdfs-text-scanner.cc@204
PS6, Line 204: ed in H
> nit: Please extend the comment why we use DEFAULT here instead of ZLIB.
Done


http://gerrit.cloudera.org:8080/#/c/13857/7/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/13857/7/be/src/exec/hdfs-text-scanner.cc@95
PS7, Line 95:   // In order to decompress gzip-, snappy-, bzip2- and 
deflate-compressed text
> line too long (93 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13857/7/be/src/exec/hdfs-text-scanner.cc@96
PS7, Line 96:   // files, we need to read entire files. Only read a 
file if we're assigned the
> line too long (93 > 90)
Done



--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 9
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 18:06:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Ethan Xue (Code Review)
Hello Abhishek Rawat, Zoltan Borok-Nagy, Sahil Takiar, Bikramjeet Vig, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13857

to look at the new patch set (#9).

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..

IMPALA-8549: Add support for scanning DEFLATE text files

This patch adds support to Impala for scanning .DEFLATE files of
tables stored as text. To avoid confusion, it should be noted that
although these files have a compression type of DEFLATE in Impala,
they should be treated as if their compression type is DEFAULT.

Hadoop tools such as Hive and MapReduce support reading and writing
text files compressed using the deflate algorithm, which is the default
compression type. Hadoop uses the zlib library (an implementation of
the DEFLATE algorithm) to compress text files into .DEFLATE files,
which are not in the raw deflate format but rather the zlib format
(the zlib library supports three flavors of deflate, and Hadoop uses
the flavor that compresses data into deflate with zlib wrappings rather
than just raw deflate)

Testing:
There is a pre-existing unit test that validates compressing and
decompressing data with compression type DEFLATE. Also, modified
existing end-to-end testing that simulates querying files of various
formats and compression types. All core and exhaustive tests pass.

Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
---
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/functional-query_exhaustive.csv
M tests/query_test/test_compressed_formats.py
5 files changed, 25 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13857/9
--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 9
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8637: Implement transaction handling and locking for ACID queries

2019-08-02 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13968 )

Change subject: IMPALA-8637: Implement transaction handling and locking for 
ACID queries
..


Patch Set 3:

(17 comments)

Thanks everyone for the comments!

http://gerrit.cloudera.org:8080/#/c/13968/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13968/2//COMMIT_MSG@19
PS2, Line 19:  * add tests
> Metrics for the following would be good (just some ideas)
Added logs for observability but I agree that such metrics would be very useful.
Is it OK if I add them in a follow-up Jira later? It would need some further 
work plus investigation of how to JNI call from the Frontend to some backend 
class object, e.g. the coordinator.


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@516
PS2, Line 516:   long txnId, long lockId) throws TransactionException {
> Should we log these? I also asked this in the caller, maybe this is a bette
Added LOG.info() for the exception handlers.


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1683
PS2, Line 1683: }
> Do we have commitTransaction calls somewhere that actively call the transac
Currently commit is done by Catalogd so we cannot call deleteTransaction() 
immediately.

But after commit is done by Catalogd the coordinator invokes 
unregisterTransaction() via JNI that calls deleteTransaction().


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1718
PS2, Line 1718:   /**
> Can you give tables.size() to the arraylist constructor?
Done


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1726
PS2, Line 1726:   private void createLockForInsert(Long txnId, 
Collection tables,
> Does hive have code to handle deadlock prevention?
Since we put all of our lockComponents (table locks) into a single LockRequest 
and give it a transaction context as well it shouldn't deadlock.

https://github.com/apache/hive/blob/d7475aa98f6a2fc813e2e1c0ad99f902cb28cc00/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2240


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java
File fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java:

http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java@44
PS2, Line 44:   // TODO: should be calculated from hive.txn.timeout or 
metastore.txn.timeout
> Never used?
Removed it.


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java@48
PS2, Line 48:
> maybe for a later commit:
Added a simple check for start. I save the creation time of the transaction or 
lock then only heartbeat them if they are olden than the sleep interval.
Added TODO about it to be more clever later.


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java@54
PS2, Line 54:   private Map transactions_ = new HashMap<>();
> I think it would be worth logging some information each time this thread wa
I log how much open transactions or independent locks are open. And only log 
when there's any to heartbeat.

Also added logging about how much time heartbeating took.


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java@58
PS2, Line 58:
> as Tim noticed, this is never true, right?
I removed it. Currently TransactionKeepalive thread runs forever.


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java@67
PS2, Line 67: 's deepcopy the transactions and locks
> I checked what Hive does, and it only sends heartbeats for the transaction
Thanks, it wasn't clear to me from the docs and the RPC didn't throw an 
exception when I heartbeated both.

With than in mind I had to do a bunch of changes, but I believe it'll be 
beneficial long-term, so thanks again.


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java@68
PS2, Line 68:   Map copyOfTransactions;
> I'm curious if the HMS supports or plans to support a bulk heartbeat API wh
There is also a heartbeatTxnRange() method but the docs say "it's for streaming 
ingest, everyone else should use heartbeat(txnId, lockId)"


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13857 )

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..


Patch Set 8: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13857/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13857/8//COMMIT_MSG@10
PS8, Line 10: stored as text. To avoid confusion, it should be noted that 
although these
nit: some lines are longer than 72 chars.



--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 8
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 17:56:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Ethan Xue (Code Review)
Hello Abhishek Rawat, Zoltan Borok-Nagy, Sahil Takiar, Bikramjeet Vig, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13857

to look at the new patch set (#8).

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..

IMPALA-8549: Add support for scanning DEFLATE text files

This patch adds support to Impala for scanning .DEFLATE files of tables
stored as text. To avoid confusion, it should be noted that although these
files have a compression type of DEFLATE in Impala, they should be treated
as if their compression type is DEFAULT.

Hadoop tools such as Hive and MapReduce support reading and writing
text files compressed using the deflate algorithm, which is the
default compression type. Hadoop uses the zlib library (an implementation
of the DEFLATE algorithm) to compress text files into .DEFLATE files,
which are not in the raw deflate format but rather the zlib format
(the zlib library supports three flavors of deflate, and Hadoop uses
the flavor that compresses data into deflate with zlib wrappings rather
than just raw deflate)

Testing:
There is a pre-existing unit test that validates compressing/decompressing
data with compression type DEFLATE. Also, modified existing end-to-end
testing that simulates querying files of various formats and compression
types. All core and exhaustive tests pass.

Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
---
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/functional-query_exhaustive.csv
M tests/query_test/test_compressed_formats.py
5 files changed, 25 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13857/8
--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 8
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13857 )

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13857/7/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/13857/7/be/src/exec/hdfs-text-scanner.cc@95
PS7, Line 95:   // In order to decompress gzip-, snappy-, bzip2- and 
deflate-compressed text files,
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/13857/7/be/src/exec/hdfs-text-scanner.cc@96
PS7, Line 96:   // we need to read entire files. Only read a file if 
we're assigned the first split
line too long (93 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 7
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 17:49:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8376: directory limits for scratch usage

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13986 )

Change subject: IMPALA-8376: directory limits for scratch usage
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4124/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I696146a65dbb97f1ba200ae472358ae2db6eb441
Gerrit-Change-Number: 13986
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Aug 2019 17:49:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Ethan Xue (Code Review)
Hello Abhishek Rawat, Zoltan Borok-Nagy, Sahil Takiar, Bikramjeet Vig, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13857

to look at the new patch set (#7).

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..

IMPALA-8549: Add support for scanning DEFLATE text files

This patch adds support to Impala for scanning .DEFLATE files of tables
stored as text. To avoid confusion, it should be noted that although these
files have a compression type of DEFLATE in Impala, they should be treated
as if their compression type is DEFAULT.

Hadoop tools such as Hive and MapReduce support reading and writing
text files compressed using the deflate algorithm, which is the
default compression type. Hadoop uses the zlib library (an implementation
of the DEFLATE algorithm) to compress text files into .DEFLATE files,
which are not in the raw deflate format but rather the zlib format
(the zlib library supports three flavors of deflate, and Hadoop uses
the flavor that compresses data into deflate with zlib wrappings rather
than just raw deflate)

Testing:
There is a pre-existing unit test that validates compressing/decompressing
data with compression type DEFLATE. Also, modified existing end-to-end
testing that simulates querying files of various formats and compression
types. All core and exhaustive tests pass.

Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
---
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/functional-query_exhaustive.csv
M tests/query_test/test_compressed_formats.py
5 files changed, 24 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13857/7
--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 7
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8637: Implement transaction handling and locking for ACID queries

2019-08-02 Thread Zoltan Borok-Nagy (Code Review)
Hello Yongzhi Chen, Gabor Kaszab, Tim Armstrong, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13968

to look at the new patch set (#3).

Change subject: IMPALA-8637: Implement transaction handling and locking for 
ACID queries
..

IMPALA-8637: Implement transaction handling and locking for ACID queries

Adds background thread to the Frontend that periodically heartbeats
the opened transactions and locks. Most of the logic is implemented
in the newly added TransactionKeepalive class.

TransactionKeepalive keeps track of the creation time of the
transactions and locks, and only heartbeat them if they are older
then the sleep interval. This way we don't heartbeat short-running
queries unnecessarily.

TODOs:
 * add tests
 * add metrics

Change-Id: Iaa37899b24aa114be642bf8772b4e0f882865cfa
---
M be/src/service/client-request-state.cc
M be/src/service/frontend.cc
M be/src/service/frontend.h
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java
8 files changed, 397 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/13968/3
--
To view, visit http://gerrit.cloudera.org:8080/13968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iaa37899b24aa114be642bf8772b4e0f882865cfa
Gerrit-Change-Number: 13968
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Ethan Xue (Code Review)
Ethan Xue has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13857 )

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..


Patch Set 6:

> Patch Set 6: Code-Review+1
>
> (3 comments)
>
> The code looks good to me. Did you do any interop tests, e.g. create .deflate 
> files with Hive and read them with Impala?

Thanks for reviewing Zoltan. Yes, there is an interop test in 
test_compressed_formats.py called test_compressed_formats that uses the test 
tables in Impala to read compressed files of various formats created in Hive.


--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 6
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 17:34:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7984: Port UpdateFilter() and PublishFilter() to KRPC

2019-08-02 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13882 )

Change subject: IMPALA-7984: Port UpdateFilter() and PublishFilter() to KRPC
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13882/12/be/src/util/bloom-filter.cc
File be/src/util/bloom-filter.cc:

http://gerrit.cloudera.org:8080/#/c/13882/12/be/src/util/bloom-filter.cc@229
PS12, Line 229: void BloomFilter::Or(const BloomFilterPB& in, BloomFilterPB* 
out) {
I suspect you may have kept the 'directory' field in BloomFilterPB because its 
sort of difficult to see how to implement a function like this if the data is 
in a sidecar.

I think the solution is that we'll just need to treat the bloom filter as two 
objects: the BloomFilterPB and a string.

Then, in coordinator-filter-state in addition to 'bloom_filter_' field we can 
add something like a 'bloom_filter_directory_' field, which will be either a 
string or possibly a kudu::Slice or kudu::faststring depending on what's 
convenient, and we can maintain both of those in FilterState::ApplyUpdate().

This function will then take four params: BloomFilterPB* in, string 
directory_in, BloomFilterPB* out, string* directory_out



--
To view, visit http://gerrit.cloudera.org:8080/13882
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b394796d250286510e157ae326882bfc01d387a
Gerrit-Change-Number: 13882
Gerrit-PatchSet: 12
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 02 Aug 2019 17:27:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7984: Port UpdateFilter() and PublishFilter() to KRPC

2019-08-02 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13882 )

Change subject: IMPALA-7984: Port UpdateFilter() and PublishFilter() to KRPC
..


Patch Set 12:

(7 comments)

Starting to really come together.

Some high level comments about the sidecar stuff

http://gerrit.cloudera.org:8080/#/c/13882/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13882/12//COMMIT_MSG@11
PS12, Line 11: Having incorporated sidecar on the aggregation stage when
 : the runtime filter is a Bloom filter.
When writing commit messages, you should be thinking about it from the 
perspective of someone who comes along weeks or months from now and needs to 
know what your patch does without having any context about the process you went 
through to produce the patch.

So, rather than phrasing it as "I've now updated the patch to use sidecars", 
which only makes sense to people who have the context that the patch was 
originally written without them, its better to phrase it as "This patch uses 
sidecars because...".

Or in other words, replace this with something like:
"To save on copying, the underlying data for bloom filters is transmitted via 
sidecar"


http://gerrit.cloudera.org:8080/#/c/13882/12/be/src/util/bloom-filter.h
File be/src/util/bloom-filter.h:

http://gerrit.cloudera.org:8080/#/c/13882/12/be/src/util/bloom-filter.h@97
PS12, Line 97:   static void ToProtobufWithSidecar(const BloomFilter* filter,
I don't think there's any reason we need to have both implementations - in 
practice we should only ever be using this version, so we can just eliminate 
the other version and name this one ToProtobuf()


http://gerrit.cloudera.org:8080/#/c/13882/12/be/src/util/bloom-filter.h@99
PS12, Line 99: UpdateFilterParamsPB* params
If you move 'sidecar_idx' into the BloomFilterPB, I think we can just take a 
BloomFilterPB* here instead


http://gerrit.cloudera.org:8080/#/c/13882/12/be/src/util/bloom-filter.cc
File be/src/util/bloom-filter.cc:

http://gerrit.cloudera.org:8080/#/c/13882/12/be/src/util/bloom-filter.cc@19
PS12, Line 19: #include "kudu/rpc/rpc_controller.h"
leave a blank line after the first import for the header file that directly 
corresponds to this file


http://gerrit.cloudera.org:8080/#/c/13882/12/be/src/util/bloom-filter.cc@59
PS12, Line 59: Status BloomFilter::Init(const BloomFilterPB& protobuf) {
I think this needs to be updated to handle the sidecar case


http://gerrit.cloudera.org:8080/#/c/13882/12/common/protobuf/data_stream_service.proto
File common/protobuf/data_stream_service.proto:

http://gerrit.cloudera.org:8080/#/c/13882/12/common/protobuf/data_stream_service.proto@89
PS12, Line 89:   optional bytes directory = 2;
Lets eliminate this field and only allow sending bloom filters with sidecars


http://gerrit.cloudera.org:8080/#/c/13882/12/common/protobuf/data_stream_service.proto@120
PS12, Line 120:   optional int32 sidecar_idx = 5;
Lets put this in BloomFilterPB, name it something like 'directory_sidecar_idx', 
and include a brief comment



--
To view, visit http://gerrit.cloudera.org:8080/13882
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b394796d250286510e157ae326882bfc01d387a
Gerrit-Change-Number: 13882
Gerrit-PatchSet: 12
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 02 Aug 2019 17:11:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8376: directory limits for scratch usage

2019-08-02 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/13986 )

Change subject: IMPALA-8376: directory limits for scratch usage
..

IMPALA-8376: directory limits for scratch usage

This extends the --scratch_dirs syntax to support specifying a max
capacity per directory, similarly to the --data_cache confirmation.
The capacity is delimited from the directory name with ":" and
uses the usual syntax for specifying memory. The following are
valid arguments:
* --scratch_dirs=/dir1,/dir2 (no limits)
* --scratch_dirs=/dir1,/dir2:25G (only a limit on /dir2)
* --scratch_dirs=/dir1:5MB,/dir2 (only a limit on /dir)
* --scratch_dirs=/dir1:-1,/dir2:0 (alternative ways of
  expressing no limit)

The usage is tracked with a metric per directory. Allocations
from that directory start to fail when the limit is exceeded.
These metrics are exposed as
tmp-file-mgr.scratch-space-bytes-used.dir-0,
tmp-file-mgr.scratch-space-bytes-used.dir-1, etc.

Testing:
Added a unit test to exercise TmpFileMgr.

Manually ran a spilling query on an impalad with multiple scratch dirs
configured with different limits. Confirmed via metrics that the
capacities were enforced.

Change-Id: I696146a65dbb97f1ba200ae472358ae2db6eb441
---
M be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M common/thrift/generate_error_codes.py
M common/thrift/metrics.json
6 files changed, 344 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13986/3
--
To view, visit http://gerrit.cloudera.org:8080/13986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I696146a65dbb97f1ba200ae472358ae2db6eb441
Gerrit-Change-Number: 13986
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] WIP IMPALA-8637: Implement transaction handling and locking for ACID queries

2019-08-02 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13968 )

Change subject: WIP IMPALA-8637: Implement transaction handling and locking for 
ACID queries
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java
File fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java:

http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java@67
PS2, Line 67: for (Long lockId : entry.getValue()) {
> Are you sure that we have to heartbeat every lock for a given transaction?
I checked what Hive does, and it only sends heartbeats for the transaction 
(with dummy lock id 0) if there is a transaction. This is the expected way to 
do it, the metastore will do extra work if the lock id is not 0.

client side:
https://github.com/apache/hive/blob/d7475aa98f6a2fc813e2e1c0ad99f902cb28cc00/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java#L566

metastore side:

This is where the heartbeat begins:
https://github.com/apache/hive/blob/d7475aa98f6a2fc813e2e1c0ad99f902cb28cc00/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2823

updating a lock's timestamp:
https://github.com/apache/hive/blob/d7475aa98f6a2fc813e2e1c0ad99f902cb28cc00/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L4542



--
To view, visit http://gerrit.cloudera.org:8080/13968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa37899b24aa114be642bf8772b4e0f882865cfa
Gerrit-Change-Number: 13968
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 16:07:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-8637: Implement transaction handling and locking for ACID queries

2019-08-02 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13968 )

Change subject: WIP IMPALA-8637: Implement transaction handling and locking for 
ACID queries
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java
File fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java:

http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java@48
PS2, Line 48:   private Map> txnsAndLocks_ = new HashMap<>();
maybe for a later commit:
I think it would be beneficial to avoid sending heartbeats for transactions 
that do not need it (because they started/hearbeated not too long ago). This 
could be done by saving the transaction creation time / last heartbeat time, 
and skip the heartbeat if now() - heartbeat_time < some_constant.

If SELECTs will also open transactions, then there will be probably a large 
number of short lived transaction that shouldn't send any heartbeats. As far as 
I know every heartbeat leads to a DB update in HMS, so these are not very cheap 
RPCs.


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java@67
PS2, Line 67: for (Long lockId : entry.getValue()) {
Are you sure that we have to heartbeat every lock for a given transaction? This 
seems counter intuitive to me, so I think a comment about it would be useful.


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java@73
PS2, Line 73:   Thread.sleep(SLEEP_INTERVAL_SECONDS * 1000);
> I think sleeping for the full interval will cause drift of when the heartbe
Doing this way makes me a bit worried about multiple coordinators starting at 
the same time and then bombing HMS with heartbeats at the same time. Waiting a 
random time at the beginning could help with this. Probably this is not 
critical now, but may worth a TODO.



--
To view, visit http://gerrit.cloudera.org:8080/13968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa37899b24aa114be642bf8772b4e0f882865cfa
Gerrit-Change-Number: 13968
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 15:24:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8781: Result spooling tests to cover edge cases and cancellation

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13907 )

Change subject: IMPALA-8781: Result spooling tests to cover edge cases and 
cancellation
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3b3a1539c4a5fa9b43c8ca315cea16c9701e283
Gerrit-Change-Number: 13907
Gerrit-PatchSet: 9
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 02 Aug 2019 15:17:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8781: Result spooling tests to cover edge cases and cancellation

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13907 )

Change subject: IMPALA-8781: Result spooling tests to cover edge cases and 
cancellation
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4718/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13907
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3b3a1539c4a5fa9b43c8ca315cea16c9701e283
Gerrit-Change-Number: 13907
Gerrit-PatchSet: 9
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 02 Aug 2019 15:17:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-8637: Implement transaction handling and locking for ACID queries

2019-08-02 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13968 )

Change subject: WIP IMPALA-8637: Implement transaction handling and locking for 
ACID queries
..


Patch Set 2:

(2 comments)

Did a quick run through again. I most probably will switch to spectator mode 
and follow the review for my education.

http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/Frontend.java@1718
PS2, Line 1718: List lockComponents = new ArrayList<>();
Can you give tables.size() to the arraylist constructor?


http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java
File fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java:

http://gerrit.cloudera.org:8080/#/c/13968/2/fe/src/main/java/org/apache/impala/service/TransactionKeepalive.java@58
PS2, Line 58: stopped_
as Tim noticed, this is never true, right?



--
To view, visit http://gerrit.cloudera.org:8080/13968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa37899b24aa114be642bf8772b4e0f882865cfa
Gerrit-Change-Number: 13968
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 14:18:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13722 )

Change subject: IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4123/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13722
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I19d8d097a45ae6f103b6cd1b2d81aad38dfd9e23
Gerrit-Change-Number: 13722
Gerrit-PatchSet: 12
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 02 Aug 2019 11:44:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8549: Add support for scanning DEFLATE text files

2019-08-02 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13857 )

Change subject: IMPALA-8549: Add support for scanning DEFLATE text files
..


Patch Set 6: Code-Review+1

(3 comments)

The code looks good to me. Did you do any interop tests, e.g. create .deflate 
files with Hive and read them with Impala?

http://gerrit.cloudera.org:8080/#/c/13857/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13857/6//COMMIT_MSG@24
PS6, Line 24: as if their compression type is ZLIB.
Maybe it's also worth to mention that deflate is the default compression in 
Hadoop. Because it's a bit confusing that the default compression is deflate, 
which has three flavors, and we use the zlib flavor.


http://gerrit.cloudera.org:8080/#/c/13857/6/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/13857/6/be/src/exec/hdfs-text-scanner.cc@95
PS6, Line 95: gzip-, snappy- and bzip2
nit: maybe add deflate to this list


http://gerrit.cloudera.org:8080/#/c/13857/6/be/src/exec/hdfs-text-scanner.cc@204
PS6, Line 204: DEFAULT
nit: Please extend the comment why we use DEFAULT here instead of ZLIB.



--
To view, visit http://gerrit.cloudera.org:8080/13857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45e41ab5a12637d396fef0812a09d71fa839b27a
Gerrit-Change-Number: 13857
Gerrit-PatchSet: 6
Gerrit-Owner: Ethan Xue 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Ethan Xue 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Aug 2019 11:05:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1

2019-08-02 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13722 )

Change subject: IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1
..


Patch Set 13:

(39 comments)

http://gerrit.cloudera.org:8080/#/c/13722/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13722/10//COMMIT_MSG@17
PS10, Line 17: must specify a string literal and
 : cannot be used with any other kind of a stri
> nit: must specify a string literal and cannot be used with any other kind o
Done


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/exprs/cast-expr.h
File be/src/exprs/cast-expr.h:

http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/exprs/cast-expr.h@1
PS10, Line 1:
> Source file should be renamed to cast-format-expr.h
Done


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/exprs/cast-expr.h@26
PS10, Line 26:
> nit: /// here and below.
Done


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/exprs/cast-expr.h@44
PS10, Line 44:
> Does 'dt_ctx_' have to be a pointer?
It has to exist on the heap as the way to pass any input to functions such as 
CastFunctions::CastToTimestampVal() is to store the pointer of the data using 
fn_ctx->SetFunctionState().


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/exprs/cast-expr.cc
File be/src/exprs/cast-expr.cc:

http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/exprs/cast-expr.cc@30
PS10, Line 30:
> nit: DCHECK(eval != nullptr);
Done


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/exprs/cast-functions-ir.cc@182
PS10, Line 182:char buf[buf_len];
  : int ret_val = tv.Format(*format_ctx, buf_len, buf);
> Maybe instead of allocating 'buf' on the stack, we should allocate it on th
format length is maximized in the tokenizer:
const int IsoSqlFormatTokenizer::MAX_FORMAT_LENGTH = 100;
I'd keep this as it is.


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/exprs/cast-functions-ir.cc@204
PS10, Line 204: char buf[buf_len];
  : int ret_val = dv.Format(*format_ctx, buf_len, buf);
> Same as L182 above.
Same as above :)


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/date-parse-util.cc
File be/src/runtime/date-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/date-parse-util.cc@127
PS10, Line 127: oSqlFormatParser::Pa
> In DateParser::ParseSimpleDateFormat() dt_ctx.has_date_toks is DCHECKed at
Initially I've set the has_date_toks flag in 
IsoSqlFormatParser::ParseDateTime() s I couldn't dcheck it at the beginning of 
the function. I moved setting that flag to the tokenizer so it's safe to do the 
DCHECK here. thanks for spotting!


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/date-parse-util.cc@152
PS10, Line 152: ar
> <
Done


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/datetime-iso-sql-format-parser.h
File be/src/runtime/datetime-iso-sql-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/datetime-iso-sql-format-parser.h@71
PS10, Line 71: '**tok
> **tok ?
Done


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/datetime-iso-sql-format-parser.cc
File be/src/runtime/datetime-iso-sql-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/datetime-iso-sql-format-parser.cc@30
PS10, Line 30:
> In other .cc files, we just include common/names.h which pulls in all the u
Done


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/datetime-iso-sql-format-parser.cc@134
PS10, Line 134:// Input has already been validated in 
ParseMeridiemIndicatorFromInput().
  : string indicator(current_pos, token_len);
  : boost::to_upper(indicator);
  : if (in
> Add a comment that the token has already been validated in GetNextTokenFrom
Done


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/datetime-iso-sql-format-parser.cc@145
PS10, Line 145:   case ISO8601_TIME_INDICATOR:
> Add DCHECK(token_len == 1)
Done


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/datetime-iso-sql-format-parser.cc@146
PS10, Line 146: SO860
> std qualifier is not necessary, here and elsewhere in the .cc file.
Done


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/datetime-iso-sql-format-parser.cc@172
PS10, Line 172:
> Maybe 'current_tok_ind' ?
Done


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/datetime-iso-sql-format-parser.cc@195
PS10, Line 195:
  :   // The last '-' of a separator s
> Thanks for refactoring the algorithm and moving all the separator skipping
Done


http://gerrit.cloudera.org:8080/#/c/13722/10/be/src/runtime/datetime-iso-sql-format-parser.cc@205
PS10, Line 205:
> This should be called 'FindEndOfToken' or something similar.
Done



[Impala-ASF-CR] IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1

2019-08-02 Thread Gabor Kaszab (Code Review)
Hello Attila Jeges, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13722

to look at the new patch set (#12).

Change subject: IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1
..

IMPALA-8703: ISO:SQL:2016 datetime patterns - Milestone 1

This enhancement introduces FORMAT clause for CAST() operator that is
applicable for casts between string types and timestamp types. Instead
of accepting SimpleDateFormat patterns the FORMAT clause supports
datetime patterns following the ISO:SQL:2016 standard.
Note, the CAST() operator without the FORMAT clause still uses
Impala's implementation of SimpleDateFormat handling. Similarly, the
existing conversion functions such as to_timestamp(), from_timestamp()
etc. remain unchanged and use SimpleDateFormat. Contrary to how these
functions work the FORMAT clause must specify a string literal and
cannot be used with any other kind of a string expression.

Milestone 1 contains all the format tokens covered by the SQL
standard. Further milestones will add more functionality on top of
this list to cover functionality provided by other RDBMS systems.

List of tokens implemented by this change:
- , YYY, YY, Y: Year tokens
- , RR: Round year tokens
- MM: Month
- DD: Day
- DDD: Day of year
- HH, HH12: Hour of day (1-12)
- HH24: Hour of day (0-23)
- MI: Minute
- SS: Second
- S: Second of day
- FF, FF1, ..., FF9: Fractional second
- AM, PM, A.M., P.M.: Meridiem indicators
- TZH: Timezone hour
- TZM: Timezone minute
- Separators: - . / , ' ; : space
- ISO8601 date indicators (T, Z)

Some notes about the matching algorithm:
- The parsing algorithm uses these tokens in a case insensitive
  manner.
- The separators are interchangeable with each other. For example a
  '-' separator in the format will match with a '.' character in the
  input.
- The length of the separator sequences is handled flexibly meaning
  that a single separator character in the format for instance would
  match with a multi-separator sequence in the input.
- In a string type to timestamp conversion the timezone offset tokens
  are parsed, expected to match with the input but they don't adjust
  the result as the input is already expected to be in UTC format.

Usage example:
SELECT CAST('01-02-2019' AS TIMESTAMP FORMAT 'MM-DD-');
SELECT CAST('2019.10.10 13:30:40.123456 +01:30' AS TIMESTAMP
FORMAT '-MM-DD HH24:MI:SS.FF9 TZH:TZM');
SELECT CAST(timestamp_column as STRING
FORMAT " MM HH12 YY") from some_table;

Change-Id: I19d8d097a45ae6f103b6cd1b2d81aad38dfd9e23
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/common/init.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/cast-format-expr.cc
A be/src/exprs/cast-format-expr.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/date-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.h
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-expr.h
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-parse-util.h
M be/src/runtime/date-test.cc
M be/src/runtime/date-value.cc
M be/src/runtime/date-value.h
A be/src/runtime/datetime-iso-sql-format-parser.cc
A be/src/runtime/datetime-iso-sql-format-parser.h
A be/src/runtime/datetime-iso-sql-format-tokenizer.cc
A be/src/runtime/datetime-iso-sql-format-tokenizer.h
D be/src/runtime/datetime-parse-util.h
A be/src/runtime/datetime-parser-common.cc
A be/src/runtime/datetime-parser-common.h
R be/src/runtime/datetime-simple-date-format-parser.cc
A be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M be/src/service/impala-server.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/testutil/random-vector-generators.h
M be/src/util/dict-test.cc
M be/src/util/min-max-filter-test.cc
M be/src/util/string-parser.h
M common/thrift/Exprs.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
A 
testdata/workloads/functional-query/queries/QueryTest/cast_format_from_table.test
M testdata/workloads/functional-query/queries/QueryTest/date.test
A tests/query_test/test_cast_with_format.py
54 files changed, 3,425 insertions(+), 865 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF 

[Impala-ASF-CR] IMPALA-5149: Provide query profile in JSON format

2019-08-02 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13801 )

Change subject: IMPALA-5149: Provide query profile in JSON format
..


Patch Set 12:

(9 comments)

We don't really enforce any compatibility for our runtime profile in general 
now so counters may get added and deleted across releases. So, the json profile 
output may change as Impala code changes.

In the future, when we have a better compatibility story for our profile, it 
may make sense to have a version string in the output.

http://gerrit.cloudera.org:8080/#/c/13801/12/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13801/12/be/src/service/impala-server.cc@845
PS12, Line 845:   if (!parse_ok){
The header comment says the output stream is not modified on error. May be 
worth double checking that json_output not modified on parse error.


http://gerrit.cloudera.org:8080/#/c/13801/12/be/src/util/runtime-profile.h
File be/src/util/runtime-profile.h:

http://gerrit.cloudera.org:8080/#/c/13801/12/be/src/util/runtime-profile.h@618
PS12, Line 618: void ToJsonCounters(rapidjson::Value* parent, 
rapidjson::Document* d,
  :   const string& counter_name, const CounterMap& counter_map,
  :   const ChildCounterMap& child_counter_map) const;
Please document each of the parameters.


http://gerrit.cloudera.org:8080/#/c/13801/11/be/src/util/runtime-profile.h
File be/src/util/runtime-profile.h:

http://gerrit.cloudera.org:8080/#/c/13801/11/be/src/util/runtime-profile.h@126
PS11, Line 126:   Get the actual Counter derived type
Returns the name of the counter type.


http://gerrit.cloudera.org:8080/#/c/13801/12/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/13801/12/be/src/util/runtime-profile.cc@766
PS12, Line 766: _rapid
I noticed that a lot of the Value variables have the "_rapid" suffix. Is 
"_json" a more appropriate suffix ?


http://gerrit.cloudera.org:8080/#/c/13801/12/be/src/util/runtime-profile.cc@784
PS12, Line 784: info_strings_.find(key)->second.
DCHECK(info_strings_.find(key) != info_strings_.end());


http://gerrit.cloudera.org:8080/#/c/13801/12/be/src/util/runtime-profile.cc@802
PS12, Line 802:
nit: unnecessary blank line


http://gerrit.cloudera.org:8080/#/c/13801/12/be/src/util/runtime-profile.cc@869
PS12, Line 869:
nit: unnecessary blank line


http://gerrit.cloudera.org:8080/#/c/13801/12/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/13801/12/tests/webserver/test_web_pages.py@563
PS12, Line 563: ("Text", "Json")
["Text", "Json"]


http://gerrit.cloudera.org:8080/#/c/13801/12/tests/webserver/test_web_pages.py@563
PS12, Line 563: download_format
profile_format



--
To view, visit http://gerrit.cloudera.org:8080/13801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8181ac818bf22207ca1deabd9220c397ae723ec1
Gerrit-Change-Number: 13801
Gerrit-PatchSet: 12
Gerrit-Owner: Jiawei Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jiawei Wang 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 02 Aug 2019 07:08:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7984: Port UpdateFilter() and PublishFilter() to KRPC

2019-08-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13882 )

Change subject: IMPALA-7984: Port UpdateFilter() and PublishFilter() to KRPC
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4122/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13882
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b394796d250286510e157ae326882bfc01d387a
Gerrit-Change-Number: 13882
Gerrit-PatchSet: 12
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 02 Aug 2019 06:10:45 +
Gerrit-HasComments: No