[Impala-ASF-CR] IMPALA-9287: Fix test kudu table create without hms in Hive3

2020-02-03 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15057 )

Change subject: IMPALA-9287: Fix test_kudu_table_create_without_hms in Hive3
..


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15057/9/bin/bootstrap_system.sh
File bin/bootstrap_system.sh:

http://gerrit.cloudera.org:8080/#/c/15057/9/bin/bootstrap_system.sh@315
PS9, Line 315: 500
> nit: 1500
Not 1500, but 500. In 's/\(max_connections = \)\S*/\1500/g', '\1' is a 
backreference that re-inserts the captured 'max_connections = ', and '500' is 
the new value written after it.
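
For readers less familiar with sed backreferences, here is a minimal Python 
sketch of the same substitution. The sample config line is an assumption made 
for illustration (the thread does not show the real file). sed backreferences 
are a single digit, so '\1500' reads as '\1' followed by the literal '500'; 
Python's re module needs the explicit '\g<1>' form to keep the group number 
separate from the digits that follow.

```python
import re

# Hypothetical stand-in for a postgresql.conf line; not taken from the patch.
conf_line = "max_connections = 100"

# Python counterpart of sed's 's/\(max_connections = \)\S*/\1500/g'.
# \g<1> re-inserts the captured prefix; "500" is the new value.
updated = re.sub(r"(max_connections = )\S*", r"\g<1>500", conf_line)

print(updated)  # max_connections = 500
```

The new value is 500, not 1500: the leading '1' belongs to the backreference.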


http://gerrit.cloudera.org:8080/#/c/15057/9/tests/custom_cluster/test_kudu_table_create_without_hms.py
File tests/custom_cluster/test_kudu_table_create_without_hms.py:

http://gerrit.cloudera.org:8080/#/c/15057/9/tests/custom_cluster/test_kudu_table_create_without_hms.py@32
PS9, Line 32:   @pytest.mark.execute_serially
> This should be removed, otherwise the test is skipped when using Hive3.
Done



--
To view, visit http://gerrit.cloudera.org:8080/15057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
Gerrit-Change-Number: 15057
Gerrit-PatchSet: 10
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 04 Feb 2020 04:07:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9287: Fix test kudu table create without hms in Hive3

2020-02-03 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/15057 )

Change subject: IMPALA-9287: Fix test_kudu_table_create_without_hms in Hive3
..

IMPALA-9287: Fix test_kudu_table_create_without_hms in Hive3

When Impala is built with USE_CDP_HIVE=true, the custom cluster test
case test_kudu_table_create_without_hms fails because the related jars
are missing. The solution is to add the related Maven dependencies to
$IMPALA_HOME/fe/pom.xml and $IMPALA_HOME/shaded-deps/pom.xml.

Tests:
  * Ran test_kudu_table_create_without_hms.py locally with
  USE_CDP_HIVE=true set

Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
---
M bin/bootstrap_system.sh
M fe/pom.xml
M shaded-deps/pom.xml
M tests/custom_cluster/test_kudu_table_create_without_hms.py
4 files changed, 31 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/15057/10
--
To view, visit http://gerrit.cloudera.org:8080/15057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
Gerrit-Change-Number: 15057
Gerrit-PatchSet: 10
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-9287: Fix test kudu table create without hms in Hive3

2020-02-03 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15057 )

Change subject: IMPALA-9287: Fix test_kudu_table_create_without_hms in Hive3
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15057/9/bin/bootstrap_system.sh
File bin/bootstrap_system.sh:

http://gerrit.cloudera.org:8080/#/c/15057/9/bin/bootstrap_system.sh@315
PS9, Line 315: 500
nit: 1500


http://gerrit.cloudera.org:8080/#/c/15057/9/tests/custom_cluster/test_kudu_table_create_without_hms.py
File tests/custom_cluster/test_kudu_table_create_without_hms.py:

http://gerrit.cloudera.org:8080/#/c/15057/9/tests/custom_cluster/test_kudu_table_create_without_hms.py@32
PS9, Line 32:   @SkipIfHive3.without_hms_not_supported
This should be removed, otherwise the test is skipped when using Hive3.



--
To view, visit http://gerrit.cloudera.org:8080/15057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
Gerrit-Change-Number: 15057
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 04 Feb 2020 04:00:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9287: Fix test kudu table create without hms in Hive3

2020-02-03 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15057 )

Change subject: IMPALA-9287: Fix test_kudu_table_create_without_hms in Hive3
..


Patch Set 9:

> (1 comment)

Sorry for my late reply, Quanlong. I had already run this test case before 
submitting the patch, and it passed. I checked my test environment and found 
that the pg max_connections had been changed to 1000 (the default is 100). So I 
modified max_connections in $IMPALA_HOME/bin/bootstrap_system.sh. You can test 
the latest code in your environment when you are free. Thanks again for your 
review.


--
To view, visit http://gerrit.cloudera.org:8080/15057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
Gerrit-Change-Number: 15057
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 04 Feb 2020 03:56:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8361: Propagate predicates of outer-joined InlineView

2020-02-03 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15047 )

Change subject: IMPALA-8361: Propagate predicates of outer-joined InlineView
..


Patch Set 9:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/15047/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15047/9//COMMIT_MSG@24
PS9, Line 24: ,
nit: need space


http://gerrit.cloudera.org:8080/#/c/15047/9//COMMIT_MSG@24
PS9, Line 24: the predicates that
: must be evaluted at a join node but can also be safely evaluted by the
: outer-joined inline view.
I think you mean "some predicates that must be evaluated at a join node can 
also be safely evaluated by the outer-joined inline view".


http://gerrit.cloudera.org:8080/#/c/15047/9//COMMIT_MSG@30
PS9, Line 30: .
nit: need space


http://gerrit.cloudera.org:8080/#/c/15047/9/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/15047/9/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@985
PS9, Line 985: picked up by getBoundPredicates()
I think this comment is stale after merging this patch. Could you update it to 
"picked up by getBoundPredicates() and migrateConjunctsToInlineView()"?


http://gerrit.cloudera.org:8080/#/c/15047/9/testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
File testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test:

http://gerrit.cloudera.org:8080/#/c/15047/9/testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test@2368
PS9, Line 2368: predicates: rand() = 12
This looks strange to me. rand() returns a random value between 0 and 1, so 
"rand() = 12" will always be false. All rows should be rejected by the WHERE 
clause. If "rand() = 12" is evaluated on only one side, the other side can 
still produce rows, so the outer join will still have results.

However, it looks like the original planner produces the same plan. Could you 
create a JIRA for this? I think it's a bug. It's worth mentioning in the 
comments above.
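
To make the concern concrete, here is a toy Python model of a left outer join. 
It is only a sketch under stated assumptions: the rows, the helper names and 
the always-false predicate standing in for "rand() = 12" are all hypothetical, 
and the actual query at line 2368 is not shown in this thread.

```python
# Hypothetical miniature inputs: each row is a dict with an "id" column.
left_rows = [{"id": 1}, {"id": 2}]
right_rows = [{"id": 1}, {"id": 2}]


def always_false(_row):
    # Stands in for "rand() = 12", which can never be true.
    return False


def left_outer_join(left, right):
    # Pair each left row with its matching right rows, or with None if none match.
    out = []
    for lrow in left:
        matches = [rrow for rrow in right if rrow["id"] == lrow["id"]]
        if matches:
            out.extend((lrow, rrow) for rrow in matches)
        else:
            out.append((lrow, None))
    return out


# WHERE semantics: the predicate filters the join output, so nothing survives.
where_result = [pair for pair in left_outer_join(left_rows, right_rows)
                if always_false(pair)]

# Predicate applied only to the right input: the left side still yields rows,
# each padded with None (NULL) for the right side.
pushed_result = left_outer_join(
    left_rows, [rrow for rrow in right_rows if always_false(rrow)])

print(len(where_result), len(pushed_result))  # 0 2
```

In this model, the two placements of the predicate give different results, 
which is why a plan that evaluates it on only one side looks like a bug.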


http://gerrit.cloudera.org:8080/#/c/15047/9/testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test@2402
PS9, Line 2402:  ,
nit: put the space after the comma


http://gerrit.cloudera.org:8080/#/c/15047/9/testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test@2442
PS9, Line 2442: upper(b.string_col)
Should we move the function inside the view? This can be propagated without 
this patch. Maybe change it to

SELECT * FROM functional.alltypestiny a
LEFT JOIN
  (SELECT upper(b.string_col) AS string_col, b.id
   FROM functional.alltypestiny a
   LEFT JOIN functional.alltypestiny b ON a.id = b.id) b
ON a.id = b.id
WHERE b.string_col = '1';



--
To view, visit http://gerrit.cloudera.org:8080/15047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c23a45aeb5dd1aa06a95c9aa8628ecbe37ef2c1
Gerrit-Change-Number: 15047
Gerrit-PatchSet: 9
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 04 Feb 2020 03:49:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9174: Emit WARNING when ORC lib leaks memory

2020-02-03 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15153 )

Change subject: IMPALA-9174: Emit WARNING when ORC lib leaks memory
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/15153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I370fb9f68734e0e555bd7224ab0f5440c4947c66
Gerrit-Change-Number: 15153
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 04 Feb 2020 02:37:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15134 )

Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a
Gerrit-Change-Number: 15134
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Tue, 04 Feb 2020 02:26:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9346: Fix TestImpalaShell.test config file failing issue on CentOS6/Python 2.6

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15139 )

Change subject: IMPALA-9346: Fix TestImpalaShell.test_config_file failing issue 
on CentOS6/Python 2.6
..

IMPALA-9346: Fix TestImpalaShell.test_config_file failing issue
on CentOS6/Python 2.6

TestImpalaShell.test_config_file failed in its negative test case,
which ran Impala shell with a badly formatted config file containing a
wrong option name and a wrong option value. The test code expects
Impala shell to return both warning and error messages, but on
CentOS6/Python 2.6 Impala shell only returns the error message. To fix
it, split the test case into two, each running Impala shell with a
different config file.

Testing:
 - Passed all test cases in test_shell_commandline.py and
   test_shell_interactive.py.
 - Passed all core test in pre-review-test.
 - Passed EE tests in impala-private-parameterized with CentOS6.

Change-Id: Ief5e825aa3baead5519132d47efcf0d5300860fd
Reviewed-on: http://gerrit.cloudera.org:8080/15139
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M bin/rat_exclude_files.txt
M tests/shell/impalarc_with_error2
A tests/shell/impalarc_with_warnings2
M tests/shell/test_shell_commandline.py
4 files changed, 19 insertions(+), 4 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/15139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ief5e825aa3baead5519132d47efcf0d5300860fd
Gerrit-Change-Number: 15139
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-9346: Fix TestImpalaShell.test config file failing issue on CentOS6/Python 2.6

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15139 )

Change subject: IMPALA-9346: Fix TestImpalaShell.test_config_file failing issue 
on CentOS6/Python 2.6
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ief5e825aa3baead5519132d47efcf0d5300860fd
Gerrit-Change-Number: 15139
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 04 Feb 2020 00:55:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9337 [DOCS] Document new way to create external Kudu table in Impala

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15149 )

Change subject: IMPALA-9337 [DOCS] Document new way to create external Kudu 
table in Impala
..


Patch Set 3: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/569/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/15149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic07380fd53898dd21fbb5dacb4d9f7a84f160d4e
Gerrit-Change-Number: 15149
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 04 Feb 2020 00:29:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9337 [DOCS] Document new way to create external Kudu table in Impala

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15149 )

Change subject: IMPALA-9337 [DOCS] Document new way to create external Kudu 
table in Impala
..


Patch Set 3:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/569/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/15149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic07380fd53898dd21fbb5dacb4d9f7a84f160d4e
Gerrit-Change-Number: 15149
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 04 Feb 2020 00:24:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9337 [DOCS] Document new way to create external Kudu table in Impala

2020-02-03 Thread Anonymous Coward (Code Review)
Hello Vihang Karajgaonkar, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15149

to look at the new patch set (#3).

Change subject: IMPALA-9337 [DOCS] Document new way to create external Kudu 
table in Impala
..

IMPALA-9337 [DOCS] Document new way to create external Kudu table in Impala

Summary of changes:
- Changed title of "Kudu tables:" paragraph to "Managed Kudu tables:".
- Added syntax block in "External Kudu tables" to show new alternative create 
table syntax.
- Described alternative syntax and the differences between the two resulting 
tables.
- In Kudu considerations section, added an example of creating a synchronized 
external Kudu table.
- Covered the similarity of synchronized tables to managed tables and HMS 
translation of external tables.

Change-Id: Ic07380fd53898dd21fbb5dacb4d9f7a84f160d4e
---
M docs/topics/impala_create_table.xml
1 file changed, 59 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/15149/3
--
To view, visit http://gerrit.cloudera.org:8080/15149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic07380fd53898dd21fbb5dacb4d9f7a84f160d4e
Gerrit-Change-Number: 15149
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9330: Support resolving unmasked nested columns in masked tables

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15108 )

Change subject: IMPALA-9330: Support resolving unmasked nested columns in 
masked tables
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5598/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
Gerrit-Change-Number: 15108
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 04 Feb 2020 00:07:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9289: Fix flakiness in TestConcurrentKuduCreate

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15151 )

Change subject: IMPALA-9289: Fix flakiness in TestConcurrentKuduCreate
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5495/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15151
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idefba98ecd65efbd47b1618291330795ef13b910
Gerrit-Change-Number: 15151
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 03 Feb 2020 23:38:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9289: Fix flakiness in TestConcurrentKuduCreate

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15151 )

Change subject: IMPALA-9289: Fix flakiness in TestConcurrentKuduCreate
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15151
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idefba98ecd65efbd47b1618291330795ef13b910
Gerrit-Change-Number: 15151
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 03 Feb 2020 23:38:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9289: Fix flakiness in TestConcurrentKuduCreate

2020-02-03 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15151 )

Change subject: IMPALA-9289: Fix flakiness in TestConcurrentKuduCreate
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15151
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idefba98ecd65efbd47b1618291330795ef13b910
Gerrit-Change-Number: 15151
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 03 Feb 2020 23:28:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9330: Support resolving unmasked nested columns in masked tables

2020-02-03 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15108 )

Change subject: IMPALA-9330: Support resolving unmasked nested columns in 
masked tables
..


Patch Set 9: Code-Review+2

Carry on Csaba's +2.


--
To view, visit http://gerrit.cloudera.org:8080/15108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
Gerrit-Change-Number: 15108
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 03 Feb 2020 23:21:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9330: Support resolving unmasked nested columns in masked tables

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15108 )

Change subject: IMPALA-9330: Support resolving unmasked nested columns in 
masked tables
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5494/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
Gerrit-Change-Number: 15108
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 03 Feb 2020 23:22:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9330: Support resolving unmasked nested columns in masked tables

2020-02-03 Thread Quanlong Huang (Code Review)
Hello Anurag Mantripragada, Fang-Yu Rao, Vihang Karajgaonkar, Kurt Deschler, 
Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15108

to look at the new patch set (#9).

Change subject: IMPALA-9330: Support resolving unmasked nested columns in 
masked tables
..

IMPALA-9330: Support resolving unmasked nested columns in masked tables

Column masking policies on primitive columns of a table which contains
nested types (though they won't be masked) will cause query failures.
To be specific, if tableA(id int, int_array array) has a masking
policy on column "id", all queries on "tableA" will fail, e.g.
  select id from tableA;
  select t.id, a.item from tableA t, t.int_array a;

Column masking is implemented by wrapping the underlying table/view with
a table masking view. However, as we don't support nested types in
SelectList, the table masking view can't expose nested columns of the
masked table, which prevents collection refs from being resolved
correctly.

This patch fixes the issue in 2 steps:
1) Expose nested columns of the underlying table in the output Type of
   the table masking view (see InlineViewRef#createTupleDescriptor()),
   so that nested Paths in the original query block can be resolved.
2) Resolve such Paths again inside the table masking view, so that they
   point to the underlying table as intended (see
   Analyzer#resolvePathWithMasking()). The TupleDescriptor of such a
   table masking view won't be materialized since the view is simple
   enough that its query plan is just a ScanNode of the underlying
   table. The whole query plan can be stitched together as if the table
   were not masked.
Note that once we support nested columns in SelectList, we may no
longer need these 2 hacks.

This patch also adds some TRACE level logging to improve debuggability.

Test changes in TestRanger.test_column_masking:
 - Add column masking policy on a table containing nested types.
 - Add queries on the masked tables. Some queries are borrowed from
   existing tests for nested types.

Tests:
 - Run CORE tests.

Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
M tests/authorization/test_ranger.py
12 files changed, 527 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/15108/9
--
To view, visit http://gerrit.cloudera.org:8080/15108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
Gerrit-Change-Number: 15108
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9330: Support resolving unmasked nested columns in masked tables

2020-02-03 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15108 )

Change subject: IMPALA-9330: Support resolving unmasked nested columns in 
masked tables
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15108/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15108/8//COMMIT_MSG@7
PS8, Line 7: IMPALA-9330: Support resolving unmasked nested columns in masked 
tables
> nit: maybe add something like "in masked tables"?
Done


http://gerrit.cloudera.org:8080/#/c/15108/8/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/15108/8/tests/authorization/test_ranger.py@806
PS8, Line 806:
> I would prefer to give an error, but I am ok with the current status.
I filed HIVE-22822 and HIVE-22823 for Hive.



--
To view, visit http://gerrit.cloudera.org:8080/15108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
Gerrit-Change-Number: 15108
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 03 Feb 2020 23:21:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7002: Throw AuthorizationException when user accessing non-existent table/database in CTE without any privilege.

2020-02-03 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15123 )

Change subject: IMPALA-7002: Throw AuthorizationException when user accessing 
non-existent table/database in CTE without any privilege.
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15123/4/fe/src/main/java/org/apache/impala/analysis/WithClause.java
File fe/src/main/java/org/apache/impala/analysis/WithClause.java:

http://gerrit.cloudera.org:8080/#/c/15123/4/fe/src/main/java/org/apache/impala/analysis/WithClause.java@94
PS4, Line 94:  } catch (AnalysisException e) {
:   throw e;
nit: we don't need the catch block if you are only rethrowing the exception here
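
The point is language-independent: a handler that only rethrows behaves the 
same as having no handler. Below is a small Python analogue (the patch itself 
is Java); AnalysisError and the function names are hypothetical stand-ins, not 
code from WithClause.java.

```python
class AnalysisError(Exception):
    """Hypothetical stand-in for Java's AnalysisException."""


def analyze():
    raise AnalysisError("table does not exist")


def with_redundant_handler():
    try:
        analyze()
    except AnalysisError:
        raise  # rethrows unchanged, so the handler adds nothing


def without_handler():
    analyze()  # the exception propagates to the caller in the same way


def outcome(fn):
    # Report the error message the caller observes, or None if nothing is raised.
    try:
        fn()
    except AnalysisError as err:
        return str(err)
    return None


print(outcome(with_redundant_handler) == outcome(without_handler))  # True
```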


http://gerrit.cloudera.org:8080/#/c/15123/4/fe/src/main/java/org/apache/impala/analysis/WithClause.java@98
PS4, Line 98:   // withClauseAnalyzer is local variable. The privilege requests registered
:   // on it have to be re-registered to the root analyzer even when analyze
:   // function throw AnalysisException since authorization check is required
:   // for non existent database/table.
nit: this doesn't really tell me why they need to be re-registered.
Also, you used a specific example of when we need auth checks. Is that the only 
case where an exception can be thrown? If not, it might be worthwhile to 
investigate whether it is OK to register auth checks in those other cases.
If you end up concluding that auth checks need to be registered in all cases, 
then the method comment should be generic enough to encompass all cases.



--
To view, visit http://gerrit.cloudera.org:8080/15123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6b657a7147a136198a9a97a679c9131ee814577
Gerrit-Change-Number: 15123
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 03 Feb 2020 23:08:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8712: Make ExecQueryFInstances async

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15154 )

Change subject: IMPALA-8712: Make ExecQueryFInstances async
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5597/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I33ec96e5885af094c294cd3a76c242995263ba32
Gerrit-Change-Number: 15154
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 03 Feb 2020 23:06:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9335 (part 2): Fix rebased KRPC to compile

2020-02-03 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15144 )

Change subject: IMPALA-9335 (part 2): Fix rebased KRPC to compile
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15144/2/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/15144/2/be/src/runtime/io/data-cache.cc@558
PS2, Line 558: handle.get()
> Nit: you don't need the .get()
I know. I prefer doing it this way because it makes it more obvious to readers 
of the code that it's a unique_ptr, but I don't feel strongly about it and can 
remove it if you prefer.



--
To view, visit http://gerrit.cloudera.org:8080/15144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eb4caf927c729109426fb50a28b5e15d6ac46cb
Gerrit-Change-Number: 15144
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 03 Feb 2020 22:57:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9289: Fix flakiness in TestConcurrentKuduCreate

2020-02-03 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15151 )

Change subject: IMPALA-9289: Fix flakiness in TestConcurrentKuduCreate
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/15151/1/tests/custom_cluster/test_concurrent_kudu_create.py
File tests/custom_cluster/test_concurrent_kudu_create.py:

http://gerrit.cloudera.org:8080/#/c/15151/1/tests/custom_cluster/test_concurrent_kudu_create.py@a47
PS1, Line 47:
> I am not very familiar with the usage of tls.client. Is it true that we do
Replied to this at line 65.


http://gerrit.cloudera.org:8080/#/c/15151/1/tests/custom_cluster/test_concurrent_kudu_create.py@51
PS1, Line 51: pool = ThreadPool(processes=3)
> Just like to check whether or not my understanding is correct. The reason t
Yes, we can reuse the threads and therefore reuse the client (connection) in 
the thread.


http://gerrit.cloudera.org:8080/#/c/15151/1/tests/custom_cluster/test_concurrent_kudu_create.py@65
PS1, Line 65: pool
> I looked at other usages of client and it seems that we don't need to expli
Yes, when the threads are terminated, their connections are closed so the 
associated sessions are closed too. I verified this in impalad.INFO. By the way, 
since this is a custom cluster test, the cluster will be restarted after the 
test, so we don't need to worry about potential session leaks affecting other 
tests.
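
A minimal sketch of the reuse being described, assuming a thread-local client created lazily on first use. FakeClient, get_client, run_query and run_all are hypothetical names for illustration and are not taken from the test file; only ThreadPool(processes=3), the thread-local client and pool.terminate() come from the discussion above.

```python
import threading
from multiprocessing.pool import ThreadPool

tls = threading.local()
created_clients = []
created_lock = threading.Lock()


class FakeClient(object):
    """Hypothetical stand-in for a client connection."""

    def __init__(self):
        self.queries = 0

    def execute(self, sql):
        self.queries += 1
        return "ok: " + sql


def get_client():
    # Each worker thread creates its client once and then reuses it.
    if not hasattr(tls, "client"):
        tls.client = FakeClient()
        with created_lock:
            created_clients.append(tls.client)
    return tls.client


def run_query(i):
    return get_client().execute("create table t%d" % i)


def run_all(iterations=5, per_iteration=6):
    # The pool lives outside the loop, so its threads (and their
    # clients) are shared across iterations.
    pool = ThreadPool(processes=3)
    results = []
    try:
        for _ in range(iterations):
            results.extend(pool.map(run_query, range(per_iteration)))
    finally:
        pool.terminate()
    return results
```

With the pool created once, at most three clients are ever constructed, regardless of how many iterations run.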



--
To view, visit http://gerrit.cloudera.org:8080/15151
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idefba98ecd65efbd47b1618291330795ef13b910
Gerrit-Change-Number: 15151
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 03 Feb 2020 22:56:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9335 (part 2): Fix rebased KRPC to compile

2020-02-03 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15144 )

Change subject: IMPALA-9335 (part 2): Fix rebased KRPC to compile
..


Patch Set 2: Code-Review+1

(3 comments)

This is looking good to me. I compared the changes in this patch to the 
existing code in our KRPC.

http://gerrit.cloudera.org:8080/#/c/15144/2/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/15144/2/be/src/runtime/io/data-cache.cc@558
PS2, Line 558: handle.get()
Nit: you don't need the .get()


http://gerrit.cloudera.org:8080/#/c/15144/2/be/src/runtime/io/data-cache.cc@616
PS2, Line 616: .get()
Nit: Don't need the .get()


http://gerrit.cloudera.org:8080/#/c/15144/2/be/src/runtime/io/data-cache.cc@649
PS2, Line 649: .get()
Nit: don't need the .get()



--
To view, visit http://gerrit.cloudera.org:8080/15144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eb4caf927c729109426fb50a28b5e15d6ac46cb
Gerrit-Change-Number: 15144
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 03 Feb 2020 22:44:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8712: Make ExecQueryFInstances async

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15154 )

Change subject: IMPALA-8712: Make ExecQueryFInstances async
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/15154/1/tests/custom_cluster/test_rpc_exception.py
File tests/custom_cluster/test_rpc_exception.py:

http://gerrit.cloudera.org:8080/#/c/15154/1/tests/custom_cluster/test_rpc_exception.py@27
PS1, Line 27: def get_rpc_debug_action(rpc, action, port=KRPC_PORT):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/15154/1/tests/custom_cluster/test_rpc_exception.py@33
PS1, Line 33: def get_fail_action(rpc, error=None, port=KRPC_PORT, p=0.1):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/15154/1/tests/custom_cluster/test_rpc_exception.py@147
PS1, Line 147: r
flake8: F841 local variable 'result' is assigned to but never used



--
To view, visit http://gerrit.cloudera.org:8080/15154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I33ec96e5885af094c294cd3a76c242995263ba32
Gerrit-Change-Number: 15154
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 03 Feb 2020 22:21:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8712: Make ExecQueryFInstances async

2020-02-03 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15154 )


Change subject: IMPALA-8712: Make ExecQueryFInstances async
..

IMPALA-8712: Make ExecQueryFInstances async

This patch refactors the ExecQueryFInstances rpc to be asynchronous.
Previously, Impala would issue all the Exec()s, wait for all of them
to complete, and then check if any of them resulted in an error. We
now stop issuing Exec()s and cancel any that are still in flight as
soon as an error occurs.
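
The fail-fast behaviour described above can be illustrated with a small Python asyncio sketch. This is not the coordinator's C++/KRPC code: exec_rpc and start_query are hypothetical names, the delays simulate RPC latency, and the sketch covers only the "cancel what is still in flight on the first error" part, since it issues all calls up front.

```python
import asyncio


async def exec_rpc(backend, delay, fail=False):
    """Hypothetical Exec() call: sleeps, then succeeds or raises."""
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError("Exec() failed on %s" % backend)
    return backend


async def start_query(backends):
    """backends is a list of (name, delay_seconds, should_fail) tuples."""
    tasks = [asyncio.ensure_future(exec_rpc(*b)) for b in backends]
    # Return as soon as any task raises, instead of waiting for all.
    done, pending = await asyncio.wait(
        tasks, return_when=asyncio.FIRST_EXCEPTION)
    for task in pending:
        task.cancel()
    # Let the cancellations be delivered before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    errors = [t.exception() for t in done if t.exception() is not None]
    return errors, len(pending)
```

Calling asyncio.run(start_query([...])) with one failing backend and one slow backend returns quickly with the error, and the slow call is reported as cancelled rather than awaited.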

It also performs some cleanup around the thread safety of
Coordinator::BackendState, including adding comments and DCHECKS.

=== Exec RPC Thread Pool ===
This patch also removes the 'exec_rpc_thread_pool_' from ExecEnv. This
thread pool was used to partially simulate async Exec() prior to the
switch to KRPC, which provides built-in async rpc capabilities.

Removing this thread pool has potential performance implications, as
it means that the Exec() parameters are serialized one at a time rather
than in parallel (with the level of parallelism determined by the size
of the thread pool, which was configurable by an Advanced flag and
defaulted to 12).

To ensure we don't regress query startup times, I did some performance
testing. All tests were done on a 10 node cluster. The baseline used
for the tests did not include IMPALA-9181, a perf optimization for
query startup done to facilitate this work.

I ran TPCH 100 at concurrency levels of 1, 4, and 8 and extracted the
query startup times from the profiles. For each concurrency level, the
average regression in query startup time was < 2ms. Because query e2e
running time was much longer than this, there was no noticeable change
in total query time.

I also ran a 'worst case scenario' with a table with 10,000 partitions
to create a very large Exec() payload to serialize (~1.21MB vs.
~10KB-30KB for TPCH 100). Again, the change in query startup time was
negligible.


TODO: once IMPALA-9335 (krpc rebase) goes in, the change in
be/src/kudu/rpc/connection.cc can be removed from this patch.

Testing:
- Added an e2e test that verifies that a query where an Exec() fails
  doesn't wait for all Exec()s to complete before cancelling and
  returning the error to the client.

Change-Id: I33ec96e5885af094c294cd3a76c242995263ba32
---
M be/src/common/global-flags.cc
M be/src/kudu/rpc/connection.cc
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M tests/custom_cluster/test_rpc_exception.py
M tests/failure/test_failpoints.py
11 files changed, 382 insertions(+), 201 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/15154/1
--
To view, visit http://gerrit.cloudera.org:8080/15154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I33ec96e5885af094c294cd3a76c242995263ba32
Gerrit-Change-Number: 15154
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15134 )

Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5493/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/15134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a
Gerrit-Change-Number: 15134
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Mon, 03 Feb 2020 21:40:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9346: Fix TestImpalaShell.test config file failing issue on CentOS6/Python 2.6

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15139 )

Change subject: IMPALA-9346: Fix TestImpalaShell.test_config_file failing issue 
on CentOS6/Python 2.6
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5492/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ief5e825aa3baead5519132d47efcf0d5300860fd
Gerrit-Change-Number: 15139
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 03 Feb 2020 20:00:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9346: Fix TestImpalaShell.test config file failing issue on CentOS6/Python 2.6

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15139 )

Change subject: IMPALA-9346: Fix TestImpalaShell.test_config_file failing issue 
on CentOS6/Python 2.6
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ief5e825aa3baead5519132d47efcf0d5300860fd
Gerrit-Change-Number: 15139
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 03 Feb 2020 20:00:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9346: Fix TestImpalaShell.test config file failing issue on CentOS6/Python 2.6

2020-02-03 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15139 )

Change subject: IMPALA-9346: Fix TestImpalaShell.test_config_file failing issue 
on CentOS6/Python 2.6
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ief5e825aa3baead5519132d47efcf0d5300860fd
Gerrit-Change-Number: 15139
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 03 Feb 2020 20:00:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15134 )

Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5491/


--
To view, visit http://gerrit.cloudera.org:8080/15134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a
Gerrit-Change-Number: 15134
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Mon, 03 Feb 2020 19:42:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8587: Show inherited privileges with Ranger show grant

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15111 )

Change subject: IMPALA-8587: Show inherited privileges with Ranger show grant
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5596/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15111
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4e679dc6fcf8d0b0e4e0fc2e9b335e2d8bc0899
Gerrit-Change-Number: 15111
Gerrit-PatchSet: 5
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 03 Feb 2020 19:27:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9337 [DOCS] Document new way to create external Kudu table in Impala

2020-02-03 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15149 )

Change subject: IMPALA-9337 [DOCS] Document new way to create external Kudu 
table in Impala
..


Patch Set 2:

(4 comments)

Thanks for documenting this.

http://gerrit.cloudera.org:8080/#/c/15149/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15149/2//COMMIT_MSG@10
PS2, Line 10: Internal
I think the usage of the word "Internal" is confusing. Users are familiar with 
"managed" and "external" tables.


http://gerrit.cloudera.org:8080/#/c/15149/2//COMMIT_MSG@11
PS2, Line 11: TBLPROPERTIES
See my comment later, the tblproperties syntax only applies when creating the 
external table with explicit column spec.


http://gerrit.cloudera.org:8080/#/c/15149/2/docs/topics/impala_create_table.xml
File docs/topics/impala_create_table.xml:

http://gerrit.cloudera.org:8080/#/c/15149/2/docs/topics/impala_create_table.xml@241
PS2, Line 241:   TBLPROPERTIES 
[('kudu.table_name'='internal_kudu_name')] | 
[('external.table.purge'='true')] 
[,('key1'='value1', 
'key2'='value2', ...)]
I think this may be confusing to some users since the 'external.table.purge'='true' 
property must be used only when we provide the column spec, as in the case of a 
managed table. I think maybe we should modify the Kudu tables SQL syntax as 
follows:

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  (col_name data_type
[kudu_column_attribute ...]
[COMMENT 'col_comment']
[, ...]
[PRIMARY KEY (col_name[, ...])]
  )
  [PARTITION BY kudu_partition_clause]
  [COMMENT 'table_comment']
  STORED AS KUDU
  [TBLPROPERTIES ('key1'='value1', 'key2'='value2', ...)] | 
[('external.table.purge'='true', 'key1'='value1',...)]

Also, in the section Kudu considerations, can we add a section which talks 
about such external tables?

Something like:

From version 3.4 onward, when Impala is integrated with Hive metastore 3, 
managed Kudu tables are translated by default to external Kudu tables by HMS 
with the 'external.table.purge' property set to true. Such synchronized tables 
behave similarly to managed tables. A drop table command on such a table will 
remove the underlying Kudu table. Similarly, an alter table rename ... command 
will rename the underlying Kudu table. Users can also explicitly create such 
external Kudu tables similar to managed Kudu tables. An example of creating 
such tables is given below. The table property 'external.table.purge' must be 
set to true.

CREATE EXTERNAL TABLE myextkudutbl (
id int PRIMARY KEY,
name string)
PARTITION BY HASH PARTITIONS 8
STORED AS KUDU
TBLPROPERTIES ('external.table.purge'='true')


http://gerrit.cloudera.org:8080/#/c/15149/2/docs/topics/impala_create_table.xml@248
PS2, Line 248: you do not need to create a pre-existing schema in Kudu before
 :   creating an external Kudu table in Impala
Creating an external table on a pre-existing schema in Kudu is still a valid 
use case. Maybe change the wording to say that an alternative way to create an 
external table is ... and describe some of the differences between the two.



--
To view, visit http://gerrit.cloudera.org:8080/15149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic07380fd53898dd21fbb5dacb4d9f7a84f160d4e
Gerrit-Change-Number: 15149
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 03 Feb 2020 18:57:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8587: Show inherited privileges with Ranger show grant

2020-02-03 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/15111 )

Change subject: IMPALA-8587: Show inherited privileges with Ranger show grant
..

IMPALA-8587: Show inherited privileges with Ranger show grant

Previously when executing a SHOW GRANT statement on a resource with
Ranger authorization enabled, Impala would not show inherited
privileges. For example, consider a user 'foo' with database-level
privileges granted by:

GRANT SELECT ON DATABASE db TO USER foo;

If later on we would like to retrieve the table-level privileges
associated with the user 'foo' by:

SHOW GRANT USER foo ON TABLE db.table;

We would not see any result before this change. After this change, the
related privileges including the inherited privileges with regard to the
specified resource will be shown. In our example described above, we
will see the following result and therefore the result returned by SHOW
GRANT statement is more informative than the case in which only the
privileges on 'db'.'table' were shown. Notice that in the following
returned result, we are also able to know the specified user's
privileges on any other table under the database 'db'.

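+----------------+----------------+----------+-------+--------+-----+-----+-----------+--------------+---------------+
| principal_type | principal_name | database | table | column | uri | udf | privilege | grant_option | create_time   |
+----------------+----------------+----------+-------+--------+-----+-----+-----------+--------------+---------------+
| USER           | foo            | db       | *     | *      |     |     | select    | false        | 1580174954746 |
+----------------+----------------+----------+-------+--------+-----+-----+-----------+--------------+---------------+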
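
The inheritance rule in this example can be sketched in a few lines of Python. This is an illustrative model only, not the Ranger or Impala implementation: the Grant tuple and show_grant function are hypothetical, and the sketch handles only database-to-table inheritance, treating '*' in the table field as "all tables in the database".

```python
from collections import namedtuple

# Hypothetical, simplified grant record for illustration.
Grant = namedtuple("Grant", "principal database table column privilege")


def show_grant(grants, principal, database, table):
    """Return grants on database.table plus those inherited from the database."""
    return [g for g in grants
            if g.principal == principal
            and g.database == database
            and g.table in ("*", table)]


grants = [
    Grant("foo", "db", "*", "*", "select"),      # GRANT SELECT ON DATABASE db
    Grant("foo", "other", "*", "*", "select"),   # unrelated database
    Grant("bar", "db", "table", "*", "insert"),  # different principal
]
```

Here show_grant(grants, "foo", "db", "table") returns the database-level grant, mirroring the single row shown in the result table above.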

Testing
- Ran all FE tests
- Ran all authorization E2E tests
- Added E2E tests in test_ranger verifying functionality

Change-Id: Ia4e679dc6fcf8d0b0e4e0fc2e9b335e2d8bc0899
---
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpaladAuthorizationManager.java
M tests/authorization/test_ranger.py
2 files changed, 235 insertions(+), 70 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/15111/5
--
To view, visit http://gerrit.cloudera.org:8080/15111
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia4e679dc6fcf8d0b0e4e0fc2e9b335e2d8bc0899
Gerrit-Change-Number: 15111
Gerrit-PatchSet: 5
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-9336: [DOCS] Primary and foreign key constraint syntax

2020-02-03 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15146 )

Change subject: IMPALA-9336: [DOCS] Primary and foreign key constraint syntax
..

IMPALA-9336: [DOCS] Primary and foreign key constraint syntax

CREATE TABLE syntax for primary key and foreign keys spec

Change-Id: Iee12da322fbdab7c671c17ceb8436bc3ace2b820
Reviewed-on: http://gerrit.cloudera.org:8080/15146
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Thomas Tauber-Marshall 
---
M docs/topics/impala_create_table.xml
1 file changed, 33 insertions(+), 6 deletions(-)

Approvals:
  Thomas Tauber-Marshall: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/15146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iee12da322fbdab7c671c17ceb8436bc3ace2b820
Gerrit-Change-Number: 15146
Gerrit-PatchSet: 6
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-9336: [DOCS] Primary and foreign key constraint syntax

2020-02-03 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15146 )

Change subject: IMPALA-9336: [DOCS] Primary and foreign key constraint syntax
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee12da322fbdab7c671c17ceb8436bc3ace2b820
Gerrit-Change-Number: 15146
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 03 Feb 2020 18:34:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9336: [DOCS] Primary and foreign key constraint syntax

2020-02-03 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15146 )

Change subject: IMPALA-9336: [DOCS] Primary and foreign key constraint syntax
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee12da322fbdab7c671c17ceb8436bc3ace2b820
Gerrit-Change-Number: 15146
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 03 Feb 2020 18:31:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9289: Fix flakiness in TestConcurrentKuduCreate

2020-02-03 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15151 )

Change subject: IMPALA-9289: Fix flakiness in TestConcurrentKuduCreate
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15151/1/tests/custom_cluster/test_concurrent_kudu_create.py
File tests/custom_cluster/test_concurrent_kudu_create.py:

http://gerrit.cloudera.org:8080/#/c/15151/1/tests/custom_cluster/test_concurrent_kudu_create.py@65
PS1, Line 65: pool
I looked at other usages of client and it seems that we don't need to 
explicitly call a close on it. Is that correct?



--
To view, visit http://gerrit.cloudera.org:8080/15151
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idefba98ecd65efbd47b1618291330795ef13b910
Gerrit-Change-Number: 15151
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 03 Feb 2020 18:27:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9336: [DOCS] Primary and foreign key constraint syntax

2020-02-03 Thread Anonymous Coward (Code Review)
Hello Anurag Mantripragada, Thomas Tauber-Marshall, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15146

to look at the new patch set (#5).

Change subject: IMPALA-9336: [DOCS] Primary and foreign key constraint syntax
..

IMPALA-9336: [DOCS] Primary and foreign key constraint syntax

CREATE TABLE syntax for primary key and foreign keys spec

Change-Id: Iee12da322fbdab7c671c17ceb8436bc3ace2b820
---
M docs/topics/impala_create_table.xml
1 file changed, 33 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/15146/5
--
To view, visit http://gerrit.cloudera.org:8080/15146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee12da322fbdab7c671c17ceb8436bc3ace2b820
Gerrit-Change-Number: 15146
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-8755: Backend support for Z-ordering

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14080 )

Change subject: IMPALA-8755: Backend support for Z-ordering
..


Patch Set 23: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5490/


--
To view, visit http://gerrit.cloudera.org:8080/14080
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0200748ce3e65ebc5d3530f794c0f80aa335a2ab
Gerrit-Change-Number: 14080
Gerrit-PatchSet: 23
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Anonymous Coward (520)
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 18:17:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9346: Fix TestImpalaShell.test config file failing issue on CentOS6/Python 2.6

2020-02-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15139 )

Change subject: IMPALA-9346: Fix TestImpalaShell.test_config_file failing issue 
on CentOS6/Python 2.6
..


Patch Set 5: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/15139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ief5e825aa3baead5519132d47efcf0d5300860fd
Gerrit-Change-Number: 15139
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 03 Feb 2020 18:14:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9336: [DOCS] constraints

2020-02-03 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15146 )

Change subject: IMPALA-9336: [DOCS] constraints
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15146/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15146/4//COMMIT_MSG@7
PS4, Line 7: constraints
Usually we want the first line of the commit to be more descriptive, so maybe 
something like "primary and foreign key constraint syntax" or similar



--
To view, visit http://gerrit.cloudera.org:8080/15146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee12da322fbdab7c671c17ceb8436bc3ace2b820
Gerrit-Change-Number: 15146
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 03 Feb 2020 18:12:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9336: [DOCS] constraints

2020-02-03 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15146 )

Change subject: IMPALA-9336: [DOCS] constraints
..


Patch Set 4: Code-Review+1

Looks good to me.


--
To view, visit http://gerrit.cloudera.org:8080/15146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee12da322fbdab7c671c17ceb8436bc3ace2b820
Gerrit-Change-Number: 15146
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 03 Feb 2020 18:04:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9289: Fix flakiness in TestConcurrentKuduCreate

2020-02-03 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15151 )

Change subject: IMPALA-9289: Fix flakiness in TestConcurrentKuduCreate
..


Patch Set 1: Code-Review+1

(2 comments)

Thanks to Quanlong for providing a fix promptly! I only left 2 minor comments.

http://gerrit.cloudera.org:8080/#/c/15151/1/tests/custom_cluster/test_concurrent_kudu_create.py
File tests/custom_cluster/test_concurrent_kudu_create.py:

http://gerrit.cloudera.org:8080/#/c/15151/1/tests/custom_cluster/test_concurrent_kudu_create.py@a47
PS1, Line 47:
I am not very familiar with the usage of tls.client. Is it true that we do not 
have to explicitly call close() before we are going to terminate the thread 
pool after the for-loop at 
https://gerrit.cloudera.org/c/15151/1/tests/custom_cluster/test_concurrent_kudu_create.py#65?
 Thanks!


http://gerrit.cloudera.org:8080/#/c/15151/1/tests/custom_cluster/test_concurrent_kudu_create.py@51
PS1, Line 51: pool = ThreadPool(processes=3)
I'd just like to check whether my understanding is correct. Is the reason to 
move ThreadPool() and pool.terminate() out of the for-loop to reduce the 
overhead of thread pool creation?
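
For anyone following the two questions above, here is a minimal Python sketch 
of the pattern under discussion: one ThreadPool created before the loop and 
terminated once after it, with each worker thread lazily creating its own 
client through threading.local(). FakeClient, get_client(), create_table() and 
run_rounds() are hypothetical names made up for illustration; this is not the 
code in test_concurrent_kudu_create.py, and whether the real tls.client needs 
an explicit close() is still the open question.

```python
import threading
from multiprocessing.pool import ThreadPool


class FakeClient(object):
    """Hypothetical stand-in for a per-thread client connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


tls = threading.local()        # each worker thread sees its own attributes
created_clients = []           # lets the main thread close every client later
created_lock = threading.Lock()


def get_client():
    # Lazily create one client per worker thread and remember it.
    if not hasattr(tls, "client"):
        tls.client = FakeClient()
        with created_lock:
            created_clients.append(tls.client)
    return tls.client


def create_table(i):
    # Placeholder for the real per-task work; it only touches the client.
    get_client()
    return i


def run_rounds(num_rounds, tasks_per_round):
    # The pool is created once, reused by every iteration, and torn down once.
    pool = ThreadPool(processes=3)
    results = []
    try:
        for _ in range(num_rounds):
            results.append(pool.map(create_table, range(tasks_per_round)))
    finally:
        pool.terminate()
        pool.join()
        # Thread-local objects are not closed automatically when the pool
        # goes away, so this sketch closes them explicitly.
        for client in created_clients:
            client.close()
    return results
```

With the pool outside the loop, at most three clients are ever created no 
matter how many rounds run, which is one way to see the saving being asked 
about.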



--
To view, visit http://gerrit.cloudera.org:8080/15151

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idefba98ecd65efbd47b1618291330795ef13b910
Gerrit-Change-Number: 15151
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 03 Feb 2020 17:59:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4224: execute separate join builds fragments

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14859 )

Change subject: IMPALA-4224: execute separate join builds fragments
..


Patch Set 35:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5595/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14859

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58
Gerrit-Change-Number: 14859
Gerrit-PatchSet: 35
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 17:51:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9174: Emit WARNING when ORC lib leaks memory

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15153 )

Change subject: IMPALA-9174: Emit WARNING when ORC lib leaks memory
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5594/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15153

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I370fb9f68734e0e555bd7224ab0f5440c4947c66
Gerrit-Change-Number: 15153
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 03 Feb 2020 17:24:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9290: ORC scanner should support schema evolution between date and timestamp types

2020-02-03 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15152 )

Change subject: IMPALA-9290: ORC scanner should support schema evolution 
between date and timestamp types
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/15152/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15152/2//COMMIT_MSG@7
PS2, Line 7: between date and timestamp types
Looking at IMPALA-6373 it seems like Impala usually supports these conversions 
when there is no loss of information.

Timestamp and Date don't have the same range, but anyway, supporting Date -> 
Timestamp seems more reasonable.

What is the behavior of other file formats, e.g. Parquet? I don't think we want 
different behavior.


http://gerrit.cloudera.org:8080/#/c/15152/2//COMMIT_MSG@12
PS2, Line 12: to the same ORC file
nit: "to the same set of ORC files" probably expresses the intent better


http://gerrit.cloudera.org:8080/#/c/15152/2/be/src/exec/orc-column-readers.h
File be/src/exec/orc-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/15152/2/be/src/exec/orc-column-readers.h@121
PS2, Line 121:   Status HandleInvalidValue(Tuple* tuple, TErrorCode::type 
error_code)
nit: Please add comment about what is an invalid value and how it is handled.


http://gerrit.cloudera.org:8080/#/c/15152/2/be/src/exec/orc-column-readers.h@214
PS2, Line 214: ORC support schema evolution
What does "ORC support schema evolution" mean?


http://gerrit.cloudera.org:8080/#/c/15152/2/be/src/exec/orc-column-readers.h@214
PS2, Line 214: Date and Timestamp tables
Does Hive convert in both directions?


http://gerrit.cloudera.org:8080/#/c/15152/2/be/src/exec/orc-column-readers.h@253
PS2, Line 253: (source_type_ == orc::TypeKind::TIMESTAMP &&
 :   static_cast(batch_) ==
 :   dynamic_cast(orc_batch)) ||
 : (source_type_ == orc::TypeKind::DATE &&
 :   static_cast(batch_) ==
 :   dynamic_cast(orc_batch))
nit: complicated and duplicated in OrcTimestampReader. Can you put it into a 
function with a name like 'IsDateTime' or something like that?



--
To view, visit http://gerrit.cloudera.org:8080/15152

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7979ecc61b2ab900090d01bc81e7bb7b28c99c9e
Gerrit-Change-Number: 15152
Gerrit-PatchSet: 2
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 17:18:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4224: execute separate join builds fragments

2020-02-03 Thread Tim Armstrong (Code Review)
Hello Zoltan Borok-Nagy, Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14859

to look at the new patch set (#35).

Change subject: IMPALA-4224: execute separate join builds fragments
..

IMPALA-4224: execute separate join builds fragments

This enables parallel plans with the join build in a
separate fragment and fixes all of the ensuing fallout.
After this change, mt_dop plans with joins have separate
build fragments. There is still a 1:1 relationship between
join nodes and builders, so the builders are only accessed
by the join node's thread after it is handed off. This lets
us defer the work required to make PhjBuilder and NljBuilder
safe to be shared between nodes.

Planner changes:
* Combined the parallel and distributed planning code paths.
* Misc fixes to generate reasonable thrift structures in the
  query exec requests, i.e. containing the right nodes.
* Fixes to resource calculations for the separate build plans.
** Calculate separate join/build resource consumption.
** Simplified the resource estimation by calculating resource
   consumption for each fragment separately, and assuming that
   all fragments hit their peak resource consumption at the
   same time. IMPALA-9255 is the follow-on to make the resource
   estimation more accurate.
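
As a rough illustration of the simplified estimate described in the last 
bullet (not the planner's actual code; the function name, fragment labels and 
byte counts below are made up), summing per-fragment peaks amounts to this 
Python sketch:

```python
def estimate_query_peak_bytes(fragment_peaks):
    """Conservative estimate: assume every fragment peaks at the same time."""
    return sum(fragment_peaks.values())


# Hypothetical per-fragment peak estimates, in bytes.
fragment_peaks = {
    "F00 (probe side)": 64 * 1024 * 1024,
    "F01 (join build)": 256 * 1024 * 1024,
    "F02 (scan)": 32 * 1024 * 1024,
}

total = estimate_query_peak_bytes(fragment_peaks)
```

The sum can never be lower than the largest single fragment's peak, so it is 
an upper bound that may overestimate when fragments do not actually peak 
together.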

Scheduler changes:
* Various fixes to handle multiple TPlanExecInfos correctly,
  which are generated by the planner for the different cohorts.
* Add logic to colocate build fragments with parent fragments.

Runtime filter changes:
* Build sinks now produce runtime filters, which required
  planner and coordinator fixes to handle accordingly.

DataSink changes:
* Close the input plan tree before calling FlushFinal() to release
  resources. This depends on Send() not holding onto references
  to input batches, which was true except for NljBuilder. This
  invariant is documented.

Join builder changes:
* Add a common base class for PhjBuilder and NljBuilder with
  functions to handle synchronisation with the join node.
* Close plan tree earlier in FragmentInstanceState::Exec()
  so that peak resource requirements are lower.
* The NLJ always copies input batches, so that it can close
  its input tree.

JoinNode changes:
* Join node blocks waiting for build-side to be ready,
  then eventually signals that it's done, allowing the builder
  to be cleaned up.
* NLJ and PHJ nodes handle both the integrated builder and
  the external builder. There is a 1:1 relationship between
  the node and the builder, so we don't deal with thread safety
  yet.
* Buffer reservations are transferred between the builder and join
  node when running with the separate builder. This is not really
  necessary right now, since it is all single-threaded, but will
  be important for the shared broadcast.
  - The builder transfers memory for probe buffers to the join node
    at the end of each build phase.
  - At the end of each probe phase, the reservation needs to be
    handed back to the builder (or released).

ExecSummary changes:
* The summary logic was modified to handle connecting fragments
  via join builds. The logic is an extension of what was used
  for exchanges.

Testing:
* Enable --unlock_mt_dop for end-to-end tests
* Migrate some tests to run as part of end-to-end tests instead of
  custom cluster.
* Add mt_dop dimension to various end-to-end tests to provide
  coverage of join queries, spill-to-disk and cancellation.
* Ran a single node TPC-H and TPC-DS stress test with mt_dop=0
  and mt_dop=4.

Perf:
* Ran TPC-H scale factor 30 locally with mt_dop=0. No significant
  change.

Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.h
A be/src/exec/join-builder.cc
A be/src/exec/join-builder.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/initial-reservations.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/spillable-row-batch-queue.h
M be/src/util/summary-util.cc
M bin/run-all-tests.sh
M common/thrift/DataSinks.thrift
M 

[Impala-ASF-CR] IMPALA-4224: execute separate join builds fragments

2020-02-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14859 )

Change subject: IMPALA-4224: execute separate join builds fragments
..


Patch Set 34:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/14859/28/be/src/exec/blocking-join-node.cc
File be/src/exec/blocking-join-node.cc:

http://gerrit.cloudera.org:8080/#/c/14859/28/be/src/exec/blocking-join-node.cc@231
PS28, Line 231: _sink) {
> is it still true?
The phrase was a bit weird; this was meant to be part of the "if" clause. I 
rewrote this comment to be clearer and explain the bigger picture.


http://gerrit.cloudera.org:8080/#/c/14859/28/be/src/exec/blocking-join-node.cc@246
PS28, Line 246: seSeparateBuild(state->query_options())) {
> Maybe you could add a sentence about how the build was already started.
Done


http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/exec/nested-loop-join-builder.h
File be/src/exec/nested-loop-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/exec/nested-loop-join-builder.h@61
PS32, Line 61: NljBuilder(
> nit: maybe it could be another factory method called CreateStandaloneBuilde
Done


http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/exec/nested-loop-join-builder.h@90
PS32, Line 90: util
> nit: until
Done


http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/exec/nested-loop-join-builder.cc
File be/src/exec/nested-loop-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/exec/nested-loop-join-builder.cc@112
PS32, Line 112: void NljBuilder::Reset() {
> Based on the comment on the declaration it is not valid to be called on sep
Done


http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/exec/partitioned-hash-join-builder.h@103
PS32, Line 103: TDataSink* tsink
> Since now it is an output parameter should it be moved at the end of the pa
Done


http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/exec/partitioned-hash-join-builder.h@322
PS32, Line 322: 'a
> nit: missing space
Done


http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/exec/partitioned-hash-join-builder.cc@370
PS32, Line 370: void PhjBuilder::Reset(RowBatch* row_batch) {
> The declaration comment says it's not valid to be called on a separate buil
Done


http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/runtime/bufferpool/buffer-pool.cc
File be/src/runtime/bufferpool/buffer-pool.cc:

http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/runtime/bufferpool/buffer-pool.cc@28
PS32, Line 28: #include "util/debug-util.h"
> nit: not in alphabetic order, should be one line below
Done


http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/14859/32/be/src/runtime/coordinator.cc@322
PS32, Line 322: fragment.output_sink.type == TDataSinkType::DATA_STREAM_SINK
  :|| fragment.output_sink.type == 
TDataSinkType::HASH_JOIN_BUILDER
  :|| fragment.output_sink.type == 
TDataSinkType::NESTED_LOOP_JOIN_BUILDER
> nit: I wonder if these conditions could be simplified if some parts of them
Factored out an IsJoinBuildSink() function. I thought about something to 
capture this full condition like IsFragmentConnectingSink() but it didn't seem 
like it made things clearer.
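
To show the shape of that refactoring outside the C++ diff, here is a small 
Python sketch. SinkType is a hypothetical enum that only mirrors the three 
sink names quoted above plus a placeholder OTHER_SINK; is_join_build_sink() 
and connects_fragments() are illustrative names, not functions from 
coordinator.cc.

```python
from enum import Enum


class SinkType(Enum):
    # Hypothetical mirror of the sink types named in the quoted condition.
    DATA_STREAM_SINK = 1
    HASH_JOIN_BUILDER = 2
    NESTED_LOOP_JOIN_BUILDER = 3
    OTHER_SINK = 4  # placeholder for every sink type not discussed here


def is_join_build_sink(sink_type):
    # The factored-out part: true for either kind of join builder.
    return sink_type in (
        SinkType.HASH_JOIN_BUILDER,
        SinkType.NESTED_LOOP_JOIN_BUILDER,
    )


def connects_fragments(sink_type):
    # The full three-way condition, now written in terms of the helper.
    return sink_type == SinkType.DATA_STREAM_SINK or is_join_build_sink(sink_type)
```

Only the two-way join-builder check gets its own name here; the outer 
condition stays spelled out at the call site, matching the choice described 
above.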


http://gerrit.cloudera.org:8080/#/c/14859/32/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

http://gerrit.cloudera.org:8080/#/c/14859/32/common/thrift/PlanNodes.thrift@617
PS32, Line 617: 28
> nit: Since these thrift structures only used between Impala daemons with th
I did this mainly to reduce the size of the diff. I think I'd prefer to keep it 
this way but I can change if you feel strongly - neither option is that good in 
my mind.



--
To view, visit http://gerrit.cloudera.org:8080/14859

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58
Gerrit-Change-Number: 14859
Gerrit-PatchSet: 34
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 17:05:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9174: Emit WARNING when ORC lib leaks memory

2020-02-03 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15153


Change subject: IMPALA-9174: Emit WARNING when ORC lib leaks memory
..

IMPALA-9174: Emit WARNING when ORC lib leaks memory

Added a check to OrcMemPool to test whether the ORC library leaked
memory. Impala frees this memory anyway, but it's useful to know if
there is a bug in the ORC lib.

Testing:
* I tested manually
* I couldn't add an automated test because currently we are not aware
  of such bugs in the lib

Change-Id: I370fb9f68734e0e555bd7224ab0f5440c4947c66
---
M be/src/exec/hdfs-orc-scanner.cc
1 file changed, 4 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/15153/1
--
To view, visit http://gerrit.cloudera.org:8080/15153

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I370fb9f68734e0e555bd7224ab0f5440c4947c66
Gerrit-Change-Number: 15153
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-9290: ORC scanner should support schema evolution between date and timestamp types

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15152 )

Change subject: IMPALA-9290: ORC scanner should support schema evolution 
between date and timestamp types
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5593/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15152

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7979ecc61b2ab900090d01bc81e7bb7b28c99c9e
Gerrit-Change-Number: 15152
Gerrit-PatchSet: 2
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Comment-Date: Mon, 03 Feb 2020 16:32:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4224: execute separate join builds fragments

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14859 )

Change subject: IMPALA-4224: execute separate join builds fragments
..


Patch Set 34:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5592/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14859

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58
Gerrit-Change-Number: 14859
Gerrit-PatchSet: 34
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 16:27:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15051 )

Change subject: IMPALA-9226: Improve string allocations of the ORC scanner
..


Patch Set 9: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5489/


--
To view, visit http://gerrit.cloudera.org:8080/15051

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
Gerrit-Change-Number: 15051
Gerrit-PatchSet: 9
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 15:50:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9290: ORC scanner should support schema evolution between date and timestamp types

2020-02-03 Thread Norbert Luksa (Code Review)
Norbert Luksa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15152 )

Change subject: IMPALA-9290: ORC scanner should support schema evolution 
between date and timestamp types
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/15152/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15152/1//COMMIT_MSG@20
PS1, Line 20:
> nit: wrap at 72 chars
Done


http://gerrit.cloudera.org:8080/#/c/15152/1/be/src/exec/orc-column-readers.h
File be/src/exec/orc-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/15152/1/be/src/exec/orc-column-readers.h@224
PS1, Line 224: tampReader(const orc::Type* node, const SlotDes
> Do we need this, can't we just compare check != nullptr?
I just followed the convention of the other UpdateInputBatch functions where 
the static casted batch is compared to the dynamic casted one. Maybe we could 
simplify the DCHECKs everywhere?


http://gerrit.cloudera.org:8080/#/c/15152/1/be/src/exec/orc-column-readers.cc
File be/src/exec/orc-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/15152/1/be/src/exec/orc-column-readers.cc@211
PS1, Line 211:   if (IsNull(DCHECK_NOTNULL(batch_), row_idx)) {
> nit: extra line
Done


http://gerrit.cloudera.org:8080/#/c/15152/1/be/src/exec/orc-column-readers.cc@226
PS1, Line 226: }
 : *slot = DateValue(ts.DaysSinceUnixEpoch());
 :   }
> This is probably not speed critical, but it could be done faster by ignorin
Done


http://gerrit.cloudera.org:8080/#/c/15152/1/be/src/exec/orc-column-readers.cc@229
PS1, Line 229: alid())) {
> This will hit a DCHECK if the timestamp is not valid, see https://github.co
Done



--
To view, visit http://gerrit.cloudera.org:8080/15152

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7979ecc61b2ab900090d01bc81e7bb7b28c99c9e
Gerrit-Change-Number: 15152
Gerrit-PatchSet: 2
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Comment-Date: Mon, 03 Feb 2020 15:47:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9290: ORC scanner should support schema evolution between date and timestamp types

2020-02-03 Thread Norbert Luksa (Code Review)
Norbert Luksa has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/15152 )

Change subject: IMPALA-9290: ORC scanner should support schema evolution 
between date and timestamp types
..

IMPALA-9290: ORC scanner should support schema evolution between date and 
timestamp types

This feature adds support for schema evolution between date and
timestamp for the ORC scanner. This means that we can have two
tables, one with a date column, another with a timestamp column,
and they can both point to the same ORC file. The result will be
that for the first table everything will be converted to date,
and for the second, everything to timestamp.
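
As a rough Python illustration of the two directions (a sketch only, not the
C++ reader code; the function names are made up, and it assumes dates are
stored as days since the Unix epoch and that a date maps to midnight of that
day):

```python
from datetime import date, datetime, timedelta

UNIX_EPOCH = date(1970, 1, 1)


def date_days_to_timestamp(days_since_epoch):
    """Date -> timestamp: midnight at the start of the given day."""
    day = UNIX_EPOCH + timedelta(days=days_since_epoch)
    return datetime.combine(day, datetime.min.time())


def timestamp_to_date_days(ts):
    """Timestamp -> date: drop the time of day, keep the day number."""
    return (ts.date() - UNIX_EPOCH).days
```

Going from date to timestamp and back returns the original day number, while
going from timestamp to date discards the time of day.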

In order to do that, the OrcTimestampReader and OrcDateColumnReader
are modified to be able to handle batches of the two types. Their
name now represents the destination Impala type.

Note that the life cycle of an OrcColumnReader is within the life
cycle of the HdfsOrcScanner which only reads a split of an ORC
file, and an ORC file can't have two types for one column.

Tests:
 * Added type conversion tests.
 * Tested manually following the use case steps of the Jira.

Change-Id: I7979ecc61b2ab900090d01bc81e7bb7b28c99c9e
---
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
M be/src/exec/orc-metadata-utils.cc
M 
testdata/workloads/functional-query/queries/DataErrorsTest/orc-type-checks.test
M tests/query_test/test_scanners.py
5 files changed, 125 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/15152/2
--
To view, visit http://gerrit.cloudera.org:8080/15152

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7979ecc61b2ab900090d01bc81e7bb7b28c99c9e
Gerrit-Change-Number: 15152
Gerrit-PatchSet: 2
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8778: Support Apache Hudi Read Optimized Table

2020-02-03 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14711 )

Change subject: IMPALA-8778: Support Apache Hudi Read Optimized Table
..


Patch Set 16:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/14711/16/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/14711/16/be/src/exec/hdfs-scan-node-base.cc@379
PS16, Line 379: HUDI_PARQUET
> My logic was:
I see, but in the backend you just create "low-level" operators, such as scan 
nodes that need to process some input splits in file format X. So they don't 
need to be too smart, the planner will tell them what to do.

That said, I don't have too strong an opinion about it.


http://gerrit.cloudera.org:8080/#/c/14711/16/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
File fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/14711/16/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@197
PS16, Line 197: fileFormat_
> Done
If fileFormat_ is null, then the equality check will just return false, which 
is the expected behavior.


http://gerrit.cloudera.org:8080/#/c/14711/16/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/14711/16/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@354
PS16, Line 354: isParquet
> any suggestion? I couldn't come up with a better name here
maybe 'isParquetBased'?


http://gerrit.cloudera.org:8080/#/c/14711/20/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/14711/20/testdata/data/README@482
PS20, Line 482:
nit: if possible, please keep to the 90-character line length limit


http://gerrit.cloudera.org:8080/#/c/14711/20/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/14711/20/testdata/datasets/functional/functional_schema_template.sql@2762
PS20, Line 2762:
   :
   :
   :
   :
Since you are using a custom CREATE statement you'll need to define the 
partitions in the CREATE TABLE stmt.


http://gerrit.cloudera.org:8080/#/c/14711/20/testdata/datasets/functional/schema_constraints.csv
File testdata/datasets/functional/schema_constraints.csv:

http://gerrit.cloudera.org:8080/#/c/14711/20/testdata/datasets/functional/schema_constraints.csv@59
PS20, Line 59:
hudiparquet is not part of the test dimensions of the functional workload. 
Since most of the tests would fail with hudiparquet we can cheat here and 
create the hudi table in the functional_parquet database, i.e. switch to 
parquet here.


http://gerrit.cloudera.org:8080/#/c/14711/20/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/14711/20/tests/query_test/test_scanners.py@313
PS20, Line 313: un_test_cas
TestHudiParquet


http://gerrit.cloudera.org:8080/#/c/14711/20/tests/query_test/test_scanners.py@320
PS20, Line 320:
> Thank you all for reviewing. I am able to use
If in 'schema_constraints.csv' you switch to parquet then you can load the hudi 
tables with --table_formats=parquet/none/none. It's necessary because 
hudiparquet is not part of the test dimensions, so we'll just put the table in 
the functional_parquet database. You already only run this test when 
file_format == parquet.



--
To view, visit http://gerrit.cloudera.org:8080/14711

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I65e146b347714df32fe968409ef2dde1f6a25cdf
Gerrit-Change-Number: 14711
Gerrit-PatchSet: 16
Gerrit-Owner: Yanjia Gary Li 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yanjia Gary Li 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 15:39:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4224: execute separate join builds fragments

2020-02-03 Thread Tim Armstrong (Code Review)
Hello Zoltan Borok-Nagy, Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14859

to look at the new patch set (#34).

Change subject: IMPALA-4224: execute separate join builds fragments
..

IMPALA-4224: execute separate join builds fragments

This enables parallel plans with the join build in a
separate fragment and fixes all of the ensuing fallout.
After this change, mt_dop plans with joins have separate
build fragments. There is still a 1:1 relationship between
join nodes and builders, so the builders are only accessed
by the join node's thread after it is handed off. This lets
us defer the work required to make PhjBuilder and NljBuilder
safe to be shared between nodes.

Planner changes:
* Combined the parallel and distributed planning code paths.
* Misc fixes to generate reasonable thrift structures in the
  query exec requests, i.e. containing the right nodes.
* Fixes to resource calculations for the separate build plans.
** Calculate separate join/build resource consumption.
** Simplified the resource estimation by calculating resource
   consumption for each fragment separately, and assuming that
   all fragments hit their peak resource consumption at the
   same time. IMPALA-9255 is the follow-on to make the resource
   estimation more accurate.

Scheduler changes:
* Various fixes to handle multiple TPlanExecInfos correctly,
  which are generated by the planner for the different cohorts.
* Add logic to colocate build fragments with parent fragments.

Runtime filter changes:
* Build sinks now produce runtime filters, which required
  planner and coordinator fixes to handle accordingly.

DataSink changes:
* Close the input plan tree before calling FlushFinal() to release
  resources. This depends on Send() not holding onto references
  to input batches, which was true except for NljBuilder. This
  invariant is documented.

Join builder changes:
* Add a common base class for PhjBuilder and NljBuilder with
  functions to handle synchronisation with the join node.
* Close plan tree earlier in FragmentInstanceState::Exec()
  so that peak resource requirements are lower.
* The NLJ always copies input batches, so that it can close
  its input tree.

JoinNode changes:
* Join node blocks waiting for build-side to be ready,
  then eventually signals that it's done, allowing the builder
  to be cleaned up.
* NLJ and PHJ nodes handle both the integrated builder and
  the external builder. There is a 1:1 relationship between
  the node and the builder, so we don't deal with thread safety
  yet.
* Buffer reservations are transferred between the builder and join
  node when running with the separate builder. This is not really
  necessary right now, since it is all single-threaded, but will
  be important for the shared broadcast.
  - The builder transfers memory for probe buffers to the join node
    at the end of each build phase.
  - At the end of each probe phase, the reservation needs to be handed
    back to the builder (or released).
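
The handoff described above (the join node waits for the build side, then
signals that it is done so the builder can be cleaned up) can be sketched
roughly as follows. This is only an illustration using Python threads;
the class and method names are hypothetical and do not correspond to the
actual C++ builder classes, and buffer reservations are not modelled.

```python
import threading


class JoinBuilderSketch:
    """Hypothetical stand-in for a builder running in a separate build fragment."""

    def __init__(self):
        self._build_ready = threading.Event()
        self._probe_done = threading.Event()
        self.hash_table = None
        self.closed = False

    def run_build(self, build_rows):
        # Build side: materialize the rows, then publish them to the join node.
        table = {}
        for key, value in build_rows:
            table.setdefault(key, []).append(value)
        self.hash_table = table
        self._build_ready.set()
        # Keep the build state alive until the join node says it is finished.
        self._probe_done.wait()
        self.hash_table = None
        self.closed = True

    def wait_for_build(self):
        # Join node side: block until the build side is ready.
        self._build_ready.wait()
        return self.hash_table

    def signal_probe_done(self):
        # Join node side: allow the builder to be cleaned up.
        self._probe_done.set()


def probe(builder, probe_rows):
    """Probe the built table, then release the builder (1:1 node/builder)."""
    table = builder.wait_for_build()
    matches = [(k, v, b) for k, v in probe_rows for b in table.get(k, [])]
    builder.signal_probe_done()
    return matches
```

A caller would start `run_build` on one thread and call `probe` from
another. With a single join node per builder, two events are enough;
sharing one builder between several nodes would need more than this.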

ExecSummary changes:
* The summary logic was modified to handle connecting fragments
  via join builds. The logic is an extension of what was used
  for exchanges.

Testing:
* Enable --unlock_mt_dop for end-to-end tests
* Migrate some tests to run as part of end-to-end tests instead of
  custom cluster.
* Add mt_dop dimension to various end-to-end tests to provide
  coverage of join queries, spill-to-disk and cancellation.
* Ran a single node TPC-H and TPC-DS stress test with mt_dop=0
  and mt_dop=4.

Perf:
* Ran TPC-H scale factor 30 locally with mt_dop=0. No significant
  change.

Change-Id: I4403c8e62d9c13854e7830602ee613f8efc80c58
---
M be/src/exec/CMakeLists.txt
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.h
A be/src/exec/join-builder.cc
A be/src/exec/join-builder.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-builder.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/initial-reservations.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/spillable-row-batch-queue.h
M be/src/util/summary-util.cc
M bin/run-all-tests.sh
M common/thrift/DataSinks.thrift
M 

[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15134 )

Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5591/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a
Gerrit-Change-Number: 15134
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Mon, 03 Feb 2020 15:34:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15134 )

Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5491/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/15134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a
Gerrit-Change-Number: 15134
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Mon, 03 Feb 2020 14:55:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: Asynchronous code generation

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15105 )

Change subject: WIP: Asynchronous code generation
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5590/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
Gerrit-Change-Number: 15105
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 03 Feb 2020 14:52:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support

2020-02-03 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15134 )

Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support
..


Patch Set 2:

(2 comments)

> > > Patch Set 2:
 > > > The verify job failed because kudu-3ba5ec5d0 (kudu-1.12.0-SNAPSHOT)
 > > > has a new run-time dependency: libcurl.so.4, which is not available
 > > > in the ubuntu-16.04-configured jenkins worker label. I'm discussing
 > > > with laszlog the possibility of adding libcurl.so.4 to the worker
 > > > label.
 > > >
 > >
 > > If we decide to take this new Kudu version as a dependency, then
 > > the correct way to handle libcurl.so.4 as a new runtime dependency
 > > is to add it to the list of packages we install in
 > > bin/bootstrap_system.sh.
 > > The worker image referenced above is only minimally preconfigured
 > > to allow fast startup times; Impala runtime/development time
 > > dependencies should be managed in the bootstrap scripts.
 > >
 > > Additionally, the dependency on libcurl.so.4 should be evaluated
 > > for all OS platforms we claim to have support for: e.g. a brief
 > > scan of this article[1] claims that running both libcurl.so.3 and
 > > libcurl.so.4 on Ubuntu 18.04 is at least non-trivial to set up.
 > >
 > > [1]: 
 > > https://dev.to/jake/using-libcurl3-and-libcurl4-on-ubuntu-1804-bionic-184g,
 > > "Using libcurl3 and libcurl4 on Ubuntu 18.04 (Bionic)"
 >
 > In bin/bootstrap_system.sh, I don't see us installing curl for
 > ubuntu, but I see us installing it for centos. I would try adding
 > it and see if that helps. (We have curl installed in all the docker
 > images we use to build kudu for the native toolchain.)
 >
 > We can run a ubuntu-18.04-from-scratch job to see if it works.

Installing curl on Ubuntu 16.04 installs libcurl-gnutls.so.4 but it doesn't 
install the required libcurl.so.4.

"apt install libcurl3" on the other hand works for all supported Ubuntu 
releases, so I've added that to bin/bootstrap_system.sh.

http://gerrit.cloudera.org:8080/#/c/15134/2/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/15134/2/bin/impala-config.sh@719
PS2, Line 719:   export 
IMPALA_TOOLCHAIN_KUDU_MAVEN_REPOSITORY="file://${IMPALA_TOOLCHAIN}"
> Since this is disabled, I think we can set it to an empty string. If that w
Setting the URL to an empty string results in an error, but I can set it to 
something like "file:///non/existing/repo".

What do you think?


http://gerrit.cloudera.org:8080/#/c/15134/2/bin/impala-config.sh@722
PS2, Line 722:   export IMPALA_KUDU_VERSION="3ba5ec5d0"
 :   export IMPALA_KUDU_JAVA_VERSION="1.12.0-SNAPSHOT"
> One use case that we want to support is for someone to be able to override
Done



--
To view, visit http://gerrit.cloudera.org:8080/15134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a
Gerrit-Change-Number: 15134
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Mon, 03 Feb 2020 14:49:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support

2020-02-03 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/15134 )

Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support
..

IMPALA-9279: Update the Kudu version to include VARCHAR support

Before this change the preferred way of getting Kudu was to pull
it in from the specified CDH build (even if USE_CDP_HIVE was set
to true). Optionally by setting USE_CDH_KUDU to false, one could
force Impala to use the native toolchain Kudu. But even then, the
Kudu Java artifacts would be downloaded from CDH.

Since Kudu VARCHAR support won't be backported to CDH, this
behavior blocks the Impala side of the Kudu/Impala VARCHAR
integration.

With this change:
1. Using the native toolchain Kudu (including the Java artifacts)
   is the default behavior. From now on USE_CDH_KUDU will be set
   to false by default. Impala can be forced to fall back on
   using the CDH Kudu by explicitly setting USE_CDH_KUDU to true.
2. Kudu version is updated to include the VARCHAR support.
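
The new default can be pictured with a small sketch like the one below.
It only illustrates "false unless explicitly set to true"; the helper
name and the exact parsing rule are assumptions and are not taken from
bin/impala-config.sh or bin/bootstrap_toolchain.py.

```python
import os


def use_cdh_kudu(environ=None):
    """Return True only when USE_CDH_KUDU is explicitly set to "true".

    Treating an unset variable as "false" mirrors the new default described
    above; the parsing rule itself is an assumption made for illustration.
    """
    env = os.environ if environ is None else environ
    return env.get("USE_CDH_KUDU", "false") == "true"
```

Passing a plain dict instead of the real environment keeps the helper
easy to exercise in isolation.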

Testing:
Ran exhaustive tests with USE_CDH_KUDU=true and
USE_CDH_KUDU=false.

Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a
---
M bin/bootstrap_system.sh
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M impala-parent/pom.xml
4 files changed, 43 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/15134/3
--
To view, visit http://gerrit.cloudera.org:8080/15134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a
Gerrit-Change-Number: 15134
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 


[Impala-ASF-CR] IMPALA-9287: Fix test kudu table create without hms in Hive3

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15057 )

Change subject: IMPALA-9287: Fix test_kudu_table_create_without_hms in Hive3
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5589/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
Gerrit-Change-Number: 15057
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 03 Feb 2020 14:39:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: Asynchronous code generation

2020-02-03 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15105 )


Change subject: WIP: Asynchronous code generation
..

WIP: Asynchronous code generation

This commit introduces optional asynchronous code generation.

Asynchronous code generation means that instead of waiting for codegen
to finish, the query starts in interpreted mode while codegen is done on
another thread.

All the function pointers that point to codegen'd functions are changed
to be atomic, wrapped in a CodegenFnPtr. These are initialised to
nullptr and as long as they are nullptr, the corresponding interpreted
functions are used (as before). When code generation is ready, the
function pointers are set by the codegen thread. No synchronisation is
needed as the function pointers are atomic and it is not a problem if,
at a given moment, only a subset of the codegen'd function pointers is
set and the rest are interpreted.
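
The control flow described above can be sketched roughly as follows. This
is a Python illustration only: the real change uses atomic function
pointers in C++, whereas here a single attribute assignment plays that
role and None stands in for nullptr. All names (CodegenFnSlot,
interpreted_add, run_codegen) are hypothetical.

```python
import threading


def interpreted_add(a, b):
    """Stand-in for the interpreted (slow-path) implementation."""
    return a + b


class CodegenFnSlot:
    """Hypothetical holder for a function that appears once codegen finishes."""

    def __init__(self):
        self._fn = None  # None plays the role of nullptr

    def store(self, fn):
        self._fn = fn  # one reference assignment, done by the codegen thread

    def load(self):
        return self._fn


def call_add(slot, a, b):
    # Read the slot once so the choice is consistent within this call.
    fn = slot.load()
    if fn is None:
        return interpreted_add(a, b)
    return fn(a, b)


def run_codegen(slot, done):
    # Stand-in for compilation; a real codegen step would take much longer.
    slot.store(lambda a, b: a + b)
    done.set()
```

A query thread can call `call_add` before, during, and after
`run_codegen` runs on another thread. It gets the same result either way,
only through a different implementation.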

Asynchronous code generation can be turned on using the ASYNC_CODEGEN
boolean query option.

TODO: The default should be synchronous codegen for now.
TODO: Testing.
TODO: Benchmarks.

Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
---
M be/src/benchmarks/hash-benchmark.cc
A be/src/codegen/codegen-fn-ptr.h
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/non-grouping-aggregator.cc
M be/src/exec/non-grouping-aggregator.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/select-node.cc
M be/src/exec/select-node.h
M be/src/exec/topn-node.cc
M be/src/exec/topn-node.h
M be/src/exec/union-node.cc
M be/src/exec/union-node.h
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-expr.h
M be/src/exprs/scalar-expr.inline.h
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/runtime-state.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/tuple-row-compare.cc
M be/src/util/tuple-row-compare.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
45 files changed, 453 insertions(+), 214 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/2
--
To view, visit http://gerrit.cloudera.org:8080/15105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
Gerrit-Change-Number: 15105
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] WIP: Asynchronous code generation

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15105 )

Change subject: WIP: Asynchronous code generation
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15105/2/be/src/exec/hdfs-avro-scanner.cc
File be/src/exec/hdfs-avro-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15105/2/be/src/exec/hdfs-avro-scanner.cc@559
PS2, Line 559:
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/15105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
Gerrit-Change-Number: 15105
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 03 Feb 2020 14:05:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9330: Support resolving unmasked nested columns

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15108 )

Change subject: IMPALA-9330: Support resolving unmasked nested columns
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5588/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
Gerrit-Change-Number: 15108
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 03 Feb 2020 13:56:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9287: Fix test kudu table create without hms in Hive3

2020-02-03 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/15057 )

Change subject: IMPALA-9287: Fix test_kudu_table_create_without_hms in Hive3
..

IMPALA-9287: Fix test_kudu_table_create_without_hms in Hive3

When Impala is built with USE_CDP_HIVE=true, the custom cluster
test case test_kudu_table_create_without_hms fails because the
related jars are missing. The solution is to add the related Maven
dependencies to $IMPALA_HOME/fe/pom.xml and
$IMPALA_HOME/shaded-deps/pom.xml.

Tests:
  * Ran test_kudu_table_create_without_hms.py locally with
    USE_CDP_HIVE=true

Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
---
M bin/bootstrap_system.sh
M fe/pom.xml
M shaded-deps/pom.xml
3 files changed, 31 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/15057/9
--
To view, visit http://gerrit.cloudera.org:8080/15057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibc7d7e30cd560d43bb707dec54f4494355809f66
Gerrit-Change-Number: 15057
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-9330: Support resolving unmasked nested columns

2020-02-03 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15108 )

Change subject: IMPALA-9330: Support resolving unmasked nested columns
..


Patch Set 8: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15108/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15108/8//COMMIT_MSG@7
PS8, Line 7: IMPALA-9330: Support resolving unmasked nested columns
nit: maybe add something like "in masked tables"?


http://gerrit.cloudera.org:8080/#/c/15108/8/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/15108/8/tests/authorization/test_ranger.py@806
PS8, Line 806: they won't be recognized (same as Hive).
I would prefer to give an error, but I am ok with the current status.



--
To view, visit http://gerrit.cloudera.org:8080/15108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
Gerrit-Change-Number: 15108
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 03 Feb 2020 13:50:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9290: ORC scanner should support schema evolution between date and timestamp types

2020-02-03 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15152 )

Change subject: IMPALA-9290: ORC scanner should support schema evolution 
between date and timestamp types
..


Patch Set 1:

(5 comments)

The design + code seems good to me, but I am not too enthusiastic about the 
feature itself. We may need to support it if there are already tables like 
that (and people want to read them with Impala), but I would prefer not to do 
it if we don't know whether it is needed. This kind of schema evolution seems 
generally a bad idea to me, as both Date->Timestamp and Timestamp->Date are 
lossy conversions in Impala. I also expect some complex work around this code 
related to timezones and predicate push down in the future, and supporting 
these two type mappings will make it harder.

http://gerrit.cloudera.org:8080/#/c/15152/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15152/1//COMMIT_MSG@20
PS1, Line 20: f
nit: wrap at 72 chars


http://gerrit.cloudera.org:8080/#/c/15152/1/be/src/exec/orc-column-readers.h
File be/src/exec/orc-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/15152/1/be/src/exec/orc-column-readers.h@224
PS1, Line 224: static_cast(batch_)
Do we need this? Can't we just check != nullptr?


http://gerrit.cloudera.org:8080/#/c/15152/1/be/src/exec/orc-column-readers.cc
File be/src/exec/orc-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/15152/1/be/src/exec/orc-column-readers.cc@211
PS1, Line 211:
nit: extra line


http://gerrit.cloudera.org:8080/#/c/15152/1/be/src/exec/orc-column-readers.cc@226
PS1, Line 226: int64_t nanos = current_batch->nanoseconds.data()[row_idx];
 : TimestampValue ts = TimestampValue::FromUnixTimeNanos(secs, nanos,
 : scanner_->state_->local_time_zone());
This is probably not speed critical, but it could be done faster by ignoring 
nanoseconds and using FromUnixTime() directly.


http://gerrit.cloudera.org:8080/#/c/15152/1/be/src/exec/orc-column-readers.cc@229
PS1, Line 229: DaysSinceUnixEpoch
This will hit a DCHECK if the timestamp is not valid, see 
https://github.com/apache/impala/blob/master/be/src/runtime/timestamp-value.inline.h#L94
 , so it can only be called after checking the timestamp's validity.
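
The two comments above (skip the nanoseconds, and only compute the day
count for a valid timestamp) can be sketched roughly as follows. This is
an illustration only: it works in UTC and ignores the local time zone
used in the patch, it assumes the nanosecond part is a non-negative
fraction of a second, and it takes the valid range as parameters rather
than asserting Impala's actual bounds. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86400


def days_since_unix_epoch(secs, min_secs, max_secs):
    """Return whole days since 1970-01-01 (UTC), or None if out of range.

    The range check comes first, so the day arithmetic never runs on a
    timestamp the caller considers invalid. Floor division keeps times
    before the epoch on the correct (earlier) day.
    """
    if not (min_secs <= secs <= max_secs):
        return None
    return secs // SECONDS_PER_DAY
```

A sub-second component in [0, 1) never moves a timestamp to another day,
which is why the nanoseconds can be dropped for this conversion.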



--
To view, visit http://gerrit.cloudera.org:8080/15152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7979ecc61b2ab900090d01bc81e7bb7b28c99c9e
Gerrit-Change-Number: 15152
Gerrit-PatchSet: 1
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 03 Feb 2020 13:41:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8755: Backend support for Z-ordering

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14080 )

Change subject: IMPALA-8755: Backend support for Z-ordering
..


Patch Set 23:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5490/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14080
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0200748ce3e65ebc5d3530f794c0f80aa335a2ab
Gerrit-Change-Number: 14080
Gerrit-PatchSet: 23
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Anonymous Coward (520)
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 13:30:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8755: Backend support for Z-ordering

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14080 )

Change subject: IMPALA-8755: Backend support for Z-ordering
..


Patch Set 23: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14080
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0200748ce3e65ebc5d3530f794c0f80aa335a2ab
Gerrit-Change-Number: 14080
Gerrit-PatchSet: 23
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Anonymous Coward (520)
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 13:30:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8755: Backend support for Z-ordering

2020-02-03 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14080 )

Change subject: IMPALA-8755: Backend support for Z-ordering
..


Patch Set 22: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14080
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0200748ce3e65ebc5d3530f794c0f80aa335a2ab
Gerrit-Change-Number: 14080
Gerrit-PatchSet: 22
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Anonymous Coward (520)
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 13:29:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9330: Support resolving unmasked nested columns

2020-02-03 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15108 )

Change subject: IMPALA-9330: Support resolving unmasked nested columns
..


Patch Set 8:

(5 comments)

Thanks for your review!

http://gerrit.cloudera.org:8080/#/c/15108/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15108/7//COMMIT_MSG@8
PS7, Line 8:
> Please add that the patch deals with nested tables which have column masks
Done


http://gerrit.cloudera.org:8080/#/c/15108/7//COMMIT_MSG@14
PS7, Line 14:   select t.id, a.item from tableA t, t.int_array a;
> These seem like hacks to me (especially 2.) that are needed because InlineV
Done


http://gerrit.cloudera.org:8080/#/c/15108/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/15108/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@912
PS7, Line 912: Resolves
> Nit: Did you mean Resolves?
Oops, yes...


http://gerrit.cloudera.org:8080/#/c/15108/7/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
File testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test:

http://gerrit.cloudera.org:8080/#/c/15108/7/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test@403
PS7, Line 403:
> Note that Impala EE tests do not need ORDER BY to make the results determin
Done


http://gerrit.cloudera.org:8080/#/c/15108/7/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/15108/7/tests/authorization/test_ranger.py@805
PS7, Line 805:   policy_cnt += 1
> Can you also add a mask for a nested column? As we discussed in chat, it is
Done



--
To view, visit http://gerrit.cloudera.org:8080/15108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
Gerrit-Change-Number: 15108
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 03 Feb 2020 13:10:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9330: Support resolving unmasked nested columns

2020-02-03 Thread Quanlong Huang (Code Review)
Hello Anurag Mantripragada, Fang-Yu Rao, Vihang Karajgaonkar, Kurt Deschler, 
Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15108

to look at the new patch set (#8).

Change subject: IMPALA-9330: Support resolving unmasked nested columns
..

IMPALA-9330: Support resolving unmasked nested columns

Column masking policies on primitive columns of a table which contains
nested types (though they won't be masked) will cause query failures.
To be specific, if tableA(id int, int_array array<int>) has a masking
policy on column "id", all queries on "tableA" will fail, e.g.
  select id from tableA;
  select t.id, a.item from tableA t, t.int_array a;

Column masking is implemented by wrapping the underlying table/view with
a table masking view. However, as we don't support nested types in
SelectList, the table masking view can't expose nested columns of the
masked table, which prevents collection refs from being resolved
correctly.

This patch fixes the issue in 2 steps:
1) Expose nested columns of the underlying table in the output Type of
   the table masking view (see InlineViewRef#createTupleDescriptor()),
   so nested Paths in the original query block can be resolved.
2) Resolve such Paths again inside the table masking view, so they
   point to the underlying table as intended
   (see Analyzer#resolvePathWithMasking()). The TupleDescriptor of such
   a table masking view won't be materialized, since the view is simple
   enough that its query plan is just a ScanNode of the underlying
   table. The whole query plan can be stitched as if the table is not
   masked.
Note that once we support nested columns in SelectList, we may no
longer need these 2 hacks.

This patch also adds some TRACE-level logging to improve debuggability.

Test changes in TestRanger.test_column_masking:
 - Add column masking policy on a table containing nested types.
 - Add queries on the masked tables. Some queries are borrowed from
   existing tests for nested types.

Tests:
 - Ran test_ranger.py locally.

Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
M tests/authorization/test_ranger.py
12 files changed, 527 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/15108/8
--
To view, visit http://gerrit.cloudera.org:8080/15108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
Gerrit-Change-Number: 15108
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9330: Resolve nested types of masked tables

2020-02-03 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15108 )

Change subject: IMPALA-9330: Resolve nested types of masked tables
..


Patch Set 7: Code-Review+1

(4 comments)

http://gerrit.cloudera.org:8080/#/c/15108/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15108/7//COMMIT_MSG@8
PS7, Line 8:
Please add that the patch deals with nested tables which have column masks on 
top-level non-nested columns, but not with column masks on nested columns.


http://gerrit.cloudera.org:8080/#/c/15108/7//COMMIT_MSG@14
PS7, Line 14: This patch fixes the issue by 2 steps:
These seem like hacks to me (especially 2.) that are needed because 
InlineViewRef cannot represent complex columns at the moment. It could be 
mentioned that these won't be needed once Impala is able to represent 
complex types in the select list.


http://gerrit.cloudera.org:8080/#/c/15108/7/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
File 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test:

http://gerrit.cloudera.org:8080/#/c/15108/7/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test@403
PS7, Line 403: order by id
Note that Impala EE tests do not need ORDER BY to make the results 
deterministic - the EE test framework checks if there is an ORDER BY in the 
query, and if not, then it sorts both expected and actual results before 
comparing.
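
The behaviour described above can be sketched roughly as follows. This is
only an illustration, not the EE test framework's actual code: the function
name results_match and the regex-based ORDER BY detection are assumptions
made for the example.

```python
import re


def results_match(query, expected_rows, actual_rows):
    """Compare result sets the way the comment above describes.

    Hypothetical helper: if the query has no ORDER BY, row order is not
    guaranteed, so both sides are sorted before the comparison.
    """
    has_order_by = re.search(r"\border\s+by\b", query, re.IGNORECASE) is not None
    if not has_order_by:
        expected_rows = sorted(expected_rows)
        actual_rows = sorted(actual_rows)
    return list(expected_rows) == list(actual_rows)


if __name__ == "__main__":
    expected = [(1, "a"), (2, "b")]
    shuffled = [(2, "b"), (1, "a")]
    # Without ORDER BY the row order is ignored.
    print(results_match("select id, s from t", expected, shuffled))
    # With ORDER BY the row order matters.
    print(results_match("select id, s from t order by id", expected, shuffled))
```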


http://gerrit.cloudera.org:8080/#/c/15108/7/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/15108/7/tests/authorization/test_ranger.py@805
PS7, Line 805:   self.execute_query_expect_success(admin_client, "refresh 
authorization",
Can you also add a mask for a nested column? As we discussed in chat, it is 
allowed by Ranger, but not supported by Hive and Impala. It would be nice to 
check if we get a proper error message when querying a column like that.



--
To view, visit http://gerrit.cloudera.org:8080/15108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1cc5565c64c1a4a56445b8edde59b1168f387791
Gerrit-Change-Number: 15108
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 03 Feb 2020 12:16:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15051 )

Change subject: IMPALA-9226: Improve string allocations of the ORC scanner
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5489/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
Gerrit-Change-Number: 15051
Gerrit-PatchSet: 9
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 11:33:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15051 )

Change subject: IMPALA-9226: Improve string allocations of the ORC scanner
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
Gerrit-Change-Number: 15051
Gerrit-PatchSet: 9
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 11:33:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8755: Backend support for Z-ordering

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14080 )

Change subject: IMPALA-8755: Backend support for Z-ordering
..


Patch Set 22:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5587/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14080
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0200748ce3e65ebc5d3530f794c0f80aa335a2ab
Gerrit-Change-Number: 14080
Gerrit-PatchSet: 22
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Anonymous Coward (520)
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 11:01:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15051 )

Change subject: IMPALA-9226: Improve string allocations of the ORC scanner
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5586/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
Gerrit-Change-Number: 15051
Gerrit-PatchSet: 8
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 10:53:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9290: ORC scanner should support schema evolution between date and timestamp types

2020-02-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15152 )

Change subject: IMPALA-9290: ORC scanner should support schema evolution 
between date and timestamp types
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5585/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7979ecc61b2ab900090d01bc81e7bb7b28c99c9e
Gerrit-Change-Number: 15152
Gerrit-PatchSet: 1
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 03 Feb 2020 10:16:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8755: Backend support for Z-ordering

2020-02-03 Thread Norbert Luksa (Code Review)
Norbert Luksa has uploaded a new patch set (#22). ( 
http://gerrit.cloudera.org:8080/14080 )

Change subject: IMPALA-8755: Backend support for Z-ordering
..

IMPALA-8755: Backend support for Z-ordering

This change depends on gerrit.cloudera.org/#/c/13955/
(Frontend support for Z-ordering)

The commit adds a Comparator based on Z-ordering. See in detail:
https://en.wikipedia.org/wiki/Z-order_curve

Instead of calculating the Z-values of the rows, the comparator
looks for the column with the most significant dimension, and
compares the values of this column only. The most significant
dimension is the one where the compared values differ in the
highest bit. The algorithm requires values of
the same binary representation, therefore the values are
converted into either uint32_t, uint64_t or uint128_t, the
smallest in which all data fits. Comparing smaller types with
bigger ones would make the bigger type much more dominant,
therefore the bits of these smaller types are shifted up.
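
The idea of comparing only the most significant dimension can be sketched
as below. This is a minimal illustration on small unsigned integers, not
the code in tuple-row-compare.cc: the names less_msb, zorder_compare,
z_value and check_against_z_values are made up for the example, and the
type conversion and bit shifting mentioned above are not modelled. The
sketch assumes the first column is the most significant one when
bits are interleaved.

```python
from itertools import product


def less_msb(x, y):
    """True if the highest set bit of x is below the highest set bit of y."""
    return x < y and x < (x ^ y)


def zorder_compare(a, b):
    """Return -1, 0 or 1 for rows a and b without computing Z-values.

    Finds the dimension whose values differ in the highest bit (earlier
    dimensions win ties) and compares only that dimension.
    """
    msd = 0
    for dim in range(1, len(a)):
        if less_msb(a[msd] ^ b[msd], a[dim] ^ b[dim]):
            msd = dim
    return (a[msd] > b[msd]) - (a[msd] < b[msd])


def z_value(row, bits):
    """Reference Z-value: interleave bits, highest bit first, column 0 first."""
    z = 0
    for bit in range(bits - 1, -1, -1):
        for v in row:
            z = (z << 1) | ((v >> bit) & 1)
    return z


def check_against_z_values(bits, dims):
    """Exhaustively check the comparator against explicit Z-values."""
    rows = list(product(range(1 << bits), repeat=dims))
    for a in rows:
        for b in rows:
            za, zb = z_value(a, bits), z_value(b, bits)
            if zorder_compare(a, b) != (za > zb) - (za < zb):
                return False
    return True


if __name__ == "__main__":
    print(check_against_z_values(bits=3, dims=2))
```

The exhaustive check mirrors, on a smaller scale, the manual testing
described in this commit message.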

All primitive types (including string and floating point types)
are supported.

Testing:
 * Added unit tests.
 * Run manual tests, comparing 4-column values with 4-bit
   integers, for all possible combinations. Checked the result by
   calculating the Z-value for each comparison.
 * Tested performance on various data, getting great results for
   selective queries. An example: used the TPCH dataset's
   lineitem table with scale 25, where the sorting columns are
   l_partkey and l_suppkey, in that order. Run selective queries
   for the value range of the two columns, for both lexical and
   Z-ordering and compared the percentage of filtered pages and
   row groups. While queries with filters on the first column
   showed almost no difference, queries on the second column
   are in favour of Z-ordering:
   Ordering | Column | Filtered pages % | Filtered row groups %
   Lex.     | 1st    | ~99%             | ~90%
   Z-ord.   | 1st    | ~99%             | ~89%
   Lex.     | 2nd    | ~25%             | 0%
   Z-ord.   | 2nd    | ~97%             | 0%
   The only drawback is the sorting itself, taking ~4 times longer
   than lexical sorting (e.g. sorting the dataset above took
   14m for lexical, and 55m for Z-ordering).
   Note however, that this is a one-time thing to do, sorting
   only happens once, when writing the data.
   Also, lexical ordering is supported by codegen, while it is
   not implemented for Z-ordering yet.

Change-Id: I0200748ce3e65ebc5d3530f794c0f80aa335a2ab
---
M be/src/exec/exchange-node.cc
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/partial-sort-node.cc
M be/src/exec/partial-sort-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/exec/topn-node.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M be/src/util/CMakeLists.txt
A be/src/util/tuple-row-compare-test.cc
M be/src/util/tuple-row-compare.cc
M be/src/util/tuple-row-compare.h
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
18 files changed, 1,128 insertions(+), 95 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/14080/22
--
To view, visit http://gerrit.cloudera.org:8080/14080
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0200748ce3e65ebc5d3530f794c0f80aa335a2ab
Gerrit-Change-Number: 14080
Gerrit-PatchSet: 22
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Anonymous Coward (520)
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner

2020-02-03 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15051 )

Change subject: IMPALA-9226: Improve string allocations of the ORC scanner
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
Gerrit-Change-Number: 15051
Gerrit-PatchSet: 8
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 10:13:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner

2020-02-03 Thread Norbert Luksa (Code Review)
Norbert Luksa has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/15051 )

Change subject: IMPALA-9226: Improve string allocations of the ORC scanner
..

IMPALA-9226: Improve string allocations of the ORC scanner

Currently the OrcColumnReader copies values from the
orc::StringVectorBatch one-by-one. Since ORC 1.6, the blob which
contains the pointed values is moved to the StringVectorBatch,
so we can copy it.

Besides the above improvement, this commit also enables the
LazyEncoding option for the ORC reader. This way, for stripes
with DICTIONARY_ENCODING[_V2], EncodedStringVectorBatch contains
the data in a dictionaryBlob from which the data can be acquired
with the given indices and lengths.
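
To illustrate the dictionary case, here is a rough sketch of how values
can be read from a single blob using indices. It is not the ORC or Impala
API: the function name and the layout (a bytes blob plus an offsets list,
where each length is the difference of two adjacent offsets) are
assumptions made for the example.

```python
def decode_dictionary_strings(blob, offsets, indices):
    """Materialize row values from a dictionary blob.

    blob    -- all distinct dictionary entries concatenated (hypothetical layout)
    offsets -- offsets[i] is the start of entry i, offsets[i + 1] its end
    indices -- one dictionary index per row
    """
    values = []
    for idx in indices:
        start, end = offsets[idx], offsets[idx + 1]
        values.append(blob[start:end].decode("utf-8"))
    return values


if __name__ == "__main__":
    blob = b"AIRMAILSHIP"      # entries: "AIR", "MAIL", "SHIP"
    offsets = [0, 3, 7, 11]
    indices = [2, 0, 0, 1]     # four rows referring to three entries
    print(decode_dictionary_strings(blob, offsets, indices))
```

Each distinct string is stored once in the blob, so rows only carry a
small index instead of a copy of the value.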

Tests:
 * Run ORC scanner tests (query_tests/test_scanners.py::TestOrc)
   and tpch query tests.
 * Tested performance on tpch.lineitem table with scale=25,
   running queries that select the min of string columns.
   Some results:
   col_name       | encoding   | before | after  | speedup
   ========================================================
   l_comment      | DIRECT     | 16.42s | 14.38s | 14%
   l_shipinstruct | DICTIONARY | 5.26s  | 3.80s  | 32%
   l_commitdate   | DICTIONARY | 5.46s  | 5.19s  | 5%
   all string col | BOTH       | 39.06s | 32.18s | 21%

   The queries were run on a desktop PC with MT_DOP and NUM_NODES
   set to 1.
 * Also ran TPC-H queries on the TPC-H benchmark where some
   queries' runtime improved by around 10-15%, while there was
   no regression for the others.

Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
---
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-orc-scanner.h
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
4 files changed, 135 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/15051/8
--
To view, visit http://gerrit.cloudera.org:8080/15051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
Gerrit-Change-Number: 15051
Gerrit-PatchSet: 8
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-9226: Improve string allocations of the ORC scanner

2020-02-03 Thread Norbert Luksa (Code Review)
Norbert Luksa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15051 )

Change subject: IMPALA-9226: Improve string allocations of the ORC scanner
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15051/7/be/src/exec/orc-column-readers.h
File be/src/exec/orc-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/15051/7/be/src/exec/orc-column-readers.h@212
PS7, Line 212: static_cast(batch_) ==
 : dynamic_cast(orc_batch)
> it will be true even if orc_batch is just a StringVectorBatch, because it w
Done



--
To view, visit http://gerrit.cloudera.org:8080/15051
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2d975946fb6f4104d8dc98895285b3a0c6bef7f
Gerrit-Change-Number: 15051
Gerrit-PatchSet: 8
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 03 Feb 2020 10:07:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9290: ORC scanner should support schema evolution between date and timestamp types

2020-02-03 Thread Norbert Luksa (Code Review)
Norbert Luksa has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15152 )


Change subject: IMPALA-9290: ORC scanner should support schema evolution 
between date and timestamp types
..

IMPALA-9290: ORC scanner should support schema evolution between date and 
timestamp types

This feature adds support for schema evolution between date and
timestamp for the ORC scanner. This means that we can have two
tables, one with a date column, another with a timestamp column,
and they can both point to the same ORC file. The result will be
that for the first table everything will be converted to date,
and for the second, everything to timestamp.
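
The two directions of the conversion can be sketched as below. This is
only an illustration under stated assumptions, not the Impala reader
code: it assumes a date is stored as days since 1970-01-01 and a
timestamp as seconds plus nanoseconds since the same epoch, that a date
maps to midnight of that day, and that a timestamp maps to its calendar
day. The function names are hypothetical.

```python
from datetime import date, datetime, timedelta

EPOCH_DATE = date(1970, 1, 1)
EPOCH_TS = datetime(1970, 1, 1)


def orc_date_to_timestamp(days_since_epoch):
    """Read a stored date value as a timestamp (assumed: midnight of that day)."""
    d = EPOCH_DATE + timedelta(days=days_since_epoch)
    return datetime(d.year, d.month, d.day)


def orc_timestamp_to_date(seconds, nanos=0):
    """Read a stored timestamp value as a date (assumed: time of day is dropped)."""
    ts = EPOCH_TS + timedelta(seconds=seconds, microseconds=nanos // 1000)
    return ts.date()


if __name__ == "__main__":
    print(orc_date_to_timestamp(1))
    print(orc_timestamp_to_date(86400 + 3600))
```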

In order to do that, the OrcTimestampReader and OrcDateColumnReader
are modified to be able to handle batches of the two types. Their
name now represents the destination Impala type.

Note that the life cycle of an OrcColumnReader is within the life cycle
of the HdfsOrcScanner, which only reads a split of an ORC file, and an
ORC file can't have two types for one column.

Tests:
 * Added type conversion tests.
 * Tested manually following the use case steps of the Jira.

Change-Id: I7979ecc61b2ab900090d01bc81e7bb7b28c99c9e
---
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
M be/src/exec/orc-metadata-utils.cc
M testdata/workloads/functional-query/queries/DataErrorsTest/orc-type-checks.test
M tests/query_test/test_scanners.py
5 files changed, 115 insertions(+), 36 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/15152/1
--
To view, visit http://gerrit.cloudera.org:8080/15152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7979ecc61b2ab900090d01bc81e7bb7b28c99c9e
Gerrit-Change-Number: 15152
Gerrit-PatchSet: 1
Gerrit-Owner: Norbert Luksa